mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-15 01:55:13 +02:00
docs(rfc): RFC-010 — restructure the CLI around explicit planes (#214)
The CLI silently spans three planes (data / storage-maintenance / control) and forces the operator to name a graph differently per plane: the graph you query as `--server prod --graph knowledge` you must maintain as `s3://bucket/knowledge.omni`. Plane restrictions (graphs list is server-only, optimize is storage-only) are accidental — discovered by hitting a cryptic error, not declared. RFC-010 proposes: one graph-addressing model across every verb, a declared per-subcommand capability surface (expanding RFC-009 Phase 4), and plane-grouped --help. Storage maintenance stays off the wire deliberately (no HTTP routes for optimize/cleanup/repair). CLI-internal only — no engine, server, or wire change. Incorporates the Codex review thread (kept verbatim with per-point Resolution notes): sharpened resolver authority rule (operator/legacy target must be direct storage; cluster-managed graphs via explicit --cluster --graph), per-subcommand capability table (schema plan vs show/apply, queries validate vs list, session/tooling classified), graphs list aligned to RFC-009's both-later target, init promoted to an explicit cluster-apply signpost, and a Test plan that extends the existing CLI suites and pins the new wrong-plane error strings. Linked from docs/dev/index.md. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
parent
45500a690a
commit
be4f29c0d0
2 changed files with 382 additions and 0 deletions
|
|
@ -79,6 +79,7 @@ Working documents for in-flight feature work. Removed when the work lands.
|
|||
| Per-operator config — `~/.omnigraph/` identity, keyed credentials, named servers (the operator slice of RFC-002) | [rfc-007-operator-config.md](rfc-007-operator-config.md) |
|
||||
| Deprecate `omnigraph.yaml` — one concern per config surface; key-by-key migration map and staged retirement | [rfc-008-deprecate-omnigraph-yaml.md](rfc-008-deprecate-omnigraph-yaml.md) |
|
||||
| Unify CLI embedded/remote access paths — parity referee, shared wire-DTO crate, `GraphClient` trait, declared plane capabilities | [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md) |
|
||||
| Restructure the CLI around explicit planes — one graph-addressing model, declared capability surface, plane-grouped help (expands RFC-009 Phase 4) | [rfc-010-cli-planes-restructure.md](rfc-010-cli-planes-restructure.md) |
|
||||
|
||||
## Boundary
|
||||
|
||||
|
|
|
|||
381
docs/dev/rfc-010-cli-planes-restructure.md
Normal file
381
docs/dev/rfc-010-cli-planes-restructure.md
Normal file
|
|
@ -0,0 +1,381 @@
|
|||
# RFC: Restructure the CLI Around Explicit Planes
|
||||
|
||||
**Status:** Proposed
|
||||
**Date:** 2026-06-13
|
||||
**Audience:** CLI/server/cluster maintainers
|
||||
**Builds on:** [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md)
|
||||
(Phases 3a–3c landed — the embedded/remote data-plane fork is now one
|
||||
`GraphClient` enum; this RFC **expands RFC-009 Phase 4** from a narrow
|
||||
embedded-vs-remote capability table into the full plane model, and leaves
|
||||
Phase 5 route alignment where it is),
|
||||
[rfc-007-operator-config.md](rfc-007-operator-config.md) (operator
|
||||
`--server`/`--graph`/`--target` addressing — the surfaces this RFC makes
|
||||
uniform across planes),
|
||||
[rfc-008-deprecate-omnigraph-yaml.md](rfc-008-deprecate-omnigraph-yaml.md).
|
||||
**Sequencing:** post-v0.7.0, after RFC-009 Phase 3c (done).
|
||||
|
||||
## Summary
|
||||
|
||||
The CLI silently spans **three planes** — data, storage/maintenance, and
|
||||
control — and forces the operator to know which plane each verb lives on *and*
|
||||
address a graph differently per plane. The same graph you query as
|
||||
`--server prod --graph knowledge` you must maintain as
|
||||
`s3://bucket/knowledge.omni`. Plane restrictions (`graphs list` is server-only,
|
||||
`optimize` is storage-only) are *accidental* — discovered by hitting a cryptic
|
||||
error, not *declared*.
|
||||
|
||||
This RFC makes the plane model **explicit and coherent** with three moves:
|
||||
|
||||
1. **One graph-addressing model** across every verb (`--target`/`--graph`/
|
||||
positional URI/`--server`), resolving to a storage URI for maintenance and a
|
||||
remote client for data — instead of two different ways to name one graph.
|
||||
2. **A declared, per-subcommand capability surface** (RFC-009 Phase 4): each
|
||||
verb declares its plane(s); wrong-plane invocations get an honest "this is
|
||||
storage-plane, `--server` doesn't apply" error from one table, not scattered
|
||||
`bail!`s.
|
||||
3. **Plane-grouped `--help`** so the model is legible at a glance.
|
||||
|
||||
No new server feature. Storage maintenance stays off the wire — deliberately.
|
||||
|
||||
## Current state of affairs
|
||||
|
||||
The CLI has 23 top-level commands. They divide into three planes, addressed
|
||||
three different ways:
|
||||
|
||||
| Plane | Verbs | Reaches the graph by | Addressing surface |
|
||||
|---|---|---|---|
|
||||
| **Data** | `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show/apply`, `graphs list` | embedded engine **or** HTTP server (one `GraphClient`, 15 call sites) | positional URI **or** `--target` / `--graph` / `--server` (config aliases) |
|
||||
| **Storage / maintenance** | `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `queries validate` | embedded engine **only**, directly on storage (`file://` or `s3://`) | positional URI **or** `--target` — **no `--server` / `--graph`** |
|
||||
| **Control** | `cluster validate/plan/apply/approve/status/refresh/import/force-unlock` | a cluster **directory** (`file://` or `s3://`), not a graph URI | `--config <dir>` |
|
||||
|
||||
### What's confusing (validated facts)
|
||||
|
||||
1. **Two names for one graph.** Data verbs resolve `--server prod --graph
|
||||
knowledge` through `apply_server_flag` (16 call sites). Maintenance verbs use
|
||||
`resolve_uri`/`resolve_local_uri` and accept only a positional URI or
|
||||
`--target` — so to compact the graph you *query* as `--server prod --graph
|
||||
knowledge` you must *type* `s3://bucket/knowledge.omni`. One graph, two
|
||||
addressing vocabularies.
|
||||
|
||||
2. **Plane restrictions are accidental, not declared.** `graphs list` is
|
||||
server-only and `optimize`/`repair`/`cleanup`/`init` are storage-only purely
|
||||
by code shape. Point `optimize` at an `https://` URL and you get whatever
|
||||
`Omnigraph::open` says about an https URI — accidental error text that, per
|
||||
Hyrum's Law, is already someone's dependency. The capability is real but
|
||||
unstated.
|
||||
|
||||
3. **The split is per-subcommand, and the family names hide it.** `schema plan`
|
||||
is storage-only (`resolve_local_uri`) while `schema show`/`schema apply` are
|
||||
data-plane (the graph client). `queries validate` opens the graph to
|
||||
typecheck while `queries list` only reads the registry config. The plane is
|
||||
a property of the *subcommand*, not the family.
|
||||
|
||||
4. **Maintenance has no server/cluster counterpart at all.** There is no HTTP
|
||||
route and no `cluster` subcommand for `optimize`/`cleanup`/`repair` (verified:
|
||||
nothing in the server route table, nothing in `omnigraph-cluster/src`). For a
|
||||
server-backed deployment you run the *same CLI* against the storage URI,
|
||||
out-of-band from the serving process. This is correct (maintenance is
|
||||
heavyweight, destructive, single-operator — it should not be a multi-tenant
|
||||
HTTP surface), but it is **undocumented in the CLI's own shape**, so it reads
|
||||
as an omission rather than a decision.
|
||||
|
||||
5. **`init` has a hidden control-plane twin.** Bare `init` creates a single
|
||||
graph from storage; in cluster mode the equivalent is `cluster apply`
|
||||
(graph-creation stage, with ledger/recovery/approval semantics). Same intent,
|
||||
two entry points, no signpost between them.
|
||||
|
||||
6. **Flat `--help`.** All 23 commands list as one undifferentiated block, so the
|
||||
plane a verb belongs to is tribal knowledge.
|
||||
|
||||
The net effect: a new operator must already know OmniGraph's plane architecture
|
||||
to predict which flags work on which verb and how to name a graph. The CLI does
|
||||
not teach its own model.
|
||||
|
||||
## Target CLI ergonomics
|
||||
|
||||
The throughline: **you name a graph one way, and the CLI tells you what works
|
||||
where.** Simple examples of the end state:
|
||||
|
||||
### One name for a graph, everywhere
|
||||
|
||||
A config target `knowledge` works on every verb that touches that graph:
|
||||
|
||||
```bash
|
||||
omnigraph query --target knowledge --query q.gq # data (embedded or remote, auto)
|
||||
omnigraph load --target knowledge --data rows.jsonl # data
|
||||
omnigraph optimize --target knowledge # maintenance (resolves to its storage URI)
|
||||
omnigraph cleanup --target knowledge --keep 10 --confirm
|
||||
omnigraph repair --target knowledge --confirm
|
||||
```
|
||||
|
||||
The positional URI form still works everywhere, unchanged:
|
||||
|
||||
```bash
|
||||
omnigraph optimize s3://bucket/knowledge.omni
|
||||
```
|
||||
|
||||
### Data plane: same command, embedded or remote
|
||||
|
||||
You don't pick "local vs server" syntax — resolution decides:
|
||||
|
||||
```bash
|
||||
omnigraph query ./local.omni --query q.gq # opens engine directly
|
||||
omnigraph query --server prod --graph knowledge --query q.gq # over HTTP
|
||||
omnigraph query --target knowledge --query q.gq # whichever the config says
|
||||
```
|
||||
|
||||
### Maintenance: `--target` must resolve to direct storage (loud if not)
|
||||
|
||||
```bash
|
||||
$ omnigraph optimize --target prod
|
||||
error: `--target prod` resolves to a remote server (https://prod…).
|
||||
`optimize` is a storage-plane command and needs direct storage access.
|
||||
Pass the graph's s3://… URI, or use --cluster <dir> --graph <id>.
|
||||
```
|
||||
|
||||
Cluster-managed graphs get an explicit, intentional path (no implicit
|
||||
`cluster.yaml` peeking):
|
||||
|
||||
```bash
|
||||
omnigraph optimize --cluster ./cluster --graph knowledge
|
||||
```
|
||||
|
||||
### Wrong-plane = one honest, stable error
|
||||
|
||||
```bash
|
||||
$ omnigraph optimize --server prod
|
||||
error: `optimize` is a storage-plane command; `--server` addresses the data
|
||||
plane and does not apply here. Use --target <name> or a storage URI.
|
||||
|
||||
$ omnigraph graphs list ./local.omni
|
||||
error: `graphs list` needs a remote multi-graph server (http/https) today.
|
||||
(Embedded cluster-catalog enumeration is planned — RFC-009.)
|
||||
```
|
||||
|
||||
### `--help` teaches the model
|
||||
|
||||
```
|
||||
DATA PLANE run against a graph (embedded or --server)
|
||||
query mutate load branch snapshot export commit schema show schema apply
|
||||
|
||||
STORAGE / MAINTENANCE direct storage access; no server
|
||||
init optimize repair cleanup schema plan queries validate
|
||||
|
||||
CONTROL PLANE manage a cluster directory
|
||||
cluster
|
||||
|
||||
INSPECT / SESSION
|
||||
graphs list queries list lint policy embed login logout config
|
||||
```
|
||||
|
||||
### Exceptions, signposted (not silent)
|
||||
|
||||
```bash
|
||||
omnigraph init --schema s.pg ./new.omni # plain path: fine
|
||||
|
||||
$ omnigraph init --target knowledge --schema s.pg # cluster-managed target: redirected
|
||||
error: `knowledge` is a cluster-managed graph. Create it via `cluster apply`
|
||||
(which records ledger + recovery + approvals), not `init`.
|
||||
```
|
||||
|
||||
**In one line:** one way to name a graph, the right flags accepted per verb, and
|
||||
a CLI that tells you its planes instead of making you memorize them.
|
||||
|
||||
## Proposed shape (mechanism)
|
||||
|
||||
### One addressing model for every graph-addressing verb
|
||||
|
||||
Route **all** graph-addressing verbs — data *and* maintenance — through one
|
||||
resolver that turns `(positional URI | --target | --graph | --server)` into
|
||||
either a **storage URI** (`file://`/`s3://`) → embedded execution, or a **remote
|
||||
`GraphClient`** → HTTP execution, per the verb's declared plane.
|
||||
|
||||
**Authority rule (the precedence must not be silent).** `--target` is an
|
||||
operator/legacy target lookup; `cluster.yaml` is a *different* authority surface
|
||||
(read only by `cluster` commands and `--cluster` boot). A maintenance verb must
|
||||
not quietly consult both and invent a precedence. The rule:
|
||||
|
||||
- A maintenance verb's `--target` resolves through the **operator/legacy**
|
||||
config and its URI must already be **direct storage**; a target that resolves
|
||||
to a remote (`http(s)://`) URL **fails loudly** (see the example above).
|
||||
- **Cluster-managed graphs are addressed explicitly** via `--cluster <dir>
|
||||
--graph <id>`, so reading cluster state is an intentional mode — never an
|
||||
implicit fallback between operator config and `cluster.yaml`.
|
||||
|
||||
### A declared, per-subcommand capability surface (RFC-009 Phase 4, expanded)
|
||||
|
||||
One table, **per subcommand** (family-level rows hide exactly the cases the
|
||||
table exists to make non-accidental):
|
||||
|
||||
| Command | Data (embedded) | Data (remote) | Storage (direct) | Config / session | Notes |
|
||||
|---|---|---|---|---|---|
|
||||
| `query`, `mutate`, `load`, `ingest` | ✅ | ✅ | — | — | `ingest` is the deprecated alias of `load` |
|
||||
| `branch create/list/delete/merge` | ✅ | ✅ | — | — | |
|
||||
| `snapshot`, `export`, `commit list/show` | ✅ | ✅ | — | — | |
|
||||
| `schema show` | ✅ | ✅ | — | — | |
|
||||
| `schema apply` | ✅ | ✅ | — | — | declarative alternative: `cluster apply` |
|
||||
| `schema plan` | — | — | ✅ | — | local resolver today |
|
||||
| `queries validate` | — | — | ✅ | — | opens the graph to typecheck |
|
||||
| `init` | — | — | ✅ | — | cluster-managed graphs → `cluster apply` |
|
||||
| `optimize`, `repair`, `cleanup` | — | — | ✅ | — | |
|
||||
| `graphs list` | (later) | ✅ | — | — | remote today; embedded-cluster later (RFC-009) |
|
||||
| `queries list` | — | — | — | ✅ | reads the registry config; no graph |
|
||||
| `lint` | — | — | ✅ | ✅ | `--schema` file, or opens a local graph |
|
||||
| `policy validate/test/explain` | — | — | — | ✅ | reads policy files + config |
|
||||
| `embed` | — | — | — | ✅ | local tooling (files + embedding API) |
|
||||
| `login`, `logout`, `config`, `version` | — | — | — | ✅ | session / config; no graph |
|
||||
|
||||
The resolver consults this table. A wrong-plane invocation produces one honest,
|
||||
stable message instead of N ad-hoc `bail!`s and accidental `open` errors.
|
||||
|
||||
### Plane-grouped `--help`
|
||||
|
||||
Group the command list by plane (the `--help` block shown under Target CLI
|
||||
ergonomics). Cosmetic, zero behavior change, highest legibility-per-line.
|
||||
|
||||
### Maintenance stays off the wire (decision, not omission)
|
||||
|
||||
This RFC **does not** add server routes for `optimize`/`cleanup`/`repair`:
|
||||
|
||||
- **Serving = the server.** Multi-tenant, safe-for-many-callers data plane.
|
||||
- **Storage maintenance = the CLI against storage**, addressed uniformly,
|
||||
run by an operator or a scheduled job with storage access.
|
||||
|
||||
Adding maintenance-over-HTTP would re-introduce a heavyweight, destructive
|
||||
multi-tenant surface and *add* a plane rather than clarify the three we have.
|
||||
A future cluster-driven maintenance reconciler (scheduled compaction/GC as a
|
||||
control-plane policy) is explicitly **out of scope** — net-new design (who runs
|
||||
it, with what resource bounds), not a CLI restructure.
|
||||
|
||||
### `init` is an explicit exception (decision)
|
||||
|
||||
Direct-storage `init` against a plain URI/target stays. But if a target resolves
|
||||
to a **cluster-managed** graph root, `init` **refuses and signposts** `cluster
|
||||
apply` (which records ledger, recovery, and approval artifacts) rather than
|
||||
initializing that root out of band. This closes the "hidden twin" of the current
|
||||
state.
|
||||
|
||||
## Compatibility
|
||||
|
||||
Additive and low-risk:
|
||||
|
||||
- **`--target`/`--graph` on maintenance verbs** is new capability; the positional
|
||||
URI form keeps working unchanged.
|
||||
- **Grouped `--help`** is cosmetic.
|
||||
- **Capability-surface error text** changes the message you get on a wrong-plane
|
||||
or misaddressed invocation. Per Hyrum's Law that text is observable; the change
|
||||
is deliberate, release-noted, and replaces an *accidental* `Omnigraph::open`
|
||||
string with a *stable, declared* one — a net improvement, but flagged.
|
||||
|
||||
No engine, server, or wire-protocol change. The work is CLI-internal: the shared
|
||||
resolver, the capability table, and help grouping.
|
||||
|
||||
## Test plan
|
||||
|
||||
Extend the existing CLI suites rather than adding a duplicate harness:
|
||||
|
||||
- **`parity_matrix.rs`** — capability exclusions (the per-subcommand plane table
|
||||
becomes the source of truth for which verbs are remote-only / storage-only).
|
||||
- **`cli_data.rs`** — maintenance wrong-plane errors (`optimize --server`,
|
||||
`optimize --target <remote>`), and `--target` resolving to direct storage.
|
||||
- **`cli_schema_config.rs`** — `graphs list` plane behavior, `schema plan`
|
||||
vs `schema show/apply` plane split, and plane-grouped `--help` output.
|
||||
- **`system_local.rs`** — `--server` / operator-targeting edge cases end-to-end.
|
||||
|
||||
Pin the new wrong-plane error strings deliberately: this RFC is intentionally
|
||||
replacing accidental `Omnigraph::open` strings with stable capability errors, and
|
||||
those strings become observable behavior (Hyrum).
|
||||
|
||||
## Relationship to RFC-009
|
||||
|
||||
RFC-009 Phase 4 was scoped as "declared plane capabilities" for the
|
||||
embedded-vs-remote axis only. This RFC **subsumes and broadens** that phase into
|
||||
the full three-plane, per-subcommand model (adds uniform maintenance addressing,
|
||||
the authority rule, and help grouping). RFC-009 Phase 5 (remote `load` →
|
||||
`/load` route alignment) is unaffected and remains in RFC-009.
|
||||
|
||||
**`graphs list` reconciliation:** RFC-009's answered open question (pinned in
|
||||
`parity_matrix.rs`'s exclusions comment) targets `graphs list` becoming
|
||||
Both-capability once the embedded arm enumerates the cluster catalog. This RFC
|
||||
**aligns** with that rather than superseding it: the capability table shows
|
||||
`graphs list` as remote today, embedded-cluster later.
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **Capability-table location** — a CLI-internal const, or surfaced (e.g. in
|
||||
`--help` and a machine-readable `omnigraph capabilities` for tooling)?
|
||||
2. **`--cluster <dir> --graph <id>` for maintenance** — does the maintenance
|
||||
command resolve the storage URI from the applied cluster state, or from the
|
||||
declared `cluster.yaml`? (Applied state is the truth the server serves;
|
||||
declared config may be ahead of it.)
|
||||
|
||||
## Review comments (Codex, 2026-06-13)
|
||||
|
||||
Overall take: the direction is right. The planes already exist; making them
|
||||
declared in code, help text, and error messages should reduce operator surprise.
|
||||
Keeping storage maintenance off HTTP is also the right boundary: `optimize`,
|
||||
`repair`, and `cleanup` are direct-storage operator actions, not a multi-tenant
|
||||
serving surface.
|
||||
|
||||
Before implementation, tighten these points:
|
||||
|
||||
1. **Resolver authority needs a sharper rule.** The proposal says maintenance
|
||||
resolves storage URIs "from `cluster.yaml` / operator config", but those are
|
||||
different authority surfaces. Today `--target` is an operator/legacy
|
||||
graph-target lookup; cluster config is read by `cluster` commands and by
|
||||
`--cluster` server boot. Do not make a maintenance command silently consult
|
||||
both and pick a precedence. Either:
|
||||
- `--target` on maintenance means an operator/legacy target whose URI is
|
||||
already direct storage, with remote targets failing loudly; or
|
||||
- add an explicit cluster-root/config resolver for this case, so reading
|
||||
cluster state is an intentional mode.
|
||||
|
||||
**Resolution (accepted):** both — `--target` resolves through operator/legacy
|
||||
config and must be direct storage (remote → loud fail); cluster-managed graphs
|
||||
use the explicit `--cluster <dir> --graph <id>` resolver. See *Authority
|
||||
rule* under Proposed shape.
|
||||
|
||||
2. **`graphs list` conflicts with RFC-009's target shape.** This RFC classifies
|
||||
`graphs list` as remote-only, while RFC-009's answered open question says it
|
||||
becomes Both-capability once the embedded arm enumerates the cluster catalog.
|
||||
Pick one direction here: either this RFC explicitly supersedes that target,
|
||||
or the capability table should show `graphs list` as remote today and
|
||||
embedded-cluster later.
|
||||
|
||||
**Resolution (accepted):** align, don't supersede. The table shows `graphs
|
||||
list` remote-today / embedded-cluster-later. See *Relationship to RFC-009*.
|
||||
|
||||
3. **The capability table should be per subcommand, not per family.** The
|
||||
family-level rows hide the exact cases the table is supposed to make
|
||||
non-accidental. At minimum, call out:
|
||||
- `schema plan` as local/storage-backed today, while `schema show` and
|
||||
`schema apply` route through the graph client;
|
||||
- `queries validate` versus `queries list`, which do not have the same
|
||||
plane shape;
|
||||
- `lint`, `policy`, `embed`, `login`, `logout`, `config`, and `version`, so
|
||||
enumeration/session/tooling commands are intentionally classified instead
|
||||
of falling outside the model.
|
||||
|
||||
**Resolution (accepted):** the capability table is now per-subcommand and
|
||||
classifies every command, including the session/tooling group.
|
||||
|
||||
4. **`init` should be an explicit exception.** Direct-storage `init` is fine.
|
||||
A cluster-managed graph should be created by `cluster apply`, with ledger,
|
||||
recovery, and approval semantics. If a named target resolves to a
|
||||
cluster-managed graph root, `init` should signpost `cluster apply` rather
|
||||
than quietly initializing that root out of band.
|
||||
|
||||
**Resolution (accepted):** promoted from open question to a decision. See
|
||||
*`init` is an explicit exception*.
|
||||
|
||||
Testing notes for the implementation slice:
|
||||
|
||||
- Extend the existing CLI suites rather than adding a new duplicate harness:
|
||||
`parity_matrix.rs` for capability exclusions, `cli_data.rs` for maintenance
|
||||
wrong-plane errors, `cli_schema_config.rs` for `graphs list` / help behavior,
|
||||
and `system_local.rs` for `--server` / operator-targeting edge cases.
|
||||
- Pin the new wrong-plane error strings deliberately. This RFC is intentionally
|
||||
replacing accidental `Omnigraph::open` strings with stable capability errors,
|
||||
and those strings become observable behavior.
|
||||
|
||||
**Resolution (accepted):** captured as the *Test plan* section.
|
||||
Loading…
Add table
Add a link
Reference in a new issue