The CLI silently spans three planes (data / storage-maintenance / control) and forces the operator to name a graph differently per plane: the graph you query as `--server prod --graph knowledge` you must maintain as `s3://bucket/knowledge.omni`. Plane restrictions (graphs list is server-only, optimize is storage-only) are accidental — discovered by hitting a cryptic error, not declared. RFC-010 proposes: one graph-addressing model across every verb, a declared per-subcommand capability surface (expanding RFC-009 Phase 4), and plane-grouped --help. Storage maintenance stays off the wire deliberately (no HTTP routes for optimize/cleanup/repair). CLI-internal only — no engine, server, or wire change. Incorporates the Codex review thread (kept verbatim with per-point Resolution notes): sharpened resolver authority rule (operator/legacy target must be direct storage; cluster-managed graphs via explicit --cluster --graph), per-subcommand capability table (schema plan vs show/apply, queries validate vs list, session/tooling classified), graphs list aligned to RFC-009's both-later target, init promoted to an explicit cluster-apply signpost, and a Test plan that extends the existing CLI suites and pins the new wrong-plane error strings. Linked from docs/dev/index.md. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
18 KiB
RFC: Restructure the CLI Around Explicit Planes
Status: Proposed
Date: 2026-06-13
Audience: CLI/server/cluster maintainers
Builds on: rfc-009-unify-access-paths.md
(Phases 3a–3c landed — the embedded/remote data-plane fork is now one
GraphClient enum; this RFC expands RFC-009 Phase 4 from a narrow
embedded-vs-remote capability table into the full plane model, and leaves
Phase 5 route alignment where it is),
rfc-007-operator-config.md (operator
--server/--graph/--target addressing — the surfaces this RFC makes
uniform across planes),
rfc-008-deprecate-omnigraph-yaml.md.
Sequencing: post-v0.7.0, after RFC-009 Phase 3c (done).
Summary
The CLI silently spans three planes — data, storage/maintenance, and
control — and forces the operator to know which plane each verb lives on and
address a graph differently per plane. The same graph you query as
--server prod --graph knowledge you must maintain as
s3://bucket/knowledge.omni. Plane restrictions (graphs list is server-only,
optimize is storage-only) are accidental — discovered by hitting a cryptic
error, not declared.
This RFC makes the plane model explicit and coherent with three moves:
- One graph-addressing model across every verb (
--target/--graph/ positional URI/--server), resolving to a storage URI for maintenance and a remote client for data — instead of two different ways to name one graph. - A declared, per-subcommand capability surface (RFC-009 Phase 4): each
verb declares its plane(s); wrong-plane invocations get an honest "this is
storage-plane,
--serverdoesn't apply" error from one table, not scatteredbail!s. - Plane-grouped
--helpso the model is legible at a glance.
No new server feature. Storage maintenance stays off the wire — deliberately.
Current state of affairs
The CLI has 23 top-level commands. They divide into three planes, addressed three different ways:
| Plane | Verbs | Reaches the graph by | Addressing surface |
|---|---|---|---|
| Data | query, mutate, load, ingest, branch *, snapshot, export, commit *, schema show/apply, graphs list |
embedded engine or HTTP server (one GraphClient, 15 call sites) |
positional URI or --target / --graph / --server (config aliases) |
| Storage / maintenance | init, optimize, repair, cleanup, schema plan, queries validate |
embedded engine only, directly on storage (file:// or s3://) |
positional URI or --target — no --server / --graph |
| Control | cluster validate/plan/apply/approve/status/refresh/import/force-unlock |
a cluster directory (file:// or s3://), not a graph URI |
--config <dir> |
What's confusing (validated facts)
-
Two names for one graph. Data verbs resolve
--server prod --graph knowledgethroughapply_server_flag(16 call sites). Maintenance verbs useresolve_uri/resolve_local_uriand accept only a positional URI or--target— so to compact the graph you query as--server prod --graph knowledgeyou must types3://bucket/knowledge.omni. One graph, two addressing vocabularies. -
Plane restrictions are accidental, not declared.
graphs listis server-only andoptimize/repair/cleanup/initare storage-only purely by code shape. Pointoptimizeat anhttps://URL and you get whateverOmnigraph::opensays about an https URI — accidental error text that, per Hyrum's Law, is already someone's dependency. The capability is real but unstated. -
The split is per-subcommand, and the family names hide it.
schema planis storage-only (resolve_local_uri) whileschema show/schema applyare data-plane (the graph client).queries validateopens the graph to typecheck whilequeries listonly reads the registry config. The plane is a property of the subcommand, not the family. -
Maintenance has no server/cluster counterpart at all. There is no HTTP route and no
clustersubcommand foroptimize/cleanup/repair(verified: nothing in the server route table, nothing inomnigraph-cluster/src). For a server-backed deployment you run the same CLI against the storage URI, out-of-band from the serving process. This is correct (maintenance is heavyweight, destructive, single-operator — it should not be a multi-tenant HTTP surface), but it is undocumented in the CLI's own shape, so it reads as an omission rather than a decision. -
inithas a hidden control-plane twin. Bareinitcreates a single graph from storage; in cluster mode the equivalent iscluster apply(graph-creation stage, with ledger/recovery/approval semantics). Same intent, two entry points, no signpost between them. -
Flat
--help. All 23 commands list as one undifferentiated block, so the plane a verb belongs to is tribal knowledge.
The net effect: a new operator must already know OmniGraph's plane architecture to predict which flags work on which verb and how to name a graph. The CLI does not teach its own model.
Target CLI ergonomics
The throughline: you name a graph one way, and the CLI tells you what works where. Simple examples of the end state:
One name for a graph, everywhere
A config target knowledge works on every verb that touches that graph:
omnigraph query --target knowledge --query q.gq # data (embedded or remote, auto)
omnigraph load --target knowledge --data rows.jsonl # data
omnigraph optimize --target knowledge # maintenance (resolves to its storage URI)
omnigraph cleanup --target knowledge --keep 10 --confirm
omnigraph repair --target knowledge --confirm
The positional URI form still works everywhere, unchanged:
omnigraph optimize s3://bucket/knowledge.omni
Data plane: same command, embedded or remote
You don't pick "local vs server" syntax — resolution decides:
omnigraph query ./local.omni --query q.gq # opens engine directly
omnigraph query --server prod --graph knowledge --query q.gq # over HTTP
omnigraph query --target knowledge --query q.gq # whichever the config says
Maintenance: --target must resolve to direct storage (loud if not)
$ omnigraph optimize --target prod
error: `--target prod` resolves to a remote server (https://prod…).
`optimize` is a storage-plane command and needs direct storage access.
Pass the graph's s3://… URI, or use --cluster <dir> --graph <id>.
Cluster-managed graphs get an explicit, intentional path (no implicit
cluster.yaml peeking):
omnigraph optimize --cluster ./cluster --graph knowledge
Wrong-plane = one honest, stable error
$ omnigraph optimize --server prod
error: `optimize` is a storage-plane command; `--server` addresses the data
plane and does not apply here. Use --target <name> or a storage URI.
$ omnigraph graphs list ./local.omni
error: `graphs list` needs a remote multi-graph server (http/https) today.
(Embedded cluster-catalog enumeration is planned — RFC-009.)
--help teaches the model
DATA PLANE run against a graph (embedded or --server)
query mutate load branch snapshot export commit schema show schema apply
STORAGE / MAINTENANCE direct storage access; no server
init optimize repair cleanup schema plan queries validate
CONTROL PLANE manage a cluster directory
cluster
INSPECT / SESSION
graphs list queries list lint policy embed login logout config
Exceptions, signposted (not silent)
omnigraph init --schema s.pg ./new.omni # plain path: fine
$ omnigraph init --target knowledge --schema s.pg # cluster-managed target: redirected
error: `knowledge` is a cluster-managed graph. Create it via `cluster apply`
(which records ledger + recovery + approvals), not `init`.
In one line: one way to name a graph, the right flags accepted per verb, and a CLI that tells you its planes instead of making you memorize them.
Proposed shape (mechanism)
One addressing model for every graph-addressing verb
Route all graph-addressing verbs — data and maintenance — through one
resolver that turns (positional URI | --target | --graph | --server) into
either a storage URI (file:///s3://) → embedded execution, or a remote
GraphClient → HTTP execution, per the verb's declared plane.
Authority rule (the precedence must not be silent). --target is an
operator/legacy target lookup; cluster.yaml is a different authority surface
(read only by cluster commands and --cluster boot). A maintenance verb must
not quietly consult both and invent a precedence. The rule:
- A maintenance verb's
--targetresolves through the operator/legacy config and its URI must already be direct storage; a target that resolves to a remote (http(s)://) URL fails loudly (see the example above). - Cluster-managed graphs are addressed explicitly via
--cluster <dir> --graph <id>, so reading cluster state is an intentional mode — never an implicit fallback between operator config andcluster.yaml.
A declared, per-subcommand capability surface (RFC-009 Phase 4, expanded)
One table, per subcommand (family-level rows hide exactly the cases the table exists to make non-accidental):
| Command | Data (embedded) | Data (remote) | Storage (direct) | Config / session | Notes |
|---|---|---|---|---|---|
query, mutate, load, ingest |
✅ | ✅ | — | — | ingest is the deprecated alias of load |
branch create/list/delete/merge |
✅ | ✅ | — | — | |
snapshot, export, commit list/show |
✅ | ✅ | — | — | |
schema show |
✅ | ✅ | — | — | |
schema apply |
✅ | ✅ | — | — | declarative alternative: cluster apply |
schema plan |
— | — | ✅ | — | local resolver today |
queries validate |
— | — | ✅ | — | opens the graph to typecheck |
init |
— | — | ✅ | — | cluster-managed graphs → cluster apply |
optimize, repair, cleanup |
— | — | ✅ | — | |
graphs list |
(later) | ✅ | — | — | remote today; embedded-cluster later (RFC-009) |
queries list |
— | — | — | ✅ | reads the registry config; no graph |
lint |
— | — | ✅ | ✅ | --schema file, or opens a local graph |
policy validate/test/explain |
— | — | — | ✅ | reads policy files + config |
embed |
— | — | — | ✅ | local tooling (files + embedding API) |
login, logout, config, version |
— | — | — | ✅ | session / config; no graph |
The resolver consults this table. A wrong-plane invocation produces one honest,
stable message instead of N ad-hoc bail!s and accidental open errors.
Plane-grouped --help
Group the command list by plane (the --help block shown under Target CLI
ergonomics). Cosmetic, zero behavior change, highest legibility-per-line.
Maintenance stays off the wire (decision, not omission)
This RFC does not add server routes for optimize/cleanup/repair:
- Serving = the server. Multi-tenant, safe-for-many-callers data plane.
- Storage maintenance = the CLI against storage, addressed uniformly, run by an operator or a scheduled job with storage access.
Adding maintenance-over-HTTP would re-introduce a heavyweight, destructive multi-tenant surface and add a plane rather than clarify the three we have. A future cluster-driven maintenance reconciler (scheduled compaction/GC as a control-plane policy) is explicitly out of scope — net-new design (who runs it, with what resource bounds), not a CLI restructure.
init is an explicit exception (decision)
Direct-storage init against a plain URI/target stays. But if a target resolves
to a cluster-managed graph root, init refuses and signposts cluster apply (which records ledger, recovery, and approval artifacts) rather than
initializing that root out of band. This closes the "hidden twin" of the current
state.
Compatibility
Additive and low-risk:
--target/--graphon maintenance verbs is new capability; the positional URI form keeps working unchanged.- Grouped
--helpis cosmetic. - Capability-surface error text changes the message you get on a wrong-plane
or misaddressed invocation. Per Hyrum's Law that text is observable; the change
is deliberate, release-noted, and replaces an accidental
Omnigraph::openstring with a stable, declared one — a net improvement, but flagged.
No engine, server, or wire-protocol change. The work is CLI-internal: the shared resolver, the capability table, and help grouping.
Test plan
Extend the existing CLI suites rather than adding a duplicate harness:
parity_matrix.rs— capability exclusions (the per-subcommand plane table becomes the source of truth for which verbs are remote-only / storage-only).cli_data.rs— maintenance wrong-plane errors (optimize --server,optimize --target <remote>), and--targetresolving to direct storage.cli_schema_config.rs—graphs listplane behavior,schema planvsschema show/applyplane split, and plane-grouped--helpoutput.system_local.rs—--server/ operator-targeting edge cases end-to-end.
Pin the new wrong-plane error strings deliberately: this RFC is intentionally
replacing accidental Omnigraph::open strings with stable capability errors, and
those strings become observable behavior (Hyrum).
Relationship to RFC-009
RFC-009 Phase 4 was scoped as "declared plane capabilities" for the
embedded-vs-remote axis only. This RFC subsumes and broadens that phase into
the full three-plane, per-subcommand model (adds uniform maintenance addressing,
the authority rule, and help grouping). RFC-009 Phase 5 (remote load →
/load route alignment) is unaffected and remains in RFC-009.
graphs list reconciliation: RFC-009's answered open question (pinned in
parity_matrix.rs's exclusions comment) targets graphs list becoming
Both-capability once the embedded arm enumerates the cluster catalog. This RFC
aligns with that rather than superseding it: the capability table shows
graphs list as remote today, embedded-cluster later.
Open questions
- Capability-table location — a CLI-internal const, or surfaced (e.g. in
--helpand a machine-readableomnigraph capabilitiesfor tooling)? --cluster <dir> --graph <id>for maintenance — does the maintenance command resolve the storage URI from the applied cluster state, or from the declaredcluster.yaml? (Applied state is the truth the server serves; declared config may be ahead of it.)
Review comments (Codex, 2026-06-13)
Overall take: the direction is right. The planes already exist; making them
declared in code, help text, and error messages should reduce operator surprise.
Keeping storage maintenance off HTTP is also the right boundary: optimize,
repair, and cleanup are direct-storage operator actions, not a multi-tenant
serving surface.
Before implementation, tighten these points:
-
Resolver authority needs a sharper rule. The proposal says maintenance resolves storage URIs "from
cluster.yaml/ operator config", but those are different authority surfaces. Today--targetis an operator/legacy graph-target lookup; cluster config is read byclustercommands and by--clusterserver boot. Do not make a maintenance command silently consult both and pick a precedence. Either:--targeton maintenance means an operator/legacy target whose URI is already direct storage, with remote targets failing loudly; or- add an explicit cluster-root/config resolver for this case, so reading cluster state is an intentional mode.
Resolution (accepted): both —
--targetresolves through operator/legacy config and must be direct storage (remote → loud fail); cluster-managed graphs use the explicit--cluster <dir> --graph <id>resolver. See Authority rule under Proposed shape. -
graphs listconflicts with RFC-009's target shape. This RFC classifiesgraphs listas remote-only, while RFC-009's answered open question says it becomes Both-capability once the embedded arm enumerates the cluster catalog. Pick one direction here: either this RFC explicitly supersedes that target, or the capability table should showgraphs listas remote today and embedded-cluster later.Resolution (accepted): align, don't supersede. The table shows
graphs listremote-today / embedded-cluster-later. See Relationship to RFC-009. -
The capability table should be per subcommand, not per family. The family-level rows hide the exact cases the table is supposed to make non-accidental. At minimum, call out:
schema planas local/storage-backed today, whileschema showandschema applyroute through the graph client;queries validateversusqueries list, which do not have the same plane shape;lint,policy,embed,login,logout,config, andversion, so enumeration/session/tooling commands are intentionally classified instead of falling outside the model.
Resolution (accepted): the capability table is now per-subcommand and classifies every command, including the session/tooling group.
-
initshould be an explicit exception. Direct-storageinitis fine. A cluster-managed graph should be created bycluster apply, with ledger, recovery, and approval semantics. If a named target resolves to a cluster-managed graph root,initshould signpostcluster applyrather than quietly initializing that root out of band.Resolution (accepted): promoted from open question to a decision. See
initis an explicit exception.
Testing notes for the implementation slice:
-
Extend the existing CLI suites rather than adding a new duplicate harness:
parity_matrix.rsfor capability exclusions,cli_data.rsfor maintenance wrong-plane errors,cli_schema_config.rsforgraphs list/ help behavior, andsystem_local.rsfor--server/ operator-targeting edge cases. -
Pin the new wrong-plane error strings deliberately. This RFC is intentionally replacing accidental
Omnigraph::openstrings with stable capability errors, and those strings become observable behavior.Resolution (accepted): captured as the Test plan section.