Merge remote-tracking branch 'origin/main' into ragnorc/omnigraph-mcp-crate

2026-06-18 02:24:27 +02:00 · 2026-06-16 16:44:11 +02:00 · 2026-06-16 16:44:11 +02:00 · c08e8dbac4
commit c08e8dbac4
parent de9e28ecf6 e510937a7e
173 changed files with 20828 additions and 10366 deletions
--- a/docs/dev/architecture.md
+++ b/docs/dev/architecture.md
@@ -1,6 +1,6 @@
 # Architecture

-OmniGraph is a typed property-graph engine built as a coordination layer over many Lance datasets, with Git-style branches and commits across the whole graph, multi-modal querying (vector + FTS + BM25 + RRF + graph traversal) in one runtime, an HTTP server with Cedar policy, and a CLI driven by a single `omnigraph.yaml`.
+OmniGraph is a typed property-graph engine built as a coordination layer over many Lance datasets, with Git-style branches and commits across the whole graph, multi-modal querying (vector + FTS + BM25 + RRF + graph traversal) in one runtime, an HTTP server with Cedar policy, and a CLI driven by a per-operator `~/.omnigraph/config.yaml` plus team-owned cluster directories.

 ## Reading guide

@@ -10,7 +10,7 @@ Three views, increasing zoom:
 2. **Layer view** — the eight-layer stack inside one OmniGraph process.
 3. **Component zoom-ins** — what's inside each layer.

-For runtime flows (read query, mutation), see [`docs/dev/execution.md`](execution.md). For the on-disk layout of a graph, see [`docs/user/storage.md`](../user/storage.md).
+For runtime flows (read query, mutation), see [`docs/dev/execution.md`](execution.md). For the on-disk layout of a graph, see [`docs/user/storage.md`](../user/concepts/storage.md).

 L1 (orange in the diagrams) is what we inherit from Lance; L2 (blue) is what OmniGraph adds. The L1/L2 framing is also called out in prose at the bottom of this doc.

@@ -280,7 +280,7 @@ flowchart LR
    eng --> wq
 ```

-The server applies Cedar policy at the HTTP boundary today. The roadmap, called out in [docs/dev/invariants.md](invariants.md) as a known gap, is to push policy into the planner as predicates. After Cedar, mutating handlers go through `WorkloadController` (per-actor admission cap + byte budget; PR 2 / MR-686) before reaching the engine. The engine itself holds an `Arc<WriteQueueManager>` so concurrent mutations on the same `(table, branch)` serialize at the queue, while disjoint keys run in parallel — see [docs/user/server.md](../user/server.md) "Per-actor admission control" and [docs/dev/writes.md](writes.md). The CLI bypasses the HTTP layer (and admission) and calls the engine API directly.
+The server applies Cedar policy at the HTTP boundary today. The roadmap, called out in [docs/dev/invariants.md](invariants.md) as a known gap, is to push policy into the planner as predicates. After Cedar, mutating handlers go through `WorkloadController` (per-actor admission cap + byte budget; PR 2 / MR-686) before reaching the engine. The engine itself holds an `Arc<WriteQueueManager>` so concurrent mutations on the same `(table, branch)` serialize at the queue, while disjoint keys run in parallel — see [docs/user/server.md](../user/operations/server.md) "Per-actor admission control" and [docs/dev/writes.md](writes.md). The CLI bypasses the HTTP layer (and admission) and calls the engine API directly.

 Code paths:

--- a/docs/dev/branch-protection.md
+++ b/docs/dev/branch-protection.md
@@ -8,7 +8,7 @@ This page explains what the policy says and how to change it.

 | Setting | Value | Why |
 |---|---|---|
-| **Required status checks (strict)** | `Classify Changes`, `Check AGENTS.md Links`, `Test Workspace`, `Test omnigraph-server --features aws`, `CODEOWNERS matches source`, `CODEOWNERS not hand-edited` | Every PR must pass workspace tests, AGENTS.md link integrity, and the CODEOWNERS hygiene checks. The two CODEOWNERS contexts must equal the job `name:` values in `.github/workflows/codeowners.yml` **verbatim** — a context naming a job that never reports (the old `CODEOWNERS / drift` used the job *id*, and the job was path-filtered) leaves every PR permanently pending and forces admin overrides. `strict: true` requires the branch to be up-to-date with `main` before merge. |
+| **Required status checks (strict)** | `Classify Changes`, `Check AGENTS.md Links`, `Test omnigraph-server --features aws`, `CODEOWNERS matches source`, `CODEOWNERS not hand-edited` | Every PR must pass the AWS-feature build/test, AGENTS.md link integrity, and the CODEOWNERS hygiene checks. **`Test Workspace` is deliberately NOT required** — it runs only on push to `main` (post-merge), tags, and manual `workflow_dispatch`, to keep PR turnaround fast (it was the ~15min+ slow gate). It is therefore *not* listed here: a required check that never reports on PRs (the `test` job is `if: github.event_name != 'pull_request'`) would leave every PR permanently pending — the same job-never-reports trap the CODEOWNERS contexts call out below. The trade-off (a regression lands on `main` and is caught by the post-merge run, so `main` can briefly go red) and its mitigations are documented in [ci.md](ci.md). The two CODEOWNERS contexts must equal the job `name:` values in `.github/workflows/codeowners.yml` **verbatim** — a context naming a job that never reports (the old `CODEOWNERS / drift` used the job *id*, and the job was path-filtered) leaves every PR permanently pending and forces admin overrides. `strict: true` requires the branch to be up-to-date with `main` before merge. |
 | **Required approving reviews** | `1` | At least one reviewer. With a 2-person team, going higher would block all merges when one person is unavailable. |
 | **Require code-owner reviews** | `true` | The reviewer must be a code owner per `.github/CODEOWNERS`. This is what makes the codeowners chassis enforced. |
 | **Dismiss stale reviews on new commits** | `true` | A push after approval invalidates the prior review. Prevents the "approve, then sneak in unreviewed changes" pattern. |
--- a/docs/dev/ci.md
+++ b/docs/dev/ci.md
@@ -3,6 +3,9 @@
 `.github/workflows/`:

 - **ci.yml**: text-only changes skip; otherwise `cargo test --workspace --locked` on ubuntu-latest with protobuf compiler. OpenAPI-drift check that auto-commits the regenerated `openapi.json` for same-repository PRs. Also runs the AGENTS.md cross-link integrity check (`scripts/check-agents-md.sh`).
+  - **`Test Workspace` does not run on pull requests.** The job is gated `if: github.event_name != 'pull_request'`, so the full workspace + failpoints suite runs only on push to `main` (post-merge), on `v*` tags, and on manual `workflow_dispatch`. This was a deliberate PR-latency trade-off — it was the slowest gate (~15min warm, up to the 75min cold ceiling). `RustFS S3 Integration` declares `needs: test`, so it is push-/dispatch-only for the same reason. The fast PR gates remain: `Classify Changes`, `Check AGENTS.md Links`, `Test omnigraph-server --features aws`, and the two CODEOWNERS checks. `Test Workspace` is correspondingly **not** in the required-check list (`.github/branch-protection.json`); see [branch-protection.md](branch-protection.md).
+  - **Consequences to internalize:** (1) a regression that the suite would catch now lands on `main` and turns the post-merge run red, rather than being blocked pre-merge — `main` can briefly break, so run `cargo test --workspace --locked` locally before merging anything non-trivial, or trigger this workflow on your branch via the Actions "Run workflow" button. (2) `openapi.json` is no longer auto-regenerated on PRs (that step is inside the `test` job); for server/API changes, regenerate it locally with `OMNIGRAPH_UPDATE_OPENAPI=1 cargo test -p omnigraph-server --test openapi` and commit it, or the strict drift check fails the post-merge `main` run.
+  - **Applying this policy:** removing `Test Workspace` from the JSON is inert until an admin runs `./scripts/apply-branch-protection.sh`. **Run it immediately after this change merges** — until then GitHub still requires a `Test Workspace` context that no longer reports on PRs, which leaves every open PR permanently pending (the job-never-reports trap).
 - **AWS feature build job**: `cargo build/test -p omnigraph-server --features aws` on ubuntu-latest.
 - **Windows binary build job**: `cargo build --release --locked -p omnigraph-cli -p omnigraph-server` on windows-latest with smoke checks for `omnigraph.exe version`, `omnigraph-server.exe --help`, and PowerShell installer syntax.
 - **RustFS S3 integration**: spins up RustFS in Docker, runs `s3_storage`, `server_opens_s3_graph_directly_and_serves_snapshot_and_read`, and `local_cli_s3_end_to_end_init_load_read_flow`.
--- a/docs/dev/cluster-config-specs.md
+++ b/docs/dev/cluster-config-specs.md
@@ -3,11 +3,11 @@
 **Status:** Draft / thinking-in-progress
 **Type:** Architecture direction
 **Date:** 2026-06-07
-**Relationship:** generalizes today's `omnigraph.yaml` graph/query/policy configuration surface ([CLI reference](../user/cli-reference.md), [server docs](../user/server.md)) into a future cluster control plane. The distilled rules are in [cluster-axioms.md](cluster-axioms.md); detailed downstream implementation spec and blast-radius assessment in [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md). This is a proposed architecture, not an implemented RFC.
+**Relationship:** generalizes today's `omnigraph.yaml` graph/query/policy configuration surface ([CLI reference](../user/cli/reference.md), [server docs](../user/operations/server.md)) into a future cluster control plane. The distilled rules are in [cluster-axioms.md](cluster-axioms.md); detailed downstream implementation spec and blast-radius assessment in [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md). This is a proposed architecture, not an implemented RFC.

 > **Implementation status.** The examples below describe the full target schema.
 > Stage 2B only accepts the read-only subset documented in
-> [cluster-config.md](../user/cluster-config.md). Future-phase fields such as
+> [cluster-config.md](../user/clusters/config.md). Future-phase fields such as
 > `env_file`, `apply`, `providers`, `pipelines`, `embeddings`, `ui`, `aliases`,
 > and `bindings` are intentionally rejected with typed diagnostics until their
 > reconciler semantics are implemented.
--- a/docs/dev/execution.md
+++ b/docs/dev/execution.md
@@ -177,4 +177,4 @@ For all three modes, a mid-load failure (RI / cardinality violation, validation

 ## Embeddings during load

-If a node type has `@embed` properties, the loader calls the engine embedding client (Gemini, RETRIEVAL_DOCUMENT) per row to populate the vector column. See [embeddings.md](../user/embeddings.md).
+The loader does **not** embed `@embed` properties at load time. `@embed` is a catalog annotation consumed by query typecheck/lint; vectors are supplied directly in the load data, or pre-filled by the offline `omnigraph embed` pipeline. Query-time `nearest($v, "string")` auto-embeds the query string via the provider-independent embedding client. See [embeddings.md](../user/search/embeddings.md). (Ingest-time `@embed` execution is a planned RFC-012 phase.)
--- a/docs/dev/index.md
+++ b/docs/dev/index.md
@@ -20,13 +20,13 @@ constraints. User-facing behavior should still be documented through
 | Area | Read |
 |---|---|
 | System structure, L1/L2 framing, component diagrams | [architecture.md](architecture.md) |
-| On-disk layout, manifest schema, URI behavior | [storage.md](../user/storage.md) |
+| On-disk layout, manifest schema, URI behavior | [storage.md](../user/concepts/storage.md) |
 | Direct-publish writes, D2, staged writes, recovery sidecars | [writes.md](writes.md) |
 | Query execution, mutation execution, loader flow | [execution.md](execution.md) |
-| Index lifecycle and graph topology indexes | [indexes.md](../user/indexes.md) |
-| Branch and commit internals | [branches-commits.md](../user/branches-commits.md) |
+| Index lifecycle and graph topology indexes | [indexes.md](../user/search/indexes.md) |
+| Branch and commit internals | [branches-commits.md](../user/branching/index.md) |
 | Three-way merge implementation and conflicts | [merge.md](merge.md) |
-| Diff/change-feed implementation | [changes.md](../user/changes.md) |
+| Diff/change-feed implementation | [changes.md](../user/branching/changes.md) |
 | Branch protection policy | [branch-protection.md](branch-protection.md) |
 | CODEOWNERS source of truth | [codeowners.md](codeowners.md) |

@@ -34,14 +34,14 @@ constraints. User-facing behavior should still be documented through

 | Area | Read |
 |---|---|
-| Schema grammar, catalog, migration planner | [schema-language.md](../user/schema-language.md) |
-| Query grammar, IR, lints, mutation restrictions | [query-language.md](../user/query-language.md) |
-| Embedding client and `@embed` integration | [embeddings.md](../user/embeddings.md) |
-| Cedar policy surface and server gating | [policy.md](../user/policy.md) |
-| Server auth, OpenAPI, endpoint handlers | [server.md](../user/server.md) |
-| Error taxonomy and serialization | [errors.md](../user/errors.md) |
-| Constants and tunables | [constants.md](../user/constants.md) |
-| Transaction model public contract | [transactions.md](../user/transactions.md) |
+| Schema grammar, catalog, migration planner | [schema-language.md](../user/schema/index.md) |
+| Query grammar, IR, lints, mutation restrictions | [query-language.md](../user/queries/index.md) |
+| Embedding client and `@embed` integration | [embeddings.md](../user/search/embeddings.md) |
+| Cedar policy surface and server gating | [policy.md](../user/operations/policy.md) |
+| Server auth, OpenAPI, endpoint handlers | [server.md](../user/operations/server.md) |
+| Error taxonomy and serialization | [errors.md](../user/operations/errors.md) |
+| Constants and tunables | [constants.md](../user/reference/constants.md) |
+| Transaction model public contract | [transactions.md](../user/branching/transactions.md) |

 ## Project Operations

@@ -79,6 +79,9 @@ Working documents for in-flight feature work. Removed when the work lands.
 | Per-operator config — `~/.omnigraph/` identity, keyed credentials, named servers (the operator slice of RFC-002) | [rfc-007-operator-config.md](rfc-007-operator-config.md) |
 | Deprecate `omnigraph.yaml` — one concern per config surface; key-by-key migration map and staged retirement | [rfc-008-deprecate-omnigraph-yaml.md](rfc-008-deprecate-omnigraph-yaml.md) |
 | Unify CLI embedded/remote access paths — parity referee, shared wire-DTO crate, `GraphClient` trait, declared plane capabilities | [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md) |
+| Restructure the CLI around explicit planes — one graph-addressing model, declared capability surface, plane-grouped help (expands RFC-009 Phase 4) | [rfc-010-cli-planes-restructure.md](rfc-010-cli-planes-restructure.md) |
+| CLI refactoring — one addressing & config model post-`omnigraph.yaml`: scope + `--graph` + derived access path, served-default / privileged-direct, profiles, named queries, capability classifier (completes RFC-008) | [rfc-011-cli-refactoring.md](rfc-011-cli-refactoring.md) |
+| Provider-independent embedding configuration — one resolved `EmbeddingConfig` + sealed provider enum (Gemini/OpenAI/Mock), identity recorded in the schema IR, query-time same-space validation, NFR floor | [rfc-012-embedding-provider-config.md](rfc-012-embedding-provider-config.md) |

 ## Boundary

--- a/docs/dev/invariants.md
+++ b/docs/dev/invariants.md
@@ -15,6 +15,38 @@ Use it this way:
 - Keep implementation ledgers, roadmap detail, and historical MR notes in the
  per-area docs. This file is the filter, not the encyclopedia.

+## Governing principle: logical contract over physical state
+
+The hard invariants below are instances of one rule. Keep it in view whenever
+a change touches the boundary between what the graph *means* and how it is
+physically stored.
+
+> **Logical state is the contract. Physical state — index coverage, fragment
+> layout, compaction versions, staged writes — is derived, rebuildable, and may
+> be produced asynchronously. A physical operation must never fail a logical
+> one. Preconditions are checked against logical state; physical reconciliation
+> is idempotent and may lag or retry. Genuine logical conflicts still fail
+> loudly: the licence to lag covers physical convergence, not correctness.**
+
+Invariants that instantiate it: **2** (manifest-atomic visibility) and **5**
+(recovery is part of the commit protocol) — a partially-written physical layer
+never changes what a graph commit means; **7** (indexes are derived state) — a
+query is correct under partial index coverage, and expensive index work
+converges from manifest state instead of gating the write path; **13** (failures
+bounded and observable) — the licence to lag is not a licence to drop, so a
+physical step that cannot make progress is surfaced, not swallowed. Deny-list
+items that enforce it: synchronous inline vector/FTS index rebuilds on the
+commit path; state that drifts from Lance or the manifest when it can be
+derived; job queues for manifest-derivable state where a reconciler fits.
+
+The failure shape it rules out: a legitimate background operation on the
+physical layer (compaction, an index build, an interrupted staged write) is
+allowed to break a logical operation (a query's correctness, a migration's
+success, a branch's writability). The smell to watch for is a logical operation
+whose precondition is a *physical* fact — a cached file version, an index's
+existence, a fragment count. Make the precondition logical and let a reconciler
+converge the physical state.
+
 ## Hard Invariants

 1. **Respect the substrate.** Lance owns columnar storage, per-dataset
@@ -58,7 +90,7 @@ Use it this way:
   branch they read even when index coverage is partial. Expensive index work
   should converge from manifest state instead of extending the critical write
   path. Scalar staged index builds and vector inline residuals are documented
-   in [writes.md](writes.md) and [indexes.md](../user/indexes.md).
+   in [writes.md](writes.md) and [indexes.md](../user/search/indexes.md).

 8. **Schema identity survives renames.** Accepted schema identity must remain
   stable across type and property renames. Rename support belongs in migration
@@ -100,14 +132,14 @@ Use it this way:
 |---|---|---|
 | Multi-table commit | Manifest CAS plus recovery sidecars; not a single Lance primitive | [writes.md](writes.md), [architecture.md](architecture.md) |
 | Constructive mutations | In-memory `MutationStaging`, one end-of-query table commit per touched table, then one manifest publish | [writes.md](writes.md), [execution.md](execution.md) |
-| Deletes | Inline-commit residual; delete-only queries allowed, mixed insert/update/delete rejected by D2 | [query-language.md](../user/query-language.md), [writes.md](writes.md) |
-| Branch delete | Manifest is the single authority, flipped atomically first; per-table forks + commit-graph branch are derived state, reclaimed best-effort (`force_delete_branch`) with the `cleanup` reconciler as the guaranteed backstop. Reusing a name whose reclaim failed before `cleanup` surfaces an actionable error | [branches-commits.md](../user/branches-commits.md), [maintenance.md](../user/maintenance.md) |
-| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema-language.md), [execution.md](execution.md) |
-| Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema-language.md) |
+| Deletes | Inline-commit residual; delete-only queries allowed, mixed insert/update/delete rejected by D2 | [query-language.md](../user/queries/index.md), [writes.md](writes.md) |
+| Branch delete | Manifest is the single authority, flipped atomically first; per-table forks + commit-graph branch are derived state, reclaimed best-effort (`force_delete_branch`) with the `cleanup` reconciler as the guaranteed backstop. Reusing a name whose reclaim failed before `cleanup` surfaces an actionable error | [branches-commits.md](../user/branching/index.md), [maintenance.md](../user/operations/maintenance.md) |
+| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema/index.md), [execution.md](execution.md) |
+| Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema/index.md) |
 | Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) |
-| Index lifecycle | `ensure_indices` is explicit today; reconciler-based convergence is roadmap | [indexes.md](../user/indexes.md), [maintenance.md](../user/maintenance.md) |
-| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/query-language.md) |
-| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/server.md), [policy.md](../user/policy.md) |
+| Index lifecycle | `@index`/`@key` declares *intent*; the physical index is derived state and never fails a logical op. `schema apply` builds no indexes (records intent only; index-only changes touch no table data). `load`/`mutate` build inline through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) that fault-isolates an untrainable Vector column into a *pending* index instead of aborting. `optimize`/`ensure_indices` is the reconciler: it creates declared-but-missing indexes and folds appended/rewritten fragments into existing ones (`optimize_indices`), reporting still-pending columns. Explicit maintenance call, not yet a background loop | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
+| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/queries/index.md) |
+| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/operations/server.md), [policy.md](../user/operations/policy.md) |
 | Tests | Tempdir-backed Lance tests are the current substrate; the storage adapter has an in-memory backend for adapter-level contract tests, but Lance datasets bypass it | [testing.md](testing.md) |

 The branch-delete reconciler is authority-derived: it reclaims orphaned forks
@@ -132,13 +164,18 @@ them explicit.
  new writer cannot couple a write with a HEAD advance through the default
  surface. The dead legacy methods (`append_batch` on the trait,
  `merge_insert_batch{,es}`, `create_{btree,inverted}_index`) were removed. The
-  remaining residuals are `delete_where` (gated on MR-A — Lance v7.x bump)
-  and `create_vector_index` (gated on Lance #6666); see
-  [lance.md](lance.md) and [writes.md](writes.md). New write paths should use
-  the staged shape unless a documented Lance blocker applies.
+  remaining residuals are `delete_where` and `create_vector_index`. The Lance
+  6.0.1 → 7.0.0 bump landed, so the staged two-phase delete API
+  (`DeleteBuilder::execute_uncommitted`, Lance #6658) is now available and MR-A
+  is unblocked — but the migration itself is still pending, so `delete_where`
+  stays inline for now. `create_vector_index` remains gated on Lance #6666
+  (still open). See [lance.md](lance.md) and [writes.md](writes.md). New write
+  paths should use the staged shape unless a documented Lance blocker applies.
 - **Deletes and vector indexes:** `delete_where` and vector index creation still
-  advance Lance HEAD inline because the required public Lance APIs are missing.
-  Keep D2 and recovery coverage in place until those residuals are removed.
+  advance Lance HEAD inline. The public delete two-phase API now exists (Lance
+  #6658 shipped in 7.0.0), so the delete residual is unblocked pending the MR-A
+  migration; vector index creation is still blocked (Lance #6666 open). Keep D2
+  and recovery coverage in place until those residuals are removed.
 - **Blob-column compaction:** Lance `compact_files` mis-decodes blob-v2 columns
  under its forced `BlobHandling::AllBinary` read ("more fields in the schema
  than provided column indices"), so `optimize` skips any table with a `Blob`
@@ -160,6 +197,22 @@ them explicit.
  one-winner-CAS territory; closing this fully needs a cross-process
  serialization primitive (e.g. lease-based use of the schema-apply lock
  branch) — design it before promoting multi-process write topologies.
+- **Fork reclaim is in-process-safe only:** the first write to a table on a
+  branch forks it (a Lance `create_branch` that advances state before the
+  manifest publish). An interrupted fork (crash, or a cancelled request
+  future) leaves a manifest-unreferenced branch ref. The next write self-heals
+  it — `reclaim_orphaned_fork_and_refork` (`force_delete_branch` + re-fork)
+  — but reclaim is only safe because the writer holds the per-`(table,
+  branch)` write queue from before the fork through the publish AND re-checks
+  the live manifest under it, so no *in-process* writer can be mid-fork. A
+  reclaim cannot serialize against a foreign-*process* in-flight fork: it may
+  force-delete a peer's just-created ref, which makes that peer's commit fail
+  and retry — the same one-winner-CAS exposure as above, not corruption. The
+  reclaim never fires unless in-process-queue + manifest authority both prove
+  the ref is manifest-unreferenced. `cleanup`'s per-table reconciler
+  (`reconcile_orphaned_branches`) is the guaranteed backstop for any fork the
+  write path never revisits. Both degrade to a no-op if Lance ships an atomic
+  multi-dataset branch op.
 - **Local `write_text_if_match` is not a cross-process CAS:** object-store
  backends use a true conditional put (ETag If-Match; the in-memory test
  backend too), but upstream `object_store` leaves `PutMode::Update`
--- a/docs/dev/lance.md
+++ b/docs/dev/lance.md
@@ -156,7 +156,24 @@ If a future need pulls one of these into scope, add a row to the matching domain

 When Lance ships a major release that changes any of the above (file format bump, new index type, transaction semantics change, new branching primitive), refresh this index in the same change as the omnigraph upgrade. Stale Lance pointers are worse than no pointers.

-### Last alignment audit: 2026-05-22 (Lance 6.0.1 upstream; omnigraph pinned at 6.0.1)
+### Last alignment audit: 2026-06-15 (Lance 7.0.0 upstream; omnigraph pinned at 7.0.0)
+
+Migration from Lance 6.0.1 → 7.0.0 landed in this cycle. **Arrow stayed 58, DataFusion stayed 53** (no change) — the only transitive bump is `object_store` 0.12.5 → 0.13.2. 141 upstream commits reviewed (6.0.1 → 7.0.0); no fixes lost (the 6.0.x release-branch backports are all forward-ported into 7.0.0). Behavior-affecting findings:
+
+- **object_store 0.13 moved convenience methods behind a new `ObjectStoreExt` trait** (`get`/`put`/`head`/`rename`/`delete`; `list`/`list_with_delimiter`/`put_opts` stay on the core `ObjectStore` trait). Fix = add `use object_store::ObjectStoreExt;` to `storage.rs` and `db/manifest/namespace.rs`; no call-site changes. Mirrors Lance's own migration in PR #6672. The local-FS `PutMode::Update` gap is unchanged (still unimplemented upstream), so `storage.rs::write_text_if_match`'s local content-token emulation stays.
+- **`roaring` must be pinned to 0.11.4** (`cargo update -p roaring --precise 0.11.4`). Lance 7.0.0's `UpdatedFragmentOffsets` newtype (PR #6650) derives `Eq` over `HashMap<u64, RoaringBitmap>`, which needs `RoaringBitmap: Eq` — added only in roaring 0.11.4 (roaring-rs PR #341). Lance's loose `roaring = "0.11"` constraint otherwise resolves to 0.11.3, which lacks that impl, and **Lance itself fails to compile** (`RoaringBitmap: Eq is not satisfied`). roaring is transitive (no direct workspace dep); the pin lives only in `Cargo.lock`.
+- **`_row_created_at_version` for merge-insert INSERT rows now = the commit version** (PR #6774; was a fallback of 1 / dataset-creation version). Flipped `lance_version_columns.rs::lance_merge_insert_new_row_stamps_created_at_version` to assert `== v2`. Production change-detection keys on `_row_last_updated_at_version` + ID-set membership, so classification logic is unaffected (the `changes/mod.rs` rationale comment was corrected).
+- **BTREE range-query bound inclusiveness fixed** (PR #6796, issue #6792): `x <= hi AND x > lo` returned the wrong boundary row on 6.0.1. omnigraph today builds BTREE only on string `@key` columns (`id`/`src`/`dst`) and queries them by equality/IN, not range, so its *current* query patterns almost certainly never hit this bug — but the corrected boundary semantics are a contract we rely on the moment a BTREE-range path appears (BTREE-on-properties via the index-type tickets, or a range-on-key query). Pinned by `lance_surface_guards.rs::btree_range_query_boundary_is_correct` (reproduces #6792's 5-row + BTREE shape).
+- **`WriteParams::auto_cleanup` default flipped from on (every-20-commits) to `None`** (PR #6755). On 6.0.1 the on-by-default hook could GC versions the `__manifest` pins for snapshots/time-travel. omnigraph owns cleanup explicitly (`optimize.rs::cleanup_all_tables`). Two parts to the fix, because `auto_cleanup` is **create-time config only and has no effect on existing datasets** (Lance `write.rs` docs): (1) `auto_cleanup: None` at all 11 `WriteParams` sites so *new* datasets store no cleanup config; (2) — the load-bearing half — `skip_auto_cleanup: true` on every commit path, because graphs created **before** the bump still carry the on-config in their datasets, and Lance's hook fires off the *dataset's stored* config at commit time (`io/commit.rs`: `if !commit_config.skip_auto_cleanup`). So the staged commit path (`commit_staged` → `CommitBuilder::with_skip_auto_cleanup(true)`), the `__manifest` publisher (`MergeInsertBuilder::skip_auto_cleanup(true)`), and the direct `WriteParams` paths all skip the hook. Without this, an upgraded graph would still auto-cleanup and delete `__manifest`-pinned versions. Pinned by `lance_surface_guards.rs::skip_auto_cleanup_suppresses_version_gc` (negative control + with-skip survival).
+- **Lance #6658 SHIPPED in 7.0.0** (`DeleteBuilder::execute_uncommitted`, exposed via PR #6781) → MR-A (migrate `delete_where` to the staged two-phase API, retire the parse-time D2 rule) is now **unblocked**, tracked separately (dev-graph `iss-950`). The bump itself keeps `delete_where` inline; the `_compile_delete_result_field_shape` guard is left untouched until MR-A.
+- **The unenforced primary key is now immutable once set** (`lance::dataset::transaction`, ~L2472–2480: `if !primary_key_before.is_empty() && (writes_primary_key || primary_key_after != primary_key_before) → "the unenforced primary key is a reserved key and cannot be changed once set"`). omnigraph marks `__manifest.object_id` as the unenforced PK (`lance-schema:unenforced-primary-key`) for merge-insert row-level CAS — baked into `manifest_schema()` at init, and added by the `migrate_v1_to_v2` internal-schema migration for pre-v0.4.0 graphs. The migration relied on Lance 6's idempotent re-apply for crash-recovery (a crash after the field-set but before the stamp bump re-enters the migration with the PK already present); under v7 that re-apply errors, so a real v1 graph could never finish migrating. Fixed by guarding the set on the manifest's unenforced-PK field (`db/manifest/migrations.rs::migrate_v1_to_v2`): `["object_id"]` → no-op, `[]` → set, any other PK field → loud refusal (the wrong CAS key, unchangeable under v7). Pinned by `lance_surface_guards.rs::unenforced_primary_key_is_immutable_once_set` (red if Lance relaxes immutability); regression: `db::manifest::tests::test_publish_migrates_pre_stamp_manifest_to_current_version` (was red under v7).
+- **Native `DirectoryNamespace` no longer recognizes omnigraph's manifest-tracked tables** (`lance-namespace-impls` dir.rs ~L1310): `list/describe/create_table_version` route through `check_table_status`, which reports an omnigraph table absent → `TableNotFound`. The decoupling is *contingent on omnigraph's legacy boolean PK key*, not an unconditional v7 property: v7's namespace eagerly adds the new `lance-schema:unenforced-primary-key:position` key to any `__manifest` lacking it; that write hits the immutable-PK rule above (the boolean key already set the PK), so `ensure_manifest_table_up_to_date` errors and the namespace silently falls back to directory listing. omnigraph keeps the boolean key deliberately — Lance honors it permanently (maps to PK position 0), and one uniform on-disk format beats a new-vs-old split (existing graphs can't be re-keyed to the position key under that same immutability rule). omnigraph production never uses Lance's native namespace (its publisher writes `__manifest` directly via merge_insert; its own `namespace.rs` impls are custom), so this is test-only — the `test_directory_namespace_direct_publish_cannot_replace_native_omnigraph_write_path` surface guard was realigned to the v7 behavior (it now asserts the native namespace is fully decoupled, which only strengthens the guard's thesis).
+- **Still NOT fixed in 7.0.0:** (1) vector-index two-phase (Lance #6666 is still open), so the `create_vector_index` inline residual is retained; (2) blob-column compaction, so the `compact_files_still_fails_on_blob_columns` guard is unchanged (it turns red only once Lance fixes the bug) and `optimize` still skips blob tables behind `LANCE_SUPPORTS_BLOB_COMPACTION`.
+- **No Lance API surface omnigraph uses changed at *compile* time** (the only compile break was object_store) — but **two runtime behaviors did** (the unenforced-PK immutability and the native-namespace `TableNotFound`, above), each caught by the full engine test suite rather than the build. `CleanupPolicy`, `WriteParams` (apart from the `auto_cleanup` default), `CompactionOptions`, the namespace models (resolved via `lance-namespace-reqwest-client` 0.7.7, unchanged across the bump), `Operation`, `ManifestLocation`, and `MergeInsertBuilder` shapes are all stable. Lesson: a clean build is not a clean alignment — run `cargo test --workspace` before declaring a Lance bump done.
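
The three-way guard on the unenforced primary key described in the list above can be sketched as a small pure function. This is a minimal illustration, not the code in `db/manifest/migrations.rs`: `PkGuard` and `guard_unenforced_pk` are hypothetical names, and the input is assumed to be the list of field names currently marked as the unenforced PK.

```rust
/// Outcome of the guard that decides whether the migration may set the
/// unenforced primary key on `__manifest`. (Hypothetical type.)
#[derive(Debug, PartialEq, Eq)]
pub enum PkGuard {
    /// `object_id` is already the PK (for example on re-entry after a crash).
    AlreadySet,
    /// No PK is recorded yet, so the migration sets it.
    SetNow,
}

/// Hypothetical helper: `current_pk` lists the fields currently marked as
/// the unenforced primary key.
pub fn guard_unenforced_pk(current_pk: &[&str]) -> Result<PkGuard, String> {
    match current_pk {
        // Idempotent re-entry: skip the write that Lance 7 would reject.
        ["object_id"] => Ok(PkGuard::AlreadySet),
        // Fresh v1 manifest: safe to set the key exactly once.
        [] => Ok(PkGuard::SetNow),
        // Any other key is the wrong CAS key and cannot be changed: refuse.
        other => Err(format!(
            "refusing to migrate: unenforced primary key is {:?}, expected [\"object_id\"]",
            other
        )),
    }
}
```

Treating `["object_id"]` as a no-op is what lets a crashed migration re-enter safely under v7, where re-applying the key would otherwise error.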
+
+Bump this date stanza on the next alignment pass.
+
+### Prior alignment audit: 2026-05-22 (Lance 6.0.1 upstream; omnigraph pinned at 6.0.1)

 Migration from Lance 4.0.0 → 6.0.1 landed in this cycle (DataFusion 52 → 53, Arrow 57 → 58, lance-tokenizer 6.0.1 added, tantivy* removed). Direct 4 → 6 jump; v5.x was not used as an intermediate (rationale in `~/.claude/plans/shimmering-percolating-duckling.md`). Behavior-affecting findings:

@ -169,6 +186,7 @@ Migration from Lance 4.0.0 → 6.0.1 landed in this cycle (DataFusion 52 → 53,
 - **`Dataset::checkout_version(N).await?.restore().await?`**: `restore()` takes `&mut self` and returns `Result<()>` (mutates in place, does not consume + return a new dataset). The recovery rollback hammer at `db/manifest/recovery.rs:505-522` continues to work. Pinned by `lance_surface_guards.rs::_compile_checkout_version_then_restore_signature`.
 - **`DatasetBuilder::from_namespace(...).with_branch(...).with_version(...).load()`** surface preserved (the namespace builder chain at `db/manifest/namespace.rs:162-174`). Pinned by `lance_surface_guards.rs::_compile_dataset_builder_from_namespace_signature`.
 - **`compact_files(&mut ds, CompactionOptions::default(), None)`** signature stable. `CompactionOptions` still does not expose `data_storage_version`; `compact_files` builds its own `WriteParams { ..Default::default() }`. Note: `LanceFileVersion::default()` is now V2_1 in v6, so optimize-rewritten fragments come out at V2_1 by default (was V2_0 in v4). Existing explicit V2_2 pins on creates/appends still apply.
+- **`Dataset::optimize_indices(&mut self, &lance_index::optimize::OptimizeOptions)`** (via `DatasetIndexExt`) is a depended-on surface as of the index-coverage work: `db/omnigraph/optimize.rs` calls it after `compact_files` to fold appended/rewritten fragments into existing indexes (incremental merge, not retrain). It is a **committing** call (mutates in place, advances HEAD; no uncommitted variant in v6.0.1), so optimize treats it as an inline-commit residual under the `SidecarKind::Optimize` recovery sidecar. Signature pinned by `lance_surface_guards.rs::_compile_optimize_indices_signature`; the incremental-coverage behavior pinned by `optimize_indices_extends_fragment_coverage` (appended fragment uncovered before, covered after).
 - **`Dataset::delete(predicate)` returns `DeleteResult { new_dataset: Arc<Dataset>, num_deleted_rows: u64 }`** — unchanged shape. Pinned by `lance_surface_guards.rs::_compile_delete_result_field_shape`. MR-A will repurpose this guard to the staged two-phase variant once `DeleteBuilder::execute_uncommitted` migration lands.
 - **File reader read methods now async** (Lance PR #6710, v6.0). No effect — omnigraph reaches Lance exclusively through `Dataset::scan` and the staged-write API.
 - **Tokenizer vendored as `lance-tokenizer`** (Lance PR #6512, v6.0). No effect — no direct tokenizer imports.
@ -178,6 +196,4 @@ Migration from Lance 4.0.0 → 6.0.1 landed in this cycle (DataFusion 52 → 53,
 - **`Dataset::force_delete_branch`** (`branches().delete(name, force=true)`, dataset.rs:524) tolerates a missing branch-*contents* ref (vs plain `delete_branch`'s `RefNotFound`), but on the local store still errors `NotFound` if the branch `tree/` directory is fully absent (`remove_dir_all`'s NotFound is not caught for Lance's native error variant, refs.rs:526-549). Both variants still refuse a branch with referencing descendants (`RefConflict`). `TableStore::force_delete_branch` wraps this to be fully idempotent (tolerates already-absent). The single-authority branch-delete redesign uses it for orphan reclamation (eager best-effort reclaim + cleanup reconciler). Pinned by `lance_surface_guards.rs::force_delete_branch_semantics`. Branch delete is "flip the ref atomically, then `remove_dir_all(tree/{branch})`"; branch-exclusive data lives under `tree/{branch}/` so a drop reclaims it immediately without touching `main`.
 - **Lance blob-v2 `compact_files` bug** (no public issue found as of 2026-06): `compact_files` disables binary-copy for blob datasets and forces `BlobHandling::AllBinary` on the read side; the v2.1+ structural decoder then mis-counts column infos for the blob-v2 struct and fails with `Invalid user input: there were more fields in the schema than provided column indices / infos` (`lance-encoding/src/decoder.rs::ColumnInfoIter::expect_next`). This fails even a pristine uniform-V2_2 multi-fragment blob table; vector/list/scalar/ragged columns and mixed file versions all compact fine. Reads/queries use descriptor handling (`BlobHandling::default()`) and are unaffected. `optimize` skips blob-bearing tables behind `LANCE_SUPPORTS_BLOB_COMPACTION = false` (`db/omnigraph/optimize.rs`), reporting `SkipReason::BlobColumnsUnsupportedByLance`. Pinned by `lance_surface_guards.rs::compact_files_still_fails_on_blob_columns`, which turns red when the bug is fixed → flip the gate, remove the skip branch + the `maintenance.rs::optimize_skips_blob_table_and_reports_skip` skip assertions.

-Surface guards added: `crates/omnigraph/tests/lance_surface_guards.rs` (10 named guards; 5 runtime + 5 compile-only). Future Lance bumps re-run this file first as the smoke check. Two additional guards from the original plan deferred to follow-up (`manifest_cas_returns_row_level_contention_variant` needs full publisher-race harness; `table_version_metadata_byte_compatible_with_v4` needs `pub(crate)` reach extension).
-
-Bump this date stanza on the next alignment pass.
+Surface guards added: `crates/omnigraph/tests/lance_surface_guards.rs` (10 named guards; 5 runtime + 5 compile-only; plus the index-coverage work's `_compile_optimize_indices_signature` and `optimize_indices_extends_fragment_coverage`). Future Lance bumps re-run this file first as the smoke check. Two additional guards from the original plan deferred to follow-up (`manifest_cas_returns_row_level_contention_variant` needs full publisher-race harness; `table_version_metadata_byte_compatible_with_v4` needs `pub(crate)` reach extension).
--- a/docs/dev/rfc-001-queries-envelope-mcp.md
+++ b/docs/dev/rfc-001-queries-envelope-mcp.md
@ -348,4 +348,4 @@ Callers move at their own pace. The envelope upgrades + URL rename ship in v0.6.
 - RFC 8288 (`Link` relations, `successor-version`)
 - MCP spec: [modelcontextprotocol.io](https://modelcontextprotocol.io)
 - [invariants.md](./invariants.md) — substrate boundaries this work respects
- [../user/server.md](../user/server.md) — current HTTP surface (post-MR-656 picks up the `/query`+`/mutate` rename and deprecation)
+- [../user/server.md](../user/operations/server.md) — current HTTP surface (post-MR-656 picks up the `/query`+`/mutate` rename and deprecation)
--- a/docs/dev/rfc-009-unify-access-paths.md
+++ b/docs/dev/rfc-009-unify-access-paths.md
@ -68,7 +68,7 @@ anything moves — mirroring the storage collapse, where the pinned contract
 tests gated the swap, and the test-monolith modularization (#192/#193), which
 makes Phase 3 tractable: the CLI dispatch is 1,184 lines today, not 4,200.

-### Phase 1 — Parity matrix (the referee; do first, no refactor)
+### Phase 1 — Parity matrix (the referee; do first, no refactor) *(landed)*

 A CLI integration test (extend the `system_local.rs` harness, which already
 spawns both binaries): one fixture graph; for every forked verb, run the
@ -81,7 +81,16 @@ This pins today's behavior so Phase 3 can't silently change it, and catches
 every future fork drift. It also incidentally covers utoipa annotation↔route
 mismatches (a lying `#[utoipa::path]` makes the remote leg 404).

-### Phase 2 — One wire-DTO crate
+**Phase 1 outcome (landed):** `crates/omnigraph-cli/tests/parity_matrix.rs`
+— 11 rows green with an **empty divergence ledger**: with matched Cedar
+policy on both arms, embedded and remote agree on every forked verb's
+scrubbed JSON and exit codes. Two findings along the way: like-for-like
+requires the same policy bundle on both arms (a tokens-only server is
+default-deny by design — the harness encodes this), and inline execution's
+unbound-param matches-all vs the invoke path's hard error is a cross-path
+asymmetry, filed as #207 and pinned (not repaired) by the matrix.
+
+### Phase 2 — One wire-DTO crate *(landed)*

 Move the HTTP request/response types and the single `engine result → DTO`
 mapping per verb into a shared crate (working name `omnigraph-api-types`),
@ -113,6 +122,15 @@ neither axum nor the engine's internals. The engine crate does not depend on
 it — the `engine result → DTO` mapping lives in the shared crate (or the CLI/
 server side), taking engine result types as input.

+**Phase 2 outcome (landed):** `crates/omnigraph-api-types` holds the wire
+DTOs + their `engine-result → DTO` mappings; `omnigraph-server::api` is a
+`pub use` re-export (so `openapi.json` is byte-identical — the referee
+passed with zero diff), and the CLI consumes the crate directly. One
+deliberate refinement of the original sketch: `LoadOutput` is a rendered
+CLI output type, not a wire DTO, so it stayed CLI-side — both its mappings
+(local `LoadResult`, remote `IngestOutput`) now sit together in
+`output.rs`. The parity matrix passed textually unchanged.
+
 ### Phase 3 — `GraphClient` trait, two implementations

 ```text
@ -143,15 +161,20 @@ and cluster commands must work with the server down) explicit in code.
 "Server" targets include operator-config named servers (RFC-007), not only
 literal `http(s)://` URIs.

-### Phase 5 — Route alignment
+### Phase 5 — Route alignment *(landed)*

-Add a canonical `/load` endpoint (the handler already exists behind the
-`/ingest` shim); point `RemoteClient` at it; keep `/ingest` on its existing
-deprecation path. While here, check whether the server uses `utoipa-axum`'s
-router-coupled registration (`OpenApiRouter`/`routes!`); if it hand-mounts
-routes beside `#[utoipa::path]` annotations, prefer migrating registration so
-path annotations and mount points are the same declaration (the modularization
-already hit one orphaned-attribute incident of exactly this class).
+Added a canonical `POST /load` (shared `run_ingest` body; the deprecated
+`/ingest` is now a thin alias carrying `#[deprecated]` + RFC 9745/8288
+`Deprecation`/`Link: </load>` headers, exactly mirroring `/mutate`↔`/change`)
+and pointed the CLI's remote `load` arm at it; `/ingest` stays on its
+deprecation path. `/load` reuses `IngestRequest`/`IngestOutput` (as canonical
+`/mutate` reuses `Change*`); a DTO rename is a separate change.
+
+Registration finding: the server **hand-mounts** routes (`.route(...)`) beside a
+manual `#[openapi(paths(...))]` list, not `utoipa-axum`'s `OpenApiRouter`/
+`routes!`. This PR followed the existing manual pattern (one `.route` + one
+`paths(...)` entry + the `#[utoipa::path]` annotation) rather than migrating
+registration — the migration is a worthwhile but orthogonal cleanup, deferred.
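
The deprecation headers carried by the `/ingest` alias can be illustrated with a small helper. This is a sketch only: `deprecation_headers` is a hypothetical name, the timestamp is a placeholder, and the `successor-version` relation is assumed from the RFC 8288 reference in RFC-001 rather than confirmed against the server code. RFC 9745 encodes the `Deprecation` value as a structured-field date, written as `@` followed by Unix seconds.

```rust
/// Build the response headers for a deprecated alias route.
/// (Hypothetical helper; the real handler may assemble these differently.)
pub fn deprecation_headers(
    deprecated_at_unix: i64,
    successor_path: &str,
) -> Vec<(String, String)> {
    vec![
        // RFC 9745: structured-field Date, i.e. "@<unix-seconds>".
        ("Deprecation".to_string(), format!("@{}", deprecated_at_unix)),
        // RFC 8288: point callers at the canonical replacement route.
        (
            "Link".to_string(),
            format!("<{}>; rel=\"successor-version\"", successor_path),
        ),
    ]
}
```

Calling `deprecation_headers(1781568000, "/load")` yields a `Deprecation` value of `@1781568000` and a `Link` value that names `/load` as the successor.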

 ## Non-goals

--- a/docs/dev/rfc-010-cli-planes-restructure.md
+++ b/docs/dev/rfc-010-cli-planes-restructure.md
@ -0,0 +1,449 @@
+# RFC: Restructure the CLI Around Explicit Planes
+
+**Status:** Proposed
+**Date:** 2026-06-13
+**Audience:** CLI/server/cluster maintainers
+**Builds on:** [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md)
+(Phases 3a–3c landed — the embedded/remote data-plane fork is now one
+`GraphClient` enum; this RFC **expands RFC-009 Phase 4** from a narrow
+embedded-vs-remote capability table into the full plane model, and leaves
+Phase 5 route alignment where it is),
+[rfc-007-operator-config.md](rfc-007-operator-config.md) (operator
+`--server`/`--graph`/`--target` addressing — the surfaces this RFC makes
+uniform across planes),
+[rfc-008-deprecate-omnigraph-yaml.md](rfc-008-deprecate-omnigraph-yaml.md).
+**Sequencing:** post-v0.7.0, after RFC-009 Phase 3c (done).
+
+## Summary
+
+The CLI silently spans **three planes** — data, storage/maintenance, and
+control — and forces the operator to know which plane each verb lives on *and*
+address a graph differently per plane. The same graph you query as
+`--server prod --graph knowledge` you must maintain as
+`s3://bucket/knowledge.omni`. Plane restrictions (`graphs list` is server-only,
+`optimize` is storage-only) are *accidental* — discovered by hitting a cryptic
+error, not *declared*.
+
+This RFC makes the plane model **explicit and coherent** with three moves:
+
+1. **One graph-addressing model** across every verb (`--target`/`--graph`/
+   positional URI/`--server`), resolving to a storage URI for maintenance and a
+   remote client for data — instead of two different ways to name one graph.
+2. **A declared, per-subcommand capability surface** (RFC-009 Phase 4): each
+   verb declares its plane(s); wrong-plane invocations get an honest "this is
+   storage-plane, `--server` doesn't apply" error from one table, not scattered
+   `bail!`s.
+3. **Plane-grouped `--help`** so the model is legible at a glance.
+
+No new server feature. Storage maintenance stays off the wire — deliberately.
+
+## Current state of affairs
+
+The CLI has 23 top-level commands. They divide into three planes, addressed
+three different ways:
+
+| Plane | Verbs | Reaches the graph by | Addressing surface |
+|---|---|---|---|
+| **Data** | `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show/apply` (and `graphs list`, **remote-only today** — see note) | embedded engine **or** HTTP server (one `GraphClient`) | positional URI **or** `--target` / `--graph` / `--server` (config aliases) |
+| **Storage / maintenance** | `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `queries validate` | embedded engine **only**, directly on storage (`file://` or `s3://`) | positional URI **or** `--target` — **no `--server` / `--graph`** (except `init`, which today takes **only a required positional URI** — no `--target`) |
+| **Control** | `cluster validate/plan/apply/approve/status/refresh/import/force-unlock` | a cluster **directory** (`file://` or `s3://`), not a graph URI | `--config <dir>` |
+
+### What's confusing (validated facts)
+
+1. **Two names for one graph.** Data verbs resolve `--server prod --graph
+   knowledge` through `GraphClient::resolve*` (the embedded/remote fork collapsed
+   in RFC-009 Phases 3a–3c; only the two `GraphClient` factories call
+   `apply_server_flag`). Maintenance verbs instead use
+   `resolve_uri`/`resolve_local_uri` and accept only a positional URI or
+   `--target` — so to compact the graph you *query* as `--server prod --graph
+   knowledge` you must *type* `s3://bucket/knowledge.omni`. One graph, two
+   addressing vocabularies.
+
+   > **Note (`graphs list`).** It is routed through `GraphClient` only to share
+   > the addressing/token resolver; its embedded arm fails loudly, so it is
+   > **remote-only today** (the later capability table and *Relationship to
+   > RFC-009* record it as remote-now / embedded-cluster-later).
+
+2. **Plane restrictions are accidental, not declared.** `graphs list` is
+   server-only and `optimize`/`repair`/`cleanup`/`init` are storage-only purely
+   by code shape. Point `optimize` at an `https://` URL and you get whatever
+   `Omnigraph::open` says about an https URI — accidental error text that, per
+   Hyrum's Law, is already someone's dependency. The capability is real but
+   unstated.
+
+3. **The split is per-subcommand, and the family names hide it.** `schema plan`
+   is storage-only (`resolve_local_uri`) while `schema show`/`schema apply` are
+   data-plane (the graph client). `queries validate` opens the graph to
+   typecheck while `queries list` only reads the registry config. The plane is
+   a property of the *subcommand*, not the family.
+
+4. **Maintenance has no server/cluster counterpart at all.** There is no HTTP
+   route and no `cluster` subcommand for `optimize`/`cleanup`/`repair` (verified:
+   nothing in the server route table, nothing in `omnigraph-cluster/src`). For a
+   server-backed deployment you run the *same CLI* against the storage URI,
+   out-of-band from the serving process. This is correct (maintenance is
+   heavyweight, destructive, single-operator — it should not be a multi-tenant
+   HTTP surface), but it is **undocumented in the CLI's own shape**, so it reads
+   as an omission rather than a decision.
+
+5. **`init` has a hidden control-plane twin.** Bare `init` creates a single
+   graph from storage; in cluster mode the equivalent is `cluster apply`
+   (graph-creation stage, with ledger/recovery/approval semantics). Same intent,
+   two entry points, no signpost between them.
+
+6. **Flat `--help`.** All 23 commands are listed as one undifferentiated block, so the
+   plane a verb belongs to is tribal knowledge.
+
+The net effect: a new operator must already know OmniGraph's plane architecture
+to predict which flags work on which verb and how to name a graph. The CLI does
+not teach its own model.
+
+## Target CLI ergonomics
+
+The throughline: **you name a graph one way, and the CLI tells you what works
+where.** Simple examples of the end state:
+
+### One name for a graph, everywhere
+
+A config target `knowledge` works on every verb that touches that graph:
+
+```bash
+omnigraph query    --target knowledge --query q.gq          # data (embedded or remote, auto)
+omnigraph load     --target knowledge --data rows.jsonl     # data
+omnigraph optimize --target knowledge                       # maintenance (resolves to its storage URI)
+omnigraph cleanup  --target knowledge --keep 10 --confirm
+omnigraph repair   --target knowledge --confirm
+```
+
+The positional URI form still works everywhere, unchanged:
+
+```bash
+omnigraph optimize s3://bucket/knowledge.omni
+```
+
+### Data plane: same command, embedded or remote
+
+You don't pick "local vs server" syntax — resolution decides:
+
+```bash
+omnigraph query ./local.omni                     --query q.gq   # opens engine directly
+omnigraph query --server prod --graph knowledge  --query q.gq   # over HTTP
+omnigraph query --target knowledge               --query q.gq   # whichever the config says
+```
+
+### Maintenance: `--target` must resolve to direct storage (loud if not)
+
+```bash
+$ omnigraph optimize --target prod
+error: `--target prod` resolves to a remote server (https://prod…).
+       `optimize` is a storage-plane command and needs direct storage access.
+       Pass the graph's s3://… URI, or use --cluster <dir> --graph <id>.
+```
+
+Cluster-managed graphs get an explicit, intentional path (no implicit
+`cluster.yaml` peeking):
+
+```bash
+omnigraph optimize --cluster ./cluster --graph knowledge
+```
+
+### Wrong-plane = one honest, stable error
+
+```bash
+$ omnigraph optimize --server prod
+error: `optimize` is a storage-plane command; `--server` addresses the data
+       plane and does not apply here. Use --target <name> or a storage URI.
+
+$ omnigraph graphs list ./local.omni
+error: `graphs list` needs a remote multi-graph server (http/https) today.
+       (Embedded cluster-catalog enumeration is planned — RFC-009.)
+```
+
+### `--help` teaches the model
+
+```
+DATA PLANE        run against a graph (embedded or --server)
+  query  mutate  load  branch  snapshot  export  commit  schema show  schema apply
+
+STORAGE / MAINTENANCE   direct storage access; no server
+  init  optimize  repair  cleanup  schema plan  queries validate
+
+CONTROL PLANE     manage a cluster directory
+  cluster
+
+INSPECT / SESSION
+  graphs list  queries list  lint  policy  embed  login  logout  config
+```
+
+### Exceptions, signposted (not silent)
+
+```bash
+omnigraph init --schema s.pg ./new.omni            # plain path: fine
+
+$ omnigraph init --target knowledge --schema s.pg  # cluster-managed target: redirected
+error: `knowledge` is a cluster-managed graph. Create it via `cluster apply`
+       (which records ledger + recovery + approvals), not `init`.
+```
+
+**In one line:** one way to name a graph, the right flags accepted per verb, and
+a CLI that tells you its planes instead of making you memorize them.
+
+## Proposed shape (mechanism)
+
+### One addressing model for every graph-addressing verb
+
+Route **all** graph-addressing verbs — data *and* maintenance — through one
+resolver that turns `(positional URI | --target | --graph | --server)` into
+either a **storage URI** (`file://`/`s3://`) → embedded execution, or a **remote
+`GraphClient`** → HTTP execution, per the verb's declared plane.
+
+**Authority rule (the precedence must not be silent).** `--target` is an
+operator/legacy target lookup; `cluster.yaml` is a *different* authority surface
+(read only by `cluster` commands and `--cluster` boot). A maintenance verb must
+not quietly consult both and invent a precedence. The rule:
+
+- A maintenance verb's `--target` resolves through the **operator/legacy**
+  config and its URI must already be **direct storage**; a target that resolves
+  to a remote (`http(s)://`) URL **fails loudly** (see the example above).
+- **Cluster-managed graphs are addressed explicitly** via a cluster-root +
+  graph-id pair (spelled `--cluster <dir> --graph <id>` for illustration), so
+  reading cluster state is an intentional mode — never an implicit fallback
+  between operator config and `cluster.yaml`.
+
+  > **Flag-shape caveat (deferred).** `--graph` is *already* a global flag that
+  > `requires = "server"` and appends `/graphs/<id>` to a **remote** URL — a
+  > different meaning, and clap won't permit `--graph` without `--server`. So the
+  > cluster-maintenance addressing needs either a distinct flag (e.g.
+  > `--cluster-graph <id>`) or an explicit global-flag migration. This is why
+  > the cluster-managed resolver is **deferred to a later slice** (it also rides
+  > the applied-state-vs-declared-config open question below); the
+  > operator/legacy `--target` path lands first.
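
The operator/legacy half of the authority rule can be sketched as follows. This is an illustration under stated assumptions, not the CLI's resolver: `Plane`, `Resolved`, and `resolve_target` are hypothetical names, the config lookup from target name to URI is assumed to have already happened, and the cluster-managed path is left out because it is deferred.

```rust
/// Plane a verb is declared on (hypothetical names for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Plane {
    Data,
    Storage,
}

/// Result of resolving a graph address.
#[derive(Debug, PartialEq, Eq)]
pub enum Resolved {
    /// Open the engine directly on `file://` or `s3://` storage.
    Embedded { storage_uri: String },
    /// Talk to a server over HTTP.
    Remote { base_url: String },
}

fn is_remote(uri: &str) -> bool {
    uri.starts_with("http://") || uri.starts_with("https://")
}

/// `target` is the operator/legacy target name; `uri` is what it maps to.
pub fn resolve_target(
    verb: &str,
    plane: Plane,
    target: &str,
    uri: &str,
) -> Result<Resolved, String> {
    match (plane, is_remote(uri)) {
        // Maintenance verbs never fall back to a server: fail loudly.
        (Plane::Storage, true) => Err(format!(
            "`--target {}` resolves to a remote server ({}). `{}` is a storage-plane command and needs direct storage access.",
            target, uri, verb
        )),
        (Plane::Data, true) => Ok(Resolved::Remote { base_url: uri.to_string() }),
        (_, false) => Ok(Resolved::Embedded { storage_uri: uri.to_string() }),
    }
}
```

Keeping the remote check in one place means a data verb and a maintenance verb given the same target differ only in the declared plane, which is the property the rule is after.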
+
+### A declared, per-subcommand capability surface (RFC-009 Phase 4, expanded)
+
+One table, **per subcommand** (family-level rows hide exactly the cases the
+table exists to make non-accidental):
+
+| Command | Data (embedded) | Data (remote) | Storage (direct) | Config / session | Notes |
+|---|---|---|---|---|---|
+| `query`, `mutate`, `load`, `ingest` | ✅ | ✅ | — | — | `ingest` is the deprecated alias of `load` |
+| `branch create/list/delete/merge` | ✅ | ✅ | — | — | |
+| `snapshot`, `export`, `commit list/show` | ✅ | ✅ | — | — | |
+| `schema show` | ✅ | ✅ | — | — | |
+| `schema apply` | ✅ | ✅ | — | — | declarative alternative: `cluster apply` |
+| `schema plan` | — | — | ✅ | — | local resolver today |
+| `queries validate` | — | — | ✅ | — | opens the graph to typecheck |
+| `init` | — | — | ✅ | — | cluster-managed graphs → `cluster apply` |
+| `optimize`, `repair`, `cleanup` | — | — | ✅ | — | |
+| `graphs list` | (later) | ✅ | — | — | remote today; embedded-cluster later (RFC-009) |
+| `queries list` | — | — | — | ✅ | reads the registry config; no graph |
+| `lint` | — | — | ✅ | ✅ | `--schema` file, or opens a local graph |
+| `policy validate/test/explain` | — | — | — | ✅ | reads policy files + config |
+| `embed` | — | — | — | ✅ | local tooling (files + embedding API) |
+| `login`, `logout`, `config`, `version` | — | — | — | ✅ | session / config; no graph |
+
+The resolver consults this table. A wrong-plane invocation produces one honest,
+stable message instead of N ad-hoc `bail!`s and accidental `open` errors.
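
One way to picture the table as code is a single constant that every verb is checked against. The sketch below is hypothetical (`Access`, `Capability`, `CAPABILITIES`, and `check` are illustrative names, and only a few rows of the table are included); the location and shape of the real table is still an open question in this RFC.

```rust
/// How an invocation wants to reach the graph (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    DataEmbedded,
    DataRemote,
    StorageDirect,
    ConfigSession,
}

/// One row of the per-subcommand capability table.
pub struct Capability {
    pub command: &'static str,
    pub allowed: &'static [Access],
}

/// A subset of the rows, transcribed from the table above.
pub const CAPABILITIES: &[Capability] = &[
    Capability { command: "query", allowed: &[Access::DataEmbedded, Access::DataRemote] },
    Capability { command: "schema plan", allowed: &[Access::StorageDirect] },
    Capability { command: "optimize", allowed: &[Access::StorageDirect] },
    // Remote today; the embedded arm is planned for later.
    Capability { command: "graphs list", allowed: &[Access::DataRemote] },
    Capability { command: "queries list", allowed: &[Access::ConfigSession] },
];

/// Look the command up once and produce one stable error on a mismatch.
pub fn check(command: &str, requested: Access) -> Result<(), String> {
    let cap = CAPABILITIES
        .iter()
        .find(|c| c.command == command)
        .ok_or_else(|| format!("unknown command `{}`", command))?;
    if cap.allowed.contains(&requested) {
        Ok(())
    } else {
        Err(format!(
            "`{}` does not support {:?}; it is declared for {:?}",
            command, requested, cap.allowed
        ))
    }
}
```

Because every wrong-plane message comes from `check`, the error text can be pinned in one test rather than at each call site.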
+
+### Plane-grouped `--help`
+
+Group the command list by plane (the `--help` block shown under Target CLI
+ergonomics). Cosmetic, zero behavior change, highest legibility-per-line.
+
+### Maintenance stays off the wire (decision, not omission)
+
+This RFC **does not** add server routes for `optimize`/`cleanup`/`repair`:
+
+- **Serving = the server.** Multi-tenant, safe-for-many-callers data plane.
+- **Storage maintenance = the CLI against storage**, addressed uniformly,
+  run by an operator or a scheduled job with storage access.
+
+Adding maintenance-over-HTTP would re-introduce a heavyweight, destructive
+multi-tenant surface and *add* a plane rather than clarify the three we have.
+A future cluster-driven maintenance reconciler (scheduled compaction/GC as a
+control-plane policy) is explicitly **out of scope** — net-new design (who runs
+it, with what resource bounds), not a CLI restructure.
+
+### `init` is an explicit exception (decision)
+
+Direct-storage `init` against a plain URI/target stays. But if a target resolves
+to a **cluster-managed** graph root, `init` **refuses and signposts** `cluster
+apply` (which records ledger, recovery, and approval artifacts) rather than
+initializing that root out of band. This closes the "hidden twin" of the current
+state.
+
+## Compatibility
+
+Additive and low-risk:
+
+- **`--target`/`--graph` on maintenance verbs** is a new capability; the positional
+  URI form keeps working unchanged.
+- **Grouped `--help`** is cosmetic.
+- **Capability-surface error text** changes the message you get on a wrong-plane
+  or misaddressed invocation. Per Hyrum's Law that text is observable; the change
+  is deliberate, release-noted, and replaces an *accidental* `Omnigraph::open`
+  string with a *stable, declared* one — a net improvement, but flagged.
+
+No engine, server, or wire-protocol change. The work is CLI-internal: the shared
+resolver, the capability table, and help grouping.
+
+## Test plan
+
+Extend the existing CLI suites rather than adding a duplicate harness:
+
+- **`parity_matrix.rs`** — capability exclusions (the per-subcommand plane table
+  becomes the source of truth for which verbs are remote-only / storage-only).
+- **`cli_data.rs`** — maintenance wrong-plane errors (`optimize --server`,
+  `optimize --target <remote>`), and `--target` resolving to direct storage.
+- **`cli_schema_config.rs`** — `graphs list` plane behavior, `schema plan`
+  vs `schema show/apply` plane split, and plane-grouped `--help` output.
+- **`system_local.rs`** — `--server` / operator-targeting edge cases end-to-end.
+
+Pin the new wrong-plane error strings deliberately: this RFC is intentionally
+replacing accidental `Omnigraph::open` strings with stable capability errors, and
+those strings become observable behavior (Hyrum).
+
+## Relationship to RFC-009
+
+RFC-009 Phase 4 was scoped as "declared plane capabilities" for the
+embedded-vs-remote axis only. This RFC **subsumes and broadens** that phase into
+the full three-plane, per-subcommand model (adds uniform maintenance addressing,
+the authority rule, and help grouping). RFC-009 Phase 5 (remote `load` →
+`/load` route alignment) is unaffected and remains in RFC-009.
+
+**`graphs list` reconciliation:** RFC-009's answered open question (pinned in
+`parity_matrix.rs`'s exclusions comment) targets `graphs list` becoming
+Both-capability once the embedded arm enumerates the cluster catalog. This RFC
+**aligns** with that rather than superseding it: the capability table shows
+`graphs list` as remote today, embedded-cluster later.
+
+## Open questions
+
+1. **Capability-table location** — a CLI-internal const, or surfaced (e.g. in
+   `--help` and a machine-readable `omnigraph capabilities` for tooling)?
+2. **`--cluster <dir> --graph <id>` for maintenance** — does the maintenance
+   command resolve the storage URI from the applied cluster state, or from the
+   declared `cluster.yaml`? (Applied state is the truth the server serves;
+   declared config may be ahead of it.)
+
+## Review comments (Codex, 2026-06-13)
+
+Overall take: the direction is right. The planes already exist; making them
+declared in code, help text, and error messages should reduce operator surprise.
+Keeping storage maintenance off HTTP is also the right boundary: `optimize`,
+`repair`, and `cleanup` are direct-storage operator actions, not a multi-tenant
+serving surface.
+
+Before implementation, tighten these points:
+
+1. **Resolver authority needs a sharper rule.** The proposal says maintenance
+   resolves storage URIs "from `cluster.yaml` / operator config", but those are
+   different authority surfaces. Today `--target` is an operator/legacy
+   graph-target lookup; cluster config is read by `cluster` commands and by
+   `--cluster` server boot. Do not make a maintenance command silently consult
+   both and pick a precedence. Either:
+   - `--target` on maintenance means an operator/legacy target whose URI is
+     already direct storage, with remote targets failing loudly; or
+   - add an explicit cluster-root/config resolver for this case, so reading
+     cluster state is an intentional mode.
+
+   **Resolution (accepted):** both — `--target` resolves through operator/legacy
+   config and must be direct storage (remote → loud fail); cluster-managed graphs
+   use the explicit `--cluster <dir> --graph <id>` resolver. See *Authority
+   rule* under Proposed shape.
+
+2. **`graphs list` conflicts with RFC-009's target shape.** This RFC classifies
+   `graphs list` as remote-only, while RFC-009's answered open question says it
+   becomes Both-capability once the embedded arm enumerates the cluster catalog.
+   Pick one direction here: either this RFC explicitly supersedes that target,
+   or the capability table should show `graphs list` as remote today and
+   embedded-cluster later.
+
+   **Resolution (accepted):** align, don't supersede. The table shows `graphs
+   list` remote-today / embedded-cluster-later. See *Relationship to RFC-009*.
+
+3. **The capability table should be per subcommand, not per family.** The
+   family-level rows hide the exact cases the table is supposed to make
+   non-accidental. At minimum, call out:
+   - `schema plan` as local/storage-backed today, while `schema show` and
+     `schema apply` route through the graph client;
+   - `queries validate` versus `queries list`, which do not have the same
+     plane shape;
+   - `lint`, `policy`, `embed`, `login`, `logout`, `config`, and `version`, so
+     enumeration/session/tooling commands are intentionally classified instead
+     of falling outside the model.
+
+   **Resolution (accepted):** the capability table is now per-subcommand and
+   classifies every command, including the session/tooling group.
+
+4. **`init` should be an explicit exception.** Direct-storage `init` is fine.
+   A cluster-managed graph should be created by `cluster apply`, with ledger,
+   recovery, and approval semantics. If a named target resolves to a
+   cluster-managed graph root, `init` should signpost `cluster apply` rather
+   than quietly initializing that root out of band.
+
+   **Resolution (accepted):** promoted from open question to a decision. See
+   *`init` is an explicit exception*.
+
+Testing notes for the implementation slice:
+
+- Extend the existing CLI suites rather than adding a new duplicate harness:
+  `parity_matrix.rs` for capability exclusions, `cli_data.rs` for maintenance
+  wrong-plane errors, `cli_schema_config.rs` for `graphs list` / help behavior,
+  and `system_local.rs` for `--server` / operator-targeting edge cases.
+- Pin the new wrong-plane error strings deliberately. This RFC is intentionally
+  replacing accidental `Omnigraph::open` strings with stable capability errors,
+  and those strings become observable behavior.
+
+  **Resolution (accepted):** captured as the *Test plan* section.
+
+## Verification comments (Codex, 2026-06-13)
+
+Follow-up verification against the current CLI/server code found a few
+remaining current-state nits. These are doc-shape issues, not objections to the
+proposal:
+
+1. **Current-state table overstates `graphs list`.** The table under *Current
+   state of affairs* still lists `graphs list` with data verbs that reach the
+   graph by embedded engine or HTTP. Current code routes it through `GraphClient`
+   only to share the resolver, but the embedded arm fails loudly; the later
+   RFC text correctly says remote today / embedded-cluster later. Make the
+   current-state row match that.
+
+   **Resolution (accepted):** the Data row now marks `graphs list` **remote-only
+   today**, with a note that it rides `GraphClient` only to share the resolver.
+
+2. **Current-state table overstates `init` addressing.** `init` is grouped with
+   maintenance verbs whose addressing surface is positional URI or `--target`.
+   Current `init` only accepts a required positional URI and has no `--target`
+   or config path. The proposal can add that capability, but the current-state
+   table should not describe it as already present.
+
+   **Resolution (accepted):** the Storage row now calls out that `init` takes
+   **only a required positional URI** today (no `--target`); adding `--target` to
+   `init` is part of the proposal, entangled with the `init`→`cluster apply`
+   signpost, not current state.
+
+3. **`apply_server_flag` call-site count is stale.** The text says data verbs
+   resolve `--server prod --graph knowledge` through `apply_server_flag` at
+   16 call sites. Current code has the fork collapsed: data verbs call
+   `GraphClient::resolve*`, and only the two `GraphClient` factories call
+   `apply_server_flag`. Rephrase the verified fact around `GraphClient`, not
+   the old pre-collapse call-site count.
+
+   **Resolution (accepted):** validated-fact #1 now describes the post-collapse
+   reality (`GraphClient::resolve*`; the two factories call `apply_server_flag`),
+   dropping the stale count.
+
+4. **`--cluster <dir> --graph <id>` collides with today's global `--graph`
+   semantics.** The target ergonomics section proposes that flag shape for
+   maintenance, but current `--graph` is a global flag that requires
+   `--server` and appends `/graphs/<id>` to a remote server URL. Either choose
+   a separate cluster-maintenance graph flag shape, or call out the clap/global
+   flag migration explicitly as part of the implementation.
+
+   **Resolution (accepted):** the *Authority rule* now carries a flag-shape
+   caveat — the cluster-managed resolver (and its flag shape, e.g.
+   `--cluster-graph` vs a `--graph` migration) is **deferred to a later slice**;
+   the operator/legacy `--target` path lands first. The illustrative
+   `--cluster <dir> --graph <id>` spelling is marked as not-final.
--- a/docs/dev/rfc-011-cli-refactoring.md
+++ b/docs/dev/rfc-011-cli-refactoring.md
@ -0,0 +1,756 @@
+# RFC-011: CLI refactoring — one addressing & config model
+
+**Status:** Accepted — implemented (the `omnigraph.yaml` excision landed as
+#250/#251/#252; D1–D4, D6, D7, D9, D10 shipped). Two items remain: **D11**
+(server-side maintenance jobs) is gated on the bulk-data-plane RFC #219; **D5**
+(combined admin scope) stays deferred by design.
+**Date:** 2026-06-14
+**Audience:** CLI/server maintainers
+**Builds on:** [rfc-007-operator-config.md](rfc-007-operator-config.md)
+(per-operator config, keyed credentials, named servers),
+[rfc-008-deprecate-omnigraph-yaml.md](rfc-008-deprecate-omnigraph-yaml.md)
+(the legacy file this RFC finishes removing),
+[rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md)
+(`GraphClient` — embedded ≡ remote at the execution layer),
+[rfc-010-cli-planes-restructure.md](rfc-010-cli-planes-restructure.md)
+(declared planes + the wrong-plane guard this RFC subsumes).
+**Sequencing:** lands as / after RFC-008 stage 5 (the `omnigraph.yaml` removal).
+
+## Summary
+
+Refactor the CLI around one coherent model once `omnigraph.yaml` is gone. The
+shape:
+
+- **One ontology** (store, server, cluster; cluster config vs operator config;
+  catalog; profile; capability) where each term names exactly one concept.
+- **Addressing = scope + `--graph`, with the access path *derived*.** A command
+  resolves a *scope* (operator defaults, an optional named *profile*, or one
+  explicit primitive address — `--store` / `--server` / `--cluster`), selects a
+  graph inside it with `--graph`, and the **served-vs-direct access path falls out
+  of the scope's bindings × the verb's capability** — it is never a per-command
+  toggle and never inferred from a URI scheme.
+- **Served is the front door; direct storage is privileged.** The everyday scope
+  is a *server* (a bearer token, no bucket credentials). Reading or writing a
+  remote store/cluster directly is an explicit, credentialed, admin/break-glass
+  act — never the default, never baked into everyday operator config.
+- **The CLI is stateless per command.** No `current_profile` pointer, no
+  `USE`-style mode; every command is fully determined by its flags + static
+  config. You *select* a graph, you do not *switch into* one.
+- **Definitions are named; payloads are passed.** Queries (`.gq`) and schema
+  (`.pg`) live in the catalog and are invoked by name; params and bulk data are
+  the only per-call inputs.
+
+This removes `--target`, `--cluster-graph`, `--uri` scheme-dispatch, and the
+plane guard's "a `--target` that resolves to a remote URL" special case — and it
+collapses the four-plane vocabulary, for users, into a single capability rule.
+
+## Motivation: the legacy file pollutes the taxonomy
+
+Today the CLI exposes four overlapping addressing forms but the system has only
+three real entities; the mismatch is the whole problem, and `omnigraph.yaml` is
+the carrier:
+
+1. **`--target` straddles kinds.** It resolves through the legacy
+   `omnigraph.yaml` `graphs:` map (`config.rs::resolve_target_uri`), and that
+   `.uri` can be a **storage location** (`file`/`s3`) *or* a **remote server**
+   (`http`). One flag, two access paths with different capability and trust
+   models. The wrong-plane guard's storage-plane remote rejection
+   (`helpers.rs:467`) exists *only* to compensate for this overload.
+2. **Scheme-inferred transport.** `<URI>`/`--uri` has the same disease a level
+   down: `is_remote_uri` (`helpers.rs:15`) silently picks embedded vs remote from
+   the scheme. Transport is guessed from a string, not declared.
+3. **No single environment concept.** Defaults are smeared across the deprecated
+   `omnigraph.yaml` (`cli.graph`, `server.graph`) with no clean way to name or
+   switch environments.
+
+Removing `omnigraph.yaml` is the moment to fix all three at once.
+
+## Ontology
+
+Every term is one concept. The rest of this RFC uses them precisely.
+
+### Entities — the things that exist
+
+- **Graph** — a typed property graph (node/edge types over Lance); the thing you
+  query and mutate. *Example: the `knowledge` graph.*
+- **Store** — the storage location of a **single** graph: its Lance datasets at a
+  `file://`/`s3://` URI. Addressed directly with `--store`. *Example:
+  `s3://acme/clusters/brain/graphs/knowledge.omni`.*
+- **Cluster** — a storage root holding **many** graphs plus the catalog and
+  control-plane state (state ledger, approvals, recovery). Managed as-code by the
+  team. *Example: the `brain` cluster at `s3://acme/clusters/brain`.*
+- **Server** — an `omnigraph-server` process serving graphs over HTTP with bearer
+  auth and Cedar policy; boots from a bare graph or a cluster. *Example: `prod` at
+  `https://graph.example.com`, serving the `brain` cluster.*
+
+### Config & catalog — the descriptions
+
+- **Cluster config** — `cluster.yaml` in the cluster root, declaring the **desired
+  state** (graphs, schemas, stored queries, policies, storage), applied with
+  `cluster apply`. Team-owned; the source of truth for *what the system is*.
+- **Catalog** — the **applied** registry the cluster owns in storage: the graphs,
+  stored queries, and policies `cluster apply` materialized. What a server serves
+  and what `query <name>` resolves against. *(Cluster config is the spec; the
+  catalog is the applied result.)*
+- **Operator config** — `~/.omnigraph/config.yaml`, your **personal** file:
+  identity (actor), default graph, named servers/clusters, output prefs, optional
+  profiles. Declares *who I am*, never what the system is.
+- **Profile** — an optional named bundle of **defaults inside the operator
+  config** (one of {cluster, server, store} + a default graph). Config data,
+  **not state**: selecting one fills in omitted flags for a command; it does not
+  put you "in" a mode. Chosen per command (`--profile <name>`) or per shell
+  (`OMNIGRAPH_PROFILE`).
+- **Credential** — a bearer token keyed to a **server name**, resolved via
+  `OMNIGRAPH_TOKEN_<NAME>` or `~/.omnigraph/credentials` (`0600`); sent only to
+  the server it is keyed to. (Per RFC-007 — the operator config holds endpoints,
+  never tokens.)
+
+### What you run — definitions vs payloads
+
+- **Schema** — the `.pg` type definitions for a graph; authored as a file, applied
+  via `schema apply` (or `cluster apply`).
+- **Stored query** — a named query in the catalog, the team's reusable contract;
+  invoked by name. *Example: `find_people`.*
+- **Query file (`.gq`)** — an authoring artifact holding `query <name>`
+  declarations; becomes a stored query when `cluster apply` adopts it. For
+  authoring/ad-hoc, not everyday invocation.
+- **Payload** — the per-call inputs that vary each run: params (`--params`,
+  positional args) and bulk data (`--data`). Never part of config.
+
+### How a command resolves
+
+- **Scope** — the resolved environment a command addresses: operator defaults, a
+  named profile, or one explicit primitive address.
+- **Access path** — **served** (through a server) or **direct** (open storage
+  in-process). Derived from scope × capability; see "Access path" below.
+- **Capability** — what a verb requires: `any`, `served`, `direct`, `control`,
+  or `local`.
+- **Target shape** — whether the verb is **graph-scoped** (selects one graph
+  inside the scope), **scope-scoped** (operates on the whole server/cluster
+  scope), or **local** (does not resolve scope or graph).
+- **Actor** — the identity a write is attributed to: server-resolved from the
+  bearer token (served), or `--as` falling back to `operator.actor` (direct).
+
+### The relationships that prevent confusion
+
+- **Exactly two config surfaces:** **cluster config** (team) and **operator
+  config** (personal). Nothing else is "a config."
+- A **profile is not a third config** — it lives *inside* the operator config, and
+  it is **defaults, not state**.
+- A **catalog is not config** — it is the *applied state* the cluster owns.
+- A **store is one graph; a cluster is many graphs** + catalog + control state.
+- A **graph is the logical thing**; store/server/cluster are ways to reach it.
+- "State" elsewhere is not the profile: *graph state* is committed data in Lance;
+  *cluster state* is the applied control-plane ledger. Neither is operator config.
+
+## Design
+
+### First principles
+
+> Addressing should be 1:1 with the system's real entities; the access path
+> (served vs direct) should be **derived**, never inferred from a string or
+> toggled per command; the CLI should be **terse by config and stateless per
+> command**; and **definitions are named while payloads are passed**.
+
+Every command answers five orthogonal questions, kept orthogonal here:
+
+| Axis | Question | Today | Target |
+|---|---|---|---|
+| Scope | which environment? | `omnigraph.yaml` defaults / `--target` | operator defaults · `--profile` · one primitive |
+| Target shape | whole scope or one graph? | implicit in command family | declared per verb |
+| Graph | which graph in it? | tangled into the address | `--graph` only for graph-scoped server/cluster verbs |
+| Access path | served or direct? | inferred from scheme / target | **derived** from scope × capability |
+| Actor | who am I? | `--as` > `cli.actor` (yaml) > `operator.actor` | `--as`/`operator.actor` (direct) · token (served) |
+
+### A scope binds one entity — and served is the default
+
+A scope (a profile, the flat defaults, or one primitive flag) binds **exactly one
+of** {server, cluster, store}. Server and cluster scopes may contain many graphs
+and can carry a `default_graph`; a store scope is already one graph and does not
+accept `--graph`. They differ by privilege, and **the everyday default is a
+server**:
+
+- **server** → served (the everyday scope). A bearer token, **no storage
+  credentials**. Data verbs run through it, policy-enforced; maintenance verbs are
+  unavailable from this scope — there is no server route for them, so you must
+  name storage explicitly. This is what a normal operator's config binds.
+- **cluster** → direct storage to a managed cluster, for **control,
+  maintenance, and graph-backed validation only** (`cluster *`,
+  `optimize`/`repair`/`cleanup`/`schema plan`, graph-backed `lint`, and
+  `queries validate`). Data verbs are **not** run directly against a cluster —
+  they go served, or `--store` for ad-hoc. **Privileged:** requires bucket
+  credentials, so it appears only in a maintainer's config or as an explicit
+  `--cluster` flag — never in an everyday operator's defaults.
+- **store** → one graph's storage, direct. A **local file** store is ordinary
+  local dev; a **remote `s3://`** store is break-glass. No catalog (named queries
+  do not resolve — the ad-hoc lane).
+
+A scope names **one** thing, so there is no independent `server`+`cluster` pair
+that could disagree (the audit's coherence hazard is gone by construction — the
+default is just a server). The storage root likewise lives only where it must, as
+the next section makes explicit.
+
+### Direct storage access is privileged (the storage-root rule)
+
+> The storage root (`s3://…`) is **server-and-admin knowledge, never
+> everyday-operator knowledge.** Everyday operator config binds a server (a bearer
+> token, no bucket credentials). Direct remote access — opening a cluster root or
+> an `s3://` store — is always **explicit and privileged**: you name
+> `--cluster`/`--store`, and only someone with bucket credentials can. The CLI
+> never opens a remote store from a default scope.
+
+This is the least-privilege posture — revoke a bearer token, don't rotate bucket
+keys; only the **server process** and an occasional **maintenance admin** ever
+hold storage credentials. It makes "use the server, not raw storage"
+**structural**, not advisory: direct access requires credentials a normal operator
+does not have *and* a flag they must type. The only storage root in an everyday
+setup is the one the **server** boots from; operators never see it. (Local *file*
+stores for dev are unaffected — a local file is not the production bucket.)
+
+### Access path is derived, not chosen
+
+The two access paths are genuinely different — not two transports for one thing:
+
+- **Served** (through a server): the server resolves your actor from a token and
+  enforces Cedar policy at the HTTP boundary. In cluster mode the **catalog and
+  config** (graph set, stored queries, policy bundles) are pinned to the applied
+  serving revision and move only on restart; **graph data** is read through the
+  server's engine handle against the requested branch/snapshot (it is not frozen
+  at boot, though a long-running server will not observe *out-of-band direct
+  writes* to storage until its handle refreshes). No storage credentials needed.
+- **Direct** (open the Lance storage in-process): a **privileged** path — it needs
+  your own storage credentials, so only an admin/maintainer (or a local-dev file
+  store) takes it. The actor is self-declared (`--as`, falling back to
+  `operator.actor`), and reads see **live
+  storage HEAD**. There is **no server-side identity/auth gate** — but engine-level
+  Cedar policy *is* still enforced when the graph selection provides a policy
+  (enforcement is engine-wide; embedded `_as` writers call the same `enforce`).
+  "Direct" means "no HTTP boundary," not "unpoliced."
+
+Because they differ in authority, freshness, and availability, a graph reached via
+a server and that graph's raw storage are **different things you name
+differently** — not one identity you flip. Making the access path a per-command
+toggle (`--via`) is the `--target` mistake in new clothes; it is rejected.
+
+> **The access path follows from the scope and the verb.** A **server** scope →
+> served (data/catalog). A **cluster** scope → direct control, maintenance, and
+> validation. A **store** scope → direct ad-hoc data (no catalog). The verb's
+> capability picks which applies and rejects the mismatches.
+
+State the bound plainly: the everyday data path
+(`query`/`mutate`/`load`/`branch`/`export`/`commit`) against a served graph
+**never needs direct storage access**, and direct access is legitimate only in
+bounded places: **bootstrap** (`init`), **storage-native maintenance**
+(`optimize`/`repair`/`cleanup`/`schema plan`), **graph-backed validation**
+(`lint`), **catalog validation** (`queries validate`), the **control plane**
+(`cluster *`), **local dev** with no server, and **break-glass** (recovery, or
+checking whether a long-running server's handle lags live HEAD). Everything else
+is served. This is what makes "discourage direct storage" enforceable rather
+than aspirational.
+
+This list is expected to **shrink**: Decision 11 moves
+`optimize`/`cleanup` (and healthy-path `repair`) to server-managed jobs, which
+would leave direct access to just standalone/local dev, the control plane, and
+break-glass — and remove the last routine reason an admin needs bucket
+credentials.
+
+### Capability semantics
+
+The CLI validates through verb capability, not plane jargon:
+
+| Capability | Meaning | Examples |
+|---|---|---|
+| `any` | graph-scoped data; served via a server scope; direct only against a **store** scope (local dev / break-glass); **errors on a cluster scope** | `query`, `mutate`, `load`, `export`, branch reads, `schema show/apply` |
+| `served` | requires an HTTP server; may be graph-scoped or scope-scoped | `graphs list`, `queries list` |
+| `direct` | graph-scoped storage-native or graph-backed validation; no server form exists | `init`, `optimize`, `repair`, `cleanup`, `schema plan`, graph-backed `lint` |
+| `control` | cluster-scoped catalog/control-plane work; addresses the cluster, not a single raw store | `cluster *`, `queries validate` |
+| `local` | does not address a graph or scope | `config`, `profile`, `lint --query ... --schema ...` |
+
+`any` does **not** mean "the user picks": the resolver picks from the scope.
+Internally the exhaustive `command_plane` match (`planes.rs`) stays as the drift
+guard; user-facing errors speak in terms of what the command needs.
+
+### Definitions vs payloads
+
+Queries and schema are **definitions** — contracts that live in the catalog and
+are invoked **by name**; params and data are **payloads** passed per call. So the
+everyday form is `omnigraph query <name> [params]`, not
+`omnigraph query --file find.gq`. A `.gq` path on a routine query is a smell: the
+query is not in the catalog yet. Lifecycle: **author a `.gq` → `cluster apply`
+adopts it → invoke by name thereafter.**
+
+Named queries resolve through a **server** (which serves the cluster's catalog).
+`queries list` is therefore a served catalog read. `queries validate` is a
+control/catalog check against the cluster-owned query definitions. A bare
+`--store` has **no catalog**, so it is the ad-hoc lane (`-e` / `--file`), and
+`--cluster` does not invoke stored queries. So named-query invocation is a
+**served** convenience; direct access (`--store`) is always ad-hoc.
+
+| Kind | Examples | How it enters a command |
+|---|---|---|
+| Definition | stored query, schema | named in the catalog; authored as a file, adopted by `cluster apply` |
+| Payload | params, bulk data | passed per call (`--params`, positional args, `--data`) |
+| Authoring / ad-hoc | a `.gq` you're writing | `-e '…'`, `--file new.gq`, `lint --query new.gq --schema schema.pg`, `schema apply --schema` |
+
+### Resolution rule
+
+1. If the verb is `local`, reject graph/scope flags and run without resolving a
+   scope.
+2. If a primitive address is supplied (`--store`/`--server`/`--cluster`), use it
+   and ignore operator-config scope defaults. *(A **named** primitive — `--server
+   prod`, `--cluster brain` — still resolves through the operator-config registry;
+   a **literal** — `--server https://…`, `--store s3://…` — bypasses it. Per
+   Decision 2: a value containing `://` is a literal, otherwise a config-name
+   lookup.)*
+3. Else if `--profile <name>` (or `OMNIGRAPH_PROFILE`) selects a profile, use it.
+4. Else use the operator config's flat defaults. Error only if neither a profile
+   nor the flat defaults resolve a scope.
+   *(No sticky "current" pointer — each command resolves scope fresh.)*
+5. Resolve the graph only for **graph-scoped** verbs. Server/cluster scopes:
+   exactly one graph in scope → use it; else `default_graph`; else require
+   `--graph <id>`. Store scopes are already one graph, so `--graph` is rejected.
+   **Scope-scoped** verbs (`graphs list`, `queries list`, `queries validate`,
+   and `cluster *`) do not select a graph unless their own resource argument says
+   otherwise.
+6. Derive the access path from capability × scope:
+   - `direct` verb → the scope's cluster/store; if the scope is a server, error
+     (name storage explicitly — it is privileged).
+   - `served` verb → the scope's server; if the scope is a cluster/store, error.
+   - `control` verb → the scope's cluster; if the scope is a server/store, error
+     (name a cluster explicitly — it is privileged).
+   - `any` verb → **served** if the scope is a server; **direct** against a
+     **store** scope (ad-hoc); on a **cluster** scope, error — cluster is
+     maintenance-only, so use a server for data or `--store` for ad-hoc.
+7. Reject mismatches with an error naming the missing axis.
+
+Good errors:
+
+```text
+scope "prod" has 4 graphs; pass --graph <id> or set default_graph
+optimize needs direct storage access; scope "prod" is a server — name storage with --cluster s3://… or --store (requires storage credentials)
+graphs list enumerates a server scope; do not pass --graph
+--store opens raw storage directly, bypassing any server (no HTTP auth gate, live HEAD); for recovery/inspection
+```
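+
+Steps 2 and 6 of the resolution rule can be sketched as follows. This is a
+minimal illustration, not the CLI's actual code: the `Scope`, `Capability`, and
+`AccessPath` enums and both function names are hypothetical, and the error
+strings are placeholders rather than the pinned messages shown above. `local`
+verbs are left out because step 1 handles them before any scope is resolved.
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical types for illustration; not taken from the OmniGraph codebase.
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum Scope { Server, Cluster, Store }
+
+// `local` is omitted: step 1 runs local verbs without resolving a scope.
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum Capability { Any, Served, Direct, Control }
+
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum AccessPath { Served, Direct }
+
+/// Step 2 (Decision 2): a value containing "://" is a literal address;
+/// anything else is a name looked up in the operator-config registry.
+pub fn resolve_primitive(
+    value: &str,
+    registry: &HashMap<String, String>,
+) -> Result<String, String> {
+    if value.contains("://") {
+        return Ok(value.to_string()); // literal: bypasses the registry
+    }
+    registry
+        .get(value)
+        .cloned()
+        .ok_or_else(|| format!("unknown config name: {}", value))
+}
+
+/// Step 6: derive the access path from capability x scope and reject mismatches.
+pub fn derive_access_path(
+    cap: Capability,
+    scope: Scope,
+) -> Result<AccessPath, &'static str> {
+    match (cap, scope) {
+        // `direct` verbs need storage: a cluster or a store, never a server.
+        (Capability::Direct, Scope::Cluster) | (Capability::Direct, Scope::Store) => {
+            Ok(AccessPath::Direct)
+        }
+        (Capability::Direct, Scope::Server) => {
+            Err("needs direct storage access; name storage with --cluster or --store")
+        }
+        // `served` verbs need an HTTP server.
+        (Capability::Served, Scope::Server) => Ok(AccessPath::Served),
+        (Capability::Served, _) => Err("requires a server scope"),
+        // `control` verbs address the cluster root directly.
+        (Capability::Control, Scope::Cluster) => Ok(AccessPath::Direct),
+        (Capability::Control, _) => Err("requires a cluster scope; name one with --cluster"),
+        // `any` verbs: served on a server, direct on a store, rejected on a cluster.
+        (Capability::Any, Scope::Server) => Ok(AccessPath::Served),
+        (Capability::Any, Scope::Store) => Ok(AccessPath::Direct),
+        (Capability::Any, Scope::Cluster) => {
+            Err("cluster scope is maintenance-only; use a server for data or --store for ad-hoc")
+        }
+    }
+}
+```
+
+Under these assumptions, `derive_access_path(Capability::Direct, Scope::Server)`
+returns an error, which corresponds to the `optimize` message above, while
+`Capability::Any` on a server scope resolves to the served path. Keeping the
+match exhaustive mirrors the role the text assigns to `command_plane`: a new
+capability or scope cannot be added without deciding every combination.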
+
+### Config shape (operator config)
+
+`~/.omnigraph/config.yaml` — your personal file; the cluster config
+(`cluster.yaml` + catalog) is the separate, team-owned surface. The default-graph
+key is `default_graph` everywhere (the per-command flag is `--graph`).
+
+**Everyday operator — binds a server, holds no storage root:**
+
+```yaml
+defaults:
+  server: prod
+  default_graph: knowledge
+  output: table
+servers:
+  prod:    { url: https://graph.example.com }    # token keyed by name (RFC-007); no creds here
+  staging: { url: https://staging.example.com }
+profiles:                                          # optional, only for multiple environments
+  staging: { server: staging, default_graph: knowledge }
+```
+
+A normal operator never has a storage root or bucket credentials. Their default
+scope is served; `optimize`/`repair`/`cleanup` error with a pointer to name
+storage explicitly.
+
+**Maintainer — opts into a cluster root (and has bucket credentials):**
+
+```yaml
+profiles:
+  brain-admin: { cluster: brain, default_graph: knowledge }   # direct; admin/control/maintenance
+clusters:
+  brain: { root: s3://acme/clusters/brain }                   # the s3:// root lives ONLY here
+```
+
+The `clusters:` block — the only place a storage root appears in operator config —
+is **admin-only and opt-in**, absent from a normal operator's file. Equivalently,
+skip config and name it per command:
+`omnigraph optimize --cluster s3://acme/clusters/brain --graph knowledge`. The
+cluster stays the source of truth for the managed catalog; tokens live in the
+keyed credential store, never in this file.
+
+### Command shape
+
+Assume the everyday flat defaults: server `prod`, default graph `knowledge`.
+
+| Intent | Command | Path |
+|---|---|---|
+| Run a catalog query | `omnigraph query find_people` | served |
+| …with params | `omnigraph query find_people --params '{"title":"Eng"}'` | served |
+| Another graph in scope | `omnigraph query find_people --graph archive` | served |
+| Write | `omnigraph load --data batch.jsonl --mode append` | served |
+| A different environment | `omnigraph --profile staging query find_people` | served |
+| One-off server, no config | `omnigraph query find_people --server https://graph.example.com --graph knowledge` | served |
+| Maintain (admin, explicit storage) | `omnigraph optimize --cluster s3://acme/clusters/brain --graph knowledge` | direct (privileged) |
+| Maintain (admin, via admin profile) | `omnigraph --profile brain-admin optimize --graph knowledge` | direct (privileged) |
+| List catalog queries | `omnigraph queries list` | served |
+| Validate cluster query catalog | `omnigraph queries validate --cluster s3://acme/clusters/brain` | control (privileged) |
+| Offline query lint | `omnigraph lint --query new.gq --schema schema.pg` | local |
+| Graph-backed query lint | `omnigraph lint --query new.gq --cluster s3://acme/clusters/brain --graph knowledge` | direct (privileged) |
+| Local dev, no server | `omnigraph query -e 'match { … } return { … }' --store graph.omni` | direct (local file) |
+| Break-glass: raw storage of a served graph | `omnigraph query --file find.gq --store s3://acme/clusters/brain/graphs/knowledge.omni` | direct (privileged, rare) |
+
+Note what the everyday rows are: **all served.** `optimize` does *not* appear in
+the default-scope rows — from a server scope it errors and points you to name
+storage (see the resolution rule), so maintenance is always a deliberate,
+credentialed act. There is no "force served/direct" row — you never toggle the
+path on a configured graph; the only way to reach raw storage is to *name it*
+(`--cluster`/`--store`), which makes the privileged bypass unmistakable. Everyday
+rows invoke a query **by name**; a `.gq` file appears only where there is no
+catalog (bare store, break-glass) via `-e`/`--file`.
+
+## Before / after
+
+**Before** = best available today (legacy `omnigraph.yaml` `--target`, `.gq`
+files, `--cluster-graph`, scheme inference). **After** = this model.
+
+| Intent | Before | After |
+|---|---|---|
+| Run a query | `omnigraph query --target knowledge --query find.gq --name find_people` | `omnigraph query find_people` |
+| Another graph | `omnigraph query --target archive --query find.gq --name find_people` | `omnigraph query find_people --graph archive` |
+| Load | `omnigraph load --data b.jsonl --mode append --target knowledge` | `omnigraph load --data b.jsonl --mode append` |
+| Maintain (admin) | `omnigraph optimize --cluster brain --cluster-graph knowledge` | `omnigraph optimize --cluster s3://acme/clusters/brain --graph knowledge` |
+| Another environment | edit `omnigraph.yaml`, or re-address with full URIs | `--profile staging …` or `OMNIGRAPH_PROFILE=staging` |
+| One-off remote | `omnigraph query --uri https://… --query find.gq` *(scheme→remote)* | `omnigraph query find_people --server https://… --graph knowledge` |
+| Raw storage of a served graph | `omnigraph query s3://…/knowledge.omni --query find.gq` *(looks like a normal query)* | `omnigraph query --file find.gq --store s3://…/knowledge.omni` *(explicit bypass)* |
+
+**Removed:** `--target`; `--cluster-graph` (`--graph` is the graph selector only
+for graph-scoped server/cluster verbs); `--uri` http-scheme dispatch; `--via`
+(never ships); everyday `--query <file>` (definitions are named);
+`omnigraph.yaml` and its `cli.graph`/`server.graph` defaults.
+
+## Server-side corollary
+
+The same ontology applies to `omnigraph-server` boot: with `omnigraph.yaml` gone,
+a server boots from a single bare graph URI **or** a cluster (`--cluster <dir|s3>`,
+RFC-005), never a `graphs:` map. The store/server/cluster ontology is then
+consistent across CLI and server.
+
+## Migration & compatibility
+
+Addressing flags and config keys are an observable contract (Hyrum's Law), so every
+removal is staged and release-noted.
+
+- **`config migrate`** (shipped) maps each legacy `graphs:` entry **by what it
+  actually is**: `http(s)` URIs → a `server:` (the recommended everyday shape);
+  `file` URIs → a local `store:`; an `s3://` **graph** URI → an **admin** `store:`
+  (it is a single graph, not a cluster); an `s3://` **cluster root** (one that
+  carries cluster state) → an **admin** `cluster:`. Everyday `s3://` graph usage
+  migrates with a **warning** — prefer serving it via a server rather than
+  re-establishing direct remote access. `config migrate` also reports any legacy
+  keys it drops.
+- **Operators move to a server-default scope.** Where a legacy setup pointed
+  `cli.graph` at an `s3://` graph for everyday use, migration flags it: the
+  recommended shape is a `server:` scope (bearer token, no bucket creds), with the
+  `s3://` root kept only in a maintainer's config — not every operator's.
+- **`--target`** warns for one release, then errors; **`OMNIGRAPH_NO_LEGACY_CONFIG=1`**
+  (already the strict switch) becomes the default — loading `omnigraph.yaml` is a
+  hard error.
+- **`--cluster-graph` → `--graph`**: `--cluster-graph` is accepted with a warning
+  for one release, then removed.
+- **`--graph` meaning change**: today `--graph` is "graph id on a multi-graph
+  server" (paired with `--server`); it generalizes to "select the graph for
+  graph-scoped verbs in server/cluster scopes." Existing `--server --graph`
+  usage keeps working (it is a strict superset); release-note the broadened
+  meaning and the fact that store/scope-scoped verbs reject it.
+- **`--uri http://…`** warns, then errors with a pointer to `--server`.
+- **`--as` on served paths**: today global `--as` is accepted (a no-op on remote
+  writes — the server resolves the actor from the token); rejecting it on the
+  served path is staged — warn for one release, then error.
+- **`--alias`** → the `alias` namespace (`omnigraph alias <name>`, Decision 4);
+  the old `--alias` flag warns for one release, then is removed.
+
+## Non-goals
+
+- **No change to the direct/served capability split.** Maintenance stays
+  storage-direct by design (no server routes for `optimize`/`repair`/`cleanup`);
+  this RFC only makes the split explicit.
+- **No new transport.** Addressing surface, not protocol.
+- **No positional sigil grammar** (`@server/graph`, `%cluster/graph`). Considered
+  and rejected: explicit flags are more discoverable; profiles already give
+  brevity. Revisit only on demonstrated expert-terseness demand.
+
+## Decisions
+
+The questions this RFC opened are resolved as follows. Two (5 and 8) are
+explicitly deferred (see [Deferred](#deferred) below), which is why the numbering
+skips them; they do not block the model.
+
+1. **Local-dev path → embedded `--store` scope.** Local dev runs the engine
+   in-process against a `--store <file>` (or a store-scoped profile); `omnigraph
+   serve` stays available but is not required. Consistent with embedded ≡ remote
+   (RFC-009).
+2. **Primitives are one flag, typed by content.** `--server` and `--cluster`
+   accept either a config name or a literal URI: a value containing `://` is a
+   literal (bypasses the registry); otherwise it is a config-name lookup (error if
+   unknown). `--store` is always a URI. (Replaces the earlier "literal-vs-named"
+   question — no `--server-url`/`--cluster-root` split.)
+3. **Stored invocation: `query <name>` (read) / `mutate <name>` (write), one
+   catalog namespace.** A name maps to one definition; the verb asserts its kind
+   and the CLI errors on mismatch (`'apply_labels' is a mutation — use
+   omnigraph mutate apply_labels`). No `invoke` verb.
+4. **Aliases live under an `alias` namespace** — `omnigraph alias <name> [args]`,
+   never bare top-level. An alias can therefore neither shadow nor be shadowed by a
+   built-in (current or future) verb.
+6. **Profile merge: scope wholesale, prefs layered.** The entity binding +
+   `default_graph` come *wholesale* from the active scope (a profile, or flat
+   defaults if none) — never per-key merged across the entity dimension (that would
+   yield "server *and* cluster"). Only non-scope preferences (`output`, table
+   layout) take flat defaults as a base. Precedence: explicit flag > profile > flat
+   defaults.
+7. **No default graph → error + list candidates.** A graph-scoped verb with no
+   `--graph`, no `default_graph`, and >1 graph in scope errors and lists candidates
+   (served: `GET /graphs`; cluster-direct: catalog enumeration). If enumeration is
+   policy-gated/unavailable, it says so and asks for `--graph`. Never auto-pick.
+9. **Diagnostics & safety.** Writes echo the resolved scope + access path to stderr
+   (suppress with `--quiet`). Destructive verbs (`cleanup`, overwrite `load`,
+   `branch delete`) require confirmation when the scope is not local; `--yes` skips
+   it; **no TTY without `--yes` errors** (never silently proceed). `--json`/CI never
+   prompt — destructive without `--yes` errors.
+10. **Cluster graphs evolve only via `cluster apply`.** `schema apply` (an `any`
+   verb) targets standalone graphs; against a cluster-managed graph it errors and
+   points at `cluster apply` (which records ledger/recovery/approvals — RFC-004).
+   Mirrors `init`'s refusal of a cluster-managed path.
+11. **Maintenance moves server-side (committed direction).** `optimize`/`cleanup`
+   (and healthy-path `repair`) become server/cluster-managed async jobs that are
+   policy-gated, audited, and single-coordinator, with `direct` retained only as
+   break-glass (`repair` when the server is down). They run out-of-band, never
+   inline in serving: a worker plus async job routes in the `POST …` /
+   `GET …/{id}` shape of the bulk-data-plane RFC
+   (`docs/rfcs/0001-bulk-data-plane.md`, PR #219, not yet merged). `schema plan`
+   is excluded (≈ `cluster plan` in cluster mode). The **mechanism** (job routes,
+   worker, scheduling) is a follow-up RFC; until it lands the capability table
+   above stands, and maintenance is `direct`. When it lands, the maintenance
+   verbs' capability becomes "served-job + direct break-glass."
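+
+Decision 2's "typed by content" rule can be sketched as below. This is a
+minimal illustration, not the CLI's code: `ScopeValue`, `classify`, `resolve`,
+and the slice-backed registry are hypothetical names, and the error text is made
+up. Only the `://` test and the "error if unknown" behaviour come from the
+decision itself.
+
+```rust
+/// How a `--server` / `--cluster` value is interpreted (Decision 2).
+#[derive(Debug, PartialEq)]
+enum ScopeValue<'a> {
+    /// Contains `://`: used as-is, bypassing the registry.
+    Literal(&'a str),
+    /// Anything else: a config name that must exist in the registry.
+    Named(&'a str),
+}
+
+/// Classify a flag value by content alone.
+fn classify(value: &str) -> ScopeValue<'_> {
+    if value.contains("://") {
+        ScopeValue::Literal(value)
+    } else {
+        ScopeValue::Named(value)
+    }
+}
+
+/// Resolve a value to a URI; an unknown name is an error, never a guess.
+/// The registry is a hypothetical stand-in for the operator config.
+fn resolve(value: &str, registry: &[(&str, &str)]) -> Result<String, String> {
+    match classify(value) {
+        ScopeValue::Literal(uri) => Ok(uri.to_string()),
+        ScopeValue::Named(name) => registry
+            .iter()
+            .find(|(n, _)| *n == name)
+            .map(|(_, uri)| uri.to_string())
+            .ok_or_else(|| format!("unknown config name '{name}'")),
+    }
+}
+
+fn main() {
+    let registry = [("prod", "https://graph.example.com")];
+    println!("{:?}", resolve("prod", &registry));
+    println!("{:?}", resolve("https://other.example.com", &registry));
+    println!("{:?}", resolve("staging", &registry));
+}
+```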
+
+## Deferred
+
+Non-blocking; settle when convenient.
+
+- **D5 — combined admin scope.** A scope binds one entity; admins read via a
+  server scope and maintain via `--cluster`. A `deployments: { … }` object
+  (server + cluster validated coherent, referenced by a profile) is revisited only
+  if admin ergonomics demand it — and Decision 11 largely removes the need.
+- **D8 — the `profile` command surface.** *Shipped:* `profile list` / `profile
+  show [<name>]` (read-only inspection). The *no sticky `profile use`* constraint
+  holds — it is a design principle, not a command.
+
+## Safety
+
+Dropping the sticky `current_profile` pointer removes the main footgun — a
+destructive command silently inheriting a "current" environment from an earlier
+session. Because each command resolves scope fresh, what is on the command line is
+what runs. Two guards remain (a flat default or `OMNIGRAPH_PROFILE` can still point
+at prod): echo the resolved scope + access path on writes, and require
+confirmation (or `--yes`) for destructive verbs when the resolved scope is not
+local (Decision 9). The most dangerous direct writes (`cleanup`, overwrite
+`load`) are *structurally* rare now: they are unavailable from the everyday
+server scope and gated behind bucket credentials plus an explicit
+`--cluster`/`--store`, so a normal operator's setup can rarely issue them by
+accident.
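+
+The confirmation guard from Decision 9 reduces to a small decision function. The
+sketch below is illustrative only: `Invocation`, `Gate`, and the refusal
+messages are hypothetical, and it assumes "local" is a single boolean already
+derived from the resolved scope.
+
+```rust
+/// Outcome of the destructive-verb gate (Decision 9).
+#[derive(Debug, PartialEq)]
+enum Gate {
+    Proceed,
+    Prompt,
+    Refuse(&'static str),
+}
+
+/// Hypothetical view of one CLI invocation; field names are illustrative.
+#[derive(Debug, Clone, Copy)]
+struct Invocation {
+    destructive: bool,    // cleanup, overwrite load, branch delete
+    scope_is_local: bool, // resolved scope is a local store
+    yes: bool,            // --yes was passed
+    has_tty: bool,        // an interactive terminal is attached
+    json_or_ci: bool,     // --json or CI mode: never prompt
+}
+
+fn gate(inv: &Invocation) -> Gate {
+    // Non-destructive verbs, local scopes, and --yes all skip confirmation.
+    if !inv.destructive || inv.scope_is_local || inv.yes {
+        return Gate::Proceed;
+    }
+    // --json/CI never prompt, so a missing --yes is an error.
+    if inv.json_or_ci {
+        return Gate::Refuse("destructive verb needs --yes in --json/CI mode");
+    }
+    // Without a TTY there is nobody to ask; never silently proceed.
+    if !inv.has_tty {
+        return Gate::Refuse("no TTY to confirm on; pass --yes");
+    }
+    Gate::Prompt
+}
+
+fn main() {
+    let inv = Invocation {
+        destructive: true,
+        scope_is_local: false,
+        yes: false,
+        has_tty: false,
+        json_or_ci: false,
+    };
+    println!("{:?}", gate(&inv));
+}
+```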
+
+## Invariants & deny-list check
+
+- **§10 query semantics first-class / §11 transport at the boundary:** preserved —
+  addressing resolves CLI-side to a `GraphClient`; no transport concepts leak into
+  engine crates.
+- **§12 no client-set actor:** strengthened — the served path's actor stays
+  token-resolved and `--as` is rejected there; direct self-declares.
+- **Least privilege (security posture):** everyday operators hold a revocable
+  bearer token, not bucket credentials; only the server process and maintenance
+  admins hold storage creds. Direct remote access is structural opt-in, not a
+  default — narrowing the blast radius of a leaked operator config.
+- **§6 strong consistency:** both paths are snapshot-isolated per query; this RFC
+  changes addressing, not isolation.
+- **Deny-list (no state that drifts):** profiles and aliases are static config
+  sugar that resolve to canonical scopes; they declare nothing the cluster or
+  server doesn't already own. No sticky session state is introduced.
+- No Hard Invariant is weakened; the change is CLI surface + config removal.
+
+## Relationship to prior work
+
+This RFC completes the config/CLI lineage: RFC-007 added the operator config and
+keyed credentials; RFC-008 demoted `omnigraph.yaml`; RFC-009 unified execution
+behind `GraphClient`; RFC-010 declared the planes. It removes the last legacy
+addressing surface so the plane model becomes a clean function of the three real
+entities, and folds the planes into a single capability rule. It is adjacent
+to the public-track bulk-data-plane RFC (`docs/rfcs/0001-bulk-data-plane.md`,
+PR #219, not yet merged), which canonicalizes `load`/`export` verbs; this RFC
+canonicalizes how every verb *addresses* a graph.
+
+## Appendix: target CLI taxonomy (end state)
+
+The full command set under this model, organized by **capability** (the new
+classifying axis) instead of plane — the end-state counterpart to the
+current-taxonomy appendix below. Every command, with its end-state addressing.
+
+```
+omnigraph
+│
+├─ any — data verbs · served by default (server scope, or --server <url|name>);
+│        --graph selects the graph in scope; --store forces ad-hoc direct (no catalog)
+│  ├─ query   (alias: read*)    invoke a stored query by NAME; -e/--file for ad-hoc
+│  ├─ mutate  (alias: change*)  invoke a stored mutation by name; -e/--file for ad-hoc
+│  ├─ load                      bulk write — --data, --mode required; --from forks a missing branch
+│  ├─ export                    dump graph data (NDJSON / Arrow)
+│  ├─ snapshot                  current per-table versions
+│  ├─ branch { create | list | delete | merge }    merge takes --into <target>
+│  ├─ commit { list | show }    inspect the commit graph
+│  └─ schema { show (alias: get) | apply }          cluster graphs evolve via cluster apply (Decision 10)
+│
+├─ served — needs a server (errors on a store/cluster scope)
+│  ├─ graphs list               enumerate the graphs a server serves
+│  └─ queries list              list stored queries in the served catalog
+│
+├─ direct — storage-native, PRIVILEGED · --cluster <root> | --store <uri> + bucket creds; never a server
+│  ├─ init                      bootstrap a graph (--store <uri>); refuses a cluster-managed path
+│  ├─ optimize                  compaction; --graph selects
+│  ├─ repair                    publish uncovered drift; --confirm / --force
+│  ├─ cleanup                   version GC; --keep / --older-than / --confirm
+│  ├─ schema plan               migration preview (reads storage directly)
+│  └─ lint --query <path>       graph-backed query lint (with --graph on cluster scope)
+│
+├─ control — cluster/catalog control, PRIVILEGED · --cluster <dir|s3>
+│  ├─ cluster { validate | plan | apply | approve | status | refresh | import | force-unlock }
+│                               apply/approve take --as <actor>; force-unlock takes <LOCK_ID>
+│  └─ queries validate          validate cluster-owned stored queries against graph schemas
+│
+└─ local — no graph
+   ├─ policy { validate | test | explain }   offline Cedar tooling
+   ├─ profile { list | show }                read-only; NO mutating `use` (no sticky state)
+   ├─ alias <name> [args]                    personal shortcut; expands to its bound stored-query call (Decision 4)
+   ├─ config { migrate }                     finish the omnigraph.yaml split (RFC-008)
+   ├─ login / logout                         per-server bearer credentials
+   ├─ embed                                  offline embedding pipeline
+   ├─ lint --query <path> --schema <path>    file-only query lint
+   └─ version  (-v)
+```
+
+`*` `read`/`change` remain as deprecated aliases (warn on use); `ingest` and the
+`check`→`lint` argv-shim are **removed**. `get` aliases `schema show`.
+
+### Addressing forms (end state)
+
+Three scope forms — one per real entity — plus the graph selector. No `--target`,
+no `--cluster-graph`, no `--uri` scheme-dispatch, no `--via`.
+
+| Form | Resolves to | Access | Privilege |
+|---|---|---|---|
+| **server scope** — operator default, a `--profile`, or `--server <url\|name>` | a served endpoint + keyed token | served | everyday (bearer token) |
+| **cluster scope** — an admin profile, or `--cluster <root>` | a managed cluster's storage + catalog | direct | privileged (bucket creds) |
+| **store scope** — `--store <uri>` | one graph's storage (no catalog) | direct | local-dev (file) / break-glass (s3) |
+| **`--graph <id>`** | selects the graph for graph-scoped verbs in server/cluster scopes; invalid for store scopes and scope-scoped verbs | — | — |
+
+Resolution: explicit primitive (`--server`/`--cluster`/`--store`) → `--profile` /
+`OMNIGRAPH_PROFILE` → operator flat defaults. Access path is then derived from the
+scope kind × the verb's capability (see the Resolution rule); it is never inferred
+from a URI scheme and never toggled.
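+
+The resolution order above can be read as a three-step fallthrough. The sketch
+below is a hedged illustration: `Scope`, `Inputs`, and `resolve_scope` are
+hypothetical names, the `--profile`-before-`OMNIGRAPH_PROFILE` ordering inside
+the middle tier is an assumption (the text lists them as one tier), and the
+error strings are invented.
+
+```rust
+#[derive(Debug, Clone, PartialEq)]
+enum Scope {
+    Server(String),
+    Cluster(String),
+    Store(String),
+}
+
+/// Hypothetical bundle of everything the resolver can see.
+struct Inputs {
+    explicit: Option<Scope>,      // --server / --cluster / --store
+    profile_flag: Option<String>, // --profile
+    profile_env: Option<String>,  // OMNIGRAPH_PROFILE
+    flat_default: Option<Scope>,  // operator flat defaults
+}
+
+fn resolve_scope(inp: &Inputs, profiles: &[(&str, Scope)]) -> Result<Scope, String> {
+    // 1. An explicit primitive always wins.
+    if let Some(scope) = &inp.explicit {
+        return Ok(scope.clone());
+    }
+    // 2. A named profile: flag first, then env (ordering within the tier is assumed).
+    if let Some(name) = inp.profile_flag.as_ref().or(inp.profile_env.as_ref()) {
+        return profiles
+            .iter()
+            .find(|(n, _)| *n == name.as_str())
+            .map(|(_, scope)| scope.clone())
+            .ok_or_else(|| format!("unknown profile '{name}'"));
+    }
+    // 3. Operator flat defaults; with none, there is nothing to resolve.
+    inp.flat_default
+        .clone()
+        .ok_or_else(|| "no scope: pass --server, --cluster, --store, or --profile".to_string())
+}
+
+fn main() {
+    let profiles = [("admin", Scope::Cluster("s3://bucket/root".to_string()))];
+    let inp = Inputs {
+        explicit: None,
+        profile_flag: Some("admin".to_string()),
+        profile_env: None,
+        flat_default: Some(Scope::Server("https://graph.example.com".to_string())),
+    };
+    println!("{:?}", resolve_scope(&inp, &profiles));
+}
+```
+
+The access path is deliberately absent from this function: it is derived
+afterwards from the scope kind and the verb's capability.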
+
+### What moved vs today
+
+| Command(s) | Today (plane) | End state (capability) |
+|---|---|---|
+| `query`/`mutate`/`load`/`export`/`snapshot`/`branch`/`commit`/`schema show`/`schema apply` | Data | **`any`** (served-default; `--store` ad-hoc) |
+| `graphs list` | Data (remote-only) | **`served`** |
+| `queries list` | Session | **`served`** (catalog read) |
+| `init`/`optimize`/`repair`/`cleanup`/`schema plan`/graph-backed `lint` | Storage | **`direct`** (privileged) |
+| `queries validate` | Storage | **`control`** (catalog validation) |
+| `cluster *` | Control | **`control`** (unchanged) |
+| `policy *`/`embed`/`login`/`logout`/`config`/`version`/offline `lint --query --schema` | Session | **`local`** |
+| `ingest`; `--target`; `--cluster-graph`; `--uri http` dispatch | present | **removed** |
+| — | — | **added:** `profile { list \| show }` (read-only) |
+
+Cross-capability families: `schema` (`plan` is `direct`, `show`/`apply` are
+`any`), `queries` (`list` is `served`, `validate` is `control`), and `lint`
+(offline with `--schema` is `local`, graph-backed is `direct`) split per
+subcommand/mode, exactly where their authority and data dependencies differ.
+
+## Appendix: current CLI taxonomy (today)
+
+The **as-is** command surface this RFC transforms, kept so the RFC is
+self-contained. The source of truth is the exhaustive `command_plane` match in
+`crates/omnigraph-cli/src/planes.rs`.
+Where it disagrees with the design above (four planes, `--target`,
+`--cluster-graph`, scheme-inferred transport), the design is the *target* and this
+is *today*.
+
+### The four planes (today)
+
+| Plane | What it touches | Addressing accepted |
+|---|---|---|
+| **Data** | a graph — embedded **or** via a server | `<URI>` · `--target` · `--server` (+`--graph`) |
+| **Storage** | direct storage, no server | `<URI>` · `--target` (local/S3 only) · some also `--cluster`+`--cluster-graph` |
+| **Control** | a cluster *directory* | `--config <dir>` |
+| **Session** | no graph | — |
+
+`--server`/`--graph` are gated strictly to the data plane; `guard_addressing`
+(`planes.rs:128`) rejects them elsewhere (RFC-010 Slice 1).
+
+### Command tree by plane (today)
+
+```
+omnigraph
+├─ DATA ────────── run against a graph; embedded or --server
+│  ├─ query (alias: read) · mutate (alias: change) · load · ingest (hidden, deprecated)
+│  ├─ branch { create | list | delete | merge } · snapshot · export · commit { list | show }
+│  ├─ graphs { list }                         (remote-only)
+│  └─ schema { show (alias: get) | apply }    ← show/apply are DATA
+├─ STORAGE ─────── direct file://|s3:// access; --server rejected
+│  ├─ init · optimize · repair · cleanup       (optimize/repair/cleanup also: --cluster --cluster-graph)
+│  ├─ lint (check shim) · schema plan          ← plan is STORAGE
+│  └─ queries validate
+├─ CONTROL ─────── cluster directory via --config <dir>
+│  └─ cluster { validate | plan | apply | approve | status | refresh | import | force-unlock }
+└─ SESSION ─────── no graph
+   ├─ policy { validate | test | explain } · embed · login / logout
+   ├─ config { migrate } · queries list         ← list is SESSION
+   └─ version (-v)
+```
+
+`read`/`change` are visible clap aliases (deprecated names, warn); `check` is an
+argv-shim → `lint`; `get` aliases `schema show`; `ingest` is hidden but runs.
+
+### Cross-plane families (today)
+
+- **`schema`**: `schema plan` is Storage; `schema show`/`apply` are Data.
+- **`queries`**: `queries validate` is Storage; `queries list` is Session.
+
+### Addressing forms (today)
+
+| Form | Looks up in | Resolves to | Source |
+|---|---|---|---|
+| `<URI>` / `--uri` | nothing (explicit) | the literal URI | — |
+| `--target <name>` | `omnigraph.yaml` `graphs:` | that graph's `uri` (local / S3 / **http**) | `config.rs::resolve_target_uri` |
+| `--server <name>` (+`--graph`) | `~/.omnigraph/config.yaml` `servers:` | a remote server URL | `helpers.rs::resolve_server_flag` |
+| `--cluster <dir\|s3> --cluster-graph <id>` | served cluster state | the graph's storage URI | `helpers.rs` (RFC-010 Slice 3) |
+
+Precedence (`resolve_target_uri`): explicit `<URI>`/`--uri` → `--target` →
+`cli.graph` default → error. `is_remote_uri` (`helpers.rs:15`) then selects
+`GraphClient::Remote` vs `Embedded` (`client.rs:86`).
+
+### Enforcement points (today)
+
+- **`guard_addressing`** (`planes.rs:128`): `--server`/`--graph` on a non-data verb
+  fails with a declared message.
+- **Storage-plane remote rejection** (`helpers.rs:467`): a storage verb whose
+  `--target` resolves to `http(s)://` is rejected.
+- **`init` into a cluster layout** is refused (use `cluster apply`).
+
+## Audit comments
+
+Reviewed against the current CLI taxonomy, `planes.rs`, `cli.rs`, `helpers.rs`,
+`client.rs`, RFC-007/RFC-010, and the user-facing CLI/server docs.
+
+### Validated
+
+- The target taxonomy now has a stable classifier: `any`, `served`, `direct`,
+  `control`, and `local` are all declared capabilities.
+- Cluster scope is coherent: it is privileged direct storage for control,
+  maintenance, and validation, not a direct data path. `any` data verbs are served
+  by default and reject cluster scope.
+- Graph selection is no longer universal. Graph-scoped verbs select a graph;
+  scope-scoped verbs such as `graphs list`, `queries list`, `queries validate`,
+  and `cluster *` address the whole server/cluster scope.
+- The current-state appendix still matches the implemented CLI: four planes,
+  `--target`, `--cluster-graph`, scheme-inferred transport, `schema plan` as
+  Storage, and `schema show/apply` as Data.
+
+Decisions and deferrals are tracked in [Decisions](#decisions) above — not
+duplicated here.
--- a/docs/dev/rfc-012-embedding-provider-config.md
+++ b/docs/dev/rfc-012-embedding-provider-config.md
@ -0,0 +1,295 @@
+# RFC: Provider-Independent Embedding Configuration
+
+**Status:** Accepted — Phases 1-5 implemented
+**Date:** 2026-06-15
+**Builds on:** the engine embedding client (`crates/omnigraph/src/embedding.rs`), the `@embed` catalog
+annotation (`omnigraph-compiler/src/catalog`), the cluster `providers.embedding` surface
+([cluster-config-specs.md](cluster-config-specs.md), [rfc-007-operator-config.md](rfc-007-operator-config.md)
+for the secret-resolution pattern).
+**Target release:** staged, with the NFR (non-functional requirements) floor first, then the
+provider-independent config core; ingest-time `@embed` execution is a separate later phase.
+
+## Summary
+
+OmniGraph's embedding subsystem is **hardwired to a single provider (Google Gemini)** and has no recorded
+link between the model that produced a stored vector and the model that embeds a query string. Today that
+happens to be self-consistent (one live client embeds both sides), but it is consistent by accident, not by
+construction: the provider is hardcoded, the model is a moving `-preview` target, nothing validates that a
+query vector and a stored vector share a space, and the one configurable knob (key + base URL) cannot change
+the provider or model.
+
+This RFC makes embedding **provider-independent**: one resolved `EmbeddingConfig { provider, model, base_url,
+api_key, dim, normalize }` behind a sealed provider abstraction, resolved once and shared by every embedder.
+The **primary variant is OpenAI-compatible** — a single request/response shape (`POST {base}/embeddings`,
+`{model, input, dimensions}`) that covers **OpenRouter** (the recommended default gateway, one key for Gemini,
+OpenAI, Mistral, BGE, Qwen, sentence-transformers, …), OpenAI direct, and any self-hosted OpenAI-compatible
+endpoint (vLLM, Ollama, LM Studio, Together). A native **Gemini** (`generativelanguage`) variant is retained
+for shops that want to hit Google directly with its `RETRIEVAL_QUERY`/`RETRIEVAL_DOCUMENT` task-type
+asymmetry, plus a deterministic **Mock**. The embedding *identity* (provider + model + dim) is recorded in the
+schema IR so it travels with the data, and a query whose resolved embedder cannot match the stored vectors'
+recorded identity is **rejected with a typed error instead of silently ranking across vector spaces.**
+Provider/endpoint wiring lands on the already-reserved cluster `providers.embedding` field; secrets follow the
+existing operator-credential pattern; no secret ever enters the schema.
+
+This RFC supersedes the framing in `docs/user/search/embeddings.md` that described "two embedding clients
+with different defaults" — one of those clients was dead code with zero callers and has been removed (see
+Phase 1); the OpenAI request shape returns as a first-class *provider variant* of the one client, not as a
+second parallel client.
+
+## Motivation
+
+This work originated in an external handoff that reported a live cross-provider bug: gemini-3072 stored
+vectors compared against OpenAI-1536 query vectors, silently. Investigation against the current source showed
+the reported mechanism is **inaccurate** — the OpenAI client it blamed (`omnigraph-compiler/src/embedding.rs`)
+was `pub(crate)`, `#![allow(dead_code)]`, and had **zero callers**; the live `nearest("string")` path and the
+offline `omnigraph embed` CLI both use the engine **Gemini** client; and `@embed` does no ingest-time
+embedding at all. So the documented happy path is self-consistent. But the investigation surfaced four real
+problems the handoff's instincts correctly smelled:
+
+- **P1 — Provider is hardwired.** The one live client builds Google `generativelanguage` requests; only key +
+  base URL are configurable, not the provider or model. A non-Gemini shop cannot use `nearest("string")`
+  without a Gemini key, and cannot make it produce non-Gemini vectors. If they store their own vectors and
+  query with `nearest("string")`, the query is embedded with Gemini → a silent cross-space ranking. This is
+  the handoff's failure, reached by a different cause.
+- **P2 — A dead, divergent second client + stale docs** invited exactly the misdiagnosis the handoff made.
+- **P3 — No same-space guarantee recorded with the data.** Nothing stamps which model/dim produced a stored
+  vector, so write-side and read-side embedders can drift with no validation.
+- **P4 — `@embed` is declarative-in-name-only.** It records a source property for the typechecker but never
+  embeds at ingest; the docs claimed otherwise.
+
+Per the project's first principle, the lower-liability shape is **one provider-independent client with the
+identity recorded next to the data**, not N independently-defaulted clients kept in lockstep by discipline.
+Hardcoding one provider mortgages every future "we need OpenAI / a local model / Vertex" against a rewrite;
+recording identity once closes the silent-wrong-results class by construction.
+
+## Current state — which API we actually use
+
+| | Live engine client (`crates/omnigraph/src/embedding.rs`) | Deleted dead client (was `omnigraph-compiler/src/embedding.rs`) |
+|---|---|---|
+| Provider | **Google Gemini Developer API** (`generativelanguage`, *not* Vertex AI) | OpenAI |
+| Endpoint | `POST {base}/models/{model}:embedContent` | `POST {base}/embeddings` |
+| Auth | header `x-goog-api-key`, env `GEMINI_API_KEY` | `Authorization: Bearer`, env `OPENAI_API_KEY` |
+| Model | `gemini-embedding-2-preview` (hardcoded) | `text-embedding-3-small` (env `NANOGRAPH_EMBED_MODEL`) |
+| Base default | `https://generativelanguage.googleapis.com/v1beta` | `https://api.openai.com/v1` |
+| Request body | `{model, content:{parts:[{text}]}, taskType, outputDimensionality}` | `{model, input:[…], dimensions}` |
+| Response | `{embedding:{values:[f32]}}` | `{data:[{index, embedding:[f32]}]}` |
+| Task types | `RETRIEVAL_QUERY` / `RETRIEVAL_DOCUMENT` | none |
+| Status | **live** — used by `nearest("string")` and `omnigraph embed` | **removed in Phase 1** (zero callers) |
+
+Both shapes honour a requested output dimensionality (Gemini `outputDimensionality`, OpenAI `dimensions`)
+driven by the target column width, so dimension is already schema-driven. The two known shapes are exactly the
+two initial provider variants this RFC defines — the OpenAI shape returns from git history as a `Provider`
+variant of the single client.
+
+## Guide-level explanation
+
+### Configuring a provider (operator view)
+
+Pick a provider for the graph in `cluster.yaml` (the team-owned surface), referencing a secret by name. The
+recommended default routes through OpenRouter (OpenAI-compatible, one key for many models):
+
+```yaml
+providers:
+  embedding:
+    default:
+      kind: openai-compatible           # openai-compatible | gemini | mock
+      base_url: https://openrouter.ai/api/v1
+      model: google/gemini-embedding-2  # or openai/text-embedding-3-large, mistralai/mistral-embed, …
+      api_key: ${OPENROUTER_API_KEY}
+graphs:
+  knowledge:
+    schema: knowledge.pg
+    embedding_provider: default
+```
+
+The same `openai-compatible` kind points at OpenAI direct (`base_url: https://api.openai.com/v1`,
+`model: text-embedding-3-large`) or a self-hosted endpoint (vLLM/Ollama/LM Studio) by changing `base_url`. Use
+`kind: gemini` only to reach Google's `generativelanguage` API directly (it keeps the query/document
+task-type asymmetry that the OpenAI-compatible shape does not expose). Dimensions are schema-driven by the
+target `Vector(N)` column, not duplicated in the provider profile.
+
+The zero-config tier keeps working with env only (`OMNIGRAPH_EMBED_PROVIDER`, `OMNIGRAPH_EMBED_BASE_URL`,
+`OMNIGRAPH_EMBED_MODEL`, and the provider api-key env — `OPENROUTER_API_KEY` / `OPENAI_API_KEY` /
+`GEMINI_API_KEY`), so no cluster file is required for a single-graph setup.
+
+### Recording identity in the schema
+
+`@embed` grows optional arguments that pin the embedding identity to the vector column:
+
+```pg
+node Doc {
+  slug: String @key
+  text: String
+  v: Vector(3072) @embed("text", model="gemini-embedding-2", dim=3072) @index
+}
+```
+
+The single-argument form `@embed("text")` keeps working unchanged. The recorded identity persists in the
+schema IR (`_schema.ir.json`) and so travels with `schema apply` and `schema show`.
+
+### What a mismatch looks like
+
+If the resolved read-side embedder cannot produce the recorded identity (wrong model, wrong dim, wrong
+provider), `nearest($v, "string")` fails with a typed error naming both sides, instead of returning a
+plausible-but-meaningless ranking. Changing the recorded identity on an existing column is a loud schema-apply
+refusal (it is a re-embed, a deliberate migration step), reusing the migration planner's existing
+annotation-change rejection.
+
+## Reference-level design
+
+### One client, sealed provider abstraction
+
+Replace the two-variant `EmbeddingTransport` with a resolved config plus a sealed provider enum:
+
+```text
+EmbeddingConfig { provider: Provider, model, base_url, api_key, dim, normalize }
+enum Provider {
+  OpenAiCompatible,   // POST {base}/embeddings, Bearer auth, {model, input, dimensions} → {data:[{embedding,index}]}
+                      //   covers OpenRouter (default gateway), OpenAI direct, vLLM/Ollama/LM Studio/Together
+  Gemini,             // POST {base}/models/{model}:embedContent, x-goog-api-key, with RETRIEVAL_QUERY/DOCUMENT task types
+  Mock,               // deterministic, offline
+}
+struct EmbeddingClient { config, http, retry, deadline }
+```
+
+`Provider` owns the per-API differences (endpoint suffix, auth header, request JSON, response JSON, task-type
+support); the client owns retry/backoff, the deadline, normalization, and tracing — all provider-independent.
+**OpenRouter is not a distinct variant** — it is `OpenAiCompatible` with `base_url =
+https://openrouter.ai/api/v1`, which is the point: one OpenAI-compatible shape gives provider-independence
+across every model OpenRouter fronts, so the gateway does the multi-provider fan-out and OmniGraph carries one
+request shape. The native `Gemini` variant exists only for direct-to-Google with task-type asymmetry. An enum
+(not a trait) is the earned complexity for this small, first-party set; if third-party plug-in providers are
+ever needed, the enum becomes a trait behind the same `EmbeddingConfig` surface without touching callers.
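+
+As a rough illustration of what "`Provider` owns the per-API differences" could look like, the sketch below
+derives the endpoint and auth header from the variant alone. The method names (`endpoint`, `auth_header`,
+`supports_task_types`) are hypothetical and not taken from the engine source; the URL suffixes and header
+names are the ones listed in the tables and pseudo-definition above.
+
+```rust
+/// Sealed provider set; variant names follow the RFC, the methods are illustrative.
+#[derive(Debug, Clone, Copy, PartialEq)]
+enum Provider {
+    OpenAiCompatible,
+    Gemini,
+    Mock,
+}
+
+impl Provider {
+    /// Full request URL, or `None` for the offline mock.
+    fn endpoint(self, base: &str, model: &str) -> Option<String> {
+        let base = base.trim_end_matches('/');
+        match self {
+            Provider::OpenAiCompatible => Some(format!("{base}/embeddings")),
+            Provider::Gemini => Some(format!("{base}/models/{model}:embedContent")),
+            Provider::Mock => None,
+        }
+    }
+
+    /// Header name and value carrying the API key.
+    fn auth_header(self, api_key: &str) -> Option<(&'static str, String)> {
+        match self {
+            Provider::OpenAiCompatible => Some(("Authorization", format!("Bearer {api_key}"))),
+            Provider::Gemini => Some(("x-goog-api-key", api_key.to_string())),
+            Provider::Mock => None,
+        }
+    }
+
+    /// Only the native Gemini shape exposes RETRIEVAL_QUERY / RETRIEVAL_DOCUMENT.
+    fn supports_task_types(self) -> bool {
+        matches!(self, Provider::Gemini)
+    }
+}
+
+fn main() {
+    // OpenRouter is just OpenAiCompatible with a different base URL.
+    let p = Provider::OpenAiCompatible;
+    println!("{:?}", p.endpoint("https://openrouter.ai/api/v1", "google/gemini-embedding-2"));
+    println!("{:?}", Provider::Mock.auth_header("unused"));
+    println!("{}", Provider::Gemini.supports_task_types());
+}
+```
+
+Retry, deadline, normalization, and tracing would sit outside this enum, in the client, since they do not
+vary by provider.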
+
+The OpenAI-compatible `input` accepts an **array**, giving batch embedding for free — which the later
+ingest phase needs for throughput, and which removes the open dependency on Gemini's native
+`batchEmbedContents`.
+
+### Config resolution (resolved once, shared)
+
+Precedence for served cluster graphs, highest first: applied cluster `providers.embedding.<name>` profile →
+env (`OMNIGRAPH_EMBED_*`, provider api-key env) → built-in defaults. The cluster `api_key` value is a
+`${NAME}` env reference resolved at server boot; plaintext never lives in the schema, state ledger, or any
+checked-in file. Resolution happens once per graph handle, and the resolved client is shared by every
+`nearest("string")` call on that handle. Direct single-graph serving, embedded callers, and the offline CLI
+keep the env path unless they inject an `EmbeddingConfig` directly.
+
+### Identity recorded in the schema IR (not a new store)
+
+The `@embed` args serialize into `PropertyIR.annotations` → `_schema.ir.json`, which `schema apply` already
+persists atomically and which the catalog (the one thing `nearest()` reads at query time) is built from. No
+new metadata store, no manifest column, no extra read on the query path. The migration planner already rejects
+non-description annotation changes as `UnsupportedChange`, so "recorded identity is immutable without a
+deliberate re-embed migration" is the default behaviour, not new code. (A second, optional copy in Lance
+field metadata — co-located with the vectors — is available later by activating the currently no-op
+`UpdatePropertyMetadata` migration step; out of scope here.)
+
+### Query-time validation
+
+`resolve_nearest_query_vec` compares the resolved read-side identity against the column's recorded identity
+before embedding; on mismatch it returns a typed `OmniError` naming recorded vs resolved (model, dim,
+provider). This is the only behaviour that closes P3 by construction.
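+
+A minimal sketch of that comparison, under stated assumptions: `EmbeddingIdentity`, `IdentityMismatch`, and
+`check_same_space` are hypothetical stand-ins (the real path returns an `OmniError` from
+`resolve_nearest_query_vec`), and a column with no recorded identity is passed through unchecked, matching the
+"unrecorded `@embed` keeps working" behaviour noted under Phasing.
+
+```rust
+use std::fmt;
+
+/// Hypothetical identity triple: what produced (or would produce) a vector.
+#[derive(Debug, Clone, PartialEq)]
+struct EmbeddingIdentity {
+    provider: String,
+    model: String,
+    dim: usize,
+}
+
+/// Stand-in for the typed error; it names both sides of the mismatch.
+#[derive(Debug, PartialEq)]
+struct IdentityMismatch {
+    recorded: EmbeddingIdentity,
+    resolved: EmbeddingIdentity,
+}
+
+impl fmt::Display for IdentityMismatch {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        write!(
+            f,
+            "embedding identity mismatch: recorded {}/{} (dim {}) vs resolved {}/{} (dim {})",
+            self.recorded.provider, self.recorded.model, self.recorded.dim,
+            self.resolved.provider, self.resolved.model, self.resolved.dim,
+        )
+    }
+}
+
+/// Run before embedding the query string; refuses to rank across vector spaces.
+fn check_same_space(
+    recorded: Option<&EmbeddingIdentity>,
+    resolved: &EmbeddingIdentity,
+) -> Result<(), IdentityMismatch> {
+    match recorded {
+        None => Ok(()), // nothing recorded on the column: no check
+        Some(r) if r == resolved => Ok(()),
+        Some(r) => Err(IdentityMismatch { recorded: r.clone(), resolved: resolved.clone() }),
+    }
+}
+
+fn main() {
+    let stored = EmbeddingIdentity { provider: "gemini".into(), model: "gemini-embedding-2".into(), dim: 3072 };
+    let reader = EmbeddingIdentity { provider: "openai-compatible".into(), model: "text-embedding-3-small".into(), dim: 1536 };
+    if let Err(e) = check_same_space(Some(&stored), &reader) {
+        println!("{e}");
+    }
+}
+```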
+
+### NFR floor (independent of the provider work)
+
+- **Deadline:** wrap every embed call (query or document) in a total-operation deadline
+  (`OMNIGRAPH_EMBED_DEADLINE_MS`) so a degraded provider cannot hang the caller for the current ~121 s worst
+  case (4 × 30 s timeout + backoff).
+- **Observability:** `tracing` span per embed call (provider, model, dim, attempts, outcome, elapsed; `warn!`
+  per retry; token usage when the provider returns it). The subsystem has zero instrumentation today.
+- **Single normalization:** one `normalize_vector` (the dead client carried a divergent second copy; removed
+  in Phase 1).
+- **Stable model:** make the model configurable and default to a stable (non-`-preview`) model once the GA
+  name is confirmed.
+
+### Ingest-time `@embed` (later phase, not this RFC's core)
+
+Making `@embed` embed at ingest is a separate phase with a hard constraint: embedding is a slow, external,
+**non-idempotent** side effect, so it must run **entirely before staging** — in the pure in-memory phase,
+before any `stage_*`/Lance HEAD move, alongside the existing constraint validation — so a mid-load provider
+failure aborts with zero drift. It must never sit inside or after the commit protocol, because the recovery
+sweep cannot re-run or undo an external embedding. It also needs a content-hash skip (so `load --mode
+overwrite` does not re-bill every row), batching, and a bounded-concurrency stage. Specified here only to fix
+the design constraint; deferred to its own RFC/phase.
+
+### Phasing (implementation order)
+
+| Phase | Scope | Demo |
+|---|---|---|
+| **1 — NFR floor + dead-client removal** | deadline, observability, single normalize, configurable model, delete dead client + `NANOGRAPH_*` | a hung provider fails at the deadline; embed calls traced; `rg NANOGRAPH_` empty |
+| **2 — Provider-independent config** | `EmbeddingConfig` + `Provider` enum (OpenAiCompatible covering OpenRouter/OpenAI/local, Gemini, Mock); env-first resolution; client reuse | point `base_url` at OpenRouter, run `nearest("string")`, get correct neighbours vs OpenRouter-stored vectors; CLI shares the config |
+| **3 — Record identity in schema IR** | `@embed` args grammar + catalog + IR persistence | `schema show` reflects recorded model/dim |
+| **4 — Query-time validation** | compare resolved vs recorded; typed error; planner refusal on identity change | stored model A vs read model B → loud error, never silent garbage |
+| **5 — Cluster provider wiring** | `providers.embedding` resources; `graphs.<id>.embedding_provider`; `${NAME}` resolution at server boot | provider profile resolved from applied cluster state; legacy `omnigraph.yaml` untouched |
+| later | ingest-time `@embed` (Shape C) | separate RFC |
+
+**Status:** Phases 1–5 are implemented (`@embed("…", model="…")` is recorded in the schema IR and validated at
+query time with a typed same-space error; an unrecorded `@embed` keeps working with no check; cluster-served
+graphs can bind an applied `providers.embedding` profile). Ingest-time `@embed` remains.
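The query-time rule summarized in the status note (a recorded identity is compared with the resolved one, an unrecorded `@embed` is left unchecked) might look roughly like the sketch below. The names `EmbeddingSpaceMismatch` and `check_same_space` are hypothetical, and only the model name is compared; a recorded dimension would need a similar check.

```python
from typing import Optional


class EmbeddingSpaceMismatch(Exception):
    """Typed error: the query embedder is not the one that produced the stored vectors."""

    def __init__(self, prop: str, recorded: str, resolved: str) -> None:
        super().__init__(
            f"property '{prop}' was embedded with '{recorded}' "
            f"but the query resolved '{resolved}'"
        )
        self.prop = prop
        self.recorded = recorded
        self.resolved = resolved


def check_same_space(prop: str, recorded_model: Optional[str], resolved_model: str) -> None:
    """Refuse a string nearest() that would rank across two vector spaces."""
    if recorded_model is None:
        # Unrecorded @embed: legacy behaviour, nothing to validate against.
        return
    if recorded_model != resolved_model:
        raise EmbeddingSpaceMismatch(prop, recorded_model, resolved_model)
```

Raising a dedicated exception type, rather than returning a flag, keeps the failure loud: a caller cannot forget to inspect the result and rank across spaces by accident.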
+
+## Invariants & deny-list check
+
+- **Invariant 9 (integrity failures are loud):** strengthened — query-time identity mismatch becomes a typed
+  error instead of silent wrong results.
+- **Invariant 10 (query semantics are first-class IR concepts):** embedding identity becomes IR/catalog data,
+  not an out-of-band env guess.
+- **Invariant 11 (transport stays at the boundary):** strengthened — Phase 1 removes the HTTP client + async
+  runtime (`reqwest`, `tokio`) from `omnigraph-compiler`, whose own manifest advertises "Zero Lance
+  dependency"; the embedding HTTP client lives only in the engine.
+- **Invariant 12 / secret handling:** api-keys resolve through the existing credential chain; never in schema
+  or checked-in config.
+- **Invariant 13 (bounded & observable):** addressed — the deadline bounds latency; tracing makes the
+  subsystem observable.
+- **Deny-list — "silent fallback / dropped rows":** the cross-space ranking is exactly a silent-wrong-result;
+  this RFC closes it.
+- **Deny-list — "new write paths that advance Lance HEAD before manifest publish without a recovery
+  sidecar":** the ingest phase (deferred) explicitly keeps embedding *before* staging, so it does not create a
+  new HEAD-advancing write path. No invariant is weakened.
+
+## Drawbacks & alternatives
+
+- **Do nothing.** The happy path works today, so the live risk is narrow (P1 + P3). But the provider hardwiring
+  and missing validation are a latent silent-wrong-results class that bites the first non-Gemini user.
+- **Interim env-only provider switch (no schema record).** Cheaper, but leaves the same-space guarantee to
+  operator discipline (fails P3). Folded in as Phase 2's env-first resolution, with Phases 3–4 adding the
+  record/validate guarantee.
+- **Trait-based provider plug-ins now.** Rejected as unearned complexity for two first-party providers; the
+  enum upgrades to a trait behind the same surface if needed.
+- **Stamp identity in the manifest or Lance field metadata instead of the IR.** The manifest is the wrong
+  granularity; field metadata needs net-new wiring and a query-path dataset open. The IR is where `@embed`
+  already lives and is already read at query time (see spike).
+
+## Reversibility
+
+Mostly reversible. Phases 1–2 and 5 are code/config (env, CLI, cluster keys) and cheap to undo. Phase 3
+(recording identity in the schema IR) is **near-permanent** — it changes the on-disk `_schema.ir.json` shape
+and the schema hash — so it earns the most scrutiny: the single-arg `@embed` form stays byte-compatible, and
+recorded identity is additive (absent identity = today's behaviour). Provider request/response shapes are
+external API contracts, not our format, so adding providers is reversible.
+
+## Gateway tradeoff (OpenRouter)
+
+Routing through OpenRouter (the default) buys provider-independence with one key and one billing relationship,
+batch input, and access to the GA `google/gemini-embedding-2`. Costs to accept, all controllable:
+
+- **Extra network hop** → more query-path latency. The Phase-1 deadline bounds it; the cache mitigates repeats.
+- **Text transits a third party.** OpenRouter's `provider: { data_collection }` routing preference controls
+  retention; shops with strict residency requirements use `kind: gemini`/`openai-compatible` pointed at the
+  provider (or a self-hosted endpoint) directly instead of the gateway. Provider-independence means this is a
+  config change, not a code change.
+- **Loses Gemini's task-type asymmetry** when Gemini is reached via the OpenAI-compatible gateway (both sides
+  embed symmetrically). This is a retrieval-quality cost, **not** a same-space correctness cost — both stored
+  and query vectors take the identical path, so they stay in one space by construction. Shops that want the
+  asymmetry use `kind: gemini`.
+
+## Unresolved questions
+
+- GA Gemini model name — **resolved:** `google/gemini-embedding-2` (via OpenRouter) / `gemini-embedding-2`
+  (direct), 128–3072 dims (recommended 768/1536/3072). Default flips off `-preview` in Phase 2.
+- Gemini `batchEmbedContents` availability — **moot** when going through the OpenAI-compatible gateway (its
+  `input` array batches); still relevant only for the direct `kind: gemini` path.
+- Identity granularity: per-vector-property args vs one graph-level default profile referenced by name.
+- Whether to backfill recorded identity for existing graphs, or treat absent-identity as "unvalidated, legacy"
+  permanently.
+- Default model for the zero-config tier: `google/gemini-embedding-2` vs `openai/text-embedding-3-large`
+  (both 3072-capable) — pick the project default.
--- a/docs/dev/testing.md
+++ b/docs/dev/testing.md
@ -7,7 +7,7 @@ This file is the always-on map of the test surface. **Consult it before every ta
 | Crate | Path | Style |
 |---|---|---|
 | `omnigraph` (engine) | `crates/omnigraph/tests/` | Integration tests (28 files), fixture-driven, share `tests/helpers/mod.rs` |
-| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | Per-area suites (post-modularization): `cli_cluster.rs` (cluster command surface + operator-actor cascade), `cli_cluster_e2e.rs` (spawned-binary lifecycle compositions — lost-state re-import recovery, out-of-band drift, graph-root destruction, multi-graph mixed-disposition convergence), `cli_data.rs` (load/read/change/branch/commit/export/snapshot/policy/embed/maintenance + operator format cascade), `cli_schema_config.rs` (init/config, schema plan/apply, RFC-008 deprecation warnings + `config migrate` + strict mode), `cli_queries.rs`, `system_local.rs` (full-cycle cluster lifecycle with a spawned `--cluster` server, applied-policy enforcement over HTTP, keyed-credential auth, operator aliases), `system_remote.rs`; share `tests/support/mod.rs` (hermetic `OMNIGRAPH_HOME` by default) |
+| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | Per-area suites (post-modularization): `cli_cluster.rs` (cluster command surface + operator-actor cascade), `cli_cluster_e2e.rs` (spawned-binary lifecycle compositions — lost-state re-import recovery, out-of-band drift, graph-root destruction, multi-graph mixed-disposition convergence), `cli_data.rs` (load/read/change/branch/commit/export/snapshot/policy/embed/maintenance + operator format cascade), `cli_schema_config.rs` (init/config, schema plan/apply), `cli_queries.rs`, `parity_matrix.rs` (RFC-009 Phase 1: the embedded-vs-remote referee — every forked verb run against both arms with matched Cedar policy and the same actor, scrubbed-JSON + exit-code equality; divergences are pinned in its `KNOWN_DIVERGENCES` ledger, never silently repaired), `system_local.rs` (full-cycle cluster lifecycle with a spawned `--cluster` server, applied-policy enforcement over HTTP, keyed-credential auth, operator aliases), `system_remote.rs`; share `tests/support/mod.rs` (hermetic `OMNIGRAPH_HOME` by default) |
 | `omnigraph-cluster` | mostly in-source `#[cfg(test)] mod tests`; `tests/failpoints.rs` (feature-gated); `tests/s3_cluster.rs` (bucket-gated full lifecycle on object storage) | Cluster config parser, local JSON state diff, state CAS/lock handling/recovery, read-only validate/plan/status plus explicit refresh/import graph observations, config-only apply (content-addressed payload publish, disposition gating, composite-digest convergence, idempotent re-apply), catalog payload verification (status read-only, refresh drift + self-heal), failpoint crash-mid-apply / CAS-race coverage, Stage 4A graph creation (create executor, recovery sidecars + sweep rows, create crash windows), Stage 4B schema apply (migration previews in plan, schema executor, schema-apply sweep classification, schema crash windows), Stage 4C gated deletes (digest-bound approvals, delete executor + tombstones, delete sweep rows, delete crash windows), and 5A policy binding metadata (applies_to in the applied revision, binding-change diffing + convergence, pre-5A backfill), and the 5B serving-snapshot read API (converged read, refusal rows) |
 | `omnigraph-server` | `crates/omnigraph-server/tests/` | Per-area suites (post-modularization): `auth_policy.rs`, `data_routes.rs`, `schema_routes.rs`, `stored_queries.rs`, `multi_graph.rs` (cluster-mode boot — converged serving, policy binding wiring, boot refusals — + the concurrent branch-ops matrix), `boot_settings.rs` (mode inference, PolicySource), `s3.rs` (bucket-gated: single-graph serving + config-free `--cluster s3://` boot), `openapi.rs` (OpenAPI drift / regeneration); share `tests/support/mod.rs` |
 | `omnigraph-compiler` | mostly in-source `#[cfg(test)] mod tests` | Parser, type-checker, IR lowering, lint |
@ -29,7 +29,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
 | `point_in_time.rs` | Snapshots, time travel (`snapshot_at_version`, `entity_at`) |
 | `changes.rs` | `diff_between` / `diff_commits` |
 | `consistency.rs` | Cross-table snapshot isolation, atomic publish |
-| `schema_apply.rs` | Migration plan + apply, schema-apply lock |
+| `schema_apply.rs` | Migration plan + apply, schema-apply lock; index materialization deferred to the reconciler (iss-848): `apply_schema_defers_vector_index_on_empty_table` (an empty-table Vector `@index` never aborts the apply) and `index_only_constraint_apply_touches_no_table_data` (adding an `@index` is metadata-only — no table-version bump) |
 | `search.rs` | FTS / vector / hybrid (`bm25`, `nearest`, `rrf`) |
 | `traversal.rs` | `Expand`, variable-length hops, anti-join (CSR path — `OMNIGRAPH_TRAVERSAL_MODE` unset) |
 | `traversal_indexed.rs` | BTREE-indexed Expand (`execute_expand_indexed`) forced via `OMNIGRAPH_TRAVERSAL_MODE`, asserted semantically equal to the CSR path; own binary, all `#[serial]` so env writes never race |
@ -42,7 +42,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
 | `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior |
 | `validators.rs` | Schema constraint enforcement (enum, range, unique, cardinality) across JSONL, insert, update paths |
 | `policy_engine_chassis.rs` | Engine-layer Cedar enforcement (MR-722): allow + deny through every `_as` writer via the SDK directly — no HTTP — proving embedded and CLI callers hit the same gate as the server, with action × scope shapes matching `authorize_request` |
-| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice |
+| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice; the index reconciler (iss-848): `index_build_tolerates_null_vector_rows` (an untrainable Vector column defers instead of aborting the build, sibling indexes still build) and `optimize_materializes_index_declared_but_unbuilt` (optimize creates a declared-but-deferred index) |
 | `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the five per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`, `optimize_phase_b_failure_recovered_on_next_open`) and the write-entry in-process heal contract (the four `*_after_finalize_publisher_failure_heals_without_reopen` tests — load, mutation, schema apply, branch merge: a follow-up write on the same handle rolls a sidecar-covered residual forward without reopen/refresh) and the storage-fault matrix for the sidecar lifecycle (`recovery.sidecar_{write,delete,list}` / `recovery.record_audit` failpoints: Phase A put failure aborts with zero drift, Phase D delete failure is swallowed and healed by the next write, list failures are loud at heal and open, audit-append failures are retried to exactly one audit row; plus the bucket-gated `s3_load_recovers_after_publisher_failure_without_reopen`). |
 | `recovery.rs` | Open-time recovery sweep — sidecar I/O, classifier dispatch (NoMovement / RolledPastExpected / UnexpectedAtP1 / UnexpectedMultistep / InvariantViolation), all-or-nothing decision, roll-forward via `ManifestBatchPublisher::publish`, roll-back via `Dataset::restore`, audit row in `_graph_commit_recoveries.lance`, `OpenMode::ReadOnly` skip path |
 | `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories, post-optimize and post-cleanup strict writes). |
--- a/docs/dev/writes.md
+++ b/docs/dev/writes.md
@ -19,8 +19,14 @@ publisher's row-level CAS on `__manifest` is the single fence.
  `__run__*` branch on an upgraded graph is swept off `__manifest` by the
  v2→v3 internal-schema migration on first read-write open. (The inert
  `_graph_runs.lance` bytes remain until a `delete_prefix` primitive lands.)
- Cancelled mutation futures leave **no graph-level state** — only orphaned
-  Lance fragments, which the existing `omnigraph cleanup` pipe reclaims.
+- Cancelled mutation futures leave **no graph-visible state** — the manifest
+  is never advanced. They can leave two kinds of unreferenced residue, both
+  self-healing: orphaned Lance fragments (reclaimed by `omnigraph cleanup`),
+  and — on the *first* write to a table on a branch, which forks it before the
+  publish — a manifest-unreferenced branch ref. The next write to that table
+  reclaims the stale fork and re-forks (`reclaim_orphaned_fork_and_refork`),
+  and `cleanup`'s per-table reconciler is the guaranteed backstop; see the
+  fork-reclaim note in [invariants.md](invariants.md).

 ## Read-your-writes within a multi-statement mutation

@ -80,10 +86,17 @@ deferred to a follow-up cycle — tracked).
 Three writers have been migrated onto staged primitives:

 * **`ensure_indices`** (`db/omnigraph/table_ops.rs::build_indices_on_dataset_for_catalog`)
-  — scalar indices (BTree, Inverted) now use `stage_create_*_index` +
-  `commit_staged`. Vector indices stay inline (residual — Lance
-  `build_index_metadata_from_segments` is `pub(crate)` in 6.0.1;
-  companion ticket to lance-format/lance#6658 needed).
+  — scalar indices (BTree, Inverted) use `stage_create_*_index` +
+  `commit_staged`. Which index an `@index`/`@key` property gets is dispatched by
+  type via `node_prop_index_kind` (enum + orderable scalar → BTree, free-text
+  String → Inverted/FTS, Vector → vector). Vector indices stay inline (residual
+  — Lance `build_index_metadata_from_segments` is `pub(crate)` in 6.0.1;
+  companion ticket to lance-format/lance#6658 needed). This build is
+  existence-gated (it creates a *missing* index over current fragments); folding
+  fragments appended afterward into an *existing* index is `optimize`'s
+  `optimize_indices` pass — an inline-commit residual, not a staged write (Lance
+  exposes no uncommitted index-optimize), covered by the optimize recovery
+  sidecar (see [maintenance.md](../user/operations/maintenance.md)).
 * **`branch_merge::publish_rewritten_merge_table`**
  (`exec/merge.rs`) — merge_insert now uses `stage_merge_insert` +
  `commit_staged`. Deletes stay inline (Lance #6658 residual).
@ -305,7 +318,7 @@ success and one failure. The losing writer's error is
 `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected,
 actual }`. The HTTP server maps this to **409 Conflict** with body
 `{"error": "...", "code": "conflict", "manifest_conflict": { "table_key":
-"...", "expected": N, "actual": M }}` — see [docs/user/server.md](../user/server.md).
+"...", "expected": N, "actual": M }}` — see [docs/user/server.md](../user/operations/server.md).

 ## Audit

--- a/docs/releases/v0.7.0.md
+++ b/docs/releases/v0.7.0.md
@ -1,90 +1,294 @@
 # Omnigraph v0.7.0

-v0.7.0 takes the cluster control plane to object storage and overhauls the
-configuration architecture around two single-owner surfaces. A cluster —
-state ledger, content-addressed catalog, and graph data — can now live
-entirely on an S3-compatible bucket, and a server can boot from that bucket
-with no local files at all. Operator identity, credentials, and personal
-aliases move to a home-level config; the legacy combined `omnigraph.yaml`
-enters a guided, staged deprecation.
+v0.7.0 is three large arcs in one release. **Operations:** the cluster control
+plane moves to object storage and the configuration architecture collapses to two
+single-owner surfaces — a cluster can live entirely on an S3-compatible bucket, a
+server boots from it with no local files, and the legacy combined `omnigraph.yaml`
+is **removed**. **CLI:** the command-line surface is unified and made honest —
+embedded and remote runs are one execution path, `load` becomes the single
+bulk-write command, every command declares the **capability** it needs (and
+rejects flags that don't apply), and the server boots only from a cluster.
+**Engine & substrate:** Lance moves to 7.x, traversal/index/recovery internals
+get faster and self-healing, and text embedding becomes provider-independent.

 ## Highlights

- **Clusters on object storage (`storage:`).** `cluster.yaml` gains an
-  optional `storage: s3://bucket/prefix` root. Every stored byte — the
-  state ledger, lock, recovery sidecars, approval artifacts, catalog blobs,
-  and the derived graph roots (`<storage>/graphs/<id>.omni`) — flows
-  through one storage layer, so `file://` (the default, byte-compatible
-  with existing clusters) and `s3://` are a single code path. The ledger's
-  compare-and-swap uses S3 conditional writes (`If-Match` /
-  `If-None-Match`), verified against AWS semantics, RustFS, and
-  Tigris-backed stores; the state lock is genuinely cross-machine on
-  object storage.
- **Config-free serving: `--cluster s3://bucket/prefix`.** The server
-  accepts a bare storage-root URI and boots from the applied revision on
-  the bucket — the ledger and catalog are the whole deployment artifact.
-  Policy bundles serve as digest-verified *content* from the catalog
-  (never re-read from disk), closing the last gap for fully remote
-  clusters. The preferred container shape becomes **bucket, no volume**
-  (see `docs/user/deployment.md`).
- **Per-operator configuration (`~/.omnigraph/`).** A home-level config
-  carries operator identity (`operator.actor`, the new last hop of the
-  `--as` chain), output defaults, named servers, and personal aliases.
-  `$OMNIGRAPH_HOME` relocates it; `$OMNIGRAPH_CONFIG` now stands in for
-  `--config` in both binaries.
- **Credentials keyed by server name.** `omnigraph login <server>` stores
-  a bearer token in `~/.omnigraph/credentials` (created `0600`; over-
-  permissive files are refused). Token resolution for a request whose URL
-  matches an operator-defined server: `OMNIGRAPH_TOKEN_<NAME>` env → the
-  credentials file → the legacy `bearer_token_env` chain unchanged. A
-  token is only ever sent to the server it is keyed to.
- **Operator targeting and aliases.** `--server <name>` (with `--graph
-  <id>` for multi-graph servers) addresses operator-defined endpoints on
-  every remote-capable command. Operator aliases are pure *bindings* —
-  personal name → (server, graph, stored-query name, default params) —
-  invoking catalog-owned stored queries; they carry no query content.
- **`omnigraph.yaml` deprecation begins.** Loading the legacy file prints
-  a per-key notice naming each present key's new home
-  (`OMNIGRAPH_SUPPRESS_YAML_DEPRECATION=1` to silence in CI).
-  `omnigraph config migrate` proposes — and with `--write`, applies — the
-  split: team half to a ready-to-review `cluster.yaml`, personal half
-  merged into the operator config (existing entries always win).
-  `omnigraph init` no longer scaffolds the file. Migrated teams can set
-  `OMNIGRAPH_NO_LEGACY_CONFIG=1` to turn any legacy-file load into a hard
-  error. The file itself keeps working until its removal at the next
-  major version.
+### Clusters & storage on object storage

-## Breaking / behavior changes
+- **Clusters on object storage (`storage:`).** `cluster.yaml` gains an optional
+  `storage: s3://bucket/prefix` root. Every stored byte — state ledger, lock,
+  recovery sidecars, approval artifacts, catalog blobs, and the derived graph
+  roots (`<storage>/graphs/<id>.omni`) — flows through one storage layer, so
+  `file://` (the default, byte-compatible with existing clusters) and `s3://`
+  are a single code path. The ledger's compare-and-swap uses S3 conditional
+  writes (`If-Match`/`If-None-Match`), verified against AWS, RustFS, and other
+  S3-compatible stores; the state lock is genuinely cross-machine on object
+  storage.
+- **Config-free serving: `--cluster s3://bucket/prefix`.** The server accepts a
+  bare storage-root URI and boots from the applied revision on the bucket — the
+  ledger and catalog are the whole deployment artifact. Policy bundles serve as
+  digest-verified *content* from the catalog (never re-read from disk). The
+  preferred container shape becomes **bucket, no volume** (see
+  `docs/user/deployment.md`).
+- **Cluster-only server.** `omnigraph-server` boots **only** from `--cluster
+  <dir | s3://…>` and serves N graphs (N ≥ 1) under cluster routes
+  (`/graphs/{id}/…`, plus a read-only `GET /graphs` enumeration). The old
+  single-graph flat-route mode, positional-`<URI>` boot, and `omnigraph.yaml`
+  `graphs:`-map boot are gone — add or remove graphs with `cluster apply` and
+  restart.
+- **One storage substrate + recovery liveness.** The cluster storage backend and
+  the engine both go through one `StorageAdapter` (versioned read, conditional
+  replace/CAS, prefix delete), exercised by a storage fault-injection matrix.
+  A long-lived server now heals a recoverable write on its *next write* rather
+  than only at restart.
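To make the ledger's compare-and-swap concrete, here is a small in-memory model of conditional writes. It only imitates the `If-Match` / `If-None-Match: *` preconditions; `ConditionalStore`, `PreconditionFailed`, and `cas_update` are illustrative names, not the cluster crate's `StorageAdapter` API, and the ETag scheme is invented for the example.

```python
from typing import Callable, Dict, Optional, Tuple


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 response from the object store."""


class ConditionalStore:
    """In-memory stand-in for an object store with conditional PUT."""

    def __init__(self) -> None:
        self._objects: Dict[str, Tuple[str, str]] = {}
        self._counter = 0

    def get(self, key: str) -> Optional[Tuple[str, str]]:
        """Return (etag, body), or None when the key does not exist."""
        return self._objects.get(key)

    def put(self, key: str, body: str, if_match: Optional[str] = None,
            if_none_match: Optional[str] = None) -> str:
        current = self._objects.get(key)
        # If-None-Match: * means "create only if absent".
        if if_none_match == "*" and current is not None:
            raise PreconditionFailed(key)
        # If-Match means "replace only the version I read".
        if if_match is not None and (current is None or current[0] != if_match):
            raise PreconditionFailed(key)
        self._counter += 1
        etag = f"v{self._counter}"
        self._objects[key] = (etag, body)
        return etag


def cas_update(store: ConditionalStore, key: str,
               update: Callable[[Optional[str]], str]) -> str:
    """Read, compute the next state, and write it only if nobody else wrote first."""
    current = store.get(key)
    if current is None:
        return store.put(key, update(None), if_none_match="*")
    etag, body = current
    return store.put(key, update(body), if_match=etag)
```

Two writers that read the same ETag cannot both succeed: the second `put` fails its precondition and has to re-read, which is what makes a lock or ledger built this way safe across machines.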

- `omnigraph init` no longer writes an `omnigraph.yaml` into the working
-  directory. Start cluster configs from the documentation templates, or
-  run `omnigraph config migrate` against an existing legacy file.
- Loading a legacy `omnigraph.yaml` now emits a deprecation block on
-  stderr (suppressible; see above). Output on stdout is unchanged.
- `ServingPolicy` (cluster crate API) carries verified policy *content*
-  instead of a blob path; `read_serving_snapshot` and several cluster
-  command entry points are now async.
+### Configuration: two single-owner surfaces
+
+The legacy combined `omnigraph.yaml` is **removed**. Configuration now lives in
+two surfaces with single owners, plus a zero-config tier:
+
+- **Cluster config (`cluster.yaml` + checkout, team-owned)** declares what the
+  system *is*: graphs, schemas, stored queries, policies, storage. A server boots
+  from it via `--cluster`.
+- **Per-operator config (`~/.omnigraph/config.yaml`, person-owned)** declares who
+  *you* are: `operator.actor` (the last hop of the `--as` chain), output
+  defaults, named servers + clusters, profiles, aliases, and a default scope.
+  `$OMNIGRAPH_HOME` relocates it.
+- **Credentials keyed by server name.** `omnigraph login <server>` stores a
+  bearer token in `~/.omnigraph/credentials` (created `0600`; over-permissive
+  files refused). Resolution for a request whose URL matches an operator-defined
+  server: `OMNIGRAPH_TOKEN_<NAME>` env → the credentials file → the default
+  `OMNIGRAPH_BEARER_TOKEN`. A token is only ever sent to the server it is keyed to.
+- **Operator targeting and aliases.** `--server <name>` (with `--graph <id>` for
+  multi-graph servers) addresses operator-defined endpoints. Operator aliases are
+  pure, **read-only** *bindings* — personal name → (server, graph, stored-query
+  name, default params) — invoking catalog-owned stored queries; they carry no
+  query content and a binding to a stored mutation is rejected.
+- **Default scopes.** `defaults.server` (served) or `defaults.store` (a zero-flag
+  *local* default — mutually exclusive with `server`) supply the no-flag scope,
+  with an optional `default_graph`. `--profile <name>` / `$OMNIGRAPH_PROFILE`
+  selects a named scope bundle wholesale; `omnigraph profile list` /
+  `profile show [<name>]` inspect what's defined (read-only).
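The token resolution order in the credentials bullet above can be read as a first-match chain. The sketch below assumes the server has already been matched by URL, models the credentials file as a plain mapping, and guesses how a server name maps to the `OMNIGRAPH_TOKEN_<NAME>` suffix (upper-cased, non-alphanumerics replaced by `_`); none of those details are confirmed by the release notes.

```python
import re
from typing import Mapping, Optional


def env_var_for(server_name: str) -> str:
    """Assumed mapping from a server name to its per-server env variable."""
    return "OMNIGRAPH_TOKEN_" + re.sub(r"[^A-Za-z0-9]", "_", server_name).upper()


def resolve_token(
    server_name: str,
    env: Mapping[str, str],
    credentials: Mapping[str, str],
) -> Optional[str]:
    """Per-server env var, then the credentials file, then the default token."""
    candidates = (
        env.get(env_var_for(server_name)),
        credentials.get(server_name),
        env.get("OMNIGRAPH_BEARER_TOKEN"),
    )
    for token in candidates:
        if token:
            return token
    return None
```

Passing `env` and `credentials` in as mappings keeps the function pure, so the precedence can be checked without touching the real environment or the `0600` credentials file.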
+
+### Unified, capability-aware CLI
+
+- **One bulk-write command: `omnigraph load`.** `load` is now the single data-write
+  command and works against remote graphs (over HTTP with the same bearer/actor
+  resolution as every other remote command) — previously the only data command
+  forced to open storage directly. `--mode overwrite|append|merge` is **required**
+  (overwrite is destructive, so there is no default); `--from <base>` opts into
+  fork-if-missing for `--branch`. `omnigraph ingest` becomes a **deprecated
+  alias** (`--from main --mode merge` defaults; one-line stderr warning).
+- **No implicit branch forks.** Loading into a branch that does not exist is an
+  **error** unless `--from <base>` is given — a typo'd branch name no longer
+  silently forks `main` and lands your data there. Same rule on the server.
+- **One execution path, embedded ≡ remote.** Every CLI verb runs through one
+  `GraphClient` with two implementations (embedded engine, HTTP) sharing a single
+  wire-DTO crate (`omnigraph-api-types`). An executable parity matrix runs every
+  verb against both and asserts identical results, so local and remote no longer
+  drift.
+- **Declared capabilities + honest addressing.** Every command declares the
+  **capability** it needs — `any` (run against a graph, served or embedded),
+  `served` (needs a server), `direct` (direct storage access), `control`
+  (manage/inspect a cluster), or `local` (no graph) — and the CLI enforces it.
+  Wrong-capability addressing now fails loudly with a declared message (e.g.
+  `--server` on `optimize`) instead of being silently ignored, and a maintenance
+  verb pointed at a remote target is rejected. `omnigraph --help` groups commands
+  by capability with a legend.
+- **Address cluster graphs for maintenance.** `optimize` / `repair` / `cleanup`
+  accept `--cluster <dir|s3://…> --graph <id>` (`--cluster` is a cluster directory,
+  storage-root URI, or a `clusters:` name from `~/.omnigraph/config.yaml`),
+  resolving the graph's storage URI from the served cluster state (no need to
+  hand-type `<storage>/graphs/<id>.omni`). `--graph` is the single graph selector
+  across server and cluster scopes. Conversely, `omnigraph init` **refuses** a
+  cluster-managed path and points at `cluster apply` — graphs in a cluster are
+  created with ledger/recovery/approvals, not by hand. `schema apply` refuses a
+  cluster-managed graph for the same reason (and the server rejects a cluster-
+  backed schema apply with `409`, pointing at `cluster apply`).
+- **Write diagnostics + destructive-write safety (RFC-011 Decision 9).** Every
+  write (`load`, `mutate`, `branch create|delete|merge`, `schema apply`,
+  `optimize`, `repair`, `cleanup`) echoes its resolved target + access path to
+  stderr — e.g. `omnigraph load → s3://…/knowledge.omni (direct, remote)` —
+  suppressible with the global `--quiet`. Destructive writes against a
+  **non-local** scope (`cleanup`, overwrite `load`, `branch delete` against an
+  `http(s)://` server or `s3://` store/cluster) require explicit consent: the
+  global `--yes`, an interactive TTY prompt, or — for a non-interactive /
+  `--json` run — a hard refusal instead of silently proceeding. Local (`file://`)
+  writes are unaffected.
+- **Route alignment: canonical `POST /load`.** The server gains a canonical
+  `POST /load`; `POST /ingest` is now a deprecated alias that emits RFC 9745
+  `Deprecation: true` + RFC 8288 `Link: <load>; rel="successor-version"`
+  headers (a sibling-relative reference that resolves under `/graphs/{id}/…`).
+  The CLI's `load` targets `/load`.
+- **Operator aliases get their own namespace (`omnigraph alias <name>`).** A
+  personal binding to a stored query on a named server is invoked as
+  `omnigraph alias <name> [args]` (RFC-011 Decision 4), so an alias can never
+  shadow — or be shadowed by — a built-in verb. `alias` rejects global scope
+  flags (`--server`/`--graph`/`--store`/`--cluster`/`--profile`/`--as`) its
+  binding already owns.
+- **No-graph addressing lists candidates (RFC-011 Decision 7).** When a scope
+  has no `--graph` and no `default_graph`, the CLI never silently picks. A
+  **cluster** scope with exactly one applied graph uses it automatically and
+  otherwise **lists the candidates** (from the served catalog). A multi-graph
+  **server** lists the candidates (from `GET /graphs`) and requires `--graph <id>`.
+- **Invoke stored queries by name (RFC-011 Decision 3).** `omnigraph query
+  <name>` / `mutate <name>` invoke a stored query **by name** from the served
+  catalog — `omnigraph query find_people` instead of `--query find.gq --name
+  find_people`. The verb asserts the query's kind (an `expect_mutation` flag on
+  `POST /queries/{name}`: `query <a-mutation>` is rejected with `'<name>' is a
+  mutation — use omnigraph mutate <name>`, and vice versa). `.gq` files become
+  the explicit ad-hoc lane (`-e` / `--query`), with the positional selecting
+  which query in the source.
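The destructive-write consent rule in the list above reduces to a small decision function. The sketch below is one reading of that bullet, not the CLI's implementation: `Consent`, `consent_for`, and the prefix test for a non-local scope are illustrative, and a bare filesystem path is assumed to count as local.

```python
from enum import Enum


class Consent(Enum):
    PROCEED = "proceed"   # run without asking
    PROMPT = "prompt"     # ask on the interactive TTY
    REFUSE = "refuse"     # hard refusal


# Scopes the release notes name as non-local.
NON_LOCAL_PREFIXES = ("http://", "https://", "s3://")


def consent_for(
    target: str,
    destructive: bool,
    yes: bool = False,
    interactive: bool = False,
    json_output: bool = False,
) -> Consent:
    """Decide how a write proceeds, given its target and the global flags."""
    if not destructive or not target.startswith(NON_LOCAL_PREFIXES):
        # Non-destructive writes and local (file://) targets are unaffected.
        return Consent.PROCEED
    if yes:
        return Consent.PROCEED
    if interactive and not json_output:
        return Consent.PROMPT
    # Non-interactive or --json run without --yes: never proceed silently.
    return Consent.REFUSE
```

Keeping the decision separate from the prompt itself means each combination of flags can be exercised in a test without a terminal attached.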
+
+### Engine & substrate
+
+- **Lance 6.0.1 → 7.0.0.** The columnar substrate is bumped to Lance 7.x with
+  correct-by-design alignment: the unenforced primary key is immutable once set,
+  `WriteParams::auto_cleanup` is disabled so version GC stays operator-owned, and
+  the native namespace/`object_store` 0.13 surface is pinned by surface-guard
+  tests. No on-disk format change for existing graphs.
+- **Indexed graph traversal.** `Expand` can run over a BTREE-indexed path,
+  asserted semantically equal to the CSR traversal it accelerates.
+- **Scalar index coverage + filter literal coercion.** Closes index-coverage gaps
+  and coerces filter literals correctly, cutting query latency on indexed scans.
+- **Index materialization is derived state.** `schema apply` records
+  `@index`/`@key` *intent* and builds nothing (index-only changes touch no table
+  data); `load`/`mutate` build inline through one chokepoint but **defer** an
+  untrainable Vector column as *pending* instead of aborting; `optimize` is the
+  reconciler that materializes declared-but-missing indexes and folds appended
+  fragments back into existing ones.
+- **Recovery liveness + one storage substrate.** Writers heal a recoverable
+  write on the *next write* (not only at the next read-write open); a storage
+  fault-injection matrix exercises the sidecar lifecycle; the cluster and engine
+  share one `StorageAdapter` over `object_store`.
+- **Branch-fork self-heal.** Manifest-unreferenced branch forks are reclaimed
+  (eager best-effort + a `cleanup` reconciler backstop), so a failed branch-delete
+  reclaim no longer wedges a reused branch name.
+- **Composite `@unique(a, b)`.** Enforced as a true composite key, with one shared
+  keying function for intake and branch-merge that fails loudly on an un-keyable
+  column type rather than silently exempting it.
+
+### Embeddings: provider-independent (RFC-012)
+
+- **One client, any provider.** Text embedding moves to a single
+  provider-independent `EmbeddingConfig` behind a sealed `Provider` enum:
+  **OpenAI-compatible** (the **OpenRouter** default gateway — one key for many
+  models — plus OpenAI-direct and self-hosted endpoints), native **Gemini**, and
+  a deterministic **Mock**. One client serves both the query path and the offline
+  `omnigraph embed` CLI, with a per-query deadline and `tracing` observability.
+  The dead, uncallable compiler-crate OpenAI client (and its `reqwest`/`tokio`
+  deps) was removed.
+- **Same-space guarantee.** `@embed("source", model="…")` records the embedding
+  identity (model) in the schema IR so it travels with the data; a string
+  `nearest()` whose resolved embedder model differs from the recorded one is
+  **rejected with a typed error** instead of silently ranking across vector
+  spaces. (`@embed` still does no ingest-time embedding — deferred to a later
+  phase.)
+
+## Breaking & behavior changes
+
+- **`omnigraph.yaml` is removed.** The CLI and server no longer read it at all;
+  the `OmnigraphConfig` type, `omnigraph config migrate`, and the deprecation
+  env vars (`OMNIGRAPH_NO_LEGACY_CONFIG`, `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION`,
+  `OMNIGRAPH_CONFIG`) are gone. Configure via a team `cluster.yaml` and a
+  per-operator `~/.omnigraph/config.yaml` (see Upgrade notes).
+- **`omnigraph-server` boots only from `--cluster`.** The positional-`<URI>`
+  single-graph boot and the `omnigraph.yaml` `graphs:`-map boot are removed; all
+  HTTP is under `/graphs/{id}/…` (with flat `/healthz` and the `/graphs`
+  enumeration). Upgrade deployments to `omnigraph-server --cluster <dir|s3://…>`.
+- **Default embedding provider flips to OpenRouter.** Embedding is no longer
+  hardwired to Gemini: the default provider is **OpenAI-compatible via
+  OpenRouter**, `OMNIGRAPH_GEMINI_BASE_URL` is dropped, and Gemini-direct users
+  must set `OMNIGRAPH_EMBED_PROVIDER=gemini`. A `nearest("string")` query whose
+  resolved model differs from a property's recorded `@embed(model=…)` is now a
+  typed error rather than silent cross-space ranking.
+- **`query --alias <name>` is removed.** Invoke operator aliases via
+  `omnigraph alias <name> [args]`.
+- **`query`/`mutate` no longer take a positional graph URI, `--uri`, or
+  `--name`** (RFC-011 D3). The positional is now the query name; address the
+  graph with `--store` (local) / `--server` / `--profile`, and select a query
+  within an ad-hoc `--query`/`-e` source with the positional (replacing
+  `--name`). By-name catalog invocation is **served-only** (a bare `--store` has
+  no catalog — use `-e`/`--query` there). Scripts using
+  `query <graph-uri> --query f.gq --name q` become
+  `query --store <graph-uri> --query f.gq q`.
+- **Legacy data-plane addressing removed** (#238): `--target`, the positional
+  `http(s)://`→remote dispatch, and `--as` on a served write (the actor is
+  resolved server-side from the bearer token) no longer exist.
+- **`omnigraph load` replaces direct-storage-only loading; `--mode` is required.**
+  Scripts calling `load` without `--mode` must add one (`overwrite|append|merge`).
+- **`omnigraph ingest` is deprecated** (still works; one-line stderr warning).
+  Use `load --from <base> --mode <mode>`.
+- **Loading into a missing branch is now an error without `--from`** (CLI and
+  `POST /load`/`POST /ingest`): a missing branch returns 404 / fails, never an
+  implicit fork. Pass `--from <base>` (CLI) or the request `from` field (HTTP) to
+  fork-if-missing. This affects any workflow that relied on auto-forking.
+- **Scope flags that can't apply now error instead of being silently ignored.**
+  `--server` on any direct/control/session command, `--cluster` outside the
+  cluster-scoped verbs, and `--graph` where no multi-graph scope applies all fail
+  with a declared message. `--graph` is the single graph selector and is
+  **accepted** on `optimize` / `repair` / `cleanup` when paired with `--cluster`
+  (replacing the removed `--cluster-graph`).
+- **`schema apply` is refused against a cluster-managed graph.** The CLI signposts
+  `omnigraph cluster apply`; a cluster-backed server returns `409 Conflict`
+  (after the Cedar gate, so an unauthorized actor still gets `403`). Cluster
+  graphs evolve through `cluster apply`, never a direct apply.
+- **Storage-plane error text changed.** A maintenance verb pointed at a remote
+  target now fails with a declared direct-capability message (replacing the older
+  "only supported against local graph URIs" wording). Error strings are an observable
+  contract (Hyrum's Law); pin against the new text.
+- **Non-local destructive writes now require `--yes` in automation.** A
+  `cleanup` / overwrite-`load` / `branch delete` against an `http(s)://` or
+  `s3://` target with `--json` (or any non-TTY context) previously executed;
+  it now **refuses** unless `--yes` is passed. CI scripts that destroy remote
+  data must add `--yes`. Local (`file://`) writes are unchanged.
+- **`omnigraph init` no longer scaffolds a config file,** and **refuses a
+  cluster-managed storage path** (`<root>/graphs/<id>.omni` under a cluster) —
+  create those graphs with `cluster apply`.
+- **`POST /ingest` is deprecated** (kept indefinitely as a shim) and returns
+  `Deprecation`/`Link` headers. **A v0.7 CLI talks to `POST /load`,** which a
+  pre-0.7 server does not expose — upgrade the server and CLI together, or keep
+  using `ingest`.
+- **`ServingPolicy` (cluster crate API) carries verified policy content instead
+  of a blob path; `read_serving_snapshot` and several cluster command entry points
+  are now `async`.**
+- **`omnigraph --help` reorders commands** (grouped by capability) and **hides
+  the deprecated `ingest`** from the listing — `ingest` still runs. Help text is
+  observable; this is a deliberate output change.

 ## Upgrade notes

 - Existing clusters need no migration: an absent `storage:` key keeps the
  config-directory layout byte-for-byte.
- Existing `omnigraph.yaml` setups keep working through the deprecation
-  window; `omnigraph config migrate` produces the recommended split.
- Operator setup is three lines:
-  `mkdir -p ~/.omnigraph`, write `operator.actor` (and `servers:`) into
-  `~/.omnigraph/config.yaml`, then `echo $TOKEN | omnigraph login <server>`.
+- **`omnigraph.yaml` is no longer read.** There is no automated migrate command
+  in 0.7.0; recreate configuration as a team `cluster.yaml` (graphs, schemas,
+  stored queries, policies — see `docs/user/clusters/`) plus a per-operator
+  `~/.omnigraph/config.yaml` (identity, servers, credentials, defaults — see
+  `docs/user/cli/reference.md`).
+- **`omnigraph-server` now requires `--cluster <dir | s3://…>`** — there is no
+  positional-URI boot. Run `cluster apply` first, then serve the applied revision.
+- **Gemini-direct embedding users** set `OMNIGRAPH_EMBED_PROVIDER=gemini` (the
+  default is now OpenRouter); `OMNIGRAPH_GEMINI_BASE_URL` is removed.
+- Audit scripts for two CLI changes: add `--mode` to every `load`, and add
+  `--from <base>` anywhere you relied on a missing branch being auto-created.
+- Upgrade server and CLI together for the `/load` route (or keep `ingest`).
+- Operator setup is three lines: `mkdir -p ~/.omnigraph`, write `operator.actor`
+  (and `servers:`) into `~/.omnigraph/config.yaml`, then
+  `echo $TOKEN | omnigraph login <server>`.
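+
+A minimal sketch of the resulting `~/.omnigraph/config.yaml`, assuming the
+operator-config shape is unchanged from the earlier CLI reference. The actor
+id, the server name `prod`, and the URL are placeholders, and the nested `url:`
+key under a server name is an assumption; treat `docs/user/cli/reference.md`
+as the authority for the exact keys:
+
+```yaml
+operator:
+  actor: act-example                 # placeholder actor id
+servers:
+  prod:                              # placeholder name, later passed to `omnigraph login`
+    url: https://graph.example.com   # assumed key; tokens do not belong in this file
+```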

 ## Internals

- The cluster, server, and CLI crates were modularized (the 7.9k-line
-  cluster `lib.rs` is now eight focused modules; the server and CLI test
-  monoliths split into per-area suites) — pure code movement, no behavior
-  change.
- New gated end-to-end suites run the full cluster lifecycle against a
-  real S3-compatible store in CI, including a lock-release regression and
-  a config-free server boot from a bare bucket URI.
- The deployment guide gains the bucket-no-volume container recipe for
-  AWS and Railway, validated against a live Railway deployment
-  (Railway buckets are S3-compatible and pass the conditional-write
-  contract test).
+- The cluster, server, and CLI crates were modularized (the ~7.9k-line cluster
+  `lib.rs` into focused modules; the server and CLI test monoliths into per-area
+  suites) — pure code movement.
+- The parity matrix (embedded vs remote) is the new referee for CLI behavior; the
+  OpenAPI drift test guards `openapi.json`; Lance-surface guard tests pin the
+  upstream APIs the engine depends on (the first smoke check on a Lance bump).
+- Gated end-to-end suites run the full cluster lifecycle against a real
+  S3-compatible store in CI (lock-release regression, config-free boot from a
+  bare bucket URI).
+- The deployment guide gains the bucket-no-volume container recipe for AWS /
+  S3-compatible object storage.
+- `clap` updated to 4.6.1. CI runs the full workspace suite on `main` post-merge
+  rather than on every PR (faster PR turnaround; the local
+  `cargo test --workspace --locked` is the pre-merge gate).
--- a/docs/user/audit.md
+++ b/docs/user/audit.md
@ -1,7 +0,0 @@
-# Audit / Actor tracking
-
- `Omnigraph::audit_actor_id: Option<String>` is the actor in effect.
- `_as` variants of every write API let callers override the actor: `mutate_as`, `load_as`, `branch_merge_as`, `apply_schema_as`, etc.
- Actor IDs are persisted on `GraphCommit.actor_id` with split storage in `_graph_commit_actors.lance` (the commit graph is split into `_graph_commits.lance` for the linkage and `_graph_commit_actors.lance` for the actor map).
- HTTP server uses the bearer-token actor automatically. The CLI resolves one actor chain everywhere: `--as` > legacy `cli.actor` in `omnigraph.yaml` > `operator.actor` in `~/.omnigraph/config.yaml` > none (RFC-007).
- Pre-v0.4.0 graphs also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0. The v2→v3 manifest migration sweeps any stale `__run__*` branches on first write-open (MR-770); the inert dataset bytes remain until a `delete_prefix` primitive lands.
--- a/docs/user/branches-commits.md
+++ b/docs/user/branches-commits.md
@ -1,63 +0,0 @@
-# Branches, Commits, Snapshots
-
-## L1 — Lance per-dataset branches
-
-Lance supports branching at the dataset level: a branch is a named lineage of versions, and `fork_branch_from_state(source_branch, target_branch, source_version)` creates a copy-on-write fork.
-
-## L2 — Graph-level branches
-
-OmniGraph builds *graph branches* on top by branching every sub-table coherently:
-
- `branch_create(name)` / `branch_create_from(target, name)` — disallowed name `main`; fails if branch exists; ensures the schema-apply lock is idle. Atomic and authority-first like `branch_delete`: it flips the `__manifest` branch (authority), then creates the derived commit-graph branch, force-dropping any orphaned commit-graph ref left by an incomplete prior delete (the manifest branch is fresh, so a same-named commit-graph branch is provably a zombie). If commit-graph creation fails, the manifest branch is rolled back so the name never half-exists.
- `branch_list()` — returns public branches, **filters the internal** `__schema_apply_lock__` branch.
- `branch_delete(name)` — refuses if there are descendants on the branch, or if it is the current branch. The manifest is the single authority for branch existence: deletion flips the `__manifest` branch ref first (one atomic op), after which the branch is gone from every snapshot. The owned per-table forks and the commit-graph branch are derived state, reclaimed best-effort with `force_delete_branch` after the flip. A failure during that reclaim (transient object-store error) does not fail the call or block the authority flip; the leftover forks are unreachable orphans that the [`cleanup`](maintenance.md) reconciler converges. One consequence: if a delete's best-effort reclaim fails, reusing that branch name before the next `cleanup` surfaces a clear error pointing at `cleanup` (the stale fork would otherwise collide on first write).
- **Lazy forking**: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source. A fork collision is classified by the manifest authority, not by Lance branch versions: if the live manifest already records the fork on the active branch, a concurrent first-write won and the caller gets a retryable "refresh and retry"; if the manifest does not, a physical branch there is an orphan and the caller is pointed at `cleanup`.
- `sync_branch(branch)` — re-binds the in-memory handle to the latest head of the branch.
-
-## L2 — Commit graph (`db/commit_graph.rs`)
-
-In-memory shape of a graph commit:
-
-```
-GraphCommit {
-  graph_commit_id: ULID,
-  manifest_branch: Option<String>,
-  manifest_version: u64,
-  parent_commit_id: Option<String>,
-  merged_parent_commit_id: Option<String>,   // populated for merge commits
-  actor_id: Option<String>,                  // joined in-memory from _graph_commit_actors.lance, NOT a column on _graph_commits.lance
-  created_at: i64 (microseconds since epoch),
-}
-```
-
-Storage is split across two Lance datasets (both with stable row IDs):
-
- `_graph_commits.lance` — every column above *except* `actor_id`.
- `_graph_commit_actors.lance` — optional separate `(graph_commit_id, actor_id)` map, created on demand. The `actor_id` field above is populated by joining this dataset in-memory at load time.
-
-Notes:
-
- Every successful publish (load / change / merge / schema_apply) appends one commit.
- Merge commits have two parents; linear commits have one.
- API: `list_commits(branch)`, `get_commit(id)`, `head_commit_id_for_branch(branch)`.
-
-## L2 — Snapshots & time travel
-
- `snapshot()` — current snapshot for the bound branch; cached.
- `snapshot_of(target)` — snapshot at a `ReadTarget` (branch | snapshot id).
- `snapshot_at_version(v: u64)` — historical snapshot from any manifest version.
- `entity_at(table_key, id, version)` — single-entity time travel without building a full snapshot.
- A `Snapshot` is a `(version, HashMap<table_key, SubTableEntry>)` — cheap to build, snapshot-isolated cross-table reads.
-
-## L2 — Internal system branches
-
-Internal or legacy branch refs:
-
- `__schema_apply_lock__` — serializes schema migrations; filtered from `branch_list()` but visible to internals.
- `__run__<run-id>` — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). These are swept off `__manifest` on the first read-write open by the v2→v3 internal-schema migration (MR-770), and `__run__*` is no longer a reserved name. Known limitation: a pre-v0.4.0 graph opened **read-only** still surfaces any stale `__run__*` branch in `branch_list()` until its first read-write open (the migration is write-path-only, like all manifest migrations).
-
-## L2 — Recovery audit trail
-
-The five migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`, `optimize_all_tables`) protect their multi-table commits with a sidecar at `__recovery/{ulid}.json` written before Phase B and deleted after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`: classify per-table state, decide all-or-nothing per sidecar, roll forward / back, record an audit row.
-
-Audit rows live in `_graph_commit_recoveries.lance` (sibling to `_graph_commits.lance`) and reference the commit graph by `graph_commit_id`. The linked recovery commit is identified by that same `graph_commit_id`, and `actor_id="omnigraph:recovery"` is stored in `_graph_commit_actors.lance` (joined by `graph_commit_id`) — `_graph_commits.lance` itself does not carry the `actor_id` column. To find recoveries for a specific original actor: `omnigraph commit list --filter actor=omnigraph:recovery`, then join to `_graph_commit_recoveries.lance` by `graph_commit_id` to read `recovery_for_actor`. Schema: see `crates/omnigraph/src/db/recovery_audit.rs`.
--- a/docs/user/branching/changes.md
+++ b/docs/user/branching/changes.md
@ -1,6 +1,6 @@
 # Change Detection / Diff

-`changes/mod.rs`. Three-level algorithm:
+Diffing two read targets uses a three-level algorithm:

 1. **Manifest diff**: skip sub-tables whose `(table_version, table_branch)` is unchanged.
 2. **Lineage check**:
--- a/docs/user/branching/index.md
+++ b/docs/user/branching/index.md
@ -0,0 +1,40 @@
+# Branches, Commits, Snapshots
+
+## L1 — Lance per-dataset branches
+
+Lance supports branching at the dataset level: a branch is a named lineage of versions, and a copy-on-write fork creates a new branch from a source branch at a given version.
+
+## L2 — Graph-level branches
+
+OmniGraph builds *graph branches* on top by branching every sub-table coherently:
+
+- **Create** (`branch create` / `branch create --from <target>`) — the name `main` is disallowed; fails if the branch exists. Atomic: the new branch becomes visible all-or-nothing, so a name never half-exists.
+- **List** (`branch list`) — returns public branches, **filtering the internal** `__schema_apply_lock__` branch.
+- **Delete** (`branch delete`) — refuses if there are descendants on the branch, or if it is the current branch. Once deleted, the branch is gone from every snapshot. The owned per-table forks are reclaimed best-effort; if that reclaim hits a transient object-store error, the leftover storage is reclaimed later by the [`cleanup`](../operations/maintenance.md) command. One consequence: if a delete's reclaim fails, reusing that branch name before the next `cleanup` surfaces a clear error pointing at `cleanup`.
+- **Lazy forking**: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share storage with their source. If two writers race to first-write the same branch, the loser gets a retryable "refresh and retry".
+
+## L2 — Commit graph
+
+Each graph commit carries a ULID, the manifest branch and version it published, its parent commit or commits, the actor who made it, and a creation timestamp.
+
+- Every successful publish (load / mutate / merge / schema apply) appends one commit.
+- Merge commits have two parents; linear commits have one.
+- Inspect history with `commit list` and `commit show`.
+
+## L2 — Snapshots & time travel
+
+Reading a branch at a past version, or a single entity at a past version, is
+covered on the [time travel](time-travel.md) page. Merging branches and the
+conflict kinds are on the [merge](merge.md) page.
+
+## L2 — Internal system branches
+
+- `__schema_apply_lock__` — serializes schema migrations; filtered from `branch list` but used internally.
+
+## L2 — Recovery audit trail
+
+Interrupted multi-table writes are recovered automatically, either by the next write or the next time the graph is opened read-write. Recovery commits are recorded in the audit trail under the actor `omnigraph:recovery`, so you can find them with:
+
+```bash
+omnigraph commit list --filter actor=omnigraph:recovery
+```
--- a/docs/user/branching/merge.md
+++ b/docs/user/branching/merge.md
@ -0,0 +1,47 @@
+# Merging Branches
+
+Merging integrates the changes on one branch into another. OmniGraph merges are
+**three-way and row-level**: it compares both branches against their common
+ancestor and merges each node/edge table row by row, then publishes the result as
+**one atomic commit** across the whole graph.
+
+```bash
+omnigraph branch merge review/2026-04-25 --into main s3://bucket/graph.omni
+```
+
+`branch merge <source> [--into <target>]` merges `<source>` into `<target>`
+(default `main`).
+
+## Outcomes
+
+A merge resolves to one of three outcomes:
+
+- **Already up to date** — the target already contains every change on the source;
+  nothing to do.
+- **Fast-forward** — the target has no changes the source lacks, so the target
+  simply advances to the source.
+- **Merged** — both sides diverged; a new merge commit is created with two parents.
+
+## Conflicts
+
+When both branches changed the same data incompatibly, the merge fails with a
+structured list of conflicts (the HTTP server returns `409` with a
+`merge_conflicts[]` array). No partial result is published — the merge is
+all-or-nothing. The conflict kinds are:
+
+| Kind | Meaning |
+|---|---|
+| `DivergentInsert` | The same id was inserted on both branches. |
+| `DivergentUpdate` | The same row was updated differently on both branches. |
+| `DeleteVsUpdate` | One side deleted a row the other side updated. |
+| `OrphanEdge` | An edge references a node the other side deleted. |
+| `UniqueViolation` | The merged result would violate a unique constraint. |
+| `CardinalityViolation` | The merged result would violate an edge cardinality constraint. |
+| `ValueConstraintViolation` | The merged result would violate a value constraint (enum/range). |
+
+Each conflict carries the table, the row id (when applicable), the kind, and a
+message. Resolve conflicts by reconciling the two branches — typically by making
+the conflicting change on one side and re-merging.
+
+See [branches & commits](index.md) for the branch and commit-DAG model, and
+[changes](changes.md) for diffing two branches before you merge.
--- a/docs/user/branching/time-travel.md
+++ b/docs/user/branching/time-travel.md
@ -0,0 +1,31 @@
+# Snapshots & Time Travel
+
+Every read in OmniGraph happens against a **snapshot** — a consistent, cross-table
+view of the graph at one manifest version. A query holds one snapshot for its whole
+lifetime, so it never sees a partial write from a concurrent commit (see
+[transactions](transactions.md)).
+
+## Reading the past
+
+- **Current head** — by default a read targets the current head of the bound branch.
+- **By snapshot id** — read a branch or a specific snapshot id (`--snapshot` on
+  `omnigraph query`).
+- **By version** — reconstruct a historical snapshot from any past manifest version.
+- **Single entity** — look up one entity at a past version without building a full
+  snapshot (cheaper when you only need one node or edge).
+
+Snapshots are cheap to build: a snapshot is just the set of visible sub-table
+versions at a manifest version, so cross-table reads stay snapshot-isolated.
+
+## CLI
+
+```bash
+# Run the query named `find` from an ad-hoc source against a past snapshot
+omnigraph query --store s3://bucket/graph.omni --query ./q.gq --snapshot <snapshot-id> find
+```
+
+Time travel composes with branches: every branch has its own version history, and
+you can read any branch at any of its past versions. The commits and commit DAG
+behind these versions are described in
+[branches & commits](index.md); diffing two versions is on the
+[changes](changes.md) page.
--- a/docs/user/branching/transactions.md
+++ b/docs/user/branching/transactions.md
@ -2,7 +2,7 @@

 OmniGraph does not have `BEGIN` / `COMMIT` / `ROLLBACK`. Branches do that job. This page explains the model, when to use which primitive, and shows worked examples for the patterns that come up most.

-The architectural rule lives in [`docs/dev/invariants.md`](../dev/invariants.md):
+The architectural rule lives in [`docs/dev/invariants.md`](../../dev/invariants.md):

 > **Mutations publish at one boundary.** A `mutate_as` or `load` operation
 > accumulates constructive writes, commits each touched table at the end, then
@ -107,7 +107,7 @@ Properties:
 - Each query on the branch is its own publisher commit — so they're individually atomic. Per-query CAS works on branches just like on main.
 - The branch lives on disk. Process crash mid-workflow? Re-open and resume.
 - Multiple agents can work on different branches in parallel without blocking each other.
- The merge is a three-way merge at the row level. Conflicts surface as `OmniError::MergeConflicts(Vec<MergeConflict>)`, with structured kinds (`DivergentInsert`, `DivergentUpdate`, `DeleteVsUpdate`, …) so callers can handle them programmatically.
+- The merge is a three-way merge at the row level. Conflicts surface as structured merge-conflict kinds (`DivergentInsert`, `DivergentUpdate`, `DeleteVsUpdate`, …) so callers can handle them programmatically.

 ### 4. Coordinating multiple agents

@ -129,14 +129,14 @@ omnigraph branch merge agent-b/work --into main graph.omni

 Each agent sees a consistent snapshot of `main` at the time it forked. The first merge to `main` lands as a fast-forward (or a no-op if no concurrent change). The second merge runs three-way: rows touched by both branches surface as `MergeConflict`s for the caller to resolve.

-This is the workflow MR-797 / agentic loops are designed around: **branches are the unit of "an agent's working set."**
+This is the workflow agentic loops are designed around: **branches are the unit of "an agent's working set."**

 ## Failure modes

 | Scenario | What happens | Caller action |
 |---|---|---|
 | Single query fails mid-flight | Publisher never publishes; target unchanged | Read the error, decide whether to retry |
-| Concurrent writers race the same `(table, branch)` | Publisher CAS rejects the loser with `ManifestConflictDetails::ExpectedVersionMismatch` | Refresh handle, retry the query |
+| Concurrent writers race the same `(table, branch)` | Publisher CAS rejects the loser with a version-mismatch conflict | Refresh handle, retry the query |
 | Branch with N successful mutations, then merge fails (three-way conflict) | Each individual mutation already committed on the branch; merge surfaces `MergeConflicts` | Inspect, decide whether to keep working on the branch, abandon it (`branch_delete`), or resolve and re-merge |
 | Process crashes mid-branch-workflow | Each completed mutation on the branch is durable | Re-open the graph, continue where you left off |

@ -161,8 +161,8 @@ This is the workflow MR-797 / agentic loops are designed around: **branches are

 ## See also

- [`docs/user/branches-commits.md`](branches-commits.md) — branch and commit-graph mechanics.
- [`docs/dev/merge.md`](../dev/merge.md) — three-way merge details and conflict kinds.
- [`docs/user/query-language.md`](query-language.md) — `.gq` syntax for the multi-statement queries used above.
- [`docs/dev/writes.md`](../dev/writes.md) — the per-query commit pipeline that gives single-query atomicity.
- [`docs/dev/invariants.md`](../dev/invariants.md) — the architectural rule.
+- [`docs/user/branching/index.md`](index.md) — branch and commit-graph mechanics.
+- [`docs/dev/merge.md`](../../dev/merge.md) — three-way merge details and conflict kinds.
+- [`docs/user/queries/index.md`](../queries/index.md) — `.gq` syntax for the multi-statement queries used above.
+- [`docs/dev/writes.md`](../../dev/writes.md) — the per-query commit pipeline that gives single-query atomicity.
+- [`docs/dev/invariants.md`](../../dev/invariants.md) — the architectural rule.
--- a/docs/user/cli-reference.md
+++ b/docs/user/cli-reference.md
@ -1,211 +0,0 @@
-# CLI Reference (`omnigraph`)
-
-A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](cli.md).
-
-Top-level command families and subcommands. Graph-targeting commands accept a positional `URI`, `--uri`, a `--target <name>` resolved against `omnigraph.yaml`, or `--server <name>` (an operator-defined server from `~/.omnigraph/config.yaml`, optionally with `--graph <id>` for multi-graph servers; exclusive with the other forms); `cluster` commands use `--config <dir>`.
-
-## Top-level commands
-
-| Command | Purpose |
-|---|---|
-| `init` | `--schema <pg>` → initialize a graph (no longer scaffolds `omnigraph.yaml` — RFC-008; start cluster configs from the [cluster.md](cluster.md) quick-start or `config migrate`) |
-| `load` | bulk load a branch, local or remote (`--mode overwrite\|append\|merge` is **required** — overwrite is destructive, so there is no default). Without `--from` the target branch must exist; `--from <base>` forks a missing `--branch` from `<base>` first |
-| `ingest` | deprecated alias of `load --from <base>` (defaults: `--from main --mode merge`); prints a one-line warning to stderr |
-| `query` (alias: `read`) | run named read query; source via `--query <path>`, `-e`/`--query-string <GQ>`, or `--alias <name>` (exactly one). `read` is the deprecated previous name and prints a one-line warning to stderr |
-| `mutate` (alias: `change`) | run mutation query; same `--query` / `-e` / `--alias` mutual-exclusion as `query`. `change` is the deprecated previous name and prints a one-line warning to stderr |
-| `snapshot` | print current snapshot (per-table version + row count) |
-| `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) |
-| `branch create \| list \| delete \| merge` | branching ops |
-| `commit list \| show` | inspect commit graph |
-| `schema plan \| apply \| show (alias: get)` | migrations |
-| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
-| `config migrate` | propose (or `--write`: apply) the RFC-008 split of a legacy `omnigraph.yaml` — team half → ready-to-review `cluster.yaml`, personal half → `~/.omnigraph/config.yaml` (key-level merge, existing entries win), plus dropped-key reasons and manual steps |
-| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve <resource> --as <actor>` (`apply`/`approve` default the actor from the per-operator `omnigraph.yaml`'s `cli.actor` when `--as` is omitted; nothing else in that file affects cluster commands); what apply converges is what an `omnigraph-server --cluster <dir>` deployment serves on its next restart (omnigraph.yaml deployments are unaffected); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock <LOCK_ID>` manually removes a held local state lock by exact id |
-| `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) |
-| `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review |
-| `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
-| `embed` | offline JSONL embedding pipeline |
-| `policy validate \| test \| explain` | Cedar tooling. Selects `cli.graph`, else `server.graph`, else top-level `policy.file` |
-| `version` / `-v` | print `omnigraph 0.3.x` |
-
-## Config surfaces
-
-Two config surfaces with single owners (RFC-007/RFC-008), plus a zero-config
-tier:
-
-| Surface | Owner | Location | Declares |
-|---|---|---|---|
-| Cluster config | the team, in a repo | `cluster.yaml` + checkout ([cluster-config.md](cluster-config.md)) | what the system **is**: graphs, schemas, queries, policies, storage |
-| Operator config | one person | `~/.omnigraph/config.yaml` (override dir with `$OMNIGRAPH_HOME`) | who **I** am: identity, ergonomics |
-| Flags / env | per invocation | — | everything, explicitly |
-
-`omnigraph.yaml` (below) is the legacy combined file — fully supported
-today, slated for staged deprecation (RFC-008); its keys' future homes are
-listed there.
-
-### `~/.omnigraph/config.yaml` (operator)
-
-```yaml
-operator:
-  actor: act-andrew     # default identity for every --as cascade:
-                        #   --as > legacy cli.actor > operator.actor > none
-servers:                # operator-owned endpoints; names key the credentials
-  prod:
-    url: https://graph.example.com     # no tokens in this file, ever
-defaults:
-  output: table         # read format default, below --json/--format/alias/legacy
-```
-
-Absent file = empty layer. Unknown keys warn and load (a file written for a
-newer CLI works on an older one). `$OMNIGRAPH_CONFIG=<path>` stands in for
-`--config` (the flag wins) in both the CLI and the server.
-
-#### Credentials keyed by server name
-
-`omnigraph login <name>` stores a bearer token in
-`~/.omnigraph/credentials` (created `0600`; group/world-readable files are
-refused). Token from `--token`, or — preferred, keeps it out of shell
-history — one line on stdin: `echo $TOKEN | omnigraph login prod`.
-`omnigraph logout <name>` removes it (idempotent).
-
-#### Operator aliases — bindings, not content
-
-An operator alias is a personal name for *invoking a stored query on a
-named server* — it carries no query content (the stored query in the
-catalog is the team's contract; the alias, its defaults, and its name are
-yours):
-
-```yaml
-aliases:
-  triage:
-    server: intel-dev        # names an entry under servers:
-    graph: spike             # optional (multi-graph servers)
-    query: weekly_triage     # the STORED query's name — never a file
-    args: [since]            # positional args -> params, in order
-    params: { limit: 20 }    # fixed defaults; positionals/--params win
-    format: table
-```
-
-`omnigraph query --alias triage 2026-06-01` invokes
-`POST <server>/graphs/spike/queries/weekly_triage` with the keyed
-credential. A legacy `omnigraph.yaml` alias with the same name wins during
-the deprecation window (with a warning).
-
-A remote command whose URL prefix-matches an operator server's `url` (the
-`gh` host model — no flags needed) resolves its token through:
-
-| Order | Source |
-|---|---|
-| 1 | `OMNIGRAPH_TOKEN_<NAME>` env (`prod` → `OMNIGRAPH_TOKEN_PROD`) |
-| 2 | `[<name>]` section in `~/.omnigraph/credentials` |
-| 3 | the legacy chain unchanged (`bearer_token_env` → `OMNIGRAPH_BEARER_TOKEN` → `auth.env_file`) |
-
-A token is only ever sent to the server it is keyed to: URLs matching no
-operator server use the legacy chain alone.
-
-## `omnigraph.yaml` schema (legacy combined file)
-
-> **Deprecated (RFC-008).** Loading this file prints a per-key notice
-> naming each present key's new home (suppress in CI with
-> `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION=1`); `omnigraph config migrate`
-> produces the split. The file keeps working through the deprecation
-> window. Migrated teams can set `OMNIGRAPH_NO_LEGACY_CONFIG=1` to turn
-> any legacy-file load into a hard error (regression guard; the file's
-> absence is always fine).
-
-```yaml
-project: { name }
-graphs:
-  <name>:
-    uri: <local|s3://|http(s)://>
-    bearer_token_env: <ENV_NAME>
-    queries:                      # per-graph stored-query registry (server-role; multi-graph mode)
-      <query-name>:               # key MUST equal the `query <name>` symbol inside the .gq
-        file: <path-to-.gq>       # relative to this config's directory
-        mcp:
-          expose: true            # default true: listed in the MCP catalog (GET /queries); set false to hide (still HTTP-callable)
-          tool_name: <name>       # optional MCP tool-name override (defaults to <query-name>;
-                                  #   must be unique across exposed queries)
-server:
-  graph: <name>
-  bind: <ip:port>
-cli:
-  graph: <name>
-  branch: <name>
-  output_format: json|jsonl|csv|kv|table
-  table_max_column_width: 80
-  table_cell_layout: truncate|wrap
-query:
-  roots: [<dir>, …]   # search path for .gq files
-auth:
-  env_file: .env.omni
-aliases:
-  <alias>:
-    # accepted values: `read` / `query` (read alias), `change` / `mutate`
-    # (write alias). `query` and `mutate` are recommended; `read` and
-    # `change` remain accepted forever for back-compat.
-    command: read|change|query|mutate
-    query: <path-to-.gq>
-    name: <query-name>
-    args: [<positional-name>, …]
-    graph: <name>
-    branch: <name>
-    format: <output-format>
-queries:                          # top-level registry — applies only to a bare-URI (anonymous) graph; a graph served by name uses its `graphs.<id>.queries`. Mirrors top-level `policy`.
-  <query-name>: { file: <path-to-.gq> }   # mcp.expose defaults to true
-policy:
-  file: policy.yaml
-```
-
-## Cluster config preview
-
-```bash
-omnigraph cluster validate --config company-brain
-omnigraph cluster plan     --config company-brain --json
-omnigraph cluster apply    --config company-brain --json
-omnigraph cluster approve  graph.<id> --config company-brain --as <actor>
-omnigraph cluster status   --config company-brain --json
-omnigraph cluster refresh  --config company-brain --json
-omnigraph cluster import   --config company-brain --json
-omnigraph cluster force-unlock <LOCK_ID> --config company-brain --json
-```
-
-`--config` is a directory containing `cluster.yaml`; it defaults to `.`.
-Stage 3A accepts graphs, schemas, stored queries, and policy bundle file
-references. `cluster plan` reads local JSON state from
-`<config-dir>/__cluster/state.json`; a missing file means empty state. Plan,
-apply, refresh, and import acquire `__cluster/lock.json` by default and release
-it before returning. `cluster apply` executes only stored-query/policy catalog
-writes (content-addressed under `__cluster/resources/`) and requires an
-existing `state.json`; graph/schema changes are deferred with warnings, and
-applied resources do not serve traffic — the server still boots from
-`omnigraph.yaml`. `cluster status` reads state only and reports any existing
-lock metadata. `force-unlock` removes a lock only when the supplied id exactly
-matches the lock file. `refresh` requires an existing `state.json`; `import`
-creates one only when it is missing. Both observe declared graphs read-only at
-`<config-dir>/graphs/<graph-id>.omni`. External state backends, graph/schema
-apply, automatic stale-lock breaking, `plan --refresh`, pipelines, UI specs,
-embeddings, aliases, and bindings are reserved for later stages. See
-[cluster-config.md](cluster-config.md).
-
-## Output formats (`query` command, alias: `read`)
-
- `json` — pretty-printed object with metadata + rows
- `jsonl` — one metadata line then one JSON object per row
- `csv` — RFC 4180-ish quoting
- `table` — fitted text table, honors `table_max_column_width` + `table_cell_layout`
- `kv` — grouped per-row key/value blocks
-
-## Param resolution
-
-Precedence (high to low): explicit `--params` / `--params-file`, alias positional args, `omnigraph.yaml` defaults. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients.
-
-## Bearer token resolution (CLI)
-
-1. `graphs.<name>.bearer_token_env`
-2. `OMNIGRAPH_BEARER_TOKEN` global env
-3. `auth.env_file` referenced `.env`
-
-## Duration parsing (cleanup)
-
-`s | m | h | d | w` units, e.g. `--older-than 7d`.
--- a/docs/user/cli.md
+++ b/docs/user/cli.md
@ -1,164 +0,0 @@
-# CLI Guide
-
-## Core Graph Flow
-
-```bash
-omnigraph init --schema schema.pg graph.omni
-omnigraph load --data data.jsonl --mode overwrite graph.omni
-omnigraph snapshot graph.omni --branch main --json
-omnigraph query  --uri graph.omni --query queries.gq --name get_person --params '{"name":"Alice"}'
-omnigraph mutate --uri graph.omni --query queries.gq --name insert_person --params '{"name":"Mina","age":28}'
-```
-
-`omnigraph query` is the canonical read command (pairs with `POST /query`);
-`omnigraph mutate` is the canonical write command (pairs with `POST /mutate`).
-The previous names `omnigraph read` and `omnigraph change` keep working as
-visible aliases — invocations emit a one-line deprecation warning to stderr
-and otherwise behave identically. See [Deprecated names](#deprecated-names)
-for the migration table.
-
-For ad-hoc reads and mutations (REPLs, AI agents, one-off scripts), pass the
-GQ source inline with `-e` / `--query-string` instead of a file path:
-
-```bash
-omnigraph query --uri graph.omni \
-  -e 'query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }' \
-  --params '{"name":"Alice"}'
-
-omnigraph mutate --uri graph.omni \
-  -e 'query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }' \
-  --params '{"name":"Inline","age":42}'
-```
-
-`-e` is mutually exclusive with `--query <path>` and `--alias <name>`; exactly
-one of the three must be provided. The inline source travels through the same
-parser, lint, params binding, and commit machinery as a file-based query —
-only the source loader changes.
-
-## Branching And Reviewable Data Flows
-
-```bash
-omnigraph branch create --uri graph.omni --from main feature-x
-omnigraph branch list --uri graph.omni
-omnigraph branch merge --uri graph.omni feature-x --into main
-
-omnigraph load --data batch.jsonl --branch review/import-2026-04-09 --from main --mode merge graph.omni
-omnigraph export graph.omni --branch main --type Person > people.jsonl
-omnigraph commit list graph.omni --branch main --json
-omnigraph commit show --uri graph.omni <commit-id> --json
-```
-
-## Remote Server Mode
-
-Serve a graph:
-
-```bash
-omnigraph-server graph.omni --bind 127.0.0.1:8080
-```
-
-Read through the HTTP API:
-
-```bash
-omnigraph query \
-  --target http://127.0.0.1:8080 \
-  --query queries.gq \
-  --name get_person \
-  --params '{"name":"Alice"}'
-```
-
-If the server requires auth, set `OMNIGRAPH_SERVER_BEARER_TOKEN` on the server
-and configure the matching `bearer_token_env` in `omnigraph.yaml`.
-
-## Multi-graph servers (v0.6.0+)
-
-Against a multi-graph server (started with `--config omnigraph.yaml` referencing a non-empty `graphs:` map), use `omnigraph graphs list` to enumerate the registered graphs. The server must configure bearer tokens and `server.policy.file` with a rule that allows `graph_list`; `/graphs` is closed by default even when the server runs with `--unauthenticated`.
-
-```bash
-OMNIGRAPH_BEARER_TOKEN=admin-token \
-  omnigraph graphs list --uri http://server.example.com --json
-```
-
-For config-driven clients, set the remote graph's `bearer_token_env` to an environment variable containing a token whose actor is authorized by `server.policy.file`.
-
-`list` rejects local URI targets — it's for remote multi-graph servers only.
-
-Runtime add/remove is **not** in v0.6.0. To add a graph, stop the server, add a `graphs.<id>` entry to `omnigraph.yaml`, then restart. To remove, stop the server, delete the entry, restart.
-
-Per-graph URLs: hit a graph's cluster route from any subcommand by pointing `--uri` at it:
-
-```bash
-omnigraph read --uri http://server.example.com/graphs/beta --query q.gq ...
-```
-
-## Runs, Policy, And Diagnostics
-
-```bash
-omnigraph lint  --query queries.gq --schema schema.pg --json
-omnigraph check --query queries.gq graph.omni --json
-
-omnigraph schema plan --schema next.pg graph.omni --json
-omnigraph schema apply --schema next.pg graph.omni --json
-omnigraph policy validate --config omnigraph.yaml
-omnigraph policy test --config omnigraph.yaml
-omnigraph policy explain --config omnigraph.yaml --actor act-alice --action read --branch main
-
-omnigraph commit list graph.omni --json
-omnigraph commit show --uri graph.omni <commit-id> --json
-```
-
-(The legacy `omnigraph run list/show/publish/abort` subcommands were removed in MR-771; mutations and loads publish atomically and the commit graph (`omnigraph commit list`) is the audit surface.)
-
-`query lint` and `query check` are the same command surface. In v1, graph-backed
-lint uses local or `s3://` graph URIs; HTTP targets are only supported when you
-also pass `--schema`.
-
-## Config
-
-`omnigraph.yaml` lets the CLI and server share named graphs, defaults, and
-query roots:
-
-```yaml
-graphs:
-  local:
-    uri: demo.omni
-  dev:
-    uri: http://127.0.0.1:8080
-    bearer_token_env: OMNIGRAPH_BEARER_TOKEN
-
-cli:
-  graph: local
-  branch: main
-
-query:
-  roots:
-    - queries
-    - .
-```
-
-The config file can also define:
-
- server bind defaults
- auth env files
- query aliases for common read and change commands
- `policy.file` for Cedar authorization rules
-
-When policy is enabled, `schema apply` is authorized through the
-`schema_apply` action and is typically limited to admins on protected `main`.
-
-## Deprecated names
-
-The CLI was renamed to align with the HTTP server's canonical endpoint
-names (`POST /query`, `POST /mutate`) and the `query` keyword in the GQ
-language. The previous spellings keep working forever; invocations emit a
-one-line warning to stderr and otherwise behave identically.
-
-| Old (deprecated)         | New (canonical)     | Migration                                                |
-|--------------------------|---------------------|----------------------------------------------------------|
-| `omnigraph read`         | `omnigraph query`   | Same flags and behavior. `read` is a visible clap alias. |
-| `omnigraph change`       | `omnigraph mutate`  | Same flags and behavior. `change` is a visible clap alias. |
-| `omnigraph query lint`   | `omnigraph lint`    | Same flags. The argv-level shim rewrites `query lint` to `lint`. |
-| `omnigraph query check`  | `omnigraph check`   | `check` is a visible alias of `omnigraph lint`. |
-
-The `command:` field in `aliases.<name>` in `omnigraph.yaml` accepts both
-`read` / `change` (legacy) and `query` / `mutate` (canonical); the two
-spellings are interchangeable on the wire via serde aliases.
--- a/docs/user/cli/index.md
+++ b/docs/user/cli/index.md
@ -0,0 +1,175 @@
+# CLI Guide
+
+## Core Graph Flow
+
+```bash
+omnigraph init --schema schema.pg graph.omni
+omnigraph load --data data.jsonl --mode overwrite graph.omni
+omnigraph snapshot graph.omni --branch main --json
+# Invoke a stored query BY NAME from the catalog (served — addressed by scope):
+omnigraph query  get_person    --params '{"name":"Alice"}'
+omnigraph mutate insert_person --params '{"name":"Mina","age":28}'
+```
+
+`omnigraph query` is the canonical read command (pairs with `POST /query`);
+`omnigraph mutate` is the canonical write command (pairs with `POST /mutate`).
+The positional argument is the **stored-query name**, invoked from the served
+catalog (RFC-011 D3) — the graph is addressed by scope (`--server` / `--profile`
+/ defaults), and the verb asserts the query's kind (`query` rejects a stored
+mutation, and vice-versa). The previous names `omnigraph read` and
+`omnigraph change` keep working as visible aliases — invocations emit a one-line
+deprecation warning to stderr. See [Deprecated names](#deprecated-names).
+
+For **ad-hoc** reads and mutations (REPLs, AI agents, one-off scripts, local dev),
+pass the GQ source with `-e` / `--query-string` (inline) or `--query <path>` (a
+file), and address a graph's storage directly with `--store`. By-name catalog
+invocation is served-only — a bare `--store` has no catalog, so it's the ad-hoc
+lane:
+
+```bash
+omnigraph query --store graph.omni \
+  -e 'query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }' \
+  --params '{"name":"Alice"}'
+
+omnigraph mutate --store graph.omni \
+  -e 'query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }' \
+  --params '{"name":"Inline","age":42}'
+
+# A multi-query file: the positional selects which query to run.
+omnigraph query --store graph.omni --query queries.gq get_person --params '{"name":"Alice"}'
+```
+
+`-e` is mutually exclusive with `--query <path>`. With either, the positional
+name (optional) selects which query in the source to run. The inline source
+travels through the same parser, lint, params binding, and commit machinery as a
+file-based query — only the source loader changes.
+
+## Branching And Reviewable Data Flows
+
+```bash
+omnigraph branch create --uri graph.omni --from main feature-x
+omnigraph branch list --uri graph.omni
+omnigraph branch merge --uri graph.omni feature-x --into main
+
+omnigraph load --data batch.jsonl --branch review/import-2026-04-09 --from main --mode merge graph.omni
+omnigraph export graph.omni --branch main --type Person > people.jsonl
+omnigraph commit list graph.omni --branch main --json
+omnigraph commit show --uri graph.omni <commit-id> --json
+```
+
+## Remote Server Mode
+
+Serve a cluster-applied graph:
+
+```bash
+omnigraph cluster apply --config ./company-brain
+omnigraph-server --cluster ./company-brain --bind 127.0.0.1:8080
+```
+
+Read through the HTTP API — invoke a stored query by name from the catalog:
+
+```bash
+omnigraph query get_person \
+  --server http://127.0.0.1:8080 \
+  --params '{"name":"Alice"}'
+```
+
+A server is addressed with `--server` (a name from `~/.omnigraph/config.yaml` or a
+literal URL); a positional `http(s)://` URI is rejected. If the server requires
+auth, configure the server's bearer token, then store the matching token on the
+client with `omnigraph login <server>` (or set `OMNIGRAPH_BEARER_TOKEN`).
+
+## Multi-graph servers
+
+A server boots from a cluster directory (`omnigraph-server --cluster <dir>`) and
+serves every graph the cluster declares. Use `omnigraph graphs list` to enumerate
+them. The cluster's server-level policy must allow `graph_list`; `/graphs` is
+closed by default even when the server runs with `--unauthenticated`.
+
+```bash
+OMNIGRAPH_BEARER_TOKEN=admin-token \
+  omnigraph graphs list --server http://server.example.com --json
+```
+
+For an operator-defined server, store its token with `omnigraph login <name>` (or
+`OMNIGRAPH_TOKEN_<NAME>`); the actor must be authorized by the cluster's
+server-level policy.
+
+`list` rejects local (`--store`) targets — it's for remote multi-graph servers only.
+
+Runtime add/remove via API is not exposed. To add or remove a graph, edit the
+cluster's `cluster.yaml`, run `omnigraph cluster apply`, then restart the server.
+
+Per-graph addressing: select a graph on a multi-graph server with `--graph`:
+
+```bash
+omnigraph query get_person --server http://server.example.com --graph beta --params '{"name":"Ada"}'
+```
+
+## Runs, Policy, And Diagnostics
+
+```bash
+omnigraph lint  --query queries.gq --schema schema.pg --json
+omnigraph check --query queries.gq graph.omni --json
+
+omnigraph schema plan --schema next.pg graph.omni --json
+omnigraph schema apply --schema next.pg graph.omni --json
+omnigraph policy validate --cluster ./company-brain --graph knowledge
+omnigraph policy test    --cluster ./company-brain --graph knowledge --tests policy.tests.yaml
+omnigraph policy explain --cluster ./company-brain --graph knowledge --actor act-alice --action read --branch main
+
+omnigraph commit list graph.omni --json
+omnigraph commit show --uri graph.omni <commit-id> --json
+```
+
+(Mutations and loads publish atomically; the commit graph (`omnigraph commit list`) is the audit surface.)
+
+`lint` and `check` are the same command surface (`check` is a visible alias of
+`lint`). In v1, graph-backed lint uses local or `s3://` graph URIs; HTTP targets
+are only supported when you also pass `--schema`.
+
+## Config
+
+Configuration has two surfaces with single owners (see the
+[CLI reference](reference.md#config-surfaces) for the full schema):
+
+- **`~/.omnigraph/config.yaml`** — your personal operator config: default actor
+  (`--as`), named servers + credentials, clusters, profiles, aliases, and
+  default scope (`defaults.server` / `defaults.store` / `default_graph`). It
+  decides *who you are* and *what you address by default*.
+- **`cluster.yaml`** (a team-owned cluster directory) — declares *what the system
+  is*: graphs, schemas, stored queries, policies, and storage. A server boots
+  from it (`--cluster <dir>`); see the [cluster guide](../clusters/index.md).
+
+```yaml
+# ~/.omnigraph/config.yaml
+operator:
+  actor: act-andrew
+servers:
+  dev:
+    url: http://127.0.0.1:8080
+defaults:
+  server: dev
+  default_graph: knowledge
+```
+
+When policy is enabled, `schema apply` is authorized through the
+`schema_apply` action and is typically limited to admins on protected `main`.
+
+## Deprecated names
+
+The CLI was renamed to align with the HTTP server's canonical endpoint
+names (`POST /query`, `POST /mutate`) and the `query` keyword in the GQ
+language. The previous spellings keep working forever; invocations emit a
+one-line warning to stderr and otherwise behave identically.
+
+| Old (deprecated)         | New (canonical)     | Migration                                                |
+|--------------------------|---------------------|----------------------------------------------------------|
+| `omnigraph read`         | `omnigraph query`   | Same flags and behavior. `read` is a visible clap alias. |
+| `omnigraph change`       | `omnigraph mutate`  | Same flags and behavior. `change` is a visible clap alias. |
+| `omnigraph query lint`   | `omnigraph lint`    | Same flags. The argv-level shim rewrites `query lint` to `lint`. |
+| `omnigraph query check`  | `omnigraph check`   | `check` is a visible alias of `omnigraph lint`. |
+
+The `command:` field in `aliases.<name>` in `~/.omnigraph/config.yaml` accepts
+both `read` / `change` (legacy) and `query` / `mutate` (canonical); the two
+spellings are interchangeable on the wire via serde aliases.
--- a/docs/user/cli/reference.md
+++ b/docs/user/cli/reference.md
@ -0,0 +1,228 @@
+# CLI Reference (`omnigraph`)
+
+A reference for the `omnigraph` binary's command surface and the per-operator `~/.omnigraph/config.yaml` schema. For a quick-start guide, see the [CLI guide](index.md).
+
+Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server <name|url>` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph <id>` for multi-graph servers; exclusive with a positional URI), `--store <uri>` (a single graph's storage directly), or `--profile <name>` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config <dir>`, while `policy` and `queries` read a cluster's applied state via `--cluster <dir|uri>`. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. **`query`/`mutate` are the exception**: their positional is a stored-query *name* (RFC-011 D3), not a graph URI, so they address the graph only via `--store`/`--server`/`--profile`/defaults.
+
+## Top-level commands
+
+| Command | Purpose |
+|---|---|
+| `init` | `--schema <pg>` → initialize a graph (start cluster configs from the [cluster guide](../clusters/index.md) quick-start) |
+| `load` | bulk load a branch, local or remote (`--mode overwrite\|append\|merge` is **required** — overwrite is destructive, so there is no default). Without `--from` the target branch must exist; `--from <base>` forks a missing `--branch` from `<base>` first |
+| `ingest` | deprecated alias of `load --from <base>` (defaults: `--from main --mode merge`); prints a one-line warning to stderr |
+| `query <name>` (alias: `read`) | run a read query. **Catalog lane** (default): `<name>` is a stored query invoked **by name** from the served catalog (served-only — address with `--server`/`--profile`; the verb asserts the query is a read). **Ad-hoc lane**: with `--query <path>` or `-e`/`--query-string <GQ>`, runs that source (the positional `<name>` then selects which query in it). No positional graph URI — address via `--store`/`--server`/`--profile`. `read` is the deprecated previous name (one-line stderr warning) |
+| `mutate <name>` (alias: `change`) | run a mutation query; same catalog (by-name, served-only, verb asserts mutation) / ad-hoc (`--query`/`-e`) lanes as `query`. `change` is the deprecated previous name (one-line stderr warning) |
+| `alias <name> [args]` | invoke an operator alias — a read-only personal binding (under `aliases:` in `~/.omnigraph/config.yaml`) to a stored query on a named server (RFC-011 D4; replaces the removed `--alias` flag; stored mutations are rejected before execution) |
+| `snapshot` | print current snapshot (per-table version + row count) |
+| `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) |
+| `branch create \| list \| delete \| merge` | branching ops |
+| `commit list \| show` | inspect commit graph |
+| `schema plan \| apply \| show (alias: get)` | migrations. `apply` refuses a cluster-managed graph (one whose storage is inside a cluster) and points at `cluster apply` — those graphs evolve through the cluster ledger, not a direct apply |
+| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
+| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve <resource> --as <actor>` (`apply`/`approve` default the actor from `~/.omnigraph/config.yaml`'s `operator.actor` when `--as` is omitted); what apply converges is what an `omnigraph-server --cluster <dir>` deployment serves on its next restart (`--cluster` is the server's only boot source — RFC-011 cluster-only); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock <LOCK_ID>` manually removes a held local state lock by exact id |
+| `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) |
+| `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review |
+| `cleanup --keep N --older-than 7d --confirm` | destructive version GC (`--confirm` to execute; also needs `--yes` against a non-local `s3://` target — see *Write diagnostics & destructive confirmation*) |
+| `embed` | offline JSONL embedding pipeline |
+| `policy validate \| test \| explain` | Cedar tooling against a cluster's applied policies (`--cluster <dir>`; `--graph <id>` picks a graph's bundle when several apply). `test` takes `--tests <file>`; `explain` takes `--actor`/`--action`/`--branch`/`--target-branch` |
+| `profile list \| show [<name>]` | read-only inspection of `~/.omnigraph/config.yaml` profiles. `list` shows each profile's binding (server/cluster/store) + default graph and marks the `$OMNIGRAPH_PROFILE`-active one; JSON keeps `binding` and adds `scope_kind`, `target`, `valid`, and `error`; `show` resolves one profile's scope (endpoint + default graph), defaulting to the active profile, else the flat operator defaults |
+| `version` / `-v` | print `omnigraph 0.3.x` |
+
+## Command capabilities
+
+Every command declares the **capability** it needs — what it requires to reach a graph — which determines the addressing flags that apply:
+
+- **`any`** — `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show`, `schema apply`. Run against a graph **served (via a server) or embedded (direct against a store)**: accept a positional `file://`/`s3://` URI, `--server <name|url>` (+ `--graph <id>` for multi-graph servers), `--store <uri>`, or `--profile <name>`. A remote server is addressed with `--server` — a positional `http(s)://` URI does **not** dispatch to one.
+- **`served`** — `graphs list`. Requires a server (accepts `--server` / `--profile`).
+- **`direct`** — `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `lint`. Need **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI`, but **not** `--server`, and a remote (`http(s)://`) URI is rejected. `optimize` / `repair` / `cleanup` additionally accept **`--cluster <dir|s3://…> --graph <id>`** (`--cluster` is a cluster directory or storage-root URI, named via `clusters:` in `~/.omnigraph/config.yaml` or a literal root), which resolves the graph's storage URI from the served cluster state (so you needn't know the `<storage>/graphs/<id>.omni` layout). `--graph` is the one graph selector across all scopes — on these three verbs it picks the cluster graph; on the other `direct` verbs it does not apply.
+- **`control`** — `cluster *` via `--config <dir>`; `policy *` and `queries *` via `--cluster <dir|uri>` or a cluster profile.
+- **`local`** — `alias`, `embed`, `login`, `logout`, `profile`, `version`. Address no explicit graph scope.
+
+These restrictions are enforced and reported, not silent:
+
+- A scope flag on a verb that can't consume it fails loudly rather than being silently dropped — `--server` outside a served scope, `--cluster` outside cluster-scoped verbs, or `--graph` where no multi-graph scope applies, e.g.: ``optimize is a direct (storage-native) command; --server addresses a served graph and does not apply. Pass a storage URI, or --cluster <dir> --graph <id>.``
+- A `direct` verb pointed at a remote URI fails loudly, e.g.: ``optimize is a direct (storage-native) command and needs direct storage access; the resolved target is a remote server (https://…). Pass the graph's file:// or s3:// URI.``
+- A data verb pointed at a positional `http(s)://` URI fails loudly: ``a remote graph must be addressed with --server <url> — a positional (or --uri) http(s):// URL no longer dispatches to a server.``
+- `init` into an **established cluster's** storage layout (`<root>/graphs/<id>.omni` where `<root>` holds `__cluster/state.json`) is refused — graphs in a cluster are created by `cluster apply` (which records ledger / recovery / approvals), not `init`.
+
+To maintain a server-backed graph, run the `direct` verbs from a host with storage access against the graph's storage URI (a positional URI, or `--cluster … --graph …`), out-of-band from the serving process — there are no server routes for `optimize` / `repair` / `cleanup` by design.
+
+`omnigraph --help` lists commands with a **capability legend** at the bottom (any / served / direct / control / local).
+
+## Write diagnostics & destructive confirmation
+
+Two global flags make writes self-documenting and guard the dangerous ones (RFC-011 Decision 9):
+
+- **Every write echoes its resolved target to stderr**, for example `omnigraph load → s3://acme/brain/graphs/knowledge.omni (direct, remote)`, so you catch a scope that resolved somewhere unexpected (e.g. *prod*) before it lands. This applies to `load`, `ingest`, `mutate`, `branch create|delete|merge`, `schema apply`, `optimize`, `repair`, and `cleanup`. The line goes to stderr, so `--json` consumers reading stdout are unaffected; suppress it with **`--quiet`**.
+- **Destructive writes against a non-local scope require confirmation.** `cleanup`, overwrite `load` (`--mode overwrite`), and `branch delete` proceed freely against a local (`file://`) graph, but when the resolved target is **not local** (a served `http(s)://` graph or an `s3://` store/cluster) they require explicit consent. Pass **`--yes`** to confirm up front; without it, an interactive terminal prompts, and a non-interactive run (no TTY, or `--json`) **refuses with an error** rather than silently destroying data. `cleanup` still requires its existing `--confirm` (preview→execute) as well; `--yes` is the additional non-local consent.
+
+A "local" target is a bare path or a `file://` URI; `http(s)://`, `s3://`, and other object-store schemes are non-local.
+
+## Config surfaces
+
+Two config surfaces with single owners, plus a zero-config tier:
+
+| Surface | Owner | Location | Declares |
+|---|---|---|---|
+| Cluster config | the team, in a repo | `cluster.yaml` + checkout ([cluster-config.md](../clusters/config.md)) | what the system **is**: graphs, schemas, queries, policies, storage |
+| Operator config | one person | `~/.omnigraph/config.yaml` (override dir with `$OMNIGRAPH_HOME`) | who **I** am: identity, ergonomics |
+| Flags / env | per invocation | — | everything, explicitly |
+
+### `~/.omnigraph/config.yaml` (operator)
+
+```yaml
+operator:
+  actor: act-andrew     # default identity for the --as cascade: --as > operator.actor > none
+servers:                # operator-owned endpoints; names key the credentials
+  prod:
+    url: https://graph.example.com     # no tokens in this file, ever
+defaults:
+  output: table         # read format default, below --json/--format/alias
+  server: prod          # the everyday SERVED scope when no address is given (RFC-011)
+  # store: file:///data/dev.omni   # OR a zero-flag LOCAL default (mutually
+  #                                #   exclusive with `server`); the local-dev
+  #                                #   counterpart of `server`
+  default_graph: knowledge   # graph selected in a server/cluster scope
+clusters:               # admin-only: managed-cluster storage roots (RFC-011).
+  brain:                #   the ONLY place a storage root lives in this file.
+    root: s3://acme/clusters/brain
+profiles:               # named scope bundles (RFC-011); pick with --profile
+  staging: { server: staging, default_graph: knowledge }   # a served scope
+  brain-admin: { cluster: brain, default_graph: knowledge } # a direct cluster scope
+```
+
+Absent file = empty layer. Unknown keys warn and load (a file written for a
+newer CLI works on an older one). Override the config directory with
+`$OMNIGRAPH_HOME`.
+
+#### Scopes & profiles (RFC-011)
+
+A command resolves a **scope** (a server, a cluster, or a store), then selects a
+graph in it; the served-vs-direct access path is derived from the scope, not
+toggled. The scope comes from the first of these that applies (highest
+precedence first):
+
+1. an explicit address: a positional URI, `--server`, or `--store <uri>`;
+2. a named `--profile <name>` (or `$OMNIGRAPH_PROFILE`);
+3. the flat `defaults.server` + `defaults.default_graph` (a served default)
+   **or** `defaults.store` (a zero-flag *local* default, mutually exclusive
+   with `defaults.server`).
+
+A **profile** binds exactly one of `server` / `cluster` / `store` plus an
+optional default graph. It is config data, not state: every command resolves
+its scope fresh, and there is no sticky "current" mode. Inspect what is defined
+with `omnigraph profile list` and `omnigraph profile show [<name>]` (read-only).
+
+- `--store <uri>` addresses a single graph's storage directly (ad-hoc / break-glass).
+- A `cluster`-bound profile reaches `optimize` / `repair` / `cleanup` for a managed
+  graph (resolving its storage root from `clusters:`), the same as
+  `--cluster <root> --graph <id>`. A `--graph` flag overrides the profile's default.
+- A `server`-bound scope on a maintenance verb, or a `cluster`-bound scope on a
+  data verb, is rejected with a message pointing at the right addressing.
+- **No graph selected (RFC-011 D7).** When a scope has no `--graph` and no
+  `default_graph`, the CLI never silently picks:
+  - **Cluster scope**: exactly **one** applied graph is used automatically;
+    with **several**, the CLI errors and lists the candidates (from the served
+    catalog).
+  - **Server scope**: a multi-graph server (any non-empty `GET /graphs`, even a
+    single entry) errors and lists the candidates, so you must pass
+    `--graph <id>`. A single-graph / flat server (405 on `/graphs`), or one whose
+    `/graphs` is policy-gated or unreachable, uses its bare URL as before.
+
+`--target`, `--cluster-graph`, and the positional-`http(s)://`→remote dispatch
+have been **removed** (`--graph` is now the one graph selector across server and
+cluster scopes); operator `defaults`/`--profile` supply the no-flag scope and an
+explicit address always wins.
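+
+A minimal sketch of a local-dev operator config under the scope rules above, using only documented keys. The `scratch` profile name and both `file://` paths are hypothetical, and `store:` as a profile key is inferred from the "exactly one of `server` / `cluster` / `store`" rule rather than shown in the sample file:
+
+```yaml
+# ~/.omnigraph/config.yaml (hypothetical local-dev operator config)
+defaults:
+  store: file:///data/dev.omni    # zero-flag LOCAL default; do not also set `server`
+profiles:
+  scratch: { store: file:///data/scratch.omni }   # a store-bound profile (assumed key)
+```
+
+Under these assumptions, a command given no address would resolve to the `defaults.store` graph, `--profile scratch` would select the second store instead, and an explicit address would still win over both.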
+
+#### Credentials keyed by server name
+
+`omnigraph login <name>` stores a bearer token in
+`~/.omnigraph/credentials` (created `0600`; group/world-readable files are
+refused). The token comes from `--token` or, preferably (it stays out of shell
+history), from one line on stdin: `echo "$TOKEN" | omnigraph login prod`.
+`omnigraph logout <name>` removes it (idempotent).
+
+#### Operator aliases — bindings, not content
+
+An operator alias is a personal name for *invoking a stored query on a
+named server* — it carries no query content (the stored query in the
+catalog is the team's contract; the alias, its defaults, and its name are
+yours):
+
+```yaml
+aliases:
+  triage:
+    server: intel-dev        # names an entry under servers:
+    graph: spike             # optional (multi-graph servers)
+    query: weekly_triage     # the STORED query's name — never a file
+    args: [since]            # positional args -> params, in order
+    params: { limit: 20 }    # fixed defaults; positionals/--params win
+    format: table
+```
+
+`omnigraph alias triage 2026-06-01` invokes
+`POST <server>/graphs/spike/queries/weekly_triage` with the keyed
+credential. Aliases live in their own `alias` namespace (RFC-011 Decision 4),
+so an alias can never shadow — or be shadowed by — a built-in verb. (The old
+`--alias <name>` flag on `query`/`mutate` was removed.)
+
+A remote command whose URL prefix-matches an operator server's `url` (the
+`gh` host model — no flags needed) resolves its token through:
+
+| Order | Source |
+|---|---|
+| 1 | `OMNIGRAPH_TOKEN_<NAME>` env (`prod` → `OMNIGRAPH_TOKEN_PROD`) |
+| 2 | `[<name>]` section in `~/.omnigraph/credentials` |
+| 3 | the default `OMNIGRAPH_BEARER_TOKEN` env |
+
+A keyed token is only ever sent to the server it is keyed to: a URL matching no
+operator server falls back to `OMNIGRAPH_BEARER_TOKEN` alone.
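+
+To make the keying concrete, here is a sketch with two operator servers. The `staging` entry and its URL are hypothetical, and `OMNIGRAPH_TOKEN_STAGING` is inferred from the documented `prod` → `OMNIGRAPH_TOKEN_PROD` pattern rather than confirmed:
+
+```yaml
+# ~/.omnigraph/config.yaml (illustrative; URLs are placeholders)
+servers:
+  prod:
+    url: https://graph.example.com           # token: OMNIGRAPH_TOKEN_PROD, else [prod] in credentials
+  staging:
+    url: https://staging.graph.example.com   # token: OMNIGRAPH_TOKEN_STAGING (assumed name), else [staging]
+```
+
+A request to a URL under `https://graph.example.com` would resolve the `prod` token through the table above; a URL matching neither entry would only ever see `OMNIGRAPH_BEARER_TOKEN`.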
+
+## Cluster config preview
+
+```bash
+omnigraph cluster validate --config company-brain
+omnigraph cluster plan     --config company-brain --json
+omnigraph cluster apply    --config company-brain --json
+omnigraph cluster approve  graph.<id> --config company-brain --as <actor>
+omnigraph cluster status   --config company-brain --json
+omnigraph cluster refresh  --config company-brain --json
+omnigraph cluster import   --config company-brain --json
+omnigraph cluster force-unlock <LOCK_ID> --config company-brain --json
+```
+
+`--config` is a directory containing `cluster.yaml`; it defaults to `.`.
+Stage 3A accepts graphs, schemas, stored queries, and policy bundle file
+references. Per command:
+
+- `plan` reads local JSON state from `<config-dir>/__cluster/state.json`; a
+  missing file means empty state.
+- `plan`, `apply`, `refresh`, and `import` acquire `__cluster/lock.json` by
+  default and release it before returning.
+- `apply` executes only stored-query/policy catalog writes (content-addressed
+  under `__cluster/resources/`) and requires an existing `state.json`.
+  Graph/schema changes are deferred with warnings, and applied resources do not
+  serve traffic until an `omnigraph-server --cluster <dir>` restart picks them up.
+- `status` reads state only and reports any existing lock metadata.
+- `force-unlock` removes a lock only when the supplied id exactly matches the
+  lock file.
+- `refresh` requires an existing `state.json`; `import` creates one only when
+  it is missing. Both observe declared graphs read-only at
+  `<config-dir>/graphs/<graph-id>.omni`.
+
+External state backends, graph/schema apply, automatic stale-lock breaking,
+`plan --refresh`, pipelines, UI specs, embeddings, aliases, and bindings are
+reserved for later stages. See [cluster-config.md](../clusters/config.md).
+
+## Output formats (`query` command, alias: `read`)
+
+- `json` — pretty-printed object with metadata + rows
+- `jsonl` — one metadata line then one JSON object per row
+- `csv` — RFC 4180-ish quoting
+- `table` — fitted text table, honors `table_max_column_width` + `table_cell_layout`
+- `kv` — grouped per-row key/value blocks
+
+## Param resolution
+
+Precedence (high to low): explicit `--params` / `--params-file`, then alias positional args, then the alias's fixed `params` defaults. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients.
+
+## Bearer token resolution (CLI)
+
+1. `graphs.<name>.bearer_token_env`
+2. `OMNIGRAPH_BEARER_TOKEN` global env
+3. `auth.env_file` referenced `.env`
+
+## Duration parsing (cleanup)
+
+`s | m | h | d | w` units, e.g. `--older-than 7d`.
--- a/docs/user/clusters/config.md
+++ b/docs/user/clusters/config.md
@ -1,9 +1,7 @@
 # Cluster Config

-**Status:** Phase 5 — cluster-booted serving (`omnigraph-server --cluster`).
-
 > New to the cluster tooling? Start with the operator how-to guide,
-> [cluster.md](cluster.md) — this document is the reference.
+> [cluster.md](index.md) — this document is the reference.

 Cluster config is the future control-plane configuration surface for a whole
 OmniGraph deployment. In this stage, OmniGraph can validate a local
@ -15,7 +13,8 @@ catalog writes, **graph creation** (a declared graph that does not exist yet
 is initialized by apply at the derived root), **schema updates** (soft drops
 only), and — behind an explicit, digest-bound **approval** — **graph
 deletion**. It does not perform data-loss schema migrations, start servers,
-or serve anything it applies: the server still boots from `omnigraph.yaml`.
+or run data loads. A server can boot from the applied ledger with
+`omnigraph-server --cluster <config-dir | storage-root>`.

 ## Commands

@ -33,33 +32,31 @@ omnigraph cluster force-unlock <LOCK_ID> --config company-brain --json
 `--config` points at a directory, not a file. The directory must contain
 `cluster.yaml`. When omitted, it defaults to the current directory.

-## Relationship to `omnigraph.yaml`
+## Relationship to `~/.omnigraph/config.yaml`

-`cluster.yaml` does not replace `omnigraph.yaml`, and the two never describe
-the same fact. `omnigraph.yaml` is the permanent **per-operator** layer (CLI
-defaults, the operator's identity and credential references, graph targets
-for data-plane commands); `cluster.yaml` is the shared desired state of a
+`cluster.yaml` and the per-operator `~/.omnigraph/config.yaml` never describe
+the same fact. The operator config is the permanent **per-operator** layer
+(the operator's identity and credential references, named servers/clusters,
+profiles, and CLI defaults); `cluster.yaml` is the shared desired state of a
 whole deployment, read only by the `cluster` commands via `--config`.

 The exact contract:

- **Cluster commands read `omnigraph.yaml` for exactly one thing**: the
-  `cli.actor` default used by `apply`/`approve` when `--as` is omitted —
-  operator identity is a per-operator fact. With `--as` present, no config
-  is read at all. Nothing else (its graph set, targets, bind, queries,
-  policies) ever influences a cluster command; a malformed `omnigraph.yaml`
-  breaks only the no-flag actor lookup, loudly.
- **A `--cluster` server reads `omnigraph.yaml` for nothing** — not even the
-  implicit current-directory search runs (mode-inference rule 0). Boot from
-  cluster state XOR `omnigraph.yaml`, never a merge.
- **The other direction is ergonomics, not coupling**: a per-operator
-  `omnigraph.yaml` may point `graphs.<name>.uri` at a cluster's derived root
-  (`company-brain/graphs/knowledge.omni`) so data-plane commands can use
-  `--target <name>` — an ordinary local path, no special handling.
+- **Cluster commands read the operator config for exactly one thing**: the
+  `operator.actor` default used by `apply`/`approve` when `--as` is omitted —
+  operator identity is a per-operator fact. With `--as` present, the operator
+  config is not needed. Nothing else in it influences a cluster command.
+- **No legacy `omnigraph.yaml`**: the CLI does not read `omnigraph.yaml` at
+  all, and a `--cluster` server reads only the cluster catalog — boot is
+  cluster-only.
+- **The other direction is ergonomics, not coupling**: per-operator
+  data-plane commands address a cluster graph by its derived storage root
+  (`company-brain/graphs/knowledge.omni`) with `--store <uri>` — an ordinary
+  local path, no special handling.

 ## Supported `cluster.yaml`

-Stage 3A accepts only this resource subset:
+The current config surface accepts this resource subset:

 ```yaml
 version: 1
@ -70,9 +67,18 @@ state:
  backend: cluster
  lock: true

+providers:
+  embedding:
+    default:
+      kind: openai-compatible
+      base_url: https://openrouter.ai/api/v1
+      model: openai/text-embedding-3-large
+      api_key: ${OPENROUTER_API_KEY}
+
 graphs:
  knowledge:
    schema: knowledge.pg
+    embedding_provider: default
    queries: queries/          # discover every `query <name>` in queries/*.gq

 policies:
@ -101,6 +107,17 @@ updates all of its queries together. Paths are relative to the config
 directory — the cluster is one explicit folder, so no `./` prefixes are
 needed.

+`providers.embedding.<name>` defines a query-time embedding provider profile
+for cluster-served graphs. A graph opts in with `embedding_provider: <name>`;
+bare names normalize to `provider.embedding.<name>`. Supported provider
+`kind` values are `openai-compatible` (default/OpenRouter-compatible),
+`openai` (OpenAI's own host), `gemini`, and `mock`. Real providers require
+`api_key: ${ENV_VAR}`; inline secrets are rejected. The env var is resolved
+only when a `--cluster` server boots, so `cluster validate`, `plan`, and
+`apply` do not need deployment secrets. `mock` is deterministic and does not
+require `api_key`. Vector dimensions stay schema-driven by the target
+`Vector(N)` column, not the provider profile.
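+
+For a test cluster that should not depend on deployment secrets, a sketch using the `mock` kind might look like the following. The `offline` profile name is hypothetical, and whether `mock` accepts or needs `model` / `base_url` is not stated here, so both are omitted as an assumption:
+
+```yaml
+providers:
+  embedding:
+    offline:              # hypothetical profile name
+      kind: mock          # deterministic; no api_key required
+
+graphs:
+  knowledge:
+    schema: knowledge.pg
+    embedding_provider: offline   # normalizes to provider.embedding.offline
+```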
+
 `storage:` (optional) is the **storage root URI** for everything the cluster
 stores — the state ledger, lock, content-addressed catalog, recovery
 sidecars, approval artifacts, and the derived graph roots
@ -135,10 +152,12 @@ operation is active.
 - stored-query parsing and query-name matching
 - stored-query type-checking against the desired schema
 - policy `applies_to` graph references
+- embedding provider profiles and graph `embedding_provider` references

-Fields reserved for later phases, such as `pipelines`, `embeddings`, `ui`,
-`aliases`, and `bindings`, fail with a typed diagnostic instead of being
-silently ignored.
+Fields reserved for later phases, such as `pipelines`, top-level
+`embeddings`, `ui`, `aliases`, and `bindings`, fail with a typed diagnostic
+instead of being silently ignored. Under `providers`, only `embedding` is
+supported today; other provider namespaces fail as unsupported config.

 ## Planning

@ -158,9 +177,21 @@ resource is planned as a create. If present, the file must use this shape:
  "applied_revision": {
    "config_digest": "...",
    "resources": {
-      "graph.knowledge": { "digest": "..." },
      "schema.knowledge": { "digest": "..." },
      "query.knowledge.find_experts": { "digest": "..." },
+      "provider.embedding.default": {
+        "digest": "...",
+        "embedding_profile": {
+          "kind": "openai-compatible",
+          "base_url": "https://openrouter.ai/api/v1",
+          "model": "openai/text-embedding-3-large",
+          "api_key": "${OPENROUTER_API_KEY}"
+        }
+      },
+      "graph.knowledge": {
+        "digest": "...",
+        "embedding_provider": "provider.embedding.default"
+      },
      "policy.base": {
        "digest": "...",
        "applies_to": ["cluster", "graph.knowledge"]
@ -236,12 +267,11 @@ Deletes remove the resource from state; their old payload blobs stay on disk
 (garbage collection is a later stage). Re-running a converged apply is a no-op:
 no state write, no revision change (`state_written: false`).

-**Applied means serving — for deployments that opt in.** A server started
-with `--cluster <dir>` boots from the applied revision (see
+**Applied means serving.** A server started with `--cluster <dir>` boots from
+the applied revision (see
 [Serving from the cluster](#serving-from-the-cluster-the-mode-switch)); it
-picks up newly applied state on its next restart. Deployments still booting
-from `omnigraph.yaml` are untouched: for them, applied means recorded in the
-catalog, nothing more.
+picks up newly applied state on its next restart. Until that restart, applied
+means recorded in the catalog, nothing more.

 ### Graph creation

--- a/docs/user/clusters/index.md
+++ b/docs/user/clusters/index.md
@ -7,8 +7,8 @@ destructive changes, and recovering from crashes.

 It is a **how-to**. The reference for every `cluster.yaml` key, command flag,
 state-file field, and diagnostic code is
-[cluster-config.md](cluster-config.md); the HTTP surface is
-[server.md](server.md).
+[cluster-config.md](config.md); the HTTP surface is
+[server.md](../operations/server.md).

 ## The model in one paragraph

@ -102,7 +102,7 @@ curl -H 'authorization: Bearer s3cret' \

 Bearer tokens and the bind address are deliberately *not* cluster facts —
 they are per-replica, set by flag or environment
-([server.md](server.md#modes) for the token sources).
+([server.md](../operations/server.md#modes) for the token sources).

 ## 2. The day-2 loop: edit → plan → apply → restart

@ -117,7 +117,7 @@ omnigraph cluster apply --config company-brain --as andrew

 `--as <actor>` attributes the run: it is recorded in recovery sidecars and
 audit entries and threaded into the engine's commit history. Set
-`cli: { actor: <you> }` in your per-operator `omnigraph.yaml` to make it the
+`operator: { actor: <you> }` in your `~/.omnigraph/config.yaml` to make it the
 default when `--as` is omitted (the flag always wins; `approve` requires one
 of the two).

@ -237,26 +237,53 @@ with an in-flight apply.
  directory; boot is read-only. Roll out a change by `apply` once, then
  restarting replicas (serving is static per process — there is no hot
  reload yet). Container/cloud recipes (AWS ECS+EFS, Railway volumes):
-  [deployment.md](deployment.md#cluster-mode-in-containers-aws-railway).
+  [deployment.md](../deployment.md#cluster-mode-in-containers-aws-railway).
 - **The directory is the deployable unit**: config, catalog, ledger,
  approvals, and graph data all live under it. Back it up as a whole;
  version the *config files* (not `__cluster/` or `graphs/`) in git.
 - **CI-driven convergence**: `validate` and `plan --json` are read-only and
  safe in pipelines; gate `apply --as ci` on plan review. Approvals are the
  human step by design — keep `cluster approve` out of automation.
- **`omnigraph.yaml` still has a job**: per-operator settings — your
-  `cli.actor` default for `--as`, CLI defaults, credentials, and data-plane
-  ergonomics (point `graphs.<name>.uri` at a derived root like
-  `company-brain/graphs/knowledge.omni` to use `--target <name>` for
-  loads). It just no longer describes the deployment — a server boots from
-  one source or the other, never a merge of both.
+- **`~/.omnigraph/config.yaml` is the per-operator config**: your
+  `operator.actor` default for `--as`, named servers/clusters, credential
+  references, profiles, and data-plane ergonomics (address a cluster graph by
+  its derived root like `company-brain/graphs/knowledge.omni` with `--store` for
+  loads). The cluster directory's `cluster.yaml` is the **sole deployment
+  declaration**; the server boots from the cluster only.
+
+## 7. Maintaining a cluster graph
+
+Storage maintenance (`optimize` / `repair` / `cleanup`) is **not** a control-plane
+operation — it runs out-of-band, with direct storage access, against the graph's
+roots. Address a cluster graph by name instead of hand-typing its storage path:
+
+```bash
+omnigraph optimize --cluster ./company-brain --graph knowledge
+omnigraph cleanup  --cluster ./company-brain --graph knowledge --keep 10 --confirm
+# --cluster also takes the storage-root URI directly (config-free), and a
+# `clusters:` name from ~/.omnigraph/config.yaml:
+omnigraph optimize --cluster s3://bucket/clusters/company-brain --graph knowledge
+```
+
+The graph's storage URI is resolved from the **served cluster state** (the same
+truth a `--cluster` server boots from); a graph that hasn't been applied yet is
+not resolvable. Run these from a host with storage access — there are no server
+routes for them. Conversely, **`init` refuses** a cluster-managed path: graphs in
+a cluster are created by `cluster apply`, not by hand.
+
+If the cluster has exactly **one** applied graph you can omit `--graph` — it is
+used automatically. With **several**, omitting `--graph` errors and lists the
+candidates (RFC-011 D7); it never picks one for you.
+
+Against an **`s3://`-backed cluster** the resolved graph storage is non-local, so a
+destructive `cleanup` additionally requires **`--yes`** (an interactive prompt
+otherwise, refusal without a TTY) on top of `--confirm` — see [cli-reference.md](../cli/reference.md)'s
+*Write diagnostics & destructive confirmation*. Every maintenance run also echoes
+its resolved target to stderr (suppress with `--quiet`).
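+
+To use a short name instead of a path or URI, the operator config can carry the storage root. A sketch, reusing the `clusters:` and `profiles:` shapes from the CLI reference (the `brain` and `brain-admin` names and the S3 root are placeholders):
+
+```yaml
+# ~/.omnigraph/config.yaml
+clusters:
+  brain:
+    root: s3://acme/clusters/brain
+profiles:
+  brain-admin: { cluster: brain, default_graph: knowledge }
+```
+
+With this in place, `--cluster brain --graph knowledge` and `--profile brain-admin` should resolve to the same graph storage as passing the S3 root directly.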

 ## What the control plane does not do (yet)

 - **No hot reload** — applied changes serve on the next restart.
- **No S3-hosted cluster directories** — the config dir, ledger, catalog,
-  and derived graph roots are local-filesystem paths today. (Individual
-  *graphs* on S3 are a server feature outside cluster mode.)
 - **No data operations** — rows move through `omnigraph load / ingest /
  mutate` against the graph roots, with branches and merges as usual.
 - **Stored-query exposure is all-or-nothing per cluster** — every applied
@ -266,4 +293,4 @@ with an in-flight apply.
  reserved and rejected loudly.

 For the full reference — every key, flag, status, disposition, and
-diagnostic — see [cluster-config.md](cluster-config.md).
+diagnostic — see [cluster-config.md](config.md).
--- a/docs/user/concepts/index.md
+++ b/docs/user/concepts/index.md
@ -0,0 +1,49 @@
+# Concepts
+
+OmniGraph is a typed property-graph engine built as a coordination layer over the
+[Lance](https://lance.org) columnar storage format. It gives you a schema-checked
+graph with vector, full-text, and graph queries in one runtime, plus Git-style
+branches and commits across the whole graph.
+
+## The data model
+
+- A graph has **node types** and **edge types**, declared in a
+  [schema](../schema/index.md).
+- Each node type and each edge type is stored as its **own Lance dataset** —
+  columnar, versioned, on local disk or object storage.
+- A single `__manifest` table coordinates all of those datasets, so the graph has
+  one coherent version even though it spans many datasets.
+
+The manifest is what lets a graph commit be **atomic across every type at once**: a
+publish flips every relevant dataset's version together in one manifest write, so
+readers never see a half-applied change. See [storage](storage.md) for the layout.
+
+## Two layers: inherited vs. added
+
+Throughout the docs, capabilities are framed as **L1** (inherited from Lance) or
+**L2** (added by OmniGraph):
+
+| | L1 — from Lance | L2 — added by OmniGraph |
+|---|---|---|
+| Storage | Columnar Arrow datasets on object storage | Per-type datasets coordinated as one graph |
+| Versioning | Per-dataset versions + time travel | [Snapshots](../branching/time-travel.md) across all types at once |
+| Branches | Per-dataset branches | [Graph-level branches](../branching/index.md), atomic across types |
+| Commits | Per-dataset commits | [Commit DAG](../branching/index.md) for the whole graph; three-way [merge](../branching/merge.md) |
+| Indexes | Scalar / vector / full-text indexes | Built per relevant column; graph topology index for traversal |
+| Search | Vector + full-text primitives | [`nearest` / `bm25` / `rrf`](../search/index.md) in one query, plus graph traversal |
+| Querying | — | The [`.gq` query language](../queries/index.md) and [`.pg` schema language](../schema/index.md) |
+
+## How the pieces fit
+
+- The **schema** (`.pg`) and **query** (`.gq`) languages are compiled to a typed
+  intermediate representation.
+- The **engine** runs queries and mutations against Lance, coordinates the manifest,
+  maintains the commit graph, and builds indexes.
+- The **CLI** ([`omnigraph`](../cli/index.md)) and the
+  **HTTP server** ([`operations/server.md`](../operations/server.md)) are two front
+  ends over the same engine, so embedded and remote behavior match.
+- [Cedar policy](../operations/policy.md) enforcement is engine-wide — every writer
+  goes through the same authorization gate regardless of front end.
+
+For deployment-scale topics — multi-graph servers, control-plane operations,
+recovery — see [clusters](../clusters/index.md).
--- a/docs/user/concepts/storage.md
+++ b/docs/user/concepts/storage.md
@ -7,47 +7,45 @@ Every node type and every edge type is its own Lance dataset:
 - **Columnar Arrow storage**: each property is a column; nullable per Arrow schema.
 - **Fragments**: data is partitioned into fragments; new writes create new fragments.
 - **Manifest versioning**: every commit produces a new dataset version; old versions remain readable.
- **Stable row IDs**: `enable_stable_row_ids: true` is set on every Lance dataset OmniGraph creates — node and edge data tables, `__manifest`, `_graph_commits.lance`, `_graph_commit_recoveries.lance`, and any future system tables. This is an architectural invariant: the flag is one-way at dataset create per Lance's row-id-lineage spec, so a future change that introduces a Lance dataset must preserve it. Consequences: `_row_created_at_version` and `_row_last_updated_at_version` are available on every dataset (load-bearing for change-feed validators); `CreateIndex × Rewrite` is not a retryable conflict, so indices survive `omnigraph optimize` without needing the Fragment Reuse Index; readers must use a Lance build that recognises the flag (our pinned 4.0.0 is fine). Pre-0.4.x graphs created before this code path settled may have datasets without the flag and cannot be retrofitted in place — the supported path is dump-and-reload. The `stage_overwrite` rewrite path (used by `schema_apply`) preserves the flag through `Operation::Overwrite`; pinned by `stage_overwrite_preserves_stable_row_ids` in `crates/omnigraph/tests/staged_writes.rs`.
+- **Stable row IDs**: stable row IDs are enabled on every Lance dataset OmniGraph creates — node and edge data tables, `__manifest`, the commit-graph datasets, and any future system tables. This is an architectural invariant: the flag is one-way at dataset create, so a future change that introduces a Lance dataset must preserve it. Consequences: `_row_created_at_version` and `_row_last_updated_at_version` are available on every dataset (load-bearing for change-feed validators); indices survive `omnigraph optimize`. Pre-0.4.x graphs created before this code path settled may have datasets without the flag and cannot be retrofitted in place — the supported path is dump-and-reload. The rewrite path used by `schema_apply` preserves the flag.
 - **Append / delete / `merge_insert`**: native Lance write modes.
 - **Per-dataset branches** (Lance native): copy-on-write at the dataset level.
- **Object-store agnostic**: file://, s3://, gs://, az://, http (read-only via Lance) — OmniGraph wires file:// and s3:// (`storage.rs`).
+- **Object-store agnostic**: file://, s3://, gs://, az://, http (read-only via Lance) — OmniGraph wires file:// and s3://.

 ## L2 — Multi-dataset coordination via `__manifest`

 OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordinated through one append-only manifest table.

 - **Manifest table**: `__manifest/` Lance dataset.
- **Layout** (`db/manifest/layout.rs`, `db/manifest/state.rs`):
+- **Layout**:
  - `nodes/{fnv1a64-hex(type_name)}` — one Lance dataset per node type
  - `edges/{fnv1a64-hex(edge_type_name)}` — one Lance dataset per edge type
  - `__manifest/` — the catalog of all sub-tables and their published versions
  - `_graph_commits.lance` / `_graph_commit_actors.lance` — the commit graph and its actor map
-  - (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 graphs are inert; the run state machine was removed in MR-771. The v2→v3 manifest migration sweeps stale `__run__*` branches on first write-open; the inert dataset bytes themselves remain until a `delete_prefix` storage primitive lands)
+  - (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 graphs are inert; the run state machine was removed. The internal schema migration sweeps stale `__run__*` branches on first write-open; the inert dataset bytes themselves remain until a prefix-delete storage primitive lands)
 - **Manifest row schema** (`object_id, object_type, location, metadata, base_objects, table_key, table_version, table_branch, row_count`):
  - `object_type` ∈ `table | table_version | table_tombstone`
  - `table_key` ∈ `node:<TypeName> | edge:<EdgeName>`
  - `table_branch` is `null` for the main lineage and the branch name otherwise
 - **Snapshot reconstruction**: latest visible `table_version` per `(table_key, table_branch)` minus tombstones — rows where `object_type = table_tombstone`, whose own `table_version` (acting as the tombstone version) is `>= the entry's table_version`.
- **Atomic publish**: multi-dataset commits publish via a `ManifestBatchPublisher` so a single write to `__manifest` flips all the new sub-table versions visible at once.
- **Row-level CAS on the merge-insert join key**: `object_id` carries `lance-schema:unenforced-primary-key=true` so Lance's bloom-filter conflict resolver rejects two concurrent commits that land the same `object_id` row. Without this annotation, Lance's transparent rebase would admit silent duplicates of `version:T@v=N` from racing publishers (see `.context/merge-insert-cas-granularity.md`).
- **Optimistic concurrency control on publish**: `ManifestBatchPublisher::publish` accepts a `expected_table_versions: HashMap<table_key, u64>` map. Each entry asserts the manifest's current latest non-tombstoned version for that table is exactly what the caller observed; mismatches surface as `OmniError::Manifest` with `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }`. Empty map preserves the legacy "best-effort publish" semantics. The publisher uses `conflict_retries(0)` against Lance and owns retry itself (`PUBLISHER_RETRY_BUDGET = 5`), re-running the pre-check on each iteration so concurrent advances surface as `ExpectedVersionMismatch` rather than being silently rebased through.
+- **Atomic publish**: multi-dataset commits publish through a single write to `__manifest` that makes all the new sub-table versions visible at once.
+- **Row-level CAS on the merge-insert join key**: `object_id` carries an unenforced-primary-key annotation so Lance's bloom-filter conflict resolver rejects two concurrent commits that land the same `object_id` row. Without this annotation, Lance's transparent rebase would admit silent duplicates from racing publishers.
+- **Optimistic concurrency control on publish**: a publish asserts the manifest's current latest non-tombstoned version for each touched table is exactly what the caller observed; mismatches surface as an `ExpectedVersionMismatch` manifest conflict naming the table and the expected/actual versions. Concurrent advances surface as a conflict rather than being silently rebased through.

-### Internal schema versioning (`db/manifest/migrations.rs`)
+### Internal schema versioning

-The on-disk shape of `__manifest` is reconciled with the binary via a single stamp + dispatcher. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape this binary writes; the on-disk stamp `omnigraph:internal_schema_version` lives in the manifest dataset's schema-level metadata (Lance `update_schema_metadata`).
+The on-disk shape of `__manifest` is reconciled with the binary via a single version stamp held in the manifest dataset's schema-level metadata.

- **`init_manifest_graph`** stamps the current version at creation, so newly initialized graphs never need migration.
- **Publisher open-for-write path** (`load_publish_state`) calls `migrate_internal_schema(&mut dataset)` before reading state. When the on-disk stamp matches the binary, this is a single metadata read with no writes; otherwise the dispatcher walks `match`-arm steps forward (1→2, 2→3, …) until the stamp matches, then proceeds with the publish. Reads stay side-effect-free.
+- **Graph creation** stamps the current version, so newly initialized graphs never need migration.
+- **The open-for-write path** migrates the on-disk stamp before reading state. When the stamp matches the binary, this is a single metadata read with no writes; otherwise the migration walks steps forward (1→2, 2→3, …) until the stamp matches, then proceeds with the publish. Reads stay side-effect-free.
 - **Forward-version protection**: a stamp *higher* than the binary's known version triggers a clear "upgrade omnigraph first" error. An old binary cannot clobber a newer schema by silently treating "unknown stamp" as "missing stamp".
- **Idempotency**: each migration step is safe to re-run. A crash between two metadata updates inside a single step leaves the partial state; the next open re-runs the step and the second update lands. The dispatcher itself is a cheap stamp-read on the steady-state path.
-
-Adding a new on-disk shape change is one constant bump (`INTERNAL_MANIFEST_SCHEMA_VERSION`), one match arm in `migrate_internal_schema`, and one test. No code outside this module branches on the stamp.
+- **Idempotency**: each migration step is safe to re-run. A crash between two metadata updates inside a single step leaves partial state behind; the next open re-runs the step and the second update lands.
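
The stamp handling in the bullets above can be sketched as a small forward-only dispatcher. This is an illustrative model only: `CURRENT_VERSION`, `MigrateError`, and `migrate` are hypothetical names, the stamp is modeled as an in-memory `Option<u32>` rather than dataset metadata, and the step bodies are left as comments.

```rust
/// Version this hypothetical binary writes (the table below lists v1 to v3).
pub const CURRENT_VERSION: u32 = 3;

#[derive(Debug, PartialEq, Eq)]
pub enum MigrateError {
    /// The on-disk stamp is newer than this binary understands.
    UpgradeRequired { on_disk: u32, binary: u32 },
}

/// Walk the stamp forward one step at a time until it matches the binary.
/// `None` models a pre-stamp graph, which is treated as v1.
/// Returns the number of steps applied (0 on the steady-state path).
pub fn migrate(stamp: &mut Option<u32>) -> Result<u32, MigrateError> {
    let mut version = stamp.unwrap_or(1);
    // Forward-version protection: never touch a schema newer than we know.
    if version > CURRENT_VERSION {
        return Err(MigrateError::UpgradeRequired {
            on_disk: version,
            binary: CURRENT_VERSION,
        });
    }
    let mut steps = 0;
    while version < CURRENT_VERSION {
        match version {
            1 => { /* v1 -> v2: add the primary-key annotation (re-runnable) */ }
            2 => { /* v2 -> v3: sweep legacy staging branches (re-runnable) */ }
            _ => unreachable!("no step defined for v{version}"),
        }
        version += 1;
        // Stamp only after the step's work, so a crash re-runs the step.
        *stamp = Some(version);
        steps += 1;
    }
    Ok(steps)
}
```

When the stamp already matches, the loop body never runs and nothing is written, which mirrors the "single metadata read with no writes" steady state.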

 | Stamp | Shape change |
 |---|---|
-| v1 (implicit, pre-stamp) | `__manifest.object_id` had no PK annotation; publisher had no row-level CAS protection. |
-| v2 | `__manifest.object_id` carries `lance-schema:unenforced-primary-key=true`; row-level CAS engaged. Stamped as `omnigraph:internal_schema_version=2`. |
-| v3 | One-time sweep of legacy `__run__*` staging branches (pre-v0.4.0 Run state machine, removed MR-771) off `__manifest`. Runs at `Omnigraph::open(ReadWrite)` and on publish. Stamped as `omnigraph:internal_schema_version=3`. |
+| v1 (implicit, pre-stamp) | `__manifest.object_id` had no PK annotation; no row-level CAS protection. |
+| v2 | `__manifest.object_id` carries an unenforced-primary-key annotation; row-level CAS engaged. |
+| v3 | One-time sweep of legacy `__run__*` staging branches (pre-v0.4.0 Run state machine, removed) off `__manifest`. Runs at read-write open and on publish. |

 ## On-disk layout

@ -92,20 +90,20 @@ flowchart TB
 - **Graph root** is one directory (or S3 prefix). Everything below is part of one OmniGraph graph.
 - **`__manifest/`** is a Lance dataset whose rows describe which sub-table version is published at which graph-branch. Reading a snapshot starts here.
 - **`nodes/`** and **`edges/`** are sibling directories holding one Lance dataset per declared type. Names are `fnv1a64-hex` of the type name to keep paths fixed-length and case-safe.
- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 graphs also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; the v2→v3 migration sweeps their stale `__run__*` branches, and the dataset bytes are reclaimed once `delete_prefix` lands.)
- **`_graph_commit_recoveries.lance`** — one row per recovery sweep action. Joined to `_graph_commits.lance` by `graph_commit_id`; the linked commit row carries `actor_id=omnigraph:recovery`. Operators correlate recoveries with the original mutations they rolled forward / back via this join. See `crates/omnigraph/src/db/recovery_audit.rs`.
- **`__recovery/{ulid}.json`** — transient sidecar files written by the five migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`, `optimize_all_tables`) before Phase B begins, deleted after Phase C succeeds. A sidecar persisting after process exit means the writer crashed in the Phase B → Phase C window; the next `Omnigraph::open` recovery sweep processes it. Steady-state directory is empty. See `crates/omnigraph/src/db/manifest/recovery.rs`.
+- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 graphs also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; the internal schema migration sweeps their stale `__run__*` branches, and the dataset bytes are reclaimed once a prefix-delete primitive lands.)
+- **`_graph_commit_recoveries.lance`** — one row per crash-recovery action. Joined to `_graph_commits.lance` by `graph_commit_id`; the linked commit row carries `actor_id=omnigraph:recovery`. Operators correlate recoveries with the original mutations they rolled forward / back via this join.
+- **`__recovery/{ulid}.json`** — transient sidecar files written by a writer before it advances the underlying dataset, deleted once the matching manifest publish succeeds. A sidecar persisting after process exit means the writer crashed mid-commit; the next read-write open processes it. Steady-state directory is empty.
 - **`_refs/branches/{name}.json`** is graph-level branch metadata — pointers from a branch name to the manifest version it heads.
 - **Inside each Lance dataset** (orange): the standard Lance directory layout. `_versions/{n}.manifest` records every commit; `data/` holds the actual Arrow fragments; `_indices/{uuid}/` holds index segments with their own `fragment_bitmap` for partial coverage; `_refs/` holds Lance-native per-dataset branches and tags.

 The split — L2 owns the cross-dataset catalog; L1 owns the per-dataset internals — means that schema work (which adds or removes datasets) updates `__manifest`, while data work (which adds fragments) updates `_versions/` inside the affected dataset and then bumps `__manifest`.
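
For the `fnv1a64-hex` directory names mentioned in the list above, a minimal sketch of the hash is shown here. The FNV-1a constants are the standard 64-bit ones; that the input is the raw UTF-8 bytes of the type name and that the output is zero-padded lowercase hex are assumptions for illustration, and `fnv1a64_hex` is a hypothetical helper name.

```rust
/// 64-bit FNV-1a over the UTF-8 bytes of `name`, rendered as 16 lowercase
/// hex digits. Encoding and padding are assumptions made for this sketch.
pub fn fnv1a64_hex(name: &str) -> String {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for byte in name.bytes() {
        // FNV-1a: XOR the byte in first, then multiply by the prime.
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    format!("{hash:016x}")
}
```

Whatever the exact input encoding, the output is always 16 characters drawn from `0-9a-f`, which is what keeps the paths fixed-length and case-safe.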

-## URI scheme support (`storage.rs`)
+## URI scheme support

 | Scheme | Backend | Notes |
 |---|---|---|
-| local path / `file://` | `ObjectStorageAdapter` over `object_store::LocalFileSystem` | Normalized to absolute paths; relative and dot-segment paths are lexically absolutized |
-| `s3://bucket/prefix` | `ObjectStorageAdapter` over `object_store` S3 | Honors `AWS_ENDPOINT_URL_S3`, `AWS_ALLOW_HTTP`, `AWS_S3_FORCE_PATH_STYLE` |
+| local path / `file://` | local filesystem | Normalized to absolute paths; relative and dot-segment paths are lexically absolutized |
+| `s3://bucket/prefix` | S3 object store | Honors `AWS_ENDPOINT_URL_S3`, `AWS_ALLOW_HTTP`, `AWS_S3_FORCE_PATH_STYLE` |
 | `http(s)://host:port` | HTTP client to `omnigraph-server` | Used by CLI as a target, not a storage backend |
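
The scheme dispatch in the table can be sketched as a small classifier. This is a hedged illustration: the `Target` enum and `classify` function are hypothetical names, and the sketch only routes by prefix. It does not perform the absolute-path normalization the table describes for local paths.

```rust
/// Hypothetical classification of a URI by the schemes in the table.
#[derive(Debug, PartialEq, Eq)]
pub enum Target {
    /// Local path or `file://` URI (not yet absolutized in this sketch).
    LocalPath(String),
    /// `s3://bucket/prefix` object-store location.
    S3 { bucket: String, prefix: String },
    /// `http(s)://host:port`, a server target rather than a storage backend.
    Server(String),
}

pub fn classify(uri: &str) -> Target {
    if let Some(rest) = uri.strip_prefix("s3://") {
        // Everything up to the first '/' is the bucket, the rest is the prefix.
        let (bucket, prefix) = rest.split_once('/').unwrap_or((rest, ""));
        return Target::S3 {
            bucket: bucket.to_string(),
            prefix: prefix.to_string(),
        };
    }
    if uri.starts_with("http://") || uri.starts_with("https://") {
        return Target::Server(uri.to_string());
    }
    // Anything else is treated as a filesystem path.
    let path = uri.strip_prefix("file://").unwrap_or(uri);
    Target::LocalPath(path.to_string())
}
```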

 ## Object-store env vars (S3-compatible)
--- a/docs/user/constants.md
+++ b/docs/user/constants.md
@ -1,40 +0,0 @@
-# Constants & Tunables (cheat sheet)
-
-| Name | Value | Where |
-|---|---|---|
-| `MANIFEST_DIR` | `__manifest` | `db/manifest/layout.rs` |
-| Commit graph dir | `_graph_commits.lance` | `db/commit_graph.rs` |
-| Run registry dir (legacy, removed MR-771) | `_graph_runs.lance` | inert post-v0.4.0; bytes remain until a `delete_prefix` primitive lands |
-| Run branch prefix (legacy, removed MR-771/MR-770) | `__run__` | swept off `__manifest` by the v2→v3 migration; no longer a reserved name |
-| Schema apply lock | `__schema_apply_lock__` | `db/mod.rs` |
-| Manifest publisher retry budget | `PUBLISHER_RETRY_BUDGET = 5` | `db/manifest/publisher.rs` |
-| Internal manifest schema version | `INTERNAL_MANIFEST_SCHEMA_VERSION = 3` | `db/manifest/migrations.rs` |
-| Merge stage batch | `MERGE_STAGE_BATCH_ROWS = 8192` | `exec/merge.rs` |
-| Maintenance concurrency | `OMNIGRAPH_MAINTENANCE_CONCURRENCY=8` | `db/omnigraph/optimize.rs` |
-| Lance blob compaction support | `LANCE_SUPPORTS_BLOB_COMPACTION = false` | `db/omnigraph/optimize.rs` |
-| Graph index cache size | `8` (LRU) | `runtime_cache.rs` |
-| Expand indexed-path frontier ceiling | `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER=1024` | `exec/query.rs` |
-| Expand indexed-path hop ceiling | `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS=6` | `exec/query.rs` |
-| Expand CSR-build cost factor | `CSR_BUILD_FACTOR = 1.5` | `exec/query.rs` |
-| Expand mode override | `OMNIGRAPH_TRAVERSAL_MODE` (`indexed`\|`csr`; unset = cost-based auto) | `exec/query.rs` |
-| Default body limit | `1 MB` | `omnigraph-server/lib.rs` |
-| Ingest body limit | `32 MB` | `omnigraph-server/lib.rs` |
-| Engine embed model | `gemini-embedding-2-preview` | `omnigraph/embedding.rs` |
-| Compiler embed model | `text-embedding-3-small` | `omnigraph-compiler/embedding.rs` |
-| Embed timeout | `30 000 ms` | both clients |
-| Embed retries | `4` | both clients |
-| Embed retry backoff | `200 ms` | both clients |
-| LANCE memory pool default | `1 GB` (raised in v0.3.0) | runtime |
-
-**Expand traversal dispatch.** With `OMNIGRAPH_TRAVERSAL_MODE` unset, the engine
-chooses the indexed (per-hop BTREE) vs CSR (whole-graph in-memory) path with a
-cost model over cheap manifest counts (frontier size, |E|, source-vertex count,
-hops) plus the index-coverage signal: the indexed path is preferred when its
-frontier-relative work beats building the CSR (≈ when `hops × frontier` is a
-small fraction of the source-vertex set), and CSR is preferred for dense/deep
-traversals or when the BTREE coverage is degraded and a full scan would be paid
-per hop. The two ceilings bound the **initial dispatch** frontier/hops (beyond
-them CSR is always used); they are not a hard per-hop bound — the cost model
-*estimates* total indexed work as ~`hops × frontier × fanout`, so dense fan-out is
-priced toward CSR rather than capped mid-traversal. The override flag forces a path (the `auto` result is identical either way;
-only the path differs).
--- a/docs/user/deployment.md
+++ b/docs/user/deployment.md
@ -13,13 +13,10 @@ Omnigraph supports two broad deployment shapes:

 The server binary and container image expose the same HTTP surface.

-The server also has two **boot sources**: `omnigraph.yaml` (graph targets
-declared in the per-operator config) or a **cluster directory**
-(`omnigraph-server --cluster <dir>`), which serves the cluster control
+The server has a single **boot source**: a **cluster directory**
+(`omnigraph-server --cluster <dir | s3://…>`), which serves the cluster control
 plane's applied revision — see
-[cluster-config.md](cluster-config.md#serving-from-the-cluster-the-mode-switch).
-The two are exclusive per deployment; switching is a restart with a different
-flag.
+[clusters/config.md](clusters/config.md#serving-from-the-cluster-the-mode-switch).

 ## Binary Deployment

@ -30,25 +27,29 @@ Build or install:

 On Windows, the binaries are `omnigraph.exe` and `omnigraph-server.exe`.

-Run against a local graph:
+The server boots from a cluster only (RFC-011) — there is no positional
+`<URI>` / single-graph boot. Point it at a local cluster directory:

 ```bash
-omnigraph-server graph.omni --bind 0.0.0.0:8080
+omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080
 ```

-Run against an object-store-backed graph:
+Or boot config-free from an object-storage-rooted cluster:

 ```bash
 OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
 AWS_REGION="us-east-1" \
-omnigraph-server s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \
+omnigraph-server --cluster s3://my-bucket/clusters/company-brain \
  --bind 0.0.0.0:8080
 ```

+The server serves every graph in the cluster's applied revision under
+`/graphs/{id}/...`. See [clusters](clusters/index.md) for authoring and
+applying a cluster.
+
 ## Cluster Mode in Containers (AWS, Railway)

-A cluster-booted deployment has **two shapes** since the `storage:` root
-(RFC-006):
+A cluster-booted deployment has **two shapes** since the `storage:` root was introduced:

 - **Bucket, no volume (preferred for cloud)** — the cluster's ledger,
  catalog, and graph data live under an object-storage root
@ -81,10 +82,8 @@ docker run -d \
  -p 8080:8080 <image>
 ```

-`OMNIGRAPH_CLUSTER` is exclusive: combining it with `OMNIGRAPH_TARGET_URI`,
-`OMNIGRAPH_CONFIG`, or `OMNIGRAPH_TARGET` fails fast (exit 64), the same
-rule the server itself enforces. The image also ships the `omnigraph` CLI,
-so the day-2 loop runs in-container with no `omnigraph.yaml`:
+`OMNIGRAPH_CLUSTER` is the server's only boot source. The image also
+ships the `omnigraph` CLI, so the day-2 loop runs in-container:

 ```bash
 docker exec -it <container> sh -c \
@ -105,10 +104,10 @@ docker exec -it <container> sh -c \
   `omnigraph cluster apply --as <you> --config /var/lib/omnigraph/cluster`
   → force a new deployment (restart).

-For a deployment that doesn't need the cluster control plane, the classic
-stateless shape — `OMNIGRAPH_TARGET_URI=s3://bucket/graph.omni`, no volume —
-remains the simplest AWS architecture (see Binary/Container Deployment
-above).
+For a stateless, volume-free deployment, root the cluster on object
+storage and boot config-free with
+`OMNIGRAPH_CLUSTER=s3://bucket/clusters/<name>` (the bucket-no-volume
+shape above) — the simplest AWS architecture.

 ### Railway

@ -130,49 +129,46 @@ above).
  unvalidated** — boot is lock-free read-only so it should compose, but it
  is not yet exercised by tests.

-## One-Command Local RustFS Bootstrap
+## Testing against S3 locally

-The easiest local S3-backed deployment path is:
+To exercise the S3 storage path without a cloud account, run any S3-compatible
+store in Docker and point the standard `AWS_*` environment at it. RustFS is
+shown; MinIO works the same way.

 ```bash
-curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/local-rustfs-bootstrap.sh | bash
+docker run -d --name omnigraph-s3 -p 9000:9000 \
+  -e RUSTFS_ACCESS_KEY=omnigraph -e RUSTFS_SECRET_KEY=omnigraph \
+  -e RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true \
+  rustfs/rustfs:latest /data
+
+export AWS_ACCESS_KEY_ID=omnigraph AWS_SECRET_ACCESS_KEY=omnigraph \
+  AWS_REGION=us-east-1 AWS_ENDPOINT_URL_S3=http://127.0.0.1:9000 \
+  AWS_ALLOW_HTTP=true AWS_S3_FORCE_PATH_STYLE=true
+
+# create the bucket once (any S3 client works)
+aws --endpoint-url "$AWS_ENDPOINT_URL_S3" s3 mb s3://omnigraph-local
 ```

-The bootstrap:
+Now an `s3://…` URI works anywhere a graph or cluster root is expected. Root a
+cluster on the bucket and serve it config-free:

- starts a local RustFS-backed object store
- creates a bucket and S3-backed Omnigraph graph
- loads the checked-in context fixture
- starts `omnigraph-server` on `127.0.0.1:8080`
+```bash
+# cluster.yaml
+#   version: 1
+#   storage: s3://omnigraph-local/clusters/demo
+#   graphs: { demo: { schema: schema.pg } }

-Supported behavior:
+omnigraph cluster validate --config .
+omnigraph cluster import   --config .
+omnigraph cluster apply    --config . --as you
+omnigraph load --data seed.jsonl --mode merge \
+  s3://omnigraph-local/clusters/demo/graphs/demo.omni
+omnigraph-server --cluster s3://omnigraph-local/clusters/demo \
+  --bind 127.0.0.1:8080 --unauthenticated
+```

- downloads the rolling `edge` binary when one exists for the current platform
- otherwise clones `ModernRelay/omnigraph` and builds from source
- reuses an existing RustFS container if it is already running
-
-Useful overrides:
-
- `WORKDIR=/path/to/state`
- `BUCKET=omnigraph-local`
- `PREFIX=graphs/context`
- `RESET_REPO=1` to delete an existing partially initialized graph prefix before recreating it
- `BIND=127.0.0.1:8080`
- `RUSTFS_CONTAINER_NAME=omnigraph-rustfs-demo`
-
-The bootstrap expects:
-
- Docker
- `curl`
- either a matching release asset or a local Rust toolchain plus `git`
-
-If `aws` is not installed, the script attempts a user-local AWS CLI install via
-`python3 -m pip`. Docker Desktop or another Docker daemon must already be
-running.
-
-If a previous bootstrap left objects behind under the selected `PREFIX` but did
-not finish initializing the graph, rerun with `RESET_REPO=1` or choose a new
-`PREFIX`.
+The same `AWS_*` contract applies to a production object store — swap the
+endpoint and credentials. CI exercises this path against containerized RustFS.

 ## Container Deployment

@ -182,23 +178,24 @@ Build the image:
 docker build -t omnigraph-server:local .
 ```

-Run against a local graph:
+The server boots from a cluster only (RFC-011). Run against a cluster
+directory on a mounted volume:

 ```bash
 docker run --rm -p 8080:8080 \
-  -v "$PWD/graph.omni:/data/graph.omni" \
+  -v "$PWD/company-brain:/var/lib/omnigraph/cluster" \
  omnigraph-server:local \
-  /data/graph.omni --bind 0.0.0.0:8080
+  --cluster /var/lib/omnigraph/cluster --bind 0.0.0.0:8080
 ```

-Run against an S3-backed graph:
+Run config-free against an object-storage-rooted cluster:

 ```bash
 docker run --rm -p 8080:8080 \
  -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
  -e AWS_REGION="us-east-1" \
  omnigraph-server:local \
-  s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \
+  --cluster s3://my-bucket/clusters/company-brain \
  --bind 0.0.0.0:8080
 ```

@ -209,27 +206,14 @@ When no positional args are given, the image entrypoint

 | Var | Effect |
 |---|---|
-| `OMNIGRAPH_TARGET_URI` | Graph URI, passed as the positional argument. |
-| `OMNIGRAPH_CONFIG` | Path to an `omnigraph.yaml`, passed as `--config`. Used to supply a `policy.file` (Cedar authorization). The config file and any relative `policy.file` must be mounted into the container. |
-| `OMNIGRAPH_TARGET` | Graph name to select from the config's `graphs:` block (with `OMNIGRAPH_CONFIG`, when no `OMNIGRAPH_TARGET_URI`). |
+| `OMNIGRAPH_CLUSTER` | Cluster boot source — a config directory or a storage-root URI, forwarded as `--cluster`. The only boot source. |
 | `OMNIGRAPH_BIND` | Listen address (default `0.0.0.0:8080`). |

-`OMNIGRAPH_TARGET_URI` and `OMNIGRAPH_CONFIG` **compose**: set both to keep the
-graph URI in the env var while loading policy from the config file (the
-positional URI wins over any `graphs:` entry). To enable Cedar policy on a
-container otherwise driven by `OMNIGRAPH_TARGET_URI`, mount the config dir and
-add `OMNIGRAPH_CONFIG`:
-
-```bash
-docker run --rm -p 8080:8080 \
-  -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
-  -e OMNIGRAPH_TARGET_URI="s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0" \
-  -e OMNIGRAPH_CONFIG="/etc/omnigraph/omnigraph.yaml" \
-  -v "$PWD/config:/etc/omnigraph:ro" \
-  omnigraph-server:local
-# /etc/omnigraph/omnigraph.yaml contains `policy: { file: policy.yaml }`;
-# policy.yaml (+ optional policy.tests.yaml) sit beside it in the mount.
-```
+Per-graph and server-level Cedar policy come from the cluster's applied
+revision (authored in `cluster.yaml` and published with `cluster apply`),
+not from a separate config file. The cluster docker shapes — volume vs.
+config-free object-storage root — are detailed under
+[Cluster Mode in Containers](#cluster-mode-in-containers-aws-railway) above.

 ## Auth

--- a/docs/user/embeddings.md
+++ b/docs/user/embeddings.md
@ -1,31 +0,0 @@
-# Embeddings
-
-OmniGraph has **two** embedding clients with different defaults and purposes.
-
-## Compiler-side client (`omnigraph-compiler/src/embedding.rs`) — query-time normalization
-
- Default model: `text-embedding-3-small` (OpenAI-style schema)
- Env: `NANOGRAPH_EMBED_MODEL`, `OPENAI_API_KEY`, `OPENAI_BASE_URL` (default `https://api.openai.com/v1`), `NANOGRAPH_EMBEDDINGS_MOCK`, `NANOGRAPH_EMBED_TIMEOUT_MS=30000`, `NANOGRAPH_EMBED_RETRY_ATTEMPTS=4`, `NANOGRAPH_EMBED_RETRY_BACKOFF_MS=200`
- Methods: `embed_text(input, expected_dim)`, `embed_texts(inputs, expected_dim)`
- Mock mode: deterministic FNV-1a + xorshift64 → L2-normalized vectors
-
-## Engine-side client (`omnigraph/src/embedding.rs`) — runtime ingest
-
- Model: `gemini-embedding-2-preview`
- Env: `GEMINI_API_KEY`, `OMNIGRAPH_GEMINI_BASE_URL` (default Google generativelanguage v1beta), `OMNIGRAPH_EMBED_TIMEOUT_MS=30000`, `OMNIGRAPH_EMBED_RETRY_ATTEMPTS=4`, `OMNIGRAPH_EMBED_RETRY_BACKOFF_MS=200`, `OMNIGRAPH_EMBEDDINGS_MOCK`
- Two task types: `embed_query_text` (RETRIEVAL_QUERY) and `embed_document_text` (RETRIEVAL_DOCUMENT)
- Exponential backoff with retryable detection (timeouts, 429, 5xx)
-
-## Schema integration
-
-Mark a Vector property with `@embed("source_text_property")`. At ingest, the engine pulls the source text and writes the embedding into the vector column. Stored as L2-normalized FixedSizeList(Float32, dim).
-
-## CLI `omnigraph embed` (offline file pipeline)
-
-Operates on **JSONL files** (not on a graph). Three modes (mutually exclusive):
-
- (default) `fill_missing` — only embed rows whose target field is empty
- `--reembed-all` — overwrite all
- `--clean` — strip embeddings
-
-Inputs are either a single seed manifest YAML or `--input/--output/--spec`. Selectors `--type T`, `--select T:field=value` filter rows. Streams JSONL → JSONL.
--- a/docs/user/index.md
+++ b/docs/user/index.md
@ -2,44 +2,68 @@

 **Audience:** users, CLI users, HTTP clients, and self-hosting operators

-This is the public-facing entry point. These docs should describe behavior,
-commands, configuration, and operational contracts without requiring knowledge
-of MRs, internal recovery mechanics, or contributor-only invariants.
+This is the public-facing entry point. These docs describe behavior, commands,
+configuration, and operational contracts without requiring knowledge of internal
+recovery mechanics or contributor-only invariants. They are organized by topic —
+start with install, then follow the section that matches your task.

-## Start Here
+## Start here

 | Goal | Read |
 |---|---|
 | Install OmniGraph | [install.md](install.md) |
-| Run the CLI locally | [cli.md](cli.md) |
-| Look up every CLI flag and config field | [cli-reference.md](cli-reference.md) |
-| Deploy and operate a cluster (how-to guide) | [cluster.md](cluster.md) |
-| Validate and plan cluster config | [cluster-config.md](cluster-config.md) |
-| Write schemas | [schema-language.md](schema-language.md) |
-| Read schema-lint diagnostic codes | [schema-lint.md](schema-lint.md) |
-| Write queries and mutations | [query-language.md](query-language.md) |
-| Use embeddings | [embeddings.md](embeddings.md) |
+| Run the core loop end to end | [quickstart.md](quickstart.md) |
+| Understand the model | [concepts/index.md](concepts/index.md) |
+| Run the CLI | [cli/index.md](cli/index.md) |
+| Look up every CLI flag and config field | [cli/reference.md](cli/reference.md) |

-## Operate A Graph
+## Schema & queries

 | Goal | Read |
 |---|---|
-| Understand graph layout and URI support | [storage.md](storage.md) |
-| Work with branches, commits, and snapshots | [branches-commits.md](branches-commits.md) |
-| Coordinate multi-query workflows | [transactions.md](transactions.md) |
-| Read diffs and change feeds | [changes.md](changes.md) |
-| Build and use indexes | [indexes.md](indexes.md) |
-| Compact and clean old versions | [maintenance.md](maintenance.md) |
-| Interpret errors and output formats | [errors.md](errors.md) |
+| Write schemas (the `.pg` language) | [schema/index.md](schema/index.md) |
+| Read schema-lint diagnostic codes | [schema/lint.md](schema/lint.md) |
+| Write queries (the `.gq` language) | [queries/index.md](queries/index.md) |
+| Write data — inserts, updates, deletes | [mutations/index.md](mutations/index.md) |
+| Use vector / full-text / hybrid search | [search/index.md](search/index.md) |
+| Generate embeddings | [search/embeddings.md](search/embeddings.md) |
+| Build and use indexes | [search/indexes.md](search/indexes.md) |

-## Run The Server
+## Branching & version control
+
+| Goal | Read |
+|---|---|
+| Work with branches and commits | [branching/index.md](branching/index.md) |
+| Read past versions (time travel) | [branching/time-travel.md](branching/time-travel.md) |
+| Merge branches and resolve conflicts | [branching/merge.md](branching/merge.md) |
+| Coordinate multi-query workflows | [branching/transactions.md](branching/transactions.md) |
+| Read diffs and change feeds | [branching/changes.md](branching/changes.md) |
+
+## Operations

 | Goal | Read |
 |---|---|
 | Deploy the binary or container | [deployment.md](deployment.md) |
-| Use HTTP endpoints | [server.md](server.md) |
-| Configure Cedar authorization | [policy.md](policy.md) |
-| Track actors and audit behavior | [audit.md](audit.md) |
+| Use HTTP endpoints | [operations/server.md](operations/server.md) |
+| Compact, repair, and clean old versions | [operations/maintenance.md](operations/maintenance.md) |
+| Configure Cedar authorization | [operations/policy.md](operations/policy.md) |
+| Track actors and audit behavior | [operations/audit.md](operations/audit.md) |
+| Interpret errors and output formats | [operations/errors.md](operations/errors.md) |
+
+## Clusters
+
+| Goal | Read |
+|---|---|
+| Deploy and operate a cluster (how-to) | [clusters/index.md](clusters/index.md) |
+| Reference every `cluster.yaml` key and command | [clusters/config.md](clusters/config.md) |
+
+## Concepts & reference
+
+| Goal | Read |
+|---|---|
+| Understand the model and L1/L2 framing | [concepts/index.md](concepts/index.md) |
+| Understand graph layout and URI support | [concepts/storage.md](concepts/storage.md) |
+| Look up constants and tunables | [reference/constants.md](reference/constants.md) |

 ## Releases

@ -48,7 +72,6 @@ changes between versions, not for contributor design history.

 ## Boundary

-User docs should focus on stable behavior. If a paragraph needs to explain
-internal sidecars, Lance API blockers, MR numbers, test strategy, or review
-rules, it probably belongs in [docs/dev/index.md](../dev/index.md) or a developer-area document
-instead.
+User docs focus on stable behavior. If a paragraph needs to explain internal
+sidecars, Lance API blockers, or test strategy, it probably belongs in
+[docs/dev/index.md](../dev/index.md) or a developer-area document instead.
--- a/docs/user/indexes.md
+++ b/docs/user/indexes.md
@ -1,26 +0,0 @@
-# Indexes
-
-## L1 — Lance index types OmniGraph exposes
-
-| Index | Use | Notes |
-|---|---|---|
-| **BTREE scalar** | range / equality on any scalar | created on `@key`, `@index(...)`, and on key columns by `ensure_indices()` |
-| **Inverted (FTS)** | `search`, `fuzzy`, `match_text`, `bm25` | created on text columns referenced by FTS queries |
-| **Vector** | `nearest()` k-NN | Lance picks IVF_PQ vs HNSW family by configuration; OmniGraph stores as FixedSizeList(Float32, dim) |
-
-## L2 — OmniGraph orchestration
-
- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run.
- Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
- **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
- Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.
-
-## L2 — Graph topology index (`graph_index/mod.rs`)
-
-This is OmniGraph-specific (not Lance):
-
- `TypeIndex`: dense `u32 ↔ String id` mapping per node type.
- `CsrIndex`: Compressed Sparse Row representation of edges per edge type — `offsets[i]..offsets[i+1]` slices into `targets`.
- `GraphIndex { type_indices, csr (out), csc (in) }` — built on demand from a snapshot's edge tables, **lazily**: only when an `Expand` the planner routes to the CSR path (dense / large frontier) or an `AntiJoin` actually needs it.
- Cached in `RuntimeCache::graph_indices` (LRU, max 8 entries, keyed by snapshot id + edge table versions).
- Selective `Expand`s resolve neighbors from the persisted `src`/`dst` BTREE instead (one indexed scan per hop) and never trigger the CSR build; see [query-language](query-language.md) → Expand. Pure scans, and queries served entirely by the indexed traversal path, skip it.
--- a/docs/user/maintenance.md
+++ b/docs/user/maintenance.md
@ -1,47 +0,0 @@
-# Maintenance: Optimize, Repair & Cleanup
-
-`db/omnigraph/optimize.rs` and `db/omnigraph/repair.rs`.
-
-## `optimize_all_tables(db)` — non-destructive
-
- Lance `compact_files()` on every node + edge table on `main`, then **publishes the compacted version to the `__manifest`** so the manifest's `table_version` tracks the compacted Lance HEAD. Reads pin the manifest version, so without this publish compaction would be invisible to readers *and* would break the HEAD-vs-manifest precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually compacted.
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests until `cleanup` runs.
- Each table's compact→publish runs under its per-`(table, main)` write queue (serializing with concurrent mutations — compaction is a Lance `Rewrite` op that retryable-conflicts with a concurrent merge/update/delete on overlapping fragments). The Lance-HEAD-before-manifest-publish gap is covered by a `SidecarKind::Optimize` recovery sidecar (loose-match): a crash in that window rolls the compacted version forward on the next `Omnigraph::open` (compaction is content-preserving, so roll-forward is always safe).
- **Requires a recovered graph.** `optimize` refuses (errors) when an unresolved recovery sidecar is present under `__recovery` — operating on an unrecovered graph could publish a partial write the open-time recovery sweep would roll back. Reopen the graph to run the recovery sweep, then re-run `optimize`.
- **Uncovered drift is skipped, not interpreted.** If a table's Lance HEAD is ahead of the version recorded in `__manifest` and no recovery sidecar covers that movement, `optimize` reports `skipped: Some(DriftNeedsRepair)` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
- Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version }]`.
- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: Some(BlobColumnsUnsupportedByLance)` (and logged via `tracing::warn`) instead of compacted, and the rest of the sweep proceeds normally. The current Lance `compact_files` mis-decodes blob-v2 columns under its forced `BlobHandling::AllBinary` read; **reads and writes are unaffected** — only compaction is. This is gated by `LANCE_SUPPORTS_BLOB_COMPACTION` (`db/omnigraph/optimize.rs`) and removed when the upstream Lance fix lands (see [docs/dev/lance.md](../dev/lance.md)). Consequence: fragment count and deleted-row space on blob tables are not reclaimed until then; query results are never affected.
-
-## `repair_all_tables(db, options)` — explicit
-
- Handles **uncovered manifest/head drift**: a table's Lance HEAD is ahead of the manifest pin and no recovery sidecar records the writer intent.
- Preview by default. `omnigraph repair --json <uri>` reports each table's `classification`, `action`, manifest/head versions, Lance operation names, and any classification error. `--confirm` publishes only verified maintenance drift; if any suspicious or unverifiable table is refused, the CLI prints the per-table output and exits non-zero. `--force --confirm` also publishes suspicious or unverifiable drift after operator review.
- Classifies drift by reading Lance transactions from `manifest_version + 1` through `lance_head_version`. Only `ReserveFragments` and `Rewrite` are verified maintenance. Semantic operations such as `Append`, `Delete`, `Update`, `Merge`, or missing transaction history are not auto-healed.
- Publishes repair by advancing `__manifest` to the existing Lance HEAD; it does **not** rewrite Lance data. If the publish succeeds, normal reads and strict writes use the repaired version. If it fails, no new data-side partial state was created.
- Requires a clean recovery state. Pending `__recovery` sidecars still belong to automatic sidecar recovery, not manual repair.
-
-## `cleanup_all_tables(db, options)` — destructive
-
- Lance `cleanup_old_versions()` per table.
- Removes manifests (and their unique fragments) older than the retention policy.
- `CleanupPolicyOptions { keep_versions: Option<u32>, older_than: Option<Duration> }` — at least one is required.
- Returns `[TableCleanupStats { table_key, bytes_removed, old_versions_removed, error }]`.
- **Fault-isolated per table.** A single table's transient failure (version GC or
-  orphan reclaim) is recorded on that table's stats row (`error: Some(..)`, logged
-  via `tracing`) and never aborts the healthy tables — cleanup is the convergence
-  backstop, so it does as much as it can and converges on re-run. The CLI reports
-  any failed tables; rerun `cleanup` to retry them.
- CLI guards with `--confirm`; without it, prints a preview line.
- **Recovery floor:** `--keep < 3` may garbage-collect Lance versions that the open-time recovery sweep needs as a rollback target (the sweep restores to the branch's manifest-pinned table version, which is HEAD-1 in the typical Phase B → Phase C drift case). Default `--keep 10` is safe.
- **Orphaned-branch reconciliation:** before the version GC, cleanup runs `reconcile_orphaned_branches`, which `force_delete_branch`es any per-table or commit-graph Lance branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](branches-commits.md)). The reconciler is authority-derived and idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged via `tracing::info`.
-
-## Tombstones
-
-Logical sub-table delete markers in `__manifest`; `tombstone_object_id(table_key, version)` excludes a sub-table version from snapshot reconstruction.
-
-## Internal schema migrations (`db/manifest/migrations.rs`)
-
-Version evolutions of the on-disk `__manifest` shape are reconciled automatically on the first write under a new binary. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape the binary expects; the on-disk stamp `omnigraph:internal_schema_version` (Lance schema-level metadata) records the on-disk shape. The publisher's open-for-write path calls `migrate_internal_schema` before reading state; reads are side-effect-free. No operator action is required for in-place upgrades. See [storage.md → Internal schema versioning](storage.md) for the full mechanism.
-
-A binary opening a manifest stamped at a version *higher* than it knows about refuses to publish with a clear "upgrade omnigraph first" error — old binaries cannot clobber a newer schema.
--- a/docs/user/mutations/index.md
+++ b/docs/user/mutations/index.md
@ -0,0 +1,52 @@
+# Mutations
+
+Write statements live inside a `query` declaration whose body is one or more
+mutation statements (the [query language](../queries/index.md) covers the read
+shape and shared declaration syntax).
+
+```
+query onboard($name: String, $title: String) {
+  insert Person { name: $name, title: $title }
+}
+```
+
+An edge type is inserted the same way — its endpoint columns are just
+properties in the assignment block (`insert WorksAt { person: $p, org: $o }`).
+
+## Statements
+
+- `insert <Type> { prop: <value>, … }`
+- `update <Type> set { prop: <value>, … } where <prop> <op> <value>`
+- `delete <Type> where <prop> <op> <value>`
+
+`<value>` is a literal, `$param`, or `now()`.
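+
+As a hedged illustration of the `update` form with a `$param` and `now()` as
+values (a sketch only: the `updated_at` property is hypothetical, and `=` is
+assumed to be an accepted `<op>`):
+
+```
+query retitle($name: String, $title: String) {
+  update Person set { title: $title, updated_at: now() } where name = $name
+}
+```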
+
+## Atomicity
+
+A change query publishes **one commit** at the end of the query. Multiple
+insert/update statements accumulate in memory and commit together — a mid-query
+failure leaves the graph untouched. See [transactions](../branching/transactions.md)
+for the per-query atomicity contract and [branches](../branching/index.md) for
+multi-query workflows.
+
+## Inserts/updates and deletes cannot mix in one query
+
+A single change query must be **either insert/update-only or delete-only**.
+Mixing the two is rejected at parse time, before any I/O:
+
+> `mutation '<name>' on the same query mixes inserts/updates and deletes; split
+> into separate mutations: (1) inserts and updates, then (2) deletes.`
+
+Run two separate queries instead — the inserts/updates first, then the deletes.
+The restriction exists because inserts/updates and deletes commit through
+different paths today, and mixing them in one query creates ordering hazards
+(e.g. a same-row insert-then-delete, or a cascading delete of a just-inserted
+edge). Keeping the two kinds in separate queries keeps each one atomic and
+correct.
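+
+A minimal sketch of the split, reusing the `Person` type from the first example.
+It assumes `=` is an accepted `<op>` and that statements in one body are
+newline-separated; the query names are illustrative, not part of any shipped
+schema:
+
+```
+query onboard_and_retitle($name: String, $title: String, $old: String) {
+  insert Person { name: $name, title: $title }
+  update Person set { title: $title } where name = $old
+}
+
+query offboard($name: String) {
+  delete Person where name = $name
+}
+```
+
+Run the first query, then the second; each publishes its own commit.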
+
+## Bulk loading
+
+For loading data from files rather than inline statements, use
+[`omnigraph load`](../cli/index.md) (`--mode overwrite|append|merge`) — it is the
+single bulk-write command and applies the same schema validation and atomic
+publish as inline mutations.
--- a/docs/user/operations/audit.md
+++ b/docs/user/operations/audit.md
@ -0,0 +1,46 @@
+# Audit & Actor Tracking
+
+Every write in OmniGraph records **who made it**. The actor id is persisted on the
+graph commit, so the commit history is an audit trail of which actor changed the
+graph and when.
+
+## Where the actor comes from
+
+The actor is resolved differently depending on the front end, but it always lands
+on the commit:
+
+- **HTTP server** — the actor is resolved **server-side from the bearer token**. A
+  client cannot set its own actor id; it is derived from the authenticated token.
+  See [policy](policy.md) for how tokens map to actors.
+- **CLI / embedded** — the actor is self-declared through one resolution chain:
+
+  1. `--as <actor>` on the command,
+  2. then `operator.actor` in `~/.omnigraph/config.yaml` (see the
+     [CLI reference](../cli/reference.md)),
+  3. otherwise none.
+
+This difference is intentional: storage credentials imply a self-declared actor,
+while a server resolves the actor from a token it trusts.
+
+## Reading the audit trail
+
+Actor ids are stored on each commit in the [commit graph](../branching/index.md).
+List commits to see who made each change:
+
+```bash
+omnigraph commit list graph.omni
+```
+
+System-initiated writes use reserved actor ids — for example, automatic recovery
+of an interrupted write records `omnigraph:recovery`, so operator changes and
+machine repairs are distinguishable in the history:
+
+```bash
+omnigraph commit list --filter actor=omnigraph:recovery graph.omni
+```
+
+## What is tracked
+
+Every successful publish (load, change, branch merge, or schema apply) appends a
+commit carrying the resolved actor. Because publishes are atomic, the actor on a
+commit is exactly the actor responsible for that whole change.
--- a/docs/user/operations/errors.md
+++ b/docs/user/operations/errors.md
@ -9,7 +9,7 @@
 - `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, details: Option<ManifestConflictDetails>, … })`
  - `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }` — caller's `expected_table_versions` did not match the manifest's current latest non-tombstoned version (set by `OmniError::manifest_expected_version_mismatch`).
  - `ManifestConflictDetails::RowLevelCasContention` — Lance row-level CAS rejected the publish because a concurrent writer landed the same `object_id`. Retried internally by the publisher; only surfaces if the retry budget exhausts.
-  - **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/user/query-language.md](query-language.md) for the rule and [docs/dev/writes.md](../dev/writes.md) for the underlying staged-write rationale.
+  - **D₂ parse-time rejection**: a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [query-language.md](../queries/index.md) for the rule.
 - `MergeConflicts(Vec<MergeConflict>)`

 Compiler-side `NanoError` covers parse / catalog / type / storage / plan / execution / arrow / lance / IO / manifest / unique-constraint, each with structured spans (`SourceSpan { start, end }`) for ariadne-style diagnostics.
--- a/docs/user/operations/maintenance.md
+++ b/docs/user/operations/maintenance.md
@ -0,0 +1,50 @@
+# Maintenance: Optimize, Repair & Cleanup
+
+**Addressing.** `optimize`, `repair`, and `cleanup` are **direct** (storage-native) CLI commands: they run with direct storage access against a positional `file://`/`s3://` URI or **`--cluster <dir|s3://…> --graph <id>`** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` or a remote (`http(s)://`) URI with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command capabilities* section of [cli-reference.md](../cli/reference.md).
+
+## `optimize` — non-destructive
+
+- Compacts every node + edge table on `main`, then reindexes them, then **publishes the resulting version to the `__manifest`** so the manifest's recorded version tracks the compacted-and-reindexed state. Reads pin the manifest version, so without this publish the work would be invisible to readers *and* would break the version precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually changed.
+- Rewrites small fragments into fewer large ones; old fragments remain reachable via older versions until `cleanup` runs.
+- **Reindex (index coverage maintenance).** A scalar/FTS/vector index only covers the fragments it was built over. Rows appended after the index was built (e.g. by `load --mode merge`, whose commit does not rebuild an already-existing index) are scanned unindexed, and compaction itself rewrites fragments out of an index's coverage. `optimize` runs Lance's incremental `optimize_indices` after compaction to fold those fragments back in (a delta merge, not a full retrain), restoring full coverage so equality/range/traversal predicates stay index-accelerated. This is why a table with **no compaction work but stale index coverage still commits** a new version under `optimize`. Run `optimize` on a cadence at least as frequent as your freshness window so recently-loaded rows do not linger in the unindexed flat-scan tail.
+- **Create declared-but-missing indexes (the index reconciler).** `@index`/`@key` declares intent; `schema apply` records it but builds nothing, and `load`/`mutate` defer a column that cannot be built yet (a `Vector` column with no trainable vectors). `optimize` materializes any such declared-but-unbuilt index over the compacted layout — so it is the convergence path for an `@index` added after data exists, or a vector index whose embeddings arrived via a later `embed`. A column still not buildable (no vectors yet) is reported on the table's stat as `pending_indexes` (visible in `--json`), not treated as a failure; the next `optimize` retries. So `optimize` is the single operator-facing index reconciler: it compacts, restores coverage, **and** builds declared-but-missing indexes.
+- Each table's compact→reindex→publish serializes with concurrent mutations on the same table. A crash mid-operation is recovered automatically on the next open (both compaction and reindex are content-preserving, so roll-forward is always safe).
+- **Requires a recovered graph.** `optimize` refuses (errors) when a pending crash-recovery operation is present — operating on an unrecovered graph could publish a partial write that recovery would roll back. Reopen the graph to run recovery, then re-run `optimize`.
+- **Uncovered drift is skipped, not interpreted.** If a table's underlying version is ahead of the version recorded in `__manifest` and no crash-recovery record covers that movement, `optimize` reports `skipped: DriftNeedsRepair` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
+- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
+- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version, pending_indexes` (the last lists any declared `@index` column the reconciler could not build this run, with the reason — e.g. a vector column with no trainable vectors yet).
+- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: BlobColumnsUnsupportedByLance` (and logged) instead of compacted, and the rest of the sweep proceeds normally. **Reads and writes are unaffected** — only compaction is. Consequence: fragment count and deleted-row space on blob tables are not reclaimed; query results are never affected. A skipped blob table is also **not reindexed** in the same sweep (the skip happens before the reindex step), so its index coverage on appended rows is not refreshed by `optimize` today.
+
+## `repair` — explicit
+
+- Handles **uncovered manifest/head drift**: a table's underlying version is ahead of the manifest pin and no crash-recovery record explains the movement.
+- Preview by default. `omnigraph repair --json <uri>` reports each table's `classification`, `action`, manifest/head versions, underlying operation names, and any classification error. `--confirm` publishes only verified maintenance drift; if any suspicious or unverifiable table is refused, the CLI prints the per-table output and exits non-zero. `--force --confirm` also publishes suspicious or unverifiable drift after operator review.
+- Classifies drift by reading the table's transaction history from `manifest_version + 1` through the current head. Only fragment-reservation and rewrite (compaction) operations are verified maintenance. Semantic operations such as append, delete, update, merge, or missing transaction history are not auto-healed.
+- Publishes repair by advancing `__manifest` to the existing head; it does **not** rewrite data. If the publish succeeds, normal reads and strict writes use the repaired version. If it fails, no new data-side partial state was created.
+- Requires a clean recovery state. A pending crash-recovery operation still belongs to automatic recovery, not manual repair.
+
+## `cleanup` — destructive
+
+- Garbage-collects old versions per table.
+- Removes versions (and their unique fragments) older than the retention policy.
+- Policy options `keep_versions` and `older_than` — at least one is required.
+- Returns per-table stats: `table_key, bytes_removed, old_versions_removed, error`.
+- **Fault-isolated per table.** A single table's transient failure (version GC or
+  orphan reclaim) is recorded on that table's stats row (with an `error`) and logged,
+  and never aborts the healthy tables — cleanup is the convergence
+  backstop, so it does as much as it can and converges on re-run. The CLI reports
+  any failed tables; rerun `cleanup` to retry them.
+- CLI guards with `--confirm`; without it, prints a preview line.
+- **Non-local consent (RFC-011 D9).** Against a non-local target (an `s3://` store/cluster), `cleanup` additionally requires `--yes` on top of `--confirm`: a TTY is prompted, and a non-interactive run (no TTY, or `--json`) refuses rather than destroying. A local (`file://`) target needs only `--confirm`. The same `--yes` gate applies to overwrite `load` and `branch delete`; every maintenance run echoes its resolved target to stderr (suppress with `--quiet`).
+- **Recovery floor:** `--keep < 3` may garbage-collect versions that crash recovery needs as a rollback target. Default `--keep 10` is safe.
+- **Orphaned-branch reconciliation:** before the version GC, cleanup reclaims any per-table or commit-graph branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](../branching/index.md)). The reconciler is idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged.
+
+## Tombstones
+
+Logical sub-table delete markers in `__manifest` that exclude a sub-table version from snapshot reconstruction.
+
+## Internal schema migrations
+
+Version evolutions of the on-disk `__manifest` shape are reconciled automatically on the first write under a new binary. An on-disk stamp records the shape; the binary migrates it forward before reading state, and reads are side-effect-free. No operator action is required for in-place upgrades. See [storage.md → Internal schema versioning](../concepts/storage.md) for the full mechanism.
+
+A binary opening a manifest stamped at a version *higher* than it knows about refuses to publish with a clear "upgrade omnigraph first" error — old binaries cannot clobber a newer schema.
--- a/docs/user/operations/policy.md
+++ b/docs/user/operations/policy.md
@ -13,14 +13,14 @@ Per-graph actions (bind to `Omnigraph::Graph::"<graph_id>"`):
 5. `branch_create`
 6. `branch_delete`
 7. `branch_merge`
-8. `admin` — reserved for policy-management surfaces (hot reload, audit log, approvals). No call site today; see MR-724 for the reservation rationale.
+8. `admin` — reserved for policy-management surfaces (hot reload, audit log, approvals). No call site today.
 9. `invoke_query` — gates invoking a server-side stored query (the `queries:` registry). Graph-scoped (like `admin`) — per-branch access is enforced by the inner `read` / `change` gate, so a rule that sets `branch_scope` on `invoke_query` is rejected. Coarse in this release: an `invoke_query` allow rule permits any stored query on the graph; a future, additive refinement adds an optional per-query-name scope without changing rules written against the coarse action. Enforced at `POST /queries/{name}` (see [server](server.md)). A stored *mutation* is double-gated: `invoke_query` to reach the tool, plus `change` for the write itself (the engine `_as` writers still enforce per the query body).

 Server-scoped action (v0.6.0+; binds to `Omnigraph::Server::"root"`):

 10. `graph_list` — `GET /graphs` registry enumeration (multi-graph mode)

-Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` are reserved but not shipped in v0.6.0; operators add/remove graphs by editing `omnigraph.yaml` and restarting.)
+Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` over HTTP are reserved but not shipped; operators add/remove graphs by editing the cluster's `cluster.yaml`, running `omnigraph cluster apply`, and restarting the server.)

 ## Scope kinds

@ -28,38 +28,34 @@ Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — the
 - `target_branch_scope` — applied to destination (`schema_apply`, branch ops, run ops)
 - `protected_branches` — named list with special rules; rule scopes are `any | protected | unprotected`

-## Per-graph vs. server-level policy (multi-graph mode)
+## Per-graph vs. server-level policy

-In multi mode (`omnigraph.yaml` with a non-empty `graphs:` map), policy files attach at two levels:
+A server boots from a cluster (`--cluster <dir>`), and the cluster's
+`cluster.yaml` declares its policy bundles in a `policies:` section. Each bundle
+names the scopes it `applies_to`: a graph id (per-graph rules — `read`, `change`,
+`branch_*`, `schema_apply`) or the literal `cluster` (server-level rules —
+`graph_list`).

 ```yaml
-server:
-  policy:
-    file: server-policy.yaml          # server-level: graph_list
-
-graphs:
+# cluster.yaml
+policies:
+  base:
+    file: base.policy.yaml
+    applies_to: [cluster, knowledge]   # cluster-level + the `knowledge` graph
  alpha:
-    uri: s3://tenant-bucket/alpha
-    policy:
-      file: policies/alpha.yaml       # per-graph: read, change, branch_*, schema_apply
-  beta:
-    uri: s3://tenant-bucket/beta
-    # no per-graph policy → no engine-layer Cedar enforcement on beta
+    file: policies/alpha.yaml
+    applies_to: [alpha]                # per-graph: alpha only
 ```

-**Config follows graph identity, not server mode.** A graph served by **name**
-(`--target <name>` or `server.graph`) uses its own `graphs.<name>.policy.file`,
-exactly as in multi-graph mode. Top-level `policy.file` applies only to an
-**anonymous** graph — one served by a bare `<URI>` with no `graphs:` entry.
-Serving a **named** graph (single- or multi-graph mode) while top-level
-`policy.file` (or `queries:`) is populated **refuses boot**, naming the block,
-since the top-level value would otherwise be silently shadowed by the per-graph
-block. Move per-graph rules to `graphs.<graph_id>.policy.file` and `graph_list`
-rules to `server.policy.file`.
+A graph with no bundle bound to it has no engine-layer Cedar enforcement. Each
+graph's HTTP request flows through its bound bundle; the management endpoint
+(`GET /graphs`) flows through the `cluster`-scoped bundle. When no bundle binds
+`cluster`, `GET /graphs` is denied in every runtime state, including
+`--unauthenticated`; with bearer tokens configured it returns 403 after admission
+control because `graph_list` is not a `read`-equivalent action. The operator must
+bind a `cluster`-scoped bundle granting `graph_list` to expose `/graphs`.

-Each graph's HTTP request flows through its own per-graph policy. The management endpoint (`GET /graphs`) flows through the server-level policy. When `server.policy.file` is unset, `GET /graphs` is denied in every runtime state, including `--unauthenticated`; with bearer tokens configured, it returns 403 after admission control because `graph_list` is not a `read`-equivalent action. The operator must explicitly authorize via `server-policy.yaml` to expose `/graphs`.
-
-Example server-level policy:
+Example `cluster`-scoped bundle:

 ```yaml
 version: 1
@ -72,40 +68,28 @@ rules:
      actions: [graph_list]
 ```

-## Configuration
+Each per-graph rule may use at most one of `branch_scope` or
+`target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they
+have no branch context.

-`omnigraph.yaml`:
+## Actor for direct-engine writes

-```yaml
-policy:
-  file: policy.yaml          # Cedar rules + groups
-  tests: policy.tests.yaml   # declarative test cases
-
-cli:
-  actor: act-andrew            # default actor for CLI direct-engine writes
-```
-
-Each per-graph rule may use at most one of `branch_scope` or `target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they have no branch context.
-
-`cli.actor` is the default actor identity for CLI direct-engine writes
-when `policy.file` is configured. Override per-invocation with `--as
-<ACTOR>` (top-level flag) — `--as` wins, otherwise `cli.actor` is used,
-otherwise no actor. With policy configured and neither set, the
-engine-layer footgun guard intentionally denies the write (silent bypass
-via "I forgot the actor" is exactly what the guard prevents). Remote
-HTTP writes ignore both — they resolve their actor server-side from the
-bearer token.
+The default actor identity for CLI direct-engine (`--store`) writes is
+`operator.actor` in `~/.omnigraph/config.yaml`. Override per-invocation with
+`--as <ACTOR>` — `--as` wins, otherwise `operator.actor`, otherwise no actor.
+Remote HTTP writes ignore both — they resolve their actor server-side from the
+bearer token. (Direct-store access carries no Cedar policy under RFC-011; policy
+lives in the cluster/server.)

 ## CLI

-Policy tooling resolves its graph like server single-mode policy: `cli.graph`
-wins, otherwise `server.graph` is used, otherwise the top-level `policy.file`
-is validated/tested/explained as the anonymous policy.
+Policy tooling reads a cluster's applied policy bundles: pass `--cluster <dir>`,
+and `--graph <id>` to pick a graph's bundle when several apply.

 - `omnigraph policy validate` — parse + count actors, exit 1 on parse error.
- `omnigraph policy test` — run cases in `policy.tests.yaml`, exit 1 on any expectation mismatch.
+- `omnigraph policy test --tests <file>` — run the declarative cases in `<file>` against the selected bundle, exit 1 on any expectation mismatch.
 - `omnigraph policy explain --actor … --action … [--branch …] [--target-branch …]` — show decision and matched rule.
- `omnigraph --as <ACTOR> <subcommand>` — set the actor for the duration of one invocation. Effective for `change`, `load` (and its deprecated `ingest` alias), `branch create|delete|merge`, and `schema apply` against local URIs. No-op against remote HTTP URIs (actor is bearer-token-resolved server-side).
+- `omnigraph --as <ACTOR> <subcommand>` — set the actor for the duration of one invocation. Effective for `change`, `load` (and its deprecated `ingest` alias), `branch create|delete|merge`, and `schema apply` against a direct (`--store`) graph. **Rejected** on a served write (`--server`): the actor is bearer-token-resolved server-side, so `--as` can't set it there.

 ## Enforcement

@ -113,42 +97,38 @@ Policy is a property of the **engine**, not the transport. Every mutating
 write — `mutate_as`, `load_as` (the deprecated `ingest_as` shims route
 through it), `apply_schema_as`,
 `branch_create_as`, `branch_create_from_as`, `branch_delete_as`,
-`branch_merge_as` — calls `Omnigraph::enforce(action, scope, actor)` at
-the head of the method. The gate fires identically whether the call
+`branch_merge_as` — consults the policy gate at the head of the method.
+The gate fires identically whether the call
 originates from the HTTP server, the CLI, or an embedded SDK consumer.
-When no `PolicyChecker` is installed (the dev/embedded default) the gate
+When no policy is installed (the dev/embedded default) the gate
 is a strict no-op; when one is installed and the call site forgets to
 thread an actor through, the gate fails closed rather than silently
 bypassing.

-## Server runtime states (MR-723)
+## Server runtime states

 The HTTP server classifies its startup configuration into one of three
 states based on whether bearer tokens are configured and whether a
 policy file is set. The state determines what happens to a request that
-reaches `authorize_request()` without a matching policy permit.
+reaches the authorization gate without a matching policy permit.

 | State | Tokens | Policy file | Behavior |
 |---|---|---|---|
 | **Open** | no | no | Every request is permitted. Refuses to start unless `--unauthenticated` or `OMNIGRAPH_UNAUTHENTICATED=1` is set — the operator must explicitly opt in. |
 | **DefaultDeny** | yes | no | Every authenticated request for an action other than `read` is rejected with HTTP 403. Closes the "tokens but forgot the policy file" trap: an operator who set up auth but forgot to point at a policy file used to ship the illusion of protection. |
-| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require `server.policy.file`. |
+| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require a `cluster`-scoped policy bundle. |

-The classifier is `classify_server_runtime_state` in
-`crates/omnigraph-server/src/lib.rs`; it returns `Err` for the "no
-tokens, no policy, no flag" cell and for "policy file, no tokens" so the
-server refuses to start instead of silently shipping an open instance or
-a policy-protected server that can only 401. Tests pin every cell of the
-matrix and the State-2 deny path.
+The server refuses to start for the "no tokens, no policy, no flag" cell
+and for "policy file, no tokens" — instead of silently shipping an open
+instance or a policy-protected server that can only 401.

-Server-side, `authorize_request()` still runs at the HTTP boundary —
+Server-side, request authorization still runs at the HTTP boundary —
 that's where actor identity is resolved from the bearer token and where
 admission control / per-actor rate limits live. Engine-layer enforcement
 is the **defense in depth** layer: it catches CLI direct-engine writes,
 embedded SDK consumers, and any future transport that hasn't (or won't)
-re-implement HTTP's authorize_request. Both layers consult the same
-Cedar policy via the same `PolicyChecker` trait, so decisions cannot
-disagree.
+re-implement the HTTP boundary's authorization. Both layers consult the same
+Cedar policy, so decisions cannot disagree.

 ## Coarse vs. fine enforcement

@ -157,19 +137,19 @@ responsibilities:

 | Layer | Question it answers | Where it fires |
 |---|---|---|
-| **Engine-layer (coarse)** | Can this actor invoke this action against this branch / branch-transition? | `Omnigraph::enforce(action, scope, actor)` at the head of every `_as` writer; one Cedar decision per call. |
-| **Query-layer (fine)** | For the rows / types this action actually touches, which can the actor see or modify? | Per-row predicates pushed into DataFusion at plan time. **Not yet implemented — see MR-725.** |
+| **Engine-layer (coarse)** | Can this actor invoke this action against this branch / branch-transition? | The policy gate at the head of every `_as` writer; one Cedar decision per call. |
+| **Query-layer (fine)** | For the rows / types this action actually touches, which can the actor see or modify? | Per-row predicates pushed into the query plan. **Not yet implemented.** |

-The engine-layer gate keeps `ResourceScope` deliberately at branch
-granularity (`Graph`, `Branch`, `TargetBranch`, `BranchTransition`).
+The engine-layer gate keeps its resource scope deliberately at branch
+granularity (graph, branch, target branch, branch transition).
 Per-type and per-row authority is the query-layer's job; conflating them
-in `ResourceScope` would create two places per-type policy could be
+in the engine-layer scope would create two places per-type policy could be
 evaluated and a drift surface between them.

 ## Actor identity (signed-claim-only)

 The actor identity used for every policy decision comes from the matched bearer token — never from a client-supplied request header, query parameter, or body field. The server resolves the token at the auth middleware boundary, looks up the actor it was minted for, and overwrites whatever the handler may have placed in the policy request. Clients cannot set `actor_id` directly.

-This is intentional. Trusting client-supplied identity for authorization is "asking the attacker if they're an admin" — Supabase's RLS history names the same footgun. The chokepoint lives in `authorize_request` in `crates/omnigraph-server/src/lib.rs` and is named in `docs/dev/invariants.md` Hard Invariant 11. A regression test asserts the contract: a request with `Authorization: Bearer <token-for-actor-A>` plus `X-Actor-Id: actor-B` always evaluates as actor A, never as actor B.
+This is intentional. Trusting client-supplied identity for authorization is "asking the attacker if they're an admin" — Supabase's RLS history names the same footgun. The chokepoint lives at the server's auth boundary: a request with `Authorization: Bearer <token-for-actor-A>` plus `X-Actor-Id: actor-B` always evaluates as actor A, never as actor B.

 If you find yourself wanting to let clients override `actor_id` for impersonation, delegation, or service-account flows — that's a feature, but it needs explicit design (e.g., signed delegation claims, an `On-Behalf-Of` audit trail). It is not a convenience knob.
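
A minimal sketch of the signed-claim-only rule from the actor-identity section in the hunk above, written in Python rather than the server's own language. The names `hash_token` and `resolve_actor` and the in-memory token table are assumptions made for illustration, not the server's API. The sketch only shows the contract: identity is derived from the bearer token (SHA-256 digests, constant-time comparison), and a client-supplied `X-Actor-Id` header is never consulted.

```python
import hashlib
import hmac
from typing import Mapping, Optional


def hash_token(token: str) -> bytes:
    """SHA-256 digest of a bearer token; only digests are kept for lookup."""
    return hashlib.sha256(token.encode("utf-8")).digest()


def resolve_actor(headers: Mapping[str, str],
                  hashed_tokens: Mapping[str, bytes]) -> Optional[str]:
    """Return the actor the bearer token was minted for, or None.

    Hypothetical helper: identity comes only from the Authorization header.
    Client-supplied hints such as X-Actor-Id are deliberately ignored.
    """
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        return None
    presented = hash_token(token)
    matched = None
    # Check every entry with a constant-time comparison instead of
    # returning early on the first match.
    for actor_id, digest in hashed_tokens.items():
        if hmac.compare_digest(presented, digest):
            matched = actor_id
    return matched


# Assumed token table: actor id -> digest of the token minted for that actor.
tokens = {"actor-A": hash_token("token-a"), "actor-B": hash_token("token-b")}
spoofed = {"Authorization": "Bearer token-a", "X-Actor-Id": "actor-B"}
print(resolve_actor(spoofed, tokens))
```

The spoofed request carries actor A's token together with an `X-Actor-Id` naming actor B, and the helper still resolves to actor A, which is the behavior the section above requires.
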
--- a/docs/user/operations/server.md
+++ b/docs/user/operations/server.md
@ -1,74 +1,67 @@
 # HTTP Server (`omnigraph-server`)

-Axum 0.8 + tokio + utoipa-generated OpenAPI. **Two modes** (v0.6.0+): single-graph (legacy) and multi-graph (MR-668), with **two boot sources** for multi mode: `omnigraph.yaml` or — exclusively — a cluster directory (`--cluster`, RFC-005). Mode is inferred from CLI args + config shape.
+Axum 0.8 + tokio + utoipa-generated OpenAPI. **Cluster-only boot** (RFC-011): the server always boots from a cluster (`--cluster <dir | s3://…>`) and serves N graphs (N ≥ 1) under cluster routes. The single-graph flat-route mode, the positional `<URI>` boot, `--target`, and booting from the `graphs:` map in `omnigraph.yaml` have all been removed. All per-graph routes are nested under `/graphs/{graph_id}/...`; `/healthz`, `/openapi.json`, and the management `/graphs` enumeration stay flat.

-## Modes
+## Boot

-### Single-graph mode (legacy)
+### Cluster boot (the only boot)

-`omnigraph-server <URI>` or `omnigraph-server --target <name> --config omnigraph.yaml`. Routes are flat — `/snapshot`, `/read`, `/branches`, etc.
+```bash
+omnigraph-server --cluster <dir | s3://…> --bind 0.0.0.0:8080
+```

-**Config follows graph identity.** A bare `<URI>` is an *anonymous* graph and uses the **top-level** `policy.file` / `queries:`. A graph chosen by **name** (`--target` / `server.graph`) uses its own `graphs.<name>.{policy.file, queries}` — the same block multi-graph mode uses. ⚠️ *Changed from v0.6.0, which always used top-level config in single mode: a named-graph config that puts `policy`/`queries` at top-level now **refuses boot** and points you at `graphs.<name>.…` (move the block there). Bare-`<URI>` single mode is unchanged.*
+`omnigraph-server --cluster <dir-or-uri>` boots from the cluster catalog's
+**applied revision**. The server resolves that revision into per-graph
+startup configs (id, URI, optional per-graph policy, stored-query
+registry) plus an optional server-level policy, then opens every
+configured graph in parallel at startup (bounded concurrency = 4,
+fail-fast on the first open error). Routing is always multi-graph —
+requests to bare flat protected paths (`/read`, `/snapshot`, …) return
+404; the served surface is `/graphs/{graph_id}/...`. See
+[cluster-config.md](../clusters/config.md#serving-from-the-cluster-the-mode-switch)
+for what is read and the fail-fast readiness rules.

-### Multi-graph mode (v0.6.0+)
-
-`omnigraph-server --config omnigraph.yaml` with a non-empty `graphs:` map and **no** single-mode selector (no `server.graph`, no `<URI>`, no `--target`). The server opens every configured graph in parallel at startup (bounded concurrency = 4, fail-fast on the first open error). Routes are nested under `/graphs/{graph_id}/...`. Bare flat paths return 404 in multi mode.
-
-### Cluster-booted multi mode (Phase 5)
-
-`omnigraph-server --cluster <dir-or-uri>` boots from the cluster catalog's **applied
-revision** (`state.json` + content-addressed blobs) instead of
-`omnigraph.yaml` — an exclusive boot source: combining it with `<URI>`,
-`--target`, or `--config` is a startup error, and `omnigraph.yaml` is never
-read in this mode. Always multi-graph routing. See
-[cluster-config.md](cluster-config.md#serving-from-the-cluster-the-mode-switch)
-for what is read and the fail-fast readiness rules. `--bind`,
-`--unauthenticated`, and the bearer-token env vars work identically.
-
-Mode inference:
-
-0. CLI `--cluster <dir | s3://…>` → **multi, cluster-booted** (exclusive; a scheme-qualified argument reads the ledger straight from the storage root, no local config)
-1. CLI positional `<URI>` → single
-2. CLI `--target <name>` → single
-3. `server.graph` in config → single
-4. `--config` + non-empty `graphs:` + no single-mode selector → **multi**
-5. otherwise → error with migration hint
+A scheme-qualified argument (`s3://…`) reads the ledger straight from the
+storage root, with no local config directory. `--bind`,
+`--unauthenticated`, and the bearer-token env vars all apply.

 ### Stored-query validation at startup

-If a graph declares a `queries:` registry (see [cli-reference](cli-reference.md)), the server **loads and type-checks every stored query against that graph's live schema at startup** and **refuses to boot** if any query references a type or property the schema lacks — the same fail-loud posture as a malformed policy file, so schema drift surfaces at the deploy boundary rather than at invocation. Two MCP-exposed queries claiming the same tool name is likewise a boot error. Non-blocking advisories (e.g. an MCP-exposed query with a vector parameter an agent cannot supply) are logged. Validate offline before deploying with `omnigraph queries validate`. Discover the exposed queries as a typed tool catalog with `GET /queries`, and invoke one over HTTP with `POST /queries/{name}` (both below).
+If a graph declares a `queries:` registry (see [cli-reference](../cli/reference.md)), the server **loads and type-checks every stored query against that graph's live schema at startup** and **refuses to boot** if any query references a type or property the schema lacks — the same fail-loud posture as a malformed policy file, so schema drift surfaces at the deploy boundary rather than at invocation. Two MCP-exposed queries claiming the same tool name is likewise a boot error. Non-blocking advisories (e.g. an MCP-exposed query with a vector parameter an agent cannot supply) are logged. Validate offline before deploying with `omnigraph queries validate`. Discover the exposed queries as a typed tool catalog with `GET /queries`, and invoke one over HTTP with `POST /queries/{name}` (both below).

 ## Endpoint inventory

-Per-graph endpoints — same body shape across modes; URLs differ:
+Per-graph endpoints — all nested under `/graphs/{id}/...`. `{id}` is the
+graph id from the cluster's applied revision:

-| Method | Single-mode path | Multi-mode path | Auth | Action | Handler |
-|---|---|---|---|---|---|
-| GET | `/healthz` | `/healthz` | none | — | `server_health` |
-| GET | `/openapi.json` | `/openapi.json` | none | — | `server_openapi` (strips security if auth disabled; in multi mode emits cluster paths with `cluster_` operation-id prefix) |
-| GET | `/snapshot?branch=` | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch | `server_snapshot` |
-| POST | `/query` | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) | `server_query` |
-| POST | `/read` | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: </query>; rel="successor-version"`) | `server_read` |
-| POST | `/export` | `/graphs/{id}/export` | bearer + `export` | NDJSON stream | `server_export` |
-| POST | `/mutate` | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) | `server_mutate` |
-| POST | `/change` | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: </mutate>; rel="successor-version"`) | `server_change` |
-| GET | `/queries` | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog | `server_list_queries` |
-| POST | `/queries/{name}` | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 | `server_invoke_query` |
-| GET | `/schema` | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source | `server_schema_get` |
-| POST | `/schema/apply` | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | migrate | `server_schema_apply` |
-| POST | `/ingest` | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load; branch creation is opt-in via `from` — without it a missing `branch` is a 404, never an implicit fork | `server_ingest` (32 MB body limit) |
-| GET | `/branches` | `/graphs/{id}/branches` | bearer + `read` | list branches | `server_branch_list` |
-| POST | `/branches` | `/graphs/{id}/branches` | bearer + `branch_create` | create | `server_branch_create` |
-| DELETE | `/branches/{branch}` | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete | `server_branch_delete` |
-| POST | `/branches/merge` | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` | `server_branch_merge` |
-| GET | `/commits?branch=` | `/graphs/{id}/commits?branch=` | bearer + `read` | list | `server_commit_list` |
-| GET | `/commits/{commit_id}` | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show | `server_commit_show` |
+| Method | Path | Auth | Action |
+|---|---|---|---|
+| GET | `/healthz` | none | — |
+| GET | `/openapi.json` | none | — (strips security if auth disabled; emits the nested cluster paths with `cluster_` operation-id prefix) |
+| GET | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch |
+| POST | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) |
+| POST | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: <query>; rel="successor-version"`) |
+| POST | `/graphs/{id}/export` | bearer + `export` | NDJSON stream |
+| POST | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) |
+| POST | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: <mutate>; rel="successor-version"`) |
+| GET | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog |
+| POST | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 |
+| GET | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source |
+| POST | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | disabled for cluster-backed serving; returns 409 and points operators at `omnigraph cluster apply` + restart |
+| POST | `/graphs/{id}/load` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load (canonical); branch creation is opt-in via `from` — without it a missing `branch` is a 404, never an implicit fork (32 MB body limit) |
+| POST | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | **deprecated** alias of `/load` (carries `Deprecation: true` + `Link: <load>; rel="successor-version"`) (32 MB body limit) |
+| GET | `/graphs/{id}/branches` | bearer + `read` | list branches |
+| POST | `/graphs/{id}/branches` | bearer + `branch_create` | create |
+| DELETE | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete |
+| POST | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` |
+| GET | `/graphs/{id}/commits?branch=` | bearer + `read` | list |
+| GET | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show |

-Server-level management endpoints (v0.6.0+):
+Server-level management endpoints:

-| Method | Path | Auth | Action | Handler |
-|---|---|---|---|---|
-| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs | `server_graphs_list` (405 in single mode) |
+| Method | Path | Auth | Action |
+|---|---|---|---|
+| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs |

 ### Stored-query catalog (`GET /queries`)

@ -87,17 +80,17 @@ Invoke a curated, server-side stored query by **name** — the source comes from
 - **Requires an explicit policy grant when auth is on.** In default-deny mode (bearer tokens but no `policy.file`), only `read` is permitted, so *every* `/queries/{name}` call returns `404` until an `invoke_query` rule is configured.
 - A stored mutation cannot target a `snapshot` (`400`); a parameter type error is a structured `400` naming the parameter.

-## Adding and removing graphs (multi mode)
+## Adding and removing graphs

-Runtime add/remove via API is **not** exposed in v0.6.0 — neither
-`POST /graphs` nor `DELETE /graphs/{id}` is implemented. Operators add
-or remove graphs by stopping the server, editing the `graphs:` map in
-`omnigraph.yaml`, then restarting. The server treats `omnigraph.yaml`
-as operator-owned configuration and never writes it.
+Runtime add/remove via API is **not** exposed: neither `POST /graphs`
+nor `DELETE /graphs/{id}` is implemented. Operators add or remove graphs
+by running `omnigraph cluster apply` against the cluster (which publishes
+a new applied revision) and then restarting the server so that it boots
+from the new revision. The server treats the cluster source as
+operator-owned and never writes it.

-A future release may introduce a managed registry (Lance-backed,
-catalog-style: reserve → init → publish with recovery sidecars) and
-re-expose runtime mutation on top of it.
+A future release may introduce a managed registry and re-expose runtime
+mutation on top of it.

 ## Inline read queries (`POST /query`)

@ -138,8 +131,8 @@ channels:
 - **Response headers (RFC 9745)**: every response carries `Deprecation: true`.
 - **Response headers (RFC 8288)**: every response carries a `Link` header
  pointing at the canonical successor:
-  `Link: </query>; rel="successor-version"` for `/read`, and
-  `Link: </mutate>; rel="successor-version"` for `/change`. SDKs and HTTP
+  `Link: <query>; rel="successor-version"` for `/read`, and
+  `Link: <mutate>; rel="successor-version"` for `/change`. SDKs and HTTP
  proxies can pick the successor up automatically.

 Migration is purely cosmetic on the client side — swap the URL path, leave
@ -153,7 +146,7 @@ Only `/export` streams (`application/x-ndjson`, MPSC channel + `Body::from_strea

 Uniform `ErrorOutput { error, code?, merge_conflicts[], manifest_conflict? }` with `code ∈ unauthorized | forbidden | bad_request | not_found | conflict | too_many_requests | internal`. Merge conflicts attach structured `MergeConflictOutput { table_key, row_id?, kind, message }`.

-`manifest_conflict` is set on **publisher CAS rejections** (HTTP 409): the
+`manifest_conflict` is set on **concurrent-write rejections** (HTTP 409): the
 caller's pre-write view of one table's manifest version was stale.
 `ManifestConflictOutput { table_key, expected, actual }` tells the client
 which table to refresh and retry. This is the conflict shape produced by
@ -168,8 +161,8 @@ Disjoint
 `(table, branch)` writes from different actors now run concurrently,
 guarded only by the engine's per-(table, branch) write queue. To keep
 one heavy actor from exhausting shared capacity (Lance I/O, manifest
-churn, network), the server gates mutating handlers through a
-`WorkloadController` configured per-process from environment variables:
+churn, network), the server gates mutating handlers through per-process
+admission limits configured from environment variables:

 | Env var | Default | Purpose |
 |---|---|---|
@ -198,7 +191,7 @@ admission-gated.
 ## Auth model (`bearer + SHA-256`)

 - Tokens are SHA-256 hashed on startup; plaintext is never persisted in memory.
-- Constant-time comparison via `subtle::ConstantTimeEq`.
+- Constant-time comparison.
 - Three sources, in precedence:
  1. `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` — AWS Secrets Manager (build with `--features aws`)
  2. `OMNIGRAPH_SERVER_BEARER_TOKENS_FILE` or `OMNIGRAPH_SERVER_BEARER_TOKENS_JSON` — JSON `{actor_id: token, …}`
@ -208,7 +201,7 @@ admission-gated.
  policy file without tokens is also rejected at startup. In open mode
  `/openapi.json` strips the security scheme.

-See [deployment.md](deployment.md) for token-source operational details.
+See [deployment.md](../deployment.md) for token-source operational details.

 ## Tracing & observability

@ -226,4 +219,4 @@ See [deployment.md](deployment.md) for token-source operational details.
  admission control" above). No global rate limiter is configured;
  add `tower_http::limit` if a graph-wide cap is needed.
 - Pagination — none (commits/branches return everything; export streams).
-- Runtime graph add/remove — edit `omnigraph.yaml` and restart.
+- Runtime graph add/remove — run `cluster apply` and restart.
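
A hedged sketch of how a client might follow the deprecation headers described in the `server.md` hunks above. The `Link` target is now relative (`<query>`, `<mutate>`, `<load>`), so it has to be resolved against the URL that was actually requested. `successor_url` is a hypothetical helper, and the host and graph id are placeholders.

```python
import re
from typing import Mapping, Optional
from urllib.parse import urljoin

# Matches a Link header entry such as: <query>; rel="successor-version"
_SUCCESSOR = re.compile(r'<([^>]*)>\s*;\s*rel="successor-version"')


def successor_url(request_url: str,
                  headers: Mapping[str, str]) -> Optional[str]:
    """Return the canonical successor of a deprecated endpoint, if any."""
    if headers.get("Deprecation") != "true":
        return None
    match = _SUCCESSOR.search(headers.get("Link", ""))
    if match is None:
        return None
    # A relative target replaces the last path segment of the request URL,
    # so the successor stays nested under the same /graphs/{id}/ prefix.
    return urljoin(request_url, match.group(1))


response_headers = {
    "Deprecation": "true",
    "Link": '<query>; rel="successor-version"',
}
print(successor_url("http://localhost:8080/graphs/g1/read", response_headers))
```

Resolving the relative reference keeps the graph prefix intact, which an absolute `</query>` target would not do once every per-graph route lives under `/graphs/{id}/`.
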
--- a/docs/user/queries/index.md
+++ b/docs/user/queries/index.md
@ -0,0 +1,65 @@
+# Query Language (`.gq`)
+
+## Query declarations
+
+```
+query <name>($p1: T1, $p2: T2?, …)
+  @description("…") @instruction("…") {
+  …
+}
+```
+
+Two body shapes:
+
+- **Read**: `match { … } return { … } [order { … }] [limit N]` — covered on this page.
+- **Mutation**: one or more of `insert | update | delete` statements — see [mutations](../mutations/index.md).
+
+Multi-modal search functions (`nearest`, `bm25`, `rrf`, …) used inside `match`,
+`return`, and `order` are documented on the [search](../search/index.md) page.
+
+Parameter types may be any schema scalar; a trailing `?` makes a parameter optional. The compiler reserves `$__nanograph_now` for `now()`.
+
+## MATCH clauses
+
+- **Binding**: `$x: NodeType { prop: <literal | $param | now()>, … }`
+- **Traversal**: `$src EDGE_NAME { min, max? } $dst` — variable-length paths via hop bounds; default 1..1 if bounds omitted.
+- **Filter**: `<expr> <op> <expr>` with operators `>=`, `<=`, `!=`, `>`, `<`, `=`, and string `contains`.
+- **Negation**: `not { clause+ }`, which desugars to an anti-join over the inner pipeline.
+
+## RETURN clause
+
+`return { <expr> [as <alias>], … }` with expressions:
+
+- Variable / property access: `$x`, `$x.prop`
+- Literals: string, int, float, bool, list
+- `now()`
+- Aggregates: `count`, `sum`, `avg`, `min`, `max`
+- [Search functions](../search/index.md) (so you can return a score column)
+- `AliasRef` — re-use a previous projection alias
+
+## ORDER & LIMIT
+
+- `order { <expr> [asc|desc], … }` — supports plain expressions and `nearest(...)`.
+- `limit <integer>` — required when there is a `nearest(...)` ordering.
+- **Total, deterministic order.** Ties between rows with equal user-sort keys are broken by the bound entities' key columns (`<var>.id`, ascending), appended as a final tie-break. The result is therefore a *total* order: it is reproducible across runs, and `order … limit N` returns a deterministic top-N even when ties straddle the cutoff. (Aggregate results have no entity-key columns; their group rows are already distinct on the projected group keys.)
+- **NULL placement** is *nulls-first ascending, nulls-last descending* (i.e. `nulls_first = !descending`): a NULL sorts as if smaller than any value.
+
+Write statements (`insert` / `update` / `delete`) are documented on the
+[mutations](../mutations/index.md) page.
+
+## Traversal execution
+
+Variable-length traversals (`Expand`) run in one of two modes, chosen per expand by a cost model over cheap manifest counts (frontier size, edge count, source-vertex count, hops) plus index coverage. Selective traversals (small frontier relative to the source set) resolve neighbors from the persisted `src`/`dst` BTREE, one indexed scan per hop. Dense, deep, or large-frontier traversals, and those whose BTREE coverage is degraded so that a full scan would be paid per hop, use an in-memory CSR adjacency index. Both modes produce identical results. The `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER` / `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS` ceilings bound only the *initial dispatch* frontier and hops (beyond them CSR is always used); they are not a hard per-hop bound. Within the ceilings, the cost model estimates total indexed work as ~`hops × frontier × fanout` and prices dense fan-out toward CSR. `OMNIGRAPH_TRAVERSAL_MODE=indexed|csr` forces a mode (see [constants](../reference/constants.md)).
+
+## Linting & validation
+
+Codes seen so far:
+
+- **Q000** (Error): parse error
+- **L201** (Warning): nullable property never set by any UPDATE — "{type}.{prop} exists in schema but no update query sets it"
+- (Warning): mutation declares no params — hardcoded mutations are easy to miss
+- Plus all type errors from type checking (undefined types, mismatched operators, undefined edges, etc.)
+
+Lint output reports an overall status, per-query results (name, kind, status, any error and warnings), and structured findings (severity, code, message, and the type/property/query they apply to).
+
+The CLI exits non-zero only on `status = Error`.
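
The ordering rules in the "ORDER & LIMIT" section of the file above can be sketched in a few lines of Python. This is an illustration of the stated semantics only (tie-break on `id` ascending, NULLs first when ascending and last when descending), not the engine's implementation; `order_rows` and the sample rows are assumptions.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def order_rows(rows: List[Row], key: str, descending: bool = False,
               limit: Optional[int] = None) -> List[Row]:
    """Order rows by `key`, tie-break on `id` ascending, then apply `limit`."""
    # Sort by the tie-break column first; the later stable sort keeps this
    # order among rows whose user-sort key is equal.
    by_id = sorted(rows, key=lambda r: r["id"])
    nulls = [r for r in by_id if r[key] is None]
    values = [r for r in by_id if r[key] is not None]
    # reverse=True keeps Python's sort stable, so ties stay id-ascending.
    values.sort(key=lambda r: r[key], reverse=descending)
    # NULL sorts as smaller than any value: first ascending, last descending.
    ordered = values + nulls if descending else nulls + values
    return ordered if limit is None else ordered[:limit]


rows = [
    {"id": "c", "score": 2},
    {"id": "a", "score": 2},
    {"id": "b", "score": None},
    {"id": "d", "score": 5},
]
print([r["id"] for r in order_rows(rows, "score")])
print([r["id"] for r in order_rows(rows, "score", descending=True, limit=2)])
```

Because the tie-break makes the order total, the `limit` cut always lands on the same rows, even when the two `score = 2` rows straddle it.
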
--- a/docs/user/query-language.md
+++ b/docs/user/query-language.md
@ -1,113 +0,0 @@
-# Query Language (`.gq`)
-
-Pest grammar at `crates/omnigraph-compiler/src/query/query.pest`. AST in `query/ast.rs`. Type checker in `query/typecheck.rs`. Lowering in `ir/lower.rs`.
-
-## Query declarations
-
-```
-query <name>($p1: T1, $p2: T2?, …)
-  @description("…") @instruction("…") {
-  …
-}
-```
-
-Two body shapes:
-
-- **Read**: `match { … } return { … } [order { … }] [limit N]`
-- **Mutation**: one or more of `insert | update | delete` statements
-
-Param types reuse all schema scalars; trailing `?` makes a param optional. The compiler reserves `$__nanograph_now` for `now()`.
-
-## MATCH clauses
-
-- **Binding**: `$x: NodeType { prop: <literal | $param | now()>, … }`
-- **Traversal**: `$src EDGE_NAME { min, max? } $dst` — variable-length paths via hop bounds; default 1..1 if bounds omitted.
-- **Filter**: `<expr> <op> <expr>` with operators `>=`, `<=`, `!=`, `>`, `<`, `=`, and string `contains`.
-- **Negation**: `not { clause+ }` — desugars to anti-join over the inner pipeline.
-
-## Search clauses (multi-modal)
-
-Used inside MATCH or as expressions inside RETURN/ORDER:
-
-| Function | Purpose | Underlying Lance facility |
-|---|---|---|
-| `nearest($x.vec, $q)` | k-NN vector search (cosine) | Lance vector index (IVF / HNSW) |
-| `search(field, q)` | Generic FTS | Inverted index |
-| `fuzzy(field, q [, max_edits])` | Levenshtein-tolerant text search | Inverted index |
-| `match_text(field, q)` | Pattern match | Inverted index |
-| `bm25(field, q)` | BM25 scoring | Inverted index |
-| `rrf(rank_a, rank_b [, k])` | Reciprocal Rank Fusion of two rankings (default k=60) | OmniGraph fuses scored rankings |
-
-`nearest()` requires a `LIMIT`; the compiler resolves the query vector via the param map (or via the runtime embedding client when bound to a text input).
-
-## RETURN clause
-
-`return { <expr> [as <alias>], … }` with expressions:
-
-- Variable / property access: `$x`, `$x.prop`
-- Literals: string, int, float, bool, list
-- `now()`
-- Aggregates: `count`, `sum`, `avg`, `min`, `max`
-- All search functions above (so you can return a score column)
-- `AliasRef` — re-use a previous projection alias
-
-## ORDER & LIMIT
-
-- `order { <expr> [asc|desc], … }` — supports plain expressions and `nearest(...)`.
-- `limit <integer>` — required when there is a `nearest(...)` ordering.
-- **Total, deterministic order.** Rows with equal user-sort keys are broken by the bound entities' key columns (`<var>.id`, ascending) appended as a final tie-break, so the result is a *total* order — reproducible across runs, and `order … limit N` returns a deterministic top-N even when ties straddle the cutoff. (Aggregate results have no entity-key columns; their group rows are already distinct on the projected group keys.)
-- **NULL placement** is *nulls-first ascending, nulls-last descending* (i.e. `nulls_first = !descending`): a NULL sorts as if smaller than any value.
-
-## Mutation statements
-
-- `insert <Type> { prop: <value>, … }`
-- `update <Type> set { prop: <value>, … } where <prop> <op> <value>`
-- `delete <Type> where <prop> <op> <value>`
-
-`<value>` is a literal, `$param`, or `now()`. Multi-statement mutations execute atomically (added in v0.2.0).
-
-### D₂ — mixed insert/update + delete is rejected at parse time
-
-A single mutation query must be **either insert/update-only or delete-only**. Mixed → rejected before any I/O with the message:
-
-> `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes. This restriction lifts when Lance exposes a two-phase delete API (tracked: MR-793 / Lance-upstream).`
-
-Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance v6.0.1 has no public two-phase delete). Mixing creates ordering hazards (same-row insert→delete becomes a no-op because the staged insert isn't visible to delete; cascading deletes of just-inserted edges break referential integrity by silent design). Until the MR-A Lance v7 bump migrates `delete_where` to staged (`DeleteBuilder::execute_uncommitted` first ships in `v7.0.0-beta.10`), the parse-time rejection keeps both paths atomic and correct. See [docs/dev/writes.md](../dev/writes.md), [docs/dev/lance.md](../dev/lance.md), and [docs/dev/invariants.md](../dev/invariants.md).
-
-## IR (Intermediate Representation)
-
-`QueryIR { name, params, pipeline: Vec<IROp>, return_exprs, order_by, limit }`
-
-Pipeline operations:
-
-- `NodeScan { variable, type_name, filters }`
-- `Expand { src_var, dst_var, edge_type, direction (Out|In), dst_type, min_hops, max_hops, dst_filters }` — destination filters are pushed *into* the expand so Lance scalar pushdown can prune. Executed one of two ways, chosen per-expand by a cost model over cheap manifest counts (frontier size, |E|, source-vertex count, hops) plus index coverage: selective traversals (small frontier relative to the source set) resolve neighbors from the persisted `src`/`dst` BTREE (one indexed scan per hop); dense / deep / large-frontier traversals — or those whose BTREE coverage is degraded so a full scan would be paid per hop — use the in-memory CSR adjacency index. Both produce identical results. The `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER` / `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS` ceilings bound the *initial dispatch* frontier/hops (beyond them CSR is always used); the cost model estimates total indexed work as ~`hops × frontier × fanout` and prices dense fan-out toward CSR — they are not a hard per-hop bound. `OMNIGRAPH_TRAVERSAL_MODE=indexed|csr` forces a mode (see [constants](constants.md)).
-- `Filter { left, op, right }`
-- `AntiJoin { outer_var, inner: Vec<IROp> }` — for `not { … }`
-
-Lowering:
-
-1. Partition MATCH clauses (bindings, traversals, filters, negations).
-2. Identify "deferred" bindings (a destination of a traversal that has filters) so the Expand can carry the filter as a pushdown.
-3. Emit NodeScan for the first binding, then Expand operations, then remaining Filter operations, then AntiJoins for negations.
-4. Translate RETURN / ORDER expressions; preserve LIMIT.
-
-## Linting & validation (`query/lint.rs`)
-
-Codes seen so far:
-
-- **Q000** (Error): parse error
-- **L201** (Warning): nullable property never set by any UPDATE — "{type}.{prop} exists in schema but no update query sets it"
-- (Warning): mutation declares no params — hardcoded mutations are easy to miss
-- Plus all type errors from `typecheck_query_decl()` (undefined types, mismatched operators, undefined edges, etc.)
-
-Output:
-
-```
-QueryLintOutput { status, schema_source, query_path,
-  queries_processed, errors, warnings, infos,
-  results: [{ name, kind, status, error?, warnings[] }],
-  findings: [{ severity, code, message, type_name?, property?, query_names[] }] }
-```
-
-CLI exits non-zero only on `status = Error`.
--- a/docs/user/quickstart.md
+++ b/docs/user/quickstart.md
@ -0,0 +1,84 @@
+# Quickstart
+
+This guide walks through the core loop end to end: define a schema, initialize a graph, load
+data, query it, and use a branch. It uses a local file-backed graph; swap the
+path for an `s3://…` URI to run the same flow against object storage.
+
+[Install](install.md) the `omnigraph` CLI first.
+
+## 1. Write a schema
+
+A schema (`.pg`) declares your node and edge types. Save this as `schema.pg`:
+
+```
+node Person {
+  name: String
+  title: String?
+}
+```
+
+See the [schema language](schema/index.md) for types, constraints, and edges.
+
+## 2. Initialize the graph
+
+```bash
+omnigraph init --schema schema.pg graph.omni
+```
+
+`init` creates an empty graph at the given URI with your schema applied.
+
+## 3. Load data
+
+`load` is the single bulk-write command. `--mode` is required
+(`overwrite | append | merge`):
+
+```bash
+omnigraph load --data people.jsonl --mode overwrite graph.omni
+```
+
+`people.jsonl` is newline-delimited JSON, one record per line. For finer-grained
+or inline writes, see [mutations](mutations/index.md).
+
+## 4. Query
+
+Write a query (`.gq`) — save as `queries.gq`:
+
+```gq
+query find_people($title: String) {
+  match { $p: Person { title: $title } }
+  return { $p.name }
+}
+```
+
+Run it:
+
+```bash
+omnigraph query find_people --query queries.gq \
+  --params '{"title":"Engineer"}' --format table --store graph.omni
+```
+
+The query name is positional; `--query` points at the `.gq` source and
+`--store` addresses the graph's storage directly.
+
+The [query language](queries/index.md) covers `match`/`return`/`order`, and
+[search](search/index.md) covers vector and full-text search.
+
+## 5. Work on a branch
+
+Branches isolate changes until you merge them — Git-style, across the whole graph:
+
+```bash
+omnigraph branch create review/new-hires graph.omni
+omnigraph load --data new-hires.jsonl --mode append --branch review/new-hires graph.omni
+# inspect the branch, then integrate it
+omnigraph branch merge review/new-hires --into main graph.omni
+```
+
+See [branches & commits](branching/index.md) and [merging](branching/merge.md).
+
+## Next steps
+
+- [CLI reference](cli/reference.md) — every command and flag.
+- [Schema language](schema/index.md) and [query language](queries/index.md).
+- [Operating a cluster](clusters/index.md) and [running the server](operations/server.md)
+  for multi-graph, multi-user deployments.
--- a/docs/user/reference/constants.md
+++ b/docs/user/reference/constants.md
@ -0,0 +1,42 @@
+# Constants & Tunables (cheat sheet)
+
+| Name | Value | Area |
+|---|---|---|
+| `MANIFEST_DIR` | `__manifest` | manifest layout |
+| Commit graph dir | `_graph_commits.lance` | commit graph |
+| Run registry dir (legacy, removed) | `_graph_runs.lance` | inert post-v0.4.0; bytes remain until a prefix-delete primitive lands |
+| Run branch prefix (legacy, removed) | `__run__` | swept off `__manifest` by the internal schema migration; no longer a reserved name |
+| Schema apply lock | `__schema_apply_lock__` | schema apply |
+| Manifest publisher retry budget | `PUBLISHER_RETRY_BUDGET = 5` | manifest publish |
+| Internal manifest schema version | `INTERNAL_MANIFEST_SCHEMA_VERSION = 3` | manifest migrations |
+| Merge stage batch | `MERGE_STAGE_BATCH_ROWS = 8192` | merge execution |
+| Maintenance concurrency | `OMNIGRAPH_MAINTENANCE_CONCURRENCY=8` | optimize/cleanup |
+| Lance blob compaction support | `LANCE_SUPPORTS_BLOB_COMPACTION = false` | optimize |
+| Graph index cache size | `8` (LRU) | runtime cache |
+| Expand indexed-path frontier ceiling | `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER=1024` | traversal |
+| Expand indexed-path hop ceiling | `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS=6` | traversal |
+| Expand CSR-build cost factor | `CSR_BUILD_FACTOR = 1.5` | traversal |
+| Expand mode override | `OMNIGRAPH_TRAVERSAL_MODE` (`indexed`\|`csr`; unset = cost-based auto) | traversal |
+| Default body limit | `1 MB` | HTTP server |
+| Ingest body limit | `32 MB` | HTTP server |
+| Default embed provider/model | `openai-compatible` / `openai/text-embedding-3-large` | engine embedding |
+| OpenAI-direct embed model | `text-embedding-3-large` | engine embedding |
+| Gemini-direct embed model | `gemini-embedding-2` | engine embedding |
+| Embed deadline | `OMNIGRAPH_EMBED_DEADLINE_MS=60000` | engine embedding |
+| Embed timeout | `OMNIGRAPH_EMBED_TIMEOUT_MS=30000` | engine embedding |
+| Embed retries | `OMNIGRAPH_EMBED_RETRY_ATTEMPTS=4` | engine embedding |
+| Embed retry backoff | `OMNIGRAPH_EMBED_RETRY_BACKOFF_MS=200` | engine embedding |
+| LANCE memory pool default | `1 GB` (raised in v0.3.0) | runtime |
+
+**Expand traversal dispatch.** With `OMNIGRAPH_TRAVERSAL_MODE` unset, the engine
+chooses the indexed (per-hop BTREE) vs CSR (whole-graph in-memory) path with a
+cost model over cheap manifest counts (frontier size, |E|, source-vertex count,
+hops) plus the index-coverage signal: the indexed path is preferred when its
+frontier-relative work beats building the CSR (≈ when `hops × frontier` is a
+small fraction of the source-vertex set), and CSR is preferred for dense/deep
+traversals or when the BTREE coverage is degraded and a full scan would be paid
+per hop. The two ceilings bound the **initial dispatch** frontier/hops (beyond
+them CSR is always used); they are not a hard per-hop bound — the cost model
+*estimates* total indexed work as ~`hops × frontier × fanout`, so dense fan-out is
+priced toward CSR rather than capped mid-traversal. Setting
+`OMNIGRAPH_TRAVERSAL_MODE` forces a path; query results are identical to the
+cost-based choice, and only the execution path differs.
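
The dispatch described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's code: the function name, its parameters, and both cost formulas are hypothetical, and only the two ceilings and `CSR_BUILD_FACTOR` come from the table.

```python
# Values taken from the constants table above.
MAX_FRONTIER = 1024        # OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER
MAX_HOPS = 6               # OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS
CSR_BUILD_FACTOR = 1.5


def choose_path(frontier, hops, edge_count, source_vertices,
                index_covered=True, override=None):
    """Return "indexed" or "csr". The cost formulas are illustrative assumptions."""
    if override in ("indexed", "csr"):
        return override                    # OMNIGRAPH_TRAVERSAL_MODE forces a path
    if frontier > MAX_FRONTIER or hops > MAX_HOPS:
        return "csr"                       # beyond the ceilings CSR is always used
    if not index_covered:
        return "csr"                       # simplification: degraded coverage means a scan per hop
    fanout = edge_count / max(source_vertices, 1)    # average out-degree
    indexed_cost = hops * frontier * fanout          # ~ hops x frontier x fanout
    csr_cost = CSR_BUILD_FACTOR * edge_count         # assumed cost of building the CSR
    return "indexed" if indexed_cost < csr_cost else "csr"
```

With these assumed formulas the comparison reduces to `hops × frontier` against a multiple of the source-vertex count, which mirrors the shape of the rule in the text; the real thresholds may differ.
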
--- a/docs/user/schema-language.md
+++ b/docs/user/schema-language.md
@ -1,7 +1,5 @@
 # Schema Language (`.pg`)

-Pest grammar at `crates/omnigraph-compiler/src/schema/schema.pest`. AST at `schema/ast.rs`. Catalog at `catalog/mod.rs`.
-
 ## Top-level declarations

 - `interface <Name> { property* }` — reusable property contracts.
@ -47,37 +45,28 @@ Edge bodies only allow `@unique` and `@index`.

 - `@<ident>` or `@<ident>(<literal>)` on any declaration or property.
 - Known annotations:
-  - `@embed` on a Vector property — names the *source* property whose text gets embedded into this vector at ingest (`embed_sources` map in NodeType).
+  - `@embed("source_property")` on a Vector property — records which String property is the embedding source for query-time `nearest($v, "string")` auto-embedding. It is a catalog annotation; it does **not** populate the vector at ingest (supply vectors in load data, or pre-fill via the offline `omnigraph embed` pipeline). An optional `model="…"` kwarg (`@embed("source_property", model="openai/text-embedding-3-large")`) records the embedding model so a `nearest()` query whose embedder uses a different model is rejected loudly; `model` is the only supported kwarg. See [search/embeddings.md](../search/embeddings.md).
  - `@description("…")`, `@instruction("…")` on query declarations (carried through to clients).
 - Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.

-## Catalog construction
+## Table layout

- Pass 0: collect interfaces.
- Pass 1: collect nodes, expand `implements`, build constraint and `@embed` mappings, build the Arrow schema for each node table (`id: Utf8` plus all properties; blob columns get `LargeBinary`).
- Pass 2: collect edges, validate that `from_type` / `to_type` exist, normalize edge names case-insensitively for lookup, validate constraints for edges. Edge Arrow schema: `id: Utf8, src: Utf8, dst: Utf8` plus edge properties.
-
-## Schema IR & stable type IDs
-
- `SCHEMA_IR_VERSION = 1` (`catalog/schema_ir.rs`).
- Each interface/node/edge currently gets a `stable_type_id` from a kind+name hash.
- Rename-preserving accepted IDs are an architectural invariant, but the current hash-on-name implementation is a known gap until migration carries IDs across `@rename_from`.
- Serialized as JSON for diff/migration plans.
+- Each node type compiles to a table with an `id: Utf8` column plus all declared properties (blob columns are stored as `LargeBinary`); `implements` clauses expand the interface's properties into the node.
+- Each edge type compiles to a table with `id: Utf8, src: Utf8, dst: Utf8` plus the edge's own properties. Edge endpoint types (`from`/`to`) must exist, and edge names are matched case-insensitively.

 ## Schema migration planning

-`plan_schema_migration(accepted, desired) -> SchemaMigrationPlan { supported, steps[] }` with step types:
+A migration plan compares the accepted schema against the desired one and reports whether the change is supported plus the ordered steps it requires:

- `AddType { type_kind, name }`
- `RenameType { type_kind, from, to }`
- `AddProperty { type_kind, type_name, property_name, property_type }`
- `RenameProperty { type_kind, type_name, from, to }`
- `AddConstraint { type_kind, type_name, constraint }`
- `UpdateTypeMetadata { … annotations }`
- `UpdatePropertyMetadata { … annotations }`
- `UnsupportedChange { entity, reason }` (forces `supported=false`)
+- Add a type
+- Rename a type
+- Add a property
+- Rename a property
+- Add a constraint
+- Update type or property metadata (annotations)
+- Unsupported change (reports the entity and reason; forces the plan to unsupported)

-`apply_schema()` returns `SchemaApplyResult { supported, applied, manifest_version, steps }` and is gated by an internal `__schema_apply_lock__` system branch so concurrent schema applies serialize.
+Applying a plan reports whether it was supported, the steps applied, and the resulting manifest version. Concurrent schema applies serialize so they can't interleave.

 ## Destructive drops — `--allow-data-loss`

--- a/docs/user/schema/lint.md
+++ b/docs/user/schema/lint.md
@ -2,29 +2,26 @@

 The migration planner emits **code-tagged diagnostics** for every schema change it rejects. Codes have the form `OG-XXX-NNN` and identify the rule (not the message); operators reference them in suppression directives, severity overrides, and CI reports.

-This page is the catalog of codes shipped today. The chassis behind it is tracked in [MR-694](https://linear.app/modernrelay/issue/MR-694).
+This page is the catalog of codes shipped today.

-## What's shipped in v0
+## What's shipped

- Stable code attached to every rejection the planner emits (today: 5 of 17 paths — the rest carry `code: None` and are tagged as future work).
+- Stable code attached to every rejection the planner emits (today: 5 of 17 paths — the rest are tagged as future work).
 - Code appears in the user-visible error message: `[OG-DS-104] removing property 'Person.age' is not supported …`.
 - CLI `omnigraph schema plan` shows the code on `unsupported change …` lines.
- Tests in `tests/schema_apply.rs` assert on codes, not on free-text prose.

 ## What's not shipped yet

- Severity configuration in `omnigraph.yaml` (planned: `lint: { OG-DS-103: error }`).
+- Severity configuration (planned: `lint: { OG-DS-103: error }`).
 - `@allow(OG-XXX-NNN, "rationale")` suppression directives.
- Pre-migration checks (the `migration_check { … }` block — MR-941).
- The CD / VE / LK / NM families (MR-942..945).
- CI integration (MR-946).
- Cost-class annotations (MR-944).
+- Pre-migration checks (the `migration_check { … }` block).
+- The CD / VE / LK / NM families.
+- CI integration.
+- Cost-class annotations.

-See the parent chassis issue (MR-694) for the design and the per-family sub-issues for what's planned.
+## Code catalog

-## Code catalog (v0)
-
-The chassis defines ten families. Today only DS and MF have emitted codes. The remaining families are reserved for future PRs.
+The chassis defines ten families. Today only DS and MF have emitted codes. The remaining families are reserved for future releases.

 | Code | Family | Tier | Default severity | Meaning |
 |---|---|---|---|---|
@ -37,24 +34,22 @@ The chassis defines ten families. Today only DS and MF have emitted codes. The r
 | `OG-MF-104` | Maybe-fail | validated | error | tighten nullable to non-nullable (reserved) |
 | `OG-MF-106` | Maybe-fail | destructive | error | narrowing scalar type |

-The full code catalog source of truth lives in `crates/omnigraph-compiler/src/lint/codes.rs`. CI-level invariants (uniqueness, format, family coverage) are unit-tested in the same module.
-
 ## Families

 The ten chassis families:

 | Prefix | Family | Status |
 |---|---|---|
-| **DS** | Destructive (data-loss) | shipped, v0 |
-| **MF** | Maybe-fail / data-dependent | shipped, v0 |
-| **CD** | Constraint deletion (relaxation warning) | tracked in MR-942 |
+| **DS** | Destructive (data-loss) | shipped |
+| **MF** | Maybe-fail / data-dependent | shipped |
+| **CD** | Constraint deletion (relaxation warning) | planned |
 | **BC** | Backward-incompatible (rename) | implicit in `@rename_from`; codify later |
-| **NM** | Naming conventions | tracked in MR-945 |
-| **OW** | Ownership (per-resource Cedar) | tracked in MR-722 |
-| **NL** | Non-linear (branch-merge divergence) | stubbed in MR-947 |
-| **VE** | Vector / embedding | tracked in MR-943 |
-| **ED** | Edge / graph topology | tracked in MR-701, MR-943 |
-| **LK** | Lock duration / cost | tracked in MR-944 |
+| **NM** | Naming conventions | planned |
+| **OW** | Ownership (per-resource Cedar) | planned |
+| **NL** | Non-linear (branch-merge divergence) | planned |
+| **VE** | Vector / embedding | planned |
+| **ED** | Edge / graph topology | planned |
+| **LK** | Lock duration / cost | planned |

 ## Prior art

--- a/docs/user/search/embeddings.md
+++ b/docs/user/search/embeddings.md
@ -0,0 +1,112 @@
+# Embeddings
+
+OmniGraph embeds text through a **single, provider-independent client** resolved from one
+`EmbeddingConfig { provider, model, base_url, api_key }`. The same resolved config is used by the query-time
+auto-embed of a string in `nearest($v, "string")` and by the offline `omnigraph embed` file pipeline, so
+query vectors and document vectors share one model and one vector space.
+
+## Providers
+
+| `provider` | Wire shape | Use it for |
+|---|---|---|
+| `openai-compatible` (default) | `POST {base}/embeddings`, bearer auth, `{model, input, dimensions}` | **OpenRouter** (the default gateway — one key for many models), OpenAI direct, or a self-hosted endpoint (vLLM / Ollama / LM Studio) |
+| `gemini` | `POST {base}/models/{model}:embedContent`, `x-goog-api-key`, with `RETRIEVAL_QUERY` / `RETRIEVAL_DOCUMENT` task types | Reaching Google's `generativelanguage` API directly |
+| `mock` | none — deterministic offline vectors | Tests and local dev without a key |
+
+Vectors are stored L2-normalized as `FixedSizeList(Float32, dim)`; the requested output dimension is driven by
+the target column width and sent as Gemini `outputDimensionality` / OpenAI `dimensions`.
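
For reference, L2 normalization scales a vector to unit Euclidean length. The helper below is a minimal sketch with a hypothetical name; how OmniGraph handles an all-zero vector is not stated here, so returning it unchanged is an assumption.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit Euclidean length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)  # assumption: nothing sensible to scale
    return [x / norm for x in vec]


unit = l2_normalize([3.0, 4.0])  # approximately [0.6, 0.8]
```

For unit-length vectors, cosine similarity equals the plain dot product.
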
+
+## Configuration (cluster)
+
+Cluster-served graphs can pin their query-time embedder in `cluster.yaml`:
+
+```yaml
+providers:
+  embedding:
+    default:
+      kind: openai-compatible
+      base_url: https://openrouter.ai/api/v1
+      model: openai/text-embedding-3-large
+      api_key: ${OPENROUTER_API_KEY}
+
+graphs:
+  knowledge:
+    schema: knowledge.pg
+    embedding_provider: default
+```
+
+`embedding_provider` references `providers.embedding.<name>`; bare names are
+normalized to that typed ref. The server resolves `${ENV_VAR}` only when it
+boots from the applied cluster ledger, so `cluster validate`, `plan`, and
+`apply` do not need provider secrets. Inline API keys are rejected. `mock`
+needs no key. Vector dimensions stay schema-driven by the target `Vector(N)`
+column.
+
+Direct single-graph serving, embedded callers, and the offline
+`omnigraph embed` pipeline use environment configuration unless they inject an
+`EmbeddingConfig` directly.
+
+## Configuration (environment)
+
+| Variable | Meaning |
+|---|---|
+| `OMNIGRAPH_EMBED_PROVIDER` | `openai-compatible` (default, → OpenRouter) \| `openai` (→ OpenAI's own host) \| `gemini` \| `mock` |
+| `OMNIGRAPH_EMBED_BASE_URL` | endpoint base; defaults `https://openrouter.ai/api/v1` (`openai-compatible`/unset), `https://api.openai.com/v1` (`openai`), `https://generativelanguage.googleapis.com/v1beta` (`gemini`) |
+| `OMNIGRAPH_EMBED_MODEL` | model id; defaults `openai/text-embedding-3-large` (OpenRouter), `text-embedding-3-large` (`openai`), `gemini-embedding-2` (`gemini`) |
+| `OPENROUTER_API_KEY` / `OPENAI_API_KEY` | api key for `openai-compatible` (OpenRouter preferred) |
+| `GEMINI_API_KEY` | api key for `gemini` |
+| `OMNIGRAPH_EMBED_DEADLINE_MS` | total wall-clock budget for one embed call across all retries (default `60000`; `0` = unbounded) |
+| `OMNIGRAPH_EMBED_TIMEOUT_MS` | per-request HTTP timeout (default `30000`) |
+| `OMNIGRAPH_EMBED_RETRY_ATTEMPTS` / `OMNIGRAPH_EMBED_RETRY_BACKOFF_MS` | retry policy (defaults `4` / `200`) |
+| `OMNIGRAPH_EMBEDDINGS_MOCK` | set truthy to force the deterministic mock provider |
+
+The default zero-config path is OpenRouter: set `OPENROUTER_API_KEY` and run. Reaching Gemini takes
+`OMNIGRAPH_EMBED_PROVIDER=gemini` plus `GEMINI_API_KEY`.
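
The defaulting rules in the table can be sketched as below. `resolve_embed_config` is a hypothetical helper, not an OmniGraph API; the set of values treated as truthy for `OMNIGRAPH_EMBEDDINGS_MOCK` and the shape of the mock result are assumptions.

```python
import os

# Per-provider defaults, copied from the table above.
DEFAULTS = {
    "openai-compatible": ("https://openrouter.ai/api/v1", "openai/text-embedding-3-large"),
    "openai": ("https://api.openai.com/v1", "text-embedding-3-large"),
    "gemini": ("https://generativelanguage.googleapis.com/v1beta", "gemini-embedding-2"),
}


def resolve_embed_config(env=None):
    """Hypothetical helper: resolve provider, base URL, and model from the environment."""
    env = os.environ if env is None else env
    provider = env.get("OMNIGRAPH_EMBED_PROVIDER", "openai-compatible")
    # Assumption: these strings count as truthy for the mock switch.
    if env.get("OMNIGRAPH_EMBEDDINGS_MOCK", "").lower() in ("1", "true", "yes"):
        provider = "mock"
    if provider == "mock":
        return {"provider": "mock", "base_url": None, "model": None}
    if provider not in DEFAULTS:
        raise ValueError(f"unknown embed provider: {provider}")
    default_url, default_model = DEFAULTS[provider]
    return {
        "provider": provider,
        "base_url": env.get("OMNIGRAPH_EMBED_BASE_URL", default_url),
        "model": env.get("OMNIGRAPH_EMBED_MODEL", default_model),
    }
```

Passing a plain dict instead of `os.environ` keeps the resolution easy to test without touching the process environment.
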
+
+### Behavior notes
+
+- **Bounded latency.** Each embed call is wrapped in `OMNIGRAPH_EMBED_DEADLINE_MS`, so a degraded
+  provider cannot hang a read for the full retry envelope.
+- **Reuse.** The query path builds the client once per graph handle (on the first `nearest($v, "string")`
+  that needs embedding) and reuses it, keeping the provider connection pool warm. A graph that never embeds
+  needs no provider key.
+- **Observability.** Embed calls emit `tracing` events under `target = "omnigraph::embedding"` (provider,
+  model, dim, attempt, elapsed, outcome).
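
How an overall deadline interacts with per-request timeouts and retries can be sketched as follows. This is not the engine's implementation: `request` is a hypothetical callable, treating the retry count as total attempts is an assumption, and so is the fixed pause between attempts.

```python
import time


def embed_with_deadline(request, attempts=4, backoff_s=0.2, timeout_s=30.0,
                        deadline_s=60.0, clock=time.monotonic, sleep=time.sleep):
    """Call request(timeout) until it succeeds, attempts run out, or the deadline passes.

    deadline_s == 0 means no overall budget.
    """
    start = clock()
    last_error = None
    for attempt in range(1, attempts + 1):
        if deadline_s > 0:
            remaining = deadline_s - (clock() - start)
            if remaining <= 0:
                break  # budget spent: stop instead of using every retry
            per_try = min(timeout_s, remaining)
        else:
            per_try = timeout_s
        try:
            return request(per_try)
        except Exception as err:  # a real client would catch only transient errors
            last_error = err
        if attempt < attempts:
            sleep(backoff_s)  # fixed pause; the actual backoff curve is an assumption
    raise RuntimeError("embed call failed or exceeded its deadline") from last_error
```

Capping each attempt at the remaining budget is what keeps a slow provider from consuming the full `attempts × timeout` envelope.
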
+
+## `@embed` schema annotation
+
+Mark a Vector property with `@embed("source_text_property")`. This is a **catalog annotation** consumed by the
+query typechecker and linter: it records which String property is the embedding source and lets
+`nearest($v, "string")` auto-embed a query string for comparison against that vector column.
+
+Optionally record the model that produced the stored vectors:
+`@embed("source_text_property", model="openai/text-embedding-3-large")`. When a model is recorded, a
+`nearest($v, "string")` query is **rejected with a typed error** unless the resolved query embedder uses the
+same model — so stored and query vectors are guaranteed same-space instead of silently ranking across spaces.
+To fix a mismatch, set `OMNIGRAPH_EMBED_MODEL` (and the matching provider) to the recorded model, or re-embed.
+The recorded model is the literal string, so `openai/text-embedding-3-large` (via OpenRouter) and
+`text-embedding-3-large` (OpenAI direct) are distinct identities; use the matching string. Changing a recorded
+model is a loud `schema apply` refusal (treat it as a re-embed migration). `@embed` without a model keeps
+working with no validation. `model` is the only supported `@embed` argument; any other is a parse error.
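
Because the recorded model is compared as a literal string, the check amounts to the sketch below. The function and the exception class are hypothetical stand-ins for the typed error, not OmniGraph names.

```python
class EmbedModelMismatch(Exception):
    """Hypothetical stand-in for the typed error raised on a model mismatch."""


def check_embed_model(recorded_model, query_model):
    """Reject a query whose embedder model differs from the recorded one."""
    if recorded_model is None:
        return  # @embed without a model: no validation
    if recorded_model != query_model:  # literal comparison, no alias resolution
        raise EmbedModelMismatch(
            f"column was embedded with {recorded_model!r}, "
            f"query embedder uses {query_model!r}"
        )
```

Under this comparison `openai/text-embedding-3-large` and `text-embedding-3-large` do not match, even though they may name the same underlying model.
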
+
+**It does not embed at ingest.** Stored vectors are supplied directly in your load data, or pre-filled by the
+offline `omnigraph embed` pipeline below. (Ingest-time execution of `@embed` is a planned enhancement.)
+
+## CLI `omnigraph embed` (offline file pipeline)
+
+Operates on **JSONL files** (not on a graph), using the same resolved provider config. Three modes (mutually
+exclusive):
+
+- (default) `fill_missing` — only embed rows whose target field is empty
+- `--reembed-all` — overwrite all
+- `--clean` — strip embeddings
+
+Inputs are either a single seed manifest YAML or the `--input`/`--output`/`--spec` flags. The selectors `--type T` and `--select T:field=value` filter which rows are processed. The pipeline streams JSONL in and JSONL out.
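
The three modes differ only in which rows get a new vector. A minimal sketch, assuming rows are plain dicts; `apply_embed_mode`, the `embedding` field name, and the `embed_row` callable are hypothetical, not part of the CLI.

```python
def apply_embed_mode(rows, target_field, embed_row, mode="fill_missing"):
    """Return new rows with target_field filled, overwritten, or stripped."""
    if mode not in ("fill_missing", "reembed_all", "clean"):
        raise ValueError(f"unknown mode: {mode}")
    result = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if mode == "clean":
            row.pop(target_field, None)           # strip embeddings
        elif mode == "reembed_all" or not row.get(target_field):
            row[target_field] = embed_row(row)    # overwrite all, or fill only empty ones
        result.append(row)
    return result


rows = [{"name": "a", "embedding": [1.0]}, {"name": "b"}]
fake_embed = lambda row: [0.0]  # stand-in for a real embedding call
filled = apply_embed_mode(rows, "embedding", fake_embed)
```
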
+
+## Migration
+
+This release has no backwards-compatibility shim (pre-release). The default provider is now OpenRouter, and
+the legacy `OMNIGRAPH_GEMINI_BASE_URL` is removed. A graph whose vectors were produced with
+`gemini-embedding-2-preview` should either re-embed, or pin the query-time embedder to match by setting
+`OMNIGRAPH_EMBED_PROVIDER=gemini` and `OMNIGRAPH_EMBED_MODEL=gemini-embedding-2-preview` (the stored and query
+vectors must come from the same model to be comparable).
--- a/docs/user/search/index.md
+++ b/docs/user/search/index.md
@ -0,0 +1,48 @@
+# Search
+
+OmniGraph runs vector, full-text, and hybrid search in the same runtime as graph
+traversal — a single [query](../queries/index.md) can combine a vector `nearest`,
+a `bm25` text score, and an `Expand` traversal. Search functions are used inside
+`match` (to filter), or as expressions inside `return` / `order` (to score and
+rank).
+
+## Functions
+
+| Function | Purpose | Backing index |
+|---|---|---|
+| `nearest($x.vec, $q)` | k-NN vector search (cosine) | vector index (IVF / HNSW) |
+| `search(field, q)` | Generic full-text search | inverted (FTS) index |
+| `fuzzy(field, q [, max_edits])` | Levenshtein-tolerant text search | inverted index |
+| `match_text(field, q)` | Pattern match | inverted index |
+| `bm25(field, q)` | BM25 relevance scoring | inverted index |
+| `rrf(rank_a, rank_b [, k])` | Reciprocal Rank Fusion of two rankings (default `k=60`) | fuses scored rankings |
+
+- `nearest()` requires a `limit`. The query vector is resolved from the param map,
+  or embedded from a text input at runtime via the configured
+  [embedding client](embeddings.md).
+- Scores and ranks propagate as ordinary columns, so you can `return` a score and
+  `order` by it.
+
+## Hybrid ranking with `rrf`
+
+Reciprocal Rank Fusion combines two independent rankings (typically one vector and
+one text) into a single fused ranking, without needing the two score scales to be
+comparable. Rank each retrieval separately, then fuse:
+
+```gq
+query hybrid($q: String) {
+  match { $d: Document { } }
+  return {
+    $d,
+    rrf( nearest($d.embedding, $q), bm25($d.body, $q) ) as score
+  }
+  order { score desc }
+  limit 10
+}
+```
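
To make the fusion concrete: in the standard RRF definition each item scores the sum of `1 / (k + rank)` over the rankings it appears in, with ranks starting at 1. The sketch below implements that definition outside the query engine. How OmniGraph treats rows missing from one ranking is not stated here, so the sketch assumes they simply contribute nothing for that ranking.

```python
def rrf(ranking_a, ranking_b, k=60):
    """Fuse two ranked id lists (best first) with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in (ranking_a, ranking_b):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["d1", "d2", "d3"]   # e.g. ordered by nearest()
text_hits = ["d2", "d4", "d1"]     # e.g. ordered by bm25()
fused = rrf(vector_hits, text_hits)
```

Only ranks enter the formula, which is why the raw vector distances and BM25 scores never need to be on a comparable scale.
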
+
+## Indexes and embeddings
+
+Search functions only work when the backing index exists — see
+[indexes](indexes.md) for building vector and inverted indexes, and
+[embeddings](embeddings.md) for generating the vectors `nearest` searches over.
--- a/docs/user/search/indexes.md
+++ b/docs/user/search/indexes.md
@ -0,0 +1,43 @@
+# Indexes
+
+## L1 — Lance index types OmniGraph exposes
+
+| Index | Use | Notes |
+|---|---|---|
+| **BTREE scalar** | `=` / range / `IN` / `IS NULL` on a scalar | always on the node `id` and edge `src`/`dst`; and on each one-column `@index`/`@key` property that is an **enum** or an **orderable scalar** (`DateTime`/`Date`/`I32`/`I64`/`U32`/`U64`/`F32`/`F64`/`Bool`) |
+| **Inverted (FTS)** | `search`, `fuzzy`, `match_text`, `bm25` | created on **free-text** (non-enum) `String` `@index`/`@key` columns |
+| **Vector** | `nearest()` k-NN | Lance picks IVF_PQ vs HNSW family by configuration; OmniGraph stores as FixedSizeList(Float32, dim) |
+
+The per-property index a column gets is decided by `node_prop_index_kind` (shared
+by the builder and the sidecar-pinning coverage check so they cannot drift):
+enums and orderable scalars → BTREE, free-text Strings → FTS, `Vector` → vector,
+list/`Blob` columns → none.
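
The mapping above can be written out as a small decision function. This is a sketch of the rule as described, not the signature of `node_prop_index_kind`; the parameter names and string return values are assumptions.

```python
ORDERABLE_SCALARS = {"DateTime", "Date", "I32", "I64", "U32", "U64", "F32", "F64", "Bool"}


def index_kind(prop_type, is_enum=False, is_list=False):
    """Return "btree", "fts", "vector", or None for an indexed property."""
    if is_list or prop_type == "Blob":
        return None                      # list and Blob columns get no index
    if prop_type == "Vector":
        return "vector"
    if is_enum or prop_type in ORDERABLE_SCALARS:
        return "btree"                   # enums and orderable scalars
    if prop_type == "String":
        return "fts"                     # free-text Strings
    return None
```
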
+
+> **Free-text Strings are not equality-indexed.** A non-enum `String` column
+> (including a `String @key` slug) gets an FTS inverted index, which Lance does
+> **not** consult for `=`/range — only for `search`/`match_text`/`bm25`. So an
+> equality filter on a free-text String falls back to a full scan. If you filter
+> a String identifier by equality on a large table, model it so the value is the
+> node id, or track it as a follow-up to also build a BTREE on such columns.
+
+> **Coverage and cost.** Each indexed column adds index files and build time, and
+> an index only covers the fragments it was built over. Rows appended after the
+> index was built (e.g. by `ingest --mode merge`) are scanned unindexed until a
+> reindex extends coverage; see [maintenance](../operations/maintenance.md) → `optimize`.
+
+## L2 — OmniGraph orchestration
+
+- **`@index`/`@key` declares intent; the physical index is derived state.** A migration records the declaration in the catalog/IR and never fails on it — `schema apply` builds **no** indexes (adding an `@index` to an existing column is a pure metadata change that touches no table data). `load`/`mutate` build declared indexes inline as part of the write, but a column that can't be built yet (a `Vector` column with no trainable vectors — IVF k-means needs ≥1 vector, e.g. rows loaded before `embed` runs) is left **pending**, not fatal. Reads stay correct meanwhile: a missing/partial index degrades to a scan (vector search to brute-force). A later `ensure_indices`/`optimize` materializes the pending index once it is buildable. This mirrors how LanceDB builds indexes asynchronously and serves unindexed rows by brute-force.
+- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted + vector indexes for the current head; safe to re-run; returns the columns it had to defer as pending. `optimize` runs it after compaction, so the maintenance cron is the convergence path for deferred indexes.
+- Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
+- **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
+- Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.
+
+## L2 — Graph topology index
+
+This is OmniGraph-specific (not Lance):
+
+- A Compressed Sparse Row (CSR) adjacency representation of edges, with both out- (CSR) and in- (CSC) directions, plus a dense per-node-type id mapping.
+- Built on demand from a snapshot's edge tables, **lazily**: only when an `Expand` that the planner routes to the CSR path (dense / large frontier), or an `AntiJoin`, actually needs it.
+- Cached per snapshot (LRU, keyed by snapshot id + edge table versions), so repeat traversals over the same snapshot reuse it.
+- Selective `Expand`s resolve neighbors from the persisted `src`/`dst` BTREE instead (one indexed scan per hop) and never trigger the CSR build; see [query-language](../queries/index.md) → Traversal execution. Pure scans, and queries served entirely by the indexed traversal path, skip it.
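
For readers unfamiliar with the layout, a CSR adjacency stores one offsets array and one flat targets array, so a node's neighbors are a contiguous slice. The sketch below builds it with a counting pass over dense integer ids. It illustrates the data structure only; the function names are hypothetical and this is not OmniGraph's builder.

```python
def build_csr(num_nodes, edges):
    """Build (offsets, targets) from (src, dst) pairs of dense integer ids."""
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1            # count out-edges per source
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]     # prefix sum: start of each node's slice
    targets = [0] * len(edges)
    cursor = offsets[:-1]                # next free slot per node (slice is a copy)
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(csr, node):
    """Return the neighbors of node as a contiguous slice of targets."""
    offsets, targets = csr
    return targets[offsets[node]:offsets[node + 1]]


edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
out_csr = build_csr(3, edges)                                 # outgoing direction
in_csc = build_csr(3, [(dst, src) for src, dst in edges])     # incoming direction
```

Building the incoming direction is the same routine over reversed edges, which is why keeping both directions roughly doubles the memory for the adjacency arrays.
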