docs(user): restructure user docs into topic sections (Phase 1) (#223)

Move the 23 flat docs/user/*.md files into topic subdirectories so the user guide is organized by area (schema, queries, search, branching, cli, operations, clusters, concepts, reference) instead of a flat list. This is a pure structural move: whole files relocated, every cross-doc link recomputed, no prose rewrites or content splits (those follow in Phase 2).

- 19 `git mv`s (install.md, deployment.md stay top-level); history preserved (renames detected at 92–100% similarity).
- All intra-doc links, AGENTS.md's topic table (52 pointers), and the docs/dev + docs/releases back-links recomputed via relpath from each file's new location.
- docs/user/index.md rewritten as a sectioned nav hub.
- Fixed 5 doc-path references in Rust (comments + two user-facing server settings error strings) to point at the new locations.

Verified: zero broken .md links across tracked docs; check-agents-md.sh green (with the untracked scratch docs set aside); touched crates build.

Note: the public site (omnigraph-web) imports docs/ via a flat-only script; its import-docs.mjs needs a subdir-aware update before the next re-sync.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 02:28:07 +02:00 · 2026-06-14 13:52:14 +03:00 · 2026-06-14 13:52:14 +03:00 · d46e50dd6d
commit d46e50dd6d
parent 8726ca92ec
33 changed files with 126 additions and 109 deletions
--- a/docs/user/clusters/config.md
+++ b/docs/user/clusters/config.md
@@ -0,0 +1,457 @@
+# Cluster Config
+
+**Status:** Phase 5 — cluster-booted serving (`omnigraph-server --cluster`).
+
+> New to the cluster tooling? Start with the operator how-to guide,
+> [Operating an OmniGraph Cluster](index.md); this document is the reference.
+
+Cluster config is the future control-plane configuration surface for a whole
+OmniGraph deployment. In this stage, OmniGraph can validate a local
+`cluster.yaml` folder, produce a deterministic read-only plan, inspect the
+local JSON state ledger, explicitly refresh/import graph observations into
+that ledger, manually remove a held local state lock by exact lock id, and
+**apply the executable subset of the plan** — stored-query and policy-bundle
+catalog writes, **graph creation** (a declared graph that does not exist yet
+is initialized by apply at the derived root), **schema updates** (soft drops
+only), and — behind an explicit, digest-bound **approval** — **graph
+deletion**. It does not perform data-loss schema migrations or start servers.
+What it applies is served only by a server started with `--cluster`; any
+other server still boots from `omnigraph.yaml`.
+
+## Commands
+
+```bash
+omnigraph cluster validate --config company-brain
+omnigraph cluster plan     --config company-brain --json
+omnigraph cluster apply    --config company-brain --json
+omnigraph cluster approve  graph.<id> --config company-brain --as <actor>
+omnigraph cluster status   --config company-brain --json
+omnigraph cluster refresh  --config company-brain --json
+omnigraph cluster import   --config company-brain --json
+omnigraph cluster force-unlock <LOCK_ID> --config company-brain --json
+```
+
+`--config` points at a directory, not a file. The directory must contain
+`cluster.yaml`. When `--config` is omitted, the current directory is used.
+
+## Relationship to `omnigraph.yaml`
+
+`cluster.yaml` does not replace `omnigraph.yaml`, and the two never describe
+the same fact. `omnigraph.yaml` is the permanent **per-operator** layer (CLI
+defaults, the operator's identity and credential references, graph targets
+for data-plane commands); `cluster.yaml` is the shared desired state of a
+whole deployment, read only by the `cluster` commands via `--config`.
+
+The exact contract:
+
+- **Cluster commands read `omnigraph.yaml` for exactly one thing**: the
+  `cli.actor` default used by `apply`/`approve` when `--as` is omitted —
+  operator identity is a per-operator fact. With `--as` present, no config
+  is read at all. Nothing else (its graph set, targets, bind, queries,
+  policies) ever influences a cluster command; a malformed `omnigraph.yaml`
+  breaks only the no-flag actor lookup, loudly.
+- **A `--cluster` server reads `omnigraph.yaml` for nothing** — not even the
+  implicit current-directory search runs (mode-inference rule 0). Boot from
+  cluster state XOR `omnigraph.yaml`, never a merge.
+- **The other direction is ergonomics, not coupling**: a per-operator
+  `omnigraph.yaml` may point `graphs.<name>.uri` at a cluster's derived root
+  (`company-brain/graphs/knowledge.omni`) so data-plane commands can use
+  `--target <name>` — an ordinary local path, no special handling.
+
+## Supported `cluster.yaml`
+
+Stage 3A accepts only this resource subset:
+
+```yaml
+version: 1
+metadata:
+  name: company-brain
+
+state:
+  backend: cluster
+  lock: true
+
+graphs:
+  knowledge:
+    schema: knowledge.pg
+    queries: queries/          # discover every `query <name>` in queries/*.gq
+
+policies:
+  base:
+    file: base.policy.yaml
+    applies_to: [knowledge]
+```
+
+`queries` is Terraform-shaped — the `.gq` files are the declaration. Three
+forms:
+
+```yaml
+queries: queries/                  # directory: top-level *.gq, sorted; every declaration registers
+queries: [people.gq, extra/a.gq]   # explicit files; every declaration in each
+queries:                           # fine-grained name -> file map
+  find_experts:
+    file: knowledge.gq
+```
+
+Discovery is loud: an unreadable or unparseable `.gq`, or the same query name
+declared in two files, fails validation (`query_parse_error`,
+`duplicate_query_name`). Each discovered query is still an individually
+addressed resource (`query.<graph>.<name>`) with its own plan/apply lifecycle;
+the digest is the containing file's hash, so editing a multi-query file
+updates all of its queries together. Paths are relative to the config
+directory — the cluster is one explicit folder, so no `./` prefixes are
+needed.
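
The directory form of discovery described above can be sketched in a few lines of Python. This is an illustration only: the regular expression is an assumption about what a `query <name>` declaration looks like on the page, not the real `.gq` grammar, and `discover_queries` is a hypothetical helper name.

```python
import re

# Assumption: a declaration begins a line with `query <name>`. The real .gq
# grammar is richer; a faithful implementation would use the actual parser.
QUERY_DECL = re.compile(r"^\s*query\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)


def discover_queries(files):
    """Map query name -> file name for a {file name: source text} dict.

    Files are visited in sorted order, every declaration registers, and a
    name declared in two different files raises (duplicate_query_name).
    """
    found = {}
    for file_name in sorted(files):
        for name in QUERY_DECL.findall(files[file_name]):
            if name in found and found[name] != file_name:
                raise ValueError(
                    f"duplicate_query_name: {name} in {found[name]} and {file_name}"
                )
            found[name] = file_name
    return found
```

Each returned name corresponds to one `query.<graph>.<name>` resource, and several names may share a file, which is why editing a multi-query file moves all of its queries together.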
+
+`storage:` (optional) is the **storage root URI** for everything the cluster
+stores — the state ledger, lock, content-addressed catalog, recovery
+sidecars, approval artifacts, and the derived graph roots
+(`<storage>/graphs/<id>.omni`). Absent, it defaults to the config directory
+itself (the original layout, byte-compatible with pre-existing clusters).
+`s3://bucket/prefix` puts the whole cluster on S3-compatible object storage:
+the ledger CAS uses conditional writes (verified against AWS S3 semantics and
+RustFS), the lock becomes genuinely cross-machine, and graph roots are
+engine-native S3 URIs. Credentials are **never** in `cluster.yaml` — the
+standard `AWS_*` environment contract applies, identical to graph storage.
+Declared configuration (`cluster.yaml` and the schema/query/policy sources it
+references) always stays in the working tree: config is versioned in git,
+state lives in the store — the Terraform split.
+
+`metadata.name` is a display label. `state.backend` may be omitted or set to
+`cluster`; external state backends are reserved for a later stage. `state.lock`
+defaults to `true`. When enabled, `cluster plan`, `cluster apply`,
+`cluster refresh`, and `cluster import` briefly acquire
+`<config-dir>/__cluster/lock.json`, then remove it before returning.
+`cluster status` never acquires the lock; it only reports whether one is
+present. `cluster force-unlock` is the only manual lock-removal command; it
+requires the exact lock id and should be run only after confirming no cluster
+operation is active.
+
+## Validation
+
+`cluster validate` checks:
+
+- `cluster.yaml` syntax and supported fields
+- duplicate YAML keys
+- schema, query, and policy file existence
+- schema parsing and catalog construction
+- stored-query parsing and query-name matching
+- stored-query type-checking against the desired schema
+- policy `applies_to` graph references
+
+Fields reserved for later phases, such as `pipelines`, `embeddings`, `ui`,
+`aliases`, and `bindings`, fail with a typed diagnostic instead of being
+silently ignored.
+
+## Planning
+
+`cluster plan` first performs validation, then reads local JSON state from:
+
+```text
+<config-dir>/__cluster/state.json
+```
+
+If the file is missing, the state is treated as empty and every desired
+resource is planned as a create. If present, the file must use this shape:
+
+```json
+{
+  "version": 1,
+  "state_revision": 0,
+  "applied_revision": {
+    "config_digest": "...",
+    "resources": {
+      "graph.knowledge": { "digest": "..." },
+      "schema.knowledge": { "digest": "..." },
+      "query.knowledge.find_experts": { "digest": "..." },
+      "policy.base": {
+        "digest": "...",
+        "applies_to": ["cluster", "graph.knowledge"]
+      }
+    }
+  },
+  "resource_statuses": {
+    "graph.knowledge": {
+      "status": "applied",
+      "conditions": [],
+      "message": "optional status detail"
+    }
+  },
+  "approval_records": {},
+  "recovery_records": {},
+  "observations": {}
+}
+```
+
+`state_revision`, `resource_statuses`, `approval_records`, `recovery_records`,
+and `observations` are optional so older Stage 1 state fixtures keep working.
+Missing `state_revision` is treated as `0`. Resource status values are
+`pending`, `planned`, `applying`, `applied`, `drifted`, `blocked`, or `error`.
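
A reader of `state.json` that tolerates older fixtures might look like the sketch below. The default of `0` for `state_revision` and the status vocabulary come from the text above; treating the other optional sections as empty objects when absent is an assumption, and `normalize_state` is a hypothetical name.

```python
import json

VALID_STATUSES = {
    "pending", "planned", "applying", "applied", "drifted", "blocked", "error",
}


def normalize_state(text):
    """Parse state JSON and fill in the optional fields older fixtures omit."""
    state = json.loads(text)
    if state.get("version") != 1:
        raise ValueError("unsupported state version")
    # Missing state_revision is treated as 0 (documented behavior).
    state.setdefault("state_revision", 0)
    # Assumption: the remaining optional sections default to empty objects.
    for key in ("resource_statuses", "approval_records",
                "recovery_records", "observations"):
        state.setdefault(key, {})
    for address, entry in state["resource_statuses"].items():
        if entry.get("status") not in VALID_STATUSES:
            raise ValueError(f"invalid status for {address}")
    return state
```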
+
+Plan output compares desired resource digests against state resource digests
+and reports `create`, `update`, and `delete` changes. It also reports the state
+CAS (`sha256:<digest>`) and state revision. `state_observations.locked` means an
+existing lock file was observed, along with its metadata (`lock_id`,
+`lock_operation`, `lock_created_at`, `lock_pid`, `lock_age_seconds`); a
+successful `plan` instead reports `lock_acquired: true` and an
+`acquired_lock_id`, then releases the lock before returning. The command never
+writes `state.json` and does not scan live graphs. Use explicit
+`cluster refresh` / `cluster import` when the state ledger should be updated
+from live observations. Live drift scans during plan are later-stage work.
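
The digest comparison behind `create`, `update`, and `delete` reduces to a dictionary diff. The sketch below assumes both sides are flat maps from resource address to digest; `diff_resources` is a hypothetical name, and the real plan carries more per-change detail (dispositions, migration previews, bindings).

```python
def diff_resources(desired, applied):
    """Compare desired digests with state digests, by resource address.

    Returns (address, change) pairs in address order. Resources whose
    digests already match produce no change.
    """
    changes = []
    for address in sorted(set(desired) | set(applied)):
        if address not in applied:
            changes.append((address, "create"))
        elif address not in desired:
            changes.append((address, "delete"))
        elif desired[address] != applied[address]:
            changes.append((address, "update"))
    return changes
```

With an empty `applied` map, which is how a missing state file is treated, every desired resource comes back as a create.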
+
+Policy entries additionally record their applied `applies_to` bindings as
+normalized typed refs — the state ledger is serving-sufficient for the
+future server-boot stage. A change to `applies_to` alone (the policy file
+digest unchanged) appears in the plan as an Update marked `binding_change`
+(human output: `[bindings]`), applies like any catalog change, and counts
+toward convergence; ledgers written before this field existed are backfilled
+by the next apply.
+
+Each plan change carries a `disposition` field, an honest preview of what
+`cluster apply` will do with it in this stage: `applied` (executes), `derived`
+(a `graph.<id>` composite-digest update that converges automatically once its
+query digests land), `deferred` (a change this stage does not execute, such as
+a standalone schema delete), or `blocked` (gated by a missing approval or by
+an unapplied or missing dependency, with the condition in `reason`).
+
+## Apply
+
+`cluster apply` executes the executable subset of the plan: stored-query and
+policy-bundle changes, graph creates, schema updates, and approved graph
+deletes. There is no confirm flag: `cluster plan` is the preview, and apply
+recomputes the same diff under the state lock before executing, so a stale
+preview can never be applied. Apply requires an existing `state.json`
+(`state_missing` directs you to `cluster import` first).
+
+For each applied create/update, the resource payload is written
+content-addressed into the local catalog:
+
+```text
+<config-dir>/__cluster/resources/query/<graph>/<name>/<digest>.gq
+<config-dir>/__cluster/resources/policy/<name>/<digest>.yaml
+```
+
+Extensions are fixed per kind regardless of the source file's name. Payloads
+are written before the state update because `state.json` is the publish point:
+if the final CAS-checked state write fails, no success is reported and the
+digest-named blobs already written are inert — re-running apply is the repair.
+Deletes remove the resource from state; their old payload blobs stay on disk
+(garbage collection is a later stage). Re-running a converged apply is a no-op:
+no state write, no revision change (`state_written: false`).
+
+**Applied means serving — for deployments that opt in.** A server started
+with `--cluster <dir>` boots from the applied revision (see
+[Serving from the cluster](#serving-from-the-cluster-the-mode-switch)); it
+picks up newly applied state on its next restart. Deployments still booting
+from `omnigraph.yaml` are untouched: for them, applied means recorded in the
+catalog, nothing more.
+
+### Graph creation
+
+A `graph.<id>` create (the graph is declared but no root exists) is executed
+by apply: the graph is initialized at the derived root
+
+```text
+<config-dir>/graphs/<graph-id>.omni
+```
+
+with the declared schema, before any catalog writes, so queries and policies
+that depend on the new graph apply **in the same run**. Each create is fenced
+by a recovery sidecar under `__cluster/recoveries/{ulid}.json`, written before
+the init and removed only after the state update lands. If apply crashes in
+between, the next state-mutating command (`apply`, `refresh`, `import`) runs a
+**recovery sweep** that classifies the survivor by observation: an absent root
+removes the stale intent; a completed create rolls the cluster state forward
+(recorded in the state's `recovery_records`); a partial root reports
+`graph_create_incomplete` (status `error` — remove the root and re-run apply;
+nothing is auto-deleted); unexpected graph content reports
+`actual_applied_state_pending` (status `drifted` — run `cluster refresh` and
+re-plan). While a kept sidecar is pending, that graph's create and its
+dependents are blocked with `cluster_recovery_pending`. Read-only commands
+(`status`, `plan`) warn about pending sidecars without acting on them.
+
+**Re-creation is convergence.** If a graph root disappears out-of-band,
+`refresh` records the drift and the next `plan` proposes a create — and apply
+will execute it, producing an **empty** graph at the root. The data was
+already lost when the root vanished; the create is visible in the plan
+(disposition `applied`) before anything runs.
+
+### Schema updates
+
+A `schema.<id>` update (the declared schema differs from what state records)
+is executed by apply via the engine's schema-apply, after graph creates and
+before catalog writes — so a query change that depends on the new schema
+applies in the same run. Each schema apply is sidecar-fenced like a create:
+pre-operation manifest version recorded, post-operation version written back,
+sidecar retired only after the state update lands; the recovery sweep
+classifies survivors by schema digest (consistent ledger → retired; completed
+on the graph → state rolled forward with an audit entry; anything else →
+`drifted`/`actual_applied_state_pending`, kept).
+
+Migrations run with **soft drops only** — a removed property disappears from
+the current version while prior versions retain the data (reversible until
+`cleanup`). Data-loss migrations (`allow_data_loss`) are not reachable from
+cluster apply until the approval-artifact stage. Unsupported migrations
+(e.g. changing a property's type), engine lock contention, or graphs with
+user branches fail loudly as `schema_apply_failed` with the engine's message;
+dependent changes are demoted to `blocked` and graph-moving work stops for
+the run.
+
+`cluster plan` previews schema updates with the engine's real migration plan:
+each schema change carries a `migration` field (`supported` + typed steps),
+and the human output prints the steps. If the live graph cannot be opened the
+preview degrades to the digest diff with a `schema_preview_unavailable`
+warning.
+
+**Drift is converged, not just reported.** A schema changed out-of-band on
+the live graph shows up as `drifted` after `refresh`, and the next plan
+proposes migrating it back to the declared schema — apply executes that like
+any other soft migration. Drift correction is gated by the same rules as any
+change; nothing about it is hidden (the plan shows the steps, including soft
+drops of out-of-band fields).
+
+**Attribution.** `cluster apply --as <actor>` records the operator identity
+in recovery sidecars and audit entries and threads it to the engine's
+schema-apply (so commit attribution and Cedar enforcement — wherever a policy
+checker is installed — work unchanged).
+
+### Approvals and graph deletion
+
+Deleting a graph is the irreversible tier: it requires a recorded human
+decision. `cluster plan` lists the gate under `approvals_required` (one gate
+per graph — the graph-level approval carries its schema and queries);
+`cluster approve graph.<id> --as <actor>` writes a digest-bound artifact to
+
+```text
+<config-dir>/__cluster/approvals/<approval-id>.json
+```
+
+bound to the exact desired config digest and the change's state digest, so
+**any config or state drift after approving invalidates the artifact**
+automatically (`approval_stale` warning; it never authorizes a different
+change). An unapproved delete blocks with `approval_required`.
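
The digest binding can be pictured as a two-field equality check. The sketch below is an assumption-laden illustration: the `Approval` fields, the function name, and the single returned label are all hypothetical, and the real artifact format is not shown in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    # Hypothetical field names; the on-disk artifact layout may differ.
    resource: str
    config_digest: str
    state_digest: str
    consumed: bool = False


def approval_outcome(approval, resource, config_digest, state_digest):
    """Decide whether an approval authorizes the delete being planned now."""
    if approval is None or approval.resource != resource or approval.consumed:
        return "approval_required"
    # Any drift in config or state after approving invalidates the artifact.
    if (approval.config_digest, approval.state_digest) != (config_digest, state_digest):
        return "approval_stale"
    return "approved"
```

Because both digests must match exactly, an artifact written for one delete can never authorize a different one.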
+
+An approved delete executes **last** in the apply run: the graph root is
+removed recursively, the subtree (graph, schema, its queries) is tombstoned
+out of the state ledger with a tombstone observation, and the approval is
+consumed — recorded in the state's `approval_records` in the same state
+update, and the artifact file rewritten with `consumed_at` (the file is never
+deleted: the audit fact survives the loss of either store). A failed run
+consumes nothing; the approval stays valid for the retry. Catalog blobs of
+the deleted graph's queries stay on disk (GC is a later stage).
+
+Crash recovery for deletes: a completed-but-unrecorded delete is rolled
+forward by the sweep (tombstone + approval consumption + audit entry); an
+incomplete delete (root still present) is retired with a
+`graph_delete_incomplete` warning and simply **re-proposed** — prefix removal
+is idempotent, so the still-approved retry is the repair.
+
+Standalone schema deletes are never executed by this stage. They are
+reported as `deferred` (warning `apply_unsupported_change`), and query/policy
+changes that depend on them are `blocked` (warning
+`apply_dependency_blocked`, status `blocked` in state). A partially-applicable
+plan still exits 0 with warnings; the JSON `converged` field is the
+automation signal for "state now matches the desired revision". The applied
+`config_digest` is only recorded when apply fully converges. The `graph.<id>`
+composite digest is recomputed from state's own schema/query digests after
+each apply, so applied query changes converge without graph movement.
+
+## Serving from the cluster (the mode switch)
+
+```bash
+omnigraph-server --cluster company-brain --bind 0.0.0.0:8080
+```
+
+`--cluster <dir>` is an **exclusive boot source** (axiom 15): it cannot
+combine with a graph URI, `--target`, or `--config`, and in this mode
+`omnigraph.yaml` is never read — not for graphs, not for queries, not for
+policies. The server serves the **applied revision**: graph roots recorded in
+`state.json`, stored-query and policy content from the content-addressed
+catalog at the applied digests (re-verified at boot), and policy bundles
+wired by their applied `applies_to` bindings — `cluster`-bound bundles become
+the server-level Cedar engine, graph-bound bundles attach per graph.
+Un-applied config drift never leaks into serving; `cluster plan` is where
+drift is visible. Routing is always multi-graph (`/graphs/{id}/...`). Bearer
+tokens and the bind address stay process-level (flags/env) — they are
+per-replica facts, not cluster facts.
+
+Boot is fail-fast: missing or unreadable state, pending recovery sidecars,
+missing/tampered catalog blobs, policy entries without binding metadata
+(pre-binding ledgers — re-run `cluster apply`), an empty graph set, more than
+one policy bundle binding a single scope (split or merge bundles; stacked
+scopes are a later stage), unopenable graph roots, and stored queries that no
+longer type-check all refuse startup with a remedy. A held state lock is
+*not* an error — boot reads the atomically-replaced state file without
+locking.
+
+Serving is static per process: the server reads the applied revision once at
+startup, so picking up newly applied state means restarting it. Stored
+queries are all listed in `GET /queries` in cluster mode (the cluster
+registry has no expose flag; exposure becomes a policy decision in a later
+phase).
+
+## Status
+
+`cluster status` reads the same local JSON state ledger and prints what the
+ledger says is deployed. It does not validate referenced schema/query/policy
+files and does not inspect live graphs. Missing `state.json` succeeds with a
+warning; invalid state JSON or an unsupported state version fails. If a lock is
+present, status reports its id, operation, creation time, pid, and age.
+
+Status also verifies the catalog payloads read-only: every query/policy digest
+recorded in state is checked against its content-addressed blob under
+`__cluster/resources/` (existence and full digest re-hash). A missing or
+mismatched blob is reported as a warning (`catalog_payload_missing` /
+`catalog_payload_mismatch`); an unreadable blob is an error
+(`catalog_payload_read_error`) because an unverifiable catalog must not report
+healthy. Status never writes state — persisting the `drifted` condition is
+refresh's job. The check runs without the state lock, so it is a point-in-time
+report.
+
+## Refresh And Import
+
+`cluster refresh` updates an existing `state.json` from actual observations.
+`cluster import` creates the first `state.json` when the ledger is missing.
+Both commands open declared graphs read-only at:
+
+```text
+<config-dir>/graphs/<graph-id>.omni
+```
+
+They observe only branch `main`, recording graph existence, manifest version,
+live schema digest, desired schema digest, and schema-match status under
+`observations["graph.<id>"]`. Missing graph roots are recorded as drift and
+remove the graph/schema digests from state so a later `plan` proposes creates.
+Invalid graph roots are recorded as errors; `refresh` persists the error
+observation and exits non-zero, while `import` exits non-zero without creating
+initial state.
+
+Refresh also verifies the catalog payloads of every query/policy digest
+recorded in state (the same check `cluster status` reports read-only), and
+closes the loop:
+
+- a **missing** or **digest-mismatched** blob marks the resource `drifted`
+  (condition `payload_missing` / `payload_mismatch`) and removes its digest
+  from state — so the next `cluster plan` proposes a create and the next
+  `cluster apply` republishes the blob (the self-heal loop, mirroring how a
+  missing graph root is handled);
+- an **unreadable** blob (IO error other than not-found) keeps the digest,
+  marks the resource `error` (condition `payload_read_error`), and exits
+  non-zero — transient IO must not trigger a spurious republish.
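
The two bullets above amount to a three-way classification of each catalog blob. The sketch below assumes a recorded digest has the form `sha256:<hex>` computed over the blob bytes; the document shows that form only for the state CAS, so treat it as an assumption here. `classify_payload` and its tuple shape are hypothetical.

```python
import hashlib
from pathlib import Path


def classify_payload(blob_path, recorded_digest):
    """Return (status, condition, keep_digest) for one catalog blob.

    Assumption: digests are "sha256:<hex>" over the raw blob bytes.
    """
    try:
        data = Path(blob_path).read_bytes()
    except FileNotFoundError:
        # Missing blob: mark drifted and drop the digest so plan proposes a create.
        return ("drifted", "payload_missing", False)
    except OSError:
        # Any other IO failure: keep the digest, do not trigger a republish.
        return ("error", "payload_read_error", True)
    actual = "sha256:" + hashlib.sha256(data).hexdigest()
    if actual != recorded_digest:
        return ("drifted", "payload_mismatch", False)
    return ("applied", None, True)
```

The ordering of the `except` clauses matters: `FileNotFoundError` is a subclass of `OSError`, so it has to be handled first for a missing blob to be treated as drift.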
+
+Upgrade note: a state ledger written before catalog publish existed records
+query/policy digests with no blobs on disk; the first refresh after upgrading
+flags them all `payload_missing`, and a single `cluster apply` republishes
+everything and converges.
+
+Refresh/import do not observe query or policy resources beyond their catalog
+payloads yet. Existing query and policy state digests are preserved on refresh
+(unless their payload drifted, above) and are not invented on import.
+
+## Force Unlock
+
+`cluster force-unlock <LOCK_ID>` removes `<config-dir>/__cluster/lock.json` only
+when the file exists, is valid version-1 lock JSON, and its `lock_id` exactly
+matches the argument. A wrong id, missing lock, invalid lock JSON, or unsupported
+lock version exits non-zero and leaves the file untouched.
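
The removal rule is strict enough to state as code. In this sketch the `version` key is an assumed field name for "version-1 lock JSON" (only `lock_id` is named in this document), and `force_unlock` returns an error string instead of setting an exit code.

```python
import json
from pathlib import Path


def force_unlock(lock_path, lock_id):
    """Remove the lock file only on an exact id match.

    Returns None on success, or a short reason string; the file is left
    untouched in every failure case.
    """
    path = Path(lock_path)
    try:
        lock = json.loads(path.read_text())
    except FileNotFoundError:
        return "missing lock"
    except (OSError, ValueError):
        # json.JSONDecodeError is a ValueError subclass.
        return "invalid lock JSON"
    # Assumption: the lock version lives under a top-level "version" key.
    if not isinstance(lock, dict) or lock.get("version") != 1:
        return "unsupported lock version"
    if lock.get("lock_id") != lock_id:
        return "wrong lock id"
    path.unlink()
    return None
```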
+
+This is manual recovery for abandoned local locks. OmniGraph does not perform
+PID-liveness checks, TTL expiry, stale-lock breaking, or automatic unlock in
+Stage 2C.
--- a/docs/user/clusters/index.md
+++ b/docs/user/clusters/index.md
@@ -0,0 +1,285 @@
+# Operating an OmniGraph Cluster
+
+This is the operator's guide to the cluster control plane: how to go from an
+empty directory to a served deployment, and how to run it day to day —
+evolving schemas, rotating queries and policies, healing drift, approving
+destructive changes, and recovering from crashes.
+
+It is a **how-to**. The reference for every `cluster.yaml` key, command flag,
+state-file field, and diagnostic code is
+[Cluster Config](config.md); the HTTP surface is
+[server.md](../operations/server.md).
+
+## The model in one paragraph
+
+You declare the entire deployment — graphs, schemas, stored queries, Cedar
+policies — as files in one directory (`cluster.yaml` plus the `.pg`/`.gq`/
+`.yaml` files it references). `cluster apply` converges reality to that
+declaration and records what it did in a state ledger
+(`__cluster/state.json`); `cluster plan` previews exactly what apply would
+do, including real schema-migration steps. A server started with
+`omnigraph-server --cluster <dir>` serves what was applied — never what is
+merely written in config. Terraform users will recognize the shape: config
+is desired state, the ledger is recorded state, plan is the diff, apply is
+the only thing that changes the world, and irreversible changes require an
+explicitly recorded approval.
+
+## 1. Deploy a cluster from zero
+
+Lay out a config directory:
+
+```text
+company-brain/
+├── cluster.yaml
+├── people.pg            # schema for the "knowledge" graph
+├── queries/             # stored queries — the .gq files ARE the declaration
+│   └── people.gq
+└── base.policy.yaml     # a Cedar policy bundle
+```
+
+```yaml
+# cluster.yaml
+version: 1
+# storage: s3://omnigraph-local/clusters/company-brain   # optional: put the
+#   ledger, catalog, and graph data on object storage (default: this folder)
+metadata:
+  name: company-brain
+graphs:
+  knowledge:
+    schema: people.pg
+    queries: queries/            # every `query <name>` in queries/*.gq registers
+policies:
+  base:
+    file: base.policy.yaml
+    applies_to: [knowledge]      # graph-bound; use [cluster] for server-level
+```
+
+Bring it to life:
+
+```bash
+omnigraph cluster validate --config company-brain   # parse + typecheck everything
+omnigraph cluster import   --config company-brain   # create the state ledger
+omnigraph cluster plan     --config company-brain   # preview: what would apply do?
+omnigraph cluster apply    --config company-brain   # converge
+```
+
+That single `apply` **creates the graph** (at the derived root
+`company-brain/graphs/knowledge.omni`), applies its schema, and publishes
+the query and policy into the content-addressed catalog
+(`__cluster/resources/…`). The output lists every change with its
+disposition; `converged: true` means there is nothing left to do — re-running
+`apply` is always safe and idempotent.
+
+Load data through the normal graph plane (the control plane manages
+*definitions*, not rows):
+
+```bash
+omnigraph load --data seed.jsonl company-brain/graphs/knowledge.omni
+```
+
+Serve it:
+
+```bash
+OMNIGRAPH_SERVER_BEARER_TOKENS_JSON='{"act-reader":"s3cret"}' \
+  omnigraph-server --cluster company-brain --bind 0.0.0.0:8080
+```
+
+`--cluster` accepts either a **config directory** (the storage root resolves
+through `cluster.yaml`'s `storage:` key) or a **storage-root URI directly**
+(`--cluster s3://bucket/prefix`) — config-free serving: a serving box needs
+only the URI and credentials, no checkout of the config repo. The ledger and
+catalog on the bucket are the deployment artifact.
+
+`--cluster` is an **exclusive boot source**: it cannot be combined with a
+graph URI, `--target`, or `--config`, and `omnigraph.yaml` is never read in
+this mode. Routing is always multi-graph:
+
+```bash
+curl -H 'authorization: Bearer s3cret' \
+  -X POST http://localhost:8080/graphs/knowledge/queries/find_person \
+  -H 'content-type: application/json' -d '{"params":{"name":"Ada"}}'
+```
+
+Bearer tokens and the bind address are deliberately *not* cluster facts —
+they are per-replica, set by flag or environment
+(see [server.md](../operations/server.md#modes) for the token sources).
+
+## 2. The day-2 loop: edit → plan → apply → restart
+
+Every change follows the same loop, whatever its kind:
+
+```bash
+$EDITOR company-brain/people.pg          # or any .gq / policy / cluster.yaml edit
+omnigraph cluster plan  --config company-brain
+omnigraph cluster apply --config company-brain --as andrew
+# restart cluster-booted servers to pick it up
+```
+
+`--as <actor>` attributes the run: it is recorded in recovery sidecars and
+audit entries and threaded into the engine's commit history. Set
+`cli: { actor: <you> }` in your per-operator `omnigraph.yaml` to make it the
+default when `--as` is omitted (the flag always wins; `approve` requires one
+of the two).
+
+What each change kind does:
+
+| You edit | Plan shows | Apply does |
+|---|---|---|
+| a `.gq` file or `queries:` entry | `Update query.<g>.<n>` | publishes the new content-addressed blob, updates the ledger |
+| a policy file | `Update policy.<n>` | same — new blob, ledger update |
+| a policy's `applies_to` | `Update policy.<n> [bindings]` | records the new bindings (the file digest is unchanged; bindings are first-class changes) |
+| a `.pg` schema | `Update schema.<g>` **with the real migration steps embedded** | runs the engine's schema apply on the live graph — soft drops only, sidecar-fenced |
+| `graphs:` gains an entry | `Create graph.<g>` (+ schema, queries) | initializes the graph at its derived root; dependents apply in the same run |
+| `graphs:` loses an entry | `Delete graph.<g>` — **blocked, `approval_required`** | nothing, until approved (see §4) |
+
+Two properties worth internalizing:
+
+- **One apply, ordered correctly.** Creates run first, then schema
+  migrations, then catalog writes, then (approved) deletes — so a schema
+  change plus a query that uses the new field converge together in one run.
+- **Soft drops only.** A removed schema property disappears from the current
+  version while prior versions retain the data (reversible until `cleanup`).
+  Data-loss migrations are not reachable from cluster apply.
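
The ordering property in the first bullet can be sketched as a stable sort over phases. The phase labels and the `order_changes` helper are hypothetical names chosen for illustration; only the order itself (creates, schema migrations, catalog writes, approved deletes) comes from the text.

```python
# Hypothetical phase labels, listed in the documented execution order.
PHASES = ["graph_create", "schema_update", "catalog_write", "graph_delete"]


def order_changes(changes):
    """Sort plan changes into apply order; ties keep their original order."""
    return sorted(changes, key=lambda change: PHASES.index(change["phase"]))
```

Because catalog writes sort after schema updates, a query that uses a newly added field is published only after the migration that adds it has run.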
+
+Read the plan before applying when the change is non-trivial — for schema
+updates it embeds the engine's actual migration plan (`add_property`,
+`drop_property [soft]`, `unsupported: …`), so you see data impact before
+anything runs.
+
+## 3. Inspect: status, refresh, drift
+
+```bash
+omnigraph cluster status  --config company-brain --json   # ledger only, read-only
+omnigraph cluster refresh --config company-brain          # re-observe live graphs
+```
+
+`status` never touches the graphs; `refresh` opens them read-only and
+records what it finds — manifest versions, live schema digests, catalog blob
+integrity. If someone changed a graph behind the control plane's back (a
+direct `omnigraph schema apply`, a tampered catalog file), refresh marks the
+resource **`drifted`**.
+
+**Drift is converged, not just reported.** After a refresh records drift,
+the next `plan` proposes migrating the live graph back to the declared
+schema — with the steps visible, including the soft drops of out-of-band
+fields — and `apply` executes it like any other change. If the out-of-band
+change is the one you want, change the *config* to match instead, and apply
+converges the ledger.
+
+## 4. Destructive changes: the approval gate
+
+Removing a graph from `cluster.yaml` never executes silently:
+
+```bash
+omnigraph cluster apply --config company-brain
+#   Delete graph.scratch [Blocked: approval_required]
+
+omnigraph cluster approve graph.scratch --config company-brain --as andrew
+#   cluster approve: delete graph.scratch approved by andrew (approval 01KT…)
+
+omnigraph cluster apply --config company-brain --as andrew
+#   Delete graph.scratch [Applied]   ← root removed, subtree tombstoned
+```
+
+The approval artifact (`__cluster/approvals/<id>.json`) is **digest-bound**:
+it authorizes exactly the change you saw when you approved it. Any config or
+state movement afterwards invalidates it automatically (`approval_stale`
+warning) — a stale approval can never authorize a different delete. One
+approval covers the graph's whole subtree (its schema and queries ride
+along). Consumed artifacts are kept (rewritten with `consumed_at`) and
+summarized in the ledger's `approval_records`, so the audit trail of *who
+approved what* survives the loss of either store (the artifact files or
+the ledger).
+
+## 5. When things go wrong
+
+**Crashes are designed for.** Every graph-moving operation (create, schema
+apply, delete) writes a recovery sidecar before acting. If an apply dies
+mid-run, the next state-mutating command sweeps the sidecars and reconciles
+— rolling the ledger forward when the operation completed on the graph,
+retiring stale intent when nothing moved, and flagging anything it cannot
+verify. You generally fix a crashed run by **running `cluster apply`
+again**.
+
+**A held lock** (a crashed process left `__cluster/lock.json`):
+
+```bash
+omnigraph cluster status --config company-brain      # shows the lock holder + id
+omnigraph cluster force-unlock <LOCK_ID> --config company-brain
+```
+
+Force-unlock requires the exact lock id (from status) — there is no blind
+unlock.
+
+**A lost or corrupted state ledger**: the cluster is self-describing.
+`cluster import` rebuilds `state.json` from the config plus read-only
+observation of the live graphs; the next `apply` re-converges onto the same
+content-addressed catalog.
+
+**A server that refuses to boot** with `--cluster` is telling you the
+applied revision is not safely servable. Each refusal names its remedy:
+
+| Boot error | Meaning | Remedy |
+|---|---|---|
+| `cluster_state_missing` | no ledger | `cluster import`, then `apply` |
+| `cluster_recovery_pending` | interrupted operation awaiting sweep | run `cluster apply` (or any state-mutating command), restart |
+| `catalog_payload_missing` / `…_digest_mismatch` | catalog blob lost or tampered | `cluster refresh`, then `apply`, restart |
+| `policy_bindings_missing` | ledger predates binding metadata | re-run `cluster apply` (backfills), restart |
+| `cluster_empty` | applied revision has no graphs | apply a cluster with ≥1 graph |
+| multiple bundles bind one scope | serving holds one policy bundle per graph plus one server-level bundle | split or merge bundles |
+
+A held *state lock* is deliberately **not** a boot error — the server reads
+the atomically-replaced ledger without locking, so serving never contends
+with an in-flight apply.
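+
+For runbooks or restart wrappers, the coded rows of the table above can be
+encoded as a lookup. The sketch below is illustrative only: `boot_remedy` is
+a hypothetical helper, not part of the `omnigraph` CLI, and it matches the
+digest-mismatch family with a glob instead of guessing the elided prefix.
+
+```bash
+# Hypothetical helper: map a --cluster boot error code to its remedy from the table.
+boot_remedy() {
+  case "$1" in
+    cluster_state_missing)
+      echo "cluster import, then apply" ;;
+    cluster_recovery_pending)
+      echo "cluster apply (or any state-mutating command), then restart" ;;
+    # The table elides the full digest-mismatch code, so match it by suffix.
+    catalog_payload_missing|*_digest_mismatch)
+      echo "cluster refresh, then apply, then restart" ;;
+    policy_bindings_missing)
+      echo "re-run cluster apply (backfills), then restart" ;;
+    cluster_empty)
+      echo "apply a cluster with at least one graph" ;;
+    *)
+      echo "unrecognized code: check the boot error table"
+      return 1 ;;
+  esac
+}
+```
+
+The bundle-scope conflict has no error code in the table, so it is left out
+and falls through to the default branch.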
+
+## 6. Deployment patterns
+
+- **Replicas**: any number of `--cluster` servers can serve the same config
+  directory; boot is read-only. Roll out a change by running `apply` once,
+  then restarting the replicas (serving is static per process; there is no
+  hot reload yet). Container/cloud recipes (AWS ECS+EFS, Railway volumes):
+  [deployment.md](../deployment.md#cluster-mode-in-containers-aws-railway).
+- **The directory is the deployable unit**: config, catalog, ledger,
+  approvals, and graph data all live under it. Back it up as a whole;
+  version the *config files* (not `__cluster/` or `graphs/`) in git.
+- **CI-driven convergence**: `validate` and `plan --json` are read-only and
+  safe in pipelines; gate `apply --as ci` on plan review. Approvals are the
+  human step by design — keep `cluster approve` out of automation.
+- **`omnigraph.yaml` still has a job**: per-operator settings — your
+  `cli.actor` default for `--as`, CLI defaults, credentials, and data-plane
+  ergonomics (point `graphs.<name>.uri` at a derived root like
+  `company-brain/graphs/knowledge.omni` to use `--target <name>` for
+  loads). It just no longer describes the deployment — a server boots from
+  one source or the other, never a merge of both.
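+
+The CI-driven convergence pattern above might look like the following
+pipeline step. Treat it as a sketch under stated assumptions: `ci_converge`
+and the `PLAN_APPROVED` variable are hypothetical (your pipeline's review
+step would set the latter), and `validate` and `plan` are assumed to take
+`--config` like the other cluster subcommands.
+
+```bash
+# Hypothetical CI step: read-only checks always run; apply runs only after review.
+ci_converge() {
+  local config="$1"
+  # Read-only and safe in pipelines.
+  omnigraph cluster validate --config "$config" || return 1
+  # Keep the plan as a build artifact for reviewers.
+  omnigraph cluster plan --config "$config" --json > plan.json || return 1
+  # PLAN_APPROVED is an illustrative gate, not something omnigraph defines.
+  if [ "${PLAN_APPROVED:-no}" != "yes" ]; then
+    echo "plan written to plan.json; waiting for review"
+    return 0
+  fi
+  omnigraph cluster apply --config "$config" --as ci
+}
+```
+
+`cluster approve` is deliberately absent: a blocked delete stays blocked
+until a human approves it outside the pipeline.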
+
+## 7. Maintaining a cluster graph
+
+Storage maintenance (`optimize` / `repair` / `cleanup`) is **not** a control-plane
+operation — it runs out-of-band, with direct storage access, against the graph's
+roots. Address a cluster graph by name instead of hand-typing its storage path:
+
+```bash
+omnigraph optimize --cluster ./company-brain --cluster-graph knowledge
+omnigraph cleanup  --cluster ./company-brain --cluster-graph knowledge --keep 10 --confirm
+# --cluster also takes the storage-root URI directly (config-free):
+omnigraph optimize --cluster s3://bucket/clusters/company-brain --cluster-graph knowledge
+```
+
+The graph's storage URI is resolved from the **served cluster state** (the same
+truth a `--cluster` server boots from); a graph that hasn't been applied yet is
+not resolvable. Run these from a host with storage access — there are no server
+routes for them. Conversely, **`init` refuses** a cluster-managed path: graphs in
+a cluster are created by `cluster apply`, not by hand.
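+
+When several graphs need the same treatment, the commands above can be
+wrapped in a loop. This is a minimal sketch: `maintain_graphs` is a
+hypothetical helper, the optimize-then-cleanup order and `--keep 10` are
+illustrative choices, and graph names are passed explicitly instead of being
+discovered.
+
+```bash
+# Hypothetical helper: run optimize, then cleanup, for each named cluster graph.
+maintain_graphs() {
+  local cluster="$1" graph
+  shift
+  for graph in "$@"; do
+    omnigraph optimize --cluster "$cluster" --cluster-graph "$graph" || return 1
+    # Soft-dropped properties are only reversible until cleanup runs.
+    omnigraph cleanup --cluster "$cluster" --cluster-graph "$graph" \
+      --keep 10 --confirm || return 1
+  done
+}
+```
+
+For example, `maintain_graphs ./company-brain knowledge` processes one graph
+and stops at the first failing command.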
+
+## What the control plane does not do (yet)
+
+- **No hot reload** — applied changes serve on the next restart.
+- **No data operations** — rows move through `omnigraph load / ingest /
+  mutate` against the graph roots, with branches and merges as usual.
+- **Stored-query exposure is all-or-nothing per cluster** — every applied
+  query is listed and invokable (subject to Cedar `invoke_query`); per-query
+  exposure policy is a planned phase.
+- **Pipelines (ETL)** are a separate project; the `pipelines:` key is
+  reserved and rejected loudly.
+
+For the full reference (every key, flag, status, disposition, and
+diagnostic), see [config.md](config.md).