From 97eb65e921c704c09bdd8064e8304ef074cc1184 Mon Sep 17 00:00:00 2001
From: aaltshuler <andrew@collectivelab.io>
Date: Wed, 10 Jun 2026 22:10:19 +0300
Subject: [PATCH] docs(cluster): operator how-to guide for deploying and
 managing clusters
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

New docs/user/cluster.md — the practical companion to cluster-config.md's
reference: zero-to-served walkthrough (validate/import/plan/apply, derived
roots, data loading, --cluster serving), the day-2 edit->plan->apply->restart
loop with a per-change-kind table, drift observation and convergence, the
approval gate for destructive changes, crash/lock/lost-ledger recovery, the
boot-refusal table with remedies, deployment patterns (replicas, backup
unit, CI gating), and the explicit not-yet list (hot reload, S3-hosted
cluster dirs, per-query exposure, pipelines). Linked from the user index,
the agent guide's topic map, and cross-linked from the reference.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 AGENTS.md                   |   1 +
 docs/user/cluster-config.md |   3 +
 docs/user/cluster.md        | 256 ++++++++++++++++++++++++++++++++++++
 docs/user/index.md          |   1 +
 4 files changed, 261 insertions(+)
 create mode 100644 docs/user/cluster.md
diff --git a/AGENTS.md b/AGENTS.md
index 25243a5..60276ad 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -86,6 +86,7 @@ Full diagram and concurrency model: [docs/dev/architecture.md](docs/dev/architec
 | Diff / change feed (`diff_between`, `diff_commits`) | [docs/user/changes.md](docs/user/changes.md) |
 | Query execution, mutation execution, bulk loader, `load` vs `ingest` | [docs/dev/execution.md](docs/dev/execution.md) |
 | `optimize` (compaction) and `cleanup` (version GC) | [docs/user/maintenance.md](docs/user/maintenance.md) |
+| Cluster operator guide (deploy/manage clusters, approvals, recovery, serving) | [docs/user/cluster.md](docs/user/cluster.md) |
 | Cedar policy actions, scopes, CLI | [docs/user/policy.md](docs/user/policy.md) |
 | HTTP server endpoints, auth, error model, body limits | [docs/user/server.md](docs/user/server.md) |
 | CLI quick-start | [docs/user/cli.md](docs/user/cli.md) |
diff --git a/docs/user/cluster-config.md b/docs/user/cluster-config.md
index 5c51b1f..5847d8e 100644
--- a/docs/user/cluster-config.md
+++ b/docs/user/cluster-config.md
@@ -2,6 +2,9 @@
 
 **Status:** Phase 5 — cluster-booted serving (`omnigraph-server --cluster`).
 
+> New to the cluster tooling? Start with the operator how-to guide,
+> [cluster.md](cluster.md) — this document is the reference.
+
 Cluster config is the future control-plane configuration surface for a whole
 OmniGraph deployment. In this stage, OmniGraph can validate a local
 `cluster.yaml` folder, produce a deterministic read-only plan, inspect the
diff --git a/docs/user/cluster.md b/docs/user/cluster.md
new file mode 100644
index 0000000..6241378
--- /dev/null
+++ b/docs/user/cluster.md
@@ -0,0 +1,256 @@
+# Operating an OmniGraph Cluster
+
+This is the operator's guide to the cluster control plane: how to go from an
+empty directory to a served deployment, and how to run it day to day —
+evolving schemas, rotating queries and policies, healing drift, approving
+destructive changes, and recovering from crashes.
+
+It is a **how-to**. The reference for every `cluster.yaml` key, command flag,
+state-file field, and diagnostic code is
+[cluster-config.md](cluster-config.md); the HTTP surface is
+[server.md](server.md).
+
+## The model in one paragraph
+
+You declare the entire deployment — graphs, schemas, stored queries, Cedar
+policies — as files in one directory (`cluster.yaml` plus the `.pg`/`.gq`/
+`.yaml` files it references). `cluster apply` converges reality to that
+declaration and records what it did in a state ledger
+(`__cluster/state.json`); `cluster plan` previews exactly what apply would
+do, including real schema-migration steps. A server started with
+`omnigraph-server --cluster <dir>` serves what was applied — never what is
+merely written in config. Terraform users will recognize the shape: config
+is desired state, the ledger is recorded state, plan is the diff, apply is
+the only thing that changes the world, and irreversible changes require an
+explicitly recorded approval.
+
+## 1. Deploy a cluster from zero
+
+Lay out a config directory:
+
+```
+company-brain/
+├── cluster.yaml
+├── people.pg            # schema for the "knowledge" graph
+├── people.gq            # a stored query
+└── base.policy.yaml     # a Cedar policy bundle
+```
+
+```yaml
+# cluster.yaml
+version: 1
+metadata:
+  name: company-brain
+graphs:
+  knowledge:
+    schema: ./people.pg
+    queries:
+      find_person:
+        file: ./people.gq
+policies:
+  base:
+    file: ./base.policy.yaml
+    applies_to: [knowledge]      # graph-bound; use [cluster] for server-level
+```
+
+Bring it to life:
+
+```bash
+omnigraph cluster validate --config ./company-brain   # parse + typecheck everything
+omnigraph cluster import   --config ./company-brain   # create the state ledger
+omnigraph cluster plan     --config ./company-brain   # preview: what would apply do?
+omnigraph cluster apply    --config ./company-brain   # converge
+```
+
+That single `apply` **creates the graph** (at the derived root
+`./company-brain/graphs/knowledge.omni`), applies its schema, and publishes
+the query and policy into the content-addressed catalog
+(`__cluster/resources/…`). The output lists every change with its
+disposition; `converged: true` means there is nothing left to do — re-running
+`apply` is always safe and idempotent.
+
+Load data through the normal graph plane (the control plane manages
+*definitions*, not rows):
+
+```bash
+omnigraph load --data ./seed.jsonl ./company-brain/graphs/knowledge.omni
+```
+
+Serve it:
+
+```bash
+OMNIGRAPH_SERVER_BEARER_TOKENS_JSON='{"act-reader":"s3cret"}' \
+  omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080
+```
+
+`--cluster` is an **exclusive boot source**: it cannot be combined with a
+graph URI, `--target`, or `--config`, and `omnigraph.yaml` is never read in
+this mode. Routing is always multi-graph:
+
+```bash
+curl -H 'authorization: Bearer s3cret' \
+  -X POST http://localhost:8080/graphs/knowledge/queries/find_person \
+  -H 'content-type: application/json' -d '{"params":{"name":"Ada"}}'
+```
+
+Bearer tokens and the bind address are deliberately *not* cluster facts —
+they are per-replica, set by flag or environment
+(see [server.md](server.md#modes) for the token sources).
+
+## 2. The day-2 loop: edit → plan → apply → restart
+
+Every change follows the same loop, whatever its kind:
+
+```bash
+$EDITOR company-brain/people.pg          # or any .gq / policy / cluster.yaml edit
+omnigraph cluster plan  --config ./company-brain
+omnigraph cluster apply --config ./company-brain --as andrew
+# restart cluster-booted servers to pick it up
+```
+
+`--as <actor>` attributes the run: it is recorded in recovery sidecars and
+audit entries and threaded into the engine's commit history. Make it a habit
+on every apply (it is required for `approve`).
+
+What each change kind does:
+
+| You edit | Plan shows | Apply does |
+|---|---|---|
+| a `.gq` file or `queries:` entry | `Update query.<g>.<n>` | publishes the new content-addressed blob, updates the ledger |
+| a policy file | `Update policy.<n>` | same — new blob, ledger update |
+| a policy's `applies_to` | `Update policy.<n> [bindings]` | records the new bindings (the file digest is unchanged; bindings are first-class changes) |
+| a `.pg` schema | `Update schema.<g>` **with the real migration steps embedded** | runs the engine's schema apply on the live graph — soft drops only, sidecar-fenced |
+| `graphs:` gains an entry | `Create graph.<g>` (+ schema, queries) | initializes the graph at its derived root; dependents apply in the same run |
+| `graphs:` loses an entry | `Delete graph.<g>` — **blocked, `approval_required`** | nothing, until approved (see §4) |
+
+Two properties are worth internalizing:
+
+- **One apply, ordered correctly.** Creates run first, then schema
+  migrations, then catalog writes, then (approved) deletes — so a schema
+  change plus a query that uses the new field converge together in one run.
+- **Soft drops only.** A removed schema property disappears from the current
+  version while prior versions retain the data (reversible until `cleanup`).
+  Data-loss migrations are not reachable from cluster apply.
+
+Read the plan before applying when the change is non-trivial — for schema
+updates it embeds the engine's actual migration plan (`add_property`,
+`drop_property [soft]`, `unsupported: …`), so you see data impact before
+anything runs.
+
+## 3. Inspect: status, refresh, drift
+
+```bash
+omnigraph cluster status  --config ./company-brain --json   # ledger only, read-only
+omnigraph cluster refresh --config ./company-brain          # re-observe live graphs
+```
+
+`status` never touches the graphs; `refresh` opens them read-only and
+records what it finds — manifest versions, live schema digests, catalog blob
+integrity. If someone changed a graph behind the control plane's back (a
+direct `omnigraph schema apply`, a tampered catalog file), refresh marks the
+resource **`drifted`**.
+
+**Drift is converged, not just reported.** After a refresh records drift,
+the next `plan` proposes migrating the live graph back to the declared
+schema — with the steps visible, including the soft drops of out-of-band
+fields — and `apply` executes it like any other change. If the out-of-band
+change is the one you want, change the *config* to match instead, and apply
+converges the ledger.
+
+## 4. Destructive changes: the approval gate
+
+Removing a graph from `cluster.yaml` never executes silently:
+
+```bash
+omnigraph cluster apply --config ./company-brain
+#   Delete graph.scratch [Blocked: approval_required]
+
+omnigraph cluster approve graph.scratch --config ./company-brain --as andrew
+#   cluster approve: delete graph.scratch approved by andrew (approval 01KT…)
+
+omnigraph cluster apply --config ./company-brain --as andrew
+#   Delete graph.scratch [Applied]   ← root removed, subtree tombstoned
+```
+
+The approval artifact (`__cluster/approvals/<id>.json`) is **digest-bound**:
+it authorizes exactly the change you saw when you approved it. Any config or
+state movement afterwards invalidates it automatically (`approval_stale`
+warning) — a stale approval can never authorize a different delete. One
+approval covers the graph's whole subtree (its schema and queries ride
+along). Consumed artifacts are kept (rewritten with `consumed_at`) and
+summarized in the ledger's `approval_records`, so the audit trail of *who
+approved what* survives the loss of either store.
+
+## 5. When things go wrong
+
+**Crash recovery is built in.** Every graph-moving operation (create, schema
+apply, delete) writes a recovery sidecar before acting. If an apply dies
+mid-run, the next state-mutating command sweeps the sidecars and reconciles
+— rolling the ledger forward when the operation completed on the graph,
+retiring stale intent when nothing moved, and flagging anything it cannot
+verify. You generally fix a crashed run by **running `cluster apply`
+again**.
+
+**A held lock** (a crashed process left `__cluster/lock.json`):
+
+```bash
+omnigraph cluster status --config ./company-brain      # shows the lock holder + id
+omnigraph cluster force-unlock <LOCK_ID> --config ./company-brain
+```
+
+Force-unlock requires the exact lock id (from status) — there is no blind
+unlock.
+
+**A lost or corrupted state ledger**: the cluster is self-describing.
+`cluster import` rebuilds `state.json` from the config plus read-only
+observation of the live graphs; the next `apply` re-converges onto the same
+content-addressed catalog.
+
+**A server that refuses to boot** with `--cluster` is telling you the
+applied revision is not safely servable. Each refusal names its remedy:
+
+| Boot error | Meaning | Remedy |
+|---|---|---|
+| `cluster_state_missing` | no ledger | `cluster import`, then `apply` |
+| `cluster_recovery_pending` | interrupted operation awaiting sweep | run `cluster apply` (or any state-mutating command), restart |
+| `catalog_payload_missing` / `…_digest_mismatch` | catalog blob lost or tampered | `cluster refresh`, then `apply`, restart |
+| `policy_bindings_missing` | ledger predates binding metadata | re-run `cluster apply` (backfills), restart |
+| `cluster_empty` | applied revision has no graphs | apply a cluster with ≥1 graph |
+| multiple bundles bind one scope | serving holds one policy bundle per graph + one server-level | split or merge bundles |
+
+A held *state lock* is deliberately **not** a boot error — the server reads
+the atomically-replaced ledger without locking, so serving never contends
+with an in-flight apply.
+
+## 6. Deployment patterns
+
+- **Replicas**: any number of `--cluster` servers can serve the same config
+  directory; boot is read-only. Roll out a change by running `apply` once, then
+  restarting replicas (serving is static per process — there is no hot
+  reload yet).
+- **The directory is the deployable unit**: config, catalog, ledger,
+  approvals, and graph data all live under it. Back it up as a whole;
+  version the *config files* (not `__cluster/` or `graphs/`) in git.
+- **CI-driven convergence**: `validate` and `plan --json` are read-only and
+  safe in pipelines; gate `apply --as ci` on plan review. Approvals are the
+  human step by design — keep `cluster approve` out of automation.
+- **`omnigraph.yaml` still has a job**: per-operator settings (CLI defaults,
+  credentials, active context). It just no longer describes the deployment —
+  a server boots from one source or the other, never a merge of both.
+
+## What the control plane does not do (yet)
+
+- **No hot reload** — applied changes serve on the next restart.
+- **No S3-hosted cluster directories** — the config dir, ledger, catalog,
+  and derived graph roots are local-filesystem paths today. (Individual
+  *graphs* on S3 are a server feature outside cluster mode.)
+- **No data operations** — rows move through `omnigraph load / ingest /
+  mutate` against the graph roots, with branches and merges as usual.
+- **Stored-query exposure is all-or-nothing per cluster** — every applied
+  query is listed and invokable (subject to Cedar `invoke_query`); per-query
+  exposure policy is a planned phase.
+- **Pipelines (ETL)** are a separate project; the `pipelines:` key is
+  reserved and rejected loudly.
+
+For the full reference — every key, flag, status, disposition, and
+diagnostic — see [cluster-config.md](cluster-config.md).
diff --git a/docs/user/index.md b/docs/user/index.md
index 6cf6ade..956fa0b 100644
--- a/docs/user/index.md
+++ b/docs/user/index.md
@@ -13,6 +13,7 @@ of MRs, internal recovery mechanics, or contributor-only invariants.
 | Install OmniGraph | [install.md](install.md) |
 | Run the CLI locally | [cli.md](cli.md) |
 | Look up every CLI flag and config field | [cli-reference.md](cli-reference.md) |
+| Deploy and operate a cluster (how-to guide) | [cluster.md](cluster.md) |
 | Validate and plan cluster config | [cluster-config.md](cluster-config.md) |
 | Write schemas | [schema-language.md](schema-language.md) |
 | Read schema-lint diagnostic codes | [schema-lint.md](schema-lint.md) |