From ab5f3b878a28ae466b5e16f2389ba1c9ece5ac86 Mon Sep 17 00:00:00 2001
From: aaltshuler <andrew@collectivelab.io>
Date: Mon, 8 Jun 2026 17:31:36 +0300
Subject: [PATCH] docs: add cluster config specs

---
 docs/dev/cluster-axioms.md                    |  97 +++
 .../dev/cluster-config-implementation-spec.md | 705 ++++++++++++++++++
 docs/dev/cluster-config-specs.md              | 415 +++++++++++
 docs/dev/index.md                             |   1 +
 4 files changed, 1218 insertions(+)
 create mode 100644 docs/dev/cluster-axioms.md
 create mode 100644 docs/dev/cluster-config-implementation-spec.md
 create mode 100644 docs/dev/cluster-config-specs.md

diff --git a/docs/dev/cluster-axioms.md b/docs/dev/cluster-axioms.md
new file mode 100644
index 0000000..a3793b4
--- /dev/null
+++ b/docs/dev/cluster-axioms.md
@@ -0,0 +1,97 @@
+# Cluster Control-Plane Axioms
+
+**Type:** Standing design filter
+**Status:** Draft / thinking-in-progress
+**Date:** 2026-06-07
+**Relationship:** the distilled axioms behind [cluster-config-specs.md](cluster-config-specs.md). The downstream implementation inventory and blast-radius assessment live in [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md). The high-level spec is the argument; this is the checklist. Hold any config / control-plane / deployment proposal against these and cite them by number (e.g. "violates axiom 5").
+
+This file is intentionally short and stable. The axioms are phrased so other
+docs can reference "axiom 6" without churn. The motivating requirement comes
+first; the core axioms are what the design is *based on*; the derived rules are
+consequences that follow from them.
+
+> **Revision 2026-06-07 — committed to the Terraform paradigm.** State is now an
+> **authoritative, locked ledger in a backend** (no longer framed as a
+> "mostly-rebuildable projection"); `plan` is a **config ↔ state diff**; and
+> **ETL pipelines** join schema as config-defined resources that trigger
+> data-plane effects. Secrets live in a gitignored **`.env`** file (`${NAME}`),
+> and **query exposure is a policy decision** (no registry `expose:` flag).
+> Axioms **2, 5, 6** revised; **12, 13, 14** added. The earlier
+> "state is just a rebuildable projection; config is the *only* truth" framing is
+> superseded — see axiom 5.
+>
+> **Revision 2026-06-08 — JSON state first.** The baseline state backend is now
+> Terraform-style JSON documents plus backend lock/CAS, not Lance control-plane
+> datasets. Lance remains a possible later backend only if row-level history or
+> queryability justifies the extra machinery.
+
+---
+
+## Tenet 0 — the motivating requirement
+
+**0. The Sarah/Bob test.** If one operator changes schema / queries / policies / UI / pipelines / aliases, another operator (or their agent) must learn *what the deployment is and what changed* from **one source, one history, one diff**. Fragmentation across separate mechanisms is the failure the whole design exists to eliminate. Every other axiom is in service of passing this test.
+
+---
+
+## Core axioms (what the design is based on)
+
+**1. The cluster is the unit of declarative state.** Not the graph (policies, queries, UI, and pipelines cross-cut graphs; "which graphs exist" has no per-graph home), not the fleet (the next scope up — named and deferred). The cluster is what two operators collaborate over; a graph is a *resource within* it.
+
+**2. Two sources of truth, for two different questions — config for *intent*, state for *deployed reality*.** The version-controlled **config** (a set of files in one folder) is the source of truth for what the cluster *should be*. The **state ledger** is the source of truth for what *is* currently deployed. Change flows one way only: you edit config and `apply` converges the cluster (**code → cluster**, never edit-the-cluster-and-call-it-intent). But "what exists right now" is read from **state**, not re-derived from the world on every command. `plan` is the diff between the two.
+
+**3. Declarative, not imperative.** You describe the desired end state; the reconciler computes the steps. No runtime mutation API that makes the running system the place *intent* lives.
+
+**4. As-code is structural, not stylistic (the recursion argument).** Code is the base case; modeling the definition *as data* (a meta-graph describing graphs) recurses with no base case. Config must live **outside** the running system so it is reviewable (PRs), reproducible (clone + apply), diffable as text, and editable by an agent — without the system having to describe itself.
+
+<!-- Audit fix: JSON keeps the first backend Terraform-shaped and inspectable.
+Lance datasets are future optimization, not the baseline state format. -->
+**5. The Terraform model: config / state / reconcile — and state is an authoritative, locked ledger.** Config (as code) = desired truth. **State = the authoritative record of what has been applied**, held in a **backend** — the cluster's own object-store backend *or* a separate cloud store, the operator's choice, exactly like a Terraform backend. The baseline representation is JSON documents (`state.json`, status/approval/recovery JSON records) protected by backend lock/CAS, not Lance control-plane tables. State is **locked** during apply so two operators cannot converge concurrently. `validate` parses and schema-checks desired config; `plan` = `diff(config, state)` as a structured artifact with resource digests, dependency edges, state observations, proposed changes, blast radius, and approval gates; `apply` converges the cluster from an accepted fresh plan and **updates state**, and does not acknowledge success until state has recorded the result. A cluster-hosted JSON backend is still a separate state CAS step from graph Lance manifest moves; failures surface a repair/import condition instead of being described as cross-object all-or-nothing. A future Lance-backed state backend or cluster manifest publisher is optional and must earn its complexity by needing row-level queryability/history or tighter publish fencing. Because OmniGraph's running cluster is self-describing (manifests, commit logs), state is *reconstructable* by import/refresh if lost — its edge over opaque-cloud Terraform — but it is **treated as the source of truth for current reality, not casually regenerated**. The one slice that can never be reconstructed (who approved an irreversible apply) lives in the durable audit ledger; state references it (axiom 11).
+
+**6. The control plane reconciles definition, not data — across two data-plane seams.** Definition — schema, policies, queries, UI, bindings, aliases, ETL **pipelines**, embeddings config, and the set of graphs — is reconciled. Data — rows, edges, vectors — is data-plane content, versioned by the commit DAG and produced by `load` / `mutate` and **pipeline execution**, sitting **outside** the reconcile loop. Exactly two definition kinds *trigger* a data-plane effect without owning data: **schema** (a migration conforms existing rows; `plan` previews its impact) and **ETL pipelines** (their execution ingests external data). The loop converges their *definitions*; the data they produce is never what it reconciles.
+
+**7. Operated by agent (agent-as-controller).** An agent authors config changes and drives reconciliation as an authenticated actor, subject to policy and approval gates — no human state-management burden. This fuses Terraform's as-code config with Kubernetes' continuous reconciliation.
+
+---
+
+## Derived rules (consequences of the axioms)
+
+**8. The reversibility gradient gates apply — including drift correction.** Irreversible / data-loss operations (drop a graph, hard-drop schema data, a pipeline that overwrites) and compatibility-narrowing migrations (for example, future validated enum narrowing) are gated; reversible ones (recolor a dashboard) are not. The gate is keyed to physics, not to who operates it, and a reconciler "just fixing drift" is never an exception.
+
+**9. Atomicity and referential integrity are plan-time, not runtime.** `ApplyGroup` is the atomicity unit; cross-resource references *force* grouping (mandatory, not opt-in); references use typed resource/provider addresses (`graph.knowledge`, `query.knowledge.find_experts`, `provider.source.github_org`) so the planner can reject wrong-kind or missing targets before apply — bare names in a kind-fixed field are accepted shorthand and normalized to the typed address (fix 2026-06-08), while a kind-ambiguous value (e.g. `source: github`) is rejected; a reference to a missing or being-removed resource is a fail-closed `plan` error, not a deferred runtime failure.
+
+**10. Secrets live in a `.env` file; connection/identity is per-operator.** The committed cluster config carries **no secret values** — only `${NAME}` references. The values (embedding API keys, pipeline **source credentials**, per-deployment settings) live in a separate **`.env` file** — which is gitignored and supplied per deployment, never committed. Separately, an operator's own connection (which cluster, which token) is the per-operator layer, distinct from both the shared config and its `.env` file.
+
+**11. Approvals and audit live in a durable ledger, not inline in state.** State *references* the audit record by id. In the baseline, that ledger is append-only JSON records in the state backend; a future Lance table is an implementation option, not a requirement. This keeps the bulk of state reconstructable and keeps approval facts — "who authorized this irreversible apply" — where loss is impossible.
+
+**12. State lives in a backend and is locked.** The state ledger is stored in a configurable backend — the cluster's own backend, or a separate cloud store — and `plan`/`apply` acquire a **state lock** first, so concurrent applies serialize instead of racing. (Generalizes the existing `__schema_apply_lock__` from schema scope to cluster scope.) The backend choice is part of the safety model: the first backend should be JSON plus object-store lock/CAS; any Lance-backed state backend needs its own RFC-level proof that the table semantics are worth the control-plane complexity.
+
+**13. Pipelines are definition; their execution is data-plane.** An ETL pipeline (external source → transform → target graph) is **declared in config and reconciled like any resource**; *running* it produces ordinary data-plane writes (`load`/`mutate`) outside the reconcile loop. `apply` converges the pipeline's *definition* (create / update / delete / schedule); the rows it ingests are never reconciled. A fan-out run over several graphs is statusful rather than magically atomic: each target records commit id, status, retryability, and idempotency key unless the pipeline explicitly uses a branch/merge protocol that can fence the whole target set. Source credentials are secret references (axiom 10).
+
+<!-- Audit fix: current shipped behavior still has mcp.expose and coarse
+invoke_query. This axiom is the target control-plane rule, not a statement
+about today's server catalog. -->
+**14. Exposure is a policy decision, not a config flag.** Target design: which stored queries (and the tools/dashboards built on them) an actor may **list or invoke** is decided by the policy layer (Cedar: `invoke_query` + catalog visibility), not by a per-query `expose:` boolean. The registry only says a query *exists* (name → file); **policy says who may see and run it**, so the MCP catalog (`GET /queries`) becomes each actor's policy-permitted set. This supersedes the engine's current `mcp.expose` flag only after per-query `invoke_query` scope and Cedar-filtered catalog listing land; until then, proposals must state the compatibility bridge to today's `mcp.expose` + coarse invocation gate.
+
+---
+
+## The one-line compression
+
+**One cluster; config (a folder of files) is desired truth and a locked state ledger in a backend is deployed truth; `plan` diffs them, `apply` converges the cluster and updates state, an agent drives the loop — reconciling the cluster's *definition* (schema, policies, queries, UI, pipelines, …) and never its data — so any operator sees the whole system and its history from one place.**
+
+---
+
+## How to use this file
+
+- **Reviewing a proposal:** walk axioms 0–14; the proposer bears the burden of justifying any conflict. The most common tensions:
+  - Treating the *running system* as the source of truth for **intent** → axioms 2, 4 (intent lives in config).
+  - Treating state as a throwaway derivation rather than an authoritative, locked, backend-held ledger → axioms 5, 12.
+  - A runtime config-mutation API instead of declarative apply → axiom 3.
+  - "State" meaning a per-operator selection rather than the applied-cluster ledger → axiom 5.
+  - The control plane reconciling (or owning) data, including treating pipeline *rows* as reconciled state → axioms 6, 13.
+  - Treating fan-out pipeline execution as atomic without a branch/merge protocol or per-target status ledger → axiom 13.
+  - Per-graph or per-server scoping of cluster-level definition → axiom 1.
+  - Bare string references that force the planner to guess whether `knowledge` means a graph, query, provider, or path → axiom 9.
+  - A secret value (token, embedding key, pipeline source credential) inline in config instead of in the gitignored `.env` file → axiom 10.
+  - A per-query `expose:`/visibility flag in target-state cluster config instead of governing list/invoke in policy; or failing to account for today's `mcp.expose` compatibility bridge → axiom 14.
+  - Shipping `apply` before hermetic `validate` + read-only `plan` tests, or shipping graph/schema-moving apply before recovery tests for the graph/resource-moved-before-cluster-publish gap → axioms 5, 12.
+- **Citing:** reference axioms by number in PRs and review comments so the rationale is stable across renames and refactors.
diff --git a/docs/dev/cluster-config-implementation-spec.md b/docs/dev/cluster-config-implementation-spec.md
new file mode 100644
index 0000000..5121451
--- /dev/null
+++ b/docs/dev/cluster-config-implementation-spec.md
@@ -0,0 +1,705 @@
+# Cluster Config Implementation Spec And Blast Radius
+
+**Status:** Draft / implementation planning
+**Type:** Downstream design spec
+**Date:** 2026-06-08
+**Relationship:** companion to [cluster-config-specs.md](cluster-config-specs.md)
+and [cluster-axioms.md](cluster-axioms.md). The high-level spec explains why
+the cluster control plane should exist; this file names what must change
+downstream and how large the blast radius is.
+
+<!-- Spec note: this file exists so the user-facing cluster spec can stay
+readable. Keep implementation inventories, rollout phases, and test ownership
+here instead of expanding the narrative spec into an encyclopedia. -->
+
+## Executive Summary
+
+Overall blast radius: **very high**.
+
+This is not a small extension to `omnigraph.yaml`. The target design creates a
+new shared cluster desired-state document, a locked state ledger, a cluster
+manifest publisher, and a reconciler that coordinates resources above a single
+graph. The existing config system remains useful, but its role changes:
+
+- `omnigraph.yaml` / global config remains the per-operator and startup bridge.
+- `cluster.yaml` becomes shared desired state for a deployment.
+- The cluster state ledger becomes the authoritative record of applied reality.
+- Server/runtime surfaces eventually read from the cluster catalog instead of
+  only from process-start config.
+
+Safe rollout requires an additive path. Do not replace the current config,
+server, or policy behavior in one step.
+
+## Current Surfaces Surveyed
+
+| Surface | Current behavior | Why it matters |
+|---|---|---|
+| `omnigraph-config::OmnigraphConfig` | Layered global/state/project config for CLI and server startup; strict `version: 1`; named maps replace wholesale | A cluster spec needs different ownership and merge semantics; do not stretch this type so far that its role becomes ambiguous |
+| `omnigraph-server::load_server_settings` | Opens either one selected graph or every configured embedded graph in multi mode | Cluster config changes startup, registry identity, and eventually runtime reconcile |
+| `GraphRegistry` | Holds open graph handles; production registry is startup-only today; runtime insert is test-only | Cluster apply wants graph add/remove/reload as real control-plane operations |
+| `omnigraph-queries::QueryRegistry` | Loads `.gq` files from `queries:` and honors `mcp.expose` for catalog listing | Target cluster config removes exposure from the registry and moves list/invoke to policy |
+| `omnigraph-policy::PolicyAction` | Per-graph actions plus server-scoped `graph_list`; `invoke_query` is graph-scoped and coarse | Cluster plan/apply and per-query exposure need new policy scope without breaking coarse rules |
+| Engine graph manifest | Graph-level atomic visibility via `__manifest`, expected table versions, and recovery sidecars | Cluster apply needs a higher-level publisher; Lance still commits per dataset |
+| Schema apply | Existing plan/apply/lock shape for one graph; soft/hard drops already modeled | This is the prototype resource reconciler, but cluster apply cannot call it blindly and then claim cluster atomicity |
+| Public docs/tests | Config, policy, server, and query behavior are already documented and tested | Every behavior change below has user docs and test fallout |
+
+## Compatibility Stance
+
+<!-- Spec note: keep `cluster.yaml` separate from `omnigraph.yaml` because the
+current file is deliberately layered and partly per-operator. Collapsing shared
+cluster intent into it would blur the source-of-truth split the high-level spec
+is trying to create. -->
+
+1. `cluster.yaml` is a new target-state file, not `omnigraph.yaml` v2.
+2. Existing `omnigraph.yaml` keeps working for CLI, server boot, aliases,
+   graph locators, bearer-token env lookup, and the current stored-query
+   registry.
+3. Initial cluster commands are explicit: `omnigraph cluster validate`,
+   `omnigraph cluster plan`, `omnigraph cluster apply`, `omnigraph cluster
+   status`, `omnigraph cluster refresh`, and `omnigraph cluster import`.
+4. Cluster config is one shared folder, resolved from the command's cluster
+   root or explicit path. It is not merged from global + project + active
+   context layers.
+5. The per-operator connection layer selects the cluster root and actor
+   identity. It is not committed into `cluster.yaml`.
+6. `mcp.expose` remains supported in current `omnigraph.yaml` until the
+   per-query policy replacement ships.
+
+## Terraform-Aligned Schema Validation
+
+<!-- Spec note: Terraform is strict for resource/provider/module configuration,
+but looser for variable-value inputs such as `.tfvars` and `TF_VAR_*`. For
+cluster desired state we borrow the strict resource-schema posture because
+`cluster.yaml` is shared intent, not an operator-local variable bag. -->
+
+Every field in target-state `cluster.yaml` must be **honored or rejected**:
+
+- If a field is part of the declared resource schema, it must affect
+  validation, plan, apply, state, or status.
+- If a field is misspelled, placed under the wrong resource kind, or reserved
+  for a future phase, `cluster validate` / `cluster plan` must fail with a
+  typed diagnostic.
+- Compatibility warnings are allowed only in an explicit migration window for
+  old schema versions. They are not allowed in the target schema.
+- Free-form extension areas must be named as such, for example `labels`,
+  `metadata`, `vars`, or `provider_options`; accidental unknown keys are never
+  treated as extension data.
+
+Examples:
+
+```yaml
+graphs:
+  knowledge:
+    schema: ./knowledge.pg
+    lables: { team: platform }       # invalid: typo, use `labels`
+
+pipelines:
+  github_sync:
+    source: { kind: github, token: "${GITHUB_TOKEN}" }
+    into:
+      - { graph: engineering, map: ./github.map.yaml }
+    retry_magic: true                # invalid unless `retry_magic` is in schema
+```
+
+```yaml
+graphs:
+  knowledge:
+    schema: ./knowledge.pg
+    labels: { team: platform }       # valid free-form metadata bucket
+    provider_options:
+      lance:
+        compaction_window: daily     # valid only if this extension is declared
+```
+
+## Typed Resource And Provider Addresses
+
+<!-- Spec note: this is the Terraform-aligned version of "typed locators".
+The target cluster spec should not ask later code to guess whether a string is a
+graph name, query name, server endpoint, storage URI, source connector, or
+credential reference. References carry their kind. -->
+
+<!-- Fix (2026-06-08): resolved the "shorthand may exist" (here) vs "bare strings
+are bad shape" (below) contradiction. The rule is now explicit: bare names ARE
+valid shorthand in a field whose schema fixes the referent kind (normalized to a
+typed address); "bad shape" means a value whose KIND is ambiguous or WRONG, not
+merely bare. This also makes the high-level spec's bare examples (policy
+`graphs:`/`applies_to:` lists, pipeline `into.graph`, dashboard `graphs:`) valid. -->
+A locator is a typed address to another declared thing. **Internally — in plan and
+state — every reference is a typed address** (axiom 9). At the config *surface* a
+field may accept **bare shorthand when its schema fixes the referent kind** (a
+policy `applies_to:` list is graph refs; a pipeline `into.graph` is a graph id) —
+the parser normalizes it to the typed address before planning. A value whose
+*kind* is ambiguous or wrong (a `source:` that could be a connector type, an
+instance, or a provider) has no safe normalization and must be a typed
+`provider.*` address or an explicit inline block.
+
+Target address forms:
+
+```text
+graph.<graph_id>
+schema.<graph_id>
+query.<graph_id>.<query_name>
+policy.<policy_name>
+ui.dashboard.<dashboard_name>
+pipeline.<pipeline_name>
+provider.storage.<provider_name>
+provider.source.<provider_name>
+provider.embedding.<provider_name>
+```
+
+Bad shape — the value's **kind is ambiguous or wrong**, not merely bare:
+
+```yaml
+pipelines:
+  github_sync:
+    source: github                             # AMBIGUOUS kind: connector type, instance, or provider?
+                                               #   → provider.source.<name> or inline { kind: github, ... }
+policies:
+  base_rbac:
+    applies_to: [query.knowledge.find_experts] # WRONG kind: a query address in a graph-ref field
+```
+
+OK shorthand (kind fixed by the field → normalized):
+
+```yaml
+policies:
+  base_rbac:
+    applies_to: [knowledge, engineering]       # bare names in a graph-ref field → graph.knowledge, graph.engineering
+```
+
+Target shape:
+
+```yaml
+providers:
+  storage:
+    prod_graphs:
+      kind: s3
+      bucket: company
+      prefix: prod
+  source:
+    github_org:
+      kind: github
+      token: ${GITHUB_TOKEN}
+
+graphs:
+  knowledge:
+    storage: provider.storage.prod_graphs
+    path: graphs/knowledge.omni
+    schema: ./knowledge.pg
+  engineering:
+    storage: provider.storage.prod_graphs
+    path: graphs/engineering.omni
+    schema: ./engineering.pg
+
+policies:
+  base_rbac:
+    file: ./base_rbac.policy.yaml
+    applies_to:
+      - graph.knowledge
+      - graph.engineering
+
+pipelines:
+  github_sync:
+    source: provider.source.github_org
+    into:
+      - { graph: graph.engineering, map: ./github_to_engineering.map.yaml }
+      - { graph: graph.knowledge,   map: ./github_to_people.map.yaml }
+```
+
+<!-- Fix (2026-06-08): this example shows the EXPLICIT/external graph-storage case
+(`storage:` + `path:`). It is not the default — per "Known High-Risk Design
+Decisions" §2 and the cluster storage layout, graph roots derive to
+`ClusterRoot/graphs/<id>.omni` by default; an external storage provider is the
+opt-in. The pipeline `into.graph` here is typed (`graph.engineering`); the bare
+`{ graph: engineering, ... }` shorthand is equally valid (normalized). -->
+
+Validation rules:
+
+- A field that expects a graph address accepts `graph.<id>`, not
+  `query.<graph>.<name>` or an arbitrary string.
+- A field that expects a query address accepts `query.<graph>.<name>`, and the
+  planner validates both the graph and the query symbol.
+- A field that expects a source provider accepts `provider.source.<name>`, not
+  `provider.storage.<name>`.
+- A field that expects storage accepts `provider.storage.<name>` or an explicit
+  storage block, not a server URL or source connector.
+<!-- Fix (2026-06-08): shorthand is a present rule, not "future syntax" — it is how
+the high-level spec's bare examples are valid. -->
+- A field whose schema **fixes the kind** accepts bare shorthand (e.g. `knowledge`
+  in a graph-ref field) and normalizes it to the typed address; a kind-ambiguous
+  or wrong-kind value is rejected with a typed diagnostic.
+- Plan and state always store the **normalized typed address**, regardless of
+  whether the surface used shorthand.
+
+## Target Components
+
+Preferred split:
+
+| Component | Responsibility | Depends on |
+|---|---|---|
+| `omnigraph-cluster` crate | Cluster spec types, path resolution, resource graph, plan model, state backend traits, apply orchestration | `omnigraph-config` only for shared simple config types if needed; avoid server deps |
+| `omnigraph` engine additions | Graph lifecycle primitives, schema-apply integration, recovery hooks for graph moves during cluster apply; optional future cluster manifest publisher if JSON state is not enough | Lance, existing graph manifest/recovery |
+| `omnigraph-cli` | `cluster *` commands, plan rendering, approval collection, state lock UX | `omnigraph-cluster`, engine |
+| `omnigraph-server` | Optional boot from cluster state, registry reload, status endpoints, policy-filtered query catalog | `omnigraph-cluster`, engine, policy |
+| `omnigraph-policy` | Cluster/server actions, per-query list/invoke scope, approval policy predicates | nothing at or above the server layer |
+| `omnigraph-queries` | Registry without exposure side-channel; dependency metadata for downstream validation | compiler/config |
+| `omnigraph-api-types` | New status/plan/apply response types if cluster HTTP endpoints ship | serde only |
+
+If the first implementation avoids a new crate, keep the same boundary in
+modules. The important constraint is that cluster spec parsing must not drag
+HTTP/server code into compiler or engine crates.
+
+## Resource Model
+
+Resource identity is stable and typed:
+
+```text
+ClusterRoot
+ResourceKey = <kind>/<scope>/<name>
+ResourceAddress = <kind>.<name> | <kind>.<graph_id>.<name>
+ProviderAddress = provider.<kind>.<name>
+
+graph/cluster/knowledge
+schema/graph:knowledge/main
+query/graph:knowledge/find_experts
+policy/cluster/base_rbac
+ui/cluster/dashboard.overview
+pipeline/cluster/github_sync
+alias/cluster/experts
+embedding/cluster/default
+```
+
+<!-- Fix (2026-06-08): resource key uses `dashboard.overview` (dot) to match the
+address form `ui.dashboard.<dashboard_name>` — was `dashboard:overview`. `dashboard`
+is the only ui sub-kind today. -->
+
+Resource records carry:
+
+| Field | Meaning |
+|---|---|
+| `kind` | Graph, Schema, Query, PolicyBundle, UiSpec, Binding, Alias, EmbeddingConfig, Pipeline |
+| `scope` | Cluster or graph id |
+| `name` | Stable resource name inside scope |
+| `fingerprint` | Content hash of the normalized spec and all referenced files |
+| `dependencies` | Resource keys this resource references |
+| `observed` | Applied graph manifest version, policy digest, query digest, schedule id, etc. |
+| `status` | `Pending`, `Planned`, `Applying`, `Applied`, `Drifted`, `Blocked`, `Error` |
+| `conditions` | Typed details such as `ActualAppliedStatePending`, `NeedsApproval`, `DependencyMissing`, `PartialPipelineRun` |
+
+The planner builds a dependency graph from these records and uses it for both
+validation and blast-radius reporting.
+
+## Terraform-Style Validate / Plan / Apply
+
+The cluster workflow deliberately mirrors Terraform's safe sequence:
+
+```text
+cluster validate   # parse + schema-check desired config, no state mutation
+cluster plan       # diff desired config against state, with optional refresh
+cluster apply      # apply an accepted fresh plan and update state
+cluster status     # read state-backed deployed reality
+cluster refresh    # repair/import observations from actual cluster state
+```
+
+Implementation rollout follows the same safety posture: ship parser/validate
+first, then read-only plan, then state backend and lock, then apply.
+
+The plan is a structured artifact, not just terminal text. It must include:
+
+| Plan field | Why it exists |
+|---|---|
+| `desired_revision` | Git commit / config digest being evaluated |
+| `resource_digests` | Exact digest of every schema, query, policy, UI, pipeline, and map file |
+| `dependencies` | Edges such as query -> graph/schema, dashboard -> query, pipeline -> source provider + graph |
+| `state_observations` | Applied revision, resource fingerprints, graph manifest versions, status conditions, and drift |
+| `changes` | Create/update/delete/replace/refresh-only operations |
+| `blast_radius` | Downstream resources to revalidate or affected behavior to surface |
+| `approvals_required` | Irreversible/data-loss or compatibility-narrowing gates |
+
+`cluster apply` must reject a stale plan when state, resource digests, or
+observed graph versions no longer match the plan base. The operator or agent
+must re-plan or explicitly refresh first.
+
+## Cluster Storage Layout
+
+Target Phase-1 cluster-root layout:
+
+```text
+<cluster-root>/
+  __cluster/
+    state.json
+    lock.json
+    status/
+      <resource-address>.json
+    approvals/
+      <ulid>.json
+    recoveries/
+      <ulid>.json
+    recovery/
+      <ulid>.json
+    resources/
+      query/<graph>/<name>/<digest>.gq
+      policy/<name>/<digest>.yaml
+      ui/<name>/<digest>.dashboard.yaml
+      pipeline/<name>/<digest>.pipeline.yaml
+  graphs/
+    <graph_id>.omni/
+```
+
+<!-- Spec note: JSON is the baseline because it matches Terraform state, is
+easy to inspect/repair, and avoids bootstrapping Lance datasets before the
+control-plane semantics are proven. -->
+The exact filenames can change, but the shape cannot:
+
+- There is one cluster-control namespace under the cluster root.
+- Graph data remains in ordinary OmniGraph graph roots.
+- State is a locked/CAS-updated JSON document, not a Lance dataset.
+- Status, approval, and recovery ledgers are append-only or per-resource JSON
+  records until table semantics are proven necessary.
+- Resource payloads are content-addressed by digest so apply can be idempotent.
+- Cluster state is not inferred from the operator's working tree.
+- A Lance-backed control-plane store is a future backend option only if
+  row-level queryability/history or tighter publish fencing justifies it.
+
+## State Backend Protocol
+
+### Cluster-Hosted JSON State
+
+When `state.backend: cluster`, the baseline backend stores JSON documents under
+`<cluster-root>/__cluster/` and protects `state.json` with object-store lock/CAS.
+It is cluster-hosted, but it is still a separate state write from graph Lance
+manifest movement.
+
+Apply protocol:
+
+1. Acquire the cluster state lock.
+2. Read current `state.json` and backend CAS token / object generation.
+3. Validate plan base still matches state.
+4. Write a cluster recovery sidecar before any graph manifest or non-idempotent
+   resource can move.
+5. Write content-addressed resource payloads and perform any required graph
+   manifest movements.
+6. CAS-update `state.json` with the new applied revision, resource
+   fingerprints, observed graph versions, status references, and approval /
+   recovery references.
+7. If step 6 fails after actual resources moved, do not acknowledge success.
+   Surface `ActualAppliedStatePending` and require `refresh` / `import` repair.
+8. Delete the sidecar and release the lock only after the state outcome is
+   recorded.
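+
+A minimal Rust sketch of steps 2, 3, 5, 6, and 7, assuming an in-memory
+stand-in for the backend. `StateStore`, `ApplyOutcome`, and `apply_with_cas`
+are hypothetical names used only for illustration; `ActualAppliedStatePending`
+is the one name taken from the protocol. Locking (steps 1 and 8) and the
+recovery sidecar (step 4) are omitted.
+
+```rust
+/// In-memory stand-in for `state.json` plus its CAS token (object generation).
+#[derive(Debug)]
+pub struct StateStore {
+    pub revision: u64,
+    pub generation: u64,
+}
+
+#[derive(Debug, PartialEq)]
+pub enum ApplyOutcome {
+    /// Resources moved and the state CAS succeeded.
+    Applied { revision: u64 },
+    /// The plan base no longer matches state; nothing moved.
+    StalePlan,
+    /// Resources moved but the state CAS failed; needs refresh/import repair.
+    ActualAppliedStatePending,
+}
+
+impl StateStore {
+    /// Compare-and-swap: write only if the caller saw the current generation.
+    pub fn cas_update(&mut self, expected_generation: u64, new_revision: u64) -> bool {
+        if self.generation != expected_generation {
+            return false;
+        }
+        self.revision = new_revision;
+        self.generation += 1;
+        true
+    }
+}
+
+/// `move_resources` stands in for step 5. It receives the store so a caller
+/// can simulate another writer bumping the generation mid-apply.
+pub fn apply_with_cas(
+    store: &mut StateStore,
+    plan_base_revision: u64,
+    move_resources: impl FnOnce(&mut StateStore),
+) -> ApplyOutcome {
+    // Step 2: read state and its CAS token.
+    let token = store.generation;
+    // Step 3: reject a plan computed against a different state revision.
+    if store.revision != plan_base_revision {
+        return ApplyOutcome::StalePlan;
+    }
+    // Step 5: resource payloads and graph manifests move here.
+    move_resources(store);
+    // Steps 6 and 7: publish state, or surface the pending condition.
+    let next = plan_base_revision + 1;
+    if store.cas_update(token, next) {
+        ApplyOutcome::Applied { revision: next }
+    } else {
+        ApplyOutcome::ActualAppliedStatePending
+    }
+}
+```
+
+The third variant is the point of the sketch: once resources have moved, a
+failed CAS is neither success nor a clean rollback, so it gets its own outcome.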
+
+### External State
+
+<!-- Spec note: external state is a separate commit domain. The protocol below
+prevents an apply from returning success after the cluster moved but the state
+ledger failed to record that movement. -->
+
+When `state.backend` points outside the cluster root, the same JSON state shape
+lives in an external store. It is locked and CAS-updated, but it is not atomic
+with Lance or OmniGraph manifests.
+
+Apply protocol:
+
+1. Acquire the external state lock.
+2. Read state and CAS token.
+3. Validate plan base still matches state.
+4. Write a cluster recovery sidecar.
+5. Perform the cluster resource changes.
+6. CAS-update external state with the new applied revision, statuses, and the
+   observed graph manifest / resource versions it records.
+7. If step 6 fails, do not acknowledge success. Surface
+   `ActualAppliedStatePending` and require `refresh` / `import` repair.
+8. Release the external lock only after the state outcome is recorded.
+
+This mode can be strongly coordinated, but it must never be documented as one
+atomic commit across both stores.
+
+### Future Lance-Backed State
+
+A Lance-backed state/status/approval/recovery store is deliberately not the
+baseline. It becomes attractive only if JSON files become a real liability:
+large status sets need structured filtering, approval/recovery history needs
+table scans, or cluster apply needs a manifest publisher that can fence state
+and graph-version pins together. Until then, Lance datasets add bootstrapping,
+schema migration, and control-plane recovery surface without enough benefit.
+
+## Cluster Manifest Publisher
+
+The cluster publisher is a possible later layer above today's graph publisher.
+It does not replace Lance or the per-graph `__manifest` table, and it is not
+required for the baseline JSON state backend or the read-only plan.
+
+Required semantics:
+
+| Requirement | Detail |
+|---|---|
+| Expected-version CAS | Every resource in an apply group supplies its expected current version/fingerprint |
+| Resource changes | Register/update/tombstone resource payloads and graph version pins |
+| Graph-head fencing | If a graph schema/lifecycle operation moves a graph manifest, the cluster manifest records the exact graph manifest version |
+| Sidecar coverage | Any graph or cluster resource that can move before cluster publish must be recoverable all-or-nothing |
+| Deterministic publish order | Sidecars and apply groups process in stable order |
+| Loud partials | If a group cannot be rolled back or forward in-process, status records the condition before more apply work proceeds |
+
+The risky case is nested publish:
+
+```text
+schema apply moves graph:knowledge manifest
+cluster apply has not yet published query/policy/state records
+process crashes
+```
+
+That is not safe unless the cluster sidecar records enough information to roll
+the graph movement forward into the cluster manifest or roll it back using the
+same recovery discipline as current graph recovery.
+
+## Plan Model
+
+Plan output is a durable, replay-checked proposal, not just pretty text:
+
+```text
+Plan {
+  plan_id,
+  desired_revision,
+  base_state_revision,
+  base_state_cas,
+  changes[],
+  apply_groups[],
+  approvals_required[],
+  blast_radius,
+  diagnostics[]
+}
+```
+
+Each change records:
+
+| Field | Meaning |
+|---|---|
+| `resource` | Stable `ResourceKey` |
+| `operation` | Create, Update, Delete, Replace, RefreshOnly |
+| `reversibility` | Reversible, Recoverable, CompatibilityNarrowing, IrreversibleDataLoss |
+| `effect` | ConfigOnly, Catalog, GraphDefinition, GraphDataRewrite, DataPlaneSchedule |
+| `downstream` | Resources that must be revalidated or will observe changed behavior |
+| `approval` | None, HumanRequired, PolicyRequired, AlreadySatisfied |
+
+`apply` must re-read state and reject stale plans unless an explicit
+`--refresh` / `--replan` path recomputes the plan.
+
+## Downstream Dependency Rules
+
+Each row lists what a change must revalidate downstream and what blocks it.
+
+| Changed resource | Must revalidate / recompute downstream | Blocking failures |
+|---|---|---|
+| Graph create/delete/rename | Policies, queries, aliases, dashboards, pipelines, bindings, server registry, state graph set | Dangling graph references; duplicate URI; invalid `GraphId`; graph delete without irreversible approval |
+| Schema | Stored queries, pipeline maps, UI bindings/query outputs, embedding/index config, data-impact preview, policy predicates once row/type pushdown exists | Unsupported migration; query breakage; missing target type/property; hard drop without approval |
+| Stored query | Aliases, UI bindings, policy list/invoke grants, MCP/tool catalog compatibility, typed params | Query file parse/type errors; registry key != `query <name>`; removed query still referenced |
+| Policy bundle | Query catalog visibility, graph/server action authorization, approval gates, bootstrap permissions | Invalid Cedar/YAML; server-scoped action in graph policy; per-query list/invoke gap unhandled |
+| UI/dashboard | Query bindings, graph refs, output field expectations, policy visibility for referenced queries | Binding to missing graph/query/param/output |
+| Alias | CLI command resolution, graph/query refs, shared-vs-personal boundary | Dangling graph/query; mutation alias pointing at read-only context |
+| Embedding config | Schema `@embed` columns, model dimension, index rebuild/reconcile, env refs | Dimension mismatch; missing env ref; unsupported model/provider |
+| Pipeline definition | Target graph schemas, mapping files, env refs, scheduler/runtime state, per-target run ledger | Missing target graph/type/property; overwrite mode without approval; source secret missing |
+| Binding | Referenced source/surface pair, dependency order, visibility policy | Missing source or target; incompatible params |
+| State backend config | Lock implementation, import/refresh protocol, apply acknowledgements | Backend missing CAS/lock; state CAS failure after graph/resource movement |
+
+## Blast Radius Matrix
+
+| Area | Required downstream change | Blast radius | Notes |
+|---|---|---|---|
+| Config parsing | Add strict `cluster.yaml` parser, path/env-ref resolver, resource fingerprints, no layered merge | High | Separate from `OmnigraphConfig`; existing config tests still need backcompat coverage |
+| CLI | Add `cluster validate/plan/apply/status/refresh/import`, plan rendering, approval flags, actor threading | High | Must not change existing command selection or `omnigraph use` behavior |
+| State backend | Add JSON state document, status/approval/recovery records, lock/CAS, and import/refresh repair | High | Must not silently succeed after state CAS failure |
+| Optional cluster publisher | Add a cluster manifest plus table-backed state/status store only if stronger all-or-nothing apply is required | Very high | Touches core atomicity and recovery invariants |
+| Recovery | Add cluster sidecars and failpoint coverage for graph-move-before-state-publish gaps | Very high | Any missed sidecar is a correctness bug |
+| Graph lifecycle | First-class graph resource create/delete/rename or stable-id story | High | Current server add/remove is intentionally not exposed |
+| Schema apply integration | Make schema apply cluster-aware or wrap it with cluster recovery | High | Existing schema apply cannot be treated as cluster atomic by assertion |
+| Query registry | Remove target-state exposure flag, add dependency metadata, keep `mcp.expose` bridge | Medium/high | Catalog behavior is observable public API |
+| Policy | Add cluster plan/apply/admin actions and per-query list/invoke scope | High | Needs docs, tests, Cedar schema migration, and compatibility with coarse `invoke_query` |
+| Server registry | Boot from cluster state, eventually reload/reconcile graph handles, expose statuses | High | Affects routing, OpenAPI, auth, and workload admission |
+| API types/OpenAPI | Plan/status/apply DTOs if HTTP management endpoints ship | Medium/high | OpenAPI drift must be regenerated |
+| UI specs | New renderer/spec validator/binding checker | High | New product surface, not currently implemented |
+| Pipelines | New scheduler/runtime/connector/mapping/idempotency/run ledger | Very high | Second data-plane seam; large product and correctness surface |
+| Embeddings | Cluster-level defaults, env refs, model/dimension validation, index interaction | Medium | Existing embedding code is mostly offline/client-side |
+| Docs | User docs for cluster config, policy, server, CLI; dev docs for invariants/testing | High | Public contract changes |
+| Tests | New cluster suites plus extensions to config/server/policy/recovery/schema/query tests | High | Needs boundary-matched coverage |
+
+## Reversibility And Approval Tiers
+
+| Tier | Examples | Gate |
+|---|---|---|
+| Display-only | Dashboard layout, non-breaking alias addition | No approval beyond policy |
+| Catalog behavior | Add query, hide/list query via policy, add policy grant | Policy check; no data-loss approval |
+| Compatibility narrowing | Future validated enum narrowing, query param removal, policy removal that revokes access | Explicit compatibility warning; may require human approval by policy |
+| Recoverable definition rewrite | Soft schema drop, graph schema rename, index rebuild | Plan warning; no data-loss approval unless policy requires |
+| Irreversible data loss | Graph delete, hard schema drop, cleanup-triggered prior-version reclamation, overwriting pipeline target | Human approval artifact recorded in audit ledger |
+
+Future enum narrowing belongs in `CompatibilityNarrowing` unless the migration
+also drops/coerces data or triggers cleanup. That distinction matters for plan
+wording and for policy predicates.
+
+## Rollout Phases
+
+<!-- Spec note: the only safe path is staged. The cluster control plane crosses
+config, engine, server, policy, and data-plane-adjacent surfaces; a big-bang
+replacement would make every invariant harder to audit. -->
+
+### Phase 0: Documentation And Parser Skeleton
+
+- Add cluster spec types and strict parser behind an unused feature/module.
+- Implement `cluster validate --config <folder>` with no state backend.
+- Validate file paths, env refs, duplicate resource keys, and dependency graph.
+- No behavior change to `omnigraph.yaml`, server boot, or query exposure.
+
+### Phase 1: Read-Only Planning
+
+- Add `cluster plan` against a mock/imported state snapshot.
+- Produce plan JSON and human output.
+- Reuse existing schema migration planner for schema resources.
+- Validate stored queries against desired schema.
+- Compute downstream dependencies and blast radius.
+- Still no apply.
+
+### Phase 2: State Backend And Lock
+
+- Add `state.backend: cluster` JSON storage and lock/CAS.
+- Add external backend trait only if lock + CAS semantics are explicit.
+- Add `cluster status`, `refresh`, and `import`.
+- Persist `AppliedRevision`, `ResourceStatus`, and audit references in JSON.
+
+### Phase 3: Config-Only Apply
+
+- Apply query, policy, UI, alias, embedding, and pipeline definition resources
+  that do not move graph manifests.
+- Publish by writing content-addressed resource payloads and CAS-updating
+  `state.json`.
+- Keep server boot from `omnigraph.yaml`; cluster state is inspectable but not
+  yet serving traffic.
+
+### Phase 4: Graph And Schema Apply
+
+- Add graph create/delete as cluster resources.
+- Make schema apply cluster-aware, with sidecar coverage for graph manifest
+  movements before JSON state publish.
+- Gate irreversible data-loss operations with approval artifacts.
+- Consider a cluster manifest publisher only if the JSON sidecar + repair path
+  is not strong enough for the accepted safety contract.
+
+### Phase 5: Server Reads Cluster Catalog
+
+- Allow server startup from cluster state.
+- Add status and catalog endpoints as needed.
+- Keep the current `omnigraph.yaml` startup path as compatibility mode.
+- Regenerate OpenAPI for any HTTP surface.
+
+### Phase 6: Policy-Owned Query Exposure
+
+- Add per-query policy scope for list/invoke.
+- Filter `GET /queries` by actor.
+- Keep coarse `invoke_query` as a broad allow rule for compatibility until
+  docs and migrations say it can be narrowed.
+- Deprecate and later remove `mcp.expose` from target-state cluster config.
+
+### Phase 7: Pipeline Runtime
+
+- Add scheduler/worker/runtime.
+- Add source connector contracts, mapping validation, idempotency keys,
+  per-target run status, and retry behavior.
+- Treat fan-out execution as data-plane writes unless explicitly staged through
+  branch/merge.
+
+## Test Ownership
+
+Tests must prove the Terraform-style workflow, not just individual parsers.
+The minimum behavior contract:
+
+```text
+validate catches bad config
+plan is deterministic and complete
+apply only applies a fresh accepted plan
+state changes are locked and durable
+drift and partial convergence are visible, not silent
+```
+
+| Change | Existing coverage to extend | New coverage likely needed |
+|---|---|---|
+| Cluster parser | `omnigraph-config` inline config tests for strictness/path resolution | `omnigraph-cluster` parser/dependency tests |
+| Plan dependency graph | Schema planner tests, query registry tests | Golden plan JSON for cross-resource downstream impacts |
+| State lock/backend | Existing schema apply lock tests as model | JSON state CAS/lock race tests |
+| Optional cluster manifest publisher | `crates/omnigraph/src/db/manifest/tests.rs` | Cluster publisher CAS, expected-version, deterministic order tests if that backend ships |
+| Cluster recovery | `recovery.rs`, `failpoints.rs` | Phase B -> state publish failpoints, external state CAS failure tests |
+| Schema cluster apply | `schema_apply.rs`, failpoints schema apply cases | Nested graph/cluster recovery tests |
+| Query exposure policy | `omnigraph-policy` invoke_query tests, server query catalog tests | Per-query list/invoke allow/deny and no-probing tests |
+| Server cluster boot | `omnigraph-server/tests/server.rs`, `openapi.rs` | Boot from cluster state, registry reload/status tests |
+| CLI cluster commands | `omnigraph-cli/tests/cli.rs`, `system_local.rs` | `cluster validate/plan/apply/status` system tests |
+| Pipelines | None today | New runtime/mapping/idempotency/run-ledger suites |
+
+Workflow-specific tests:
+
+| Workflow area | Required assertions |
+|---|---|
+| Parser / validate | Unknown fields, wrong-kind typed addresses, missing providers, inline secret values, dangling graph/query/pipeline refs, and future-phase fields fail with typed diagnostics |
+| Plan goldens | Given config + imported/fake state, plan JSON contains stable resource digests, dependency edges, state observations, proposed changes, blast radius, and approval gates in deterministic order |
+| Fresh-plan apply | Changing config digest, state revision, resource digest, or observed graph manifest version after planning makes `cluster apply` reject and require re-plan/refresh |
+| State lock / CAS | Concurrent applies against the same backend cannot both succeed; loser gets a typed lock/CAS conflict |
+| Recovery / partial apply | Fail after graph/resource movement but before cluster state publish; assert recovery or status surfaces `ActualAppliedStatePending`/sidecar state and never returns success |
+| Server/runtime phase | Before cluster state drives routing or registry reload, tests are hermetic: no real home dir, no real global config, no real credentials, no ignored remote tests |
+| Pipeline phase | Fan-out run records per-target status, commit ids, retryability, and idempotency keys; no aggregate success unless every target succeeded |
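+
+The pipeline-phase assertion above ("no aggregate success unless every target
+succeeded") can be sketched in Rust as follows. `TargetStatus`, `RunOutcome`,
+and `aggregate` are hypothetical names, and the field set is an assumption
+rather than the run-ledger schema, which this spec leaves open.
+
+```rust
+/// Per-target result of one fan-out run (illustrative fields only).
+#[derive(Debug, Clone, PartialEq)]
+pub enum TargetStatus {
+    Succeeded { commit_id: String },
+    Failed { retryable: bool },
+}
+
+#[derive(Debug, PartialEq)]
+pub enum RunOutcome {
+    Succeeded,
+    Partial { failed_targets: Vec<String> },
+}
+
+/// Report success only if every target succeeded; otherwise name the failures.
+/// An empty target list is vacuously successful in this sketch.
+pub fn aggregate(targets: &[(String, TargetStatus)]) -> RunOutcome {
+    let failed: Vec<String> = targets
+        .iter()
+        .filter(|(_, status)| matches!(status, TargetStatus::Failed { .. }))
+        .map(|(graph, _)| graph.clone())
+        .collect();
+    if failed.is_empty() {
+        RunOutcome::Succeeded
+    } else {
+        RunOutcome::Partial { failed_targets: failed }
+    }
+}
+```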
+
+Hard gates:
+
+- Do not ship `cluster apply` until `cluster validate` and read-only
+  `cluster plan` have hermetic tests.
+- Do not ship graph/schema-moving apply until failpoint recovery tests prove the
+  Phase B -> state publish gap is covered.
+
+For docs-only changes, `scripts/check-agents-md.sh` is enough. For
+implementation phases, run the boundary tests above before widening to
+`cargo test --workspace --locked`.
+
+## User-Visible Documentation Fallout
+
+The following public docs must change when the corresponding phase ships:
+
+| Phase | User docs |
+|---|---|
+| Parser/validate | New `docs/user/cluster-config.md`; CLI reference for `cluster validate` |
+| Plan/apply | CLI reference, transactions, policy, errors |
+| State backend | Storage, deployment, constants, maintenance |
+| Server cluster boot | Server, deployment, OpenAPI |
+| Policy query exposure | Policy, server, query language / stored-query registry docs |
+| Pipelines | New pipeline user guide, deployment, audit, errors |
+| Embeddings config | Embeddings, indexes |
+
+Do not ship a user-visible command, flag, env var, endpoint, or config key
+without updating the corresponding user doc in the same PR.
+
+## Known High-Risk Design Decisions
+
+1. **Cluster root identity.** Decide whether `metadata.name` is a label or
+   identity. Prefer root-derived stable identity plus display name to avoid a
+   rename breaking resource identity.
+2. **Graph storage derivation.** The high-level sample omits graph storage.
+   Implementation should derive graph roots under `ClusterRoot/graphs/<id>.omni`
+   by default and treat external graph roots as a separate, explicit feature.
+3. **Nested apply.** Schema apply and graph lifecycle cannot move a graph
+   manifest outside cluster sidecar coverage.
+4. **External state.** Must expose pending repair instead of returning success
+   when graph/resource movement succeeds and external state CAS fails.
+5. **Per-query policy.** Catalog filtering must avoid probing leaks: callers
+   without list/invoke permission should not distinguish hidden from missing.
+6. **Pipeline fan-out.** Do not promise atomic multi-graph ingestion unless the
+   runtime uses a real branch/merge or equivalent protocol for every target.
+7. **Drift correction.** Reconciler-initiated deletes are the same data-loss
+   class as human-requested deletes.
+
+## Exit Criteria For A Real RFC
+
+Before implementation begins beyond parser/validate, the RFC must answer:
+
+1. Exact JSON state/status/approval/recovery schemas and object-store paths.
+2. Exact sidecar JSON schema and recovery decision matrix.
+3. State backend interface and supported lock/CAS implementations.
+4. Cluster apply group syntax and dependency ordering rules.
+5. Plan JSON schema, including blast-radius and approval fields.
+6. Bootstrap authority and first-actor story.
+7. Server startup and migration path from `omnigraph.yaml`.
+8. Per-query policy schema and compatibility bridge for `mcp.expose`.
+9. Pipeline runtime owner, status schema, and idempotency contract.
diff --git a/docs/dev/cluster-config-specs.md b/docs/dev/cluster-config-specs.md
new file mode 100644
index 0000000..8094be2
--- /dev/null
+++ b/docs/dev/cluster-config-specs.md
@@ -0,0 +1,415 @@
+# Cluster Config Spec — Declarative, As-Code, Agent-Operated
+
+**Status:** Draft / thinking-in-progress
+**Type:** Architecture direction
+**Date:** 2026-06-07
+**Relationship:** generalizes today's `omnigraph.yaml` graph/query/policy configuration surface ([CLI reference](../user/cli-reference.md), [server docs](../user/server.md)) into a future cluster control plane. The distilled rules are in [cluster-axioms.md](cluster-axioms.md); the detailed downstream implementation spec and blast-radius assessment are in [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md). This is a proposed architecture, not an implemented RFC.
+
+> **Revision 2026-06-07 — full commitment to the Terraform paradigm.** Three changes from the earlier draft: (1) **state is an authoritative, locked ledger in a backend** (server-hosted *or* a separate cloud store), not "a mostly-rebuildable projection"; (2) `plan` is framed as the **CLI diff between local config and state**; (3) **ETL pipelines** (external data sources) are a first-class config asset — a second seam, alongside schema, where a definition triggers a data-plane effect. The full set of config assets (incl. **aliases**, **embeddings**) is enumerated below.
+
+---
+
+## The problem (the Sarah/Bob test)
+
+Two operators, Sarah and Bob, administer the same OmniGraph deployment. Sarah adds new queries, changes a schema, adds a dashboard, updates policies, and wires in a new data feed.
+
+**How does Bob find out?**
+
+Today he can't — not cleanly. Sarah's changes land in many different places via many different mechanisms:
+
+- schema → the schema-apply path, accepted state in `_schema.pg`, `_schema.ir.json`, `__schema_state.json`, and table versions in the graph manifest
+- queries → `.gq` files passed per request or resolved through CLI query roots / aliases; not durable cluster state
+- policies → `policy.file` in `omnigraph.yaml`, pointing at Cedar/YAML files that are usually GitOps'd externally
+- aliases → CLI sugar in each operator's `omnigraph.yaml`
+- external data → ad-hoc `load`/`ingest` scripts, cron jobs, glue code that lives nowhere durable
+- UI → undefined
+
+There is no single diff that spans them, no single change record attributed to Sarah, no one place Bob (or Bob's agent) reads to answer "what is this deployment, and what changed?" The state is **fragmented**, and fragmentation is hostile to the one thing an agent must do: reason over the system *as a whole*.
+
+A design passes only if it answers the Sarah/Bob test directly.
+
+---
+
+## Thesis
+
+The unit of declarative state is the **cluster** (the deployment), described by **a single config, as code, in version control**, operated by an **agent** through a plan/apply/reconcile loop against an authoritative state ledger.
+
+Every surface is a declarative as-code artifact — schema (`.pg`), queries (`.gq`), policies (`.yaml`), UI (`.yaml`), aliases, **ETL pipelines**, and embeddings config. The UI is not a separately-deployed application; it is a declarative spec, a first-class resource reconciled exactly like the others.
+
+Three pillars, none optional:
+
+1. **DECLARATIVE** — you describe the desired end state, not the steps. The reconciler computes the steps.
+2. **AS CODE** — the config is declarative text in a repo, version-controlled. This is the **source of truth for *intent***.
+3. **OPERATED BY AGENT** — an agent authors config changes and drives reconciliation as an authenticated actor, with policy and approval gates. No human state-management burden.
+
+This is **Terraform's model, taken literally**, applied at **cluster** scope: config (as code) is desired truth; **state is an authoritative, locked ledger** of what has been applied, held in a backend (the cluster, or a separate cloud store); `plan` diffs config against state; and `apply` converges reality to config and updates state. OmniGraph acts as its own data-aware provider, and an agent acts as the controller.
+
+---
+
+## Why as-code (the recursion argument)
+
+"As code" is not branding. It is the structural property that makes a self-describing system well-founded.
+
+Consider the rejected alternative: model the cluster's definition *as a graph* (a meta-graph whose nodes are graphs/policies/queries/UI). To describe a graph you need a schema. The meta-graph's schema is either:
+
+- **hardcoded** → the base case is *code* (you smuggled code in at the bottom anyway), or
+- **another graph** → infinite regress, no base case.
+
+Graph-describing-graph never terminates. **Code is the base case.** A declarative config needs no meta-describer because it is parsed by the engine's compiled code — not described by more user-space data.
+
+> **Declarative-as-code terminates. Declarative-as-data (a graph of graphs) recurses.**
+
+This is also why **config** must live **outside** the running system: reviewable (PRs), reproducible (clone + apply), diffable as text, and editable by an agent — without depending on the running system to describe its own intent.
+
+Corollary on direction: change flows **code → cluster, never the reverse.** You do not edit the running system and call that intent. (State, separately, *records* what the cluster currently is — see the next section — but it is never where you express what it *should* be.)
+
+---
+
+## Why per-cluster, not per-graph
+
+The definition Sarah changed does not *belong* to any single graph:
+
+1. **Policies cross-cut graphs.** "Member can't delete on any graph," "who may list/create/delete graphs" — cluster facts. No graph could own them.
+2. **"Which graphs exist" has no home in a per-graph model.** The set of graphs is state *above* any graph.
+3. **Queries, UI, pipelines, and aliases span graphs.** The MCP/tool catalog an agent discovers is the *cluster's* surface; a dashboard renders multiple graphs; a pipeline may fan out into several.
+4. **Cross-graph apply groups.** Sarah may add a graph *and* wire it into the UI *and* grant policy access *and* attach a feed as one logical change — only the cluster can express, plan, and eventually fence that as one apply group.
+5. **Operators operate clusters.** Bob is Sarah's peer on a *deployment*, not a graph. The collaboration unit is the cluster.
+
+The graph is a *resource within* the cluster, not the unit of operation.
+
+The mirror question — *why not per-fleet?* — is the same one this section used against per-graph, one level up. A fleet of clusters may eventually want its own declarative spec describing which clusters exist. That recursion is real but **out of scope here**: this proposal stops at the cluster because the cluster is the unit two operators collaborate over. Fleet is the next scope up, named and deferred, not denied.
+
+---
+
+## The model: config / state / reconcile (the Terraform model, literally)
+
+| Layer | What it is | Source of truth for… | Who manages it |
+|---|---|---|---|
+| **Config** (as code, a folder of files) | Desired state of the whole cluster — graphs, schemas, policies, queries, UI, bindings, aliases, embeddings, ETL pipelines | **Intent** ("what it should be") | Operators/agents, in version control |
+| **State** (a locked ledger in a backend) | The authoritative record of what has been applied — applied revision, per-resource fingerprints, observed graph/table versions, audit-record references, resource conditions | **Deployed reality** ("what is") | The reconciler; humans don't hand-edit it |
+| **Actual cluster** | The realized *definition* of the running graphs — schema/policies/queries/UI/pipelines as actually in force | — (reality itself) | The engine; `apply` converges it to config |
+
+**`plan`** = `diff(config, state)` → proposed change set (optionally refreshed against the actual cluster).
+**`apply`** = acquire the state lock → converge actual → config → **update state** → release lock. Apply does **not** acknowledge success until the state update succeeds; if actual moved but the state write failed, the next `plan` / `refresh` must surface the non-success state and repair or import it before more work proceeds.
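+
+A minimal Rust sketch of `diff(config, state)`, assuming both sides reduce to a map from resource address to content digest. The `Operation` enum and the `diff` signature are illustrative assumptions; the sketch covers only create, update, and delete, and ignores refresh, dependency edges, and approval gates.
+
+```rust
+use std::collections::BTreeMap;
+
+#[derive(Debug, PartialEq)]
+pub enum Operation {
+    Create,
+    Update,
+    Delete,
+}
+
+/// Diff desired digests (config) against applied digests (state).
+/// Keys are resource addresses; values are content digests.
+pub fn diff(
+    config: &BTreeMap<String, String>,
+    state: &BTreeMap<String, String>,
+) -> Vec<(String, Operation)> {
+    let mut changes = Vec::new();
+    for (key, digest) in config {
+        match state.get(key) {
+            // In config but not in state: must be created.
+            None => changes.push((key.clone(), Operation::Create)),
+            // In both, with different digests: must be updated.
+            Some(applied) if applied != digest => {
+                changes.push((key.clone(), Operation::Update))
+            }
+            // In both, same digest: nothing to do.
+            Some(_) => {}
+        }
+    }
+    // In state but no longer in config: must be deleted.
+    for key in state.keys() {
+        if !config.contains_key(key) {
+            changes.push((key.clone(), Operation::Delete));
+        }
+    }
+    // Sort by address so the same inputs always yield the same plan order.
+    changes.sort_by(|a, b| a.0.cmp(&b.0));
+    changes
+}
+```
+
+Because the inputs are digests recorded in state rather than a live scan, the diff is cheap and deterministic, which is the property the ledger model relies on.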
+
+### State is an authoritative, locked ledger — not a throwaway projection
+
+This is the 2026-06-07 revision. State is treated exactly as Terraform treats `tfstate`:
+
+- **Authoritative.** State is the trusted record of what is deployed. `plan` diffs config against **state** (fast, deterministic), not against a full live scan of the cluster on every command. "What exists" is answered from state.
+- **In a backend.** State lives in a configurable backend: the **cluster's own object-store backend**, or a **separate cloud store** (e.g. a different bucket/account) — the operator's choice, mirroring Terraform's local/S3/remote backends. The config declares which.
+- **JSON first.** The baseline state format is Terraform-style JSON documents (`state.json` plus status/approval/recovery JSON records) protected by backend lock/CAS. Lance control-plane datasets are a possible later backend only if row-level history, queryability, or tighter publish fencing justifies the added machinery.
+- **Atomicity depends on backend and publish scope.** A JSON state backend, even when stored under the cluster root, is a separate CAS step from graph Lance manifest moves. If actual resources move but the state write fails, apply must surface `ActualAppliedStatePending` (or equivalent) and require refresh/import repair instead of pretending one atomic commit covered every object. A future Lance-backed state backend or cluster manifest publisher may tighten this, but that is not the baseline assumption.
+- **Locked.** `plan`/`apply` acquire a **state lock** before touching state, so two operators (or two agents) cannot converge concurrently and corrupt the ledger. This generalizes the existing `__schema_apply_lock__` from schema scope to cluster scope.
+- **Reconstructable, but not casually rebuilt.** OmniGraph's edge over opaque-cloud Terraform: the running cluster is self-describing (manifests, commit logs), so a lost state ledger can be **imported / refreshed** from the live cluster. That is a *resilience* property — not licence to treat state as disposable. State is protected and backed up like any source of truth.
+- **One slice is never reconstructable.** Who *approved* an irreversible apply cannot be re-derived from a manifest scan. That approval/audit record lives in the **durable audit ledger** (baseline: append-only JSON records in the state backend; future: a Lance table only if needed). State *references* it by id; it never *is* it.
+
+**The control plane reconciles definition, not data.** The reconcile loop converges the cluster's *definition* — schema, policies, queries, UI, bindings, aliases, pipelines, and the set of graphs. It does **not** converge **data**: rows, edges, and vectors are data-plane content, mutated by `load`/`mutate` and by **pipeline execution**, versioned by the commit DAG, and they sit entirely outside the reconcile loop. (`load`/`mutate` never appear in `cluster.yaml`.) **Two** definition kinds *trigger* a data-plane effect without owning data — schema and ETL pipelines (see "ETL pipelines" below).
+
+### Cluster resource model
+
+Minimum vocabulary:
+
+- **ClusterRoot** — the object-store prefix / control namespace for one deployment.
+- **DesiredRevision** — git commit, `cluster.yaml` digest, and per-resource digests.
+- **ResourceKind** — `Graph`, `Schema`, `Query`, `PolicyBundle`, `UiSpec`, `Binding`, `Alias`, `EmbeddingConfig`, **`Pipeline`** (ETL), and future cluster-scoped resources.
+- **ResourceAddress** — normalized typed references between resources, such as `graph.knowledge`, `query.knowledge.find_experts`, `policy.base_rbac`, and `pipeline.github_sync`; illustrative YAML may use shorthand, but plan/state store the typed form.
+- **ProviderAddress** — typed references to provider instances, such as `provider.storage.prod_graphs`, `provider.source.github_org`, and `provider.embedding.default`; provider addresses keep storage, external sources, and embedding providers from being inferred from ambiguous strings.
+- **StateBackend** — where the JSON state ledger is stored: `cluster` (this deployment's own backend) or an external store (a separate bucket/account).
+- **StateLock** — the cluster-scope lock acquired before plan/apply.
+- **AppliedRevision** — the durable, locked record (the heart of state) of which desired revision is applied, with audit-record references, resource fingerprints, and graph/table version observations.
+- **ResourceStatus** — `Pending | Planned | Applying | Applied | Drifted | Blocked | Error`, with typed conditions and observed actual state.
+- **ApplyGroup** — the explicit atomicity unit. Default is one independent resource per group; cross-resource references force planner-derived groups, and user-declared groups may opt into larger atomicity only for resources the active backend protocol can fence or repair. Baseline JSON state supports small, explicit groups; larger all-or-nothing groups require a future cluster publisher or equivalent proof.
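+
+To make the **ResourceAddress** entry concrete, here is a minimal Rust sketch that normalizes the dotted examples above into a typed form. The grammar (a kind segment followed by one or two name segments) is inferred from those four examples and is an assumption, as are the names `ResourceAddress` and `parse_address` as Rust items; other resource kinds are left out.
+
+```rust
+#[derive(Debug, PartialEq)]
+pub enum ResourceAddress {
+    Graph { name: String },
+    Query { graph: String, name: String },
+    Policy { name: String },
+    Pipeline { name: String },
+}
+
+/// Parse a dotted address such as `query.knowledge.find_experts`.
+/// Unknown kinds and wrong segment counts are rejected, not guessed at.
+pub fn parse_address(raw: &str) -> Result<ResourceAddress, String> {
+    let parts: Vec<&str> = raw.split('.').collect();
+    if parts.iter().any(|p| p.is_empty()) {
+        return Err(format!("empty segment in address `{raw}`"));
+    }
+    match parts.as_slice() {
+        ["graph", name] => Ok(ResourceAddress::Graph {
+            name: name.to_string(),
+        }),
+        ["query", graph, name] => Ok(ResourceAddress::Query {
+            graph: graph.to_string(),
+            name: name.to_string(),
+        }),
+        ["policy", name] => Ok(ResourceAddress::Policy {
+            name: name.to_string(),
+        }),
+        ["pipeline", name] => Ok(ResourceAddress::Pipeline {
+            name: name.to_string(),
+        }),
+        _ => Err(format!("unrecognized resource address `{raw}`")),
+    }
+}
+```
+
+Rejecting a wrong-kind or wrong-arity address at parse time is what lets plan and state store only the typed form.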
+
+---
+
+## State: backend, lock, and the config ↔ state diff
+
+The CLI is the operator's window onto the gap between config and state.
+
+The Terraform-aligned workflow is:
+
+```text
+cluster validate   # parse + schema-check desired config, no state mutation
+cluster plan       # diff desired config against state, with optional refresh
+cluster apply      # apply an accepted fresh plan and update state
+cluster status     # read what state says is deployed now
+cluster refresh    # update/import state observations from actual cluster state
+```
+
+`plan` is the central artifact. It records the desired revision, resource
+digests for every referenced file, dependency edges between resources, observed
+state fingerprints / graph manifest versions, proposed changes, and approval
+gates. The human output below is a rendering of that structured plan, not the
+only representation.
+
+```text
+  $ omnigraph cluster plan
+    config ./   →   diff against state   (backend: cluster · lock: acquired)
+
+    ~ schema    knowledge    hard-drop Person.legacy_id              ⚠ prior versions reclaimed — needs approval
+    + query     knowledge.find_experts                              (new stored query)
+    - query     knowledge.orphan_pages                              (removed)
+    ~ policy    base_rbac    grant invoke find_experts → members    (this is what EXPOSES the new query)
+    + pipeline  saas_sync           notion → knowledge, hourly
+    ~ ui        dashboards.overview  add panel "experts"
+    + alias     experts
+    ─────────────────────────────────────────────────────────────────────
+    7 changes · 1 requires approval (hard schema drop on knowledge) · run `apply` to converge
+```
+
+<!-- Audit fix: enum narrowing is not implemented today; hard drops are the
+current supported irreversible schema path, so the example must not teach a
+future migration tier as if it already exists. -->
+That output **is** the answer to the Sarah/Bob test: one diff, spanning every surface, attributed to a git commit and concrete resource digests, with data impact previewed (the axiom-6 schema seam), dependency fallout visible, observed state compared, and approval gates surfaced *before* anything moves. Drift (someone changed the live cluster out-of-band) shows up here too: when `plan` refreshes, it compares the ledger against the actual cluster and flags resources whose observed version no longer matches.
+
+<!-- Audit fix: JSON state is the baseline. It is inspectable and Terraform-like,
+but it remains a separate CAS step from graph manifest movement. -->
+`apply` then: acquire **state lock** → execute the change set (ordered/grouped per the planner) → **CAS-update the JSON state ledger** with the new applied revision/status observations → release the lock. For config-only resources, content-addressed payload writes can happen before the state CAS because state is the publish point. For graph/schema moves, the graph manifest may move before the state CAS; a crash or CAS failure there leaves a loud repair/import condition and no success acknowledgement, not a silently successful atomic apply. A future cluster manifest publisher can tighten this gap, but the baseline protocol does not assume it.
+
+---
+
+## ETL pipelines (the second data-plane seam)
+
+External data — from another database, an API, a file drop, a stream — is a first-class config asset, not glue code that lives nowhere.
+
+A **Pipeline** is declared in config: a **source** (e.g. `notion`, `github`, `slack`, `gdrive`, `postgres`, `http`, `s3-files`, `kafka`), an optional **schedule/trigger**, and **one or more target graphs**, each with its own **mapping/transform** (external records → graph types & properties). A single feed can **fan out across graphs** — e.g. a GitHub sync that populates both the `engineering` graph and the people/teams in `knowledge`. It is reconciled like any resource — `apply` creates / updates / deletes / (re)schedules the pipeline *definition*. This is the canonical "company brain" move: the deployment's graphs are continuously assembled from the SaaS tools the org already uses.
+
+The crucial boundary (axiom 6, axiom 13): the pipeline **definition** is control-plane and reconciled; the pipeline's **execution** — actually pulling rows and writing them — is a **data-plane effect** that produces ordinary `load`/`mutate` commits *outside* the reconcile loop. The reconciler converges the pipeline; the rows it ingests are never reconciled state (just as a cron *definition* is config but its output is not). This makes ETL the **second seam** where a definition triggers a data-plane effect — schema being the first (a migration conforms existing rows; ETL ingests new ones).
+
+Consequences that fall out of the existing model:
+
+- **`plan` previews the pipeline, not the data.** "pipeline `saas_sync`: notion → `knowledge`, hourly" is a definition diff. It does not scan the source, so its cost is independent of data volume, just as schema `plan` touches data only through the bounded, opt-in peek.
+- **Source credentials come from the `.env` file** (axiom 10): `token: ${NOTION_TOKEN}` — resolved from the gitignored `.env` file per deployment, never inline.
+- **Reversibility gradient applies** (axiom 8): a pipeline that *appends* is reversible-ish; one configured to *overwrite* a target is a data-loss path and hits the irreversible-op gate.
+- **Referential integrity is plan-time** (axiom 9): a pipeline whose `into:` names a graph/type the same revision removes is a fail-closed `plan` error.
+- **Fan-out is statusful, not magically atomic.** A pipeline execution that writes to several graphs is a set of ordinary per-target graph writes unless the pipeline explicitly stages through a branch/merge protocol that can fence those targets. A failed run may therefore leave `engineering=Applied`, `knowledge=Error` (for example), and the pipeline run ledger must expose per-target status, commit ids, retryability, and idempotency keys. Control-plane `apply` only converges the definition/schedule; it never means every future data-plane target has ingested successfully.
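+
+To make the fan-out bullet concrete, here is a hypothetical run-ledger record for one partially failed `github_sync` execution. Every field name and value is an assumption made for illustration; the spec only requires that per-target status, commit ids, retryability, and idempotency keys be exposed, and does not fix a format.
+
+```yaml
+# Hypothetical pipeline run record (illustrative shape only)
+pipeline: pipeline.github_sync
+run_id: run-0001                    # placeholder identifier
+targets:
+  - graph: graph.engineering
+    status: Applied
+    commit_id: "<commit id>"        # the ordinary load/mutate commit this target produced
+    idempotency_key: "<key>"
+  - graph: graph.knowledge
+    status: Error
+    commit_id: null                 # nothing was committed for this target
+    retryable: true
+    idempotency_key: "<key>"
+```
+
+Control-plane `apply` would still report the pipeline definition as applied in this situation; the failure belongs to the run, not to the reconciled definition.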
+
+---
+
+## Config assets — the full set
+
+Everything below is **shared cluster config** (in the folder, version-controlled, secret-free) unless marked per-operator. The rule of thumb: if two operators must agree on it, it's config; if it's how *you personally* reach or view the cluster, it's per-operator.
+
+| Asset | In config? | Notes |
+|---|---|---|
+| **Graphs** (the set that exists) | ✅ config | the named graphs; their existence is cluster state |
+| **Schema** (`.pg`, **one per graph**) | ✅ config | also encodes indexes (`@index`/`@unique`/vector), constraints, and search (`@embed`) — so indexes & search are reconciled *via* schema |
+| **Stored queries** (`.gq`, **per graph**) | ✅ config | a `.gq` file declares **many** named queries; the registry declares which exist (name → file, key must match the `query <name>` symbol). **Target design:** exposure — who may list/invoke each — is a policy decision, not a registry flag. **Current compatibility bridge:** shipped `omnigraph.yaml` still has `queries.<name>.mcp.expose`, and the HTTP catalog is not Cedar-filtered per query yet. Aliases & bindings reference a query by name |
+| **Policy bundles** (`.yaml`) | ✅ config | YAML (not Cedar files); **shared across graphs** via `applies_to: [cluster \| <graph refs>]` (many-to-many; fix 2026-06-08 unified the old `scope:`/`graphs:` split). Gates actions **and query exposure** (who may list/invoke each stored query) |
+| **UI specs / dashboards** (`.yaml`) | ✅ config | first-class resources; a dashboard **reads from several graphs** (`graphs: [...]`) |
+| **Bindings** | ✅ config | wiring between resources (query ⇄ UI surface) |
+| **Aliases** | ✅ config* | CLI shortcut to a stored query: `{ command, query: <.gq file>, name: <symbol>, args, format }` — `query` is the **file**, `name` the **query symbol** in it. See note |
+| **Embeddings config** | ✅ config | model + dimension + which fields embed; the **API key comes from the `.env` file** (`${…}`) |
+| **ETL pipelines** | ✅ config | source → transform → **one or more target graphs**; source credentials come from the `.env` file |
+| **Apply settings** | ✅ config | `apply.default_grain`, grouping/ordering hints |
+| **State backend + lock** | ✅ config | where the ledger lives, whether to lock |
+| **Secrets (`.env` file)** | ✅ ref'd by config; values **gitignored** | a separate `.env` of secret values, referenced as `${NAME}`; never committed (OmniGraph's standard env-file convention) |
+| **Connection** (which cluster URI) | ❌ per-operator | how *you* reach the cluster |
+| **Operator token** | ❌ per-operator (secret) | each operator's own credential to reach the cluster |
+| **CLI prefs** (output format, table layout, active graph/branch selection) | ❌ per-operator | personal ergonomics, not shared truth |
+
+\* **Aliases — the one with a split.** A shared alias that names a cluster resource (a stored query, a dashboard) is config — it's a vocabulary the whole team relies on, and it belongs in the spec (often it *is* just the stored-query catalog entry, since that already carries name + params + tool metadata). A *purely personal* shortcut (your own command abbreviations) stays in the per-operator layer. When in doubt: if it should survive `git clone` and be the same for Bob as for Sarah, it's config.
+
+---
+
+## The synthesis (beyond vanilla Terraform)
+
+Embracing Terraform does not mean stopping at Terraform. Three extensions make this specifically right for OmniGraph and the agentic future:
+
+1. **OmniGraph is its own data-aware provider, and `plan` can peek across the data boundary.** A Terraform provider CRUDs resources blind to your data. Here, the control-plane resource is the schema **definition** (declarative, reconciled); converging it *triggers* a data-plane **effect** — currently soft/hard drops, rewrites, and index creation, with future validated migrations such as enum narrowing or `String`→`enum` conversion once the planner grows that tier. The leverage is that `plan`, before applying the definition change, can *peek* at bounded data-plane consequence and report it — **"hard-dropping this property requires approval and will make prior versions unreachable after cleanup"** or, in the future, **"narrowing this enum will fail on 37 rows"** — which Terraform structurally cannot do. This is deliberate and bounded: a data peek makes that `plan` cost scale with data volume, so it is **opt-in / bounded** (sampled or skippable for large tables), and it never makes the control plane the owner of data. Schema and ETL pipelines are the **two** seams where the control plane reaches into the data plane; everywhere else `plan` is data-volume-independent.
+
+2. **JSON state first, explicit partials, optional stronger fencing later.** Terraform apply is not transactional — partial applies are a real failure mode. Lance commits are per dataset, and today's OmniGraph manifest atomicity is graph-scoped: one graph commit flips the relevant sub-table versions together, protected by expected table versions and recovery sidecars. The first cluster-control backend should match Terraform's shape: a locked JSON state document plus append-only JSON status/approval/recovery records. That keeps Phase 1 inspectable and narrow. Cluster-level all-or-nothing apply is a later capability only if we add a **cluster manifest publisher** or Lance-backed state backend that fences graph *version pins*, query catalogs, policy bundles, UI specs, pipeline definitions, recovery sidecars, and state as one commit protocol. Until that exists, apply must surface partial convergence as `ResourceStatus`, not pretend it was atomic.
+
+3. **Agent-as-controller fuses Terraform with Kubernetes.** Terraform contributes the as-code config (truth outside the system, recursion-terminating) and the locked state ledger. Kubernetes contributes *continuous* reconciliation (controllers watch, not apply-on-demand). The agent is both author and controller: it reads a config change, runs the data-aware plan, evaluates blast radius against the reversibility gradient, **auto-applies the reversible parts only when policy permits, and escalates irreversible / data-loss gates to a human approval artifact recorded in the audit ledger and referenced by state.**
+
+> Terraform's as-code config + locked state × Kubernetes' continuous reconciliation × the agent as the controller that bridges them — on OmniGraph's data-aware, graph-atomic substrate.
+
+---
+
+## Concrete shape (illustrative)
+
+The config is **a set of files in one folder** (flat, Terraform-style — the extension carries the type):
+
+```text
+ company-brain/
+ ├── cluster.yaml              # the spec (graphs, policies, ui, bindings, aliases, pipelines, state, env_file ref)
+ ├── .env          # SECRET VALUES — gitignored, never committed
+ ├── knowledge.pg · engineering.pg                                  # schemas (one per graph)        (.pg)
+ ├── knowledge.gq · engineering.gq                                  # query files — each holds MANY queries  (.gq)
+ ├── cluster_admin.policy.yaml · base_rbac.policy.yaml · knowledge_pii.policy.yaml   # shared policy bundles
+ ├── overview.dashboard.yaml   # cross-graph UI spec                                     (.dashboard.yaml)
+ └── notion_to_knowledge.map.yaml · github_to_engineering.map.yaml · github_to_people.map.yaml  # pipeline maps
+```
+
+Secrets live in a gitignored `.env` file (OmniGraph's standard env-file convention); the config references them as `${NAME}`:
+
+```bash
+# .env  —  secret values; gitignored; never committed. Referenced in cluster.yaml as ${NAME}.
+NOTION_TOKEN=…
+GITHUB_TOKEN=…
+EMBEDDING_API_KEY=…
+```
+
+Resource relationships (so the wiring is unambiguous):
+
+```text
+   cluster ──has many──► graph ──has one──► schema
+                           └────has──► query file(s) (.gq) ──each declares MANY──► query <name> { … } symbols
+   registry entry  key = the query <name> symbol  ──points to──► its .gq file   (queries: { <name>: { file } })
+                   (registry says a query EXISTS; it carries NO expose flag)
+   policy bundle ──applies to──► { cluster | one or MANY graphs }   (SHARED, many-to-many)
+                 └──governs query EXPOSURE──► who may LIST / INVOKE each stored query  (no `expose:` in the registry)
+   alias           (command, query = .gq FILE, name = symbol, args, format)  ──selects one query from that file
+   binding         names a query by registry name (graph.queryName)  ──► resolved to (file, symbol)
+   dashboard ──reads from──► one or MANY graphs
+   pipeline  ──writes into──► one or MANY graphs
+   secrets   ──live in──► a separate gitignored `.env` file; config uses ${NAME}
+```
+
+```yaml
+# cluster.yaml — desired state of the whole deployment (config = source of truth for INTENT)
+version: 1
+metadata:
+  name: company-brain
+
+state:                                   # the authoritative ledger's backend (Terraform-style)
+  backend: cluster                       #   "cluster" = this deployment's own store; or s3://… (a separate store)
+  lock: true                             # acquire a state lock before plan/apply
+
+env_file: ./.env                         # secret VALUES live in a gitignored .env file; referenced below as ${NAME}
+
+apply:
+  default_grain: resource                # references may force groups; explicit groups request more atomicity
+
+graphs:                                  # the cluster's graphs — each is ONE schema + a set of named queries
+  knowledge:                             # people · teams · docs · decisions · projects
+    schema: ./knowledge.pg               # desired schema; reconciler runs (and plan previews) the migration
+    queries:                             # the graph's stored (named) queries; KEY must match a `query <name>` in the file
+      find_experts: { file: ./knowledge.gq }   # ─┐ `query find_experts` and `query related_docs`
+      related_docs: { file: ./knowledge.gq }    # ─┘ both live in knowledge.gq.  Who may LIST/INVOKE → policy (not here)
+  engineering:                           # repos · services · incidents · PRs
+    schema: ./engineering.pg
+    queries:
+      service_owners: { file: ./engineering.gq }
+      open_incidents: { file: ./engineering.gq }
+
+policies:                                # policy BUNDLES (YAML) — SHARED across graphs (many-to-many).
+                                         # Policy ALSO governs query EXPOSURE: who may list/invoke each stored query.
+                                         # Fix (2026-06-08): unified the binding field on `applies_to:` (was a
+                                         # `scope:` + `graphs:` split) — one field, takes `cluster` or graph refs;
+                                         # bare graph names are shorthand for `graph.<id>` (see impl-spec typed addresses).
+  cluster_admin:                         # cluster-scoped: graph_list, create/delete, management
+    file: ./cluster_admin.policy.yaml
+    applies_to: [cluster]
+  base_rbac:                             # read/write + which roles may invoke which queries, across both graphs
+    file: ./base_rbac.policy.yaml
+    applies_to: [knowledge, engineering]
+  knowledge_pii:                         # an extra bundle, only for knowledge
+    file: ./knowledge_pii.policy.yaml
+    applies_to: [knowledge]
+
+pipelines:                               # ETL — ONE pipeline may write into SEVERAL graphs (definition only)
+  saas_sync:                             # the "company brain" move: assemble graphs from the SaaS tools
+    source: { kind: notion, token: "${NOTION_TOKEN}" }  # secret via ${NAME}, never inline (quoted: braces are YAML flow indicators)
+    schedule: "0 * * * *"                # hourly; execution is a data-plane effect, not reconciled state
+    into:                                # one or more targets (github_sync below fans out)
+      - { graph: knowledge, map: ./notion_to_knowledge.map.yaml }
+  github_sync:
+    source: { kind: github, token: "${GITHUB_TOKEN}" }
+    schedule: "*/15 * * * *"
+    into:
+      - { graph: engineering, map: ./github_to_engineering.map.yaml }
+      - { graph: knowledge,   map: ./github_to_people.map.yaml }   # same feed enriches a SECOND graph
+
+embeddings:                              # semantic search over docs/decisions; key via the `.env` file
+  model: gemini-embedding-2
+  dimension: 3072
+  api_key: ${EMBEDDING_API_KEY}
+
+ui:                                      # dashboards read from SEVERAL graphs
+  dashboards:
+    overview:
+      file: ./overview.dashboard.yaml
+      graphs: [knowledge, engineering]   # cross-graph
+
+aliases:                                 # CLI shortcuts.  ⚠ an alias's `query:` is the .gq FILE PATH;
+                                         #    `name:` selects the query SYMBOL inside it (a file declares many).
+  experts:   { command: query, graph: knowledge,   query: ./knowledge.gq,   name: find_experts,    args: [topic], format: table }
+  incidents: { command: query, graph: engineering, query: ./engineering.gq, name: open_incidents,                 format: table }
+
+bindings:                                # wiring between resources
+  - query: knowledge.find_experts
+    surface: ui.dashboards.overview
+```
+
+<!-- Audit fix: the sample shows the target policy-owned exposure model. The
+current server still uses mcp.expose for catalog membership until per-query
+policy filtering lands. -->
+What this is *not*: it is **not** a graph, and it carries **no credentials** — only secret *references* (`${…}`). It is parsed by the engine (the base case), describes the desired cluster, and is the thing two operators diff and review.
+
+The **state ledger** lives in the configured backend (the cluster, or a separate cloud store), versioned, CAS-updated, schema-versioned, locked during apply, agent-managed — the authoritative record of what is deployed. The baseline backend is JSON, so even cluster-hosted state is published through a state CAS and repaired explicitly if graph/resource movement happened first. A future cluster publisher can tighten that boundary, but it is not assumed by the high-level spec.
+
+---
+
+## Boundaries that hold (orthogonal correctness, not Terraform-bias)
+
+1. **Secrets live in a `.env` file, never inline in config.** The committed config is what the cluster *is* (shared, reviewable, as code) and carries **no secret values** — only `${NAME}` references. The values (embedding API key, pipeline source credentials, per-deployment settings) live in a separate **`.env` file** — which is **gitignored and never committed**, and supplied per deployment. Separately, an *operator's own token* (how they personally reach the cluster) belongs to the per-operator connection layer, not the cluster config or its `.env` file.
+
+2. **The reversibility gradient gates apply — including drift correction.** Dropping a graph, hard-dropping schema data, or an overwriting pipeline is irreversible data loss; a future validated enum narrowing is a compatibility-narrowing migration unless it also drops or coerces stored values; recoloring a dashboard is not. Unified config, unified plan — but **tiered gates inside apply**, keyed to physics, not to who operates it. The gate applies to **drift correction too**: converging actual→config can mean *dropping* something added out-of-band — a data-loss path that hits the same gate. A reconciler "just fixing drift" is never an exception.
+
+3. **Agents are actors, not ambient authority.** The reconciler runs with a resolved actor or service account, subject to Cedar policy. If it applies on behalf of a human, the durable audit ledger carries both the controller actor and the approving human / approval artifact, and state references that ledger entry. Client-supplied actor identity is never trusted.
+
+4. **Status is explicit when apply is not atomic.** A unified plan does not imply a unified commit. If an apply group partially converges, the cluster must expose `ResourceStatus` and typed conditions until reconciliation finishes or rolls back. Silent partial success is forbidden.
+
+5. **State integrity is protected.** State is locked during apply and stored durably in its backend. The baseline state backend is JSON plus lock/CAS, so state update failures surface a repair/import condition before success is acknowledged. A lost ledger is recoverable (import/refresh from the self-describing cluster), but state is never treated as disposable.
+
+---
+
+## Relationship to current config
+
+This is not greenfield, but it is also not today's `omnigraph.yaml`. The current file is a shared convenience for CLI and server startup: named graph targets, server defaults, query roots, aliases, embeddings model, auth env-file lookup, and `policy.file`. It is **not** the cluster's source of truth, it has no separate state ledger, and parts of it are intentionally per-operator.
+
+This proposal:
+
+- **splits** per-operator connection/credential/preference config from shared cluster config,
+- **adds** `cluster.yaml` + a flat config folder as the full declarative cluster config (graphs, schemas, query catalog, policy bundles, UI specs, bindings, **aliases**, **embeddings**, **ETL pipelines**),
+- **adds** the **JSON state ledger** (authoritative, locked, in a backend) and the `cluster plan`/`apply` loop,
+- **adds** the reconciler (with OmniGraph as its own data-aware provider), while treating a cluster manifest publisher as a later option rather than the baseline,
+- **lets an agent drive** plan/apply/continuous-reconcile.
+
+The connection/credential/preference layer remains per operator: it points at a cluster, resolves that operator's identity, and holds personal ergonomics. The cluster config stays shared, secret-free, and reviewable; the state ledger stays authoritative and locked.
+
+Implementation gate: the Terraform-style workflow must be testable in order.
+`cluster validate` must catch bad config before any apply path exists;
+read-only `cluster plan` must have deterministic structured-plan tests before
+state mutation ships; and graph/schema-moving apply must have recovery tests for
+the gap between graph/resource movement and JSON state publish. Otherwise the
+control plane can look declarative while still hiding drift or partial success.
+
+---
+
+## Open questions
+
+1. **Cluster state layout.** What exact JSON documents / object-store paths hold `AppliedRevision`, `ResourceStatus`, approval records, recovery records, sidecars, and resource content for query/policy/UI/pipeline specs? What evidence would justify a future Lance-backed state backend?
+2. **State backend options.** Beyond "cluster" and "a separate bucket," what backends are first-class (a different account, a remote control service)? How is the backend itself bootstrapped and its lock implemented (object-store CAS vs an external lock service)?
+3. **State import / refresh.** The exact actual-state scan that reconstructs a conservative `AppliedRevision` when the ledger is lost, and which fields become `Unknown`.
+4. **Apply grain syntax.** Apply defaults to per-resource `ApplyGroup`; cross-resource references force planner-derived groups; user-declared groups opt into more atomicity. What's the YAML, and which combinations can the publisher actually fence?
+5. **Pipeline runtime.** Where do pipelines *execute* (in the server? a worker? an external scheduler?), how are runs observed in `ResourceStatus`, and how does a failed/partial run reconcile vs. retry?
+6. **Continuous reconciliation trigger.** Watch-and-converge (k8s-style) vs. apply-on-config-change. The agent-as-controller model leans toward continuous.
+7. **Tenant partitioning (cloud).** A cluster may host multiple tenants; config/state is then tenant-partitioned, consistent with the reserved `GraphKey { tenant_id, graph_id }`. Tenant resolved from the token, never the config.
+8. **Bootstrap — config, state, *and* authority.** How a cluster comes into existence from an initial config (`init` seeds; cluster owns; git mirrors for CI/DR), the first state write, and the chicken-and-egg of the very first apply (which needs an actor before any cluster exists to resolve policy against — so the bootstrap actor is necessarily out-of-band and privileged). Security-sensitive; needs an explicit story.
+9. **Alias scoping.** Where exactly the shared/personal alias line falls, and whether shared aliases are just stored-query catalog entries.
+10. **UI render and safety model.** Generic engine-side renderer vs. thin client, allowed components, query-binding validation, policy propagation, sandboxing, version compatibility.
+11. **Cluster identity vs. `metadata.name`.** Is `metadata.name` a label or stable identity? If identity, renaming loses it — the stable-ID-across-rename gap already in `invariants.md`. Decide whether identity keys on `name` or on `ClusterRoot`, and reuse the existing known-gap framing.
+12. **Resource dependency ordering.** Explicit dependency DAG (Terraform) vs. eventual convergence with retries (k8s). The most consequential unmade fork: it decides whether `plan` can promise an apply *order* before any data moves.
+13. **Query exposure in policy (supersedes `mcp.expose`).** *Today* the stored-query registry carries a per-query `mcp.expose` flag and invocation is gated with the coarse `invoke_query` Cedar action — with **per-query authorization a documented gap** (the catalog isn't Cedar-filtered per query yet). This design **folds exposure fully into policy and drops the flag**: a stored query's visibility (catalog membership) and invocability are both policy decisions, so the catalog `GET /queries` returns each actor's policy-permitted set. The open work is the exact policy predicates for *list* vs *invoke* per query, and retiring `mcp.expose`.
+
+---
+
+## Prior art
+
+- **Terraform** — declarative infra *as code*; config is desired truth, **state is an authoritative ledger in a backend**, **state locking** serializes applies, `plan` diffs config↔state, providers do the CRUD. The core model adopted here, taken literally.
+- **Kubernetes** — one cluster store, many resource types under one API; controllers reconcile continuously; cluster-level RBAC. The continuous-reconciliation half of the synthesis.
+- **dbt / Airflow / Dagster** — declarative, as-code data pipelines with lineage. Prior art for the **ETL-pipeline-as-config** asset (the second data-plane seam).
+- **OmniGraph's own schema-apply** — already a faithful plan/apply/state/drift loop for the `schema` resource type, with `__schema_apply_lock__` as the lock seed; the reconciler this generalizes.
diff --git a/docs/dev/index.md b/docs/dev/index.md
index 1e41342..49b6d76 100644
--- a/docs/dev/index.md
+++ b/docs/dev/index.md
@@ -73,6 +73,7 @@ Working documents for in-flight feature work. Removed when the work lands.
 | Inline + stored queries, request/response envelope, MCP (MR-656 / MR-976 / MR-969) | [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) |
 | Config & CLI architecture — layered config, client targeting, file naming (MR-973 / MR-974 / MR-981) | [rfc-002-config-cli-architecture.md](rfc-002-config-cli-architecture.md) |
 | MCP server surface — full tool parity, stored queries, modular auth (MR-969 / MR-956 / MR-974) | [rfc-003-mcp-server-surface.md](rfc-003-mcp-server-surface.md) |
+| Future cluster control plane — declarative as-code config, JSON state ledger, reconciler | [cluster-config-specs.md](cluster-config-specs.md), [cluster-axioms.md](cluster-axioms.md), [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md) |
 
 ## Boundary