mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-12 01:45:14 +02:00
docs(cluster): RFC-005 — server boots from cluster state (Phase 5 design) (#174)
The axiom-15 mode switch: omnigraph-server --cluster <dir> (mutually exclusive with uri/--target/--config, zero omnigraph.yaml reads) serves the APPLIED revision — graph set from state, query/policy content from the content-addressed catalog at applied digests, cluster-scoped policy bundles as the server-level Cedar engine. The load-bearing finding: state is not yet serving-sufficient (policy applies_to bindings live only in cluster.yaml), so slice 5A records binding metadata into the applied revision at apply time — without it, boot-from-state silently becomes the merged read axiom 15 forbids. Fail-fast readiness table (missing state, pending sidecars, missing blobs, unbound policies all refuse boot with remedies), the expose-all mcp.expose bridge with its Phase 6 sunset, the operator migration path (exit criterion 7), and 5A/5B/5C sequencing. The existing boot pipeline (GraphStartupConfig -> registry -> routing/auth) is reused as-is — a new source, not a new pipeline. Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
parent
61da7bf406
commit
3e8f103804
3 changed files with 144 additions and 0 deletions
|
|
@ -601,6 +601,10 @@ actor threading, 4A/4B/4C staging).
|
|||
|
||||
### Phase 5: Server Reads Cluster Catalog
|
||||
|
||||
Detailed design: [rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md)
|
||||
(the --cluster mode switch, applied-revision serving, serving metadata in
|
||||
state, readiness table, migration path).
|
||||
|
||||
- Allow server startup from cluster state.
|
||||
- Add status and catalog endpoints as needed.
|
||||
- Keep the current `omnigraph.yaml` startup path as compatibility mode — an
|
||||
|
|
|
|||
|
|
@ -75,6 +75,7 @@ Working documents for in-flight feature work. Removed when the work lands.
|
|||
| MCP server surface — full tool parity, stored queries, modular auth (MR-969 / MR-956 / MR-974) | [rfc-003-mcp-server-surface.md](rfc-003-mcp-server-surface.md) |
|
||||
| Future cluster control plane — declarative as-code config, JSON state ledger, reconciler | [cluster-config-specs.md](cluster-config-specs.md), [cluster-axioms.md](cluster-axioms.md), [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md) |
|
||||
| Cluster graph & schema apply — Phase 4 sidecars, roll-forward recovery, approval artifacts | [rfc-004-cluster-graph-schema-apply.md](rfc-004-cluster-graph-schema-apply.md) |
|
||||
| Server boots from cluster state — Phase 5 mode switch, applied-revision serving | [rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md) |
|
||||
|
||||
## Boundary
|
||||
|
||||
|
|
|
|||
139
docs/dev/rfc-005-server-cluster-boot.md
Normal file
139
docs/dev/rfc-005-server-cluster-boot.md
Normal file
|
|
@ -0,0 +1,139 @@
|
|||
# RFC: Server Boots from Cluster State — Phase 5 of the Cluster Control Plane
|
||||
|
||||
**Status:** Proposed
|
||||
**Date:** 2026-06-10
|
||||
**Builds on:** Phase 4 complete ([rfc-004-cluster-graph-schema-apply.md](rfc-004-cluster-graph-schema-apply.md), Landed): `cluster apply` converges graphs, schemas, stored queries, and policies into the cluster catalog. Normative context: [cluster-config-specs.md](cluster-config-specs.md) (the migration model's "window 2"), [cluster-axioms.md](cluster-axioms.md) (axiom 15), [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md) (Phase 5 rollout, Compatibility Stance #7–#9, exit criterion 7).
|
||||
**Target release:** unversioned (phased — see Sequencing).
|
||||
|
||||
## Summary
|
||||
|
||||
Give `omnigraph-server` a second boot source: `omnigraph-server --cluster <dir>` reads its graph set, stored queries, and Cedar policies from the **cluster catalog** — `state.json`'s applied revision plus the content-addressed blobs under `__cluster/resources/` — instead of `omnigraph.yaml`. This is the moment "applied" finally means "serving": the standing caveat in every cluster doc since Stage 3A ("the server still boots from `omnigraph.yaml`") retires for deployments that flip the switch.
|
||||
|
||||
Three commitments:
|
||||
|
||||
1. **An exclusive mode switch, never a merge** (axiom 15, Compatibility Stance #7). `--cluster <dir>` is mutually exclusive with the positional URI, `--target`, and `--config`. In cluster mode, `omnigraph.yaml` is not read at all — not for graphs, not for queries, not for policies. There is no precedence, no key-level aliasing, no fallback read. A deployment serves from one source.
|
||||
2. **The server serves the *applied* revision, not the desired config.** What's live is what `cluster apply` converged: graph roots recorded in state, query/policy content at the *applied* digests from the content-addressed catalog. Un-applied config drift never leaks into serving — the serving surface and the ledger cannot disagree (axiom 5 extended to the data path).
|
||||
3. **The state ledger becomes serving-sufficient.** Today one fact needed to serve is missing from state: a policy's `applies_to` bindings live only in `cluster.yaml`. A prerequisite slice (5A) records binding metadata into the applied revision at apply time, so a booting server reads state + blobs and nothing else. Without this, "boot from state" would silently become "boot from state *and* config" — the merged read axiom 15 forbids.
|
||||
|
||||
## Motivation
|
||||
|
||||
Phase 4 closed the convergence loop but left it inert: an operator can declare, plan, approve, and apply an entire deployment, and the running server ignores all of it. The Sarah/Bob test still fails at the last step — Sarah's applied change is visible in `cluster status` but Bob's clients hit a server still wired to a hand-maintained `omnigraph.yaml`. Phase 5 makes the catalog the serving source, which is also the precondition for Phase 6 (policy-owned query exposure must filter a catalog the server actually reads).
|
||||
|
||||
## Non-Goals
|
||||
|
||||
- **Runtime reconciliation / hot reload.** Cluster-mode boot is static, exactly like today's boot: the server reads the applied revision once at startup; picking up a newer applied state means restarting the process. The registry's runtime-mutation seam (the test-only `insert()` + mutate `Mutex` in `registry.rs`) stays future-proofing for a later watch-and-reload slice, not this RFC.
|
||||
- **Policy-owned query exposure** (Phase 6) — but this RFC defines the bridge it sunsets (§D5).
|
||||
- **Remote cluster roots.** `--cluster <dir>` is a local directory in this phase, same as the `cluster` CLI commands; S3-hosted cluster roots arrive with external state backends.
|
||||
- **Retiring `omnigraph.yaml` server boot.** It remains a fully supported mode indefinitely (Compatibility Stance #8: the file's job shrinks; the *server-role* keys become inert only for deployments that switch).
|
||||
- **New management endpoints** (`/cluster/status` etc.) — noted as future work; this RFC changes the boot source, not the HTTP surface (beyond OpenAPI regen if anything shifts).
|
||||
|
||||
## Background (verified against main)
|
||||
|
||||
- **Server boot today** (`omnigraph-server/src/main.rs`, `lib.rs:891-1029`): `load_server_settings` applies a four-rule mode inference (positional URI / `--target` / `server.graph` → Single; `--config` + `graphs:` → Multi), builds `ServerConfigMode::{Single,Multi}` with per-graph `GraphStartupConfig {graph_id, uri, policy_file, queries}`, loads `QueryRegistry` from `.gq` files at settings time (identity-checked), type-checks queries at engine open (`validate_and_attach`), loads Cedar via `PolicyEngine::load_graph`/`load_server`, installs it with `with_policy`, and assembles `GraphRegistry::from_handles` (startup-only; lock-free `ArcSwap` reads). Bind address and bearer tokens come from flags/env, not from graph config. No reload machinery exists.
|
||||
- **The catalog today** (`omnigraph-cluster`): `state.json` records `applied_revision.resources` (address → digest) for `graph.*`, `schema.*`, `query.<graph>.<name>`, `policy.<name>`, plus statuses, observations (incl. tombstones), approval and recovery records. Query/policy *content* lives content-addressed at `__cluster/resources/query/<graph>/<name>/<digest>.gq` and `policy/<name>/<digest>.yaml`. Graph roots are derived: `<dir>/graphs/<id>.omni`.
|
||||
- **The gap**: state records a policy's *digest* only; `applies_to` (cluster vs graph refs) lives in `cluster.yaml`. Queries are fine — their graph binding is encoded in the address itself.
|
||||
|
||||
## Design
|
||||
|
||||
### D1. The mode switch
|
||||
|
||||
New server flag: `omnigraph-server --cluster <dir>` (the directory containing `cluster.yaml`, `__cluster/`, and `graphs/`). Mutually exclusive — a hard startup error, not a precedence rule — with the positional URI, `--target`, and `--config`. `--bind`, `--unauthenticated`, and the bearer-token env vars keep working identically: listen address and credentials are **process-operational facts**, not cluster facts (they differ per replica/host and never belonged to the shared catalog; if a `serve:` section ever joins `cluster.yaml`, that's a separate proposal).
|
||||
|
||||
Mode inference gains rule 0: `--cluster <dir>` → **Cluster mode**, which is always multi-graph routing (`/graphs/{graph_id}/...`), even for a single declared graph. No flat-route legacy surface in cluster mode — it's a new mode with no compatibility debt to carry.
|
||||
|
||||
### D2. What the server reads (the applied revision, and only it)
|
||||
|
||||
`load_server_settings` grows a cluster branch that reads, in order:
|
||||
|
||||
1. `__cluster/state.json` — **missing state is a boot error** ("run `cluster import` + `cluster apply` first"). Pending recovery sidecars under `__cluster/recoveries/` are also a boot error (`cluster_recovery_pending`): a server must not start serving a ledger that a sweep is about to rewrite.
|
||||
2. **Graph set** = state's `graph.<id>` resources (tombstoned graphs are absent by construction). Each graph's URI is the derived root `<dir>/graphs/<id>.omni`. A recorded graph whose root does not open is a boot error — same fail-fast posture as today's bad URI.
|
||||
3. **Stored queries** = state's `query.<graph>.<name>` entries, content loaded from the catalog blob at the recorded digest. Blob-missing or digest-mismatched is a boot error (the catalog verification semantics from Stage 3B, applied at boot). Queries type-check at engine open exactly as today (`validate_and_attach` — unchanged).
|
||||
4. **Policies** = state's `policy.<name>` entries, content from catalog blobs, bindings from the applied metadata of D3: bundles bound to `cluster` load as the server-level Cedar engine (`PolicyEngine::load_server`); bundles bound to graphs load per-graph (`PolicyEngine::load_graph`) and install via `with_policy` — the existing two-gate structure, unchanged.
|
||||
5. `cluster.yaml` is parsed **only** to validate that the directory is a cluster root (and for nothing else — explicitly not for resource content; a divergence between desired config and applied state is *served as applied*, visible via `cluster plan`).
|
||||
|
||||
Everything downstream of settings construction — `GraphStartupConfig`, parallel engine opens, `GraphRegistry::from_handles`, routing middleware, auth, workload admission, OpenAPI — is reused as-is. Cluster mode is a new *source* for the same boot pipeline, not a new pipeline.
|
||||
|
||||
### D3. Prerequisite: serving metadata in the applied revision (slice 5A)
|
||||
|
||||
State's `StateResource` records only a digest. To make the ledger serving-sufficient, `cluster apply` (and the sweep's roll-forwards) additionally record **binding metadata** for policy resources at apply time:
|
||||
|
||||
```json
|
||||
"applied_revision": {
|
||||
"resources": {
|
||||
"policy.base_rbac": {
|
||||
"digest": "<sha256>",
|
||||
"applies_to": ["cluster", "graph.knowledge"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
- Additive and optional (`#[serde(default)]`) — existing state files parse unchanged; a policy entry without `applies_to` (applied before 5A) is a **boot error in cluster mode** with the remedy "re-run `cluster apply`" (one apply rewrites the metadata; the digest needn't change — the metadata write is part of the state mutation, not the blob).
|
||||
- `applies_to` is normalized to typed addresses (`cluster` | `graph.<id>`) at apply time, mirroring the validator's normalization.
|
||||
- Queries need no equivalent: the address (`query.<graph>.<name>`) already carries the binding, and the registry key/symbol invariant is enforced at apply (validate) time.
|
||||
- This is deliberately *applied* metadata, not config mirroring: if `cluster.yaml` changes a binding, the server keeps serving the old binding until `cluster apply` converges it — the same contract as every other resource.
|
||||
|
||||
### D4. Readiness and failure posture
|
||||
|
||||
Boot is fail-fast, matching the server's existing stance (bad policy YAML refuses boot):
|
||||
|
||||
| Condition | Behavior |
|
||||
|---|---|
|
||||
| `state.json` missing / unparseable / unsupported version | boot error |
|
||||
| pending recovery sidecars | boot error (run any state-mutating cluster command to sweep) |
|
||||
| recorded graph root missing or unopenable | boot error |
|
||||
| query/policy blob missing or digest-mismatched | boot error (run `cluster refresh` + `apply` to self-heal, then restart) |
|
||||
| policy entry without `applies_to` metadata | boot error ("re-run cluster apply", D3) |
|
||||
| stored query fails type-check against the live schema | boot error (existing `validate_and_attach` behavior) |
|
||||
| state lock held | **not** an error — boot takes no lock; it reads a point-in-time snapshot of an immutable-once-written state file (the CAS discipline means a concurrent apply produces a *new* file atomically; the server reads whichever was current at open) |
|
||||
|
||||
### D5. The `mcp.expose` bridge in cluster mode
|
||||
|
||||
The cluster query registry has no `expose` flag by design (axiom 14: exposure is a policy decision — Phase 6). Until Phase 6 ships, cluster-mode servers list **all** stored queries in `GET /queries`. This is the documented bridge: *cluster mode = everything exposed; omnigraph.yaml mode = `mcp.expose` honored as today*. Its named sunset is Phase 6's policy-filtered catalog (Compatibility Stance #9). Invocation remains gated by the existing coarse `invoke_query` Cedar action in both modes.
|
||||
|
||||
### D6. Migration path (exit criterion 7)
|
||||
|
||||
For an operator running multi-graph from `omnigraph.yaml`:
|
||||
|
||||
1. Author `cluster.yaml` declaring the same graphs/queries/policies; place existing graph roots under `<dir>/graphs/<id>.omni` (or start fresh).
|
||||
2. `cluster import` (observes live graphs) → `cluster plan` → `cluster apply` (publishes queries/policies into the catalog; with 5A, records policy bindings).
|
||||
3. Restart the server with `--cluster <dir>` instead of `--config omnigraph.yaml`.
|
||||
4. `omnigraph.yaml`'s `graphs:`/`serve:`/`queries:`/`policy:` keys are now inert for this deployment; the file remains the CLI's per-operator config.
|
||||
|
||||
Rollback is the same switch in reverse — nothing in cluster mode mutates `omnigraph.yaml` or the graphs in a way the yaml mode can't serve.
|
||||
|
||||
### D7. Invariants and axioms check
|
||||
|
||||
- *Axiom 15 / Stance #7*: exclusive flag, hard mutual-exclusion error, zero `omnigraph.yaml` reads in cluster mode — no fact has two readers.
|
||||
- *Axiom 5*: the server serves deployed reality (applied digests), never desired intent; D3 keeps the ledger the single serving source.
|
||||
- *Axiom 12*: boot reads without the lock but relies on the atomic-replace write discipline; it never writes state.
|
||||
- *Axiom 14 / Stance #9*: the expose-all bridge is named, scoped to cluster mode, and carries its Phase 6 sunset.
|
||||
- *Loud failures (deny-list)*: every degraded condition is a typed boot error with a remedy; no partial serving, no silent fallback to the yaml.
|
||||
- *Respect the boundaries*: `omnigraph-cluster` stays free of HTTP; the server reads the catalog through a small read-only loader (either a `pub` read surface on `omnigraph-cluster` or a thin module in the server consuming the documented file formats — implementation picks the one that keeps `omnigraph-cluster` dependency-light; the state/blob formats are already a documented contract).
|
||||
|
||||
## Sequencing
|
||||
|
||||
| Slice | Scope | Gate |
|
||||
|---|---|---|
|
||||
| **5A: serving metadata in state** | `applies_to` recorded on policy resources at apply + sweep roll-forward; additive state schema; `status`/plan surfacing | In-crate tests: metadata written/rolled-forward; old state parses; re-apply backfills |
|
||||
| **5B: `--cluster` boot mode** | Flag + mode inference rule 0; catalog loader (state → `GraphStartupConfig`s + registries + policy engines); readiness table; OpenAPI regen if surface shifts | Server tests: boot from a converged fixture dir, serve `/graphs/{id}/query` + stored queries + Cedar gates; every D4 row refuses boot; e2e: `cluster apply` then serve — "applied means serving" |
|
||||
| **5C: docs + caveat retirement** | `cluster-config.md` mode-switch section; `server.md`/`deployment.md`; retire the "not serving" caveats for cluster-mode deployments; migration guide (D6) | `check-agents-md.sh`; doc accuracy review |
|
||||
|
||||
## Exit-criteria coverage
|
||||
|
||||
Answers implementation-spec exit criterion 7 (server startup + migration path) in full; touches 1 (state schema gains policy binding metadata — additive). Criteria 8 (per-query policy) and 9 (pipelines — descoped to a separate project) remain.
|
||||
|
||||
## Open Questions
|
||||
|
||||
1. **Loader home**: `pub` read-only API on `omnigraph-cluster` (server gains the dependency) vs a server-side reader of the documented formats. Leaning `omnigraph-cluster` API — one parser for the state schema beats two drifting ones; the crate stays HTTP-free either way.
|
||||
2. **Boot-time blob re-hash**: D4 requires digest verification at boot; for large catalogs a stat-only fast path with full hashes behind a flag may matter later. Start with full verification (catalogs are small).
|
||||
3. **`GET /graphs` enrichment**: cluster mode could expose applied digests/revision in the enumeration — deferred until a consumer exists.
|
||||
4. **Watch-and-reload**: the natural follow-up once cluster mode exists; the registry's mutation seam is ready, but reload semantics (drain? cutover?) deserve their own design.
|
||||
|
||||
## References
|
||||
|
||||
- [rfc-004-cluster-graph-schema-apply.md](rfc-004-cluster-graph-schema-apply.md) — the convergence machinery this serves
|
||||
- [cluster-config-specs.md](cluster-config-specs.md) §Migration model — window 2 is this RFC
|
||||
- [cluster-axioms.md](cluster-axioms.md) — axioms 5, 12, 14, 15
|
||||
- [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md) — Phase 5 rollout, Compatibility Stance #7–#9, blast-radius rows for the server registry
|
||||
- `crates/omnigraph-server/src/lib.rs` (`load_server_settings`, `ServerConfigMode`, `GraphRegistry`) — the boot pipeline this extends without forking
|
||||
Loading…
Add table
Add a link
Reference in a new issue