omnigraph/docs/dev
Andrew Altshuler 58c66a54a2
docs(cluster): RFC-004 — graph & schema apply design (Phase 4) (#168)
* docs(cluster): RFC-004 — graph & schema apply design (Phase 4)

The design the implementation spec's exit criteria require before
graph-moving cluster apply ships. Core positions:

- Cluster recovery is roll-forward-only: the engine's own sidecars make every
  graph-level operation atomic within the graph, so the cluster never rolls a
  graph back. The cluster's sidecars (__cluster/recoveries/{ulid}.json) only
  classify and record, converging the ledger to observable reality (axiom 5)
  or surfacing a loud pending-repair condition. The decision matrix has eight
  rows, each testable with the Stage 3B failpoint harness.
- Irreversible operations (graph delete, allow_data_loss schema apply)
  consume digest-bound approval artifacts written by a new cluster approve
  command and retired into state.approval_records (axiom 11). A stale
  approval can never authorize a different change.
- cluster apply gains an actor, threaded to apply_schema_as so engine Cedar
  enforcement and commit attribution work unchanged; the cluster adds no
  policy engine of its own.
- Deterministic ordering (creates -> schema applies -> catalog -> deletes),
  per-resource apply groups, cross-graph atomicity explicitly not promised.
- Delivery is staged as 4A graph create, 4B schema apply, and 4C graph delete,
  each gated on per-matrix-row failpoint tests.
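
The deterministic ordering in the list above (creates -> schema applies -> catalog -> deletes) can be sketched as a stable sort over planned operations. This is a minimal illustration only: `OpKind`, `PlannedOp`, `order_plan`, and the tie-break by resource name are hypothetical and are not taken from the RFC or the codebase.

```rust
// Hypothetical operation kinds. The declaration order encodes the apply
// order described above, because the derived `Ord` compares variants by
// their position in the enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum OpKind {
    GraphCreate, // 1. creates
    SchemaApply, // 2. schema applies
    Catalog,     // 3. catalog updates
    GraphDelete, // 4. deletes
}

// Hypothetical planned operation: what to do and which resource it targets.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PlannedOp {
    kind: OpKind,
    resource: String,
}

// Small constructor to keep call sites short.
fn op(kind: OpKind, resource: &str) -> PlannedOp {
    PlannedOp { kind, resource: resource.to_string() }
}

// Order a plan by kind first, then by resource name. The name tie-break is
// an assumption made here so that the same input set always yields the same
// sequence regardless of the order in which operations were discovered.
fn order_plan(mut ops: Vec<PlannedOp>) -> Vec<PlannedOp> {
    ops.sort_by(|a, b| {
        a.kind
            .cmp(&b.kind)
            .then_with(|| a.resource.cmp(&b.resource))
    });
    ops
}

fn main() {
    let plan = order_plan(vec![
        op(OpKind::GraphDelete, "old-graph"),
        op(OpKind::Catalog, "catalog"),
        op(OpKind::SchemaApply, "graph-b"),
        op(OpKind::GraphCreate, "new-graph"),
        op(OpKind::SchemaApply, "graph-a"),
    ]);
    for planned in &plan {
        println!("{:?} {}", planned.kind, planned.resource);
    }
}
```

Sorting by a total order over (kind, resource) keeps the plan reproducible across runs; it says nothing about cross-graph atomicity, which the commit message explicitly does not promise.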

Answers exit criteria 2 and 4 fully, 1/5/6 partially; 3/7/8/9 deferred to
their phases (coverage table in the RFC). Linked from the dev index and the
implementation spec's Phase 4 section.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(cluster): RFC-004 review fixes — graph_delete sweep rows, state_cas_base contract

Two greptile findings: (1) D3 row 2 could not be evaluated for graph_delete
(no manifest to version-check after prefix removal) and 'root absent, state
already tombstoned' fell into the stale row — split into rows 7 (delete's
analog of row 2) and 7b (the roll-forward), with expected_manifest_version
documented as always null for the delete kind. (2) state_cas_base is now
explicitly audit/diagnostics-only — the sweep never consults it; independent
state mutations are handled by the ordinary CAS like any concurrent write.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 04:34:14 +03:00
| File | Last commit | Date |
| --- | --- | --- |
| architecture.md | (feat) convert engine call sites to &dyn TableStorage; demote legacy TableStore methods to pub(crate) (#86) | 2026-06-09 23:03:08 +02:00 |
| branch-protection.md | ci(codeowners): un-trap required checks, auto-render, generate owner tables (#142) | 2026-06-06 18:09:47 +03:00 |
| ci.md | Add Windows release binaries (#127) | 2026-05-30 14:23:40 +02:00 |
| cluster-axioms.md | docs(cluster): axiom 15 — single ownership, mode-switch migration, per-operator layer (#164) | 2026-06-10 00:44:51 +03:00 |
| cluster-config-implementation-spec.md | docs(cluster): RFC-004 — graph & schema apply design (Phase 4) (#168) | 2026-06-10 04:34:14 +03:00 |
| cluster-config-specs.md | docs(cluster): axiom 15 — single ownership, mode-switch migration, per-operator layer (#164) | 2026-06-10 00:44:51 +03:00 |
| codeowners.md | ci(codeowners): add aaltshuler to engineering role (#147) | 2026-06-07 18:05:01 +03:00 |
| execution.md | (feat) convert engine call sites to &dyn TableStorage; demote legacy TableStore methods to pub(crate) (#86) | 2026-06-09 23:03:08 +02:00 |
| index.md | docs(cluster): RFC-004 — graph & schema apply design (Phase 4) (#168) | 2026-06-10 04:34:14 +03:00 |
| invariants.md | (feat) convert engine call sites to &dyn TableStorage; demote legacy TableStore methods to pub(crate) (#86) | 2026-06-09 23:03:08 +02:00 |
| lance.md | (feat) convert engine call sites to &dyn TableStorage; demote legacy TableStore methods to pub(crate) (#86) | 2026-06-09 23:03:08 +02:00 |
| merge.md | docs: split user and developer docs (#93) | 2026-05-15 03:45:22 +03:00 |
| rfc-001-queries-envelope-mcp.md | feat: inline query strings in CLI and HTTP server (#110) | 2026-05-29 13:41:54 +02:00 |
| rfc-002-config-cli-architecture.md | Stored-query registry foundation + config/CLI RFC-002 (#128) | 2026-06-01 22:50:31 +02:00 |
| rfc-003-mcp-server-surface.md | Stored-query registry foundation + config/CLI RFC-002 (#128) | 2026-06-01 22:50:31 +02:00 |
| rfc-004-cluster-graph-schema-apply.md | docs(cluster): RFC-004 — graph & schema apply design (Phase 4) (#168) | 2026-06-10 04:34:14 +03:00 |
| schema-lint-v1-plan.md | schema-lint chassis v1.0: DropProperty Soft + code-tagged diagnostics (MR-694) (#90) | 2026-05-16 16:30:03 +03:00 |
| testing.md | docs(cluster): record Stage 3B failpoint + verification coverage | 2026-06-10 02:15:13 +03:00 |
| writes.md | (feat) convert engine call sites to &dyn TableStorage; demote legacy TableStore methods to pub(crate) (#86) | 2026-06-09 23:03:08 +02:00 |