* docs(cluster): RFC-004 — graph & schema apply design (Phase 4)
Adds the design that the implementation spec's exit criteria require before
graph-moving cluster apply ships. Core positions:
- Cluster recovery is roll-forward-only: the engine's own sidecars make every
graph-level operation atomic within the graph, so the cluster never rolls a
graph back. The cluster's sidecars (__cluster/recoveries/{ulid}.json) only
classify and record, either converging the ledger to observable reality
(axiom 5) or surfacing a loud pending-repair condition. The decision matrix
has eight rows, each testable with the Stage 3B failpoint harness.
- Irreversible operations (graph delete, allow_data_loss schema apply)
consume digest-bound approval artifacts written by a new cluster approve
command and retired into state.approval_records (axiom 11). A stale
approval can never authorize a different change.
- cluster apply gains an actor, threaded to apply_schema_as so engine Cedar
enforcement and commit attribution work unchanged; the cluster adds no
policy engine of its own.
- Deterministic ordering (creates -> schema applies -> catalog -> deletes),
per-resource apply groups, cross-graph atomicity explicitly not promised.
- Staged 4A graph create / 4B schema apply / 4C graph delete, each gated on
per-matrix-row failpoint tests.
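
The deterministic ordering position above can be illustrated with a minimal sketch. It assumes a plan is a flat list of operations; the kind labels (other than graph_delete), the `PlannedOp` type, and the reading of "per-resource apply groups" as one group per target resource are hypothetical and not taken from the RFC.

```python
from dataclasses import dataclass

# Phase order from the commit text: creates -> schema applies -> catalog -> deletes.
# The kind names are illustrative labels, not identifiers from the RFC.
PHASE_ORDER = {"graph_create": 0, "schema_apply": 1, "catalog": 2, "graph_delete": 3}


@dataclass(frozen=True)
class PlannedOp:
    kind: str      # one of the PHASE_ORDER keys
    resource: str  # name of the graph or resource the operation targets


def order_ops(ops):
    """Sort by phase, then by resource name, so equal inputs always yield the same plan."""
    return sorted(ops, key=lambda op: (PHASE_ORDER[op.kind], op.resource))


def apply_groups(ops):
    """Group the ordered plan per resource (assumed meaning of 'apply group').

    Each group is applied on its own; nothing here makes two groups atomic
    with respect to each other, matching the 'not promised' position.
    """
    groups = {}
    for op in order_ops(ops):
        groups.setdefault(op.resource, []).append(op)
    return groups
```

Sorting on a (phase, resource) key keeps the plan independent of the order in which the configuration happened to list its resources.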
Answers exit criteria 2 and 4 fully, 1/5/6 partially; 3/7/8/9 deferred to
their phases (coverage table in the RFC). Linked from the dev index and the
implementation spec's Phase 4 section.
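
As an illustration of the digest-bound approval position listed above, the sketch below binds an approval to a hash of the exact change it covers. The choice of SHA-256 over canonical JSON, the artifact fields, and the function names are assumptions for illustration; the RFC's actual artifact format is not described in this message.

```python
import hashlib
import json


def change_digest(change: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the proposed change (assumed scheme)."""
    canonical = json.dumps(change, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def approve(change: dict, approver: str) -> dict:
    """Build a hypothetical approval artifact bound to exactly this change."""
    return {"approver": approver, "digest": change_digest(change)}


def authorizes(approval: dict, change: dict) -> bool:
    """An approval is valid only for the change whose digest it recorded."""
    return approval["digest"] == change_digest(change)
```

Because the digest covers the whole change, an approval issued for deleting one graph fails the check for any other graph or any edited variant of the change, which is the property "a stale approval can never authorize a different change" asks for.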
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* docs(cluster): RFC-004 review fixes — graph_delete sweep rows, state_cas_base contract
Two greptile findings: (1) D3 row 2 could not be evaluated for graph_delete
(no manifest to version-check after prefix removal) and 'root absent, state
already tombstoned' fell into the stale row — split into rows 7 (delete's
analog of row 2) and 7b (the roll-forward), with expected_manifest_version
documented as always null for the delete kind. (2) state_cas_base is now
explicitly audit/diagnostics-only — the sweep never consults it; independent
state mutations are handled by the ordinary CAS like any concurrent write.
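
The "ordinary CAS" mentioned above can be sketched as a versioned compare-and-swap. The `Ledger` class below is a hypothetical in-memory stand-in, not the cluster's state ledger; it only shows why a writer that read an older version is rejected regardless of what any sidecar recorded.

```python
class StaleStateError(Exception):
    """Raised when the ledger changed since the caller read it."""


class Ledger:
    """Hypothetical in-memory stand-in for a versioned state ledger."""

    def __init__(self):
        self.version = 0
        self.state = {}

    def read(self):
        # Return the version alongside a copy so callers cannot mutate in place.
        return self.version, dict(self.state)

    def compare_and_swap(self, expected_version, new_state):
        # Reject the write if another writer committed after our read.
        if expected_version != self.version:
            raise StaleStateError(
                f"expected v{expected_version}, ledger is at v{self.version}"
            )
        self.state = dict(new_state)
        self.version += 1
        return self.version
```

A writer that loses the race rereads and retries against the current version, so the check depends only on the version read at write time.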
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Developer Docs
Audience: contributors, maintainers, and coding agents
This is the contributor-facing entry point. These docs explain architecture, invariants, implementation contracts, test ownership, and upstream Lance constraints. User-facing behavior should still be documented through docs/user/index.md and the relevant public reference docs.
Required For Every Non-Trivial Change
| Need | Read |
|---|---|
| Architectural rules, known gaps, deny-list | invariants.md |
| Upstream Lance source-of-truth index | lance.md |
| Existing test coverage and test placement | testing.md |
Architecture And Storage
| Area | Read |
|---|---|
| System structure, L1/L2 framing, component diagrams | architecture.md |
| On-disk layout, manifest schema, URI behavior | storage.md |
| Direct-publish writes, D2, staged writes, recovery sidecars | writes.md |
| Query execution, mutation execution, loader flow | execution.md |
| Index lifecycle and graph topology indexes | indexes.md |
| Branch and commit internals | branches-commits.md |
| Three-way merge implementation and conflicts | merge.md |
| Diff/change-feed implementation | changes.md |
| Branch protection policy | branch-protection.md |
| CODEOWNERS source of truth | codeowners.md |
Language, Runtime, And Boundaries
| Area | Read |
|---|---|
| Schema grammar, catalog, migration planner | schema-language.md |
| Query grammar, IR, lints, mutation restrictions | query-language.md |
| Embedding client and @embed integration | embeddings.md |
| Cedar policy surface and server gating | policy.md |
| Server auth, OpenAPI, endpoint handlers | server.md |
| Error taxonomy and serialization | errors.md |
| Constants and tunables | constants.md |
| Transaction model public contract | transactions.md |
Project Operations
| Area | Read |
|---|---|
| CI and release workflows | ci.md |
| Install and deployment packaging | install.md, deployment.md |
| Release history | releases/ |
Contribution & Governance
| Area | Read |
|---|---|
| How to contribute (external) | CONTRIBUTING.md |
| Governance model, roles, decision authority | GOVERNANCE.md |
| Public contribution RFC track | rfcs/ |
The docs/rfcs/ track is the public, externally-authorable RFC process. The
maintainer/internal RFCs below (rfc-00N-*.md) are a separate, team-owned
track; don't conflate the two.
Active Implementation Plans
Working documents for in-flight feature work. Removed when the work lands.
| Area | Read |
|---|---|
| Schema-lint chassis v1 (MR-694) — --allow-data-loss, soft/hard drops | schema-lint-v1-plan.md |
| Inline + stored queries, request/response envelope, MCP (MR-656 / MR-976 / MR-969) | rfc-001-queries-envelope-mcp.md |
| Config & CLI architecture — layered config, client targeting, file naming (MR-973 / MR-974 / MR-981) | rfc-002-config-cli-architecture.md |
| MCP server surface — full tool parity, stored queries, modular auth (MR-969 / MR-956 / MR-974) | rfc-003-mcp-server-surface.md |
| Future cluster control plane — declarative as-code config, JSON state ledger, reconciler | cluster-config-specs.md, cluster-axioms.md, cluster-config-implementation-spec.md |
| Cluster graph & schema apply — Phase 4 sidecars, roll-forward recovery, approval artifacts | rfc-004-cluster-graph-schema-apply.md |
Boundary
Developer docs may mention implementation details, stale gaps, upstream Lance blockers, and review rules. User docs should not require that context unless the detail changes the public contract.