The substrate makes read/write scaling asymmetric: reads are object-store-backed,
snapshot-isolated, stateless -> horizontal to N replicas with zero coordination;
writes are serialized per (table,branch) by one-winner manifest CAS -> scale by
partitioning (branches/graphs/cells), with a single active coordinator per
(cell,graph,branch). Adds the CQRS deployment split (read fleet / write tier /
maintenance / heavy reads), read-your-write via commit_id/snapshot_id pinning,
and gates pooled WRITES (not reads) on closing the cross-process-CAS gap.
General server/topology/auth/deployment RFC resolving the half-built tenancy
ambiguity (cluster-only server vs pooled tenant_id scaffolding). Decision:
the cluster is the tenant is the cell — silo the data (own storage/catalog/
policy/tokens), pool the compute (one process : N cells). No row-level pooling
(no engine RLS).
- §5.1 CellRuntime lifts today's per-cluster runtime into a value.
- §5.2/§5.3 AppState holds a CellRegistry; resolve_cell is one new outer
middleware hop before auth; the per-graph + Cedar + MCP stack is unchanged.
- §5.4 per-cell CellAuth (Static | Oidc TokenVerifier); WorkOS org -> cell 1:1
with per-cell OAuth audience (cross-tenant token replay fails on aud).
- §5.5 Cedar stays per-graph/per-cell; default-deny-read becomes safe; no
tenant dimension needed.
- §5.6 control plane = Cell Registry (metadata only) + provisioning-as-code;
cell hot-load is the one safe runtime mutation (cell-granular, not graph).
- §5.7 tiered dedicated/pooled/on-prem on one binary; §7 backward-compatible
(today's single-cluster server = a one-cell map).
MCP (rfc-003) is one consumer, not the driver. Linked from docs/dev/index.md.