MR-794 step 2: docs — runs/invariants/architecture/execution + cleanup

Refresh user-facing and agent-facing docs for the staged-write rewire
and clean up stale Run-state-machine references that survived MR-771.

MR-794-specific updates:
* docs/runs.md — remove "Known limitation: mid-query partial failure"
  section; document the in-memory accumulator + D₂ rule + the
  LoadMode::Overwrite residual.
* docs/invariants.md §VI.25 — flip from aspirational/open to
  upheld for inserts/updates. Within-query read-your-writes is now
  load-bearing for the publisher CAS contract.
* docs/architecture.md — add "Mutation atomicity — in-memory
  accumulator (MR-794)" subsection with per-op flow; refresh the
  engine + state diagrams to drop RunRegistry and add MutationStaging.
* docs/execution.md — rewrite the mutation flow sequence diagram
  for the staged-write path; updated the LoadMode table to call
  out per-mode commit semantics; rewrote load vs ingest.
* docs/query-language.md — document the D₂ parse-time rule.
* docs/errors.md — add the D₂ BadRequest rejection path.
* docs/testing.md — extend the runs.rs row to cover the new MR-794
  contract tests; add the staged_writes.rs row.
* docs/releases/v0.4.1.md (new) — release note covering the rewire,
  test additions, residuals, and files changed.
* AGENTS.md (CLAUDE.md symlink) — update the atomic-per-query
  description and the L2 capability matrix row.

Stale-reference cleanup (MR-771 leftovers):
* docs/storage.md — drop live _graph_runs.lance / _graph_run_actors.lance
  from the layout diagram and prose; mark legacy.
* docs/branches-commits.md — move __run__<id> to a legacy note;
  remove publish_run from the publish-trigger list.
* docs/audit.md — refresh _as API list (drop begin_run_as / publish_run_as);
  legacy RunRecord.actor_id moved to a historical note.
* docs/constants.md — mark run registry / branch-prefix rows as legacy.
* docs/cli.md — replace the legacy omnigraph run * quickstart block
  with omnigraph commit list/show.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Ragnor Comerford 2026-05-01 10:43:19 +02:00
parent 82350bdc4a
commit a61e82f47a
No known key found for this signature in database
14 changed files with 342 additions and 121 deletions

View file

@ -63,7 +63,7 @@ flowchart TB
subgraph engine[omnigraph engine]
plan[exec query and mutation]:::l2
gi[graph index CSR/CSC<br/>RuntimeCache LRU 8]:::l2
coord[coordinator<br/>ManifestRepo · CommitGraph · RunRegistry]:::l2
coord[coordinator<br/>ManifestRepo · CommitGraph]:::l2
end
subgraph storage[storage trait — wraps Lance]
@ -134,7 +134,7 @@ flowchart TB
coord[GraphCoordinator]:::l2
mr[ManifestRepo<br/>db/manifest.rs]:::l2
cg[CommitGraph<br/>_graph_commits.lance]:::l2
rr[RunRegistry<br/>_graph_runs.lance]:::l2
stg[MutationStaging<br/>per-query in-memory accumulator<br/>exec/staging.rs]:::l2
end
subgraph idx[graph index]
@ -149,17 +149,18 @@ flowchart TB
eq --> gi
eq --> ts
em --> stg
em --> ts
ld --> stg
ld --> ts
eq --> mr
em --> mr
coord --> mr
coord --> cg
coord --> rr
ts --> st
```
The engine binds the compiler IR to Lance. It owns multi-dataset coordination, the graph topology index, the run registry, and the snapshot/manifest read path.
The engine binds the compiler IR to Lance. It owns multi-dataset coordination, the graph topology index, the per-query staging accumulator, and the snapshot/manifest read path.
Code paths:
@ -169,6 +170,46 @@ Code paths:
- Graph index: `crates/omnigraph/src/graph_index/`
- Loader: `Omnigraph::ingest` at `crates/omnigraph/src/loader/mod.rs:74`
### Mutation atomicity — in-memory accumulator (MR-794)
Inserts and updates inside `mutate_as` and the bulk loader's
Append/Merge modes go through `MutationStaging`
([`crates/omnigraph/src/exec/staging.rs`](../crates/omnigraph/src/exec/staging.rs)),
a per-query in-memory accumulator. No Lance HEAD advance happens during
op execution; one `stage_*` + `commit_staged` per touched table runs
at end-of-query, then the publisher commits the manifest atomically.
```
op-1 (insert/update) → push RecordBatch → MutationStaging.pending[table]
op-2 (insert/update) → read committed via Lance + pending via DataFusion
MemTable (read-your-writes) → push batch
op-N → push batch
─── end of query ───────────────────────────────────────
finalize: per pending table:
concat batches → stage_append OR stage_merge_insert → commit_staged
publisher: ManifestBatchPublisher::publish (one cross-table CAS)
```
A failed op leaves Lance HEAD untouched on the staged tables: the next
mutation proceeds normally with no drift to reconcile. Concrete
contracts:
- `D₂` parse-time rule: a query is either insert/update-only or
delete-only. Mixed → reject. Deletes still inline-commit (Lance
4.0.0 has no public two-phase delete); D₂ keeps the inline path safe.
- `LoadMode::Overwrite` keeps the inline-commit path
(truncate-then-append doesn't fit the staged shape; overwrite has no
in-flight read-your-writes requirement).
- Read sites consume `TableStore::scan_with_pending`, which Lance-scans
the committed snapshot at the captured `expected_version` and unions
with a DataFusion `MemTable` over the pending batches.
This pattern realizes [docs/invariants.md §VI.25](invariants.md)
(read-your-writes within a multi-statement mutation) and §VI.32
(failure scope bounded) for inserts/updates by construction at the
writer layer. See [docs/runs.md](runs.md) for the publisher CAS
contract this builds on.
### Storage trait — today vs. roadmap
```mermaid
@ -256,13 +297,13 @@ Throughout the docs, capabilities are split into:
- **MVCC**: every Lance write bumps a per-dataset version; the OmniGraph manifest version coordinates which sub-table versions are visible together.
- **Snapshot isolation**: a query holds one `Snapshot` for its lifetime; concurrent writes don't leak in.
- **Cross-branch isolation**: copy-on-write means readers and writers on different branches don't block each other.
- **Run isolation**: each transactional run lives on its own `__run__<id>` branch.
- **Per-query staging**: `mutate_as` and `load` (Append/Merge) accumulate insert/update batches in an in-memory `MutationStaging`; one `stage_*` + `commit_staged` per touched table runs at end-of-query, then the publisher commits the manifest atomically. A mid-query failure leaves Lance HEAD untouched on staged tables. (MR-794; pre-v0.4.0 used a `__run__<id>` staging branch + Run state machine, removed in MR-771.)
- **Schema-apply lock**: `__schema_apply_lock__` system branch serializes schema migrations.
- **Fail-points** (`failpoints` cargo feature): `failpoints::maybe_fail("operation.step")?` in `branch_create`, publish, etc., for deterministic failure injection in tests.
## Workspace crates
- `omnigraph-compiler` — schema and query grammars, catalog, IR, lowering, type checker, lint, migration planner, OpenAI-style embedding client.
- `omnigraph` (engine, published as `omnigraph-engine` on crates.io since v0.2.2) — the Lance-backed runtime: manifest, commit graph, run registry, snapshot, exec, merge, loader, Gemini embedding client.
- `omnigraph` (engine, published as `omnigraph-engine` on crates.io since v0.2.2) — the Lance-backed runtime: manifest, commit graph, snapshot, exec (incl. per-query `MutationStaging` accumulator), merge, loader, Gemini embedding client.
- `omnigraph-cli` — the `omnigraph` binary.
- `omnigraph-server` — the `omnigraph-server` binary (Axum HTTP server).

View file

@ -1,6 +1,7 @@
# Audit / Actor tracking
- `Omnigraph::audit_actor_id: Option<String>` is the actor in effect.
- `_as` variants of every write API let callers override the actor: `begin_run_as`, `publish_run_as`, `ingest_as`, `mutate_as`, `branch_merge_as`, etc.
- Actor IDs are persisted both on `RunRecord.actor_id` and on `GraphCommit.actor_id`, with optional split storage in `_graph_commit_actors.lance` and `_graph_run_actors.lance`.
- `_as` variants of every write API let callers override the actor: `mutate_as`, `ingest_as`, `branch_merge_as`, `apply_schema_as`, etc.
- Actor IDs are persisted on `GraphCommit.actor_id` with split storage in `_graph_commit_actors.lance` (the commit graph is split into `_graph_commits.lance` for the linkage and `_graph_commit_actors.lance` for the actor map).
- HTTP server uses the bearer-token actor automatically; CLI uses the local user / explicit env (no implicit actor).
- Pre-v0.4.0 repos also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0 and reclaimed by MR-770's production sweep.

View file

@ -37,7 +37,7 @@ Storage is split across two Lance datasets (both with stable row IDs):
Notes:
- Every successful publish (load / change / merge / schema_apply / publish_run) appends one commit.
- Every successful publish (load / change / merge / schema_apply) appends one commit.
- Merge commits have two parents; linear commits have one.
- API: `list_commits(branch)`, `get_commit(id)`, `head_commit_id_for_branch(branch)`.
@ -53,5 +53,5 @@ Notes:
Filtered from `branch_list()` but visible to internals:
- `__run__<run-id>` — ephemeral isolation branch for a transactional run.
- `__schema_apply_lock__` — serializes schema migrations.
- `__run__<run-id>` — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). The branch-name guard predicate `is_internal_run_branch` is kept as defense-in-depth so users cannot create a branch matching the legacy prefix; the filter will be removed once production legacy branches are swept (MR-770).

View file

@ -56,12 +56,12 @@ omnigraph policy validate --config ./omnigraph.yaml
omnigraph policy test --config ./omnigraph.yaml
omnigraph policy explain --config ./omnigraph.yaml --actor act-alice --action read --branch main
omnigraph run list ./repo.omni --json
omnigraph run show --uri ./repo.omni <run-id> --json
omnigraph run publish --uri ./repo.omni <run-id> --json
omnigraph run abort --uri ./repo.omni <run-id> --json
omnigraph commit list ./repo.omni --json
omnigraph commit show --uri ./repo.omni <commit-id> --json
```
(The legacy `omnigraph run list/show/publish/abort` subcommands were removed in MR-771; mutations and loads publish atomically and the commit graph (`omnigraph commit list`) is the audit surface.)
`query lint` and `query check` are the same command surface. In v1, repo-backed
lint uses local or `s3://` repo URIs; HTTP targets are only supported when you
also pass `--schema`.

View file

@ -4,8 +4,8 @@
|---|---|---|
| `MANIFEST_DIR` | `__manifest` | `db/manifest/layout.rs` |
| Commit graph dir | `_graph_commits.lance` | `db/commit_graph.rs` |
| Run registry dir | `_graph_runs.lance` | `db/run_registry.rs` |
| Run branch prefix | `__run__` | `db/run_registry.rs` |
| Run registry dir (legacy, removed MR-771) | `_graph_runs.lance` | inert post-v0.4.0; reclaimed by MR-770 |
| Run branch prefix (legacy, removed MR-771) | `__run__` | filtered by `is_internal_run_branch` defense-in-depth |
| Schema apply lock | `__schema_apply_lock__` | `db/mod.rs` |
| Manifest publisher retry budget | `PUBLISHER_RETRY_BUDGET = 5` | `db/manifest/publisher.rs` |
| Internal manifest schema version | `INTERNAL_MANIFEST_SCHEMA_VERSION = 2` | `db/manifest/migrations.rs` |

View file

@ -9,6 +9,7 @@
- `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, details: Option<ManifestConflictDetails>, … })`
- `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }` — caller's `expected_table_versions` did not match the manifest's current latest non-tombstoned version (set by `OmniError::manifest_expected_version_mismatch`).
- `ManifestConflictDetails::RowLevelCasContention` — Lance row-level CAS rejected the publish because a concurrent writer landed the same `object_id`. Retried internally by the publisher; only surfaces if the retry budget exhausts.
- **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/query-language.md](query-language.md) for the rule and [docs/runs.md](runs.md) for the underlying staged-write rationale.
- `MergeConflicts(Vec<MergeConflict>)`
Compiler-side `NanoError` covers parse / catalog / type / storage / plan / execution / arrow / lance / IO / manifest / unique-constraint, each with structured spans (`SourceSpan { start, end }`) for ariadne-style diagnostics.

View file

@ -79,13 +79,16 @@ Hybrid example: `order { rrf(nearest($d.embedding, $q), bm25($d.body, $q_text))
## Mutation execution (`exec/mutation.rs`)
Resolves expression values to literals, converts to typed Arrow arrays (`literal_to_typed_array(lit, DataType, num_rows)`), then writes:
Resolves expression values to literals, converts to typed Arrow arrays (`literal_to_typed_array(lit, DataType, num_rows)`), then writes via Lance's two-phase distributed-write API at end-of-query:
- `insert` → Lance `WriteMode::Append`
- `update` → Lance `merge_insert(WhenMatched::Update)`
- `delete` → Lance `merge_insert(WhenMatched::Delete)` (logical) or filtered overwrite.
- `insert` (no `@key`, edges) → accumulate into `MutationStaging.pending` (Append mode); finalize calls `stage_append` once per touched table.
- `insert` (`@key` node) → accumulate into `pending` (Merge mode); finalize calls `stage_merge_insert` once per touched table.
- `update` → scan committed via Lance + pending via DataFusion `MemTable` (read-your-writes), apply assignments, accumulate into `pending` (Merge mode).
- `delete` → still inline-commits via `delete_where` (Lance 4.0.0 has no public two-phase delete); recorded into `MutationStaging.inline_committed`.
Multi-statement mutations are atomic at the manifest commit boundary.
**D₂ parse-time rule.** A single mutation query is either insert/update-only or delete-only. Mixed → reject before any I/O. The check fires in `enforce_no_mixed_destructive_constructive(&ir)` inside `execute_named_mutation`.
Multi-statement mutations are atomic at the publisher commit boundary: every insert/update batch lives in memory until end-of-query, then exactly one `stage_*` + `commit_staged` runs per touched table, then `ManifestBatchPublisher::publish` commits the manifest atomically with per-table `expected_table_versions` CAS.
### Mutation flow — sequence
@ -93,57 +96,58 @@ Multi-statement mutations are atomic at the manifest commit boundary.
sequenceDiagram
autonumber
participant client as Client
participant og as Omnigraph::mutate<br/>(mutation.rs:511)
participant og as Omnigraph::mutate_as<br/>(mutation.rs)
participant cmp as omnigraph-compiler
participant runs as RunRegistry
participant stg as MutationStaging<br/>(exec/staging.rs)
participant ts as table_store
participant lance as Lance dataset
participant mr as ManifestRepo<br/>(manifest.rs:280)
participant pub as ManifestBatchPublisher
client->>og: mutate(target, source, name, params)
og->>cmp: parse + typecheck_query
cmp-->>og: CheckedQuery (Mutation IR)
og->>runs: begin_run(target, op_hash)<br/>fork __run__<id> from target head
runs-->>og: RunRecord
loop for each mutation statement (on __run__<id>)
og->>og: resolve expression literals<br/>literal_to_typed_array(lit, type, n)
alt insert
og->>ts: append RecordBatches
ts->>lance: WriteMode::Append → new fragment(s)
else update
og->>ts: merge_insert keyed by id
ts->>lance: merge_insert(WhenMatched::Update)
else delete
og->>ts: merge_insert with delete predicate
ts->>lance: merge_insert(WhenMatched::Delete)
client->>og: mutate_as(branch, source, name, params, actor_id)
og->>cmp: parse + typecheck + lower_mutation_query
cmp-->>og: MutationIR
og->>og: enforce_no_mixed_destructive_constructive (D₂)
loop for each mutation op
og->>og: resolve literals + build batch
alt insert / update (accumulate)
og->>ts: open dataset @ pre-write version (first touch)
og->>stg: ensure_path + append_batch (PendingMode)
opt update — scan committed + pending
og->>ts: scan_with_pending (Lance + DataFusion MemTable union)
ts-->>og: matched batches
end
else delete (inline-commit, D₂ keeps separate)
og->>ts: delete_where (advances Lance HEAD)
og->>stg: record_inline (SubTableUpdate)
end
lance-->>ts: new dataset version
og->>mr: commit_updates(SubTableUpdate)<br/>per-statement commit on __run__<id>
mr-->>og: ack
end
og->>og: OCC: target head unchanged since begin_run?
og->>og: publish_run(run_id)
alt fast path (target hasn't moved)
og->>mr: commit_updates_on_branch(target, updates)<br/>promote run snapshot
else merge path (target advanced)
og->>og: branch_merge_internal(__run__<id>, target)<br/>three-way merge
og->>stg: finalize(db, branch)
loop per pending table
stg->>ts: stage_append OR stage_merge_insert (one per table)
ts-->>stg: StagedWrite (transaction + fragments)
stg->>ts: commit_staged (advances Lance HEAD)
ts-->>stg: new Dataset
end
mr-->>og: new target snapshot
og->>runs: terminate_run(Published)
stg-->>og: (updates: Vec<SubTableUpdate>, expected_versions)
og->>pub: commit_updates_on_branch_with_expected
pub->>pub: publisher CAS (cross-table OCC on __manifest)
pub-->>og: new manifest version
og-->>client: MutationResult
```
**Code paths:**
- Entry: `Omnigraph::mutate` at `crates/omnigraph/src/exec/mutation.rs:511`
- Per-mutation orchestration: `mutate_with_current_actor` at `crates/omnigraph/src/exec/mutation.rs:539`
- Per-statement commit on the run-branch: `commit_updates` (called from `execute_insert` / `execute_update` / `execute_delete` in `crates/omnigraph/src/exec/mutation.rs`)
- Run publish: `Omnigraph::publish_run` at `crates/omnigraph/src/db/omnigraph.rs:858`
- Manifest commit primitive: `ManifestRepo::commit` at `crates/omnigraph/src/db/manifest.rs:280` (called from both per-statement `commit_updates` and the publish path)
- Entry: `Omnigraph::mutate_as` at `crates/omnigraph/src/exec/mutation.rs`
- Per-mutation orchestration: `mutate_with_current_actor` at `crates/omnigraph/src/exec/mutation.rs`
- D₂ check: `enforce_no_mixed_destructive_constructive` (in the same file)
- Per-op execution: `execute_insert`, `execute_update`, `execute_delete_node`, `execute_delete_edge`
- Pending-aware reads: `TableStore::scan_with_pending` / `count_rows_with_pending` at `crates/omnigraph/src/table_store.rs`
- Edge cardinality with pending: `validate_edge_cardinality_with_pending` at `crates/omnigraph/src/exec/mutation.rs`
- Per-query accumulator: `crates/omnigraph/src/exec/staging.rs` (`MutationStaging`, `PendingTable`, `PendingMode`, `finalize`)
- End-of-query Lance commit: `TableStore::stage_append`, `stage_merge_insert`, `commit_staged` at `crates/omnigraph/src/table_store.rs`
- Manifest commit primitive: `commit_updates_on_branch_with_expected` at `crates/omnigraph/src/db/omnigraph/table_ops.rs`
Multi-statement mutations don't get atomicity from a single final `commit` — they get it from the **run-branch + publish_run** pattern. By default a mutation forks a fresh `__run__<id>` branch (`begin_run`); each statement individually commits its sub-table updates to that run-branch. After all statements complete, an OCC pre-check verifies the target hasn't moved since the run started, then `publish_run` atomically promotes the run-branch into the target — either via the fast path (direct promotion if the target hasn't moved) or a three-way merge. That final publish is what gives multi-statement mutations their atomicity guarantee (per [`docs/invariants.md`](invariants.md) §VI.26). If anything fails mid-run, the run is failed and the run-branch is dropped without affecting the target.
One exception: if the caller already targets a `__run__<id>` branch (mutation.rs:555), the mutation runs directly on that branch with no nested run wrapping — the assumption is the caller is managing the surrounding run lifecycle themselves. See [runs.md](runs.md) for the full run lifecycle.
Atomicity guarantee for multi-statement mutations: a mid-query failure leaves Lance HEAD untouched on staged tables (no inline commit happened during op execution), so the next mutation proceeds normally with no `ExpectedVersionMismatch`. The publisher CAS at the very end either succeeds (manifest advances atomically across all touched sub-tables) or fails with a typed `ManifestConflictDetails::ExpectedVersionMismatch` (no partial publish). See [docs/invariants.md §VI.25 / §VI.32](invariants.md) and [docs/runs.md](runs.md).
## Bulk loader (`loader/mod.rs`)
@ -156,19 +160,18 @@ One exception: if the caller already targets a `__run__<id>` branch (mutation.rs
## Load modes (`LoadMode`)
| Mode | Semantics |
|---|---|
| `Overwrite` | Replace all data in the target tables on the branch |
| `Append` | Strict insert; duplicates error |
| `Merge` | Upsert by id (`merge_insert`) |
| Mode | Semantics | Path (post-MR-794) |
|---|---|---|
| `Overwrite` | Replace all data in the target tables on the branch | Inline-commit per type, then publisher CAS at end-of-load. Truncate-then-append doesn't fit the staged shape; documented residual. |
| `Append` | Strict insert; duplicates error | In-memory `MutationStaging` accumulator; one `stage_append` + `commit_staged` per touched table at end-of-load; publisher CAS. |
| `Merge` | Upsert by `id` (`merge_insert`) | Same accumulator; one `stage_merge_insert` per touched table at end-of-load (Merge mode dedupes by `id`, last-write-wins); publisher CAS. |
For Append/Merge, a mid-load failure (RI / cardinality violation, validation error) leaves Lance HEAD untouched on the staged tables — the next load on the same tables proceeds normally with no `ExpectedVersionMismatch`. For Overwrite, a mid-load failure can still leave Lance HEAD on a partially-truncated table; the next overwrite replaces it.
## `load` vs `ingest`
- `load(branch, data, mode)` — direct load to a branch.
- `ingest(branch, from, data, mode)` — branch-creating, transactional load:
1. If target advanced since the run started, fork a fresh run branch from `from`.
2. Load into the run branch (Append).
3. If target hasn't moved, fast-publish; otherwise abort.
- `load(branch, data, mode)` — direct load to a branch (single publisher commit per call).
- `ingest(branch, from, data, mode)` — branch-creating wrapper: if `branch` doesn't exist, fork it from `from` (default `main`) via `branch_create_from`, then call `load(branch, data, mode)`.
- Returns `IngestResult { branch, base_branch, branch_created, mode, tables[] }`.
- `ingest_as(actor_id)` records the actor on the resulting commit.

View file

@ -110,7 +110,7 @@ Specific defaults (timeout values, memory caps, TTL windows) are *configuration*
*Status: aspirational — referential integrity at scale requires SIP-backed cross-table validation; not yet implemented. Cross-batch / cross-version uniqueness tracked in MR-714.*
25. **Isolation: per-query snapshot; read-your-writes within and across queries in a session.** Each query reads from one consistent manifest version. Within a multi-statement mutation, the read subplan inside each write operator sees the writes from earlier statements. Across queries in a session, reads always resolve the latest manifest version — no reader pinning to older snapshots.
*Status: open — read-your-writes within a multi-statement mutation requires Kuzu-style local-uncommitted scan path; deferred per MR-737 §10.10.*
*Status: upheld for inserts/updates after MR-794 step 2+ — `MutationStaging`'s in-memory accumulator + `TableStore::scan_with_pending` (DataFusion `MemTable` union with the committed Lance scan) implements read-your-writes within a multi-statement mutation. Delete-touching mutations are limited to delete-only by parse-time D₂; closing the within-query RYW gap for deletes requires Lance's two-phase delete API (tracked: MR-793 / Lance-upstream).*
26. **Durability before acknowledgement.** Commit returns only after the substrate has confirmed durable persistence. No "fast" or "fire-and-forget" durability levels.

View file

@ -64,6 +64,14 @@ Used inside MATCH or as expressions inside RETURN/ORDER:
`<value>` is a literal, `$param`, or `now()`. Multi-statement mutations execute atomically (added in v0.2.0).
### D₂ — mixed insert/update + delete is rejected at parse time
A single mutation query must be **either insert/update-only or delete-only**. Mixed → rejected before any I/O with the message:
> `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes. This restriction lifts when Lance exposes a two-phase delete API (tracked: MR-793 / Lance-upstream).`
Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance 4.0.0 has no public two-phase delete). Mixing creates ordering hazards (same-row insert→delete becomes a no-op because the staged insert isn't visible to delete; cascading deletes of just-inserted edges break referential integrity by silent design). Until Lance exposes `DeleteJob::execute_uncommitted`, the parse-time rejection keeps both paths atomic and correct. See [docs/runs.md](runs.md) and [docs/invariants.md §VI.25](invariants.md).
## IR (Intermediate Representation)
`QueryIR { name, params, pipeline: Vec<IROp>, return_exprs, order_by, limit }`

138
docs/releases/v0.4.1.md Normal file
View file

@ -0,0 +1,138 @@
# Omnigraph v0.4.1
Omnigraph v0.4.1 closes the multi-statement-mutation atomicity gap that
v0.4.0 documented as a known limitation. Inserts and updates now route
through an in-memory `MutationStaging` accumulator and commit via Lance's
two-phase distributed-write API at end-of-query. A failed mid-query op
no longer leaves Lance HEAD drifted on the touched table — the next
mutation proceeds normally.
## Highlights
- **Staged-write rewire (MR-794)**: `mutate_as` and `load` (Append /
Merge modes) accumulate insert/update batches into
`MutationStaging.pending` per touched table. No Lance HEAD advance
happens during op execution; one `stage_*` + `commit_staged` per
table runs at end-of-query, then `ManifestBatchPublisher::publish`
commits the manifest atomically. A mid-query failure leaves Lance
HEAD untouched on staged tables.
- **D₂ parse-time rule**: a single mutation query is either
insert/update-only or delete-only. Mixed → rejected with a clear
error directing the caller to split into two queries. Lance 4.0.0
has no public two-phase delete; deletes still inline-commit, and D₂
keeps that path safe.
- **Read-your-writes via DataFusion `MemTable`**: read sites in
multi-statement mutations consume `TableStore::scan_with_pending`,
which Lance-scans the committed snapshot at the captured
`expected_version` and unions with a DataFusion `MemTable` over the
pending batches. Replaces the previous "reopen at staged Lance
version" pattern.
- **Coordinator swap-restore eliminated** from `mutate_with_current_actor`.
Branch is threaded explicitly through the per-op execution path
(`execute_named_mutation`, `execute_insert`, `execute_update`,
`execute_delete*`, `validate_edge_insert_endpoints`,
`ensure_node_id_exists`). The `swap_coordinator_for_branch` /
`restore_coordinator` API and `CoordinatorRestoreGuard` are removed
from `mutation.rs`. (`merge.rs` keeps its own swap pattern; that's
a separate workflow tracked in MR-793.)
- **`docs/invariants.md` §VI.25** flips from `aspirational/open` to
`upheld for inserts/updates`. The within-query read-your-writes
guarantee is now load-bearing for the publisher CAS contract.
## Behavior changes
- A failed multi-statement mutation no longer surfaces
`ExpectedVersionMismatch` on the *next* mutation against the same
table. The next call proceeds normally — Lance HEAD on staged
tables is unchanged.
- Mixed insert/update + delete in one query is rejected at parse
time. Existing test queries that mixed both must be split.
- `MutationStaging`'s shape changed: `pending: HashMap<String, PendingTable>`
+ `inline_committed: HashMap<String, SubTableUpdate>` replaces the
previous `latest: HashMap<String, StagedTable>`. This is an internal
type; no public API impact.
## Residual / out of scope
- **`LoadMode::Overwrite`** keeps the legacy inline-commit path
(truncate-then-append doesn't fit the staged shape). A mid-overwrite
failure can still drift Lance HEAD on a partially-truncated table;
the next overwrite replaces it. Operator-driven, rare.
- **Delete-only multi-statement mutations** still inline-commit per op.
D₂ keeps inserts/updates from coexisting with deletes, so the
inline path remains atomic per op but not per query for delete-only
cascades. Closing this requires Lance to expose
`DeleteJob::execute_uncommitted`; tracked in MR-793 / Lance-upstream.
- **`schema_apply`, `branch_merge_internal`, `ensure_indices`** still
use Lance's inline-commit APIs. The two-phase pattern is in
`mutate_as` and `load` only; hoisting it to a storage-trait
invariant covering all writers is MR-793.
## Tests added
- `tests/runs.rs::partial_failure_leaves_target_queryable_and_unblocks_next_mutation`
(replaces the old `partial_failure_observably_rolls_back_but_blocks_next_mutation_on_same_table`)
- `tests/runs.rs::mutation_rejects_mixed_insert_and_delete_at_parse_time`
- `tests/runs.rs::mixed_insert_and_update_on_same_person_coalesces_to_one_merge`
- `tests/runs.rs::multiple_appends_to_same_edge_coalesce_to_one_append`
- `tests/runs.rs::multi_statement_inserts_publish_exactly_once`
- `tests/runs.rs::load_with_bad_edge_reference_unblocks_next_load`
- `tests/runs.rs::load_with_cardinality_violation_unblocks_next_load`
## Files changed
- `crates/omnigraph/src/exec/staging.rs` (NEW) — `MutationStaging`,
`PendingTable`, `PendingMode`, `StagedTablePath`,
`dedupe_merge_batches_by_id`.
- `crates/omnigraph/src/exec/mutation.rs` — D₂ check; per-op
rewires (`execute_insert`, `execute_update`, `execute_delete*`);
branch threading; coordinator-swap removal; helper
`validate_edge_cardinality_with_pending`; helper
`concat_match_batches_to_schema`; `apply_assignments` updated to
copy unassigned blob columns from full-schema scans.
- `crates/omnigraph/src/loader/mod.rs``load_jsonl_reader` split:
staged path for Append/Merge, legacy inline-commit path for
Overwrite. Helpers `collect_node_ids_with_pending` and
`validate_edge_cardinality_with_pending_loader`.
- `crates/omnigraph/src/table_store.rs``scan_with_pending`,
`count_rows_with_pending` (DataFusion `MemTable`-backed union with
Lance scan).
- `Cargo.toml` (workspace) + `crates/omnigraph/Cargo.toml` — added
`datafusion = "52"` direct dep (transitively pulled by Lance
already; required for `MemTable`).
- `docs/runs.md` — removed "Known limitation" section; documented
the new accumulator + D₂ + LoadMode::Overwrite residual.
- `docs/invariants.md` — §VI.25 status flipped to `upheld for
inserts/updates`.
- `docs/architecture.md` — added "Mutation atomicity — in-memory
accumulator (MR-794)" subsection; refreshed the engine + state
diagrams to drop `RunRegistry` and add `MutationStaging`.
- `docs/execution.md` — rewrote the mutation flow sequence diagram
for the staged-write path; updated the `LoadMode` table to call
out per-mode commit semantics; rewrote `load` vs `ingest`.
- `docs/query-language.md` — documented the D₂ parse-time rule.
- `docs/errors.md` — added the D₂ `BadRequest` rejection path.
- `docs/storage.md` — dropped the live `_graph_runs.lance` reference
(legacy from MR-771) from the layout diagram and prose.
- `docs/branches-commits.md` — moved `__run__<id>` to a legacy note;
removed `publish_run` from the publish-trigger list.
- `docs/audit.md` — current `_as` API list refreshed; legacy
`RunRecord.actor_id` moved to a historical note.
- `docs/constants.md` — marked the run registry / branch-prefix rows
as legacy.
- `docs/cli.md` — replaced the legacy `omnigraph run *` quickstart
block with `omnigraph commit list/show`.
- `docs/testing.md` — extended the `runs.rs` row to cover the new
MR-794 contract tests; added the `staged_writes.rs` row.
- `AGENTS.md` (CLAUDE.md symlink) — updated the atomic-per-query
description and the L2 capability matrix row.
## Included Changes
- MR-794 step 2+ — rewire `mutate_as` and `load` via in-memory
`MutationStaging` + `stage_*` / `commit_staged` per touched table at
end-of-query.
- (MR-794 step 1 shipped in v0.4.0's PR #67`StagedWrite`,
`stage_append`, `stage_merge_insert`, `commit_staged`,
`scan_with_staged`, `count_rows_with_staged` — and is the substrate
this release builds on.)

View file

@ -20,22 +20,60 @@ publisher's row-level CAS on `__manifest` is the single fence.
A `.gq` query with multiple ops (e.g. `insert Person … insert Knows …`)
must observe earlier ops' writes when validating later ops (referential
integrity, edge cardinality). After demotion this is implemented via an
in-process `MutationStaging` accumulator in
`crates/omnigraph/src/exec/mutation.rs`:
integrity, edge cardinality). After MR-794 step 2+ this is implemented
via an in-memory `MutationStaging` accumulator in
[`crates/omnigraph/src/exec/staging.rs`](../crates/omnigraph/src/exec/staging.rs),
shared by both `mutate_as` and the bulk loader:
- On the first touch of each table, the pre-write manifest version is
captured into `expected_versions[table_key]`.
- Subsequent ops on the same table re-open the dataset at the locally
staged Lance version (bypassing the manifest, which has not been
committed yet) so they see prior writes.
- One `commit_with_expected(updates, expected_versions)` at the end
publishes the lot atomically. Cross-table conflicts surface as
captured into `expected_versions[table_key]` (the publisher's CAS
fence at end-of-query).
- Each insert/update op pushes a `RecordBatch` into the per-table
pending accumulator. Lance HEAD does **not** advance during op
execution.
- Read sites (validation, predicate matching for `update`) consume
`TableStore::scan_with_pending`, which scans committed via Lance
and applies the same SQL filter to the pending batches via DataFusion
`MemTable`. Same-query writes are visible to subsequent reads.
- At end-of-query, `MutationStaging::finalize` issues exactly one
`stage_*` + `commit_staged` per touched table (concatenating
accumulated batches; merge-mode dedupes by `id`, last-write-wins),
and the publisher publishes the manifest atomically across all
touched sub-tables. Cross-table conflicts surface as
`ManifestConflictDetails::ExpectedVersionMismatch`.
- **Deletes still inline-commit.** Lance's `Dataset::delete` is not
exposed as a two-phase op in 4.0.0; deletes go through `delete_where`
immediately and record their post-write state in
`MutationStaging.inline_committed`. The parse-time D₂ rule (below)
prevents inserts/updates from coexisting with deletes in one query,
so the inline path is safe for delete-only mutations.
This upholds [docs/invariants.md §VI.23](invariants.md) (atomicity per
query) and §VI.25 (read-your-writes within a multi-statement mutation —
previously aspirational, now upheld).
query) and §VI.25 (read-your-writes within a multi-statement mutation,
upheld).
### D₂ — parse-time mixed-mode rejection
A single mutation query is either insert/update-only or delete-only.
Mixed → rejected at parse time with a clear error directing the user to
split the query. Reason: mixing creates ordering hazards
(insert→delete on the same row would silently no-op because the staged
insert isn't visible to delete; cascading deletes of just-inserted
edges break referential integrity). Until Lance exposes a two-phase
delete API, the parse-time rejection keeps both paths atomic and
correct. Tracked: MR-793, plus a Lance-upstream ticket.
### `LoadMode::Overwrite` residual
The bulk loader's Append and Merge modes use the staged-write path
described above. `LoadMode::Overwrite` keeps the legacy inline-commit
path: truncate-then-append doesn't fit the staged shape cleanly in
Lance 4.0.0, and overwrite has no in-flight read-your-writes
requirement (the prior data is being wiped). A mid-overwrite failure
can leave Lance HEAD on a partially-truncated table; the next overwrite
will replace it. Operator-driven (rare in agent workloads); document
permanently until Lance exposes `Operation::Overwrite { fragments }` as
a two-phase op.
## Conflict shape
@ -59,40 +97,32 @@ list`.
`_graph_runs.lance` belongs in MR-770 (the production sweep) — this PR
stops *creating* run state but does not destroy legacy bytes on disk.
## Known limitation: mid-query partial failure on the same table
## Mid-query partial failure: closed by MR-794
A multi-statement `.gq` mutation where op-N writes a Lance fragment
successfully and op-N+1 then fails leaves the touched table at
`Lance HEAD = manifest_version + 1`. The query is atomic at the manifest
level (the publisher never publishes, so reads at the pinned manifest
version do *not* see op-N's data), but the *next* mutation against the
same table fails loudly with
`ManifestConflictDetails::ExpectedVersionMismatch` because
`ensure_expected_version` enforces strict equality between Lance HEAD and
the manifest's pinned version.
The pre-MR-794 design had a known limitation: a multi-statement `.gq`
mutation where op-N inline-committed a Lance fragment and op-N+1 then
failed left the touched table at `Lance HEAD = manifest_version + 1`,
blocking the next mutation with `ExpectedVersionMismatch`.
**Why the engine doesn't auto-rollback**: Lance's `Dataset::restore()` is
*not* a rewind — it appends a new commit (containing the desired
historical version's data) and advances HEAD further. There is no Lance
API to delete a committed version. A proper fix requires writing each
mutation's per-table fragments to a *transient Lance branch* on the
sub-table, then fast-forwarding main on success or dropping the branch
on failure. That work is tracked as a follow-up to MR-771; in the
meantime:
MR-794 (step 1 + step 2+) closed this for inserts/updates **by
construction at the writer layer**: insert and update batches accumulate
in memory; no Lance HEAD advance happens during op execution; one
`stage_*` + `commit_staged` per touched table runs at end-of-query, and
only after every op succeeded. A failed op leaves Lance HEAD untouched
on the staged tables, so the next mutation proceeds normally with no
drift to reconcile.
- **In practice this is rare.** Most schema-language validation
(`@key`, `@enum`, `@range`, intra-batch uniqueness, edge-endpoint
existence) runs *before* any Lance write inside the failing op, so
single-statement mutations never trip this. The narrow path is
multi-statement queries (`insert ... insert ...`,
`insert ... update ...`) where a late op fails on validation that
depends on earlier ops' staged data.
- **Workaround**: callers that hit this should refresh the handle and
retry the mutation; if Lance HEAD remains drifted the
`omnigraph cleanup` command will GC the orphan version once a later
successful commit on the same table moves HEAD past it. (`cleanup`
cannot reclaim an orphan that *is* the current Lance HEAD; that case
needs the per-table-branch follow-up to fully heal.)
The cancellation case (future drop mid-mutation) inherits the same
guarantee — the in-memory accumulator evaporates with the dropped task
and no Lance write was ever issued.
The cancellation case (future drop mid-mutation) has the same shape and
the same workaround.
For delete-touching mutations the legacy inline-commit shape is
preserved (Lance has no public two-phase delete in 4.0.0) — the same
narrow window remains. The parse-time D₂ rule prevents inserts/updates
from coexisting with deletes in one query, so a pure-delete failure
cannot drift any staged-table state. If a delete-only multi-table
mutation fails mid-cascade, the same workaround as before applies
(retry; rely on `omnigraph cleanup` once a later successful commit
moves HEAD past the orphan version). Closing this requires Lance to
expose `DeleteJob::execute_uncommitted`; tracked in MR-793 and a
Lance-upstream ticket.

View file

@ -22,7 +22,7 @@ OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordin
- `edges/{fnv1a64-hex(edge_type_name)}` — one Lance dataset per edge type
- `__manifest/` — the catalog of all sub-tables and their published versions
- `_graph_commits.lance` / `_graph_commit_actors.lance` — the commit graph and its actor map
- `_graph_runs.lance` / `_graph_run_actors.lance` — the run registry and its actor map
- (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 repos are inert; the run state machine was removed in MR-771 and these files are cleaned up via MR-770's production sweep)
- **Manifest row schema** (`object_id, object_type, location, metadata, base_objects, table_key, table_version, table_branch, row_count`):
- `object_type``table | table_version | table_tombstone`
- `table_key``node:<TypeName> | edge:<EdgeName>`
@ -63,14 +63,12 @@ flowchart TB
nodes["nodes/{fnv1a64-hex}/<br/>one dataset per node type"]:::l2
edges["edges/{fnv1a64-hex}/<br/>one dataset per edge type"]:::l2
cgraph["_graph_commits.lance/<br/>_graph_commit_actors.lance/"]:::l2
runs["_graph_runs.lance/<br/>_graph_run_actors.lance/"]:::l2
refs["_refs/branches/{name}.json<br/>graph-level branches"]:::l2
repo --> manifest
repo --> nodes
repo --> edges
repo --> cgraph
repo --> runs
repo --> refs
subgraph dataset[Inside each Lance dataset — L1]
@ -91,7 +89,7 @@ flowchart TB
- **Repo root** is one directory (or S3 prefix). Everything below is part of one OmniGraph repo.
- **`__manifest/`** is a Lance dataset whose rows describe which sub-table version is published at which graph-branch. Reading a snapshot starts here.
- **`nodes/`** and **`edges/`** are sibling directories holding one Lance dataset per declared type. Names are `fnv1a64-hex` of the type name to keep paths fixed-length and case-safe.
- **`_graph_commits.lance` / `_graph_runs.lance`** are L2 datasets that record the graph-level commit DAG and run registry respectively (each has a paired `*_actors.lance` for the actor map).
- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 repos also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; MR-770 sweeps these in production.)
- **`_refs/branches/{name}.json`** is graph-level branch metadata — pointers from a branch name to the manifest version it heads.
- **Inside each Lance dataset** (orange): the standard Lance directory layout. `_versions/{n}.manifest` records every commit; `data/` holds the actual Arrow fragments; `_indices/{uuid}/` holds index segments with their own `fragment_bitmap` for partial coverage; `_refs/` holds Lance-native per-dataset branches and tags.

View file

@ -19,7 +19,8 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
|---|---|
| `end_to_end.rs` | Full init → load → query/mutate flow |
| `branching.rs` | Branch create / list / delete, lazy fork |
| `runs.rs` | Transactional runs (begin/publish/abort), idempotency |
| `runs.rs` | Direct-publish writes: cancellation, concurrent-writer CAS, multi-statement atomicity, MR-794 staged-write rewire (D₂ rejection, insert+update coalesce, multi-append coalesce, partial-failure recovery, load RI/cardinality recovery) |
| `staged_writes.rs` | TableStore staged-write primitives (`stage_append`, `stage_merge_insert`, `commit_staged`, `scan_with_staged`, `count_rows_with_staged`) — primitive-level only; engine code uses the in-memory `MutationStaging` accumulator instead |
| `lifecycle.rs` | Repo lifecycle, schema state |
| `point_in_time.rs` | Snapshots, time travel (`snapshot_at_version`, `entity_at`) |
| `changes.rs` | `diff_between` / `diff_commits` |