# Omnigraph v0.4.1

Omnigraph v0.4.1 closes the multi-statement-mutation atomicity gap that v0.4.0 documented as a known limitation. Inserts and updates now route through an in-memory `MutationStaging` accumulator and commit via Lance's two-phase distributed-write API at end-of-query. A failed mid-query op no longer leaves Lance HEAD drifted on the touched table; the next mutation proceeds normally.

## Highlights

- **Staged-write rewire**: `mutate_as` and `load` (Append / Merge modes) accumulate insert/update batches into `MutationStaging.pending` per touched table. No Lance HEAD advance happens during op execution; one `stage_*` + `commit_staged` per table runs at end-of-query, then `ManifestBatchPublisher::publish` commits the manifest atomically. **For op-execution failures** (validation errors, missing endpoints, parse-time D₂ rejection), Lance HEAD on every staged table is untouched and the next mutation proceeds normally. A narrowed residual remains at the finalize→publisher boundary (multi-table `commit_staged` is not atomic with the manifest commit); see [docs/runs.md](../runs.md) "Finalize → publisher residual" for details.
- **D₂ parse-time rule**: a single mutation query is either insert/update-only or delete-only. Mixed → rejected with a clear error directing the caller to split into two queries. Lance 4.0.0 has no public two-phase delete; deletes still inline-commit, and D₂ keeps that path safe.
- **Read-your-writes via DataFusion `MemTable`**: read sites in multi-statement mutations consume `TableStore::scan_with_pending`, which Lance-scans the committed snapshot at the captured `expected_version` and unions with a DataFusion `MemTable` over the pending batches. Replaces the previous "reopen at staged Lance version" pattern.
- **Coordinator swap-restore eliminated** from `mutate_with_current_actor`. Branch is threaded explicitly through the per-op execution path (`execute_named_mutation`, `execute_insert`, `execute_update`, `execute_delete*`, `validate_edge_insert_endpoints`, `ensure_node_id_exists`). The `swap_coordinator_for_branch` / `restore_coordinator` API and `CoordinatorRestoreGuard` are removed from `mutation.rs`. (`merge.rs` keeps its own swap pattern; that's a separate workflow.)
- **`docs/invariants.md` §VI.25** flips from `aspirational/open` to `upheld for inserts/updates`. The within-query read-your-writes guarantee is now load-bearing for the publisher CAS contract.

## Behavior changes

- A failed multi-statement mutation no longer surfaces `ExpectedVersionMismatch` on the *next* mutation against the same table. The next call proceeds normally; Lance HEAD on staged tables is unchanged.
- Mixed insert/update + delete in one query is rejected at parse time. Existing test queries that mixed both must be split.
- `MutationStaging`'s shape changed: `pending: HashMap` + `inline_committed: HashMap` replaces the previous `latest: HashMap`. This is an internal type; no public API impact.

## Residual / out of scope

- **`LoadMode::Overwrite`** keeps the legacy inline-commit path (truncate-then-append doesn't fit the staged shape). A mid-overwrite failure can still drift Lance HEAD on a partially-truncated table; the next overwrite replaces it. Operator-driven, rare.
- **Delete-only multi-statement mutations** still inline-commit per op. D₂ keeps inserts/updates from coexisting with deletes, so the inline path remains atomic per op but not per query for delete-only cascades. Closing this requires Lance to expose `DeleteJob::execute_uncommitted`; tracked upstream with Lance.
- **`schema_apply`, `branch_merge_internal`, `ensure_indices`** still use Lance's inline-commit APIs. The two-phase pattern is in `mutate_as` and `load` only; hoisting it to a storage-trait invariant covering all writers remains future work.

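
As a rough illustration of the D₂ rule and the staged-write flow described above, the toy model below buffers inserts and updates per table and touches the store only after every op has succeeded. All names here (`Op`, `Table`, `Store`, `Staging`, `check_d2`, `run_query`) are hypothetical stand-ins, not Omnigraph's actual types. The real path goes through `MutationStaging`, Lance's `stage_*` / `commit_staged`, and `ManifestBatchPublisher::publish`, none of which are modeled here.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical, simplified mutation op (the real query AST is not shown in these notes).
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Insert { table: String, id: u64, value: String },
    Update { table: String, id: u64, value: String },
    Delete { table: String, id: u64 },
}

/// Toy committed table: a version counter plus rows keyed by id.
#[derive(Debug, Default, PartialEq)]
pub struct Table {
    pub version: u64,
    pub rows: BTreeMap<u64, String>,
}

/// Toy stand-in for the committed store.
#[derive(Debug, Default)]
pub struct Store {
    pub tables: HashMap<String, Table>,
}

/// Toy accumulator: rows are buffered per table; nothing reaches `Store` until `commit`.
#[derive(Debug, Default)]
pub struct Staging {
    pending: HashMap<String, Vec<(u64, String)>>,
}

impl Staging {
    pub fn stage(&mut self, table: &str, id: u64, value: String) {
        self.pending.entry(table.to_string()).or_default().push((id, value));
    }

    /// One commit per touched table, run only after every op has succeeded.
    pub fn commit(self, store: &mut Store) {
        for (name, rows) in self.pending {
            let table = store.tables.entry(name).or_default();
            table.version += 1; // exactly one version bump per touched table
            for (id, value) in rows {
                table.rows.insert(id, value); // later staged rows win for the same id
            }
        }
    }
}

/// D₂-style check: a query must be insert/update-only or delete-only.
pub fn check_d2(ops: &[Op]) -> Result<(), String> {
    let has_delete = ops.iter().any(|op| matches!(op, Op::Delete { .. }));
    let has_write = ops.iter().any(|op| !matches!(op, Op::Delete { .. }));
    if has_delete && has_write {
        return Err("mixed insert/update and delete; split into two queries".to_string());
    }
    Ok(())
}

/// Validate, stage every op, then commit. `fail_at` injects an op-execution failure.
pub fn run_query(store: &mut Store, ops: &[Op], fail_at: Option<usize>) -> Result<(), String> {
    check_d2(ops)?;
    let mut staging = Staging::default();
    for (i, op) in ops.iter().enumerate() {
        if fail_at == Some(i) {
            // Returning here drops `staging`; the store was never touched.
            return Err(format!("op {} failed", i));
        }
        match op {
            Op::Insert { table, id, value } | Op::Update { table, id, value } => {
                staging.stage(table, *id, value.clone());
            }
            // Delete-only queries take the inline-commit path, which this sketch does not model.
            Op::Delete { .. } => {}
        }
    }
    staging.commit(store);
    Ok(())
}
```

In this sketch, a call with an injected failure returns `Err` and leaves `store.tables` empty, and retrying the same ops bumps each touched table's version exactly once. A query that mixes an insert with a delete is rejected by `check_d2` before anything is staged.
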
## Tests added

- `tests/runs.rs::partial_failure_leaves_target_queryable_and_unblocks_next_mutation` (replaces the old `partial_failure_observably_rolls_back_but_blocks_next_mutation_on_same_table`)
- `tests/runs.rs::mutation_rejects_mixed_insert_and_delete_at_parse_time`
- `tests/runs.rs::mixed_insert_and_update_on_same_person_coalesces_to_one_merge`
- `tests/runs.rs::multiple_appends_to_same_edge_coalesce_to_one_append`
- `tests/runs.rs::multi_statement_inserts_publish_exactly_once`
- `tests/runs.rs::load_with_bad_edge_reference_unblocks_next_load`
- `tests/runs.rs::load_with_cardinality_violation_unblocks_next_load`

## Files changed

- `crates/omnigraph/src/exec/staging.rs` (NEW): `MutationStaging`, `PendingTable`, `PendingMode`, `StagedTablePath`, `dedupe_merge_batches_by_id`.
- `crates/omnigraph/src/exec/mutation.rs`: D₂ check; per-op rewires (`execute_insert`, `execute_update`, `execute_delete*`); branch threading; coordinator-swap removal; helper `validate_edge_cardinality_with_pending`; helper `concat_match_batches_to_schema`; `apply_assignments` updated to copy unassigned blob columns from full-schema scans.
- `crates/omnigraph/src/loader/mod.rs`: `load_jsonl_reader` split into a staged path for Append/Merge and the legacy inline-commit path for Overwrite. Helpers `collect_node_ids_with_pending` and `validate_edge_cardinality_with_pending_loader`.
- `crates/omnigraph/src/table_store.rs`: `scan_with_pending`, `count_rows_with_pending` (DataFusion `MemTable`-backed union with Lance scan).
- `Cargo.toml` (workspace) + `crates/omnigraph/Cargo.toml`: added `datafusion = "52"` as a direct dependency (already pulled in transitively by Lance; required for `MemTable`).
- `docs/runs.md`: removed the "Known limitation" section; documented the new accumulator, D₂, and the `LoadMode::Overwrite` residual.
- `docs/invariants.md`: §VI.25 status flipped to `upheld for inserts/updates`.
- `docs/architecture.md`: added the "Mutation atomicity — in-memory accumulator" subsection; refreshed the engine + state diagrams to drop `RunRegistry` and add `MutationStaging`.
- `docs/execution.md`: rewrote the mutation flow sequence diagram for the staged-write path; updated the `LoadMode` table to call out per-mode commit semantics; rewrote `load` vs `ingest`.
- `docs/query-language.md`: documented the D₂ parse-time rule.
- `docs/errors.md`: added the D₂ `BadRequest` rejection path.
- `docs/storage.md`: dropped the live `_graph_runs.lance` reference from the layout diagram and prose.
- `docs/branches-commits.md`: moved `__run__` to a legacy note; removed `publish_run` from the publish-trigger list.
- `docs/audit.md`: refreshed the current `_as` API list; moved legacy `RunRecord.actor_id` to a historical note.
- `docs/constants.md`: marked the run registry / branch-prefix rows as legacy.
- `docs/cli.md`: replaced the legacy `omnigraph run *` quickstart block with `omnigraph commit list/show`.
- `docs/testing.md`: extended the `runs.rs` row to cover the new staged-write contract tests; added the `staged_writes.rs` row.
- `AGENTS.md` (CLAUDE.md symlink): updated the atomic-per-query description and the L2 capability matrix row.

## Included Changes

- Rewire `mutate_as` and `load` via in-memory `MutationStaging` + `stage_*` / `commit_staged` per touched table at end-of-query.
- (The storage substrate this release builds on shipped in v0.4.0's PR #67: `StagedWrite`, `stage_append`, `stage_merge_insert`, `commit_staged`, `scan_with_staged`, `count_rows_with_staged`.)
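
The read-your-writes behavior listed in the Highlights matters most for checks such as edge-endpoint validation, where an edge may reference a node inserted earlier in the same query. The following is a minimal sketch of that idea, using plain vectors in place of a Lance scan and a DataFusion `MemTable`. `Snapshot`, `collect_ids_with_pending`, and `validate_edge` are illustrative names and do not reflect the actual signatures of `scan_with_pending` or `collect_node_ids_with_pending`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a committed snapshot pinned at a captured version.
pub struct Snapshot {
    pub version: u64,
    pub node_ids: Vec<u64>,
}

/// Union of committed ids and ids staged earlier in the same query.
/// `pending` models the not-yet-committed batches, one inner Vec per batch.
pub fn collect_ids_with_pending(snapshot: &Snapshot, pending: &[Vec<u64>]) -> HashSet<u64> {
    snapshot
        .node_ids
        .iter()
        .copied()
        .chain(pending.iter().flatten().copied())
        .collect()
}

/// Endpoint check for an edge insert: both ends must be visible,
/// either in the committed snapshot or in the pending batches.
pub fn validate_edge(
    snapshot: &Snapshot,
    pending: &[Vec<u64>],
    src: u64,
    dst: u64,
) -> Result<(), String> {
    let visible = collect_ids_with_pending(snapshot, pending);
    for id in [src, dst] {
        if !visible.contains(&id) {
            return Err(format!("missing endpoint {}", id));
        }
    }
    Ok(())
}
```

With a snapshot holding nodes 1 and 2 and a pending batch holding node 3, an edge from 1 to 3 validates in this sketch; the same edge fails with a missing-endpoint error when the pending batches are empty.
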