mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-12 01:45:14 +02:00
Five fixes from PR #68 review (Cursor Bugbot + Codex + Cubic):

* **scan_with_pending gains merge-shadow semantics** (Codex P1, Cubic P1#1): new `key_column: Option<&str>` parameter. When set, committed rows whose key value appears in any pending batch are excluded from the scan, making `scan_with_pending` correctly merge-semantic for chained updates instead of naively unioning. execute_update calls with Some("id"). Without this, a chained `update where age > 30` could match a row whose pending value already moved out of range.
* **Multi-delete on same table no longer trips ExpectedVersionMismatch** (Cursor Bugbot HIGH): open_table_for_mutation routes through reopen_for_mutation when staging.inline_committed has the table, using the post-inline-commit Lance version captured at record_inline time. The legacy open_for_mutation_on_branch fence (Lance HEAD == manifest pinned) is correct cross-writer but wrong intra-query when deletes have already advanced HEAD on this table. Branch goes away when Lance ships two-phase delete (lance-format/lance#6658).
* **Cardinality validation consolidated** (Cursor LOW + Codex P2 + Cubic P1#2 + Cubic P2): new exec/staging::count_src_per_edge + enforce_cardinality_bounds shared by mutation and loader paths. Restores the missing min-cardinality check on the engine path. Loader Merge mode passes Some("id") to dedupe edges being updated by id (not double-count committed + pending). Loader Append mode and engine path pass None (ULID-generated ids never collide).
* **Dead count_rows_with_pending removed** (Cursor LOW): never called.
* **Misleading concat-helper comment fixed** (Cubic P3): claimed schema normalization the helper doesn't implement. Updated to match reality.
* **Documentation honesty** (Cubic P1#3): MR-794 narrows but doesn't eliminate the "Lance HEAD ahead of __manifest" drift class. Drift is unreachable for op-execution failures (the partial_failure test pins this), but a residual remains at the finalize→publisher boundary because Lance has no multi-dataset commit primitive: per-table commit_staged calls run sequentially before manifest commit. Updated docs/runs.md, docs/invariants.md §VI.25, docs/releases/v0.4.1.md to scope the claim precisely.
* **Failpoint test pinning the residual**: new mutation.post_finalize_pre_publisher failpoint + two tests in tests/failpoints.rs that confirm the documented residual behavior. Catches future regressions that widen the residual.

Test additions on tests/runs.rs:

* chained_updates_with_overlapping_predicate_respects_intermediate_value
* multi_statement_delete_on_same_node_table
* cascade_delete_node_then_explicit_delete_edge_on_same_table
* mutation_insert_edge_enforces_min_cardinality
* load_merge_mode_dedupes_edge_for_cardinality_count

113/113 engine integration tests pass (runs + end_to_end + consistency + staged_writes + validators). Failpoints feature build runs in CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7.5 KiB
Omnigraph v0.4.1
Omnigraph v0.4.1 closes the multi-statement-mutation atomicity gap that
v0.4.0 documented as a known limitation. Inserts and updates now route
through an in-memory MutationStaging accumulator and commit via Lance's
two-phase distributed-write API at end-of-query. A failed mid-query op
no longer leaves Lance HEAD drifted on the touched table — the next
mutation proceeds normally.
Highlights
- Staged-write rewire (MR-794): `mutate_as` and `load` (Append / Merge modes) accumulate insert/update batches into `MutationStaging.pending` per touched table. No Lance HEAD advance happens during op execution; one `stage_*` + `commit_staged` per table runs at end-of-query, then `ManifestBatchPublisher::publish` commits the manifest atomically. For op-execution failures (validation errors, missing endpoints, parse-time D₂ rejection), Lance HEAD on every staged table is untouched and the next mutation proceeds normally. A narrowed residual remains at the finalize→publisher boundary (multi-table `commit_staged` is not atomic with the manifest commit); see docs/runs.md "Finalize → publisher residual" for details.
- D₂ parse-time rule: a single mutation query is either insert/update-only or delete-only. Mixed queries are rejected with a clear error directing the caller to split into two queries. Lance 4.0.0 has no public two-phase delete; deletes still inline-commit, and D₂ keeps that path safe.
- Read-your-writes via DataFusion `MemTable`: read sites in multi-statement mutations consume `TableStore::scan_with_pending`, which Lance-scans the committed snapshot at the captured `expected_version` and unions with a DataFusion `MemTable` over the pending batches. Replaces the previous "reopen at staged Lance version" pattern.
- Coordinator swap-restore eliminated from `mutate_with_current_actor`. Branch is threaded explicitly through the per-op execution path (`execute_named_mutation`, `execute_insert`, `execute_update`, `execute_delete*`, `validate_edge_insert_endpoints`, `ensure_node_id_exists`). The `swap_coordinator_for_branch` / `restore_coordinator` API and `CoordinatorRestoreGuard` are removed from `mutation.rs`. (`merge.rs` keeps its own swap pattern; that's a separate workflow tracked in MR-793.)
- `docs/invariants.md` §VI.25 flips from `aspirational/open` to `upheld for inserts/updates`. The within-query read-your-writes guarantee is now load-bearing for the publisher CAS contract.
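The read-your-writes scan can be pictured with a small, dependency-free sketch. It is not the real `TableStore::scan_with_pending`, which works on Lance scans and a DataFusion `MemTable`; the `Row` type, the function name, and the boolean flag standing in for the `key_column` parameter are all hypothetical. It shows the difference between a plain union and the merge-shadow behavior described in the commit message, where a committed row is dropped once a pending row carries the same key. Pending rows are assumed to be deduplicated by id already.

```rust
use std::collections::HashSet;

/// Hypothetical row type for illustration only; the real code operates on
/// Arrow record batches, not on structs like this.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub id: String,
    pub age: u32,
}

/// Combine committed rows with pending (staged, not yet committed) rows.
///
/// With `shadow_by_id == false` the result is a plain union. With `true`,
/// committed rows whose id also appears in `pending` are skipped, so the
/// pending version of a row is the only one a later predicate can see.
pub fn scan_with_pending_sketch(
    committed: &[Row],
    pending: &[Row],
    shadow_by_id: bool,
) -> Vec<Row> {
    // Keys of every row that has a staged (newer) version.
    let pending_ids: HashSet<&str> = pending.iter().map(|r| r.id.as_str()).collect();

    let mut out: Vec<Row> = committed
        .iter()
        .filter(|r| !(shadow_by_id && pending_ids.contains(r.id.as_str())))
        .cloned()
        .collect();

    // Pending rows always take part in the scan.
    out.extend(pending.iter().cloned());
    out
}
```

With a committed row `a` at age 35 and a pending update moving `a` to age 20, the shadowed scan yields only the age-20 version, so a chained `update where age > 30` no longer matches `a`. The plain union would return both versions and match the stale one.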
Behavior changes
- A failed multi-statement mutation no longer surfaces `ExpectedVersionMismatch` on the next mutation against the same table. The next call proceeds normally; Lance HEAD on staged tables is unchanged.
- Mixed insert/update + delete in one query is rejected at parse time. Existing test queries that mixed both must be split.
- `MutationStaging`'s shape changed: `pending: HashMap<String, PendingTable>` and `inline_committed: HashMap<String, SubTableUpdate>` replace the previous `latest: HashMap<String, StagedTable>`. This is an internal type; no public API impact.
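The parse-time D₂ rule amounts to a one-pass classification of the operations in a query. The sketch below illustrates that idea under stated assumptions: `OpKind`, `QueryClass`, `classify_d2`, and the error text are hypothetical names, not the engine's actual parser types, and treating an empty query as insert/update-only is a choice made only for this example.

```rust
/// Hypothetical operation kinds for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OpKind {
    Insert,
    Update,
    Delete,
}

/// Which commit path a query would take once it passes the rule.
#[derive(Debug, PartialEq)]
pub enum QueryClass {
    /// Staged path: accumulate, then commit once per table at end-of-query.
    InsertUpdateOnly,
    /// Legacy path: each delete inline-commits.
    DeleteOnly,
}

/// Reject queries that mix inserts/updates with deletes; classify the rest.
pub fn classify_d2(ops: &[OpKind]) -> Result<QueryClass, String> {
    let has_delete = ops.iter().any(|op| matches!(op, OpKind::Delete));
    let has_write = ops
        .iter()
        .any(|op| matches!(op, OpKind::Insert | OpKind::Update));

    match (has_write, has_delete) {
        (true, true) => Err(
            "mixed insert/update and delete in one query; split into two queries".to_string(),
        ),
        (false, true) => Ok(QueryClass::DeleteOnly),
        // Sketch choice: an empty op list is treated as insert/update-only.
        _ => Ok(QueryClass::InsertUpdateOnly),
    }
}
```

Because the check looks only at the kinds of operations, it can run before any table is opened, which is what lets a rejected query leave every table untouched.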
Residual / out of scope
- `LoadMode::Overwrite` keeps the legacy inline-commit path (truncate-then-append doesn't fit the staged shape). A mid-overwrite failure can still drift Lance HEAD on a partially-truncated table; the next overwrite replaces it. Operator-driven, rare.
- Delete-only multi-statement mutations still inline-commit per op. D₂ keeps inserts/updates from coexisting with deletes, so the inline path remains atomic per op but not per query for delete-only cascades. Closing this requires Lance to expose `DeleteJob::execute_uncommitted`; tracked in MR-793 / Lance-upstream.
- `schema_apply`, `branch_merge_internal`, `ensure_indices` still use Lance's inline-commit APIs. The two-phase pattern is in `mutate_as` and `load` only; hoisting it to a storage-trait invariant covering all writers is MR-793.
Tests added
- `tests/runs.rs::partial_failure_leaves_target_queryable_and_unblocks_next_mutation` (replaces the old `partial_failure_observably_rolls_back_but_blocks_next_mutation_on_same_table`)
- `tests/runs.rs::mutation_rejects_mixed_insert_and_delete_at_parse_time`
- `tests/runs.rs::mixed_insert_and_update_on_same_person_coalesces_to_one_merge`
- `tests/runs.rs::multiple_appends_to_same_edge_coalesce_to_one_append`
- `tests/runs.rs::multi_statement_inserts_publish_exactly_once`
- `tests/runs.rs::load_with_bad_edge_reference_unblocks_next_load`
- `tests/runs.rs::load_with_cardinality_violation_unblocks_next_load`
Files changed
- `crates/omnigraph/src/exec/staging.rs` (NEW): `MutationStaging`, `PendingTable`, `PendingMode`, `StagedTablePath`, `dedupe_merge_batches_by_id`.
- `crates/omnigraph/src/exec/mutation.rs`: D₂ check; per-op rewires (`execute_insert`, `execute_update`, `execute_delete*`); branch threading; coordinator-swap removal; helper `validate_edge_cardinality_with_pending`; helper `concat_match_batches_to_schema`; `apply_assignments` updated to copy unassigned blob columns from full-schema scans.
- `crates/omnigraph/src/loader/mod.rs`: `load_jsonl_reader` split: staged path for Append/Merge, legacy inline-commit path for Overwrite. Helpers `collect_node_ids_with_pending` and `validate_edge_cardinality_with_pending_loader`.
- `crates/omnigraph/src/table_store.rs`: `scan_with_pending`, `count_rows_with_pending` (DataFusion `MemTable`-backed union with Lance scan).
- `Cargo.toml` (workspace) + `crates/omnigraph/Cargo.toml`: added `datafusion = "52"` direct dep (transitively pulled by Lance already; required for `MemTable`).
- `docs/runs.md`: removed "Known limitation" section; documented the new accumulator + D₂ + LoadMode::Overwrite residual.
- `docs/invariants.md`: §VI.25 status flipped to `upheld for inserts/updates`.
- `docs/architecture.md`: added "Mutation atomicity — in-memory accumulator (MR-794)" subsection; refreshed the engine + state diagrams to drop `RunRegistry` and add `MutationStaging`.
- `docs/execution.md`: rewrote the mutation flow sequence diagram for the staged-write path; updated the `LoadMode` table to call out per-mode commit semantics; rewrote `load` vs `ingest`.
- `docs/query-language.md`: documented the D₂ parse-time rule.
- `docs/errors.md`: added the D₂ `BadRequest` rejection path.
- `docs/storage.md`: dropped the live `_graph_runs.lance` reference (legacy from MR-771) from the layout diagram and prose.
- `docs/branches-commits.md`: moved `__run__<id>` to a legacy note; removed `publish_run` from the publish-trigger list.
- `docs/audit.md`: current `_as` API list refreshed; legacy `RunRecord.actor_id` moved to a historical note.
- `docs/constants.md`: marked the run registry / branch-prefix rows as legacy.
- `docs/cli.md`: replaced the legacy `omnigraph run *` quickstart block with `omnigraph commit list/show`.
- `docs/testing.md`: extended the `runs.rs` row to cover the new MR-794 contract tests; added the `staged_writes.rs` row.
- `AGENTS.md` (CLAUDE.md symlink): updated the atomic-per-query description and the L2 capability matrix row.
Included Changes
- MR-794 step 2+: rewire `mutate_as` and `load` via in-memory `MutationStaging` + `stage_*` / `commit_staged` per touched table at end-of-query.
- (MR-794 step 1, comprising `StagedWrite`, `stage_append`, `stage_merge_insert`, `commit_staged`, `scan_with_staged`, and `count_rows_with_staged`, shipped in v0.4.0's PR #67 and is the substrate this release builds on.)
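As a rough picture of the accumulate-then-commit flow in step 2, the sketch below stages rows per table in memory and only produces commits at end-of-query. Everything in it is a stand-in: `MutationStagingSketch`, `stage`, `finalize`, and `run_query` are hypothetical names, rows are plain strings rather than Arrow batches, and a "commit" is just a `(table, row_count)` pair instead of a Lance `commit_staged` call followed by a manifest publish.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the in-memory accumulator: table name -> staged rows.
#[derive(Debug, Default)]
pub struct MutationStagingSketch {
    pending: HashMap<String, Vec<String>>,
}

impl MutationStagingSketch {
    /// Record a row for a table without touching any committed state.
    pub fn stage(&mut self, table: &str, row: &str) {
        self.pending
            .entry(table.to_string())
            .or_default()
            .push(row.to_string());
    }

    /// End-of-query step: exactly one commit per touched table.
    /// Returns `(table, row_count)` pairs sorted by table name.
    pub fn finalize(self) -> Vec<(String, usize)> {
        let mut commits: Vec<(String, usize)> = self
            .pending
            .into_iter()
            .map(|(table, rows)| (table, rows.len()))
            .collect();
        commits.sort();
        commits
    }
}

/// Run a list of `(table, row)` ops. `fail_at` simulates an op-execution
/// failure (for example a validation error) at the given op index.
pub fn run_query(
    ops: &[(&str, &str)],
    fail_at: Option<usize>,
) -> Result<Vec<(String, usize)>, String> {
    let mut staging = MutationStagingSketch::default();
    for (i, (table, row)) in ops.iter().enumerate() {
        if fail_at == Some(i) {
            // The accumulator is dropped here; no commit has happened yet.
            return Err(format!("op {i} failed; nothing was committed"));
        }
        staging.stage(table, row);
    }
    Ok(staging.finalize())
}
```

In this model a failure during op execution returns before `finalize` runs, which mirrors the claim that staged tables are left untouched. The sketch does not model the finalize→publisher residual, where per-table commits succeed but the manifest commit does not.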