mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-24 02:38:06 +02:00
MR-794 step 2: address PR #68 review — merge semantics, cardinality, residual
Five fixes from PR #68 review (Cursor Bugbot + Codex + Cubic): * **scan_with_pending gains merge-shadow semantics** (Codex P1, Cubic P1#1): new `key_column: Option<&str>` parameter. When set, committed rows whose key value appears in any pending batch are excluded from the scan — making `scan_with_pending` correctly merge-semantic for chained updates instead of naively unioning. execute_update calls with Some("id"). Without this, a chained `update where age > 30` could match a row whose pending value already moved out of range. * **Multi-delete on same table no longer trips ExpectedVersionMismatch** (Cursor Bugbot HIGH): open_table_for_mutation routes through reopen_for_mutation when staging.inline_committed has the table, using the post-inline-commit Lance version captured at record_inline time. The legacy open_for_mutation_on_branch fence (Lance HEAD == manifest pinned) is correct cross-writer but wrong intra-query when deletes have already advanced HEAD on this table. Branch goes away when Lance ships two-phase delete (lance-format/lance#6658). * **Cardinality validation consolidated** (Cursor LOW + Codex P2 + Cubic P1#2 + Cubic P2): new exec/staging::count_src_per_edge + enforce_cardinality_bounds shared by mutation and loader paths. Restores the missing min-cardinality check on the engine path. Loader Merge mode passes Some("id") to dedupe edges being updated by id (not double-count committed + pending). Loader Append mode and engine path pass None (ULID-generated ids never collide). * **Dead count_rows_with_pending removed** (Cursor LOW): never called. * **Misleading concat-helper comment fixed** (Cubic P3): claimed schema normalization the helper doesn't implement. Updated to match reality. * **Documentation honesty** (Cubic P1#3): MR-794 narrows but doesn't eliminate the "Lance HEAD ahead of __manifest" drift class. Drift is unreachable for op-execution failures (the partial_failure test pins this), but a residual remains at the finalize→publisher boundary because Lance has no multi-dataset commit primitive: per-table commit_staged calls run sequentially before manifest commit. Updated docs/runs.md, docs/invariants.md §VI.25, docs/releases/v0.4.1.md to scope the claim precisely. * **Failpoint test pinning the residual**: new mutation.post_finalize_pre_publisher failpoint + two tests in tests/failpoints.rs that confirm the documented residual behavior. Catches future regressions that widen the residual. Test additions on tests/runs.rs: * chained_updates_with_overlapping_predicate_respects_intermediate_value * multi_statement_delete_on_same_node_table * cascade_delete_node_then_explicit_delete_edge_on_same_table * mutation_insert_edge_enforces_min_cardinality * load_merge_mode_dedupes_edge_for_cardinality_count 113/113 engine integration tests pass (runs + end_to_end + consistency + staged_writes + validators). Failpoints feature build runs in CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
a61e82f47a
commit
3223b51cf1
9 changed files with 828 additions and 199 deletions
36
docs/runs.md
36
docs/runs.md
|
|
@ -75,6 +75,42 @@ will replace it. Operator-driven (rare in agent workloads); document
|
|||
permanently until Lance exposes `Operation::Overwrite { fragments }` as
|
||||
a two-phase op.
|
||||
|
||||
### Finalize → publisher residual
|
||||
|
||||
The staged-write rewire eliminates one drift class **by construction at
|
||||
the writer layer**: an op that fails before pushing to the in-memory
|
||||
accumulator (validation errors, missing endpoints, parse-time D₂
|
||||
rejection) leaves Lance HEAD untouched on every staged table. This is
|
||||
the case the `partial_failure_leaves_target_queryable_and_unblocks_next_mutation`
|
||||
test pins.
|
||||
|
||||
A second, narrower drift class remains. `MutationStaging::finalize`
|
||||
runs `stage_*` + `commit_staged` per touched table sequentially, then
|
||||
the publisher commits the manifest. Lance has no multi-dataset atomic
|
||||
commit, so the per-table `commit_staged` calls are independent
|
||||
operations: if commit_staged on table N+1 fails *after* commit_staged
|
||||
on tables 1..N succeeded, or if the publisher's CAS pre-check rejects
|
||||
*after* every commit_staged succeeded, tables 1..N are left at
|
||||
`Lance HEAD = manifest_pinned + 1`. The next mutation against those
|
||||
tables surfaces `ManifestConflictDetails::ExpectedVersionMismatch` —
|
||||
the same loud failure mode the rewire was designed to make rare, just
|
||||
no longer "unreachable."
|
||||
|
||||
Triggers: transient Lance write errors during finalize (object-store
|
||||
retry budget exhaustion, disk full); persistent publisher contention
|
||||
exceeding `PUBLISHER_RETRY_BUDGET = 5` retries. Closing this requires
|
||||
either a Lance multi-dataset atomic-commit primitive (filed upstream
|
||||
alongside the two-phase delete request) or a manifest-layer journal
|
||||
that replays staged commits on next open. Both are heavyweight; the
|
||||
v1 stance is "narrowed window, documented residual, surface the loud
|
||||
error when it fires."
|
||||
|
||||
The publisher-CAS contract is unchanged: a *concurrent writer* that
|
||||
advances any of our touched tables between snapshot capture and
|
||||
publisher commit produces exactly one winner. The residual above is
|
||||
about *our* abandoned commits in the failure path, not about
|
||||
concurrency races.
|
||||
|
||||
## Conflict shape
|
||||
|
||||
Concurrent writers to the same `(table, branch)` produce exactly one
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue