omnigraph

mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-09 01:35:18 +02:00

Author	SHA1	Message	Date
Ragnor Comerford	3ad359db8b	tests: admission test uses new_with_workload, drops env mutation + #[serial] Migrates `ingest_per_actor_admission_cap_returns_429` from env-var override to direct `WorkloadController::new(1, ...)` construction via `AppState::new_with_workload`. Removes the `EnvGuard` and the `#[serial]` annotation that paired with it. Why correct by design (AGENTS.md rule 9): the previous round's matrix fix (commit `8bd9a5f`) shielded the matrix from this test's env mutation, but the broader bug class — "test A's process-wide env mutation can leak into any test B that calls `AppState::open` / `WorkloadController::from_env()`" — was still reachable by any future test that didn't think to opt out. Closing the class at the source: this test no longer mutates global state at all, so no other test needs to defend against it. Net effect: - This test no longer needs `#[serial]` (was the only reason it was marked) — runs in parallel with the rest of the suite. - The matrix's defensive `with_defaults()` construction (commit `8bd9a5f`) remains correct but is no longer required for correctness; it's now a "belt and suspenders" guard against any FUTURE env-mutating test. Verified locally: both tests pass when run together; full server suite (44 tests) green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 20:35:41 +02:00
Ragnor Comerford	8bd9a5ff14	tests: matrix harness uses with_defaults() workload, not from_env() Round 4 CI failure: Test Workspace and server-aws both red on `concurrent_branch_ops_morphological_matrix` cell b ("merge × merge: same-target-distinct-sources") — second merge returned 429 instead of 200. The matrix passes locally. Root cause: cargo test runs tests in parallel by default. The admission test `ingest_per_actor_admission_cap_returns_429` is wrapped with `#[serial]` and an EnvGuard that sets `OMNIGRAPH_PER_ACTOR_INFLIGHT_MAX=1` for its duration. Process-wide env vars are visible to concurrently-running tests; the matrix's `Harness::new()` called `AppState::open()` which delegates to `WorkloadController::from_env()`, picking up cap=1 if it ran while the admission test held the EnvGuard. With cap=1 + 2 concurrent merges in cell b, one merge waits behind merge_exclusive while the other is admitted; the waiter holds its admission permit, but a fresh actor permit is needed when admission is per-actor — the second merge's permit acquisition fails because the first hasn't released yet, and 429 fires. Fix (correct by design, AGENTS.md rule 9): the matrix harness builds the WorkloadController explicitly via `WorkloadController::with_defaults()` and passes it to `AppState::new_with_workload`, the constructor added in commit `22d76db`. Closes the bug class "tests pick up another concurrent test's env override at construction time" — the matrix is now insulated from any env-var manipulation in the rest of the test suite. Verified locally: with `OMNIGRAPH_PER_ACTOR_INFLIGHT_MAX=1` set in the environment, the matrix passes (it ignores env entirely now). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 20:19:42 +02:00
Ragnor Comerford	99b0941478	tests: remove three narrow concurrent_branch_* tests subsumed by T1 The previous commit added `concurrent_branch_ops_morphological_matrix` covering 11 cells with stronger assertions (identity + post-op /change + reopen). The three narrow tests it replaces: - concurrent_branch_create_from_distinct_parents_does_not_corrupt_coordinator → matrix cell f, with identity assertions added - concurrent_branch_merges_distinct_targets_do_not_swap_into_each_other → matrix cells a + b + c, with identity assertions that close the symmetric-swap blind spot cubic flagged on commit `64f2b99` - concurrent_change_during_branch_merge_preserves_writes → matrix cell d The matrix retains the original tests' diagnostic granularity through named cell labels in every assertion message ("[a:merge×merge:distinct-targets] merge a"), so a CI failure points to the exact cell + invariant. Net: 522 lines removed, 0 coverage lost. All other server tests pass unchanged (44 total). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 20:09:21 +02:00
Ragnor Comerford	ac8594462e	tests: branch-ops morphological matrix (T1) Replaces three narrow concurrent_branch_* tests (folded in below) with one parameterized matrix test covering 11 representative (op_a, op_b, target_overlap) cells, asserting C1-C6 uniformly: C1 — both complete (no deadlock; tokio::time::timeout(15s)) C2 — status: both 200 or exactly one clean conflict; never 500 C3 — per-target row count C4 — per-target row identity (named persons present + absent — catches the symmetric-swap class that count assertions miss; cubic P2 on commit `64f2b99` flagged this gap on the round-3 merge race test) C5 — engine state coherent (subsequent /snapshot consistent) C6 — post-op /change on main succeeds (engine isn't poisoned) Cells: a. Merge × Merge, distinct targets — branch_merge_impl race pin b. Merge × Merge, same target / distinct sources — merge_exclusive serialization c. Merge × Merge, same source / distinct targets — fanout d. Merge × Change, into target — per-(table, branch) queue e. Merge × BranchCreateFrom, target — interaction with refresh path f. BranchCreateFrom × BranchCreateFrom, distinct parents — round-1 race pin g. BranchCreateFrom × BranchDelete, unrelated branches — disjoint state h. BranchDelete × BranchDelete, distinct branches — concurrent refresh i. BranchDelete × Change, distinct branch — refresh-side vs writer j. BranchCreateFrom × Change, on source — fork-while-writing k. Reopen consistency after concurrent pair — disk-vs-cache drift Each cell: - spins up its own tempdir + AppState so failures don't cascade, - aligns the pair at a tokio::sync::Barrier so both reach the engine close in time, - wraps in a 15s deadlock timeout, - asserts identity via a /read with the `get_person` fixture query (specific names must be present on the right branch and absent from the wrong one). Subsumes: - concurrent_branch_create_from_distinct_parents_does_not_corrupt_coordinator (now cell f, with identity assertions added) - concurrent_branch_merges_distinct_targets_do_not_swap_into_each_other (now cells a + b + c, with identity assertions; the symmetric-swap blind spot cubic flagged on commit `64f2b99` is closed) - concurrent_change_during_branch_merge_preserves_writes (now cell d) Those three narrow tests are removed in the next commit so this lands green standalone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 20:07:37 +02:00
Ragnor Comerford	64f2b994f5	bench: assert --heavy-concurrency > 0 instead of silently clamping Closes the cubic P2 finding on commit `22d76db`: `Semaphore::new(concurrency.max(1))` silently coerced --heavy-concurrency=0 to 1, so the JSON output reported 0 while execution actually used 1. Reported settings differed from actual. Adds an explicit `--heavy-concurrency > 0` check in `main()` (with a helpful error message pointing to --heavy-batches=0 as the way to disable heavy traffic) and a defensive `assert!()` inside `drive_heavy_actor` so future callers can't pass 0 silently. Verified: `bench_actor_isolation --heavy-concurrency 0` exits with code 2 and the explanatory error message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 19:23:02 +02:00
Ragnor Comerford	3e6b2af4e9	engine: serialize concurrent branch merges via merge_exclusive mutex Closes the Cursor Bugbot HIGH on commit `22d76db` (round 2 review): `branch_merge_impl` at `crates/omnigraph/src/exec/merge.rs:1085-1100` still used the swap_coordinator_for_branch + operate + restore_coordinator pattern across three separate `coordinator.write().await` acquisitions. Two concurrent merges with distinct targets would interleave their swaps, leaving each merge's body running against the other's swapped coord — A's `feature_a → target_a` would land its rewrite in target_b instead. Adds `merge_exclusive: Arc<tokio::sync::Mutex<()>>` to `Omnigraph`, held across the entire swap → operate → restore window in `branch_merge_impl`. Concurrent branch merges now serialize relative to each other; everything else (per-(table, branch) writer queues, /change, /ingest) is unaffected. Why the mutex rather than the deeper "operate on local coord" refactor (the round-1 fix shape applied to `branch_create_from`): `branch_merge_on_current_target` calls `self.snapshot()` and `self.ensure_commit_graph_initialized()` internally, which use `self.coordinator` directly. Threading an explicit target coord parameter through the merge body would unwind dozens of call sites. The mutex is a smaller intrusion that fully closes the race. Documented as a follow-up if telemetry shows merge concurrency matters. Pinned by `concurrent_branch_merges_distinct_targets_do_not_swap_into_each_other` (previous commit). Pre-fix: M=4 iterations of concurrent merges deterministically corrupted target row counts. Post-fix: all M iterations land each merge on its declared target. The two adjacent branch concurrency tests (`concurrent_change_during_branch_merge_preserves_writes`, `concurrent_branch_create_from_distinct_parents_does_not_corrupt_coordinator`) still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 19:14:54 +02:00
Ragnor Comerford	2b2e723125	tests: pin branch_merge swap-restore race (red) Per AGENTS.md rule 8, this commit lands the failing regression test ahead of the fix. Cursor Bugbot HIGH on commit `22d76db` rediscovered the residual flagged in the round 1 honest-review note: `branch_merge_impl` at `crates/omnigraph/src/exec/merge.rs:1085-1100` still uses the swap_coordinator_for_branch + operate + restore_coordinator pattern across three separate `coordinator.write().await` acquisitions. The same shape that branch_create_from_impl shed in commit `4ffbf6e`. The test spawns two concurrent /branches/merge calls A (feature-a → target-a) and B (feature-b → target-b) aligned at a tokio::sync::Barrier so both reach swap_coordinator_for_branch close in time. M=4 iterations boost race-catching odds. Currently fails on `22d76db` with target-a=5, target-b=4: B's merge landed on the wrong coord — target-b never got Frank because A's swap pushed self.coordinator to target-a, B's swap captured target-a as B's "previous", and B's restore set self.coordinator back to target-a (not the original main). Subsequent operations using self.coordinator point at the wrong branch. Fix lands in the next commit: serialize concurrent branch merges via `merge_exclusive: Arc<tokio::sync::Mutex<()>>` held across the entire swap-operate-restore window. Closes the bug class "non-atomic three-step coordinator manipulation" for branch_merge by serializing merges relative to each other; per-(table, branch) queue inside the merge body still lets merges and other writers run concurrently. A deeper "operate on local coord" refactor (the round-1 fix shape for branch_create_from) requires unwinding `branch_merge_on_current_target` and its uses of `self.snapshot()` / `self.ensure_commit_graph_initialized()`; deferred to a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 19:12:03 +02:00
Ragnor Comerford	22d76dbb40	server+bench: AppState::new_with_workload; bench drops set_var, exercises heavy cap Two cubic findings on bench_actor_isolation.rs flagged together: P2 (lib.rs:202): `unsafe { std::env::set_var(...) }` ran inside `#[tokio::main] async fn main()` AFTER the multi-thread tokio runtime was up. Rust 2024 made `set_var` unsafe because libc's `setenv` is not thread-safe; concurrent env reads from logging or runtime internals can race or read torn state. Fix (correct by design, AGENTS.md rule 9): add a public `AppState::new_with_workload(uri, db, bearer_tokens, workload)` constructor that takes a caller-built `WorkloadController`. Tests and benches override per-actor caps via the constructor instead of mutating global env. Closes the bug class "tests need to mutate global env to override AppState defaults." P2 (lib.rs:130): heavy actor's `oneshot.await` inside the loop serialized — heavy in-flight count was always 1, so cap=1 never tripped on the heavy side. The bench validated isolation (light p99 bounded) but didn't demonstrate the rejection path. Fix: add a `--heavy-concurrency` arg (default 4) and spawn batches as concurrent tokio tasks bounded by an internal semaphore. With heavy_concurrency=4 and inflight_cap=1, the bench now reports heavy_too_many_requests > 0 and heavy_ok == 1 at peak — proving the gate fires for the heavy actor. Sample run on local FS (4 light actors × 30 ops, 20 heavy batches × 50 rows, heavy_concurrency=4, cap=1): heavy_ok: 1 heavy_too_many_requests: 19 light_ok: 120 light_too_many_requests: 0 light_p99: 565 ms (target < 2 s) Heavy saturates its own cap; light actors are completely unaffected. The isolation property is now empirically proven by the rejection counts rather than just by the latency tail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:57:42 +02:00
Ragnor Comerford	8e1a8e7d55	server: document 429 / 503 in admission-gated endpoint OpenAPI responses Closes the cubic finding (P2) at lib.rs:1061: the new admission gates add HTTP 429 / 503 failure paths but the affected endpoint `#[utoipa::path(... responses(...) ...)]` annotations weren't updated. Also closes a pre-existing miss on /change (admission-gated since PR 2 Step F). Adds (status = 429, ...) and (status = 503, ...) to all six admission-gated endpoints: - POST /change (operation_id = "change") - POST /schema/apply (operation_id = "applySchema") - POST /ingest (operation_id = "ingest") - POST /branches (operation_id = "createBranch") - DELETE /branches/{branch} (operation_id = "deleteBranch") - POST /branches/merge (operation_id = "mergeBranches") The descriptions reference the `Retry-After` header, which the `IntoResponse for ApiError` impl emits on both codes (added in commit `c745dd6`). openapi.json regenerated via OMNIGRAPH_UPDATE_OPENAPI=1; the openapi sentinel test passes both with the regen flag and in strict-check mode. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:49:02 +02:00
Ragnor Comerford	7fc00142a4	engine: scope refresh() write guard to recovery; release before schema reload Closes the HIGH-severity deadlock flagged by Cursor Bugbot on PR #75 review of commit `b09a097`. Pre-fix: `Omnigraph::refresh()` held `coordinator.write().await` from omnigraph.rs:468 through function exit, including across the call to `reload_schema_if_source_changed()` at line 484. That helper's `self.coordinator.read().await` (only reached when on-disk schema source differs from in-memory cache) deadlocked against the outer write guard because tokio's RwLock is non-reentrant. Reachable from `branch_delete` (omnigraph.rs:910) and `branch_merge` (post-merge refresh at merge.rs:1100). Cross-handle scenario: handle A calls apply_schema, handle B's stale cache hits the reload path on its next refresh. Why correct by design (AGENTS.md rule 9): the write guard's purpose is to serialize the recovery sweep's mutation of GraphCoordinator; the schema reload reads coord.branch_list() and stores into the ArcSwap'd schema_source / catalog without touching the coord. The two operations have disjoint lock requirements; coupling them was over-locking. Scoping the guard matches the natural data-flow: snapshot recovery state under the write, release, then reload schema using a fresh read on the same lock. Pinned by `composite_flow_schema_apply_then_branch_ops_no_deadlock_in_refresh` (previous commit). Pre-fix: 15s timeout fires. Post-fix: completes in 0.25s. Both other composite_flow tests still pass: canonical_lifecycle and multi_branch_sequential_merges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:47:08 +02:00
Ragnor Comerford	8686b1deed	tests: pin refresh() deadlock after schema_apply (red) Per AGENTS.md rule 8, this commit lands the failing regression test ahead of the fix so the red → green pair is visible in git log. Cursor Bugbot flagged the deadlock at HIGH severity on commit `b09a097`: `Omnigraph::refresh()` holds `coordinator.write().await` from omnigraph.rs:468 through function exit, including across the call to `reload_schema_if_source_changed()` at line 484. That helper, when the on-disk schema source differs from the in-memory cache, attempts `self.coordinator.read().await` at line 496. Tokio's RwLock isn't reentrant — the read blocks waiting for the write to release, the write isn't released until refresh() returns. Hard hang. Reachable from `branch_delete` (omnigraph.rs:910 calls `self.refresh()`) and `branch_merge_as` (post-merge refresh at merge.rs:1100). Cross-handle setup is the realistic trigger: handle A applies a schema, advancing _schema.pg on disk and updating A's ArcSwap cache in-line; handle B has stale in-memory schema_source. B's next refresh() (here via branch_delete) hits the read-after-write reload path because B's cache no longer matches disk. Single-handle is unreachable since apply_schema updates the local cache atomically. Test currently fails on `b09a097` with the timeout firing at 15s, proving branch_delete hung. The next commit scopes the write guard to the recovery section so reload_schema_if_source_changed runs without the write held — uncontested read acquisition, no deadlock. The test extends `composite_flow.rs` with a broader sequence (apply_schema → branch_create → branch_delete → branch_merge → mutate with new column → reopen) so the post-fix path's correctness is pinned alongside the deadlock pin per the user's request. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:46:07 +02:00
Ragnor Comerford	b09a0972cb	bench: add actor-isolation harness for WorkloadController Empirical proof of MR-686's central design promise: per-actor admission control isolates noisy actors from light traffic. The existing bench_concurrent_http harness measures aggregate throughput; this harness measures the latency tail seen by light actors while a heavy actor saturates its own per-actor cap. Setup: one "heavy" actor flooding /ingest with multi-row NDJSON batches; N "light" actors each running short bursts of /change inserts, each authenticating with a distinct bearer token so the WorkloadController accounts them as separate identities. Output: heavy throughput / 429 count, light p50/p95/p99/max latency. Acceptance heuristic on local FS: light-actor p99 < 2 s while the heavy actor saturates its own cap. Sample run on local FS, cap=1, 4 light actors x 30 ops, 20 heavy batches x 50 rows: light p99 = 710 ms, light errors = 0 (well under the 2 s acceptance target). The test demonstrates the isolation property — the heavy /ingest holds its own admission slot but doesn't affect light actors since they have separate per-actor state. Usage: cargo run --release -p omnigraph-server --example bench_actor_isolation -- \ --light-actors 4 --light-ops-per-actor 30 \ --heavy-batches 20 --heavy-rows-per-batch 50 \ --inflight-cap 1 \ --output .context/bench-results/after-pr2-phase2/actor-isolation.json Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:12:50 +02:00
Ragnor Comerford	976aa0ec1d	tests: pin concurrent /change + branch_merge interleave preserves writes Future-proofs against MR-895 work that may move or remove the per-(table, branch) writer queue acquisition inside `branch_merge` (`crates/omnigraph/src/exec/merge.rs:1224`). Today the queue linearizes a concurrent /change on main against a `branch_merge feature → main` on the same touched tables; both succeed and the inserted row is preserved post-merge. Codex flagged this scenario as a P1 in PR #75 review claiming the merge could silently overwrite concurrent target writes because the source-rewrite path opens with `MutationOpKind::Merge` (skipping the strict pre-stage check). Validation showed the queue at merge.rs:1224 is held across both Phase B (per-table commit_staged) and Phase C (manifest publish), so there's no interleave window. The Merge op_kind only affects same-process pre-stage drift detection, not cross-write linearization. The test passes on `f925ad1`; landing it as a regression sentinel catches future changes that drop the queue acquisition. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:03:05 +02:00
Ragnor Comerford	5520ab72ff	tests: pin disjoint /change concurrency at HTTP level Closes the cubic acceptance-criteria gap (❌ "Integration test: two /change requests targeting different (table_key, branch) execute concurrently end-to-end"). The bench harness measures the throughput side; this test is the regression sentinel that catches a future change which accidentally re-introduces graph-wide serialization on the disjoint path. Spawns 4 concurrent /change inserts on node:Person and 4 on node:Company. All 8 must return 200, and the post-test row counts on each table must reflect every insert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:01:52 +02:00
Ragnor Comerford	6ef07386d3	docs+engine: refresh server.md rate-limiting note; cache version() TOCTOU Two cleanups bundled because they're both single-line, post-MR-686 hygiene flagged by cubic during PR review: - docs/server.md:102 said "Rate limiting — none" while the new admission-control section earlier in the file documents 429s on the five mutating handlers. Replace with a pointer to the admission section and clarify that no graph-wide rate limiter is wired. - schema_apply.rs:445-451 called `db.version().await` twice — once for the conditional check, once in the error format string — creating a cosmetic TOCTOU under interior mutability. Cache the result in `current_manifest_version` so the error message reflects the version that triggered the rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:59:45 +02:00
Ragnor Comerford	c745dd69ae	server: emit Retry-After header on 429 / 503 responses Closes the doc-vs-code gap at api.rs:343 and lib.rs:344-355: the documentation claims `Retry-After` is set on TooManyRequests / ServiceUnavailable responses, but `IntoResponse for ApiError` emitted only `(StatusCode, Json(ErrorOutput))` — no header. Wires a constant `RETRY_AFTER_SECONDS = "60"` for both 429 and 503 codes. Plumbing per-RejectReason durations through is a follow-up; the admission rejects we surface today recover bounded by request handler duration rather than calendar wait, so a constant suffices. Pinned by `ingest_per_actor_admission_cap_returns_429`. Test now fully green: 1+ of 8 concurrent /ingest under cap=1 receives 429 with Retry-After: 60. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:58:47 +02:00
Ragnor Comerford	05a8bd5de1	server: gate /ingest /branches/* /schema/apply on per-actor admission Closes the gap that admission control only fired on /change. A heavy actor sending bulk-ingest traffic could exhaust shared engine capacity (Lance I/O threads, manifest churn) without hitting the per-actor cap. Wires `state.workload.try_admit(&actor_arc, est_bytes)` into the five remaining mutating handlers AFTER Cedar authorization (so denied requests don't consume admission slots) and BEFORE the engine call. Byte estimates per handler: - /ingest: request.data.len() (NDJSON body) - /schema/apply: request.schema_source.len() - /branches/create, /branches/delete, /branches/merge: 256 (small JSON; the heavy work is bounded per-(table, branch) by the engine's writer queue rather than by request size) The admission guard is held in `let _admission = ...` so it stays alive until handler return, releasing the count permit + decrementing the byte budget on drop. Pinned by `ingest_per_actor_admission_cap_returns_429` (previous commit). The test still fails on the Retry-After header assertion; the next commit emits the header. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:57:53 +02:00
Ragnor Comerford	0976cbebc5	tests: pin /ingest admission gate + 429 Retry-After (red) Per AGENTS.md rule 8, this commit lands the failing regression test ahead of the fix. Currently fails on `f925ad1` with 8/8 statuses returning 200 because /ingest does not call WorkloadController::try_admit. The test pins: - /ingest is gated on per-actor admission control (returns 429 when the cap is exceeded). - 429 responses carry the structured `code: too_many_requests` error body so clients can distinguish them from generic conflicts. - 429 responses include a `Retry-After` header so clients can implement bounded backoff. The doc claim at api.rs:343 and lib.rs:344 was that this header exists; the IntoResponse impl currently emits no headers. Two follow-up commits will turn this green: 1. Wire WorkloadController::try_admit on /ingest and the four other mutating handlers (Block 2.1). 2. Emit the Retry-After header on 429/503 responses (Block 2.2). The test uses #[serial] + EnvGuard to override OMNIGRAPH_PER_ACTOR_INFLIGHT_MAX=1 without racing parallel tests, then spawns 8 concurrent /ingest tasks aligned at a tokio::sync::Barrier so multiple tasks reach try_admit close in time. With cap=1, at least one must be rejected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:57:01 +02:00
Ragnor Comerford	c263732b1a	tests: extend same-key insert test with /snapshot row-count assertion The existing change_concurrent_inserts_same_key_serialize_without_409 test claimed in its comment "asserts the final row count equals N" but only checked HTTP status codes. cubic flagged the gap; this commit adds the actual /snapshot read after the concurrent inserts to verify all N batches landed (no silent overwrite) by comparing the post-test node:Person row_count against SEED + N. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:49:38 +02:00
Ragnor Comerford	4ffbf6ec61	engine: drop swap-restore in branch_create_from; operate on local coord Closes the swap-restore race in `branch_create_from_impl` by simply not touching `self.coordinator` at all. Open the source-branch coordinator locally, call `branch_create` on it, drop it. The new branch is durable on disk via the manifest write that `GraphCoordinator::branch_create` issues on its own commit graph; subsequent reads of any coord will see it after their normal manifest refresh. Pre-fix: `branch_create_from_impl` ran swap → operate → restore as three separate `coordinator.write().await` acquisitions. Under `&self` concurrency, two callers with distinct source branches could interleave their swaps, leaving each caller's "operate" step running against the other's swapped coordinator and forking the new branch off the wrong HEAD. Pinned by `concurrent_branch_create_from_distinct_parents_does_not_corrupt_coordinator` (previous commit) which deterministically reproduced the race with 8/8 forks landing on the wrong parent. Why correct by design (AGENTS.md rule 9): closing the bug class "non-atomic three-step coordinator manipulation under &self callers" by removing the manipulation entirely. There's no scratch-space race to lose because there's no scratch space. Note: `branch_merge_impl` at `crates/omnigraph/src/exec/merge.rs:1085-1100` keeps the same swap-restore pattern. Its inner `branch_merge_on_current_target` calls `self.snapshot()` and `self.ensure_commit_graph_initialized()` which acquire the coord lock independently, so the simple "operate on local coord" refactor doesn't compose without a deeper interface change. The per-(table, branch) writer queue inside the merge body (`crates/omnigraph/src/exec/merge.rs:1224`) bounds the damage in practice; a deterministic regression for concurrent merges is tracked under Block 3.1 of the plan. `swap_coordinator_for_branch` and `restore_coordinator` remain crate-internal for now (still used by `branch_merge_impl`); a follow-up can remove them if the merge path is similarly refactored. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:48:17 +02:00
Ragnor Comerford	3b33e9ac56	tests: pin branch_create_from swap-restore race (red) Per AGENTS.md rule 8, this commit lands the failing regression test ahead of the fix so the red → green pair is visible in git log. The test demonstrates that two concurrent `POST /branches` calls with distinct `from` parents corrupt coordinator state: A's "operate" step runs against B's swapped coordinator instead of its own, forking the new branch off the wrong parent's HEAD. Currently fails on `f925ad1` with all 8 gamma branches (declared parent: alpha, 5 rows) reporting 4 rows — beta's row count. The operate step ran against beta's coord because B's swap interleaved between A's swap and A's operate. Fix lands in the next commit: hold a single `coordinator.write().await` guard across the entire swap-operate-restore sequence in `branch_create_from_impl` so the three steps are atomic relative to other callers. Closes the bug class "non-atomic three-step coordinator manipulation under &self callers" rather than guarding the specific call site — the right architectural seam (single critical section per swap-restore sequence) eliminates the interleave window for branch_create_from and any future swap-restore caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:44:50 +02:00
Ragnor Comerford	4ca527cc53	staging: op-kind-aware drift check at commit_all entry Closes the bug class "Lance internal conflict surfaces as 500 instead of 409" for in-process concurrent strict-op writers on the same row. Pre-fix: in `MutationStaging::commit_all`, after queue acquisition, the staged Lance transaction (built against V0) was handed straight to `commit_staged`. When Lance HEAD has advanced past V0 (because the queue's prior winner already published), Lance's transaction conflict resolver fires `RetryableCommitConflict` for Update vs Update on the same row, which wraps as `OmniError::Lance(<string>)` and the API maps it to HTTP 500. Users see "internal server error" instead of a clean retryable conflict. Fix: track the strictest `MutationOpKind` per touched table on `MutationStaging` and propagate through `StagedMutation`. In `commit_all`'s recapture loop, before each `commit_staged`, fail-fast with `OmniError::manifest_expected_version_mismatch` (→ HTTP 409 ExpectedVersionMismatch) for tables whose tracked op_kind has `strict_pre_stage_version_check() == true` (Update/Delete/SchemaRewrite) when the staged dataset's version doesn't match the fresh manifest pin under the queue. Insert/Merge tables skip the check — concurrent inserts on disjoint keys legitimately coexist via Lance's auto-rebase, so the check would over-reject the existing same-key insert path. Threading: `ensure_path` now takes `op_kind` and stores it in a new `op_kinds: HashMap<String, MutationOpKind>` on `MutationStaging`, with strictness-upgrade semantics so mixed insert+update on the same table still fires the strict check at commit time. `StagedMutation` carries `op_kinds` through to `commit_all`. Pinned by `change_concurrent_updates_same_key_serialize_via_publisher_cas` in `crates/omnigraph-server/tests/server.rs` (added in the previous commit). All Phase 2 sentinels still pass: change_concurrent_inserts_same_key_serialize_without_409, change_conflict_returns_manifest_conflict_409, branch_merge_conflict_response_includes_structured_conflicts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:42:14 +02:00
Ragnor Comerford	ebf5a5769d	tests: pin UPDATE RYW under in-process concurrency (red) Per AGENTS.md rule 8, this commit lands the failing regression test ahead of the fix so the red → green pair is visible in git log. The test asserts the RYW invariant for in-process concurrent UPDATEs on the same row: exactly one writer commits and N-1 receive 409 manifest_conflict. Currently fails on `f925ad1` with 1 x 200 + 7 x 500: > "storage: Retryable commit conflict for version 6: This Update > transaction was preempted by concurrent transaction Update at > version 6. Please retry." Lance's transaction conflict resolver correctly detects the Update vs Update race, but the error wraps as `OmniError::Lance(<string>)` and the API surfaces it as 500 internal rather than 409 retryable conflict. Users see "internal server error" for what is documented as a retryable conflict path. The fix lands in the next commit: an op-kind-aware drift check at the commit_all entry that returns 409 ExpectedVersionMismatch for tables whose first touch was Update / Delete / SchemaRewrite when the staged dataset version drifts from the manifest pin under the queue. Closes the bug class "Lance internal conflict surfaces as 500 instead of 409" rather than mapping the specific Lance error variant — the right architectural layer (engine boundary, under the queue) catches the drift before commit_staged ever runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:33:53 +02:00
Ragnor Comerford	6bd9f1c085	agents: rules 8 and 9 — test-first for bug fixes; correct by design Add two durable engineering rules to the maintenance contract so they load into context on every session: - Rule 8: write a regression test that reproduces the bug first, confirm it fails, land it just before the fix commit so the red→green pair is visible in git log. A reviewer can check out the test commit alone and reproduce the failure. - Rule 9: when a bug surfaces, identify the root cause and make the fix correct by construction. Don't patch the symptom. If the design admits the bug class, close the class — don't add a guard around the latest instance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:26:34 +02:00
Ragnor Comerford	56a479ea2f	tests: failpoints schema_source().as_str() (CI fix) PR 2 made Omnigraph::schema_source() return Arc<String> via ArcSwap, but the failpoints test still compared against &'static str constants. Three E0308 type mismatches were blocking the Test Workspace CI job; this fix restores compilation. - failpoints.rs:125,160,195 now call schema_source().as_str() to align with the &str constants. - Drops 11 unused let mut db = ... bindings on the same path (engine write APIs are &self post PR 2 Step C). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:26:23 +02:00
Ragnor Comerford	f925ad1739	mr-686: Phase 2 — op-kind-aware version check + coord Mutex → RwLock Fix A: op-kind-aware ensure_expected_version. Insert/Merge skip the strict pre-stage check; Update/Delete/SchemaRewrite keep it. New MutationOpKind enum threaded through open_for_mutation_on_branch / open_owned_dataset_for_branch_write / reopen_for_mutation and all callers (execute_insert/update/delete_node/delete_edge, branch_merge::publish_rewritten_merge_table, schema_apply, ensure_indices_for_branch, loader Append/Merge/Overwrite). Closes the 77% rejection rate on same-key concurrent inserts. Fix B: coordinator Mutex -> RwLock. Reads parallelize via .read(); writes serialize via .write(). Atomic-commit invariant preserved by the single .write() covering commit_manifest_updates + record_graph_commit. Bench-as-test change_concurrent_inserts_same_key_serialize_without_409 (server.rs:2180) spawns 12 concurrent /change inserts on a single (table, branch); asserts every request returns 200. Was failing pre-Phase-2; passes post-Phase-2. change_conflict_returns_manifest_conflict_409 (cross-process drift sentinel) and branch_merge_conflict_response_includes_structured_conflicts both still pass. Bench (after-pr2-phase2): - single-actor 1x1: 14.9 ops/s, p50 68ms (baseline 12.3, +22%) - disjoint 8x8: 7.04 ops/s, p50 1023ms (baseline 6.24, +13%) - same-key 8x1: 2.62 ops/s, 0 errors (after-pr2: 77% errors) Disjoint stayed at +13% — Fix B's RwLock helped read paths but the publisher's .write() critical section still serializes graph-wide. Splitting GraphCoordinator into per-concern primitives (manifest in ArcSwap, commit_graph in RwLock, atomic-commit serializer) is the deferred next step. 102 lib + 30 branching + 24 runs + 16 staged_writes + 63 end_to_end + 40 server tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 12:42:26 +02:00
Ragnor Comerford	b93a130b40	staging: re-capture per-branch snapshot under queue (fixes cross-branch fail) The `bdd6440` commit re-captured expected_versions from `db.snapshot()` (bound-branch view). That broke any mutation on a non-bound branch: when the engine handle is bound to main but the mutation targets feature, the bound-branch snapshot returns main's pin for each table, not feature's. The publisher commits to feature, reads feature's manifest entry, sees a different version → 409 even though no concurrent writer existed. Reproduced by `branch_merge_conflict_response_includes_structured_conflicts` which mutates main then mutates feature on the same Omnigraph handle — the second mutation failed with "expected V6, current V5". Switch the re-capture to `db.snapshot_for_branch(branch).await` so the per-branch entries are resolved correctly. This is one fresh manifest read per mutation (the same I/O PR 1b had pre-Step-D), but it is now required for cross-branch correctness — Step D's "in-memory under single-coordinator invariant" rationale was only sound for single-branch workloads. Single-table same-branch mutations could still skip this read (queue exclusivity makes the publisher CAS a no-op), but the conditional adds complexity for marginal gain. Left as a follow-up perf optimization tracked in `.context/bench-results/comparison.md`. Bench numbers updated: - single-actor 1x1: 15.2 ops/s vs baseline 12.3 (+24%) - disjoint 8x8: 7.12 ops/s vs baseline 6.24 (+14%) - same-key 8x1: 77% errors via the strict ensure_expected_version check upstream of commit_all; same-key concurrent-write fix is a separate follow-up. All 102 lib + 39 server + 24 runs + 30 branching + 20 traversal + 9 validators tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 19:44:14 +02:00
Ragnor Comerford	bdd6440c83	staging: re-capture expected_versions under queue (PR 2 Step D fix) The Step D commit (`1b0a2c9`) skipped revalidation for single-table mutations, betting that the publisher's CAS would be a no-op under the per-(table, branch) queue. The bench falsified this: expected_versions was captured during stage_all (BEFORE acquire_many), so by the time the queue acquired and the publisher ran, those captured pins were stale w.r.t. any in-process concurrent writer that had published in between. Same-key 8x1 produced ~99% manifest_conflict 409 rejections because every actor after the first carried stale expected_versions. Fix: always re-read the in-memory snapshot under the queue and overwrite expected_versions with the current per-table values. Single-coordinator invariant (one Arc<Omnigraph> per process) makes this safe with zero I/O — publishes update the shared coordinator BEFORE releasing queue guards, so a contending tenant's read sees a fresh view by the time it acquires its keys. The publisher's CAS becomes a correct no-op for queued tables; cross-process drift (coord stale because coord doesn't see external publishes) still rejects via the publisher CAS as ExpectedVersionMismatch -> 409, preserving the change_conflict_returns_manifest_conflict_409 regression sentinel. Trade-off documented in the comment: SERIALIZABLE-opt-in writes (§VI.36 aspirational) will need an additional revalidation step here; the bench's append/upsert pattern is fine because Lance's natural rebase handles concurrent writes onto the same dataset. Bench results captured at .context/bench-results/after-pr2/ + .context/bench-results/comparison.md: - single-actor 1x1: 15.0 ops/s vs baseline 12.3 (+22%) - disjoint 8x8: 7.03 ops/s vs baseline 6.24 (+13%) - same-key 8x1: still rejected (76% errors) by the ensure_expected_version strict check upstream of commit_all; follow-up to address. Disjoint's 13% is below the master plan's ≥8× target. Bench shows the coordinator Mutex is now the dominant serializer; relaxing to RwLock for snapshot/version reads is the next perf step, tracked as a follow-up in comparison.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 19:28:36 +02:00
Ragnor Comerford	7aca6ddac5	docs: PR 2 documentation pass (server / architecture / §VI.23) - docs/server.md: new "Per-actor admission control (MR-686)" section documenting WorkloadController defaults, the 429/503 mapping with Retry-After semantics, the Cedar-then-admission ordering, and the /change-only-for-now scope. Adds 429 / 503 to the listed HTTP status codes and `too_many_requests` / `service_unavailable` to the ErrorCode enumeration in the error model paragraph. - docs/architecture.md: server/CLI diagram updated. Adds WorkloadController and WriteQueueManager nodes; flow is HTTP -> auth -> Cedar -> admission -> engine -> queue. Engine label changed to "Arc<Omnigraph>" to reflect the AppState flip. Prose now points at server.md and runs.md for the admission/queue contracts. The CLI's bypass-admission note is preserved. - docs/invariants.md §VI.23 status annotation: explicitly cites the per-(table, branch) writer-queue + revalidation-under-queue as closing the Lance-HEAD-vs-manifest drift class under concurrent writers once the global RwLock is removed (PR 2 Step F). Continuous in-process rollback recovery still aspirational (MR-870 ticket). scripts/check-agents-md.sh passes (26 links, 26 docs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:09:49 +02:00
Ragnor Comerford	c15962e6b0	server: flip AppState to Arc<Omnigraph>, wire admission on /change (PR 2 Step F) The substantive PR 2 change. Removes the global server `RwLock<Omnigraph>` that has serialized every mutating request across all actors. Disjoint `(table, branch)` writes from different actors now run concurrently, guarded only by the engine's per-(table, branch) write queue (PR 1b) and per-actor admission control (PR 2 Step E). AppState changes: - `db: Arc<RwLock<Omnigraph>>` -> `engine: Arc<Omnigraph>` - New field: `workload: Arc<workload::WorkloadController>` initialized from env (`OMNIGRAPH_PER_ACTOR_INFLIGHT_MAX=16`, `OMNIGRAPH_PER_ACTOR_BYTES_MAX=4GiB`, `OMNIGRAPH_GLOBAL_REWRITE_MAX=4`). - `tokio::sync::RwLock` import dropped. Handler updates (16 sites): - All `Arc::clone(&state.db).read_owned().await` and `write_owned()` calls replaced with `let db = &state.engine`. Engine APIs are now `&self` (Step C) so this works directly. - `/export` clones `Arc<Omnigraph>` once and moves into the spawned task instead of acquiring a long-held read lock. - `/change` handler additionally wires `state.workload.try_admit(&actor_arc, est_bytes)`. Cedar runs FIRST so denied requests don't consume admission slots; admission runs SECOND before the engine call. `est_bytes` uses the request body size as a coarse proxy. API surface additions (`api::ErrorCode`): - `TooManyRequests` -> HTTP 429 (per-actor cap exceeded; respect `Retry-After`) - `ServiceUnavailable` -> HTTP 503 (global rewrite pool exhausted) `ApiError` constructors `too_many_requests` / `service_unavailable` and `from_workload_reject` (maps `RejectReason` variants to HTTP status). Other mutating handlers (`/ingest`, `/branches/*`, `/branches/merge`, `/schema/apply`) currently flow through the Arc<Omnigraph> path without admission gates; wiring those is mechanical and lands as a follow-up. The /change hot path covers the bulk of MR-686's load profile. OpenAPI regenerated to include the new ErrorCode variants. 102 lib + 39 server tests + 5 workload tests pass. The regression sentinel `change_conflict_returns_manifest_conflict_409` continues to pass (revalidation perf opt + per-table queue + publisher CAS preserve manifest_conflict semantics under concurrent writers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 17:08:26 +02:00
Ragnor Comerford	17a1665002	server: add WorkloadController for per-actor admission (PR 2 Step E) PR 2 removes the global server `RwLock<Omnigraph>` (Step F). Without admission control, one heavy actor would exhaust shared capacity (Lance I/O threads, manifest churn, network) and starve other actors. The WorkloadController bounds per-actor in-flight count + bytes and provides a global rewrite-pool semaphore for compaction / index builds. New file: `crates/omnigraph-server/src/workload.rs` (~250 LOC + 5 tests). API: - `WorkloadController::new(inflight_cap, byte_cap, rewrite_cap)` / `from_env()` / `with_defaults()`. - `try_admit(actor_id, est_bytes) -> Result<AdmissionGuard, RejectReason>` acquires both an in-flight count permit and adds est_bytes to the per-actor counter atomically; returns RejectReason on either gate. - `try_admit_rewrite() -> Result<RewriteGuard, RejectReason>` for the global rewrite pool (Step F maps RewriteGuard exhaustion to HTTP 503). - `RejectReason::{InFlightCountExceeded, ByteBudgetExceeded, GlobalRewriteExhausted}`. Race-free admission via `tokio::sync::Semaphore::try_acquire_owned()` for the count gate (master plan Finding 6: independent atomic load+check+add lets two callers both pass a cap-N check; the Semaphore gate is atomic). Bytes use `fetch_add` + decrement-on-rejection so the cap is never exceeded even on rollback. Defaults (override via env): - OMNIGRAPH_PER_ACTOR_INFLIGHT_MAX=16 - OMNIGRAPH_PER_ACTOR_BYTES_MAX=4_294_967_296 (4 GiB) - OMNIGRAPH_GLOBAL_REWRITE_MAX=4 Tests cover under-cap admission, byte-budget rollback, per-actor isolation, global rewrite cap, and the load-bearing 32-concurrent-vs- cap-16 race test (forces real contention via a broadcast release channel so guards can't recycle permits task-by-task; pins the master plan's race-free invariant). Adds workspace dep `dashmap = "6"` for per-actor state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:59:45 +02:00
Ragnor Comerford	1b0a2c9310	staging: skip revalidation single-table; in-memory snapshot multi-table (PR 2 Step D) Closes the PR 1b regression (-17% disjoint, -30% same-key) by eliminating the fresh `db.snapshot_for_branch(branch).await` that PR 1b's commit_all issued per mutation. Single-table mutations (`staged.len() + inline_committed.len() == 1`): skip revalidation entirely. The per-(table, branch) Mutex queue holds exclusive while we commit; the publisher's CAS catches any drift that slipped between expected_versions capture and queue acquisition. Conflict cost: 1 orphan Lance HEAD advance, recovered via the existing sidecar protocol on the next ReadWrite open. This is the same trade-off the master plan §"Revalidation perf optimization" prescribes. Multi-table mutations: replace `db.snapshot_for_branch(branch)` (fresh manifest read) with `db.snapshot()` (in-memory). Correct under MR-686's single-process scope because all in-process tenants share one `Arc<Omnigraph>` -> one coordinator; publishes update the shared coordinator BEFORE releasing queue guards, so a contending tenant reads a fresh in-memory view by the time it acquires its queue keys. The within-mutation race (A captures expected_versions[T2]=V0, B publishes T2->V1 during A's stage I/O, A then acquires T2's queue) is caught via the in-memory check. Multi-coordinator deployments (§VI.27 aspirational) would need force-refresh under the queue — documented in §VI's "Explicit non-commitments". Adds a SAFETY comment naming the two load-bearing premises: (1) per-table queue uses exclusive Mutex (not RwLock), and (2) single-coordinator invariant (one Omnigraph engine per process). Migrating either breaks this skip. Regression sentinel `change_conflict_returns_manifest_conflict_409` passes. 102 lib + 24 runs + 16 staged_writes pass with the new path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:53:51 +02:00
Ragnor Comerford	d08c42c369	engine: convert write APIs from &mut self to &self (PR 2 Step C) The interior-mutability primitives from Step B (catalog ArcSwap, schema_source ArcSwap, coordinator Mutex, and RuntimeCache's existing internal locking) make every Omnigraph engine write API safe to expose under &self. This commit flips the public surface so the HTTP server can hold Arc<Omnigraph> in PR 2 Step F instead of Arc<RwLock<Omnigraph>>. Public API conversions: - mutate, mutate_as - ingest, ingest_as, ingest_file, ingest_file_as - load, load_as, load_file - branch_merge, branch_merge_as - apply_schema - ensure_indices, ensure_indices_on - optimize Inner functions converted in lockstep (their signatures must match the new caller shape): - mutate_with_current_actor, ingest_with_current_actor, load_direct_on_branch - execute_named_mutation, execute_insert, execute_update, execute_delete, execute_delete_node, execute_delete_edge - branch_merge_impl, branch_merge_on_current_target - load_jsonl_reader - schema_apply::{apply_schema, apply_schema_with_lock, acquire_schema_apply_lock, release_schema_apply_lock, ensure_schema_apply_idle} - table_ops::{ensure_indices, ensure_indices_on, ensure_indices_for_branch, commit_prepared_updates, commit_prepared_updates_with_expected, commit_prepared_updates_on_branch, commit_prepared_updates_on_branch_with_expected, commit_manifest_updates, record_merge_commit, ensure_commit_graph_initialized, commit_updates_on_branch_with_expected} - optimize::optimize_all_tables - Omnigraph::commit_manifest_updates, record_merge_commit, commit_updates_on_branch_with_expected, ensure_commit_graph_initialized The conversion is mechanical: callers that previously took `db: &mut Omnigraph` now take `db: &Omnigraph`; every interior mutation goes through the existing locks (coordinator.lock().await, store_catalog, runtime_cache.invalidate_all). No new locks acquired, no new lock-order hazards introduced. 102 lib tests + 24 runs + 30 branching + 63 end_to_end + 39 server tests pass. Workspace compiles clean (1 warning on a now-redundant `mut` binding in CLI; cleaned up in a follow-up). The remaining work in PR 2 is the AppState flip (Arc<RwLock<Omnigraph>> -> Arc<Omnigraph> + WorkloadController), the revalidation perf optimization in commit_all, and the WorkloadController itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:52:02 +02:00
Ragnor Comerford	011f9b9610	engine: wrap coordinator in tokio Mutex (PR 2 Step B continued) Wraps the GraphCoordinator field in `Arc<tokio::sync::Mutex<...>>` so engine APIs can move from `&mut self` to `&self` without giving up the coordinator's mutating refresh path. Lock acquisition order: always before runtime_cache (when both are needed in one scope). Critical sections stay short — load+clone for snapshot/version/current_branch, single-method delegations elsewhere. Public API changes: - `Omnigraph::version()` and `Omnigraph::snapshot()` (pub(crate)) become async; callers add `.await`. - `Omnigraph::active_branch()` returns `Option<String>` (cloned) instead of `Option<&str>` borrowed from the coordinator. Callers either `.await` the result + use `.as_deref()`, or hoist into a binding. `&self`-converted methods this round (tied to the coordinator wrap, not the Step C surface refactor): - `swap_coordinator_for_branch` - `restore_coordinator` (now async; was sync) - `sync_branch` - `refresh` - `refresh_coordinator_only` - `reload_schema_if_source_changed` - `branch_create`, `branch_create_from`, `branch_delete`, `branch_list` - `delete_branch_storage_only` - `ensure_branch_delete_safe` - `ensure_schema_apply_idle` - `ensure_schema_apply_idle` helper in schema_apply.rs (matches signature) Caller updates: branch_create_from_impl threads `restore_coordinator`'s new async signature; schema_apply, table_ops, exec/merge wrap every direct `db.coordinator.X()` in `db.coordinator.lock().await.X()`; exec/merge hoists `active_branch_for_keys` once outside the per-table closure that builds queue keys + sidecar pins. All 102 lib tests + 30 branching + 24 runs + 10 lifecycle + 16 staged_writes + 63 end_to_end pass workspace-wide. Zero test regressions; the only behavior change is on the `Omnigraph` API surface (sync -> async on the three accessors above). Step C (engine API conversion: apply_schema, mutate_as, ingest_as, branch_merge_as &mut self -> &self) follows in a subsequent commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:38:48 +02:00
Ragnor Comerford	fcb47620d3	mr-686: bundle PR 0/1a/1b foundation + PR 2 catalog/schema_source ArcSwap Bundles the working-tree state from the prior session (PR 0 bench harness, PR 1a audit_actor_id removal, PR 1b WriteQueueManager + writer integration) together with the first half of PR 2's interior-mutability foundation (catalog and schema_source wrapped in Arc<ArcSwap<...>>). The two streams intermix in 7 of the same files, so splitting via git add -p was impractical. Subsequent PR 2 steps land as separate atomic commits. PR 0 — server-level concurrent /change bench harness - crates/omnigraph-server/examples/bench_concurrent_http.rs (new) - .context/bench-results/{baseline-main,after-pr1}/ (gitignored) PR 1a — drop the audit_actor_id field, thread per-call - removed Omnigraph::audit_actor_id and the swap-restore patterns in mutation.rs, merge.rs, loader/mod.rs - actor_id: Option<&str> threaded through MutationStaging::finalize, mutate_with_current_actor, ingest_with_current_actor, branch_merge_impl, branch_merge_on_current_target, commit_prepared_updates*, record_merge_commit, commit_updates_on_branch_with_expected - apply_schema and ensure_indices_for_branch pass None (system-attributed) PR 1b — per-(table_key, branch) write queue + revalidation + sidecar - new crates/omnigraph/src/db/write_queue.rs with WriteQueueManager, acquire/acquire_many, sorted+deduped acquisition; 6 unit tests - Arc<WriteQueueManager> field on Omnigraph + db.write_queue() accessor - MutationStaging::finalize split into stage_all (Phase A, no queue) and StagedMutation::commit_all (Phase B, acquire_many + revalidate pins + sidecar + commit_staged); guards held across publisher - delete-only mutations now emit recovery sidecars; revalidation extended to inline_committed tables - branch_merge_on_current_target, apply_schema_with_lock, and ensure_indices_for_branch acquire per-table queues for their touched tables PR 2 Step B (partial) — catalog and schema_source via ArcSwap - catalog: Catalog -> Arc<ArcSwap<Catalog>> - schema_source: String -> Arc<ArcSwap<String>> - public accessors return Arc<Catalog> / Arc<String>; readers bind locally where the borrow has to outlive an expression - new pub(crate) store_catalog / store_schema_source helpers replace the field assignments in apply_schema and reload_schema_if_source_changed - 117 tests across lifecycle/end_to_end/branching/runs pass; engine lib + workspace compile clean Coordinator wrap (Mutex) and the &mut self -> &self engine API conversion follow in subsequent commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:22:38 +02:00
Ragnor Comerford	cd780e2d37	deps: add arc-swap to workspace for PR 2 catalog/schema_source wrapping PR 2 wraps the Omnigraph engine's catalog and schema_source fields in ArcSwap so reads stay zero-cost while apply_schema can swap atomically without &mut self. arc-swap lands as an unused workspace dep here so the follow-up commits that wrap fields can land in isolation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 15:25:22 +02:00
Ragnor Comerford	c12f6adb0c	docs/invariants: add §VI.35-37 + non-commitments for MR-686 Three new §VI invariants name what OmniGraph commits to as an agent-native system of record: branches as the cross-query coordination primitive, per-query isolation as a per-query opt-in (Serializable up, eventual down), and type-aware agent-resolvable merges. Plus an explicit non-commitments subsection so reviewers see what is intentionally out of scope (Strict Serializable across queries, cross-process linearizable single-object writes, auto-resolution of ambiguous merge conflicts). §VII and §VIII renumber by +3 to make room (35-43 -> 38-46, 44-47 -> 47-50); deny-list and review-checklist references in §IX/§X follow. testing.md's pre-existing stale §VII.33/34/36 references resolve to their actual §VIII.47/48/50 targets in the same pass. staged_writes.rs:866's docstring gains an MR-686 forward reference so the load-bearing concurrency-hazard test points readers at the queue work that closes the gap. §VI.34 is preserved alongside the broader §VI.36 to keep its MR-425 pointer addressable; the overlap is documented in §VI.36's status line. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 14:45:54 +02:00
Ragnor Comerford	028b913d9a	Merge pull request #72 from ModernRelay/ragnorc/mr847-recovery-reconciler Recovery-on-open reconciler	2026-05-06 11:59:09 +02:00
Ragnor Comerford	a30666bc38	docs/tests: reserve Phase A/B/C/D for the per-writer recovery flow Three terminologies were calling themselves Phase A/B in PR #72: 1. Per-writer recovery (canonical, four phases A/B/C/D — sidecar / commit_staged loop / manifest publish / sidecar delete in `docs/runs.md:157`). 2. Per-table staged-write contract from MR-793 (two phases — `stage_*` then `commit_staged`). 3. Test-narrative scaffolding (Phase A = setup the failure, Phase B = verify recovery — used as section dividers in failpoints.rs). Same letters, three meanings; three reviewers including the bots have already misread the code in the resulting fog. This change keeps "Phase A/B/C/D" exclusively for #1 and rewrites the other two: - `ensure_indices_phase_a_btree_failure_leaves_existing_tables_writable` → `ensure_indices_stage_btree_failure_leaves_existing_tables_writable` (matches the `stage_create_btree_index` API verb). - Comment at `table_ops.rs:610` and the test docstring at `failpoints.rs:807` rewrite "a Phase A failure in the staged-index path" → "a stage-step failure in the staged-index path". - Twelve `// Phase A:` / `// Phase B:` test scaffolding comment headers in `failpoints.rs` (across six test fns) become `// Setup:` / `// Recovery:`. - A "Phase letter convention" note added near `docs/runs.md:165` spells the rule out for future readers. Also bundled: rename `composite_flow_init_load_branch_merge_time_travel_optimize_cleanup` → `composite_flow_canonical_lifecycle` so it pairs as a story name with `composite_flow_multi_branch_sequential_merges` (the previously- deferred symmetry rename). No behaviour change. Both renamed tests pass; full failpoints (18) + composite_flow (2) suites pass; workspace baseline + clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 22:46:03 +02:00
Ragnor Comerford	fb0f024652	recovery: register added tables + tombstones in SchemaApply roll-forward Cursor flagged that SchemaApply sidecars only captured `Update` pins (via `snapshot.entry()?` in schema_apply.rs:166), so recovery's `roll_forward_all` only published `ManifestChange::Update` for the rewritten/indexed tables. Added types (`added_tables`) and tombstones (`renamed_tables` sources) were silently dropped during recovery. Reproducer: in `schema_apply_phase_b_failure_recovered_on_next_open`, the v2 schema added a `Tag` node type. Pre-fix, `node:Tag` ended up as an orphan dataset on disk while the manifest never received a `RegisterTable` entry — the live `_schema.pg` declared a type the manifest didn't know about, and `count_rows(node:Tag)` panicked with `no manifest entry for node:Tag`. The existing test passed only because it never queried Tag. Fix: 1. Extend `RecoverySidecar` with `additional_registrations` and `tombstones` fields (optional, serde-default for backward compat with existing on-disk sidecars). Both are SchemaApply-only. 2. Populate them in `apply_schema_with_lock` from the migration plan's upfront diff (`added_tables` + `renamed_tables` keys for registrations; `renamed_tables` values for tombstones, version- pinned at `source_entry.table_version + 1`). 3. Update `roll_forward_all` to: - emit `RegisterTable` + `Update` for each `additional_registrations` entry (read the dataset's current Lance HEAD for the version metadata + row_count) - emit `Tombstone` for each `tombstones` entry - filter against `snapshot` so previously-published registrations / tombstones are skipped (handles the post-Phase-C-success-but- sidecar-not-yet-deleted case — without filtering, the publisher's CAS pre-check would error with `expected=0, actual=N` on the redundant Register) 4. Extend the audit-row outcomes to include published registrations. Test changes: - `schema_apply_phase_b_failure_recovered_on_next_open` now asserts `count_rows(node:Tag) == 0` (no panic), proving the new manifest entry exists. - `schema_apply_recovers_pre_commit_crash` renamed to `schema_apply_pre_commit_crash_rolls_forward_via_sidecar` and rewritten — pre-fix it expected pre-commit crashes to roll BACK (delete staging, keep V1, leave Company as orphan); the sidecar protocol's "complete the writer's intent" semantic now rolls FORWARD (rename staging -> final, register Company atomically). The new assertions verify schema = V2 and `node:Company` is queryable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 22:15:50 +02:00
Ragnor Comerford	3ea7a1fd50	recovery: record RolledForward audit on stale-after-success sidecar Cursor flagged that if `roll_forward_all` succeeds (manifest pin advances) but `record_audit` then fails, the sidecar persists. On the next open, every table classifies as NoMovement (lance_head == manifest_pinned, both already reflect the prior roll-forward) → `decide` returns RollBack → `roll_back_sidecar` records a RolledBack audit row with empty per-table outcomes. Operators reading `_graph_commit_recoveries.lance` see "RolledBack" for an operation whose actual outcome was a successful roll-forward. `process_sidecar`'s RollBack arm now distinguishes "stale-after- success" from a legitimate rollback: when every classification is NoMovement AND any pin's `manifest_pinned > expected_version` (the manifest already advanced past the writer's CAS target), recovery dispatches to `record_audit_recovery_rollforward` which writes a RolledForward audit row with reconstructed outcomes (`from_version = expected_version`, `to_version = manifest_pinned`) and deletes the sidecar. No Lance writes — the substrate is already in the post-roll-forward state. Safe in `RollForwardOnly` mode (refresh-time recovery) because no `Dataset::restore` is involved; the legitimate-rollback path stays deferred to the next ReadWrite open as before. Added `recovery_records_rolled_forward_for_stale_sidecar_after_successful_roll_forward` integration test that synthesizes the state by writing a sidecar whose `expected_version < manifest_pin` and asserts: - audit row records `RolledForward` (not `RolledBack`) - per-table outcome reports the correct `from_version` / `to_version` pair - sidecar is deleted Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:12:43 +02:00
Ragnor Comerford	11a9b3c8b9	tests: assert actual total_people count, not just row count Cubic flagged that the time-travel `total_people` assertions only checked `result.num_rows() == 1`, which would still pass if the historical query returned the wrong count (e.g., 10 instead of 6 because of a planner regression resolving against current state instead of the captured snapshot). Added `assert_total` helper that extracts the Int64 `total` column and verifies the actual value. Replaces three weak `num_rows() == 1` assertions in `composite_flow_multi_branch_sequential_merges`: - post-both-merges: total = 10 - time-travel to pre-merge-a: total = 6 - post-reopen: total = 10 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:12:27 +02:00
Ragnor Comerford	0a6f3d796a	tests: extend multi-branch flow with .gq query checkpoints The first cut of `composite_flow_multi_branch_sequential_merges` used dataset-direct `count_rows` for read-side assertions, which proves data is on disk but skips the query path entirely — planner, BTree index lookup, edge traversal, aggregation, and snapshot resolution all stay untested. Replaced with strategic `.gq` query checkpoints: - branch isolation via `get_person` after Eve insert (Eve visible on feat-a; absent on main) - 1-hop traversal via `friends_of(Grace)` after the Knows-edge insert (validates the topology index against branch-local edges) - post-merge query-engine readback after merge feat-a → main (Eve findable through BTree, Grace's edge traversable through the rebuilt Knows index) - aggregation via `total_people` after merge feat-b → main (count over a multi-fragment table whose shape is the result of two sequential merges) - time-travel via `ReadTarget::Snapshot(captured_id)` for both `total_people` and `friends_of` / `get_person` at the two pre-merge points (catches planner regressions where historical queries accidentally resolve current indices) - post-reopen query-engine readback (catches reopen-time index/ catalog binding regressions) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:42:17 +02:00
Ragnor Comerford	9fc6526ec0	tests: multi-branch sequential merges compositional flow Adds `composite_flow_multi_branch_sequential_merges` covering the agent-workflow pattern that single-merge tests in `branching.rs` cannot reach: two feature branches diverging from main with main writes interleaved between every diverge point, sequential merges into main, time-travel through the resulting merge DAG, and reopen consistency over a multi-merge history. The script (18 numbered steps with assertions per step): init+load → mutate main → branch feat-a → mutate main → mutate feat-a → branch feat-b → mutate feat-b → mutate feat-a (with edge) → merge feat-a → mutate main → merge feat-b → time-travel to pre-merge-a + pre-merge-b → reopen + verify. Catches eight compositional gap categories that only surface with ≥2 merges and main mutations between them: base/LCA recomputation across two merges, manifest-pin propagation through merge commits, time-travel through merge DAG without state bleed-through, branch- DAG consistency, sibling-branch isolation under writes elsewhere, post-merge main-write integration, multi-merge reopen replay, and clean-flow recovery-sidecar absence. `composite_flow.rs` was added to `docs/testing.md` so the before- every-task checklist points agents at the file before duplicating coverage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:34:04 +02:00
Ragnor Comerford	58a3ff0e48	recovery: align merge sidecar branch with active_branch + record rollback drift Two small PR #72 review findings addressed: - merge.rs sidecar pin recorded `entry.table_branch` (where the table currently lives in the target manifest) instead of the merge target branch where commits actually land via `publish_rewritten_merge_table` → `open_for_mutation` → `fork_dataset_from_entry_state`. Recovery's `open_lance_head` would then check the wrong ref. Aligned with the pattern already used in `ensure_indices_for_branch` (table_ops.rs:115). Added `branch_merge_sidecar_pins_table_branch_to_active_branch` contract test that reads the sidecar JSON and asserts every per-pin `table_branch` equals the active (target) branch — catches the regression even when the values happen to coincide in the test setup. - Rollback audit `from_version` previously equalled `to_version` (both `manifest_pinned`), telling operators nothing about the actual Lance HEAD drift before restore. Captured `lance_head` in `ClassifiedTable` and used it as `from_version` so audit rows now show "rolled back from v7 to v5" instead of "v5 → v5". Added `assert_rollback_outcomes_record_drift` invariant in the test helper, invoked automatically by every `RecoveryExpectation::RolledBack` test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:33:32 +02:00
Ragnor Comerford	815ff743f5	recovery: refresh-time roll-forward closes the in-process residual + invariants helper Bundle of three correctness fixes plus a shared invariants helper that existing tests now use. 1. SchemaApply atomicity: close the residual gap where a sidecar exists but staging files don't (e.g., Phase B failure BEFORE `_schema.pg.staging` write). `recover_schema_state_files` now returns a `SchemaStateRecovery` discriminator (`Noop` / `CleanedStaging` / `CompletedStagingRename { schema_apply_sidecar }`); the token threads through `recover_manifest_drift` → `process_sidecar`. SchemaApply sidecars are eligible for roll-forward ONLY when the staging rename completed in the same recovery pass. Full mode rolls back; RollForwardOnly defers. Without this, recovery would publish the manifest pin against new-schema data while `_schema.pg` stayed old (real corruption). New failpoint `schema_apply.before_staging_write` + new test `schema_apply_without_schema_staging_rolls_back_on_next_open` pin the gating. 2. Rollback target correction. Rollback now restores Lance HEAD to the current manifest pin (`state.manifest_pinned`) instead of the sidecar's `expected_version`. For UnexpectedAtP1/UnexpectedMultistep classifications these can differ; the old code could regress Lance HEAD past the manifest pin, re-introducing drift in the OTHER direction. The new behavior establishes `Lance HEAD == manifest pin` post-rollback — the canonical drift-free invariant. Param renamed from `expected_version` → `target_version` to match. Audit `to_version` records the actual restore target. This is a latent-behavior change. Any external consumer that compared `audit.to_version` against `sidecar.expected_version` for non-trivial classifications now sees the manifest pin instead. 3. Audit commit-graph unification. `record_audit` now opens the per-branch commit graph for ANY sidecar with `sidecar.branch.is_some()` — not just BranchMerge. Plain Mutation/Load/EnsureIndices commits on a feature branch now correctly land on that branch's commit graph, instead of main's. Closes the class of bug analogous to D2 but for non-merge writers. Pre-existing repos with non-main commits already on main's commit graph stay where they are; future recoveries write to the per-branch ref. Mixed-version compatibility is asymmetric but safe (old binaries ignore per-branch refs they don't know about; new binaries read both). 4. Recovery invariants helper + branch-axis cells. New `tests/helpers/recovery.rs` (~505 LOC) exports `assert_post_recovery_invariants(repo, op_id, RecoveryExpectation)` plus a `TableExpectation` builder. Six existing recovery tests refactored to call it; per-test bespoke assertions replaced. Two new branch-axis cells added in `tests/failpoints.rs`: - `recovery_rolls_forward_load_on_feature_branch` - `recovery_rolls_forward_ensure_indices_on_feature_branch` The loader gains a `mutation.post_finalize_pre_publisher` failpoint hook (gated on the `failpoints` feature; zero-cost in release) so the load test can pin the same Phase B → Phase C boundary the mutation path uses. Misc: - `Omnigraph::refresh` extracts `reload_schema_if_source_changed`: early-return when schema source unchanged (saves IR parse + catalog rebuild on the steady-state refresh path). - New test injection point `failpoint_publish_table_head_without_index_rebuild_for_test` under `#[cfg(feature = "failpoints")]`. Tests: 31 recovery + failpoint integration tests pass (14 + 17, up from 14 + 16). Full workspace sweep with `--features failpoints` clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 16:04:48 +02:00
Ragnor Comerford	44c0d0bc4b	recovery: refresh reloads schema after staging recovery; non-main merge test pins parent_commit_id E1. After D3 added recover_schema_state_files to refresh(), the in-memory `self.schema_source` and `self.catalog` were left stale: a SchemaApply sidecar processed via refresh would rename the staging files (`_schema.pg`, IR contract) into place but the handle continued operating against the old catalog. Subsequent operations would surface schema mismatches against post-migration data on disk. Fix: after recover_manifest_drift completes, refresh() now mirrors open_with_storage_and_mode's schema-load sequence — re-reads `_schema.pg`, parses IR via load_or_bootstrap_schema_contract, rebuilds the catalog with fixup_blob_schemas, and assigns into self.schema_source / self.catalog. Steady-state cost: one read + one parse per refresh; only mutates handle state when the on-disk schema actually changed. E2. The non-main branch_merge recovery test (`branch_merge_phase_b_failure_recovered_on_non_main_target`) asserted only `merged_parent_commit_id` was non-null — but `merged_parent_commit_id` is independently populated from sidecar.merge_source_commit_id (the SOURCE branch's tip), so the assertion would pass even if D2's per-branch CommitGraph fix regressed (the bug was about `parent_commit_id`, the TARGET branch's tip). Fix: capture target_branch's commit-graph head BEFORE the failed merge by scanning target_branch's Lance ref on _graph_commits.lance and picking the latest commit by created_at. After recovery, find the recovery merge commit (the one with non-null merged_parent_commit_id) and assert its `parent_commit_id` == captured pre-failure head. Without D2, recovery would record the GLOBAL head (the source_branch's insert-Carol commit on this test) instead, and the assertion fails. Also fixes the column-type cast: created_at is stored as TimestampMicrosecondArray, not Int64Array. All workspace tests pass with --features failpoints. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 12:06:17 +02:00
Ragnor Comerford	2ce4efc450	recovery: four review-round-4 fixes + branch-axis test matrix D1. roll_forward_all returns per-table actual published versions; the audit row's `to_version` records that, not pin.post_commit_pin (the latter is a lower bound for loose-match writers SchemaApply / EnsureIndices / BranchMerge — pin.post_commit_pin = expected + 1 while actual published HEAD can be expected + N). D2. Branch-merge recovery audit uses CommitGraph::open_at_branch when sidecar.branch is Some, so the merge parent is the TARGET BRANCH's tip (not the global head). Without this, recovered branch_merge on a non-main target records the wrong merged_parent_commit_id and future merges between the same pair lose already-up-to-date detection / merge-base correctness. D3. Omnigraph::refresh now mirrors open's recovery composition: runs recover_schema_state_files BEFORE recover_manifest_drift. Without this, a SchemaApply sidecar processed via refresh would publish the manifest + delete the sidecar without renaming the staging schema files, leaving the repo with new-schema data and old `_schema.pg` (real corruption). Refresh's docstring now enumerates each open-time recovery step it maintains, so the next maintainer's diff between open() and refresh() is trivial. D4. ensure_indices sidecar pin records `active_branch` (where commits actually land), not `entry.table_branch` (where the table currently lives). On first fork-on-write, the processing loop's open_owned_dataset_for_branch_write forks to active_branch and the commit lands there — recovery's open_lance_head must check the same branch. Without this, recovery checks the wrong ref and misses Phase B drift entirely. D5. Two new branch-axis tests: * recovery_rolls_back_feature_branch_sidecar_against_feature_branch — feature-branch rollback variant; asserts post-recovery audit kind == RolledBack and the actual restore commit landed on the feature ref. * branch_merge_phase_b_failure_recovered_on_non_main_target — non-main merge target variant; reads the target branch's commit graph (Lance ref) and asserts the recovery commit has a non-null merged_parent_commit_id (pins D2). Bug pattern: all four are at composition seams between concepts that were each tested individually (writer-precision × actual-Lance-HEAD; branch-context × commit-graph-API; recovery-path × writer-kind; pin- time-branch × commit-time-branch). The branch-axis matrix is the cheapest mechanical prevention for D2/D4-class regressions. All workspace tests pass with --features failpoints. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 11:34:18 +02:00
Ragnor Comerford	aaa031e834	recovery: refresh-time roll-forward closes the in-process residual Adds RecoveryMode { Full, RollForwardOnly } and wires Omnigraph::refresh to invoke roll-forward-only recovery. This closes the documented "long-running server between Phase B failure and process restart" residual without requiring a restart, for the common case (mutation / load finalize → publisher failure). Why roll-forward only and not full sweep: * Roll-forward is safe under concurrency (publisher uses row-level CAS). * Roll-back uses Dataset::restore, which "wins" against concurrent Append/Update/Delete/CreateIndex/Merge per check_restore_txn — silently orphaning the concurrent writer's commit (pinned by tests/staged_writes.rs::lance_restore_loses_to_concurrent_append_via_orphaning). Sidecars that classify as RollBack-eligible are LEFT ON DISK for the next ReadWrite open, where no concurrent writers exist and full restore is safe. Implementation: * recovery.rs: RecoveryMode enum; recover_manifest_drift takes mode; process_sidecar branches on mode for Abort and RollBack — both defer to next ReadWrite open under RollForwardOnly. RollForward behavior unchanged. * omnigraph.rs: Omnigraph::refresh promoted to pub; calls recover_manifest_drift in RollForwardOnly mode after coordinator refresh. Steady-state cost: one list_dir of __recovery (early return on empty). Adds refresh_coordinator_only — pub(crate) — for engine-internal callers that hold an in-flight sidecar (the schema_apply lease-check + lock-release paths). Without this split, refresh would race the in-flight sidecar. * schema_apply.rs: switch all 6 internal db.refresh() call sites to refresh_coordinator_only(). Tests: * refresh_runs_roll_forward_recovery_in_process — trigger mutation.post_finalize_pre_publisher; without restart, call db.refresh(); assert sidecar deleted, drifted row visible, subsequent mutation succeeds. * refresh_defers_rollback_eligible_sidecar_to_next_open — synthesize a Mutation sidecar with bogus expected (UnexpectedAtP1 → RollBack); refresh leaves it on disk and Lance HEAD unchanged; drop and reopen runs the full sweep which advances HEAD via restore. Docs: * docs/runs.md "Long-running servers" caveat updated to describe the refresh-time roll-forward path and the rollback-defer behavior. * docs/invariants.md §VI.23 status line updated to reflect in-process closure of the common case. Workspace tests pass with --features failpoints; no regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 00:15:42 +02:00
Ragnor Comerford	8c6506f5cd	recovery: close four correctness gaps (schema-apply, branch-aware, restore short-circuit, merge parent) B1. Schema-apply atomicity. Before this commit, a failure between `_schema.pg.staging` write and the manifest publish left the repo corrupt: Lance HEADs advanced under the new schema, manifest stayed at old pins, and on reopen schema-state recovery deleted the staging files (manifest's table set still matched the live schema), then manifest-drift recovery rolled the table versions forward — leaving new-schema data on disk with the old `_schema.pg` live. Fix: a SchemaApply sidecar is the marker that Phase B completed but Phase C didn't. New helper `has_schema_apply_sidecar` is consulted by `recover_schema_state_files` BEFORE its disambiguation logic; when present, it completes the staging→final rename so the subsequent manifest-drift roll-forward sees the new catalog. B2. Branch-aware recovery. Sidecars from feature-branch writers were being classified against main's snapshot and main's Lance HEAD, silently no-op'ing or rolling back the wrong table version (the classifier saw NoMovement; the writer's drift on the feature branch persisted; subsequent feature writers surfaced ExpectedVersionMismatch). Fix: SidecarTablePin gets an optional `table_branch` field; `recover_manifest_drift` opens a per-branch coordinator (`GraphCoordinator::open_branch`) per sidecar; `open_lance_head`, `restore_table_to_version`, and `roll_forward_all` honor the pin's branch via `Dataset::checkout_branch`. B3. Remove fragment-id short-circuit in `restore_table_to_version`. Equal fragment IDs do NOT imply equal content: Lance index commits and deletion-vector updates change the manifest without touching fragment IDs. Skipping restore in those cases would leave Lance HEAD ahead of the manifest with no recovery artifact left. Restore is now unconditional; pile-up under repeated mid-rollback crashes bounded and reclaimed by `omnigraph cleanup`. B4. Recovered branch_merge records merge parent. `record_audit` always called `append_commit`, dropping `merged_parent_commit_id`. Future `branch_merge feature -> main` between the same pair lost already-up-to-date detection. RecoverySidecar gets an optional `merge_source_commit_id`; `branch_merge_on_current_target` populates it from `source_head_commit_id`; `record_audit` dispatches to `append_merge_commit` when present. New tests: feature-branch sidecar classification (B2); B1 deepens the existing schema_apply test with live-`_schema.pg` and new-type assertions; B4 deepens the existing branch_merge test by reading `_graph_commits.lance` and asserting a non-null `merged_parent_commit_id`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 23:39:41 +02:00

1 2 3 4 5

245 commits