mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-12 01:45:14 +02:00
Merge origin/main into MR-854 branch
Resolve the optimize.rs conflict: adopt main's optimize_one_table helper (recovery-sidecar + manifest-publish 4-phase pattern, pending-sidecar guard, post-loop cache invalidation) and re-apply the MR-854 db.storage() conversion on top - open via SnapshotHandle::into_dataset() for the Lance-only compact_files API, and read table_state through the trait by re-wrapping the post-compaction dataset. No db.table_store field access remains in engine code.
This commit is contained in:
commit
e050214f2f
15 changed files with 825 additions and 144 deletions
|
|
@ -34,10 +34,10 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
|
|||
| `s3_storage.rs` | S3-backed graph (skipped unless `OMNIGRAPH_S3_TEST_BUCKET` is set) |
|
||||
| `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior |
|
||||
| `validators.rs` | Schema constraint enforcement (enum, range, unique, cardinality) across JSONL, insert, update paths |
|
||||
| `maintenance.rs` | `optimize` (compaction) + `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation |
|
||||
| `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the four per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`). |
|
||||
| `maintenance.rs` | `optimize` (compaction) + `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes the compacted version so the manifest tracks the Lance HEAD and a subsequent schema apply succeeds (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), and reconciles a pre-existing manifest-behind-HEAD drift forged via raw Lance compaction (`optimize_reconciles_preexisting_manifest_head_drift`) |
|
||||
| `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the five per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`, `optimize_phase_b_failure_recovered_on_next_open`). |
|
||||
| `recovery.rs` | Open-time recovery sweep — sidecar I/O, classifier dispatch (NoMovement / RolledPastExpected / UnexpectedAtP1 / UnexpectedMultistep / InvariantViolation), all-or-nothing decision, roll-forward via `ManifestBatchPublisher::publish`, roll-back via `Dataset::restore`, audit row in `_graph_commit_recoveries.lance`, `OpenMode::ReadOnly` skip path |
|
||||
| `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories). |
|
||||
| `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories, post-optimize and post-cleanup strict writes). |
|
||||
|
||||
## Fixtures
|
||||
|
||||
|
|
|
|||
|
|
@ -153,10 +153,14 @@ are left at `Lance HEAD = manifest_pinned + 1`.
|
|||
|
||||
**Recovery protocol** (lifecycle of every staged-write writer —
|
||||
`MutationStaging::finalize`, `schema_apply::apply_schema_with_lock`,
|
||||
`branch_merge_on_current_target`, `ensure_indices_for_branch`):
|
||||
`branch_merge_on_current_target`, `ensure_indices_for_branch`,
|
||||
`optimize_all_tables`):
|
||||
|
||||
1. **Phase A**: writer writes a sidecar JSON to
|
||||
`__recovery/{ulid}.json` BEFORE its first `commit_staged`. The
|
||||
`__recovery/{ulid}.json` BEFORE its first HEAD-advancing commit
|
||||
(`commit_staged`, or `compact_files` for `optimize_all_tables`,
|
||||
which advances the Lance HEAD via a reserve-fragments + rewrite
|
||||
commit rather than a staged write). The
|
||||
sidecar names every `(table_key, table_path, expected_version,
|
||||
post_commit_pin)` it intends to commit + the writer kind +
|
||||
actor_id.
|
||||
|
|
@ -191,8 +195,13 @@ recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`:
|
|||
otherwise full open-time recovery rolls them back and refresh-time
|
||||
recovery leaves them for the next read-write open.
|
||||
- Otherwise **roll back**: per-table `Dataset::restore` to the
|
||||
manifest-pinned table version for that branch. Rollback records the
|
||||
actual restore target in the audit row's `to_version`.
|
||||
manifest-pinned table version, then a single `ManifestBatchPublisher::publish`
|
||||
of the restored HEAD — symmetric with roll-forward, so `manifest == HEAD`
|
||||
after recovery (no residual drift). This convergence is what lets a
|
||||
failed-then-retried schema apply succeed instead of failing one version higher
|
||||
each iteration. The audit row's `to_version` records the logical
|
||||
rolled-back-to version (`manifest_pinned`); the manifest is published at the
|
||||
restore commit (`manifest_pinned + 1`, same content).
|
||||
- After a successful roll-forward or roll-back, an audit row is
|
||||
recorded — `_graph_commits.lance` carries
|
||||
a commit tagged `actor_id = "omnigraph:recovery"`, and a sibling
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue