mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-09 01:35:18 +02:00
Three bundled changes:
1. Rename `tests/agent_lifecycle.rs` -> `tests/composite_flow.rs` (and
the test function). OmniGraph is consumed by both humans and agents
- naming the test after one audience misframes the library.
2. Strip Linear ticket IDs, PR numbers, bot reviewer names, and
review-round labels from source, tests, and docs added by this
branch. Internal traceability belongs in commit messages and PR
descriptions, not in checked-in artifacts. Upstream
lance-format/lance issue refs and pre-existing MR-XXX refs in docs
not touched by this branch are left alone.
3. Two outstanding review findings addressed:
- `needs_index_work_node` / `needs_index_work_edge`: propagate
`count_rows` errors instead of `unwrap_or(0)`. Silently treating
transient I/O failures as "0 rows" risked skipping a table from
the recovery sidecar pin set that was actually about to be
modified.
- `recovery_multi_sidecar_requires_fresh_snapshot_for_correctness`:
strengthen the assertion to fail when sidecar B classifies under
a stale snapshot. The new assertion checks post-recovery Lance
HEAD == v3 (no `Dataset::restore` ran). The previous "sidecar
deleted + audit rows present" pair passed in both the bug and
fix paths because both delete the sidecar and write an audit
row; the differentiator is the post-recovery HEAD. Strengthening
the assertion exposed an additional nuance: in this overlapping-
sidecar scenario sidecar B's audit kind is RolledBack (no-op)
rather than RolledForward, since sidecar A's roll-forward
publishes Lance HEAD as the new manifest pin (absorbing B's
work). The docstring now explains why this is correct given
current `roll_forward_all` semantics.
All workspace tests pass with --features failpoints.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
29 lines
2.1 KiB
Markdown
29 lines
2.1 KiB
Markdown
# Maintenance: Optimize & Cleanup
|
|
|
|
`db/omnigraph/optimize.rs`.
|
|
|
|
## `optimize_all_tables(db)` — non-destructive
|
|
|
|
- Lance `compact_files()` on every node + edge table on `main`.
|
|
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests.
|
|
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
|
|
- Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed }]`.
|
|
|
|
## `cleanup_all_tables(db, options)` — destructive
|
|
|
|
- Lance `cleanup_old_versions()` per table.
|
|
- Removes manifests (and their unique fragments) older than the retention policy.
|
|
- `CleanupPolicyOptions { keep_versions: Option<u32>, older_than: Option<Duration> }` — at least one is required.
|
|
- Returns `[TableCleanupStats { table_key, bytes_removed, old_versions_removed }]`.
|
|
- CLI guards with `--confirm`; without it, prints a preview line.
|
|
- **Recovery floor:** `--keep < 3` may garbage-collect Lance versions that the open-time recovery sweep needs as a rollback target (the sweep restores to the manifest-pinned `expected_version`, which is HEAD-1 in the typical Phase B → Phase C drift case). Default `--keep 10` is safe.
|
|
|
|
## Tombstones
|
|
|
|
Logical sub-table delete markers in `__manifest`; `tombstone_object_id(table_key, version)` excludes a sub-table version from snapshot reconstruction.
|
|
|
|
## Internal schema migrations (`db/manifest/migrations.rs`)
|
|
|
|
Version evolutions of the on-disk `__manifest` shape are reconciled automatically on the first write under a new binary. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape the binary expects; the on-disk stamp `omnigraph:internal_schema_version` (Lance schema-level metadata) records the on-disk shape. The publisher's open-for-write path calls `migrate_internal_schema` before reading state; reads are side-effect-free. No operator action is required for in-place upgrades. See [storage.md → Internal schema versioning](storage.md) for the full mechanism.
|
|
|
|
A binary opening a manifest stamped at a version *higher* than it knows about refuses to publish with a clear "upgrade omnigraph first" error — old binaries cannot clobber a newer schema.
|