omnigraph/docs/user/maintenance.md
Ragnor Comerford 2193a24641
docs(optimize): document the pre-existing-drift reconcile
- maintenance.md: add the reconcile bullet (metadata-only manifest catch-up for
  a table with HEAD>manifest and an empty plan), and correct the 'requires a
  recovered graph' note — that guard is what makes the reconcile safe (no
  sidecar-covered drift in flight), not a claim that no drift exists.
- AGENTS.md: restore the reconcile mention in the Compaction capability row.
- testing.md: the maintenance.rs row lists all three optimize tests (publishes /
  reconciles / defers); fix the stale failpoints ensure_indices test name
  (recovery_rolls_forward_ensure_indices_on_feature_branch).
2026-06-08 14:17:23 +02:00


Maintenance: Optimize & Cleanup

db/omnigraph/optimize.rs.

optimize_all_tables(db) — non-destructive

  • Runs Lance compact_files() on every node and edge table on main, then publishes the compacted version to __manifest so that the manifest's table_version tracks the compacted Lance HEAD. Reads pin the manifest version, so without this publish compaction would be invisible to readers and would break the HEAD-vs-manifest precondition of the next schema apply or strict update/delete ("stale view … refresh and retry"). For tables that actually compacted, the publish advances the graph version (a system-attributed commit).
  • Reconciles pre-existing benign drift. A table whose Lance HEAD already sits ahead of its manifest pin with nothing left to compact (left by a pre-fix optimize that never published, or by an external raw compact_files) is caught up by publishing manifest = HEAD. This is a metadata-only commit: committed: true with fragments_added == 0 and no fragment changes. It is the producer-side heal for uncovered drift, and it is safe because the "requires a recovered graph" guard below ensures no recovery sidecar is in flight, so any HEAD > manifest it sees is necessarily content-preserving, never a partial write. Until an optimize (or any insert/merge, which heals the drift as a side effect) reconciles such a table, strict writes and schema apply against it fail with a 409 "stale view".
  • Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests until cleanup runs.
  • Each table's compact→publish runs under its per-(table, main) write queue, which serializes it with concurrent mutations: compaction is a Lance Rewrite op that raises a retryable conflict against a concurrent merge/update/delete on overlapping fragments. The gap between the Lance HEAD advancing and the manifest publish is covered by a SidecarKind::Optimize recovery sidecar (loose-match): a crash in that window rolls the compacted version forward on the next Omnigraph::open (compaction is content-preserving, so roll-forward is always safe).
  • Requires a recovered graph. optimize refuses (errors) when an unresolved recovery sidecar is present under __recovery — operating on an unrecovered graph could publish a partial write the open-time recovery sweep would roll back. Reopen the graph to run the recovery sweep, then re-run optimize. (This is exactly what makes the reconcile above safe: a recovered graph has no sidecar-covered drift going in, so any HEAD > manifest optimize then sees is uncovered, content-preserving drift — never a partial write.)
  • Bounded by OMNIGRAPH_MAINTENANCE_CONCURRENCY (default 8).
  • Returns [TableOptimizeStats { table_key, fragments_removed, fragments_added, committed, skipped }].
  • Blob tables are skipped. A table that declares any Blob property is not compacted: it is reported with skipped: Some(BlobColumnsUnsupportedByLance) (and logged via tracing::warn), and the rest of the sweep proceeds normally. The current Lance compact_files mis-decodes blob-v2 columns under its forced BlobHandling::AllBinary read; only compaction is affected, not reads or writes. The skip is gated by LANCE_SUPPORTS_BLOB_COMPACTION (db/omnigraph/optimize.rs) and will be removed when the upstream Lance fix lands (see docs/dev/lance.md). Consequence: fragment count and deleted-row space on blob tables are not reclaimed until then; query results are never affected.
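
The per-table stats rows can be sorted into the outcomes described above (compacted, reconciled, skipped, unchanged). The following is a minimal sketch, not the code in db/omnigraph/optimize.rs: the field names and the BlobColumnsUnsupportedByLance variant come from the text above, while the field types, the SkipReason enum name, the OptimizeOutcome enum, and the classify helper are assumptions made for illustration.

```rust
/// Why a table was skipped. The variant name comes from the text above;
/// the enum name `SkipReason` is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SkipReason {
    BlobColumnsUnsupportedByLance,
}

/// Field names follow the text above; the field types are assumptions.
#[derive(Debug, Clone)]
pub struct TableOptimizeStats {
    pub table_key: String,
    pub fragments_removed: usize,
    pub fragments_added: usize,
    pub committed: bool,
    pub skipped: Option<SkipReason>,
}

/// Hypothetical summary of what optimize did to one table.
#[derive(Debug, PartialEq, Eq)]
pub enum OptimizeOutcome {
    /// Not compacted (for example, a blob table).
    Skipped(SkipReason),
    /// Fragments were rewritten and the new version was published.
    Compacted,
    /// Metadata-only publish: manifest caught up to HEAD, no fragment changes.
    Reconciled,
    /// Nothing was published for this table.
    Unchanged,
}

/// Classify one stats row (hypothetical helper).
pub fn classify(stats: &TableOptimizeStats) -> OptimizeOutcome {
    if let Some(reason) = &stats.skipped {
        return OptimizeOutcome::Skipped(reason.clone());
    }
    match (stats.committed, stats.fragments_removed, stats.fragments_added) {
        // A commit with no fragment changes is the reconcile case.
        (true, 0, 0) => OptimizeOutcome::Reconciled,
        (true, _, _) => OptimizeOutcome::Compacted,
        (false, _, _) => OptimizeOutcome::Unchanged,
    }
}
```

A caller could, for instance, count Reconciled rows after a sweep to see how many tables were carrying uncovered drift.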

cleanup_all_tables(db, options) — destructive

  • Runs Lance cleanup_old_versions() on each table.
  • Removes manifests (and their unique fragments) older than the retention policy.
  • CleanupPolicyOptions { keep_versions: Option<u32>, older_than: Option<Duration> } — at least one is required.
  • Returns [TableCleanupStats { table_key, bytes_removed, old_versions_removed, error }].
  • Fault-isolated per table. A single table's transient failure (version GC or orphan reclaim) is recorded on that table's stats row (error: Some(..), logged via tracing) and never aborts the healthy tables — cleanup is the convergence backstop, so it does as much as it can and converges on re-run. The CLI reports any failed tables; rerun cleanup to retry them.
  • The CLI guards cleanup behind --confirm; without it, the command only prints a preview line.
  • Recovery floor: --keep < 3 may garbage-collect Lance versions that the open-time recovery sweep needs as a rollback target (the sweep restores to the branch's manifest-pinned table version, which is HEAD-1 in the typical Phase B → Phase C drift case). Default --keep 10 is safe.
  • Orphaned-branch reconciliation: before the version GC, cleanup runs reconcile_orphaned_branches, which force_delete_branches any per-table or commit-graph Lance branch absent from the manifest branch list. These orphans arise when a branch_delete flips the manifest authority but a downstream best-effort reclaim does not complete (see branches-commits.md). The reconciler is authority-derived and idempotent (it no-ops once nothing is orphaned), runs regardless of the keep_versions / older_than values (those gate version GC only), and never reclaims main or system-branch forks. Reclaimed forks are logged via tracing::info.
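
The policy rules above (at least one retention bound, the recovery floor of 3, and per-table fault isolation) can be sketched as plain checks. This is an illustrative sketch under assumptions, not the crate's code: the struct and field names come from the text above, but the numeric field types, the String error type, the mapping of --keep onto keep_versions, and the three helper functions are hypothetical.

```rust
use std::time::Duration;

/// Field names follow the text above.
#[derive(Debug, Clone, Default)]
pub struct CleanupPolicyOptions {
    pub keep_versions: Option<u32>,
    pub older_than: Option<Duration>,
}

/// Field names follow the text above; the field types are assumptions.
#[derive(Debug, Clone)]
pub struct TableCleanupStats {
    pub table_key: String,
    pub bytes_removed: u64,
    pub old_versions_removed: u64,
    pub error: Option<String>,
}

/// Keeping fewer versions than this may remove a rollback target
/// that the open-time recovery sweep needs.
pub const RECOVERY_FLOOR_KEEP: u32 = 3;

/// Reject a policy that sets neither bound (hypothetical helper).
pub fn validate_policy(opts: &CleanupPolicyOptions) -> Result<(), String> {
    if opts.keep_versions.is_none() && opts.older_than.is_none() {
        return Err("cleanup policy needs keep_versions, older_than, or both".to_string());
    }
    Ok(())
}

/// True when keep_versions is set and falls below the recovery floor
/// (hypothetical helper).
pub fn below_recovery_floor(opts: &CleanupPolicyOptions) -> bool {
    matches!(opts.keep_versions, Some(keep) if keep < RECOVERY_FLOOR_KEEP)
}

/// Keys of the tables whose stats row carries an error, so a caller can
/// report them and re-run cleanup (hypothetical helper).
pub fn failed_tables(stats: &[TableCleanupStats]) -> Vec<&str> {
    stats
        .iter()
        .filter(|s| s.error.is_some())
        .map(|s| s.table_key.as_str())
        .collect()
}
```

Because a failed table only marks its own row, a caller can treat a non-empty failed_tables result as "re-run cleanup" rather than as a failure of the whole sweep.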

Tombstones

Tombstones are logical sub-table delete markers stored in __manifest. A marker keyed by tombstone_object_id(table_key, version) excludes that sub-table version from snapshot reconstruction.
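
As a rough illustration of the exclusion rule, the sketch below filters tombstoned (table_key, version) pairs out of a candidate list. It is an assumption-heavy toy, not the real reconstruction path: the TableVersion alias, the visible_versions function, and the use of a HashSet are hypothetical, and the actual format of tombstone_object_id is not modeled.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a sub-table version: (table_key, version).
pub type TableVersion = (String, u64);

/// Return the candidate versions that are not tombstoned, preserving order.
/// Hypothetical helper; the real snapshot reconstruction is not shown here.
pub fn visible_versions(
    entries: &[TableVersion],
    tombstones: &HashSet<TableVersion>,
) -> Vec<TableVersion> {
    entries
        .iter()
        .filter(|entry| !tombstones.contains(*entry))
        .cloned()
        .collect()
}
```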

Internal schema migrations (db/manifest/migrations.rs)

Changes to the on-disk __manifest shape between versions are reconciled automatically on the first write under a new binary. INTERNAL_MANIFEST_SCHEMA_VERSION declares the shape the binary expects; the on-disk stamp omnigraph:internal_schema_version (Lance schema-level metadata) records the on-disk shape. The publisher's open-for-write path calls migrate_internal_schema before reading state; reads are side-effect-free. No operator action is required for in-place upgrades. See storage.md → Internal schema versioning for the full mechanism.

A binary opening a manifest stamped at a version higher than it knows about refuses to publish with a clear "upgrade omnigraph first" error — old binaries cannot clobber a newer schema.
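
The three cases (stamp equal to, older than, or newer than the binary's expected version) can be sketched as a single comparison. This is a minimal sketch under assumptions: the SchemaGate enum, the gate_for_write function, the u32 version type, and the exact error wording are hypothetical, and the real migrate_internal_schema is not reproduced.

```rust
use std::cmp::Ordering;

/// Hypothetical decision taken on the open-for-write path.
#[derive(Debug, PartialEq, Eq)]
pub enum SchemaGate {
    /// The on-disk stamp matches what the binary expects.
    UpToDate,
    /// The on-disk shape is older and is migrated before state is read.
    Migrate { from: u32, to: u32 },
}

/// Compare the on-disk stamp with the version the binary expects.
/// A newer on-disk stamp is refused so an old binary cannot clobber it.
pub fn gate_for_write(on_disk: u32, binary_expects: u32) -> Result<SchemaGate, String> {
    match on_disk.cmp(&binary_expects) {
        Ordering::Equal => Ok(SchemaGate::UpToDate),
        Ordering::Less => Ok(SchemaGate::Migrate { from: on_disk, to: binary_expects }),
        Ordering::Greater => Err(format!(
            "manifest schema v{on_disk} is newer than this binary (v{binary_expects}): upgrade omnigraph first"
        )),
    }
}
```

Only the write path consults such a gate in this sketch, which matches the statement above that reads are side-effect-free.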