Mirror of https://github.com/ModernRelay/omnigraph.git (synced 2026-06-15 01:55:13 +02:00)
docs(optimize): document the pre-existing-drift reconcile
- maintenance.md: add the reconcile bullet (metadata-only manifest catch-up for a table with HEAD>manifest and an empty plan), and correct the 'requires a recovered graph' note — that guard is what makes the reconcile safe (no sidecar-covered drift in flight), not a claim that no drift exists.
- AGENTS.md: restore the reconcile mention in the Compaction capability row.
- testing.md: the maintenance.rs row lists all three optimize tests (publishes / reconciles / defers); fix the stale failpoints ensure_indices test name (recovery_rolls_forward_ensure_indices_on_feature_branch).
parent 4bcfdee891
commit 2193a24641
3 changed files with 6 additions and 5 deletions
@@ -4,10 +4,11 @@
 ## `optimize_all_tables(db)` — non-destructive

-- Lance `compact_files()` on every node + edge table on `main`, then **publishes the compacted version to the `__manifest`** so the manifest's `table_version` tracks the compacted Lance HEAD. Reads pin the manifest version, so without this publish compaction would be invisible to readers *and* would break the HEAD-vs-manifest precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually compacted.
+- Lance `compact_files()` on every node + edge table on `main`, then **publishes the compacted version to the `__manifest`** so the manifest's `table_version` tracks the compacted Lance HEAD. Reads pin the manifest version, so without this publish compaction would be invisible to readers *and* would break the HEAD-vs-manifest precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) for tables that actually compacted.
+- **Reconciles pre-existing benign drift.** A table whose Lance HEAD already sits ahead of its manifest pin with nothing left to compact — left by a pre-fix `optimize` that never published, or by an external raw `compact_files` — is caught up by publishing `manifest = HEAD` (a metadata-only commit; `committed: true` with `fragments_added == 0`, no fragment changes). This is the producer-side heal for *uncovered* drift, and it is safe because the "requires a recovered graph" guard below ensures no recovery sidecar is in flight, so any `HEAD > manifest` it sees is necessarily content-preserving — never a partial write. Strict writes / schema apply on such a table 409 with "stale view" until an `optimize` (or any insert/merge, which heals it as a side effect) reconciles it.
 - Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests until `cleanup` runs.
 - Each table's compact→publish runs under its per-`(table, main)` write queue (serializing with concurrent mutations — compaction is a Lance `Rewrite` op that retryable-conflicts with a concurrent merge/update/delete on overlapping fragments). The Lance-HEAD-before-manifest-publish gap is covered by a `SidecarKind::Optimize` recovery sidecar (loose-match): a crash in that window rolls the compacted version forward on the next `Omnigraph::open` (compaction is content-preserving, so roll-forward is always safe).
-- **Requires a recovered graph.** `optimize` refuses (errors) when an unresolved recovery sidecar is present under `__recovery` — operating on an unrecovered graph could publish a partial write the open-time recovery sweep would roll back. Reopen the graph to run the recovery sweep, then re-run `optimize`. (Recovery roll-back now publishes its restored version, so a recovered graph always satisfies `manifest == Lance HEAD` going in; there is no leftover drift for `optimize` to interpret.)
+- **Requires a recovered graph.** `optimize` refuses (errors) when an unresolved recovery sidecar is present under `__recovery` — operating on an unrecovered graph could publish a partial write the open-time recovery sweep would roll back. Reopen the graph to run the recovery sweep, then re-run `optimize`. (This is exactly what makes the reconcile above safe: a recovered graph has no *sidecar-covered* drift going in, so any `HEAD > manifest` `optimize` then sees is uncovered, content-preserving drift — never a partial write.)
 - Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
 - Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed, skipped }]`.
 - **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: Some(BlobColumnsUnsupportedByLance)` (and logged via `tracing::warn`) instead of compacted, and the rest of the sweep proceeds normally. The current Lance `compact_files` mis-decodes blob-v2 columns under its forced `BlobHandling::AllBinary` read; **reads and writes are unaffected** — only compaction is. This is gated by `LANCE_SUPPORTS_BLOB_COMPACTION` (`db/omnigraph/optimize.rs`) and removed when the upstream Lance fix lands (see [docs/dev/lance.md](../dev/lance.md)). Consequence: fragment count and deleted-row space on blob tables are not reclaimed until then; query results are never affected.

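The per-table behavior described in the documented bullets can be summarized as a small decision function. The following is a minimal sketch, not the code in `db/omnigraph/optimize.rs`: `TableState`, `OptimizeAction`, `OptimizeError`, `plan_table`, and `plan_optimize` are hypothetical names introduced here for illustration. It also assumes that the blob check runs before any drift check, which the documentation does not state.

```rust
/// Hypothetical snapshot of one table on `main` (illustrative fields only).
#[derive(Debug, Clone, Copy)]
pub struct TableState {
    /// Latest Lance version of the table.
    pub lance_head: u64,
    /// `table_version` currently pinned in `__manifest`.
    pub manifest_version: u64,
    /// True when the compaction plan is non-empty.
    pub has_compactable_fragments: bool,
    /// True when the table declares any `Blob` property.
    pub has_blob_column: bool,
}

/// Hypothetical outcome for one table.
#[derive(Debug, PartialEq, Eq)]
pub enum OptimizeAction {
    /// Reported as skipped; no compaction (assumed to also mean no reconcile).
    SkipBlobTable,
    /// Compact, then publish the compacted version to the manifest.
    CompactAndPublish,
    /// Metadata-only publish of `manifest = HEAD`; no fragment changes.
    ReconcileManifest,
    /// Manifest already tracks HEAD and there is nothing to compact.
    Nothing,
}

/// Hypothetical error for the "requires a recovered graph" guard.
#[derive(Debug, PartialEq, Eq)]
pub enum OptimizeError {
    UnrecoveredGraph,
}

/// Decide what to do with a single table.
pub fn plan_table(state: TableState) -> OptimizeAction {
    if state.has_blob_column {
        // Assumption: blob tables are skipped before any other check.
        OptimizeAction::SkipBlobTable
    } else if state.has_compactable_fragments {
        OptimizeAction::CompactAndPublish
    } else if state.lance_head > state.manifest_version {
        // Empty plan but HEAD is ahead of the manifest pin: benign drift.
        OptimizeAction::ReconcileManifest
    } else {
        OptimizeAction::Nothing
    }
}

/// Plan the whole sweep; refuse when a recovery sidecar is unresolved.
pub fn plan_optimize(
    sidecar_pending: bool,
    tables: &[TableState],
) -> Result<Vec<OptimizeAction>, OptimizeError> {
    if sidecar_pending {
        // The guard runs first, so any drift seen below is uncovered drift.
        return Err(OptimizeError::UnrecoveredGraph);
    }
    Ok(tables.iter().copied().map(plan_table).collect())
}
```

In this sketch the guard is checked before any table is inspected, which mirrors the argument in the commit message: the reconcile branch is only reachable once no sidecar-covered drift can be in flight.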