(feat) convert engine call sites to &dyn TableStorage; demote legacy TableStore methods to pub(crate) (#86)

* MR-854: convert engine call sites to &dyn TableStorage; demote legacy methods

Phase 1b: every db.table_store.X(...) call site converts to
db.storage().X(...), reaching the storage layer through the sealed
TableStorage trait (returns &dyn TableStorage). Opaque SnapshotHandle
and StagedHandle replace bare lance::Dataset and Transaction in the
threaded values.

Phase 9: the inherent inline-commit methods on TableStore
(append_batch, merge_insert_batch{,es}, overwrite_batch,
create_btree_index, create_inverted_index) demote from pub to
pub(crate). Their only remaining direct users are table_store.rs
itself and the bulk loader's LoadMode::{Append, Overwrite, Merge}
concurrent fast-paths in loader::write_batch_to_dataset (no
two-phase shape in Lance 4.0.0 — closes after lance#6658 and #6666).

Docs:
- invariants.md \u00a7VI.23: drop "at the writer-trait surface"
  qualifier; staged primitives are now the only engine surface.
- runs.md: residual matrix shrinks to delete_where and
  create_vector_index (the two upstream-blocked residuals).
- forbidden_apis.rs: replace transitional language with the
  current allow-list shape (table_store.rs + loader concurrent
  fast-path only).

Files touched:
- changes/mod.rs, db/omnigraph.rs (+export/optimize/schema_apply/
  table_ops.rs), exec/{merge,mod,mutation,staging}.rs,
  loader/mod.rs, storage_layer.rs, table_store.rs,
  tests/forbidden_apis.rs, docs/{invariants,runs}.md.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* MR-854: replace test-only inline-commit append callers with local Lance helpers

After demoting TableStore::append_batch from pub to pub(crate), the
integration tests in tests/recovery.rs and tests/staged_writes.rs
that previously called store.append_batch(...) directly to simulate
HEAD-ahead-of-manifest drift can no longer access the inherent
method. Replace those calls with small in-test helpers that do a raw
Dataset::append (the same body the inherent method runs).

- tests/helpers/mod.rs gains lance_append_inline (shared helper).
- tests/staged_writes.rs gets a file-local lance_append_inline_local
  (staged_writes.rs does not import helpers::).
- tests/recovery.rs drops the unused TableStore import in the one
  function whose store binding became unused after the conversion.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* MR-854: retrigger CI for flaky Test Workspace job

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* MR-854: convert remaining table_store call sites in export.rs / read_blob

Two leftover `self.table_store.X` / `db.table_store.X` call sites were
missed in the initial sweep — flagged by Devin Review on PR #86. Both
now go through the trait surface:

- `entity_from_snapshot` (db/omnigraph/export.rs): switch from
  `db.table_store.open_snapshot_table` + `db.table_store.scan` to
  `db.storage().open_snapshot_at_table` + `db.storage().scan`.

- `read_blob` (db/omnigraph.rs): replace
  `snapshot.open(table_key)` + `self.table_store.first_row_id_for_filter`
  with `self.storage().open_snapshot_at_table` +
  `self.storage().first_row_id_for_filter`. The follow-up
  `take_blobs` call still needs an `Arc<Dataset>` (it's a Lance blob
  accessor not surfaced through the trait), so we hand off via
  `SnapshotHandle::into_arc()` with a comment.

After this commit, no engine code outside `table_store.rs` reaches the
inherent `TableStore` API — the docs/runs.md and docs/invariants.md
claim is now uniformly true.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* MR-854: post-rebase doc fixes (Lance 6.0.1, MR-A framing, into_dataset note)

Reviewer feedback on the rebased PR:

* docs/dev/writes.md residuals matrix: drop demoted methods from the trait-surface table (now `pub(crate)`); keep only the two genuine trait-surface residuals (`delete_where`, `create_vector_index`); reframe under MR-A (Lance v7.x bump) per docs/dev/lance.md.

* tests/forbidden_apis.rs: update transitional allow-list header to (a) drop the truncate_table mislabel (truncate_table is a Lance Dataset method, not a TableStore method — overwrite_batch's internal call), (b) reframe trait-surface residuals under MR-A / Lance #6666.

* crates/omnigraph/src/storage_layer.rs::SnapshotHandle::{into_arc, into_dataset}: add single-ref invariant doc — both consume Arc via try_unwrap-or-clone; sibling SnapshotHandle clones across an await point force a deep Dataset clone.

* Replace lance-4.0.0 version refs with lance-6.0.1 in active source/test/dev-doc comments (storage_layer.rs, table_store.rs, table_ops.rs, schema_apply.rs, merge.rs, recovery.rs, staged_writes.rs, consistency.rs, docs/dev/execution.md, docs/user/query-language.md). Historical refs in docs/releases/v0.4.1.md and the canonical "Lance 4.0.0 → 6.0.1 migration" line in docs/dev/lance.md left intact.

No engine code changes.

* MR-854: update docs/dev/invariants.md Storage trait row + gap entry

Reviewer feedback: the docs reorg landed; the invariant row now lives in
docs/dev/invariants.md with stable headings (no more numbered §VI.23).

Update two pieces to reflect MR-854 completion:

* Status table 'Storage trait' row: was 'full call-site migration ... incomplete';
  now 'engine call sites all route through db.storage() (MR-854); inline-commit
  inherent methods are pub(crate)-demoted; capability/stat surfaces are roadmap'.

* 'Known Gaps' 'Storage abstraction' entry: was 'older inherent TableStore call
  sites and inline residuals remain'; now names the closed scope (MR-854 — call
  sites migrated, methods demoted, loader fast-paths) and the remaining
  trait-surface residuals under MR-A (Lance v7.x bump) and Lance #6666.

Cross-links to docs/dev/lance.md and docs/dev/writes.md so the framing stays
co-located with the canonical Lance surface tracking.

* MR-854: remove dead inline-commit methods from the storage surface

The loader concurrent fast-path (write_batch_to_dataset) is only reached
for LoadMode::Overwrite — Append/Merge route through MutationStaging — so
its Append/Merge arms were unreachable. Collapse it to overwrite-only and
drop the now-unused mode params, which removes the only callers of:

- TableStorage::append_batch + TableStorage::merge_insert_batches (trait)
- TableStore::merge_insert_batch + merge_insert_batches (inherent)

create_btree_index / create_inverted_index had zero callers anywhere
(scalar index builds use the stage_* primitives). Remove both from the
trait and the inherent impl.

Inherent append_batch stays pub(crate): overwrite_batch and recovery
tests use it. Migrate the one trait-append_batch test caller
(seed_person_row) to stage_append + commit_staged. The merge_insert
FirstSeen-workaround rationale moves from the deleted merge_insert_batch
into stage_merge_insert (now the sole merge path). No behavior change.

Also corrects the inaccurate loader residual comment (the prior text
blamed Lance #6658/#6666, which are the delete and vector-index issues,
for keeping overwrite inline; a stage_overwrite primitive already exists
and schema_apply uses it).

* MR-854: seal db.storage() to staged-only; move residuals to InlineCommitResidual

Split the three remaining inline-commit writes (overwrite_batch,
delete_where, create_vector_index) off the TableStorage trait onto a new
sealed InlineCommitResidual trait, reachable only via the explicit
Omnigraph::storage_inline_residual() accessor. db.storage() now exposes
only staged primitives + reads, so engine code cannot couple a write
with a Lance HEAD advance through the default surface — MR-793 acceptance
§1 ("no public method commits as a side effect of writing") now holds by
construction, not by review + naming.

Call sites moved to storage_inline_residual(): loader overwrite
fast-path, the three mutation delete_where paths, the branch-merge
delete, and the vector-index build. Impl bodies are unchanged (same
delegation to the pub(crate) inherent methods); this is a pure surface
reshape with no behavior change.

The residual trait holds two genuinely upstream-blocked methods
(delete_where -> Lance #6658/v7.x, create_vector_index -> Lance #6666)
plus overwrite_batch, kept for the loader's cross-table bulk-overwrite
concurrency until its staged migration lands (tracked follow-up).

* MR-854 docs: describe the staged-only seal; fix stale Lance index URLs

- writes.md / invariants.md / AGENTS.md: the inline-commit residuals now
  live on InlineCommitResidual behind db.storage_inline_residual(), so
  acceptance §1 holds by construction rather than 'option (b)' per-method
  enumeration. Drop the inaccurate 'until Lance exposes
  Operation::Overwrite { fragments }' claim (that op exists; stage_overwrite
  already builds it) and reframe overwrite_batch as a removable legacy
  residual gated on the loader's bulk-overwrite concurrency.
- forbidden_apis.rs: rewrite the allow-list doc for the split surface.
- lance.md: the index spec pages moved from /format/table/index/ to
  /format/index/ in Lance 6.x (the old paths 404). Fix all 13 URLs.

* MR-854: fix stale lance-4.0.0 comment refs flagged in review

Addresses greptile (exec/merge.rs) and aaltshuler's stale-version blocker:
update lance-4.0.0 -> 6.0.1 in the comment/doc refs within this PR's
footprint (exec/merge.rs, exec/mutation.rs, docs/dev/writes.md). Also
corrects exec/merge.rs to cite lance#6666 (not #6658) for
build_index_metadata_from_segments — that is the vector-index segment-commit
API; #6658 is the two-phase delete. (Pre-existing 4.0.0 refs in untouched
files like architecture.md/storage.md are main's incomplete migration
cleanup, left out of scope.)

* fix(storage): stage loader overwrites

* fix(storage): stage empty schema rewrites

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Ragnor Comerford <ragnor.comerford@gmail.com>
Co-authored-by: Ragnor Comerford <hello@ragnor.co>
This commit is contained in:
devin-ai-integration[bot] 2026-06-09 23:03:08 +02:00 committed by GitHub
parent 171a8c5d13
commit 2c578a60b2
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
29 changed files with 1150 additions and 1245 deletions

View file

@ -186,7 +186,8 @@ op-2 (insert/update) → read committed via Lance + pending via DataFusion
op-N → push batch
─── end of query ───────────────────────────────────────
finalize: per pending table:
concat batches → stage_append OR stage_merge_insert → commit_staged
concat batches → stage_append OR stage_merge_insert OR stage_overwrite
→ commit_staged
publisher: ManifestBatchPublisher::publish (one cross-table CAS)
```
@ -197,9 +198,10 @@ contracts:
- `D₂` parse-time rule: a query is either insert/update-only or
delete-only. Mixed → reject. Deletes still inline-commit (Lance
4.0.0 has no public two-phase delete); D₂ keeps the inline path safe.
- `LoadMode::Overwrite` keeps the inline-commit path
(truncate-then-append doesn't fit the staged shape; overwrite has no
in-flight read-your-writes requirement).
- `LoadMode::Overwrite` uses Lance `Operation::Overwrite` through the
same staged path. Loader validation runs against the replacement
in-memory batches before any `commit_staged`, and the publish window is
covered by `SidecarKind::Load` recovery.
- Read sites consume `TableStore::scan_with_pending`, which Lance-scans
the committed snapshot at the captured `expected_version` and unions
with a DataFusion `MemTable` over the pending batches.

View file

@ -84,7 +84,7 @@ Resolves expression values to literals, converts to typed Arrow arrays (`literal
- `insert` (no `@key`, edges) → accumulate into `MutationStaging.pending` (Append mode); finalize calls `stage_append` once per touched table.
- `insert` (`@key` node) → accumulate into `pending` (Merge mode); finalize calls `stage_merge_insert` once per touched table.
- `update` → scan committed via Lance + pending via DataFusion `MemTable` (read-your-writes), apply assignments, accumulate into `pending` (Merge mode).
- `delete` → still inline-commits via `delete_where` (Lance 4.0.0 has no public two-phase delete); recorded into `MutationStaging.inline_committed`.
- `delete` → still inline-commits via `delete_where` (Lance v6.0.1 has no public two-phase delete; `DeleteBuilder::execute_uncommitted` first ships in v7.0.0-beta.10 — tracked as MR-A in [docs/dev/lance.md](lance.md)); recorded into `MutationStaging.inline_committed`.
**D₂ parse-time rule.** A single mutation query is either insert/update-only or delete-only. Mixed → reject before any I/O. The check fires in `enforce_no_mixed_destructive_constructive(&ir)` inside `execute_named_mutation`.

View file

@ -102,7 +102,7 @@ Use it this way:
| Branch delete | Manifest is the single authority, flipped atomically first; per-table forks + commit-graph branch are derived state, reclaimed best-effort (`force_delete_branch`) with the `cleanup` reconciler as the guaranteed backstop. Reusing a name whose reclaim failed before `cleanup` surfaces an actionable error | [branches-commits.md](../user/branches-commits.md), [maintenance.md](../user/maintenance.md) |
| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema-language.md), [execution.md](execution.md) |
| Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema-language.md) |
| Storage trait | `TableStorage` exists as the sealed staged-write surface; full call-site migration and capability/stat surfaces are incomplete | [writes.md](writes.md), [architecture.md](architecture.md) |
| Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) |
| Index lifecycle | `ensure_indices` is explicit today; reconciler-based convergence is roadmap | [indexes.md](../user/indexes.md), [maintenance.md](../user/maintenance.md) |
| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/query-language.md) |
| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/server.md), [policy.md](../user/policy.md) |
@ -124,9 +124,16 @@ them explicit.
renames. The current compiler still derives type IDs from `kind:name`; this
must be fixed before relying on renamed IDs across accepted schemas.
- **Storage abstraction:** `TableStorage` is present, sealed, and canonical for
staged writes, but older inherent `TableStore` call sites and inline residuals
remain. New write paths should use the staged shape unless a documented Lance
blocker applies.
staged writes. MR-854 sealed it: `db.storage()` exposes only staged primitives
+ reads, and the inline-commit residuals are split onto a separate sealed
`InlineCommitResidual` trait reached via `db.storage_inline_residual()`, so a
new writer cannot couple a write with a HEAD advance through the default
surface. The dead legacy methods (`append_batch` on the trait,
`merge_insert_batch{,es}`, `create_{btree,inverted}_index`) were removed. The
remaining residuals are `delete_where` (gated on MR-A — Lance v7.x bump)
and `create_vector_index` (gated on Lance #6666); see
[lance.md](lance.md) and [writes.md](writes.md). New write paths should use
the staged shape unless a documented Lance blocker applies.
- **Deletes and vector indexes:** `delete_where` and vector index creation still
advance Lance HEAD inline because the required public Lance APIs are missing.
Keep D2 and recovery coverage in place until those residuals are removed.

View file

@ -55,18 +55,18 @@ Adding/changing index types, fixing coverage, debugging FTS or vector recall, de
| Topic | URL |
|---|---|
| Index spec overview | https://lance.org/format/table/index/ |
| BTREE scalar index | https://lance.org/format/table/index/scalar/btree/ |
| Bitmap scalar index | https://lance.org/format/table/index/scalar/bitmap/ |
| Bloom-filter scalar index | https://lance.org/format/table/index/scalar/bloom_filter/ |
| Label-list scalar index | https://lance.org/format/table/index/scalar/label_list/ |
| Zone-map scalar index | https://lance.org/format/table/index/scalar/zonemap/ |
| R-Tree scalar index (spatial) | https://lance.org/format/table/index/scalar/rtree/ |
| Full-text search (FTS) index | https://lance.org/format/table/index/scalar/fts/ |
| N-gram scalar index | https://lance.org/format/table/index/scalar/ngram/ |
| Vector index | https://lance.org/format/table/index/vector/ |
| Fragment-reuse system index | https://lance.org/format/table/index/system/frag_reuse/ |
| MemWAL system index | https://lance.org/format/table/index/system/mem_wal/ |
| Index spec overview | https://lance.org/format/index/ |
| BTREE scalar index | https://lance.org/format/index/scalar/btree/ |
| Bitmap scalar index | https://lance.org/format/index/scalar/bitmap/ |
| Bloom-filter scalar index | https://lance.org/format/index/scalar/bloom_filter/ |
| Label-list scalar index | https://lance.org/format/index/scalar/label_list/ |
| Zone-map scalar index | https://lance.org/format/index/scalar/zonemap/ |
| R-Tree scalar index (spatial) | https://lance.org/format/index/scalar/rtree/ |
| Full-text search (FTS) index | https://lance.org/format/index/scalar/fts/ |
| N-gram scalar index | https://lance.org/format/index/scalar/ngram/ |
| Vector index | https://lance.org/format/index/vector/ |
| Fragment-reuse system index | https://lance.org/format/index/system/frag_reuse/ |
| MemWAL system index | https://lance.org/format/index/system/mem_wal/ |
| HNSW Rust example | https://lance.org/examples/rust/hnsw/ |
| Distributed indexing | https://lance.org/guide/distributed_indexing/ |
| Tokenizer (FTS, n-gram) | https://lance.org/guide/tokenizer/ |
@ -125,7 +125,7 @@ Touching `omnigraph optimize` / `cleanup`, the underlying `compact_files` / `cle
|---|---|
| Read-and-write guide (covers `compact_files`, `cleanup_old_versions`) | https://lance.org/guide/read_and_write/ |
| Performance (compaction tradeoffs) | https://lance.org/guide/performance/ |
| Fragment-reuse index | https://lance.org/format/table/index/system/frag_reuse/ |
| Fragment-reuse index | https://lance.org/format/index/system/frag_reuse/ |
### DataFusion integration

View file

@ -48,7 +48,7 @@ shared by both `mutate_as` and the bulk loader:
touched sub-tables. Cross-table conflicts surface as
`ManifestConflictDetails::ExpectedVersionMismatch`.
- **Deletes still inline-commit.** Lance's `Dataset::delete` is not
exposed as a two-phase op in 4.0.0; deletes go through `delete_where`
exposed as a two-phase op in 6.0.1; deletes go through `delete_where`
immediately and record their post-write state in
`MutationStaging.inline_committed`. The parse-time D₂ rule (below)
prevents inserts/updates from coexisting with deletes in one query,
@ -82,16 +82,14 @@ Three writers have been migrated onto staged primitives:
* **`ensure_indices`** (`db/omnigraph/table_ops.rs::build_indices_on_dataset_for_catalog`)
— scalar indices (BTree, Inverted) now use `stage_create_*_index` +
`commit_staged`. Vector indices stay inline (residual — Lance
`build_index_metadata_from_segments` is `pub(crate)` in 4.0.0;
`build_index_metadata_from_segments` is `pub(crate)` in 6.0.1;
companion ticket to lance-format/lance#6658 needed).
* **`branch_merge::publish_rewritten_merge_table`**
(`exec/merge.rs`) — merge_insert now uses `stage_merge_insert` +
`commit_staged`. Deletes stay inline (Lance #6658 residual).
* **`schema_apply` rewritten_tables** (`db/omnigraph/schema_apply.rs`)
— non-empty rewrites use `stage_overwrite` + `commit_staged`.
Empty-batch rewrites stay inline (Lance `InsertBuilder::execute_uncommitted`
rejects empty data; the empty case is rare and bounded by the
schema-apply lock branch).
— rewrites use `stage_overwrite` + `commit_staged`, including empty-table
rewrites via a zero-fragment Lance `Operation::Overwrite`.
A defense-in-depth integration test (`tests/forbidden_apis.rs`) walks
engine source and fails if non-allow-listed code calls Lance's
@ -106,34 +104,32 @@ the same drift class. Closing it requires either upstream Lance
multi-dataset commit OR the omnigraph-side recovery-on-open reconciler
described in `.context/mr-793-design.md` §15 (deferred to MR-795).
### Inline-commit method residuals on `TableStorage` (MR-793 acceptance §1 option b)
### Inline-commit residuals live on `InlineCommitResidual`, not `db.storage()` (MR-793 acceptance §1, by construction)
MR-793's acceptance criterion §1 ("`TableStore` public API has no method that performs a manifest commit as a side effect of writing") is met **per-method** by enumerating every inline-commit method that remains on the trait surface, naming why it cannot yet be removed, and keeping the residual comment at every call site:
MR-793's acceptance criterion §1 ("`TableStore` (or successor) public API has no method that performs a manifest commit as a side effect of writing") holds **by construction** after MR-854. `db.storage()` (`&dyn TableStorage`) exposes only staged primitives + reads; the inline-commit writes Lance cannot yet stage live on a separate `InlineCommitResidual` trait reached via `Omnigraph::storage_inline_residual()`. A new engine writer cannot couple a write with a Lance HEAD advance through the default surface — it would have to name the residual accessor explicitly. The dead legacy methods (trait `append_batch` / `merge_insert_batches`, inherent `merge_insert_batch{,es}`, `create_{btree,inverted}_index`) were removed; appends/merges and scalar index builds all use the `stage_*` primitives.
| Method on `TableStore` | Inline-commit reason | Closes when |
Two methods remain on `InlineCommitResidual`, each named honestly at its call site:
| Residual method | Inline-commit reason | Closes when |
|---|---|---|
| `delete_where` | `DeleteJob` is `pub(crate)` in lance-4.0.0 — no public two-phase delete API | [lance-format/lance#6658](https://github.com/lance-format/lance/issues/6658) lands and `stage_delete` joins the trait |
| `create_vector_index` | Vector indices take Lance's "segment commit path"; the helper `build_index_metadata_from_segments` is `pub(crate)` | [lance-format/lance#6666](https://github.com/lance-format/lance/issues/6666) lands and `stage_create_vector_index` joins the trait |
| `append_batch` | Legacy inherent method; some engine call sites haven't migrated to `stage_append + commit_staged` yet | MR-793 Phase 1b (call-site conversion) + Phase 9 (demote to `pub(crate)`) |
| `merge_insert_batch` / `merge_insert_batches` | Legacy inherent method | Same — Phase 1b + Phase 9 |
| `overwrite_batch` | Legacy inherent method | Same — Phase 1b + Phase 9 |
| `create_btree_index` (inherent) | Legacy inherent method (the migrated callers use `stage_create_btree_index` + `commit_staged`; the inherent stays for tests / un-migrated paths) | Same — Phase 1b + Phase 9 |
| `create_inverted_index` (inherent) | Same | Same — Phase 1b + Phase 9 + index-class split (MR-848) |
| `truncate_table` (inherent on `TableStore`) | Used by `overwrite_batch` internally | Phase 9 |
| `delete_where` | `DeleteBuilder::execute_uncommitted` is not in Lance v6.0.1 (closed upstream as [#6658](https://github.com/lance-format/lance/issues/6658) but first ships in `v7.0.0-beta.10`); see [docs/dev/lance.md](lance.md) | MR-A: Lance v7.x bump migrates `delete_where` to staged, retires the parse-time D₂ mutation rule, and extends recovery sidecar coverage |
| `create_vector_index` | Vector indices take Lance's "segment commit path"; `build_index_metadata_from_segments` is `pub(crate)` (Lance [#6666](https://github.com/lance-format/lance/issues/6666) still open) | Lance #6666 lands and `stage_create_vector_index` joins the staged surface |
After **lance#6658 + lance#6666 ship + MR-793 Phase 1b + MR-793 Phase 9 all complete**, the trait surface exposes only staged-write primitives + `commit_staged`. Until then this matrix names every residual explicitly, every call site carries a one-line residual comment, and no engine code outside `table_store.rs` is permitted to reach the inline-commit Lance APIs (enforced by the `tests/forbidden_apis.rs` guard).
The `tests/forbidden_apis.rs` guard still catches direct `lance::*` inline-commit misuse outside the storage layer; the trait split makes the staged-only default a type-system guarantee on top of it.
### `LoadMode::Overwrite` residual
### `LoadMode::Overwrite` uses staged Lance `Overwrite`
The bulk loader's Append and Merge modes use the staged-write path
described above. `LoadMode::Overwrite` keeps the legacy inline-commit
path: truncate-then-append doesn't fit the staged shape cleanly in
Lance 4.0.0, and overwrite has no in-flight read-your-writes
requirement (the prior data is being wiped). A mid-overwrite failure
can leave Lance HEAD on a partially-truncated table; the next overwrite
will replace it. Operator-driven (rare in agent workloads); document
permanently until Lance exposes `Operation::Overwrite { fragments }` as
a two-phase op.
The bulk loader's Append, Merge, and Overwrite modes all use the
staged-write path described above. `LoadMode::Overwrite` accumulates
replacement batches in memory, validates node/edge constraints, referential
integrity, and edge cardinality before any Lance HEAD movement, stages
each touched table with Lance `Operation::Overwrite`, then runs
`commit_staged` under the normal `SidecarKind::Load` recovery sidecar
before publishing `__manifest`. `OMNIGRAPH_LOAD_CONCURRENCY` applies to the
fragment-writing stage only; the commit and manifest publish still run
under the per-table write queues. Empty-table overwrite is represented as
a valid zero-fragment Lance `Overwrite` transaction, not as
truncate-then-append.
### Open-time recovery sweep
@ -286,7 +282,7 @@ guarantee — the in-memory accumulator evaporates with the dropped task
and no Lance write was ever issued.
For delete-touching mutations the legacy inline-commit shape is
preserved (Lance has no public two-phase delete in 4.0.0) — the same
preserved (Lance has no public two-phase delete in 6.0.1) — the same
narrow window remains. The parse-time D₂ rule prevents inserts/updates
from coexisting with deletes in one query, so a pure-delete failure
cannot drift any staged-table state. If a delete-only multi-table