Merge remote-tracking branch 'origin/main' into feat/cluster-apply-stage3a

2026-06-12 01:45:14 +02:00 · 2026-06-10 00:45:03 +03:00
commit 69b63c33ac
parent 5e1dede08f cec65b8ef8
33 changed files with 1250 additions and 1249 deletions
--- a/docs/dev/architecture.md
+++ b/docs/dev/architecture.md
@@ -186,7 +186,8 @@ op-2 (insert/update) → read committed via Lance + pending via DataFusion
 op-N → push batch
 ─── end of query ───────────────────────────────────────
 finalize: per pending table:
-   concat batches → stage_append OR stage_merge_insert → commit_staged
+   concat batches → stage_append OR stage_merge_insert OR stage_overwrite
+                  → commit_staged
 publisher: ManifestBatchPublisher::publish (one cross-table CAS)
 ```

@@ -197,9 +198,10 @@ contracts:
 - `D₂` parse-time rule: a query is either insert/update-only or
  delete-only. Mixed → reject. Deletes still inline-commit (Lance
  4.0.0 has no public two-phase delete); D₂ keeps the inline path safe.
- `LoadMode::Overwrite` keeps the inline-commit path
-  (truncate-then-append doesn't fit the staged shape; overwrite has no
-  in-flight read-your-writes requirement).
+- `LoadMode::Overwrite` uses Lance `Operation::Overwrite` through the
+  same staged path. Loader validation runs against the replacement
+  in-memory batches before any `commit_staged`, and the publish window is
+  covered by `SidecarKind::Load` recovery.
 - Read sites consume `TableStore::scan_with_pending`, which Lance-scans
  the committed snapshot at the captured `expected_version` and unions
  with a DataFusion `MemTable` over the pending batches.
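Reviewer sketch for the `LoadMode::Overwrite` contract in the hunk above (validate the replacement batches in memory, then stage, then commit). This is a minimal in-memory model, not the engine's code: `Table`, `StagedOverwrite`, `validate`, and `overwrite_all` are hypothetical names, no Lance API is called, and the sidecar/manifest publish step is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a table's committed state (not a Lance type).
#[derive(Debug, Clone, PartialEq)]
struct Table {
    version: u64,
    rows: Vec<String>,
}

/// Hypothetical staged overwrite: replacement rows held in memory,
/// not yet visible to readers.
struct StagedOverwrite {
    table: String,
    rows: Vec<String>,
}

/// Toy rule standing in for the loader's constraint checks.
fn validate(rows: &[String]) -> Result<(), String> {
    match rows.iter().find(|r| r.is_empty()) {
        Some(_) => Err("empty row id".to_string()),
        None => Ok(()),
    }
}

/// Validate every pending table first, then stage, then commit.
/// If any table fails validation, no table's version moves.
fn overwrite_all(
    store: &mut BTreeMap<String, Table>,
    pending: BTreeMap<String, Vec<String>>,
) -> Result<(), String> {
    // Phase 1: validation runs against the in-memory batches only.
    for (name, rows) in &pending {
        validate(rows).map_err(|e| format!("{name}: {e}"))?;
    }
    // Phase 2: stage each overwrite (an empty Vec is a valid replacement).
    let staged: Vec<StagedOverwrite> = pending
        .into_iter()
        .map(|(table, rows)| StagedOverwrite { table, rows })
        .collect();
    // Phase 3: commit the staged work; only now do versions advance.
    for s in staged {
        let entry = store
            .entry(s.table)
            .or_insert(Table { version: 0, rows: vec![] });
        entry.rows = s.rows;
        entry.version += 1;
    }
    Ok(())
}
```

The point of the ordering is that a validation failure on one table leaves every table's version untouched, which the old truncate-then-append shape could not promise. An empty replacement goes through the same path as any other overwrite.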
--- a/docs/dev/cluster-axioms.md
+++ b/docs/dev/cluster-axioms.md
@@ -24,6 +24,12 @@ consequences that follow from them.
 > Terraform-style JSON documents plus backend lock/CAS, not Lance control-plane
 > datasets. Lance remains a possible later backend only if row-level history or
 > queryability justifies the extra machinery.
+>
+> **Revision 2026-06-09 — single ownership during migration.** Axiom **15**
+> added: while `omnigraph.yaml` and the cluster catalog coexist, every fact has
+> exactly one owner at a time — coexistence is a **mode switch, never a merge**.
+> `omnigraph.yaml` does not get replaced; its job description shrinks to the
+> permanent per-operator layer.

 ---

@@ -72,6 +78,8 @@ invoke_query. This axiom is the target control-plane rule, not a statement
 about today's server catalog. -->
 **14. Exposure is a policy decision, not a config flag.** Target design: which stored queries (and the tools/dashboards built on them) an actor may **list or invoke** is decided by the policy layer (Cedar: `invoke_query` + catalog visibility), not by a per-query `expose:` boolean. The registry only says a query *exists* (name → file); **policy says who may see and run it**, so the MCP catalog (`GET /queries`) becomes each actor's policy-permitted set. This supersedes the engine's current `mcp.expose` flag only after per-query `invoke_query` scope and Cedar-filtered catalog listing land; until then, proposals must state the compatibility bridge to today's `mcp.expose` + coarse invocation gate.

+**15. Every fact has exactly one owner at a time; coexistence is a mode switch, never a merge.** `cluster.yaml` is not `omnigraph.yaml` v2 — the two documents end with disjoint jobs, and only the *shared-truth* parts of today's `omnigraph.yaml` (the set of graphs, stored-query registry, policy wiring, server boot source) migrate to the cluster catalog. The per-operator parts — connection/cluster selection, the operator's own credential reference, active graph/branch context, CLI ergonomics — are per-operator *by nature* (Sarah's and Bob's differ) and stay in the per-operator layer permanently; plan a **shrinking job description** for `omnigraph.yaml`, not an exit. During the migration window each fact is read from exactly one source at a time: a deployment serves from `omnigraph.yaml` **or** boots from cluster state (an exclusive mode switch), never from a precedence-merge of both. Two readers for one fact is the brittle-backcompat failure mode — it is the deny-list's "state that drifts from what it can be derived from" wearing a compatibility costume. Any compatibility bridge must name its replacement and its removal phase (the `mcp.expose` → policy-owned exposure bridge of axiom 14 is the template); bridges that accumulate without an exit are rejected at review.
+
 ---

 ## The one-line compression
@@ -82,7 +90,7 @@ about today's server catalog. -->

 ## How to use this file

- **Reviewing a proposal:** walk axioms 0–14; any conflict is the burden of the proposer to justify. The most common tensions:
+- **Reviewing a proposal:** walk axioms 0–15; any conflict is the burden of the proposer to justify. The most common tensions:
  - Treating the *running system* as the source of truth for **intent** → axioms 2, 4 (intent lives in config).
  - Treating state as a throwaway derivation rather than an authoritative, locked, backend-held ledger → axiom 5, 12.
  - A runtime config-mutation API instead of declarative apply → axiom 3.
@@ -94,4 +102,5 @@ about today's server catalog. -->
  - A secret value (token, embedding key, pipeline source credential) inline in config instead of in the gitignored `.env` file → axiom 10.
  - A per-query `expose:`/visibility flag in target-state cluster config instead of governing list/invoke in policy; or failing to account for today's `mcp.expose` compatibility bridge → axiom 14.
  - Shipping `apply` before hermetic `validate` + read-only `plan` tests, or shipping graph/schema-moving apply before recovery tests for the graph/resource-moved-before-cluster-publish gap → axiom 5 and axiom 12.
+  - Reading one fact from both `omnigraph.yaml` and the cluster catalog with precedence rules (a merge instead of a mode switch), migrating per-operator concerns into shared cluster config, or adding a compatibility bridge with no named replacement and removal phase → axiom 15.
 - **Citing:** reference axioms by number in PRs and review comments so the rationale is stable across renames and refactors.
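Reviewer sketch for axiom 15's "mode switch, never a merge". It assumes a hypothetical `BootSource` enum and a hypothetical `SharedFacts` struct; neither name appears in the docs, and the real set of shared-truth facts is larger than the two fields shown.

```rust
/// Hypothetical: the single source a deployment reads shared-truth facts from.
#[derive(Debug, Clone, PartialEq)]
enum BootSource {
    OmnigraphYaml,
    ClusterState,
}

/// Hypothetical subset of the facts axiom 15 calls "shared truth".
#[derive(Debug, Clone, PartialEq)]
struct SharedFacts {
    graphs: Vec<String>,
    queries: Vec<String>,
}

/// Every fact comes from the selected source; the other source is never
/// consulted, so there is no precedence rule and no per-key fallback.
fn resolve(mode: &BootSource, yaml: &SharedFacts, cluster: &SharedFacts) -> SharedFacts {
    match mode {
        BootSource::OmnigraphYaml => yaml.clone(),
        BootSource::ClusterState => cluster.clone(),
    }
}
```

Under this shape, a cluster-booted deployment with an empty query registry stays empty even if `omnigraph.yaml` lists queries. A merged reader would have silently filled the gap, which is the "which file won?" failure the axiom rejects.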
--- a/docs/dev/cluster-config-implementation-spec.md
+++ b/docs/dev/cluster-config-implementation-spec.md
@@ -64,6 +64,21 @@ is trying to create. -->
   identity. It is not committed into `cluster.yaml`.
 6. `mcp.expose` remains supported in current `omnigraph.yaml` until the
   per-query policy replacement ships.
+7. **Single ownership (axiom 15).** While `omnigraph.yaml` and the cluster
+   catalog coexist, each fact is read from exactly one source at a time.
+   Phase 5 server boot is an exclusive mode switch — boot from cluster state
+   XOR from `omnigraph.yaml` — never a precedence-merge of both. No phase may
+   introduce a surface that reads the same fact (graph set, query registry,
+   policy wiring, bind address) from both sources with tie-break rules.
+8. **`omnigraph.yaml` shrinks; it does not get deprecated.** Its terminal role
+   is the per-operator layer: connection/cluster selection, the operator's
+   credential reference, active graph/branch context, CLI ergonomics, and
+   purely personal aliases (target home: the operator's global config dir per
+   RFC-002). Shared-truth keys migrate to `cluster.yaml`; per-operator keys
+   never do.
+9. **Bridges carry sunsets.** Every compatibility bridge names its replacement
+   and the phase that removes it (`mcp.expose` → Phase 6 policy-owned exposure
+   is the template). A bridge without an exit is a review-blocking finding.

 ## Terraform-Aligned Schema Validation

@@ -335,8 +350,6 @@ Target Phase-1 cluster-root layout:
      <ulid>.json
    recoveries/
      <ulid>.json
-    recovery/
-      <ulid>.json
    resources/
      query/<graph>/<name>/<digest>.gq
      policy/<name>/<digest>.yaml
@@ -586,7 +599,9 @@ replacement would make every invariant harder to audit. -->

 - Allow server startup from cluster state.
 - Add status and catalog endpoints as needed.
- Keep the current `omnigraph.yaml` startup path as compatibility mode.
+- Keep the current `omnigraph.yaml` startup path as compatibility mode — an
+  **exclusive mode switch** per deployment (cluster state XOR `omnigraph.yaml`),
+  never a merged read of both (Compatibility Stance #7, axiom 15).
 - Regenerate OpenAPI for any HTTP surface.

 ### Phase 6: Policy-Owned Query Exposure
--- a/docs/dev/cluster-config-specs.md
+++ b/docs/dev/cluster-config-specs.md
@@ -387,6 +387,65 @@ This proposal:

 The connection/credential/preference layer remains per operator: it points at a cluster, resolves that operator's identity, and holds personal ergonomics. The cluster config stays shared, secret-free, and reviewable; the state ledger stays authoritative and locked.

+### Migration model: single ownership, mode switch, shrinking job description (axiom 15)
+
+`omnigraph.yaml` is not being replaced; its **job description shrinks**. Only the
+shared-truth parts of its current role migrate to the cluster catalog (the set of
+graphs, the stored-query registry, policy wiring, the server boot source). The
+per-operator parts are per-operator *by nature* — Sarah's and Bob's differ — and
+keep `omnigraph.yaml`/the per-operator layer as a permanent, well-defined home.
+
+While both exist, **each fact has exactly one owner at any moment, and
+coexistence is a mode switch, never a merge**. The brittle version of backward
+compatibility — the server reading graphs from `omnigraph.yaml` *and* from
+cluster state with precedence rules gluing them together — is rejected outright:
+two readers for one truth means every bug becomes "which file won?" and every
+feature pays the tax twice. The realistic timeline has three windows:
+
+1. **Now → Phase 4 (no conflict).** Cluster apply writes only to its own catalog
+   (`__cluster/`); `omnigraph.yaml` serves traffic. `Applied` status must
+   visibly mean "recorded in the cluster catalog, not yet serving" so the
+   overlap is loud, not hidden.
+2. **Phase 5 (the mode switch).** A deployment opts into booting from cluster
+   state; `omnigraph.yaml`'s server-role keys become inert *for that
+   deployment*. Exclusive — boot from cluster state XOR `omnigraph.yaml` — with
+   no key-level aliasing and no merged precedence.
+3. **Phase 6+ (bridges with sunsets).** Targeted compatibility bridges are
+   allowed only with a named replacement and a removal phase; `mcp.expose` →
+   policy-owned exposure is the template. Bridges that accumulate without an
+   exit are review-rejected.
+
+Key-by-key compatibility inside one evolving file is the expensive kind of
+backcompat (the v1 `omnigraph.yaml` reshape's `--target`/legacy-key regressions
+are the in-repo cautionary tale); resource-ownership seams between two files
+with a mode switch is the cheap kind. Police the single-owner rule in every
+Phase 3–6 PR: a proposal that merges the two sources for one fact is the
+deny-list's "state that drifts from what it can be derived from" wearing a
+compatibility costume.
+
+### The per-operator layer: contents and destination
+
+The per-operator layer must be **complete** — everything an operator needs to
+work against any cluster from any directory, and nothing that two operators must
+agree on:
+
+| Per-operator concern | Today | Target |
+|---|---|---|
+| Connection (which cluster/server, named endpoints) | `omnigraph.yaml` `graphs.<name>` URIs / `servers:` refs | global config, per-operator |
+| Operator credential **reference** (`bearer_token_env`, env-file lookup) | `omnigraph.yaml` + `.env` | global config references; secret values stay in env/`.env`, never in any config |
+| Active context (current graph/branch selection) | ad-hoc per-command flags / `defaults` | global state layer (e.g. `omnigraph use`), explicitly **not** the cluster state ledger (axiom 5's "state" is the applied-cluster ledger, not a personal selection) |
+| CLI ergonomics (output format, table layout) | `omnigraph.yaml` `cli:`/`defaults:` | global config, per-operator |
+| Personal command shortcuts (purely personal aliases) | `omnigraph.yaml` `aliases:` | global config; *shared* aliases (team vocabulary) are cluster config — see the aliases split note above |
+
+Destination: this layer belongs in the operator's **global config dir**
+(`~/.omnigraph`, per the RFC-002 global-first layered-config direction —
+global config + active-context state file), not in a repo-committed file, so it
+survives `git clone`, works from any directory, and never collides with the
+shared cluster folder. The RFC-002 layering implementation is currently parked
+(PRs #139/#162 closed over review findings), but the *boundary* it draws is the
+one this spec depends on: per-operator → global dir; shared deployment intent →
+the cluster config folder; deployed reality → the state ledger.
+
 Implementation gate: the Terraform-style workflow must be testable in order.
 `cluster validate` must catch bad config before any apply path exists;
 read-only `cluster plan` must have deterministic structured-plan tests before
--- a/docs/dev/execution.md
+++ b/docs/dev/execution.md
@@ -84,7 +84,7 @@ Resolves expression values to literals, converts to typed Arrow arrays (`literal
 - `insert` (no `@key`, edges) → accumulate into `MutationStaging.pending` (Append mode); finalize calls `stage_append` once per touched table.
 - `insert` (`@key` node) → accumulate into `pending` (Merge mode); finalize calls `stage_merge_insert` once per touched table.
 - `update` → scan committed via Lance + pending via DataFusion `MemTable` (read-your-writes), apply assignments, accumulate into `pending` (Merge mode).
- `delete` → still inline-commits via `delete_where` (Lance 4.0.0 has no public two-phase delete); recorded into `MutationStaging.inline_committed`.
+- `delete` → still inline-commits via `delete_where` (Lance v6.0.1 has no public two-phase delete; `DeleteBuilder::execute_uncommitted` first ships in v7.0.0-beta.10 — tracked as MR-A in [docs/dev/lance.md](lance.md)); recorded into `MutationStaging.inline_committed`.

 **D₂ parse-time rule.** A single mutation query is either insert/update-only or delete-only. Mixed → reject before any I/O. The check fires in `enforce_no_mixed_destructive_constructive(&ir)` inside `execute_named_mutation`.
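
Reviewer sketch of the D₂ check described above. The real check is `enforce_no_mixed_destructive_constructive(&ir)`, whose IR types are not shown in this diff; the `Op` enum and `check_no_mixed` function below are hypothetical stand-ins that only illustrate the rule.

```rust
/// Hypothetical op kinds; the engine's real IR is not part of this diff.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Insert,
    Update,
    Delete,
}

/// Reject a mutation that mixes deletes with inserts/updates.
/// Pure function over the parsed ops, so it can run before any I/O.
fn check_no_mixed(ops: &[Op]) -> Result<(), String> {
    let has_delete = ops.iter().any(|op| *op == Op::Delete);
    let has_constructive = ops
        .iter()
        .any(|op| matches!(op, Op::Insert | Op::Update));
    if has_delete && has_constructive {
        Err("mixed delete and insert/update in one mutation".to_string())
    } else {
        Ok(())
    }
}
```

Because deletes still commit inline while inserts and updates are staged, keeping the two classes in separate queries means a failed delete can never leave staged work half-applied.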

--- a/docs/dev/invariants.md
+++ b/docs/dev/invariants.md
@@ -102,7 +102,7 @@ Use it this way:
 | Branch delete | Manifest is the single authority, flipped atomically first; per-table forks + commit-graph branch are derived state, reclaimed best-effort (`force_delete_branch`) with the `cleanup` reconciler as the guaranteed backstop. Reusing a name whose reclaim failed before `cleanup` surfaces an actionable error | [branches-commits.md](../user/branches-commits.md), [maintenance.md](../user/maintenance.md) |
 | Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema-language.md), [execution.md](execution.md) |
 | Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema-language.md) |
-| Storage trait | `TableStorage` exists as the sealed staged-write surface; full call-site migration and capability/stat surfaces are incomplete | [writes.md](writes.md), [architecture.md](architecture.md) |
+| Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) |
 | Index lifecycle | `ensure_indices` is explicit today; reconciler-based convergence is roadmap | [indexes.md](../user/indexes.md), [maintenance.md](../user/maintenance.md) |
 | Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/query-language.md) |
 | Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/server.md), [policy.md](../user/policy.md) |
@@ -124,9 +124,16 @@ them explicit.
  renames. The current compiler still derives type IDs from `kind:name`; this
  must be fixed before relying on renamed IDs across accepted schemas.
 - **Storage abstraction:** `TableStorage` is present, sealed, and canonical for
-  staged writes, but older inherent `TableStore` call sites and inline residuals
-  remain. New write paths should use the staged shape unless a documented Lance
-  blocker applies.
+  staged writes. MR-854 sealed it: `db.storage()` exposes only staged primitives
+  + reads, and the inline-commit residuals are split onto a separate sealed
+  `InlineCommitResidual` trait reached via `db.storage_inline_residual()`, so a
+  new writer cannot couple a write with a HEAD advance through the default
+  surface. The dead legacy methods (`append_batch` on the trait,
+  `merge_insert_batch{,es}`, `create_{btree,inverted}_index`) were removed. The
+  remaining residuals are `delete_where` (gated on MR-A — Lance v7.x bump)
+  and `create_vector_index` (gated on Lance #6666); see
+  [lance.md](lance.md) and [writes.md](writes.md). New write paths should use
+  the staged shape unless a documented Lance blocker applies.
 - **Deletes and vector indexes:** `delete_where` and vector index creation still
  advance Lance HEAD inline because the required public Lance APIs are missing.
  Keep D2 and recovery coverage in place until those residuals are removed.
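Reviewer sketch of the accessor split described in the storage-abstraction bullet above. All names are hypothetical stand-ins (`StagedStorage` for `TableStorage`, `InlineResidual` for `InlineCommitResidual`, a counter for Lance HEAD), and the sealing itself is omitted to keep the example short.

```rust
/// Hypothetical staged-only surface: writes accumulate, HEAD moves on commit.
pub trait StagedStorage {
    fn stage_append(&mut self, rows: usize);
    fn commit_staged(&mut self);
}

/// Hypothetical residual surface: the write advances HEAD immediately.
pub trait InlineResidual {
    fn delete_where(&mut self, rows: usize);
}

#[derive(Default)]
pub struct Store {
    head: u64,
    rows: usize,
    staged: usize,
}

impl StagedStorage for Store {
    fn stage_append(&mut self, rows: usize) {
        self.staged += rows; // HEAD untouched
    }
    fn commit_staged(&mut self) {
        self.rows += self.staged;
        self.staged = 0;
        self.head += 1;
    }
}

impl InlineResidual for Store {
    fn delete_where(&mut self, rows: usize) {
        self.rows = self.rows.saturating_sub(rows);
        self.head += 1; // write and HEAD advance are coupled
    }
}

#[derive(Default)]
pub struct Db {
    store: Store,
}

impl Db {
    /// Default accessor: only staged primitives are reachable through it.
    pub fn storage(&mut self) -> &mut dyn StagedStorage {
        &mut self.store
    }
    /// Inline-commit writes require naming this accessor explicitly.
    pub fn storage_inline_residual(&mut self) -> &mut dyn InlineResidual {
        &mut self.store
    }
    pub fn head(&self) -> u64 {
        self.store.head
    }
    pub fn rows(&self) -> usize {
        self.store.rows
    }
}
```

With this split, `db.storage().delete_where(..)` does not compile, so an inline commit cannot be reached by accident through the default surface; a caller has to opt in by name.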
--- a/docs/dev/lance.md
+++ b/docs/dev/lance.md
@@ -55,18 +55,18 @@ Adding/changing index types, fixing coverage, debugging FTS or vector recall, de

 | Topic | URL |
 |---|---|
-| Index spec overview | https://lance.org/format/table/index/ |
-| BTREE scalar index | https://lance.org/format/table/index/scalar/btree/ |
-| Bitmap scalar index | https://lance.org/format/table/index/scalar/bitmap/ |
-| Bloom-filter scalar index | https://lance.org/format/table/index/scalar/bloom_filter/ |
-| Label-list scalar index | https://lance.org/format/table/index/scalar/label_list/ |
-| Zone-map scalar index | https://lance.org/format/table/index/scalar/zonemap/ |
-| R-Tree scalar index (spatial) | https://lance.org/format/table/index/scalar/rtree/ |
-| Full-text search (FTS) index | https://lance.org/format/table/index/scalar/fts/ |
-| N-gram scalar index | https://lance.org/format/table/index/scalar/ngram/ |
-| Vector index | https://lance.org/format/table/index/vector/ |
-| Fragment-reuse system index | https://lance.org/format/table/index/system/frag_reuse/ |
-| MemWAL system index | https://lance.org/format/table/index/system/mem_wal/ |
+| Index spec overview | https://lance.org/format/index/ |
+| BTREE scalar index | https://lance.org/format/index/scalar/btree/ |
+| Bitmap scalar index | https://lance.org/format/index/scalar/bitmap/ |
+| Bloom-filter scalar index | https://lance.org/format/index/scalar/bloom_filter/ |
+| Label-list scalar index | https://lance.org/format/index/scalar/label_list/ |
+| Zone-map scalar index | https://lance.org/format/index/scalar/zonemap/ |
+| R-Tree scalar index (spatial) | https://lance.org/format/index/scalar/rtree/ |
+| Full-text search (FTS) index | https://lance.org/format/index/scalar/fts/ |
+| N-gram scalar index | https://lance.org/format/index/scalar/ngram/ |
+| Vector index | https://lance.org/format/index/vector/ |
+| Fragment-reuse system index | https://lance.org/format/index/system/frag_reuse/ |
+| MemWAL system index | https://lance.org/format/index/system/mem_wal/ |
 | HNSW Rust example | https://lance.org/examples/rust/hnsw/ |
 | Distributed indexing | https://lance.org/guide/distributed_indexing/ |
 | Tokenizer (FTS, n-gram) | https://lance.org/guide/tokenizer/ |
@@ -125,7 +125,7 @@ Touching `omnigraph optimize` / `cleanup`, the underlying `compact_files` / `cle
 |---|---|
 | Read-and-write guide (covers `compact_files`, `cleanup_old_versions`) | https://lance.org/guide/read_and_write/ |
 | Performance (compaction tradeoffs) | https://lance.org/guide/performance/ |
-| Fragment-reuse index | https://lance.org/format/table/index/system/frag_reuse/ |
+| Fragment-reuse index | https://lance.org/format/index/system/frag_reuse/ |

 ### DataFusion integration

--- a/docs/dev/writes.md
+++ b/docs/dev/writes.md
@@ -48,7 +48,7 @@ shared by both `mutate_as` and the bulk loader:
  touched sub-tables. Cross-table conflicts surface as
  `ManifestConflictDetails::ExpectedVersionMismatch`.
 - **Deletes still inline-commit.** Lance's `Dataset::delete` is not
-  exposed as a two-phase op in 4.0.0; deletes go through `delete_where`
+  exposed as a two-phase op in 6.0.1; deletes go through `delete_where`
  immediately and record their post-write state in
  `MutationStaging.inline_committed`. The parse-time D₂ rule (below)
  prevents inserts/updates from coexisting with deletes in one query,
@@ -82,16 +82,14 @@ Three writers have been migrated onto staged primitives:
 * **`ensure_indices`** (`db/omnigraph/table_ops.rs::build_indices_on_dataset_for_catalog`)
  — scalar indices (BTree, Inverted) now use `stage_create_*_index` +
  `commit_staged`. Vector indices stay inline (residual — Lance
-  `build_index_metadata_from_segments` is `pub(crate)` in 4.0.0;
+  `build_index_metadata_from_segments` is `pub(crate)` in 6.0.1;
  companion ticket to lance-format/lance#6658 needed).
 * **`branch_merge::publish_rewritten_merge_table`**
  (`exec/merge.rs`) — merge_insert now uses `stage_merge_insert` +
  `commit_staged`. Deletes stay inline (Lance #6658 residual).
 * **`schema_apply` rewritten_tables** (`db/omnigraph/schema_apply.rs`)
-  — non-empty rewrites use `stage_overwrite` + `commit_staged`.
-  Empty-batch rewrites stay inline (Lance `InsertBuilder::execute_uncommitted`
-  rejects empty data; the empty case is rare and bounded by the
-  schema-apply lock branch).
+  — rewrites use `stage_overwrite` + `commit_staged`, including empty-table
+  rewrites via a zero-fragment Lance `Operation::Overwrite`.

 A defense-in-depth integration test (`tests/forbidden_apis.rs`) walks
 engine source and fails if non-allow-listed code calls Lance's
@@ -106,34 +104,32 @@ the same drift class. Closing it requires either upstream Lance
 multi-dataset commit OR the omnigraph-side recovery-on-open reconciler
 described in `.context/mr-793-design.md` §15 (deferred to MR-795).

-### Inline-commit method residuals on `TableStorage` (MR-793 acceptance §1 option b)
+### Inline-commit residuals live on `InlineCommitResidual`, not `db.storage()` (MR-793 acceptance §1, by construction)

-MR-793's acceptance criterion §1 ("`TableStore` public API has no method that performs a manifest commit as a side effect of writing") is met **per-method** by enumerating every inline-commit method that remains on the trait surface, naming why it cannot yet be removed, and keeping the residual comment at every call site:
+MR-793's acceptance criterion §1 ("`TableStore` (or successor) public API has no method that performs a manifest commit as a side effect of writing") holds **by construction** after MR-854. `db.storage()` (`&dyn TableStorage`) exposes only staged primitives + reads; the inline-commit writes Lance cannot yet stage live on a separate `InlineCommitResidual` trait reached via `Omnigraph::storage_inline_residual()`. A new engine writer cannot couple a write with a Lance HEAD advance through the default surface — it would have to name the residual accessor explicitly. The dead legacy methods (trait `append_batch` / `merge_insert_batches`, inherent `merge_insert_batch{,es}`, `create_{btree,inverted}_index`) were removed; appends/merges and scalar index builds all use the `stage_*` primitives.

-| Method on `TableStore` | Inline-commit reason | Closes when |
+Two methods remain on `InlineCommitResidual`, each named honestly at its call site:
+
+| Residual method | Inline-commit reason | Closes when |
 |---|---|---|
-| `delete_where` | `DeleteJob` is `pub(crate)` in lance-4.0.0 — no public two-phase delete API | [lance-format/lance#6658](https://github.com/lance-format/lance/issues/6658) lands and `stage_delete` joins the trait |
-| `create_vector_index` | Vector indices take Lance's "segment commit path"; the helper `build_index_metadata_from_segments` is `pub(crate)` | [lance-format/lance#6666](https://github.com/lance-format/lance/issues/6666) lands and `stage_create_vector_index` joins the trait |
-| `append_batch` | Legacy inherent method; some engine call sites haven't migrated to `stage_append + commit_staged` yet | MR-793 Phase 1b (call-site conversion) + Phase 9 (demote to `pub(crate)`) |
-| `merge_insert_batch` / `merge_insert_batches` | Legacy inherent method | Same — Phase 1b + Phase 9 |
-| `overwrite_batch` | Legacy inherent method | Same — Phase 1b + Phase 9 |
-| `create_btree_index` (inherent) | Legacy inherent method (the migrated callers use `stage_create_btree_index` + `commit_staged`; the inherent stays for tests / un-migrated paths) | Same — Phase 1b + Phase 9 |
-| `create_inverted_index` (inherent) | Same | Same — Phase 1b + Phase 9 + index-class split (MR-848) |
-| `truncate_table` (inherent on `TableStore`) | Used by `overwrite_batch` internally | Phase 9 |
+| `delete_where` | `DeleteBuilder::execute_uncommitted` is not in Lance v6.0.1 (closed upstream as [#6658](https://github.com/lance-format/lance/issues/6658) but first ships in `v7.0.0-beta.10`); see [docs/dev/lance.md](lance.md) | MR-A: Lance v7.x bump migrates `delete_where` to staged, retires the parse-time D₂ mutation rule, and extends recovery sidecar coverage |
+| `create_vector_index` | Vector indices take Lance's "segment commit path"; `build_index_metadata_from_segments` is `pub(crate)` (Lance [#6666](https://github.com/lance-format/lance/issues/6666) still open) | Lance #6666 lands and `stage_create_vector_index` joins the staged surface |

-After **lance#6658 + lance#6666 ship + MR-793 Phase 1b + MR-793 Phase 9 all complete**, the trait surface exposes only staged-write primitives + `commit_staged`. Until then this matrix names every residual explicitly, every call site carries a one-line residual comment, and no engine code outside `table_store.rs` is permitted to reach the inline-commit Lance APIs (enforced by the `tests/forbidden_apis.rs` guard).
+The `tests/forbidden_apis.rs` guard still catches direct `lance::*` inline-commit misuse outside the storage layer; the trait split makes the staged-only default a type-system guarantee on top of it.

-### `LoadMode::Overwrite` residual
+### `LoadMode::Overwrite` uses staged Lance `Overwrite`

-The bulk loader's Append and Merge modes use the staged-write path
-described above. `LoadMode::Overwrite` keeps the legacy inline-commit
-path: truncate-then-append doesn't fit the staged shape cleanly in
-Lance 4.0.0, and overwrite has no in-flight read-your-writes
-requirement (the prior data is being wiped). A mid-overwrite failure
-can leave Lance HEAD on a partially-truncated table; the next overwrite
-will replace it. Operator-driven (rare in agent workloads); document
-permanently until Lance exposes `Operation::Overwrite { fragments }` as
-a two-phase op.
+The bulk loader's Append, Merge, and Overwrite modes all use the
+staged-write path described above. `LoadMode::Overwrite` accumulates
+replacement batches in memory, validates node/edge constraints, referential
+integrity, and edge cardinality before any Lance HEAD movement, stages
+each touched table with Lance `Operation::Overwrite`, then runs
+`commit_staged` under the normal `SidecarKind::Load` recovery sidecar
+before publishing `__manifest`. `OMNIGRAPH_LOAD_CONCURRENCY` applies to the
+fragment-writing stage only; the commit and manifest publish still run
+under the per-table write queues. Empty-table overwrite is represented as
+a valid zero-fragment Lance `Overwrite` transaction, not as
+truncate-then-append.

 ### Open-time recovery sweep

@@ -286,7 +282,7 @@ guarantee — the in-memory accumulator evaporates with the dropped task
 and no Lance write was ever issued.

 For delete-touching mutations the legacy inline-commit shape is
-preserved (Lance has no public two-phase delete in 4.0.0) — the same
+preserved (Lance has no public two-phase delete in 6.0.1) — the same
 narrow window remains. The parse-time D₂ rule prevents inserts/updates
 from coexisting with deletes in one query, so a pure-delete failure
 cannot drift any staged-table state. If a delete-only multi-table