address PR #70 bot review (Cubic + Cursor): 7 inline + failpoint test + invariants notes

Cubic findings:
* `tests/forbidden_apis.rs`: expand `FORBIDDEN_PATTERNS` with `Dataset::write`
  / `Dataset::append` / `Dataset::delete` / `Dataset::merge_insert` /
  `Dataset::add_columns` / `update_columns` / `drop_columns` /
  `truncate_table` / `restore` and the bare `.merge_insert(` /
  `.add_columns(` / `.update_columns(` / `.drop_columns(` /
  `.truncate_table(` method patterns. Deliberately avoid `.append(` /
  `.delete(` / `.write(` (over-match `Vec::append`, `.delete_branch(`,
  arrow-array `.append(`, etc.). Allow-list `commit_graph.rs` and
  `graph_coordinator.rs` — they're manifest-layer infra that legitimately
  uses `Dataset::write` for system tables.
* `schema_apply.rs:253`: pass `entry.table_branch.as_deref()` (not
  `None`) to `open_dataset_head_for_write` for consistency with the
  sibling `indexed_tables` block. Schema apply rejects non-main
  branches at the lock-acquire step today, so behavior is unchanged;
  this is a defensive consistency fix that survives a future relaxation
  of the lock check.
* `storage_layer.rs:131` doc: was `Vec<&StagedWrite>` with lifetime
  claim; actually returns `Vec<StagedWrite>` (cloned). Fixed.
* `AGENTS.md:201` capability matrix row + `storage_layer.rs:1` module
  doc: softened the "stage_* + commit_staged are the only paths" /
  "trait funnels every write" overclaim. Inline-commit residuals
  (`delete_where`, `create_vector_index`) remain on the trait pending
  upstream Lance work (#6658, #6666); legacy `append_batch` etc.
  remain pending Phase 1b / Phase 9. Module doc now describes the
  current transitional state honestly.

Cursor Bugbot findings:
* `storage_layer.rs:360`: trait `delete_where` consumed `SnapshotHandle`
  but returned only `DeleteState`, dropping the post-delete dataset.
  Future callers migrating from the inherent `&mut Dataset` API would
  lose the post-delete dataset state needed for indexing /
  `table_state` queries. Fixed: returns `(SnapshotHandle, DeleteState)`
  matching `append_batch` / `overwrite_batch` shape.
* `storage_layer.rs:824`: removed dead `_scanner_type_marker` fn and
  the unused `Scanner` import (the marker existed only to suppress an
  unused-import warning — fixing the import is the cleaner answer).

Engine-level Phase A failpoint test (closes the partial-criterion
flagged in Cubic's acceptance-criteria checklist):
* `db/omnigraph/table_ops.rs::stage_and_commit_btree`: instrumented
  with `crate::failpoints::maybe_fail("ensure_indices.post_stage_pre_commit_btree")`
  between `stage_create_btree_index` and `commit_staged`.
* `tests/failpoints.rs::ensure_indices_phase_a_btree_failure_leaves_existing_tables_writable`:
  triggers the failpoint via a schema-apply that adds a new node type;
  proves that existing tables are unaffected (Person mutation succeeds
  after the failed apply) — i.e. Phase A failure leaves no Lance-HEAD
  drift on tables outside the failed `added_tables` iteration.

`docs/invariants.md` transitional notes:
* §VI.23 (atomicity per query): annotated as upheld at the
  writer-trait surface for inserts / updates / scalar-index builds /
  merge_insert / overwrite after MR-793 PR #70. Per-table
  commit_staged → manifest publish window remains; closing requires
  MR-847's recovery-on-open reconciler. `delete_where` and
  `create_vector_index` remain inline pending lance#6658 / #6666.
* §VII.35 (reconciler pattern): annotated as partial — staged
  primitives are the building blocks; the reconciler task itself is
  MR-848.
* §VIII.45 (reference impl per trait): `TableStorage` has its primary
  impl on `TableStore` with opaque-handle signatures; no test impl
  yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Ragnor Comerford 2026-05-02 18:47:07 +02:00
parent b87be5e9f0
commit 9b0920b5da
No known key found for this signature in database
7 changed files with 161 additions and 34 deletions

View file

@ -248,9 +248,20 @@ pub(super) async fn apply_schema_with_lock(
let mut target_ds = if batch.num_rows() == 0 {
TableStore::overwrite_dataset(&dataset_uri, batch).await?
} else {
// Pass `entry.table_branch.as_deref()` (not `None`) for
// consistency with the indexed_tables block below. Schema
// apply runs under `__schema_apply_lock__` which today
// rejects non-main branches, so `entry.table_branch` is
// expected to be `None`. But the defensive passthrough
// means a future relaxation of the lock-check can't quietly
// open the wrong HEAD here.
let existing = db
.table_store
.open_dataset_head_for_write(table_key, &dataset_uri, None)
.open_dataset_head_for_write(
table_key,
&dataset_uri,
entry.table_branch.as_deref(),
)
.await?;
let staged = db.table_store.stage_overwrite(&existing, batch).await?;
db.table_store

View file

@ -376,6 +376,10 @@ async fn stage_and_commit_btree(
table_key, columns, e
))
})?;
// Failpoint between stage and commit. Used by `tests/failpoints.rs`
// to demonstrate that a Phase A failure in the staged-index path
// leaves no Lance-HEAD drift on the touched table.
crate::failpoints::maybe_fail("ensure_indices.post_stage_pre_commit_btree")?;
let new_ds = db
.table_store
.commit_staged(Arc::new(ds.clone()), staged.transaction)

View file

@ -1,17 +1,32 @@
//! Storage trait surface — MR-793.
//!
//! `TableStorage` is the engine-internal trait that funnels every Lance
//! data write through staged primitives. Engine code (in `exec/`,
//! `db/omnigraph/`, `loader/`) holds `Arc<dyn TableStorage>` instead of
//! a concrete `TableStore`; the inline-commit Lance APIs
//! (`Dataset::append`, `MergeInsertBuilder::execute`, etc.) are not
//! reachable through the trait surface.
//! `TableStorage` is the engine-internal trait that exposes the
//! staged-write primitives (`stage_append`, `stage_merge_insert`,
//! `stage_overwrite`, `stage_create_btree_index`,
//! `stage_create_inverted_index`) plus `commit_staged` as the canonical
//! way for new engine writers to advance Lance HEAD without coupling
//! "write bytes" with "advance HEAD" in one Lance API call.
//!
//! ## Transitional residuals on the trait
//!
//! Several inline-commit methods remain on the trait surface as
//! documented residuals: `delete_where` (Lance 4.0.0's `DeleteJob` is
//! `pub(crate)` — see [#6658](https://github.com/lance-format/lance/issues/6658)),
//! `create_vector_index` (segment-commit-path requires
//! `build_index_metadata_from_segments` which is `pub(crate)` — see
//! [#6666](https://github.com/lance-format/lance/issues/6666)), and the
//! legacy `append_batch` / `merge_insert_batches` / `overwrite_batch` /
//! `create_btree_index` / `create_inverted_index` paths kept while
//! engine call sites finish migrating off of them (Phase 1b / Phase 9
//! of MR-793). These are named honestly at every call site; the
//! forbidden-API guard test catches direct lance::* misuse outside the
//! storage layer.
//!
//! ## Sealed
//!
//! `TableStorage: sealed::Sealed`. Only types in this crate can implement
//! the trait, so the staged-write invariant cannot be subverted by a
//! downstream impl.
//! the trait, so a downstream crate cannot subvert the contract by
//! providing its own impl.
//!
//! ## Opaque handles
//!
@ -21,15 +36,15 @@
//! through. This is the §III.9 alignment: `lance::Dataset` does not
//! appear in trait signatures.
//!
//! ## Scope (MR-793 Phase 1)
//! ## Migration status (MR-793 PR #70)
//!
//! The trait surface mirrors the methods engine code currently calls on
//! `TableStore`. Subsequent MR-793 phases:
//! * Phase 2 — add `stage_overwrite`, `stage_create_btree_index`,
//! `stage_create_inverted_index` to the trait.
//! * Phase 46 — migrate writers (`ensure_indices`, `branch_merge`,
//! `schema_apply`) onto the staged primitives.
//! * Phase 9 — demote unused inline-commit methods to `pub(crate)`.
//! Phases 1a / 2 / 4 / 5 / 6 are landed: trait scaffolding, three new
//! staged primitives (`stage_overwrite`, scalar index staging), and
//! migration of `ensure_indices`, `branch_merge`, `schema_apply` onto
//! the staged surface. Phase 1b (call-site conversion to
//! `Arc<dyn TableStorage>`), Phase 9 (demote unused inline-commit
//! methods to `pub(crate)`), Phase 7 (recovery reconciler — MR-847),
//! and Phase 8 (index reconciler — MR-848) are deferred to follow-ups.
use std::fmt::Debug;
use std::sync::Arc;
@ -38,7 +53,7 @@ use arrow_array::RecordBatch;
use arrow_schema::SchemaRef;
use async_trait::async_trait;
use lance::Dataset;
use lance::dataset::scanner::{ColumnOrdering, DatasetRecordBatchStream, Scanner};
use lance::dataset::scanner::{ColumnOrdering, DatasetRecordBatchStream};
use lance::dataset::{WhenMatched, WhenNotMatched};
use crate::db::{Snapshot, SubTableEntry};
@ -126,9 +141,12 @@ impl StagedHandle {
}
}
/// Helper: convert a slice of `StagedHandle` references to a Vec of
/// `&StagedWrite` for handing to `TableStore::stage_append`'s
/// `prior_stages` parameter. The lifetime is tied to the input slice.
/// Helper: clone the inner `StagedWrite` out of each `StagedHandle` and
/// collect into a `Vec<StagedWrite>` for handing to
/// `TableStore::stage_append`'s `prior_stages` parameter. The result is
/// owned (not borrowed) — callers that already had a `&[StagedHandle]`
/// pay a clone cost per element. `StagedWrite::clone` is cheap because
/// `Transaction` and `Vec<Fragment>` are shallow-clone friendly.
pub(crate) fn staged_handles_as_writes(handles: &[StagedHandle]) -> Vec<StagedWrite> {
handles.iter().map(|h| h.inner.clone()).collect()
}
@ -357,7 +375,7 @@ pub trait TableStorage: sealed::Sealed + Send + Sync + Debug {
dataset_uri: &str,
snapshot: SnapshotHandle,
filter: &str,
) -> Result<DeleteState>;
) -> Result<(SnapshotHandle, DeleteState)>;
async fn has_btree_index(&self, snapshot: &SnapshotHandle, column: &str) -> Result<bool>;
async fn has_fts_index(&self, snapshot: &SnapshotHandle, column: &str) -> Result<bool>;
@ -744,10 +762,11 @@ impl TableStorage for TableStore {
dataset_uri: &str,
snapshot: SnapshotHandle,
filter: &str,
) -> Result<DeleteState> {
) -> Result<(SnapshotHandle, DeleteState)> {
let mut ds = Arc::try_unwrap(snapshot.into_arc())
.unwrap_or_else(|arc| (*arc).clone());
TableStore::delete_where(self, dataset_uri, &mut ds, filter).await
let state = TableStore::delete_where(self, dataset_uri, &mut ds, filter).await?;
Ok((SnapshotHandle::new(ds), state))
}
async fn has_btree_index(&self, snapshot: &SnapshotHandle, column: &str) -> Result<bool> {
@ -817,8 +836,3 @@ impl TableStorage for TableStore {
TableStore::scan_stream(snapshot.dataset(), projection, filter, order_by, with_row_id).await
}
}
// Suppress unused-import warning when the module is built without the
// `Scanner` type being referenced from the trait.
#[allow(dead_code)]
fn _scanner_type_marker(_: &Scanner) {}

View file

@ -265,6 +265,71 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() {
.expect("Company write on a non-drifted table should succeed");
}
/// MR-793 Phase 4 acceptance bar — proves that a Phase A failure in
/// the staged-index path (`stage_create_btree_index` succeeded;
/// `commit_staged` not yet called) leaves NO Lance-HEAD drift on the
/// existing tables. Subsequent operations against those tables succeed
/// without `ExpectedVersionMismatch`.
///
/// Path: `apply_schema(v1 → v2)` adds a new node type. The
/// `added_tables` loop in `schema_apply` creates the empty dataset and
/// then calls `build_indices_on_dataset_for_catalog` →
/// `stage_and_commit_btree(..., &["id"])`. The failpoint fires
/// between `stage_create_btree_index` and `commit_staged`, so the
/// staged segments are written under `_indices/<uuid>/` but Lance HEAD
/// on the new dataset is unchanged at v=1. The schema-apply lock
/// branch is released by `apply_schema`'s outer match. Existing
/// tables (e.g. `node:Person`) are completely untouched by the new
/// node's added_tables iteration — they're outside the failed apply
/// path entirely — and we assert that mutations against them continue
/// to work.
///
/// The orphan empty dataset from the failed apply is acceptable
/// residual: it's unreferenced by `__manifest` and will be reclaimed
/// by `cleanup_old_versions` (or removed when a future apply at the
/// same target path resolves the rename).
#[tokio::test]
async fn ensure_indices_phase_a_btree_failure_leaves_existing_tables_writable() {
let _scenario = FailScenario::setup();
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap().to_string();
// Init with TEST_SCHEMA which declares Person + Knows. Indices on
// those tables get built during init.
let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
// Apply a schema that adds a new node type. The added_tables loop
// will hit the failpoint between stage and commit on the new
// node:Project table's btree-on-id build. (TEST_SCHEMA already
// has Person + Company + Knows + WorksAt — pick a name that isn't
// already declared.)
let extended_schema = format!("{}\nnode Project {{ name: String @key }}\n", helpers::TEST_SCHEMA);
{
let _failpoint = ScopedFailPoint::new(
"ensure_indices.post_stage_pre_commit_btree",
"return",
);
let err = db.apply_schema(&extended_schema).await.unwrap_err();
assert!(
err.to_string()
.contains("ensure_indices.post_stage_pre_commit_btree"),
"schema apply should fail with the synthetic failpoint error, got: {err}"
);
}
// Existing tables stayed at their pre-apply versions; subsequent
// mutations against them succeed (no Lance-HEAD drift).
mutate_main(
&mut db,
helpers::MUTATION_QUERIES,
"insert_person",
&mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
)
.await
.expect("Person mutation must succeed after the failed schema apply — existing tables are not drifted");
}
fn assert_no_staging_files(repo: &std::path::Path) {
for name in [
"_schema.pg.staging",

View file

@ -42,20 +42,50 @@
use std::path::{Path, PathBuf};
const FORBIDDEN_PATTERNS: &[&str] = &[
// Builder types — direct construction is the side door around the
// staged-write surface.
"MergeInsertBuilder",
"InsertBuilder::",
"DeleteBuilder",
"CommitBuilder::new",
".create_index_builder(",
".create_index_segment_builder(",
// Associated-function forms of inline-commit Lance APIs. These would
// only appear in source if the file imports `lance::Dataset` and
// calls the static fn — exactly the misuse we want to catch. These
// patterns deliberately exclude `.append(` / `.delete(` / `.write(`
// because those would over-match (`.delete_branch(`, `Vec::append`,
// arrow-array `.append(`, etc.).
"Dataset::write",
"Dataset::append",
"Dataset::delete",
"Dataset::merge_insert",
"Dataset::add_columns",
"Dataset::update_columns",
"Dataset::drop_columns",
"Dataset::truncate_table",
"Dataset::restore",
// Lance-specific method names that don't clash with our `TableStore`
// wrappers (we use `merge_insert_batch{,es}`, `add_columns_to_*`,
// etc. — never the bare Lance names). Engine code that writes
// `ds.merge_insert(...)` against a `Dataset` value is reaching
// around the trait surface.
".merge_insert(",
".add_columns(",
".update_columns(",
".drop_columns(",
".truncate_table(",
];
/// Files exempt from the guard. These are the legitimate storage-layer
/// implementations that USE the forbidden APIs to provide the staged
/// primitives.
/// or manifest-layer implementations that USE the forbidden APIs to
/// provide the staged primitives or to maintain the system tables
/// (commit graph, manifest).
const ALLOW_LIST_FILES: &[&str] = &[
"table_store.rs", // The storage layer itself.
"storage_layer.rs", // The trait module.
"table_store.rs", // The storage layer itself.
"storage_layer.rs", // The trait module.
"commit_graph.rs", // Maintains `_graph_commits.lance` system table.
"graph_coordinator.rs", // Drives the manifest publisher / branch coordinator.
];
/// Directories exempt from the guard. Files under these paths may use