MR-927 Phase 1 — stable-row-id repro across BTree/Bitmap/LabelList + compaction

Builds and runs the small repro specified in
.context/experiments/stable-row-id-compaction.md §5 ("Small repro plan")
and extends the writeup with the empirical Phase 1 evidence (F7–F11).

Matrix {BTree, Bitmap, LabelList} × {stable=true, stable=false}, 6 fragments
forced via small max_rows_per_file and target_rows_per_fragment, with
with_row_id() probes pre- and post-compaction. All six cases return correct
counts; with enable_stable_row_ids: true the row IDs round-trip unchanged
across compaction; with the flag off the row addresses move (fragment_id <<
32 | local_row), which is the documented contract.

Plus a side experiment confirming that Operation::Overwrite (both staged
via InsertBuilder::execute_uncommitted + CommitBuilder::execute and direct
Dataset::write Overwrite) inherits manifest.uses_stable_row_ids from the
existing dataset, even when the WriteParams flag is absent. This resolves
the suspicion about table_store.rs:956 (stage_overwrite path not setting
the flag): the path is correct, not a latent bug.

Conclusion: MR-737 §5.5 substrate caveat ("Stable Row ID for Index is
documented as experimental in lance-4.0.x") is empirically resolved.
Feature works; docs are conservative. RFC shape for MR-927 is a docs-PR.

Refs MR-925, MR-927.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
This commit is contained in:
Devin AI 2026-05-12 22:56:51 +00:00
parent c0d7ae6ba4
commit 3b661f35d3
5 changed files with 754 additions and 1 deletions

View file

@ -317,4 +317,173 @@ OmniGraph-driven remap path and pin Lance to a release that supports
- **Path A (Lance-managed) is materially better long-term.** When the
§1.2 plugin registry lands, switch. Until then, Path B is
production-grade. See `validation-prototypes/custom-lance-index/` for
the registry blocker repro.
---
## Phase 1 small repro (built and run 2026-05-12)
Built under MR-927 Phase 1 to produce the empirical attachment the RFC
needs. The crate lives at `validation-prototypes/stable-rowid-index/`.
Matrix: `{BTree, Bitmap, LabelList} × {stable=true, stable=false}`, each
case creates a dataset with small `max_rows_per_file` (so the seed lays
down ~6 fragments), creates the index, appends more data, runs
`compact_files` with `target_rows_per_fragment: 10_000` (so compaction
actually consolidates), and probes the index with a `with_row_id()`
scan pre- and post-compaction.
### Repro output (verbatim)
```
=== MR-927 Phase 1 matrix ===
idx stable manif1 manif2 pre/post.cnt pre/post.cnt fragments row_id key=500 row_id key=1234 id500 id1234 ok
BTree true true true 1->1 1->1 6->2 (+2,-6) 500->500 1234->1234 true true OK
BTree false false false 1->1 1->1 6->2 (+2,-6) 8589934592->25769804276 17179869418->30064771306 false false OK
Bitmap true true true 1->1 1->1 6->2 (+2,-6) 500->500 1234->1234 true true OK
Bitmap false false false 1->1 1->1 6->2 (+2,-6) 8589934592->25769804276 17179869418->30064771306 false false OK
LabelList true true true 1->1 1->1 6->2 (+2,-6) 500->500 1234->1234 true true OK
LabelList false false false 1->1 1->1 6->2 (+2,-6) 8589934592->25769804276 17179869418->30064771306 false false OK
=== On-disk index inspection (BTree, stable=true) ===
(BTree, stable=true) _indices/ tree:
7ee5ae1a-4c15-4762-9f4d-d1b0475d3de0/
page_data.lance (16371 bytes)
page_lookup.lance (985 bytes)
=== Side experiment: stage_overwrite flag preservation ===
create (enable_stable_row_ids: true) -> manifest.uses_stable_row_ids = true
staged Overwrite (WriteParams without enable_stable_row_ids: true) -> manifest.uses_stable_row_ids = true
direct Dataset::write Overwrite (same flag absent) -> manifest.uses_stable_row_ids = true
ALL CASES OK — all post-compaction probes returned 1 row.
```
### F7. All three built-in scalar index types are stable-row-id-aware today. ✅
Every case in the matrix returns `count = 1` for both the pre-compact
probe and the post-compact probe, on both the existing key (`key=500`,
present pre-append) and the newly-appended key (`key=1234`, present
post-append). **Compaction does not break BTree, Bitmap, or LabelList
indices on a stable-row-id dataset.** It also does not break them on a
non-stable-row-id dataset — Lance's index machinery transparently does
the right thing for both.
### F8. Stable row IDs survive compaction; physical row addresses change. ✅
With `enable_stable_row_ids: true`:
- `key=500``row_id=500` pre-compact → `row_id=500` post-compact (identical)
- `key=1234``row_id=1234` pre-compact → `row_id=1234` post-compact (identical)
With `enable_stable_row_ids: false`:
- `key=500``row_id=8589934592` (= `2 << 32 | 0`, fragment 2 offset 0) pre-compact → `row_id=25769804276` (= `6 << 32 | 500`) post-compact (changed)
- `key=1234``row_id=17179869418` (fragment 4) pre-compact → `row_id=30064771306` (fragment 7) post-compact (changed)
This is **exactly the contract advertised by the manifest flag.** Stable
IDs are logical and tiny; non-stable IDs are physical addresses
(`fragment_id << 32 | local_row`) that move when fragments are rewritten.
### F9. Compaction is real, not a no-op. ✅
`fragments: 6 → 2 (+2, -6)` for every case — six original small
fragments (from the small `max_rows_per_file` seeding) merged down into
two consolidated ones (the +2 are the consolidated outputs; -6 are the
originals). All four post-compact probes still resolve correctly across
both index types and both row-ID schemes.
### F10. Side experiment: `Operation::Overwrite` preserves `uses_stable_row_ids`. ✅
This closes the open suspicion about `crates/omnigraph/src/table_store.rs:956`
(the `stage_overwrite` path that does NOT set
`enable_stable_row_ids: true` in its `WriteParams`).
Three Overwrite shapes all preserve the flag:
| Path | manifest flag after |
|---------------------------------------------------------------------|---------------------|
| Initial create with `enable_stable_row_ids: true` | `true` |
| `InsertBuilder::with_params({mode: Overwrite, …}) + CommitBuilder::execute` (WriteParams flag absent) | `true` |
| `Dataset::write(reader, uri, WriteParams { mode: Overwrite, …})` (flag absent) | `true` |
The mechanism is at `lance-4.0.1/src/dataset/write/commit.rs:286-290`:
```rust
let use_stable_row_ids = if let Some(ds) = dest.dataset() {
ds.manifest.uses_stable_row_ids() // inherit from existing dataset
} else {
self.use_stable_row_ids.unwrap_or(false) // only when dest is a fresh URI
};
```
So `CommitBuilder::execute(txn)` reads the flag **from the existing
dataset's manifest**, ignoring both the builder's
`use_stable_row_ids` and the WriteParams flag, whenever the destination
is a `Dataset` (which is the case for every Overwrite-on-existing path
in OmniGraph). The `stage_overwrite` site at table_store.rs:956 is NOT
a latent bug.
### F11. BTree on-disk layout. (informational)
`_indices/<uuid>/` for a BTree index of 1000 unique `UInt64` keys on a
stable-row-id dataset:
```
_indices/7ee5ae1a-…/
page_data.lance (16371 bytes)
page_lookup.lance (985 bytes)
```
Both are Lance file-format containers. The bytes are opaque without
loading them through `lance-index`. **What matters for the RFC is the
behavior already documented above**, not the byte-level encoding. The
disk layout is included here only as a pointer for the RFC's "shape of
the index" appendix.
---
## RFC shape for MR-927 (recommended)
Given F7F11, the right RFC shape is **(a) docs PR**: remove the
"experimental" caveat from the *Stable Row ID for Index* docs page,
because the feature works correctly today across all three built-in
scalar index types on Lance 4.0.1.
What the RFC should argue:
1. Built-in scalar indices (BTree, Bitmap, LabelList) all return
logical row IDs that survive compaction on stable-row-id datasets.
2. `CommitBuilder::execute` correctly inherits the flag from the
existing manifest, so Overwrite operations preserve it.
3. The dataset-level `enable_stable_row_ids` flag is sufficient — there
is no per-index opt-in mechanism, and none is needed.
4. The repro in `validation-prototypes/stable-rowid-index/` is the
attached evidence.
What the RFC should NOT argue:
- That a new API is required. The plumbing is already correct.
- That a per-index opt-in flag is needed. The dataset flag is enough.
- That FragReuseIndex behavior needs to change. It is unused on
stable-row-id paths.
The RFC should ask Lance maintainers to confirm any remaining
"experimental" concerns (e.g. specific format-version interactions,
or specific compaction edge cases like deletion-materialization
intersecting with `materialize_deletions`) before the docs change
lands. If there ARE specific edge cases that are still considered
experimental, the RFC should propose updated docs that scope the
"experimental" label to those edges instead of the whole feature.
## Decision impact on MR-737 §5.5
**Substrate caveat ("`Stable Row ID for Index` is documented as
experimental in lance-4.0.x") is empirically resolved to: the feature
works, the docs are conservative.** MR-737 §5.5 should be downgraded
from "substrate risk" to "docs-staleness observation" and reference
this experiment writeup and MR-927 as the upstream remediation. The
caveat does NOT block Phase 0 entry.
production-ready and OSS-compatible.

View file

@ -5007,6 +5007,23 @@ dependencies = [
"syn 2.0.117",
]
[[package]]
name = "stable-rowid-index"
version = "0.0.0"
dependencies = [
"anyhow",
"arrow",
"arrow-array",
"arrow-schema",
"futures",
"lance",
"lance-core",
"lance-index",
"lance-table",
"tempfile",
"tokio",
]
[[package]]
name = "stable_deref_trait"
version = "1.2.1"

View file

@ -5,10 +5,10 @@ members = [
"custom-lance-index",
"custom-operator",
"sip-format-bench",
"stable-rowid-index", # 1.7 / MR-927 Phase 1
# Additional crates added as each experiment is set up:
# "bitmap-pushdown", # 1.5
# "txn-branches-cost", # 1.6
# "stable-rowid-index", # 1.7
]
# Pre-Phase-0 validation prototypes for MR-925 / MR-737.

View file

@ -0,0 +1,27 @@
[package]
name = "stable-rowid-index"
version = "0.0.0"
edition = "2024"
publish = false
# Experiment 1.7 / MR-927 Phase 1 — validate that built-in Lance scalar indices
# survive compaction on a stable-row-id dataset, and that `stage_overwrite`
# preserves the manifest `uses_stable_row_ids` flag across `Operation::Overwrite`.
# Validates MR-737 §5.4, §5.5.
[dependencies]
arrow = { workspace = true }
arrow-array = { workspace = true }
arrow-schema = { workspace = true }
lance = { workspace = true }
lance-table = { workspace = true }
lance-index = { workspace = true }
lance-core = { workspace = true }
tokio = { workspace = true }
futures = { workspace = true }
anyhow = { workspace = true }
tempfile = { workspace = true }
[[bin]]
name = "stable-rowid-index"
path = "src/main.rs"

View file

@ -0,0 +1,540 @@
//! MR-927 Phase 1 — Validate stable-row-id behavior for built-in Lance scalar indices.
//!
//! Goal: produce empirical evidence for the MR-927 RFC about whether Lance 4.0
//! built-in scalar indices (BTree, Bitmap, LabelList) survive `compact_files`
//! on a stable-row-id dataset, and answer the side question raised during MR-925's
//! follow-up: does `Operation::Overwrite` preserve the manifest's
//! `uses_stable_row_ids` flag when `WriteParams::enable_stable_row_ids` is NOT
//! set in the rewrite call?
//!
//! Matrix:
//!
//! For idx in {BTree, Bitmap, LabelList}:
//! For stable in {true, false}:
//! - Create dataset, enable_stable_row_ids=stable
//! - Insert 1000 rows (keys 0..1000)
//! - Create scalar index of type `idx` on the indexed column
//! - Probe: scan with filter targeting one row, expect 1 hit
//! - Append 500 more rows (keys 1000..1500)
//! - Probe: same filter, expect 1 hit (also probe a new-row key)
//! - compact_files
//! - Probe: same filter, expect 1 hit (THIS is the survival check)
//!
//! Plus side experiment for `stage_overwrite` flag preservation.
//!
//! Output: a tabular report. Findings are appended to
//! `.context/experiments/stable-row-id-compaction.md`.
use std::path::{Path, PathBuf};
use std::sync::Arc;
use anyhow::{Context, Result};
use arrow_array::builder::{ListBuilder, StringBuilder, UInt64Builder};
use arrow_array::cast::AsArray;
use arrow_array::types::UInt64Type;
use arrow_array::{RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema};
use futures::TryStreamExt;
use lance::Dataset;
use lance::dataset::optimize::{CompactionOptions, compact_files};
use lance::dataset::{InsertBuilder, WriteMode, WriteParams};
use lance_index::DatasetIndexExt;
use lance_index::scalar::{BuiltinIndexType, ScalarIndexParams};
use lance_index::IndexType;
use tempfile::TempDir;
#[derive(Copy, Clone, Debug)]
enum Idx {
BTree,
Bitmap,
LabelList,
}
impl Idx {
fn name(&self) -> &'static str {
match self {
Idx::BTree => "BTree",
Idx::Bitmap => "Bitmap",
Idx::LabelList => "LabelList",
}
}
fn lance_type(&self) -> IndexType {
match self {
Idx::BTree => IndexType::BTree,
Idx::Bitmap => IndexType::Bitmap,
Idx::LabelList => IndexType::LabelList,
}
}
fn params(&self) -> ScalarIndexParams {
match self {
Idx::BTree => ScalarIndexParams::for_builtin(BuiltinIndexType::BTree),
Idx::Bitmap => ScalarIndexParams::for_builtin(BuiltinIndexType::Bitmap),
Idx::LabelList => ScalarIndexParams::for_builtin(BuiltinIndexType::LabelList),
}
}
}
/// Schema for BTree/Bitmap experiments: a UInt64 key + Utf8 payload.
fn primitive_schema() -> Arc<Schema> {
Arc::new(Schema::new(vec![
Field::new("key", DataType::UInt64, false),
Field::new("payload", DataType::Utf8, false),
]))
}
fn build_primitive(n: u64, key_base: u64) -> RecordBatch {
let schema = primitive_schema();
let mut keys = UInt64Builder::with_capacity(n as usize);
let mut payloads = StringBuilder::new();
for i in 0..n {
keys.append_value(key_base + i);
payloads.append_value(format!("p_{:06}", key_base + i));
}
RecordBatch::try_new(
schema,
vec![Arc::new(keys.finish()), Arc::new(payloads.finish())],
)
.expect("build primitive batch")
}
/// Schema for LabelList: a List<Utf8> column + payload.
fn list_schema() -> Arc<Schema> {
Arc::new(Schema::new(vec![
Field::new(
"labels",
DataType::List(Arc::new(Field::new("item", DataType::Utf8, true))),
false,
),
Field::new("payload", DataType::Utf8, false),
]))
}
fn build_list(n: u64, key_base: u64) -> RecordBatch {
let schema = list_schema();
let mut labels = ListBuilder::new(StringBuilder::new());
let mut payloads = StringBuilder::new();
for i in 0..n {
let key = key_base + i;
// Each row has one label "k<key>" so the filter `array_contains(labels, 'k500')`
// matches exactly one row (the row whose key is 500).
labels.values().append_value(format!("k{}", key));
labels.append(true);
payloads.append_value(format!("p_{:06}", key));
}
RecordBatch::try_new(
schema,
vec![Arc::new(labels.finish()), Arc::new(payloads.finish())],
)
.expect("build list batch")
}
async fn create_dataset(
uri: &str,
idx: Idx,
stable: bool,
n: u64,
key_base: u64,
) -> Result<Dataset> {
let (schema, batch) = match idx {
Idx::LabelList => (list_schema(), build_list(n, key_base)),
_ => (primitive_schema(), build_primitive(n, key_base)),
};
let reader = RecordBatchIterator::new(vec![Ok(batch)], schema);
// Use small max_rows_per_file so the initial seed lays down several
// small fragments. That makes the subsequent `compact_files` actually
// consolidate them instead of being a no-op.
let params = WriteParams {
mode: WriteMode::Create,
enable_stable_row_ids: stable,
max_rows_per_file: 250,
..Default::default()
};
Dataset::write(reader, uri, Some(params))
.await
.context("create dataset")
}
async fn append_more(ds: &mut Dataset, idx: Idx, n: u64, key_base: u64) -> Result<()> {
let (schema, batch) = match idx {
Idx::LabelList => (list_schema(), build_list(n, key_base)),
_ => (primitive_schema(), build_primitive(n, key_base)),
};
let reader = RecordBatchIterator::new(vec![Ok(batch)], schema);
// Same small-fragment policy on append so the dataset has multiple
// fragments to actually consolidate during compaction.
let params = WriteParams {
mode: WriteMode::Append,
max_rows_per_file: 250,
..Default::default()
};
ds.append(reader, Some(params))
.await
.context("append more")?;
Ok(())
}
/// Issue a filter scan that *should* be served by the index. Returns the
/// number of rows returned AND the row IDs we got back (with stable row
/// IDs enabled, those IDs are logical and must survive compaction). We
/// don't enforce that the index was used (that would require introspection
/// of the physical plan); we only enforce that the answer is correct and
/// that the row IDs the index emits round-trip correctly.
async fn probe(ds: &Dataset, idx: Idx, key: u64) -> Result<ProbeResult> {
let filter = match idx {
Idx::LabelList => format!("array_has(labels, 'k{}')", key),
_ => format!("key = {}", key),
};
let mut scanner = ds.scan();
scanner.filter(&filter).context("set filter")?;
scanner.with_row_id();
let stream = scanner.try_into_stream().await.context("into stream")?;
let batches: Vec<RecordBatch> = stream.try_collect().await.context("collect stream")?;
let total: usize = batches.iter().map(|b| b.num_rows()).sum();
let mut row_ids: Vec<u64> = Vec::new();
for b in &batches {
let col = b
.column_by_name("_rowid")
.context("_rowid missing in probe output")?;
let arr = col.as_primitive::<UInt64Type>();
for i in 0..arr.len() {
row_ids.push(arr.value(i));
}
}
Ok(ProbeResult { count: total, row_ids })
}
#[derive(Debug, Clone)]
struct ProbeResult {
count: usize,
row_ids: Vec<u64>,
}
async fn run_case(idx: Idx, stable: bool) -> Result<CaseResult> {
let tmp = TempDir::new().context("tempdir")?;
let uri = tmp.path().join(format!("{}.lance", idx.name())).to_string_lossy().to_string();
let mut ds = create_dataset(&uri, idx, stable, 1000, 0).await?;
let v_initial = ds.manifest().version;
let stable_after_create = ds.manifest().uses_stable_row_ids();
// Create index
ds.create_index(
&[match idx {
Idx::LabelList => "labels",
_ => "key",
}],
idx.lance_type(),
Some(format!("idx_{}", idx.name().to_lowercase())),
&idx.params(),
false,
)
.await
.context("create_index")?;
let v_post_index = ds.manifest().version;
let pre_compact_existing = probe(&ds, idx, 500).await?;
append_more(&mut ds, idx, 500, 1000).await?;
let pre_compact_new = probe(&ds, idx, 1234).await?;
let pre_fragments = ds.manifest().fragments.len();
// Force consolidation: target_rows_per_fragment is normally 1M, our seed
// is 1500 rows. Set it to something modest so compact_files will gather
// all small fragments into one.
let compaction_opts = CompactionOptions {
target_rows_per_fragment: 10_000,
..Default::default()
};
let metrics = compact_files(&mut ds, compaction_opts, None)
.await
.context("compact_files")?;
let post_fragments = ds.manifest().fragments.len();
let v_post_compact = ds.manifest().version;
let stable_after_compact = ds.manifest().uses_stable_row_ids();
let post_compact_existing = probe(&ds, idx, 500).await?;
let post_compact_new = probe(&ds, idx, 1234).await?;
let stable_ids_500 = pre_compact_existing.row_ids == post_compact_existing.row_ids;
let stable_ids_1234 = pre_compact_new.row_ids == post_compact_new.row_ids;
Ok(CaseResult {
idx,
stable_requested: stable,
stable_after_create,
stable_after_compact,
v_initial,
v_post_index,
v_post_compact,
pre_fragments,
post_fragments,
fragments_removed: metrics.fragments_removed,
fragments_added: metrics.fragments_added,
pre_compact_existing,
pre_compact_new,
post_compact_existing,
post_compact_new,
stable_ids_500,
stable_ids_1234,
})
}
#[derive(Debug)]
#[allow(dead_code)] // Several fields populated for completeness, not all printed.
struct CaseResult {
idx: Idx,
stable_requested: bool,
stable_after_create: bool,
stable_after_compact: bool,
v_initial: u64,
v_post_index: u64,
v_post_compact: u64,
pre_fragments: usize,
post_fragments: usize,
fragments_removed: usize,
fragments_added: usize,
pre_compact_existing: ProbeResult,
pre_compact_new: ProbeResult,
post_compact_existing: ProbeResult,
post_compact_new: ProbeResult,
stable_ids_500: bool,
stable_ids_1234: bool,
}
impl CaseResult {
fn ok(&self) -> bool {
let counts_ok = self.pre_compact_existing.count == 1
&& self.pre_compact_new.count == 1
&& self.post_compact_existing.count == 1
&& self.post_compact_new.count == 1;
let flag_ok = self.stable_after_create == self.stable_requested;
// When stable row IDs are enabled, the row IDs returned for the
// same logical row must be identical pre- and post-compaction.
// When disabled, they may differ — that's the substrate behavior
// we are documenting, not asserting equality on.
let row_ids_ok = if self.stable_requested {
self.stable_ids_500 && self.stable_ids_1234
} else {
true // we don't assert in this branch
};
counts_ok && flag_ok && row_ids_ok
}
}
/// Side experiment: does `Operation::Overwrite` (built via
/// `InsertBuilder::with_params(WriteParams { mode: Overwrite, .. })` without
/// `enable_stable_row_ids: true`) preserve the manifest's stable-row-ids flag?
/// Mirrors `table_store.rs:stage_overwrite` at line 956 in OmniGraph.
async fn run_overwrite_preservation() -> Result<OverwriteResult> {
let tmp = TempDir::new()?;
let uri = tmp.path().join("overwrite.lance").to_string_lossy().to_string();
// 1. Create with stable=true
let schema = primitive_schema();
let reader = RecordBatchIterator::new(vec![Ok(build_primitive(100, 0))], schema.clone());
let create_params = WriteParams {
mode: WriteMode::Create,
enable_stable_row_ids: true,
..Default::default()
};
let mut ds = Dataset::write(reader, &uri, Some(create_params)).await?;
let stable_after_create = ds.manifest().uses_stable_row_ids();
// 2. Build an uncommitted Overwrite transaction WITHOUT enable_stable_row_ids.
let batch = build_primitive(50, 5000);
let owb_params = WriteParams {
mode: WriteMode::Overwrite,
..Default::default() // <-- enable_stable_row_ids defaults to false
};
let txn = InsertBuilder::new(Arc::new(ds.clone()))
.with_params(&owb_params)
.execute_uncommitted(vec![batch])
.await?;
// 3. Commit via CommitBuilder (mirrors `TableStore::commit_staged`).
use lance::dataset::CommitBuilder;
let new_ds = CommitBuilder::new(Arc::new(ds.clone())).execute(txn).await?;
let stable_after_overwrite = new_ds.manifest().uses_stable_row_ids();
// 4. Sanity: a third write via Dataset::write with mode=Overwrite (the
// non-staged path) for comparison.
let reader = RecordBatchIterator::new(vec![Ok(build_primitive(50, 7000))], schema.clone());
let direct_params = WriteParams {
mode: WriteMode::Overwrite,
..Default::default()
};
let _ = Dataset::write(reader, &uri, Some(direct_params)).await?;
ds = Dataset::open(&uri).await?;
let stable_after_direct_overwrite = ds.manifest().uses_stable_row_ids();
Ok(OverwriteResult {
stable_after_create,
stable_after_staged_overwrite: stable_after_overwrite,
stable_after_direct_overwrite,
})
}
#[derive(Debug)]
struct OverwriteResult {
stable_after_create: bool,
stable_after_staged_overwrite: bool,
stable_after_direct_overwrite: bool,
}
#[tokio::main(flavor = "multi_thread", worker_threads = 4)]
async fn main() -> Result<()> {
let mut cases: Vec<CaseResult> = Vec::new();
for idx in [Idx::BTree, Idx::Bitmap, Idx::LabelList] {
for stable in [true, false] {
print_progress(idx, stable);
let case = run_case(idx, stable).await?;
cases.push(case);
}
}
print_matrix(&cases);
println!();
println!("=== On-disk index inspection (BTree, stable=true) ===");
inspect_index_files(Idx::BTree, true).await?;
println!();
println!("=== Side experiment: stage_overwrite flag preservation ===");
let r = run_overwrite_preservation().await?;
print_overwrite(&r);
let all_ok = cases.iter().all(|c| c.ok());
println!();
if all_ok {
println!("ALL CASES OK — all post-compaction probes returned 1 row.");
} else {
println!("SOME CASES FAILED — see matrix above.");
}
Ok(())
}
fn print_progress(idx: Idx, stable: bool) {
println!("running idx={} stable={} ...", idx.name(), stable);
}
fn print_matrix(cases: &[CaseResult]) {
println!();
println!("=== MR-927 Phase 1 matrix ===");
println!(
"{:<10} {:<8} {:<7} {:<7} {:<14} {:<14} {:<18} {:<22} {:<22} {:<6} {:<6} {}",
"idx",
"stable",
"manif1",
"manif2",
"pre/post.cnt",
"pre/post.cnt",
"fragments",
"row_id key=500",
"row_id key=1234",
"id500",
"id1234",
"ok",
);
for c in cases {
let pre500 = c.pre_compact_existing.row_ids.first().copied();
let post500 = c.post_compact_existing.row_ids.first().copied();
let pre1234 = c.pre_compact_new.row_ids.first().copied();
let post1234 = c.post_compact_new.row_ids.first().copied();
let frag = format!(
"{}->{} (+{},-{})",
c.pre_fragments, c.post_fragments, c.fragments_added, c.fragments_removed
);
let id500 = format!(
"{}->{}",
pre500.map_or("-".to_string(), |v| v.to_string()),
post500.map_or("-".to_string(), |v| v.to_string()),
);
let id1234 = format!(
"{}->{}",
pre1234.map_or("-".to_string(), |v| v.to_string()),
post1234.map_or("-".to_string(), |v| v.to_string()),
);
let cnt500 = format!(
"{}->{}",
c.pre_compact_existing.count, c.post_compact_existing.count
);
let cnt1234 = format!(
"{}->{}",
c.pre_compact_new.count, c.post_compact_new.count
);
println!(
"{:<10} {:<8} {:<7} {:<7} {:<14} {:<14} {:<18} {:<22} {:<22} {:<6} {:<6} {}",
c.idx.name(),
c.stable_requested,
c.stable_after_create,
c.stable_after_compact,
cnt500,
cnt1234,
frag,
id500,
id1234,
c.stable_ids_500,
c.stable_ids_1234,
if c.ok() { "OK" } else { "FAIL" },
);
}
}
fn print_overwrite(r: &OverwriteResult) {
println!("create (enable_stable_row_ids: true) -> manifest.uses_stable_row_ids = {}", r.stable_after_create);
println!("staged Overwrite (WriteParams without enable_stable_row_ids: true) -> manifest.uses_stable_row_ids = {}", r.stable_after_staged_overwrite);
println!("direct Dataset::write Overwrite (same flag absent) -> manifest.uses_stable_row_ids = {}", r.stable_after_direct_overwrite);
}
/// Walk `<uri>/_indices/<uuid>/` and print the files Lance laid down for one
/// concrete case. The shapes are stable across BTree/Bitmap/LabelList
/// (each writes a small handful of index segment files); the bytes are
/// opaque, so we just enumerate names and sizes.
async fn inspect_index_files(idx: Idx, stable: bool) -> Result<()> {
let tmp = TempDir::new()?;
let uri = tmp
.path()
.join(format!("inspect_{}.lance", idx.name()))
.to_string_lossy()
.to_string();
let mut ds = create_dataset(&uri, idx, stable, 1000, 0).await?;
ds.create_index(
&[match idx {
Idx::LabelList => "labels",
_ => "key",
}],
idx.lance_type(),
Some(format!("idx_{}", idx.name().to_lowercase())),
&idx.params(),
false,
)
.await?;
let indices_root = Path::new(&uri).join("_indices");
if !indices_root.exists() {
println!("no _indices/ on disk for {} (stable={})", idx.name(), stable);
return Ok(());
}
println!("({}, stable={}) _indices/ tree:", idx.name(), stable);
walk_dir(&indices_root, 1)?;
Ok(())
}
fn walk_dir(dir: &Path, depth: usize) -> Result<()> {
let mut entries: Vec<PathBuf> = std::fs::read_dir(dir)?
.filter_map(|e| e.ok().map(|e| e.path()))
.collect();
entries.sort();
for p in entries {
let name = p.file_name().unwrap().to_string_lossy().to_string();
let pad = " ".repeat(depth);
if p.is_dir() {
println!("{}{}/", pad, name);
walk_dir(&p, depth + 1)?;
} else {
let size = std::fs::metadata(&p).map(|m| m.len()).unwrap_or(0);
println!("{}{} ({} bytes)", pad, name, size);
}
}
Ok(())
}