MR-927 Phase 1 — stable-row-id repro across BTree/Bitmap/LabelList + compaction

Builds and runs the small repro specified in
.context/experiments/stable-row-id-compaction.md §5 ("Small repro plan")
and extends the writeup with the empirical Phase 1 evidence (F7–F11).

Matrix {BTree, Bitmap, LabelList} × {stable=true, stable=false}, 6 fragments
forced via small max_rows_per_file and target_rows_per_fragment, with
with_row_id() probes pre- and post-compaction. All six cases return correct
counts; with enable_stable_row_ids: true the row IDs round-trip unchanged
across compaction; with the flag off the row addresses move (fragment_id <<
32 | local_row), which is the documented contract.

Plus a side experiment confirming that Operation::Overwrite (both staged
via InsertBuilder::execute_uncommitted + CommitBuilder::execute and direct
Dataset::write Overwrite) inherits manifest.uses_stable_row_ids from the
existing dataset, even when the WriteParams flag is absent. This resolves
the suspicion about table_store.rs:956 (stage_overwrite path not setting
the flag): the path is correct, not a latent bug.

Conclusion: MR-737 §5.5 substrate caveat ("Stable Row ID for Index is
documented as experimental in lance-4.0.x") is empirically resolved.
Feature works; docs are conservative. RFC shape for MR-927 is a docs-PR.

Refs MR-925, MR-927.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
This commit is contained in:
Devin AI 2026-05-12 22:56:51 +00:00
parent c0d7ae6ba4
commit 3b661f35d3
5 changed files with 754 additions and 1 deletions

View file

@ -317,4 +317,173 @@ OmniGraph-driven remap path and pin Lance to a release that supports
- **Path A (Lance-managed) is materially better long-term.** When the
§1.2 plugin registry lands, switch. Until then, Path B is
production-grade. See `validation-prototypes/custom-lance-index/` for
the registry blocker repro.
---
## Phase 1 small repro (built and run 2026-05-12)
Built under MR-927 Phase 1 to produce the empirical attachment the RFC
needs. The crate lives at `validation-prototypes/stable-rowid-index/`.
Matrix: `{BTree, Bitmap, LabelList} × {stable=true, stable=false}`, each
case creates a dataset with small `max_rows_per_file` (so the seed lays
down ~6 fragments), creates the index, appends more data, runs
`compact_files` with `target_rows_per_fragment: 10_000` (so compaction
actually consolidates), and probes the index with a `with_row_id()`
scan pre- and post-compaction.
### Repro output (verbatim)
```
=== MR-927 Phase 1 matrix ===
idx stable manif1 manif2 pre/post.cnt pre/post.cnt fragments row_id key=500 row_id key=1234 id500 id1234 ok
BTree true true true 1->1 1->1 6->2 (+2,-6) 500->500 1234->1234 true true OK
BTree false false false 1->1 1->1 6->2 (+2,-6) 8589934592->25769804276 17179869418->30064771306 false false OK
Bitmap true true true 1->1 1->1 6->2 (+2,-6) 500->500 1234->1234 true true OK
Bitmap false false false 1->1 1->1 6->2 (+2,-6) 8589934592->25769804276 17179869418->30064771306 false false OK
LabelList true true true 1->1 1->1 6->2 (+2,-6) 500->500 1234->1234 true true OK
LabelList false false false 1->1 1->1 6->2 (+2,-6) 8589934592->25769804276 17179869418->30064771306 false false OK
=== On-disk index inspection (BTree, stable=true) ===
(BTree, stable=true) _indices/ tree:
7ee5ae1a-4c15-4762-9f4d-d1b0475d3de0/
page_data.lance (16371 bytes)
page_lookup.lance (985 bytes)
=== Side experiment: stage_overwrite flag preservation ===
create (enable_stable_row_ids: true) -> manifest.uses_stable_row_ids = true
staged Overwrite (WriteParams without enable_stable_row_ids: true) -> manifest.uses_stable_row_ids = true
direct Dataset::write Overwrite (same flag absent) -> manifest.uses_stable_row_ids = true
ALL CASES OK — all post-compaction probes returned 1 row.
```
### F7. All three built-in scalar index types are stable-row-id-aware today. ✅
Every case in the matrix returns `count = 1` for both the pre-compact
probe and the post-compact probe, on both the existing key (`key=500`,
present pre-append) and the newly-appended key (`key=1234`, present
post-append). **Compaction does not break BTree, Bitmap, or LabelList
indices on a stable-row-id dataset.** It also does not break them on a
non-stable-row-id dataset — Lance's index machinery transparently does
the right thing for both.
### F8. Stable row IDs survive compaction; physical row addresses change. ✅
With `enable_stable_row_ids: true`:
- `key=500``row_id=500` pre-compact → `row_id=500` post-compact (identical)
- `key=1234``row_id=1234` pre-compact → `row_id=1234` post-compact (identical)
With `enable_stable_row_ids: false`:
- `key=500``row_id=8589934592` (= `2 << 32 | 0`, fragment 2 offset 0) pre-compact → `row_id=25769804276` (= `6 << 32 | 500`) post-compact (changed)
- `key=1234``row_id=17179869418` (fragment 4) pre-compact → `row_id=30064771306` (fragment 7) post-compact (changed)
This is **exactly the contract advertised by the manifest flag.** Stable
IDs are logical and tiny; non-stable IDs are physical addresses
(`fragment_id << 32 | local_row`) that move when fragments are rewritten.
### F9. Compaction is real, not a no-op. ✅
`fragments: 6 → 2 (+2, -6)` for every case — six original small
fragments (from the small `max_rows_per_file` seeding) merged down into
two consolidated ones (the +2 are the consolidated outputs; -6 are the
originals). All four post-compact probes still resolve correctly across
both index types and both row-ID schemes.
### F10. Side experiment: `Operation::Overwrite` preserves `uses_stable_row_ids`. ✅
This closes the open suspicion about `crates/omnigraph/src/table_store.rs:956`
(the `stage_overwrite` path that does NOT set
`enable_stable_row_ids: true` in its `WriteParams`).
Three Overwrite shapes all preserve the flag:
| Path | manifest flag after |
|---------------------------------------------------------------------|---------------------|
| Initial create with `enable_stable_row_ids: true` | `true` |
| `InsertBuilder::with_params({mode: Overwrite, …}) + CommitBuilder::execute` (WriteParams flag absent) | `true` |
| `Dataset::write(reader, uri, WriteParams { mode: Overwrite, …})` (flag absent) | `true` |
The mechanism is at `lance-4.0.1/src/dataset/write/commit.rs:286-290`:
```rust
let use_stable_row_ids = if let Some(ds) = dest.dataset() {
ds.manifest.uses_stable_row_ids() // inherit from existing dataset
} else {
self.use_stable_row_ids.unwrap_or(false) // only when dest is a fresh URI
};
```
So `CommitBuilder::execute(txn)` reads the flag **from the existing
dataset's manifest**, ignoring both the builder's
`use_stable_row_ids` and the WriteParams flag, whenever the destination
is a `Dataset` (which is the case for every Overwrite-on-existing path
in OmniGraph). The `stage_overwrite` site at table_store.rs:956 is NOT
a latent bug.
### F11. BTree on-disk layout. (informational)
`_indices/<uuid>/` for a BTree index of 1000 unique `UInt64` keys on a
stable-row-id dataset:
```
_indices/7ee5ae1a-…/
page_data.lance (16371 bytes)
page_lookup.lance (985 bytes)
```
Both are Lance file-format containers. The bytes are opaque without
loading them through `lance-index`. **What matters for the RFC is the
behavior already documented above**, not the byte-level encoding. The
disk layout is included here only as a pointer for the RFC's "shape of
the index" appendix.
---
## RFC shape for MR-927 (recommended)
Given F7F11, the right RFC shape is **(a) docs PR**: remove the
"experimental" caveat from the *Stable Row ID for Index* docs page,
because the feature works correctly today across all three built-in
scalar index types on Lance 4.0.1.
What the RFC should argue:
1. Built-in scalar indices (BTree, Bitmap, LabelList) all return
logical row IDs that survive compaction on stable-row-id datasets.
2. `CommitBuilder::execute` correctly inherits the flag from the
existing manifest, so Overwrite operations preserve it.
3. The dataset-level `enable_stable_row_ids` flag is sufficient — there
is no per-index opt-in mechanism, and none is needed.
4. The repro in `validation-prototypes/stable-rowid-index/` is the
attached evidence.
What the RFC should NOT argue:
- That a new API is required. The plumbing is already correct.
- That a per-index opt-in flag is needed. The dataset flag is enough.
- That FragReuseIndex behavior needs to change. It is unused on
stable-row-id paths.
The RFC should ask Lance maintainers to confirm any remaining
"experimental" concerns (e.g. specific format-version interactions,
or specific compaction edge cases like deletion-materialization
intersecting with `materialize_deletions`) before the docs change
lands. If there ARE specific edge cases that are still considered
experimental, the RFC should propose updated docs that scope the
"experimental" label to those edges instead of the whole feature.
## Decision impact on MR-737 §5.5
**Substrate caveat ("`Stable Row ID for Index` is documented as
experimental in lance-4.0.x") is empirically resolved to: the feature
works, the docs are conservative.** MR-737 §5.5 should be downgraded
from "substrate risk" to "docs-staleness observation" and reference
this experiment writeup and MR-927 as the upstream remediation. The
caveat does NOT block Phase 0 entry.
production-ready and OSS-compatible.