Builds and runs the small repro specified in
.context/experiments/stable-row-id-compaction.md §5 ("Small repro plan")
and extends the writeup with the empirical Phase 1 evidence (F7–F11).
Matrix {BTree, Bitmap, LabelList} × {stable=true, stable=false}, 6 fragments
forced via small max_rows_per_file and target_rows_per_fragment, with
with_row_id() probes pre- and post-compaction. All six cases return correct
counts; with enable_stable_row_ids: true the row IDs round-trip unchanged
across compaction; with the flag off the row addresses move (fragment_id <<
32 | local_row), which is the documented contract.
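For reference, a minimal sketch of why row addresses move when the flag is off, assuming only the layout quoted above (fragment id in the high 32 bits, local row offset in the low 32 bits). The helper names and the fragment/offset values are hypothetical and are not taken from the repro or from the lance API.

```rust
/// Pack a fragment id and a row offset within that fragment into a
/// 64-bit row address: fragment_id << 32 | local_row.
/// (Hypothetical helper, for illustration only.)
fn row_address(fragment_id: u32, local_row: u32) -> u64 {
    (u64::from(fragment_id) << 32) | u64::from(local_row)
}

/// Split a 64-bit row address back into (fragment_id, local_row).
fn split_row_address(addr: u64) -> (u32, u32) {
    // High 32 bits are the fragment id; the cast keeps the low 32 bits.
    ((addr >> 32) as u32, addr as u32)
}

fn main() {
    // Illustrative values: a row at offset 7 of fragment 3 before compaction.
    let before = row_address(3, 7);
    // Compaction rewrites rows into a new fragment, so the same logical
    // row gets a new fragment id and offset (values made up here).
    let after = row_address(6, 1_007);

    // The address is positional, so it changes with the row's location.
    assert_ne!(before, after);
    assert_eq!(split_row_address(before), (3, 7));
    assert_eq!(split_row_address(after), (6, 1_007));
    println!("before={before:#018x} after={after:#018x}");
}
```

Because the address encodes physical position, any rewrite that relocates a row changes it; a stable row ID is by contrast expected to survive the rewrite, which is what the probes check.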
A side experiment also confirms that Operation::Overwrite (both staged
via InsertBuilder::execute_uncommitted + CommitBuilder::execute and direct
Dataset::write Overwrite) inherits manifest.uses_stable_row_ids from the
existing dataset, even when the WriteParams flag is absent. This resolves
the suspicion about table_store.rs:956 (stage_overwrite path not setting
the flag): the path is correct, not a latent bug.
Conclusion: the MR-737 §5.5 substrate caveat ("Stable Row ID for Index is
documented as experimental in lance-4.0.x") is empirically resolved.
The feature works; the docs are conservative. The RFC shape for MR-927 is
a docs PR.
Refs MR-925, MR-927.
Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
- validation-prototypes/custom-operator/: NeighborExpand toy operator
with paired ExtensionPlanner + custom QueryPlanner via
SessionStateBuilder::with_query_planner
- writeup at .context/experiments/custom-operator.md: 5 probes
(round-trip, EXPLAIN, predicate guard, composition with Filter +
Aggregate, BaselineMetrics), all pass; ~250 LoC integration
footprint; no unsafe; no internal API access
- finding: §5.3 is achievable on DF 52.5 as written; deltas are
doc-shaped (predicate push-down opt-in, statistics requirement,
Partitioning override)
- exclude validation-prototypes/ and merge-insert-cas-repro from the main
workspace so the nested cargo workspace can use its own pin set
- add validation-prototypes/{factorized-batches,custom-lance-index}/
scratch crates (never merged to main; long-lived branch only)
- exp 1.1 — factorized batches through DataFusion ops: writeup at
.context/experiments/factorized-batches.md (5 cells × 8 ops; all
scalar-keyed ops accept List<UInt64> input; UNNEST via CROSS JOIN
fails in DF 52.5)
- exp 1.2 — custom Lance index plugin from outside lance: writeup at
.context/experiments/custom-lance-index.md (5 probes; transaction
surface is open, SCALAR_INDEX_PLUGIN_REGISTRY is closed → hard
blocker for MR-737 §5.4; recommends upstream path or external-index
path)