omnigraph/validation-prototypes/Cargo.toml
Devin AI 3b661f35d3 MR-927 Phase 1 — stable-row-id repro across BTree/Bitmap/LabelList + compaction
Builds and runs the small repro specified in
.context/experiments/stable-row-id-compaction.md §5 ("Small repro plan")
and extends the writeup with the empirical Phase 1 evidence (F7–F11).

Matrix {BTree, Bitmap, LabelList} × {stable=true, stable=false}, 6 fragments
forced via small max_rows_per_file and target_rows_per_fragment, with
with_row_id() probes pre- and post-compaction. All six cases return correct
counts; with enable_stable_row_ids: true the row IDs round-trip unchanged
across compaction; with the flag off the row addresses move (fragment_id <<
32 | local_row), which is the documented contract.
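
The address layout quoted above can be sketched in a few lines of Rust. This is a minimal illustration of the `fragment_id << 32 | local_row` packing only. The function names `encode_row_address` and `decode_row_address` are hypothetical helpers for this note, not Lance APIs, and the fragment ids and offsets in the usage note are made-up values.

```rust
/// Pack a fragment id and a fragment-local row offset into one 64-bit
/// row address: fragment id in the high 32 bits, offset in the low 32 bits.
/// (Hypothetical helper; illustrates the layout quoted in the commit message.)
pub fn encode_row_address(fragment_id: u32, local_row: u32) -> u64 {
    (u64::from(fragment_id) << 32) | u64::from(local_row)
}

/// Split a 64-bit row address back into (fragment_id, local_row).
pub fn decode_row_address(addr: u64) -> (u32, u32) {
    // High half is the fragment id; truncating to u32 keeps the low half.
    ((addr >> 32) as u32, addr as u32)
}
```

Under this layout, a row at offset 5 in fragment 2 that compaction rewrites into, say, fragment 7 necessarily gets a different address, because the high 32 bits change. That is why the address moves when stable row IDs are off, while a stable row ID is expected to stay the same across the rewrite.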

Plus a side experiment confirming that Operation::Overwrite (both staged
via InsertBuilder::execute_uncommitted + CommitBuilder::execute and direct
Dataset::write Overwrite) inherits manifest.uses_stable_row_ids from the
existing dataset, even when the WriteParams flag is absent. This resolves
the suspicion about table_store.rs:956 (stage_overwrite path not setting
the flag): the path is correct, not a latent bug.

Conclusion: MR-737 §5.5 substrate caveat ("Stable Row ID for Index is
documented as experimental in lance-4.0.x") is empirically resolved.
Feature works; docs are conservative. RFC shape for MR-927 is a docs-PR.

Refs MR-925, MR-927.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-12 22:56:51 +00:00


[workspace]
resolver = "2"
members = [
    "factorized-batches",
    "custom-lance-index",
    "custom-operator",
    "sip-format-bench",
    "stable-rowid-index",   # 1.7 / MR-927 Phase 1
    # Additional crates added as each experiment is set up:
    # "bitmap-pushdown",    # 1.5
    # "txn-branches-cost",  # 1.6
]

# Pre-Phase-0 validation prototypes for MR-925 / MR-737.
# These are THROWAWAY crates that produce go/no-go signals or calibration
# numbers. Do not merge to main. The findings live in `.context/experiments/`.

[workspace.dependencies]
# Pin to the omnigraph workspace versions so the experiments exercise the
# same substrate behavior the engine will see in Phase 0.
arrow-array = "57"
arrow-ipc = "57"
arrow-schema = "57"
arrow-select = "57"
arrow-cast = { version = "57", features = ["prettyprint"] }
arrow-ord = "57"
arrow = "57"
datafusion = { version = "52", default-features = false }
datafusion-physical-plan = "52"
datafusion-physical-expr = "52"
datafusion-execution = "52"
datafusion-common = "52"
datafusion-expr = "52"
datafusion-functions-aggregate = "52"
datafusion-physical-optimizer = "52"
lance = { version = "4.0.0", default-features = false, features = ["aws"] }
lance-datafusion = "4.0.0"
lance-file = "4.0.0"
lance-index = "4.0.0"
lance-table = "4.0.0"
lance-core = "4.0.0"
tokio = { version = "1", features = ["rt-multi-thread", "macros", "time"] }
futures = "0.3"
async-trait = "0.1"
tempfile = "3"
anyhow = "1"
rand = "0.8"
roaring = "0.11"
croaring = "2"
prost = "0.14"
prost-types = "0.14"
uuid = { version = "1", features = ["v4"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "fmt"] }
serde_json = "1"

[profile.dev]
debug = 0

[profile.dev.package."*"]
opt-level = 2

[profile.release]
opt-level = 3
lto = "thin"
codegen-units = 16