omnigraph

mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-09 01:35:18 +02:00

Author	SHA1	Message	Date
Devin AI	3b661f35d3	MR-927 Phase 1 — stable-row-id repro across BTree/Bitmap/LabelList + compaction Builds and runs the small repro specified in .context/experiments/stable-row-id-compaction.md §5 ("Small repro plan") and extends the writeup with the empirical Phase 1 evidence (F7–F11). Matrix {BTree, Bitmap, LabelList} × {stable=true, stable=false}, 6 fragments forced via small max_rows_per_file and target_rows_per_fragment, with with_row_id() probes pre- and post-compaction. All six cases return correct counts; with enable_stable_row_ids: true the row IDs round-trip unchanged across compaction; with the flag off the row addresses move (fragment_id << 32 | local_row), which is the documented contract. Plus a side experiment confirming that Operation::Overwrite (both staged via InsertBuilder::execute_uncommitted + CommitBuilder::execute and direct Dataset::write Overwrite) inherits manifest.uses_stable_row_ids from the existing dataset, even when the WriteParams flag is absent. This resolves the suspicion about table_store.rs:956 (stage_overwrite path not setting the flag): the path is correct, not a latent bug. Conclusion: MR-737 §5.5 substrate caveat ("Stable Row ID for Index is documented as experimental in lance-4.0.x") is empirically resolved. Feature works; docs are conservative. RFC shape for MR-927 is a docs-PR. Refs MR-925, MR-927. Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>	2026-05-12 22:56:51 +00:00
Devin AI	a09f3ff787	MR-925: experiment 1.4 — SIP wire format bench (roaring vs varint vs raw) - validation-prototypes/sip-format-bench/: 4 sizes × 3 distributions × 3 encodings = 36 cells - writeup at .context/experiments/sip-format-bench.md - finding: roaring wins decisively for dense Lance row IDs (1.05 bits/elem at n=1M dense, 7× faster contains than binary_search); loses badly for uniform u64 (176 bits/elem) - recommendation for §5.6: tagged wire format; tag=0x01 roaring (row IDs); tag=0x02 varint-delta (fallback for non-fragment-clustered)	2026-05-12 17:25:56 +00:00
Devin AI	8e54526024	MR-925: experiment 1.3 — custom UserDefinedLogicalNode + ExecutionPlan e2e - validation-prototypes/custom-operator/: NeighborExpand toy operator with paired ExtensionPlanner + custom QueryPlanner via SessionStateBuilder::with_query_planner - writeup at .context/experiments/custom-operator.md: 5 probes (round-trip, EXPLAIN, predicate guard, composition with Filter + Aggregate, BaselineMetrics) — all pass; ~250 LoC integration footprint; no unsafe; no internal API access - finding: §5.3 is achievable on DF 52.5 as written; deltas are doc-shaped (predicate push-down opt-in, statistics requirement, Partitioning override)	2026-05-12 17:22:02 +00:00
Devin AI	02c4b45c85	MR-925: validation-prototypes scaffolding + exp 1.1 + exp 1.2 - exclude validation-prototypes/ and merge-insert-cas-repro from the main workspace so the nested cargo workspace can use its own pin set - add validation-prototypes/{factorized-batches,custom-lance-index}/ scratch crates (never merged to main; long-lived branch only) - exp 1.1 — factorized batches through DataFusion ops: writeup at .context/experiments/factorized-batches.md (5 cells × 8 ops; all scalar-keyed ops accept List<UInt64> input, UNNEST via CROSS JOIN fails in DF 52.5) - exp 1.2 — custom Lance index plugin from outside lance: writeup at .context/experiments/custom-lance-index.md (5 probes; transaction surface is open, SCALAR_INDEX_PLUGIN_REGISTRY is closed → hard blocker for MR-737 §5.4; recommends upstream path or external-index path)	2026-05-12 16:49:33 +00:00

4 commits