Cluster A previously listed only distance-kernel candidates (cosine, IVF
partition selection, BM25 scoring), which understated the autoresearch
opportunity in Lance. The single largest hot-cycle pile for analytical reads
is the decode path in lance-encoding, not lance-linalg.

Restructure Cluster A into three sub-groups, all sharing the autoresearch loop
shape (single-agent, bit-exact oracle, seconds-scale eval, self-contained code)
but differing in fixture shape:

Distance kernels (lance-linalg):
  A1. Adjacent distance kernels (cosine, dot, hamming)
  A2. IVF partition-selection kernel
  A3. FTS BM25 scoring kernel

Decode kernels (lance-encoding) - highest hot-cycle pile:
  A4. Bitpack integer decode (billions of values per analytical query;
      documented SIMD literature BP128 / simdcomp / Lemire bitpacking)
  A5. Dictionary decode (SIMD gather + prefetch wins on low-cardinality
      string columns)
  A6. FSST string decode (Tableau's 2x SIMD opportunity)

Scan / merge kernels:
  A7. Take / gather (random-access reads; hot for ANN post-fetch)
  A8. Predicate / filter evaluation (per-type comparison kernels)
  A9. Posting list intersection (FTS AND queries; Lemire 2-5x SIMD wins)
  A10. Top-K k-way merge (every LIMIT / ANN query)

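For A10, a scalar baseline that a candidate kernel could be checked against
might look like the sketch below. The function name `top_k_merge` and the
assumption that each run is already sorted ascending (e.g. by distance) are
illustrative, not taken from Lance.

```python
import heapq
from itertools import islice


def top_k_merge(sorted_runs, k):
    """Merge already-sorted runs lazily and keep only the first k items.

    heapq.merge never materializes the full merged sequence, so the cost is
    bounded by k heap steps rather than by the total input size.
    """
    return list(islice(heapq.merge(*sorted_runs), k))


# Three hypothetical per-partition result runs, each sorted ascending.
runs = [[1, 4, 9], [2, 3, 10], [0, 8]]
first_four = top_k_merge(runs, 4)
```

Because the merge is deterministic for a fixed input, the baseline's output can
serve as the bit-exact reference for an optimized variant.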
Each new candidate notes why it's high-leverage, the documented SIMD
opportunity if any, and the bit-exact oracle availability. Updates the
cross-cluster prioritization to add a "largest absolute speedup on a real
workload -> run A4" branch alongside the existing branches; notes that A1
and A4 can be run in parallel by separate agents, since they share a loop shape
but not scaffolding.
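
The bit-exact oracle for a decode kernel such as A4 can be sketched as follows.
This is a minimal illustration, not the lance-encoding layout: the packing
order (little-endian, fixed width) and all function names are assumptions, and
`decode_candidate` merely stands in for whatever optimized kernel the agent
produces.

```python
import random


def pack_bits(values, width):
    """Pack unsigned ints into bytes, `width` bits each, little-endian."""
    acc = 0
    mask = (1 << width) - 1
    for i, v in enumerate(values):
        acc |= (v & mask) << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return acc.to_bytes(nbytes, "little")


def decode_reference(buf, width, count):
    """Slow, obviously correct decoder: this is the oracle."""
    acc = int.from_bytes(buf, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


def decode_candidate(buf, width, count):
    """Stand-in for an optimized kernel: streams bytes through a bit buffer."""
    out, bitbuf, nbits = [], 0, 0
    mask = (1 << width) - 1
    for byte in buf:
        bitbuf |= byte << nbits
        nbits += 8
        while nbits >= width and len(out) < count:
            out.append(bitbuf & mask)
            bitbuf >>= width
            nbits -= width
    return out


def bit_exact(width, count, seed=0):
    """True only if the candidate matches the oracle value for value."""
    rng = random.Random(seed)
    values = [rng.randrange(1 << width) for _ in range(count)]
    buf = pack_bits(values, width)
    expected = decode_reference(buf, width, count)
    return expected == values and decode_candidate(buf, width, count) == expected
```

The check is pass/fail on exact equality, so it runs in seconds and leaves no
tolerance for the loop to exploit.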
scripts/check-agents-md.sh still passes (30/30 links).
https://claude.ai/code/session_01Aq8kBUcjmEPobcEufnWbW5

Add a "Next experiment candidates" section grouped by control-loop shape (the
unit of harness reuse is the loop, not the target):

Cluster A - reuses lance-autoresearch as-is:
  A1. Adjacent distance kernels (cosine, dot, hamming)
  A2. IVF partition-selection kernel (centroid scan)
  A3. FTS BM25 scoring kernel

Cluster B - needs a new harness (BauplanLabs tournament loop):
  B1. IVF_PQ index-build parameter tuning (original "surface 1")
  B2. Auto-index-type selection (categorical + B1 inner)

Cluster C - highest ceiling, hardest harness:
  C1. Physical-plan JSON patching for Lance-backed DataFusion
      (literal BauplanLabs replication)

Each candidate notes the surface, oracle, harness reuse vs. new, and the
expected payoff. A cross-cluster section frames the three "if your goal is X,
run candidate Y next" branches: shortest path to upstream PR (A1), most
user-facing impact (B1), paper-publishable replication (C1). Includes the
go/no-go logic if A1 and B1 split.
scripts/check-agents-md.sh still passes (30/30 links).
https://claude.ai/code/session_01Aq8kBUcjmEPobcEufnWbW5
The note proposed surface 1 (index-build tuning) with a recall@K oracle and a
BauplanLabs evolutionary tournament as the "smallest experiment that would
produce signal." What landed at research/lance-autoresearch/ has a different
shape: PQ kernel optimization with a bit-exact correctness oracle and the
Karpathy single-agent autoresearch loop. Add a "First implementation landed"
section that records the divergence and the reasoning (seconds-scale eval
favors the autoresearch shape; kernel work has a more direct upstream PR path;
the bit-exact oracle removes the incentive to overfit to a dataset). Bumps the
note to revision 3.
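
For contrast with the bit-exact oracle, the recall@K oracle originally proposed
for surface 1 can be sketched as below. The function name and the list-of-ids
representation are assumptions made for illustration.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbours found in the retrieved top-k.

    Both arguments are sequences of ids ordered best-first.
    """
    truth = set(ground_truth[:k])
    if not truth:
        return 0.0
    return len(set(retrieved[:k]) & truth) / len(truth)


# A bit-exact oracle is a plain equality check instead:
#     passed = candidate_output == reference_output
```

recall@K yields a score that depends on the dataset and query set, which is
what makes overfitting possible; the equality check has no such dependence.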
scripts/check-agents-md.sh still passes (30/30 links).
https://claude.ai/code/session_01Aq8kBUcjmEPobcEufnWbW5
User clarified the target: optimize Lance directly rather than OmniGraph's
IR layer. Rewrites the note with Lance as the primary target.
Key reframe: Lance is parameter-heavy (not just plan-shape-heavy). The
biggest wins come from configuration tuples (IvfPq num_partitions /
num_sub_vectors / quantizer choice, nprobes / refine_factor / prefilter,
batch_size / io_buffer_size / thread pools, AIMD throttle, scalar-index
choice per column, compaction policy). None of these need a Lance fork —
Lance accepts them as config and emits the metrics. That makes
parameter-search a no-fork, substrate-respecting application of the
BauplanLabs JSON-Patch-on-DAG mechanic (patches over config objects
instead of plan trees).
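
A patch over a config object can be sketched as below. This is a minimal
stand-in that handles only the RFC 6902 "replace" operation and ignores
JSON Pointer escaping; the nesting of the config dict and the function name are
hypothetical, and only the parameter names come from the list above.

```python
import copy


def apply_replace_ops(config, patch):
    """Apply RFC 6902 'replace' ops to a nested dict and return a new dict."""
    out = copy.deepcopy(config)
    for op in patch:
        if op["op"] != "replace":
            raise ValueError(f"unsupported op: {op['op']}")
        *parents, leaf = op["path"].lstrip("/").split("/")
        node = out
        for key in parents:
            node = node[key]
        if leaf not in node:
            # 'replace' is only valid when the target already exists.
            raise KeyError(op["path"])
        node[leaf] = op["value"]
    return out


# Hypothetical config layout; the search mutates it one patch at a time.
base = {
    "index": {"num_partitions": 256, "num_sub_vectors": 16},
    "query": {"nprobes": 20, "refine_factor": 1},
}
patch = [
    {"op": "replace", "path": "/index/num_partitions", "value": 512},
    {"op": "replace", "path": "/query/nprobes", "value": 40},
]
candidate = apply_replace_ops(base, patch)
```

Returning a fresh dict keeps the baseline config intact, so each candidate in a
tournament round can be derived from the same parent.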
The plan-patching angle (LanceTableProvider → DataFusion ExecutionPlan,
HashJoinExec swap, multi-join reorder) is parked as the long-term play
behind an upstream-contribution step: serializing/round-tripping
ExecutionPlan as JSON is the prerequisite Bauplan added in their fork,
and the right move is to contribute it upstream rather than maintain a
fork.
Ranks six surfaces by value/difficulty, proposes the smallest experiment on
surface 1 (workload-conditioned IvfPq tuning on SIFT1M or LAION-sample
with recall@10 / p95-latency fitness, bol_evol with n_steps=3,
n_samples=4), and treats OmniGraph-IR work as a complementary footnote
since it composes cleanly with a Lance-tuner output.
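
One plausible reading of the recall@10 / p95-latency fitness is a recall floor
with latency as the objective. The sketch below assumes that reading; the
0.95 floor, the nearest-rank percentile, and the function names are
illustrative choices, not values taken from the note.

```python
def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer ceil(0.95 * n) to avoid floating-point rounding at the boundary.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def fitness(recall_at_10, latencies_ms, recall_floor=0.95):
    """Higher is better: reject configs below the floor, then minimize p95."""
    if recall_at_10 < recall_floor:
        return float("-inf")
    return -p95(latencies_ms)
```

Treating recall as a constraint rather than a weighted term keeps the search
from trading correctness for speed.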
Note on Erol et al. (arXiv 2602.10387) — DBPlanBench's evolutionary search
over DataFusion physical plans — and where the mechanic does and does not
port to OmniGraph. The direct port (fork DataFusion, patch physical plans)
is the wrong target since we touch DataFusion only as a MemTable in
table_store::scan_pending_batches; the adapted form (JSON-Patch search over
QueryIR, especially multi-hop Expand ordering / direction) fits cleanly
above the substrate without violating §I substrate respect.
Lists application surfaces by value/difficulty (multi-hop Expand reorder,
RRF hybrid-retrieval k-tuning, filter-pushdown shape, vector index params,
compaction policy) and proposes the smallest experiment that would produce
signal — bol_evol on a ~30-query .gq corpus with bit-identical result
validation. Calls out the Hyrum's Law / determinism discipline (search
offline, freeze plans for serving) and the corpus bootstrap problem.
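
For the RRF k-tuning surface, the parameter being searched is the constant k in
standard reciprocal rank fusion. The sketch below shows the usual formulation
(score = sum of 1 / (k + rank) across rankings); the function name, the default
of 60, and the tie-break on id are assumptions, and nothing here reflects
OmniGraph's actual implementation.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse best-first rankings of ids with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties on id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical vector and full-text rankings for one query.
vector_hits = ["a", "b", "c"]
text_hits = ["b", "c", "a"]
fused = rrf_fuse([vector_hits, text_hits], k=60)
```

The deterministic tie-break matters for the freeze-plans discipline: the same
inputs and the same k must always yield the same fused order.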
Filed under docs/research/ as exploratory; not a committed plan.