Mirror of https://github.com/ModernRelay/omnigraph.git (synced 2026-06-09 01:35:18 +02:00)
Cluster A previously listed only distance-kernel candidates (cosine, IVF
partition selection, BM25 scoring), which understated the autoresearch
opportunity in Lance. The single largest hot-cycle pile for analytical reads
is the decode path in lance-encoding, not lance-linalg.
Restructure Cluster A into three sub-groups, all sharing the autoresearch loop
shape (single-agent, bit-exact oracle, seconds-scale eval, self-contained code)
but differing in fixture shape:
Distance kernels (lance-linalg):
  A1. Adjacent distance kernels (cosine, dot, hamming)
  A2. IVF partition-selection kernel
  A3. FTS BM25 scoring kernel

Decode kernels (lance-encoding) - highest hot-cycle pile:
  A4. Bitpack integer decode (billions of values per analytical query;
      documented SIMD literature BP128 / simdcomp / Lemire bitpacking)
  A5. Dictionary decode (SIMD gather + prefetch wins on low-cardinality
      string columns)
  A6. FSST string decode (Tableau's 2x SIMD opportunity)

Scan / merge kernels:
  A7. Take / gather (random-access reads; hot for ANN post-fetch)
  A8. Predicate / filter evaluation (per-type comparison kernels)
  A9. Posting list intersection (FTS AND queries; Lemire 2-5x SIMD wins)
  A10. Top-K k-way merge (every LIMIT / ANN query)
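
To make the "bit-exact oracle" idea concrete for A4, here is a minimal sketch in Python (chosen for brevity; the real kernels live in Rust crates). It assumes an LSB-first packing layout, and every function name in it is hypothetical and not taken from lance-encoding. A slow bit-at-a-time decoder acts as the oracle, and a faster candidate must reproduce its output exactly.

```python
import random


def pack(values, width):
    """Pack unsigned ints of `width` bits, LSB-first, into bytes (assumed layout)."""
    acc = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << width)
        acc |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return acc.to_bytes(nbytes, "little")


def decode_reference(buf, width, count):
    """Oracle: deliberately slow, reads one bit at a time."""
    out = []
    for i in range(count):
        v = 0
        for b in range(width):
            bit = i * width + b
            v |= ((buf[bit >> 3] >> (bit & 7)) & 1) << b
        out.append(v)
    return out


def decode_candidate(buf, width, count):
    """Candidate: shift-and-mask over the whole buffer (stand-in for a SIMD kernel)."""
    acc = int.from_bytes(buf, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


def bit_exact(width, count, seed=0):
    """True when candidate, oracle, and original values all agree exactly."""
    rng = random.Random(seed)
    values = [rng.randrange(1 << width) for _ in range(count)]
    buf = pack(values, width)
    return decode_candidate(buf, width, count) == decode_reference(buf, width, count) == values
```

Sweeping `bit_exact` over bit widths 1 to 32 with a few hundred values each finishes well within a second, which is the kind of seconds-scale evaluation the loop shape calls for.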
Each new candidate notes why it's high-leverage, the documented SIMD
opportunity if any, and whether a bit-exact oracle is available. Updates the
cross-cluster prioritization to add a "largest absolute speedup on a real
workload -> run A4" branch alongside the existing branches; notes that A1
and A4 can be run in parallel by separate agents since they share loop shape
but not scaffolding.
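
Oracle availability is easiest to see on the merge side. As a hedged illustration for A10 (not the Lance implementation; both function names are hypothetical), a full sort of all candidates is an obvious bit-exact oracle for a lazy k-way merge over already-sorted runs of `(distance, row_id)` pairs:

```python
import heapq
from itertools import islice


def top_k_merge(runs, k):
    """Candidate: lazy k-way merge of ascending runs, stopping after k items."""
    return list(islice(heapq.merge(*runs), k))


def top_k_oracle(runs, k):
    """Oracle: concatenate everything, sort, and slice."""
    return sorted(item for run in runs for item in run)[:k]


# Each run is assumed to be sorted ascending, e.g. one run per partition.
runs = [
    [(0.1, 3), (0.4, 1), (0.9, 8)],
    [(0.2, 7), (0.3, 2)],
    [],
]
assert top_k_merge(runs, 3) == top_k_oracle(runs, 3)
```

Using tuples that carry a row id keeps the comparison total, so ties on distance still produce a single deterministic answer for both functions.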
scripts/check-agents-md.sh still passes (30/30 links).
https://claude.ai/code/session_01Aq8kBUcjmEPobcEufnWbW5
| Name |
|---|
| releases |
| research |
| architecture.md |
| audit.md |
| branch-protection.md |
| branches-commits.md |
| changes.md |
| ci.md |
| cli-reference.md |
| cli.md |
| codeowners.md |
| constants.md |
| deployment.md |
| embeddings.md |
| errors.md |
| execution.md |
| indexes.md |
| install.md |
| invariants.md |
| lance.md |
| maintenance.md |
| merge.md |
| policy.md |
| query-language.md |
| runs.md |
| schema-language.md |
| schema-lint.md |
| server.md |
| storage.md |
| testing.md |
| transactions.md |