omnigraph

mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-15 01:55:13 +02:00

Claude 0d72cc69fb research: restructure lance-autoresearch as multi-target workspace

The original lance-autoresearch was one Cargo crate optimizing one Lance kernel (PQ L2 distance). With 9+ candidate targets enumerated in the research note, a single-crate shape doesn't scale: per-target deps will collide, the agent's edits to one target's kernels.rs would conflict with another's lib path, and build/test isolation is lost.

Restructure into a Cargo workspace. Layout:

  research/lance-autoresearch/
  ├── Cargo.toml              (workspace root)
  ├── README.md               (target table, contract overview, repo layout)
  ├── HARNESS.md              (universal loop contract every target inherits)
  ├── crates/
  │   ├── harness-common/     (shared: SplitMix64, geomean, peak RSS,
  │   │                        MAX_ABS_ERR, TOPK_DIST_TOL, TIME_BUDGET_SECS)
  │   └── pq-l2/              (the landed target; was the previous single crate)
  └── docs/
      ├── design.md           (rationale for workspace shape, no Target trait)
      ├── adding-a-target.md  (step-by-step workflow for new targets)
      └── targets/pq-l2.md    (per-target capsule)

Decisions documented in docs/design.md:

- Workspace, not single crate: per-target Cargo.toml so deps don't collide; per-target src tree so agent edits don't conflict; per-target build/test isolation for faster agent iteration.
- harness-common as a plumbing-only crate (PRNG, geomean, peak RSS, tolerance constants, time budget). Intentionally NO Target trait - decode kernel signatures and distance kernel signatures differ enough that a unifying trait would either bloat or require erased boxing. Each target is its own natural shape.
- Per-target program.md + shared HARNESS.md: the loop contract is universal, the priors and API spec are per-target. Two files instead of one because copy-pasting the universal loop into every program.md would drift.

pq-l2 refactor:

- src/* moved into crates/pq-l2/src/* via git mv (preserves history)
- crate renamed lance-autoresearch -> pq-l2
- SplitMix64, geomean, peak_rss_mb, MAX_ABS_ERR, TOPK_DIST_TOL, TIME_BUDGET_SECS now imported from harness-common (drops ~70 lines of duplication that would have been copy-pasted into every new target)
- program.md trimmed: setup/loop/hygiene moved to HARNESS.md; only the PQ-L2-specific API contract and SIMD priors remain
- Cargo.toml depends on harness-common via path; workspace.dependencies pins criterion uniformly across targets

The 10 candidate targets from the research note (A1 cosine/dot/hamming, A2 IVF partition select, A3 FTS BM25, A4 bitpack decode, A5 dictionary decode, A6 FSST decode, A7 take/gather, A8 predicate eval, A9 posting list intersect, A10 top-K merge) are listed in README.md's target table as "candidate"; each gets a docs/targets/<name>.md capsule when it's spun up. docs/adding-a-target.md documents the cp -r + edit-Cargo.toml + rewrite-three-files workflow.

Verified end-to-end:

- cargo build --release: clean, both crates compile
- cargo clippy --release --workspace --all-targets -- -D warnings: clean
- cargo test --release --workspace: 6/6 pass (4 harness-common + 2 pq-l2)
- cargo run --release --bin run_experiment -p pq-l2: correctness pass, geomean ~880k ns, exit 0, ~30s wall-clock
- omnigraph parent workspace unchanged (research/ excluded as before)

https://claude.ai/code/session_01Aq8kBUcjmEPobcEufnWbW5		2026-05-15 00:15:02 +00:00
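
The commit message names SplitMix64 and geomean among the helpers shared through harness-common but does not show them. The following is a minimal sketch of what such plumbing could look like, assuming the standard published SplitMix64 constants and a log-space geometric mean. The method names (`new`, `next_u64`, `next_f32`) and the `Option` return type are illustrative assumptions, not the crate's actual API.

```rust
/// Sketch of a SplitMix64 PRNG: small, fast, and fully deterministic for a
/// given seed, which makes benchmark inputs reproducible across runs.
/// (Hypothetical shape; the real harness-common type may differ.)
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    /// Advance the state by the golden-ratio increment, then mix it.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f32 in [0, 1), built from the top 24 bits so the
    /// conversion to f32 is exact.
    pub fn next_f32(&mut self) -> f32 {
        (self.next_u64() >> 40) as f32 / (1u64 << 24) as f32
    }
}

/// Geometric mean of strictly positive samples, computed in log space to
/// avoid overflow when multiplying many large timings together.
/// Returns None for an empty slice or any non-positive (or NaN) sample.
pub fn geomean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() || samples.iter().any(|&x| !(x > 0.0)) {
        return None;
    }
    let log_sum: f64 = samples.iter().map(|x| x.ln()).sum();
    Some((log_sum / samples.len() as f64).exp())
}
```

A target crate would seed the generator with a fixed constant to build its inputs, time each workload, and report `geomean` over the per-workload timings, so that one slow workload cannot dominate the score the way it would in an arithmetic mean.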
targets	research: restructure lance-autoresearch as multi-target workspace	2026-05-15 00:15:02 +00:00
adding-a-target.md	research: restructure lance-autoresearch as multi-target workspace	2026-05-15 00:15:02 +00:00
design.md	research: restructure lance-autoresearch as multi-target workspace	2026-05-15 00:15:02 +00:00