From b87be5e9f0235b52b49d7808589dc6b33817fab4 Mon Sep 17 00:00:00 2001 From: Ragnor Comerford Date: Sat, 2 May 2026 17:44:41 +0200 Subject: [PATCH] agents: read every Lance page even slightly relevant, not just the obvious match MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Behavior is interlocked across Lance pages — transactions reference index lifecycle, index lifecycle references compaction, compaction references row-id lineage. Skipping a "slightly relevant" page is how alignment misses happen. The index alone is not a substitute for reading the pages. Co-Authored-By: Claude Opus 4.7 (1M context) --- AGENTS.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AGENTS.md b/AGENTS.md index aa0de0c..21c9e2a 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -5,7 +5,7 @@ This file is the always-on map for AI coding agents (Claude Code, Codex, Cursor, **Required reading every session, every change:** 1. **[docs/invariants.md](docs/invariants.md)** — the architectural invariants and §IX deny-list. Apply to every PR, not only architecture work. -2. **[docs/lance.md](docs/lance.md)** — the curated index of upstream Lance docs. **Consult it before every task** to identify which Lance pages are relevant to what you're about to do, then fetch those upstream URLs before grepping our code or guessing. Lance is the substrate; behavior is documented there, not here. **Always fetch the FULL page content, not summaries** — use `npx mdrip ` (or `npx mdrip --max-chars 200000 ` for very long pages). Tools that summarize pages (like Claude's `WebFetch`) drop load-bearing details — we have caught alignment misses (default flags, `pub(crate)` blockers, three-page sub-specs hidden behind navigation hubs) only after dumping the full markdown. If `npx mdrip` is unavailable, fall back to `curl | pandoc -f html -t markdown` or paste the rendered page text manually; never act on a summarized fetch alone. +2. **[docs/lance.md](docs/lance.md)** — the curated index of upstream Lance docs. **Consult it before every task** to identify which Lance pages are relevant. **Then fetch every page in the matching domain section, plus every page that is even slightly relevant** — not just the page whose title most obviously matches the task. Behavior is interlocked across pages (transactions reference index lifecycle; index lifecycle references compaction; compaction references row-id lineage), and skipping a "slightly relevant" page is how alignment misses happen. The index itself is not a substitute for reading the pages — never act on the index alone. **Always fetch the FULL page content, not summaries** — use `npx mdrip ` (or `npx mdrip --max-chars 200000 ` for very long pages). Tools that summarize pages (like Claude's `WebFetch`) drop load-bearing details — we have caught alignment misses (default flags, `pub(crate)` blockers, three-page sub-specs hidden behind navigation hubs) only after dumping the full markdown. If `npx mdrip` is unavailable, fall back to `curl | pandoc -f html -t markdown` or paste the rendered page text manually; never act on a summarized fetch alone. 3. **[docs/testing.md](docs/testing.md)** — the test-coverage map. **Always check what already covers your change before writing a new test.** Extending an existing test (an assertion, a fixture row, a parameterization) is preferred over a duplicated `init_and_load()` block. Walk the before-every-task checklist to identify existing coverage, run those tests as a clean baseline, and only add a new test fn or file when no existing one owns the area. Tools that support `@`-imports (Claude Code) auto-include all three files via the imports below — note these must sit at column 0 (not inside a blockquote) for the parser to recognize them. Other agents (Codex, Cursor, Cline, …) must open them explicitly at the start of each session.