From ada58ccd7bfa2decf1b510a3d7128899ca38f082 Mon Sep 17 00:00:00 2001 From: Ragnor Comerford Date: Wed, 29 Apr 2026 00:03:50 +0200 Subject: [PATCH] Make "check existing coverage first" a top-level testing principle MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The original docs/testing.md mentioned finding existing tests as step 1 of the checklist but never explicitly said "if existing coverage already addresses your case, extend it; don't duplicate." Adds a prominent "First principle" section that names extend-vs-new as the preferred outcome and lists three duplicated init_and_load blocks as the most common form of test rot. Adds an extra checklist item: verify your change makes an *existing* test fail before it makes a new one pass — if you can break the code without breaking a test, that coverage gap is the bug to fix first. Strengthens the AGENTS.md callout so the principle ("always check what already covers it") is in scope from the top of every session. Co-Authored-By: Claude Opus 4.7 --- AGENTS.md | 2 +- docs/testing.md | 25 +++++++++++++++++++++---- 2 files changed, 22 insertions(+), 5 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 7217cbd..e1919cb 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -6,7 +6,7 @@ This file is the always-on map for AI coding agents (Claude Code, Codex, Cursor, > > 1. **[docs/invariants.md](docs/invariants.md)** — the architectural invariants and §IX deny-list. Apply to every PR, not only architecture work. > 2. **[docs/lance.md](docs/lance.md)** — the curated index of upstream Lance docs. **Consult it before every task** to identify which Lance pages are relevant to what you're about to do, then fetch those upstream URLs before grepping our code or guessing. Lance is the substrate; behavior is documented there, not here. -> 3. **[docs/testing.md](docs/testing.md)** — the test-coverage map. **Walk the before-every-task checklist** to identify what tests already cover the area you're changing, run them as a clean baseline before editing, plan your new test up front, and reuse the existing helpers / fixtures. +> 3. **[docs/testing.md](docs/testing.md)** — the test-coverage map. **Always check what already covers your change before writing a new test.** Extending an existing test (an assertion, a fixture row, a parameterization) is preferred over a duplicated `init_and_load()` block. Walk the before-every-task checklist to identify existing coverage, run those tests as a clean baseline, and only add a new test fn or file when no existing one owns the area. > > Tools that support `@`-imports (Claude Code) auto-include all three files via the imports below. Other agents (Codex, Cursor, Cline, …) must open them explicitly at the start of each session. > diff --git a/docs/testing.md b/docs/testing.md index 1d094c3..c23c6ec 100644 --- a/docs/testing.md +++ b/docs/testing.md @@ -77,16 +77,33 @@ There is **no** coverage tooling in the repo today: no `tarpaulin.toml`, no `cod If introducing coverage tooling is in scope for your task, the natural first step is `cargo-llvm-cov` wired into a separate CI job, and a per-crate threshold rather than a global one. +## First principle: check what already covers it + +**Before writing any new test, check whether an existing test already covers the case.** The cost of duplicating coverage is high: more code to read, more places to keep in sync when behavior changes, and more drift when one copy lags. The cost of *extending* an existing test is usually one extra assertion or one extra fixture row. + +How to check: + +1. **Map the change to an area** — use the engine integration-test table above (`branching.rs`, `runs.rs`, `search.rs`, etc.). The filename usually names the area. +2. **Open the file and skim every test fn name.** Test fn names are the index — read them all, not just the first few. +3. **Grep for the symbol or path you're changing.** `rg ` or `rg ` across all `tests/` directories surfaces existing coverage you might miss. +4. **Decide one of three outcomes**, in this order of preference: + - *Existing test already asserts the new behavior* → no new test needed; this PR is a refactor or no-op behaviorally. Confirm by running the existing test against the change. + - *Existing test covers the area but not your case* → **add an assertion or a fixture row to the existing test**, don't write a new function with `init_and_load()` again. + - *No existing coverage in any test file* → only then write a new test; put it in the file that owns the area, or open a new file only if the area itself is new. + +Three duplicated `init_and_load() → run_query → assert_eq` blocks where one parameterized test would do is the most common form of test rot in this repo. Don't add to it. + ## Before-every-task checklist When you pick up any change, walk through this: -1. **Find the existing tests covering this area.** Use the table above; for engine work, start with `crates/omnigraph/tests/.rs`. Open the file and read what's already exercised. -2. **Run them locally before editing.** `cargo test --workspace --locked` for the broad pass; `-p --test ` for a focused loop. Confirm a clean baseline. -3. **Plan the new test up front.** Decide which test file the new case belongs in (extend an existing one if it owns the area; only add a new file if you're opening a new area). -4. **Use the helpers.** `init_and_load()`, fixture files, the CLI `support` harness — re-use them. Don't bootstrap a fresh repo by hand if a helper exists. +1. **Find existing coverage** (per the principle above). Don't just look at the first test file by name — grep for the symbol you're touching across every crate's `tests/`. +2. **Run those tests locally before editing.** `cargo test --workspace --locked` for the broad pass; `-p --test ` for a focused loop. Confirm a clean baseline. +3. **Decide extend-vs-new** explicitly. If you can extend an existing test (assertion, fixture row, parameterization), do that. Only add a new test fn or new file if no existing one owns the area. +4. **Reuse the helpers.** `init_and_load()`, fixture files, the CLI `support` harness — re-use them. Don't bootstrap a fresh repo by hand if a helper exists. 5. **Mind the boundary.** Per [docs/invariants.md §VII.33](invariants.md), test at the layer the change lives at — planner-level changes deserve planner-level tests, not just end-to-end. 6. **For substrate-touching changes** (Lance behavior), reach for `failpoints` or fixture-driven scenarios, not stubbed-out mocks. 7. **For server / API changes**, confirm the OpenAPI regeneration happens in `openapi.rs` and that the diff lands in `openapi.json`. +8. **Verify your change makes an existing test fail before it makes the new one pass.** If you can break the code without breaking a test, your coverage gap is the problem to fix first. When in doubt, re-read [docs/invariants.md §VII](invariants.md) — quality gates apply to every change.