Add docs/testing.md as required-read every session

Maps the test surface (engine integration tests by area, CLI/server
tests, helpers harness, fixtures, failpoints feature, RustFS S3
integration, OpenAPI drift) and gives a before-every-task checklist:
find existing tests for the area, run them as a clean baseline, plan
the new test up front, reuse helpers, mind the layer boundary per
invariants §VII.33.

Notes that there's no coverage tooling today — coverage knowledge
comes from reading and running the relevant integration tests, not a
tarpaulin/codecov report.

Threaded into AGENTS.md as the third required-reading file alongside
invariants.md and lance.md, with a Claude-Code @-import so agents
load it on every turn.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Ragnor Comerford 2026-04-28 23:55:21 +02:00
parent a06d8bcf82
commit 8be0e6a067
No known key found for this signature in database
2 changed files with 96 additions and 1 deletions

View file

@ -6,11 +6,13 @@ This file is the always-on map for AI coding agents (Claude Code, Codex, Cursor,
>
> 1. **[docs/invariants.md](docs/invariants.md)** — the architectural invariants and §IX deny-list. Apply to every PR, not only architecture work.
> 2. **[docs/lance.md](docs/lance.md)** — the curated index of upstream Lance docs. **Consult it before every task** to identify which Lance pages are relevant to what you're about to do, then fetch those upstream URLs before grepping our code or guessing. Lance is the substrate; behavior is documented there, not here.
> 3. **[docs/testing.md](docs/testing.md)** — the test-coverage map. **Walk the before-every-task checklist** to identify what tests already cover the area you're changing, run them as a clean baseline before editing, plan your new test up front, and reuse the existing helpers / fixtures.
>
> Tools that support `@`-imports (Claude Code) auto-include both files via the imports below. Other agents (Codex, Cursor, Cline, …) must open them explicitly at the start of each session.
> Tools that support `@`-imports (Claude Code) auto-include all three files via the imports below. Other agents (Codex, Cursor, Cline, …) must open them explicitly at the start of each session.
>
> @docs/invariants.md
> @docs/lance.md
> @docs/testing.md
`CLAUDE.md` is a symlink to this file — there is exactly one source of truth. Edit `AGENTS.md`.
@ -67,6 +69,7 @@ Full diagram and concurrency model: [docs/architecture.md](docs/architecture.md)
|---|---|
| **Architectural invariants & deny-list (read before any non-trivial proposal or review)** | **[docs/invariants.md](docs/invariants.md)** |
| **Lance docs index — fetch upstream Lance docs by problem domain** | **[docs/lance.md](docs/lance.md)** |
| **Test coverage map — what's covered, what helpers to reuse, before-every-task checklist** | **[docs/testing.md](docs/testing.md)** |
| Architecture, L1/L2 framing, concurrency model | [docs/architecture.md](docs/architecture.md) |
| Storage layout, `__manifest` schema, URI schemes, S3 env vars | [docs/storage.md](docs/storage.md) |
| `.pg` schema language, types, constraints, annotations, migration planning | [docs/schema-language.md](docs/schema-language.md) |

92
docs/testing.md Normal file
View file

@ -0,0 +1,92 @@
# Testing
This file is the always-on map of the test surface. **Consult it before every task** so you know what tests already cover the area you're about to change, what helpers to reuse, and where a new test belongs. The architectural invariant *"tests at every boundary, not just end-to-end"* lives in [docs/invariants.md §VII.33](invariants.md).
## Where tests live, per crate
| Crate | Path | Style |
|---|---|---|
| `omnigraph` (engine) | `crates/omnigraph/tests/` | Integration tests (16 files), fixture-driven, share `tests/helpers/mod.rs` |
| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | `cli.rs` (unit-ish), `system_local.rs`, `system_remote.rs`, share `tests/support/mod.rs` |
| `omnigraph-server` | `crates/omnigraph-server/tests/` | `server.rs` (HTTP-level), `openapi.rs` (OpenAPI drift / regeneration) |
| `omnigraph-compiler` | mostly in-source `#[cfg(test)] mod tests` | Parser, type-checker, IR lowering, lint |
The engine's `tests/` is the principal coverage surface; most graph-shaped behavior is exercised there.
## Engine integration tests (`crates/omnigraph/tests/`)
| File | Covers |
|---|---|
| `end_to_end.rs` | Full init → load → query/mutate flow |
| `branching.rs` | Branch create / list / delete, lazy fork |
| `runs.rs` | Transactional runs (begin/publish/abort), idempotency |
| `lifecycle.rs` | Repo lifecycle, schema state |
| `point_in_time.rs` | Snapshots, time travel (`snapshot_at_version`, `entity_at`) |
| `changes.rs` | `diff_between` / `diff_commits` |
| `consistency.rs` | Cross-table snapshot isolation, atomic publish |
| `schema_apply.rs` | Migration plan + apply, schema-apply lock |
| `search.rs` | FTS / vector / hybrid (`bm25`, `nearest`, `rrf`) |
| `traversal.rs` | `Expand`, variable-length hops, anti-join |
| `aggregation.rs` | `count`, `sum`, `avg`, `min`, `max` |
| `export.rs` | NDJSON streaming export filters |
| `s3_storage.rs` | S3-backed repo (skipped unless `OMNIGRAPH_S3_TEST_BUCKET` is set) |
| `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior |
| `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature) |
## Fixtures
`crates/omnigraph/tests/fixtures/` holds the canonical schema (`.pg`), seed data (`.jsonl`), and queries (`.gq`) shared across tests. Reuse these before inventing new ones — the helpers harness already knows how to load them.
## Test helpers
- **Engine**`crates/omnigraph/tests/helpers/mod.rs`: `init_and_load()` (bootstrap a temp repo + load standard fixture), `snapshot_main()`, `snapshot_branch()`, query/mutation runners, row collection and counting. Use these instead of hand-rolling.
- **CLI**`crates/omnigraph-cli/tests/support/mod.rs`: `Command`-style wrapper for invoking `omnigraph`, server-process spawning, fixture resolution, output assertion helpers.
- **Server** — no shared helpers; server tests call the `Omnigraph` engine API directly and exercise endpoints over the wire.
> Note: there is **no `MemStorage` or in-memory backend** today. Tests use `tempfile::tempdir()` for local FS. If you find yourself needing one for layer isolation, that's an architectural ask — see [docs/invariants.md §VII.34](invariants.md) (reference impl + test impl per trait).
## Failpoints (fault injection)
- Cargo feature: `failpoints = ["dep:fail", "fail/failpoints"]` (in `crates/omnigraph/Cargo.toml`).
- Wrapper: `crates/omnigraph/src/failpoints.rs` exposes `maybe_fail("name")` and `ScopedFailPoint` for tests.
- Call sites are inserted at sensitive transaction boundaries (branch create, graph publish commit, etc.).
- Activated tests: `crates/omnigraph/tests/failpoints.rs`. Run with `cargo test -p omnigraph-engine --features failpoints --test failpoints`.
## RustFS / S3 integration
CI runs three S3-backed tests against a containerized RustFS server (`.github/workflows/ci.yml``rustfs_integration` job):
- `cargo test -p omnigraph-engine --test s3_storage`
- `cargo test -p omnigraph-server --test server server_opens_s3_repo_directly_and_serves_snapshot_and_read`
- `cargo test -p omnigraph-cli --test system_local local_cli_s3_end_to_end_init_load_read_flow`
Locally, set `OMNIGRAPH_S3_TEST_BUCKET` (and the usual `AWS_*` vars including `AWS_ENDPOINT_URL_S3` for non-AWS) before running. Without those, S3 tests skip gracefully.
## OpenAPI drift
`crates/omnigraph-server/tests/openapi.rs` regenerates `openapi.json` and diffs against the checked-in copy. CI auto-commits the regeneration on same-repo PRs and otherwise runs in strict-check mode (env: `OMNIGRAPH_UPDATE_OPENAPI`).
## Examples & benches
- `crates/omnigraph/examples/bench_expand.rs` — runnable example (not part of CI).
- No `benches/` directories. The architectural rule [docs/invariants.md §VII.36](invariants.md) requires benchmark motivation before optimization, so add `benches/` per crate when you ship a perf-driven change.
## Coverage tooling — what's missing
There is **no** coverage tooling in the repo today: no `tarpaulin.toml`, no `codecov.yml`, no coverage CI step. If you want to know whether your change is covered, the answer comes from reading and running the relevant integration tests, not from a tool.
If introducing coverage tooling is in scope for your task, the natural first step is `cargo-llvm-cov` wired into a separate CI job, and a per-crate threshold rather than a global one.
## Before-every-task checklist
When you pick up any change, walk through this:
1. **Find the existing tests covering this area.** Use the table above; for engine work, start with `crates/omnigraph/tests/<area>.rs`. Open the file and read what's already exercised.
2. **Run them locally before editing.** `cargo test --workspace --locked` for the broad pass; `-p <crate> --test <file>` for a focused loop. Confirm a clean baseline.
3. **Plan the new test up front.** Decide which test file the new case belongs in (extend an existing one if it owns the area; only add a new file if you're opening a new area).
4. **Use the helpers.** `init_and_load()`, fixture files, the CLI `support` harness — re-use them. Don't bootstrap a fresh repo by hand if a helper exists.
5. **Mind the boundary.** Per [docs/invariants.md §VII.33](invariants.md), test at the layer the change lives at — planner-level changes deserve planner-level tests, not just end-to-end.
6. **For substrate-touching changes** (Lance behavior), reach for `failpoints` or fixture-driven scenarios, not stubbed-out mocks.
7. **For server / API changes**, confirm the OpenAPI regeneration happens in `openapi.rs` and that the diff lands in `openapi.json`.
When in doubt, re-read [docs/invariants.md §VII](invariants.md) — quality gates apply to every change.