omnigraph/docs/dev/index.md
Ragnor Comerford f6d2cc03e3
write-path cost gate + opener bypass (#288)
* docs(rfc): RFC-013 write-path latency design + index link

* perf(engine): open write-path tables directly, bypassing the namespace builder

Write opens routed through DatasetBuilder::from_namespace, whose describe_table
opened the whole dataset just to return a location and then re-resolved the
latest version — an O(commit-depth) double latest-resolution per table open that
missed Lance's O(1) version-hint fast path. On an object store this dominated
write latency (~70%, RFC-013 section 2.4).

TableStore::open_dataset_head_for_write now delegates to the direct opener
(open_dataset_head: Dataset::open by URI + checkout_branch, routed through the
tracked opener so cost tests can count it; a no-op in production). The manifest
already holds every sub-table's location, so the namespace catalog lookup was
redundant; ensure_expected_version still validates head == pinned for strict ops.
This completes PR #268's open-by-location migration on the write side.

With both reads (PR #268) and now writes bypassing it, nothing in production
routes through the per-table Lance namespace. The dead open chain
(load_table_from_namespace, open_table_head_for_write) is deleted and the
StagedTableNamespace contract apparatus is gated #[cfg(test)], mirroring the
already-test-only read namespace; __manifest commit coordination
(GraphNamespacePublisher) is a separate component and is unaffected.

See docs/dev/rfc-013-write-path-latency.md sections 2.4 and 9 (step 3a).

* test(engine): write-path cost-budget gate on a shared harness

Adds tests/helpers/cost.rs, a store-agnostic cost harness (IoCounts/StagedCounts,
measure/measure_with_staged, assert_flat, local_graph/s3_graph) that the read-side
warm_read_cost.rs, write_cost.rs, and write_cost_s3.rs share, so the IOTracker /
task-local plumbing lives in exactly one place instead of duplicated per test.

write_cost.rs (local, every-PR) gates the internal-table scan term flat in
commit-history depth (a RED #[ignore]'d LOCK, the acceptance for bringing the
internal tables into compaction) plus green guards: a single insert's data writes
are bounded, a per-write read-op ceiling fails the moment a round-trip is added,
and a keyed insert routes through stage_merge_insert once with no stage_append or
vector-index build.

write_cost_s3.rs (bucket-gated, rustfs CI) gates the data-table opener term flat
across depth — the object-store-RPC phenomenon local FS cannot reproduce, and the
red->green proof of the opener bypass. Wired into the rustfs_integration CI job
and its path filter.

Guards the "hot-path cost is bounded by work, not history" invariant on writes.
See docs/dev/rfc-013-write-path-latency.md section 5.1, docs/dev/testing.md.

* docs(rfc): RFC-013 step 3a landed; write-skew coupling; cost-gate test map

- Section 9: mark step 1 (gate + harness) and step 3a (opener bypass) landed;
  record the per-table namespace retirement to test-only and the corrected
  measurement note (the opener win is S3-only; the local data-table growth is the
  merge-insert/RI fragment scan, a compaction term, not the opener).
- Sections 7.1/6/11/5.5/10: correct the cross-table write-skew analysis after a
  prototype proved the scoped expected-set fix is a no-op against the per-object_id
  manifest (disjoint writers never share a row, so Lance never conflicts, the
  publisher never retries, and the expected check is a non-atomic pre-check
  evaluated once against stale state). The fix needs a shared contention row
  (Phase-7 graph_head / a minimal head row / commit-time re-validation), so it is
  coupled to that row, not standalone; that contention is load-bearing for
  correctness, not a drawback. Split the concurrent face (read-set + head) from the
  sequential face (inbound-RI validation on node removal) -- two different fixes.
- testing.md: add write_cost.rs / helpers/cost.rs / write_cost_s3.rs to the test
  map; document the local-vs-S3 backend split; extend the cost-budget checklist
  item to the write/open path and point at the shared harness.

* test(engine): isolate the opener in the S3 cost gate; fail loud on S3 setup errors

Addresses two PR review findings on the bucket-gated write_cost_s3 gate:

- The data-table opener was not isolated: `data_reads` also counts the
  merge-insert/RI scan, which reads O(fragment-count) and so grows with history
  for a different reason (compaction's domain, not the opener) -- the same term
  that made the local data-table count grow. The flat assertion would false-RED
  or misattribute scan growth to the opener on rustfs. Fix: compact (db.optimize)
  before each measurement so the table holds ~1 fragment, bounding the scan and
  leaving the opener's latest-version resolution as the only history-varying term.
  Compaction preserves version history, so the opener still faces a deep
  _versions/ chain -- the thing under test.

- s3_graph used `.ok()?`, so when OMNIGRAPH_S3_TEST_BUCKET was set but the store
  was down/misconfigured, init/seed failures collapsed to None and the gate
  skipped + passed vacuously. Fix: skip only when the bucket env var is absent;
  once it is set, init/seed failures panic (mirrors tests/s3_storage.rs).

* test(engine): isolate the S3 opener with a per-prefix IO probe (correct-by-design)

Replaces the fixture-bounded isolation (compact-before-measure) from the prior
commit with the root fix: a path-classifying ObjectStore wrapper (PrefixCounter)
that attributes each data-table read to the opener term (_versions/.manifest) vs
the scan term (data/*.lance). IoCounts now exposes data_opener_reads /
data_scan_reads, so write_cost_s3 asserts the opener flat *directly* -- no
compaction or fixture massaging, and the assertion measures the opener, not the
conflated total. Closes the "harness conflates two IO terms" class: any cost test
(read or write) can now isolate the opener.

PrefixCounter implements only the object_store 0.13 core ObjectStore methods; the
convenience surface (get/put/head/...) routes through get_opts/put_opts via
ObjectStoreExt's blanket impl, so every read/write is still counted.

Validated locally (every-PR) by write_cost::data_table_reads_split_into_flat_opener_
and_growing_scan: opener stays flat (7 -> 3) while scan grows (11 -> 91) and
opener + scan == data_total exactly -- proving the classifier and confirming the
local data-table growth is the fragment scan, not the opener. warm_read_cost
(12 tests) stays green under the shared-harness change.

* refactor(tests): remove cost-harness duplication and namespace cfg(test) noise

Branch self-review (no behavior change) — pay down three liabilities the
write-path work left:

- warm_read_cost.rs kept its own probes() (three IOTrackers + a QueryIoProbes +
  a probe counter) and read raw .stats().read_iops — duplicating the shared
  helpers::cost harness this branch introduced. Migrated all 12 tests onto
  measure()/IoCounts; deleted the local probes(). (This also makes IoCounts'
  version_probes field used rather than dead.)

- insert_cost was copy-pasted verbatim into write_cost.rs and write_cost_s3.rs.
  Hoisted to helpers::cost::measure_insert so the measured write is defined once.

- The per-table Lance namespace (namespace.rs) became entirely test-only after
  step 3a, but was gated with ~22 per-item #[cfg(test)] attributes. Collapsed to
  a single `#[cfg(test)] mod namespace;` and stripped the per-item attributes;
  merged the import groups the gating had split.

Verified: lib in-source 162 passed; write_cost 4 + warm_read_cost 12 passed;
forbidden_apis passed.
2026-06-20 13:31:15 +02:00

100 lines
6.5 KiB
Markdown

# Developer Docs
**Audience:** contributors, maintainers, and coding agents
This is the contributor-facing entry point. These docs explain architecture,
invariants, implementation contracts, test ownership, and upstream Lance
constraints. User-facing behavior should still be documented through
[docs/user/index.md](../user/index.md) and the relevant public reference docs.
## Required For Every Non-Trivial Change
| Need | Read |
|---|---|
| Architectural rules, known gaps, deny-list | [invariants.md](invariants.md) |
| Upstream Lance source-of-truth index | [lance.md](lance.md) |
| Existing test coverage and test placement | [testing.md](testing.md) |
## Architecture And Storage
| Area | Read |
|---|---|
| System structure, L1/L2 framing, component diagrams | [architecture.md](architecture.md) |
| On-disk layout, manifest schema, URI behavior | [storage.md](../user/concepts/storage.md) |
| Direct-publish writes, D2, staged writes, recovery sidecars | [writes.md](writes.md) |
| Query execution, mutation execution, loader flow | [execution.md](execution.md) |
| Index lifecycle and graph topology indexes | [indexes.md](../user/search/indexes.md) |
| Branch and commit internals | [branches-commits.md](../user/branching/index.md) |
| Three-way merge implementation and conflicts | [merge.md](merge.md) |
| Diff/change-feed implementation | [changes.md](../user/branching/changes.md) |
| Branch protection policy | [branch-protection.md](branch-protection.md) |
## Language, Runtime, And Boundaries
| Area | Read |
|---|---|
| Schema grammar, catalog, migration planner | [schema-language.md](../user/schema/index.md) |
| Query grammar, IR, lints, mutation restrictions | [query-language.md](../user/queries/index.md) |
| Embedding client and `@embed` integration | [embeddings.md](../user/search/embeddings.md) |
| Cedar policy surface and server gating | [policy.md](../user/operations/policy.md) |
| Server auth, OpenAPI, endpoint handlers | [server.md](../user/operations/server.md) |
| Error taxonomy and serialization | [errors.md](../user/operations/errors.md) |
| Constants and tunables | [constants.md](../user/reference/constants.md) |
| Transaction model public contract | [transactions.md](../user/branching/transactions.md) |
## Project Operations
| Area | Read |
|---|---|
| CI and release workflows | [ci.md](ci.md) |
| Install and deployment packaging | [install.md](../user/install.md), [deployment.md](../user/deployment.md) |
| Release history | [releases/](../releases/) |
## Contribution & Governance
| Area | Read |
|---|---|
| How to contribute (external) | [CONTRIBUTING.md](../../CONTRIBUTING.md) |
| Governance model, roles, decision authority | [GOVERNANCE.md](../../GOVERNANCE.md) |
| Public contribution RFC track | [rfcs/](../rfcs/) |
The `docs/rfcs/` track is the **public, externally-authorable** RFC process. The
maintainer/internal RFCs below (`rfc-00N-*.md`) are a separate, team-owned
track; don't conflate the two.
## Case Studies
Worked write-ups of specific bugs — root cause, fix, and the reasoning that
ruled out the tempting-but-wrong alternatives. Read these for the debugging
pattern, not just the outcome.
| Area | Read |
|---|---|
| camelCase property filters lowercased at runtime (#283) — two engine→Lance boundaries, two different fixes | [bug-case-fix.md](bug-case-fix.md) |
## Active Implementation Plans
Working documents for in-flight feature work. Removed when the work lands.
| Area | Read |
|---|---|
| Schema-lint chassis v1 (MR-694) — `--allow-data-loss`, soft/hard drops | [schema-lint-v1-plan.md](schema-lint-v1-plan.md) |
| Inline + stored queries, request/response envelope, MCP (MR-656 / MR-976 / MR-969) | [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) |
| Config & CLI architecture — layered config, client targeting, file naming (MR-973 / MR-974 / MR-981) | [rfc-002-config-cli-architecture.md](rfc-002-config-cli-architecture.md) |
| MCP server surface — full tool parity, stored queries, modular auth (MR-969 / MR-956 / MR-974) | [rfc-003-mcp-server-surface.md](rfc-003-mcp-server-surface.md) |
| Future cluster control plane — declarative as-code config, JSON state ledger, reconciler | [cluster-config-specs.md](cluster-config-specs.md), [cluster-axioms.md](cluster-axioms.md), [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md) |
| Cluster graph & schema apply — Phase 4 sidecars, roll-forward recovery, approval artifacts | [rfc-004-cluster-graph-schema-apply.md](rfc-004-cluster-graph-schema-apply.md) |
| Server boots from cluster state — Phase 5 mode switch, applied-revision serving | [rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md) |
| Per-operator config — `~/.omnigraph/` identity, keyed credentials, named servers (the operator slice of RFC-002) | [rfc-007-operator-config.md](rfc-007-operator-config.md) |
| Deprecate `omnigraph.yaml` — one concern per config surface; key-by-key migration map and staged retirement | [rfc-008-deprecate-omnigraph-yaml.md](rfc-008-deprecate-omnigraph-yaml.md) |
| Unify CLI embedded/remote access paths — parity referee, shared wire-DTO crate, `GraphClient` trait, declared plane capabilities | [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md) |
| Restructure the CLI around explicit planes — one graph-addressing model, declared capability surface, plane-grouped help (expands RFC-009 Phase 4) | [rfc-010-cli-planes-restructure.md](rfc-010-cli-planes-restructure.md) |
| CLI refactoring — one addressing & config model post-`omnigraph.yaml`: scope + `--graph` + derived access path, served-default / privileged-direct, profiles, named queries, capability classifier (completes RFC-008) | [rfc-011-cli-refactoring.md](rfc-011-cli-refactoring.md) |
| Provider-independent embedding configuration — one resolved `EmbeddingConfig` + sealed provider enum (Gemini/OpenAI/Mock), identity recorded in the schema IR, query-time same-space validation, NFR floor | [rfc-012-embedding-provider-config.md](rfc-012-embedding-provider-config.md) |
| Write-path latency — capture-once `WriteTxn`, version-pinned opens, one `GraphPublishAuthority` fed declarative `PublishPlan`s, manifest-authoritative lineage, epoch fence, bounded history (compaction + cleanup), and an IO-counted cost contract (`iss-write-s3-roundtrip-amplification`, `iss-991`) | [rfc-013-write-path-latency.md](rfc-013-write-path-latency.md) |
## Boundary
Developer docs may mention implementation details, stale gaps, upstream Lance
blockers, and review rules. User docs should not require that context unless
the detail changes the public contract.