omnigraph/docs/dev/index.md
Andrew Altshuler 9c792649e2
docs(user): coherence cleanup aligned with 0.7.1 (#293)
* docs(cli): fix cluster apply semantics — converges graphs+schema, not config-only

`cluster apply` creates graphs, applies schema updates (soft drops), writes
stored-query/policy catalog resources, and executes approved graph deletes in
one ordered run. Both the user docs and the shipped CLI help text still
described it as a "Stage 3A" config-only (query/policy) subset that defers
graph/schema changes "to a later stage" — wrong since the graph/schema executor
landed.

- docs/user/cli/reference.md: rewrite the cluster paragraph to describe apply's
  actual converge behavior; keep deferred for the genuinely-unsupported case
  (standalone schema deletes); drop the stale "Stage 3A" / "reserved for later
  stages" framing.
- crates/omnigraph-cli/src/cli.rs: fix the `cluster apply` help text to match.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs(server): align stored-query exposure with cluster-only behavior

server.md documented a per-query expose knob ("`mcp.expose` defaults to true;
set `mcp: { expose: false }` to hide from the catalog") that does not exist in
the only deployment mode. Cluster-only serving lists every stored query: the
cluster registry has no expose field (`QueryConfig { file }`) and the boot
bridge hardcodes `expose: true` for all cluster queries
(omnigraph-server settings), and there is no GQ-level expose annotation. This
contradicted clusters/config.md, which already states the correct behavior.

Replace the knob bullet with the cluster truth (every applied query is listed;
per-query exposure may become a Cedar-policy decision later) and drop the
"`mcp.expose` stored queries" phrasing from the catalog description, the
endpoint table, and the intro. The `mcp_expose` JSON catalog field is unchanged
(still emitted, always true in cluster mode).

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P1).

* docs(schema): split direct/embedded vs cluster-managed schema apply

schema/index.md claimed `allow_data_loss` is "honored uniformly across
transports" and listed HTTP `POST /schema/apply` among them. But that route is
409-disabled for cluster-backed serving (already documented in server.md), and
cluster-managed graphs evolve only through `cluster apply` with soft drops —
there is no cluster HTTP data-loss path.

Scope the data-loss flag to the direct/embedded path (`schema apply --store`,
SDK), and add a paragraph: cluster-managed graphs use `cluster apply`
(soft drops only); HTTP `POST /schema/apply` is 409 for cluster serving; direct
apply against a cluster-managed path is refused. Cross-refs server + cluster
docs.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2).

* docs(server): document /load as canonical in limits + admission prose

The endpoint table already listed both `/load` (canonical) and `/ingest`
(deprecated alias) at 32 MB, but the admission-control, body-limit,
rate-limit, and manifest-conflict prose named only `/ingest` — and the
constants page called the limit "Ingest body limit". Add `/load` alongside (or
ahead of) `/ingest` everywhere, and rename the constant to "Load (bulk-write)
body limit" noting the `/ingest` alias shares it.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2).

* docs(cli): drop stale bearer-token keys + fix version string

The "Bearer token resolution (CLI)" section still listed removed omnigraph.yaml
keys (`graphs.<name>.bearer_token_env`, `auth.env_file`) — config surfaces that
no longer exist and that implied plaintext tokens in config. Replace it with a
pointer to the keyed-credential model documented above
(`OMNIGRAPH_TOKEN_<NAME>` → `~/.omnigraph/credentials` →
`OMNIGRAPH_BEARER_TOKEN`). Also fix the `version` row: the CLI prints 0.7.x, not
0.3.x.
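
As a rough illustration of the lookup order named above, the sketch below tries each credential source in turn. The function name `resolve_bearer_token`, the upper-casing and `-` to `_` normalization of the name, and the in-memory maps standing in for the process environment and for the parsed `~/.omnigraph/credentials` file (assumed here to be keyed by name) are all hypothetical and do not reflect the CLI's actual implementation.

```rust
use std::collections::HashMap;

/// Hypothetical helper: resolve a bearer token for `name`, trying sources in
/// the documented order. `env` stands in for the process environment and
/// `credentials` for the parsed contents of `~/.omnigraph/credentials`
/// (assumed to be keyed by name for this sketch).
fn resolve_bearer_token(
    name: &str,
    env: &HashMap<String, String>,
    credentials: &HashMap<String, String>,
) -> Option<String> {
    // 1. Keyed environment variable: OMNIGRAPH_TOKEN_<NAME>.
    //    The name normalization below is an assumption.
    let keyed = format!("OMNIGRAPH_TOKEN_{}", name.to_uppercase().replace('-', "_"));
    if let Some(token) = env.get(&keyed) {
        return Some(token.clone());
    }
    // 2. Matching entry in the credentials file.
    if let Some(token) = credentials.get(name) {
        return Some(token.clone());
    }
    // 3. Global fallback variable.
    env.get("OMNIGRAPH_BEARER_TOKEN").cloned()
}
```

Under these assumptions, a keyed variable wins over a credentials-file entry, which in turn wins over the global `OMNIGRAPH_BEARER_TOKEN`; when no source is set, the function returns `None`.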

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2 + smaller).

* docs: route-spelling note + drop stale stage/deferred crumbs

- server.md: add a one-line note that the per-graph subsections name routes in
  shorthand (`GET /queries`, `POST /query`, `POST /mutate`,
  `POST /queries/{name}`) but every one is served under `/graphs/{id}/…` — the
  endpoint table is already fully-qualified.
- clusters/config.md: redefine the `deferred` plan disposition as an unsupported
  change (e.g. a standalone schema delete) instead of "graph/schema change,
  later phase" (graph creates and schema updates apply now); drop the "Stage 2C"
  label from the lock-recovery note.
- search/indexes.md: `ingest --mode merge` → canonical `load --mode merge`.
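
To make the route-shorthand note in the first bullet concrete, here is a minimal sketch that expands a shorthand such as `POST /query` into its per-graph form. The helper name `qualify_route` and the exact expansion rule (prefixing the path with `/graphs/{id}`) are assumptions inferred from the note, not code taken from the server.

```rust
/// Hypothetical helper: expand a per-graph shorthand route such as
/// `POST /query` into the fully-qualified form served under `/graphs/{id}/…`.
fn qualify_route(graph_id: &str, shorthand: &str) -> Option<String> {
    // Split "METHOD /path" into its two parts; bail out if there is no space.
    let (method, path) = shorthand.split_once(' ')?;
    // Only absolute shorthand paths are accepted in this sketch.
    if !path.starts_with('/') {
        return None;
    }
    Some(format!("{} /graphs/{}{}", method, graph_id, path))
}
```

For a graph id of `g1`, `qualify_route("g1", "POST /queries/{name}")` yields `POST /graphs/g1/queries/{name}`, which matches the fully-qualified spelling the endpoint table is described as using.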

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2 + smaller).

* docs(dev): track user-docs coherence ledger; mark 2026-06-20 findings resolved

Convert the scratch review notes into a tracked living ledger and link it from
the dev index. All ten findings from the 2026-06-20 docs/user sweep are
validated and fixed in this branch (P1 cluster-apply semantics + stored-query
exposure; P2 schema-apply paths, /load canonical, bearer-token keys, route
shorthand; plus version/ingest/deferred/stage crumbs). The verification grep
checklist is retained for future audits.

* docs(api): align GET /queries OpenAPI contract with cluster-only behavior

Greptile P1 on #293: the prose fix in server.md left the OpenAPI surface stale.
The utoipa annotations (handlers.rs, omnigraph-api-types QueriesCatalogOutput)
still described the catalog as "the `mcp.expose == true` subset", and those
drive the checked-in openapi.json — so SDK consumers read a contract the
cluster-only server does not honor (it lists every stored query).

Update the three Rust doc-comment/annotation strings to "every stored query"
and regenerate openapi.json (OMNIGRAPH_UPDATE_OPENAPI=1; drift test green) in
the same change, per AGENTS.md rule 4. Ledger updated: this finding resolved,
plus the cross-repo drift it surfaced (omnigraph-ts generated spec/types and
omnigraph-cookbooks best-practices bearer_token_env) tracked as open follow-ups.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 00:02:34 +03:00

Developer Docs

Audience: contributors, maintainers, and coding agents

This is the contributor-facing entry point. These docs explain architecture, invariants, implementation contracts, test ownership, and upstream Lance constraints. User-facing behavior should still be documented through docs/user/index.md and the relevant public reference docs.

Required For Every Non-Trivial Change

| Need | Read |
|---|---|
| Architectural rules, known gaps, deny-list | invariants.md |
| Upstream Lance source-of-truth index | lance.md |
| Existing test coverage and test placement | testing.md |

Architecture And Storage

| Area | Read |
|---|---|
| System structure, L1/L2 framing, component diagrams | architecture.md |
| On-disk layout, manifest schema, URI behavior | storage.md |
| Direct-publish writes, D2, staged writes, recovery sidecars | writes.md |
| Query execution, mutation execution, loader flow | execution.md |
| Index lifecycle and graph topology indexes | indexes.md |
| Branch and commit internals | branches-commits.md |
| Three-way merge implementation and conflicts | merge.md |
| Diff/change-feed implementation | changes.md |
| Branch protection policy | branch-protection.md |

Language, Runtime, And Boundaries

| Area | Read |
|---|---|
| Schema grammar, catalog, migration planner | schema-language.md |
| Query grammar, IR, lints, mutation restrictions | query-language.md |
| Embedding client and `@embed` integration | embeddings.md |
| Cedar policy surface and server gating | policy.md |
| Server auth, OpenAPI, endpoint handlers | server.md |
| Error taxonomy and serialization | errors.md |
| Constants and tunables | constants.md |
| Transaction model public contract | transactions.md |
| User-doc coherence cleanup ledger | docs-issues.md |

Project Operations

| Area | Read |
|---|---|
| CI and release workflows | ci.md |
| Install and deployment packaging | install.md, deployment.md |
| Release history | releases/ |

Contribution & Governance

| Area | Read |
|---|---|
| How to contribute (external) | CONTRIBUTING.md |
| Governance model, roles, decision authority | GOVERNANCE.md |
| Public contribution RFC track | rfcs/ |

The docs/rfcs/ track is the public, externally-authorable RFC process. The maintainer/internal RFCs listed under Active Implementation Plans below (rfc-NNN-*.md) are a separate, team-owned track; don't conflate the two.

Case Studies

Worked write-ups of specific bugs — root cause, fix, and the reasoning that ruled out the tempting-but-wrong alternatives. Read these for the debugging pattern, not just the outcome.

| Area | Read |
|---|---|
| camelCase property filters lowercased at runtime (#283): two engine→Lance boundaries, two different fixes | bug-case-fix.md |

Active Implementation Plans

Working documents for in-flight feature work. Removed when the work lands.

| Area | Read |
|---|---|
| Schema-lint chassis v1 (MR-694): `--allow-data-loss`, soft/hard drops | schema-lint-v1-plan.md |
| Inline + stored queries, request/response envelope, MCP (MR-656 / MR-976 / MR-969) | rfc-001-queries-envelope-mcp.md |
| Config & CLI architecture: layered config, client targeting, file naming (MR-973 / MR-974 / MR-981) | rfc-002-config-cli-architecture.md |
| MCP server surface: full tool parity, stored queries, modular auth (MR-969 / MR-956 / MR-974) | rfc-003-mcp-server-surface.md |
| Future cluster control plane: declarative as-code config, JSON state ledger, reconciler | cluster-config-specs.md, cluster-axioms.md, cluster-config-implementation-spec.md |
| Cluster graph & schema apply: Phase 4 sidecars, roll-forward recovery, approval artifacts | rfc-004-cluster-graph-schema-apply.md |
| Server boots from cluster state: Phase 5 mode switch, applied-revision serving | rfc-005-server-cluster-boot.md |
| Per-operator config: `~/.omnigraph/` identity, keyed credentials, named servers (the operator slice of RFC-002) | rfc-007-operator-config.md |
| Deprecate `omnigraph.yaml`: one concern per config surface; key-by-key migration map and staged retirement | rfc-008-deprecate-omnigraph-yaml.md |
| Unify CLI embedded/remote access paths: parity referee, shared wire-DTO crate, `GraphClient` trait, declared plane capabilities | rfc-009-unify-access-paths.md |
| Restructure the CLI around explicit planes: one graph-addressing model, declared capability surface, plane-grouped help (expands RFC-009 Phase 4) | rfc-010-cli-planes-restructure.md |
| CLI refactoring: one addressing & config model post-`omnigraph.yaml`, covering scope + `--graph` + derived access path, served-default / privileged-direct, profiles, named queries, capability classifier (completes RFC-008) | rfc-011-cli-refactoring.md |
| Provider-independent embedding configuration: one resolved `EmbeddingConfig` + sealed provider enum (Gemini/OpenAI/Mock), identity recorded in the schema IR, query-time same-space validation, NFR floor | rfc-012-embedding-provider-config.md |
| Write-path latency: capture-once `WriteTxn`, version-pinned opens, one `GraphPublishAuthority` fed declarative `PublishPlan`s, manifest-authoritative lineage, epoch fence, bounded history (compaction + cleanup), and an IO-counted cost contract (iss-write-s3-roundtrip-amplification, iss-991) | rfc-013-write-path-latency.md |

Boundary

Developer docs may mention implementation details, stale gaps, upstream Lance blockers, and review rules. User docs should not require that context unless the detail changes the public contract.