mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-24 02:38:06 +02:00

Andrew Altshuler 9c792649e2

docs(user): coherence cleanup aligned with 0.7.1 (#293 )

* docs(cli): fix cluster apply semantics — converges graphs+schema, not config-only

`cluster apply` creates graphs, applies schema updates (soft drops), writes
stored-query/policy catalog resources, and executes approved graph deletes in
one ordered run. Both the user docs and the shipped CLI help text still
described it as a "Stage 3A" config-only (query/policy) subset that defers
graph/schema changes "to a later stage" — wrong since the graph/schema executor
landed.

- docs/user/cli/reference.md: rewrite the cluster paragraph to describe apply's
  actual converge behavior; keep deferred for the genuinely-unsupported case
  (standalone schema deletes); drop the stale "Stage 3A" / "reserved for later
  stages" framing.
- crates/omnigraph-cli/src/cli.rs: fix the `cluster apply` help text to match.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs(server): align stored-query exposure with cluster-only behavior

server.md documented a per-query expose knob ("`mcp.expose` defaults to true;
set `mcp: { expose: false }` to hide from the catalog") that does not exist in
the only deployment mode. Cluster-only serving lists every stored query: the
cluster registry has no expose field (`QueryConfig { file }`) and the boot
bridge hardcodes `expose: true` for all cluster queries
(omnigraph-server settings), and there is no GQ-level expose annotation. This
contradicted clusters/config.md, which already states the correct behavior.

Replace the knob bullet with the cluster truth (every applied query is listed;
per-query exposure may become a Cedar-policy decision later) and drop the
"`mcp.expose` stored queries" phrasing from the catalog description, the
endpoint table, and the intro. The `mcp_expose` JSON catalog field is unchanged
(still emitted, always true in cluster mode).

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs(schema): split direct/embedded vs cluster-managed schema apply

schema/index.md claimed `allow_data_loss` is "honored uniformly across
transports" and listed HTTP `POST /schema/apply` among them. But that route is
409-disabled for cluster-backed serving (already documented in server.md), and
cluster-managed graphs evolve only through `cluster apply` with soft drops —
there is no cluster HTTP data-loss path.

Scope the data-loss flag to the direct/embedded path (`schema apply --store`,
SDK), and add a paragraph: cluster-managed graphs use `cluster apply`
(soft drops only); HTTP `POST /schema/apply` is 409 for cluster serving; direct
apply against a cluster-managed path is refused. Cross-refs server + cluster
docs.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs(server): document /load as canonical in limits + admission prose

The endpoint table already listed both `/load` (canonical) and `/ingest`
(deprecated alias) at 32 MB, but the admission-control, body-limit,
rate-limit, and manifest-conflict prose named only `/ingest` — and the
constants page called the limit "Ingest body limit". Add `/load` alongside (or
ahead of) `/ingest` everywhere, and rename the constant to "Load (bulk-write)
body limit" noting the `/ingest` alias shares it.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs(cli): drop stale bearer-token keys + fix version string

The "Bearer token resolution (CLI)" section still listed removed omnigraph.yaml
keys (`graphs.<name>.bearer_token_env`, `auth.env_file`) — config surfaces that
no longer exist and that implied plaintext tokens in config. Replace it with a
pointer to the keyed-credential model documented above
(`OMNIGRAPH_TOKEN_<NAME>` → `~/.omnigraph/credentials` →
`OMNIGRAPH_BEARER_TOKEN`). Also fix the `version` row: the CLI prints 0.7.x, not
0.3.x.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2 + smaller).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs: route-spelling note + drop stale stage/deferred crumbs

- server.md: add a one-line note that the per-graph subsections name routes in
  shorthand (`GET /queries`, `POST /query`, `POST /mutate`,
  `POST /queries/{name}`) but every one is served under `/graphs/{id}/…` — the
  endpoint table is already fully-qualified.
- clusters/config.md: redefine the `deferred` plan disposition as an unsupported
  change (e.g. a standalone schema delete) instead of "graph/schema change,
  later phase" (graph creates and schema updates apply now); drop the "Stage 2C"
  label from the lock-recovery note.
- search/indexes.md: `ingest --mode merge` → canonical `load --mode merge`.

Part of the docs/user coherence cleanup (docs/dev/docs-issues.md, P2 + smaller).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs(dev): track user-docs coherence ledger; mark 2026-06-20 findings resolved

Convert the scratch review notes into a tracked living ledger and link it from
the dev index. All ten findings from the 2026-06-20 docs/user sweep are
validated and fixed in this branch (P1 cluster-apply semantics + stored-query
exposure; P2 schema-apply paths, /load canonical, bearer-token keys, route
shorthand; plus version/ingest/deferred/stage crumbs). The verification grep
checklist is retained for future audits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

* docs(api): align GET /queries OpenAPI contract with cluster-only behavior

Greptile P1 on #293: the prose fix in server.md left the OpenAPI surface stale.
The utoipa annotations (handlers.rs, omnigraph-api-types QueriesCatalogOutput)
still described the catalog as "the `mcp.expose == true` subset", and those
drive the checked-in openapi.json — so SDK consumers read a contract the
cluster-only server does not honor (it lists every stored query).

Update the three Rust doc-comment/annotation strings to "every stored query"
and regenerate openapi.json (OMNIGRAPH_UPDATE_OPENAPI=1; drift test green) in
the same change, per AGENTS.md rule 4. Ledger updated: this finding resolved,
plus the cross-repo drift it surfaced (omnigraph-ts generated spec/types and
omnigraph-cookbooks best-practices bearer_token_env) tracked as open follow-ups.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-21 00:02:34 +03:00

4.7 KiB

Raw Blame History

Schema Language (`.pg`)

Top-level declarations

interface <Name> { property* } — reusable property contracts.
node <Name> [implements <Iface>, ...] { property* | constraint* }
edge <Name>: <FromType> -> <ToType> [@card(min..max)] { property* | constraint* }
Comments: line // and block /* … */.

Property declarations

<ident>: <TypeRef> [annotation*]

Built-in scalar types

Scalar	Arrow type
`String`	Utf8
`Blob`	LargeBinary
`Bool`	Boolean
`I32` / `I64`	Int32 / Int64
`U32` / `U64`	UInt32 / UInt64
`F32` / `F64`	Float32 / Float64
`Date`	Date32
`DateTime`	Date64
`Vector(<dim>)`	FixedSizeList(Float32, dim), `1 ≤ dim ≤ i32::MAX`
`[<scalar>]`	List(scalar)
`enum(v1, v2, …)`	Utf8 with sorted/dedup'd set of allowed string values
`<scalar>?`	Same as scalar but `nullable: true`

Constraints (body level)

Constraint	On	Effect
`@key(p, …)`	node	Primary key; implies index on key columns; `key_property()` returns the first key
`@unique(p, …)`	node, edge	Uniqueness across listed columns
`@index(p, …)`	node, edge	Build a scalar (BTREE) index on the columns
`@range(p, min..max)`	node	Numeric range validation (open ranges allowed)
`@check(p, "regex")`	node	Regex pattern validation
`@card(min..max?)`	edge	Edge multiplicity — default `0..`; `0..1`, `1..1`, `1..`, etc.

Edge bodies only allow @unique and @index.

Annotations

@<ident> or @<ident>(<literal>) on any declaration or property.
Known annotations:
- @embed("source_property") on a Vector property — records which String property is the embedding source for query-time nearest($v, "string") auto-embedding. It is a catalog annotation; it does not populate the vector at ingest (supply vectors in load data, or pre-fill via the offline omnigraph embed pipeline). An optional model="…" kwarg (@embed("source_property", model="openai/text-embedding-3-large")) records the embedding model so a nearest() query whose embedder uses a different model is rejected loudly; model is the only supported kwarg. See search/embeddings.md.
- @description("…"), @instruction("…") on query declarations (carried through to clients).
Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.

Table layout

Each node type compiles to a table with an id: Utf8 column plus all declared properties (blob columns are stored as LargeBinary); implements clauses expand the interface's properties into the node.
Each edge type compiles to a table with id: Utf8, src: Utf8, dst: Utf8 plus the edge's own properties. Edge endpoint types (from/to) must exist, and edge names are matched case-insensitively.

Schema migration planning

A migration plan compares the accepted schema against the desired one and reports whether the change is supported plus the ordered steps it requires:

Add a type
Rename a type
Add a property
Rename a property
Add a constraint
Update type or property metadata (annotations)
Unsupported change (reports the entity and reason; forces the plan to unsupported)

Applying a plan reports whether it was supported, the steps applied, and the resulting manifest version. Concurrent schema applies serialize so they can't interleave.

Destructive drops — `--allow-data-loss`

DropProperty and DropType steps default to Soft mode: the catalog tombstones the entry but the prior column / dataset remains time-travel-reachable via snapshot_at_version(prev) until omnigraph cleanup runs. Soft drops are reversible.

Pass --allow-data-loss (CLI schema apply) or allow_data_loss: true (SDK SchemaApplyOptions) to promote every drop in the plan to Hard mode. Hard drops run cleanup_old_versions on the affected dataset immediately after the manifest publish, making the prior column / dataset unreachable. Irreversible.

This is the direct/embedded schema-apply path — omnigraph schema apply --store … and the embedded SDK apply_schema_with_options(.., SchemaApplyOptions { allow_data_loss: true }) produce identical plans and identical effects.

Cluster-managed graphs are different. A graph served from a cluster evolves only through omnigraph cluster apply, which performs soft drops only (no allow_data_loss path), and the HTTP POST /schema/apply route is disabled (returns 409) for cluster-backed serving — see server and cluster-config. Direct schema apply against a cluster-managed storage path is likewise refused.

4.7 KiB Raw Blame History

Schema Language (.pg)