Stored-query registry foundation + config/CLI RFC-002 (#128)

* MR-969: add stored-query registry config surface

Introduce the `queries:` block in omnigraph.yaml — an inline
`name -> entry` map of stored queries, per-graph
(`graphs.<id>.queries`) and top-level for single-graph mode, mirroring
how `policy` is wired in both modes. Each entry points at a `.gq` file
and carries optional MCP exposure settings (`expose`, `tool_name`),
defaulting to not-exposed.

Additive: absent `queries:` leaves current behavior unchanged.

- QueryEntry { file, mcp: McpSettings { expose, tool_name } }
- `queries` field on TargetConfig + OmnigraphConfig (serde default)
- query_entries() / target_query_entries() accessors
- resolve_query_file() — base_dir-relative `.gq` path resolution
- round-trip + absent-block tests

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add stored-query registry loader and GraphHandle wiring

Add a `queries` module: QueryRegistry loads each declared `.gq` entry,
parses it, and selects the query whose symbol matches the manifest key,
asserting the two agree (key == `query <name>` symbol). Identity is the
query name; a key/symbol mismatch is a load-time error. Errors are
collected, not fail-fast, so a bad registry surfaces every broken entry
at once. Schema type-checking is deliberately left to a separate pass so
the loader stays callable without an open engine.

Thread an `Option<Arc<QueryRegistry>>` through GraphHandle alongside the
per-graph policy; the URI-canonicalizing clone propagates it. Production
openers default to None for now — the boot path loads and attaches the
registry in a later change.

- QueryRegistry::{from_specs, load, lookup, iter}; StoredQuery::is_mutation
- GraphHandle.queries field, propagated on canonical clone
- registry unit tests: identity match/mismatch, multi-query selection,
  per-entry parse errors, error collection, mutation classification

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: add RFC-002 config & CLI architecture

Layered config (user-global ~/.config/omnigraph/ + per-project), a
unifying `target` abstraction resolving to (locus, graph, sub-state,
credential) with embedded-URI XOR remote-server loci, multi-server ×
multi-graph client targeting, credentials by-reference, and the
file-naming decision: project and server config are one artifact
(`omnigraph.yaml`); the only differently-named file is the user-global
`config.yaml`, split by scope not role. Includes the 12-factor bind
portability rule (prefer --bind/OMNIGRAPH_BIND over a committed
server.bind) and the defined-locally / invoked-remotely model for
stored queries. Derived from first principles working backwards from
what the engine enables; validated against kube/Helix/git/compose.

Linked from docs/dev/index.md. Proposed; phased rollout for the
MR-973/974/981 family.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add check() to validate stored queries against the live schema

A pure check(registry, catalog) that type-checks every stored query via
the same typecheck_query_decl the engine runs for inline queries — no
parallel implementation. Failures are collected, not fail-fast, so an
operator sees every broken query (e.g. a type/property a migration
renamed or removed) in one pass. Breakages are fatal (the boot path will
refuse to start); warnings are advisory.

Pure over (registry, catalog) so it is callable both at boot (engine
catalog) and offline from the CLI without an open engine.

Advisory lint: an mcp.expose:true query that declares a Vector(N)
parameter warns — an LLM cannot supply a raw embedding vector; such a
query should take a String parameter and embed server-side. Warns
rather than rejects, since service-to-service callers may pass vectors.

- CheckReport { breakages, warnings }; has_breakages / is_clean
- tests: valid query, unknown type, unknown property, collect-not-fail-fast,
  vector-param-exposed warns, unexposed silent

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Drop internal plan-label refs from stored-query config comments

Doc comments referenced sequencing labels ("C2") that mean nothing to a
reader; reword to describe the behavior directly. Comment-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: reconcile aliases with the role model in RFC-002

Place the existing client-only `aliases:` block in the client/server
role split: aliases are client-role (CLI, embedded, ungated) and may
live in both user-global and project config; `queries:` is server-role
(deployment manifest only). They overlap as "name -> .gq"; `queries:` is
the superset, and the end-state subsumes aliases (definition -> queries,
target/branch/format -> client invocation context, positional args ->
CLI sugar). v1 keeps aliases unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: make RFC-002 config global-first, project-optional

The global user config is the primary, self-sufficient default; the
CLI works from any directory with no project file (the kubectl/aws/gh
posture), a deliberate flip from today's project-anchored behavior.
The project omnigraph.yaml becomes an optional repo-scoped override and
the deployment manifest. Uniform schema, both layers optional; global
can hold any section including a personal server's graphs/queries.
Additive: project still overrides global; the flip adds a fallback
layer below the project file rather than removing it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: justify XDG ~/.config/omnigraph over legacy ~/.omnigraph in RFC-002

Make the rationale explicit: XDG-first because OmniGraph is a client
that will cache remote catalogs and keep session state alongside
secrets, and XDG separates config / cache / state into distinct dirs
(clear cache without touching creds; backups skip cache) whereas a
single ~/.omnigraph/ mixes them. Honor ~/.omnigraph/ as a fallback for
the peer-group (aws/kube/docker/helix) expectation. Add XDG_CACHE_HOME
/ XDG_STATE_HOME to the override precedence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: build RFC-002 credentials on the existing env-file mechanism

OmniGraph already has credentials-by-reference: bearer_token_env names
the env var, and auth.env_file is a git-ignored dotenv the CLI
auto-loads (real env vars win), resolved via resolve_remote_bearer_token.
The RFC's proposed credentials.yaml + token_env were redundant parallel
inventions. Reconcile: reuse bearer_token_env (extend to
servers.<name>) and auth.env_file (add a global ~/.config/omnigraph/.env
layered under the project .env.omni); OS keychain is an additive future
resolver. No new credentials.yaml. Updated summary, non-goals,
background, file-naming, credentials, example, login, migration, rollout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: use single ~/.omnigraph dir (Helix-style), not XDG, in RFC-002

Reverse the earlier XDG-first call. The prior argument rested on a false
dichotomy (single-dir => mixed config/cache/state); in fact the peer
tools (aws, kube, helix) achieve separation via SUBDIRECTORIES inside
one ~/.tool/ dir (~/.aws/sso/cache/, ~/.kube/cache/), getting cache
hygiene AND one discoverable place. So everything goes under
~/.omnigraph/: config.yaml, credentials (dotenv, 0600), cache/, state/.
Lower cognitive load, matches what DB/cloud-CLI users expect, matches
Helix. OMNIGRAPH_HOME overrides; $XDG_CONFIG_HOME optionally honored but
~/.omnigraph/ is canonical. Updated all paths, the rationale paragraph,
the file-naming table (added a cache/state row), and env precedence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: reconcile RFC-002 with shipped/planned CLI tickets

Align with reality found in existing tickets:
- Noun is graph/graphs, not target/targets (MR-603 done renamed the
  config key targets->graphs, flag --graph). Use graphs:/--graph; an
  entry is embedded (uri) XOR remote (server + remote graph name).
- ~/.omnigraph/ confirmed by MR-581 (og template pull, done) which
  already quick-starts templates there.
- Templates already exist (MR-581/MR-531) — not invented here.
- The init family is already specced (init, quickstart MR-973, serve
  MR-970, prune MR-972, mcp install MR-974, agent-mode MR-981); this
  RFC only adds the user route (~/.omnigraph/config.yaml + login).
- aliases: -> operations: planned (MR-839).
- bearer_token_env gap tracked in MR-971.
- query lint/check already exist (MR-639) — registry validator must not
  collide with the singular `query check`.
Add a Reconciliation section; fix the canonical example to graphs:/--graph.
Also: merge semantics refined (deep-merge settings, replace named
entries, replace lists, config view --resolved --show-origin).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: correct stale-ticket claims and fold init/bootstrap design into RFC-002

Verify against code, not ticket statuses (MR-581 is marked done but is
stale/unbuilt): no ~/.omnigraph usage, no template/serve/quickstart/
prune/login commands exist; config still uses aliases: (no operations:).
So ~/.omnigraph/ stands on peer-convention merits alone, and templates
are a design question, not a foothold. Add §7.5: the three-tier init
model (user route = login + ~/.omnigraph/config.yaml; thin project init;
fat quickstart + templates) with first-principles positions (split
init/login, in-place refuse-if-exists, interactive vs --auto/agent-mode,
--template flag, secrets-on-scaffold gitignore rule). This RFC owns only
the user route; the rest are sibling tickets (MR-973/970/972/974/981).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: breadboard + slice Shape A in RFC-002

Add the implementation breadboard (places P1-P5, affordances N1-N14 with
NEW markers, mermaid) and five vertical slices for the selected config/
CLI/init shape: V1 global layer + merge engine + config view; V2 remote
graphs + HTTP-client path + credential resolution; V3 omnigraph login;
V4 init-hardening + quickstart + templates (rides MR-970); V5 agent-mode
(MR-981). Rollout reordered to the slice sequence; spikes X1-X4 gate
their owning slice. V1-V2 close the substantive client->server gap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add InvokeQuery Cedar action (coarse, graph-scoped)

A per-graph, branch-scoped action that gates invoking a server-side
stored query by name. Coarse for now: an `invoke_query` allow rule
permits any stored query on the graph; a future, additive refinement
adds an optional per-query-name scope without changing rules written
against the coarse action. Enforcement is at the HTTP boundary; the
engine `_as` writers still enforce read/change per the query body, so a
stored mutation is double-gated (invoke_query to reach the tool, change
for the write). No call site yet — the invocation handler wires it in a
later change (same pattern as Admin/GraphList added ahead of consumers).

- variant + as_str/resource_kind(Graph)/FromStr/uses_branch_scope
- Cedar schema: invoke_query appliesTo Graph
- tests: per-graph allow/deny, branch-scope accepted

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Load and type-check stored queries at server boot, refusing breakage

At startup the server now loads each graph's stored-query registry,
type-checks every query against that graph's live schema, and refuses to
boot if any query references a type/property the schema doesn't have
(same posture as bad policy YAML) — so schema drift surfaces at the
deploy boundary, not silently at invocation. Non-blocking warnings are
logged. The validated registry is attached to the GraphHandle (the two
production sites previously held `queries: None`).

Loading (parse + key==symbol identity) happens at settings-build time
where the config is in scope; the schema type-check happens after each
engine opens (single mode in `open_single_with_queries`, multi mode in
`open_single_graph`). `open_with_bearer_tokens_and_policy` delegates
with an empty registry so its 18 test callers are unchanged; the public
`new_*` constructors are unchanged (only the private build path threads
the registry).

- ServerConfigMode::Single / GraphStartupConfig carry the loaded registry
- boot tests: valid registry boots; type-broken query refuses boot + names it

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add `omnigraph queries validate` and `queries list` CLI

`queries validate` type-checks the stored-query registry against the
live schema offline — it opens the selected graph, runs the same
check() the server runs at boot, prints breakages/warnings (human or
--json), and exits non-zero on any breakage — so an operator can catch
a query broken by a schema change without restarting the server.
`queries list` prints each registered query's name, MCP exposure, and
typed params.

Named `validate` (not `check`) to avoid overlap with the existing
`omnigraph lint` — `query check`/`query lint` are already deprecated
argv-shims to `lint`. Registry entries resolve like the server: a named
graph uses its per-graph `queries:`; otherwise the top-level one.

- Queries subcommand group; reuses QueryRegistry::load + check from
  omnigraph-server; local-only (needs the schema), mirrors lint
- tests: clean registry exits 0, broken query exits non-zero + names it,
  list shows the query and its typed params

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Route registry selection through one shared query_entries_for

The "which queries: block applies for graph X" rule existed twice — the
server boot path and the CLI's registry_entries — and had already drifted:
the CLI carried an unreachable unwrap_or_else fallback the server lacked.

Add OmnigraphConfig::query_entries_for(graph: Option<&str>) as the single
definition (named graph -> its per-graph block; otherwise top-level) and
route all three sites through it: server single mode, server multi-graph
loop, and the CLI. The CLI's dead fallback arm is deleted; CLI and server
now resolve identically by construction.

No behavior change. Extends the config round-trip test to pin the selector,
including the unknown-name -> top-level fallback the deleted CLI arm covered.

* Funnel registry validation through one validate_and_attach gate

The check -> refuse-on-breakage -> log-warnings -> empty->None block was
copy-pasted across both open paths (single mode and the multi-graph
per-graph open), differing only by the graph label. A third opener could
attach a registry that was never schema-checked.

Extract validate_and_attach(queries, catalog, label) -> Option<Arc<..>> as
the single gate both paths call, so attaching an unchecked registry is no
longer expressible. The catalog handle is an owned Arc, so calling it
before the multi-mode policy match (which rebinds db) is borrow-clean.

No behavior change. Adds a direct unit test of the helper (empty / clean /
breakage incl. the graph label in the message) — covering the multi-graph
path's logic, which previously had no boot-refusal coverage.

* Resolve param types structurally in the MCP vector lint

The exposed-query advisory detected vector params with
type_name.starts_with("Vector(") — a second copy of the compiler's own
ScalarType::from_str_name vector parsing that could drift from it.

Key the lint off PropType::from_param_type_name + ScalarType::Vector(_)
instead, the one canonical resolver the type system already uses. Any
future param-suppliability lint now reads the structured type rather than
scanning the surface string.

Behavior-preserving: the grammar forbids list-of-vector params
(list_type = "[" base_type "]", and base_type excludes Vector), so the only
input where the structured and string checks could differ is unparseable.
Adds a guard test that an exposed String param does not false-trigger the
warning.

* Refuse duplicate MCP tool names across exposed stored queries

The effective MCP tool name (explicit tool_name, else the query name) is a
second identity namespace beside the registry key, but nothing enforced it
unique — two exposed queries could claim one catalog key, and each consumer
re-derived the name ad hoc.

Add StoredQuery::effective_tool_name() as the one definition, and a
load-time uniqueness pass in from_specs over exposed queries: a collision is
a collected LoadError naming the loser and the winner. Scoped to exposed
queries (unexposed have no MCP tool); deterministic over the BTreeMap so the
first-declared wins and the error order is stable.

New (rare) refusal: a config with colliding exposed tool names now fails
`omnigraph queries validate` offline and refuses server boot, the same
posture as a malformed registry. Release-note-worthy.

Test-first: duplicate_exposed_tool_name_is_a_load_error (red before the
pass, green after) + a CLI offline test; the unexposed sibling pins the
exposed-only scope; effective_tool_name asserts folded into the load test.

* docs: document the queries registry, CLI, and invoke_query action

The stored-query surface shipped without user docs. Add it, per the same-PR
maintenance contract:

- policy.md: invoke_query as per-graph action #10 (branch-scoped), with the
  double-gating note; renumber graph_list; add it to the branch_scope list.
- cli-reference.md: the `queries validate | list` command, and the
  `queries:` config block (per-graph + top-level) with mcp.expose/tool_name
  and the tool-name uniqueness rule.
- server.md: boot-time stored-query type-check (refuse on breakage), noting
  invocation over HTTP/MCP is not yet exposed.

* Add POST /queries/{name} stored-query invocation handler

Invoke a curated server-side stored query by name: source + name come from
the per-graph queries: registry, the client sends only runtime inputs
(params, branch, snapshot). Gated by the invoke_query Cedar action at the
boundary; the handler delegates to the existing run_query/run_mutate, whose
inner Read/Change enforce still runs — so a stored mutation is double-gated
(invoke_query to reach the tool, change for the write).

- InvokeStoredQueryRequest + an untagged InvokeStoredQueryResponse
  { Read(ReadOutput), Change(ChangeOutput) } → one Json<_> return type and a
  oneOf 200 schema (a correct contract, not a wrong-but-simple one).
- Route lives in per_graph_protected → single-mode /queries/{name} and
  multi-mode /graphs/{id}/queries/{name} for free.
- Deny == unknown: an invoke_query denial and a missing query both return the
  same 404, so the catalog can't be probed by an unauthorized caller.
- OpenAPI regenerated; tests cover read, mutation double-gate (403 vs 200),
  bad-param 400, and the identical-404 deny path.

Completes the MR-969 V1 invocation slice (registry + /queries/{name} + invoke_query).

* docs: stored-query invocation endpoint; flip the not-yet-exposed caveat

Now that POST /queries/{name} ships (C7), document it: add the endpoint to
server.md's inventory + an invocation section (body, untagged read/mutate
envelope, invoke_query gate, double-gated mutations, deny == 404), and flip
the startup note that said invocation was not yet exposed. In policy.md,
replace "no invocation call site yet" on the invoke_query action with a
pointer to the endpoint.

* Scope the stored-query 404-hiding claim to non-invoke_query callers

Review found the deny==404 catalog-hiding was overstated as a contract: it
holds only at the outer invoke_query gate. A caller that HOLDS invoke_query
but lacks read/change gets the inner gate's 403 for an existing query vs 404
for an unknown one — so existence is visible to grant-holders by design (the
intended double-gate). The handler docstring, OpenAPI 404 description, and
server.md all claimed the 404 was airtight against any denied actor.

Correct the wording in all three (no behavior change) and add the missing
symmetric test (invoke_query but no read -> 403 for an existing query, 404
for unknown) so the actual contract is pinned. Also document that in
default-deny mode (tokens, no policy) every invocation 404s until an
invoke_query rule is configured.

Nits: the from_specs collision comment said "first declared wins" but it is
lexicographically-first by name (BTreeMap); the effective_tool_name docstring
overclaimed the CLI display routes through it (it resolves the rule on its
own output DTO).

* Default mcp.expose to true (the manifest entry is the opt-in)

expose controls MCP-catalog membership only — it is not an authorization
gate (invocation is gated by invoke_query regardless). So requiring a
per-query mcp.expose: true was friction with no safety benefit: a
non-exposed query is still HTTP-invocable by name. Flip the default so
declaring a query in the manifest exposes it to the agent tool catalog by
default; expose: false is the escape hatch for service-only queries.

Both the absent-mcp path (Default impl) and the present-but-no-expose path
(serde default fn) now yield true. Doc comments + cli-reference updated; the
config round-trip test asserts the new default.

* Add GET /queries stored-query catalog endpoint

List a graph's mcp.expose stored queries as a typed tool catalog so a client
(the MCP server) can register them as tools without fetching .gq source.
Each entry carries name, MCP tool_name, description/instruction, a
read/mutate flag, and decomposed typed params (kind enum: string|bool|int|
bigint|float|date|datetime|blob|vector|list, plus item_kind for lists and
vector_dim) — so the consumer builds an input schema with a closed match and
never re-parses omnigraph type spelling. I64/U64 are bigint (string on the
wire): a JSON number loses precision past 2^53 and the engine already accepts
decimal strings.

Read-gated (works in default-deny; the catalog is graph-wide, authorized
against main). NOT Cedar-filtered per query yet — a reader can list a query
whose invoke_query they lack (documented gap until per-query authz lands);
invocation stays invoke_query-gated + deny==404.

- api: QueriesCatalogOutput / QueryCatalogEntry / ParamDescriptor / ParamKind
  + query_catalog_entry (reuses PropType::from_param_type_name; scalar_kind is
  exhaustive, so a new ScalarType is a compile error here until catalogued).
- GET /queries route in per_graph_protected (→ /graphs/{id}/queries in multi
  mode); OpenAPI regenerated; path allowlists updated.
- Tests: projection unit (every kind, list, vector, nullable, mutation,
  empty) + handler (exposed-only filter, read-gate probe-oracle, empty
  registry).

* docs: GET /queries stored-query catalog endpoint

Document the catalog: the endpoint table row (GET /queries, read-gated), a
catalog section (typed-param kind enum, bigint/date/datetime/blob-as-string,
graph-wide/branch-independent, mcp.expose default true, the read-gated
probe-oracle gap), and flip the startup note now that the catalog ships.

* Collect file-I/O and parse errors in QueryRegistry::load in one pass

load() early-returned on any unreadable .gq file, masking parse / identity /
tool-name-collision errors in the OTHER (readable) files — so an operator
fixed the missing file, restarted, and only then saw the next broken query.
Now it collects I/O errors but still runs from_specs on the readable specs
and returns the union, so every broken entry surfaces at once (matching the
collected-errors contract the rest of the registry already follows).

Safe: from_specs' tool-name collision check runs over loaded queries only, so
dropping an I/O-failed entry can only under-report a collision, never invent
one. I/O errors are ordered first (BTreeMap key order), then spec errors.

Adds a load-level test (tempdir: a valid, a missing, and a parse-broken .gq)
asserting all three surface in one Err — confirmed red before the fix.

* Make invoke_query graph-scoped (one branch authority)

invoke_query gates reaching the curated stored-query surface — a graph-level
capability. Per-branch/snapshot access is already enforced by the inner
read/change gate in run_query/run_mutate (authorized against the resolved
branch), so branch-scoping the outer gate was redundant AND wrong for snapshot
reads (it defaulted to main). Drop the branch dimension: remove InvokeQuery
from uses_branch_scope (it joins admin as graph-scoped) and authorize the
boundary gate with branch: None.

Lossless: an actor confined to branch X by their read/change rules can still
only invoke a stored query that touches X. A rule that sets branch_scope on
invoke_query is now rejected by validate() — write invoke_query in its own
rule.

Ripple (atomic): restructure the server invoke fixture so invoke_query sits in
its own branch_scope-free rule; invert invoke_query_is_branch_scoped ->
invoke_query_rejects_branch_scope; the per-graph authorize test uses
branch: None; docs (policy.md, server.md, the InvokeQuery doc). No wire/OpenAPI
change.

* Resolve graph config by identity, not server mode

Which policy/queries block applies for a graph was decided three different,
mode-dependent ways: single-mode boot used top-level even for a named graph;
multi-mode used per-graph (and silently ignored a top-level queries block); the
CLI used per-graph for a named target. So `queries validate --target prod`
could check a different registry than the single-mode server loaded, and a
named graph's per-graph policy/queries were silently shadowed.

Make config a function of graph IDENTITY: a graph served by NAME
(--target/server.graph, a graphs: entry) uses its own graphs.<name>.{policy,
queries}; a bare URI is anonymous and uses top-level. One rule, applied by
single-mode boot, multi-mode boot, and the CLI — so they can't diverge and the
CLI predicts the server exactly.

No silent ignore: serving a named graph while a top-level policy/queries block
is populated now refuses boot, naming the block (the multi-mode top-level-policy
bail, extended to queries and to single-mode-named). The CLI's `queries
validate` derives the schema URI and the registry from ONE selection, and a
positional URI forces anonymous (ignoring cli.graph) so the two can't come from
different graphs.

BREAKING (released behavior): single mode by name (--target/server.graph) with
top-level policy/queries previously used top-level; it now uses the per-graph
block and refuses boot if top-level is also populated. Bare-URI single mode is
unchanged. Loud, with migration text pointing at graphs.<name>.

- config: resolve_policy_file_for (policy sibling of query_entries_for, no
  top-level fallback) + populated_top_level_blocks for the coherence check.
- characterization tests (single-mode named -> per-graph; named + top-level ->
  bail; multi-mode top-level queries -> bail; CLI positional-URI -> top-level).
- docs: policy.md, server.md, cli-reference.md.

* docs: RFC-002 credentials keyed by server name (keychain/profile/env)

Reworks the RFC's credentials model: secrets are keyed by server name — OS
keychain `omnigraph:<server>` (preferred) -> a `[<server>]` profile in
`~/.omnigraph/credentials` -> `OMNIGRAPH_TOKEN[_<SERVER>]` env (CI), the
AWS/gh/kube model. `servers.<name>` is endpoint-only by default but may carry
an explicit, secret-free `auth: { token: { env|file|command|keychain } }`
source. The shipped `bearer_token_env` + `.env.omni` dotenv remain a legacy
compat path; no `credentials.yaml`.

* docs: RFC-002 — typed graph locator (storage/server/graph_id), not a uri string

Add §1.1: the resolved graph address is a typed GraphLocator
(Embedded{storage} | Remote{server, graph_id}), not a flat uri: String.
Diagnoses the string model's cost in the code today (~16 is_remote_uri forks,
TargetConfig can't express multi-server x multi-graph, the CLI bails on remote,
the ts SDK models baseUrl+graphId separately) and settles the YAML naming so
the key names the locus:

- storage: (embedded) — shipped uri: is a deprecated alias
- server: + graph_id: (remote) — graph_id defaults to the entry key
- storage xor server, reject both/neither (no silent ambiguity)

Kills the graphs:/graph: collision and the uri:-might-be-a-server ambiguity.
Updates the §1/§8 examples and the entry-shape notes to the new naming.

* Test: queries list must reject an unknown --target

queries list opens no graph URI, so unknown-graph validation does not ride
along on resolve_target_uri the way it does for every other command. The new
test reproduces the gap: with an unknown --target the command currently exits 0
and prints the (empty) top-level registry instead of erroring like the
URI-resolving commands do. Fails against current code; the fix follows.

* Validate the graph selection in queries list

Graph-existence validation was a side effect of URI resolution: every
URI-resolving command rejects an unknown --target via resolve_target_uri, but
queries list opens no URI, so query_entries_for(Some(unknown)) silently fell
back to the top-level registry and showed the wrong (or empty) catalog.

Make membership a property of the selection: add the fallible
resolve_graph_selection alongside the infallible query_entries_for (a known
name passes through, an unknown name errors with the same message as
resolve_target_uri, None stays anonymous), and validate the selection in
execute_queries_list. query_entries_for is unchanged — server boot's bare-URI
path still needs its None -> top-level arm.

* Surface policy-engine errors from stored-query invoke

The invoke handler mapped every authorize_request failure to 404 ('stored
query not found'), which collapsed the authorization decision (deny -> 403)
together with operational failures (no actor -> 401, Cedar evaluation error ->
500). A real policy-engine 500 was hidden as a missing query.

Separate the two concerns instead of sniffing the masked status. Extract
authorize() returning an Authz { Allowed, Denied(msg) } decision and reserve
Err for operational failures only; authorize_request becomes a thin wrapper
that maps Denied -> 403, so the 16 deny-as-403 callers are unchanged. The
invoke handler now matches the decision directly: a denial stays 404 (deny ==
missing, so the catalog can't be probed without the grant), while a 401/500
propagates with its true status.

500 is now a reachable outcome on POST /queries/{name}; document it in the
endpoint responses and regenerate openapi.json.

* Extract the named-graph/top-level coherence rule into one helper

The rule 'a named graph uses its own graphs.<name> block, so a populated
top-level block is a config error' lived inline in single-mode server boot.
Extract it to OmnigraphConfig::ensure_top_level_blocks_honored so the same
definition can be shared by the CLI selection gate (next commit) and the two
can't drift. Boot calls the helper; the message is reworded context-neutral
(drops 'serving') so it reads correctly from both boot and the CLI.

Behavior-preserving: multi-graph mode keeps its own unconditional check, and
single_mode_named_graph_rejects_top_level_blocks still passes.

* Test: queries validate/list must reject a named graph with a top-level block

Server boot refuses a config where a graph is selected by name yet a top-level
queries:/policy.file block is populated (the block would be silently ignored).
The CLI's queries validate/list resolve the same named selection but skip that
coherence check, so they give a false green / list the per-graph block. The new
test reproduces it: validate prints OK and list succeeds where boot would
refuse. Fails against current code; the fix follows.

* Enforce top-level coherence in the single CLI selection gate

queries validate validated graph membership only as a side effect of URI
resolution and queries list only via resolve_graph_selection's membership
check; neither applied the named-graph/top-level coherence rule server boot
enforces, so both gave a false green on a config boot refuses.

Fold ensure_top_level_blocks_honored into resolve_graph_selection so it is the
single gate that returns only valid + server-coherent selections, and route
resolve_selected_graph (queries validate) through it; queries list already
calls the gate. A named graph with a populated top-level block now errors in
both commands, matching boot. A positional URI stays anonymous (top-level
honored), so queries_validate_positional_uri_ignores_default_graph is
unaffected.

* docs: RFC-003 — MCP server surface for omnigraph-server

Detailed MCP-transport design for the stored-query/MCP work, building on the
shipped #128 registry. Corrects the draft against the branch head: the coarse
invoke_query gate + 404 denial-masking are already wired (server_invoke_query),
so per-query invoke_query scope (PolicyRequest has no query-name dimension yet)
is the real prerequisite; positions the doc as superseding rfc-001's MCP
transport (/mcp/tools+/mcp/invoke) and reconciles the shipped mcp.expose YAML
form and the schema-introspection non-goal; grounds the parity surface in the
actual omnigraph-ts package (13 tools with read/change ids, 2 resources).

* docs(config): clarify graph config boundaries

* fix(config): enforce graph-scoped policies and query validation

* fix(cli): require graph selection for scoped query registries

* fix(server): preserve named graph id in single mode policy

* fix(cli): share graph identity for policy resolution

* test(cli): cover policy tooling server graph selection

* fix(cli): honor server graph for policy tooling

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Ragnor Comerford 2026-06-01 22:50:31 +02:00 committed by GitHub
parent fab105bcce
commit 3c2b1b8051
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
21 changed files with 5017 additions and 198 deletions

View file

@ -9,6 +9,7 @@ use clap::{Arg, ArgAction, Args, CommandFactory, FromArgMatches, Parser, Subcomm
use color_eyre::eyre::{Result, bail};
use omnigraph::db::{Omnigraph, ReadTarget, SnapshotId};
use omnigraph::loader::LoadMode;
use omnigraph::storage::normalize_root_uri;
use omnigraph_compiler::query::parser::parse_query;
use omnigraph_compiler::schema::parser::parse_schema;
use omnigraph_compiler::{
@ -24,9 +25,10 @@ use omnigraph_server::api::{
SnapshotTableOutput, commit_output, ingest_output, read_output, schema_apply_output,
snapshot_payload,
};
use omnigraph_server::queries::{QueryRegistry, check, format_check_breakages};
use omnigraph_server::{
AliasCommand, OmnigraphConfig, PolicyAction, PolicyDecision, PolicyEngine, PolicyRequest,
PolicyTestConfig, ReadOutputFormat, load_config,
PolicyTestConfig, ReadOutputFormat, graph_resource_id_for_selection, load_config,
};
use reqwest::Method;
use reqwest::header::AUTHORIZATION;
@ -153,6 +155,11 @@ enum Command {
#[arg(long)]
json: bool,
},
/// Operate on the server-side stored-query registry (`queries:`).
Queries {
#[command(subcommand)]
command: QueriesCommand,
},
/// Show graph snapshot
Snapshot {
/// Graph URI
@ -502,6 +509,35 @@ enum PolicyCommand {
},
}
#[derive(Debug, Subcommand)]
enum QueriesCommand {
/// Type-check the stored-query registry against the live schema.
///
/// Distinct from `omnigraph lint` (which lints one `.gq` file):
/// this validates the whole `queries:` registry — opening the graph
/// to read its schema and confirming every stored query still
/// type-checks. Exits non-zero on any breakage.
Validate {
/// Graph URI
uri: Option<String>,
#[arg(long)]
target: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
json: bool,
},
/// List the registered stored queries (name, MCP exposure, params).
List {
#[arg(long)]
target: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
json: bool,
},
}
#[derive(Debug, Args, Clone)]
struct ParamsArgs {
#[arg(long, conflicts_with = "params_file")]
@ -743,25 +779,66 @@ fn load_cli_config(config_path: Option<&PathBuf>) -> Result<OmnigraphConfig> {
Ok(config)
}
fn resolve_policy_engine(config: &OmnigraphConfig) -> Result<PolicyEngine> {
let policy_file = config
.resolve_policy_file()
.ok_or_else(|| color_eyre::eyre::eyre!("policy.file must be set in omnigraph.yaml"))?;
PolicyEngine::load_graph(&policy_file, &policy_graph_id(config))
#[derive(Debug, Clone)]
struct ResolvedCliGraph {
uri: String,
selected: Option<String>,
graph_id: String,
policy_file: Option<PathBuf>,
is_remote: bool,
}
/// Open a local-URI graph and, when `policy.file` is configured in
/// `omnigraph.yaml`, install the resolved `PolicyEngine` on the engine
/// handle so every direct-engine write goes through
/// `Omnigraph::enforce(...)` (MR-722). Without a configured policy this
/// is identical to a bare `Omnigraph::open`.
///
/// Returns owned `Omnigraph`; chained on top of `Omnigraph::open(...)`'s
/// existing future to keep call sites narrow.
async fn open_local_db_with_policy(uri: &str, config: &OmnigraphConfig) -> Result<Omnigraph> {
let db = Omnigraph::open(uri).await?;
if config.resolve_policy_file().is_some() {
let engine = Arc::new(resolve_policy_engine(config)?);
impl ResolvedCliGraph {
fn selected(&self) -> Option<&str> {
self.selected.as_deref()
}
}
struct ResolvedPolicyContext {
policy_file: PathBuf,
graph_id: String,
}
fn resolve_policy_context(config: &OmnigraphConfig) -> Result<ResolvedPolicyContext> {
let selected = config.resolve_policy_tooling_graph_selection()?;
let policy_file = config
.resolve_policy_file_for(selected)
.ok_or_else(|| {
color_eyre::eyre::eyre!(
"policy.file or graphs.<name>.policy.file must be set in omnigraph.yaml"
)
})?;
let graph_id = match selected {
Some(name) => graph_resource_id_for_selection(Some(name), ""),
None => graph_resource_id_for_selection(None, "default"),
};
Ok(ResolvedPolicyContext {
policy_file,
graph_id,
})
}
fn resolve_policy_engine(context: &ResolvedPolicyContext) -> Result<PolicyEngine> {
PolicyEngine::load_graph(&context.policy_file, &context.graph_id)
}
fn resolve_policy_engine_for_graph(graph: &ResolvedCliGraph) -> Result<PolicyEngine> {
let policy_file = graph.policy_file.as_ref().ok_or_else(|| {
color_eyre::eyre::eyre!(
"policy.file or graphs.<name>.policy.file must be set in omnigraph.yaml"
)
})?;
PolicyEngine::load_graph(policy_file, &graph.graph_id)
}
/// Open a local graph and install the policy resolved for the same graph
/// identity that produced the URI. A named graph uses
/// `graphs.<name>.policy.file`; an explicit positional URI is anonymous and
/// uses the legacy top-level `policy.file`.
async fn open_local_db_with_policy(graph: &ResolvedCliGraph) -> Result<Omnigraph> {
let db = Omnigraph::open(&graph.uri).await?;
if graph.policy_file.is_some() {
let engine = Arc::new(resolve_policy_engine_for_graph(graph)?);
Ok(db.with_policy(engine as Arc<dyn omnigraph_policy::PolicyChecker>))
} else {
Ok(db)
@ -778,22 +855,16 @@ fn resolve_cli_actor<'a>(cli_as: Option<&'a str>, config: &'a OmnigraphConfig) -
cli_as.or(config.cli.actor.as_deref())
}
fn resolve_policy_tests_path(config: &OmnigraphConfig) -> Result<PathBuf> {
config.resolve_policy_tests_file().ok_or_else(|| {
color_eyre::eyre::eyre!(
"policy.tests.yaml requires policy.file to be set in omnigraph.yaml"
)
})
fn resolve_policy_tests_path(context: &ResolvedPolicyContext) -> PathBuf {
context.policy_file.with_file_name("policy.tests.yaml")
}
fn policy_graph_id(config: &OmnigraphConfig) -> String {
if let Some(name) = &config.project.name {
return name.clone();
fn normalize_policy_graph_uri(uri: &str) -> Result<String> {
if is_remote_uri(uri) {
Ok(uri.trim_end_matches('/').to_string())
} else {
Ok(normalize_root_uri(uri)?)
}
config
.resolve_target_uri(None, None, config.server_graph_name())
.or_else(|_| config.resolve_target_uri(None, None, config.cli_graph_name()))
.unwrap_or_else(|_| "default".to_string())
}
fn resolve_remote_bearer_token(
@ -877,6 +948,47 @@ fn resolve_uri(
config.resolve_target_uri(cli_uri, cli_target, config.cli_graph_name())
}
fn resolve_cli_graph(
config: &OmnigraphConfig,
cli_uri: Option<String>,
cli_target: Option<&str>,
) -> Result<ResolvedCliGraph> {
let selected = if cli_uri.is_some() {
None
} else {
cli_target
.map(str::to_string)
.or_else(|| config.cli_graph_name().map(str::to_string))
};
config.resolve_graph_selection(selected.as_deref())?;
let uri = resolve_uri(config, cli_uri, cli_target)?;
let normalized_uri = normalize_policy_graph_uri(&uri)?;
let graph_id = graph_resource_id_for_selection(selected.as_deref(), &normalized_uri);
Ok(ResolvedCliGraph {
graph_id,
is_remote: is_remote_uri(&uri),
policy_file: config.resolve_policy_file_for(selected.as_deref()),
selected,
uri,
})
}
fn resolve_local_graph(
config: &OmnigraphConfig,
cli_uri: Option<String>,
cli_target: Option<&str>,
operation: &str,
) -> Result<ResolvedCliGraph> {
let graph = resolve_cli_graph(config, cli_uri, cli_target)?;
if graph.is_remote {
bail!(
"{} is only supported against local graph URIs in this milestone",
operation
);
}
Ok(graph)
}
/// Parse a Go-style compact duration: `7d`, `24h`, `30m`, `90s`, or a plain
/// integer as seconds. Used by the `cleanup --older-than` flag.
fn parse_duration_arg(s: &str) -> Result<std::time::Duration> {
@ -915,14 +1027,7 @@ fn resolve_local_uri(
cli_target: Option<&str>,
operation: &str,
) -> Result<String> {
let uri = resolve_uri(config, cli_uri, cli_target)?;
if is_remote_uri(&uri) {
bail!(
"{} is only supported against local graph URIs in this milestone",
operation
);
}
Ok(uri)
Ok(resolve_local_graph(config, cli_uri, cli_target, operation)?.uri)
}
fn resolve_branch(
@ -1609,6 +1714,248 @@ async fn execute_query_lint(
))
}
#[derive(serde::Serialize)]
struct QueriesIssue {
query: String,
message: String,
}
#[derive(serde::Serialize)]
struct QueriesValidateOutput {
ok: bool,
breakages: Vec<QueriesIssue>,
warnings: Vec<QueriesIssue>,
}
#[derive(serde::Serialize)]
struct QueriesParam {
name: String,
#[serde(rename = "type")]
type_name: String,
nullable: bool,
}
#[derive(serde::Serialize)]
struct QueriesListItem {
name: String,
mcp_expose: bool,
tool_name: Option<String>,
mutation: bool,
params: Vec<QueriesParam>,
}
#[derive(serde::Serialize)]
struct QueriesListOutput {
queries: Vec<QueriesListItem>,
}
/// Resolve the selected graph to `(local URI, registry selection)` from one
/// precedence, so a command's schema and its stored-query registry can never
/// come from different graphs. A **positional URI is anonymous** (top-level
/// registry, ignoring the configured default graph); otherwise `--target`
/// or the configured `cli.graph` names the graph (its per-graph block).
/// Mirrors the server's single-mode identity rule.
fn resolve_selected_graph(
config: &OmnigraphConfig,
cli_uri: Option<String>,
cli_target: Option<&str>,
operation: &str,
) -> Result<(String, Option<String>)> {
let graph = resolve_local_graph(config, cli_uri, cli_target, operation)?;
Ok((graph.uri, graph.selected))
}
/// Load the stored-query registry for an already-resolved graph selection
/// (`None` = anonymous → top-level; `Some(name)` = that graph's block).
fn load_registry_or_report(
config: &OmnigraphConfig,
selected: Option<&str>,
) -> Result<QueryRegistry> {
QueryRegistry::load(config, config.query_entries_for(selected)).map_err(|errors| {
color_eyre::eyre::eyre!(
"stored-query registry failed to load:\n {}",
errors
.iter()
.map(|e| e.to_string())
.collect::<Vec<_>>()
.join("\n ")
)
})
}
fn graph_query_registry_names(config: &OmnigraphConfig) -> Vec<&str> {
config
.graphs
.iter()
.filter_map(|(name, graph)| (!graph.queries.is_empty()).then_some(name.as_str()))
.collect()
}
fn resolve_registry_selection_for_list(
config: &OmnigraphConfig,
target: Option<&str>,
) -> Result<Option<String>> {
let selected = target
.map(str::to_string)
.or_else(|| config.cli_graph_name().map(str::to_string));
if let Some(name) = selected.as_deref() {
config.resolve_graph_selection(Some(name))?;
return Ok(selected);
}
if !config.query_entries().is_empty() {
return Ok(None);
}
let graph_names = graph_query_registry_names(config);
if graph_names.is_empty() {
return Ok(None);
}
bail!(
"stored-query registries are configured for graph{} {} but no graph was selected. Pass `--target {}` or set `cli.graph`.",
if graph_names.len() == 1 { "" } else { "s" },
graph_names.join(", "),
graph_names[0],
)
}
fn validate_registry_for_catalog(
registry: &QueryRegistry,
catalog: &omnigraph_compiler::catalog::Catalog,
label: &str,
) -> omnigraph::error::Result<()> {
let report = check(registry, catalog);
if report.has_breakages() {
return Err(omnigraph::error::OmniError::manifest(
format_check_breakages(label, &report),
));
}
Ok(())
}
async fn execute_queries_validate(
uri: Option<String>,
target: Option<String>,
config_path: Option<&PathBuf>,
json: bool,
) -> Result<()> {
let config = load_cli_config(config_path)?;
// One selection drives both the schema URI and the registry, so a
// positional URI and a `--target` can't validate different graphs.
let (uri, selected) =
resolve_selected_graph(&config, uri, target.as_deref(), "queries validate")?;
let registry = load_registry_or_report(&config, selected.as_deref())?;
let db = Omnigraph::open(&uri).await?;
let report = check(&registry, &db.catalog());
let output = QueriesValidateOutput {
ok: !report.has_breakages(),
breakages: report
.breakages
.iter()
.map(|b| QueriesIssue {
query: b.query.clone(),
message: b.message.clone(),
})
.collect(),
warnings: report
.warnings
.iter()
.map(|w| QueriesIssue {
query: w.query.clone(),
message: w.message.clone(),
})
.collect(),
};
if json {
print_json(&output)?;
} else {
if output.breakages.is_empty() {
println!(
"OK {} stored quer{} type-check against the schema",
registry.len(),
if registry.len() == 1 { "y" } else { "ies" }
);
}
for issue in &output.breakages {
println!("ERROR query '{}': {}", issue.query, issue.message);
}
for issue in &output.warnings {
println!("WARN query '{}': {}", issue.query, issue.message);
}
}
if report.has_breakages() {
io::stdout().flush()?;
std::process::exit(1);
}
Ok(())
}
fn execute_queries_list(
target: Option<String>,
config_path: Option<&PathBuf>,
json: bool,
) -> Result<()> {
let config = load_cli_config(config_path)?;
let selected = resolve_registry_selection_for_list(&config, target.as_deref())?;
let registry = load_registry_or_report(&config, selected.as_deref())?;
let output = QueriesListOutput {
queries: registry
.iter()
.map(|q| QueriesListItem {
name: q.name.clone(),
mcp_expose: q.expose,
tool_name: q.tool_name.clone(),
mutation: q.is_mutation(),
params: q
.decl
.params
.iter()
.map(|p| QueriesParam {
name: p.name.clone(),
type_name: p.type_name.clone(),
nullable: p.nullable,
})
.collect(),
})
.collect(),
};
if json {
print_json(&output)?;
} else if output.queries.is_empty() {
println!("(no stored queries registered)");
} else {
for q in &output.queries {
let kind = if q.mutation { "mutation" } else { "read" };
let params = q
.params
.iter()
.map(|p| {
format!(
"${}: {}{}",
p.name,
p.type_name,
if p.nullable { "?" } else { "" }
)
})
.collect::<Vec<_>>()
.join(", ");
let mcp = if q.mcp_expose {
format!(" [mcp: {}]", q.tool_name.as_deref().unwrap_or(&q.name))
} else {
String::new()
};
println!("{kind} {}({params}){mcp}", q.name);
}
}
Ok(())
}
async fn execute_read(
uri: &str,
query_source: &str,
@ -1655,7 +2002,7 @@ async fn execute_read_remote(
}
async fn execute_change(
uri: &str,
graph: &ResolvedCliGraph,
query_source: &str,
query_name: Option<&str>,
branch: &str,
@ -1665,7 +2012,7 @@ async fn execute_change(
) -> Result<ChangeOutput> {
let (selected_name, query_params) = select_named_query(query_source, query_name)?;
let params = query_params_from_json(&query_params, params_json)?;
let db = open_local_db_with_policy(uri, config).await?;
let db = open_local_db_with_policy(graph).await?;
let actor = resolve_cli_actor(cli_as_actor, config);
let result = db
.mutate_as(branch, query_source, &selected_name, &params, actor)
@ -1893,9 +2240,10 @@ async fn main() -> Result<()> {
json,
} => {
let config = load_cli_config(config.as_ref())?;
let uri = resolve_local_uri(&config, uri, target.as_deref(), "load")?;
let graph = resolve_local_graph(&config, uri, target.as_deref(), "load")?;
let uri = graph.uri.clone();
let branch = resolve_branch(&config, branch, None, "main");
let db = open_local_db_with_policy(&uri, &config).await?;
let db = open_local_db_with_policy(&graph).await?;
let actor = resolve_cli_actor(cli.as_actor.as_deref(), &config);
let result = db
.load_file_as(&branch, &data.to_string_lossy(), mode.into(), actor)
@ -1936,10 +2284,11 @@ async fn main() -> Result<()> {
let config = load_cli_config(config.as_ref())?;
let bearer_token =
resolve_remote_bearer_token(&config, uri.as_deref(), target.as_deref())?;
let uri = resolve_uri(&config, uri, target.as_deref())?;
let graph = resolve_cli_graph(&config, uri, target.as_deref())?;
let uri = graph.uri.clone();
let branch = resolve_branch(&config, branch, None, "main");
let from = resolve_branch(&config, from, None, "main");
let payload = if is_remote_uri(&uri) {
let payload = if graph.is_remote {
let data = fs::read_to_string(&data)?;
remote_json::<IngestOutput>(
&http_client,
@ -1955,7 +2304,7 @@ async fn main() -> Result<()> {
)
.await?
} else {
let db = open_local_db_with_policy(&uri, &config).await?;
let db = open_local_db_with_policy(&graph).await?;
let actor = resolve_cli_actor(cli.as_actor.as_deref(), &config);
let result = db
.ingest_file_as(
@ -1986,9 +2335,10 @@ async fn main() -> Result<()> {
let config = load_cli_config(config.as_ref())?;
let bearer_token =
resolve_remote_bearer_token(&config, uri.as_deref(), target.as_deref())?;
let uri = resolve_uri(&config, uri, target.as_deref())?;
let graph = resolve_cli_graph(&config, uri, target.as_deref())?;
let uri = graph.uri.clone();
let from = resolve_branch(&config, from, None, "main");
let payload = if is_remote_uri(&uri) {
let payload = if graph.is_remote {
remote_json::<BranchCreateOutput>(
&http_client,
Method::POST,
@ -2001,7 +2351,7 @@ async fn main() -> Result<()> {
)
.await?
} else {
let db = open_local_db_with_policy(&uri, &config).await?;
let db = open_local_db_with_policy(&graph).await?;
let actor = resolve_cli_actor(cli.as_actor.as_deref(), &config);
db.branch_create_from_as(ReadTarget::branch(&from), &name, actor)
.await?;
@ -2027,8 +2377,9 @@ async fn main() -> Result<()> {
let config = load_cli_config(config.as_ref())?;
let bearer_token =
resolve_remote_bearer_token(&config, uri.as_deref(), target.as_deref())?;
let uri = resolve_uri(&config, uri, target.as_deref())?;
let payload = if is_remote_uri(&uri) {
let graph = resolve_cli_graph(&config, uri, target.as_deref())?;
let uri = graph.uri.clone();
let payload = if graph.is_remote {
remote_json::<BranchListOutput>(
&http_client,
Method::GET,
@ -2061,8 +2412,9 @@ async fn main() -> Result<()> {
let config = load_cli_config(config.as_ref())?;
let bearer_token =
resolve_remote_bearer_token(&config, uri.as_deref(), target.as_deref())?;
let uri = resolve_uri(&config, uri, target.as_deref())?;
let payload = if is_remote_uri(&uri) {
let graph = resolve_cli_graph(&config, uri, target.as_deref())?;
let uri = graph.uri.clone();
let payload = if graph.is_remote {
remote_json::<BranchDeleteOutput>(
&http_client,
Method::DELETE,
@ -2072,7 +2424,7 @@ async fn main() -> Result<()> {
)
.await?
} else {
let db = open_local_db_with_policy(&uri, &config).await?;
let db = open_local_db_with_policy(&graph).await?;
let actor = resolve_cli_actor(cli.as_actor.as_deref(), &config);
db.branch_delete_as(&name, actor).await?;
BranchDeleteOutput {
@ -2098,9 +2450,10 @@ async fn main() -> Result<()> {
let config = load_cli_config(config.as_ref())?;
let bearer_token =
resolve_remote_bearer_token(&config, uri.as_deref(), target.as_deref())?;
let uri = resolve_uri(&config, uri, target.as_deref())?;
let graph = resolve_cli_graph(&config, uri, target.as_deref())?;
let uri = graph.uri.clone();
let into = resolve_branch(&config, into, None, "main");
let payload = if is_remote_uri(&uri) {
let payload = if graph.is_remote {
remote_json::<BranchMergeOutput>(
&http_client,
Method::POST,
@ -2113,7 +2466,7 @@ async fn main() -> Result<()> {
)
.await?
} else {
let db = open_local_db_with_policy(&uri, &config).await?;
let db = open_local_db_with_policy(&graph).await?;
let actor = resolve_cli_actor(cli.as_actor.as_deref(), &config);
let outcome = db.branch_merge_as(&source, &into, actor).await?;
BranchMergeOutput {
@ -2248,9 +2601,10 @@ async fn main() -> Result<()> {
let config = load_cli_config(config.as_ref())?;
let bearer_token =
resolve_remote_bearer_token(&config, uri.as_deref(), target.as_deref())?;
let uri = resolve_uri(&config, uri, target.as_deref())?;
let graph = resolve_cli_graph(&config, uri, target.as_deref())?;
let uri = graph.uri.clone();
let schema_source = fs::read_to_string(&schema)?;
let output = if is_remote_uri(&uri) {
let output = if graph.is_remote {
// MR-694 PR B: SchemaApplyRequest gained an
// allow_data_loss field so Hard-mode drops are no
// longer CLI-only. The previous bail is gone; the
@ -2268,13 +2622,22 @@ async fn main() -> Result<()> {
)
.await?
} else {
let db = open_local_db_with_policy(&uri, &config).await?;
let db = open_local_db_with_policy(&graph).await?;
let actor = resolve_cli_actor(cli.as_actor.as_deref(), &config);
let registry = load_registry_or_report(&config, graph.selected())?;
let registry = (!registry.is_empty()).then_some(registry);
let label = graph.selected().unwrap_or(&uri).to_string();
let result = db
.apply_schema_as(
.apply_schema_as_with_catalog_check(
&schema_source,
omnigraph::db::SchemaApplyOptions { allow_data_loss },
actor,
|catalog| {
if let Some(registry) = registry.as_ref() {
validate_registry_for_catalog(registry, catalog, &label)?;
}
Ok(())
},
)
.await?;
schema_apply_output(&uri, result)
@ -2331,6 +2694,23 @@ async fn main() -> Result<()> {
.await?;
finish_query_lint(&output, json)?;
}
Command::Queries { command } => match command {
QueriesCommand::Validate {
uri,
target,
config,
json,
} => {
execute_queries_validate(uri, target, config.as_ref(), json).await?;
}
QueriesCommand::List {
target,
config,
json,
} => {
execute_queries_list(target, config.as_ref(), json)?;
}
},
Command::Snapshot {
uri,
target,
@ -2436,7 +2816,8 @@ async fn main() -> Result<()> {
.as_deref()
.or_else(|| alias_config.and_then(|alias| alias.graph.as_deref()));
let bearer_token = resolve_remote_bearer_token(&config, uri.as_deref(), target_name)?;
let uri = resolve_uri(&config, uri, target_name)?;
let graph = resolve_cli_graph(&config, uri, target_name)?;
let uri = graph.uri.clone();
let query_source = resolve_query_source(
&config,
query.as_ref(),
@ -2458,7 +2839,7 @@ async fn main() -> Result<()> {
alias_config.and_then(|alias| alias.branch.clone()),
)?;
let query_name = name.or_else(|| alias_config.and_then(|alias| alias.name.clone()));
let output = if is_remote_uri(&uri) {
let output = if graph.is_remote {
execute_read_remote(
&http_client,
&uri,
@ -2521,7 +2902,8 @@ async fn main() -> Result<()> {
.as_deref()
.or_else(|| alias_config.and_then(|alias| alias.graph.as_deref()));
let bearer_token = resolve_remote_bearer_token(&config, uri.as_deref(), target_name)?;
let uri = resolve_uri(&config, uri, target_name)?;
let graph = resolve_cli_graph(&config, uri, target_name)?;
let uri = graph.uri.clone();
let query_source = resolve_query_source(
&config,
query.as_ref(),
@ -2543,7 +2925,7 @@ async fn main() -> Result<()> {
"main",
);
let query_name = name.or_else(|| alias_config.and_then(|alias| alias.name.clone()));
let output = if is_remote_uri(&uri) {
let output = if graph.is_remote {
execute_change_remote(
&http_client,
&uri,
@ -2556,7 +2938,7 @@ async fn main() -> Result<()> {
.await?
} else {
execute_change(
&uri,
&graph,
&query_source,
query_name.as_deref(),
&branch,
@ -2575,20 +2957,19 @@ async fn main() -> Result<()> {
Command::Policy { command } => match command {
PolicyCommand::Validate { config } => {
let config = load_cli_config(config.as_ref())?;
let engine = resolve_policy_engine(&config)?;
let policy_file = config
.resolve_policy_file()
.expect("policy file should exist after resolve_policy_engine");
let context = resolve_policy_context(&config)?;
let engine = resolve_policy_engine(&context)?;
println!(
"policy valid: {} [{} actors]",
policy_file.display(),
context.policy_file.display(),
engine.known_actor_count()
);
}
PolicyCommand::Test { config } => {
let config = load_cli_config(config.as_ref())?;
let engine = resolve_policy_engine(&config)?;
let tests_path = resolve_policy_tests_path(&config)?;
let context = resolve_policy_context(&config)?;
let engine = resolve_policy_engine(&context)?;
let tests_path = resolve_policy_tests_path(&context);
let tests = PolicyTestConfig::load(&tests_path)?;
engine.run_tests(&tests)?;
println!("policy tests passed: {} cases", tests.cases.len());
@ -2601,7 +2982,8 @@ async fn main() -> Result<()> {
target_branch,
} => {
let config = load_cli_config(config.as_ref())?;
let engine = resolve_policy_engine(&config)?;
let context = resolve_policy_context(&config)?;
let engine = resolve_policy_engine(&context)?;
let request = PolicyRequest {
action,
branch,
@ -2774,7 +3156,8 @@ mod tests {
use super::{
DEFAULT_BEARER_TOKEN_ENV, apply_bearer_token, bearer_token_from_env_file,
legacy_change_request_body, load_cli_config, load_env_file_into_process,
normalize_bearer_token, parse_env_assignment, resolve_remote_bearer_token,
normalize_bearer_token, parse_env_assignment, resolve_policy_context,
resolve_cli_graph, resolve_remote_bearer_token,
};
use omnigraph_server::load_config;
use reqwest::header::AUTHORIZATION;
@ -3034,4 +3417,150 @@ graphs:
}
}
}
#[test]
fn graph_identity_resolve_policy_context_named_cli_graph_uses_graph_key_not_project_name_or_uri() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/local-policy-graph.omni
policy:
file: ./policy.yaml
cli:
graph: local
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let context = resolve_policy_context(&config).unwrap();
assert_eq!(context.graph_id, "local");
}
#[test]
fn graph_identity_resolve_policy_context_server_graph_uses_graph_key_when_cli_graph_absent() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/local-policy-graph.omni
policy:
file: ./server-policy.yaml
server:
graph: local
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let context = resolve_policy_context(&config).unwrap();
assert_eq!(context.graph_id, "local");
assert!(context.policy_file.ends_with("server-policy.yaml"));
}
#[test]
fn graph_identity_resolve_policy_context_anonymous_uses_top_level_default_identity() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/local-policy-graph.omni
policy:
file: ./top-policy.yaml
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let context = resolve_policy_context(&config).unwrap();
assert_eq!(context.graph_id, "default");
assert!(context.policy_file.ends_with("top-policy.yaml"));
}
#[test]
fn graph_identity_resolve_cli_graph_named_target_uses_graph_key_not_project_name_or_uri() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
prod:
uri: s3://bucket/prod-graph/
policy:
file: ./prod-policy.yaml
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let graph = resolve_cli_graph(&config, None, Some("prod")).unwrap();
assert_eq!(graph.selected(), Some("prod"));
assert_eq!(graph.graph_id, "prod");
assert_eq!(graph.uri, "s3://bucket/prod-graph/");
}
#[test]
fn graph_identity_resolve_cli_graph_positional_uri_uses_anonymous_normalized_uri() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/configured-graph.omni
policy:
file: ./policy.yaml
cli:
graph: local
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let local_graph_path = temp.path().join("explicit-graph.omni");
let local_graph = resolve_cli_graph(
&config,
Some(format!("file://{}", local_graph_path.display())),
None,
)
.unwrap();
assert_eq!(local_graph.selected(), None);
assert_eq!(
local_graph.graph_id,
local_graph_path.to_string_lossy().as_ref()
);
assert_eq!(local_graph.policy_file, None);
let s3_graph = resolve_cli_graph(
&config,
Some("s3://bucket/anonymous-graph/".to_string()),
None,
)
.unwrap();
assert_eq!(s3_graph.selected(), None);
assert_eq!(s3_graph.graph_id, "s3://bucket/anonymous-graph");
assert_eq!(s3_graph.policy_file, None);
}
}

View file

@ -2376,3 +2376,295 @@ fn graphs_list_against_local_uri_errors_with_remote_only_message() {
"expected 'remote multi-graph server URL' rejection in stderr; got:\n{stderr}"
);
}
fn queries_test_config(graph_uri: &str, entry: &str, gq_file: &str) -> String {
format!(
"graphs:\n local:\n uri: '{}'\n queries:\n {entry}:\n file: ./{gq_file}\n\
cli:\n graph: local\npolicy: {{}}\n",
graph_uri.replace('\'', "''")
)
}
#[test]
fn queries_validate_exits_zero_on_clean_registry() {
let graph = SystemGraph::loaded();
graph.write_query(
"find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&queries_test_config(&graph.path().to_string_lossy(), "find_person", "find_person.gq"),
);
let output = output_success(cli().arg("queries").arg("validate").arg("--config").arg(&config));
let stdout = stdout_string(&output);
assert!(stdout.contains("OK"), "stdout:\n{stdout}");
}
#[test]
fn queries_validate_exits_nonzero_on_type_broken_query() {
let graph = SystemGraph::loaded();
// `Widget` is not in the fixture schema.
graph.write_query("ghost.gq", "query ghost() { match { $w: Widget } return { $w.name } }");
let config = graph.write_config(
"omnigraph.yaml",
&queries_test_config(&graph.path().to_string_lossy(), "ghost", "ghost.gq"),
);
let output = output_failure(cli().arg("queries").arg("validate").arg("--config").arg(&config));
let stdout = stdout_string(&output);
assert!(
stdout.contains("ghost"),
"validation should name the broken query; stdout:\n{stdout}"
);
}
#[test]
fn queries_list_prints_registered_query() {
let graph = SystemGraph::loaded();
graph.write_query(
"find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
// Exposed with an explicit tool name so the list shows the MCP suffix.
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" find_person:\n",
" file: ./find_person.gq\n",
" mcp: {{ expose: true, tool_name: lookup_person }}\n",
"cli:\n",
" graph: local\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
);
let output = output_success(cli().arg("queries").arg("list").arg("--config").arg(&config));
let stdout = stdout_string(&output);
assert!(stdout.contains("find_person"), "stdout:\n{stdout}");
assert!(
stdout.contains("$name: String"),
"list should show typed params; stdout:\n{stdout}"
);
assert!(
stdout.contains("[mcp: lookup_person]"),
"list should show the MCP tool name for exposed queries; stdout:\n{stdout}"
);
}
#[test]
fn queries_list_requires_graph_selection_for_per_graph_only_registries() {
let graph = SystemGraph::loaded();
graph.write_query(
"find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" find_person:\n",
" file: ./find_person.gq\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
);
let output = output_failure(cli().arg("queries").arg("list").arg("--config").arg(&config));
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("local") && stderr.contains("--target local"),
"error must name the graph and give a concrete selection hint; stderr:\n{stderr}"
);
}
#[test]
fn queries_list_without_graph_selection_lists_top_level_registry() {
let graph = SystemGraph::loaded();
graph.write_query(
"top_find.gq",
"query top_find($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
concat!(
"queries:\n",
" top_find:\n",
" file: ./top_find.gq\n",
"policy: {}\n",
),
);
let output = output_success(cli().arg("queries").arg("list").arg("--config").arg(&config));
let stdout = stdout_string(&output);
assert!(stdout.contains("top_find"), "stdout:\n{stdout}");
}
#[test]
fn queries_list_unknown_target_errors() {
// `queries list` opens no graph URI, so unknown-graph validation can't ride
// along on URI resolution the way it does for every other command. An
// unknown `--target` must still error (naming the graph) instead of
// silently falling back to the top-level registry and showing the wrong
// (or empty) catalog.
let graph = SystemGraph::loaded();
graph.write_query(
"find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&queries_test_config(&graph.path().to_string_lossy(), "find_person", "find_person.gq"),
);
let output = output_failure(
cli()
.arg("queries")
.arg("list")
.arg("--target")
.arg("nonexistent")
.arg("--config")
.arg(&config),
);
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("nonexistent"),
"error must name the unknown graph; stderr:\n{stderr}"
);
}
#[test]
fn queries_commands_reject_named_graph_with_populated_top_level_block() {
// A named graph (here via `cli.graph`) uses its own `graphs.<name>` block,
// so a populated top-level `queries:` block would be silently ignored — a
// config the server REFUSES to boot. `queries validate`/`list` must reject
// it too (matching boot) instead of validating/listing the per-graph block
// and giving a false green.
let graph = SystemGraph::loaded();
graph.write_query(
"find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" find_person:\n",
" file: ./find_person.gq\n",
"cli:\n",
" graph: local\n",
"queries:\n", // populated top-level block: the coherence violation
" legacy:\n",
" file: ./legacy.gq\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
);
// Both resolve `local` from cli.graph (no positional URI), so both must
// error and name the graph + the ignored block — like server boot does.
for sub in ["validate", "list"] {
let output = output_failure(cli().arg("queries").arg(sub).arg("--config").arg(&config));
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("local") && stderr.contains("queries"),
"`queries {sub}` must reject a named graph with a populated top-level block; stderr:\n{stderr}"
);
}
}
#[test]
fn queries_validate_exits_nonzero_on_duplicate_tool_name() {
// Two exposed queries claiming one MCP tool name is a load-time
// collision — `queries validate` must fail (offline, before the engine
// opens) and name both queries plus the contested tool.
let graph = SystemGraph::loaded();
graph.write_query("a.gq", "query a() { match { $p: Person } return { $p.name } }");
graph.write_query("b.gq", "query b() { match { $p: Person } return { $p.name } }");
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" a:\n",
" file: ./a.gq\n",
" mcp: {{ expose: true, tool_name: dup }}\n",
" b:\n",
" file: ./b.gq\n",
" mcp: {{ expose: true, tool_name: dup }}\n",
"cli:\n",
" graph: local\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
);
let output = output_failure(cli().arg("queries").arg("validate").arg("--config").arg(&config));
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("dup") && stderr.contains("'a'") && stderr.contains("'b'"),
"duplicate tool name should be reported naming both queries; stderr:\n{stderr}"
);
}
#[test]
fn queries_validate_positional_uri_ignores_default_graph() {
// A positional URI is anonymous → the schema AND the registry both come
// from top-level, even when `cli.graph` names a graph whose per-graph
// queries would fail. Pins that the URI and registry can't diverge.
let graph = SystemGraph::loaded();
graph.write_query(
"clean.gq",
"query clean($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
// `Widget` is not in the fixture schema — the default graph's per-graph
// query would break validate if it were (wrongly) selected.
graph.write_query("broken.gq", "query broken() { match { $w: Widget } return { $w.name } }");
let config = graph.write_config(
"omnigraph.yaml",
concat!(
"cli:\n graph: prod\n",
"graphs:\n",
" prod:\n",
" uri: /nonexistent-prod.omni\n",
" queries:\n",
" broken:\n",
" file: ./broken.gq\n",
"queries:\n",
" clean:\n",
" file: ./clean.gq\n",
"policy: {}\n",
),
);
// Positional URI = the real loaded graph; selection is anonymous, so the
// CLEAN top-level registry validates (not prod's broken one).
let output = output_success(
cli()
.arg("queries")
.arg("validate")
.arg(graph.path())
.arg("--config")
.arg(&config),
);
let stdout = stdout_string(&output);
assert!(
stdout.contains("OK"),
"positional URI must validate the top-level registry, not the cli.graph default; stdout:\n{stdout}"
);
}

View file

@ -74,14 +74,36 @@ project:
graphs:
local:
uri: {}
policy:
file: ./policy.yaml
cli:
graph: local
branch: main
query:
roots:
- .
policy:
file: ./policy.yaml
",
yaml_string(&graph.path().to_string_lossy())
)
}
fn local_policy_server_graph_config(graph: &SystemGraph) -> String {
format!(
"\
project:
name: policy-e2e-local
graphs:
local:
uri: {}
policy:
file: ./policy.yaml
server:
graph: local
cli:
branch: main
query:
roots:
- .
",
yaml_string(&graph.path().to_string_lossy())
)
@ -1000,49 +1022,55 @@ query vector_search($q: String) {
#[test]
fn local_cli_policy_tooling_is_end_to_end() {
// Sanity check for the read-only policy CLI surfaces. These don't
// mutate the graph — they just parse and evaluate the policy file —
// so they don't depend on PR #4's engine-side enforcement.
// mutate the graph; they parse and evaluate the effective policy for
// named graph selections, including per-graph policy files.
let graph = SystemGraph::loaded();
let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph));
let server_graph_config = graph.write_config(
"omnigraph-policy-server.yaml",
&local_policy_server_graph_config(&graph),
);
graph.write_config("policy.yaml", POLICY_E2E_YAML);
graph.write_config("policy.tests.yaml", POLICY_E2E_TESTS_YAML);
let validate = output_success(
cli()
.arg("policy")
.arg("validate")
.arg("--config")
.arg(&config),
);
assert!(stdout_string(&validate).contains("policy valid:"));
for config in [&config, &server_graph_config] {
let validate = output_success(
cli()
.arg("policy")
.arg("validate")
.arg("--config")
.arg(config),
);
assert!(stdout_string(&validate).contains("policy valid:"));
let tests = output_success(cli().arg("policy").arg("test").arg("--config").arg(&config));
assert!(stdout_string(&tests).contains("policy tests passed: 2 cases"));
let tests = output_success(cli().arg("policy").arg("test").arg("--config").arg(config));
assert!(stdout_string(&tests).contains("policy tests passed: 2 cases"));
let explain = output_success(
cli()
.arg("policy")
.arg("explain")
.arg("--config")
.arg(&config)
.arg("--actor")
.arg("act-bruno")
.arg("--action")
.arg("change")
.arg("--branch")
.arg("main"),
);
let explain_stdout = stdout_string(&explain);
assert!(explain_stdout.contains("decision: deny"));
assert!(explain_stdout.contains("branch: main"));
let explain = output_success(
cli()
.arg("policy")
.arg("explain")
.arg("--config")
.arg(config)
.arg("--actor")
.arg("act-bruno")
.arg("--action")
.arg("change")
.arg("--branch")
.arg("main"),
);
let explain_stdout = stdout_string(&explain);
assert!(explain_stdout.contains("decision: deny"));
assert!(explain_stdout.contains("branch: main"));
}
}
#[test]
fn local_cli_change_enforces_engine_layer_policy() {
// Asserts MR-722 PR #4: when `policy.file` is configured in
// `omnigraph.yaml`, the CLI loads PolicyEngine into Omnigraph and
// every direct-engine write hits `enforce(action, scope, actor)` —
// identical to what the HTTP server gets, regardless of transport.
// Asserts MR-722 PR #4: when the selected graph has a configured
// policy file, the CLI loads PolicyEngine into Omnigraph and every
// direct-engine write hits `enforce(action, scope, actor)` — identical
// to what the HTTP server gets, regardless of transport.
//
// Three cases, each discriminating:
//
@ -1135,6 +1163,32 @@ fn local_cli_change_enforces_engine_layer_policy() {
assert_eq!(verify["rows"][0]["p.name"], "RagnorOnMain");
}
#[test]
fn local_cli_positional_uri_does_not_inherit_default_graph_policy() {
let graph = SystemGraph::loaded();
let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph));
graph.write_config("policy.yaml", POLICY_E2E_YAML);
let mutation_file = insert_person_query(&graph, "system-local-policy-positional.gq");
let allowed = parse_stdout_json(&output_success(
cli()
.arg("--as")
.arg("act-bruno")
.arg("change")
.arg("--config")
.arg(&config)
.arg("--uri")
.arg(graph.path())
.arg("--query")
.arg(&mutation_file)
.arg("--params")
.arg(r#"{"name":"PositionalUriBruno","age":4}"#)
.arg("--json"),
));
assert_eq!(allowed["affected_nodes"], 1);
assert_eq!(allowed["actor_id"], "act-bruno");
}
// ─── MR-722 PR A: CLI×writer matrix ───────────────────────────────────────
//
// The change writer is covered above by `local_cli_change_enforces_engine_layer_policy`.
@ -1293,6 +1347,62 @@ fn local_cli_schema_apply_enforces_engine_layer_policy() {
assert_eq!(allowed["applied"], true);
}
#[test]
fn local_cli_schema_apply_rejects_stored_query_breakage_before_publish() {
let graph = SystemGraph::loaded();
graph.write_query(
"stored-find-person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph-stored-query-schema.yaml",
&format!(
"\
graphs:
local:
uri: {}
queries:
find_person:
file: ./stored-find-person.gq
cli:
graph: local
branch: main
query:
roots:
- .
policy: {{}}
",
yaml_string(&graph.path().to_string_lossy())
),
);
let renamed_schema = std::fs::read_to_string(fixture("test.pg"))
.unwrap()
.replace("age: I32?", "years: I32? @rename_from(\"age\")");
let schema_path = graph.write_file("stored-query-breaks.pg", &renamed_schema);
let rejected = output_failure(
cli()
.arg("schema")
.arg("apply")
.arg("--config")
.arg(&config)
.arg("--schema")
.arg(&schema_path)
.arg("--json"),
);
let stderr = String::from_utf8_lossy(&rejected.stderr);
assert!(
stderr.contains("find_person") && stderr.contains("schema check"),
"schema apply should reject the stored-query breakage before publish; stderr: {stderr}"
);
let schema = stdout_string(&output_success(
cli().arg("schema").arg("show").arg("--config").arg(&config),
));
assert!(schema.contains("age: I32?"));
assert!(!schema.contains("years: I32?"));
}
#[test]
fn local_cli_branch_create_enforces_engine_layer_policy() {
let graph = SystemGraph::loaded();
@ -1448,6 +1558,8 @@ project:
graphs:
local:
uri: {}
policy:
file: ./policy.yaml
cli:
graph: local
branch: main
@ -1455,8 +1567,6 @@ cli:
query:
roots:
- .
policy:
file: ./policy.yaml
",
yaml_string(&graph.path().to_string_lossy()),
actor,

View file

@ -60,10 +60,10 @@ project:
graphs:
local:
uri: {}
policy:
file: ./policy.yaml
server:
graph: local
policy:
file: ./policy.yaml
",
yaml_string(&graph.path().to_string_lossy())
)

View file

@ -56,6 +56,21 @@ pub enum PolicyAction {
/// from v0.6.0; operators add and remove graphs by editing
/// `omnigraph.yaml` and restarting.
GraphList,
/// Gates invoking a server-side stored query by name. Per-graph and
/// **graph-scoped** (no branch dimension, like `Admin`): the per-branch
/// access of the query body is enforced by the inner `Read`/`Change`
/// gate, so branch-scoping this outer gate would be redundant (and was
/// wrong for snapshot reads). A rule that sets `branch_scope` on
/// `invoke_query` is rejected by `validate()`. In this release it is
/// **coarse**: an `invoke_query` allow rule permits *any* stored query
/// on the graph (no per-query dimension yet); a future, additive
/// refinement adds an optional query-name scope.
///
/// This gate sits at the HTTP boundary. The engine `_as` writers still
/// enforce `Read`/`Change` per the query body, so a stored *mutation*
/// is double-gated: `invoke_query` to reach the tool, plus `change` for
/// the write itself.
InvokeQuery,
}
impl PolicyAction {
@ -70,6 +85,7 @@ impl PolicyAction {
Self::BranchMerge => "branch_merge",
Self::Admin => "admin",
Self::GraphList => "graph_list",
Self::InvokeQuery => "invoke_query",
}
}
@ -99,7 +115,8 @@ impl PolicyAction {
| Self::BranchCreate
| Self::BranchDelete
| Self::BranchMerge
| Self::Admin => PolicyResourceKind::Graph,
| Self::Admin
| Self::InvokeQuery => PolicyResourceKind::Graph,
}
}
}
@ -155,6 +172,7 @@ impl FromStr for PolicyAction {
"branch_merge" => Ok(Self::BranchMerge),
"admin" => Ok(Self::Admin),
"graph_list" => Ok(Self::GraphList),
"invoke_query" => Ok(Self::InvokeQuery),
other => bail!("unknown policy action '{other}'"),
}
}
@ -806,6 +824,7 @@ namespace Omnigraph {
action "branch_delete" appliesTo { principal: Actor, resource: Graph, context: RequestContext };
action "branch_merge" appliesTo { principal: Actor, resource: Graph, context: RequestContext };
action "admin" appliesTo { principal: Actor, resource: Graph, context: RequestContext };
action "invoke_query" appliesTo { principal: Actor, resource: Graph, context: RequestContext };
action "graph_list" appliesTo { principal: Actor, resource: Server, context: RequestContext };
}
@ -1264,6 +1283,80 @@ rules:
assert!(!deny.allowed);
}
#[test]
fn invoke_query_authorizes_per_graph() {
let policy: PolicyConfig = serde_yaml::from_str(
r#"
version: 1
groups:
team: [act-alice]
others: [act-bruno]
rules:
- id: team-invoke-queries
allow:
actors: { group: team }
actions: [invoke_query]
"#,
)
.unwrap();
let engine = PolicyCompiler::compile(&policy, "graph").unwrap();
let allow = engine
.authorize(
"act-alice",
&PolicyRequest {
action: PolicyAction::InvokeQuery,
branch: None,
target_branch: None,
},
)
.unwrap();
assert!(allow.allowed);
assert_eq!(
allow.matched_rule_id.as_deref(),
Some("team-invoke-queries")
);
// Actor outside the group → deny.
let deny = engine
.authorize(
"act-bruno",
&PolicyRequest {
action: PolicyAction::InvokeQuery,
branch: None,
target_branch: None,
},
)
.unwrap();
assert!(!deny.allowed);
}
#[test]
fn invoke_query_rejects_branch_scope() {
// invoke_query is graph-scoped (like admin) — per-branch access is
// enforced by the inner read/change gate — so a rule that puts a
// `branch_scope` qualifier on it is rejected at validate().
let policy: PolicyConfig = serde_yaml::from_str(
r#"
version: 1
groups:
team: [act-alice]
rules:
- id: team-invoke-any-branch
allow:
actors: { group: team }
actions: [invoke_query]
branch_scope: any
"#,
)
.unwrap();
let err = policy.validate().unwrap_err().to_string();
assert!(
err.contains("branch_scope") && err.contains("invoke_query"),
"branch_scope on invoke_query must be rejected: {err}"
);
}
#[test]
fn server_scoped_rule_cannot_use_branch_scope() {
let policy: PolicyConfig = serde_yaml::from_str(

View file

@ -1,8 +1,11 @@
use omnigraph::db::{GraphCommit, MergeOutcome, ReadTarget, SchemaApplyResult, Snapshot};
use omnigraph::error::{MergeConflict, MergeConflictKind};
use omnigraph::loader::{IngestResult, LoadMode};
use crate::queries::StoredQuery;
use omnigraph_compiler::SchemaMigrationStep;
use omnigraph_compiler::query::ast::Param;
use omnigraph_compiler::result::QueryResult;
use omnigraph_compiler::types::{PropType, ScalarType};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use utoipa::{IntoParams, ToSchema};
@ -300,6 +303,162 @@ pub struct ChangeRequest {
pub branch: Option<String>,
}
/// Body for `POST /queries/{name}` — invokes the server-side stored query
/// named in the path. The query source and name come from the registry,
/// never the body; only the runtime inputs are supplied here.
#[derive(Debug, Clone, Default, Serialize, Deserialize, ToSchema)]
pub struct InvokeStoredQueryRequest {
/// JSON object whose keys match the stored query's declared parameters.
#[serde(default)]
pub params: Option<Value>,
/// Branch to run against. Defaults to `main`; for a stored mutation the
/// write targets this branch.
#[serde(default)]
pub branch: Option<String>,
/// Snapshot id to read from (read queries only — rejected for a stored
/// mutation). Mutually exclusive with `branch`.
#[serde(default)]
pub snapshot: Option<String>,
}
/// Response for `POST /queries/{name}`: the read envelope for a stored
/// read, or the mutation envelope for a stored mutation. Serialized
/// **untagged**, so the wire shape is exactly [`ReadOutput`] or
/// [`ChangeOutput`] — classification follows the stored query, not a
/// wrapper field.
#[derive(Debug, Serialize, ToSchema)]
#[serde(untagged)]
pub enum InvokeStoredQueryResponse {
Read(ReadOutput),
Change(ChangeOutput),
}
/// The kind of a stored-query parameter, decomposed so a client (e.g. an
/// MCP server) can build a typed input schema with a closed `match` and
/// never re-parse omnigraph's type spelling. `bigint`/`date`/`datetime`/
/// `blob` are carried as JSON strings on the wire: a 64-bit integer past
/// 2^53 loses precision as a JSON number, and Date/DateTime are ISO
/// strings, Blob a blob-URI string.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, ToSchema)]
#[serde(rename_all = "snake_case")]
pub enum ParamKind {
String,
Bool,
Int,
#[serde(rename = "bigint")]
BigInt,
Float,
Date,
#[serde(rename = "datetime")]
DateTime,
Blob,
Vector,
List,
}
/// One declared parameter of a stored query, projected for the catalog.
#[derive(Debug, Clone, Serialize, Deserialize, ToSchema)]
pub struct ParamDescriptor {
pub name: String,
pub kind: ParamKind,
/// Element kind when `kind == list` (always a scalar — the grammar
/// forbids lists of vectors or nested lists).
#[serde(skip_serializing_if = "Option::is_none")]
pub item_kind: Option<ParamKind>,
/// Dimension when `kind == vector`.
#[serde(skip_serializing_if = "Option::is_none")]
pub vector_dim: Option<u32>,
/// `false` → the caller must supply it; `true` → optional.
pub nullable: bool,
}
/// One entry in the stored-query catalog (`GET /queries`).
#[derive(Debug, Clone, Serialize, Deserialize, ToSchema)]
pub struct QueryCatalogEntry {
/// Registry key / invoke path segment (`POST /queries/{name}`).
pub name: String,
/// MCP tool id (the `tool_name` override, else `name`).
pub tool_name: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub description: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub instruction: Option<String>,
/// `true` for a stored mutation → an MCP read-only hint of `false`.
pub mutation: bool,
pub params: Vec<ParamDescriptor>,
}
/// Response for `GET /queries`: the `mcp.expose` subset of a graph's
/// stored-query registry, each with typed parameters.
#[derive(Debug, Clone, Serialize, Deserialize, ToSchema)]
pub struct QueriesCatalogOutput {
pub queries: Vec<QueryCatalogEntry>,
}
/// Total map from a resolved scalar to its catalog kind. Exhaustive on
/// purpose: a new `ScalarType` is a compile error here until catalogued.
fn scalar_kind(scalar: ScalarType) -> ParamKind {
match scalar {
ScalarType::String => ParamKind::String,
ScalarType::Bool => ParamKind::Bool,
ScalarType::I32 | ScalarType::U32 => ParamKind::Int,
ScalarType::I64 | ScalarType::U64 => ParamKind::BigInt,
ScalarType::F32 | ScalarType::F64 => ParamKind::Float,
ScalarType::Date => ParamKind::Date,
ScalarType::DateTime => ParamKind::DateTime,
ScalarType::Blob => ParamKind::Blob,
ScalarType::Vector(_) => ParamKind::Vector,
}
}
fn param_descriptor(param: &Param) -> ParamDescriptor {
match PropType::from_param_type_name(&param.type_name, param.nullable) {
Some(pt) if pt.list => ParamDescriptor {
name: param.name.clone(),
kind: ParamKind::List,
item_kind: Some(scalar_kind(pt.scalar)),
vector_dim: None,
nullable: param.nullable,
},
Some(pt) => {
let (kind, vector_dim) = match pt.scalar {
ScalarType::Vector(dim) => (ParamKind::Vector, Some(dim)),
other => (scalar_kind(other), None),
};
ParamDescriptor {
name: param.name.clone(),
kind,
item_kind: None,
vector_dim,
nullable: param.nullable,
}
}
// Unreachable for a parsed query (every declared param type is
// grammatical); fall back to an opaque string so the field is still
// usable rather than dropped.
None => ParamDescriptor {
name: param.name.clone(),
kind: ParamKind::String,
item_kind: None,
vector_dim: None,
nullable: param.nullable,
},
}
}
/// Project a loaded stored query into its catalog entry (typed params,
/// MCP tool name, read/mutate flag, description/instruction).
pub fn query_catalog_entry(query: &StoredQuery) -> QueryCatalogEntry {
QueryCatalogEntry {
name: query.name.clone(),
tool_name: query.effective_tool_name().to_string(),
description: query.decl.description.clone(),
instruction: query.decl.instruction.clone(),
mutation: query.is_mutation(),
params: query.decl.params.iter().map(param_descriptor).collect(),
}
}
#[derive(Debug, Clone, Default, Serialize, Deserialize, ToSchema)]
pub struct SchemaApplyRequest {
/// Project schema in `.pg` source form. The diff against the current

View file

@ -9,6 +9,13 @@ use serde::{Deserialize, Serialize};
pub const DEFAULT_CONFIG_FILE: &str = "omnigraph.yaml";
pub fn graph_resource_id_for_selection(
selected_graph: Option<&str>,
normalized_uri: &str,
) -> String {
selected_graph.unwrap_or(normalized_uri).to_string()
}
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct ProjectConfig {
pub name: Option<String>,
@ -24,6 +31,14 @@ pub struct TargetConfig {
/// graph's HTTP-layer Cedar enforcement.
#[serde(default)]
pub policy: PolicySettings,
/// Per-graph stored-query registry: an inline `name -> entry`
/// map. Mirrors the per-graph `policy` shape — each
/// `graphs.<id>.queries` declares that graph's stored queries. Absent
/// (or empty) = no stored queries for the graph. v1 is inline-only;
/// an external `queries.yaml` manifest indirection is a deferred
/// convenience.
#[serde(default)]
pub queries: BTreeMap<String, QueryEntry>,
}
#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)]
@ -90,6 +105,50 @@ pub struct PolicySettings {
pub file: Option<String>,
}
/// One stored-query registry entry. The map **key** is the query's
/// identity — it must equal the `query <name>` symbol declared inside
/// the referenced `.gq` file (asserted when the registry loads).
/// Renaming the key (or the symbol) is a breaking change to callers, by
/// design.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct QueryEntry {
/// Path to the `.gq` file (relative to the config's `base_dir`). The
/// file may declare several queries; the registry selects the one
/// whose symbol matches the map key.
pub file: String,
#[serde(default)]
pub mcp: McpSettings,
}
/// MCP exposure for a stored query. A *deployment* concern (the same
/// `.gq` may be exposed in one graph and hidden in another), so it lives
/// in YAML rather than in the `.gq` source. **Default `expose: true`** —
/// declaring a query in the manifest *is* the opt-in, so it appears in the
/// MCP tool catalog (`GET /queries`) by default; set `expose: false` to
/// keep a query HTTP/service-callable but hidden from the agent tool list.
/// `expose` governs catalog membership only — it is **not** an
/// authorization gate (invocation is gated by `invoke_query`), so a hidden
/// query is still invocable by name with the right permission.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct McpSettings {
#[serde(default = "mcp_expose_default")]
pub expose: bool,
pub tool_name: Option<String>,
}
fn mcp_expose_default() -> bool {
true
}
impl Default for McpSettings {
fn default() -> Self {
Self {
expose: mcp_expose_default(),
tool_name: None,
}
}
}
#[derive(Debug, Clone, Copy, Eq, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum AliasCommand {
@ -137,6 +196,12 @@ pub struct OmnigraphConfig {
pub aliases: BTreeMap<String, AliasConfig>,
#[serde(default)]
pub policy: PolicySettings,
/// Top-level stored-query registry, used in single-graph
/// mode — mirrors how the top-level `policy` applies to the single
/// graph. In multi-graph mode this is unused; each graph's
/// `graphs.<id>.queries` applies instead.
#[serde(default)]
pub queries: BTreeMap<String, QueryEntry>,
#[serde(skip)]
base_dir: PathBuf,
}
@ -152,6 +217,7 @@ impl Default for OmnigraphConfig {
query: QueryDefaults::default(),
aliases: BTreeMap::new(),
policy: PolicySettings::default(),
queries: BTreeMap::new(),
base_dir: PathBuf::new(),
}
}
@ -244,6 +310,124 @@ impl OmnigraphConfig {
.map(|path| self.resolve_config_path(path))
}
/// The top-level stored-query registry entries (single-graph mode).
pub fn query_entries(&self) -> &BTreeMap<String, QueryEntry> {
&self.queries
}
/// The per-graph stored-query registry entries for a named target
/// (multi-graph mode). Returns `None` if the target is unknown.
pub fn target_query_entries(
&self,
target_name: &str,
) -> Option<&BTreeMap<String, QueryEntry>> {
self.graphs.get(target_name).map(|target| &target.queries)
}
/// The stored-query registry entries that apply for a graph
/// selection — the single definition of "which `queries:` block
/// governs graph X", shared by server boot and the CLI so the two
/// can't drift. A named graph present in `graphs:` uses its
/// per-graph block; everything else (no selection, or a name that is
/// not a known graph, e.g. a bare URI) falls back to the top-level
/// block (single-graph mode).
pub fn query_entries_for(&self, graph: Option<&str>) -> &BTreeMap<String, QueryEntry> {
match graph {
Some(name) if self.graphs.contains_key(name) => &self.graphs[name].queries,
_ => &self.queries,
}
}
/// The single CLI gate that turns a raw graph selection into a *validated*
/// one — the fallible counterpart to the infallible
/// [`OmnigraphConfig::query_entries_for`]. Both `queries` subcommands route
/// their selection through here so neither can skip a check the other (or
/// server boot) applies:
/// * a known name passes through, but only after the same coherence check
/// server boot enforces
/// ([`OmnigraphConfig::ensure_top_level_blocks_honored`]) — a named graph
/// with a populated top-level block is rejected;
/// * an unknown name errors with the **same** message
/// [`OmnigraphConfig::resolve_target_uri`] produces, so a command that
/// opens no URI rejects an unknown `--target` exactly like the
/// URI-resolving commands do;
/// * an anonymous selection (`None`, e.g. a bare URI) stays anonymous,
/// resolving to the top-level registry downstream (top-level honored).
pub fn resolve_graph_selection<'a>(&self, graph: Option<&'a str>) -> Result<Option<&'a str>> {
match graph {
Some(name) if self.graphs.contains_key(name) => {
self.ensure_top_level_blocks_honored(Some(name))?;
Ok(Some(name))
}
Some(name) => bail!("graph '{}' not found in {}", name, DEFAULT_CONFIG_FILE),
None => Ok(None),
}
}
pub fn resolve_policy_tooling_graph_selection(&self) -> Result<Option<&str>> {
self.resolve_graph_selection(self.cli_graph_name().or_else(|| self.server_graph_name()))
}
/// The policy file that applies for a graph selection — the policy
/// sibling of [`OmnigraphConfig::query_entries_for`], so policy and
/// queries resolve by the same identity rule. A named graph in
/// `graphs:` uses its per-graph `policy.file` with **no** top-level
/// fallback (a named graph with no per-graph policy has no policy —
/// that keeps the boot-time coherence check meaningful); anything else
/// (no selection, or a bare URI) uses the top-level `policy.file`.
pub fn resolve_policy_file_for(&self, graph: Option<&str>) -> Option<PathBuf> {
match graph {
Some(name) if self.graphs.contains_key(name) => self.resolve_target_policy_file(name),
_ => self.resolve_policy_file(),
}
}
/// Names of any top-level config blocks (`policy.file`, `queries:`)
/// that are populated. Used by the boot-time coherence check: when a
/// **named** graph is served (single-mode by name, or multi-mode),
/// the top-level blocks are not honored, so a populated one is a
/// configuration error rather than a silent no-op.
pub fn populated_top_level_blocks(&self) -> Vec<&'static str> {
let mut blocks = Vec::new();
if self.policy.file.is_some() {
blocks.push("policy.file");
}
if !self.queries.is_empty() {
blocks.push("queries");
}
blocks
}
/// A named graph uses its own `graphs.<name>` block, so a populated
/// top-level block would be silently ignored — a config error. The single
/// definition of that rule, shared by server boot and the CLI selection
/// gate ([`OmnigraphConfig::resolve_graph_selection`]) so the two can't
/// drift. An anonymous selection (`None`, e.g. a bare URI) legitimately
/// honors the top-level blocks, so it is never rejected here.
pub fn ensure_top_level_blocks_honored(&self, selected: Option<&str>) -> Result<()> {
if let Some(name) = selected {
let unhonored = self.populated_top_level_blocks();
if !unhonored.is_empty() {
bail!(
"named graph '{name}' uses its own `graphs.{name}.…` block, but top-level {} \
{} set and would be ignored. Move it to `graphs.{name}` (e.g. \
`graphs.{name}.policy.file`, `graphs.{name}.queries`).",
unhonored.join(" and "),
if unhonored.len() == 1 { "is" } else { "are" },
);
}
}
Ok(())
}
/// Resolve a stored-query `.gq` file path (from a registry entry),
/// relative to the config's `base_dir`. Mirrors policy-file
/// resolution; the registry loader calls this to turn each entry's
/// `file:` value into an absolute path.
pub fn resolve_query_file(&self, value: &str) -> PathBuf {
self.resolve_config_path(value)
}
/// Resolve the server-level policy file path (used by management
/// endpoints). Returns `None` if `server.policy.file` is not set.
pub fn resolve_server_policy_file(&self) -> Option<PathBuf> {
@ -387,7 +571,9 @@ mod tests {
use tempfile::tempdir;
use super::{ReadOutputFormat, TableCellLayout, load_config_in};
use super::{
ReadOutputFormat, TableCellLayout, graph_resource_id_for_selection, load_config_in,
};
#[test]
fn load_config_reads_yaml_defaults_from_current_dir() {
@ -451,6 +637,114 @@ policy: {}
assert!(config.graphs.is_empty());
}
#[test]
fn graph_resource_id_for_selection_uses_name_or_anonymous_uri() {
assert_eq!(
graph_resource_id_for_selection(Some("local"), "/tmp/graph.omni"),
"local"
);
assert_eq!(
graph_resource_id_for_selection(None, "/tmp/graph.omni"),
"/tmp/graph.omni"
);
}
#[test]
fn resolve_graph_selection_validates_membership_and_coherence() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"graphs:\n local:\n uri: ./demo.omni\n",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
// A known graph passes through unchanged.
assert_eq!(config.resolve_graph_selection(Some("local")).unwrap(), Some("local"));
// An anonymous selection stays anonymous (→ top-level registry downstream).
assert_eq!(config.resolve_graph_selection(None).unwrap(), None);
// An unknown name errors, naming the graph (matching resolve_target_uri).
let err = config.resolve_graph_selection(Some("ghost")).unwrap_err().to_string();
assert!(
err.contains("ghost") && err.contains("not found"),
"unknown graph must error naming it: {err}"
);
// Coherence: a named graph plus a populated top-level block is the
// config server boot refuses, so the gate rejects it too (shared rule
// via ensure_top_level_blocks_honored). An anonymous selection still
// passes — top-level is honored when no graph is named.
let temp2 = tempdir().unwrap();
fs::write(
temp2.path().join("omnigraph.yaml"),
"graphs:\n local:\n uri: ./demo.omni\npolicy:\n file: ./top.yaml\n",
)
.unwrap();
let incoherent = load_config_in(temp2.path(), None).unwrap();
let err = incoherent
.resolve_graph_selection(Some("local"))
.unwrap_err()
.to_string();
assert!(
err.contains("local") && err.contains("policy.file"),
"named graph + populated top-level block must be rejected, naming both: {err}"
);
assert_eq!(
incoherent.resolve_graph_selection(None).unwrap(),
None,
"anonymous selection still honors top-level"
);
}
#[test]
fn policy_tooling_graph_selection_prefers_cli_then_server_and_validates() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"graphs:\n local:\n uri: ./local.omni\n prod:\n uri: ./prod.omni\n\
server:\n graph: local\ncli:\n graph: prod\n",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
assert_eq!(
config.resolve_policy_tooling_graph_selection().unwrap(),
Some("prod")
);
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"graphs:\n local:\n uri: ./local.omni\nserver:\n graph: local\n",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
assert_eq!(
config.resolve_policy_tooling_graph_selection().unwrap(),
Some("local")
);
let temp = tempdir().unwrap();
fs::write(temp.path().join("omnigraph.yaml"), "policy: {}\n").unwrap();
let config = load_config_in(temp.path(), None).unwrap();
assert_eq!(config.resolve_policy_tooling_graph_selection().unwrap(), None);
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"graphs:\n local:\n uri: ./local.omni\nserver:\n graph: ghost\n",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
let err = config
.resolve_policy_tooling_graph_selection()
.unwrap_err()
.to_string();
assert!(
err.contains("ghost") && err.contains("not found"),
"unknown server.graph must use graph-selection validation: {err}"
);
}
#[test]
fn resolve_query_path_searches_config_roots() {
let temp = tempdir().unwrap();
@ -489,6 +783,118 @@ policy: {}
assert_eq!(resolved, config_dir.join("local.gq"));
}
#[test]
fn queries_block_round_trips_inline_and_per_graph() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
r#"
graphs:
prod:
uri: s3://bucket/prod
queries:
find_user:
file: ./queries/find_user.gq
mcp:
expose: true
tool_name: lookup_user
internal_audit:
file: ./queries/audit.gq
queries:
single_mode_q:
file: ./q.gq
"#,
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
// Per-graph registry (multi-graph mode).
let prod = config.target_query_entries("prod").unwrap();
assert_eq!(prod.len(), 2);
let find_user = &prod["find_user"];
assert_eq!(find_user.file, "./queries/find_user.gq");
assert!(find_user.mcp.expose);
assert_eq!(find_user.mcp.tool_name.as_deref(), Some("lookup_user"));
// Default exposure is true (the manifest entry is the opt-in); tool_name absent.
let audit = &prod["internal_audit"];
assert!(audit.mcp.expose);
assert!(audit.mcp.tool_name.is_none());
// Top-level registry (single-graph mode).
assert_eq!(config.query_entries().len(), 1);
// The shared selector resolves the same blocks the server boot
// and the CLI use: a known graph → its per-graph block; no
// selection or an unknown name → the top-level block (the latter
// pins the behavior of the CLI's now-deleted fallback arm).
assert_eq!(config.query_entries_for(Some("prod")).len(), 2);
assert_eq!(config.query_entries_for(None).len(), 1);
assert_eq!(config.query_entries_for(Some("nonexistent")).len(), 1);
// Path resolution joins against base_dir, like policy files.
assert_eq!(
config.resolve_query_file(&find_user.file),
temp.path().join("./queries/find_user.gq")
);
}
#[test]
fn resolve_policy_file_for_follows_identity() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"policy:\n file: ./top.yaml\ngraphs:\n prod:\n uri: s3://b/prod\n \
policy:\n file: ./prod.yaml\n bare:\n uri: s3://b/bare\n",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
// Named graph with its own policy → per-graph (not top-level).
assert!(
config
.resolve_policy_file_for(Some("prod"))
.unwrap()
.ends_with("prod.yaml")
);
// Named graph with NO per-graph policy → None (no top-level fallback;
// load-bearing for the boot coherence check).
assert!(config.resolve_policy_file_for(Some("bare")).is_none());
// Anonymous (bare URI) or an unknown name → top-level.
assert!(
config
.resolve_policy_file_for(None)
.unwrap()
.ends_with("top.yaml")
);
assert!(
config
.resolve_policy_file_for(Some("nope"))
.unwrap()
.ends_with("top.yaml")
);
}
#[test]
fn queries_block_absent_yields_empty_registry() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"graphs:\n local:\n uri: ./demo.omni\n",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
// Additive: no `queries:` anywhere → empty registries everywhere.
assert!(config.query_entries().is_empty());
assert!(
config
.target_query_entries("local")
.unwrap()
.is_empty()
);
}
#[test]
fn policy_block_accepts_non_empty_mapping() {
let temp = tempdir().unwrap();

View file

@ -4,6 +4,7 @@ pub mod config;
pub mod graph_id;
pub mod identity;
pub mod policy;
pub mod queries;
pub mod registry;
pub mod workload;
@ -11,6 +12,8 @@ pub use graph_id::GraphId;
pub use identity::{AuthSource, GraphKey, ResolvedActor, Scope, TenantId};
pub use registry::{GraphHandle, GraphRegistry, InsertError, RegistryLookup, RegistrySnapshot};
use crate::queries::{QueryRegistry, check, format_check_breakages};
use std::collections::{HashMap, HashSet};
use std::fs;
use std::io;
@ -22,7 +25,8 @@ use api::{
BranchCreateOutput, BranchCreateRequest, BranchDeleteOutput, BranchListOutput,
BranchMergeOutput, BranchMergeRequest, ChangeOutput, ChangeRequest, CommitListOutput,
CommitListQuery, ErrorCode, ErrorOutput, ExportRequest, GraphInfo, GraphListResponse,
HealthOutput, IngestOutput, IngestRequest, QueryRequest, ReadOutput, ReadRequest,
HealthOutput, IngestOutput, IngestRequest, InvokeStoredQueryRequest,
InvokeStoredQueryResponse, QueriesCatalogOutput, QueryRequest, ReadOutput, ReadRequest,
SchemaApplyOutput, SchemaApplyRequest, SchemaOutput, SnapshotQuery, ingest_output,
schema_apply_output, snapshot_payload,
};
@ -40,12 +44,13 @@ use color_eyre::eyre::{Result, WrapErr, bail};
pub use config::{
AliasCommand, AliasConfig, CliDefaults, DEFAULT_CONFIG_FILE, OmnigraphConfig, PolicySettings,
ProjectConfig, QueryDefaults, ReadOutputFormat, ServerDefaults, TableCellLayout, TargetConfig,
load_config,
graph_resource_id_for_selection, load_config,
};
use futures::stream;
use omnigraph::db::{Omnigraph, ReadTarget};
use omnigraph::error::{ManifestConflictDetails, ManifestErrorKind, OmniError};
use omnigraph::storage::normalize_root_uri;
use omnigraph_compiler::catalog::Catalog;
use omnigraph_compiler::json_params_to_param_map;
use omnigraph_compiler::query::parser::parse_query;
use omnigraph_compiler::{JsonParamMode, ParamMap};
@ -93,6 +98,8 @@ fn hash_bearer_token(token: &str) -> BearerTokenHash {
server_export,
#[allow(deprecated)] server_change,
server_mutate,
server_list_queries,
server_invoke_query,
server_schema_apply,
server_schema_get,
server_ingest,
@ -157,8 +164,16 @@ pub enum ServerConfigMode {
/// set to a named target.
Single {
uri: String,
/// Cedar graph resource id for the single graph. A named selection
/// uses the graph name; an anonymous URI uses the normalized URI to
/// preserve legacy single-graph policy identity.
graph_id: String,
/// Top-level `policy.file` (single-graph Cedar policy).
policy_file: Option<PathBuf>,
/// Top-level stored-query registry, loaded and identity-checked
/// at settings-build time; type-checked against the schema when
/// the engine opens.
queries: QueryRegistry,
},
/// Multi-graph invocation — `--config omnigraph.yaml` with a
/// non-empty `graphs:` map and no single-mode selector.
@ -185,6 +200,10 @@ pub struct GraphStartupConfig {
pub graph_id: String,
pub uri: String,
pub policy_file: Option<PathBuf>,
/// Per-graph stored-query registry, loaded and identity-checked at
/// settings-build time; type-checked against the schema when this
/// graph's engine opens.
pub queries: QueryRegistry,
}
/// Runtime routing for the server. Single mode = legacy
@ -285,7 +304,31 @@ impl AppState {
) -> Self {
let bearer_tokens = hash_bearer_tokens(bearer_tokens);
let per_graph_policy = policy_engine.map(Arc::new);
Self::build_single_mode(uri, db, bearer_tokens, per_graph_policy, Arc::new(workload))
Self::build_single_mode(uri, db, bearer_tokens, per_graph_policy, Arc::new(workload), None)
}
/// Like `new_single`, but attaches a pre-validated stored-query
/// registry. Private — the production single-mode boot path
/// (`open_single_with_queries`) is the only caller; every public
/// `new_*` constructor builds with no stored queries.
fn new_single_with_queries(
uri: String,
db: Omnigraph,
bearer_tokens: Vec<(String, String)>,
policy_engine: Option<PolicyEngine>,
workload: workload::WorkloadController,
queries: Option<Arc<QueryRegistry>>,
) -> Self {
let bearer_tokens = hash_bearer_tokens(bearer_tokens);
let per_graph_policy = policy_engine.map(Arc::new);
Self::build_single_mode(
uri,
db,
bearer_tokens,
per_graph_policy,
Arc::new(workload),
queries,
)
}
pub fn new(uri: String, db: Omnigraph) -> Self {
@ -377,6 +420,39 @@ impl AppState {
uri: impl Into<String>,
bearer_tokens: Vec<(String, String)>,
policy_file: Option<&PathBuf>,
) -> Result<Self> {
Self::open_single_with_queries(
uri,
bearer_tokens,
policy_file,
QueryRegistry::default(),
)
.await
}
/// Single-mode boot with a stored-query registry: open the engine,
/// **type-check the registry against the live schema and refuse to
/// start on a breakage** (same posture as bad policy YAML), log
/// non-blocking warnings, then attach the registry to the handle.
/// With an empty registry the check is a no-op and no registry is
/// attached — that is the path `open_with_bearer_tokens_and_policy`
/// (no stored queries) takes.
pub async fn open_single_with_queries(
uri: impl Into<String>,
bearer_tokens: Vec<(String, String)>,
policy_file: Option<&PathBuf>,
queries: QueryRegistry,
) -> Result<Self> {
Self::open_single_with_queries_for_graph_id(uri, bearer_tokens, policy_file, queries, None)
.await
}
async fn open_single_with_queries_for_graph_id(
uri: impl Into<String>,
bearer_tokens: Vec<(String, String)>,
policy_file: Option<&PathBuf>,
queries: QueryRegistry,
graph_id: Option<String>,
) -> Result<Self> {
// The "policy requires tokens" invariant is enforced once by
// `classify_server_runtime_state` in `serve()`, before either
@ -384,16 +460,24 @@ impl AppState {
// time we get here, the (policy, no-tokens) combination has
// already been rejected — no second bail needed.
let uri = normalize_root_uri(&uri.into()).wrap_err("normalize graph URI")?;
let graph_id = graph_id.unwrap_or_else(|| uri.clone());
let db = Omnigraph::open(&uri).await?;
// Validate the registry against the live schema and resolve it to
// an attachable handle (refuse boot on breakage).
let registry = validate_and_attach(queries, &db.catalog(), &graph_id)?;
let policy_engine = match policy_file {
Some(path) => Some(PolicyEngine::load_graph(path, &uri)?),
Some(path) => Some(PolicyEngine::load_graph(path, &graph_id)?),
None => None,
};
Ok(Self::new_with_bearer_tokens_and_policy(
Ok(Self::new_single_with_queries(
uri,
db,
bearer_tokens,
policy_engine,
workload::WorkloadController::from_env(),
registry,
))
}
@ -408,6 +492,7 @@ impl AppState {
bearer_tokens: Arc<[(BearerTokenHash, Arc<str>)]>,
policy_engine: Option<Arc<PolicyEngine>>,
workload: Arc<workload::WorkloadController>,
queries: Option<Arc<QueryRegistry>>,
) -> Self {
// Engine-layer policy gate (MR-722). With a per-graph policy
// installed, every `_as` writer on `Omnigraph` calls into the
@ -436,6 +521,7 @@ impl AppState {
uri,
engine: Arc::new(db),
policy: policy_engine,
queries,
});
Self {
routing: GraphRouting::Single { handle },
@ -750,6 +836,58 @@ pub fn init_tracing() {
let _ = tracing_subscriber::fmt().with_env_filter(filter).try_init();
}
/// Log each non-blocking advisory from a registry check report.
fn log_registry_warnings(label: &str, report: &queries::CheckReport) {
for warning in &report.warnings {
warn!(graph = label, query = %warning.query, "stored query: {}", warning.message);
}
}
fn validate_registry_against_catalog(
registry: &QueryRegistry,
catalog: &Catalog,
label: &str,
) -> omnigraph::error::Result<()> {
let report = check(registry, catalog);
if report.has_breakages() {
return Err(OmniError::manifest(format_check_breakages(label, &report)));
}
log_registry_warnings(label, &report);
Ok(())
}
/// Validate a loaded stored-query registry against the live schema and
/// resolve it to an attachable handle. Refuses boot on any breakage
/// (same posture as bad policy YAML), logs the non-blocking warnings,
/// and collapses an empty registry to `None` (nothing attached). This is
/// the single gate every open path funnels through, so no opener can
/// attach a registry that has not been schema-checked. `label` names the
/// graph in messages.
fn validate_and_attach(
queries: QueryRegistry,
catalog: &Catalog,
label: &str,
) -> Result<Option<Arc<QueryRegistry>>> {
validate_registry_against_catalog(&queries, catalog, label)
.map_err(|err| color_eyre::eyre::eyre!(err.to_string()))?;
Ok(if queries.is_empty() {
None
} else {
Some(Arc::new(queries))
})
}
/// Format every load error (parse / identity failure) into a multi-line
/// boot-abort message.
fn format_registry_load_errors(label: &str, errors: &[queries::LoadError]) -> String {
let joined = errors
.iter()
.map(|e| e.to_string())
.collect::<Vec<_>>()
.join("\n ");
format!("graph '{label}': stored-query registry failed to load:\n {joined}")
}
pub fn load_server_settings(
config_path: Option<&PathBuf>,
cli_uri: Option<String>,
@ -799,15 +937,43 @@ pub fn load_server_settings(
let uri = normalize_root_uri(&raw_uri).wrap_err_with(|| {
format!("normalize single-graph URI '{raw_uri}' from server settings")
})?;
let policy_file = config.resolve_policy_file();
ServerConfigMode::Single { uri, policy_file }
// Config follows graph IDENTITY, not mode: a bare URI is anonymous
// (top-level config); a graph chosen by name uses its per-graph
// `graphs.<name>.{policy,queries}`. `resolve_target_uri` already
// errored on an unknown name, so a `Some(name)` here is a known graph.
let selected: Option<&str> = if has_cli_uri {
None
} else {
cli_target.as_deref().or_else(|| config.server_graph_name())
};
// A named selection must not leave a populated top-level block
// silently unused — refuse boot and point at the per-graph block. The
// same rule the CLI selection gate enforces, shared via one helper so
// the boot check and `omnigraph queries validate`/`list` can't drift.
config.ensure_top_level_blocks_honored(selected)?;
// Load + identity-check now (no engine needed); the schema
// type-check happens when the engine opens.
let policy_file = config.resolve_policy_file_for(selected);
let queries = QueryRegistry::load(&config, config.query_entries_for(selected))
.map_err(|errs| color_eyre::eyre::eyre!(format_registry_load_errors(&uri, &errs)))?;
let graph_id = graph_resource_id_for_selection(selected, &uri);
ServerConfigMode::Single {
uri,
graph_id,
policy_file,
queries,
}
} else if has_explicit_config && has_graphs_map {
if config.resolve_policy_file().is_some() {
// Multi mode: every graph uses its per-graph block; top-level
// policy/queries are never honored, so a populated one is an error.
let unhonored = config.populated_top_level_blocks();
if !unhonored.is_empty() {
bail!(
"top-level `policy.file` is single-graph/CLI-local policy only; \
in multi-graph mode move per-graph rules to \
`graphs.<graph_id>.policy.file` and move `graph_list` rules to \
`server.policy.file`."
"multi-graph mode: top-level {} {} not honored — each graph uses its own \
`graphs.<graph_id>.` block. Move per-graph rules there (and any \
`graph_list` policy to `server.policy.file`).",
unhonored.join(" and "),
if unhonored.len() == 1 { "is" } else { "are" },
);
}
// Rule 4 → Multi mode. Build a startup config per graph.
@ -823,10 +989,17 @@ pub fn load_server_settings(
let uri = normalize_root_uri(&raw_uri).wrap_err_with(|| {
format!("normalize URI '{raw_uri}' for graph '{name}' in omnigraph.yaml")
})?;
// Per-graph `queries:`, selected through the shared
// `query_entries_for` so server and CLI resolve identically.
// Load + identity-check now; the schema type-check happens
// when this graph's engine opens.
let queries = QueryRegistry::load(&config, config.query_entries_for(Some(name.as_str())))
.map_err(|errs| color_eyre::eyre::eyre!(format_registry_load_errors(name, &errs)))?;
graphs.push(GraphStartupConfig {
graph_id: name.clone(),
uri,
policy_file: config.resolve_target_policy_file(name),
queries,
});
}
let config_path = config_path
@ -949,6 +1122,8 @@ pub fn build_app(state: AppState) -> Router {
server_change
}))
.route("/mutate", post(server_mutate))
.route("/queries", get(server_list_queries))
.route("/queries/{name}", post(server_invoke_query))
.route("/schema", get(server_schema_get))
.route("/schema/apply", post(server_schema_apply))
.route(
@ -1046,10 +1221,28 @@ pub async fn serve(config: ServerConfig) -> Result<()> {
let bind = config.bind.clone();
let state = match config.mode {
ServerConfigMode::Single { uri, policy_file } => {
ServerConfigMode::Single {
uri,
graph_id,
policy_file,
queries,
} => {
let uri_for_log = uri.clone();
info!(uri = %uri_for_log, bind = %bind, mode = "single", "serving omnigraph");
AppState::open_with_bearer_tokens_and_policy(uri, tokens, policy_file.as_ref()).await?
info!(
uri = %uri_for_log,
graph_id = %graph_id,
bind = %bind,
mode = "single",
"serving omnigraph"
);
AppState::open_single_with_queries_for_graph_id(
uri,
tokens,
policy_file.as_ref(),
queries,
Some(graph_id),
)
.await?
}
ServerConfigMode::Multi {
graphs,
@ -1131,6 +1324,12 @@ async fn open_single_graph(cfg: GraphStartupConfig) -> Result<Arc<GraphHandle>>
.await
.map_err(|err| color_eyre::eyre::eyre!("open graph '{}' at {}: {err}", graph_id, uri))?;
// Validate this graph's stored queries against the live schema and
// resolve them to an attachable handle (refuse boot on breakage).
// Done before the policy match rebinds `db`; the catalog handle is an
// owned `Arc`, so no borrow of `db` survives into the match.
let queries = validate_and_attach(cfg.queries, &db.catalog(), graph_id.as_str())?;
let (policy_arc, db) = match &cfg.policy_file {
Some(path) => {
let policy = PolicyEngine::load_graph(path, graph_id.as_str())?;
@ -1146,6 +1345,7 @@ async fn open_single_graph(cfg: GraphStartupConfig) -> Result<Arc<GraphHandle>>
uri,
engine: Arc::new(db),
policy: policy_arc,
queries,
}))
}
@ -1479,7 +1679,21 @@ fn log_policy_decision(actor_id: &str, request: &PolicyRequest, decision: &Polic
);
}
/// HTTP-layer Cedar policy gate. Two sources of the policy engine:
/// The allow/deny **decision** an authorization check produces, kept
/// separate from the operational failures (`Err`) that can occur while
/// computing it. [`authorize_request`] collapses `Denied` to a 403; a caller
/// that needs to remap a denial without also remapping operational failures
/// (the stored-query invoke handler hides a denial as a 404) matches on this
/// directly, so a real 401 (missing bearer) or 500 (policy-evaluation error)
/// keeps its true status instead of being masked as the denial's response.
enum Authz {
Allowed,
Denied(String),
}
/// HTTP-layer Cedar policy gate, returning the allow/deny [`Authz`] decision
/// and reserving `Err` for operational failures (401 missing bearer, 500
/// policy-evaluation error). Two sources of the policy engine:
/// * Per-graph handler — passes `handle.policy.as_deref()` so the
/// graph's Cedar rules govern read/change/branch_*/schema_apply.
/// * Management handler — passes `state.server_policy.as_deref()` so
@ -1493,11 +1707,11 @@ fn log_policy_decision(actor_id: &str, request: &PolicyRequest, decision: &Polic
/// dropped from the type), so handlers cannot smuggle it through the
/// request. See `actor_id_resolves_from_bearer_token_ignoring_client_supplied_headers`
/// at `tests/server.rs`.
fn authorize_request(
fn authorize(
actor: Option<&ResolvedActor>,
policy: Option<&PolicyEngine>,
request: PolicyRequest,
) -> std::result::Result<(), ApiError> {
) -> std::result::Result<Authz, ApiError> {
let Some(engine) = policy else {
// No PolicyEngine installed. Three runtime states can reach this:
//
@ -1524,21 +1738,23 @@ fn authorize_request(
// operator's only path to enabling it is configuring an
// explicit `server.policy.file` in omnigraph.yaml.
if request.action.resource_kind() == PolicyResourceKind::Server {
return Err(ApiError::forbidden(
return Ok(Authz::Denied(
"server-scoped actions require an explicit `server.policy.file` \
configured in omnigraph.yaml the management surface is closed \
by default in every runtime state, including --unauthenticated, \
so that server topology is never exposed without operator opt-in.",
so that server topology is never exposed without operator opt-in."
.to_string(),
));
}
if actor.is_some() && request.action != PolicyAction::Read {
return Err(ApiError::forbidden(
return Ok(Authz::Denied(
"server runs in default-deny mode (bearer tokens configured but no \
policy file). Only `read` actions are permitted; configure \
`policy.file` in omnigraph.yaml to enable other actions.",
`policy.file` in omnigraph.yaml to enable other actions."
.to_string(),
));
}
return Ok(());
return Ok(Authz::Allowed);
};
let Some(actor) = actor else {
return Err(ApiError::unauthorized("missing bearer token"));
@ -1560,9 +1776,26 @@ fn authorize_request(
.map_err(|err| ApiError::internal(format!("policy: {err}")))?;
log_policy_decision(actor_id, &request, &decision);
if decision.allowed {
Ok(())
Ok(Authz::Allowed)
} else {
Err(ApiError::forbidden(decision.message))
Ok(Authz::Denied(decision.message))
}
}
/// Thin wrapper over [`authorize`] for the handlers that treat any denial as a
/// 403: a denial becomes `ApiError::forbidden`, and operational failures
/// (401 missing bearer, 500 policy-evaluation error) propagate unchanged. The
/// stored-query invoke handler does **not** use this — it consumes the
/// [`Authz`] decision directly to hide a denial as a 404 while letting an
/// operational failure keep its true status.
fn authorize_request(
actor: Option<&ResolvedActor>,
policy: Option<&PolicyEngine>,
request: PolicyRequest,
) -> std::result::Result<(), ApiError> {
match authorize(actor, policy, request)? {
Authz::Allowed => Ok(()),
Authz::Denied(message) => Err(ApiError::forbidden(message)),
}
}
@ -2001,6 +2234,194 @@ async fn server_mutate(
))
}
/// Path parameter for `POST /queries/{name}`.
#[derive(Deserialize)]
struct QueryNamePath {
name: String,
}
fn parse_optional_invoke_body(
body: Bytes,
) -> std::result::Result<InvokeStoredQueryRequest, ApiError> {
if body.is_empty() {
return Ok(InvokeStoredQueryRequest::default());
}
serde_json::from_slice::<Option<InvokeStoredQueryRequest>>(&body)
.map(|request| request.unwrap_or_default())
.map_err(|err| {
ApiError::bad_request(format!("invalid stored-query invocation body: {err}"))
})
}
#[utoipa::path(
post,
path = "/queries/{name}",
tag = "queries",
operation_id = "invoke_query",
params(("name" = String, Path, description = "Stored query name (the registry key)")),
request_body = Option<InvokeStoredQueryRequest>,
responses(
(status = 200, description = "Read envelope (ReadOutput) or mutation envelope (ChangeOutput), serialized untagged", body = InvokeStoredQueryResponse),
(status = 400, description = "Bad request (param type error; snapshot on a stored mutation)", body = ErrorOutput),
(status = 401, description = "Unauthorized", body = ErrorOutput),
(status = 403, description = "Forbidden (the inner `change` gate for a stored mutation)", body = ErrorOutput),
(status = 404, description = "Unknown stored query, or `invoke_query` denied — indistinguishable to a caller without the grant", body = ErrorOutput),
(status = 409, description = "Merge conflict", body = ErrorOutput),
(status = 429, description = "Per-actor admission cap exceeded; honor `Retry-After` header", body = ErrorOutput),
(status = 500, description = "Policy evaluation error (a denial is reported as 404, not 500)", body = ErrorOutput),
),
security(("bearer_token" = [])),
)]
/// Invoke a curated, server-side stored query by name.
///
/// The query source comes from the graph's `queries:` registry, not the
/// request body — callers send only runtime inputs (`params`, `branch`,
/// `snapshot`). Gated by the `invoke_query` Cedar action at the boundary;
/// a stored *mutation* additionally passes the engine's `change` gate
/// (double-gated). An actor **without** `invoke_query` cannot tell a denied
/// query from a missing one — both return the same 404, so the catalog
/// can't be probed without the grant. Once `invoke_query` is held, the
/// inner `read`/`change` gate may surface a 403 for an existing query the
/// actor can't run (the intended double-gate signal).
async fn server_invoke_query(
State(state): State<AppState>,
Extension(handle): Extension<Arc<GraphHandle>>,
actor: Option<Extension<ResolvedActor>>,
Path(QueryNamePath { name }): Path<QueryNamePath>,
body: Bytes,
) -> std::result::Result<Json<InvokeStoredQueryResponse>, ApiError> {
let req = parse_optional_invoke_body(body)?;
// A caller without `invoke_query` can't tell a denial from a missing
// query: both 404 with this exact message, so the catalog can't be
// probed without the grant. (A caller that holds invoke_query may still
// see the inner gate's 403 for an existing query it can't run — intended.)
const NOT_FOUND: &str = "stored query not found";
let actor_ref = actor.as_ref().map(|Extension(actor)| actor);
// Boundary gate (authentication already ran in `require_bearer_auth`).
// A denial is hidden as 404 (deny == missing, so the catalog can't be
// probed without the grant), but operational failures (401 missing bearer,
// 500 policy-evaluation error) propagate with their true status via `?`
// rather than being masked as a missing query.
match authorize(
actor_ref,
handle.policy.as_deref(),
PolicyRequest {
action: PolicyAction::InvokeQuery,
// Graph-scoped: no branch dimension. The per-branch/snapshot
// access is enforced by the inner read/change gate in the
// runner, so the outer gate must not resolve a branch (doing so
// was wrong for snapshot reads).
branch: None,
target_branch: None,
},
)? {
Authz::Allowed => {}
Authz::Denied(_) => return Err(ApiError::not_found(NOT_FOUND)),
}
// Resolve against the per-graph registry (same 404 on a miss).
let stored = handle
.queries
.as_ref()
.and_then(|registry| registry.lookup(&name))
.ok_or_else(|| ApiError::not_found(NOT_FOUND))?;
// Detach what we need before `handle` moves into the runner — the
// registry borrow lives inside `handle`.
let source = Arc::clone(&stored.source);
let query_name = stored.name.clone();
let is_mutation = stored.is_mutation();
info!(
graph = %handle.uri,
actor = ?actor_ref.map(|a| a.actor_id.as_ref()),
query = %query_name,
kind = if is_mutation { "mutate" } else { "read" },
"stored query invoked"
);
if is_mutation {
if req.snapshot.is_some() {
return Err(ApiError::bad_request(
"stored mutation cannot target a snapshot",
));
}
let branch = req.branch.unwrap_or_else(|| "main".to_string());
let output = run_mutate(
state,
handle,
actor_ref,
&source,
Some(&query_name),
req.params.as_ref(),
branch,
)
.await?;
Ok(Json(InvokeStoredQueryResponse::Change(output)))
} else {
let (selected, target, result) = run_query(
handle,
actor_ref,
&source,
Some(&query_name),
req.params.as_ref(),
req.branch,
req.snapshot,
true,
)
.await?;
Ok(Json(InvokeStoredQueryResponse::Read(api::read_output(
selected, &target, result,
))))
}
}
#[utoipa::path(
get,
path = "/queries",
tag = "queries",
operation_id = "list_queries",
responses(
(status = 200, description = "Stored-query catalog (the mcp.expose subset, with typed params)", body = QueriesCatalogOutput),
(status = 401, description = "Unauthorized", body = ErrorOutput),
(status = 403, description = "Forbidden", body = ErrorOutput),
),
security(("bearer_token" = [])),
)]
/// List the graph's exposed stored queries as a typed tool catalog.
///
/// Returns the `mcp.expose == true` subset of the `queries:` registry, each
/// with its MCP tool name, read/mutate flag, description/instruction, and
/// typed parameters — enough for a client to register them as tools without
/// fetching `.gq` source. Read-gated; the catalog is graph-wide (branch
/// independent — `read` is authorized against `main`). **Not** Cedar-filtered
/// per query yet, so it can list a query whose `invoke_query` the caller
/// lacks (a known gap until per-query authorization lands).
async fn server_list_queries(
Extension(handle): Extension<Arc<GraphHandle>>,
actor: Option<Extension<ResolvedActor>>,
) -> std::result::Result<Json<QueriesCatalogOutput>, ApiError> {
authorize_request(
actor.as_ref().map(|Extension(actor)| actor),
handle.policy.as_deref(),
PolicyRequest {
action: PolicyAction::Read,
branch: Some("main".to_string()),
target_branch: None,
},
)?;
let queries = match handle.queries.as_ref() {
Some(registry) => registry
.iter()
.filter(|q| q.expose)
.map(api::query_catalog_entry)
.collect(),
None => Vec::new(),
};
Ok(Json(QueriesCatalogOutput { queries }))
}
#[utoipa::path(
get,
path = "/schema",
@ -2088,18 +2509,26 @@ async fn server_schema_apply(
.map_err(ApiError::from_workload_reject)?;
let result = {
let db = &handle.engine;
let registry = handle.queries.as_deref();
let label = handle.key.graph_id.as_str().to_string();
// Engine-layer policy enforcement (MR-722): pass the resolved
// actor through so apply_schema_as can call enforce() with the
// authoritative identity. With a policy installed in AppState,
// engine-side enforcement re-checks the same decision the
// HTTP-layer authorize_request just made above. PR #3 collapses
// the redundancy.
db.apply_schema_as(
db.apply_schema_as_with_catalog_check(
&request.schema_source,
omnigraph::db::SchemaApplyOptions {
allow_data_loss: request.allow_data_loss,
},
actor_id,
|catalog| {
if let Some(registry) = registry {
validate_registry_against_catalog(registry, catalog, &label)?;
}
Ok(())
},
)
.await
.map_err(ApiError::from_omni)?
@ -2658,12 +3087,133 @@ mod tests {
use std::fs;
use tempfile::tempdir;
/// `authorize` returns the allow/deny **decision** (`Authz`) and reserves
/// `Err` for operational failures, so the invoke handler can hide a denial
/// as 404 without also masking a 401/500. Pins each outcome.
#[test]
fn authorize_splits_decision_from_operational_error() {
use super::{Authz, PolicyAction, PolicyCompiler, PolicyConfig, PolicyRequest, ResolvedActor, authorize};
use std::sync::Arc;
fn req(action: PolicyAction) -> PolicyRequest {
PolicyRequest { action, branch: None, target_branch: None }
}
let actor = ResolvedActor::cluster_static(Arc::from("act-alice"));
// --- No policy engine installed (open / default-deny modes) ---
// A server-scoped action is denied in every no-policy state.
assert!(matches!(
authorize(Some(&actor), None, req(PolicyAction::GraphList)).unwrap(),
Authz::Denied(_)
));
// Authenticated actor + a non-read per-graph action → default-deny.
assert!(matches!(
authorize(Some(&actor), None, req(PolicyAction::Change)).unwrap(),
Authz::Denied(_)
));
// `read` is the one per-graph action permitted without a policy.
assert!(matches!(
authorize(Some(&actor), None, req(PolicyAction::Read)).unwrap(),
Authz::Allowed
));
// Open mode (no actor, no policy) → allowed.
assert!(matches!(
authorize(None, None, req(PolicyAction::Read)).unwrap(),
Authz::Allowed
));
// --- Policy engine installed ---
let policy: PolicyConfig = serde_yaml::from_str(
"version: 1\n\
groups:\n team: [act-alice]\n\
rules:\n - id: team-read\n allow:\n actors: { group: team }\n actions: [read]\n branch_scope: any\n",
)
.unwrap();
let engine = PolicyCompiler::compile(&policy, "graph").unwrap();
// A matched allow rule → Allowed.
assert!(matches!(
authorize(
Some(&actor),
Some(&engine),
PolicyRequest { action: PolicyAction::Read, branch: Some("main".to_string()), target_branch: None },
)
.unwrap(),
Authz::Allowed
));
// Known actor, no matching allow rule → Denied, carrying the decision message.
match authorize(
Some(&actor),
Some(&engine),
PolicyRequest { action: PolicyAction::Change, branch: Some("main".to_string()), target_branch: None },
)
.unwrap()
{
Authz::Denied(message) => assert!(!message.is_empty(), "a deny carries its decision message"),
Authz::Allowed => panic!("change must be denied: only read is allowed"),
}
// Policy installed but no actor → operational failure (`Err`), NOT a
// decision. This is the split that keeps a 401/500 from being masked
// as the denial's response in the invoke handler.
assert!(
authorize(None, Some(&engine), req(PolicyAction::Read)).is_err(),
"a missing actor with a policy installed is an operational error, not a deny"
);
}
#[test]
fn hash_bearer_token_produces_32_byte_output() {
let hash = hash_bearer_token("any-token");
assert_eq!(hash.len(), 32);
}
/// The single gate both open paths funnel through: it refuses a
/// schema breakage (naming the graph label + query), attaches a clean
/// registry, and collapses an empty one to `None`. Pure over its args
/// (no engine), so it covers the multi-graph path's logic too — the
/// only per-path difference is the `label`, asserted here.
#[test]
fn validate_and_attach_gates_on_schema_and_collapses_empty() {
use crate::queries::{QueryRegistry, RegistrySpec};
use omnigraph_compiler::catalog::build_catalog;
use omnigraph_compiler::schema::parser::parse_schema;
let schema = parse_schema("node User {\nname: String\n}\n").unwrap();
let catalog = build_catalog(&schema).unwrap();
let spec = |name: &str, source: &str| RegistrySpec {
name: name.to_string(),
source: source.to_string(),
expose: false,
tool_name: None,
};
// Empty registry → nothing attached, no error.
let empty =
super::validate_and_attach(QueryRegistry::default(), &catalog, "g").unwrap();
assert!(empty.is_none());
// A query that type-checks → attached.
let ok = QueryRegistry::from_specs(vec![spec(
"find_user",
"query find_user() { match { $u: User } return { $u.name } }",
)])
.unwrap();
assert!(super::validate_and_attach(ok, &catalog, "g").unwrap().is_some());
// A query referencing a type the schema lacks → boot refusal that
// names both the graph label and the offending query.
let broken = QueryRegistry::from_specs(vec![spec(
"ghost",
"query ghost() { match { $w: Widget } return { $w.name } }",
)])
.unwrap();
let err = super::validate_and_attach(broken, &catalog, "graph-x").unwrap_err();
let msg = err.to_string();
assert!(msg.contains("graph-x"), "labels the graph: {msg}");
assert!(msg.contains("ghost"), "names the query: {msg}");
assert!(msg.contains("schema check"), "mentions the schema check: {msg}");
}
#[test]
fn hash_bearer_token_is_deterministic() {
assert_eq!(
@ -2707,7 +3257,10 @@ server:
let settings = load_server_settings(Some(&config), None, None, None, false).unwrap();
match &settings.mode {
ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/demo.omni"),
ServerConfigMode::Single { uri, graph_id, .. } => {
assert_eq!(uri, "/tmp/demo.omni");
assert_eq!(graph_id, "local");
}
ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"),
}
assert_eq!(settings.bind, "0.0.0.0:9090");
@ -2739,7 +3292,10 @@ server:
)
.unwrap();
match &settings.mode {
ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/override.omni"),
ServerConfigMode::Single { uri, graph_id, .. } => {
assert_eq!(uri, "/tmp/override.omni");
assert_eq!(graph_id, "/tmp/override.omni");
}
ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"),
}
assert_eq!(settings.bind, "0.0.0.0:9999");
@ -2768,7 +3324,10 @@ server:
load_server_settings(Some(&config), None, Some("dev".to_string()), None, false)
.unwrap();
match &settings.mode {
ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "http://127.0.0.1:8080"),
ServerConfigMode::Single { uri, graph_id, .. } => {
assert_eq!(uri, "http://127.0.0.1:8080");
assert_eq!(graph_id, "dev");
}
ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"),
}
}
@ -2848,6 +3407,7 @@ server:
.to_string_lossy()
.into_owned(),
policy_file: None,
queries: crate::queries::QueryRegistry::default(),
}],
config_path: temp.path().join("omnigraph.yaml"),
server_policy_file: Some(policy_path),
@ -2895,7 +3455,9 @@ server:
.join("graph.omni")
.to_string_lossy()
.into_owned(),
graph_id: "default".to_string(),
policy_file: None,
queries: crate::queries::QueryRegistry::default(),
},
bind: "127.0.0.1:0".to_string(),
allow_unauthenticated: false,

View file

@ -0,0 +1,688 @@
//! Stored-query registry.
//!
//! A server-side registry of named, parameter-typed `.gq` queries that
//! operators declare in `omnigraph.yaml` (per-graph, or top-level in
//! single mode) and the server loads at startup. Each entry is parsed
//! and its identity asserted here (`load`); type-checking against the
//! live schema happens separately (a `check` pass) so the loader stays
//! callable without an open engine (the CLI's offline `queries check`).
//!
//! Identity is the query **name**: the manifest key must equal the
//! `query <name>` symbol declared in the referenced `.gq` file. The two
//! are asserted equal at load — one name, two places that must agree.
//! Renaming either is a breaking change to callers, by design.
use std::collections::BTreeMap;
use std::fs;
use std::sync::Arc;
use omnigraph_compiler::catalog::Catalog;
use omnigraph_compiler::query::ast::QueryDecl;
use omnigraph_compiler::query::parser::parse_query;
use omnigraph_compiler::query::typecheck::typecheck_query_decl;
use omnigraph_compiler::types::{PropType, ScalarType};
use crate::config::{OmnigraphConfig, QueryEntry};
/// One loaded stored query. `source` is the full `.gq` file text — the
/// invocation handler hands it to `run_query` / `run_mutate` verbatim,
/// which reuse the same parse/IR/exec path as the inline routes (no
/// parallel implementation).
#[derive(Debug, Clone)]
pub struct StoredQuery {
/// Identity: manifest key == `query <name>` symbol.
pub name: String,
/// Full `.gq` source text the query was selected from.
pub source: Arc<str>,
/// Parsed declaration (params, mutations, description, …).
pub decl: QueryDecl,
/// Whether this query is listed in the MCP tool catalog (`GET /queries`).
/// Default `true` (the manifest entry is the opt-in); `expose: false`
/// keeps it HTTP/service-callable but hidden from the agent tool list.
/// Catalog membership only — not an authorization gate.
pub expose: bool,
/// Optional MCP tool-name override; defaults to `name`.
pub tool_name: Option<String>,
}
impl StoredQuery {
/// `true` if the selected declaration contains insert/update/delete
/// statements — drives read-vs-mutate routing at invocation time.
pub fn is_mutation(&self) -> bool {
!self.decl.mutations.is_empty()
}
/// The MCP tool name this query is catalogued under: the explicit
/// `tool_name` override, else the query `name`. The catalog key —
/// enforced unique across exposed queries at load. Server-side
/// consumers (the uniqueness check, the future catalog projection) read
/// this; the CLI `queries list` resolves the same rule on its own DTO.
pub fn effective_tool_name(&self) -> &str {
self.tool_name.as_deref().unwrap_or(&self.name)
}
}
/// A loaded, identity-checked stored-query registry for one graph.
#[derive(Debug, Clone, Default)]
pub struct QueryRegistry {
by_name: BTreeMap<String, StoredQuery>,
}
/// In-memory registry entry before file I/O. Used by [`QueryRegistry::load`]
/// (after reading each `.gq` from disk) and directly by tests.
#[derive(Debug, Clone)]
pub struct RegistrySpec {
pub name: String,
pub source: String,
pub expose: bool,
pub tool_name: Option<String>,
}
/// A single registry load failure. Collected (not fail-fast) so a bad
/// `omnigraph.yaml` surfaces every broken entry at once, matching the
/// bad-policy-YAML posture.
#[derive(Debug, Clone)]
pub struct LoadError {
/// The offending query name, when the failure is entry-scoped.
pub query: Option<String>,
pub message: String,
}
impl std::fmt::Display for LoadError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match &self.query {
Some(name) => write!(f, "stored query '{name}': {}", self.message),
None => write!(f, "stored query registry: {}", self.message),
}
}
}
impl QueryRegistry {
/// Build a registry from in-memory specs: parse each source, select
/// the declaration whose symbol equals the manifest key, and assert
/// they agree. Collects every failure. No schema type-checking here
/// — that is [`check`].
pub fn from_specs(specs: Vec<RegistrySpec>) -> Result<Self, Vec<LoadError>> {
let mut by_name = BTreeMap::new();
let mut errors = Vec::new();
for spec in specs {
match parse_query(&spec.source) {
Ok(file) => {
match file.queries.into_iter().find(|q| q.name == spec.name) {
Some(decl) => {
by_name.insert(
spec.name.clone(),
StoredQuery {
name: spec.name,
source: Arc::from(spec.source),
decl,
expose: spec.expose,
tool_name: spec.tool_name,
},
);
}
None => errors.push(LoadError {
query: Some(spec.name.clone()),
message: format!(
"no `query {}` declaration found in its `.gq` file \
(the registry key must match the query symbol)",
spec.name
),
}),
}
}
Err(err) => errors.push(LoadError {
query: Some(spec.name),
message: format!("parse error: {err}"),
}),
}
}
// Exposed queries are catalogued under their effective tool name;
// two claiming one name is an MCP-namespace collision. Refuse it at
// load (collected, not fail-fast), naming the loser and the winner.
// Iterating the `BTreeMap` makes the winner deterministic (the
// lexicographically-first query name; config is a map, so YAML
// declaration order isn't preserved anyway) and the error order
// stable. Scoped to a block so these borrows of `by_name` end
// before it is moved into `Self`.
{
let mut claimed: BTreeMap<&str, &str> = BTreeMap::new();
for query in by_name.values().filter(|q| q.expose) {
let tool = query.effective_tool_name();
if let Some(winner) = claimed.insert(tool, &query.name) {
errors.push(LoadError {
query: Some(query.name.clone()),
message: format!(
"MCP tool name '{tool}' already claimed by exposed query '{winner}'"
),
});
}
}
}
if errors.is_empty() {
Ok(Self { by_name })
} else {
Err(errors)
}
}
/// Read each registry entry's `.gq` file from disk and build the
/// registry. `entries` is either the top-level `queries` map (single
/// mode) or a graph's `queries` map (multi mode); `config` resolves
/// each entry's relative `file:` path against `base_dir`.
pub fn load(
config: &OmnigraphConfig,
entries: &BTreeMap<String, QueryEntry>,
) -> Result<Self, Vec<LoadError>> {
let mut specs = Vec::with_capacity(entries.len());
let mut errors = Vec::new();
for (name, entry) in entries {
let path = config.resolve_query_file(&entry.file);
match fs::read_to_string(&path) {
Ok(source) => specs.push(RegistrySpec {
name: name.clone(),
source,
expose: entry.mcp.expose,
tool_name: entry.mcp.tool_name.clone(),
}),
Err(err) => errors.push(LoadError {
query: Some(name.clone()),
message: format!("cannot read '{}': {err}", path.display()),
}),
}
}
// Parse/identity/uniqueness-check the readable specs even when some
// files failed to read, so every broken entry (I/O, parse, identity,
// tool-name collision) surfaces in one pass rather than one per
// restart. I/O errors come first (in `entries` key order), then the
// spec errors. A non-empty `errors` always fails the load.
match Self::from_specs(specs) {
Ok(registry) if errors.is_empty() => Ok(registry),
Ok(_) => Err(errors),
Err(spec_errors) => {
errors.extend(spec_errors);
Err(errors)
}
}
}
pub fn lookup(&self, name: &str) -> Option<&StoredQuery> {
self.by_name.get(name)
}
pub fn iter(&self) -> impl Iterator<Item = &StoredQuery> {
self.by_name.values()
}
pub fn is_empty(&self) -> bool {
self.by_name.is_empty()
}
pub fn len(&self) -> usize {
self.by_name.len()
}
}
/// A stored query that fails to type-check against the live schema —
/// e.g. it references a node/edge type or property that was renamed or
/// removed by a migration. Breakages **block server boot** (same posture
/// as bad policy YAML), surfacing schema drift at the deploy boundary
/// rather than silently at invocation time.
#[derive(Debug, Clone)]
pub struct Breakage {
pub query: String,
pub message: String,
}
/// A non-blocking advisory found during validation. Logged at boot;
/// never blocks startup. Currently: an MCP-exposed query that declares a
/// parameter an agent cannot realistically supply.
#[derive(Debug, Clone)]
pub struct Warning {
pub query: String,
pub message: String,
}
/// Outcome of validating a registry against a schema. Breakages are
/// fatal (boot refuses); warnings are advisory.
#[derive(Debug, Clone, Default)]
pub struct CheckReport {
pub breakages: Vec<Breakage>,
pub warnings: Vec<Warning>,
}
impl CheckReport {
pub fn has_breakages(&self) -> bool {
!self.breakages.is_empty()
}
pub fn is_clean(&self) -> bool {
self.breakages.is_empty() && self.warnings.is_empty()
}
}
/// Validate a loaded registry against the live schema.
///
/// Pure over `(registry, catalog)` — takes an already-parsed registry and
/// a catalog, so it is callable both at server boot (with the engine's
/// `catalog()`) and offline from the CLI (`omnigraph queries check`),
/// without coupling to server config or an open engine connection.
///
/// Every query is type-checked via the same `typecheck_query_decl` the
/// engine runs for inline queries — no parallel implementation. Failures
/// are **collected, not fail-fast**, so an operator sees every broken
/// query in one pass.
///
/// Advisory lint (warn, never block): an `mcp.expose: true` query that
/// declares a `Vector(N)` parameter. An LLM cannot supply a raw embedding
/// vector; such a query should take a `String` parameter and let the
/// engine embed it server-side at query time. Service-to-service callers
/// may legitimately pass vectors, so this warns rather than rejects.
pub fn check(registry: &QueryRegistry, catalog: &Catalog) -> CheckReport {
let mut report = CheckReport::default();
for query in registry.iter() {
if let Err(err) = typecheck_query_decl(catalog, &query.decl) {
report.breakages.push(Breakage {
query: query.name.clone(),
message: err.to_string(),
});
}
if query.expose {
for param in &query.decl.params {
// Resolve to the structured type via the compiler's own
// resolver rather than string-matching `Vector(` — one
// canonical definition of "is a vector", so this lint can't
// drift from how the parser/type system spells the type.
let is_vector = PropType::from_param_type_name(&param.type_name, param.nullable)
.is_some_and(|pt| matches!(pt.scalar, ScalarType::Vector(_)));
if is_vector {
report.warnings.push(Warning {
query: query.name.clone(),
message: format!(
"MCP-exposed query declares a `{}` parameter `${}` that agents \
cannot supply; use a `String` parameter for server-side embedding",
param.type_name, param.name
),
});
}
}
}
}
report
}
/// Format every breakage in a registry check report into a multi-line
/// operator-facing message, naming each offending query.
pub fn format_check_breakages(label: &str, report: &CheckReport) -> String {
let joined = report
.breakages
.iter()
.map(|b| format!("query '{}': {}", b.query, b.message))
.collect::<Vec<_>>()
.join("\n ");
format!(
"graph '{label}': {} stored quer{} failed the schema check:\n {joined}",
report.breakages.len(),
if report.breakages.len() == 1 {
"y"
} else {
"ies"
}
)
}
#[cfg(test)]
mod tests {
use super::*;
fn spec(name: &str, source: &str, expose: bool) -> RegistrySpec {
RegistrySpec {
name: name.to_string(),
source: source.to_string(),
expose,
tool_name: None,
}
}
fn spec_tool(name: &str, source: &str, expose: bool, tool_name: &str) -> RegistrySpec {
RegistrySpec {
name: name.to_string(),
source: source.to_string(),
expose,
tool_name: Some(tool_name.to_string()),
}
}
#[test]
fn key_equal_symbol_loads() {
let reg = QueryRegistry::from_specs(vec![spec(
"find_user",
"query find_user($id: String) { match { $u: User } return { $u.name } }",
true,
)])
.unwrap();
let q = reg.lookup("find_user").unwrap();
assert_eq!(q.name, "find_user");
assert!(q.expose);
assert_eq!(q.decl.params.len(), 1);
assert!(!q.is_mutation());
// No override → the effective tool name is the query name.
assert_eq!(q.effective_tool_name(), "find_user");
// An explicit override is what the catalog keys on.
let with_tool = QueryRegistry::from_specs(vec![spec_tool(
"find_user",
"query find_user($id: String) { match { $u: User } return { $u.name } }",
true,
"lookup_user",
)])
.unwrap();
assert_eq!(
with_tool.lookup("find_user").unwrap().effective_tool_name(),
"lookup_user"
);
}
#[test]
fn key_mismatch_is_an_identity_error() {
let errors = QueryRegistry::from_specs(vec![spec(
"find_user",
// symbol is `lookup`, key is `find_user` — must be rejected.
"query lookup($id: String) { match { $u: User } return { $u.name } }",
false,
)])
.unwrap_err();
assert_eq!(errors.len(), 1);
assert_eq!(errors[0].query.as_deref(), Some("find_user"));
assert!(errors[0].message.contains("must match the query symbol"));
}
#[test]
fn multi_query_file_selects_the_matching_symbol() {
let source = "query a($x: I64) { match { $u: User } return { $u.name } }\n\
query b($y: String) { match { $u: User } return { $u.name } }";
let reg = QueryRegistry::from_specs(vec![spec("b", source, false)]).unwrap();
let q = reg.lookup("b").unwrap();
assert_eq!(q.name, "b");
assert_eq!(q.decl.params[0].name, "y");
assert!(reg.lookup("a").is_none(), "only the selected symbol is registered");
}
#[test]
fn duplicate_exposed_tool_name_is_a_load_error() {
// Two MCP-exposed queries claiming one tool name is an ambiguity in
// the catalog key space — refused at load, naming both queries and
// the contested tool.
let errors = QueryRegistry::from_specs(vec![
spec_tool("a", "query a() { match { $u: User } return { $u.name } }", true, "dup"),
spec_tool("b", "query b() { match { $u: User } return { $u.name } }", true, "dup"),
])
.unwrap_err();
assert_eq!(errors.len(), 1);
let msg = errors[0].to_string();
assert!(msg.contains("'dup'"), "names the contested tool: {msg}");
assert!(msg.contains("'a'"), "names the winning query: {msg}");
assert!(msg.contains("'b'"), "names the losing query: {msg}");
}
#[test]
fn duplicate_tool_name_among_unexposed_is_allowed() {
// Unexposed queries have no MCP tool, so a shared effective tool
// name is inert — must not error (pins the exposed-only scope).
let reg = QueryRegistry::from_specs(vec![
spec_tool("a", "query a() { match { $u: User } return { $u.name } }", false, "dup"),
spec_tool("b", "query b() { match { $u: User } return { $u.name } }", false, "dup"),
])
.unwrap();
assert_eq!(reg.len(), 2);
}
#[test]
fn parse_error_surfaces_per_entry() {
let errors =
QueryRegistry::from_specs(vec![spec("broken", "query broken( {{ not valid", false)])
.unwrap_err();
assert_eq!(errors[0].query.as_deref(), Some("broken"));
assert!(errors[0].message.contains("parse error"));
}
#[test]
fn errors_collect_rather_than_fail_fast() {
let errors = QueryRegistry::from_specs(vec![
spec("good", "query good() { match { $u: User } return { $u.name } }", false),
spec("mismatch", "query other() { match { $u: User } return { $u.name } }", false),
spec("broken", "query broken(", false),
])
.unwrap_err();
// `good` loads cleanly; only the mismatch and the parse error are
// reported, and both surface in one pass (not fail-fast).
assert_eq!(errors.len(), 2);
}
#[test]
fn mutation_body_classifies_as_mutation() {
let reg = QueryRegistry::from_specs(vec![spec(
"add_user",
"query add_user($name: String) { insert User { name: $name } }",
false,
)])
.unwrap();
assert!(reg.lookup("add_user").unwrap().is_mutation());
}
// --- check(registry, catalog) ---
use omnigraph_compiler::catalog::build_catalog;
use omnigraph_compiler::schema::parser::parse_schema;
fn test_catalog() -> Catalog {
let schema = parse_schema(
r#"
node User {
name: String
age: I32?
embedding: Vector(4)
}
"#,
)
.unwrap();
build_catalog(&schema).unwrap()
}
#[test]
fn check_passes_for_valid_query() {
let reg = QueryRegistry::from_specs(vec![spec(
"find_user",
"query find_user($name: String) { match { $u: User { name: $name } } return { $u.age } }",
false,
)])
.unwrap();
let report = check(&reg, &test_catalog());
assert!(report.is_clean(), "unexpected: {:?}", report);
}
#[test]
fn check_reports_unknown_type_as_breakage() {
let reg = QueryRegistry::from_specs(vec![spec(
"ghost",
// `Widget` is not in the schema.
"query ghost() { match { $w: Widget } return { $w.name } }",
false,
)])
.unwrap();
let report = check(&reg, &test_catalog());
assert!(report.has_breakages());
assert_eq!(report.breakages[0].query, "ghost");
}
#[test]
fn check_reports_unknown_property_as_breakage() {
let reg = QueryRegistry::from_specs(vec![spec(
"bad_prop",
// `User` exists but has no `nickname`.
"query bad_prop() { match { $u: User } return { $u.nickname } }",
false,
)])
.unwrap();
let report = check(&reg, &test_catalog());
assert!(report.has_breakages());
assert_eq!(report.breakages[0].query, "bad_prop");
}
#[test]
fn check_collects_every_breakage_not_fail_fast() {
let reg = QueryRegistry::from_specs(vec![
spec("a", "query a() { match { $w: Widget } return { $w.x } }", false),
spec("b", "query b() { match { $g: Gadget } return { $g.y } }", false),
spec(
"ok",
"query ok() { match { $u: User } return { $u.name } }",
false,
),
])
.unwrap();
let report = check(&reg, &test_catalog());
assert_eq!(report.breakages.len(), 2, "both bad queries reported: {:?}", report);
}
#[test]
fn vector_param_on_exposed_query_warns() {
let reg = QueryRegistry::from_specs(vec![spec(
"vec_search",
"query vec_search($q: Vector(4)) { match { $u: User } return { $u.name } \
order { nearest($u.embedding, $q) } limit 3 }",
true, // mcp.expose
)])
.unwrap();
let report = check(&reg, &test_catalog());
assert!(!report.has_breakages(), "valid query: {:?}", report);
assert_eq!(report.warnings.len(), 1);
assert_eq!(report.warnings[0].query, "vec_search");
}
#[test]
fn vector_param_on_unexposed_query_is_silent() {
let reg = QueryRegistry::from_specs(vec![spec(
"vec_search",
"query vec_search($q: Vector(4)) { match { $u: User } return { $u.name } \
order { nearest($u.embedding, $q) } limit 3 }",
false, // not exposed — vector param is fine for service-to-service callers
)])
.unwrap();
let report = check(&reg, &test_catalog());
assert!(report.is_clean(), "unexpected: {:?}", report);
}
#[test]
fn non_vector_param_on_exposed_query_does_not_warn() {
// The recommended `String` alternative on an exposed query does not
// resolve to a Vector, so the embedding advisory stays silent. Guards
// the structured type check against a false positive (and pins that
// only `Vector(_)` triggers the warning).
let reg = QueryRegistry::from_specs(vec![spec(
"search",
"query search($name: String) { match { $u: User { name: $name } } return { $u.name } }",
true,
)])
.unwrap();
let report = check(&reg, &test_catalog());
assert!(report.is_clean(), "no breakage or warning expected: {:?}", report);
}
// --- catalog projection (api::query_catalog_entry) ---
#[test]
fn catalog_entry_projects_every_param_kind() {
use crate::api::{self, ParamKind};
let reg = QueryRegistry::from_specs(vec![spec_tool(
"all_types",
"query all_types($s: String, $i: I32, $big: I64, $u: U64, $f: F64, $b: Bool, \
$d: Date, $dt: DateTime, $blob: Blob, $opt: String?, $list: [I32], $vec: Vector(4)) \
{ match { $x: User } return { $x.name } }",
true,
"all",
)])
.unwrap();
let entry = api::query_catalog_entry(reg.lookup("all_types").unwrap());
assert_eq!(entry.name, "all_types");
assert_eq!(entry.tool_name, "all");
assert!(!entry.mutation);
let by: std::collections::HashMap<_, _> =
entry.params.iter().map(|p| (p.name.as_str(), p)).collect();
assert_eq!(by["s"].kind, ParamKind::String);
assert_eq!(by["i"].kind, ParamKind::Int);
assert_eq!(by["big"].kind, ParamKind::BigInt, "I64 → bigint (string on the wire)");
assert_eq!(by["u"].kind, ParamKind::BigInt, "U64 → bigint");
assert_eq!(by["f"].kind, ParamKind::Float);
assert_eq!(by["b"].kind, ParamKind::Bool);
assert_eq!(by["d"].kind, ParamKind::Date);
assert_eq!(by["dt"].kind, ParamKind::DateTime);
assert_eq!(by["blob"].kind, ParamKind::Blob);
assert!(!by["s"].nullable);
assert!(by["opt"].nullable, "String? → nullable");
assert_eq!(by["list"].kind, ParamKind::List);
assert_eq!(by["list"].item_kind, Some(ParamKind::Int), "[I32] → list of int");
assert_eq!(by["vec"].kind, ParamKind::Vector);
assert_eq!(by["vec"].vector_dim, Some(4));
}
#[test]
fn catalog_entry_flags_mutation_and_empty_params() {
use crate::api;
let reg = QueryRegistry::from_specs(vec![spec(
"add_user",
"query add_user($name: String) { insert User { name: $name } }",
true,
)])
.unwrap();
let entry = api::query_catalog_entry(reg.lookup("add_user").unwrap());
assert!(entry.mutation, "insert body → mutation flag");
let reg2 = QueryRegistry::from_specs(vec![spec(
"no_params",
"query no_params() { match { $u: User } return { $u.name } }",
true,
)])
.unwrap();
let entry2 = api::query_catalog_entry(reg2.lookup("no_params").unwrap());
assert!(entry2.params.is_empty(), "no declared params → empty list");
}
// --- load() error collection (file I/O + parse in one pass) ---
#[test]
fn load_collects_io_and_parse_errors_in_one_pass() {
use crate::config::load_config;
let temp = tempfile::tempdir().unwrap();
std::fs::write(
temp.path().join("good.gq"),
"query good() { match { $u: User } return { $u.name } }",
)
.unwrap();
std::fs::write(temp.path().join("broken.gq"), "query broken( {{ not valid").unwrap();
// `missing.gq` is deliberately not written (an I/O failure).
std::fs::write(
temp.path().join("omnigraph.yaml"),
"queries:\n good:\n file: ./good.gq\n \
missing:\n file: ./missing.gq\n broken:\n file: ./broken.gq\n",
)
.unwrap();
let config = load_config(Some(&temp.path().join("omnigraph.yaml"))).unwrap();
let errors = QueryRegistry::load(&config, config.query_entries()).unwrap_err();
let joined = errors.iter().map(|e| e.to_string()).collect::<Vec<_>>().join("\n");
// Both the missing file AND the parse error surface in one pass —
// the I/O failure must not mask the parse failure.
assert!(joined.contains("missing"), "I/O error must surface: {joined}");
assert!(
joined.contains("broken") && joined.contains("parse error"),
"the parse error in a readable file must surface in the same pass: {joined}"
);
assert!(!joined.contains("'good'"), "the valid entry is not an error: {joined}");
}
}

View file

@ -29,6 +29,7 @@ use tokio::sync::Mutex;
use crate::identity::GraphKey;
use crate::policy::PolicyEngine;
use crate::queries::QueryRegistry;
/// Open handle for a single graph in the registry. Cheap to clone (`Arc`-wrapped
/// engine + policy). Cluster-mode handlers extract this via
@ -47,6 +48,11 @@ pub struct GraphHandle {
/// `_as` writers"; the HTTP-layer `require_bearer_auth` middleware still
/// runs regardless.
pub policy: Option<Arc<PolicyEngine>>,
/// Per-graph stored-query registry, loaded and validated at
/// startup. `None` means the operator declared no stored queries for
/// this graph — `POST /queries/{name}` then 404s. Mirrors the
/// optional `policy` shape.
pub queries: Option<Arc<QueryRegistry>>,
}
/// Immutable snapshot of the registry's current state. Replaced atomically
@ -245,6 +251,7 @@ fn canonicalize_handle_uri(
uri: canonical_uri.clone(),
engine: Arc::clone(&handle.engine),
policy: handle.policy.clone(),
queries: handle.queries.clone(),
});
Ok((canonical_uri, canonical_handle))
}
@ -276,6 +283,7 @@ mod tests {
uri: graph_uri,
engine: Arc::new(engine),
policy: None,
queries: None,
})
}
@ -340,12 +348,14 @@ mod tests {
uri: shared_uri.clone(),
engine: Arc::clone(&engine),
policy: None,
queries: None,
});
let h2 = Arc::new(GraphHandle {
key: GraphKey::cluster(GraphId::try_from("beta").unwrap()),
uri: shared_uri,
engine,
policy: None,
queries: None,
});
let registry = GraphRegistry::new();
@ -411,12 +421,14 @@ mod tests {
uri: shared_uri.clone(),
engine: Arc::clone(&engine),
policy: None,
queries: None,
});
let h2 = Arc::new(GraphHandle {
key: GraphKey::cluster(GraphId::try_from("beta").unwrap()),
uri: shared_uri,
engine,
policy: None,
queries: None,
});
let err = match GraphRegistry::from_handles(vec![h1, h2]) {
Ok(_) => panic!("expected DuplicateUri, got Ok"),

View file

@ -168,6 +168,8 @@ const EXPECTED_PATHS: &[&str] = &[
"/export",
"/change",
"/mutate",
"/queries",
"/queries/{name}",
"/schema",
"/schema/apply",
"/ingest",
@ -701,6 +703,8 @@ fn protected_endpoints_reference_bearer_token_security() {
("/read", "post"),
("/change", "post"),
("/schema/apply", "post"),
("/queries", "get"),
("/queries/{name}", "post"),
("/ingest", "post"),
("/export", "post"),
("/snapshot", "get"),
@ -913,6 +917,34 @@ fn post_endpoints_have_request_body() {
}
}
#[test]
fn invoke_stored_query_request_body_is_optional() {
let doc = openapi_json();
let request_body = &doc["paths"]["/queries/{name}"]["post"]["requestBody"];
assert!(
request_body.is_object(),
"POST /queries/{{name}} should document its optional request body"
);
assert_eq!(
request_body["required"].as_bool().unwrap_or(false),
false,
"stored-query invocation body should be optional"
);
let schema = &request_body["content"]["application/json"]["schema"];
let ref_path = schema["$ref"]
.as_str()
.or_else(|| {
schema["oneOf"]
.as_array()
.and_then(|schemas| schemas.iter().find_map(|schema| schema["$ref"].as_str()))
})
.unwrap();
assert!(
ref_path.contains("InvokeStoredQueryRequest"),
"POST /queries/{{name}} requestBody should reference InvokeStoredQueryRequest, got {ref_path}"
);
}
// ---------------------------------------------------------------------------
// Serialization round-trip test
// ---------------------------------------------------------------------------
@ -1117,6 +1149,7 @@ async fn app_for_multi_mode(graph_ids: &[&str]) -> (Vec<tempfile::TempDir>, Rout
uri: graph_uri,
engine: Arc::new(engine),
policy: None,
queries: None,
}));
dirs.push(dir);
}

View file

@ -8,7 +8,7 @@ use axum::body::{Body, to_bytes};
use axum::http::header::AUTHORIZATION;
use axum::http::{Method, Request, StatusCode};
use lance::index::DatasetIndexExt;
use omnigraph::db::{Omnigraph, ReadTarget, SchemaApplyOptions};
use omnigraph::db::{Omnigraph, ReadTarget};
use omnigraph::error::OmniError;
use omnigraph::loader::{LoadMode, load_jsonl};
use omnigraph_policy::{PolicyChecker, PolicyEngine};
@ -16,6 +16,7 @@ use omnigraph_server::api::{
BranchCreateRequest, BranchMergeRequest, ChangeRequest, ErrorOutput, ExportRequest,
IngestRequest, QueryRequest, ReadRequest, SchemaApplyRequest, SchemaOutput,
};
use omnigraph_server::queries::{QueryRegistry, RegistrySpec};
use omnigraph_server::{AppState, build_app};
use serde_json::{Value, json};
use serial_test::serial;
@ -141,6 +142,469 @@ fn graph_path(root: &Path) -> PathBuf {
root.join("server.omni")
}
fn stored_query_registry(specs: &[(&str, &str, bool)]) -> QueryRegistry {
QueryRegistry::from_specs(
specs
.iter()
.map(|(name, source, expose)| RegistrySpec {
name: name.to_string(),
source: source.to_string(),
expose: *expose,
tool_name: None,
})
.collect(),
)
.expect("specs parse and key==symbol")
}
#[tokio::test]
async fn server_boots_with_a_valid_stored_query_registry() {
// A stored query that type-checks against the fixture schema
// (`Person { name, age }`) must let the server boot.
let temp = init_loaded_graph().await;
let graph = graph_path(temp.path());
let registry = stored_query_registry(&[(
"find_person",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
false,
)]);
let state = AppState::open_single_with_queries(
graph.to_string_lossy().to_string(),
vec![],
None,
registry,
)
.await;
assert!(state.is_ok(), "valid registry should boot: {:?}", state.err());
}
#[tokio::test]
async fn server_refuses_boot_on_type_broken_stored_query() {
// A stored query referencing a type not in the schema (`Widget`)
// must abort boot, naming the offending query.
let temp = init_loaded_graph().await;
let graph = graph_path(temp.path());
let registry = stored_query_registry(&[(
"ghost",
"query ghost() { match { $w: Widget } return { $w.name } }",
false,
)]);
let result = AppState::open_single_with_queries(
graph.to_string_lossy().to_string(),
vec![],
None,
registry,
)
.await;
// `AppState` is not `Debug`, so match rather than `expect_err`.
let err = match result {
Ok(_) => panic!("type-broken stored query must refuse boot"),
Err(err) => err,
};
let msg = err.to_string();
assert!(msg.contains("ghost"), "error should name the broken query: {msg}");
assert!(
msg.contains("schema check"),
"error should mention the schema check: {msg}"
);
}
/// Build a single-mode app with a stored-query registry plus a bearer→actor
/// pairing and a policy, so invoke tests exercise the `invoke_query`
/// boundary gate and the inner read/change gates together.
async fn app_with_stored_queries(
specs: &[(&str, &str, bool)],
tokens: &[(&str, &str)],
policy: &str,
) -> (tempfile::TempDir, Router) {
let temp = init_loaded_graph().await;
let graph = graph_path(temp.path());
let policy_path = temp.path().join("policy.yaml");
fs::write(&policy_path, policy).unwrap();
let registry = stored_query_registry(specs);
let state = AppState::open_single_with_queries(
graph.to_string_lossy().to_string(),
tokens
.iter()
.map(|(actor, token)| ((*actor).to_string(), (*token).to_string()))
.collect(),
Some(&policy_path),
registry,
)
.await
.unwrap();
(temp, build_app(state))
}
/// - `act-invoke`: invoke_query + read (stored reads, not mutations)
/// - `act-full`: invoke_query + read + change (stored mutations)
/// - `act-noinvoke`: read only, no invoke_query (boundary-denied)
/// - `act-invokeonly`: invoke_query only, no read (clears the boundary, inner read denies)
const INVOKE_POLICY_YAML: &str = r#"
version: 1
groups:
invokers: ["act-invoke"]
full: ["act-full"]
readers: ["act-noinvoke"]
invoke_only: ["act-invokeonly"]
protected_branches: [main]
rules:
# invoke_query is graph-scoped its own rules, no branch_scope.
- id: invokers-can-invoke
allow:
actors: { group: invokers }
actions: [invoke_query]
- id: full-can-invoke
allow:
actors: { group: full }
actions: [invoke_query]
- id: invoke-only-can-invoke
allow:
actors: { group: invoke_only }
actions: [invoke_query]
# read / change are branch-scoped.
- id: invokers-can-read
allow:
actors: { group: invokers }
actions: [read]
branch_scope: any
- id: full-can-read-change
allow:
actors: { group: full }
actions: [read, change]
branch_scope: any
- id: readers-can-read
allow:
actors: { group: readers }
actions: [read]
branch_scope: any
"#;
const STORED_QUERY_SCHEMA_APPLY_POLICY_YAML: &str = r#"
version: 1
groups:
admins: [act-ragnor]
protected_branches: [main]
rules:
- id: admins-can-invoke
allow:
actors: { group: admins }
actions: [invoke_query]
- id: admins-can-read
allow:
actors: { group: admins }
actions: [read]
branch_scope: any
- id: admins-can-schema-apply
allow:
actors: { group: admins }
actions: [schema_apply]
target_branch_scope: protected
"#;
const FIND_PERSON_GQ: &str =
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }";
fn invoke_request(name: &str, token: &str, body: Value) -> Request<Body> {
Request::builder()
.uri(format!("/queries/{name}"))
.method(Method::POST)
.header("content-type", "application/json")
.header("authorization", format!("Bearer {token}"))
.body(Body::from(serde_json::to_vec(&body).unwrap()))
.unwrap()
}
fn invoke_request_bytes(
name: &str,
token: &str,
body: impl Into<Body>,
content_type: Option<&str>,
) -> Request<Body> {
let mut builder = Request::builder()
.uri(format!("/queries/{name}"))
.method(Method::POST)
.header("authorization", format!("Bearer {token}"));
if let Some(content_type) = content_type {
builder = builder.header("content-type", content_type);
}
builder.body(body.into()).unwrap()
}
#[tokio::test(flavor = "multi_thread")]
async fn invoke_stored_read_returns_rows() {
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, false)],
&[("act-invoke", "t-invoke")],
INVOKE_POLICY_YAML,
)
.await;
let (status, body) = json_response(
&app,
invoke_request("find_person", "t-invoke", json!({ "params": { "name": "Alice" } })),
)
.await;
assert_eq!(status, StatusCode::OK, "body: {body}");
assert_eq!(body["query_name"], "find_person");
assert_eq!(body["row_count"], 1, "Alice is in the fixture; body: {body}");
assert!(body["rows"].is_array(), "read envelope shape; body: {body}");
}
#[tokio::test(flavor = "multi_thread")]
async fn invoke_stored_read_accepts_absent_or_empty_body() {
let no_param_query = "query list_people() { match { $p: Person } return { $p.name } }";
let (_temp, app) = app_with_stored_queries(
&[("list_people", no_param_query, false)],
&[("act-invoke", "t-invoke")],
INVOKE_POLICY_YAML,
)
.await;
let (status, body) = json_response(
&app,
invoke_request_bytes("list_people", "t-invoke", Body::empty(), None),
)
.await;
assert_eq!(status, StatusCode::OK, "body: {body}");
assert_eq!(body["query_name"], "list_people");
let (status, body) = json_response(
&app,
invoke_request_bytes(
"list_people",
"t-invoke",
Body::empty(),
Some("application/json"),
),
)
.await;
assert_eq!(status, StatusCode::OK, "body: {body}");
let (status, body) = json_response(
&app,
invoke_request_bytes(
"list_people",
"t-invoke",
Body::from("{}"),
Some("application/json"),
),
)
.await;
assert_eq!(status, StatusCode::OK, "body: {body}");
let (status, body) = json_response(
&app,
invoke_request_bytes(
"list_people",
"t-invoke",
Body::from("{"),
Some("application/json"),
),
)
.await;
assert_eq!(status, StatusCode::BAD_REQUEST, "body: {body}");
assert!(
body["error"]
.as_str()
.unwrap_or_default()
.contains("invalid stored-query invocation body"),
"malformed JSON should be rejected as bad request; body: {body}"
);
}
#[tokio::test(flavor = "multi_thread")]
async fn invoke_stored_mutation_double_gates_on_change() {
let specs: &[(&str, &str, bool)] = &[(
"add_person",
"query add_person($name: String) { insert Person { name: $name } }",
false,
)];
let (_temp, app) = app_with_stored_queries(
specs,
&[("act-invoke", "t-invoke"), ("act-full", "t-full")],
INVOKE_POLICY_YAML,
)
.await;
// Has invoke_query but NOT change → the inner change gate denies (403).
let (status, body) = json_response(
&app,
invoke_request("add_person", "t-invoke", json!({ "params": { "name": "Eve" } })),
)
.await;
assert_eq!(
status,
StatusCode::FORBIDDEN,
"invoke_query without change must 403; body: {body}"
);
// Has invoke_query + change → applied.
let (status, body) = json_response(
&app,
invoke_request("add_person", "t-full", json!({ "params": { "name": "Eve" } })),
)
.await;
assert_eq!(status, StatusCode::OK, "body: {body}");
assert_eq!(body["affected_nodes"], 1, "body: {body}");
}
#[tokio::test(flavor = "multi_thread")]
async fn invoke_stored_query_bad_param_is_400() {
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, false)],
&[("act-invoke", "t-invoke")],
INVOKE_POLICY_YAML,
)
.await;
// `name` is declared String; pass a number.
let (status, body) = json_response(
&app,
invoke_request("find_person", "t-invoke", json!({ "params": { "name": 123 } })),
)
.await;
assert_eq!(status, StatusCode::BAD_REQUEST, "body: {body}");
assert!(
body["error"].as_str().unwrap_or_default().contains("name"),
"400 should name the offending param; body: {body}"
);
}
#[tokio::test(flavor = "multi_thread")]
async fn invoke_unknown_query_and_denied_actor_return_identical_404() {
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, false)],
&[("act-invoke", "t-invoke"), ("act-noinvoke", "t-noinvoke")],
INVOKE_POLICY_YAML,
)
.await;
// Authorized actor, unknown query name → 404.
let (unknown_status, unknown_body) =
json_response(&app, invoke_request("does_not_exist", "t-invoke", json!({}))).await;
// Denied actor (no invoke_query), real query name → 404.
let (denied_status, denied_body) = json_response(
&app,
invoke_request("find_person", "t-noinvoke", json!({ "params": { "name": "Alice" } })),
)
.await;
assert_eq!(unknown_status, StatusCode::NOT_FOUND);
assert_eq!(denied_status, StatusCode::NOT_FOUND);
assert_eq!(
unknown_body, denied_body,
"deny must be byte-identical to a missing query (no catalog probing)"
);
}
#[tokio::test(flavor = "multi_thread")]
async fn invoke_query_holder_without_read_sees_403_not_404() {
// The 404-hiding is for callers WITHOUT invoke_query. An actor that
// HOLDS invoke_query but lacks `read` clears the boundary gate, then the
// inner read gate denies → 403 for an EXISTING read query, vs 404 for an
// unknown one. Existence is visible to grant-holders by design (the
// documented double-gate); this pins that actual contract.
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, false)],
&[("act-invokeonly", "t-invokeonly")],
INVOKE_POLICY_YAML,
)
.await;
let (exists_status, _) = json_response(
&app,
invoke_request("find_person", "t-invokeonly", json!({ "params": { "name": "Alice" } })),
)
.await;
let (absent_status, _) =
json_response(&app, invoke_request("does_not_exist", "t-invokeonly", json!({}))).await;
assert_eq!(
exists_status,
StatusCode::FORBIDDEN,
"an existing read query the holder can't read → inner-gate 403"
);
assert_eq!(absent_status, StatusCode::NOT_FOUND, "unknown query still 404s");
}
fn get_request(uri: &str, token: &str) -> Request<Body> {
Request::builder()
.uri(uri)
.method(Method::GET)
.header("authorization", format!("Bearer {token}"))
.body(Body::empty())
.unwrap()
}
#[tokio::test(flavor = "multi_thread")]
async fn list_queries_returns_only_exposed_with_typed_params() {
let (_temp, app) = app_with_stored_queries(
&[
("find_person", FIND_PERSON_GQ, true),
(
"add_person",
"query add_person($name: String) { insert Person { name: $name } }",
true,
),
("hidden", "query hidden() { match { $p: Person } return { $p.name } }", false),
],
&[("act-invoke", "t-invoke")],
INVOKE_POLICY_YAML,
)
.await;
let (status, body) = json_response(&app, get_request("/queries", "t-invoke")).await;
assert_eq!(status, StatusCode::OK, "body: {body}");
let entries = body["queries"].as_array().unwrap();
let names: Vec<&str> = entries.iter().map(|q| q["name"].as_str().unwrap()).collect();
assert!(
names.contains(&"find_person") && names.contains(&"add_person"),
"exposed queries listed: {names:?}"
);
assert!(!names.contains(&"hidden"), "non-exposed query hidden from the catalog: {names:?}");
let fp = entries.iter().find(|q| q["name"] == "find_person").unwrap();
assert_eq!(fp["mutation"], false);
assert_eq!(fp["tool_name"], "find_person");
assert_eq!(fp["params"][0]["name"], "name");
assert_eq!(fp["params"][0]["kind"], "string");
let ap = entries.iter().find(|q| q["name"] == "add_person").unwrap();
assert_eq!(ap["mutation"], true, "stored insert → mutation");
}
#[tokio::test(flavor = "multi_thread")]
async fn list_queries_is_read_gated_so_a_non_invoker_can_list() {
// The catalog is read-gated (not invoke_query-gated), so a reader who
// lacks invoke_query still enumerates the exposed queries — the
// documented probe-oracle gap until per-query Cedar filtering lands.
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, true)],
&[("act-noinvoke", "t-noinvoke")],
INVOKE_POLICY_YAML,
)
.await;
let (status, body) = json_response(&app, get_request("/queries", "t-noinvoke")).await;
assert_eq!(status, StatusCode::OK, "read-gated catalog; body: {body}");
let names: Vec<&str> = body["queries"]
.as_array()
.unwrap()
.iter()
.map(|q| q["name"].as_str().unwrap())
.collect();
assert!(
names.contains(&"find_person"),
"a reader lists the catalog despite lacking invoke_query: {names:?}"
);
}
#[tokio::test(flavor = "multi_thread")]
async fn list_queries_is_empty_when_no_registry() {
let (_temp, app) = app_for_loaded_graph_with_auth("demo-token").await;
let (status, body) = json_response(&app, get_request("/queries", "demo-token")).await;
assert_eq!(status, StatusCode::OK, "body: {body}");
assert!(
body["queries"].as_array().unwrap().is_empty(),
"no stored-query registry → empty catalog"
);
}
fn drifted_test_schema() -> String {
fs::read_to_string(fixture("test.pg"))
.unwrap()
@ -423,6 +887,83 @@ async fn schema_apply_route_updates_graph_for_authorized_admin() {
);
}
#[tokio::test(flavor = "multi_thread")]
async fn schema_apply_route_rejects_stored_query_breakage_before_publish() {
let (temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, true)],
&[("act-ragnor", "admin-token")],
STORED_QUERY_SCHEMA_APPLY_POLICY_YAML,
)
.await;
let request = Request::builder()
.method(Method::POST)
.uri("/schema/apply")
.header("content-type", "application/json")
.header("authorization", "Bearer admin-token")
.body(Body::from(
serde_json::to_vec(&SchemaApplyRequest {
schema_source: renamed_age_schema(),
..Default::default()
})
.unwrap(),
))
.unwrap();
let (status, payload) = json_response(&app, request).await;
assert_eq!(status, StatusCode::BAD_REQUEST, "body: {payload}");
let message = payload["error"].as_str().unwrap_or_default();
assert!(
message.contains("find_person") && message.contains("schema check"),
"registry breakage should name the stored query; body: {payload}"
);
let reopened = Omnigraph::open(graph_path(temp.path()).to_str().unwrap())
.await
.unwrap();
let person = &reopened.catalog().node_types["Person"];
assert!(person.properties.contains_key("age"));
assert!(!person.properties.contains_key("years"));
let (invoke_status, invoke_body) = json_response(
&app,
invoke_request(
"find_person",
"admin-token",
json!({ "params": { "name": "Alice" } }),
),
)
.await;
assert_eq!(invoke_status, StatusCode::OK, "body: {invoke_body}");
assert_eq!(invoke_body["row_count"], 1);
}
#[tokio::test(flavor = "multi_thread")]
async fn schema_apply_route_noop_keeps_valid_stored_query_registry() {
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, true)],
&[("act-ragnor", "admin-token")],
STORED_QUERY_SCHEMA_APPLY_POLICY_YAML,
)
.await;
let request = Request::builder()
.method(Method::POST)
.uri("/schema/apply")
.header("content-type", "application/json")
.header("authorization", "Bearer admin-token")
.body(Body::from(
serde_json::to_vec(&SchemaApplyRequest {
schema_source: fs::read_to_string(fixture("test.pg")).unwrap(),
..Default::default()
})
.unwrap(),
))
.unwrap();
let (status, payload) = json_response(&app, request).await;
assert_eq!(status, StatusCode::OK, "body: {payload}");
assert_eq!(payload["applied"], false);
}
#[tokio::test]
async fn schema_apply_route_requires_schema_apply_policy_permission() {
let (_temp, app) = app_for_graph_with_auth_tokens_and_policy(
@ -4690,6 +5231,7 @@ mod multi_graph_startup {
uri: graph_uri,
engine: Arc::new(engine),
policy: None,
queries: None,
}));
dirs.push(dir);
}
@ -4985,12 +5527,14 @@ graphs:
uri: graph_uri.clone(),
engine: Arc::clone(&engine),
policy: None,
queries: None,
});
let beta = Arc::new(GraphHandle {
key: GraphKey::cluster(GraphId::try_from("beta").unwrap()),
uri: format!("file://{graph_uri}/"),
engine,
policy: None,
queries: None,
});
match GraphRegistry::from_handles(vec![alpha, beta]) {
@ -5016,6 +5560,7 @@ graphs:
uri: format!("file://{graph_uri}/"),
engine: Arc::new(engine),
policy: None,
queries: None,
});
let registry = GraphRegistry::from_handles(vec![handle]).unwrap();
@ -5138,11 +5683,11 @@ graphs:
let err = load_server_settings(Some(&config_path), None, None, None, true).unwrap_err();
let msg = err.to_string();
assert!(
msg.contains("top-level `policy.file` is single-graph/CLI-local policy only"),
"expected single-graph policy guidance, got: {msg}"
msg.contains("top-level") && msg.contains("policy.file") && msg.contains("not honored"),
"expected top-level-not-honored guidance, got: {msg}"
);
assert!(
msg.contains("graphs.<graph_id>.policy.file"),
msg.contains("graphs.<graph_id>"),
"expected per-graph migration guidance, got: {msg}"
);
assert!(
@ -5151,6 +5696,88 @@ graphs:
);
}
#[test]
fn mode_inference_multi_rejects_top_level_queries() {
// Symmetric to the policy guard: a top-level `queries:` block in
// multi-graph mode is not honored (each graph uses its own), so it
// is a loud error rather than a silent no-op.
let temp = tempfile::tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
"queries:\n q:\n file: ./q.gq\ngraphs:\n alpha:\n uri: /tmp/alpha.omni\n",
)
.unwrap();
let err = load_server_settings(Some(&config_path), None, None, None, true).unwrap_err();
let msg = err.to_string();
assert!(
msg.contains("queries") && msg.contains("not honored"),
"top-level queries must be rejected in multi-graph mode: {msg}"
);
}
#[test]
fn single_mode_named_graph_rejects_top_level_blocks() {
// Serving a graph by name (`--target`/`server.graph`) uses its
// per-graph block; a populated top-level block would be silently
// shadowed, so boot refuses and names the per-graph location.
let temp = tempfile::tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
"policy:\n file: ./top.yaml\ngraphs:\n prod:\n uri: /tmp/prod.omni\n",
)
.unwrap();
let err =
load_server_settings(Some(&config_path), None, Some("prod".to_string()), None, true)
.unwrap_err();
let msg = err.to_string();
assert!(
msg.contains("prod") && msg.contains("policy.file") && msg.contains("graphs.prod"),
"named single-mode + top-level policy must refuse, naming the graph: {msg}"
);
}
#[test]
fn single_mode_named_graph_uses_per_graph_policy_and_queries() {
// The identity rule: `--target prod` attaches `graphs.prod`'s own
// policy + queries, not the top-level ones (which are absent here).
let temp = tempfile::tempdir().unwrap();
fs::write(
temp.path().join("prod.gq"),
"query pq() { match { $u: User } return { $u.name } }",
)
.unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
"graphs:\n prod:\n uri: /tmp/prod.omni\n policy:\n file: ./prod-policy.yaml\n \
queries:\n pq:\n file: ./prod.gq\n",
)
.unwrap();
let settings =
load_server_settings(Some(&config_path), None, Some("prod".to_string()), None, true)
.unwrap();
match settings.mode {
ServerConfigMode::Single {
graph_id,
policy_file,
queries,
..
} => {
assert_eq!(graph_id, "prod", "named single-mode keeps graph identity");
assert!(
policy_file
.as_ref()
.is_some_and(|p| p.ends_with("prod-policy.yaml")),
"per-graph policy attached: {policy_file:?}"
);
assert!(queries.lookup("pq").is_some(), "per-graph query attached");
}
other => panic!("expected Single mode, got {other:?}"),
}
}
#[test]
fn mode_inference_normalizes_multi_graph_uris() {
let temp = tempfile::tempdir().unwrap();
@ -5383,6 +6010,7 @@ graphs:
uri: graph_uri,
engine: Arc::new(engine),
policy: None,
queries: None,
});
let tokens = vec![("act-andrew".to_string(), "secret-token".to_string())];
let workload = omnigraph_server::workload::WorkloadController::from_env();
@ -5450,6 +6078,7 @@ graphs:
uri: graph_uri,
engine: Arc::new(engine),
policy: None,
queries: None,
}));
}

View file

@ -67,6 +67,12 @@ pub struct SchemaApplyResult {
pub steps: Vec<SchemaMigrationStep>,
}
#[derive(Debug, Clone)]
pub struct SchemaApplyPreview {
pub plan: SchemaMigrationPlan,
pub catalog: Catalog,
}
/// Top-level handle to an Omnigraph database.
///
/// An Omnigraph is a Lance-native graph database with git-style branching.
@ -493,6 +499,14 @@ impl Omnigraph {
schema_apply::plan_schema(self, desired_schema_source, options).await
}
pub async fn preview_schema_apply_with_options(
&self,
desired_schema_source: &str,
options: SchemaApplyOptions,
) -> Result<SchemaApplyPreview> {
schema_apply::preview_schema_apply(self, desired_schema_source, options).await
}
pub async fn apply_schema(&self, desired_schema_source: &str) -> Result<SchemaApplyResult> {
self.apply_schema_as(desired_schema_source, SchemaApplyOptions::default(), None)
.await
@ -523,7 +537,28 @@ impl Omnigraph {
options: SchemaApplyOptions,
actor: Option<&str>,
) -> Result<SchemaApplyResult> {
schema_apply::apply_schema(self, desired_schema_source, options, actor).await
self.apply_schema_as_with_catalog_check(desired_schema_source, options, actor, |_| Ok(()))
.await
}
pub async fn apply_schema_as_with_catalog_check<F>(
&self,
desired_schema_source: &str,
options: SchemaApplyOptions,
actor: Option<&str>,
validate_catalog: F,
) -> Result<SchemaApplyResult>
where
F: FnOnce(&Catalog) -> Result<()>,
{
schema_apply::apply_schema(
self,
desired_schema_source,
options,
actor,
validate_catalog,
)
.await
}
pub(crate) async fn ensure_schema_apply_idle(&self, operation: &str) -> Result<()> {

View file

@ -48,50 +48,17 @@ pub(super) async fn plan_schema(
Ok(plan)
}
pub(super) async fn apply_schema(
db: &Omnigraph,
desired_schema_source: &str,
options: SchemaApplyOptions,
actor: Option<&str>,
) -> Result<SchemaApplyResult> {
// Engine-layer policy gate (MR-722 chassis core).
//
// Fires BEFORE acquiring the schema-apply lock or doing any other
// work. When no PolicyChecker is installed this is a no-op and
// the apply path behaves exactly as it did before MR-722. When
// a PolicyChecker IS installed and the actor is None, this is a
// hard error — see Omnigraph::enforce's docstring for the
// forget-the-actor-footgun reasoning.
//
// Scope is TargetBranch("main") to match the HTTP-layer convention
// for SchemaApply: branch=None, target_branch=Some("main"). Cedar
// policies in the wild use `target_branch_scope: protected` to
// gate schema applies, so the engine-layer call has to set the
// target_branch shape that activates that predicate. Wrong scope
// here = silent policy mismatch with HTTP. See
// `omnigraph_policy::ResourceScope::to_branch_pair` for the mapping.
db.enforce(
omnigraph_policy::PolicyAction::SchemaApply,
&omnigraph_policy::ResourceScope::TargetBranch("main".to_string()),
actor,
)?;
acquire_schema_apply_lock(db).await?;
let result = apply_schema_with_lock(db, desired_schema_source, options).await;
let release_result = release_schema_apply_lock(db).await;
match (result, release_result) {
(Ok(result), Ok(())) => Ok(result),
(Ok(_), Err(err)) => Err(err),
(Err(err), Ok(())) => Err(err),
(Err(err), Err(_)) => Err(err),
}
struct PlannedSchemaApply {
plan: SchemaMigrationPlan,
desired_ir: SchemaIR,
desired_catalog: Catalog,
}
pub(super) async fn apply_schema_with_lock(
async fn plan_schema_for_apply(
db: &Omnigraph,
desired_schema_source: &str,
options: SchemaApplyOptions,
) -> Result<SchemaApplyResult> {
) -> Result<PlannedSchemaApply> {
db.ensure_schema_state_valid().await?;
let branches = db.coordinator.read().await.all_branches().await?;
// Skip `main` and internal system branches. The schema-apply lock branch
@ -123,6 +90,87 @@ pub(super) async fn apply_schema_with_lock(
.unwrap_or_else(|| "unsupported schema migration plan".to_string());
return Err(OmniError::manifest(message));
}
let mut desired_catalog = build_catalog_from_ir(&desired_ir)?;
fixup_blob_schemas(&mut desired_catalog);
Ok(PlannedSchemaApply {
plan,
desired_ir,
desired_catalog,
})
}
pub(super) async fn preview_schema_apply(
db: &Omnigraph,
desired_schema_source: &str,
options: SchemaApplyOptions,
) -> Result<SchemaApplyPreview> {
let planned = plan_schema_for_apply(db, desired_schema_source, options).await?;
Ok(SchemaApplyPreview {
plan: planned.plan,
catalog: planned.desired_catalog,
})
}
pub(super) async fn apply_schema<F>(
db: &Omnigraph,
desired_schema_source: &str,
options: SchemaApplyOptions,
actor: Option<&str>,
validate_catalog: F,
) -> Result<SchemaApplyResult>
where
F: FnOnce(&Catalog) -> Result<()>,
{
// Engine-layer policy gate (MR-722 chassis core).
//
// Fires BEFORE acquiring the schema-apply lock or doing any other
// work. When no PolicyChecker is installed this is a no-op and
// the apply path behaves exactly as it did before MR-722. When
// a PolicyChecker IS installed and the actor is None, this is a
// hard error — see Omnigraph::enforce's docstring for the
// forget-the-actor-footgun reasoning.
//
// Scope is TargetBranch("main") to match the HTTP-layer convention
// for SchemaApply: branch=None, target_branch=Some("main"). Cedar
// policies in the wild use `target_branch_scope: protected` to
// gate schema applies, so the engine-layer call has to set the
// target_branch shape that activates that predicate. Wrong scope
// here = silent policy mismatch with HTTP. See
// `omnigraph_policy::ResourceScope::to_branch_pair` for the mapping.
db.enforce(
omnigraph_policy::PolicyAction::SchemaApply,
&omnigraph_policy::ResourceScope::TargetBranch("main".to_string()),
actor,
)?;
acquire_schema_apply_lock(db).await?;
let result = apply_schema_with_lock(db, desired_schema_source, options, validate_catalog).await;
let release_result = release_schema_apply_lock(db).await;
match (result, release_result) {
(Ok(result), Ok(())) => Ok(result),
(Ok(_), Err(err)) => Err(err),
(Err(err), Ok(())) => Err(err),
(Err(err), Err(_)) => Err(err),
}
}
pub(super) async fn apply_schema_with_lock<F>(
db: &Omnigraph,
desired_schema_source: &str,
options: SchemaApplyOptions,
validate_catalog: F,
) -> Result<SchemaApplyResult>
where
F: FnOnce(&Catalog) -> Result<()>,
{
let planned = plan_schema_for_apply(db, desired_schema_source, options).await?;
validate_catalog(&planned.desired_catalog)?;
let PlannedSchemaApply {
plan,
desired_ir,
desired_catalog,
} = planned;
if plan.steps.is_empty() {
return Ok(SchemaApplyResult {
supported: true,
@ -132,9 +180,6 @@ pub(super) async fn apply_schema_with_lock(
});
}
let mut desired_catalog = build_catalog_from_ir(&desired_ir)?;
fixup_blob_schemas(&mut desired_catalog);
let snapshot = db.snapshot().await;
let base_manifest_version = snapshot.version();
let mut added_tables = BTreeSet::new();

View file

@ -59,6 +59,8 @@ Working documents for in-flight feature work. Removed when the work lands.
|---|---|
| Schema-lint chassis v1 (MR-694) — `--allow-data-loss`, soft/hard drops | [schema-lint-v1-plan.md](schema-lint-v1-plan.md) |
| Inline + stored queries, request/response envelope, MCP (MR-656 / MR-976 / MR-969) | [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) |
| Config & CLI architecture — layered config, client targeting, file naming (MR-973 / MR-974 / MR-981) | [rfc-002-config-cli-architecture.md](rfc-002-config-cli-architecture.md) |
| MCP server surface — full tool parity, stored queries, modular auth (MR-969 / MR-956 / MR-974) | [rfc-003-mcp-server-surface.md](rfc-003-mcp-server-surface.md) |
## Boundary

View file

@ -0,0 +1,590 @@
# RFC: Config & CLI Architecture — Layered Config, Client Targeting, File Naming
**Status:** Proposed
**Date:** 2026-05-30
**Tickets:** MR-668 (multi-graph server, shipped — the dependency this builds on), MR-969 (stored queries + MCP — supplies the in-repo agent tool surface), MR-973 (quickstart / onboarding), MR-974 (agent setup surface), MR-981 (agent-friendly CLI hardening)
**Target release:** v0.8.x (tentative; phased — see Rollout)
## Summary
OmniGraph today has a single config file, `omnigraph.yaml`, read both by the CLI (operating the embedded engine) and by `omnigraph-server` (hosting graphs). There is **no client-side configuration that targets a *running server*** — to talk to a deployed `omnigraph-server` you drop to `curl` or the `omnigraph-ts` client. This is the one real gap in an otherwise coherent design (storage-URI addressing, multi-graph routing, per-graph policy).
This RFC defines the config and CLI architecture that closes that gap, derived from first principles — *working backwards from what OmniGraph uniquely enables* rather than copying kubeconfig / `helix.toml`. The result:
1. A **global-first layered config** — user-global (`~/.omnigraph/`) is the **primary, self-sufficient default**; per-project (`./omnigraph.yaml`) is an *optional* override + deployment manifest. One uniform schema, both layers optional; the CLI works from any directory with **no project file** (the `kubectl`/`aws`/`gh` posture), unlike today's project-anchored behavior.
2. A single unifying noun — the **target** — that resolves a name to a concrete `(locus, graph, sub-state, credential)` tuple, where the locus is **embedded (storage URI) XOR remote (server endpoint)**.
3. A **multi-server × multi-graph** client model (OmniGraph hosts N graphs per server and there are M servers — unlike Helix's one-cluster-one-graph).
4. **Credentials by reference, keyed by server name** (the AWS/gh/kube model) — OS keychain `omnigraph:<server>` (preferred) → a `[<server>]` profile in `~/.omnigraph/credentials``OMNIGRAPH_TOKEN[_<SERVER>]` env (CI). `servers.<name>` is endpoint-only by default but may carry an explicit, secret-free `auth: { token: { env|file|command|keychain } }` source; no `credentials.yaml`; the shipped `bearer_token_env` + dotenv stay as a legacy compat path. Every committed/GitOps'd surface stays secret-free.
5. A **file-naming** decision: project and server config are **the same artifact, same name** (`omnigraph.yaml`); the only differently-named file is the user-global `config.yaml`, justified by **scope, not role**.
The design optimizes jointly for **DX** (one command surface across embedded and remote; clone-and-go) and **AX** (agent experience: one flat resolved context, secrets structurally unreachable, branch-pinned reproducible reads, and a GitOps'd capability surface).
## Reconciliation with shipped / planned CLI work
Verified **against the code**, not ticket statuses (which are unreliable — e.g. MR-581 is marked done but is stale and unbuilt). Findings and the corrections they force:
- **Noun is `graph`/`graphs`, NOT `target`/`targets`.** The config key is `graphs:` in `config.rs` and the flag is `--graph`. **This RFC uses `graphs:`/`--graph` throughout**; the unifying noun is a **`graphs:` entry** that is *embedded* (`storage:`, formerly `uri:`) XOR *remote* (`server:` + `graph_id:` defaulting to the entry key) — a typed locator (§1.1). Read any lingering `targets:`/`--target` below as `graphs:`/`--graph`.
- **`~/.omnigraph/` stands on its own merits** (Helix/aws/kube peer convention), **not** on precedent — there is **no `~/.omnigraph/` usage in the code** today. (MR-581 / MR-531 templates-into-`~/.omnigraph/` are *stale tickets, unbuilt*.)
- **Templates do not exist** in the code (no `template` command). The template mechanism is a *design question for this RFC / the init family*, not an existing foothold.
- **What actually exists in the CLI** (verified): `init, query(read), mutate(change), load, ingest, branch, schema, lint, snapshot, export, commit, policy, optimize, cleanup, graphs`. **Not built:** `serve, quickstart, template, prune, login`. `omnigraph init` exists (with `scaffold_config_if_missing`, `main.rs:1415`); the rest of the "init family" (`quickstart` MR-973, `serve` MR-970, `prune`/`init --force` MR-972/975, `mcp install`/skills MR-974, agent-mode MR-981) are **unbuilt tickets**, some stale.
- **Config still uses `aliases:`** (no `operations:` in code; MR-839 unbuilt). §6's reconciliation talks about `aliases:` as-is, noting `operations:` is a *proposed* rename.
- **`bearer_token_env` exists** (per-graph, `config.rs`); MR-971 flags a CLI-parity / server-side gap. The per-`servers.<name>` extension lands on top of that.
- **A top-level `omnigraph lint` command exists** (verified). A stored-query *registry* validator must pick a verb that doesn't read as a competing lint/check.
## Motivation
Three problems, in priority order:
- **No client→server targeting config.** The moment an operator stands up `omnigraph-server` — for bearer auth + Cedar at a network boundary + admission control + multi-graph routing — the CLI can't address it. `curl` is the fallback. There is no named, switchable, credential-carrying way to say "run this against `prod` on the team server."
- **Multi-server × multi-graph has no first-class expression.** OmniGraph genuinely runs N graphs per server across M servers. The same graph is **multi-homed**`s3://b/prod` may be `prod` on server A, `production` on server B, and opened directly by the CLI. Today's flat `graphs:` map (name→storage-URI) can't express "graph `production` on server `prod-eu`."
- **Solo-first and embedded-first are unserved by the remote story.** A solo developer with no projects should define everything in `~`. A developer iterating locally (embedded, no server) and then pointing at staging (remote) should change *one word*, not learn a second command surface.
MR-668 shipped the server side (multiple graphs per server). MR-969 ships the in-repo agent tool surface (stored queries / MCP). This RFC supplies the **client and config layer** that lets humans and agents target that surface coherently — the foundation under MR-973 / MR-974 / MR-981.
## Non-Goals
- **A control plane / dashboard for config.** Operators edit files and (for servers) restart. No runtime config-mutation API. Matches the MR-668 / MR-969 operational model.
- **Hot reload.** Restart-only for server-side config, matching MR-668 and MR-969.
- **Embedding secrets in any config file.** Credentials are by-reference; the git-ignored `auth.env_file` dotenv (or, later, the OS keychain) holds tokens. Never a committable `*.yaml`.
- **Renaming the project manifest by role.** No `omnigraph.server.yaml` / `omnigraph.client.yaml`. Role lives in sections, not filenames (see Design §3).
- **Dropping embedded mode.** Embedded-first is load-bearing for the file-naming decision; this RFC assumes it stays.
- **Cross-graph / cross-server tool listing in MCP.** Clients loop over per-graph catalogs (a MR-969 non-goal, restated).
## Background
OmniGraph runs on Lance 6.x: typed nodes/edges in per-type Lance datasets, atomic multi-table commits via a `__manifest` table, branchable and time-travelable. The CLI (`omnigraph`) operates the **embedded engine** directly against a storage URI — no HTTP client in its runtime dependencies. `omnigraph-server` (Axum) is a *separate* HTTP front-end over the same engine, with bearer auth + per-graph Cedar (MR-668). The two read the same `omnigraph.yaml` but never connect to each other.
OmniGraph **already has a credentials-by-reference mechanism**, which this RFC builds on rather than replacing: `TargetConfig.bearer_token_env` names the env var holding a graph's bearer token, and `auth.env_file` points at a git-ignored dotenv (`.env.omni`) that the CLI auto-loads into the process (`load_env_file_into_process`) with real-env-vars-win precedence; `resolve_remote_bearer_token` resolves a token via env var then dotenv named lookup. `.env.omni` is already in `.gitignore`.
The six **irreducible enablers** that drive the design (referenced as E1E6 below):
| # | Enabler | Consequence |
|---|---|---|
| E1 | A graph is a **self-contained storage URI**; the substrate (object store + manifest CAS) is the source of truth — no server required to read/write. | A graph is addressable **directly (embedded)**, not only via a server. |
| E2 | A server hosts **many graphs**; **many servers** exist. | The remote address space is **`{server} × {graph_id}`**. |
| E3 | The same graph is **multi-homed** under different per-locus names. | **Name ≠ identity.** Resolution is mandatory. |
| E4 | **Branch / commit / snapshot** are first-class addressable sub-state. | An address is *graph @ branch/snapshot*, not just graph. |
| E5 | Enforcement is **two-layered**: engine-layer Cedar (`_as` writers, works embedded) + HTTP-boundary bearer+Cedar (server only). | *How* you reach a graph determines *which* enforcement applies. |
| E6 | **Stored queries / MCP tools are a per-graph registry defined in the project config** (MR-969). | The **agent tool surface is version-controlled in the repo**. |
Competitors collapse dimensions OmniGraph keeps live: **Helix** fuses E2+E3 (one cluster = one graph); **namidb** fuses E1+E3 into the URI (`s3://b?ns=prod`) and serves one namespace per process. OmniGraph has all of E1E6 at once, so its config resolves a richer space — but the richness is *earned* by capability.
## Design
### 1. The address space and the `target` abstraction
Every OmniGraph address is a tuple:
```
(locus, graph, sub-state, credential)
locus = embedded(URI) XOR remote(server-endpoint) # E1, E2
graph = a URI (embedded) | a graph_id on a server (remote) # E3
sub-state = branch | snapshot # E4
credential = cloud-storage creds (embedded) | bearer token (remote) # E5
```
The config's only job is **name → this tuple**. Define one noun — a **target** — that resolves to either shape:
```yaml
targets:
dev: # embedded — substrate-direct (E1)
storage: s3://team-bucket/dev.omni
branch: main # sub-state (E4)
staging: # remote — resolves a server by reference (E2/E3)
server: staging # → looked up in `servers`
graph_id: prod # the graph's id on that server (defaults to the entry key)
branch: review
```
`--target staging` resolves: project `targets.staging``{server: staging, graph_id: prod, branch: review}``servers.staging``{endpoint, token-by-ref}` → final `(remote(https://…), prod, review, $TOKEN)`. Embedded targets skip the server hop and use cloud-storage credentials.
**Two concepts, not kubeconfig's three.** kube splits cluster / user / context; that 3-way split is its most-cursed UX. A target *bundles* server+graph+branch+defaults under one name; the **only** thing split out is `servers`, because endpoints+credentials are shared across many targets and are secret-bearing (different ownership and rate-of-change; see §2). Result: **2 nouns — `servers` and `targets`.** Embedded `targets` (`storage:`) subsume today's `graphs:` entries.
### 1.1 The resolved address is a typed *locator*, not a `uri` string
The shipped config models a graph as a single `uri: String`, and code branches on `is_remote_uri(uri)`. That conflates two structurally different addresses: an **embedded** graph is a *complete, self-contained* address — one storage URI = one graph, opened directly via the embedded engine; a **remote** graph is a *server endpoint + a `graph_id`* — one server hosts N graphs. A bare server URL **is not a graph**; it lacks the `graph_id`. The cost of the string model, in the code today:
- the CLI re-decides "server or file?" via `is_remote_uri` at ~16 call sites;
- `TargetConfig` (one `uri` field) **cannot express** multi-server × multi-graph or a multi-homed graph (E2/E3) — "graph `production` on server `prod-eu`" has no representation;
- the CLI **bails on remote URIs** for most operations, precisely because the string can't carry the `graph_id`;
- the `omnigraph-ts` SDK had to model `baseUrl` **+** `graphId` *separately* (rewriting `/graphs/{graphId}/…`) — it invented the structure the string lacks.
So the *resolved* address is a **typed locator**, not a string:
```rust
enum GraphLocator {
Embedded { storage: StorageUri }, // file:// , s3:// — a complete graph
Remote { server: ServerId, graph_id: GraphId }, // which server + which graph (+ bearer creds)
}
```
A `graphs:` entry resolves into this **once**; downstream code dispatches on the variant (the breadboard's `GraphConn = Embedded(engine) | Remote(http)`) instead of re-sniffing a scheme at each call site. The `uri` string becomes an *input format* for the embedded variant, never the address itself.
**YAML naming follows the locator — the *key* names the locus**, so neither the value's scheme nor a comment is load-bearing:
| Locus | Key | Value |
|---|---|---|
| Embedded | **`storage:`** (shipped `uri:` is a deprecated alias) | a storage URI (`s3://…`, `file://…`) |
| Remote | **`server:`** | a name in `servers:` (its `endpoint` + creds resolve by name, §5) |
| Remote graph id | **`graph_id:`** | the id on that server — **defaults to the entry key**; set only when the local alias differs |
An entry has `storage:` **xor** `server:` — the deserializer rejects *both* and *neither* (no silent ambiguity). This removes two prior confusions: `graphs:` (the map) vs `graph:` (the remote id), and `uri:`-might-be-a-server.
```yaml
servers:
prod-eu: { endpoint: https://og-eu.internal:8080 }
graphs:
dev: { storage: s3://team-bucket/dev.omni } # embedded
production: { server: prod-eu } # remote — graph_id = "production" (the key)
staging: { server: prod-eu, graph_id: prod } # remote — alias ≠ server's id
```
### 1.2 Invalid configs are rejected by design
The DX rule is: **a config field is either honored or rejected, never silently ignored**. The loader therefore has two phases:
1. Parse YAML into a loose/raw shape that preserves origin (`base_dir`, layer, line/path when available).
2. Convert once into a typed, role-aware resolved config. Every command receives the resolved form, not the raw YAML structs.
The typed graph shape is:
```rust
enum GraphEntry {
Embedded(EmbeddedGraphEntry),
Remote(RemoteGraphEntry),
}
struct EmbeddedGraphEntry {
storage: StorageUri,
branch: Option<BranchName>,
policy: Option<PolicyFile>,
queries: QueryRegistrySpec,
}
struct RemoteGraphEntry {
server: ServerId,
graph_id: GraphId,
branch: Option<BranchName>,
}
```
That makes these rules structural rather than advisory:
- A graph entry must specify **exactly one** locator: `storage:`/legacy `uri:` xor `server:`.
- `policy:` and `queries:` are valid only on `Embedded` graph entries, because they define the capability surface of a graph this process opens directly. A `Remote` graph entry points at a server; that server owns policy and stored-query definitions.
- `omnigraph-server` may serve only `Embedded` graph entries. A server manifest entry with `server:` is rejected: a server should not "host" a graph by proxying another server.
- A named graph uses its own graph entry. Top-level `policy:` / `queries:` are a legacy anonymous-bare-URI compatibility path only; if a named graph is selected while top-level blocks would be ignored, config validation errors with a migration hint.
- A client-defined remote graph discovers stored queries from the server (`GET /queries`) and invokes them (`POST /queries/{name}`); it does not define `queries:` locally for that remote graph.
Examples that must fail fast:
```yaml
graphs:
prod:
storage: s3://team-bucket/prod.omni
server: prod-us # invalid: storage xor server
```
```yaml
graphs:
prod:
server: prod-us
graph_id: production
policy: { file: ./policies/prod.yaml } # invalid: remote graph policy lives on the server
queries:
find_user: { file: ./queries/find_user.gq } # invalid: remote graph queries are discovered
```
`omnigraph config view --resolved --show-origin` is the user-facing debugger for this boundary: it shows the final `Embedded` or `Remote` graph and where every honored field came from. Fields that cannot be honored never make it into the resolved view; they fail validation first.
### 2. Layered config — global-first, uniform schema, project-optional
**Posture: global-first, project-optional.** OmniGraph's CLI is primarily a *client* (it operates against graphs and servers, embedded or remote), so it sits on the **global-first** side of the CLI-config axis — like `kubectl` / `aws` / `gh` / `docker`, and unlike *project-first* tools (`git` / `cargo` / `terraform`) whose primary config is per-repo. The **global user config is the primary, self-sufficient default**; the project file is an *optional* repo-scoped override (and, when present, the deployment manifest). `omnigraph query --target prod` must work from **any directory with no project file**, exactly as `kubectl get pods --context prod` works from anywhere. *(This is a deliberate flip from today, where the CLI reads `./omnigraph.yaml` and does not even walk parent dirs — i.e. today it is project-anchored.)*
**Rule: the two layers share ONE raw schema, and each is fully self-sufficient** (the git-layering mechanism — same schema at both levels; you never need a repo to have a working config). Do **not** specialize the file format by layer. Instead, run the same role-aware validation everywhere (§1.2): the global and project layers may both define graph locators, defaults, servers, and aliases, but fields that are meaningless for a resolved graph variant are rejected rather than ignored. For example, `queries:` is valid for an embedded graph this config opens directly; it is invalid on a remote graph entry because remote stored queries are server-owned and discovered.
This makes the **zero-project case the default, not an edge case**: a solo user (or an agent) defines everything needed for client work in `~/.omnigraph/config.yaml` — servers, embedded + remote graph locators, defaults, aliases, and optionally personal embedded-graph query registries — and **never creates a project file**. A team adds `./omnigraph.yaml` only when it wants repo-scoped overrides or a committed, GitOps'd deployment manifest. Global-first does **not** forbid project files; it stops *requiring* them (the kubectl model: `~/.kube/config` is sufficient and default; per-project kubeconfigs are opt-in via `KUBECONFIG`).
| Layer | Required? | Typical use | Path |
|---|---|---|---|
| Global | no | **the default** — solo/agent's entire config; shared servers+creds for teams; even a personal server's graphs/queries | `~/.omnigraph/config.yaml` |
| Project | no | **opt-in** — repo-scoped overrides + the committed deployment manifest (graphs, queries, policy) | `./omnigraph.yaml` |
**Precedence (low → high):** built-in defaults < global < project < env vars < CLI flags. With no project file it collapses to **built-in < global < env < flags** the common global-only path.
**Merge semantics — "closest layer wins, at the smallest meaningful unit"** (the field consensus: git / kubeconfig / cargo / Helm / VS Code):
- **Settings objects** (`defaults`, `auth`, `server`) → **deep-merge per field**: a project sets `defaults.graph` and *inherits* the global `defaults.output_format`. (VS Code / cargo behavior.)
- **Named-resource maps** (`servers`, `graphs` / compat `targets`, `queries`, `aliases`) → **union by key; on a collision the higher layer's entry REPLACES the lower wholesale***no field-level deep-merge within an entry*. (kubeconfig: union contexts by name.) The footgun this avoids: global `servers.prod = {endpoint, policy}`, project `servers.prod = {endpoint: other}` — deep-merge would silently retain the old fields; replace makes the project's `prod` self-contained and predictable.
- **Lists/arrays****replace, never append** (Helm convention; appending is order-sensitive and surprising).
- **Scalars** → higher layer wins.
- **Relative paths carry their origin's base_dir.** A `queries:` entry's `.gq` path, or a `policy.file`, resolves against the directory of the layer it was *defined in* — global entries under `~/.omnigraph/`, project entries under the project dir.
- **Inspectable (non-negotiable):** `omnigraph config view --resolved --show-origin` prints each final value *and which layer set it* (the `git config --show-origin` / `kubectl config view` rule). A layered config without origin-tracing is a debugging trap.
### 3. Roles, and the file-naming decision (same name for project = server)
`omnigraph.yaml` carries two *roles* that diverge in prod and collapse on a laptop:
- **Server role** (read by `omnigraph-server`): `graphs:` entries that are **embedded storage locators**, per-graph `policy.file`, **`queries:` — the stored-query/MCP registry lives here**, plus serving knobs. Remote graph locators are rejected in this role.
- **Client role** (read by the CLI/agent): `servers:`, embedded or remote `graphs:` locators, `defaults:`, `aliases:`. A remote graph locator points at server-owned capabilities; it cannot define local `policy:` or `queries:`.
**Project config and server config are the same artifact, hence the same name.** The server *serves the project*: the file that says "these graphs exist, with these stored queries and this policy" is simultaneously the project manifest and the server's deploy config. Role is distinguished by which *sections* are populated, never by filename. Readers ignore sections that are not theirs (today's file already does this with `cli:` vs `server:`).
**Why not kube's role-split.** Two coherent models exist: (A) one project file with role-sections (Helix `helix.toml` holds both `[local.dev]` and `[enterprise.production]`; compose; Cargo), and (B) deployment-manifest strictly separate from client config (kubectl — you never put a context in `deployment.yaml`). kube is the sharpest topological analog (multi-server × multi-graph, one client targeting many), so B has a real claim. The tiebreaker is **E1: OmniGraph is embedded-first.** In embedded mode the manifest's `graphs:` *is* the local target list — manifest and local-client-view are the same object, so splitting them (B) fights the grain and forces two files for local work. kube splits because it has **no** embedded mode (client always remote+global). So: take the half kube is right about — *remote* client targeting (`servers:`, endpoints, creds) is a separate concern in a separate **user-global** file (`config.yaml`, like `~/.kube/config`); reject the half it is wrong about for us — do **not** split the *project* layer by role. **The second name (`config.yaml`) is justified by scope (user-global), not role.** *(If OmniGraph ever dropped embedded mode and went pure-remote, model B's strict split would become cleanest.)*
### 4. File naming
Principles from the field: **one global dir** `~/.omnigraph/` (like `~/.aws`/`~/.kube`/`~/.helix`), with config/cache/state as **subdirectories** (separation without XDG's three-root scatter); **secrets keyed by server name in the OS keychain or a separate git-ignored profile file** (AWS/gh model, not a new `credentials.yaml`); **project-root manifest keeps the app-named file** (`Cargo.toml`, `package.json`); **`.yaml`, not `.yml`**; keep OmniGraph's established names. The genuinely *new* decisions are the **global** dir's existence and keyed-by-name resolution with an explicit `auth.token` override (MR-971); the shipped `bearer_token_env` + `auth.env_file` mechanism remains as legacy compat.
| Artifact | Path / name | Why |
|---|---|---|
| Project = server config (one artifact) | `./omnigraph.yaml` | **Keep.** Root manifest like `Cargo.toml` / `compose.yaml` / `helix.toml`. Same name for both roles because it is one file. In prod the server's deploy repo and an app repo each have their own `omnigraph.yaml` — same name, different repos. |
| Global user config | `~/.omnigraph/config.yaml` | **One dir** (`~/.omnigraph/`, like `~/.aws`/`~/.kube`/`~/.helix`). Named `config.yaml` *not* `omnigraph.yaml` — the name signals scope (and `~/.aws/config`, `~/.kube/config`, `~/.helix/config` all do this). Holds the full schema so a solo user needs nothing else. |
| Credentials | OS keychain (`omnigraph:<server>`, preferred) → `~/.omnigraph/credentials` profile file (`[<server>]`, `0600`, git-ignored). **Keyed by server name**, inside the one dir. | **Key by name, AWS/gh model**`~/.aws/credentials [profile]`, `~/.kube/config users:`, `~/.helix/credentials`. *Not* a `credentials.yaml`, and *not* a per-server hand-named env var; the secret lives under the server name (no indirection). Legacy `bearer_token_env` + `.env.omni` dotenv remain as a compat path. See §5. |
| Cache / state | `~/.omnigraph/cache/`, `~/.omnigraph/state/` | Subdirs of the one dir (like `~/.aws/sso/cache/`, `~/.kube/cache/`) — cache is `rm -rf`-safe and backup-excludable without scattering across XDG roots. |
| Cedar policy | `./policies/<env>.yaml` + `<env>.tests.yaml` | **Keep.** Referenced by `policy.file`. |
| Schema | `./*.pg` (e.g. `schema.pg`) | **Keep.** |
| Stored queries | `./queries/*.gq` | **Keep.** `.gq` sources referenced by the `queries:` registry. |
**Global dir: `~/.omnigraph/` — one place, with subdirectories.** Everything OmniGraph keeps for a user lives under a single `~/.omnigraph/` directory, matching the peer group (`~/.aws`, `~/.kube`, `~/.docker`) and the direct competitor (`~/.helix`). This is what DB/cloud-CLI users expect and the lowest-cognitive-load shape.
*Separation and "one place" are not in conflict* — the decisive realization. The peer tools get config/cache/state separation via **subdirectories inside the one dir**, not via XDG's three scattered roots: `~/.aws/sso/cache/`, `~/.kube/cache/`. So OmniGraph keeps `~/.omnigraph/config.yaml`, `~/.omnigraph/credentials`, `~/.omnigraph/cache/` (catalogs — `rm -rf`-safe, backup-excludable), `~/.omnigraph/state/` (session, logs) — getting cache hygiene **and** a single discoverable location, without the XDG scatter. An earlier draft argued XDG on a false dichotomy (it assumed single-dir ⇒ mixed); subdirs dissolve it. `~/.omnigraph/` is canonical and documented; `$XDG_CONFIG_HOME` may optionally be honored if a user has set it, but XDG is not part of the mental model.
**Env / override precedence (the `KUBECONFIG` analog):**
- `OMNIGRAPH_CONFIG=/path` — explicit config file, highest precedence.
- `OMNIGRAPH_HOME=/path` → the global dir (default `~/.omnigraph/`); `$XDG_CONFIG_HOME` optionally honored if a user has set it, but `~/.omnigraph/` is canonical.
- Cache and state are subdirs of the one dir: `~/.omnigraph/cache/` (cached remote catalogs), `~/.omnigraph/state/` (session, logs).
- Per-server token resolution: an explicit `auth: { token: {...} }` source (env/file/command/keychain) wins if set; otherwise **keyed by the server name**`OMNIGRAPH_TOKEN_<NAME>` (or `OMNIGRAPH_TOKEN` for the active server) → OS keychain `omnigraph:<name>` → the `[<name>]` profile in `~/.omnigraph/credentials`; legacy `bearer_token_env` still honored. See §5.
### 5. Credentials, connection tiers, and bind portability (12-factor)
**Credentials are by-reference everywhere, never inlined — and keyed by the *server name*, not by a hand-invented env-var name.** This is the one place the design departs from simply reusing the shipped `bearer_token_env` mechanism, because that mechanism is sub-optimal for a multi-server client: it forces the operator to invent and coordinate an env-var name per server (three steps to add a server: pick a var, name it in config, set it in the store). The peer group (AWS profiles, `gh` hosts, kubeconfig users, docker auths) instead keys the secret **by the server's name** — no indirection. OmniGraph should match that.
**Resolution for server `<name>` (no config field required):**
1. **`OMNIGRAPH_TOKEN_<NAME>`** env var (name-derived, upper-snake), else **`OMNIGRAPH_TOKEN`** for the active server — the CI/headless override (12-factor).
2. **OS keychain** entry `omnigraph:<name>` — the preferred interactive store (no plaintext on disk); written by `omnigraph login <name>`.
3. **`~/.omnigraph/credentials`** — an AWS-style profile file keyed by server name (mode `0600`, git-ignored), the fallback when no keychain:
```ini
[prod-us]
token = …
[prod-eu]
token = …
```
So a `servers.<name>` with no token field resolves by name — adding a server is one step (`omnigraph login <name>`), and "multiple servers, multiple tokens" falls out for free.
**But implicit must not be the *only* path — explicit sourcing is a first-class option** (the DX/AX lesson). Pure-convention is invisible (you must *know* `OMNIGRAPH_TOKEN_<NAME>`), can't integrate with a secrets-manager's fixed var name, and can't do dynamic/short-lived tokens. So a server may declare an explicit `auth:` block — a **method-agnostic wrapper** (today only `token:` for bearer; `mtls:`/`oidc:` are the future siblings, so the credential model never has to be re-keyed) holding a tagged token *source*. Secrets are *still* never inlined (every source is a reference):
```yaml
servers:
prod-us:
endpoint: https://og-us…
auth: { token: { env: OG_PROD_US_TOKEN } } # explicit env var — self-documenting (= legacy bearer_token_env)
prod-eu:
endpoint: https://og-eu…
auth: { token: { command: [vault, read, -field=token, secret/og] } } # dynamic / short-lived
edge:
endpoint: https://og-edge…
auth: { token: { file: /run/secrets/og-token } } # k8s/docker mounted secret
staging:
endpoint: https://og-staging… # no auth: → implicit chain (below)
```
| `auth.token:` source | when | DX/AX value |
|---|---|---|
| *(auth omitted)* | the common case | zero-config; `omnigraph login` populates keychain `omnigraph:<name>` |
| `{ env: VAR }` | secrets-manager / CI injects a fixed var | **self-documenting** — config states the source; = the legacy `bearer_token_env` |
| `{ file: PATH }` | k8s/docker secret mounted as a file | no env plumbing |
| `{ command: [...] }` | Vault, cloud IAM, `gh auth token` | **dynamic tokens** — first-class exec, the capability pure-env/keychain can't give (kube `exec` / AWS `credential_process`) |
| `{ keychain: ENTRY }` | pin a non-default keychain entry | explicit override of the name-derived default |
**Resolution per server:** if `auth.token:` is set, use that source (no fallthrough). Else the **implicit chain**: `OMNIGRAPH_TOKEN_<NAME>` (or `OMNIGRAPH_TOKEN` for the active server) → keychain `omnigraph:<name>``[<name>]` in `~/.omnigraph/credentials` (`0600`, git-ignored). `omnigraph login <server>` writes/rotates only that server's secret; per-server precedence is independent; sharing is opt-in (same env var or source). The `command` source runs locally with the operator's own privileges and is defined only in operator-owned config (never server-supplied), so it adds no remote-execution surface. The `auth:` wrapper is method-agnostic so adding mTLS/OIDC later is a new sibling key, not a breaking re-key (Hyrum's Law: the field name is a contract once shipped). There is **no `credentials.yaml`** and **no inlined secret**. *Convention for the floor, explicit for control — and explicit is legible to agents and never inlines a secret.*
**Back-compat.** The shipped per-graph `bearer_token_env` + `auth.env_file` dotenv (`resolve_remote_bearer_token`, real-env-wins) keeps working unchanged for existing single-server setups; `bearer_token_env` is just the legacy flat alias for `auth: { token: { env } }`. Resolution tries an explicit `auth.token:` (or legacy `bearer_token_env`) first, then the keyed-by-name chain — so nothing breaks, but the zero-config default is the no-boilerplate keyed-by-name path. (MR-971 — the `bearer_token_env` parity gap — is where this resolver work lands.)
**Three connection tiers** (Supabase/Prisma teach the zero-config floor):
1. **Env vars**`OMNIGRAPH_SERVER=https://…` + `OMNIGRAPH_TOKEN=…`: zero-config remote, no file (the `DATABASE_URL` floor).
2. **Global `config.yaml`** — named `servers:` + `graphs:` for multi-server setups (the AWS-profiles convenience).
3. **Project `omnigraph.yaml`** — project-pinned targets/graphs, committed.
**Keep `omnigraph.yaml` a *portable* manifest (12-factor).** Deploy-specific runtime that varies per environment — the **bind host/port**, worker counts — should be supplied by **`--bind` / `OMNIGRAPH_BIND` (flags/env)**, *not* a committed `server.bind:` baked into the manifest. A manifest that hardcodes `0.0.0.0:8080` is not portable across deploys and leaks an environment detail into a version-controlled file. The same-named `omnigraph.yaml` stays portable across deploys precisely because the volatile, per-environment knobs live in env/flags (12-factor config), while the stable, portable definition (graphs, queries, policy) lives in the file. This is the one concrete lesson taken from kube's model-B without adopting its file split: portability via env/flags, not via a second file.
### 6. Where stored queries live: defined locally, invoked remotely
A stored query splits across two axes; do not conflate them:
- **Definition** (`.gq` source + `queries:` entry) lives next to the **embedded graph entry that owns it**. For a hosted remote graph, that is the **deployment manifest** read by `omnigraph-server`; for a personal embedded graph, it may be the user's own config. It never lives on a client-side `Remote` graph entry.
- **Discovery** ("what tools exist for me?") is fetched from the **server** (Cedar-filtered `GET /queries` / MCP catalog) at connect time.
- **Invocation** is **remote** (client → server, HTTP/MCP) — or **embedded** (the CLI opens the graph directly and reads the same manifest).
For remote use, the client carries *pointers to servers*, not query definitions; it **discovers and invokes**, never defines. This is the **capability-as-code guarantee for agents**: an agent can only invoke tools the server's *committed, reviewed* config exposes — it **cannot define a new tool at runtime**. Definition is structurally outside the agent's reach.
`queries:` (graph-capability registry, Cedar-gated when served remotely, MCP-visible when exposed) and `aliases:` (client CLI shortcut) overlap — both can name `.gq`-backed operations. This RFC keeps them siblings (the MR-969 decision); the clean long-term is **one registry, two invocation surfaces** (embedded + remote), with `aliases:` subsumed. Out of scope here.
#### Reconciling `aliases:` with the role model
`aliases:` is the pre-MR-969, **client-role, embedded-only, ungated** ancestor of `queries:`. An alias bundles `command` (read/change), `query` (`.gq` path), `name` (symbol), `args` (positional param names), and `graph`/`branch`/`format` defaults; the CLI runs it embedded. The server never reads it. So:
- **Role:** `aliases:` is **client-role** (CLI behavior) → it may live in **both** the user-global `config.yaml` and the project manifest, layered. `queries:` is **graph-capability role** → it lives only on an `Embedded` graph entry, and for remote server graphs that means the server deployment manifest. *Who opens the graph determines where query definitions can live.*
- **Difference:** `aliases:` = embedded invocation, no gating, explicit `command`, bundles client defaults + positional args. `queries:` = remote (+future embedded), Cedar + `mcp.expose`, **infers** read/mutate, bundles only MCP settings.
- **Convergence:** decompose an alias — *definition* (name→.gq+symbol) → `queries:` (the superset: typed, validated, gated, multi-surface, no redundant `command`); *target/branch/format* → client invocation context (`--target`/`--branch`/`--format` or `defaults:`), not baked per-query; *positional `args`* → thin CLI sugar or dropped (agents/services use named JSON params). End-state: one `queries:` registry + the client config model subsumes `aliases:`.
- **Validation:** a file-backed alias (`query: ./foo.gq`) may target only an embedded graph. A remote graph shortcut must be explicit that it invokes a server-owned stored query, e.g. `invoke: find_user`, so the client cannot smuggle a new `.gq` definition into a remote capability surface.
- **v1:** keep `aliases:` unchanged. Footgun worth a load-time warn: an alias and a query with the same name in one manifest are different namespaces invoked differently (`--alias X` vs `POST /queries/X`).
```yaml
aliases:
local_owner:
command: query
query: ./queries/owner.gq
name: owner
graph: dev # valid only if `dev` resolves Embedded
remote_owner:
invoke: find_user
graph: prod # valid only if `prod` resolves Remote; source lives on the server
args: [name]
```
### 7. CLI surface
- `omnigraph login <server>` — interactive auth; stores the token keyed by server name in the OS keychain (`omnigraph:<server>`) or the `[<server>]` profile of `~/.omnigraph/credentials` (0600). The `gh auth login` analog.
- `omnigraph use <graph>` — set the active graph (writes the appropriate layer). The `kubectl config use-context` analog.
- `omnigraph config view [--resolved] [--show-origin] [<graph>]` — print the merged config and, with `--resolved`, the final tuple **plus the origin layer of every field** (the `git config --show-origin` / `kubectl config view` analog). Resolution is never a mystery.
- All existing verbs (`query`, `mutate`, `load`, `schema`, `branch`, …) gain `--graph <name>`; resolution decides embedded vs remote transparently.
### 7.5 Init, login, and bootstrap — three tiers (folds in the Q2 design)
Scaffolding splits into three tiers by *scope* and *fatness*, mirroring the field (supabase `init` vs `login`; HelixDB thin `init` vs fat `chef`). Most of this lives in sibling tickets; this RFC owns only the **user route**.
| Tier | Command | Scope | What it does | Model | Status |
|---|---|---|---|---|---|
| **User route** | `omnigraph login [<server>]` | user (`~/.omnigraph/`) | auth + write `~/.omnigraph/config.yaml` / `credentials`; first-run global setup | gh / supabase `login` | **this RFC** (unbuilt) |
| **Thin project init** | `omnigraph init` | project, in-place | create graph + `scaffold_config_if_missing` (`omnigraph.yaml` + minimal `.pg`/`.gq`); refuse-if-exists or `--force` | `cargo init`, `prisma init` | exists; `--force` purge = MR-975 |
| **Fat bootstrap** | `omnigraph quickstart [--template <t>] [--auto]` | project, possibly new-dir | scaffold + seed data + `serve start` + agent prompt file | HelixDB `chef`, `create-next-app` | MR-973 (unbuilt) |
**Design positions** (first-principles, since none of the fat tier is built):
- **Split `init` (project) from `login` (user)** — never one command writing to both `$HOME` and the project (the supabase line, not the dbt line). `init`=project scaffold; `login`=user credential + global config.
- **`init` is in-place + refuse-if-exists** (cargo/prisma/terraform default): don't clobber; adopt existing files; require `--force` to overwrite (and `--force` purges Lance state per MR-975).
- **Interactive for humans, `--auto`/agent-mode for automation** (npm `-y`, create-* `--CI`, MR-981 `--machine`). In `OMNIGRAPH_AGENT_MODE` any prompt → fail with a repair hint.
- **Templates are a `--template <name>` flag on the fat tier** (create-vite model), with the *content* (schema + queries + seed) coming from a template source. Mechanism is a design question (bundled-in vs `og template pull` from a repo vs `npm create-*`-style delegation) — **not** an existing foothold (MR-581 stale). Lean: a small set of bundled templates first (generic `Person→Knows`, plus promote `omnigraph-intel-bootstrap`), `--template <github>` later.
- **`init`/`quickstart` can scaffold the `graphs:` map with one or more entries**; "init with specific graphs" = the scaffolded `graphs:` block (embedded `storage:` locally; the agent/operator adds remote `server:` entries via `login` + editing).
- **Secrets-on-scaffold rule** (prisma/dbt/supabase all do this): anything that writes a token also keeps it out of VCS. `login` prefers the OS keychain (no file); the `~/.omnigraph/credentials` profile fallback is `0600` and git-ignored, and any project-local `.env`-shaped file gets a `.gitignore` entry.
### 8. Concrete shape
**Global** `~/.omnigraph/config.yaml` (per-user, secret-free):
```yaml
servers: # endpoint only — token is keyed by the server name
prod-us: { endpoint: https://og-us.internal:8080 }
prod-eu: { endpoint: https://og-eu.internal:8080 }
staging: { endpoint: https://og-staging.internal:8080 }
graphs:
personal: { storage: ~/graphs/personal.omni }
defaults:
graph: personal
aliases:
my_people:
command: query
query: ~/queries/people.gq
name: list_people
graph: personal
```
**Project client** `./omnigraph.yaml` (committed, secret-free, portable — no `server.bind`). Note the shipped noun is `graphs:` (MR-603); an entry is embedded (`storage:`) XOR remote (`server:` + `graph_id:`, §1.1):
```yaml
graphs:
dev: { storage: s3://team-bucket/dev.omni, branch: main } # embedded
staging: { server: staging, graph_id: prod, branch: review } # remote → graph `prod` on server `staging`
prod-us: { server: prod-us, graph_id: production }
prod-eu: { server: prod-eu, graph_id: production } # multi-homed: same graph, another server
defaults: { graph: dev, output_format: table }
aliases:
owner:
command: query
query: ./queries/owner.gq
name: owner
args: [name]
graph: dev
```
Select with `--graph <name>` (shipped flag, MR-603).
**Server deployment** `./omnigraph.yaml` (committed in the deploy repo, read by `omnigraph-server`). Every served graph is an embedded storage locator; server-owned policy and stored-query definitions live here:
```yaml
graphs:
production:
storage: s3://team-bucket/prod.omni
policy:
file: ./policies/prod.yaml
queries:
find_user:
file: ./queries/find_user.gq
mcp: { expose: true, tool_name: lookup_user }
server:
policy:
file: ./policies/server.yaml
```
**Credentials** are keyed by server name — `omnigraph login prod-us` writes the OS keychain entry `omnigraph:prod-us` (or a `[prod-us]` profile in `~/.omnigraph/credentials`, 0600, git-ignored); `OMNIGRAPH_TOKEN_PROD_US` overrides for CI. No token fields in any config file; no committable secrets.
## DX
1. **One command surface, two loci.** `query --graph dev` (embedded) and `--graph staging` (remote) are the same command; only resolution differs. Change one word, not a mental model.
2. **Clone-and-go.** Project config names servers+graphs; teammate runs `omnigraph login staging` once and every target resolves. The git + `gh auth login` model.
3. **Multi-server × multi-graph is the default.** Remote graph entries reference `server` by name; `servers` is a global named map; graphs are per-server. `prod-us` and `prod-eu` both serving `production` is two graph entries — Helix cannot express this.
4. **Solo-first.** Everything in `~`, no project required.
5. **Laptop-to-fleet on one schema.** Local = one `omnigraph.yaml` (both roles); prod = role-split across repos. No second format to learn.
## AX (agent experience)
1. **One flat resolved context, never a config to navigate.** target→server→endpoint→token resolves *before* the agent sees anything. The agent reasons about tools, not topology (the LLM-safe-surface principle extended to config).
2. **Secrets are structurally outside the agent's reach.** The repo it operates in has no tokens; they are in the global layer / keychain, outside its view. An agent *cannot* exfiltrate a prod token from project config because it is not there.
3. **Branch/snapshot-pinned contexts** (E4) — hand an agent a `branch: review` / `--snapshot v42` target and its reads are reproducible and cannot see uncommitted main-line state. No kubeconfig analog.
4. **The agent's capabilities are a GitOps'd artifact** (E6) — which graphs exist, which stored-query tools it may call, and which Cedar rules gate them are all in the version-controlled server config. Powers change only via a reviewed PR, deployed by restart. Infrastructure-as-code for what the AI can do.
5. **Config + policy compose.** Config = "where am I pointed + which token"; Cedar = "what may I do there." Orthogonal; no enforcement logic leaks into config.
## GitOps — three surfaces, secrets in none
| Surface | Repo | Contents | Deploy | Secrets |
|---|---|---|---|---|
| Server deployment config | infra/deploy repo | `graphs:`, policy, **`queries:` + `.gq` files** | commit → CI → **server restart** (no hot reload) | none — by-reference |
| Project client config | app repo | `graphs:` → embedded storage or remote server+graph | committed, read by CLI/agent | none |
| Global user config | **not GitOps'd** — machine-local `~` | `servers:` + creds-by-ref | `omnigraph login` writes it | refs only (like `~/.kube/config`) |
## Comparison
| Property | kubeconfig | Helix | git | compose | **OmniGraph (this RFC)** |
|---|---|---|---|---|---|
| Named remote endpoints + creds-by-ref | ✅ | ✅ | partial | partial | ✅ (global `servers`) |
| Global + project layering, uniform schema | ✗ | ✗ | ✅ | ✗ | ✅ |
| Embedded OR remote under one name | ✗ | ✗ | n/a | ✗ | ✅ (E1) |
| Multi-server × multi-graph | ✅ | ✗ | n/a | n/a | ✅ (E2) |
| Branch/snapshot in the address | ✗ | ✗ | partial | ✗ | ✅ (E4) |
| Agent tool surface in the repo | ✗ | ✗ (separate bundle) | n/a | n/a | ✅ (E6) |
| Project manifest renamed by role | — | no | — | no | **no** |
| Concept count | 3 | 1 | 2 | 1 | **2 (servers/targets)** |
## Migration / backwards compatibility
- **Additive.** Today's `omnigraph.yaml` (`graphs:`, `cli:`, `server:`, `aliases:`, `policy:`) keeps working unchanged. `graphs:` entries are equivalent to embedded `targets:` with a `storage:` (shipped `uri:` is a deprecated alias); both resolve.
- **`targets:` is new** and optional. `servers:` is new and optional. Absent → today's behavior.
- **Global `~/.omnigraph/config.yaml` is new.** Absent → only project + env + flags, exactly as now. Its addition is the **global-first posture flip**: today the CLI is project-anchored (reads `./omnigraph.yaml`, no parent walk); the global config becomes the new primary discovery path so the CLI works with no project file. Existing project-only workflows are unchanged (project still overrides global); the flip is additive — it adds a fallback layer below the project file, it does not remove the project file.
- **`graphs:``targets:` is an evolution, not a break.** Both can coexist; `targets:` is the superset (adds remote + branch pinning). A future cleanup may alias `graphs:` to embedded `targets:`.
- **`server.bind` stays supported** but documentation steers operators to `--bind` / `OMNIGRAPH_BIND` for portability; no removal.
- **Credentials: keyed-by-name is new; `bearer_token_env` is the compat path.** The primary design (keychain / `[<server>]` profile / `OMNIGRAPH_TOKEN_<SERVER>`) is new resolver work (lands on MR-971). The shipped `bearer_token_env` + `auth.env_file` dotenv (`resolve_remote_bearer_token`) is **unchanged and still honored** — existing single-server dotenv setups keep working, and the resolver honors an explicit `auth: { token: {...} }` source (env/file/command/keychain) with `bearer_token_env` as its flat legacy alias. No `credentials.yaml`.
- **Validation tightens invalid mixes, not valid legacy use.** Top-level `policy:` / `queries:` remain only for anonymous bare-URI compatibility. Named graphs use per-entry fields. Remote graph entries with local `policy:` / `queries:` and server manifests with `server:` graph locators are rejected because there is no correct way to honor those fields.
## Open questions
- **`graphs:` vs `targets:` naming churn.** Do we rename `graphs:``targets:` (with a deprecation alias) or keep `graphs:` for embedded and add `targets:` for remote? Leaning: keep both, document `targets:` as the superset.
- **Keychain integration scope.** Keychain is now the *primary* credential store (§5), so this is on the critical path, not optional: macOS Keychain first (matches operator practice) with the `0600` `[<server>]` profile file as fallback; Linux Secret Service / `pass` later. Open: which keyring crate, and the exact `OMNIGRAPH_TOKEN_<SERVER>` name-derivation (upper-snake, non-alnum → `_`).
- **Project-local `servers:`.** Allowed (e.g. a localhost dev server), merged with global. Confirm creds stay by-reference even for project-local servers (yes).
- **`aliases:``queries:` convergence.** Out of scope here; tracked separately. One registry with embedded + remote invocation surfaces is the target end state.
- **Single-file `KUBECONFIG`-style list.** Do we support `OMNIGRAPH_CONFIG` pointing at multiple files (colon-joined), or a single file only? Start single; revisit if demand appears.
## Implementation — breadboard + slices (Shape A)
Shaped via requirements + a fit check (Shape A — global-first layered config + unified `graphs:` entry + three-tier init — selected over a project-first minimal option and a Helix-clone). This section breadboards A and slices it. **Bold** = NEW.
### Places
| # | Place | What |
|---|---|---|
| P1 | Disk | `~/.omnigraph/{config.yaml, credentials, cache/, state/}` + project `omnigraph.yaml` + `.env.omni` |
| P2 | Config resolution | runs on every command: load layers → merge → resolve `--graph` |
| P3 | Command execution | embedded engine OR remote HTTP client |
| P4 | Remote `omnigraph-server` | existing HTTP surface (`/query`, `/mutate`, `/queries/{name}`) |
| P5 | Scaffold | `login` / `init` / `quickstart` |
### Affordances
| # | Place | Affordance | NEW? | Wires |
|---|---|---|---|---|
| U1 | P1 | `~/.omnigraph/config.yaml` (operator edits) | **N** | → N1 |
| U2 | P1 | project `./omnigraph.yaml` | — | → N1 |
| U3 | P1 | `~/.omnigraph/credentials` / `.env.omni` dotenv (secrets, git-ignored) | — | → N4 |
| U4 | P3 | `omnigraph <verb> --graph <name>` (any command) | — | → N14 |
| U5 | P5 | `omnigraph login [<server>]` | **N** | → N11 |
| U6 | P5 | `omnigraph init` / `quickstart [--template]` | partly | → N12 / N13 |
| U7 | P2 | `omnigraph config view --resolved --show-origin` | **N** | → N10 |
| N1 | P2 | `load_layered_config()` — global (N3) + project (cwd), serde each | **N** | → N2 |
| N2 | P2 | **merge engine** — deep-merge settings; replace named-resource entries; replace lists; **retain provenance** and raw field origins | **N⚠** | → N5, → S_merged |
| N3 | P2 | global-dir resolver — `OMNIGRAPH_HOME` else `~/.omnigraph/` | **N** | → N1 |
| N4 | P2 | `load_env_file_into_process` — dotenv, real-env-wins (existing) | — | → N9 |
| N5 | P2 | `resolve_graph(name, merged)` → typed `Embedded`/`Remote` locator; rejects invalid role/field combinations before execution | **N⚠** | → N6 |
| N6 | P3 | `GraphConn``Embedded(engine)` \| `Remote(http)` dispatch | **N⚠** | → N7, → N8 |
| N7 | P3 | embedded path — `Omnigraph::open(uri)` (existing) | — | → engine |
| N8 | P3 | **HTTP-client path** — POST `/query`/`/mutate`/`/queries/{name}` | **N⚠** | → P4, → N9 |
| N9 | P2 | `resolve_bearer_token(server)` — explicit `auth.token` source if set, else **keyed by name**: `OMNIGRAPH_TOKEN_<NAME>`/`OMNIGRAPH_TOKEN` → keychain `omnigraph:<name>``[<name>]` profile; legacy `bearer_token_env`/dotenv (MR-971) | **N⚠** | → N8 |
| N10 | P2 | `config view` handler — merged + per-field origin (needs N2 provenance) | **N** | → U7 |
| N11 | P5 | `login` handler — interactive auth → write `config.yaml` + `credentials` (0600) + `.gitignore` | **N⚠** | → S_global |
| N12 | P5 | `init` handler — `scaffold_config_if_missing` + create graph; refuse-if-exists/`--force` purge (MR-975) | partly | → S_project |
| N13 | P5 | `quickstart` handler — scaffold + `--template` + seed + `serve start` + agent prompt (MR-973; needs serve MR-970) | **N⚠** | → S_project |
| N14 | P3 | agent-mode wrapper — `--machine`/`OMNIGRAPH_AGENT_MODE`: JSON, structured errors, never-prompt, typed exit codes (MR-981) | **N⚠** | → N1 |
| S_global | P1 | `~/.omnigraph/config.yaml` + `credentials` | **N** | read by N1/N9 |
| S_project | P1 | `./omnigraph.yaml` + `.env.omni` | — | read by N1/N4 |
| S_merged | P2 | in-memory resolved config (per command, with provenance) | **N** | read by N5/N10 |
| S_cache | P1 | `~/.omnigraph/cache/` (remote catalogs) | **N** | read by N8 |
```mermaid
flowchart TB
subgraph P1["P1: Disk"]
U1["U1: ~/.omnigraph/config.yaml"]
U2["U2: ./omnigraph.yaml"]
U3["U3: credentials dotenv"]
end
subgraph P2["P2: Config resolution"]
N3["N3: global-dir (OMNIGRAPH_HOME)"]
N1["N1: load_layered_config"]
N2["N2: merge engine (+provenance)"]
N4["N4: dotenv loader"]
N5["N5: resolve_graph(--graph)"]
N9["N9: resolve_bearer_token"]
N10["N10: config view"]
end
subgraph P3["P3: Command execution"]
U4["U4: omnigraph <verb> --graph"]
N14["N14: agent-mode wrapper"]
N6["N6: GraphConn embedded|remote"]
N7["N7: embedded Omnigraph::open"]
N8["N8: HTTP-client POST"]
end
subgraph P5["P5: Scaffold"]
U5["U5: login"]; U6["U6: init/quickstart"]
N11["N11: login handler"]; N12["N12: init"]; N13["N13: quickstart"]
end
P4["P4: remote omnigraph-server"]
U1-->N1; U2-->N1; N3-->N1; N1-->N2-->N5-->N6
U3-->N4-->N9-->N8
U4-->N14-->N1
N6-->N7; N6-->N8-->P4
N2-->N10-->U7["U7: config view --resolved"]
U5-->N11; U6-->N12; U6-->N13
classDef ui fill:#ffb6c1,stroke:#d87093,color:#000
classDef n fill:#d3d3d3,stroke:#808080,color:#000
class U1,U2,U3,U4,U5,U6,U7 ui
class N1,N2,N3,N4,N5,N6,N7,N8,N9,N10,N11,N12,N13,N14 n
```
### Slices (vertical, each demo-able)
| # | Slice | Parts/affordances | Demo |
|---|---|---|---|
| **V1** | **Global layer + merge + `config view`** | A1A4 · N1,N2,N3,N10 · U1,U7,S_global,S_merged | Put config in `~/.omnigraph/`, run `omnigraph config view --resolved --show-origin` from any dir → merged result with per-field origin; existing embedded commands work global-first with no project file |
| **V2** | **Remote graphs + HTTP client + creds** | A5A7 · N5,N6,N8,N9 · S_cache | Define a `server:` graph entry; `omnigraph query --graph prod` hits the remote server (`curl`-free); embedded `--graph dev` still local |
| **V3** | **`omnigraph login`** | A8 · N11,U5 | `omnigraph login prod` writes `~/.omnigraph/credentials` (0600) + `.gitignore`; V2 remote query now works with no manual env |
| **V4** | **Thin-init hardening + quickstart + templates** | A9 · N12,N13,U6 (needs serve MR-970) | `omnigraph quickstart --template person-knows` scaffolds + seeds + serves; `init --force` purges (MR-975) |
| **V5** | **Agent-mode** | A10 · N14,U4 (MR-981) | `OMNIGRAPH_AGENT_MODE=1 omnigraph query …` → JSON + structured errors + typed exit codes; never-prompt |
V1 is the foundation (global-first + merge + view). V2 closes the substantive client→server gap. V3 is credential ergonomics. V4/V5 ride sibling tickets (MR-970/973/981). MR-969 (stored queries) ships independently and is reached by N8's `/queries/{name}` once V2 lands.
## Rollout
The slices above are the rollout order: **V1 (global layer + merge) → V2 (remote graphs + HTTP client) → V3 (login) → V4 (quickstart/templates, on MR-970) → V5 (agent-mode, MR-981).** V1V2 close the substantive gap (global-first config + `curl`-free server access); V3V5 are ergonomics that ride sibling tickets. Evaluate after V2 against early-adopter and agent-onboarding (MR-973 / MR-974) signal. The spikes (X1 HTTP-client, X2 merge engine, X3 resolver+provenance, X4 login) resolve before their owning slice.
## Prior art
- kubeconfig (clusters / users / contexts; `KUBECONFIG`; `kubectl config view`)
- Helix CLI v2 (`helix.toml` local+enterprise instance blocks; `~/.helix/config`; `~/.helix/credentials`)
- AWS CLI (`~/.aws/config` + `~/.aws/credentials` split; named profiles; `credential_process`)
- git (`~/.gitconfig` + `.git/config`; `--show-origin`)
- Cargo (`Cargo.toml` manifest + `~/.cargo/config.toml`)
- Supabase / Prisma (one project manifest; connection via `DATABASE_URL` env)
- 12-factor app (config that varies by deploy lives in the environment)

View file

@ -0,0 +1,270 @@
# RFC: MCP Server Surface for `omnigraph-server` — Full Tool Parity, Stored Queries, Modular Auth
**Status:** Proposed
**Date:** 2026-06-01
**Tickets:** MR-969 (stored queries + MCP exposure — the surface this completes), MR-956 (federated auth / WorkOS OAuth — the auth substrate this consumes), MR-971 (per-server credential resolver), MR-974 (agent setup surface — the installer that wires this), MR-668 (multi-graph server — shipped, the routing this builds on)
**Builds on:** [omnigraph#128](https://github.com/ModernRelay/omnigraph/pull/128) (`ragnorc/stored-queries-mcp`) — the shipped stored-query registry, `GET /queries`, `POST /queries/{name}`, and the coarse `invoke_query` gate.
**Supersedes:** the MCP-transport portion of [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (`/mcp/tools` + `/mcp/invoke`). See [Relationship to RFC-001](#relationship-to-rfc-001).
**Target release:** v0.8.x (phased — see Rollout)
## Summary
Add a first-class **MCP (Model Context Protocol) server surface to `omnigraph-server`**, exposed over **Streamable HTTP**, that projects the server's operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web, Cursor, etc.). Two populations of tools share one projection path:
1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's **13 tools** (`health`, `snapshot`, `read`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply`) and its **2 resources** (`omnigraph://schema`, `omnigraph://branches`), plus a new server-scoped `graphs_list` tool and an `omnigraph://graphs` resource (multi-graph mode).
2. **Dynamic stored-query tools** — one MCP tool per `mcp.expose: true` entry in the `queries:` registry (MR-969 / #128), with parameters typed from the `.gq` declaration via the shipped `query_catalog_entry` / `param_descriptor` projection.
Every tool is **authorized by the server's existing Cedar policy engine**. The MCP layer never implements its own authentication: it consumes an **already-resolved `ResolvedActor`** from the server's bearer middleware (`require_bearer_auth` today; the `TokenVerifier` seam when MR-956 lands), so the **same MCP endpoint serves on-prem (static or customer-OIDC tokens) and our cloud (WorkOS OAuth) by configuration only**. Cloud OAuth is an additive layer (RFC 9728 protected-resource metadata) that slots in with zero MCP changes.
The end-state collapses two diverging tool implementations into one: the in-server MCP is the canonical, Cedar-gated, remotely-reachable surface; the stdio package becomes a thin stdio↔HTTP proxy (local on-ramp) over it.
> **Key caveat, stated up front (see §5.9 below):** the headline "a token scoped via Cedar to a *specific set* of stored queries" requires **per-query `invoke_query` scope**, which is *designed* (rfc-001) but **not yet implemented** — the shipped action is coarse (any stored query on the graph, or none). Per-actor Cedar curation works today for *built-in vs ad-hoc vs admin* tools and for *stored-vs-ad-hoc*; sub-selecting individual stored queries per actor is gated on a prerequisite (PR 0b). Until then, stored-query curation is graph-level (registry membership + `mcp.expose`).
## Relationship to RFC-001
[rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (MR-656 / MR-976 / MR-969) is the parent design for stored queries + the response envelope + MCP. This RFC is the **detailed MCP-transport design** that #128 left for a follow-up, and it **revises rfc-001 in three places where the shipped code or the MCP wire protocol diverged from rfc-001's sketch**:
1. **Transport shape.** rfc-001 sketched `GET /mcp/tools` + `POST /mcp/invoke` (a bespoke REST pair). **That is not the MCP wire protocol — real MCP clients cannot connect to it.** This RFC implements actual MCP JSON-RPC over Streamable HTTP and reuses `query_catalog_entry` as a *projection source*, not a parallel surface. (rfc-001's own Open Question already leaned toward Streamable HTTP.)
2. **Exposure config.** rfc-001 specified inline `.gq` pragmas (`@mcp(expose=…)`, default `expose=false`). **#128 shipped a different mechanism:** YAML `queries.<name>.mcp.expose` in `omnigraph.yaml`, **default `true`** (declaring a query in the manifest *is* the opt-in). This RFC builds on the shipped YAML form; the `.gq`-pragma design in rfc-001 is superseded for exposure.
3. **Schema introspection.** rfc-001 lists "Schema introspection through MCP" as a **non-goal** ("agents see types through declared return shapes"). This RFC **revises that**: the operational-parity tools include `schema_get` and `omnigraph://schema`*because the shipped stdio package already exposes both*. The non-goal is achieved by *policy*, not omission: `schema_get`/`omnigraph://schema` are Cedar-gated by `Read`, and the recommended locked-down agent policy denies `Read`, so a curated agent still never sees the schema. (rfc-001's intent is preserved; the mechanism moves from "don't build it" to "build it, gate it.")
Everything else in rfc-001 (two-paths-one-engine, per-query `invoke_query` *as the intended scope*, the response envelope, multi-graph per-graph endpoints) this RFC consumes unchanged.
> **Numbering note:** the `TokenVerifier`/WorkOS auth design is referred to in code (`crates/omnigraph-server/src/identity.rs`) as "RFC 0001," which is a *different* document from this repo's `docs/dev/rfc-001-queries-envelope-mcp.md`. To avoid the collision this RFC cites the auth substrate as **MR-956** throughout, never "RFC 0001."
## Reconciliation with shipped code (verified against `ragnorc/stored-queries-mcp` HEAD)
Verified against `crates/omnigraph-server/src/{lib.rs,api.rs}` and `crates/omnigraph-policy/src/lib.rs` at the current branch head (not the #128 PR body, and not `api.rs` alone):
- ✅ `GET /queries` returns the `mcp.expose == true` subset as `QueriesCatalogOutput { queries: [QueryCatalogEntry] }`, each with typed `ParamDescriptor`s, `tool_name`, `description`, `instruction`, and a `mutation` flag. **MCP-ready projection, but exposed as bespoke REST/JSON — not the MCP wire protocol.**
- ✅ `POST /queries/{name}` route exists (`server_invoke_query`, `lib.rs`).
- ✅ `query_catalog_entry()` / `param_descriptor()` with an exhaustive `ScalarType → ParamKind` map (a new scalar is a compile error).
- ✅ `InvokeQuery` Cedar action defined in `omnigraph-policy`.
- ✅ **`InvokeQuery` IS enforced** at `POST /queries/{name}`: `server_invoke_query` calls `authorize(PolicyAction::InvokeQuery)` and **masks a denial to a 404 identical to "unknown query"** so the catalog isn't probeable (the denial-masking the previous draft of this RFC reported as missing is shipped — it lives in `lib.rs`, not `api.rs`). The stored-mutation path is already double-gated: `InvokeQuery` outer, then `Change` inside `run_mutate`.
- ✅ **Reuse path exists:** `run_query` / `run_mutate` are already decoupled from their HTTP request bodies and take registry-supplied `(source, name, params, branch/snapshot)`. MCP `tools/call` for both stored and ad-hoc tools delegates to these — no new business logic.
- ❌ **Per-query (`invoke_query[name]`) scope is NOT implemented.** `PolicyRequest` carries only `{action, branch, target_branch}`**no query-name dimension** — and the action is documented coarse ("permits *any* stored query on the graph"). rfc-001 *designed* per-name scope; it is unbuilt. This RFC's per-query Cedar filtering (§5.4) and recommended agent policy (§5.9) depend on it → tracked as **PR 0b**.
- ❌ No MCP protocol surface (`initialize`/`tools/list`/`tools/call`, JSON-RPC, transport).
- ❌ No `TokenVerifier` trait yet — `require_bearer_auth` resolves a `ResolvedActor` inline (static-hash). The trait/`OidcJwtVerifier` are MR-956 (draft). The MCP layer's only requirement — *consume `ResolvedActor`* — is satisfiable today.
Stack (verified `Cargo.toml`): Axum + utoipa (OpenAPI) + `omnigraph-policy` (Cedar) + `futures` + `tokio`. **No MCP crate present.** `edition = "2024"`.
## Motivation
- **One curated, safe, remotely-reachable tool surface.** MR-969's thesis: hand an LLM a token Cedar-scoped to a set of tools and it sees exactly those typed tools — cannot construct ad-hoc queries it isn't permitted, cannot read the schema it isn't permitted, cannot reach other graphs. Today the only MCP is the stdio package: local-only, full surface, ungated.
- **Parity, so the in-server MCP can be the single implementation.** Operators/agents already depend on the operational tools. Supporting them server-side behind one Cedar gate lets the stdio package degrade to a proxy and removes two diverging tool sets.
- **On-prem and cloud from one endpoint.** A managed cloud (WorkOS OAuth) and an on-prem/air-gapped deploy (static or customer-OIDC tokens) must serve the same MCP without forks or MCP-specific auth.
- **Foundation for the agent on-ramp (MR-974).** `omnigraph mcp install --agent <tool>` needs a decided transport + a stable endpoint.
## Goals
- Project built-in tools + stored queries as MCP tools through **one** registry abstraction.
- `tools/list` and the callable set are **identical for argument-independent authorization**, both driven by Cedar (see §5.4 for the branch-scoped caveat).
- The MCP layer is **auth-method-agnostic**: it consumes `ResolvedActor`, never a raw token, never branches on how auth happened.
- The same endpoint works on-prem (static/OIDC) and cloud (WorkOS OAuth), switched by config; cloud OAuth is additive (RFC 9728).
- No new business logic: MCP tools delegate to the same `run_query`/`run_mutate`/branch/schema functions the HTTP routes call.
- Behaviour-neutral when unused: no MCP traffic = no change.
## Non-Goals
- **Building/hosting an OAuth authorization server.** The server is a Resource Server; WorkOS AuthKit+Connect is the AS (MR-956). The MCP endpoint validates tokens, never issues them, never holds client secrets.
- **OAuth/WorkOS implementation itself** — MR-956's work. This RFC leaves a clean RFC-9728 hook and consumes `ResolvedActor`.
- **MCP prompts, elicitation, `tools/list_changed`, resource subscriptions, server-initiated messages.** None needed → enables a stateless POST-only transport (§5.6).
- **stdio transport inside the server.** stdio stays in the TS package (now a proxy).
- **Cross-graph tool listing.** Per-graph catalogs only (MR-969 + RFC-002 non-goal).
- **Hot reload of the query registry.** Restart-only (MR-969).
## Background
`omnigraph-server` (Axum) already implements every operation this RFC exposes as an authenticated HTTP route; each authorizes via a `PolicyAction` against the Cedar policy for a server-resolved actor and calls into the engine. The existing stdio MCP package is a *client* of these routes (it owns no business logic). MR-956 will introduce a `TokenVerifier` trait (`StaticHashTokenVerifier` today inline, `OidcJwtVerifier` for OIDC/WorkOS) producing the `ResolvedActor { actor_id, tenant_id: Option, scopes: Vec<Scope>, source }` that already exists in `identity.rs` and is consumed by Cedar — token *validation* is offline (cached JWKS), so on-prem/air-gapped has no request-path dependency on the cloud.
## Design
### 5.1 One tool model: a `McpTool` trait, two populators
Both built-in and stored-query tools implement one trait so `tools/list` / `tools/call` never special-case:
```rust
trait McpTool: Send + Sync {
fn name(&self) -> &str; // MCP tool id (stable)
fn title(&self) -> Option<&str>;
fn description(&self) -> &str;
fn input_schema(&self) -> serde_json::Value; // JSON Schema (draft 2020-12)
fn annotations(&self) -> ToolAnnotations; // readOnlyHint / destructiveHint / idempotentHint
/// The Cedar request(s) this call requires, given parsed args. Used BOTH at
/// list-time (dry-run filter, default args) and call-time (enforce, real args).
fn authorization(&self, args: &ToolArgs) -> Vec<PolicyRequest>;
async fn call(&self, ctx: &GraphCtx, args: ToolArgs) -> Result<ToolOutput, ToolError>;
}
```
- **Built-ins**: ~14 static impls, each delegating to the *same* function its HTTP route calls (`run_query`, `run_mutate`, branch ops, `apply_schema_as`, …). `input_schema` authored once (or derived from each route's existing `utoipa`/`ToSchema` DTO).
- **Stored queries**: generated `McpTool` instances, one per `mcp.expose` entry; `input_schema` from `param_descriptor` (§5.3); `authorization``InvokeQuery` (coarse today; `InvokeQuery{name}` after PR 0b) then the inner `Read`/`Change`.
`ToolRegistry` for a graph = the static built-ins + the dynamic stored-query tools resolved from that graph's `GraphHandle` registry.
### 5.2 Tool catalog (parity) and Cedar mapping
Each built-in **reuses the exact `PolicyAction` its HTTP route already enforces** — verified against the handlers in `lib.rs`, not invented:
| MCP tool | Scope | Read/Mutate | Cedar action (verified from route) |
|---|---|---|---|
| `health` | server | read | none (liveness/version) |
| `graphs_list` *(new)* | server | read | `GraphList` |
| `snapshot` | graph | read | `Read` |
| `schema_get` | graph | read | `Read` |
| `branches_list` | graph | read | `Read` |
| `commits_list`, `commits_get` | graph | read | `Read` |
| `read` (ad-hoc `.gq`) / `query` *(alias)* | graph | read | `Read` |
| `change` (ad-hoc `.gq`) / `mutate` *(alias)* | graph | mutate | `Change` |
| `ingest` (NDJSON) | graph | mutate | `Change` (+ `BranchCreate` when forking a new branch) |
| `branches_create` | graph | mutate | `BranchCreate` |
| `branches_delete` | graph | mutate | `BranchDelete` |
| `branches_merge` | graph | mutate | `BranchMerge` |
| `schema_apply` (`allow_data_loss`) | graph | mutate | `SchemaApply` |
| **stored query** (`find_user`, …) | graph | inferred | `InvokeQuery` (coarse; `InvokeQuery{name}` after PR 0b) + inner `Read`/`Change` |
There is **no `Ingest` and no separate `snapshot`/`Export` action**`ingest` enforces `Change`, `snapshot` enforces `Read`. (`Export` exists but maps to the `/export` route, which this RFC does not expose as a tool.)
**Tool id parity vs. canonicalization.** The shipped stdio package uses tool ids **`read`/`change`** (and calls the deprecated `/read`,`/change` routes). The server HTTP surface canonicalized to `/query`,`/mutate` with `/read`,`/change` deprecated (MR-656). To keep existing package clients working *and* align with the server, the MCP exposes **`query`/`mutate` as canonical with `read`/`change` retained as deprecated-but-live aliases** (both dispatch to the same handler). Open Q7 asks whether to drop the aliases later.
Resources (§5.5): `omnigraph://schema`, `omnigraph://branches` (parity), plus `omnigraph://graphs` *(new)* — each gated by the same action as its list/get route (`Read`, `Read`, `GraphList`).
### 5.3 `ParamDescriptor → JSON Schema` (stored-query tools)
| `ParamKind` | JSON Schema | Notes |
|---|---|---|
| String | `{"type":"string"}` | |
| Bool | `{"type":"boolean"}` | |
| Int (i32/u32) | `{"type":"integer"}` | |
| BigInt (i64/u64) | `{"type":"string","pattern":"^-?\\d+$"}` | JSON numbers lose precision >2⁵³ → string (matches the shipped `api.rs` rationale). (Open Q1) |
| Float (f32/f64) | `{"type":"number"}` | |
| Date | `{"type":"string","format":"date"}` | |
| DateTime | `{"type":"string","format":"date-time"}` | |
| Blob | `{"type":"string","contentEncoding":"base64"}` | |
| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` | uses `vector_dim` |
| List | `{"type":"array","items":<item_kind schema>}` | scalar items only (grammar guarantees) |
`nullable == false` → param is in `required`. Annotations: `mutation``{readOnlyHint:false, destructiveHint:true}`; else `{readOnlyHint:true}`. `description` → tool description; `instruction` → appended to description (or `_meta`). (The shipped `check()` already warns when an `mcp.expose` query declares a `Vector` param an LLM can't supply.)
For built-in tools the schema is hand-authored from the route DTO; e.g. `query``{source: string, branch?: string, params?: object}`; `schema_apply``{schema: string, allow_data_loss?: boolean}`; `ingest``{ndjson: string, mode?: "merge"|"append"|"overwrite", branch?: string}`.
### 5.4 `tools/list` (Cedar-filtered) and `tools/call` (dispatch + masking)
- **`tools/list`**: build the `ToolRegistry`; for each tool evaluate `authorization(default_args)` against the actor's Cedar policy; **emit only tools that authorize**. Authz decisions memoized per request. Stored-query tools additionally require `mcp.expose: true`.
- **Exactness caveat (R7 is conditional):** the listed set equals the callable set **only for tools whose authorization is argument-independent** (`health`, `graphs_list`, `snapshot`, `schema_get`, `branches_list`, `commits_*`, ad-hoc `query`/`mutate`, and stored queries under the *coarse* action). For **branch-scoped tools** (`branches_create`/`merge` with `target_branch_scope`, and any branch-scoped `Read`/`Change` rule), list-time uses `default_args` (e.g. branch `main`) and cannot know the real target, so the listed set is a *best-effort approximation* of callability — a call may still be denied (or, rarely, a hidden tool would have been allowed). `tools/call` is always the authoritative gate. The contract is: **list never shows a tool the actor can't ever call; for branch-scoped tools it may show one the actor can call only on some branches.**
- **`tools/call`**: resolve `name``McpTool` (masked-404 if unknown *or* `mcp.expose:false`); parse+validate args against `input_schema`; enforce `authorization(args)` (mutations stay double-gated: `InvokeQuery` then `Change`); on success `call`. **Denial masking** lives in one place (the dispatcher): an authz denial is returned identically to "unknown tool" (§5.10), reusing the same deny≡missing principle already shipped at `POST /queries/{name}`.
### 5.5 Resources
Advertise `resources` capability (`subscribe:false, listChanged:false`). `resources/list` → the URIs the actor may read; `resources/read` → schema `.pg` text / branches JSON / (multi-graph) graphs JSON, each gated by the corresponding action (`Read`, `Read`, `GraphList`). A locked-down agent denied `Read` simply never sees `omnigraph://schema` or `omnigraph://branches` — this is how rfc-001's "agents don't introspect schema" intent is met *by policy* (§Relationship-to-RFC-001).
### 5.6 Transport: Streamable HTTP, stateless, POST-only
- **Streamable HTTP** (MCP's current standard; we're already an HTTP server). One endpoint per scope (§5.7).
- Because the server emits **no** server-initiated messages, implement the **minimal conformant** shape: client `POST`s JSON-RPC, server replies `application/json`. **No SSE channel, no `Mcp-Session-Id`, stateless** — each request authenticated independently via the bearer middleware. Honour the `MCP-Protocol-Version` header. SSE/sessions can be added later if subscriptions land.
- **JSON-RPC methods:** `initialize` (advertise `{tools:{listChanged:false}, resources:{listChanged:false, subscribe:false}}` + serverInfo/version), `notifications/initialized` (no-op ack), `ping`, `tools/list`, `tools/call`, `resources/list`, `resources/read`. `prompts/list` returns empty if probed.
- **Library decision (Open Q2):** spike `rmcp` (official Rust MCP SDK) for conformance + Streamable-HTTP/Axum on edition 2024; **fall back to a hand-rolled ~150 LOC JSON-RPC-over-POST** (only the methods above) on friction. Given the tiny surface, hand-roll is an acceptable default.
### 5.7 Endpoint routing (server- vs graph-scoped)
- **Single-graph mode:** `POST /mcp` — graph tools + server tools (`health`, `graphs_list`).
- **Multi-graph mode (MR-668):** `POST /graphs/{graph_id}/mcp` — graph-scoped tools for that graph; plus a server-level `POST /mcp` exposing only server-scoped tools (`health`, `graphs_list`). A per-graph endpoint never lists another graph's tools (isolation, tested). Mirrors the shipped `/graphs/{graph_id}/…` cluster routing. (Open Q5: confirm naming + whether server tools also appear on the per-graph endpoint.)
### 5.8 Modular / decoupled auth (the cross-cutting requirement)
**Invariant (load-bearing, satisfiable today):** the MCP handler receives an **already-resolved `ResolvedActor`** and **branches on nothing** about how the token was verified. No token parsing, no method check, no OAuth inside the MCP module. Today that actor comes from `require_bearer_auth`; when MR-956 lands it comes from a `TokenVerifier` — the MCP code is identical either way.
```
request → [auth middleware: ResolvedActor] → [MCP route] → Cedar → McpTool
```
**Server side — auth is config, not code:**
| Deployment | Verifier | MCP change |
|---|---|---|
| On-prem, static bearer | `require_bearer_auth` / `StaticHashTokenVerifier` | none |
| On-prem, customer IdP | `OidcJwtVerifier` → customer issuer (MR-956) | none |
| Our cloud | `OidcJwtVerifier` → WorkOS, `tenant_id = Some(org_id)` (MR-956) | none |
Token validation is offline (cached JWKS) — on-prem/air-gapped keeps working with no request-path cloud dependency. The MCP endpoint never terminates OAuth and never holds a client secret (Resource Server only).
**Cloud client negotiation — additive, no MCP changes:** when MR-956 lands, the server publishes RFC 9728 `/.well-known/oauth-protected-resource` and returns `WWW-Authenticate: Bearer ..., resource_metadata="..."` on 401. A compliant MCP client (Claude) then auto-negotiates: static bearer to an on-prem endpoint; on a cloud 401 it discovers the WorkOS AS and runs OAuth/PKCE itself — **same endpoint URL, zero client-side branching.** This RFC only requires that MCP routes flow through the standard 401 path so that hook can be added later without touching MCP.
**Multi-user identity pass-through (cloud):** the *caller's* token (a WorkOS JWT, audience-bound per-tenant) must reach the server so Cedar enforces per-user/per-tenant policy — never a shared service token. The MCP endpoint validates it offline and maps `org_id → tenant_id`. This is why the **remote path is the in-server HTTP MCP that Claude connects to directly** (its token flows through), not a stdio bridge impersonating a user.
**Client-side credential acquisition (CLI/SDK/proxy) — pluggable `CredentialSource`** (RFC-002 §5, MR-971), keyed by server name, so OAuth is a future *sibling key*, not a re-key:
```yaml
servers:
onprem: { endpoint: https://og.internal:8080, auth: { token: { env: OG_TOKEN } } }
edge: { endpoint: https://og-edge, auth: { token: { command: [vault, read, -field=token, secret/og] } } }
cloud: { endpoint: https://api.omnigraph.cloud, auth: { oauth: { issuer: workos } } } # future sibling
```
Implicit chain when `auth:` omitted: `OMNIGRAPH_TOKEN_<NAME>` → keychain `omnigraph:<name>``[<name>]` in `~/.omnigraph/credentials`; legacy `bearer_token_env` honoured. Secrets never inlined.
### 5.9 Safety model — Cedar is the gate, default-deny is the floor
With ad-hoc `query`/`mutate`/`schema_apply` present as tools, the **only** thing protecting an untrusted agent is the Cedar policy. Therefore:
- **Default-deny when tokens are configured** (MR-723, shipped) is the floor — an actor with no grants sees an empty tool list.
- **What works today (coarse action):** a policy can hide all ad-hoc tools and admin tools per-actor (`deny Read, Change, SchemaApply, Branch*`) while allowing stored queries (`allow InvokeQuery`). That already reproduces "can't run ad-hoc, can't read schema, can only call stored queries" — the agent sees *every* exposed stored query plus nothing else.
- **What needs PR 0b (per-query scope):** selecting *which* stored queries an actor may call (`allow InvokeQuery [find_user, list_orders]`, deny the rest). The shipped `invoke_query` is coarse (all stored queries or none). Until PR 0b adds a query-name dimension to `PolicyRequest` + the Cedar schema (rfc-001's intended design), per-actor sub-selection of stored queries is **not expressible**; curation is graph-level (which `.gq` files are registered + `mcp.expose`).
- `schema_apply`, `branches_delete`, ad-hoc `mutate` require an explicit admin-tier grant; never in a default agent policy.
- (Open Q3) Optional `mcp.allow_adhoc` server switch defaulting **off** for the ad-hoc `query`/`mutate` tools — defence-in-depth independent of Cedar, and independent of PR 0b.
### 5.10 Result shaping and error mapping
- **Success:** `tools/call` returns `content: [{type:"text", text:<json>}]` where `<json>` is the route's existing output envelope (read rows / mutation summary, i.e. `ReadOutput` / `ChangeOutput`). (Open Q4: also emit `structuredContent` + `outputSchema` — defer; text-JSON for v1.)
- **Tool execution error** (bad params after schema validation, engine error): result with `isError:true` + a text content block.
- **Authorization denial / unknown tool / `mcp.expose:false`:** a single JSON-RPC error (`-32602`, message `"unknown tool"`) — identical for all three so policy isn't probeable (same principle as the shipped `POST /queries/{name}` 404 masking).
- **Auth failure** (bad/absent bearer): HTTP 401 from the middleware *before* MCP — carries `WWW-Authenticate` (the RFC 9728 hook), never masked as a tool error. (This is exactly the path the shipped `authorize`/`authorize_request` split preserves: operational failures keep their status; only *denials* are masked.)
## Relationship to the `@modernrelay/omnigraph-mcp` stdio package
Verified surface of the package (`omnigraph-ts`, pkg version `0.3.0`, `@modelcontextprotocol/sdk@^1.29.0`, **stdio only**): **13 tools** (`health`, `snapshot`, `read`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply`) and **2 resources** (`omnigraph://schema`, `omnigraph://branches`). It is a thin client over the SDK → HTTP routes and **forwards the caller's bearer verbatim** (no inspection).
Once parity lands, **collapse to one implementation**: the in-server MCP is canonical (Cedar-gated, remote-capable, the path that becomes a Claude-web connector via MR-956). The stdio package degrades to a **thin stdio↔HTTP proxy** forwarding JSON-RPC (and the incoming `Authorization`) to `/mcp` — staying the local on-ramp for Claude Code/Desktop while sharing one tool set, one Cedar gate. Transition: keep the current independent stdio package on its `0.3.x`/`0.6.x` line; ship proxy mode in a later TS minor once the server endpoint is GA. (Note: the package is currently several minors behind the server — its vendored `spec/openapi.json` predates the stored-query routes — so it needs the standard re-sync regardless of MCP work.)
## Testing
- **Protocol conformance:** `initialize` handshake + advertised capabilities; `tools/list` shape; `tools/call` happy path; JSON-RPC error envelopes (`-32601` unknown method, `-32602` invalid params / unknown tool); `resources/list` + `resources/read`.
- **Cedar filtering (coarse, today):** an actor with `allow InvokeQuery` + `deny Read/Change` sees *all* exposed stored queries but **not** `query`/`mutate`/`schema_get`; `tools/call query` returns masked "unknown tool"; an admin sees the full catalog.
- **Cedar filtering (per-query, gated on PR 0b):** actor scoped to `InvokeQuery [find_user]` sees *only* `find_user`; `tools/call list_orders` masks. **This test ships with PR 0b**, not PR 1 — it cannot pass against the coarse action.
- **Parity per built-in:** each tool round-trips against the same expectations as its HTTP route (reuse route tests); `read`/`change` aliases dispatch identically to `query`/`mutate`.
- **Double-gating:** a stored mutation requires both `InvokeQuery` and `Change`; `schema_apply` requires `SchemaApply`.
- **`mcp.expose:false`:** absent from `GET /queries` and MCP `tools/list`; still service-callable by name through `POST /queries/{name}` when the actor has `invoke_query`, but not MCP-callable.
- **Schema generation:** table-driven over every `ParamKind` incl. nullable / list / vector(dim).
- **Branch-scoped list approximation:** assert the documented R7 caveat — a branch-scoped policy lists `branches_create`, and `tools/call` is the authoritative gate (a denied target still 403s/masks).
- **Multi-graph isolation:** `/graphs/a/mcp` never lists graph `b`'s tools; server `/mcp` exposes only server tools.
- **Auth decoupling:** the MCP suite is green under the current `require_bearer_auth` and under a mock OIDC `ResolvedActor` source — proving verifier-agnosticism. A 401 carries `WWW-Authenticate`.
- **OpenAPI:** the JSON-RPC endpoint is not REST — document only the envelope in utoipa (or exclude); keep `openapi.json` drift test green (`OMNIGRAPH_UPDATE_OPENAPI=1` to regenerate on intentional change).
- **Cross-repo smoke (optional):** point `@modelcontextprotocol/sdk` (TS) at the HTTP endpoint in an `omnigraph-ts` integration test.
## Rollout — phased by risk
- **PR 0a — extract the reusable invoke path (small).** The coarse `invoke_query` gate + 404 denial-masking are **already shipped** in `server_invoke_query`. Extract the read/mutate dispatch into `invoke_stored_query(handle, name, params, branch/snapshot, actor)` so MCP `tools/call` and the HTTP route share one path. No behaviour change. *(Replaces the previous draft's "PR 0 — wire the gate", which was already done.)*
- **PR 0b — per-query `invoke_query` scope (the safety prerequisite).** Add a query-name dimension to `PolicyRequest` + the Cedar schema (rfc-001's intended design), wire it at `POST /queries/{name}` and in the stored-query `McpTool::authorization`. Independently useful (the `allow InvokeQuery [find_user]` policy). **Gates the per-query Cedar-filtering test and §5.9's recommended agent policy.**
- **PR 1 — MCP transport + read-only parity + stored-query reads.** Endpoint(s), `initialize`/`tools/list`/`tools/call`/`resources/*`, the `McpTool` registry, Cedar-filtered listing, the read-only built-ins (`health`, `graphs_list`, `snapshot`, `read`/`query`, `schema_get`, `branches_list`, `commits_*`) + resources + stored-query *reads*. All auth-agnostic.
- **PR 2 — mutating parity + stored-query mutations.** `change`/`mutate`, `ingest`, `branches_create/delete/merge`, `schema_apply`, stored-query mutations + the `mcp.allow_adhoc` switch.
- **PR 3 — docs + agent on-ramp hook.** `docs/user/server.md` MCP section (incl. the recommended agent policy + the coarse-vs-per-query caveat), `openapi.json` sync, the `omnigraph mcp install` config target (MR-974), and the downstream `omnigraph-ts` re-sync/proxy follow-up.
- **Later (separate, MR-956):** RFC 9728 protected-resource metadata + WorkOS — slots in with zero MCP changes.
- **Later (TS minor):** stdio package → proxy mode.
## Migration / backwards compatibility
- **Additive.** No `queries:` and no MCP traffic → today's behaviour unchanged. New endpoints are new routes.
- **Cedar default-deny** (when tokens configured) means MCP exposes nothing until an actor is granted — safe by default.
- The stdio package keeps working unchanged; proxy mode is opt-in later.
- `openapi.json` only gains the documented MCP envelope; existing REST routes untouched.
## Open Questions
1. **BigInt/u64 as JSON string** (recommended, precision-safe) vs number.
2. **`rmcp` vs hand-rolled** JSON-RPC (spike `rmcp` on edition 2024; default to hand-roll on friction).
3. **Default-off `mcp.allow_adhoc`** for ad-hoc `query`/`mutate` (recommended) vs always-on + Cedar-only.
4. **`structuredContent` + `outputSchema`** now vs text-JSON v1 (recommend v1 text-JSON).
5. **Endpoint paths:** `/mcp` + `/graphs/{id}/mcp` — confirm naming and whether server-scoped tools also appear on the per-graph endpoint.
6. **Stateless POST-only** confirmed (no near-term server-initiated messages) — revisit only if subscriptions land.
7. **Legacy alias tools** (`read`/`change`): keep for client compat (the shipped package uses them), or drop and rely on `query`/`mutate`?
8. **PR 0b shape:** per-query scope as a Cedar *resource* (`StoredQuery::"find_user"`) vs a `query_name` *context attribute* + policy condition — affects how `allow InvokeQuery [list]` is authored.

View file

@ -20,10 +20,11 @@ A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` sc
| `run list \| show \| publish \| abort` | transactional run ops |
| `schema plan \| apply \| show (alias: get)` | migrations |
| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
| `queries validate \| list` | operate on the server-side stored-query registry (the `queries:` block). `validate` type-checks every stored query against the live schema offline (opens the selected graph; exits non-zero on any breakage), catching schema drift without restarting the server; `list` prints the selected registry's query names, MCP exposure, and typed params. For per-graph registries, pass `--target <graph>` or set `cli.graph`; with no graph selection, `list` shows only top-level `queries:`. Distinct from `lint`, which validates a single `.gq` file |
| `optimize` | non-destructive Lance compaction |
| `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
| `embed` | offline JSONL embedding pipeline |
| `policy validate \| test \| explain` | Cedar tooling |
| `policy validate \| test \| explain` | Cedar tooling. Selects `cli.graph`, else `server.graph`, else top-level `policy.file` |
| `version` / `-v` | print `omnigraph 0.3.x` |
## `omnigraph.yaml` schema
@ -34,6 +35,13 @@ graphs:
<name>:
uri: <local|s3://|http(s)://>
bearer_token_env: <ENV_NAME>
queries: # per-graph stored-query registry (server-role; multi-graph mode)
<query-name>: # key MUST equal the `query <name>` symbol inside the .gq
file: <path-to-.gq> # relative to this config's directory
mcp:
expose: true # default true: listed in the MCP catalog (GET /queries); set false to hide (still HTTP-callable)
tool_name: <name> # optional MCP tool-name override (defaults to <query-name>;
# must be unique across exposed queries)
server:
graph: <name>
bind: <ip:port>
@ -59,6 +67,8 @@ aliases:
graph: <name>
branch: <name>
format: <output-format>
queries: # top-level registry — applies only to a bare-URI (anonymous) graph; a graph served by name uses its `graphs.<id>.queries`. Mirrors top-level `policy`.
<query-name>: { file: <path-to-.gq> } # mcp.expose defaults to true
policy:
file: ./policy.yaml
```

View file

@ -14,10 +14,11 @@ Per-graph actions (bind to `Omnigraph::Graph::"<graph_id>"`):
6. `branch_delete`
7. `branch_merge`
8. `admin` — reserved for policy-management surfaces (hot reload, audit log, approvals). No call site today; see MR-724 for the reservation rationale.
9. `invoke_query` — gates invoking a server-side stored query (the `queries:` registry). Graph-scoped (like `admin`) — per-branch access is enforced by the inner `read` / `change` gate, so a rule that sets `branch_scope` on `invoke_query` is rejected. Coarse in this release: an `invoke_query` allow rule permits any stored query on the graph; a future, additive refinement adds an optional per-query-name scope without changing rules written against the coarse action. Enforced at `POST /queries/{name}` (see [server](server.md)). A stored *mutation* is double-gated: `invoke_query` to reach the tool, plus `change` for the write itself (the engine `_as` writers still enforce per the query body).
Server-scoped action (v0.6.0+; binds to `Omnigraph::Server::"root"`):
9. `graph_list``GET /graphs` registry enumeration (multi-graph mode)
10. `graph_list``GET /graphs` registry enumeration (multi-graph mode)
Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` are reserved but not shipped in v0.6.0; operators add/remove graphs by editing `omnigraph.yaml` and restarting.)
@ -46,10 +47,15 @@ graphs:
# no per-graph policy → no engine-layer Cedar enforcement on beta
```
Top-level `policy.file` is single-graph / CLI-local policy only. Multi-graph
server startup rejects it because applying one graph policy to every configured
graph is ambiguous. Move per-graph rules to `graphs.<graph_id>.policy.file` and
move `graph_list` rules to `server.policy.file`.
**Config follows graph identity, not server mode.** A graph served by **name**
(`--target <name>` or `server.graph`) uses its own `graphs.<name>.policy.file`,
exactly as in multi-graph mode. Top-level `policy.file` applies only to an
**anonymous** graph — one served by a bare `<URI>` with no `graphs:` entry.
Serving a **named** graph (single- or multi-graph mode) while top-level
`policy.file` (or `queries:`) is populated **refuses boot**, naming the block,
since the top-level value would otherwise be silently shadowed by the per-graph
block. Move per-graph rules to `graphs.<graph_id>.policy.file` and `graph_list`
rules to `server.policy.file`.
Each graph's HTTP request flows through its own per-graph policy. The management endpoint (`GET /graphs`) flows through the server-level policy. When `server.policy.file` is unset, `GET /graphs` is denied in every runtime state, including `--unauthenticated`; with bearer tokens configured, it returns 403 after admission control because `graph_list` is not a `read`-equivalent action. The operator must explicitly authorize via `server-policy.yaml` to expose `/graphs`.
@ -92,6 +98,10 @@ bearer token.
## CLI
Policy tooling resolves its graph like server single-mode policy: `cli.graph`
wins, otherwise `server.graph` is used, otherwise the top-level `policy.file`
is validated/tested/explained as the anonymous policy.
- `omnigraph policy validate` — parse + count actors, exit 1 on parse error.
- `omnigraph policy test` — run cases in `policy.tests.yaml`, exit 1 on any expectation mismatch.
- `omnigraph policy explain --actor … --action … [--branch …] [--target-branch …]` — show decision and matched rule.

View file

@ -6,7 +6,9 @@ Axum 0.8 + tokio + utoipa-generated OpenAPI. **Two modes** (v0.6.0+): single-gra
### Single-graph mode (legacy)
`omnigraph-server <URI>` or `omnigraph-server --target <name> --config omnigraph.yaml`. Routes are flat — `/snapshot`, `/read`, `/branches`, etc. Behavior unchanged from v0.6.0.
`omnigraph-server <URI>` or `omnigraph-server --target <name> --config omnigraph.yaml`. Routes are flat — `/snapshot`, `/read`, `/branches`, etc.
**Config follows graph identity.** A bare `<URI>` is an *anonymous* graph and uses the **top-level** `policy.file` / `queries:`. A graph chosen by **name** (`--target` / `server.graph`) uses its own `graphs.<name>.{policy.file, queries}` — the same block multi-graph mode uses. ⚠️ *Changed from v0.6.0, which always used top-level config in single mode: a named-graph config that puts `policy`/`queries` at top-level now **refuses boot** and points you at `graphs.<name>.…` (move the block there). Bare-`<URI>` single mode is unchanged.*
### Multi-graph mode (v0.6.0+)
@ -20,6 +22,10 @@ Mode inference (four-rule matrix):
4. `--config` + non-empty `graphs:` + no single-mode selector → **multi**
5. otherwise → error with migration hint
### Stored-query validation at startup
If a graph declares a `queries:` registry (see [cli-reference](cli-reference.md)), the server **loads and type-checks every stored query against that graph's live schema at startup** and **refuses to boot** if any query references a type or property the schema lacks — the same fail-loud posture as a malformed policy file, so schema drift surfaces at the deploy boundary rather than at invocation. Two MCP-exposed queries claiming the same tool name is likewise a boot error. Non-blocking advisories (e.g. an MCP-exposed query with a vector parameter an agent cannot supply) are logged. Validate offline before deploying with `omnigraph queries validate`. Discover the exposed queries as a typed tool catalog with `GET /queries`, and invoke one over HTTP with `POST /queries/{name}` (both below).
## Endpoint inventory
Per-graph endpoints — same body shape across modes; URLs differ:
@ -34,6 +40,8 @@ Per-graph endpoints — same body shape across modes; URLs differ:
| POST | `/export` | `/graphs/{id}/export` | bearer + `export` | NDJSON stream | `server_export` |
| POST | `/mutate` | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) | `server_mutate` |
| POST | `/change` | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: </mutate>; rel="successor-version"`) | `server_change` |
| GET | `/queries` | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog | `server_list_queries` |
| POST | `/queries/{name}` | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 | `server_invoke_query` |
| GET | `/schema` | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source | `server_schema_get` |
| POST | `/schema/apply` | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | migrate | `server_schema_apply` |
| POST | `/ingest` | `/graphs/{id}/ingest` | bearer + `branch_create` (if new) + `change` | bulk load | `server_ingest` (32 MB body limit) |
@ -50,6 +58,23 @@ Server-level management endpoints (v0.6.0+):
|---|---|---|---|---|
| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs | `server_graphs_list` (405 in single mode) |
### Stored-query catalog (`GET /queries`)
List the graph's **`mcp.expose`** stored queries as a typed tool catalog — enough for a client (e.g. an MCP server) to register each as a tool without fetching `.gq` source. Each entry: `{ name, tool_name, description, instruction, mutation, params }`, where each param is `{ name, kind, item_kind?, vector_dim?, nullable }`. `kind` is one of `string | bool | int | bigint | float | date | datetime | blob | vector | list` (decomposed so a consumer maps it with a closed `switch`, never re-parsing GQ type spelling). `bigint` (I64/U64), `date`, `datetime`, and `blob` are carried as JSON **strings** — a 64-bit integer loses precision as a JSON number, dates are ISO strings, and a blob is a URI string.
- **Read-gated** (works in default-deny mode). The catalog is **graph-wide** (branch-independent; `read` is authorized against `main`).
- **`mcp.expose` defaults to `true`** — declaring a query in `queries:` lists it; set `mcp: { expose: false }` to keep it HTTP/service-callable but hidden from the catalog.
- **Not Cedar-filtered per query (yet).** A caller with `read` but not `invoke_query` can *list* a query they can't *invoke* (which would 404). Closing that gap is future per-query authorization; for now the catalog is a discovery surface and `invoke_query` remains the invocation gate.
### Stored-query invocation (`POST /queries/{name}`)
Invoke a curated, server-side stored query by **name** — the source comes from the graph's `queries:` registry, so the client never sends `.gq`. The request body itself is optional; omit it for no-param queries, or send `{ "params": { … }, "branch": "main", "snapshot": null }`, where every field is optional and `params` keys match the query's declared parameters. The response is the **read envelope** (`ReadOutput`) for a stored read or the **mutation envelope** (`ChangeOutput`) for a stored mutation — serialized untagged, so the wire shape is identical to `/query` / `/mutate`.
- **Gate:** `invoke_query` (per-graph, graph-scoped) at the boundary. A stored *mutation* is **double-gated** — it also passes the engine's `change` gate, so an actor with `invoke_query` but not `change` gets `403`.
- **Deny == unknown, for callers without `invoke_query`:** for a caller lacking the grant, an `invoke_query` denial and an unknown query name return the **same `404`** (identical body), so the catalog can't be probed. A caller that *holds* `invoke_query` may still get the inner gate's `403` for an existing query it can't `read`/`change` (the double-gate, above) — so existence is visible to grant-holders by design.
- **Requires an explicit policy grant when auth is on.** In default-deny mode (bearer tokens but no `policy.file`), only `read` is permitted, so *every* `/queries/{name}` call returns `404` until an `invoke_query` rule is configured.
- A stored mutation cannot target a `snapshot` (`400`); a parameter type error is a structured `400` naming the parameter.
## Adding and removing graphs (multi mode)
Runtime add/remove via API is **not** exposed in v0.6.0 — neither

View file

@ -829,6 +829,177 @@
]
}
},
"/queries": {
"get": {
"tags": [
"queries"
],
"summary": "List the graph's exposed stored queries as a typed tool catalog.",
"description": "Returns the `mcp.expose == true` subset of the `queries:` registry, each\nwith its MCP tool name, read/mutate flag, description/instruction, and\ntyped parameters — enough for a client to register them as tools without\nfetching `.gq` source. Read-gated; the catalog is graph-wide (branch\nindependent — `read` is authorized against `main`). **Not** Cedar-filtered\nper query yet, so it can list a query whose `invoke_query` the caller\nlacks (a known gap until per-query authorization lands).",
"operationId": "list_queries",
"responses": {
"200": {
"description": "Stored-query catalog (the mcp.expose subset, with typed params)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/QueriesCatalogOutput"
}
}
}
},
"401": {
"description": "Unauthorized",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"403": {
"description": "Forbidden",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
}
},
"security": [
{
"bearer_token": []
}
]
}
},
"/queries/{name}": {
"post": {
"tags": [
"queries"
],
"summary": "Invoke a curated, server-side stored query by name.",
"description": "The query source comes from the graph's `queries:` registry, not the\nrequest body — callers send only runtime inputs (`params`, `branch`,\n`snapshot`). Gated by the `invoke_query` Cedar action at the boundary;\na stored *mutation* additionally passes the engine's `change` gate\n(double-gated). An actor **without** `invoke_query` cannot tell a denied\nquery from a missing one — both return the same 404, so the catalog\ncan't be probed without the grant. Once `invoke_query` is held, the\ninner `read`/`change` gate may surface a 403 for an existing query the\nactor can't run (the intended double-gate signal).",
"operationId": "invoke_query",
"parameters": [
{
"name": "name",
"in": "path",
"description": "Stored query name (the registry key)",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": {
"content": {
"application/json": {
"schema": {
"oneOf": [
{
"type": "null"
},
{
"$ref": "#/components/schemas/InvokeStoredQueryRequest"
}
]
}
}
}
},
"responses": {
"200": {
"description": "Read envelope (ReadOutput) or mutation envelope (ChangeOutput), serialized untagged",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/InvokeStoredQueryResponse"
}
}
}
},
"400": {
"description": "Bad request (param type error; snapshot on a stored mutation)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"401": {
"description": "Unauthorized",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"403": {
"description": "Forbidden (the inner `change` gate for a stored mutation)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"404": {
"description": "Unknown stored query, or `invoke_query` denied — indistinguishable to a caller without the grant",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"409": {
"description": "Merge conflict",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"429": {
"description": "Per-actor admission cap exceeded; honor `Retry-After` header",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"500": {
"description": "Policy evaluation error (a denial is reported as 404, not 500)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
}
},
"security": [
{
"bearer_token": []
}
]
}
},
"/query": {
"post": {
"tags": [
@ -1628,6 +1799,40 @@
}
}
},
"InvokeStoredQueryRequest": {
"type": "object",
"description": "Body for `POST /queries/{name}` — invokes the server-side stored query\nnamed in the path. The query source and name come from the registry,\nnever the body; only the runtime inputs are supplied here.",
"properties": {
"branch": {
"type": [
"string",
"null"
],
"description": "Branch to run against. Defaults to `main`; for a stored mutation the\nwrite targets this branch."
},
"params": {
"description": "JSON object whose keys match the stored query's declared parameters."
},
"snapshot": {
"type": [
"string",
"null"
],
"description": "Snapshot id to read from (read queries only — rejected for a stored\nmutation). Mutually exclusive with `branch`."
}
}
},
"InvokeStoredQueryResponse": {
"oneOf": [
{
"$ref": "#/components/schemas/ReadOutput"
},
{
"$ref": "#/components/schemas/ChangeOutput"
}
],
"description": "Response for `POST /queries/{name}`: the read envelope for a stored\nread, or the mutation envelope for a stored mutation. Serialized\n**untagged**, so the wire shape is exactly [`ReadOutput`] or\n[`ChangeOutput`] — classification follows the stored query, not a\nwrapper field."
},
"LoadMode": {
"type": "string",
"description": "Shadow enum for documenting [`LoadMode`] in the OpenAPI schema.",
@ -1698,6 +1903,120 @@
}
}
},
"ParamDescriptor": {
"type": "object",
"description": "One declared parameter of a stored query, projected for the catalog.",
"required": [
"name",
"kind",
"nullable"
],
"properties": {
"item_kind": {
"oneOf": [
{
"type": "null"
},
{
"$ref": "#/components/schemas/ParamKind",
"description": "Element kind when `kind == list` (always a scalar — the grammar\nforbids lists of vectors or nested lists)."
}
]
},
"kind": {
"$ref": "#/components/schemas/ParamKind"
},
"name": {
"type": "string"
},
"nullable": {
"type": "boolean",
"description": "`false` → the caller must supply it; `true` → optional."
},
"vector_dim": {
"type": [
"integer",
"null"
],
"format": "int32",
"description": "Dimension when `kind == vector`.",
"minimum": 0
}
}
},
"ParamKind": {
"type": "string",
"description": "The kind of a stored-query parameter, decomposed so a client (e.g. an\nMCP server) can build a typed input schema with a closed `match` and\nnever re-parse omnigraph's type spelling. `bigint`/`date`/`datetime`/\n`blob` are carried as JSON strings on the wire: a 64-bit integer past\n2^53 loses precision as a JSON number, and Date/DateTime are ISO\nstrings, Blob a blob-URI string.",
"enum": [
"string",
"bool",
"int",
"bigint",
"float",
"date",
"datetime",
"blob",
"vector",
"list"
]
},
"QueriesCatalogOutput": {
"type": "object",
"description": "Response for `GET /queries`: the `mcp.expose` subset of a graph's\nstored-query registry, each with typed parameters.",
"required": [
"queries"
],
"properties": {
"queries": {
"type": "array",
"items": {
"$ref": "#/components/schemas/QueryCatalogEntry"
}
}
}
},
"QueryCatalogEntry": {
"type": "object",
"description": "One entry in the stored-query catalog (`GET /queries`).",
"required": [
"name",
"tool_name",
"mutation",
"params"
],
"properties": {
"description": {
"type": [
"string",
"null"
]
},
"instruction": {
"type": [
"string",
"null"
]
},
"mutation": {
"type": "boolean",
"description": "`true` for a stored mutation → an MCP read-only hint of `false`."
},
"name": {
"type": "string",
"description": "Registry key / invoke path segment (`POST /queries/{name}`)."
},
"params": {
"type": "array",
"items": {
"$ref": "#/components/schemas/ParamDescriptor"
}
},
"tool_name": {
"type": "string",
"description": "MCP tool id (the `tool_name` override, else `name`)."
}
}
},
"QueryRequest": {
"type": "object",
"description": "Inline read-query request for `POST /query`.\n\nFriendlier-named alternative to [`ReadRequest`] for ad-hoc reads and\nAI-agent integration. Mutations are rejected with 400 — use `POST\n/mutate` (or its deprecated alias `POST /change`) for write queries.\nField names are deliberately short (`query`, `name`) to match the GQ\nkeyword and the CLI `-e` flag.",