omnigraph/docs/dev/rfc-003-mcp-server-surface.md
Ragnor Comerford 3c2b1b8051
Stored-query registry foundation + config/CLI RFC-002 (#128)
* MR-969: add stored-query registry config surface

Introduce the `queries:` block in omnigraph.yaml — an inline
`name -> entry` map of stored queries, per-graph
(`graphs.<id>.queries`) and top-level for single-graph mode, mirroring
how `policy` is wired in both modes. Each entry points at a `.gq` file
and carries optional MCP exposure settings (`expose`, `tool_name`),
defaulting to not-exposed.

Additive: absent `queries:` leaves current behavior unchanged.

- QueryEntry { file, mcp: McpSettings { expose, tool_name } }
- `queries` field on TargetConfig + OmnigraphConfig (serde default)
- query_entries() / target_query_entries() accessors
- resolve_query_file() — base_dir-relative `.gq` path resolution
- round-trip + absent-block tests

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add stored-query registry loader and GraphHandle wiring

Add a `queries` module: QueryRegistry loads each declared `.gq` entry,
parses it, and selects the query whose symbol matches the manifest key,
asserting the two agree (key == `query <name>` symbol). Identity is the
query name; a key/symbol mismatch is a load-time error. Errors are
collected, not fail-fast, so a bad registry surfaces every broken entry
at once. Schema type-checking is deliberately left to a separate pass so
the loader stays callable without an open engine.

Thread an `Option<Arc<QueryRegistry>>` through GraphHandle alongside the
per-graph policy; the URI-canonicalizing clone propagates it. Production
openers default to None for now — the boot path loads and attaches the
registry in a later change.

- QueryRegistry::{from_specs, load, lookup, iter}; StoredQuery::is_mutation
- GraphHandle.queries field, propagated on canonical clone
- registry unit tests: identity match/mismatch, multi-query selection,
  per-entry parse errors, error collection, mutation classification
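
A minimal sketch of the collect-all-errors load with the key == symbol identity check, for illustration only. `ParsedFile`, `LoadError`, and `load_registry` are hypothetical stand-ins, not the shipped `QueryRegistry` API, and parsing is assumed to have happened already.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed `.gq` file: the `query <name>` symbols it declares.
#[derive(Debug, Clone, PartialEq)]
pub struct ParsedFile {
    pub query_names: Vec<String>,
}

/// Illustrative error type; the real loader's variants may differ.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    /// The `.gq` file failed to parse.
    Parse { key: String, message: String },
    /// The file has no `query <key>` symbol matching the manifest key.
    IdentityMismatch { key: String, found: Vec<String> },
}

/// Walks every entry and collects all errors instead of returning on the first.
/// Returns the names of the loaded queries when nothing is broken.
pub fn load_registry(
    specs: &BTreeMap<String, Result<ParsedFile, String>>,
) -> Result<Vec<String>, Vec<LoadError>> {
    let mut loaded = Vec::new();
    let mut errors = Vec::new();
    for (key, parsed) in specs {
        match parsed {
            Err(message) => errors.push(LoadError::Parse {
                key: key.clone(),
                message: message.clone(),
            }),
            // Select the query whose symbol matches the manifest key.
            Ok(file) if file.query_names.iter().any(|n| n == key) => loaded.push(key.clone()),
            Ok(file) => errors.push(LoadError::IdentityMismatch {
                key: key.clone(),
                found: file.query_names.clone(),
            }),
        }
    }
    if errors.is_empty() {
        Ok(loaded)
    } else {
        Err(errors)
    }
}
```

With one parse-broken entry and one mismatched entry in the same map, both come back in a single `Err`, which is the point of collecting rather than failing fast.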

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: add RFC-002 config & CLI architecture

Layered config (user-global ~/.config/omnigraph/ + per-project), a
unifying `target` abstraction resolving to (locus, graph, sub-state,
credential) with embedded-URI XOR remote-server loci, multi-server ×
multi-graph client targeting, credentials by-reference, and the
file-naming decision: project and server config are one artifact
(`omnigraph.yaml`); the only differently-named file is the user-global
`config.yaml`, split by scope not role. Includes the 12-factor bind
portability rule (prefer --bind/OMNIGRAPH_BIND over a committed
server.bind) and the defined-locally / invoked-remotely model for
stored queries. Derived from first principles working backwards from
what the engine enables; validated against kube/Helix/git/compose.

Linked from docs/dev/index.md. Proposed; phased rollout for the
MR-973/974/981 family.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add check() to validate stored queries against the live schema

A pure check(registry, catalog) that type-checks every stored query via
the same typecheck_query_decl the engine runs for inline queries — no
parallel implementation. Failures are collected, not fail-fast, so an
operator sees every broken query (e.g. a type/property a migration
renamed or removed) in one pass. Breakages are fatal (the boot path will
refuse to start); warnings are advisory.

Pure over (registry, catalog) so it is callable both at boot (engine
catalog) and offline from the CLI without an open engine.

Advisory lint: an mcp.expose:true query that declares a Vector(N)
parameter warns — an LLM cannot supply a raw embedding vector; such a
query should take a String parameter and embed server-side. Warns
rather than rejects, since service-to-service callers may pass vectors.

- CheckReport { breakages, warnings }; has_breakages / is_clean
- tests: valid query, unknown type, unknown property, collect-not-fail-fast,
  vector-param-exposed warns, unexposed silent
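
A rough sketch of the advisory lint described above, under simplifying assumptions: `ParamType`, `StoredQuery`, and `lint_exposed_vector_params` are hypothetical names, the schema type-check itself is omitted, and `is_clean` is assumed to mean "no breakages and no warnings".

```rust
/// Simplified parameter types; the real type system has more scalars.
#[derive(Debug, Clone, PartialEq)]
pub enum ParamType {
    String,
    Int,
    Vector(u32),
}

/// Hypothetical reduced view of a stored query.
pub struct StoredQuery {
    pub name: String,
    pub expose: bool,
    pub params: Vec<(String, ParamType)>,
}

#[derive(Debug, Default)]
pub struct CheckReport {
    pub breakages: Vec<String>,
    pub warnings: Vec<String>,
}

impl CheckReport {
    pub fn has_breakages(&self) -> bool {
        !self.breakages.is_empty()
    }
    /// Assumption: "clean" means neither breakages nor warnings.
    pub fn is_clean(&self) -> bool {
        self.breakages.is_empty() && self.warnings.is_empty()
    }
}

/// Warns (never rejects) when an exposed query declares a Vector(N) parameter.
/// Unexposed queries are skipped, so service-to-service callers stay silent.
pub fn lint_exposed_vector_params(queries: &[StoredQuery]) -> CheckReport {
    let mut report = CheckReport::default();
    for q in queries.iter().filter(|q| q.expose) {
        for (param, ty) in &q.params {
            if let ParamType::Vector(dim) = ty {
                report.warnings.push(format!(
                    "query `{}`: exposed parameter `{}` is Vector({}); consider a String parameter embedded server-side",
                    q.name, param, dim
                ));
            }
        }
    }
    report
}
```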

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Drop internal plan-label refs from stored-query config comments

Doc comments referenced sequencing labels ("C2") that mean nothing to a
reader; reword to describe the behavior directly. Comment-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: reconcile aliases with the role model in RFC-002

Place the existing client-only `aliases:` block in the client/server
role split: aliases are client-role (CLI, embedded, ungated) and may
live in both user-global and project config; `queries:` is server-role
(deployment manifest only). They overlap as "name -> .gq"; `queries:` is
the superset, and the end-state subsumes aliases (definition -> queries,
target/branch/format -> client invocation context, positional args ->
CLI sugar). v1 keeps aliases unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: make RFC-002 config global-first, project-optional

The global user config is the primary, self-sufficient default; the
CLI works from any directory with no project file (the kubectl/aws/gh
posture), a deliberate flip from today's project-anchored behavior.
The project omnigraph.yaml becomes an optional repo-scoped override and
the deployment manifest. Uniform schema, both layers optional; global
can hold any section including a personal server's graphs/queries.
Additive: project still overrides global; the flip adds a fallback
layer below the project file rather than removing it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: justify XDG ~/.config/omnigraph over legacy ~/.omnigraph in RFC-002

Make the rationale explicit: XDG-first because OmniGraph is a client
that will cache remote catalogs and keep session state alongside
secrets, and XDG separates config / cache / state into distinct dirs
(clear cache without touching creds; backups skip cache) whereas a
single ~/.omnigraph/ mixes them. Honor ~/.omnigraph/ as a fallback for
the peer-group (aws/kube/docker/helix) expectation. Add XDG_CACHE_HOME
/ XDG_STATE_HOME to the override precedence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: build RFC-002 credentials on the existing env-file mechanism

OmniGraph already has credentials-by-reference: bearer_token_env names
the env var, and auth.env_file is a git-ignored dotenv the CLI
auto-loads (real env vars win), resolved via resolve_remote_bearer_token.
The RFC's proposed credentials.yaml + token_env were redundant parallel
inventions. Reconcile: reuse bearer_token_env (extend to
servers.<name>) and auth.env_file (add a global ~/.config/omnigraph/.env
layered under the project .env.omni); OS keychain is an additive future
resolver. No new credentials.yaml. Updated summary, non-goals,
background, file-naming, credentials, example, login, migration, rollout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: use single ~/.omnigraph dir (Helix-style), not XDG, in RFC-002

Reverse the earlier XDG-first call. The prior argument rested on a false
dichotomy (single-dir => mixed config/cache/state); in fact the peer
tools (aws, kube, helix) achieve separation via SUBDIRECTORIES inside
one ~/.tool/ dir (~/.aws/sso/cache/, ~/.kube/cache/), getting cache
hygiene AND one discoverable place. So everything goes under
~/.omnigraph/: config.yaml, credentials (dotenv, 0600), cache/, state/.
Lower cognitive load, matches what DB/cloud-CLI users expect, matches
Helix. OMNIGRAPH_HOME overrides; $XDG_CONFIG_HOME optionally honored but
~/.omnigraph/ is canonical. Updated all paths, the rationale paragraph,
the file-naming table (added a cache/state row), and env precedence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: reconcile RFC-002 with shipped/planned CLI tickets

Align with reality found in existing tickets:
- Noun is graph/graphs, not target/targets (MR-603, status done, renamed
  the config key targets->graphs; flag is --graph). Use graphs:/--graph; an
  entry is embedded (uri) XOR remote (server + remote graph name).
- ~/.omnigraph/ confirmed by MR-581 (og template pull, done) which
  already quick-starts templates there.
- Templates already exist (MR-581/MR-531) — not invented here.
- The init family is already specced (init, quickstart MR-973, serve
  MR-970, prune MR-972, mcp install MR-974, agent-mode MR-981); this
  RFC only adds the user route (~/.omnigraph/config.yaml + login).
- aliases: -> operations: planned (MR-839).
- bearer_token_env gap tracked in MR-971.
- query lint/check already exist (MR-639) — registry validator must not
  collide with the singular `query check`.
Add a Reconciliation section; fix the canonical example to graphs:/--graph.
Also: merge semantics refined (deep-merge settings, replace named
entries, replace lists, config view --resolved --show-origin).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: correct stale-ticket claims and fold init/bootstrap design into RFC-002

Verify against code, not ticket statuses (MR-581 is marked done but is
stale/unbuilt): no ~/.omnigraph usage, no template/serve/quickstart/
prune/login commands exist; config still uses aliases: (no operations:).
So ~/.omnigraph/ stands on peer-convention merits alone, and templates
are a design question, not a foothold. Add §7.5: the three-tier init
model (user route = login + ~/.omnigraph/config.yaml; thin project init;
fat quickstart + templates) with first-principles positions (split
init/login, in-place refuse-if-exists, interactive vs --auto/agent-mode,
--template flag, secrets-on-scaffold gitignore rule). This RFC owns only
the user route; the rest are sibling tickets (MR-973/970/972/974/981).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: breadboard + slice Shape A in RFC-002

Add the implementation breadboard (places P1-P5, affordances N1-N14 with
NEW markers, mermaid) and five vertical slices for the selected config/
CLI/init shape: V1 global layer + merge engine + config view; V2 remote
graphs + HTTP-client path + credential resolution; V3 omnigraph login;
V4 init-hardening + quickstart + templates (rides MR-970); V5 agent-mode
(MR-981). Rollout reordered to the slice sequence; spikes X1-X4 gate
their owning slice. V1-V2 close the substantive client->server gap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add InvokeQuery Cedar action (coarse, graph-scoped)

A per-graph, branch-scoped action that gates invoking a server-side
stored query by name. Coarse for now: an `invoke_query` allow rule
permits any stored query on the graph; a future, additive refinement
adds an optional per-query-name scope without changing rules written
against the coarse action. Enforcement is at the HTTP boundary; the
engine `_as` writers still enforce read/change per the query body, so a
stored mutation is double-gated (invoke_query to reach the tool, change
for the write). No call site yet — the invocation handler wires it in a
later change (same pattern as Admin/GraphList added ahead of consumers).

- variant + as_str/resource_kind(Graph)/FromStr/uses_branch_scope
- Cedar schema: invoke_query appliesTo Graph
- tests: per-graph allow/deny, branch-scope accepted

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Load and type-check stored queries at server boot, refusing breakage

At startup the server now loads each graph's stored-query registry,
type-checks every query against that graph's live schema, and refuses to
boot if any query references a type/property the schema doesn't have
(same posture as bad policy YAML) — so schema drift surfaces at the
deploy boundary, not silently at invocation. Non-blocking warnings are
logged. The validated registry is attached to the GraphHandle (the two
production sites previously held `queries: None`).

Loading (parse + key==symbol identity) happens at settings-build time
where the config is in scope; the schema type-check happens after each
engine opens (single mode in `open_single_with_queries`, multi mode in
`open_single_graph`). `open_with_bearer_tokens_and_policy` delegates
with an empty registry so its 18 test callers are unchanged; the public
`new_*` constructors are unchanged (only the private build path threads
the registry).

- ServerConfigMode::Single / GraphStartupConfig carry the loaded registry
- boot tests: valid registry boots; type-broken query refuses boot + names it

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add `omnigraph queries validate` and `queries list` CLI

`queries validate` type-checks the stored-query registry against the
live schema offline — it opens the selected graph, runs the same
check() the server runs at boot, prints breakages/warnings (human or
--json), and exits non-zero on any breakage — so an operator can catch
a query broken by a schema change without restarting the server.
`queries list` prints each registered query's name, MCP exposure, and
typed params.

Named `validate` (not `check`) to avoid overlap with the existing
`omnigraph lint` — `query check`/`query lint` are already deprecated
argv-shims to `lint`. Registry entries resolve like the server: a named
graph uses its per-graph `queries:`; otherwise the top-level one.

- Queries subcommand group; reuses QueryRegistry::load + check from
  omnigraph-server; local-only (needs the schema), mirrors lint
- tests: clean registry exits 0, broken query exits non-zero + names it,
  list shows the query and its typed params

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Route registry selection through one shared query_entries_for

The "which queries: block applies for graph X" rule existed twice — the
server boot path and the CLI's registry_entries — and had already drifted:
the CLI carried an unreachable unwrap_or_else fallback the server lacked.

Add OmnigraphConfig::query_entries_for(graph: Option<&str>) as the single
definition (named graph -> its per-graph block; otherwise top-level) and
route all three sites through it: server single mode, server multi-graph
loop, and the CLI. The CLI's dead fallback arm is deleted; CLI and server
now resolve identically by construction.

No behavior change. Extends the config round-trip test to pin the selector,
including the unknown-name -> top-level fallback the deleted CLI arm covered.
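
The selector rule can be sketched roughly as follows. `Config` and `QueryEntries` are simplified, hypothetical stand-ins for `OmnigraphConfig` and its entry map (a plain name -> `.gq` path map is assumed here).

```rust
use std::collections::BTreeMap;

/// Simplified entry map: query name -> `.gq` path.
pub type QueryEntries = BTreeMap<String, String>;

/// Hypothetical reduced config: a top-level block plus per-graph blocks.
#[derive(Default)]
pub struct Config {
    /// Top-level `queries:` block.
    pub queries: QueryEntries,
    /// `graphs.<id>.queries` blocks, keyed by graph name.
    pub graphs: BTreeMap<String, QueryEntries>,
}

impl Config {
    /// Named graph -> its per-graph block; otherwise the top-level block.
    /// An unknown name also falls back to top-level, as pinned by the
    /// round-trip test this commit extends.
    pub fn query_entries_for(&self, graph: Option<&str>) -> &QueryEntries {
        graph
            .and_then(|name| self.graphs.get(name))
            .unwrap_or(&self.queries)
    }
}
```

Because all three call sites go through one function like this, the CLI and the server cannot disagree about which block applies.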

* Funnel registry validation through one validate_and_attach gate

The check -> refuse-on-breakage -> log-warnings -> empty->None block was
copy-pasted across both open paths (single mode and the multi-graph
per-graph open), differing only by the graph label. A third opener could
attach a registry that was never schema-checked.

Extract validate_and_attach(queries, catalog, label) -> Option<Arc<..>> as
the single gate both paths call, so attaching an unchecked registry is no
longer expressible. The catalog handle is an owned Arc, so calling it
before the multi-mode policy match (which rebinds db) is borrow-clean.

No behavior change. Adds a direct unit test of the helper (empty / clean /
breakage incl. the graph label in the message) — covering the multi-graph
path's logic, which previously had no boot-refusal coverage.

* Resolve param types structurally in the MCP vector lint

The exposed-query advisory detected vector params with
type_name.starts_with("Vector(") — a second copy of the compiler's own
ScalarType::from_str_name vector parsing that could drift from it.

Key the lint off PropType::from_param_type_name + ScalarType::Vector(_)
instead, the one canonical resolver the type system already uses. Any
future param-suppliability lint now reads the structured type rather than
scanning the surface string.

Behavior-preserving: the grammar forbids list-of-vector params
(list_type = "[" base_type "]", and base_type excludes Vector), so the only
input where the structured and string checks could differ is unparseable.
Adds a guard test that an exposed String param does not false-trigger the
warning.

* Refuse duplicate MCP tool names across exposed stored queries

The effective MCP tool name (explicit tool_name, else the query name) is a
second identity namespace beside the registry key, but nothing enforced it
unique — two exposed queries could claim one catalog key, and each consumer
re-derived the name ad hoc.

Add StoredQuery::effective_tool_name() as the one definition, and a
load-time uniqueness pass in from_specs over exposed queries: a collision is
a collected LoadError naming the loser and the winner. Scoped to exposed
queries (unexposed have no MCP tool); deterministic over the BTreeMap so the
first-declared wins and the error order is stable.

New (rare) refusal: a config with colliding exposed tool names now fails
`omnigraph queries validate` offline and refuses server boot, the same
posture as a malformed registry. Release-note-worthy.

Test-first: duplicate_exposed_tool_name_is_a_load_error (red before the
pass, green after) + a CLI offline test; the unexposed sibling pins the
exposed-only scope; effective_tool_name asserts folded into the load test.
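
An illustrative sketch of the uniqueness pass, not the shipped code. `StoredQuery` is reduced to the two MCP fields, and `Collision` and `tool_name_collisions` are hypothetical names. Iteration follows BTreeMap key order, so in this sketch the entry whose registry key sorts first keeps the name.

```rust
use std::collections::BTreeMap;

/// Reduced stored query: only the MCP exposure settings.
pub struct StoredQuery {
    pub expose: bool,
    pub tool_name: Option<String>,
}

/// Explicit `tool_name` if set, else the query name (the registry key).
pub fn effective_tool_name<'a>(key: &'a str, query: &'a StoredQuery) -> &'a str {
    query.tool_name.as_deref().unwrap_or(key)
}

/// Hypothetical collision record naming the loser and the winner.
#[derive(Debug, PartialEq)]
pub struct Collision {
    pub tool_name: String,
    pub loser: String,
    pub winner: String,
}

/// Finds duplicate effective tool names among exposed queries only.
pub fn tool_name_collisions(registry: &BTreeMap<String, StoredQuery>) -> Vec<Collision> {
    // Maps each claimed tool name to the registry key that claimed it first.
    let mut owners: BTreeMap<&str, &str> = BTreeMap::new();
    let mut collisions = Vec::new();
    for (key, query) in registry {
        if !query.expose {
            continue; // unexposed queries have no MCP tool
        }
        let tool = effective_tool_name(key, query);
        if let Some(winner) = owners.get(tool).copied() {
            collisions.push(Collision {
                tool_name: tool.to_string(),
                loser: key.clone(),
                winner: winner.to_string(),
            });
        } else {
            owners.insert(tool, key.as_str());
        }
    }
    collisions
}
```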

* docs: document the queries registry, CLI, and invoke_query action

The stored-query surface shipped without user docs. Add it, per the same-PR
maintenance contract:

- policy.md: invoke_query as per-graph action #10 (branch-scoped), with the
  double-gating note; renumber graph_list; add it to the branch_scope list.
- cli-reference.md: the `queries validate | list` command, and the
  `queries:` config block (per-graph + top-level) with mcp.expose/tool_name
  and the tool-name uniqueness rule.
- server.md: boot-time stored-query type-check (refuse on breakage), noting
  invocation over HTTP/MCP is not yet exposed.

* Add POST /queries/{name} stored-query invocation handler

Invoke a curated server-side stored query by name: source + name come from
the per-graph queries: registry, the client sends only runtime inputs
(params, branch, snapshot). Gated by the invoke_query Cedar action at the
boundary; the handler delegates to the existing run_query/run_mutate, whose
inner Read/Change enforce still runs — so a stored mutation is double-gated
(invoke_query to reach the tool, change for the write).

- InvokeStoredQueryRequest + an untagged InvokeStoredQueryResponse
  { Read(ReadOutput), Change(ChangeOutput) } → one Json<_> return type and a
  oneOf 200 schema (a correct contract, not a wrong-but-simple one).
- Route lives in per_graph_protected → single-mode /queries/{name} and
  multi-mode /graphs/{id}/queries/{name} for free.
- Deny == unknown: an invoke_query denial and a missing query both return the
  same 404, so the catalog can't be probed by an unauthorized caller.
- OpenAPI regenerated; tests cover read, mutation double-gate (403 vs 200),
  bad-param 400, and the identical-404 deny path.

Completes the MR-969 V1 invocation slice (registry + /queries/{name} + invoke_query).

* docs: stored-query invocation endpoint; flip the not-yet-exposed caveat

Now that POST /queries/{name} ships (C7), document it: add the endpoint to
server.md's inventory + an invocation section (body, untagged read/mutate
envelope, invoke_query gate, double-gated mutations, deny == 404), and flip
the startup note that said invocation was not yet exposed. In policy.md,
replace "no invocation call site yet" on the invoke_query action with a
pointer to the endpoint.

* Scope the stored-query 404-hiding claim to non-invoke_query callers

Review found the deny==404 catalog-hiding was overstated as a contract: it
holds only at the outer invoke_query gate. A caller that HOLDS invoke_query
but lacks read/change gets the inner gate's 403 for an existing query vs 404
for an unknown one — so existence is visible to grant-holders by design (the
intended double-gate). The handler docstring, OpenAPI 404 description, and
server.md all claimed the 404 was airtight against any denied actor.

Correct the wording in all three (no behavior change) and add the missing
symmetric test (invoke_query but no read -> 403 for an existing query, 404
for unknown) so the actual contract is pinned. Also document that in
default-deny mode (tokens, no policy) every invocation 404s until an
invoke_query rule is configured.

Nits: the from_specs collision comment said "first declared wins" but it is
lexicographically-first by name (BTreeMap); the effective_tool_name docstring
overclaimed the CLI display routes through it (it resolves the rule on its
own output DTO).

* Default mcp.expose to true (the manifest entry is the opt-in)

expose controls MCP-catalog membership only — it is not an authorization
gate (invocation is gated by invoke_query regardless). So requiring a
per-query mcp.expose: true was friction with no safety benefit: a
non-exposed query is still HTTP-invocable by name. Flip the default so
declaring a query in the manifest exposes it to the agent tool catalog by
default; expose: false is the escape hatch for service-only queries.

Both the absent-mcp path (Default impl) and the present-but-no-expose path
(serde default fn) now yield true. Doc comments + cli-reference updated; the
config round-trip test asserts the new default.

* Add GET /queries stored-query catalog endpoint

List a graph's mcp.expose stored queries as a typed tool catalog so a client
(the MCP server) can register them as tools without fetching .gq source.
Each entry carries name, MCP tool_name, description/instruction, a
read/mutate flag, and decomposed typed params (kind enum: string|bool|int|
bigint|float|date|datetime|blob|vector|list, plus item_kind for lists and
vector_dim) — so the consumer builds an input schema with a closed match and
never re-parses omnigraph type spelling. I64/U64 are bigint (string on the
wire): a JSON number loses precision past 2^53 and the engine already accepts
decimal strings.
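
To illustrate why the bigint kind travels as a string, here is a small sketch. The function names are hypothetical; `through_f64` models what a consumer that stores JSON numbers as IEEE-754 doubles would do to the value.

```rust
use std::num::ParseIntError;

/// Encodes a 64-bit integer as a decimal string for the wire.
pub fn encode_bigint(value: u64) -> String {
    value.to_string()
}

/// Decodes the decimal-string form back into a 64-bit integer.
pub fn decode_bigint(text: &str) -> Result<u64, ParseIntError> {
    text.parse()
}

/// Models a round trip through a double-precision JSON number.
pub fn through_f64(value: u64) -> u64 {
    (value as f64) as u64
}
```

For `2^53 + 1` the double round trip comes back as `2^53`, while the string round trip preserves the value exactly, including `u64::MAX`.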

Read-gated (works in default-deny; the catalog is graph-wide, authorized
against main). NOT Cedar-filtered per query yet — a reader can list a query
whose invoke_query they lack (documented gap until per-query authz lands);
invocation stays invoke_query-gated + deny==404.

- api: QueriesCatalogOutput / QueryCatalogEntry / ParamDescriptor / ParamKind
  + query_catalog_entry (reuses PropType::from_param_type_name; scalar_kind is
  exhaustive, so a new ScalarType is a compile error here until catalogued).
- GET /queries route in per_graph_protected (→ /graphs/{id}/queries in multi
  mode); OpenAPI regenerated; path allowlists updated.
- Tests: projection unit (every kind, list, vector, nullable, mutation,
  empty) + handler (exposed-only filter, read-gate probe-oracle, empty
  registry).

* docs: GET /queries stored-query catalog endpoint

Document the catalog: the endpoint table row (GET /queries, read-gated), a
catalog section (typed-param kind enum, bigint/date/datetime/blob-as-string,
graph-wide/branch-independent, mcp.expose default true, the read-gated
probe-oracle gap), and flip the startup note now that the catalog ships.

* Collect file-I/O and parse errors in QueryRegistry::load in one pass

load() early-returned on any unreadable .gq file, masking parse / identity /
tool-name-collision errors in the OTHER (readable) files — so an operator
fixed the missing file, restarted, and only then saw the next broken query.
Now it collects I/O errors but still runs from_specs on the readable specs
and returns the union, so every broken entry surfaces at once (matching the
collected-errors contract the rest of the registry already follows).

Safe: from_specs' tool-name collision check runs over loaded queries only, so
dropping an I/O-failed entry can only under-report a collision, never invent
one. I/O errors are ordered first (BTreeMap key order), then spec errors.

Adds a load-level test (tempdir: a valid, a missing, and a parse-broken .gq)
asserting all three surface in one Err — confirmed red before the fix.

* Make invoke_query graph-scoped (one branch authority)

invoke_query gates reaching the curated stored-query surface — a graph-level
capability. Per-branch/snapshot access is already enforced by the inner
read/change gate in run_query/run_mutate (authorized against the resolved
branch), so branch-scoping the outer gate was redundant AND wrong for snapshot
reads (it defaulted to main). Drop the branch dimension: remove InvokeQuery
from uses_branch_scope (it joins admin as graph-scoped) and authorize the
boundary gate with branch: None.

Lossless: an actor confined to branch X by their read/change rules can still
only invoke a stored query that touches X. A rule that sets branch_scope on
invoke_query is now rejected by validate() — write invoke_query in its own
rule.

Ripple (atomic): restructure the server invoke fixture so invoke_query sits in
its own branch_scope-free rule; invert invoke_query_is_branch_scoped ->
invoke_query_rejects_branch_scope; the per-graph authorize test uses
branch: None; docs (policy.md, server.md, the InvokeQuery doc). No wire/OpenAPI
change.

* Resolve graph config by identity, not server mode

Which policy/queries block applies for a graph was decided three different,
mode-dependent ways: single-mode boot used top-level even for a named graph;
multi-mode used per-graph (and silently ignored a top-level queries block); the
CLI used per-graph for a named target. So `queries validate --target prod`
could check a different registry than the single-mode server loaded, and a
named graph's per-graph policy/queries were silently shadowed.

Make config a function of graph IDENTITY: a graph served by NAME
(--target/server.graph, a graphs: entry) uses its own graphs.<name>.{policy,
queries}; a bare URI is anonymous and uses top-level. One rule, applied by
single-mode boot, multi-mode boot, and the CLI — so they can't diverge and the
CLI predicts the server exactly.

No silent ignore: serving a named graph while a top-level policy/queries block
is populated now refuses boot, naming the block (the multi-mode top-level-policy
bail, extended to queries and to single-mode-named). The CLI's `queries
validate` derives the schema URI and the registry from ONE selection, and a
positional URI forces anonymous (ignoring cli.graph) so the two can't come from
different graphs.

BREAKING (released behavior): single mode by name (--target/server.graph) with
top-level policy/queries previously used top-level; it now uses the per-graph
block and refuses boot if top-level is also populated. Bare-URI single mode is
unchanged. Loud, with migration text pointing at graphs.<name>.

- config: resolve_policy_file_for (policy sibling of query_entries_for, no
  top-level fallback) + populated_top_level_blocks for the coherence check.
- characterization tests (single-mode named -> per-graph; named + top-level ->
  bail; multi-mode top-level queries -> bail; CLI positional-URI -> top-level).
- docs: policy.md, server.md, cli-reference.md.

* docs: RFC-002 credentials keyed by server name (keychain/profile/env)

Reworks the RFC's credentials model: secrets are keyed by server name — OS
keychain `omnigraph:<server>` (preferred) -> a `[<server>]` profile in
`~/.omnigraph/credentials` -> `OMNIGRAPH_TOKEN[_<SERVER>]` env (CI), the
AWS/gh/kube model. `servers.<name>` is endpoint-only by default but may carry
an explicit, secret-free `auth: { token: { env|file|command|keychain } }`
source. The shipped `bearer_token_env` + `.env.omni` dotenv remain a legacy
compat path; no `credentials.yaml`.

* docs: RFC-002 — typed graph locator (storage/server/graph_id), not a uri string

Add §1.1: the resolved graph address is a typed GraphLocator
(Embedded{storage} | Remote{server, graph_id}), not a flat uri: String.
Diagnoses the string model's cost in the code today (~16 is_remote_uri forks,
TargetConfig can't express multi-server x multi-graph, the CLI bails on remote,
the ts SDK models baseUrl+graphId separately) and settles the YAML naming so
the key names the locus:

- storage: (embedded) — shipped uri: is a deprecated alias
- server: + graph_id: (remote) — graph_id defaults to the entry key
- storage xor server, reject both/neither (no silent ambiguity)

Kills the graphs:/graph: collision and the uri:-might-be-a-server ambiguity.
Updates the §1/§8 examples and the entry-shape notes to the new naming.
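
A hedged sketch of the storage-xor-server rule, assuming a reduced entry shape. `GraphEntry` and `resolve_locator` are hypothetical, the error strings are illustrative, and the deprecated `uri:` alias is left out.

```rust
/// Typed graph address, following the shape named in §1.1.
#[derive(Debug, PartialEq)]
pub enum GraphLocator {
    Embedded { storage: String },
    Remote { server: String, graph_id: String },
}

/// Hypothetical raw `graphs.<key>` entry as read from YAML.
#[derive(Debug, Default)]
pub struct GraphEntry {
    pub storage: Option<String>,
    pub server: Option<String>,
    pub graph_id: Option<String>,
}

/// Resolves an entry to a locator: exactly one of `storage` / `server`
/// must be set, and a remote `graph_id` defaults to the entry key.
pub fn resolve_locator(key: &str, entry: &GraphEntry) -> Result<GraphLocator, String> {
    match (&entry.storage, &entry.server) {
        (Some(storage), None) => Ok(GraphLocator::Embedded {
            storage: storage.clone(),
        }),
        (None, Some(server)) => Ok(GraphLocator::Remote {
            server: server.clone(),
            graph_id: entry
                .graph_id
                .clone()
                .unwrap_or_else(|| key.to_string()),
        }),
        (Some(_), Some(_)) => Err(format!(
            "graph `{key}`: set either `storage` or `server`, not both"
        )),
        (None, None) => Err(format!(
            "graph `{key}`: one of `storage` or `server` is required"
        )),
    }
}
```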

* Test: queries list must reject an unknown --target

queries list opens no graph URI, so unknown-graph validation does not ride
along on resolve_target_uri the way it does for every other command. The new
test reproduces the gap: with an unknown --target the command currently exits 0
and prints the (empty) top-level registry instead of erroring like the
URI-resolving commands do. Fails against current code; the fix follows.

* Validate the graph selection in queries list

Graph-existence validation was a side effect of URI resolution: every
URI-resolving command rejects an unknown --target via resolve_target_uri, but
queries list opens no URI, so query_entries_for(Some(unknown)) silently fell
back to the top-level registry and showed the wrong (or empty) catalog.

Make membership a property of the selection: add the fallible
resolve_graph_selection alongside the infallible query_entries_for (a known
name passes through, an unknown name errors with the same message as
resolve_target_uri, None stays anonymous), and validate the selection in
execute_queries_list. query_entries_for is unchanged — server boot's bare-URI
path still needs its None -> top-level arm.

* Surface policy-engine errors from stored-query invoke

The invoke handler mapped every authorize_request failure to 404 ('stored
query not found'), which collapsed the authorization decision (deny -> 403)
together with operational failures (no actor -> 401, Cedar evaluation error ->
500). A real policy-engine 500 was hidden as a missing query.

Separate the two concerns instead of sniffing the masked status. Extract
authorize() returning an Authz { Allowed, Denied(msg) } decision and reserve
Err for operational failures only; authorize_request becomes a thin wrapper
that maps Denied -> 403, so the 16 deny-as-403 callers are unchanged. The
invoke handler now matches the decision directly: a denial stays 404 (deny ==
missing, so the catalog can't be probed without the grant), while a 401/500
propagates with its true status.

500 is now a reachable outcome on POST /queries/{name}; document it in the
endpoint responses and regenerate openapi.json.
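
The decision/error split can be sketched like this, as an illustration only. `Authz` follows the shape named above; `AuthzError`, its variants, and the two mapping functions are hypothetical, and HTTP statuses are modelled as bare `u16` values.

```rust
/// Authorization decision: a denial is a normal outcome, not an error.
#[derive(Debug, PartialEq)]
pub enum Authz {
    Allowed,
    Denied(String),
}

/// Operational failures only (hypothetical variants).
#[derive(Debug, PartialEq)]
pub enum AuthzError {
    NoActor,
    PolicyEngine(String),
}

impl AuthzError {
    /// No actor -> 401; policy evaluation failure -> 500.
    pub fn status(&self) -> u16 {
        match self {
            AuthzError::NoActor => 401,
            AuthzError::PolicyEngine(_) => 500,
        }
    }
}

/// The thin wrapper used by ordinary handlers: a denial is a 403.
pub fn authorize_request(decision: Result<Authz, AuthzError>) -> Result<(), u16> {
    match decision {
        Ok(Authz::Allowed) => Ok(()),
        Ok(Authz::Denied(_)) => Err(403),
        Err(e) => Err(e.status()),
    }
}

/// The stored-query invoke gate: a denial is masked as 404 (deny == missing),
/// while operational failures keep their true status.
pub fn invoke_gate(decision: Result<Authz, AuthzError>) -> Result<(), u16> {
    match decision {
        Ok(Authz::Allowed) => Ok(()),
        Ok(Authz::Denied(_)) => Err(404),
        Err(e) => Err(e.status()),
    }
}
```

Matching on the decision directly means the invoke handler never has to inspect a status code that another layer already chose.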

* Extract the named-graph/top-level coherence rule into one helper

The rule 'a named graph uses its own graphs.<name> block, so a populated
top-level block is a config error' lived inline in single-mode server boot.
Extract it to OmnigraphConfig::ensure_top_level_blocks_honored so the same
definition can be shared by the CLI selection gate (next commit) and the two
can't drift. Boot calls the helper; the message is reworded context-neutral
(drops 'serving') so it reads correctly from both boot and the CLI.

Behavior-preserving: multi-graph mode keeps its own unconditional check, and
single_mode_named_graph_rejects_top_level_blocks still passes.

* Test: queries validate/list must reject a named graph with a top-level block

Server boot refuses a config where a graph is selected by name yet a top-level
queries:/policy.file block is populated (the block would be silently ignored).
The CLI's queries validate/list resolve the same named selection but skip that
coherence check, so they give a false green / list the per-graph block. The new
test reproduces it: validate prints OK and list succeeds where boot would
refuse. Fails against current code; the fix follows.

* Enforce top-level coherence in the single CLI selection gate

queries validate validated graph membership only as a side effect of URI
resolution and queries list only via resolve_graph_selection's membership
check; neither applied the named-graph/top-level coherence rule server boot
enforces, so both gave a false green on a config boot refuses.

Fold ensure_top_level_blocks_honored into resolve_graph_selection so it is the
single gate that returns only valid + server-coherent selections, and route
resolve_selected_graph (queries validate) through it; queries list already
calls the gate. A named graph with a populated top-level block now errors in
both commands, matching boot. A positional URI stays anonymous (top-level
honored), so queries_validate_positional_uri_ignores_default_graph is
unaffected.

* docs: RFC-003 — MCP server surface for omnigraph-server

Detailed MCP-transport design for the stored-query/MCP work, building on the
shipped #128 registry. Corrects the draft against the branch head: the coarse
invoke_query gate + 404 denial-masking are already wired (server_invoke_query),
so per-query invoke_query scope (PolicyRequest has no query-name dimension yet)
is the real prerequisite; positions the doc as superseding rfc-001's MCP
transport (/mcp/tools+/mcp/invoke) and reconciles the shipped mcp.expose YAML
form and the schema-introspection non-goal; grounds the parity surface in the
actual omnigraph-ts package (13 tools with read/change ids, 2 resources).

* docs(config): clarify graph config boundaries

* fix(config): enforce graph-scoped policies and query validation

* fix(cli): require graph selection for scoped query registries

* fix(server): preserve named graph id in single mode policy

* fix(cli): share graph identity for policy resolution

* test(cli): cover policy tooling server graph selection

* fix(cli): honor server graph for policy tooling

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 22:50:31 +02:00


RFC: MCP Server Surface for omnigraph-server — Full Tool Parity, Stored Queries, Modular Auth

Status: Proposed
Date: 2026-06-01
Tickets: MR-969 (stored queries + MCP exposure: the surface this completes), MR-956 (federated auth / WorkOS OAuth: the auth substrate this consumes), MR-971 (per-server credential resolver), MR-974 (agent setup surface: the installer that wires this), MR-668 (multi-graph server: shipped, the routing this builds on)
Builds on: omnigraph#128 (ragnorc/stored-queries-mcp), the shipped stored-query registry, GET /queries, POST /queries/{name}, and the coarse invoke_query gate.
Supersedes: the MCP-transport portion of rfc-001-queries-envelope-mcp.md (/mcp/tools + /mcp/invoke). See Relationship to RFC-001.
Target release: v0.8.x (phased; see Rollout)

Summary

Add a first-class MCP (Model Context Protocol) server surface to omnigraph-server, exposed over Streamable HTTP, that projects the server's operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web, Cursor, etc.). Two populations of tools share one projection path:

  1. Built-in operational tools — parity with the existing @modernrelay/omnigraph-mcp stdio package's 13 tools (health, snapshot, read, schema_get, branches_list, commits_list, commits_get, change, ingest, branches_create, branches_delete, branches_merge, schema_apply) and its 2 resources (omnigraph://schema, omnigraph://branches), plus a new server-scoped graphs_list tool and an omnigraph://graphs resource (multi-graph mode).
  2. Dynamic stored-query tools — one MCP tool per mcp.expose: true entry in the queries: registry (MR-969 / #128), with parameters typed from the .gq declaration via the shipped query_catalog_entry / param_descriptor projection.

Every tool is authorized by the server's existing Cedar policy engine. The MCP layer never implements its own authentication: it consumes an already-resolved ResolvedActor from the server's bearer middleware (require_bearer_auth today; the TokenVerifier seam when MR-956 lands), so the same MCP endpoint serves on-prem (static or customer-OIDC tokens) and our cloud (WorkOS OAuth) by configuration only. Cloud OAuth is an additive layer (RFC 9728 protected-resource metadata) that slots in with zero MCP changes.

The end-state collapses two diverging tool implementations into one: the in-server MCP is the canonical, Cedar-gated, remotely-reachable surface; the stdio package becomes a thin stdio↔HTTP proxy (local on-ramp) over it.

Key caveat, stated up front (see §5.9 below): the headline "a token scoped via Cedar to a specific set of stored queries" requires per-query invoke_query scope, which is designed (rfc-001) but not yet implemented — the shipped action is coarse (any stored query on the graph, or none). Per-actor Cedar curation works today for built-in vs ad-hoc vs admin tools and for stored-vs-ad-hoc; sub-selecting individual stored queries per actor is gated on a prerequisite (PR 0b). Until then, stored-query curation is graph-level (registry membership + mcp.expose).

Relationship to RFC-001

rfc-001-queries-envelope-mcp.md (MR-656 / MR-976 / MR-969) is the parent design for stored queries + the response envelope + MCP. This RFC is the detailed MCP-transport design that #128 left for a follow-up, and it revises rfc-001 in three places where the shipped code or the MCP wire protocol diverged from rfc-001's sketch:

  1. Transport shape. rfc-001 sketched GET /mcp/tools + POST /mcp/invoke (a bespoke REST pair). That is not the MCP wire protocol — real MCP clients cannot connect to it. This RFC implements actual MCP JSON-RPC over Streamable HTTP and reuses query_catalog_entry as a projection source, not a parallel surface. (rfc-001's own Open Question already leaned toward Streamable HTTP.)
  2. Exposure config. rfc-001 specified inline .gq pragmas (@mcp(expose=…), default expose=false). #128 shipped a different mechanism: YAML queries.<name>.mcp.expose in omnigraph.yaml, default true (declaring a query in the manifest is the opt-in). This RFC builds on the shipped YAML form; the .gq-pragma design in rfc-001 is superseded for exposure.
  3. Schema introspection. rfc-001 lists "Schema introspection through MCP" as a non-goal ("agents see types through declared return shapes"). This RFC revises that: the operational-parity tools include schema_get and omnigraph://schema because the shipped stdio package already exposes both. The non-goal is achieved by policy, not omission: schema_get/omnigraph://schema are Cedar-gated by Read, and the recommended locked-down agent policy denies Read, so a curated agent still never sees the schema. (rfc-001's intent is preserved; the mechanism moves from "don't build it" to "build it, gate it.")

Everything else in rfc-001 (two-paths-one-engine, per-query invoke_query as the intended scope, the response envelope, multi-graph per-graph endpoints) this RFC consumes unchanged.

Numbering note: the TokenVerifier/WorkOS auth design is referred to in code (crates/omnigraph-server/src/identity.rs) as "RFC 0001," which is a different document from this repo's docs/dev/rfc-001-queries-envelope-mcp.md. To avoid the collision this RFC cites the auth substrate as MR-956 throughout, never "RFC 0001."

Reconciliation with shipped code (verified against ragnorc/stored-queries-mcp HEAD)

Verified against crates/omnigraph-server/src/{lib.rs,api.rs} and crates/omnigraph-policy/src/lib.rs at the current branch head (not the #128 PR body, and not api.rs alone):

  • GET /queries returns the mcp.expose == true subset as QueriesCatalogOutput { queries: [QueryCatalogEntry] }, each with typed ParamDescriptors, tool_name, description, instruction, and a mutation flag. MCP-ready projection, but exposed as bespoke REST/JSON — not the MCP wire protocol.
  • POST /queries/{name} route exists (server_invoke_query, lib.rs).
  • query_catalog_entry() / param_descriptor() with an exhaustive ScalarType → ParamKind map (a new scalar is a compile error).
  • InvokeQuery Cedar action defined in omnigraph-policy.
  • InvokeQuery IS enforced at POST /queries/{name}: server_invoke_query calls authorize(PolicyAction::InvokeQuery) and masks a denial to a 404 identical to "unknown query" so the catalog isn't probeable (the denial-masking the previous draft of this RFC reported as missing is shipped — it lives in lib.rs, not api.rs). The stored-mutation path is already double-gated: InvokeQuery outer, then Change inside run_mutate.
  • Reuse path exists: run_query / run_mutate are already decoupled from their HTTP request bodies and take registry-supplied (source, name, params, branch/snapshot). MCP tools/call for both stored and ad-hoc tools delegates to these — no new business logic.
  • Per-query (invoke_query[name]) scope is NOT implemented. PolicyRequest carries only {action, branch, target_branch}, with no query-name dimension, and the action is documented coarse ("permits any stored query on the graph"). rfc-001 designed per-name scope; it is unbuilt. This RFC's per-query Cedar filtering (§5.4) and recommended agent policy (§5.9) depend on it → tracked as PR 0b.
  • No MCP protocol surface (initialize/tools/list/tools/call, JSON-RPC, transport).
  • No TokenVerifier trait yet — require_bearer_auth resolves a ResolvedActor inline (static-hash). The trait/OidcJwtVerifier are MR-956 (draft). The MCP layer's only requirement — consume ResolvedActor — is satisfiable today.

Stack (verified Cargo.toml): Axum + utoipa (OpenAPI) + omnigraph-policy (Cedar) + futures + tokio. No MCP crate present. edition = "2024".

Motivation

  • One curated, safe, remotely-reachable tool surface. MR-969's thesis: hand an LLM a token Cedar-scoped to a set of tools and it sees exactly those typed tools — cannot construct ad-hoc queries it isn't permitted, cannot read the schema it isn't permitted, cannot reach other graphs. Today the only MCP is the stdio package: local-only, full surface, ungated.
  • Parity, so the in-server MCP can be the single implementation. Operators/agents already depend on the operational tools. Supporting them server-side behind one Cedar gate lets the stdio package degrade to a proxy and removes two diverging tool sets.
  • On-prem and cloud from one endpoint. A managed cloud (WorkOS OAuth) and an on-prem/air-gapped deploy (static or customer-OIDC tokens) must serve the same MCP without forks or MCP-specific auth.
  • Foundation for the agent on-ramp (MR-974). omnigraph mcp install --agent <tool> needs a decided transport + a stable endpoint.

Goals

  • Project built-in tools + stored queries as MCP tools through one registry abstraction.
  • tools/list and the callable set are identical for argument-independent authorization, both driven by Cedar (see §5.4 for the branch-scoped caveat).
  • The MCP layer is auth-method-agnostic: it consumes ResolvedActor, never a raw token, never branches on how auth happened.
  • The same endpoint works on-prem (static/OIDC) and cloud (WorkOS OAuth), switched by config; cloud OAuth is additive (RFC 9728).
  • No new business logic: MCP tools delegate to the same run_query/run_mutate/branch/schema functions the HTTP routes call.
  • Behaviour-neutral when unused: no MCP traffic = no change.

Non-Goals

  • Building/hosting an OAuth authorization server. The server is a Resource Server; WorkOS AuthKit+Connect is the AS (MR-956). The MCP endpoint validates tokens, never issues them, never holds client secrets.
  • OAuth/WorkOS implementation itself — MR-956's work. This RFC leaves a clean RFC-9728 hook and consumes ResolvedActor.
  • MCP prompts, elicitation, tools/list_changed, resource subscriptions, server-initiated messages. None needed → enables a stateless POST-only transport (§5.6).
  • stdio transport inside the server. stdio stays in the TS package (now a proxy).
  • Cross-graph tool listing. Per-graph catalogs only (MR-969 + RFC-002 non-goal).
  • Hot reload of the query registry. Restart-only (MR-969).

Background

omnigraph-server (Axum) already implements every operation this RFC exposes as an authenticated HTTP route; each authorizes via a PolicyAction against the Cedar policy for a server-resolved actor and calls into the engine. The existing stdio MCP package is a client of these routes (it owns no business logic). MR-956 will introduce a TokenVerifier trait (StaticHashTokenVerifier today inline, OidcJwtVerifier for OIDC/WorkOS) producing the ResolvedActor { actor_id, tenant_id: Option, scopes: Vec<Scope>, source } that already exists in identity.rs and is consumed by Cedar — token validation is offline (cached JWKS), so on-prem/air-gapped has no request-path dependency on the cloud.

Design

5.1 One tool model: a McpTool trait, two populators

Both built-in and stored-query tools implement one trait so tools/list / tools/call never special-case:

trait McpTool: Send + Sync {
    fn name(&self) -> &str;                       // MCP tool id (stable)
    fn title(&self) -> Option<&str>;
    fn description(&self) -> &str;
    fn input_schema(&self) -> serde_json::Value;  // JSON Schema (draft 2020-12)
    fn annotations(&self) -> ToolAnnotations;     // readOnlyHint / destructiveHint / idempotentHint
    /// The Cedar request(s) this call requires, given parsed args. Used BOTH at
    /// list-time (dry-run filter, default args) and call-time (enforce, real args).
    fn authorization(&self, args: &ToolArgs) -> Vec<PolicyRequest>;
    async fn call(&self, ctx: &GraphCtx, args: ToolArgs) -> Result<ToolOutput, ToolError>;
}
  • Built-ins: ~14 static impls, each delegating to the same function its HTTP route calls (run_query, run_mutate, branch ops, apply_schema_as, …). input_schema authored once (or derived from each route's existing utoipa/ToSchema DTO).
  • Stored queries: generated McpTool instances, one per mcp.expose entry; input_schema from param_descriptor (§5.3); authorization → InvokeQuery (coarse today; InvokeQuery{name} after PR 0b) then the inner Read/Change.

ToolRegistry for a graph = the static built-ins + the dynamic stored-query tools resolved from that graph's GraphHandle registry.
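
The composition described above can be sketched as follows. This is a minimal, std-only illustration and not the server's code: `Tool` is a simplified synchronous stand-in for `McpTool`, and `BuiltIn`, `StoredQueryTool`, `Entry`, and `ToolRegistry::for_graph` are hypothetical names. The rule for a stored query whose tool name collides with a built-in is also an assumption (the RFC does not specify one); here the built-in wins.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the RFC's `McpTool` (sync, name + mutation flag only).
trait Tool {
    fn name(&self) -> &str;
    fn is_mutation(&self) -> bool;
}

/// Hypothetical built-in tool with a fixed id.
struct BuiltIn {
    id: &'static str,
    mutation: bool,
}

/// Hypothetical tool generated from one exposed registry entry.
struct StoredQueryTool {
    tool_name: String,
    mutation: bool,
}

impl Tool for BuiltIn {
    fn name(&self) -> &str { self.id }
    fn is_mutation(&self) -> bool { self.mutation }
}

impl Tool for StoredQueryTool {
    fn name(&self) -> &str { &self.tool_name }
    fn is_mutation(&self) -> bool { self.mutation }
}

/// One registry entry as a loader might surface it: (tool_name, expose, mutation).
type Entry = (String, bool, bool);

struct ToolRegistry {
    tools: BTreeMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    /// Static built-ins first, then one tool per exposed stored query.
    /// Assumption: a stored query colliding with a built-in id is skipped.
    fn for_graph(builtins: Vec<BuiltIn>, entries: Vec<Entry>) -> Self {
        let mut tools: BTreeMap<String, Box<dyn Tool>> = BTreeMap::new();
        for b in builtins {
            tools.insert(b.id.to_string(), Box::new(b));
        }
        for (tool_name, expose, mutation) in entries {
            if expose && !tools.contains_key(&tool_name) {
                tools.insert(
                    tool_name.clone(),
                    Box::new(StoredQueryTool { tool_name, mutation }),
                );
            }
        }
        ToolRegistry { tools }
    }

    /// Tool ids in sorted order (BTreeMap iteration order).
    fn names(&self) -> Vec<&str> {
        self.tools.keys().map(|k| k.as_str()).collect()
    }
}
```

Because both populations sit behind the same trait object, listing and dispatch can iterate `tools` without knowing which kind an entry is.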

5.2 Tool catalog (parity) and Cedar mapping

Each built-in reuses the exact PolicyAction its HTTP route already enforces — verified against the handlers in lib.rs, not invented:

| MCP tool | Scope | Read/Mutate | Cedar action (verified from route) |
|---|---|---|---|
| health | server | read | none (liveness/version) |
| graphs_list (new) | server | read | GraphList |
| snapshot | graph | read | Read |
| schema_get | graph | read | Read |
| branches_list | graph | read | Read |
| commits_list, commits_get | graph | read | Read |
| query (ad-hoc .gq) / read (deprecated alias) | graph | read | Read |
| mutate (ad-hoc .gq) / change (deprecated alias) | graph | mutate | Change |
| ingest (NDJSON) | graph | mutate | Change (+ BranchCreate when forking a new branch) |
| branches_create | graph | mutate | BranchCreate |
| branches_delete | graph | mutate | BranchDelete |
| branches_merge | graph | mutate | BranchMerge |
| schema_apply (allow_data_loss) | graph | mutate | SchemaApply |
| stored query (find_user, …) | graph | inferred | InvokeQuery (coarse; InvokeQuery{name} after PR 0b) + inner Read/Change |

There is no Ingest and no separate snapshot/Export action: ingest enforces Change, snapshot enforces Read. (Export exists but maps to the /export route, which this RFC does not expose as a tool.)

Tool id parity vs. canonicalization. The shipped stdio package uses tool ids read/change (and calls the deprecated /read,/change routes). The server HTTP surface canonicalized to /query,/mutate with /read,/change deprecated (MR-656). To keep existing package clients working and align with the server, the MCP exposes query/mutate as canonical with read/change retained as deprecated-but-live aliases (both dispatch to the same handler). Open Q7 asks whether to drop the aliases later.

Resources (§5.5): omnigraph://schema, omnigraph://branches (parity), plus omnigraph://graphs (new) — each gated by the same action as its list/get route (Read, Read, GraphList).

5.3 ParamDescriptor → JSON Schema (stored-query tools)

| ParamKind | JSON Schema | Notes |
|---|---|---|
| String | {"type":"string"} | |
| Bool | {"type":"boolean"} | |
| Int (i32/u32) | {"type":"integer"} | |
| BigInt (i64/u64) | {"type":"string","pattern":"^-?\\d+$"} | JSON numbers lose precision >2⁵³ → string (matches the shipped api.rs rationale). (Open Q1) |
| Float (f32/f64) | {"type":"number"} | |
| Date | {"type":"string","format":"date"} | |
| DateTime | {"type":"string","format":"date-time"} | |
| Blob | {"type":"string","contentEncoding":"base64"} | |
| Vector | {"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim} | uses vector_dim |
| List | {"type":"array","items":<item_kind schema>} | scalar items only (grammar guarantees) |

nullable == false → param is in required. Annotations: mutation → {readOnlyHint:false, destructiveHint:true}; else {readOnlyHint:true}. description → tool description; instruction → appended to description (or _meta). (The shipped check() already warns when an mcp.expose query declares a Vector param an LLM can't supply.)
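
A minimal sketch of that mapping, assuming a simplified `ParamKind` (the variant payloads below are illustrative, not the shipped enum) and rendering the schema as a string so the example stays std-only. `json_schema` and `required` are hypothetical helper names; a real implementation would more likely build a `serde_json::Value`.

```rust
/// Simplified stand-in for the shipped `ParamKind`; payload shapes are assumptions.
enum ParamKind {
    String,
    Bool,
    Int,
    BigInt,
    Float,
    Date,
    DateTime,
    Blob,
    Vector { dim: u32 },
    List(Box<ParamKind>),
}

/// Render the JSON Schema fragment for one parameter kind, following the table.
/// The match is exhaustive, so adding a variant fails to compile until mapped.
fn json_schema(kind: &ParamKind) -> String {
    match kind {
        ParamKind::String => r#"{"type":"string"}"#.to_string(),
        ParamKind::Bool => r#"{"type":"boolean"}"#.to_string(),
        ParamKind::Int => r#"{"type":"integer"}"#.to_string(),
        // 64-bit integers travel as decimal strings to avoid precision loss.
        ParamKind::BigInt => r#"{"type":"string","pattern":"^-?\\d+$"}"#.to_string(),
        ParamKind::Float => r#"{"type":"number"}"#.to_string(),
        ParamKind::Date => r#"{"type":"string","format":"date"}"#.to_string(),
        ParamKind::DateTime => r#"{"type":"string","format":"date-time"}"#.to_string(),
        ParamKind::Blob => r#"{"type":"string","contentEncoding":"base64"}"#.to_string(),
        // Fixed-length numeric array: minItems and maxItems both equal the dimension.
        ParamKind::Vector { dim } => format!(
            r#"{{"type":"array","items":{{"type":"number"}},"minItems":{dim},"maxItems":{dim}}}"#
        ),
        ParamKind::List(item) => {
            format!(r#"{{"type":"array","items":{}}}"#, json_schema(item))
        }
    }
}

/// Names of non-nullable params, i.e. the schema's `required` array.
/// Input pairs are (param name, nullable).
fn required<'a>(params: &[(&'a str, bool)]) -> Vec<&'a str> {
    params
        .iter()
        .filter(|(_, nullable)| !nullable)
        .map(|(name, _)| *name)
        .collect()
}
```

A table-driven test over every variant, as the Testing section proposes, falls out naturally from a function of this shape.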

For built-in tools the schema is hand-authored from the route DTO; e.g. query → {source: string, branch?: string, params?: object}; schema_apply → {schema: string, allow_data_loss?: boolean}; ingest → {ndjson: string, mode?: "merge"|"append"|"overwrite", branch?: string}.

5.4 tools/list (Cedar-filtered) and tools/call (dispatch + masking)

  • tools/list: build the ToolRegistry; for each tool evaluate authorization(default_args) against the actor's Cedar policy; emit only tools that authorize. Authz decisions memoized per request. Stored-query tools additionally require mcp.expose: true.
    • Exactness caveat (R7 is conditional): the listed set equals the callable set only for tools whose authorization is argument-independent (health, graphs_list, snapshot, schema_get, branches_list, commits_*, ad-hoc query/mutate, and stored queries under the coarse action). For branch-scoped tools (branches_create/merge with target_branch_scope, and any branch-scoped Read/Change rule), list-time uses default_args (e.g. branch main) and cannot know the real target, so the listed set is a best-effort approximation of callability — a call may still be denied (or, rarely, a hidden tool would have been allowed). tools/call is always the authoritative gate. The contract is: list never shows a tool the actor can't ever call; for branch-scoped tools it may show one the actor can call only on some branches.
  • tools/call: resolve name → McpTool (masked-404 if unknown or mcp.expose:false); parse+validate args against input_schema; enforce authorization(args) (mutations stay double-gated: InvokeQuery then Change); on success call. Denial masking lives in one place (the dispatcher): an authz denial is returned identically to "unknown tool" (§5.10), reusing the same deny≡missing principle already shipped at POST /queries/{name}.
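
The two paths above can be sketched with a toy policy. Everything here is illustrative: `Policy` is a plain allow-list standing in for Cedar, `Tool::requires` stands in for `authorization(args)` in the argument-independent case, and `tools_list` / `tools_call` are hypothetical function names. The point is only that a denied call and an unknown name produce the same error value.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Action {
    Read,
    Change,
    InvokeQuery,
}

/// Toy tool: a name plus the actions a call requires (argument-independent here).
struct Tool {
    name: &'static str,
    requires: Vec<Action>,
}

/// Toy default-deny policy: the actor may perform only the listed actions.
struct Policy {
    allowed: Vec<Action>,
}

impl Policy {
    /// Every required action must be granted (this is what double-gating means).
    fn authorizes(&self, tool: &Tool) -> bool {
        tool.requires.iter().all(|a| self.allowed.contains(a))
    }
}

#[derive(Debug, PartialEq)]
enum CallError {
    /// Used for both "no such tool" and "denied", so policy is not probeable.
    UnknownTool,
}

/// List-time: emit only the tools the actor is authorized for.
fn tools_list(tools: &[Tool], policy: &Policy) -> Vec<&'static str> {
    tools
        .iter()
        .filter(|t| policy.authorizes(t))
        .map(|t| t.name)
        .collect()
}

/// Call-time: resolve, then enforce; both failure modes collapse to one error.
fn tools_call(tools: &[Tool], policy: &Policy, name: &str) -> Result<&'static str, CallError> {
    let tool = tools
        .iter()
        .find(|t| t.name == name)
        .ok_or(CallError::UnknownTool)?;
    if !policy.authorizes(tool) {
        return Err(CallError::UnknownTool);
    }
    Ok(tool.name) // a real dispatcher would invoke the tool here
}
```

With an actor granted only `InvokeQuery`, a read-only stored query is listed and callable, while a stored mutation (which also needs `Change`) and the ad-hoc `query` tool are both hidden and masked.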

5.5 Resources

Advertise resources capability (subscribe:false, listChanged:false). resources/list → the URIs the actor may read; resources/read → schema .pg text / branches JSON / (multi-graph) graphs JSON, each gated by the corresponding action (Read, Read, GraphList). A locked-down agent denied Read simply never sees omnigraph://schema or omnigraph://branches — this is how rfc-001's "agents don't introspect schema" intent is met by policy (§Relationship-to-RFC-001).

5.6 Transport: Streamable HTTP, stateless, POST-only

  • Streamable HTTP (MCP's current standard; we're already an HTTP server). One endpoint per scope (§5.7).
  • Because the server emits no server-initiated messages, implement the minimal conformant shape: client POSTs JSON-RPC, server replies application/json. No SSE channel, no Mcp-Session-Id, stateless — each request authenticated independently via the bearer middleware. Honour the MCP-Protocol-Version header. SSE/sessions can be added later if subscriptions land.
  • JSON-RPC methods: initialize (advertise {tools:{listChanged:false}, resources:{listChanged:false, subscribe:false}} + serverInfo/version), notifications/initialized (no-op ack), ping, tools/list, tools/call, resources/list, resources/read. prompts/list returns empty if probed.
  • Library decision (Open Q2): spike rmcp (official Rust MCP SDK) for conformance + Streamable-HTTP/Axum on edition 2024; fall back to a hand-rolled ~150 LOC JSON-RPC-over-POST (only the methods above) on friction. Given the tiny surface, hand-roll is an acceptable default.
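
To give a sense of how small the hand-rolled option is, the method routing for the list above might look like the sketch below. `Method`, `parse_method`, and `expects_response` are hypothetical names; JSON parsing and the HTTP layer are omitted. The only external fact relied on is the JSON-RPC 2.0 code -32601 for "Method not found", and that notifications receive no response body.

```rust
/// The JSON-RPC methods this RFC's transport answers.
#[derive(Debug, PartialEq)]
enum Method {
    Initialize,
    Initialized,
    Ping,
    ToolsList,
    ToolsCall,
    ResourcesList,
    ResourcesRead,
    PromptsList,
}

/// JSON-RPC 2.0 "Method not found".
const METHOD_NOT_FOUND: i32 = -32601;

/// Map the wire method name to a handler tag, or the JSON-RPC error code.
fn parse_method(name: &str) -> Result<Method, i32> {
    match name {
        "initialize" => Ok(Method::Initialize),
        "notifications/initialized" => Ok(Method::Initialized),
        "ping" => Ok(Method::Ping),
        "tools/list" => Ok(Method::ToolsList),
        "tools/call" => Ok(Method::ToolsCall),
        "resources/list" => Ok(Method::ResourcesList),
        "resources/read" => Ok(Method::ResourcesRead),
        // Answered with an empty list if a client probes it.
        "prompts/list" => Ok(Method::PromptsList),
        _ => Err(METHOD_NOT_FOUND),
    }
}

/// Notifications carry no `id` and get no JSON-RPC response body.
fn expects_response(method: &Method) -> bool {
    !matches!(method, Method::Initialized)
}
```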

5.7 Endpoint routing (server- vs graph-scoped)

  • Single-graph mode: POST /mcp — graph tools + server tools (health, graphs_list).
  • Multi-graph mode (MR-668): POST /graphs/{graph_id}/mcp — graph-scoped tools for that graph; plus a server-level POST /mcp exposing only server-scoped tools (health, graphs_list). A per-graph endpoint never lists another graph's tools (isolation, tested). Mirrors the shipped /graphs/{graph_id}/… cluster routing. (Open Q5: confirm naming + whether server tools also appear on the per-graph endpoint.)
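
The routing rule can be written down as a small pure function. This is a sketch under assumptions: `Scope` and `mcp_scope` are hypothetical names, the real server would register Axum routes rather than match on path strings, and rejecting `/graphs/{graph_id}/mcp` in single-graph mode is an assumption (Open Q5 leaves details of the per-graph endpoint unsettled).

```rust
#[derive(Debug, PartialEq)]
enum Scope<'a> {
    /// Server-scoped tools only (health, graphs_list).
    Server,
    /// Graph tools plus server tools (single-graph `/mcp`).
    SingleGraph,
    /// Graph-scoped tools for one named graph.
    Graph(&'a str),
}

/// Decide which tool population an MCP request path addresses.
fn mcp_scope(path: &str, multi_graph: bool) -> Option<Scope<'_>> {
    if path == "/mcp" {
        return Some(if multi_graph { Scope::Server } else { Scope::SingleGraph });
    }
    // Expect exactly `/graphs/<id>/mcp` with a single non-empty id segment.
    let id = path.strip_prefix("/graphs/")?.strip_suffix("/mcp")?;
    if multi_graph && !id.is_empty() && !id.contains('/') {
        Some(Scope::Graph(id))
    } else {
        None
    }
}
```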

5.8 Modular / decoupled auth (the cross-cutting requirement)

Invariant (load-bearing, satisfiable today): the MCP handler receives an already-resolved ResolvedActor and branches on nothing about how the token was verified. No token parsing, no method check, no OAuth inside the MCP module. Today that actor comes from require_bearer_auth; when MR-956 lands it comes from a TokenVerifier — the MCP code is identical either way.

request → [auth middleware: ResolvedActor] → [MCP route] → Cedar → McpTool

Server side — auth is config, not code:

| Deployment | Verifier | MCP change |
|---|---|---|
| On-prem, static bearer | require_bearer_auth / StaticHashTokenVerifier | none |
| On-prem, customer IdP | OidcJwtVerifier → customer issuer (MR-956) | none |
| Our cloud | OidcJwtVerifier → WorkOS, tenant_id = Some(org_id) (MR-956) | none |

Token validation is offline (cached JWKS) — on-prem/air-gapped keeps working with no request-path cloud dependency. The MCP endpoint never terminates OAuth and never holds a client secret (Resource Server only).

Cloud client negotiation — additive, no MCP changes: when MR-956 lands, the server publishes RFC 9728 /.well-known/oauth-protected-resource and returns WWW-Authenticate: Bearer ..., resource_metadata="..." on 401. A compliant MCP client (Claude) then auto-negotiates: static bearer to an on-prem endpoint; on a cloud 401 it discovers the WorkOS AS and runs OAuth/PKCE itself — same endpoint URL, zero client-side branching. This RFC only requires that MCP routes flow through the standard 401 path so that hook can be added later without touching MCP.

Multi-user identity pass-through (cloud): the caller's token (a WorkOS JWT, audience-bound per-tenant) must reach the server so Cedar enforces per-user/per-tenant policy — never a shared service token. The MCP endpoint validates it offline and maps org_id → tenant_id. This is why the remote path is the in-server HTTP MCP that Claude connects to directly (its token flows through), not a stdio bridge impersonating a user.

Client-side credential acquisition (CLI/SDK/proxy) — pluggable CredentialSource (RFC-002 §5, MR-971), keyed by server name, so OAuth is a future sibling key, not a re-key:

servers:
  onprem: { endpoint: https://og.internal:8080, auth: { token: { env: OG_TOKEN } } }
  edge:   { endpoint: https://og-edge,          auth: { token: { command: [vault, read, -field=token, secret/og] } } }
  cloud:  { endpoint: https://api.omnigraph.cloud, auth: { oauth: { issuer: workos } } }   # future sibling

Implicit chain when auth: omitted: OMNIGRAPH_TOKEN_<NAME> → keychain omnigraph:<name> → [<name>] in ~/.omnigraph/credentials; legacy bearer_token_env honoured. Secrets never inlined.
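
The implicit chain is a first-match-wins lookup, which the following sketch illustrates. The names `token_env_var` and `resolve_token` are hypothetical, and the way a server name is normalised into the `<NAME>` suffix (upper-casing, non-alphanumerics to `_`) is an assumption not stated in this RFC. Sources are modelled as plain function pointers so the example needs no environment, keychain, or file access; a real `CredentialSource` would presumably be a trait.

```rust
/// Env var name for a server. The normalisation rule is an assumption.
fn token_env_var(server: &str) -> String {
    let suffix: String = server
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_uppercase() } else { '_' })
        .collect();
    format!("OMNIGRAPH_TOKEN_{suffix}")
}

/// One lookup step: given the server name, maybe produce a token.
type Source = fn(&str) -> Option<String>;

/// Walk the chain in order; the first source that yields a token wins and
/// later sources are not consulted.
fn resolve_token(server: &str, sources: &[Source]) -> Option<String> {
    sources.iter().find_map(|lookup| lookup(server))
}
```

Ordering the slice as env, keychain, credentials file reproduces the chain; an explicit `auth:` block would bypass it entirely.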

5.9 Safety model — Cedar is the gate, default-deny is the floor

With ad-hoc query/mutate/schema_apply present as tools, the only thing protecting an untrusted agent is the Cedar policy. Therefore:

  • Default-deny when tokens are configured (MR-723, shipped) is the floor — an actor with no grants sees an empty tool list.
  • What works today (coarse action): a policy can hide all ad-hoc tools and admin tools per-actor (deny Read, Change, SchemaApply, Branch*) while allowing stored queries (allow InvokeQuery). That already reproduces "can't run ad-hoc, can't read schema, can only call stored queries" — the agent sees every exposed stored query plus nothing else.
  • What needs PR 0b (per-query scope): selecting which stored queries an actor may call (allow InvokeQuery [find_user, list_orders], deny the rest). The shipped invoke_query is coarse (all stored queries or none). Until PR 0b adds a query-name dimension to PolicyRequest + the Cedar schema (rfc-001's intended design), per-actor sub-selection of stored queries is not expressible; curation is graph-level (which .gq files are registered + mcp.expose).
  • schema_apply, branches_delete, ad-hoc mutate require an explicit admin-tier grant; never in a default agent policy.
  • (Open Q3) Optional mcp.allow_adhoc server switch defaulting off for the ad-hoc query/mutate tools — defence-in-depth independent of Cedar, and independent of PR 0b.
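
The difference between the shipped coarse action and the PR 0b per-query scope can be made concrete with a toy grant type. This is not Cedar and not the shipped `PolicyRequest`: `InvokeGrant` and `may_invoke` are hypothetical, and the `Named` variant models a capability that does not exist yet.

```rust
/// What an actor's InvokeQuery grant covers.
enum InvokeGrant {
    /// Default-deny floor: no stored queries.
    None,
    /// Shipped coarse action: every stored query on the graph.
    AnyStoredQuery,
    /// Hypothetical PR 0b shape: only the named stored queries.
    Named(Vec<&'static str>),
}

/// Would this grant permit invoking `query_name`?
fn may_invoke(grant: &InvokeGrant, query_name: &str) -> bool {
    match grant {
        InvokeGrant::None => false,
        // Coarse: the query name is not even consulted.
        InvokeGrant::AnyStoredQuery => true,
        // Per-query: requires the name to reach the policy decision.
        InvokeGrant::Named(names) => names.iter().any(|n| *n == query_name),
    }
}
```

The `AnyStoredQuery` arm ignores `query_name`, which mirrors why today's `PolicyRequest` (no query-name dimension) cannot express the `Named` case.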

5.10 Result shaping and error mapping

  • Success: tools/call returns content: [{type:"text", text:<json>}] where <json> is the route's existing output envelope (read rows / mutation summary, i.e. ReadOutput / ChangeOutput). (Open Q4: also emit structuredContent + outputSchema — defer; text-JSON for v1.)
  • Tool execution error (bad params after schema validation, engine error): result with isError:true + a text content block.
  • Authorization denial / unknown tool / mcp.expose:false: a single JSON-RPC error (-32602, message "unknown tool") — identical for all three so policy isn't probeable (same principle as the shipped POST /queries/{name} 404 masking).
  • Auth failure (bad/absent bearer): HTTP 401 from the middleware before MCP — carries WWW-Authenticate (the RFC 9728 hook), never masked as a tool error. (This is exactly the path the shipped authorize/authorize_request split preserves: operational failures keep their status; only denials are masked.)
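
The four cases above reduce to one mapping function. The sketch below uses hypothetical types (`Outcome`, `Wire`, `to_wire`) and leaves out JSON serialisation; it only shows which outcomes share a wire shape and which deliberately do not.

```rust
/// What happened to a tools/call request, before shaping.
#[derive(Debug, PartialEq)]
enum Outcome {
    Success(String),
    ExecutionError(String),
    Denied,
    UnknownTool,
    NotExposed,
    Unauthenticated,
}

/// What the client observes.
#[derive(Debug, PartialEq)]
enum Wire {
    /// HTTP 401 from the middleware, carrying WWW-Authenticate.
    Http401WithChallenge,
    /// A tools/call result with a single text content block.
    ToolResult { is_error: bool, text: String },
    /// A JSON-RPC error object.
    JsonRpcError { code: i32, message: &'static str },
}

fn to_wire(outcome: Outcome) -> Wire {
    match outcome {
        // Never masked: the client needs the challenge to negotiate auth.
        Outcome::Unauthenticated => Wire::Http401WithChallenge,
        Outcome::Success(text) => Wire::ToolResult { is_error: false, text },
        Outcome::ExecutionError(text) => Wire::ToolResult { is_error: true, text },
        // Three causes, one indistinguishable response.
        Outcome::Denied | Outcome::UnknownTool | Outcome::NotExposed => {
            Wire::JsonRpcError { code: -32602, message: "unknown tool" }
        }
    }
}
```

Keeping the masking in a single match arm is one way to honour "denial masking lives in one place".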

Relationship to the @modernrelay/omnigraph-mcp stdio package

Verified surface of the package (omnigraph-ts, pkg version 0.3.0, @modelcontextprotocol/sdk@^1.29.0, stdio only): 13 tools (health, snapshot, read, schema_get, branches_list, commits_list, commits_get, change, ingest, branches_create, branches_delete, branches_merge, schema_apply) and 2 resources (omnigraph://schema, omnigraph://branches). It is a thin client over the SDK → HTTP routes and forwards the caller's bearer verbatim (no inspection).

Once parity lands, collapse to one implementation: the in-server MCP is canonical (Cedar-gated, remote-capable, the path that becomes a Claude-web connector via MR-956). The stdio package degrades to a thin stdio↔HTTP proxy forwarding JSON-RPC (and the incoming Authorization) to /mcp — staying the local on-ramp for Claude Code/Desktop while sharing one tool set, one Cedar gate. Transition: keep the current independent stdio package on its 0.3.x/0.6.x line; ship proxy mode in a later TS minor once the server endpoint is GA. (Note: the package is currently several minors behind the server — its vendored spec/openapi.json predates the stored-query routes — so it needs the standard re-sync regardless of MCP work.)

Testing

  • Protocol conformance: initialize handshake + advertised capabilities; tools/list shape; tools/call happy path; JSON-RPC error envelopes (-32601 unknown method, -32602 invalid params / unknown tool); resources/list + resources/read.
  • Cedar filtering (coarse, today): an actor with allow InvokeQuery + deny Read/Change sees all exposed stored queries but not query/mutate/schema_get; tools/call query returns masked "unknown tool"; an admin sees the full catalog.
  • Cedar filtering (per-query, gated on PR 0b): actor scoped to InvokeQuery [find_user] sees only find_user; tools/call list_orders masks. This test ships with PR 0b, not PR 1 — it cannot pass against the coarse action.
  • Parity per built-in: each tool round-trips against the same expectations as its HTTP route (reuse route tests); read/change aliases dispatch identically to query/mutate.
  • Double-gating: a stored mutation requires both InvokeQuery and Change; schema_apply requires SchemaApply.
  • mcp.expose:false: absent from GET /queries and MCP tools/list; still service-callable by name through POST /queries/{name} when the actor has invoke_query, but not MCP-callable.
  • Schema generation: table-driven over every ParamKind incl. nullable / list / vector(dim).
  • Branch-scoped list approximation: assert the documented R7 caveat — a branch-scoped policy lists branches_create, and tools/call is the authoritative gate (a denied target still 403s/masks).
  • Multi-graph isolation: /graphs/a/mcp never lists graph b's tools; server /mcp exposes only server tools.
  • Auth decoupling: the MCP suite is green under the current require_bearer_auth and under a mock OIDC ResolvedActor source — proving verifier-agnosticism. A 401 carries WWW-Authenticate.
  • OpenAPI: the JSON-RPC endpoint is not REST — document only the envelope in utoipa (or exclude); keep openapi.json drift test green (OMNIGRAPH_UPDATE_OPENAPI=1 to regenerate on intentional change).
  • Cross-repo smoke (optional): point @modelcontextprotocol/sdk (TS) at the HTTP endpoint in an omnigraph-ts integration test.

Rollout — phased by risk

  • PR 0a — extract the reusable invoke path (small). The coarse invoke_query gate + 404 denial-masking are already shipped in server_invoke_query. Extract the read/mutate dispatch into invoke_stored_query(handle, name, params, branch/snapshot, actor) so MCP tools/call and the HTTP route share one path. No behaviour change. (Replaces the previous draft's "PR 0 — wire the gate", which was already done.)
  • PR 0b — per-query invoke_query scope (the safety prerequisite). Add a query-name dimension to PolicyRequest + the Cedar schema (rfc-001's intended design), wire it at POST /queries/{name} and in the stored-query McpTool::authorization. Independently useful (the allow InvokeQuery [find_user] policy). Gates the per-query Cedar-filtering test and §5.9's recommended agent policy.
  • PR 1 — MCP transport + read-only parity + stored-query reads. Endpoint(s), initialize/tools/list/tools/call/resources/*, the McpTool registry, Cedar-filtered listing, the read-only built-ins (health, graphs_list, snapshot, read/query, schema_get, branches_list, commits_*) + resources + stored-query reads. All auth-agnostic.
  • PR 2 — mutating parity + stored-query mutations. change/mutate, ingest, branches_create/delete/merge, schema_apply, stored-query mutations + the mcp.allow_adhoc switch.
  • PR 3 — docs + agent on-ramp hook. docs/user/server.md MCP section (incl. the recommended agent policy + the coarse-vs-per-query caveat), openapi.json sync, the omnigraph mcp install config target (MR-974), and the downstream omnigraph-ts re-sync/proxy follow-up.
  • Later (separate, MR-956): RFC 9728 protected-resource metadata + WorkOS — slots in with zero MCP changes.
  • Later (TS minor): stdio package → proxy mode.

Migration / backwards compatibility

  • Additive. No queries: and no MCP traffic → today's behaviour unchanged. New endpoints are new routes.
  • Cedar default-deny (when tokens configured) means MCP exposes nothing until an actor is granted — safe by default.
  • The stdio package keeps working unchanged; proxy mode is opt-in later.
  • openapi.json only gains the documented MCP envelope; existing REST routes untouched.

Open Questions

  1. BigInt/u64 as JSON string (recommended, precision-safe) vs number.
  2. rmcp vs hand-rolled JSON-RPC (spike rmcp on edition 2024; default to hand-roll on friction).
  3. Default-off mcp.allow_adhoc for ad-hoc query/mutate (recommended) vs always-on + Cedar-only.
  4. structuredContent + outputSchema now vs text-JSON v1 (recommend v1 text-JSON).
  5. Endpoint paths: /mcp + /graphs/{id}/mcp — confirm naming and whether server-scoped tools also appear on the per-graph endpoint.
  6. Stateless POST-only confirmed (no near-term server-initiated messages) — revisit only if subscriptions land.
  7. Legacy alias tools (read/change): keep for client compat (the shipped package uses them), or drop and rely on query/mutate?
  8. PR 0b shape: per-query scope as a Cedar resource (StoredQuery::"find_user") vs a query_name context attribute + policy condition — affects how allow InvokeQuery [list] is authored.