omnigraph/docs/dev/rfc-003-mcp-server-surface.md
Ragnor Comerford bcd0d9c867
feat(mcp): MCP server surface — Streamable-HTTP transport + tool/resource projection (RFC-003)
Add the `omnigraph-mcp` crate (stateless Streamable-HTTP transport, `McpBackend`
seam, fail-closed Host/Origin policy) and the server backend projecting built-in
operations and the per-graph stored-query registry as MCP tools + resources over
`POST /graphs/{id}/mcp`. Every tool delegates to the same engine/handler
functions the REST routes use and is gated by the same Cedar `authorize` path;
reads/writes carry structured output.

Includes three correctness fixes from review + live testing:

- tools/list is a faithful relaxation of the per-call gate: a built-in whose
  authorization depends on a caller-chosen branch is shown iff the actor could
  invoke it on some branch, via PolicyEngine::permits_on_any_branch (capability
  probe through the same Cedar authorizer). A fabricated-`main` probe wrongly
  hid graph_mutate under the canonical "protect main, write unprotected" policy.
- The stored-query surface honors mode + `expose` on call as well as on list:
  resolve_stored_tool is the single membership test, so the meta pair
  (stored_query_list/stored_query_run) is callable only in `meta` mode and
  stored_query_run resolves exposed-only. An `expose:false` query is unreachable
  by name on the agent surface (it stays HTTP/service-callable).
- The loopback Host allow-list is the full set [127.0.0.1, ::1, localhost]
  (matches rmcp's default), so an IPv6 loopback `Host: [::1]` is accepted
  regardless of which stack the server bound.

The protocol-version contract is documented (initialize negotiates the version
in its body, so the MCP-Protocol-Version header is validated on non-init
requests only) and pinned by a test.

Tests: omnigraph-mcp/tests/standalone.rs, omnigraph-server/tests/mcp.rs,
omnigraph-policy permits_on_any_branch unit test, omnigraph-api-types schema
projection. Full workspace gate green.
2026-06-17 14:00:52 +02:00

# RFC-003: MCP Server Surface for `omnigraph-server`
**Status:** Proposed — buildable implementation spec.
**Date:** 2026-06-13
**Audience:** server/engine maintainers.
**Tickets:** MR-969 (stored queries + MCP exposure), MR-956 (OAuth/RFC-9728 layer),
MR-971 (per-server credential resolver — landed as RFC-007), MR-974 (`omnigraph mcp install`).
**Builds on:** [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md)
(stored queries + the response envelope), [rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md)
(multi-graph boot), [rfc-007-operator-config.md](rfc-007-operator-config.md)
(client credential model), [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md)
(one-contract/two-implementations posture).
**Target release:** v0.8.x.
**Re-validated against main** 2026-06-16 (post RFC-009 `omnigraph-api-types`/`GraphClient`,
RFC-011 cluster-only server, RFC-009 canonical `POST /load`, RFC-012 embeddings): every
`file:line` and `Reuses` citation below was re-checked against the merged tree; the
deltas are folded in (cluster-only routing in [§15](#15-routing--reuses-build_app),
the DTO crate move in [§9](#9-stored-query-projection), `/ingest` → `/load`, and the
per-query `expose`/`tool_name` deferral in [§17](#17-decisions--rollout)). An external
review pass (8 findings) was then folded in as **correct-by-design** fixes, not point
patches — the schema generator is locked to the engine coercer by an equivalence test
(§9.1), Origin is fail-closed by a single host-policy constructor (§7), the list seam is
non-paginated by contract (§6), and name collisions fail at validate-time (§9.3); the
resolved decisions are catalogued in [§17](#17-decisions--rollout).
**Validated against** (re-checked 2026-06-13): MCP protocol revision **`2025-11-25`**
(modelcontextprotocol.io), the official Rust SDK **`rmcp 1.7.0`** (crates.io /
github.com/modelcontextprotocol/rust-sdk), and current tool/security best practice
(Anthropic engineering, MCP spec security pages). Provider compatibility was
checked against the live docs of Claude Code/Desktop/web, the Claude Messages API
MCP connector, OpenAI's Responses API + ChatGPT connectors, Cursor, VS Code Copilot,
and OpenCode. Code snippets marked **`Reuses`** are present in `omnigraph-server`
today, cited to `file:line`; snippets marked **`New`** are the code to add.

---

## 1. Summary
Add a first-class **MCP (Model Context Protocol) server surface to
`omnigraph-server`**, served over **Streamable HTTP**, that projects the server's
operations as MCP **tools** and **resources** for LLM clients. Two tool populations
share one projection path:
1. **Built-in operational tools** — graph read/mutate, schema get/apply, branch
create/delete/merge/list, commit list/get, NDJSON load, and a graph-scoped
`graph_health` liveness tool, plus resources `omnigraph://schema` and
`omnigraph://branches`.
2. **Dynamic stored-query tools** — projected from the graph's loaded stored-query
registry: either one typed tool per query (small catalogs) or a
discovery + execute meta-tool pair (large catalogs) — see [§9](#9-stored-query-projection).
Every tool is **authorized by the server's existing Cedar policy engine**. The MCP
layer performs **no authentication of its own**: it consumes an already-resolved
actor identity from the server's bearer/OAuth middleware, so the **same endpoint
serves on-prem (static or customer-OIDC tokens) and cloud (OAuth 2.1) by
configuration only**. The transport is **stateless JSON over a single POST
endpoint** — the minimal conformant Streamable-HTTP shape, since the server emits no
server-initiated messages.
The surface is built so the existing local stdio MCP package can later collapse into
a thin stdio↔HTTP proxy over it, leaving one Cedar-gated, remotely-reachable tool
set ([§13](#13-provider-compatibility)).
## 2. Goals
- Project built-in tools **and** stored queries through **one** registry abstraction,
so `tools/list` / `tools/call` never special-case a population.
- Make `tools/list` and the callable set agree for argument-independent authorization,
both driven by Cedar; `tools/call` is always the authoritative gate.
- Keep the MCP layer **auth-method-agnostic**: it consumes a resolved actor, never a
raw token, and never branches on how authentication happened.
- Add **no business logic**: tools delegate to the same engine functions the HTTP
routes call.
- Be **code-mode-friendly** (typed schemas, structured output, stable names,
progressive disclosure) and **maximally client-compatible** (Streamable HTTP +
bearer today, OAuth 2.1 + RFC 9728 as an additive layer).
- Behaviour-neutral when unused: no MCP traffic ⇒ no change.
## 3. Non-Goals
- **Hosting an OAuth authorization server.** The server is a **Resource Server** only
(validates tokens, never issues them, never holds client secrets). The AS is a
separate concern (MR-956).
- **MCP prompts, elicitation, sampling, tasks, `tools/list_changed` subscriptions,
resource subscriptions, server-initiated messages** — none required, which is what
permits the stateless POST-only transport. (`tools/list_changed` is reconsidered
only if the registry gains runtime reload.)
- **stdio transport inside the server** — stdio stays in the TS package (later a proxy).
- **Client-side "code mode" machinery** (TS wrapper generation, sandboxes, tool
search/deferral) — those are client/runtime concerns; see [§12](#12-code-mode-compatibility)
for what the server does to support them and what it deliberately does not build.
- **Cross-graph tool listing** — per-graph catalogs only.
## 4. Protocol target
Target MCP revision **`2025-11-25`** (current). `rmcp 1.7.0` advertises this as its
latest and negotiates down to any of `2024-11-05 / 2025-03-26 / 2025-06-18 /
2025-11-25`; an absent `MCP-Protocol-Version` header defaults to `2025-03-26` and an
unsupported one is a `400`. Revision `2025-06-18` is the floor we rely on for two
features: **structured tool output** (`outputSchema` + `structuredContent`) and the
**OAuth Resource-Server** model. From `2025-11-25` we adopt: **input-validation
errors as tool-execution errors** (SEP-1303), **JSON Schema 2020-12** as the default
dialect, **`403` on a present-but-disallowed `Origin`** (validated **fail-closed** by a
single host-policy constructor — §7, not a config-presence default), and
**`WWW-Authenticate` made optional** with a `.well-known` fallback.
**Transport shape (stateless Streamable HTTP).** The server exposes one endpoint that
accepts `POST` (and answers `GET`/`DELETE` with `405 + Allow: POST`). For a JSON-RPC
*request* it returns one `application/json` object; it opens no SSE stream, assigns no
`Mcp-Session-Id`, and treats every request independently — a fully conformant
stateless server. It **MUST** validate `Origin` (`403` on mismatch) and honor
`MCP-Protocol-Version`. `rmcp` delivers all of these in stateless mode (§7).
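
The method and version rules above can be sketched as one pure pre-dispatch check. This is an illustration under stated assumptions, not rmcp's code: `precheck`, `Precheck`, and the `is_initialize` flag are hypothetical names, and the `initialize` carve-out follows the protocol-version contract (the version is negotiated in the `initialize` body, so the header is validated on non-init requests only).

```rust
/// Protocol revisions this RFC lists as negotiable.
static SUPPORTED: [&str; 4] = ["2024-11-05", "2025-03-26", "2025-06-18", "2025-11-25"];
/// Revision assumed when `MCP-Protocol-Version` is absent.
const DEFAULT_VERSION: &str = "2025-03-26";

/// Hypothetical outcome type for the pre-dispatch check.
#[derive(Debug, PartialEq)]
pub enum Precheck {
    /// Handle the request under this protocol revision.
    Proceed(&'static str),
    /// `initialize`: the revision is negotiated in the request body instead.
    NegotiateInBody,
    /// Reject with an HTTP status and an optional `Allow` header value.
    Reject { status: u16, allow: Option<&'static str> },
}

/// Hypothetical helper: `method` is the HTTP method, `version_header` the raw
/// `MCP-Protocol-Version` value if one was sent.
pub fn precheck(method: &str, version_header: Option<&str>, is_initialize: bool) -> Precheck {
    // Only POST is served; GET and DELETE get 405 with `Allow: POST`.
    if method != "POST" {
        return Precheck::Reject { status: 405, allow: Some("POST") };
    }
    // The header is not validated on `initialize`.
    if is_initialize {
        return Precheck::NegotiateInBody;
    }
    match version_header {
        // Absent header falls back to the default revision.
        None => Precheck::Proceed(DEFAULT_VERSION),
        // Present header must name a supported revision, else 400.
        Some(v) => match SUPPORTED.iter().copied().find(|s| *s == v) {
            Some(s) => Precheck::Proceed(s),
            None => Precheck::Reject { status: 400, allow: None },
        },
    }
}
```

Keeping the decision free of I/O makes each rule in this section a one-line unit test, independent of the transport that enforces it.
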
## 5. Crate architecture
Two crates; `rmcp` is contained to one of them.
```
omnigraph-server (implements McpBackend; all omnigraph tool/Cedar/dispatch logic)
│ depends on
omnigraph-mcp (rmcp Streamable-HTTP transport, the McpBackend trait, rmcp model re-exports)
│ depends on
rmcp 1.7 + tower-http(limit) + axum + http
```
The dependency **must** go `server → mcp`. The server binary mounts `/mcp`, so a
`mcp → server` edge cycles at the package level (`server-bin → omnigraph-mcp →
server-lib`), which Cargo rejects. The trait inverts the direction — the crate
defines the seam, the server fills it — which is also why the crate can never name an
omnigraph type (`AppState`, `GraphHandle`, the handlers); it abstracts over them.
`crates/omnigraph-mcp/Cargo.toml`:
```toml
[package]
name = "omnigraph-mcp"
edition = "2024" # rmcp 1.7 is itself edition 2024 — no friction
version.workspace = true
[dependencies]
# `server` is on by rmcp's default features; `transport-streamable-http-server`
# pulls in the tower service + http stack. Do NOT enable rmcp's `local` feature —
# it cfg's the StreamableHttpService tower wiring out.
rmcp = { version = "1.7", default-features = false, features = ["server", "transport-streamable-http-server"] }
axum = { workspace = true }
http = "1"
tower-http = { workspace = true, features = ["limit"] }
tokio = { workspace = true }
async-trait = { workspace = true }
serde_json = { workspace = true }
```
Add `"crates/omnigraph-mcp"` to the workspace `members`; in
`omnigraph-server/Cargo.toml` add `omnigraph-mcp` and **no direct `rmcp` dep**
(verified absent today). The verification gate is `cargo tree -p omnigraph-server -e
normal | grep rmcp` showing rmcp only transitively under `omnigraph-mcp`.
## 6. The `McpBackend` seam — `New` in `omnigraph-mcp`
```rust
// crates/omnigraph-mcp/src/lib.rs
use async_trait::async_trait;
// rmcp model types re-exported so the server speaks rmcp via `omnigraph_mcp::…`
// and carries no direct rmcp dependency.
pub use rmcp::model::{
    CallToolResult, Content, RawResource, ReadResourceResult, Resource,
    ResourceContents, ServerCapabilities, ServerInfo, Tool, ToolAnnotations,
};
pub use rmcp::ErrorData as McpError; // JSON-RPC error type (method_not_found=-32601, invalid_params=-32602, internal_error=-32603)
pub type JsonObject = serde_json::Map<String, serde_json::Value>;

#[async_trait]
pub trait McpBackend: Clone + Send + Sync + 'static {
    fn server_info(&self) -> ServerInfo;
    async fn list_tools(&self, parts: &http::request::Parts) -> Result<Vec<Tool>, McpError>;
    async fn call_tool(&self, parts: &http::request::Parts, name: &str, args: JsonObject) -> Result<CallToolResult, McpError>;
    async fn list_resources(&self, parts: &http::request::Parts) -> Result<Vec<Resource>, McpError>;
    async fn read_resource(&self, parts: &http::request::Parts, uri: &str) -> Result<ReadResourceResult, McpError>;
}
```
**The list seam is non-paginated by contract — deliberately.** `list_tools` /
`list_resources` return the *full* set, so `McpService` always emits `nextCursor:
null`. This is correct-by-design for this surface, not an oversight: the catalog is
bounded — built-ins are a fixed ~dozen, and a large stored-query catalog is bounded by
the `meta` projection mode (§9.2), which collapses N queries into two tools rather than
leaning on `tools/list` paging. The trait return type (`Vec<T>`) *is* the contract; the
doc must not claim pagination the signature can't express (§12, §16 are aligned to this
— no `tools/list`/`resources/list` cursor). If a future surface genuinely needs paging,
that is a seam-signature change (`-> ListToolsResult` with a cursor), made together
with the capability — never a doc promise ahead of the type.
`&http::request::Parts` is the decoupling mechanism. The crate hands the backend the
request parts; the backend reads **its own** types out of `parts.extensions`. The
crate never names an omnigraph type, so it is reusable and auth stays decoupled (§8).
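
The extensions-based decoupling can be illustrated with a toy type map. This is a minimal sketch assuming a std-only stand-in for `http::Extensions`; `Extensions`, `Actor`, and `actor_of` here are hypothetical and are not the real `http` or server types.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Toy stand-in for `http::Extensions`: at most one value per type.
#[derive(Default)]
pub struct Extensions {
    map: HashMap<TypeId, Box<dyn Any + Send + Sync>>,
}

impl Extensions {
    /// Middleware side: store a value keyed by its type.
    pub fn insert<T: Any + Send + Sync>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Backend side: fetch a value by type, if some layer inserted one.
    pub fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

/// Hypothetical backend-owned type. The transport crate never names it;
/// it only carries the map from the middleware to the backend.
#[derive(Debug, PartialEq)]
pub struct Actor {
    pub id: String,
}

/// Backend-side lookup: a missing value means the middleware did not run,
/// which is reported as an error rather than defaulted.
pub fn actor_of(ext: &Extensions) -> Result<&Actor, &'static str> {
    ext.get::<Actor>().ok_or("actor missing from request extensions")
}
```

The carrier only needs `Any`, so the crate that owns the transport compiles without knowing any backend type, which is the property the seam relies on.
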
> `rmcp`'s own `ServerHandler` trait uses RPITIT (`-> impl Future + …`), not
> `async-trait`. Our `McpBackend` deliberately uses `#[async_trait]`: it is
> implemented once by the server, the boxed future is negligible at MCP QPS, and the
> server already depends on `async-trait`. Either style compiles on edition 2024.
## 7. Transport — `New` in `omnigraph-mcp`
```rust
// crates/omnigraph-mcp/src/transport.rs
use std::net::SocketAddr; // used by McpHostPolicy::from_bind below
use std::sync::Arc;
use rmcp::transport::streamable_http_server::{
    StreamableHttpServerConfig, StreamableHttpService,
    session::never::NeverSessionManager, // stateless ⇒ reject all session ops
};
// Host + Origin posture as a TOTAL choice — there is no `None ⇒ skip` state to leak
// into a fail-open default. `OriginPolicy` is the by-design closure for the Origin
// class: every deployment lands in exactly one arm, chosen once by `from_bind`.
pub enum OriginPolicy {
    Allow(Vec<String>), // browser clients from these origins; any OTHER present Origin → 403
    DenyBrowsers,       // no browser clients expected; ANY present Origin → 403 (non-browser MCP clients send none)
    Unchecked,          // explicit opt-out (loopback dev / trusted network); never the remote default
}

pub struct McpHostPolicy {
    pub allowed_hosts: Option<Vec<String>>, // None ⇒ accept any Host (DNS-rebinding defense relaxed for a known-public bind)
    pub origin: OriginPolicy,               // no Option: a total decision
}

impl McpHostPolicy {
    // The ONLY constructor. Host and Origin posture are derived together from the
    // bind + config, fail-closed: a remote bind with no configured origins is
    // `DenyBrowsers` (a present Origin is rejected), NOT "skip". A caller cannot
    // construct a fail-open policy because the struct has no skip-by-absence state.
    pub fn from_bind(bind: &SocketAddr, public_hosts: &[String], browser_origins: &[String]) -> Self {
        let loopback = bind.ip().is_loopback();
        Self {
            // Loopback bind ⇒ the full loopback Host set (both stacks + the
            // hostname alias), matching rmcp's default `["localhost","127.0.0.1","::1"]`.
            // The Host header is independent of the bound socket (in-process,
            // proxies, dual-stack localhost), so a 127-bound server must still
            // accept a `[::1]` Host; deriving the list from `bind.ip()` alone 403'd it.
            allowed_hosts: if loopback {
                Some(vec!["127.0.0.1".into(), "::1".into(), "localhost".into()])
            } else if public_hosts.is_empty() {
                None
            } else {
                Some(public_hosts.to_vec())
            },
            origin: if !browser_origins.is_empty() {
                OriginPolicy::Allow(browser_origins.to_vec())
            } else if loopback {
                OriginPolicy::Unchecked // local dev convenience only
            } else {
                OriginPolicy::DenyBrowsers // remote default: fail-closed
            },
        }
    }
}
pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router {
    // StreamableHttpServerConfig is #[non_exhaustive]; its Default is stateful_mode=true,
    // json_response=false, allowed_hosts=loopback. ALL THREE must be overridden for a
    // remote stateless JSON server: build from Default and flip via the with_* setters.
    let mut config = StreamableHttpServerConfig::default()
        .with_stateful_mode(false)
        .with_json_response(true);
    config = match &hosts.allowed_hosts {
        Some(list) => config.with_allowed_hosts(list.clone()),
        None => config.disable_allowed_hosts(), // accept any Host
    };
    // rmcp validates Origin ONLY when allowed_origins is non-empty (empty ⇒ rmcp skips),
    // so DenyBrowsers cannot be expressed by handing rmcp a list. We therefore enforce
    // OriginPolicy in a thin pre-layer that 403s a disallowed present Origin BEFORE rmcp,
    // making fail-closed independent of rmcp's empty-list semantics (the root cause of
    // the original fail-open default). `Allow` also configures rmcp as defense-in-depth.
    if let OriginPolicy::Allow(origins) = &hosts.origin {
        config = config.with_allowed_origins(origins.clone());
    }
    // service_factory returns Result<S, io::Error>; NeverSessionManager pairs with stateless mode.
    let svc = StreamableHttpService::new(
        move || Ok(McpService::new(backend.clone())),
        Arc::new(NeverSessionManager::default()),
        config,
    );
    axum::Router::new()
        .route_service("/mcp", svc)
        .layer(origin_guard(hosts.origin)) // fail-closed Origin enforcement (no-op only for Unchecked)
        // rmcp reads the body directly (not via an axum extractor), so axum's
        // DefaultBodyLimit does NOT bound /mcp; the tower-http layer does.
        .layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))
}
```
`McpService<B>` implements rmcp's `ServerHandler`, pulls `&Parts` out of the request
context once, and delegates each method to `B`. rmcp's `StreamableHttpService`
**consumes the body and injects the remaining `http::request::Parts` into
`RequestContext.extensions`** (this is documented and load-bearing — see §8); inside
the handler, `ctx.extensions.get::<http::request::Parts>()` returns those parts.
**Conformance the stateless transport gives for free** (verified in rmcp 1.7
`tower.rs`): `GET`/`DELETE /mcp → 405` with `Allow: POST`; a disallowed `Host` →
`403`; `MCP-Protocol-Version` → `400` on unsupported, default `2025-03-26` when absent.
The one thing rmcp does **not** give for free is fail-closed Origin: rmcp checks
`Origin` only when `allowed_origins` is non-empty, so an empty list is *fail-open*.
`origin_guard` (above) closes that: a present, disallowed `Origin` → `403` regardless
of rmcp's empty-list behavior. That layer is the only added middleware.
**Host/Origin policy is fail-closed by construction, derived from the deployment.**
rmcp's default `allowed_hosts` is loopback-only — correct for local dev (DNS-rebinding
defense) but it would `403` every remote client. `McpHostPolicy::from_bind` (the single
constructor) computes both axes once at startup from `--bind` + config: loopback bind →
loopback Host allow-list + `OriginPolicy::Unchecked` (dev convenience); non-loopback
bind → the configured public host(s) (else Host-allowlisting disabled, logged — bearer
is the real control), and **`OriginPolicy::DenyBrowsers` by default** (any present
`Origin` → `403`) unless `browser_origins` are configured (`OriginPolicy::Allow`). The
key by-design property: `OriginPolicy` has **no "absent ⇒ skip" state** and there is no
other way to build the policy, so a remote deployment cannot accidentally run fail-open
— closing the bug class rather than flipping a default. Non-browser MCP clients (the
Phase-1 tier) send no `Origin` and are unaffected; only a forged browser `Origin` is
rejected.
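
The decision that `origin_guard` has to make can be sketched as a pure function over the policy and the request's `Origin` header. This is a hedged illustration: `origin_rejection` is a hypothetical name, the enum is repeated here only to keep the block self-contained, and exact string comparison is an assumption (a real guard may normalize scheme, host, or port).

```rust
/// Mirror of the policy enum from the transport sketch, repeated for self-containment.
#[derive(Debug, Clone, PartialEq)]
pub enum OriginPolicy {
    Allow(Vec<String>),
    DenyBrowsers,
    Unchecked,
}

/// Hypothetical decision core: returns `Some(403)` when the request must be
/// rejected, `None` when it may proceed to the MCP service.
pub fn origin_rejection(policy: &OriginPolicy, origin: Option<&str>) -> Option<u16> {
    match (policy, origin) {
        // Non-browser MCP clients send no Origin and are never rejected here.
        (_, None) => None,
        // Explicit opt-out: any Origin passes.
        (OriginPolicy::Unchecked, Some(_)) => None,
        // No browser clients expected: any present Origin is rejected.
        (OriginPolicy::DenyBrowsers, Some(_)) => Some(403),
        // Allow-list: only a listed Origin passes (exact match is an assumption).
        (OriginPolicy::Allow(list), Some(o)) => {
            if list.iter().any(|allowed| allowed.as_str() == o) {
                None
            } else {
                Some(403)
            }
        }
    }
}
```

Because the `match` is total over the enum, no deployment can reach the guard in a state the function does not handle.
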
## 8. Auth & identity — `Reuses` the server's middleware
The backend consumes an already-resolved actor and **branches on nothing** about how
the token was verified. Two values are injected into the request extensions by
middleware that runs **before** the MCP service:
```rust
// Reuses — crates/omnigraph-server/src/identity.rs:186
pub struct ResolvedActor { pub actor_id: Arc<str>, pub tenant_id: Option<TenantId>, pub scopes: Vec<Scope>, pub source: AuthSource }
// Reuses — crates/omnigraph-server/src/registry.rs:37
pub struct GraphHandle {
    pub key: GraphKey, pub uri: String,
    pub engine: Arc<Omnigraph>,
    pub policy: Option<Arc<PolicyEngine>>,   // None ⇒ no per-graph Cedar gate
    pub queries: Option<Arc<QueryRegistry>>, // None ⇒ no stored queries for this graph
}
```
The middleware order is fixed in `build_app` (`lib.rs:876`; the two `route_layer`s at
`lib.rs:929-936`): the **outer** layer `require_bearer_auth` injects
`Extension<ResolvedActor>` (or `401`); the **inner** layer `resolve_graph_handle`
injects `Extension<Arc<GraphHandle>>`. Both land in `request.extensions()`, which rmcp
copies into `RequestContext.extensions`.
```rust
// New — crates/omnigraph-server/src/mcp/mod.rs
#[derive(Clone)]
pub struct OmnigraphMcpBackend { state: AppState } // AppState is Arc-backed #[derive(Clone)]
impl OmnigraphMcpBackend {
    fn ctx<'a>(&self, parts: &'a http::request::Parts) -> Result<(&'a ResolvedActor, &'a Arc<GraphHandle>), McpError> {
        let actor = parts.extensions.get::<ResolvedActor>()
            .ok_or_else(|| McpError::internal_error("actor missing from request extensions", None))?;
        let handle = parts.extensions.get::<Arc<GraphHandle>>()
            .ok_or_else(|| McpError::internal_error("graph handle missing from request extensions", None))?;
        Ok((actor, handle))
    }
}
```
**Auth posture (spec-aligned, MCP 2025-11-25 authorization).** The server is a
Resource Server. Per-request validation only — **sessions are never used for
authentication** (the transport is stateless, which makes this structural). Token
**audience must be validated** and **token passthrough is prohibited**: if a tool
later needs to reach an upstream API, the server acts as a separate OAuth client and
must not forward the client's token.
- **Static bearer (today).** `require_bearer_auth` resolves a `ResolvedActor` from a
SHA-256 hash match. Works for the developer/agent clients (§13).
- **OAuth 2.1 + RFC 9728 (additive, MR-956).** Serve
`/.well-known/oauth-protected-resource`; on `401`, optionally add
`WWW-Authenticate: Bearer resource_metadata="…"` (header is optional in 2025-11-25
given the well-known fallback). Clients run OAuth 2.1 + PKCE + RFC 8707 resource
indicators themselves; the server validates audience-bound JWTs offline (cached
JWKS), so on-prem/air-gapped keeps working. This swaps the bearer middleware behind
a `TokenVerifier` and changes **zero** MCP code.
> **Compatibility caveat to honor (Claude Code issue #59467).** Advertising RFC-9728
> Protected-Resource-Metadata can cause some clients (Claude Code today) to **ignore
> a static `Authorization` header and force the OAuth flow**. So PRM advertisement
> must be **config-gated**: a deployment serving developer clients over static bearer
> does not advertise OAuth; a deployment targeting consumer connectors does. The MCP
> routes only need to flow through the standard `401` path so the hook can be added
> without touching MCP code.
## 9. Stored-query projection
The projection source is the same `query_catalog_entry` the `GET /queries` catalog
uses (`crates/omnigraph-server/src/api.rs:13`). The param/catalog DTOs moved to the
shared `omnigraph-api-types` crate (RFC-009 Phase 2) and are re-exported through
`api.rs` (`pub use omnigraph_api_types::*`), so the `Reuses` types below still resolve
via `omnigraph_server::api::…`. Real types:
```rust
// Reuses — crates/omnigraph-api-types/src/lib.rs:355 (re-exported via omnigraph-server/src/api.rs)
pub enum ParamKind { String, Bool, Int, BigInt, Float, Date, DateTime, Blob, Vector, List }
// Reuses — crates/omnigraph-api-types/src/lib.rs:373
pub struct ParamDescriptor {
    pub name: String,
    pub kind: ParamKind,
    pub item_kind: Option<ParamKind>, // Some(scalar) when kind == List
    pub vector_dim: Option<u32>,      // Some(dim) when kind == Vector; the dimension lives here, not in the kind
    pub nullable: bool,
}
// Reuses — crates/omnigraph-server/src/queries.rs:29
pub struct StoredQuery { pub name: String, pub source: Arc<str>, pub decl: QueryDecl, pub expose: bool, pub tool_name: Option<String> }
impl StoredQuery { pub fn is_mutation(&self) -> bool; pub fn effective_tool_name(&self) -> &str; } // queries.rs:45,55
pub struct QueryRegistry { /* by_name: BTreeMap<String, StoredQuery>; .lookup(&name) */ } // queries.rs:64
```
A query is declared in the cluster's `cluster.yaml graphs.<id>.queries` (a directory
to discover, an explicit file list, or a `name: { file: … }` map); `cluster apply`
publishes it to the content-addressed catalog, and the server loads that graph's
applied registry into `handle.queries` at boot (`settings.rs:71-111`). The
`StoredQuery` struct carries `expose: bool` and `tool_name: Option<String>`, **but
cluster boot currently forces `expose: true, tool_name: None` for every applied
query** (`settings.rs:83-84`, the §D5 bridge — see [§17](#17-decisions--rollout)). So
today the projection lists every applied query and names each by its query name; the
`expose`/`tool_name` plumbing is wired but inert until the cluster catalog grows the
per-query metadata. The projection reads `handle.queries` and is agnostic to the
declaration source. (The legacy single-graph `omnigraph.yaml queries:` map is removed
— RFC-011 made the server cluster-only; there is no other declaration source.)
### 9.1 `ParamDescriptor → JSON Schema` (`New`, shared projection + equivalence test)
JSON Schema 2020-12. **The schema generator is the engine's input contract, not a
second copy of it.** The authority for what a param accepts is the runtime coercer
`json_value_to_literal_typed` (`crates/omnigraph-compiler/src/query_input.rs`); a
hand-written schema in the MCP crate is a parallel encoding that *will* drift — the
review found two drifts at once (Blob, nullable), and BigInt/Date/Vector are latent
siblings of the same class. So the projection lives **next to the DTO it projects**, in
`omnigraph-api-types` (where `ParamKind`/`ParamDescriptor` already live and are
`ToSchema`), is the single mapping both OpenAPI and MCP consume, and is **locked to the
coercer by an equivalence test** — drift becomes a CI failure, not a shipped bug.
```rust
// New — crates/omnigraph-api-types/src/lib.rs (next to ParamKind/ParamDescriptor)
use serde_json::{json, Value};
// Exhaustive, wildcard-free: adding a ParamKind is a COMPILE error until its arm
// (and its equivalence-test corpus row) exist — closing "new kind, wrong/default schema".
fn scalar_schema(kind: ParamKind) -> Value {
    match kind {
        ParamKind::String => json!({ "type": "string" }),
        ParamKind::Bool => json!({ "type": "boolean" }),
        ParamKind::Int => json!({ "type": "integer" }),
        ParamKind::BigInt => json!({ "type": "string", "pattern": r"^-?\d+$" }), // i64/u64 lose precision >2^53 as JSON numbers
        ParamKind::Float => json!({ "type": "number" }),
        ParamKind::Date => json!({ "type": "string", "format": "date" }),
        ParamKind::DateTime => json!({ "type": "string", "format": "date-time" }),
        // FIX (③): the coercer takes Blob as a blob-URI STRING ("expected blob URI
        // string", query_input.rs:449; DTO doc api-types:354), NOT base64-decoded bytes.
        ParamKind::Blob => json!({ "type": "string", "format": "uri" }),
        ParamKind::Vector | ParamKind::List => unreachable!("composite kinds handled in param_json_schema"),
    }
}
// The one entry point the MCP crate calls — applies the nullable rule uniformly.
pub fn param_json_schema(p: &ParamDescriptor) -> Value {
    let base = match p.kind {
        ParamKind::Vector => {
            let mut s = json!({ "type": "array", "items": { "type": "number" } });
            if let Some(dim) = p.vector_dim {
                s["minItems"] = json!(dim);
                s["maxItems"] = json!(dim);
            }
            s
        }
        ParamKind::List => json!({
            "type": "array",
            "items": p.item_kind.map(scalar_schema).unwrap_or_else(|| json!({"type":"string"}))
        }),
        scalar => scalar_schema(scalar),
    };
    // FIX (④): the coercer accepts explicit `null` for a nullable param AND its
    // omission (query_input.rs:273,296). `required` alone only covers omission; a
    // strictly-validating client (or SEP-1303 input validation) would reject `null`
    // against the bare scalar. Allow null at the schema level for nullable params.
    if p.nullable { json!({ "anyOf": [ base, { "type": "null" } ] }) } else { base }
}
```
**The lock — an equivalence test (the by-design closure), in the compiler crate** (it
sees both the coercer and `param_json_schema`): for a fixed accept/reject corpus per
`ParamKind` (incl. a blob-URI string, a base64 blob *that must now validate as a plain
string*, `null` for nullable vs non-nullable, an over/under-length vector), assert
`schema_accepts(v) == json_value_to_literal_typed(name, v, kind, mode).is_ok()`. Any
future arm that diverges from the engine — base64 creeping back, a missing null-union, a
new kind without a schema — turns the test red. That test, not reviewer vigilance, is
what makes the schema correct *by construction*.
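
The shape of that equivalence test can be sketched with toy stand-ins. Everything below is an assumption made for illustration: `Json` replaces `serde_json::Value`, `Kind` covers three kinds only, and `coerce` is a toy, not `json_value_to_literal_typed`. The point is the corpus-driven comparison, not the individual rules.

```rust
/// Toy JSON value; stands in for `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
pub enum Json { Null, Bool(bool), Num(f64), Str(String) }

/// Toy subset of parameter kinds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind { Bool, Int, BigInt }

/// True for an optional '-' followed by one or more ASCII digits.
fn is_decimal_integer(s: &str) -> bool {
    let digits = s.strip_prefix('-').unwrap_or(s);
    !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit())
}

/// Toy "schema side": what the projected schema would admit.
pub fn schema_accepts(kind: Kind, nullable: bool, v: &Json) -> bool {
    match (kind, v) {
        (_, Json::Null) => nullable,
        (Kind::Bool, Json::Bool(_)) => true,
        (Kind::Int, Json::Num(n)) => n.fract() == 0.0,
        (Kind::BigInt, Json::Str(s)) => is_decimal_integer(s),
        _ => false,
    }
}

/// Toy "engine side": an independently written coercer.
pub fn coerce(kind: Kind, nullable: bool, v: &Json) -> Result<Option<i64>, String> {
    match v {
        Json::Null if nullable => Ok(None),
        Json::Null => Err("null for non-nullable param".to_string()),
        Json::Bool(b) if kind == Kind::Bool => Ok(Some(*b as i64)),
        Json::Num(n) if kind == Kind::Int && n.fract() == 0.0 => Ok(Some(*n as i64)),
        // `str::parse` would accept a leading '+', so it is rejected explicitly.
        Json::Str(s) if kind == Kind::BigInt && !s.starts_with('+') => {
            s.parse::<i64>().map(Some).map_err(|e| e.to_string())
        }
        other => Err(format!("{:?} cannot accept {:?}", kind, other)),
    }
}

/// Index of the first corpus row where schema and coercer disagree, if any.
pub fn first_divergence(corpus: &[(Kind, bool, Json)]) -> Option<usize> {
    corpus
        .iter()
        .position(|(k, n, v)| schema_accepts(*k, *n, v) != coerce(*k, *n, v).is_ok())
}
```

A CI test would build the fixed accept/reject corpus and assert `first_divergence(&corpus) == None`; agreement is pinned only for rows the corpus contains, so each newly found edge case is added as a row rather than handled by review.
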
### 9.2 Two projection modes (small vs large catalogs)
Tool-overload is real: model accuracy degrades sharply as a single client's tool
count climbs past a few dozen, and clients that don't defer tool loading (e.g.
OpenCode) pay the full `tools/list` token cost. So the projection has two modes,
selected per graph by a `stored_query_mode` setting (default `auto`).
**Where the setting lives (by-design, ⑥).** There is no free-floating `mcp.*` key.
`stored_query_mode` and its threshold belong to the **same per-graph `mcp:` metadata
block** that will hold `expose`/`tool_name` (the cluster Phase-6 surface, §D5 bridge —
see [§17](#17-decisions--rollout)) — one mcp-config home, one validator, validated at
`cluster validate`/boot with the rest of the registry. That sequences it correctly: the
knob cannot land before the surface that holds it exists, and it can't drift into a
second config location. Until Phase 6, the mode is **not configurable** — every graph
runs `auto` (the count-based default below), which is the safe, documented behavior.
The modes themselves:
- **`per_query` (small/stable catalogs).** One tool per `expose: true` query, named by
`effective_tool_name()`, with a fully typed `input_schema`. This is the richest
surface — each query is a first-class typed tool, ideal for code-mode runtimes that
compile tools into a typed API.
- **`meta` (large/dynamic catalogs).** Two tools instead of N: `stored_query_list(filter?,
detail_level?)` (returns names + descriptions; full param schema only at higher
detail) and `stored_query_run(name, params, branch?, snapshot?)`. This keeps
`tools/list` small and mirrors the progressive-disclosure shape (`search` + `execute`)
that scales to hundreds of queries.
- **`auto`** picks `per_query` below a threshold (default 24 exposed queries) and `meta`
at or above it; the threshold is configurable. The boundary and count are logged so
a deployment never silently flips modes.
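
The mode selection above reduces to a small total function. The sketch below assumes hypothetical names (`StoredQueryMode`, `Projection`, `resolve_projection`); only the default threshold of 24 and the "below vs. at or above" boundary come from this section.

```rust
/// Configured setting (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StoredQueryMode { Auto, PerQuery, Meta }

/// Projection actually used for a graph (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Projection { PerQuery, Meta }

/// Default `auto` threshold from this section.
pub const DEFAULT_AUTO_THRESHOLD: usize = 24;

/// Resolve the configured mode against the number of exposed queries.
/// `auto` picks `per_query` strictly below the threshold, `meta` at or above it.
pub fn resolve_projection(mode: StoredQueryMode, exposed: usize, threshold: usize) -> Projection {
    match mode {
        StoredQueryMode::PerQuery => Projection::PerQuery,
        StoredQueryMode::Meta => Projection::Meta,
        StoredQueryMode::Auto if exposed < threshold => Projection::PerQuery,
        StoredQueryMode::Auto => Projection::Meta,
    }
}
```

With the default threshold, 23 exposed queries project as one tool each and the 24th flips the graph to the meta pair, which is the boundary a deployment would see in its logs.
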
### 9.3 Envelope (collision-free by construction)
In `per_query` mode the tool's `input_schema` **nests query params under `params`**,
mirroring `POST /queries/{name}`:
```jsonc
{ "type": "object",
"properties": {
"params": { "type": "object", "properties": { /* per-param param_json_schema(...) */ }, "required": [ /* names where nullable == false */ ] },
"branch": { "type": "string" },
"snapshot": { "type": "string" } // omit for mutation tools — mutation-against-snapshot is unrepresentable
},
"additionalProperties": false }
```
`required` lists only non-nullable param names; a nullable param is both absent from
`required` **and** carries the `null`-union from `param_json_schema` (§9.1), so omitting
it *and* passing explicit `null` both validate — matching the coercer.
Knobs (`branch`/`snapshot`) and the query's own params live in separate namespaces, so
a query parameter literally named `branch`/`snapshot` cannot collide.
**Built-in vs stored name collision is a load-time error, never a silent skip (⑦).**
The earlier "a colliding stored tool is skipped (built-ins win)" is a silent failure —
a query an operator published just vanishes from the catalog at projection time, which
the deny-list in [docs/dev/invariants.md](invariants.md) forbids. By-design fix: fold
the built-in tool names (a stable closed set from the `Builtin` enum, §10) into the
**same per-graph uniqueness check the registry already runs** at load
(`duplicate_tool_name`, today stored-vs-stored only). A stored `effective_tool_name()`
that shadows a built-in then fails `cluster validate`/server boot **loudly**, before
serving — a runtime-shadowed query becomes structurally impossible rather than silently
dropped.
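As a rough sketch of the load-time check, assuming a hypothetical `check_tool_names` helper (the real check is `duplicate_tool_name`, whose signature is not shown in this RFC): seeding the uniqueness map with the built-in names makes a built-in collision fail through the same path as a stored-vs-stored duplicate.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the per-graph uniqueness check.
/// `builtins` is the closed set of built-in tool names;
/// `stored` is a list of (query name, effective tool name) pairs.
pub fn check_tool_names(
    builtins: &[&str],
    stored: &[(&str, &str)],
) -> Result<(), String> {
    // Tool name -> human-readable owner, seeded with the built-ins so a
    // stored query can never shadow one.
    let mut seen: HashMap<&str, String> = HashMap::new();
    for name in builtins {
        seen.insert(*name, "a built-in tool".to_string());
    }
    for (query, tool) in stored {
        if let Some(owner) = seen.get(*tool) {
            // Fail loudly at load; nothing is skipped or served.
            return Err(format!(
                "stored query `{}` resolves to tool name `{}`, already used by {}",
                query, tool, owner
            ));
        }
        seen.insert(*tool, format!("stored query `{}`", query));
    }
    Ok(())
}
```

Returning an error (rather than filtering the colliding entry) is what turns the old silent skip into a boot or `cluster validate` failure.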
## 10. Tool catalog + Cedar mapping — `Reuses` `PolicyAction`
Each built-in reuses the **exact `PolicyAction` its REST route enforces**:
```rust
// Reuses — crates/omnigraph-policy/src/lib.rs:16
pub enum PolicyAction {
    Read, Export, Change, SchemaApply,
    BranchCreate, BranchDelete, BranchMerge,
    Admin, // reserved, no call site yet
    GraphList, // server-scoped (resource_kind == Server)
    InvokeQuery, // graph-scoped, coarse (no per-query dimension yet)
}
```
A tool's scope is **derived from where it is mounted, not asserted independently**:
MCP mounts only under `/graphs/{graph_id}/mcp` (§15), so every MCP tool is graph-scoped
by construction. There is no server-scoped MCP tool — a "server-scoped tool on a
per-graph mount" is unrepresentable (⑧). Server-level liveness stays on REST
`GET /healthz`; the MCP liveness tool is graph-scoped `graph_health` (confirms *this
graph's* handle is live) and needs no Cedar gate.
| MCP tool | Scope | Cedar action |
|---|---|---|
| `graph_health` | graph | none (liveness/version) |
| `graph_snapshot`, `schema_get`, `branch_list`, `commit_list`, `commit_get` | graph | `Read` |
| `graph_query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) |
| `graph_mutate` (ad-hoc write) | graph | `Change` |
| `graph_load` (NDJSON) | graph | `Change` (+ `BranchCreate` **iff** `from` is present — see §11) |
| `branch_create` / `branch_delete` / `branch_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` |
| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` |
| stored query (`per_query`) / `stored_query_run` (`meta`) | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` |
**Naming.** Tool ids are **domain-qualified `snake_case`** (`graph_query`,
`branch_merge`, `schema_apply`, …) within the spec's `[A-Za-z0-9_.-]`, 1–128-char
constraint. Domain qualification (rather than bare `query`/`mutate`) reduces
cross-server collisions when a client loads omnigraph alongside other MCP servers;
clients that auto-prefix by connection name (e.g. OpenCode → `omnigraph_graph_query`)
compose cleanly. Names are a stability contract (Hyrum's Law) — don't churn them.
**Annotations (set explicitly).** MCP annotation defaults are pessimistic
(`readOnlyHint=false`, `destructiveHint=true`, `idempotentHint=false`,
`openWorldHint=true`), so an unannotated read tool is mistaken for a destructive
open-world writer. Set them via rmcp's `ToolAnnotations` (`read_only_hint`,
`destructive_hint`, `idempotent_hint`, `open_world_hint`):
- read tools (`graph_query`, `graph_snapshot`, `schema_get`, `branch_list`,
`commit_*`, stored *reads*) → `read_only_hint = true`, `open_world_hint = false`.
- writers (`graph_mutate`, `graph_load`, `branch_delete`, `branch_merge`,
`schema_apply`) → `read_only_hint = false`, `destructive_hint = true`,
`open_world_hint = false`. Clients use `destructiveHint` to drive human-confirmation
prompts.
- `branch_create` (additive) → `destructive_hint = false`.
Annotations are **advisory hints, not a security boundary** (clients may ignore them);
**Cedar is the enforcement boundary.**
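The annotation rules above can be sketched as a single table. `Hints` below is a local stand-in carrying the four field names from the text; it is not rmcp's `ToolAnnotations` type, and `ToolClass`, `class_of`, and `hints_for` are hypothetical. Fields the text does not specify are left unset.

```rust
/// Local stand-in for the four hint fields (not rmcp's actual type).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Hints {
    pub read_only_hint: Option<bool>,
    pub destructive_hint: Option<bool>,
    pub idempotent_hint: Option<bool>,
    pub open_world_hint: Option<bool>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolClass {
    Read,
    Writer,
    Additive,
}

/// Classify the built-ins that the annotation rules name explicitly.
pub fn class_of(tool: &str) -> Option<ToolClass> {
    match tool {
        "graph_query" | "graph_snapshot" | "schema_get" | "branch_list"
        | "commit_list" | "commit_get" => Some(ToolClass::Read),
        "graph_mutate" | "graph_load" | "branch_delete" | "branch_merge"
        | "schema_apply" => Some(ToolClass::Writer),
        "branch_create" => Some(ToolClass::Additive),
        _ => None, // tools the annotation rules do not mention
    }
}

/// Explicit hints per class, so no tool falls back to the pessimistic defaults
/// for the fields the rules cover.
pub fn hints_for(class: ToolClass) -> Hints {
    match class {
        ToolClass::Read => Hints {
            read_only_hint: Some(true),
            open_world_hint: Some(false),
            ..Hints::default()
        },
        ToolClass::Writer => Hints {
            read_only_hint: Some(false),
            destructive_hint: Some(true),
            open_world_hint: Some(false),
            ..Hints::default()
        },
        // The text specifies only `destructive_hint` for `branch_create`.
        ToolClass::Additive => Hints {
            destructive_hint: Some(false),
            ..Hints::default()
        },
    }
}
```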
Represent built-ins as a `Builtin` enum (one variant per tool; `descriptor` / `gate` /
`call` as match arms) — lower liability than ~13 unit structs + `dyn`. Stored-query
tools are a sibling populator over `handle.queries`.
**`list_tools` / `list_resources` are Cedar-filtered as a *relaxation* of the
call-path gate** — listing never hides a tool the caller could invoke on some
branch (over-showing is the safe direction; `call_tool` is authoritative). A
built-in whose authorization depends on a caller-chosen branch (`graph_mutate`,
`graph_load`, `branch_*`) is shown iff `authorize_any_branch` →
`PolicyEngine::permits_on_any_branch(actor, action)` is true: that probes the
branch-shape space (omitted / protected / unprotected) through the same Cedar
authorizer and returns true if *any* shape is allowed. A fixed-branch probe is
wrong here — both a fabricated `main` (denied under "protect `main`, write
unprotected branches", the canonical workflow) and a `branch: None` probe
(matches no `branch_scope` rule) under-show `graph_mutate` to an actor who can
write feature branches. The stored-query surface gets the same list/call
agreement structurally: `resolve_stored_tool` is the single membership test, so
the meta pair is callable only in `meta` mode and `stored_query_run` resolves
**exposed-only** (an `expose:false` query is unreachable by name on the agent
surface, though it stays HTTP/service-callable).
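To illustrate why a fixed-branch probe under-shows, here is a toy model of the any-shape probe. It is deliberately not Cedar: `Rule`, `BranchScope`, `BranchShape`, and `permits` are hypothetical simplifications, and only the name `permits_on_any_branch` comes from the text. The assumption encoded here is the one the paragraph states: a branch-scoped rule matches only a probe of the matching shape, and an omitted branch matches no branch-scoped rule.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Read,
    Change,
}

/// The branch shapes the listing probe tries.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BranchShape {
    Omitted,
    Protected,
    Unprotected,
}

/// Toy rule scope; `Any` means the rule carries no branch condition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BranchScope {
    Any,
    Protected,
    Unprotected,
}

#[derive(Debug, Clone, Copy)]
pub struct Rule {
    pub action: Action,
    pub scope: BranchScope,
}

/// Toy per-call authorizer: some rule must allow this action on this shape.
pub fn permits(rules: &[Rule], action: Action, shape: BranchShape) -> bool {
    rules.iter().any(|r| {
        r.action == action
            && match (r.scope, shape) {
                (BranchScope::Any, _) => true,
                (BranchScope::Protected, BranchShape::Protected) => true,
                (BranchScope::Unprotected, BranchShape::Unprotected) => true,
                _ => false,
            }
    })
}

/// Listing relaxation: show the tool if any branch shape would be allowed.
pub fn permits_on_any_branch(rules: &[Rule], action: Action) -> bool {
    [BranchShape::Omitted, BranchShape::Protected, BranchShape::Unprotected]
        .iter()
        .any(|&shape| permits(rules, action, shape))
}
```

Under a "write unprotected branches only" policy, both the `Protected` probe and the `Omitted` probe deny `Change`, yet `permits_on_any_branch` returns true, so `graph_mutate` stays listed for an actor who can write feature branches.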
## 11. Dispatch reuse + error classification
`call_tool` adds no business logic. Reuse points (all in `handlers.rs`):
```rust
pub(crate) enum Authz { Allowed, Denied(String) } // handlers.rs:313
pub(crate) fn authorize(actor: Option<&ResolvedActor>, policy: Option<&PolicyEngine>, request: PolicyRequest) -> Result<Authz, ApiError>; // :334 — Err = operational 401/500
pub(crate) async fn run_query(handle: Arc<GraphHandle>, actor: Option<&ResolvedActor>, query: &str, name: Option<&str>, params_json: Option<&Value>, branch: Option<String>, snapshot: Option<String>, reject_mutations: bool) -> Result<(String, ReadTarget, QueryResult), ApiError>; // :711
pub(crate) async fn run_mutate(state: AppState, handle: Arc<GraphHandle>, actor: Option<&ResolvedActor>, query: &str, name: Option<&str>, params_json: Option<&Value>, branch: String) -> Result<ChangeOutput, ApiError>; // :645
```
`PolicyRequest` carries `{ action, branch, target_branch }` only — **no actor
identity** (server-resolved, supplied separately) and **no query-name dimension**
(the coarse-`invoke_query` caveat):
```rust
// Reuses — crates/omnigraph-policy/src/lib.rs:251
pub struct PolicyRequest { pub action: PolicyAction, pub branch: Option<String>, pub target_branch: Option<String> }
```
**The stored-query double-gate + deny-masking pattern** (`handlers.rs:913`,
`server_invoke_query`) is the contract `call_tool` mirrors for stored queries:
```rust
// Reuses (pattern) — outer InvokeQuery gate; deny == missing so the catalog can't be probed
match authorize(actor, handle.policy.as_deref(), PolicyRequest {
    action: PolicyAction::InvokeQuery, branch: None, target_branch: None, // graph-scoped: NO branch dimension
})? {
    Authz::Allowed => {}
    Authz::Denied(_) => return Err(ApiError::not_found("stored query not found")),
}
let stored = handle.queries.as_ref().and_then(|r| r.lookup(&name)).ok_or_else(|| ApiError::not_found("stored query not found"))?;
// inner gate runs in run_mutate (Change) / run_query (Read); a stored mutation is double-gated.
```
**`graph_load` (NDJSON)** wraps the unified `load_as` via `run_ingest` (the canonical
`server_load` handler, `handlers.rs:1320`; `POST /ingest` / `server_ingest`,
`handlers.rs:1360`, is a `#[deprecated]` alias emitting RFC-9745 headers — RFC-009
Phase 5): a missing branch with **no `from` is a `404`, never an implicit fork**;
`BranchCreate` is consulted only when `from` is present, then `Change` for the load.
The tool's `input_schema` is `{ data: string, branch?: string, from?: string,
mode?: "merge"|"append"|"overwrite" }`, `additionalProperties: false` (the same
`IngestRequest` shape, `omnigraph-api-types/src/lib.rs:496`).
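One reading of the `graph_load` gate ordering, written as a small decision function. `load_gates` and its `Action` enum are hypothetical and only model which policy actions are consulted, in order; the behavior when the branch already exists and `from` is also supplied follows the table in §10 literally (`BranchCreate` iff `from` is present) and is an assumption about the handler.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    BranchCreate,
    Change,
}

/// Which actions are consulted, in order, or the HTTP status on refusal.
pub fn load_gates(branch_exists: bool, from: Option<&str>) -> Result<Vec<Action>, u16> {
    match (branch_exists, from) {
        // Existing branch, plain load: only the write gate.
        (true, None) => Ok(vec![Action::Change]),
        // Missing branch and no `from`: a 404, never an implicit fork.
        (false, None) => Err(404),
        // `from` present: BranchCreate is consulted first, then Change.
        (_, Some(_)) => Ok(vec![Action::BranchCreate, Action::Change]),
    }
}
```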
**Error classification (`New`, one mapper, SEP-1303-aligned).** `ApiError`'s fields are
private (`lib.rs:280`, and still carry no public status/message accessors), so add
`pub(crate) fn status_code(&self)`/`message_str(&self)` accessors. Then one `classify`
is used at every dispatch site:
```rust
// New — the single source of truth
fn classify(r: Result<CallToolResult, ApiError>) -> Result<CallToolResult, McpError> {
    match r {
        Ok(out) => Ok(out),
        // Semantic failures (bad params, validation, business 4xx/409) → isError result,
        // fed back to the model so it self-corrects (MCP 2025-11-25 SEP-1303).
        Err(e) if e.status_code().is_client_error() => Ok(CallToolResult::error(vec![Content::text(e.message_str())])),
        // Operational failures (5xx) → JSON-RPC protocol error.
        Err(e) => Err(McpError::internal_error(e.message_str().to_owned(), None)),
    }
}
```
Two cases are protocol errors, not `isError`, so the catalog isn't probeable and
malformed calls are unambiguous: an **unknown OR denied tool** returns an identical
`McpError::invalid_params("unknown tool: <name>")` (`-32602`), and a structurally
malformed call (failing the `tools/call` shape) is a protocol error. A missing/bad
bearer is an HTTP `401` at the boundary, before rmcp.
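The split between `isError` results and protocol errors, plus the unknown-or-denied masking, can be modeled without rmcp. Everything below (`Outcome`, `classify_status`, `resolve_tool`) is a hypothetical, dependency-free stand-in; `-32602` is the code named above, and `-32603` is the standard JSON-RPC internal-error code, assumed here to be what the internal-error path maps to.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    /// Tool-level failure returned to the model as an `isError` result.
    IsError(String),
    /// JSON-RPC protocol error.
    ProtocolError { code: i32, message: String },
}

/// 4xx statuses are semantic failures; everything else is operational.
pub fn classify_status(status: u16, message: &str) -> Outcome {
    if (400..500).contains(&status) {
        Outcome::IsError(message.to_string())
    } else {
        Outcome::ProtocolError { code: -32603, message: message.to_string() }
    }
}

/// Unknown and denied tools produce the same error, so a caller cannot
/// distinguish "does not exist" from "exists but you may not call it".
pub fn resolve_tool(name: &str, known: &[&str], permitted: &[&str]) -> Result<(), Outcome> {
    if known.contains(&name) && permitted.contains(&name) {
        Ok(())
    } else {
        Err(Outcome::ProtocolError {
            code: -32602,
            message: format!("unknown tool: {}", name),
        })
    }
}
```

Comparing the two `Err` values for an unknown name and a denied name with `==` is a cheap way to assert the masking property in a unit test.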
## 12. Code-mode compatibility
"Code mode" (Anthropic's *Code execution with MCP*; Cloudflare's *Code Mode*) is a
**client/runtime** technique: the client compiles a server's tools into a typed code
API (TS modules / a sandbox), the model writes code against it, and intermediate
results are filtered in the sandbox instead of round-tripping through the model
context (reported ~98% context savings on large workflows). It runs over **standard
`tools/list` + `tools/call`** and **requires no new server endpoints**; credentials
stay in the transport and the runtime holds them (the sandbox never sees the bearer).
The server's job is to be a *good source* for that compilation. Concrete server-side
choices this RFC adopts:
1. **Strict, fully-typed `input_schema`** (§9.1) with `additionalProperties: false`,
enums for `mode`/`format`, explicit `required` — these compile into precise TS
input types.
2. **Structured output** — see §13.1: declare `outputSchema` and return
`structuredContent` so generated code gets typed *returns*, not `any`.
3. **Stable, descriptive tool names + rich descriptions** (§10) — names become
function names; descriptions become doc comments.
4. **Progressive disclosure for large catalogs** — the `meta` projection mode (§9.2)
keeps `tools/list` small (`stored_query_list` + `stored_query_run`), the same
`search` + `execute` shape code-mode runtimes prefer.
5. **Bounded `tools/list` instead of pagination.** The list seam is non-paginated by
contract (§6); a large catalog is bounded by the `meta` mode (§9.2), not by cursor
paging. This keeps the seam type honest (no `nextCursor` the `Vec<T>` return can't
carry) while still preventing context blow-up on big query catalogs.
6. **Schemas as resources** (§14) — expose the graph schema (and per-query param
schemas) as MCP resources, the on-demand channel code-mode clients pull from.
7. **Auth in the transport only** — never require secrets as tool *arguments* (that
would put them in model context / generated code and break the sandbox's credential
isolation).
The server deliberately does **not** build TS-wrapper generation, sandboxes, tool
search/deferral, or PII tokenization — those are client/runtime concerns, and there is
no ratified "tools-as-code" MCP spec to target.
## 13. Provider compatibility
**Transport: Streamable HTTP is the universal target** — every current client below
supports it for remote servers, and it is the recommended transport, replacing the
deprecated HTTP+SSE transport.
**Auth splits the ecosystem into two tiers:**
| Client | Remote transport | Auth that works | Notes |
|---|---|---|---|
| **Claude Code** (CLI) | Streamable HTTP | static bearer header **and** OAuth 2.1 | `claude mcp add --transport http <url>/mcp --header "Authorization: Bearer …"`. Advertising RFC-9728 can override the static header (issue #59467) — gate PRM. |
| **Cursor** | Streamable HTTP | static header **and** OAuth 2.1 | `"headers": {"Authorization": "Bearer ${env:…}"}` in `mcp.json`. |
| **VS Code** (Copilot agent) | Streamable HTTP | static header **and** OAuth | needs VS Code ≥ 1.101 for remote + OAuth; auto-detects `401` → sign-in. |
| **OpenCode** | remote HTTP | static header **and** OAuth (auto, DCR) | `mcp` block in `opencode.json`; auto-prefixes tools `omnigraph_…`; **no progressive disclosure** → keep the static surface tight (favors `meta` mode at scale). |
| **Claude Messages API** (`mcp_servers`) | Streamable HTTP (+SSE) | pre-acquired token via `authorization_token` | forwards a token; never runs OAuth. Static bearer fits directly. Pin the beta header you target. |
| **OpenAI Responses API** (`mcp` tool) | Streamable HTTP (+SSE) | pre-acquired token via the dedicated `authorization` field | forwards the token on `Authorization` (static bearer fits directly); never runs OAuth. `require_approval` gates tool calls. (Current docs expose `authorization`, not a free-form `headers` object — ⑤.) |
| **ChatGPT** (developer mode/connectors) | Streamable HTTP (+SSE) | OAuth, **No-Auth**, or Mixed | beta; OAuth is the clean path. |
| **Claude Desktop** (custom connectors) | Streamable HTTP (+SSE) | **OAuth 2.1 or authless** | no static-header field — bearer-only deployments are unreachable without a gateway. |
| **Claude.ai web** (custom connectors) | Streamable HTTP (+SSE) | **OAuth 2.1 + RFC 9728** (or authless) | server **must** serve RFC-9728 PRM; no static-header field. |
**Phased auth recommendation:**
- **Phase 1 — static bearer (this RFC).** Reaches Claude Code, Cursor, VS Code
Copilot, OpenCode, the Claude Messages API connector, and the OpenAI Responses API —
the entire developer/agent/API tier. This is the correct launch posture.
- **Phase 2 — OAuth 2.1 + RFC 9728 (MR-956, additive).** Required to reach **claude.ai
web** and **Claude Desktop** custom connectors and the clean ChatGPT path. The same
endpoint accepts validated OAuth access tokens and (still) static bearers; PRM
advertisement stays config-gated because of the #59467 header-override behavior.
Because the resource server validates whatever token arrives on `Authorization`,
both tiers hit one endpoint with no MCP-layer branching.
### 13.1 Result shaping & structured output
For typed, machine-consumable results (`graph_query`, stored-query reads,
`branch_list`, `commit_*`, `schema_get`) the tool declares an **`outputSchema`** and
returns **`structuredContent`** (the route's existing `ReadOutput` / listing DTOs,
which already derive `ToSchema`), **and also** mirrors the JSON in a text `Content`
block for clients that don't parse structured content. Plain text-JSON is used where a
fixed schema is awkward. (Some clients still mishandle `structuredContent: null` —
emit an empty object, never `null`, when there is no structured payload.)
## 14. Resources
Two resources: `omnigraph://schema` (`Read` → schema `.pg` text) and
`omnigraph://branches` (`Read` → branches JSON). Both are Cedar-filtered and
deny-masked exactly like tools — a locked-down agent denied `Read` never sees them,
which is how the "agents don't introspect schema" intent is met by *policy*, not
omission. Advertise the `resources` capability with `subscribe:false,
listChanged:false` (both handlers are backed — don't advertise a capability whose
`read` would 404). Exposing the schema as a resource is also the on-demand channel
code-mode clients pull from (§12).
No `omnigraph://graphs` resource and no `graphs_list` tool — server-scoped graph
discovery stays REST-only via `GET /graphs` (§15).
## 15. Routing — `Reuses` `build_app`
`/mcp` is merged **into `per_graph_protected`**, which `build_app` always nests under
`/graphs/{graph_id}`. RFC-011 made the server **cluster-only** — there is no flat
single-graph route group and no `match state.routing()`, so `/mcp` is **always**
`/graphs/{graph_id}/mcp` (even a single-graph boot builds a one-graph registry keyed
by `default`; `GraphRouting` is now `{ registry, config_path }`):
```rust
// Reuses — crates/omnigraph-server/src/lib.rs:876 (abridged)
let per_graph_protected = Router::new()
    .route("/snapshot", get(server_snapshot))
    // … /query /mutate /queries /queries/{name} /schema /schema/apply /load /branches /commits …
    .merge(mcp::mcp_router(state.clone())) // ← ADD: brings its own tower-http body-limit layer
    .route_layer(middleware::from_fn_with_state(state.clone(), resolve_graph_handle)) // inner: injects Arc<GraphHandle> (lib.rs:929)
    .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth)); // outer: injects ResolvedActor / 401 (lib.rs:933)
let management = Router::new()
    .route("/graphs", get(server_graphs_list)) // GraphList — server-scoped, REST-only
    .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth));
// RFC-011 cluster-only: per-graph routes ALWAYS nest under /graphs/{graph_id};
// there is no flat mode and no routing match. (lib.rs:953)
let protected = Router::new()
    .nest("/graphs/{graph_id}", per_graph_protected) // → POST /graphs/{id}/mcp
    .merge(management);
```
`mcp::mcp_router(state)` is the server's thin wrapper:
`omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES /* lib.rs:148, 32 MiB */, host_policy_from_bind(…))`.
Merging the router (rather than `.route("/mcp", …)`) keeps the `/mcp`-specific body
limit from leaking onto the other routes.
**No server-scoped MCP.** Every MCP tool/resource is graph-scoped. `tools/list` can't
carry a graph id, so a single flat `/mcp` taking `graph_id` per call couldn't list
per-graph stored-query tools and would break isolation — hence per-graph routing. A
future server-level flat `/mcp` (bearer-only, no handle, server-scoped tools only)
would live in the `management` group, but is not built speculatively.
### 15.1 Multi-graph model
omnigraph's MCP is **per-graph**: one isolated MCP server per graph, with the graph
identity in the **URL path**, never in tool arguments or output. The server is
cluster-only (RFC-011), so the router **always** nests the whole protected group under
`/graphs/{graph_id}` (`lib.rs:954`) — this per-graph model is now the only model, not a
multi-mode special case. Each `/graphs/{id}/mcp` endpoint's `initialize` / `tools/list`
/ `tools/call` / `resources/*` operate only on that graph and can never list or touch
another graph's tools.
- **Discovery is REST-only, not an MCP tool.** `graphs_list` / `omnigraph://graphs`
are deliberately absent from MCP. Which graphs exist is answered by `GET /graphs`
→ `GraphListResponse { graphs: [{ graph_id, uri }] }`
(`api.rs:703`), gated by the server-scoped `GraphList` Cedar action and
**default-denied without a server policy** (the registry — graph ids + storage URIs
— is never leaked until an operator authorizes it). An operator discovers graphs via
REST, then points each MCP client connection at the relevant `/graphs/{id}/mcp`; no
single MCP connection ever sees the full graph list.
- **Clients configure one connection per graph.** Tool ids are identical across graphs
(each is its own server), so the **connection name is the namespace**: a client that
auto-prefixes yields `og-sales_graph_query` vs `og-hr_graph_query`.
```bash
claude mcp add og-sales --transport http https://host/graphs/sales/mcp --header "Authorization: Bearer …"
claude mcp add og-hr --transport http https://host/graphs/hr/mcp --header "Authorization: Bearer …"
```
- **Stored queries are per-graph state.** Each graph owns its registry
(`GraphHandle.queries`, `registry.rs:55`), loaded from that graph's declaration
(`cluster.yaml graphs.<id>.queries`). So a query is exposed only on its own graph's
endpoint; the same query *name* may exist on multiple graphs with different
definitions (no cross-graph collision — different servers). `effective_tool_name()`
uniqueness is enforced **per graph** at registry load (`duplicate_tool_name`), not
across graphs. The projection mode (`per_query` vs `meta`, §9.2) is chosen from
*that graph's* exposed-query count, so a small graph can show one typed tool per
query while a large graph on the same server uses the `stored_query_list` +
`stored_query_run` meta pair. `InvokeQuery` is evaluated against *that graph's*
`handle.policy`, so an actor can be allowed stored queries on one graph and denied
on another, independently. The per-graph catalog is also discoverable over REST at
`GET /graphs/{id}/queries`.
So `tools/list` on `/graphs/sales/mcp` returns sales' built-ins + sales' stored
queries; the same call on `/graphs/hr/mcp` returns hr's — two disjoint catalogs, each
Cedar-filtered to the actor.
## 16. Tests & verification
MCP tests land in a new `crates/omnigraph-server/tests/mcp.rs` suite (black-box over
`build_app`); stored-query *projection* tests extend `stored_queries.rs`.
- **Protocol:** `initialize` + advertised `{tools, resources}` caps; `tools/list`
returns the full bounded set with **no `nextCursor`** (the non-paginated contract,
§6); `tools/call` happy path; `GET /mcp → 405`; `MCP-Protocol-Version` 400/default;
unknown/denied tool → identical `-32602`.
- **Origin (fail-closed, ①):** remote bind, no configured origins → a present
`Origin` is `403` (`DenyBrowsers`); **absent** `Origin` → `200` (non-browser clients);
a configured-allowed `Origin` → `200`; a present non-allowed `Origin` under
`OriginPolicy::Allow` → `403`. Asserts `origin_guard`, not rmcp's empty-list path.
- **Cedar:** a read-only actor sees read tools but not writers; a denied call masks
byte-identically to an unknown one; stored queries appear only with `invoke_query`;
the double-gate (an `invoke_query`-only actor sees a stored tool but the call
surfaces `isError` when the inner `Read` denies).
- **Dispatch:** a `graph_mutate` writes end-to-end (proves the actor/handle extension
passthrough); a malformed query → `isError:true`, not a protocol error; `graph_load`
with a missing branch and no `from` → `isError` (404), with `from` → forks.
- **Schema/engine equivalence (the by-design lock, ③④):** the corpus test in the
compiler crate asserting `param_json_schema` accepts *exactly* what
`json_value_to_literal_typed` accepts, per `ParamKind` — incl. **Blob as a URI string
(a base64 blob validates only as a plain string, never decoded)**, **explicit `null`
for a nullable param vs rejection for a non-nullable one**, list items, and `vector`
**with and without `vector_dim`** (absent-dim omits `minItems`/`maxItems`). A drifted
arm turns this red.
- **Tool-name collision (⑦):** a stored query whose `effective_tool_name()` equals a
built-in fails `cluster validate`/boot with a loud error — it is never silently
skipped or served.
- **Structured output:** `outputSchema` present and `structuredContent` validates
against it; the text mirror is present; never emits `structuredContent: null`.
- **Projection modes:** `per_query` below the threshold, `meta` at/above it, with the
switch logged.
- **Auth decoupling:** `/mcp` `401`s without a bearer (before rmcp) and `200`s with
one; green under the static-hash verifier and a mock OIDC `ResolvedActor`.
- **Crate-level:** `omnigraph-mcp/tests/` with a trivial `McpBackend` proving the
crate stands alone (`initialize` + `GET → 405`), plus an **rmcp surface guard**
pinning `StreamableHttpServerConfig`'s `with_*` setters, `NeverSessionManager`, the
`ServerHandler` method shapes, and the `RequestContext.extensions →
http::request::Parts` passthrough — the smoke check on any rmcp bump.
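The Origin cases in the list above amount to a total function from policy and header to status. A minimal sketch, assuming a hypothetical `origin_status` helper (the real check is `origin_guard`, whose signature is not shown here); `OriginPolicy`'s variants follow the names used in the text, with `Allow` modeled as a list of exact origin strings. Treating an absent `Origin` as allowed under `Allow` is an assumption; the test list states it only for `DenyBrowsers`.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OriginPolicy {
    /// Remote default: any request carrying an `Origin` header is refused.
    DenyBrowsers,
    /// Only the listed origins are accepted (exact string match in this sketch).
    Allow(Vec<String>),
}

/// HTTP status the guard would produce for a request's `Origin` header.
pub fn origin_status(policy: &OriginPolicy, origin: Option<&str>) -> u16 {
    match (policy, origin) {
        // No Origin header: a non-browser client, let it through.
        (_, None) => 200,
        (OriginPolicy::DenyBrowsers, Some(_)) => 403,
        (OriginPolicy::Allow(allowed), Some(o)) => {
            if allowed.iter().any(|a| a.as_str() == o) {
                200
            } else {
                403
            }
        }
    }
}
```

Because every `(policy, origin)` pair maps to a status, there is no "absent policy means skip the check" state to fall into, which is the fail-closed property the test bullet pins.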
Verification commands:
```bash
cargo build --workspace --locked
cargo tree -p omnigraph-server -e normal | grep rmcp # rmcp only transitively under omnigraph-mcp
cargo test -p omnigraph-server --test mcp
cargo test -p omnigraph-server --test stored_queries
cargo test -p omnigraph-server --test openapi # /mcp carries no #[utoipa::path]; no REST drift
```
## 17. Decisions & rollout
**Locked:** rmcp 1.7 (official SDK); MCP target `2025-11-25`; stateless JSON over a
single `/mcp` POST (`NeverSessionManager`, `stateful_mode=false`, `json_response=true`);
`McpBackend` crate-trait seam with `&Parts` passthrough; `Builtin` enum + stored-query
populator; domain-qualified `snake_case` tool ids; annotations set explicitly; coarse
`InvokeQuery` with the double-gate; per-graph `/mcp` routing, no server-scoped MCP;
`structuredContent` + `outputSchema` (with a text mirror) for typed results;
`vector_dim: Option<u32>` handled with omit-on-absent; auth consumed as a resolved
actor, validated per-request, never passed through.
**Locked (correct-by-design fixes from the external review pass):** one shared
`param_json_schema` in `omnigraph-api-types` (Blob → URI string, nullable → `null`-union)
co-located with the coercer and pinned by a schema/engine **equivalence test** — schema
drift is a CI failure, not a shipped bug (③④); a **non-paginated list seam by contract**
— `meta` mode bounds large catalogs, the seam type carries no `nextCursor` it can't honor
(②); a single fail-closed **`McpHostPolicy::from_bind`** with a total `OriginPolicy`
(no absent-⇒-skip state; remote default `DenyBrowsers` enforced by `origin_guard`) (①);
built-in/stored **name collisions rejected at `cluster validate`/boot**, never silently
skipped (⑦); `stored_query_mode` folded into the one per-graph `mcp:` block (Phase 6),
not a floating key (⑥); MCP scope **derived from the per-graph mount**, so `graph_health`
replaces a server-scoped `health` (⑧). The OpenAI row is corrected to the `authorization`
field (⑤, doc-only).
**Open / deferred:**
- **OAuth 2.1 + RFC 9728 (MR-956)** — additive Phase 2; PRM advertisement config-gated
(issue #59467).
- **Per-query `expose` / `tool_name` (cluster Phase 6, the §D5 bridge)** — the
`StoredQuery` fields exist and the projection already reads them, but cluster boot
forces `expose: true, tool_name: None` for every applied query (`settings.rs:83-84`),
so today every applied query is listed and named by its query name. Per-query
exposure/naming controls (`mcp.expose`, `tool_name`) land when the cluster catalog
grows the metadata — no projection change is then needed.
- **Per-query `invoke_query` scope (PR 0b)** — add a query-name dimension to
`PolicyRequest` + the Cedar schema so an actor can be scoped to *specific* stored
queries. Until then curation is graph-level (registry membership; `expose` once
Phase 6 lands).
- **`tools/list_changed`** — only if the registry gains runtime reload.
- **stdio → proxy collapse** — the local stdio package degrades to a stdio↔HTTP proxy
over `/mcp` once this surface is GA, leaving one tool set and one Cedar gate, the
same one-contract posture as [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md).
**Rollout:** (1) the `omnigraph-mcp` crate + transport + a trivial backend (crate
stands alone); (2) the server backend — extension passthrough, `Builtin` enum,
read-only tools + resources, Cedar-filtered listing, the `classify` mapper; (3)
mutating tools + stored-query projection (both modes) + structured output; (4) docs +
the `omnigraph mcp install` on-ramp (MR-974); (5) OAuth/RFC-9728 (MR-956) and the
stdio proxy as separate follow-ups.