diff --git a/docs/dev/rfc-003-mcp-server-surface.md b/docs/dev/rfc-003-mcp-server-surface.md index cff392b..77eefdd 100644 --- a/docs/dev/rfc-003-mcp-server-surface.md +++ b/docs/dev/rfc-003-mcp-server-surface.md @@ -1,551 +1,350 @@ -# RFC-003: MCP Server Surface for `omnigraph-server` — Full Tool Parity, Stored Queries, Modular Auth +# RFC-003: MCP Server Surface for `omnigraph-server` -**Status:** Proposed — buildable implementation spec, **validated against `main` -at the 0.7.0 development head** (post-v0.6.2; workspace bumped to 0.7.0 in -`dedd647`). A reference implementation shipped in -[omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157) (proved rmcp -1.7 on edition 2024, the auth-extension passthrough, and the full -tool/resource/Cedar surface); this RFC is the canonical spec for a clean -reimplementation from `main` with a dedicated `omnigraph-mcp` crate. +**Status:** Proposed — buildable implementation spec. **Date:** 2026-06-13 -**Tickets:** MR-969 (stored queries + MCP exposure), MR-956 (OAuth/RFC-9728 -fast-follow — the auth substrate), MR-971 (per-server credential resolver — landed -as RFC-007), MR-974 (`omnigraph mcp install`), MR-668/RFC-005 (multi-graph server — shipped). +**Audience:** server/engine maintainers. +**Tickets:** MR-969 (stored queries + MCP exposure), MR-956 (OAuth/RFC-9728 layer), +MR-971 (per-server credential resolver — landed as RFC-007), MR-974 (`omnigraph mcp install`). **Builds on:** [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) -(stored queries + response envelope + the parent MCP design), -[rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md) (multi-graph boot), -[rfc-007-operator-config.md](rfc-007-operator-config.md) (the landed client -credential model), [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md) -(the embedded/remote `GraphClient` unification this surface is a sibling of), and -[omnigraph#128](https://github.com/ModernRelay/omnigraph/pull/128) (the shipped -stored-query registry, `GET /queries`, `POST /queries/{name}`, the coarse -`invoke_query` gate). -**Supersedes:** the MCP-transport portion of `rfc-001` (`GET /mcp/tools` + `POST -/mcp/invoke`). See [Relationship to RFC-001](#relationship-to-rfc-001). -**Target release:** v0.8.x (per RFC-009, MCP-adjacent work is post-v0.7.0). +(stored queries + the response envelope), [rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md) +(multi-graph boot), [rfc-007-operator-config.md](rfc-007-operator-config.md) +(client credential model), [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md) +(one-contract/two-implementations posture). +**Target release:** v0.8.x. -> **Provenance.** Every snippet labelled **`EXISTING (reuse)`** is copied verbatim -> from `origin/main` with a `file:line` citation, re-read from source on -> 2026-06-13 — verify with `git show origin/main:`. Snippets labelled -> **`NEW (build this)`** are the interfaces to add. The one behavior this repo -> cannot verify — that rmcp 1.7 forwards the request's `http::request::Parts` -> into `RequestContext.extensions` — was proven by the reference impl -> [omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157); it is -> carried forward as a cited assumption, pinned by a crate-level surface guard -> (§11). This document absorbed the standalone implementation-spec draft -> (formerly `rfc-010`); there is one MCP RFC. +**Validated against** (re-checked 2026-06-13): MCP protocol revision **`2025-11-25`** +(modelcontextprotocol.io), the official Rust SDK **`rmcp 1.7.0`** (crates.io / +github.com/modelcontextprotocol/rust-sdk), and current tool/security best practice +(Anthropic engineering, MCP spec security pages). Provider compatibility was +checked against the live docs of Claude Code/Desktop/web, the Claude Messages API +MCP connector, OpenAI's Responses API + ChatGPT connectors, Cursor, VS Code Copilot, +and OpenCode. Code snippets marked **`Reuses`** are present in `omnigraph-server` +today, cited to `file:line`; snippets marked **`New`** are the code to add. -## Summary +--- + +## 1. Summary Add a first-class **MCP (Model Context Protocol) server surface to -`omnigraph-server`**, exposed over **Streamable HTTP**, projecting the server's -operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web, -Cursor, ChatGPT dev-mode, etc.). Two tool populations share one projection path: +`omnigraph-server`**, served over **Streamable HTTP**, that projects the server's +operations as MCP **tools** and **resources** for LLM clients. Two tool populations +share one projection path: -1. **Built-in operational tools** — `health`, `snapshot`, `schema_get`, - `branches_list`, `commits_list`, `commits_get`, ad-hoc `query`/`mutate`, - `load` (NDJSON), `branches_create`/`branches_delete`/`branches_merge`, - `schema_apply` — plus resources `omnigraph://schema` and `omnigraph://branches`. - The authoritative catalog is [§6](#6-tool-catalog-13-built-ins--cedar-mapping) - (13 tools). Server-scoped graph discovery (`graphs_list` / `omnigraph://graphs`) - is **dropped from MCP** — it stays REST-only via `GET /graphs` ([§9](#9-routing--existing-reuse-validated)). -2. **Dynamic stored-query tools** — one MCP tool per `expose: true` entry in the - graph's loaded stored-query registry, typed from the `.gq` declaration via the - shipped `query_catalog_entry` / `ParamDescriptor` projection ([§7](#7-stored-query-projection--existing-reuse--the-corrected-schema)). +1. **Built-in operational tools** — graph read/mutate, schema get/apply, branch + create/delete/merge/list, commit list/get, NDJSON load, and a `health` liveness + tool, plus resources `omnigraph://schema` and `omnigraph://branches`. +2. **Dynamic stored-query tools** — projected from the graph's loaded stored-query + registry: either one typed tool per query (small catalogs) or a + discovery + execute meta-tool pair (large catalogs) — see [§9](#9-stored-query-projection). Every tool is **authorized by the server's existing Cedar policy engine**. The MCP -layer never implements its own authentication: it consumes an **already-resolved -`ResolvedActor`** from the server's bearer middleware, so the **same MCP endpoint -serves on-prem (static or customer-OIDC tokens) and cloud (WorkOS OAuth) by -configuration only** — cloud OAuth is an additive RFC-9728 layer that slots in with -zero MCP changes. The end-state collapses two diverging tool implementations into -one: the in-server MCP is canonical; the stdio package degrades to a thin -stdio↔HTTP proxy over it. +layer performs **no authentication of its own**: it consumes an already-resolved +actor identity from the server's bearer/OAuth middleware, so the **same endpoint +serves on-prem (static or customer-OIDC tokens) and cloud (OAuth 2.1) by +configuration only**. The transport is **stateless JSON over a single POST +endpoint** — the minimal conformant Streamable-HTTP shape, since the server emits no +server-initiated messages. -> **Key caveat, up front:** the headline "a token Cedar-scoped to a *specific set* -> of stored queries" requires **per-query `invoke_query` scope**, which is *designed* -> (rfc-001) but **not yet implemented** — the shipped action is coarse (any stored -> query on the graph, or none; `policy/src/lib.rs:59-73`). Per-actor Cedar curation -> works today for *built-in vs ad-hoc vs admin* tools and for *stored-vs-ad-hoc*; -> sub-selecting individual stored queries per actor is gated on PR 0b ([§12](#12-decisions--locked--open)). +The surface is built so the existing local stdio MCP package can later collapse into +a thin stdio↔HTTP proxy over it, leaving one Cedar-gated, remotely-reachable tool +set ([§13](#13-provider-compatibility)). ---- +## 2. Goals -## 0. What changed since the first blueprint draft (validation deltas) +- Project built-in tools **and** stored queries through **one** registry abstraction, + so `tools/list` / `tools/call` never special-case a population. +- Make `tools/list` and the callable set agree for argument-independent authorization, + both driven by Cedar; `tools/call` is always the authoritative gate. +- Keep the MCP layer **auth-method-agnostic**: it consumes a resolved actor, never a + raw token, and never branches on how authentication happened. +- Add **no business logic**: tools delegate to the same engine functions the HTTP + routes call. +- Be **code-mode-friendly** (typed schemas, structured output, stable names, + progressive disclosure) and **maximally client-compatible** (Streamable HTTP + + bearer today, OAuth 2.1 + RFC 9728 as an additive layer). +- Behaviour-neutral when unused: no MCP traffic ⇒ no change. -The first blueprint of this RFC was authored ~79 commits behind the current head. -Building from it verbatim would miss these; each is corrected in the noted section. +## 3. Non-Goals -| # | The first draft said | `main` reality | § | -|---|---|---|---| -| D1 | `ParamKind::Vector { dim: u32 }`, "dim intrinsic, never an `Option`" | `ParamKind::Vector` is a **unit** variant; dim is `ParamDescriptor.vector_dim: Option` | §7 | -| D2 | handlers / `server_invoke_query` / auth in `lib.rs`; "verified against `{lib.rs,api.rs}`" | moved to **`handlers.rs`**; stored-query config in **`config.rs`**; DTOs in **`api.rs`** | §4–§8 | -| D3 | `ingest` tool wraps `ingest_as`; "(`+BranchCreate` when `from` forks)" | `server_ingest` calls **`load_as`**; missing branch **without `from` → 404**, no implicit fork; `ingest_as` shim retired server-side; `ingest` is now a **deprecated CLI alias of `load`** | §8 | -| D4 | client creds: `servers: { onprem: { endpoint, auth: { token\|oauth } } }`; chain env→**keychain**→file | `OperatorServer { url }` (field is `url`, **no `auth:` block**, "no tokens in this file, ever"); chain is **env → credentials file** (no keychain step) | §10 | -| D5 | tests: `cargo test -p omnigraph-server --test server` (incl. `mcp_*`), `--test server server_opens_s3…` | `server.rs` monolith split; suites are `auth_policy.rs / data_routes.rs / schema_routes.rs / stored_queries.rs / multi_graph.rs / boot_settings.rs / s3.rs / openapi.rs`. S3 → `--test s3` | §11 | -| D6 | "`queries..mcp.expose` in `omnigraph.yaml`" | field survives (default `true`); multi-graph declaration is **`cluster.yaml graphs..queries`** with file-discovery (`677320c`); RFC-008 deprecates `omnigraph.yaml` | §7.3 | +- **Hosting an OAuth authorization server.** The server is a **Resource Server** only + (validates tokens, never issues them, never holds client secrets). The AS is a + separate concern (MR-956). +- **MCP prompts, elicitation, sampling, tasks, `tools/list_changed` subscriptions, + resource subscriptions, server-initiated messages** — none required, which is what + permits the stateless POST-only transport. (`tools/list_changed` is reconsidered + only if the registry gains runtime reload.) +- **stdio transport inside the server** — stdio stays in the TS package (later a proxy). +- **Client-side "code mode" machinery** (TS wrapper generation, sandboxes, tool + search/deferral) — those are client/runtime concerns; see [§12](#12-code-mode-compatibility) + for what the server does to support them and what it deliberately does not build. +- **Cross-graph tool listing** — per-graph catalogs only. -Everything else from the blueprint (the crate split, the `McpBackend` seam, -stateless Streamable-HTTP transport, the per-graph routing decision, the -coarse-`invoke_query` caveat, "no `read`/`change` aliases", "no `graphs_list` in -MCP") **validated against code and carries forward unchanged.** +## 4. Protocol target ---- +Target MCP revision **`2025-11-25`** (current). `rmcp 1.7.0` advertises this as its +latest and negotiates down to any of `2024-11-05 / 2025-03-26 / 2025-06-18 / +2025-11-25`; an absent `MCP-Protocol-Version` header defaults to `2025-03-26` and an +unsupported one is a `400`. Revision `2025-06-18` is the floor we rely on for two +features: **structured tool output** (`outputSchema` + `structuredContent`) and the +**OAuth Resource-Server** model. From `2025-11-25` we adopt: **input-validation +errors as tool-execution errors** (SEP-1303), **JSON Schema 2020-12** as the default +dialect, **`403` on invalid `Origin`**, and **`WWW-Authenticate` made optional** with +a `.well-known` fallback. -## 1. Crate architecture & dependency direction +**Transport shape (stateless Streamable HTTP).** The server exposes one endpoint that +accepts `POST` (and answers `GET`/`DELETE` with `405 + Allow: POST`). For a JSON-RPC +*request* it returns one `application/json` object; it opens no SSE stream, assigns no +`Mcp-Session-Id`, and treats every request independently — a fully conformant +stateless server. It **MUST** validate `Origin` (`403` on mismatch) and honor +`MCP-Protocol-Version`. `rmcp` delivers all of these in stateless mode (§7). -Two crates. `rmcp` lives **only** in `omnigraph-mcp`; `omnigraph-server` carries -no direct rmcp dep. +## 5. Crate architecture + +Two crates; `rmcp` is contained to one of them. ``` -omnigraph-server (impl McpBackend, all omnigraph logic) - │ depends on +omnigraph-server (implements McpBackend; all omnigraph tool/Cedar/dispatch logic) + │ depends on ▼ -omnigraph-mcp (rmcp transport, McpBackend trait, rmcp model re-exports) - │ depends on +omnigraph-mcp (rmcp Streamable-HTTP transport, the McpBackend trait, rmcp model re-exports) + │ depends on ▼ rmcp 1.7 + tower-http(limit) + axum + http ``` -The dependency **must** go `server → mcp`, never the reverse: the server binary -mounts `/mcp`, so `mcp → server` would cycle at the package level -(`server-bin → omnigraph-mcp → server-lib`). The trait inverts it — the crate -defines the seam, the server fills it — which is also why the crate can never -name `AppState`, `GraphHandle`, `do_*`, or `api::*`. +The dependency **must** go `server → mcp`. The server binary mounts `/mcp`, so a +`mcp → server` edge cycles at the package level (`server-bin → omnigraph-mcp → +server-lib`), which Cargo rejects. The trait inverts the direction — the crate +defines the seam, the server fills it — which is also why the crate can never name an +omnigraph type (`AppState`, `GraphHandle`, the handlers); it abstracts over them. -`omnigraph-mcp/Cargo.toml`: +`crates/omnigraph-mcp/Cargo.toml`: ```toml [package] name = "omnigraph-mcp" -edition = "2024" # matches the workspace (omnigraph-server is edition 2024) +edition = "2024" # rmcp 1.7 is itself edition 2024 — no friction version.workspace = true [dependencies] +# `server` is on by rmcp's default features; `transport-streamable-http-server` +# pulls in the tower service + http stack. Do NOT enable rmcp's `local` feature — +# it cfg's the StreamableHttpService tower wiring out. rmcp = { version = "1.7", default-features = false, features = ["server", "transport-streamable-http-server"] } -axum = { workspace = true } -http = "1" +axum = { workspace = true } +http = "1" tower-http = { workspace = true, features = ["limit"] } -tokio = { workspace = true } +tokio = { workspace = true } async-trait = { workspace = true } serde_json = { workspace = true } ``` -Workspace edit — add the member (current `members`, `Cargo.toml`): +Add `"crates/omnigraph-mcp"` to the workspace `members`; in +`omnigraph-server/Cargo.toml` add `omnigraph-mcp` and **no direct `rmcp` dep** +(verified absent today). The verification gate is `cargo tree -p omnigraph-server -e +normal | grep rmcp` showing rmcp only transitively under `omnigraph-mcp`. -```toml -members = [ - "crates/omnigraph-compiler", - "crates/omnigraph", - "crates/omnigraph-cli", - "crates/omnigraph-cluster", - "crates/omnigraph-policy", - "crates/omnigraph-server", - "crates/omnigraph-mcp", # NEW -] -``` - -In `omnigraph-server/Cargo.toml`: add `omnigraph-mcp = { workspace = true }`; -do **not** add `rmcp`. (Confirmed absent today: `git grep rmcp origin/main -- -'**/Cargo.toml'` is empty.) - ---- - -## 2. The `McpBackend` seam — `NEW (build this)` in `omnigraph-mcp` +## 6. The `McpBackend` seam — `New` in `omnigraph-mcp` ```rust // crates/omnigraph-mcp/src/lib.rs use async_trait::async_trait; -// rmcp model types re-exported so the server uses them via `omnigraph_mcp::…` +// rmcp model types re-exported so the server speaks rmcp via `omnigraph_mcp::…` // and carries no direct rmcp dependency. pub use rmcp::model::{ - Annotated, CallToolResult, Content, RawResource, ReadResourceResult, Resource, + CallToolResult, Content, RawResource, ReadResourceResult, Resource, ResourceContents, ServerCapabilities, ServerInfo, Tool, ToolAnnotations, }; -pub use rmcp::ErrorData as McpError; +pub use rmcp::ErrorData as McpError; // JSON-RPC error type (method_not_found=-32601, invalid_params=-32602, internal_error=-32603) pub type JsonObject = serde_json::Map; #[async_trait] pub trait McpBackend: Clone + Send + Sync + 'static { fn server_info(&self) -> ServerInfo; - - async fn list_tools(&self, parts: &http::request::Parts) - -> Result, McpError>; - - async fn call_tool(&self, parts: &http::request::Parts, name: &str, args: JsonObject) - -> Result; - - async fn list_resources(&self, parts: &http::request::Parts) - -> Result, McpError>; - - async fn read_resource(&self, parts: &http::request::Parts, uri: &str) - -> Result; + async fn list_tools(&self, parts: &http::request::Parts) -> Result, McpError>; + async fn call_tool(&self, parts: &http::request::Parts, name: &str, args: JsonObject) -> Result; + async fn list_resources(&self, parts: &http::request::Parts) -> Result, McpError>; + async fn read_resource(&self, parts: &http::request::Parts, uri: &str) -> Result; } ``` -`&http::request::Parts` is the whole decoupling mechanism. The crate hands the -backend the request parts; the backend reads **its own** types out of -`parts.extensions` (§4). The crate never names an omnigraph type. `#[async_trait]` -boxes the futures — negligible at MCP QPS and it sidesteps the -async-fn-in-trait `Send`-bound friction; the server already depends on -`async-trait`. +`&http::request::Parts` is the decoupling mechanism. The crate hands the backend the +request parts; the backend reads **its own** types out of `parts.extensions`. The +crate never names an omnigraph type, so it is reusable and auth stays decoupled (§8). ---- +> `rmcp`'s own `ServerHandler` trait uses RPITIT (`-> impl Future + …`), not +> `async-trait`. Our `McpBackend` deliberately uses `#[async_trait]`: it is +> implemented once by the server, the boxed future is negligible at MCP QPS, and the +> server already depends on `async-trait`. Either style compiles on edition 2024. -## 3. Transport — `NEW (build this)` in `omnigraph-mcp` +## 7. Transport — `New` in `omnigraph-mcp` ```rust // crates/omnigraph-mcp/src/transport.rs use std::sync::Arc; use rmcp::transport::streamable_http_server::{ - StreamableHttpServerConfig, StreamableHttpService, session::local::LocalSessionManager, + StreamableHttpServerConfig, StreamableHttpService, + session::never::NeverSessionManager, // stateless ⇒ reject all session ops }; pub struct McpHostPolicy { - pub allowed_hosts: Option>, // None ⇒ accept any Host - pub allowed_origins: Option>, // None/empty ⇒ off (non-browser clients send no Origin) + pub allowed_hosts: Option>, // None ⇒ accept any Host (rely on bearer + Origin) + pub allowed_origins: Option>, // None/empty ⇒ Origin check off (non-browser clients send none) } pub fn mcp_router(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router { - // StreamableHttpServerConfig is #[non_exhaustive] — build from Default, mutate via with_*. - let config = StreamableHttpServerConfig::default() + // StreamableHttpServerConfig is #[non_exhaustive]; its Default is stateful_mode=true, + // json_response=false, allowed_hosts=loopback. ALL THREE must be overridden for a + // remote stateless JSON server — build from Default and flip via the with_* setters. + let mut config = StreamableHttpServerConfig::default() .with_stateful_mode(false) - .with_json_response(true) - .with_allowed_hosts(hosts.allowed_hosts) - .with_allowed_origins(hosts.allowed_origins); + .with_json_response(true); + config = match hosts.allowed_hosts { + Some(list) => config.with_allowed_hosts(list), + None => config.disable_allowed_hosts(), // accept any Host + }; + if let Some(origins) = hosts.allowed_origins { config = config.with_allowed_origins(origins); } + // service_factory returns Result; NeverSessionManager pairs with stateless mode. let svc = StreamableHttpService::new( move || Ok(McpService::new(backend.clone())), - Arc::new(LocalSessionManager::default()), + Arc::new(NeverSessionManager::default()), config, ); axum::Router::new() .route_service("/mcp", svc) - // rmcp reads the body directly (NOT via an axum extractor), so axum's + // rmcp reads the body directly (not via an axum extractor), so axum's // DefaultBodyLimit does NOT bound /mcp — the tower-http layer does. .layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit)) } ``` -`McpService` implements rmcp's `ServerHandler`, extracts `&Parts` from -`ctx.extensions` once (missing → `McpError::internal_error`), and delegates each -method to `B`. `get_info → backend.server_info()`; `initialize`/`ping` use rmcp -defaults. +`McpService` implements rmcp's `ServerHandler`, pulls `&Parts` out of the request +context once, and delegates each method to `B`. rmcp's `StreamableHttpService` +**consumes the body and injects the remaining `http::request::Parts` into +`RequestContext.extensions`** (this is documented and load-bearing — see §8); inside +the handler, `ctx.extensions.get::()` returns those parts. -**Stateless mode gives these conformance MUSTs for free** (rmcp 1.7): `GET`/`DELETE -/mcp → 405` with `Allow`; disallowed `Host`/`Origin → 403`; `MCP-Protocol-Version -→ 400` on unsupported, default `2025-03-26` when absent. No conformance -middleware needed. +**Conformance the stateless transport gives for free** (verified in rmcp 1.7 +`tower.rs`): `GET`/`DELETE /mcp → 405` with `Allow: POST`; a disallowed `Host` → +`403`; a present-but-invalid `Origin` → `403` (Origin checked only when +`allowed_origins` is non-empty); `MCP-Protocol-Version` → `400` on unsupported, +default `2025-03-26` when absent. **No conformance middleware is needed.** -**Host/Origin policy is derived from the deployment, never a fixed default.** -rmcp's loopback `allowed_hosts` default would `403` every non-loopback client -before bearer auth. The server computes `McpHostPolicy` from its `--bind`: loopback -bind → loopback allow-list (rebind protection matters locally); non-loopback bind -→ allow the configured public host(s) or disable Host-allowlisting and log it -(bearer + `Origin` are the real controls). `allowed_origins` is the browser-attack -control: default off; configurable for browser-based clients. +**Host/Origin policy is derived from the deployment, never a fixed default.** rmcp's +default `allowed_hosts` is loopback-only — correct for local dev (DNS-rebinding +defense) but it would `403` every remote client. `omnigraph-server` computes +`McpHostPolicy` once at startup from `--bind`: loopback bind → keep the loopback +allow-list; non-loopback bind → allow the configured public host(s), else disable +Host-allowlisting and log it (bearer + `Origin` are the real controls). `Origin` +validation defaults off (non-browser MCP clients send no `Origin`) and is enabled for +browser-based clients. ---- +## 8. Auth & identity — `Reuses` the server's middleware -## 4. The extension passthrough — `EXISTING (reuse)` types, `NEW` backend - -The two values the backend needs are injected into the request extensions by -middleware that runs **before** the MCP service. Both are real types: +The backend consumes an already-resolved actor and **branches on nothing** about how +the token was verified. Two values are injected into the request extensions by +middleware that runs **before** the MCP service: ```rust -// EXISTING (reuse) — crates/omnigraph-server/src/identity.rs:187 -pub struct ResolvedActor { - pub actor_id: Arc, - pub tenant_id: Option, - pub scopes: Vec, - pub source: AuthSource, // Static today; Oidc when MR-956 lands -} -``` +// Reuses — crates/omnigraph-server/src/identity.rs:187 +pub struct ResolvedActor { pub actor_id: Arc, pub tenant_id: Option, pub scopes: Vec, pub source: AuthSource } -```rust -// EXISTING (reuse) — crates/omnigraph-server/src/registry.rs:37 +// Reuses — crates/omnigraph-server/src/registry.rs:37 pub struct GraphHandle { - pub key: GraphKey, - pub uri: String, + pub key: GraphKey, pub uri: String, pub engine: Arc, - pub policy: Option>, // None ⇒ no per-graph Cedar gate - pub queries: Option>, // None ⇒ no stored queries for this graph + pub policy: Option>, // None ⇒ no per-graph Cedar gate + pub queries: Option>, // None ⇒ no stored queries for this graph } ``` -The middleware order is fixed in `build_app` (`lib.rs:909-956`, see §9): the outer -layer `require_bearer_auth` injects `Extension` (or 401); the inner +The middleware order is fixed in `build_app` (`lib.rs:909`): the **outer** layer +`require_bearer_auth` injects `Extension` (or `401`); the **inner** layer `resolve_graph_handle` injects `Extension>`. Both land in -`request.extensions()`, which rmcp forwards into `RequestContext.extensions`. +`request.extensions()`, which rmcp copies into `RequestContext.extensions`. ```rust -// NEW (build this) — crates/omnigraph-server/src/mcp/mod.rs +// New — crates/omnigraph-server/src/mcp/mod.rs #[derive(Clone)] -pub struct OmnigraphMcpBackend { state: AppState } // AppState is #[derive(Clone)], Arc-backed +pub struct OmnigraphMcpBackend { state: AppState } // AppState is Arc-backed #[derive(Clone)] impl OmnigraphMcpBackend { - fn ctx<'a>(&self, parts: &'a http::request::Parts) - -> Result<(&'a ResolvedActor, &'a Arc), McpError> - { - let actor = parts.extensions.get::() + fn ctx<'a>(&self, parts: &'a http::request::Parts) -> Result<(&'a ResolvedActor, &'a Arc), McpError> { + let actor = parts.extensions.get::() .ok_or_else(|| McpError::internal_error("actor missing from request extensions", None))?; let handle = parts.extensions.get::>() .ok_or_else(|| McpError::internal_error("graph handle missing from request extensions", None))?; Ok((actor, handle)) } } - -// The server's thin wrapper — the only place the two crates meet: -pub fn mcp_router(state: AppState) -> axum::Router { - omnigraph_mcp::mcp_router( - OmnigraphMcpBackend { state }, - INGEST_REQUEST_BODY_LIMIT_BYTES, // lib.rs:137 — 32 MiB, reused - host_policy_from_bind(/* … */), - ) -} ``` ---- +**Auth posture (spec-aligned, MCP 2025-11-25 authorization).** The server is a +Resource Server. Per-request validation only — **sessions are never used for +authentication** (the transport is stateless, which makes this structural). Token +**audience must be validated** and **token passthrough is prohibited**: if a tool +later needs to reach an upstream API, the server acts as a separate OAuth client and +must not forward the client's token. -## 5. Reuse points for dispatch — `EXISTING (reuse)`, all in `handlers.rs` +- **Static bearer (today).** `require_bearer_auth` resolves a `ResolvedActor` from a + SHA-256 hash match. Works for the developer/agent clients (§13). +- **OAuth 2.1 + RFC 9728 (additive, MR-956).** Serve + `/.well-known/oauth-protected-resource`; on `401`, optionally add + `WWW-Authenticate: Bearer resource_metadata="…"` (header is optional in 2025-11-25 + given the well-known fallback). Clients run OAuth 2.1 + PKCE + RFC 8707 resource + indicators themselves; the server validates audience-bound JWTs offline (cached + JWKS), so on-prem/air-gapped keeps working. This swaps the bearer middleware behind + a `TokenVerifier` and changes **zero** MCP code. -The backend adds **no business logic**. `call_tool` routes to the same functions -the REST handlers call. +> **Compatibility caveat to honor (Claude Code issue #59467).** Advertising RFC-9728 +> Protected-Resource-Metadata can cause some clients (Claude Code today) to **ignore +> a static `Authorization` header and force the OAuth flow**. So PRM advertisement +> must be **config-gated**: a deployment serving developer clients over static bearer +> does not advertise OAuth; a deployment targeting consumer connectors does. The MCP +> routes only need to flow through the standard `401` path so the hook can be added +> without touching MCP code. -### 5.1 The authorization gate - -```rust -// crates/omnigraph-server/src/handlers.rs:336 -pub(crate) enum Authz { Allowed, Denied(String) } - -// crates/omnigraph-server/src/handlers.rs:357 — Err reserved for operational 401/500 -pub(crate) fn authorize( - actor: Option<&ResolvedActor>, - policy: Option<&PolicyEngine>, - request: PolicyRequest, -) -> Result; - -// crates/omnigraph-server/src/handlers.rs:438 — denial ⇒ 403 -pub(crate) fn authorize_request( - actor: Option<&ResolvedActor>, - policy: Option<&PolicyEngine>, - request: PolicyRequest, -) -> Result<(), ApiError>; -``` - -`PolicyRequest` carries **no actor identity** (dropped from the type — the MR-731 -invariant) and **no query-name dimension** (the coarse-`invoke_query` caveat): - -```rust -// crates/omnigraph-policy/src/lib.rs:252 -pub struct PolicyRequest { - pub action: PolicyAction, - pub branch: Option, - pub target_branch: Option, -} -// crates/omnigraph-policy/src/lib.rs:259 -pub struct PolicyDecision { pub allowed: bool, pub matched_rule_id: Option, pub message: String } -// crates/omnigraph-policy/src/lib.rs:509 -impl PolicyEngine { pub fn authorize(&self, actor_id: &str, request: &PolicyRequest) -> Result; } -``` - -### 5.2 The read / mutate runners (already decoupled from request bodies) - -```rust -// crates/omnigraph-server/src/handlers.rs:731 — self-authorizes Read -pub(crate) async fn run_query( - handle: Arc, - actor: Option<&ResolvedActor>, - query: &str, - name: Option<&str>, - params_json: Option<&Value>, - branch: Option, - snapshot: Option, - reject_mutations: bool, -) -> Result<(String, ReadTarget, omnigraph_compiler::result::QueryResult), ApiError>; - -// crates/omnigraph-server/src/handlers.rs:665 — self-authorizes Change + admission -pub(crate) async fn run_mutate( - state: AppState, - handle: Arc, - actor: Option<&ResolvedActor>, - query: &str, - name: Option<&str>, - params_json: Option<&Value>, - branch: String, -) -> Result; -``` - -`run_query` returns the raw `QueryResult`; project it for the wire with the -existing `read_output(query_name, &target, result) -> ReadOutput` (`api.rs:634`). - -### 5.3 The canonical double-gate + deny-as-404 pattern to mirror - -`server_invoke_query` (`handlers.rs:933`) is exactly the shape `call_tool` must -reproduce for stored-query tools — reproduced here as the contract: - -```rust -// EXISTING (reuse pattern) — handlers.rs:953 (outer InvokeQuery gate, deny == missing) -match authorize(actor_ref, handle.policy.as_deref(), PolicyRequest { - action: PolicyAction::InvokeQuery, - branch: None, // graph-scoped: NO branch dimension on the outer gate - target_branch: None, -})? { - Authz::Allowed => {} - Authz::Denied(_) => return Err(ApiError::not_found(NOT_FOUND)), // "stored query not found" -} -let stored = handle.queries.as_ref() - .and_then(|r| r.lookup(&name)) - .ok_or_else(|| ApiError::not_found(NOT_FOUND))?; // same 404 on a miss -// … then the inner gate runs inside run_mutate (Change) / run_query (Read). -if is_mutation { - if req.snapshot.is_some() { // mutation-against-snapshot unrepresentable - return Err(ApiError::bad_request("stored mutation cannot target a snapshot")); - } - // run_mutate(…) ← inner Change gate -} else { - // run_query(…) ← inner Read gate -} -``` - -For MCP, the same masking principle applies but the wire encoding differs: an -**unknown tool OR a denied tool** returns the identical `unknown tool: ` -JSON-RPC error (`-32602`) so the catalog can't be probed without the grant (§6). - -### 5.4 Error classification — `NEW (build this)`, one mapper - -`ApiError`'s fields are **private** (`lib.rs:296`), so add two accessors: - -```rust -// EXISTING shape — crates/omnigraph-server/src/lib.rs:296 -pub struct ApiError { - status: StatusCode, code: ErrorCode, message: String, - merge_conflicts: Vec, - manifest_conflict: Option, -} - -// NEW: add accessors so the MCP classifier can read status/message. -impl ApiError { - pub(crate) fn status_code(&self) -> StatusCode { self.status } - pub(crate) fn message_str(&self) -> &str { &self.message } -} -``` - -```rust -// NEW (build this) — the single source of truth, used at EVERY dispatch site -fn classify(r: Result) -> Result { - match r { - Ok(out) => Ok(out), - Err(e) if e.status_code().is_client_error() => // 4xx / 409 → model self-corrects - Ok(CallToolResult::error(vec![Content::text(e.message_str())])), // isError: true - Err(e) => Err(McpError::internal_error(e.message_str().to_owned(), None)), // 5xx → JSON-RPC error - } -} -``` - -One `classify`, not the `api_error_to_tool` / `api_operational_error` split that -drifted in the reference impl. A missing/bad bearer is an HTTP 401 at the boundary -*before* rmcp. - ---- - -## 6. Tool catalog (13 built-ins) + Cedar mapping - -Each built-in reuses the **exact `PolicyAction` its REST route enforces**. The full -action vocabulary on `main`: - -```rust -// EXISTING — crates/omnigraph-policy/src/lib.rs:18 -pub enum PolicyAction { - Read, Export, Change, SchemaApply, - BranchCreate, BranchDelete, BranchMerge, - Admin, // reserved, no call site yet - GraphList, // server-scoped (resource_kind == Server) - InvokeQuery, // graph-scoped, coarse (no per-query dimension yet) -} -``` - -| MCP tool | Scope | Cedar action | -|---|---|---| -| `health` | server | none (liveness/version) | -| `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get` | graph | `Read` | -| `query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) | -| `mutate` (ad-hoc write) | graph | `Change` | -| `load` (NDJSON; **renamed from `ingest`** — see §8) | graph | `Change` (+ `BranchCreate` **iff `from` present**) | -| `branches_create` / `branches_delete` / `branches_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` | -| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` | -| *stored query* | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` | - -Locked decisions (validated): **tool ids are `query`/`mutate` only** — no -`read`/`change` aliases (the REST surface already deprecated `/read`, `/change`; a -fresh MCP has no legacy clients). **Ad-hoc `query`/`mutate` are always exposed, -Cedar-only** — no `mcp.allow_adhoc`. **No `graphs_list` / `omnigraph://graphs`** — -graph discovery stays REST-only via `GET /graphs` (server-scoped `GraphList`, §9). - -Annotations (rmcp defaults `destructiveHint` **and** `openWorldHint` to true — set -explicitly): read tools → `read_only(true).open_world(false)`; -`mutate`/`load`/`branches_delete`/`branches_merge`/`schema_apply` → -`read_only(false).destructive(true).open_world(false)`; `branches_create` -(additive) → `destructive(false)`. - -Represent built-ins as a `Builtin` enum (one variant per tool; `descriptor`/`gate`/`call` -as match arms) — lower liability than ~13 unit structs + `dyn` + `async-trait`. -Stored-query tools are a sibling populator over `handle.queries`. - -`list_tools`/`list_resources` are Cedar-filtered by running the **same** -authorization the call path runs, with **default args (branch `main`)** — not a -`branch: None` probe (which matches no `branch_scope` rule and would hide callable -tools). Over-showing a branch-scoped grant is the safe direction; `call_tool` is -the authoritative gate. (The contract: list never shows a tool the actor can't ever -call; for branch-scoped tools it may show one callable only on some branches.) - ---- - -## 7. Stored-query projection — `EXISTING (reuse)` + the **corrected** schema +## 9. Stored-query projection The projection source is the same `query_catalog_entry` the `GET /queries` catalog -uses. The real types (note `Vector` is a **unit** variant; the dim is a separate -`Option` field — this is delta **D1**): +uses. Real types: ```rust -// EXISTING — crates/omnigraph-server/src/api.rs:346 +// Reuses — crates/omnigraph-server/src/api.rs:346 pub enum ParamKind { String, Bool, Int, BigInt, Float, Date, DateTime, Blob, Vector, List } -// crates/omnigraph-server/src/api.rs:362 +// Reuses — crates/omnigraph-server/src/api.rs:362 pub struct ParamDescriptor { pub name: String, pub kind: ParamKind, pub item_kind: Option, // Some(scalar) when kind == List - pub vector_dim: Option, // Some(dim) when kind == Vector ← dim lives HERE, not in the kind + pub vector_dim: Option, // Some(dim) when kind == Vector — the dimension lives here, not in the kind pub nullable: bool, } -// crates/omnigraph-server/src/api.rs:453 -pub fn query_catalog_entry(query: &StoredQuery) -> QueryCatalogEntry; // → { name, tool_name, description, instruction, mutation, params } +// Reuses — crates/omnigraph-server/src/queries.rs:32 +pub struct StoredQuery { pub name: String, pub source: Arc, pub decl: QueryDecl, pub expose: bool, pub tool_name: Option } +impl StoredQuery { pub fn is_mutation(&self) -> bool; pub fn effective_tool_name(&self) -> &str; } // queries.rs:51,60 +pub struct QueryRegistry { /* by_name: BTreeMap; .lookup(&name) */ } // queries.rs:67 ``` -```rust -// EXISTING — crates/omnigraph-server/src/queries.rs:32 -pub struct StoredQuery { - pub name: String, - pub source: Arc, - pub decl: QueryDecl, - pub expose: bool, // default true - pub tool_name: Option, -} -impl StoredQuery { - pub fn is_mutation(&self) -> bool; // queries.rs:51 - pub fn effective_tool_name(&self) -> &str; // queries.rs:60 — tool_name override else name -} -pub struct QueryRegistry { /* by_name: BTreeMap */ } // queries.rs:67; .lookup(&name) -``` +A query is declared in the graph's stored-query registry — `cluster.yaml +graphs..queries` with file discovery for cluster/multi-graph servers, or the +legacy single-graph `omnigraph.yaml queries:` map. Each entry carries MCP metadata +(`expose: bool`, default `true`; optional `tool_name`); the projection reads the +loaded `QueryRegistry` on `handle.queries` and is agnostic to the declaration source. -### 7.1 `ParamDescriptor → JSON Schema` — `NEW (build this)`, corrected for the real types +### 9.1 `ParamDescriptor → JSON Schema` (`New`) + +JSON Schema 2020-12. The `Vector` dimension is an `Option` on the descriptor: +when present, constrain the array length; when absent, **omit** `minItems`/`maxItems` +(never emit `0`, which would reject all input). ```rust -// NEW (build this) — corrects the blueprint's nonexistent `ParamKind::Vector { dim }`. +// New — crates/omnigraph-server/src/mcp/schema.rs use serde_json::{json, Value}; fn scalar_schema(kind: ParamKind) -> Value { @@ -558,7 +357,6 @@ fn scalar_schema(kind: ParamKind) -> Value { ParamKind::Date => json!({ "type": "string", "format": "date" }), ParamKind::DateTime => json!({ "type": "string", "format": "date-time" }), ParamKind::Blob => json!({ "type": "string", "contentEncoding": "base64" }), - // Vector/List handled by param_schema (they carry sub-structure). ParamKind::Vector | ParamKind::List => unreachable!("composite kinds handled in param_schema"), } } @@ -567,382 +365,373 @@ fn param_schema(p: &ParamDescriptor) -> Value { match p.kind { ParamKind::Vector => { let mut s = json!({ "type": "array", "items": { "type": "number" } }); - // dim is Option on the descriptor. Present ⇒ constrain length. - // Absent ⇒ OMIT minItems/maxItems — never emit 0 (a 0-length-array - // schema rejects all input; this is the footgun the first draft - // misattributed to a type that doesn't exist). - if let Some(dim) = p.vector_dim { - s["minItems"] = json!(dim); - s["maxItems"] = json!(dim); - } + if let Some(dim) = p.vector_dim { s["minItems"] = json!(dim); s["maxItems"] = json!(dim); } s } - ParamKind::List => { - let item = p.item_kind.map(scalar_schema).unwrap_or_else(|| json!({ "type": "string" })); - json!({ "type": "array", "items": item }) // grammar guarantees scalar items - } + ParamKind::List => json!({ "type": "array", "items": p.item_kind.map(scalar_schema).unwrap_or_else(|| json!({"type":"string"})) }), scalar => scalar_schema(scalar), } } ``` -### 7.2 Envelope (correct by construction) +### 9.2 Two projection modes (small vs large catalogs) -The tool's `input_schema` **nests query params under `params`**, mirroring -`POST /queries/{name}`, so the invocation knobs and the query's own params live in -separate namespaces and cannot collide: +Tool-overload is real: model accuracy degrades sharply as a single client's tool +count climbs past a few dozen, and clients that don't defer tool loading (e.g. +OpenCode) pay the full `tools/list` token cost. So the projection has two modes, +selected per graph by a `mcp.stored_query_mode` setting (default `auto`): + +- **`per_query` (small/stable catalogs).** One tool per `expose: true` query, named by + `effective_tool_name()`, with a fully typed `input_schema`. This is the richest + surface — each query is a first-class typed tool, ideal for code-mode runtimes that + compile tools into a typed API. +- **`meta` (large/dynamic catalogs).** Two tools instead of N: `stored_query_list(filter?, + detail_level?)` (returns names + descriptions; full param schema only at higher + detail) and `stored_query_run(name, params, branch?, snapshot?)`. This keeps + `tools/list` small and mirrors the progressive-disclosure shape (`search` + `execute`) + that scales to hundreds of queries. +- **`auto`** picks `per_query` below a threshold (default 24 exposed queries) and `meta` + at or above it; the threshold is configurable. The boundary and count are logged so + a deployment never silently flips modes. + +### 9.3 Envelope (collision-free by construction) + +In `per_query` mode the tool's `input_schema` **nests query params under `params`**, +mirroring `POST /queries/{name}`: ```jsonc { "type": "object", "properties": { - "params": { "type": "object", "properties": { /* per-param param_schema(...) */ }, "required": [ /* nullable==false */ ] }, + "params": { "type": "object", "properties": { /* per-param param_schema(...) */ }, "required": [ /* nullable == false */ ] }, "branch": { "type": "string" }, - "snapshot": { "type": "string" } // OMIT for mutation tools (mutation-against-snapshot is unrepresentable) + "snapshot": { "type": "string" } // omit for mutation tools — mutation-against-snapshot is unrepresentable }, "additionalProperties": false } ``` -Dispatch reads query params from `args.params` and knobs from `args.branch` / -`args.snapshot`, so a query parameter literally named `branch`/`snapshot` is -harmless. Skip a stored tool whose `effective_tool_name()` collides with a -built-in (built-ins win). The outer gate is coarse `InvokeQuery`; the inner gate -is the query body's `Read`/`Change` — the same double-gate as §5.3. +Knobs (`branch`/`snapshot`) and the query's own params live in separate namespaces, so +a query parameter literally named `branch`/`snapshot` cannot collide. A stored tool +whose `effective_tool_name()` collides with a built-in is skipped (built-ins win). -### 7.3 Where stored queries are declared (delta D6) +## 10. Tool catalog + Cedar mapping — `Reuses` `PolicyAction` -`expose` + `tool_name` survive with default `expose: true` (`config.rs:132`, -`queries.rs:43`). The **declaration source** moved: multi-graph servers declare -queries in **`cluster.yaml graphs..queries`** with Terraform-shaped file -discovery (`queries: queries/` discovers top-level `*.gq`, loud on parse/dup -errors — `677320c`); the single-graph `omnigraph.yaml queries:` map still parses -but is the **legacy** surface RFC-008 deprecates. The MCP projection is agnostic -to the source — it reads the loaded `QueryRegistry` on `handle.queries` — but the -blueprint's "`in omnigraph.yaml`" phrasing is corrected to "the graph's loaded -stored-query registry (`cluster.yaml` discovery, or legacy `omnigraph.yaml`)." - ---- - -## 8. The `load` tool — `EXISTING (reuse)`, ingest/load unified (delta D3) - -The first draft described an `ingest` tool wrapping `ingest_as` with implicit -fork-from-`main`. **That is gone.** `server_ingest` now calls `load_as` directly and -fork is opt-in by `from`: +Each built-in reuses the **exact `PolicyAction` its REST route enforces**: ```rust -// EXISTING — crates/omnigraph-server/src/handlers.rs:1210 (abridged to the load-bearing logic) -let branch = request.branch.unwrap_or_else(|| "main".to_string()); -let from = request.from; // Option -let mode = request.mode.unwrap_or(LoadMode::Merge); -if !branch_exists { - match from.as_deref() { - None => return Err(ApiError::not_found( // ← no implicit fork; a typo'd branch is a 404 - format!("branch '{branch}' not found; pass `from` to create it"))), - Some(from) => authorize_request(actor, handle.policy.as_deref(), PolicyRequest { - action: PolicyAction::BranchCreate, // ← BranchCreate consulted ONLY when a fork happens - branch: Some(from.to_string()), target_branch: Some(branch.clone()), - })?, +// Reuses — crates/omnigraph-policy/src/lib.rs:18 +pub enum PolicyAction { + Read, Export, Change, SchemaApply, + BranchCreate, BranchDelete, BranchMerge, + Admin, // reserved, no call site yet + GraphList, // server-scoped (resource_kind == Server) + InvokeQuery, // graph-scoped, coarse (no per-query dimension yet) +} +``` + +| MCP tool | Scope | Cedar action | +|---|---|---| +| `health` | server | none (liveness/version) | +| `graph_snapshot`, `schema_get`, `branch_list`, `commit_list`, `commit_get` | graph | `Read` | +| `graph_query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) | +| `graph_mutate` (ad-hoc write) | graph | `Change` | +| `graph_load` (NDJSON) | graph | `Change` (+ `BranchCreate` **iff** `from` is present — see §11) | +| `branch_create` / `branch_delete` / `branch_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` | +| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` | +| stored query (`per_query`) / `stored_query_run` (`meta`) | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` | + +**Naming.** Tool ids are **domain-qualified `snake_case`** (`graph_query`, +`branch_merge`, `schema_apply`, …) within the spec's `[A-Za-z0-9_.-]`, 1–128-char +constraint. Domain qualification (rather than bare `query`/`mutate`) reduces +cross-server collisions when a client loads omnigraph alongside other MCP servers; +clients that auto-prefix by connection name (e.g. OpenCode → `omnigraph_graph_query`) +compose cleanly. Names are a stability contract (Hyrum's Law) — don't churn them. + +**Annotations (set explicitly).** MCP annotation defaults are pessimistic +(`readOnlyHint=false`, `destructiveHint=true`, `idempotentHint=false`, +`openWorldHint=true`), so an unannotated read tool is mistaken for a destructive +open-world writer. Set them via rmcp's `ToolAnnotations` (`read_only_hint`, +`destructive_hint`, `idempotent_hint`, `open_world_hint`): + +- read tools (`graph_query`, `graph_snapshot`, `schema_get`, `branch_list`, + `commit_*`, stored *reads*) → `read_only_hint = true`, `open_world_hint = false`. +- writers (`graph_mutate`, `graph_load`, `branch_delete`, `branch_merge`, + `schema_apply`) → `read_only_hint = false`, `destructive_hint = true`, + `open_world_hint = false`. Clients use `destructiveHint` to drive human-confirmation + prompts. +- `branch_create` (additive) → `destructive_hint = false`. + +Annotations are **advisory hints, not a security boundary** (clients may ignore them); +**Cedar is the enforcement boundary.** + +Represent built-ins as a `Builtin` enum (one variant per tool; `descriptor` / `gate` / +`call` as match arms) — lower liability than ~13 unit structs + `dyn`. Stored-query +tools are a sibling populator over `handle.queries`. + +**`list_tools` / `list_resources` are Cedar-filtered** by running the *same* +authorization the call path runs, with **default args (branch `main`)** — not a +`branch: None` probe (which matches no `branch_scope` rule and would hide tools the +actor can call on a scoped branch). Over-showing a branch-scoped grant is the safe +direction; `call_tool` is the authoritative gate. + +## 11. Dispatch reuse + error classification + +`call_tool` adds no business logic. Reuse points (all in `handlers.rs`): + +```rust +pub(crate) enum Authz { Allowed, Denied(String) } // handlers.rs:336 +pub(crate) fn authorize(actor: Option<&ResolvedActor>, policy: Option<&PolicyEngine>, request: PolicyRequest) -> Result; // :357 — Err = operational 401/500 +pub(crate) async fn run_query(handle: Arc, actor: Option<&ResolvedActor>, query: &str, name: Option<&str>, params_json: Option<&Value>, branch: Option, snapshot: Option, reject_mutations: bool) -> Result<(String, ReadTarget, QueryResult), ApiError>; // :731 +pub(crate) async fn run_mutate(state: AppState, handle: Arc, actor: Option<&ResolvedActor>, query: &str, name: Option<&str>, params_json: Option<&Value>, branch: String) -> Result; // :665 +``` + +`PolicyRequest` carries `{ action, branch, target_branch }` only — **no actor +identity** (server-resolved, supplied separately) and **no query-name dimension** +(the coarse-`invoke_query` caveat): + +```rust +// Reuses — crates/omnigraph-policy/src/lib.rs:252 +pub struct PolicyRequest { pub action: PolicyAction, pub branch: Option, pub target_branch: Option } +``` + +**The stored-query double-gate + deny-masking pattern** (`handlers.rs:933`, +`server_invoke_query`) is the contract `call_tool` mirrors for stored queries: + +```rust +// Reuses (pattern) — outer InvokeQuery gate; deny == missing so the catalog can't be probed +match authorize(actor, handle.policy.as_deref(), PolicyRequest { + action: PolicyAction::InvokeQuery, branch: None, target_branch: None, // graph-scoped: NO branch dimension +})? { + Authz::Allowed => {} + Authz::Denied(_) => return Err(ApiError::not_found("stored query not found")), +} +let stored = handle.queries.as_ref().and_then(|r| r.lookup(&name)).ok_or_else(|| ApiError::not_found("stored query not found"))?; +// inner gate runs in run_mutate (Change) / run_query (Read); a stored mutation is double-gated. +``` + +**`graph_load` (NDJSON)** wraps the unified `load_as` (`server_ingest`, +`handlers.rs:1210`): a missing branch with **no `from` is a `404`, never an implicit +fork**; `BranchCreate` is consulted only when `from` is present, then `Change` for the +load. The tool's `input_schema` is `{ data: string, branch?: string, from?: string, +mode?: "merge"|"append"|"overwrite" }`, `additionalProperties: false`. + +**Error classification (`New`, one mapper, SEP-1303-aligned).** `ApiError`'s fields are +private (`lib.rs:296`), so add `pub(crate) fn status_code(&self)`/`message_str(&self)` +accessors. Then one `classify` is used at every dispatch site: + +```rust +// New — the single source of truth +fn classify(r: Result) -> Result { + match r { + Ok(out) => Ok(out), + // Semantic failures (bad params, validation, business 4xx/409) → isError result, + // fed back to the model so it self-corrects (MCP 2025-11-25 SEP-1303). + Err(e) if e.status_code().is_client_error() => Ok(CallToolResult::error(vec![Content::text(e.message_str())])), + // Operational failures (5xx) → JSON-RPC protocol error. + Err(e) => Err(McpError::internal_error(e.message_str().to_owned(), None)), } } -authorize_request(actor, handle.policy.as_deref(), PolicyRequest { - action: PolicyAction::Change, branch: Some(branch.clone()), target_branch: None, -})?; -let result = handle.engine.load_as(&branch, from.as_deref(), &request.data, mode, actor_id).await?; // unified load_as ``` -DTOs (`api.rs:497` / `:208`): +Two cases are protocol errors, not `isError`, so the catalog isn't probeable and +malformed calls are unambiguous: an **unknown OR denied tool** returns an identical +`McpError::invalid_params("unknown tool: ")` (`-32602`), and a structurally +malformed call (failing the `tools/call` shape) is a protocol error. A missing/bad +bearer is an HTTP `401` at the boundary, before rmcp. + +## 12. Code-mode compatibility + +"Code mode" (Anthropic's *Code execution with MCP*; Cloudflare's *Code Mode*) is a +**client/runtime** technique: the client compiles a server's tools into a typed code +API (TS modules / a sandbox), the model writes code against it, and intermediate +results are filtered in the sandbox instead of round-tripping through the model +context (reported ~98% context savings on large workflows). It runs over **standard +`tools/list` + `tools/call`** and **requires no new server endpoints**; credentials +stay in the transport and the runtime holds them (the sandbox never sees the bearer). + +The server's job is to be a *good source* for that compilation. Concrete server-side +choices this RFC adopts: + +1. **Strict, fully-typed `input_schema`** (§9.1) with `additionalProperties: false`, + enums for `mode`/`format`, explicit `required` — these compile into precise TS + input types. +2. **Structured output** — see §13.1: declare `outputSchema` and return + `structuredContent` so generated code gets typed *returns*, not `any`. +3. **Stable, descriptive tool names + rich descriptions** (§10) — names become + function names; descriptions become doc comments. +4. **Progressive disclosure for large catalogs** — the `meta` projection mode (§9.2) + keeps `tools/list` small (`stored_query_list` + `stored_query_run`), the same + `search` + `execute` shape code-mode runtimes prefer. +5. **Pagination** on every list-style result (cursor-based) so a code-mode run fetches + and filters incrementally without context blow-up. +6. **Schemas as resources** (§14) — expose the graph schema (and per-query param + schemas) as MCP resources, the on-demand channel code-mode clients pull from. +7. **Auth in the transport only** — never require secrets as tool *arguments* (that + would put them in model context / generated code and break the sandbox's credential + isolation). + +The server deliberately does **not** build TS-wrapper generation, sandboxes, tool +search/deferral, or PII tokenization — those are client/runtime concerns, and there is +no ratified "tools-as-code" MCP spec to target. + +## 13. Provider compatibility + +**Transport: Streamable HTTP is the universal target** — every current client below +supports it for remote servers, and it is the recommended transport over deprecated +HTTP+SSE. + +**Auth splits the ecosystem into two tiers:** + +| Client | Remote transport | Auth that works | Notes | +|---|---|---|---| +| **Claude Code** (CLI) | Streamable HTTP | static bearer header **and** OAuth 2.1 | `claude mcp add --transport http /mcp --header "Authorization: Bearer …"`. Advertising RFC-9728 can override the static header (issue #59467) — gate PRM. | +| **Cursor** | Streamable HTTP | static header **and** OAuth 2.1 | `"headers": {"Authorization": "Bearer ${env:…}"}` in `mcp.json`. | +| **VS Code** (Copilot agent) | Streamable HTTP | static header **and** OAuth | needs VS Code ≥ 1.101 for remote + OAuth; auto-detects `401` → sign-in. | +| **OpenCode** | remote HTTP | static header **and** OAuth (auto, DCR) | `mcp` block in `opencode.json`; auto-prefixes tools `omnigraph_…`; **no progressive disclosure** → keep the static surface tight (favors `meta` mode at scale). | +| **Claude Messages API** (`mcp_servers`) | Streamable HTTP (+SSE) | pre-acquired token via `authorization_token` | forwards a token; never runs OAuth. Static bearer fits directly. Pin the beta header you target. | +| **OpenAI Responses API** (`mcp` tool) | Streamable HTTP (+SSE) | arbitrary `headers` (static bearer) or pre-acquired `authorization` | forwards a token; never runs OAuth. `require_approval` gates tool calls. | +| **ChatGPT** (developer mode/connectors) | Streamable HTTP (+SSE) | OAuth, **No-Auth**, or Mixed | beta; OAuth is the clean path. | +| **Claude Desktop** (custom connectors) | Streamable HTTP (+SSE) | **OAuth 2.1 or authless** | no static-header field — bearer-only deployments are unreachable without a gateway. | +| **Claude.ai web** (custom connectors) | Streamable HTTP (+SSE) | **OAuth 2.1 + RFC 9728** (or authless) | server **must** serve RFC-9728 PRM; no static-header field. | + +**Phased auth recommendation:** + +- **Phase 1 — static bearer (this RFC).** Reaches Claude Code, Cursor, VS Code + Copilot, OpenCode, the Claude Messages API connector, and the OpenAI Responses API — + the entire developer/agent/API tier. This is the correct launch posture. +- **Phase 2 — OAuth 2.1 + RFC 9728 (MR-956, additive).** Required to reach **claude.ai + web** and **Claude Desktop** custom connectors and the clean ChatGPT path. The same + endpoint accepts validated OAuth access tokens and (still) static bearers; PRM + advertisement stays config-gated because of the #59467 header-override behavior. + +Because the resource server validates whatever token arrives on `Authorization`, +both tiers hit one endpoint with no MCP-layer branching. + +### 13.1 Result shaping & structured output + +For typed, machine-consumable results (`graph_query`, stored-query reads, +`branch_list`, `commit_*`, `schema_get`) the tool declares an **`outputSchema`** and +returns **`structuredContent`** (the route's existing `ReadOutput` / listing DTOs, +which already derive `ToSchema`), **and also** mirrors the JSON in a text `Content` +block for clients that don't parse structured content. Plain text-JSON is used where a +fixed schema is awkward. (Some clients still mishandle `structuredContent: null` — +emit an empty object, never `null`, when there is no structured payload.) + +## 14. Resources + +Two resources: `omnigraph://schema` (`Read` → schema `.pg` text) and +`omnigraph://branches` (`Read` → branches JSON). Both are Cedar-filtered and +deny-masked exactly like tools — a locked-down agent denied `Read` never sees them, +which is how the "agents don't introspect schema" intent is met by *policy*, not +omission. Advertise the `resources` capability with `subscribe:false, +listChanged:false` (both handlers are backed — don't advertise a capability whose +`read` would 404). Exposing the schema as a resource is also the on-demand channel +code-mode clients pull from (§12). + +No `omnigraph://graphs` resource and no `graphs_list` tool — server-scoped graph +discovery stays REST-only via `GET /graphs` (§15). + +## 15. Routing — `Reuses` `build_app` + +`/mcp` is mounted **inside `per_graph_protected`**, so it is flat in single mode and +nested under `/graphs/{graph_id}` in multi mode automatically: ```rust -pub struct IngestRequest { // request body shape unchanged on the wire - pub branch: Option, // default "main"; without `from`, must already exist (else 404) - pub from: Option, // parent branch; presence = opt into fork-if-missing - pub mode: Option, // default Merge - pub data: String, // NDJSON: {"type":"","data":{…}} per line -} -pub struct IngestOutput { - pub uri: String, pub branch: String, - pub base_branch: Option, // echoes `from`; NULL when absent (was non-null in the blueprint era) - pub branch_created: bool, pub mode: LoadMode, - pub tables: Vec, pub actor_id: Option, -} -``` - -**Spec decisions for the MCP tool:** - -1. **Name the tool `load`, not `ingest`.** The CLI now ships `ingest` only as a - *"Deprecated alias of `load --from `"* (`cli.rs:111`); `load` is the - unified command. A fresh MCP with no legacy clients should expose the canonical - id — consistent with §6's "no `read`/`change` aliases" decision. (The HTTP route - stays `POST /ingest` for back-compat; only the MCP tool id is canonicalized.) -2. **`do_load` wraps `load_as`** via the same body as `server_ingest`. -3. **Input schema carries `from?`** and documents the 404-on-missing-branch: - `{ data: string, branch?: string, from?: string, mode?: "merge"|"append"|"overwrite" }`, - `additionalProperties: false`. There is no silent fork-from-main to represent. - ---- - -## 9. Routing — `EXISTING (reuse)`, validated - -`/mcp` is added **into `per_graph_protected`**, so it is flat in single mode and -nested under `/graphs/{graph_id}` in multi mode automatically — no separate -wiring. The real `build_app` (`lib.rs:907`): - -```rust -// EXISTING — crates/omnigraph-server/src/lib.rs:915 (abridged) +// Reuses — crates/omnigraph-server/src/lib.rs:915 (abridged) let per_graph_protected = Router::new() .route("/snapshot", get(server_snapshot)) // … /query /mutate /queries /queries/{name} /schema /schema/apply /ingest /branches /commits … - .route("/mcp", post(server_mcp)) // ← ADD HERE (or .merge(mcp_router(state))) + .merge(mcp::mcp_router(state.clone())) // ← ADD: brings its own tower-http body-limit layer .route_layer(middleware::from_fn_with_state(state.clone(), resolve_graph_handle)) // inner: injects Arc .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth)); // outer: injects ResolvedActor / 401 let management = Router::new() - .route("/graphs", get(server_graphs_list)) // GraphList — server-scoped, NOT in MCP + .route("/graphs", get(server_graphs_list)) // GraphList — server-scoped, REST-only .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth)); let protected = match state.routing() { - GraphRouting::Single { .. } => per_graph_protected.merge(management), // → POST /mcp + GraphRouting::Single { .. } => per_graph_protected.merge(management), // → POST /mcp GraphRouting::Multi { .. } => Router::new() - .nest("/graphs/{graph_id}", per_graph_protected).merge(management), // → POST /graphs/{id}/mcp + .nest("/graphs/{graph_id}", per_graph_protected).merge(management), // → POST /graphs/{id}/mcp }; ``` -Because `mcp_router` returns its own `Router` with a tower-http body-limit layer, -prefer `.merge(mcp::mcp_router(state.clone()))` into `per_graph_protected` over a -bare `.route("/mcp", …)` so the `/mcp`-specific limit doesn't leak onto the other -routes. **No server-scoped MCP** — `graphs_list` / `omnigraph://graphs` are dropped; -graph discovery is the `management` group's `GET /graphs`. A future server-level -flat `/mcp` (bearer-only, no handle) would live in `management`, but is not built -speculatively. **Do not** consolidate to a single flat `/mcp` that takes `graph_id` -per call: MCP's `tools/list` can't carry a graph, so it couldn't list per-graph -stored-query tools, and it would break isolation. +`mcp::mcp_router(state)` is the server's thin wrapper: +`omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES /* lib.rs:137, 32 MiB */, host_policy_from_bind(…))`. +Merging the router (rather than `.route("/mcp", …)`) keeps the `/mcp`-specific body +limit from leaking onto the other routes. ---- +**No server-scoped MCP.** Every MCP tool/resource is graph-scoped. `tools/list` can't +carry a graph id, so a single flat `/mcp` taking `graph_id` per call couldn't list +per-graph stored-query tools and would break isolation — hence per-graph routing. A +future server-level flat `/mcp` (bearer-only, no handle, server-scoped tools only) +would live in the `management` group, but is not built speculatively. -## 10. Auth & the client on-ramp — `EXISTING (reuse)` RFC-007 (delta D4) +## 16. Tests & verification -**Server side** (unchanged from the blueprint and correct): the backend consumes an -already-resolved `ResolvedActor` and **branches on nothing** about how the token was -verified. Static bearer works today; OAuth/RFC-9728 (MR-956) is an additive layer -that swaps the bearer middleware behind a `TokenVerifier` and serves -`/.well-known/oauth-protected-resource` — **zero MCP changes**. Token validation is -offline (cached JWKS), so on-prem/air-gapped keeps working with no request-path -cloud dependency; the endpoint is a Resource Server only (never issues tokens, -never holds a client secret). Multi-user identity passes through: the *caller's* -token (a per-tenant audience-bound JWT) reaches the server so Cedar enforces -per-user policy — never a shared service token. +MCP tests land in a new `crates/omnigraph-server/tests/mcp.rs` suite (black-box over +`build_app`); stored-query *projection* tests extend `stored_queries.rs`. -``` -request → [require_bearer_auth: ResolvedActor] → [resolve_graph_handle: GraphHandle] → [/mcp] → Cedar → tool -``` - -| Integration | Static bearer | Note | -|---|---|---| -| Claude Code, Cursor, VS Code | ✅ | `claude mcp add --transport http /mcp --header "Authorization: Bearer "` | -| Claude Messages API MCP connector | ✅ | caller passes `authorization_token` | -| claude.ai web / Desktop, ChatGPT dev-mode | ❌ needs OAuth fast-follow | OAuth 2.1 + PKCE + RFC 9728 | - -**Client side — the *shipped* RFC-007 model (delta D4).** The landed operator config -is: - -```rust -// EXISTING — crates/omnigraph-cli/src/operator.rs:74 -pub(crate) struct OperatorServer { pub(crate) url: String } // field is `url`, NOT `endpoint`; NO `auth:` block -``` - -```yaml -# ~/.omnigraph/config.yaml (operator config — "No tokens in this file, ever") -servers: - prod: { url: https://og.internal:8080 } - edge: { url: https://og-edge:8080 } - cloud: { url: https://api.omnigraph.cloud } # OAuth is resolved by the client, not declared here -``` - -Token resolution is a **two-step keyed chain** (`operator.rs:224`), then the legacy -`bearer_token_env` fallback — **there is no keychain step** (the blueprint's middle -`omnigraph:` keychain link does not exist): - -```rust -// EXISTING — crates/omnigraph-cli/src/operator.rs:229 -// 1. OMNIGRAPH_TOKEN_ (env; `intel-dev` → OMNIGRAPH_TOKEN_INTEL_DEV) -// 2. [] token = … in ~/.omnigraph/credentials (0600; over-permissive ⇒ loud error) -pub(crate) fn resolve_keyed_token(server: &str) -> Result>; -``` - -The `omnigraph mcp install --agent ` on-ramp (MR-974) writes the agent's MCP -client config pointing at `/mcp` with the resolved bearer; it reuses -this chain, so OAuth becomes a future *sibling* of the token chain, not a re-key. - ---- - -## 11. Tests & verification (corrected suite names — delta D5) - -The `server.rs` monolith is gone. MCP tests land in a **new -`crates/omnigraph-server/tests/mcp.rs`** suite (black-box over `build_app`); -stored-query *projection* tests extend the existing **`stored_queries.rs`**. Cover: - -- **Protocol:** `initialize` + advertised `{tools, resources}`; `tools/list` shape; - `tools/call` happy path; JSON-RPC errors (`-32601`/`-32602`); `resources/list` + - `resources/read`; `GET /mcp → 405`; `MCP-Protocol-Version` 400/default; `Origin → 403`. -- **Cedar (coarse):** read-only actor sees read tools but not `mutate`/`load`/`branches_*`/`schema_apply`; - a denied `tools/call` masks byte-identically to an unknown tool; stored queries - list only with `invoke_query`; the double-gate (an `invoke_query`-only actor sees - a stored tool but the call surfaces `isError` when the inner `Read` denies). -- **Dispatch:** a `mutate` call writes end-to-end (proves the actor/handle extension - passthrough); a malformed query → `isError:true`, not a JSON-RPC error. A `load` - with a missing branch and **no `from` → `isError`** (404 classified); with `from` - → forks. -- **Resources:** list + read of `schema`/`branches`; a denied read masks as unknown. -- **Schema generation:** table-driven over every `ParamKind` incl. nullable / list / - `vector` **with and without `vector_dim`** (the absent-dim path must omit - `minItems`/`maxItems`, never emit 0 — the D1 regression guard). -- **Auth decoupling / no-bearer:** `/mcp` 401s without a bearer (before rmcp) and - 200s with one; green under the static-hash verifier and a mock `ResolvedActor`. +- **Protocol:** `initialize` + advertised `{tools, resources}` caps; `tools/list` + shape + pagination; `tools/call` happy path; `GET /mcp → 405`; + `MCP-Protocol-Version` 400/default; `Origin → 403`; unknown/denied tool → + identical `-32602`. +- **Cedar:** a read-only actor sees read tools but not writers; a denied call masks + byte-identically to an unknown one; stored queries appear only with `invoke_query`; + the double-gate (an `invoke_query`-only actor sees a stored tool but the call + surfaces `isError` when the inner `Read` denies). +- **Dispatch:** a `graph_mutate` writes end-to-end (proves the actor/handle extension + passthrough); a malformed query → `isError:true`, not a protocol error; `graph_load` + with a missing branch and no `from` → `isError` (404), with `from` → forks. +- **Schema generation:** table-driven over every `ParamKind` incl. nullable, list, and + `vector` **with and without `vector_dim`** (the absent-dim path omits + `minItems`/`maxItems`). +- **Structured output:** `outputSchema` present and `structuredContent` validates + against it; the text mirror is present; never emits `structuredContent: null`. +- **Projection modes:** `per_query` below the threshold, `meta` at/above it, with the + switch logged. +- **Auth decoupling:** `/mcp` `401`s without a bearer (before rmcp) and `200`s with + one; green under the static-hash verifier and a mock OIDC `ResolvedActor`. - **Crate-level:** `omnigraph-mcp/tests/` with a trivial `McpBackend` proving the - crate stands alone (`initialize` + `GET → 405`), plus an rmcp **surface guard** - pinning `StreamableHttpServerConfig`'s `with_*` setters and the `ServerHandler` - method shapes (the only place the unverifiable rmcp-`Parts`-passthrough assumption - is anchored). + crate stands alone (`initialize` + `GET → 405`), plus an **rmcp surface guard** + pinning `StreamableHttpServerConfig`'s `with_*` setters, `NeverSessionManager`, the + `ServerHandler` method shapes, and the `RequestContext.extensions → + http::request::Parts` passthrough — the smoke check on any rmcp bump. -Verification commands (corrected): +Verification commands: ```bash cargo build --workspace --locked -cargo tree -p omnigraph-server -e normal | grep rmcp # rmcp ONLY under omnigraph-mcp, never direct -cargo test -p omnigraph-server --test mcp # NEW suite (NOT --test server) -cargo test -p omnigraph-server --test stored_queries # projection tests -cargo test -p omnigraph-server --test openapi # no /mcp leak (it carries no #[utoipa::path]) -OMNIGRAPH_UPDATE_OPENAPI=1 cargo test -p omnigraph-server --test openapi # if the REST surface intentionally changed +cargo tree -p omnigraph-server -e normal | grep rmcp # rmcp only transitively under omnigraph-mcp +cargo test -p omnigraph-server --test mcp +cargo test -p omnigraph-server --test stored_queries +cargo test -p omnigraph-server --test openapi # /mcp carries no #[utoipa::path]; no REST drift ``` ---- +## 17. Decisions & rollout -## 12. Decisions — locked & open +**Locked:** rmcp 1.7 (official SDK); MCP target `2025-11-25`; stateless JSON over a +single `/mcp` POST (`NeverSessionManager`, `stateful_mode=false`, `json_response=true`); +`McpBackend` crate-trait seam with `&Parts` passthrough; `Builtin` enum + stored-query +populator; domain-qualified `snake_case` tool ids; annotations set explicitly; coarse +`InvokeQuery` with the double-gate; per-graph `/mcp` routing, no server-scoped MCP; +`structuredContent` + `outputSchema` (with a text mirror) for typed results; +`vector_dim: Option` handled with omit-on-absent; auth consumed as a resolved +actor, validated per-request, never passed through. -**Locked** (validated against code): rmcp 1.7; stateless POST-only Streamable -HTTP; `McpBackend` crate-trait seam with `&Parts` passthrough; `Builtin` enum + -stored-query populator; coarse `InvokeQuery` only; ad-hoc `query`/`mutate` -always-on Cedar-only; `query`/`mutate` ids (no `read`/`change`); **`load` id (not -`ingest`)**; per-graph `/mcp` routing; no server-scoped MCP / no `graphs_list`; -text-JSON content for v1; BigInt as JSON string; `vector_dim: Option` handled -with omit-on-absent. +**Open / deferred:** +- **OAuth 2.1 + RFC 9728 (MR-956)** — additive Phase 2; PRM advertisement config-gated + (issue #59467). +- **Per-query `invoke_query` scope (PR 0b)** — add a query-name dimension to + `PolicyRequest` + the Cedar schema so an actor can be scoped to *specific* stored + queries. Until then curation is graph-level (registry membership + `expose`). +- **`tools/list_changed`** — only if the registry gains runtime reload. +- **stdio → proxy collapse** — the local stdio package degrades to a stdio↔HTTP proxy + over `/mcp` once this surface is GA, leaving one tool set and one Cedar gate, the + same one-contract posture as [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md). -**Open (deferred follow-ups):** -- **PR 0b — per-query `invoke_query` scope.** Add a query-name dimension to - `PolicyRequest` + the Cedar schema (rfc-001's intended design). `InvokeQuery` is - documented coarse today (`policy/src/lib.rs:59-73`) and `branch_scope` on it is - rejected by `validate()`. Until PR 0b, stored-query curation is graph-level - (registry membership + `expose`), not per-actor sub-selection — the headline caveat. -- **MR-956 — OAuth/RFC-9728** layer (additive, zero MCP changes). -- **stdio → proxy collapse** (see below). -- **`structuredContent` / `outputSchema`** beyond text-JSON. - ---- - -## Relationship to RFC-001 - -[rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (MR-656 / MR-976 -/ MR-969) is the parent design for stored queries + the response envelope + MCP. -This RFC is the **detailed MCP-transport design** that #128 left for a follow-up, -and it **revises rfc-001 in three places where the shipped code or the MCP wire -protocol diverged from rfc-001's sketch**: - -1. **Transport shape.** rfc-001 sketched `GET /mcp/tools` + `POST /mcp/invoke` (a - bespoke REST pair). **That is not the MCP wire protocol — real MCP clients - cannot connect to it.** This RFC implements actual MCP JSON-RPC over Streamable - HTTP and reuses `query_catalog_entry` as a *projection source*, not a parallel - surface. -2. **Exposure config.** rfc-001 specified inline `.gq` pragmas (`@mcp(expose=…)`, - default `expose=false`). **#128 shipped a different mechanism:** the - `expose`/`tool_name` metadata in YAML, **default `true`** (declaring a query *is* - the opt-in). This RFC builds on the shipped form; the `.gq`-pragma design is - superseded. (The declaration *file* has since moved to `cluster.yaml` discovery - for multi-graph servers — see [§7.3](#73-where-stored-queries-are-declared-delta-d6).) -3. **Schema introspection.** rfc-001 lists "Schema introspection through MCP" as a - **non-goal**. This RFC **revises that**: the operational-parity tools include - `schema_get` and `omnigraph://schema` — *because the shipped stdio package - already exposes both*. The non-goal is achieved by *policy*, not omission: both - are Cedar-gated by `Read`, and the recommended locked-down agent policy denies - `Read`, so a curated agent still never sees the schema. - -Everything else in rfc-001 (two-paths-one-engine, per-query `invoke_query` *as the -intended scope*, the response envelope, multi-graph per-graph endpoints) this RFC -consumes unchanged. - -> **Numbering note:** the `TokenVerifier`/WorkOS auth design is referred to in code -> (`crates/omnigraph-server/src/identity.rs`) as "RFC 0001," a *different* document -> from this repo's `docs/dev/rfc-001-queries-envelope-mcp.md`. To avoid the -> collision this RFC cites the auth substrate as **MR-956** throughout. - ---- - -## Motivation - -- **One curated, safe, remotely-reachable tool surface.** MR-969's thesis: hand an - LLM a token Cedar-scoped to a set of tools and it sees exactly those typed tools — - cannot construct ad-hoc queries it isn't permitted, cannot read the schema it - isn't permitted, cannot reach other graphs. Today the only MCP is the stdio - package: local-only, full surface, ungated. -- **Parity, so the in-server MCP can be the single implementation.** Operators/agents - already depend on the operational tools. Serving them server-side behind one Cedar - gate lets the stdio package degrade to a proxy and removes two diverging tool sets. -- **On-prem and cloud from one endpoint.** A managed cloud (WorkOS OAuth) and an - on-prem/air-gapped deploy (static or customer-OIDC tokens) must serve the same MCP - without forks or MCP-specific auth. -- **Foundation for the agent on-ramp (MR-974).** `omnigraph mcp install --agent - ` needs a decided transport + a stable endpoint. - -## Goals - -- Project built-in tools + stored queries as MCP tools through **one** registry - abstraction. -- `tools/list` and the callable set are **identical for argument-independent - authorization**, both driven by Cedar (see §6 for the branch-scoped caveat). -- The MCP layer is **auth-method-agnostic**: it consumes `ResolvedActor`, never a - raw token, never branches on how auth happened. -- The same endpoint works on-prem (static/OIDC) and cloud (WorkOS OAuth), switched - by config; cloud OAuth is additive (RFC 9728). -- No new business logic: MCP tools delegate to the same `run_query`/`run_mutate`/ - branch/schema functions the HTTP routes call. -- Behaviour-neutral when unused: no MCP traffic = no change. - -## Non-Goals - -- **Building/hosting an OAuth authorization server.** The server is a Resource - Server; WorkOS AuthKit+Connect is the AS (MR-956). The MCP endpoint validates - tokens, never issues them, never holds client secrets. -- **OAuth/WorkOS implementation itself** — MR-956's work. This RFC leaves a clean - RFC-9728 hook and consumes `ResolvedActor`. -- **MCP prompts, elicitation, `tools/list_changed`, resource subscriptions, - server-initiated messages.** None needed → enables the stateless POST-only - transport (§3). -- **stdio transport inside the server.** stdio stays in the TS package (now a proxy). -- **Cross-graph tool listing.** Per-graph catalogs only. -- **Hot reload of the query registry.** Restart-only (MR-969). - -## Background - -`omnigraph-server` (Axum) already implements every operation this RFC exposes as an -authenticated HTTP route; each authorizes via a `PolicyAction` against the Cedar -policy for a server-resolved actor and calls into the engine. The existing stdio MCP -package is a *client* of these routes (it owns no business logic). MR-956 will -introduce a `TokenVerifier` trait (`StaticHashTokenVerifier` today inline, -`OidcJwtVerifier` for OIDC/WorkOS) producing the `ResolvedActor` that already exists -in `identity.rs` (§4) and is consumed by Cedar — token *validation* is offline -(cached JWKS), so on-prem/air-gapped has no request-path dependency on the cloud. - -## Relationship to the `@modernrelay/omnigraph-mcp` stdio package - -The package (`omnigraph-ts`, `@modelcontextprotocol/sdk`, **stdio only**) is a thin -client over the SDK → HTTP routes and **forwards the caller's bearer verbatim** (no -inspection). It exposes the operational tools and the `omnigraph://schema` / -`omnigraph://branches` resources (plus a vendored best-practices cookbook). *Its -exact tool/resource counts and version live in the separate `omnigraph-ts` repo and -are not re-verified here.* - -Once parity lands, **collapse to one implementation**: the in-server MCP is canonical -(Cedar-gated, remote-capable, the path that becomes a Claude-web connector via -MR-956). The stdio package degrades to a **thin stdio↔HTTP proxy** forwarding -JSON-RPC (and the incoming `Authorization`) to `/mcp` — staying the local on-ramp -for Claude Code/Desktop while sharing one tool set and one Cedar gate. This is the -same "one contract, two implementations only where semantics genuinely differ" -posture as [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md). +**Rollout:** (1) the `omnigraph-mcp` crate + transport + a trivial backend (crate +stands alone); (2) the server backend — extension passthrough, `Builtin` enum, +read-only tools + resources, Cedar-filtered listing, the `classify` mapper; (3) +mutating tools + stored-query projection (both modes) + structured output; (4) docs + +the `omnigraph mcp install` on-ramp (MR-974); (5) OAuth/RFC-9728 (MR-956) and the +stdio proxy as separate follow-ups.