diff --git a/docs/dev/rfc-003-mcp-server-surface.md b/docs/dev/rfc-003-mcp-server-surface.md index 09542d3..34c9243 100644 --- a/docs/dev/rfc-003-mcp-server-surface.md +++ b/docs/dev/rfc-003-mcp-server-surface.md @@ -11,7 +11,7 @@ Add a first-class **MCP (Model Context Protocol) server surface to `omnigraph-server`**, exposed over **Streamable HTTP**, that projects the server's operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web, Cursor, etc.). Two populations of tools share one projection path: -1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's **13 tools** (`health`, `snapshot`, `read`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply`) and its **2 resources** (`omnigraph://schema`, `omnigraph://branches`). (Server-scoped graph discovery — a `graphs_list` tool / `omnigraph://graphs` resource — was considered but **dropped from MCP**; it stays REST-only via `GET /graphs`. See [B.10](#b10-routing-resolved).) +1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's operational tool set (`health`, `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, ad-hoc read/write, `ingest`, branch ops, `schema_apply`) and its core resources (`omnigraph://schema`, `omnigraph://branches`). (Exact counts moved over time — the package is now at 16 tools / ~9 resources; see [Relationship](#relationship-to-the-modernrelayomnigraph-mcp-stdio-package). The authoritative as-built catalog is [B.5](#b5-tool-catalog-13-built-ins--cedar-mapping): 13 tools.) (Server-scoped graph discovery — a `graphs_list` tool / `omnigraph://graphs` resource — was considered but **dropped from MCP**; it stays REST-only via `GET /graphs`. See [B.10](#b10-routing-resolved).) 2. **Dynamic stored-query tools** — one MCP tool per `mcp.expose: true` entry in the `queries:` registry (MR-969 / #128), with parameters typed from the `.gq` declaration via the shipped `query_catalog_entry` / `param_descriptor` projection. Every tool is **authorized by the server's existing Cedar policy engine**. The MCP layer never implements its own authentication: it consumes an **already-resolved `ResolvedActor`** from the server's bearer middleware (`require_bearer_auth` today; the `TokenVerifier` seam when MR-956 lands), so the **same MCP endpoint serves on-prem (static or customer-OIDC tokens) and our cloud (WorkOS OAuth) by configuration only**. Cloud OAuth is an additive layer (RFC 9728 protected-resource metadata) that slots in with zero MCP changes. @@ -57,10 +57,15 @@ pub trait McpBackend: Clone + Send + Sync + 'static { ### B.3 Transport (lives in `omnigraph-mcp`) - A generic `struct McpService` implements rmcp's `ServerHandler`, delegating each method to `B` after extracting `&Parts` from `ctx.extensions` once (missing → `McpError::internal_error`). `get_info → backend.server_info()`; `initialize`/`ping` use rmcp's defaults. -- `pub fn mcp_router(backend: B, body_limit: usize) -> axum::Router`: - - `let svc = StreamableHttpService::new(move || Ok(McpService::new(backend.clone())), Arc::new(LocalSessionManager::default()), config)` with `config = StreamableHttpServerConfig::default().with_stateful_mode(false).with_json_response(true)` (`StreamableHttpServerConfig` is `#[non_exhaustive]` — build from `Default`, mutate via the `with_*` setters; keep the loopback `allowed_hosts` default). - - Return `Router::new().route_service("/mcp", svc).layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))`. rmcp reads the body directly (not via an axum extractor), so axum's `DefaultBodyLimit` does **not** bound `/mcp`; the tower-http layer does. -- **Stateless mode delivers these conformance MUSTs for free (verified against rmcp 1.7 source):** `GET`/`DELETE /mcp` → `405` (with `Allow`); a disallowed `Host`/`Origin` → `403` (loopback hosts by default — DNS-rebind guard); `MCP-Protocol-Version` → `400` on unsupported, default `2025-03-26` when absent. **No conformance middleware is needed** (the §5.6 "honour the header" footnote is satisfied by rmcp). +- `pub fn mcp_router(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router`: + - `config = StreamableHttpServerConfig::default().with_stateful_mode(false).with_json_response(true).with_allowed_hosts(hosts.allowed_hosts).with_allowed_origins(hosts.allowed_origins)` (`StreamableHttpServerConfig` is `#[non_exhaustive]` — build from `Default`, mutate via the `with_*` setters). **Do not keep rmcp's fixed loopback `allowed_hosts` default** — it `403`s every non-loopback client *before* bearer auth (a deploy footgun); derive the policy per B.3.1. + - `let svc = StreamableHttpService::new(move || Ok(McpService::new(backend.clone())), Arc::new(LocalSessionManager::default()), config)`; return `Router::new().route_service("/mcp", svc).layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))`. rmcp reads the body directly (not via an axum extractor), so axum's `DefaultBodyLimit` does **not** bound `/mcp`; the tower-http layer does. +- **Stateless mode delivers these conformance MUSTs for free (verified against rmcp 1.7 source):** `GET`/`DELETE /mcp` → `405` (with `Allow`); a disallowed `Host`/`Origin` → `403` (per the `McpHostPolicy`, B.3.1); `MCP-Protocol-Version` → `400` on unsupported, default `2025-03-26` when absent. **No conformance middleware is needed** (the §5.6 "honour the header" footnote is satisfied by rmcp). + +**B.3.1 Host/Origin policy — derived from deployment, never a fixed default.** The `Host` header is not a security boundary for a bearer-authed server (the token is); rmcp's loopback `allowed_hosts` default exists to protect *local* dev servers from browser DNS-rebinding, and for any non-loopback deploy it just rejects legitimate clients. So `omnigraph-server` computes `McpHostPolicy` once at startup from what it already knows (the `--bind` address + any configured public host) and passes it in: +- **Loopback bind** (`127.0.0.1` / `::1` / `localhost`) → loopback `allowed_hosts` (rebind protection genuinely matters locally). +- **Non-loopback bind** (`0.0.0.0` / a public IP) → the operator chose remote reachability: allow the configured public host(s) if set, else **disable Host-allowlisting** (accept any `Host`) and log it — bearer auth + `Origin` validation are the real controls. +- `allowed_origins` is the actual browser-attack control: default empty (off — non-browser MCP clients send no `Origin`), configurable for browser-based clients. The class "one fixed default that is wrong for some deployments" is closed by making the policy a function of the deployment. - **Client/test obligations rmcp enforces:** the request must carry `Accept: application/json, text/event-stream` (both), `Content-Type: application/json`, and a `Host` header. rmcp negotiates `protocolVersion` (a recent client sees `2025-11-25`). ### B.4 Server-side backend (lives in `omnigraph-server`) @@ -94,7 +99,7 @@ Annotations (rmcp defaults `destructiveHint` **and** `openWorldHint` to **true** ### B.6 Stored-query tools -One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), projected from the **same `api::query_catalog_entry`** the `GET /queries` catalog uses; parameters → JSON Schema per B.7. The outer gate is the **coarse `InvokeQuery`** action (all exposed queries on the graph, or none — per-query scope is deferred, see B.13); the call then runs the registry source through `run_query` / `run_mutate`, whose inner `Read` / `Change` gate the body — the **double-gate of `POST /queries/{name}`**. Skip a stored tool whose name collides with a built-in (built-ins win, so the catalog never has a duplicate tool name). +One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), projected from the **same `api::query_catalog_entry`** the `GET /queries` catalog uses; parameters → JSON Schema per B.7. The outer gate is the **coarse `InvokeQuery`** action (all exposed queries on the graph, or none — per-query scope is deferred, see B.13); the call then runs the registry source through `run_query` / `run_mutate`, whose inner `Read` / `Change` gate the body — the **double-gate of `POST /queries/{name}`**. Skip a stored tool whose name collides with a built-in (built-ins win, so the catalog never has a duplicate tool name). Dispatch reads the query params from `args.params` and the knobs from `args.branch` / `args.snapshot` (B.7's nested envelope), so a query parameter named `branch`/`snapshot` cannot be shadowed. ### B.7 `ParamKind` → JSON Schema (stored-query params) @@ -104,16 +109,16 @@ One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), project | BigInt (i64/u64) | `{"type":"string","pattern":"^-?\\d+$"}` (JSON numbers lose precision >2⁵³) | | Date / DateTime | `{"type":"string","format":"date"}` / `{"type":"string","format":"date-time"}` | | Blob | `{"type":"string","contentEncoding":"base64"}` | -| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` (from `vector_dim`) | +| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` — **`dim` is intrinsic to the kind** (`ParamKind::Vector { dim: u32 }`, never an `Option`), so a dimensionless vector is unrepresentable and the `unwrap_or(0)` footgun (a 0-length-array schema that rejects all input) cannot exist; if a `dim` is ever genuinely unknown, **omit** `minItems`/`maxItems` rather than emit `0`. | | List | `{"type":"array","items":}` (scalar items only) | -`nullable == false` → in `required`. Add `branch` (and `snapshot` for reads) as optional invocation knobs. Fold `instruction` into the description. +**Envelope (correct by construction).** The tool's `input_schema` **nests the query params under a `params` object**, mirroring the `POST /queries/{name}` request: `{ "params": { }, "branch"?: string, "snapshot"?: string }`. The invocation knobs and the query's own parameters then live in **separate namespaces and cannot collide** — a query param named `branch`/`snapshot` is harmless. `nullable == false` params go in the inner `params.required`. For a **mutation** tool, omit `snapshot` from the schema and set `additionalProperties: false`, so "mutation against a snapshot" is *unrepresentable* (the REST path `400`s it; the MCP schema makes it impossible). Fold `instruction` into the description. ### B.8 `list` / `call` semantics -- **`list_tools` / `list_resources`** are Cedar-filtered: for each tool/resource, evaluate `authorize(actor, policy_for(gate), { action, branch: None })`; emit only `Allowed`; an `Err` (operational, e.g. policy-engine error) propagates as a JSON-RPC error; a `Denied` simply hides. Stored-query tools list as a group iff the coarse `InvokeQuery` is allowed. -- **`call_tool`:** an unknown tool **or** a denied tool returns the **identical** `unknown tool: ` (`-32602`) so the catalog can't be probed without the grant. A business/validation/engine failure (4xx/409) → `CallToolResult { isError: true }` (so the model self-corrects — the 2025-11-25 SEP-1303 split); an operational 5xx → JSON-RPC error. A missing/bad bearer is an HTTP `401` at the boundary *before* rmcp. -- **Branch-scope caveat (R7):** list visibility evaluates with `branch: None`; the actual `do_*` / `run_*` re-authorizes against the real branch, so a branch-scoped policy may list a tool yet deny a specific-branch call. `tools/call` is authoritative. +- **`list_tools` / `list_resources`** are Cedar-filtered. For each tool/resource, run the **same authorization the call path runs, with default args** (the default branch `main`) — **not** a separate `branch: None` probe (a `branch: None` request matches no `branch_scope` rule, so it would *hide* tools the actor can actually call on a scoped branch). Sharing the authorization input makes list and call agree by construction for the common case. Emit only `Allowed`; an `Err` (operational) propagates as a JSON-RPC error; a `Denied` hides. Stored-query tools list as a group iff the coarse `InvokeQuery` is allowed. +- **`call_tool`:** an unknown tool **or** a denied tool returns the **identical** `unknown tool: ` (`-32602`) so the catalog can't be probed without the grant. Route **every** `do_*`/`run_*` result through **one canonical `classify(Result<_, ApiError>) -> Result`** — the single source of truth, used at every dispatch site so the mapping can't drift: `Ok` → success JSON content; `Err` 4xx/409 → `CallToolResult { isError: true }` (the model self-corrects, per the 2025-11-25 SEP-1303 split); `Err` operational 5xx → JSON-RPC error. (Do not keep two mappers — a single `classify` replaces the `api_error_to_tool` / `api_operational_error` split that drifted in the reference impl.) A missing/bad bearer is an HTTP `401` at the boundary *before* rmcp. +- **Branch-scope residual (R7):** because visibility uses the default branch, a grant scoped to a *non-default* branch is an **over-show** — the tool lists and `call_tool` is the authoritative gate. Over-showing is the safe direction (the agent sees the tool, tries it, gets a clear deny); under-showing a callable tool is the failure mode the default-branch probe avoids. ### B.9 Resources @@ -373,6 +378,8 @@ Once parity lands, **collapse to one implementation**: the in-server MCP is cano ## Testing +> *Superseded by [B.12](#b12-tests--verification), which is the authoritative test plan. The list below is the original §5-era sketch; ignore its references to `graphs_list`, `read`/`change` aliases, and the per-query `invoke_query` test (all resolved/dropped — see B.13).* + - **Protocol conformance:** `initialize` handshake + advertised capabilities; `tools/list` shape; `tools/call` happy path; JSON-RPC error envelopes (`-32601` unknown method, `-32602` invalid params / unknown tool); `resources/list` + `resources/read`. - **Cedar filtering (coarse, today):** an actor with `allow InvokeQuery` + `deny Read/Change` sees *all* exposed stored queries but **not** `query`/`mutate`/`schema_get`; `tools/call query` returns masked "unknown tool"; an admin sees the full catalog. - **Cedar filtering (per-query, gated on PR 0b):** actor scoped to `InvokeQuery [find_user]` sees *only* `find_user`; `tools/call list_orders` masks. **This test ships with PR 0b**, not PR 1 — it cannot pass against the coarse action. @@ -388,6 +395,8 @@ Once parity lands, **collapse to one implementation**: the in-server MCP is cano ## Rollout — phased by risk +> *Superseded by the Blueprint's build order ([B.4](#b4-server-side-backend-lives-in-omnigraph-server) → crate ([B.1](#b1-crate-architecture--dependency-direction)–[B.3](#b3-transport-lives-in-omnigraph-mcp)) → backend → tools/stored/resources). The PR phases below are the original §5-era plan; they still mention `graphs_list`, `read`/`change` aliases, and the `mcp.allow_adhoc` switch, all of which the locked decisions ([B.13](#b13-decisions-locked)) drop. PR 0b (per-query `invoke_query` scope) remains the one deferred follow-up.* + - **PR 0a — extract the reusable invoke path (small).** The coarse `invoke_query` gate + 404 denial-masking are **already shipped** in `server_invoke_query`. Extract the read/mutate dispatch into `invoke_stored_query(handle, name, params, branch/snapshot, actor)` so MCP `tools/call` and the HTTP route share one path. No behaviour change. *(Replaces the previous draft's "PR 0 — wire the gate", which was already done.)* - **PR 0b — per-query `invoke_query` scope (the safety prerequisite).** Add a query-name dimension to `PolicyRequest` + the Cedar schema (rfc-001's intended design), wire it at `POST /queries/{name}` and in the stored-query `McpTool::authorization`. Independently useful (the `allow InvokeQuery [find_user]` policy). **Gates the per-query Cedar-filtering test and §5.9's recommended agent policy.** - **PR 1 — MCP transport + read-only parity + stored-query reads.** Endpoint(s), `initialize`/`tools/list`/`tools/call`/`resources/*`, the `McpTool` registry, Cedar-filtered listing, the read-only built-ins (`health`, `graphs_list`, `snapshot`, `read`/`query`, `schema_get`, `branches_list`, `commits_*`) + resources + stored-query *reads*. All auth-agnostic.