test(mcp): guard instructions.py against tool drift

The MCP `instructions` hint is static and baked into the client prompt, while tool names, signatures, and error codes are discovered dynamically via tools/list. The two had drifted: instructions restated stale signatures and an error-code enum that omitted schema_validation and trigger_path_conflict.

- Trim instructions.py to tool names + call order; stop restating signatures and error codes the dynamic surface already carries.
- Document each tool's full error_code contract in the save_workflow and create_workflow docstrings (the descriptions shipped via tools/list).
- Add test_mcp_instructions_drift.py: every tool named in the guide must be registered, and every error_code a tool returns must appear in its description.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
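
The two drift checks described above could be sketched roughly as follows. This is not the contents of test_mcp_instructions_drift.py, which this view does not show. The helper names are hypothetical, and the sketch assumes that the guide names tools in backticks and that `_error_result` takes the code as its first argument, as a string literal.

```python
import re


def tools_named_in_guide(guide: str) -> set[str]:
    # Assumption: the guide wraps each tool name in backticks.
    return set(re.findall(r"`([a-z_]+)`", guide))


def missing_tools(guide: str, registered: set[str]) -> set[str]:
    # Tools the static guide mentions that are not actually registered.
    return tools_named_in_guide(guide) - set(registered)


def returned_error_codes(source: str) -> set[str]:
    # Assumption: codes are passed as the first string literal
    # in calls of the form _error_result("code", ...).
    return set(re.findall(r"_error_result\(\s*[\"']([a-z_]+)[\"']", source))


def undocumented_codes(source: str, description: str) -> set[str]:
    # Codes a tool can return that its description never names in backticks.
    return {
        code
        for code in returned_error_codes(source)
        if f"`{code}`" not in description
    }
```

A real test would presumably read the guide, the registered tool list, and each tool's description from the running server rather than from literal strings; the set differences above only illustrate the two invariants being enforced.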
2026-06-22 08:38:13 +02:00 · 2026-05-20 18:43:18 +05:30 · 2026-05-20 18:43:18 +05:30 · 8484e4bfaf
commit 8484e4bfaf
parent 5762095edf
4 changed files with 170 additions and 30 deletions
--- a/api/mcp_server/tools/save_workflow.py
+++ b/api/mcp_server/tools/save_workflow.py
@@ -10,16 +10,12 @@ Execution flow:
    4. Save as a new draft via `db_client.save_workflow_draft` — the
       published version stays intact, so edits are rollback-safe.

-Error codes surfaced to the LLM:
-    parse_error       — TS parse failed or a disallowed construct was used
-    validation_error  — node data failed spec validation (unknown field,
-                        missing required, wrong type, option out of range)
-    schema_validation — ReactFlowDTO Pydantic rejection (rare; parser bug)
-    graph_validation  — semantic graph rule broken (e.g. no start node)
-    bridge_error      — Node subprocess failed before returning JSON
-
-All LLM-facing errors include file:line:column where available so the
-LLM can correct its code directly.
+Each failure path returns an `error_code` via `_error_result`. Those
+codes and their meanings are documented in the `save_workflow` docstring
+(the description shipped to the LLM via `tools/list`); keep the two in
+sync — `test_mcp_instructions_drift.py` enforces it. All LLM-facing
+errors include file:line:column where available so the LLM can correct
+its code directly.
 """

 from __future__ import annotations
@@ -91,6 +87,18 @@ async def save_workflow(workflow_id: int, code: str) -> dict[str, Any]:

    On success the draft version is saved; the published version is
    untouched.
+
+    On failure the result has `saved: false`, a machine-readable
+    `error_code`, and a human-readable `error` (with file:line:column
+    where the problem is locatable). Resubmit the full corrected source —
+    patches are not accepted. Possible `error_code` values:
+    - `parse_error` — disallowed construct or malformed TypeScript.
+    - `validation_error` — node data failed spec validation (unknown
+      field, missing required, wrong type, option out of range).
+    - `schema_validation` — wire-format (DTO) rejection; rare.
+    - `graph_validation` — structural rule broken (e.g. no start node,
+      unreachable node, edge to/from the wrong node type).
+    - `bridge_error` — internal/transient; retry once, then surface it.
    """
    user = await authenticate_mcp_request()