feat(mcp):added MCP server (#97)

* docs(specs): design research-agent MCP tools and ktx mcp daemon Adds the 2026-05-14 design spec for exposing four new MCP tools (discover_data, entity_details, dictionary_search, sql_execution), shipping a ktx-research skill, and introducing an HTTP-only ktx mcp daemon so external agents can use KTX as a research-capable context layer. * Refine research-agent MCP tools spec after adversarial review iteration 1 * Refine research-agent MCP tools spec after adversarial review iteration 2 * Refine research-agent MCP tools spec after adversarial review iteration 3 * Refine spec: drop connectionName compat carve-out and ground summary/snippet provenance per kind * feat(daemon): validate read-only SQL with sqlglot * feat(context): expose read-only SQL validation port * feat(context): register MCP sql execution tool * feat(context): execute MCP SQL through validated connector path * test(context): update SQL analysis port fixtures * docs: add research-agent MCP sql execution foundation plan * feat(context): add scan-backed entity details service * feat(context): register MCP entity details tool * feat(context): expose local MCP entity details * test(context): align entity details scan fixtures * docs: add research-agent MCP entity_details plan * feat(context): add dictionary search service * feat(context): register MCP dictionary search tool * feat(context): expose local MCP dictionary search * docs: add research-agent MCP dictionary_search plan * feat: add MCP discover data service * feat: expose discover data MCP tool * feat: wire local discover data MCP port * docs: add research-agent MCP discover_data plan * feat(cli): add mcp http security helpers * feat(cli): host mcp over streamable http * feat(cli): manage mcp daemon lifecycle * feat(cli): add ktx mcp commands * fix(cli): stabilize mcp daemon verification * docs: add research-agent MCP http daemon plan * feat(cli): install KTX research skill * feat(cli): configure MCP clients in setup agents * feat(cli): support Claude local MCP setup scope * docs: add research-agent MCP setup-agents plan * refactor(context): use connectionId in warehouse verification tools * docs(context): update ingest verification prompts for connectionId * docs: add research-agent MCP ingest contract convergence plan * chore: build runtime artifacts in conductor setup --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
2026-06-22 08:38:08 +02:00 · 2026-05-15 02:35:09 +02:00 · 2026-05-15 02:35:09 +02:00 · b759a4a286
commit b759a4a286
parent c7b64379bf
78 changed files with 13689 additions and 190 deletions
--- a/docs/superpowers/specs/2026-05-14-research-agent-mcp-tools-design.md
+++ b/docs/superpowers/specs/2026-05-14-research-agent-mcp-tools-design.md
@ -0,0 +1,931 @@
+# Research Agent MCP Tools Design
+
+**Date:** 2026-05-14
+**Author:** Andrey Avtomonov
+**Status:** Design — pending implementation plan
+
+## Background
+
+KTX positions itself as a standalone context layer for database agents.
+External agents — Claude Code, Cursor, Codex, opencode — should be able to
+connect to a local KTX instance via MCP and perform research against
+configured data connections.
+
+The existing MCP surface (`packages/context/src/mcp/context-tools.ts`) already
+exposes strong **context** primitives: wiki search/read/write, semantic-layer
+list/read/write/validate/query, ingest and scan run management, memory
+capture. What it is missing is the **active investigation** primitives a
+research agent needs:
+
+- The agent cannot run raw SQL against a connection. `sl_query` only covers
+  semantic-layer-defined queries.
+- The agent cannot inspect raw table or column metadata for tables that are
+  not yet modeled in the semantic layer.
+- The agent cannot find which column holds a literal value mentioned by the
+  user (e.g., "Acme Corp").
+- The agent must call multiple separate search tools (`wiki_search`,
+  `sl_list_sources`) and reconcile results manually instead of getting a
+  unified ranked discovery view.
+
+The Kaelio research agent (reference implementation at
+`/Users/andrey/conductor/workspaces/kaelio-main2/douala/server/src/cores/research-execution.core.ts`)
+addresses these gaps with tools named `sql_execution`, `entity_details`,
+`dictionary_search`, and `discover_data`, used in a discovery → inspection →
+query loop. The corresponding KTX infrastructure already exists in pieces:
+
+- `KtxScanConnector.executeReadOnly` on every connector
+  (`packages/connector-postgres/src/connector.ts:447` and siblings) — read-only
+  SQL execution with `assertReadOnlySql` and `limitSqlForExecution`.
+- `KtxSchemaSnapshot` from scan reports — full table/column/FK metadata.
+- `SlDictionaryEntry` extraction over relationship-profiling artifacts
+  (`packages/context/src/sl/sl-dictionary-profile.ts`).
+- Hybrid search core with Reciprocal Rank Fusion
+  (`packages/context/src/search/{hybrid-search-core,rrf}.ts`).
+
+This design exposes those primitives as four new MCP tools, adds a research
+skill to guide external agents, and introduces an HTTP-only `ktx mcp` daemon
+to host the MCP server.
+
+## Goals
+
+- Expose four new MCP tools that turn KTX into a research-capable context
+  layer for any MCP-compatible client: `discover_data`, `entity_details`,
+  `dictionary_search`, `sql_execution`.
+- Ship a `ktx-research` skill installable via `ktx setup-agents`, describing
+  the discover → inspect → query → capture workflow for external agents.
+- Provide a `ktx mcp` CLI subtree that runs the MCP server over HTTP on
+  localhost, with the same lifecycle pattern as the existing managed Python
+  daemon (`packages/cli/src/managed-python-daemon.ts`).
+- Make `ktx setup-agents` install MCP client configuration for the configured
+  targets pointing at the local HTTP endpoint. v1 splits this by client: for
+  claude-code and cursor (JSON config), `setup-agents` writes the entry
+  directly; for codex (TOML) and opencode (different JSON wrapper),
+  `setup-agents` prints a copy-pasteable snippet rather than writing the file.
+  See the client matrix below for full per-target behavior.
+- Reuse existing infrastructure (connector `executeReadOnly`, schema
+  snapshots, dictionary profile, hybrid search + RRF) rather than building
+  parallel implementations.
+
+## Non-goals
+
+- This spec does not build an agent loop inside KTX. The system prompt, step
+  budget, tool dispatch, and methodology tracking remain in the external
+  client. KTX is a context provider, not an agent runner.
+- This spec does not expose Python code execution. The `ktx-daemon`
+  `/code/execute` endpoint exists but is not surfaced via MCP. That is a
+  separate design with its own sandboxing and security considerations.
+- This spec does not ship widget rendering, chart creation, or scheduled
+  report execution. Those are presentation concerns the external client owns.
+- This spec does not implement stdio MCP transport. HTTP-only.
+- This spec does not implement OS-level auto-start (launchd, systemd user
+  units). `ktx mcp start` must be run explicitly.
+- This spec does not implement remote network exposure beyond loopback. Token
+  auth and non-`127.0.0.1` binding are supported but TLS, audit logging, and
+  multi-tenant isolation are out of scope for v1.
+
+## Tool inventory
+
+Four new MCP tools, registered in `packages/context/src/mcp/context-tools.ts`
+alongside the existing tools.
+
+### Relationship to existing warehouse-verification tools
+
+KTX already ships ingest-side implementations of `sql_execution`,
+`entity_details`, and `discover_data` at
+`packages/context/src/ingest/tools/warehouse-verification/{sql-execution,entity-details,discover-data}.tool.ts`,
+backed by `warehouse-catalog.service.ts`. Their contracts differ from the
+MCP shapes proposed below in three concrete ways:
+
+- They currently take `connectionName` (slug-shaped); this spec renames
+  them to `connectionId` in the same change (see below).
+- They take `targets` (a discriminated `display` vs. `{catalog,db,name}`
+  union) and `rowLimit`, not `entities` / `maxRows`.
+- They return `{ markdown, structured }` with scan availability, candidate
+  matches, and ingest-session-allowed-connection scoping, not the
+  MCP-shaped pure-structured outputs in this spec.
+
+To avoid two divergent contracts for the same primitives, the MCP tools
+**must be implemented by extracting the shared logic out of
+`warehouse-verification/*` and into reusable services**
+(e.g., `WarehouseCatalogService` as the source of truth for table/column
+resolution and discovery, plus a shared read-only SQL executor that wraps
+`assertReadOnlySql`/`limitSqlForExecution`). The ingest tools and the new
+MCP tools then become thin adapters around those services with their own
+input/output shapes appropriate to each surface.
+
+KTX has no public users yet, so the same change that introduces the MCP
+tools renames the ingest-side `connectionName` parameter to `connectionId`
+across `warehouse-verification/*.tool.ts`, `warehouse-catalog.service.ts`,
+and any callers. `connectionId` matches the rest of the in-process MCP
+surface (`sl_query`, `sl_list_sources`, `scan_trigger`, etc.) and the new
+MCP tool inputs. The ingest tools and the new MCP tools then share both
+the service layer and the parameter name; only their input/output shapes
+differ (markdown+structured for the ingest surface, pure structured for
+the MCP surface).
+
+### discover_data
+
+Unified ranked search across wiki, semantic-layer sources/measures/dimensions,
+and raw schema tables/columns. Returns refs only with a uniform shape; the
+agent dereferences top hits using the existing `wiki_read`, `sl_read_source`,
+or `entity_details` tools.
+
+**Input schema:**
+
+```typescript
+{
+  query: z.string().min(1),
+  connectionId: z.string().optional(),         // omit → all connections
+  kinds: z.array(z.enum([
+    'wiki', 'sl_source', 'sl_measure', 'sl_dimension', 'table', 'column',
+  ])).optional(),                              // omit → all kinds
+  limit: z.number().int().min(1).max(50).default(15).optional(),
+}
+```
+
+**Output:** array of refs, each:
+
+```typescript
+{
+  kind: 'wiki' | 'sl_source' | 'sl_measure' | 'sl_dimension' | 'table' | 'column',
+  id: string,                                  // stable id: wiki key, source name, or driver-qualified table/column display string
+  score: number,                               // RRF fused score, 0-1 range
+  summary: string | null,                      // one-line description; null when no source field is populated
+  snippet: string | null,                      // short context snippet, ≤200 chars; null when nothing meaningful to show
+  matchedOn:                                   // why this result matched (powers the snippet for non-description kinds)
+    | 'name' | 'display' | 'description' | 'comment' | 'expr' | 'sample_value' | 'body',
+  connectionId?: string,                       // present for non-wiki kinds
+  tableRef?: {                                 // present for kind 'table' and 'column'
+    catalog: string | null,
+    db: string | null,
+    name: string,
+  },
+  columnName?: string,                         // present for kind 'column'
+}
+```
+
+The structured `tableRef` mirrors the live `KtxSchemaTable` identity
+(`packages/context/src/scan/types.ts:74-83`) so callers can pass refs into
+`entity_details` without losing `catalog`/`db` qualification on drivers
+that need it (BigQuery `project.dataset.table`, Snowflake/SQL Server
+`database.schema.table`).
+
+#### `summary` and `snippet` provenance per kind
+
+Both fields are derived from existing source data, never invented or
+LLM-generated. The resolver is pure and deterministic per kind. When no
+source field exists for a given kind, the field is `null`; agents must
+not assume a missing snippet means "no context" — they should dereference
+the ref via `wiki_read`, `sl_read_source`, or `entity_details` to get
+authoritative content.
+
+| Kind | `summary` source | `snippet` source |
+|---|---|---|
+| `wiki` | `WikiFrontmatter.summary` (`packages/context/src/wiki/types.ts:15`) — populated at write time | Up to 200 chars from the wiki body around the match position; falls back to first 200 chars of body when `matchedOn === 'name'`/`'display'` |
+| `sl_source` | `resolveDescription(source.descriptions, priority)` (`packages/context/src/sl/descriptions.ts:16-34`) over the `user|ai|dbt|db` priority chain (`packages/context/src/sl/types.ts:5`) | When `matchedOn === 'description'`/`'body'`: a window of the resolved description; otherwise the source's `name` + first 1–2 measure or dimension names as context |
+| `sl_measure` | `measure.description` (`packages/context/src/sl/types.ts:37`) | `measure.expr` truncated to 200 chars — the calculation is the most informative one-line context for a measure |
+| `sl_dimension` | `resolveDescription(column.descriptions, priority)` (same precedence as `sl_source`); when empty, fall back to `null` | `${column.name} (${column.type})` formatted exactly like the existing inline rendering in `sl-search.service.ts:29-41` |
+| `table` | `firstDescription(table.descriptions)` then `table.comment` (precedence already used by `warehouse-catalog.service.ts:286-287`); `null` when both are empty | When `matchedOn === 'description'`/`'comment'`: a window of that string; when `matchedOn === 'name'`/`'display'`: a comma-joined list of up to 5 of the table's column names |
+| `column` | `resolveDescription(column.descriptions)` then `column.comment` (`warehouse-catalog.service.ts:228-245`); `null` when both are empty | When `matchedOn === 'description'`/`'comment'`: that text; when `matchedOn === 'sample_value'`: `${column.nativeType} · samples: <up to 5 sampleValues>` formatted from `column.sampleValues` (`warehouse-catalog.service.ts:18-23`); otherwise `${column.nativeType}` |
+
+The `matchedOn` field is the same concept as the existing
+`RawSchemaHit.matchedOn` in `warehouse-catalog.service.ts:40-54`,
+extended to the wiki and SL kinds. Snippets always come from a single
+already-stored field; the resolver never concatenates across sources or
+invents bridging text. Length cap is enforced at the producer side (≤200
+chars after a single-pass slice; no ellipsis appended — clients render
+one if they want).
+
+**Implementation:** new module `packages/context/src/search/discover.ts`.
+Composes three sub-searches in parallel:
+
+1. Wiki search via the existing wiki search backend.
+2. SL search over sources/measures/dimensions using existing
+   `sl-sources-index` (or a new lightweight index if needed for measure
+   granularity).
+3. Raw schema search over tables and columns from `KtxSchemaSnapshot`,
+   indexed at scan time and stored alongside other scan artifacts.
+
+Results from each sub-search are fused with `packages/context/src/search/rrf.ts`
+using equal weights. The `kinds` filter constrains which sub-searches run.
+
+### entity_details
+
+Read structured metadata for one or more raw tables (and optionally specific
+columns) from the latest scan snapshot. The raw-data equivalent of
+`sl_read_source`.
+
+**Input schema:**
+
+```typescript
+{
+  connectionId: z.string().min(1),
+  entities: z.array(z.object({
+    // table accepts either a driver-display string ("project.dataset.table",
+    // "schema.name", "db.schema.name") or a structured ref. The resolver
+    // returns a structured error when the input is ambiguous across multiple
+    // schemas/catalogs.
+    table: z.union([
+      z.string().min(1),
+      z.object({
+        catalog: z.string().nullable(),
+        db: z.string().nullable(),
+        name: z.string().min(1),
+      }),
+    ]),
+    columns: z.array(z.string()).optional(),   // omit → all columns
+  })).min(1).max(20),
+}
+```
+
+**Output:** for each entity, a structured record:
+
+```typescript
+{
+  connectionId: string,
+  tableRef: {                                  // structured identity, lossless on every driver
+    catalog: string | null,                    // BigQuery project, Snowflake/SQL Server database
+    db: string | null,                         // schema/dataset
+    name: string,
+  },
+  display: string,                             // driver-formatted display string
+                                               //   (e.g. "project.dataset.table", "schema.name")
+  kind: 'table' | 'view' | 'external' | 'event_stream',  // matches KtxSchemaTableKind
+  comment: string | null,
+  estimatedRows: number | null,
+  columns: Array<{
+    name: string,
+    nativeType: string,
+    normalizedType: string,
+    dimensionType: 'time' | 'string' | 'number' | 'boolean',
+    nullable: boolean,
+    primaryKey: boolean,
+    comment: string | null,
+  }>,
+  foreignKeys: Array<{
+    fromColumn: string,
+    toCatalog: string | null,                  // qualified FK target, preserves cross-db FKs
+    toDb: string | null,
+    toTable: string,
+    toColumn: string,
+    constraintName: string | null,
+  }>,
+  snapshot: {                                  // freshness metadata, present on every response
+    syncId: string,                            // latest scan/sync identifier
+    extractedAt: string,                       // ISO-8601 UTC of the snapshot
+    scanRunId: string | null,                  // scan run id if available
+  },
+}
+```
+
+Output fields mirror `KtxSchemaTable` / `KtxSchemaColumn` /
+`KtxSchemaForeignKey` from `packages/context/src/scan/types.ts:51-82`. The
+full `KtxSchemaTableKind` set is preserved so BigQuery `external` tables
+and warehouses with event-stream sources are not silently coerced. FK
+target qualification (`toCatalog`/`toDb`) carries through so agents can
+write valid SQL for cross-schema or cross-database references without
+re-resolving.
+
+If `columns` is provided, only the requested columns appear in the `columns`
+array (PKs and FKs still report on the full table).
+
+**Implementation:** new module `packages/context/src/scan/entity-details.ts`.
+Reads `KtxSchemaSnapshot` from the same store the existing `scan_*` tools
+read. No new infrastructure. If the requested table is not in the latest
+snapshot, the tool returns a structured error with a suggestion to run
+`ktx ingest <connectionId>`.
+
+**Cache freshness.** Today `WarehouseCatalogService` caches `ConnectionCatalog`
+per connection name with no invalidation
+(`packages/context/src/ingest/tools/warehouse-verification/warehouse-catalog.service.ts:248-249`,
+`:404-411`). For an ingest tool that runs inside a single short-lived ingest
+session that is acceptable, but the MCP daemon is long-lived and serves
+clients across multiple `scan_trigger` / `ktx ingest` runs. The MCP adapter
+**must** key its cache on the latest scan artifact identity (the `syncId`
+derived from the artifact path, or the artifact file mtime) and re-read when
+that identity advances. The same rule applies to the shared services backing
+`discover_data` and `dictionary_search`. The implementation plan must
+either:
+
+1. Extend `WarehouseCatalogService` (and equivalent dictionary/discover
+   services) to invalidate cached entries when the underlying artifact
+   identity advances, or
+2. Wrap those services in an MCP-adapter cache layer that performs the
+   identity check before returning cached values.
+
+### dictionary_search
+
+Find which connection, source, and column **profile-sampled** a given literal
+value (or substring) such as "Acme Corp" or "shipped". Backed by the existing
+`SlDictionaryEntry` extraction over relationship-profiling artifacts.
+
+**Authoritativeness.** The dictionary index is built from *sampled* values
+captured during relationship profiling — by default 5 values per column,
+drawn from a sample of up to 10,000 rows
+(`packages/context/src/scan/relationship-profiling.ts:409-410`,
+`packages/context/src/sl/sl-dictionary-profile.ts:70`). A hit confirms a
+column contains the value; a miss is **not** proof that the value is absent
+from the column or warehouse — the value may simply have been outside the
+profile sample. The tool must surface this distinction in its output and the
+research skill must teach agents not to treat a miss as exhaustive.
+
+**Input schema:**
+
+```typescript
+{
+  values: z.array(z.string().min(1)).min(1).max(20),
+  connectionId: z.string().optional(),         // omit → all connections
+}
+```
+
+**Output:** for each input value, the list of matching entries plus
+per-connection provenance. Coverage and miss reasons are connection-scoped
+because `loadLatestSlDictionaryEntries` iterates each connection's profile
+artifact independently
+(`packages/context/src/sl/sl-dictionary-profile.ts:96-112`); a single
+all-connections call can mix `no_profile_artifact` (one connection never
+ran an enriched scan), `value_not_in_sample` (another connection ran but
+the literal was outside the sample), and matches in the same response.
+
+```typescript
+{
+  // The set of connections actually searched on this call. When the input
+  // omits connectionId this is every configured connection; otherwise it
+  // contains the single requested connection.
+  searched: Array<{
+    connectionId: string,
+    coverage: {
+      sampledRows: number | null,              // profileSampleRows used at profile time
+      valuesPerColumn: number | null,          // sampleValuesPerColumn used at profile time
+      profiledColumns: number,                 // count of columns in the dictionary index for this connection
+      syncId: string | null,                   // identifier of the profile artifact (null when missing)
+      profiledAt: string | null,               // ISO-8601 UTC of the profile artifact (null when missing)
+    },
+    // Per-connection status, independent of any specific input value:
+    //   ready                — profile present with profiled columns
+    //   no_profile_artifact  — enriched scan never ran for this connection
+    //   no_candidate_columns — profile present but no columns profile-eligible
+    status: 'ready' | 'no_profile_artifact' | 'no_candidate_columns',
+  }>,
+  results: Array<{
+    value: string,                             // input value
+    matches: Array<{
+      connectionId: string,
+      sourceName: string,
+      columnName: string,
+      matchedValue: string,                    // actual value found (may differ in case)
+      cardinality: number | null,              // column cardinality if known
+    }>,
+    // Per-connection miss reasons for this value, present when that
+    // connection produced no match. Connections that matched do not appear
+    // in `misses`. For ready connections with no match, the reason is
+    // 'value_not_in_sample' (non-authoritative miss). For unready
+    // connections, the reason mirrors their `status` above.
+    misses: Array<{
+      connectionId: string,
+      reason:
+        | 'no_profile_artifact'
+        | 'no_candidate_columns'
+        | 'value_not_in_sample',
+    }>,
+  }>,
+}
+```
+
+**Matching semantics:** case-insensitive substring match against the
+profile-sampled values. Misses are never authoritative — they only state
+that the value was not in the captured sample for the listed connection.
+`misses[].reason` distinguishes "no enriched scan has run on this
+connection" (`no_profile_artifact`), "enriched scan ran but no columns
+were profile-eligible" (`no_candidate_columns`), and "scan ran but value
+was not in the sample" (`value_not_in_sample`). The research skill must
+direct agents to follow up a `value_not_in_sample` miss with
+`sql_execution` against the most plausible columns, not to conclude the
+value is absent.
+
+**Cache freshness:** the dictionary index is keyed on the profile artifact
+identity (the `syncId` derived from its path or the artifact mtime). When
+that identity advances, the daemon re-reads the artifact on next call. See
+the `entity_details` cache-freshness note above for the shared rule.
+
+**Implementation:** new module `packages/context/src/sl/dictionary-search.ts`.
+Loads `SlDictionaryEntry` records via the existing extraction code path,
+builds a per-connection in-memory index on first call, caches it for the
+lifetime of the MCP daemon. Invalidated on next ingest run (the daemon
+watches `.ktx/db.sqlite` for changes, or simply re-reads on each call when
+the artifact mtime advances).
+
+### sql_execution
+
+Execute a read-only SQL query against a configured connection and return the
+result. The fallback path for questions the semantic layer does not cover.
+
+**Input schema:**
+
+```typescript
+{
+  connectionId: z.string().min(1),
+  sql: z.string().min(1),
+  maxRows: z.number().int().min(1).max(10_000).default(1000).optional(),
+}
+```
+
+**Output:**
+
+```typescript
+{
+  headers: string[],
+  headerTypes?: string[],                      // driver-mapped type names, one per header; optional
+  rows: Array<Array<unknown>>,
+  rowCount: number,
+}
+```
+
+`headerTypes` is optional because not every connector exposes per-column
+type metadata. The current contract makes it optional
+(`KtxQueryResult.headerTypes` in `packages/context/src/scan/types.ts:272-277`),
+and the SQLite connector currently omits it
+(`packages/connector-sqlite/src/connector.ts:237-240`, `:301-308`). When a
+connector returns header types, the MCP adapter passes them through
+verbatim. When a connector does not, the MCP adapter omits the field rather
+than fabricating values.
+
+**Implementation:** delegates to `KtxScanConnector.executeReadOnly` on the
+matching connector. The connector calls `assertReadOnlySql` and
+`limitSqlForExecution` (`packages/context/src/connections/read-only-sql.ts`).
+
+**Read-only enforcement is lexical, not parser-backed.** The current guard
+inspects the first token with regex: it accepts queries whose first non-space
+token is `SELECT` or `WITH`, and rejects queries whose first non-space token
+matches a fixed mutating-verb list. Implications:
+
+- A CTE that nests a data-modifying statement (e.g., `WITH x AS (INSERT ...
+  RETURNING *) SELECT ...`, valid in Postgres) passes the first-token check
+  and would reach the connector.
+- Dialect-specific read/write constructs and procedure calls that do not
+  start with a listed verb are not caught.
+
+Because `sql_execution` exposes this boundary to external MCP clients, the
+tool **must not** be enabled until one of the following holds:
+
+1. The guard is upgraded to a sqlglot/AST-based read-only check that
+   inspects every statement and CTE node, with explicit tests for CTE-DML,
+   `CALL`, `DO`, vendor pragmas, and multi-statement payloads; or
+2. Connector-side execution forces a read-only transaction / session (e.g.,
+   `SET TRANSACTION READ ONLY` for Postgres, `READ ONLY` connection for
+   MySQL, equivalent for each connector), so the guard is defense-in-depth
+   rather than the sole boundary.
+
+The implementation plan that follows this spec is required to choose and
+land one of those before registering `sql_execution` in the MCP surface.
+Errors from `assertReadOnlySql` (whichever implementation) are returned as
+structured tool errors so the agent can correct the query and retry.
+
+## Tool naming convention
+
+Match the existing KTX MCP convention (no prefix): `discover_data`,
+`entity_details`, `dictionary_search`, `sql_execution`. The existing tools
+(`wiki_search`, `sl_list_sources`, `scan_trigger`, `memory_capture`) all use
+unprefixed snake_case; the new tools follow suit.
+
+## Connection model
+
+- `sql_execution` and `entity_details` require `connectionId` — these tools
+  cannot operate without a target.
+- `discover_data` and `dictionary_search` make `connectionId` optional. Omit
+  it to search across all configured connections; provide it to scope. This
+  matches the existing pattern for `sl_list_sources({ connectionId? })`.
+- All tools are project-locked: the MCP daemon runs in one KTX project dir;
+  to operate on a different project, restart the daemon with a different
+  `--project-dir` or `cwd`.
+
+## MCP daemon: `ktx mcp`
+
+A new CLI subtree in `packages/cli/src/commands/mcp-commands.ts`, wired into
+`cli-program.ts` alongside `setup`, `connection`, `ingest`, `wiki`, `sl`,
+`status`, `dev`.
+
+### Commands
+
+```bash
+ktx mcp start [--port <n>] [--host <h>] [--token <t>] [--foreground] \
+              [--allowed-host <h>...] [--allowed-origin <o>...]
+ktx mcp stop
+ktx mcp status
+ktx mcp logs [--follow]
+```
+
+`--allowed-host` and `--allowed-origin` are repeatable. They extend (not
+replace) the defaults defined in the security model below.
+
+### `ktx mcp start`
+
+Starts a long-lived HTTP MCP server bound to the configured host and port,
+serving every tool registered by `createKtxMcpServer`. The server stays alive
+until `ktx mcp stop` is invoked or the process is terminated.
+
+- Default `--host` is `127.0.0.1`. Any value other than `127.0.0.1` or
+  `localhost` **requires** `--token` (or `KTX_MCP_TOKEN` in the environment);
+  the command refuses otherwise.
+- Default `--port` is 7878. If the port is in use, the command exits with an
+  error explaining how to choose another. Allocated port is persisted to
+  `.ktx/mcp.json` for subsequent `status`, `stop`, `logs`, and
+  `setup-agents` calls.
+- `--foreground` runs the server in the foreground and pipes all logs to
+  stdout, for debugging. Default is background.
+- Background runs detach via the same pattern as the managed Python daemon
+  (`packages/cli/src/managed-python-daemon.ts`): spawn a detached child,
+  write `pid`, `port`, `startedAt` to `.ktx/mcp.json`, return immediately
+  with the URL the user should configure in their client.
+- Logs go to `.ktx/logs/mcp.log` (matches existing log layout).
+
+### `ktx mcp stop`
+
+Reads `.ktx/mcp.json` for the daemon PID, sends SIGTERM, waits up to 10
+seconds for graceful exit, then SIGKILLs if still running. Removes the state
+file on success.
+
+### `ktx mcp status`
+
+Reads `.ktx/mcp.json`, checks the process is alive, hits the server's
+`/health` endpoint, and reports:
+
+- Running / stopped / stale (state file present but process not alive)
+- Port, host, started-at, pid
+- Whether token auth is enabled
+- Configured project dir
+
+### `ktx mcp logs`
+
+Tails or follows `.ktx/logs/mcp.log`. Standard `--follow` flag.
+
+### Lifecycle
+
+Manual: the user runs `ktx mcp start` after each reboot or whenever they
+want the server running. No auto-start on other `ktx` commands (matches the
+explicit pattern established by the daemon model).
+
+### Transport
+
+HTTP-only via `StreamableHTTPServerTransport` from
+`@modelcontextprotocol/sdk/server/streamableHttp.js`.
+
+The `/mcp` endpoint must implement the full Streamable HTTP contract, not
+just `POST`:
+
+- `POST /mcp` — JSON-RPC requests (and the `initialize` handshake when no
+  session exists). On the first `initialize` post, the server allocates a
+  session id and returns it in the `Mcp-Session-Id` response header.
+- `GET /mcp` — opens an SSE stream for server-initiated messages on an
+  existing session. Requires a valid `Mcp-Session-Id` header.
+- `DELETE /mcp` — explicit session termination by the client. Requires a
+  valid `Mcp-Session-Id` header; the server must drop the session and any
+  associated SSE streams.
+
+**Session model.** v1 ships **stateful** sessions: the server generates a
+session id with `randomUUID()` on `initialize`, stores the transport in an
+in-memory map keyed by session id, reuses it on subsequent
+`POST`/`GET`/`DELETE` calls that carry the same `Mcp-Session-Id`, and
+removes it on `DELETE` or transport close. Requests that carry an unknown
+session id are rejected with HTTP 404 so the client knows to re-initialize.
+
+Health: `GET /health` returns `{ status: 'ok', projectDir, port }` for
+liveness checks. `/health` is separate from `/mcp` and is not subject to
+session-id requirements (but is subject to host/origin validation; see
+below).
+
+### Security model
+
+- `127.0.0.1` binding is the default and requires no token auth (loopback
+  only). Even on loopback, the server enforces **Host and Origin header
+  validation** on every `/mcp` and `/health` request to defend against
+  browser-driven DNS-rebinding attacks (the same defense the MCP SDK
+  exposes in `createMcpExpressApp` / `createMcpHonoApp`).
+- **Host validation** compares the incoming `Host` header to the allowed-host
+  list after normalizing: lowercase, strip any port, strip surrounding
+  brackets from IPv6 literals (`[::1]:7878` → `::1`). Comparison is exact
+  on the normalized host string. The default allowed-host list is
+  `['localhost', '127.0.0.1', '::1']`. `--allowed-host` values are appended
+  after the same normalization.
+- **Origin validation** compares the full browser `Origin` header (scheme +
+  host + port) to the allowed-origin list. The default allowed-origin list
+  is empty: any request that carries an `Origin` header is rejected unless
+  an explicit `--allowed-origin` entry matches. Non-browser clients that
+  do not send an `Origin` header (Claude Code, Cursor, Codex, opencode
+  HTTP transports) are accepted regardless of `Origin`. Each
+  `--allowed-origin` value must be a full origin string
+  (e.g., `http://localhost:7878`); KTX validates the format at startup.
+- Non-loopback binding requires `--token <t>` or `KTX_MCP_TOKEN`. The
+  server checks `Authorization: Bearer <t>` on **every** `/mcp` method —
+  `POST`, `GET` (SSE), and `DELETE` — and rejects with HTTP 401 otherwise.
+  Token enforcement is independent of the session check; both must pass.
+  When `--host` is non-loopback, the allowed-host list expands to include
+  the normalized bound host plus any user-supplied `--allowed-host`
+  values.
+- TLS is out of scope. For remote access, document running KTX behind a
+  reverse proxy (Caddy, nginx) that terminates TLS.
+
+## Client config installation via `ktx setup-agents`
+
+`ktx setup-agents` extends its existing per-target file installation
+(`plannedKtxAgentFiles` in `packages/cli/src/setup-agents.ts:64`) to also
+write MCP server entries.
+
+The per-client config matrix is **not uniform**. Each client has its own
+file location, scope semantics, and entry shape; `setup-agents` must
+produce the correct shape per target rather than emit one JSON blob.
+
+| Target | Scope | MCP config path | Writer behavior |
+|---|---|---|---|
+| claude-code | user (global) | `~/.claude.json` → root `mcpServers.ktx` | write JSON |
+| claude-code | local (per-project, private) | `~/.claude.json` → `projects[<absProjectPath>].mcpServers.ktx` | write JSON |
+| claude-code | project (shared, checked in) | `<projectDir>/.mcp.json` → `mcpServers.ktx` | write JSON |
+| cursor | global | `~/.cursor/mcp.json` → `mcpServers.ktx` | write JSON |
+| cursor | project | `<projectDir>/.cursor/mcp.json` → `mcpServers.ktx` | write JSON |
+| codex | user (global) | `~/.codex/config.toml` → `[mcp_servers.ktx]` (TOML) | print instructions; do not auto-write in v1 |
+| opencode | user (global) | `~/.config/opencode/opencode.json` → `mcp.ktx` | print instructions; do not auto-write in v1 |
+| opencode | project | `<projectDir>/opencode.json` → `mcp.ktx` | print instructions; do not auto-write in v1 |
+
+The shared global `~/.claude.json` and per-project `~/.claude.json` →
+`projects[...]` scope are both supported because Claude Code's "user" vs.
+"local" scopes write to different sub-trees of the same file; `setup-agents`
+must select the scope explicitly per invocation.
+
+Codex and opencode entries are **printed as copy-pasteable snippets** in v1
+because their config formats (TOML for codex, a different JSON wrapper for
+opencode) diverge enough from the JSON writers above that mixing them into
+the same writer codepath risks silently producing invalid files. This is a
+deliberate v1 scoping decision, not a permanent limitation.
+
+#### Entry shapes by target
+
+Claude Code (HTTP):
+
+```jsonc
+{
+  "mcpServers": {
+    "ktx": {
+      "type": "http",
+      "url": "http://localhost:7878/mcp"
+      // when token auth is active, env-var expansion only:
+      // "headers": { "Authorization": "Bearer ${KTX_MCP_TOKEN}" }
+    }
+  }
+}
+```
+
+Cursor (HTTP, project `.cursor/mcp.json` or global `~/.cursor/mcp.json`):
+
+```jsonc
+{
+  "mcpServers": {
+    "ktx": {
+      "url": "http://localhost:7878/mcp"
+      // when token auth is active, env-var expansion only:
+      // "headers": { "Authorization": "Bearer ${KTX_MCP_TOKEN}" }
+    }
+  }
+}
+```
+
+Codex (printed snippet, `~/.codex/config.toml`):
+
+```toml
+[mcp_servers.ktx]
+url = "http://localhost:7878/mcp"
+# Codex MCP config does not currently document a headers field; if token
+# auth is active, instruct the user to either run KTX on loopback without a
+# token or wait for codex header support before enabling.
+```
+
+opencode (printed snippet, `opencode.json`):
+
+```jsonc
+{
+  "mcp": {
+    "ktx": {
+      "type": "remote",
+      "url": "http://localhost:7878/mcp",
+      "enabled": true
+      // when token auth is active, env-var expansion only:
+      // "headers": { "Authorization": "Bearer ${KTX_MCP_TOKEN}" }
+    }
+  }
+}
+```
+
+#### Token handling per client
+
+When `--token` / `KTX_MCP_TOKEN` is active, `setup-agents` writes the bearer
+token **only via environment-variable reference** (`Bearer ${KTX_MCP_TOKEN}`),
+never as a literal token value. Claude Code, Cursor, and opencode all
+support environment-variable expansion inside `headers` values; the
+written entry references `${KTX_MCP_TOKEN}` and the user is responsible
+for exporting it in the shell that launches the MCP client.
+
+Rules:
+
+- **No literal-token writes, anywhere.** Even the user-scope (private)
+  Claude Code / Cursor config receives env-var references, not the raw
+  token. This keeps the same writer codepath for every scope and avoids a
+  branch that materializes secrets.
+- **Project-scope (shared, checked-in) configs are gated.** When a token is
+  active and the user requests a shared scope — `<projectDir>/.mcp.json`
+  for Claude Code, `<projectDir>/.cursor/mcp.json` for Cursor — `setup-agents`
+  prints a warning and offers a choice: (a) write the entry with the
+  `${KTX_MCP_TOKEN}` reference (the file is safe to commit; readers must
+  export the variable locally), or (b) skip the shared entry and rely on a
+  user-scope entry instead. The default is (a).
+- **Verify header support per client before writing.** The matrix below
+  reflects the current state of each client's MCP config docs:
+  - claude-code: supports `headers` with `${VAR}` expansion on HTTP entries.
+  - cursor: supports `headers` with `${VAR}` expansion on HTTP entries.
+  - opencode: supports `headers` with `${VAR}` expansion on remote MCP
+    entries.
+  - codex: **not currently supported** in published config docs. When a
+    token is active and the user selects codex, `setup-agents` prints a
+    warning and skips the codex entry rather than writing an entry that
+    codex will silently ignore. The recommended workaround is to bind KTX
+    to loopback without a token for codex users.
+- **Implementation acceptance test.** Setup-agents writer tests must assert
+  that no rendered output contains the literal token string for any
+  scope/target combination — only the `${KTX_MCP_TOKEN}` reference.
+
+Port is read from `.ktx/mcp.json` if present, falling back to 7878. The
+install manifest (`agentInstallManifestPath`,
+`packages/cli/src/setup-agents.ts:60`) tracks each **written** entry so
+`ktx setup-agents --remove` can roll back cleanly. The current manifest
+entry kinds are `file` and `json-key`
+(`packages/cli/src/setup-agents.ts:42-50`); the MCP client writers for
+claude-code and cursor add `json-key` entries for their respective config
+files. Printed-only snippets for codex and opencode are **not** tracked in
+the manifest, and `--remove` does not attempt to mutate user-written
+files for those targets; the printed instructions tell the user how to
+remove the entry by hand.
+
+If the daemon is not running when `setup-agents` runs, the command prints a
+follow-up hint: "Run `ktx mcp start` to enable the configured KTX MCP
+server." It does **not** auto-start the daemon (matches the manual
+lifecycle decision).
+
+## Research skill
+
+A new skill source file at `packages/cli/src/skills/research/SKILL.md`,
+installed by `ktx setup-agents` to all configured targets. The skill is
+separate from the existing setup skill (different triggers: "work in a KTX
+project" vs. "answer a data question") and lives in its own per-target
+folder so global vs. project scope and removal stay clean.
+
+`plannedKtxAgentFiles` in `packages/cli/src/setup-agents.ts:64` is extended
+to return both the existing `ktx` entries and new `ktx-research` entries:
+
+| Target | Scope | Path |
+|---|---|---|
+| claude-code | global | `~/.claude/skills/ktx-research/SKILL.md` |
+| claude-code | project | `.claude/skills/ktx-research/SKILL.md` |
+| codex | global | `${CODEX_HOME}/skills/ktx-research/SKILL.md` |
+| codex | project | `.agents/skills/ktx-research/SKILL.md` |
+| cursor | project | `.cursor/rules/ktx-research.mdc` |
+| opencode | project | `.opencode/commands/ktx-research.md` |
+| universal | project | `.agents/skills/ktx-research/SKILL.md` |
+
+The skill body is identical across targets; only the wrapper format and
+file path differ to match each target's convention.
+
+### Skill content
+
+```markdown
+---
+name: ktx-research
+description: Use when answering a question that needs data from a KTX-connected database — investigating, analyzing, "how many", "show me", "what's the breakdown of", finding records by value, exploring tables, comparing periods, or any data-investigation request. Triggers even when the user does not say "research"; if the answer requires querying a configured KTX connection, this skill applies.
+---
+
+# KTX Research Workflow
+
+You have access to KTX MCP tools for investigating data. Follow this workflow.
+
+<workflow>
+1. **Discover** — call `discover_data(query)` first to see what exists across wiki, semantic-layer sources, and raw tables. Returns refs only.
+2. **Inspect top hits in parallel** — for each promising ref:
+   - `kind: 'wiki'` → `wiki_read(key)`
+   - `kind: 'sl_source'` / `'sl_measure'` / `'sl_dimension'` → `sl_read_source(connectionId, sourceName)`
+   - `kind: 'table'` / `'column'` → `entity_details(connectionId, entities)`
+3. **Resolve literals** — if the user named a value (e.g., "Acme Corp", "status=shipped"), call `dictionary_search(values)` to find which column holds it.
+4. **Query** —
+   - Prefer `sl_query` when the semantic layer covers the question (joins, measures pre-defined).
+   - Use `sql_execution` only for things the semantic layer doesn't cover.
+5. **Capture learnings** — at the end of the turn, call `memory_capture(userMessage, assistantMessage)` so future turns benefit. Skip when the answer carries no durable knowledge (e.g., the user only asked for schema info).
+</workflow>
+
+<rules>
+- Always run `discover_data` before writing SQL. Do not guess table names.
+- Prefer the semantic layer over raw SQL when both can answer the question — measures are the source of truth.
+- Read entity details before writing SQL against an unfamiliar table; do not assume column names.
+- Treat `sql_execution` as read-only. Writes are rejected by the server.
+- Validate value mentions with `dictionary_search` instead of guessing case/spelling — but treat a `dictionary_search` *miss* as non-authoritative. The index is built from profile-sampled values, so a missing value may simply have been outside the sample. Follow up with `sql_execution` against the most plausible columns before concluding the value is absent.
+</rules>
+
+<examples>
+**Input:** "How many orders did Acme Corp place last month?"
+
+**Output workflow:**
+1. `dictionary_search(["Acme Corp"])` → finds `customers.name`
+2. `discover_data("orders customer monthly")` → finds `orders_facts` SL source
+3. `sl_read_source("warehouse", "orders_facts")` → confirms measure `order_count`, dim `customer_name`, dim `ordered_at`
+4. `sl_query({ measures: ["order_count"], filters: ["customer_name = 'Acme Corp'", "ordered_at >= date_trunc('month', now() - interval '1 month')"], dimensions: [{ field: "ordered_at", granularity: "month" }] })`
+5. `memory_capture(userMessage, assistantMessage)`
+
+---
+
+**Input:** "What columns does the events table have?"
+
+**Output workflow:**
+1. `discover_data("events table")` → top hit `kind: 'table', id: 'analytics.events'`
+2. `entity_details("warehouse", [{ table: "analytics.events" }])` → returns columns, types, FKs
+3. Answer directly. (No query needed; no `memory_capture` since no durable learning.)
+</examples>
+```
+
+## Files
+
+### New
+
+- `packages/context/src/scan/entity-details.ts` — derives entity-detail
+  records from `KtxSchemaSnapshot`, sharing resolution logic with
+  `warehouse-verification/warehouse-catalog.service.ts` (refactored or
+  imported, not duplicated).
+- `packages/context/src/sl/dictionary-search.ts` — builds and queries the
+  dictionary index over relationship-profiling artifacts.
+- `packages/context/src/search/discover.ts` — composes wiki, SL, and raw
+  schema searches; fuses results via `rrf.ts`. Reuses the same wiki/SL/raw
+  search building blocks as `warehouse-verification/discover-data.tool.ts`.
+- `packages/cli/src/commands/mcp-commands.ts` — `ktx mcp start|stop|status|logs`.
+- `packages/cli/src/managed-mcp-daemon.ts` — daemon lifecycle (spawn,
+  pidfile, log management), mirroring `managed-python-daemon.ts`.
+- `packages/cli/src/skills/research/SKILL.md` — research workflow skill.
+- Tests for each new module following existing patterns
+  (`*.test.ts` siblings), including coverage of the per-client config
+  writer/printer matrix.
+
+### Modified
+
+- `packages/context/src/mcp/context-tools.ts` — register the four new tools
+  with their Zod schemas.
+- `packages/context/src/mcp/server.ts` — extend `KtxMcpContextPorts` with the
+  new ports (`sqlExecution`, `entityDetails`, `dictionarySearch`, `discover`).
+- `packages/context/src/mcp/types.ts` — add the new port interface
+  definitions.
+- `packages/cli/src/cli-program.ts` — register the `mcp` command subtree.
+- `packages/cli/src/setup-agents.ts` — install the research skill and write
+  MCP client config entries to each configured target.
+
+## Testing strategy
+
+- Unit tests for each new module (`entity-details.ts`,
+  `dictionary-search.ts`, `discover.ts`) using existing fixture patterns.
+- MCP-level integration test in `packages/context/src/mcp/server.test.ts`
+  that registers a fake server, invokes each tool, and asserts the
+  responses.
+- CLI integration test for `ktx mcp start|stop|status` lifecycle following
+  the pattern in `managed-python-daemon.test.ts`.
+- Setup-agents tests verifying behavior per target: claude-code and cursor
+  writers add the correct JSON entry and a corresponding `json-key`
+  manifest entry that `--remove` cleans up; codex and opencode targets
+  produce printed snippet output and do not mutate any user config file
+  or add manifest entries in v1.
+- Verification commands per CLAUDE.md: `pnpm --filter @ktx/context run test`
+  and `pnpm --filter @ktx/cli run test` for the affected packages, plus
+  `pnpm run type-check`.
+
+## Out of scope / follow-ups
+
+- **Python code execution via MCP.** The daemon's `/code/execute` endpoint
+  exists; surfacing it via MCP is a separate design with sandbox/security
+  considerations.
+- **Stdio MCP transport.** HTTP-only for now. Stdio can be added later as an
+  additional transport mode without changing the tool surface.
+- **OS-level auto-start.** Manual `ktx mcp start` only. Adding launchd /
+  systemd unit installation is a UX polish for a later release.
+- **TLS in the daemon itself.** Reverse proxy is the documented path. Native
+  TLS support if/when demand emerges.
+- **Multi-project / project-switching MCP.** One daemon per project. A
+  cross-project model would require per-call `projectDir` arguments or a
+  `set_project_dir` tool and is deferred.
+- **Audit logging, rate limiting, per-tool authorization.** Not in scope for
+  v1; the security boundary is loopback or bearer token.
+
+## Open trade-offs
+
+- **`dictionary_search` requires `--deep` (enriched) scan to have run.** The
+  relationship-profiling artifact that powers the dictionary index is only
+  produced by enriched scans. The tool reports this distinctly when missing,
+  but the dependency is real: without enriched scan, the tool returns
+  empty.
+- **`entity_details` reads from the latest snapshot, not live.** If the
+  database schema changes after the last scan, the tool will reflect the
+  scan state, not reality. Surfacing this clearly in the tool's response
+  (snapshot timestamp) is part of the implementation.
+- **No streaming for `sql_execution`.** Large results are capped at
+  `maxRows` (default 1000, max 10k). The tool returns the full result set
+  in one response. Streaming partial results is left for a later iteration
+  if real workloads demand it.