ktx/docs-site/content/docs/guides/serving-agents.mdx
2026-05-11 16:42:08 -07:00


---
title: Serving Agents
description: Expose your context to Claude Code, Cursor, Codex, and other coding agents.
---
Once you've built and refined your context, the final step is exposing it to coding agents. KTX provides two channels: an **MCP server** for persistent integration with tools like Claude Code and Cursor, and **CLI commands** for direct terminal access.
## Agent workflow summary
Agents should use KTX in this order:
1. Discover connections with `connection_list` or `ktx agent context --json`.
2. Discover semantic sources with `sl_list_sources` or `ktx agent sl list --json`.
3. Search knowledge with `knowledge_search` or `ktx agent wiki search`.
4. Query through the semantic layer with `sl_query` or `ktx agent sl query`.
5. Execute SQL only when execution is explicitly enabled and row limits are set.

Use the semantic layer first for analytics questions. Direct SQL is a fallback for read-only inspection, not the default path.
## MCP Server
The MCP (Model Context Protocol) server gives agents structured access to your entire context layer — semantic sources, knowledge pages, scans, and ingestion — through a standard tool-calling interface.
### Starting the server
```bash
ktx serve --mcp stdio
```
This starts an MCP server on stdio, which is how Claude Code, Cursor, and other MCP-compatible tools communicate with KTX. You typically don't run this manually — your agent's configuration handles it.
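
To make the stdio transport concrete, the sketch below builds the first messages an MCP client sends after launching the server: an `initialize` request, an `initialized` notification, and a `tools/list` request, each as one newline-delimited JSON-RPC 2.0 line. It follows the general MCP stdio transport rather than anything specific to KTX. The client name and the protocol version string are assumptions, and in practice your agent's MCP client performs this exchange for you.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


def jsonrpc_notification(method):
    """Serialize a notification: same envelope, but no id and no reply expected."""
    return json.dumps({"jsonrpc": "2.0", "method": method}) + "\n"


def handshake_lines(client_name="example-client"):
    """Return the opening lines a client would write to the server's stdin."""
    return [
        jsonrpc_request(1, "initialize", {
            "protocolVersion": "2024-11-05",  # assumed version string
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.0.1"},
        }),
        jsonrpc_notification("notifications/initialized"),
        jsonrpc_request(2, "tools/list"),
    ]


if __name__ == "__main__":
    for line in handshake_lines():
        print(line, end="")
```

A real client waits for the `initialize` response before sending the notification; the sketch only shows the message shapes.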
### Configuration options
| Flag | Description | Default |
|------|-------------|---------|
| `--mcp <mode>` | MCP transport mode (currently `stdio`) | Required |
| `--user-id <id>` | User identifier for knowledge scoping | `local` |
| `--semantic-compute` | Enable semantic layer planning and query execution | `false` |
| `--semantic-compute-url <url>` | URL for the semantic compute daemon | &mdash; |
| `--database-introspection-url <url>` | Daemon URL for live database access | &mdash; |
| `--execute-queries` | Allow agents to execute SQL queries | `false` |
| `--memory-capture` | Enable memory capture from conversations | `false` |
| `--memory-model <model>` | LLM model for memory capture | &mdash; |
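
The flags above compose into the argument list that an agent configuration passes to `ktx`. A minimal sketch, assuming a hypothetical helper named `serve_args`; it only assembles flags from the table and does not start the server.

```python
def serve_args(*, user_id=None, semantic_compute=False, semantic_compute_url=None,
               database_introspection_url=None, execute_queries=False,
               memory_capture=False, memory_model=None):
    """Build the argument list for `ktx serve` from the documented flags."""
    args = ["serve", "--mcp", "stdio"]  # stdio is the only transport listed
    if user_id is not None:
        args += ["--user-id", user_id]
    if semantic_compute:
        args.append("--semantic-compute")
    if semantic_compute_url is not None:
        args += ["--semantic-compute-url", semantic_compute_url]
    if database_introspection_url is not None:
        args += ["--database-introspection-url", database_introspection_url]
    if execute_queries:
        args.append("--execute-queries")
    if memory_capture:
        args.append("--memory-capture")
    if memory_model is not None:
        args += ["--memory-model", memory_model]
    return args


if __name__ == "__main__":
    print(["ktx", *serve_args(semantic_compute=True)])
```

Keeping every optional flag off by default mirrors the table, so query execution and memory capture are opt-in.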
### Available tools
When an agent connects via MCP, it can call these tools:

**Connections**

| Tool | Description |
|------|-------------|
| `connection_list` | List configured data connections |
| `connection_test` | Test a connection through the scan connector |

**Semantic Layer**

| Tool | Description |
|------|-------------|
| `sl_list_sources` | List sources, optionally filtered by connection or search query |
| `sl_read_source` | Read a source YAML by connection and name |
| `sl_write_source` | Create, replace, or delete a source |
| `sl_validate` | Validate sources against the database schema |
| `sl_query` | Compile a semantic query and, when execution is enabled, run it; returns SQL, the query plan, and rows |

**Knowledge**

| Tool | Description |
|------|-------------|
| `knowledge_search` | Search knowledge pages by query, returns ranked summaries |
| `knowledge_read` | Read a knowledge page by key |
| `knowledge_write` | Create or replace a knowledge page |

**Scanning**

| Tool | Description |
|------|-------------|
| `scan_trigger` | Run a structural, enriched, or relationship scan |
| `scan_status` | Check the status of a running scan |
| `scan_report` | Read a completed scan report |
| `scan_list_artifacts` | List files produced by a scan run |
| `scan_read_artifact` | Read a scan artifact by path |

**Ingestion**

| Tool | Description |
|------|-------------|
| `ingest_trigger` | Trigger an ingest run for an adapter and connection |
| `ingest_status` | Check ingest progress, including diff and work-unit summaries |
| `ingest_report` | Read a stored ingest report |
| `ingest_replay` | Read the memory-flow replay for a past ingest |

**Memory**

| Tool | Description |
|------|-------------|
| `memory_capture` | Capture knowledge and semantic updates from a conversation |
| `memory_capture_status` | Check the status of a memory capture run |
### Tool input reference
| Tool | Required inputs | Optional inputs | Output shape |
|------|-----------------|-----------------|--------------|
| `connection_list` | none | none | JSON list of configured connections |
| `connection_test` | `connectionId` | none | JSON test result with driver metadata or an error |
| `sl_list_sources` | none | `connectionId`, `query` | JSON list of semantic source summaries |
| `sl_read_source` | `sourceName`, `connectionId` | none | YAML source content and metadata |
| `sl_write_source` | `sourceName`, `connectionId`, source YAML or delete operation | none | Write result and validation details |
| `sl_validate` | `sourceName`, `connectionId` | none | Validation result with schema and join issues |
| `sl_query` | `connectionId`, measures or query payload | dimensions, filters, segments, order, limit, execute, maxRows | Compiled SQL, query plan, and rows when execution is enabled |
| `knowledge_search` | `query` | `limit`, `userId` | Ranked knowledge results with summaries |
| `knowledge_read` | `pageId` or key | `userId` | Full Markdown knowledge page |
| `knowledge_write` | key, summary, content | tags, refs, semantic-layer refs, scope, userId | Write result |
| `scan_trigger` | `connectionId`, mode | daemon URLs, dry-run options | Scan run id and status |
| `scan_status` | `runId` | none | Scan progress and current state |
| `scan_report` | `runId` | none | Completed scan report |
| `ingest_trigger` | adapter and connection selection | limits and introspection URLs | Ingest run id and status |
| `ingest_status` | `runId` | none | Ingest progress, work units, and diff summary |
| `memory_capture` | conversation input | model and user options | Memory capture run id |
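
A client can check the required inputs from this table before sending a call. The sketch below is a hypothetical pre-flight helper, not part of KTX; it covers only the rows whose required input names are spelled out above.

```python
# Required inputs copied from the table; rows with loosely named inputs are left out.
REQUIRED_INPUTS = {
    "connection_list": (),
    "connection_test": ("connectionId",),
    "sl_list_sources": (),
    "sl_read_source": ("sourceName", "connectionId"),
    "sl_validate": ("sourceName", "connectionId"),
    "knowledge_search": ("query",),
    "scan_status": ("runId",),
    "scan_report": ("runId",),
    "ingest_status": ("runId",),
}


def missing_inputs(tool, arguments):
    """Return the required input names that are absent or empty for a tool."""
    required = REQUIRED_INPUTS.get(tool, ())
    return [name for name in required if arguments.get(name) in (None, "")]


if __name__ == "__main__":
    print(missing_inputs("sl_read_source", {"connectionId": "my-postgres"}))
```

Tools that are not in the mapping are passed through unchecked, so the server remains the source of truth for validation.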
### How agents use these tools
A typical agent interaction flows like this:
1. Agent calls `connection_list` to see available databases
2. Agent calls `sl_list_sources` to discover what semantic sources exist
3. Agent calls `knowledge_search` to find business context relevant to the user's question
4. Agent calls `sl_query` with measures, dimensions, and filters to get data
5. Agent presents results with the business context it found

Agents should use the semantic layer for analytics questions because it enforces correct joins, grain-aware aggregation, and consistent metric definitions. When SQL execution is enabled, KTX allows only read-only SQL with row limits.
### Workflow: answer an analytics question through MCP
1. `connection_list` — choose the relevant warehouse connection.
2. `sl_list_sources` with a search query — find candidate semantic sources.
3. `knowledge_search` with the user's business terms — find metric definitions and caveats.
4. `sl_read_source` for each candidate source — inspect measures, dimensions, joins, and grain.
5. `sl_query` with `execute: false` — compile SQL and inspect the generated query.
6. `sl_query` with `execute: true` and a bounded `maxRows` — execute only when the user asked for data and execution is enabled.
7. Cite the semantic source and knowledge pages used in the answer.
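
Steps 5 and 6 above can be sketched as MCP `tools/call` requests. The argument names (`connectionId`, `measures`, `dimensions`, `execute`, `maxRows`) come from the tool input reference; the exact payload shape KTX expects is an assumption, and the measure and dimension names in the usage example are hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def tool_call(name, arguments):
    """Wrap a tool invocation in an MCP `tools/call` JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def sl_query_calls(connection_id, measures, dimensions, max_rows=100):
    """Build the compile-only call first, then the bounded execution call."""
    base = {"connectionId": connection_id, "measures": measures, "dimensions": dimensions}
    compile_only = tool_call("sl_query", {**base, "execute": False})
    run = tool_call("sl_query", {**base, "execute": True, "maxRows": max_rows})
    return compile_only, run


if __name__ == "__main__":
    # Hypothetical measure and dimension names, for illustration only.
    for request in sl_query_calls("my-postgres", ["orders.total_revenue"], ["orders.status"]):
        print(json.dumps(request))
```

Sending the compile-only request first lets the agent inspect the generated SQL before any rows are fetched.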
## CLI Commands
For agents that work through the terminal rather than MCP, KTX provides a set of machine-readable commands under `ktx agent`. These return JSON output designed for programmatic consumption.
### Available commands
```bash
# List available tools and their descriptions
ktx agent tools --json
# Get project context for planning
ktx agent context --json
```
**Semantic layer:**
```bash
# List sources
ktx agent sl list --json
ktx agent sl list --json --connection-id my-postgres
# Read a source
ktx agent sl read orders --json --connection-id my-postgres
# Run a query from a JSON file
ktx agent sl query --json \
  --connection-id my-postgres \
  --query-file query.json \
  --execute \
  --max-rows 100
```
**Knowledge:**
```bash
# Search knowledge pages
ktx agent wiki search "revenue recognition" --json --limit 10
# Read a specific page
ktx agent wiki read order-status-definitions --json
```
**SQL execution:**
```bash
# Execute read-only SQL with a row limit
ktx agent sql execute --json \
  --connection-id my-postgres \
  --sql-file query.sql \
  --max-rows 500
```
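
A script that wraps this command can validate the row limit before invoking it. A minimal sketch, assuming the 1 to 1000 range given under Common errors; `sql_execute_command` is a hypothetical helper that only builds the argument list.

```python
def sql_execute_command(connection_id, sql_file, max_rows):
    """Build the `ktx agent sql execute` argument list with a bounded row limit."""
    is_int = isinstance(max_rows, int) and not isinstance(max_rows, bool)
    if not is_int or not 1 <= max_rows <= 1000:
        raise ValueError("max_rows must be an integer from 1 to 1000")
    return [
        "ktx", "agent", "sql", "execute", "--json",
        "--connection-id", connection_id,
        "--sql-file", sql_file,
        "--max-rows", str(max_rows),
    ]


if __name__ == "__main__":
    print(" ".join(sql_execute_command("my-postgres", "query.sql", 500)))
```

Rejecting an out-of-range limit locally gives the agent a clear error before any command is run.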
### CLI input reference
| Command | Required inputs | Optional inputs | Output |
|---------|-----------------|-----------------|--------|
| `ktx agent tools --json` | `--json` | none | JSON list of available agent commands |
| `ktx agent context --json` | `--json` | none | JSON project context and readiness state |
| `ktx agent sl list --json` | `--json` | `--connection-id`, `--query` | JSON semantic source list |
| `ktx agent sl read <sourceName> --json --connection-id <id>` | source name, `--json`, `--connection-id` | none | JSON payload containing source YAML |
| `ktx agent sl query --json --connection-id <id> --query-file <path>` | `--json`, `--connection-id`, `--query-file` | `--execute`, `--max-rows` | JSON compiled query, SQL, plan, and optional rows |
| `ktx agent wiki search <query> --json` | query, `--json` | `--limit` | JSON ranked knowledge results |
| `ktx agent wiki read <pageId> --json` | page id, `--json` | none | JSON full knowledge page |
| `ktx agent sql execute --json --connection-id <id> --sql-file <path> --max-rows <n>` | `--json`, `--connection-id`, `--sql-file`, `--max-rows` | none | JSON rows and execution metadata |
### Workflow: answer an analytics question through CLI
1. `ktx agent context --json` — verify the KTX project is ready for agents.
2. `ktx agent sl list --json --query "revenue"` — find semantic sources related to the question.
3. `ktx agent wiki search "revenue recognition" --json --limit 5` — retrieve business definitions.
4. Write a query JSON file with measures, dimensions, filters, and limits.
5. `ktx agent sl query --json --connection-id my-postgres --query-file query.json` — compile and inspect SQL.
6. Add `--execute --max-rows 100` only when the user needs rows and execution is allowed.
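
Steps 4 to 6 above can be scripted with a thin wrapper around the CLI. This is a sketch under assumptions: the query file field names (`measures`, `dimensions`, `filters`, `limit`) are inferred from step 4 rather than from a documented schema, and the helper names are hypothetical. The `runner` parameter exists so the wrapper can be exercised without `ktx` installed.

```python
import json
import subprocess


def write_query_file(path, measures, dimensions=(), filters=(), limit=None):
    """Write a query JSON file; the field names are assumed, not confirmed."""
    query = {"measures": list(measures), "dimensions": list(dimensions),
             "filters": list(filters)}
    if limit is not None:
        query["limit"] = limit
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(query, handle, indent=2)
    return query


def run_ktx_agent(args, runner=subprocess.run):
    """Run a `ktx agent` subcommand and parse the JSON it prints to stdout."""
    command = ["ktx", "agent", *args]
    completed = runner(command, capture_output=True, text=True, check=True)
    try:
        return json.loads(completed.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError("no JSON on stdout; was --json passed?") from exc


def sl_query(connection_id, query_file, execute=False, max_rows=100,
             runner=subprocess.run):
    """Compile a semantic query; add execution flags only when asked to."""
    args = ["sl", "query", "--json",
            "--connection-id", connection_id, "--query-file", query_file]
    if execute:
        args += ["--execute", "--max-rows", str(max_rows)]
    return run_ktx_agent(args, runner=runner)
```

Calling `sl_query("my-postgres", "query.json")` corresponds to step 5, and passing `execute=True` adds the flags from step 6.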
### When to use CLI vs MCP
| | MCP | CLI |
|---|-----|-----|
| **Best for** | Persistent agent integrations | Terminal-based workflows, scripting |
| **Protocol** | Structured tool calls over stdio | Shell commands with JSON output |
| **Used by** | Claude Code, Cursor, Codex | Shell scripts, custom agents, debugging |
| **State** | Server runs continuously | Stateless per invocation |

Most users should set up MCP: it exposes the full tool set listed above, including scans, ingestion, and memory capture, through structured tool calls. The CLI commands are useful for scripting, debugging, and agents that operate through terminal tools.
## Setting Up Your Agent
The fastest way to connect an agent is through the setup wizard:
```bash
ktx setup
```
The agents step auto-detects installed tools and generates the right configuration. For manual setup or per-tool details, see the [Agent Clients](/docs/integrations/agent-clients) integration page.
### Quick manual setup
**Claude Code** — add to `.claude/settings.json`:
```json
{
  "mcpServers": {
    "ktx": {
      "command": "ktx",
      "args": ["serve", "--mcp", "stdio", "--semantic-compute", "--execute-queries"],
      "env": {
        "KTX_PROJECT_DIR": "/path/to/your/ktx/project"
      }
    }
  }
}
```
**Cursor** — add to `.cursor/mcp.json`:
```json
{
  "mcpServers": {
    "ktx": {
      "command": "ktx",
      "args": ["serve", "--mcp", "stdio", "--semantic-compute", "--execute-queries"],
      "env": {
        "KTX_PROJECT_DIR": "/path/to/your/ktx/project"
      }
    }
  }
}
```
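After configuration, the agent can start calling KTX tools right away: listing sources, searching knowledge, and querying your semantic layer. Both examples pass `--execute-queries`, which lets the agent run SQL; omit that flag to keep query execution disabled, which is the default.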
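
If a config file already lists other MCP servers, the `ktx` entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the `mcpServers` layout shown above; the helper names are hypothetical, and this is not how `ktx setup` is implemented.

```python
import json
from pathlib import Path


def ktx_server_entry(project_dir, execute_queries=False):
    """Build the `ktx` server entry; query execution stays off unless requested."""
    args = ["serve", "--mcp", "stdio", "--semantic-compute"]
    if execute_queries:
        args.append("--execute-queries")
    return {"command": "ktx", "args": args, "env": {"KTX_PROJECT_DIR": project_dir}}


def merge_ktx_server(config, project_dir, execute_queries=False):
    """Return a copy of `config` with the `ktx` entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ktx"] = ktx_server_entry(project_dir, execute_queries)
    merged["mcpServers"] = servers
    return merged


def update_config_file(config_path, project_dir, execute_queries=False):
    """Read the config file if present, merge the entry, and write it back."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    merged = merge_ktx_server(config, project_dir, execute_queries)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
    return merged
```

For example, `update_config_file(".cursor/mcp.json", "/path/to/your/ktx/project")` leaves any other servers in that file untouched.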
## Common errors
| Error or symptom | Likely cause | Recovery |
|------------------|--------------|----------|
| Agent cannot find the MCP server | Agent config points to a missing `ktx` binary or wrong project directory | Run `ktx setup --agents` again, then verify the generated MCP config contains the intended `KTX_PROJECT_DIR` |
| MCP tools list but semantic queries fail | `--semantic-compute` was not enabled or the daemon URL is wrong | Start `ktx serve --mcp stdio --semantic-compute` or set `--semantic-compute-url` to the running daemon |
| Query execution is rejected | The MCP server was started without `--execute-queries` or the SQL is not read-only | Restart with `--execute-queries` only when execution is intended, and keep `maxRows` bounded |
| `ktx agent` command exits without JSON | `--json` was omitted | Re-run the command with `--json`; all `ktx agent` subcommands require it |
| SQL execution exceeds limits | `--max-rows` is missing or too high | Re-run with an explicit value from 1 to 1000 |
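
The first row of this table can be checked mechanically before re-running setup. The sketch below inspects one MCP server entry, shaped like the manual setup examples, for a command that is not on `PATH` and a missing or nonexistent `KTX_PROJECT_DIR`. `diagnose_mcp_entry` is a hypothetical helper, and it does not verify that the directory is a valid KTX project.

```python
import os
import shutil


def diagnose_mcp_entry(entry):
    """Return human-readable problems found in one MCP server entry."""
    problems = []
    command = entry.get("command", "")
    if not command or shutil.which(command) is None:
        problems.append(f"command not found on PATH: {command!r}")
    project_dir = entry.get("env", {}).get("KTX_PROJECT_DIR")
    if not project_dir:
        problems.append("KTX_PROJECT_DIR is not set")
    elif not os.path.isdir(project_dir):
        problems.append(f"KTX_PROJECT_DIR does not exist: {project_dir}")
    return problems


if __name__ == "__main__":
    entry = {"command": "ktx", "env": {"KTX_PROJECT_DIR": "/path/to/your/ktx/project"}}
    for problem in diagnose_mcp_entry(entry):
        print(problem)
```

An empty list means both checks passed; anything else points at the config field to fix.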