diff --git a/docs-site/content/docs/guides/building-context.mdx b/docs-site/content/docs/guides/building-context.mdx
index 584e9003..39edaa85 100644
--- a/docs-site/content/docs/guides/building-context.mdx
+++ b/docs-site/content/docs/guides/building-context.mdx
@@ -3,12 +3,9 @@ title: Building Context
 description: Build and refresh KTX context from databases, source tools, query history, and text.
 ---

-Building context turns configured connections into local semantic-layer sources
-and wiki pages. Agents use those files to understand your schema, business
-definitions, metric logic, joins, and known caveats before they write SQL.
-
-Use this guide after `ktx setup` has created `ktx.yaml` and at least one
-database or context-source connection.
+Build context after `ktx setup` creates `ktx.yaml` and at least one database or
+context-source connection. KTX writes local semantic-layer sources and wiki
+pages for agents to use before writing SQL.

 ## The build loop

@@ -22,15 +19,12 @@ Most projects use this loop:
 5. Validate and query representative sources before handing the context to an
    agent.

-`ktx ingest --all` runs database connections first, then context-source
-connections. That order lets dbt, BI, Notion, and text ingest attach context to
-known warehouse tables.
+`ktx ingest --all` runs databases first, then context-source connections, so
+external metadata can attach to known warehouse tables.

 ## Database ingest

-Database ingest connects to a configured warehouse and records local schema
-context. It gives agents table, column, type, constraint, and row-count
-grounding without requiring them to inspect the database directly.
+Database ingest records table, column, type, constraint, and row-count context.

 ```bash
 # Build one configured database connection
@@ -55,20 +49,16 @@ ktx ingest warehouse --deep
 ktx ingest --all --deep
 ```

-Deep ingest needs LLM and embedding readiness. If those providers are not
-configured, run `ktx setup` or use `--fast`.
+Deep ingest needs LLM and embedding readiness. Otherwise run `ktx setup` or use
+`--fast`.

-When you use `claude-code`, KTX still controls the tool surface for ingest and
-memory capture. Claude Code built-in tools, discovered MCP servers, plugins,
-skills, agents, and slash commands are not invokable by KTX agent loops unless
-they are exact KTX MCP tools for the current run.
+With `claude-code`, KTX agent loops can invoke only the KTX MCP tools for the
+current run.

 ## Query history

-PostgreSQL, BigQuery, and Snowflake can add query-history context. This helps
-KTX learn common joins, filters, service-account patterns, redaction rules, and
-usage-heavy query templates. BigQuery and Snowflake support a lookback window;
-Postgres reads the current `pg_stat_statements` aggregate data instead.
+PostgreSQL, BigQuery, and Snowflake can add query-history context: common joins,
+filters, service-account patterns, redaction rules, and high-usage templates.

 Enable it during setup, store it under `connections..context.queryHistory`, or
 request it for one run:
@@ -84,19 +74,13 @@ for one run.

 ## Relationship evidence

-Many databases do not declare all foreign keys. KTX can score relationship
-candidates using signals such as name similarity, type compatibility, value
-overlap, embedding similarity, uniqueness, null rate, and structural priors.
-
-The public CLI does not expose separate relationship review subcommands.
-Relationship evidence is built as part of deep database ingest when the
-connector and readiness checks support it.
+KTX scores relationship candidates during supported deep database ingest. The
+public CLI does not expose separate relationship review subcommands.

 ## Context-source ingest

-Context-source connections pull business metadata from tools your team already
-uses. The current public `ktx ingest` command is connection-centric: pass one
-configured connection id, or pass `--all`.
+Context-source connections pull metadata from dbt, BI tools, Notion, and other
+configured systems. Pass one connection id or `--all`.

 ```bash
 # Build one source connection
@@ -117,14 +101,13 @@ Supported source types:
 | `metabase` | Metabase API | Questions, dashboards, table metadata, and mappings |
 | `notion` | Notion API | Wiki pages and business knowledge |

-Source ingest extracts metadata, reconciles it with existing local context, and
-writes semantic-layer YAML plus wiki Markdown. It merges rather than blindly
-overwriting local edits.
+Source ingest writes semantic-layer YAML and wiki Markdown, merging with local
+edits.

 ## Text ingest

-Use `ktx ingest text` for notes, Markdown files, runbooks, Slack exports, or
-other free-form knowledge that should become searchable KTX memory.
+Use `ktx ingest text` for notes, Markdown, runbooks, Slack exports, or other
+searchable memory.

 ```bash
 # Capture a Markdown file
@@ -146,14 +129,12 @@ Useful flags:
 | `--json` | Print structured output |
 | `--fail-fast` | Stop after the first failed text item |

-Text ingest is a good fit for small, high-signal documents. For system-specific
-connectors such as Notion, dbt, or Metabase, prefer configured source ingest so
-KTX can preserve source metadata.
+Use text ingest for small, high-signal documents. Prefer configured source
+ingest for Notion, dbt, Metabase, and similar systems.

 ## Output and artifacts

-Every ingest run prints a summary. Use `--json` when an agent or script needs a
-structured plan and per-target results.
+Every ingest run prints a summary. Use `--json` for scripts and agents.

 ```bash
 ktx ingest --all --json
@@ -168,9 +149,7 @@ Typical generated files:
 | `wiki/user//*.md` | Text and memory ingest | User-scoped context |
 | `.ktx/setup/context-build.json` | Setup context build | Resume and readiness state for setup |

-Ingest sessions also record transcripts with tool calls, LLM responses, and
-write decisions. Inspect them when you need to debug why a source or wiki page
-was written a certain way.
+Ingest transcripts include tool calls, LLM responses, and write decisions.

 ## Example: first full refresh

diff --git a/docs-site/content/docs/guides/llm-configuration.mdx b/docs-site/content/docs/guides/llm-configuration.mdx
index 054d0b58..7e86cf96 100644
--- a/docs-site/content/docs/guides/llm-configuration.mdx
+++ b/docs-site/content/docs/guides/llm-configuration.mdx
@@ -3,8 +3,8 @@ title: LLM configuration
 description: Configure KTX LLM providers, model roles, and prompt caching.
 ---

-KTX uses the top-level `llm` block in `ktx.yaml` for text generation,
-structured extraction, and ingest or memory agent loops.
+Configure text generation, structured extraction, and ingest or memory loops in
+the top-level `llm` block.

 ## Backends

@@ -15,9 +15,7 @@ Set `llm.provider.backend` to one of these values:
 - `vertex`: Use Vertex AI Anthropic models through Google Cloud credentials.
 - `gateway`: Use AI Gateway-compatible Anthropic model ids.
 - `claude-code`: Use your local Claude Code session through the Claude Agent
-  SDK. KTX removes provider-routing environment variables from Claude Code
-  child processes, so this backend doesn't silently fall back to
-  `ANTHROPIC_API_KEY`, Vertex, Gateway, or Bedrock credentials.
+  SDK. KTX strips provider-routing environment variables from child processes.

 ## Claude Code

@@ -36,26 +34,20 @@ llm:
 repair: sonnet
 ```

-During setup, choose the Claude Code backend interactively or pass the model in
-automation:
+During setup, choose the backend interactively or pass the model in automation:

 ```bash
 ktx setup --llm-backend claude-code --llm-model opus --no-input
 ```

-For Claude Code, `sonnet`, `opus`, and `haiku` map to the current KTX defaults.
-You can also pass a full Claude model ID, such as `claude-opus-4-7`.
+For Claude Code, `sonnet`, `opus`, and `haiku` map to KTX defaults. Full Claude
+model IDs are also accepted.

-`claude-code` keeps KTX tool boundaries intact. KTX exposes only the MCP tools
-needed for the current KTX agent loop, disables Claude Code built-in tools,
-keeps plugins empty, and denies every non-KTX tool request through
-`canUseTool`. The Claude Agent SDK may still report host-discovered slash
-commands, skills, and subagent names in init metadata; that metadata is not an
-execution grant for KTX agent loops.
+`claude-code` exposes only KTX MCP tools for the current agent loop. SDK init
+metadata may still list host slash commands, skills, and subagents; KTX does not
+grant execution access to them.

 ## Prompt caching

-`llm.promptCaching` has partial parity on `claude-code`. KTX doesn't pass
-Anthropic cache-control markers to the Claude Agent SDK. Status and doctor warn
-when you configure prompt-cache TTL, tool, or history fields that the Claude
-Agent SDK backend ignores.
+`llm.promptCaching` has partial parity on `claude-code`. Status and doctor warn
+when the Claude Agent SDK backend ignores configured cache fields.
diff --git a/docs-site/content/docs/guides/serving-agents.mdx b/docs-site/content/docs/guides/serving-agents.mdx
index 192b1c7f..8710d3ba 100644
--- a/docs-site/content/docs/guides/serving-agents.mdx
+++ b/docs-site/content/docs/guides/serving-agents.mdx
@@ -3,9 +3,8 @@ title: Serving Agents
 description: Expose KTX context to Claude Code, Codex, Cursor, OpenCode, and custom agents.
 ---

-KTX serves agents through the public CLI and project-local instruction files.
-Agents do not need a separate server. They read the generated rules, call KTX
-commands, inspect local context files, and use JSON output when they need
+KTX serves agents through the CLI and project-local instruction files. Agents
+read generated rules, call KTX commands, inspect context files, and use JSON for
 structured results.

 ## Recommended setup
@@ -39,14 +38,13 @@ ktx setup --agents --target claude-code --global
 ktx setup --agents --target codex --global
 ```

-KTX records installed files in `.ktx/agents/install-manifest.json`. Rerun
-`ktx setup --agents` after moving a checkout or reinstalling the CLI so the
-generated instructions point at the current CLI path.
+Installed files are recorded in `.ktx/agents/install-manifest.json`. Rerun
+`ktx setup --agents` after moving a checkout or reinstalling the CLI.

 ## Agent command set

-All supported agent clients use the same command surface. Use `--project-dir`
-when the agent is running outside the KTX project directory.
+All supported clients use the same command surface. Use `--project-dir` outside
+the KTX project directory.

 ### Readiness

@@ -54,9 +52,8 @@ when the agent is running outside the KTX project directory.
 ktx status --json
 ```

-Agents should run this before relying on context. It reports project, LLM,
-embedding, database, context-source, context-build, and agent-integration
-readiness.
+Run this before relying on context. It reports project, provider, connection,
+context-build, and agent-integration readiness.

 ### Semantic layer discovery

@@ -66,8 +63,8 @@ ktx sl list --connection-id warehouse --json
 ktx sl search "revenue" --json --limit 10
 ```

-Agents use these commands to discover source names, connection ids, measures,
-dimensions, and likely files to inspect.
+Use these commands to find source names, connection ids, measures, dimensions,
+and files to inspect.

 ### Semantic-layer validation and queries

@@ -106,9 +103,8 @@ ktx wiki list --json
 ktx wiki search "revenue recognition" --json --limit 10
 ```

-Agents should search wiki context when a question depends on business
-definitions, metric caveats, process rules, or terms that are not obvious from
-schema names.
+Search wiki context for business definitions, metric caveats, process rules, and
+non-obvious terms.

 ### Context refresh

@@ -120,8 +116,7 @@ ktx ingest --all
 ktx ingest text docs/revenue-notes.md --connection-id warehouse
 ```

-Use `--deep` only when LLM and embedding setup is ready and the user expects an
-AI-enriched refresh.
+Use `--deep` only when LLM and embedding setup is ready.

 ## Good agent behavior

@@ -135,14 +130,12 @@ Agents should:
 - Validate edited semantic sources with `ktx sl validate`.
 - Keep generated context changes reviewable in git.

-Agents should not assume a background server, ORPC route, frontend app, or
-external migration system exists. KTX is a local context layer with a CLI and
-plain project files.
+KTX is a local context layer with a CLI and plain project files. Do not assume a
+background server, ORPC route, frontend app, or external migration system.

 ## Manual setup

-Manual setup is useful for custom agents that can read project-local
-instructions but are not yet a named target.
+Use manual setup for custom agents that can read project-local instructions.

 1. Install the universal target:

diff --git a/docs-site/content/docs/guides/writing-context.mdx b/docs-site/content/docs/guides/writing-context.mdx
index b68960bf..2b9824c8 100644
--- a/docs-site/content/docs/guides/writing-context.mdx
+++ b/docs-site/content/docs/guides/writing-context.mdx
@@ -3,12 +3,8 @@ title: Writing Context
 description: Edit semantic sources and wiki pages so agents use your business logic.
 ---

-KTX context is meant to be edited. Ingest gives you a grounded first draft, then
-you refine source YAML and wiki Markdown until agents can answer data questions
-with the same definitions your team uses.
-
-Use this guide when you are adding measures, fixing joins, documenting business
-rules, or reviewing context changes made by an agent.
+Ingest creates the first draft. Edit source YAML and wiki Markdown when you need
+sharper metrics, joins, or business rules.

 ## Editing workflow

@@ -45,10 +41,8 @@ Use this order for most context changes:

 ## Semantic sources

-Semantic sources are YAML files that describe queryable entities. A source is
-usually a table, but it can also point at a custom SQL expression. Sources
-define the vocabulary agents use for measures, dimensions, segments, joins, and
-grain-aware query planning.
+Semantic sources are YAML files for queryable tables or custom SQL. They define
+agent-facing measures, dimensions, segments, joins, and grain.

 Source files live at:

@@ -198,8 +192,8 @@ joins:

 ## Measures

-Good measures have precise names, SQL expressions at the correct grain, and
-descriptions that say what is included and excluded.
+Good measures have precise names, correct-grain SQL, and descriptions that name
+key inclusions and exclusions.

 ```yaml
 measures:
@@ -209,14 +203,13 @@ measures:
 description: Completed order revenue after refunds, excluding cancelled orders.
 ```

-Prefer one canonical measure plus wiki synonyms over several nearly identical
-measures. If your team uses multiple definitions, document the distinction in a
-wiki page and link it with `sl_refs`.
+Prefer one canonical measure plus wiki synonyms. Put competing definitions in a
+linked wiki page.

 ## Joins and grain

-`grain` and `relationship` prevent agents from producing double-counted SQL.
-State the row grain even when it seems obvious.
+`grain` and `relationship` prevent double-counted SQL. State the row grain even
+when it seems obvious.

 ```yaml
 grain:
@@ -228,8 +221,7 @@ joins:
 ```

 Use `many_to_one` for dimensions such as customer, account, product, or plan.
-Use `one_to_many` only when the target can fan out the source rows, such as
-orders to order items.
+Use `one_to_many` only when the target can fan out rows.

 ## Validate and query

@@ -239,8 +231,7 @@ Validation checks source YAML against the live database schema:
 ktx sl validate orders --connection-id warehouse
 ```

-It catches missing columns, invalid join targets, and table-reference problems
-before an agent relies on the source.
+It catches missing columns, invalid joins, and table-reference problems.

 Compile a query to inspect generated SQL:

@@ -268,9 +259,8 @@ ktx sl query \

 ## Wiki pages

-Wiki pages capture business context that does not belong in a single source
-file: metric policies, dashboard caveats, company vocabulary, data freshness,
-known issues, and source-of-truth notes.
+Wiki pages hold context that does not belong in one source file: policies,
+caveats, vocabulary, freshness, known issues, and source-of-truth notes.

 Wiki files live under:

@@ -280,8 +270,7 @@ wiki/
 user//
 ```

-Use global pages for shared business rules. Use user-scoped pages for local
-notes, personal conventions, or context that should not be shared broadly.
+Use global pages for shared rules and user-scoped pages for local notes.

 ### Wiki page example

@@ -338,8 +327,7 @@ ktx sl search "revenue" --json
 ktx wiki search "revenue recognition" --json --limit 10
 ```

-Check that definitions are specific, hidden columns stay hidden, joins have
-explicit relationships, and measures compile into the expected SQL.
+Check definitions, hidden columns, join relationships, and generated SQL.

 ## Common errors