mirror of https://github.com/Kaelio/ktx.git, synced 2026-06-07 07:55:13 +02:00

docs: tighten guide copy

parent d60d83e595
commit 5f3fb5e8ef

4 changed files with 67 additions and 115 deletions

@@ -3,12 +3,9 @@ title: Building Context
 description: Build and refresh KTX context from databases, source tools, query history, and text.
 ---
 
-Building context turns configured connections into local semantic-layer sources
-and wiki pages. Agents use those files to understand your schema, business
-definitions, metric logic, joins, and known caveats before they write SQL.
-
-Use this guide after `ktx setup` has created `ktx.yaml` and at least one
-database or context-source connection.
+Build context after `ktx setup` creates `ktx.yaml` and at least one database or
+context-source connection. KTX writes local semantic-layer sources and wiki
+pages for agents to use before writing SQL.
 
 ## The build loop
 
@@ -22,15 +19,12 @@ Most projects use this loop:
 5. Validate and query representative sources before handing the context to an
 agent.
 
-`ktx ingest --all` runs database connections first, then context-source
-connections. That order lets dbt, BI, Notion, and text ingest attach context to
-known warehouse tables.
+`ktx ingest --all` runs databases first, then context-source connections, so
+external metadata can attach to known warehouse tables.
 
 ## Database ingest
 
-Database ingest connects to a configured warehouse and records local schema
-context. It gives agents table, column, type, constraint, and row-count
-grounding without requiring them to inspect the database directly.
+Database ingest records table, column, type, constraint, and row-count context.
 
 ```bash
 # Build one configured database connection
@@ -55,20 +49,16 @@ ktx ingest warehouse --deep
 ktx ingest --all --deep
 ```
 
-Deep ingest needs LLM and embedding readiness. If those providers are not
-configured, run `ktx setup` or use `--fast`.
+Deep ingest needs LLM and embedding readiness. Otherwise run `ktx setup` or use
+`--fast`.
 
-When you use `claude-code`, KTX still controls the tool surface for ingest and
-memory capture. Claude Code built-in tools, discovered MCP servers, plugins,
-skills, agents, and slash commands are not invokable by KTX agent loops unless
-they are exact KTX MCP tools for the current run.
+With `claude-code`, KTX agent loops can invoke only the KTX MCP tools for the
+current run.
 
 ## Query history
 
-PostgreSQL, BigQuery, and Snowflake can add query-history context. This helps
-KTX learn common joins, filters, service-account patterns, redaction rules, and
-usage-heavy query templates. BigQuery and Snowflake support a lookback window;
-Postgres reads the current `pg_stat_statements` aggregate data instead.
+PostgreSQL, BigQuery, and Snowflake can add query-history context: common joins,
+filters, service-account patterns, redaction rules, and high-usage templates.
 
 Enable it during setup, store it under `connections.<id>.context.queryHistory`,
 or request it for one run:
@@ -84,19 +74,13 @@ for one run.
 
 ## Relationship evidence
 
-Many databases do not declare all foreign keys. KTX can score relationship
-candidates using signals such as name similarity, type compatibility, value
-overlap, embedding similarity, uniqueness, null rate, and structural priors.
-
-The public CLI does not expose separate relationship review subcommands.
-Relationship evidence is built as part of deep database ingest when the
-connector and readiness checks support it.
+KTX scores relationship candidates during supported deep database ingest. The
+public CLI does not expose separate relationship review subcommands.
 
 ## Context-source ingest
 
-Context-source connections pull business metadata from tools your team already
-uses. The current public `ktx ingest` command is connection-centric: pass one
-configured connection id, or pass `--all`.
+Context-source connections pull metadata from dbt, BI tools, Notion, and other
+configured systems. Pass one connection id or `--all`.
 
 ```bash
 # Build one source connection
@@ -117,14 +101,13 @@ Supported source types:
 | `metabase` | Metabase API | Questions, dashboards, table metadata, and mappings |
 | `notion` | Notion API | Wiki pages and business knowledge |
 
-Source ingest extracts metadata, reconciles it with existing local context, and
-writes semantic-layer YAML plus wiki Markdown. It merges rather than blindly
-overwriting local edits.
+Source ingest writes semantic-layer YAML and wiki Markdown, merging with local
+edits.
 
 ## Text ingest
 
-Use `ktx ingest text` for notes, Markdown files, runbooks, Slack exports, or
-other free-form knowledge that should become searchable KTX memory.
+Use `ktx ingest text` for notes, Markdown, runbooks, Slack exports, or other
+searchable memory.
 
 ```bash
 # Capture a Markdown file
@@ -146,14 +129,12 @@ Useful flags:
 | `--json` | Print structured output |
 | `--fail-fast` | Stop after the first failed text item |
 
-Text ingest is a good fit for small, high-signal documents. For system-specific
-connectors such as Notion, dbt, or Metabase, prefer configured source ingest so
-KTX can preserve source metadata.
+Use text ingest for small, high-signal documents. Prefer configured source
+ingest for Notion, dbt, Metabase, and similar systems.
 
 ## Output and artifacts
 
-Every ingest run prints a summary. Use `--json` when an agent or script needs a
-structured plan and per-target results.
+Every ingest run prints a summary. Use `--json` for scripts and agents.
 
 ```bash
 ktx ingest --all --json
@@ -168,9 +149,7 @@ Typical generated files:
 | `wiki/user/<user-id>/*.md` | Text and memory ingest | User-scoped context |
 | `.ktx/setup/context-build.json` | Setup context build | Resume and readiness state for setup |
 
-Ingest sessions also record transcripts with tool calls, LLM responses, and
-write decisions. Inspect them when you need to debug why a source or wiki page
-was written a certain way.
+Ingest transcripts include tool calls, LLM responses, and write decisions.
 
 ## Example: first full refresh
 

@@ -3,8 +3,8 @@ title: LLM configuration
 description: Configure KTX LLM providers, model roles, and prompt caching.
 ---
 
-KTX uses the top-level `llm` block in `ktx.yaml` for text generation,
-structured extraction, and ingest or memory agent loops.
+Configure text generation, structured extraction, and ingest or memory loops in
+the top-level `llm` block.
 
 ## Backends
 
@@ -15,9 +15,7 @@ Set `llm.provider.backend` to one of these values:
 - `vertex`: Use Vertex AI Anthropic models through Google Cloud credentials.
 - `gateway`: Use AI Gateway-compatible Anthropic model ids.
 - `claude-code`: Use your local Claude Code session through the Claude Agent
-SDK. KTX removes provider-routing environment variables from Claude Code
-child processes, so this backend doesn't silently fall back to
-`ANTHROPIC_API_KEY`, Vertex, Gateway, or Bedrock credentials.
+SDK. KTX strips provider-routing environment variables from child processes.
 
 ## Claude Code
 
@@ -36,26 +34,20 @@ llm:
 repair: sonnet
 ```
 
-During setup, choose the Claude Code backend interactively or pass the model in
-automation:
+During setup, choose the backend interactively or pass the model in automation:
 
 ```bash
 ktx setup --llm-backend claude-code --llm-model opus --no-input
 ```
 
-For Claude Code, `sonnet`, `opus`, and `haiku` map to the current KTX defaults.
-You can also pass a full Claude model ID, such as `claude-opus-4-7`.
+For Claude Code, `sonnet`, `opus`, and `haiku` map to KTX defaults. Full Claude
+model IDs are also accepted.
 
-`claude-code` keeps KTX tool boundaries intact. KTX exposes only the MCP tools
-needed for the current KTX agent loop, disables Claude Code built-in tools,
-keeps plugins empty, and denies every non-KTX tool request through
-`canUseTool`. The Claude Agent SDK may still report host-discovered slash
-commands, skills, and subagent names in init metadata; that metadata is not an
-execution grant for KTX agent loops.
+`claude-code` exposes only KTX MCP tools for the current agent loop. SDK init
+metadata may still list host slash commands, skills, and subagents; KTX does not
+grant execution access to them.
 
 ## Prompt caching
 
-`llm.promptCaching` has partial parity on `claude-code`. KTX doesn't pass
-Anthropic cache-control markers to the Claude Agent SDK. Status and doctor warn
-when you configure prompt-cache TTL, tool, or history fields that the Claude
-Agent SDK backend ignores.
+`llm.promptCaching` has partial parity on `claude-code`. Status and doctor warn
+when the Claude Agent SDK backend ignores configured cache fields.

@@ -3,9 +3,8 @@ title: Serving Agents
 description: Expose KTX context to Claude Code, Codex, Cursor, OpenCode, and custom agents.
 ---
 
-KTX serves agents through the public CLI and project-local instruction files.
-Agents do not need a separate server. They read the generated rules, call KTX
-commands, inspect local context files, and use JSON output when they need
+KTX serves agents through the CLI and project-local instruction files. Agents
+read generated rules, call KTX commands, inspect context files, and use JSON for
 structured results.
 
 ## Recommended setup
@@ -39,14 +38,13 @@ ktx setup --agents --target claude-code --global
 ktx setup --agents --target codex --global
 ```
 
-KTX records installed files in `.ktx/agents/install-manifest.json`. Rerun
-`ktx setup --agents` after moving a checkout or reinstalling the CLI so the
-generated instructions point at the current CLI path.
+Installed files are recorded in `.ktx/agents/install-manifest.json`. Rerun
+`ktx setup --agents` after moving a checkout or reinstalling the CLI.
 
 ## Agent command set
 
-All supported agent clients use the same command surface. Use `--project-dir`
-when the agent is running outside the KTX project directory.
+All supported clients use the same command surface. Use `--project-dir` outside
+the KTX project directory.
 
 ### Readiness
 
@@ -54,9 +52,8 @@ when the agent is running outside the KTX project directory.
 ktx status --json
 ```
 
-Agents should run this before relying on context. It reports project, LLM,
-embedding, database, context-source, context-build, and agent-integration
-readiness.
+Run this before relying on context. It reports project, provider, connection,
+context-build, and agent-integration readiness.
 
 ### Semantic layer discovery
 
@@ -66,8 +63,8 @@ ktx sl list --connection-id warehouse --json
 ktx sl search "revenue" --json --limit 10
 ```
 
-Agents use these commands to discover source names, connection ids, measures,
-dimensions, and likely files to inspect.
+Use these commands to find source names, connection ids, measures, dimensions,
+and files to inspect.
 
 ### Semantic-layer validation and queries
 
@@ -106,9 +103,8 @@ ktx wiki list --json
 ktx wiki search "revenue recognition" --json --limit 10
 ```
 
-Agents should search wiki context when a question depends on business
-definitions, metric caveats, process rules, or terms that are not obvious from
-schema names.
+Search wiki context for business definitions, metric caveats, process rules, and
+non-obvious terms.
 
 ### Context refresh
 
@@ -120,8 +116,7 @@ ktx ingest --all
 ktx ingest text docs/revenue-notes.md --connection-id warehouse
 ```
 
-Use `--deep` only when LLM and embedding setup is ready and the user expects an
-AI-enriched refresh.
+Use `--deep` only when LLM and embedding setup is ready.
 
 ## Good agent behavior
 
@@ -135,14 +130,12 @@ Agents should:
 - Validate edited semantic sources with `ktx sl validate`.
 - Keep generated context changes reviewable in git.
 
-Agents should not assume a background server, ORPC route, frontend app, or
-external migration system exists. KTX is a local context layer with a CLI and
-plain project files.
+KTX is a local context layer with a CLI and plain project files. Do not assume a
+background server, ORPC route, frontend app, or external migration system.
 
 ## Manual setup
 
-Manual setup is useful for custom agents that can read project-local
-instructions but are not yet a named target.
+Use manual setup for custom agents that can read project-local instructions.
 
 1. Install the universal target:
 

@@ -3,12 +3,8 @@ title: Writing Context
 description: Edit semantic sources and wiki pages so agents use your business logic.
 ---
 
-KTX context is meant to be edited. Ingest gives you a grounded first draft, then
-you refine source YAML and wiki Markdown until agents can answer data questions
-with the same definitions your team uses.
-
-Use this guide when you are adding measures, fixing joins, documenting business
-rules, or reviewing context changes made by an agent.
+Ingest creates the first draft. Edit source YAML and wiki Markdown when you need
+sharper metrics, joins, or business rules.
 
 ## Editing workflow
 
@@ -45,10 +41,8 @@ Use this order for most context changes:
 
 ## Semantic sources
 
-Semantic sources are YAML files that describe queryable entities. A source is
-usually a table, but it can also point at a custom SQL expression. Sources
-define the vocabulary agents use for measures, dimensions, segments, joins, and
-grain-aware query planning.
+Semantic sources are YAML files for queryable tables or custom SQL. They define
+agent-facing measures, dimensions, segments, joins, and grain.
 
 Source files live at:
 
@@ -198,8 +192,8 @@ joins:
 
 ## Measures
 
-Good measures have precise names, SQL expressions at the correct grain, and
-descriptions that say what is included and excluded.
+Good measures have precise names, correct-grain SQL, and descriptions that name
+key inclusions and exclusions.
 
 ```yaml
 measures:
@@ -209,14 +203,13 @@ measures:
 description: Completed order revenue after refunds, excluding cancelled orders.
 ```
 
-Prefer one canonical measure plus wiki synonyms over several nearly identical
-measures. If your team uses multiple definitions, document the distinction in a
-wiki page and link it with `sl_refs`.
+Prefer one canonical measure plus wiki synonyms. Put competing definitions in a
+linked wiki page.
 
 ## Joins and grain
 
-`grain` and `relationship` prevent agents from producing double-counted SQL.
-State the row grain even when it seems obvious.
+`grain` and `relationship` prevent double-counted SQL. State the row grain even
+when it seems obvious.
 
 ```yaml
 grain:
@@ -228,8 +221,7 @@ joins:
 ```
 
 Use `many_to_one` for dimensions such as customer, account, product, or plan.
-Use `one_to_many` only when the target can fan out the source rows, such as
-orders to order items.
+Use `one_to_many` only when the target can fan out rows.
 
 ## Validate and query
 
@@ -239,8 +231,7 @@ Validation checks source YAML against the live database schema:
 ktx sl validate orders --connection-id warehouse
 ```
 
-It catches missing columns, invalid join targets, and table-reference problems
-before an agent relies on the source.
+It catches missing columns, invalid joins, and table-reference problems.
 
 Compile a query to inspect generated SQL:
 
@@ -268,9 +259,8 @@ ktx sl query \
 
 ## Wiki pages
 
-Wiki pages capture business context that does not belong in a single source
-file: metric policies, dashboard caveats, company vocabulary, data freshness,
-known issues, and source-of-truth notes.
+Wiki pages hold context that does not belong in one source file: policies,
+caveats, vocabulary, freshness, known issues, and source-of-truth notes.
 
 Wiki files live under:
 
@@ -280,8 +270,7 @@ wiki/
 user/<user-id>/
 ```
 
-Use global pages for shared business rules. Use user-scoped pages for local
-notes, personal conventions, or context that should not be shared broadly.
+Use global pages for shared rules and user-scoped pages for local notes.
 
 ### Wiki page example
 
@@ -338,8 +327,7 @@ ktx sl search "revenue" --json
 ktx wiki search "revenue recognition" --json --limit 10
 ```
 
-Check that definitions are specific, hidden columns stay hidden, joins have
-explicit relationships, and measures compile into the expected SQL.
+Check definitions, hidden columns, join relationships, and generated SQL.
 
 ## Common errors
 