Revamp setup and guide docs (#103)

* docs: revamp quickstart setup flow

* docs: refresh context build guide

* docs: rewrite context authoring guide

* docs: update agent serving guide
Luca Martial 2026-05-14 18:09:26 -04:00 committed by GitHub
parent 17653e24f5
commit db23fea609
4 changed files with 709 additions and 499 deletions


@@ -1,254 +1,286 @@
---
title: Quickstart
description: Set up KTX and build your first context in under 10 minutes.
description: Set up KTX, build local context, and connect your coding agent.
---
This guide walks you through `ktx setup` - an interactive wizard that configures your LLM provider, connects your database, optionally ingests from your existing tools, builds context, and installs agent integration.
This guide gets a local analytics project ready for KTX. You will install the
CLI, run the setup wizard, connect a database, build context, and install agent
rules that teach your coding assistant which KTX commands to run.
If you are a coding assistant trying to decide which KTX docs page to read, start with the [Agent Quickstart](/docs/ai-resources/agent-quickstart). This page is the human setup walkthrough.
If you are a coding assistant choosing a docs route, start with the
[Agent Quickstart](/docs/ai-resources/agent-quickstart). This page is the
human setup walkthrough.
## Workflow summary
## What setup does
Use this sequence when you are setting up KTX in an analytics project:
`ktx setup` is the main project workflow. It can create or resume `ktx.yaml`,
configure model and embedding providers, add database connections, add optional
context sources, build the first context artifacts, and install agent
integration.
1. `npm install -g @kaelio/ktx` - install the published KTX CLI from npm.
2. `ktx setup` - create or resume a KTX project.
When you run bare `ktx` in an interactive terminal outside a KTX project, the
CLI opens the same setup experience. Inside an existing project, `ktx setup`
resumes incomplete work or opens a menu for changing setup, connecting an
agent, checking status, or exploring a demo project.
The setup wizard is stateful. If it exits before completion, rerun `ktx setup` in the same project directory to resume from the first incomplete step.
## Install the CLI
## Install and run setup
Install the published [`@kaelio/ktx`](https://www.npmjs.com/package/@kaelio/ktx) CLI:
Install the published `@kaelio/ktx` package:
```bash
npm install -g @kaelio/ktx
```
Then run the setup wizard:
Then run setup from the analytics project directory:
```bash
ktx setup
```
The local checkout flow is only for contributors working on KTX itself. See [Contributing](/docs/community/contributing) for that setup.
The local checkout workflow is only for KTX contributors. See
[Contributing](/docs/community/contributing) for that path.
The wizard walks through six steps. You can go back at any point, and if you exit early, rerunning `ktx setup` resumes where you left off.
## Step 1: Choose the project
## Step 1: Configure LLM
In an interactive terminal, setup can create a new KTX project or resume the
nearest existing project. The main project file is `ktx.yaml`.
KTX uses an Anthropic model to enrich schema descriptions, generate semantic sources during ingestion, and reconcile metadata from your tools.
For scripted setup, pass the project directory explicitly:
The wizard asks how to find your API key:
```
◆ How should KTX find your Anthropic API key?
│ ○ Use ANTHROPIC_API_KEY from the environment
│ ○ Paste a key and save it as a local secret file
```bash
ktx setup --project-dir ./analytics
```
If you choose to paste a key, KTX saves it in `.ktx/secrets/anthropic-api-key` with local file permissions. Your `ktx.yaml` stores a `file:` reference, never the raw key.
If setup exits early, rerun `ktx setup` in the same directory. KTX tracks
completed setup steps and resumes from the remaining work.
Next, choose a model:
## Step 2: Configure the LLM
```
◆ Which Anthropic model should KTX use?
│ ○ Claude Sonnet 4.6 (recommended)
│ ○ Claude Opus 4.6
│ ○ Claude Haiku 4.5
│ ○ Enter a model ID manually
```
KTX uses a Claude model for ingest agents that turn schemas, SQL, BI metadata,
and documents into semantic-layer sources and wiki context.
KTX runs a health check to verify your key and model work before saving.
Setup supports two LLM provider paths:
## Step 2: Configure embeddings
| Provider | Use when | Credential model |
|----------|----------|------------------|
| Anthropic API | You have an Anthropic API key | `ANTHROPIC_API_KEY` or a local `file:` secret |
| Google Vertex AI for Anthropic Claude | Your organization runs Claude through Google Cloud | Application Default Credentials plus Vertex project and location |
KTX uses embeddings for semantic search over sources, wiki content, schema metadata, and relationship evidence.
For Anthropic API, setup can read the key from the environment or save a pasted
key to `.ktx/secrets/anthropic-api-key`. `ktx.yaml` stores an `env:` or `file:`
reference, not the raw key.
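To make the `env:` and `file:` reference forms concrete, here is a minimal shell sketch of how such a reference could be resolved to a value. The function `resolve_secret_ref` is hypothetical and is not part of KTX. It assumes `file:` paths are readable from the current directory, which may differ from how KTX resolves them.

```bash
# Hypothetical helper (not a KTX command): resolve an env: or file: reference.
resolve_secret_ref() {
  case "$1" in
    env:*)
      # Strip the "env:" prefix and print the exported variable of that name.
      printenv "${1#env:}"
      ;;
    file:*)
      # Strip the "file:" prefix and print the file contents.
      cat "${1#file:}"
      ;;
    *)
      echo "unsupported reference: $1" >&2
      return 1
      ;;
  esac
}

# Example with a throwaway variable name chosen for illustration.
export DEMO_API_KEY=sk-demo
resolve_secret_ref env:DEMO_API_KEY
```

The point of the indirection is that `ktx.yaml` can be committed while the secret value stays in the environment or in an untracked file.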
```
◆ Which embedding option should KTX use?
│ ○ Local sentence-transformers embeddings
│ ○ OpenAI embeddings (recommended)
```
For Vertex AI, setup uses Google Application Default Credentials. It can read
your active `gcloud` project, list visible projects, or accept explicit
`--vertex-project` and `--vertex-location` values.
**OpenAI embeddings** use `text-embedding-3-small` (1536 dimensions) and require an `OPENAI_API_KEY`.
Setup checks the selected model before saving. Anthropic API setup fetches live
Claude model choices when possible and falls back to bundled defaults if model
discovery is unavailable.
**Local embeddings** use `all-MiniLM-L6-v2` (384 dimensions) via the KTX managed Python runtime. No API key is needed. KTX can install and start the runtime during setup; to prepare it ahead of time, run:
## Step 3: Configure embeddings
KTX uses embeddings for semantic search over semantic-layer sources, wiki
context, schema metadata, and relationship evidence.
| Backend | Default model | Notes |
|---------|---------------|-------|
| OpenAI | `text-embedding-3-small` | Recommended for hosted embeddings. Requires an OpenAI API key. |
| Local sentence-transformers | `all-MiniLM-L6-v2` | Runs through the KTX-managed Python runtime. No hosted embedding key is required. |
OpenAI setup reads `OPENAI_API_KEY` or saves a local secret file. Local
sentence-transformers setup can install and start the managed runtime during
setup. To prepare that runtime before setup, run:
```bash
ktx dev runtime install --feature local-embeddings --yes
ktx dev runtime start --feature local-embeddings
```
## Step 3: Connect a database
## Step 4: Add a database
Select one or more databases for KTX to connect to. The wizard supports
SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server, BigQuery, and Snowflake.
KTX needs at least one primary database connection before it can build database
context. The wizard supports SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server,
BigQuery, and Snowflake.
For PostgreSQL, you can enter connection details field by field or paste a connection URL:
You can usually enter connection fields interactively or provide a URL. Secret
URLs can be stored as local files under `.ktx/secrets/` or referenced with
`env:NAME` in `ktx.yaml`.
```
◆ How do you want to connect to PostgreSQL?
│ ○ Enter connection details (host, port, database, user)
│ ○ Paste a connection URL
```
After saving a connection, setup tests it and builds fast schema context:
If your URL contains credentials, KTX saves it to `.ktx/secrets/` and writes a `file:` reference in `ktx.yaml`. You can also use `env:DATABASE_URL` to reference an environment variable.
After connecting, KTX automatically runs a connection test and builds fast
schema context:
```
Testing postgres-warehouse
```text
Testing warehouse
Connection test passed
Driver: PostgreSQL - Status: ok
Building schema context for postgres-warehouse
Building schema context for warehouse
Running fast database ingest
Schema context complete for postgres-warehouse
Changes: 42 new tables
Database ready
postgres-warehouse - PostgreSQL - schema context complete
warehouse - PostgreSQL - schema context complete
```
For PostgreSQL, Snowflake, and BigQuery, the wizard can enable query-history
ingest when the warehouse history feature is available. Query history is stored
under `connections.<id>.context.queryHistory` in `ktx.yaml`.
PostgreSQL, BigQuery, and Snowflake can also enable query-history ingest. Query
history helps KTX learn common query patterns, joins, service-account filters,
and warehouse-specific usage.
## Step 4: Add context sources
## Step 5: Add context sources
Context sources let KTX ingest metadata from your existing analytics tools. This step is optional - you can skip it and add sources later.
Context sources are optional, but they make the first context layer much richer.
Setup can add:
```
◆ Which context sources should KTX ingest?
│ ◻ dbt
│ ◻ MetricFlow
│ ◻ Metabase
│ ◻ Looker
│ ◻ LookML
│ ◻ Notion
```
| Source | Typical input | What KTX learns |
|--------|---------------|-----------------|
| dbt | Local project or Git repo | Models, columns, tests, descriptions, tags |
| MetricFlow | Local project or Git repo | Semantic models, metrics, dimensions, entities |
| LookML | Local files or Git repo | Views, explores, dimensions, measures, joins |
| Looker | API URL and credentials | Explores, looks, dashboards, model metadata |
| Metabase | API URL and key | Questions, dashboards, BI database mappings |
| Notion | Integration token and crawl settings | Business docs and knowledge pages |
For **dbt**, point KTX at a local path or git URL. KTX reads your `dbt_project.yml` and schema files to extract model metadata:
Setup maps BI and source metadata back to your primary warehouse connection so
generated context points at the right tables.
```
◆ dbt source location
│ ○ Local path
│ ○ Git URL
```
You can skip this step and add sources later by rerunning `ktx setup`.
For **Metabase** and **Looker**, you provide an API URL and credentials. KTX maps BI databases to your KTX primary source connections so it knows which warehouse tables the BI metadata refers to.
## Step 6: Build context
Context sources are saved to `ktx.yaml` and built during the next step.
The context build turns configured databases and sources into local artifacts
agents can read. It runs database ingest first, then source ingest and memory
updates.
## Step 5: Build context
Fast database ingest records deterministic schema grounding. Deep ingest adds
AI-enriched descriptions, embeddings, relationship evidence, and query-history
context when configured.
This is where KTX builds agent-ready context. It uses the database context
depth saved by setup and ingests metadata from any configured context sources.
When the build finishes, setup verifies that agent-ready context exists:
```
◆ Build KTX context for agents?
│ ○ Build context now (recommended)
│ ○ Leave context unbuilt and exit setup
```
Fast database context builds deterministic schema grounding. Deep database
context also generates AI descriptions, embeddings, and relationship evidence
when those capabilities are configured.
For a small database (under 50 tables), this can take a few minutes. Larger
warehouses can take longer. Context builds run in the foreground; press
<kbd>Ctrl+C</kbd> to stop the current run and rerun `ktx setup` or `ktx ingest`
when you are ready to try again.
When the build completes, KTX verifies that agent-ready context was produced:
```
```text
KTX context is ready for agents.
Databases:
postgres-warehouse: deep context complete
warehouse: deep context complete
Context sources:
dbt-main: memory update complete
dbt_main: memory update complete
Verification:
Agent context: ready
Semantic search: ready
```
## Step 6: Install agent integration
If a foreground build is interrupted, rerun `ktx setup` or build the same target
with `ktx ingest <connectionId>`.
The final step connects KTX to your coding agent. Choose how agents should access the project:
## Step 7: Install agent integration
```
◆ How should agents use this KTX project?
│ ○ CLI tools and skills
The final setup step installs project-local rules for your coding assistant.
Supported targets are Claude Code, Codex, Cursor, OpenCode, and universal
`.agents`.
You can also run this step later:
```bash
ktx setup --agents --target codex
```
Then select which agents to install for:
Claude Code and Codex also support global installs:
```
◆ Which agent targets should KTX install?
│ ◻ Claude Code
│ ◻ Codex
│ ◻ Cursor
│ ◻ OpenCode
│ ◻ Custom agent (.agents)
```bash
ktx setup --agents --target codex --global
```
**CLI mode** writes a skill file (e.g., `.claude/skills/ktx/SKILL.md`) that teaches the agent to call KTX commands directly.
**Custom agent** uses the universal `.agents` target for agents that can read project-local skills.
Agent rules are CLI-based. They point agents at the KTX CLI path that created
the file, so agents do not need a separate `ktx` binary in `PATH`. If the CLI
path changes after reinstalling or moving a checkout, rerun `ktx setup --agents`.
## Generated files
KTX writes project state as plain files so agents can inspect and edit changes in git.
KTX writes plain files so people and agents can inspect changes in git.
| Path | Created by | Purpose |
|------|------------|---------|
| `ktx.yaml` | `ktx setup` | Main project configuration: connections, LLM settings, embeddings, and context sources |
| `.ktx/secrets/*` | `ktx setup` when file-backed secrets are selected | Local secret files referenced from `ktx.yaml`; do not commit these |
| `semantic-layer/<connection-id>/*.yaml` | context build, ingestion, or direct file edits | Semantic source definitions agents use for SQL generation |
| `wiki/global/*.md` | ingestion, memory capture, or direct file edits | Shared business context and metric definitions |
| `wiki/user/<user-id>/*.md` | memory capture or direct file edits | User-scoped notes for one agent/user context |
| `.claude/skills/ktx/SKILL.md`, `.agents/skills/ktx/SKILL.md` | CLI-mode agent integration setup | Agent instructions for calling public `ktx` commands |
| Path | Purpose |
|------|---------|
| `ktx.yaml` | Project configuration for LLMs, embeddings, connections, context sources, and setup state |
| `.ktx/secrets/*` | Local secret files referenced from `ktx.yaml`; do not commit these |
| `.ktx/setup/*` | Local setup and context-build state |
| `.ktx/agents/install-manifest.json` | Manifest used to manage installed agent files |
| `semantic-layer/<connection-id>/*.yaml` | Semantic source definitions used for SQL generation |
| `wiki/global/*.md` | Shared business context and metric definitions |
| `wiki/user/<user-id>/*.md` | User-scoped notes and local context |
| `.claude/skills/ktx/SKILL.md` | Claude Code project skill |
| `.agents/skills/ktx/SKILL.md` | Codex or universal project skill |
| `.cursor/rules/ktx.mdc` | Cursor project rule |
| `.opencode/commands/ktx.md` | OpenCode project command |
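Because `.ktx/secrets/*` must not be committed, it can help to add the directory to `.gitignore`. The sketch below is one way to do that idempotently. The function name `ignore_ktx_secrets` is an illustration and not a KTX command, and setup may already handle this for you.

```bash
# Hypothetical helper: add .ktx/secrets/ to a project's .gitignore once.
# Assumes any existing .gitignore ends with a newline.
ignore_ktx_secrets() {
  project_dir="${1:-.}"
  ignore_file="$project_dir/.gitignore"
  # -x matches the whole line, -F treats the pattern as a fixed string.
  if ! grep -qxF '.ktx/secrets/' "$ignore_file" 2>/dev/null; then
    printf '%s\n' '.ktx/secrets/' >> "$ignore_file"
  fi
}
```

Call it with the project directory, for example `ignore_ktx_secrets ./analytics`. Running it a second time leaves the file unchanged.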
## Verify it worked
## Verify setup
Check your project status:
Run:
```bash
ktx status
```
```
Example output:
```text
KTX project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (postgres-warehouse)
Context sources configured: yes (dbt-main)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
KTX context built: yes
Agent integration ready: yes (claude-code:project)
Agent integration ready: yes (codex:project)
```
Use JSON when an agent or script needs a structured readiness check:
```bash
ktx status --json
```
## Scripted setup example
Use non-interactive setup when creating repeatable fixtures or automation:
```bash
ktx setup \
--project-dir ./analytics \
--no-input \
--skip-llm \
--skip-embeddings \
--database postgres \
--new-database-connection-id warehouse \
--database-url env:DATABASE_URL \
--database-schema public
```
Then build context:
```bash
ktx ingest warehouse --fast
```
See [ktx setup](/docs/cli-reference/ktx-setup) for the full automation flag
surface.
## Common errors
| Error or symptom | Likely cause | Recovery |
|------------------|--------------|----------|
| `ktx: command not found` | The KTX package is not installed globally, or the shell cannot find the global binary | Run `npm install -g @kaelio/ktx` and open a new shell |
| LLM health check fails | Missing, invalid, or unauthorized Anthropic API key | Export `ANTHROPIC_API_KEY` or rerun `ktx setup` and choose the file-backed secret option |
| OpenAI embedding check fails | `OPENAI_API_KEY` is missing when OpenAI embeddings are selected | Export `OPENAI_API_KEY`, or rerun setup and choose local sentence-transformers embeddings |
| Local embeddings hang or fail | The managed Python runtime cannot start or the local model runtime is unavailable | Install `uv`, run `ktx dev runtime status`, then run `ktx dev runtime install --feature local-embeddings --yes` and rerun setup |
| Database connection test fails | Credentials, network access, warehouse, database, or schema value is wrong | Test the same URL with the database's native client, then rerun `ktx setup` and reconfigure the connection |
| `KTX context built: no` in `ktx status` | Setup saved configuration but did not build context | Run `ktx setup` and choose to build context now |
| Agent integration is incomplete | Setup skipped the agents step or the target was not installed | Run `ktx setup --agents --target codex` using the target you need |
| Symptom | Likely cause | Recovery |
|---------|--------------|----------|
| `ktx: command not found` | The global package is not installed or your shell cannot find it | Reinstall `@kaelio/ktx` and open a new shell |
| Setup resumes the wrong project | `KTX_PROJECT_DIR` or the nearest `ktx.yaml` points somewhere else | Pass `--project-dir <path>` |
| Anthropic health check fails | API key, model id, or access is invalid | Fix `ANTHROPIC_API_KEY` or rerun setup with a different key or model |
| Vertex AI health check fails | Vertex API, Claude access, project, location, or IAM permissions are missing | Check the project, location, Application Default Credentials, and Vertex AI permissions |
| OpenAI embeddings fail | `OPENAI_API_KEY` is missing or invalid | Export the key or choose local sentence-transformers embeddings |
| Local embeddings fail | Managed Python runtime cannot install or start | Run `ktx dev runtime status`, then install the local embeddings runtime |
| Database test fails | Credentials, network access, database, warehouse, or schema is wrong | Test the same values with the database's native client, then rerun setup |
| Context is not built | Setup saved configuration but skipped or interrupted the build | Run `ktx setup` or `ktx ingest --all` |
| Agent integration is incomplete | Setup skipped the agents step or installed a different target | Run `ktx setup --agents --target <target>` |
## Next steps
- **Build more context** - learn about [database ingest](/docs/guides/building-context), relationship detection, and source ingestion workflows in the Building Context guide.
- **Refine your semantic layer** - the [Writing Context](/docs/guides/writing-context) guide covers source YAML, measures, joins, and wiki pages.
- **Understand the architecture** - read [The Context Layer](/docs/concepts/the-context-layer) to learn why a context layer is more than a semantic layer.
- **Connect more agents** - see the [Agent Clients](/docs/integrations/agent-clients) integration page for per-tool setup details.
- Build and refresh context with [Building Context](/docs/guides/building-context).
- Edit semantic sources and wiki pages with [Writing Context](/docs/guides/writing-context).
- Connect more tools with [Agent Clients](/docs/integrations/agent-clients).
- Read [The Context Layer](/docs/concepts/the-context-layer) to understand the architecture.


@@ -1,171 +1,195 @@
---
title: Building Context
description: Build database and source context from configured KTX connections.
description: Build and refresh KTX context from databases, source tools, query history, and text.
---
Building context reads your configured connections and writes local context that
agents can use. Database connections produce schema context, and source
connections such as dbt, Looker, Metabase, and Notion produce semantic sources
and wiki pages.
Building context turns configured connections into local semantic-layer sources
and wiki pages. Agents use those files to understand your schema, business
definitions, metric logic, joins, and known caveats before they write SQL.
Use this guide after `ktx setup` has created `ktx.yaml` and at least one
database or context-source connection.
## The build loop
Most projects use this loop:
1. Check readiness with `ktx status`.
2. Build one connection with `ktx ingest <connectionId>`, or build everything
with `ktx ingest --all`.
3. Search or inspect the generated files under `semantic-layer/` and `wiki/`.
4. Edit source YAML or Markdown when business logic needs refinement.
5. Validate and query representative sources before handing the context to an
agent.
`ktx ingest --all` runs database connections first, then context-source
connections. That order lets dbt, BI, Notion, and text ingest attach context to
known warehouse tables.
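The ordering can be pictured with a small dry-run sketch. It only prints the per-connection `ktx ingest` commands in database-first order and does not run them. The function `plan_ingest_order` and its `<kind> <connection-id>` input format are assumptions made for illustration, not something KTX provides.

```bash
# Hypothetical dry-run planner: prints ingest commands, databases first.
# Reads lines of the form "<kind> <connection-id>" on stdin,
# where kind is "database" or "source".
plan_ingest_order() {
  awk '
    $1 == "database" { print "ktx ingest " $2; next }  # emit databases immediately
    $1 == "source"   { sources[++n] = $2 }             # hold sources until the end
    END { for (i = 1; i <= n; i++) print "ktx ingest " sources[i] }
  '
}

printf '%s\n' "source dbt_main" "database warehouse" | plan_ingest_order
```

Here the `warehouse` command is printed before the `dbt_main` command even though the source was listed first.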
## Database ingest
Database ingest connects to your warehouse and extracts structural metadata.
KTX stores the results locally so agents can understand your schema without
querying the database directly.
### Running database ingest
Database ingest connects to a configured warehouse and records local schema
context. It gives agents table, column, type, constraint, and row-count
grounding without requiring them to inspect the database directly.
```bash
ktx ingest <connection-id>
```
This runs a fast schema ingest by default. You can choose the depth with public
flags:
| Flag | What it does |
|------|-------------|
| `--fast` | Tables, columns, types, constraints, and row counts |
| `--deep` | Fast ingest plus AI-enriched database context |
```bash
# Build one connection quickly
ktx ingest my-postgres --fast
# Build AI-enriched database context
ktx ingest my-postgres --deep
# Build one configured database connection
ktx ingest warehouse
# Build all configured connections
ktx ingest --all
```
### Checking results
Depth controls how much context KTX builds:
Every ingest prints a summary and writes local artifacts. Use `ktx status`
after ingest to review project readiness and follow-up setup work:
| Flag | Best for | What it does |
|------|----------|--------------|
| `--fast` | First setup, quick refreshes, CI smoke checks | Deterministic schema ingest with tables, columns, types, constraints, and row counts |
| `--deep` | Agent-ready context for real analysis | Fast ingest plus AI-enriched descriptions, embeddings, relationship evidence, and optional query history |
Examples:
```bash
ktx status
ktx ingest warehouse --fast
ktx ingest warehouse --deep
ktx ingest --all --deep
```
### Relationship detection
Deep ingest needs LLM and embedding readiness. If those providers are not
configured, run `ktx setup` or use `--fast`.
Many databases lack declared foreign keys. KTX infers relationships by scoring column pairs across seven signals - name similarity, type compatibility, value overlap, embedding similarity, profile uniqueness, null rate, and structural priors. The weighted score determines each candidate's status:
## Query history
| Score range | Status | Meaning |
|-------------|--------|---------|
| &ge; 0.85 | `accepted` | High confidence - applied automatically |
| 0.55 &ndash; 0.84 | `review` | Plausible - needs human review |
| &lt; 0.55 | `rejected` | Low confidence - not applied |
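The score bands in this table can be expressed as a small shell sketch. The function `classify_relationship_score` is hypothetical and is not a KTX command. It assumes the score is already computed, since the signal weights are not given here, and it treats values between 0.84 and 0.85 as `review`.

```bash
# Hypothetical helper: map a relationship score to the status bands above.
classify_relationship_score() {
  awk -v s="$1" 'BEGIN {
    if (s >= 0.85)      print "accepted"   # high confidence
    else if (s >= 0.55) print "review"     # plausible, needs human review
    else                print "rejected"   # low confidence
  }'
}

classify_relationship_score 0.91   # accepted
classify_relationship_score 0.60   # review
classify_relationship_score 0.30   # rejected
```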
PostgreSQL, BigQuery, and Snowflake can add query-history context. This helps
KTX learn common joins, filters, service-account patterns, redaction rules, and
usage-heavy query templates.
Deep database ingest can include relationship evidence where the connector can
provide it. Relationship review and calibration subcommands are not part of the
current public CLI surface.
## Ingestion
Ingestion pulls semantic context from your existing analytics tools - dbt projects, Looker models, Metabase questions, and more - and writes it into your KTX project as semantic sources and wiki pages.
### How it works
Each ingest run follows this flow:
1. An **adapter** extracts metadata from your tool (dbt manifest, LookML files, Metabase API, etc.)
2. An **LLM agent** reconciles the extracted metadata with your existing context - it merges intelligently rather than overwriting
3. **Semantic sources** (YAML) and **wiki pages** (Markdown) are written to your project directory
### Running an ingest
Enable it during setup, store it under `connections.<id>.context.queryHistory`,
or request it for one run:
```bash
ktx ingest my-dbt-source
ktx ingest warehouse --deep --query-history
ktx ingest warehouse --query-history-window-days 30
```
Useful output flags:
Use `--no-query-history` when you want to skip a stored query-history setting
for one run.
## Relationship evidence
Many databases do not declare all foreign keys. KTX can score relationship
candidates using signals such as name similarity, type compatibility, value
overlap, embedding similarity, uniqueness, null rate, and structural priors.
The public CLI does not expose separate relationship review subcommands.
Relationship evidence is built as part of deep database ingest when the
connector and readiness checks support it.
## Context-source ingest
Context-source connections pull business metadata from tools your team already
uses. The current public `ktx ingest` command is connection-centric: pass one
configured connection id, or pass `--all`.
```bash
# Build one source connection
ktx ingest dbt_main
# Build every configured database and source connection
ktx ingest --all
```
Supported source types:
| Driver | Typical source | Output |
|--------|----------------|--------|
| `dbt` | dbt project or Git repo | Semantic sources with model, column, test, tag, and description metadata |
| `metricflow` | MetricFlow project or Git repo | Metrics, dimensions, entities, and semantic joins |
| `lookml` | LookML files or Git repo | Views, explores, dimensions, measures, and joins |
| `looker` | Looker API | Explores, looks, dashboards, and model metadata |
| `metabase` | Metabase API | Questions, dashboards, table metadata, and mappings |
| `notion` | Notion API | Wiki pages and business knowledge |
Source ingest extracts metadata, reconciles it with existing local context, and
writes semantic-layer YAML plus wiki Markdown. It merges rather than blindly
overwriting local edits.
## Text ingest
Use `ktx ingest text` for notes, Markdown files, runbooks, Slack exports, or
other free-form knowledge that should become searchable KTX memory.
```bash
# Capture a Markdown file
ktx ingest text docs/revenue-notes.md --connection-id warehouse
# Capture one stdin item
printf "Refunds are excluded from net revenue." | ktx ingest text -
# Capture direct text
ktx ingest text --text "ARR excludes one-time implementation fees."
```
Useful flags:
| Flag | Description |
|------|-------------|
| `--json` | Output as JSON |
| `--plain` | Plain text output |
| `--connection-id <connectionId>` | Attach the captured memory to a KTX connection |
| `--user-id <id>` | Attribute the capture to a user scope (default: `local-cli`) |
| `--json` | Print structured output |
| `--fail-fast` | Stop after the first failed text item |
Foreground context builds do not detach into background control sessions. If a
run is interrupted, rerun `ktx ingest <connection-id>` or `ktx ingest --all`.
Text ingest is a good fit for small, high-signal documents. For system-specific
connectors such as Notion, dbt, or Metabase, prefer configured source ingest so
KTX can preserve source metadata.
### Supported context sources
## Output and artifacts
| Driver | Source | What gets ingested |
|--------|--------|--------------------|
| `dbt` | dbt project | Model definitions, column descriptions, tests, tags |
| `metricflow` | MetricFlow semantic models | Metrics, dimensions, entities, semantic joins |
| `lookml` | LookML files | Views, explores, dimensions, measures, joins |
| `looker` | Looker API | Explores, looks, dashboard metadata |
| `metabase` | Metabase API | Questions, dashboards, table metadata |
| `notion` | Notion API | Database pages, knowledge articles |
Every ingest run prints a summary. Use `--json` when an agent or script needs a
structured plan and per-target results.
Query history is a database connection facet. Enable it with
`connections.<id>.context.queryHistory` or pass `--query-history` for a current
run. See [Context Sources](/docs/integrations/context-sources) for
driver-specific setup and auth configuration.
### What gets generated
A typical dbt ingest produces semantic sources and wiki pages in your project:
**Semantic source** (`semantic-layer/my-postgres/orders.yaml`):
```yaml title="semantic-layer/my-postgres/orders.yaml"
name: orders
table: public.orders
grain:
- order_id
columns:
- name: order_id
type: string
description: Unique order identifier
- name: customer_id
type: string
description: Foreign key to customers table
- name: order_date
type: time
role: time
description: Date the order was placed
- name: total_amount
type: number
description: Total order value in USD
measures:
- name: total_revenue
expr: SUM(total_amount)
description: Sum of all order values
- name: order_count
expr: COUNT(DISTINCT order_id)
description: Number of distinct orders
joins:
- to: customers
on: orders.customer_id = customers.customer_id
relationship: many_to_one
```bash
ktx ingest --all --json
```
**Wiki page** (`wiki/global/order-status-definitions.md`):
Typical generated files:
```markdown
---
summary: Business definitions for order status values
tags: [orders, definitions]
sl_refs: [orders]
---
| Path | Created by | Purpose |
|------|------------|---------|
| `semantic-layer/<connection-id>/*.yaml` | Database and source ingest | Queryable semantic source definitions |
| `wiki/global/*.md` | Source, text, and memory ingest | Shared business definitions and notes |
| `wiki/user/<user-id>/*.md` | Text and memory ingest | User-scoped context |
| `.ktx/setup/context-build.json` | Setup context build | Resume and readiness state for setup |
## Order Statuses
Ingest sessions also record transcripts with tool calls, LLM responses, and
write decisions. Inspect them when you need to debug why a source or wiki page
was written a certain way.
- **pending**: Order placed but not yet processed
- **confirmed**: Payment received, awaiting fulfillment
- **shipped**: Order dispatched to carrier
- **delivered**: Order received by customer
- **cancelled**: Order cancelled before shipment
## Example: first full refresh
Orders in "pending" status for more than 48 hours are flagged for review.
After interactive setup:
```bash
ktx status
ktx ingest --all --deep
ktx status
```
### Ingest transcripts
Then inspect what changed:
Every ingest session records a full transcript: tool calls, LLM responses, and
write decisions. Inspect the stored transcript files when you need to debug why
a source was written a certain way.
```bash
git status --short
ktx sl list --json
ktx wiki search "revenue" --json --limit 10
```
## Common errors
| Symptom | Likely cause | Recovery |
|---------|--------------|----------|
| Connection not configured | The connection id is missing from `ktx.yaml` | Add it with `ktx setup` |
| Deep readiness is missing | The LLM or embedding provider is not configured | Run `ktx setup`, or rerun with `--fast` |
| Query history is unsupported | The selected database driver does not expose query history | Run schema ingest without query-history flags |
| No target selected | You omitted both a connection id and `--all` | Run `ktx ingest <connectionId>` or `ktx ingest --all` |
| Source flags have no effect | Depth and query-history flags were supplied for a source connector | Use those flags only for database connections |
| Text ingest stops early | `--fail-fast` stopped on the first failed item | Fix the item or rerun without `--fail-fast` |


@@ -1,59 +1,167 @@
---
title: Serving Agents
description: Expose your context to Claude Code, Cursor, Codex, and other coding agents.
description: Expose KTX context to Claude Code, Codex, Cursor, OpenCode, and custom agents.
---
Once you've built and refined your context, expose it to coding agents through
the public KTX CLI. Claude Code, Cursor, Codex, OpenCode, and custom agent
workflows can call the same commands you use at a terminal.
KTX serves agents through the public CLI and project-local instruction files.
Agents do not need a separate server. They read the generated rules, call KTX
commands, inspect local context files, and use JSON output when they need
structured results.
## CLI Commands
## Recommended setup
KTX public commands support JSON output for the context reads that agents use
most often. Use `--project-dir` when the agent is not already running inside the
KTX project directory.
### Available commands
Run the agent install step from a KTX project:
```bash
ktx setup --agents
```
Or install a specific target:
```bash
ktx setup --agents --target codex
```
Supported targets:
| Target | Generated project file |
|--------|------------------------|
| Claude Code | `.claude/skills/ktx/SKILL.md` |
| Codex | `.agents/skills/ktx/SKILL.md` |
| Cursor | `.cursor/rules/ktx.mdc` |
| OpenCode | `.opencode/commands/ktx.md` |
| Universal `.agents` | `.agents/skills/ktx/SKILL.md` |
Claude Code and Codex also support global installs:
```bash
ktx setup --agents --target claude-code --global
ktx setup --agents --target codex --global
```
KTX records installed files in `.ktx/agents/install-manifest.json`. Rerun
`ktx setup --agents` after moving a checkout or reinstalling the CLI so the
generated instructions point at the current CLI path.
## Agent command set
All supported agent clients use the same command surface. Use `--project-dir`
when the agent is running outside the KTX project directory.
### Readiness
```bash
# Check setup and context readiness
ktx status --json
```
**Semantic layer:**
Agents should run this before relying on context. It reports project, LLM,
embedding, database, context-source, context-build, and agent-integration
readiness.
### Semantic layer discovery
```bash
# List sources
ktx sl list --json
ktx sl list --json --connection-id my-postgres
ktx sl search "revenue" --json
ktx sl list --connection-id warehouse --json
ktx sl search "revenue" --json --limit 10
```
# Run a query from a JSON file
ktx sl query --json \
--connection-id my-postgres \
--query-file query.json \
Agents use these commands to discover source names, connection ids, measures,
dimensions, and likely files to inspect.
### Semantic-layer validation and queries
```bash
ktx sl validate orders --connection-id warehouse
```
Compile SQL before executing:
```bash
ktx sl query \
--connection-id warehouse \
--measure orders.total_revenue \
--dimension orders.created_date \
--format sql
```
Execute only when the task calls for live data:
```bash
ktx sl query \
--connection-id warehouse \
--measure orders.total_revenue \
--dimension orders.status \
--execute \
--max-rows 100
```
**Wiki:**
For complex calls, agents can write a JSON query object and pass it with
`--query-file`.
### Wiki context
```bash
# Search wiki pages
ktx wiki list --json
ktx wiki search "revenue recognition" --json --limit 10
```
## Setting Up Your Agent
Agents should search wiki context when a question depends on business
definitions, metric caveats, process rules, or terms that are not obvious from
schema names.
The fastest way to connect an agent is through the setup wizard:
### Context refresh
Agents can refresh context when the user asks them to:
```bash
ktx setup
ktx ingest warehouse --fast
ktx ingest --all
ktx ingest text docs/revenue-notes.md --connection-id warehouse
```
The agents step auto-detects installed tools and generates the right
configuration. For manual setup or per-tool details, see the
[Agent Clients](/docs/integrations/agent-clients) integration page.
Use `--deep` only when LLM and embedding setup is ready and the user expects an
AI-enriched refresh.
After configuration, the agent can immediately call KTX commands to list
sources, search wiki pages, and query your semantic layer.
## Good agent behavior
Agents should:
- Run `ktx status --json` before using KTX context.
- Use `ktx sl search` and `ktx wiki search` before writing SQL from memory.
- Inspect the relevant YAML or Markdown files after search returns candidates.
- Compile SQL with `ktx sl query --format sql` before executing.
- Use `--max-rows` whenever executing a live query.
- Validate edited semantic sources with `ktx sl validate`.
- Keep generated context changes reviewable in git.
Agents should not assume a background server, ORPC route, frontend app, or
external migration system exists. KTX is a local context layer with a CLI and
plain project files.
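The execution rules in the list above can be enforced mechanically. As a minimal sketch, assuming only the `--execute` and `--max-rows` flags shown in this guide, the wrapper below refuses to run a live query without a row cap. The function name `ktx_guarded_query` and the `KTX_BIN` override are hypothetical and not part of KTX.

```bash
# Hypothetical helper, not part of KTX: block uncapped live queries.
# KTX_BIN is an assumed override so the wrapper can be dry-run with `echo`.
ktx_guarded_query() {
  local has_execute=0 has_max_rows=0 arg
  for arg in "$@"; do
    case "$arg" in
      --execute) has_execute=1 ;;
      --max-rows) has_max_rows=1 ;;
    esac
  done
  if [ "$has_execute" -eq 1 ] && [ "$has_max_rows" -eq 0 ]; then
    echo "refusing to run: --execute requires --max-rows" >&2
    return 2
  fi
  # Compile-only calls (no --execute) pass through unchanged.
  "${KTX_BIN:-ktx}" sl query "$@"
}
```

With `KTX_BIN=echo`, the wrapper prints the arguments it would pass instead of running them, which is a cheap way to check what an agent is about to execute.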
## Manual setup
Manual setup is useful for custom agents that can read project-local
instructions but are not yet a named target.
1. Install the universal target:
```bash
ktx setup --agents --target universal
```
2. Configure the agent to read `.agents/skills/ktx/SKILL.md`.
3. Open the agent in the KTX project directory.
4. Ask it to run `ktx status --json` and summarize readiness.
For per-client notes, see [Agent Clients](/docs/integrations/agent-clients).
## Troubleshooting
| Symptom | Likely cause | Recovery |
|---------|--------------|----------|
| Agent says KTX is unavailable | Agent did not load the generated instruction file | Rerun `ktx setup --agents --target <target>` and restart the agent session |
| Agent command cannot find the project | Agent is running outside the KTX directory | Add `--project-dir <path>` or open the agent in the project root |
| Generated rules point at a missing CLI path | CLI was moved, rebuilt, or reinstalled | Rerun `ktx setup --agents` |
| Agent cannot find a metric | Context is missing or stale | Run `ktx sl search`, inspect source YAML, then refresh with `ktx ingest` if needed |
| Agent query returns too many rows | The command executed without a result cap | Require `--max-rows` for executed queries |
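
For the "cannot find the project" row above, one way to derive a value for `--project-dir` is to walk up from the agent's working directory until a `ktx.yaml` is found. This is a sketch under the assumption that `ktx.yaml` sits at the project root. The helper name is hypothetical and not part of KTX.

```bash
# Hypothetical helper, not part of KTX: print the nearest ancestor
# directory (starting at $1, default ".") that contains ktx.yaml.
find_ktx_project_root() {
  local dir
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    if [ -f "$dir/ktx.yaml" ]; then
      echo "$dir"
      return 0
    fi
    if [ "$dir" = "/" ]; then
      break
    fi
    dir=$(dirname "$dir")
  done
  echo "no ktx.yaml found at or above ${1:-.}" >&2
  return 1
}
```

An agent could then pass the result along, for example `ktx status --json --project-dir "$(find_ktx_project_root)"`. The flag placement in that call is an assumption.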


@@ -1,295 +1,341 @@
---
title: Writing Context
description: Write and refine semantic sources and wiki pages.
description: Edit semantic sources and wiki pages so agents use your business logic.
---
After building context through scanning and ingestion, you'll want to refine it - edit semantic sources to match your business logic, add wiki pages that capture tribal knowledge, and query your data through the semantic layer to verify everything works.
KTX context is meant to be edited. Ingest gives you a grounded first draft, then
you refine source YAML and wiki Markdown until agents can answer data questions
with the same definitions your team uses.
## Agent workflow summary
Use this guide when you are adding measures, fixing joins, documenting business
rules, or reviewing context changes made by an agent.
Agents should refine context in this order:
## Editing workflow
1. `ktx sl list --json` - discover available sources and connection ids.
2. `ktx sl search <query> --json` - find source candidates for a concept.
3. Edit the source YAML directly in `semantic-layer/<connection-id>/`.
4. `ktx sl validate <source> --connection-id <id>` - verify columns, joins, and table references.
5. `ktx sl query ... --format sql` - compile a representative query without executing it.
6. `ktx wiki search ...` - check business context captured by ingest or memory.
Use this order for most context changes:
## Semantic Sources
1. Discover existing context.
Semantic sources are YAML files that describe your tables, columns, measures, and joins. They're the core of the context layer - the structured definitions that agents use to generate correct SQL.
```bash
ktx sl list --json
ktx sl search "revenue" --json
ktx wiki search "revenue recognition" --json --limit 10
```
### Listing sources
2. Edit the smallest relevant files under `semantic-layer/<connection-id>/` or
`wiki/`.
3. Validate semantic source changes.
```bash
# List all sources across connections
ktx sl list
```bash
ktx sl validate orders --connection-id warehouse
```
# List sources for a specific connection
ktx sl list --connection-id my-postgres
4. Compile a representative query before executing it.
# Output as JSON
ktx sl list --json
```bash
ktx sl query \
--connection-id warehouse \
--measure orders.total_revenue \
--dimension orders.created_date \
--format sql
```
5. Search again using likely user wording to confirm the new context is
discoverable.
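
Steps 3 and 4 are easy to chain so that a query is compiled only after validation passes. The sketch below uses only the commands and flags shown in this guide. The function name `check_source` and the `KTX_BIN` override are hypothetical, and it assumes `ktx sl validate` exits nonzero when validation fails.

```bash
# Hypothetical helper, not part of KTX: validate a source, then compile
# (not execute) one representative query against it.
# KTX_BIN is an assumed override so the steps can be dry-run with `echo`.
check_source() {
  local src="$1" conn="$2" measure="$3" dimension="$4"
  local ktx="${KTX_BIN:-ktx}"
  # Assumption: a failed validation returns a nonzero exit status.
  "$ktx" sl validate "$src" --connection-id "$conn" || return 1
  "$ktx" sl query \
    --connection-id "$conn" \
    --measure "$src.$measure" \
    --dimension "$src.$dimension" \
    --format sql
}
```

For example, `check_source orders warehouse total_revenue created_date` runs the same two commands as steps 3 and 4 and stops before compiling if validation fails.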
## Semantic sources
Semantic sources are YAML files that describe queryable entities. A source is
usually a table, but it can also point at a custom SQL expression. Sources
define the vocabulary agents use for measures, dimensions, segments, joins, and
grain-aware query planning.
Source files live at:
```text
semantic-layer/<connection-id>/<source-name>.yaml
```
### Searching sources
```bash
ktx sl search "revenue" --connection-id my-postgres --json
```
Search returns ranked source summaries. To inspect or edit a source, open the
YAML file under `semantic-layer/<connection-id>/`.
### The source schema
A semantic source defines a single queryable entity - usually a table or a SQL expression. Here's a fully annotated example:
### Minimal source
```yaml
name: orders
description: Customer orders with line-item totals
table: public.orders # or use `sql:` for a custom SQL expression
description: Customer orders with booked revenue.
table: public.orders
grain:
- order_id # columns that uniquely identify a row
- order_id
columns:
- name: order_id
type: string
description: Unique order identifier.
- name: order_date
type: time
role: time
description: Date the order was placed.
- name: total_amount
type: number
description: Booked order value in USD.
measures:
- name: total_revenue
expr: SUM(total_amount)
description: Sum of booked order value before refunds.
```
### Full source shape
```yaml
name: orders
description: Customer orders with line-item totals.
table: public.orders
grain:
- order_id
columns:
- name: order_id
type: string # string | number | time | boolean
description: Unique order identifier
type: string
description: Unique order identifier.
- name: order_date
type: time
role: time # marks this as the default time dimension
description: Date the order was placed
role: time
description: Date the order was placed.
- name: status
type: string
visibility: public # public (default) | internal | hidden
description: Current order status
visibility: public
description: Current order status.
- name: _etl_loaded_at
type: time
visibility: hidden # hidden columns are excluded from agent queries
description: Internal ETL timestamp
visibility: hidden
description: Internal load timestamp.
- name: total_amount
type: number
description: Order total in USD
description: Order total in USD.
measures:
- name: total_revenue
expr: SUM(total_amount)
description: Sum of all order values
description: Sum of all order values.
- name: order_count
expr: COUNT(DISTINCT order_id)
description: Number of distinct orders
description: Number of distinct orders.
- name: avg_order_value
expr: AVG(total_amount)
description: Average order value
description: Average booked order value.
- name: high_value_revenue
expr: SUM(total_amount)
filter: total_amount > 100
description: Revenue from orders over $100
description: Revenue from orders over $100.
segments:
- name: us_orders
expr: country = 'US'
description: Orders from US customers
- name: completed_orders
expr: status = 'completed'
description: Orders that completed fulfillment.
joins:
- to: customers
on: orders.customer_id = customers.customer_id
relationship: many_to_one # many_to_one | one_to_many | one_to_one
relationship: many_to_one
- to: order_items
on: orders.order_id = order_items.order_id
relationship: one_to_many
alias: items # optional alias for the joined source
alias: items
```
Key fields:
### Source fields
| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Source identifier (lowercase, underscores) |
| `table` or `sql` | Yes | Database table or custom SQL expression (exactly one) |
| `grain` | Yes | Columns that define row uniqueness |
| `columns` | No | Column definitions with type, role, visibility |
| `measures` | No | Aggregation expressions (SUM, COUNT, AVG, etc.) |
| `joins` | No | Relationships to other sources |
| `segments` | No | Named filter conditions |
| `inherits_columns_from` | No | Inherit column metadata from a manifest entry |
| `name` | Yes | Source identifier. Use lowercase words and underscores. |
| `table` or `sql` | Yes | Database table or custom SQL expression. Use exactly one. |
| `grain` | Yes | Columns that uniquely identify a row at the source grain. |
| `columns` | No | Column definitions with type, role, visibility, and descriptions. |
| `measures` | No | Aggregation expressions such as `SUM`, `COUNT`, and `AVG`. |
| `segments` | No | Named predicates agents can reuse. |
| `joins` | No | Relationships to other semantic sources. |
| `inherits_columns_from` | No | Inherit column metadata from a manifest entry. |
Source component fields:
### Component fields
| Component | Field | Required | Description |
|-----------|-------|----------|-------------|
| Column | `name` | Yes | Column identifier as used in SQL expressions |
| Column | `type` | Yes | Agent-facing type: `string`, `number`, `time`, or `boolean` |
| Column | `role` | No | Special role such as `time` for default time dimensions |
| Column | `visibility` | No | `public`, `internal`, or `hidden` |
| Column | `description` | Strongly recommended | Human-readable business meaning |
| Measure | `name` | Yes | Queryable metric name |
| Measure | `expr` | Yes | SQL aggregation expression at the source grain |
| Measure | `filter` | No | SQL predicate applied only to this measure |
| Measure | `description` | Strongly recommended | Definition agents can cite and compare |
| Segment | `name` | Yes | Reusable filter name |
| Segment | `expr` | Yes | SQL predicate for the segment |
| Join | `to` | Yes | Target semantic source name |
| Join | `on` | Yes | SQL join condition using source names or aliases |
| Join | `relationship` | Yes | `many_to_one`, `one_to_many`, or `one_to_one` |
| Join | `alias` | No | Query alias for repeated or clearer joins |
| Column | `name` | Yes | Column identifier used in SQL expressions. |
| Column | `type` | Yes | Agent-facing type: `string`, `number`, `time`, or `boolean`. |
| Column | `role` | No | Special role such as `time` for default time dimensions. |
| Column | `visibility` | No | `public`, `internal`, or `hidden`. |
| Column | `description` | Strongly recommended | Business meaning and usage notes. |
| Measure | `name` | Yes | Queryable metric name. |
| Measure | `expr` | Yes | SQL aggregation expression at the source grain. |
| Measure | `filter` | No | SQL predicate applied only to this measure. |
| Measure | `description` | Strongly recommended | Definition agents can cite and compare. |
| Segment | `name` | Yes | Reusable filter name. |
| Segment | `expr` | Yes | SQL predicate for the segment. |
| Join | `to` | Yes | Target semantic source name. |
| Join | `on` | Yes | SQL join condition using source names or aliases. |
| Join | `relationship` | Yes | `many_to_one`, `one_to_many`, or `one_to_one`. |
| Join | `alias` | No | Query alias for repeated or clearer joins. |
Column visibility controls what agents see:
### Visibility
| Visibility | Behavior |
|------------|----------|
| `public` | Included in agent queries and listings (default) |
| `internal` | Available for joins and measures but not shown to agents |
| `hidden` | Excluded entirely - useful for ETL columns |
| Visibility | Agent behavior |
|------------|----------------|
| `public` | Included in listings and available for agent queries. |
| `internal` | Available for joins and measures, but not highlighted to agents. |
| `hidden` | Excluded from agent-facing context. Use for ETL fields and sensitive internals. |
### Editing a source
## Measures
Edit source files directly. They live at
`semantic-layer/<connection-id>/<source-name>.yaml` in your project directory.
Good measures have precise names, SQL expressions at the correct grain, and
descriptions that say what is included and excluded.
### Validating sources
Validation checks a source definition against the actual database schema:
```bash
ktx sl validate orders --connection-id my-postgres
```yaml
measures:
- name: net_revenue
expr: SUM(total_amount - refunded_amount)
filter: status = 'completed'
description: Completed order revenue after refunds, excluding cancelled orders.
```
This catches mismatches - columns that don't exist in the table, type mismatches, invalid join targets - before an agent tries to use the source.
Prefer one canonical measure plus wiki synonyms over several nearly identical
measures. If your team uses multiple definitions, document the distinction in a
wiki page and link it with `sl_refs`.
### Querying
## Joins and grain
The semantic layer compiles your measures and dimensions into SQL, optionally executing it against the database:
`grain` and `relationship` prevent agents from producing double-counted SQL.
State the row grain even when it seems obvious.
```yaml
grain:
- order_id
joins:
- to: customers
on: orders.customer_id = customers.customer_id
relationship: many_to_one
```
Use `many_to_one` for dimensions such as customer, account, product, or plan.
Use `one_to_many` only when the target can fan out the source rows, such as
orders to order items.
## Validate and query
Validation checks source YAML against the live database schema:
```bash
ktx sl validate orders --connection-id warehouse
```
It catches missing columns, invalid join targets, and table-reference problems
before an agent relies on the source.
Compile a query to inspect generated SQL:
```bash
# Compile a query to SQL
ktx sl query \
--connection-id my-postgres \
--measure total_revenue \
--measure order_count \
--dimension "order_date" \
--filter "status = 'completed'" \
--order-by order_date:desc \
--connection-id warehouse \
--measure orders.total_revenue \
--dimension orders.order_date \
--filter "orders.status = 'completed'" \
--order-by orders.order_date:desc \
--limit 10 \
--format sql
```
This outputs the compiled SQL without executing it. To run the query:
Execute only when you need live rows:
```bash
# Execute and return results
ktx sl query \
--connection-id my-postgres \
--measure total_revenue \
--dimension "order_date" \
--connection-id warehouse \
--measure orders.total_revenue \
--dimension orders.status \
--execute \
--max-rows 100
```
Query flags:
## Wiki pages
| Flag | Description |
|------|-------------|
| `--measure <name>` | Measure to query (repeatable, at least one required) |
| `--dimension <name>` | Dimension to group by (repeatable) |
| `--filter <expr>` | Filter expression (repeatable) |
| `--segment <name>` | Named segment to apply (repeatable) |
| `--order-by <field[:dir]>` | Sort field, optionally with `:asc` or `:desc` (repeatable) |
| `--limit <n>` | Maximum rows in the compiled query |
| `--format <mode>` | Output format: `json` (default) or `sql` |
| `--execute` | Execute the query against the database |
| `--max-rows <n>` | Maximum rows to return when executing |
| `--include-empty` | Include empty/null rows in results |
Wiki pages capture business context that does not belong in a single source
file: metric policies, dashboard caveats, company vocabulary, data freshness,
known issues, and source-of-truth notes.
The query planner is grain-aware - it understands the cardinality of joins and avoids chasm traps (double-counting caused by many-to-many fan-outs). When you query measures that span multiple sources, KTX generates sub-queries at the correct grain before joining.
Wiki files live under:
### Workflow: edit and validate a source
1. Open `semantic-layer/my-postgres/orders.yaml`.
2. Edit the file to add columns, measures, joins, or descriptions.
3. `ktx sl validate orders --connection-id my-postgres` - check the definition against the live schema.
4. `ktx sl query --connection-id my-postgres --measure total_revenue --dimension order_date --format sql` - compile a representative query.
If validation fails, fix the YAML before asking an agent to use the source. Common validation failures are missing columns, invalid join targets, and measure expressions that reference fields outside the source.
## Wiki Pages
Wiki pages are Markdown files that capture business context - definitions, rules, gotchas, and anything an agent needs to understand beyond what the schema tells it.
### What they are
When an agent asks "what counts as an active user?" or "why do revenue numbers differ between the dashboard and the SQL query?", the answer isn't in the schema. It's tribal knowledge that lives in Slack threads, Notion pages, or someone's head. Wiki pages make that context searchable and available to agents.
### Organization
Wiki pages are organized by scope:
```
```text
wiki/
├── global/ # Cross-cutting definitions
│ ├── order-status-definitions.md
│ ├── revenue-recognition-rules.md
│ └── data-freshness-sla.md
└── user/
└── local/ # User-scoped context
├── schema-conventions.md
└── known-data-issues.md
global/
user/<user-id>/
```
- **Global pages** apply across all connections - business definitions, metric standards, company terminology.
- **User-scoped pages** are private to a user ID - personal notes, local gotchas, or context you do not want shared globally.
Use global pages for shared business rules. Use user-scoped pages for local
notes, personal conventions, or context that should not be shared broadly.
### Editing pages
### Wiki page example
Create and edit wiki pages directly as Markdown files in the `wiki/`
directory. Ingest and memory capture also create these pages automatically.
```markdown
---
summary: Revenue recognition rules for finance reporting.
tags: [revenue, finance, reporting]
sl_refs: [orders]
external_refs:
- type: notion
id: finance-revenue-policy
---
Wiki page fields:
## Recognized Revenue
Recognized revenue includes completed orders after refunds. It excludes
cancelled orders, test orders, implementation fees, and tax.
Finance reporting uses order completion date, not invoice creation date.
```
Useful frontmatter:
| Field | Required | Description |
|-------|----------|-------------|
| Key | Yes | Stable page identifier used as the Markdown filename |
| Summary | Yes | Short text shown in search results |
| Content | Yes | Full Markdown business context |
| Scope | No | `global` for shared context or `user` for user-scoped notes |
| Tags | No | Search and organization labels |
| External refs | No | Links or identifiers for source-of-truth systems |
| Semantic-layer refs | No | Source names the page explains or constrains |
| `summary` | Yes | Short text shown in search results. |
| `tags` | No | Business terms and synonyms that improve search. |
| `sl_refs` | No | Semantic source names the page explains or constrains. |
| `external_refs` | No | Source-of-truth system links or ids. |
### Listing pages
## Add searchable business context
1. Search first.
```bash
ktx wiki search "active customer definition" --json --limit 10
```
2. If no page covers the rule, create or edit a Markdown file under
`wiki/global/`.
3. Write a compact `summary` with the wording users are likely to ask.
4. Add tags for synonyms and related business areas.
5. Add `sl_refs` for relevant semantic sources.
6. Search again with a user-like phrase.
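
Steps 2 through 5 can be sketched as a small scaffold that writes a page with the `summary`, `tags`, and `sl_refs` frontmatter described above. It only writes a plain Markdown file and does not call KTX. The function name, its arguments, the `WIKI_ROOT` override, and the placeholder body are all assumptions for illustration.

```bash
# Hypothetical helper, not part of KTX: scaffold a global wiki page.
# Usage: new_wiki_page <slug> <summary> <comma-separated tags> <comma-separated sl_refs>
# WIKI_ROOT is an assumed override for the wiki directory (default: wiki).
new_wiki_page() {
  local slug="$1" summary="$2" tags="$3" refs="$4"
  local dir="${WIKI_ROOT:-wiki}/global"
  local file="$dir/$slug.md"
  # Never overwrite an existing page; edit it by hand instead.
  if [ -e "$file" ]; then
    echo "page already exists: $file" >&2
    return 1
  fi
  mkdir -p "$dir"
  cat > "$file" <<EOF
---
summary: $summary
tags: [$tags]
sl_refs: [$refs]
---

## TODO: page title

TODO: state the rule, what it includes, and what it excludes.
EOF
  echo "$file"
}
```

The values are inserted verbatim, so quote the summary yourself if it contains YAML-special characters such as a colon. After filling in the body, rerun the search from step 6 to confirm the page is discoverable.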
## Review context changes
Before accepting agent-written context:
```bash
ktx wiki list
git diff -- semantic-layer wiki
ktx sl validate orders --connection-id warehouse
ktx sl search "revenue" --json
ktx wiki search "revenue recognition" --json --limit 10
```
### Searching
```bash
ktx wiki search "revenue recognition"
```
Search uses both full-text matching and semantic similarity - it finds relevant pages even when the exact terms don't match. Agents call this automatically when they need business context to answer a question.
### Workflow: add searchable business context
1. Search first: `ktx wiki search "order status definitions"`.
2. If no page already covers the rule, create or edit a Markdown file under `wiki/global/`.
3. Include concise frontmatter; agents see the summary before loading full content.
4. Add `tags` values for the business area and `sl_refs` values for related semantic sources.
5. Search again with the user's likely wording to confirm the page is discoverable.
Check that definitions are specific, hidden columns stay hidden, joins have
explicit relationships, and measures compile into the expected SQL.
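
The "hidden columns stay hidden" check can be partly automated by scanning the diff for removed `visibility: hidden` lines. This is a rough sketch, not a KTX feature: the function name is hypothetical, and it assumes the review diff is in unified format, as produced by `git diff`.

```bash
# Hypothetical review aid, not part of KTX: read a unified diff on stdin,
# print removed lines that contained `visibility: hidden`, and return
# nonzero if any were found.
removed_hidden_markers() {
  # '^-' selects removed lines; the optional group skips '---' file headers.
  if grep -E '^-([^-].*)?visibility:[[:space:]]*hidden'; then
    return 1
  fi
  return 0
}
```

A typical call is `git diff -- semantic-layer | removed_hidden_markers`. A match is a prompt to look closer rather than proof of a problem, since a column may have been renamed or deleted on purpose.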
## Common errors
| Error or symptom | Likely cause | Recovery |
|------------------|--------------|----------|
| `ktx sl validate` reports a missing column | YAML references a column that is absent from the scanned table | Run a fresh scan or update the YAML to match the warehouse schema |
| Query compilation double-counts a measure | Join relationship or grain is missing or wrong | Add `grain` and explicit `relationship` values, then validate and recompile |
| Agent cannot find a metric | Measure name or description does not match business terminology | Add a measure description and a wiki page with common synonyms |
| Wiki search misses a page | Summary and tags do not include likely user wording | Rewrite the summary and add relevant tags, then search again |
| Semantic-layer changes are hard to review | The YAML edit is too large or unfocused | Split the change into smaller source-file edits, then review the git diff |
| Symptom | Likely cause | Recovery |
|---------|--------------|----------|
| `ktx sl validate` reports a missing column | YAML references a column absent from the scanned table | Refresh database context or update the YAML |
| Query compilation double-counts a measure | `grain` or join `relationship` is missing or wrong | Add explicit grain and relationship values, then recompile |
| Agent cannot find a metric | Measure name and description do not match business terminology | Add a clearer measure description and a wiki page with synonyms |
| Wiki search misses a page | Summary, tags, or content do not match user wording | Rewrite the summary and add likely synonyms |
| Context diff is hard to review | One edit changed too many concepts | Split the change into focused source and wiki edits |