mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
---
title: Building Context
description: Build and refresh KTX context from databases, source tools, query history, and text.
---

Building context turns configured connections into local semantic-layer sources
and wiki pages. Agents use those files to understand your schema, business
definitions, metric logic, joins, and known caveats before they write SQL.

Use this guide after `ktx setup` has created `ktx.yaml` and at least one
database or context-source connection.

## The build loop

Most projects use this loop:

1. Check readiness with `ktx status`.
2. Build one connection with `ktx ingest <connectionId>`, or build everything
   with `ktx ingest --all`.
3. Search or inspect the generated files under `semantic-layer/` and `wiki/`.
4. Edit source YAML or Markdown when business logic needs refinement.
5. Validate and query representative sources before handing the context to an
   agent.

`ktx ingest --all` runs database connections first, then context-source
connections. That order lets dbt, BI, Notion, and text ingest attach context to
known warehouse tables.

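The first three steps of the loop can be sketched as a small shell function. This is a minimal sketch, not part of KTX: the name `ktx_refresh` is hypothetical, and it assumes that `ktx` and `git` are on `PATH`, that the project is a Git repository, and that `ktx` exits non-zero when a step fails. It uses only commands that appear in this guide.

```bash
# Hypothetical helper: ktx_refresh is not a KTX command.
# Assumption: ktx exits non-zero when a step fails.
ktx_refresh() {
  local target="${1:---all}"   # a connection id, or --all when omitted

  ktx status || return 1             # 1. check readiness
  ktx ingest "$target" || return 1   # 2. build one connection or everything
  # 3. list generated or changed files so they can be reviewed
  git status --short -- semantic-layer wiki
}
```

Call `ktx_refresh warehouse` to rebuild one connection, or `ktx_refresh` with no argument to rebuild everything. Editing and validation (steps 4 and 5) stay manual.
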
## Database ingest

Database ingest connects to a configured warehouse and records local schema
context. It gives agents table, column, type, constraint, and row-count
grounding without requiring them to inspect the database directly.

```bash
# Build one configured database connection
ktx ingest warehouse

# Build all configured connections
ktx ingest --all
```

Depth controls how much context KTX builds:

| Flag | Best for | What it does |
|------|----------|--------------|
| `--fast` | First setup, quick refreshes, CI smoke checks | Deterministic schema ingest with tables, columns, types, constraints, and row counts |
| `--deep` | Agent-ready context for real analysis | Fast ingest plus AI-enriched descriptions, embeddings, relationship evidence, and optional query history |

Examples:

```bash
ktx ingest warehouse --fast
ktx ingest warehouse --deep
ktx ingest --all --deep
```

Deep ingest needs LLM and embedding readiness. If those providers are not
configured, run `ktx setup` or use `--fast`.

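In a script, the fallback from `--deep` to `--fast` can be automated. The sketch below assumes that `ktx ingest <connectionId> --deep` exits non-zero when deep readiness is missing. This guide does not confirm that behavior, and the function name `ingest_with_fallback` is hypothetical.

```bash
# Hypothetical helper: ingest_with_fallback is not a KTX command.
# Assumption: a deep ingest that cannot run exits non-zero.
ingest_with_fallback() {
  local connection="$1"

  if ktx ingest "$connection" --deep; then
    echo "deep ingest finished for $connection"
  else
    echo "deep ingest failed for $connection; retrying with --fast" >&2
    ktx ingest "$connection" --fast
  fi
}
```

Because any deep failure triggers the retry, not only missing readiness, check the printed summary before relying on the result.
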
## Query history

PostgreSQL, BigQuery, and Snowflake can add query-history context. This helps
KTX learn common joins, filters, service-account patterns, redaction rules, and
usage-heavy query templates.

Enable it during setup, store it under `connections.<id>.context.queryHistory`,
or request it for one run:

```bash
ktx ingest warehouse --deep --query-history
ktx ingest warehouse --query-history-window-days 30
```

Use `--no-query-history` when you want to skip a stored query-history setting
for one run.

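The three per-run choices (force query history on, force it off, or defer to the stored setting) can be wrapped in one function. This is only a sketch: `ingest_history` and its mode names are hypothetical, and combining `--no-query-history` with `--deep` is an assumption, since this guide shows that flag only on its own.

```bash
# Hypothetical helper: ingest_history is not a KTX command.
# Modes: on (force), off (skip for this run), stored (use the saved setting).
ingest_history() {
  local connection="$1" mode="${2:-stored}"

  case "$mode" in
    on)     ktx ingest "$connection" --deep --query-history ;;
    # Assumption: --no-query-history can be combined with --deep.
    off)    ktx ingest "$connection" --deep --no-query-history ;;
    stored) ktx ingest "$connection" --deep ;;
    *)      echo "unknown mode: $mode" >&2; return 2 ;;
  esac
}
```

For example, `ingest_history warehouse off` runs a deep ingest that ignores a stored query-history setting for that run only.
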
## Relationship evidence

Many databases do not declare all foreign keys. KTX can score relationship
candidates using signals such as name similarity, type compatibility, value
overlap, embedding similarity, uniqueness, null rate, and structural priors.

The public CLI does not expose separate relationship review subcommands.
Relationship evidence is built as part of deep database ingest when the
connector and readiness checks support it.

## Context-source ingest

Context-source connections pull business metadata from tools your team already
uses. The current public `ktx ingest` command is connection-centric: pass one
configured connection id, or pass `--all`.

```bash
# Build one source connection
ktx ingest dbt_main

# Build every configured database and source connection
ktx ingest --all
```

Supported source types:

| Driver | Typical source | Output |
|--------|----------------|--------|
| `dbt` | dbt project or Git repo | Semantic sources with model, column, test, tag, and description metadata |
| `metricflow` | MetricFlow project or Git repo | Metrics, dimensions, entities, and semantic joins |
| `lookml` | LookML files or Git repo | Views, explores, dimensions, measures, and joins |
| `looker` | Looker API | Explores, looks, dashboards, and model metadata |
| `metabase` | Metabase API | Questions, dashboards, table metadata, and mappings |
| `notion` | Notion API | Wiki pages and business knowledge |

Source ingest extracts metadata, reconciles it with existing local context, and
writes semantic-layer YAML plus wiki Markdown. It merges rather than blindly
overwriting local edits.

## Text ingest

Use `ktx ingest text` for notes, Markdown files, runbooks, Slack exports, or
other free-form knowledge that should become searchable KTX memory.

```bash
# Capture a Markdown file
ktx ingest text docs/revenue-notes.md --connection-id warehouse

# Capture one stdin item
printf "Refunds are excluded from net revenue." | ktx ingest text -

# Capture direct text
ktx ingest text --text "ARR excludes one-time implementation fees."
```

Useful flags:

| Flag | Description |
|------|-------------|
| `--connection-id <connectionId>` | Attach the captured memory to a KTX connection |
| `--user-id <id>` | Attribute the capture to a user scope (default: `local-cli`) |
| `--json` | Print structured output |
| `--fail-fast` | Stop after the first failed text item |

Text ingest is a good fit for small, high-signal documents. For systems that
have a dedicated connector, such as Notion, dbt, or Metabase, prefer configured
source ingest so KTX can preserve source metadata.

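To capture a small folder of notes, one option is to call `ktx ingest text` once per file. The sketch below is an illustration under stated assumptions: `ingest_notes` is a hypothetical helper, and it assumes `ktx ingest text` exits non-zero when a file fails. It passes one file per call because this guide does not show whether several paths can be given at once.

```bash
# Hypothetical helper: ingest_notes is not a KTX command.
# Assumption: ktx ingest text exits non-zero when a file fails.
ingest_notes() {
  local dir="$1" connection="$2" file failed=0

  for file in "$dir"/*.md; do
    [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
    if ! ktx ingest text "$file" --connection-id "$connection"; then
      echo "failed: $file" >&2
      failed=$((failed + 1))
    fi
  done

  [ "$failed" -eq 0 ]   # non-zero exit status when any file failed
}
```

For example, `ingest_notes docs/notes warehouse` attaches every Markdown file in `docs/notes` to the `warehouse` connection and reports each file that failed.
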
## Output and artifacts

Every ingest run prints a summary. Use `--json` when an agent or script needs a
structured plan and per-target results.

```bash
ktx ingest --all --json
```

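A script will often want to keep the structured output for later inspection. The following sketch saves it to a file and preserves the exit status. It assumes that `--json` output is written to stdout and that a failed run exits non-zero; neither is confirmed by this guide. The helper name `ingest_report` and the default file name are hypothetical, and the sketch makes no assumption about the JSON schema.

```bash
# Hypothetical helper: ingest_report is not a KTX command.
# Assumptions: JSON goes to stdout; a failed run exits non-zero.
ingest_report() {
  local out_file="${1:-ingest-report.json}"

  # Keep stdout (the JSON) in the file; stderr still reaches the terminal.
  if ktx ingest --all --json > "$out_file"; then
    echo "ingest succeeded; report saved to $out_file"
  else
    local rc=$?
    echo "ingest failed with exit status $rc; see $out_file" >&2
    return "$rc"
  fi
}
```
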
Typical generated files:

| Path | Created by | Purpose |
|------|------------|---------|
| `semantic-layer/<connection-id>/*.yaml` | Database and source ingest | Queryable semantic source definitions |
| `wiki/global/*.md` | Source, text, and memory ingest | Shared business definitions and notes |
| `wiki/user/<user-id>/*.md` | Text and memory ingest | User-scoped context |
| `.ktx/setup/context-build.json` | Setup context build | Resume and readiness state for setup |

Ingest sessions also record transcripts with tool calls, LLM responses, and
write decisions. Inspect them when you need to debug why a source or wiki page
was written a certain way.

## Example: first full refresh

After interactive setup:

```bash
ktx status
ktx ingest --all --deep
ktx status
```

Then inspect what changed:

```bash
git status --short
ktx sl list --json
ktx wiki search "revenue" --json --limit 10
```

## Common errors

| Symptom | Likely cause | Recovery |
|---------|--------------|----------|
| Connection not configured | The connection id is missing from `ktx.yaml` | Add it with `ktx setup` |
| Deep readiness is missing | The LLM or embedding provider is not configured | Run `ktx setup`, or rerun with `--fast` |
| Query history is unsupported | The selected database driver does not expose query history | Run schema ingest without query-history flags |
| No target selected | You omitted both a connection id and `--all` | Run `ktx ingest <connectionId>` or `ktx ingest --all` |
| Source flags have no effect | Depth and query-history flags were supplied for a source connector | Use those flags only for database connections |
| Text ingest stops early | `--fail-fast` stopped on the first failed item | Fix the item or rerun without `--fail-fast` |