---
title: Building Context
description: Build database and source context from configured KTX connections.
---

Building context reads your configured connections and writes local context that
agents can use. Database connections produce schema context, and source
connections such as dbt, Looker, Metabase, and Notion produce semantic sources
and wiki pages.

## Database ingest

Database ingest connects to your warehouse and extracts structural metadata.
KTX stores the results locally so agents can understand your schema without
querying the database directly.

### Running database ingest

```bash
ktx ingest <connection-id>
```

This runs a fast schema ingest by default. You can choose the depth with public
flags:

| Flag | What it does |
|------|-------------|
| `--fast` | Tables, columns, types, constraints, and row counts |
| `--deep` | Fast ingest plus AI-enriched database context |

```bash
# Build one connection quickly
ktx ingest my-postgres --fast

# Build AI-enriched database context
ktx ingest my-postgres --deep

# Build all configured connections
ktx ingest --all
```

### Checking results

Every ingest prints a summary and writes local artifacts. Use `ktx status`
after ingest to review project readiness and follow-up setup work:

```bash
ktx status
```

### Relationship detection

Many databases lack declared foreign keys. KTX infers relationships by scoring column pairs across seven signals: name similarity, type compatibility, value overlap, embedding similarity, profile uniqueness, null rate, and structural priors. The weighted score determines each candidate's status:

| Score range | Status | Meaning |
|-------------|--------|---------|
| ≥ 0.85 | `accepted` | High confidence; applied automatically |
| 0.55 to < 0.85 | `review` | Plausible; needs human review |
| < 0.55 | `rejected` | Low confidence; not applied |

Deep database ingest can include relationship evidence where the connector can
provide it. Relationship review and calibration subcommands are not part of the
current public CLI surface.
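
As a rough illustration of how a weighted score maps to the statuses in the table above, here is a minimal Python sketch. The signal names and the two thresholds come from this page. The weights, the helper names, and the assumption that every signal is normalized to the range 0 to 1 (higher meaning stronger evidence) are hypothetical and are not KTX's actual values.

```python
# Hypothetical weights for the seven signals; KTX's real weights are not documented here.
WEIGHTS = {
    "name_similarity": 0.25,
    "type_compatibility": 0.10,
    "value_overlap": 0.25,
    "embedding_similarity": 0.15,
    "profile_uniqueness": 0.10,
    "null_rate": 0.05,
    "structural_priors": 0.10,
}

ACCEPT_THRESHOLD = 0.85  # from the table: scores >= 0.85 are accepted
REVIEW_THRESHOLD = 0.55  # from the table: scores below 0.55 are rejected


def weighted_score(signals: dict[str, float]) -> float:
    """Combine per-signal scores (each assumed to be in [0, 1]) into one score.

    Missing signals count as 0.0, and the result is normalized by the total weight.
    """
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)
    return weighted_sum / total_weight


def classify(score: float) -> str:
    """Map a weighted score to the status names used in the table."""
    if score >= ACCEPT_THRESHOLD:
        return "accepted"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "rejected"


# Example candidate: strong name and value evidence, weaker structural evidence.
candidate = {
    "name_similarity": 0.9,
    "type_compatibility": 1.0,
    "value_overlap": 0.8,
    "embedding_similarity": 0.7,
    "profile_uniqueness": 0.6,
    "null_rate": 0.5,
    "structural_priors": 0.4,
}
status = classify(weighted_score(candidate))
```

Normalizing by the total weight keeps the score in the 0 to 1 range even if the weights are later retuned and no longer sum to exactly 1.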

## Ingestion

Ingestion pulls semantic context from your existing analytics tools, such as dbt projects, Looker models, and Metabase questions, and writes it into your KTX project as semantic sources and wiki pages.

### How it works

Each ingest run follows this flow:

1. An **adapter** extracts metadata from your tool (dbt manifest, LookML files, Metabase API, etc.)
2. An **LLM agent** reconciles the extracted metadata with your existing context, merging new details into it rather than overwriting what is already there
3. **Semantic sources** (YAML) and **wiki pages** (Markdown) are written to your project directory
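
To make "merge rather than overwrite" in step 2 more concrete, the sketch below shows one deterministic merge policy in Python. It is only a stand-in for illustration: in KTX the reconciliation is done by an LLM agent, and the `merge_columns` helper, the dictionary shapes, and the rule that existing non-empty values win are all assumptions.

```python
def merge_columns(existing: dict[str, dict], extracted: dict[str, dict]) -> dict[str, dict]:
    """Merge extracted column metadata into existing context without discarding it.

    Hypothetical policy: keep every existing field that already has a value,
    fill missing or empty fields from the extracted metadata, and add columns
    that only appear in the extracted metadata.
    """
    # Copy each column's fields so the caller's `existing` dict is not mutated.
    merged = {name: dict(fields) for name, fields in existing.items()}
    for name, fields in extracted.items():
        target = merged.setdefault(name, {})
        for key, value in fields.items():
            if not target.get(key):  # only fill gaps; never overwrite existing values
                target[key] = value
    return merged


existing = {
    "order_id": {"type": "string", "description": "Unique order identifier"},
    "total_amount": {"type": "number", "description": ""},
}
extracted = {
    "order_id": {"description": "PK"},
    "total_amount": {"description": "Total order value in USD"},
    "order_date": {"type": "time"},
}
merged = merge_columns(existing, extracted)
```

In this example the existing description of `order_id` is kept, the empty description of `total_amount` is filled in, and `order_date` is added as a new column.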

### Running an ingest

```bash
ktx ingest my-dbt-source
```

Useful output flags:

| Flag | Description |
|------|-------------|
| `--json` | Output as JSON |
| `--plain` | Plain text output |

Foreground context builds do not detach into background control sessions. If a
run is interrupted, rerun `ktx ingest <connection-id>` or `ktx ingest --all`.
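
If you script ingests, the rerun advice above can be automated. The following Python sketch reruns `ktx ingest <connection-id>` a bounded number of times. It assumes that `ktx` is on your `PATH` and that it exits with a non-zero status when a run fails or is interrupted, which this page does not confirm. `ingest_with_retry`, `run_command`, and their parameters are hypothetical names.

```python
import subprocess
from typing import Callable, Sequence


def run_command(args: Sequence[str]) -> int:
    """Run a command in the foreground and return its exit status."""
    return subprocess.run(list(args)).returncode


def ingest_with_retry(
    connection_id: str,
    max_attempts: int = 3,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> bool:
    """Rerun `ktx ingest <connection-id>` until it succeeds or attempts run out."""
    command = ["ktx", "ingest", connection_id]
    for attempt in range(1, max_attempts + 1):
        if runner(command) == 0:  # assumption: exit status 0 means the run succeeded
            return True
        print(f"attempt {attempt} of {max_attempts} failed for {connection_id}")
    return False
```

Calling `ingest_with_retry("my-dbt-source")` would run the real command. The `runner` parameter exists so the retry logic can be exercised with a stand-in function that does not invoke `ktx`.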

### Supported context sources

| Driver | Source | What gets ingested |
|--------|--------|--------------------|
| `dbt` | dbt project | Model definitions, column descriptions, tests, tags |
| `metricflow` | MetricFlow semantic models | Metrics, dimensions, entities, semantic joins |
| `lookml` | LookML files | Views, explores, dimensions, measures, joins |
| `looker` | Looker API | Explores, looks, dashboard metadata |
| `metabase` | Metabase API | Questions, dashboards, table metadata |
| `notion` | Notion API | Database pages, knowledge articles |

Query history is a database connection facet. Enable it with
`connections.<id>.context.queryHistory`, or pass `--query-history` to enable it
for a single run. See [Context Sources](/docs/integrations/context-sources) for
driver-specific setup and auth configuration.

### What gets generated

A typical dbt ingest produces semantic sources and wiki pages in your project:

**Semantic source** (`semantic-layer/my-postgres/orders.yaml`):

```yaml title="semantic-layer/my-postgres/orders.yaml"
name: orders
table: public.orders
grain:
  - order_id
columns:
  - name: order_id
    type: string
    description: Unique order identifier
  - name: customer_id
    type: string
    description: Foreign key to customers table
  - name: order_date
    type: time
    role: time
    description: Date the order was placed
  - name: total_amount
    type: number
    description: Total order value in USD
measures:
  - name: total_revenue
    expr: SUM(total_amount)
    description: Sum of all order values
  - name: order_count
    expr: COUNT(DISTINCT order_id)
    description: Number of distinct orders
joins:
  - to: customers
    on: orders.customer_id = customers.customer_id
    relationship: many_to_one
```

**Wiki page** (`wiki/global/order-status-definitions.md`):

```markdown
---
summary: Business definitions for order status values
tags: [orders, definitions]
sl_refs: [orders]
---

## Order Statuses

- **pending**: Order placed but not yet processed
- **confirmed**: Payment received, awaiting fulfillment
- **shipped**: Order dispatched to carrier
- **delivered**: Order received by customer
- **cancelled**: Order cancelled before shipment

Orders in "pending" status for more than 48 hours are flagged for review.
```
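
The `sl_refs` field in the front matter appears to reference semantic sources by name (here, `orders`). As a small sketch of how a script could read that link, the Python below parses front matter shaped like the example above using only the standard library. It handles just `key: value` pairs and inline `[a, b]` lists, which is an assumption about the format; a real YAML parser would be more robust, and `parse_front_matter` is a hypothetical helper, not part of KTX.

```python
def parse_front_matter(text: str) -> dict[str, object]:
    """Parse a leading `---` front matter block into a dict (simplified)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the page
    meta: dict[str, object] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not `key: value`
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as `[orders, definitions]`
            items = [item.strip() for item in value[1:-1].split(",")]
            meta[key.strip()] = [item for item in items if item]
        else:
            meta[key.strip()] = value
    return meta


page = """---
summary: Business definitions for order status values
tags: [orders, definitions]
sl_refs: [orders]
---
## Order Statuses
"""
meta = parse_front_matter(page)
references_orders = "orders" in meta.get("sl_refs", [])
```

A script could apply the same check across a wiki directory to list the pages that reference a given semantic source.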

### Ingest transcripts

Every ingest session records a full transcript: tool calls, LLM responses, and
write decisions. Inspect the stored transcript files when you need to debug why
a source was written a certain way.