---
title: Building Context
description: Scan your database schema and ingest context from dbt, Looker, Metabase, and more.
---

Building context is a two-step process. First, you **scan** your database to discover its structure: tables, columns, types, constraints, and relationships. Then you **ingest** from your existing tools to enrich that structure with semantic meaning: metric definitions, business descriptions, join logic, and the knowledge that agents need to generate correct analytics.

## Scanning

Scanning connects to your database and extracts structural metadata. KTX stores the results locally so agents can understand your schema without querying the database directly.

### Running a scan

```bash
ktx scan
```

This runs a structural scan by default. You can control what the scan does with the `--mode` flag:

| Mode | What it does |
|------|-------------|
| `structural` | Tables, columns, types, constraints, row counts (default) |
| `enriched` | Structural scan plus LLM-generated column descriptions |
| `relationships` | Structural scan plus foreign key relationship detection |

```bash
# Scan with relationship detection
ktx scan my-postgres --mode relationships

# Preview without writing results
ktx scan my-postgres --dry-run
```

### Checking scan results

Every scan prints a summary and writes local artifacts. Use `ktx status` after a scan to review project readiness and follow-up setup work:

```bash
ktx status
```

### Relationship detection

Many databases lack declared foreign keys. KTX infers relationships by scoring column pairs across seven signals: name similarity, type compatibility, value overlap, embedding similarity, profile uniqueness, null rate, and structural priors.

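
To make the idea of a weighted score concrete, the sketch below combines seven per-signal scores into one number and maps it to a status. It is an illustration only: the equal weights, the signal key names, the assumption that every signal is normalised to the range 0 to 1 with higher meaning "more likely related", and the functions `weighted_score` and `classify` are all made up for this example and are not KTX's actual implementation. The status cut-offs are the ones listed in the table that follows.

```python
from typing import Mapping

# Hypothetical weights: the signal names mirror the list above, but equal
# weighting is an assumption; KTX's real weights are not documented here.
SIGNAL_WEIGHTS = {
    "name_similarity": 1.0,
    "type_compatibility": 1.0,
    "value_overlap": 1.0,
    "embedding_similarity": 1.0,
    "profile_uniqueness": 1.0,
    "null_rate": 1.0,
    "structural_priors": 1.0,
}


def weighted_score(signals: Mapping[str, float]) -> float:
    """Return the weighted average of per-signal scores, each assumed in [0, 1]."""
    missing = SIGNAL_WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    total_weight = sum(SIGNAL_WEIGHTS.values())
    weighted_sum = sum(w * signals[name] for name, w in SIGNAL_WEIGHTS.items())
    return weighted_sum / total_weight


def classify(score: float) -> str:
    """Map a score to a status using the documented cut-offs."""
    if score >= 0.85:
        return "accepted"
    if score >= 0.55:
        return "review"
    return "rejected"


# Six strong signals and one weak one average to about 0.8.
candidate = {name: 0.9 for name in SIGNAL_WEIGHTS}
candidate["value_overlap"] = 0.2
print(classify(weighted_score(candidate)))  # review
```

A weighted average keeps the result in the same 0 to 1 range as the inputs, which is what lets fixed cut-offs such as 0.85 and 0.55 apply regardless of how many signals are used.
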
The weighted score determines each candidate's status:

| Score range | Status | Meaning |
|-------------|--------|---------|
| ≥ 0.85 | `accepted` | High confidence; applied automatically |
| ≥ 0.55 and < 0.85 | `review` | Plausible; needs human review |
| < 0.55 | `rejected` | Low confidence; not applied |

Relationship scans run with `ktx scan --mode relationships`. This command only executes the scan; relationship review and calibration subcommands are not part of the current CLI surface.

## Ingestion

Ingestion pulls semantic context from your existing analytics tools (dbt projects, Looker models, Metabase questions, and more) and writes it into your KTX project as semantic sources and knowledge pages.

### How it works

Each ingest run follows this flow:

1. An **adapter** extracts metadata from your tool (dbt manifest, LookML files, Metabase API, etc.)
2. An **LLM agent** reconciles the extracted metadata with your existing context, merging with what is already there rather than overwriting it
3. **Semantic sources** (YAML) and **knowledge pages** (Markdown) are written to your project directory

### Running an ingest

```bash
ktx ingest run --connection-id my-dbt-source --adapter dbt
```

Useful low-level flags:

| Flag | Description |
|------|-------------|
| `--source-dir` | Directory containing source files (e.g., your dbt project) |
| `--viz` | Render the memory-flow TUI for real-time progress |
| `--json` | Output as JSON |
| `--plain` | Plain text output |

### Watching progress

```bash
# Check status of the latest ingest
ktx ingest status

# Check a specific run
ktx ingest status

# Open the visual ingest report (TUI)
ktx ingest watch

# Replay a past ingest run
ktx ingest replay
```

The `watch` command opens an interactive TUI that shows the memory-flow output: every tool call, LLM decision, and artifact written during the ingest.

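
If you drive ingestion from a script or a CI job, a thin wrapper can keep the invocation in one place. The sketch below only assembles the `ktx ingest run` command from flags documented in this section. The helper names `build_ingest_command` and `run_ingest` and their parameters are hypothetical, and the sketch assumes that `ktx` is on your `PATH` and that `--source-dir` takes a directory path as its argument.

```python
import subprocess
from typing import List, Optional


def build_ingest_command(
    connection_id: str,
    adapter: str,
    source_dir: Optional[str] = None,
    json_output: bool = False,
) -> List[str]:
    """Assemble argv for `ktx ingest run` using only flags listed in this section."""
    cmd = ["ktx", "ingest", "run", "--connection-id", connection_id, "--adapter", adapter]
    if source_dir is not None:
        # Assumption: --source-dir expects a directory path as its value.
        cmd += ["--source-dir", source_dir]
    if json_output:
        cmd.append("--json")
    return cmd


def run_ingest(cmd: List[str]) -> int:
    """Run the command and return its exit code (requires `ktx` to be installed)."""
    return subprocess.run(cmd, check=False).returncode


# Reproduces the example invocation shown under "Running an ingest".
print(" ".join(build_ingest_command("my-dbt-source", "dbt")))
```

Building the argument list separately from running it makes the command easy to log or test without touching a real project.
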
### Available adapters

| Adapter | Source | What gets ingested |
|---------|--------|--------------------|
| `dbt` | dbt project | Model definitions, column descriptions, tests, tags |
| `metricflow` | MetricFlow semantic models | Metrics, dimensions, entities, semantic joins |
| `lookml` | LookML files | Views, explores, dimensions, measures, joins |
| `looker` | Looker API | Explores, looks, dashboard metadata |
| `metabase` | Metabase API | Questions, dashboards, table metadata |
| `notion` | Notion API | Database pages, knowledge articles |
| `historic-sql` | Query history | Frequent queries, usage patterns, runtime stats |
| `live-database` | Direct DB connection | Live schema introspection |

See [Context Sources](/docs/integrations/context-sources) for adapter-specific setup and auth configuration.

### What gets generated

A typical dbt ingest produces semantic sources and knowledge pages in your project:

**Semantic source** (`semantic-layer/my-postgres/orders.yaml`):

```yaml title="semantic-layer/my-postgres/orders.yaml"
name: orders
table: public.orders
grain:
  - order_id
columns:
  - name: order_id
    type: string
    description: Unique order identifier
  - name: customer_id
    type: string
    description: Foreign key to customers table
  - name: order_date
    type: time
    role: time
    description: Date the order was placed
  - name: total_amount
    type: number
    description: Total order value in USD
measures:
  - name: total_revenue
    expr: SUM(total_amount)
    description: Sum of all order values
  - name: order_count
    expr: COUNT(DISTINCT order_id)
    description: Number of distinct orders
joins:
  - to: customers
    on: orders.customer_id = customers.customer_id
    relationship: many_to_one
```

**Knowledge page** (`knowledge/global/order-status-definitions.md`):

```markdown
---
summary: Business definitions for order status values
tags: [orders, definitions]
sl_refs: [orders]
---

## Order Statuses

- **pending**: Order placed but not yet processed
- **confirmed**: Payment received, awaiting fulfillment
- **shipped**: Order dispatched to carrier
- **delivered**: Order received by customer
- **cancelled**: Order cancelled before shipment

Orders in "pending" status for more than 48 hours are flagged for review.
```

### Deterministic replay

Every ingest session records a full transcript: tool calls, LLM responses, and write decisions. You can replay any session to debug why a source was written a certain way:

```bash
ktx ingest replay --viz
```

This opens the same TUI view as the original run, letting you step through the agent's reasoning.
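
One way to think about deterministic replay is that stepping through a session means reading recorded events back in order, rather than running the agent again. The sketch below illustrates that idea with a made-up JSON Lines transcript. The event fields (`kind`, `detail`) and the functions `record` and `replay` are hypothetical and do not describe KTX's actual transcript format.

```python
import io
import json
from typing import Dict, Iterator, TextIO


def record(log: TextIO, kind: str, detail: str) -> None:
    """Append one event to the transcript as a JSON line."""
    log.write(json.dumps({"kind": kind, "detail": detail}) + "\n")


def replay(log: TextIO) -> Iterator[Dict[str, str]]:
    """Yield recorded events in their original order without re-running anything."""
    log.seek(0)
    for line in log:
        if line.strip():
            yield json.loads(line)


# Record a tiny in-memory session, then step through it twice.
transcript = io.StringIO()
record(transcript, "tool_call", "read dbt manifest")
record(transcript, "llm_response", "merge description for orders.order_id")
record(transcript, "write", "semantic-layer/my-postgres/orders.yaml")

first = list(replay(transcript))
second = list(replay(transcript))
for event in first:
    print(f"{event['kind']}: {event['detail']}")
```

Because the replay only reads what was stored, both passes return the same sequence of events, which is the property that makes a recorded session useful for debugging.
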