feat: merge ingest and scan

* docs: add CLI component reuse guidance
* docs: add unified ingest ux design
* Refine unified ingest UX design after adversarial review iteration 1
* Refine unified ingest UX design after adversarial review iteration 2
* Refine unified ingest UX design after adversarial review iteration 3
* feat(cli): route public connection ingest command
* feat(cli): hide standalone scan from public help
* feat(cli): plan public ingest depth and query history
* feat(cli): execute public database ingest facets
* feat(ingest): read connection query history config
* fix(cli): use public ingest wording
* fix(config): stop generating ingest adapter allow lists
* docs: document public ingest command
* test: align ingest surface expectations
* docs: add unified ingest public CLI surface plan
* feat(cli): preflight deep public ingest readiness
* feat(setup): store query history in connection context
* feat(setup): store database context depth
* feat(setup): verify context readiness by database depth
* fix(setup): keep context build foreground only
* fix(config): reject reserved ingest connection ids
* test: close unified ingest v1 expectations
* docs: add unified ingest v1 closure plan
* fix(ingest): bypass adapter allow-list for public source ingest
* fix(ingest): honor query history window intent
* fix(ingest): hide scan internals from public database ingest
* feat(ingest): use foreground view for interactive public ingest
* fix(setup): use schema context and query history wording
* test(cli): verify unified ingest public output
* docs: add unified ingest v1 public output closure plan
* fix(setup): forward query history flags
* fix(setup): prompt for postgres query history
* fix(status): report query history readiness
* fix(ingest): remove legacy public guidance
* fix(ingest): polish foreground retry copy
* docs(examples): use unified query history wording
* chore(ingest): finish public query history cleanup
* docs: add unified ingest v1 query history status cleanup plan
* test(docs): cover unified ingest public docs
* docs: align ingest CLI reference with unified UX
* docs: update context build guides for unified ingest
* docs: update setup and primary source ingest wording
* docs: stop advertising adapter-backed example ingest
* docs: close unified ingest public docs gaps
* docs: add unified ingest v1 docs site closure plan
* fix: render unified ingest foreground warnings
* fix: explain query history schema order
* fix: add public ingest retry guidance
* fix: align setup next steps with unified ingest
* fix: remove scan wording from demo progress
* test: verify unified ingest ux closure
* docs: add unified ingest v1 foreground and retry closure plan
* fix(cli): preserve query-history pull config in public ingest
* fix(cli): omit hidden commands from docs command tree
* test(cli): close unified ingest final public surface checks
* docs: add unified ingest v1 final public surface closure plan
* fix(cli): use public source labels in ingest reports
* fix(cli): suppress low-level public ingest output
* test(cli): verify unified ingest public plain output
* docs: add unified ingest v1 public plain output closure plan
* fix(cli): add public ingest copy sanitizers
* fix(cli): sanitize public ingest progress copy
* fix(cli): rename setup schema scope prompt
* docs(plan): add progress copy closure; test: align setup back-nav fixture

  Adds the iter9 plan and updates the setup back-navigation test fixture to
  pass disableQueryHistory plus listSchemas/listTables stubs that the unified
  ingest setup step now requires.

* docs(plan): add final ux labels plan with narrowed label scans
* fix(cli): aggregate unsupported query-history warnings
* fix(cli): align setup database labels
* test(cli): fix setup database test type-check
* fix(cli): remove primary-source wording from setup output
* test(cli): verify unified ingest setup closure
* docs(plan): add unified ingest v1 verification copy closure plan
* fix(cli): remove top-level scan command
* fix(cli): remove legacy ingest and wiki commands
* Merge scan into ingest flow
* feat(cli): split ingest progress into per-phase rows, rename work units to tasks

  Each database target in the unified ingest dashboard now renders one row per
  real subprocess (Schema, then Query history when enabled) instead of a single
  combined bar. Each phase has its own monotonic 0-100% bar so the progress
  never snaps back to zero when historic-sql starts after scan completes.
  Completed phases keep their final bar, summary, and elapsed time visible as
  an inline audit trail; queued and skipped phases are shown explicitly.

  Also rename user-facing "work units" / "Failed work units" to "tasks" /
  "Failed tasks" in ingest output and parseIngestSummary. The parser still
  accepts the legacy "Work units:" wording in captured output for backward
  compat. Internal memory-flow event names and type fields are left alone.

* Fix test harness failures
* Fix CI smoke checks

---------

Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
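The backward-compatible summary parsing described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the project's `parseIngestSummary`: the function name `parseIngestSummarySketch`, the `IngestSummary` shape, and the `Label: <number>` line format are assumptions. Only the label wordings ("tasks", "Failed tasks", and the legacy "Work units:") come from the commit message.

```typescript
// Hypothetical sketch: the real parseIngestSummary may differ in shape and fields.
interface IngestSummary {
  tasks: number | null;
  failedTasks: number | null;
}

// Accept the current "Tasks:" label and the legacy "Work units:" label.
// The "Label: <number>" line layout is an assumption made for illustration.
const TASKS_LINE = /^\s*(?:Tasks|Work units):\s*(\d+)\s*$/im;
const FAILED_LINE = /^\s*Failed (?:tasks|work units):\s*(\d+)\s*$/im;

// Return the first captured count for a pattern, or null when no line matches.
function parseCount(output: string, pattern: RegExp): number | null {
  const value = pattern.exec(output)?.[1];
  return value === undefined ? null : Number.parseInt(value, 10);
}

function parseIngestSummarySketch(output: string): IngestSummary {
  return {
    tasks: parseCount(output, TASKS_LINE),
    failedTasks: parseCount(output, FAILED_LINE),
  };
}
```

Matching both labels in one pattern keeps previously captured output readable without a separate legacy code path. A stricter variant might record which wording was seen so callers can flag stale fixtures.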
2026-06-25 08:48:08 +02:00 · 2026-05-14 01:43:06 +02:00 · 2026-05-14 01:43:06 +02:00 · b00c1a11a9
commit b00c1a11a9
parent 1a472cf3ed
118 changed files with 16890 additions and 2992 deletions
--- a/docs-site/content/docs/cli-reference/ktx-dev.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-dev.mdx
@@ -3,7 +3,7 @@ title: "ktx dev"
 description: "Low-level project initialization and runtime management."
 ---

-`ktx dev` contains development-only project initialization and managed runtime commands. Scan and ingest commands live at the root as [`ktx scan`](/docs/cli-reference/ktx-scan) and [`ktx ingest`](/docs/cli-reference/ktx-ingest).
+`ktx dev` contains development-only project initialization and managed runtime commands. Context building lives at the root as [`ktx ingest`](/docs/cli-reference/ktx-ingest).

 ## Command signature

--- a/docs-site/content/docs/cli-reference/ktx-ingest.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-ingest.mdx
@@ -1,73 +1,59 @@
 ---
 title: "ktx ingest"
-description: "Run and inspect local ingest memory-flow output."
+description: "Build or refresh KTX context from configured connections."
 ---

-`ktx ingest` runs adapter-level local ingest and renders stored ingest reports.
+`ktx ingest` builds or refreshes KTX context from configured connections.
+Database connections build schema context. Context-source connections ingest
+metadata from tools such as dbt, Looker, Metabase, MetricFlow, LookML, and
+Notion.

 ## Command signature

 ```bash
-ktx ingest <subcommand> [options]
+ktx ingest [options] [connectionId]
 ```

-## Subcommands
+Use a connection id to build one configured connection. Use `--all` to build
+every configured connection. Database connections run before context-source
+connections when you use `--all`.

-| Subcommand | Description |
-|-----------|-------------|
-| `run` | Run local ingest for one configured connection and source adapter |
-| `status [runId]` | Print status for the latest or selected stored local ingest run or report file |
-| `watch [runId]` | Open the latest or selected stored ingest visual report |
-| `replay <runId>` | Replay a stored ingest run or bundle report through memory-flow output |
-
-## `ingest run`
+## Build options

 | Flag | Description | Default |
 |------|-------------|---------|
-| `--connection-id <connectionId>` | KTX connection id | Required |
-| `--adapter <adapter>` | Ingest source adapter name | Required |
-| `--source-dir <path>` | Directory containing source files | — |
-| `--database-introspection-url <url>` | Daemon URL for live-database introspection | — |
-| `--debug-llm-request-file <path>` | Write sanitized LLM request structure to a JSONL file | — |
+| `--all` | Build every configured connection | `false` |
+| `--fast` | Use deterministic database schema ingest | Stored connection default, or `fast` |
+| `--deep` | Use AI-enriched database ingest | Stored connection default, or `fast` |
+| `--query-history` | Include database query-history usage patterns | Stored connection default |
+| `--no-query-history` | Skip database query-history usage patterns for this run | Stored connection default |
+| `--query-history-window-days <days>` | Query-history lookback window for this run | Stored connection default |
 | `--plain` | Print plain text output | `true` |
 | `--json` | Print JSON output | `false` |
-| `--viz` | Render memory-flow TUI output | `false` |
-| `--yes` | Install the managed Python runtime without prompting when required | `false` |
-| `--no-input` | Disable interactive terminal input for visualization and runtime installation | — |
+| `--no-input` | Disable interactive terminal input | `false` |

-## `ingest status`, `watch`, and `replay`
-
-| Flag | Description | Default |
-|------|-------------|---------|
-| `--report-file <path>` | Bundle ingest report JSON file to render | — |
-| `--plain` | Print plain text output | `true` for `status` and `replay` |
-| `--json` | Print JSON output | `false` |
-| `--viz` | Render memory-flow TUI output | `true` for `watch` |
-| `--no-input` | Disable interactive terminal input for visualization | — |
+`--fast` and `--deep` are mutually exclusive. Depth flags apply only to
+database connections. Query-history flags apply only to database connections
+that support query history.

 ## Examples

 ```bash
-ktx ingest run --connection-id my-dbt-source --adapter dbt
-ktx ingest run --connection-id prod-metabase --adapter metabase --yes
-
-ktx ingest status
-ktx ingest status run-abc123
-ktx ingest status --json
-
-ktx ingest watch
-ktx ingest watch run-abc123
-
-ktx ingest replay run-abc123
-ktx ingest replay run-abc123 --viz
-ktx ingest replay run-abc123 --report-file /tmp/ingest-report.json
+ktx ingest warehouse
+ktx ingest warehouse --fast
+ktx ingest warehouse --deep
+ktx ingest warehouse --deep --query-history
+ktx ingest warehouse --query-history-window-days 30
+ktx ingest notion
+ktx ingest --all
+ktx ingest --all --deep
 ```

 ## Common errors

 | Error | Cause | Recovery |
 |-------|-------|----------|
-| Ingest needs credentials | The source adapter requires API or git access | Configure the referenced environment variable or secret file |
-| Ingest run cannot find adapter | `--adapter` does not match a supported source adapter | Use a configured adapter such as `dbt`, `metabase`, `looker`, `lookml`, `notion`, or `live-database` |
-| Latest run not found | No ingest run has been started in this project | Run `ktx ingest run --connection-id <id> --adapter <adapter>` first |
-| Report watch fails in a non-interactive shell | Visual report needs a terminal | Use `ktx ingest status --json` for agent and CI workflows |
+| Connection not configured | The connection id is not present in `ktx.yaml` | Add the connection with `ktx setup` or update `ktx.yaml` |
+| Deep readiness is missing | `--deep` or query history needs model, embedding, and scan-enrichment configuration | Run `ktx setup` or rerun with `--fast` |
+| Query history is unsupported | The selected database driver does not support query history | Run schema ingest without query-history flags |
+| No ingest target was selected | No connection id was provided and `--all` was omitted | Run `ktx ingest <connectionId>` or `ktx ingest --all` |
--- a/docs-site/content/docs/cli-reference/ktx-scan.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-scan.mdx
@@ -1,44 +0,0 @@
----
-title: "ktx scan"
-description: "Run standalone database scans."
----
-
-Discover a configured database connection's schema, including tables, columns, types, constraints, and optional relationship signals.
-
-## Command signature
-
-```bash
-ktx scan <connectionId> [options]
-```
-
-## Options
-
-| Flag | Description | Default |
-|------|-------------|---------|
-| `--mode <mode>` | Scan mode: `structural`, `enriched`, or `relationships` | `structural` |
-| `--dry-run` | Run without writing scan results | `false` |
-| `--database-introspection-url <url>` | Daemon URL for live-database introspection | — |
-| `--yes` | Install the managed Python runtime without prompting when required | `false` |
-| `--no-input` | Disable interactive managed runtime installation | — |
-
-## Examples
-
-```bash
-ktx scan my-warehouse
-ktx scan my-warehouse --mode enriched
-ktx scan my-warehouse --mode relationships
-ktx scan my-warehouse --dry-run
-ktx scan my-warehouse --database-introspection-url http://127.0.0.1:8765
-```
-
-## Output
-
-`ktx scan` prints a human summary and writes scan artifacts under the KTX project directory unless `--dry-run` is set. Use `ktx status` after a scan to inspect project readiness and next setup work.
-
-## Common errors
-
-| Error | Cause | Recovery |
-|-------|-------|----------|
-| Scan cannot connect | Connection credentials or network access are invalid | Run `ktx connection test <connectionId>` and update the connection before scanning |
-| Enriched scan cannot describe columns | LLM credentials are missing or invalid | Complete LLM setup with `ktx setup` before enriched scans |
-| Relationship scan has limited evidence | The connector cannot provide optional validation or statistics | Re-run with a connector that supports the missing capability, or treat relationship output as lower-confidence context |
--- a/docs-site/content/docs/cli-reference/ktx-setup.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-setup.mdx
@@ -30,7 +30,7 @@ ktx setup [options]
 | `--global` | Install agent integration into the global target scope (Claude Code and Codex only) | `false` |

 The setup wizard is the public configuration interface. It prompts for LLM
-credentials, embeddings, database connections, context sources, Historic SQL,
+credentials, embeddings, database connections, context sources, query history,
 and agent integration when those values are needed.

 ## Examples
@@ -62,7 +62,7 @@ KTX project: /home/user/analytics
 Project ready: yes
 LLM ready: yes (claude-sonnet-4-6)
 Embeddings ready: yes (text-embedding-3-small)
-Primary sources configured: yes (postgres-warehouse)
+Databases configured: yes (postgres-warehouse)
 Context sources configured: yes (dbt-main)
 KTX context built: yes
 Agent integration ready: yes (codex:project)
--- a/docs-site/content/docs/cli-reference/ktx-wiki.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-wiki.mdx
@@ -1,6 +1,6 @@
 ---
 title: "ktx wiki"
-description: "List, read, search, or write wiki pages."
+description: "List or search wiki pages."
 ---

 Manage wiki pages in your KTX project. Wiki pages are Markdown documents that capture business definitions, rules, and gotchas. Agents search them for context when answering questions about your data.
@@ -16,9 +16,7 @@ ktx wiki <subcommand> [options]
 | Subcommand | Description |
 |-----------|-------------|
 | `list` | List local wiki pages |
-| `read <key>` | Read one local wiki page |
 | `search <query>` | Search local wiki pages |
-| `write <key>` | Write one local wiki page |

 ## Options

@@ -29,13 +27,6 @@ ktx wiki <subcommand> [options]
 | `--json` | Print JSON output | `false` |
 | `--user-id <id>` | Local user id | `local` |

-### `wiki read`
-
-| Flag | Description | Default |
-|------|-------------|---------|
-| `--json` | Print JSON output | `false` |
-| `--user-id <id>` | Local user id | `local` |
-
 ### `wiki search`

 | Flag | Description | Default |
@@ -44,18 +35,6 @@ ktx wiki <subcommand> [options]
 | `--user-id <id>` | Local user id | `local` |
 | `--limit <number>` | Maximum search results | — |

-### `wiki write`
-
-| Flag | Description | Default |
-|------|-------------|---------|
-| `--user-id <id>` | Local user id | `local` |
-| `--scope <scope>` | Scope: `global` or `user` | `global` |
-| `--summary <summary>` | Wiki page summary (required) | — |
-| `--content <content>` | Wiki page content (required) | — |
-| `--tag <tag>` | Wiki tag; repeatable | — |
-| `--ref <ref>` | Wiki ref; repeatable | — |
-| `--sl-ref <ref>` | Semantic-layer ref; repeatable | — |
-
 ## Examples

 ```bash
@@ -65,48 +44,17 @@ ktx wiki list
 # List all wiki pages as JSON
 ktx wiki list --json

-# Read a specific wiki page
-ktx wiki read revenue-definitions
-
-# Read a specific wiki page as JSON
-ktx wiki read revenue-definitions --json
-
 # Search wiki pages
 ktx wiki search "monthly recurring revenue"

 # Search wiki pages as JSON
 ktx wiki search "monthly recurring revenue" --json --limit 10
-
-# Write a global wiki page
-ktx wiki write revenue-definitions \
-  --summary "Canonical revenue metric definitions" \
-  --content "## MRR\nMonthly Recurring Revenue is calculated as..."
-
-# Write a user-scoped wiki page
-ktx wiki write my-notes \
-  --scope user \
-  --summary "Personal analysis notes" \
-  --content "Things to check when revenue numbers look off..."
-
-# Write a page with tags and references
-ktx wiki write churn-rules \
-  --summary "Churn calculation business rules" \
-  --content "A customer is considered churned when..." \
-  --tag finance \
-  --tag retention \
-  --sl-ref customers \
-  --sl-ref subscriptions
-
-# Write a page with external references
-ktx wiki write data-freshness \
-  --summary "Data pipeline SLAs and freshness guarantees" \
-  --content "The orders table refreshes every 15 minutes..." \
-  --ref "https://wiki.example.com/data-pipelines"
 ```

 ## Output

-Wiki commands print local wiki pages and search results. Agents should search first, then read the most relevant page by key.
+Wiki commands print local wiki page listings and search results. Open the
+matching Markdown files directly when you need the full page contents.

 ```json
 {
@@ -128,6 +76,4 @@ Wiki commands print local wiki pages and search results. Agents should search fi
 | Error | Cause | Recovery |
 |-------|-------|----------|
 | Search returns no results | The query terms do not match summaries, tags, or content | Retry with business synonyms, then create a page if the knowledge is missing |
-| Read fails for a key | The page key is wrong or scoped to a different user | Run `ktx wiki list` or search again to get the exact key |
-| Write fails due to missing fields | `--summary` or `--content` was omitted | Pass both fields, and keep the summary short enough for search results |
-| Agent writes duplicate pages | It did not search existing pages first | Always run `ktx wiki search` before `ktx wiki write` |
+| A page is missing | No Markdown file exists for that business context | Add a file under `wiki/` or run `ktx ingest <connectionId>` |
--- a/docs-site/content/docs/cli-reference/meta.json
+++ b/docs-site/content/docs/cli-reference/meta.json
@@ -4,7 +4,6 @@
  "pages": [
    "ktx-setup",
    "ktx-connection",
-    "ktx-scan",
    "ktx-ingest",
    "ktx-sl",
    "ktx-wiki",
--- a/docs-site/content/docs/concepts/context-as-code.mdx
+++ b/docs-site/content/docs/concepts/context-as-code.mdx
@@ -59,7 +59,10 @@ dbt / Looker / Metabase / Notion

 A typical branch shows a semantic diff: "this ingest added 3 new sources from dbt, updated 2 join definitions based on schema changes, and created 1 wiki page from a Notion doc." Analytics engineers review the diff, verify that the new sources look correct, and merge.

-Teams usually run this on demand while setting up a source, then schedule it once the source is stable. A cron job or CI schedule can run `ktx ingest run --connection-id <id> --adapter <adapter> --no-input` overnight on an ingest branch so the latest dbt manifests, BI metadata, and documentation updates are ready for review each morning.
+Teams usually run this on demand while setting up a source, then schedule it
+once the source is stable. A cron job or CI schedule can run `ktx ingest --all --no-input`
+overnight on an ingest branch so the latest schema context, dbt manifests, BI
+metadata, and documentation updates are ready for review each morning.

 Once merged, agents querying through the KTX CLI see the updated context immediately. No deployment step, no cache invalidation, no restart. The files are the source of truth, and agents read them on every request.

--- a/docs-site/content/docs/concepts/the-context-layer.mdx
+++ b/docs-site/content/docs/concepts/the-context-layer.mdx
@@ -134,13 +134,13 @@ my-project/
 │           └── data-quality-notes.md
 ├── raw-sources/
 │   └── warehouse/
-│       └── live-database/        # Scan artifacts and reports
+│       └── database-ingest/      # Schema ingest artifacts and reports
 └── .ktx/
    ├── db.sqlite                 # Local state (git-ignored)
    └── cache/                    # Runtime cache (git-ignored)
 ```

-Semantic sources and wiki pages are committed to git. The SQLite database holds ephemeral state — scan results, embedding indexes, session logs — and is git-ignored. If you delete it, KTX rebuilds it on the next run.
+Semantic sources and wiki pages are committed to git. The SQLite database holds ephemeral state — schema ingest results, embedding indexes, session logs — and is git-ignored. If you delete it, KTX rebuilds it on the next run.

 This means your analytics context travels with your code. You can fork it, branch it, review it in a PR, and merge it with the same tools you use for dbt models. There's no sync problem between a remote server and your local state. There's no migration to run. The files are the source of truth.

--- a/docs-site/content/docs/getting-started/quickstart.mdx
+++ b/docs-site/content/docs/getting-started/quickstart.mdx
@@ -81,7 +81,8 @@ ktx dev runtime start --feature local-embeddings

 ## Step 3: Connect a database

-Select one or more databases for KTX to scan. The wizard supports SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server, BigQuery, and Snowflake.
+Select one or more databases for KTX to connect to. The wizard supports
+SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server, BigQuery, and Snowflake.

 For PostgreSQL, you can enter connection details field by field or paste a connection URL:

@@ -93,22 +94,27 @@ For PostgreSQL, you can enter connection details field by field or paste a conne

 If your URL contains credentials, KTX saves it to `.ktx/secrets/` and writes a `file:` reference in `ktx.yaml`. You can also use `env:DATABASE_URL` to reference an environment variable.

-After connecting, KTX automatically runs a connection test and a structural scan:
+After connecting, KTX automatically runs a connection test and builds fast
+schema context:

 ```
-◇  Testing postgres-warehouse
-│  ✓ Connection test passed
-│  Driver: PostgreSQL · Tables: 42
-│
-◇  Scanning postgres-warehouse
-│  ✓ Structural scan completed
-│  Changes: 42 new tables
-│
-◇  Primary source ready
-│  postgres-warehouse · PostgreSQL · structural scan complete
+Testing postgres-warehouse
+  Connection test passed
+  Driver: PostgreSQL - Tables: 42
+
+Building schema context for postgres-warehouse
+  Running fast database ingest
+
+Schema context complete for postgres-warehouse
+  Changes: 42 new tables
+
+Database ready
+  postgres-warehouse - PostgreSQL - schema context complete
 ```

-For Snowflake and BigQuery, the wizard offers **Historic SQL** configuration for query history views. For PostgreSQL, enable Historic SQL with `--enable-historic-sql` when `pg_stat_statements` is configured.
+For PostgreSQL, Snowflake, and BigQuery, the wizard can enable query-history
+ingest when the warehouse history feature is available. Query history is stored
+under `connections.<id>.context.queryHistory` in `ktx.yaml`.

 ## Step 4: Add context sources

@@ -138,7 +144,8 @@ Context sources are saved to `ktx.yaml` and built during the next step.

 ## Step 5: Build context

-This is where KTX does the heavy lifting. It runs an enriched scan of your database (generating AI-powered column and table descriptions) and ingests metadata from any configured context sources.
+This is where KTX builds agent-ready context. It uses the database context
+depth saved by setup and ingests metadata from any configured context sources.

 ```
 ◆  Build KTX context for agents?
@@ -146,27 +153,22 @@ This is where KTX does the heavy lifting. It runs an enriched scan of your datab
 │  ○ Leave context unbuilt and exit setup
 ```

-The build scans each primary source with LLM enrichment, detects table relationships, and runs ingestion agents that reconcile metadata from your context sources into semantic-layer YAML files and wiki pages.
+Fast database context builds deterministic schema grounding. Deep database
+context also generates AI descriptions, embeddings, and relationship evidence
+when those capabilities are configured.

-For a small database (under 50 tables), this takes a few minutes. Larger warehouses can take longer. You can press <kbd>d</kbd> to detach and let it run in the background:
-
-```
-KTX context build
-Run: setup-context-local-abc123
-Project: /home/user/analytics
-
-Detach: press d to leave this running.
-Resume: ktx setup --project-dir /home/user/analytics
-Status: ktx status --project-dir /home/user/analytics
-```
+For a small database (under 50 tables), this can take a few minutes. Larger
+warehouses can take longer. Context builds run in the foreground; press
+<kbd>Ctrl+C</kbd> to stop the current run and rerun `ktx setup` or `ktx ingest`
+when you are ready to try again.

 When the build completes, KTX verifies that agent-ready context was produced:

 ```
 KTX context is ready for agents.

-Primary sources:
-  postgres-warehouse: enriched scan complete
+Databases:
+  postgres-warehouse: deep context complete

 Context sources:
  dbt-main: memory update complete
@@ -209,8 +211,8 @@ KTX writes project state as plain files so agents can inspect and edit changes i
 | `ktx.yaml` | `ktx setup` | Main project configuration: connections, LLM settings, embeddings, and context sources |
 | `.ktx/secrets/*` | `ktx setup` when file-backed secrets are selected | Local secret files referenced from `ktx.yaml`; do not commit these |
 | `semantic-layer/<connection-id>/*.yaml` | context build, ingestion, or direct file edits | Semantic source definitions agents use for SQL generation |
-| `wiki/global/*.md` | ingestion, memory capture, `ktx wiki write --scope global`, or direct file edits | Shared business context and metric definitions |
-| `wiki/user/<user-id>/*.md` | memory capture, `ktx wiki write --scope user`, or direct file edits | User-scoped notes for one agent/user context |
+| `wiki/global/*.md` | ingestion, memory capture, or direct file edits | Shared business context and metric definitions |
+| `wiki/user/<user-id>/*.md` | memory capture or direct file edits | User-scoped notes for one agent/user context |
 | `.claude/skills/ktx/SKILL.md`, `.agents/skills/ktx/SKILL.md` | CLI-mode agent integration setup | Agent instructions for calling public `ktx` commands |

 ## Verify it worked
@@ -226,7 +228,7 @@ KTX project: /home/user/analytics
 Project ready: yes
 LLM ready: yes (claude-sonnet-4-6)
 Embeddings ready: yes (text-embedding-3-small)
-Primary sources configured: yes (postgres-warehouse)
+Databases configured: yes (postgres-warehouse)
 Context sources configured: yes (dbt-main)
 KTX context built: yes
 Agent integration ready: yes (claude-code:project)
@@ -246,7 +248,7 @@ Agent integration ready: yes (claude-code:project)

 ## Next steps

- **Build more context** — learn about [scanning](/docs/guides/building-context), relationship detection, and ingestion workflows in the Building Context guide.
+- **Build more context** — learn about [database ingest](/docs/guides/building-context), relationship detection, and source ingestion workflows in the Building Context guide.
 - **Refine your semantic layer** — the [Writing Context](/docs/guides/writing-context) guide covers source YAML, measures, joins, and wiki pages.
 - **Understand the architecture** — read [The Context Layer](/docs/concepts/the-context-layer) to learn why a context layer is more than a semantic layer.
 - **Connect more agents** — see the [Agent Clients](/docs/integrations/agent-clients) integration page for per-tool setup details.
--- a/docs-site/content/docs/guides/building-context.mdx
+++ b/docs-site/content/docs/guides/building-context.mdx
@@ -1,39 +1,48 @@
 ---
 title: Building Context
-description: Scan your database schema and ingest context from dbt, Looker, Metabase, and more.
+description: Build database and source context from configured KTX connections.
 ---

-Building context is a two-step process. First, you **scan** your database to discover its structure — tables, columns, types, constraints, and relationships. Then you **ingest** from your existing tools to enrich that structure with semantic meaning — metric definitions, business descriptions, join logic, and knowledge that agents need to generate correct analytics.
+Building context reads your configured connections and writes local context that
+agents can use. Database connections produce schema context, and source
+connections such as dbt, Looker, Metabase, and Notion produce semantic sources
+and wiki pages.

-## Scanning
+## Database ingest

-Scanning connects to your database and extracts structural metadata. KTX stores the results locally so agents can understand your schema without querying the database directly.
+Database ingest connects to your warehouse and extracts structural metadata.
+KTX stores the results locally so agents can understand your schema without
+querying the database directly.

-### Running a scan
+### Running database ingest

 ```bash
-ktx scan <connection-id>
+ktx ingest <connection-id>
 ```

-This runs a structural scan by default. You can control what the scan does with the `--mode` flag:
+This runs a fast schema ingest by default. You can choose the depth with public
+flags:

-| Mode | What it does |
+| Flag | What it does |
 |------|-------------|
-| `structural` | Tables, columns, types, constraints, row counts (default) |
-| `enriched` | Structural scan plus LLM-generated column descriptions |
-| `relationships` | Structural scan plus foreign key relationship detection |
+| `--fast` | Tables, columns, types, constraints, and row counts |
+| `--deep` | Fast ingest plus AI-enriched database context |

 ```bash
-# Scan with relationship detection
-ktx scan my-postgres --mode relationships
+# Build one connection quickly
+ktx ingest my-postgres --fast

-# Preview without writing results
-ktx scan my-postgres --dry-run
+# Build AI-enriched database context
+ktx ingest my-postgres --deep
+
+# Build all configured connections
+ktx ingest --all
 ```

-### Checking scan results
+### Checking results

-Every scan prints a summary and writes local artifacts. Use `ktx status` after a scan to review project readiness and follow-up setup work:
+Every ingest prints a summary and writes local artifacts. Use `ktx status`
+after ingest to review project readiness and follow-up setup work:

 ```bash
 ktx status
@@ -49,7 +58,9 @@ Many databases lack declared foreign keys. KTX infers relationships by scoring c
 | 0.55 &ndash; 0.84 | `review` | Plausible — needs human review |
 | &lt; 0.55 | `rejected` | Low confidence — not applied |

-Relationship scans run with `ktx scan <connection-id> --mode relationships`. This command only executes the scan; relationship review and calibration subcommands are not part of the current CLI surface.
+Deep database ingest can include relationship evidence where the connector can
+provide it. Relationship review and calibration subcommands are not part of the
+current public CLI surface.

 ## Ingestion

@@ -66,50 +77,34 @@ Each ingest run follows this flow:
 ### Running an ingest

 ```bash
-ktx ingest run --connection-id my-dbt-source --adapter dbt
+ktx ingest my-dbt-source
 ```

-Useful low-level flags:
+Useful output flags:

 | Flag | Description |
 |------|-------------|
-| `--source-dir <path>` | Directory containing source files (e.g., your dbt project) |
-| `--viz` | Render the memory-flow TUI for real-time progress |
 | `--json` | Output as JSON |
 | `--plain` | Plain text output |

-### Watching progress
+Foreground context builds do not detach into background control sessions. If a
+run is interrupted, rerun `ktx ingest <connection-id>` or `ktx ingest --all`.

-```bash
-# Check status of the latest ingest
-ktx ingest status
+### Supported context sources

-# Check a specific run
-ktx ingest status <run-id>
-
-# Open the visual ingest report (TUI)
-ktx ingest watch
-
-# Replay a past ingest run
-ktx ingest replay <run-id>
-```
-
-The `watch` command opens an interactive TUI that shows the memory-flow output — every tool call, LLM decision, and artifact written during the ingest.
-
-### Available adapters
-
-| Adapter | Source | What gets ingested |
-|---------|--------|--------------------|
+| Driver | Source | What gets ingested |
+|--------|--------|--------------------|
 | `dbt` | dbt project | Model definitions, column descriptions, tests, tags |
 | `metricflow` | MetricFlow semantic models | Metrics, dimensions, entities, semantic joins |
 | `lookml` | LookML files | Views, explores, dimensions, measures, joins |
 | `looker` | Looker API | Explores, looks, dashboard metadata |
 | `metabase` | Metabase API | Questions, dashboards, table metadata |
 | `notion` | Notion API | Database pages, knowledge articles |
-| `historic-sql` | Query history | Frequent queries, usage patterns, runtime stats |
-| `live-database` | Direct DB connection | Live schema introspection |

-See [Context Sources](/docs/integrations/context-sources) for adapter-specific setup and auth configuration.
+Query history is a database connection facet. Enable it persistently with
+`connections.<id>.context.queryHistory`, or pass `--query-history` to enable
+it for a single run. See [Context Sources](/docs/integrations/context-sources)
+for driver-specific setup and auth configuration.
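+
+As a minimal sketch of the config path above, assuming a PostgreSQL connection
+with the placeholder id `my-postgres` and a hypothetical `DATABASE_URL`
+environment variable (other connection fields are omitted), query history
+nests under the connection's `context` key:
+
+```yaml
+# Illustrative only: `my-postgres` and `DATABASE_URL` are placeholder names.
+connections:
+  my-postgres:
+    url: env:DATABASE_URL
+    context:
+      queryHistory:
+        enabled: true
+```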

 ### What gets generated

@@ -169,12 +164,8 @@ sl_refs: [orders]
 Orders in "pending" status for more than 48 hours are flagged for review.
 ```

-### Deterministic replay
+### Ingest transcripts

-Every ingest session records a full transcript — tool calls, LLM responses, and write decisions. You can replay any session to debug why a source was written a certain way:
-
-```bash
-ktx ingest replay <run-id> --viz
-```
-
-This opens the same TUI view as the original run, letting you step through the agent's reasoning.
+Every ingest session records a full transcript: tool calls, LLM responses, and
+write decisions. Inspect the stored transcript files when you need to debug why
+a source was written a certain way.
--- a/docs-site/content/docs/guides/writing-context.mdx
+++ b/docs-site/content/docs/guides/writing-context.mdx
@@ -248,8 +248,7 @@ wiki/
 ### Editing pages

 Create and edit wiki pages directly as Markdown files in the `wiki/`
-directory, or with `ktx wiki write`. Ingest and memory capture also create
-these pages automatically.
+directory. Ingest and memory capture also create these pages automatically.

 Wiki page fields:

--- a/docs-site/content/docs/integrations/agent-clients.mdx
+++ b/docs-site/content/docs/integrations/agent-clients.mdx
@@ -125,8 +125,6 @@ All supported agent clients call the same KTX CLI commands:
 |---------|-------------|
 | `ktx status --json` | Return project setup and context readiness |
 | `ktx wiki search <query> --json` | Search wiki pages |
-| `ktx wiki read <key> --json` | Read a wiki page |
-| `ktx wiki write <key>` | Write or update a wiki page |
 | `ktx sl list --json` | List semantic-layer sources |
 | `ktx sl search <query> --json` | Search semantic-layer sources |
 | `ktx sl validate <source> --connection-id <id>` | Validate semantic source definitions |
--- a/docs-site/content/docs/integrations/context-sources.mdx
+++ b/docs-site/content/docs/integrations/context-sources.mdx
@@ -9,12 +9,13 @@ All context sources are configured in `ktx.yaml` under `connections` with their

 ## Ingestion workflow

-Agents should configure and ingest context sources in this order:
+Agents must configure and ingest context sources in this order:

 1. Add the context source connection in `ktx.yaml` or with `ktx setup`.
 2. Store tokens as `env:NAME` or `file:/path/to/secret`.
-3. Run `ktx ingest run --connection-id <connectionId> --adapter <adapter>` for one source or `ktx ingest run --connection-id <id> --adapter <adapter>`.
-4. Check progress with `ktx ingest status --json`.
+3. Run `ktx ingest <connectionId>` for one source or `ktx ingest --all` for
+   every configured source.
+4. Review the foreground ingest output.
 5. Review generated `semantic-layer/` YAML and `wiki/` Markdown files in git.
 6. Validate changed semantic sources with `ktx sl validate`.

--- a/docs-site/content/docs/integrations/primary-sources.mdx
+++ b/docs-site/content/docs/integrations/primary-sources.mdx
@@ -3,13 +3,17 @@ title: Primary Sources
 description: Connect KTX to PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, or SQLite.
 ---

-KTX connects to your data warehouse or database to scan schemas, discover relationships, and execute semantic layer queries. Each connection is defined in `ktx.yaml` under the `connections` key.
+KTX connects to your data warehouse or database to build schema context,
+discover relationships, and execute semantic layer queries. Each connection is
+defined in `ktx.yaml` under the `connections` key.

 All connectors share these conventions:

-- Sensitive values support `env:VAR_NAME` (read from environment) and `file:/path/to/secret` (read from file) references
-- Connections are read-only — KTX never writes to your database
-- Schema scanning discovers tables, columns, types, and constraints automatically
+- Sensitive values support `env:VAR_NAME` (read from environment) and
+  `file:/path/to/secret` (read from file) references
+- Connections are read-only; KTX never writes to your database
+- Database ingest discovers tables, columns, types, and constraints
+  automatically
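+
+As an illustration of the reference syntax, assuming a field-by-field
+PostgreSQL connection with placeholder names (`my-postgres`, `PGPASSWORD`,
+and the host, database, and username values; any other required fields are
+omitted), a secret can point at an environment variable or a file instead of
+a literal value:
+
+```yaml
+# Sketch only: connection id, host, and variable names are placeholders.
+connections:
+  my-postgres:
+    host: db.internal.example
+    port: 5432
+    database: analytics
+    username: ktx_reader
+    # Read the password from the environment...
+    password: env:PGPASSWORD
+    # ...or from a file instead: password: file:/path/to/secret
+```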

 ## Connection field reference

@@ -21,7 +25,7 @@ Agents should prefer environment or file references over literal secrets.
 | `url` | One of the connection methods | URL-style connectors | Database URL, `env:NAME`, or `file:/path/to/secret` |
 | `host`, `port`, `database`, `username`, `password` | One of the connection methods | PostgreSQL, MySQL, ClickHouse, SQL Server | Field-by-field connection values |
 | `schema` or `schemas` | No | schema-aware warehouses | Single schema or list of schemas to scan |
-| `historicSql` | No | supported warehouses | Enables query-history ingestion when the warehouse supports it |
+| `context.queryHistory` | No | PostgreSQL, Snowflake, BigQuery | Enables query-history ingestion when the warehouse supports it |
 | `path` | Yes for path-style SQLite | SQLite | Local SQLite database path or `env:NAME` reference |
 | `max_bytes_billed` | No | BigQuery | Maximum bytes billed per query job |
 | `job_timeout_ms` | No | BigQuery | BigQuery query job timeout in milliseconds |
@@ -29,7 +33,7 @@ Agents should prefer environment or file references over literal secrets.

 ## PostgreSQL

-The most full-featured connector. Supports schema introspection, foreign key detection, column statistics, and historic SQL via `pg_stat_statements`.
+The most full-featured connector. Supports schema introspection, foreign key detection, column statistics, and query history via `pg_stat_statements`.

 ### Connection config

@@ -75,12 +79,13 @@ connections:
 | Foreign keys | Yes | Full constraint detection |
 | Row count estimates | Yes | Via `pg_class.reltuples` |
 | Column statistics | Yes | Requires `pg_read_all_stats` role |
-| Historic SQL | Yes | Via `pg_stat_statements` extension |
+| Query history | Yes | Via `pg_stat_statements` extension |
 | Table sampling | Yes | `TABLESAMPLE SYSTEM` |

-### Historic SQL
+### Query history

-PostgreSQL Historic SQL mines real query patterns from `pg_stat_statements`. This is the most mature local Historic SQL path and helps KTX understand how your team actually queries the data.
+PostgreSQL query history mines real query patterns from `pg_stat_statements`.
+This helps KTX understand how your team actually queries the data.

 **Requirements:**
 - `pg_stat_statements` extension enabled
@@ -89,12 +94,12 @@ PostgreSQL Historic SQL mines real query patterns from `pg_stat_statements`. Thi
 **Config options:**

 ```yaml
-historicSql:
-  enabled: true
-  dialect: postgres
-  minExecutions: 5
-  filters:
-    dropTrivialProbes: true
+    context:
+      queryHistory:
+        enabled: true
+        minExecutions: 5
+        filters:
+          dropTrivialProbes: true
 ```

 ### Dialect notes
@@ -108,7 +113,7 @@ historicSql:

 ## Snowflake

-Connects via the Snowflake SDK. Supports multi-schema scanning, RSA key authentication, and Historic SQL configuration for Snowflake query history.
+Connects via the Snowflake SDK. Supports multi-schema scanning, RSA key authentication, and query history from `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY`.

 ### Connection config

@@ -150,27 +155,27 @@ For multiple schemas:
 | Foreign keys | No | Not available in Snowflake |
 | Row count estimates | Yes | From `INFORMATION_SCHEMA.TABLES.ROW_COUNT` |
 | Column statistics | No | — |
-| Historic SQL | Yes | Via `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY` when enabled |
+| Query history | Yes | Via `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY` when enabled |
 | Table sampling | Yes | — |

-### Historic SQL
+### Query history

-Snowflake Historic SQL reads aggregated query-history templates from
+Snowflake query history reads aggregated query-history templates from
 `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY` and feeds the same unified staged
 artifact shape as Postgres and BigQuery.

 ```yaml
-historicSql:
-  enabled: true
-  dialect: snowflake
-  windowDays: 90
-  minExecutions: 5
-  filters:
-    dropTrivialProbes: true
-    serviceAccounts:
-      patterns: ['^svc_']
-      mode: exclude
-  redactionPatterns: []
+    context:
+      queryHistory:
+        enabled: true
+        windowDays: 90
+        minExecutions: 5
+        filters:
+          dropTrivialProbes: true
+          serviceAccounts:
+            patterns: ['^svc_']
+            mode: exclude
+        redactionPatterns: []
 ```

 ### Dialect notes
@@ -184,7 +189,7 @@ historicSql:

 ## BigQuery

-Authenticates via GCP service account credentials. Supports multi-dataset scanning and Historic SQL configuration for `INFORMATION_SCHEMA.JOBS_BY_PROJECT`.
+Authenticates via GCP service account credentials. Supports multi-dataset scanning and query-history configuration for `INFORMATION_SCHEMA.JOBS_BY_PROJECT`.

 ### Connection config

@@ -227,27 +232,27 @@ mapping metadata. The BigQuery connector still authenticates with the
 | Foreign keys | No | Not available in BigQuery |
 | Row count estimates | Yes | From table metadata |
 | Column statistics | No | — |
-| Historic SQL | Yes | Via region-scoped `INFORMATION_SCHEMA.JOBS_BY_PROJECT` when enabled |
+| Query history | Yes | Via region-scoped `INFORMATION_SCHEMA.JOBS_BY_PROJECT` when enabled |
 | Table sampling | Yes | — |

-### Historic SQL
+### Query history

-BigQuery Historic SQL reads aggregated query-history templates from
+BigQuery query history reads aggregated query-history templates from
 region-scoped `INFORMATION_SCHEMA.JOBS_BY_PROJECT` and feeds the same unified
 staged artifact shape as Postgres and Snowflake.

 ```yaml
-historicSql:
-  enabled: true
-  dialect: bigquery
-  windowDays: 90
-  minExecutions: 5
-  filters:
-    dropTrivialProbes: true
-    serviceAccounts:
-      patterns: ['@bot\\.']
-      mode: exclude
-  redactionPatterns: []
+    context:
+      queryHistory:
+        enabled: true
+        windowDays: 90
+        minExecutions: 5
+        filters:
+          dropTrivialProbes: true
+          serviceAccounts:
+            patterns: ['@bot\\.']
+            mode: exclude
+        redactionPatterns: []
 ```

 ### Dialect notes
@@ -303,7 +308,7 @@ connections:
 | Foreign keys | No | Not a ClickHouse concept |
 | Row count estimates | Yes | Via `system.parts` aggregation |
 | Column statistics | No | — |
-| Historic SQL | No | — |
+| Query history | No | — |
 | Table sampling | Yes | — |

 ### Dialect notes
@@ -360,7 +365,7 @@ connections:
 | Foreign keys | Yes | Via `REFERENTIAL_CONSTRAINTS` |
 | Row count estimates | Yes | From `TABLE_ROWS` (InnoDB estimate) |
 | Column statistics | No | — |
-| Historic SQL | No | — |
+| Query history | No | — |
 | Table sampling | Yes | Uses `RAND()` filter |

 ### Dialect notes
@@ -426,7 +431,7 @@ For multiple schemas:
 | Foreign keys | Yes | Via `REFERENTIAL_CONSTRAINTS` |
 | Row count estimates | Yes | Via `sys.dm_db_partition_stats` |
 | Column statistics | No | — |
-| Historic SQL | No | — |
+| Query history | No | — |
 | Table sampling | Yes | — |
 | Nested analysis | No | — |

@@ -484,7 +489,7 @@ No authentication required — SQLite is file-based. The file must be readable b
 | Foreign keys | Yes | Via `PRAGMA foreign_key_list()` (requires `PRAGMA foreign_keys = ON`) |
 | Row count estimates | Yes | Exact count via `SELECT COUNT(*)` |
 | Column statistics | No | — |
-| Historic SQL | No | — |
+| Query history | No | — |
 | Table sampling | Yes | — |
 | Nested analysis | No | — |

@@ -502,7 +507,7 @@ No authentication required — SQLite is file-based. The file must be readable b
 | Error or symptom | Likely cause | Recovery |
 |------------------|--------------|----------|
 | Connection URL appears in git diff | A literal credential URL was written to `ktx.yaml` | Replace it with `env:NAME` or `file:/path/to/secret` and rotate exposed credentials |
-| Scan returns no tables | Schema/database/project filter is wrong or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions |
-| Historic SQL is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun scan or setup |
-| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on structural scan output |
+| Database ingest returns no tables | Schema, database, or project filter is wrong, or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions |
+| Query history is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun `ktx ingest <connectionId> --query-history` or `ktx setup` |
+| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on fast schema context |
 | Semantic query execution fails | Connection is missing, unreachable, or query execution is disabled | Run `ktx connection test <id>` and check the `ktx sl query` flags |