feat: merge ingest and scan

* docs: add CLI component reuse guidance

* docs: add unified ingest ux design

* Refine unified ingest UX design after adversarial review iteration 1

* Refine unified ingest UX design after adversarial review iteration 2

* Refine unified ingest UX design after adversarial review iteration 3

* feat(cli): route public connection ingest command

* feat(cli): hide standalone scan from public help

* feat(cli): plan public ingest depth and query history

* feat(cli): execute public database ingest facets

* feat(ingest): read connection query history config

* fix(cli): use public ingest wording

* fix(config): stop generating ingest adapter allow lists

* docs: document public ingest command

* test: align ingest surface expectations

* docs: add unified ingest public CLI surface plan

* feat(cli): preflight deep public ingest readiness

* feat(setup): store query history in connection context

* feat(setup): store database context depth

* feat(setup): verify context readiness by database depth

* fix(setup): keep context build foreground only

* fix(config): reject reserved ingest connection ids

* test: close unified ingest v1 expectations

* docs: add unified ingest v1 closure plan

* fix(ingest): bypass adapter allow-list for public source ingest

* fix(ingest): honor query history window intent

* fix(ingest): hide scan internals from public database ingest

* feat(ingest): use foreground view for interactive public ingest

* fix(setup): use schema context and query history wording

* test(cli): verify unified ingest public output

* docs: add unified ingest v1 public output closure plan

* fix(setup): forward query history flags

* fix(setup): prompt for postgres query history

* fix(status): report query history readiness

* fix(ingest): remove legacy public guidance

* fix(ingest): polish foreground retry copy

* docs(examples): use unified query history wording

* chore(ingest): finish public query history cleanup

* docs: add unified ingest v1 query history status cleanup plan

* test(docs): cover unified ingest public docs

* docs: align ingest CLI reference with unified UX

* docs: update context build guides for unified ingest

* docs: update setup and primary source ingest wording

* docs: stop advertising adapter-backed example ingest

* docs: close unified ingest public docs gaps

* docs: add unified ingest v1 docs site closure plan

* fix: render unified ingest foreground warnings

* fix: explain query history schema order

* fix: add public ingest retry guidance

* fix: align setup next steps with unified ingest

* fix: remove scan wording from demo progress

* test: verify unified ingest ux closure

* docs: add unified ingest v1 foreground and retry closure plan

* fix(cli): preserve query-history pull config in public ingest

* fix(cli): omit hidden commands from docs command tree

* test(cli): close unified ingest final public surface checks

* docs: add unified ingest v1 final public surface closure plan

* fix(cli): use public source labels in ingest reports

* fix(cli): suppress low-level public ingest output

* test(cli): verify unified ingest public plain output

* docs: add unified ingest v1 public plain output closure plan

* fix(cli): add public ingest copy sanitizers

* fix(cli): sanitize public ingest progress copy

* fix(cli): rename setup schema scope prompt

* docs(plan): add progress copy closure; test: align setup back-nav fixture

Adds the iter9 plan and updates the setup back-navigation test fixture
to pass disableQueryHistory plus listSchemas/listTables stubs that the
unified ingest setup step now requires.

* docs(plan): add final ux labels plan with narrowed label scans

* fix(cli): aggregate unsupported query-history warnings

* fix(cli): align setup database labels

* test(cli): fix setup database test type-check

* fix(cli): remove primary-source wording from setup output

* test(cli): verify unified ingest setup closure

* docs(plan): add unified ingest v1 verification copy closure plan

* fix(cli): remove top-level scan command

* fix(cli): remove legacy ingest and wiki commands

* Merge scan into ingest flow

* feat(cli): split ingest progress into per-phase rows, rename work units to tasks

Each database target in the unified ingest dashboard now renders one row per
real subprocess (Schema, then Query history when enabled) instead of a single
combined bar. Each phase has its own monotonic 0-100% bar so the progress
never snaps back to zero when historic-sql starts after scan completes.
Completed phases keep their final bar, summary, and elapsed time visible as
an inline audit trail; queued and skipped phases are shown explicitly.

Also rename user-facing "work units" / "Failed work units" to "tasks" /
"Failed tasks" in ingest output and parseIngestSummary. The parser still
accepts the legacy "Work units:" wording in captured output for backward
compat. Internal memory-flow event names and type fields are left alone.

* Fix test harness failures

* Fix CI smoke checks

---------

Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
Andrey Avtomonov 2026-05-14 01:43:06 +02:00 committed by GitHub
parent 1a472cf3ed
commit b00c1a11a9
118 changed files with 16890 additions and 2992 deletions


@@ -0,0 +1,829 @@
# Unified Ingest V1 Docs Site Closure Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Remove the remaining public documentation surfaces that still present
`ktx scan`, adapter-backed `ktx ingest run`, `ktx ingest watch`,
`live-database`, or `Historic SQL` as normal v1 user workflows.
**Architecture:** Keep the implemented CLI behavior unchanged. Update the
Fumadocs content, example READMEs, and documentation regression tests so public
guidance uses connection-centric `ktx ingest <connectionId>`, `ktx ingest
--all`, `--fast`, `--deep`, `--query-history`, `ktx ingest status`, and
`ktx ingest replay`.
**Tech Stack:** Markdown, MDX frontmatter, Fumadocs page metadata, Node test
runner, pnpm workspace scripts.
---
## Current audit
The four implemented unified-ingest plans cover the CLI and setup v1 surface:
- `ktx ingest [connectionId]`, `ktx ingest --all`, `--fast`, `--deep`,
`--query-history`, `--no-query-history`, and
`--query-history-window-days` route through `public-ingest.ts`.
- Database targets run before source targets, public source ingest bypasses
adapter allow-lists, and public database ingest captures internal scan output.
- `ktx scan`, `ktx ingest run`, and `ktx ingest watch` are hidden from normal
help.
- Setup stores `connections.<id>.context.depth`, writes
`connections.<id>.context.queryHistory`, rejects reserved ingest ids, and
uses foreground-only context-build state.
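The ordering rule in the audit (database targets run before source targets) can be illustrated with a small sketch. The `kind` field and the `orderIngestTargets` helper are hypothetical names chosen for illustration; they are not taken from `public-ingest.ts`.

```js
// Hypothetical sketch: order ingest targets so database connections run first.
// `kind` and `orderIngestTargets` are illustrative names, not the real API.
function orderIngestTargets(targets) {
  const rank = { database: 0, source: 1 };
  // Copy before sorting. Array.prototype.sort is stable, so targets of the
  // same kind keep their configured order.
  return [...targets].sort((a, b) => rank[a.kind] - rank[b.kind]);
}

const ordered = orderIngestTargets([
  { id: 'notion', kind: 'source' },
  { id: 'warehouse', kind: 'database' },
  { id: 'dbt', kind: 'source' },
]);
console.log(ordered.map((t) => t.id).join(', ')); // prints "warehouse, notion, dbt"
```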
### V1-blocking gaps
- `docs-site/content/docs/cli-reference/ktx-ingest.mdx` still documents
adapter-level `ktx ingest run`, `--adapter`, `ktx ingest watch`, and
`live-database`.
- `docs-site/content/docs/cli-reference/ktx-scan.mdx` still presents
`ktx scan` as a public command, and
`docs-site/content/docs/cli-reference/meta.json` still publishes it in the
CLI reference.
- `docs-site/content/docs/cli-reference/ktx-dev.mdx` still links to root
`ktx scan` as a normal command.
- `docs-site/content/docs/guides/building-context.mdx` still has an adapter
table that lists `historic-sql` and `live-database`, and it still documents
`ktx ingest watch` as the visual progress path.
- `docs-site/content/docs/integrations/context-sources.mdx` still instructs
users to run
`ktx ingest run --connection-id <connectionId> --adapter <adapter>`.
- `docs-site/content/docs/concepts/context-as-code.mdx` still recommends
scheduled
`ktx ingest run --connection-id <id> --adapter <adapter> --no-input`.
- `docs-site/content/docs/getting-started/quickstart.mdx` still says setup
runs structural/enriched scans, exposes Historic SQL flags, and describes
detach/background context-build behavior.
- `docs-site/content/docs/integrations/primary-sources.mdx` still uses the
legacy `historicSql` config shape and Historic SQL wording for supported
query-history drivers.
- `examples/README.md` and `examples/local-warehouse/README.md` still present
`ktx ingest run --adapter fake` as the example command.
### Non-blocking gaps
- Hidden debug commands can continue to call `ktx scan`,
`ktx ingest run`, and `ktx ingest watch`.
- Internal source keys, raw artifact paths, tests, scripts, and developer-only
package taxonomy can continue to use `scan`, `live-database`, and
`historic-sql`.
- Contributor docs can continue to mention scan internals when describing
package ownership or connector implementation details.
- The `examples/local-warehouse/ktx.yaml` fake adapter fixture can remain for
CLI smoke tests if the public example docs stop recommending it as a normal
user workflow.
## File structure
- Modify `scripts/examples-docs.test.mjs`: add regression assertions for
docs-site and example README unified-ingest wording.
- Modify `docs-site/content/docs/cli-reference/ktx-ingest.mdx`: rewrite the
page around the connection-centric public command.
- Delete `docs-site/content/docs/cli-reference/ktx-scan.mdx`: remove the
public scan reference page.
- Modify `docs-site/content/docs/cli-reference/meta.json`: remove
`ktx-scan` from published CLI reference pages.
- Modify `docs-site/content/docs/cli-reference/ktx-dev.mdx`: remove the
root-scan link and clarify that database context is built by `ktx ingest`.
- Modify `docs-site/content/docs/guides/building-context.mdx`: remove
adapter tables and live watch guidance; describe status/replay only.
- Modify `docs-site/content/docs/integrations/context-sources.mdx`: replace
adapter-backed ingest commands with `ktx ingest <connectionId>`.
- Modify `docs-site/content/docs/concepts/context-as-code.mdx`: replace
scheduled adapter-backed ingest guidance with `ktx ingest --all`.
- Modify `docs-site/content/docs/getting-started/quickstart.mdx`: update setup
language for schema context, depth, query history, and foreground-only
progress.
- Modify `docs-site/content/docs/integrations/primary-sources.mdx`: replace
`historicSql` with `context.queryHistory` and query-history wording.
- Modify `examples/README.md`: stop advertising the fake adapter command as a
public example workflow.
- Modify `examples/local-warehouse/README.md`: mark the fake adapter fixture as
contributor-only and point users to public ingest docs.
## Tasks
### Task 1: Add stale public-doc regression tests
**Files:**
- Modify: `scripts/examples-docs.test.mjs`
- [ ] **Step 1: Add failing docs-site unified-ingest assertions**
In `scripts/examples-docs.test.mjs`, replace the existing test named
`documents public context build workflows in the docs site` with:
```js
it('documents unified public ingest workflows in the docs site', async () => {
  // Load every public doc surface that this plan touches.
  const rootReadme = await readText('README.md');
  const cliMeta = await readText('docs-site/content/docs/cli-reference/meta.json');
  const ingestReference = await readText('docs-site/content/docs/cli-reference/ktx-ingest.mdx');
  const devReference = await readText('docs-site/content/docs/cli-reference/ktx-dev.mdx');
  const buildingContext = await readText('docs-site/content/docs/guides/building-context.mdx');
  const contextSources = await readText('docs-site/content/docs/integrations/context-sources.mdx');
  const contextAsCode = await readText('docs-site/content/docs/concepts/context-as-code.mdx');
  const quickstart = await readText('docs-site/content/docs/getting-started/quickstart.mdx');
  const primarySources = await readText('docs-site/content/docs/integrations/primary-sources.mdx');
  const examplesIndex = await readText('examples/README.md');
  const localWarehouseReadme = await readText('examples/local-warehouse/README.md');

  // Unified-ingest wording that must be present.
  assert.match(ingestReference, /ktx ingest <connectionId>/);
  assert.match(ingestReference, /ktx ingest --all --deep/);
  assert.match(ingestReference, /--query-history-window-days <days>/);
  assert.match(buildingContext, /ktx ingest <connection-id>/);
  assert.match(buildingContext, /ktx ingest --all/);
  assert.match(buildingContext, /ktx ingest replay <run-id>/);
  assert.match(contextSources, /ktx ingest <connectionId>/);
  assert.match(contextAsCode, /ktx ingest --all --no-input/);
  assert.match(quickstart, /schema context/);
  // `queryHistory:` must be nested on the line after `context:`.
  assert.match(primarySources, /context:\n\s+queryHistory:/);

  // Legacy wording that must be gone from the docs site and examples.
  assert.doesNotMatch(cliMeta, /ktx-scan/);
  assert.doesNotMatch(ingestReference, /ktx ingest run/);
  assert.doesNotMatch(ingestReference, /--adapter/);
  assert.doesNotMatch(ingestReference, /ktx ingest watch/);
  assert.doesNotMatch(ingestReference, /live-database/);
  assert.doesNotMatch(devReference, /ktx scan/);
  assert.doesNotMatch(buildingContext, /ktx ingest watch/);
  assert.doesNotMatch(buildingContext, /historic-sql/);
  assert.doesNotMatch(buildingContext, /live-database/);
  assert.doesNotMatch(contextSources, /ktx ingest run --connection-id/);
  assert.doesNotMatch(contextSources, /--adapter <adapter>/);
  assert.doesNotMatch(contextAsCode, /ktx ingest run --connection-id/);
  assert.doesNotMatch(quickstart, /Historic SQL/);
  assert.doesNotMatch(quickstart, /--enable-historic-sql/);
  assert.doesNotMatch(quickstart, /press <kbd>d<\/kbd> to detach/);
  assert.doesNotMatch(primarySources, /historicSql/);
  assert.doesNotMatch(primarySources, /Historic SQL/);
  assert.doesNotMatch(examplesIndex, /ktx ingest run --project-dir/);
  assert.doesNotMatch(localWarehouseReadme, /ktx ingest run --project-dir/);

  // Root README expectations.
  assert.match(rootReadme, /raw-sources\//);
  assert.doesNotMatch(rootReadme, new RegExp(`${['live', 'database'].join('-')}/`));
  assert.doesNotMatch(rootReadme, /ktx scan/);
  assert.doesNotMatch(rootReadme, /Run a local ingest smoke test/);
  assert.doesNotMatch(rootReadme, /ktx ingest run --project-dir/);
  assert.doesNotMatch(rootReadme, /ktx ingest status --project-dir/);
});
```
- [ ] **Step 2: Run the failing docs regression test**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: FAIL with assertions matching the stale docs-site and example README
content.
- [ ] **Step 3: Commit the failing test**
```bash
git add scripts/examples-docs.test.mjs
git commit -m "test(docs): cover unified ingest public docs"
```
### Task 2: Rewrite the CLI reference surface
**Files:**
- Modify: `docs-site/content/docs/cli-reference/ktx-ingest.mdx`
- Delete: `docs-site/content/docs/cli-reference/ktx-scan.mdx`
- Modify: `docs-site/content/docs/cli-reference/meta.json`
- Modify: `docs-site/content/docs/cli-reference/ktx-dev.mdx`
- [ ] **Step 1: Rewrite `ktx-ingest.mdx`**
Replace `docs-site/content/docs/cli-reference/ktx-ingest.mdx` with:
````mdx
---
title: "ktx ingest"
description: "Build, inspect, and replay KTX context ingest runs."
---
`ktx ingest` builds or refreshes KTX context from configured connections.
Database connections build schema context. Context-source connections ingest
metadata from tools such as dbt, Looker, Metabase, MetricFlow, LookML, and
Notion.
## Command signature
```bash
ktx ingest [options] [connectionId]
```
Use a connection id to build one configured connection. Use `--all` to build
every configured connection. Database connections run before context-source
connections when you use `--all`.
## Build options
| Flag | Description | Default |
|------|-------------|---------|
| `--all` | Build every configured connection | `false` |
| `--fast` | Use deterministic database schema ingest | Stored connection default, or `fast` |
| `--deep` | Use AI-enriched database ingest | Stored connection default, or `fast` |
| `--query-history` | Include database query-history usage patterns | Stored connection default |
| `--no-query-history` | Skip database query-history usage patterns for this run | Stored connection default |
| `--query-history-window-days <days>` | Query-history lookback window for this run | Stored connection default |
| `--plain` | Print plain text output | `true` |
| `--json` | Print JSON output | `false` |
| `--no-input` | Disable interactive terminal input | `false` |
`--fast` and `--deep` are mutually exclusive. Depth flags apply only to
database connections. Query-history flags apply only to database connections
that support query history.
## Status and replay
| Subcommand | Description |
|------------|-------------|
| `status [runId]` | Print status for the latest or selected stored ingest run or report file |
| `replay <runId>` | Replay a stored ingest run or bundle report through memory-flow output |
Both subcommands accept `--report-file <path>`, `--plain`, `--json`, `--viz`,
and `--no-input`.
## Examples
```bash
ktx ingest warehouse
ktx ingest warehouse --fast
ktx ingest warehouse --deep
ktx ingest warehouse --deep --query-history
ktx ingest warehouse --query-history-window-days 30
ktx ingest notion
ktx ingest --all
ktx ingest --all --deep
ktx ingest status
ktx ingest status run-abc123
ktx ingest status --json
ktx ingest replay run-abc123
ktx ingest replay run-abc123 --viz
ktx ingest replay run-abc123 --report-file /tmp/ingest-report.json
```
## Common errors
| Error | Cause | Recovery |
|-------|-------|----------|
| Connection not configured | The connection id is not present in `ktx.yaml` | Add the connection with `ktx setup` or update `ktx.yaml` |
| Deep readiness is missing | `--deep` or query history needs model, embedding, and scan-enrichment configuration | Run `ktx setup` or rerun with `--fast` |
| Query history is unsupported | The selected database driver does not support query history | Run schema ingest without query-history flags |
| Latest run not found | No stored ingest report exists in this project | Run `ktx ingest <connectionId>` first |
| Visual replay fails in a non-interactive shell | Visual report replay needs a terminal | Use `ktx ingest status --json` for agent and CI workflows |
````
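The depth rules in the reference page (`--fast` and `--deep` are mutually exclusive; the default is the stored connection depth, or `fast`) can be sketched as follows. This is a minimal illustration under stated assumptions: `resolveDepth`, the `flags` object shape, and the error message are hypothetical and do not describe the actual CLI implementation.

```js
// Hypothetical sketch of depth resolution for one database connection.
// `resolveDepth` and the `flags` shape are illustrative, not the real CLI code.
function resolveDepth(flags, storedDepth) {
  if (flags.fast && flags.deep) {
    throw new Error('--fast and --deep are mutually exclusive');
  }
  if (flags.deep) return 'deep';
  if (flags.fast) return 'fast';
  // No depth flag given: use the stored connection default, else `fast`.
  return storedDepth ?? 'fast';
}

console.log(resolveDepth({}, undefined)); // prints "fast"
console.log(resolveDepth({}, 'deep')); // prints "deep"
console.log(resolveDepth({ fast: true }, 'deep')); // prints "fast"
```

An explicit flag wins over the stored value for the current run only; nothing in this sketch writes back to `ktx.yaml`.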
- [ ] **Step 2: Remove the public scan page**
Delete `docs-site/content/docs/cli-reference/ktx-scan.mdx`.
- [ ] **Step 3: Remove `ktx-scan` from CLI metadata**
In `docs-site/content/docs/cli-reference/meta.json`, replace the full file
with:
```json
{
  "title": "CLI Reference",
  "defaultOpen": true,
  "pages": [
    "ktx-setup",
    "ktx-connection",
    "ktx-ingest",
    "ktx-sl",
    "ktx-wiki",
    "ktx-status",
    "ktx-dev"
  ]
}
```
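Steps 2 and 3 belong together: once `ktx-scan.mdx` is deleted, a leftover `ktx-scan` entry in `meta.json` would point at a page that no longer exists. The sketch below shows one way to detect that mismatch. `findMissingPages` is a hypothetical helper, not part of the repository or of Fumadocs.

```js
// Hypothetical check: every page listed in meta.json has a matching .mdx file.
// `findMissingPages` is an illustrative helper, not an existing script.
function findMissingPages(meta, mdxFileNames) {
  const available = new Set(mdxFileNames.map((name) => name.replace(/\.mdx$/, '')));
  return meta.pages.filter((page) => !available.has(page));
}

// A stale `ktx-scan` entry is reported after the page file is removed.
const missing = findMissingPages(
  { pages: ['ktx-ingest', 'ktx-scan', 'ktx-dev'] },
  ['ktx-ingest.mdx', 'ktx-dev.mdx'],
);
console.log(missing); // prints [ 'ktx-scan' ]
```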
- [ ] **Step 4: Update the dev command reference**
In `docs-site/content/docs/cli-reference/ktx-dev.mdx`, replace this paragraph:
```mdx
`ktx dev` contains development-only project initialization and managed runtime commands. Scan and ingest commands live at the root as [`ktx scan`](/docs/cli-reference/ktx-scan) and [`ktx ingest`](/docs/cli-reference/ktx-ingest).
```
with:
```mdx
`ktx dev` contains development-only project initialization and managed runtime commands. Context building lives at the root as [`ktx ingest`](/docs/cli-reference/ktx-ingest).
```
- [ ] **Step 5: Run the docs regression test**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: FAIL only on the remaining guide, integration, quickstart, primary
source, and example README stale wording.
- [ ] **Step 6: Commit CLI reference cleanup**
```bash
git add docs-site/content/docs/cli-reference/ktx-ingest.mdx docs-site/content/docs/cli-reference/meta.json docs-site/content/docs/cli-reference/ktx-dev.mdx
git rm docs-site/content/docs/cli-reference/ktx-scan.mdx
git commit -m "docs: align ingest CLI reference with unified UX"
```
### Task 3: Update context-build guides
**Files:**
- Modify: `docs-site/content/docs/guides/building-context.mdx`
- Modify: `docs-site/content/docs/integrations/context-sources.mdx`
- Modify: `docs-site/content/docs/concepts/context-as-code.mdx`
- [ ] **Step 1: Update stored report guidance in `building-context.mdx`**
In `docs-site/content/docs/guides/building-context.mdx`, replace the
`### Watching progress` section through the paragraph after it with:
````mdx
### Inspecting stored reports
```bash
# Check status of the latest ingest
ktx ingest status
# Check a specific run
ktx ingest status <run-id>
# Replay a past ingest run
ktx ingest replay <run-id>
```
`ktx ingest replay` opens the stored memory-flow output for a completed run.
Foreground context builds do not detach into background control sessions; if a
run is interrupted, rerun `ktx ingest <connection-id>` or `ktx ingest --all`.
````
- [ ] **Step 2: Replace the adapter table in `building-context.mdx`**
In the same file, replace the `### Available adapters` heading, table, and
following sentence with:
```mdx
### Supported context sources
| Driver | Source | What gets ingested |
|--------|--------|--------------------|
| `dbt` | dbt project | Model definitions, column descriptions, tests, tags |
| `metricflow` | MetricFlow semantic models | Metrics, dimensions, entities, semantic joins |
| `lookml` | LookML files | Views, explores, dimensions, measures, joins |
| `looker` | Looker API | Explores, looks, dashboard metadata |
| `metabase` | Metabase API | Questions, dashboards, table metadata |
| `notion` | Notion API | Database pages, knowledge articles |
Query history is a database connection facet. Enable it with
`connections.<id>.context.queryHistory` or pass `--query-history` for a current
run. See [Context Sources](/docs/integrations/context-sources) for
driver-specific setup and auth configuration.
```
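The replacement text treats query history as a per-connection facet that a flag can override for the current run. A minimal sketch of that precedence, assuming the flags parse into `queryHistory` (`true` for `--query-history`, `false` for `--no-query-history`, `undefined` when neither is passed) and `windowDays`; the `resolveQueryHistory` helper and the `false` fallback are assumptions, not documented behavior.

```js
// Hypothetical sketch: run-level flags override connections.<id>.context.queryHistory.
// `resolveQueryHistory` and the `false` fallback are assumptions for illustration.
function resolveQueryHistory(flags, stored = {}) {
  return {
    // A run-level flag wins; otherwise use the stored connection value.
    enabled: flags.queryHistory ?? stored.enabled ?? false,
    windowDays: flags.windowDays ?? stored.windowDays,
  };
}

// Stored config enables query history; --no-query-history skips it for one run.
console.log(resolveQueryHistory({ queryHistory: false }, { enabled: true, windowDays: 90 }));
// Only the window is overridden for this run.
console.log(resolveQueryHistory({ windowDays: 30 }, { enabled: true, windowDays: 90 }));
```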
- [ ] **Step 3: Update context-source workflow commands**
In `docs-site/content/docs/integrations/context-sources.mdx`, replace the
numbered workflow with:
```mdx
Agents must configure and ingest context sources in this order:
1. Add the context source connection in `ktx.yaml` or with `ktx setup`.
2. Store tokens as `env:NAME` or `file:/path/to/secret`.
3. Run `ktx ingest <connectionId>` for one source or `ktx ingest --all` for
every configured source.
4. Check progress with `ktx ingest status --json`.
5. Review generated `semantic-layer/` YAML and `wiki/` Markdown files in git.
6. Validate changed semantic sources with `ktx sl validate`.
```
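Step 2 of the workflow above stores tokens as `env:NAME` or `file:/path/to/secret` references rather than literal values. The sketch below shows how such a reference could be classified without reading the secret. `parseSecretRef` and the `NOTION_TOKEN` variable name are hypothetical; KTX's own resolver may behave differently.

```js
// Hypothetical sketch: classify a secret reference without reading the secret.
// `parseSecretRef` is an illustrative helper, not KTX's resolver.
function parseSecretRef(value) {
  if (value.startsWith('env:')) return { kind: 'env', name: value.slice('env:'.length) };
  if (value.startsWith('file:')) return { kind: 'file', path: value.slice('file:'.length) };
  // Anything else is treated as a literal value in this sketch.
  return { kind: 'literal' };
}

console.log(parseSecretRef('env:NOTION_TOKEN')); // { kind: 'env', name: 'NOTION_TOKEN' }
console.log(parseSecretRef('file:/path/to/secret')); // { kind: 'file', path: '/path/to/secret' }
```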
- [ ] **Step 4: Update scheduled ingest wording**
In `docs-site/content/docs/concepts/context-as-code.mdx`, replace this
paragraph:
```mdx
Teams usually run this on demand while setting up a source, then schedule it once the source is stable. A cron job or CI schedule can run `ktx ingest run --connection-id <id> --adapter <adapter> --no-input` overnight on an ingest branch so the latest dbt manifests, BI metadata, and documentation updates are ready for review each morning.
```
with:
```mdx
Teams usually run this on demand while setting up a source, then schedule it
once the source is stable. A cron job or CI schedule can run `ktx ingest --all
--no-input` overnight on an ingest branch so the latest schema context, dbt
manifests, BI metadata, and documentation updates are ready for review each
morning.
```
- [ ] **Step 5: Run the docs regression test**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: FAIL only on quickstart, primary source, and example README stale
wording.
- [ ] **Step 6: Commit guide cleanup**
```bash
git add docs-site/content/docs/guides/building-context.mdx docs-site/content/docs/integrations/context-sources.mdx docs-site/content/docs/concepts/context-as-code.mdx
git commit -m "docs: update context build guides for unified ingest"
```
### Task 4: Update setup and primary-source docs
**Files:**
- Modify: `docs-site/content/docs/getting-started/quickstart.mdx`
- Modify: `docs-site/content/docs/integrations/primary-sources.mdx`
- [ ] **Step 1: Update database setup copy in quickstart**
In `docs-site/content/docs/getting-started/quickstart.mdx`, replace the first
paragraph under `## Step 3: Connect a database` with:
```mdx
Select one or more databases for KTX to connect to. The wizard supports
SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server, BigQuery, and Snowflake.
```
Replace this sentence:
```mdx
After connecting, KTX automatically runs a connection test and a structural scan:
```
with:
```mdx
After connecting, KTX automatically runs a connection test and builds fast
schema context:
```
Replace the example output block in Step 3 with:
````mdx
```
Testing postgres-warehouse
Connection test passed
Driver: PostgreSQL - Tables: 42
Building schema context for postgres-warehouse
Running fast database ingest
Schema context complete for postgres-warehouse
Changes: 42 new tables
Primary source ready
postgres-warehouse - PostgreSQL - schema context complete
```
````
Replace this paragraph:
```mdx
For Snowflake and BigQuery, the wizard offers **Historic SQL** configuration for query history views. For PostgreSQL, enable Historic SQL with `--enable-historic-sql` when `pg_stat_statements` is configured.
```
with:
```mdx
For PostgreSQL, Snowflake, and BigQuery, the wizard can enable query-history
ingest when the warehouse history feature is available. Query history is stored
under `connections.<id>.context.queryHistory` in `ktx.yaml`.
```
- [ ] **Step 2: Update context-build copy in quickstart**
In the same file, replace the first two paragraphs under
`## Step 5: Build context` with:
```mdx
This is where KTX builds agent-ready context. It uses the database context
depth saved by setup and ingests metadata from any configured context sources.
Fast database context builds deterministic schema grounding. Deep database
context also generates AI descriptions, embeddings, and relationship evidence
when those capabilities are configured.
```
Replace the paragraph and background example that starts with `For a small
database` and ends with the fenced context-build block with:
````mdx
For a small database (under 50 tables), this can take a few minutes. Larger
warehouses can take longer. Context builds run in the foreground; press
<kbd>Ctrl+C</kbd> to stop the current run and rerun `ktx setup` or `ktx ingest`
when you are ready to try again.
````
Replace this output line in the completion example:
```text
postgres-warehouse: enriched scan complete
```
with:
```text
postgres-warehouse: deep context complete
```
Replace the next-steps bullet:
```mdx
- **Build more context** - learn about [scanning](/docs/guides/building-context), relationship detection, and ingestion workflows in the Building Context guide.
```
with:
```mdx
- **Build more context** - learn about [database ingest](/docs/guides/building-context), relationship detection, and source ingestion workflows in the Building Context guide.
```
- [ ] **Step 3: Update primary-source query-history config**
In `docs-site/content/docs/integrations/primary-sources.mdx`, replace the
introductory paragraph and shared conventions with:
```mdx
KTX connects to your data warehouse or database to build schema context,
discover relationships, and execute semantic layer queries. Each connection is
defined in `ktx.yaml` under the `connections` key.
All connectors share these conventions:
- Sensitive values support `env:VAR_NAME` (read from environment) and
`file:/path/to/secret` (read from file) references
- Connections are read-only; KTX never writes to your database
- Database ingest discovers tables, columns, types, and constraints
automatically
```
In the connection field reference table, replace the `historicSql` row with:
```mdx
| `context.queryHistory` | No | PostgreSQL, Snowflake, BigQuery | Enables query-history ingestion when the warehouse supports it |
```
Replace every feature row label `Historic SQL` with `Query history`.
Replace each `### Historic SQL` heading with `### Query history`.
Replace the PostgreSQL query-history config block with:
```yaml
context:
  queryHistory:
    enabled: true
    minExecutions: 5
    filters:
      dropTrivialProbes: true
Replace the Snowflake query-history config block with:
```yaml
context:
  queryHistory:
    enabled: true
    windowDays: 90
    minExecutions: 5
    filters:
      dropTrivialProbes: true
      serviceAccounts:
        patterns: ['^svc_']
        mode: exclude
    redactionPatterns: []
```
Replace the BigQuery query-history config block with:
```yaml
context:
  queryHistory:
    enabled: true
    windowDays: 90
    minExecutions: 5
    filters:
      dropTrivialProbes: true
      serviceAccounts:
        patterns: ['@bot\.']
        mode: exclude
    redactionPatterns: []
```
Replace the common-errors row:
```mdx
| Historic SQL is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun scan or setup |
```
with:
```mdx
| Query history is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun `ktx ingest <connectionId> --query-history` or `ktx setup` |
```
Replace the common-errors row:
```mdx
| Scan returns no tables | Schema/database/project filter is wrong or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions |
```
with:
```mdx
| Database ingest returns no tables | Schema, database, or project filter is wrong, or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions |
```
Replace the common-errors row:
```mdx
| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on structural scan output |
```
with:
```mdx
| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on fast schema context |
```
- [ ] **Step 4: Run targeted stale-term search**
Run:
```bash
rg -n "Historic SQL|historicSql|--enable-historic-sql|--historic-sql|ktx scan|ktx ingest watch|ktx ingest run --connection-id|--adapter <adapter>|live-database" docs-site/content/docs/getting-started/quickstart.mdx docs-site/content/docs/integrations/primary-sources.mdx docs-site/content/docs/cli-reference docs-site/content/docs/guides/building-context.mdx docs-site/content/docs/integrations/context-sources.mdx docs-site/content/docs/concepts/context-as-code.mdx
```
Expected: no output.
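Where `rg` is not installed, the same stale-term check can be approximated in JavaScript. This is a minimal sketch: `findStaleTerms` is a hypothetical helper, and it uses plain substring matching, which is equivalent here because none of the listed terms contain regex metacharacters.

```js
// Hypothetical sketch: report lines that still contain legacy wording.
// `findStaleTerms` is an illustrative helper, not an existing script.
const STALE_TERMS = [
  'Historic SQL',
  'historicSql',
  '--enable-historic-sql',
  '--historic-sql',
  'ktx scan',
  'ktx ingest watch',
  'ktx ingest run --connection-id',
  '--adapter <adapter>',
  'live-database',
];

function findStaleTerms(text, terms = STALE_TERMS) {
  const hits = [];
  text.split('\n').forEach((line, index) => {
    for (const term of terms) {
      // Line numbers are 1-based to match `rg -n` output.
      if (line.includes(term)) hits.push({ line: index + 1, term });
    }
  });
  return hits;
}

console.log(findStaleTerms('Run `ktx ingest warehouse`.\nThen run `ktx scan`.'));
// prints [ { line: 2, term: 'ktx scan' } ]
```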
- [ ] **Step 5: Run the docs regression test**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: FAIL only on example README stale adapter-command wording.
- [ ] **Step 6: Commit setup and primary-source docs cleanup**
```bash
git add docs-site/content/docs/getting-started/quickstart.mdx docs-site/content/docs/integrations/primary-sources.mdx
git commit -m "docs: update setup and primary source ingest wording"
```
### Task 5: Remove public fake-adapter example commands
**Files:**
- Modify: `examples/README.md`
- Modify: `examples/local-warehouse/README.md`
- [ ] **Step 1: Rewrite the local-warehouse section in `examples/README.md`**
In `examples/README.md`, replace the `## local-warehouse` section with:
````md
## local-warehouse
`local-warehouse/` is a contributor fixture for local CLI smoke tests. It uses
the internal fake ingest adapter so tests can exercise memory-flow behavior
without a live database or external service.
For normal context building, use the public connection-centric commands:
```bash
ktx ingest <connectionId>
ktx ingest --all
```
The copied project initializes its own Git repository on first use.
````
- [ ] **Step 2: Rewrite `examples/local-warehouse/README.md`**
Replace `examples/local-warehouse/README.md` with:
````md
# local-warehouse fixture
This directory is a contributor fixture for KTX CLI smoke tests. It uses the
internal fake ingest adapter so tests can run without a live database or
external service.
Normal users should build context with connection-centric ingest:
```bash
ktx ingest <connectionId>
ktx ingest --all
```
The public ingest workflow is documented in
`docs-site/content/docs/cli-reference/ktx-ingest.mdx` and
`docs-site/content/docs/guides/building-context.mdx`.
````
- [ ] **Step 3: Run the docs regression test**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: PASS.
- [ ] **Step 4: Commit example docs cleanup**
```bash
git add examples/README.md examples/local-warehouse/README.md
git commit -m "docs: stop advertising adapter-backed example ingest"
```
### Task 6: Final verification
**Files:**
- Verify: `scripts/examples-docs.test.mjs`
- Verify: `docs-site/content/docs/**/*.mdx`
- Verify: `examples/**/*.md`
- [ ] **Step 1: Run docs regression tests**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: PASS.
- [ ] **Step 2: Run docs-site build**
Run:
```bash
pnpm --filter ktx-docs run build
```
Expected: PASS. If the build fails because this workspace lacks external build
prerequisites, capture the error and run `pnpm --filter ktx-docs run test` as
the closest available docs-site check.
- [ ] **Step 3: Run final stale public-surface search**
Run:
```bash
rg -n "ktx scan|ktx ingest run --connection-id|--adapter <adapter>|ktx ingest watch|live-database|Historic SQL|historicSql|--enable-historic-sql|--historic-sql" docs-site/content/docs examples/README.md examples/local-warehouse/README.md
```
Expected: no output.
- [ ] **Step 4: Inspect git status**
Run:
```bash
git status --short
```
Expected: only the files intentionally changed by this plan appear.
- [ ] **Step 5: Commit verification updates if needed**
If verification required small documentation or test fixes, commit them:
```bash
git add scripts/examples-docs.test.mjs docs-site/content/docs examples/README.md examples/local-warehouse/README.md
git commit -m "docs: close unified ingest public docs gaps"
```
## Self-review
- Spec coverage: This plan covers the remaining public documentation surfaces
that still contradicted the unified ingest UX spec. It intentionally does not
rename internal scan packages, internal adapter keys, raw artifact paths, or
developer-only test fixtures.
- Placeholder scan: No task contains open-ended placeholders. Each edit names
exact files and exact replacement text or commands.
- Type consistency: This is a documentation-only plan. Command names and config
keys match the implemented CLI and config code: `ktx ingest <connectionId>`,
`ktx ingest --all`, `ktx ingest status`, `ktx ingest replay`, and
`connections.<id>.context.queryHistory`.


@ -0,0 +1,494 @@
# Unified Ingest V1 Final Public Surface Closure Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Close the remaining v1-blocking public-surface gaps in unified
`ktx ingest`.
**Architecture:** Keep the current connection-centric ingest planner and hidden
legacy debug commands. Fix the public query-history execution path so it passes
the full canonical `connections.<id>.context.queryHistory` pull config to the
historic-SQL adapter, and filter hidden Commander commands from the
documentation command-tree script so docs/discovery output matches normal CLI
help.
**Tech Stack:** TypeScript ESM, Commander, Vitest, KTX CLI/context packages,
pnpm workspace scripts.
---
## Current audit
The implemented unified-ingest plan chain covers most of the original
`docs/superpowers/specs/2026-05-13-unified-ingest-ux-design.md` spec:
- `ktx ingest [connectionId]`, `ktx ingest --all`, `--fast`, `--deep`,
`--query-history`, `--no-query-history`, and
`--query-history-window-days` route through `public-ingest.ts`.
- Database targets run before source targets. Public source ingest uses
`allowImplicitAdapter: true`, so `ingest.adapters` is no longer required for
inferred public adapters.
- Public database ingest maps `fast` to structural scan internals and `deep` to
enriched scan internals, honors `scan.relationships.enabled`, and isolates
deep-readiness failures per target under `--all`.
- Normal `ktx --help` hides `scan`; normal `ktx ingest --help` hides `run` and
`watch`; setup help exposes query-history flags instead of Historic SQL flags.
- Setup stores `connections.<id>.context.depth` and
`connections.<id>.context.queryHistory`, migrates legacy `historicSql`, and
uses foreground-only context-build state.
- Public docs-site CLI pages no longer document `ktx scan`,
`ktx ingest run --adapter`, or live `ktx ingest watch` as normal workflows.
### V1-blocking gaps
- Public query-history ingest drops configured pull fields. The lower-level
adapter path maps canonical `context.queryHistory` to the existing
`historicSqlUnifiedPullConfigSchema`, but `executePublicIngestTarget()` always
passes `historicSqlPullConfigOverride` with only `dialect` and sometimes
`windowDays`. Normal `ktx ingest warehouse --query-history` can therefore
ignore configured `minExecutions`, `filters`, `redactionPatterns`,
`concurrency`, and `staleArchiveAfterDays`.
- The documentation command-tree script still prints hidden commands. Running
`pnpm --filter @ktx/cli run docs:commands` currently prints top-level
`scan <connectionId>` and `ktx ingest run` / `ktx ingest watch`, even though
the spec requires `ktx scan` and live `ingest watch` not to be presented as
normal public command surfaces.
### Non-blocking gaps
- Hidden debug commands remain callable: `ktx scan`, `ktx ingest run`, and
`ktx ingest watch`. The spec allows hidden/debug placement for old
implementation surfaces in v1.
- Internal adapter keys, package names, WorkUnit keys, raw artifact paths, and
JSON/debug output can continue to use `scan`, `live-database`, and
`historic-sql`.
- Developer-only scripts and tests can keep scan/live-database terminology when
they exercise internal connector or artifact behavior.
- Public docs still use "scan" as a generic noun in a few conceptual database
sections. They do not document `ktx scan` as the public command, so this is
wording cleanup, not v1-blocking behavior.
## File structure
- Modify `packages/cli/src/public-ingest.ts`: preserve the full canonical
query-history pull config in public ingest plans and pass that config to the
lower-level historic-SQL adapter run.
- Modify `packages/cli/src/public-ingest.test.ts`: add regression coverage for
configured query-history fields and current-run `windowDays` overrides.
- Modify `packages/cli/src/command-tree.ts`: filter out Commander commands
marked hidden, using Commander's private `_hidden` field to match Commander help behavior.
- Modify `packages/cli/src/command-tree.test.ts`: cover hidden top-level and
nested command filtering in the pure walker.
- Modify `packages/cli/src/print-command-tree.test.ts`: lock the rendered KTX
docs command tree against hidden unified-ingest commands.
## Tasks
### Task 1: Preserve canonical query-history pull config in public ingest
**Files:**
- Modify: `packages/cli/src/public-ingest.ts`
- Test: `packages/cli/src/public-ingest.test.ts`
- [ ] **Step 1: Write the failing public-ingest query-history config test**
In `packages/cli/src/public-ingest.test.ts`, add this test inside the
`runKtxPublicIngest` describe block, near the existing query-history execution
tests:
```ts
it('preserves configured query-history pull fields while overriding the current-run window', async () => {
  const io = makeIo();
  const project = deepReadyProject({
    warehouse: {
      driver: 'postgres',
      context: {
        queryHistory: {
          enabled: true,
          windowDays: 90,
          minExecutions: 7,
          concurrency: 3,
          staleArchiveAfterDays: 120,
          filters: {
            dropTrivialProbes: true,
            serviceAccounts: { patterns: ['^svc_'], mode: 'exclude' },
            orchestrators: { mode: 'mark-only' },
            dropFailedBelow: { errorRate: 0.5, executions: 3 },
          },
          redactionPatterns: ['(?i)secret'],
        },
      },
    },
  });
  const runScan = vi.fn(async () => 0);
  const runIngest = vi.fn(async () => 0);

  await expect(
    runKtxPublicIngest(
      {
        command: 'run',
        projectDir: '/tmp/project',
        targetConnectionId: 'warehouse',
        all: false,
        json: false,
        inputMode: 'disabled',
        queryHistory: 'enabled',
        queryHistoryWindowDays: 30,
      },
      io.io,
      { loadProject: vi.fn(async () => project), runScan, runIngest },
    ),
  ).resolves.toBe(0);

  const ingestArgs = runIngest.mock.calls[0]?.[0];
  expect(ingestArgs).toMatchObject({
    command: 'run',
    connectionId: 'warehouse',
    adapter: 'historic-sql',
    allowImplicitAdapter: true,
    historicSqlPullConfigOverride: {
      dialect: 'postgres',
      windowDays: 30,
      minExecutions: 7,
      concurrency: 3,
      staleArchiveAfterDays: 120,
      filters: {
        dropTrivialProbes: true,
        serviceAccounts: { patterns: ['^svc_'], mode: 'exclude' },
        orchestrators: { mode: 'mark-only' },
        dropFailedBelow: { errorRate: 0.5, executions: 3 },
      },
      redactionPatterns: ['(?i)secret'],
    },
  });
  expect(ingestArgs?.historicSqlPullConfigOverride).not.toHaveProperty('enabled');
});
```
- [ ] **Step 2: Run the failing public-ingest test**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts --testTimeout 30000
```
Expected: FAIL. The new assertion sees `historicSqlPullConfigOverride` with
`dialect: 'postgres'` and `windowDays: 30`, but without `minExecutions`,
`filters`, `redactionPatterns`, `concurrency`, or
`staleArchiveAfterDays`.
- [ ] **Step 3: Add the full query-history pull config to public plans**
In `packages/cli/src/public-ingest.ts`, update the `queryHistory` field on
`KtxPublicIngestPlanTarget` to include a pull config for enabled query-history
runs:
```ts
queryHistory?: {
  enabled: boolean;
  dialect?: HistoricSqlDialect;
  windowDays?: number;
  pullConfig?: Record<string, unknown>;
  unsupported?: boolean;
  skippedStoredByFast?: boolean;
};
```
Still in `packages/cli/src/public-ingest.ts`, add this helper below
`positiveInteger()`:
```ts
function queryHistoryPullConfig(input: {
  stored: Record<string, unknown>;
  dialect: HistoricSqlDialect;
  windowDays?: number;
}): Record<string, unknown> {
  const { enabled: _enabled, dialect: _dialect, ...storedConfig } = input.stored;
  return {
    ...storedConfig,
    dialect: input.dialect,
    ...(input.windowDays !== undefined ? { windowDays: input.windowDays } : {}),
  };
}
```
Then replace the enabled-query-history return inside
`resolveDatabaseTargetOptions()` with this version:
```ts
if (requestedQh && dialect) {
  if (depth === 'fast') {
    input.warnings.push(`--query-history requires deep ingest; running ${input.connectionId} with --deep.`);
  }
  depth = 'deep';
  return {
    databaseDepth: depth,
    queryHistory: {
      ...queryHistory,
      enabled: true,
      dialect,
      pullConfig: queryHistoryPullConfig({
        stored: storedQh,
        dialect,
        windowDays: queryHistory.windowDays,
      }),
    },
    steps: ['database-schema', 'query-history'],
  };
}
```
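The merge order in `queryHistoryPullConfig()` can be checked in isolation. The sketch below repeats the helper with a placeholder `HistoricSqlDialect` alias (an assumption; the real type is defined elsewhere in the codebase) and invented stored values, so treat it as an illustration rather than project code:

```ts
// Placeholder for the real HistoricSqlDialect type (assumption for this sketch).
type HistoricSqlDialect = string;

function queryHistoryPullConfig(input: {
  stored: Record<string, unknown>;
  dialect: HistoricSqlDialect;
  windowDays?: number;
}): Record<string, unknown> {
  // Drop `enabled` and any stored `dialect`; the resolved dialect is set below.
  const { enabled: _enabled, dialect: _dialect, ...storedConfig } = input.stored;
  return {
    ...storedConfig,
    dialect: input.dialect,
    // A current-run window replaces the stored one; otherwise the stored value survives.
    ...(input.windowDays !== undefined ? { windowDays: input.windowDays } : {}),
  };
}

// Invented sample values, loosely mirroring the regression test fixture.
const stored = { enabled: true, windowDays: 90, minExecutions: 7, concurrency: 3 };

const overridden = queryHistoryPullConfig({ stored, dialect: 'postgres', windowDays: 30 });
const inherited = queryHistoryPullConfig({ stored, dialect: 'postgres' });

console.log(overridden); // windowDays is 30; minExecutions and concurrency are kept; no `enabled`
console.log(inherited); // windowDays stays at the stored 90
```

Because the spread of `storedConfig` comes first, later keys win: the resolved `dialect` and any current-run `windowDays` always take precedence over stored values.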
- [ ] **Step 4: Pass the preserved pull config into the historic-SQL adapter**
In `packages/cli/src/public-ingest.ts`, replace the
`historicSqlPullConfigOverride` construction in `executePublicIngestTarget()`
with:
```ts
historicSqlPullConfigOverride:
  target.queryHistory.pullConfig ?? {
    dialect: target.queryHistory.dialect,
    ...(target.queryHistory.windowDays !== undefined ? { windowDays: target.queryHistory.windowDays } : {}),
  },
```
The surrounding `ingestArgs` object must still include:
```ts
adapter: 'historic-sql',
outputMode: sourceIngestOutputMode(args, io),
inputMode: args.inputMode,
allowImplicitAdapter: true,
```
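The `??` fallback in the override above means a plan target without `pullConfig` still produces the earlier minimal shape. A minimal sketch, assuming a trimmed-down `QueryHistoryTarget` interface and a `pullConfigOverride()` wrapper that are both hypothetical and exist only for this illustration:

```ts
// Hypothetical, reduced view of the plan target's queryHistory field.
interface QueryHistoryTarget {
  dialect?: string;
  windowDays?: number;
  pullConfig?: Record<string, unknown>;
}

// Hypothetical wrapper around the expression used in executePublicIngestTarget().
function pullConfigOverride(queryHistory: QueryHistoryTarget): Record<string, unknown> {
  // Prefer the full preserved config; otherwise fall back to dialect plus optional window.
  return (
    queryHistory.pullConfig ?? {
      dialect: queryHistory.dialect,
      ...(queryHistory.windowDays !== undefined ? { windowDays: queryHistory.windowDays } : {}),
    }
  );
}

const withPullConfig = pullConfigOverride({
  dialect: 'postgres',
  windowDays: 30,
  pullConfig: { dialect: 'postgres', windowDays: 30, minExecutions: 7 },
});
const withoutPullConfig = pullConfigOverride({ dialect: 'postgres' });

console.log(withPullConfig); // the preserved config, including minExecutions
console.log(withoutPullConfig); // only the dialect
```

Note that `??` only falls back on `null` or `undefined`, so an empty `pullConfig` object would be passed through as-is.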
- [ ] **Step 5: Run the public-ingest tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts --testTimeout 30000
```
Expected: PASS. The new regression test proves public ingest preserves stored
query-history fields while `--query-history-window-days 30` overrides only
`windowDays` for the current run.
- [ ] **Step 6: Commit**
Run:
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts
git commit -m "fix(cli): preserve query-history pull config in public ingest"
```
### Task 2: Hide debug commands from the docs command tree
**Files:**
- Modify: `packages/cli/src/command-tree.ts`
- Test: `packages/cli/src/command-tree.test.ts`
- Test: `packages/cli/src/print-command-tree.test.ts`
- [ ] **Step 1: Write the failing hidden-command walker test**
In `packages/cli/src/command-tree.test.ts`, add this test inside the
`walkCommandTree` describe block:
```ts
it('omits Commander hidden commands from the public tree', () => {
  const root = new Command('ktx');
  root.command('scan', { hidden: true }).description('Run a standalone connection scan');
  const ingest = root.command('ingest').description('Build or inspect KTX context');
  ingest.command('run', { hidden: true }).description('Run local ingest by adapter');
  ingest.command('watch', { hidden: true }).description('Open a stored visual report');
  ingest.command('status').description('Print status');
  root.command('status').description('Check readiness');

  const tree = walkCommandTree(root);

  expect(tree.children.map((child) => child.name)).toEqual(['ingest', 'status']);
  expect(tree.children[0]).toMatchObject({
    name: 'ingest',
    children: [{ name: 'status', description: 'Print status', aliases: [], arguments: [], children: [] }],
  });
});
```
- [ ] **Step 2: Write the failing rendered KTX tree assertions**
In `packages/cli/src/print-command-tree.test.ts`, add these assertions to the
first `renders an indented tree rooted at "ktx" with known top-level commands`
test after the existing `not.toContain()` assertions:
```ts
expect(output).not.toContain('scan <connectionId>');
expect(output).not.toContain('│ ├── run');
expect(output).not.toContain('│ ├── watch');
expect(output).not.toContain('│ └── watch');
```
- [ ] **Step 3: Run the failing command-tree tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/command-tree.test.ts src/print-command-tree.test.ts
```
Expected: FAIL. The walker includes hidden commands because it currently maps
over `command.commands` without filtering Commander `_hidden` entries.
- [ ] **Step 4: Filter hidden Commander commands in the walker**
In `packages/cli/src/command-tree.ts`, add this helper above
`walkCommandTree()`:
```ts
function isHiddenCommand(command: CommandUnknownOpts): boolean {
  return (command as CommandUnknownOpts & { _hidden?: boolean })._hidden === true;
}
```
Then replace the `children` field inside `walkCommandTree()` with:
```ts
children: command.commands.filter((child) => !isHiddenCommand(child)).map((child) => walkCommandTree(child)),
```
The complete function should read:
```ts
export function walkCommandTree(command: CommandUnknownOpts): CommandTreeNode {
  return {
    name: command.name(),
    description: command.description(),
    aliases: command.aliases(),
    arguments: command.registeredArguments.map(formatArgumentDeclaration),
    children: command.commands.filter((child) => !isHiddenCommand(child)).map((child) => walkCommandTree(child)),
  };
}
```
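The filter-then-map recursion can also be exercised without Commander. The sketch below swaps in a hypothetical `CommandLike` shape for Commander's command object and reads the same `_hidden` flag; it illustrates the traversal only and is not the project's walker:

```ts
// Hypothetical stand-in for a Commander command; only the fields the walk needs.
interface CommandLike {
  name: string;
  _hidden?: boolean;
  commands: CommandLike[];
}

interface TreeNode {
  name: string;
  children: TreeNode[];
}

function isHidden(command: CommandLike): boolean {
  return command._hidden === true;
}

function walk(command: CommandLike): TreeNode {
  return {
    name: command.name,
    // Drop hidden children before recursing so their subtrees never appear.
    children: command.commands.filter((child) => !isHidden(child)).map((child) => walk(child)),
  };
}

const root: CommandLike = {
  name: 'ktx',
  commands: [
    { name: 'scan', _hidden: true, commands: [] },
    {
      name: 'ingest',
      commands: [
        { name: 'run', _hidden: true, commands: [] },
        { name: 'watch', _hidden: true, commands: [] },
        { name: 'status', commands: [] },
      ],
    },
    { name: 'status', commands: [] },
  ],
};

const tree = walk(root);
console.log(tree.children.map((child) => child.name)); // [ 'ingest', 'status' ]
```

Only descendants are filtered; the node passed to the walker is always included, which matches how the root `ktx` command is rendered.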
- [ ] **Step 5: Run the command-tree tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/command-tree.test.ts src/print-command-tree.test.ts
```
Expected: PASS. The pure walker omits hidden commands and the rendered KTX tree
no longer contains `scan <connectionId>`, `ingest run`, or `ingest watch`.
- [ ] **Step 6: Verify the docs command output directly**
Run:
```bash
pnpm --filter @ktx/cli run docs:commands > /tmp/ktx-command-tree.txt
rg -n "scan <connectionId>|run[[:space:]]+Run local ingest|watch \\[runId\\]" /tmp/ktx-command-tree.txt
```
Expected: the first command succeeds and writes the command tree. The `rg`
command exits with status `1` and prints no matches.
- [ ] **Step 7: Commit**
Run:
```bash
git add packages/cli/src/command-tree.ts packages/cli/src/command-tree.test.ts packages/cli/src/print-command-tree.test.ts
git commit -m "fix(cli): omit hidden commands from docs command tree"
```
### Task 3: Final verification
**Files:**
- Verify: `packages/cli/src/public-ingest.ts`
- Verify: `packages/cli/src/command-tree.ts`
- Verify: `packages/cli/src/public-ingest.test.ts`
- Verify: `packages/cli/src/command-tree.test.ts`
- Verify: `packages/cli/src/print-command-tree.test.ts`
- [ ] **Step 1: Run focused CLI regression tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts src/local-adapters.test.ts src/index.test.ts src/command-tree.test.ts src/print-command-tree.test.ts --testTimeout 30000
```
Expected: PASS. This covers public ingest execution, adapter config mapping,
normal help routing, and docs command-tree rendering.
- [ ] **Step 2: Run CLI type-check**
Run:
```bash
pnpm --filter @ktx/cli run type-check
```
Expected: PASS with no TypeScript errors.
- [ ] **Step 3: Run docs command-tree output check**
Run:
```bash
pnpm --filter @ktx/cli run docs:commands > /tmp/ktx-command-tree.txt
rg -n "scan <connectionId>|run[[:space:]]+Run local ingest|watch \\[runId\\]" /tmp/ktx-command-tree.txt
```
Expected: the `docs:commands` command succeeds. The `rg` command exits `1`
with no matches.
- [ ] **Step 4: Run TypeScript dead-code checks**
Run:
```bash
pnpm run dead-code
```
Expected: PASS. If Knip reports unrelated existing findings, inspect them and
record the exact findings in the implementation notes before deciding whether
they are related to this plan.
- [ ] **Step 5: Inspect the final diff**
Run:
```bash
git status --short
git diff -- packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/command-tree.ts packages/cli/src/command-tree.test.ts packages/cli/src/print-command-tree.test.ts
```
Expected: only the intended files are modified. The diff contains no generated
`dist/` output and no unrelated documentation changes.
- [ ] **Step 6: Commit verification-only fixes if needed**
If verification required expectation or type-only fixes, run:
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/command-tree.ts packages/cli/src/command-tree.test.ts packages/cli/src/print-command-tree.test.ts
git commit -m "test(cli): close unified ingest final public surface checks"
```
If no files changed during verification, do not create an empty commit.
## Self-review
- Spec coverage: This plan covers the remaining v1-blocking public query-history
config mapping and public command discovery output. It intentionally leaves
hidden debug command callability and internal scan/live-database/historic-sql
names as non-blocking because the original spec allows internal/debug names
in v1.
- Placeholder scan: No task uses deferred placeholders or unnamed edge-handling
steps. Each code step names the exact file, insertion point, and code shape.
- Type consistency: New `pullConfig` data stays under
`KtxPublicIngestPlanTarget.queryHistory` and flows unchanged into the
existing `KtxIngestArgs.historicSqlPullConfigOverride` field. Command-tree
filtering uses Commander `_hidden`, the same field Commander help uses.


@ -0,0 +1,802 @@
# Unified Ingest V1 Final UX Labels Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Close the remaining v1-blocking public UX gaps in unified ingest warning aggregation and setup/status terminology.
**Architecture:** Keep the implemented connection-centric ingest planner, hidden debug commands, and internal scan/live-database/historic-sql boundaries. Add one warning accumulator lane for unsupported database query-history targets, then update normal setup/status/docs copy so public database groups are called `Databases` rather than `Primary sources`.
**Tech Stack:** TypeScript ESM, Commander, Vitest, Node test runner, KTX CLI/context packages.
---
## Current Audit
Implemented unified-ingest plans already cover the original spec's main v1 behavior:
- `ktx ingest [connectionId]`, `ktx ingest --all`, `--fast`, `--deep`, `--query-history`, `--no-query-history`, and `--query-history-window-days` route through `packages/cli/src/public-ingest.ts`.
- Database targets are ordered before source targets, public source ingest bypasses `ingest.adapters`, and database depth maps to structural/enriched scan internals.
- Deep readiness is evaluated before target work starts, and `--all` isolates per-target failures.
- Setup stores `connections.<id>.context.depth` and `connections.<id>.context.queryHistory`, migrates legacy `historicSql`, and uses foreground-only context-build state.
- Normal help hides `ktx scan`, `ktx ingest run`, and live `ktx ingest watch`; docs no longer present those as normal public workflows.
- Foreground progress uses `Databases` and `Context sources`, and normal progress/failure output sanitizes scan/live-database/historic-sql internals.
### V1-Blocking Gaps
- `ktx ingest --all --query-history` does not aggregate unsupported database query-history warnings. Source depth/query-history warnings are aggregated, but unsupported database drivers currently add one warning per target from `resolveDatabaseTargetOptions()`. That contradicts the original spec's `--all` warning aggregation rule for non-applicable query-history flags.
- Normal setup/status surfaces still use the old `Primary sources` public label for databases:
- `packages/cli/src/setup.ts` prints `Primary sources configured`.
- `packages/cli/src/setup-context.ts` prints a `Primary sources:` success group.
- `packages/cli/src/setup-ready-menu.ts` labels the database section `Primary sources`.
- `packages/cli/src/setup-databases.ts` uses `primary source` in normal interactive prompts, skip/failure messages, and success headings.
- `README.md`, `docs-site/content/docs/getting-started/quickstart.mdx`, and `docs-site/content/docs/cli-reference/ktx-setup.mdx` still mirror the old label.
### Non-Blocking Gaps
- Hidden debug commands can remain callable: `ktx scan`, `ktx ingest run`, and `ktx ingest watch`.
- Internal adapter keys, raw artifact paths, WorkUnit keys, package names, tests, and developer-only scripts can continue to use `scan`, `live-database`, and `historic-sql`.
- Public conceptual docs may still use `scan` as a generic noun where they are describing internal database metadata artifacts rather than documenting `ktx scan` as the public command.
- Internal readiness config names such as `scan.enrichment.mode` can remain because they are current `ktx.yaml` field names.
## File Structure
- Modify `packages/cli/src/public-ingest.ts`: aggregate unsupported database query-history warnings for `--all`.
- Modify `packages/cli/src/public-ingest.test.ts`: add regression tests for explicit and stored unsupported query-history aggregation.
- Modify `packages/cli/src/setup-ready-menu.ts`: change the ready-project database menu label to `Databases`.
- Modify `packages/cli/src/setup-ready-menu.test.ts`: update the ready-menu expected label.
- Modify `packages/cli/src/setup.ts`: change setup status output from `Primary sources configured` to `Databases configured`.
- Modify `packages/cli/src/setup.test.ts`: update status and empty-selection expectations.
- Modify `packages/cli/src/setup-context.ts`: change setup context success grouping from `Primary sources` to `Databases`.
- Modify `packages/cli/src/setup-context.test.ts`: assert the success output uses `Databases`.
- Modify `packages/cli/src/setup-databases.ts`: change normal database setup copy from `primary source(s)` / `knowledge sources` to `database(s)` / `context sources`.
- Modify `packages/cli/src/setup-databases.test.ts`: update expected prompt/output strings.
- Modify `README.md`: update the setup status example label.
- Modify `docs-site/content/docs/getting-started/quickstart.mdx`: update setup success/status examples.
- Modify `docs-site/content/docs/cli-reference/ktx-setup.mdx`: update setup status example.
- Modify `scripts/examples-docs.test.mjs`: add docs regression assertions that reject the old `Primary sources` label.
## Tasks
### Task 1: Aggregate Unsupported Query-History Warnings
**Files:**
- Modify: `packages/cli/src/public-ingest.ts`
- Test: `packages/cli/src/public-ingest.test.ts`
- [ ] **Step 1: Add failing unsupported warning aggregation tests**
In `packages/cli/src/public-ingest.test.ts`, add these tests after the existing test named `warns and skips query history for unsupported database drivers`:
```ts
it('aggregates unsupported query-history warnings for all database targets', () => {
  const plan = buildPublicIngestPlan(
    deepReadyProject({
      local: { driver: 'sqlite' },
      mysql_warehouse: { driver: 'mysql' },
      warehouse: { driver: 'postgres', context: { depth: 'deep' } },
    }),
    {
      projectDir: '/tmp/project',
      all: true,
      depth: 'deep',
      queryHistory: 'enabled',
    },
  );

  expect(plan.targets).toEqual([
    expect.objectContaining({
      connectionId: 'local',
      queryHistory: { enabled: false, unsupported: true },
      steps: ['database-schema'],
    }),
    expect.objectContaining({
      connectionId: 'mysql_warehouse',
      queryHistory: { enabled: false, unsupported: true },
      steps: ['database-schema'],
    }),
    expect.objectContaining({
      connectionId: 'warehouse',
      queryHistory: expect.objectContaining({ enabled: true, dialect: 'postgres' }),
      steps: ['database-schema', 'query-history'],
    }),
  ]);
  expect(plan.warnings).toEqual([
    '--query-history is not supported for 2 database connections (mysql, sqlite); running schema ingest for those connections.',
  ]);
});

it('aggregates stored unsupported query-history config warnings for all database targets', () => {
  const plan = buildPublicIngestPlan(
    projectWithConnections({
      local: { driver: 'sqlite', context: { queryHistory: { enabled: true } } },
      mysql_warehouse: { driver: 'mysql', context: { queryHistory: { enabled: true } } },
    }),
    {
      projectDir: '/tmp/project',
      all: true,
      queryHistory: 'default',
    },
  );

  expect(plan.targets).toEqual([
    expect.objectContaining({
      connectionId: 'local',
      queryHistory: { enabled: false, unsupported: true },
      steps: ['database-schema'],
    }),
    expect.objectContaining({
      connectionId: 'mysql_warehouse',
      queryHistory: { enabled: false, unsupported: true },
      steps: ['database-schema'],
    }),
  ]);
  expect(plan.warnings).toEqual([
    '2 database connections have query history enabled in ktx.yaml, but their drivers do not support it; running schema ingest for those connections.',
  ]);
});
```
- [ ] **Step 2: Run the failing public ingest tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts -t "unsupported query-history"
```
Expected: FAIL because the new `--all` cases currently receive one warning per unsupported database target.
- [ ] **Step 3: Add unsupported query-history warning accumulator state**
In `packages/cli/src/public-ingest.ts`, replace the current warning accumulator interface and factory with:
```ts
interface KtxUnsupportedQueryHistoryWarning {
  connectionId: string;
  driver: string;
  reason: 'explicit' | 'stored';
}

interface KtxPublicIngestWarningAccumulator {
  warnings: string[];
  ignoredDepthForSources: string[];
  ignoredQueryHistoryForSources: string[];
  unsupportedQueryHistoryForDatabases: KtxUnsupportedQueryHistoryWarning[];
}

function createWarningAccumulator(): KtxPublicIngestWarningAccumulator {
  return {
    warnings: [],
    ignoredDepthForSources: [],
    ignoredQueryHistoryForSources: [],
    unsupportedQueryHistoryForDatabases: [],
  };
}
```
- [ ] **Step 4: Add unsupported database warning formatting**
In `packages/cli/src/public-ingest.ts`, add these helpers after `sourceIgnoredWarning()`:
```ts
function unsupportedDriverList(entries: KtxUnsupportedQueryHistoryWarning[]): string {
  return [...new Set(entries.map((entry) => entry.driver))].sort((left, right) => left.localeCompare(right)).join(', ');
}

function unsupportedQueryHistoryWarnings(
  entries: KtxUnsupportedQueryHistoryWarning[],
  all: boolean,
): string[] {
  if (entries.length === 0) {
    return [];
  }
  const warnings: string[] = [];
  const explicitEntries = entries.filter((entry) => entry.reason === 'explicit');
  const storedEntries = entries.filter((entry) => entry.reason === 'stored');

  if (explicitEntries.length === 1 || (!all && explicitEntries.length > 0)) {
    warnings.push(
      ...explicitEntries.map(
        (entry) =>
          `--query-history is not supported for ${entry.driver}; running schema ingest for ${entry.connectionId}.`,
      ),
    );
  } else if (explicitEntries.length > 1) {
    warnings.push(
      `--query-history is not supported for ${explicitEntries.length} database connections (${unsupportedDriverList(
        explicitEntries,
      )}); running schema ingest for those connections.`,
    );
  }

  if (storedEntries.length === 1 || (!all && storedEntries.length > 0)) {
    warnings.push(
      ...storedEntries.map(
        (entry) =>
          `${entry.connectionId} has query history enabled in ktx.yaml, but ${entry.driver} does not support it; running schema ingest.`,
      ),
    );
  } else if (storedEntries.length > 1) {
    warnings.push(
      `${storedEntries.length} database connections have query history enabled in ktx.yaml, but their drivers do not support it; running schema ingest for those connections.`,
    );
  }
  return warnings;
}
```
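The decision rule in the helper above is: emit one message per target when there is exactly one entry or the run is not `--all`, and collapse to a single message otherwise. The sketch below is a condensed, hypothetical `explicitWarnings()` covering only the explicit-flag branch, with invented sample connections, to show the rule and the driver de-duplication:

```ts
interface UnsupportedEntry {
  connectionId: string;
  driver: string;
}

// Hypothetical condensed form of the explicit-flag branch; not the project's helper.
function explicitWarnings(entries: UnsupportedEntry[], all: boolean): string[] {
  if (entries.length === 0) {
    return [];
  }
  if (entries.length === 1 || !all) {
    // Single target (or a non --all run): keep the per-connection wording.
    return entries.map(
      (entry) => `--query-history is not supported for ${entry.driver}; running schema ingest for ${entry.connectionId}.`,
    );
  }
  // Several targets under --all: list each distinct driver once, sorted.
  const drivers = [...new Set(entries.map((entry) => entry.driver))]
    .sort((left, right) => left.localeCompare(right))
    .join(', ');
  return [
    `--query-history is not supported for ${entries.length} database connections (${drivers}); running schema ingest for those connections.`,
  ];
}

// Invented sample data: two MySQL connections and one SQLite connection.
const entries: UnsupportedEntry[] = [
  { connectionId: 'local', driver: 'sqlite' },
  { connectionId: 'mysql_warehouse', driver: 'mysql' },
  { connectionId: 'mysql_replica', driver: 'mysql' },
];

const aggregated = explicitWarnings(entries, true);
const single = explicitWarnings(entries.slice(0, 1), true);

console.log(aggregated); // one message naming 3 connections and "(mysql, sqlite)"
console.log(single); // one per-connection message for `local`
```

The count in the aggregate message is the number of connections, while the parenthesized list contains distinct drivers, so the two numbers can differ.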
- [ ] **Step 5: Use the accumulator in `finalizeWarnings()`**
In `packages/cli/src/public-ingest.ts`, replace the start of `finalizeWarnings()` with:
```ts
const warnings = [
...accumulator.warnings,
...unsupportedQueryHistoryWarnings(accumulator.unsupportedQueryHistoryForDatabases, args.all),
];
```
Keep the existing source depth/query-history aggregation logic below that new `warnings` initialization.
- [ ] **Step 6: Record unsupported database targets instead of pushing immediate warnings**
In `packages/cli/src/public-ingest.ts`, change the `resolveDatabaseTargetOptions()` input type so `warnings` is the full accumulator:
```ts
warnings: KtxPublicIngestWarningAccumulator;
```
Inside the unsupported query-history branch, replace the current `input.warnings.push(...)` block with:
```ts
input.warnings.unsupportedQueryHistoryForDatabases.push({
  connectionId: input.connectionId,
  driver: input.driver,
  reason: explicitQueryHistory === 'enabled' || input.args.queryHistoryWindowDays !== undefined ? 'explicit' : 'stored',
});
```
In the supported query-history branch, replace:
```ts
input.warnings.push(`--query-history requires deep ingest; running ${input.connectionId} with --deep.`);
```
with:
```ts
input.warnings.warnings.push(`--query-history requires deep ingest; running ${input.connectionId} with --deep.`);
```
In the stored query-history skipped-by-fast branch, replace:
```ts
input.warnings.push(
  `${input.connectionId} has query history enabled in ktx.yaml, but --fast skips query-history processing.`,
);
```
with:
```ts
input.warnings.warnings.push(
  `${input.connectionId} has query history enabled in ktx.yaml, but --fast skips query-history processing.`,
);
```
In `targetForConnection()`, replace the database resolver call with:
```ts
const options = resolveDatabaseTargetOptions({ connectionId, driver, connection, args, warnings });
```
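The `reason` expression recorded in the unsupported branch of this step relies on `||` binding more tightly than `?:`, so it reads as `(a || b) ? 'explicit' : 'stored'`. A small sketch, assuming a hypothetical `classifyReason()` helper and an assumed three-state flag type (the `'disabled'` member is a guess based on `--no-query-history`):

```ts
// Assumed flag states; only 'enabled' and 'default' appear in the plan's tests.
type QueryHistoryFlag = 'enabled' | 'disabled' | 'default';

// Hypothetical helper that isolates the classification used for the accumulator entry.
function classifyReason(explicitQueryHistory: QueryHistoryFlag, queryHistoryWindowDays?: number): 'explicit' | 'stored' {
  // Parsed as (flag === 'enabled' || window !== undefined) ? 'explicit' : 'stored'.
  return explicitQueryHistory === 'enabled' || queryHistoryWindowDays !== undefined ? 'explicit' : 'stored';
}

console.log(classifyReason('enabled')); // explicit
console.log(classifyReason('default', 30)); // explicit
console.log(classifyReason('default')); // stored
```

Passing a window on the command line counts as an explicit request even without `--query-history`, which is why it lands in the explicit warning group.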
- [ ] **Step 7: Verify unsupported warning aggregation passes**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts -t "unsupported query-history"
```
Expected: PASS. The single-target warning tests keep the old exact messages, while `--all` unsupported database targets receive one aggregate warning per reason.
- [ ] **Step 8: Commit unsupported warning aggregation**
Run:
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts
git commit -m "fix(cli): aggregate unsupported query-history warnings"
```
### Task 2: Rename Public Setup Database Labels
**Files:**
- Modify: `packages/cli/src/setup-ready-menu.ts`
- Modify: `packages/cli/src/setup.ts`
- Modify: `packages/cli/src/setup-context.ts`
- Modify: `packages/cli/src/setup-databases.ts`
- Test: `packages/cli/src/setup-ready-menu.test.ts`
- Test: `packages/cli/src/setup.test.ts`
- Test: `packages/cli/src/setup-context.test.ts`
- Test: `packages/cli/src/setup-databases.test.ts`
- Modify: `README.md`
- Modify: `docs-site/content/docs/getting-started/quickstart.mdx`
- Modify: `docs-site/content/docs/cli-reference/ktx-setup.mdx`
- Test: `scripts/examples-docs.test.mjs`
- [ ] **Step 1: Write failing CLI copy expectations**
In `packages/cli/src/setup-ready-menu.test.ts`, change the expected database option to:
```ts
{ value: 'databases', label: 'Databases' },
```
In `packages/cli/src/setup-context.test.ts`, add these assertions after each `expect(io.stdout()).toContain('KTX context is ready for agents.');` assertion in the successful build and existing-context tests:
```ts
expect(io.stdout()).toContain('Databases:');
expect(io.stdout()).not.toContain('Primary sources:');
```
In `packages/cli/src/setup.test.ts`, change the empty database selection expectation to:
```ts
expect(testIo.stdout()).toContain(
'KTX cannot work without at least one database. Select a database or press Escape to go back.',
);
expect(testIo.stderr()).not.toContain('No databases selected.');
```
In `packages/cli/src/setup.test.ts`, in the existing-project status test, add:
```ts
expect(rendered).toContain('Databases configured: no');
expect(rendered).not.toContain('Primary sources configured');
```
- [ ] **Step 2: Write failing setup database prompt expectations**
In `packages/cli/src/setup-databases.test.ts`, update the old public copy expectations to the new database labels:
```ts
expect(prompts.multiselect).toHaveBeenCalledWith(
expect.objectContaining({
message: expect.stringContaining('Which databases should KTX connect to?'),
}),
);
```
For configured database menu expectations, use:
```ts
expect(prompts.select).toHaveBeenCalledWith({
message: 'Databases already configured: warehouse\nWhat would you like to do?',
options: [
{ value: 'continue', label: 'Continue to context sources' },
{ value: 'add', label: 'Add another database' },
],
});
```
For the `postgres-warehouse` configured menu expectations, use:
```ts
expect(prompts.select).toHaveBeenCalledWith({
message: 'Databases already configured: postgres-warehouse\nWhat would you like to do?',
options: [
{ value: 'continue', label: 'Continue to context sources' },
{ value: 'add', label: 'Add another database' },
],
});
```
For empty-selection output expectations, use:
```ts
expect(io.stdout()).not.toContain('KTX cannot work without at least one database');
```
For successful initial scan/setup output, use:
```ts
expect(io.stdout()).toContain('◇ Database ready');
expect(io.stdout()).not.toContain('Primary source ready');
```
Rename test descriptions that contain `primary source` or `primary sources` so they use `database` or `databases`. For example:
```ts
it('shows every supported database in the interactive checklist', async () => {
```
```ts
it('shows a configured database menu instead of the type checklist when a database exists', async () => {
```
```ts
it('lets users add another database after completing the first one', async () => {
```
- [ ] **Step 3: Run failing setup label tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-ready-menu.test.ts src/setup.test.ts src/setup-context.test.ts src/setup-databases.test.ts -t "ready menu|readiness checklist|context is ready|database|primary source|configured"
```
Expected: FAIL because production copy still uses `Primary sources` and `primary source`.
- [ ] **Step 4: Update the ready menu and status labels**
In `packages/cli/src/setup-ready-menu.ts`, change:
```ts
{ value: 'databases', label: 'Primary sources' },
```
to:
```ts
{ value: 'databases', label: 'Databases' },
```
In `packages/cli/src/setup.ts`, change:
```ts
`Primary sources configured: ${formatConnectionList(status.databases.map((database) => database.connectionId))}`,
```
to:
```ts
`Databases configured: ${formatConnectionList(status.databases.map((database) => database.connectionId))}`,
```
In `packages/cli/src/setup-context.ts`, change:
```ts
io.stdout.write('Primary sources:\n');
```
to:
```ts
io.stdout.write('Databases:\n');
```
- [ ] **Step 5: Update setup database prompt and output copy**
In `packages/cli/src/setup-databases.ts`, change:
```ts
const backDestination = canReturnToDriverSelection ? 'primary source selection' : 'the previous setup step';
```
to:
```ts
const backDestination = canReturnToDriverSelection ? 'database selection' : 'the previous setup step';
```
Replace the entire `configuredPrimarySourcesPrompt()` return value with:
```ts
return {
message: `Databases already configured: ${connectionIds.join(', ')}\nWhat would you like to do?`,
options: [
{ value: 'continue', label: 'Continue to context sources' },
{ value: 'add', label: 'Add another database' },
],
};
```
Change the successful database setup heading from:
```ts
writeSetupSection(input.io, 'Primary source ready', [
```
to:
```ts
writeSetupSection(input.io, 'Database ready', [
```
Change the non-interactive no-database error from:
```ts
'KTX cannot work without a primary source. Pass --database or --database-connection-id, or pass --skip-databases to leave setup incomplete.\n',
```
to:
```ts
'KTX cannot work without a database. Pass --database or --database-connection-id, or pass --skip-databases to leave setup incomplete.\n',
```
Change the driver multiselect message from:
```ts
message: withMultiselectNavigation('Which primary sources should KTX connect to?'),
```
to:
```ts
message: withMultiselectNavigation('Which databases should KTX connect to?'),
```
Change the empty-selection warning from:
```ts
io.stdout.write('│ KTX cannot work without at least one primary source. Select a source or press Escape to go back.\n');
```
to:
```ts
io.stdout.write('│ KTX cannot work without at least one database. Select a database or press Escape to go back.\n');
```
Change the skip output from:
```ts
io.stdout.write('│ Primary source setup skipped. KTX cannot work until you add a primary source.\n');
```
to:
```ts
io.stdout.write('│ Database setup skipped. KTX cannot work until you add a database.\n');
```
Change the no-completed-database output from:
```ts
io.stdout.write('│ KTX cannot work without a primary source.\n');
```
to:
```ts
io.stdout.write('│ KTX cannot work without a database.\n');
```
Change the retry prompt message and skip label from:
```ts
message: `Primary source setup failed for ${connectionChoice.connectionId}`,
```
```ts
{ value: 'skip', label: 'Skip this primary source' },
```
to:
```ts
message: `Database setup failed for ${connectionChoice.connectionId}`,
```
```ts
{ value: 'skip', label: 'Skip this database' },
```
Change the final failure line from:
```ts
io.stderr.write('No primary source connections completed setup.\n');
```
to:
```ts
io.stderr.write('No database connections completed setup.\n');
```
- [ ] **Step 6: Update public docs examples**
In `README.md`, replace:
```text
Primary sources configured: yes (postgres-warehouse)
```
with:
```text
Databases configured: yes (postgres-warehouse)
```
In `docs-site/content/docs/getting-started/quickstart.mdx`, replace the database-ready heading line:
```text
Primary source ready
postgres-warehouse - PostgreSQL - schema context complete
```
with:
```text
Database ready
postgres-warehouse - PostgreSQL - schema context complete
```
In `docs-site/content/docs/getting-started/quickstart.mdx`, replace the setup success group:
```text
Primary sources:
postgres-warehouse: deep context complete
```
with:
```text
Databases:
postgres-warehouse: deep context complete
```
In `docs-site/content/docs/getting-started/quickstart.mdx`, replace:
```text
Primary sources configured: yes (postgres-warehouse)
```
with:
```text
Databases configured: yes (postgres-warehouse)
```
In `docs-site/content/docs/cli-reference/ktx-setup.mdx`, replace:
```text
Primary sources configured: yes (postgres-warehouse)
```
with:
```text
Databases configured: yes (postgres-warehouse)
```
- [ ] **Step 7: Add public docs regression assertions**
In `scripts/examples-docs.test.mjs`, inside the test named `documents unified public ingest workflows in the docs site`, add:
```js
const setupReference = await readText('docs-site/content/docs/cli-reference/ktx-setup.mdx');
```
Then add these assertions near the existing `quickstart` and `rootReadme` assertions:
```js
assert.match(rootReadme, /Databases configured: yes \(postgres-warehouse\)/);
assert.match(quickstart, /Databases:\n postgres-warehouse: deep context complete/);
assert.match(quickstart, /Databases configured: yes \(postgres-warehouse\)/);
assert.match(setupReference, /Databases configured: yes \(postgres-warehouse\)/);
assert.doesNotMatch(rootReadme, /Primary sources configured/);
assert.doesNotMatch(quickstart, /Primary sources/);
assert.doesNotMatch(setupReference, /Primary sources configured/);
```
- [ ] **Step 8: Verify setup label tests pass**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-ready-menu.test.ts src/setup.test.ts src/setup-context.test.ts src/setup-databases.test.ts
```
Expected: PASS.
- [ ] **Step 9: Verify docs examples pass**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: PASS.
- [ ] **Step 10: Scan for stale public labels**
Run:
```bash
rg -n "Primary sources?:|Primary sources? configured|Primary source ready|knowledge sources" packages/cli/src README.md docs-site/content/docs scripts/examples-docs.test.mjs
```
Expected: no matches in public CLI source, README/docs examples, or the docs regression test.
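If a programmatic version of this scan is ever wanted, the same alternation can be expressed as a JavaScript regular expression. The sketch below is a hypothetical helper (`findStaleLabels` is not part of the plan or the repository) that mirrors the `rg` pattern against an in-memory string.

```typescript
// Same alternation as the rg pattern above, compiled as a global JavaScript RegExp.
const STALE_LABEL_PATTERN =
  /Primary sources?:|Primary sources? configured|Primary source ready|knowledge sources/g;

// Return every stale public label found in the given text (empty array when clean).
function findStaleLabels(text: string): string[] {
  return text.match(STALE_LABEL_PATTERN) ?? [];
}
```

An empty result corresponds to the "no matches" expectation of the shell scan.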
- [ ] **Step 11: Commit public setup labels**
Run:
```bash
git add packages/cli/src/setup-ready-menu.ts packages/cli/src/setup-ready-menu.test.ts packages/cli/src/setup.ts packages/cli/src/setup.test.ts packages/cli/src/setup-context.ts packages/cli/src/setup-context.test.ts packages/cli/src/setup-databases.ts packages/cli/src/setup-databases.test.ts README.md docs-site/content/docs/getting-started/quickstart.mdx docs-site/content/docs/cli-reference/ktx-setup.mdx scripts/examples-docs.test.mjs
git commit -m "fix(cli): align setup database labels"
```
### Task 3: Final V1 Verification
**Files:**
- Verify: `packages/cli/src/public-ingest.ts`
- Verify: `packages/cli/src/setup-ready-menu.ts`
- Verify: `packages/cli/src/setup.ts`
- Verify: `packages/cli/src/setup-context.ts`
- Verify: `packages/cli/src/setup-databases.ts`
- Verify: `README.md`
- Verify: `docs-site/content/docs/getting-started/quickstart.mdx`
- Verify: `docs-site/content/docs/cli-reference/ktx-setup.mdx`
- [ ] **Step 1: Run focused CLI tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts src/context-build-view.test.ts src/setup-ready-menu.test.ts src/setup.test.ts src/setup-context.test.ts src/setup-databases.test.ts src/index.test.ts src/command-tree.test.ts
```
Expected: PASS.
- [ ] **Step 2: Run docs regression tests**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: PASS.
- [ ] **Step 3: Run public unified-ingest stale-copy scans**
Run:
```bash
rg -n "Primary sources?:|Primary sources? configured|Primary source ready|knowledge sources" packages/cli/src README.md docs-site/content/docs scripts/examples-docs.test.mjs
```
Expected: no matches.
Run:
```bash
rg -n "ktx scan|ktx ingest run --connection-id|--adapter <adapter>|ktx ingest watch|live-database|Historic SQL|historicSql" README.md docs-site/content/docs examples/README.md examples/local-warehouse/README.md
```
Expected: no matches. Matches in developer scripts, internal package names, tests, or artifact paths outside this public-docs command are non-blocking under the original spec.
- [ ] **Step 4: Run package pre-commit on changed files**
Run:
```bash
uv run pre-commit run --files packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/setup-ready-menu.ts packages/cli/src/setup-ready-menu.test.ts packages/cli/src/setup.ts packages/cli/src/setup.test.ts packages/cli/src/setup-context.ts packages/cli/src/setup-context.test.ts packages/cli/src/setup-databases.ts packages/cli/src/setup-databases.test.ts README.md docs-site/content/docs/getting-started/quickstart.mdx docs-site/content/docs/cli-reference/ktx-setup.mdx scripts/examples-docs.test.mjs
```
Expected: PASS. If pre-commit cannot run because `uv` or the hook environment is missing locally, record the exact failure and run the focused Vitest and Node tests from Steps 1 and 2 instead.
- [ ] **Step 5: Commit final verification if needed**
If Step 4 made formatting changes, run:
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/setup-ready-menu.ts packages/cli/src/setup-ready-menu.test.ts packages/cli/src/setup.ts packages/cli/src/setup.test.ts packages/cli/src/setup-context.ts packages/cli/src/setup-context.test.ts packages/cli/src/setup-databases.ts packages/cli/src/setup-databases.test.ts README.md docs-site/content/docs/getting-started/quickstart.mdx docs-site/content/docs/cli-reference/ktx-setup.mdx scripts/examples-docs.test.mjs
git commit -m "test: verify unified ingest final ux labels"
```
If Step 4 made no changes, do not create an empty commit.
## Self-Review
- Spec coverage: This plan covers the remaining v1-blocking public gaps found in the audit: unsupported database query-history warning aggregation for `--all`, and old public `Primary sources` terminology in setup/status/docs where the spec's user-facing grouping is `Databases`. Core routing, depth, query-history execution, setup config, foreground-only state, hidden debug commands, public docs command shape, and output sanitization are already implemented by the prior plan chain.
- Placeholder scan: The plan contains exact files, exact tests, exact code snippets, exact commands, and expected outcomes.
- Type consistency: The new accumulator type is `KtxUnsupportedQueryHistoryWarning`; `resolveDatabaseTargetOptions()` receives `KtxPublicIngestWarningAccumulator`; warning strings used in tests match the implementation snippets exactly.

@@ -0,0 +1,932 @@
# Unified Ingest V1 Foreground and Retry Closure Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Close the remaining v1-blocking public UX gaps in the unified
`ktx ingest` redesign.
**Architecture:** Keep the implemented connection-centric ingest planner and
shared foreground context-build view. Add a small public messaging layer for
notices, warnings, and retry guidance so TTY, non-TTY, and setup next-step
surfaces all match the original spec without changing internal adapter names.
**Tech Stack:** TypeScript ESM, Commander, Vitest, KTX CLI/context packages,
Markdown plan documentation.
---
## Current audit
The implemented unified-ingest plans cover the main v1 behavior:
- `ktx ingest [connectionId]`, `ktx ingest --all`, `--fast`, `--deep`,
`--query-history`, `--no-query-history`, and
`--query-history-window-days` route through the public ingest planner.
- Database targets run before source targets. Public source ingest bypasses
`ingest.adapters`. Fast and deep map to structural and enriched database
ingest, and deep readiness failures are isolated per target under `--all`.
- `ktx scan`, `ktx ingest run`, and `ktx ingest watch` are hidden from normal
help. Setup stores `connections.<id>.context.depth` and
`connections.<id>.context.queryHistory`.
- Setup context builds are foreground-only, legacy context-build states are
normalized to stale, and public docs no longer advertise `ktx scan` or
adapter-backed `ktx ingest run` as normal workflows.
### V1-blocking gaps
- Interactive foreground `ktx ingest` and setup context builds compute public
warnings but never render them. A TTY user can pass `--deep` for source
connections, `--query-history` for unsupported targets, or `--fast` with
stored query history and receive no warning in the foreground view.
- Explicit query-history runs do not state that database schema ingest runs
before query-history processing. The spec requires that message when a user
explicitly passes `--query-history`.
- Plain non-TTY failures report generic step failures such as
`warehouse failed at database-schema.` and a debug command, but they do not
include the retry guidance required by the error-handling section.
- Setup next-step output still describes the context-build action as
`Build or resume agent-ready context` through `ktx setup`, and it says the
build covers `primary-source scans and context-source ingests`. In the public
model, `setup` configures, `ingest` builds or refreshes context, and `status`
explains readiness.
- The guided demo foreground replay still shows `scanning tables...` and
`tables scanned`, even though the normal foreground view must use
`reading schema` or `building schema context`.
### Non-blocking gaps
- Hidden debug commands can continue to call `ktx scan`, `ktx ingest run`, and
`ktx ingest watch`.
- Internal adapter keys, raw artifact paths, WorkUnit keys, package names, and
JSON or debug output can continue to use `scan`, `live-database`, and
`historic-sql`.
- Developer docs can continue to mention scan internals when they describe
connector implementation details.
- Existing `autoWatch`, `detached`, and `paused` type remnants in setup code
are not user-facing because setup context state is normalized before display.
## File structure
- Modify `packages/cli/src/public-ingest.ts`: add public plan notices, print
schema-before-query-history notices, and add retry guidance to plain
non-TTY failure details.
- Modify `packages/cli/src/public-ingest.test.ts`: cover explicit
query-history notices and retry guidance in plain output.
- Modify `packages/cli/src/context-build-view.ts`: render foreground notices
and warnings from `buildPublicIngestPlan`.
- Modify `packages/cli/src/context-build-view.test.ts`: cover warning and
notice rendering in the foreground view.
- Modify `packages/cli/src/next-steps.ts`: make the public build command
`ktx ingest --all` and remove resume/scan wording from setup next steps.
- Modify `packages/cli/src/next-steps.test.ts`: update public next-step
expectations.
- Modify `packages/cli/src/setup-demo-tour.ts`: replace demo replay scan copy
with schema-context copy.
- Modify `packages/cli/src/setup-demo-tour.test.ts`: lock the demo replay
wording against `scan` terms.
## Tasks
### Task 1: Render foreground notices and warnings
**Files:**
- Modify: `packages/cli/src/context-build-view.ts`
- Test: `packages/cli/src/context-build-view.test.ts`
- [ ] **Step 1: Write failing foreground-message tests**
In `packages/cli/src/context-build-view.test.ts`, add these tests inside the
`renderContextBuildView` describe block, near the existing rendering tests:
```ts
it('renders public warnings in the foreground view', () => {
const state = initViewState([
{
connectionId: 'docs',
driver: 'notion',
operation: 'source-ingest',
adapter: 'notion',
debugCommand: 'ktx ingest docs --debug',
steps: ['source-ingest', 'memory-update'],
},
]);
const rendered = renderContextBuildView(state, {
styled: false,
warnings: ['--deep affects database ingest only; ignoring it for docs.'],
});
expect(rendered).toContain('Warnings:');
expect(rendered).toContain('--deep affects database ingest only; ignoring it for docs.');
});
it('renders public notices in the foreground view before warnings', () => {
const state = initViewState([
{
connectionId: 'warehouse',
driver: 'postgres',
operation: 'database-ingest',
debugCommand: 'ktx ingest warehouse --debug',
steps: ['database-schema', 'query-history'],
databaseDepth: 'deep',
detectRelationships: true,
queryHistory: { enabled: true, dialect: 'postgres' },
},
]);
const rendered = renderContextBuildView(state, {
styled: false,
notices: ['Schema ingest runs before query history for warehouse.'],
warnings: ['--query-history requires deep ingest; running warehouse with --deep.'],
});
expect(rendered.indexOf('Notices:')).toBeLessThan(rendered.indexOf('Warnings:'));
expect(rendered).toContain('Schema ingest runs before query history for warehouse.');
expect(rendered).toContain('--query-history requires deep ingest; running warehouse with --deep.');
});
```
- [ ] **Step 2: Run the failing foreground-message tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/context-build-view.test.ts -t "renders public warnings|renders public notices"
```
Expected: FAIL because `renderContextBuildView` does not accept or render
`warnings` or `notices`.
- [ ] **Step 3: Add render options for foreground messages**
In `packages/cli/src/context-build-view.ts`, add this helper after
`renderTargetGroup`:
```ts
function renderMessageGroup(label: string, messages: string[], styled: boolean): string[] {
if (messages.length === 0) return [];
const renderedMessages = messages.map((message) => ` - ${message}`);
return ['', ` ${label}:`, ...renderedMessages.map((line) => (styled ? dim(line) : line))];
}
```
Then change the `renderContextBuildView` signature from:
```ts
export function renderContextBuildView(
state: ContextBuildViewState,
options: { styled?: boolean; showHint?: boolean; hintText?: string; projectDir?: string } = {},
): string {
```
to:
```ts
export function renderContextBuildView(
state: ContextBuildViewState,
options: {
styled?: boolean;
showHint?: boolean;
hintText?: string;
projectDir?: string;
notices?: string[];
warnings?: string[];
} = {},
): string {
```
In the `lines` array inside `renderContextBuildView`, insert the notice and
warning groups after the `Context sources` group:
```ts
...renderTargetGroup('Databases', state.primarySources, state.frame, styled, width),
...renderTargetGroup('Context sources', state.contextSources, state.frame, styled, width),
...renderMessageGroup('Notices', options.notices ?? [], styled),
...renderMessageGroup('Warnings', options.warnings ?? [], styled),
'',
```
- [ ] **Step 4: Pass plan messages into foreground rendering**
In `packages/cli/src/context-build-view.ts`, inside `runContextBuild`, change:
```ts
const viewOpts = { styled: true, projectDir: args.projectDir };
```
to:
```ts
const viewOpts = {
styled: true,
projectDir: args.projectDir,
notices: plan.notices ?? [],
warnings: plan.warnings,
};
```
This makes every call to `paint()` and the final non-TTY foreground fallback
render the same public messages.
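To make the rendered shape concrete, the sketch below repeats the `renderMessageGroup` helper from Step 3 with a stand-in `dim` function and runs it unstyled. The `dim` implementation is an assumption (the real styling helper is whatever `context-build-view.ts` already imports), and the leading whitespace inside the strings is copied from the snippet as shown.

```typescript
// Stand-in for the CLI's dim() styling helper (assumption: it wraps text in ANSI dim codes).
const dim = (text: string): string => `\u001b[2m${text}\u001b[22m`;

// Same logic as the Step 3 helper: an empty group renders nothing at all.
function renderMessageGroup(label: string, messages: string[], styled: boolean): string[] {
  if (messages.length === 0) return [];
  const renderedMessages = messages.map((message) => ` - ${message}`);
  return ['', ` ${label}:`, ...renderedMessages.map((line) => (styled ? dim(line) : line))];
}

// One notice and no warnings: only the Notices group contributes lines.
const lines = [
  ...renderMessageGroup('Notices', ['Schema ingest runs before query history for warehouse.'], false),
  ...renderMessageGroup('Warnings', [], false),
];
```

Because an empty group returns no lines, a run without warnings shows no `Warnings:` label. In the helper as written, only the message lines are dimmed when `styled` is true; the label line is left unstyled.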
- [ ] **Step 5: Run the foreground-message tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/context-build-view.test.ts -t "renders public warnings|renders public notices"
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add packages/cli/src/context-build-view.ts packages/cli/src/context-build-view.test.ts
git commit -m "fix: render unified ingest foreground warnings"
```
### Task 2: State schema-before-query-history for explicit runs
**Files:**
- Modify: `packages/cli/src/public-ingest.ts`
- Modify: `packages/cli/src/context-build-view.ts`
- Test: `packages/cli/src/public-ingest.test.ts`
- Test: `packages/cli/src/context-build-view.test.ts`
- [ ] **Step 1: Write failing explicit query-history notice tests**
In `packages/cli/src/public-ingest.test.ts`, add this test inside
`describe('buildPublicIngestPlan', ...)` after the existing query-history
planning tests:
```ts
it('adds a schema-first notice when query history is explicitly enabled', () => {
const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } },
});
expect(
buildPublicIngestPlan(project, {
projectDir: '/tmp/project',
targetConnectionId: 'warehouse',
all: false,
queryHistory: 'enabled',
}).notices,
).toEqual(['Schema ingest runs before query history for warehouse.']);
});
```
In `packages/cli/src/public-ingest.test.ts`, add this test inside
`describe('runKtxPublicIngest', ...)` after
`runs query history after schema ingest with current-run window override`:
```ts
it('prints the schema-first notice for explicit query-history runs', async () => {
const io = makeIo();
const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } },
});
const runScan = vi.fn(async () => 0);
const runIngest = vi.fn(async () => 0);
await expect(
runKtxPublicIngest(
{
command: 'run',
projectDir: '/tmp/project',
targetConnectionId: 'warehouse',
all: false,
json: false,
inputMode: 'disabled',
queryHistory: 'enabled',
},
io.io,
{ loadProject: vi.fn(async () => project), runScan, runIngest },
),
).resolves.toBe(0);
expect(io.stdout()).toContain('Schema ingest runs before query history for warehouse.');
});
```
In `packages/cli/src/context-build-view.test.ts`, add this test near the
existing `runContextBuild` tests:
```ts
it('passes schema-first notices from the plan into foreground output', async () => {
const io = makeIo();
const project = {
...projectWithConnections({
warehouse: { driver: 'postgres', context: { depth: 'deep' } },
}),
config: {
...projectWithConnections({ warehouse: { driver: 'postgres' } }).config,
connections: {
warehouse: { driver: 'postgres', context: { depth: 'deep' } },
},
llm: {
provider: { backend: 'gateway', gateway: { api_key: 'env:KTX_GATEWAY_API_KEY' } }, // pragma: allowlist secret
models: { default: 'gpt-test' },
},
scan: {
...projectWithConnections({ warehouse: { driver: 'postgres' } }).config.scan,
enrichment: {
mode: 'llm',
embeddings: {
backend: 'openai',
model: 'text-embedding-3-small',
dimensions: 1536,
},
},
},
},
};
const executeTarget = vi.fn(async (target) => successResult(target.connectionId, target.driver, target.operation));
await expect(
runContextBuild(
project,
{
projectDir: '/tmp/project',
inputMode: 'disabled',
targetConnectionId: 'warehouse',
all: false,
queryHistory: 'enabled',
},
io.io,
{ executeTarget, now: () => 1000 },
),
).resolves.toMatchObject({ exitCode: 0 });
expect(io.stdout()).toContain('Schema ingest runs before query history for warehouse.');
});
```
- [ ] **Step 2: Run the failing query-history notice tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts src/context-build-view.test.ts -t "schema-first notice|passes schema-first"
```
Expected: FAIL because plans do not include `notices`, and plain output does
not print schema-first text.
- [ ] **Step 3: Add notices to the public ingest plan**
In `packages/cli/src/public-ingest.ts`, update `KtxPublicIngestPlan`:
```ts
export interface KtxPublicIngestPlan {
projectDir: string;
targets: KtxPublicIngestPlanTarget[];
warnings: string[];
notices?: string[];
}
```
Add this helper after `finalizeWarnings`:
```ts
function schemaFirstQueryHistoryNotice(
targets: KtxPublicIngestPlanTarget[],
args: { queryHistory?: KtxPublicIngestQueryHistoryFlag },
): string | null {
if (args.queryHistory !== 'enabled') {
return null;
}
const queryHistoryTargets = targets.filter((target) => target.queryHistory?.enabled === true);
if (queryHistoryTargets.length === 0) {
return null;
}
if (queryHistoryTargets.length === 1) {
return `Schema ingest runs before query history for ${queryHistoryTargets[0].connectionId}.`;
}
return `Schema ingest runs before query history for ${queryHistoryTargets.length} database connections.`;
}
```
In `buildPublicIngestPlan`, replace the direct return with:
```ts
const orderedTargets = [
...targets.filter((t) => t.operation === 'database-ingest'),
...targets.filter((t) => t.operation === 'source-ingest'),
];
const notice = schemaFirstQueryHistoryNotice(orderedTargets, args);
return {
projectDir: args.projectDir,
targets: orderedTargets,
warnings: finalizeWarnings(warnings, args),
...(notice ? { notices: [notice] } : {}),
};
```
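The branching in `schemaFirstQueryHistoryNotice` is easiest to check in isolation. The sketch below restates the helper against simplified stand-in types (`SketchTarget` and `SketchFlag` are assumptions, not the real `KtxPublicIngestPlanTarget` or `KtxPublicIngestQueryHistoryFlag`) so the null, single-target, and multi-target cases can be exercised without the rest of the planner.

```typescript
// Simplified stand-ins for the planner types; the real definitions live in public-ingest.ts.
type SketchFlag = 'enabled' | 'disabled';
interface SketchTarget {
  connectionId: string;
  queryHistory?: { enabled: boolean };
}

function schemaFirstQueryHistoryNotice(
  targets: SketchTarget[],
  args: { queryHistory?: SketchFlag },
): string | null {
  // Only an explicit --query-history flag triggers the notice; stored config alone does not.
  if (args.queryHistory !== 'enabled') return null;
  const queryHistoryTargets = targets.filter((target) => target.queryHistory?.enabled === true);
  const first = queryHistoryTargets[0];
  if (first === undefined) return null;
  if (queryHistoryTargets.length === 1) {
    return `Schema ingest runs before query history for ${first.connectionId}.`;
  }
  return `Schema ingest runs before query history for ${queryHistoryTargets.length} database connections.`;
}

const single = schemaFirstQueryHistoryNotice(
  [{ connectionId: 'warehouse', queryHistory: { enabled: true } }],
  { queryHistory: 'enabled' },
);
const many = schemaFirstQueryHistoryNotice(
  [
    { connectionId: 'warehouse', queryHistory: { enabled: true } },
    { connectionId: 'analytics', queryHistory: { enabled: true } },
    { connectionId: 'docs' },
  ],
  { queryHistory: 'enabled' },
);
```

Here `single` names the connection, while `many` falls back to a count so `--all` runs do not print one notice per database.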
- [ ] **Step 4: Print notices in plain public ingest**
In `packages/cli/src/public-ingest.ts`, inside `runKtxPublicIngest`, change:
```ts
if (!args.json && plan.warnings.length > 0) {
for (const warning of plan.warnings) {
io.stderr.write(`Warning: ${warning}\n`);
}
}
```
to:
```ts
if (!args.json) {
for (const notice of plan.notices ?? []) {
io.stdout.write(`${notice}\n`);
}
for (const warning of plan.warnings) {
io.stderr.write(`Warning: ${warning}\n`);
}
}
```
Task 1 already passes `plan.notices` into the foreground view options inside
`runContextBuild`, so explicit query-history foreground runs render the same
notice in the view.
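The stream split in Step 4 (notices on stdout, warnings on stderr, nothing in JSON mode) can be checked with an in-memory capture. The `SketchIo`, `SketchPlan`, `printPlanMessages`, and `makeCapture` names below are assumptions made for this sketch; the real code writes through the CLI's own `io` object inside `runKtxPublicIngest`.

```typescript
// Minimal stand-ins for the CLI io object and plan; both shapes are assumptions for this sketch.
interface SketchWritable {
  write(chunk: string): void;
}
interface SketchIo {
  stdout: SketchWritable;
  stderr: SketchWritable;
}
interface SketchPlan {
  warnings: string[];
  notices?: string[];
}

// Mirrors the Step 4 block: notices go to stdout, warnings go to stderr with a prefix.
function printPlanMessages(plan: SketchPlan, io: SketchIo, json: boolean): void {
  if (json) return; // In JSON mode this block prints nothing.
  for (const notice of plan.notices ?? []) {
    io.stdout.write(`${notice}\n`);
  }
  for (const warning of plan.warnings) {
    io.stderr.write(`Warning: ${warning}\n`);
  }
}

// Capture writes in memory so the stream split is easy to inspect.
function makeCapture(): { io: SketchIo; out: string[]; err: string[] } {
  const out: string[] = [];
  const err: string[] = [];
  const io: SketchIo = {
    stdout: { write: (chunk) => { out.push(chunk); } },
    stderr: { write: (chunk) => { err.push(chunk); } },
  };
  return { io, out, err };
}
```

Keeping notices on stdout matches the test in Step 1, which asserts on `io.stdout()` for the schema-first text.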
- [ ] **Step 5: Run the query-history notice tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts src/context-build-view.test.ts -t "schema-first notice|passes schema-first"
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/context-build-view.ts packages/cli/src/context-build-view.test.ts
git commit -m "fix: explain query history schema order"
```
### Task 3: Add retry guidance to plain public failures
**Files:**
- Modify: `packages/cli/src/public-ingest.ts`
- Test: `packages/cli/src/public-ingest.test.ts`
- [ ] **Step 1: Write failing plain retry tests**
In `packages/cli/src/public-ingest.test.ts`, replace these assertions in
`runs all independent targets and reports partial failures`:
```ts
expect(io.stdout()).toContain('warehouse failed at database-schema.');
expect(io.stdout()).toContain('Debug: ktx ingest warehouse --debug');
```
with:
```ts
expect(io.stdout()).toContain('warehouse failed at database-schema.');
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --fast');
expect(io.stdout()).not.toContain('Debug: ktx ingest warehouse --debug');
```
Then add this test after `runs all independent targets and reports partial
failures`:
```ts
it('prints query-history retry guidance for query-history facet failures', async () => {
const io = makeIo();
const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } },
});
const runScan = vi.fn(async () => 0);
const runIngest = vi.fn(async () => 1);
await expect(
runKtxPublicIngest(
{
command: 'run',
projectDir: '/tmp/project',
targetConnectionId: 'warehouse',
all: false,
json: false,
inputMode: 'disabled',
queryHistory: 'enabled',
},
io.io,
{ loadProject: vi.fn(async () => project), runScan, runIngest },
),
).resolves.toBe(1);
expect(io.stdout()).toContain('warehouse failed at query-history.');
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --deep --query-history');
expect(io.stdout()).not.toContain('historic-sql');
});
```
- [ ] **Step 2: Run the failing retry tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts -t "partial failures|query-history retry"
```
Expected: FAIL because plain failures still print `Debug:` and lack retry
commands.
- [ ] **Step 3: Add retry command formatting to public ingest**
In `packages/cli/src/public-ingest.ts`, add these helpers before
`markTargetResult`:
```ts
function retryCommandForTarget(
target: KtxPublicIngestPlanTarget,
args: Extract<KtxPublicIngestArgs, { command: 'run' }>,
): string {
const projectPart = ` --project-dir ${args.projectDir}`;
const depthPart = target.databaseDepth ? ` --${target.databaseDepth}` : '';
const queryHistoryPart = target.queryHistory?.enabled === true ? ' --query-history' : '';
const windowPart =
target.queryHistory?.enabled === true && target.queryHistory.windowDays !== undefined
? ` --query-history-window-days ${target.queryHistory.windowDays}`
: '';
return `ktx ingest ${target.connectionId}${projectPart}${depthPart}${queryHistoryPart}${windowPart}`;
}
function trimTrailingPeriod(value: string): string {
return value.endsWith('.') ? value.slice(0, -1) : value;
}
function failureDetailWithRetry(input: {
target: KtxPublicIngestPlanTarget;
args: Extract<KtxPublicIngestArgs, { command: 'run' }>;
failedOperation: KtxPublicIngestStepName;
failureDetail?: string;
}): string {
const detail = input.failureDetail?.trim();
const base =
detail && detail.startsWith(`${input.target.connectionId} `)
? detail
: detail
? `${input.target.connectionId} failed: ${detail}`
: `${input.target.connectionId} failed at ${input.failedOperation}.`;
return `${trimTrailingPeriod(base)}. Retry: ${retryCommandForTarget(input.target, input.args)}`;
}
```
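The strings these helpers produce can be traced with a small standalone sketch. It restates the logic against a simplified `SketchTarget` type and takes `projectDir` directly instead of the full run args; both simplifications are assumptions made so the block runs without the rest of `public-ingest.ts`.

```typescript
// Simplified stand-in for KtxPublicIngestPlanTarget (assumed shape, for illustration only).
interface SketchTarget {
  connectionId: string;
  databaseDepth?: 'fast' | 'deep';
  queryHistory?: { enabled: boolean; windowDays?: number };
}

function retryCommandForTarget(target: SketchTarget, projectDir: string): string {
  const projectPart = ` --project-dir ${projectDir}`;
  const depthPart = target.databaseDepth ? ` --${target.databaseDepth}` : '';
  const queryHistoryPart = target.queryHistory?.enabled === true ? ' --query-history' : '';
  const windowPart =
    target.queryHistory?.enabled === true && target.queryHistory.windowDays !== undefined
      ? ` --query-history-window-days ${target.queryHistory.windowDays}`
      : '';
  return `ktx ingest ${target.connectionId}${projectPart}${depthPart}${queryHistoryPart}${windowPart}`;
}

function trimTrailingPeriod(value: string): string {
  return value.endsWith('.') ? value.slice(0, -1) : value;
}

// Pick a base sentence, then append exactly one period before the retry command.
function failureDetailWithRetry(
  target: SketchTarget,
  projectDir: string,
  failedOperation: string,
  failureDetail?: string,
): string {
  const detail = failureDetail?.trim();
  const base =
    detail && detail.startsWith(`${target.connectionId} `)
      ? detail
      : detail
        ? `${target.connectionId} failed: ${detail}`
        : `${target.connectionId} failed at ${failedOperation}.`;
  return `${trimTrailingPeriod(base)}. Retry: ${retryCommandForTarget(target, projectDir)}`;
}

const fastTarget: SketchTarget = { connectionId: 'warehouse', databaseDepth: 'fast' };
const schemaFailure = failureDetailWithRetry(fastTarget, '/tmp/project', 'database-schema');
```

With these inputs, `schemaFailure` begins with `warehouse failed at database-schema.` followed by the `Retry:` command, which is why the Step 1 assertions on both substrings can pass against a single detail line. As written, the project directory is interpolated unquoted, so a path containing spaces would need quoting before the command is pasted into a shell.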
- [ ] **Step 4: Thread run args into failure detail construction**
Change the `markTargetResult` signature in `packages/cli/src/public-ingest.ts`
from:
```ts
function markTargetResult(
target: KtxPublicIngestPlanTarget,
status: 'done' | 'failed',
failedOperation?: KtxPublicIngestStepName,
failureDetail?: string,
): KtxPublicIngestTargetResult {
```
to:
```ts
function markTargetResult(
target: KtxPublicIngestPlanTarget,
args: Extract<KtxPublicIngestArgs, { command: 'run' }>,
status: 'done' | 'failed',
failedOperation?: KtxPublicIngestStepName,
failureDetail?: string,
): KtxPublicIngestTargetResult {
```
Inside the failed-step branch, replace:
```ts
detail: failureDetail ?? `${target.connectionId} failed at ${selectedFailedOperation}.`,
```
with:
```ts
detail: failureDetailWithRetry({
target,
args,
failedOperation: selectedFailedOperation,
failureDetail,
}),
```
Update every `markTargetResult` call in `executePublicIngestTarget`:
```ts
return markTargetResult(
target,
args,
'failed',
'database-schema',
capturedScanIo ? firstCapturedFailureLine(capturedScanIo.capturedOutput()) : undefined,
);
```
```ts
return markTargetResult(target, args, 'failed', 'query-history');
```
```ts
return markTargetResult(target, args, 'done');
```
```ts
return markTargetResult(target, args, exitCode === 0 ? 'done' : 'failed');
```
- [ ] **Step 5: Stop printing debug commands in plain failure summaries**
In `renderPlainResults`, remove this block:
```ts
if (failedStep.debugCommand) {
io.stdout.write(` Debug: ${failedStep.debugCommand}\n`);
}
```
Debug commands remain available through JSON and debug surfaces, but normal
plain output now focuses on the connection and retry action.
- [ ] **Step 6: Run the retry tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts -t "partial failures|query-history retry"
```
Expected: PASS.
- [ ] **Step 7: Commit**
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts
git commit -m "fix: add public ingest retry guidance"
```
### Task 4: Replace setup next-step scan/resume wording
**Files:**
- Modify: `packages/cli/src/next-steps.ts`
- Test: `packages/cli/src/next-steps.test.ts`
- [ ] **Step 1: Write failing next-step copy tests**
In `packages/cli/src/next-steps.test.ts`, replace the expected
`KTX_CONTEXT_BUILD_COMMANDS` value with:
```ts
expect(KTX_CONTEXT_BUILD_COMMANDS).toEqual([
  {
    command: 'ktx ingest --all',
    description: 'Build or refresh agent-ready context from configured connections',
  },
  {
    command: 'ktx status',
    description: 'Check setup and context readiness',
  },
]);
```
In the test named `keeps setup next steps focused on building context when the
build is not ready`, replace:
```ts
expect(rendered).toContain('primary-source scans and context-source ingests');
expect(rendered).toContain('ktx setup');
```
with:
```ts
expect(rendered).toContain('Run ingest to build database schema context before context-source ingest.');
expect(rendered).toContain('ktx ingest --all');
expect(rendered).not.toContain('resume');
expect(rendered).not.toContain('scan');
```
- [ ] **Step 2: Run the failing next-step copy tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/next-steps.test.ts
```
Expected: FAIL because the current copy still recommends `ktx setup` for the
context-build action and uses resume/scan wording.
- [ ] **Step 3: Update the next-step command constants**
In `packages/cli/src/next-steps.ts`, change `KTX_CONTEXT_BUILD_COMMANDS` to:
```ts
export const KTX_CONTEXT_BUILD_COMMANDS = [
  {
    command: 'ktx ingest --all',
    description: 'Build or refresh agent-ready context from configured connections',
  },
  {
    command: 'ktx status',
    description: 'Check setup and context readiness',
  },
] as const;
```
In `formatSetupNextStepLines`, replace:
```ts
`${indent}Preferred route: run the CLI build; it covers primary-source scans and context-source ingests.`,
```
with:
```ts
`${indent}Run ingest to build database schema context before context-source ingest.`,
```
- [ ] **Step 4: Run the next-step copy tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/next-steps.test.ts
```
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add packages/cli/src/next-steps.ts packages/cli/src/next-steps.test.ts
git commit -m "fix: align setup next steps with unified ingest"
```
### Task 5: Clean guided demo foreground scan wording
**Files:**
- Modify: `packages/cli/src/setup-demo-tour.ts`
- Test: `packages/cli/src/setup-demo-tour.test.ts`
- [ ] **Step 1: Write failing demo wording tests**
In `packages/cli/src/setup-demo-tour.test.ts`, add this test inside
`describe('buildDemoReplayTimeline', ...)`:
```ts
it('uses schema-context wording for database progress', () => {
const renderedTimeline = timeline
.map((event) => [event.detailLine, event.summaryText].filter(Boolean).join(' '))
.join('\n');
expect(renderedTimeline).toContain('reading schema');
expect(renderedTimeline).toContain('56 tables');
expect(renderedTimeline).not.toContain('scanning');
expect(renderedTimeline).not.toContain('scanned');
});
```
- [ ] **Step 2: Run the failing demo wording test**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-demo-tour.test.ts -t "schema-context wording"
```
Expected: FAIL because the demo timeline still uses `scanning tables...` and
`tables scanned`.
- [ ] **Step 3: Replace demo timeline database copy**
In `packages/cli/src/setup-demo-tour.ts`, inside `buildDemoReplayTimeline`,
replace the first three events:
```ts
// postgres-warehouse: scan
{ delayMs: 0, connectionId: 'postgres-warehouse', status: 'running', detailLine: null, summaryText: null },
{ delayMs: 1200, connectionId: 'postgres-warehouse', status: 'running', detailLine: '[50%] scanning tables...', summaryText: null },
{ delayMs: 2400, connectionId: 'postgres-warehouse', status: 'done', detailLine: null, summaryText: '56 tables scanned' },
```
with:
```ts
// postgres-warehouse: database schema context
{ delayMs: 0, connectionId: 'postgres-warehouse', status: 'running', detailLine: null, summaryText: null },
{ delayMs: 1200, connectionId: 'postgres-warehouse', status: 'running', detailLine: '[50%] reading schema...', summaryText: null },
{ delayMs: 2400, connectionId: 'postgres-warehouse', status: 'done', detailLine: null, summaryText: '56 tables' },
```
- [ ] **Step 4: Run the demo wording test**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-demo-tour.test.ts -t "schema-context wording"
```
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add packages/cli/src/setup-demo-tour.ts packages/cli/src/setup-demo-tour.test.ts
git commit -m "fix: remove scan wording from demo progress"
```
### Task 6: Final verification
**Files:**
- Verify: `packages/cli/src/public-ingest.ts`
- Verify: `packages/cli/src/context-build-view.ts`
- Verify: `packages/cli/src/next-steps.ts`
- Verify: `packages/cli/src/setup-demo-tour.ts`
- Verify: relevant tests
- [ ] **Step 1: Run focused Vitest coverage**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts src/context-build-view.test.ts src/next-steps.test.ts src/setup-demo-tour.test.ts
```
Expected: PASS.
- [ ] **Step 2: Run CLI type-check**
Run:
```bash
pnpm --filter @ktx/cli run type-check
```
Expected: PASS.
- [ ] **Step 3: Run CLI tests**
Run:
```bash
pnpm --filter @ktx/cli run test
```
Expected: PASS.
- [ ] **Step 4: Run dead-code check after TypeScript changes**
Run:
```bash
pnpm run dead-code
```
Expected: PASS.
- [ ] **Step 5: Search for stale public wording in touched surfaces**
Run:
```bash
rg -n "Build or resume agent-ready|primary-source scans|scanning tables|tables scanned|Debug: ktx ingest" packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/context-build-view.ts packages/cli/src/context-build-view.test.ts packages/cli/src/next-steps.ts packages/cli/src/next-steps.test.ts packages/cli/src/setup-demo-tour.ts packages/cli/src/setup-demo-tour.test.ts
```
Expected: no matches.
- [ ] **Step 6: Commit verification fixes if any were needed**
If verification required edits, run:
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/context-build-view.ts packages/cli/src/context-build-view.test.ts packages/cli/src/next-steps.ts packages/cli/src/next-steps.test.ts packages/cli/src/setup-demo-tour.ts packages/cli/src/setup-demo-tour.test.ts
git commit -m "test: verify unified ingest ux closure"
```
If no edits were needed, do not create an empty commit.
## Self-review
- Spec coverage: The plan covers the remaining v1-blocking warning,
schema-first query-history, retry-guidance, setup next-step, and foreground
demo wording gaps. Core command routing, depth policy, query-history config,
setup depth, docs-site command references, foreground-only state, and reserved
ids are already covered by earlier implemented plans.
- Placeholder scan: The plan contains exact file paths, concrete test code,
implementation snippets, commands, and expected results. No red-flag
placeholders are present.
- Type consistency: `notices` is added as an optional
`KtxPublicIngestPlan` property and threaded through `renderContextBuildView`
options. Retry helpers use existing `KtxPublicIngestPlanTarget`,
`KtxPublicIngestArgs`, and `KtxPublicIngestStepName` types.


@ -0,0 +1,559 @@
# Unified Ingest V1 Progress Copy Closure Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Remove the remaining v1-blocking scan wording from normal public
unified-ingest progress, failure, and setup scope-selection output.
**Architecture:** Keep the implemented connection-centric ingest planner,
hidden legacy commands, and foreground context-build view. Add a small shared
public-copy helper for lower-level database ingest and query-history messages,
then use it from foreground progress and direct public failure summarization.
**Tech Stack:** TypeScript ESM, Commander, Vitest, KTX CLI/context packages.
---
## Current audit
The implemented unified-ingest plan chain covers the original spec's main v1
behavior:
- `ktx ingest [connectionId]`, `ktx ingest --all`, `--fast`, `--deep`,
`--query-history`, `--no-query-history`, and
`--query-history-window-days` route through `public-ingest.ts`.
- Database targets run before source targets, inferred public adapters bypass
`ingest.adapters`, and `fast` and `deep` map to structural and enriched
database ingest internals, respectively.
- Deep readiness is evaluated before target work starts, and `--all` isolates
per-target deep-readiness failures.
- Setup stores `connections.<id>.context.depth` and
`connections.<id>.context.queryHistory`, migrates legacy `historicSql`, and
uses foreground-only setup context state.
- Normal help hides `ktx scan`, `ktx ingest run`, and `ktx ingest watch`; docs
and command-tree output no longer present those as normal public workflows.
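
The ordering rule in the audit list above (database targets before source targets) can be pictured as a stable partition. This is a sketch under assumed names: `SketchPlanTarget` and its `kind` field are hypothetical and do not reflect the planner's real types.

```ts
// Hypothetical target shape; the real planner types are not shown in this plan.
type SketchKind = 'database' | 'source';
interface SketchPlanTarget {
  connectionId: string;
  kind: SketchKind;
}

// Stable partition: database targets first, each group keeping its original order.
function sketchOrderTargets(targets: SketchPlanTarget[]): SketchPlanTarget[] {
  const databases = targets.filter((target) => target.kind === 'database');
  const sources = targets.filter((target) => target.kind === 'source');
  return [...databases, ...sources];
}

const orderedIds = sketchOrderTargets([
  { connectionId: 'docs', kind: 'source' },
  { connectionId: 'warehouse', kind: 'database' },
  { connectionId: 'wiki', kind: 'source' },
  { connectionId: 'analytics', kind: 'database' },
]).map((target) => target.connectionId);
console.log(orderedIds.join(', '));
```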
### V1-blocking gaps
- Foreground `ktx ingest` and setup context-build progress still pass database
ingest progress messages through from scan internals. A normal user can see
messages such as `Preparing scan`, even though the spec says the foreground
view must use `reading schema` or `building schema context` and must not show
`scan` in normal mode.
- Direct public database ingest failure summaries sanitize `live-database` and
`historic-sql`, but not scan-specific failure lines such as
`KTX scan enrichment failed after structural scan completed: ...`.
- Interactive database setup still asks for `PostgreSQL schemas to scan`, which
keeps scan wording in normal setup output after the public model changed to
database schema context.
### Non-blocking gaps
- Hidden debug commands can remain callable: `ktx scan`, `ktx ingest run`, and
`ktx ingest watch`.
- Internal adapter keys, raw artifact paths, WorkUnit keys, package names,
tests, and developer-only scripts can continue to use `scan`,
`live-database`, and `historic-sql`.
- README package taxonomy such as `Postgres scan connector` can remain because
it describes internal package ownership, not normal command usage.
- Internal readiness configuration names such as `scan.enrichment.mode` can
remain because they refer to existing `ktx.yaml` configuration fields.
## File structure
- Create `packages/cli/src/public-ingest-copy.ts`: shared copy sanitizer for
database ingest and query-history messages used by public output paths.
- Create `packages/cli/src/public-ingest-copy.test.ts`: unit coverage for the
sanitizer.
- Modify `packages/cli/src/context-build-view.ts`: sanitize foreground
database progress messages and reuse the shared query-history sanitizer.
- Modify `packages/cli/src/context-build-view.test.ts`: cover foreground
progress output with lower-level scan messages.
- Modify `packages/cli/src/public-ingest.ts`: use the shared public output-line
sanitizer for captured failure details.
- Modify `packages/cli/src/public-ingest.test.ts`: cover direct public failure
output for scan-enrichment failures.
- Modify `packages/cli/src/setup-databases.ts`: change the schema scope prompt
from `schemas to scan` to `schemas to include`.
- Modify `packages/cli/src/setup-databases.test.ts`: update the schema prompt
expectation and assert scan wording is absent.
## Tasks
### Task 1: Add shared public ingest copy sanitizers
**Files:**
- Create: `packages/cli/src/public-ingest-copy.ts`
- Create: `packages/cli/src/public-ingest-copy.test.ts`
- [ ] **Step 1: Write the public-copy tests**
Create `packages/cli/src/public-ingest-copy.test.ts`:
```ts
import { describe, expect, it } from 'vitest';
import {
publicDatabaseIngestMessage,
publicIngestOutputLine,
publicQueryHistoryMessage,
} from './public-ingest-copy.js';
describe('public ingest copy sanitizers', () => {
it('maps database scan progress into schema-context wording', () => {
expect(publicDatabaseIngestMessage('Preparing scan')).toBe('Preparing database ingest');
expect(publicDatabaseIngestMessage('Inspecting database schema')).toBe('Reading database schema');
expect(publicDatabaseIngestMessage('Writing schema artifacts')).toBe('Writing schema context');
expect(publicDatabaseIngestMessage('Enriching schema metadata')).toBe('Building enriched schema context');
});
it('maps database scan failure text into public database ingest wording', () => {
expect(
publicDatabaseIngestMessage(
'KTX scan enrichment failed after structural scan completed: embedding service timed out',
),
).toBe('Database enrichment failed after schema context completed: embedding service timed out');
expect(publicDatabaseIngestMessage('structural scan wrote partial artifacts')).toBe(
'schema context wrote partial artifacts',
);
expect(publicDatabaseIngestMessage('scan results may be less complete')).toBe(
'database context may be less complete',
);
});
it('maps query-history adapter progress into public wording', () => {
expect(publicQueryHistoryMessage('Fetching source files for warehouse/historic-sql', 'warehouse')).toBe(
'Fetching query history for warehouse',
);
expect(publicQueryHistoryMessage('Curating warehouse/historic-sql work units', 'warehouse')).toBe(
'Curating warehouse query history work units',
);
expect(publicQueryHistoryMessage('historic SQL local ingest failed', 'warehouse')).toBe(
'query history local ingest failed',
);
});
it('sanitizes captured public output lines across database and query-history internals', () => {
expect(
publicIngestOutputLine(
'KTX scan enrichment failed after structural scan completed in raw-sources/warehouse/live-database/sync-1',
),
).toBe('Database enrichment failed after schema context completed in raw-sources/warehouse/database schema/sync-1');
expect(publicIngestOutputLine('Historic SQL local ingest requires a configured reader')).toBe(
'query history local ingest requires a configured reader',
);
});
});
```
- [ ] **Step 2: Run the failing public-copy tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest-copy.test.ts
```
Expected: FAIL because `packages/cli/src/public-ingest-copy.ts` does not exist.
- [ ] **Step 3: Implement the shared sanitizers**
Create `packages/cli/src/public-ingest-copy.ts`:
```ts
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const DATABASE_INGEST_REPLACEMENTS: Array<[RegExp, string]> = [
  [/\bPreparing scan\b/gi, 'Preparing database ingest'],
  [/\bInspecting database schema\b/gi, 'Reading database schema'],
  [/\bWriting schema artifacts\b/gi, 'Writing schema context'],
  [/\bEnriching schema metadata\b/gi, 'Building enriched schema context'],
  [
    /\bKTX scan enrichment failed after structural scan completed\b/gi,
    'Database enrichment failed after schema context completed',
  ],
  [/\bstructural scan\b/gi, 'schema context'],
  [/\benriched scan\b/gi, 'deep database ingest'],
  [/\bscan results\b/gi, 'database context'],
];

export function publicDatabaseIngestMessage(message: string): string {
  return DATABASE_INGEST_REPLACEMENTS.reduce(
    (current, [pattern, replacement]) => current.replace(pattern, replacement),
    message,
  );
}

export function publicQueryHistoryMessage(message: string, connectionId?: string): string {
  let current = message;
  if (connectionId && connectionId.length > 0) {
    const escapedConnectionId = escapeRegExp(connectionId);
    current = current
      .replace(
        new RegExp(`Fetching source files for ${escapedConnectionId}/historic-sql`, 'i'),
        `Fetching query history for ${connectionId}`,
      )
      .replace(`${connectionId}/historic-sql`, `${connectionId} query history`);
  }
  return current.replace(/\bhistoric-sql\b/g, 'query history').replace(/\bhistoric SQL\b/gi, 'query history');
}

export function publicIngestOutputLine(line: string): string {
  return publicQueryHistoryMessage(publicDatabaseIngestMessage(line)).replace(/\blive-database\b/g, 'database schema');
}
```
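
The order of entries in `DATABASE_INGEST_REPLACEMENTS` matters: the full failure sentence has to be rewritten before the shorter `structural scan` rule runs, otherwise the longer pattern no longer matches. The standalone sketch below reproduces two of the rules under hypothetical names to show the difference; it is an illustration, not part of `public-ingest-copy.ts`.

```ts
type SketchRule = [RegExp, string];

// Two rules copied from the replacement table above.
const specificRule: SketchRule = [
  /\bKTX scan enrichment failed after structural scan completed\b/gi,
  'Database enrichment failed after schema context completed',
];
const generalRule: SketchRule = [/\bstructural scan\b/gi, 'schema context'];

// Apply rules left to right, the same way the reduce() in the sanitizer does.
function sketchApplyRules(message: string, rules: SketchRule[]): string {
  return rules.reduce((current, [pattern, replacement]) => current.replace(pattern, replacement), message);
}

const rawFailure = 'KTX scan enrichment failed after structural scan completed: embedding service timed out';

// Specific rule first: the whole sentence is rewritten.
const specificFirst = sketchApplyRules(rawFailure, [specificRule, generalRule]);
// General rule first: "structural scan" is rewritten, so the specific
// pattern no longer matches and "KTX scan" leaks through.
const generalFirst = sketchApplyRules(rawFailure, [generalRule, specificRule]);

console.log(specificFirst);
console.log(generalFirst);
```

Keeping longer, more specific patterns ahead of shorter ones is the design constraint to preserve when new rules are added to the table.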
- [ ] **Step 4: Run the public-copy tests again**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest-copy.test.ts
```
Expected: PASS.
- [ ] **Step 5: Commit the shared sanitizer**
Run:
```bash
git add packages/cli/src/public-ingest-copy.ts packages/cli/src/public-ingest-copy.test.ts
git commit -m "fix(cli): add public ingest copy sanitizers"
```
### Task 2: Sanitize foreground progress and captured public failures
**Files:**
- Modify: `packages/cli/src/context-build-view.ts`
- Modify: `packages/cli/src/context-build-view.test.ts`
- Modify: `packages/cli/src/public-ingest.ts`
- Modify: `packages/cli/src/public-ingest.test.ts`
- Test: `packages/cli/src/public-ingest-copy.test.ts`
- [ ] **Step 1: Write the failing foreground progress test**
In `packages/cli/src/context-build-view.test.ts`, add this test inside the
`runContextBuild` describe block near the existing query-history progress test:
```ts
it('renders database ingest progress without scan wording', async () => {
const io = makeIo();
const project = projectWithConnections({ warehouse: { driver: 'postgres' } });
const executeTarget = vi.fn(async (target, _args, _targetIo, deps) => {
await deps.scanProgress?.update(0.05, 'Preparing scan');
await deps.scanProgress?.update(0.15, 'Inspecting database schema');
await deps.scanProgress?.update(0.7, 'Writing schema artifacts');
return successResult(target.connectionId, target.driver, target.operation);
});
await expect(
runContextBuild(
project,
{
projectDir: '/tmp/project',
inputMode: 'disabled',
targetConnectionId: 'warehouse',
all: false,
},
io.io,
{ executeTarget, now: () => 1000, sourceProgressThrottleMs: 0 },
),
).resolves.toMatchObject({ exitCode: 0 });
expect(io.stdout()).toContain('Preparing database ingest');
expect(io.stdout()).toContain('Reading database schema');
expect(io.stdout()).toContain('Writing schema context');
expect(io.stdout()).not.toContain('Preparing scan');
expect(io.stdout()).not.toMatch(/\bscan\b/i);
});
```
- [ ] **Step 2: Write the failing direct public failure test**
In `packages/cli/src/public-ingest.test.ts`, add this test inside the
`runKtxPublicIngest` describe block near
`suppresses internal scan output for public database ingest summaries`:
```ts
it('sanitizes captured database scan failure details in direct public output', async () => {
const io = makeIo();
const project = deepReadyProject({ warehouse: { driver: 'postgres', context: { depth: 'deep' } } });
const runScan = vi.fn(async (_args, scanIo) => {
scanIo.stdout.write('KTX scan enrichment failed after structural scan completed: embedding service timed out\n');
return 1;
});
await expect(
runKtxPublicIngest(
{
command: 'run',
projectDir: '/tmp/project',
targetConnectionId: 'warehouse',
all: false,
json: false,
inputMode: 'disabled',
depth: 'deep',
},
io.io,
{ loadProject: vi.fn(async () => project), runScan },
),
).resolves.toBe(1);
expect(io.stdout()).toContain(
'warehouse failed: Database enrichment failed after schema context completed: embedding service timed out.',
);
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --deep');
expect(io.stdout()).not.toContain('KTX scan enrichment failed');
expect(io.stdout()).not.toContain('structural scan');
});
```
- [ ] **Step 3: Run the failing integration tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/context-build-view.test.ts src/public-ingest.test.ts -t "database ingest progress|captured database scan failure" --testTimeout 30000
```
Expected: FAIL because foreground progress still prints `Preparing scan`, and
captured direct failures still print the lower-level scan failure text.
- [ ] **Step 4: Use the shared sanitizer in foreground progress**
In `packages/cli/src/context-build-view.ts`, add this import:
```ts
import { publicDatabaseIngestMessage, publicQueryHistoryMessage } from './public-ingest-copy.js';
```
Replace the existing `publicProgressMessage()` implementation:
```ts
function publicProgressMessage(message: string, target: KtxPublicIngestPlanTarget): string {
  if (!target.steps.includes('query-history')) {
    return message;
  }
  return message
    .replace(
      new RegExp(`Fetching source files for ${target.connectionId}/historic-sql`, 'i'),
      `Fetching query history for ${target.connectionId}`,
    )
    .replace(`${target.connectionId}/historic-sql`, `${target.connectionId} query history`)
    .replace(/\bhistoric-sql\b/g, 'query history')
    .replace(/\bhistoric SQL\b/gi, 'query history');
}
```
with:
```ts
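function publicProgressMessage(message: string, target: KtxPublicIngestPlanTarget): string {
  // A database-ingest target may also carry a query-history step, so apply
  // both sanitizers in sequence instead of returning after the first match.
  const databaseMessage =
    target.operation === 'database-ingest' ? publicDatabaseIngestMessage(message) : message;
  return target.steps.includes('query-history')
    ? publicQueryHistoryMessage(databaseMessage, target.connectionId)
    : databaseMessage;
}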
```
- [ ] **Step 5: Use the shared sanitizer in public ingest failure capture**
In `packages/cli/src/public-ingest.ts`, add this import:
```ts
import { publicIngestOutputLine } from './public-ingest-copy.js';
```
Delete the local `publicIngestOutputLine()` function:
```ts
function publicIngestOutputLine(line: string): string {
  return line
    .replace(/\blive-database\b/g, 'database schema')
    .replace(/\bhistoric-sql\b/g, 'query history')
    .replace(/\bhistoric SQL\b/gi, 'query history');
}
```
Leave `firstCapturedFailureLine()` calling `publicIngestOutputLine` unchanged;
the imported function now provides the broader public wording.
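
The body of `firstCapturedFailureLine()` is not shown in this plan. As a rough sketch of how a captured failure line might flow through the sanitizer, the example below assumes the helper picks the first non-empty line of the captured output. Both function names are hypothetical, and `sketchSanitizeLine` carries only two of the real rules.

```ts
// Hypothetical stand-in for the imported publicIngestOutputLine; two rules only.
function sketchSanitizeLine(line: string): string {
  return line
    .replace(
      /\bKTX scan enrichment failed after structural scan completed\b/gi,
      'Database enrichment failed after schema context completed',
    )
    .replace(/\blive-database\b/g, 'database schema');
}

// Assumption: the first non-empty captured line is used as the failure detail.
function sketchFirstFailureLine(capturedOutput: string): string | undefined {
  const firstLine = capturedOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .find((line) => line.length > 0);
  return firstLine === undefined ? undefined : sketchSanitizeLine(firstLine);
}

const capturedSample =
  '\nKTX scan enrichment failed after structural scan completed: embedding service timed out\nsecond line\n';
const sanitizedDetail = sketchFirstFailureLine(capturedSample);
const emptyDetail = sketchFirstFailureLine('   \n');
console.log(sanitizedDetail);
console.log(emptyDetail);
```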
- [ ] **Step 6: Run the integration tests again**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest-copy.test.ts src/context-build-view.test.ts src/public-ingest.test.ts --testTimeout 30000
```
Expected: PASS.
- [ ] **Step 7: Commit foreground and failure sanitization**
Run:
```bash
git add packages/cli/src/context-build-view.ts packages/cli/src/context-build-view.test.ts packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/public-ingest-copy.ts packages/cli/src/public-ingest-copy.test.ts
git commit -m "fix(cli): sanitize public ingest progress copy"
```
### Task 3: Rename setup schema scope prompt
**Files:**
- Modify: `packages/cli/src/setup-databases.ts`
- Modify: `packages/cli/src/setup-databases.test.ts`
- [ ] **Step 1: Update the setup prompt expectation**
In `packages/cli/src/setup-databases.test.ts`, in the test named
`prompts for discovered Postgres schemas before the first scan`, replace:
```ts
message: expect.stringContaining('PostgreSQL schemas to scan'),
```
with:
```ts
message: expect.stringContaining('PostgreSQL schemas to include'),
```
Add this assertion after the `toHaveBeenCalledWith` block:
```ts
expect(String(prompts.multiselect.mock.calls[0]?.[0].message)).not.toContain('to scan');
```
- [ ] **Step 2: Run the failing setup prompt test**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-databases.test.ts -t "prompts for discovered Postgres schemas before the first scan" --testTimeout 30000
```
Expected: FAIL because the prompt still says `PostgreSQL schemas to scan`.
- [ ] **Step 3: Rename the setup scope prompt**
In `packages/cli/src/setup-databases.ts`, replace:
```ts
`${spec.promptLabel} to scan\n` +
`KTX found multiple ${spec.nounPlural}. Select every ${spec.noun} agents should use.`,
```
with:
```ts
`${spec.promptLabel} to include\n` +
`KTX found multiple ${spec.nounPlural}. Select every ${spec.noun} agents should use.`,
```
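
For reference, this is how the renamed template renders for a Postgres-style spec. The sketch is standalone: `SketchScopeSpec` models only the three fields the template reads, and the `noun` and `nounPlural` values are assumptions chosen to match the `PostgreSQL schemas to include` expectation in the test.

```ts
// Hypothetical spec shape; only the fields used by the prompt template are modeled.
interface SketchScopeSpec {
  promptLabel: string;
  noun: string;
  nounPlural: string;
}

// Same template as the replacement above, extracted into a function for illustration.
function sketchScopePromptMessage(spec: SketchScopeSpec): string {
  return (
    `${spec.promptLabel} to include\n` +
    `KTX found multiple ${spec.nounPlural}. Select every ${spec.noun} agents should use.`
  );
}

const postgresSpec: SketchScopeSpec = {
  promptLabel: 'PostgreSQL schemas',
  noun: 'schema', // assumed value
  nounPlural: 'schemas', // assumed value
};
const promptMessage = sketchScopePromptMessage(postgresSpec);
console.log(promptMessage);
```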
- [ ] **Step 4: Run the setup prompt test again**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-databases.test.ts -t "prompts for discovered Postgres schemas before the first scan" --testTimeout 30000
```
Expected: PASS.
- [ ] **Step 5: Commit setup prompt wording**
Run:
```bash
git add packages/cli/src/setup-databases.ts packages/cli/src/setup-databases.test.ts
git commit -m "fix(cli): rename setup schema scope prompt"
```
### Task 4: Final verification
**Files:**
- Verify: `packages/cli/src/public-ingest-copy.ts`
- Verify: `packages/cli/src/context-build-view.ts`
- Verify: `packages/cli/src/public-ingest.ts`
- Verify: `packages/cli/src/setup-databases.ts`
- [ ] **Step 1: Run targeted unified-ingest tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest-copy.test.ts src/context-build-view.test.ts src/public-ingest.test.ts src/setup-databases.test.ts --testTimeout 30000
```
Expected: PASS.
- [ ] **Step 2: Run CLI type-check**
Run:
```bash
pnpm --filter @ktx/cli run type-check
```
Expected: PASS.
- [ ] **Step 3: Scan normal public files for the closed wording gaps**
Run:
```bash
rg -n "Preparing scan|KTX scan enrichment failed|structural scan completed|schemas to scan" packages/cli/src/context-build-view.ts packages/cli/src/public-ingest.ts packages/cli/src/setup-databases.ts packages/cli/src/*.test.ts
```
Expected: no matches in the three production files. Any remaining matches in
`*.test.ts` files should be sanitizer inputs or negative assertions added by
this plan, or historical expectations in low-level scan-specific tests such as
`scan.test.ts`.
- [ ] **Step 4: Run workspace dead-code check**
Run:
```bash
pnpm run dead-code
```
Expected: PASS.
- [ ] **Step 5: Commit verification fixes if needed**
If verification passed without further changes, no additional commit is
needed. If a verification fix changed files, run:
```bash
git add packages/cli/src/public-ingest-copy.ts packages/cli/src/public-ingest-copy.test.ts packages/cli/src/context-build-view.ts packages/cli/src/context-build-view.test.ts packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts packages/cli/src/setup-databases.ts packages/cli/src/setup-databases.test.ts
git commit -m "test(cli): verify unified ingest public progress copy"
```
## Self-review
Spec coverage: this plan covers the remaining normal public output paths where
scan wording still leaks into unified ingest:
- Foreground progress now maps database scan progress into schema-context copy.
- Captured direct public failure summaries now map scan-enrichment failures into
database ingest copy.
- Interactive setup schema scope selection now says `schemas to include`, not
`schemas to scan`.
The plan intentionally leaves hidden debug commands, internal artifact paths,
developer scripts, low-level scan tests, and configuration field names alone.
Those are non-blocking under the original spec's implementation-detail
allowances.
Placeholder scan: no task uses deferred code markers, unnamed edge handling, or
undefined helper names. Every changed helper, test, and command is named with
the file that owns it.
Type consistency: the new helper exports
`publicDatabaseIngestMessage()`, `publicQueryHistoryMessage()`, and
`publicIngestOutputLine()`. Later tasks import those exact names from
`./public-ingest-copy.js`.


@ -0,0 +1,598 @@
# Unified Ingest V1 Public Plain Output Closure Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Remove the last v1-blocking adapter-centric and internal source-key leaks from normal public `ktx ingest` plain output.
**Architecture:** Keep the current connection-centric public ingest planner and hidden debug commands. Sanitize low-level ingest report labels in `ingest.ts`, and capture low-level source/query-history output in `public-ingest.ts` so public plain `ktx ingest <connectionId>` renders only the unified result table, warnings, notices, and retry guidance. JSON output and hidden debug commands may continue to expose raw `sourceKey` values for troubleshooting.
**Tech Stack:** TypeScript, Commander, Vitest, pnpm workspace scripts.
---
## Current audit
The unified ingest plan chain has implemented the main v1 behavior:
- `ktx ingest [connectionId]`, `ktx ingest --all`, `--fast`, `--deep`,
`--query-history`, `--no-query-history`, and
`--query-history-window-days` route through `public-ingest.ts`.
- Database targets run before source targets, deep readiness is target-local
for `--all`, and inferred public adapters bypass `ingest.adapters`.
- Normal command help hides `ktx scan`, `ktx ingest run`, and
`ktx ingest watch`; docs-site command references no longer publish those
as normal workflows.
- Setup stores `connections.<id>.context.depth` and
`connections.<id>.context.queryHistory`, migrates legacy `historicSql`, and
uses foreground-only context-build state.
### V1-blocking gaps
- Direct public non-TTY or `--no-input` source ingest still delegates to
`runKtxIngest()` with the real CLI IO. The lower-level reporter prints
`Adapter: <sourceKey>` and routine report details before the public result
table. For query history this can print `Adapter: historic-sql`, violating
the spec requirement that normal output use query-history wording and keep
internal adapter names out of routine output.
- `ktx ingest status` and `ktx ingest replay` plain output call the same
lower-level report formatter. Stored database reports can therefore print
`Adapter: live-database`, and stored query-history reports can print
`Adapter: historic-sql`, even though `status` and `replay` are public
report-viewing surfaces.
### Non-blocking gaps
- Hidden debug commands remain callable: `ktx scan`, `ktx ingest run`, and
`ktx ingest watch`.
- JSON output, debug output, tests, internal artifact paths, WorkUnit keys,
adapter package names, and developer scripts can continue to use
`scan`, `live-database`, and `historic-sql`.
- Public docs still use "scan" as a generic implementation noun in a few
contributor or concept pages. They do not present `ktx scan` as the normal
public command, so that wording cleanup can wait.
## File structure
- Modify `packages/cli/src/ingest.ts`: replace the plain report `Adapter:`
label with public source labels, while leaving JSON report payloads intact.
- Modify `packages/cli/src/public-ingest.ts`: capture lower-level source and
query-history plain output for direct public ingest, sanitize failure detail
lines, and render only the public summary table.
- Modify `packages/cli/src/ingest.test.ts`: update existing report label
expectations and add regressions for `live-database` and `historic-sql`
stored-report labels.
- Modify `packages/cli/src/public-ingest.test.ts`: add regressions proving
direct public source and query-history runs do not leak lower-level adapter
report output.
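
The label change in `ingest.ts` amounts to a lookup from internal source keys to public labels. Below is a minimal sketch with hypothetical helper names, using the two labels that the Task 1 tests expect; passing unknown keys through unchanged is an assumption, not something this plan specifies.

```ts
// Hypothetical helper: map internal source keys to public labels.
function sketchPublicSourceLabel(sourceKey: string): string {
  switch (sourceKey) {
    case 'live-database':
      return 'Database schema';
    case 'historic-sql':
      return 'Query history';
    default:
      // Assumption: other source keys are shown as-is.
      return sourceKey;
  }
}

// Hypothetical helper: the plain report line that replaces `Adapter: <sourceKey>`.
function sketchSourceLine(sourceKey: string): string {
  return `Source: ${sketchPublicSourceLabel(sourceKey)}\n`;
}

const databaseLine = sketchSourceLine('live-database');
const queryHistoryLine = sketchSourceLine('historic-sql');
console.log(databaseLine + queryHistoryLine);
```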
## Tasks
### Task 1: Use public source labels in stored report output
**Files:**
- Modify: `packages/cli/src/ingest.ts`
- Modify: `packages/cli/src/ingest.test.ts`
- [ ] **Step 1: Add failing stored-report label tests**
Add these tests inside the existing `describe('runKtxIngest', () => { ... })`
block in `packages/cli/src/ingest.test.ts`, near the existing
`runs local ingest and reads status` test:
```typescript
it('labels internal database reports without adapter names in plain status output', async () => {
  const projectDir = join(tempDir, 'project');
  await writeWarehouseConfig(projectDir);
  const report = localFakeBundleReport('scan-job-1', {
    id: 'report-scan-1',
    runId: 'run-scan-1',
    connectionId: 'warehouse',
    sourceKey: 'live-database',
  });
  const io = makeIo();

  await expect(
    runKtxIngest(
      {
        command: 'status',
        projectDir,
        reportFile: '/tmp/scan-report.json',
        outputMode: 'plain',
      },
      io.io,
      {
        readReportFile: vi.fn(async () => report),
      },
    ),
  ).resolves.toBe(0);

  expect(io.stdout()).toContain('Source: Database schema\n');
  expect(io.stdout()).not.toContain('Adapter:');
  expect(io.stdout()).not.toContain('live-database');
  expect(io.stderr()).toBe('');
});

it('labels internal query-history reports without adapter names in plain status output', async () => {
  const projectDir = join(tempDir, 'project');
  await writeWarehouseConfig(projectDir);
  const report = localFakeBundleReport('query-history-job-1', {
    id: 'report-query-history-1',
    runId: 'run-query-history-1',
    connectionId: 'warehouse',
    sourceKey: 'historic-sql',
  });
  const io = makeIo();

  await expect(
    runKtxIngest(
      {
        command: 'status',
        projectDir,
        reportFile: '/tmp/query-history-report.json',
        outputMode: 'plain',
      },
      io.io,
      {
        readReportFile: vi.fn(async () => report),
      },
    ),
  ).resolves.toBe(0);

  expect(io.stdout()).toContain('Source: Query history\n');
  expect(io.stdout()).not.toContain('Adapter:');
  expect(io.stdout()).not.toContain('historic-sql');
  expect(io.stderr()).toBe('');
});
```
- [ ] **Step 2: Run the failing stored-report tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/ingest.test.ts --testNamePattern "labels internal"
```
Expected: FAIL. The output still contains `Adapter: live-database` or
`Adapter: historic-sql`, and it does not contain the new public `Source:`
labels.
- [ ] **Step 3: Add public report source labels**
In `packages/cli/src/ingest.ts`, add these helpers above
`function writeReportStatus(...)`:
```typescript
const REPORT_SOURCE_LABELS = new Map<string, string>([
  ['live-database', 'Database schema'],
  ['historic-sql', 'Query history'],
  ['dbt', 'dbt'],
  ['metricflow', 'MetricFlow'],
  ['lookml', 'LookML'],
  ['looker', 'Looker'],
  ['metabase', 'Metabase'],
  ['notion', 'Notion'],
]);

function reportSourceLabel(sourceKey: string): string {
  const label = REPORT_SOURCE_LABELS.get(sourceKey);
  if (label) {
    return label;
  }
  return sourceKey
    .split(/[-_]+/)
    .filter((part) => part.length > 0)
    .map((part) => `${part[0]?.toUpperCase() ?? ''}${part.slice(1)}`)
    .join(' ');
}
```
Then replace the `Adapter:` line in `writeReportStatus()`:
```typescript
io.stdout.write(`Source: ${reportSourceLabel(report.sourceKey)}\n`);
```
The full function should keep the remaining fields unchanged:
```typescript
function writeReportStatus(report: IngestReportSnapshot, io: KtxIngestIo): void {
  const counts = savedMemoryCountsForReport(report);
  io.stdout.write(`Report: ${report.id}\n`);
  io.stdout.write(`Run: ${report.runId}\n`);
  io.stdout.write(`Job: ${report.jobId}\n`);
  io.stdout.write(`Status: ${reportStatus(report)}\n`);
  io.stdout.write(`Source: ${reportSourceLabel(report.sourceKey)}\n`);
  io.stdout.write(`Connection: ${report.connectionId}\n`);
  io.stdout.write(`Sync: ${report.body.syncId}\n`);
  io.stdout.write(
    `Diff: +${report.body.diffSummary.added}/~${report.body.diffSummary.modified}/-${report.body.diffSummary.deleted}/=${report.body.diffSummary.unchanged}\n`,
  );
  io.stdout.write(`Work units: ${report.body.workUnits.length}\n`);
  io.stdout.write(`Saved memory: ${counts.wikiCount} wiki, ${counts.slCount} SL\n`);
  io.stdout.write(`Provenance rows: ${report.body.provenanceRows.length}\n`);
}
```
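To make the fallback path of the label helper concrete, the sketch below copies the Step 3 helper with a shortened label map and runs it on a few keys. The key `my_custom-source` is a hypothetical example chosen to exercise the fallback; it is not an adapter named anywhere in this plan.

```typescript
// Standalone copy of the Step 3 helper with a shortened label map.
const REPORT_SOURCE_LABELS = new Map<string, string>([
  ['live-database', 'Database schema'],
  ['historic-sql', 'Query history'],
  ['dbt', 'dbt'],
]);

function reportSourceLabel(sourceKey: string): string {
  const label = REPORT_SOURCE_LABELS.get(sourceKey);
  if (label) {
    return label;
  }
  // Unknown keys fall back to title-cased words split on "-" and "_".
  return sourceKey
    .split(/[-_]+/)
    .filter((part) => part.length > 0)
    .map((part) => `${part[0]?.toUpperCase() ?? ''}${part.slice(1)}`)
    .join(' ');
}

// 'my_custom-source' is a hypothetical key used only to exercise the fallback.
const labels = ['live-database', 'historic-sql', 'dbt', 'my_custom-source'].map(reportSourceLabel);
console.log(labels.join(' | '));
// prints "Database schema | Query history | dbt | My Custom Source"
```

Mapped keys keep their curated casing (`dbt` stays lowercase), while unmapped keys are title-cased word by word. An empty source key produces an empty label, which may be worth a dedicated test if stored reports can carry one.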
- [ ] **Step 4: Update existing report label expectations**
In `packages/cli/src/ingest.test.ts`, update the existing assertions that
still expect the old `Adapter:` label:
```typescript
expect(statusIo.stdout()).toContain('Source: Metabase');
```
```typescript
expect(io.stdout()).toContain('Source: Query history\n');
```
```typescript
expect(io.stdout()).toContain('Source: Looker');
```
```typescript
expect(statusIo.stdout()).toContain('Source: Looker');
```
Remove the corresponding `Adapter: metabase`, `Adapter: historic-sql`, and
`Adapter: looker` expectations.
- [ ] **Step 5: Run the stored-report tests again**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/ingest.test.ts --testNamePattern "labels internal|runs public Metabase|historic-sql projection|Looker"
```
Expected: PASS. Plain report output uses `Source:` labels and does not print
`Adapter:` for the covered status and run summaries.
- [ ] **Step 6: Commit stored-report label cleanup**
Run:
```bash
git add packages/cli/src/ingest.ts packages/cli/src/ingest.test.ts
git commit -m "fix(cli): use public source labels in ingest reports"
```
### Task 2: Capture low-level output during public source ingest
**Files:**
- Modify: `packages/cli/src/public-ingest.ts`
- Modify: `packages/cli/src/public-ingest.test.ts`
- [ ] **Step 1: Add failing public source-output tests**
Add these tests to `packages/cli/src/public-ingest.test.ts` near the existing
public output tests for captured scan output and query-history retry guidance:
```typescript
it('suppresses lower-level source report output during direct public source ingest', async () => {
  const io = makeIo();
  const project = projectWithConnections({
    docs: { driver: 'notion' },
  });
  const runIngest = vi.fn(async (_args, ingestIo) => {
    ingestIo.stdout.write('Report: report-docs-1\n');
    ingestIo.stdout.write('Adapter: notion\n');
    ingestIo.stdout.write('Saved memory: 2 wiki, 0 SL\n');
    return 0;
  });

  await expect(
    runKtxPublicIngest(
      {
        command: 'run',
        projectDir: '/tmp/project',
        targetConnectionId: 'docs',
        all: false,
        json: false,
        inputMode: 'disabled',
      },
      io.io,
      { loadProject: vi.fn(async () => project), runIngest },
    ),
  ).resolves.toBe(0);

  expect(io.stdout()).toContain('Ingest finished');
  expect(io.stdout()).toContain('docs');
  expect(io.stdout()).toContain('source-ingest');
  expect(io.stdout()).not.toContain('Report: report-docs-1');
  expect(io.stdout()).not.toContain('Adapter:');
  expect(io.stdout()).not.toContain('notion\n');
  expect(io.stderr()).toBe('');
});

it('suppresses historic-sql report output during direct public query-history ingest', async () => {
  const io = makeIo();
  const project = deepReadyProject({
    warehouse: { driver: 'postgres', context: { depth: 'deep' } },
  });
  const runScan = vi.fn(async () => 0);
  const runIngest = vi.fn(async (_args, ingestIo) => {
    ingestIo.stdout.write('Report: report-query-history-1\n');
    ingestIo.stdout.write('Adapter: historic-sql\n');
    ingestIo.stdout.write('Saved memory: 1 wiki, 1 SL\n');
    return 0;
  });

  await expect(
    runKtxPublicIngest(
      {
        command: 'run',
        projectDir: '/tmp/project',
        targetConnectionId: 'warehouse',
        all: false,
        json: false,
        inputMode: 'disabled',
        queryHistory: 'enabled',
      },
      io.io,
      { loadProject: vi.fn(async () => project), runScan, runIngest },
    ),
  ).resolves.toBe(0);

  expect(io.stdout()).toContain('Schema ingest runs before query history for warehouse.');
  expect(io.stdout()).toContain('Ingest finished');
  expect(io.stdout()).toContain('warehouse');
  expect(io.stdout()).toContain('done');
  expect(io.stdout()).not.toContain('Report: report-query-history-1');
  expect(io.stdout()).not.toContain('Adapter:');
  expect(io.stdout()).not.toContain('historic-sql');
  expect(io.stderr()).toBe('');
});
```
- [ ] **Step 2: Run the failing public source-output tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts --testNamePattern "suppresses"
```
Expected: FAIL. The direct public run writes lower-level `Report:` and
`Adapter:` lines into normal public stdout.
- [ ] **Step 3: Add captured ingest output helpers**
In `packages/cli/src/public-ingest.ts`, keep the existing
`createCapturedPublicIngestIo()` helper and replace
`firstCapturedFailureLine()` with these helpers:
```typescript
const INTERNAL_STATUS_LINE_RE =
  /^(Report|Run|Job|Status|Adapter|Connection|Sync|Diff|Work units|Saved memory|Provenance rows):\s*/;

function publicIngestOutputLine(line: string): string {
  return line
    .replace(/\blive-database\b/g, 'database schema')
    .replace(/\bhistoric-sql\b/g, 'query history')
    .replace(/\bhistoric SQL\b/gi, 'query history');
}

function firstCapturedFailureLine(output: string): string | undefined {
  return output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .filter((line) => !line.startsWith('KTX scan completed'))
    .filter((line) => !INTERNAL_STATUS_LINE_RE.test(line))
    .map(publicIngestOutputLine)
    .find((line) => line.length > 0);
}
```
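As a rough check of what these helpers surface, the sketch below runs a standalone copy of them against a made-up captured output. The failure text (`permission denied for pg_stat_statements`) is invented for illustration and is not taken from real adapter output.

```typescript
// Standalone copy of the Step 3 helpers so the example runs on its own.
const INTERNAL_STATUS_LINE_RE =
  /^(Report|Run|Job|Status|Adapter|Connection|Sync|Diff|Work units|Saved memory|Provenance rows):\s*/;

function publicIngestOutputLine(line: string): string {
  return line
    .replace(/\blive-database\b/g, 'database schema')
    .replace(/\bhistoric-sql\b/g, 'query history')
    .replace(/\bhistoric SQL\b/gi, 'query history');
}

function firstCapturedFailureLine(output: string): string | undefined {
  return output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .filter((line) => !line.startsWith('KTX scan completed'))
    .filter((line) => !INTERNAL_STATUS_LINE_RE.test(line))
    .map(publicIngestOutputLine)
    .find((line) => line.length > 0);
}

// Hypothetical captured output; the error line is invented for illustration.
const captured = [
  'Report: report-query-history-1',
  'Adapter: historic-sql',
  'historic-sql pull failed: permission denied for pg_stat_statements',
  '',
].join('\n');

console.log(firstCapturedFailureLine(captured));
// prints "query history pull failed: permission denied for pg_stat_statements"
```

The status-style lines are dropped before any rewriting happens, so only the first free-form line reaches the public summary, with internal adapter names replaced. Output that consists solely of status lines yields `undefined`, meaning no failure detail is shown.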
- [ ] **Step 4: Capture query-history ingest output**
In `executePublicIngestTarget()`, replace the query-history branch with this
captured-output flow:
```typescript
if (target.queryHistory?.enabled === true) {
  const { runKtxIngest } = await import('./ingest.js');
  const runIngest = deps.runIngest ?? runKtxIngest;
  const ingestArgs: KtxIngestArgs = {
    command: 'run',
    projectDir: args.projectDir,
    connectionId: target.connectionId,
    adapter: 'historic-sql',
    outputMode: sourceIngestOutputMode(args, io),
    inputMode: args.inputMode,
    allowImplicitAdapter: true,
    historicSqlPullConfigOverride:
      target.queryHistory.pullConfig ?? {
        dialect: target.queryHistory.dialect,
        ...(target.queryHistory.windowDays !== undefined ? { windowDays: target.queryHistory.windowDays } : {}),
      },
  };
  const capturedIngestIo = deps.ingestProgress ? null : createCapturedPublicIngestIo();
  const ingestIo = capturedIngestIo ?? io;
  const qhExitCode = deps.ingestProgress
    ? await runIngest(ingestArgs, ingestIo, { progress: deps.ingestProgress })
    : await runIngest(ingestArgs, ingestIo);
  if (qhExitCode !== 0) {
    return markTargetResult(
      target,
      args,
      'failed',
      'query-history',
      capturedIngestIo ? firstCapturedFailureLine(capturedIngestIo.capturedOutput()) : undefined,
    );
  }
}
```
This keeps foreground progress working because `runContextBuild()` supplies
`deps.ingestProgress` and already passes a captured IO object into
`executePublicIngestTarget()`.
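The IO selection in this step can be pictured with a small standalone sketch. Everything named here (`SketchIo`, `createCapturingIo`, `selectIngestIo`) is a hypothetical stand-in; the real `createCapturedPublicIngestIo()` and `KtxIngestIo` may differ in shape.

```typescript
// Hypothetical minimal IO shape; the real KtxIngestIo may carry more fields.
interface SketchWritable {
  write(chunk: string): void;
}
interface SketchIo {
  stdout: SketchWritable;
  stderr: SketchWritable;
}
interface CapturedSketchIo extends SketchIo {
  capturedOutput(): string;
}

// Hypothetical stand-in for createCapturedPublicIngestIo(): buffers all writes.
function createCapturingIo(): CapturedSketchIo {
  const chunks: string[] = [];
  const sink: SketchWritable = {
    write: (chunk) => {
      chunks.push(chunk);
    },
  };
  return { stdout: sink, stderr: sink, capturedOutput: () => chunks.join('') };
}

// Mirrors the branch above: capture only when no progress sink is supplied.
function selectIngestIo(
  publicIo: SketchIo,
  hasProgress: boolean,
): { io: SketchIo; captured: CapturedSketchIo | null } {
  const captured = hasProgress ? null : createCapturingIo();
  return { io: captured ?? publicIo, captured };
}

const publicLines: string[] = [];
const publicIo: SketchIo = {
  stdout: { write: (chunk) => { publicLines.push(chunk); } },
  stderr: { write: (chunk) => { publicLines.push(chunk); } },
};

const direct = selectIngestIo(publicIo, false);
direct.io.stdout.write('Adapter: notion\n');
console.log(publicLines.length, JSON.stringify(direct.captured?.capturedOutput()));
```

In the direct (non-progress) case the lower-level line lands in the buffer and never reaches the public stream. With a progress sink the caller's IO is passed through untouched, which matches the note that the foreground path already supplies a captured IO.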
- [ ] **Step 5: Capture source ingest output**
In the source-ingest branch of `executePublicIngestTarget()`, replace the
direct `runIngest(..., io, ...)` call with this captured-output flow:
```typescript
const runIngest = deps.runIngest ?? runKtxIngest;
const capturedIngestIo = deps.ingestProgress ? null : createCapturedPublicIngestIo();
const ingestIo = capturedIngestIo ?? io;
const exitCode = deps.ingestProgress
  ? await runIngest(ingestArgs, ingestIo, { progress: deps.ingestProgress })
  : await runIngest(ingestArgs, ingestIo);
return markTargetResult(
  target,
  args,
  exitCode === 0 ? 'done' : 'failed',
  'source-ingest',
  capturedIngestIo ? firstCapturedFailureLine(capturedIngestIo.capturedOutput()) : undefined,
);
```
Keep the existing `ingestArgs` object unchanged:
```typescript
const ingestArgs: KtxIngestArgs = {
  command: 'run',
  projectDir: args.projectDir,
  connectionId: target.connectionId,
  adapter: target.adapter ?? target.driver,
  ...(target.sourceDir ? { sourceDir: target.sourceDir } : {}),
  outputMode: sourceIngestOutputMode(args, io),
  inputMode: args.inputMode,
  allowImplicitAdapter: true,
};
```
- [ ] **Step 6: Run the public source-output tests again**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts --testNamePattern "suppresses|retry guidance|foreground"
```
Expected: PASS. Direct public source and query-history runs no longer print
low-level `Report:`, `Adapter:`, `live-database`, or `historic-sql` lines in
plain stdout, while existing foreground and retry guidance tests still pass.
- [ ] **Step 7: Commit public source-output capture**
Run:
```bash
git add packages/cli/src/public-ingest.ts packages/cli/src/public-ingest.test.ts
git commit -m "fix(cli): suppress low-level public ingest output"
```
### Task 3: Final verification
**Files:**
- Verify: `packages/cli/src/ingest.ts`
- Verify: `packages/cli/src/public-ingest.ts`
- Verify: `packages/cli/src/ingest.test.ts`
- Verify: `packages/cli/src/public-ingest.test.ts`
- [ ] **Step 1: Run focused CLI tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run \
src/public-ingest.test.ts \
src/context-build-view.test.ts \
src/ingest.test.ts \
src/ingest-viz.test.ts \
src/command-tree.test.ts \
src/print-command-tree.test.ts
```
Expected: PASS. These tests cover direct public ingest, foreground context
builds, stored report rendering, visual report rendering, and hidden command
tree filtering.
- [ ] **Step 2: Run CLI type-check**
Run:
```bash
pnpm --filter @ktx/cli run type-check
```
Expected: PASS with no TypeScript errors.
- [ ] **Step 3: Verify generated command tree still hides debug commands**
Run:
```bash
pnpm --filter @ktx/cli run docs:commands >/tmp/ktx-command-tree.txt
rg "scan <connectionId>|ingest run|ingest watch" /tmp/ktx-command-tree.txt
```
Expected: the `docs:commands` command succeeds. The `rg` command exits `1`
with no matches.
- [ ] **Step 4: Search public docs and normal CLI surfaces for old public command guidance**
Run:
```bash
rg -n "ktx scan|ktx ingest run|ktx ingest watch|--enable-historic-sql|--historic-sql|historicSql|Historic SQL|live-database" \
README.md docs-site/content examples/README.md examples/local-warehouse/README.md examples/postgres-historic/README.md
```
Expected: no v1-blocking matches. Matches that refer only to internal raw
artifact paths such as `raw-sources/warehouse/historic-sql` are allowed only in
the Postgres query-history smoke README.
- [ ] **Step 5: Run dead-code checks after TypeScript changes**
Run:
```bash
pnpm run dead-code
```
Expected: PASS. If Knip reports pre-existing findings unrelated to this change,
inspect them and record them before finishing.
- [ ] **Step 6: Inspect final diff**
Run:
```bash
git status --short
git diff -- packages/cli/src/ingest.ts packages/cli/src/public-ingest.ts packages/cli/src/ingest.test.ts packages/cli/src/public-ingest.test.ts
```
Expected: only the intended TypeScript source and test files are modified.
The diff contains no generated `dist/` files and no docs changes beyond this
plan.
- [ ] **Step 7: Commit verification-only fixes if needed**
Run only if verification required small expectation or formatting fixes:
```bash
git add packages/cli/src/ingest.ts packages/cli/src/public-ingest.ts packages/cli/src/ingest.test.ts packages/cli/src/public-ingest.test.ts
git commit -m "test(cli): verify unified ingest public plain output"
```
Expected: no commit is needed when all checks pass after Tasks 1 and 2.
## Self-review
- Spec coverage: This plan closes the remaining v1-blocking normal-output
leaks for direct public source ingest, public query-history ingest, and
public stored-report status/replay output. It intentionally leaves hidden
debug commands, JSON payloads, internal artifact paths, and developer tests
untouched.
- Placeholder scan: The plan contains concrete file paths, exact test code,
exact implementation snippets, commands, and expected results.
- Type consistency: The snippets use existing local types and helpers:
`KtxIngestArgs`, `createCapturedPublicIngestIo()`,
`firstCapturedFailureLine()`, `sourceIngestOutputMode()`,
`markTargetResult()`, `localFakeBundleReport()`, and `makeIo()`.

@ -0,0 +1,326 @@
# Unified Ingest V1 Verification Copy Closure Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Close the remaining v1-blocking verification and setup-copy gaps in the unified `ktx ingest` UX.
**Architecture:** Keep the implemented connection-centric ingest planner unchanged. Fix the test-only TypeScript error that currently blocks `@ktx/cli` type-check, then replace the remaining normal setup help/output references to old "primary source" terminology with database-oriented copy.
**Tech Stack:** TypeScript ESM, Commander, Vitest, pnpm workspace scripts, uv pre-commit.
---
## Current Audit
Implemented unified-ingest plans already cover the original spec's main v1 behavior:
- `ktx ingest [connectionId]`, `ktx ingest --all`, `--fast`, `--deep`, `--query-history`, `--no-query-history`, and `--query-history-window-days` route through `packages/cli/src/public-ingest.ts`.
- Database targets are ordered before source targets, public source ingest bypasses `ingest.adapters`, and database depth maps to structural/enriched scan internals.
- Deep readiness is evaluated per target before target work starts, and `--all` isolates eligible targets from independent failures.
- Setup stores `connections.<id>.context.depth` and `connections.<id>.context.queryHistory`, migrates legacy `historicSql`, and uses foreground-only context-build state.
- Normal `ktx` and `ktx ingest` help hide `ktx scan`, `ktx ingest run`, and live `ktx ingest watch`.
- Foreground progress and normal public output sanitize scan/live-database/historic-sql internals.
### V1-Blocking Gaps
- `pnpm --filter @ktx/cli run type-check` fails:
```text
src/setup-databases.test.ts(1078,39): error TS2339: Property 'mock' does not exist on type '(options: { message: string; options: KtxSetupPromptOption<string>[]; required?: boolean | undefined; initialValues?: string[] | undefined; }) => Promise<string[]>'.
```
- Normal setup help/output still exposes the old database category as "primary source":
  - `packages/cli/src/commands/setup-commands.ts` documents `--skip-databases` as `KTX cannot work until a primary source is added`.
  - `packages/cli/src/setup-sources.ts` prints `Connect a primary source before adding context sources.`
  - `packages/cli/src/setup-context.ts` prints `No primary or context sources are configured for a KTX context build.`
### Non-Blocking Gaps
- Hidden debug commands remain callable: `ktx scan`, `ktx ingest run`, and `ktx ingest watch`.
- Internal adapter keys, artifact paths, WorkUnit keys, package names, tests, and developer-only scripts can continue to use `scan`, `live-database`, `historic-sql`, and internal `primarySource*` identifiers.
- Public docs still have a `Primary Sources` integration page and a quickstart sentence about BI metadata mapping to primary source connections. That is broader documentation information architecture cleanup, not a v1 blocker for the normal command/help/output behavior in this spec.
## File Structure
- Modify `packages/cli/src/setup-databases.test.ts`: use Vitest's typed mock helper for the existing `prompts.multiselect` assertion.
- Modify `packages/cli/src/setup-sources.ts`: change the normal missing-database message before context source setup.
- Modify `packages/cli/src/setup-sources.test.ts`: update the missing-database regression.
- Modify `packages/cli/src/setup-context.ts`: change the normal no-target context-build error.
- Modify `packages/cli/src/setup-context.test.ts`: update the no-target context-build regression.
- Modify `packages/cli/src/commands/setup-commands.ts`: change the public `--skip-databases` help copy.
- Modify `packages/cli/src/index.test.ts`: assert setup help no longer contains public "primary source" wording.
## Tasks
### Task 1: Repair Setup Database Test Type-Check
**Files:**
- Modify: `packages/cli/src/setup-databases.test.ts`
- [ ] **Step 1: Replace the untyped mock access**
In `packages/cli/src/setup-databases.test.ts`, in the test named `prompts for discovered Postgres schemas before the first scan`, replace:
```ts
expect(String(prompts.multiselect.mock.calls[0]?.[0].message)).not.toContain('to scan');
```
with:
```ts
expect(String(vi.mocked(prompts.multiselect).mock.calls[0]?.[0].message)).not.toContain('to scan');
```
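`vi.mocked` is a type helper: at runtime it returns the function it was given, and it only changes the static type so that `.mock` becomes visible to the compiler. The sketch below reproduces that idea without Vitest. `MockLike`, `fakeMock`, and `mockedSketch` are hypothetical stand-ins, not Vitest APIs.

```typescript
type AnyFn = (...args: any[]) => any;

// Hypothetical stand-in for a mock type: the function plus its recorded calls.
type MockLike<T extends AnyFn> = T & { mock: { calls: Parameters<T>[] } };

// Hypothetical stand-in for vi.fn(): records each call's arguments.
function fakeMock<T extends AnyFn>(impl: T): MockLike<T> {
  const calls: Parameters<T>[] = [];
  const wrapped = ((...args: Parameters<T>) => {
    calls.push(args);
    return impl(...args);
  }) as unknown as MockLike<T>;
  wrapped.mock = { calls };
  return wrapped;
}

// Hypothetical stand-in for vi.mocked(): a cast only, no runtime change.
function mockedSketch<T extends AnyFn>(fn: T): MockLike<T> {
  return fn as MockLike<T>;
}

// The declared property type hides `.mock`, which is what triggers TS2339.
const prompts: { multiselect: (options: { message: string }) => Promise<string[]> } = {
  multiselect: fakeMock(async (_options: { message: string }) => ['public']),
};

void prompts.multiselect({ message: 'Select schemas' });

// `prompts.multiselect.mock` would not compile; the cast makes it visible.
console.log(mockedSketch(prompts.multiselect).mock.calls[0]?.[0].message);
```

Because the helper is only a cast, the replacement in this step changes what the type checker sees and leaves the assertion's runtime behavior as it was.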
- [ ] **Step 2: Run the setup database type-check regression**
Run:
```bash
pnpm --filter @ktx/cli run type-check
```
Expected: PASS. Before the Step 1 change, this command fails with `TS2339: Property 'mock' does not exist`.
- [ ] **Step 3: Commit the type-check repair**
Run:
```bash
git add packages/cli/src/setup-databases.test.ts
git commit -m "test(cli): fix setup database test type-check"
```
### Task 2: Replace Remaining Normal Setup Primary-Source Copy
**Files:**
- Modify: `packages/cli/src/setup-sources.ts`
- Modify: `packages/cli/src/setup-sources.test.ts`
- Modify: `packages/cli/src/setup-context.ts`
- Modify: `packages/cli/src/setup-context.test.ts`
- Modify: `packages/cli/src/commands/setup-commands.ts`
- Modify: `packages/cli/src/index.test.ts`
- [ ] **Step 1: Update setup source missing-database expectations**
In `packages/cli/src/setup-sources.test.ts`, replace the test name and output expectation:
```ts
it('does not offer context sources until a primary source exists', async () => {
```
with:
```ts
it('does not offer context sources until a database exists', async () => {
```
and replace:
```ts
expect(io.stdout()).toContain('Connect a primary source before adding context sources.');
```
with:
```ts
expect(io.stdout()).toContain('Connect a database before adding context sources.');
```
- [ ] **Step 2: Update setup context no-target expectations**
In `packages/cli/src/setup-context.test.ts`, replace:
```ts
expect(io.stderr()).toContain('No primary or context sources are configured for a KTX context build.');
```
with:
```ts
expect(io.stderr()).toContain('No databases or context sources are configured for a KTX context build.');
```
- [ ] **Step 3: Add setup help regression coverage**
In `packages/cli/src/index.test.ts`, in the test named `documents setup as a bare command without subcommands`, add these assertions after the existing query-history flag assertions and before the historic-SQL assertions:
```ts
expect(testIo.stdout()).toContain('KTX cannot work until a database is added');
expect(testIo.stdout()).not.toContain('primary source');
expect(testIo.stdout()).not.toContain('primary sources');
```
- [ ] **Step 4: Run the failing setup-copy tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-sources.test.ts src/setup-context.test.ts src/index.test.ts -t "context sources until a database exists|No databases or context sources|documents setup as a bare command"
```
Expected: FAIL because implementation still prints `primary source` in setup source/context output and setup help.
- [ ] **Step 5: Update setup source output**
In `packages/cli/src/setup-sources.ts`, replace:
```ts
const message = 'Connect a primary source before adding context sources.';
```
with:
```ts
const message = 'Connect a database before adding context sources.';
```
- [ ] **Step 6: Update setup context output**
In `packages/cli/src/setup-context.ts`, replace:
```ts
io.stderr.write('No primary or context sources are configured for a KTX context build.\n');
```
with:
```ts
io.stderr.write('No databases or context sources are configured for a KTX context build.\n');
```
- [ ] **Step 7: Update public setup help output**
In `packages/cli/src/commands/setup-commands.ts`, replace:
```ts
.option('--skip-databases', 'Leave database setup incomplete; KTX cannot work until a primary source is added', false)
```
with:
```ts
.option('--skip-databases', 'Leave database setup incomplete; KTX cannot work until a database is added', false)
```
- [ ] **Step 8: Run the setup-copy tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/setup-sources.test.ts src/setup-context.test.ts src/index.test.ts -t "context sources until a database exists|No databases or context sources|documents setup as a bare command"
```
Expected: PASS.
- [ ] **Step 9: Commit the setup-copy repair**
Run:
```bash
git add packages/cli/src/setup-sources.ts packages/cli/src/setup-sources.test.ts packages/cli/src/setup-context.ts packages/cli/src/setup-context.test.ts packages/cli/src/commands/setup-commands.ts packages/cli/src/index.test.ts
git commit -m "fix(cli): remove primary-source wording from setup output"
```
### Task 3: Final V1 Verification
**Files:**
- Verify: `packages/cli/src/setup-databases.test.ts`
- Verify: `packages/cli/src/setup-sources.ts`
- Verify: `packages/cli/src/setup-sources.test.ts`
- Verify: `packages/cli/src/setup-context.ts`
- Verify: `packages/cli/src/setup-context.test.ts`
- Verify: `packages/cli/src/commands/setup-commands.ts`
- Verify: `packages/cli/src/index.test.ts`
- [ ] **Step 1: Run focused unified ingest tests**
Run:
```bash
pnpm --filter @ktx/cli exec vitest run src/public-ingest.test.ts src/context-build-view.test.ts src/setup-ready-menu.test.ts src/setup.test.ts src/setup-context.test.ts src/setup-databases.test.ts src/setup-sources.test.ts src/index.test.ts src/command-tree.test.ts
```
Expected: PASS.
- [ ] **Step 2: Run docs regression tests**
Run:
```bash
node --test scripts/examples-docs.test.mjs
```
Expected: PASS.
- [ ] **Step 3: Run CLI type-check**
Run:
```bash
pnpm --filter @ktx/cli run type-check
```
Expected: PASS.
- [ ] **Step 4: Check the normal setup public-copy surface**
Run:
```bash
rg -n "primary source|primary sources|Primary Sources|primary-source" \
packages/cli/src/commands/setup-commands.ts \
packages/cli/src/setup-sources.ts \
packages/cli/src/setup-context.ts \
packages/cli/src/index.test.ts \
packages/cli/src/setup-sources.test.ts \
packages/cli/src/setup-context.test.ts
```
Expected: no matches.
- [ ] **Step 5: Check the unified ingest public command surface**
Run:
```bash
node packages/cli/dist/bin.js ingest --help
node packages/cli/dist/bin.js --help
```
Expected: normal help lists `ktx ingest [connectionId]`, `--all`, `--fast`, `--deep`, `--query-history`, `status`, and `replay`; it does not list `ktx scan`, `ktx ingest run`, or `ktx ingest watch`.
- [ ] **Step 6: Run pre-commit on changed files**
Run:
```bash
uv run pre-commit run --files \
packages/cli/src/setup-databases.test.ts \
packages/cli/src/setup-sources.ts \
packages/cli/src/setup-sources.test.ts \
packages/cli/src/setup-context.ts \
packages/cli/src/setup-context.test.ts \
packages/cli/src/commands/setup-commands.ts \
packages/cli/src/index.test.ts
```
Expected: PASS. If pre-commit cannot run because the local hook environment or pinned tool version is unavailable, record the exact failure and keep the focused Vitest, docs, and type-check results from Steps 1-3.
- [ ] **Step 7: Commit verification formatting if needed**
If Step 6 changes files, run:
```bash
git add packages/cli/src/setup-databases.test.ts packages/cli/src/setup-sources.ts packages/cli/src/setup-sources.test.ts packages/cli/src/setup-context.ts packages/cli/src/setup-context.test.ts packages/cli/src/commands/setup-commands.ts packages/cli/src/index.test.ts
git commit -m "test(cli): verify unified ingest setup closure"
```
If Step 6 makes no changes, do not create an empty commit.
## Self-Review
- Spec coverage: This plan covers the remaining v1-blocking issues found in the audit: package type-check is currently red, and normal setup help/output still exposes the old public database category as `primary source` instead of database-oriented copy. Core ingest routing, depth behavior, query-history behavior, foreground-only state, warning aggregation, public command help, and scan/live-database/historic-sql output sanitization are already implemented by prior plans.
- Placeholder scan: The plan contains concrete file paths, exact replacement snippets, exact commands, and expected outcomes.
- Type consistency: The only test typing change uses the existing Vitest pattern already used elsewhere in `packages/cli/src/setup-databases.test.ts`: `vi.mocked(prompts.multiselect).mock.calls`.

@ -0,0 +1,593 @@
# Unified Ingest UX Design
**Date:** 2026-05-13
**Author:** Andrey Avtomonov
**Status:** Design — pending implementation plan
## Background
KTX currently exposes multiple user-facing ideas for one product action:
building context from configured connections. Database connections use
`ktx scan <connectionId>`, source connections use
`ktx ingest run --connection-id <id> --adapter <adapter>`, and setup uses a
context-build wrapper that plans database scans before source ingestion.
The implementation already points toward one concept. `ktx scan` runs a
stage-only ingest with the `live-database` adapter, then writes scan-specific
reports, schema manifests, and enrichment artifacts. `ktx setup` already
builds context from all configured connections by routing database connections
to scan internals and source connections to source-ingest internals.
The user-facing model must become simpler:
- Setup configures KTX.
- Ingest builds or refreshes context.
- Status explains readiness.
`scan`, `live-database`, and adapter selection are implementation details.
## Goals
The redesign makes `ktx ingest` the single public context-building command and
keeps the foreground experience rich, clear, and robust.
- Remove `ktx scan` as a normal external verb.
- Remove `live-database` from user-facing CLI help, output, docs, and
`ktx.yaml`.
- Treat database schema ingest as mandatory baseline behavior for database
connections.
- Keep slow AI-heavy database behavior explicit with `--deep`; keep fast,
deterministic behavior explicit with `--fast`.
- Fold query-history ingestion into database connection ingest as an optional
facet.
- Keep `ktx setup` guided. It stores defaults in `ktx.yaml` and uses the same
foreground context-build engine as `ktx ingest`.
- Remove detach, attach, watch, resume, stop, and background context-build
flows.
- Preserve a polished foreground progress view for TTY users and scriptable
output for non-TTY and JSON users.
## Non-goals
This spec does not redesign the semantic-layer YAML format, the ingest bundle
agent loop, or warehouse verification tools.
- Do not remove the internal scan implementation if it remains the cleanest
module boundary.
- Do not remove internal adapter/source keys in one large rename. User-facing
terminology changes first; internal cleanup can follow where it reduces
complexity.
- Do not make query-history ingestion mandatory.
- Do not make AI enrichment mandatory for database connections.
- Do not add `--fast` or `--deep` to top-level `ktx setup`.
- Do not preserve compatibility shims for old public `scan` or
`ingest run --adapter live-database` usage unless an implementation plan
explicitly chooses a short deprecation window.
## Public command model
`ktx ingest` becomes the direct command for building context from one
connection or all configured connections.
```bash
ktx ingest warehouse
ktx ingest warehouse --fast
ktx ingest warehouse --deep
ktx ingest warehouse --deep --query-history
ktx ingest warehouse --no-query-history
ktx ingest notion
ktx ingest --all
ktx ingest --all --deep
```
The command dispatches by connection driver:
- Database drivers run database ingest.
- Source drivers run source ingest.
- `--all` runs database ingest targets first, then source ingest targets.
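The dispatch rules above can be sketched as a small planner. This is an illustration under stated assumptions: the driver names in `DATABASE_DRIVERS`, the `PlannedTarget` shape, and `planTargets` are hypothetical and do not describe the actual planner in `public-ingest.ts`.

```typescript
// Hypothetical set of database drivers; the real registry lives in KTX itself.
const DATABASE_DRIVERS = new Set(['postgres', 'snowflake', 'bigquery']);

type TargetKind = 'database' | 'source';

interface PlannedTarget {
  connectionId: string;
  kind: TargetKind;
}

// Classify each connection by driver, then order database targets first.
function planTargets(connections: Record<string, { driver: string }>): PlannedTarget[] {
  const targets = Object.entries(connections).map(
    ([connectionId, { driver }]): PlannedTarget => ({
      connectionId,
      kind: DATABASE_DRIVERS.has(driver) ? 'database' : 'source',
    }),
  );
  // Array.prototype.sort is stable, so config order is preserved within each group.
  return targets.sort((a, b) => Number(a.kind === 'source') - Number(b.kind === 'source'));
}

const plan = planTargets({
  docs: { driver: 'notion' },
  warehouse: { driver: 'postgres' },
  bi: { driver: 'metabase' },
});
console.log(plan.map((target) => `${target.connectionId}:${target.kind}`).join(', '));
// prints "warehouse:database, docs:source, bi:source"
```

Keeping the classification separate from the ordering makes the "databases first" rule for `--all` a one-line sort rather than two code paths.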
The old `ktx ingest run --connection-id <id> --adapter <adapter>` command is
removed from the public interface. Normal users configure and ingest
connections, not adapters.
`ktx scan` is no longer a documented public command. Database schema scanning
continues as an internal phase of database ingest.
Stored report inspection is separate from live context-build control. The
public `ktx ingest` namespace has no subcommands, so `run`, `status`, `watch`,
and `replay` are ordinary connection IDs:
```bash
ktx ingest run
ktx ingest status
ktx ingest watch
ktx ingest replay
```
No setup or config validation rejects those names. Old adapter-backed command
shapes such as `ktx ingest run --connection-id warehouse --adapter
live-database` fail through normal option parsing because `--connection-id` and
`--adapter` are not public `ktx ingest` options.
## Database ingest depth
Database ingest always includes a schema baseline. The depth controls how much
extra work KTX may perform.
Depth is the public abstraction over the current scan engine:
- `fast` maps to `KtxScanMode: structural` with `detectRelationships: false`.
- `deep` maps to `KtxScanMode: enriched` and requests relationship detection.
- The internal `relationships` scan mode remains an advanced implementation
detail. It is not a separate public depth in this v1.
Deep mode includes relationship discovery when the project's
`scan.relationships.enabled` setting is true. Relationship validation thresholds
and budgets remain governed by the existing internal `scan.relationships`
configuration; users do not get a separate public relationship flag in this
surface. If `scan.relationships.enabled` is false, `--deep` still runs enriched
database ingest but relationship discovery remains disabled.
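A minimal sketch of the depth-to-scan mapping described above. The `ScanOptions` shape and the `scanOptionsForDepth` helper are hypothetical; only the mode names and the `detectRelationships` behavior come from this design.

```typescript
type Depth = "fast" | "deep";
// "relationships" stays an internal mode and is never produced by a public depth.
type KtxScanMode = "structural" | "enriched" | "relationships";

// Hypothetical option shape for illustration.
interface ScanOptions {
  mode: KtxScanMode;
  detectRelationships: boolean;
}

// relationshipsEnabled stands in for the project's scan.relationships.enabled setting.
function scanOptionsForDepth(depth: Depth, relationshipsEnabled: boolean): ScanOptions {
  if (depth === "fast") {
    return { mode: "structural", detectRelationships: false };
  }
  // Deep always runs the enriched scan; relationship discovery follows the setting.
  return { mode: "enriched", detectRelationships: relationshipsEnabled };
}
```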
### Fast
`--fast` means KTX builds deterministic schema context quickly.
- No LLM calls.
- No embeddings.
- No AI-generated descriptions.
- No expensive relationship discovery that depends on sampling, read-only SQL,
or model calls.
- Introspect tables, columns, native types, comments, declared primary keys,
and declared foreign keys when the connector can read them.
- Write or update database schema context that agents can use as grounding.
- Do not run query-history synthesis, because the current query-history path
uses ingest work units and model-backed synthesis.
This is the safe default for new database connections, CI, smoke tests, and
large unknown warehouses.
### Deep
`--deep` means KTX builds richer database context through the enriched scan path
and uses slower capabilities.
- Requires LLM, embedding, and scan-enrichment readiness before work starts.
- Generates table and column descriptions.
- Generates embeddings.
- May sample or query data through read-only connector capabilities.
- Discovers and validates relationships when relationship discovery is enabled.
- May process query history into usage patterns when query history is enabled.
Deep mode produces the most agent-ready context, but it can take longer and can
require model access, embedding access, and additional database permissions.
KTX must not silently downgrade an explicit or stored `deep` request to `fast`.
For a single database target, if the project is missing the model, embedding, or
scan-enrichment configuration required for deep ingest, KTX errors before
starting the run and tells the user to run `ktx setup` or rerun with `--fast`.
For `--all`, deep-readiness failures follow the per-target rule in
**Error handling and warnings**.
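The no-silent-downgrade rule for a single database target could look like the sketch below. `Readiness`, `missingForDeep`, and `preflightDeep` are hypothetical names, and the error wording is a placeholder rather than the final copy.

```typescript
type Depth = "fast" | "deep";

// Hypothetical readiness summary for the project configuration.
interface Readiness {
  model: boolean;
  embedding: boolean;
  scanEnrichment: boolean;
}

// Lists the capabilities that deep ingest needs but the project lacks.
function missingForDeep(r: Readiness): string[] {
  const missing: string[] = [];
  if (!r.model) missing.push("model");
  if (!r.embedding) missing.push("embedding");
  if (!r.scanEnrichment) missing.push("scan-enrichment");
  return missing;
}

// Throws before any work starts instead of downgrading deep to fast.
function preflightDeep(id: string, depth: Depth, r: Readiness): void {
  if (depth !== "deep") return; // fast needs none of these capabilities
  const missing = missingForDeep(r);
  if (missing.length > 0) {
    // Placeholder wording (assumption), pointing at the two recovery paths.
    throw new Error(
      `${id} is missing ${missing.join(", ")} configuration for deep ingest. ` +
        "Run ktx setup or rerun with --fast.",
    );
  }
}
```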
### Flag rules
`--fast` and `--deep` are mutually exclusive. Passing both is an error.
When neither flag is passed, `ktx ingest` uses the stored connection default.
If no default exists, database connections use `fast`.
If a depth flag is passed for a non-database source, KTX prints a warning and
continues:
```text
--deep affects database ingest only; ignoring it for notion.
```
For `--all`, KTX aggregates warnings instead of repeating noisy lines:
```text
--deep ignored for 2 non-database sources.
```
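The flag rules above can be sketched as two small helpers. `DepthFlags`, `resolveDepth`, and `ignoredDepthWarnings` are hypothetical names; the warning strings follow the examples in this section, and the singular form for a single source is an assumption.

```typescript
type Depth = "fast" | "deep";

interface DepthFlags {
  fast?: boolean;
  deep?: boolean;
}

// Explicit flag wins, then the stored connection default, then "fast".
function resolveDepth(flags: DepthFlags, storedDefault?: Depth): Depth {
  if (flags.fast && flags.deep) {
    throw new Error("--fast and --deep are mutually exclusive.");
  }
  if (flags.fast) return "fast";
  if (flags.deep) return "deep";
  return storedDefault ?? "fast";
}

// One line per source for a single target, one aggregated line for --all.
function ignoredDepthWarnings(
  flag: "--fast" | "--deep",
  nonDatabaseIds: string[],
  all: boolean,
): string[] {
  if (nonDatabaseIds.length === 0) return [];
  if (all) {
    const n = nonDatabaseIds.length;
    return [`${flag} ignored for ${n} non-database source${n === 1 ? "" : "s"}.`];
  }
  return nonDatabaseIds.map(
    (id) => `${flag} affects database ingest only; ignoring it for ${id}.`,
  );
}
```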
## Query history
Historic SQL becomes the database connection's query-history facet. The term
`historic-sql` remains an internal source key unless a later cleanup renames
it.
Query history is optional because it can require extra grants and can expose
sensitive SQL text. Setup asks about it only for database drivers that support
it.
```bash
ktx ingest warehouse --query-history
ktx ingest warehouse --no-query-history
ktx ingest warehouse --query-history-window-days 30
```
Query-history flags apply only to database connections that support the feature.
In v1, the supported query-history drivers are `postgres` (also written
`postgresql`), `bigquery`, and `snowflake`, which map to the existing
historic-SQL dialects `postgres`, `bigquery`, and `snowflake`. `sqlite`,
`mysql`, `clickhouse`, and `sqlserver` are database ingest targets but do not
support query history in v1.
Non-applicable query-history flags produce warnings and continue when the target
can otherwise be ingested. For a single unsupported database target,
`--query-history` or `--query-history-window-days` runs schema ingest, skips the
query-history facet, and prints a warning. For `--all`, KTX aggregates those
warnings and continues with the other eligible targets. A stored
`connections.<id>.context.queryHistory.enabled: true` on an unsupported driver
produces a config warning, and KTX skips the query-history facet for that
connection; it must not abort schema ingest for that target.
Query history uses schema context as grounding. KTX must run the database
schema facet before query-history processing in the same ingest run. If a user
explicitly enables query history for a run, the output states that schema
ingest runs first.
Because query-history synthesis is model-backed in the current architecture,
`--query-history` upgrades the effective database depth to deep for that run.
KTX prints a warning when a user combines `--fast` with `--query-history`:
```text
--query-history requires deep ingest; running warehouse with --deep.
```
Stored `connections.<id>.context.queryHistory.enabled: true` has the same
depth requirement. When no explicit depth flag is passed, stored query-history
enablement upgrades the effective database depth to `deep` for that run. When a
user explicitly passes `--fast` and does not pass `--query-history`, KTX honors
the explicit fast request, skips stored query-history processing for that run,
does not modify `ktx.yaml`, and prints a warning:
```text
warehouse has query history enabled in ktx.yaml, but --fast skips query-history processing.
```
`--query-history-window-days <n>` overrides
`connections.<id>.context.queryHistory.windowDays` only for the current run. It
must not rewrite `ktx.yaml`. The effective value flows into the same
`historicSqlUnifiedPullConfigSchema.windowDays` field used by the current
historic-SQL pull path.
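The interaction between depth flags, query-history flags, and stored config can be summarized in one resolver. This is a sketch under assumptions: `RunFlags`, `StoredContext`, `EffectiveRun`, and `resolveRun` are hypothetical names, and the driver-support check is left out so the example stays focused on depth and enablement.

```typescript
type Depth = "fast" | "deep";

interface RunFlags {
  fast?: boolean;
  deep?: boolean;
  queryHistory?: boolean; // true for --query-history, false for --no-query-history
  windowDays?: number; // --query-history-window-days
}

interface StoredContext {
  depth?: Depth;
  queryHistory?: { enabled?: boolean; windowDays?: number };
}

interface EffectiveRun {
  depth: Depth;
  queryHistory: boolean;
  windowDays: number | undefined;
  warnings: string[];
}

// Resolves the effective settings for one run; never writes ktx.yaml.
function resolveRun(id: string, flags: RunFlags, stored: StoredContext): EffectiveRun {
  if (flags.fast && flags.deep) {
    throw new Error("--fast and --deep are mutually exclusive.");
  }
  const warnings: string[] = [];
  const storedEnabled = stored.queryHistory?.enabled === true;
  let depth: Depth = flags.fast ? "fast" : flags.deep ? "deep" : stored.depth ?? "fast";
  let queryHistory = flags.queryHistory ?? storedEnabled;

  if (flags.queryHistory === true) {
    // An explicit --query-history always upgrades the run to deep.
    if (flags.fast) {
      warnings.push(`--query-history requires deep ingest; running ${id} with --deep.`);
    }
    depth = "deep";
  } else if (queryHistory && flags.fast) {
    // Stored enablement plus explicit --fast: honor fast, skip query history.
    queryHistory = false;
    warnings.push(
      `${id} has query history enabled in ktx.yaml, but --fast skips query-history processing.`,
    );
  } else if (queryHistory) {
    // Stored enablement with no explicit --fast upgrades the run to deep.
    depth = "deep";
  }

  // The flag overrides the stored window for this run only.
  const windowDays = flags.windowDays ?? stored.queryHistory?.windowDays;
  return { depth, queryHistory, windowDays, warnings };
}
```

A real implementation would also apply the supported-driver rule before this step, so unsupported databases skip the query-history facet with a warning.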
## Configuration model
User-authored `ktx.yaml` becomes connection-centric. Database schema ingest is
implied by the database connection and no longer appears as an ingest adapter.
```yaml
connections:
  warehouse:
    driver: postgres
    readonly: true
    context:
      depth: fast
      queryHistory:
        enabled: false
  notion:
    driver: notion
    context:
      enabled: true
```
Deep database defaults and query history use the same connection-local shape:
```yaml
connections:
  warehouse:
    driver: postgres
    readonly: true
    context:
      depth: deep
      queryHistory:
        enabled: true
        windowDays: 90
        minExecutions: 5
        filters:
          dropTrivialProbes: true
          serviceAccounts:
            mode: exclude
            patterns:
              - "^svc_"
        redactionPatterns: []
```
`context.queryHistory` is the canonical user-facing shape. Runtime code maps it
to the existing historic-SQL pull config as follows:
- `dialect` is derived from the database driver (`postgres` or `postgresql`,
`bigquery`, or `snowflake`) and is not normally user-authored.
- `windowDays`, `minExecutions`, and `redactionPatterns` copy through directly.
- `filters.dropTrivialProbes` defaults to `true`.
- `filters.serviceAccounts.patterns` and `filters.serviceAccounts.mode` map to
the existing service-account filter fields. The default mode is `exclude`.
- `concurrency`, `staleArchiveAfterDays`,
`filters.orchestrators.mode`, and `filters.dropFailedBelow` are advanced
query-history fields. When present, they map directly to the same fields in
`historicSqlUnifiedPullConfigSchema`. When absent, KTX uses the existing
historic-SQL schema defaults and omitted-field behavior.
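The mapping rules above can be illustrated as follows. `QueryHistoryConfig`, `PullConfigSketch`, `dialectForDriver`, and `toPullConfig` are hypothetical names, and the output shape only mirrors the fields named in this section; it is not the real `historicSqlUnifiedPullConfigSchema`. The advanced fields are omitted for brevity.

```typescript
type Dialect = "postgres" | "bigquery" | "snowflake";

// User-facing shape under connections.<id>.context.queryHistory (subset).
interface QueryHistoryConfig {
  enabled?: boolean;
  windowDays?: number;
  minExecutions?: number;
  redactionPatterns?: string[];
  filters?: {
    dropTrivialProbes?: boolean;
    serviceAccounts?: { mode?: string; patterns?: string[] };
  };
}

// Hypothetical stand-in for the existing pull config.
interface PullConfigSketch {
  dialect: Dialect;
  windowDays: number | undefined;
  minExecutions: number | undefined;
  redactionPatterns: string[] | undefined;
  filters: {
    dropTrivialProbes: boolean;
    serviceAccounts: { mode: string; patterns: string[] | undefined };
  };
}

// The dialect is derived from the driver and is not user-authored.
function dialectForDriver(driver: string): Dialect | undefined {
  switch (driver) {
    case "postgres":
    case "postgresql":
      return "postgres";
    case "bigquery":
      return "bigquery";
    case "snowflake":
      return "snowflake";
    default:
      return undefined; // no query-history support in v1
  }
}

function toPullConfig(driver: string, qh: QueryHistoryConfig): PullConfigSketch | undefined {
  const dialect = dialectForDriver(driver);
  if (dialect === undefined) return undefined;
  return {
    dialect,
    // Copied through; undefined leaves the existing schema defaults in charge.
    windowDays: qh.windowDays,
    minExecutions: qh.minExecutions,
    redactionPatterns: qh.redactionPatterns,
    filters: {
      dropTrivialProbes: qh.filters?.dropTrivialProbes ?? true,
      serviceAccounts: {
        mode: qh.filters?.serviceAccounts?.mode ?? "exclude",
        patterns: qh.filters?.serviceAccounts?.patterns,
      },
    },
  };
}
```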
Existing `connection.historicSql` blocks are legacy cutover input. Setup or the
explicit config rewrite path must migrate them into
`connection.context.queryHistory` while preserving all mapped query-history
fields, including the advanced fields listed above. `ktx ingest` must not
rewrite `ktx.yaml`; it may read legacy `historicSql` blocks for the current run
and emit a cleanup warning. If both `context.queryHistory` and `historicSql` are
present, `context.queryHistory` wins and KTX emits a config-cleanup warning
instead of running both.
Config migration must be idempotent. A setup or explicit rewrite pass that
migrates a connection removes the legacy `connection.historicSql` block after
copying preserved fields, does not regenerate normal `ingest.adapters` entries,
and produces the same `ktx.yaml` on repeated runs. If `ktx ingest` sees a legacy
block before cleanup, the warning may repeat because ingest is config-read-only.
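A minimal sketch of an idempotent migration step for one connection. `ConnectionConfig` and `migrateConnection` are hypothetical, and the sketch assumes legacy field names carry over unchanged, which a real migration would need to verify against the field mapping above.

```typescript
// Hypothetical, simplified connection shape.
interface ConnectionConfig {
  driver: string;
  historicSql?: Record<string, unknown>;
  context?: { depth?: string; queryHistory?: Record<string, unknown> };
}

// Moves a legacy historicSql block into context.queryHistory.
// Running it again on its own output changes nothing.
function migrateConnection(conn: ConnectionConfig): ConnectionConfig {
  const { historicSql, ...rest } = conn;
  if (historicSql === undefined) return conn; // already migrated
  // When both blocks exist, context.queryHistory wins and the legacy block is dropped.
  const queryHistory = rest.context?.queryHistory ?? historicSql;
  return { ...rest, context: { ...rest.context, queryHistory } };
}
```

Because the legacy block is removed in the same pass that copies it, a second pass finds nothing to migrate, which is what makes repeated runs produce the same `ktx.yaml`.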
`ingest.adapters` is no longer normal user config. Existing `ingest.adapters`
entries load as advanced/internal overrides during the transition, but
public `ktx ingest <connectionId>` must not fail solely because the
driver-to-adapter mapping chooses an adapter missing from that list. The rule
applies to database internals (`live-database` and `historic-sql`) and to all
source adapters selected from configured drivers, including `notion`, `dbt`,
`metabase`, `looker`, `metricflow`, and `lookml`.
The implementation can satisfy this by bypassing the adapter allow-list for
connection-centric public ingest, or by synthesizing the adapters required by
configured connections before dispatch. The old adapter-backed advanced command
may continue to honor `ingest.adapters` while it exists. Normal generated
`ktx.yaml` must not include `live-database`, `historic-sql`, or source adapter
entries just to make public `ktx ingest <connectionId>` work.
## Setup flow
`ktx setup` remains a guided configuration flow. It does not expose
`ktx setup --fast` or `ktx setup --deep`.
During interactive setup, KTX asks for database context depth when a database
connection is configured or when setup reaches the context-build step:
```text
How much database context should KTX build?
Fast: schema only, no AI, quickest
Deep: AI descriptions, embeddings, relationships, slower
```
The recommended selection depends on readiness:
- Recommend Fast when model, embedding, or scan-enrichment configuration is
missing.
- Recommend Deep when model, embedding, and scan-enrichment configuration are
ready.
The recommendation is based on the final configuration produced by the current
setup run, not on an earlier intermediate state. Setup can satisfy this in one
of two ways: ask the depth question only after the model, embedding, and
scan-enrichment setup paths complete, or, when those capabilities are configured
later in the same setup run, defer or repeat the depth prompt before the
foreground context build starts.
Setup stores the chosen default in `connections.<id>.context.depth`. The
foreground context build uses that stored default. Setup can still expose a
non-prominent automation flag later, such as `--context-depth fast`, if
headless setup needs it, but the main product surface is guided.
Setup readiness is depth-aware:
- For `fast`, a database context is ready when the latest non-dry-run
structural scan for the connection completed and wrote schema manifest shards.
Model, embedding, description-enrichment, and scan-enrichment checks are
skipped for fast contexts.
- For `deep`, a database context is ready only when the enriched scan completed
table descriptions, column descriptions, embeddings, and schema manifest
shards. When relationship discovery is enabled, readiness requires the
relationship stage to have completed for the latest enriched scan. A
completed relationship stage with zero accepted, review, rejected, or skipped
relationships still counts as ready; readiness must not require non-empty
relationship artifacts or accepted relationships. If relationship discovery is
disabled, the relationship stage is not part of the readiness gate.
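The depth-aware readiness rules above can be expressed as a single predicate. `ScanSummary` and `isContextReady` are hypothetical; the sketch assumes the caller passes the latest non-dry-run scan in the mode that matches the depth (structural for `fast`, enriched for `deep`).

```typescript
type Depth = "fast" | "deep";

// Hypothetical summary of the latest non-dry-run scan for the connection.
interface ScanSummary {
  completed: boolean;
  wroteManifestShards: boolean;
  tableDescriptions: boolean;
  columnDescriptions: boolean;
  embeddings: boolean;
  // True when the relationship stage finished, even with zero relationships.
  relationshipStageCompleted: boolean;
}

function isContextReady(
  depth: Depth,
  scan: ScanSummary | undefined,
  relationshipsEnabled: boolean,
): boolean {
  if (scan === undefined || !scan.completed || !scan.wroteManifestShards) return false;
  // Fast skips model, embedding, and enrichment checks entirely.
  if (depth === "fast") return true;
  if (!scan.tableDescriptions || !scan.columnDescriptions || !scan.embeddings) return false;
  // The relationship stage gates readiness only when discovery is enabled.
  return relationshipsEnabled ? scan.relationshipStageCompleted : true;
}
```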
The missing-input gate uses the same rule. Missing model, embedding, or
scan-enrichment configuration must not block a user who selected `fast`. The
same missing inputs must block `deep` before the foreground build starts, with a
message that offers `fast` as the no-AI path.
## Foreground progress UX
KTX keeps a rich foreground progress view. It removes detach and background
execution.
The shared build view groups work by user-facing source type:
```text
Building KTX context (2/4 · 1m 12s)
───────────────────────────────────
Databases
  ✓ warehouse   42 tables · 6 changed · relationships found
  ⠹ billing     reading schema · 18/64 tables
Context sources
  ✓ dbt         18 models · 42 metrics
  ○ notion      queued
Warnings
  --deep ignored for notion; it only applies to database connections.
```
The view must not show `scan` or `live-database` in normal mode. It uses:
- `Databases` instead of `Primary sources`.
- `Context sources` for docs, BI, metrics, and modeling sources.
- `reading schema` or `building schema context` instead of `scanning`.
- `query history` or `usage patterns` instead of `historic-sql`.
Non-TTY output remains append-only and scriptable. `--json` returns structured
results. Routine artifact paths and internal adapter names appear only in
`--debug` or JSON output.
## Removing detach and watch
The context build is foreground only.
- `Ctrl+C` stops the current run.
- KTX records interrupted or failed state where useful for status reporting.
- Rerunning `ktx setup` or `ktx ingest` starts a fresh foreground build or
reuses existing completed artifacts when safe.
Remove these user-facing concepts from context build:
- detach
- attach
- watch
- resume
- stop
- background context-build subprocesses
- prompts that offer "Watch progress"
- hints such as `d to detach`
Existing `running` or `detached` state from older versions must be treated as
stale or interrupted with a clear rerun instruction.
`.ktx/setup/context-build.json` remains only as a foreground status cache, not a
background control plane. New writes may use `not_started`, `running`,
`completed`, `failed`, `interrupted`, or `stale`. `running` means the current
foreground process is active; a later setup process that finds a leftover
`running` record from an older process must mark it `stale` or `interrupted`
before offering a fresh run. `detached` and `paused` are legacy-only statuses
and must be normalized to `stale` or `interrupted` on read or on the next setup
write.
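The status normalization described above might look like this. `normalizeStatus` and the `ownedByCurrentProcess` input are hypothetical, and the sketch always picks `stale`, although this design also allows `interrupted`.

```typescript
// Statuses that may appear in an existing context-build.json, including legacy ones.
type StoredStatus =
  | "not_started"
  | "running"
  | "completed"
  | "failed"
  | "interrupted"
  | "stale"
  | "detached"
  | "paused";

// Statuses that new writes may use.
type Status = Exclude<StoredStatus, "detached" | "paused">;

// ownedByCurrentProcess is an assumed signal, for example derived from a recorded
// process id; how it is computed is outside this sketch.
function normalizeStatus(status: StoredStatus, ownedByCurrentProcess: boolean): Status {
  // Legacy background states never survive a read.
  if (status === "detached" || status === "paused") return "stale";
  // A leftover "running" record from an older process is not a live build.
  if (status === "running" && !ownedByCurrentProcess) return "stale";
  return status;
}
```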
The state file must not keep user-facing `watch`, `resume`, or `stop` command
affordances after this redesign. It may retain run ids, report ids, artifact
paths, source progress, failure details, and a retry/build command when those
help status reporting.
## Internal naming and migration
User-facing surfaces must stop saying `live-database`.
This includes:
- CLI help.
- Normal command output.
- Setup prompts.
- Generated `ktx.yaml`.
- README quickstart and examples.
- Friendly errors and warnings.
Internal paths and source keys can keep `live-database` during the first
implementation if renaming them would add risk. Debug output and JSON may
include internal names when they are necessary for troubleshooting.
The implementation plan must also update stale command suggestions. For
example, setup source recovery must no longer tell users to run
`ktx ingest run --connection-id ... --adapter <adapter>`. It must suggest the
new connection-centric command:
```bash
ktx ingest <connectionId>
```
## Error handling and warnings
Warnings are non-fatal when KTX can still perform the requested ingest.
- Ignored depth flag on a non-database source: warn and continue.
- Ignored query-history flag on an unsupported database: warn and continue if
schema ingest can run.
- Both `--fast` and `--deep`: error before any work starts.
- Explicit or stored `deep` without required model, embedding, or
scan-enrichment readiness: error before any work starts for that target.
- `--query-history` without required model, embedding, or scan-enrichment
readiness: error before any work starts for that target because query history
upgrades the run to `deep`.
- Query-history requested without required grants: fail that query-history
facet and keep schema results when schema ingest succeeded.
- Database schema ingest failure: fail that database target.
`--all` isolates target failures. It runs all database targets first, then all
source targets, even when one or more database targets fail. Source targets may
therefore run against previously completed database context if the current
database refresh failed. The final exit code is non-zero when any target or
required facet fails, and the summary identifies partial failures by
connection.
For `--all`, readiness is evaluated per target after resolving each target's
effective depth and query-history settings. A database target whose effective
run requires deep readiness but lacks model, embedding, or scan-enrichment
configuration fails before work starts for that target; the other eligible
database and source targets still run. Command-level errors that make target planning
impossible, such as mutually exclusive flags, an unreadable project config, or
no eligible targets, still abort before any target work starts.
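The `--all` isolation rule can be sketched as a loop that records each target's outcome instead of stopping at the first failure. `Target`, `TargetResult`, and `runAll` are hypothetical, the runner is synchronous to keep the example short, and the exit code `1` is an assumption; this design only requires a non-zero code.

```typescript
interface Target {
  id: string;
  kind: "database" | "source";
}

interface TargetResult {
  id: string;
  ok: boolean;
  error?: string;
}

// runTarget is assumed to throw when a target or one of its required facets fails.
function runAll(
  targets: Target[],
  runTarget: (target: Target) => void,
): { results: TargetResult[]; exitCode: number } {
  // Command-level error: nothing to plan, so abort before any target work.
  if (targets.length === 0) throw new Error("No eligible targets.");

  const ordered = [
    ...targets.filter((t) => t.kind === "database"),
    ...targets.filter((t) => t.kind === "source"),
  ];
  const results: TargetResult[] = [];
  for (const target of ordered) {
    try {
      runTarget(target);
      results.push({ id: target.id, ok: true });
    } catch (err) {
      // A failed target is recorded and later targets still run.
      const error = err instanceof Error ? err.message : String(err);
      results.push({ id: target.id, ok: false, error });
    }
  }
  return { results, exitCode: results.some((r) => !r.ok) ? 1 : 0 };
}
```

The per-connection `results` list is what a final summary would use to identify partial failures.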
Failure messages focus on the connection and user action:
```text
warehouse failed: connection refused.
Retry: ktx ingest warehouse --deep
```
They do not mention internal adapter names unless debug output is enabled.
## Acceptance criteria
The implementation is complete when these conditions hold:
- `ktx ingest <connectionId>` works for database and source connections.
- `ktx ingest --all` runs database targets before source targets.
- `ktx ingest <connectionId>` does not require `ingest.adapters` entries for
any adapter chosen from the configured connection driver.
- Connection ids that collide with surviving `ktx ingest` subcommands are
rejected during setup or config validation.
- `--fast` and `--deep` control database depth and are mutually exclusive.
- `--fast` maps to structural database ingest without relationship detection.
- `--deep` maps to enriched database ingest with relationship detection when
`scan.relationships.enabled` is true.
- `--deep` and `--query-history` fail before work starts when required model,
embedding, or scan-enrichment configuration is missing.
- `ktx ingest --all` continues independent targets after partial failures and
exits non-zero when any target or required facet fails.
- `ktx ingest --all` treats deep-readiness failures as per-target failures
after target planning, rather than aborting eligible independent targets.
- `ktx setup` stores a database context depth without exposing top-level
`--fast` or `--deep`.
- `ktx setup` bases the recommended/default database context depth on the final
model, embedding, and scan-enrichment readiness reached by the setup run.
- `ktx setup` treats fast database context as ready after completed structural
schema ingest and does not require AI descriptions or embeddings for fast.
- Generated `ktx.yaml` does not include `live-database` for normal projects.
- Generated `ktx.yaml` uses `connections.<id>.context.queryHistory`, not
`connections.<id>.historicSql`, for query-history configuration.
- Normal CLI help and output do not mention `live-database`.
- Normal CLI help and output do not present `scan` as a public verb.
- Normal CLI help and output do not present `ktx ingest watch` as live context
build control.
- Query history is optional, connection-local, and overridable per ingest run.
- Query history is supported only for `postgres` or `postgresql`, `bigquery`,
and `snowflake` in v1; unsupported database drivers warn and skip the
query-history facet without blocking schema ingest.
- Stored query-history enablement upgrades default database ingest to deep, but
explicit `--fast` skips stored query history for that run with a warning.
- `--query-history-window-days` overrides the effective historic-SQL
`windowDays` pull config for the current run only and does not rewrite
`ktx.yaml`.
- Legacy `connection.historicSql` migration is idempotent, preserves all mapped
query-history fields, and is performed by setup or an explicit config rewrite,
not by `ktx ingest`.
- Context build has no detach, attach, watch, resume, stop, or background
execution path.
- `.ktx/setup/context-build.json` is retained only as foreground status cache
state; legacy `detached` or `paused` records do not trigger background
recovery branches.
- Existing setup context progress UX is consolidated with `ktx ingest` rather
than duplicated.
- Non-TTY and JSON output remain suitable for scripts.
## Open implementation questions
The implementation plan must decide these lower-level details:
- Whether old `ktx scan` exits with an error, is hidden, or remains as a
temporary undocumented debug command.
- Whether internal artifact paths keep `raw-sources/<connection>/live-database`
for the first implementation.
- Whether setup needs a headless `--context-depth fast|deep` flag for CI.