feat(cli)!: remove fast mode; ktx ingest always builds enriched context (KLO-721) (#237)

Fast mode (the ktx ingest --fast/--deep database-ingest depth toggle) is removed.
ktx ingest now always builds the full enriched ("deep") context. There is no
structural fallback: a database connection without a configured model and
embeddings fails the enrichment-readiness preflight before any work runs, with
a 'Run ktx setup to configure a model and embeddings' hint.

- Remove --fast/--deep flags, the per-connection context.depth field, and the
  ktx setup depth prompt (delete setup-database-context-depth.ts).
- Rename ingest-depth.ts -> connection-drivers.ts; ingest always requests scan
  mode 'enriched'; readiness gate (enrichmentReadinessGaps) runs for every
  database target.
- Drop the database-context-depth telemetry step (Node + Python schema mirrors
  regenerated).
- Update CLI, setup, context-build view, docs, the public ktx skill, and the
  release-smoke / artifacts scripts (now assert the no-LLM guard failure).

ktx status --fast (a separate network-probe flag) is unchanged.

Follow-ups: KLO-726 (live progress for ktx ingest --all), KLO-727 (restore
credentialed successful-ingest release smoke coverage).
Andrey Avtomonov 2026-05-29 17:41:04 +02:00 committed by GitHub
parent 637891f030
commit 3f0d11e07d
34 changed files with 222 additions and 884 deletions

@ -350,8 +350,9 @@ error messages — including the disambiguation rule for the overloaded word
`source` (semantic / primary / context / source of truth) — see `source` (semantic / primary / context / source of truth) — see
[`docs/terminology.md`](docs/terminology.md). Follow that file when choosing [`docs/terminology.md`](docs/terminology.md). Follow that file when choosing
between near-synonyms (e.g. `connector` vs `adapter`, `data agent` vs between near-synonyms (e.g. `connector` vs `adapter`, `data agent` vs
`database agent`, `fast ingest` vs `schema ingest`). Product-name rules in `database agent`, `context-source ingest` vs `source ingest`). Product-name
this section take precedence over anything in that file when they conflict. rules in this section take precedence over anything in that file when they
conflict.
### Updating `docs-site/` After Code Changes ### Updating `docs-site/` After Code Changes

@ -5,9 +5,11 @@ description: "Build or refresh ktx context, or capture text into ktx memory."
`ktx ingest` builds or refreshes **ktx** context from configured connections, and `ktx ingest` builds or refreshes **ktx** context from configured connections, and
can also capture free-form text into **ktx** memory. Database connections build can also capture free-form text into **ktx** memory. Database connections build
schema context. Context-source connections ingest metadata from tools such as enriched context — schema plus AI-generated descriptions, embeddings, and
dbt, Looker, Metabase, MetricFlow, LookML, and Notion. Pass `--text` or relationship evidence — and require a configured model and embeddings.
`--file` to capture inline text or text files into memory instead. Context-source connections ingest metadata from tools such as dbt, Looker,
Metabase, MetricFlow, LookML, and Notion. Pass `--text` or `--file` to capture
inline text or text files into memory instead.
## Command signature ## Command signature
@ -29,8 +31,6 @@ connection is selected.
| Flag | Description | Default | | Flag | Description | Default |
|------|-------------|---------| |------|-------------|---------|
| `--all` | Ingest all configured connections (same as bare invocation) | `false` | | `--all` | Ingest all configured connections (same as bare invocation) | `false` |
| `--fast` | Use deterministic fast database ingest | Stored connection default, or `fast` |
| `--deep` | Use deep database ingest with AI-generated descriptions, embeddings, and relationship evidence | Stored connection default, or `fast` |
| `--query-history` | Include database query-history usage patterns | Stored connection default | | `--query-history` | Include database query-history usage patterns | Stored connection default |
| `--no-query-history` | Skip database query-history usage patterns for this run | Stored connection default | | `--no-query-history` | Skip database query-history usage patterns for this run | Stored connection default |
| `--query-history-window-days <days>` | BigQuery/Snowflake query-history lookback window for this run | Stored connection default | | `--query-history-window-days <days>` | BigQuery/Snowflake query-history lookback window for this run | Stored connection default |
@ -44,12 +44,12 @@ connection is selected.
| `--yes` | Install required managed runtime features without prompting | `false` | | `--yes` | Install required managed runtime features without prompting | `false` |
| `--no-input` | Disable interactive terminal input | - | | `--no-input` | Disable interactive terminal input | - |
`--fast` and `--deep` are mutually exclusive. Depth flags apply only to Database ingest always builds enriched context and requires a configured model
database connections. Query-history flags apply only to database connections and embeddings (run `ktx setup`); connections without that configuration fail
before any work starts. Query-history flags apply only to database connections
that support query history. The window flag applies to BigQuery and Snowflake; that support query history. The window flag applies to BigQuery and Snowflake;
Postgres reads the current `pg_stat_statements` aggregate data instead of a Postgres reads the current `pg_stat_statements` aggregate data instead of a
time-windowed history table. Query-history ingest runs after fast ingest and time-windowed history table. Query-history ingest runs after the schema scan.
requires deep ingest readiness.
When more than one connection is selected, database ingest runs first, then When more than one connection is selected, database ingest runs first, then
context-source ingest and memory updates run for context-source connections. context-source ingest and memory updates run for context-source connections.
@ -72,14 +72,8 @@ ktx ingest
# Build one database or context-source connection # Build one database or context-source connection
ktx ingest warehouse ktx ingest warehouse
# Force deterministic fast database ingest
ktx ingest warehouse --fast
# Force deep database ingest with AI enrichment
ktx ingest warehouse --deep
# Include query-history usage patterns # Include query-history usage patterns
ktx ingest warehouse --deep --query-history ktx ingest warehouse --query-history
# Set the lookback window for BigQuery or Snowflake query history # Set the lookback window for BigQuery or Snowflake query history
ktx ingest warehouse --query-history-window-days 30 ktx ingest warehouse --query-history-window-days 30
@ -154,8 +148,8 @@ KTX_INGEST_TRACE_LEVEL=trace ktx ingest metabase
| Error | Cause | Recovery | | Error | Cause | Recovery |
|-------|-------|----------| |-------|-------|----------|
| Connection not configured | The connection id is not present in `ktx.yaml` | Add the connection with `ktx setup` or update `ktx.yaml` | | Connection not configured | The connection id is not present in `ktx.yaml` | Add the connection with `ktx setup` or update `ktx.yaml` |
| Deep readiness is missing | `--deep` or query history needs model, embedding, and scan-enrichment configuration | Run `ktx setup` or rerun with `--fast` | | Enrichment is not configured | Database ingest needs a model, embeddings, and scan-enrichment configuration | Run `ktx setup` to configure a model and embeddings |
| Query history is unsupported | The selected database driver does not support query history | Run fast ingest without query-history flags | | Query history is unsupported | The selected database driver does not support query history | Run ingest without query-history flags |
| Python runtime is missing | The selected ingest target needs runtime-backed SQL analysis or source parsing | Accept the interactive prompt, rerun with `--yes`, or run the suggested `ktx admin runtime install` command | | Python runtime is missing | The selected ingest target needs runtime-backed SQL analysis or source parsing | Accept the interactive prompt, rerun with `--yes`, or run the suggested `ktx admin runtime install` command |
| Context-source options were ignored | Depth and query-history flags were supplied for a context-source connection | Omit database-only flags when ingesting context-source connections | | Context-source options were ignored | Query-history flags were supplied for a context-source connection | Omit database-only flags when ingesting context-source connections |
| Text ingest stops early | `--fail-fast` was used and one item failed | Fix the failed item or rerun without `--fail-fast` to collect all failures | | Text ingest stops early | `--fail-fast` was used and one item failed | Fix the failed item or rerun without `--fail-fast` to collect all failures |

@ -131,8 +131,8 @@ BigQuery; and `databases` for ClickHouse.
Query history setup is supported for Postgres, BigQuery, and Snowflake. The Query history setup is supported for Postgres, BigQuery, and Snowflake. The
window flag applies to BigQuery and Snowflake; Postgres reads the current window flag applies to BigQuery and Snowflake; Postgres reads the current
`pg_stat_statements` aggregate data instead of a time-windowed history table. `pg_stat_statements` aggregate data instead of a time-windowed history table.
Enabling query history makes deep ingest readiness matter for later Later `ktx ingest` runs build enriched context and need a configured model and
`ktx ingest` runs. embeddings, including when query history is enabled.
When query history is enabled for PostgreSQL, Snowflake, or BigQuery, When query history is enabled for PostgreSQL, Snowflake, or BigQuery,
`ktx setup` runs a non-blocking readiness probe after the connection test `ktx setup` runs a non-blocking readiness probe after the connection test

@ -66,8 +66,9 @@ read, how to think, and where to put the results.
## Minimal config ## Minimal config
A working `ktx.yaml` needs one entry in `connections`. Everything else accepts A working `ktx.yaml` needs one entry in `connections`. Everything else accepts
defaults. The example below is enough for `ktx ingest warehouse` to run a fast defaults. The example below registers a local Postgres connection; building
schema scan against a local Postgres. context with `ktx ingest warehouse` also needs a model and embeddings, which
`ktx setup` configures.
```yaml ```yaml
connections: connections:
@ -123,7 +124,7 @@ context-source drivers share the map.
Warehouse connections are open objects: the listed fields are validated, and Warehouse connections are open objects: the listed fields are validated, and
any other field is preserved and passed through to the connector. Use any other field is preserved and passed through to the connector. Use
`enabled_tables` to scope deep ingest to a specific list of `enabled_tables` to scope ingest to a specific list of
`schema.table` names - useful for smoke tests. `schema.table` names - useful for smoke tests.
```yaml ```yaml

@ -236,7 +236,7 @@ Testing warehouse
Connection test passed Connection test passed
Building schema context for warehouse Building schema context for warehouse
Running fast database ingest Running database scan
``` ```
If setup exits early, rerun `ktx setup` in the same directory. **ktx** keeps If setup exits early, rerun `ktx setup` in the same directory. **ktx** keeps
@ -268,13 +268,13 @@ Agent integration ready: yes (codex:project)
For a structured check inside scripts, use `ktx status --json`. For a structured check inside scripts, use `ktx status --json`.
When setup builds deep context, its final context check looks like: When setup finishes building context, its final context check looks like:
```text ```text
ktx context is ready for agents. ktx context is ready for agents.
Databases: Databases:
warehouse: deep context complete warehouse: database context complete
Context sources: Context sources:
dbt_main: memory update complete dbt_main: memory update complete
@ -326,7 +326,7 @@ ktx setup \
Then build context: Then build context:
```bash ```bash
ktx ingest warehouse --fast ktx ingest warehouse
``` ```
See [ktx setup](/docs/cli-reference/ktx-setup) for the full automation flag See [ktx setup](/docs/cli-reference/ktx-setup) for the full automation flag

@ -24,7 +24,9 @@ external metadata can attach to known warehouse tables.
## Database ingest ## Database ingest
Database ingest records table, column, type, constraint, and row-count context. Database ingest always builds enriched context: tables, columns, types,
constraints, and row counts, plus AI-generated descriptions, embeddings, and
relationship evidence.
```bash ```bash
# Build one configured database connection # Build one configured database connection
@ -34,23 +36,8 @@ ktx ingest warehouse
ktx ingest --all ktx ingest --all
``` ```
Depth controls how much context **ktx** builds: Enriched ingest needs a configured model and embeddings. Run `ktx setup` first;
connections without that configuration fail before any work starts.
| Flag | Best for | What it does |
|------|----------|--------------|
| `--fast` | First setup, quick refreshes, CI smoke checks | Deterministic fast ingest with tables, columns, types, constraints, and row counts |
| `--deep` | Agent-ready context for real analysis | Fast ingest plus deep enrichment with descriptions, embeddings, relationship evidence, and optional query history |
Examples:
```bash
ktx ingest warehouse --fast
ktx ingest warehouse --deep
ktx ingest --all --deep
```
Deep ingest needs LLM and embedding readiness. Otherwise run `ktx setup` or use
`--fast`.
With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools for the With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools for the
current run. current run.
@ -64,7 +51,7 @@ Enable it during setup, store it under `connections.<id>.context.queryHistory`,
or request it for one run: or request it for one run:
```bash ```bash
ktx ingest warehouse --deep --query-history ktx ingest warehouse --query-history
# Set the lookback window for BigQuery or Snowflake query history # Set the lookback window for BigQuery or Snowflake query history
ktx ingest warehouse --query-history-window-days 30 ktx ingest warehouse --query-history-window-days 30
``` ```
@ -74,8 +61,8 @@ for one run.
## Relationship evidence ## Relationship evidence
**ktx** scores relationship candidates during supported deep database ingest. The **ktx** scores relationship candidates during database ingest. The public CLI
public CLI does not expose separate relationship review subcommands. does not expose separate relationship review subcommands.
## Context-source ingest ## Context-source ingest
@ -159,7 +146,7 @@ After interactive setup:
```bash ```bash
ktx status ktx status
ktx ingest --all --deep ktx ingest --all
ktx status ktx status
``` ```
@ -176,8 +163,8 @@ ktx wiki "revenue" --json --limit 10
| Symptom | Likely cause | Recovery | | Symptom | Likely cause | Recovery |
|---------|--------------|----------| |---------|--------------|----------|
| Connection not configured | The connection id is missing from `ktx.yaml` | Add it with `ktx setup` | | Connection not configured | The connection id is missing from `ktx.yaml` | Add it with `ktx setup` |
| Deep readiness is missing | LLM or embeddings are not setup-ready | Run `ktx setup`, or rerun with `--fast` | | Enrichment is not configured | LLM or embeddings are not setup-ready | Run `ktx setup` to configure a model and embeddings |
| Query history is unsupported | The selected database driver does not expose query history | Run fast ingest without query-history flags | | Query history is unsupported | The selected database driver does not expose query history | Run ingest without query-history flags |
| No connections configured | The project has no entries under `connections` | Run `ktx setup` and add a database or context-source connection | | No connections configured | The project has no entries under `connections` | Run `ktx setup` and add a database or context-source connection |
| Context-source flags have no effect | Depth and query-history flags were supplied for a context-source connector | Use those flags only for database connections | | Context-source flags have no effect | Query-history flags were supplied for a context-source connector | Use query-history flags only for database connections |
| Text ingest stops early | `--fail-fast` stopped on the first failed item | Fix the item or rerun without `--fail-fast` | | Text ingest stops early | `--fail-fast` stopped on the first failed item | Fix the item or rerun without `--fail-fast` |

@ -111,12 +111,13 @@ non-obvious terms.
Agents can refresh context when the user asks them to: Agents can refresh context when the user asks them to:
```bash ```bash
ktx ingest warehouse --fast ktx ingest warehouse
ktx ingest ktx ingest
ktx ingest --file docs/revenue-notes.md --connection-id warehouse ktx ingest --file docs/revenue-notes.md --connection-id warehouse
``` ```
Use `--deep` only when LLM and embedding setup is ready. Database ingest builds enriched context and requires a configured model and
embeddings; run `ktx setup` first if they are not ready.
## Good agent behavior ## Good agent behavior

@ -517,5 +517,5 @@ No authentication required - SQLite is file-based. The file must be readable by
| Connection URL appears in git diff | A literal credential URL was written to `ktx.yaml` | Replace it with `env:NAME` or `file:/path/to/secret` and rotate exposed credentials | | Connection URL appears in git diff | A literal credential URL was written to `ktx.yaml` | Replace it with `env:NAME` or `file:/path/to/secret` and rotate exposed credentials |
| Database ingest returns no tables | Schema, database, or project filter is wrong, or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions | | Database ingest returns no tables | Schema, database, or project filter is wrong, or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions |
| Query history is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun `ktx ingest <connectionId> --query-history` or `ktx setup` | | Query history is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun `ktx ingest <connectionId> --query-history` or `ktx setup` |
| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on fast schema context | | Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on schema-level context without column statistics |
| Semantic query execution fails | Connection is missing, unreachable, or query execution is disabled | Run `ktx connection test <id>` and check the `ktx sl query` flags | | Semantic query execution fails | Connection is missing, unreachable, or query execution is disabled | Run `ktx connection test <id>` and check the `ktx sl query` flags |

@ -77,8 +77,6 @@ maintains, validates, and serves that layer.
| Connection ref in prose | **connection id** (lowercase, two words) | "connection ID" | | Connection ref in prose | **connection id** (lowercase, two words) | "connection ID" |
| CLI arg/flag literal | `connectionId` (code font) | — | | CLI arg/flag literal | `connectionId` (code font) | — |
| File path placeholder | `<connection-id>` (code font) | — | | File path placeholder | `<connection-id>` (code font) | — |
| Fast schema mode | **fast ingest** | schema ingest, schema-only ingest |
| AI-enriched mode | **deep ingest** | AI-enriched ingest |
| Ingest of a primary connection | **database ingest** | — | | Ingest of a primary connection | **database ingest** | — |
| Ingest of a context-source connection | **context-source ingest** | bare "source ingest" | | Ingest of a context-source connection | **context-source ingest** | bare "source ingest" |
| Wiki capture | **text ingest** | — | | Wiki capture | **text ingest** | — |

@ -29,8 +29,6 @@ export function registerIngestCommands(
.usage('[options] [connectionId]') .usage('[options] [connectionId]')
.argument('[connectionId]', 'Configured connection id to ingest (omit to ingest all)') .argument('[connectionId]', 'Configured connection id to ingest (omit to ingest all)')
.option('--all', 'Ingest all configured connections', false) .option('--all', 'Ingest all configured connections', false)
.addOption(new Option('--fast', 'Use deterministic database schema ingest').conflicts('deep'))
.addOption(new Option('--deep', 'Use AI-enriched database ingest').conflicts('fast'))
.addOption(new Option('--query-history', 'Include database query-history usage patterns').conflicts('noQueryHistory')) .addOption(new Option('--query-history', 'Include database query-history usage patterns').conflicts('noQueryHistory'))
.addOption(new Option('--no-query-history', 'Skip database query-history usage patterns')) .addOption(new Option('--no-query-history', 'Skip database query-history usage patterns'))
.option('--query-history-window-days <days>', 'Query-history lookback window for this run', parsePositiveIntegerOption) .option('--query-history-window-days <days>', 'Query-history lookback window for this run', parsePositiveIntegerOption)
@ -87,8 +85,6 @@ export function registerIngestCommands(
all: selection.kind === 'all', all: selection.kind === 'all',
json: options.json === true, json: options.json === true,
inputMode: options.input === false ? 'disabled' : 'auto', inputMode: options.input === false ? 'disabled' : 'auto',
...(options.fast === true ? { depth: 'fast' as const } : {}),
...(options.deep === true ? { depth: 'deep' as const } : {}),
queryHistory, queryHistory,
...(options.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: options.queryHistoryWindowDays } : {}), ...(options.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: options.queryHistoryWindowDays } : {}),
cliVersion: context.packageInfo.version, cliVersion: context.packageInfo.version,
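
The `--query-history-window-days <days>` option above is parsed by `parsePositiveIntegerOption`, whose body is not part of this diff. A minimal sketch of what such a parser could look like, assuming it accepts only plain decimal digits and rejects zero; the real helper may differ (for example, it may throw a Commander-specific error type rather than a plain `Error`):

```typescript
// Hypothetical stand-in for parsePositiveIntegerOption; the real implementation
// is not shown in this commit, so the validation rules here are assumptions.
function parsePositiveIntegerOption(value: string): number {
  const trimmed = value.trim();
  // Accept only plain decimal digits, so inputs like "1e2", "-5", or "1.5" fail.
  if (!/^[0-9]+$/.test(trimmed)) {
    throw new Error(`Expected a positive integer, received "${value}"`);
  }
  const parsed = Number.parseInt(trimmed, 10);
  if (!Number.isSafeInteger(parsed) || parsed <= 0) {
    throw new Error(`Expected a positive integer, received "${value}"`);
  }
  return parsed;
}
```

With this sketch, `ktx ingest warehouse --query-history-window-days 30` would yield the number `30`, while `0` or `abc` would be rejected before ingest starts.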

@ -0,0 +1,21 @@
import type { KtxProjectConnectionConfig } from './context/project/config.js';
const KTX_DATABASE_DRIVER_IDS = new Set([
'sqlite',
'postgres',
'mysql',
'clickhouse',
'sqlserver',
'bigquery',
'snowflake',
]);
export function normalizeConnectionDriver(connection: KtxProjectConnectionConfig): string {
return String(connection.driver ?? '')
.trim()
.toLowerCase();
}
export function isDatabaseDriver(driver: string): boolean {
return KTX_DATABASE_DRIVER_IDS.has(driver.trim().toLowerCase());
}
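
The two helpers in this new file can be combined to separate database connections from context-source connections. The sketch below copies their logic and adds a hypothetical `partitionConnections` helper; `ConnectionConfigSketch` is a simplified stand-in for `KtxProjectConnectionConfig`, and neither name exists in the commit.

```typescript
// Simplified stand-in for KtxProjectConnectionConfig (assumption: only the
// driver field matters for classification).
interface ConnectionConfigSketch {
  driver?: unknown;
}

const DATABASE_DRIVER_IDS = new Set([
  'sqlite',
  'postgres',
  'mysql',
  'clickhouse',
  'sqlserver',
  'bigquery',
  'snowflake',
]);

function normalizeConnectionDriver(connection: ConnectionConfigSketch): string {
  return String(connection.driver ?? '')
    .trim()
    .toLowerCase();
}

function isDatabaseDriver(driver: string): boolean {
  return DATABASE_DRIVER_IDS.has(driver.trim().toLowerCase());
}

// Hypothetical helper: split connection ids into database and context-source groups.
function partitionConnections(connections: Record<string, ConnectionConfigSketch>): {
  databases: string[];
  contextSources: string[];
} {
  const databases: string[] = [];
  const contextSources: string[] = [];
  for (const [id, connection] of Object.entries(connections)) {
    const bucket = isDatabaseDriver(normalizeConnectionDriver(connection)) ? databases : contextSources;
    bucket.push(id);
  }
  return { databases, contextSources };
}
```

Because normalization trims and lowercases, a driver written as `' Postgres '` is still classified as a database, and a connection with no driver falls through to the non-database group.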

@ -88,7 +88,6 @@ export interface ContextBuildArgs {
targetConnectionId?: string; targetConnectionId?: string;
all?: boolean; all?: boolean;
entrypoint?: 'setup' | 'ingest'; entrypoint?: 'setup' | 'ingest';
depth?: Extract<KtxPublicIngestArgs, { command: 'run' }>['depth'];
queryHistory?: Extract<KtxPublicIngestArgs, { command: 'run' }>['queryHistory']; queryHistory?: Extract<KtxPublicIngestArgs, { command: 'run' }>['queryHistory'];
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
scanMode?: Extract<KtxPublicIngestArgs, { command: 'run' }>['scanMode']; scanMode?: Extract<KtxPublicIngestArgs, { command: 'run' }>['scanMode'];
@ -371,19 +370,17 @@ function retryCommand(input: {
projectDir?: string; projectDir?: string;
entrypoint?: 'setup' | 'ingest'; entrypoint?: 'setup' | 'ingest';
connectionId?: string; connectionId?: string;
depth?: 'fast' | 'deep';
queryHistory?: boolean; queryHistory?: boolean;
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
}): string { }): string {
const projectPart = input.projectDir ? ` --project-dir ${input.projectDir}` : ''; const projectPart = input.projectDir ? ` --project-dir ${input.projectDir}` : '';
if (input.entrypoint === 'ingest' && input.connectionId) { if (input.entrypoint === 'ingest' && input.connectionId) {
const depthPart = input.depth ? ` --${input.depth}` : '';
const queryHistoryPart = input.queryHistory ? ' --query-history' : ''; const queryHistoryPart = input.queryHistory ? ' --query-history' : '';
const windowPart = const windowPart =
input.queryHistory && input.queryHistoryWindowDays !== undefined input.queryHistory && input.queryHistoryWindowDays !== undefined
? ` --query-history-window-days ${input.queryHistoryWindowDays}` ? ` --query-history-window-days ${input.queryHistoryWindowDays}`
: ''; : '';
return `ktx ingest ${input.connectionId}${projectPart}${depthPart}${queryHistoryPart}${windowPart}`; return `ktx ingest ${input.connectionId}${projectPart}${queryHistoryPart}${windowPart}`;
} }
return input.projectDir ? `ktx setup --project-dir ${input.projectDir}` : 'ktx setup'; return input.projectDir ? `ktx setup --project-dir ${input.projectDir}` : 'ktx setup';
} }
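
To illustrate the effect of dropping `depth` from `retryCommand`, here is a standalone copy of the post-change function as it appears in this hunk, followed by the strings it produces. The connection id and project directory in the calls are made-up sample values.

```typescript
// Standalone copy of the post-change retryCommand from the hunk above.
function retryCommand(input: {
  projectDir?: string;
  entrypoint?: 'setup' | 'ingest';
  connectionId?: string;
  queryHistory?: boolean;
  queryHistoryWindowDays?: number;
}): string {
  const projectPart = input.projectDir ? ` --project-dir ${input.projectDir}` : '';
  if (input.entrypoint === 'ingest' && input.connectionId) {
    const queryHistoryPart = input.queryHistory ? ' --query-history' : '';
    // The window flag is only emitted when query history itself is enabled.
    const windowPart =
      input.queryHistory && input.queryHistoryWindowDays !== undefined
        ? ` --query-history-window-days ${input.queryHistoryWindowDays}`
        : '';
    return `ktx ingest ${input.connectionId}${projectPart}${queryHistoryPart}${windowPart}`;
  }
  return input.projectDir ? `ktx setup --project-dir ${input.projectDir}` : 'ktx setup';
}

// Sample inputs (hypothetical values).
const withHistory = retryCommand({
  entrypoint: 'ingest',
  connectionId: 'warehouse',
  queryHistory: true,
  queryHistoryWindowDays: 30,
});
// withHistory === 'ktx ingest warehouse --query-history --query-history-window-days 30'

const setupRetry = retryCommand({ entrypoint: 'setup', projectDir: '/tmp/demo' });
// setupRetry === 'ktx setup --project-dir /tmp/demo'
```

The suggested retry line no longer carries `--fast` or `--deep`, so it stays valid against the flag set that remains after this commit.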
@ -746,7 +743,6 @@ function appendRetryIfNeeded(input: {
projectDir: input.projectDir, projectDir: input.projectDir,
entrypoint: input.entrypoint, entrypoint: input.entrypoint,
connectionId: input.target.connectionId, connectionId: input.target.connectionId,
depth: input.target.databaseDepth,
queryHistory: input.target.queryHistory?.enabled === true, queryHistory: input.target.queryHistory?.enabled === true,
queryHistoryWindowDays: input.target.queryHistory?.windowDays, queryHistoryWindowDays: input.target.queryHistory?.windowDays,
})}`; })}`;
@ -769,7 +765,6 @@ function failureTextForTarget(input: {
projectDir: input.projectDir, projectDir: input.projectDir,
entrypoint: input.entrypoint, entrypoint: input.entrypoint,
connectionId: input.target.connectionId, connectionId: input.target.connectionId,
depth: input.target.databaseDepth,
queryHistory: input.target.queryHistory?.enabled === true, queryHistory: input.target.queryHistory?.enabled === true,
queryHistoryWindowDays: input.target.queryHistory?.windowDays, queryHistoryWindowDays: input.target.queryHistory?.windowDays,
})}`, })}`,
@ -784,7 +779,6 @@ function failureTextForTarget(input: {
projectDir: input.projectDir, projectDir: input.projectDir,
entrypoint: input.entrypoint, entrypoint: input.entrypoint,
connectionId: input.target.connectionId, connectionId: input.target.connectionId,
depth: input.target.databaseDepth,
queryHistory: input.target.queryHistory?.enabled === true, queryHistory: input.target.queryHistory?.enabled === true,
queryHistoryWindowDays: input.target.queryHistory?.windowDays, queryHistoryWindowDays: input.target.queryHistory?.windowDays,
})}`, })}`,
@ -868,7 +862,6 @@ export async function runContextBuild(
projectDir: args.projectDir, projectDir: args.projectDir,
...(args.targetConnectionId ? { targetConnectionId: args.targetConnectionId } : {}), ...(args.targetConnectionId ? { targetConnectionId: args.targetConnectionId } : {}),
all: args.all ?? true, all: args.all ?? true,
...(args.depth ? { depth: args.depth } : {}),
...(args.queryHistory ? { queryHistory: args.queryHistory } : {}), ...(args.queryHistory ? { queryHistory: args.queryHistory } : {}),
...(args.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: args.queryHistoryWindowDays } : {}), ...(args.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: args.queryHistoryWindowDays } : {}),
...(args.scanMode ? { scanMode: args.scanMode } : {}), ...(args.scanMode ? { scanMode: args.scanMode } : {}),
@ -935,7 +928,6 @@ export async function runContextBuild(
all: args.all ?? true, all: args.all ?? true,
json: false, json: false,
inputMode: args.inputMode, inputMode: args.inputMode,
...(args.depth ? { depth: args.depth } : {}),
...(args.queryHistory ? { queryHistory: args.queryHistory } : {}), ...(args.queryHistory ? { queryHistory: args.queryHistory } : {}),
...(args.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: args.queryHistoryWindowDays } : {}), ...(args.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: args.queryHistoryWindowDays } : {}),
...(args.scanMode ? { scanMode: args.scanMode } : {}), ...(args.scanMode ? { scanMode: args.scanMode } : {}),

@ -30,7 +30,7 @@ function warehouseConnectionSchema<const Driver extends WarehouseDriver>(driver:
.array(z.string().min(1)) .array(z.string().min(1))
.optional() .optional()
.describe( .describe(
'Optional allowlist of fully-qualified table names ("schema.table") to ingest. When set, live-database ingest discards any table whose schema-qualified name is not in this list. Useful for smoke-testing deep ingest on a single table.', 'Optional allowlist of fully-qualified table names ("schema.table") to ingest. When set, live-database ingest discards any table whose schema-qualified name is not in this list. Useful for smoke-testing ingest on a single table.',
), ),
}) })
.describe( .describe(

@ -1,75 +0,0 @@
import type { KtxProjectConfig, KtxProjectConnectionConfig } from './context/project/config.js';
export type KtxDatabaseContextDepth = 'fast' | 'deep';
const KTX_DATABASE_DRIVER_IDS = new Set([
'sqlite',
'postgres',
'mysql',
'clickhouse',
'sqlserver',
'bigquery',
'snowflake',
]);
export function normalizeConnectionDriver(connection: KtxProjectConnectionConfig): string {
return String(connection.driver ?? '')
.trim()
.toLowerCase();
}
export function isDatabaseDriver(driver: string): boolean {
return KTX_DATABASE_DRIVER_IDS.has(driver.trim().toLowerCase());
}
function connectionContextRecord(connection: KtxProjectConnectionConfig): Record<string, unknown> {
const context = connection.context;
return typeof context === 'object' && context !== null && !Array.isArray(context)
? (context as Record<string, unknown>)
: {};
}
export function databaseContextDepth(connection: KtxProjectConnectionConfig): KtxDatabaseContextDepth | undefined {
const depth = connectionContextRecord(connection).depth;
return depth === 'fast' || depth === 'deep' ? depth : undefined;
}
export function withDatabaseContextDepth(
connection: KtxProjectConnectionConfig,
depth: KtxDatabaseContextDepth,
): KtxProjectConnectionConfig {
return {
...connection,
context: {
...connectionContextRecord(connection),
depth,
},
};
}
export function deepReadinessGaps(config: KtxProjectConfig): string[] {
const gaps: string[] = [];
if (config.llm.provider.backend === 'none' || !config.llm.models.default) {
gaps.push('model configuration');
}
if (config.scan.enrichment.mode !== 'llm') {
gaps.push('scan enrichment mode');
}
const embeddings = config.scan.enrichment.embeddings;
if (
!embeddings ||
embeddings.backend === 'none' ||
!embeddings.model ||
embeddings.dimensions <= 0
) {
gaps.push('scan embeddings');
}
return gaps;
}
export function recommendedDatabaseContextDepth(config: KtxProjectConfig): KtxDatabaseContextDepth {
return deepReadinessGaps(config).length === 0 ? 'deep' : 'fast';
}
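
The commit message says the readiness gate (`enrichmentReadinessGaps`) now runs for every database target and fails with a `ktx setup` hint. The sketch below reuses the gap checks from the deleted `deepReadinessGaps` under the new name and wraps them in a hypothetical `assertEnrichmentReady` preflight. `ConfigSketch` is a trimmed stand-in for `KtxProjectConfig`, and the error wording other than the quoted hint is an assumption.

```typescript
// Trimmed stand-in for KtxProjectConfig: only the fields the gap check reads.
interface ConfigSketch {
  llm: { provider: { backend: string }; models: { default?: string } };
  scan: {
    enrichment: {
      mode: string;
      embeddings?: { backend: string; model?: string; dimensions: number };
    };
  };
}

// Same checks as the removed deepReadinessGaps, under the name used in the commit message.
function enrichmentReadinessGaps(config: ConfigSketch): string[] {
  const gaps: string[] = [];
  if (config.llm.provider.backend === 'none' || !config.llm.models.default) {
    gaps.push('model configuration');
  }
  if (config.scan.enrichment.mode !== 'llm') {
    gaps.push('scan enrichment mode');
  }
  const embeddings = config.scan.enrichment.embeddings;
  if (!embeddings || embeddings.backend === 'none' || !embeddings.model || embeddings.dimensions <= 0) {
    gaps.push('scan embeddings');
  }
  return gaps;
}

// Hypothetical preflight: fail before any work runs when gaps exist.
function assertEnrichmentReady(config: ConfigSketch, connectionId: string): void {
  const gaps = enrichmentReadinessGaps(config);
  if (gaps.length > 0) {
    throw new Error(
      `Cannot ingest ${connectionId}: missing ${gaps.join(', ')}. ` +
        'Run ktx setup to configure a model and embeddings.',
    );
  }
}
```

With no `fast` fallback left, a non-empty gap list can only lead to a failure, which is why the old `recommendedDatabaseContextDepth` helper has no replacement.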

@ -12,7 +12,7 @@ const DATABASE_INGEST_REPLACEMENTS: Array<[RegExp, string]> = [
'Database enrichment failed after schema context completed', 'Database enrichment failed after schema context completed',
], ],
[/\bstructural scan\b/gi, 'schema context'], [/\bstructural scan\b/gi, 'schema context'],
[/\benriched scan\b/gi, 'deep database ingest'], [/\benriched scan\b/gi, 'database ingest'],
[/\bscan results\b/gi, 'database context'], [/\bscan results\b/gi, 'database context'],
]; ];
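
A replacement table like this is typically applied by folding each pattern over the message text. The sketch below shows one way that could work, using only the three fully visible entries (the first entry of the real table is cut off in this hunk and is omitted). `rewriteDatabaseIngestText` is a hypothetical name, not a function from the commit.

```typescript
// Only the entries fully visible in the hunk; the real table has at least one more.
const REPLACEMENTS_SKETCH: Array<[RegExp, string]> = [
  [/\bstructural scan\b/gi, 'schema context'],
  [/\benriched scan\b/gi, 'database ingest'],
  [/\bscan results\b/gi, 'database context'],
];

// Hypothetical helper: apply every replacement in order to one message.
function rewriteDatabaseIngestText(text: string): string {
  return REPLACEMENTS_SKETCH.reduce(
    (current, [pattern, replacement]) => current.replace(pattern, replacement),
    text,
  );
}

const rewritten = rewriteDatabaseIngestText('Enriched scan failed; scan results were kept');
// rewritten === 'database ingest failed; database context were kept'
```

Since the patterns are case-insensitive but the replacements are fixed lowercase strings, a sentence-initial match loses its capital letter, as the sample shows.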

@ -1,16 +1,10 @@
import { getKtxCliPackageInfo } from './cli-runtime.js'; import { getKtxCliPackageInfo } from './cli-runtime.js';
import { loadKtxProject, type KtxLocalProject } from './context/project/project.js'; import { loadKtxProject, type KtxLocalProject } from './context/project/project.js';
import type { KtxProjectConnectionConfig } from './context/project/config.js'; import type { KtxProjectConfig, KtxProjectConnectionConfig } from './context/project/config.js';
import type { KtxProgressPort } from './context/scan/types.js'; import type { KtxProgressPort } from './context/scan/types.js';
import type { KtxCliIo } from './index.js'; import type { KtxCliIo } from './index.js';
import type { KtxIngestArgs, KtxIngestDeps, KtxIngestProgressUpdate } from './ingest.js'; import type { KtxIngestArgs, KtxIngestDeps, KtxIngestProgressUpdate } from './ingest.js';
import { import { isDatabaseDriver, normalizeConnectionDriver } from './connection-drivers.js';
type KtxDatabaseContextDepth,
databaseContextDepth,
deepReadinessGaps,
isDatabaseDriver,
normalizeConnectionDriver,
} from './ingest-depth.js';
import { import {
ensureManagedPythonCommandRuntime, ensureManagedPythonCommandRuntime,
type KtxManagedPythonInstallPolicy, type KtxManagedPythonInstallPolicy,
@ -29,7 +23,6 @@ profileMark('module:public-ingest');
type KtxPublicIngestStepName = 'database-schema' | 'query-history' | 'source-ingest' | 'memory-update'; type KtxPublicIngestStepName = 'database-schema' | 'query-history' | 'source-ingest' | 'memory-update';
type KtxPublicIngestStepStatus = 'done' | 'skipped' | 'failed' | 'not-run'; type KtxPublicIngestStepStatus = 'done' | 'skipped' | 'failed' | 'not-run';
type KtxPublicIngestInputMode = 'auto' | 'disabled'; type KtxPublicIngestInputMode = 'auto' | 'disabled';
type KtxPublicIngestDepth = KtxDatabaseContextDepth;
type KtxPublicIngestQueryHistoryFlag = 'default' | 'enabled' | 'disabled'; type KtxPublicIngestQueryHistoryFlag = 'default' | 'enabled' | 'disabled';
type HistoricSqlDialect = 'postgres' | 'bigquery' | 'snowflake'; type HistoricSqlDialect = 'postgres' | 'bigquery' | 'snowflake';
@ -41,7 +34,6 @@ export type KtxPublicIngestArgs =
all: boolean; all: boolean;
json: boolean; json: boolean;
inputMode: KtxPublicIngestInputMode; inputMode: KtxPublicIngestInputMode;
depth?: KtxPublicIngestDepth;
queryHistory?: KtxPublicIngestQueryHistoryFlag; queryHistory?: KtxPublicIngestQueryHistoryFlag;
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode']; scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode'];
@ -58,7 +50,6 @@ export interface KtxPublicIngestPlanTarget {
sourceDir?: string; sourceDir?: string;
debugCommand: string; debugCommand: string;
steps: KtxPublicIngestStepName[]; steps: KtxPublicIngestStepName[];
databaseDepth?: KtxPublicIngestDepth;
detectRelationships?: boolean; detectRelationships?: boolean;
preflightFailure?: string; preflightFailure?: string;
queryHistory?: { queryHistory?: {
@ -67,7 +58,6 @@ export interface KtxPublicIngestPlanTarget {
windowDays?: number; windowDays?: number;
pullConfig?: Record<string, unknown>; pullConfig?: Record<string, unknown>;
unsupported?: boolean; unsupported?: boolean;
skippedStoredByFast?: boolean;
}; };
} }
@ -121,7 +111,6 @@ interface KtxPublicContextBuildArgs {
inputMode: 'auto' | 'disabled'; inputMode: 'auto' | 'disabled';
targetConnectionId?: string; targetConnectionId?: string;
all?: boolean; all?: boolean;
depth?: KtxPublicIngestDepth;
queryHistory?: KtxPublicIngestQueryHistoryFlag; queryHistory?: KtxPublicIngestQueryHistoryFlag;
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode']; scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode'];
@ -154,7 +143,6 @@ interface KtxUnsupportedQueryHistoryWarning {
interface KtxPublicIngestWarningAccumulator { interface KtxPublicIngestWarningAccumulator {
warnings: string[]; warnings: string[];
ignoredDepthForSources: string[];
ignoredQueryHistoryForSources: string[]; ignoredQueryHistoryForSources: string[];
unsupportedQueryHistoryForDatabases: KtxUnsupportedQueryHistoryWarning[]; unsupportedQueryHistoryForDatabases: KtxUnsupportedQueryHistoryWarning[];
} }
@ -162,7 +150,6 @@ interface KtxPublicIngestWarningAccumulator {
function createWarningAccumulator(): KtxPublicIngestWarningAccumulator { function createWarningAccumulator(): KtxPublicIngestWarningAccumulator {
return { return {
warnings: [], warnings: [],
ignoredDepthForSources: [],
ignoredQueryHistoryForSources: [], ignoredQueryHistoryForSources: [],
unsupportedQueryHistoryForDatabases: [], unsupportedQueryHistoryForDatabases: [],
}; };
@ -233,7 +220,6 @@ function finalizeWarnings(
accumulator: KtxPublicIngestWarningAccumulator, accumulator: KtxPublicIngestWarningAccumulator,
args: { args: {
all: boolean; all: boolean;
depth?: KtxPublicIngestDepth;
queryHistory?: KtxPublicIngestQueryHistoryFlag; queryHistory?: KtxPublicIngestQueryHistoryFlag;
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
}, },
@ -242,11 +228,6 @@ function finalizeWarnings(
...accumulator.warnings, ...accumulator.warnings,
...unsupportedQueryHistoryWarnings(accumulator.unsupportedQueryHistoryForDatabases, args.all), ...unsupportedQueryHistoryWarnings(accumulator.unsupportedQueryHistoryForDatabases, args.all),
]; ];
const depthOption = args.depth ? `--${args.depth}` : null;
if (depthOption) {
const warning = sourceIgnoredWarning(depthOption, accumulator.ignoredDepthForSources, args.all);
if (warning) warnings.push(warning);
}
if (args.queryHistory === 'enabled' || args.queryHistoryWindowDays !== undefined) { if (args.queryHistory === 'enabled' || args.queryHistoryWindowDays !== undefined) {
const warning = sourceIgnoredWarning('--query-history', accumulator.ignoredQueryHistoryForSources, args.all); const warning = sourceIgnoredWarning('--query-history', accumulator.ignoredQueryHistoryForSources, args.all);
if (warning) warnings.push(warning); if (warning) warnings.push(warning);
@ -317,13 +298,12 @@ function resolveDatabaseTargetOptions(input: {
driver: string; driver: string;
connection: KtxProjectConnectionConfig; connection: KtxProjectConnectionConfig;
args: { args: {
depth?: KtxPublicIngestDepth;
queryHistory?: KtxPublicIngestQueryHistoryFlag; queryHistory?: KtxPublicIngestQueryHistoryFlag;
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode']; scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode'];
}; };
warnings: KtxPublicIngestWarningAccumulator; warnings: KtxPublicIngestWarningAccumulator;
}): Pick<KtxPublicIngestPlanTarget, 'databaseDepth' | 'queryHistory' | 'steps'> { }): Pick<KtxPublicIngestPlanTarget, 'queryHistory' | 'steps'> {
const storedQh = storedQueryHistory(input.connection); const storedQh = storedQueryHistory(input.connection);
const dialect = queryHistoryDialectByDriver.get(input.driver); const dialect = queryHistoryDialectByDriver.get(input.driver);
const explicitQueryHistory = input.args.queryHistory ?? 'default'; const explicitQueryHistory = input.args.queryHistory ?? 'default';
@ -332,7 +312,6 @@ function resolveDatabaseTargetOptions(input: {
const requestedQh = const requestedQh =
explicitQueryHistory === 'enabled' || explicitQueryHistory === 'enabled' ||
(explicitQueryHistory !== 'disabled' && (windowOverrideRequested || storedEnabled)); (explicitQueryHistory !== 'disabled' && (windowOverrideRequested || storedEnabled));
let depth = input.args.depth ?? databaseContextDepth(input.connection) ?? 'fast';
const queryHistory = { const queryHistory = {
enabled: false, enabled: false,
...(input.args.queryHistoryWindowDays !== undefined ...(input.args.queryHistoryWindowDays !== undefined
@ -350,19 +329,13 @@ function resolveDatabaseTargetOptions(input: {
explicitQueryHistory === 'enabled' || input.args.queryHistoryWindowDays !== undefined ? 'explicit' : 'stored', explicitQueryHistory === 'enabled' || input.args.queryHistoryWindowDays !== undefined ? 'explicit' : 'stored',
}); });
return { return {
databaseDepth: depth,
queryHistory: { ...queryHistory, unsupported: true }, queryHistory: { ...queryHistory, unsupported: true },
steps: ['database-schema'], steps: ['database-schema'],
}; };
} }
if (requestedQh && dialect) { if (requestedQh && dialect) {
if (depth === 'fast') {
input.warnings.warnings.push(`--query-history requires deep ingest; running ${input.connectionId} with --deep.`);
}
depth = 'deep';
return { return {
databaseDepth: depth,
queryHistory: { queryHistory: {
...queryHistory, ...queryHistory,
enabled: true, enabled: true,
@ -378,30 +351,35 @@ function resolveDatabaseTargetOptions(input: {
}; };
} }
if (input.args.depth === 'fast' && explicitQueryHistory !== 'enabled' && storedEnabled) {
input.warnings.warnings.push(
`${input.connectionId} has query history enabled in ktx.yaml, but --fast skips query-history processing.`,
);
return {
databaseDepth: 'fast',
queryHistory: { ...queryHistory, skippedStoredByFast: true },
steps: ['database-schema'],
};
}
return { return {
databaseDepth: depth,
queryHistory, queryHistory,
steps: ['database-schema'], steps: ['database-schema'],
}; };
} }
function enrichmentReadinessGaps(config: KtxProjectConfig): string[] {
const gaps: string[] = [];
if (config.llm.provider.backend === 'none' || !config.llm.models.default) {
gaps.push('model configuration');
}
if (config.scan.enrichment.mode !== 'llm') {
gaps.push('scan enrichment mode');
}
const embeddings = config.scan.enrichment.embeddings;
if (!embeddings || embeddings.backend === 'none' || !embeddings.model || embeddings.dimensions <= 0) {
gaps.push('scan embeddings');
}
return gaps;
}
function targetForConnection( function targetForConnection(
connectionId: string, connectionId: string,
connection: KtxProjectConnectionConfig, connection: KtxProjectConnectionConfig,
projectConfig: KtxPublicIngestProject['config'], projectConfig: KtxPublicIngestProject['config'],
args: { args: {
depth?: KtxPublicIngestDepth;
queryHistory?: KtxPublicIngestQueryHistoryFlag; queryHistory?: KtxPublicIngestQueryHistoryFlag;
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode']; scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode'];
@ -412,9 +390,6 @@ function targetForConnection(
const adapter = sourceAdapterByDriver.get(driver); const adapter = sourceAdapterByDriver.get(driver);
const sourceDir = sourceDirForConnection(connection); const sourceDir = sourceDirForConnection(connection);
if (adapter) { if (adapter) {
if (args.depth) {
warnings.ignoredDepthForSources.push(connectionId);
}
if (args.queryHistory === 'enabled' || args.queryHistoryWindowDays !== undefined) { if (args.queryHistory === 'enabled' || args.queryHistoryWindowDays !== undefined) {
warnings.ignoredQueryHistoryForSources.push(connectionId); warnings.ignoredQueryHistoryForSources.push(connectionId);
} }
@ -431,18 +406,18 @@ function targetForConnection(
if (isDatabaseDriver(driver)) { if (isDatabaseDriver(driver)) {
const options = resolveDatabaseTargetOptions({ connectionId, driver, connection, args, warnings }); const options = resolveDatabaseTargetOptions({ connectionId, driver, connection, args, warnings });
const gaps = options.databaseDepth === 'deep' ? deepReadinessGaps(projectConfig) : []; const gaps = enrichmentReadinessGaps(projectConfig);
return { return {
connectionId, connectionId,
driver, driver,
operation: 'database-ingest', operation: 'database-ingest',
debugCommand: `ktx ingest ${connectionId} --debug`, debugCommand: `ktx ingest ${connectionId} --debug`,
detectRelationships: options.databaseDepth === 'deep' && projectConfig.scan.relationships.enabled, detectRelationships: projectConfig.scan.relationships.enabled,
...(gaps.length > 0 ...(gaps.length > 0
? { ? {
preflightFailure: `${connectionId} requires deep ingest readiness: ${gaps.join( preflightFailure: `${connectionId} cannot be ingested: enrichment is not configured (${gaps.join(
', ', ', ',
)}. Run ktx setup or rerun with --fast.`, )}). Run ktx setup to configure a model and embeddings.`,
} }
: {}), : {}),
...options, ...options,
@ -458,7 +433,6 @@ export function buildPublicIngestPlan(
projectDir: string; projectDir: string;
targetConnectionId?: string; targetConnectionId?: string;
all: boolean; all: boolean;
depth?: KtxPublicIngestDepth;
queryHistory?: KtxPublicIngestQueryHistoryFlag; queryHistory?: KtxPublicIngestQueryHistoryFlag;
queryHistoryWindowDays?: number; queryHistoryWindowDays?: number;
scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode']; scanMode?: Extract<KtxScanArgs, { command: 'run' }>['mode'];
@ -522,13 +496,12 @@ function retryCommandForTarget(
args: Extract<KtxPublicIngestArgs, { command: 'run' }>, args: Extract<KtxPublicIngestArgs, { command: 'run' }>,
): string { ): string {
const projectPart = ` --project-dir ${args.projectDir}`; const projectPart = ` --project-dir ${args.projectDir}`;
const depthPart = target.databaseDepth ? ` --${target.databaseDepth}` : '';
const queryHistoryPart = target.queryHistory?.enabled === true ? ' --query-history' : ''; const queryHistoryPart = target.queryHistory?.enabled === true ? ' --query-history' : '';
const windowPart = const windowPart =
target.queryHistory?.enabled === true && target.queryHistory.windowDays !== undefined target.queryHistory?.enabled === true && target.queryHistory.windowDays !== undefined
? ` --query-history-window-days ${target.queryHistory.windowDays}` ? ` --query-history-window-days ${target.queryHistory.windowDays}`
: ''; : '';
return `ktx ingest ${target.connectionId}${projectPart}${depthPart}${queryHistoryPart}${windowPart}`; return `ktx ingest ${target.connectionId}${projectPart}${queryHistoryPart}${windowPart}`;
} }
function trimTrailingPeriod(value: string): string { function trimTrailingPeriod(value: string): string {
@ -830,7 +803,7 @@ export async function executePublicIngestTarget(
command: 'run', command: 'run',
projectDir: args.projectDir, projectDir: args.projectDir,
connectionId: target.connectionId, connectionId: target.connectionId,
mode: target.databaseDepth === 'deep' ? 'enriched' : 'structural', mode: 'enriched',
detectRelationships: target.detectRelationships === true, detectRelationships: target.detectRelationships === true,
dryRun: false, dryRun: false,
...(args.cliVersion ? { cliVersion: args.cliVersion } : {}), ...(args.cliVersion ? { cliVersion: args.cliVersion } : {}),
@ -979,7 +952,6 @@ export async function runKtxPublicIngest(
all: args.all, all: args.all,
entrypoint: 'ingest', entrypoint: 'ingest',
inputMode: args.inputMode, inputMode: args.inputMode,
...(args.depth ? { depth: args.depth } : {}),
...(args.queryHistory ? { queryHistory: args.queryHistory } : {}), ...(args.queryHistory ? { queryHistory: args.queryHistory } : {}),
...(args.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: args.queryHistoryWindowDays } : {}), ...(args.queryHistoryWindowDays !== undefined ? { queryHistoryWindowDays: args.queryHistoryWindowDays } : {}),
...(args.scanMode ? { scanMode: args.scanMode } : {}), ...(args.scanMode ? { scanMode: args.scanMode } : {}),
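
Reviewer note: the preflight introduced in this file can be exercised in isolation. The sketch below mirrors `enrichmentReadinessGaps` and the new `preflightFailure` message against a simplified config shape. `SketchConfig`, `readinessGaps`, and `preflightFailureFor` are hypothetical names, and the real `KtxProjectConfig` has more fields than shown here.

```typescript
// Hypothetical, trimmed-down stand-in for KtxProjectConfig (assumption for illustration).
interface SketchConfig {
  llm: { provider: { backend: string }; models: { default?: string } };
  scan: {
    enrichment: {
      mode: string;
      embeddings?: { backend: string; model?: string; dimensions: number };
    };
  };
}

// Same three checks as enrichmentReadinessGaps in the diff above.
function readinessGaps(config: SketchConfig): string[] {
  const gaps: string[] = [];
  if (config.llm.provider.backend === 'none' || !config.llm.models.default) {
    gaps.push('model configuration');
  }
  if (config.scan.enrichment.mode !== 'llm') {
    gaps.push('scan enrichment mode');
  }
  const embeddings = config.scan.enrichment.embeddings;
  if (!embeddings || embeddings.backend === 'none' || !embeddings.model || embeddings.dimensions <= 0) {
    gaps.push('scan embeddings');
  }
  return gaps;
}

// Builds the preflight message for a database target, or undefined when enrichment is ready.
function preflightFailureFor(connectionId: string, config: SketchConfig): string | undefined {
  const gaps = readinessGaps(config);
  if (gaps.length === 0) return undefined;
  return `${connectionId} cannot be ingested: enrichment is not configured (${gaps.join(
    ', ',
  )}). Run ktx setup to configure a model and embeddings.`;
}
```

With no model, no `llm` enrichment mode, and no embeddings, all three gaps are reported in one message, so the user sees every missing piece before any ingest work starts.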

@ -7,12 +7,7 @@ import { serializeKtxProjectConfig } from './context/project/config.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { errorMessage, writePrefixedLines } from './clack.js'; import { errorMessage, writePrefixedLines } from './clack.js';
import { buildPublicIngestPlan } from './public-ingest.js'; import { buildPublicIngestPlan } from './public-ingest.js';
import {
type KtxDatabaseContextDepth,
databaseContextDepth,
} from './ingest-depth.js';
import type { KtxManagedPythonInstallPolicy } from './managed-python-command.js'; import type { KtxManagedPythonInstallPolicy } from './managed-python-command.js';
import { ensureSetupDatabaseContextDepths } from './setup-database-context-depth.js';
import { import {
type ContextBuildSourceProgressUpdate, type ContextBuildSourceProgressUpdate,
runContextBuild, runContextBuild,
@ -353,16 +348,6 @@ async function readLatestScanReport(projectDir: string, connectionId: string): P
return reports.at(-1)?.report ?? null; return reports.at(-1)?.report ?? null;
} }
function scanReportHasSchemaManifest(report: unknown, connectionId: string): boolean {
if (!isRecord(report)) {
return false;
}
if (report.connectionId !== connectionId || report.dryRun === true) {
return false;
}
return stringArrayValue(isRecord(report.artifactPaths) ? report.artifactPaths.manifestShards : undefined).length > 0;
}
function scanReportHasCompletedDeepEnrichment( function scanReportHasCompletedDeepEnrichment(
report: unknown, report: unknown,
connectionId: string, connectionId: string,
@ -389,18 +374,6 @@ function scanReportHasCompletedDeepEnrichment(
); );
} }
function scanReportSatisfiesDepth(input: {
report: unknown;
connectionId: string;
depth: KtxDatabaseContextDepth;
relationshipsRequired: boolean;
}): boolean {
if (input.depth === 'fast') {
return scanReportHasSchemaManifest(input.report, input.connectionId);
}
return scanReportHasCompletedDeepEnrichment(input.report, input.connectionId, input.relationshipsRequired);
}
async function verifyPrimarySourceScans( async function verifyPrimarySourceScans(
project: KtxLocalProject, project: KtxLocalProject,
connectionIds: string[], connectionIds: string[],
@ -408,15 +381,9 @@ async function verifyPrimarySourceScans(
const details: string[] = []; const details: string[] = [];
const relationshipsRequired = project.config.scan.relationships.enabled; const relationshipsRequired = project.config.scan.relationships.enabled;
for (const connectionId of connectionIds) { for (const connectionId of connectionIds) {
const connection = project.config.connections[connectionId];
const depth = connection ? (databaseContextDepth(connection) ?? 'fast') : 'fast';
const report = await readLatestScanReport(project.projectDir, connectionId); const report = await readLatestScanReport(project.projectDir, connectionId);
if (!scanReportSatisfiesDepth({ report, connectionId, depth, relationshipsRequired })) { if (!scanReportHasCompletedDeepEnrichment(report, connectionId, relationshipsRequired)) {
details.push( details.push(`${connectionId}: database context has not completed.`);
depth === 'fast'
? `${connectionId}: schema context has not completed.`
: `${connectionId}: deep database context has not completed.`,
);
} }
} }
return { ready: details.length === 0, details }; return { ready: details.length === 0, details };
@ -482,7 +449,6 @@ function writeSkippedContext(projectDir: string, io: KtxCliIo): void {
} }
function writeSuccess( function writeSuccess(
project: KtxLocalProject,
readiness: KtxSetupContextReadiness, readiness: KtxSetupContextReadiness,
targets: KtxSetupContextTargets, targets: KtxSetupContextTargets,
io: KtxCliIo, io: KtxCliIo,
@ -493,9 +459,7 @@ function writeSuccess(
io.stdout.write(' none\n'); io.stdout.write(' none\n');
} else { } else {
for (const connectionId of targets.primarySourceConnectionIds) { for (const connectionId of targets.primarySourceConnectionIds) {
const connection = project.config.connections[connectionId]; io.stdout.write(` ${connectionId}: database context complete\n`);
const depth = connection ? (databaseContextDepth(connection) ?? 'fast') : 'fast';
io.stdout.write(` ${connectionId}: ${depth === 'deep' ? 'deep context complete' : 'schema context complete'}\n`);
} }
} }
io.stdout.write('\nContext sources:\n'); io.stdout.write('\nContext sources:\n');
@ -636,7 +600,7 @@ async function runBuild(
failureReason: undefined, failureReason: undefined,
...(lastSourceProgress ? { sourceProgress: lastSourceProgress } : {}), ...(lastSourceProgress ? { sourceProgress: lastSourceProgress } : {}),
}); });
writeSuccess(project, readiness, targets, io); writeSuccess(readiness, targets, io);
return { status: 'ready', projectDir: args.projectDir, runId }; return { status: 'ready', projectDir: args.projectDir, runId };
} }
@ -678,17 +642,8 @@ export async function runKtxSetupContextStep(
deps: KtxSetupContextDeps = {}, deps: KtxSetupContextDeps = {},
): Promise<KtxSetupContextResult> { ): Promise<KtxSetupContextResult> {
try { try {
let project = await loadKtxProject({ projectDir: args.projectDir }); const project = await loadKtxProject({ projectDir: args.projectDir });
const prompts = deps.prompts ?? createPromptAdapter(); const prompts = deps.prompts ?? createPromptAdapter();
const depthProject = await ensureSetupDatabaseContextDepths({
project,
args,
prompts,
});
if (depthProject === 'back') {
return { status: 'back', projectDir: args.projectDir };
}
project = depthProject;
const existingState = await readKtxSetupContextState(args.projectDir); const existingState = await readKtxSetupContextState(args.projectDir);
const completedSteps = (await readKtxSetupState(args.projectDir)).completed_steps; const completedSteps = (await readKtxSetupState(args.projectDir)).completed_steps;
if (completedSteps.includes('context') && existingState.status === 'completed') { if (completedSteps.includes('context') && existingState.status === 'completed') {

@ -1,131 +0,0 @@
import { writeFile } from 'node:fs/promises';
import { type KtxLocalProject, loadKtxProject } from './context/project/project.js';
import { type KtxProjectConnectionConfig, serializeKtxProjectConfig } from './context/project/config.js';
import {
type KtxDatabaseContextDepth,
databaseContextDepth,
deepReadinessGaps,
isDatabaseDriver,
normalizeConnectionDriver,
recommendedDatabaseContextDepth,
withDatabaseContextDepth,
} from './ingest-depth.js';
import type { KtxSetupPromptOption } from './setup-prompts.js';
export interface KtxSetupDatabaseContextDepthArgs {
inputMode: 'auto' | 'disabled';
}
export interface KtxSetupDatabaseContextDepthPromptAdapter {
select(options: { message: string; options: KtxSetupPromptOption[] }): Promise<string>;
}
function databaseConnectionsNeedingDepth(project: KtxLocalProject): string[] {
return Object.entries(project.config.connections)
.filter(([, connection]) => isDatabaseDriver(normalizeConnectionDriver(connection)))
.filter(([, connection]) => databaseContextDepth(connection) === undefined)
.map(([connectionId]) => connectionId)
.sort((left, right) => left.localeCompare(right));
}
async function chooseSetupDatabaseContextDepth(input: {
project: KtxLocalProject;
args: KtxSetupDatabaseContextDepthArgs;
prompts: KtxSetupDatabaseContextDepthPromptAdapter;
}): Promise<KtxDatabaseContextDepth | 'back'> {
const recommended = recommendedDatabaseContextDepth(input.project.config);
if (input.args.inputMode === 'disabled') {
return recommended;
}
const deepReady = deepReadinessGaps(input.project.config).length === 0;
const options =
recommended === 'deep'
? [
{
value: 'deep',
label: 'Deep: AI descriptions, embeddings, relationships, slower',
hint: 'recommended',
},
{ value: 'fast', label: 'Fast: schema only, no AI, quickest' },
{ value: 'back', label: 'Back' },
]
: [
{ value: 'fast', label: 'Fast: schema only, no AI, quickest', hint: 'recommended' },
{ value: 'deep', label: 'Deep: AI descriptions, embeddings, relationships, slower' },
{ value: 'back', label: 'Back' },
];
const choice = await input.prompts.select({
message:
'How much database context should KTX build?\n\n' +
(deepReady
? 'Deep is available because model, embedding, and scan enrichment are configured.'
: 'Fast is recommended because model, embedding, or scan enrichment is not configured.'),
options,
});
if (choice === 'back') {
return 'back';
}
if (choice === 'fast' || choice === 'deep') {
return choice;
}
return recommended;
}
async function writeDatabaseContextDepths(
project: KtxLocalProject,
connectionIds: string[],
depth: KtxDatabaseContextDepth,
): Promise<KtxLocalProject> {
if (connectionIds.length === 0) {
return project;
}
const nextConnections = { ...project.config.connections };
for (const connectionId of connectionIds) {
const connection = nextConnections[connectionId];
if (connection) {
nextConnections[connectionId] = withDatabaseContextDepth(connection, depth);
}
}
const nextConfig = { ...project.config, connections: nextConnections };
await writeFile(project.configPath, serializeKtxProjectConfig(nextConfig), 'utf-8');
return await loadKtxProject({ projectDir: project.projectDir });
}
export async function ensureSetupDatabaseContextDepths(input: {
project: KtxLocalProject;
args: KtxSetupDatabaseContextDepthArgs;
prompts: KtxSetupDatabaseContextDepthPromptAdapter;
}): Promise<KtxLocalProject | 'back'> {
const missingDepthConnectionIds = databaseConnectionsNeedingDepth(input.project);
if (missingDepthConnectionIds.length === 0) {
return input.project;
}
const depth = await chooseSetupDatabaseContextDepth(input);
if (depth === 'back') {
return 'back';
}
return await writeDatabaseContextDepths(input.project, missingDepthConnectionIds, depth);
}
export async function applySetupDatabaseContextDepth(input: {
project: KtxLocalProject;
connection: KtxProjectConnectionConfig;
args: KtxSetupDatabaseContextDepthArgs;
prompts: KtxSetupDatabaseContextDepthPromptAdapter;
}): Promise<KtxProjectConnectionConfig | 'back'> {
if (
!isDatabaseDriver(normalizeConnectionDriver(input.connection)) ||
databaseContextDepth(input.connection) !== undefined
) {
return input.connection;
}
const depth = await chooseSetupDatabaseContextDepth(input);
if (depth === 'back') {
return 'back';
}
return withDatabaseContextDepth(input.connection, depth);
}

@ -29,7 +29,6 @@ import {
} from './database-tree-picker.js'; } from './database-tree-picker.js';
import { withMultiselectNavigation, withTextInputNavigation } from './prompt-navigation.js'; import { withMultiselectNavigation, withTextInputNavigation } from './prompt-navigation.js';
import { runKtxScan } from './scan.js'; import { runKtxScan } from './scan.js';
import { applySetupDatabaseContextDepth } from './setup-database-context-depth.js';
import { writeProjectLocalSecretReference } from './setup-secrets.js'; import { writeProjectLocalSecretReference } from './setup-secrets.js';
import { isDemoConnection } from './telemetry/demo-detect.js'; import { isDemoConnection } from './telemetry/demo-detect.js';
import { emitTelemetryEvent } from './telemetry/index.js'; import { emitTelemetryEvent } from './telemetry/index.js';
@ -1614,45 +1613,10 @@ async function applyHistoricSqlConfigToExistingConnection(input: {
prompts: input.prompts, prompts: input.prompts,
}); });
if (withHistoricSql === 'back') return 'back'; if (withHistoricSql === 'back') return 'back';
const withContextDepth = await maybeApplyContextDepthConfig({
projectDir: input.projectDir,
connectionId: input.connectionId,
connection: withHistoricSql,
args: input.args,
prompts: input.prompts,
});
if (withContextDepth === 'back') return 'back';
await writeConnectionConfig({ await writeConnectionConfig({
projectDir: input.projectDir, projectDir: input.projectDir,
connectionId: input.connectionId, connectionId: input.connectionId,
connection: withContextDepth, connection: withHistoricSql,
});
}
async function maybeApplyContextDepthConfig(input: {
projectDir: string;
connectionId: string;
connection: KtxProjectConnectionConfig;
args: KtxSetupDatabasesArgs;
prompts: KtxSetupDatabasesPromptAdapter;
}): Promise<KtxProjectConnectionConfig | 'back'> {
const project = await loadKtxProject({ projectDir: input.projectDir });
return await applySetupDatabaseContextDepth({
project: {
...project,
config: {
...project.config,
connections: {
...project.config.connections,
[input.connectionId]: input.connection,
},
},
},
connection: input.connection,
args: {
inputMode: input.args.inputMode === 'disabled' || input.args.databaseUrl ? 'disabled' : input.args.inputMode,
},
prompts: input.prompts,
}); });
} }
@ -1698,7 +1662,7 @@ async function validateAndScanConnection(input: {
deps: input.deps, deps: input.deps,
}); });
writeSetupSection(input.io, `Building schema context for ${input.connectionId}`, [ writeSetupSection(input.io, `Building schema context for ${input.connectionId}`, [
'Running fast database ingest…', 'Running database scan…',
]); ]);
let scanIo = createBufferedCommandIo(); let scanIo = createBufferedCommandIo();
let scanCode = await scanConnection(input.projectDir, input.connectionId, scanIo); let scanCode = await scanConnection(input.projectDir, input.connectionId, scanIo);
@ -1708,7 +1672,7 @@ async function validateAndScanConnection(input: {
writePrefixedLines( writePrefixedLines(
(chunk) => input.io.stderr.write(chunk), (chunk) => input.io.stderr.write(chunk),
[ [
`Fast database ingest failed for ${input.connectionId}.`, `Database scan failed for ${input.connectionId}.`,
'Native SQLite is built for a different Node.js ABI.', 'Native SQLite is built for a different Node.js ABI.',
`Detail: ${nativeSqliteDetail}`, `Detail: ${nativeSqliteDetail}`,
'Rebuilding Native SQLite with pnpm run native:rebuild…', 'Rebuilding Native SQLite with pnpm run native:rebuild…',
@ -1719,7 +1683,7 @@ async function validateAndScanConnection(input: {
if (rebuildCode === 0) { if (rebuildCode === 0) {
writePrefixedLines( writePrefixedLines(
(chunk) => input.io.stderr.write(chunk), (chunk) => input.io.stderr.write(chunk),
'Native SQLite rebuild complete. Retrying fast database ingest…', 'Native SQLite rebuild complete. Retrying database scan…',
); );
const retryScanIo = createBufferedCommandIo(); const retryScanIo = createBufferedCommandIo();
scanCode = await scanConnection(input.projectDir, input.connectionId, retryScanIo); scanCode = await scanConnection(input.projectDir, input.connectionId, retryScanIo);
@ -1730,10 +1694,10 @@ async function validateAndScanConnection(input: {
(chunk) => input.io.stderr.write(chunk), (chunk) => input.io.stderr.write(chunk),
[ [
rebuildCode === 0 rebuildCode === 0
? `Fast database ingest still failed for ${input.connectionId} after rebuilding Native SQLite.` ? `Database scan still failed for ${input.connectionId} after rebuilding Native SQLite.`
: `Native SQLite rebuild failed for ${input.connectionId}.`, : `Native SQLite rebuild failed for ${input.connectionId}.`,
'Fix: pnpm run native:rebuild', 'Fix: pnpm run native:rebuild',
`Retry: ktx ingest ${input.connectionId} --project-dir ${input.projectDir} --fast`, `Retry: ktx ingest ${input.connectionId} --project-dir ${input.projectDir}`,
].join('\n'), ].join('\n'),
); );
} }
@ -1742,8 +1706,8 @@ async function validateAndScanConnection(input: {
writePrefixedLines( writePrefixedLines(
(chunk) => input.io.stderr.write(chunk), (chunk) => input.io.stderr.write(chunk),
[ [
`Fast database ingest failed for ${input.connectionId}.`, `Database scan failed for ${input.connectionId}.`,
`Debug command: ktx ingest ${input.connectionId} --project-dir ${input.projectDir} --fast --debug`, `Debug command: ktx ingest ${input.connectionId} --project-dir ${input.projectDir} --debug`,
].join('\n'), ].join('\n'),
); );
} }
@ -2167,22 +2131,10 @@ export async function runKtxSetupDatabasesStep(
returnToDriverSelection = true; returnToDriverSelection = true;
break; break;
} }
const withContextDepth = await maybeApplyContextDepthConfig({
projectDir: args.projectDir,
connectionId: connectionChoice.connectionId,
connection: withHistoricSql,
args,
prompts,
});
if (withContextDepth === 'back') {
if (!canReturnToDriverSelection) return { status: 'back', projectDir: args.projectDir };
returnToDriverSelection = true;
break;
}
await writeConnectionConfig({ await writeConnectionConfig({
projectDir: args.projectDir, projectDir: args.projectDir,
connectionId: connectionChoice.connectionId, connectionId: connectionChoice.connectionId,
connection: withContextDepth, connection: withHistoricSql,
io, io,
}); });
} else { } else {
@ -2193,22 +2145,10 @@ export async function runKtxSetupDatabasesStep(
returnToDriverSelection = true; returnToDriverSelection = true;
break; break;
} }
const withContextDepth = await maybeApplyContextDepthConfig({
projectDir: args.projectDir,
connectionId: connectionChoice.connectionId,
connection: withHistoricSql,
args,
prompts,
});
if (withContextDepth === 'back') {
if (!canReturnToDriverSelection) return { status: 'back', projectDir: args.projectDir };
returnToDriverSelection = true;
break;
}
await writeConnectionConfig({ await writeConnectionConfig({
projectDir: args.projectDir, projectDir: args.projectDir,
connectionId: connectionChoice.connectionId, connectionId: connectionChoice.connectionId,
connection: withContextDepth, connection: withHistoricSql,
io, io,
}); });
} }
@ -2291,22 +2231,10 @@ export async function runKtxSetupDatabasesStep(
returnToDriverSelection = true; returnToDriverSelection = true;
break; break;
} }
const withContextDepth = await maybeApplyContextDepthConfig({
projectDir: args.projectDir,
connectionId: connectionChoice.connectionId,
connection: withHistoricSql,
args,
prompts,
});
if (withContextDepth === 'back') {
if (!canReturnToDriverSelection) return { status: 'back', projectDir: args.projectDir };
returnToDriverSelection = true;
break;
}
await writeConnectionConfig({ await writeConnectionConfig({
projectDir: args.projectDir, projectDir: args.projectDir,
connectionId: connectionChoice.connectionId, connectionId: connectionChoice.connectionId,
connection: withContextDepth, connection: withHistoricSql,
io, io,
}); });
setupStatus = await validateAndScanConnection({ setupStatus = await validateAndScanConnection({

@ -365,7 +365,6 @@
"embeddings", "embeddings",
"secrets", "secrets",
"databases", "databases",
"database-context-depth",
"sources", "sources",
"context", "context",
"agents", "agents",

@ -38,7 +38,6 @@ const setupStepSchema = telemetryCommonEnvelopeSchema
'embeddings', 'embeddings',
'secrets', 'secrets',
'databases', 'databases',
'database-context-depth',
'sources', 'sources',
'context', 'context',
'agents', 'agents',

@ -228,11 +228,11 @@ describe('renderContextBuildView', () => {
const rendered = renderContextBuildView(state, { const rendered = renderContextBuildView(state, {
styled: false, styled: false,
warnings: ['--deep affects database ingest only; ignoring it for docs.'], warnings: ['--query-history affects database ingest only; ignoring it for docs.'],
}); });
expect(rendered).toContain('Warnings:'); expect(rendered).toContain('Warnings:');
expect(rendered).toContain('--deep affects database ingest only; ignoring it for docs.'); expect(rendered).toContain('--query-history affects database ingest only; ignoring it for docs.');
}); });
it('renders public notices in the foreground view before warnings', () => { it('renders public notices in the foreground view before warnings', () => {
@ -243,7 +243,6 @@ describe('renderContextBuildView', () => {
operation: 'database-ingest', operation: 'database-ingest',
debugCommand: 'ktx ingest warehouse --debug', debugCommand: 'ktx ingest warehouse --debug',
steps: ['database-schema', 'query-history'], steps: ['database-schema', 'query-history'],
databaseDepth: 'deep',
detectRelationships: true, detectRelationships: true,
queryHistory: { enabled: true, dialect: 'postgres' }, queryHistory: { enabled: true, dialect: 'postgres' },
}, },
@ -252,12 +251,12 @@ describe('renderContextBuildView', () => {
const rendered = renderContextBuildView(state, { const rendered = renderContextBuildView(state, {
styled: false, styled: false,
notices: ['Schema ingest runs before query history for warehouse.'], notices: ['Schema ingest runs before query history for warehouse.'],
warnings: ['--query-history requires deep ingest; running warehouse with --deep.'], warnings: ['--query-history is not supported for sqlite; running schema ingest for local.'],
}); });
expect(rendered.indexOf('Notices:')).toBeLessThan(rendered.indexOf('Warnings:')); expect(rendered.indexOf('Notices:')).toBeLessThan(rendered.indexOf('Warnings:'));
expect(rendered).toContain('Schema ingest runs before query history for warehouse.'); expect(rendered).toContain('Schema ingest runs before query history for warehouse.');
expect(rendered).toContain('--query-history requires deep ingest; running warehouse with --deep.'); expect(rendered).toContain('--query-history is not supported for sqlite; running schema ingest for local.');
}); });
it('renders dynamic separator matching header width', () => { it('renders dynamic separator matching header width', () => {
@ -653,7 +652,6 @@ describe('runContextBuild', () => {
inputMode: 'disabled', inputMode: 'disabled',
targetConnectionId: 'warehouse', targetConnectionId: 'warehouse',
all: false, all: false,
depth: 'fast',
queryHistory: 'default', queryHistory: 'default',
}, },
io.io, io.io,
@ -665,7 +663,6 @@ describe('runContextBuild', () => {
expect(executeTarget.mock.calls[0]?.[0]).toMatchObject({ expect(executeTarget.mock.calls[0]?.[0]).toMatchObject({
connectionId: 'warehouse', connectionId: 'warehouse',
operation: 'database-ingest', operation: 'database-ingest',
databaseDepth: 'fast',
}); });
expect(io.stdout()).toContain('Databases:'); expect(io.stdout()).toContain('Databases:');
expect(io.stdout()).toContain('warehouse'); expect(io.stdout()).toContain('warehouse');
@ -716,7 +713,7 @@ describe('runContextBuild', () => {
it('renders localhost SQL analysis refusal as a runtime failure during query history', async () => { it('renders localhost SQL analysis refusal as a runtime failure during query history', async () => {
const io = makeIo(); const io = makeIo();
const project = projectWithConnections({ const project = projectWithConnections({
warehouse: { driver: 'postgres', context: { depth: 'deep', queryHistory: { enabled: true } } }, warehouse: { driver: 'postgres', context: { queryHistory: { enabled: true } } },
}); });
const executeTarget = vi.fn(async (target, _args, targetIo) => { const executeTarget = vi.fn(async (target, _args, targetIo) => {
targetIo.stderr.write('connect ECONNREFUSED 127.0.0.1:8765\n'); targetIo.stderr.write('connect ECONNREFUSED 127.0.0.1:8765\n');
@ -751,7 +748,7 @@ describe('runContextBuild', () => {
it('uses captured query-history stderr instead of generic failed-at detail', async () => { it('uses captured query-history stderr instead of generic failed-at detail', async () => {
const io = makeIo(); const io = makeIo();
const project = projectWithConnections({ const project = projectWithConnections({
warehouse: { driver: 'postgres', context: { depth: 'deep', queryHistory: { enabled: true } } }, warehouse: { driver: 'postgres', context: { queryHistory: { enabled: true } } },
}); });
const executeTarget = vi.fn(async (target, _args, targetIo) => { const executeTarget = vi.fn(async (target, _args, targetIo) => {
targetIo.stdout.write('KTX scan completed\n'); targetIo.stdout.write('KTX scan completed\n');
@ -768,7 +765,7 @@ describe('runContextBuild', () => {
operation: 'query-history', operation: 'query-history',
status: 'failed', status: 'failed',
detail: detail:
'warehouse failed at query-history. Retry: ktx ingest warehouse --project-dir /tmp/project --deep --query-history', 'warehouse failed at query-history. Retry: ktx ingest warehouse --project-dir /tmp/project --query-history',
}, },
{ operation: 'source-ingest', status: 'skipped' }, { operation: 'source-ingest', status: 'skipped' },
{ operation: 'memory-update', status: 'skipped' }, { operation: 'memory-update', status: 'skipped' },
@ -785,7 +782,7 @@ describe('runContextBuild', () => {
expect(result).toEqual({ exitCode: 1 }); expect(result).toEqual({ exitCode: 1 });
expect(io.stdout()).toContain('Missing bundled Python runtime manifest: /tmp/assets/python/manifest.json.'); expect(io.stdout()).toContain('Missing bundled Python runtime manifest: /tmp/assets/python/manifest.json.');
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --deep --query-history'); expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --query-history');
expect(io.stdout()).not.toContain('Then retry the runtime-backed KTX command'); expect(io.stdout()).not.toContain('Then retry the runtime-backed KTX command');
expect(io.stdout()).not.toContain('warehouse failed at query-history'); expect(io.stdout()).not.toContain('warehouse failed at query-history');
expect(io.stdout().match(/Retry: /g)).toHaveLength(1); expect(io.stdout().match(/Retry: /g)).toHaveLength(1);
@ -899,12 +896,12 @@ describe('runContextBuild', () => {
const io = makeIo(); const io = makeIo();
const project: KtxPublicIngestProject = { const project: KtxPublicIngestProject = {
...projectWithConnections({ ...projectWithConnections({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}), }),
config: { config: {
...projectWithConnections({ warehouse: { driver: 'postgres' } }).config, ...projectWithConnections({ warehouse: { driver: 'postgres' } }).config,
connections: { connections: {
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}, },
llm: { llm: {
provider: { backend: 'gateway', gateway: { api_key: 'env:KTX_GATEWAY_API_KEY' } }, // pragma: allowlist secret provider: { backend: 'gateway', gateway: { api_key: 'env:KTX_GATEWAY_API_KEY' } }, // pragma: allowlist secret

@ -702,7 +702,7 @@ describe('runKtxCli', () => {
const publicIngest = vi.fn().mockResolvedValue(0); const publicIngest = vi.fn().mockResolvedValue(0);
await expect( await expect(
runKtxCli(['--project-dir', tempDir, 'ingest', 'warehouse', '--fast', '--no-input'], testIo.io, { runKtxCli(['--project-dir', tempDir, 'ingest', 'warehouse', '--no-input'], testIo.io, {
publicIngest, publicIngest,
}), }),
).resolves.toBe(0); ).resolves.toBe(0);
@ -715,7 +715,6 @@ describe('runKtxCli', () => {
all: false, all: false,
json: false, json: false,
inputMode: 'disabled', inputMode: 'disabled',
depth: 'fast',
queryHistory: 'default', queryHistory: 'default',
cliVersion, cliVersion,
runtimeInstallPolicy: 'never', runtimeInstallPolicy: 'never',
@ -725,12 +724,12 @@ describe('runKtxCli', () => {
expect(testIo.stderr()).toBe(`Project: ${tempDir}\n`); expect(testIo.stderr()).toBe(`Project: ${tempDir}\n`);
}); });
it('routes public ingest --all --deep with JSON output', async () => { it('routes public ingest --all with JSON output', async () => {
const testIo = makeIo(); const testIo = makeIo();
const publicIngest = vi.fn().mockResolvedValue(0); const publicIngest = vi.fn().mockResolvedValue(0);
await expect( await expect(
runKtxCli(['--project-dir', tempDir, 'ingest', '--all', '--deep', '--json'], testIo.io, { runKtxCli(['--project-dir', tempDir, 'ingest', '--all', '--json'], testIo.io, {
publicIngest, publicIngest,
}), }),
).resolves.toBe(0); ).resolves.toBe(0);
@ -742,7 +741,6 @@ describe('runKtxCli', () => {
all: true, all: true,
json: true, json: true,
inputMode: 'auto', inputMode: 'auto',
depth: 'deep',
queryHistory: 'default', queryHistory: 'default',
cliVersion, cliVersion,
runtimeInstallPolicy: 'prompt', runtimeInstallPolicy: 'prompt',
@ -786,20 +784,6 @@ describe('runKtxCli', () => {
expect(testIo.stderr()).toContain('Choose only one runtime install mode: --yes or --no-input'); expect(testIo.stderr()).toContain('Choose only one runtime install mode: --yes or --no-input');
}); });
it('rejects mutually exclusive public ingest depth flags before dispatch', async () => {
const testIo = makeIo();
const publicIngest = vi.fn().mockResolvedValue(0);
await expect(
runKtxCli(['--project-dir', '/tmp/project', 'ingest', 'warehouse', '--fast', '--deep'], testIo.io, {
publicIngest,
}),
).resolves.toBe(1);
expect(publicIngest).not.toHaveBeenCalled();
expect(testIo.stderr()).toMatch(/option '--(deep|fast)' cannot be used with option '--(fast|deep)'/);
});
it.each(['run', 'status', 'watch', 'replay'])( it.each(['run', 'status', 'watch', 'replay'])(
'routes former ingest subcommand name "%s" as a connection id', 'routes former ingest subcommand name "%s" as a connection id',
async (connectionId) => { async (connectionId) => {
@ -890,8 +874,6 @@ describe('runKtxCli', () => {
expect(testIo.stdout()).toContain('Usage: ktx ingest'); expect(testIo.stdout()).toContain('Usage: ktx ingest');
expect(testIo.stdout()).toContain('Build or inspect KTX context'); expect(testIo.stdout()).toContain('Build or inspect KTX context');
expect(testIo.stdout()).toContain('--all'); expect(testIo.stdout()).toContain('--all');
expect(testIo.stdout()).toContain('--fast');
expect(testIo.stdout()).toContain('--deep');
expect(testIo.stdout()).toContain('--query-history'); expect(testIo.stdout()).toContain('--query-history');
expect(testIo.stdout()).toContain('--no-query-history'); expect(testIo.stdout()).toContain('--no-query-history');
expect(testIo.stdout()).toContain('--query-history-window-days <days>'); expect(testIo.stdout()).toContain('--query-history-window-days <days>');

@ -88,7 +88,7 @@ function deepReadyProject(
describe('buildPublicIngestPlan', () => { describe('buildPublicIngestPlan', () => {
it('plans warehouse connections as scan targets and source connections as source ingest targets', () => { it('plans warehouse connections as scan targets and source connections as source ingest targets', () => {
const project = projectWithConnections({ const project = deepReadyProject({
warehouse: { driver: 'postgres' }, warehouse: { driver: 'postgres' },
prod_metabase: { driver: 'metabase', api_url: 'https://metabase.example.com' }, prod_metabase: { driver: 'metabase', api_url: 'https://metabase.example.com' },
docs: { driver: 'notion' }, docs: { driver: 'notion' },
@ -103,8 +103,7 @@ describe('buildPublicIngestPlan', () => {
operation: 'database-ingest', operation: 'database-ingest',
debugCommand: 'ktx ingest warehouse --debug', debugCommand: 'ktx ingest warehouse --debug',
steps: ['database-schema'], steps: ['database-schema'],
databaseDepth: 'fast', detectRelationships: true,
detectRelationships: false,
queryHistory: { enabled: false }, queryHistory: { enabled: false },
}, },
{ {
@ -139,61 +138,6 @@ describe('buildPublicIngestPlan', () => {
expect(plan.targets.map((target) => target.connectionId).sort()).toEqual(['docs', 'warehouse']); expect(plan.targets.map((target) => target.connectionId).sort()).toEqual(['docs', 'warehouse']);
}); });
it('resolves database depth from flags, stored context, and defaults', () => {
const project = projectWithConnections({
fast_default: { driver: 'postgres' },
deep_default: { driver: 'postgres', context: { depth: 'deep' } },
docs: { driver: 'notion' },
});
expect(
buildPublicIngestPlan(project, {
projectDir: '/tmp/project',
targetConnectionId: 'fast_default',
all: false,
queryHistory: 'default',
}).targets[0],
).toMatchObject({ connectionId: 'fast_default', databaseDepth: 'fast', queryHistory: { enabled: false } });
expect(
buildPublicIngestPlan(project, {
projectDir: '/tmp/project',
targetConnectionId: 'deep_default',
all: false,
queryHistory: 'default',
}).targets[0],
).toMatchObject({ connectionId: 'deep_default', databaseDepth: 'deep' });
expect(
buildPublicIngestPlan(project, {
projectDir: '/tmp/project',
targetConnectionId: 'docs',
all: false,
depth: 'deep',
queryHistory: 'default',
}).warnings,
).toEqual(['--deep affects database ingest only; ignoring it for docs.']);
});
it('does not infer deep ingest from legacy scanMode values', () => {
const project = projectWithConnections({
warehouse: { driver: 'postgres' },
});
const plan = buildPublicIngestPlan(project, {
projectDir: '/tmp/project',
targetConnectionId: 'warehouse',
all: false,
scanMode: 'enriched',
});
expect(plan.targets[0]).toMatchObject({
connectionId: 'warehouse',
databaseDepth: 'fast',
steps: ['database-schema'],
});
});
it('rejects stale local Looker source driver aliases', () => { it('rejects stale local Looker source driver aliases', () => {
const project = projectWithConnections({ const project = projectWithConnections({
local_looker: { driver: 'local_looker' } as never, local_looker: { driver: 'local_looker' } as never,
@ -204,8 +148,8 @@ describe('buildPublicIngestPlan', () => {
); );
}); });
it('upgrades effective depth when query history is explicitly enabled', () => { it('enables query history when explicitly requested even if stored config disables it', () => {
const project = projectWithConnections({ const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { queryHistory: { enabled: false } } }, warehouse: { driver: 'postgres', context: { queryHistory: { enabled: false } } },
}); });
@ -213,17 +157,16 @@ describe('buildPublicIngestPlan', () => {
projectDir: '/tmp/project', projectDir: '/tmp/project',
targetConnectionId: 'warehouse', targetConnectionId: 'warehouse',
all: false, all: false,
depth: 'fast',
queryHistory: 'enabled', queryHistory: 'enabled',
queryHistoryWindowDays: 30, queryHistoryWindowDays: 30,
}); });
expect(plan.targets[0]).toMatchObject({ expect(plan.targets[0]).toMatchObject({
connectionId: 'warehouse', connectionId: 'warehouse',
databaseDepth: 'deep',
queryHistory: { enabled: true, windowDays: 30, dialect: 'postgres' }, queryHistory: { enabled: true, windowDays: 30, dialect: 'postgres' },
steps: ['database-schema', 'query-history'],
}); });
expect(plan.warnings).toEqual(['--query-history requires deep ingest; running warehouse with --deep.']); expect(plan.warnings).toEqual([]);
}); });
it('warns and skips query history for unsupported database drivers', () => { it('warns and skips query history for unsupported database drivers', () => {
@ -238,7 +181,6 @@ describe('buildPublicIngestPlan', () => {
expect(plan.targets[0]).toMatchObject({ expect(plan.targets[0]).toMatchObject({
connectionId: 'local', connectionId: 'local',
databaseDepth: 'fast',
queryHistory: { enabled: false, unsupported: true }, queryHistory: { enabled: false, unsupported: true },
}); });
expect(plan.warnings).toEqual(['--query-history is not supported for sqlite; running schema ingest for local.']); expect(plan.warnings).toEqual(['--query-history is not supported for sqlite; running schema ingest for local.']);
@ -249,12 +191,11 @@ describe('buildPublicIngestPlan', () => {
deepReadyProject({ deepReadyProject({
local: { driver: 'sqlite' }, local: { driver: 'sqlite' },
mysql_warehouse: { driver: 'mysql' }, mysql_warehouse: { driver: 'mysql' },
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}), }),
{ {
projectDir: '/tmp/project', projectDir: '/tmp/project',
all: true, all: true,
depth: 'deep',
queryHistory: 'enabled', queryHistory: 'enabled',
}, },
); );
@ -326,7 +267,6 @@ describe('buildPublicIngestPlan', () => {
expect(plan.targets[0]).toMatchObject({ expect(plan.targets[0]).toMatchObject({
connectionId: 'warehouse', connectionId: 'warehouse',
databaseDepth: 'deep',
queryHistory: { enabled: true, dialect: 'postgres', windowDays: 30 }, queryHistory: { enabled: true, dialect: 'postgres', windowDays: 30 },
steps: ['database-schema', 'query-history'], steps: ['database-schema', 'query-history'],
}); });
@ -334,7 +274,7 @@ describe('buildPublicIngestPlan', () => {
it('adds a schema-first notice when query history is explicitly enabled', () => { it('adds a schema-first notice when query history is explicitly enabled', () => {
const project = deepReadyProject({ const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}); });
expect( expect(
@ -363,34 +303,15 @@ describe('buildPublicIngestPlan', () => {
expect(plan.targets[0]).toMatchObject({ expect(plan.targets[0]).toMatchObject({
connectionId: 'local', connectionId: 'local',
databaseDepth: 'fast',
queryHistory: { enabled: false, windowDays: 30, unsupported: true }, queryHistory: { enabled: false, windowDays: 30, unsupported: true },
steps: ['database-schema'], steps: ['database-schema'],
}); });
expect(plan.warnings).toEqual(['--query-history is not supported for sqlite; running schema ingest for local.']); expect(plan.warnings).toEqual(['--query-history is not supported for sqlite; running schema ingest for local.']);
}); });
it('aggregates ignored database-depth warnings for all source targets', () => { it('records a preflight failure for database ingest when enrichment readiness config is missing', () => {
const plan = buildPublicIngestPlan(
projectWithConnections({
warehouse: { driver: 'postgres' },
docs: { driver: 'notion' },
dbt: { driver: 'dbt' },
}),
{
projectDir: '/tmp/project',
all: true,
depth: 'deep',
queryHistory: 'default',
},
);
expect(plan.warnings).toEqual(['--deep ignored for 2 non-database sources.']);
});
it('records a preflight failure for deep database ingest when readiness config is missing', () => {
const project = projectWithConnections({ const project = projectWithConnections({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}); });
const plan = buildPublicIngestPlan(project, { const plan = buildPublicIngestPlan(project, {
@ -402,15 +323,14 @@ describe('buildPublicIngestPlan', () => {
expect(plan.targets[0]).toMatchObject({ expect(plan.targets[0]).toMatchObject({
connectionId: 'warehouse', connectionId: 'warehouse',
databaseDepth: 'deep',
preflightFailure: preflightFailure:
'warehouse requires deep ingest readiness: model configuration, scan enrichment mode, scan embeddings. Run ktx setup or rerun with --fast.', 'warehouse cannot be ingested: enrichment is not configured (model configuration, scan enrichment mode, scan embeddings). Run ktx setup to configure a model and embeddings.',
}); });
}); });
it('honors scan.relationships.enabled when planning deep database ingest', () => { it('honors scan.relationships.enabled when planning database ingest', () => {
const plan = buildPublicIngestPlan( const plan = buildPublicIngestPlan(
deepReadyProject({ warehouse: { driver: 'postgres', context: { depth: 'deep' } } }, false), deepReadyProject({ warehouse: { driver: 'postgres' } }, false),
{ {
projectDir: '/tmp/project', projectDir: '/tmp/project',
targetConnectionId: 'warehouse', targetConnectionId: 'warehouse',
@ -421,7 +341,6 @@ describe('buildPublicIngestPlan', () => {
expect(plan.targets[0]).toMatchObject({ expect(plan.targets[0]).toMatchObject({
connectionId: 'warehouse', connectionId: 'warehouse',
databaseDepth: 'deep',
detectRelationships: false, detectRelationships: false,
}); });
}); });
@ -432,11 +351,11 @@ describe('runKtxPublicIngest', () => {
vi.unstubAllEnvs(); vi.unstubAllEnvs();
}); });
it('maps fast and deep database targets to scan internals', async () => { it('maps database targets to enriched scan internals', async () => {
const io = makeIo(); const io = makeIo();
const project = deepReadyProject({ const project = deepReadyProject({
fast: { driver: 'postgres' }, first: { driver: 'postgres' },
deep: { driver: 'postgres', context: { depth: 'deep' } }, second: { driver: 'postgres' },
}); });
const runScan = vi.fn(async () => 0); const runScan = vi.fn(async () => 0);
@ -450,12 +369,12 @@ describe('runKtxPublicIngest', () => {
expect(runScan).toHaveBeenNthCalledWith( expect(runScan).toHaveBeenNthCalledWith(
1, 1,
expect.objectContaining({ connectionId: 'deep', mode: 'enriched', detectRelationships: true }), expect.objectContaining({ connectionId: 'first', mode: 'enriched', detectRelationships: true }),
expect.anything(), expect.anything(),
); );
expect(runScan).toHaveBeenNthCalledWith( expect(runScan).toHaveBeenNthCalledWith(
2, 2,
expect.objectContaining({ connectionId: 'fast', mode: 'structural', detectRelationships: false }), expect.objectContaining({ connectionId: 'second', mode: 'enriched', detectRelationships: true }),
expect.anything(), expect.anything(),
); );
}); });
@ -467,7 +386,7 @@ describe('runKtxPublicIngest', () => {
try { try {
await initKtxProject({ projectDir }); await initKtxProject({ projectDir });
const io = makeIo({ isTTY: true }); const io = makeIo({ isTTY: true });
const project = projectWithConnections({ const project = deepReadyProject({
warehouse: { driver: 'sqlite', path: join(projectDir, 'warehouse.sqlite') }, warehouse: { driver: 'sqlite', path: join(projectDir, 'warehouse.sqlite') },
}); });
@ -614,7 +533,7 @@ describe('runKtxPublicIngest', () => {
it('prints the schema-first notice for explicit query-history runs', async () => { it('prints the schema-first notice for explicit query-history runs', async () => {
const io = makeIo(); const io = makeIo();
const project = deepReadyProject({ const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}); });
const runScan = vi.fn(async () => 0); const runScan = vi.fn(async () => 0);
const runIngest = vi.fn(async () => 0); const runIngest = vi.fn(async () => 0);
@ -640,7 +559,7 @@ describe('runKtxPublicIngest', () => {
it('suppresses internal scan output for public database ingest summaries', async () => { it('suppresses internal scan output for public database ingest summaries', async () => {
const io = makeIo(); const io = makeIo();
const project = projectWithConnections({ warehouse: { driver: 'postgres' } }); const project = deepReadyProject({ warehouse: { driver: 'postgres' } });
const runScan = vi.fn(async (_args, scanIo) => { const runScan = vi.fn(async (_args, scanIo) => {
scanIo.stdout.write('KTX scan completed\n'); scanIo.stdout.write('KTX scan completed\n');
scanIo.stdout.write('Mode: structural\n'); scanIo.stdout.write('Mode: structural\n');
@ -674,7 +593,7 @@ describe('runKtxPublicIngest', () => {
it('sanitizes captured database scan failure details in direct public output', async () => { it('sanitizes captured database scan failure details in direct public output', async () => {
const io = makeIo(); const io = makeIo();
const project = deepReadyProject({ warehouse: { driver: 'postgres', context: { depth: 'deep' } } }); const project = deepReadyProject({ warehouse: { driver: 'postgres' } });
const runScan = vi.fn(async (_args, scanIo) => { const runScan = vi.fn(async (_args, scanIo) => {
scanIo.stdout.write('KTX scan enrichment failed after structural scan completed: embedding service timed out\n'); scanIo.stdout.write('KTX scan enrichment failed after structural scan completed: embedding service timed out\n');
return 1; return 1;
@ -689,7 +608,6 @@ describe('runKtxPublicIngest', () => {
all: false, all: false,
json: false, json: false,
inputMode: 'disabled', inputMode: 'disabled',
depth: 'deep',
}, },
io.io, io.io,
{ loadProject: vi.fn(async () => project), runScan }, { loadProject: vi.fn(async () => project), runScan },
@ -699,7 +617,7 @@ describe('runKtxPublicIngest', () => {
expect(io.stdout()).toContain( expect(io.stdout()).toContain(
'warehouse failed: Database enrichment failed after schema context completed: embedding service timed out.', 'warehouse failed: Database enrichment failed after schema context completed: embedding service timed out.',
); );
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --deep'); expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project');
expect(io.stdout()).not.toContain('KTX scan enrichment failed'); expect(io.stdout()).not.toContain('KTX scan enrichment failed');
expect(io.stdout()).not.toContain('structural scan'); expect(io.stdout()).not.toContain('structural scan');
}); });
@ -743,7 +661,7 @@ describe('runKtxPublicIngest', () => {
it('suppresses historic-sql report output during direct public query-history ingest', async () => { it('suppresses historic-sql report output during direct public query-history ingest', async () => {
const io = makeIo(); const io = makeIo();
const project = deepReadyProject({ const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}); });
const runScan = vi.fn(async () => 0); const runScan = vi.fn(async () => 0);
const runIngest = vi.fn(async (_args, ingestIo) => { const runIngest = vi.fn(async (_args, ingestIo) => {
@ -794,7 +712,6 @@ describe('runKtxPublicIngest', () => {
all: false, all: false,
json: false, json: false,
inputMode: 'auto', inputMode: 'auto',
depth: 'fast',
queryHistory: 'default', queryHistory: 'default',
}, },
io.io, io.io,
@ -809,7 +726,6 @@ describe('runKtxPublicIngest', () => {
targetConnectionId: 'warehouse', targetConnectionId: 'warehouse',
all: false, all: false,
entrypoint: 'ingest', entrypoint: 'ingest',
depth: 'fast',
queryHistory: 'default', queryHistory: 'default',
}), }),
io.io, io.io,
@ -821,7 +737,7 @@ describe('runKtxPublicIngest', () => {
const io = makeIo({ isTTY: true, interactive: true }); const io = makeIo({ isTTY: true, interactive: true });
const calls: string[] = []; const calls: string[] = [];
const project = projectWithConnections({ const project = projectWithConnections({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}); });
const ensureRuntime = vi.fn(async (): Promise<ManagedPythonCommandRuntime> => { const ensureRuntime = vi.fn(async (): Promise<ManagedPythonCommandRuntime> => {
calls.push('runtime'); calls.push('runtime');
@ -923,10 +839,13 @@ describe('runKtxPublicIngest', () => {
it('runs all independent targets and reports partial failures', async () => { it('runs all independent targets and reports partial failures', async () => {
const io = makeIo(); const io = makeIo();
const project = projectWithConnections({ const project = deepReadyProject(
warehouse: { driver: 'postgres' }, {
prod_metabase: { driver: 'metabase', api_url: 'https://metabase.example.com' }, warehouse: { driver: 'postgres' },
}); prod_metabase: { driver: 'metabase', api_url: 'https://metabase.example.com' },
},
false,
);
const runScan = vi.fn(async () => 1); const runScan = vi.fn(async () => 1);
const runIngest = vi.fn(async () => 0); const runIngest = vi.fn(async () => 0);
@ -959,7 +878,7 @@ describe('runKtxPublicIngest', () => {
command: 'run', command: 'run',
projectDir: '/tmp/project', projectDir: '/tmp/project',
connectionId: 'warehouse', connectionId: 'warehouse',
mode: 'structural', mode: 'enriched',
detectRelationships: false, detectRelationships: false,
dryRun: false, dryRun: false,
}, },
@ -967,14 +886,14 @@ describe('runKtxPublicIngest', () => {
); );
expect(io.stdout()).toContain('Ingest finished with partial failures'); expect(io.stdout()).toContain('Ingest finished with partial failures');
expect(io.stdout()).toContain('warehouse failed at database-schema.'); expect(io.stdout()).toContain('warehouse failed at database-schema.');
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --fast'); expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project');
expect(io.stdout()).not.toContain('Debug:'); expect(io.stdout()).not.toContain('Debug:');
}); });
it('skips the query-history facet but keeps the target green when query-history fails', async () => { it('skips the query-history facet but keeps the target green when query-history fails', async () => {
const io = makeIo(); const io = makeIo();
const project = deepReadyProject({ const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}); });
const runScan = vi.fn(async () => 0); const runScan = vi.fn(async () => 0);
const runIngest = vi.fn(async (_args, ingestIo) => { const runIngest = vi.fn(async (_args, ingestIo) => {
@ -1007,14 +926,14 @@ describe('runKtxPublicIngest', () => {
'Query history failed for 60 tasks. First failure: Google Cloud authentication failed while analyzing query history', 'Query history failed for 60 tasks. First failure: Google Cloud authentication failed while analyzing query history',
); );
expect(io.stdout()).not.toContain('warehouse failed: Error:'); expect(io.stdout()).not.toContain('warehouse failed: Error:');
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --deep --query-history'); expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --query-history');
expect(io.stdout()).not.toContain('historic-sql'); expect(io.stdout()).not.toContain('historic-sql');
}); });
it('prints the runtime artifact build hint for missing query-history runtime assets', async () => { it('prints the runtime artifact build hint for missing query-history runtime assets', async () => {
const io = makeIo(); const io = makeIo();
const project = deepReadyProject({ const project = deepReadyProject({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
}); });
const runScan = vi.fn(async () => 0); const runScan = vi.fn(async () => 0);
const runIngest = vi.fn(async (_args, ingestIo) => { const runIngest = vi.fn(async (_args, ingestIo) => {
@ -1045,14 +964,14 @@ describe('runKtxPublicIngest', () => {
expect(io.stdout()).toContain( expect(io.stdout()).toContain(
'In a source checkout, build the local runtime assets with: pnpm run artifacts:build', 'In a source checkout, build the local runtime assets with: pnpm run artifacts:build',
); );
expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --deep --query-history'); expect(io.stdout()).toContain('Retry: ktx ingest warehouse --project-dir /tmp/project --query-history');
expect(io.stdout()).not.toContain('Then retry the runtime-backed KTX command'); expect(io.stdout()).not.toContain('Then retry the runtime-backed KTX command');
}); });
it('fails deep-readiness targets before work starts while continuing independent --all targets', async () => { it('fails enrichment-readiness targets before work starts while continuing independent --all targets', async () => {
const io = makeIo(); const io = makeIo();
const project = projectWithConnections({ const project = projectWithConnections({
warehouse: { driver: 'postgres', context: { depth: 'deep' } }, warehouse: { driver: 'postgres' },
docs: { driver: 'notion' }, docs: { driver: 'notion' },
}); });
const runScan = vi.fn(async () => 0); const runScan = vi.fn(async () => 0);
@ -1071,12 +990,12 @@ describe('runKtxPublicIngest', () => {
expect.objectContaining({ command: 'run', connectionId: 'docs', adapter: 'notion' }), expect.objectContaining({ command: 'run', connectionId: 'docs', adapter: 'notion' }),
expect.anything(), expect.anything(),
); );
expect(io.stdout()).toContain('warehouse requires deep ingest readiness'); expect(io.stdout()).toContain('warehouse cannot be ingested: enrichment is not configured');
}); });
it('does not infer enriched relationship scans from legacy scanMode values', async () => { it('drives scan relationship detection from project config, not from legacy args', async () => {
const io = makeIo(); const io = makeIo();
const project = deepReadyProject({ warehouse: { driver: 'postgres' } }); const project = deepReadyProject({ warehouse: { driver: 'postgres' } }, false);
const runScan = vi.fn(async () => 0); const runScan = vi.fn(async () => 0);
await expect( await expect(
@ -1103,7 +1022,7 @@ describe('runKtxPublicIngest', () => {
command: 'run', command: 'run',
projectDir: '/tmp/project', projectDir: '/tmp/project',
connectionId: 'warehouse', connectionId: 'warehouse',
mode: 'structural', mode: 'enriched',
detectRelationships: false, detectRelationships: false,
dryRun: false, dryRun: false,
}, },
@ -1113,7 +1032,7 @@ describe('runKtxPublicIngest', () => {
it('prints stable JSON results', async () => { it('prints stable JSON results', async () => {
const io = makeIo(); const io = makeIo();
const project = projectWithConnections({ warehouse: { driver: 'postgres' } }); const project = deepReadyProject({ warehouse: { driver: 'postgres' } });
await expect( await expect(
runKtxPublicIngest( runKtxPublicIngest(


@ -1,7 +1,7 @@
import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises'; import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import { buildDefaultKtxProjectConfig, parseKtxProjectConfig, serializeKtxProjectConfig, type KtxProjectConfig } from '../src/context/project/config.js'; import { buildDefaultKtxProjectConfig, serializeKtxProjectConfig, type KtxProjectConfig } from '../src/context/project/config.js';
import { readKtxSetupState, writeKtxSetupState } from '../src/context/project/setup-config.js'; import { readKtxSetupState, writeKtxSetupState } from '../src/context/project/setup-config.js';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
@ -49,7 +49,7 @@ async function writeReadyProject(projectDir: string, overrides: ReadyProjectOver
...defaults, ...defaults,
setup: { database_connection_ids: ['warehouse'] }, setup: { database_connection_ids: ['warehouse'] },
connections: { connections: {
warehouse: { driver: 'postgres', url: 'env:DATABASE_URL', context: { depth: 'deep' } }, warehouse: { driver: 'postgres', url: 'env:DATABASE_URL' },
docs: { driver: 'notion', auth_token_ref: 'env:NOTION_TOKEN', crawl_mode: 'all_accessible' }, docs: { driver: 'notion', auth_token_ref: 'env:NOTION_TOKEN', crawl_mode: 'all_accessible' },
}, },
llm: { llm: {
@ -407,130 +407,10 @@ describe('setup context build state', () => {
expect(io.stdout()).not.toContain('Existing context artifacts were found from setup ingest.'); expect(io.stdout()).not.toContain('Existing context artifacts were found from setup ingest.');
}); });
it('treats fast database context as ready from schema manifest shards without AI artifacts', async () => { it('requires completed relationships for database context when relationship discovery is enabled', async () => {
await writeReadyProject(tempDir, { await writeReadyProject(tempDir, {
connections: { connections: {
warehouse: { driver: 'postgres', readonly: true, context: { depth: 'fast' } }, warehouse: { driver: 'postgres', readonly: true },
},
llm: { provider: { backend: 'none' }, models: {} },
scan: { enrichment: { mode: 'none' } },
});
await mkdir(join(tempDir, 'semantic-layer', 'warehouse', '_schema'), { recursive: true });
await writeFile(join(tempDir, 'semantic-layer', 'warehouse', '_schema', 'public.yaml'), 'tables: {}\n');
await writeScanReport(tempDir, '2026-05-09T10:00:00.000Z', {
mode: 'structural',
tableDescriptions: 'skipped',
columnDescriptions: 'skipped',
embeddings: 'skipped',
manifestShards: ['semantic-layer/warehouse/_schema/public.yaml'],
});
const io = makeIo();
const runContextBuildMock = vi.fn<NonNullable<KtxSetupContextDeps['runContextBuild']>>(async () => ({
exitCode: 0,
}));
await expect(
runKtxSetupContextStep(
{ projectDir: tempDir, inputMode: 'disabled' },
io.io,
{
runContextBuild: runContextBuildMock,
},
),
).resolves.toMatchObject({ status: 'ready' });
expect(runContextBuildMock).not.toHaveBeenCalled();
expect(io.stdout()).toContain('Existing context artifacts were found from setup ingest.');
});
it('stores fast context depth non-interactively when deep readiness is missing', async () => {
await writeReadyProject(tempDir, {
connections: { warehouse: { driver: 'postgres', readonly: true } },
llm: { provider: { backend: 'none' }, models: {} },
scan: { enrichment: { mode: 'none' } },
});
const io = makeIo();
const runContextBuildMock = vi.fn<NonNullable<KtxSetupContextDeps['runContextBuild']>>(async () => ({
exitCode: 0,
}));
const verifyContextReady = vi.fn(async () => ({
ready: true,
agentContextReady: true,
semanticSearchReady: true,
details: ['ready'],
}));
await expect(
runKtxSetupContextStep(
{ projectDir: tempDir, inputMode: 'disabled' },
io.io,
{ runContextBuild: runContextBuildMock, verifyContextReady },
),
).resolves.toMatchObject({ status: 'ready' });
const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
expect(config.connections.warehouse.context).toMatchObject({ depth: 'fast' });
expect(runContextBuildMock).toHaveBeenCalledWith(
expect.anything(),
expect.objectContaining({ projectDir: tempDir, inputMode: 'disabled' }),
expect.anything(),
expect.anything(),
);
expect(runContextBuildMock.mock.calls[0]?.[1]).not.toMatchObject({
scanMode: 'enriched',
detectRelationships: true,
});
});
it('prompts for database context depth after final readiness is known', async () => {
await writeReadyProject(tempDir, {
connections: { warehouse: { driver: 'postgres', readonly: true } },
llm: {
provider: { backend: 'gateway', gateway: { api_key: 'env:KTX_GATEWAY_API_KEY' } }, // pragma: allowlist secret
models: { default: 'gpt-test' },
},
scan: {
enrichment: {
mode: 'llm',
embeddings: { backend: 'openai', model: 'text-embedding-3-small', dimensions: 1536 },
},
},
});
const io = makeIo();
const select = vi.fn(async () => 'deep');
const runContextBuildMock = vi.fn(async () => ({ exitCode: 0 }));
const verifyContextReady = vi.fn(async () => ({
ready: true,
agentContextReady: true,
semanticSearchReady: true,
details: ['ready'],
}));
await expect(
runKtxSetupContextStep(
{ projectDir: tempDir, inputMode: 'auto' },
io.io,
{
prompts: { select, cancel: vi.fn() },
runContextBuild: runContextBuildMock,
verifyContextReady,
},
),
).resolves.toMatchObject({ status: 'ready' });
expect(select).toHaveBeenCalledWith(
expect.objectContaining({
message: expect.stringContaining('How much database context should KTX build?'),
}),
);
const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
expect(config.connections.warehouse.context).toMatchObject({ depth: 'deep' });
});
it('requires completed relationships for deep context when relationship discovery is enabled', async () => {
await writeReadyProject(tempDir, {
connections: {
warehouse: { driver: 'postgres', readonly: true, context: { depth: 'deep' } },
}, },
scan: { relationships: { enabled: true } }, scan: { relationships: { enabled: true } },
}); });
@ -560,10 +440,10 @@ describe('setup context build state', () => {
expect(runContextBuildMock).toHaveBeenCalledOnce(); expect(runContextBuildMock).toHaveBeenCalledOnce();
}); });
it('does not require relationships for deep context when relationship discovery is disabled', async () => { it('does not require relationships for database context when relationship discovery is disabled', async () => {
await writeReadyProject(tempDir, { await writeReadyProject(tempDir, {
connections: { connections: {
warehouse: { driver: 'postgres', readonly: true, context: { depth: 'deep' } }, warehouse: { driver: 'postgres', readonly: true },
}, },
scan: { relationships: { enabled: false } }, scan: { relationships: { enabled: false } },
}); });
@ -620,7 +500,7 @@ describe('setup context build state', () => {
it('starts a fresh foreground build when stale state is found', async () => { it('starts a fresh foreground build when stale state is found', async () => {
await writeReadyProject(tempDir, { await writeReadyProject(tempDir, {
connections: { warehouse: { driver: 'postgres', readonly: true, context: { depth: 'fast' } } }, connections: { warehouse: { driver: 'postgres', readonly: true } },
}); });
await writeKtxSetupContextState(tempDir, { await writeKtxSetupContextState(tempDir, {
runId: 'setup-context-local-stale', runId: 'setup-context-local-stale',


@ -262,48 +262,6 @@ describe('setup databases step', () => {
expect(prompts.select).toHaveBeenCalledTimes(1); expect(prompts.select).toHaveBeenCalledTimes(1);
}); });
it('preserves context.depth when editing an existing database connection', async () => {
await writeFile(
join(tempDir, 'ktx.yaml'),
[
'connections:',
' warehouse:',
' driver: sqlite',
' path: ./warehouse.sqlite',
' context:',
' depth: deep',
'',
].join('\n'),
'utf-8',
);
const prompts = makePromptAdapter({
selectValues: ['edit', 'warehouse', 'continue'],
textValues: ['./warehouse.sqlite'],
});
const testConnection = vi.fn(async () => 0);
const scanConnection = vi.fn(async () => 0);
const io = makeIo();
const result = await runKtxSetupDatabasesStep(
{
projectDir: tempDir,
inputMode: 'auto',
skipDatabases: false,
databaseSchemas: [],
disableQueryHistory: true,
},
io.io,
{ prompts, testConnection, scanConnection },
);
expect(result.status, io.stderr()).toBe('ready');
const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
expect(config.connections.warehouse).toMatchObject({
driver: 'sqlite',
path: './warehouse.sqlite',
context: { depth: 'deep' },
});
});
it('labels existing database connections with the database type', async () => { it('labels existing database connections with the database type', async () => {
await writeFile( await writeFile(
join(tempDir, 'ktx.yaml'), join(tempDir, 'ktx.yaml'),
@ -376,7 +334,6 @@ describe('setup databases step', () => {
expect(config.connections['postgres-warehouse']).toEqual({ expect(config.connections['postgres-warehouse']).toEqual({
driver: 'postgres', driver: 'postgres',
url: 'env:DATABASE_URL', url: 'env:DATABASE_URL',
context: { depth: 'fast' },
}); });
}); });
@ -1558,7 +1515,7 @@ describe('setup databases step', () => {
); );
expect(io.stdout()).not.toContain('Tables: 2'); expect(io.stdout()).not.toContain('Tables: 2');
expect(io.stdout()).toContain('◇ Building schema context for postgres-warehouse'); expect(io.stdout()).toContain('◇ Building schema context for postgres-warehouse');
expect(io.stdout()).toContain('│ Running fast database ingest…'); expect(io.stdout()).toContain('│ Running database scan…');
expect(io.stdout()).toContain('◇ Schema context complete for postgres-warehouse'); expect(io.stdout()).toContain('◇ Schema context complete for postgres-warehouse');
expect(io.stdout()).toContain('│ Changes: 2 new tables'); expect(io.stdout()).toContain('│ Changes: 2 new tables');
expect(io.stdout()).toContain('◇ Database ready'); expect(io.stdout()).toContain('◇ Database ready');
@ -1907,7 +1864,7 @@ describe('setup databases step', () => {
driver: 'postgres', driver: 'postgres',
url: 'env:DATABASE_URL', url: 'env:DATABASE_URL',
schemas: ['public'], schemas: ['public'],
context: { queryHistory: { enabled: false }, depth: 'fast' }, context: { queryHistory: { enabled: false } },
}); });
expect(config.setup).toEqual({ expect(config.setup).toEqual({
database_connection_ids: ['warehouse'], database_connection_ids: ['warehouse'],
@ -1946,7 +1903,6 @@ describe('setup databases step', () => {
expect(config.connections.warehouse).toEqual({ expect(config.connections.warehouse).toEqual({
driver: 'sqlite', driver: 'sqlite',
path: './warehouse.sqlite', path: './warehouse.sqlite',
context: { depth: 'fast' },
}); });
expect(config.setup).toEqual({ expect(config.setup).toEqual({
database_connection_ids: ['warehouse'], database_connection_ids: ['warehouse'],
@ -2023,11 +1979,11 @@ describe('setup databases step', () => {
const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8')); const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
expect(config.connections.warehouse).toMatchObject({ driver: 'postgres', url: 'env:DATABASE_URL' }); expect(config.connections.warehouse).toMatchObject({ driver: 'postgres', url: 'env:DATABASE_URL' });
expect(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8')).not.toContain('completed_steps:'); expect(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8')).not.toContain('completed_steps:');
expect(io.stderr()).toContain('Fast database ingest failed for warehouse.'); expect(io.stderr()).toContain('Database scan failed for warehouse.');
expect(io.stderr()).toContain('│ Fast database ingest failed for warehouse.'); expect(io.stderr()).toContain('│ Database scan failed for warehouse.');
expect(io.stderr()).toContain(`Debug command: ktx ingest warehouse --project-dir ${tempDir} --fast --debug`); expect(io.stderr()).toContain(`Debug command: ktx ingest warehouse --project-dir ${tempDir} --debug`);
expect(io.stderr()).not.toContain('Structural scan failed for warehouse.'); expect(io.stderr()).not.toContain('Structural scan failed for warehouse.');
expect(io.stderr()).not.toMatch(/^Fast database ingest failed for warehouse\./m); expect(io.stderr()).not.toMatch(/^Database scan failed for warehouse\./m);
}); });
it('prints the native SQLite rebuild command when scanning hits a Node ABI mismatch', async () => { it('prints the native SQLite rebuild command when scanning hits a Node ABI mismatch', async () => {
@ -2066,7 +2022,7 @@ describe('setup databases step', () => {
expect(io.stderr()).toContain('Native SQLite is built for a different Node.js ABI.'); expect(io.stderr()).toContain('Native SQLite is built for a different Node.js ABI.');
expect(io.stderr()).toContain('│ Native SQLite is built for a different Node.js ABI.'); expect(io.stderr()).toContain('│ Native SQLite is built for a different Node.js ABI.');
expect(io.stderr()).toContain('Fix: pnpm run native:rebuild'); expect(io.stderr()).toContain('Fix: pnpm run native:rebuild');
expect(io.stderr()).toContain(`Retry: ktx ingest warehouse --project-dir ${tempDir} --fast`); expect(io.stderr()).toContain(`Retry: ktx ingest warehouse --project-dir ${tempDir}`);
expect(io.stderr()).not.toContain('ktx scan'); expect(io.stderr()).not.toContain('ktx scan');
expect(io.stderr()).not.toContain('npm rebuild'); expect(io.stderr()).not.toContain('npm rebuild');
expect(io.stderr()).not.toMatch(/^Native SQLite is built for a different Node.js ABI\./m); expect(io.stderr()).not.toMatch(/^Native SQLite is built for a different Node.js ABI\./m);
@ -2364,7 +2320,7 @@ describe('setup databases step', () => {
'utf-8', 'utf-8',
); );
const io = makeIo(); const io = makeIo();
const prompts = makePromptAdapter({ selectValues: ['yes', 'deep'] }); const prompts = makePromptAdapter({ selectValues: ['yes'] });
const runner = fakeHistoricSqlRunner('postgres', 'pg_stat_statements'); const runner = fakeHistoricSqlRunner('postgres', 'pg_stat_statements');
const historicSqlReadinessProbe = vi.fn(async () => ({ const historicSqlReadinessProbe = vi.fn(async () => ({
ok: true as const, ok: true as const,
@ -2399,12 +2355,6 @@ describe('setup databases step', () => {
{ value: 'back', label: 'Back' }, { value: 'back', label: 'Back' },
], ],
}); });
expect(prompts.select).toHaveBeenNthCalledWith(
2,
expect.objectContaining({
message: expect.stringContaining('How much database context should KTX build?'),
}),
);
expect(historicSqlReadinessProbe).toHaveBeenCalledWith( expect(historicSqlReadinessProbe).toHaveBeenCalledWith(
expect.objectContaining({ expect.objectContaining({
projectDir: tempDir, projectDir: tempDir,
@ -2420,7 +2370,6 @@ describe('setup databases step', () => {
minExecutions: 5, minExecutions: 5,
filters: { dropTrivialProbes: true }, filters: { dropTrivialProbes: true },
}, },
depth: 'deep',
}, },
}); });
}); });


@ -185,7 +185,7 @@ describe('standalone built ktx CLI smoke', () => {
expect([0, 1]).toContain(result.code); expect([0, 1]).toContain(result.code);
}); });
it('runs fast public database ingest through the built binary with manifest artifacts', async () => { it('blocks public database ingest through the built binary when enrichment is not configured', async () => {
const projectDir = join(tempDir, 'database-ingest-project'); const projectDir = join(tempDir, 'database-ingest-project');
const init = await runSetupNewProject(projectDir); const init = await runSetupNewProject(projectDir);
expectSetupStderr(init); expectSetupStderr(init);
@ -200,19 +200,10 @@ describe('standalone built ktx CLI smoke', () => {
expect(connectionTest.stdout).toContain('Driver: sqlite'); expect(connectionTest.stdout).toContain('Driver: sqlite');
expect(connectionTest.stdout).toContain('Status: ok'); expect(connectionTest.stdout).toContain('Status: ok');
const ingest = await runBuiltCli(['ingest', 'warehouse', '--project-dir', projectDir, '--fast', '--no-input']); const ingest = await runBuiltCli(['ingest', 'warehouse', '--project-dir', projectDir, '--no-input']);
expectProjectStderr(ingest, projectDir); expect(ingest.code).toBe(1);
expect(ingest.stdout).toContain('Ingest finished'); expect(ingest.stdout).toContain('warehouse cannot be ingested: enrichment is not configured');
expect(ingest.stdout).toContain('warehouse');
expect(ingest.stdout).toContain('Database schema');
expect(ingest.stdout).toContain('warehouse done');
expect(ingest.stdout).not.toContain('KTX scan completed'); expect(ingest.stdout).not.toContain('KTX scan completed');
const manifest = await readFile(join(projectDir, 'semantic-layer/warehouse/_schema/public.yaml'), 'utf-8');
expect(manifest).toContain('customers:');
expect(manifest).toContain('orders:');
expect(manifest).toContain('source: formal');
expect(manifest).not.toContain('ai:');
}, 30_000); }, 30_000);
it('parses gateway LLM config and OpenAI enrichment embeddings used by standalone scans without network calls', async () => { it('parses gateway LLM config and OpenAI enrichment embeddings used by standalone scans without network calls', async () => {


@ -365,7 +365,6 @@
"embeddings", "embeddings",
"secrets", "secrets",
"databases", "databases",
"database-context-depth",
"sources", "sources",
"context", "context",
"agents", "agents",


@ -257,7 +257,7 @@ describe('standalone example docs', () => {
assert.match(primarySources, /context:\n queryHistory:/); assert.match(primarySources, /context:\n queryHistory:/);
assert.match(rootReadme, /`ktx ingest` \| Build context for every configured connection/); assert.match(rootReadme, /`ktx ingest` \| Build context for every configured connection/);
assert.doesNotMatch(rootReadme, /`ktx ingest <id>`/); assert.doesNotMatch(rootReadme, /`ktx ingest <id>`/);
assert.match(quickstart, /Databases:\n warehouse: deep context complete/); assert.match(quickstart, /Databases:\n warehouse: database context complete/);
assert.match(quickstart, /Databases configured: yes \(warehouse\)/); assert.match(quickstart, /Databases configured: yes \(warehouse\)/);
assert.match(setupReference, /Databases configured: yes \(postgres-warehouse\)/); assert.match(setupReference, /Databases configured: yes \(postgres-warehouse\)/);
assert.doesNotMatch(rootReadme, new RegExp(['Primary sources', 'configured'].join(' '))); assert.doesNotMatch(rootReadme, new RegExp(['Primary sources', 'configured'].join(' ')));


@ -106,7 +106,6 @@ export function buildLiveDatabaseIngestArgs(projectDir, _databaseIntrospectionUr
connectionId, connectionId,
'--project-dir', '--project-dir',
projectDir, projectDir,
'--fast',
'--no-input', '--no-input',
]; ];
} }
@ -152,20 +151,20 @@ function requireSuccess(label, result) {
} }
} }
function requireFailure(label, result) {
if (result.code === 0) {
throw new Error(
`${label} unexpectedly succeeded\nstdout:\n${result.stdout}\nstderr:\n${result.stderr}`,
);
}
}
function requireOutput(label, result, pattern) { function requireOutput(label, result, pattern) {
if (!pattern.test(result.stdout)) { if (!pattern.test(result.stdout)) {
throw new Error(`${label} output did not match ${pattern}\nstdout:\n${result.stdout}`); throw new Error(`${label} output did not match ${pattern}\nstdout:\n${result.stdout}`);
} }
} }
function getRunId(stdout) {
const match = stdout.match(/^Run: (.+)$/m);
if (!match) {
throw new Error(`ingest output did not include a run id\nstdout:\n${stdout}`);
}
return match[1];
}
async function requireDocker() { async function requireDocker() {
const result = await run('docker', ['info'], { timeout: 20_000 }); const result = await run('docker', ['info'], { timeout: 20_000 });
if (result.code !== 0) { if (result.code !== 0) {
@ -310,13 +309,17 @@ async function main() {
env: managedRuntimeEnv(cleanInstallDir), env: managedRuntimeEnv(cleanInstallDir),
timeout: 120_000, timeout: 120_000,
}); });
requireSuccess('ktx ingest warehouse --fast', ingestRun); // ktx ingest now always builds enriched context and requires a configured
requireOutput('ktx ingest warehouse --fast', ingestRun, /Ingest finished/); // model and embeddings. This smoke project has neither, so the database
requireOutput('ktx ingest warehouse --fast', ingestRun, /Database schema/); // target fails the enrichment-readiness preflight before any work runs.
// This still exercises the packaged binary, daemon startup, and the live
// database connection end to end.
requireFailure('ktx ingest warehouse', ingestRun);
requireOutput('ktx ingest warehouse', ingestRun, /Ingest finished with partial failures/);
requireOutput('ktx ingest warehouse', ingestRun, /enrichment is not configured/);
const runId = getRunId(ingestRun.stdout);
await assertPathExists(join(projectDir, '.ktx', 'db.sqlite'), 'SQLite local ingest state'); await assertPathExists(join(projectDir, '.ktx', 'db.sqlite'), 'SQLite local ingest state');
process.stdout.write(`Installed live-database artifact smoke passed: ${runId}\n`); process.stdout.write('Installed live-database artifact smoke passed: enrichment-readiness guard verified\n');
} finally { } finally {
if (daemonStarted && cleanInstallDir) { if (daemonStarted && cleanInstallDir) {
await stopDaemon(cleanInstallDir); await stopDaemon(cleanInstallDir);
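
Reviewer note: the comment in the hunk above describes the enrichment-readiness preflight that both smoke scripts now assert. The block below is a minimal sketch of that gate, not the code from this commit. The config shape, the driver set, and the names `sketchReadinessGaps` and `sketchPreflight` are assumptions made for illustration; only the `llm` and `scan.enrichment` field names and the two message fragments are taken from the tests and the commit message.

```typescript
// Hypothetical config shape for illustration; the real KtxProjectConfig is
// not shown in this diff and may differ.
interface SketchConfig {
  connections: Record<string, { driver: string }>;
  llm: { provider: { backend: string }; models: { default?: string } };
  scan: { enrichment: { mode: string; embeddings?: { backend: string; model: string } } };
}

interface SketchTargetResult {
  connectionId: string;
  status: 'ready' | 'blocked';
  message?: string;
}

// Assumption: only these drivers count as database targets in this sketch.
const SKETCH_DATABASE_DRIVERS = new Set(['postgres', 'sqlite']);

// Lists what is missing for enrichment; an empty list means ready.
function sketchReadinessGaps(config: SketchConfig): string[] {
  const gaps: string[] = [];
  if (config.llm.provider.backend === 'none' || !config.llm.models.default) {
    gaps.push('model');
  }
  if (config.scan.enrichment.mode === 'none' || !config.scan.enrichment.embeddings) {
    gaps.push('embeddings');
  }
  return gaps;
}

// Runs the gate for every target before any work starts. Database targets
// with gaps are blocked; non-database targets are unaffected.
function sketchPreflight(config: SketchConfig, connectionIds: string[]): SketchTargetResult[] {
  const gaps = sketchReadinessGaps(config);
  return connectionIds.map((connectionId): SketchTargetResult => {
    const driver = config.connections[connectionId]?.driver ?? '';
    if (SKETCH_DATABASE_DRIVERS.has(driver) && gaps.length > 0) {
      return {
        connectionId,
        status: 'blocked',
        // Message wording combines strings seen in the tests and the commit
        // message; the exact CLI output may be formatted differently.
        message:
          `${connectionId} cannot be ingested: enrichment is not configured. ` +
          'Run ktx setup to configure a model and embeddings.',
      };
    }
    return { connectionId, status: 'ready' };
  });
}

// Example: a project with no model and no embeddings, one database, one Notion source.
const sketchConfig: SketchConfig = {
  connections: { warehouse: { driver: 'postgres' }, docs: { driver: 'notion' } },
  llm: { provider: { backend: 'none' }, models: {} },
  scan: { enrichment: { mode: 'none' } },
};
const sketchResults = sketchPreflight(sketchConfig, ['warehouse', 'docs']);
console.log(sketchResults);
```

In this sketch `warehouse` is blocked while `docs` stays ready. That matches the behaviour the `--all` unit test expects, and the smoke scripts see the same thing from outside as a non-zero exit plus `Ingest finished with partial failures`.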


@ -100,7 +100,6 @@ describe('installed live-database artifact smoke helpers', () => {
'warehouse', 'warehouse',
'--project-dir', '--project-dir',
'/tmp/project', '/tmp/project',
'--fast',
'--no-input', '--no-input',
]); ]);


@ -512,15 +512,6 @@ function requireSuccess(label, result) {
assert.equal(result.stderr, '', label + ' wrote unexpected stderr'); assert.equal(result.stderr, '', label + ' wrote unexpected stderr');
} }
function requireSuccessWithProjectStderr(label, result, projectDir) {
assert.equal(
result.code,
0,
label + ' failed with code ' + result.code + '\\nstdout:\\n' + result.stdout + '\\nstderr:\\n' + result.stderr,
);
assert.equal(result.stderr, 'Project: ' + projectDir + '\\n', label + ' wrote unexpected stderr');
}
function requireExitCodeWithProjectStderr(label, result, projectDir, expectedCode) { function requireExitCodeWithProjectStderr(label, result, projectDir, expectedCode) {
assert.equal( assert.equal(
result.code, result.code,
@ -860,27 +851,15 @@ try {
requireOutput('ktx admin runtime stop', runtimeStop, /Stopped KTX daemon/); requireOutput('ktx admin runtime stop', runtimeStop, /Stopped KTX daemon/);
process.stdout.write('ktx admin runtime daemon lifecycle verified\\n'); process.stdout.write('ktx admin runtime daemon lifecycle verified\\n');
const structuralScan = await run( const databaseIngest = await run(
...Object.values( ...Object.values(
pnpmCommand(['exec', 'ktx', 'ingest', 'warehouse', '--project-dir', projectDir, '--fast', '--no-input']), pnpmCommand(['exec', 'ktx', 'ingest', 'warehouse', '--project-dir', projectDir, '--no-input']),
), ),
); );
requireSuccessWithProjectStderr('ktx ingest fast', structuralScan, projectDir); requireExitCodeWithProjectStderr('ktx ingest enrichment guard', databaseIngest, projectDir, 1);
requireOutput('ktx ingest fast', structuralScan, /Ingest finished/); requireOutput('ktx ingest enrichment guard', databaseIngest, /Ingest finished with partial failures/);
requireOutput('ktx ingest fast', structuralScan, /Database schema/); requireOutput('ktx ingest enrichment guard', databaseIngest, /enrichment is not configured/);
requireOutput('ktx ingest fast', structuralScan, /warehouse\\s+done/); process.stdout.write('ktx ingest enrichment guard verified\\n');
await access(join(projectDir, 'semantic-layer', 'warehouse', '_schema', 'public.yaml'));
process.stdout.write('ktx ingest fast verified\\n');
const enrichedScan = await run(
...Object.values(
pnpmCommand(['exec', 'ktx', 'ingest', 'warehouse', '--project-dir', projectDir, '--deep', '--no-input']),
),
);
requireExitCodeWithProjectStderr('ktx ingest deep readiness guard', enrichedScan, projectDir, 1);
requireOutput('ktx ingest deep readiness guard', enrichedScan, /Ingest finished with partial failures/);
requireOutput('ktx ingest deep readiness guard', enrichedScan, /requires deep ingest readiness/);
process.stdout.write('ktx ingest deep readiness guard verified\\n');
await access(join(projectDir, '.ktx', 'db.sqlite')); await access(join(projectDir, '.ktx', 'db.sqlite'));
process.stdout.write('ktx ingest state verified\\n'); process.stdout.write('ktx ingest state verified\\n');


@ -530,10 +530,11 @@ describe('verification snippets', () => {
assert.doesNotMatch(source, /ktx admin runtime prune/); assert.doesNotMatch(source, /ktx admin runtime prune/);
assert.doesNotMatch(source, /staleRuntimeDir/); assert.doesNotMatch(source, /staleRuntimeDir/);
assert.match(source, /pnpmCommand\(\['exec', 'ktx', 'ingest', 'warehouse'/); assert.match(source, /pnpmCommand\(\['exec', 'ktx', 'ingest', 'warehouse'/);
assert.match(source, /'--deep'/); assert.doesNotMatch(source, /'--fast'/);
assert.doesNotMatch(source, /'--deep'/);
assert.doesNotMatch(source, /'--enrich'/); assert.doesNotMatch(source, /'--enrich'/);
assert.match(source, /ktx ingest fast verified/); assert.match(source, /ktx ingest enrichment guard verified/);
assert.match(source, /ktx ingest deep readiness guard verified/); assert.match(source, /enrichment is not configured/);
assert.match(source, /enrichment:/); assert.match(source, /enrichment:/);
assert.match(source, /mode: deterministic/); assert.match(source, /mode: deterministic/);
assert.doesNotMatch(source, /run\('pnpm', \['exec', 'ktx', 'ingest', 'run'/); assert.doesNotMatch(source, /run\('pnpm', \['exec', 'ktx', 'ingest', 'run'/);


@ -87,16 +87,17 @@ Do not discover these inputs across multiple setup runs.
pass the database flags from the previous run** — setup validates current pass the database flags from the previous run** — setup validates current
flags, not persisted `ktx.yaml` state. flags, not persisted `ktx.yaml` state.
4. **Run fast ingest** if setup did not already complete one: 4. **Build context** if setup did not already complete one:
```bash ```bash
ktx ingest <connection-id> --fast --no-input ktx ingest <connection-id> --no-input
``` ```
- Note: `ktx ingest` rejects `--yes` together with `--no-input`
- (*Choose only one runtime install mode*); `ktx setup` accepts both. Use
- `--no-input` only for ingest. Do not run `--deep` ingest unless the user
- explicitly asks for LLM-backed enrichment.
+ `ktx ingest` always builds enriched context and requires a configured model
+ and embeddings (set during setup); a database connection without them fails
+ with an enrichment-readiness error. Note: `ktx ingest` rejects `--yes`
+ together with `--no-input` (*Choose only one runtime install mode*);
+ `ktx setup` accepts both. Use `--no-input` only for ingest.
5. **Install agent integration:** 5. **Install agent integration:**
@ -151,7 +152,7 @@ Notes:
`--notion-root-page-id` (repeatable); use `all_accessible` to crawl `--notion-root-page-id` (repeatable); use `all_accessible` to crawl
everything the token can see. everything the token can see.
- After adding sources, ingest each new connection so its context is queryable: - After adding sources, ingest each new connection so its context is queryable:
`ktx ingest <source-connection-id> --fast --no-input`. `ktx ingest <source-connection-id> --no-input`.
## Files to inspect ## Files to inspect