From 63f6d645e93c037982f40ea79a76df3ab68b335b Mon Sep 17 00:00:00 2001 From: Andrey Avtomonov Date: Wed, 13 May 2026 17:28:08 +0200 Subject: [PATCH] Refine unified ingest UX design after adversarial review iteration 1 --- .../2026-05-13-unified-ingest-ux-design.md | 117 +++++++++++++++--- 1 file changed, 103 insertions(+), 14 deletions(-) diff --git a/docs/superpowers/specs/2026-05-13-unified-ingest-ux-design.md b/docs/superpowers/specs/2026-05-13-unified-ingest-ux-design.md index 2cdf850d..ea50e856 100644 --- a/docs/superpowers/specs/2026-05-13-unified-ingest-ux-design.md +++ b/docs/superpowers/specs/2026-05-13-unified-ingest-ux-design.md @@ -94,11 +94,32 @@ connections, not adapters. `ktx scan` is no longer a documented public command. Database schema scanning continues as an internal phase of database ingest. +Stored report inspection is separate from live context-build control. +`ktx ingest status [runId]`, `ktx ingest replay `, and `--report-file` +remain valid report-viewing surfaces unless the implementation plan replaces +them with an equivalent status command. `ktx ingest watch` is no longer a normal +public verb because `watch` conflicts with the foreground-only model. If a +stored-report visual replay remains useful, expose it as `replay` or hide it +under an advanced/debug namespace. + ## Database ingest depth Database ingest always includes a schema baseline. The depth controls how much extra work KTX may perform. +Depth is the public abstraction over the current scan engine: + +- `fast` maps to `KtxScanMode: structural` with `detectRelationships: false`. +- `deep` maps to `KtxScanMode: enriched` with `detectRelationships: true`. +- The internal `relationships` scan mode remains an advanced implementation + detail. It is not a separate public depth in this v1. + +Deep mode includes relationship discovery when the project's +`scan.relationships.enabled` setting is true. Relationship validation thresholds +and budgets remain governed by the existing internal `scan.relationships` +configuration; users do not get a separate public relationship flag in this +surface. + ### Fast `--fast` means KTX builds deterministic schema context quickly. @@ -119,18 +140,24 @@ large unknown warehouses. ### Deep -`--deep` means KTX builds richer database context and may use slower -capabilities. +`--deep` means KTX builds richer database context through the enriched scan path +and uses slower capabilities. -- May use LLMs and embeddings when configured. +- Requires LLM, embedding, and scan-enrichment readiness before work starts. +- Generates table and column descriptions. +- Generates embeddings. - May sample or query data through read-only connector capabilities. -- May generate table and column descriptions. -- May discover and validate relationships. +- Discovers and validates relationships when relationship discovery is enabled. - May process query history into usage patterns when query history is enabled. Deep mode is the best agent-readiness mode, but it can take longer and can require model, embedding, and database permissions. +KTX must not silently downgrade an explicit or stored `deep` request to `fast`. +If the project is missing the model, embedding, or scan-enrichment configuration +required for deep ingest, KTX errors before starting the run and tells the user +to run `ktx setup` or rerun with `--fast`. + ### Flag rules `--fast` and `--deep` are mutually exclusive. Passing both is an error. @@ -218,15 +245,39 @@ connections: enabled: true windowDays: 90 minExecutions: 5 - serviceAccountPatterns: - - "^svc_" + filters: + dropTrivialProbes: true + serviceAccounts: + mode: exclude + patterns: + - "^svc_" redactionPatterns: [] ``` -`ingest.adapters` is no longer normal user config. The implementation plan can -remove it from generated config or keep it as an internal advanced override. -KTX must not require users to list `live-database` to ingest a database -connection. +`context.queryHistory` is the canonical user-facing shape. Runtime code maps it +to the existing historic-SQL pull config as follows: + +- `dialect` is derived from the database driver (`postgres`, `bigquery`, or + `snowflake`) and is not normally user-authored. +- `windowDays`, `minExecutions`, and `redactionPatterns` copy through directly. +- `filters.dropTrivialProbes` defaults to `true`. +- `filters.serviceAccounts.patterns` and `filters.serviceAccounts.mode` map to + the existing service-account filter fields. The default mode is `exclude`. + +Existing `connection.historicSql` blocks are legacy cutover input. Setup or the +config rewrite path must migrate them into `connection.context.queryHistory` +while preserving `windowDays`, `minExecutions`, `redactionPatterns`, +`filters.dropTrivialProbes`, and service-account `patterns` and `mode`. If both +`context.queryHistory` and `historicSql` are present, `context.queryHistory` +wins and KTX emits a config-cleanup warning instead of running both. + +`ingest.adapters` is no longer normal user config. Existing `ingest.adapters` +entries load as advanced/internal overrides during the transition, but +`live-database` and `historic-sql` entries must not be required for public +`ktx ingest ` behavior, must not be regenerated in normal +`ktx.yaml`, and must not cause config-load failure solely because they are +present. The implementation plan can remove adapter parsing after checked-in +configs and examples no longer need it. ## Setup flow @@ -240,7 +291,7 @@ connection is configured or when setup reaches the context-build step: How much database context should KTX build? Fast: schema only, no AI, quickest -Deep: richer context, may use AI and take longer +Deep: AI descriptions, embeddings, relationships, slower ``` The recommended selection depends on readiness: @@ -253,6 +304,22 @@ foreground context build uses that stored default. Setup can still expose a non-prominent automation flag later, such as `--context-depth fast`, if headless setup needs it, but the main product surface is guided. +Setup readiness is depth-aware: + +- For `fast`, a database context is ready when the latest non-dry-run + structural scan for the connection completed and wrote schema manifest shards. + Model, embedding, description-enrichment, and scan-enrichment checks are + skipped for fast contexts. +- For `deep`, a database context is ready only when the enriched scan completed + table descriptions, column descriptions, embeddings, and schema manifest + shards. Relationship artifacts are also required when relationship discovery + is enabled. + +The missing-input gate uses the same rule. Missing model, embedding, or +scan-enrichment configuration must not block a user who selected `fast`. The +same missing inputs must block `deep` before the foreground build starts, with a +message that offers `fast` as the no-AI path. + ## Foreground progress UX KTX keeps a rich foreground progress view. It removes detach and background @@ -344,10 +411,22 @@ Warnings are non-fatal when KTX can still perform the requested ingest. - Ignored query-history flag on an unsupported database: warn and continue if schema ingest can run. - Both `--fast` and `--deep`: error before any work starts. +- Explicit or stored `deep` without required model, embedding, or + scan-enrichment readiness: error before any work starts. +- `--query-history` without required model, embedding, or scan-enrichment + readiness: error before any work starts because query history upgrades the + run to `deep`. - Query-history requested without required grants: fail that query-history facet and keep schema results when schema ingest succeeded. - Database schema ingest failure: fail that database target. +`--all` isolates target failures. It runs all database targets first, then all +source targets, even when one or more database targets fail. Source targets may +therefore run against previously completed database context if the current +database refresh failed. The final exit code is non-zero when any target or +required facet fails, and the summary identifies partial failures by +connection. + Failure messages focus on the connection and user action: ```text @@ -364,11 +443,23 @@ The implementation is complete when these conditions hold: - `ktx ingest ` works for database and source connections. - `ktx ingest --all` runs database targets before source targets. - `--fast` and `--deep` control database depth and are mutually exclusive. +- `--fast` maps to structural database ingest without relationship detection. +- `--deep` maps to enriched database ingest with relationship detection enabled. +- `--deep` and `--query-history` fail before work starts when required model, + embedding, or scan-enrichment configuration is missing. +- `ktx ingest --all` continues independent targets after partial failures and + exits non-zero when any target or required facet fails. - `ktx setup` stores a database context depth without exposing top-level `--fast` or `--deep`. +- `ktx setup` treats fast database context as ready after completed structural + schema ingest and does not require AI descriptions or embeddings for fast. - Generated `ktx.yaml` does not include `live-database` for normal projects. +- Generated `ktx.yaml` uses `connections..context.queryHistory`, not + `connections..historicSql`, for query-history configuration. - Normal CLI help and output do not mention `live-database`. - Normal CLI help and output do not present `scan` as a public verb. +- Normal CLI help and output do not present `ktx ingest watch` as live context + build control. - Query history is optional, connection-local, and overridable per ingest run. - Context build has no detach, attach, watch, resume, stop, or background execution path. @@ -384,8 +475,6 @@ The implementation plan must decide these lower-level details: temporary undocumented debug command. - Whether old `ktx ingest run --connection-id ... --adapter ...` is removed, hidden, or moved to `ktx dev ingest run`. -- Whether `ingest.adapters` is removed from config parsing or retained as an - advanced override. - Whether internal artifact paths keep `raw-sources//live-database` for the first implementation. - Whether setup needs a headless `--context-depth fast|deep` flag for CI.