mirror of https://github.com/Kaelio/ktx.git, synced 2026-06-19 08:28:06 +02:00

feat(cli)!: remove fast mode; ktx ingest always builds enriched context (KLO-721) (#237)

Fast mode (the ktx ingest --fast/--deep database-ingest depth toggle) is removed.
ktx ingest now always builds the full enriched ("deep") context. There is no
structural fallback: a database connection without a configured model and
embeddings fails the enrichment-readiness preflight before any work runs, with
a 'Run ktx setup to configure a model and embeddings' hint.

- Remove --fast/--deep flags, the per-connection context.depth field, and the
  ktx setup depth prompt (delete setup-database-context-depth.ts).
- Rename ingest-depth.ts -> connection-drivers.ts; ingest always requests scan
  mode 'enriched'; readiness gate (enrichmentReadinessGaps) runs for every
  database target.
- Drop the database-context-depth telemetry step (Node + Python schema mirrors
  regenerated).
- Update CLI, setup, context-build view, docs, the public ktx skill, and the
  release-smoke / artifacts scripts (now assert the no-LLM guard failure).

ktx status --fast (a separate network-probe flag) is unchanged.

Follow-ups: KLO-726 (live progress for ktx ingest --all), KLO-727 (restore
credentialed successful-ingest release smoke coverage).
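
The preflight behavior described in this message can be pictured with a small TypeScript sketch. This is not the ktx implementation: the `ProjectConfig` shape, the `readinessGaps` and `preflight` helpers, and the gap labels are hypothetical, and only the hint text is taken from the message above. The idea it illustrates is collecting every missing prerequisite up front and failing before any scan work starts, with no structural fallback.

```typescript
// Hypothetical config shape; the real ktx.yaml schema may differ.
interface ProjectConfig {
  model?: string;
  embeddings?: string;
}

// Collect every gap instead of stopping at the first one,
// so a single error can list everything that must be configured.
function readinessGaps(config: ProjectConfig): string[] {
  const gaps: string[] = [];
  if (!config.model) gaps.push("model");
  if (!config.embeddings) gaps.push("embeddings");
  return gaps;
}

// Fail before any ingest work runs; there is no schema-only fallback.
function preflight(connectionId: string, config: ProjectConfig): void {
  const gaps = readinessGaps(config);
  if (gaps.length > 0) {
    throw new Error(
      `${connectionId}: missing ${gaps.join(", ")}. ` +
        "Run ktx setup to configure a model and embeddings",
    );
  }
}
```

Under these assumptions, a caller would run `preflight` once per database target before scheduling any scan, so a misconfigured connection is rejected before work begins rather than partway through.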

parent 637891f030
commit 3f0d11e07d
34 changed files with 222 additions and 884 deletions

@@ -5,9 +5,11 @@ description: "Build or refresh ktx context, or capture text into ktx memory."
 
 `ktx ingest` builds or refreshes **ktx** context from configured connections, and
 can also capture free-form text into **ktx** memory. Database connections build
-schema context. Context-source connections ingest metadata from tools such as
-dbt, Looker, Metabase, MetricFlow, LookML, and Notion. Pass `--text` or
-`--file` to capture inline text or text files into memory instead.
+enriched context — schema plus AI-generated descriptions, embeddings, and
+relationship evidence — and require a configured model and embeddings.
+Context-source connections ingest metadata from tools such as dbt, Looker,
+Metabase, MetricFlow, LookML, and Notion. Pass `--text` or `--file` to capture
+inline text or text files into memory instead.
 
 ## Command signature
 
@@ -29,8 +31,6 @@ connection is selected.
 | Flag | Description | Default |
 |------|-------------|---------|
 | `--all` | Ingest all configured connections (same as bare invocation) | `false` |
-| `--fast` | Use deterministic fast database ingest | Stored connection default, or `fast` |
-| `--deep` | Use deep database ingest with AI-generated descriptions, embeddings, and relationship evidence | Stored connection default, or `fast` |
 | `--query-history` | Include database query-history usage patterns | Stored connection default |
 | `--no-query-history` | Skip database query-history usage patterns for this run | Stored connection default |
 | `--query-history-window-days <days>` | BigQuery/Snowflake query-history lookback window for this run | Stored connection default |
@@ -44,12 +44,12 @@ connection is selected.
 | `--yes` | Install required managed runtime features without prompting | `false` |
 | `--no-input` | Disable interactive terminal input | - |
 
-`--fast` and `--deep` are mutually exclusive. Depth flags apply only to
-database connections. Query-history flags apply only to database connections
+Database ingest always builds enriched context and requires a configured model
+and embeddings (run `ktx setup`); connections without that configuration fail
+before any work starts. Query-history flags apply only to database connections
 that support query history. The window flag applies to BigQuery and Snowflake;
 Postgres reads the current `pg_stat_statements` aggregate data instead of a
-time-windowed history table. Query-history ingest runs after fast ingest and
-requires deep ingest readiness.
+time-windowed history table. Query-history ingest runs after the schema scan.
 
 When more than one connection is selected, database ingest runs first, then
 context-source ingest and memory updates run for context-source connections.
@@ -72,14 +72,8 @@ ktx ingest
 # Build one database or context-source connection
 ktx ingest warehouse
 
-# Force deterministic fast database ingest
-ktx ingest warehouse --fast
-
-# Force deep database ingest with AI enrichment
-ktx ingest warehouse --deep
-
 # Include query-history usage patterns
-ktx ingest warehouse --deep --query-history
+ktx ingest warehouse --query-history
 # Set the lookback window for BigQuery or Snowflake query history
 ktx ingest warehouse --query-history-window-days 30
 
@@ -154,8 +148,8 @@ KTX_INGEST_TRACE_LEVEL=trace ktx ingest metabase
 | Error | Cause | Recovery |
 |-------|-------|----------|
 | Connection not configured | The connection id is not present in `ktx.yaml` | Add the connection with `ktx setup` or update `ktx.yaml` |
-| Deep readiness is missing | `--deep` or query history needs model, embedding, and scan-enrichment configuration | Run `ktx setup` or rerun with `--fast` |
-| Query history is unsupported | The selected database driver does not support query history | Run fast ingest without query-history flags |
+| Enrichment is not configured | Database ingest needs a model, embeddings, and scan-enrichment configuration | Run `ktx setup` to configure a model and embeddings |
+| Query history is unsupported | The selected database driver does not support query history | Run ingest without query-history flags |
 | Python runtime is missing | The selected ingest target needs runtime-backed SQL analysis or source parsing | Accept the interactive prompt, rerun with `--yes`, or run the suggested `ktx admin runtime install` command |
-| Context-source options were ignored | Depth and query-history flags were supplied for a context-source connection | Omit database-only flags when ingesting context-source connections |
+| Context-source options were ignored | Query-history flags were supplied for a context-source connection | Omit database-only flags when ingesting context-source connections |
 | Text ingest stops early | `--fail-fast` was used and one item failed | Fix the failed item or rerun without `--fail-fast` to collect all failures |

@@ -131,8 +131,8 @@ BigQuery; and `databases` for ClickHouse.
 Query history setup is supported for Postgres, BigQuery, and Snowflake. The
 window flag applies to BigQuery and Snowflake; Postgres reads the current
 `pg_stat_statements` aggregate data instead of a time-windowed history table.
-Enabling query history makes deep ingest readiness matter for later
-`ktx ingest` runs.
+Later `ktx ingest` runs build enriched context and need a configured model and
+embeddings, including when query history is enabled.
 
 When query history is enabled for PostgreSQL, Snowflake, or BigQuery,
 `ktx setup` runs a non-blocking readiness probe after the connection test

@@ -66,8 +66,9 @@ read, how to think, and where to put the results.
 ## Minimal config
 
 A working `ktx.yaml` needs one entry in `connections`. Everything else accepts
-defaults. The example below is enough for `ktx ingest warehouse` to run a fast
-schema scan against a local Postgres.
+defaults. The example below registers a local Postgres connection; building
+context with `ktx ingest warehouse` also needs a model and embeddings, which
+`ktx setup` configures.
 
 ```yaml
 connections:
@@ -123,7 +124,7 @@ context-source drivers share the map.
 
 Warehouse connections are open objects: the listed fields are validated, and
 any other field is preserved and passed through to the connector. Use
-`enabled_tables` to scope deep ingest to a specific list of
+`enabled_tables` to scope ingest to a specific list of
 `schema.table` names - useful for smoke tests.
 
 ```yaml

@@ -236,7 +236,7 @@ Testing warehouse
 Connection test passed
 
 Building schema context for warehouse
-Running fast database ingest
+Running database scan
 ```
 
 If setup exits early, rerun `ktx setup` in the same directory. **ktx** keeps
@@ -268,13 +268,13 @@ Agent integration ready: yes (codex:project)
 
 For a structured check inside scripts, use `ktx status --json`.
 
-When setup builds deep context, its final context check looks like:
+When setup finishes building context, its final context check looks like:
 
 ```text
 ktx context is ready for agents.
 
 Databases:
-warehouse: deep context complete
+warehouse: database context complete
 
 Context sources:
 dbt_main: memory update complete
@@ -326,7 +326,7 @@ ktx setup \
 Then build context:
 
 ```bash
-ktx ingest warehouse --fast
+ktx ingest warehouse
 ```
 
 See [ktx setup](/docs/cli-reference/ktx-setup) for the full automation flag

@@ -24,7 +24,9 @@ external metadata can attach to known warehouse tables.
 
 ## Database ingest
 
-Database ingest records table, column, type, constraint, and row-count context.
+Database ingest always builds enriched context: tables, columns, types,
+constraints, and row counts, plus AI-generated descriptions, embeddings, and
+relationship evidence.
 
 ```bash
 # Build one configured database connection
@@ -34,23 +36,8 @@ ktx ingest warehouse
 ktx ingest --all
 ```
 
-Depth controls how much context **ktx** builds:
-
-| Flag | Best for | What it does |
-|------|----------|--------------|
-| `--fast` | First setup, quick refreshes, CI smoke checks | Deterministic fast ingest with tables, columns, types, constraints, and row counts |
-| `--deep` | Agent-ready context for real analysis | Fast ingest plus deep enrichment with descriptions, embeddings, relationship evidence, and optional query history |
-
-Examples:
-
-```bash
-ktx ingest warehouse --fast
-ktx ingest warehouse --deep
-ktx ingest --all --deep
-```
-
-Deep ingest needs LLM and embedding readiness. Otherwise run `ktx setup` or use
-`--fast`.
+Enriched ingest needs a configured model and embeddings. Run `ktx setup` first;
+connections without that configuration fail before any work starts.
 
 With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools for the
 current run.
@@ -64,7 +51,7 @@ Enable it during setup, store it under `connections.<id>.context.queryHistory`,
 or request it for one run:
 
 ```bash
-ktx ingest warehouse --deep --query-history
+ktx ingest warehouse --query-history
 # Set the lookback window for BigQuery or Snowflake query history
 ktx ingest warehouse --query-history-window-days 30
 ```
@@ -74,8 +61,8 @@ for one run.
 
 ## Relationship evidence
 
-**ktx** scores relationship candidates during supported deep database ingest. The
-public CLI does not expose separate relationship review subcommands.
+**ktx** scores relationship candidates during database ingest. The public CLI
+does not expose separate relationship review subcommands.
 
 ## Context-source ingest
 
@@ -159,7 +146,7 @@ After interactive setup:
 
 ```bash
 ktx status
-ktx ingest --all --deep
+ktx ingest --all
 ktx status
 ```
 
@@ -176,8 +163,8 @@ ktx wiki "revenue" --json --limit 10
 | Symptom | Likely cause | Recovery |
 |---------|--------------|----------|
 | Connection not configured | The connection id is missing from `ktx.yaml` | Add it with `ktx setup` |
-| Deep readiness is missing | LLM or embeddings are not setup-ready | Run `ktx setup`, or rerun with `--fast` |
-| Query history is unsupported | The selected database driver does not expose query history | Run fast ingest without query-history flags |
+| Enrichment is not configured | LLM or embeddings are not setup-ready | Run `ktx setup` to configure a model and embeddings |
+| Query history is unsupported | The selected database driver does not expose query history | Run ingest without query-history flags |
 | No connections configured | The project has no entries under `connections` | Run `ktx setup` and add a database or context-source connection |
-| Context-source flags have no effect | Depth and query-history flags were supplied for a context-source connector | Use those flags only for database connections |
+| Context-source flags have no effect | Query-history flags were supplied for a context-source connector | Use query-history flags only for database connections |
 | Text ingest stops early | `--fail-fast` stopped on the first failed item | Fix the item or rerun without `--fail-fast` |

@@ -111,12 +111,13 @@ non-obvious terms.
 Agents can refresh context when the user asks them to:
 
 ```bash
-ktx ingest warehouse --fast
+ktx ingest warehouse
 ktx ingest
 ktx ingest --file docs/revenue-notes.md --connection-id warehouse
 ```
 
-Use `--deep` only when LLM and embedding setup is ready.
+Database ingest builds enriched context and requires a configured model and
+embeddings; run `ktx setup` first if they are not ready.
 
 ## Good agent behavior
 

@@ -517,5 +517,5 @@ No authentication required - SQLite is file-based. The file must be readable by
 | Connection URL appears in git diff | A literal credential URL was written to `ktx.yaml` | Replace it with `env:NAME` or `file:/path/to/secret` and rotate exposed credentials |
 | Database ingest returns no tables | Schema, database, or project filter is wrong, or the user lacks metadata permissions | Verify the schema list and grant metadata read permissions |
 | Query history is empty | Query history extension or warehouse history view is unavailable | Enable the warehouse-specific history feature, then rerun `ktx ingest <connectionId> --query-history` or `ktx setup` |
-| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on fast schema context |
+| Column statistics are missing | Connector cannot access stats tables or the warehouse does not expose them | Grant stats permissions where supported; otherwise rely on schema-level context without column statistics |
 | Semantic query execution fails | Connection is missing, unreachable, or query execution is disabled | Run `ktx connection test <id>` and check the `ktx sl query` flags |