mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
Fast mode (the ktx ingest --fast/--deep database-ingest depth toggle) is removed.
ktx ingest now always builds the full enriched ("deep") context. There is no
structural fallback: a database connection without a configured model and
embeddings fails the enrichment-readiness preflight before any work runs, with
a 'Run ktx setup to configure a model and embeddings' hint.
- Remove --fast/--deep flags, the per-connection context.depth field, and the
ktx setup depth prompt (delete setup-database-context-depth.ts).
- Rename ingest-depth.ts -> connection-drivers.ts; ingest always requests scan
mode 'enriched'; readiness gate (enrichmentReadinessGaps) runs for every
database target.
- Drop the database-context-depth telemetry step (Node + Python schema mirrors
regenerated).
- Update CLI, setup, context-build view, docs, the public ktx skill, and the
release-smoke / artifacts scripts (now assert the no-LLM guard failure).
ktx status --fast (a separate network-probe flag) is unchanged.
Follow-ups: KLO-726 (live progress for ktx ingest --all), KLO-727 (restore
credentialed successful-ingest release smoke coverage).
155 lines
6.7 KiB
Text
---
title: "ktx ingest"
description: "Build or refresh ktx context, or capture text into ktx memory."
---

`ktx ingest` builds or refreshes **ktx** context from configured connections, and
can also capture free-form text into **ktx** memory. Database connections build
enriched context (schema plus AI-generated descriptions, embeddings, and
relationship evidence) and require a configured model and embeddings.
Context-source connections ingest metadata from tools such as dbt, Looker,
Metabase, MetricFlow, LookML, and Notion. Pass `--text` or `--file` to capture
inline text or text files into memory instead.

## Command signature

```bash
ktx ingest [options] [connectionId]
```

- Bare `ktx ingest` (no positional, no `--all`) ingests every configured
  connection.
- `ktx ingest <connectionId>` ingests one configured connection.
- `ktx ingest --text "..."` (or `--file <path>`) captures notes into **ktx**
  memory instead of ingesting a connection.

Database connections run before context-source connections when more than one
connection is selected.

## Options

| Flag | Description | Default |
|------|-------------|---------|
| `--all` | Ingest all configured connections (same as bare invocation) | `false` |
| `--query-history` | Include database query-history usage patterns | Stored connection default |
| `--no-query-history` | Skip database query-history usage patterns for this run | Stored connection default |
| `--query-history-window-days <days>` | BigQuery/Snowflake query-history lookback window for this run | Stored connection default |
| `--text <content>` | Capture inline text into **ktx** memory; repeatable | `[]` |
| `--file <path>` | Capture a text file into **ktx** memory; use `-` for stdin; repeatable | `[]` |
| `--connection-id <connectionId>` | **ktx** connection id to tag captured text/file notes | - |
| `--user-id <id>` | Memory user id for text/file capture attribution | `local-cli` |
| `--fail-fast` | Stop after the first failed text/file item | `false` |
| `--plain` | Print plain text output | `true` |
| `--json` | Print JSON output | `false` |
| `--yes` | Install required managed runtime features without prompting | `false` |
| `--no-input` | Disable interactive terminal input | - |

Database ingest always builds enriched context and requires a configured model
and embeddings (run `ktx setup`); connections without that configuration fail
before any work starts. Query-history flags apply only to database connections
that support query history. The window flag applies to BigQuery and Snowflake;
Postgres reads the current `pg_stat_statements` aggregate data instead of a
time-windowed history table. Query-history ingest runs after the schema scan.

When more than one connection is selected, database ingest runs first, then
context-source ingest and memory updates run for context-source connections.

Some ingest paths use the managed **ktx** Python runtime. Query-history ingest uses
it for SQL analysis, and Looker context-source ingest uses it for Looker identifier
parsing. In an interactive terminal, `ktx ingest` prompts before installing the
required runtime features. Use `--yes` to install them without prompting, or
use `--no-input` to fail fast with install guidance.

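For unattended runs (a scheduled job, for example), the `--yes` behavior described above can be wrapped in a small shell function. This is a minimal sketch, not part of **ktx**: the function name `ingest_noninteractive` and the `KTX_BIN` variable are hypothetical, and the sketch assumes `ktx` exits with a non-zero status when ingest fails.

```shell
# Hypothetical wrapper: ingest one connection without any install prompt.
# KTX_BIN is an assumption of this sketch; it lets the binary be swapped out.
ingest_noninteractive() {
  connection_id="$1"
  if [ -z "$connection_id" ]; then
    echo "usage: ingest_noninteractive <connectionId>" >&2
    return 2
  fi
  # --yes installs required managed runtime features without prompting.
  "${KTX_BIN:-ktx}" ingest "$connection_id" --yes
}
```

Calling `ingest_noninteractive warehouse` runs `ktx ingest warehouse --yes`. Swapping `--yes` for `--no-input` would instead make the run fail fast with install guidance when the runtime is missing.
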
`--text` and `--file` cannot be combined with a positional `connectionId` or
`--all`; pass `--connection-id <id>` instead to tag captured notes.

## Examples

```bash
# Build every configured connection (bare = --all)
ktx ingest

# Build one database or context-source connection
ktx ingest warehouse

# Include query-history usage patterns
ktx ingest warehouse --query-history

# Set the lookback window for BigQuery or Snowflake query history
ktx ingest warehouse --query-history-window-days 30

# Build a context-source connection
ktx ingest notion

# Capture inline text into memory
ktx ingest --text "Refunds are excluded from net revenue."

# Capture multiple text snippets in one call
ktx ingest --text "Revenue is gross receipts." --text "Orders are completed purchases."

# Capture a local Markdown file into memory and tag it to a connection
ktx ingest --file docs/revenue-notes.md --connection-id warehouse

# Capture one stdin item
printf "Refunds are excluded from net revenue." | ktx ingest --file -
```

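Because `--file` is repeatable, a directory of notes can be captured in a single call. The helper below is a sketch under stated assumptions: `capture_notes` and `KTX_BIN` are hypothetical names, and the sketch treats the connection id as required even though `--connection-id` is optional in the CLI.

```shell
# Hypothetical helper: capture every Markdown file in a directory in one call.
capture_notes() {
  notes_dir="$1"
  connection_id="$2"
  # Reuse the function's positional parameters as the argument list.
  set --
  for note in "$notes_dir"/*.md; do
    [ -f "$note" ] || continue   # skip the literal glob when nothing matches
    set -- "$@" --file "$note"   # one --file flag per note
  done
  if [ "$#" -eq 0 ]; then
    echo "no .md files in $notes_dir" >&2
    return 1
  fi
  "${KTX_BIN:-ktx}" ingest "$@" --connection-id "$connection_id"
}
```

For example, `capture_notes docs warehouse` passes each `docs/*.md` file as its own `--file` argument and tags the notes to `warehouse`. Adding `--fail-fast` to the final command would stop at the first failed item instead of collecting all failures.
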
## Output

Plain output summarizes each target and the operations that ran.

```text
Ingest finished

Source     Database schema  Query history  Source ingest  Memory update
warehouse  done             done           skipped        skipped
notion     skipped          skipped        done           done
```

Use `--json` when a script or agent needs the selected plan and per-target
results.

## Inspect context-source ingest traces

Context-source ingest writes persistent JSONL traces for postmortem debugging.
Plain ingest output prints the trace path near the report, run, and job
identifiers when a trace is available:

```text
Report: report-abc123
Run: run-abc123
Job: job-abc123
Trace: .ktx/ingest-traces/job-abc123/trace.jsonl
```

The trace file lives under the project directory at
`.ktx/ingest-traces/<jobId>/trace.jsonl`. Each line is a JSON event with the
job id, run id, sync id, connection id, source key, phase, event name, timing,
state snapshot, decision context, and error details. Failed runs also write a
stored ingest report with `status: "failed"`, `failure.phase`,
`failure.message`, and the same trace path.

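When the job id is not at hand, the directory layout described above makes it possible to locate the newest trace by modification time. This is a sketch only: `latest_trace` is a hypothetical helper, and it assumes that the most recently modified `trace.jsonl` belongs to the most recent run.

```shell
# Hypothetical helper: print the most recently modified trace file in a project.
latest_trace() {
  project_dir="${1:-.}"
  # ls -t lists newest first; errors are silenced when no trace exists yet.
  ls -t "$project_dir"/.ktx/ingest-traces/*/trace.jsonl 2>/dev/null | head -n 1
}
```

Run `latest_trace` from the project directory, or pass the project path as the first argument. It prints nothing when no trace has been written.
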
Use `jq` or line-oriented tools to inspect a trace:

```bash
jq -c '. | {at, level, phase, event, durationMs, data, error}' \
  .ktx/ingest-traces/<jobId>/trace.jsonl
```

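Where `jq` is not installed, a line-oriented filter can pull out the error events. The sketch below is illustrative rather than an official recipe: `trace_errors` is a hypothetical name, and it assumes each event is a single-line JSON object whose `level` field holds one of the level names as a string.

```shell
# Hypothetical helper: print only the trace lines whose level is "error".
# Assumes one JSON object per line with a "level" string field.
trace_errors() {
  # Allow optional whitespace around the colon, since the exact
  # serialization of the trace lines is not specified here.
  grep -E '"level"[[:space:]]*:[[:space:]]*"error"' "$1"
}
```

Because this is a text match, it would also select a line that carries a nested `"level": "error"` pair inside its data payload; `jq` is the safer choice when exact field matching matters. The function returns a non-zero status when no line matches, following `grep` conventions.
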
**ktx** writes `debug` trace events by default. Set `KTX_INGEST_TRACE_LEVEL` to
`error`, `info`, `debug`, or `trace` before running ingest to change the trace
verbosity:

```bash
KTX_INGEST_TRACE_LEVEL=trace ktx ingest metabase
```

## Common errors

| Error | Cause | Recovery |
|-------|-------|----------|
| Connection not configured | The connection id is not present in `ktx.yaml` | Add the connection with `ktx setup` or update `ktx.yaml` |
| Enrichment is not configured | Database ingest needs a model, embeddings, and scan-enrichment configuration | Run `ktx setup` to configure a model and embeddings |
| Query history is unsupported | The selected database driver does not support query history | Run ingest without query-history flags |
| Python runtime is missing | The selected ingest target needs runtime-backed SQL analysis or source parsing | Accept the interactive prompt, rerun with `--yes`, or run the suggested `ktx admin runtime install` command |
| Context-source options were ignored | Query-history flags were supplied for a context-source connection | Omit database-only flags when ingesting context-source connections |
| Text ingest stops early | `--fail-fast` was used and one item failed | Fix the failed item or rerun without `--fail-fast` to collect all failures |