mirror of https://github.com/Kaelio/ktx.git
synced 2026-06-22 08:38:08 +02:00
feat: merge ingest and scan
* docs: add CLI component reuse guidance
* docs: add unified ingest ux design
* Refine unified ingest UX design after adversarial review iteration 1
* Refine unified ingest UX design after adversarial review iteration 2
* Refine unified ingest UX design after adversarial review iteration 3
* feat(cli): route public connection ingest command
* feat(cli): hide standalone scan from public help
* feat(cli): plan public ingest depth and query history
* feat(cli): execute public database ingest facets
* feat(ingest): read connection query history config
* fix(cli): use public ingest wording
* fix(config): stop generating ingest adapter allow lists
* docs: document public ingest command
* test: align ingest surface expectations
* docs: add unified ingest public CLI surface plan
* feat(cli): preflight deep public ingest readiness
* feat(setup): store query history in connection context
* feat(setup): store database context depth
* feat(setup): verify context readiness by database depth
* fix(setup): keep context build foreground only
* fix(config): reject reserved ingest connection ids
* test: close unified ingest v1 expectations
* docs: add unified ingest v1 closure plan
* fix(ingest): bypass adapter allow-list for public source ingest
* fix(ingest): honor query history window intent
* fix(ingest): hide scan internals from public database ingest
* feat(ingest): use foreground view for interactive public ingest
* fix(setup): use schema context and query history wording
* test(cli): verify unified ingest public output
* docs: add unified ingest v1 public output closure plan
* fix(setup): forward query history flags
* fix(setup): prompt for postgres query history
* fix(status): report query history readiness
* fix(ingest): remove legacy public guidance
* fix(ingest): polish foreground retry copy
* docs(examples): use unified query history wording
* chore(ingest): finish public query history cleanup
* docs: add unified ingest v1 query history status cleanup plan
* test(docs): cover unified ingest public docs
* docs: align ingest CLI reference with unified UX
* docs: update context build guides for unified ingest
* docs: update setup and primary source ingest wording
* docs: stop advertising adapter-backed example ingest
* docs: close unified ingest public docs gaps
* docs: add unified ingest v1 docs site closure plan
* fix: render unified ingest foreground warnings
* fix: explain query history schema order
* fix: add public ingest retry guidance
* fix: align setup next steps with unified ingest
* fix: remove scan wording from demo progress
* test: verify unified ingest ux closure
* docs: add unified ingest v1 foreground and retry closure plan
* fix(cli): preserve query-history pull config in public ingest
* fix(cli): omit hidden commands from docs command tree
* test(cli): close unified ingest final public surface checks
* docs: add unified ingest v1 final public surface closure plan
* fix(cli): use public source labels in ingest reports
* fix(cli): suppress low-level public ingest output
* test(cli): verify unified ingest public plain output
* docs: add unified ingest v1 public plain output closure plan
* fix(cli): add public ingest copy sanitizers
* fix(cli): sanitize public ingest progress copy
* fix(cli): rename setup schema scope prompt
* docs(plan): add progress copy closure; test: align setup back-nav fixture

Adds the iter9 plan and updates the setup back-navigation test fixture to pass disableQueryHistory plus listSchemas/listTables stubs that the unified ingest setup step now requires.

* docs(plan): add final ux labels plan with narrowed label scans
* fix(cli): aggregate unsupported query-history warnings
* fix(cli): align setup database labels
* test(cli): fix setup database test type-check
* fix(cli): remove primary-source wording from setup output
* test(cli): verify unified ingest setup closure
* docs(plan): add unified ingest v1 verification copy closure plan
* fix(cli): remove top-level scan command
* fix(cli): remove legacy ingest and wiki commands
* Merge scan into ingest flow
* feat(cli): split ingest progress into per-phase rows, rename work units to tasks

Each database target in the unified ingest dashboard now renders one row per real subprocess (Schema, then Query history when enabled) instead of a single combined bar. Each phase has its own monotonic 0-100% bar so the progress never snaps back to zero when historic-sql starts after scan completes. Completed phases keep their final bar, summary, and elapsed time visible as an inline audit trail; queued and skipped phases are shown explicitly.

Also rename user-facing "work units" / "Failed work units" to "tasks" / "Failed tasks" in ingest output and parseIngestSummary. The parser still accepts the legacy "Work units:" wording in captured output for backward compat. Internal memory-flow event names and type fields are left alone.

* Fix test harness failures
* Fix CI smoke checks

---------

Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
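The backward-compatible parsing described in the last feature note can be pictured with a small sketch. This is not the repository's `parseIngestSummary`: the function name `parseTaskCounts`, the `TaskCounts` shape, and the exact `Label: <number>` line layout are assumptions made for illustration. Only the labels themselves ("tasks", "Failed tasks", and the legacy "Work units:") come from the commit message.

```typescript
// Hypothetical sketch, not the actual ktx parser.
// Assumption: captured output contains lines such as "Tasks: 12" and
// "Failed tasks: 1", or the legacy "Work units: 12" / "Failed work units: 1".

interface TaskCounts {
  total: number | null;  // null when no matching summary line is present
  failed: number | null;
}

// Each pattern accepts both the current and the legacy label.
// Flags: i = case-insensitive, m = ^ and $ match at line boundaries.
const TOTAL_RE = /^\s*(?:Tasks|Work units):\s*(\d+)\s*$/im;
const FAILED_RE = /^\s*Failed (?:tasks|work units):\s*(\d+)\s*$/im;

// Return the first captured integer for the pattern, or null if absent.
function readCount(output: string, pattern: RegExp): number | null {
  const match = pattern.exec(output);
  return match ? Number.parseInt(match[1], 10) : null;
}

function parseTaskCounts(output: string): TaskCounts {
  return {
    total: readCount(output, TOTAL_RE),
    failed: readCount(output, FAILED_RE),
  };
}
```

Under these assumptions, `parseTaskCounts("Work units: 7\nFailed work units: 0")` and `parseTaskCounts("Tasks: 7\nFailed tasks: 0")` return the same counts, which is the point of keeping the legacy label in the pattern: output captured before the rename stays readable without a separate code path.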
parent 1a472cf3ed
commit b00c1a11a9
118 changed files with 16890 additions and 2992 deletions
@@ -81,7 +81,8 @@ ktx dev runtime start --feature local-embeddings
 
 ## Step 3: Connect a database
 
-Select one or more databases for KTX to scan. The wizard supports SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server, BigQuery, and Snowflake.
+Select one or more databases for KTX to connect to. The wizard supports
+SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server, BigQuery, and Snowflake.
 
 For PostgreSQL, you can enter connection details field by field or paste a connection URL:
 
@@ -93,22 +94,27 @@ For PostgreSQL, you can enter connection details field by field or paste a conne
 
 If your URL contains credentials, KTX saves it to `.ktx/secrets/` and writes a `file:` reference in `ktx.yaml`. You can also use `env:DATABASE_URL` to reference an environment variable.
 
-After connecting, KTX automatically runs a connection test and a structural scan:
+After connecting, KTX automatically runs a connection test and builds fast
+schema context:
 
 ```
-◇ Testing postgres-warehouse
-│ ✓ Connection test passed
-│ Driver: PostgreSQL · Tables: 42
-│
-◇ Scanning postgres-warehouse
-│ ✓ Structural scan completed
-│ Changes: 42 new tables
-│
-◇ Primary source ready
-│ postgres-warehouse · PostgreSQL · structural scan complete
+Testing postgres-warehouse
+Connection test passed
+Driver: PostgreSQL - Tables: 42
+
+Building schema context for postgres-warehouse
+Running fast database ingest
+
+Schema context complete for postgres-warehouse
+Changes: 42 new tables
+
+Database ready
+postgres-warehouse - PostgreSQL - schema context complete
 ```
 
-For Snowflake and BigQuery, the wizard offers **Historic SQL** configuration for query history views. For PostgreSQL, enable Historic SQL with `--enable-historic-sql` when `pg_stat_statements` is configured.
+For PostgreSQL, Snowflake, and BigQuery, the wizard can enable query-history
+ingest when the warehouse history feature is available. Query history is stored
+under `connections.<id>.context.queryHistory` in `ktx.yaml`.
 
 ## Step 4: Add context sources
 
@@ -138,7 +144,8 @@ Context sources are saved to `ktx.yaml` and built during the next step.
 
 ## Step 5: Build context
 
-This is where KTX does the heavy lifting. It runs an enriched scan of your database (generating AI-powered column and table descriptions) and ingests metadata from any configured context sources.
+This is where KTX builds agent-ready context. It uses the database context
+depth saved by setup and ingests metadata from any configured context sources.
 
 ```
 ◆ Build KTX context for agents?
@@ -146,27 +153,22 @@ This is where KTX does the heavy lifting. It runs an enriched scan of your datab
 │ ○ Leave context unbuilt and exit setup
 ```
 
-The build scans each primary source with LLM enrichment, detects table relationships, and runs ingestion agents that reconcile metadata from your context sources into semantic-layer YAML files and wiki pages.
+Fast database context builds deterministic schema grounding. Deep database
+context also generates AI descriptions, embeddings, and relationship evidence
+when those capabilities are configured.
 
-For a small database (under 50 tables), this takes a few minutes. Larger warehouses can take longer. You can press <kbd>d</kbd> to detach and let it run in the background:
-
-```
-KTX context build
-Run: setup-context-local-abc123
-Project: /home/user/analytics
-
-Detach: press d to leave this running.
-Resume: ktx setup --project-dir /home/user/analytics
-Status: ktx status --project-dir /home/user/analytics
-```
+For a small database (under 50 tables), this can take a few minutes. Larger
+warehouses can take longer. Context builds run in the foreground; press
+<kbd>Ctrl+C</kbd> to stop the current run and rerun `ktx setup` or `ktx ingest`
+when you are ready to try again.
 
 When the build completes, KTX verifies that agent-ready context was produced:
 
 ```
 KTX context is ready for agents.
 
-Primary sources:
-postgres-warehouse: enriched scan complete
+Databases:
+postgres-warehouse: deep context complete
 
 Context sources:
 dbt-main: memory update complete
@@ -209,8 +211,8 @@ KTX writes project state as plain files so agents can inspect and edit changes i
 | `ktx.yaml` | `ktx setup` | Main project configuration: connections, LLM settings, embeddings, and context sources |
 | `.ktx/secrets/*` | `ktx setup` when file-backed secrets are selected | Local secret files referenced from `ktx.yaml`; do not commit these |
 | `semantic-layer/<connection-id>/*.yaml` | context build, ingestion, or direct file edits | Semantic source definitions agents use for SQL generation |
-| `wiki/global/*.md` | ingestion, memory capture, `ktx wiki write --scope global`, or direct file edits | Shared business context and metric definitions |
-| `wiki/user/<user-id>/*.md` | memory capture, `ktx wiki write --scope user`, or direct file edits | User-scoped notes for one agent/user context |
+| `wiki/global/*.md` | ingestion, memory capture, or direct file edits | Shared business context and metric definitions |
+| `wiki/user/<user-id>/*.md` | memory capture or direct file edits | User-scoped notes for one agent/user context |
 | `.claude/skills/ktx/SKILL.md`, `.agents/skills/ktx/SKILL.md` | CLI-mode agent integration setup | Agent instructions for calling public `ktx` commands |
 
 ## Verify it worked
@@ -226,7 +228,7 @@ KTX project: /home/user/analytics
 Project ready: yes
 LLM ready: yes (claude-sonnet-4-6)
 Embeddings ready: yes (text-embedding-3-small)
-Primary sources configured: yes (postgres-warehouse)
+Databases configured: yes (postgres-warehouse)
 Context sources configured: yes (dbt-main)
 KTX context built: yes
 Agent integration ready: yes (claude-code:project)
@@ -246,7 +248,7 @@ Agent integration ready: yes (claude-code:project)
 
 ## Next steps
 
-- **Build more context** — learn about [scanning](/docs/guides/building-context), relationship detection, and ingestion workflows in the Building Context guide.
+- **Build more context** — learn about [database ingest](/docs/guides/building-context), relationship detection, and source ingestion workflows in the Building Context guide.
 - **Refine your semantic layer** — the [Writing Context](/docs/guides/writing-context) guide covers source YAML, measures, joins, and wiki pages.
 - **Understand the architecture** — read [The Context Layer](/docs/concepts/the-context-layer) to learn why a context layer is more than a semantic layer.
 - **Connect more agents** — see the [Agent Clients](/docs/integrations/agent-clients) integration page for per-tool setup details.