mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
241 lines
7.4 KiB
Markdown
<h1 align="center">
  <img src="assets/ktx-lockup.svg" alt="KTX" width="500" />
</h1>

<p align="center">
  <strong>The context layer for analytics agents</strong>
</p>

<p align="center">
  <a href="https://www.npmjs.com/package/@kaelio/ktx"><img src="https://img.shields.io/npm/v/@kaelio/ktx?style=flat-square&color=f97316" alt="npm version" /></a>
  <a href="https://codecov.io/gh/Kaelio/ktx"><img src="https://codecov.io/gh/Kaelio/ktx/branch/main/graph/badge.svg" alt="Codecov" /></a>
  <a href="https://github.com/Kaelio/ktx/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue?style=flat-square" alt="License" /></a>
  <a href="https://github.com/Kaelio/ktx"><img src="https://img.shields.io/github/stars/Kaelio/ktx?style=flat-square" alt="GitHub stars" /></a>
</p>

---

KTX turns warehouse metadata, semantic definitions, and business knowledge into
reviewable project files that agents can use while planning, querying, and
updating analytics work.

A KTX project is a directory of plain files (YAML semantic sources and Markdown
wiki pages) that you commit to git and review in PRs, just like dbt models.
Local SQLite state lives alongside them in a git-ignored `.ktx/` directory.

## Who KTX is for

KTX is built for analytics engineers and data teams who want data agents to
work on real analytics systems, not just generate one-off SQL.

Use KTX when you want agents to:

- **Generate SQL** from approved measures and joins
- **Repair semantic definitions** through reviewable diffs
- **Explain metric provenance** with warehouse evidence
- **Work alongside** dbt, LookML, MetricFlow, Looker, Metabase, and modern BI
  platforms

Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and
SQLite.

## Quick start

Install the CLI globally and run the setup wizard:

```bash
npm install -g @kaelio/ktx
ktx setup
```

The wizard walks through six steps: configuring your LLM provider, setting up
embeddings, connecting your database, adding context sources (dbt, LookML,
Metabase, Looker, Notion), building context, and installing agent integration.

If it exits before completion, rerun `ktx setup` to resume where you left off.

Check your project status:

```bash
ktx status
```

```
KTX project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (postgres-warehouse)
Context sources configured: yes (dbt-main)
KTX context built: yes
Agent integration ready: yes (claude-code:project)
```

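In scripts or CI it can help to turn that report into a pass/fail check. The
sketch below is not part of KTX; `ktx_status_gaps` is a hypothetical helper
that assumes the `Label: value` layout shown above and treats any value that
does not start with `yes` as a gap. The exact wording KTX prints for a step
that is not ready is an assumption here.

```bash
# Hypothetical helper: list every status line whose value is not "yes...".
# Reads `ktx status` output on stdin and returns non-zero if any gap is found.
ktx_status_gaps() {
  awk -F': ' '
    # The first line carries the project path, not a readiness flag.
    $1 == "KTX project" { next }
    # Report the label of any line whose value does not begin with "yes".
    NF >= 2 && $2 !~ /^yes/ { print $1; bad = 1 }
    END { exit bad }
  '
}
```

Typical use would be `ktx status | ktx_status_gaps || echo "project not ready"`.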
Generate SQL from a semantic-layer source:

```bash
npx @kaelio/ktx sl query --project-dir "$PROJECT_DIR" \
  --connection-id warehouse \
  --measure accounts.account_count \
  --dimension accounts.segment \
  --format sql
```

List and test a configured warehouse connection:

```bash
ktx connection list --project-dir "$PROJECT_DIR"
ktx connection test warehouse --project-dir "$PROJECT_DIR"
```

The connection test prints the configured driver and discovered table count:

```text
Driver: sqlite
Tables: 1
```

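A connection that succeeds but discovers no tables is usually not what you
want to ingest. As a rough sketch, the hypothetical `ktx_table_count` helper
below reads that output and fails when the count is missing or zero. It assumes
only the `Tables: N` line shown above; any other output lines are ignored.

```bash
# Hypothetical helper: print the table count from `ktx connection test` output.
# Returns 2 if no "Tables:" line is present, 1 if the count is zero.
ktx_table_count() {
  ktx_tc_count=$(awk -F': *' '$1 == "Tables" { print $2; exit }')
  [ -n "$ktx_tc_count" ] || return 2
  printf '%s\n' "$ktx_tc_count"
  [ "$ktx_tc_count" -gt 0 ]
}
```

For example:
`ktx connection test warehouse --project-dir "$PROJECT_DIR" | ktx_table_count`.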
## What's in a project

```
my-project/
├── ktx.yaml                        # Project configuration
├── semantic-layer/
│   └── warehouse/
│       ├── orders.yaml             # Semantic source definitions
│       ├── customers.yaml
│       └── order_items.yaml
├── wiki/
│   ├── global/
│   │   ├── revenue.md              # Business definitions and rules
│   │   └── segment-classification.md
│   └── user/
│       └── local/
├── raw-sources/
│   └── warehouse/
│       └── <syncId>/               # Database ingest artifacts and reports
└── .ktx/
    └── db.sqlite                   # Local state (git-ignored)
```

Semantic sources and wiki pages are committed to git. The `.ktx/` directory
holds ephemeral state and is git-ignored; delete it and KTX rebuilds it on the
next run.

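If you assemble a project directory by hand, or want to double-check an
existing one, the ignore rule can be verified with a few lines of shell. This
is a sketch only: `ensure_ktx_ignored` is a hypothetical helper, and it assumes
the rule lives in a `.gitignore` at the project root as the literal line
`.ktx/`.

```bash
# Hypothetical helper: make sure ".ktx/" is listed in <project>/.gitignore.
# Safe to run repeatedly; the entry is appended at most once.
ensure_ktx_ignored() {
  ktx_ignore_file="$1/.gitignore"
  touch "$ktx_ignore_file"
  # If the file does not end with a newline, add one before appending.
  [ -z "$(tail -c1 "$ktx_ignore_file")" ] || printf '\n' >> "$ktx_ignore_file"
  # -x matches whole lines only, -F treats the pattern as a fixed string.
  grep -qxF '.ktx/' "$ktx_ignore_file" || printf '.ktx/\n' >> "$ktx_ignore_file"
}
```

Run it as `ensure_ktx_ignored "$PROJECT_DIR"` before the first commit.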
### Build demo warehouse context

Database ingest artifacts are written under `raw-sources/warehouse/<syncId>/`
in the project directory.

```bash
ktx ingest warehouse --project-dir "$PROJECT_DIR" --fast
ktx status --project-dir "$PROJECT_DIR"
```

For non-SQLite drivers, prefer credential references such as `--url env:NAME`
or `--url file:PATH` over literal credential URLs.

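Wrapper scripts can enforce that preference before a URL ever reaches the
command line. The check below is a minimal sketch; `is_credential_ref` is a
hypothetical helper that only recognises the two reference prefixes named
above and says nothing about how KTX itself resolves them.

```bash
# Hypothetical helper: succeed only for env:NAME or file:PATH style values.
# Anything else (for example a literal connection URL) is rejected.
is_credential_ref() {
  case $1 in
    env:?*|file:?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A script might call `is_credential_ref "$url" || exit 1` and refuse to continue
when handed a literal URL.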
## Managed Python runtime

KTX installs its Python runtime only when a Python-backed command needs it.
The runtime lives outside the npm cache, is versioned by the installed CLI
version, and is managed by `ktx dev runtime` commands.

KTX requires `uv` on `PATH` to create the managed runtime. Install `uv` with
your system package manager or the official installer before running
Python-backed KTX commands. KTX doesn't download `uv` automatically; run
`ktx dev runtime status` if runtime installation fails:

```bash
ktx dev runtime install --yes
ktx dev runtime status
ktx dev runtime start
ktx dev runtime stop
```

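Because KTX expects `uv` to be on `PATH` already, a provisioning script may
want to check for it before installing the runtime. The following is a small
sketch; `require_on_path` is a hypothetical helper, not a KTX command.

```bash
# Hypothetical helper: fail early with a readable message when a tool is
# missing from PATH.
require_on_path() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing required tool: $1" >&2
    return 1
  }
}
```

It can gate the install step: `require_on_path uv && ktx dev runtime install --yes`.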
The release artifact manifest contains the public npm tarball and the bundled
`kaelio-ktx` runtime wheel. The `python/ktx-sl` and `python/ktx-daemon`
directories remain source packages for development, not public release
artifacts.

## Use KTX with agents

KTX integrates with coding agents through CLI skills. The setup wizard
configures this automatically.

**CLI skills**: the agent calls `ktx` commands directly through a skill file
installed in your agent's config (e.g., `.claude/skills/ktx/SKILL.md`):

```bash
ktx sl query --measure orders.revenue --dimension orders.status --format sql
ktx wiki search "revenue definition"
ktx sl validate orders
```

Supported agents: Claude Code, Codex, Cursor, OpenCode, and any agent that
reads `.agents/` skills.

## Workspace packages

| Package | Purpose |
|---------|---------|
| `packages/cli` | CLI entry point |
| `packages/context` | Core context engine |
| `packages/llm` | LLM and embedding providers |
| `packages/connector-bigquery` | BigQuery scan connector |
| `packages/connector-clickhouse` | ClickHouse scan connector |
| `packages/connector-mysql` | MySQL scan connector |
| `packages/connector-postgres` | Postgres scan connector |
| `packages/connector-snowflake` | Snowflake scan connector |
| `packages/connector-sqlite` | SQLite scan connector |
| `packages/connector-sqlserver` | SQL Server scan connector |
| `python/ktx-sl` | Semantic-layer query planning |
| `python/ktx-daemon` | Portable compute service |

## Development

```bash
git clone https://github.com/kaelio/ktx.git
cd ktx
pnpm install
uv sync --all-groups
pnpm run build
pnpm run check
```

Use the development CLI for local testing:

```bash
pnpm run setup:dev
pnpm run link:dev
ktx-dev --help
```

### Debug LLM traces

KTX can capture local AI SDK DevTools traces for LLM calls that run through the
KTX provider. Enable it with an environment flag when running an LLM-backed
command:

```bash
KTX_AI_DEVTOOLS_ENABLED=true ktx ingest warehouse --project-dir "$PROJECT_DIR" --deep
```

Traces are written to `.devtools/generations.json` under the current working
directory. To inspect them, run:

```bash
pnpm dlx @ai-sdk/devtools
```

Then open `http://localhost:4983`. These traces are local-development-only and
store prompts, model outputs, tool arguments/results, and raw provider payloads
in plain text. Do not enable this in production or for sensitive runs.

The repository uses `pnpm` for TypeScript packages and `uv` for Python
packages. See [Contributing](docs-site/content/docs/community/contributing.mdx)
for full development setup, testing, and PR guidelines.

## License

KTX is licensed under the Apache License, Version 2.0. See `LICENSE`.