KTX

The context layer for analytics agents

npm version License GitHub stars

--- KTX turns warehouse metadata, semantic definitions, and business knowledge into reviewable project files that agents can use while planning, querying, and updating analytics work. A KTX project is a directory of plain files — YAML semantic sources, Markdown knowledge pages, and SQLite state — that you commit to git and review in PRs, just like dbt models. ## Who KTX is for KTX is built for analytics engineers and data teams who want data agents to work on real analytics systems — not just generate one-off SQL. Use KTX when you want agents to: - **Generate SQL** from approved measures and joins - **Repair semantic definitions** through reviewable diffs - **Explain metric provenance** with warehouse evidence - **Work alongside** dbt, LookML, MetricFlow, Looker, Metabase, and modern BI platforms Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and SQLite. ## Quick start Install the CLI and run the setup wizard: ```bash npm install @kaelio/ktx npm install -g @kaelio/ktx ktx setup ``` The wizard walks through six steps: configuring your LLM provider, setting up embeddings, connecting your database, adding context sources (dbt, LookML, Metabase, Looker, Notion), building context, and installing agent integration. If it exits before completion, rerun `ktx setup` to resume where you left off. Check your project status: ```bash ktx status ``` ``` KTX project: /home/user/analytics Project ready: yes LLM ready: yes (claude-sonnet-4-6) Embeddings ready: yes (text-embedding-3-small) Primary sources configured: yes (postgres-warehouse) Context sources configured: yes (dbt-main) KTX context built: yes Agent integration ready: yes (claude-code:project) ``` Run the packaged demo without installing globally: ```bash npx @kaelio/ktx setup demo --no-input npx @kaelio/ktx setup demo inspect ``` The default demo uses packaged sample data and prebuilt context. It does not require API keys, network access, or an LLM provider. Generate SQL from a semantic-layer source: ```bash npx @kaelio/ktx sl query --project-dir "$PROJECT_DIR" \ --connection-id warehouse \ --measure accounts.account_count \ --dimension accounts.segment \ --format sql ``` List and test a configured warehouse connection: ```bash ktx connection list --project-dir "$PROJECT_DIR" ktx connection test warehouse --project-dir "$PROJECT_DIR" ``` The connection test prints the configured driver and discovered table count: ```text Driver: sqlite Tables: 1 ``` ## What's in a project ``` my-project/ ├── ktx.yaml # Project configuration ├── semantic-layer/ │ └── warehouse/ │ ├── orders.yaml # Semantic source definitions │ ├── customers.yaml │ └── order_items.yaml ├── knowledge/ │ ├── global/ │ │ ├── revenue.md # Business definitions and rules │ │ └── segment-classification.md │ └── user/ │ └── local/ ├── raw-sources/ │ └── warehouse/ │ └── live-database/ # Scan artifacts and reports └── .ktx/ └── db.sqlite # Local state (git-ignored) ``` Semantic sources and knowledge pages are committed to git. The `.ktx/` directory holds ephemeral state and is git-ignored — delete it and KTX rebuilds on the next run. ### Scan the demo warehouse Scan artifacts are written under `raw-sources/warehouse/live-database//` in the project directory. ```bash SCAN_OUTPUT="$(ktx scan warehouse --project-dir "$PROJECT_DIR")" printf '%s\n' "$SCAN_OUTPUT" SCAN_RUN_ID="$(printf '%s\n' "$SCAN_OUTPUT" | awk '/^Run: / { print $2 }')" ktx scan status --project-dir "$PROJECT_DIR" "$SCAN_RUN_ID" ktx scan report --project-dir "$PROJECT_DIR" "$SCAN_RUN_ID" ``` For non-SQLite drivers, prefer credential references such as `--url env:NAME` or `--url file:PATH` over literal credential URLs. ## Managed Python runtime KTX installs its Python runtime only when a Python-backed command needs it. The runtime lives outside the npm cache, is versioned by the installed CLI version, and is managed by `ktx runtime` commands. KTX requires `uv` on `PATH` to create the managed runtime. Install `uv` with your system package manager or the official installer before running Python- backed KTX commands. KTX doesn't download `uv` automatically; run `ktx runtime doctor` if runtime installation fails: ```bash ktx runtime install --yes ktx runtime status ktx runtime doctor ktx runtime start ktx runtime stop ktx runtime prune --dry-run ktx runtime prune --yes ``` The release artifact manifest contains the public npm tarball and the bundled `kaelio-ktx` runtime wheel. The `python/ktx-sl` and `python/ktx-daemon` directories remain source packages for development, not public release artifacts. ## Serve agents KTX integrates with coding agents through CLI skills, an MCP server, or both. The setup wizard configures this automatically — here's what each mode looks like. **CLI skills** — the agent calls `ktx` commands directly through a skill file installed in your agent's config (e.g., `.claude/skills/ktx/SKILL.md`): ```bash ktx sl query --measure orders.revenue --dimension orders.status --format sql ktx wiki search "revenue definition" ktx sl validate orders ``` **MCP server** — the agent calls KTX tools over the Model Context Protocol: ```bash ktx serve --mcp stdio \ --user-id local \ --semantic-compute \ --execute-queries \ --yes ``` This exposes tools for connections, knowledge search, semantic-layer sources, validation, queries, ingestion, and replay. The `--semantic-compute` flag starts the managed Python runtime for query planning automatically. The standalone MCP server exposes `connection_list`, `knowledge_search`, `knowledge_read`, `knowledge_write`, `sl_list_sources`, `sl_read_source`, `sl_write_source`, `sl_validate`, `sl_query`, `ingest_trigger`, `ingest_status`, `ingest_report`, and `ingest_replay`. Supported agents: Claude Code, Codex, Cursor, OpenCode, and any agent that reads `.agents/` skills or MCP configuration. ## Workspace packages | Package | Purpose | |---------|---------| | `packages/cli` | CLI entry point | | `packages/context` | Core context engine | | `packages/llm` | LLM and embedding providers | | `packages/connector-bigquery` | BigQuery scan connector | | `packages/connector-clickhouse` | ClickHouse scan connector | | `packages/connector-mysql` | MySQL scan connector | | `packages/connector-postgres` | Postgres scan connector | | `packages/connector-snowflake` | Snowflake scan connector | | `packages/connector-sqlite` | SQLite scan connector | | `packages/connector-sqlserver` | SQL Server scan connector | | `python/ktx-sl` | Semantic-layer query planning | | `python/ktx-daemon` | Portable compute service | ## Development ```bash git clone https://github.com/kaelio/ktx.git cd ktx pnpm install uv sync --all-groups pnpm run build pnpm run check ``` Use the development CLI for local testing: ```bash pnpm run setup:dev pnpm run link:dev ktx-dev --help ``` ### Debug LLM traces KTX can capture local AI SDK DevTools traces for LLM calls that run through the KTX provider. Enable it with an environment flag when running an LLM-backed command: ```bash KTX_AI_DEVTOOLS_ENABLED=true ktx dev ingest run \ --connection-id warehouse \ --adapter metabase ``` Traces are written to `.devtools/generations.json` under the current working directory. To inspect them, run: ```bash pnpm dlx @ai-sdk/devtools ``` Then open `http://localhost:4983`. These traces are local-development-only and store prompts, model outputs, tool arguments/results, and raw provider payloads in plain text. Do not enable this in production or for sensitive runs. The repository uses `pnpm` for TypeScript packages and `uv` for Python packages. See [Contributing](docs-site/content/docs/community/contributing.mdx) for full development setup, testing, and PR guidelines. ## License KTX is licensed under the Apache License, Version 2.0. See `LICENSE`.