ktx/README.md

277 lines
11 KiB
Markdown
Raw Permalink Normal View History

<h1 align="center">
<img src="assets/ktx-lockup.svg" alt="ktx" width="500" />
</h1>
2026-05-11 20:47:38 -07:00
<h1 align="center">
The context layer for data agents
</h1>
2026-05-10 23:12:26 +02:00
<p align="center">
<a href="https://www.npmjs.com/package/@kaelio/ktx"><img src="https://img.shields.io/npm/v/@kaelio/ktx?style=flat-square&color=f97316" alt="npm version" /></a>
<a href="https://codecov.io/gh/Kaelio/ktx"><img src="https://codecov.io/gh/Kaelio/ktx/graph/badge.svg?branch=main" alt="Codecov" /></a>
<a href="https://github.com/Kaelio/ktx/actions/workflows/ci.yml?query=branch%3Amain"><img src="https://img.shields.io/github/actions/workflow/status/Kaelio/ktx/ci.yml?branch=main&label=tests&style=flat-square" alt="Tests" /></a>
<a href="https://docs.kaelio.com/ktx/docs/"><img src="https://img.shields.io/badge/docs-ktx-22c55e?style=flat-square" alt="Documentation" /></a>
<a href="https://join.slack.com/t/ktxcommunity/shared_invite/zt-3y9b44m1x-LVyNNJD5nwaZHq4XS29LMQ"><img src="https://img.shields.io/badge/slack-join%20community-4A154B?style=flat-square&logo=slack&logoColor=white" alt="Join the ktx Slack community" /></a>
<a href="https://github.com/Kaelio/ktx/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue?style=flat-square" alt="License" /></a>
<a href="https://www.ycombinator.com/companies/kaelio"><img src="https://img.shields.io/badge/Y%20Combinator-P25-orange?style=flat-square" alt="Y Combinator P25" /></a>
</p>
<p align="center">
<a href="https://docs.kaelio.com/ktx/docs/getting-started/quickstart"><b>Quickstart</b></a> ·
<a href="https://docs.kaelio.com/ktx/docs/cli-reference/ktx"><b>CLI Reference</b></a> ·
<a href="https://docs.kaelio.com/ktx/docs/ai-resources/agent-quickstart"><b>Agent Setup</b></a> ·
<a href="https://join.slack.com/t/ktxcommunity/shared_invite/zt-3y9b44m1x-LVyNNJD5nwaZHq4XS29LMQ"><b>Slack</b></a>
</p>
<p align="center">
<sub>Built and maintained by <a href="https://www.kaelio.com"><b>Kaelio</b></a></sub>
</p>
---
**ktx** is a self-improving context layer that teaches agents how to query your
warehouse accurately - from approved metric definitions, joinable columns, and
business knowledge it builds and maintains for you.
2026-05-10 23:12:26 +02:00
> [!NOTE]
feat: add codex llm backend for ktx runtime work (#253) * feat: add codex sdk runner foundation * feat: parse codex runtime events * feat: expose codex runtime mcp tools * feat: add codex llm runtime * feat: wire codex llm backend * test: avoid Array.fromAsync in codex runner test * docs: document codex llm backend * fix: tighten codex runtime config ownership * fix: use codex sdk env and thread options * fix: parse codex sdk event shapes * test: add codex backend live smoke * docs: clarify codex backend isolation * fix: drive codex loop metrics from mcp events * fix: enforce codex local step budget * docs: disclose codex isolation limits * fix: count all codex agent steps and stream step callbacks live The agent-loop step budget only counted completed mcp_tool_call items, so built-in command_execution steps (which the public Codex SDK/CLI surface can still expose) never decremented the budget, letting ingest/reconciliation run past stepBudget until Codex stopped on its own. onStepFinish was also replayed only after the whole stream drained, so live work_unit_step / reconciliation progress appeared stuck until the Codex process exited. collectEvents is now the single live step accumulator: it counts every completed agent-action item via a shared isCompletedAgentStep predicate (command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish as each step completes, and enforces the budget on that broader count. A no-tool turn still counts as one step. toolFailures stays MCP-specific, since a non-zero command exit is normal agent exploration, not a loop failure. * test: align ingest llm-guard assertions with codex backend The skip-llm ingest guard message now lists codex as a valid backend and mentions a Claude Code/Codex session plus a codex setup hint, but this slow suite test still asserted the pre-codex wording. Update it to match the production message (already covered by the local-bundle-runtime unit test) and add the codex setup-line assertion. * fix: treat codex error:null tool calls as success The Codex SDK serializes error: null on successful mcp_tool_call items, so the failure check (item.error !== undefined) flagged every successful tool call as failed with the empty-payload default "Codex turn failed". This killed every ingest work unit under the codex backend before it could produce a patch. Key on status === 'failed' (authoritative, always set) and only treat a populated error object as a failure. Add a regression test built from a verbatim real-SDK event capture. * fix: default codex backend to gpt-5.5 and report real probe errors The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API rejects under ChatGPT-account (subscription) auth, so codex status/setup failed with a misleading "authentication is not usable" message even though auth was fine. - Default codex model is now gpt-5.5 (works on both subscription and API-key auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark). - runCodexAuthProbe now distinguishes "model not available" from an auth failure and surfaces the real API error: collectEvents retains stream events when the SDK throws on a non-zero exit, and the API error JSON envelope is unwrapped to its human-readable message. - The Codex isolation warning now renders inside the clack setup frame. - Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth. * fix: require llm.models.default in status and match codex probe remediation Status reported a project ready when a non-none LLM backend was configured without llm.models.default, but the runtime (resolveModelSlots) hard-requires it, so ingest/scan/memory threw after `ktx status` said the project was usable. buildLlmStatus now fails for any non-none backend missing models.default and no longer invents a fallback model for claude-code/codex. Codex probe failures now carry a category-matched fix: a model-access failure steers the user at llm.models.default instead of the auth/install remediation. runCodexAuthProbe returns the fix and status consumes it; the message stays self-sufficient so setup output is unchanged. Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx states --llm-model only accepts codex/default or gpt-*/codex-* ids. Repaired four doctor fixtures that configured a backend without models.default (the now-correctly-blocked config) and added coverage for the new behavior.
2026-06-02 13:57:11 +02:00
> Run **ktx** with your own LLM API keys or a local agent sign-in — a
> **Claude Pro/Max** subscription through Claude Code, or your local Codex
> authentication. No extra usage billing from **ktx**.
2026-05-10 23:12:26 +02:00
<p align="center">
<a href="https://youtu.be/5V4TuzYVlrA">
<img src="assets/launch-video-thumb.png" alt="Watch the ktx launch video (1:56)" width="820" />
</a>
</p>
<p align="center">
<img src="docs-site/public/images/ingestion-flow.png" alt="Ingestion: ktx ingests databases, BI tools, modeling code, and docs through its context engine (source connectors, context builder, reconciliation, validation) into wiki Markdown and semantic-layer YAML" width="900" />
</p>
<p align="center">
<img src="docs-site/public/images/mcp-runtime-flow.png" alt="Serving: an agent queries ktx through MCP, which searches the wiki and semantic layer, returns approved metrics, and compiles them into read-only SQL run against the warehouse" width="900" />
</p>
## Why ktx
2026-05-10 23:12:26 +02:00
General-purpose agents struggle on data tasks. They re-explore your warehouse
on every question, invent their own metric logic, and return numbers that
don't match approved definitions.
Traditional semantic layers don't fix this. They demand constant manual
upkeep and don't absorb the rest of your company's knowledge.
**ktx** does both, automatically:
- **Learns from company knowledge.** Ingests wiki content, organizes it,
removes duplicates, and flags contradictions for human review.
- **Maps the data stack.** Samples tables, captures metadata and usage
patterns, detects joinable columns, and annotates sources so agents write
better queries.
- **Builds a semantic layer.** Combines raw tables and high-level metrics
through a join graph that automatically resolves chasm and fan traps, so
agents fetch metrics declaratively instead of rewriting canonical SQL each
time.
- **Serves agents at execution.** Exposes CLI and MCP tools with combined
full-text and semantic search across wiki and semantic-layer entities.
## How ktx compares
2026-05-10 23:12:26 +02:00
| | General-purpose agent | Traditional semantic layer | **ktx** |
| --- | :---: | :---: | :---: |
| Builds warehouse context automatically | — | — | ✓ |
| Detects joinable columns + resolves fan/chasm traps | — | Manual | ✓ |
| Approved, reusable metric definitions | — | ✓ | ✓ |
| Absorbs wiki / Notion / team knowledge | — | — | ✓ |
| Flags contradictions across sources | — | — | ✓ |
| Ships CLI + MCP for agent execution | Partial | — | ✓ |
| Read-only by design | n/a | n/a | ✓ |
## Who is ktx for
2026-05-19 15:23:58 +02:00
**Use ktx if you:**
2026-05-19 15:23:58 +02:00
- Want agents like Claude Code, Codex, Cursor, or OpenCode to query your
warehouse with approved metric definitions
- Have business knowledge scattered across dbt, Looker, Metabase, Notion, and
team wikis
- Need agents to reuse canonical SQL instead of inventing it on every prompt
**Skip ktx if you:**
- You don't have a SQL warehouse - **ktx** sits on top of one
- You only need one ad-hoc query - `psql` or a notebook will do
Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and
SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.
2026-05-19 15:23:58 +02:00
## Quick Start
feat: npm-managed Python runtime for @kaelio/ktx (#7) * docs: add npm managed python runtime design * build: add bundled python runtime wheel builder * build: make local embedding dependencies optional * build: bundle python runtime wheel in cli artifacts * build: track bundled python runtime release artifact * test: verify bundled python runtime wheel * docs: add plan for bundled python runtime wheel * test: cover managed python runtime lifecycle * feat: add managed python runtime installer * feat: add runtime command runner * feat: expose runtime management commands * test: verify managed python runtime commands * docs: add plan for managed python runtime installer * feat: add managed python command helper * feat: use managed runtime for sl query compute * feat: route sl query managed runtime policy * docs: add plan for managed runtime sl query integration * feat: add managed runtime daemon metadata * feat: manage python daemon lifecycle * feat: add runtime daemon start stop commands * fix: verify managed runtime daemon lifecycle * docs: add plan for managed runtime daemon lifecycle * feat: add managed local embeddings config marker * feat: add managed local embeddings daemon helper * feat: use managed runtime for local embedding setup * feat: pass managed runtime policy through setup * docs: add plan for managed local embeddings runtime * feat: read CLI package metadata dynamically * feat: assemble public kaelio ktx npm package * feat: release one public kaelio ktx npm artifact * test: cover public kaelio ktx package invocations * chore: verify public kaelio ktx package artifacts * docs: add plan for public kaelio ktx npm package * test: verify managed runtime in public package smoke * test: finalize managed runtime release smoke * docs: add plan for managed runtime release smoke * test: specify local embeddings release smoke * feat: add local embeddings runtime smoke * chore: register local embeddings smoke * fix: verify local embeddings smoke * fix: restore artifact smoke python env helper * docs: add plan for managed local embeddings release smoke * refactor: share managed runtime install policy parsing * feat: use managed runtime for agent semantic queries * feat: use managed runtime for MCP semantic compute * docs: add plan for managed agent and MCP semantic runtime * feat(cli): add managed daemon HTTP helpers * feat(cli): route local adapters through managed daemon * feat(cli): use managed daemon for ingest helpers * feat(cli): pass managed daemon options to scan * feat(context): pass MCP ingest pull config options * feat(cli): pass managed daemon options to serve ingest * test: verify managed local ingest daemon runtime * docs: add plan for managed local ingest daemon runtime * docs: align managed runtime examples * docs: add plan for managed runtime docs cleanup * test: cover published package runtime smoke commands * test: validate published package smoke outputs * docs: add plan for published package runtime smoke * build: stamp public npm package version * release: add npm public release policy * release: add guarded npm publish script * release: document public npm release handoff * docs: add plan for public npm release handoff * test: cover managed runtime prune in package smoke * docs: document managed runtime prune * docs: add plan for managed runtime prune smoke and docs * chore: encode uv runtime prerequisite policy * fix: clarify missing uv runtime error * docs: document uv runtime prerequisite * docs: add plan for uv runtime prerequisite contract * refactor: limit release artifacts to public package runtime * chore: align release policy with bundled runtime wheel * docs: describe single public runtime artifact surface * test: verify single public runtime artifact contract * docs: add plan for single public runtime artifact cleanup * fix: align local embeddings smoke with public version * docs: add plan for local embeddings smoke public version * release: soft-launch as @kaelio/ktx@0.1.0-rc.0 on next tag Publish target moves to the pre-release version 0.1.0-rc.0 under the next dist-tag so npm install @kaelio/ktx (which resolves to latest) does not pick up the soft-launch build. Users opt in via @kaelio/ktx@next. * Fix release script boundary checks * Remove PostHog from public package bundle
2026-05-11 15:50:34 +02:00
```bash
npm install -g @kaelio/ktx
ktx setup
ktx status
2026-05-10 23:12:26 +02:00
```
`ktx setup` creates or resumes a local **ktx** project, configures providers
and connections, builds context, and installs agent integration.
Example `ktx status` after setup:
feat(setup): add Claude Desktop target and MCP-first agent setup (#114) * feat(setup): add Claude Desktop target and MCP-first agent setup Adds `ktx mcp stdio` and a `claude-desktop` setup target that generates a local plugin ZIP wiring the analytics skill and a stdio MCP config. Replaces the CLI-only agent install mode with MCP+analytics (default) and an optional admin CLI skill, renames the research skill to analytics, and lets interactive setup pick project vs global scope when every target supports it. Extracts a shared MCP server factory used by both HTTP and stdio entrypoints. * Add MCP agent client setup support * Polish setup output formatting * Add MCP tool polish design spec Design for slimming the MCP-registered surface from 25 to 11 tools, introducing memory_ingest, applying the per-tool polish kit (annotations, outputSchema, .describe(), in-band error wrapping, union-drift fixes, type-narrowed jsonToolResult), emitting progress notifications on sql_execution + sl_query, and refining the ktx-analytics SKILL.md to match. * Refine MCP tool polish design spec after adversarial review iteration 1 * Refine MCP tool polish design spec after adversarial review iteration 2 * Refine MCP tool polish design spec after adversarial review iteration 3 * refactor(context): rename memory capture service to ingest * feat(mcp): slim research tool surface * refactor(mcp): remove admin ports from server factory * refactor(cli): rename text ingest memory port * docs: update analytics skill for memory ingest * chore: verify mcp surface rename * Add MCP tool polish v1 surface change plan * feat(context): polish mcp tool metadata * fix(context): enforce resolved semantic layer compute sources * feat(context): emit mcp query progress stages * fix(context): keep mcp progress event internal * Add MCP tool polish v1 metadata & progress plan * Fix CI snapshot and docs checks
2026-05-16 11:39:55 +02:00
```text
ktx project: /home/user/analytics
feat(setup): add Claude Desktop target and MCP-first agent setup (#114) * feat(setup): add Claude Desktop target and MCP-first agent setup Adds `ktx mcp stdio` and a `claude-desktop` setup target that generates a local plugin ZIP wiring the analytics skill and a stdio MCP config. Replaces the CLI-only agent install mode with MCP+analytics (default) and an optional admin CLI skill, renames the research skill to analytics, and lets interactive setup pick project vs global scope when every target supports it. Extracts a shared MCP server factory used by both HTTP and stdio entrypoints. * Add MCP agent client setup support * Polish setup output formatting * Add MCP tool polish design spec Design for slimming the MCP-registered surface from 25 to 11 tools, introducing memory_ingest, applying the per-tool polish kit (annotations, outputSchema, .describe(), in-band error wrapping, union-drift fixes, type-narrowed jsonToolResult), emitting progress notifications on sql_execution + sl_query, and refining the ktx-analytics SKILL.md to match. * Refine MCP tool polish design spec after adversarial review iteration 1 * Refine MCP tool polish design spec after adversarial review iteration 2 * Refine MCP tool polish design spec after adversarial review iteration 3 * refactor(context): rename memory capture service to ingest * feat(mcp): slim research tool surface * refactor(mcp): remove admin ports from server factory * refactor(cli): rename text ingest memory port * docs: update analytics skill for memory ingest * chore: verify mcp surface rename * Add MCP tool polish v1 surface change plan * feat(context): polish mcp tool metadata * fix(context): enforce resolved semantic layer compute sources * feat(context): emit mcp query progress stages * fix(context): keep mcp progress event internal * Add MCP tool polish v1 metadata & progress plan * Fix CI snapshot and docs checks
2026-05-16 11:39:55 +02:00
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
2026-05-18 19:47:28 -04:00
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
ktx context built: yes
feat(setup): add Claude Desktop target and MCP-first agent setup (#114) * feat(setup): add Claude Desktop target and MCP-first agent setup Adds `ktx mcp stdio` and a `claude-desktop` setup target that generates a local plugin ZIP wiring the analytics skill and a stdio MCP config. Replaces the CLI-only agent install mode with MCP+analytics (default) and an optional admin CLI skill, renames the research skill to analytics, and lets interactive setup pick project vs global scope when every target supports it. Extracts a shared MCP server factory used by both HTTP and stdio entrypoints. * Add MCP agent client setup support * Polish setup output formatting * Add MCP tool polish design spec Design for slimming the MCP-registered surface from 25 to 11 tools, introducing memory_ingest, applying the per-tool polish kit (annotations, outputSchema, .describe(), in-band error wrapping, union-drift fixes, type-narrowed jsonToolResult), emitting progress notifications on sql_execution + sl_query, and refining the ktx-analytics SKILL.md to match. * Refine MCP tool polish design spec after adversarial review iteration 1 * Refine MCP tool polish design spec after adversarial review iteration 2 * Refine MCP tool polish design spec after adversarial review iteration 3 * refactor(context): rename memory capture service to ingest * feat(mcp): slim research tool surface * refactor(mcp): remove admin ports from server factory * refactor(cli): rename text ingest memory port * docs: update analytics skill for memory ingest * chore: verify mcp surface rename * Add MCP tool polish v1 surface change plan * feat(context): polish mcp tool metadata * fix(context): enforce resolved semantic layer compute sources * feat(context): emit mcp query progress stages * fix(context): keep mcp progress event internal * Add MCP tool polish v1 metadata & progress plan * Fix CI snapshot and docs checks
2026-05-16 11:39:55 +02:00
Agent integration ready: yes (codex:project)
```
> [!TIP]
> Already using an agent? Ask Claude Code, Codex, Cursor, or OpenCode from
> your project directory:
>
> ```text
> Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill to install
> and configure ktx in this project.
> ```
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
> [!IMPORTANT]
> If `ktx status` prints `ktx mcp start --project-dir ...`, run it before
> opening your agent client.
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
## First commands
| Command | Purpose |
| --- | --- |
| `ktx setup` | Create, resume, or update a **ktx** project |
| `ktx status` | Check project readiness |
| `ktx ingest` | Build context for every configured connection |
| `ktx sl "revenue"` | Search semantic sources |
| `ktx wiki "refund policy"` | Search local wiki pages |
| `ktx mcp start` | Start the MCP server for agent clients |
See the [CLI Reference](https://docs.kaelio.com/ktx/docs/cli-reference/ktx)
for every command, flag, and option.
## Project Layout
```text
my-project/
├── ktx.yaml # Project configuration
├── semantic-layer/<connection-id>/ # YAML semantic sources
├── wiki/global/ # Shared business context
├── wiki/user/<user-id>/ # User-scoped notes
├── raw-sources/<connection-id>/ # Ingest artifacts and reports
└── .ktx/ # Local state and secrets, git-ignored
```
Commit `ktx.yaml`, `semantic-layer/`, and `wiki/`. Keep `.ktx/` local.
Project resolution defaults to `KTX_PROJECT_DIR`, then the nearest `ktx.yaml`,
then the current directory. Pass `--project-dir <path>` when scripting.
feat: npm-managed Python runtime for @kaelio/ktx (#7) * docs: add npm managed python runtime design * build: add bundled python runtime wheel builder * build: make local embedding dependencies optional * build: bundle python runtime wheel in cli artifacts * build: track bundled python runtime release artifact * test: verify bundled python runtime wheel * docs: add plan for bundled python runtime wheel * test: cover managed python runtime lifecycle * feat: add managed python runtime installer * feat: add runtime command runner * feat: expose runtime management commands * test: verify managed python runtime commands * docs: add plan for managed python runtime installer * feat: add managed python command helper * feat: use managed runtime for sl query compute * feat: route sl query managed runtime policy * docs: add plan for managed runtime sl query integration * feat: add managed runtime daemon metadata * feat: manage python daemon lifecycle * feat: add runtime daemon start stop commands * fix: verify managed runtime daemon lifecycle * docs: add plan for managed runtime daemon lifecycle * feat: add managed local embeddings config marker * feat: add managed local embeddings daemon helper * feat: use managed runtime for local embedding setup * feat: pass managed runtime policy through setup * docs: add plan for managed local embeddings runtime * feat: read CLI package metadata dynamically * feat: assemble public kaelio ktx npm package * feat: release one public kaelio ktx npm artifact * test: cover public kaelio ktx package invocations * chore: verify public kaelio ktx package artifacts * docs: add plan for public kaelio ktx npm package * test: verify managed runtime in public package smoke * test: finalize managed runtime release smoke * docs: add plan for managed runtime release smoke * test: specify local embeddings release smoke * feat: add local embeddings runtime smoke * chore: register local embeddings smoke * fix: verify local embeddings smoke * fix: restore artifact smoke python env helper * docs: add plan for managed local embeddings release smoke * refactor: share managed runtime install policy parsing * feat: use managed runtime for agent semantic queries * feat: use managed runtime for MCP semantic compute * docs: add plan for managed agent and MCP semantic runtime * feat(cli): add managed daemon HTTP helpers * feat(cli): route local adapters through managed daemon * feat(cli): use managed daemon for ingest helpers * feat(cli): pass managed daemon options to scan * feat(context): pass MCP ingest pull config options * feat(cli): pass managed daemon options to serve ingest * test: verify managed local ingest daemon runtime * docs: add plan for managed local ingest daemon runtime * docs: align managed runtime examples * docs: add plan for managed runtime docs cleanup * test: cover published package runtime smoke commands * test: validate published package smoke outputs * docs: add plan for published package runtime smoke * build: stamp public npm package version * release: add npm public release policy * release: add guarded npm publish script * release: document public npm release handoff * docs: add plan for public npm release handoff * test: cover managed runtime prune in package smoke * docs: document managed runtime prune * docs: add plan for managed runtime prune smoke and docs * chore: encode uv runtime prerequisite policy * fix: clarify missing uv runtime error * docs: document uv runtime prerequisite * docs: add plan for uv runtime prerequisite contract * refactor: limit release artifacts to public package runtime * chore: align release policy with bundled runtime wheel * docs: describe single public runtime artifact surface * test: verify single public runtime artifact contract * docs: add plan for single public runtime artifact cleanup * fix: align local embeddings smoke with public version * docs: add plan for local embeddings smoke public version * release: soft-launch as @kaelio/ktx@0.1.0-rc.0 on next tag Publish target moves to the pre-release version 0.1.0-rc.0 under the next dist-tag so npm install @kaelio/ktx (which resolves to latest) does not pick up the soft-launch build. Users opt in via @kaelio/ktx@next. * Fix release script boundary checks * Remove PostHog from public package bundle
2026-05-11 15:50:34 +02:00
## FAQ
- **Does ktx send my schema or query results to a hosted service?**
No. **ktx** runs locally. The only data leaving your machine is what you
send to the LLM provider you configured.
- **Which LLM backends are supported?**
feat: add codex llm backend for ktx runtime work (#253) * feat: add codex sdk runner foundation * feat: parse codex runtime events * feat: expose codex runtime mcp tools * feat: add codex llm runtime * feat: wire codex llm backend * test: avoid Array.fromAsync in codex runner test * docs: document codex llm backend * fix: tighten codex runtime config ownership * fix: use codex sdk env and thread options * fix: parse codex sdk event shapes * test: add codex backend live smoke * docs: clarify codex backend isolation * fix: drive codex loop metrics from mcp events * fix: enforce codex local step budget * docs: disclose codex isolation limits * fix: count all codex agent steps and stream step callbacks live The agent-loop step budget only counted completed mcp_tool_call items, so built-in command_execution steps (which the public Codex SDK/CLI surface can still expose) never decremented the budget, letting ingest/reconciliation run past stepBudget until Codex stopped on its own. onStepFinish was also replayed only after the whole stream drained, so live work_unit_step / reconciliation progress appeared stuck until the Codex process exited. collectEvents is now the single live step accumulator: it counts every completed agent-action item via a shared isCompletedAgentStep predicate (command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish as each step completes, and enforces the budget on that broader count. A no-tool turn still counts as one step. toolFailures stays MCP-specific, since a non-zero command exit is normal agent exploration, not a loop failure. * test: align ingest llm-guard assertions with codex backend The skip-llm ingest guard message now lists codex as a valid backend and mentions a Claude Code/Codex session plus a codex setup hint, but this slow suite test still asserted the pre-codex wording. Update it to match the production message (already covered by the local-bundle-runtime unit test) and add the codex setup-line assertion. * fix: treat codex error:null tool calls as success The Codex SDK serializes error: null on successful mcp_tool_call items, so the failure check (item.error !== undefined) flagged every successful tool call as failed with the empty-payload default "Codex turn failed". This killed every ingest work unit under the codex backend before it could produce a patch. Key on status === 'failed' (authoritative, always set) and only treat a populated error object as a failure. Add a regression test built from a verbatim real-SDK event capture. * fix: default codex backend to gpt-5.5 and report real probe errors The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API rejects under ChatGPT-account (subscription) auth, so codex status/setup failed with a misleading "authentication is not usable" message even though auth was fine. - Default codex model is now gpt-5.5 (works on both subscription and API-key auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark). - runCodexAuthProbe now distinguishes "model not available" from an auth failure and surfaces the real API error: collectEvents retains stream events when the SDK throws on a non-zero exit, and the API error JSON envelope is unwrapped to its human-readable message. - The Codex isolation warning now renders inside the clack setup frame. - Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth. * fix: require llm.models.default in status and match codex probe remediation Status reported a project ready when a non-none LLM backend was configured without llm.models.default, but the runtime (resolveModelSlots) hard-requires it, so ingest/scan/memory threw after `ktx status` said the project was usable. buildLlmStatus now fails for any non-none backend missing models.default and no longer invents a fallback model for claude-code/codex. Codex probe failures now carry a category-matched fix: a model-access failure steers the user at llm.models.default instead of the auth/install remediation. runCodexAuthProbe returns the fix and status consumes it; the message stays self-sufficient so setup output is unchanged. Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx states --llm-model only accepts codex/default or gpt-*/codex-* ids. Repaired four doctor fixtures that configured a backend without models.default (the now-correctly-blocked config) and added coverage for the new behavior.
2026-06-02 13:57:11 +02:00
Anthropic API, Google Vertex AI, AI Gateway, the local Claude Code session
through the Claude Agent SDK, and your local Codex authentication through the
Codex SDK. See
[LLM configuration](https://docs.kaelio.com/ktx/docs/guides/llm-configuration).
- **How is ktx different from a dbt or MetricFlow semantic layer?**
**ktx** *ingests* those layers and combines them with raw-table
introspection and wiki content. Agents get one searchable surface instead
of three disconnected ones - and **ktx** flags contradictions across
sources.
- **Does ktx need a running server?**
There is no hosted service. The local MCP daemon runs on demand via
`ktx mcp start` when an agent client needs it.
- **Is my warehouse safe?**
Yes. Connections are read-only - **ktx** never writes to your database.
2026-05-10 23:12:26 +02:00
## Docs
feat(setup): add Claude Desktop target and MCP-first agent setup (#114) * feat(setup): add Claude Desktop target and MCP-first agent setup Adds `ktx mcp stdio` and a `claude-desktop` setup target that generates a local plugin ZIP wiring the analytics skill and a stdio MCP config. Replaces the CLI-only agent install mode with MCP+analytics (default) and an optional admin CLI skill, renames the research skill to analytics, and lets interactive setup pick project vs global scope when every target supports it. Extracts a shared MCP server factory used by both HTTP and stdio entrypoints. * Add MCP agent client setup support * Polish setup output formatting * Add MCP tool polish design spec Design for slimming the MCP-registered surface from 25 to 11 tools, introducing memory_ingest, applying the per-tool polish kit (annotations, outputSchema, .describe(), in-band error wrapping, union-drift fixes, type-narrowed jsonToolResult), emitting progress notifications on sql_execution + sl_query, and refining the ktx-analytics SKILL.md to match. * Refine MCP tool polish design spec after adversarial review iteration 1 * Refine MCP tool polish design spec after adversarial review iteration 2 * Refine MCP tool polish design spec after adversarial review iteration 3 * refactor(context): rename memory capture service to ingest * feat(mcp): slim research tool surface * refactor(mcp): remove admin ports from server factory * refactor(cli): rename text ingest memory port * docs: update analytics skill for memory ingest * chore: verify mcp surface rename * Add MCP tool polish v1 surface change plan * feat(context): polish mcp tool metadata * fix(context): enforce resolved semantic layer compute sources * feat(context): emit mcp query progress stages * fix(context): keep mcp progress event internal * Add MCP tool polish v1 metadata & progress plan * Fix CI snapshot and docs checks
2026-05-16 11:39:55 +02:00
- [Quickstart](https://docs.kaelio.com/ktx/docs/getting-started/quickstart)
- [The Context Layer](https://docs.kaelio.com/ktx/docs/concepts/the-context-layer)
- [Building Context](https://docs.kaelio.com/ktx/docs/guides/building-context)
- [CLI Reference](https://docs.kaelio.com/ktx/docs/cli-reference/ktx)
- [Agent Quickstart](https://docs.kaelio.com/ktx/docs/ai-resources/agent-quickstart)
- [Community & Support](https://docs.kaelio.com/ktx/docs/community/support)
feat(setup): add Claude Desktop target and MCP-first agent setup (#114) * feat(setup): add Claude Desktop target and MCP-first agent setup Adds `ktx mcp stdio` and a `claude-desktop` setup target that generates a local plugin ZIP wiring the analytics skill and a stdio MCP config. Replaces the CLI-only agent install mode with MCP+analytics (default) and an optional admin CLI skill, renames the research skill to analytics, and lets interactive setup pick project vs global scope when every target supports it. Extracts a shared MCP server factory used by both HTTP and stdio entrypoints. * Add MCP agent client setup support * Polish setup output formatting * Add MCP tool polish design spec Design for slimming the MCP-registered surface from 25 to 11 tools, introducing memory_ingest, applying the per-tool polish kit (annotations, outputSchema, .describe(), in-band error wrapping, union-drift fixes, type-narrowed jsonToolResult), emitting progress notifications on sql_execution + sl_query, and refining the ktx-analytics SKILL.md to match. * Refine MCP tool polish design spec after adversarial review iteration 1 * Refine MCP tool polish design spec after adversarial review iteration 2 * Refine MCP tool polish design spec after adversarial review iteration 3 * refactor(context): rename memory capture service to ingest * feat(mcp): slim research tool surface * refactor(mcp): remove admin ports from server factory * refactor(cli): rename text ingest memory port * docs: update analytics skill for memory ingest * chore: verify mcp surface rename * Add MCP tool polish v1 surface change plan * feat(context): polish mcp tool metadata * fix(context): enforce resolved semantic layer compute sources * feat(context): emit mcp query progress stages * fix(context): keep mcp progress event internal * Add MCP tool polish v1 metadata & progress plan * Fix CI snapshot and docs checks
2026-05-16 11:39:55 +02:00
## Community
feat(setup): add Claude Desktop target and MCP-first agent setup (#114) * feat(setup): add Claude Desktop target and MCP-first agent setup Adds `ktx mcp stdio` and a `claude-desktop` setup target that generates a local plugin ZIP wiring the analytics skill and a stdio MCP config. Replaces the CLI-only agent install mode with MCP+analytics (default) and an optional admin CLI skill, renames the research skill to analytics, and lets interactive setup pick project vs global scope when every target supports it. Extracts a shared MCP server factory used by both HTTP and stdio entrypoints. * Add MCP agent client setup support * Polish setup output formatting * Add MCP tool polish design spec Design for slimming the MCP-registered surface from 25 to 11 tools, introducing memory_ingest, applying the per-tool polish kit (annotations, outputSchema, .describe(), in-band error wrapping, union-drift fixes, type-narrowed jsonToolResult), emitting progress notifications on sql_execution + sl_query, and refining the ktx-analytics SKILL.md to match. * Refine MCP tool polish design spec after adversarial review iteration 1 * Refine MCP tool polish design spec after adversarial review iteration 2 * Refine MCP tool polish design spec after adversarial review iteration 3 * refactor(context): rename memory capture service to ingest * feat(mcp): slim research tool surface * refactor(mcp): remove admin ports from server factory * refactor(cli): rename text ingest memory port * docs: update analytics skill for memory ingest * chore: verify mcp surface rename * Add MCP tool polish v1 surface change plan * feat(context): polish mcp tool metadata * fix(context): enforce resolved semantic layer compute sources * feat(context): emit mcp query progress stages * fix(context): keep mcp progress event internal * Add MCP tool polish v1 metadata & progress plan * Fix CI snapshot and docs checks
2026-05-16 11:39:55 +02:00
- **[Slack](https://join.slack.com/t/ktxcommunity/shared_invite/zt-3y9b44m1x-LVyNNJD5nwaZHq4XS29LMQ)** — ask questions, share what you're building, and chat with maintainers.
- **[GitHub Issues](https://github.com/Kaelio/ktx/issues)** — report bugs and request features.
- **[Contributing](https://docs.kaelio.com/ktx/docs/community/contributing)** — set up the repo, run tests, and open a PR.
feat(setup): add Claude Desktop target and MCP-first agent setup (#114) * feat(setup): add Claude Desktop target and MCP-first agent setup Adds `ktx mcp stdio` and a `claude-desktop` setup target that generates a local plugin ZIP wiring the analytics skill and a stdio MCP config. Replaces the CLI-only agent install mode with MCP+analytics (default) and an optional admin CLI skill, renames the research skill to analytics, and lets interactive setup pick project vs global scope when every target supports it. Extracts a shared MCP server factory used by both HTTP and stdio entrypoints. * Add MCP agent client setup support * Polish setup output formatting * Add MCP tool polish design spec Design for slimming the MCP-registered surface from 25 to 11 tools, introducing memory_ingest, applying the per-tool polish kit (annotations, outputSchema, .describe(), in-band error wrapping, union-drift fixes, type-narrowed jsonToolResult), emitting progress notifications on sql_execution + sl_query, and refining the ktx-analytics SKILL.md to match. * Refine MCP tool polish design spec after adversarial review iteration 1 * Refine MCP tool polish design spec after adversarial review iteration 2 * Refine MCP tool polish design spec after adversarial review iteration 3 * refactor(context): rename memory capture service to ingest * feat(mcp): slim research tool surface * refactor(mcp): remove admin ports from server factory * refactor(cli): rename text ingest memory port * docs: update analytics skill for memory ingest * chore: verify mcp surface rename * Add MCP tool polish v1 surface change plan * feat(context): polish mcp tool metadata * fix(context): enforce resolved semantic layer compute sources * feat(context): emit mcp query progress stages * fix(context): keep mcp progress event internal * Add MCP tool polish v1 metadata & progress plan * Fix CI snapshot and docs checks
2026-05-16 11:39:55 +02:00
2026-05-10 23:12:26 +02:00
## Development
```bash
git clone https://github.com/kaelio/ktx.git
cd ktx
2026-05-10 23:12:26 +02:00
pnpm install
uv sync --all-groups
pnpm run build
2026-05-10 23:12:26 +02:00
pnpm run check
```
**ktx** is a pnpm + uv workspace:
| Path | Purpose |
| --- | --- |
| `packages/cli` | TypeScript CLI and published npm package source |
| `packages/cli/src/context` | Core context engine |
| `packages/cli/src/llm` | LLM and embedding providers |
| `packages/cli/src/connectors` | Database scan connectors |
| `python/ktx-sl` | Semantic-layer query planning |
| `python/ktx-daemon` | Portable compute service |
Local development CLI:
2026-05-10 23:12:26 +02:00
```bash
pnpm run setup:dev
2026-05-10 23:12:26 +02:00
pnpm run link:dev
2026-05-10 23:51:24 +02:00
ktx-dev --help
2026-05-10 23:12:26 +02:00
```
Useful checks:
```bash
pnpm run type-check
pnpm run test
pnpm run dead-code
uv run pytest -q
```
## Telemetry
**ktx** collects privacy-conscious usage telemetry to understand installs and
improve setup, command reliability, and data-agent workflows. Catalog telemetry
events do not record file paths, hostnames, SQL, schema names, table names,
column names, error messages, raw environment values, or argv. Error reports use
PostHog Error Tracking and can include stack frames and raw error messages,
which may contain local file paths or the local username in those paths.
**ktx** redacts secrets, credentials, database URLs, auth headers, argv, raw
environment values, SQL text, row data, and user-typed prompt or MCP argument
text from the explicit `$exception` payload. See
[Telemetry](https://docs.kaelio.com/ktx/docs/community/telemetry) for the event
catalog and opt-out options.
2026-05-10 23:12:26 +02:00
## License
**ktx** is licensed under the Apache License, Version 2.0. See `LICENSE`.
## Star History
<p align="center">
<a href="https://star-history.com/#Kaelio/ktx&Date">
<img src="assets/star-history.svg" alt="ktx Star History Chart" width="700" />
</a>
</p>