ktx is the context layer for analytics agents https://docs.kaelio.com/ktx
Andrey Avtomonov 2366b00301
chore(workspace): gate dead-code with knip production mode (#196)
* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm

* refactor(workspace): rewrite @ktx/llm imports to relative paths

* refactor(workspace): fold internal packages into cli

* chore(workspace): gate dead-code with knip production mode

Turn on production-mode knip plus an autofix run in pre-commit and the
`pnpm dead-code` script, document the `/** @internal */` convention for
test-only exports in AGENTS.md, annotate test-only exports across the
CLI with that JSDoc, and drop dead exports/wrappers the new gate
surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`,
`createLocalScanEnrichmentProvidersFromConfig`,
`PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports).
Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit
production entries so cross-package barrel leaks are caught.

* refactor(cli): delete internal barrel index.ts files

The 34 `index.ts` re-export barrels inside `packages/cli/src/` were
holdovers from the pre-fold multi-workspace structure. Post-fold-in they
served no production purpose: external consumers go through the single
package main entry, and in-repo callers mostly imported through them
only because the path was short. Internally, knip flagged most barrel
re-exports as production-dead (only reached via tests).

This change:
- Deletes every internal barrel except `packages/cli/src/index.ts`
  (the published package entry).
- Rewrites ~270 source/test files to import each name directly from
  the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to
  `create-warehouse-verification-tools.ts` (the function it defined
  locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match
  the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*,
  live-database/extracted-schema, live-database/structural-sync,
  relationship-* feedback/review chain) plus their tests and a
  cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths
  (notion-client, connector barrels in scan/local-scan-connectors
  tests) to mock the source files instead.
- Points the maintainer benchmark script
  (`scripts/relationship-benchmark-report.mjs`) at source files
  instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit
  production entries only for the benchmark code reached via dist by
  the maintainer script.

Net: 413 files changed, ~1.2k insertions, ~9.4k deletions.

`pnpm run dead-code` (Biome + knip default + knip production) and
`pnpm run type-check` are clean; 2277 tests pass.

* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly

Promote the CLI workspace package to the public name `@kaelio/ktx` and
drop the separate `scripts/build-public-npm-package.mjs` wrapper. The
CLI package is now publishable in place (`publishConfig.access: public`,
`provenance: true`), so artifact packing uses `pnpm pack` against
`packages/cli/` instead of assembling a parallel package tree.

Updates all workspace filter invocations, docs, tests, and release
readiness checks to reference the new package name, and folds the
tarball-name helper into `scripts/public-npm-release-metadata.mjs`.

* docs: align "agent clients" and "data agents" terminology

Replace "client agents" with "agent clients" and "database agents" with
"data agents" across AGENTS.md, README.md, the docs-site copy, and the
matching setup-agents test description, matching the canonical
vocabulary in docs/terminology.md.

Also moves packages/cli/tsconfig.json's tsBuildInfoFile from
node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive
node_modules reinstalls.

* refactor(release): single source of truth for package version

Make packages/cli/package.json the single source of truth for the
@kaelio/ktx version. publicNpmPackageVersion() now reads it directly,
so artifact filenames, release-readiness checks, and the Python wheel
version all derive from one field. The duplicate
release-policy.json.publicNpmPackageVersion is removed.

Previously the two fields could drift: tarballs were named
kaelio-ktx-0.4.1.tgz while internally containing
@kaelio/ktx@0.0.0-private.

- update-public-release-version.mjs rewrites both Python pyproject.toml
  files (ktx-daemon, ktx-sl) alongside the npm package.jsons,
  normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to
  @semantic-release/git assets so the release commit back to main
  carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are
  replaced with "?? getKtxCliPackageInfo().version", and
  createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree
  always reflects the most recent release; no sentinel pin to
  maintain.
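
The PEP 440 normalization mentioned above (0.1.0-rc.2 -> 0.1.0rc2) can be sketched roughly as follows. This is an illustration only, not the code in update-public-release-version.mjs: the function name `toPep440`, the tag table, and the set of supported pre-release tags are assumptions.

```typescript
// Hypothetical helper: maps a semver-style version to its PEP 440 spelling.
// Only the "-<tag>.<n>" pre-release shape is handled in this sketch.
const PRERELEASE_TAGS: Record<string, string> = {
  alpha: "a", // PEP 440 spells alpha releases as "aN"
  beta: "b", // and beta releases as "bN"
  rc: "rc", // release candidates keep "rc"
};

function toPep440(version: string): string {
  const match = /^(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$/.exec(version);
  if (!match) {
    throw new Error(`Unsupported version: ${version}`);
  }
  const [, release, tag, n] = match;
  // Without a pre-release part the version is already valid PEP 440.
  return tag === undefined ? release : `${release}${PRERELEASE_TAGS[tag]}${n}`;
}
```

With this sketch, `toPep440("0.1.0-rc.2")` yields `0.1.0rc2`, and a plain release such as `0.4.1` passes through unchanged.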

Verified: pnpm run artifacts:build now produces
kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with
@kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and
2287 vitests + 173 script tests pass.

* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime

Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and
scan command entrypoints so tests can stub them, and teach
resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime
feature when ktx.yaml selects sentence-transformers.

* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal

Both symbols are consumed only by status-project.test.ts. Annotating with
/** @internal */ keeps knip's production-mode check clean without changing
runtime behavior.

* fix(cli): use real package metadata in print-command-tree

The stubbed package name embedded a forbidden product identifier that
tripped the boundary check in CI. Read the metadata from package.json
instead — keeps the rendered tree unchanged and removes a duplicate
source of truth.

* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts

Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer
source counts, computed with `SUM(embedding_json IS NOT NULL)` over
`knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to
"Wiki" (canonical per `docs/terminology.md`) and rename the matching
`localStats.knowledgePages` field to `localStats.wikiPages`.

Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row — those
duplicated the per-surface rows above. Disk now reports only actual byte
usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` /
`semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry`
helpers, and the `filter` arg on `summarizeDir` are removed.
2026-05-21 15:28:58 +02:00
.github chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
assets feat(docs-site): refresh nav mascot with SVG and bump size (#101) 2026-05-14 23:45:41 +02:00
docs chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
docs-site chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
examples chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
packages/cli chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
python chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
scripts chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
website feat(docs): add Fumadocs site workspace 2026-05-11 01:08:31 -07:00
.gitignore chore: remove private planning docs (#140) 2026-05-19 14:58:55 +02:00
.pre-commit-config.yaml chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
.releaserc.cjs feat: add claude-code llm backend with runtime port (#115) 2026-05-16 12:06:34 +02:00
AGENTS.md chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
biome.json feat: merge ingest and scan 2026-05-14 01:43:06 +02:00
CLAUDE.md Initial open-source release 2026-05-10 23:12:26 +02:00
codecov.yml refactor(release): drop release-policy.json runtime dep and next branch (#180) 2026-05-20 13:53:14 +02:00
conductor.json [codex] Add Conductor workspace scripts (#2) 2026-05-11 09:55:42 +02:00
CONTRIBUTING.md chore(community): rewards program, issue templates, and triage workflow (#176) 2026-05-19 19:42:06 -04:00
GEMINI.md Initial open-source release 2026-05-10 23:12:26 +02:00
knip.json chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
LICENSE ci: run pre-commit checks in CI (#74) 2026-05-13 19:49:25 +02:00
package.json chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
pnpm-lock.yaml chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
pnpm-workspace.yaml fix: resolve dependabot security advisories (#179) 2026-05-20 14:17:29 +02:00
pyproject.toml ci: add codecov coverage reporting (#82) 2026-05-14 01:13:31 +02:00
README.md chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
release-policy.json chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00
SECURITY.md chore(community): rewards program, issue templates, and triage workflow (#176) 2026-05-19 19:42:06 -04:00
tsconfig.base.json perf(setup): speed up conductor setup and make it rerun-safe (#107) 2026-05-15 12:06:37 +02:00
uv.lock chore(workspace): gate dead-code with knip production mode (#196) 2026-05-21 15:28:58 +02:00

ktx

The context layer for data agents



ktx is a self-improving context layer that teaches agents to query your warehouse accurately, using approved metric definitions, joinable columns, and business knowledge that it builds and maintains for you.

Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.

Runs with your own LLM API keys or a Claude Pro/Max subscription, with no extra usage billing from ktx.

Why ktx

General-purpose agents struggle on data tasks. They re-explore your warehouse on every question, invent their own metric logic, and return numbers that don't match approved definitions.

Traditional semantic layers don't fix this. They demand constant manual upkeep and don't absorb the rest of your company's knowledge.

ktx addresses both gaps automatically:

  • Learns from company knowledge. Ingests wiki content, organizes it, removes duplicates, and flags contradictions for human review.
  • Maps the data stack. Samples tables, captures metadata and usage patterns, detects joinable columns, and annotates sources so agents write better queries.
  • Builds a semantic layer. Combines raw tables and high-level metrics through a join graph that automatically resolves chasm and fan traps, so agents fetch metrics declaratively instead of rewriting canonical SQL each time.
  • Serves agents at execution. Exposes CLI and MCP tools with combined full-text and semantic search across wiki and semantic-layer entities.

Agents can run raw SQL when they need it, or compose semantic-layer queries when they want approved metrics with reliable joins.
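
To make the fan-trap problem concrete, here is a minimal sketch with in-memory data. It does not show how ktx resolves traps internally; the `Order` and `OrderItem` shapes and both function names are hypothetical. It only illustrates why summing a parent-level measure after a one-to-many join over-counts, and how aggregating each table at its own grain avoids that.

```typescript
// Illustrative data only; names and shapes are assumptions for this example.
interface Order { id: number; total: number }
interface OrderItem { orderId: number; quantity: number }

const orders: Order[] = [
  { id: 1, total: 100 },
  { id: 2, total: 50 },
];
const items: OrderItem[] = [
  { orderId: 1, quantity: 1 },
  { orderId: 1, quantity: 2 },
  { orderId: 2, quantity: 1 },
];

// Fan trap: joining orders to items repeats each order once per item,
// so summing order totals over the joined rows counts order 1 twice.
function naiveRevenue(os: Order[], is: OrderItem[]): number {
  let sum = 0;
  for (const order of os) {
    for (const item of is) {
      if (item.orderId === order.id) sum += order.total;
    }
  }
  return sum;
}

// Safer: aggregate items to one row per order first, then join one-to-one.
function perOrderTotals(
  os: Order[],
  is: OrderItem[],
): Array<{ orderId: number; total: number; quantity: number }> {
  const quantityByOrder = new Map<number, number>();
  for (const item of is) {
    quantityByOrder.set(
      item.orderId,
      (quantityByOrder.get(item.orderId) ?? 0) + item.quantity,
    );
  }
  return os.map((order) => ({
    orderId: order.id,
    total: order.total,
    quantity: quantityByOrder.get(order.id) ?? 0,
  }));
}
```

Here the naive join reports revenue of 250, while the pre-aggregated rows sum to the true 150 with a total quantity of 4.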

[Figure: ktx ingestion flow from source systems through validation to wiki and semantic-layer outputs]

Agent Setup

Ask an agent such as Claude Code, Codex, Cursor, or OpenCode to install and configure ktx from your project directory:

Follow instructions from
https://docs.kaelio.com/ktx/docs/agents-setup.md
to install and configure ktx

Quick Start

npm install -g @kaelio/ktx
ktx setup
ktx status

ktx setup creates or resumes a local ktx project, configures providers and connections, builds context, and installs agent integration.

Example ktx status output after setup:

ktx project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
ktx context built: yes
Agent integration ready: yes (codex:project)

Common Commands

Command                                           Purpose
ktx setup                                         Create, resume, or update a ktx project
ktx status                                        Check project readiness
ktx connection                                    List configured connections
ktx connection test                               Test every configured connection
ktx connection test <id>                          Test one connection
ktx ingest                                        Build context for every configured connection
ktx ingest <id>                                   Build context for one connection
ktx ingest --text "..."                           Capture free-form notes into memory
ktx ingest --file notes.md --connection-id <id>   Capture a text file into memory
ktx sl                                            List semantic sources
ktx sl "revenue"                                  Search semantic sources
ktx sl validate <source> --connection-id <id>     Validate a semantic source
ktx sl query --measure <measure> --format sql     Compile semantic-layer SQL
ktx sql --connection <id> "select 1"              Execute read-only SQL
ktx wiki                                          List local wiki pages
ktx wiki "revenue definition"                     Search local wiki pages
ktx mcp                                           Show MCP daemon status
ktx mcp start                                     Start the local MCP server for agent clients

Project resolution defaults to KTX_PROJECT_DIR, then the nearest ktx.yaml, then the current directory. Pass --project-dir <path> when scripting.
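
The resolution order described above can be sketched as a small function. This is an assumption-laden illustration, not the CLI's actual code: `resolveProjectDir`, `ResolveOptions`, and the injected `exists` predicate are hypothetical names, "nearest ktx.yaml" is assumed to mean an upward search from the current directory, and an explicit --project-dir value is assumed to take precedence over everything else.

```typescript
import * as path from "node:path";

// Hypothetical inputs; injected so the logic can be exercised without touching disk.
interface ResolveOptions {
  projectDirFlag?: string; // value of --project-dir, if passed
  env: Record<string, string | undefined>;
  cwd: string;
  exists: (filePath: string) => boolean;
}

function resolveProjectDir(opts: ResolveOptions): string {
  // Assumption: an explicit flag wins over every default.
  if (opts.projectDirFlag) return opts.projectDirFlag;

  // 1. KTX_PROJECT_DIR from the environment.
  const fromEnv = opts.env["KTX_PROJECT_DIR"];
  if (fromEnv) return fromEnv;

  // 2. Nearest ktx.yaml, walking up from the current directory.
  let dir = opts.cwd;
  while (true) {
    if (opts.exists(path.posix.join(dir, "ktx.yaml"))) return dir;
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }

  // 3. Fall back to the current directory.
  return opts.cwd;
}
```

In a real script the inputs would typically be `process.env`, `process.cwd()`, and `fs.existsSync`; the sketch uses POSIX path handling to keep the behavior easy to follow.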

Project Layout

my-project/
├── ktx.yaml                         # Project configuration
├── semantic-layer/<connection-id>/  # YAML semantic sources
├── wiki/global/                     # Shared business context
├── wiki/user/<user-id>/             # User-scoped notes
├── raw-sources/<connection-id>/     # Ingest artifacts and reports
└── .ktx/                            # Local state and secrets, git-ignored

Commit ktx.yaml, semantic-layer/, and wiki/. Keep .ktx/ local.

Agent Usage

Install ktx integration for Claude Code, Claude Desktop, Codex, Cursor, OpenCode, and generic .agents clients:

ktx setup --agents

Pass --target <target> to install or repair one specific integration.

A typical agent workflow combines wiki and semantic-layer search before querying:

ktx sl "revenue" --json
ktx wiki "refund policy" --json
ktx sl query --connection-id warehouse --measure orders.revenue --format sql
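
For scripting, the three-step workflow above could be assembled as argument lists before handing them to a process runner. This is a minimal sketch: `WorkflowInput` and `buildAgentWorkflow` are hypothetical names, and only the subcommands and flags shown above are used. The shape of the `--json` output is not documented here, so the sketch does not attempt to parse it.

```typescript
// Hypothetical input shape for one question an agent wants to answer.
interface WorkflowInput {
  slSearch: string; // search term for semantic sources
  wikiSearch: string; // search term for wiki pages
  connectionId: string;
  measure: string;
}

// Returns one argv array per ktx invocation, in the order an agent would run them.
function buildAgentWorkflow(input: WorkflowInput): string[][] {
  return [
    ["sl", input.slSearch, "--json"],
    ["wiki", input.wikiSearch, "--json"],
    [
      "sl", "query",
      "--connection-id", input.connectionId,
      "--measure", input.measure,
      "--format", "sql",
    ],
  ];
}
```

Each array can be passed as the argument list to Node's `child_process.execFileSync("ktx", args)`, which avoids shell quoting issues for search terms that contain spaces.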

During setup, choose "Ask data questions with ktx MCP" for agent clients. Choose "Ask data questions + manage ktx with CLI commands" when an operator agent also needs pinned ktx admin commands.

After setup, ktx prints a "Required before using agents" section with the exact commands to run. If the output includes ktx mcp start --project-dir ..., run it before opening your agent. Claude Desktop uses its own launcher and prints separate skill upload steps under .ktx/agents/claude/.

Workspace Layout

Path                          Purpose
packages/cli                  TypeScript CLI package and published npm package source
packages/cli/src/context      Core context engine
packages/cli/src/llm          LLM and embedding providers
packages/cli/src/connectors   Database scan connectors
python/ktx-sl                 Semantic-layer query planning
python/ktx-daemon             Portable compute service

Development

git clone https://github.com/kaelio/ktx.git
cd ktx
pnpm install
uv sync --all-groups
pnpm run build
pnpm run check

Use the development CLI locally:

pnpm run setup:dev
pnpm run link:dev
ktx-dev --help

ktx is a pnpm + uv workspace:

  • TypeScript packages live in packages/*
  • CLI source lives in packages/cli
  • Python runtime source lives in python/ktx-sl and python/ktx-daemon
  • Public docs live in docs-site/content/docs

Useful checks:

pnpm run type-check
pnpm run test
pnpm run dead-code
uv run pytest -q

Docs

Community

  • Slack — ask questions, share what you're building, and chat with maintainers and other users.
  • GitHub Issues — report bugs and request features.
  • Contributing guide — set up the repo, run tests, and open a PR.

See Community & Support for the full guide on where to ask what.

License

ktx is licensed under the Apache License, Version 2.0. See LICENSE.