The context layer for data agents
ktx is a self-improving context layer that teaches agents how to query your warehouse accurately - from approved metric definitions, joinable columns, and business knowledge it builds and maintains for you.
Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.
Runs with your own LLM API keys or a Claude Pro/Max subscription - no extra usage billing from ktx.
Why ktx
General-purpose agents struggle on data tasks. They re-explore your warehouse on every question, invent their own metric logic, and return numbers that don't match approved definitions.
Traditional semantic layers don't fix this. They demand constant manual upkeep and don't absorb the rest of your company's knowledge.
ktx does both, automatically:
- Learns from company knowledge. Ingests wiki content, organizes it, removes duplicates, and flags contradictions for human review.
- Maps the data stack. Samples tables, captures metadata and usage patterns, detects joinable columns, and annotates sources so agents write better queries.
- Builds a semantic layer. Combines raw tables and high-level metrics through a join graph that automatically resolves chasm and fan traps, so agents fetch metrics declaratively instead of rewriting canonical SQL each time.
- Serves agents at execution. Exposes CLI and MCP tools with combined full-text and semantic search across wiki and semantic-layer entities.
Agents can run raw SQL when they need it, or compose semantic-layer queries when they want approved metrics with reliable joins.
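To make the fan trap mentioned above concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not ktx's join-graph implementation. The `orders` and `order_items` tables and their values are hypothetical and only show why a naive join inflates a metric, and how aggregating each table at its own grain avoids it.

```python
import sqlite3

# Hypothetical two-table schema: one order has many items (a one-to-many fan-out).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE order_items (item_id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    INSERT INTO order_items VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 2, 'c');
    """
)

# Naive join: each order row repeats once per item, so order 1 is counted twice.
naive_total = conn.execute(
    "SELECT SUM(o.amount) FROM orders o "
    "JOIN order_items i ON i.order_id = o.order_id"
).fetchone()[0]

# Safer shape: aggregate each table at its own grain, then combine the results.
safe_total, item_count = conn.execute(
    """
    SELECT o.total, i.n
    FROM (SELECT SUM(amount) AS total FROM orders) AS o
    CROSS JOIN (SELECT COUNT(*) AS n FROM order_items) AS i
    """
).fetchone()

print(naive_total)  # 250.0 (inflated by the fan-out)
print(safe_total)   # 150.0 (true order revenue)
print(item_count)   # 3
```

A semantic layer that knows the grain of each source can generate the second query shape automatically, which is the kind of work the join graph is described as doing.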
Agent Setup
Ask an agent such as Claude Code, Codex, Cursor, or OpenCode to install and configure ktx from your project directory:
Follow instructions from
https://docs.kaelio.com/ktx/docs/agents-setup.md
to install and configure ktx
Quick Start
npm install -g @kaelio/ktx
ktx setup
ktx status
ktx setup creates or resumes a local ktx project, configures providers and
connections, builds context, and installs agent integration.
Example ktx status output after setup:
ktx project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
ktx context built: yes
Agent integration ready: yes (codex:project)
Common Commands
| Command | Purpose |
|---|---|
| `ktx setup` | Create, resume, or update a ktx project |
| `ktx status` | Check project readiness |
| `ktx connection` | List configured connections |
| `ktx connection test` | Test every configured connection |
| `ktx connection test <id>` | Test one connection |
| `ktx ingest` | Build context for every configured connection |
| `ktx ingest <id>` | Build context for one connection |
| `ktx ingest --text "..."` | Capture free-form notes into memory |
| `ktx ingest --file notes.md --connection-id <id>` | Capture a text file into memory |
| `ktx sl` | List semantic sources |
| `ktx sl "revenue"` | Search semantic sources |
| `ktx sl validate <source> --connection-id <id>` | Validate a semantic source |
| `ktx sl query --measure <measure> --format sql` | Compile semantic-layer SQL |
| `ktx sql --connection <id> "select 1"` | Execute read-only SQL |
| `ktx wiki` | List local wiki pages |
| `ktx wiki "revenue definition"` | Search local wiki pages |
| `ktx mcp` | Show MCP daemon status |
| `ktx mcp start` | Start the local MCP server for agent clients |
Project resolution defaults to KTX_PROJECT_DIR, then the nearest ktx.yaml,
then the current directory. Pass --project-dir <path> when scripting.
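The resolution order described above can be sketched in Python as follows. This is an illustration, not the CLI's actual code: `resolve_project_dir` is a hypothetical helper, and reading "nearest ktx.yaml" as "the first one found while walking up from the current directory" is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_project_dir(cwd: Path, env: Optional[Mapping[str, str]] = None) -> Path:
    """Hypothetical helper mirroring the documented lookup order."""
    env = os.environ if env is None else env

    # 1. An explicit KTX_PROJECT_DIR wins.
    override = env.get("KTX_PROJECT_DIR")
    if override:
        return Path(override)

    # 2. Otherwise use the nearest ktx.yaml (assumed: walk up through parents).
    for candidate in (cwd, *cwd.parents):
        if (candidate / "ktx.yaml").is_file():
            return candidate

    # 3. Fall back to the current directory.
    return cwd


# Small demo in a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ktx.yaml").write_text("# placeholder\n")
    nested = root / "models" / "staging"
    nested.mkdir(parents=True)

    found = resolve_project_dir(nested, env={})
    found_matches_root = found == root
    overridden = resolve_project_dir(nested, env={"KTX_PROJECT_DIR": "/srv/analytics"})

print(found_matches_root)  # True
print(overridden)
```

Because the result depends on both the environment and the working directory, scripts get more predictable behavior by passing --project-dir explicitly.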
Project Layout
my-project/
├── ktx.yaml # Project configuration
├── semantic-layer/<connection-id>/ # YAML semantic sources
├── wiki/global/ # Shared business context
├── wiki/user/<user-id>/ # User-scoped notes
├── raw-sources/<connection-id>/ # Ingest artifacts and reports
└── .ktx/ # Local state and secrets, git-ignored
Commit ktx.yaml, semantic-layer/, and wiki/. Keep .ktx/ local.
Agent Usage
Install ktx integration for Claude Code, Claude Desktop, Codex, Cursor,
OpenCode, and generic .agents clients:
ktx setup --agents
Pass --target <target> to install or repair one specific integration.
A typical agent workflow combines wiki and semantic-layer search before querying:
ktx sl "revenue" --json
ktx wiki "refund policy" --json
ktx sl query --connection-id warehouse --measure orders.revenue --format sql
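When an agent harness drives this workflow from a script, it can help to build the argument lists in one place. The sketch below only assembles the three commands shown above. `build_workflow_commands` is a hypothetical helper, and appending --project-dir at the end of each command is an assumption about flag placement.

```python
import shlex
from typing import List, Optional


def build_workflow_commands(
    term: str,
    wiki_query: str,
    connection_id: str,
    measure: str,
    project_dir: Optional[str] = None,
) -> List[List[str]]:
    """Hypothetical helper: argv lists for the search-then-query workflow."""
    # Assumption: --project-dir is accepted after the subcommand arguments.
    tail = ["--project-dir", project_dir] if project_dir else []
    return [
        ["ktx", "sl", term, "--json", *tail],
        ["ktx", "wiki", wiki_query, "--json", *tail],
        [
            "ktx", "sl", "query",
            "--connection-id", connection_id,
            "--measure", measure,
            "--format", "sql",
            *tail,
        ],
    ]


commands = build_workflow_commands("revenue", "refund policy", "warehouse", "orders.revenue")
for argv in commands:
    # shlex.join quotes arguments that contain spaces, such as the wiki query.
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True, capture_output=True, text=True)` on a machine where ktx is installed. The shape of the `--json` output is not described here, so inspect it before parsing.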
During setup, choose "Ask data questions with ktx MCP" for agent clients.
Choose "Ask data questions + manage ktx with CLI commands" when an operator
agent also needs pinned ktx admin commands.
After setup, ktx prints "Required before using agents" with the exact
commands to run. If the output includes ktx mcp start --project-dir ..., run
it before opening your agent. Claude Desktop uses its own launcher and prints
separate skill upload steps under .ktx/agents/claude/.
Workspace layout
| Path | Purpose |
|---|---|
| `packages/cli` | TypeScript CLI package and published npm package source |
| `packages/cli/src/context` | Core context engine |
| `packages/cli/src/llm` | LLM and embedding providers |
| `packages/cli/src/connectors` | Database scan connectors |
| `python/ktx-sl` | Semantic-layer query planning |
| `python/ktx-daemon` | Portable compute service |
Development
git clone https://github.com/kaelio/ktx.git
cd ktx
pnpm install
uv sync --all-groups
pnpm run build
pnpm run check
Use the development CLI locally:
pnpm run setup:dev
pnpm run link:dev
ktx-dev --help
ktx is a pnpm + uv workspace:
- TypeScript packages live in `packages/*`
- CLI source lives in `packages/cli`
- Python runtime source lives in `python/ktx-sl` and `python/ktx-daemon`
- Public docs live in `docs-site/content/docs`
Useful checks:
pnpm run type-check
pnpm run test
pnpm run dead-code
uv run pytest -q
Docs
Community
- Slack — ask questions, share what you're building, and chat with maintainers and other users.
- GitHub Issues — report bugs and request features.
- Contributing guide — set up the repo, run tests, and open a PR.
See Community & Support for the full guide on where to ask what.
License
ktx is licensed under the Apache License, Version 2.0. See LICENSE.