mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-10 08:05:14 +02:00
<h1 align="center">
  <img src="assets/ktx-lockup.svg" alt="ktx" width="500" />
</h1>

<h1 align="center">
  The context layer for data agents
</h1>

<p align="center">
  <a href="https://www.npmjs.com/package/@kaelio/ktx"><img src="https://img.shields.io/npm/v/@kaelio/ktx?style=flat-square&color=f97316" alt="npm version" /></a>
  <a href="https://codecov.io/gh/Kaelio/ktx"><img src="https://codecov.io/gh/Kaelio/ktx/graph/badge.svg?branch=main" alt="Codecov" /></a>
  <a href="https://github.com/Kaelio/ktx/actions/workflows/ci.yml?query=branch%3Amain"><img src="https://img.shields.io/github/actions/workflow/status/Kaelio/ktx/ci.yml?branch=main&label=tests&style=flat-square" alt="Tests" /></a>
  <a href="https://docs.kaelio.com/ktx/docs/"><img src="https://img.shields.io/badge/docs-ktx-22c55e?style=flat-square" alt="Documentation" /></a>
  <a href="https://join.slack.com/t/ktxcommunity/shared_invite/zt-3y9b44m1x-LVyNNJD5nwaZHq4XS29LMQ"><img src="https://img.shields.io/badge/slack-join%20community-4A154B?style=flat-square&logo=slack&logoColor=white" alt="Join the ktx Slack community" /></a>
  <a href="https://github.com/Kaelio/ktx/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue?style=flat-square" alt="License" /></a>
  <a href="https://www.ycombinator.com/companies?batch=P25"><img src="https://img.shields.io/badge/Y%20Combinator-P25-orange?style=flat-square" alt="Y Combinator P25" /></a>
</p>

---

**ktx** is a self-improving context layer that teaches agents how to query your
warehouse accurately, using approved metric definitions, joinable columns, and
business knowledge it builds and maintains for you.

Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and
SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.

Runs with your own LLM API keys or a **Claude Pro/Max subscription**, with no
extra usage billing from **ktx**.

## Why ktx

General-purpose agents struggle on data tasks. They re-explore your warehouse
on every question, invent their own metric logic, and return numbers that
don't match approved definitions.

Traditional semantic layers don't fix this. They demand constant manual
upkeep and don't absorb the rest of your company's knowledge.

**ktx** addresses both problems automatically:

- **Learns from company knowledge.** Ingests wiki content, organizes it,
  removes duplicates, and flags contradictions for human review.
- **Maps the data stack.** Samples tables, captures metadata and usage
  patterns, detects joinable columns, and annotates sources so agents write
  better queries.
- **Builds a semantic layer.** Combines raw tables and high-level metrics
  through a join graph that automatically resolves chasm and fan traps, so
  agents fetch metrics declaratively instead of rewriting canonical SQL each
  time.
- **Serves agents at execution.** Exposes CLI and MCP tools with combined
  full-text and semantic search across wiki and semantic-layer entities.

Agents can run raw SQL when they need it, or compose semantic-layer queries
when they want approved metrics with reliable joins.

<p align="center">
  <img src="docs-site/public/images/ingestion-flow-transparent.svg" alt="ktx ingestion flow from source systems through validation to wiki and semantic-layer outputs" width="900" />
</p>

## Agent Setup

Ask an agent such as Claude Code, Codex, Cursor, or OpenCode to install and
configure **ktx** from your project directory:

```text
Follow instructions from
https://docs.kaelio.com/ktx/docs/agents-setup.md
to install and configure ktx
```

## Quick Start

```bash
npm install -g @kaelio/ktx
ktx setup
ktx status
```

`ktx setup` creates or resumes a local **ktx** project, configures providers and
connections, builds context, and installs agent integration.

Example `ktx status` output after setup:

```text
ktx project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
ktx context built: yes
Agent integration ready: yes (codex:project)
```

## Common Commands

| Command | Purpose |
|---------|---------|
| `ktx setup` | Create, resume, or update a **ktx** project |
| `ktx status` | Check project readiness |
| `ktx connection` | List configured connections |
| `ktx connection test` | Test every configured connection |
| `ktx connection test <id>` | Test one connection |
| `ktx ingest` | Build context for every configured connection |
| `ktx ingest <id>` | Build context for one connection |
| `ktx ingest --text "..."` | Capture free-form notes into memory |
| `ktx ingest --file notes.md --connection-id <id>` | Capture a text file into memory |
| `ktx sl` | List semantic sources |
| `ktx sl "revenue"` | Search semantic sources |
| `ktx sl validate <source> --connection-id <id>` | Validate a semantic source |
| `ktx sl query --measure <measure> --format sql` | Compile semantic-layer SQL |
| `ktx sql --connection <id> "select 1"` | Execute read-only SQL |
| `ktx wiki` | List local wiki pages |
| `ktx wiki "revenue definition"` | Search local wiki pages |
| `ktx mcp` | Show MCP daemon status |
| `ktx mcp start` | Start the local MCP server for agent clients |

Project resolution checks `KTX_PROJECT_DIR` first, then the nearest `ktx.yaml`,
then falls back to the current directory. Pass `--project-dir <path>` when
scripting.

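The lookup order above can be sketched as a small shell function. This is an illustration, not the resolver **ktx** ships: the function name `resolve_ktx_project_dir` is hypothetical, and it assumes that "nearest `ktx.yaml`" means the closest one found by walking up from the starting directory.

```bash
#!/usr/bin/env bash
# Sketch of the documented lookup order; NOT ktx's actual implementation.
# resolve_ktx_project_dir is a hypothetical helper name.
# Assumption: "nearest ktx.yaml" means searching upward through parent directories.
resolve_ktx_project_dir() {
  # 1. An explicit KTX_PROJECT_DIR takes precedence.
  if [ -n "${KTX_PROJECT_DIR:-}" ]; then
    printf '%s\n' "$KTX_PROJECT_DIR"
    return 0
  fi

  # 2. Otherwise walk up from the starting directory (default: the current
  #    directory) until a ktx.yaml is found.
  local start dir
  start="$(cd "${1:-.}" && pwd -P)" || return 1
  dir="$start"
  while :; do
    if [ -f "$dir/ktx.yaml" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    [ "$dir" = "/" ] && break
    dir="$(dirname "$dir")"
  done

  # 3. No ktx.yaml anywhere above: fall back to the starting directory.
  printf '%s\n' "$start"
}
```

In a script, resolving the directory once and passing it explicitly, for example `ktx status --project-dir "$(resolve_ktx_project_dir)"`, keeps every command pointed at the same project regardless of where the script is invoked from.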
## Project Layout

```text
my-project/
├── ktx.yaml                         # Project configuration
├── semantic-layer/<connection-id>/  # YAML semantic sources
├── wiki/global/                     # Shared business context
├── wiki/user/<user-id>/             # User-scoped notes
├── raw-sources/<connection-id>/     # Ingest artifacts and reports
└── .ktx/                            # Local state and secrets, git-ignored
```

Commit `ktx.yaml`, `semantic-layer/`, and `wiki/`. Keep `.ktx/` local.

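If you want to double-check the ignore rule yourself, the sketch below adds `.ktx/` to a `.gitignore` only when it is missing. The helper name `ensure_ktx_ignored` is hypothetical, and `ktx setup` may already handle this for you; treat it as an optional safety net rather than a required step.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of ktx): make sure .ktx/ is git-ignored
# without adding a duplicate entry on repeated runs.
ensure_ktx_ignored() {
  local gitignore="${1:-.gitignore}"
  touch "$gitignore"

  # -x matches whole lines only, -F treats the pattern as a fixed string.
  if grep -qxF '.ktx/' "$gitignore"; then
    return 0
  fi

  # If the file is non-empty and lacks a trailing newline, add one first so
  # the new entry does not get glued onto the last existing line.
  if [ -s "$gitignore" ] && [ -n "$(tail -c1 "$gitignore")" ]; then
    printf '\n' >> "$gitignore"
  fi
  printf '%s\n' '.ktx/' >> "$gitignore"
}
```

A typical follow-up is `ensure_ktx_ignored && git add .gitignore ktx.yaml semantic-layer wiki`, which stages only the paths the layout above marks as shareable.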
## Agent Usage

Install **ktx** integration for Claude Code, Claude Desktop, Codex, Cursor,
OpenCode, and generic `.agents` clients:

```bash
ktx setup --agents
```

Pass `--target <target>` to install or repair one specific integration.

A typical agent workflow combines wiki and semantic-layer search before
querying:

```bash
ktx sl "revenue" --json
ktx wiki "refund policy" --json
ktx sl query --connection-id warehouse --measure orders.revenue --format sql
```

During setup, choose **Ask data questions with ktx MCP** for agent clients.
Choose **Ask data questions + manage ktx with CLI commands** when an operator
agent also needs pinned `ktx` admin commands.

After setup, **ktx** prints **Required before using agents** with the exact
commands to run. If the output includes `ktx mcp start --project-dir ...`, run
it before opening your agent. Claude Desktop uses its own launcher and prints
separate skill upload steps under `.ktx/agents/claude/`.

## Workspace Layout

| Path | Purpose |
|------|---------|
| `packages/cli` | TypeScript CLI package and published npm package source |
| `packages/cli/src/context` | Core context engine |
| `packages/cli/src/llm` | LLM and embedding providers |
| `packages/cli/src/connectors` | Database scan connectors |
| `python/ktx-sl` | Semantic-layer query planning |
| `python/ktx-daemon` | Portable compute service |

## Development

```bash
git clone https://github.com/Kaelio/ktx.git
cd ktx
pnpm install
uv sync --all-groups
pnpm run build
pnpm run check
```

Use the development CLI locally:

```bash
pnpm run setup:dev
pnpm run link:dev
ktx-dev --help
```

**ktx** is a pnpm + uv workspace:

- TypeScript packages live in `packages/*`
- CLI source lives in `packages/cli`
- Python runtime source lives in `python/ktx-sl` and `python/ktx-daemon`
- Public docs live in `docs-site/content/docs`

Useful checks:

```bash
pnpm run type-check
pnpm run test
pnpm run dead-code
uv run pytest -q
```

## Docs

- [Quickstart](docs-site/content/docs/getting-started/quickstart.mdx)
- [CLI Reference](docs-site/content/docs/cli-reference/ktx.mdx)
- [Building Context](docs-site/content/docs/guides/building-context.mdx)
- [Community & Support](docs-site/content/docs/community/support.mdx)
- [Contributing](docs-site/content/docs/community/contributing.mdx)

## Community

- **[Slack](https://join.slack.com/t/ktxcommunity/shared_invite/zt-3y9b44m1x-LVyNNJD5nwaZHq4XS29LMQ)**: ask questions, share what you're building, and chat with maintainers and other users.
- **[GitHub Issues](https://github.com/Kaelio/ktx/issues)**: report bugs and request features.
- **[Contributing guide](docs-site/content/docs/community/contributing.mdx)**: set up the repo, run tests, and open a PR.

See [Community & Support](docs-site/content/docs/community/support.mdx) for the
full guide on where to ask what.

## License

**ktx** is licensed under the Apache License, Version 2.0. See `LICENSE`.