chore(workspace): gate dead-code with knip production mode (#196)

* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm

* refactor(workspace): rewrite @ktx/llm imports to relative paths

* refactor(workspace): fold internal packages into cli

* chore(workspace): gate dead-code with knip production mode

Turn on production-mode knip plus an autofix run in pre-commit and the
`pnpm dead-code` script, document the `/** @internal */` convention for
test-only exports in AGENTS.md, annotate test-only exports across the
CLI with that JSDoc, and drop dead exports/wrappers the new gate
surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`,
`createLocalScanEnrichmentProvidersFromConfig`,
`PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports).
Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit
production entries so cross-package barrel leaks are caught.

* refactor(cli): delete internal barrel index.ts files

The 34 `index.ts` re-export barrels inside `packages/cli/src/` were
holdovers from the pre-fold multi-workspace structure. Post-fold-in they
served no production purpose: external consumers go through the single
package main entry, and in-repo callers mostly imported through them
only because the path was short. Internally, knip flagged most barrel
re-exports as production-dead (only reached via tests).

This change:
- Deletes every internal barrel except `packages/cli/src/index.ts`
  (the published package entry).
- Rewrites ~270 source/test files to import each name directly from
  the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to
  `create-warehouse-verification-tools.ts` (the function it defined
  locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match
  the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*,
  live-database/extracted-schema, live-database/structural-sync,
  relationship-* feedback/review chain) plus their tests and a
  cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths
  (notion-client, connector barrels in scan/local-scan-connectors
  tests) to mock the source files instead.
- Points the maintainer benchmark script
  (`scripts/relationship-benchmark-report.mjs`) at source files
  instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit
  production entries only for the benchmark code reached via dist by
  the maintainer script.

Net: 413 files changed, ~1.2k insertions, ~9.4k deletions.

`pnpm run dead-code` (Biome + knip default + knip production) and
`pnpm run type-check` are clean; 2277 tests pass.

* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly

Promote the CLI workspace package to the public name `@kaelio/ktx` and
drop the separate `scripts/build-public-npm-package.mjs` wrapper. The
CLI package is now publishable in place (`publishConfig.access: public`,
`provenance: true`), so artifact packing uses `pnpm pack` against
`packages/cli/` instead of assembling a parallel package tree.

Updates all workspace filter invocations, docs, tests, and release
readiness checks to reference the new package name, and folds the
tarball-name helper into `scripts/public-npm-release-metadata.mjs`.

* docs: align "agent clients" and "data agents" terminology

Replace "client agents" with "agent clients" and "database agents" with
"data agents" across AGENTS.md, README.md, the docs-site copy, and the
matching setup-agents test description, matching the canonical
vocabulary in docs/terminology.md.

Also moves packages/cli/tsconfig.json's tsBuildInfoFile from
node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive
node_modules reinstalls.

* refactor(release): single source of truth for package version

Make packages/cli/package.json the single source of truth for the
@kaelio/ktx version. publicNpmPackageVersion() now reads it directly,
so artifact filenames, release-readiness checks, and the Python wheel
version all derive from one field. The duplicate
release-policy.json.publicNpmPackageVersion is removed.

Previously the two fields could drift: tarballs were named
kaelio-ktx-0.4.1.tgz while internally containing
@kaelio/ktx@0.0.0-private.

- update-public-release-version.mjs rewrites both Python pyproject.toml
  files (ktx-daemon, ktx-sl) alongside the npm package.jsons,
  normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to
  @semantic-release/git assets so the release commit back to main
  carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are
  replaced with "?? getKtxCliPackageInfo().version", and
  createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree
  always reflects the most recent release; no sentinel pin to
  maintain.
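
The PEP 440 normalization mentioned above can be sketched as follows. This is a minimal illustration, not the code in update-public-release-version.mjs: the helper name `toPep440Version`, the tag table, and the supported pre-release labels are assumptions.

```typescript
// Hypothetical helper; the real script may accept more version shapes.
// PEP 440 spells pre-release phases as "a", "b", and "rc".
const PRERELEASE_TAGS: Record<string, string> = { alpha: "a", beta: "b", rc: "rc" };

export function toPep440Version(npmVersion: string): string {
  // Accept "X.Y.Z" or "X.Y.Z-<alpha|beta|rc>.<n>"; reject anything else.
  const match = /^(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$/.exec(npmVersion);
  if (!match) {
    throw new Error(`Unsupported version format: ${npmVersion}`);
  }
  const [, release, tag, serial] = match;
  // Stable versions pass through unchanged.
  if (tag === undefined) return release;
  // Pre-release segments attach directly to the release number.
  return `${release}${PRERELEASE_TAGS[tag]}${serial}`;
}

console.log(toPep440Version("0.1.0-rc.2")); // 0.1.0rc2
console.log(toPep440Version("0.4.1")); // 0.4.1
```

Throwing on unrecognized input, rather than passing it through, keeps an unexpected npm version from silently producing an invalid wheel version.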

Verified: pnpm run artifacts:build now produces
kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with
@kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and
2287 vitests + 173 script tests pass.

* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime

Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and
scan command entrypoints so tests can stub them, and teach
resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime
feature when ktx.yaml selects sentence-transformers.

* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal

Both symbols are consumed only by status-project.test.ts. Annotating with
/** @internal */ keeps knip's production-mode check clean without changing
runtime behavior.
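
A minimal sketch of the annotation, assuming a simplified shape for both symbols (the real fields and signature are not shown in this message, so everything inside the declarations is hypothetical):

```typescript
/**
 * Hypothetical shape; the real LocalStatsStatus fields may differ.
 * @internal
 */
export interface LocalStatsStatus {
  wikiPages: number;
  embeddedWikiPages: number;
}

/**
 * Exported only so a test file can import it. The tag below is what
 * marks the export as test-only for the production-mode check.
 * @internal
 */
export function buildLocalStatsStatus(
  wikiPages: number,
  embeddedWikiPages: number,
): LocalStatsStatus {
  return { wikiPages, embeddedWikiPages };
}
```

The tag is a JSDoc comment only, so the emitted JavaScript is the same with or without it.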

* fix(cli): use real package metadata in print-command-tree

The stubbed package name embedded a forbidden product identifier that
tripped the boundary check in CI. Read the metadata from package.json
instead, which keeps the rendered tree unchanged and removes a duplicate
source of truth.

* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts

Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer
source counts, computed with `SUM(embedding_json IS NOT NULL)` over
`knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to
"Wiki" (canonical per `docs/terminology.md`) and rename the matching
`localStats.knowledgePages` field to `localStats.wikiPages`.

Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row, since those
duplicated the per-surface rows above. Disk now reports only actual byte
usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` /
`semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry`
helpers, and the `filter` arg on `summarizeDir` are removed.
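
The coverage count can be illustrated with an in-memory equivalent of `SUM(embedding_json IS NOT NULL)`. This is a sketch under assumptions: the `Row` type, both function names, and the exact label format are hypothetical, and only the `embedding_json` column name comes from the description above.

```typescript
// Hypothetical row shape; only embedding_json is named in this message.
interface Row {
  embedding_json: string | null;
}

// Counts all rows and the rows that carry an embedding, mirroring
// SELECT COUNT(*), SUM(embedding_json IS NOT NULL) over one table.
export function summarizeEmbeddingCoverage(rows: Row[]): { total: number; embedded: number } {
  const embedded = rows.filter((row) => row.embedding_json !== null).length;
  return { total: rows.length, embedded };
}

// Hypothetical label format; the real `ktx status` rendering may differ.
export function formatCoverage(total: number, embedded: number): string {
  return `${total} (${embedded} embedded)`;
}

const coverage = summarizeEmbeddingCoverage([
  { embedding_json: "[0.1, 0.2]" },
  { embedding_json: null },
  { embedding_json: "[0.3, 0.4]" },
]);
console.log(formatCoverage(coverage.total, coverage.embedded)); // 3 (2 embedded)
```

One difference from the SQL form: `SUM` over zero rows yields NULL, whereas this sketch returns 0, so a SQL-backed version would need to coalesce that case.
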
Andrey Avtomonov 2026-05-21 15:28:58 +02:00 committed by GitHub
parent a1cfb03d73
commit 2366b00301
1002 changed files with 2286 additions and 12051 deletions


@@ -22,8 +22,8 @@ documentation, connector coverage, and examples.
| Area | Good first context |
|------|--------------------|
| CLI and setup | `packages/cli`, especially setup steps, command definitions, status checks, and smoke tests |
| Context engine | `packages/context`, including project config, ingest orchestration, and semantic search |
| Connectors | `packages/connector-*`, plus connector-specific tests and integration docs |
| Context engine | `packages/cli/src/context`, including project config, ingest orchestration, and semantic search |
| Connectors | `packages/cli/src/connectors/<driver>`, plus connector-specific tests and integration docs |
| Python semantic layer | `python/ktx-sl` for planning and SQL compilation |
| **ktx** daemon | `python/ktx-daemon` for the portable runtime API |
| Documentation | `docs-site/content/docs` for public docs and `docs-site/tests` for docs behavior |
@@ -50,7 +50,7 @@ pnpm install
uv sync --all-groups
```
`pnpm install` sets up all TypeScript packages in the workspace.
`pnpm install` sets up the TypeScript workspace.
`uv sync --all-groups` installs Python dependencies for the semantic layer and
daemon, including dev and test groups.
@@ -60,11 +60,10 @@ daemon, including dev and test groups.
pnpm run build
```
This builds all TypeScript packages. You can also build individual packages:
This builds the TypeScript package. You can also build the package directly:
```bash
pnpm --filter @ktx/cli run build
pnpm --filter @ktx/context run build
pnpm --filter @kaelio/ktx run build
```
### Link the CLI for local testing
@@ -80,21 +79,15 @@ changes.
## Repository structure
**ktx** is a pnpm + uv workspace. TypeScript packages live in `packages/`, Python
projects in `python/`.
**ktx** is a pnpm + uv workspace. TypeScript source lives in `packages/cli`,
and Python projects live in `python/`.
```text
packages/
cli/ # CLI entry point and commands
context/ # Core context engine (scan, ingest, MCP, semantic layer)
llm/ # LLM client abstraction
connector-postgres/ # PostgreSQL connector
connector-snowflake/ # Snowflake connector
connector-bigquery/ # BigQuery connector
connector-mysql/ # MySQL connector
connector-sqlserver/ # SQL Server connector
connector-sqlite/ # SQLite connector
connector-posthog/ # PostHog connector
cli/ # CLI package and published npm package source
src/context/ # Core context engine (scan, ingest, MCP, semantic layer)
src/llm/ # LLM client abstraction
src/connectors/ # Database connectors
python/
ktx-sl/ # Semantic layer - grain-aware query planning and SQL compilation
@@ -105,7 +98,7 @@ scripts/ # Workspace scripts (benchmarks, verification, release)
docs-site/ # Documentation site (Fumadocs)
```
All TypeScript packages are ESM (`"type": "module"`) and use `NodeNext` module
The TypeScript package is ESM (`"type": "module"`) and uses `NodeNext` module
resolution. The Python projects use `pyproject.toml` for dependency management.
## Running tests
@@ -116,18 +109,17 @@ resolution. The Python projects use `pyproject.toml` for dependency management.
# Run all tests
pnpm run test
# Run tests for a specific package
pnpm --filter @ktx/cli run test
pnpm --filter @ktx/context run test
# Run tests for the TypeScript package
pnpm --filter @kaelio/ktx run test
# Type-check all packages
pnpm run type-check
# Type-check a specific package
pnpm --filter @ktx/context run type-check
# Type-check the TypeScript package
pnpm --filter @kaelio/ktx run type-check
# CLI smoke test
pnpm --filter @ktx/cli run smoke
pnpm --filter @kaelio/ktx run smoke
```
### Python
@@ -164,43 +156,22 @@ uv run pytest -q
## Adding a connector
Database connectors live in `packages/connector-<name>/`. Each connector
implements the `KtxScanConnector` interface from `@ktx/context`.
Database connectors live in `packages/cli/src/connectors/<driver>/`. Each
connector implements the `KtxScanConnector` interface from the internal context
modules.
### Step 1: Scaffold the package
### Step 1: Scaffold the connector
Create a new directory at `packages/connector-<name>/` with:
Create a new directory at `packages/cli/src/connectors/<driver>/` with:
```text
packages/connector-<name>/
package.json
tsconfig.json
src/
index.ts # Public exports
connector.ts # KtxScanConnector implementation
dialect.ts # SQL dialect handling
packages/cli/src/connectors/<driver>/
index.ts # Internal connector exports
connector.ts # KtxScanConnector implementation
dialect.ts # SQL dialect handling
```
The `package.json` should follow the pattern of existing connectors:
```json
{
"name": "@ktx/connector-<name>",
"private": true,
"type": "module",
"main": "dist/index.js",
"types": "dist/index.d.ts",
"exports": {
".": {
"types": "./dist/index.d.ts",
"import": "./dist/index.js"
}
},
"dependencies": {
"@ktx/context": "workspace:*"
}
}
```
Add any connector-specific npm dependency to `packages/cli/package.json`.
### Step 2: Implement the connector
@@ -226,20 +197,20 @@ and statistics.
### Step 4: Wire it up
Register the new connector in `packages/context` so the CLI and scan
engine can instantiate it. Look at how existing connectors are registered for
the pattern.
Register the new connector in `packages/cli/src/local-scan-connectors.ts` and
`packages/cli/src/local-adapters.ts` so the CLI and scan engine can instantiate
it. Keep runtime loading dynamic when the connector is optional.
### Step 5: Test
```bash
pnpm --filter @ktx/connector-<name> run build
pnpm --filter @ktx/connector-<name> run type-check
pnpm --filter @ktx/connector-<name> run test
pnpm --filter @kaelio/ktx run build
pnpm --filter @kaelio/ktx run type-check
pnpm --filter @kaelio/ktx run test
```
Use `packages/connector-sqlite/` as a minimal reference and
`packages/connector-postgres/` as a full-featured one.
Use `packages/cli/src/connectors/sqlite/` as a minimal reference and
`packages/cli/src/connectors/postgres/` as a full-featured one.
## Code conventions


@@ -6,9 +6,9 @@ description: Set up ktx with Claude Code, Claude Desktop, Cursor, Codex, and Ope
**ktx** exposes context to end-user agents through MCP tools. The CLI remains the
admin surface for setup, ingest, status, daemon lifecycle, and debugging.
Run `ktx setup` and select your client agent targets, or configure manually
using the snippets below. Choose **Ask data questions with ktx MCP** for client
agents. Choose **Ask data questions + manage ktx with CLI commands** only when
Run `ktx setup` and select your agent client targets, or configure manually
using the snippets below. Choose **Ask data questions with ktx MCP** for agent
clients. Choose **Ask data questions + manage ktx with CLI commands** only when
a developer or operator agent also needs pinned `ktx` admin commands.
## Install with setup