chore(workspace): gate dead-code with knip production mode (#196)

* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm

* refactor(workspace): rewrite @ktx/llm imports to relative paths

* refactor(workspace): fold internal packages into cli

* chore(workspace): gate dead-code with knip production mode

Turn on production-mode knip plus an autofix run in pre-commit and the
`pnpm dead-code` script, document the `/** @internal */` convention for
test-only exports in AGENTS.md, annotate test-only exports across the
CLI with that JSDoc, and drop dead exports/wrappers the new gate
surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`,
`createLocalScanEnrichmentProvidersFromConfig`,
`PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports).
Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit
production entries so cross-package barrel leaks are caught.
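
As an illustration of the `/** @internal */` convention described above, here is a minimal sketch. The module and all names in it are hypothetical. It assumes knip's production mode skips exports tagged `@internal`, which is the behavior this commit relies on.

```typescript
// Hypothetical module, used only to illustrate the convention.

/** Shape returned to production callers. */
export interface StatsSummary {
  pageCount: number;
}

/**
 * Exported only so tests can call it directly; production code reaches it
 * through renderStats below. The tag keeps the production-mode check quiet.
 * @internal
 */
export function buildStatsSummary(pages: readonly string[]): StatsSummary {
  return { pageCount: pages.length };
}

/** Production entry point: the only export other runtime code imports. */
export function renderStats(pages: readonly string[]): string {
  const summary = buildStatsSummary(pages);
  return `pages=${summary.pageCount}`;
}
```

Without the tag, an export like `buildStatsSummary` would presumably be reported as unused in production mode, because only test files import it.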

* refactor(cli): delete internal barrel index.ts files

The 34 `index.ts` re-export barrels inside `packages/cli/src/` were
holdovers from the pre-fold multi-workspace structure. Post-fold-in they
served no production purpose: external consumers go through the single
package main entry, and in-repo callers mostly imported through them
only because the path was short. Internally, knip flagged most barrel
re-exports as production-dead (only reached via tests).

This change:
- Deletes every internal barrel except `packages/cli/src/index.ts`
  (the published package entry).
- Rewrites ~270 source/test files to import each name directly from
  the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to
  `create-warehouse-verification-tools.ts` (the function it defined
  locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match
  the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*,
  live-database/extracted-schema, live-database/structural-sync,
  relationship-* feedback/review chain) plus their tests and a
  cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths
  (notion-client, connector barrels in scan/local-scan-connectors
  tests) to mock the source files instead.
- Points the maintainer benchmark script
  (`scripts/relationship-benchmark-report.mjs`) at source files
  instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit
  production entries only for the benchmark code reached via dist by
  the maintainer script.

Net: 413 files changed, ~1.2k insertions, ~9.4k deletions.

`pnpm run dead-code` (Biome + knip default + knip production) and
`pnpm run type-check` are clean; 2277 tests pass.

* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly

Promote the CLI workspace package to the public name `@kaelio/ktx` and
drop the separate `scripts/build-public-npm-package.mjs` wrapper. The
CLI package is now publishable in place (`publishConfig.access: public`,
`provenance: true`), so artifact packing uses `pnpm pack` against
`packages/cli/` instead of assembling a parallel package tree.

Updates all workspace filter invocations, docs, tests, and release
readiness checks to reference the new package name, and folds the
tarball-name helper into `scripts/public-npm-release-metadata.mjs`.

* docs: align "agent clients" and "data agents" terminology

Replace "client agents" with "agent clients" and "database agents" with
"data agents" across AGENTS.md, README.md, the docs-site copy, and the
matching setup-agents test description, matching the canonical
vocabulary in docs/terminology.md.

Also moves packages/cli/tsconfig.json's tsBuildInfoFile from
node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive
node_modules reinstalls.

* refactor(release): single source of truth for package version

Make packages/cli/package.json the single source of truth for the
@kaelio/ktx version. publicNpmPackageVersion() now reads it directly,
so artifact filenames, release-readiness checks, and the Python wheel
version all derive from one field. The duplicate
release-policy.json.publicNpmPackageVersion is removed.

Previously the two fields could drift: tarballs were named
kaelio-ktx-0.4.1.tgz while internally containing
@kaelio/ktx@0.0.0-private.
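
A rough sketch of why a single version field removes the drift: if the tarball name is derived from the same `name` and `version` that end up inside the package, the two cannot disagree. `tarballFileName` is a hypothetical helper, not the one in `scripts/public-npm-release-metadata.mjs`. It follows the usual `npm pack` naming, which drops the leading `@` of a scope and turns `/` into `-`.

```typescript
interface PackageManifest {
  name: string;
  version: string;
}

// Hypothetical helper: derives the packed tarball name from the manifest alone.
function tarballFileName(manifest: PackageManifest): string {
  // "@kaelio/ktx" -> "kaelio-ktx"
  const flatName = manifest.name.replace(/^@/, '').replace(/\//g, '-');
  return `${flatName}-${manifest.version}.tgz`;
}

// Stand-in for the contents of packages/cli/package.json.
const manifest: PackageManifest = { name: '@kaelio/ktx', version: '0.4.1' };
const artifact = tarballFileName(manifest);
```

With the manifest above, `artifact` is `kaelio-ktx-0.4.1.tgz`, matching the filename quoted in this commit.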

- update-public-release-version.mjs rewrites both Python pyproject.toml
  files (ktx-daemon, ktx-sl) alongside the npm package.jsons,
  normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to
  @semantic-release/git assets so the release commit back to main
  carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are
  replaced with "?? getKtxCliPackageInfo().version", and
  createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree
  always reflects the most recent release; no sentinel pin to
  maintain.
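
The PEP 440 normalization mentioned in the first bullet can be sketched as follows. This is not the code in `update-public-release-version.mjs`. `toPep440` and its tag table are assumptions, and it only handles `X.Y.Z` with an optional `-tag.N` pre-release.

```typescript
// Assumed mapping from semver pre-release tags to PEP 440 segment names.
const PRERELEASE_TAGS: Record<string, string> = { alpha: 'a', beta: 'b', rc: 'rc' };

// Hypothetical helper: converts e.g. "0.1.0-rc.2" to "0.1.0rc2".
function toPep440(version: string): string {
  const match = /^(\d+\.\d+\.\d+)(?:-([a-z]+)\.(\d+))?$/.exec(version);
  if (match === null) {
    throw new Error(`Unsupported version: ${version}`);
  }
  const release = match[1] as string;
  const tag = match[2] as string | undefined;
  const serial = match[3] as string | undefined;
  if (tag === undefined || serial === undefined) {
    return release; // plain release, already valid PEP 440
  }
  const pepTag = PRERELEASE_TAGS[tag];
  if (pepTag === undefined) {
    throw new Error(`Unsupported pre-release tag: ${tag}`);
  }
  return `${release}${pepTag}${serial}`;
}
```

Under these assumptions, `toPep440('0.1.0-rc.2')` returns `0.1.0rc2` and `toPep440('0.4.1')` returns `0.4.1` unchanged.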

Verified: pnpm run artifacts:build now produces
kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with
@kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and
2287 vitests + 173 script tests pass.

* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime

Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and
scan command entrypoints so tests can stub them, and teach
resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime
feature when ktx.yaml selects sentence-transformers.
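
The injection pattern can be sketched roughly like this. Every name and signature below is a hypothetical stand-in rather than the CLI's real API, and the resolver is synchronous here only to keep the sketch short.

```typescript
// Hypothetical shapes; the real CLI types may differ.
interface EmbeddingProvider {
  readonly name: string;
}

interface IngestDeps {
  resolveProjectEmbeddingProvider: (projectDir: string) => EmbeddingProvider | null;
  runtimeIo: { writeLine: (line: string) => void };
}

const defaultIngestDeps: IngestDeps = {
  // Placeholder default; a real resolver would read the project config.
  resolveProjectEmbeddingProvider: () => null,
  runtimeIo: { writeLine: (line) => console.log(line) },
};

// Command entrypoint: callers may override any dependency, tests usually override all.
function runIngestCommand(projectDir: string, overrides: Partial<IngestDeps> = {}): number {
  const deps: IngestDeps = { ...defaultIngestDeps, ...overrides };
  const provider = deps.resolveProjectEmbeddingProvider(projectDir);
  deps.runtimeIo.writeLine(
    provider ? `embedding provider: ${provider.name}` : 'embeddings disabled',
  );
  return 0; // exit code
}
```

A test can then pass a stub resolver and an in-memory `runtimeIo` and assert on the captured lines without touching a real project directory or stdout.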

* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal

Both symbols are consumed only by status-project.test.ts. Annotating with
/** @internal */ keeps knip's production-mode check clean without changing
runtime behavior.

* fix(cli): use real package metadata in print-command-tree

The stubbed package name embedded a forbidden product identifier that
tripped the boundary check in CI. Read the metadata from package.json
instead, which keeps the rendered tree unchanged and removes a duplicate
source of truth.

* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts

Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer
source counts, computed with
`SUM(CASE WHEN embedding_json IS NOT NULL THEN 1 ELSE 0 END)` over
`knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to
"Wiki" (canonical per `docs/terminology.md`) and rename the matching
`localStats.knowledgePages` field to `localStats.wikiPages`.
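
A simplified, standalone sketch of how the per-scope rows turn into the Wiki line. It mirrors the row shape and the `?? 0` coalescing from the diff but leaves out the `dim(...)` styling, and the function names are hypothetical.

```typescript
// Row shape as returned by the grouped query (embedded may be null).
interface WikiRow {
  scope: string;
  n: number;
  embedded: number | null;
}

interface WikiEntry {
  scope: string;
  count: number;
  embeddedCount: number;
}

// Hypothetical helper: maps raw rows to entries, treating a null sum as zero.
function toWikiEntries(rows: readonly WikiRow[]): WikiEntry[] {
  return rows.map((row) => ({
    scope: row.scope,
    count: row.n,
    embeddedCount: row.embedded ?? 0,
  }));
}

// Hypothetical helper: renders the text that follows the "Wiki" label.
function formatWikiText(entries: readonly WikiEntry[]): string {
  if (entries.length === 0) return 'no pages yet';
  return entries
    .map((entry) => `${entry.scope}=${entry.count} (${entry.embeddedCount} embedded)`)
    .join(' · ');
}
```

For rows `global` (3 pages, 2 embedded) and `team` (1 page, null sum), `formatWikiText` yields `global=3 (2 embedded) · team=1 (0 embedded)`.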

Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row, since they
duplicated the per-surface rows above. Disk now reports only actual byte
usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` /
`semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry`
helpers, and the `filter` arg on `summarizeDir` are removed.

Andrey Avtomonov 2026-05-21 15:28:58 +02:00 committed by GitHub
parent a1cfb03d73
commit 2366b00301
1002 changed files with 2286 additions and 12051 deletions

@@ -1,17 +1,10 @@
-import type { Dirent } from 'node:fs';
 import { stat as statAsync, readdir as readdirAsync } from 'node:fs/promises';
 import { basename, join } from 'node:path';
-import { runClaudeCodeAuthProbe } from '@ktx/context';
-import type {
-  KtxConfigIssue,
-  KtxLocalProject,
-  KtxProjectConfig,
-  KtxProjectConnectionConfig,
-  KtxProjectEmbeddingConfig,
-  KtxProjectLlmConfig,
-} from '@ktx/context/project';
-import { ktxLocalStateDbPath } from '@ktx/context/project';
-import type { PostgresPgssProbeResult } from '@ktx/context/ingest';
+import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import type { KtxConfigIssue, KtxProjectConfig, KtxProjectConnectionConfig, KtxProjectEmbeddingConfig, KtxProjectLlmConfig } from './context/project/config.js';
+import type { KtxLocalProject } from './context/project/project.js';
+import { ktxLocalStateDbPath } from './context/project/local-state-db.js';
+import type { PostgresPgssProbeResult } from './context/ingest/adapters/historic-sql/types.js';
 import {
   formatClaudeCodePromptCachingFix,
   formatClaudeCodePromptCachingWarning,
@@ -111,28 +104,29 @@ interface LocalStatsIngestPerConnection {
 interface LocalStatsSemanticLayerEntry {
   connectionId: string;
   sourceCount: number;
+  embeddedSourceCount: number;
   dictionaryValueCount: number;
 }
-interface LocalStatsKnowledgeEntry {
+interface LocalStatsWikiEntry {
   scope: string;
   count: number;
+  embeddedCount: number;
 }
 interface LocalStatsProjectDir {
   dbSqliteBytes: number | null;
   ktxCacheBytes: number;
   rawSources: { fileCount: number; bytes: number };
-  wikiGlobalMarkdownCount: number;
-  semanticLayerYamlCount: number;
 }
+/** @internal */
 export interface LocalStatsStatus {
   ingest: {
     totalCompletedRuns: number;
     perConnection: LocalStatsIngestPerConnection[];
   };
-  knowledgePages: LocalStatsKnowledgeEntry[];
+  wikiPages: LocalStatsWikiEntry[];
   semanticLayer: LocalStatsSemanticLayerEntry[];
   projectDir: LocalStatsProjectDir;
   unavailable?: string;
@@ -470,8 +464,12 @@ function readinessDetail(result: PostgresPgssProbeResult): string {
 async function defaultPostgresQueryHistoryProbe(
   input: PostgresQueryHistoryProbeInput,
 ): Promise<PostgresPgssProbeResult> {
-  const [{ PostgresPgssReader }, { KtxPostgresHistoricSqlQueryClient, isKtxPostgresConnectionConfig }] =
-    await Promise.all([import('@ktx/context/ingest'), import('@ktx/connector-postgres')]);
+  const [{ PostgresPgssReader }, { KtxPostgresHistoricSqlQueryClient }, { isKtxPostgresConnectionConfig }] =
+    await Promise.all([
+      import('./context/ingest/adapters/historic-sql/postgres-pgss-reader.js'),
+      import('./connectors/postgres/historic-sql-query-client.js'),
+      import('./connectors/postgres/connector.js'),
+    ]);
   const inputDriver = input.connection.driver ?? 'unknown';
   if (!isKtxPostgresConnectionConfig(input.connection)) {
@@ -775,16 +773,12 @@ interface DirSummary {
   bytes: number;
 }
-async function summarizeDir(
-  dir: string,
-  filter?: (entry: Dirent, fullPath: string) => boolean,
-  maxDepth = 10,
-): Promise<DirSummary> {
+async function summarizeDir(dir: string, maxDepth = 10): Promise<DirSummary> {
   let fileCount = 0;
   let bytes = 0;
   const walk = async (current: string, depth: number): Promise<void> => {
     if (depth > maxDepth) return;
-    let entries: Dirent[];
+    let entries;
     try {
       entries = await readdirAsync(current, { withFileTypes: true });
     } catch {
@@ -797,7 +791,6 @@ async function summarizeDir(
         continue;
       }
       if (!entry.isFile()) continue;
-      if (filter && !filter(entry, full)) continue;
       try {
         const s = await statAsync(full);
         fileCount += 1;
@@ -811,14 +804,6 @@ async function summarizeDir(
   return { fileCount, bytes };
 }
-function isMarkdownEntry(entry: Dirent): boolean {
-  return entry.isFile() && /\.mdx?$/i.test(entry.name);
-}
-function isYamlEntry(entry: Dirent): boolean {
-  return entry.isFile() && /\.ya?ml$/i.test(entry.name);
-}
 async function fileSizeOrNull(filePath: string): Promise<number | null> {
   try {
     const s = await statAsync(filePath);
@@ -836,6 +821,7 @@ function tryQuery<T>(run: () => T, fallback: T): T {
   }
 }
+/** @internal */
 export async function buildLocalStatsStatus(project: KtxLocalProject): Promise<LocalStatsStatus> {
   const dbPath = ktxLocalStateDbPath(project);
   const dbSqliteBytes = await fileSizeOrNull(dbPath);
@@ -844,18 +830,12 @@ export async function buildLocalStatsStatus(project: KtxLocalProject): Promise<L
     dbSqliteBytes,
     ktxCacheBytes: (await summarizeDir(join(project.projectDir, '.ktx', 'cache'))).bytes,
     rawSources: await summarizeDir(join(project.projectDir, 'raw-sources')),
-    wikiGlobalMarkdownCount: (
-      await summarizeDir(join(project.projectDir, 'wiki', 'global'), isMarkdownEntry)
-    ).fileCount,
-    semanticLayerYamlCount: (
-      await summarizeDir(join(project.projectDir, 'semantic-layer'), isYamlEntry)
-    ).fileCount,
   };
   if (dbSqliteBytes === null) {
     return {
       ingest: { totalCompletedRuns: 0, perConnection: [] },
-      knowledgePages: [],
+      wikiPages: [],
       semanticLayer: [],
       projectDir: projectDirSummary,
       unavailable: 'no .ktx/db.sqlite yet',
@@ -905,28 +885,34 @@ export async function buildLocalStatsStatus(project: KtxLocalProject): Promise<L
       left.connectionId.localeCompare(right.connectionId),
     );
-    const knowledgeRows = tryQuery(
+    const wikiRows = tryQuery(
       () =>
         db
           .prepare(
-            `SELECT scope, COUNT(*) AS n FROM knowledge_pages GROUP BY scope ORDER BY scope`,
+            `SELECT scope, COUNT(*) AS n, SUM(CASE WHEN embedding_json IS NOT NULL THEN 1 ELSE 0 END) AS embedded
+             FROM knowledge_pages
+             GROUP BY scope
+             ORDER BY scope`,
           )
-          .all() as Array<{ scope: string; n: number }>,
-      [] as Array<{ scope: string; n: number }>,
+          .all() as Array<{ scope: string; n: number; embedded: number | null }>,
+      [] as Array<{ scope: string; n: number; embedded: number | null }>,
     );
-    const knowledgePages: LocalStatsKnowledgeEntry[] = knowledgeRows.map((row) => ({
+    const wikiPages: LocalStatsWikiEntry[] = wikiRows.map((row) => ({
       scope: row.scope,
       count: row.n,
+      embeddedCount: row.embedded ?? 0,
     }));
     const sourceRows = tryQuery(
       () =>
         db
           .prepare(
-            `SELECT connection_id, COUNT(*) AS n FROM local_sl_sources GROUP BY connection_id`,
+            `SELECT connection_id, COUNT(*) AS n, SUM(CASE WHEN embedding_json IS NOT NULL THEN 1 ELSE 0 END) AS embedded
+             FROM local_sl_sources
+             GROUP BY connection_id`,
           )
-          .all() as Array<{ connection_id: string; n: number }>,
-      [] as Array<{ connection_id: string; n: number }>,
+          .all() as Array<{ connection_id: string; n: number; embedded: number | null }>,
+      [] as Array<{ connection_id: string; n: number; embedded: number | null }>,
     );
     const dictionaryRows = tryQuery(
       () =>
@@ -942,6 +928,7 @@ export async function buildLocalStatsStatus(project: KtxLocalProject): Promise<L
       slMap.set(row.connection_id, {
         connectionId: row.connection_id,
         sourceCount: row.n,
+        embeddedSourceCount: row.embedded ?? 0,
         dictionaryValueCount: 0,
       });
     }
@@ -949,6 +936,7 @@ export async function buildLocalStatsStatus(project: KtxLocalProject): Promise<L
       const existing = slMap.get(row.connection_id) ?? {
         connectionId: row.connection_id,
         sourceCount: 0,
+        embeddedSourceCount: 0,
         dictionaryValueCount: 0,
       };
       existing.dictionaryValueCount = row.n;
@@ -960,14 +948,14 @@ export async function buildLocalStatsStatus(project: KtxLocalProject): Promise<L
     return {
       ingest: { totalCompletedRuns, perConnection },
-      knowledgePages,
+      wikiPages,
       semanticLayer,
       projectDir: projectDirSummary,
     };
   } catch (error) {
     return {
       ingest: { totalCompletedRuns: 0, perConnection: [] },
-      knowledgePages: [],
+      wikiPages: [],
       semanticLayer: [],
       projectDir: projectDirSummary,
       unavailable: failureDetail(error),
@@ -1093,7 +1081,6 @@ function formatRelativeFromNow(iso: string): string {
   return iso;
 }
 
-
 function abbreviateHome(filePath: string, env: NodeJS.ProcessEnv): string {
   const home = env.HOME;
   if (home && (filePath === home || filePath.startsWith(`${home}/`))) {
@@ -1117,7 +1104,7 @@ function renderLocalStats(
   const localLabelWidth = Math.max(
     'Ingest'.length,
-    'Knowledge'.length,
+    'Wiki'.length,
     'Semantic layer'.length,
     'Disk'.length,
   );
@@ -1139,13 +1126,13 @@ function renderLocalStats(
     }
   }
-  if (stats.knowledgePages.length === 0) {
-    lines.push(` ${lLabel('Knowledge')} ${dim('no pages yet')}`);
+  if (stats.wikiPages.length === 0) {
+    lines.push(` ${lLabel('Wiki')} ${dim('no pages yet')}`);
   } else {
-    const knowledgeText = stats.knowledgePages
-      .map((entry) => `${entry.scope}=${entry.count}`)
+    const wikiText = stats.wikiPages
+      .map((entry) => `${entry.scope}=${entry.count} ${dim(`(${entry.embeddedCount} embedded)`)}`)
       .join(` ${dim('·')} `);
-    lines.push(` ${lLabel('Knowledge')} ${knowledgeText}`);
+    lines.push(` ${lLabel('Wiki')} ${wikiText}`);
   }
   if (stats.semanticLayer.length === 0) {
@@ -1155,8 +1142,10 @@ function renderLocalStats(
     let firstLine = true;
     for (const entry of stats.semanticLayer) {
       const prefix = firstLine ? lLabel('Semantic layer') : ' '.repeat(localLabelWidth);
+      const sourcesText = `${entry.sourceCount} source${entry.sourceCount === 1 ? '' : 's'} (${entry.embeddedSourceCount} embedded)`;
+      const dictText = `${entry.dictionaryValueCount} dictionary value${entry.dictionaryValueCount === 1 ? '' : 's'}`;
       lines.push(
-        ` ${prefix} ${entry.connectionId.padEnd(nameWidth)} ${dim(`${entry.sourceCount} source${entry.sourceCount === 1 ? '' : 's'} · ${entry.dictionaryValueCount} dictionary value${entry.dictionaryValueCount === 1 ? '' : 's'}`)}`,
+        ` ${prefix} ${entry.connectionId.padEnd(nameWidth)} ${dim(`${sourcesText} · ${dictText}`)}`,
       );
       firstLine = false;
     }
@@ -1169,8 +1158,6 @@ function renderLocalStats(
   diskBits.push(
     `raw-sources=${disk.rawSources.fileCount} file${disk.rawSources.fileCount === 1 ? '' : 's'} (${formatBytes(disk.rawSources.bytes)})`,
   );
-  diskBits.push(`wiki=${disk.wikiGlobalMarkdownCount} md`);
-  diskBits.push(`semantic-layer=${disk.semanticLayerYamlCount} yaml`);
   lines.push(` ${lLabel('Disk')} ${dim(diskBits.join(` ${dim('·')} `))}`);
   lines.push('');
 }