ktx/scripts/local-embeddings-runtime-smoke.test.mjs
Andrey Avtomonov 2366b00301
chore(workspace): gate dead-code with knip production mode (#196)
* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm

* refactor(workspace): rewrite @ktx/llm imports to relative paths

* refactor(workspace): fold internal packages into cli

* chore(workspace): gate dead-code with knip production mode

Turn on production-mode knip plus an autofix run in pre-commit and the
`pnpm dead-code` script, document the `/** @internal */` convention for
test-only exports in AGENTS.md, annotate test-only exports across the
CLI with that JSDoc, and drop dead exports/wrappers the new gate
surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`,
`createLocalScanEnrichmentProvidersFromConfig`,
`PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports).
Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit
production entries so cross-package barrel leaks are caught.

* refactor(cli): delete internal barrel index.ts files

The 34 `index.ts` re-export barrels inside `packages/cli/src/` were
holdovers from the pre-fold multi-workspace structure. Post-fold-in they
served no production purpose: external consumers go through the single
package main entry, and in-repo callers mostly imported through them
only because the path was short. Internally, knip flagged most barrel
re-exports as production-dead (only reached via tests).

This change:
- Deletes every internal barrel except `packages/cli/src/index.ts`
  (the published package entry).
- Rewrites ~270 source/test files to import each name directly from
  the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to
  `create-warehouse-verification-tools.ts` (the function it defined
  locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match
  the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*,
  live-database/extracted-schema, live-database/structural-sync,
  relationship-* feedback/review chain) plus their tests and a
  cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths
  (notion-client, connector barrels in scan/local-scan-connectors
  tests) to mock the source files instead.
- Points the maintainer benchmark script
  (`scripts/relationship-benchmark-report.mjs`) at source files
  instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit
  production entries only for the benchmark code reached via dist by
  the maintainer script.

Net: 413 files changed, ~1.2k insertions, ~9.4k deletions.

`pnpm run dead-code` (Biome + knip default + knip production) and
`pnpm run type-check` are clean; 2277 tests pass.

* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly

Promote the CLI workspace package to the public name `@kaelio/ktx` and
drop the separate `scripts/build-public-npm-package.mjs` wrapper. The
CLI package is now publishable in place (`publishConfig.access: public`,
`provenance: true`), so artifact packing uses `pnpm pack` against
`packages/cli/` instead of assembling a parallel package tree.

Updates all workspace filter invocations, docs, tests, and release
readiness checks to reference the new package name, and folds the
tarball-name helper into `scripts/public-npm-release-metadata.mjs`.
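
For illustration, the tarball name that `pnpm pack` produces for a scoped package can be derived from the package name and version. The sketch below is an assumption about what such a helper might look like; `packedTarballName` is a hypothetical name and is not taken from `scripts/public-npm-release-metadata.mjs`.

```javascript
// Hypothetical helper: follows the npm/pnpm convention for naming packed
// tarballs of scoped packages. Not the repository's actual implementation.
function packedTarballName(packageName, version) {
  // "@kaelio/ktx" -> "kaelio-ktx": drop the leading "@" and replace "/" with "-".
  const flattened = packageName.replace(/^@/, '').replace(/\//g, '-');
  return `${flattened}-${version}.tgz`;
}

console.log(packedTarballName('@kaelio/ktx', '0.4.1')); // kaelio-ktx-0.4.1.tgz
```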

* docs: align "agent clients" and "data agents" terminology

Replace "client agents" with "agent clients" and "database agents" with
"data agents" across AGENTS.md, README.md, the docs-site copy, and the
matching setup-agents test description, matching the canonical
vocabulary in docs/terminology.md.

Also moves packages/cli/tsconfig.json's tsBuildInfoFile from
node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive
node_modules reinstalls.

* refactor(release): single source of truth for package version

Make packages/cli/package.json the single source of truth for the
@kaelio/ktx version. publicNpmPackageVersion() now reads it directly,
so artifact filenames, release-readiness checks, and the Python wheel
version all derive from one field. The duplicate
release-policy.json.publicNpmPackageVersion is removed.

Previously the two fields could drift: tarballs were named
kaelio-ktx-0.4.1.tgz while internally containing
@kaelio/ktx@0.0.0-private.

- update-public-release-version.mjs rewrites both Python pyproject.toml
  files (ktx-daemon, ktx-sl) alongside the npm package.jsons,
  normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to
  @semantic-release/git assets so the release commit back to main
  carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are
  replaced with "?? getKtxCliPackageInfo().version", and
  createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree
  always reflects the most recent release; no sentinel pin to
  maintain.
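
The PEP 440 normalization mentioned above (e.g. 0.1.0-rc.2 -> 0.1.0rc2) could be sketched roughly as follows. This is a minimal illustration, assuming only `alpha`, `beta`, and `rc` pre-release tags with a numeric suffix; `toPep440Version` is a hypothetical name, and the real logic in update-public-release-version.mjs may differ.

```javascript
// Hypothetical helper: maps a semver-style version to its PEP 440 spelling.
// Assumes pre-releases look like "-alpha.N", "-beta.N", or "-rc.N".
const PRE_RELEASE_TAGS = { alpha: 'a', beta: 'b', rc: 'rc' };

function toPep440Version(semver) {
  const match = /^(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$/.exec(semver);
  if (!match) {
    throw new Error(`Unsupported version for PEP 440 normalization: ${semver}`);
  }
  const [, release, tag, number] = match;
  // Final releases pass through unchanged; pre-releases lose the "-" and ".".
  return tag ? `${release}${PRE_RELEASE_TAGS[tag]}${number}` : release;
}

console.log(toPep440Version('0.1.0-rc.2')); // 0.1.0rc2
```

Rejecting unrecognized shapes instead of passing them through keeps an unexpected version string from silently reaching pyproject.toml.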

Verified: pnpm run artifacts:build now produces
kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with
@kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and
2287 vitests + 173 script tests pass.

* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime

Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and
scan command entrypoints so tests can stub them, and teach
resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime
feature when ktx.yaml selects sentence-transformers.

* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal

Both symbols are consumed only by status-project.test.ts. Annotating with
/** @internal */ keeps knip's production-mode check clean without changing
runtime behavior.

* fix(cli): use real package metadata in print-command-tree

The stubbed package name embedded a forbidden product identifier that
tripped the boundary check in CI. Read the metadata from package.json
instead, which keeps the rendered tree unchanged and removes a duplicate
source of truth.

* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts

Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer
source counts, computed with `SUM(embedding_json IS NOT NULL)` over
`knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to
"Wiki" (canonical per `docs/terminology.md`) and rename the matching
`localStats.knowledgePages` field to `localStats.wikiPages`.
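
As a rough sketch of the coverage count described above: only the `SUM(embedding_json IS NOT NULL)` expression and the table name come from this commit message. The surrounding query text, the `total` and `embedded` column aliases, and the `formatCoverage` helper are assumptions made for illustration, not the actual `ktx status` code.

```javascript
// Hypothetical query text; only the SUM(...) expression and the table name
// are taken from the commit message.
const WIKI_EMBEDDING_COVERAGE_SQL = `
  SELECT COUNT(*) AS total,
         SUM(embedding_json IS NOT NULL) AS embedded
  FROM knowledge_pages
`;

// Hypothetical formatter for the inline "(N embedded)" suffix.
function formatCoverage(row) {
  // SUM over zero rows yields NULL, so fall back to 0.
  const embedded = Number(row.embedded ?? 0);
  return `${row.total} (${embedded} embedded)`;
}

console.log(formatCoverage({ total: 12, embedded: 9 })); // 12 (9 embedded)
```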

Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row, since they
duplicated the per-surface rows above. Disk now reports only actual byte
usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` /
`semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry`
helpers, and the `filter` arg on `summarizeDir` are removed.
2026-05-21 15:28:58 +02:00


import assert from 'node:assert/strict';
import { readFile } from 'node:fs/promises';
import { describe, it } from 'node:test';

import { PUBLIC_NPM_PACKAGE_VERSION } from './public-npm-release-metadata.mjs';
import {
  buildLocalEmbeddingsSmokeEnv,
  expectedPublicKtxVersionPattern,
  localEmbeddingsSmokeCommands,
  localEmbeddingsSmokeOptIn,
  parseDaemonBaseUrl,
  publicKtxTarballName,
  validateEmbeddingResponse,
} from './local-embeddings-runtime-smoke.mjs';

const PUBLIC_TARBALL_NAME = `kaelio-ktx-${PUBLIC_NPM_PACKAGE_VERSION}.tgz`;
const OTHER_PUBLIC_TARBALL_NAME = 'kaelio-ktx-9.9.9.tgz';

describe('localEmbeddingsSmokeOptIn', () => {
  it('skips unless the smoke is explicitly enabled', () => {
    assert.deepEqual(localEmbeddingsSmokeOptIn({}, []), {
      run: false,
      message: 'Set KTX_RUN_LOCAL_EMBEDDINGS_SMOKE=1 or pass --force to run the local embeddings smoke.',
    });
  });

  it('runs when the environment opt-in is set', () => {
    assert.deepEqual(localEmbeddingsSmokeOptIn({ KTX_RUN_LOCAL_EMBEDDINGS_SMOKE: '1' }, []), {
      run: true,
    });
  });

  it('runs when --force is present', () => {
    assert.deepEqual(localEmbeddingsSmokeOptIn({}, ['--force']), {
      run: true,
    });
  });
});

describe('publicKtxTarballName', () => {
  it('selects the public @kaelio/ktx tarball name', () => {
    assert.equal(publicKtxTarballName([PUBLIC_TARBALL_NAME, 'ignore-me.tgz']), PUBLIC_TARBALL_NAME);
  });

  it('fails when the public package tarball is missing', () => {
    assert.throws(
      () => publicKtxTarballName(['ktx-cli-0.0.0-private.tgz']),
      /Expected exactly one @kaelio\/ktx tarball/,
    );
  });

  it('fails when multiple public package tarballs are present', () => {
    assert.throws(
      () => publicKtxTarballName([PUBLIC_TARBALL_NAME, OTHER_PUBLIC_TARBALL_NAME]),
      /Expected exactly one @kaelio\/ktx tarball/,
    );
  });
});

describe('expectedPublicKtxVersionPattern', () => {
  it('matches the public package version and rejects other versions', () => {
    const pattern = expectedPublicKtxVersionPattern();
    assert.match(`@kaelio/ktx ${PUBLIC_NPM_PACKAGE_VERSION}\n`, pattern);
    assert.doesNotMatch('@kaelio/ktx 9.9.9-other\n', pattern);
  });
});

describe('buildLocalEmbeddingsSmokeEnv', () => {
  it('isolates the runtime root and model caches inside the smoke root', () => {
    const env = buildLocalEmbeddingsSmokeEnv('/tmp/ktx-local-embedding-smoke', {
      PATH: '/usr/bin',
    });
    assert.equal(env.PATH, '/usr/bin');
    assert.equal(env.KTX_RUN_LOCAL_EMBEDDINGS_SMOKE, '1');
    assert.equal(env.KTX_RUNTIME_ROOT, '/tmp/ktx-local-embedding-smoke/managed-runtime');
    assert.equal(env.HF_HOME, '/tmp/ktx-local-embedding-smoke/hf-home');
    assert.equal(env.TRANSFORMERS_CACHE, '/tmp/ktx-local-embedding-smoke/transformers-cache');
    assert.equal(env.SENTENCE_TRANSFORMERS_HOME, '/tmp/ktx-local-embedding-smoke/sentence-transformers-home');
    assert.equal(env.TORCH_HOME, '/tmp/ktx-local-embedding-smoke/torch-home');
  });
});

describe('localEmbeddingsSmokeCommands', () => {
  it('describes the installed-package commands needed for the smoke', () => {
    const commands = localEmbeddingsSmokeCommands({
      projectDir: '/tmp/ktx-local-embedding-smoke/project',
    });
    assert.deepEqual(commands.map((command) => command.label), [
      'ktx public package version',
      'ktx admin runtime status missing',
      'ktx admin runtime install local embeddings',
      'ktx admin runtime status local embeddings ready',
      'ktx admin runtime start local embeddings',
      'ktx setup local embeddings',
      'ktx admin runtime stop local embeddings',
    ]);
    assert.deepEqual(commands[2], {
      label: 'ktx admin runtime install local embeddings',
      command: 'pnpm',
      args: ['exec', 'ktx', 'admin', 'runtime', 'install', '--feature', 'local-embeddings', '--yes'],
      timeoutMs: 1_200_000,
    });
    assert.deepEqual(commands[4], {
      label: 'ktx admin runtime start local embeddings',
      command: 'pnpm',
      args: ['exec', 'ktx', 'admin', 'runtime', 'start', '--feature', 'local-embeddings'],
      timeoutMs: 300_000,
    });
    assert.deepEqual(commands[5].args, [
      'exec',
      'ktx',
      'setup',
      '--project-dir',
      '/tmp/ktx-local-embedding-smoke/project',
      '--no-input',
      '--yes',
      '--skip-llm',
      '--embedding-backend',
      'sentence-transformers',
      '--skip-databases',
      '--skip-sources',
      '--skip-agents',
    ]);
  });
});

describe('parseDaemonBaseUrl', () => {
  it('extracts the daemon URL from runtime start output', () => {
    assert.equal(
      parseDaemonBaseUrl('Started KTX daemon\nurl: http://127.0.0.1:61234\nfeatures: local-embeddings\n'),
      'http://127.0.0.1:61234',
    );
  });

  it('rejects output without a daemon URL', () => {
    assert.throws(() => parseDaemonBaseUrl('Started KTX daemon\n'), /Daemon URL was not printed/);
  });
});

describe('validateEmbeddingResponse', () => {
  it('accepts a finite embedding vector with the expected dimensions', () => {
    validateEmbeddingResponse({ embedding: [0.1, -0.2, 0.3] }, 3);
  });

  it('rejects a vector with the wrong dimensions', () => {
    assert.throws(
      () => validateEmbeddingResponse({ embedding: [0.1, 0.2] }, 3),
      /Expected embedding dimension 3, got 2/,
    );
  });

  it('rejects non-finite embedding values', () => {
    assert.throws(
      () => validateEmbeddingResponse({ embedding: [0.1, Number.NaN, 0.3] }, 3),
      /Embedding value at index 1 is not a finite number/,
    );
  });
});

describe('package script', () => {
  it('registers the opt-in local embeddings smoke command', async () => {
    const packageJson = JSON.parse(await readFile(new URL('../package.json', import.meta.url), 'utf8'));
    assert.equal(
      packageJson.scripts['release:local-embeddings-smoke'],
      'node scripts/local-embeddings-runtime-smoke.mjs --require-opt-in',
    );
  });
});
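
The module under test, `local-embeddings-runtime-smoke.mjs`, is not shown here. As a reading aid, the sketch below gives one possible shape for two of its helpers, inferred only from the assertions above. The function bodies, the regular expression, and any error text beyond the fragments the tests match on are assumptions, not the real implementation.

```javascript
// Sketch only: inferred from the test assertions, not copied from the module.

// Finds a line of the form "url: <value>" in the runtime start output.
function parseDaemonBaseUrl(output) {
  const match = /^url:\s*(\S+)\s*$/m.exec(output);
  if (!match) {
    throw new Error('Daemon URL was not printed by runtime start.');
  }
  return match[1];
}

// Checks that the response carries a vector of the expected length whose
// entries are all finite numbers; throws on the first violation.
function validateEmbeddingResponse(response, expectedDimension) {
  const embedding = response?.embedding;
  if (!Array.isArray(embedding)) {
    throw new Error('Embedding response did not include an embedding array.');
  }
  if (embedding.length !== expectedDimension) {
    throw new Error(`Expected embedding dimension ${expectedDimension}, got ${embedding.length}`);
  }
  embedding.forEach((value, index) => {
    if (typeof value !== 'number' || !Number.isFinite(value)) {
      throw new Error(`Embedding value at index ${index} is not a finite number`);
    }
  });
}
```

Checking the dimension before the individual values means a wrong-sized vector reports a single dimension error instead of a per-index failure.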