mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-25 08:48:08 +02:00
* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm

* refactor(workspace): rewrite @ktx/llm imports to relative paths

* refactor(workspace): fold internal packages into cli

* chore(workspace): gate dead-code with knip production mode

Turn on production-mode knip plus an autofix run in pre-commit and the `pnpm dead-code` script, document the `/** @internal */` convention for test-only exports in AGENTS.md, annotate test-only exports across the CLI with that JSDoc, and drop dead exports/wrappers the new gate surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`, `createLocalScanEnrichmentProvidersFromConfig`, `PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports). Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit production entries so cross-package barrel leaks are caught.

* refactor(cli): delete internal barrel index.ts files

The 34 `index.ts` re-export barrels inside `packages/cli/src/` were holdovers from the pre-fold multi-workspace structure. Post-fold-in they served no production purpose: external consumers go through the single package main entry, and in-repo callers mostly imported through them only because the path was short. Internally, knip flagged most barrel re-exports as production-dead (only reached via tests).

This change:
- Deletes every internal barrel except `packages/cli/src/index.ts` (the published package entry).
- Rewrites ~270 source/test files to import each name directly from the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to `create-warehouse-verification-tools.ts` (the function it defined locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*, live-database/extracted-schema, live-database/structural-sync, relationship-* feedback/review chain) plus their tests and a cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths (notion-client, connector barrels in scan/local-scan-connectors tests) to mock the source files instead.
- Points the maintainer benchmark script (`scripts/relationship-benchmark-report.mjs`) at source files instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit production entries only for the benchmark code reached via dist by the maintainer script.

Net: 413 files changed, ~1.2k insertions, ~9.4k deletions. `pnpm run dead-code` (Biome + knip default + knip production) and `pnpm run type-check` are clean; 2277 tests pass.

* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly

Promote the CLI workspace package to the public name `@kaelio/ktx` and drop the separate `scripts/build-public-npm-package.mjs` wrapper. The CLI package is now publishable in place (`publishConfig.access: public`, `provenance: true`), so artifact packing uses `pnpm pack` against `packages/cli/` instead of assembling a parallel package tree. Updates all workspace filter invocations, docs, tests, and release readiness checks to reference the new package name, and folds the tarball-name helper into `scripts/public-npm-release-metadata.mjs`.

* docs: align "agent clients" and "data agents" terminology

Replace "client agents" with "agent clients" and "database agents" with "data agents" across AGENTS.md, README.md, the docs-site copy, and the matching setup-agents test description, matching the canonical vocabulary in docs/terminology.md. Also moves packages/cli/tsconfig.json's tsBuildInfoFile from node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive node_modules reinstalls.

* refactor(release): single source of truth for package version

Make packages/cli/package.json the single source of truth for the @kaelio/ktx version. publicNpmPackageVersion() now reads it directly, so artifact filenames, release-readiness checks, and the Python wheel version all derive from one field. The duplicate release-policy.json.publicNpmPackageVersion is removed. Previously the two fields could drift: tarballs were named kaelio-ktx-0.4.1.tgz while internally containing @kaelio/ktx@0.0.0-private.

- update-public-release-version.mjs rewrites both Python pyproject.toml files (ktx-daemon, ktx-sl) alongside the npm package.jsons, normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to @semantic-release/git assets so the release commit back to main carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are replaced with "?? getKtxCliPackageInfo().version", and createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree always reflects the most recent release; no sentinel pin to maintain.

Verified: pnpm run artifacts:build now produces kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with @kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and 2287 vitests + 173 script tests pass.

* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime

Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and scan command entrypoints so tests can stub them, and teach resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime feature when ktx.yaml selects sentence-transformers.

* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal

Both symbols are consumed only by status-project.test.ts. Annotating with /** @internal */ keeps knip's production-mode check clean without changing runtime behavior.

* fix(cli): use real package metadata in print-command-tree

The stubbed package name embedded a forbidden product identifier that tripped the boundary check in CI. Read the metadata from package.json instead; this keeps the rendered tree unchanged and removes a duplicate source of truth.

* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts

Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer source counts, computed with `SUM(embedding_json IS NOT NULL)` over `knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to "Wiki" (canonical per `docs/terminology.md`) and rename the matching `localStats.knowledgePages` field to `localStats.wikiPages`.

Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row; those duplicated the per-surface rows above. Disk now reports only actual byte usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` / `semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry` helpers, and the `filter` arg on `summarizeDir` are removed.
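The release commit above mentions normalizing the npm version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2). As a rough illustration only, the mapping could look like the sketch below. The function name `toPep440` and the supported prerelease labels are assumptions made for this example; this is not the code in update-public-release-version.mjs.

```typescript
// Hypothetical helper, for illustration only; not the repository's implementation.
// Assumes versions of the form MAJOR.MINOR.PATCH with an optional
// "-alpha.N", "-beta.N", or "-rc.N" prerelease suffix.
const PEP440_LABELS: Record<string, string> = { alpha: 'a', beta: 'b', rc: 'rc' };

function toPep440(version: string): string {
  const match = /^(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$/.exec(version);
  if (!match) {
    throw new Error(`Unsupported version format: ${version}`);
  }
  const release = match[1] ?? '';
  const label = match[2];
  const num = match[3];
  // No prerelease suffix: the semver string is already valid PEP 440.
  if (label === undefined || num === undefined) {
    return release;
  }
  // PEP 440 attaches the prerelease segment directly, with no separators.
  return `${release}${PEP440_LABELS[label]}${num}`;
}
```

With this sketch, `toPep440('0.1.0-rc.2')` yields `0.1.0rc2`, and a plain release such as `0.4.1` passes through unchanged.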
337 lines
11 KiB
TypeScript
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

vi.mock('ai', () => ({
  generateText: vi.fn(),
  stepCountIs: (n: number) => n,
  tool: (def: unknown) => def,
}));

import { generateText } from 'ai';
import { AiSdkKtxLlmRuntime } from './ai-sdk-runtime.js';
import type { RunLoopStepInfo } from './runtime-port.js';

describe('AiSdkKtxLlmRuntime.runAgentLoop', () => {
  let runtime: AiSdkKtxLlmRuntime;
  const llmProvider = {
    getModel: vi.fn().mockReturnValue({ modelId: 'claude-sonnet-4-6', provider: 'anthropic' }),
    getModelByName: vi.fn(),
    cacheMarker: vi.fn(),
    repairToolCallHandler: vi.fn(),
    thinkingProviderOptions: vi.fn(),
    telemetryConfig: vi.fn(),
    promptCachingConfig: vi.fn(() => ({
      enabled: false,
      systemTtl: '1h',
      toolsTtl: '1h',
      historyTtl: '5m',
      cacheSystem: true,
      cacheTools: true,
      cacheHistory: true,
      vertexFallbackTo5m: false,
    })),
    activeBackend: vi.fn(() => 'anthropic'),
  };

  beforeEach(() => {
    vi.clearAllMocks();
    runtime = new AiSdkKtxLlmRuntime({ llmProvider: llmProvider as any });
  });

  afterEach(() => vi.clearAllMocks());

  it('passes systemPrompt, userPrompt, tools, and step budget through to generateText', async () => {
    (generateText as any).mockResolvedValue({ text: 'ok', toolCalls: [], steps: [] });
    const repairHandler = vi.fn();
    llmProvider.repairToolCallHandler.mockReturnValueOnce(repairHandler);
    const tools = { noop: { description: 'noop', inputSchema: {}, execute: vi.fn() } };
    await runtime.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: 'SYS',
      userPrompt: 'USR',
      toolSet: tools as any,
      stepBudget: 17,
      telemetryTags: { source: 'test' },
    });
    const call = (generateText as any).mock.calls[0][0];
    expect(call.system).toEqual({ role: 'system', content: 'SYS' });
    expect(call.messages).toEqual([{ role: 'user', content: 'USR' }]);
    expect(call.prompt).toBeUndefined();
    expect(call.tools.noop).toEqual(
      expect.objectContaining({
        description: 'noop',
        inputSchema: {},
        execute: expect.any(Function),
        toModelOutput: expect.any(Function),
      }),
    );
    expect(call.stopWhen).toBe(17);
    expect(call.temperature).toBe(0);
    expect(call.experimental_repairToolCall).toBe(repairHandler);
    expect(llmProvider.getModel).toHaveBeenCalledWith('candidateExtraction');
    expect(llmProvider.repairToolCallHandler).toHaveBeenCalledWith({ source: 'ktx-agent-runner' });
  });

  it('returns stopReason=natural when the loop completes without error', async () => {
    (generateText as any).mockResolvedValue({ text: 'done', toolCalls: [], steps: [] });
    const result = await runtime.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: 'system',
      userPrompt: 'user',
      toolSet: {},
      stepBudget: 10,
      telemetryTags: {},
    });
    expect(result.stopReason).toBe('natural');
    expect(result.error).toBeUndefined();
    expect(llmProvider.getModel).toHaveBeenCalledWith('candidateExtraction');
    expect(generateText).toHaveBeenCalledWith(
      expect.objectContaining({
        system: { role: 'system', content: 'system' },
        messages: [{ role: 'user', content: 'user' }],
      }),
    );
  });

  it('returns stopReason=error with the error on generateText failure', async () => {
    const err = new Error('LLM unavailable');
    (generateText as any).mockRejectedValue(err);
    const result = await runtime.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: '',
      userPrompt: '',
      toolSet: {},
      stepBudget: 10,
      telemetryTags: {},
    });
    expect(result.stopReason).toBe('error');
    expect(result.error).toBe(err);
  });

  it('invokes caller onStepFinish with incrementing stepIndex and total budget', async () => {
    const calls: RunLoopStepInfo[] = [];
    (generateText as any).mockImplementation(async (opts: any) => {
      for (let i = 0; i < 3; i++) {
        await opts.onStepFinish({});
      }
      return { text: 'ok', toolCalls: [], steps: [] };
    });

    await runtime.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: '',
      userPrompt: '',
      toolSet: {},
      stepBudget: 10,
      telemetryTags: {},
      onStepFinish: (info) => {
        calls.push(info);
      },
    });

    expect(calls).toEqual([
      { stepIndex: 1, stepBudget: 10 },
      { stepIndex: 2, stepBudget: 10 },
      { stepIndex: 3, stepBudget: 10 },
    ]);
  });

  it('swallows errors thrown from caller onStepFinish without aborting the loop', async () => {
    (generateText as any).mockImplementation(async (opts: any) => {
      await opts.onStepFinish({});
      return { text: 'ok', toolCalls: [], steps: [] };
    });

    const result = await runtime.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: '',
      userPrompt: '',
      toolSet: {},
      stepBudget: 10,
      telemetryTags: {},
      onStepFinish: () => {
        throw new Error('boom');
      },
    });

    expect(result.stopReason).toBe('natural');
  });

  it('forwards telemetryTags.source through experimental_telemetry metadata', async () => {
    (generateText as any).mockResolvedValue({ text: 'ok', toolCalls: [], steps: [] });
    const telemetryConfigEnabled = {
      isEnabled: () => true,
      devtoolsEnabled: false,
      appSettingsService: {
        settings: { telemetry: { recordInputs: false, recordOutputs: false } },
      },
      systemConfigService: {
        config: { instance: { name: 'test-instance' } },
      },
    } as any;
    const runtimeWithTelemetry = new AiSdkKtxLlmRuntime({
      llmProvider: llmProvider as any,
      telemetry: {
        createTelemetry: (tags) => ({
          isEnabled: telemetryConfigEnabled.isEnabled(),
          metadata: {
            source: tags.source ?? 'RESEARCH',
            jobId: tags.jobId,
            unitKey: tags.unitKey,
          },
        }),
      },
    });
    await runtimeWithTelemetry.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: '',
      userPrompt: '',
      toolSet: {},
      stepBudget: 10,
      telemetryTags: { source: 'metabase', jobId: 'job-123', unitKey: 'u/1' },
    });
    const call = (generateText as any).mock.calls[0][0];
    expect(call.experimental_telemetry.metadata.source).toBe('metabase');
  });

  it('defaults to source=RESEARCH when telemetryTags omits source', async () => {
    (generateText as any).mockResolvedValue({ text: 'ok', toolCalls: [], steps: [] });
    const telemetryConfigEnabled = {
      isEnabled: () => true,
      devtoolsEnabled: false,
      appSettingsService: {
        settings: { telemetry: { recordInputs: false, recordOutputs: false } },
      },
      systemConfigService: {
        config: { instance: { name: 'test-instance' } },
      },
    } as any;
    const runtimeWithTelemetry = new AiSdkKtxLlmRuntime({
      llmProvider: llmProvider as any,
      telemetry: {
        createTelemetry: (tags) => ({
          isEnabled: telemetryConfigEnabled.isEnabled(),
          metadata: {
            source: tags.source ?? 'RESEARCH',
            jobId: tags.jobId,
            unitKey: tags.unitKey,
          },
        }),
      },
    });
    await runtimeWithTelemetry.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: '',
      userPrompt: '',
      toolSet: {},
      stepBudget: 10,
      telemetryTags: { operationName: 'memory-agent-ingest' },
    });
    const call = (generateText as any).mock.calls[0][0];
    expect(call.experimental_telemetry.metadata.source).toBe('RESEARCH');
  });

  it('forwards jobId and unitKey through experimental_telemetry metadata', async () => {
    (generateText as any).mockResolvedValue({ text: 'ok', toolCalls: [], steps: [] });
    const telemetryConfigEnabled = {
      isEnabled: () => true,
      devtoolsEnabled: false,
      appSettingsService: {
        settings: { telemetry: { recordInputs: false, recordOutputs: false } },
      },
      systemConfigService: {
        config: { instance: { name: 'test-instance' } },
      },
    } as any;
    const runtimeWithTelemetry = new AiSdkKtxLlmRuntime({
      llmProvider: llmProvider as any,
      telemetry: {
        createTelemetry: (tags) => ({
          isEnabled: telemetryConfigEnabled.isEnabled(),
          metadata: {
            source: tags.source ?? 'RESEARCH',
            jobId: tags.jobId,
            unitKey: tags.unitKey,
          },
        }),
      },
    });
    await runtimeWithTelemetry.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: '',
      userPrompt: '',
      toolSet: {},
      stepBudget: 10,
      telemetryTags: { source: 'metabase', jobId: 'job-777', unitKey: 'sources/users' },
    });
    const call = (generateText as any).mock.calls[0][0];
    expect(call.experimental_telemetry.metadata.jobId).toBe('job-777');
    expect(call.experimental_telemetry.metadata.unitKey).toBe('sources/users');
  });

  it('records a sanitized LLM debug request when a recorder is injected', async () => {
    (generateText as any).mockResolvedValue({ text: 'ok', toolCalls: [], steps: [] });
    const record = vi.fn();
    const provider = {
      ...llmProvider,
      cacheMarker: vi.fn((ttl: '5m' | '1h') => ({
        anthropic: { cacheControl: { type: 'ephemeral' as const, ttl } },
      })),
      promptCachingConfig: vi.fn(() => ({
        enabled: true,
        systemTtl: '1h',
        toolsTtl: '1h',
        historyTtl: '5m',
        cacheSystem: true,
        cacheTools: true,
        cacheHistory: true,
        vertexFallbackTo5m: false,
      })),
    };
    const runtimeWithDebug = new AiSdkKtxLlmRuntime({
      llmProvider: provider as any,
      debugRequestRecorder: { record },
    });

    await runtimeWithDebug.runAgentLoop({
      modelRole: 'candidateExtraction',
      systemPrompt: 'SECRET SYSTEM PROMPT',
      userPrompt: 'SECRET USER PROMPT',
      toolSet: {
        emit_candidate: {
          description: 'SECRET TOOL DESCRIPTION',
          inputSchema: {},
          execute: vi.fn(),
        } as any,
      },
      stepBudget: 10,
      telemetryTags: { operationName: 'ingest-bundle-wu', source: 'metabase', jobId: 'job-1', unitKey: 'cards/1' },
    });

    expect(record).toHaveBeenCalledTimes(1);
    expect(record).toHaveBeenCalledWith(
      expect.objectContaining({
        operationName: 'ingest-bundle-wu',
        source: 'metabase',
        jobId: 'job-1',
        unitKey: 'cards/1',
        modelRole: 'candidateExtraction',
        modelId: 'claude-sonnet-4-6',
        messageCount: 2,
        toolNames: ['emit_candidate'],
      }),
    );
    const providerOptions = record.mock.calls[0][0].providerOptions;
    expect(providerOptions).toEqual(
      expect.arrayContaining([
        expect.objectContaining({ target: 'message', index: 0, role: 'system' }),
        expect.objectContaining({ target: 'message-part', index: 1, role: 'user', partIndex: 0 }),
        expect.objectContaining({ target: 'tool', name: 'emit_candidate' }),
      ]),
    );
    expect(providerOptions).toHaveLength(3);
    const serialized = JSON.stringify(record.mock.calls[0][0]);
    expect(serialized).not.toContain('SECRET SYSTEM PROMPT');
    expect(serialized).not.toContain('SECRET USER PROMPT');
    expect(serialized).not.toContain('SECRET TOOL DESCRIPTION');
  });
});
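
The two `onStepFinish` tests above pin down a contract: the caller's callback receives a 1-based `stepIndex` together with the fixed `stepBudget`, and an exception thrown by the callback must not abort the loop. One way such a wrapper might be written is sketched below. `StepInfo`, `StepListener`, and `makeStepReporter` are hypothetical names chosen for this example, and the body is an assumption about the shape of the logic, not the code in `ai-sdk-runtime.ts`.

```typescript
// Hypothetical sketch of a step-reporting wrapper; names are illustrative only.
interface StepInfo {
  stepIndex: number;
  stepBudget: number;
}

type StepListener = (info: StepInfo) => void | Promise<void>;

// Returns a function suitable for passing as an SDK-level step hook.
// Each call bumps a 1-based counter and forwards it to the caller's listener.
function makeStepReporter(stepBudget: number, listener?: StepListener): () => Promise<void> {
  let stepIndex = 0;
  return async () => {
    stepIndex += 1;
    if (!listener) return;
    try {
      await listener({ stepIndex, stepBudget });
    } catch {
      // Deliberately ignored: a failing listener should not abort the agent loop.
    }
  };
}
```

Keeping the counter inside the closure means each loop run gets its own numbering, and the `try`/`catch` covers both synchronous throws and rejected promises from the listener.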