Merge remote-tracking branch 'origin/main' into rollback-failed-ktx-setup

# Conflicts:
#	packages/cli/src/setup-project.ts
Andrey Avtomonov 2026-05-19 18:52:24 +02:00
commit 23acaecb52
21 changed files with 616 additions and 267 deletions


@ -6,8 +6,6 @@
The context layer for analytics agents
</h1>
<p align="center">by Kaelio</p>
<p align="center">
<a href="https://www.npmjs.com/package/@kaelio/ktx"><img src="https://img.shields.io/npm/v/@kaelio/ktx?style=flat-square&color=f97316" alt="npm version" /></a>
<a href="https://codecov.io/gh/Kaelio/ktx"><img src="https://codecov.io/gh/Kaelio/ktx/graph/badge.svg?branch=main" alt="Codecov" /></a>
@ -18,19 +16,38 @@
---
KTX turns warehouse metadata, semantic definitions, and business knowledge into
reviewable project files that agents can use to plan, query, and update
analytics work.
KTX is a self-improving context layer that teaches agents how to query your
warehouse accurately - from approved metric definitions, joinable columns, and
business knowledge it builds and maintains for you.
Use KTX when you want agents to:
Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and
SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.
- Generate SQL from approved measures and joins
- Repair semantic definitions through reviewable diffs
- Explain metric provenance with warehouse evidence
- Work alongside dbt, MetricFlow, LookML, Looker, Metabase, and Notion
## Why KTX
Supports PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and
SQLite.
General-purpose agents struggle on data tasks. They re-explore your warehouse
on every question, invent their own metric logic, and return numbers that
don't match approved definitions.
Traditional semantic layers don't fix this. They demand constant manual
upkeep and don't absorb the rest of your company's knowledge.
KTX does both, automatically:
- **Learns from company knowledge.** Ingests wiki content, organizes it,
removes duplicates, and flags contradictions for human review.
- **Maps the data stack.** Samples tables, captures metadata and usage
patterns, detects joinable columns, and annotates sources so agents write
better queries.
- **Builds a semantic layer.** Combines raw tables and high-level metrics
through a join graph that automatically resolves chasm and fan traps, so
agents fetch metrics declaratively instead of rewriting canonical SQL each
time.
- **Serves agents at execution.** Exposes CLI and MCP tools with combined
full-text and semantic search across wiki and semantic-layer entities.
Agents can run raw SQL when they need it, or compose semantic-layer queries
when they want approved metrics with reliable joins.
<p align="center">
<img src="docs-site/public/images/ingestion-flow-transparent.svg" alt="KTX ingestion flow from source systems through validation to wiki and semantic-layer outputs" width="900" />
@ -109,17 +126,17 @@ Commit `ktx.yaml`, `semantic-layer/`, and `wiki/`. Keep `.ktx/` local.
## Agent Usage
Setup can install KTX instructions for Claude Code, Codex, Cursor, OpenCode,
and universal `.agents` clients:
Install KTX integration for Claude Code, Claude Desktop, Codex, Cursor,
OpenCode, and generic `.agents` clients:
```bash
ktx setup --agents
```
Use `--target <target>` when you want to install or repair one specific
integration.
Pass `--target <target>` to install or repair one specific integration.
Agent-facing workflows typically start with:
A typical agent workflow combines wiki and semantic-layer search before
querying:
```bash
ktx sl search "revenue" --json
@ -127,40 +144,14 @@ ktx wiki search "refund policy" --json
ktx sl query --connection-id warehouse --measure orders.revenue --format sql
```
During agent setup, choose **Ask data questions with KTX MCP** for client
agents. Choose **Ask data questions + manage KTX with CLI commands** only when
a developer or operator agent also needs pinned `ktx` admin commands.
During setup, choose **Ask data questions with KTX MCP** for client agents.
Choose **Ask data questions + manage KTX with CLI commands** when an operator
agent also needs pinned `ktx` admin commands.
After setup, KTX prints **Required before using agents**. Complete those steps
before opening the configured agent. If it shows `ktx mcp start --project-dir ...`,
run that command before using Claude Code, Codex, Cursor, OpenCode, or generic
MCP clients. The same output also prints the matching `ktx mcp stop` command
for when you want to stop MCP later. Claude Desktop uses its own launcher for
MCP and prints separate skill upload steps.
The analytics skill teaches client agents the MCP workflow: discover data,
prefer semantic-layer measures, inspect entity details before raw SQL, and
capture durable learnings. Admin CLI skills call `ktx` commands directly
through a skill file installed in your agent's config:
```bash
ktx sl query --measure orders.revenue --dimension orders.status --format sql
ktx wiki search "revenue definition"
ktx sl validate orders
```
Supported client agents: Claude Code, Claude Desktop, Codex, Cursor, OpenCode,
and clients that can use the printed MCP endpoint or `.agents` admin skills.
Claude Desktop setup registers a local `ktx mcp stdio` server in Claude
Desktop's config and generates one uploadable ZIP per Claude Desktop skill
under `.ktx/agents/claude/`. Restart Claude Desktop after setup, then upload
each ZIP from **Customize** > **Skills** > **+** > **Create skill** >
**Upload a skill**.
The release artifact manifest contains the public npm tarball and the bundled
`kaelio-ktx` runtime wheel. The `python/ktx-sl` and `python/ktx-daemon`
directories remain source packages for development, not public release
artifacts.
After setup, KTX prints **Required before using agents** with the exact
commands to run. If the output includes `ktx mcp start --project-dir ...`, run
it before opening your agent. Claude Desktop uses its own launcher and prints
separate skill upload steps under `.ktx/agents/claude/`.
## Workspace packages


@ -150,6 +150,8 @@ describe('ensureManagedLocalEmbeddingsDaemon', () => {
}),
).resolves.toEqual({
baseUrl: 'http://127.0.0.1:61234',
stdoutLog: '/work/proj/.ktx/runtime/daemon.stdout.log',
stderrLog: '/work/proj/.ktx/runtime/daemon.stderr.log',
env: {
[MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV]: 'http://127.0.0.1:61234',
},


@ -14,6 +14,8 @@ import { startManagedPythonDaemon, type ManagedPythonDaemonStartResult } from '.
export interface ManagedLocalEmbeddingsDaemon {
baseUrl: string;
stdoutLog: string;
stderrLog: string;
env: Record<typeof MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV, string>;
}
@ -91,6 +93,8 @@ export async function ensureManagedLocalEmbeddingsDaemon(
return {
baseUrl: daemon.baseUrl,
stdoutLog: daemon.state.stdoutLog,
stderrLog: daemon.state.stderrLog,
env: {
[MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV]: daemon.baseUrl,
},


@ -11,6 +11,8 @@ import {
type KtxMcpDaemonState,
} from './managed-mcp-daemon.js';
type KtxMcpDaemonStartOptions = Parameters<typeof startKtxMcpDaemon>[0];
function child(pid = 4242): KtxMcpDaemonChild {
return { pid, unref: vi.fn() };
}
@ -40,6 +42,7 @@ describe('managed MCP daemon lifecycle', () => {
});
afterEach(async () => {
vi.unstubAllEnvs();
await rm(tempDir, { recursive: true, force: true });
});
@ -94,6 +97,33 @@ describe('managed MCP daemon lifecycle', () => {
);
});
it('sanitizes IPv6 CIDR entries from child NO_PROXY env', async () => {
vi.stubEnv('NO_PROXY', 'localhost,fd07:b51a:cc66:f0::/64');
vi.stubEnv('no_proxy', '::1,fd00::/8,*.orb.local');
const spawnDaemon = vi.fn<NonNullable<KtxMcpDaemonStartOptions['spawnDaemon']>>(() => child(5555));
await startKtxMcpDaemon({
projectDir,
cliVersion: '0.0.0-test',
host: '127.0.0.1',
port: 7879,
allowedHosts: [],
allowedOrigins: [],
binPath: '/repo/packages/cli/dist/bin.js',
spawnDaemon,
processAlive: vi.fn(() => false),
portAvailable: vi.fn(async () => true),
now: () => new Date('2026-05-14T00:00:00.000Z'),
});
const env = spawnDaemon.mock.calls[0]?.[2].env;
if (!env) {
throw new Error('Expected MCP daemon spawn env');
}
expect(env.NO_PROXY).toBe('localhost,::1,*.orb.local');
expect(env.no_proxy).toBe(env.NO_PROXY);
});
it('returns already-running without spawning when the daemon is alive at the same host/port', async () => {
await mkdir(join(projectDir, '.ktx'), { recursive: true });
await writeFile(join(projectDir, '.ktx/mcp.json'), `${JSON.stringify(state(projectDir), null, 2)}\n`);


@ -4,6 +4,7 @@ import { createServer } from 'node:net';
import { dirname, join } from 'node:path';
import { setTimeout as delay } from 'node:timers/promises';
import { z } from 'zod';
import { sanitizeChildProxyEnv } from './proxy-env.js';
export interface KtxMcpDaemonState {
schemaVersion: 1;
@ -166,11 +167,11 @@ export async function startKtxMcpDaemon(options: {
const child = (options.spawnDaemon ?? defaultSpawnDaemon)(process.execPath, args, {
detached: true,
stdio: ['ignore', log.fd, log.fd],
env: {
env: sanitizeChildProxyEnv({
...process.env,
KTX_CLI_VERSION: options.cliVersion,
...(options.token ? { KTX_MCP_TOKEN: options.token } : {}),
},
}),
});
if (!child.pid) {
throw new Error('Failed to start KTX MCP daemon: child process pid was not available.');


@ -99,6 +99,7 @@ function installResult(features: KtxRuntimeFeature[] = ['core']): ManagedPythonR
asset: {
manifest: installedManifest.asset,
wheelPath: '/assets/python/kaelio_ktx-0.2.0-py3-none-any.whl',
requiresPython: { specifier: '>=3.13', minimumVersion: '3.13' },
},
manifest: installedManifest,
};


@ -79,6 +79,7 @@ function installResult(root: string, features: Array<'core' | 'local-embeddings'
asset: {
manifest: manifest(root, features).asset,
wheelPath: join(root, 'assets', 'python', 'kaelio_ktx-0.2.0-py3-none-any.whl'),
requiresPython: { specifier: '>=3.13', minimumVersion: '3.13' },
},
manifest: manifest(root, features),
};
@ -132,6 +133,7 @@ describe('managed Python daemon lifecycle', () => {
});
afterEach(async () => {
vi.unstubAllEnvs();
await rm(tempDir, { recursive: true, force: true });
});
@ -187,6 +189,27 @@ describe('managed Python daemon lifecycle', () => {
});
});
it('sanitizes IPv6 CIDR entries from child NO_PROXY env', async () => {
vi.stubEnv('NO_PROXY', 'localhost,fd07:b51a:cc66:f0::/64,127.0.0.0/8');
vi.stubEnv('no_proxy', '::1,fd00::/8,*.orb.local');
const spawnDaemon = makeSpawn(5555);
await startManagedPythonDaemon({
...daemonOptionsBase(tempDir),
features: ['local-embeddings'],
installRuntime: vi.fn(async () => installResult(tempDir, ['core', 'local-embeddings'])),
spawnDaemon,
fetch: makeFetch(),
allocatePort: vi.fn(async () => 61234),
now: () => new Date('2026-05-11T00:00:00.000Z'),
pollIntervalMs: 1,
});
const env = vi.mocked(spawnDaemon).mock.calls[0]?.[2].env;
expect(env?.NO_PROXY).toBe('localhost,127.0.0.0/8,::1,*.orb.local');
expect(env?.no_proxy).toBe(env?.NO_PROXY);
});
it('makes a final health probe before reporting startup failure', async () => {
const spawnDaemon = makeSpawn(5556);
const installRuntime = vi.fn(async () => installResult(tempDir));


@ -14,6 +14,7 @@ import {
type ManagedPythonRuntimeInstallOptions,
type ManagedPythonRuntimeInstallResult,
} from './managed-python-runtime.js';
import { sanitizeChildProxyEnv } from './proxy-env.js';
export interface ManagedPythonDaemonState {
schemaVersion: 1;
@ -696,10 +697,10 @@ export async function startManagedPythonDaemon(
{
detached: true,
stdio: ['ignore', stdout.fd, stderr.fd],
env: {
env: sanitizeChildProxyEnv({
...process.env,
KTX_DAEMON_VERSION: options.cliVersion,
},
}),
},
);
child.unref();


@ -2,6 +2,7 @@ import { createHash } from 'node:crypto';
import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { strToU8, zipSync } from 'fflate';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import {
MISSING_UV_RUNTIME_INSTALL_MESSAGE,
@ -14,10 +15,33 @@ import {
type ManagedPythonRuntimeExec,
} from './managed-python-runtime.js';
async function writeAsset(root: string, contents = 'wheel-bytes') {
function runtimeWheelContents(input: { label?: string; requiresPython?: string | null } = {}): Buffer {
const label = input.label ?? 'runtime-wheel';
const requiresPython = input.requiresPython === null ? [] : [`Requires-Python: ${input.requiresPython ?? '>=3.13'}`];
return Buffer.from(
zipSync({
'kaelio_ktx-0.1.0.dist-info/METADATA': strToU8(
[
'Metadata-Version: 2.4',
'Name: kaelio-ktx',
'Version: 0.1.0',
...requiresPython,
`Summary: ${label}`,
'',
].join('\n'),
),
}),
);
}
async function writeAsset(
root: string,
options: { label?: string; requiresPython?: string | null; contents?: Buffer } = {},
) {
const assetDir = join(root, 'assets', 'python');
await mkdir(assetDir, { recursive: true });
const wheelPath = join(assetDir, 'kaelio_ktx-0.1.0-py3-none-any.whl');
const contents = options.contents ?? runtimeWheelContents(options);
await writeFile(wheelPath, contents);
await writeFile(
join(assetDir, 'manifest.json'),
@ -30,7 +54,7 @@ async function writeAsset(root: string, contents = 'wheel-bytes') {
wheel: {
file: 'kaelio_ktx-0.1.0-py3-none-any.whl',
sha256: createHash('sha256').update(contents).digest('hex'),
bytes: Buffer.byteLength(contents),
bytes: contents.byteLength,
},
},
null,
@ -145,17 +169,18 @@ describe('verifyRuntimeAsset', () => {
});
it('reads the manifest and verifies the wheel checksum', async () => {
const { assetDir, wheelPath } = await writeAsset(tempDir, 'valid-wheel');
const { assetDir, wheelPath } = await writeAsset(tempDir, { label: 'valid-wheel' });
const asset = await verifyRuntimeAsset({ assetDir });
expect(asset.manifest.distributionName).toBe('kaelio-ktx');
expect(asset.manifest.normalizedName).toBe('kaelio_ktx');
expect(asset.wheelPath).toBe(wheelPath);
expect(asset.requiresPython).toEqual({ specifier: '>=3.13', minimumVersion: '3.13' });
});
it('rejects a wheel whose checksum does not match the manifest', async () => {
const { assetDir, wheelPath } = await writeAsset(tempDir, 'original');
const { assetDir, wheelPath } = await writeAsset(tempDir, { label: 'original' });
await writeFile(wheelPath, 'tampered');
await expect(verifyRuntimeAsset({ assetDir })).rejects.toThrow(
@ -164,7 +189,7 @@ describe('verifyRuntimeAsset', () => {
});
it('rejects an unsafe wheel filename in the manifest', async () => {
const { assetDir } = await writeAsset(tempDir, 'valid-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'valid-wheel' });
await writeFile(
join(assetDir, 'manifest.json'),
`${JSON.stringify({
@ -190,6 +215,22 @@ describe('verifyRuntimeAsset', () => {
/Missing bundled Python runtime manifest.*pnpm run artifacts:build/s,
);
});
it('rejects a bundled wheel without Requires-Python metadata', async () => {
const { assetDir } = await writeAsset(tempDir, { requiresPython: null });
await expect(verifyRuntimeAsset({ assetDir })).rejects.toThrow(
/Bundled Python runtime wheel metadata is missing Requires-Python/,
);
});
it('rejects a bundled wheel without a supported minimum Python version', async () => {
const { assetDir } = await writeAsset(tempDir, { requiresPython: '<4' });
await expect(verifyRuntimeAsset({ assetDir })).rejects.toThrow(
/Unsupported bundled Python runtime Requires-Python: <4/,
);
});
});
describe('installManagedPythonRuntime', () => {
@ -204,7 +245,7 @@ describe('installManagedPythonRuntime', () => {
});
it('creates a venv, installs the core wheel, and writes a manifest', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const commands: Array<{ command: string; args: string[] }> = [];
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
commands.push({ command, args });
@ -222,7 +263,8 @@ describe('installManagedPythonRuntime', () => {
expect(result.status).toBe('installed');
expect(commands).toEqual([
{ command: 'uv', args: ['--version'] },
{ command: 'uv', args: ['venv', result.layout.venvDir] },
{ command: 'uv', args: ['python', 'install', '3.13'] },
{ command: 'uv', args: ['venv', '--python', '3.13', result.layout.venvDir] },
{
command: 'uv',
args: ['pip', 'install', '--python', result.layout.pythonPath, result.asset.wheelPath],
@ -240,7 +282,7 @@ describe('installManagedPythonRuntime', () => {
});
it('disables repo uv config for managed runtime uv commands', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const commands: Array<{ command: string; args: string[]; env?: NodeJS.ProcessEnv }> = [];
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args, options) => {
commands.push({ command, args, env: options?.env });
@ -258,13 +300,14 @@ describe('installManagedPythonRuntime', () => {
expect(commands.map((call) => [call.command, call.args[0], call.env?.UV_NO_CONFIG, call.env?.PATH])).toEqual([
['uv', '--version', '1', '/opt/homebrew/bin'],
['uv', 'python', '1', '/opt/homebrew/bin'],
['uv', 'venv', '1', '/opt/homebrew/bin'],
['uv', 'pip', '1', '/opt/homebrew/bin'],
]);
});
it('installs the local-embeddings extra when requested', async () => {
const { assetDir } = await writeAsset(tempDir, 'embedding-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'embedding-wheel' });
const commands: Array<{ command: string; args: string[] }> = [];
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
commands.push({ command, args });
@ -288,7 +331,7 @@ describe('installManagedPythonRuntime', () => {
});
it('fails with the hard-prerequisite message when uv is missing', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const commands: Array<{ command: string; args: string[] }> = [];
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
commands.push({ command, args });
@ -309,7 +352,7 @@ describe('installManagedPythonRuntime', () => {
});
it('reuses an existing compatible runtime when force is false', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
stderr: '',
@ -335,14 +378,17 @@ describe('installManagedPythonRuntime', () => {
});
expect(second.status).toBe('ready');
expect(exec).toHaveBeenCalledTimes(3);
expect(exec).toHaveBeenCalledTimes(4);
});
it('keeps failed install logs in the versioned runtime directory', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
if (command === 'uv' && args[0] === 'venv') {
throw Object.assign(new Error('uv venv failed'), { stdout: 'creating\n', stderr: 'bad python\n' });
throw Object.assign(new Error('uv venv failed'), {
stdout: 'creating\n',
stderr: '× No solution found\n╰─▶ current Python version (3.12.3) does not satisfy Python>=3.13\n',
});
}
return { stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '', stderr: '' };
});
@ -355,11 +401,11 @@ describe('installManagedPythonRuntime', () => {
features: ['core'],
exec,
}),
).rejects.toThrow(/Python runtime install failed/);
).rejects.toThrow(/current Python version \(3\.12\.3\) does not satisfy Python>=3\.13/);
const log = await readFile(join(tempDir, 'runtime', '0.2.0', 'install.log'), 'utf8');
expect(log).toContain('$ uv venv');
expect(log).toContain('bad python');
expect(log).toContain('$ uv venv --python 3.13');
expect(log).toContain('current Python version (3.12.3) does not satisfy Python>=3.13');
});
});
@ -386,7 +432,7 @@ describe('readManagedPythonRuntimeStatus', () => {
});
it('reports ready when manifest and executables exist', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
stderr: '',
@ -413,7 +459,7 @@ describe('readManagedPythonRuntimeStatus', () => {
});
it('reports broken when an executable is missing', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
stderr: '',
@ -449,7 +495,7 @@ describe('doctorManagedPythonRuntime', () => {
});
it('checks uv, bundled assets, and installed runtime status', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
stderr: '',
@ -471,7 +517,7 @@ describe('doctorManagedPythonRuntime', () => {
});
it('reports uv as a hard prerequisite when uv is missing', async () => {
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
const exec: ManagedPythonRuntimeExec = vi.fn(async () => {
throw new Error('spawn uv ENOENT');
});
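Review note: the updated expectations in this test file imply that the managed install now pins the interpreter before creating the venv. The sketch below restates that command order as data. It is a minimal illustration, not the code in `managed-python-runtime.ts`; the helper name `managedRuntimeUvCommands` and the sample paths are hypothetical, and the real installer may run additional commands.

```typescript
interface UvCommand {
  command: 'uv';
  args: string[];
}

// Hypothetical helper: lists the uv invocations asserted in the test above,
// in the order the test expects them.
function managedRuntimeUvCommands(input: {
  minimumVersion: string; // e.g. the minimum parsed from Requires-Python
  venvDir: string;
  pythonPath: string;
  wheelSpec: string;
}): UvCommand[] {
  return [
    // Probe that uv is installed at all.
    { command: 'uv', args: ['--version'] },
    // Make sure a matching interpreter is available to uv.
    { command: 'uv', args: ['python', 'install', input.minimumVersion] },
    // Create the venv against that interpreter instead of whatever is on PATH.
    { command: 'uv', args: ['venv', '--python', input.minimumVersion, input.venvDir] },
    // Install the bundled wheel into the venv's interpreter.
    { command: 'uv', args: ['pip', 'install', '--python', input.pythonPath, input.wheelSpec] },
  ];
}

// Sample paths are placeholders, not the real runtime layout.
const commands = managedRuntimeUvCommands({
  minimumVersion: '3.13',
  venvDir: '/tmp/ktx-runtime/venv',
  pythonPath: '/tmp/ktx-runtime/venv/bin/python',
  wheelSpec: '/tmp/assets/kaelio_ktx-0.1.0-py3-none-any.whl',
});
for (const { command, args } of commands) {
  console.log(`$ ${command} ${args.join(' ')}`);
}
```

Pinning `--python` to the wheel's minimum version is what turns the old "current Python version (3.12.3) does not satisfy Python>=3.13" failure into a deterministic interpreter choice.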


@ -5,6 +5,7 @@ import { homedir } from 'node:os';
import { basename, join } from 'node:path';
import { fileURLToPath } from 'node:url';
import { promisify } from 'node:util';
import { strFromU8, unzipSync } from 'fflate';
import { z } from 'zod';
const execFileAsync = promisify(execFile);
@ -78,6 +79,10 @@ export interface ManagedPythonDaemonLayout extends ManagedPythonRuntimeLayout {
export interface ManagedRuntimeAsset {
manifest: KtxRuntimeAssetManifest;
wheelPath: string;
requiresPython: {
specifier: string;
minimumVersion: string;
};
}
export type ManagedPythonRuntimeExec = (
@ -196,6 +201,40 @@ function isErrnoException(error: unknown, code: string): boolean {
return typeof error === 'object' && error !== null && 'code' in error && error.code === code;
}
function parseRequiresPythonFromWheel(input: { wheelPath: string; contents: Buffer }): ManagedRuntimeAsset['requiresPython'] {
let files: Record<string, Uint8Array>;
try {
files = unzipSync(new Uint8Array(input.contents));
} catch (error) {
throw new Error(
`Unable to read bundled Python runtime wheel metadata: ${error instanceof Error ? error.message : String(error)}`,
);
}
const metadataEntry = Object.entries(files).find(([path]) => path.endsWith('.dist-info/METADATA'));
if (!metadataEntry) {
throw new Error(`Bundled Python runtime wheel metadata is missing: ${input.wheelPath}`);
}
const metadata = strFromU8(metadataEntry[1]);
const requiresPython = metadata
.split(/\r?\n/)
.map((line) => line.match(/^Requires-Python:\s*(.+)\s*$/i)?.[1]?.trim())
.find((value): value is string => typeof value === 'string' && value.length > 0);
if (!requiresPython) {
throw new Error('Bundled Python runtime wheel metadata is missing Requires-Python');
}
const minimumMatch = requiresPython.match(/(?:^|[,\s])>=\s*([0-9]+)\.([0-9]+)(?:\.[0-9]+)?\b/);
if (!minimumMatch) {
throw new Error(`Unsupported bundled Python runtime Requires-Python: ${requiresPython}`);
}
return {
specifier: requiresPython,
minimumVersion: `${minimumMatch[1]}.${minimumMatch[2]}`,
};
}
export async function verifyRuntimeAsset(input: { assetDir: string }): Promise<ManagedRuntimeAsset> {
const manifestPath = join(input.assetDir, 'manifest.json');
let manifestData: unknown;
@ -221,7 +260,7 @@ export async function verifyRuntimeAsset(input: { assetDir: string }): Promise<M
if (sha256 !== manifest.wheel.sha256 || wheel.byteLength !== manifest.wheel.bytes) {
throw new Error(`Bundled Python runtime wheel checksum mismatch: ${wheelPath}`);
}
return { manifest, wheelPath };
return { manifest, wheelPath, requiresPython: parseRequiresPythonFromWheel({ wheelPath, contents: wheel }) };
}
function normalizeFeatures(features: KtxRuntimeFeature[]): KtxRuntimeFeature[] {
@ -262,6 +301,14 @@ function errorOutput(error: unknown): { stdout: string; stderr: string } {
};
}
function installFailureMessage(input: { logPath: string; stdout: string; stderr: string }): string {
const output = [input.stderr.trim(), input.stdout.trim()].filter((part) => part.length > 0).join('\n');
if (!output) {
return `Python runtime install failed. Install log: ${input.logPath}`;
}
return `Python runtime install failed.\n${output}\nInstall log: ${input.logPath}`;
}
async function runLogged(input: {
exec: ManagedPythonRuntimeExec;
logPath: string;
@ -288,7 +335,7 @@ async function runLogged(input: {
if (output.stderr) {
await appendFile(input.logPath, output.stderr.endsWith('\n') ? output.stderr : `${output.stderr}\n`);
}
throw new Error(`Python runtime install failed. Install log: ${input.logPath}`);
throw new Error(installFailureMessage({ logPath: input.logPath, stdout: output.stdout, stderr: output.stderr }));
}
}
@ -334,7 +381,14 @@ export async function installManagedPythonRuntime(
exec,
logPath: layout.installLogPath,
command: 'uv',
args: ['venv', layout.venvDir],
args: ['python', 'install', asset.requiresPython.minimumVersion],
env: uvEnv,
});
await runLogged({
exec,
logPath: layout.installLogPath,
command: 'uv',
args: ['venv', '--python', asset.requiresPython.minimumVersion, layout.venvDir],
env: uvEnv,
});
const wheelSpec = features.includes('local-embeddings') ? `${asset.wheelPath}[local-embeddings]` : asset.wheelPath;
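Review note: the `Requires-Python` handling added in this file can be exercised without a wheel archive. The sketch below applies the same two regular expressions to a plain METADATA string. The function name `parseRequiresPython` and the sample metadata are assumptions for illustration; the real code first unzips the wheel with `fflate` and reports the wheel path in its errors.

```typescript
interface RequiresPython {
  specifier: string;
  minimumVersion: string;
}

// Hypothetical standalone variant of the parsing step; regexes copied from the diff.
function parseRequiresPython(metadata: string): RequiresPython {
  // Take the first non-empty Requires-Python header value.
  const specifier = metadata
    .split(/\r?\n/)
    .map((line) => line.match(/^Requires-Python:\s*(.+)\s*$/i)?.[1]?.trim())
    .find((value): value is string => typeof value === 'string' && value.length > 0);
  if (!specifier) {
    throw new Error('METADATA is missing Requires-Python');
  }
  // Only a ">=" lower bound is accepted; the patch component is ignored.
  const minimum = specifier.match(/(?:^|[,\s])>=\s*([0-9]+)\.([0-9]+)(?:\.[0-9]+)?\b/);
  if (!minimum) {
    throw new Error(`Unsupported Requires-Python: ${specifier}`);
  }
  return { specifier, minimumVersion: `${minimum[1]}.${minimum[2]}` };
}

// Sample metadata is made up for this example.
const sampleMetadata = ['Metadata-Version: 2.4', 'Name: kaelio-ktx', 'Requires-Python: >=3.13,<4', ''].join('\n');
const parsed = parseRequiresPython(sampleMetadata);
console.log(parsed.specifier, parsed.minimumVersion);
```

Note that specifiers without a `>=` clause, such as `<4` or `~=3.13`, are rejected by this pattern, which matches the "Unsupported bundled Python runtime Requires-Python" test case.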


@ -0,0 +1,21 @@
import { describe, expect, it } from 'vitest';
import { sanitizeChildProxyEnv } from './proxy-env.js';
describe('sanitizeChildProxyEnv', () => {
it('drops IPv6 CIDR no-proxy entries and normalizes both env keys', () => {
const env = sanitizeChildProxyEnv({
NO_PROXY: 'localhost,127.0.0.1,127.0.0.0/8,fd07:b51a:cc66:f0::/64,*.orb.local',
no_proxy: '::1,0.250.250.0/24,fd00::/8,*.orb.internal',
});
expect(env.NO_PROXY).toBe('localhost,127.0.0.1,127.0.0.0/8,*.orb.local,::1,0.250.250.0/24,*.orb.internal');
expect(env.no_proxy).toBe(env.NO_PROXY);
});
it('preserves the input object and leaves missing proxy env unset', () => {
const input = { PATH: '/usr/bin' };
expect(sanitizeChildProxyEnv(input)).toEqual({ PATH: '/usr/bin' });
expect(input).toEqual({ PATH: '/usr/bin' });
});
});


@ -0,0 +1,27 @@
const NO_PROXY_KEYS = ['NO_PROXY', 'no_proxy'] as const;
function isIpv6CidrNoProxyEntry(entry: string): boolean {
return entry.includes('/') && entry.includes(':');
}
function cleanedNoProxyValue(env: NodeJS.ProcessEnv): string | undefined {
const entries = NO_PROXY_KEYS.flatMap((key) => (env[key] ?? '').split(','))
.map((entry) => entry.trim())
.filter((entry) => entry.length > 0 && !isIpv6CidrNoProxyEntry(entry));
if (!NO_PROXY_KEYS.some((key) => env[key] !== undefined)) {
return undefined;
}
return [...new Set(entries)].join(',');
}
export function sanitizeChildProxyEnv(env: NodeJS.ProcessEnv): NodeJS.ProcessEnv {
const sanitized = { ...env };
const noProxy = cleanedNoProxyValue(env);
if (noProxy === undefined) {
return sanitized;
}
sanitized.NO_PROXY = noProxy;
sanitized.no_proxy = noProxy;
return sanitized;
}
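Review note: the filtering rule in `proxy-env.ts` is a string heuristic, so it helps to see which entries survive. The sketch below reproduces the merge-and-filter step on its own. The names `isIpv6Cidr` and `mergeNoProxy` and the sample values are assumptions for illustration, and a plain record type stands in for `NodeJS.ProcessEnv`.

```typescript
type Env = Record<string, string | undefined>;

// Same heuristic as the diff: an entry containing both '/' and ':' is treated as IPv6 CIDR.
function isIpv6Cidr(entry: string): boolean {
  return entry.includes('/') && entry.includes(':');
}

// Hypothetical standalone variant: merges NO_PROXY and no_proxy, drops IPv6 CIDR
// entries, and de-duplicates while keeping first-seen order.
function mergeNoProxy(env: Env): string | undefined {
  if (env.NO_PROXY === undefined && env.no_proxy === undefined) {
    return undefined; // leave both keys unset, as the diff does
  }
  const entries = [env.NO_PROXY ?? '', env.no_proxy ?? '']
    .flatMap((value) => value.split(','))
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0 && !isIpv6Cidr(entry));
  return [...new Set(entries)].join(',');
}

const merged = mergeNoProxy({
  NO_PROXY: 'localhost,fd07:b51a:cc66:f0::/64,127.0.0.0/8',
  no_proxy: '::1,fd00::/8,localhost',
});
console.log(merged); // localhost,127.0.0.0/8,::1
```

IPv4 CIDR ranges and bare IPv6 addresses are kept because each contains only one of the two characters. Any other entry that happens to contain both, such as a URL-style value, would also be dropped by this check.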


@ -52,6 +52,7 @@ describe('runKtxRuntime', () => {
},
asset: {
wheelPath: '/assets/python/kaelio_ktx-0.1.0-py3-none-any.whl',
requiresPython: { specifier: '>=3.13', minimumVersion: '3.13' },
manifest: {
schemaVersion: 1,
distributionName: 'kaelio-ktx',


@ -46,9 +46,14 @@ function makePromptAdapter(options: {
};
}
function managedDaemon(baseUrl = 'http://127.0.0.1:61234') {
function managedDaemon(
baseUrl = 'http://127.0.0.1:61234',
logs: { stdoutLog?: string; stderrLog?: string } = {},
) {
return {
baseUrl,
stdoutLog: logs.stdoutLog ?? '/tmp/ktx-daemon.stdout.log',
stderrLog: logs.stderrLog ?? '/tmp/ktx-daemon.stderr.log',
env: {
KTX_MANAGED_SENTENCE_TRANSFORMERS_BASE_URL: baseUrl,
},
@ -330,6 +335,65 @@ describe('setup embeddings step', () => {
expect(io.stderr()).not.toContain('skip for now');
});
it('prints the recent daemon stderr tail when local embedding health check fails', async () => {
const io = makeIo();
const stderrLog = join(tempDir, '.ktx', 'runtime', 'daemon.stderr.log');
await mkdir(join(tempDir, '.ktx', 'runtime'), { recursive: true });
await writeFile(
stderrLog,
Array.from({ length: 45 }, (_value, index) => `daemon traceback line ${index + 1}`).join('\n'),
);
const result = await runKtxSetupEmbeddingsStep(
{
projectDir: tempDir,
inputMode: 'disabled',
cliVersion: '0.2.0',
runtimeInstallPolicy: 'auto',
skipEmbeddings: false,
},
io.io,
{
env: {},
ensureLocalEmbeddings: vi.fn(async () => managedDaemon('http://127.0.0.1:61234', { stderrLog })),
healthCheck: vi.fn(async () => ({ ok: false as const, message: 'HTTP 500' })),
},
);
expect(result.status).toBe('failed');
expect(io.stderr()).toContain('Recent local embeddings daemon stderr:');
expect(io.stderr()).toContain('daemon traceback line 6');
expect(io.stderr()).toContain('daemon traceback line 45');
expect(io.stderr()).not.toContain('daemon traceback line 5');
});
it('does not print daemon stderr diagnostics when the log is unavailable or empty', async () => {
const io = makeIo();
const result = await runKtxSetupEmbeddingsStep(
{
projectDir: tempDir,
inputMode: 'disabled',
cliVersion: '0.2.0',
runtimeInstallPolicy: 'auto',
skipEmbeddings: false,
},
io.io,
{
env: {},
ensureLocalEmbeddings: vi.fn(async () =>
managedDaemon('http://127.0.0.1:61234', {
stderrLog: join(tempDir, '.ktx', 'runtime', 'missing.stderr.log'),
}),
),
healthCheck: vi.fn(async () => ({ ok: false as const, message: 'HTTP 500' })),
},
);
expect(result.status).toBe('failed');
expect(io.stderr()).not.toContain('Recent local embeddings daemon stderr:');
});
it('uses fixed OpenAI defaults and only asks for credentials when OpenAI is selected', async () => {
const io = makeIo();
const healthCheck = vi.fn(async () => ({ ok: true as const }));


@ -1,4 +1,4 @@
import { writeFile } from 'node:fs/promises';
import { readFile, writeFile } from 'node:fs/promises';
import { resolveKtxConfigReference } from '@ktx/context/core';
import {
type KtxProjectConfig,
@ -59,6 +59,7 @@ export interface KtxSetupEmbeddingsDeps {
healthCheck?: (config: KtxEmbeddingConfig) => Promise<KtxEmbeddingHealthCheckResult>;
ensureLocalEmbeddings?: (options: {
cliVersion: string;
projectDir: string;
installPolicy: KtxManagedPythonInstallPolicy;
io: KtxCliIo;
}) => Promise<ManagedLocalEmbeddingsDaemon>;
@ -85,6 +86,7 @@ const EMBEDDING_OPTION_PROMPT_CONTEXT =
'KTX uses embeddings for semantic search over semantic-layer sources, wiki context, schema metadata, ' +
'and relationship evidence.';
const LOCAL_EMBEDDING_HEALTH_TIMEOUT_MS = 120_000;
const LOCAL_EMBEDDING_STDERR_TAIL_LINES = 40;
function createPromptAdapter(): KtxSetupEmbeddingsPromptAdapter {
return createKtxSetupPromptAdapter({ selectCancelValue: 'back' });
@ -286,14 +288,33 @@ async function chooseEmbeddingBackend(
return 'back';
}
function localEmbeddingSetupMessage(message: string): string {
return [
async function readLocalEmbeddingDaemonStderrTail(stderrLog: string | undefined): Promise<string[]> {
if (!stderrLog) {
return [];
}
try {
const lines = (await readFile(stderrLog, 'utf8'))
.split(/\r?\n/)
.map((line) => line.trimEnd())
.filter((line) => line.trim().length > 0);
return lines.slice(-LOCAL_EMBEDDING_STDERR_TAIL_LINES);
} catch {
return [];
}
}
function localEmbeddingSetupMessage(message: string, stderrTail: string[] = []): string {
const lines = [
`Local embedding health check failed: ${message}`,
'Local embeddings use the KTX-managed Python runtime.',
'Prepare the runtime with: ktx dev runtime start --feature local-embeddings',
'Use --yes with setup to install and start the runtime without prompting.',
'The first run may download Python packages and the all-MiniLM-L6-v2 model.',
].join('\n');
];
if (stderrTail.length > 0) {
lines.push('Recent local embeddings daemon stderr:', ...stderrTail);
}
return lines.join('\n');
}
async function promptAfterLocalEmbeddingFailure(
@ -447,9 +468,13 @@ export async function runKtxSetupEmbeddingsStep(
}
progress.fail('Embedding test failed');
const stderrTail =
selectedBackend === 'sentence-transformers'
? await readLocalEmbeddingDaemonStderrTail(managedLocalEmbeddings?.stderrLog)
: [];
io.stderr.write(
selectedBackend === 'sentence-transformers'
? `${localEmbeddingSetupMessage(health.message)}\n`
? `${localEmbeddingSetupMessage(health.message, stderrTail)}\n`
: `Embedding health check failed: ${health.message}\n`,
);
if (args.inputMode === 'disabled') {
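
Review note: the stderr-tail handling added in this file can be read in isolation as a pure text transform followed by an optional message suffix. The sketch below assumes the log contents are already in memory; `tailNonEmptyLines` and `failureMessage` are hypothetical names for illustration, and only the split, trim, filter, slice steps and the two message strings are taken from the diff.

```typescript
// Same value as LOCAL_EMBEDDING_STDERR_TAIL_LINES in the diff.
const TAIL_LINES = 40;

// Hypothetical helper: keep the last `maxLines` non-blank lines of a log,
// accepting both LF and CRLF line endings and dropping trailing whitespace.
function tailNonEmptyLines(text: string, maxLines: number = TAIL_LINES): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trimEnd())
    .filter((line) => line.trim().length > 0)
    .slice(-maxLines);
}

// Hypothetical helper: append the tail only when there is something to show,
// so the message is unchanged when no stderr lines are available.
function failureMessage(message: string, stderrTail: string[] = []): string {
  const lines = [`Local embedding health check failed: ${message}`];
  if (stderrTail.length > 0) {
    lines.push('Recent local embeddings daemon stderr:', ...stderrTail);
  }
  return lines.join('\n');
}

const sampleLog = 'loading model\r\n\r\nTraceback (most recent call last):  \nOSError: port in use\n';
console.log(failureMessage('timeout', tailNonEmptyLines(sampleLog, 2)));
```

Keeping the transform pure means it can be tested without touching the filesystem; in the diff, the file read sits in a `try`/`catch` that falls back to an empty tail.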

@ -29,8 +29,18 @@ export interface KtxSetupProjectArgs {
allowBack?: boolean;
}
export type KtxSetupCreatedProjectCleanup =
| { kind: 'remove-project-dir'; projectDir: string }
| { kind: 'remove-ktx-scaffold'; projectDir: string };
export type KtxSetupProjectResult =
| { status: 'ready'; projectDir: string; project: KtxLocalProject; confirmedCreation?: boolean }
| {
status: 'ready';
projectDir: string;
project: KtxLocalProject;
confirmedCreation?: boolean;
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
}
| { status: 'back'; projectDir: string }
| { status: 'cancelled'; projectDir: string }
| { status: 'missing-input'; projectDir: string };
@ -49,7 +59,12 @@ export interface KtxSetupProjectDeps {
}
type PromptProjectDirResult =
| { status: 'selected'; projectDir: string; confirmedCreation: boolean }
| {
status: 'selected';
projectDir: string;
confirmedCreation: boolean;
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
}
| { status: 'cancelled'; projectDir: string }
| { status: 'missing-input'; projectDir: string }
| { status: 'back'; projectDir: string };
@ -92,12 +107,29 @@ async function existingFolderState(
}
type ConfirmProjectDirResult =
| { status: 'confirmed'; confirmedCreation: boolean }
| {
status: 'confirmed';
confirmedCreation: boolean;
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
}
| { status: 'choose-another' }
| { status: 'back' }
| { status: 'cancelled' }
| { status: 'not-directory' };
function cleanupForFolderState(
projectDir: string,
state: Awaited<ReturnType<typeof existingFolderState>>,
): KtxSetupCreatedProjectCleanup | undefined {
if (state === 'missing') {
return { kind: 'remove-project-dir', projectDir };
}
if (state === 'empty-directory') {
return { kind: 'remove-ktx-scaffold', projectDir };
}
return undefined;
}
async function confirmProjectDir(
selectedDir: string,
io: KtxCliIo,
@ -137,7 +169,7 @@ async function confirmProjectDir(
if (action === 'choose-another') return { status: 'choose-another' };
if (action === 'back') return { status: 'back' };
if (action !== 'create') return { status: 'cancelled' };
return { status: 'confirmed', confirmedCreation: true };
return { status: 'confirmed', confirmedCreation: true, createdProjectCleanup: cleanupForFolderState(selectedDir, state) };
}
async function normalizeSetupGitignore(projectDir: string): Promise<void> {
@ -220,10 +252,28 @@ async function promptForNewProjectDir(
if (confirmed.status === 'choose-another') continue;
if (confirmed.status === 'back') return { status: 'back', projectDir };
if (confirmed.status === 'cancelled') return { status: 'cancelled', projectDir };
return { status: 'selected', projectDir: selectedDir, confirmedCreation: confirmed.confirmedCreation };
return {
status: 'selected',
projectDir: selectedDir,
confirmedCreation: confirmed.confirmedCreation,
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
};
}
}
async function createProjectWithCleanup(
projectDir: string,
deps: KtxSetupProjectDeps,
): Promise<{ project: KtxLocalProject; createdProjectCleanup?: KtxSetupCreatedProjectCleanup }> {
const state = await existingFolderState(projectDir);
const project = await createProject(projectDir, deps);
const createdProjectCleanup = cleanupForFolderState(projectDir, state);
return {
project,
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
};
}
export async function runKtxSetupProjectStep(
args: KtxSetupProjectArgs,
io: KtxCliIo,
@ -261,6 +311,7 @@ export async function runKtxSetupProjectStep(
projectDir: selected.projectDir,
project,
confirmedCreation: selected.confirmedCreation,
...(selected.createdProjectCleanup ? { createdProjectCleanup: selected.createdProjectCleanup } : {}),
};
}
@ -275,9 +326,14 @@ export async function runKtxSetupProjectStep(
io.stderr.write('Missing setup choice: pass --yes to create a project in non-interactive setup.\n');
return { status: 'missing-input', projectDir };
}
const project = await createProject(projectDir, deps);
const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps);
printProjectSummary(io, projectDir);
return { status: 'ready', projectDir, project };
return {
status: 'ready',
projectDir,
project,
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
};
}
if (!io.stdout.isTTY && !deps.prompts) {
@ -316,9 +372,14 @@ export async function runKtxSetupProjectStep(
}
if (choice === 'current') {
const project = await createProject(projectDir, deps);
const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps);
printProjectSummary(io, projectDir);
return { status: 'ready', projectDir, project };
return {
status: 'ready',
projectDir,
project,
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
};
}
if (choice === 'new-default') {
@ -333,6 +394,7 @@ export async function runKtxSetupProjectStep(
projectDir: defaultProjectDir,
project,
confirmedCreation: confirmed.confirmedCreation,
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
};
}
@ -356,7 +418,13 @@ export async function runKtxSetupProjectStep(
if (confirmed.status === 'cancelled') return { status: 'cancelled', projectDir };
const project = await createProject(customDir, deps);
printProjectSummary(io, customDir);
return { status: 'ready', projectDir: customDir, project, confirmedCreation: confirmed.confirmedCreation };
return {
status: 'ready',
projectDir: customDir,
project,
confirmedCreation: confirmed.confirmedCreation,
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
};
}
prompts.cancel('Setup cancelled.');

@ -101,6 +101,8 @@ describe('runKtxSetupRuntimeStep', () => {
const io = makeIo();
const ensureLocalEmbeddings = vi.fn(async () => ({
baseUrl: 'http://127.0.0.1:61234',
stdoutLog: join(tempDir, '.ktx', 'runtime', 'daemon.stdout.log'),
stderrLog: join(tempDir, '.ktx', 'runtime', 'daemon.stderr.log'),
env: { KTX_MANAGED_SENTENCE_TRANSFORMERS_BASE_URL: 'http://127.0.0.1:61234' },
}));
const config: KtxProjectConfig = {

@ -1,5 +1,5 @@
import { execFile } from 'node:child_process';
import { mkdir, mkdtemp, readFile, rm, stat, writeFile } from 'node:fs/promises';
import { mkdir, mkdtemp, readFile, readdir, rm, stat, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { promisify } from 'node:util';
@ -563,6 +563,112 @@ describe('setup status', () => {
expect(testIo.stderr()).toBe('');
});
it('removes a newly created missing project directory when a later runtime step fails', async () => {
const projectDir = join(tempDir, 'missing-project');
const testIo = makeIo();
await expect(
runKtxSetup(
{
command: 'run',
projectDir,
mode: 'auto',
agents: false,
skipAgents: true,
inputMode: 'disabled',
yes: true,
cliVersion: '0.2.0',
skipLlm: true,
skipEmbeddings: true,
databaseSchemas: [],
skipDatabases: true,
skipSources: true,
},
testIo.io,
{
model: async () => ({ status: 'skipped', projectDir }),
embeddings: async () => ({ status: 'skipped', projectDir }),
databases: async () => ({ status: 'skipped', projectDir }),
sources: async () => ({ status: 'skipped', projectDir }),
runtime: async () => ({ status: 'failed', projectDir, requirements: { features: ['core'], requirements: [] } }),
},
),
).resolves.toBe(1);
await expect(stat(projectDir)).rejects.toThrow();
});
it('removes KTX scaffold files from an initially empty project directory when runtime setup fails', async () => {
const testIo = makeIo();
await expect(
runKtxSetup(
{
command: 'run',
projectDir: tempDir,
mode: 'auto',
agents: false,
skipAgents: true,
inputMode: 'disabled',
yes: true,
cliVersion: '0.2.0',
skipLlm: true,
skipEmbeddings: true,
databaseSchemas: [],
skipDatabases: true,
skipSources: true,
},
testIo.io,
{
model: async () => ({ status: 'skipped', projectDir: tempDir }),
embeddings: async () => ({ status: 'skipped', projectDir: tempDir }),
databases: async () => ({ status: 'skipped', projectDir: tempDir }),
sources: async () => ({ status: 'skipped', projectDir: tempDir }),
runtime: async () => ({ status: 'failed', projectDir: tempDir, requirements: { features: ['core'], requirements: [] } }),
},
),
).resolves.toBe(1);
await expect(stat(tempDir)).resolves.toBeDefined();
expect(await readdir(tempDir)).toEqual([]);
});
it('preserves a pre-existing non-empty project directory when runtime setup fails', async () => {
await writeFile(join(tempDir, 'notes.txt'), 'keep me\n', 'utf-8');
const testIo = makeIo();
await expect(
runKtxSetup(
{
command: 'run',
projectDir: tempDir,
mode: 'auto',
agents: false,
skipAgents: true,
inputMode: 'disabled',
yes: true,
cliVersion: '0.2.0',
skipLlm: true,
skipEmbeddings: true,
databaseSchemas: [],
skipDatabases: true,
skipSources: true,
},
testIo.io,
{
model: async () => ({ status: 'skipped', projectDir: tempDir }),
embeddings: async () => ({ status: 'skipped', projectDir: tempDir }),
databases: async () => ({ status: 'skipped', projectDir: tempDir }),
sources: async () => ({ status: 'skipped', projectDir: tempDir }),
runtime: async () => ({ status: 'failed', projectDir: tempDir, requirements: { features: ['core'], requirements: [] } }),
},
),
).resolves.toBe(1);
await expect(readFile(join(tempDir, 'notes.txt'), 'utf-8')).resolves.toBe('keep me\n');
await expect(stat(join(tempDir, 'ktx.yaml'))).resolves.toBeDefined();
});
it('shows demo near the bottom of the first setup intent menu before project creation', async () => {
const testIo = makeIo();
const select = vi.fn(async (options: { options: Array<{ value: string; label: string }> }) => {

@ -1,4 +1,5 @@
import { existsSync } from 'node:fs';
import { rm } from 'node:fs/promises';
import { basename, join, resolve } from 'node:path';
import { getLatestLocalIngestStatus, savedMemoryCountsForReport } from '@ktx/context/ingest';
import {
@ -33,7 +34,11 @@ import {
isKtxSetupLlmConfigReady,
runKtxSetupAnthropicModelStep,
} from './setup-models.js';
import { type KtxSetupProjectDeps, runKtxSetupProjectStep } from './setup-project.js';
import {
type KtxSetupCreatedProjectCleanup,
type KtxSetupProjectDeps,
runKtxSetupProjectStep,
} from './setup-project.js';
import {
isKtxPreAgentSetupReady,
isKtxSetupReady,
@ -502,6 +507,23 @@ async function commitSetupConfigChanges(projectDir: string): Promise<void> {
await project.git.commitFile('ktx.yaml', 'setup: update KTX project config', 'ktx setup', 'setup@ktx.local');
}
const KTX_SETUP_SCAFFOLD_PATHS = ['ktx.yaml', '.ktx', 'wiki', 'semantic-layer', 'raw-sources', '.git'];
async function cleanupCreatedProjectScaffold(cleanup: KtxSetupCreatedProjectCleanup | undefined): Promise<void> {
if (!cleanup) {
return;
}
if (cleanup.kind === 'remove-project-dir') {
await rm(cleanup.projectDir, { recursive: true, force: true });
return;
}
await Promise.all(
KTX_SETUP_SCAFFOLD_PATHS.map((relativePath) =>
rm(join(cleanup.projectDir, relativePath), { recursive: true, force: true }),
),
);
}
export async function runKtxSetup(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetupDeps = {}): Promise<number> {
try {
return await runKtxSetupInner(args, io, deps);
@ -772,7 +794,11 @@ async function runKtxSetupInner(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetup
}
}
if (stepResult.status === 'failed' || stepResult.status === 'missing-input') {
if (stepResult.status === 'failed') {
await cleanupCreatedProjectScaffold(projectResult.createdProjectCleanup);
return 1;
}
if (stepResult.status === 'missing-input') {
return 1;
}
if (stepResult.status === 'back') {
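
Review note: the rollback introduced across `setup-project.ts` and this file comes down to two decisions: which cleanup kind to record before the project is created, and which paths that kind removes after a later step fails. The sketch below models both as pure functions so the mapping can be checked without deleting anything. `FolderState`, `cleanupFor`, and `pathsToRemove` are hypothetical names, the `'other'` state stands in for whatever non-empty states `existingFolderState` actually reports, and paths are joined with a plain `/` rather than `node:path`.

```typescript
// 'missing' and 'empty-directory' appear in the diff; 'other' is a stand-in.
type FolderState = 'missing' | 'empty-directory' | 'other';

type Cleanup =
  | { kind: 'remove-project-dir'; projectDir: string }
  | { kind: 'remove-ktx-scaffold'; projectDir: string };

// Same entries as KTX_SETUP_SCAFFOLD_PATHS in the diff.
const SCAFFOLD_PATHS = ['ktx.yaml', '.ktx', 'wiki', 'semantic-layer', 'raw-sources', '.git'];

// Decide the rollback from the folder state observed BEFORE creation.
function cleanupFor(projectDir: string, state: FolderState): Cleanup | undefined {
  if (state === 'missing') return { kind: 'remove-project-dir', projectDir };
  if (state === 'empty-directory') return { kind: 'remove-ktx-scaffold', projectDir };
  return undefined; // pre-existing content: nothing is rolled back
}

// List what a rollback would delete, without deleting anything.
function pathsToRemove(cleanup: Cleanup | undefined): string[] {
  if (!cleanup) return [];
  if (cleanup.kind === 'remove-project-dir') return [cleanup.projectDir];
  const { projectDir } = cleanup;
  return SCAFFOLD_PATHS.map((relativePath) => `${projectDir}/${relativePath}`);
}

console.log(pathsToRemove(cleanupFor('/tmp/demo', 'empty-directory')));
```

Recording the state before creation matters because afterwards a missing directory and an empty one look identical. The three tests added in this commit cover these same branches: directory removed, directory kept but emptied, and a non-empty directory left untouched.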

@ -111,12 +111,12 @@ describe('createKtxEmbeddingProvider', () => {
);
});
it('falls back to one-shot ktx-daemon inference when the local HTTP daemon is unavailable', async () => {
const fetch = vi.fn().mockRejectedValue(new TypeError('fetch failed'));
const runSentenceTransformersJson = vi
it('reports local HTTP daemon failures without a ktx-daemon spawn fallback cascade', async () => {
const fetch = vi
.fn()
.mockResolvedValueOnce({ embedding: [0.1, 0.2] })
.mockResolvedValueOnce({ embeddings: [[0.3, 0.4], [0.5, 0.6]] });
.mockResolvedValue(
new Response('Embedding compute failed: httpx.InvalidURL: Invalid port', { status: 500 }),
);
const provider = createKtxEmbeddingProvider(
{
@ -125,19 +125,13 @@ describe('createKtxEmbeddingProvider', () => {
dimensions: 2,
sentenceTransformers: { baseURL: 'http://127.0.0.1:8765', pathPrefix: '' },
},
{ fetch, runSentenceTransformersJson },
{ fetch },
);
await expect(provider.embedMany(['hello', 'world'])).resolves.toEqual([
[0.3, 0.4],
[0.5, 0.6],
]);
await expect(provider.embed('hello')).rejects.toThrow(
'Embedding provider sentence-transformers request failed with HTTP 500: Embedding compute failed: httpx.InvalidURL: Invalid port',
);
await expect(provider.embed('hello')).rejects.not.toThrow('ktx-daemon fallback failed');
expect(fetch).toHaveBeenCalledTimes(1);
expect(runSentenceTransformersJson).toHaveBeenNthCalledWith(1, 'embedding-compute', {
text: '__ktx_embedding_probe__',
});
expect(runSentenceTransformersJson).toHaveBeenNthCalledWith(2, 'embedding-compute-bulk', {
texts: ['hello', 'world'],
});
});
});

@ -1,15 +1,7 @@
import { spawn } from 'node:child_process';
import { join } from 'node:path';
import OpenAI from 'openai';
import type { KtxEmbeddingConfig, KtxEmbeddingProvider } from './types.js';
type FetchFn = typeof fetch;
type SentenceTransformersCommand = 'embedding-compute' | 'embedding-compute-bulk';
type SentenceTransformersJsonRunner = (
subcommand: SentenceTransformersCommand,
payload: Record<string, unknown>,
) => Promise<Record<string, unknown>>;
type SentenceTransformersProcessCommand = { command: string; args: string[] };
export interface KtxEmbeddingProviderDeps {
createOpenAIClient?: (options: { apiKey?: string; baseURL?: string }) => {
@ -23,14 +15,10 @@ export interface KtxEmbeddingProviderDeps {
};
};
fetch?: FetchFn;
runSentenceTransformersJson?: SentenceTransformersJsonRunner;
sentenceTransformersCommand?: string;
sentenceTransformersArgs?: string[];
sentenceTransformersCwd?: string;
sentenceTransformersEnv?: NodeJS.ProcessEnv;
}
const DEFAULT_BATCH_SIZE = 100;
const HTTP_ERROR_BODY_MAX_LENGTH = 2_000;
function assertNonEmptyText(text: string): void {
if (!text.trim()) {
@ -69,110 +57,12 @@ function joinUrl(baseURL: string, pathPrefix: string, path: string): string {
return prefix ? `${base}/${prefix}/${suffix}` : `${base}/${suffix}`;
}
function errorText(error: unknown): string {
if (error instanceof Error) {
return error.cause
? `${error.name}: ${error.message}; cause: ${errorText(error.cause)}`
: `${error.name}: ${error.message}`;
function boundedHttpBody(text: string): string {
const normalized = text.trim();
if (normalized.length <= HTTP_ERROR_BODY_MAX_LENGTH) {
return normalized;
}
return String(error);
}
function parseJsonObject(raw: string, subcommand: string): Record<string, unknown> {
const parsed = JSON.parse(raw) as unknown;
if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
throw new Error(`ktx-daemon ${subcommand} returned non-object JSON`);
}
return parsed as Record<string, unknown>;
}
function isCommandNotFound(error: unknown): boolean {
return (
error instanceof Error &&
('code' in error || 'errno' in error) &&
((error as { code?: unknown }).code === 'ENOENT' || (error as { errno?: unknown }).errno === 'ENOENT')
);
}
function defaultSentenceTransformersProcessCommands(): SentenceTransformersProcessCommand[] {
const venvBin =
process.platform === 'win32' ? join('.venv', 'Scripts', 'ktx-daemon.exe') : join('.venv', 'bin', 'ktx-daemon');
const repoVenvBin =
process.platform === 'win32'
? join('ktx', '.venv', 'Scripts', 'ktx-daemon.exe')
: join('ktx', '.venv', 'bin', 'ktx-daemon');
return [
{ command: 'ktx-daemon', args: [] },
{ command: venvBin, args: [] },
{ command: repoVenvBin, args: [] },
];
}
function runSentenceTransformersProcessCommand(
options: SentenceTransformersProcessCommand & {
cwd?: string;
env?: NodeJS.ProcessEnv;
},
): SentenceTransformersJsonRunner {
return async (
subcommand: SentenceTransformersCommand,
payload: Record<string, unknown>,
): Promise<Record<string, unknown>> =>
new Promise((resolve, reject) => {
const child = spawn(options.command, [...options.args, subcommand], {
cwd: options.cwd,
env: { ...process.env, ...options.env },
stdio: ['pipe', 'pipe', 'pipe'],
});
const stdout: Buffer[] = [];
const stderr: Buffer[] = [];
child.stdout.on('data', (chunk: Buffer) => stdout.push(chunk));
child.stderr.on('data', (chunk: Buffer) => stderr.push(chunk));
child.on('error', reject);
child.on('close', (code) => {
const stdoutText = Buffer.concat(stdout).toString('utf8').trim();
const stderrText = Buffer.concat(stderr).toString('utf8').trim();
if (code !== 0) {
reject(new Error(`ktx-daemon ${subcommand} failed: ${stderrText || `exit code ${code}`}`));
return;
}
try {
resolve(parseJsonObject(stdoutText, subcommand));
} catch (error) {
reject(error);
}
});
child.stdin.end(`${JSON.stringify(payload)}\n`);
});
}
function runSentenceTransformersProcessJson(options: {
commands: SentenceTransformersProcessCommand[];
cwd?: string;
env?: NodeJS.ProcessEnv;
}): SentenceTransformersJsonRunner {
return async (
subcommand: SentenceTransformersCommand,
payload: Record<string, unknown>,
): Promise<Record<string, unknown>> => {
const errors: string[] = [];
for (const command of options.commands) {
try {
return await runSentenceTransformersProcessCommand({
...command,
cwd: options.cwd,
env: options.env,
})(subcommand, payload);
} catch (error) {
errors.push(`${command.command}: ${errorText(error)}`);
if (!isCommandNotFound(error)) {
break;
}
}
}
throw new Error(`ktx-daemon ${subcommand} failed: ${errors.join('; ')}`);
};
return `${normalized.slice(0, HTTP_ERROR_BODY_MAX_LENGTH)}...`;
}
class OpenAIEmbeddingProvider implements KtxEmbeddingProvider {
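
Review note: with the process fallback removed, the only diagnostic left for a failing local daemon is the HTTP response body, which a later hunk appends to the thrown error after passing it through the bounding helper added above. A minimal sketch of that combination follows, assuming the body text has already been read; `boundedBody` and `httpErrorMessage` are hypothetical names, while the 2,000-character limit, the `...` suffix, and the message format follow the diff.

```typescript
// Same value as HTTP_ERROR_BODY_MAX_LENGTH in the diff.
const MAX_BODY_LENGTH = 2_000;

// Trim the body and cap it so a large error page cannot flood the terminal.
function boundedBody(text: string, max: number = MAX_BODY_LENGTH): string {
  const normalized = text.trim();
  return normalized.length <= max ? normalized : `${normalized.slice(0, max)}...`;
}

// Append the body only when it is non-empty after trimming.
function httpErrorMessage(status: number, body: string): string {
  const bodyText = boundedBody(body);
  return `Embedding provider sentence-transformers request failed with HTTP ${status}${
    bodyText ? `: ${bodyText}` : ''
  }`;
}

console.log(httpErrorMessage(500, 'Embedding compute failed: httpx.InvalidURL: Invalid port'));
console.log(httpErrorMessage(503, '   '));
```

An empty or whitespace-only body leaves the message identical to the pre-change format, so callers matching on the `HTTP <status>` prefix keep working.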
@ -228,9 +118,7 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
private readonly fetch: FetchFn;
private readonly baseURL: string;
private readonly pathPrefix: string;
private readonly runJson: SentenceTransformersJsonRunner;
private readonly startupProbe: Promise<void>;
private useProcessRunner = false;
constructor(config: KtxEmbeddingConfig, deps: KtxEmbeddingProviderDeps) {
if (!config.sentenceTransformers?.baseURL) {
@ -241,15 +129,6 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
this.fetch = deps.fetch ?? fetch;
this.baseURL = config.sentenceTransformers.baseURL;
this.pathPrefix = config.sentenceTransformers.pathPrefix ?? '/api';
this.runJson =
deps.runSentenceTransformersJson ??
runSentenceTransformersProcessJson({
commands: deps.sentenceTransformersCommand
? [{ command: deps.sentenceTransformersCommand, args: deps.sentenceTransformersArgs ?? [] }]
: defaultSentenceTransformersProcessCommands(),
cwd: deps.sentenceTransformersCwd,
env: deps.sentenceTransformersEnv,
});
this.startupProbe = this.requestSingle('__ktx_embedding_probe__').then((embedding) => {
assertVectorDimensions(embedding, this.dimensions, 'sentence-transformers');
});
@ -264,7 +143,7 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
async embedMany(texts: string[]): Promise<number[][]> {
assertBatchSize(texts, this.maxBatchSize);
await this.startupProbe;
const response = await this.requestJson('embedding-compute-bulk', '/embeddings/compute-bulk', { texts });
const response = await this.requestJson('/embeddings/compute-bulk', { texts });
if (
!response ||
typeof response !== 'object' ||
@ -285,37 +164,15 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
}
private async requestSingle(text: string): Promise<number[]> {
const response = await this.requestJson('embedding-compute', '/embeddings/compute', { text });
const response = await this.requestJson('/embeddings/compute', { text });
if (!response || typeof response !== 'object' || !('embedding' in response) || !Array.isArray(response.embedding)) {
throw new Error('Embedding provider sentence-transformers returned malformed single response');
}
return response.embedding;
}
private async requestJson(
command: SentenceTransformersCommand,
path: string,
body: Record<string, unknown>,
): Promise<Record<string, unknown>> {
if (this.useProcessRunner) {
return this.runJson(command, body);
}
try {
return await this.postJson(path, body);
} catch (httpError) {
try {
const response = await this.runJson(command, body);
this.useProcessRunner = true;
return response;
} catch (processError) {
throw new Error(
`Embedding provider sentence-transformers local HTTP request failed (${errorText(
httpError,
)}) and ktx-daemon fallback failed (${errorText(processError)})`,
);
}
}
private async requestJson(path: string, body: Record<string, unknown>): Promise<Record<string, unknown>> {
return await this.postJson(path, body);
}
private async postJson(path: string, body: Record<string, unknown>): Promise<Record<string, unknown>> {
@ -325,7 +182,12 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
body: JSON.stringify(body),
});
if (!response.ok) {
throw new Error(`Embedding provider sentence-transformers request failed with HTTP ${response.status}`);
const bodyText = boundedHttpBody(await response.text());
throw new Error(
`Embedding provider sentence-transformers request failed with HTTP ${response.status}${
bodyText ? `: ${bodyText}` : ''
}`,
);
}
const parsed = (await response.json()) as unknown;
if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {