mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
Merge remote-tracking branch 'origin/main' into rollback-failed-ktx-setup
# Conflicts:
#	packages/cli/src/setup-project.ts
commit
23acaecb52
21 changed files with 616 additions and 267 deletions
91
README.md
|
|
@ -6,8 +6,6 @@
|
|||
The context layer for analytics agents
|
||||
</h1>
|
||||
|
||||
<p align="center">by Kaelio</p>
|
||||
|
||||
<p align="center">
|
||||
<a href="https://www.npmjs.com/package/@kaelio/ktx"><img src="https://img.shields.io/npm/v/@kaelio/ktx?style=flat-square&color=f97316" alt="npm version" /></a>
|
||||
<a href="https://codecov.io/gh/Kaelio/ktx"><img src="https://codecov.io/gh/Kaelio/ktx/graph/badge.svg?branch=main" alt="Codecov" /></a>
|
||||
|
|
@ -18,19 +16,38 @@
|
|||
|
||||
---
|
||||
|
||||
KTX turns warehouse metadata, semantic definitions, and business knowledge into
|
||||
reviewable project files that agents can use to plan, query, and update
|
||||
analytics work.
|
||||
KTX is a self-improving context layer that teaches agents how to query your
|
||||
warehouse accurately - from approved metric definitions, joinable columns, and
|
||||
business knowledge it builds and maintains for you.
|
||||
|
||||
Use KTX when you want agents to:
|
||||
Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and
|
||||
SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.
|
||||
|
||||
- Generate SQL from approved measures and joins
|
||||
- Repair semantic definitions through reviewable diffs
|
||||
- Explain metric provenance with warehouse evidence
|
||||
- Work alongside dbt, MetricFlow, LookML, Looker, Metabase, and Notion
|
||||
## Why KTX
|
||||
|
||||
Supports PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and
|
||||
SQLite.
|
||||
General-purpose agents struggle on data tasks. They re-explore your warehouse
|
||||
on every question, invent their own metric logic, and return numbers that
|
||||
don't match approved definitions.
|
||||
|
||||
Traditional semantic layers don't fix this. They demand constant manual
|
||||
upkeep and don't absorb the rest of your company's knowledge.
|
||||
|
||||
KTX does both, automatically:
|
||||
|
||||
- **Learns from company knowledge.** Ingests wiki content, organizes it,
|
||||
removes duplicates, and flags contradictions for human review.
|
||||
- **Maps the data stack.** Samples tables, captures metadata and usage
|
||||
patterns, detects joinable columns, and annotates sources so agents write
|
||||
better queries.
|
||||
- **Builds a semantic layer.** Combines raw tables and high-level metrics
|
||||
through a join graph that automatically resolves chasm and fan traps, so
|
||||
agents fetch metrics declaratively instead of rewriting canonical SQL each
|
||||
time.
|
||||
- **Serves agents at execution.** Exposes CLI and MCP tools with combined
|
||||
full-text and semantic search across wiki and semantic-layer entities.
|
||||
|
||||
Agents can run raw SQL when they need it, or compose semantic-layer queries
|
||||
when they want approved metrics with reliable joins.
|
||||
|
||||
<p align="center">
|
||||
<img src="docs-site/public/images/ingestion-flow-transparent.svg" alt="KTX ingestion flow from source systems through validation to wiki and semantic-layer outputs" width="900" />
|
||||
|
|
@ -109,17 +126,17 @@ Commit `ktx.yaml`, `semantic-layer/`, and `wiki/`. Keep `.ktx/` local.
|
|||
|
||||
## Agent Usage
|
||||
|
||||
Setup can install KTX instructions for Claude Code, Codex, Cursor, OpenCode,
|
||||
and universal `.agents` clients:
|
||||
Install KTX integration for Claude Code, Claude Desktop, Codex, Cursor,
|
||||
OpenCode, and generic `.agents` clients:
|
||||
|
||||
```bash
|
||||
ktx setup --agents
|
||||
```
|
||||
|
||||
Use `--target <target>` when you want to install or repair one specific
|
||||
integration.
|
||||
Pass `--target <target>` to install or repair one specific integration.
|
||||
|
||||
Agent-facing workflows typically start with:
|
||||
A typical agent workflow combines wiki and semantic-layer search before
|
||||
querying:
|
||||
|
||||
```bash
|
||||
ktx sl search "revenue" --json
|
||||
|
|
@ -127,40 +144,14 @@ ktx wiki search "refund policy" --json
|
|||
ktx sl query --connection-id warehouse --measure orders.revenue --format sql
|
||||
```
|
||||
|
||||
During agent setup, choose **Ask data questions with KTX MCP** for client
|
||||
agents. Choose **Ask data questions + manage KTX with CLI commands** only when
|
||||
a developer or operator agent also needs pinned `ktx` admin commands.
|
||||
During setup, choose **Ask data questions with KTX MCP** for client agents.
|
||||
Choose **Ask data questions + manage KTX with CLI commands** when an operator
|
||||
agent also needs pinned `ktx` admin commands.
|
||||
|
||||
After setup, KTX prints **Required before using agents**. Complete those steps
|
||||
before opening the configured agent. If it shows `ktx mcp start --project-dir ...`,
|
||||
run that command before using Claude Code, Codex, Cursor, OpenCode, or generic
|
||||
MCP clients. The same output also prints the matching `ktx mcp stop` command
|
||||
for when you want to stop MCP later. Claude Desktop uses its own launcher for
|
||||
MCP and prints separate skill upload steps.
|
||||
|
||||
The analytics skill teaches client agents the MCP workflow: discover data,
|
||||
prefer semantic-layer measures, inspect entity details before raw SQL, and
|
||||
capture durable learnings. Admin CLI skills call `ktx` commands directly
|
||||
through a skill file installed in your agent's config:
|
||||
|
||||
```bash
|
||||
ktx sl query --measure orders.revenue --dimension orders.status --format sql
|
||||
ktx wiki search "revenue definition"
|
||||
ktx sl validate orders
|
||||
```
|
||||
|
||||
Supported client agents: Claude Code, Claude Desktop, Codex, Cursor, OpenCode,
|
||||
and clients that can use the printed MCP endpoint or `.agents` admin skills.
|
||||
Claude Desktop setup registers a local `ktx mcp stdio` server in Claude
|
||||
Desktop's config and generates one uploadable ZIP per Claude Desktop skill
|
||||
under `.ktx/agents/claude/`. Restart Claude Desktop after setup, then upload
|
||||
each ZIP from **Customize** > **Skills** > **+** > **Create skill** >
|
||||
**Upload a skill**.
|
||||
|
||||
The release artifact manifest contains the public npm tarball and the bundled
|
||||
`kaelio-ktx` runtime wheel. The `python/ktx-sl` and `python/ktx-daemon`
|
||||
directories remain source packages for development, not public release
|
||||
artifacts.
|
||||
After setup, KTX prints **Required before using agents** with the exact
|
||||
commands to run. If the output includes `ktx mcp start --project-dir ...`, run
|
||||
it before opening your agent. Claude Desktop uses its own launcher and prints
|
||||
separate skill upload steps under `.ktx/agents/claude/`.
|
||||
|
||||
## Workspace packages
|
||||
|
||||
|
|
|
|||
|
|
@ -150,6 +150,8 @@ describe('ensureManagedLocalEmbeddingsDaemon', () => {
|
|||
}),
|
||||
).resolves.toEqual({
|
||||
baseUrl: 'http://127.0.0.1:61234',
|
||||
stdoutLog: '/work/proj/.ktx/runtime/daemon.stdout.log',
|
||||
stderrLog: '/work/proj/.ktx/runtime/daemon.stderr.log',
|
||||
env: {
|
||||
[MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV]: 'http://127.0.0.1:61234',
|
||||
},
|
||||
|
|
|
|||
|
|
@ -14,6 +14,8 @@ import { startManagedPythonDaemon, type ManagedPythonDaemonStartResult } from '.
|
|||
|
||||
export interface ManagedLocalEmbeddingsDaemon {
|
||||
baseUrl: string;
|
||||
stdoutLog: string;
|
||||
stderrLog: string;
|
||||
env: Record<typeof MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV, string>;
|
||||
}
|
||||
|
||||
|
|
@ -91,6 +93,8 @@ export async function ensureManagedLocalEmbeddingsDaemon(
|
|||
|
||||
return {
|
||||
baseUrl: daemon.baseUrl,
|
||||
stdoutLog: daemon.state.stdoutLog,
|
||||
stderrLog: daemon.state.stderrLog,
|
||||
env: {
|
||||
[MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV]: daemon.baseUrl,
|
||||
},
|
||||
|
|
|
|||
|
|
@ -11,6 +11,8 @@ import {
|
|||
type KtxMcpDaemonState,
|
||||
} from './managed-mcp-daemon.js';
|
||||
|
||||
type KtxMcpDaemonStartOptions = Parameters<typeof startKtxMcpDaemon>[0];
|
||||
|
||||
function child(pid = 4242): KtxMcpDaemonChild {
|
||||
return { pid, unref: vi.fn() };
|
||||
}
|
||||
|
|
@ -40,6 +42,7 @@ describe('managed MCP daemon lifecycle', () => {
|
|||
});
|
||||
|
||||
afterEach(async () => {
|
||||
vi.unstubAllEnvs();
|
||||
await rm(tempDir, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
|
|
@ -94,6 +97,33 @@ describe('managed MCP daemon lifecycle', () => {
|
|||
);
|
||||
});
|
||||
|
||||
it('sanitizes IPv6 CIDR entries from child NO_PROXY env', async () => {
|
||||
vi.stubEnv('NO_PROXY', 'localhost,fd07:b51a:cc66:f0::/64');
|
||||
vi.stubEnv('no_proxy', '::1,fd00::/8,*.orb.local');
|
||||
const spawnDaemon = vi.fn<NonNullable<KtxMcpDaemonStartOptions['spawnDaemon']>>(() => child(5555));
|
||||
|
||||
await startKtxMcpDaemon({
|
||||
projectDir,
|
||||
cliVersion: '0.0.0-test',
|
||||
host: '127.0.0.1',
|
||||
port: 7879,
|
||||
allowedHosts: [],
|
||||
allowedOrigins: [],
|
||||
binPath: '/repo/packages/cli/dist/bin.js',
|
||||
spawnDaemon,
|
||||
processAlive: vi.fn(() => false),
|
||||
portAvailable: vi.fn(async () => true),
|
||||
now: () => new Date('2026-05-14T00:00:00.000Z'),
|
||||
});
|
||||
|
||||
const env = spawnDaemon.mock.calls[0]?.[2].env;
|
||||
if (!env) {
|
||||
throw new Error('Expected MCP daemon spawn env');
|
||||
}
|
||||
expect(env.NO_PROXY).toBe('localhost,::1,*.orb.local');
|
||||
expect(env.no_proxy).toBe(env.NO_PROXY);
|
||||
});
|
||||
|
||||
it('returns already-running without spawning when the daemon is alive at the same host/port', async () => {
|
||||
await mkdir(join(projectDir, '.ktx'), { recursive: true });
|
||||
await writeFile(join(projectDir, '.ktx/mcp.json'), `${JSON.stringify(state(projectDir), null, 2)}\n`);
|
||||
|
|
|
|||
|
|
@ -4,6 +4,7 @@ import { createServer } from 'node:net';
|
|||
import { dirname, join } from 'node:path';
|
||||
import { setTimeout as delay } from 'node:timers/promises';
|
||||
import { z } from 'zod';
|
||||
import { sanitizeChildProxyEnv } from './proxy-env.js';
|
||||
|
||||
export interface KtxMcpDaemonState {
|
||||
schemaVersion: 1;
|
||||
|
|
@ -166,11 +167,11 @@ export async function startKtxMcpDaemon(options: {
|
|||
const child = (options.spawnDaemon ?? defaultSpawnDaemon)(process.execPath, args, {
|
||||
detached: true,
|
||||
stdio: ['ignore', log.fd, log.fd],
|
||||
env: {
|
||||
env: sanitizeChildProxyEnv({
|
||||
...process.env,
|
||||
KTX_CLI_VERSION: options.cliVersion,
|
||||
...(options.token ? { KTX_MCP_TOKEN: options.token } : {}),
|
||||
},
|
||||
}),
|
||||
});
|
||||
if (!child.pid) {
|
||||
throw new Error('Failed to start KTX MCP daemon: child process pid was not available.');
|
||||
|
|
|
|||
|
|
@ -99,6 +99,7 @@ function installResult(features: KtxRuntimeFeature[] = ['core']): ManagedPythonR
|
|||
asset: {
|
||||
manifest: installedManifest.asset,
|
||||
wheelPath: '/assets/python/kaelio_ktx-0.2.0-py3-none-any.whl',
|
||||
requiresPython: { specifier: '>=3.13', minimumVersion: '3.13' },
|
||||
},
|
||||
manifest: installedManifest,
|
||||
};
|
||||
|
|
|
|||
|
|
@ -79,6 +79,7 @@ function installResult(root: string, features: Array<'core' | 'local-embeddings'
|
|||
asset: {
|
||||
manifest: manifest(root, features).asset,
|
||||
wheelPath: join(root, 'assets', 'python', 'kaelio_ktx-0.2.0-py3-none-any.whl'),
|
||||
requiresPython: { specifier: '>=3.13', minimumVersion: '3.13' },
|
||||
},
|
||||
manifest: manifest(root, features),
|
||||
};
|
||||
|
|
@ -132,6 +133,7 @@ describe('managed Python daemon lifecycle', () => {
|
|||
});
|
||||
|
||||
afterEach(async () => {
|
||||
vi.unstubAllEnvs();
|
||||
await rm(tempDir, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
|
|
@ -187,6 +189,27 @@ describe('managed Python daemon lifecycle', () => {
|
|||
});
|
||||
});
|
||||
|
||||
it('sanitizes IPv6 CIDR entries from child NO_PROXY env', async () => {
|
||||
vi.stubEnv('NO_PROXY', 'localhost,fd07:b51a:cc66:f0::/64,127.0.0.0/8');
|
||||
vi.stubEnv('no_proxy', '::1,fd00::/8,*.orb.local');
|
||||
const spawnDaemon = makeSpawn(5555);
|
||||
|
||||
await startManagedPythonDaemon({
|
||||
...daemonOptionsBase(tempDir),
|
||||
features: ['local-embeddings'],
|
||||
installRuntime: vi.fn(async () => installResult(tempDir, ['core', 'local-embeddings'])),
|
||||
spawnDaemon,
|
||||
fetch: makeFetch(),
|
||||
allocatePort: vi.fn(async () => 61234),
|
||||
now: () => new Date('2026-05-11T00:00:00.000Z'),
|
||||
pollIntervalMs: 1,
|
||||
});
|
||||
|
||||
const env = vi.mocked(spawnDaemon).mock.calls[0]?.[2].env;
|
||||
expect(env?.NO_PROXY).toBe('localhost,127.0.0.0/8,::1,*.orb.local');
|
||||
expect(env?.no_proxy).toBe(env?.NO_PROXY);
|
||||
});
|
||||
|
||||
it('makes a final health probe before reporting startup failure', async () => {
|
||||
const spawnDaemon = makeSpawn(5556);
|
||||
const installRuntime = vi.fn(async () => installResult(tempDir));
|
||||
|
|
|
|||
|
|
@ -14,6 +14,7 @@ import {
|
|||
type ManagedPythonRuntimeInstallOptions,
|
||||
type ManagedPythonRuntimeInstallResult,
|
||||
} from './managed-python-runtime.js';
|
||||
import { sanitizeChildProxyEnv } from './proxy-env.js';
|
||||
|
||||
export interface ManagedPythonDaemonState {
|
||||
schemaVersion: 1;
|
||||
|
|
@ -696,10 +697,10 @@ export async function startManagedPythonDaemon(
|
|||
{
|
||||
detached: true,
|
||||
stdio: ['ignore', stdout.fd, stderr.fd],
|
||||
env: {
|
||||
env: sanitizeChildProxyEnv({
|
||||
...process.env,
|
||||
KTX_DAEMON_VERSION: options.cliVersion,
|
||||
},
|
||||
}),
|
||||
},
|
||||
);
|
||||
child.unref();
|
||||
|
|
|
|||
|
|
@ -2,6 +2,7 @@ import { createHash } from 'node:crypto';
|
|||
import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
|
||||
import { tmpdir } from 'node:os';
|
||||
import { join } from 'node:path';
|
||||
import { strToU8, zipSync } from 'fflate';
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
import {
|
||||
MISSING_UV_RUNTIME_INSTALL_MESSAGE,
|
||||
|
|
@ -14,10 +15,33 @@ import {
|
|||
type ManagedPythonRuntimeExec,
|
||||
} from './managed-python-runtime.js';
|
||||
|
||||
async function writeAsset(root: string, contents = 'wheel-bytes') {
|
||||
function runtimeWheelContents(input: { label?: string; requiresPython?: string | null } = {}): Buffer {
|
||||
const label = input.label ?? 'runtime-wheel';
|
||||
const requiresPython = input.requiresPython === null ? [] : [`Requires-Python: ${input.requiresPython ?? '>=3.13'}`];
|
||||
return Buffer.from(
|
||||
zipSync({
|
||||
'kaelio_ktx-0.1.0.dist-info/METADATA': strToU8(
|
||||
[
|
||||
'Metadata-Version: 2.4',
|
||||
'Name: kaelio-ktx',
|
||||
'Version: 0.1.0',
|
||||
...requiresPython,
|
||||
`Summary: ${label}`,
|
||||
'',
|
||||
].join('\n'),
|
||||
),
|
||||
}),
|
||||
);
|
||||
}
|
||||
|
||||
async function writeAsset(
|
||||
root: string,
|
||||
options: { label?: string; requiresPython?: string | null; contents?: Buffer } = {},
|
||||
) {
|
||||
const assetDir = join(root, 'assets', 'python');
|
||||
await mkdir(assetDir, { recursive: true });
|
||||
const wheelPath = join(assetDir, 'kaelio_ktx-0.1.0-py3-none-any.whl');
|
||||
const contents = options.contents ?? runtimeWheelContents(options);
|
||||
await writeFile(wheelPath, contents);
|
||||
await writeFile(
|
||||
join(assetDir, 'manifest.json'),
|
||||
|
|
@ -30,7 +54,7 @@ async function writeAsset(root: string, contents = 'wheel-bytes') {
|
|||
wheel: {
|
||||
file: 'kaelio_ktx-0.1.0-py3-none-any.whl',
|
||||
sha256: createHash('sha256').update(contents).digest('hex'),
|
||||
bytes: Buffer.byteLength(contents),
|
||||
bytes: contents.byteLength,
|
||||
},
|
||||
},
|
||||
null,
|
||||
|
|
@ -145,17 +169,18 @@ describe('verifyRuntimeAsset', () => {
|
|||
});
|
||||
|
||||
it('reads the manifest and verifies the wheel checksum', async () => {
|
||||
const { assetDir, wheelPath } = await writeAsset(tempDir, 'valid-wheel');
|
||||
const { assetDir, wheelPath } = await writeAsset(tempDir, { label: 'valid-wheel' });
|
||||
|
||||
const asset = await verifyRuntimeAsset({ assetDir });
|
||||
|
||||
expect(asset.manifest.distributionName).toBe('kaelio-ktx');
|
||||
expect(asset.manifest.normalizedName).toBe('kaelio_ktx');
|
||||
expect(asset.wheelPath).toBe(wheelPath);
|
||||
expect(asset.requiresPython).toEqual({ specifier: '>=3.13', minimumVersion: '3.13' });
|
||||
});
|
||||
|
||||
it('rejects a wheel whose checksum does not match the manifest', async () => {
|
||||
const { assetDir, wheelPath } = await writeAsset(tempDir, 'original');
|
||||
const { assetDir, wheelPath } = await writeAsset(tempDir, { label: 'original' });
|
||||
await writeFile(wheelPath, 'tampered');
|
||||
|
||||
await expect(verifyRuntimeAsset({ assetDir })).rejects.toThrow(
|
||||
|
|
@ -164,7 +189,7 @@ describe('verifyRuntimeAsset', () => {
|
|||
});
|
||||
|
||||
it('rejects an unsafe wheel filename in the manifest', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'valid-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'valid-wheel' });
|
||||
await writeFile(
|
||||
join(assetDir, 'manifest.json'),
|
||||
`${JSON.stringify({
|
||||
|
|
@ -190,6 +215,22 @@ describe('verifyRuntimeAsset', () => {
|
|||
/Missing bundled Python runtime manifest.*pnpm run artifacts:build/s,
|
||||
);
|
||||
});
|
||||
|
||||
it('rejects a bundled wheel without Requires-Python metadata', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, { requiresPython: null });
|
||||
|
||||
await expect(verifyRuntimeAsset({ assetDir })).rejects.toThrow(
|
||||
/Bundled Python runtime wheel metadata is missing Requires-Python/,
|
||||
);
|
||||
});
|
||||
|
||||
it('rejects a bundled wheel without a supported minimum Python version', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, { requiresPython: '<4' });
|
||||
|
||||
await expect(verifyRuntimeAsset({ assetDir })).rejects.toThrow(
|
||||
/Unsupported bundled Python runtime Requires-Python: <4/,
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
describe('installManagedPythonRuntime', () => {
|
||||
|
|
@ -204,7 +245,7 @@ describe('installManagedPythonRuntime', () => {
|
|||
});
|
||||
|
||||
it('creates a venv, installs the core wheel, and writes a manifest', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const commands: Array<{ command: string; args: string[] }> = [];
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
|
||||
commands.push({ command, args });
|
||||
|
|
@ -222,7 +263,8 @@ describe('installManagedPythonRuntime', () => {
|
|||
expect(result.status).toBe('installed');
|
||||
expect(commands).toEqual([
|
||||
{ command: 'uv', args: ['--version'] },
|
||||
{ command: 'uv', args: ['venv', result.layout.venvDir] },
|
||||
{ command: 'uv', args: ['python', 'install', '3.13'] },
|
||||
{ command: 'uv', args: ['venv', '--python', '3.13', result.layout.venvDir] },
|
||||
{
|
||||
command: 'uv',
|
||||
args: ['pip', 'install', '--python', result.layout.pythonPath, result.asset.wheelPath],
|
||||
|
|
@ -240,7 +282,7 @@ describe('installManagedPythonRuntime', () => {
|
|||
});
|
||||
|
||||
it('disables repo uv config for managed runtime uv commands', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const commands: Array<{ command: string; args: string[]; env?: NodeJS.ProcessEnv }> = [];
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args, options) => {
|
||||
commands.push({ command, args, env: options?.env });
|
||||
|
|
@ -258,13 +300,14 @@ describe('installManagedPythonRuntime', () => {
|
|||
|
||||
expect(commands.map((call) => [call.command, call.args[0], call.env?.UV_NO_CONFIG, call.env?.PATH])).toEqual([
|
||||
['uv', '--version', '1', '/opt/homebrew/bin'],
|
||||
['uv', 'python', '1', '/opt/homebrew/bin'],
|
||||
['uv', 'venv', '1', '/opt/homebrew/bin'],
|
||||
['uv', 'pip', '1', '/opt/homebrew/bin'],
|
||||
]);
|
||||
});
|
||||
|
||||
it('installs the local-embeddings extra when requested', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'embedding-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'embedding-wheel' });
|
||||
const commands: Array<{ command: string; args: string[] }> = [];
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
|
||||
commands.push({ command, args });
|
||||
|
|
@ -288,7 +331,7 @@ describe('installManagedPythonRuntime', () => {
|
|||
});
|
||||
|
||||
it('fails with the hard-prerequisite message when uv is missing', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const commands: Array<{ command: string; args: string[] }> = [];
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
|
||||
commands.push({ command, args });
|
||||
|
|
@ -309,7 +352,7 @@ describe('installManagedPythonRuntime', () => {
|
|||
});
|
||||
|
||||
it('reuses an existing compatible runtime when force is false', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
|
||||
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
|
||||
stderr: '',
|
||||
|
|
@ -335,14 +378,17 @@ describe('installManagedPythonRuntime', () => {
|
|||
});
|
||||
|
||||
expect(second.status).toBe('ready');
|
||||
expect(exec).toHaveBeenCalledTimes(3);
|
||||
expect(exec).toHaveBeenCalledTimes(4);
|
||||
});
|
||||
|
||||
it('keeps failed install logs in the versioned runtime directory', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => {
|
||||
if (command === 'uv' && args[0] === 'venv') {
|
||||
throw Object.assign(new Error('uv venv failed'), { stdout: 'creating\n', stderr: 'bad python\n' });
|
||||
throw Object.assign(new Error('uv venv failed'), {
|
||||
stdout: 'creating\n',
|
||||
stderr: '× No solution found\n╰─▶ current Python version (3.12.3) does not satisfy Python>=3.13\n',
|
||||
});
|
||||
}
|
||||
return { stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '', stderr: '' };
|
||||
});
|
||||
|
|
@ -355,11 +401,11 @@ describe('installManagedPythonRuntime', () => {
|
|||
features: ['core'],
|
||||
exec,
|
||||
}),
|
||||
).rejects.toThrow(/Python runtime install failed/);
|
||||
).rejects.toThrow(/current Python version \(3\.12\.3\) does not satisfy Python>=3\.13/);
|
||||
|
||||
const log = await readFile(join(tempDir, 'runtime', '0.2.0', 'install.log'), 'utf8');
|
||||
expect(log).toContain('$ uv venv');
|
||||
expect(log).toContain('bad python');
|
||||
expect(log).toContain('$ uv venv --python 3.13');
|
||||
expect(log).toContain('current Python version (3.12.3) does not satisfy Python>=3.13');
|
||||
});
|
||||
});
|
||||
|
||||
|
|
@ -386,7 +432,7 @@ describe('readManagedPythonRuntimeStatus', () => {
|
|||
});
|
||||
|
||||
it('reports ready when manifest and executables exist', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
|
||||
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
|
||||
stderr: '',
|
||||
|
|
@ -413,7 +459,7 @@ describe('readManagedPythonRuntimeStatus', () => {
|
|||
});
|
||||
|
||||
it('reports broken when an executable is missing', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
|
||||
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
|
||||
stderr: '',
|
||||
|
|
@ -449,7 +495,7 @@ describe('doctorManagedPythonRuntime', () => {
|
|||
});
|
||||
|
||||
it('checks uv, bundled assets, and installed runtime status', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async (command, args) => ({
|
||||
stdout: command === 'uv' && args[0] === '--version' ? 'uv 0.9.5\n' : '',
|
||||
stderr: '',
|
||||
|
|
@ -471,7 +517,7 @@ describe('doctorManagedPythonRuntime', () => {
|
|||
});
|
||||
|
||||
it('reports uv as a hard prerequisite when uv is missing', async () => {
|
||||
const { assetDir } = await writeAsset(tempDir, 'core-wheel');
|
||||
const { assetDir } = await writeAsset(tempDir, { label: 'core-wheel' });
|
||||
const exec: ManagedPythonRuntimeExec = vi.fn(async () => {
|
||||
throw new Error('spawn uv ENOENT');
|
||||
});
|
||||
|
|
|
|||
|
|
@ -5,6 +5,7 @@ import { homedir } from 'node:os';
|
|||
import { basename, join } from 'node:path';
|
||||
import { fileURLToPath } from 'node:url';
|
||||
import { promisify } from 'node:util';
|
||||
import { strFromU8, unzipSync } from 'fflate';
|
||||
import { z } from 'zod';
|
||||
|
||||
const execFileAsync = promisify(execFile);
|
||||
|
|
@ -78,6 +79,10 @@ export interface ManagedPythonDaemonLayout extends ManagedPythonRuntimeLayout {
|
|||
export interface ManagedRuntimeAsset {
|
||||
manifest: KtxRuntimeAssetManifest;
|
||||
wheelPath: string;
|
||||
requiresPython: {
|
||||
specifier: string;
|
||||
minimumVersion: string;
|
||||
};
|
||||
}
|
||||
|
||||
export type ManagedPythonRuntimeExec = (
|
||||
|
|
@ -196,6 +201,40 @@ function isErrnoException(error: unknown, code: string): boolean {
|
|||
return typeof error === 'object' && error !== null && 'code' in error && error.code === code;
|
||||
}
|
||||
|
||||
function parseRequiresPythonFromWheel(input: { wheelPath: string; contents: Buffer }): ManagedRuntimeAsset['requiresPython'] {
|
||||
let files: Record<string, Uint8Array>;
|
||||
try {
|
||||
files = unzipSync(new Uint8Array(input.contents));
|
||||
} catch (error) {
|
||||
throw new Error(
|
||||
`Unable to read bundled Python runtime wheel metadata: ${error instanceof Error ? error.message : String(error)}`,
|
||||
);
|
||||
}
|
||||
const metadataEntry = Object.entries(files).find(([path]) => path.endsWith('.dist-info/METADATA'));
|
||||
if (!metadataEntry) {
|
||||
throw new Error(`Bundled Python runtime wheel metadata is missing: ${input.wheelPath}`);
|
||||
}
|
||||
|
||||
const metadata = strFromU8(metadataEntry[1]);
|
||||
const requiresPython = metadata
|
||||
.split(/\r?\n/)
|
||||
.map((line) => line.match(/^Requires-Python:\s*(.+)\s*$/i)?.[1]?.trim())
|
||||
.find((value): value is string => typeof value === 'string' && value.length > 0);
|
||||
if (!requiresPython) {
|
||||
throw new Error('Bundled Python runtime wheel metadata is missing Requires-Python');
|
||||
}
|
||||
|
||||
const minimumMatch = requiresPython.match(/(?:^|[,\s])>=\s*([0-9]+)\.([0-9]+)(?:\.[0-9]+)?\b/);
|
||||
if (!minimumMatch) {
|
||||
throw new Error(`Unsupported bundled Python runtime Requires-Python: ${requiresPython}`);
|
||||
}
|
||||
|
||||
return {
|
||||
specifier: requiresPython,
|
||||
minimumVersion: `${minimumMatch[1]}.${minimumMatch[2]}`,
|
||||
};
|
||||
}
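To make the version extraction above concrete, here is a minimal standalone sketch of the same `>=` lower-bound match. The helper name `minimumPythonVersion` and its `null` return are illustrative assumptions; the function in the diff throws on unsupported specifiers instead.

```typescript
// Hypothetical helper for illustration only. The diff's
// parseRequiresPythonFromWheel throws where this sketch returns null.
function minimumPythonVersion(specifier: string): string | null {
  // Same pattern as the diff: a ">=" clause at the start of the specifier or
  // after a comma/whitespace, capturing major and minor and ignoring any patch.
  const match = specifier.match(/(?:^|[,\s])>=\s*([0-9]+)\.([0-9]+)(?:\.[0-9]+)?\b/);
  return match ? `${match[1]}.${match[2]}` : null;
}

// A few specifiers, including ones without a ">=" lower bound.
const samples = ['>=3.13', '>=3.13.2,<4', '<4,>=3.10', '<4', '~=3.13'];
for (const sample of samples) {
  console.log(`${sample} -> ${minimumPythonVersion(sample) ?? 'unsupported'}`);
}
```

Specifiers such as `<4` or `~=3.13` carry no `>=` clause, which is why the test suite expects `Unsupported bundled Python runtime Requires-Python: <4` for the former.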
|
||||
|
||||
export async function verifyRuntimeAsset(input: { assetDir: string }): Promise<ManagedRuntimeAsset> {
|
||||
const manifestPath = join(input.assetDir, 'manifest.json');
|
||||
let manifestData: unknown;
|
||||
|
|
@ -221,7 +260,7 @@ export async function verifyRuntimeAsset(input: { assetDir: string }): Promise<M
|
|||
if (sha256 !== manifest.wheel.sha256 || wheel.byteLength !== manifest.wheel.bytes) {
|
||||
throw new Error(`Bundled Python runtime wheel checksum mismatch: ${wheelPath}`);
|
||||
}
|
||||
return { manifest, wheelPath };
|
||||
return { manifest, wheelPath, requiresPython: parseRequiresPythonFromWheel({ wheelPath, contents: wheel }) };
|
||||
}
|
||||
|
||||
function normalizeFeatures(features: KtxRuntimeFeature[]): KtxRuntimeFeature[] {
|
||||
|
|
@ -262,6 +301,14 @@ function errorOutput(error: unknown): { stdout: string; stderr: string } {
|
|||
};
|
||||
}
|
||||
|
||||
function installFailureMessage(input: { logPath: string; stdout: string; stderr: string }): string {
|
||||
const output = [input.stderr.trim(), input.stdout.trim()].filter((part) => part.length > 0).join('\n');
|
||||
if (!output) {
|
||||
return `Python runtime install failed. Install log: ${input.logPath}`;
|
||||
}
|
||||
return `Python runtime install failed.\n${output}\nInstall log: ${input.logPath}`;
|
||||
}
|
||||
|
||||
async function runLogged(input: {
|
||||
exec: ManagedPythonRuntimeExec;
|
||||
logPath: string;
|
||||
|
|
@ -288,7 +335,7 @@ async function runLogged(input: {
|
|||
if (output.stderr) {
|
||||
await appendFile(input.logPath, output.stderr.endsWith('\n') ? output.stderr : `${output.stderr}\n`);
|
||||
}
|
||||
throw new Error(`Python runtime install failed. Install log: ${input.logPath}`);
|
||||
throw new Error(installFailureMessage({ logPath: input.logPath, stdout: output.stdout, stderr: output.stderr }));
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -334,7 +381,14 @@ export async function installManagedPythonRuntime(
|
|||
exec,
|
||||
logPath: layout.installLogPath,
|
||||
command: 'uv',
|
||||
args: ['venv', layout.venvDir],
|
||||
args: ['python', 'install', asset.requiresPython.minimumVersion],
|
||||
env: uvEnv,
|
||||
});
|
||||
await runLogged({
|
||||
exec,
|
||||
logPath: layout.installLogPath,
|
||||
command: 'uv',
|
||||
args: ['venv', '--python', asset.requiresPython.minimumVersion, layout.venvDir],
|
||||
env: uvEnv,
|
||||
});
|
||||
const wheelSpec = features.includes('local-embeddings') ? `${asset.wheelPath}[local-embeddings]` : asset.wheelPath;
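Taken together, the two `runLogged` calls above mean the installer asks uv for the wheel's minimum Python version before creating the venv. Below is a rough sketch of the resulting command order, limited to the steps visible in this diff and its tests. `plannedUvCommands` and its parameter names are hypothetical; the real installer runs each step through `runLogged` rather than returning a list.

```typescript
interface UvCommand {
  command: 'uv';
  args: string[];
}

// Hypothetical planner for illustration only; not part of the KTX codebase.
function plannedUvCommands(input: {
  minimumVersion: string; // e.g. asset.requiresPython.minimumVersion
  venvDir: string;
  pythonPath: string;
  wheelPath: string;
  localEmbeddings?: boolean;
}): UvCommand[] {
  // Mirrors the wheelSpec line in the diff: the extra is appended to the wheel path.
  const wheelSpec = input.localEmbeddings ? `${input.wheelPath}[local-embeddings]` : input.wheelPath;
  return [
    { command: 'uv', args: ['--version'] },
    { command: 'uv', args: ['python', 'install', input.minimumVersion] },
    { command: 'uv', args: ['venv', '--python', input.minimumVersion, input.venvDir] },
    { command: 'uv', args: ['pip', 'install', '--python', input.pythonPath, wheelSpec] },
  ];
}

// Example paths are placeholders.
const plan = plannedUvCommands({
  minimumVersion: '3.13',
  venvDir: '/tmp/ktx-runtime/venv',
  pythonPath: '/tmp/ktx-runtime/venv/bin/python',
  wheelPath: '/assets/python/kaelio_ktx-0.2.0-py3-none-any.whl',
});
for (const step of plan) {
  console.log(`$ ${step.command} ${step.args.join(' ')}`);
}
```

Pinning `--python` to the parsed minimum version is what lets a failed install report a specific mismatch, such as the `current Python version (3.12.3) does not satisfy Python>=3.13` message asserted in the tests.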
|
||||
|
|
|
|||
21
packages/cli/src/proxy-env.test.ts
Normal file
|
|
@ -0,0 +1,21 @@
import { describe, expect, it } from 'vitest';
import { sanitizeChildProxyEnv } from './proxy-env.js';

describe('sanitizeChildProxyEnv', () => {
  it('drops IPv6 CIDR no-proxy entries and normalizes both env keys', () => {
    const env = sanitizeChildProxyEnv({
      NO_PROXY: 'localhost,127.0.0.1,127.0.0.0/8,fd07:b51a:cc66:f0::/64,*.orb.local',
      no_proxy: '::1,0.250.250.0/24,fd00::/8,*.orb.internal',
    });

    expect(env.NO_PROXY).toBe('localhost,127.0.0.1,127.0.0.0/8,*.orb.local,::1,0.250.250.0/24,*.orb.internal');
    expect(env.no_proxy).toBe(env.NO_PROXY);
  });

  it('preserves the input object and leaves missing proxy env unset', () => {
    const input = { PATH: '/usr/bin' };

    expect(sanitizeChildProxyEnv(input)).toEqual({ PATH: '/usr/bin' });
    expect(input).toEqual({ PATH: '/usr/bin' });
  });
});
27
packages/cli/src/proxy-env.ts
Normal file
|
|
@ -0,0 +1,27 @@
|
|||
const NO_PROXY_KEYS = ['NO_PROXY', 'no_proxy'] as const;
|
||||
|
||||
function isIpv6CidrNoProxyEntry(entry: string): boolean {
|
||||
return entry.includes('/') && entry.includes(':');
|
||||
}
|
||||
|
||||
function cleanedNoProxyValue(env: NodeJS.ProcessEnv): string | undefined {
|
||||
const entries = NO_PROXY_KEYS.flatMap((key) => (env[key] ?? '').split(','))
|
||||
.map((entry) => entry.trim())
|
||||
.filter((entry) => entry.length > 0 && !isIpv6CidrNoProxyEntry(entry));
|
||||
|
||||
if (!NO_PROXY_KEYS.some((key) => env[key] !== undefined)) {
|
||||
return undefined;
|
||||
}
|
||||
return [...new Set(entries)].join(',');
|
||||
}
|
||||
|
||||
export function sanitizeChildProxyEnv(env: NodeJS.ProcessEnv): NodeJS.ProcessEnv {
|
||||
const sanitized = { ...env };
|
||||
const noProxy = cleanedNoProxyValue(env);
|
||||
if (noProxy === undefined) {
|
||||
return sanitized;
|
||||
}
|
||||
sanitized.NO_PROXY = noProxy;
|
||||
sanitized.no_proxy = noProxy;
|
||||
return sanitized;
|
||||
}
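A minimal usage sketch of the helper added above, assuming it behaves as the accompanying test describes. The helper is inlined here so the snippet runs on its own; the `Env` type alias, the sample values, and the `childEnv` name are illustrative assumptions and not part of this commit.

```typescript
// Stand-in for NodeJS.ProcessEnv so the sketch needs no Node type definitions (assumption).
type Env = Record<string, string | undefined>;

const NO_PROXY_KEYS = ['NO_PROXY', 'no_proxy'] as const;

// An entry with both a '/' and a ':' is treated as an IPv6 CIDR block.
function isIpv6CidrNoProxyEntry(entry: string): boolean {
  return entry.includes('/') && entry.includes(':');
}

// Condensed copy of sanitizeChildProxyEnv from the diff above.
function sanitizeChildProxyEnv(env: Env): Env {
  const sanitized: Env = { ...env };
  // Leave the env untouched when neither key is set.
  if (!NO_PROXY_KEYS.some((key) => env[key] !== undefined)) {
    return sanitized;
  }
  const entries = NO_PROXY_KEYS.flatMap((key) => (env[key] ?? '').split(','))
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0 && !isIpv6CidrNoProxyEntry(entry));
  // Merge both keys, drop duplicates, and write the same value back to both.
  const noProxy = [...new Set(entries)].join(',');
  sanitized.NO_PROXY = noProxy;
  sanitized.no_proxy = noProxy;
  return sanitized;
}

// Hypothetical parent environment with mixed-case keys and one IPv6 CIDR entry.
const childEnv = sanitizeChildProxyEnv({
  PATH: '/usr/bin',
  NO_PROXY: 'localhost,fd00::/8,10.0.0.0/8',
  no_proxy: 'localhost,::1',
});
// childEnv.NO_PROXY and childEnv.no_proxy are both 'localhost,10.0.0.0/8,::1'
```

Note that a bare IPv6 address such as `::1` is kept, since it contains no `/`; only IPv6 CIDR ranges are dropped.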
|
||||
|
|
@@ -52,6 +52,7 @@ describe('runKtxRuntime', () => {
|
|||
},
|
||||
asset: {
|
||||
wheelPath: '/assets/python/kaelio_ktx-0.1.0-py3-none-any.whl',
|
||||
requiresPython: { specifier: '>=3.13', minimumVersion: '3.13' },
|
||||
manifest: {
|
||||
schemaVersion: 1,
|
||||
distributionName: 'kaelio-ktx',
|
||||
|
|
|
|||
|
|
@@ -46,9 +46,14 @@ function makePromptAdapter(options: {
|
|||
};
|
||||
}
|
||||
|
||||
function managedDaemon(baseUrl = 'http://127.0.0.1:61234') {
|
||||
function managedDaemon(
|
||||
baseUrl = 'http://127.0.0.1:61234',
|
||||
logs: { stdoutLog?: string; stderrLog?: string } = {},
|
||||
) {
|
||||
return {
|
||||
baseUrl,
|
||||
stdoutLog: logs.stdoutLog ?? '/tmp/ktx-daemon.stdout.log',
|
||||
stderrLog: logs.stderrLog ?? '/tmp/ktx-daemon.stderr.log',
|
||||
env: {
|
||||
KTX_MANAGED_SENTENCE_TRANSFORMERS_BASE_URL: baseUrl,
|
||||
},
|
||||
|
|
@@ -330,6 +335,65 @@ describe('setup embeddings step', () => {
|
|||
expect(io.stderr()).not.toContain('skip for now');
|
||||
});
|
||||
|
||||
it('prints the recent daemon stderr tail when local embedding health check fails', async () => {
|
||||
const io = makeIo();
|
||||
const stderrLog = join(tempDir, '.ktx', 'runtime', 'daemon.stderr.log');
|
||||
await mkdir(join(tempDir, '.ktx', 'runtime'), { recursive: true });
|
||||
await writeFile(
|
||||
stderrLog,
|
||||
Array.from({ length: 45 }, (_value, index) => `daemon traceback line ${index + 1}`).join('\n'),
|
||||
);
|
||||
|
||||
const result = await runKtxSetupEmbeddingsStep(
|
||||
{
|
||||
projectDir: tempDir,
|
||||
inputMode: 'disabled',
|
||||
cliVersion: '0.2.0',
|
||||
runtimeInstallPolicy: 'auto',
|
||||
skipEmbeddings: false,
|
||||
},
|
||||
io.io,
|
||||
{
|
||||
env: {},
|
||||
ensureLocalEmbeddings: vi.fn(async () => managedDaemon('http://127.0.0.1:61234', { stderrLog })),
|
||||
healthCheck: vi.fn(async () => ({ ok: false as const, message: 'HTTP 500' })),
|
||||
},
|
||||
);
|
||||
|
||||
expect(result.status).toBe('failed');
|
||||
expect(io.stderr()).toContain('Recent local embeddings daemon stderr:');
|
||||
expect(io.stderr()).toContain('daemon traceback line 6');
|
||||
expect(io.stderr()).toContain('daemon traceback line 45');
|
||||
expect(io.stderr()).not.toContain('daemon traceback line 5');
|
||||
});
|
||||
|
||||
it('does not print daemon stderr diagnostics when the log is unavailable or empty', async () => {
|
||||
const io = makeIo();
|
||||
|
||||
const result = await runKtxSetupEmbeddingsStep(
|
||||
{
|
||||
projectDir: tempDir,
|
||||
inputMode: 'disabled',
|
||||
cliVersion: '0.2.0',
|
||||
runtimeInstallPolicy: 'auto',
|
||||
skipEmbeddings: false,
|
||||
},
|
||||
io.io,
|
||||
{
|
||||
env: {},
|
||||
ensureLocalEmbeddings: vi.fn(async () =>
|
||||
managedDaemon('http://127.0.0.1:61234', {
|
||||
stderrLog: join(tempDir, '.ktx', 'runtime', 'missing.stderr.log'),
|
||||
}),
|
||||
),
|
||||
healthCheck: vi.fn(async () => ({ ok: false as const, message: 'HTTP 500' })),
|
||||
},
|
||||
);
|
||||
|
||||
expect(result.status).toBe('failed');
|
||||
expect(io.stderr()).not.toContain('Recent local embeddings daemon stderr:');
|
||||
});
|
||||
|
||||
it('uses fixed OpenAI defaults and only asks for credentials when OpenAI is selected', async () => {
|
||||
const io = makeIo();
|
||||
const healthCheck = vi.fn(async () => ({ ok: true as const }));
|
||||
|
|
|
|||
|
|
@@ -1,4 +1,4 @@
|
|||
import { writeFile } from 'node:fs/promises';
|
||||
import { readFile, writeFile } from 'node:fs/promises';
|
||||
import { resolveKtxConfigReference } from '@ktx/context/core';
|
||||
import {
|
||||
type KtxProjectConfig,
|
||||
|
|
@@ -59,6 +59,7 @@ export interface KtxSetupEmbeddingsDeps {
|
|||
healthCheck?: (config: KtxEmbeddingConfig) => Promise<KtxEmbeddingHealthCheckResult>;
|
||||
ensureLocalEmbeddings?: (options: {
|
||||
cliVersion: string;
|
||||
projectDir: string;
|
||||
installPolicy: KtxManagedPythonInstallPolicy;
|
||||
io: KtxCliIo;
|
||||
}) => Promise<ManagedLocalEmbeddingsDaemon>;
|
||||
|
|
@@ -85,6 +86,7 @@ const EMBEDDING_OPTION_PROMPT_CONTEXT =
|
|||
'KTX uses embeddings for semantic search over semantic-layer sources, wiki context, schema metadata, ' +
|
||||
'and relationship evidence.';
|
||||
const LOCAL_EMBEDDING_HEALTH_TIMEOUT_MS = 120_000;
|
||||
const LOCAL_EMBEDDING_STDERR_TAIL_LINES = 40;
|
||||
|
||||
function createPromptAdapter(): KtxSetupEmbeddingsPromptAdapter {
|
||||
return createKtxSetupPromptAdapter({ selectCancelValue: 'back' });
|
||||
|
|
@@ -286,14 +288,33 @@ async function chooseEmbeddingBackend(
|
|||
return 'back';
|
||||
}
|
||||
|
||||
function localEmbeddingSetupMessage(message: string): string {
|
||||
return [
|
||||
async function readLocalEmbeddingDaemonStderrTail(stderrLog: string | undefined): Promise<string[]> {
|
||||
if (!stderrLog) {
|
||||
return [];
|
||||
}
|
||||
try {
|
||||
const lines = (await readFile(stderrLog, 'utf8'))
|
||||
.split(/\r?\n/)
|
||||
.map((line) => line.trimEnd())
|
||||
.filter((line) => line.trim().length > 0);
|
||||
return lines.slice(-LOCAL_EMBEDDING_STDERR_TAIL_LINES);
|
||||
} catch {
|
||||
return [];
|
||||
}
|
||||
}
|
||||
|
||||
function localEmbeddingSetupMessage(message: string, stderrTail: string[] = []): string {
|
||||
const lines = [
|
||||
`Local embedding health check failed: ${message}`,
|
||||
'Local embeddings use the KTX-managed Python runtime.',
|
||||
'Prepare the runtime with: ktx dev runtime start --feature local-embeddings',
|
||||
'Use --yes with setup to install and start the runtime without prompting.',
|
||||
'The first run may download Python packages and the all-MiniLM-L6-v2 model.',
|
||||
].join('\n');
|
||||
];
|
||||
if (stderrTail.length > 0) {
|
||||
lines.push('Recent local embeddings daemon stderr:', ...stderrTail);
|
||||
}
|
||||
return lines.join('\n');
|
||||
}
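The tail selection used by `readLocalEmbeddingDaemonStderrTail` can be sketched on a plain string, without file I/O. This is an illustrative variant rather than the code in this commit: `tailNonEmptyLines` and `sampleLog` are hypothetical names, and the 45-line input mirrors the fixture used in the test above.

```typescript
// Hypothetical pure-string variant of the tail logic: split on LF or CRLF,
// strip trailing whitespace, drop blank lines, keep the last `maxLines` entries.
// Assumes maxLines > 0 (slice(-0) would return every line).
function tailNonEmptyLines(text: string, maxLines: number): string[] {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trimEnd())
    .filter((line) => line.trim().length > 0);
  return lines.slice(-maxLines);
}

// 45 numbered lines, as in the test fixture.
const sampleLog = Array.from({ length: 45 }, (_value, index) => `daemon traceback line ${index + 1}`).join('\n');

// With a limit of 40, lines 1 to 5 fall off and the tail starts at line 6.
const tail = tailNonEmptyLines(sampleLog, 40);
```

This matches the test expectation that line 6 and line 45 appear in the output while the first five lines do not.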
|
||||
|
||||
async function promptAfterLocalEmbeddingFailure(
|
||||
|
|
@@ -447,9 +468,13 @@ export async function runKtxSetupEmbeddingsStep(
|
|||
}
|
||||
|
||||
progress.fail('Embedding test failed');
|
||||
const stderrTail =
|
||||
selectedBackend === 'sentence-transformers'
|
||||
? await readLocalEmbeddingDaemonStderrTail(managedLocalEmbeddings?.stderrLog)
|
||||
: [];
|
||||
io.stderr.write(
|
||||
selectedBackend === 'sentence-transformers'
|
||||
? `${localEmbeddingSetupMessage(health.message)}\n`
|
||||
? `${localEmbeddingSetupMessage(health.message, stderrTail)}\n`
|
||||
: `Embedding health check failed: ${health.message}\n`,
|
||||
);
|
||||
if (args.inputMode === 'disabled') {
|
||||
|
|
|
|||
|
|
@@ -29,8 +29,18 @@ export interface KtxSetupProjectArgs {
|
|||
allowBack?: boolean;
|
||||
}
|
||||
|
||||
export type KtxSetupCreatedProjectCleanup =
|
||||
| { kind: 'remove-project-dir'; projectDir: string }
|
||||
| { kind: 'remove-ktx-scaffold'; projectDir: string };
|
||||
|
||||
export type KtxSetupProjectResult =
|
||||
| { status: 'ready'; projectDir: string; project: KtxLocalProject; confirmedCreation?: boolean }
|
||||
| {
|
||||
status: 'ready';
|
||||
projectDir: string;
|
||||
project: KtxLocalProject;
|
||||
confirmedCreation?: boolean;
|
||||
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
|
||||
}
|
||||
| { status: 'back'; projectDir: string }
|
||||
| { status: 'cancelled'; projectDir: string }
|
||||
| { status: 'missing-input'; projectDir: string };
|
||||
|
|
@@ -49,7 +59,12 @@ export interface KtxSetupProjectDeps {
|
|||
}
|
||||
|
||||
type PromptProjectDirResult =
|
||||
| { status: 'selected'; projectDir: string; confirmedCreation: boolean }
|
||||
| {
|
||||
status: 'selected';
|
||||
projectDir: string;
|
||||
confirmedCreation: boolean;
|
||||
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
|
||||
}
|
||||
| { status: 'cancelled'; projectDir: string }
|
||||
| { status: 'missing-input'; projectDir: string }
|
||||
| { status: 'back'; projectDir: string };
|
||||
|
|
@@ -92,12 +107,29 @@ async function existingFolderState(
|
|||
}
|
||||
|
||||
type ConfirmProjectDirResult =
|
||||
| { status: 'confirmed'; confirmedCreation: boolean }
|
||||
| {
|
||||
status: 'confirmed';
|
||||
confirmedCreation: boolean;
|
||||
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
|
||||
}
|
||||
| { status: 'choose-another' }
|
||||
| { status: 'back' }
|
||||
| { status: 'cancelled' }
|
||||
| { status: 'not-directory' };
|
||||
|
||||
function cleanupForFolderState(
|
||||
projectDir: string,
|
||||
state: Awaited<ReturnType<typeof existingFolderState>>,
|
||||
): KtxSetupCreatedProjectCleanup | undefined {
|
||||
if (state === 'missing') {
|
||||
return { kind: 'remove-project-dir', projectDir };
|
||||
}
|
||||
if (state === 'empty-directory') {
|
||||
return { kind: 'remove-ktx-scaffold', projectDir };
|
||||
}
|
||||
return undefined;
|
||||
}
|
||||
|
||||
async function confirmProjectDir(
|
||||
selectedDir: string,
|
||||
io: KtxCliIo,
|
||||
|
|
@@ -137,7 +169,7 @@ async function confirmProjectDir(
|
|||
if (action === 'choose-another') return { status: 'choose-another' };
|
||||
if (action === 'back') return { status: 'back' };
|
||||
if (action !== 'create') return { status: 'cancelled' };
|
||||
return { status: 'confirmed', confirmedCreation: true };
|
||||
return { status: 'confirmed', confirmedCreation: true, createdProjectCleanup: cleanupForFolderState(selectedDir, state) };
|
||||
}
|
||||
|
||||
async function normalizeSetupGitignore(projectDir: string): Promise<void> {
|
||||
|
|
@@ -220,10 +252,28 @@ async function promptForNewProjectDir(
|
|||
if (confirmed.status === 'choose-another') continue;
|
||||
if (confirmed.status === 'back') return { status: 'back', projectDir };
|
||||
if (confirmed.status === 'cancelled') return { status: 'cancelled', projectDir };
|
||||
return { status: 'selected', projectDir: selectedDir, confirmedCreation: confirmed.confirmedCreation };
|
||||
return {
|
||||
status: 'selected',
|
||||
projectDir: selectedDir,
|
||||
confirmedCreation: confirmed.confirmedCreation,
|
||||
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
async function createProjectWithCleanup(
|
||||
projectDir: string,
|
||||
deps: KtxSetupProjectDeps,
|
||||
): Promise<{ project: KtxLocalProject; createdProjectCleanup?: KtxSetupCreatedProjectCleanup }> {
|
||||
const state = await existingFolderState(projectDir);
|
||||
const project = await createProject(projectDir, deps);
|
||||
const createdProjectCleanup = cleanupForFolderState(projectDir, state);
|
||||
return {
|
||||
project,
|
||||
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
|
||||
};
|
||||
}
|
||||
|
||||
export async function runKtxSetupProjectStep(
|
||||
args: KtxSetupProjectArgs,
|
||||
io: KtxCliIo,
|
||||
|
|
@@ -261,6 +311,7 @@ export async function runKtxSetupProjectStep(
|
|||
projectDir: selected.projectDir,
|
||||
project,
|
||||
confirmedCreation: selected.confirmedCreation,
|
||||
...(selected.createdProjectCleanup ? { createdProjectCleanup: selected.createdProjectCleanup } : {}),
|
||||
};
|
||||
}
|
||||
|
||||
|
|
@@ -275,9 +326,14 @@ export async function runKtxSetupProjectStep(
|
|||
io.stderr.write('Missing setup choice: pass --yes to create a project in non-interactive setup.\n');
|
||||
return { status: 'missing-input', projectDir };
|
||||
}
|
||||
const project = await createProject(projectDir, deps);
|
||||
const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps);
|
||||
printProjectSummary(io, projectDir);
|
||||
return { status: 'ready', projectDir, project };
|
||||
return {
|
||||
status: 'ready',
|
||||
projectDir,
|
||||
project,
|
||||
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
|
||||
};
|
||||
}
|
||||
|
||||
if (!io.stdout.isTTY && !deps.prompts) {
|
||||
|
|
@@ -316,9 +372,14 @@ export async function runKtxSetupProjectStep(
|
|||
}
|
||||
|
||||
if (choice === 'current') {
|
||||
const project = await createProject(projectDir, deps);
|
||||
const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps);
|
||||
printProjectSummary(io, projectDir);
|
||||
return { status: 'ready', projectDir, project };
|
||||
return {
|
||||
status: 'ready',
|
||||
projectDir,
|
||||
project,
|
||||
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
|
||||
};
|
||||
}
|
||||
|
||||
if (choice === 'new-default') {
|
||||
|
|
@@ -333,6 +394,7 @@ export async function runKtxSetupProjectStep(
|
|||
projectDir: defaultProjectDir,
|
||||
project,
|
||||
confirmedCreation: confirmed.confirmedCreation,
|
||||
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
|
||||
};
|
||||
}
|
||||
|
||||
|
|
@@ -356,7 +418,13 @@ export async function runKtxSetupProjectStep(
|
|||
if (confirmed.status === 'cancelled') return { status: 'cancelled', projectDir };
|
||||
const project = await createProject(customDir, deps);
|
||||
printProjectSummary(io, customDir);
|
||||
return { status: 'ready', projectDir: customDir, project, confirmedCreation: confirmed.confirmedCreation };
|
||||
return {
|
||||
status: 'ready',
|
||||
projectDir: customDir,
|
||||
project,
|
||||
confirmedCreation: confirmed.confirmedCreation,
|
||||
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
|
||||
};
|
||||
}
|
||||
|
||||
prompts.cancel('Setup cancelled.');
|
||||
|
|
|
|||
|
|
@@ -101,6 +101,8 @@ describe('runKtxSetupRuntimeStep', () => {
|
|||
const io = makeIo();
|
||||
const ensureLocalEmbeddings = vi.fn(async () => ({
|
||||
baseUrl: 'http://127.0.0.1:61234',
|
||||
stdoutLog: join(tempDir, '.ktx', 'runtime', 'daemon.stdout.log'),
|
||||
stderrLog: join(tempDir, '.ktx', 'runtime', 'daemon.stderr.log'),
|
||||
env: { KTX_MANAGED_SENTENCE_TRANSFORMERS_BASE_URL: 'http://127.0.0.1:61234' },
|
||||
}));
|
||||
const config: KtxProjectConfig = {
|
||||
|
|
|
|||
|
|
@@ -1,5 +1,5 @@
|
|||
import { execFile } from 'node:child_process';
|
||||
import { mkdir, mkdtemp, readFile, rm, stat, writeFile } from 'node:fs/promises';
|
||||
import { mkdir, mkdtemp, readFile, readdir, rm, stat, writeFile } from 'node:fs/promises';
|
||||
import { tmpdir } from 'node:os';
|
||||
import { join } from 'node:path';
|
||||
import { promisify } from 'node:util';
|
||||
|
|
@@ -563,6 +563,112 @@ describe('setup status', () => {
|
|||
expect(testIo.stderr()).toBe('');
|
||||
});
|
||||
|
||||
it('removes a newly created missing project directory when a later runtime step fails', async () => {
|
||||
const projectDir = join(tempDir, 'missing-project');
|
||||
const testIo = makeIo();
|
||||
|
||||
await expect(
|
||||
runKtxSetup(
|
||||
{
|
||||
command: 'run',
|
||||
projectDir,
|
||||
mode: 'auto',
|
||||
agents: false,
|
||||
skipAgents: true,
|
||||
inputMode: 'disabled',
|
||||
yes: true,
|
||||
cliVersion: '0.2.0',
|
||||
skipLlm: true,
|
||||
skipEmbeddings: true,
|
||||
databaseSchemas: [],
|
||||
skipDatabases: true,
|
||||
skipSources: true,
|
||||
},
|
||||
testIo.io,
|
||||
{
|
||||
model: async () => ({ status: 'skipped', projectDir }),
|
||||
embeddings: async () => ({ status: 'skipped', projectDir }),
|
||||
databases: async () => ({ status: 'skipped', projectDir }),
|
||||
sources: async () => ({ status: 'skipped', projectDir }),
|
||||
runtime: async () => ({ status: 'failed', projectDir, requirements: { features: ['core'], requirements: [] } }),
|
||||
},
|
||||
),
|
||||
).resolves.toBe(1);
|
||||
|
||||
await expect(stat(projectDir)).rejects.toThrow();
|
||||
});
|
||||
|
||||
it('removes KTX scaffold files from an initially empty project directory when runtime setup fails', async () => {
|
||||
const testIo = makeIo();
|
||||
|
||||
await expect(
|
||||
runKtxSetup(
|
||||
{
|
||||
command: 'run',
|
||||
projectDir: tempDir,
|
||||
mode: 'auto',
|
||||
agents: false,
|
||||
skipAgents: true,
|
||||
inputMode: 'disabled',
|
||||
yes: true,
|
||||
cliVersion: '0.2.0',
|
||||
skipLlm: true,
|
||||
skipEmbeddings: true,
|
||||
databaseSchemas: [],
|
||||
skipDatabases: true,
|
||||
skipSources: true,
|
||||
},
|
||||
testIo.io,
|
||||
{
|
||||
model: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
embeddings: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
databases: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
sources: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
runtime: async () => ({ status: 'failed', projectDir: tempDir, requirements: { features: ['core'], requirements: [] } }),
|
||||
},
|
||||
),
|
||||
).resolves.toBe(1);
|
||||
|
||||
await expect(stat(tempDir)).resolves.toBeDefined();
|
||||
expect(await readdir(tempDir)).toEqual([]);
|
||||
});
|
||||
|
||||
it('preserves a pre-existing non-empty project directory when runtime setup fails', async () => {
|
||||
await writeFile(join(tempDir, 'notes.txt'), 'keep me\n', 'utf-8');
|
||||
const testIo = makeIo();
|
||||
|
||||
await expect(
|
||||
runKtxSetup(
|
||||
{
|
||||
command: 'run',
|
||||
projectDir: tempDir,
|
||||
mode: 'auto',
|
||||
agents: false,
|
||||
skipAgents: true,
|
||||
inputMode: 'disabled',
|
||||
yes: true,
|
||||
cliVersion: '0.2.0',
|
||||
skipLlm: true,
|
||||
skipEmbeddings: true,
|
||||
databaseSchemas: [],
|
||||
skipDatabases: true,
|
||||
skipSources: true,
|
||||
},
|
||||
testIo.io,
|
||||
{
|
||||
model: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
embeddings: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
databases: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
sources: async () => ({ status: 'skipped', projectDir: tempDir }),
|
||||
runtime: async () => ({ status: 'failed', projectDir: tempDir, requirements: { features: ['core'], requirements: [] } }),
|
||||
},
|
||||
),
|
||||
).resolves.toBe(1);
|
||||
|
||||
await expect(readFile(join(tempDir, 'notes.txt'), 'utf-8')).resolves.toBe('keep me\n');
|
||||
await expect(stat(join(tempDir, 'ktx.yaml'))).resolves.toBeDefined();
|
||||
});
|
||||
|
||||
it('shows demo near the bottom of the first setup intent menu before project creation', async () => {
|
||||
const testIo = makeIo();
|
||||
const select = vi.fn(async (options: { options: Array<{ value: string; label: string }> }) => {
|
||||
|
|
|
|||
|
|
@@ -1,4 +1,5 @@
|
|||
import { existsSync } from 'node:fs';
|
||||
import { rm } from 'node:fs/promises';
|
||||
import { basename, join, resolve } from 'node:path';
|
||||
import { getLatestLocalIngestStatus, savedMemoryCountsForReport } from '@ktx/context/ingest';
|
||||
import {
|
||||
|
|
@@ -33,7 +34,11 @@ import {
|
|||
isKtxSetupLlmConfigReady,
|
||||
runKtxSetupAnthropicModelStep,
|
||||
} from './setup-models.js';
|
||||
import { type KtxSetupProjectDeps, runKtxSetupProjectStep } from './setup-project.js';
|
||||
import {
|
||||
type KtxSetupCreatedProjectCleanup,
|
||||
type KtxSetupProjectDeps,
|
||||
runKtxSetupProjectStep,
|
||||
} from './setup-project.js';
|
||||
import {
|
||||
isKtxPreAgentSetupReady,
|
||||
isKtxSetupReady,
|
||||
|
|
@@ -502,6 +507,23 @@ async function commitSetupConfigChanges(projectDir: string): Promise<void> {
|
|||
await project.git.commitFile('ktx.yaml', 'setup: update KTX project config', 'ktx setup', 'setup@ktx.local');
|
||||
}
|
||||
|
||||
const KTX_SETUP_SCAFFOLD_PATHS = ['ktx.yaml', '.ktx', 'wiki', 'semantic-layer', 'raw-sources', '.git'];
|
||||
|
||||
async function cleanupCreatedProjectScaffold(cleanup: KtxSetupCreatedProjectCleanup | undefined): Promise<void> {
|
||||
if (!cleanup) {
|
||||
return;
|
||||
}
|
||||
if (cleanup.kind === 'remove-project-dir') {
|
||||
await rm(cleanup.projectDir, { recursive: true, force: true });
|
||||
return;
|
||||
}
|
||||
await Promise.all(
|
||||
KTX_SETUP_SCAFFOLD_PATHS.map((relativePath) =>
|
||||
rm(join(cleanup.projectDir, relativePath), { recursive: true, force: true }),
|
||||
),
|
||||
);
|
||||
}
|
||||
|
||||
export async function runKtxSetup(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetupDeps = {}): Promise<number> {
|
||||
try {
|
||||
return await runKtxSetupInner(args, io, deps);
|
||||
|
|
@@ -772,7 +794,11 @@ async function runKtxSetupInner(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetup
|
|||
}
|
||||
}
|
||||
|
||||
if (stepResult.status === 'failed' || stepResult.status === 'missing-input') {
|
||||
if (stepResult.status === 'failed') {
|
||||
await cleanupCreatedProjectScaffold(projectResult.createdProjectCleanup);
|
||||
return 1;
|
||||
}
|
||||
if (stepResult.status === 'missing-input') {
|
||||
return 1;
|
||||
}
|
||||
if (stepResult.status === 'back') {
|
||||
|
|
|
|||
|
|
@@ -111,12 +111,12 @@ describe('createKtxEmbeddingProvider', () => {
|
|||
);
|
||||
});
|
||||
|
||||
it('falls back to one-shot ktx-daemon inference when the local HTTP daemon is unavailable', async () => {
|
||||
const fetch = vi.fn().mockRejectedValue(new TypeError('fetch failed'));
|
||||
const runSentenceTransformersJson = vi
|
||||
it('reports local HTTP daemon failures without a ktx-daemon spawn fallback cascade', async () => {
|
||||
const fetch = vi
|
||||
.fn()
|
||||
.mockResolvedValueOnce({ embedding: [0.1, 0.2] })
|
||||
.mockResolvedValueOnce({ embeddings: [[0.3, 0.4], [0.5, 0.6]] });
|
||||
.mockResolvedValue(
|
||||
new Response('Embedding compute failed: httpx.InvalidURL: Invalid port', { status: 500 }),
|
||||
);
|
||||
|
||||
const provider = createKtxEmbeddingProvider(
|
||||
{
|
||||
|
|
@@ -125,19 +125,13 @@ describe('createKtxEmbeddingProvider', () => {
|
|||
dimensions: 2,
|
||||
sentenceTransformers: { baseURL: 'http://127.0.0.1:8765', pathPrefix: '' },
|
||||
},
|
||||
{ fetch, runSentenceTransformersJson },
|
||||
{ fetch },
|
||||
);
|
||||
|
||||
await expect(provider.embedMany(['hello', 'world'])).resolves.toEqual([
|
||||
[0.3, 0.4],
|
||||
[0.5, 0.6],
|
||||
]);
|
||||
await expect(provider.embed('hello')).rejects.toThrow(
|
||||
'Embedding provider sentence-transformers request failed with HTTP 500: Embedding compute failed: httpx.InvalidURL: Invalid port',
|
||||
);
|
||||
await expect(provider.embed('hello')).rejects.not.toThrow('ktx-daemon fallback failed');
|
||||
expect(fetch).toHaveBeenCalledTimes(1);
|
||||
expect(runSentenceTransformersJson).toHaveBeenNthCalledWith(1, 'embedding-compute', {
|
||||
text: '__ktx_embedding_probe__',
|
||||
});
|
||||
expect(runSentenceTransformersJson).toHaveBeenNthCalledWith(2, 'embedding-compute-bulk', {
|
||||
texts: ['hello', 'world'],
|
||||
});
|
||||
});
|
||||
});
|
||||
|
|
|
|||
|
|
@@ -1,15 +1,7 @@
|
|||
import { spawn } from 'node:child_process';
|
||||
import { join } from 'node:path';
|
||||
import OpenAI from 'openai';
|
||||
import type { KtxEmbeddingConfig, KtxEmbeddingProvider } from './types.js';
|
||||
|
||||
type FetchFn = typeof fetch;
|
||||
type SentenceTransformersCommand = 'embedding-compute' | 'embedding-compute-bulk';
|
||||
type SentenceTransformersJsonRunner = (
|
||||
subcommand: SentenceTransformersCommand,
|
||||
payload: Record<string, unknown>,
|
||||
) => Promise<Record<string, unknown>>;
|
||||
type SentenceTransformersProcessCommand = { command: string; args: string[] };
|
||||
|
||||
export interface KtxEmbeddingProviderDeps {
|
||||
createOpenAIClient?: (options: { apiKey?: string; baseURL?: string }) => {
|
||||
|
|
@@ -23,14 +15,10 @@ export interface KtxEmbeddingProviderDeps {
|
|||
};
|
||||
};
|
||||
fetch?: FetchFn;
|
||||
runSentenceTransformersJson?: SentenceTransformersJsonRunner;
|
||||
sentenceTransformersCommand?: string;
|
||||
sentenceTransformersArgs?: string[];
|
||||
sentenceTransformersCwd?: string;
|
||||
sentenceTransformersEnv?: NodeJS.ProcessEnv;
|
||||
}
|
||||
|
||||
const DEFAULT_BATCH_SIZE = 100;
|
||||
const HTTP_ERROR_BODY_MAX_LENGTH = 2_000;
|
||||
|
||||
function assertNonEmptyText(text: string): void {
|
||||
if (!text.trim()) {
|
||||
|
|
@@ -69,110 +57,12 @@ function joinUrl(baseURL: string, pathPrefix: string, path: string): string {
|
|||
return prefix ? `${base}/${prefix}/${suffix}` : `${base}/${suffix}`;
|
||||
}
|
||||
|
||||
function errorText(error: unknown): string {
|
||||
if (error instanceof Error) {
|
||||
return error.cause
|
||||
? `${error.name}: ${error.message}; cause: ${errorText(error.cause)}`
|
||||
: `${error.name}: ${error.message}`;
|
||||
function boundedHttpBody(text: string): string {
|
||||
const normalized = text.trim();
|
||||
if (normalized.length <= HTTP_ERROR_BODY_MAX_LENGTH) {
|
||||
return normalized;
|
||||
}
|
||||
return String(error);
|
||||
}
|
||||
|
||||
function parseJsonObject(raw: string, subcommand: string): Record<string, unknown> {
|
||||
const parsed = JSON.parse(raw) as unknown;
|
||||
if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
|
||||
throw new Error(`ktx-daemon ${subcommand} returned non-object JSON`);
|
||||
}
|
||||
return parsed as Record<string, unknown>;
|
||||
}
|
||||
|
||||
function isCommandNotFound(error: unknown): boolean {
|
||||
return (
|
||||
error instanceof Error &&
|
||||
('code' in error || 'errno' in error) &&
|
||||
((error as { code?: unknown }).code === 'ENOENT' || (error as { errno?: unknown }).errno === 'ENOENT')
|
||||
);
|
||||
}
|
||||
|
||||
function defaultSentenceTransformersProcessCommands(): SentenceTransformersProcessCommand[] {
|
||||
const venvBin =
|
||||
process.platform === 'win32' ? join('.venv', 'Scripts', 'ktx-daemon.exe') : join('.venv', 'bin', 'ktx-daemon');
|
||||
const repoVenvBin =
|
||||
process.platform === 'win32'
|
||||
? join('ktx', '.venv', 'Scripts', 'ktx-daemon.exe')
|
||||
: join('ktx', '.venv', 'bin', 'ktx-daemon');
|
||||
return [
|
||||
{ command: 'ktx-daemon', args: [] },
|
||||
{ command: venvBin, args: [] },
|
||||
{ command: repoVenvBin, args: [] },
|
||||
];
|
||||
}
|
||||
|
||||
function runSentenceTransformersProcessCommand(
|
||||
options: SentenceTransformersProcessCommand & {
|
||||
cwd?: string;
|
||||
env?: NodeJS.ProcessEnv;
|
||||
},
|
||||
): SentenceTransformersJsonRunner {
|
||||
return async (
|
||||
subcommand: SentenceTransformersCommand,
|
||||
payload: Record<string, unknown>,
|
||||
): Promise<Record<string, unknown>> =>
|
||||
new Promise((resolve, reject) => {
|
||||
const child = spawn(options.command, [...options.args, subcommand], {
|
||||
cwd: options.cwd,
|
||||
env: { ...process.env, ...options.env },
|
||||
stdio: ['pipe', 'pipe', 'pipe'],
|
||||
});
|
||||
const stdout: Buffer[] = [];
|
||||
const stderr: Buffer[] = [];
|
||||
|
||||
child.stdout.on('data', (chunk: Buffer) => stdout.push(chunk));
|
||||
child.stderr.on('data', (chunk: Buffer) => stderr.push(chunk));
|
||||
child.on('error', reject);
|
||||
child.on('close', (code) => {
|
||||
const stdoutText = Buffer.concat(stdout).toString('utf8').trim();
|
||||
const stderrText = Buffer.concat(stderr).toString('utf8').trim();
|
||||
if (code !== 0) {
|
||||
reject(new Error(`ktx-daemon ${subcommand} failed: ${stderrText || `exit code ${code}`}`));
|
||||
return;
|
||||
}
|
||||
try {
|
||||
resolve(parseJsonObject(stdoutText, subcommand));
|
||||
} catch (error) {
|
||||
reject(error);
|
||||
}
|
||||
});
|
||||
child.stdin.end(`${JSON.stringify(payload)}\n`);
|
||||
});
|
||||
}
|
||||
|
||||
function runSentenceTransformersProcessJson(options: {
|
||||
commands: SentenceTransformersProcessCommand[];
|
||||
cwd?: string;
|
||||
env?: NodeJS.ProcessEnv;
|
||||
}): SentenceTransformersJsonRunner {
|
||||
return async (
|
||||
subcommand: SentenceTransformersCommand,
|
||||
payload: Record<string, unknown>,
|
||||
): Promise<Record<string, unknown>> => {
|
||||
const errors: string[] = [];
|
||||
for (const command of options.commands) {
|
||||
try {
|
||||
return await runSentenceTransformersProcessCommand({
|
||||
...command,
|
||||
cwd: options.cwd,
|
||||
env: options.env,
|
||||
})(subcommand, payload);
|
||||
} catch (error) {
|
||||
errors.push(`${command.command}: ${errorText(error)}`);
|
||||
if (!isCommandNotFound(error)) {
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
throw new Error(`ktx-daemon ${subcommand} failed: ${errors.join('; ')}`);
|
||||
};
|
||||
return `${normalized.slice(0, HTTP_ERROR_BODY_MAX_LENGTH)}...`;
|
||||
}
|
||||
|
||||
class OpenAIEmbeddingProvider implements KtxEmbeddingProvider {
|
||||
|
|
@@ -228,9 +118,7 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
|
|||
private readonly fetch: FetchFn;
|
||||
private readonly baseURL: string;
|
||||
private readonly pathPrefix: string;
|
||||
private readonly runJson: SentenceTransformersJsonRunner;
|
||||
private readonly startupProbe: Promise<void>;
|
||||
private useProcessRunner = false;
|
||||
|
||||
constructor(config: KtxEmbeddingConfig, deps: KtxEmbeddingProviderDeps) {
|
||||
if (!config.sentenceTransformers?.baseURL) {
|
||||
|
|
@@ -241,15 +129,6 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
|
|||
this.fetch = deps.fetch ?? fetch;
|
||||
this.baseURL = config.sentenceTransformers.baseURL;
|
||||
this.pathPrefix = config.sentenceTransformers.pathPrefix ?? '/api';
|
||||
this.runJson =
|
||||
deps.runSentenceTransformersJson ??
|
||||
runSentenceTransformersProcessJson({
|
||||
commands: deps.sentenceTransformersCommand
|
||||
? [{ command: deps.sentenceTransformersCommand, args: deps.sentenceTransformersArgs ?? [] }]
|
||||
: defaultSentenceTransformersProcessCommands(),
|
||||
cwd: deps.sentenceTransformersCwd,
|
||||
env: deps.sentenceTransformersEnv,
|
||||
});
|
||||
this.startupProbe = this.requestSingle('__ktx_embedding_probe__').then((embedding) => {
|
||||
assertVectorDimensions(embedding, this.dimensions, 'sentence-transformers');
|
||||
});
|
||||
|
|
@@ -264,7 +143,7 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
|
|||
async embedMany(texts: string[]): Promise<number[][]> {
|
||||
assertBatchSize(texts, this.maxBatchSize);
|
||||
await this.startupProbe;
|
||||
const response = await this.requestJson('embedding-compute-bulk', '/embeddings/compute-bulk', { texts });
|
||||
const response = await this.requestJson('/embeddings/compute-bulk', { texts });
|
||||
if (
|
||||
!response ||
|
||||
typeof response !== 'object' ||
|
||||
|
|
@@ -285,37 +164,15 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
|
|||
}
|
||||
|
||||
private async requestSingle(text: string): Promise<number[]> {
|
||||
const response = await this.requestJson('embedding-compute', '/embeddings/compute', { text });
|
||||
const response = await this.requestJson('/embeddings/compute', { text });
|
||||
if (!response || typeof response !== 'object' || !('embedding' in response) || !Array.isArray(response.embedding)) {
|
||||
throw new Error('Embedding provider sentence-transformers returned malformed single response');
|
||||
}
|
||||
return response.embedding;
|
||||
}
|
||||
|
||||
private async requestJson(
|
||||
command: SentenceTransformersCommand,
|
||||
path: string,
|
||||
body: Record<string, unknown>,
|
||||
): Promise<Record<string, unknown>> {
|
||||
if (this.useProcessRunner) {
|
||||
return this.runJson(command, body);
|
||||
}
|
||||
|
||||
try {
|
||||
return await this.postJson(path, body);
|
||||
} catch (httpError) {
|
||||
try {
|
||||
const response = await this.runJson(command, body);
|
||||
this.useProcessRunner = true;
|
||||
return response;
|
||||
} catch (processError) {
|
||||
throw new Error(
|
||||
`Embedding provider sentence-transformers local HTTP request failed (${errorText(
|
||||
httpError,
|
||||
)}) and ktx-daemon fallback failed (${errorText(processError)})`,
|
||||
);
|
||||
}
|
||||
}
|
||||
private async requestJson(path: string, body: Record<string, unknown>): Promise<Record<string, unknown>> {
|
||||
return await this.postJson(path, body);
|
||||
}
|
||||
|
||||
private async postJson(path: string, body: Record<string, unknown>): Promise<Record<string, unknown>> {
|
||||
|
|
@@ -325,7 +182,12 @@ class SentenceTransformersEmbeddingProvider implements KtxEmbeddingProvider {
|
|||
body: JSON.stringify(body),
|
||||
});
|
||||
if (!response.ok) {
|
||||
throw new Error(`Embedding provider sentence-transformers request failed with HTTP ${response.status}`);
|
||||
const bodyText = boundedHttpBody(await response.text());
|
||||
throw new Error(
|
||||
`Embedding provider sentence-transformers request failed with HTTP ${response.status}${
|
||||
bodyText ? `: ${bodyText}` : ''
|
||||
}`,
|
||||
);
|
||||
}
|
||||
const parsed = (await response.json()) as unknown;
|
||||
if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
|
||||
|
|
|
|||