mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-10 08:05:14 +02:00
* feat: add codex sdk runner foundation
* feat: parse codex runtime events
* feat: expose codex runtime mcp tools
* feat: add codex llm runtime
* feat: wire codex llm backend
* test: avoid Array.fromAsync in codex runner test
* docs: document codex llm backend
* fix: tighten codex runtime config ownership
* fix: use codex sdk env and thread options
* fix: parse codex sdk event shapes
* test: add codex backend live smoke
* docs: clarify codex backend isolation
* fix: drive codex loop metrics from mcp events
* fix: enforce codex local step budget
* docs: disclose codex isolation limits
* fix: count all codex agent steps and stream step callbacks live

The agent-loop step budget only counted completed mcp_tool_call items, so built-in command_execution steps (which the public Codex SDK/CLI surface can still expose) never decremented the budget, letting ingest/reconciliation run past stepBudget until Codex stopped on its own. onStepFinish was also replayed only after the whole stream drained, so live work_unit_step / reconciliation progress appeared stuck until the Codex process exited.

collectEvents is now the single live step accumulator: it counts every completed agent-action item via a shared isCompletedAgentStep predicate (command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish as each step completes, and enforces the budget on that broader count. A no-tool turn still counts as one step. toolFailures stays MCP-specific, since a non-zero command exit is normal agent exploration, not a loop failure.

* test: align ingest llm-guard assertions with codex backend

The skip-llm ingest guard message now lists codex as a valid backend and mentions a Claude Code/Codex session plus a codex setup hint, but this slow suite test still asserted the pre-codex wording. Update it to match the production message (already covered by the local-bundle-runtime unit test) and add the codex setup-line assertion.
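The step-counting rule from the "count all codex agent steps" entry above can be sketched roughly as follows. This is a minimal, synchronous illustration, not the repository's code: the `SketchEvent` shape, the `countSteps` helper, and the body of `isCompletedAgentStep` are assumptions made for the example, and the real accumulator works on a live event stream.

```typescript
// Simplified, hypothetical event shape; real Codex SDK events carry more fields.
type SketchEvent = {
  type: string; // e.g. 'item.started' or 'item.completed'
  item?: { type: string };
};

// True for a finished agent action of any counted kind, not only MCP tool calls.
function isCompletedAgentStep(event: SketchEvent): boolean {
  if (event.type !== 'item.completed' || event.item === undefined) return false;
  switch (event.item.type) {
    case 'command_execution':
    case 'mcp_tool_call':
    case 'file_change':
    case 'web_search':
      return true;
    default:
      return false;
  }
}

type StepSummary = { steps: number; budgetExhausted: boolean };

// Hypothetical helper: counts steps, reports each one as it completes,
// and stops consuming events once the budget is reached.
function countSteps(
  events: readonly SketchEvent[],
  stepBudget: number,
  onStepFinish: (step: number) => void,
): StepSummary {
  let steps = 0;
  for (const event of events) {
    if (!isCompletedAgentStep(event)) continue;
    steps += 1;
    onStepFinish(steps); // fired per step, not after the stream drains
    if (steps >= stepBudget) return { steps, budgetExhausted: true };
  }
  if (steps === 0) {
    // A turn with no tool activity still counts as one step.
    steps = 1;
    onStepFinish(steps);
  }
  return { steps, budgetExhausted: steps >= stepBudget };
}
```

Counting on the broader predicate is what lets a built-in command_execution step decrement the budget in the same way as an MCP tool call.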
* fix: treat codex error:null tool calls as success

The Codex SDK serializes error: null on successful mcp_tool_call items, so the failure check (item.error !== undefined) flagged every successful tool call as failed with the empty-payload default "Codex turn failed". This killed every ingest work unit under the codex backend before it could produce a patch. Key on status === 'failed' (authoritative, always set) and only treat a populated error object as a failure. Add a regression test built from a verbatim real-SDK event capture.

* fix: default codex backend to gpt-5.5 and report real probe errors

The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API rejects under ChatGPT-account (subscription) auth, so codex status/setup failed with a misleading "authentication is not usable" message even though auth was fine.

- Default codex model is now gpt-5.5 (works on both subscription and API-key auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
- runCodexAuthProbe now distinguishes "model not available" from an auth failure and surfaces the real API error: collectEvents retains stream events when the SDK throws on a non-zero exit, and the API error JSON envelope is unwrapped to its human-readable message.
- The Codex isolation warning now renders inside the clack setup frame.
- Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.

* fix: require llm.models.default in status and match codex probe remediation

Status reported a project ready when a non-none LLM backend was configured without llm.models.default, but the runtime (resolveModelSlots) hard-requires it, so ingest/scan/memory threw after `ktx status` said the project was usable. buildLlmStatus now fails for any non-none backend missing models.default and no longer invents a fallback model for claude-code/codex.
Codex probe failures now carry a category-matched fix: a model-access failure steers the user to llm.models.default instead of the auth/install remediation. runCodexAuthProbe returns the fix and status consumes it; the message stays self-sufficient so setup output is unchanged.

Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx states --llm-model only accepts codex/default or gpt-*/codex-* ids.

Repaired four doctor fixtures that configured a backend without models.default (the now-correctly-blocked config) and added coverage for the new behavior.
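For the error: null fix described earlier in this message, the difference between the old and new failure checks can be illustrated with a small sketch. The `SketchToolCall` type and both function names are hypothetical, and reading "populated error object" as "non-null with a non-empty message" is an assumption of this example rather than something the commit states.

```typescript
// Hypothetical, trimmed-down view of an mcp_tool_call item.
type SketchToolCall = {
  type: 'mcp_tool_call';
  status: 'in_progress' | 'completed' | 'failed';
  error?: { message?: string } | null;
};

// Old behavior as described in the commit: any defined `error`, including
// an explicit null, was treated as a failure.
function legacyIsFailedToolCall(item: SketchToolCall): boolean {
  return item.error !== undefined;
}

// New behavior, sketched: status is authoritative, and an error only counts
// when it actually carries a message (assumed meaning of "populated").
function isFailedToolCall(item: SketchToolCall): boolean {
  if (item.status === 'failed') return true;
  return (
    item.error != null &&
    typeof item.error.message === 'string' &&
    item.error.message.length > 0
  );
}

// A successful call in the shape the commit describes: `error: null`.
const successfulCall: SketchToolCall = {
  type: 'mcp_tool_call',
  status: 'completed',
  error: null,
};
```

With `successfulCall`, the legacy check returns `true` (null is not undefined) while the status-based check returns `false`, which is the regression the fix targets.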
73 lines
2.3 KiB
TypeScript
import { describe, expect, it, vi } from 'vitest';
import { z } from 'zod';
import {
  createCodexRuntimeMcpServer,
  startCodexRuntimeMcpServer,
} from '../../../src/context/llm/codex-mcp-runtime-server.js';

describe('Codex runtime MCP server', () => {
  it('registers runtime tools with markdown output', async () => {
    const registered = new Map<
      string,
      {
        config: { description?: string; inputSchema: unknown };
        handler: (input: Record<string, unknown>) => Promise<unknown>;
      }
    >();
    const server = createCodexRuntimeMcpServer({
      server: {
        registerTool(name, config, handler) {
          registered.set(name, { config, handler });
        },
      },
      toolSet: {
        wiki_search: {
          name: 'wiki_search',
          description: 'Search the wiki',
          inputSchema: z.object({ query: z.string() }),
          execute: vi.fn(async () => ({ markdown: 'result markdown', structured: { matches: 1 } })),
        },
      },
    });

    expect(server).toBeDefined();
    expect([...registered.keys()]).toEqual(['wiki_search']);
    expect(registered.get('wiki_search')?.config).toMatchObject({
      description: 'Search the wiki',
    });
    await expect(registered.get('wiki_search')?.handler({ query: 'revenue' })).resolves.toEqual({
      content: [{ type: 'text', text: 'result markdown' }],
      structuredContent: { matches: 1 },
    });
  });

  it('starts loopback HTTP MCP with a bearer token and reports the runtime URL', async () => {
    const close = vi.fn(async () => undefined);
    const runServer = vi.fn(async () => ({
      server: { address: () => ({ port: 4321 }) },
      close,
    }));

    const handle = await startCodexRuntimeMcpServer({
      projectDir: '/tmp/ktx-project',
      toolSet: {},
      runServer: runServer as never,
    });

    expect(handle.url).toBe('http://127.0.0.1:4321/mcp');
    expect(handle.bearerTokenEnvVar).toBe('KTX_CODEX_RUNTIME_MCP_TOKEN');
    expect(handle.bearerToken).toMatch(/^[a-f0-9]{64}$/);
    expect(runServer).toHaveBeenCalledWith(
      expect.objectContaining({
        projectDir: '/tmp/ktx-project',
        host: '127.0.0.1',
        port: 0,
        token: handle.bearerToken,
        allowedHosts: ['127.0.0.1', 'localhost'],
        allowedOrigins: [],
      }),
    );
    await handle.close();
    expect(close).toHaveBeenCalled();
  });
});
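The first test above expects a tool result of `{ markdown, structured }` to reach the MCP client as a text content item plus `structuredContent`. A minimal sketch of such a mapping might look like the following. The `toMcpToolResult` helper and both type names are hypothetical, introduced only to illustrate the shape the assertions check; the actual conversion inside `createCodexRuntimeMcpServer` is not shown in this file and may differ.

```typescript
// Hypothetical shape of what a runtime tool's execute() resolves to,
// inferred from the mocked wiki_search tool in the test.
type SketchRuntimeToolResult = {
  markdown: string;
  structured?: Record<string, unknown>;
};

// Hypothetical shape of the handler result asserted in the test.
type SketchMcpToolResult = {
  content: Array<{ type: 'text'; text: string }>;
  structuredContent?: Record<string, unknown>;
};

// Wraps the markdown as a single text content item and forwards the
// structured payload only when the tool provided one.
function toMcpToolResult(result: SketchRuntimeToolResult): SketchMcpToolResult {
  const mapped: SketchMcpToolResult = {
    content: [{ type: 'text', text: result.markdown }],
  };
  if (result.structured !== undefined) {
    mapped.structuredContent = result.structured;
  }
  return mapped;
}

const mappedExample = toMcpToolResult({
  markdown: 'result markdown',
  structured: { matches: 1 },
});
```

Here `mappedExample` has the same shape the test compares against with `resolves.toEqual`.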