mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-22 08:38:08 +02:00
feat: add codex llm backend for ktx runtime work (#253)
* feat: add codex sdk runner foundation

* feat: parse codex runtime events

* feat: expose codex runtime mcp tools

* feat: add codex llm runtime

* feat: wire codex llm backend

* test: avoid Array.fromAsync in codex runner test

* docs: document codex llm backend

* fix: tighten codex runtime config ownership

* fix: use codex sdk env and thread options

* fix: parse codex sdk event shapes

* test: add codex backend live smoke

* docs: clarify codex backend isolation

* fix: drive codex loop metrics from mcp events

* fix: enforce codex local step budget

* docs: disclose codex isolation limits

* fix: count all codex agent steps and stream step callbacks live

The agent-loop step budget only counted completed mcp_tool_call items, so
built-in command_execution steps (which the public Codex SDK/CLI surface can
still expose) never decremented the budget, letting ingest/reconciliation run
past stepBudget until Codex stopped on its own. onStepFinish was also replayed
only after the whole stream drained, so live work_unit_step / reconciliation
progress appeared stuck until the Codex process exited.

collectEvents is now the single live step accumulator: it counts every
completed agent-action item via a shared isCompletedAgentStep predicate
(command_execution, mcp_tool_call, file_change, web_search), fires
onStepFinish as each step completes, and enforces the budget on that broader
count. A no-tool turn still counts as one step. toolFailures stays
MCP-specific, since a non-zero command exit is normal agent exploration, not a
loop failure.

* test: align ingest llm-guard assertions with codex backend

The skip-llm ingest guard message now lists codex as a valid backend and
mentions a Claude Code/Codex session plus a codex setup hint, but this slow
suite test still asserted the pre-codex wording. Update it to match the
production message (already covered by the local-bundle-runtime unit test) and
add the codex setup-line assertion.

* fix: treat codex error:null tool calls as success

The Codex SDK serializes error: null on successful mcp_tool_call items, so the
failure check (item.error !== undefined) flagged every successful tool call as
failed with the empty-payload default "Codex turn failed". This killed every
ingest work unit under the codex backend before it could produce a patch.

Key on status === 'failed' (authoritative, always set) and only treat a
populated error object as a failure. Add a regression test built from a
verbatim real-SDK event capture.

* fix: default codex backend to gpt-5.5 and report real probe errors

The previous default gpt-5.3-codex is an API-key-only model that the OpenAI
API rejects under ChatGPT-account (subscription) auth, so codex status/setup
failed with a misleading "authentication is not usable" message even though
auth was fine.

- Default codex model is now gpt-5.5 (works on both subscription and API-key
  auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and
  keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
- runCodexAuthProbe now distinguishes "model not available" from an auth
  failure and surfaces the real API error: collectEvents retains stream events
  when the SDK throws on a non-zero exit, and the API error JSON envelope is
  unwrapped to its human-readable message.
- The Codex isolation warning now renders inside the clack setup frame.
- Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.

* fix: require llm.models.default in status and match codex probe remediation

Status reported a project ready when a non-none LLM backend was configured
without llm.models.default, but the runtime (resolveModelSlots) hard-requires
it, so ingest/scan/memory threw after `ktx status` said the project was
usable. buildLlmStatus now fails for any non-none backend missing
models.default and no longer invents a fallback model for claude-code/codex.

Codex probe failures now carry a category-matched fix: a model-access failure
steers the user at llm.models.default instead of the auth/install remediation.
runCodexAuthProbe returns the fix and status consumes it; the message stays
self-sufficient so setup output is unchanged.

Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx
states --llm-model only accepts codex/default or gpt-*/codex-* ids.

Repaired four doctor fixtures that configured a backend without models.default
(the now-correctly-blocked config) and added coverage for the new behavior.
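
The step-accounting and tool-failure rules described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SketchItem`, `accumulateSteps`, and `isFailedMcpToolCall` are hypothetical names, the item shape is an assumption rather than the Codex SDK's real types, and the sketch assumes it only receives items that have already completed. Only `isCompletedAgentStep`, the four item types, `status === 'failed'`, and the `error: null` behaviour come from the message itself.

```typescript
// Hypothetical item shape for illustration; NOT the Codex SDK's real types.
interface SketchItem {
  type: string;
  status?: 'in_progress' | 'completed' | 'failed';
  error?: { message: string } | null;
}

// Item types the commit message lists as agent actions.
const AGENT_STEP_TYPES: ReadonlySet<string> = new Set([
  'command_execution',
  'mcp_tool_call',
  'file_change',
  'web_search',
]);

// Assumes the caller only passes items that have already completed.
function isCompletedAgentStep(item: SketchItem): boolean {
  return AGENT_STEP_TYPES.has(item.type);
}

// status is treated as authoritative; `error: null` on a successful call is
// not a failure, only a populated error object is.
function isFailedMcpToolCall(item: SketchItem): boolean {
  if (item.type !== 'mcp_tool_call') return false;
  return item.status === 'failed' || (item.error !== undefined && item.error !== null);
}

interface StepSummary {
  steps: number;
  toolFailures: number;
  budgetExhausted: boolean;
}

function accumulateSteps(
  items: readonly SketchItem[],
  stepBudget: number,
  onStepFinish: (stepNumber: number) => void,
): StepSummary {
  let steps = 0;
  let toolFailures = 0;
  let budgetExhausted = false;
  for (const item of items) {
    if (!isCompletedAgentStep(item)) continue;
    steps += 1;
    // Failures stay MCP-specific: a failed command_execution is not counted.
    if (isFailedMcpToolCall(item)) toolFailures += 1;
    // Fired as each step completes, not after the whole stream has drained.
    onStepFinish(steps);
    if (steps >= stepBudget) {
      budgetExhausted = true;
      break;
    }
  }
  // A turn with no tool activity still counts as one step.
  return { steps: Math.max(steps, 1), toolFailures, budgetExhausted };
}
```

Counting on the broader predicate while keeping the failure check MCP-only mirrors the split the message describes: every agent action spends budget, but only a failed MCP tool call is treated as a loop failure.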
parent 74c6076b72
commit 494618ab14
41 changed files with 2544 additions and 30 deletions
@@ -3,6 +3,9 @@ import { writeFile } from 'node:fs/promises';
 import { promisify } from 'node:util';
 import { resolveLocalKtxLlmConfig } from './context/llm/local-config.js';
 import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import { formatCodexIsolationWarning } from './context/llm/codex-isolation.js';
+import { runCodexAuthProbe } from './context/llm/codex-runtime.js';
+import { DEFAULT_CODEX_MODEL } from './context/llm/codex-models.js';
 import { resolveKtxConfigReference } from './context/core/config-reference.js';
 import { type KtxProjectConfig, type KtxProjectLlmConfig, serializeKtxProjectConfig } from './context/project/config.js';
 import { loadKtxProject } from './context/project/project.js';
@@ -56,7 +59,7 @@ export interface AnthropicModelChoice {
   recommended: boolean;
 }
 
-export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code';
+export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code' | 'codex';
 
 /** @internal */
 export interface KtxSetupModelPromptAdapter {
@@ -82,6 +85,7 @@ export interface KtxSetupModelDeps {
     model: string;
     env?: NodeJS.ProcessEnv;
   }) => Promise<{ ok: true } | { ok: false; message: string }>;
+  codexAuthProbe?: (input: { projectDir: string; model: string }) => Promise<{ ok: true } | { ok: false; message: string }>;
   readGcloudProject?: () => Promise<string | undefined>;
   listGcloudProjects?: () => Promise<GcloudProjectChoice[]>;
   spinner?: () => KtxCliSpinner;
@@ -110,6 +114,20 @@ const CLAUDE_CODE_MODELS: AnthropicModelChoice[] = [
   { id: 'haiku', label: 'Claude Haiku', recommended: false },
 ];
 
+// Curated Codex models from OpenAI's current lineup that work under both
+// ChatGPT-account (subscription) and API-key auth. Intentionally omitted:
+// the `*-codex` ids (e.g. gpt-5.3-codex, gpt-5.2-codex) are API-key-only and
+// fail on ChatGPT-account auth, and gpt-5.3-codex-spark is a ChatGPT-Pro-only
+// research preview. Codex resolves real availability per account at runtime
+// (its binary remote-fetches the model list), so this is a convenience
+// shortlist only — the manual-entry option accepts any id your account's
+// `codex` picker exposes, and the auth probe reports an unsupported choice.
+const CODEX_MODELS: AnthropicModelChoice[] = [
+  { id: 'gpt-5.5', label: 'GPT-5.5', recommended: true },
+  { id: 'gpt-5.4', label: 'GPT-5.4', recommended: false },
+  { id: 'gpt-5.4-mini', label: 'GPT-5.4 mini', recommended: false },
+];
+
 const HIDDEN_ANTHROPIC_MODEL_PATTERNS = [
   /^claude-sonnet-4$/i,
   /^claude-opus-4$/i,
@@ -272,7 +290,12 @@ export function isKtxSetupLlmConfigReady(config: KtxProjectLlmConfig): boolean {
     return typeof resolved.vertex?.location === 'string' && resolved.vertex.location.trim().length > 0;
   }
 
-  return resolved.backend === 'anthropic' || resolved.backend === 'gateway' || resolved.backend === 'claude-code';
+  return (
+    resolved.backend === 'anthropic' ||
+    resolved.backend === 'gateway' ||
+    resolved.backend === 'claude-code' ||
+    resolved.backend === 'codex'
+  );
 }
 
 function hasUsableConfiguredLlm(config: KtxProjectConfig): boolean {
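
The hunk above makes `codex` count as a ready backend during setup. The commit message separately says that status now fails for any non-none backend that lacks `llm.models.default`. A minimal sketch of that rule, assuming a simplified config shape: `SketchLlmConfig`, `SketchLlmStatus`, and `sketchLlmStatus` are hypothetical names, the message and fix strings are invented for illustration, and the real `buildLlmStatus` is not shown on this page.

```typescript
// Hypothetical, simplified config shape; the real KtxProjectLlmConfig has more fields.
interface SketchLlmConfig {
  provider: { backend: 'none' | 'anthropic' | 'vertex' | 'gateway' | 'claude-code' | 'codex' };
  models?: { default?: string };
}

type SketchLlmStatus =
  | { ok: true; model?: string }
  | { ok: false; message: string; fix: string };

function sketchLlmStatus(config: SketchLlmConfig): SketchLlmStatus {
  const backend = config.provider.backend;
  // Assumption: with no LLM backend there is no model to validate.
  if (backend === 'none') return { ok: true };

  // No per-backend fallback model is invented here; a missing default fails.
  const model = config.models?.default?.trim();
  if (!model) {
    return {
      ok: false,
      message: `LLM backend "${backend}" is configured without llm.models.default`,
      fix: 'Set llm.models.default in the project config.',
    };
  }
  return { ok: true, model };
}
```

Failing at status time keeps `ktx status` consistent with the runtime, which the message says hard-requires the default model.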
@@ -284,7 +307,8 @@ function buildProjectLlmConfig(
   provider:
     | { backend: 'anthropic'; credentialRef: string }
     | { backend: 'vertex'; vertex: { project?: string; location: string } }
-    | { backend: 'claude-code' },
+    | { backend: 'claude-code' }
+    | { backend: 'codex' },
   model: string,
 ): KtxProjectLlmConfig {
   if (provider.backend === 'claude-code') {
@@ -295,6 +319,14 @@ function buildProjectLlmConfig(
     };
   }
 
+  if (provider.backend === 'codex') {
+    return {
+      provider: { backend: 'codex' },
+      models: { ...existing.models, default: model },
+      promptCaching: existing.promptCaching,
+    };
+  }
+
   if (provider.backend === 'vertex') {
     return {
       provider: {
@@ -515,6 +547,7 @@ async function chooseBackend(
     message: 'Which LLM provider should KTX use?',
     options: [
       { value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
+      { value: 'codex', label: 'Codex subscription' },
       { value: 'anthropic', label: 'Anthropic API key' },
       { value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
       { value: 'back', label: 'Back' },
@@ -525,7 +558,7 @@ async function chooseBackend(
   }
   return {
     status: 'ready',
-    backend: choice === 'vertex' || choice === 'claude-code' ? choice : 'anthropic',
+    backend: choice === 'vertex' || choice === 'claude-code' || choice === 'codex' ? choice : 'anthropic',
     prompted: true,
   };
 }
@@ -884,12 +917,51 @@ async function chooseClaudeCodeModel(args: KtxSetupModelArgs, deps: KtxSetupMode
   return { status: 'ready', model: choice };
 }
 
+async function chooseCodexModel(args: KtxSetupModelArgs, deps: KtxSetupModelDeps): Promise<ChooseModelResult> {
+  const providedModel = requestedModel(args);
+  if (providedModel) {
+    return { status: 'ready', model: providedModel };
+  }
+  if (args.inputMode === 'disabled') {
+    return { status: 'ready', model: DEFAULT_CODEX_MODEL };
+  }
+
+  const prompts = deps.prompts ?? createPromptAdapter();
+  const choice = await prompts.select({
+    message: `Which Codex model should KTX use?\n\n${ANTHROPIC_MODEL_PROMPT_CONTEXT}`,
+    options: [
+      ...CODEX_MODELS.map((model) => ({
+        value: model.id,
+        label: model.label,
+        ...(model.recommended ? { hint: 'recommended' } : {}),
+      })),
+      { value: 'manual', label: 'Enter a Codex model ID manually' },
+      { value: 'back', label: 'Back' },
+    ],
+  });
+  if (choice === 'back') {
+    return { status: 'back' };
+  }
+  if (choice === 'manual') {
+    const manual = await prompts.text({
+      message: withTextInputNavigation('Codex model ID'),
+      placeholder: CODEX_MODELS.find((model) => model.recommended)?.id ?? CODEX_MODELS[0]?.id,
+    });
+    if (manual === undefined) {
+      return { status: 'back' };
+    }
+    return manual.trim() ? { status: 'ready', model: manual.trim() } : { status: 'missing-input' };
+  }
+  return { status: 'ready', model: choice };
+}
+
 async function persistLlmConfig(
   projectDir: string,
   provider:
     | { backend: 'anthropic'; credentialRef: string }
     | { backend: 'vertex'; vertex: { project?: string; location: string } }
-    | { backend: 'claude-code' },
+    | { backend: 'claude-code' }
+    | { backend: 'codex' },
   model: string,
 ): Promise<void> {
   const project = await loadKtxProject({ projectDir });
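
The picker logic in `chooseCodexModel` can be isolated into two pure helpers, which makes the option list and the manual-entry outcomes easier to check without a prompt adapter. This is a sketch under assumptions: `buildCodexModelOptions`, `resolveManualModel`, `ModelChoice`, and `SelectOption` are hypothetical names that do not appear in the diff; the model list and labels are copied from it.

```typescript
interface ModelChoice {
  id: string;
  label: string;
  recommended: boolean;
}

// Hypothetical option shape; the real prompt adapter's type is not shown in the diff.
interface SelectOption {
  value: string;
  label: string;
  hint?: string;
}

// Curated shortlist copied from the diff.
const CODEX_MODELS: ModelChoice[] = [
  { id: 'gpt-5.5', label: 'GPT-5.5', recommended: true },
  { id: 'gpt-5.4', label: 'GPT-5.4', recommended: false },
  { id: 'gpt-5.4-mini', label: 'GPT-5.4 mini', recommended: false },
];

// Curated models first, then the manual-entry and back options.
function buildCodexModelOptions(models: readonly ModelChoice[]): SelectOption[] {
  return [
    ...models.map((model) => ({
      value: model.id,
      label: model.label,
      ...(model.recommended ? { hint: 'recommended' } : {}),
    })),
    { value: 'manual', label: 'Enter a Codex model ID manually' },
    { value: 'back', label: 'Back' },
  ];
}

type ManualResult =
  | { status: 'ready'; model: string }
  | { status: 'back' }
  | { status: 'missing-input' };

// undefined means the prompt was cancelled; whitespace-only input is missing input.
function resolveManualModel(input: string | undefined): ManualResult {
  if (input === undefined) return { status: 'back' };
  const trimmed = input.trim();
  return trimmed ? { status: 'ready', model: trimmed } : { status: 'missing-input' };
}
```

For example, `resolveManualModel('  gpt-5.4 ')` yields a ready result with the trimmed id, while a cancelled prompt maps to `back`, matching the three outcomes in the diff.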
@@ -1031,6 +1103,32 @@ export async function runKtxSetupAnthropicModelStep(
       return { status: 'ready', projectDir: args.projectDir };
     }
 
+    if (backendChoice.backend === 'codex') {
+      const model = await chooseCodexModel(backendArgs, deps);
+      if (model.status === 'back' && backendChoice.prompted) {
+        attemptArgs = buildInteractiveRetryArgs(args);
+        continue;
+      }
+      if (model.status === 'invalid-credential') {
+        return { status: 'failed', projectDir: args.projectDir };
+      }
+      if (model.status !== 'ready') {
+        return { status: model.status, projectDir: args.projectDir };
+      }
+      const probe = deps.codexAuthProbe ?? runCodexAuthProbe;
+      const health = await probe({ projectDir: args.projectDir, model: model.model });
+      if (!health.ok) {
+        io.stderr.write(`${health.message}\n`);
+        return { status: 'failed', projectDir: args.projectDir };
+      }
+      // Prefix the clack gutter so the warning sits inside the setup frame
+      // instead of breaking out of it; kept on stderr for scripted runs.
+      io.stderr.write(`│ ${formatCodexIsolationWarning()}\n`);
+      await persistLlmConfig(args.projectDir, { backend: 'codex' }, model.model);
+      io.stdout.write(`│ LLM ready: yes (codex, ${model.model})\n`);
+      return { status: 'ready', projectDir: args.projectDir };
+    }
+
     const credential = await chooseCredentialRef(backendArgs, io, deps);
     if (credential.status === 'back' && backendChoice.prompted) {
       attemptArgs = buildInteractiveRetryArgs(args);
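
The setup step above only prints `health.message` when the probe fails. The commit message adds that the probe unwraps the API error JSON envelope to its readable message and pairs each failure with a category-matched fix. One way this could look is sketched below. Everything here is an assumption made for illustration: `unwrapApiError`, `classifyProbeFailure`, the `category` field, the regex heuristic, and the fix strings are hypothetical and are not taken from `runCodexAuthProbe`. The `{"error": {"message": ...}}` envelope follows the usual OpenAI API error shape.

```typescript
type SketchProbeResult =
  | { ok: true }
  | { ok: false; category: 'model-access' | 'auth'; message: string; fix: string };

// Unwrap an {"error": {"message": "..."}} envelope to its message; otherwise
// return the raw text unchanged (trimmed).
function unwrapApiError(raw: string): string {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === 'object' && parsed !== null && 'error' in parsed) {
      const err = (parsed as { error: unknown }).error;
      if (typeof err === 'object' && err !== null && 'message' in err) {
        const message = (err as { message: unknown }).message;
        if (typeof message === 'string' && message.trim()) return message.trim();
      }
    }
  } catch {
    // Not JSON: fall through and use the raw text.
  }
  return raw.trim();
}

function classifyProbeFailure(raw: string, model: string): SketchProbeResult {
  const message = unwrapApiError(raw);
  // Hypothetical heuristic: the real probe's detection logic is not shown in the diff.
  const modelProblem =
    /model/i.test(message) && /(not supported|not available|does not exist)/i.test(message);
  if (modelProblem) {
    return {
      ok: false,
      category: 'model-access',
      message,
      fix: `Set llm.models.default to a model this account can use (tried ${model}).`,
    };
  }
  return {
    ok: false,
    category: 'auth',
    message,
    fix: 'Check the local Codex install and authentication.',
  };
}
```

Keeping the fix separate from the message lets a status command show targeted remediation while the setup output stays unchanged, which is the split the commit message describes.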