feat: add codex llm backend for ktx runtime work (#253)

* feat: add codex sdk runner foundation
* feat: parse codex runtime events
* feat: expose codex runtime mcp tools
* feat: add codex llm runtime
* feat: wire codex llm backend
* test: avoid Array.fromAsync in codex runner test
* docs: document codex llm backend
* fix: tighten codex runtime config ownership
* fix: use codex sdk env and thread options
* fix: parse codex sdk event shapes
* test: add codex backend live smoke
* docs: clarify codex backend isolation
* fix: drive codex loop metrics from mcp events
* fix: enforce codex local step budget
* docs: disclose codex isolation limits
* fix: count all codex agent steps and stream step callbacks live

  The agent-loop step budget only counted completed mcp_tool_call items, so built-in command_execution steps (which the public Codex SDK/CLI surface can still expose) never decremented the budget, letting ingest/reconciliation run past stepBudget until Codex stopped on its own. onStepFinish was also replayed only after the whole stream drained, so live work_unit_step / reconciliation progress appeared stuck until the Codex process exited.

  collectEvents is now the single live step accumulator: it counts every completed agent-action item via a shared isCompletedAgentStep predicate (command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish as each step completes, and enforces the budget on that broader count. A no-tool turn still counts as one step. toolFailures stays MCP-specific, since a non-zero command exit is normal agent exploration, not a loop failure.

* test: align ingest llm-guard assertions with codex backend

  The skip-llm ingest guard message now lists codex as a valid backend and mentions a Claude Code/Codex session plus a codex setup hint, but this slow suite test still asserted the pre-codex wording. Update it to match the production message (already covered by the local-bundle-runtime unit test) and add the codex setup-line assertion.

* fix: treat codex error:null tool calls as success

  The Codex SDK serializes error: null on successful mcp_tool_call items, so the failure check (item.error !== undefined) flagged every successful tool call as failed with the empty-payload default "Codex turn failed". This killed every ingest work unit under the codex backend before it could produce a patch. Key on status === 'failed' (authoritative, always set) and only treat a populated error object as a failure. Add a regression test built from a verbatim real-SDK event capture.

* fix: default codex backend to gpt-5.5 and report real probe errors

  The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API rejects under ChatGPT-account (subscription) auth, so codex status/setup failed with a misleading "authentication is not usable" message even though auth was fine.

  - Default codex model is now gpt-5.5 (works on both subscription and API-key auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
  - runCodexAuthProbe now distinguishes "model not available" from an auth failure and surfaces the real API error: collectEvents retains stream events when the SDK throws on a non-zero exit, and the API error JSON envelope is unwrapped to its human-readable message.
  - The Codex isolation warning now renders inside the clack setup frame.
  - Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.

* fix: require llm.models.default in status and match codex probe remediation

  Status reported a project ready when a non-none LLM backend was configured without llm.models.default, but the runtime (resolveModelSlots) hard-requires it, so ingest/scan/memory threw after `ktx status` said the project was usable. buildLlmStatus now fails for any non-none backend missing models.default and no longer invents a fallback model for claude-code/codex.

  Codex probe failures now carry a category-matched fix: a model-access failure steers the user at llm.models.default instead of the auth/install remediation. runCodexAuthProbe returns the fix and status consumes it; the message stays self-sufficient so setup output is unchanged.

  Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx states --llm-model only accepts codex/default or gpt-*/codex-* ids.

  Repaired four doctor fixtures that configured a backend without models.default (the now-correctly-blocked config) and added coverage for the new behavior.
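
The step-counting and tool-failure rules described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this commit: `isCompletedAgentStep` is named in the message, but its body here, the `CodexItem` and `CodexEvent` types, and the `isFailedMcpToolCall` and `countSteps` helpers are hypothetical and inferred only from the event shapes used in the tests.

```typescript
// Hypothetical event shapes, inferred from the test fixtures in this commit.
type CodexItem = {
  type: string;
  status?: string;
  error?: { message?: string } | null;
};
type CodexEvent = { type: string; item?: CodexItem };

// Item types the commit message lists as counting toward the step budget.
const AGENT_STEP_ITEM_TYPES = new Set([
  'command_execution',
  'mcp_tool_call',
  'file_change',
  'web_search',
]);

// Sketch of the shared predicate: every completed agent-action item is one step.
function isCompletedAgentStep(event: CodexEvent): boolean {
  return (
    event.type === 'item.completed' &&
    event.item !== undefined &&
    AGENT_STEP_ITEM_TYPES.has(event.item.type)
  );
}

// Hypothetical helper: only MCP tool calls can fail the loop. `status` is the
// authoritative signal; `error: null` on a successful call is not a failure.
function isFailedMcpToolCall(item: CodexItem): boolean {
  if (item.type !== 'mcp_tool_call') return false;
  if (item.status === 'failed') return true;
  return item.error !== undefined && item.error !== null;
}

// Hypothetical helper: a turn that completes without any agent step still
// counts as one step (assumption based on the "no-tool turn" sentence).
function countSteps(events: CodexEvent[]): number {
  const steps = events.filter(isCompletedAgentStep).length;
  const turnCompleted = events.some((event) => event.type === 'turn.completed');
  return steps === 0 && turnCompleted ? 1 : steps;
}
```

With these definitions, a `command_execution` item with `status: 'failed'` counts as a step but is not reported as a tool failure, which matches the intent stated for `toolFailures`.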
2026-07-01 08:59:39 +02:00 · 2026-06-02 13:57:11 +02:00 · 2026-06-02 13:57:11 +02:00 · 494618ab14
commit 494618ab14
parent 74c6076b72
41 changed files with 2544 additions and 30 deletions
--- a/packages/cli/test/context/ingest/local-bundle-runtime.test.ts
+++ b/packages/cli/test/context/ingest/local-bundle-runtime.test.ts
@@ -77,9 +77,10 @@ describe('createLocalBundleIngestRuntime', () => {
      }),
    ).toThrow(
      [
-        'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
-        'Configure a local Claude Code session or API-backed LLM, then rerun ingest:',
+        'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
+        'Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:',
        `  ktx setup --project-dir ${project.projectDir} --llm-backend claude-code --no-input`,
+        `  ktx setup --project-dir ${project.projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
        `  ktx setup --project-dir ${project.projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
      ].join('\n'),
    );
--- a/packages/cli/test/context/llm/codex-exec-events.test.ts
+++ b/packages/cli/test/context/llm/codex-exec-events.test.ts
@@ -0,0 +1,188 @@
+import { describe, expect, it } from 'vitest';
+import {
+  parseCodexExecEventLine,
+  summarizeCodexExecEvents,
+} from '../../../src/context/llm/codex-exec-events.js';
+
+describe('Codex exec event parsing', () => {
+  it('uses the completed turn as one step when no MCP tools run', () => {
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'thread.started', thread_id: 'thr_1' },
+        { type: 'turn.started' },
+        { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'hello from codex' } },
+        {
+          type: 'turn.completed',
+          usage: {
+            input_tokens: 12,
+            cached_input_tokens: 4,
+            output_tokens: 5,
+            reasoning_output_tokens: 2,
+          },
+        },
+      ],
+      { startedAt: 100, now: () => 125 },
+    );
+
+    expect(summary).toEqual({
+      finalText: 'hello from codex',
+      stopReason: 'natural',
+      usage: { inputTokens: 12, outputTokens: 5, totalTokens: 17 },
+      stepCount: 1,
+      stepBoundariesMs: [25],
+      toolCallCount: 0,
+      toolFailures: [],
+    });
+  });
+
+  it('uses completed MCP tool calls as loop steps', () => {
+    const offsets = [115, 140, 175];
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'turn.started' },
+        {
+          type: 'item.started',
+          item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'in_progress' },
+        },
+        {
+          type: 'item.completed',
+          item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'completed' },
+        },
+        {
+          type: 'item.started',
+          item: { id: 'call_2', type: 'mcp_tool_call', server: 'ktx', tool: 'lookup', arguments: {}, status: 'in_progress' },
+        },
+        {
+          type: 'item.completed',
+          item: {
+            id: 'call_2',
+            type: 'mcp_tool_call',
+            server: 'ktx',
+            tool: 'lookup',
+            arguments: {},
+            status: 'failed',
+            error: { message: 'denied' },
+          },
+        },
+        { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'done' } },
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, cached_input_tokens: 0, reasoning_output_tokens: 0 } },
+      ],
+      { startedAt: 100, now: () => offsets.shift() ?? 175 },
+    );
+
+    expect(summary).toEqual({
+      finalText: 'done',
+      stopReason: 'natural',
+      usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
+      stepCount: 2,
+      stepBoundariesMs: [15, 40],
+      toolCallCount: 2,
+      toolFailures: ['lookup: denied'],
+    });
+  });
+
+  it('does not treat a completed MCP tool call as failed when Codex sends error: null', () => {
+    // Captured verbatim from a real @openai/codex-sdk run: successful tool calls
+    // carry `error: null` and `result` alongside `status: "completed"`.
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      {
+        type: 'item.started',
+        item: {
+          id: 'item_1',
+          type: 'mcp_tool_call',
+          server: 'ktx',
+          tool: 'echo_value',
+          arguments: { value: 'ktx_codex_tool_ok' },
+          result: null,
+          error: null,
+          status: 'in_progress',
+        },
+      },
+      {
+        type: 'item.completed',
+        item: {
+          id: 'item_1',
+          type: 'mcp_tool_call',
+          server: 'ktx',
+          tool: 'echo_value',
+          arguments: { value: 'ktx_codex_tool_ok' },
+          result: { content: [{ type: 'text', text: 'echo:ktx_codex_tool_ok' }], structured_content: null },
+          error: null,
+          status: 'completed',
+        },
+      },
+      { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+      { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+    ]);
+
+    expect(summary.toolFailures).toEqual([]);
+    expect(summary.toolCallCount).toBe(1);
+  });
+
+  it('counts built-in command executions as loop steps without failing the loop', () => {
+    const offsets = [110, 130];
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'turn.started' },
+        { type: 'item.completed', item: { id: 'c1', type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
+        { type: 'item.completed', item: { id: 'c2', type: 'command_execution', command: 'cat missing', status: 'failed', exit_code: 1 } },
+        { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+        { type: 'turn.completed', usage: { input_tokens: 2, output_tokens: 1 } },
+      ],
+      { startedAt: 100, now: () => offsets.shift() ?? 130 },
+    );
+
+    expect(summary.stepCount).toBe(2);
+    expect(summary.stepBoundariesMs).toEqual([10, 30]);
+    // A non-zero command exit is normal agent exploration, not a runtime tool failure.
+    expect(summary.toolFailures).toEqual([]);
+    expect(summary.toolCallCount).toBe(0);
+  });
+
+  it('maps turn failures into error stop reason', () => {
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      { type: 'turn.failed', error: { message: 'Codex could not connect to required MCP server' } },
+    ]);
+
+    expect(summary.stopReason).toBe('error');
+    expect(summary.error?.message).toContain('Codex could not connect to required MCP server');
+  });
+
+  it('unwraps the Codex API error envelope into its human-readable message', () => {
+    // Codex serializes API errors as a JSON envelope inside the event message.
+    const apiError = JSON.stringify({
+      type: 'error',
+      status: 400,
+      error: {
+        type: 'invalid_request_error',
+        message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+      },
+    });
+    const summary = summarizeCodexExecEvents([
+      { type: 'thread.started', thread_id: 'thr_1' },
+      { type: 'turn.started' },
+      { type: 'error', message: apiError },
+      { type: 'turn.failed', error: { message: apiError } },
+    ]);
+
+    expect(summary.stopReason).toBe('error');
+    expect(summary.error?.message).toBe(
+      "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+    );
+  });
+
+  it('maps max-turns terminal reasons into budget stop reason when Codex emits one', () => {
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      { type: 'turn.completed', reason: 'max_turns', usage: { input_tokens: 1, output_tokens: 1 } },
+    ]);
+
+    expect(summary.stopReason).toBe('budget');
+  });
+
+  it('throws a clear error for malformed JSONL lines', () => {
+    expect(() => parseCodexExecEventLine('{not-json')).toThrow('Codex JSONL event stream was malformed');
+  });
+});
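
The envelope-unwrapping behavior exercised by the test above could look roughly like the sketch below. The helper name `unwrapCodexApiErrorMessage` is hypothetical and the fallback rules are assumptions; the only thing taken from the source is the envelope shape (`{ type, status, error: { type, message } }`) and the expectation that the inner `message` is surfaced.

```typescript
// Hypothetical helper, not the function from this commit. It returns the
// human-readable message from a JSON error envelope, or the raw string when
// the input is not an envelope of the expected shape.
function unwrapCodexApiErrorMessage(raw: string): string {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === 'object' && parsed !== null) {
      const inner = (parsed as { error?: unknown }).error;
      if (typeof inner === 'object' && inner !== null) {
        const message = (inner as { message?: unknown }).message;
        if (typeof message === 'string' && message.trim() !== '') {
          return message;
        }
      }
    }
  } catch {
    // Not JSON: fall through and return the raw message unchanged.
  }
  return raw;
}
```

Returning the raw string on any mismatch keeps plain messages such as `'boom'` intact, so only well-formed envelopes are rewritten.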
--- a/packages/cli/test/context/llm/codex-isolation.test.ts
+++ b/packages/cli/test/context/llm/codex-isolation.test.ts
@@ -0,0 +1,19 @@
+import { describe, expect, it } from 'vitest';
+import {
+  CODEX_ISOLATION_WARNING,
+  CODEX_ISOLATION_WARNING_FIX,
+  formatCodexIsolationWarning,
+} from '../../../src/context/llm/codex-isolation.js';
+
+describe('Codex isolation warning', () => {
+  it('documents the enforced and unenforced Codex isolation boundaries', () => {
+    expect(CODEX_ISOLATION_WARNING).toContain('runtime MCP server to the current ktx tool set');
+    expect(CODEX_ISOLATION_WARNING).toContain('disables Codex web search');
+    expect(CODEX_ISOLATION_WARNING).toContain('may still load user Codex config');
+    expect(CODEX_ISOLATION_WARNING).toContain('built-in command execution');
+    expect(CODEX_ISOLATION_WARNING_FIX).toContain('claude-code');
+    expect(formatCodexIsolationWarning()).toBe(
+      `${CODEX_ISOLATION_WARNING} ${CODEX_ISOLATION_WARNING_FIX}`,
+    );
+  });
+});
--- a/packages/cli/test/context/llm/codex-mcp-runtime-server.test.ts
+++ b/packages/cli/test/context/llm/codex-mcp-runtime-server.test.ts
@@ -0,0 +1,73 @@
+import { describe, expect, it, vi } from 'vitest';
+import { z } from 'zod';
+import {
+  createCodexRuntimeMcpServer,
+  startCodexRuntimeMcpServer,
+} from '../../../src/context/llm/codex-mcp-runtime-server.js';
+
+describe('Codex runtime MCP server', () => {
+  it('registers runtime tools with markdown output', async () => {
+    const registered = new Map<
+      string,
+      {
+        config: { description?: string; inputSchema: unknown };
+        handler: (input: Record<string, unknown>) => Promise<unknown>;
+      }
+    >();
+    const server = createCodexRuntimeMcpServer({
+      server: {
+        registerTool(name, config, handler) {
+          registered.set(name, { config, handler });
+        },
+      },
+      toolSet: {
+        wiki_search: {
+          name: 'wiki_search',
+          description: 'Search the wiki',
+          inputSchema: z.object({ query: z.string() }),
+          execute: vi.fn(async () => ({ markdown: 'result markdown', structured: { matches: 1 } })),
+        },
+      },
+    });
+
+    expect(server).toBeDefined();
+    expect([...registered.keys()]).toEqual(['wiki_search']);
+    expect(registered.get('wiki_search')?.config).toMatchObject({
+      description: 'Search the wiki',
+    });
+    await expect(registered.get('wiki_search')?.handler({ query: 'revenue' })).resolves.toEqual({
+      content: [{ type: 'text', text: 'result markdown' }],
+      structuredContent: { matches: 1 },
+    });
+  });
+
+  it('starts loopback HTTP MCP with a bearer token and reports the runtime URL', async () => {
+    const close = vi.fn(async () => undefined);
+    const runServer = vi.fn(async () => ({
+      server: { address: () => ({ port: 4321 }) },
+      close,
+    }));
+
+    const handle = await startCodexRuntimeMcpServer({
+      projectDir: '/tmp/ktx-project',
+      toolSet: {},
+      runServer: runServer as never,
+    });
+
+    expect(handle.url).toBe('http://127.0.0.1:4321/mcp');
+    expect(handle.bearerTokenEnvVar).toBe('KTX_CODEX_RUNTIME_MCP_TOKEN');
+    expect(handle.bearerToken).toMatch(/^[a-f0-9]{64}$/);
+    expect(runServer).toHaveBeenCalledWith(
+      expect.objectContaining({
+        projectDir: '/tmp/ktx-project',
+        host: '127.0.0.1',
+        port: 0,
+        token: handle.bearerToken,
+        allowedHosts: ['127.0.0.1', 'localhost'],
+        allowedOrigins: [],
+      }),
+    );
+    await handle.close();
+    expect(close).toHaveBeenCalled();
+  });
+});
--- a/packages/cli/test/context/llm/codex-models.test.ts
+++ b/packages/cli/test/context/llm/codex-models.test.ts
@@ -0,0 +1,17 @@
+import { describe, expect, it } from 'vitest';
+import { resolveCodexModel } from '../../../src/context/llm/codex-models.js';
+
+describe('resolveCodexModel', () => {
+  it.each([
+    ['codex', 'gpt-5.5'],
+    ['default', 'gpt-5.5'],
+    ['gpt-5.3-codex-spark', 'gpt-5.3-codex-spark'],
+    ['gpt-5.4', 'gpt-5.4'],
+  ])('maps %s to %s', (input, expected) => {
+    expect(resolveCodexModel(input)).toBe(expected);
+  });
+
+  it.each(['', '   ', 'sonnet', 'claude-sonnet-4-6'])('rejects %s', (input) => {
+    expect(() => resolveCodexModel(input)).toThrow('Unsupported Codex model');
+  });
+});
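
One way to satisfy the cases in the test above is sketched below. It is deliberately named `resolveCodexModelSketch` because it is not the implementation in `codex-models.ts`: the prefix rule follows the commit message's statement that only `codex`/`default` or `gpt-*`/`codex-*` ids are accepted, while the trimming and the exact error text after the `Unsupported Codex model` prefix are assumptions.

```typescript
// Default taken from the commit message; the constant name is hypothetical.
const DEFAULT_CODEX_MODEL = 'gpt-5.5';

// Hypothetical sketch of the alias and validation rules the tests describe.
function resolveCodexModelSketch(input: string): string {
  const id = input.trim();
  // 'codex' and 'default' are aliases for the default model.
  if (id === 'codex' || id === 'default') {
    return DEFAULT_CODEX_MODEL;
  }
  // Explicit ids pass through unchanged when they use an accepted prefix.
  if (id.startsWith('gpt-') || id.startsWith('codex-')) {
    return id;
  }
  // Everything else, including empty input and Claude model ids, is rejected.
  throw new Error(`Unsupported Codex model: ${JSON.stringify(input)}`);
}
```

Passing account-specific ids such as `gpt-5.3-codex-spark` through unchanged leaves availability checks to the auth probe rather than to a hard-coded allow list.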
--- a/packages/cli/test/context/llm/codex-runtime-config.test.ts
+++ b/packages/cli/test/context/llm/codex-runtime-config.test.ts
@@ -0,0 +1,43 @@
+import { describe, expect, it } from 'vitest';
+import { buildCodexRuntimeConfig } from '../../../src/context/llm/codex-runtime-config.js';
+
+describe('buildCodexRuntimeConfig', () => {
+  it('builds generic config without SDK thread-option fields', () => {
+    expect(buildCodexRuntimeConfig({ model: 'gpt-5.3-codex' })).toEqual({
+      configOverrides: {
+        history: { persistence: 'none' },
+      },
+      env: {},
+    });
+  });
+
+  it('adds only the temporary ktx MCP server and exact enabled tools', () => {
+    expect(
+      buildCodexRuntimeConfig({
+        model: 'gpt-5.3-codex',
+        mcp: {
+          url: 'http://127.0.0.1:4567/mcp',
+          bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+          bearerToken: 'secret-token',
+          toolNames: ['sl_read_source', 'wiki_search'],
+        },
+      }),
+    ).toEqual({
+      configOverrides: {
+        history: { persistence: 'none' },
+        mcp_servers: {
+          ktx: {
+            url: 'http://127.0.0.1:4567/mcp',
+            bearer_token_env_var: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+            enabled_tools: ['sl_read_source', 'wiki_search'],
+            default_tools_approval_mode: 'approve',
+            required: true,
+          },
+        },
+      },
+      env: {
+        KTX_CODEX_RUNTIME_MCP_TOKEN: 'secret-token',
+      },
+    });
+  });
+});
--- a/packages/cli/test/context/llm/codex-runtime.test.ts
+++ b/packages/cli/test/context/llm/codex-runtime.test.ts
@@ -0,0 +1,460 @@
+import { describe, expect, it, vi } from 'vitest';
+import { z } from 'zod';
+import {
+  CodexKtxLlmRuntime,
+  runCodexAuthProbe,
+} from '../../../src/context/llm/codex-runtime.js';
+
+async function* events(items: unknown[]) {
+  for (const item of items) {
+    yield item;
+  }
+}
+
+function runner(items: unknown[]) {
+  return {
+    runStreamed: vi.fn(async () => events(items)),
+  };
+}
+
+/** Yields the given events, then throws — mirroring the SDK throwing on a non-zero codex exec exit. */
+function throwingRunner(items: unknown[], error: Error) {
+  return {
+    runStreamed: vi.fn(async () =>
+      (async function* () {
+        for (const item of items) {
+          yield item;
+        }
+        throw error;
+      })(),
+    ),
+  };
+}
+
+const MODEL_UNSUPPORTED_API_ERROR = JSON.stringify({
+  type: 'error',
+  status: 400,
+  error: {
+    type: 'invalid_request_error',
+    message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+  },
+});
+
+function budgetRunner() {
+  let observedSignal: AbortSignal | undefined;
+  return {
+    observedSignal: () => observedSignal,
+    runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
+      observedSignal = input.signal;
+      return events([
+        { type: 'turn.started' },
+        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'in_progress' } },
+        { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'completed' } },
+        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'in_progress' } },
+        { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'completed' } },
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+      ]);
+    }),
+  };
+}
+
+describe('CodexKtxLlmRuntime', () => {
+  it('generates text with the role-selected model and metrics', async () => {
+    const onMetrics = vi.fn();
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'hello' } },
+      { type: 'turn.completed', usage: { input_tokens: 3, output_tokens: 4, total_tokens: 7 } },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex', triage: 'gpt-5.4' },
+      runner: fakeRunner,
+    });
+
+    await expect(runtime.generateText({ role: 'triage', system: 'system', prompt: 'prompt', onMetrics })).resolves.toBe('hello');
+    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+      expect.objectContaining({
+        projectDir: '/tmp/project',
+        model: 'gpt-5.4',
+        prompt: 'system\n\nprompt',
+      }),
+    );
+    expect(onMetrics).toHaveBeenCalledWith(expect.objectContaining({ usage: { inputTokens: 3, outputTokens: 4, totalTokens: 7 } }));
+  });
+
+  it('generates and validates structured output', async () => {
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: '{"answer":"yes"}' } },
+      { type: 'turn.completed' },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    await expect(
+      runtime.generateObject({
+        role: 'default',
+        prompt: 'json',
+        schema: z.object({ answer: z.string() }),
+      }),
+    ).resolves.toEqual({ answer: 'yes' });
+    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+      expect.objectContaining({
+        outputSchema: expect.objectContaining({ type: 'object' }),
+      }),
+    );
+  });
+
+  it('returns a structured-output error when Codex final text is invalid JSON', async () => {
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'not json' } },
+      { type: 'turn.completed' },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    await expect(
+      runtime.generateObject({
+        role: 'default',
+        prompt: 'json',
+        schema: z.object({ answer: z.string() }),
+      }),
+    ).rejects.toThrow('Codex structured output failed validation');
+  });
+
+  it('starts and closes a temporary MCP server for tool-backed agent loops', async () => {
+    const close = vi.fn(async () => undefined);
+    const startMcpServer = vi.fn(async () => ({
+      url: 'http://127.0.0.1:4321/mcp',
+      bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN' as const,
+      bearerToken: 'token',
+      close,
+    }));
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.started', item: { type: 'mcp_tool_call', name: 'wiki_search' } },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'done' } },
+      { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, total_tokens: 2 } },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+      startMcpServer,
+    });
+    const onStepFinish = vi.fn();
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 5,
+      telemetryTags: {},
+      onStepFinish,
+      toolSet: {
+        aliased_wiki_tool: {
+          name: 'wiki_search',
+          description: 'Search wiki',
+          inputSchema: z.object({ query: z.string() }),
+          execute: vi.fn(),
+        },
+      },
+    });
+
+    expect(result.stopReason).toBe('natural');
+    expect(result.metrics).toMatchObject({ stepCount: 1, usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 } });
+    expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 5 });
+    expect(startMcpServer).toHaveBeenCalledWith({ projectDir: '/tmp/project', toolSet: expect.any(Object) });
+    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+      expect.objectContaining({
+        env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'token' },
+        configOverrides: expect.objectContaining({
+          mcp_servers: expect.objectContaining({
+            ktx: expect.objectContaining({
+              url: 'http://127.0.0.1:4321/mcp',
+              enabled_tools: ['wiki_search'],
+              required: true,
+            }),
+          }),
+        }),
+      }),
+    );
+    expect(close).toHaveBeenCalled();
+  });
+
+  it('returns error stop reason on turn failure', async () => {
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: runner([{ type: 'turn.failed', error: { message: 'boom' } }]),
+    });
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 5,
+      telemetryTags: {},
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('error');
+    expect(result.error?.message).toBe('boom');
+  });
+
+  it('surfaces failed MCP tool calls as agent-loop errors', async () => {
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: runner([
+        { type: 'turn.started' },
+        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'search', status: 'in_progress' } },
+        {
+          type: 'item.completed',
+          item: {
+            type: 'mcp_tool_call',
+            server: 'ktx',
+            tool: 'search',
+            status: 'failed',
+            error: { message: 'denied' },
+          },
+        },
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+      ]),
+    });
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 5,
+      telemetryTags: {},
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('error');
+    expect(result.error?.message).toBe('Codex runtime tool call failed: search: denied');
+    expect(result.metrics).toMatchObject({
+      stepCount: 1,
+      usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
+    });
+  });
+
+  it('returns budget and aborts the Codex stream when local MCP step budget is reached', async () => {
+    const fakeRunner = budgetRunner();
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+    const onStepFinish = vi.fn();
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 1,
+      telemetryTags: {},
+      onStepFinish,
+      toolSet: {
+        first: {
+          name: 'first',
+          description: 'First tool',
+          inputSchema: z.object({}),
+          execute: vi.fn(),
+        },
+      },
+    });
+
+    expect(result.stopReason).toBe('budget');
+    expect(result.error).toBeUndefined();
+    expect(result.metrics).toMatchObject({ stepCount: 1 });
+    expect(onStepFinish).toHaveBeenCalledTimes(1);
+    expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 1 });
+    expect(fakeRunner.observedSignal()?.aborted).toBe(true);
+  });
+
+  it('counts built-in command_execution steps against the budget and aborts the stream', async () => {
+    let observedSignal: AbortSignal | undefined;
+    const fakeRunner = {
+      observedSignal: () => observedSignal,
+      runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
+        observedSignal = input.signal;
+        return events([
+          { type: 'turn.started' },
+          { type: 'item.started', item: { type: 'command_execution', command: 'ls', status: 'in_progress' } },
+          { type: 'item.completed', item: { type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
+          { type: 'item.started', item: { type: 'command_execution', command: 'cat a', status: 'in_progress' } },
+          { type: 'item.completed', item: { type: 'command_execution', command: 'cat a', status: 'completed', exit_code: 0 } },
+          { type: 'item.completed', item: { type: 'command_execution', command: 'cat b', status: 'completed', exit_code: 0 } },
+          { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+        ]);
+      }),
+    };
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+    const onStepFinish = vi.fn();
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 2,
+      telemetryTags: {},
+      onStepFinish,
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('budget');
+    expect(result.error).toBeUndefined();
+    expect(result.metrics).toMatchObject({ stepCount: 2 });
+    expect(onStepFinish).toHaveBeenCalledTimes(2);
+    expect(onStepFinish).toHaveBeenLastCalledWith({ stepIndex: 2, stepBudget: 2 });
+    expect(fakeRunner.observedSignal()?.aborted).toBe(true);
+  });
+
+  it('fires onStepFinish live as each step completes, before the stream drains', async () => {
+    const order: string[] = [];
+    async function* liveEvents() {
+      yield { type: 'turn.started' };
+      yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'a', status: 'completed' } };
+      order.push('yielded-after-step-1');
+      yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'b', status: 'completed' } };
+      order.push('yielded-after-step-2');
+      yield { type: 'item.completed', item: { type: 'agent_message', text: 'done' } };
+      yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } };
+    }
+    const fakeRunner = { runStreamed: vi.fn(async () => liveEvents()) };
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 10,
+      telemetryTags: {},
+      onStepFinish: ({ stepIndex }) => {
+        order.push(`step-${stepIndex}`);
+      },
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('natural');
+    expect(result.metrics).toMatchObject({ stepCount: 2 });
+    expect(order).toEqual(['step-1', 'yielded-after-step-1', 'step-2', 'yielded-after-step-2']);
+  });
+
+  it('surfaces the real Codex error event even when the SDK stream throws afterward', async () => {
+    // The SDK yields the error/turn.failed events on stdout, then throws on the
+    // non-zero exit. The masked exit message must not hide the real API error.
+    const fakeRunner = throwingRunner(
+      [
+        { type: 'thread.started', thread_id: 't' },
+        { type: 'turn.started' },
+        { type: 'error', message: MODEL_UNSUPPORTED_API_ERROR },
+        { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
+      ],
+      new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
+    );
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    await expect(runtime.generateText({ role: 'default', prompt: 'hi' })).rejects.toThrow(
+      'not supported when using Codex with a ChatGPT account',
+    );
+  });
+
+  it('probes Codex authentication through a minimal non-interactive turn', async () => {
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'ok' } },
+      { type: 'turn.completed' },
+    ]);
+
+    await expect(
+      runCodexAuthProbe({
+        projectDir: '/tmp/project',
+        model: 'codex',
+        runner: fakeRunner,
+      }),
+    ).resolves.toEqual({ ok: true });
+  });
+
+  it('reports an unavailable model without blaming auth when Codex rejects the model', async () => {
+    const fakeRunner = throwingRunner(
+      [
+        { type: 'turn.started' },
+        { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
+      ],
+      new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
+    );
+
+    const result = await runCodexAuthProbe({
+      projectDir: '/tmp/project',
+      model: 'gpt-5.3-codex',
+      runner: fakeRunner,
+    });
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.message).not.toContain('authentication is not usable');
+      expect(result.message).toContain('not available');
+      expect(result.message).toContain('gpt-5.3-codex');
+      expect(result.message).toContain('not supported when using Codex with a ChatGPT account');
+      // A model-access failure must steer the user at the model config, not auth.
+      expect(result.fix).toContain('llm.models.default');
+      expect(result.fix).not.toContain('Authenticate Codex');
+    }
+  });
+
+  it('reports an auth failure when Codex exits without an error event', async () => {
+    const fakeRunner = throwingRunner(
+      [],
+      new Error('Codex Exec exited with code 1: Not logged in. Run `codex login`.'),
+    );
+
+    const result = await runCodexAuthProbe({
+      projectDir: '/tmp/project',
+      model: 'gpt-5.5',
+      runner: fakeRunner,
+    });
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.message).toContain('authentication is not usable');
+      expect(result.message).toContain('Not logged in');
+      expect(result.fix).toContain('Authenticate Codex');
+    }
+  });
+
+  it('rejects an unsupported model id before probing, steering at llm.models.default', async () => {
+    const result = await runCodexAuthProbe({
+      projectDir: '/tmp/project',
+      model: 'not-a-real-model',
+    });
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.message).toContain('Unsupported Codex model');
+      expect(result.fix).toContain('llm.models.default');
+    }
+  });
+});
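Reviewer note on the live-step tests above: the ordering asserted in `fires onStepFinish live as each step completes` only holds if the callback runs inside the loop that pulls events, not after the stream drains. The sketch below illustrates that shape under stated assumptions. `countStepsLive`, `AgentEvent`, and the `abort` parameter are hypothetical names for illustration; only `isCompletedAgentStep` and the four item types come from the commit message, and the real `collectEvents` may differ. The sketch also omits the rule that a no-tool turn counts as one step.

```typescript
// Hypothetical sketch, not the ktx implementation.
type AgentItem = { type: string; status?: string };
type AgentEvent = { type: string; item?: AgentItem };
type LoopResult = { stepCount: number; stopReason: 'natural' | 'budget' };

// Item types the commit message lists as agent actions.
const STEP_ITEM_TYPES = new Set(['command_execution', 'mcp_tool_call', 'file_change', 'web_search']);

function isCompletedAgentStep(event: AgentEvent): boolean {
  return (
    event.type === 'item.completed' &&
    event.item !== undefined &&
    STEP_ITEM_TYPES.has(event.item.type)
  );
}

async function countStepsLive(
  events: AsyncIterable<AgentEvent>,
  stepBudget: number,
  onStepFinish: (info: { stepIndex: number; stepBudget: number }) => void,
  abort: () => void, // assumed hook, e.g. () => controller.abort()
): Promise<LoopResult> {
  let stepCount = 0;
  for await (const event of events) {
    if (!isCompletedAgentStep(event)) continue;
    stepCount += 1;
    // Fire the callback before pulling the next event, so progress is live.
    onStepFinish({ stepIndex: stepCount, stepBudget });
    if (stepCount >= stepBudget) {
      // Budget reached: signal the runner and stop consuming the stream.
      abort();
      return { stepCount, stopReason: 'budget' };
    }
  }
  return { stepCount, stopReason: 'natural' };
}
```

Because the callback fires between pulls, a generator that records a marker after each `yield` sees the callback's marker first, which is the interleaving the test checks.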
--- a/packages/cli/test/context/llm/codex-sdk-runner.test.ts
+++ b/packages/cli/test/context/llm/codex-sdk-runner.test.ts
@@ -0,0 +1,97 @@
+import { describe, expect, it, vi } from 'vitest';
+
+const sdkMock = vi.hoisted(() => {
+  const events = (async function* () {
+    yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } };
+  })();
+  const runStreamed = vi.fn(async () => ({ events }));
+  const startThread = vi.fn(() => ({ runStreamed }));
+  const Codex = vi.fn(function Codex(this: { startThread: typeof startThread }, options?: unknown) {
+    Object.assign(this, { options, startThread });
+  });
+  return { Codex, startThread, runStreamed };
+});
+
+vi.mock('@openai/codex-sdk', () => ({ Codex: sdkMock.Codex }));
+
+import { CodexSdkCliRunner } from '../../../src/context/llm/codex-sdk-runner.js';
+
+async function collectAsync<T>(items: AsyncIterable<T>): Promise<T[]> {
+  const collected: T[] = [];
+  for await (const item of items) {
+    collected.push(item);
+  }
+  return collected;
+}
+
+describe('CodexSdkCliRunner', () => {
+  it('passes isolated env through the SDK and runtime controls through thread options', async () => {
+    const runner = new CodexSdkCliRunner({
+      envBase: {
+        HOME: '/home/ktx-user',
+        PATH: '/usr/local/bin:/usr/bin',
+        CODEX_HOME: '/home/ktx-user/.codex',
+        HTTPS_PROXY: 'http://proxy.example',
+        KTX_UNRELATED_SECRET: 'must-not-copy', // pragma: allowlist secret
+      },
+    });
+    const previousToken = process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
+    process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = 'outer-token';
+    const outputSchema = {
+      type: 'object',
+      properties: { answer: { type: 'string' } },
+      required: ['answer'],
+      additionalProperties: false,
+    };
+    const controller = new AbortController();
+
+    try {
+      const events = await runner.runStreamed({
+        projectDir: '/tmp/ktx-project',
+        model: 'gpt-5.3-codex',
+        prompt: 'Return JSON.',
+        configOverrides: {
+          history: { persistence: 'none' },
+        },
+        env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token' },
+        outputSchema,
+        signal: controller.signal,
+      });
+
+      expect(sdkMock.Codex).toHaveBeenCalledWith({
+        config: {
+          history: { persistence: 'none' },
+        },
+        env: {
+          HOME: '/home/ktx-user',
+          PATH: '/usr/local/bin:/usr/bin',
+          CODEX_HOME: '/home/ktx-user/.codex',
+          HTTPS_PROXY: 'http://proxy.example',
+          KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token',
+        },
+      });
+      expect(process.env.KTX_CODEX_RUNTIME_MCP_TOKEN).toBe('outer-token');
+      expect(sdkMock.startThread).toHaveBeenCalledWith({
+        workingDirectory: '/tmp/ktx-project',
+        skipGitRepoCheck: true,
+        model: 'gpt-5.3-codex',
+        sandboxMode: 'read-only',
+        webSearchMode: 'disabled',
+        approvalPolicy: 'never',
+      });
+      expect(sdkMock.runStreamed).toHaveBeenCalledWith('Return JSON.', {
+        outputSchema,
+        signal: controller.signal,
+      });
+      await expect(collectAsync(events)).resolves.toEqual([
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } },
+      ]);
+    } finally {
+      if (previousToken === undefined) {
+        delete process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
+      } else {
+        process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = previousToken;
+      }
+    }
+  });
+});
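Reviewer note on the env assertions above: the test expects `HOME`, `PATH`, `CODEX_HOME`, and `HTTPS_PROXY` to reach the SDK, `KTX_UNRELATED_SECRET` to be dropped, the per-run token to be merged in, and `process.env` to stay untouched. A minimal sketch of one way to get that behavior follows. `buildIsolatedEnv` and `ALLOWED_ENV_KEYS` are hypothetical names, and the allowlist is assumed from the four keys this test exercises; the real runner's list is not shown in this diff and may be longer.

```typescript
// Hypothetical sketch, not the CodexSdkCliRunner implementation.
// Assumed allowlist: only the keys this test shows passing through.
const ALLOWED_ENV_KEYS = ['HOME', 'PATH', 'CODEX_HOME', 'HTTPS_PROXY'] as const;

function buildIsolatedEnv(
  envBase: Record<string, string | undefined>,
  runEnv: Record<string, string> = {},
): Record<string, string> {
  const isolated: Record<string, string> = {};
  for (const key of ALLOWED_ENV_KEYS) {
    const value = envBase[key];
    // Copy only allowlisted keys that are actually set.
    if (value !== undefined) isolated[key] = value;
  }
  // Per-run values win over the base and are added to a fresh object,
  // so neither envBase nor process.env is mutated.
  return { ...isolated, ...runEnv };
}
```

Returning a new object rather than writing to `process.env` is what lets the outer `KTX_CODEX_RUNTIME_MCP_TOKEN` keep its value while the run gets its own.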
--- a/packages/cli/test/context/llm/runtime-local-config.test.ts
+++ b/packages/cli/test/context/llm/runtime-local-config.test.ts
@@ -22,4 +22,25 @@ describe('local KTX LLM runtime config', () => {
      }),
    ).toBeNull();
  });
+
+  it('creates a Codex runtime for codex backend without creating an AI SDK provider', () => {
+    const runtime = createLocalKtxLlmRuntimeFromConfig(
+      {
+        provider: { backend: 'codex' },
+        models: { default: 'codex', triage: 'gpt-5.4' },
+      },
+      { env: {}, projectDir: '/tmp/project', createCodexRuntime: vi.fn((deps) => ({ deps }) as never) },
+    );
+
+    expect(runtime).toMatchObject({ deps: expect.objectContaining({ projectDir: '/tmp/project' }) });
+  });
+
+  it('returns null from the AI SDK provider factory for codex backend', () => {
+    expect(
+      createLocalKtxLlmProviderFromConfig({
+        provider: { backend: 'codex' },
+        models: { default: 'codex' },
+      }),
+    ).toBeNull();
+  });
 });
--- a/packages/cli/test/context/project/config.test.ts
+++ b/packages/cli/test/context/project/config.test.ts
@@ -231,6 +231,31 @@ llm:
    });
  });

+  it('parses Codex as a first-class LLM backend', () => {
+    const config = parseKtxProjectConfig(`
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.3-codex
+    triage: gpt-5.3-codex
+    candidateExtraction: gpt-5.3-codex
+    curator: gpt-5.3-codex
+    reconcile: gpt-5.3-codex
+    repair: gpt-5.3-codex
+`);
+
+    expect(config.llm.provider.backend).toBe('codex');
+    expect(config.llm.models).toEqual({
+      default: 'gpt-5.3-codex',
+      triage: 'gpt-5.3-codex',
+      candidateExtraction: 'gpt-5.3-codex',
+      curator: 'gpt-5.3-codex',
+      reconcile: 'gpt-5.3-codex',
+      repair: 'gpt-5.3-codex',
+    });
+  });
+
  it('parses gateway LLM, OpenAI scan embeddings, and sentence-transformers ingest embeddings', () => {
    const config = parseKtxProjectConfig(`
 llm:
@@ -530,7 +555,7 @@ describe('generateKtxProjectConfigJsonSchema', () => {
    const llm = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).llm;
    const provider = llm?.properties?.provider as { properties?: Record<string, unknown> };
    const backend = provider?.properties?.backend as { enum?: readonly string[] };
-    expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code']);
+    expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code', 'codex']);

    const storage = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).storage;
    const state = storage?.properties?.state as { enum?: readonly string[] };
--- a/packages/cli/test/doctor.test.ts
+++ b/packages/cli/test/doctor.test.ts
@@ -422,6 +422,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        '',
      ].join('\n'),
      'utf-8',
@@ -543,6 +545,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        'ingest:',
        '  adapters:',
        '    - live-database',
@@ -652,6 +656,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        '',
      ].join('\n'),
      'utf-8',
@@ -698,6 +704,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        'ingest:',
        '  adapters:',
        '    - live-database',
--- a/packages/cli/test/ingest.test.ts
+++ b/packages/cli/test/ingest.test.ts
@@ -337,10 +337,13 @@ describe('runKtxIngest', () => {

    expect(runIo.stdout()).toBe('');
    expect(runIo.stderr()).toContain(
-      'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
+      'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
    );
-    expect(runIo.stderr()).toContain('Configure a local Claude Code session or API-backed LLM, then rerun ingest:');
+    expect(runIo.stderr()).toContain('Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:');
    expect(runIo.stderr()).toContain(`ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`);
+    expect(runIo.stderr()).toContain(
+      `ktx setup --project-dir ${projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
+    );
    expect(runIo.stderr()).toContain(
      `ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
    );
--- a/packages/cli/test/llm/model-provider.test.ts
+++ b/packages/cli/test/llm/model-provider.test.ts
@@ -312,4 +312,13 @@ describe('createKtxLlmProvider', () => {
      }),
    ).toThrow('claude-code is not an AI SDK LanguageModel backend');
  });
+
+  it('rejects codex as an AI SDK LanguageModel backend', () => {
+    expect(() =>
+      createKtxLlmProvider({
+        backend: 'codex',
+        modelSlots: { default: 'gpt-5.3-codex' },
+      }),
+    ).toThrow('codex is not an AI SDK LanguageModel backend');
+  });
 });
--- a/packages/cli/test/setup-models.test.ts
+++ b/packages/cli/test/setup-models.test.ts
@@ -66,6 +66,7 @@ function makePromptAdapter(options: {
        nextProviderChoice === 'anthropic' ||
        nextProviderChoice === 'vertex' ||
        nextProviderChoice === 'claude-code' ||
+        nextProviderChoice === 'codex' ||
        nextProviderChoice === 'back'
      ) {
        return selectValues.shift() ?? nextProviderChoice;
@@ -183,6 +184,7 @@ describe('setup Anthropic model step', () => {
        message: expect.stringContaining('Which LLM provider should KTX use?'),
        options: [
          { value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
+          { value: 'codex', label: 'Codex subscription' },
          { value: 'anthropic', label: 'Anthropic API key' },
          { value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
          { value: 'back', label: 'Back' },
@@ -215,6 +217,85 @@ describe('setup Anthropic model step', () => {
    expect(authProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'sonnet' }));
  });

+  it('configures Codex backend and validates local auth', async () => {
+    const io = makeIo();
+    const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+    const result = await runKtxSetupAnthropicModelStep(
+      {
+        projectDir: tempDir,
+        inputMode: 'disabled',
+        llmBackend: 'codex',
+        llmModel: 'gpt-5.5',
+        skipLlm: false,
+      },
+      io.io,
+      { codexAuthProbe },
+    );
+
+    expect(result.status).toBe('ready');
+    const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
+    expect(config.llm).toMatchObject({
+      provider: { backend: 'codex' },
+      models: { default: 'gpt-5.5' },
+    });
+    expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
+    // The warning carries the clack gutter so it renders inside the setup frame.
+    expect(io.stderr()).toContain('│  Codex backend isolation is limited');
+    expect(io.stderr()).toContain('may still load user Codex config');
+  });
+
+  it('defaults the Codex model to gpt-5.5 when none is provided non-interactively', async () => {
+    const io = makeIo();
+    const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+    const result = await runKtxSetupAnthropicModelStep(
+      {
+        projectDir: tempDir,
+        inputMode: 'disabled',
+        llmBackend: 'codex',
+        skipLlm: false,
+      },
+      io.io,
+      { codexAuthProbe },
+    );
+
+    expect(result.status).toBe('ready');
+    const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
+    expect(config.llm).toMatchObject({
+      provider: { backend: 'codex' },
+      models: { default: 'gpt-5.5' },
+    });
+    expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
+  });
+
+  it('offers the curated Codex models during interactive setup', async () => {
+    const io = makeIo();
+    const prompts = makePromptAdapter({ selectValues: ['codex', 'gpt-5.5'] });
+    const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+    const result = await runKtxSetupAnthropicModelStep(
+      { projectDir: tempDir, inputMode: 'auto', skipLlm: false },
+      io.io,
+      { prompts, codexAuthProbe },
+    );
+
+    expect(result.status).toBe('ready');
+    expect(prompts.select).toHaveBeenCalledWith(
+      expect.objectContaining({
+        message: expect.stringContaining('Which Codex model should KTX use?'),
+        options: [
+          { value: 'gpt-5.5', label: 'GPT-5.5', hint: 'recommended' },
+          { value: 'gpt-5.4', label: 'GPT-5.4' },
+          { value: 'gpt-5.4-mini', label: 'GPT-5.4 mini' },
+          { value: 'manual', label: 'Enter a Codex model ID manually' },
+          { value: 'back', label: 'Back' },
+        ],
+      }),
+    );
+    expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ model: 'gpt-5.5' }));
+  });
+
  it('prompts for the Claude Code model during interactive setup', async () => {
    const io = makeIo();
    const prompts = makePromptAdapter({ selectValues: ['claude-code', 'opus'] });
--- a/packages/cli/test/status-project.test.ts
+++ b/packages/cli/test/status-project.test.ts
@@ -44,6 +44,17 @@ function withClaudeCodeLlm(config: KtxProjectConfig): KtxProjectConfig {
  };
 }

+function withCodexLlm(config: KtxProjectConfig): KtxProjectConfig {
+  return {
+    ...config,
+    llm: {
+      ...config.llm,
+      provider: { backend: 'codex' },
+      models: { ...config.llm.models, default: 'gpt-5.5' },
+    },
+  };
+}
+
 function baseProjectConfig(): KtxProjectConfig {
  return withClaudeCodeLlm(buildDefaultKtxProjectConfig());
 }
@@ -391,6 +402,126 @@ describe('buildProjectStatus --fast', () => {
  });
 });

+describe('buildProjectStatus codex', () => {
+  it('reports authenticated local Codex session', async () => {
+    const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+    const status = await buildProjectStatus(project, {
+      codexAuthProbe: async () => ({ ok: true as const }),
+    });
+
+    expect(status.llm).toMatchObject({
+      backend: 'codex',
+      model: 'gpt-5.5',
+      status: 'ok',
+      detail: 'local Codex session authenticated',
+    });
+    expect(status.warnings).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          message: expect.stringContaining('Codex backend isolation is limited'),
+          fix: expect.stringContaining('claude-code'),
+        }),
+      ]),
+    );
+    const rendered = renderProjectStatus(status, { verbose: false, useColor: false });
+    expect(rendered).toContain('Codex backend isolation is limited');
+  });
+
+  it('skips Codex auth probe with --fast', async () => {
+    let probeCalls = 0;
+    const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+    const status = await buildProjectStatus(project, {
+      fast: true,
+      codexAuthProbe: async () => {
+        probeCalls += 1;
+        return { ok: true };
+      },
+    });
+
+    expect(probeCalls).toBe(0);
+    expect(status.llm.status).toBe('skipped');
+    expect(status.llm.detail).toMatch(/--fast/);
+  });
+
+  it('surfaces the probe fix for a model-access failure instead of an auth fix', async () => {
+    const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+    const status = await buildProjectStatus(project, {
+      codexAuthProbe: async () => ({
+        ok: false,
+        message: 'Codex is authenticated, but the configured model "gpt-5.5" is not available...',
+        fix: 'Run `codex` to see the models your account supports, then set llm.models.default in ktx.yaml (or rerun `ktx setup`).',
+      }),
+    });
+
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.fix).toContain('llm.models.default');
+    expect(status.llm.fix).not.toContain('Authenticate Codex');
+  });
+});
+
+describe('buildProjectStatus llm models.default requirement', () => {
+  function withBackendNoModel(
+    backend: KtxProjectConfig['llm']['provider']['backend'],
+  ): KtxProjectConfig {
+    const config = buildDefaultKtxProjectConfig();
+    return {
+      ...config,
+      llm: { ...config.llm, provider: { backend }, models: {} },
+    };
+  }
+
+  it('fails codex without llm.models.default and never probes', async () => {
+    let probeCalls = 0;
+    const project = projectWithConfig(withBackendNoModel('codex'));
+    const status = await buildProjectStatus(project, {
+      codexAuthProbe: async () => {
+        probeCalls += 1;
+        return { ok: true };
+      },
+    });
+
+    expect(probeCalls).toBe(0);
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.detail).toContain('llm.models.default');
+    expect(status.verdict).toBe('blocked');
+  });
+
+  it('fails claude-code without llm.models.default and never probes', async () => {
+    let probeCalls = 0;
+    const project = projectWithConfig(withBackendNoModel('claude-code'));
+    const status = await buildProjectStatus(project, {
+      claudeCodeAuthProbe: async () => {
+        probeCalls += 1;
+        return { ok: true };
+      },
+    });
+
+    expect(probeCalls).toBe(0);
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.detail).toContain('llm.models.default');
+    expect(status.verdict).toBe('blocked');
+  });
+
+  it('fails anthropic without llm.models.default even when the key is set', async () => {
+    const config = withBackendNoModel('anthropic');
+    const project = projectWithConfig({
+      ...config,
+      llm: {
+        ...config.llm,
+        provider: { backend: 'anthropic', anthropic: { api_key: 'env:ANTHROPIC_API_KEY' } }, // pragma: allowlist secret
+        models: {},
+      },
+    });
+    const status = await buildProjectStatus(project, {
+      env: { ANTHROPIC_API_KEY: 'sk-test' }, // pragma: allowlist secret
+    });
+
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.detail).toContain('llm.models.default');
+    expect(status.verdict).toBe('blocked');
+  });
+});
+
 describe('buildLocalStatsStatus', () => {
  let tempDir: string;