feat: add codex llm backend for ktx runtime work (#253)

* feat: add codex sdk runner foundation

* feat: parse codex runtime events

* feat: expose codex runtime mcp tools

* feat: add codex llm runtime

* feat: wire codex llm backend

* test: avoid Array.fromAsync in codex runner test

* docs: document codex llm backend

* fix: tighten codex runtime config ownership

* fix: use codex sdk env and thread options

* fix: parse codex sdk event shapes

* test: add codex backend live smoke

* docs: clarify codex backend isolation

* fix: drive codex loop metrics from mcp events

* fix: enforce codex local step budget

* docs: disclose codex isolation limits

* fix: count all codex agent steps and stream step callbacks live

The agent-loop step budget only counted completed mcp_tool_call items, so
built-in command_execution steps (which the public Codex SDK/CLI surface can
still expose) never decremented the budget, letting ingest/reconciliation run
past stepBudget until Codex stopped on its own. onStepFinish was also replayed
only after the whole stream drained, so live work_unit_step / reconciliation
progress appeared stuck until the Codex process exited.

collectEvents is now the single live step accumulator: it counts every
completed agent-action item via a shared isCompletedAgentStep predicate
(command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish
as each step completes, and enforces the budget on that broader count. A
no-tool turn still counts as one step. toolFailures stays MCP-specific, since a
non-zero command exit is normal agent exploration, not a loop failure.

* test: align ingest llm-guard assertions with codex backend

The skip-llm ingest guard message now lists codex as a valid backend and
mentions a Claude Code/Codex session plus a codex setup hint, but this slow
suite test still asserted the pre-codex wording. Update it to match the
production message (already covered by the local-bundle-runtime unit test) and
add the codex setup-line assertion.

* fix: treat codex error:null tool calls as success

The Codex SDK serializes error: null on successful mcp_tool_call items, so the
failure check (item.error !== undefined) flagged every successful tool call as
failed with the empty-payload default "Codex turn failed". This killed every
ingest work unit under the codex backend before it could produce a patch. Key
on status === 'failed' (authoritative, always set) and only treat a populated
error object as a failure. Add a regression test built from a verbatim real-SDK
event capture.

* fix: default codex backend to gpt-5.5 and report real probe errors

The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API
rejects under ChatGPT-account (subscription) auth, so codex status/setup failed
with a misleading "authentication is not usable" message even though auth was
fine.

- Default codex model is now gpt-5.5 (works on both subscription and API-key
  auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and
  keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
- runCodexAuthProbe now distinguishes "model not available" from an auth
  failure and surfaces the real API error: collectEvents retains stream events
  when the SDK throws on a non-zero exit, and the API error JSON envelope is
  unwrapped to its human-readable message.
- The Codex isolation warning now renders inside the clack setup frame.
- Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.

* fix: require llm.models.default in status and match codex probe remediation

Status reported a project ready when a non-none LLM backend was configured
without llm.models.default, but the runtime (resolveModelSlots) hard-requires
it, so ingest/scan/memory threw after `ktx status` said the project was usable.
buildLlmStatus now fails for any non-none backend missing models.default and no
longer invents a fallback model for claude-code/codex.

Codex probe failures now carry a category-matched fix: a model-access failure
steers the user at llm.models.default instead of the auth/install remediation.
runCodexAuthProbe returns the fix and status consumes it; the message stays
self-sufficient so setup output is unchanged.

Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx
states --llm-model only accepts codex/default or gpt-*/codex-* ids.

Repaired four doctor fixtures that configured a backend without models.default
(the now-correctly-blocked config) and added coverage for the new behavior.
2026-07-22 11:51:01 +02:00 · 2026-06-02 13:57:11 +02:00 · 2026-06-02 13:57:11 +02:00 · 494618ab14
commit 494618ab14
parent 74c6076b72
41 changed files with 2544 additions and 30 deletions
--- a/README.md
+++ b/README.md
@@ -30,8 +30,9 @@ warehouse accurately - from approved metric definitions, joinable columns, and
 business knowledge it builds and maintains for you.

 > [!NOTE]
-> Run **ktx** with your own LLM API keys or a **Claude Pro/Max** subscription.
-> No extra usage billing from **ktx**.
+> Run **ktx** with your own LLM API keys or a local agent sign-in — a
+> **Claude Pro/Max** subscription through Claude Code, or your local Codex
+> authentication. No extra usage billing from **ktx**.

 <p align="center">
  <a href="https://youtu.be/5V4TuzYVlrA">
@@ -175,8 +176,9 @@ then the current directory. Pass `--project-dir <path>` when scripting.
  No. **ktx** runs locally. The only data leaving your machine is what you
  send to the LLM provider you configured.
 - **Which LLM backends are supported?**
-  Anthropic API, Google Vertex AI, AI Gateway, and the local Claude Code
-  session through the Claude Agent SDK. See
+  Anthropic API, Google Vertex AI, AI Gateway, the local Claude Code session
+  through the Claude Agent SDK, and your local Codex authentication through the
+  Codex SDK. See
  [LLM configuration](https://docs.kaelio.com/ktx/docs/guides/llm-configuration).
 - **How is ktx different from a dbt or MetricFlow semantic layer?**
  **ktx** *ingests* those layers and combines them with raw-table
--- a/docs-site/content/docs/cli-reference/ktx-setup.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-setup.mdx
@@ -51,8 +51,9 @@ prompts.

 | Flag | Description |
 |------|-------------|
-| `--llm-backend <backend>` | LLM backend: `anthropic`, `vertex`, or `claude-code` |
+| `--llm-backend <backend>` | LLM backend: `anthropic`, `vertex`, `claude-code`, or `codex` |
 | `--llm-backend claude-code` | Use the local Claude Code session for **ktx** LLM calls |
+| `--llm-backend codex` | Use local Codex authentication for **ktx** LLM calls |
 | `--llm-model <model>` | LLM model ID or backend model alias to validate and save |
 | `--anthropic-api-key-env <name>` | Environment variable containing the Anthropic API key |
 | `--anthropic-api-key-file <path>` | File containing the Anthropic API key |
@@ -62,9 +63,14 @@ prompts.

 Choose only one Anthropic credential source. Anthropic credential flags are only
 valid with the Anthropic backend; Vertex flags are only valid with the Vertex
-backend. The `claude-code` backend uses local Claude Code authentication instead
+backend. The `claude-code` and `codex` backends use local authentication instead
 of Anthropic API key or Vertex flags. For Claude Code, `--llm-model` accepts
-`sonnet`, `opus`, `haiku`, or a full Claude model ID.
+`sonnet`, `opus`, `haiku`, or a full Claude model ID. For Codex, `--llm-model`
+accepts `codex`, `default`, or a `gpt-*` / `codex-*` model ID such as
+`gpt-5.5`; any other value is rejected before the auth probe. Run `codex` to
+see the models available to your login, and pick a `gpt-*` / `codex-*` id from
+that list. Note that `*-codex` API-billing model IDs (for example
+`gpt-5.3-codex`) are not available to ChatGPT-subscription logins.

 ### Embeddings

@@ -191,6 +197,17 @@ ktx setup \
  --llm-backend claude-code \
  --llm-model opus

+# Configure **ktx** to use local Codex authentication for LLM work
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+When you choose `--llm-backend codex`, setup prints a warning if the public
+Codex SDK and CLI surface cannot prove full Claude-Code-style isolation. The
+backend restricts **ktx** runtime MCP tools to each run, but Codex may still
+load user Codex config and built-in command execution or read-only file
+capabilities.
+
+```bash
 # Script a Postgres connection that reads its URL from the environment
 ktx setup \
  --project-dir ./analytics \
--- a/docs-site/content/docs/cli-reference/ktx-status.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-status.mdx
@@ -21,7 +21,7 @@ ktx status [options]
 | `--json` | Print JSON output | `false` |
 | `-v`, `--verbose` | Show every check, including passing ones | `false` |
 | `--validate` | Only validate the `ktx.yaml` schema; skip readiness checks | `false` |
-| `--fast` | Skip checks that require external communication (query-history readiness probes and Claude Code auth probe) | `false` |
+| `--fast` | Skip checks that require external communication (query-history readiness probes, Claude Code auth probe, and Codex auth probe) | `false` |
 | `--no-input` | Disable interactive terminal input | - |

 ## Examples
@@ -39,7 +39,7 @@ ktx status --verbose
 # Validate ktx.yaml without running readiness checks
 ktx status --validate

-# Skip slow probes (query-history readiness, Claude Code auth)
+# Skip slow probes (query-history readiness, Claude Code auth, Codex auth)
 ktx status --fast

 # Check a project from another directory
@@ -57,6 +57,16 @@ flow, then rerun `ktx status`. Use `--fast` to skip this probe (useful in CI
 or offline contexts); skipped checks render as `-` and carry
 `"status": "skipped"` in JSON output.

+For `llm.provider.backend: codex`, `ktx status` runs a minimal non-interactive
+Codex request. If the probe fails, authenticate Codex locally with the Codex CLI
+and verify the Codex CLI installation.
+
+When `llm.provider.backend: codex` is configured, `ktx status` also prints a
+warning when the installed public Codex SDK and CLI surface cannot prove full
+Claude-Code-style isolation. The warning does not block authenticated Codex
+usage, but it marks the project status as partial so you can make an explicit
+runtime-isolation decision.
+
 A `Local data` section summarises what the project has accumulated locally:
 ingest run counts, last completed timestamp per connection, knowledge page
 counts by scope, semantic-layer source and dictionary value counts, and the
--- a/docs-site/content/docs/configuration/ktx-yaml.mdx
+++ b/docs-site/content/docs/configuration/ktx-yaml.mdx
@@ -376,13 +376,23 @@ llm:

 | Field | Type | Default | Purpose |
 |-------|------|---------|---------|
-| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. |
+| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` \| `codex` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. `codex` uses local Codex authentication and needs no API key. |
 | `provider.anthropic.api_key` | `string` | - | Anthropic API key. Required when `backend: anthropic`. Accepts `env:` or `file:` references. |
 | `provider.anthropic.base_url` | `string` | - | Override the Anthropic API base URL (proxy, self-hosted gateway). |
 | `provider.gateway.api_key` / `base_url` | `string` | - | Credentials for an AI Gateway provider. Required when `backend: gateway`. |
 | `provider.vertex.project` | `string` | - | Google Cloud project ID hosting the Vertex AI endpoint. |
 | `provider.vertex.location` | `string` | - | Vertex AI region (for example `us-east5`). Required when the `vertex` block is present. |

+Use `codex` when local Codex authentication should power **ktx** LLM work:
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
 ### Model roles

 `models` overrides the per-role model. Keys are fixed; values are
--- a/docs-site/content/docs/guides/building-context.mdx
+++ b/docs-site/content/docs/guides/building-context.mdx
@@ -39,8 +39,20 @@ ktx ingest --all
 Enriched ingest needs a configured model and embeddings. Run `ktx setup` first;
 connections without that configuration fail before any work starts.

-With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools for the
-current run.
+Local-auth backends keep provider credentials out of `ktx.yaml`:
+
+```bash
+ktx setup --llm-backend claude-code --no-input
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools
+for the current run. With `codex`, **ktx** restricts the temporary runtime MCP
+server to the current run's tool set, disables Codex web search, requests a
+read-only sandbox, and sets `approval_policy=never`. The public Codex SDK and
+CLI surface may still load user Codex config and built-in command execution or
+read-only file capabilities, so use `claude-code` for stricter runtime tool
+isolation.

 ## Query history

--- a/docs-site/content/docs/guides/llm-configuration.mdx
+++ b/docs-site/content/docs/guides/llm-configuration.mdx
@@ -16,6 +16,7 @@ Set `llm.provider.backend` to one of these values:
 - `gateway`: Use AI Gateway-compatible Anthropic model ids.
 - `claude-code`: Use your local Claude Code session through the Claude Agent
  SDK. **ktx** strips provider-routing environment variables from child processes.
+- `codex`: Use your local Codex authentication through the Codex SDK.

 ## Claude Code

@@ -47,6 +48,42 @@ model IDs are also accepted.
 metadata may still list host slash commands, skills, and subagents; **ktx** does not
 grant execution access to them.

+## Codex backend
+
+Use `codex` when you want **ktx** to run LLM-backed workflows through your
+local Codex authentication instead of a direct provider API key.
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
+Configure it non-interactively:
+
+```bash
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+This is separate from Codex agent-client setup. `ktx setup --agents --target
+codex` installs instructions and MCP access for an end-user Codex session.
+`ktx setup --llm-backend codex` makes **ktx** itself execute ingest, scan
+enrichment, memory, and other LLM-backed work through Codex.
+
+During runtime loops, **ktx** starts a temporary loopback MCP server for the
+current run, exposes only the tools passed to that run, asks Codex to use a
+read-only sandbox, sets `approval_policy=never`, auto-approves only those
+run-scoped MCP tools, and disables Codex web search.
+
+Codex backend isolation is currently limited by the public Codex SDK and CLI
+surface. Codex may still load user Codex config and built-in command execution
+or read-only file capabilities. Use `llm.provider.backend: claude-code` when
+you need stricter Claude-Code-style runtime tool isolation, or remove host
+Codex MCP and tool config before running untrusted prompts through the `codex`
+backend.
+
 ## Prompt caching

 `llm.promptCaching` has partial parity on `claude-code`. Status and doctor warn
--- a/knip.json
+++ b/knip.json
@@ -37,6 +37,9 @@
    "@semantic-release/release-notes-generator",
    "conventional-changelog-conventionalcommits"
  ],
+  "ignore": [
+    ".context/**"
+  ],
  "ignoreBinaries": [
    "uv",
    "lsof"
--- a/package.json
+++ b/package.json
@@ -32,6 +32,7 @@
    "setup:dev": "node scripts/setup-dev.mjs",
    "release:published-smoke": "node scripts/published-package-smoke.mjs --require-config",
    "release:local-embeddings-smoke": "node scripts/local-embeddings-runtime-smoke.mjs --require-opt-in",
+    "release:codex-backend-smoke": "node scripts/codex-backend-live-smoke.mjs",
    "release:readiness": "node scripts/release-readiness.mjs",
    "release:update-version": "node scripts/update-public-release-version.mjs",
    "relationships:acquire-public-fixtures": "node scripts/acquire-public-benchmark-fixtures.mjs",
--- a/packages/cli/package.json
+++ b/packages/cli/package.json
@@ -56,6 +56,7 @@
    "@looker/sdk-rtl": "^21.6.5",
    "@modelcontextprotocol/sdk": "^1.29.0",
    "@notionhq/client": "^5.22.0",
+    "@openai/codex-sdk": "^0.133.0",
    "ai": "^6.0.188",
    "better-sqlite3": "^12.10.0",
    "commander": "14.0.3",
--- a/packages/cli/src/commands/setup-commands.ts
+++ b/packages/cli/src/commands/setup-commands.ts
@@ -29,7 +29,7 @@ function embeddingBackend(value: string): 'openai' | 'sentence-transformers' {
 }

 function llmBackend(value: string): KtxSetupLlmBackend {
-  if (value === 'anthropic' || value === 'vertex' || value === 'claude-code') {
+  if (value === 'anthropic' || value === 'vertex' || value === 'claude-code' || value === 'codex') {
    return value;
  }
  throw new InvalidArgumentError(`invalid choice '${value}'`);
--- a/packages/cli/src/context/ingest/local-bundle-runtime.ts
+++ b/packages/cli/src/context/ingest/local-bundle-runtime.ts
@@ -611,9 +611,10 @@ function nextLocalJobId(): string {

 function localIngestLlmProviderGuardMessage(projectDir: string): string {
  return [
-    'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
-    'Configure a local Claude Code session or API-backed LLM, then rerun ingest:',
+    'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
+    'Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:',
    `  ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`,
+    `  ktx setup --project-dir ${projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
    `  ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
  ].join('\n');
 }
--- a/packages/cli/src/context/llm/codex-exec-events.ts
+++ b/packages/cli/src/context/llm/codex-exec-events.ts
@@ -0,0 +1,194 @@
+import type { LlmTokenUsage, RunLoopStopReason } from './runtime-port.js';
+
+export interface CodexExecEventSummary {
+  finalText: string;
+  stopReason: RunLoopStopReason;
+  usage: LlmTokenUsage;
+  stepCount: number;
+  stepBoundariesMs: number[];
+  toolCallCount: number;
+  toolFailures: string[];
+  error?: Error;
+}
+
+interface CodexEventParseOptions {
+  startedAt?: number;
+  now?: () => number;
+}
+
+function record(value: unknown): Record<string, unknown> | undefined {
+  return value && typeof value === 'object' ? (value as Record<string, unknown>) : undefined;
+}
+
+/**
+ * Codex thread items that represent a discrete agent action consuming one loop
+ * step. The step budget caps the total number of these regardless of which
+ * capability the agent reaches for, so built-in `command_execution` (and any
+ * file/web action the public Codex surface still exposes) count alongside our
+ * own `mcp_tool_call` items rather than only the MCP ones.
+ */
+const AGENT_STEP_ITEM_TYPES = new Set(['command_execution', 'mcp_tool_call', 'file_change', 'web_search']);
+
+export function isCompletedAgentStep(event: unknown): boolean {
+  const eventRecord = record(event);
+  if (eventRecord?.type !== 'item.completed') {
+    return false;
+  }
+  const itemType = record(eventRecord.item)?.type;
+  return typeof itemType === 'string' && AGENT_STEP_ITEM_TYPES.has(itemType);
+}
+
+function text(value: unknown): string | undefined {
+  return typeof value === 'string' && value.trim().length > 0 ? value : undefined;
+}
+
+function numberValue(value: unknown): number | undefined {
+  return typeof value === 'number' && Number.isFinite(value) ? value : undefined;
+}
+
+function usageFrom(value: unknown): LlmTokenUsage {
+  const usage = record(value);
+  if (!usage) {
+    return {};
+  }
+  const inputTokens = numberValue(usage.input_tokens ?? usage.inputTokens);
+  const outputTokens = numberValue(usage.output_tokens ?? usage.outputTokens);
+  const explicitTotalTokens = numberValue(usage.total_tokens ?? usage.totalTokens);
+  const totalTokens =
+    explicitTotalTokens ??
+    (inputTokens !== undefined && outputTokens !== undefined ? inputTokens + outputTokens : undefined);
+  return {
+    ...(inputTokens !== undefined ? { inputTokens } : {}),
+    ...(outputTokens !== undefined ? { outputTokens } : {}),
+    ...(totalTokens !== undefined ? { totalTokens } : {}),
+  };
+}
+
+function stopReasonFrom(value: unknown): RunLoopStopReason {
+  const reason = text(value)?.toLowerCase();
+  if (reason && /(budget|max_turn|max-turn|limit)/.test(reason)) {
+    return 'budget';
+  }
+  return 'natural';
+}
+
+function errorMessageFrom(value: unknown): string {
+  if (value instanceof Error) {
+    return value.message;
+  }
+  const asRecord = record(value);
+  const message = text(asRecord?.message);
+  return message ?? text(value) ?? 'Codex turn failed';
+}
+
+/**
+ * Codex serializes API failures as a JSON envelope inside the event message
+ * (e.g. `{"type":"error","status":400,"error":{"message":"…"}}`). Surface the
+ * human-readable inner message so callers don't leak raw JSON; pass plain
+ * strings through unchanged.
+ */
+function unwrapCodexApiErrorMessage(raw: string): string {
+  const trimmed = raw.trim();
+  if (!trimmed.startsWith('{')) {
+    return raw;
+  }
+  try {
+    const parsed = record(JSON.parse(trimmed));
+    return text(record(parsed?.error)?.message) ?? text(parsed?.message) ?? raw;
+  } catch {
+    return raw;
+  }
+}
+
+/** @internal */
+export function parseCodexExecEventLine(line: string): unknown {
+  try {
+    return JSON.parse(line) as unknown;
+  } catch (error) {
+    throw new Error(`Codex JSONL event stream was malformed: ${error instanceof Error ? error.message : String(error)}`);
+  }
+}
+
+export function summarizeCodexExecEvents(
+  events: Iterable<unknown>,
+  options: CodexEventParseOptions = {},
+): CodexExecEventSummary {
+  const startedAt = options.startedAt ?? Date.now();
+  const now = options.now ?? Date.now;
+  let finalText = '';
+  let stopReason: RunLoopStopReason = 'natural';
+  let usage: LlmTokenUsage = {};
+  let turnCount = 0;
+  let completedStepCount = 0;
+  const stepBoundariesMs: number[] = [];
+  let toolCallCount = 0;
+  const toolFailures: string[] = [];
+  let error: Error | undefined;
+
+  for (const event of events) {
+    const eventRecord = record(event);
+    const eventType = text(eventRecord?.type);
+    if (!eventRecord || !eventType) {
+      continue;
+    }
+
+    if (eventType === 'turn.started') {
+      turnCount += 1;
+      continue;
+    }
+
+    const item = record(eventRecord.item);
+    const itemType = text(item?.type);
+
+    if (eventType === 'item.started' && itemType === 'mcp_tool_call') {
+      toolCallCount += 1;
+      continue;
+    }
+
+    if (isCompletedAgentStep(event)) {
+      completedStepCount += 1;
+      stepBoundariesMs.push(now() - startedAt);
+      // Only MCP tool calls fail the loop: a non-zero `command_execution` exit
+      // is normal agent exploration, not a runtime error. `status` is the
+      // authoritative signal (the SDK always sets it); the SDK also serializes
+      // `error: null` on successful calls, so an explicit-null `error` must NOT
+      // be read as a failure — only a populated error object counts.
+      if (itemType === 'mcp_tool_call' && (item?.status === 'failed' || (item?.error !== undefined && item?.error !== null))) {
+        const name = text(item?.name) ?? text(item?.tool) ?? text(item?.tool_name) ?? 'unknown';
+        toolFailures.push(`${name}: ${errorMessageFrom(item?.error)}`);
+      }
+      continue;
+    }
+
+    if (eventType === 'item.completed' && itemType === 'agent_message') {
+      finalText = text(item?.text) ?? finalText;
+      continue;
+    }
+
+    if (eventType === 'turn.completed') {
+      usage = usageFrom(eventRecord.usage);
+      if (completedStepCount === 0) {
+        stepBoundariesMs.push(now() - startedAt);
+      }
+      stopReason = stopReasonFrom(eventRecord.reason ?? eventRecord.stop_reason ?? eventRecord.terminal_reason);
+      continue;
+    }
+
+    if (eventType === 'turn.failed' || eventType === 'error') {
+      stopReason = 'error';
+      error = new Error(unwrapCodexApiErrorMessage(errorMessageFrom(eventRecord.error ?? eventRecord.message)));
+      continue;
+    }
+  }
+
+  return {
+    finalText,
+    stopReason,
+    usage,
+    stepCount: completedStepCount > 0 ? completedStepCount : turnCount,
+    stepBoundariesMs,
+    toolCallCount,
+    toolFailures,
+    ...(error ? { error } : {}),
+  };
+}
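
The `error: null` regression described in the commit message can be illustrated in isolation. The sketch below is a standalone approximation of the failure check in `summarizeCodexExecEvents`, not the code from the diff: `isFailedMcpToolCall` is a hypothetical helper name, and the sample event payloads (including the `tool` and `exit_code` fields) are assumed shapes for illustration rather than captured SDK output.

```typescript
type JsonRecord = Record<string, unknown>;

function asRecord(value: unknown): JsonRecord | undefined {
  return value !== null && typeof value === 'object' ? (value as JsonRecord) : undefined;
}

// Hypothetical helper: true only for a completed MCP tool call that failed.
// `status === 'failed'` is treated as authoritative; an explicit `error: null`
// is NOT a failure, only a populated (non-null, defined) error value is.
function isFailedMcpToolCall(event: unknown): boolean {
  const eventRecord = asRecord(event);
  if (!eventRecord || eventRecord.type !== 'item.completed') {
    return false;
  }
  const item = asRecord(eventRecord.item);
  if (!item || item.type !== 'mcp_tool_call') {
    return false; // e.g. a non-zero command_execution exit is not a loop failure
  }
  return item.status === 'failed' || (item.error !== undefined && item.error !== null);
}

// Assumed event shapes, for illustration only.
const succeeded = {
  type: 'item.completed',
  item: { type: 'mcp_tool_call', tool: 'example_tool', status: 'completed', error: null },
};
const failed = {
  type: 'item.completed',
  item: { type: 'mcp_tool_call', tool: 'example_tool', status: 'failed', error: { message: 'boom' } },
};
const commandExit = {
  type: 'item.completed',
  item: { type: 'command_execution', exit_code: 1, status: 'failed' },
};

const verdicts = [succeeded, failed, commandExit].map(isFailedMcpToolCall);
console.log(verdicts);
```

Under the earlier `item.error !== undefined` check, the first sample would have been reported as a failure, because `null` is not `undefined`.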
--- a/packages/cli/src/context/llm/codex-isolation.ts
+++ b/packages/cli/src/context/llm/codex-isolation.ts
@@ -0,0 +1,9 @@
+export const CODEX_ISOLATION_WARNING =
+  'Codex backend isolation is limited by the public Codex SDK/CLI surface: ktx restricts the runtime MCP server to the current ktx tool set, disables Codex web search, asks for a read-only sandbox, and sets approval_policy=never, but Codex may still load user Codex config and built-in command execution or read-only file capabilities.';
+
+export const CODEX_ISOLATION_WARNING_FIX =
+  'Use llm.provider.backend: claude-code when you need stricter Claude-Code-style runtime tool isolation, or remove host Codex MCP/tool config before running untrusted prompts through the codex backend.';
+
+export function formatCodexIsolationWarning(): string {
+  return `${CODEX_ISOLATION_WARNING} ${CODEX_ISOLATION_WARNING_FIX}`;
+}
--- a/packages/cli/src/context/llm/codex-mcp-runtime-server.ts
+++ b/packages/cli/src/context/llm/codex-mcp-runtime-server.ts
@@ -0,0 +1,87 @@
+import { randomBytes } from 'node:crypto';
+import type { Server } from 'node:http';
+import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
+import type { KtxMcpServerLike } from '../mcp/types.js';
+import { runKtxMcpHttpServer, type KtxMcpHttpServerHandle } from '../../mcp-http-server.js';
+import type { KtxRuntimeToolSet } from './runtime-port.js';
+import { normalizeKtxRuntimeToolOutput } from './runtime-tools.js';
+
+/** @internal */
+export interface CreateCodexRuntimeMcpServerInput {
+  server?: KtxMcpServerLike;
+  toolSet: KtxRuntimeToolSet;
+}
+
+export interface CodexRuntimeMcpServerHandle {
+  url: string;
+  bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN';
+  bearerToken: string;
+  close(): Promise<void>;
+}
+
+type RunServer = typeof runKtxMcpHttpServer;
+
+export interface StartCodexRuntimeMcpServerInput {
+  projectDir: string;
+  toolSet: KtxRuntimeToolSet;
+  runServer?: RunServer;
+}
+
+/** @internal */
+export function createCodexRuntimeMcpServer(input: CreateCodexRuntimeMcpServerInput): KtxMcpServerLike {
+  const server =
+    input.server ??
+    (new McpServer({
+      name: 'ktx-runtime',
+      version: '0.0.0',
+    }) as KtxMcpServerLike);
+
+  for (const descriptor of Object.values(input.toolSet)) {
+    server.registerTool(
+      descriptor.name,
+      {
+        description: descriptor.description,
+        inputSchema: descriptor.inputSchema.shape,
+      },
+      async (toolInput) => {
+        const normalized = normalizeKtxRuntimeToolOutput(await descriptor.execute(toolInput));
+        return {
+          content: [{ type: 'text', text: normalized.markdown }],
+          ...(normalized.structured !== undefined && normalized.structured !== null && typeof normalized.structured === 'object'
+            ? { structuredContent: normalized.structured as object }
+            : {}),
+        };
+      },
+    );
+  }
+
+  return server;
+}
+
+function serverPort(server: Server, fallback: number): number {
+  const address = server.address();
+  return typeof address === 'object' && address ? address.port : fallback;
+}
+
+export async function startCodexRuntimeMcpServer(
+  input: StartCodexRuntimeMcpServerInput,
+): Promise<CodexRuntimeMcpServerHandle> {
+  const bearerToken = randomBytes(32).toString('hex');
+  const runServer = input.runServer ?? runKtxMcpHttpServer;
+  const handle = (await runServer({
+    projectDir: input.projectDir,
+    host: '127.0.0.1',
+    port: 0,
+    token: bearerToken,
+    allowedHosts: ['127.0.0.1', 'localhost'],
+    allowedOrigins: [],
+    createMcpServer: () => createCodexRuntimeMcpServer({ toolSet: input.toolSet }) as McpServer,
+  })) as KtxMcpHttpServerHandle;
+  const port = serverPort(handle.server, 0);
+  return {
+    url: `http://127.0.0.1:${port}/mcp`,
+    bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+    bearerToken,
+    close: () => handle.close(),
+  };
+}
--- a/packages/cli/src/context/llm/codex-models.ts
+++ b/packages/cli/src/context/llm/codex-models.ts
@@ -0,0 +1,20 @@
+export const DEFAULT_CODEX_MODEL = 'gpt-5.5';
+
+const CODEX_MODEL_ALIASES: Record<string, string> = {
+  codex: DEFAULT_CODEX_MODEL,
+  default: DEFAULT_CODEX_MODEL,
+};
+
+const EXPLICIT_CODEX_MODEL_ID = /^(?:gpt|codex)-[a-z0-9][a-z0-9._-]*$/i;
+
+export function resolveCodexModel(model: string): string {
+  const normalized = model.trim();
+  const alias = CODEX_MODEL_ALIASES[normalized];
+  if (alias) {
+    return alias;
+  }
+  if (EXPLICIT_CODEX_MODEL_ID.test(normalized)) {
+    return normalized;
+  }
+  throw new Error(`Unsupported Codex model "${model}". Use codex, default, or a gpt-* / codex-* model id.`);
+}
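
To make the accepted and rejected `--llm-model` values concrete, here is a standalone sketch of the resolution rules from `codex-models.ts`. It mirrors the alias table and regular expression shown in the diff, with one deliberate difference that is an assumption of this sketch and not part of the commit: the aliases live in a `Map`, so a key such as `constructor` cannot resolve through an object's prototype chain. `tryResolve` is a hypothetical wrapper added only to print rejections.

```typescript
const DEFAULT_CODEX_MODEL = 'gpt-5.5';

// Sketch-only choice: a Map instead of a plain object literal, so that only
// explicitly listed aliases match.
const CODEX_MODEL_ALIASES = new Map<string, string>([
  ['codex', DEFAULT_CODEX_MODEL],
  ['default', DEFAULT_CODEX_MODEL],
]);

// Same pattern as in the diff: a gpt-* or codex-* id, case-insensitive.
const EXPLICIT_CODEX_MODEL_ID = /^(?:gpt|codex)-[a-z0-9][a-z0-9._-]*$/i;

function resolveCodexModel(model: string): string {
  const normalized = model.trim();
  const alias = CODEX_MODEL_ALIASES.get(normalized);
  if (alias !== undefined) {
    return alias;
  }
  if (EXPLICIT_CODEX_MODEL_ID.test(normalized)) {
    return normalized;
  }
  throw new Error(`Unsupported Codex model "${model}". Use codex, default, or a gpt-* / codex-* model id.`);
}

// Hypothetical wrapper so rejected inputs can be shown next to accepted ones.
function tryResolve(model: string): string {
  try {
    return resolveCodexModel(model);
  } catch (error) {
    return `rejected: ${(error as Error).message}`;
  }
}

const examples = ['codex', ' default ', 'gpt-5.4-mini', 'gpt-5.3-codex-spark', 'opus', 'constructor'];
for (const example of examples) {
  console.log(JSON.stringify(example), '->', tryResolve(example));
}
```

The two aliases resolve to `gpt-5.5`, the explicit `gpt-*` ids pass through unchanged, and `opus` and `constructor` are rejected before any auth probe would run.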
--- a/packages/cli/src/context/llm/codex-runtime-config.ts
+++ b/packages/cli/src/context/llm/codex-runtime-config.ts
@@ -0,0 +1,38 @@
+interface CodexRuntimeMcpConfig {
+  url: string;
+  bearerTokenEnvVar: string;
+  bearerToken: string;
+  toolNames: string[];
+}
+
+export interface BuildCodexRuntimeConfigInput {
+  model: string;
+  mcp?: CodexRuntimeMcpConfig;
+}
+
+export interface CodexRuntimeConfig {
+  configOverrides: Record<string, unknown>;
+  env: Record<string, string>;
+}
+
+export function buildCodexRuntimeConfig(input: BuildCodexRuntimeConfigInput): CodexRuntimeConfig {
+  const configOverrides: Record<string, unknown> = {
+    history: { persistence: 'none' },
+  };
+  const env: Record<string, string> = {};
+
+  if (input.mcp) {
+    configOverrides.mcp_servers = {
+      ktx: {
+        url: input.mcp.url,
+        bearer_token_env_var: input.mcp.bearerTokenEnvVar,
+        enabled_tools: input.mcp.toolNames,
+        default_tools_approval_mode: 'approve',
+        required: true,
+      },
+    };
+    env[input.mcp.bearerTokenEnvVar] = input.mcp.bearerToken;
+  }
+
+  return { configOverrides, env };
+}
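
One property of `buildCodexRuntimeConfig` worth seeing concretely is that the bearer token travels only through the child-process environment, while the config overrides carry just the name of the environment variable. The sketch below is a standalone copy of that logic for illustration; `buildConfigSketch` is a hypothetical name, and the URL, port, token, and tool name are made-up sample values.

```typescript
interface McpInputSketch {
  url: string;
  bearerTokenEnvVar: string;
  bearerToken: string;
  toolNames: string[];
}

interface RuntimeConfigSketch {
  configOverrides: Record<string, unknown>;
  env: Record<string, string>;
}

// Hypothetical standalone copy of the config-building logic shown in the diff.
function buildConfigSketch(mcp?: McpInputSketch): RuntimeConfigSketch {
  const configOverrides: Record<string, unknown> = {
    history: { persistence: 'none' },
  };
  const env: Record<string, string> = {};

  if (mcp) {
    configOverrides.mcp_servers = {
      ktx: {
        url: mcp.url,
        bearer_token_env_var: mcp.bearerTokenEnvVar, // name only, never the secret
        enabled_tools: mcp.toolNames,
        default_tools_approval_mode: 'approve',
        required: true,
      },
    };
    env[mcp.bearerTokenEnvVar] = mcp.bearerToken; // the secret goes through env
  }

  return { configOverrides, env };
}

// Made-up sample values for illustration.
const withMcp = buildConfigSketch({
  url: 'http://127.0.0.1:49152/mcp',
  bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
  bearerToken: 'example-token',
  toolNames: ['example_tool'],
});
const withoutMcp = buildConfigSketch();

console.log(JSON.stringify(withMcp.configOverrides, null, 2));
console.log(Object.keys(withMcp.env), Object.keys(withoutMcp.env));
```

Serializing `configOverrides` (for logging or passing as CLI overrides) therefore does not expose the token, and a run without MCP tools adds nothing to the environment.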
--- a/packages/cli/src/context/llm/codex-runtime.ts
+++ b/packages/cli/src/context/llm/codex-runtime.ts
@@ -0,0 +1,371 @@
+import { z } from 'zod';
+import { noopLogger, type KtxLogger } from '../core/config.js';
+import { isCompletedAgentStep, summarizeCodexExecEvents, type CodexExecEventSummary } from './codex-exec-events.js';
+import {
+  startCodexRuntimeMcpServer,
+  type CodexRuntimeMcpServerHandle,
+} from './codex-mcp-runtime-server.js';
+import { resolveCodexModel } from './codex-models.js';
+import { buildCodexRuntimeConfig } from './codex-runtime-config.js';
+import { CodexSdkCliRunner, type CodexSdkRunner } from './codex-sdk-runner.js';
+import type {
+  KtxGenerateObjectInput,
+  KtxGenerateTextInput,
+  KtxLlmRuntimePort,
+  KtxRuntimeToolSet,
+  LlmTokenUsage,
+  RunLoopParams,
+  RunLoopResult,
+} from './runtime-port.js';
+
+export interface CodexKtxLlmRuntimeDeps {
+  projectDir: string;
+  modelSlots: { default: string } & Partial<Record<string, string>>;
+  runner?: CodexSdkRunner;
+  startMcpServer?: (input: { projectDir: string; toolSet: KtxRuntimeToolSet }) => Promise<CodexRuntimeMcpServerHandle>;
+  logger?: KtxLogger;
+}
+
+function modelForRole(modelSlots: CodexKtxLlmRuntimeDeps['modelSlots'], role: string): string {
+  return resolveCodexModel(modelSlots[role] ?? modelSlots.default);
+}
+
+function promptWithSystem(system: string | undefined, prompt: string): string {
+  return [system, prompt].filter(Boolean).join('\n\n');
+}
+
+interface CollectCodexEventsOptions {
+  stepBudget?: number;
+  abortController?: AbortController;
+  onStep?: (stepIndex: number) => void | Promise<void>;
+}
+
+interface CollectCodexEventsResult {
+  events: unknown[];
+  budgetExceeded: boolean;
+  streamError?: Error;
+}
+
+function eventRecord(value: unknown): Record<string, unknown> | undefined {
+  return value && typeof value === 'object' ? (value as Record<string, unknown>) : undefined;
+}
+
+function isTurnCompleted(event: unknown): boolean {
+  return eventRecord(event)?.type === 'turn.completed';
+}
+
+/**
+ * Drains the Codex stream once, emitting a step as each agent action completes
+ * so callers see live progress and the step budget is enforced mid-run. Every
+ * completed agent-action item counts (see {@link isCompletedAgentStep}), so
+ * built-in `command_execution` steps decrement the budget the same as
+ * `mcp_tool_call`s. A turn that produced no actions still counts as one step,
+ * matching the metrics summary and the AI SDK backend.
+ */
+async function collectEvents(
+  events: AsyncIterable<unknown>,
+  options: CollectCodexEventsOptions = {},
+): Promise<CollectCodexEventsResult> {
+  const collected: unknown[] = [];
+  let completedSteps = 0;
+  let sawActionStep = false;
+  let budgetExceeded = false;
+  let streamError: Error | undefined;
+
+  // The SDK yields every stdout event, then throws on a non-zero codex exec
+  // exit. Catch that throw so the events already collected (which carry the
+  // real `turn.failed`/`error` reason) survive for the summary; the masked
+  // exit message is kept only as a fallback when no error event was emitted.
+  try {
+    for await (const event of events) {
+      collected.push(event);
+
+      const isActionStep = isCompletedAgentStep(event);
+      if (isActionStep) {
+        sawActionStep = true;
+      } else if (sawActionStep || !isTurnCompleted(event)) {
+        // Only fall back to counting a bare turn as a step when the turn produced
+        // no agent actions; a completed turn is terminal, so it never aborts.
+        continue;
+      }
+
+      completedSteps += 1;
+      await options.onStep?.(completedSteps);
+      if (isActionStep && options.stepBudget !== undefined && completedSteps >= options.stepBudget) {
+        budgetExceeded = true;
+        options.abortController?.abort();
+        break;
+      }
+    }
+  } catch (error) {
+    streamError = error instanceof Error ? error : new Error(String(error));
+  }
+
+  return { events: collected, budgetExceeded, ...(streamError ? { streamError } : {}) };
+}
+
+function metrics(summary: CodexExecEventSummary, startedAt: number): { totalMs: number; usage: LlmTokenUsage } {
+  return { totalMs: Date.now() - startedAt, usage: summary.usage };
+}
+
+function summaryError(summary: CodexExecEventSummary, streamError?: Error): Error | undefined {
+  // A `turn.failed`/`error` event carries the real reason; prefer it over the
+  // SDK's generic non-zero-exit throw. Fall back to the stream error only when
+  // no event explained the failure (e.g. spawn failure or auth before a turn).
+  if (summary.error) {
+    return summary.error;
+  }
+  if (summary.toolFailures.length > 0) {
+    return new Error(`Codex runtime tool call failed: ${summary.toolFailures.join('; ')}`);
+  }
+  return streamError;
+}
+
+function assertSuccessfulText(summary: CodexExecEventSummary, streamError?: Error): string {
+  const error = summaryError(summary, streamError);
+  if (error) {
+    throw error;
+  }
+  if (!summary.finalText.trim()) {
+    throw new Error('Codex completed without an agent message');
+  }
+  return summary.finalText;
+}
+
+function parseStructuredOutput<TOutput, TSchema extends z.ZodType<TOutput>>(schema: TSchema, text: string): TOutput {
+  try {
+    return schema.parse(JSON.parse(text));
+  } catch (error) {
+    const message = error instanceof Error ? error.message : String(error);
+    throw new Error(`Codex structured output failed validation: ${message}`);
+  }
+}
+
+async function mcpForTools(input: {
+  projectDir: string;
+  toolSet?: KtxRuntimeToolSet;
+  startMcpServer: CodexKtxLlmRuntimeDeps['startMcpServer'];
+}): Promise<CodexRuntimeMcpServerHandle | undefined> {
+  if (!input.toolSet || Object.keys(input.toolSet).length === 0) {
+    return undefined;
+  }
+  return (input.startMcpServer ?? startCodexRuntimeMcpServer)({
+    projectDir: input.projectDir,
+    toolSet: input.toolSet,
+  });
+}
+
+function runtimeToolNames(toolSet: KtxRuntimeToolSet | undefined): string[] {
+  return Object.values(toolSet ?? {}).map((descriptor) => descriptor.name);
+}
+
+export class CodexKtxLlmRuntime implements KtxLlmRuntimePort {
+  private readonly runner: CodexSdkRunner;
+  private readonly logger: KtxLogger;
+
+  constructor(private readonly deps: CodexKtxLlmRuntimeDeps) {
+    this.runner = deps.runner ?? new CodexSdkCliRunner();
+    this.logger = deps.logger ?? noopLogger;
+  }
+
+  async generateText(input: KtxGenerateTextInput): Promise<string> {
+    const startedAt = Date.now();
+    const model = modelForRole(this.deps.modelSlots, input.role);
+    const mcp = await mcpForTools({
+      projectDir: this.deps.projectDir,
+      toolSet: input.tools,
+      startMcpServer: this.deps.startMcpServer,
+    });
+    try {
+      const config = buildCodexRuntimeConfig({
+        model,
+        ...(mcp
+          ? {
+              mcp: {
+                url: mcp.url,
+                bearerTokenEnvVar: mcp.bearerTokenEnvVar,
+                bearerToken: mcp.bearerToken,
+                toolNames: runtimeToolNames(input.tools),
+              },
+            }
+          : {}),
+      });
+      const collected = await collectEvents(
+        await this.runner.runStreamed({
+          projectDir: this.deps.projectDir,
+          model,
+          prompt: promptWithSystem(input.system, input.prompt),
+          configOverrides: config.configOverrides,
+          env: config.env,
+        }),
+      );
+      const summary = summarizeCodexExecEvents(collected.events, { startedAt });
+      input.onMetrics?.(metrics(summary, startedAt));
+      return assertSuccessfulText(summary, collected.streamError);
+    } finally {
+      await mcp?.close();
+    }
+  }
+
+  async generateObject<TOutput, TSchema extends z.ZodType<TOutput>>(
+    input: KtxGenerateObjectInput<TOutput, TSchema>,
+  ): Promise<TOutput> {
+    const startedAt = Date.now();
+    const model = modelForRole(this.deps.modelSlots, input.role);
+    const mcp = await mcpForTools({
+      projectDir: this.deps.projectDir,
+      toolSet: input.tools,
+      startMcpServer: this.deps.startMcpServer,
+    });
+    try {
+      const config = buildCodexRuntimeConfig({
+        model,
+        ...(mcp
+          ? {
+              mcp: {
+                url: mcp.url,
+                bearerTokenEnvVar: mcp.bearerTokenEnvVar,
+                bearerToken: mcp.bearerToken,
+                toolNames: runtimeToolNames(input.tools),
+              },
+            }
+          : {}),
+      });
+      const collected = await collectEvents(
+        await this.runner.runStreamed({
+          projectDir: this.deps.projectDir,
+          model,
+          prompt: promptWithSystem(input.system, input.prompt),
+          configOverrides: config.configOverrides,
+          env: config.env,
+          outputSchema: z.toJSONSchema(input.schema, { target: 'draft-7' }) as Record<string, unknown>,
+        }),
+      );
+      const summary = summarizeCodexExecEvents(collected.events, { startedAt });
+      input.onMetrics?.(metrics(summary, startedAt));
+      return parseStructuredOutput(input.schema, assertSuccessfulText(summary, collected.streamError));
+    } finally {
+      await mcp?.close();
+    }
+  }
+
+  async runAgentLoop(params: RunLoopParams): Promise<RunLoopResult> {
+    const startedAt = Date.now();
+    const model = modelForRole(this.deps.modelSlots, params.modelRole);
+    let mcp: CodexRuntimeMcpServerHandle | undefined;
+    try {
+      mcp = await mcpForTools({
+        projectDir: this.deps.projectDir,
+        toolSet: params.toolSet,
+        startMcpServer: this.deps.startMcpServer,
+      });
+      const config = buildCodexRuntimeConfig({
+        model,
+        ...(mcp
+          ? {
+              mcp: {
+                url: mcp.url,
+                bearerTokenEnvVar: mcp.bearerTokenEnvVar,
+                bearerToken: mcp.bearerToken,
+                toolNames: runtimeToolNames(params.toolSet),
+              },
+            }
+          : {}),
+      });
+      const abortController = new AbortController();
+      const onStep = async (stepIndex: number): Promise<void> => {
+        try {
+          await params.onStepFinish?.({ stepIndex, stepBudget: params.stepBudget });
+        } catch (error) {
+          this.logger.warn(
+            `[codex-runner] onStepFinish callback threw; ignoring: ${error instanceof Error ? error.message : String(error)}`,
+          );
+        }
+      };
+      const collected = await collectEvents(
+        await this.runner.runStreamed({
+          projectDir: this.deps.projectDir,
+          model,
+          prompt: promptWithSystem(params.systemPrompt, params.userPrompt),
+          configOverrides: config.configOverrides,
+          env: config.env,
+          signal: abortController.signal,
+        }),
+        { stepBudget: params.stepBudget, abortController, onStep },
+      );
+      const summary = summarizeCodexExecEvents(collected.events, { startedAt });
+      const error = summaryError(summary, collected.streamError);
+      const stopReason = collected.budgetExceeded ? 'budget' : error ? 'error' : summary.stopReason;
+      return {
+        stopReason,
+        ...(stopReason === 'error' && error ? { error } : {}),
+        metrics: {
+          totalMs: Date.now() - startedAt,
+          usage: summary.usage,
+          stepCount: summary.stepCount,
+          stepBoundariesMs: summary.stepBoundariesMs,
+        },
+      };
+    } catch (error) {
+      const err = error instanceof Error ? error : new Error(String(error));
+      return {
+        stopReason: 'error',
+        error: err,
+        metrics: { totalMs: Date.now() - startedAt, usage: {}, stepCount: 0, stepBoundariesMs: [] },
+      };
+    } finally {
+      await mcp?.close();
+    }
+  }
+}
+
+// A rejected model is not an auth failure: Codex authenticated, connected, and
+// the API refused the model id. These markers come from the API error envelope
+// (e.g. "model is not supported", "invalid_request_error").
+const MODEL_UNAVAILABLE_MARKERS =
+  /\bnot supported\b|\bnot available\b|\bdoes not exist\b|invalid_request_error|\bunknown model\b|\bunsupported model\b/i;
+
+function describeCodexProbeFailure(model: string, message: string): { message: string; fix: string } {
+  if (MODEL_UNAVAILABLE_MARKERS.test(message)) {
+    const fix = `Run \`codex\` to see the models your account supports, then set llm.models.default in ktx.yaml (or rerun \`ktx setup\`).`;
+    return {
+      message: `Codex is authenticated, but the configured model "${model}" is not available for this Codex account. ${fix} Details: ${message}`,
+      fix,
+    };
+  }
+  const fix = `Authenticate Codex locally with the Codex CLI, verify the Codex CLI is installed, then rerun setup or \`ktx status\`.`;
+  return {
+    message: `Codex authentication is not usable. ${fix} Details: ${message}`,
+    fix,
+  };
+}
+
+export async function runCodexAuthProbe(input: {
+  projectDir: string;
+  model: string;
+  runner?: CodexSdkRunner;
+}): Promise<{ ok: true } | { ok: false; message: string; fix: string }> {
+  let model: string;
+  try {
+    model = resolveCodexModel(input.model);
+  } catch (error) {
+    return {
+      ok: false,
+      message: error instanceof Error ? error.message : String(error),
+      fix: 'Set llm.models.default in ktx.yaml to a supported codex model (codex, default, or a gpt-* / codex-* id), or rerun `ktx setup`.',
+    };
+  }
+
+  const runtime = new CodexKtxLlmRuntime({
+    projectDir: input.projectDir,
+    modelSlots: { default: model },
+    ...(input.runner ? { runner: input.runner } : {}),
+  });
+  try {
+    await runtime.generateText({ role: 'default', prompt: 'Reply with exactly: ok' });
+    return { ok: true };
+  } catch (error) {
+    const message = error instanceof Error ? error.message : String(error);
+    return { ok: false, ...describeCodexProbeFailure(model, message) };
+  }
+}
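Review note: the step-counting rule in `collectEvents` can be sketched as a small synchronous reducer. This is a simplification under stated assumptions, not the shipped code: the real function is async, fires `onStep`, and aborts the run through an `AbortController`. The `isActionStepSketch` predicate is hypothetical. It assumes the `{ type: 'item.completed', item: { type } }` event shape and the four action item types named in the commit description, since `isCompletedAgentStep` itself is defined in `codex-exec-events.ts`.

```typescript
// Item types treated as agent actions, per the commit description (assumption).
const ACTION_ITEM_TYPES = ['command_execution', 'mcp_tool_call', 'file_change', 'web_search'];

interface SketchEvent {
  type: string;
  item?: { type: string };
}

// Hypothetical stand-in for isCompletedAgentStep.
function isActionStepSketch(event: SketchEvent): boolean {
  return (
    event.type === 'item.completed' &&
    event.item !== undefined &&
    ACTION_ITEM_TYPES.indexOf(event.item.type) !== -1
  );
}

interface CountResult {
  steps: number;
  budgetExceeded: boolean;
  consumed: number; // how many events were read before stopping
}

// Same counting rule as collectEvents, without callbacks or abort handling.
function countSteps(events: SketchEvent[], stepBudget?: number): CountResult {
  let steps = 0;
  let consumed = 0;
  let sawAction = false;
  for (const event of events) {
    consumed += 1;
    const isAction = isActionStepSketch(event);
    if (isAction) {
      sawAction = true;
    } else if (sawAction || event.type !== 'turn.completed') {
      continue; // not a step: a non-action event, or a turn that already had actions
    }
    steps += 1;
    // Only action steps can trip the budget; a bare completed turn is terminal anyway.
    if (isAction && stepBudget !== undefined && steps >= stepBudget) {
      return { steps, budgetExceeded: true, consumed };
    }
  }
  return { steps, budgetExceeded: false, consumed };
}

const toolRun: SketchEvent[] = [
  { type: 'turn.started' },
  { type: 'item.completed', item: { type: 'command_execution' } },
  { type: 'item.completed', item: { type: 'mcp_tool_call' } },
  { type: 'item.completed', item: { type: 'agent_message' } },
  { type: 'turn.completed' },
];
const bareTurn: SketchEvent[] = [
  { type: 'turn.started' },
  { type: 'item.completed', item: { type: 'agent_message' } },
  { type: 'turn.completed' },
];

const unbounded = countSteps(toolRun); // two action steps, no budget
const bounded = countSteps(toolRun, 1); // stops after the first action
const noTools = countSteps(bareTurn, 1); // bare turn counts once, never exceeds
```

The `noTools` case shows why the budget check is gated on `isAction`: a turn without actions still reports one step for metrics parity, but it does not mark the run as over budget.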
--- a/packages/cli/src/context/llm/codex-sdk-runner.ts
+++ b/packages/cli/src/context/llm/codex-sdk-runner.ts
@@ -0,0 +1,96 @@
+import { Codex, type CodexOptions, type ThreadOptions, type TurnOptions } from '@openai/codex-sdk';
+
+export interface CodexSdkRunnerInput {
+  projectDir: string;
+  model: string;
+  prompt: string;
+  configOverrides?: Record<string, unknown>;
+  env?: Record<string, string>;
+  outputSchema?: Record<string, unknown>;
+  signal?: AbortSignal;
+}
+
+export interface CodexSdkRunner {
+  runStreamed(input: CodexSdkRunnerInput): Promise<AsyncIterable<unknown>>;
+}
+
+type CodexThread = {
+  runStreamed(input: string, turnOptions?: TurnOptions): Promise<{ events: AsyncIterable<unknown> }>;
+};
+
+type CodexClient = {
+  startThread(options: ThreadOptions): CodexThread;
+};
+
+type CodexConstructor = new (options?: CodexOptions) => CodexClient;
+
+export interface CodexSdkCliRunnerOptions {
+  envBase?: NodeJS.ProcessEnv;
+  codexPathOverride?: string;
+}
+
+const CODEX_ENV_ALLOWLIST = new Set([
+  'HOME',
+  'USERPROFILE',
+  'APPDATA',
+  'LOCALAPPDATA',
+  'XDG_CONFIG_HOME',
+  'CODEX_HOME',
+  'CODEX_API_KEY',
+  'OPENAI_API_KEY',
+  'PATH',
+  'Path',
+  'SYSTEMROOT',
+  'COMSPEC',
+  'TMPDIR',
+  'TMP',
+  'TEMP',
+  'SSL_CERT_FILE',
+  'SSL_CERT_DIR',
+  'NODE_EXTRA_CA_CERTS',
+  'HTTPS_PROXY',
+  'HTTP_PROXY',
+  'ALL_PROXY',
+  'NO_PROXY',
+]);
+
+function buildCodexSdkEnv(baseEnv: NodeJS.ProcessEnv, overrides: Record<string, string> | undefined): Record<string, string> {
+  const env: Record<string, string> = {};
+  for (const key of CODEX_ENV_ALLOWLIST) {
+    const value = baseEnv[key];
+    if (typeof value === 'string') {
+      env[key] = value;
+    }
+  }
+  return { ...env, ...(overrides ?? {}) };
+}
+
+export class CodexSdkCliRunner implements CodexSdkRunner {
+  constructor(private readonly options: CodexSdkCliRunnerOptions = {}) {}
+
+  async runStreamed(input: CodexSdkRunnerInput): Promise<AsyncIterable<unknown>> {
+    const CodexClass = Codex as CodexConstructor;
+    const codex = new CodexClass({
+      ...(input.configOverrides ? { config: input.configOverrides as CodexOptions['config'] } : {}),
+      env: buildCodexSdkEnv(this.options.envBase ?? process.env, input.env),
+      ...(this.options.codexPathOverride ? { codexPathOverride: this.options.codexPathOverride } : {}),
+    });
+    const thread = codex.startThread({
+      workingDirectory: input.projectDir,
+      skipGitRepoCheck: true,
+      model: input.model,
+      sandboxMode: 'read-only',
+      webSearchMode: 'disabled',
+      approvalPolicy: 'never',
+    });
+    const turnOptions: TurnOptions = {
+      ...(input.outputSchema ? { outputSchema: input.outputSchema } : {}),
+      ...(input.signal ? { signal: input.signal } : {}),
+    };
+    const streamed = await thread.runStreamed(
+      input.prompt,
+      Object.keys(turnOptions).length > 0 ? turnOptions : undefined,
+    );
+    return streamed.events;
+  }
+}
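Review note: the environment filtering in `buildCodexSdkEnv` can be exercised in isolation. The sketch below uses a shortened, hypothetical allowlist and made-up parent variables. Only the filter-then-override logic follows the diff.

```typescript
// Shortened allowlist for illustration; the diff's CODEX_ENV_ALLOWLIST is longer.
const ALLOWLIST_SKETCH = ['HOME', 'PATH', 'OPENAI_API_KEY'];

function buildEnvSketch(
  baseEnv: Record<string, string | undefined>,
  overrides?: Record<string, string>,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of ALLOWLIST_SKETCH) {
    const value = baseEnv[key];
    // Copy only allowlisted variables that are actually set.
    if (typeof value === 'string') {
      env[key] = value;
    }
  }
  // Overrides (such as the MCP bearer token) are applied last and bypass the allowlist.
  return { ...env, ...(overrides ?? {}) };
}

// Hypothetical parent environment and override.
const childEnv = buildEnvSketch(
  { HOME: '/home/dev', PATH: '/usr/bin', AWS_SECRET_ACCESS_KEY: 'do-not-forward' },
  { KTX_MCP_TOKEN: 'secret-token' },
);
```

Unrelated secrets in the parent process are dropped, unset allowlisted keys are omitted rather than forwarded as empty strings, and the per-run override is the only way a non-allowlisted name reaches the Codex child process.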
--- a/packages/cli/src/context/llm/local-config.ts
+++ b/packages/cli/src/context/llm/local-config.ts
@@ -5,6 +5,7 @@ import { resolveKtxConfigReference } from '../core/config-reference.js';
 import type { KtxProjectEmbeddingConfig, KtxProjectLlmConfig } from '../project/config.js';
 import { AiSdkKtxLlmRuntime } from './ai-sdk-runtime.js';
 import { ClaudeCodeKtxLlmRuntime } from './claude-code-runtime.js';
+import { CodexKtxLlmRuntime } from './codex-runtime.js';
 import type { KtxLlmRuntimePort } from './runtime-port.js';

 interface LocalConfigDeps {
@@ -13,6 +14,7 @@ interface LocalConfigDeps {
  createKtxLlmProvider?: typeof createKtxLlmProvider;
  createKtxEmbeddingProvider?: typeof createKtxEmbeddingProvider;
  createClaudeCodeRuntime?: (deps: ConstructorParameters<typeof ClaudeCodeKtxLlmRuntime>[0]) => KtxLlmRuntimePort;
+  createCodexRuntime?: (deps: ConstructorParameters<typeof CodexKtxLlmRuntime>[0]) => KtxLlmRuntimePort;
  createAiSdkRuntime?: (deps: { llmProvider: KtxLlmProvider }) => KtxLlmRuntimePort;
 }

@@ -104,7 +106,7 @@ export function createLocalKtxLlmProviderFromConfig(
  deps: LocalConfigDeps = {},
 ): KtxLlmProvider | null {
  const resolved = resolveLocalKtxLlmConfig(config, deps.env ?? process.env);
-  if (!resolved || resolved.backend === 'claude-code') {
+  if (!resolved || resolved.backend === 'claude-code' || resolved.backend === 'codex') {
    return null;
  }
  return (deps.createKtxLlmProvider ?? createKtxLlmProvider)(resolved);
@@ -129,6 +131,16 @@ export function createLocalKtxLlmRuntimeFromConfig(
      env: deps.env,
    });
  }
+  if (resolved.backend === 'codex') {
+    const projectDir = deps.projectDir;
+    if (!projectDir) {
+      throw new Error('projectDir is required when creating the codex LLM runtime');
+    }
+    return (deps.createCodexRuntime ?? ((runtimeDeps) => new CodexKtxLlmRuntime(runtimeDeps)))({
+      projectDir,
+      modelSlots: resolved.modelSlots,
+    });
+  }
  const llmProvider = (deps.createKtxLlmProvider ?? createKtxLlmProvider)(resolved);
  return (deps.createAiSdkRuntime ?? ((runtimeDeps) => new AiSdkKtxLlmRuntime(runtimeDeps)))({ llmProvider });
 }
--- a/packages/cli/src/context/project/config.ts
+++ b/packages/cli/src/context/project/config.ts
@@ -3,7 +3,7 @@ import YAML from 'yaml';
 import * as z from 'zod';
 import { connectionConfigSchema } from './driver-schemas.js';

-const KTX_LLM_BACKENDS = ['none', 'anthropic', 'vertex', 'gateway', 'claude-code'] as const;
+const KTX_LLM_BACKENDS = ['none', 'anthropic', 'vertex', 'gateway', 'claude-code', 'codex'] as const;
 const KTX_EMBEDDING_BACKENDS = ['none', 'openai', 'sentence-transformers'] as const;
 const KTX_PROMPT_CACHE_TTLS = ['5m', '1h'] as const;
 const KTX_ENRICHMENT_MODES = ['none', 'deterministic', 'llm'] as const;
@@ -38,7 +38,7 @@ const llmProviderSchema = z
      .enum(KTX_LLM_BACKENDS)
      .default('none')
      .describe(
-        'LLM provider backend. "none" disables LLM features; "anthropic" / "vertex" / "gateway" require the matching nested credentials block; "claude-code" uses the local Claude Code session.',
+        'LLM provider backend. "none" disables LLM features; "anthropic" / "vertex" / "gateway" require the matching nested credentials block; "claude-code" uses the local Claude Code session; "codex" uses the local Codex session.',
      ),
    vertex: vertexProviderSchema.optional().describe('Vertex AI credentials, used when backend is "vertex".'),
    anthropic: apiCredentialsSchema.optional().describe('Anthropic API credentials, used when backend is "anthropic".'),
--- a/packages/cli/src/llm/types.ts
+++ b/packages/cli/src/llm/types.ts
@@ -3,7 +3,7 @@ import type { LanguageModel, TelemetrySettings, ToolCallRepairFunction, ToolSet
 export const KTX_MODEL_ROLES = ['default', 'triage', 'candidateExtraction', 'curator', 'reconcile', 'repair'] as const;

 export type KtxModelRole = (typeof KTX_MODEL_ROLES)[number];
-type KtxLlmBackend = 'anthropic' | 'vertex' | 'gateway' | 'claude-code';
+type KtxLlmBackend = 'anthropic' | 'vertex' | 'gateway' | 'claude-code' | 'codex';
 export type KtxPromptCacheTtl = '5m' | '1h';

 type KtxJsonValue =
--- a/packages/cli/src/setup-models.ts
+++ b/packages/cli/src/setup-models.ts
@@ -3,6 +3,9 @@ import { writeFile } from 'node:fs/promises';
 import { promisify } from 'node:util';
 import { resolveLocalKtxLlmConfig } from './context/llm/local-config.js';
 import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import { formatCodexIsolationWarning } from './context/llm/codex-isolation.js';
+import { runCodexAuthProbe } from './context/llm/codex-runtime.js';
+import { DEFAULT_CODEX_MODEL } from './context/llm/codex-models.js';
 import { resolveKtxConfigReference } from './context/core/config-reference.js';
 import { type KtxProjectConfig, type KtxProjectLlmConfig, serializeKtxProjectConfig } from './context/project/config.js';
 import { loadKtxProject } from './context/project/project.js';
@@ -56,7 +59,7 @@ export interface AnthropicModelChoice {
  recommended: boolean;
 }

-export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code';
+export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code' | 'codex';

 /** @internal */
 export interface KtxSetupModelPromptAdapter {
@@ -82,6 +85,7 @@ export interface KtxSetupModelDeps {
    model: string;
    env?: NodeJS.ProcessEnv;
  }) => Promise<{ ok: true } | { ok: false; message: string }>;
+  codexAuthProbe?: (input: { projectDir: string; model: string }) => Promise<{ ok: true } | { ok: false; message: string }>;
  readGcloudProject?: () => Promise<string | undefined>;
  listGcloudProjects?: () => Promise<GcloudProjectChoice[]>;
  spinner?: () => KtxCliSpinner;
@@ -110,6 +114,20 @@ const CLAUDE_CODE_MODELS: AnthropicModelChoice[] = [
  { id: 'haiku', label: 'Claude Haiku', recommended: false },
 ];

+// Curated Codex models from OpenAI's current lineup that work under both
+// ChatGPT-account (subscription) and API-key auth. Intentionally omitted:
+// the `*-codex` ids (e.g. gpt-5.3-codex, gpt-5.2-codex) are API-key-only and
+// fail on ChatGPT-account auth, and gpt-5.3-codex-spark is a ChatGPT-Pro-only
+// research preview. Codex resolves real availability per account at runtime
+// (its binary remote-fetches the model list), so this is a convenience
+// shortlist only — the manual-entry option accepts any id your account's
+// `codex` picker exposes, and the auth probe reports an unsupported choice.
+const CODEX_MODELS: AnthropicModelChoice[] = [
+  { id: 'gpt-5.5', label: 'GPT-5.5', recommended: true },
+  { id: 'gpt-5.4', label: 'GPT-5.4', recommended: false },
+  { id: 'gpt-5.4-mini', label: 'GPT-5.4 mini', recommended: false },
+];
+
 const HIDDEN_ANTHROPIC_MODEL_PATTERNS = [
  /^claude-sonnet-4$/i,
  /^claude-opus-4$/i,
@@ -272,7 +290,12 @@ export function isKtxSetupLlmConfigReady(config: KtxProjectLlmConfig): boolean {
    return typeof resolved.vertex?.location === 'string' && resolved.vertex.location.trim().length > 0;
  }

-  return resolved.backend === 'anthropic' || resolved.backend === 'gateway' || resolved.backend === 'claude-code';
+  return (
+    resolved.backend === 'anthropic' ||
+    resolved.backend === 'gateway' ||
+    resolved.backend === 'claude-code' ||
+    resolved.backend === 'codex'
+  );
 }

 function hasUsableConfiguredLlm(config: KtxProjectConfig): boolean {
@@ -284,7 +307,8 @@ function buildProjectLlmConfig(
  provider:
    | { backend: 'anthropic'; credentialRef: string }
    | { backend: 'vertex'; vertex: { project?: string; location: string } }
-    | { backend: 'claude-code' },
+    | { backend: 'claude-code' }
+    | { backend: 'codex' },
  model: string,
 ): KtxProjectLlmConfig {
  if (provider.backend === 'claude-code') {
@@ -295,6 +319,14 @@ function buildProjectLlmConfig(
    };
  }

+  if (provider.backend === 'codex') {
+    return {
+      provider: { backend: 'codex' },
+      models: { ...existing.models, default: model },
+      promptCaching: existing.promptCaching,
+    };
+  }
+
  if (provider.backend === 'vertex') {
    return {
      provider: {
@@ -515,6 +547,7 @@ async function chooseBackend(
    message: 'Which LLM provider should KTX use?',
    options: [
      { value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
+      { value: 'codex', label: 'Codex subscription' },
      { value: 'anthropic', label: 'Anthropic API key' },
      { value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
      { value: 'back', label: 'Back' },
@@ -525,7 +558,7 @@ async function chooseBackend(
  }
  return {
    status: 'ready',
-    backend: choice === 'vertex' || choice === 'claude-code' ? choice : 'anthropic',
+    backend: choice === 'vertex' || choice === 'claude-code' || choice === 'codex' ? choice : 'anthropic',
    prompted: true,
  };
 }
@@ -884,12 +917,51 @@ async function chooseClaudeCodeModel(args: KtxSetupModelArgs, deps: KtxSetupMode
  return { status: 'ready', model: choice };
 }

+async function chooseCodexModel(args: KtxSetupModelArgs, deps: KtxSetupModelDeps): Promise<ChooseModelResult> {
+  const providedModel = requestedModel(args);
+  if (providedModel) {
+    return { status: 'ready', model: providedModel };
+  }
+  if (args.inputMode === 'disabled') {
+    return { status: 'ready', model: DEFAULT_CODEX_MODEL };
+  }
+
+  const prompts = deps.prompts ?? createPromptAdapter();
+  const choice = await prompts.select({
+    message: `Which Codex model should KTX use?\n\n${ANTHROPIC_MODEL_PROMPT_CONTEXT}`,
+    options: [
+      ...CODEX_MODELS.map((model) => ({
+        value: model.id,
+        label: model.label,
+        ...(model.recommended ? { hint: 'recommended' } : {}),
+      })),
+      { value: 'manual', label: 'Enter a Codex model ID manually' },
+      { value: 'back', label: 'Back' },
+    ],
+  });
+  if (choice === 'back') {
+    return { status: 'back' };
+  }
+  if (choice === 'manual') {
+    const manual = await prompts.text({
+      message: withTextInputNavigation('Codex model ID'),
+      placeholder: CODEX_MODELS.find((model) => model.recommended)?.id ?? CODEX_MODELS[0]?.id,
+    });
+    if (manual === undefined) {
+      return { status: 'back' };
+    }
+    return manual.trim() ? { status: 'ready', model: manual.trim() } : { status: 'missing-input' };
+  }
+  return { status: 'ready', model: choice };
+}
+
 async function persistLlmConfig(
  projectDir: string,
  provider:
    | { backend: 'anthropic'; credentialRef: string }
    | { backend: 'vertex'; vertex: { project?: string; location: string } }
-    | { backend: 'claude-code' },
+    | { backend: 'claude-code' }
+    | { backend: 'codex' },
  model: string,
 ): Promise<void> {
  const project = await loadKtxProject({ projectDir });
@@ -1031,6 +1103,32 @@ export async function runKtxSetupAnthropicModelStep(
      return { status: 'ready', projectDir: args.projectDir };
    }

+    if (backendChoice.backend === 'codex') {
+      const model = await chooseCodexModel(backendArgs, deps);
+      if (model.status === 'back' && backendChoice.prompted) {
+        attemptArgs = buildInteractiveRetryArgs(args);
+        continue;
+      }
+      if (model.status === 'invalid-credential') {
+        return { status: 'failed', projectDir: args.projectDir };
+      }
+      if (model.status !== 'ready') {
+        return { status: model.status, projectDir: args.projectDir };
+      }
+      const probe = deps.codexAuthProbe ?? runCodexAuthProbe;
+      const health = await probe({ projectDir: args.projectDir, model: model.model });
+      if (!health.ok) {
+        io.stderr.write(`${health.message}\n`);
+        return { status: 'failed', projectDir: args.projectDir };
+      }
+      // Prefix the clack gutter so the warning sits inside the setup frame
+      // instead of breaking out of it; kept on stderr for scripted runs.
+      io.stderr.write(`│  ${formatCodexIsolationWarning()}\n`);
+      await persistLlmConfig(args.projectDir, { backend: 'codex' }, model.model);
+      io.stdout.write(`│  LLM ready: yes (codex, ${model.model})\n`);
+      return { status: 'ready', projectDir: args.projectDir };
+    }
+
    const credential = await chooseCredentialRef(backendArgs, io, deps);
    if (credential.status === 'back' && backendChoice.prompted) {
      attemptArgs = buildInteractiveRetryArgs(args);
--- a/packages/cli/src/status-project.ts
+++ b/packages/cli/src/status-project.ts
@@ -1,6 +1,11 @@
 import { stat as statAsync, readdir as readdirAsync } from 'node:fs/promises';
 import { basename, join } from 'node:path';
 import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import {
+  CODEX_ISOLATION_WARNING,
+  CODEX_ISOLATION_WARNING_FIX,
+} from './context/llm/codex-isolation.js';
+import { runCodexAuthProbe } from './context/llm/codex-runtime.js';
 import type { KtxConfigIssue, KtxProjectConfig, KtxProjectConnectionConfig, KtxProjectEmbeddingConfig, KtxProjectLlmConfig } from './context/project/config.js';
 import type { KtxLocalProject } from './context/project/project.js';
 import { ktxLocalStateDbPath } from './context/project/local-state-db.js';
@@ -94,6 +99,11 @@ type ClaudeCodeAuthProbe = (input: {
  env?: NodeJS.ProcessEnv;
 }) => Promise<{ ok: true } | { ok: false; message: string }>;

+type CodexAuthProbe = (input: {
+  projectDir: string;
+  model: string;
+}) => Promise<{ ok: true } | { ok: false; message: string; fix: string }>;
+
 const PROJECT_READY_COMMANDS = KTX_NEXT_STEP_DIRECT_COMMANDS.map((step) => step.command);

 interface LocalStatsIngestPerConnection {
@@ -194,6 +204,7 @@ async function buildLlmStatus(
    projectDir: string;
    env: NodeJS.ProcessEnv;
    claudeCodeAuthProbe?: ClaudeCodeAuthProbe;
+    codexAuthProbe?: CodexAuthProbe;
    fast?: boolean;
    useSpinner?: boolean;
  },
@@ -210,6 +221,18 @@ async function buildLlmStatus(
      fix: 'Run: ktx setup (choose an LLM provider)',
    };
  }
+  // The runtime (resolveModelSlots) hard-requires llm.models.default for every
+  // non-none backend; without it ingest/scan/memory throw. Report that here so
+  // status never marks a project ready that the runtime would refuse to run.
+  if (!model || model.trim().length === 0) {
+    return {
+      backend,
+      model,
+      status: 'fail',
+      detail: `llm.models.default is required for backend "${backend}"`,
+      fix: 'Set llm.models.default in ktx.yaml, then rerun `ktx status` (or rerun `ktx setup`).',
+    };
+  }
  if (backend === 'anthropic') {
    const ref = config.provider.anthropic?.api_key;
    const resolved = resolveRef(ref, env);
@@ -251,7 +274,7 @@ async function buildLlmStatus(
    };
  }
  if (backend === 'claude-code') {
-    const modelName = model ?? 'sonnet';
+    const modelName = model;
    if (options.fast === true) {
      return {
        backend,
@@ -280,6 +303,36 @@ async function buildLlmStatus(
      fix: 'Authenticate Claude Code locally with the Claude Code CLI, then rerun `ktx status`.',
    };
  }
+  if (backend === 'codex') {
+    const modelName = model;
+    if (options.fast === true) {
+      return {
+        backend,
+        model: modelName,
+        status: 'skipped',
+        detail: 'auth probe skipped (--fast)',
+      };
+    }
+    const probe = options.codexAuthProbe ?? runCodexAuthProbe;
+    const auth = await withSpinner(options.useSpinner === true, 'Probing Codex authentication', () =>
+      probe({ projectDir: options.projectDir, model: modelName }),
+    );
+    if (auth.ok) {
+      return {
+        backend,
+        model: modelName,
+        status: 'ok',
+        detail: 'local Codex session authenticated',
+      };
+    }
+    return {
+      backend,
+      model: modelName,
+      status: 'fail',
+      detail: auth.message,
+      fix: auth.fix,
+    };
+  }
  return { backend, model, status: 'warn', detail: 'unknown LLM backend' };
 }

@@ -572,6 +625,13 @@ function buildWarnings(
    });
  }

+  if (llm.backend === 'codex') {
+    warnings.push({
+      message: CODEX_ISOLATION_WARNING,
+      fix: CODEX_ISOLATION_WARNING_FIX,
+    });
+  }
+
  return warnings;
 }

@@ -634,6 +694,7 @@ export interface BuildProjectStatusOptions {
  env?: NodeJS.ProcessEnv;
  queryHistoryReadinessProbe?: HistoricSqlReadinessProbe;
  claudeCodeAuthProbe?: ClaudeCodeAuthProbe;
+  codexAuthProbe?: CodexAuthProbe;
  configIssues?: KtxConfigIssue[];
  fast?: boolean;
  useSpinner?: boolean;
@@ -882,6 +943,7 @@ export async function buildProjectStatus(project: KtxLocalProject, options: Buil
    projectDir: project.projectDir,
    env,
    claudeCodeAuthProbe: options.claudeCodeAuthProbe,
+    codexAuthProbe: options.codexAuthProbe,
    fast: options.fast,
    useSpinner: options.useSpinner,
  });
--- a/packages/cli/test/context/ingest/local-bundle-runtime.test.ts
+++ b/packages/cli/test/context/ingest/local-bundle-runtime.test.ts
@@ -77,9 +77,10 @@ describe('createLocalBundleIngestRuntime', () => {
      }),
    ).toThrow(
      [
-        'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
-        'Configure a local Claude Code session or API-backed LLM, then rerun ingest:',
+        'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
+        'Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:',
        `  ktx setup --project-dir ${project.projectDir} --llm-backend claude-code --no-input`,
+        `  ktx setup --project-dir ${project.projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
        `  ktx setup --project-dir ${project.projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
      ].join('\n'),
    );
--- a/packages/cli/test/context/llm/codex-exec-events.test.ts
+++ b/packages/cli/test/context/llm/codex-exec-events.test.ts
@@ -0,0 +1,188 @@
+import { describe, expect, it } from 'vitest';
+import {
+  parseCodexExecEventLine,
+  summarizeCodexExecEvents,
+} from '../../../src/context/llm/codex-exec-events.js';
+
+describe('Codex exec event parsing', () => {
+  it('uses the completed turn as one step when no MCP tools run', () => {
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'thread.started', thread_id: 'thr_1' },
+        { type: 'turn.started' },
+        { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'hello from codex' } },
+        {
+          type: 'turn.completed',
+          usage: {
+            input_tokens: 12,
+            cached_input_tokens: 4,
+            output_tokens: 5,
+            reasoning_output_tokens: 2,
+          },
+        },
+      ],
+      { startedAt: 100, now: () => 125 },
+    );
+
+    expect(summary).toEqual({
+      finalText: 'hello from codex',
+      stopReason: 'natural',
+      usage: { inputTokens: 12, outputTokens: 5, totalTokens: 17 },
+      stepCount: 1,
+      stepBoundariesMs: [25],
+      toolCallCount: 0,
+      toolFailures: [],
+    });
+  });
+
+  it('uses completed MCP tool calls as loop steps', () => {
+    const offsets = [115, 140, 175];
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'turn.started' },
+        {
+          type: 'item.started',
+          item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'in_progress' },
+        },
+        {
+          type: 'item.completed',
+          item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'completed' },
+        },
+        {
+          type: 'item.started',
+          item: { id: 'call_2', type: 'mcp_tool_call', server: 'ktx', tool: 'lookup', arguments: {}, status: 'in_progress' },
+        },
+        {
+          type: 'item.completed',
+          item: {
+            id: 'call_2',
+            type: 'mcp_tool_call',
+            server: 'ktx',
+            tool: 'lookup',
+            arguments: {},
+            status: 'failed',
+            error: { message: 'denied' },
+          },
+        },
+        { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'done' } },
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, cached_input_tokens: 0, reasoning_output_tokens: 0 } },
+      ],
+      { startedAt: 100, now: () => offsets.shift() ?? 175 },
+    );
+
+    expect(summary).toEqual({
+      finalText: 'done',
+      stopReason: 'natural',
+      usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
+      stepCount: 2,
+      stepBoundariesMs: [15, 40],
+      toolCallCount: 2,
+      toolFailures: ['lookup: denied'],
+    });
+  });
+
+  it('does not treat a completed MCP tool call as failed when Codex sends error: null', () => {
+    // Captured verbatim from a real @openai/codex-sdk run: successful tool calls
+    // carry `error: null` and `result` alongside `status: "completed"`.
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      {
+        type: 'item.started',
+        item: {
+          id: 'item_1',
+          type: 'mcp_tool_call',
+          server: 'ktx',
+          tool: 'echo_value',
+          arguments: { value: 'ktx_codex_tool_ok' },
+          result: null,
+          error: null,
+          status: 'in_progress',
+        },
+      },
+      {
+        type: 'item.completed',
+        item: {
+          id: 'item_1',
+          type: 'mcp_tool_call',
+          server: 'ktx',
+          tool: 'echo_value',
+          arguments: { value: 'ktx_codex_tool_ok' },
+          result: { content: [{ type: 'text', text: 'echo:ktx_codex_tool_ok' }], structured_content: null },
+          error: null,
+          status: 'completed',
+        },
+      },
+      { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+      { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+    ]);
+
+    expect(summary.toolFailures).toEqual([]);
+    expect(summary.toolCallCount).toBe(1);
+  });
+
+  it('counts built-in command executions as loop steps without failing the loop', () => {
+    const offsets = [110, 130];
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'turn.started' },
+        { type: 'item.completed', item: { id: 'c1', type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
+        { type: 'item.completed', item: { id: 'c2', type: 'command_execution', command: 'cat missing', status: 'failed', exit_code: 1 } },
+        { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+        { type: 'turn.completed', usage: { input_tokens: 2, output_tokens: 1 } },
+      ],
+      { startedAt: 100, now: () => offsets.shift() ?? 130 },
+    );
+
+    expect(summary.stepCount).toBe(2);
+    expect(summary.stepBoundariesMs).toEqual([10, 30]);
+    // A non-zero command exit is normal agent exploration, not a runtime tool failure.
+    expect(summary.toolFailures).toEqual([]);
+    expect(summary.toolCallCount).toBe(0);
+  });
+
+  it('maps turn failures into error stop reason', () => {
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      { type: 'turn.failed', error: { message: 'Codex could not connect to required MCP server' } },
+    ]);
+
+    expect(summary.stopReason).toBe('error');
+    expect(summary.error?.message).toContain('Codex could not connect to required MCP server');
+  });
+
+  it('unwraps the Codex API error envelope into its human-readable message', () => {
+    // Codex serializes API errors as a JSON envelope inside the event message.
+    const apiError = JSON.stringify({
+      type: 'error',
+      status: 400,
+      error: {
+        type: 'invalid_request_error',
+        message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+      },
+    });
+    const summary = summarizeCodexExecEvents([
+      { type: 'thread.started', thread_id: 'thr_1' },
+      { type: 'turn.started' },
+      { type: 'error', message: apiError },
+      { type: 'turn.failed', error: { message: apiError } },
+    ]);
+
+    expect(summary.stopReason).toBe('error');
+    expect(summary.error?.message).toBe(
+      "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+    );
+  });
+
+  it('maps max-turns terminal reasons into budget stop reason when Codex emits one', () => {
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      { type: 'turn.completed', reason: 'max_turns', usage: { input_tokens: 1, output_tokens: 1 } },
+    ]);
+
+    expect(summary.stopReason).toBe('budget');
+  });
+
+  it('throws a clear error for malformed JSONL lines', () => {
+    expect(() => parseCodexExecEventLine('{not-json')).toThrow('Codex JSONL event stream was malformed');
+  });
+});
--- a/packages/cli/test/context/llm/codex-isolation.test.ts
+++ b/packages/cli/test/context/llm/codex-isolation.test.ts
@@ -0,0 +1,19 @@
+import { describe, expect, it } from 'vitest';
+import {
+  CODEX_ISOLATION_WARNING,
+  CODEX_ISOLATION_WARNING_FIX,
+  formatCodexIsolationWarning,
+} from '../../../src/context/llm/codex-isolation.js';
+
+describe('Codex isolation warning', () => {
+  it('documents the enforced and unenforced Codex isolation boundaries', () => {
+    expect(CODEX_ISOLATION_WARNING).toContain('runtime MCP server to the current ktx tool set');
+    expect(CODEX_ISOLATION_WARNING).toContain('disables Codex web search');
+    expect(CODEX_ISOLATION_WARNING).toContain('may still load user Codex config');
+    expect(CODEX_ISOLATION_WARNING).toContain('built-in command execution');
+    expect(CODEX_ISOLATION_WARNING_FIX).toContain('claude-code');
+    expect(formatCodexIsolationWarning()).toBe(
+      `${CODEX_ISOLATION_WARNING} ${CODEX_ISOLATION_WARNING_FIX}`,
+    );
+  });
+});
--- a/packages/cli/test/context/llm/codex-mcp-runtime-server.test.ts
+++ b/packages/cli/test/context/llm/codex-mcp-runtime-server.test.ts
@@ -0,0 +1,73 @@
+import { describe, expect, it, vi } from 'vitest';
+import { z } from 'zod';
+import {
+  createCodexRuntimeMcpServer,
+  startCodexRuntimeMcpServer,
+} from '../../../src/context/llm/codex-mcp-runtime-server.js';
+
+describe('Codex runtime MCP server', () => {
+  it('registers runtime tools with markdown output', async () => {
+    const registered = new Map<
+      string,
+      {
+        config: { description?: string; inputSchema: unknown };
+        handler: (input: Record<string, unknown>) => Promise<unknown>;
+      }
+    >();
+    const server = createCodexRuntimeMcpServer({
+      server: {
+        registerTool(name, config, handler) {
+          registered.set(name, { config, handler });
+        },
+      },
+      toolSet: {
+        wiki_search: {
+          name: 'wiki_search',
+          description: 'Search the wiki',
+          inputSchema: z.object({ query: z.string() }),
+          execute: vi.fn(async () => ({ markdown: 'result markdown', structured: { matches: 1 } })),
+        },
+      },
+    });
+
+    expect(server).toBeDefined();
+    expect([...registered.keys()]).toEqual(['wiki_search']);
+    expect(registered.get('wiki_search')?.config).toMatchObject({
+      description: 'Search the wiki',
+    });
+    await expect(registered.get('wiki_search')?.handler({ query: 'revenue' })).resolves.toEqual({
+      content: [{ type: 'text', text: 'result markdown' }],
+      structuredContent: { matches: 1 },
+    });
+  });
+
+  it('starts loopback HTTP MCP with a bearer token and reports the runtime URL', async () => {
+    const close = vi.fn(async () => undefined);
+    const runServer = vi.fn(async () => ({
+      server: { address: () => ({ port: 4321 }) },
+      close,
+    }));
+
+    const handle = await startCodexRuntimeMcpServer({
+      projectDir: '/tmp/ktx-project',
+      toolSet: {},
+      runServer: runServer as never,
+    });
+
+    expect(handle.url).toBe('http://127.0.0.1:4321/mcp');
+    expect(handle.bearerTokenEnvVar).toBe('KTX_CODEX_RUNTIME_MCP_TOKEN');
+    expect(handle.bearerToken).toMatch(/^[a-f0-9]{64}$/);
+    expect(runServer).toHaveBeenCalledWith(
+      expect.objectContaining({
+        projectDir: '/tmp/ktx-project',
+        host: '127.0.0.1',
+        port: 0,
+        token: handle.bearerToken,
+        allowedHosts: ['127.0.0.1', 'localhost'],
+        allowedOrigins: [],
+      }),
+    );
+    await handle.close();
+    expect(close).toHaveBeenCalled();
+  });
+});
--- a/packages/cli/test/context/llm/codex-models.test.ts
+++ b/packages/cli/test/context/llm/codex-models.test.ts
@@ -0,0 +1,17 @@
+import { describe, expect, it } from 'vitest';
+import { resolveCodexModel } from '../../../src/context/llm/codex-models.js';
+
+describe('resolveCodexModel', () => {
+  it.each([
+    ['codex', 'gpt-5.5'],
+    ['default', 'gpt-5.5'],
+    ['gpt-5.3-codex-spark', 'gpt-5.3-codex-spark'],
+    ['gpt-5.4', 'gpt-5.4'],
+  ])('maps %s to %s', (input, expected) => {
+    expect(resolveCodexModel(input)).toBe(expected);
+  });
+
+  it.each(['', '   ', 'sonnet', 'claude-sonnet-4-6'])('rejects %s', (input) => {
+    expect(() => resolveCodexModel(input)).toThrow('Unsupported Codex model');
+  });
+});
--- a/packages/cli/test/context/llm/codex-runtime-config.test.ts
+++ b/packages/cli/test/context/llm/codex-runtime-config.test.ts
@@ -0,0 +1,43 @@
+import { describe, expect, it } from 'vitest';
+import { buildCodexRuntimeConfig } from '../../../src/context/llm/codex-runtime-config.js';
+
+describe('buildCodexRuntimeConfig', () => {
+  it('builds generic config without SDK thread-option fields', () => {
+    expect(buildCodexRuntimeConfig({ model: 'gpt-5.3-codex' })).toEqual({
+      configOverrides: {
+        history: { persistence: 'none' },
+      },
+      env: {},
+    });
+  });
+
+  it('adds only the temporary ktx MCP server and exact enabled tools', () => {
+    expect(
+      buildCodexRuntimeConfig({
+        model: 'gpt-5.3-codex',
+        mcp: {
+          url: 'http://127.0.0.1:4567/mcp',
+          bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+          bearerToken: 'secret-token',
+          toolNames: ['sl_read_source', 'wiki_search'],
+        },
+      }),
+    ).toEqual({
+      configOverrides: {
+        history: { persistence: 'none' },
+        mcp_servers: {
+          ktx: {
+            url: 'http://127.0.0.1:4567/mcp',
+            bearer_token_env_var: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+            enabled_tools: ['sl_read_source', 'wiki_search'],
+            default_tools_approval_mode: 'approve',
+            required: true,
+          },
+        },
+      },
+      env: {
+        KTX_CODEX_RUNTIME_MCP_TOKEN: 'secret-token',
+      },
+    });
+  });
+});
--- a/packages/cli/test/context/llm/codex-runtime.test.ts
+++ b/packages/cli/test/context/llm/codex-runtime.test.ts
@@ -0,0 +1,460 @@
+import { describe, expect, it, vi } from 'vitest';
+import { z } from 'zod';
+import {
+  CodexKtxLlmRuntime,
+  runCodexAuthProbe,
+} from '../../../src/context/llm/codex-runtime.js';
+
+async function* events(items: unknown[]) {
+  for (const item of items) {
+    yield item;
+  }
+}
+
+function runner(items: unknown[]) {
+  return {
+    runStreamed: vi.fn(async () => events(items)),
+  };
+}
+
+/** Yields the given events, then throws — mirroring the SDK throwing on a non-zero codex exec exit. */
+function throwingRunner(items: unknown[], error: Error) {
+  return {
+    runStreamed: vi.fn(async () =>
+      (async function* () {
+        for (const item of items) {
+          yield item;
+        }
+        throw error;
+      })(),
+    ),
+  };
+}
+
+const MODEL_UNSUPPORTED_API_ERROR = JSON.stringify({
+  type: 'error',
+  status: 400,
+  error: {
+    type: 'invalid_request_error',
+    message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+  },
+});
+
+function budgetRunner() {
+  let observedSignal: AbortSignal | undefined;
+  return {
+    observedSignal: () => observedSignal,
+    runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
+      observedSignal = input.signal;
+      return events([
+        { type: 'turn.started' },
+        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'in_progress' } },
+        { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'completed' } },
+        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'in_progress' } },
+        { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'completed' } },
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+      ]);
+    }),
+  };
+}
+
+describe('CodexKtxLlmRuntime', () => {
+  it('generates text with the role-selected model and metrics', async () => {
+    const onMetrics = vi.fn();
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'hello' } },
+      { type: 'turn.completed', usage: { input_tokens: 3, output_tokens: 4, total_tokens: 7 } },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex', triage: 'gpt-5.4' },
+      runner: fakeRunner,
+    });
+
+    await expect(runtime.generateText({ role: 'triage', system: 'system', prompt: 'prompt', onMetrics })).resolves.toBe('hello');
+    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+      expect.objectContaining({
+        projectDir: '/tmp/project',
+        model: 'gpt-5.4',
+        prompt: 'system\n\nprompt',
+      }),
+    );
+    expect(onMetrics).toHaveBeenCalledWith(expect.objectContaining({ usage: { inputTokens: 3, outputTokens: 4, totalTokens: 7 } }));
+  });
+
+  it('generates and validates structured output', async () => {
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: '{"answer":"yes"}' } },
+      { type: 'turn.completed' },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    await expect(
+      runtime.generateObject({
+        role: 'default',
+        prompt: 'json',
+        schema: z.object({ answer: z.string() }),
+      }),
+    ).resolves.toEqual({ answer: 'yes' });
+    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+      expect.objectContaining({
+        outputSchema: expect.objectContaining({ type: 'object' }),
+      }),
+    );
+  });
+
+  it('returns a structured-output error when Codex final text is invalid JSON', async () => {
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'not json' } },
+      { type: 'turn.completed' },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    await expect(
+      runtime.generateObject({
+        role: 'default',
+        prompt: 'json',
+        schema: z.object({ answer: z.string() }),
+      }),
+    ).rejects.toThrow('Codex structured output failed validation');
+  });
+
+  it('starts and closes a temporary MCP server for tool-backed agent loops', async () => {
+    const close = vi.fn(async () => undefined);
+    const startMcpServer = vi.fn(async () => ({
+      url: 'http://127.0.0.1:4321/mcp',
+      bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN' as const,
+      bearerToken: 'token',
+      close,
+    }));
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.started', item: { type: 'mcp_tool_call', name: 'wiki_search' } },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'done' } },
+      { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, total_tokens: 2 } },
+    ]);
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+      startMcpServer,
+    });
+    const onStepFinish = vi.fn();
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 5,
+      telemetryTags: {},
+      onStepFinish,
+      toolSet: {
+        aliased_wiki_tool: {
+          name: 'wiki_search',
+          description: 'Search wiki',
+          inputSchema: z.object({ query: z.string() }),
+          execute: vi.fn(),
+        },
+      },
+    });
+
+    expect(result.stopReason).toBe('natural');
+    expect(result.metrics).toMatchObject({ stepCount: 1, usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 } });
+    expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 5 });
+    expect(startMcpServer).toHaveBeenCalledWith({ projectDir: '/tmp/project', toolSet: expect.any(Object) });
+    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+      expect.objectContaining({
+        env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'token' },
+        configOverrides: expect.objectContaining({
+          mcp_servers: expect.objectContaining({
+            ktx: expect.objectContaining({
+              url: 'http://127.0.0.1:4321/mcp',
+              enabled_tools: ['wiki_search'],
+              required: true,
+            }),
+          }),
+        }),
+      }),
+    );
+    expect(close).toHaveBeenCalled();
+  });
+
+  it('returns error stop reason on turn failure', async () => {
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: runner([{ type: 'turn.failed', error: { message: 'boom' } }]),
+    });
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 5,
+      telemetryTags: {},
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('error');
+    expect(result.error?.message).toBe('boom');
+  });
+
+  it('surfaces failed MCP tool calls as agent-loop errors', async () => {
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: runner([
+        { type: 'turn.started' },
+        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'search', status: 'in_progress' } },
+        {
+          type: 'item.completed',
+          item: {
+            type: 'mcp_tool_call',
+            server: 'ktx',
+            tool: 'search',
+            status: 'failed',
+            error: { message: 'denied' },
+          },
+        },
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+      ]),
+    });
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 5,
+      telemetryTags: {},
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('error');
+    expect(result.error?.message).toBe('Codex runtime tool call failed: search: denied');
+    expect(result.metrics).toMatchObject({
+      stepCount: 1,
+      usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
+    });
+  });
+
+  it('returns budget and aborts the Codex stream when local MCP step budget is reached', async () => {
+    const fakeRunner = budgetRunner();
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+    const onStepFinish = vi.fn();
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 1,
+      telemetryTags: {},
+      onStepFinish,
+      toolSet: {
+        first: {
+          name: 'first',
+          description: 'First tool',
+          inputSchema: z.object({}),
+          execute: vi.fn(),
+        },
+      },
+    });
+
+    expect(result.stopReason).toBe('budget');
+    expect(result.error).toBeUndefined();
+    expect(result.metrics).toMatchObject({ stepCount: 1 });
+    expect(onStepFinish).toHaveBeenCalledTimes(1);
+    expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 1 });
+    expect(fakeRunner.observedSignal()?.aborted).toBe(true);
+  });
+
+  it('counts built-in command_execution steps against the budget and aborts the stream', async () => {
+    let observedSignal: AbortSignal | undefined;
+    const fakeRunner = {
+      observedSignal: () => observedSignal,
+      runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
+        observedSignal = input.signal;
+        return events([
+          { type: 'turn.started' },
+          { type: 'item.started', item: { type: 'command_execution', command: 'ls', status: 'in_progress' } },
+          { type: 'item.completed', item: { type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
+          { type: 'item.started', item: { type: 'command_execution', command: 'cat a', status: 'in_progress' } },
+          { type: 'item.completed', item: { type: 'command_execution', command: 'cat a', status: 'completed', exit_code: 0 } },
+          { type: 'item.completed', item: { type: 'command_execution', command: 'cat b', status: 'completed', exit_code: 0 } },
+          { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+        ]);
+      }),
+    };
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+    const onStepFinish = vi.fn();
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 2,
+      telemetryTags: {},
+      onStepFinish,
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('budget');
+    expect(result.error).toBeUndefined();
+    expect(result.metrics).toMatchObject({ stepCount: 2 });
+    expect(onStepFinish).toHaveBeenCalledTimes(2);
+    expect(onStepFinish).toHaveBeenLastCalledWith({ stepIndex: 2, stepBudget: 2 });
+    expect(fakeRunner.observedSignal()?.aborted).toBe(true);
+  });
+
+  it('fires onStepFinish live as each step completes, before the stream drains', async () => {
+    const order: string[] = [];
+    async function* liveEvents() {
+      yield { type: 'turn.started' };
+      yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'a', status: 'completed' } };
+      order.push('yielded-after-step-1');
+      yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'b', status: 'completed' } };
+      order.push('yielded-after-step-2');
+      yield { type: 'item.completed', item: { type: 'agent_message', text: 'done' } };
+      yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } };
+    }
+    const fakeRunner = { runStreamed: vi.fn(async () => liveEvents()) };
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    const result = await runtime.runAgentLoop({
+      modelRole: 'default',
+      systemPrompt: 'system',
+      userPrompt: 'user',
+      stepBudget: 10,
+      telemetryTags: {},
+      onStepFinish: ({ stepIndex }) => {
+        order.push(`step-${stepIndex}`);
+      },
+      toolSet: {},
+    });
+
+    expect(result.stopReason).toBe('natural');
+    expect(result.metrics).toMatchObject({ stepCount: 2 });
+    expect(order).toEqual(['step-1', 'yielded-after-step-1', 'step-2', 'yielded-after-step-2']);
+  });
+
+  it('surfaces the real Codex error event even when the SDK stream throws afterward', async () => {
+    // The SDK yields the error/turn.failed events on stdout, then throws on the
+    // non-zero exit. The masked exit message must not hide the real API error.
+    const fakeRunner = throwingRunner(
+      [
+        { type: 'thread.started', thread_id: 't' },
+        { type: 'turn.started' },
+        { type: 'error', message: MODEL_UNSUPPORTED_API_ERROR },
+        { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
+      ],
+      new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
+    );
+    const runtime = new CodexKtxLlmRuntime({
+      projectDir: '/tmp/project',
+      modelSlots: { default: 'codex' },
+      runner: fakeRunner,
+    });
+
+    await expect(runtime.generateText({ role: 'default', prompt: 'hi' })).rejects.toThrow(
+      'not supported when using Codex with a ChatGPT account',
+    );
+  });
+
+  it('probes Codex authentication through a minimal non-interactive turn', async () => {
+    const fakeRunner = runner([
+      { type: 'turn.started' },
+      { type: 'item.completed', item: { type: 'agent_message', text: 'ok' } },
+      { type: 'turn.completed' },
+    ]);
+
+    await expect(
+      runCodexAuthProbe({
+        projectDir: '/tmp/project',
+        model: 'codex',
+        runner: fakeRunner,
+      }),
+    ).resolves.toEqual({ ok: true });
+  });
+
+  it('reports an unavailable model without blaming auth when Codex rejects the model', async () => {
+    const fakeRunner = throwingRunner(
+      [
+        { type: 'turn.started' },
+        { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
+      ],
+      new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
+    );
+
+    const result = await runCodexAuthProbe({
+      projectDir: '/tmp/project',
+      model: 'gpt-5.3-codex',
+      runner: fakeRunner,
+    });
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.message).not.toContain('authentication is not usable');
+      expect(result.message).toContain('not available');
+      expect(result.message).toContain('gpt-5.3-codex');
+      expect(result.message).toContain('not supported when using Codex with a ChatGPT account');
+      // A model-access failure must steer the user at the model config, not auth.
+      expect(result.fix).toContain('llm.models.default');
+      expect(result.fix).not.toContain('Authenticate Codex');
+    }
+  });
+
+  it('reports an auth failure when Codex exits without an error event', async () => {
+    const fakeRunner = throwingRunner(
+      [],
+      new Error('Codex Exec exited with code 1: Not logged in. Run `codex login`.'),
+    );
+
+    const result = await runCodexAuthProbe({
+      projectDir: '/tmp/project',
+      model: 'gpt-5.5',
+      runner: fakeRunner,
+    });
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.message).toContain('authentication is not usable');
+      expect(result.message).toContain('Not logged in');
+      expect(result.fix).toContain('Authenticate Codex');
+    }
+  });
+
+  it('rejects an unsupported model id before probing, steering at llm.models.default', async () => {
+    const result = await runCodexAuthProbe({
+      projectDir: '/tmp/project',
+      model: 'not-a-real-model',
+    });
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.message).toContain('Unsupported Codex model');
+      expect(result.fix).toContain('llm.models.default');
+    }
+  });
+});
--- a/packages/cli/test/context/llm/codex-sdk-runner.test.ts
+++ b/packages/cli/test/context/llm/codex-sdk-runner.test.ts
@@ -0,0 +1,97 @@
+import { describe, expect, it, vi } from 'vitest';
+
+const sdkMock = vi.hoisted(() => {
+  const events = (async function* () {
+    yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } };
+  })();
+  const runStreamed = vi.fn(async () => ({ events }));
+  const startThread = vi.fn(() => ({ runStreamed }));
+  const Codex = vi.fn(function Codex(this: { startThread: typeof startThread }, options?: unknown) {
+    Object.assign(this, { options, startThread });
+  });
+  return { Codex, startThread, runStreamed };
+});
+
+vi.mock('@openai/codex-sdk', () => ({ Codex: sdkMock.Codex }));
+
+import { CodexSdkCliRunner } from '../../../src/context/llm/codex-sdk-runner.js';
+
+async function collectAsync<T>(items: AsyncIterable<T>): Promise<T[]> {
+  const collected: T[] = [];
+  for await (const item of items) {
+    collected.push(item);
+  }
+  return collected;
+}
+
+describe('CodexSdkCliRunner', () => {
+  it('passes isolated env through the SDK and runtime controls through thread options', async () => {
+    const runner = new CodexSdkCliRunner({
+      envBase: {
+        HOME: '/home/ktx-user',
+        PATH: '/usr/local/bin:/usr/bin',
+        CODEX_HOME: '/home/ktx-user/.codex',
+        HTTPS_PROXY: 'http://proxy.example',
+        KTX_UNRELATED_SECRET: 'must-not-copy', // pragma: allowlist secret
+      },
+    });
+    const previousToken = process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
+    process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = 'outer-token';
+    const outputSchema = {
+      type: 'object',
+      properties: { answer: { type: 'string' } },
+      required: ['answer'],
+      additionalProperties: false,
+    };
+    const controller = new AbortController();
+
+    try {
+      const events = await runner.runStreamed({
+        projectDir: '/tmp/ktx-project',
+        model: 'gpt-5.3-codex',
+        prompt: 'Return JSON.',
+        configOverrides: {
+          history: { persistence: 'none' },
+        },
+        env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token' },
+        outputSchema,
+        signal: controller.signal,
+      });
+
+      expect(sdkMock.Codex).toHaveBeenCalledWith({
+        config: {
+          history: { persistence: 'none' },
+        },
+        env: {
+          HOME: '/home/ktx-user',
+          PATH: '/usr/local/bin:/usr/bin',
+          CODEX_HOME: '/home/ktx-user/.codex',
+          HTTPS_PROXY: 'http://proxy.example',
+          KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token',
+        },
+      });
+      expect(process.env.KTX_CODEX_RUNTIME_MCP_TOKEN).toBe('outer-token');
+      expect(sdkMock.startThread).toHaveBeenCalledWith({
+        workingDirectory: '/tmp/ktx-project',
+        skipGitRepoCheck: true,
+        model: 'gpt-5.3-codex',
+        sandboxMode: 'read-only',
+        webSearchMode: 'disabled',
+        approvalPolicy: 'never',
+      });
+      expect(sdkMock.runStreamed).toHaveBeenCalledWith('Return JSON.', {
+        outputSchema,
+        signal: controller.signal,
+      });
+      await expect(collectAsync(events)).resolves.toEqual([
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } },
+      ]);
+    } finally {
+      if (previousToken === undefined) {
+        delete process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
+      } else {
+        process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = previousToken;
+      }
+    }
+  });
+});
--- a/packages/cli/test/context/llm/runtime-local-config.test.ts
+++ b/packages/cli/test/context/llm/runtime-local-config.test.ts
@ -22,4 +22,25 @@ describe('local KTX LLM runtime config', () => {
      }),
    ).toBeNull();
  });
+
+  it('creates a Codex runtime for codex backend without creating an AI SDK provider', () => {
+    const runtime = createLocalKtxLlmRuntimeFromConfig(
+      {
+        provider: { backend: 'codex' },
+        models: { default: 'codex', triage: 'gpt-5.4' },
+      },
+      { env: {}, projectDir: '/tmp/project', createCodexRuntime: vi.fn((deps) => ({ deps }) as never) },
+    );
+
+    expect(runtime).toMatchObject({ deps: expect.objectContaining({ projectDir: '/tmp/project' }) });
+  });
+
+  it('returns null from the AI SDK provider factory for codex backend', () => {
+    expect(
+      createLocalKtxLlmProviderFromConfig({
+        provider: { backend: 'codex' },
+        models: { default: 'codex' },
+      }),
+    ).toBeNull();
+  });
 });
--- a/packages/cli/test/context/project/config.test.ts
+++ b/packages/cli/test/context/project/config.test.ts
@ -231,6 +231,31 @@ llm:
    });
  });

+  it('parses Codex as a first-class LLM backend', () => {
+    const config = parseKtxProjectConfig(`
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.3-codex
+    triage: gpt-5.3-codex
+    candidateExtraction: gpt-5.3-codex
+    curator: gpt-5.3-codex
+    reconcile: gpt-5.3-codex
+    repair: gpt-5.3-codex
+`);
+
+    expect(config.llm.provider.backend).toBe('codex');
+    expect(config.llm.models).toEqual({
+      default: 'gpt-5.3-codex',
+      triage: 'gpt-5.3-codex',
+      candidateExtraction: 'gpt-5.3-codex',
+      curator: 'gpt-5.3-codex',
+      reconcile: 'gpt-5.3-codex',
+      repair: 'gpt-5.3-codex',
+    });
+  });
+
  it('parses gateway LLM, OpenAI scan embeddings, and sentence-transformers ingest embeddings', () => {
    const config = parseKtxProjectConfig(`
 llm:
@ -530,7 +555,7 @@ describe('generateKtxProjectConfigJsonSchema', () => {
    const llm = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).llm;
    const provider = llm?.properties?.provider as { properties?: Record<string, unknown> };
    const backend = provider?.properties?.backend as { enum?: readonly string[] };
-    expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code']);
+    expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code', 'codex']);

    const storage = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).storage;
    const state = storage?.properties?.state as { enum?: readonly string[] };
--- a/packages/cli/test/doctor.test.ts
+++ b/packages/cli/test/doctor.test.ts
@ -422,6 +422,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        '',
      ].join('\n'),
      'utf-8',
@ -543,6 +545,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        'ingest:',
        '  adapters:',
        '    - live-database',
@ -652,6 +656,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        '',
      ].join('\n'),
      'utf-8',
@ -698,6 +704,8 @@ describe('runKtxDoctor', () => {
        'llm:',
        '  provider:',
        '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
        'ingest:',
        '  adapters:',
        '    - live-database',
--- a/packages/cli/test/ingest.test.ts
+++ b/packages/cli/test/ingest.test.ts
@ -337,10 +337,13 @@ describe('runKtxIngest', () => {

    expect(runIo.stdout()).toBe('');
    expect(runIo.stderr()).toContain(
-      'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
+      'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
    );
-    expect(runIo.stderr()).toContain('Configure a local Claude Code session or API-backed LLM, then rerun ingest:');
+    expect(runIo.stderr()).toContain('Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:');
    expect(runIo.stderr()).toContain(`ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`);
+    expect(runIo.stderr()).toContain(
+      `ktx setup --project-dir ${projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
+    );
    expect(runIo.stderr()).toContain(
      `ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
    );
--- a/packages/cli/test/llm/model-provider.test.ts
+++ b/packages/cli/test/llm/model-provider.test.ts
@ -312,4 +312,13 @@ describe('createKtxLlmProvider', () => {
      }),
    ).toThrow('claude-code is not an AI SDK LanguageModel backend');
  });
+
+  it('rejects codex as an AI SDK LanguageModel backend', () => {
+    expect(() =>
+      createKtxLlmProvider({
+        backend: 'codex',
+        modelSlots: { default: 'gpt-5.3-codex' },
+      }),
+    ).toThrow('codex is not an AI SDK LanguageModel backend');
+  });
 });
--- a/packages/cli/test/setup-models.test.ts
+++ b/packages/cli/test/setup-models.test.ts
@ -66,6 +66,7 @@ function makePromptAdapter(options: {
        nextProviderChoice === 'anthropic' ||
        nextProviderChoice === 'vertex' ||
        nextProviderChoice === 'claude-code' ||
+        nextProviderChoice === 'codex' ||
        nextProviderChoice === 'back'
      ) {
        return selectValues.shift() ?? nextProviderChoice;
@ -183,6 +184,7 @@ describe('setup Anthropic model step', () => {
        message: expect.stringContaining('Which LLM provider should KTX use?'),
        options: [
          { value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
+          { value: 'codex', label: 'Codex subscription' },
          { value: 'anthropic', label: 'Anthropic API key' },
          { value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
          { value: 'back', label: 'Back' },
@ -215,6 +217,85 @@ describe('setup Anthropic model step', () => {
    expect(authProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'sonnet' }));
  });

+  it('configures Codex backend and validates local auth', async () => {
+    const io = makeIo();
+    const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+    const result = await runKtxSetupAnthropicModelStep(
+      {
+        projectDir: tempDir,
+        inputMode: 'disabled',
+        llmBackend: 'codex',
+        llmModel: 'gpt-5.5',
+        skipLlm: false,
+      },
+      io.io,
+      { codexAuthProbe },
+    );
+
+    expect(result.status).toBe('ready');
+    const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
+    expect(config.llm).toMatchObject({
+      provider: { backend: 'codex' },
+      models: { default: 'gpt-5.5' },
+    });
+    expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
+    // The warning carries the clack gutter so it renders inside the setup frame.
+    expect(io.stderr()).toContain('│  Codex backend isolation is limited');
+    expect(io.stderr()).toContain('may still load user Codex config');
+  });
+
+  it('defaults the Codex model to gpt-5.5 when none is provided non-interactively', async () => {
+    const io = makeIo();
+    const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+    const result = await runKtxSetupAnthropicModelStep(
+      {
+        projectDir: tempDir,
+        inputMode: 'disabled',
+        llmBackend: 'codex',
+        skipLlm: false,
+      },
+      io.io,
+      { codexAuthProbe },
+    );
+
+    expect(result.status).toBe('ready');
+    const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
+    expect(config.llm).toMatchObject({
+      provider: { backend: 'codex' },
+      models: { default: 'gpt-5.5' },
+    });
+    expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
+  });
+
+  it('offers the curated Codex models during interactive setup', async () => {
+    const io = makeIo();
+    const prompts = makePromptAdapter({ selectValues: ['codex', 'gpt-5.5'] });
+    const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+    const result = await runKtxSetupAnthropicModelStep(
+      { projectDir: tempDir, inputMode: 'auto', skipLlm: false },
+      io.io,
+      { prompts, codexAuthProbe },
+    );
+
+    expect(result.status).toBe('ready');
+    expect(prompts.select).toHaveBeenCalledWith(
+      expect.objectContaining({
+        message: expect.stringContaining('Which Codex model should KTX use?'),
+        options: [
+          { value: 'gpt-5.5', label: 'GPT-5.5', hint: 'recommended' },
+          { value: 'gpt-5.4', label: 'GPT-5.4' },
+          { value: 'gpt-5.4-mini', label: 'GPT-5.4 mini' },
+          { value: 'manual', label: 'Enter a Codex model ID manually' },
+          { value: 'back', label: 'Back' },
+        ],
+      }),
+    );
+    expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ model: 'gpt-5.5' }));
+  });
+
  it('prompts for the Claude Code model during interactive setup', async () => {
    const io = makeIo();
    const prompts = makePromptAdapter({ selectValues: ['claude-code', 'opus'] });
--- a/packages/cli/test/status-project.test.ts
+++ b/packages/cli/test/status-project.test.ts
@ -44,6 +44,17 @@ function withClaudeCodeLlm(config: KtxProjectConfig): KtxProjectConfig {
  };
 }

+function withCodexLlm(config: KtxProjectConfig): KtxProjectConfig {
+  return {
+    ...config,
+    llm: {
+      ...config.llm,
+      provider: { backend: 'codex' },
+      models: { ...config.llm.models, default: 'gpt-5.5' },
+    },
+  };
+}
+
 function baseProjectConfig(): KtxProjectConfig {
  return withClaudeCodeLlm(buildDefaultKtxProjectConfig());
 }
@ -391,6 +402,126 @@ describe('buildProjectStatus --fast', () => {
  });
 });

+describe('buildProjectStatus codex', () => {
+  it('reports authenticated local Codex session', async () => {
+    const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+    const status = await buildProjectStatus(project, {
+      codexAuthProbe: async () => ({ ok: true as const }),
+    });
+
+    expect(status.llm).toMatchObject({
+      backend: 'codex',
+      model: 'gpt-5.5',
+      status: 'ok',
+      detail: 'local Codex session authenticated',
+    });
+    expect(status.warnings).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          message: expect.stringContaining('Codex backend isolation is limited'),
+          fix: expect.stringContaining('claude-code'),
+        }),
+      ]),
+    );
+    const rendered = renderProjectStatus(status, { verbose: false, useColor: false });
+    expect(rendered).toContain('Codex backend isolation is limited');
+  });
+
+  it('skips Codex auth probe with --fast', async () => {
+    let probeCalls = 0;
+    const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+    const status = await buildProjectStatus(project, {
+      fast: true,
+      codexAuthProbe: async () => {
+        probeCalls += 1;
+        return { ok: true };
+      },
+    });
+
+    expect(probeCalls).toBe(0);
+    expect(status.llm.status).toBe('skipped');
+    expect(status.llm.detail).toMatch(/--fast/);
+  });
+
+  it('surfaces the probe fix for a model-access failure instead of an auth fix', async () => {
+    const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+    const status = await buildProjectStatus(project, {
+      codexAuthProbe: async () => ({
+        ok: false,
+        message: 'Codex is authenticated, but the configured model "gpt-5.5" is not available...',
+        fix: 'Run `codex` to see the models your account supports, then set llm.models.default in ktx.yaml (or rerun `ktx setup`).',
+      }),
+    });
+
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.fix).toContain('llm.models.default');
+    expect(status.llm.fix).not.toContain('Authenticate Codex');
+  });
+});
+
+describe('buildProjectStatus llm models.default requirement', () => {
+  function withBackendNoModel(
+    backend: KtxProjectConfig['llm']['provider']['backend'],
+  ): KtxProjectConfig {
+    const config = buildDefaultKtxProjectConfig();
+    return {
+      ...config,
+      llm: { ...config.llm, provider: { backend }, models: {} },
+    };
+  }
+
+  it('fails codex without llm.models.default and never probes', async () => {
+    let probeCalls = 0;
+    const project = projectWithConfig(withBackendNoModel('codex'));
+    const status = await buildProjectStatus(project, {
+      codexAuthProbe: async () => {
+        probeCalls += 1;
+        return { ok: true };
+      },
+    });
+
+    expect(probeCalls).toBe(0);
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.detail).toContain('llm.models.default');
+    expect(status.verdict).toBe('blocked');
+  });
+
+  it('fails claude-code without llm.models.default and never probes', async () => {
+    let probeCalls = 0;
+    const project = projectWithConfig(withBackendNoModel('claude-code'));
+    const status = await buildProjectStatus(project, {
+      claudeCodeAuthProbe: async () => {
+        probeCalls += 1;
+        return { ok: true };
+      },
+    });
+
+    expect(probeCalls).toBe(0);
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.detail).toContain('llm.models.default');
+    expect(status.verdict).toBe('blocked');
+  });
+
+  it('fails anthropic without llm.models.default even when the key is set', async () => {
+    const config = withBackendNoModel('anthropic');
+    const project = projectWithConfig({
+      ...config,
+      llm: {
+        ...config.llm,
+        provider: { backend: 'anthropic', anthropic: { api_key: 'env:ANTHROPIC_API_KEY' } }, // pragma: allowlist secret
+        models: {},
+      },
+    });
+    const status = await buildProjectStatus(project, {
+      env: { ANTHROPIC_API_KEY: 'sk-test' }, // pragma: allowlist secret
+    });
+
+    expect(status.llm.status).toBe('fail');
+    expect(status.llm.detail).toContain('llm.models.default');
+    expect(status.verdict).toBe('blocked');
+  });
+});
+
 describe('buildLocalStatsStatus', () => {
  let tempDir: string;

--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@ -158,6 +158,9 @@ importers:
      '@notionhq/client':
        specifier: ^5.22.0
        version: 5.22.0
+      '@openai/codex-sdk':
+        specifier: ^0.133.0
+        version: 0.133.0
      ai:
        specifier: ^6.0.188
        version: 6.0.188(zod@4.4.3)
@ -1288,6 +1291,51 @@ packages:
  '@octokit/types@16.0.0':
    resolution: {integrity: sha512-sKq+9r1Mm4efXW1FCk7hFSeJo4QKreL/tTbR0rz/qx/r1Oa2VV83LTA/H/MuCOX7uCIJmQVRKBcbmWoySjAnSg==}

+  '@openai/codex-sdk@0.133.0':
+    resolution: {integrity: sha512-PB82D/1Q0C7nzaV5O+1O4y5LcVwiUvxyHvCUTfz8Cwztv6bOWQ40gFHE5ZFX1EFPJx1cMV0GPVODWuXIKAuayQ==}
+    engines: {node: '>=18'}
+
+  '@openai/codex@0.133.0':
+    resolution: {integrity: sha512-Gh42kLLBo/6gpnHmDzUWDVvyS57ekCB1+1Dz0RG2oIl3Lhk1uwrjSj/PwaJWWh4Rw/rUp1RqkwrMugFfFEOlqQ==}
+    engines: {node: '>=16'}
+    hasBin: true
+
+  '@openai/codex@0.133.0-darwin-arm64':
+    resolution: {integrity: sha512-W7f8+DckLujnqGlptKCzgJU+ooeHKMuk6KYgMFP6A9asn7YUsGUgJqjiBaX8oNcXO6w/pTbKGRARx1kCNS8lIg==}
+    engines: {node: '>=16'}
+    cpu: [arm64]
+    os: [darwin]
+
+  '@openai/codex@0.133.0-darwin-x64':
+    resolution: {integrity: sha512-Ek8ikvLOiXZ8emcIJVBXxK6fm8ratBy0kaEt3JNisTNszxGshUHf/R4xxDxIyKNcUkYYXjW7A/rMwW3iu3OFlg==}
+    engines: {node: '>=16'}
+    cpu: [x64]
+    os: [darwin]
+
+  '@openai/codex@0.133.0-linux-arm64':
+    resolution: {integrity: sha512-uKXYYSJ3mY16sp4hcG/4BMNRjva/ZS4oARiI1+7k8+NiuoAhdCGWNe5u4KJ3sMuL3tp/IXcmc6B56EFX1+WDBQ==}
+    engines: {node: '>=16'}
+    cpu: [arm64]
+    os: [linux]
+
+  '@openai/codex@0.133.0-linux-x64':
+    resolution: {integrity: sha512-9YfyqrfUj/UZ2+aXE4zBz47t6RXbVni95ZorGsNh857vxYK/asVpUtR2cymo9lB3JaI4mQaKFfV/t7IRItqkuA==}
+    engines: {node: '>=16'}
+    cpu: [x64]
+    os: [linux]
+
+  '@openai/codex@0.133.0-win32-arm64':
+    resolution: {integrity: sha512-mRzND0PSGHRoLk0X41GTSoc3tFjZSF4HgDlfjU5fiQcWVi0/kLb7Ku6/tPFT/X2hOLa3YdJkbIcHC0Hc9ni80g==}
+    engines: {node: '>=16'}
+    cpu: [arm64]
+    os: [win32]
+
+  '@openai/codex@0.133.0-win32-x64':
+    resolution: {integrity: sha512-u3ji78DIPZCGJeELuovsAnaZH+vK9gsA4F6M1y+Uy2s80Sz7/i1S0KL81qGReYji3urSjgBpkQuNP47GXOqxrQ==}
+    engines: {node: '>=16'}
+    cpu: [x64]
+    os: [win32]
+
  '@opentelemetry/api@1.9.1':
    resolution: {integrity: sha512-gLyJlPHPZYdAk1JENA9LeHejZe1Ti77/pTeFm/nMXmQH/HFZlcS/O2XJB+L8fkbrNSqhdtlvjBVjxwUYanNH5Q==}
    engines: {node: '>=8.0.0'}
@ -7145,6 +7193,37 @@ snapshots:
    dependencies:
      '@octokit/openapi-types': 27.0.0

+  '@openai/codex-sdk@0.133.0':
+    dependencies:
+      '@openai/codex': 0.133.0
+
+  '@openai/codex@0.133.0':
+    optionalDependencies:
+      '@openai/codex-darwin-arm64': '@openai/codex@0.133.0-darwin-arm64'
+      '@openai/codex-darwin-x64': '@openai/codex@0.133.0-darwin-x64'
+      '@openai/codex-linux-arm64': '@openai/codex@0.133.0-linux-arm64'
+      '@openai/codex-linux-x64': '@openai/codex@0.133.0-linux-x64'
+      '@openai/codex-win32-arm64': '@openai/codex@0.133.0-win32-arm64'
+      '@openai/codex-win32-x64': '@openai/codex@0.133.0-win32-x64'
+
+  '@openai/codex@0.133.0-darwin-arm64':
+    optional: true
+
+  '@openai/codex@0.133.0-darwin-x64':
+    optional: true
+
+  '@openai/codex@0.133.0-linux-arm64':
+    optional: true
+
+  '@openai/codex@0.133.0-linux-x64':
+    optional: true
+
+  '@openai/codex@0.133.0-win32-arm64':
+    optional: true
+
+  '@openai/codex@0.133.0-win32-x64':
+    optional: true
+
  '@opentelemetry/api@1.9.1': {}

  '@orama/orama@3.1.18': {}
--- a/scripts/codex-backend-live-smoke.mjs
+++ b/scripts/codex-backend-live-smoke.mjs
@ -0,0 +1,160 @@
+import { execFile } from 'node:child_process';
+import { mkdtemp, rm } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import { dirname, join, resolve } from 'node:path';
+import { fileURLToPath, pathToFileURL } from 'node:url';
+import { promisify } from 'node:util';
+
+const execFileAsync = promisify(execFile);
+const SCRIPT_DIR = dirname(fileURLToPath(import.meta.url));
+const ROOT_DIR = resolve(SCRIPT_DIR, '..');
+const OPT_IN_MESSAGE =
+  'Set KTX_RUN_CODEX_BACKEND_SMOKE=1 or pass --force to run the Codex backend live smoke.';
+
+export function codexBackendSmokeOptIn(env = process.env, args = process.argv.slice(2)) {
+  if (env.KTX_RUN_CODEX_BACKEND_SMOKE === '1' || args.includes('--force')) {
+    return { run: true };
+  }
+  return { run: false, message: OPT_IN_MESSAGE };
+}
+
+async function run(command, args, options = {}) {
+  process.stdout.write(`$ ${command} ${args.join(' ')}\n`);
+  try {
+    const result = await execFileAsync(command, args, {
+      cwd: options.cwd ?? ROOT_DIR,
+      env: { ...process.env, ...(options.env ?? {}) },
+      encoding: 'utf8',
+      maxBuffer: 1024 * 1024 * 20,
+      timeout: options.timeoutMs ?? 300_000,
+    });
+    if (result.stdout) {
+      process.stdout.write(result.stdout);
+    }
+    if (result.stderr) {
+      process.stderr.write(result.stderr);
+    }
+    return { code: 0, stdout: result.stdout, stderr: result.stderr };
+  } catch (error) {
+    const stdout = typeof error.stdout === 'string' ? error.stdout : '';
+    const stderr = typeof error.stderr === 'string' && error.stderr !== '' ? error.stderr : error.message;
+    if (stdout) {
+      process.stdout.write(stdout);
+    }
+    if (stderr) {
+      process.stderr.write(stderr);
+    }
+    return {
+      code: typeof error.code === 'number' ? error.code : 1,
+      stdout,
+      stderr,
+    };
+  }
+}
+
+function requireSuccess(label, result) {
+  if (result.code !== 0) {
+    throw new Error(`${label} failed with code ${result.code}\nstdout:\n${result.stdout}\nstderr:\n${result.stderr}`);
+  }
+}
+
+async function runSetupSmoke(projectDir) {
+  const result = await run(
+    'node',
+    [
+      join(ROOT_DIR, 'packages/cli/dist/bin.js'),
+      'setup',
+      '--project-dir',
+      projectDir,
+      '--llm-backend',
+      'codex',
+      '--llm-model',
+      'gpt-5.3-codex',
+      '--no-input',
+      '--yes',
+      '--skip-databases',
+      '--skip-sources',
+      '--skip-agents',
+    ],
+    { timeoutMs: 600_000 },
+  );
+  requireSuccess('ktx setup codex backend', result);
+  if (!result.stdout.includes('LLM ready: yes (codex, gpt-5.3-codex)')) {
+    throw new Error(`setup did not report Codex LLM readiness\nstdout:\n${result.stdout}`);
+  }
+}
+
+async function runRuntimeSmoke(projectDir) {
+  const runtimeUrl = pathToFileURL(join(ROOT_DIR, 'packages/cli/dist/context/llm/codex-runtime.js')).href;
+  const zodUrl = pathToFileURL(join(ROOT_DIR, 'packages/cli/node_modules/zod/index.js')).href;
+  const { CodexKtxLlmRuntime } = await import(runtimeUrl);
+  const { z } = await import(zodUrl);
+  const runtime = new CodexKtxLlmRuntime({
+    projectDir,
+    modelSlots: { default: 'gpt-5.3-codex' },
+  });
+
+  const text = await runtime.generateText({
+    role: 'default',
+    prompt: 'Reply with exactly: ktx_codex_text_ok',
+  });
+  if (text.trim() !== 'ktx_codex_text_ok') {
+    throw new Error(`Codex text smoke returned unexpected text: ${text}`);
+  }
+
+  let toolCalls = 0;
+  const loop = await runtime.runAgentLoop({
+    modelRole: 'default',
+    systemPrompt: 'You must use available tools when the user asks for a tool result.',
+    userPrompt:
+      'Call the echo_value tool with {"value":"ktx_codex_tool_ok"}, then finish after the tool returns.',
+    toolSet: {
+      echo_value: {
+        name: 'echo_value',
+        description: 'Return the provided value as markdown.',
+        inputSchema: z.object({ value: z.string() }),
+        execute: async (input) => {
+          toolCalls += 1;
+          return { markdown: `echo:${input.value}` };
+        },
+      },
+    },
+    stepBudget: 4,
+    telemetryTags: {},
+  });
+
+  if (loop.stopReason !== 'natural') {
+    throw new Error(`Codex tool smoke stopped with ${loop.stopReason}: ${loop.error?.message ?? 'no error'}`);
+  }
+  if (toolCalls !== 1) {
+    throw new Error(`Expected Codex to call echo_value exactly once, got ${toolCalls}`);
+  }
+}
+
+export async function runCodexBackendLiveSmoke() {
+  const projectDir = await mkdtemp(join(tmpdir(), 'ktx-codex-backend-smoke-'));
+  try {
+    requireSuccess(
+      'ktx build',
+      await run('pnpm', ['--filter', '@kaelio/ktx', 'run', 'build'], { timeoutMs: 600_000 }),
+    );
+    await runSetupSmoke(projectDir);
+    await runRuntimeSmoke(projectDir);
+    process.stdout.write(`Codex backend live smoke passed in ${projectDir}\n`);
+  } finally {
+    await rm(projectDir, { recursive: true, force: true });
+  }
+}
+
+async function main() {
+  const optIn = codexBackendSmokeOptIn();
+  if (!optIn.run) {
+    process.stdout.write(`${optIn.message}\n`);
+    return;
+  }
+  await runCodexBackendLiveSmoke();
+}
+
+if (import.meta.url === pathToFileURL(process.argv[1] ?? '').href) {
+  await main();
+}
--- a/scripts/codex-backend-live-smoke.test.mjs
+++ b/scripts/codex-backend-live-smoke.test.mjs
@ -0,0 +1,18 @@
+import assert from 'node:assert/strict';
+import test from 'node:test';
+import { codexBackendSmokeOptIn } from './codex-backend-live-smoke.mjs';
+
+test('codex backend smoke stays disabled by default', () => {
+  assert.deepEqual(codexBackendSmokeOptIn({}, []), {
+    run: false,
+    message: 'Set KTX_RUN_CODEX_BACKEND_SMOKE=1 or pass --force to run the Codex backend live smoke.',
+  });
+});
+
+test('codex backend smoke runs with env opt-in', () => {
+  assert.deepEqual(codexBackendSmokeOptIn({ KTX_RUN_CODEX_BACKEND_SMOKE: '1' }, []), { run: true });
+});
+
+test('codex backend smoke runs with force flag', () => {
+  assert.deepEqual(codexBackendSmokeOptIn({}, ['--force']), { run: true });
+});