diff --git a/README.md b/README.md
index d44905d5..2c433e0d 100644
--- a/README.md
+++ b/README.md
@@ -30,8 +30,9 @@ warehouse accurately - from approved metric definitions, joinable columns, and
business knowledge it builds and maintains for you.
> [!NOTE]
-> Run **ktx** with your own LLM API keys or a **Claude Pro/Max** subscription.
-> No extra usage billing from **ktx**.
+> Run **ktx** with your own LLM API keys or a local agent sign-in: a
+> **Claude Pro/Max** subscription through Claude Code, or your local Codex
+> authentication. **ktx** adds no usage billing of its own.
@@ -175,8 +176,9 @@ then the current directory. Pass `--project-dir ` when scripting.
No. **ktx** runs locally. The only data leaving your machine is what you
send to the LLM provider you configured.
- **Which LLM backends are supported?**
- Anthropic API, Google Vertex AI, AI Gateway, and the local Claude Code
- session through the Claude Agent SDK. See
+ Anthropic API, Google Vertex AI, AI Gateway, the local Claude Code session
+ through the Claude Agent SDK, and your local Codex authentication through the
+ Codex SDK. See
[LLM configuration](https://docs.kaelio.com/ktx/docs/guides/llm-configuration).
- **How is ktx different from a dbt or MetricFlow semantic layer?**
**ktx** *ingests* those layers and combines them with raw-table
diff --git a/docs-site/content/docs/cli-reference/ktx-setup.mdx b/docs-site/content/docs/cli-reference/ktx-setup.mdx
index 0da7b339..24469a63 100644
--- a/docs-site/content/docs/cli-reference/ktx-setup.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-setup.mdx
@@ -51,8 +51,9 @@ prompts.
| Flag | Description |
|------|-------------|
-| `--llm-backend ` | LLM backend: `anthropic`, `vertex`, or `claude-code` |
+| `--llm-backend ` | LLM backend: `anthropic`, `vertex`, `claude-code`, or `codex` |
| `--llm-backend claude-code` | Use the local Claude Code session for **ktx** LLM calls |
+| `--llm-backend codex` | Use local Codex authentication for **ktx** LLM calls |
| `--llm-model ` | LLM model ID or backend model alias to validate and save |
| `--anthropic-api-key-env ` | Environment variable containing the Anthropic API key |
| `--anthropic-api-key-file ` | File containing the Anthropic API key |
@@ -62,9 +63,14 @@ prompts.
Choose only one Anthropic credential source. Anthropic credential flags are only
valid with the Anthropic backend; Vertex flags are only valid with the Vertex
-backend. The `claude-code` backend uses local Claude Code authentication instead
+backend. The `claude-code` and `codex` backends use local authentication instead
of Anthropic API key or Vertex flags. For Claude Code, `--llm-model` accepts
-`sonnet`, `opus`, `haiku`, or a full Claude model ID.
+`sonnet`, `opus`, `haiku`, or a full Claude model ID. For Codex, `--llm-model`
+accepts `codex`, `default`, or a `gpt-*` / `codex-*` model ID such as
+`gpt-5.5`; any other value is rejected before the auth probe. Run `codex` to
+see the models available to your login, and pick a `gpt-*` / `codex-*` ID from
+that list. `*-codex` API-billing model IDs (for example `gpt-5.3-codex`) are
+not available to ChatGPT-subscription logins.
### Embeddings
@@ -191,6 +197,17 @@ ktx setup \
--llm-backend claude-code \
--llm-model opus
+# Configure **ktx** to use local Codex authentication for LLM work
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+When you choose `--llm-backend codex`, setup prints a warning if the public
+Codex SDK and CLI surface cannot prove full Claude-Code-style isolation. The
+backend restricts **ktx** runtime MCP tools to each run, but Codex may still
+load user Codex config and built-in command execution or read-only file
+capabilities.
+
+```bash
# Script a Postgres connection that reads its URL from the environment
ktx setup \
--project-dir ./analytics \
diff --git a/docs-site/content/docs/cli-reference/ktx-status.mdx b/docs-site/content/docs/cli-reference/ktx-status.mdx
index 51c00148..66e4964c 100644
--- a/docs-site/content/docs/cli-reference/ktx-status.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-status.mdx
@@ -21,7 +21,7 @@ ktx status [options]
| `--json` | Print JSON output | `false` |
| `-v`, `--verbose` | Show every check, including passing ones | `false` |
| `--validate` | Only validate the `ktx.yaml` schema; skip readiness checks | `false` |
-| `--fast` | Skip checks that require external communication (query-history readiness probes and Claude Code auth probe) | `false` |
+| `--fast` | Skip checks that require external communication (query-history readiness probes, Claude Code auth probe, and Codex auth probe) | `false` |
| `--no-input` | Disable interactive terminal input | - |
## Examples
@@ -39,7 +39,7 @@ ktx status --verbose
# Validate ktx.yaml without running readiness checks
ktx status --validate
-# Skip slow probes (query-history readiness, Claude Code auth)
+# Skip slow probes (query-history readiness, Claude Code auth, Codex auth)
ktx status --fast
# Check a project from another directory
@@ -57,6 +57,16 @@ flow, then rerun `ktx status`. Use `--fast` to skip this probe (useful in CI
or offline contexts); skipped checks render as `-` and carry
`"status": "skipped"` in JSON output.
+For `llm.provider.backend: codex`, `ktx status` runs a minimal non-interactive
+Codex request. If the probe fails, verify that the Codex CLI is installed,
+authenticate with it locally, then rerun `ktx status`.
+
+When `llm.provider.backend: codex` is configured, `ktx status` also prints a
+warning when the installed public Codex SDK and CLI surface cannot prove full
+Claude-Code-style isolation. The warning does not block authenticated Codex
+usage, but it marks the project status as partial so you can make an explicit
+runtime-isolation decision.
+
A `Local data` section summarises what the project has accumulated locally:
ingest run counts, last completed timestamp per connection, knowledge page
counts by scope, semantic-layer source and dictionary value counts, and the
diff --git a/docs-site/content/docs/configuration/ktx-yaml.mdx b/docs-site/content/docs/configuration/ktx-yaml.mdx
index 13105851..a9298443 100644
--- a/docs-site/content/docs/configuration/ktx-yaml.mdx
+++ b/docs-site/content/docs/configuration/ktx-yaml.mdx
@@ -376,13 +376,23 @@ llm:
| Field | Type | Default | Purpose |
|-------|------|---------|---------|
-| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. |
+| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` \| `codex` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. `codex` uses local Codex authentication and needs no API key. |
| `provider.anthropic.api_key` | `string` | - | Anthropic API key. Required when `backend: anthropic`. Accepts `env:` or `file:` references. |
| `provider.anthropic.base_url` | `string` | - | Override the Anthropic API base URL (proxy, self-hosted gateway). |
| `provider.gateway.api_key` / `base_url` | `string` | - | Credentials for an AI Gateway provider. Required when `backend: gateway`. |
| `provider.vertex.project` | `string` | - | Google Cloud project ID hosting the Vertex AI endpoint. |
| `provider.vertex.location` | `string` | - | Vertex AI region (for example `us-east5`). Required when the `vertex` block is present. |
+Use `codex` when local Codex authentication should power **ktx** LLM work:
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
### Model roles
`models` overrides the per-role model. Keys are fixed; values are
diff --git a/docs-site/content/docs/guides/building-context.mdx b/docs-site/content/docs/guides/building-context.mdx
index b806c424..52179e70 100644
--- a/docs-site/content/docs/guides/building-context.mdx
+++ b/docs-site/content/docs/guides/building-context.mdx
@@ -39,8 +39,20 @@ ktx ingest --all
Enriched ingest needs a configured model and embeddings. Run `ktx setup` first;
connections without that configuration fail before any work starts.
-With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools for the
-current run.
+Local-auth backends keep provider credentials out of `ktx.yaml`:
+
+```bash
+ktx setup --llm-backend claude-code --no-input
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools
+for the current run. With `codex`, **ktx** restricts the temporary runtime MCP
+server to the current run's tool set, disables Codex web search, requests a
+read-only sandbox, and sets `approval_policy=never`. The public Codex SDK and
+CLI surface may still load user Codex config and built-in command execution or
+read-only file capabilities, so use `claude-code` for stricter runtime tool
+isolation.
## Query history
diff --git a/docs-site/content/docs/guides/llm-configuration.mdx b/docs-site/content/docs/guides/llm-configuration.mdx
index 880df24e..71ab9d80 100644
--- a/docs-site/content/docs/guides/llm-configuration.mdx
+++ b/docs-site/content/docs/guides/llm-configuration.mdx
@@ -16,6 +16,7 @@ Set `llm.provider.backend` to one of these values:
- `gateway`: Use AI Gateway-compatible Anthropic model ids.
- `claude-code`: Use your local Claude Code session through the Claude Agent
SDK. **ktx** strips provider-routing environment variables from child processes.
+- `codex`: Use your local Codex authentication through the Codex SDK.
## Claude Code
@@ -47,6 +48,42 @@ model IDs are also accepted.
metadata may still list host slash commands, skills, and subagents; **ktx** does not
grant execution access to them.
+## Codex backend
+
+Use `codex` when you want **ktx** to run LLM-backed workflows through your
+local Codex authentication instead of a direct provider API key.
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
+Configure it non-interactively:
+
+```bash
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+This is separate from Codex agent-client setup. `ktx setup --agents --target
+codex` installs instructions and MCP access for an end-user Codex session.
+`ktx setup --llm-backend codex` makes **ktx** itself execute ingest, scan
+enrichment, memory, and other LLM-backed work through Codex.
+
+During runtime loops, **ktx** starts a temporary loopback MCP server for the
+current run, exposes only the tools passed to that run, asks Codex to use a
+read-only sandbox, sets `approval_policy=never`, auto-approves only those
+run-scoped MCP tools, and disables Codex web search.
+
+Codex backend isolation is currently limited by the public Codex SDK and CLI
+surface. Codex may still load user Codex config and built-in command execution
+or read-only file capabilities. Use `llm.provider.backend: claude-code` when
+you need stricter Claude-Code-style runtime tool isolation, or remove host
+Codex MCP and tool config before running untrusted prompts through the `codex`
+backend.
+
## Prompt caching
`llm.promptCaching` has partial parity on `claude-code`. Status and doctor warn
diff --git a/knip.json b/knip.json
index 270c2310..65b1a0a2 100644
--- a/knip.json
+++ b/knip.json
@@ -37,6 +37,9 @@
"@semantic-release/release-notes-generator",
"conventional-changelog-conventionalcommits"
],
+ "ignore": [
+ ".context/**"
+ ],
"ignoreBinaries": [
"uv",
"lsof"
diff --git a/package.json b/package.json
index fee7b745..a9590d70 100644
--- a/package.json
+++ b/package.json
@@ -32,6 +32,7 @@
"setup:dev": "node scripts/setup-dev.mjs",
"release:published-smoke": "node scripts/published-package-smoke.mjs --require-config",
"release:local-embeddings-smoke": "node scripts/local-embeddings-runtime-smoke.mjs --require-opt-in",
+ "release:codex-backend-smoke": "node scripts/codex-backend-live-smoke.mjs",
"release:readiness": "node scripts/release-readiness.mjs",
"release:update-version": "node scripts/update-public-release-version.mjs",
"relationships:acquire-public-fixtures": "node scripts/acquire-public-benchmark-fixtures.mjs",
diff --git a/packages/cli/package.json b/packages/cli/package.json
index b04fceac..9d3af54c 100644
--- a/packages/cli/package.json
+++ b/packages/cli/package.json
@@ -56,6 +56,7 @@
"@looker/sdk-rtl": "^21.6.5",
"@modelcontextprotocol/sdk": "^1.29.0",
"@notionhq/client": "^5.22.0",
+ "@openai/codex-sdk": "^0.133.0",
"ai": "^6.0.188",
"better-sqlite3": "^12.10.0",
"commander": "14.0.3",
diff --git a/packages/cli/src/commands/setup-commands.ts b/packages/cli/src/commands/setup-commands.ts
index 19f980bd..1619a80a 100644
--- a/packages/cli/src/commands/setup-commands.ts
+++ b/packages/cli/src/commands/setup-commands.ts
@@ -29,7 +29,7 @@ function embeddingBackend(value: string): 'openai' | 'sentence-transformers' {
}
function llmBackend(value: string): KtxSetupLlmBackend {
- if (value === 'anthropic' || value === 'vertex' || value === 'claude-code') {
+ if (value === 'anthropic' || value === 'vertex' || value === 'claude-code' || value === 'codex') {
return value;
}
throw new InvalidArgumentError(`invalid choice '${value}'`);
diff --git a/packages/cli/src/context/ingest/local-bundle-runtime.ts b/packages/cli/src/context/ingest/local-bundle-runtime.ts
index 77f4234e..9d6aba95 100644
--- a/packages/cli/src/context/ingest/local-bundle-runtime.ts
+++ b/packages/cli/src/context/ingest/local-bundle-runtime.ts
@@ -611,9 +611,10 @@ function nextLocalJobId(): string {
function localIngestLlmProviderGuardMessage(projectDir: string): string {
return [
- 'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
- 'Configure a local Claude Code session or API-backed LLM, then rerun ingest:',
+ 'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
+ 'Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:',
` ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`,
+ ` ktx setup --project-dir ${projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
` ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
].join('\n');
}
diff --git a/packages/cli/src/context/llm/codex-exec-events.ts b/packages/cli/src/context/llm/codex-exec-events.ts
new file mode 100644
index 00000000..86e13694
--- /dev/null
+++ b/packages/cli/src/context/llm/codex-exec-events.ts
@@ -0,0 +1,194 @@
+import type { LlmTokenUsage, RunLoopStopReason } from './runtime-port.js';
+
+export interface CodexExecEventSummary {
+ finalText: string;
+ stopReason: RunLoopStopReason;
+ usage: LlmTokenUsage;
+ stepCount: number;
+ stepBoundariesMs: number[];
+ toolCallCount: number;
+ toolFailures: string[];
+ error?: Error;
+}
+
+interface CodexEventParseOptions {
+ startedAt?: number;
+ now?: () => number;
+}
+
+function record(value: unknown): Record<string, unknown> | undefined {
+ return value && typeof value === 'object' ? (value as Record<string, unknown>) : undefined;
+}
+
+/**
+ * Codex thread items that represent a discrete agent action consuming one loop
+ * step. The step budget caps the total number of these regardless of which
+ * capability the agent reaches for, so built-in `command_execution` (and any
+ * file/web action the public Codex surface still exposes) count alongside our
+ * own `mcp_tool_call` items rather than only the MCP ones.
+ */
+const AGENT_STEP_ITEM_TYPES = new Set(['command_execution', 'mcp_tool_call', 'file_change', 'web_search']);
+
+export function isCompletedAgentStep(event: unknown): boolean {
+ const eventRecord = record(event);
+ if (eventRecord?.type !== 'item.completed') {
+ return false;
+ }
+ const itemType = record(eventRecord.item)?.type;
+ return typeof itemType === 'string' && AGENT_STEP_ITEM_TYPES.has(itemType);
+}
+
+function text(value: unknown): string | undefined {
+ return typeof value === 'string' && value.trim().length > 0 ? value : undefined;
+}
+
+function numberValue(value: unknown): number | undefined {
+ return typeof value === 'number' && Number.isFinite(value) ? value : undefined;
+}
+
+function usageFrom(value: unknown): LlmTokenUsage {
+ const usage = record(value);
+ if (!usage) {
+ return {};
+ }
+ const inputTokens = numberValue(usage.input_tokens ?? usage.inputTokens);
+ const outputTokens = numberValue(usage.output_tokens ?? usage.outputTokens);
+ const explicitTotalTokens = numberValue(usage.total_tokens ?? usage.totalTokens);
+ const totalTokens =
+ explicitTotalTokens ??
+ (inputTokens !== undefined && outputTokens !== undefined ? inputTokens + outputTokens : undefined);
+ return {
+ ...(inputTokens !== undefined ? { inputTokens } : {}),
+ ...(outputTokens !== undefined ? { outputTokens } : {}),
+ ...(totalTokens !== undefined ? { totalTokens } : {}),
+ };
+}
+
+function stopReasonFrom(value: unknown): RunLoopStopReason {
+ const reason = text(value)?.toLowerCase();
+ if (reason && /(budget|max_turn|max-turn|limit)/.test(reason)) {
+ return 'budget';
+ }
+ return 'natural';
+}
+
+function errorMessageFrom(value: unknown): string {
+ if (value instanceof Error) {
+ return value.message;
+ }
+ const asRecord = record(value);
+ const message = text(asRecord?.message);
+ return message ?? text(value) ?? 'Codex turn failed';
+}
+
+/**
+ * Codex serializes API failures as a JSON envelope inside the event message
+ * (e.g. `{"type":"error","status":400,"error":{"message":"…"}}`). Surface the
+ * human-readable inner message so callers don't leak raw JSON; pass plain
+ * strings through unchanged.
+ */
+function unwrapCodexApiErrorMessage(raw: string): string {
+ const trimmed = raw.trim();
+ if (!trimmed.startsWith('{')) {
+ return raw;
+ }
+ try {
+ const parsed = record(JSON.parse(trimmed));
+ return text(record(parsed?.error)?.message) ?? text(parsed?.message) ?? raw;
+ } catch {
+ return raw;
+ }
+}
+
+/** @internal */
+export function parseCodexExecEventLine(line: string): unknown {
+ try {
+ return JSON.parse(line) as unknown;
+ } catch (error) {
+ throw new Error(`Codex JSONL event stream was malformed: ${error instanceof Error ? error.message : String(error)}`);
+ }
+}
+
+export function summarizeCodexExecEvents(
+ events: Iterable<unknown>,
+ options: CodexEventParseOptions = {},
+): CodexExecEventSummary {
+ const startedAt = options.startedAt ?? Date.now();
+ const now = options.now ?? Date.now;
+ let finalText = '';
+ let stopReason: RunLoopStopReason = 'natural';
+ let usage: LlmTokenUsage = {};
+ let turnCount = 0;
+ let completedStepCount = 0;
+ const stepBoundariesMs: number[] = [];
+ let toolCallCount = 0;
+ const toolFailures: string[] = [];
+ let error: Error | undefined;
+
+ for (const event of events) {
+ const eventRecord = record(event);
+ const eventType = text(eventRecord?.type);
+ if (!eventRecord || !eventType) {
+ continue;
+ }
+
+ if (eventType === 'turn.started') {
+ turnCount += 1;
+ continue;
+ }
+
+ const item = record(eventRecord.item);
+ const itemType = text(item?.type);
+
+ if (eventType === 'item.started' && itemType === 'mcp_tool_call') {
+ toolCallCount += 1;
+ continue;
+ }
+
+ if (isCompletedAgentStep(event)) {
+ completedStepCount += 1;
+ stepBoundariesMs.push(now() - startedAt);
+ // Only MCP tool calls fail the loop: a non-zero `command_execution` exit
+ // is normal agent exploration, not a runtime error. `status` is the
+ // authoritative signal (the SDK always sets it); the SDK also serializes
+ // `error: null` on successful calls, so an explicit-null `error` must NOT
+ // be read as a failure — only a populated error object counts.
+ if (itemType === 'mcp_tool_call' && (item?.status === 'failed' || (item?.error !== undefined && item?.error !== null))) {
+ const name = text(item?.name) ?? text(item?.tool) ?? text(item?.tool_name) ?? 'unknown';
+ toolFailures.push(`${name}: ${errorMessageFrom(item?.error)}`);
+ }
+ continue;
+ }
+
+ if (eventType === 'item.completed' && itemType === 'agent_message') {
+ finalText = text(item?.text) ?? finalText;
+ continue;
+ }
+
+ if (eventType === 'turn.completed') {
+ usage = usageFrom(eventRecord.usage);
+ if (completedStepCount === 0) {
+ stepBoundariesMs.push(now() - startedAt);
+ }
+ stopReason = stopReasonFrom(eventRecord.reason ?? eventRecord.stop_reason ?? eventRecord.terminal_reason);
+ continue;
+ }
+
+ if (eventType === 'turn.failed' || eventType === 'error') {
+ stopReason = 'error';
+ error = new Error(unwrapCodexApiErrorMessage(errorMessageFrom(eventRecord.error ?? eventRecord.message)));
+ continue;
+ }
+ }
+
+ return {
+ finalText,
+ stopReason,
+ usage,
+ stepCount: completedStepCount > 0 ? completedStepCount : turnCount,
+ stepBoundariesMs,
+ toolCallCount,
+ toolFailures,
+ ...(error ? { error } : {}),
+ };
+}
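The step-counting and failure rules in `summarizeCodexExecEvents` are easier to follow with a concrete event list. The block below is a minimal standalone sketch, not the shipped module: the `SketchEvent` shape, the `'completed'` status value, and the tool names `search` and `query` are simplified assumptions for illustration.

```typescript
// Standalone sketch of the step-counting and MCP-failure rules described in
// codex-exec-events.ts. Event shapes here are simplified assumptions.
type SketchItem = {
  type: string;
  status?: string;
  error?: { message?: string } | null;
  tool?: string;
};
type SketchEvent = { type: string; item?: SketchItem };

const STEP_ITEM_TYPES = new Set(['command_execution', 'mcp_tool_call', 'file_change', 'web_search']);

function summarizeSketch(events: SketchEvent[]): { stepCount: number; toolFailures: string[] } {
  let turns = 0;
  let steps = 0;
  const toolFailures: string[] = [];
  for (const event of events) {
    if (event.type === 'turn.started') {
      turns += 1;
      continue;
    }
    const item = event.item;
    if (event.type !== 'item.completed' || !item || !STEP_ITEM_TYPES.has(item.type)) {
      continue;
    }
    // Every completed agent action consumes one step, built-in or MCP.
    steps += 1;
    // Only MCP tool calls can fail the loop; an explicit `error: null` means success.
    const failed = item.status === 'failed' || (item.error !== undefined && item.error !== null);
    if (item.type === 'mcp_tool_call' && failed) {
      toolFailures.push(`${item.tool ?? 'unknown'}: ${item.error?.message ?? 'Codex turn failed'}`);
    }
  }
  // A turn that produced no agent actions still counts as one step.
  return { stepCount: steps > 0 ? steps : turns, toolFailures };
}

const sketchResult = summarizeSketch([
  { type: 'turn.started' },
  { type: 'item.completed', item: { type: 'command_execution', status: 'failed' } },
  { type: 'item.completed', item: { type: 'mcp_tool_call', status: 'completed', error: null, tool: 'search' } },
  { type: 'item.completed', item: { type: 'mcp_tool_call', status: 'failed', error: { message: 'timeout' }, tool: 'query' } },
  { type: 'item.completed', item: { type: 'agent_message' } },
  { type: 'turn.completed' },
]);
console.log(sketchResult);
```

In this sample the failed `command_execution` counts as a step but not as a tool failure, the `error: null` MCP call is treated as a success, and only the `query` call is reported.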
diff --git a/packages/cli/src/context/llm/codex-isolation.ts b/packages/cli/src/context/llm/codex-isolation.ts
new file mode 100644
index 00000000..d54ac1f8
--- /dev/null
+++ b/packages/cli/src/context/llm/codex-isolation.ts
@@ -0,0 +1,9 @@
+export const CODEX_ISOLATION_WARNING =
+ 'Codex backend isolation is limited by the public Codex SDK/CLI surface: ktx restricts the runtime MCP server to the current ktx tool set, disables Codex web search, asks for a read-only sandbox, and sets approval_policy=never, but Codex may still load user Codex config and built-in command execution or read-only file capabilities.';
+
+export const CODEX_ISOLATION_WARNING_FIX =
+ 'Use llm.provider.backend: claude-code when you need stricter Claude-Code-style runtime tool isolation, or remove host Codex MCP/tool config before running untrusted prompts through the codex backend.';
+
+export function formatCodexIsolationWarning(): string {
+ return `${CODEX_ISOLATION_WARNING} ${CODEX_ISOLATION_WARNING_FIX}`;
+}
diff --git a/packages/cli/src/context/llm/codex-mcp-runtime-server.ts b/packages/cli/src/context/llm/codex-mcp-runtime-server.ts
new file mode 100644
index 00000000..eacf28f9
--- /dev/null
+++ b/packages/cli/src/context/llm/codex-mcp-runtime-server.ts
@@ -0,0 +1,87 @@
+import { randomBytes } from 'node:crypto';
+import type { Server } from 'node:http';
+import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
+import type { KtxMcpServerLike } from '../mcp/types.js';
+import { runKtxMcpHttpServer, type KtxMcpHttpServerHandle } from '../../mcp-http-server.js';
+import type { KtxRuntimeToolSet } from './runtime-port.js';
+import { normalizeKtxRuntimeToolOutput } from './runtime-tools.js';
+
+/** @internal */
+export interface CreateCodexRuntimeMcpServerInput {
+ server?: KtxMcpServerLike;
+ toolSet: KtxRuntimeToolSet;
+}
+
+export interface CodexRuntimeMcpServerHandle {
+ url: string;
+ bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN';
+ bearerToken: string;
+ close(): Promise<void>;
+}
+
+type RunServer = typeof runKtxMcpHttpServer;
+
+export interface StartCodexRuntimeMcpServerInput {
+ projectDir: string;
+ toolSet: KtxRuntimeToolSet;
+ runServer?: RunServer;
+}
+
+/** @internal */
+export function createCodexRuntimeMcpServer(input: CreateCodexRuntimeMcpServerInput): KtxMcpServerLike {
+ const server =
+ input.server ??
+ (new McpServer({
+ name: 'ktx-runtime',
+ version: '0.0.0',
+ }) as KtxMcpServerLike);
+
+ for (const descriptor of Object.values(input.toolSet)) {
+ server.registerTool(
+ descriptor.name,
+ {
+ description: descriptor.description,
+ inputSchema: descriptor.inputSchema.shape,
+ },
+ async (toolInput) => {
+ const normalized = normalizeKtxRuntimeToolOutput(await descriptor.execute(toolInput));
+ return {
+ content: [{ type: 'text', text: normalized.markdown }],
+ ...(normalized.structured !== undefined && normalized.structured !== null && typeof normalized.structured === 'object'
+ ? { structuredContent: normalized.structured as object }
+ : {}),
+ };
+ },
+ );
+ }
+
+ return server;
+}
+
+function serverPort(server: Server, fallback: number): number {
+ const address = server.address();
+ return typeof address === 'object' && address ? address.port : fallback;
+}
+
+export async function startCodexRuntimeMcpServer(
+ input: StartCodexRuntimeMcpServerInput,
+): Promise<CodexRuntimeMcpServerHandle> {
+ const bearerToken = randomBytes(32).toString('hex');
+ const runServer = input.runServer ?? runKtxMcpHttpServer;
+ const handle = (await runServer({
+ projectDir: input.projectDir,
+ host: '127.0.0.1',
+ port: 0,
+ token: bearerToken,
+ allowedHosts: ['127.0.0.1', 'localhost'],
+ allowedOrigins: [],
+ createMcpServer: () => createCodexRuntimeMcpServer({ toolSet: input.toolSet }) as McpServer,
+ })) as KtxMcpHttpServerHandle;
+ const port = serverPort(handle.server, 0);
+ return {
+ url: `http://127.0.0.1:${port}/mcp`,
+ bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+ bearerToken,
+ close: () => handle.close(),
+ };
+}
diff --git a/packages/cli/src/context/llm/codex-models.ts b/packages/cli/src/context/llm/codex-models.ts
new file mode 100644
index 00000000..1a8b9b9d
--- /dev/null
+++ b/packages/cli/src/context/llm/codex-models.ts
@@ -0,0 +1,20 @@
+export const DEFAULT_CODEX_MODEL = 'gpt-5.5';
+
+const CODEX_MODEL_ALIASES: Record<string, string> = {
+ codex: DEFAULT_CODEX_MODEL,
+ default: DEFAULT_CODEX_MODEL,
+};
+
+const EXPLICIT_CODEX_MODEL_ID = /^(?:gpt|codex)-[a-z0-9][a-z0-9._-]*$/i;
+
+export function resolveCodexModel(model: string): string {
+ const normalized = model.trim();
+ const alias = Object.prototype.hasOwnProperty.call(CODEX_MODEL_ALIASES, normalized) ? CODEX_MODEL_ALIASES[normalized] : undefined;
+ if (alias) {
+ return alias;
+ }
+ if (EXPLICIT_CODEX_MODEL_ID.test(normalized)) {
+ return normalized;
+ }
+ throw new Error(`Unsupported Codex model "${model}". Use codex, default, or a gpt-* / codex-* model id.`);
+}
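To make the accepted and rejected `--llm-model` values concrete, here is a minimal standalone sketch that mirrors the alias table and ID pattern from `codex-models.ts`. It is not the shipped module: `resolveModelSketch` and `tryResolve` are hypothetical names, `codex-mini` is a made-up ID used only to exercise the pattern, and the sketch stores aliases in a `Map` so that inherited object keys such as `constructor` cannot match.

```typescript
// Standalone sketch: mirrors the alias map and ID pattern from codex-models.ts
// so it can run without the ktx package. Names here are illustrative only.
const DEFAULT_MODEL = 'gpt-5.5';
const ALIASES = new Map<string, string>([
  ['codex', DEFAULT_MODEL],
  ['default', DEFAULT_MODEL],
]);
const EXPLICIT_ID = /^(?:gpt|codex)-[a-z0-9][a-z0-9._-]*$/i;

function resolveModelSketch(model: string): string {
  const normalized = model.trim();
  const alias = ALIASES.get(normalized);
  if (alias !== undefined) {
    return alias;
  }
  if (EXPLICIT_ID.test(normalized)) {
    return normalized;
  }
  throw new Error(`Unsupported Codex model "${model}".`);
}

// Helper for the demo: report a rejection instead of throwing.
function tryResolve(model: string): string {
  try {
    return resolveModelSketch(model);
  } catch {
    return 'rejected';
  }
}

const samples = ['codex', ' default ', 'gpt-5.5', 'codex-mini', 'gpt-5.3-codex', 'claude-sonnet-4-6', 'gpt-'];
for (const sample of samples) {
  console.log(JSON.stringify(sample), '->', tryResolve(sample));
}
```

The check is purely syntactic. `gpt-5.3-codex` passes the pattern even though the setup docs say such IDs are not available to ChatGPT-subscription logins, so an unavailable model would only surface later, when Codex is actually contacted.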
diff --git a/packages/cli/src/context/llm/codex-runtime-config.ts b/packages/cli/src/context/llm/codex-runtime-config.ts
new file mode 100644
index 00000000..74de9efe
--- /dev/null
+++ b/packages/cli/src/context/llm/codex-runtime-config.ts
@@ -0,0 +1,38 @@
+interface CodexRuntimeMcpConfig {
+ url: string;
+ bearerTokenEnvVar: string;
+ bearerToken: string;
+ toolNames: string[];
+}
+
+export interface BuildCodexRuntimeConfigInput {
+ model: string;
+ mcp?: CodexRuntimeMcpConfig;
+}
+
+export interface CodexRuntimeConfig {
+ configOverrides: Record<string, unknown>;
+ env: Record<string, string>;
+}
+
+export function buildCodexRuntimeConfig(input: BuildCodexRuntimeConfigInput): CodexRuntimeConfig {
+ const configOverrides: Record<string, unknown> = {
+ history: { persistence: 'none' },
+ };
+ const env: Record<string, string> = {};
+
+ if (input.mcp) {
+ configOverrides.mcp_servers = {
+ ktx: {
+ url: input.mcp.url,
+ bearer_token_env_var: input.mcp.bearerTokenEnvVar,
+ enabled_tools: input.mcp.toolNames,
+ default_tools_approval_mode: 'approve',
+ required: true,
+ },
+ };
+ env[input.mcp.bearerTokenEnvVar] = input.mcp.bearerToken;
+ }
+
+ return { configOverrides, env };
+}
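A short worked example may help show what `buildCodexRuntimeConfig` hands to the runner. The sketch below rebuilds the same override shape in a standalone function. `buildOverridesSketch` is a hypothetical name, and the URL, token, and tool names are placeholders rather than values ktx produces.

```typescript
// Standalone sketch: rebuilds the override shape from codex-runtime-config.ts
// for a sample run. URL, token, and tool names are made-up placeholders.
interface SketchMcp {
  url: string;
  bearerTokenEnvVar: string;
  bearerToken: string;
  toolNames: string[];
}

function buildOverridesSketch(mcp?: SketchMcp): {
  configOverrides: Record<string, unknown>;
  env: Record<string, string>;
} {
  const configOverrides: Record<string, unknown> = { history: { persistence: 'none' } };
  const env: Record<string, string> = {};
  if (mcp) {
    configOverrides.mcp_servers = {
      ktx: {
        url: mcp.url,
        // Only the variable name goes into the config; the value travels via env.
        bearer_token_env_var: mcp.bearerTokenEnvVar,
        enabled_tools: mcp.toolNames,
        default_tools_approval_mode: 'approve',
        required: true,
      },
    };
    env[mcp.bearerTokenEnvVar] = mcp.bearerToken;
  }
  return { configOverrides, env };
}

const sketchConfig = buildOverridesSketch({
  url: 'http://127.0.0.1:49152/mcp',
  bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
  bearerToken: 'example-token',
  toolNames: ['example_tool'],
});
console.log(JSON.stringify(sketchConfig.configOverrides, null, 2));
console.log(Object.keys(sketchConfig.env));
```

The point of the split is that the bearer token never appears in the config overrides: the overrides carry only the environment variable name, and the token itself is placed in the child process environment.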
diff --git a/packages/cli/src/context/llm/codex-runtime.ts b/packages/cli/src/context/llm/codex-runtime.ts
new file mode 100644
index 00000000..3535072b
--- /dev/null
+++ b/packages/cli/src/context/llm/codex-runtime.ts
@@ -0,0 +1,371 @@
+import { z } from 'zod';
+import { noopLogger, type KtxLogger } from '../core/config.js';
+import { isCompletedAgentStep, summarizeCodexExecEvents, type CodexExecEventSummary } from './codex-exec-events.js';
+import {
+ startCodexRuntimeMcpServer,
+ type CodexRuntimeMcpServerHandle,
+} from './codex-mcp-runtime-server.js';
+import { resolveCodexModel } from './codex-models.js';
+import { buildCodexRuntimeConfig } from './codex-runtime-config.js';
+import { CodexSdkCliRunner, type CodexSdkRunner } from './codex-sdk-runner.js';
+import type {
+ KtxGenerateObjectInput,
+ KtxGenerateTextInput,
+ KtxLlmRuntimePort,
+ KtxRuntimeToolSet,
+ LlmTokenUsage,
+ RunLoopParams,
+ RunLoopResult,
+} from './runtime-port.js';
+
+export interface CodexKtxLlmRuntimeDeps {
+ projectDir: string;
+ modelSlots: { default: string } & Partial<Record<string, string>>;
+ runner?: CodexSdkRunner;
+ startMcpServer?: (input: { projectDir: string; toolSet: KtxRuntimeToolSet }) => Promise<CodexRuntimeMcpServerHandle>;
+ logger?: KtxLogger;
+}
+
+function modelForRole(modelSlots: CodexKtxLlmRuntimeDeps['modelSlots'], role: string): string {
+ return resolveCodexModel(modelSlots[role] ?? modelSlots.default);
+}
+
+function promptWithSystem(system: string | undefined, prompt: string): string {
+ return [system, prompt].filter(Boolean).join('\n\n');
+}
+
+interface CollectCodexEventsOptions {
+ stepBudget?: number;
+ abortController?: AbortController;
+ onStep?: (stepIndex: number) => void | Promise<void>;
+}
+
+interface CollectCodexEventsResult {
+ events: unknown[];
+ budgetExceeded: boolean;
+ streamError?: Error;
+}
+
+function eventRecord(value: unknown): Record<string, unknown> | undefined {
+ return value && typeof value === 'object' ? (value as Record<string, unknown>) : undefined;
+}
+
+function isTurnCompleted(event: unknown): boolean {
+ return eventRecord(event)?.type === 'turn.completed';
+}
+
+/**
+ * Drains the Codex stream once, emitting a step as each agent action completes
+ * so callers see live progress and the step budget is enforced mid-run. Every
+ * completed agent-action item counts (see {@link isCompletedAgentStep}), so
+ * built-in `command_execution` steps decrement the budget the same as
+ * `mcp_tool_call`s. A turn that produced no actions still counts as one step,
+ * matching the metrics summary and the AI SDK backend.
+ */
+async function collectEvents(
+ events: AsyncIterable<unknown>,
+ options: CollectCodexEventsOptions = {},
+): Promise<CollectCodexEventsResult> {
+ const collected: unknown[] = [];
+ let completedSteps = 0;
+ let sawActionStep = false;
+ let budgetExceeded = false;
+ let streamError: Error | undefined;
+
+ // The SDK yields every stdout event, then throws on a non-zero codex exec
+ // exit. Catch that throw so the events already collected (which carry the
+ // real `turn.failed`/`error` reason) survive for the summary; the masked
+ // exit message is kept only as a fallback when no error event was emitted.
+ try {
+ for await (const event of events) {
+ collected.push(event);
+
+ const isActionStep = isCompletedAgentStep(event);
+ if (isActionStep) {
+ sawActionStep = true;
+ } else if (sawActionStep || !isTurnCompleted(event)) {
+ // Only fall back to counting a bare turn as a step when the turn produced
+ // no agent actions; a completed turn is terminal, so it never aborts.
+ continue;
+ }
+
+ completedSteps += 1;
+ await options.onStep?.(completedSteps);
+ if (isActionStep && options.stepBudget !== undefined && completedSteps >= options.stepBudget) {
+ budgetExceeded = true;
+ options.abortController?.abort();
+ break;
+ }
+ }
+ } catch (error) {
+ streamError = error instanceof Error ? error : new Error(String(error));
+ }
+
+ return { events: collected, budgetExceeded, ...(streamError ? { streamError } : {}) };
+}
+
+function metrics(summary: CodexExecEventSummary, startedAt: number): { totalMs: number; usage: LlmTokenUsage } {
+ return { totalMs: Date.now() - startedAt, usage: summary.usage };
+}
+
+function summaryError(summary: CodexExecEventSummary, streamError?: Error): Error | undefined {
+ // A `turn.failed`/`error` event carries the real reason; prefer it over the
+ // SDK's generic non-zero-exit throw. Fall back to the stream error only when
+ // no event explained the failure (e.g. spawn failure or auth before a turn).
+ if (summary.error) {
+ return summary.error;
+ }
+ if (summary.toolFailures.length > 0) {
+ return new Error(`Codex runtime tool call failed: ${summary.toolFailures.join('; ')}`);
+ }
+ return streamError;
+}
+
+function assertSuccessfulText(summary: CodexExecEventSummary, streamError?: Error): string {
+ const error = summaryError(summary, streamError);
+ if (error) {
+ throw error;
+ }
+ if (!summary.finalText.trim()) {
+ throw new Error('Codex completed without an agent message');
+ }
+ return summary.finalText;
+}
+
+function parseStructuredOutput<TOutput, TSchema extends z.ZodType<TOutput>>(schema: TSchema, text: string): TOutput {
+ try {
+ return schema.parse(JSON.parse(text));
+ } catch (error) {
+ const message = error instanceof Error ? error.message : String(error);
+ throw new Error(`Codex structured output failed validation: ${message}`);
+ }
+}
+
+async function mcpForTools(input: {
+ projectDir: string;
+ toolSet?: KtxRuntimeToolSet;
+ startMcpServer: CodexKtxLlmRuntimeDeps['startMcpServer'];
+}): Promise<CodexRuntimeMcpServerHandle | undefined> {
+ if (!input.toolSet || Object.keys(input.toolSet).length === 0) {
+ return undefined;
+ }
+ return (input.startMcpServer ?? startCodexRuntimeMcpServer)({
+ projectDir: input.projectDir,
+ toolSet: input.toolSet,
+ });
+}
+
+function runtimeToolNames(toolSet: KtxRuntimeToolSet | undefined): string[] {
+ return Object.values(toolSet ?? {}).map((descriptor) => descriptor.name);
+}
+
+export class CodexKtxLlmRuntime implements KtxLlmRuntimePort {
+ private readonly runner: CodexSdkRunner;
+ private readonly logger: KtxLogger;
+
+ constructor(private readonly deps: CodexKtxLlmRuntimeDeps) {
+ this.runner = deps.runner ?? new CodexSdkCliRunner();
+ this.logger = deps.logger ?? noopLogger;
+ }
+
+ async generateText(input: KtxGenerateTextInput): Promise<string> {
+ const startedAt = Date.now();
+ const model = modelForRole(this.deps.modelSlots, input.role);
+ const mcp = await mcpForTools({
+ projectDir: this.deps.projectDir,
+ toolSet: input.tools,
+ startMcpServer: this.deps.startMcpServer,
+ });
+ try {
+ const config = buildCodexRuntimeConfig({
+ model,
+ ...(mcp
+ ? {
+ mcp: {
+ url: mcp.url,
+ bearerTokenEnvVar: mcp.bearerTokenEnvVar,
+ bearerToken: mcp.bearerToken,
+ toolNames: runtimeToolNames(input.tools),
+ },
+ }
+ : {}),
+ });
+ const collected = await collectEvents(
+ await this.runner.runStreamed({
+ projectDir: this.deps.projectDir,
+ model,
+ prompt: promptWithSystem(input.system, input.prompt),
+ configOverrides: config.configOverrides,
+ env: config.env,
+ }),
+ );
+ const summary = summarizeCodexExecEvents(collected.events, { startedAt });
+ input.onMetrics?.(metrics(summary, startedAt));
+ return assertSuccessfulText(summary, collected.streamError);
+ } finally {
+ await mcp?.close();
+ }
+ }
+
+ async generateObject<TOutput, TSchema extends z.ZodType<TOutput>>(
+ input: KtxGenerateObjectInput,
+ ): Promise<TOutput> {
+ const startedAt = Date.now();
+ const model = modelForRole(this.deps.modelSlots, input.role);
+ const mcp = await mcpForTools({
+ projectDir: this.deps.projectDir,
+ toolSet: input.tools,
+ startMcpServer: this.deps.startMcpServer,
+ });
+ try {
+ const config = buildCodexRuntimeConfig({
+ model,
+ ...(mcp
+ ? {
+ mcp: {
+ url: mcp.url,
+ bearerTokenEnvVar: mcp.bearerTokenEnvVar,
+ bearerToken: mcp.bearerToken,
+ toolNames: runtimeToolNames(input.tools),
+ },
+ }
+ : {}),
+ });
+ const collected = await collectEvents(
+ await this.runner.runStreamed({
+ projectDir: this.deps.projectDir,
+ model,
+ prompt: promptWithSystem(input.system, input.prompt),
+ configOverrides: config.configOverrides,
+ env: config.env,
+ outputSchema: z.toJSONSchema(input.schema, { target: 'draft-7' }) as Record<string, unknown>,
+ }),
+ );
+ const summary = summarizeCodexExecEvents(collected.events, { startedAt });
+ input.onMetrics?.(metrics(summary, startedAt));
+ return parseStructuredOutput(input.schema, assertSuccessfulText(summary, collected.streamError));
+ } finally {
+ await mcp?.close();
+ }
+ }
+
+ async runAgentLoop(params: RunLoopParams): Promise {
+ const startedAt = Date.now();
+ const model = modelForRole(this.deps.modelSlots, params.modelRole);
+ let mcp: CodexRuntimeMcpServerHandle | undefined;
+ try {
+ mcp = await mcpForTools({
+ projectDir: this.deps.projectDir,
+ toolSet: params.toolSet,
+ startMcpServer: this.deps.startMcpServer,
+ });
+ const config = buildCodexRuntimeConfig({
+ model,
+ ...(mcp
+ ? {
+ mcp: {
+ url: mcp.url,
+ bearerTokenEnvVar: mcp.bearerTokenEnvVar,
+ bearerToken: mcp.bearerToken,
+ toolNames: runtimeToolNames(params.toolSet),
+ },
+ }
+ : {}),
+ });
+ const abortController = new AbortController();
+ const onStep = async (stepIndex: number): Promise<void> => {
+ try {
+ await params.onStepFinish?.({ stepIndex, stepBudget: params.stepBudget });
+ } catch (error) {
+ this.logger.warn(
+ `[codex-runner] onStepFinish callback threw; ignoring: ${error instanceof Error ? error.message : String(error)}`,
+ );
+ }
+ };
+ const collected = await collectEvents(
+ await this.runner.runStreamed({
+ projectDir: this.deps.projectDir,
+ model,
+ prompt: promptWithSystem(params.systemPrompt, params.userPrompt),
+ configOverrides: config.configOverrides,
+ env: config.env,
+ signal: abortController.signal,
+ }),
+ { stepBudget: params.stepBudget, abortController, onStep },
+ );
+ const summary = summarizeCodexExecEvents(collected.events, { startedAt });
+ const error = summaryError(summary, collected.streamError);
+ const stopReason = collected.budgetExceeded ? 'budget' : error ? 'error' : summary.stopReason;
+ return {
+ stopReason,
+ ...(stopReason === 'error' && error ? { error } : {}),
+ metrics: {
+ totalMs: Date.now() - startedAt,
+ usage: summary.usage,
+ stepCount: summary.stepCount,
+ stepBoundariesMs: summary.stepBoundariesMs,
+ },
+ };
+ } catch (error) {
+ const err = error instanceof Error ? error : new Error(String(error));
+ return {
+ stopReason: 'error',
+ error: err,
+ metrics: { totalMs: Date.now() - startedAt, usage: {}, stepCount: 0, stepBoundariesMs: [] },
+ };
+ } finally {
+ await mcp?.close();
+ }
+ }
+}
+
+// A rejected model is not an auth failure: Codex authenticated, connected, and
+// the API refused the model id. These markers come from the API error envelope
+// (e.g. "model is not supported", "invalid_request_error").
+const MODEL_UNAVAILABLE_MARKERS =
+ /\bnot supported\b|\bnot available\b|\bdoes not exist\b|invalid_request_error|\bunknown model\b|\bunsupported model\b/i;
+
+function describeCodexProbeFailure(model: string, message: string): { message: string; fix: string } {
+ if (MODEL_UNAVAILABLE_MARKERS.test(message)) {
+ const fix = `Run \`codex\` to see the models your account supports, then set llm.models.default in ktx.yaml (or rerun \`ktx setup\`).`;
+ return {
+ message: `Codex is authenticated, but the configured model "${model}" is not available for this Codex account. ${fix} Details: ${message}`,
+ fix,
+ };
+ }
+ const fix = `Authenticate Codex locally with the Codex CLI, verify the Codex CLI is installed, then rerun setup or \`ktx status\`.`;
+ return {
+ message: `Codex authentication is not usable. ${fix} Details: ${message}`,
+ fix,
+ };
+}
+
+export async function runCodexAuthProbe(input: {
+ projectDir: string;
+ model: string;
+ runner?: CodexSdkRunner;
+}): Promise<{ ok: true } | { ok: false; message: string; fix: string }> {
+ let model: string;
+ try {
+ model = resolveCodexModel(input.model);
+ } catch (error) {
+ return {
+ ok: false,
+ message: error instanceof Error ? error.message : String(error),
+ fix: 'Set llm.models.default in ktx.yaml to a supported codex model (codex, default, or a gpt-* / codex-* id), or rerun `ktx setup`.',
+ };
+ }
+
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: input.projectDir,
+ modelSlots: { default: model },
+ ...(input.runner ? { runner: input.runner } : {}),
+ });
+ try {
+ await runtime.generateText({ role: 'default', prompt: 'Reply with exactly: ok' });
+ return { ok: true };
+ } catch (error) {
+ const message = error instanceof Error ? error.message : String(error);
+ return { ok: false, ...describeCodexProbeFailure(model, message) };
+ }
+}
diff --git a/packages/cli/src/context/llm/codex-sdk-runner.ts b/packages/cli/src/context/llm/codex-sdk-runner.ts
new file mode 100644
index 00000000..58170b3a
--- /dev/null
+++ b/packages/cli/src/context/llm/codex-sdk-runner.ts
@@ -0,0 +1,96 @@
+import { Codex, type CodexOptions, type ThreadOptions, type TurnOptions } from '@openai/codex-sdk';
+
+export interface CodexSdkRunnerInput {
+ projectDir: string;
+ model: string;
+ prompt: string;
+ configOverrides?: Record<string, unknown>;
+ env?: Record<string, string>;
+ outputSchema?: Record<string, unknown>;
+ signal?: AbortSignal;
+}
+
+export interface CodexSdkRunner {
+ runStreamed(input: CodexSdkRunnerInput): Promise<AsyncIterable<unknown>>;
+}
+
+type CodexThread = {
+ runStreamed(input: string, turnOptions?: TurnOptions): Promise<{ events: AsyncIterable<unknown> }>;
+};
+
+type CodexClient = {
+ startThread(options: ThreadOptions): CodexThread;
+};
+
+type CodexConstructor = new (options?: CodexOptions) => CodexClient;
+
+export interface CodexSdkCliRunnerOptions {
+ envBase?: NodeJS.ProcessEnv;
+ codexPathOverride?: string;
+}
+
+const CODEX_ENV_ALLOWLIST = new Set([
+ 'HOME',
+ 'USERPROFILE',
+ 'APPDATA',
+ 'LOCALAPPDATA',
+ 'XDG_CONFIG_HOME',
+ 'CODEX_HOME',
+ 'CODEX_API_KEY',
+ 'OPENAI_API_KEY',
+ 'PATH',
+ 'Path',
+ 'SYSTEMROOT',
+ 'COMSPEC',
+ 'TMPDIR',
+ 'TMP',
+ 'TEMP',
+ 'SSL_CERT_FILE',
+ 'SSL_CERT_DIR',
+ 'NODE_EXTRA_CA_CERTS',
+ 'HTTPS_PROXY',
+ 'HTTP_PROXY',
+ 'ALL_PROXY',
+ 'NO_PROXY',
+]);
+
+function buildCodexSdkEnv(baseEnv: NodeJS.ProcessEnv, overrides: Record<string, string> | undefined): Record<string, string> {
+ const env: Record<string, string> = {};
+ for (const key of CODEX_ENV_ALLOWLIST) {
+ const value = baseEnv[key];
+ if (typeof value === 'string') {
+ env[key] = value;
+ }
+ }
+ return { ...env, ...(overrides ?? {}) };
+}
+
+export class CodexSdkCliRunner implements CodexSdkRunner {
+ constructor(private readonly options: CodexSdkCliRunnerOptions = {}) {}
+
+ async runStreamed(input: CodexSdkRunnerInput): Promise<AsyncIterable<unknown>> {
+ const CodexClass = Codex as CodexConstructor;
+ const codex = new CodexClass({
+ ...(input.configOverrides ? { config: input.configOverrides as CodexOptions['config'] } : {}),
+ env: buildCodexSdkEnv(this.options.envBase ?? process.env, input.env),
+ ...(this.options.codexPathOverride ? { codexPathOverride: this.options.codexPathOverride } : {}),
+ });
+ const thread = codex.startThread({
+ workingDirectory: input.projectDir,
+ skipGitRepoCheck: true,
+ model: input.model,
+ sandboxMode: 'read-only',
+ webSearchMode: 'disabled',
+ approvalPolicy: 'never',
+ });
+ const turnOptions: TurnOptions = {
+ ...(input.outputSchema ? { outputSchema: input.outputSchema } : {}),
+ ...(input.signal ? { signal: input.signal } : {}),
+ };
+ const streamed = await thread.runStreamed(
+ input.prompt,
+ Object.keys(turnOptions).length > 0 ? turnOptions : undefined,
+ );
+ return streamed.events;
+ }
+}
diff --git a/packages/cli/src/context/llm/local-config.ts b/packages/cli/src/context/llm/local-config.ts
index c64a85cf..58bd29a5 100644
--- a/packages/cli/src/context/llm/local-config.ts
+++ b/packages/cli/src/context/llm/local-config.ts
@@ -5,6 +5,7 @@ import { resolveKtxConfigReference } from '../core/config-reference.js';
import type { KtxProjectEmbeddingConfig, KtxProjectLlmConfig } from '../project/config.js';
import { AiSdkKtxLlmRuntime } from './ai-sdk-runtime.js';
import { ClaudeCodeKtxLlmRuntime } from './claude-code-runtime.js';
+import { CodexKtxLlmRuntime } from './codex-runtime.js';
import type { KtxLlmRuntimePort } from './runtime-port.js';
interface LocalConfigDeps {
@@ -13,6 +14,7 @@ interface LocalConfigDeps {
createKtxLlmProvider?: typeof createKtxLlmProvider;
createKtxEmbeddingProvider?: typeof createKtxEmbeddingProvider;
 createClaudeCodeRuntime?: (deps: ConstructorParameters<typeof ClaudeCodeKtxLlmRuntime>[0]) => KtxLlmRuntimePort;
+ createCodexRuntime?: (deps: ConstructorParameters<typeof CodexKtxLlmRuntime>[0]) => KtxLlmRuntimePort;
createAiSdkRuntime?: (deps: { llmProvider: KtxLlmProvider }) => KtxLlmRuntimePort;
}
@@ -104,7 +106,7 @@ export function createLocalKtxLlmProviderFromConfig(
deps: LocalConfigDeps = {},
): KtxLlmProvider | null {
const resolved = resolveLocalKtxLlmConfig(config, deps.env ?? process.env);
- if (!resolved || resolved.backend === 'claude-code') {
+ if (!resolved || resolved.backend === 'claude-code' || resolved.backend === 'codex') {
return null;
}
return (deps.createKtxLlmProvider ?? createKtxLlmProvider)(resolved);
@@ -129,6 +131,16 @@ export function createLocalKtxLlmRuntimeFromConfig(
env: deps.env,
});
}
+ if (resolved.backend === 'codex') {
+ const projectDir = deps.projectDir;
+ if (!projectDir) {
+ throw new Error('projectDir is required when creating the codex LLM runtime');
+ }
+ return (deps.createCodexRuntime ?? ((runtimeDeps) => new CodexKtxLlmRuntime(runtimeDeps)))({
+ projectDir,
+ modelSlots: resolved.modelSlots,
+ });
+ }
const llmProvider = (deps.createKtxLlmProvider ?? createKtxLlmProvider)(resolved);
return (deps.createAiSdkRuntime ?? ((runtimeDeps) => new AiSdkKtxLlmRuntime(runtimeDeps)))({ llmProvider });
}
diff --git a/packages/cli/src/context/project/config.ts b/packages/cli/src/context/project/config.ts
index a8d38d1d..cbea79b6 100644
--- a/packages/cli/src/context/project/config.ts
+++ b/packages/cli/src/context/project/config.ts
@@ -3,7 +3,7 @@ import YAML from 'yaml';
import * as z from 'zod';
import { connectionConfigSchema } from './driver-schemas.js';
-const KTX_LLM_BACKENDS = ['none', 'anthropic', 'vertex', 'gateway', 'claude-code'] as const;
+const KTX_LLM_BACKENDS = ['none', 'anthropic', 'vertex', 'gateway', 'claude-code', 'codex'] as const;
const KTX_EMBEDDING_BACKENDS = ['none', 'openai', 'sentence-transformers'] as const;
const KTX_PROMPT_CACHE_TTLS = ['5m', '1h'] as const;
const KTX_ENRICHMENT_MODES = ['none', 'deterministic', 'llm'] as const;
@@ -38,7 +38,7 @@ const llmProviderSchema = z
.enum(KTX_LLM_BACKENDS)
.default('none')
.describe(
- 'LLM provider backend. "none" disables LLM features; "anthropic" / "vertex" / "gateway" require the matching nested credentials block; "claude-code" uses the local Claude Code session.',
+ 'LLM provider backend. "none" disables LLM features; "anthropic" / "vertex" / "gateway" require the matching nested credentials block; "claude-code" uses the local Claude Code session; "codex" uses the local Codex session.',
),
vertex: vertexProviderSchema.optional().describe('Vertex AI credentials, used when backend is "vertex".'),
anthropic: apiCredentialsSchema.optional().describe('Anthropic API credentials, used when backend is "anthropic".'),
diff --git a/packages/cli/src/llm/types.ts b/packages/cli/src/llm/types.ts
index 3f7f67e2..a190b1c0 100644
--- a/packages/cli/src/llm/types.ts
+++ b/packages/cli/src/llm/types.ts
@@ -3,7 +3,7 @@ import type { LanguageModel, TelemetrySettings, ToolCallRepairFunction, ToolSet
export const KTX_MODEL_ROLES = ['default', 'triage', 'candidateExtraction', 'curator', 'reconcile', 'repair'] as const;
export type KtxModelRole = (typeof KTX_MODEL_ROLES)[number];
-type KtxLlmBackend = 'anthropic' | 'vertex' | 'gateway' | 'claude-code';
+type KtxLlmBackend = 'anthropic' | 'vertex' | 'gateway' | 'claude-code' | 'codex';
export type KtxPromptCacheTtl = '5m' | '1h';
type KtxJsonValue =
diff --git a/packages/cli/src/setup-models.ts b/packages/cli/src/setup-models.ts
index 041eef5c..8e8cf30b 100644
--- a/packages/cli/src/setup-models.ts
+++ b/packages/cli/src/setup-models.ts
@@ -3,6 +3,9 @@ import { writeFile } from 'node:fs/promises';
import { promisify } from 'node:util';
import { resolveLocalKtxLlmConfig } from './context/llm/local-config.js';
import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import { formatCodexIsolationWarning } from './context/llm/codex-isolation.js';
+import { runCodexAuthProbe } from './context/llm/codex-runtime.js';
+import { DEFAULT_CODEX_MODEL } from './context/llm/codex-models.js';
import { resolveKtxConfigReference } from './context/core/config-reference.js';
import { type KtxProjectConfig, type KtxProjectLlmConfig, serializeKtxProjectConfig } from './context/project/config.js';
import { loadKtxProject } from './context/project/project.js';
@@ -56,7 +59,7 @@ export interface AnthropicModelChoice {
recommended: boolean;
}
-export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code';
+export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code' | 'codex';
/** @internal */
export interface KtxSetupModelPromptAdapter {
@@ -82,6 +85,7 @@ export interface KtxSetupModelDeps {
model: string;
env?: NodeJS.ProcessEnv;
}) => Promise<{ ok: true } | { ok: false; message: string }>;
+ codexAuthProbe?: (input: { projectDir: string; model: string }) => Promise<{ ok: true } | { ok: false; message: string }>;
readGcloudProject?: () => Promise;
listGcloudProjects?: () => Promise;
spinner?: () => KtxCliSpinner;
@@ -110,6 +114,20 @@ const CLAUDE_CODE_MODELS: AnthropicModelChoice[] = [
{ id: 'haiku', label: 'Claude Haiku', recommended: false },
];
+// Curated Codex models from OpenAI's current lineup that work under both
+// ChatGPT-account (subscription) and API-key auth. Intentionally omitted:
+// the `*-codex` ids (e.g. gpt-5.3-codex, gpt-5.2-codex) are API-key-only and
+// fail on ChatGPT-account auth, and gpt-5.3-codex-spark is a ChatGPT-Pro-only
+// research preview. Codex resolves real availability per account at runtime
+// (its binary remote-fetches the model list), so this is a convenience
+// shortlist only — the manual-entry option accepts any id your account's
+// `codex` picker exposes, and the auth probe reports an unsupported choice.
+const CODEX_MODELS: AnthropicModelChoice[] = [
+ { id: 'gpt-5.5', label: 'GPT-5.5', recommended: true },
+ { id: 'gpt-5.4', label: 'GPT-5.4', recommended: false },
+ { id: 'gpt-5.4-mini', label: 'GPT-5.4 mini', recommended: false },
+];
+
const HIDDEN_ANTHROPIC_MODEL_PATTERNS = [
/^claude-sonnet-4$/i,
/^claude-opus-4$/i,
@@ -272,7 +290,12 @@ export function isKtxSetupLlmConfigReady(config: KtxProjectLlmConfig): boolean {
return typeof resolved.vertex?.location === 'string' && resolved.vertex.location.trim().length > 0;
}
- return resolved.backend === 'anthropic' || resolved.backend === 'gateway' || resolved.backend === 'claude-code';
+ return (
+ resolved.backend === 'anthropic' ||
+ resolved.backend === 'gateway' ||
+ resolved.backend === 'claude-code' ||
+ resolved.backend === 'codex'
+ );
}
function hasUsableConfiguredLlm(config: KtxProjectConfig): boolean {
@@ -284,7 +307,8 @@ function buildProjectLlmConfig(
provider:
| { backend: 'anthropic'; credentialRef: string }
| { backend: 'vertex'; vertex: { project?: string; location: string } }
- | { backend: 'claude-code' },
+ | { backend: 'claude-code' }
+ | { backend: 'codex' },
model: string,
): KtxProjectLlmConfig {
if (provider.backend === 'claude-code') {
@@ -295,6 +319,14 @@ function buildProjectLlmConfig(
};
}
+ if (provider.backend === 'codex') {
+ return {
+ provider: { backend: 'codex' },
+ models: { ...existing.models, default: model },
+ promptCaching: existing.promptCaching,
+ };
+ }
+
if (provider.backend === 'vertex') {
return {
provider: {
@@ -515,6 +547,7 @@ async function chooseBackend(
message: 'Which LLM provider should KTX use?',
options: [
{ value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
+ { value: 'codex', label: 'Codex subscription' },
{ value: 'anthropic', label: 'Anthropic API key' },
{ value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
{ value: 'back', label: 'Back' },
@@ -525,7 +558,7 @@ async function chooseBackend(
}
return {
status: 'ready',
- backend: choice === 'vertex' || choice === 'claude-code' ? choice : 'anthropic',
+ backend: choice === 'vertex' || choice === 'claude-code' || choice === 'codex' ? choice : 'anthropic',
prompted: true,
};
}
@@ -884,12 +917,51 @@ async function chooseClaudeCodeModel(args: KtxSetupModelArgs, deps: KtxSetupMode
return { status: 'ready', model: choice };
}
+async function chooseCodexModel(args: KtxSetupModelArgs, deps: KtxSetupModelDeps): Promise {
+ const providedModel = requestedModel(args);
+ if (providedModel) {
+ return { status: 'ready', model: providedModel };
+ }
+ if (args.inputMode === 'disabled') {
+ return { status: 'ready', model: DEFAULT_CODEX_MODEL };
+ }
+
+ const prompts = deps.prompts ?? createPromptAdapter();
+ const choice = await prompts.select({
+ message: `Which Codex model should KTX use?\n\n${ANTHROPIC_MODEL_PROMPT_CONTEXT}`,
+ options: [
+ ...CODEX_MODELS.map((model) => ({
+ value: model.id,
+ label: model.label,
+ ...(model.recommended ? { hint: 'recommended' } : {}),
+ })),
+ { value: 'manual', label: 'Enter a Codex model ID manually' },
+ { value: 'back', label: 'Back' },
+ ],
+ });
+ if (choice === 'back') {
+ return { status: 'back' };
+ }
+ if (choice === 'manual') {
+ const manual = await prompts.text({
+ message: withTextInputNavigation('Codex model ID'),
+ placeholder: CODEX_MODELS.find((model) => model.recommended)?.id ?? CODEX_MODELS[0]?.id,
+ });
+ if (manual === undefined) {
+ return { status: 'back' };
+ }
+ return manual.trim() ? { status: 'ready', model: manual.trim() } : { status: 'missing-input' };
+ }
+ return { status: 'ready', model: choice };
+}
+
async function persistLlmConfig(
projectDir: string,
provider:
| { backend: 'anthropic'; credentialRef: string }
| { backend: 'vertex'; vertex: { project?: string; location: string } }
- | { backend: 'claude-code' },
+ | { backend: 'claude-code' }
+ | { backend: 'codex' },
model: string,
): Promise {
const project = await loadKtxProject({ projectDir });
@@ -1031,6 +1103,32 @@ export async function runKtxSetupAnthropicModelStep(
return { status: 'ready', projectDir: args.projectDir };
}
+ if (backendChoice.backend === 'codex') {
+ const model = await chooseCodexModel(backendArgs, deps);
+ if (model.status === 'back' && backendChoice.prompted) {
+ attemptArgs = buildInteractiveRetryArgs(args);
+ continue;
+ }
+ if (model.status === 'invalid-credential') {
+ return { status: 'failed', projectDir: args.projectDir };
+ }
+ if (model.status !== 'ready') {
+ return { status: model.status, projectDir: args.projectDir };
+ }
+ const probe = deps.codexAuthProbe ?? runCodexAuthProbe;
+ const health = await probe({ projectDir: args.projectDir, model: model.model });
+ if (!health.ok) {
+ io.stderr.write(`${health.message}\n`);
+ return { status: 'failed', projectDir: args.projectDir };
+ }
+ // Prefix the clack gutter so the warning sits inside the setup frame
+ // instead of breaking out of it; kept on stderr for scripted runs.
+ io.stderr.write(`│ ${formatCodexIsolationWarning()}\n`);
+ await persistLlmConfig(args.projectDir, { backend: 'codex' }, model.model);
+ io.stdout.write(`│ LLM ready: yes (codex, ${model.model})\n`);
+ return { status: 'ready', projectDir: args.projectDir };
+ }
+
const credential = await chooseCredentialRef(backendArgs, io, deps);
if (credential.status === 'back' && backendChoice.prompted) {
attemptArgs = buildInteractiveRetryArgs(args);
diff --git a/packages/cli/src/status-project.ts b/packages/cli/src/status-project.ts
index 097f4091..ff7b98f4 100644
--- a/packages/cli/src/status-project.ts
+++ b/packages/cli/src/status-project.ts
@@ -1,6 +1,11 @@
import { stat as statAsync, readdir as readdirAsync } from 'node:fs/promises';
import { basename, join } from 'node:path';
import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import {
+ CODEX_ISOLATION_WARNING,
+ CODEX_ISOLATION_WARNING_FIX,
+} from './context/llm/codex-isolation.js';
+import { runCodexAuthProbe } from './context/llm/codex-runtime.js';
import type { KtxConfigIssue, KtxProjectConfig, KtxProjectConnectionConfig, KtxProjectEmbeddingConfig, KtxProjectLlmConfig } from './context/project/config.js';
import type { KtxLocalProject } from './context/project/project.js';
import { ktxLocalStateDbPath } from './context/project/local-state-db.js';
@@ -94,6 +99,11 @@ type ClaudeCodeAuthProbe = (input: {
env?: NodeJS.ProcessEnv;
}) => Promise<{ ok: true } | { ok: false; message: string }>;
+type CodexAuthProbe = (input: {
+ projectDir: string;
+ model: string;
+}) => Promise<{ ok: true } | { ok: false; message: string; fix: string }>;
+
const PROJECT_READY_COMMANDS = KTX_NEXT_STEP_DIRECT_COMMANDS.map((step) => step.command);
interface LocalStatsIngestPerConnection {
@@ -194,6 +204,7 @@ async function buildLlmStatus(
projectDir: string;
env: NodeJS.ProcessEnv;
claudeCodeAuthProbe?: ClaudeCodeAuthProbe;
+ codexAuthProbe?: CodexAuthProbe;
fast?: boolean;
useSpinner?: boolean;
},
@@ -210,6 +221,18 @@ async function buildLlmStatus(
fix: 'Run: ktx setup (choose an LLM provider)',
};
}
+ // The runtime (resolveModelSlots) hard-requires llm.models.default for every
+ // non-none backend; without it ingest/scan/memory throw. Report that here so
+ // status never marks a project ready that the runtime would refuse to run.
+ if (!model || model.trim().length === 0) {
+ return {
+ backend,
+ model,
+ status: 'fail',
+ detail: `llm.models.default is required for backend "${backend}"`,
+ fix: 'Set llm.models.default in ktx.yaml, then rerun `ktx status` (or rerun `ktx setup`).',
+ };
+ }
if (backend === 'anthropic') {
const ref = config.provider.anthropic?.api_key;
const resolved = resolveRef(ref, env);
@@ -251,7 +274,7 @@ async function buildLlmStatus(
};
}
if (backend === 'claude-code') {
- const modelName = model ?? 'sonnet';
+ const modelName = model;
if (options.fast === true) {
return {
backend,
@@ -280,6 +303,36 @@ async function buildLlmStatus(
fix: 'Authenticate Claude Code locally with the Claude Code CLI, then rerun `ktx status`.',
};
}
+ if (backend === 'codex') {
+ const modelName = model;
+ if (options.fast === true) {
+ return {
+ backend,
+ model: modelName,
+ status: 'skipped',
+ detail: 'auth probe skipped (--fast)',
+ };
+ }
+ const probe = options.codexAuthProbe ?? runCodexAuthProbe;
+ const auth = await withSpinner(options.useSpinner === true, 'Probing Codex authentication', () =>
+ probe({ projectDir: options.projectDir, model: modelName }),
+ );
+ if (auth.ok) {
+ return {
+ backend,
+ model: modelName,
+ status: 'ok',
+ detail: 'local Codex session authenticated',
+ };
+ }
+ return {
+ backend,
+ model: modelName,
+ status: 'fail',
+ detail: auth.message,
+ fix: auth.fix,
+ };
+ }
return { backend, model, status: 'warn', detail: 'unknown LLM backend' };
}
@@ -572,6 +625,13 @@ function buildWarnings(
});
}
+ if (llm.backend === 'codex') {
+ warnings.push({
+ message: CODEX_ISOLATION_WARNING,
+ fix: CODEX_ISOLATION_WARNING_FIX,
+ });
+ }
+
return warnings;
}
@@ -634,6 +694,7 @@ export interface BuildProjectStatusOptions {
env?: NodeJS.ProcessEnv;
queryHistoryReadinessProbe?: HistoricSqlReadinessProbe;
claudeCodeAuthProbe?: ClaudeCodeAuthProbe;
+ codexAuthProbe?: CodexAuthProbe;
configIssues?: KtxConfigIssue[];
fast?: boolean;
useSpinner?: boolean;
@@ -882,6 +943,7 @@ export async function buildProjectStatus(project: KtxLocalProject, options: Buil
projectDir: project.projectDir,
env,
claudeCodeAuthProbe: options.claudeCodeAuthProbe,
+ codexAuthProbe: options.codexAuthProbe,
fast: options.fast,
useSpinner: options.useSpinner,
});
diff --git a/packages/cli/test/context/ingest/local-bundle-runtime.test.ts b/packages/cli/test/context/ingest/local-bundle-runtime.test.ts
index 64fad53a..9d1ec9b4 100644
--- a/packages/cli/test/context/ingest/local-bundle-runtime.test.ts
+++ b/packages/cli/test/context/ingest/local-bundle-runtime.test.ts
@@ -77,9 +77,10 @@ describe('createLocalBundleIngestRuntime', () => {
}),
).toThrow(
[
- 'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
- 'Configure a local Claude Code session or API-backed LLM, then rerun ingest:',
+ 'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
+ 'Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:',
` ktx setup --project-dir ${project.projectDir} --llm-backend claude-code --no-input`,
+ ` ktx setup --project-dir ${project.projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
` ktx setup --project-dir ${project.projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
].join('\n'),
);
diff --git a/packages/cli/test/context/llm/codex-exec-events.test.ts b/packages/cli/test/context/llm/codex-exec-events.test.ts
new file mode 100644
index 00000000..5edcfed8
--- /dev/null
+++ b/packages/cli/test/context/llm/codex-exec-events.test.ts
@@ -0,0 +1,188 @@
+import { describe, expect, it } from 'vitest';
+import {
+ parseCodexExecEventLine,
+ summarizeCodexExecEvents,
+} from '../../../src/context/llm/codex-exec-events.js';
+
+describe('Codex exec event parsing', () => {
+ it('uses the completed turn as one step when no MCP tools run', () => {
+ const summary = summarizeCodexExecEvents(
+ [
+ { type: 'thread.started', thread_id: 'thr_1' },
+ { type: 'turn.started' },
+ { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'hello from codex' } },
+ {
+ type: 'turn.completed',
+ usage: {
+ input_tokens: 12,
+ cached_input_tokens: 4,
+ output_tokens: 5,
+ reasoning_output_tokens: 2,
+ },
+ },
+ ],
+ { startedAt: 100, now: () => 125 },
+ );
+
+ expect(summary).toEqual({
+ finalText: 'hello from codex',
+ stopReason: 'natural',
+ usage: { inputTokens: 12, outputTokens: 5, totalTokens: 17 },
+ stepCount: 1,
+ stepBoundariesMs: [25],
+ toolCallCount: 0,
+ toolFailures: [],
+ });
+ });
+
+ it('uses completed MCP tool calls as loop steps', () => {
+ const offsets = [115, 140, 175];
+ const summary = summarizeCodexExecEvents(
+ [
+ { type: 'turn.started' },
+ {
+ type: 'item.started',
+ item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'in_progress' },
+ },
+ {
+ type: 'item.completed',
+ item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'completed' },
+ },
+ {
+ type: 'item.started',
+ item: { id: 'call_2', type: 'mcp_tool_call', server: 'ktx', tool: 'lookup', arguments: {}, status: 'in_progress' },
+ },
+ {
+ type: 'item.completed',
+ item: {
+ id: 'call_2',
+ type: 'mcp_tool_call',
+ server: 'ktx',
+ tool: 'lookup',
+ arguments: {},
+ status: 'failed',
+ error: { message: 'denied' },
+ },
+ },
+ { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'done' } },
+ { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, cached_input_tokens: 0, reasoning_output_tokens: 0 } },
+ ],
+ { startedAt: 100, now: () => offsets.shift() ?? 175 },
+ );
+
+ expect(summary).toEqual({
+ finalText: 'done',
+ stopReason: 'natural',
+ usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
+ stepCount: 2,
+ stepBoundariesMs: [15, 40],
+ toolCallCount: 2,
+ toolFailures: ['lookup: denied'],
+ });
+ });
+
+ it('does not treat a completed MCP tool call as failed when Codex sends error: null', () => {
+ // Captured verbatim from a real @openai/codex-sdk run: successful tool calls
+ // carry `error: null` and `result` alongside `status: "completed"`.
+ const summary = summarizeCodexExecEvents([
+ { type: 'turn.started' },
+ {
+ type: 'item.started',
+ item: {
+ id: 'item_1',
+ type: 'mcp_tool_call',
+ server: 'ktx',
+ tool: 'echo_value',
+ arguments: { value: 'ktx_codex_tool_ok' },
+ result: null,
+ error: null,
+ status: 'in_progress',
+ },
+ },
+ {
+ type: 'item.completed',
+ item: {
+ id: 'item_1',
+ type: 'mcp_tool_call',
+ server: 'ktx',
+ tool: 'echo_value',
+ arguments: { value: 'ktx_codex_tool_ok' },
+ result: { content: [{ type: 'text', text: 'echo:ktx_codex_tool_ok' }], structured_content: null },
+ error: null,
+ status: 'completed',
+ },
+ },
+ { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+ { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+ ]);
+
+ expect(summary.toolFailures).toEqual([]);
+ expect(summary.toolCallCount).toBe(1);
+ });
+
+ it('counts built-in command executions as loop steps without failing the loop', () => {
+ const offsets = [110, 130];
+ const summary = summarizeCodexExecEvents(
+ [
+ { type: 'turn.started' },
+ { type: 'item.completed', item: { id: 'c1', type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
+ { type: 'item.completed', item: { id: 'c2', type: 'command_execution', command: 'cat missing', status: 'failed', exit_code: 1 } },
+ { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+ { type: 'turn.completed', usage: { input_tokens: 2, output_tokens: 1 } },
+ ],
+ { startedAt: 100, now: () => offsets.shift() ?? 130 },
+ );
+
+ expect(summary.stepCount).toBe(2);
+ expect(summary.stepBoundariesMs).toEqual([10, 30]);
+ // A non-zero command exit is normal agent exploration, not a runtime tool failure.
+ expect(summary.toolFailures).toEqual([]);
+ expect(summary.toolCallCount).toBe(0);
+ });
+
+ it('maps turn failures into error stop reason', () => {
+ const summary = summarizeCodexExecEvents([
+ { type: 'turn.started' },
+ { type: 'turn.failed', error: { message: 'Codex could not connect to required MCP server' } },
+ ]);
+
+ expect(summary.stopReason).toBe('error');
+ expect(summary.error?.message).toContain('Codex could not connect to required MCP server');
+ });
+
+ it('unwraps the Codex API error envelope into its human-readable message', () => {
+ // Codex serializes API errors as a JSON envelope inside the event message.
+ const apiError = JSON.stringify({
+ type: 'error',
+ status: 400,
+ error: {
+ type: 'invalid_request_error',
+ message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+ },
+ });
+ const summary = summarizeCodexExecEvents([
+ { type: 'thread.started', thread_id: 'thr_1' },
+ { type: 'turn.started' },
+ { type: 'error', message: apiError },
+ { type: 'turn.failed', error: { message: apiError } },
+ ]);
+
+ expect(summary.stopReason).toBe('error');
+ expect(summary.error?.message).toBe(
+ "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+ );
+ });
+
+ it('maps max-turns terminal reasons into budget stop reason when Codex emits one', () => {
+ const summary = summarizeCodexExecEvents([
+ { type: 'turn.started' },
+ { type: 'turn.completed', reason: 'max_turns', usage: { input_tokens: 1, output_tokens: 1 } },
+ ]);
+
+ expect(summary.stopReason).toBe('budget');
+ });
+
+ it('throws a clear error for malformed JSONL lines', () => {
+ expect(() => parseCodexExecEventLine('{not-json')).toThrow('Codex JSONL event stream was malformed');
+ });
+});
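Reviewer sketch (not part of the patch): the envelope-unwrapping behavior asserted above could be implemented roughly as follows. The helper name `unwrapCodexErrorMessage` is hypothetical, and the real logic inside `summarizeCodexExecEvents` may differ. The sketch only assumes what the test shows: the event message is sometimes a JSON string whose `error.message` field carries the human-readable text.

```typescript
// Hypothetical helper, written only to illustrate the test above.
// If `raw` is a JSON envelope with a string `error.message`, return that
// inner message; otherwise return `raw` unchanged.
function unwrapCodexErrorMessage(raw: string): string {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === 'object' && parsed !== null) {
      const inner = (parsed as { error?: { message?: unknown } }).error;
      if (inner && typeof inner.message === 'string' && inner.message.length > 0) {
        return inner.message;
      }
    }
  } catch {
    // Not JSON at all: fall through and keep the raw text as the message.
  }
  return raw;
}
```

Plain messages such as `'boom'` pass through untouched, which matches the separate test that expects `turn.failed` messages to surface verbatim.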
diff --git a/packages/cli/test/context/llm/codex-isolation.test.ts b/packages/cli/test/context/llm/codex-isolation.test.ts
new file mode 100644
index 00000000..0ef39ee3
--- /dev/null
+++ b/packages/cli/test/context/llm/codex-isolation.test.ts
@@ -0,0 +1,19 @@
+import { describe, expect, it } from 'vitest';
+import {
+ CODEX_ISOLATION_WARNING,
+ CODEX_ISOLATION_WARNING_FIX,
+ formatCodexIsolationWarning,
+} from '../../../src/context/llm/codex-isolation.js';
+
+describe('Codex isolation warning', () => {
+ it('documents the enforced and unenforced Codex isolation boundaries', () => {
+ expect(CODEX_ISOLATION_WARNING).toContain('runtime MCP server to the current ktx tool set');
+ expect(CODEX_ISOLATION_WARNING).toContain('disables Codex web search');
+ expect(CODEX_ISOLATION_WARNING).toContain('may still load user Codex config');
+ expect(CODEX_ISOLATION_WARNING).toContain('built-in command execution');
+ expect(CODEX_ISOLATION_WARNING_FIX).toContain('claude-code');
+ expect(formatCodexIsolationWarning()).toBe(
+ `${CODEX_ISOLATION_WARNING} ${CODEX_ISOLATION_WARNING_FIX}`,
+ );
+ });
+});
diff --git a/packages/cli/test/context/llm/codex-mcp-runtime-server.test.ts b/packages/cli/test/context/llm/codex-mcp-runtime-server.test.ts
new file mode 100644
index 00000000..c793afb7
--- /dev/null
+++ b/packages/cli/test/context/llm/codex-mcp-runtime-server.test.ts
@@ -0,0 +1,73 @@
+import { describe, expect, it, vi } from 'vitest';
+import { z } from 'zod';
+import {
+ createCodexRuntimeMcpServer,
+ startCodexRuntimeMcpServer,
+} from '../../../src/context/llm/codex-mcp-runtime-server.js';
+
+describe('Codex runtime MCP server', () => {
+ it('registers runtime tools with markdown output', async () => {
+ const registered = new Map<
+ string,
+ {
+ config: { description?: string; inputSchema: unknown };
+ handler: (input: Record<string, unknown>) => Promise<unknown>;
+ }
+ >();
+ const server = createCodexRuntimeMcpServer({
+ server: {
+ registerTool(name, config, handler) {
+ registered.set(name, { config, handler });
+ },
+ },
+ toolSet: {
+ wiki_search: {
+ name: 'wiki_search',
+ description: 'Search the wiki',
+ inputSchema: z.object({ query: z.string() }),
+ execute: vi.fn(async () => ({ markdown: 'result markdown', structured: { matches: 1 } })),
+ },
+ },
+ });
+
+ expect(server).toBeDefined();
+ expect([...registered.keys()]).toEqual(['wiki_search']);
+ expect(registered.get('wiki_search')?.config).toMatchObject({
+ description: 'Search the wiki',
+ });
+ await expect(registered.get('wiki_search')?.handler({ query: 'revenue' })).resolves.toEqual({
+ content: [{ type: 'text', text: 'result markdown' }],
+ structuredContent: { matches: 1 },
+ });
+ });
+
+ it('starts loopback HTTP MCP with a bearer token and reports the runtime URL', async () => {
+ const close = vi.fn(async () => undefined);
+ const runServer = vi.fn(async () => ({
+ server: { address: () => ({ port: 4321 }) },
+ close,
+ }));
+
+ const handle = await startCodexRuntimeMcpServer({
+ projectDir: '/tmp/ktx-project',
+ toolSet: {},
+ runServer: runServer as never,
+ });
+
+ expect(handle.url).toBe('http://127.0.0.1:4321/mcp');
+ expect(handle.bearerTokenEnvVar).toBe('KTX_CODEX_RUNTIME_MCP_TOKEN');
+ expect(handle.bearerToken).toMatch(/^[a-f0-9]{64}$/);
+ expect(runServer).toHaveBeenCalledWith(
+ expect.objectContaining({
+ projectDir: '/tmp/ktx-project',
+ host: '127.0.0.1',
+ port: 0,
+ token: handle.bearerToken,
+ allowedHosts: ['127.0.0.1', 'localhost'],
+ allowedOrigins: [],
+ }),
+ );
+ await handle.close();
+ expect(close).toHaveBeenCalled();
+ });
+});
diff --git a/packages/cli/test/context/llm/codex-models.test.ts b/packages/cli/test/context/llm/codex-models.test.ts
new file mode 100644
index 00000000..83a1e2c8
--- /dev/null
+++ b/packages/cli/test/context/llm/codex-models.test.ts
@@ -0,0 +1,17 @@
+import { describe, expect, it } from 'vitest';
+import { resolveCodexModel } from '../../../src/context/llm/codex-models.js';
+
+describe('resolveCodexModel', () => {
+ it.each([
+ ['codex', 'gpt-5.5'],
+ ['default', 'gpt-5.5'],
+ ['gpt-5.3-codex-spark', 'gpt-5.3-codex-spark'],
+ ['gpt-5.4', 'gpt-5.4'],
+ ])('maps %s to %s', (input, expected) => {
+ expect(resolveCodexModel(input)).toBe(expected);
+ });
+
+ it.each(['', ' ', 'sonnet', 'claude-sonnet-4-6'])('rejects %s', (input) => {
+ expect(() => resolveCodexModel(input)).toThrow('Unsupported Codex model');
+ });
+});
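Reviewer sketch (not part of the patch): one resolver that would satisfy the table above. The name `resolveCodexModelSketch` and the `gpt-` prefix rule are assumptions inferred from the test cases. The actual `resolveCodexModel` may use an explicit list of model IDs instead.

```typescript
// Assumed default, taken from the 'codex' and 'default' rows of the test table.
const DEFAULT_CODEX_MODEL = 'gpt-5.5';
const CODEX_MODEL_ALIASES = new Set(['codex', 'default']);

// Hypothetical resolver: aliases map to the default model, IDs that start
// with 'gpt-' pass through, and everything else is rejected.
function resolveCodexModelSketch(input: string): string {
  const model = input.trim();
  if (CODEX_MODEL_ALIASES.has(model)) {
    return DEFAULT_CODEX_MODEL;
  }
  if (model.startsWith('gpt-')) {
    return model;
  }
  throw new Error(`Unsupported Codex model: ${JSON.stringify(input)}`);
}
```

Under this rule, empty and whitespace-only input is rejected along with Claude aliases such as `sonnet`, as the second `it.each` expects.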
diff --git a/packages/cli/test/context/llm/codex-runtime-config.test.ts b/packages/cli/test/context/llm/codex-runtime-config.test.ts
new file mode 100644
index 00000000..97c80446
--- /dev/null
+++ b/packages/cli/test/context/llm/codex-runtime-config.test.ts
@@ -0,0 +1,43 @@
+import { describe, expect, it } from 'vitest';
+import { buildCodexRuntimeConfig } from '../../../src/context/llm/codex-runtime-config.js';
+
+describe('buildCodexRuntimeConfig', () => {
+ it('builds generic config without SDK thread-option fields', () => {
+ expect(buildCodexRuntimeConfig({ model: 'gpt-5.3-codex' })).toEqual({
+ configOverrides: {
+ history: { persistence: 'none' },
+ },
+ env: {},
+ });
+ });
+
+ it('adds only the temporary ktx MCP server and exact enabled tools', () => {
+ expect(
+ buildCodexRuntimeConfig({
+ model: 'gpt-5.3-codex',
+ mcp: {
+ url: 'http://127.0.0.1:4567/mcp',
+ bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+ bearerToken: 'secret-token',
+ toolNames: ['sl_read_source', 'wiki_search'],
+ },
+ }),
+ ).toEqual({
+ configOverrides: {
+ history: { persistence: 'none' },
+ mcp_servers: {
+ ktx: {
+ url: 'http://127.0.0.1:4567/mcp',
+ bearer_token_env_var: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
+ enabled_tools: ['sl_read_source', 'wiki_search'],
+ default_tools_approval_mode: 'approve',
+ required: true,
+ },
+ },
+ },
+ env: {
+ KTX_CODEX_RUNTIME_MCP_TOKEN: 'secret-token',
+ },
+ });
+ });
+});
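Reviewer sketch (not part of the patch): the shape these two tests pin down can be produced by a small pure function. Everything below is an assumption reconstructed from the expected values. The function name `buildRuntimeConfigSketch` and the interface names are hypothetical. The point it illustrates is that the bearer token travels only through `env`, while the config override names just the environment variable.

```typescript
// Hypothetical input shape mirroring the `mcp` object used in the test.
interface McpRuntimeInput {
  url: string;
  bearerTokenEnvVar: string;
  bearerToken: string;
  toolNames: string[];
}

interface CodexRuntimeConfigSketch {
  configOverrides: Record<string, unknown>;
  env: Record<string, string>;
}

function buildRuntimeConfigSketch(input: { model: string; mcp?: McpRuntimeInput }): CodexRuntimeConfigSketch {
  // History persistence is always disabled, with or without MCP.
  const configOverrides: Record<string, unknown> = { history: { persistence: 'none' } };
  const env: Record<string, string> = {};
  if (input.mcp) {
    configOverrides.mcp_servers = {
      ktx: {
        url: input.mcp.url,
        bearer_token_env_var: input.mcp.bearerTokenEnvVar,
        enabled_tools: [...input.mcp.toolNames],
        default_tools_approval_mode: 'approve',
        required: true,
      },
    };
    // The secret itself goes into the child environment, never into the config.
    env[input.mcp.bearerTokenEnvVar] = input.mcp.bearerToken;
  }
  return { configOverrides, env };
}
```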
diff --git a/packages/cli/test/context/llm/codex-runtime.test.ts b/packages/cli/test/context/llm/codex-runtime.test.ts
new file mode 100644
index 00000000..2d408543
--- /dev/null
+++ b/packages/cli/test/context/llm/codex-runtime.test.ts
@@ -0,0 +1,460 @@
+import { describe, expect, it, vi } from 'vitest';
+import { z } from 'zod';
+import {
+ CodexKtxLlmRuntime,
+ runCodexAuthProbe,
+} from '../../../src/context/llm/codex-runtime.js';
+
+async function* events(items: unknown[]) {
+ for (const item of items) {
+ yield item;
+ }
+}
+
+function runner(items: unknown[]) {
+ return {
+ runStreamed: vi.fn(async () => events(items)),
+ };
+}
+
+/** Yields the given events, then throws — mirroring the SDK throwing on a non-zero codex exec exit. */
+function throwingRunner(items: unknown[], error: Error) {
+ return {
+ runStreamed: vi.fn(async () =>
+ (async function* () {
+ for (const item of items) {
+ yield item;
+ }
+ throw error;
+ })(),
+ ),
+ };
+}
+
+const MODEL_UNSUPPORTED_API_ERROR = JSON.stringify({
+ type: 'error',
+ status: 400,
+ error: {
+ type: 'invalid_request_error',
+ message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+ },
+});
+
+function budgetRunner() {
+ let observedSignal: AbortSignal | undefined;
+ return {
+ observedSignal: () => observedSignal,
+ runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
+ observedSignal = input.signal;
+ return events([
+ { type: 'turn.started' },
+ { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'in_progress' } },
+ { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'completed' } },
+ { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'in_progress' } },
+ { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'completed' } },
+ { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+ ]);
+ }),
+ };
+}
+
+describe('CodexKtxLlmRuntime', () => {
+ it('generates text with the role-selected model and metrics', async () => {
+ const onMetrics = vi.fn();
+ const fakeRunner = runner([
+ { type: 'turn.started' },
+ { type: 'item.completed', item: { type: 'agent_message', text: 'hello' } },
+ { type: 'turn.completed', usage: { input_tokens: 3, output_tokens: 4, total_tokens: 7 } },
+ ]);
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex', triage: 'gpt-5.4' },
+ runner: fakeRunner,
+ });
+
+ await expect(runtime.generateText({ role: 'triage', system: 'system', prompt: 'prompt', onMetrics })).resolves.toBe('hello');
+ expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+ expect.objectContaining({
+ projectDir: '/tmp/project',
+ model: 'gpt-5.4',
+ prompt: 'system\n\nprompt',
+ }),
+ );
+ expect(onMetrics).toHaveBeenCalledWith(expect.objectContaining({ usage: { inputTokens: 3, outputTokens: 4, totalTokens: 7 } }));
+ });
+
+ it('generates and validates structured output', async () => {
+ const fakeRunner = runner([
+ { type: 'turn.started' },
+ { type: 'item.completed', item: { type: 'agent_message', text: '{"answer":"yes"}' } },
+ { type: 'turn.completed' },
+ ]);
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: fakeRunner,
+ });
+
+ await expect(
+ runtime.generateObject({
+ role: 'default',
+ prompt: 'json',
+ schema: z.object({ answer: z.string() }),
+ }),
+ ).resolves.toEqual({ answer: 'yes' });
+ expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+ expect.objectContaining({
+ outputSchema: expect.objectContaining({ type: 'object' }),
+ }),
+ );
+ });
+
+ it('returns a structured-output error when Codex final text is invalid JSON', async () => {
+ const fakeRunner = runner([
+ { type: 'turn.started' },
+ { type: 'item.completed', item: { type: 'agent_message', text: 'not json' } },
+ { type: 'turn.completed' },
+ ]);
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: fakeRunner,
+ });
+
+ await expect(
+ runtime.generateObject({
+ role: 'default',
+ prompt: 'json',
+ schema: z.object({ answer: z.string() }),
+ }),
+ ).rejects.toThrow('Codex structured output failed validation');
+ });
+
+ it('starts and closes a temporary MCP server for tool-backed agent loops', async () => {
+ const close = vi.fn(async () => undefined);
+ const startMcpServer = vi.fn(async () => ({
+ url: 'http://127.0.0.1:4321/mcp',
+ bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN' as const,
+ bearerToken: 'token',
+ close,
+ }));
+ const fakeRunner = runner([
+ { type: 'turn.started' },
+ { type: 'item.started', item: { type: 'mcp_tool_call', name: 'wiki_search' } },
+ { type: 'item.completed', item: { type: 'agent_message', text: 'done' } },
+ { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, total_tokens: 2 } },
+ ]);
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: fakeRunner,
+ startMcpServer,
+ });
+ const onStepFinish = vi.fn();
+
+ const result = await runtime.runAgentLoop({
+ modelRole: 'default',
+ systemPrompt: 'system',
+ userPrompt: 'user',
+ stepBudget: 5,
+ telemetryTags: {},
+ onStepFinish,
+ toolSet: {
+ aliased_wiki_tool: {
+ name: 'wiki_search',
+ description: 'Search wiki',
+ inputSchema: z.object({ query: z.string() }),
+ execute: vi.fn(),
+ },
+ },
+ });
+
+ expect(result.stopReason).toBe('natural');
+ expect(result.metrics).toMatchObject({ stepCount: 1, usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 } });
+ expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 5 });
+ expect(startMcpServer).toHaveBeenCalledWith({ projectDir: '/tmp/project', toolSet: expect.any(Object) });
+ expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
+ expect.objectContaining({
+ env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'token' },
+ configOverrides: expect.objectContaining({
+ mcp_servers: expect.objectContaining({
+ ktx: expect.objectContaining({
+ url: 'http://127.0.0.1:4321/mcp',
+ enabled_tools: ['wiki_search'],
+ required: true,
+ }),
+ }),
+ }),
+ }),
+ );
+ expect(close).toHaveBeenCalled();
+ });
+
+ it('returns error stop reason on turn failure', async () => {
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: runner([{ type: 'turn.failed', error: { message: 'boom' } }]),
+ });
+
+ const result = await runtime.runAgentLoop({
+ modelRole: 'default',
+ systemPrompt: 'system',
+ userPrompt: 'user',
+ stepBudget: 5,
+ telemetryTags: {},
+ toolSet: {},
+ });
+
+ expect(result.stopReason).toBe('error');
+ expect(result.error?.message).toBe('boom');
+ });
+
+ it('surfaces failed MCP tool calls as agent-loop errors', async () => {
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: runner([
+ { type: 'turn.started' },
+ { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'search', status: 'in_progress' } },
+ {
+ type: 'item.completed',
+ item: {
+ type: 'mcp_tool_call',
+ server: 'ktx',
+ tool: 'search',
+ status: 'failed',
+ error: { message: 'denied' },
+ },
+ },
+ { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+ ]),
+ });
+
+ const result = await runtime.runAgentLoop({
+ modelRole: 'default',
+ systemPrompt: 'system',
+ userPrompt: 'user',
+ stepBudget: 5,
+ telemetryTags: {},
+ toolSet: {},
+ });
+
+ expect(result.stopReason).toBe('error');
+ expect(result.error?.message).toBe('Codex runtime tool call failed: search: denied');
+ expect(result.metrics).toMatchObject({
+ stepCount: 1,
+ usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
+ });
+ });
+
+ it('returns budget and aborts the Codex stream when local MCP step budget is reached', async () => {
+ const fakeRunner = budgetRunner();
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: fakeRunner,
+ });
+ const onStepFinish = vi.fn();
+
+ const result = await runtime.runAgentLoop({
+ modelRole: 'default',
+ systemPrompt: 'system',
+ userPrompt: 'user',
+ stepBudget: 1,
+ telemetryTags: {},
+ onStepFinish,
+ toolSet: {
+ first: {
+ name: 'first',
+ description: 'First tool',
+ inputSchema: z.object({}),
+ execute: vi.fn(),
+ },
+ },
+ });
+
+ expect(result.stopReason).toBe('budget');
+ expect(result.error).toBeUndefined();
+ expect(result.metrics).toMatchObject({ stepCount: 1 });
+ expect(onStepFinish).toHaveBeenCalledTimes(1);
+ expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 1 });
+ expect(fakeRunner.observedSignal()?.aborted).toBe(true);
+ });
+
+ it('counts built-in command_execution steps against the budget and aborts the stream', async () => {
+ let observedSignal: AbortSignal | undefined;
+ const fakeRunner = {
+ observedSignal: () => observedSignal,
+ runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
+ observedSignal = input.signal;
+ return events([
+ { type: 'turn.started' },
+ { type: 'item.started', item: { type: 'command_execution', command: 'ls', status: 'in_progress' } },
+ { type: 'item.completed', item: { type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
+ { type: 'item.started', item: { type: 'command_execution', command: 'cat a', status: 'in_progress' } },
+ { type: 'item.completed', item: { type: 'command_execution', command: 'cat a', status: 'completed', exit_code: 0 } },
+ { type: 'item.completed', item: { type: 'command_execution', command: 'cat b', status: 'completed', exit_code: 0 } },
+ { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+ ]);
+ }),
+ };
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: fakeRunner,
+ });
+ const onStepFinish = vi.fn();
+
+ const result = await runtime.runAgentLoop({
+ modelRole: 'default',
+ systemPrompt: 'system',
+ userPrompt: 'user',
+ stepBudget: 2,
+ telemetryTags: {},
+ onStepFinish,
+ toolSet: {},
+ });
+
+ expect(result.stopReason).toBe('budget');
+ expect(result.error).toBeUndefined();
+ expect(result.metrics).toMatchObject({ stepCount: 2 });
+ expect(onStepFinish).toHaveBeenCalledTimes(2);
+ expect(onStepFinish).toHaveBeenLastCalledWith({ stepIndex: 2, stepBudget: 2 });
+ expect(fakeRunner.observedSignal()?.aborted).toBe(true);
+ });
+
+ it('fires onStepFinish live as each step completes, before the stream drains', async () => {
+ const order: string[] = [];
+ async function* liveEvents() {
+ yield { type: 'turn.started' };
+ yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'a', status: 'completed' } };
+ order.push('yielded-after-step-1');
+ yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'b', status: 'completed' } };
+ order.push('yielded-after-step-2');
+ yield { type: 'item.completed', item: { type: 'agent_message', text: 'done' } };
+ yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } };
+ }
+ const fakeRunner = { runStreamed: vi.fn(async () => liveEvents()) };
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: fakeRunner,
+ });
+
+ const result = await runtime.runAgentLoop({
+ modelRole: 'default',
+ systemPrompt: 'system',
+ userPrompt: 'user',
+ stepBudget: 10,
+ telemetryTags: {},
+ onStepFinish: ({ stepIndex }) => {
+ order.push(`step-${stepIndex}`);
+ },
+ toolSet: {},
+ });
+
+ expect(result.stopReason).toBe('natural');
+ expect(result.metrics).toMatchObject({ stepCount: 2 });
+ expect(order).toEqual(['step-1', 'yielded-after-step-1', 'step-2', 'yielded-after-step-2']);
+ });
+
+ it('surfaces the real Codex error event even when the SDK stream throws afterward', async () => {
+ // The SDK yields the error/turn.failed events on stdout, then throws on the
+ // non-zero exit. The masked exit message must not hide the real API error.
+ const fakeRunner = throwingRunner(
+ [
+ { type: 'thread.started', thread_id: 't' },
+ { type: 'turn.started' },
+ { type: 'error', message: MODEL_UNSUPPORTED_API_ERROR },
+ { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
+ ],
+ new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
+ );
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir: '/tmp/project',
+ modelSlots: { default: 'codex' },
+ runner: fakeRunner,
+ });
+
+ await expect(runtime.generateText({ role: 'default', prompt: 'hi' })).rejects.toThrow(
+ 'not supported when using Codex with a ChatGPT account',
+ );
+ });
+
+ it('probes Codex authentication through a minimal non-interactive turn', async () => {
+ const fakeRunner = runner([
+ { type: 'turn.started' },
+ { type: 'item.completed', item: { type: 'agent_message', text: 'ok' } },
+ { type: 'turn.completed' },
+ ]);
+
+ await expect(
+ runCodexAuthProbe({
+ projectDir: '/tmp/project',
+ model: 'codex',
+ runner: fakeRunner,
+ }),
+ ).resolves.toEqual({ ok: true });
+ });
+
+ it('reports an unavailable model without blaming auth when Codex rejects the model', async () => {
+ const fakeRunner = throwingRunner(
+ [
+ { type: 'turn.started' },
+ { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
+ ],
+ new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
+ );
+
+ const result = await runCodexAuthProbe({
+ projectDir: '/tmp/project',
+ model: 'gpt-5.3-codex',
+ runner: fakeRunner,
+ });
+
+ expect(result.ok).toBe(false);
+ if (!result.ok) {
+ expect(result.message).not.toContain('authentication is not usable');
+ expect(result.message).toContain('not available');
+ expect(result.message).toContain('gpt-5.3-codex');
+ expect(result.message).toContain('not supported when using Codex with a ChatGPT account');
+ // A model-access failure must steer the user at the model config, not auth.
+ expect(result.fix).toContain('llm.models.default');
+ expect(result.fix).not.toContain('Authenticate Codex');
+ }
+ });
+
+ it('reports an auth failure when Codex exits without an error event', async () => {
+ const fakeRunner = throwingRunner(
+ [],
+ new Error('Codex Exec exited with code 1: Not logged in. Run `codex login`.'),
+ );
+
+ const result = await runCodexAuthProbe({
+ projectDir: '/tmp/project',
+ model: 'gpt-5.5',
+ runner: fakeRunner,
+ });
+
+ expect(result.ok).toBe(false);
+ if (!result.ok) {
+ expect(result.message).toContain('authentication is not usable');
+ expect(result.message).toContain('Not logged in');
+ expect(result.fix).toContain('Authenticate Codex');
+ }
+ });
+
+ it('rejects an unsupported model id before probing, steering at llm.models.default', async () => {
+ const result = await runCodexAuthProbe({
+ projectDir: '/tmp/project',
+ model: 'not-a-real-model',
+ });
+
+ expect(result.ok).toBe(false);
+ if (!result.ok) {
+ expect(result.message).toContain('Unsupported Codex model');
+ expect(result.fix).toContain('llm.models.default');
+ }
+ });
+});
diff --git a/packages/cli/test/context/llm/codex-sdk-runner.test.ts b/packages/cli/test/context/llm/codex-sdk-runner.test.ts
new file mode 100644
index 00000000..fdafc666
--- /dev/null
+++ b/packages/cli/test/context/llm/codex-sdk-runner.test.ts
@@ -0,0 +1,97 @@
+import { describe, expect, it, vi } from 'vitest';
+
+const sdkMock = vi.hoisted(() => {
+ const events = (async function* () {
+ yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } };
+ })();
+ const runStreamed = vi.fn(async () => ({ events }));
+ const startThread = vi.fn(() => ({ runStreamed }));
+ const Codex = vi.fn(function Codex(this: { startThread: typeof startThread }, options?: unknown) {
+ Object.assign(this, { options, startThread });
+ });
+ return { Codex, startThread, runStreamed };
+});
+
+vi.mock('@openai/codex-sdk', () => ({ Codex: sdkMock.Codex }));
+
+import { CodexSdkCliRunner } from '../../../src/context/llm/codex-sdk-runner.js';
+
+async function collectAsync<T>(items: AsyncIterable<T>): Promise<T[]> {
+ const collected: T[] = [];
+ for await (const item of items) {
+ collected.push(item);
+ }
+ return collected;
+}
+
+describe('CodexSdkCliRunner', () => {
+ it('passes isolated env through the SDK and runtime controls through thread options', async () => {
+ const runner = new CodexSdkCliRunner({
+ envBase: {
+ HOME: '/home/ktx-user',
+ PATH: '/usr/local/bin:/usr/bin',
+ CODEX_HOME: '/home/ktx-user/.codex',
+ HTTPS_PROXY: 'http://proxy.example',
+ KTX_UNRELATED_SECRET: 'must-not-copy', // pragma: allowlist secret
+ },
+ });
+ const previousToken = process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
+ process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = 'outer-token';
+ const outputSchema = {
+ type: 'object',
+ properties: { answer: { type: 'string' } },
+ required: ['answer'],
+ additionalProperties: false,
+ };
+ const controller = new AbortController();
+
+ try {
+ const events = await runner.runStreamed({
+ projectDir: '/tmp/ktx-project',
+ model: 'gpt-5.3-codex',
+ prompt: 'Return JSON.',
+ configOverrides: {
+ history: { persistence: 'none' },
+ },
+ env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token' },
+ outputSchema,
+ signal: controller.signal,
+ });
+
+ expect(sdkMock.Codex).toHaveBeenCalledWith({
+ config: {
+ history: { persistence: 'none' },
+ },
+ env: {
+ HOME: '/home/ktx-user',
+ PATH: '/usr/local/bin:/usr/bin',
+ CODEX_HOME: '/home/ktx-user/.codex',
+ HTTPS_PROXY: 'http://proxy.example',
+ KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token',
+ },
+ });
+ expect(process.env.KTX_CODEX_RUNTIME_MCP_TOKEN).toBe('outer-token');
+ expect(sdkMock.startThread).toHaveBeenCalledWith({
+ workingDirectory: '/tmp/ktx-project',
+ skipGitRepoCheck: true,
+ model: 'gpt-5.3-codex',
+ sandboxMode: 'read-only',
+ webSearchMode: 'disabled',
+ approvalPolicy: 'never',
+ });
+ expect(sdkMock.runStreamed).toHaveBeenCalledWith('Return JSON.', {
+ outputSchema,
+ signal: controller.signal,
+ });
+ await expect(collectAsync(events)).resolves.toEqual([
+ { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } },
+ ]);
+ } finally {
+ if (previousToken === undefined) {
+ delete process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
+ } else {
+ process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = previousToken;
+ }
+ }
+ });
+});
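Reviewer sketch (not part of the patch): the environment isolation asserted above amounts to an allowlist copy followed by a merge of per-run variables. The allowlist below is an assumption inferred from the four variables the test expects to survive. The real `CodexSdkCliRunner` may pass through more keys. `buildIsolatedEnv` is a hypothetical name.

```typescript
// Assumed allowlist, inferred from the test expectations only.
const PASSTHROUGH_ENV_KEYS = ['HOME', 'PATH', 'CODEX_HOME', 'HTTPS_PROXY'];

// Copy only allowlisted keys from the base environment, then layer the
// per-run variables on top. The base object is never mutated.
function buildIsolatedEnv(
  envBase: Record<string, string | undefined>,
  runEnv: Record<string, string>,
): Record<string, string> {
  const isolated: Record<string, string> = {};
  for (const key of PASSTHROUGH_ENV_KEYS) {
    const value = envBase[key];
    if (value !== undefined) {
      isolated[key] = value;
    }
  }
  return { ...isolated, ...runEnv };
}
```

Because the function builds a fresh object and returns it, the caller's `process.env` stays untouched. The test checks the same property when it asserts that the outer `KTX_CODEX_RUNTIME_MCP_TOKEN` is still `'outer-token'` after the run.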
diff --git a/packages/cli/test/context/llm/runtime-local-config.test.ts b/packages/cli/test/context/llm/runtime-local-config.test.ts
index 9e432cec..14adca7c 100644
--- a/packages/cli/test/context/llm/runtime-local-config.test.ts
+++ b/packages/cli/test/context/llm/runtime-local-config.test.ts
@@ -22,4 +22,25 @@ describe('local KTX LLM runtime config', () => {
}),
).toBeNull();
});
+
+ it('creates a Codex runtime for codex backend without creating an AI SDK provider', () => {
+ const runtime = createLocalKtxLlmRuntimeFromConfig(
+ {
+ provider: { backend: 'codex' },
+ models: { default: 'codex', triage: 'gpt-5.4' },
+ },
+ { env: {}, projectDir: '/tmp/project', createCodexRuntime: vi.fn((deps) => ({ deps }) as never) },
+ );
+
+ expect(runtime).toMatchObject({ deps: expect.objectContaining({ projectDir: '/tmp/project' }) });
+ });
+
+ it('returns null from the AI SDK provider factory for codex backend', () => {
+ expect(
+ createLocalKtxLlmProviderFromConfig({
+ provider: { backend: 'codex' },
+ models: { default: 'codex' },
+ }),
+ ).toBeNull();
+ });
});
diff --git a/packages/cli/test/context/project/config.test.ts b/packages/cli/test/context/project/config.test.ts
index 670e1696..6027d454 100644
--- a/packages/cli/test/context/project/config.test.ts
+++ b/packages/cli/test/context/project/config.test.ts
@@ -231,6 +231,31 @@ llm:
});
});
+ it('parses Codex as a first-class LLM backend', () => {
+ const config = parseKtxProjectConfig(`
+llm:
+ provider:
+ backend: codex
+ models:
+ default: gpt-5.3-codex
+ triage: gpt-5.3-codex
+ candidateExtraction: gpt-5.3-codex
+ curator: gpt-5.3-codex
+ reconcile: gpt-5.3-codex
+ repair: gpt-5.3-codex
+`);
+
+ expect(config.llm.provider.backend).toBe('codex');
+ expect(config.llm.models).toEqual({
+ default: 'gpt-5.3-codex',
+ triage: 'gpt-5.3-codex',
+ candidateExtraction: 'gpt-5.3-codex',
+ curator: 'gpt-5.3-codex',
+ reconcile: 'gpt-5.3-codex',
+ repair: 'gpt-5.3-codex',
+ });
+ });
+
it('parses gateway LLM, OpenAI scan embeddings, and sentence-transformers ingest embeddings', () => {
const config = parseKtxProjectConfig(`
llm:
@@ -530,7 +555,7 @@ describe('generateKtxProjectConfigJsonSchema', () => {
const llm = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).llm;
const provider = llm?.properties?.provider as { properties?: Record<string, unknown> };
const backend = provider?.properties?.backend as { enum?: readonly string[] };
- expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code']);
+ expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code', 'codex']);
const storage = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).storage;
const state = storage?.properties?.state as { enum?: readonly string[] };
diff --git a/packages/cli/test/doctor.test.ts b/packages/cli/test/doctor.test.ts
index e3871f28..242331e8 100644
--- a/packages/cli/test/doctor.test.ts
+++ b/packages/cli/test/doctor.test.ts
@@ -422,6 +422,8 @@ describe('runKtxDoctor', () => {
'llm:',
' provider:',
' backend: anthropic',
+ ' models:',
+ ' default: claude-sonnet-4-5',
'',
].join('\n'),
'utf-8',
@@ -543,6 +545,8 @@ describe('runKtxDoctor', () => {
'llm:',
' provider:',
' backend: anthropic',
+ ' models:',
+ ' default: claude-sonnet-4-5',
'ingest:',
' adapters:',
' - live-database',
@@ -652,6 +656,8 @@ describe('runKtxDoctor', () => {
'llm:',
' provider:',
' backend: anthropic',
+ ' models:',
+ ' default: claude-sonnet-4-5',
'',
].join('\n'),
'utf-8',
@@ -698,6 +704,8 @@ describe('runKtxDoctor', () => {
'llm:',
' provider:',
' backend: anthropic',
+ ' models:',
+ ' default: claude-sonnet-4-5',
'ingest:',
' adapters:',
' - live-database',
diff --git a/packages/cli/test/ingest.test.ts b/packages/cli/test/ingest.test.ts
index f5cd1ac5..4fc47d0c 100644
--- a/packages/cli/test/ingest.test.ts
+++ b/packages/cli/test/ingest.test.ts
@@ -337,10 +337,13 @@ describe('runKtxIngest', () => {
expect(runIo.stdout()).toBe('');
expect(runIo.stderr()).toContain(
- 'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
+ 'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
);
- expect(runIo.stderr()).toContain('Configure a local Claude Code session or API-backed LLM, then rerun ingest:');
+ expect(runIo.stderr()).toContain('Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:');
expect(runIo.stderr()).toContain(`ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`);
+ expect(runIo.stderr()).toContain(
+ `ktx setup --project-dir ${projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
+ );
expect(runIo.stderr()).toContain(
`ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
);
diff --git a/packages/cli/test/llm/model-provider.test.ts b/packages/cli/test/llm/model-provider.test.ts
index 0e3ef045..17d47c6a 100644
--- a/packages/cli/test/llm/model-provider.test.ts
+++ b/packages/cli/test/llm/model-provider.test.ts
@@ -312,4 +312,13 @@ describe('createKtxLlmProvider', () => {
}),
).toThrow('claude-code is not an AI SDK LanguageModel backend');
});
+
+ it('rejects codex as an AI SDK LanguageModel backend', () => {
+ expect(() =>
+ createKtxLlmProvider({
+ backend: 'codex',
+ modelSlots: { default: 'gpt-5.3-codex' },
+ }),
+ ).toThrow('codex is not an AI SDK LanguageModel backend');
+ });
});
diff --git a/packages/cli/test/setup-models.test.ts b/packages/cli/test/setup-models.test.ts
index f054beff..dedf03bd 100644
--- a/packages/cli/test/setup-models.test.ts
+++ b/packages/cli/test/setup-models.test.ts
@@ -66,6 +66,7 @@ function makePromptAdapter(options: {
nextProviderChoice === 'anthropic' ||
nextProviderChoice === 'vertex' ||
nextProviderChoice === 'claude-code' ||
+ nextProviderChoice === 'codex' ||
nextProviderChoice === 'back'
) {
return selectValues.shift() ?? nextProviderChoice;
@@ -183,6 +184,7 @@ describe('setup Anthropic model step', () => {
message: expect.stringContaining('Which LLM provider should KTX use?'),
options: [
{ value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
+ { value: 'codex', label: 'Codex subscription' },
{ value: 'anthropic', label: 'Anthropic API key' },
{ value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
{ value: 'back', label: 'Back' },
@@ -215,6 +217,85 @@ describe('setup Anthropic model step', () => {
expect(authProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'sonnet' }));
});
+ it('configures Codex backend and validates local auth', async () => {
+ const io = makeIo();
+ const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+ const result = await runKtxSetupAnthropicModelStep(
+ {
+ projectDir: tempDir,
+ inputMode: 'disabled',
+ llmBackend: 'codex',
+ llmModel: 'gpt-5.5',
+ skipLlm: false,
+ },
+ io.io,
+ { codexAuthProbe },
+ );
+
+ expect(result.status).toBe('ready');
+ const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
+ expect(config.llm).toMatchObject({
+ provider: { backend: 'codex' },
+ models: { default: 'gpt-5.5' },
+ });
+ expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
+ // The warning carries the clack gutter so it renders inside the setup frame.
+ expect(io.stderr()).toContain('│ Codex backend isolation is limited');
+ expect(io.stderr()).toContain('may still load user Codex config');
+ });
+
+ it('defaults the Codex model to gpt-5.5 when none is provided non-interactively', async () => {
+ const io = makeIo();
+ const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+ const result = await runKtxSetupAnthropicModelStep(
+ {
+ projectDir: tempDir,
+ inputMode: 'disabled',
+ llmBackend: 'codex',
+ skipLlm: false,
+ },
+ io.io,
+ { codexAuthProbe },
+ );
+
+ expect(result.status).toBe('ready');
+ const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
+ expect(config.llm).toMatchObject({
+ provider: { backend: 'codex' },
+ models: { default: 'gpt-5.5' },
+ });
+ expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
+ });
+
+ it('offers the curated Codex models during interactive setup', async () => {
+ const io = makeIo();
+ const prompts = makePromptAdapter({ selectValues: ['codex', 'gpt-5.5'] });
+ const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
+
+ const result = await runKtxSetupAnthropicModelStep(
+ { projectDir: tempDir, inputMode: 'auto', skipLlm: false },
+ io.io,
+ { prompts, codexAuthProbe },
+ );
+
+ expect(result.status).toBe('ready');
+ expect(prompts.select).toHaveBeenCalledWith(
+ expect.objectContaining({
+ message: expect.stringContaining('Which Codex model should KTX use?'),
+ options: [
+ { value: 'gpt-5.5', label: 'GPT-5.5', hint: 'recommended' },
+ { value: 'gpt-5.4', label: 'GPT-5.4' },
+ { value: 'gpt-5.4-mini', label: 'GPT-5.4 mini' },
+ { value: 'manual', label: 'Enter a Codex model ID manually' },
+ { value: 'back', label: 'Back' },
+ ],
+ }),
+ );
+ expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ model: 'gpt-5.5' }));
+ });
+
it('prompts for the Claude Code model during interactive setup', async () => {
const io = makeIo();
const prompts = makePromptAdapter({ selectValues: ['claude-code', 'opus'] });
diff --git a/packages/cli/test/status-project.test.ts b/packages/cli/test/status-project.test.ts
index 38d5aa6f..cd63cf19 100644
--- a/packages/cli/test/status-project.test.ts
+++ b/packages/cli/test/status-project.test.ts
@@ -44,6 +44,17 @@ function withClaudeCodeLlm(config: KtxProjectConfig): KtxProjectConfig {
};
}
+function withCodexLlm(config: KtxProjectConfig): KtxProjectConfig {
+ return {
+ ...config,
+ llm: {
+ ...config.llm,
+ provider: { backend: 'codex' },
+ models: { ...config.llm.models, default: 'gpt-5.5' },
+ },
+ };
+}
+
function baseProjectConfig(): KtxProjectConfig {
return withClaudeCodeLlm(buildDefaultKtxProjectConfig());
}
@@ -391,6 +402,126 @@ describe('buildProjectStatus --fast', () => {
});
});
+describe('buildProjectStatus codex', () => {
+ it('reports authenticated local Codex session', async () => {
+ const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+ const status = await buildProjectStatus(project, {
+ codexAuthProbe: async () => ({ ok: true as const }),
+ });
+
+ expect(status.llm).toMatchObject({
+ backend: 'codex',
+ model: 'gpt-5.5',
+ status: 'ok',
+ detail: 'local Codex session authenticated',
+ });
+ expect(status.warnings).toEqual(
+ expect.arrayContaining([
+ expect.objectContaining({
+ message: expect.stringContaining('Codex backend isolation is limited'),
+ fix: expect.stringContaining('claude-code'),
+ }),
+ ]),
+ );
+ const rendered = renderProjectStatus(status, { verbose: false, useColor: false });
+ expect(rendered).toContain('Codex backend isolation is limited');
+ });
+
+ it('skips Codex auth probe with --fast', async () => {
+ let probeCalls = 0;
+ const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+ const status = await buildProjectStatus(project, {
+ fast: true,
+ codexAuthProbe: async () => {
+ probeCalls += 1;
+ return { ok: true };
+ },
+ });
+
+ expect(probeCalls).toBe(0);
+ expect(status.llm.status).toBe('skipped');
+ expect(status.llm.detail).toMatch(/--fast/);
+ });
+
+ it('surfaces the probe fix for a model-access failure instead of an auth fix', async () => {
+ const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
+ const status = await buildProjectStatus(project, {
+ codexAuthProbe: async () => ({
+ ok: false,
+ message: 'Codex is authenticated, but the configured model "gpt-5.5" is not available...',
+ fix: 'Run `codex` to see the models your account supports, then set llm.models.default in ktx.yaml (or rerun `ktx setup`).',
+ }),
+ });
+
+ expect(status.llm.status).toBe('fail');
+ expect(status.llm.fix).toContain('llm.models.default');
+ expect(status.llm.fix).not.toContain('Authenticate Codex');
+ });
+});
+
+describe('buildProjectStatus llm models.default requirement', () => {
+ function withBackendNoModel(
+ backend: KtxProjectConfig['llm']['provider']['backend'],
+ ): KtxProjectConfig {
+ const config = buildDefaultKtxProjectConfig();
+ return {
+ ...config,
+ llm: { ...config.llm, provider: { backend }, models: {} },
+ };
+ }
+
+ it('fails codex without llm.models.default and never probes', async () => {
+ let probeCalls = 0;
+ const project = projectWithConfig(withBackendNoModel('codex'));
+ const status = await buildProjectStatus(project, {
+ codexAuthProbe: async () => {
+ probeCalls += 1;
+ return { ok: true };
+ },
+ });
+
+ expect(probeCalls).toBe(0);
+ expect(status.llm.status).toBe('fail');
+ expect(status.llm.detail).toContain('llm.models.default');
+ expect(status.verdict).toBe('blocked');
+ });
+
+ it('fails claude-code without llm.models.default and never probes', async () => {
+ let probeCalls = 0;
+ const project = projectWithConfig(withBackendNoModel('claude-code'));
+ const status = await buildProjectStatus(project, {
+ claudeCodeAuthProbe: async () => {
+ probeCalls += 1;
+ return { ok: true };
+ },
+ });
+
+ expect(probeCalls).toBe(0);
+ expect(status.llm.status).toBe('fail');
+ expect(status.llm.detail).toContain('llm.models.default');
+ expect(status.verdict).toBe('blocked');
+ });
+
+ it('fails anthropic without llm.models.default even when the key is set', async () => {
+ const config = withBackendNoModel('anthropic');
+ const project = projectWithConfig({
+ ...config,
+ llm: {
+ ...config.llm,
+ provider: { backend: 'anthropic', anthropic: { api_key: 'env:ANTHROPIC_API_KEY' } }, // pragma: allowlist secret
+ models: {},
+ },
+ });
+ const status = await buildProjectStatus(project, {
+ env: { ANTHROPIC_API_KEY: 'sk-test' }, // pragma: allowlist secret
+ });
+
+ expect(status.llm.status).toBe('fail');
+ expect(status.llm.detail).toContain('llm.models.default');
+ expect(status.verdict).toBe('blocked');
+ });
+});
+
describe('buildLocalStatsStatus', () => {
let tempDir: string;
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index 15bc75f3..a3eaad5f 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -158,6 +158,9 @@ importers:
'@notionhq/client':
specifier: ^5.22.0
version: 5.22.0
+ '@openai/codex-sdk':
+ specifier: ^0.133.0
+ version: 0.133.0
ai:
specifier: ^6.0.188
version: 6.0.188(zod@4.4.3)
@@ -1288,6 +1291,51 @@ packages:
'@octokit/types@16.0.0':
resolution: {integrity: sha512-sKq+9r1Mm4efXW1FCk7hFSeJo4QKreL/tTbR0rz/qx/r1Oa2VV83LTA/H/MuCOX7uCIJmQVRKBcbmWoySjAnSg==}
+ '@openai/codex-sdk@0.133.0':
+ resolution: {integrity: sha512-PB82D/1Q0C7nzaV5O+1O4y5LcVwiUvxyHvCUTfz8Cwztv6bOWQ40gFHE5ZFX1EFPJx1cMV0GPVODWuXIKAuayQ==}
+ engines: {node: '>=18'}
+
+ '@openai/codex@0.133.0':
+ resolution: {integrity: sha512-Gh42kLLBo/6gpnHmDzUWDVvyS57ekCB1+1Dz0RG2oIl3Lhk1uwrjSj/PwaJWWh4Rw/rUp1RqkwrMugFfFEOlqQ==}
+ engines: {node: '>=16'}
+ hasBin: true
+
+ '@openai/codex@0.133.0-darwin-arm64':
+ resolution: {integrity: sha512-W7f8+DckLujnqGlptKCzgJU+ooeHKMuk6KYgMFP6A9asn7YUsGUgJqjiBaX8oNcXO6w/pTbKGRARx1kCNS8lIg==}
+ engines: {node: '>=16'}
+ cpu: [arm64]
+ os: [darwin]
+
+ '@openai/codex@0.133.0-darwin-x64':
+ resolution: {integrity: sha512-Ek8ikvLOiXZ8emcIJVBXxK6fm8ratBy0kaEt3JNisTNszxGshUHf/R4xxDxIyKNcUkYYXjW7A/rMwW3iu3OFlg==}
+ engines: {node: '>=16'}
+ cpu: [x64]
+ os: [darwin]
+
+ '@openai/codex@0.133.0-linux-arm64':
+ resolution: {integrity: sha512-uKXYYSJ3mY16sp4hcG/4BMNRjva/ZS4oARiI1+7k8+NiuoAhdCGWNe5u4KJ3sMuL3tp/IXcmc6B56EFX1+WDBQ==}
+ engines: {node: '>=16'}
+ cpu: [arm64]
+ os: [linux]
+
+ '@openai/codex@0.133.0-linux-x64':
+ resolution: {integrity: sha512-9YfyqrfUj/UZ2+aXE4zBz47t6RXbVni95ZorGsNh857vxYK/asVpUtR2cymo9lB3JaI4mQaKFfV/t7IRItqkuA==}
+ engines: {node: '>=16'}
+ cpu: [x64]
+ os: [linux]
+
+ '@openai/codex@0.133.0-win32-arm64':
+ resolution: {integrity: sha512-mRzND0PSGHRoLk0X41GTSoc3tFjZSF4HgDlfjU5fiQcWVi0/kLb7Ku6/tPFT/X2hOLa3YdJkbIcHC0Hc9ni80g==}
+ engines: {node: '>=16'}
+ cpu: [arm64]
+ os: [win32]
+
+ '@openai/codex@0.133.0-win32-x64':
+ resolution: {integrity: sha512-u3ji78DIPZCGJeELuovsAnaZH+vK9gsA4F6M1y+Uy2s80Sz7/i1S0KL81qGReYji3urSjgBpkQuNP47GXOqxrQ==}
+ engines: {node: '>=16'}
+ cpu: [x64]
+ os: [win32]
+
'@opentelemetry/api@1.9.1':
resolution: {integrity: sha512-gLyJlPHPZYdAk1JENA9LeHejZe1Ti77/pTeFm/nMXmQH/HFZlcS/O2XJB+L8fkbrNSqhdtlvjBVjxwUYanNH5Q==}
engines: {node: '>=8.0.0'}
@@ -7145,6 +7193,37 @@ snapshots:
dependencies:
'@octokit/openapi-types': 27.0.0
+ '@openai/codex-sdk@0.133.0':
+ dependencies:
+ '@openai/codex': 0.133.0
+
+ '@openai/codex@0.133.0':
+ optionalDependencies:
+ '@openai/codex-darwin-arm64': '@openai/codex@0.133.0-darwin-arm64'
+ '@openai/codex-darwin-x64': '@openai/codex@0.133.0-darwin-x64'
+ '@openai/codex-linux-arm64': '@openai/codex@0.133.0-linux-arm64'
+ '@openai/codex-linux-x64': '@openai/codex@0.133.0-linux-x64'
+ '@openai/codex-win32-arm64': '@openai/codex@0.133.0-win32-arm64'
+ '@openai/codex-win32-x64': '@openai/codex@0.133.0-win32-x64'
+
+ '@openai/codex@0.133.0-darwin-arm64':
+ optional: true
+
+ '@openai/codex@0.133.0-darwin-x64':
+ optional: true
+
+ '@openai/codex@0.133.0-linux-arm64':
+ optional: true
+
+ '@openai/codex@0.133.0-linux-x64':
+ optional: true
+
+ '@openai/codex@0.133.0-win32-arm64':
+ optional: true
+
+ '@openai/codex@0.133.0-win32-x64':
+ optional: true
+
'@opentelemetry/api@1.9.1': {}
'@orama/orama@3.1.18': {}
diff --git a/scripts/codex-backend-live-smoke.mjs b/scripts/codex-backend-live-smoke.mjs
new file mode 100644
index 00000000..7793fefc
--- /dev/null
+++ b/scripts/codex-backend-live-smoke.mjs
@@ -0,0 +1,160 @@
+import { execFile } from 'node:child_process';
+import { mkdtemp, rm } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import { dirname, join, resolve } from 'node:path';
+import { fileURLToPath, pathToFileURL } from 'node:url';
+import { promisify } from 'node:util';
+
+const execFileAsync = promisify(execFile);
+const SCRIPT_DIR = dirname(fileURLToPath(import.meta.url));
+const ROOT_DIR = resolve(SCRIPT_DIR, '..');
+const OPT_IN_MESSAGE =
+ 'Set KTX_RUN_CODEX_BACKEND_SMOKE=1 or pass --force to run the Codex backend live smoke.';
+
+export function codexBackendSmokeOptIn(env = process.env, args = process.argv.slice(2)) {
+ if (env.KTX_RUN_CODEX_BACKEND_SMOKE === '1' || args.includes('--force')) {
+ return { run: true };
+ }
+ return { run: false, message: OPT_IN_MESSAGE };
+}
+
+async function run(command, args, options = {}) {
+ process.stdout.write(`$ ${command} ${args.join(' ')}\n`);
+ try {
+ const result = await execFileAsync(command, args, {
+ cwd: options.cwd ?? ROOT_DIR,
+ env: { ...process.env, ...(options.env ?? {}) },
+ encoding: 'utf8',
+ maxBuffer: 1024 * 1024 * 20,
+ timeout: options.timeoutMs ?? 300_000,
+ });
+ if (result.stdout) {
+ process.stdout.write(result.stdout);
+ }
+ if (result.stderr) {
+ process.stderr.write(result.stderr);
+ }
+ return { code: 0, stdout: result.stdout, stderr: result.stderr };
+ } catch (error) {
+ const stdout = typeof error.stdout === 'string' ? error.stdout : '';
+ const stderr = typeof error.stderr === 'string' ? error.stderr : error.message;
+ if (stdout) {
+ process.stdout.write(stdout);
+ }
+ if (stderr) {
+ process.stderr.write(stderr);
+ }
+ return {
+ code: typeof error.code === 'number' ? error.code : 1,
+ stdout,
+ stderr,
+ };
+ }
+}
+
+function requireSuccess(label, result) {
+ if (result.code !== 0) {
+ throw new Error(`${label} failed with code ${result.code}\nstdout:\n${result.stdout}\nstderr:\n${result.stderr}`);
+ }
+}
+
+async function runSetupSmoke(projectDir) {
+ const result = await run(
+ 'node',
+ [
+ join(ROOT_DIR, 'packages/cli/dist/bin.js'),
+ 'setup',
+ '--project-dir',
+ projectDir,
+ '--llm-backend',
+ 'codex',
+ '--llm-model',
+ 'gpt-5.3-codex',
+ '--no-input',
+ '--yes',
+ '--skip-databases',
+ '--skip-sources',
+ '--skip-agents',
+ ],
+ { timeoutMs: 600_000 },
+ );
+ requireSuccess('ktx setup codex backend', result);
+ if (!result.stdout.includes('LLM ready: yes (codex, gpt-5.3-codex)')) {
+ throw new Error(`setup did not report Codex LLM readiness\nstdout:\n${result.stdout}`);
+ }
+}
+
+async function runRuntimeSmoke(projectDir) {
+ const runtimeUrl = pathToFileURL(join(ROOT_DIR, 'packages/cli/dist/context/llm/codex-runtime.js')).href;
+ const zodUrl = pathToFileURL(join(ROOT_DIR, 'packages/cli/node_modules/zod/index.js')).href;
+ const { CodexKtxLlmRuntime } = await import(runtimeUrl);
+ const { z } = await import(zodUrl);
+ const runtime = new CodexKtxLlmRuntime({
+ projectDir,
+ modelSlots: { default: 'gpt-5.3-codex' },
+ });
+
+ const text = await runtime.generateText({
+ role: 'default',
+ prompt: 'Reply with exactly: ktx_codex_text_ok',
+ });
+ if (text.trim() !== 'ktx_codex_text_ok') {
+ throw new Error(`Codex text smoke returned unexpected text: ${text}`);
+ }
+
+ let toolCalls = 0;
+ const loop = await runtime.runAgentLoop({
+ modelRole: 'default',
+ systemPrompt: 'You must use available tools when the user asks for a tool result.',
+ userPrompt:
+ 'Call the echo_value tool with {"value":"ktx_codex_tool_ok"}, then finish after the tool returns.',
+ toolSet: {
+ echo_value: {
+ name: 'echo_value',
+ description: 'Return the provided value as markdown.',
+ inputSchema: z.object({ value: z.string() }),
+ execute: async (input) => {
+ toolCalls += 1;
+ return { markdown: `echo:${input.value}` };
+ },
+ },
+ },
+ stepBudget: 4,
+ telemetryTags: {},
+ });
+
+ if (loop.stopReason !== 'natural') {
+ throw new Error(`Codex tool smoke stopped with ${loop.stopReason}: ${loop.error?.message ?? 'no error'}`);
+ }
+ if (toolCalls !== 1) {
+ throw new Error(`Expected Codex to call echo_value exactly once, got ${toolCalls}`);
+ }
+}
+
+export async function runCodexBackendLiveSmoke() {
+ const projectDir = await mkdtemp(join(tmpdir(), 'ktx-codex-backend-smoke-'));
+ try {
+ requireSuccess(
+ 'ktx build',
+ await run('pnpm', ['--filter', '@kaelio/ktx', 'run', 'build'], { timeoutMs: 600_000 }),
+ );
+ await runSetupSmoke(projectDir);
+ await runRuntimeSmoke(projectDir);
+ process.stdout.write(`Codex backend live smoke passed in ${projectDir}\n`);
+ } finally {
+ await rm(projectDir, { recursive: true, force: true });
+ }
+}
+
+async function main() {
+ const optIn = codexBackendSmokeOptIn();
+ if (!optIn.run) {
+ process.stdout.write(`${optIn.message}\n`);
+ return;
+ }
+ await runCodexBackendLiveSmoke();
+}
+
+if (import.meta.url === pathToFileURL(process.argv[1] ?? '').href) {
+ await main();
+}
diff --git a/scripts/codex-backend-live-smoke.test.mjs b/scripts/codex-backend-live-smoke.test.mjs
new file mode 100644
index 00000000..8d8c051f
--- /dev/null
+++ b/scripts/codex-backend-live-smoke.test.mjs
@@ -0,0 +1,18 @@
+import assert from 'node:assert/strict';
+import test from 'node:test';
+import { codexBackendSmokeOptIn } from './codex-backend-live-smoke.mjs';
+
+test('codex backend smoke stays disabled by default', () => {
+ assert.deepEqual(codexBackendSmokeOptIn({}, []), {
+ run: false,
+ message: 'Set KTX_RUN_CODEX_BACKEND_SMOKE=1 or pass --force to run the Codex backend live smoke.',
+ });
+});
+
+test('codex backend smoke runs with env opt-in', () => {
+ assert.deepEqual(codexBackendSmokeOptIn({ KTX_RUN_CODEX_BACKEND_SMOKE: '1' }, []), { run: true });
+});
+
+test('codex backend smoke runs with force flag', () => {
+ assert.deepEqual(codexBackendSmokeOptIn({}, ['--force']), { run: true });
+});