mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
feat: add codex llm backend for ktx runtime work (#253)
* feat: add codex sdk runner foundation
* feat: parse codex runtime events
* feat: expose codex runtime mcp tools
* feat: add codex llm runtime
* feat: wire codex llm backend
* test: avoid Array.fromAsync in codex runner test
* docs: document codex llm backend
* fix: tighten codex runtime config ownership
* fix: use codex sdk env and thread options
* fix: parse codex sdk event shapes
* test: add codex backend live smoke
* docs: clarify codex backend isolation
* fix: drive codex loop metrics from mcp events
* fix: enforce codex local step budget
* docs: disclose codex isolation limits
* fix: count all codex agent steps and stream step callbacks live

  The agent-loop step budget only counted completed mcp_tool_call items, so built-in command_execution steps (which the public Codex SDK/CLI surface can still expose) never decremented the budget, letting ingest/reconciliation run past stepBudget until Codex stopped on its own. onStepFinish was also replayed only after the whole stream drained, so live work_unit_step / reconciliation progress appeared stuck until the Codex process exited.

  collectEvents is now the single live step accumulator: it counts every completed agent-action item via a shared isCompletedAgentStep predicate (command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish as each step completes, and enforces the budget on that broader count. A no-tool turn still counts as one step. toolFailures stays MCP-specific, since a non-zero command exit is normal agent exploration, not a loop failure.
* test: align ingest llm-guard assertions with codex backend

  The skip-llm ingest guard message now lists codex as a valid backend and mentions a Claude Code/Codex session plus a codex setup hint, but this slow suite test still asserted the pre-codex wording. Update it to match the production message (already covered by the local-bundle-runtime unit test) and add the codex setup-line assertion.
* fix: treat codex error:null tool calls as success

  The Codex SDK serializes error: null on successful mcp_tool_call items, so the failure check (item.error !== undefined) flagged every successful tool call as failed with the empty-payload default "Codex turn failed". This killed every ingest work unit under the codex backend before it could produce a patch. Key on status === 'failed' (authoritative, always set) and only treat a populated error object as a failure. Add a regression test built from a verbatim real-SDK event capture.
* fix: default codex backend to gpt-5.5 and report real probe errors

  The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API rejects under ChatGPT-account (subscription) auth, so codex status/setup failed with a misleading "authentication is not usable" message even though auth was fine.

  - Default codex model is now gpt-5.5 (works on both subscription and API-key auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
  - runCodexAuthProbe now distinguishes "model not available" from an auth failure and surfaces the real API error: collectEvents retains stream events when the SDK throws on a non-zero exit, and the API error JSON envelope is unwrapped to its human-readable message.
  - The Codex isolation warning now renders inside the clack setup frame.
  - Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.
* fix: require llm.models.default in status and match codex probe remediation

  Status reported a project ready when a non-none LLM backend was configured without llm.models.default, but the runtime (resolveModelSlots) hard-requires it, so ingest/scan/memory threw after `ktx status` said the project was usable. buildLlmStatus now fails for any non-none backend missing models.default and no longer invents a fallback model for claude-code/codex.

  Codex probe failures now carry a category-matched fix: a model-access failure steers the user at llm.models.default instead of the auth/install remediation. runCodexAuthProbe returns the fix and status consumes it; the message stays self-sufficient so setup output is unchanged.

  Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx states --llm-model only accepts codex/default or gpt-*/codex-* ids.

  Repaired four doctor fixtures that configured a backend without models.default (the now-correctly-blocked config) and added coverage for the new behavior.
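The `error: null` fix described in the commit message can be illustrated with a small standalone predicate. This is a minimal sketch, not the code from this commit: the `McpToolCallItem` shape and the `'completed'` status value are assumptions made for illustration, and only the `status === 'failed'` and populated-`error` rules come from the message above.

```typescript
// Hypothetical item shape for illustration; the real SDK item may carry more fields.
interface McpToolCallItem {
  type: string;
  status?: string;
  error?: { message?: string } | null;
}

// A tool call counts as failed when its status says so, or when it carries a
// populated error object. An explicit `error: null` is treated as success.
// Non-MCP items (for example a shell command with a non-zero exit) never count.
function isFailedMcpToolCall(item: McpToolCallItem): boolean {
  if (item.type !== 'mcp_tool_call') {
    return false;
  }
  return item.status === 'failed' || (item.error !== undefined && item.error !== null);
}

// 'completed' is an assumed status value for a successful call.
const succeeded: McpToolCallItem = { type: 'mcp_tool_call', status: 'completed', error: null };
const failed: McpToolCallItem = { type: 'mcp_tool_call', status: 'failed', error: { message: 'boom' } };
const shellExit: McpToolCallItem = { type: 'command_execution', status: 'failed' };

console.log(isFailedMcpToolCall(succeeded)); // false
console.log(isFailedMcpToolCall(failed)); // true
console.log(isFailedMcpToolCall(shellExit)); // false
```

Checking `item.error !== undefined` alone would report `succeeded` as a failure, which is the regression the message describes.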
This commit is contained in:
parent
74c6076b72
commit
494618ab14
41 changed files with 2544 additions and 30 deletions
10
README.md
@@ -30,8 +30,9 @@ warehouse accurately - from approved metric definitions, joinable columns, and
 business knowledge it builds and maintains for you.
 
 > [!NOTE]
-> Run **ktx** with your own LLM API keys or a **Claude Pro/Max** subscription.
-> No extra usage billing from **ktx**.
+> Run **ktx** with your own LLM API keys or a local agent sign-in — a
+> **Claude Pro/Max** subscription through Claude Code, or your local Codex
+> authentication. No extra usage billing from **ktx**.
 
 <p align="center">
   <a href="https://youtu.be/5V4TuzYVlrA">
@@ -175,8 +176,9 @@ then the current directory. Pass `--project-dir <path>` when scripting.
   No. **ktx** runs locally. The only data leaving your machine is what you
   send to the LLM provider you configured.
 - **Which LLM backends are supported?**
-  Anthropic API, Google Vertex AI, AI Gateway, and the local Claude Code
-  session through the Claude Agent SDK. See
+  Anthropic API, Google Vertex AI, AI Gateway, the local Claude Code session
+  through the Claude Agent SDK, and your local Codex authentication through the
+  Codex SDK. See
   [LLM configuration](https://docs.kaelio.com/ktx/docs/guides/llm-configuration).
 - **How is ktx different from a dbt or MetricFlow semantic layer?**
   **ktx** *ingests* those layers and combines them with raw-table

@@ -51,8 +51,9 @@ prompts.
 
 | Flag | Description |
 |------|-------------|
-| `--llm-backend <backend>` | LLM backend: `anthropic`, `vertex`, or `claude-code` |
+| `--llm-backend <backend>` | LLM backend: `anthropic`, `vertex`, `claude-code`, or `codex` |
 | `--llm-backend claude-code` | Use the local Claude Code session for **ktx** LLM calls |
+| `--llm-backend codex` | Use local Codex authentication for **ktx** LLM calls |
 | `--llm-model <model>` | LLM model ID or backend model alias to validate and save |
 | `--anthropic-api-key-env <name>` | Environment variable containing the Anthropic API key |
 | `--anthropic-api-key-file <path>` | File containing the Anthropic API key |
@@ -62,9 +63,14 @@ prompts.
 
 Choose only one Anthropic credential source. Anthropic credential flags are only
 valid with the Anthropic backend; Vertex flags are only valid with the Vertex
-backend. The `claude-code` backend uses local Claude Code authentication instead
+backend. The `claude-code` and `codex` backends use local authentication instead
 of Anthropic API key or Vertex flags. For Claude Code, `--llm-model` accepts
-`sonnet`, `opus`, `haiku`, or a full Claude model ID.
+`sonnet`, `opus`, `haiku`, or a full Claude model ID. For Codex, `--llm-model`
+accepts `codex`, `default`, or a `gpt-*` / `codex-*` model ID such as
+`gpt-5.5`; any other value is rejected before the auth probe. Run `codex` to
+see the models available to your login, and pick a `gpt-*` / `codex-*` id from
+that list. Note that `*-codex` API-billing model IDs (for example
+`gpt-5.3-codex`) are not available to ChatGPT-subscription logins.
 
 ### Embeddings
 
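The `--llm-model` rule stated in the hunk above (accept `codex`, `default`, or a `gpt-*` / `codex-*` id, and reject anything else before the auth probe) could be checked along these lines. This is a hedged sketch only: `isAcceptedCodexModel` is a hypothetical name, and the regular expression is an assumed reading of the documented shape rather than the validator shipped in this commit.

```typescript
// Hypothetical helper: returns true when a value has the shape the docs say
// the codex backend accepts. It checks shape only, not account availability.
function isAcceptedCodexModel(value: string): boolean {
  const model = value.trim();
  if (model === 'codex' || model === 'default') {
    return true;
  }
  // Assumed interpretation of `gpt-*` / `codex-*`: the prefix, a dash, then
  // at least one non-whitespace character.
  return /^(gpt|codex)-\S+$/.test(model);
}

console.log(isAcceptedCodexModel('gpt-5.5')); // true
console.log(isAcceptedCodexModel('default')); // true
console.log(isAcceptedCodexModel('sonnet')); // false
console.log(isAcceptedCodexModel('gpt-5.3-codex')); // true
```

A shape check like this passes `gpt-5.3-codex` even though, per the docs, that id is not available to ChatGPT-subscription logins. Availability can only be established by the auth probe that runs afterwards.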
@@ -191,6 +197,17 @@ ktx setup \
   --llm-backend claude-code \
   --llm-model opus
 
+# Configure **ktx** to use local Codex authentication for LLM work
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+When you choose `--llm-backend codex`, setup prints a warning if the public
+Codex SDK and CLI surface cannot prove full Claude-Code-style isolation. The
+backend restricts **ktx** runtime MCP tools to each run, but Codex may still
+load user Codex config and built-in command execution or read-only file
+capabilities.
+
+```bash
 # Script a Postgres connection that reads its URL from the environment
 ktx setup \
   --project-dir ./analytics \

@@ -21,7 +21,7 @@ ktx status [options]
 | `--json` | Print JSON output | `false` |
 | `-v`, `--verbose` | Show every check, including passing ones | `false` |
 | `--validate` | Only validate the `ktx.yaml` schema; skip readiness checks | `false` |
-| `--fast` | Skip checks that require external communication (query-history readiness probes and Claude Code auth probe) | `false` |
+| `--fast` | Skip checks that require external communication (query-history readiness probes, Claude Code auth probe, and Codex auth probe) | `false` |
 | `--no-input` | Disable interactive terminal input | - |
 
 ## Examples
@@ -39,7 +39,7 @@ ktx status --verbose
 # Validate ktx.yaml without running readiness checks
 ktx status --validate
 
-# Skip slow probes (query-history readiness, Claude Code auth)
+# Skip slow probes (query-history readiness, Claude Code auth, Codex auth)
 ktx status --fast
 
 # Check a project from another directory
@@ -57,6 +57,16 @@ flow, then rerun `ktx status`. Use `--fast` to skip this probe (useful in CI
 or offline contexts); skipped checks render as `-` and carry
 `"status": "skipped"` in JSON output.
 
+For `llm.provider.backend: codex`, `ktx status` runs a minimal non-interactive
+Codex request. If the probe fails, authenticate Codex locally with the Codex CLI
+and verify the Codex CLI installation.
+
+When `llm.provider.backend: codex` is configured, `ktx status` also prints a
+warning when the installed public Codex SDK and CLI surface cannot prove full
+Claude-Code-style isolation. The warning does not block authenticated Codex
+usage, but it marks the project status as partial so you can make an explicit
+runtime-isolation decision.
+
 A `Local data` section summarises what the project has accumulated locally:
 ingest run counts, last completed timestamp per connection, knowledge page
 counts by scope, semantic-layer source and dictionary value counts, and the

@@ -376,13 +376,23 @@ llm:
 
 | Field | Type | Default | Purpose |
 |-------|------|---------|---------|
-| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. |
+| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` \| `codex` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. `codex` uses local Codex authentication and needs no API key. |
 | `provider.anthropic.api_key` | `string` | - | Anthropic API key. Required when `backend: anthropic`. Accepts `env:` or `file:` references. |
 | `provider.anthropic.base_url` | `string` | - | Override the Anthropic API base URL (proxy, self-hosted gateway). |
 | `provider.gateway.api_key` / `base_url` | `string` | - | Credentials for an AI Gateway provider. Required when `backend: gateway`. |
 | `provider.vertex.project` | `string` | - | Google Cloud project ID hosting the Vertex AI endpoint. |
 | `provider.vertex.location` | `string` | - | Vertex AI region (for example `us-east5`). Required when the `vertex` block is present. |
 
+Use `codex` when local Codex authentication should power **ktx** LLM work:
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
 ### Model roles
 
 `models` overrides the per-role model. Keys are fixed; values are

@@ -39,8 +39,20 @@ ktx ingest --all
 Enriched ingest needs a configured model and embeddings. Run `ktx setup` first;
 connections without that configuration fail before any work starts.
 
-With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools for the
-current run.
+Local-auth backends keep provider credentials out of `ktx.yaml`:
+
+```bash
+ktx setup --llm-backend claude-code --no-input
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools
+for the current run. With `codex`, **ktx** restricts the temporary runtime MCP
+server to the current run's tool set, disables Codex web search, requests a
+read-only sandbox, and sets `approval_policy=never`. The public Codex SDK and
+CLI surface may still load user Codex config and built-in command execution or
+read-only file capabilities, so use `claude-code` for stricter runtime tool
+isolation.
 
 ## Query history
 

@@ -16,6 +16,7 @@ Set `llm.provider.backend` to one of these values:
 - `gateway`: Use AI Gateway-compatible Anthropic model ids.
 - `claude-code`: Use your local Claude Code session through the Claude Agent
   SDK. **ktx** strips provider-routing environment variables from child processes.
+- `codex`: Use your local Codex authentication through the Codex SDK.
 
 ## Claude Code
 
@@ -47,6 +48,42 @@ model IDs are also accepted.
 metadata may still list host slash commands, skills, and subagents; **ktx** does not
 grant execution access to them.
 
+## Codex backend
+
+Use `codex` when you want **ktx** to run LLM-backed workflows through your
+local Codex authentication instead of a direct provider API key.
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
+Configure it non-interactively:
+
+```bash
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+This is separate from Codex agent-client setup. `ktx setup --agents --target
+codex` installs instructions and MCP access for an end-user Codex session.
+`ktx setup --llm-backend codex` makes **ktx** itself execute ingest, scan
+enrichment, memory, and other LLM-backed work through Codex.
+
+During runtime loops, **ktx** starts a temporary loopback MCP server for the
+current run, exposes only the tools passed to that run, asks Codex to use a
+read-only sandbox, sets `approval_policy=never`, auto-approves only those
+run-scoped MCP tools, and disables Codex web search.
+
+Codex backend isolation is currently limited by the public Codex SDK and CLI
+surface. Codex may still load user Codex config and built-in command execution
+or read-only file capabilities. Use `llm.provider.backend: claude-code` when
+you need stricter Claude-Code-style runtime tool isolation, or remove host
+Codex MCP and tool config before running untrusted prompts through the `codex`
+backend.
+
 ## Prompt caching
 
 `llm.promptCaching` has partial parity on `claude-code`. Status and doctor warn

@@ -37,6 +37,9 @@
     "@semantic-release/release-notes-generator",
     "conventional-changelog-conventionalcommits"
   ],
+  "ignore": [
+    ".context/**"
+  ],
   "ignoreBinaries": [
     "uv",
     "lsof"

@@ -32,6 +32,7 @@
     "setup:dev": "node scripts/setup-dev.mjs",
     "release:published-smoke": "node scripts/published-package-smoke.mjs --require-config",
     "release:local-embeddings-smoke": "node scripts/local-embeddings-runtime-smoke.mjs --require-opt-in",
+    "release:codex-backend-smoke": "node scripts/codex-backend-live-smoke.mjs",
     "release:readiness": "node scripts/release-readiness.mjs",
     "release:update-version": "node scripts/update-public-release-version.mjs",
     "relationships:acquire-public-fixtures": "node scripts/acquire-public-benchmark-fixtures.mjs",

@@ -56,6 +56,7 @@
     "@looker/sdk-rtl": "^21.6.5",
     "@modelcontextprotocol/sdk": "^1.29.0",
     "@notionhq/client": "^5.22.0",
+    "@openai/codex-sdk": "^0.133.0",
     "ai": "^6.0.188",
     "better-sqlite3": "^12.10.0",
     "commander": "14.0.3",

@@ -29,7 +29,7 @@ function embeddingBackend(value: string): 'openai' | 'sentence-transformers' {
 }
 
 function llmBackend(value: string): KtxSetupLlmBackend {
-  if (value === 'anthropic' || value === 'vertex' || value === 'claude-code') {
+  if (value === 'anthropic' || value === 'vertex' || value === 'claude-code' || value === 'codex') {
     return value;
   }
   throw new InvalidArgumentError(`invalid choice '${value}'`);

@@ -611,9 +611,10 @@ function nextLocalJobId(): string {
 
 function localIngestLlmProviderGuardMessage(projectDir: string): string {
   return [
-    'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
-    'Configure a local Claude Code session or API-backed LLM, then rerun ingest:',
+    'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
+    'Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:',
     ` ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`,
+    ` ktx setup --project-dir ${projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
     ` ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
   ].join('\n');
 }

194
packages/cli/src/context/llm/codex-exec-events.ts
Normal file
194
packages/cli/src/context/llm/codex-exec-events.ts
Normal file
|
|
@ -0,0 +1,194 @@
|
||||||
|
import type { LlmTokenUsage, RunLoopStopReason } from './runtime-port.js';
|
||||||
|
|
||||||
|
export interface CodexExecEventSummary {
|
||||||
|
finalText: string;
|
||||||
|
stopReason: RunLoopStopReason;
|
||||||
|
usage: LlmTokenUsage;
|
||||||
|
stepCount: number;
|
||||||
|
stepBoundariesMs: number[];
|
||||||
|
toolCallCount: number;
|
||||||
|
toolFailures: string[];
|
||||||
|
error?: Error;
|
||||||
|
}
|
||||||
|
|
||||||
|
interface CodexEventParseOptions {
|
||||||
|
startedAt?: number;
|
||||||
|
now?: () => number;
|
||||||
|
}
|
||||||
|
|
||||||
|
function record(value: unknown): Record<string, unknown> | undefined {
|
||||||
|
return value && typeof value === 'object' ? (value as Record<string, unknown>) : undefined;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Codex thread items that represent a discrete agent action consuming one loop
|
||||||
|
* step. The step budget caps the total number of these regardless of which
|
||||||
|
* capability the agent reaches for, so built-in `command_execution` (and any
|
||||||
|
* file/web action the public Codex surface still exposes) count alongside our
|
||||||
|
* own `mcp_tool_call` items rather than only the MCP ones.
|
||||||
|
*/
|
||||||
|
const AGENT_STEP_ITEM_TYPES = new Set(['command_execution', 'mcp_tool_call', 'file_change', 'web_search']);
|
||||||
|
|
||||||
|
export function isCompletedAgentStep(event: unknown): boolean {
|
||||||
|
const eventRecord = record(event);
|
||||||
|
if (eventRecord?.type !== 'item.completed') {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
const itemType = record(eventRecord.item)?.type;
|
||||||
|
return typeof itemType === 'string' && AGENT_STEP_ITEM_TYPES.has(itemType);
|
||||||
|
}
|
||||||
|
|
||||||
|
function text(value: unknown): string | undefined {
|
||||||
|
return typeof value === 'string' && value.trim().length > 0 ? value : undefined;
|
||||||
|
}
|
||||||
|
|
||||||
|
function numberValue(value: unknown): number | undefined {
|
||||||
|
return typeof value === 'number' && Number.isFinite(value) ? value : undefined;
|
||||||
|
}
|
||||||
|
|
||||||
|
function usageFrom(value: unknown): LlmTokenUsage {
|
||||||
|
const usage = record(value);
|
||||||
|
if (!usage) {
|
||||||
|
return {};
|
||||||
|
}
|
||||||
|
const inputTokens = numberValue(usage.input_tokens ?? usage.inputTokens);
|
||||||
|
const outputTokens = numberValue(usage.output_tokens ?? usage.outputTokens);
|
||||||
|
const explicitTotalTokens = numberValue(usage.total_tokens ?? usage.totalTokens);
|
||||||
|
const totalTokens =
|
||||||
|
explicitTotalTokens ??
|
||||||
|
(inputTokens !== undefined && outputTokens !== undefined ? inputTokens + outputTokens : undefined);
|
||||||
|
return {
|
||||||
|
...(inputTokens !== undefined ? { inputTokens } : {}),
|
||||||
|
...(outputTokens !== undefined ? { outputTokens } : {}),
|
||||||
|
...(totalTokens !== undefined ? { totalTokens } : {}),
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
function stopReasonFrom(value: unknown): RunLoopStopReason {
|
||||||
|
const reason = text(value)?.toLowerCase();
|
||||||
|
if (reason && /(budget|max_turn|max-turn|limit)/.test(reason)) {
|
||||||
|
return 'budget';
|
||||||
|
}
|
||||||
|
return 'natural';
|
||||||
|
}
|
||||||
|
|
||||||
|
function errorMessageFrom(value: unknown): string {
|
||||||
|
if (value instanceof Error) {
|
||||||
|
return value.message;
|
||||||
|
}
|
||||||
|
const asRecord = record(value);
|
||||||
|
const message = text(asRecord?.message);
|
||||||
|
return message ?? text(value) ?? 'Codex turn failed';
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Codex serializes API failures as a JSON envelope inside the event message
|
||||||
|
* (e.g. `{"type":"error","status":400,"error":{"message":"…"}}`). Surface the
|
||||||
|
* human-readable inner message so callers don't leak raw JSON; pass plain
|
||||||
|
* strings through unchanged.
|
||||||
|
*/
|
||||||
|
function unwrapCodexApiErrorMessage(raw: string): string {
|
||||||
|
const trimmed = raw.trim();
|
||||||
|
if (!trimmed.startsWith('{')) {
|
||||||
|
return raw;
|
||||||
|
}
|
||||||
|
try {
|
||||||
|
const parsed = record(JSON.parse(trimmed));
|
||||||
|
return text(record(parsed?.error)?.message) ?? text(parsed?.message) ?? raw;
|
||||||
|
} catch {
|
||||||
|
return raw;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/** @internal */
|
||||||
|
export function parseCodexExecEventLine(line: string): unknown {
|
||||||
|
try {
|
||||||
|
return JSON.parse(line) as unknown;
|
||||||
|
} catch (error) {
|
||||||
|
throw new Error(`Codex JSONL event stream was malformed: ${error instanceof Error ? error.message : String(error)}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
export function summarizeCodexExecEvents(
|
||||||
|
events: Iterable<unknown>,
|
||||||
|
options: CodexEventParseOptions = {},
|
||||||
|
): CodexExecEventSummary {
|
||||||
|
const startedAt = options.startedAt ?? Date.now();
|
||||||
|
const now = options.now ?? Date.now;
|
||||||
|
let finalText = '';
|
||||||
|
let stopReason: RunLoopStopReason = 'natural';
|
||||||
|
let usage: LlmTokenUsage = {};
|
||||||
|
let turnCount = 0;
|
||||||
|
let completedStepCount = 0;
|
||||||
|
const stepBoundariesMs: number[] = [];
|
||||||
|
let toolCallCount = 0;
|
||||||
|
const toolFailures: string[] = [];
|
||||||
|
let error: Error | undefined;
|
||||||
|
|
||||||
|
for (const event of events) {
|
||||||
|
const eventRecord = record(event);
|
||||||
|
const eventType = text(eventRecord?.type);
|
||||||
|
if (!eventRecord || !eventType) {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (eventType === 'turn.started') {
|
||||||
|
      turnCount += 1;
      continue;
    }

    const item = record(eventRecord.item);
    const itemType = text(item?.type);

    if (eventType === 'item.started' && itemType === 'mcp_tool_call') {
      toolCallCount += 1;
      continue;
    }

    if (isCompletedAgentStep(event)) {
      completedStepCount += 1;
      stepBoundariesMs.push(now() - startedAt);
      // Only MCP tool calls fail the loop: a non-zero `command_execution` exit
      // is normal agent exploration, not a runtime error. `status` is the
      // authoritative signal (the SDK always sets it); the SDK also serializes
      // `error: null` on successful calls, so an explicit-null `error` must NOT
      // be read as a failure; only a populated error object counts.
      if (itemType === 'mcp_tool_call' && (item?.status === 'failed' || (item?.error !== undefined && item?.error !== null))) {
        const name = text(item?.name) ?? text(item?.tool) ?? text(item?.tool_name) ?? 'unknown';
        toolFailures.push(`${name}: ${errorMessageFrom(item?.error)}`);
      }
      continue;
    }

    if (eventType === 'item.completed' && itemType === 'agent_message') {
      finalText = text(item?.text) ?? finalText;
      continue;
    }

    if (eventType === 'turn.completed') {
      usage = usageFrom(eventRecord.usage);
      if (completedStepCount === 0) {
        stepBoundariesMs.push(now() - startedAt);
      }
      stopReason = stopReasonFrom(eventRecord.reason ?? eventRecord.stop_reason ?? eventRecord.terminal_reason);
      continue;
    }

    if (eventType === 'turn.failed' || eventType === 'error') {
      stopReason = 'error';
      error = new Error(unwrapCodexApiErrorMessage(errorMessageFrom(eventRecord.error ?? eventRecord.message)));
      continue;
    }
  }

  return {
    finalText,
    stopReason,
    usage,
    stepCount: completedStepCount > 0 ? completedStepCount : turnCount,
    stepBoundariesMs,
    toolCallCount,
    toolFailures,
    ...(error ? { error } : {}),
  };
}
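The failure rule described in the comment above can be isolated as a small predicate. This is a minimal sketch, not the code from the diff: `CompletedItem` and `isMcpToolFailure` are hypothetical names, and the item shape is reduced to the three fields the rule reads.

```typescript
// Hypothetical, reduced item shape; real SDK items carry more fields.
interface CompletedItem {
  type: string;
  status?: string;
  error?: unknown;
}

// True only for MCP tool calls that actually failed.
function isMcpToolFailure(item: CompletedItem): boolean {
  if (item.type !== 'mcp_tool_call') {
    // A non-zero command_execution exit is agent exploration, not a loop failure.
    return false;
  }
  // `error: null` is serialized on successful calls, so only a populated error counts.
  const hasError = item.error !== undefined && item.error !== null;
  return item.status === 'failed' || hasError;
}
```

Checking both `status` and a populated `error` keeps the predicate tolerant of events where only one of the two signals is present.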
9
packages/cli/src/context/llm/codex-isolation.ts
Normal file
@@ -0,0 +1,9 @@
export const CODEX_ISOLATION_WARNING =
  'Codex backend isolation is limited by the public Codex SDK/CLI surface: ktx restricts the runtime MCP server to the current ktx tool set, disables Codex web search, asks for a read-only sandbox, and sets approval_policy=never, but Codex may still load user Codex config and built-in command execution or read-only file capabilities.';

export const CODEX_ISOLATION_WARNING_FIX =
  'Use llm.provider.backend: claude-code when you need stricter Claude-Code-style runtime tool isolation, or remove host Codex MCP/tool config before running untrusted prompts through the codex backend.';

export function formatCodexIsolationWarning(): string {
  return `${CODEX_ISOLATION_WARNING} ${CODEX_ISOLATION_WARNING_FIX}`;
}
87
packages/cli/src/context/llm/codex-mcp-runtime-server.ts
Normal file
@@ -0,0 +1,87 @@
import { randomBytes } from 'node:crypto';
import type { Server } from 'node:http';
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import type { KtxMcpServerLike } from '../mcp/types.js';
import { runKtxMcpHttpServer, type KtxMcpHttpServerHandle } from '../../mcp-http-server.js';
import type { KtxRuntimeToolSet } from './runtime-port.js';
import { normalizeKtxRuntimeToolOutput } from './runtime-tools.js';

/** @internal */
export interface CreateCodexRuntimeMcpServerInput {
  server?: KtxMcpServerLike;
  toolSet: KtxRuntimeToolSet;
}

export interface CodexRuntimeMcpServerHandle {
  url: string;
  bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN';
  bearerToken: string;
  close(): Promise<void>;
}

type RunServer = typeof runKtxMcpHttpServer;

export interface StartCodexRuntimeMcpServerInput {
  projectDir: string;
  toolSet: KtxRuntimeToolSet;
  runServer?: RunServer;
}

/** @internal */
export function createCodexRuntimeMcpServer(input: CreateCodexRuntimeMcpServerInput): KtxMcpServerLike {
  const server =
    input.server ??
    (new McpServer({
      name: 'ktx-runtime',
      version: '0.0.0',
    }) as KtxMcpServerLike);

  for (const descriptor of Object.values(input.toolSet)) {
    server.registerTool(
      descriptor.name,
      {
        description: descriptor.description,
        inputSchema: descriptor.inputSchema.shape,
      },
      async (toolInput) => {
        const normalized = normalizeKtxRuntimeToolOutput(await descriptor.execute(toolInput));
        return {
          content: [{ type: 'text', text: normalized.markdown }],
          ...(normalized.structured !== undefined && normalized.structured !== null && typeof normalized.structured === 'object'
            ? { structuredContent: normalized.structured as object }
            : {}),
        };
      },
    );
  }

  return server;
}

function serverPort(server: Server, fallback: number): number {
  const address = server.address();
  return typeof address === 'object' && address ? address.port : fallback;
}

export async function startCodexRuntimeMcpServer(
  input: StartCodexRuntimeMcpServerInput,
): Promise<CodexRuntimeMcpServerHandle> {
  const bearerToken = randomBytes(32).toString('hex');
  const runServer = input.runServer ?? runKtxMcpHttpServer;
  const handle = (await runServer({
    projectDir: input.projectDir,
    host: '127.0.0.1',
    port: 0,
    token: bearerToken,
    allowedHosts: ['127.0.0.1', 'localhost'],
    allowedOrigins: [],
    createMcpServer: () => createCodexRuntimeMcpServer({ toolSet: input.toolSet }) as McpServer,
  })) as KtxMcpHttpServerHandle;
  const port = serverPort(handle.server, 0);
  return {
    url: `http://127.0.0.1:${port}/mcp`,
    bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
    bearerToken,
    close: () => handle.close(),
  };
}
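The tool callback above always returns the markdown as text content and attaches `structuredContent` only when the normalized output is a non-null object. A minimal standalone sketch of that shaping step follows; `NormalizedToolOutput`, `McpToolResult`, and `toMcpToolResult` are hypothetical names introduced here, and the shapes are reduced to what the callback reads and returns.

```typescript
// Hypothetical, reduced shapes for illustration only.
interface NormalizedToolOutput {
  markdown: string;
  structured?: unknown;
}

interface McpToolResult {
  content: { type: 'text'; text: string }[];
  structuredContent?: object;
}

function toMcpToolResult(normalized: NormalizedToolOutput): McpToolResult {
  const result: McpToolResult = {
    content: [{ type: 'text', text: normalized.markdown }],
  };
  const structured = normalized.structured;
  // Strings, numbers, undefined, and null are dropped; only objects pass through.
  if (typeof structured === 'object' && structured !== null) {
    result.structuredContent = structured;
  }
  return result;
}
```

Note that arrays also satisfy the object check, which matches the condition in the diff.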
20
packages/cli/src/context/llm/codex-models.ts
Normal file
@@ -0,0 +1,20 @@
export const DEFAULT_CODEX_MODEL = 'gpt-5.5';

const CODEX_MODEL_ALIASES: Record<string, string> = {
  codex: DEFAULT_CODEX_MODEL,
  default: DEFAULT_CODEX_MODEL,
};

const EXPLICIT_CODEX_MODEL_ID = /^(?:gpt|codex)-[a-z0-9][a-z0-9._-]*$/i;

export function resolveCodexModel(model: string): string {
  const normalized = model.trim();
  const alias = CODEX_MODEL_ALIASES[normalized];
  if (alias) {
    return alias;
  }
  if (EXPLICIT_CODEX_MODEL_ID.test(normalized)) {
    return normalized;
  }
  throw new Error(`Unsupported Codex model "${model}". Use codex, default, or a gpt-* / codex-* model id.`);
}
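The resolution order above (trim, alias lookup, explicit-id pattern, otherwise throw) can be exercised in isolation. The sketch below restates the alias table and pattern from the diff so it runs standalone; `resolveModelSketch` is a hypothetical name, the own-property check is an assumption added here (the diff indexes the record directly), and `gpt-4o` is only an illustrative id that happens to match the pattern.

```typescript
const DEFAULT_MODEL = 'gpt-5.5';
const ALIASES: Record<string, string> = { codex: DEFAULT_MODEL, default: DEFAULT_MODEL };
const EXPLICIT_ID = /^(?:gpt|codex)-[a-z0-9][a-z0-9._-]*$/i;

function resolveModelSketch(model: string): string {
  const normalized = model.trim();
  // Assumption: look up own keys only, so inherited names such as
  // "constructor" are never treated as aliases.
  const alias = Object.prototype.hasOwnProperty.call(ALIASES, normalized) ? ALIASES[normalized] : undefined;
  if (alias !== undefined) {
    return alias;
  }
  if (EXPLICIT_ID.test(normalized)) {
    return normalized; // returned as typed; the pattern is case-insensitive
  }
  throw new Error(`Unsupported Codex model "${model}".`);
}

function resolvesOrThrows(model: string): string {
  try {
    return resolveModelSketch(model);
  } catch (error) {
    return 'threw';
  }
}
```

Under these rules `resolvesOrThrows(' codex ')` yields the default model, while an id from another vendor family falls through to the error branch.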
38
packages/cli/src/context/llm/codex-runtime-config.ts
Normal file
@@ -0,0 +1,38 @@
interface CodexRuntimeMcpConfig {
  url: string;
  bearerTokenEnvVar: string;
  bearerToken: string;
  toolNames: string[];
}

export interface BuildCodexRuntimeConfigInput {
  model: string;
  mcp?: CodexRuntimeMcpConfig;
}

export interface CodexRuntimeConfig {
  configOverrides: Record<string, unknown>;
  env: Record<string, string>;
}

export function buildCodexRuntimeConfig(input: BuildCodexRuntimeConfigInput): CodexRuntimeConfig {
  const configOverrides: Record<string, unknown> = {
    history: { persistence: 'none' },
  };
  const env: Record<string, string> = {};

  if (input.mcp) {
    configOverrides.mcp_servers = {
      ktx: {
        url: input.mcp.url,
        bearer_token_env_var: input.mcp.bearerTokenEnvVar,
        enabled_tools: input.mcp.toolNames,
        default_tools_approval_mode: 'approve',
        required: true,
      },
    };
    env[input.mcp.bearerTokenEnvVar] = input.mcp.bearerToken;
  }

  return { configOverrides, env };
}
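One property of the builder above is worth making explicit: the config overrides only name the environment variable, while the bearer token itself travels through `env`. The sketch below restates the builder in reduced form to show that split; `buildOverridesSketch` is a hypothetical name, and the port, token, and tool name are made-up sample values.

```typescript
interface McpInputSketch {
  url: string;
  bearerTokenEnvVar: string;
  bearerToken: string;
  toolNames: string[];
}

function buildOverridesSketch(mcp?: McpInputSketch): {
  configOverrides: Record<string, unknown>;
  env: Record<string, string>;
} {
  const configOverrides: Record<string, unknown> = { history: { persistence: 'none' } };
  const env: Record<string, string> = {};
  if (mcp) {
    configOverrides['mcp_servers'] = {
      ktx: {
        url: mcp.url,
        bearer_token_env_var: mcp.bearerTokenEnvVar, // the variable name, not the secret
        enabled_tools: mcp.toolNames,
        default_tools_approval_mode: 'approve',
        required: true,
      },
    };
    env[mcp.bearerTokenEnvVar] = mcp.bearerToken; // the secret goes to the child env only
  }
  return { configOverrides, env };
}

// Sample values below are illustrative, not taken from the diff.
const withTools = buildOverridesSketch({
  url: 'http://127.0.0.1:4000/mcp',
  bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
  bearerToken: 'example-token',
  toolNames: ['search'],
});
const withoutTools = buildOverridesSketch();
```

Without an MCP handle the result carries only the history override and an empty `env`, so no MCP server entry is configured at all.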
371
packages/cli/src/context/llm/codex-runtime.ts
Normal file
@@ -0,0 +1,371 @@
import { z } from 'zod';
import { noopLogger, type KtxLogger } from '../core/config.js';
import { isCompletedAgentStep, summarizeCodexExecEvents, type CodexExecEventSummary } from './codex-exec-events.js';
import {
  startCodexRuntimeMcpServer,
  type CodexRuntimeMcpServerHandle,
} from './codex-mcp-runtime-server.js';
import { resolveCodexModel } from './codex-models.js';
import { buildCodexRuntimeConfig } from './codex-runtime-config.js';
import { CodexSdkCliRunner, type CodexSdkRunner } from './codex-sdk-runner.js';
import type {
  KtxGenerateObjectInput,
  KtxGenerateTextInput,
  KtxLlmRuntimePort,
  KtxRuntimeToolSet,
  LlmTokenUsage,
  RunLoopParams,
  RunLoopResult,
} from './runtime-port.js';

export interface CodexKtxLlmRuntimeDeps {
  projectDir: string;
  modelSlots: { default: string } & Partial<Record<string, string>>;
  runner?: CodexSdkRunner;
  startMcpServer?: (input: { projectDir: string; toolSet: KtxRuntimeToolSet }) => Promise<CodexRuntimeMcpServerHandle>;
  logger?: KtxLogger;
}

function modelForRole(modelSlots: CodexKtxLlmRuntimeDeps['modelSlots'], role: string): string {
  return resolveCodexModel(modelSlots[role] ?? modelSlots.default);
}

function promptWithSystem(system: string | undefined, prompt: string): string {
  return [system, prompt].filter(Boolean).join('\n\n');
}

interface CollectCodexEventsOptions {
  stepBudget?: number;
  abortController?: AbortController;
  onStep?: (stepIndex: number) => void | Promise<void>;
}

interface CollectCodexEventsResult {
  events: unknown[];
  budgetExceeded: boolean;
  streamError?: Error;
}

function eventRecord(value: unknown): Record<string, unknown> | undefined {
  return value && typeof value === 'object' ? (value as Record<string, unknown>) : undefined;
}

function isTurnCompleted(event: unknown): boolean {
  return eventRecord(event)?.type === 'turn.completed';
}

/**
 * Drains the Codex stream once, emitting a step as each agent action completes
 * so callers see live progress and the step budget is enforced mid-run. Every
 * completed agent-action item counts (see {@link isCompletedAgentStep}), so
 * built-in `command_execution` steps decrement the budget the same as
 * `mcp_tool_call`s. A turn that produced no actions still counts as one step,
 * matching the metrics summary and the AI SDK backend.
 */
async function collectEvents(
  events: AsyncIterable<unknown>,
  options: CollectCodexEventsOptions = {},
): Promise<CollectCodexEventsResult> {
  const collected: unknown[] = [];
  let completedSteps = 0;
  let sawActionStep = false;
  let budgetExceeded = false;
  let streamError: Error | undefined;

  // The SDK yields every stdout event, then throws on a non-zero codex exec
  // exit. Catch that throw so the events already collected (which carry the
  // real `turn.failed`/`error` reason) survive for the summary; the masked
  // exit message is kept only as a fallback when no error event was emitted.
  try {
    for await (const event of events) {
      collected.push(event);

      const isActionStep = isCompletedAgentStep(event);
      if (isActionStep) {
        sawActionStep = true;
      } else if (sawActionStep || !isTurnCompleted(event)) {
        // Only fall back to counting a bare turn as a step when the turn produced
        // no agent actions; a completed turn is terminal, so it never aborts.
        continue;
      }

      completedSteps += 1;
      await options.onStep?.(completedSteps);
      if (isActionStep && options.stepBudget !== undefined && completedSteps >= options.stepBudget) {
        budgetExceeded = true;
        options.abortController?.abort();
        break;
      }
    }
  } catch (error) {
    streamError = error instanceof Error ? error : new Error(String(error));
  }

  return { events: collected, budgetExceeded, ...(streamError ? { streamError } : {}) };
}

function metrics(summary: CodexExecEventSummary, startedAt: number): { totalMs: number; usage: LlmTokenUsage } {
  return { totalMs: Date.now() - startedAt, usage: summary.usage };
}

function summaryError(summary: CodexExecEventSummary, streamError?: Error): Error | undefined {
  // A `turn.failed`/`error` event carries the real reason; prefer it over the
  // SDK's generic non-zero-exit throw. Fall back to the stream error only when
  // no event explained the failure (e.g. spawn failure or auth before a turn).
  if (summary.error) {
    return summary.error;
  }
  if (summary.toolFailures.length > 0) {
    return new Error(`Codex runtime tool call failed: ${summary.toolFailures.join('; ')}`);
  }
  return streamError;
}

function assertSuccessfulText(summary: CodexExecEventSummary, streamError?: Error): string {
  const error = summaryError(summary, streamError);
  if (error) {
    throw error;
  }
  if (!summary.finalText.trim()) {
    throw new Error('Codex completed without an agent message');
  }
  return summary.finalText;
}

function parseStructuredOutput<TOutput, TSchema extends z.ZodType<TOutput>>(schema: TSchema, text: string): TOutput {
  try {
    return schema.parse(JSON.parse(text));
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`Codex structured output failed validation: ${message}`);
  }
}

async function mcpForTools(input: {
  projectDir: string;
  toolSet?: KtxRuntimeToolSet;
  startMcpServer: CodexKtxLlmRuntimeDeps['startMcpServer'];
}): Promise<CodexRuntimeMcpServerHandle | undefined> {
  if (!input.toolSet || Object.keys(input.toolSet).length === 0) {
    return undefined;
  }
  return (input.startMcpServer ?? startCodexRuntimeMcpServer)({
    projectDir: input.projectDir,
    toolSet: input.toolSet,
  });
}

function runtimeToolNames(toolSet: KtxRuntimeToolSet | undefined): string[] {
  return Object.values(toolSet ?? {}).map((descriptor) => descriptor.name);
}

export class CodexKtxLlmRuntime implements KtxLlmRuntimePort {
  private readonly runner: CodexSdkRunner;
  private readonly logger: KtxLogger;

  constructor(private readonly deps: CodexKtxLlmRuntimeDeps) {
    this.runner = deps.runner ?? new CodexSdkCliRunner();
    this.logger = deps.logger ?? noopLogger;
  }

  async generateText(input: KtxGenerateTextInput): Promise<string> {
    const startedAt = Date.now();
    const model = modelForRole(this.deps.modelSlots, input.role);
    const mcp = await mcpForTools({
      projectDir: this.deps.projectDir,
      toolSet: input.tools,
      startMcpServer: this.deps.startMcpServer,
    });
    try {
      const config = buildCodexRuntimeConfig({
        model,
        ...(mcp
          ? {
              mcp: {
                url: mcp.url,
                bearerTokenEnvVar: mcp.bearerTokenEnvVar,
                bearerToken: mcp.bearerToken,
                toolNames: runtimeToolNames(input.tools),
              },
            }
          : {}),
      });
      const collected = await collectEvents(
        await this.runner.runStreamed({
          projectDir: this.deps.projectDir,
          model,
          prompt: promptWithSystem(input.system, input.prompt),
          configOverrides: config.configOverrides,
          env: config.env,
        }),
      );
      const summary = summarizeCodexExecEvents(collected.events, { startedAt });
      input.onMetrics?.(metrics(summary, startedAt));
      return assertSuccessfulText(summary, collected.streamError);
    } finally {
      await mcp?.close();
    }
  }

  async generateObject<TOutput, TSchema extends z.ZodType<TOutput>>(
    input: KtxGenerateObjectInput<TOutput, TSchema>,
  ): Promise<TOutput> {
    const startedAt = Date.now();
    const model = modelForRole(this.deps.modelSlots, input.role);
    const mcp = await mcpForTools({
      projectDir: this.deps.projectDir,
      toolSet: input.tools,
      startMcpServer: this.deps.startMcpServer,
    });
    try {
      const config = buildCodexRuntimeConfig({
        model,
        ...(mcp
          ? {
              mcp: {
                url: mcp.url,
                bearerTokenEnvVar: mcp.bearerTokenEnvVar,
                bearerToken: mcp.bearerToken,
                toolNames: runtimeToolNames(input.tools),
              },
            }
          : {}),
      });
      const collected = await collectEvents(
        await this.runner.runStreamed({
          projectDir: this.deps.projectDir,
          model,
          prompt: promptWithSystem(input.system, input.prompt),
          configOverrides: config.configOverrides,
          env: config.env,
          outputSchema: z.toJSONSchema(input.schema, { target: 'draft-7' }) as Record<string, unknown>,
        }),
      );
      const summary = summarizeCodexExecEvents(collected.events, { startedAt });
      input.onMetrics?.(metrics(summary, startedAt));
      return parseStructuredOutput(input.schema, assertSuccessfulText(summary, collected.streamError));
    } finally {
      await mcp?.close();
    }
  }

  async runAgentLoop(params: RunLoopParams): Promise<RunLoopResult> {
    const startedAt = Date.now();
    const model = modelForRole(this.deps.modelSlots, params.modelRole);
    let mcp: CodexRuntimeMcpServerHandle | undefined;
    try {
      mcp = await mcpForTools({
        projectDir: this.deps.projectDir,
        toolSet: params.toolSet,
        startMcpServer: this.deps.startMcpServer,
      });
      const config = buildCodexRuntimeConfig({
        model,
        ...(mcp
          ? {
              mcp: {
                url: mcp.url,
                bearerTokenEnvVar: mcp.bearerTokenEnvVar,
                bearerToken: mcp.bearerToken,
                toolNames: runtimeToolNames(params.toolSet),
              },
            }
          : {}),
      });
      const abortController = new AbortController();
      const onStep = async (stepIndex: number): Promise<void> => {
        try {
          await params.onStepFinish?.({ stepIndex, stepBudget: params.stepBudget });
        } catch (error) {
          this.logger.warn(
            `[codex-runner] onStepFinish callback threw; ignoring: ${error instanceof Error ? error.message : String(error)}`,
          );
        }
      };
      const collected = await collectEvents(
        await this.runner.runStreamed({
          projectDir: this.deps.projectDir,
          model,
          prompt: promptWithSystem(params.systemPrompt, params.userPrompt),
          configOverrides: config.configOverrides,
          env: config.env,
          signal: abortController.signal,
        }),
        { stepBudget: params.stepBudget, abortController, onStep },
      );
      const summary = summarizeCodexExecEvents(collected.events, { startedAt });
      const error = summaryError(summary, collected.streamError);
      const stopReason = collected.budgetExceeded ? 'budget' : error ? 'error' : summary.stopReason;
      return {
        stopReason,
        ...(stopReason === 'error' && error ? { error } : {}),
        metrics: {
          totalMs: Date.now() - startedAt,
          usage: summary.usage,
          stepCount: summary.stepCount,
          stepBoundariesMs: summary.stepBoundariesMs,
        },
      };
    } catch (error) {
      const err = error instanceof Error ? error : new Error(String(error));
      return {
        stopReason: 'error',
        error: err,
        metrics: { totalMs: Date.now() - startedAt, usage: {}, stepCount: 0, stepBoundariesMs: [] },
      };
    } finally {
      await mcp?.close();
    }
  }
}

// A rejected model is not an auth failure: Codex authenticated, connected, and
// the API refused the model id. These markers come from the API error envelope
// (e.g. "model is not supported", "invalid_request_error").
const MODEL_UNAVAILABLE_MARKERS =
  /\bnot supported\b|\bnot available\b|\bdoes not exist\b|invalid_request_error|\bunknown model\b|\bunsupported model\b/i;

function describeCodexProbeFailure(model: string, message: string): { message: string; fix: string } {
  if (MODEL_UNAVAILABLE_MARKERS.test(message)) {
    const fix = `Run \`codex\` to see the models your account supports, then set llm.models.default in ktx.yaml (or rerun \`ktx setup\`).`;
    return {
      message: `Codex is authenticated, but the configured model "${model}" is not available for this Codex account. ${fix} Details: ${message}`,
      fix,
    };
  }
  const fix = `Authenticate Codex locally with the Codex CLI, verify the Codex CLI is installed, then rerun setup or \`ktx status\`.`;
  return {
    message: `Codex authentication is not usable. ${fix} Details: ${message}`,
    fix,
  };
}

export async function runCodexAuthProbe(input: {
  projectDir: string;
  model: string;
  runner?: CodexSdkRunner;
}): Promise<{ ok: true } | { ok: false; message: string; fix: string }> {
  let model: string;
  try {
    model = resolveCodexModel(input.model);
  } catch (error) {
    return {
      ok: false,
      message: error instanceof Error ? error.message : String(error),
      fix: 'Set llm.models.default in ktx.yaml to a supported codex model (codex, default, or a gpt-* / codex-* id), or rerun `ktx setup`.',
    };
  }

  const runtime = new CodexKtxLlmRuntime({
    projectDir: input.projectDir,
    modelSlots: { default: model },
    ...(input.runner ? { runner: input.runner } : {}),
  });
  try {
    await runtime.generateText({ role: 'default', prompt: 'Reply with exactly: ok' });
    return { ok: true };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { ok: false, ...describeCodexProbeFailure(model, message) };
  }
}
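The step-counting rule in `collectEvents` is easier to follow on a fixed event list. The sketch below is a synchronous simplification under stated assumptions: it walks an array instead of an async stream, uses a flattened hypothetical event shape (`SketchEvent` with `itemType` at the top level), and takes the list of action item types from the commit description rather than from the real `isCompletedAgentStep`. `countSteps` and `isCompletedAction` are names introduced here.

```typescript
// Hypothetical, flattened event shape; real events nest the item.
interface SketchEvent {
  type: string;
  itemType?: string;
}

const ACTION_ITEM_TYPES = ['command_execution', 'mcp_tool_call', 'file_change', 'web_search'];

function isCompletedAction(event: SketchEvent): boolean {
  return (
    event.type === 'item.completed' &&
    event.itemType !== undefined &&
    ACTION_ITEM_TYPES.indexOf(event.itemType) !== -1
  );
}

function countSteps(
  events: SketchEvent[],
  stepBudget?: number,
  onStep?: (stepIndex: number) => void,
): { steps: number; budgetExceeded: boolean } {
  let steps = 0;
  let sawAction = false;
  for (const event of events) {
    const isAction = isCompletedAction(event);
    if (isAction) {
      sawAction = true;
    } else if (sawAction || event.type !== 'turn.completed') {
      // A bare turn counts as a step only when no action was seen before it.
      continue;
    }
    steps += 1;
    if (onStep) {
      onStep(steps); // fired as each step completes, not after the stream ends
    }
    // Only action steps can trip the budget; a completed turn is already terminal.
    if (isAction && stepBudget !== undefined && steps >= stepBudget) {
      return { steps, budgetExceeded: true };
    }
  }
  return { steps, budgetExceeded: false };
}
```

In the real runtime the early return corresponds to aborting the Codex process through the `AbortController`; the sketch only reports that the budget was hit.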
96
packages/cli/src/context/llm/codex-sdk-runner.ts
Normal file
@@ -0,0 +1,96 @@
import { Codex, type CodexOptions, type ThreadOptions, type TurnOptions } from '@openai/codex-sdk';

export interface CodexSdkRunnerInput {
  projectDir: string;
  model: string;
  prompt: string;
  configOverrides?: Record<string, unknown>;
  env?: Record<string, string>;
  outputSchema?: Record<string, unknown>;
  signal?: AbortSignal;
}

export interface CodexSdkRunner {
  runStreamed(input: CodexSdkRunnerInput): Promise<AsyncIterable<unknown>>;
}

type CodexThread = {
  runStreamed(input: string, turnOptions?: TurnOptions): Promise<{ events: AsyncIterable<unknown> }>;
};

type CodexClient = {
  startThread(options: ThreadOptions): CodexThread;
};

type CodexConstructor = new (options?: CodexOptions) => CodexClient;

export interface CodexSdkCliRunnerOptions {
  envBase?: NodeJS.ProcessEnv;
  codexPathOverride?: string;
}

const CODEX_ENV_ALLOWLIST = new Set([
  'HOME',
  'USERPROFILE',
  'APPDATA',
  'LOCALAPPDATA',
  'XDG_CONFIG_HOME',
  'CODEX_HOME',
  'CODEX_API_KEY',
  'OPENAI_API_KEY',
  'PATH',
  'Path',
  'SYSTEMROOT',
  'COMSPEC',
  'TMPDIR',
  'TMP',
  'TEMP',
  'SSL_CERT_FILE',
  'SSL_CERT_DIR',
  'NODE_EXTRA_CA_CERTS',
  'HTTPS_PROXY',
  'HTTP_PROXY',
  'ALL_PROXY',
  'NO_PROXY',
]);

function buildCodexSdkEnv(baseEnv: NodeJS.ProcessEnv, overrides: Record<string, string> | undefined): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of CODEX_ENV_ALLOWLIST) {
    const value = baseEnv[key];
    if (typeof value === 'string') {
      env[key] = value;
    }
  }
  return { ...env, ...(overrides ?? {}) };
}

export class CodexSdkCliRunner implements CodexSdkRunner {
  constructor(private readonly options: CodexSdkCliRunnerOptions = {}) {}

  async runStreamed(input: CodexSdkRunnerInput): Promise<AsyncIterable<unknown>> {
    const CodexClass = Codex as CodexConstructor;
    const codex = new CodexClass({
      ...(input.configOverrides ? { config: input.configOverrides as CodexOptions['config'] } : {}),
      env: buildCodexSdkEnv(this.options.envBase ?? process.env, input.env),
      ...(this.options.codexPathOverride ? { codexPathOverride: this.options.codexPathOverride } : {}),
    });
    const thread = codex.startThread({
      workingDirectory: input.projectDir,
      skipGitRepoCheck: true,
      model: input.model,
      sandboxMode: 'read-only',
      webSearchMode: 'disabled',
      approvalPolicy: 'never',
    });
    const turnOptions: TurnOptions = {
      ...(input.outputSchema ? { outputSchema: input.outputSchema } : {}),
      ...(input.signal ? { signal: input.signal } : {}),
    };
    const streamed = await thread.runStreamed(
      input.prompt,
      Object.keys(turnOptions).length > 0 ? turnOptions : undefined,
    );
    return streamed.events;
  }
}
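The environment handed to the Codex child process is built from an allowlist and then extended with explicit overrides, so unrelated host secrets are not forwarded. A minimal sketch of that filtering follows; `buildEnvSketch` is a hypothetical name, the allowlist is a three-entry subset of the one in the diff, and the sample variables are made up.

```typescript
// Subset of the allowlist from the diff, kept short for illustration.
const ENV_ALLOWLIST_SKETCH = ['HOME', 'PATH', 'OPENAI_API_KEY'];

function buildEnvSketch(
  baseEnv: Record<string, string | undefined>,
  overrides?: Record<string, string>,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of ENV_ALLOWLIST_SKETCH) {
    const value = baseEnv[key];
    if (typeof value === 'string') {
      env[key] = value; // copy only allowlisted variables that are actually set
    }
  }
  // Overrides win and may add names outside the allowlist (e.g. the MCP token variable).
  return { ...env, ...(overrides ?? {}) };
}

// Sample host environment; values are illustrative.
const childEnv = buildEnvSketch(
  { HOME: '/home/user', PATH: '/usr/bin', AWS_SECRET_ACCESS_KEY: 'do-not-forward' },
  { KTX_CODEX_RUNTIME_MCP_TOKEN: 'example-token' },
);
```

Here `childEnv` contains `HOME`, `PATH`, and the override, while `AWS_SECRET_ACCESS_KEY` is dropped and the unset `OPENAI_API_KEY` is simply absent.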
@@ -5,6 +5,7 @@ import { resolveKtxConfigReference } from '../core/config-reference.js';
 import type { KtxProjectEmbeddingConfig, KtxProjectLlmConfig } from '../project/config.js';
 import { AiSdkKtxLlmRuntime } from './ai-sdk-runtime.js';
 import { ClaudeCodeKtxLlmRuntime } from './claude-code-runtime.js';
+import { CodexKtxLlmRuntime } from './codex-runtime.js';
 import type { KtxLlmRuntimePort } from './runtime-port.js';
 
 interface LocalConfigDeps {
@@ -13,6 +14,7 @@ interface LocalConfigDeps {
   createKtxLlmProvider?: typeof createKtxLlmProvider;
   createKtxEmbeddingProvider?: typeof createKtxEmbeddingProvider;
   createClaudeCodeRuntime?: (deps: ConstructorParameters<typeof ClaudeCodeKtxLlmRuntime>[0]) => KtxLlmRuntimePort;
+  createCodexRuntime?: (deps: ConstructorParameters<typeof CodexKtxLlmRuntime>[0]) => KtxLlmRuntimePort;
   createAiSdkRuntime?: (deps: { llmProvider: KtxLlmProvider }) => KtxLlmRuntimePort;
 }
 
@@ -104,7 +106,7 @@ export function createLocalKtxLlmProviderFromConfig(
   deps: LocalConfigDeps = {},
 ): KtxLlmProvider | null {
   const resolved = resolveLocalKtxLlmConfig(config, deps.env ?? process.env);
-  if (!resolved || resolved.backend === 'claude-code') {
+  if (!resolved || resolved.backend === 'claude-code' || resolved.backend === 'codex') {
     return null;
   }
   return (deps.createKtxLlmProvider ?? createKtxLlmProvider)(resolved);
@@ -129,6 +131,16 @@ export function createLocalKtxLlmRuntimeFromConfig(
       env: deps.env,
     });
   }
+  if (resolved.backend === 'codex') {
+    const projectDir = deps.projectDir;
+    if (!projectDir) {
+      throw new Error('projectDir is required when creating the codex LLM runtime');
+    }
+    return (deps.createCodexRuntime ?? ((runtimeDeps) => new CodexKtxLlmRuntime(runtimeDeps)))({
+      projectDir,
+      modelSlots: resolved.modelSlots,
+    });
+  }
   const llmProvider = (deps.createKtxLlmProvider ?? createKtxLlmProvider)(resolved);
   return (deps.createAiSdkRuntime ?? ((runtimeDeps) => new AiSdkKtxLlmRuntime(runtimeDeps)))({ llmProvider });
 }
@@ -3,7 +3,7 @@ import YAML from 'yaml';
 import * as z from 'zod';
 import { connectionConfigSchema } from './driver-schemas.js';
 
-const KTX_LLM_BACKENDS = ['none', 'anthropic', 'vertex', 'gateway', 'claude-code'] as const;
+const KTX_LLM_BACKENDS = ['none', 'anthropic', 'vertex', 'gateway', 'claude-code', 'codex'] as const;
 const KTX_EMBEDDING_BACKENDS = ['none', 'openai', 'sentence-transformers'] as const;
 const KTX_PROMPT_CACHE_TTLS = ['5m', '1h'] as const;
 const KTX_ENRICHMENT_MODES = ['none', 'deterministic', 'llm'] as const;
@@ -38,7 +38,7 @@ const llmProviderSchema = z
       .enum(KTX_LLM_BACKENDS)
       .default('none')
       .describe(
-        'LLM provider backend. "none" disables LLM features; "anthropic" / "vertex" / "gateway" require the matching nested credentials block; "claude-code" uses the local Claude Code session.',
+        'LLM provider backend. "none" disables LLM features; "anthropic" / "vertex" / "gateway" require the matching nested credentials block; "claude-code" uses the local Claude Code session; "codex" uses the local Codex session.',
       ),
     vertex: vertexProviderSchema.optional().describe('Vertex AI credentials, used when backend is "vertex".'),
     anthropic: apiCredentialsSchema.optional().describe('Anthropic API credentials, used when backend is "anthropic".'),
||||||
|
|
@@ -3,7 +3,7 @@ import type { LanguageModel, TelemetrySettings, ToolCallRepairFunction, ToolSet
 export const KTX_MODEL_ROLES = ['default', 'triage', 'candidateExtraction', 'curator', 'reconcile', 'repair'] as const;

 export type KtxModelRole = (typeof KTX_MODEL_ROLES)[number];
-type KtxLlmBackend = 'anthropic' | 'vertex' | 'gateway' | 'claude-code';
+type KtxLlmBackend = 'anthropic' | 'vertex' | 'gateway' | 'claude-code' | 'codex';
 export type KtxPromptCacheTtl = '5m' | '1h';

 type KtxJsonValue =

@@ -3,6 +3,9 @@ import { writeFile } from 'node:fs/promises';
 import { promisify } from 'node:util';
 import { resolveLocalKtxLlmConfig } from './context/llm/local-config.js';
 import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import { formatCodexIsolationWarning } from './context/llm/codex-isolation.js';
+import { runCodexAuthProbe } from './context/llm/codex-runtime.js';
+import { DEFAULT_CODEX_MODEL } from './context/llm/codex-models.js';
 import { resolveKtxConfigReference } from './context/core/config-reference.js';
 import { type KtxProjectConfig, type KtxProjectLlmConfig, serializeKtxProjectConfig } from './context/project/config.js';
 import { loadKtxProject } from './context/project/project.js';
@@ -56,7 +59,7 @@ export interface AnthropicModelChoice {
   recommended: boolean;
 }

-export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code';
+export type KtxSetupLlmBackend = 'anthropic' | 'vertex' | 'claude-code' | 'codex';

 /** @internal */
 export interface KtxSetupModelPromptAdapter {
@@ -82,6 +85,7 @@ export interface KtxSetupModelDeps {
     model: string;
     env?: NodeJS.ProcessEnv;
   }) => Promise<{ ok: true } | { ok: false; message: string }>;
+  codexAuthProbe?: (input: { projectDir: string; model: string }) => Promise<{ ok: true } | { ok: false; message: string }>;
   readGcloudProject?: () => Promise<string | undefined>;
   listGcloudProjects?: () => Promise<GcloudProjectChoice[]>;
   spinner?: () => KtxCliSpinner;
@@ -110,6 +114,20 @@ const CLAUDE_CODE_MODELS: AnthropicModelChoice[] = [
   { id: 'haiku', label: 'Claude Haiku', recommended: false },
 ];

+// Curated Codex models from OpenAI's current lineup that work under both
+// ChatGPT-account (subscription) and API-key auth. Intentionally omitted:
+// the `*-codex` ids (e.g. gpt-5.3-codex, gpt-5.2-codex) are API-key-only and
+// fail on ChatGPT-account auth, and gpt-5.3-codex-spark is a ChatGPT-Pro-only
+// research preview. Codex resolves real availability per account at runtime
+// (its binary remote-fetches the model list), so this is a convenience
+// shortlist only — the manual-entry option accepts any id your account's
+// `codex` picker exposes, and the auth probe reports an unsupported choice.
+const CODEX_MODELS: AnthropicModelChoice[] = [
+  { id: 'gpt-5.5', label: 'GPT-5.5', recommended: true },
+  { id: 'gpt-5.4', label: 'GPT-5.4', recommended: false },
+  { id: 'gpt-5.4-mini', label: 'GPT-5.4 mini', recommended: false },
+];
+
 const HIDDEN_ANTHROPIC_MODEL_PATTERNS = [
   /^claude-sonnet-4$/i,
   /^claude-opus-4$/i,
@@ -272,7 +290,12 @@ export function isKtxSetupLlmConfigReady(config: KtxProjectLlmConfig): boolean {
     return typeof resolved.vertex?.location === 'string' && resolved.vertex.location.trim().length > 0;
   }

-  return resolved.backend === 'anthropic' || resolved.backend === 'gateway' || resolved.backend === 'claude-code';
+  return (
+    resolved.backend === 'anthropic' ||
+    resolved.backend === 'gateway' ||
+    resolved.backend === 'claude-code' ||
+    resolved.backend === 'codex'
+  );
 }

 function hasUsableConfiguredLlm(config: KtxProjectConfig): boolean {
@@ -284,7 +307,8 @@ function buildProjectLlmConfig(
   provider:
     | { backend: 'anthropic'; credentialRef: string }
     | { backend: 'vertex'; vertex: { project?: string; location: string } }
-    | { backend: 'claude-code' },
+    | { backend: 'claude-code' }
+    | { backend: 'codex' },
   model: string,
 ): KtxProjectLlmConfig {
   if (provider.backend === 'claude-code') {
@@ -295,6 +319,14 @@ function buildProjectLlmConfig(
     };
   }

+  if (provider.backend === 'codex') {
+    return {
+      provider: { backend: 'codex' },
+      models: { ...existing.models, default: model },
+      promptCaching: existing.promptCaching,
+    };
+  }
+
   if (provider.backend === 'vertex') {
     return {
       provider: {
@@ -515,6 +547,7 @@ async function chooseBackend(
     message: 'Which LLM provider should KTX use?',
     options: [
       { value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
+      { value: 'codex', label: 'Codex subscription' },
       { value: 'anthropic', label: 'Anthropic API key' },
       { value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
       { value: 'back', label: 'Back' },
@@ -525,7 +558,7 @@ async function chooseBackend(
   }
   return {
     status: 'ready',
-    backend: choice === 'vertex' || choice === 'claude-code' ? choice : 'anthropic',
+    backend: choice === 'vertex' || choice === 'claude-code' || choice === 'codex' ? choice : 'anthropic',
     prompted: true,
   };
 }
@@ -884,12 +917,51 @@ async function chooseClaudeCodeModel(args: KtxSetupModelArgs, deps: KtxSetupMode
   return { status: 'ready', model: choice };
 }

+async function chooseCodexModel(args: KtxSetupModelArgs, deps: KtxSetupModelDeps): Promise<ChooseModelResult> {
+  const providedModel = requestedModel(args);
+  if (providedModel) {
+    return { status: 'ready', model: providedModel };
+  }
+  if (args.inputMode === 'disabled') {
+    return { status: 'ready', model: DEFAULT_CODEX_MODEL };
+  }
+
+  const prompts = deps.prompts ?? createPromptAdapter();
+  const choice = await prompts.select({
+    message: `Which Codex model should KTX use?\n\n${ANTHROPIC_MODEL_PROMPT_CONTEXT}`,
+    options: [
+      ...CODEX_MODELS.map((model) => ({
+        value: model.id,
+        label: model.label,
+        ...(model.recommended ? { hint: 'recommended' } : {}),
+      })),
+      { value: 'manual', label: 'Enter a Codex model ID manually' },
+      { value: 'back', label: 'Back' },
+    ],
+  });
+  if (choice === 'back') {
+    return { status: 'back' };
+  }
+  if (choice === 'manual') {
+    const manual = await prompts.text({
+      message: withTextInputNavigation('Codex model ID'),
+      placeholder: CODEX_MODELS.find((model) => model.recommended)?.id ?? CODEX_MODELS[0]?.id,
+    });
+    if (manual === undefined) {
+      return { status: 'back' };
+    }
+    return manual.trim() ? { status: 'ready', model: manual.trim() } : { status: 'missing-input' };
+  }
+  return { status: 'ready', model: choice };
+}
+
 async function persistLlmConfig(
   projectDir: string,
   provider:
     | { backend: 'anthropic'; credentialRef: string }
     | { backend: 'vertex'; vertex: { project?: string; location: string } }
-    | { backend: 'claude-code' },
+    | { backend: 'claude-code' }
+    | { backend: 'codex' },
   model: string,
 ): Promise<void> {
   const project = await loadKtxProject({ projectDir });
@@ -1031,6 +1103,32 @@ export async function runKtxSetupAnthropicModelStep(
       return { status: 'ready', projectDir: args.projectDir };
     }

+    if (backendChoice.backend === 'codex') {
+      const model = await chooseCodexModel(backendArgs, deps);
+      if (model.status === 'back' && backendChoice.prompted) {
+        attemptArgs = buildInteractiveRetryArgs(args);
+        continue;
+      }
+      if (model.status === 'invalid-credential') {
+        return { status: 'failed', projectDir: args.projectDir };
+      }
+      if (model.status !== 'ready') {
+        return { status: model.status, projectDir: args.projectDir };
+      }
+      const probe = deps.codexAuthProbe ?? runCodexAuthProbe;
+      const health = await probe({ projectDir: args.projectDir, model: model.model });
+      if (!health.ok) {
+        io.stderr.write(`${health.message}\n`);
+        return { status: 'failed', projectDir: args.projectDir };
+      }
+      // Prefix the clack gutter so the warning sits inside the setup frame
+      // instead of breaking out of it; kept on stderr for scripted runs.
+      io.stderr.write(`│ ${formatCodexIsolationWarning()}\n`);
+      await persistLlmConfig(args.projectDir, { backend: 'codex' }, model.model);
+      io.stdout.write(`│ LLM ready: yes (codex, ${model.model})\n`);
+      return { status: 'ready', projectDir: args.projectDir };
+    }
+
     const credential = await chooseCredentialRef(backendArgs, io, deps);
     if (credential.status === 'back' && backendChoice.prompted) {
       attemptArgs = buildInteractiveRetryArgs(args);

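The model-selection order in `chooseCodexModel` above can be sketched as two small pure functions. This is a minimal illustration, not the project's code: `decideCodexModel`, `resolveManualEntry`, the `'enabled'` input mode, and `ASSUMED_DEFAULT_CODEX_MODEL` are hypothetical names, and the default value is an assumption because `DEFAULT_CODEX_MODEL` is imported from a file this diff does not show.

```typescript
// Assumption: stands in for DEFAULT_CODEX_MODEL, whose real value is not shown in this diff.
const ASSUMED_DEFAULT_CODEX_MODEL = 'gpt-5.5';

type ModelDecision = { status: 'ready'; model: string } | { status: 'prompt' };

type ManualEntryResult =
  | { status: 'ready'; model: string }
  | { status: 'back' }
  | { status: 'missing-input' };

// An explicitly requested model wins; non-interactive runs fall back to the
// default; only otherwise would the interactive picker be shown.
function decideCodexModel(requested: string | undefined, inputMode: 'enabled' | 'disabled'): ModelDecision {
  if (requested) {
    return { status: 'ready', model: requested };
  }
  if (inputMode === 'disabled') {
    return { status: 'ready', model: ASSUMED_DEFAULT_CODEX_MODEL };
  }
  return { status: 'prompt' };
}

// Mirrors the manual-entry branch: a cancelled prompt goes back, a blank
// entry is reported as missing input, anything else is trimmed and accepted.
function resolveManualEntry(manual: string | undefined): ManualEntryResult {
  if (manual === undefined) {
    return { status: 'back' };
  }
  const trimmed = manual.trim();
  return trimmed ? { status: 'ready', model: trimmed } : { status: 'missing-input' };
}
```

Keeping the decision separate from the prompt calls makes the precedence easy to unit test without mocking a prompt adapter.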
@@ -1,6 +1,11 @@
 import { stat as statAsync, readdir as readdirAsync } from 'node:fs/promises';
 import { basename, join } from 'node:path';
 import { runClaudeCodeAuthProbe } from './context/llm/claude-code-runtime.js';
+import {
+  CODEX_ISOLATION_WARNING,
+  CODEX_ISOLATION_WARNING_FIX,
+} from './context/llm/codex-isolation.js';
+import { runCodexAuthProbe } from './context/llm/codex-runtime.js';
 import type { KtxConfigIssue, KtxProjectConfig, KtxProjectConnectionConfig, KtxProjectEmbeddingConfig, KtxProjectLlmConfig } from './context/project/config.js';
 import type { KtxLocalProject } from './context/project/project.js';
 import { ktxLocalStateDbPath } from './context/project/local-state-db.js';
@@ -94,6 +99,11 @@ type ClaudeCodeAuthProbe = (input: {
   env?: NodeJS.ProcessEnv;
 }) => Promise<{ ok: true } | { ok: false; message: string }>;

+type CodexAuthProbe = (input: {
+  projectDir: string;
+  model: string;
+}) => Promise<{ ok: true } | { ok: false; message: string; fix: string }>;
+
 const PROJECT_READY_COMMANDS = KTX_NEXT_STEP_DIRECT_COMMANDS.map((step) => step.command);

 interface LocalStatsIngestPerConnection {
@@ -194,6 +204,7 @@ async function buildLlmStatus(
     projectDir: string;
     env: NodeJS.ProcessEnv;
     claudeCodeAuthProbe?: ClaudeCodeAuthProbe;
+    codexAuthProbe?: CodexAuthProbe;
     fast?: boolean;
     useSpinner?: boolean;
   },
@@ -210,6 +221,18 @@ async function buildLlmStatus(
       fix: 'Run: ktx setup (choose an LLM provider)',
     };
   }
+  // The runtime (resolveModelSlots) hard-requires llm.models.default for every
+  // non-none backend; without it ingest/scan/memory throw. Report that here so
+  // status never marks a project ready that the runtime would refuse to run.
+  if (!model || model.trim().length === 0) {
+    return {
+      backend,
+      model,
+      status: 'fail',
+      detail: `llm.models.default is required for backend "${backend}"`,
+      fix: 'Set llm.models.default in ktx.yaml, then rerun `ktx status` (or rerun `ktx setup`).',
+    };
+  }
   if (backend === 'anthropic') {
     const ref = config.provider.anthropic?.api_key;
     const resolved = resolveRef(ref, env);
@@ -251,7 +274,7 @@ async function buildLlmStatus(
     };
   }
   if (backend === 'claude-code') {
-    const modelName = model ?? 'sonnet';
+    const modelName = model;
     if (options.fast === true) {
       return {
         backend,
@@ -280,6 +303,36 @@ async function buildLlmStatus(
       fix: 'Authenticate Claude Code locally with the Claude Code CLI, then rerun `ktx status`.',
     };
   }
+  if (backend === 'codex') {
+    const modelName = model;
+    if (options.fast === true) {
+      return {
+        backend,
+        model: modelName,
+        status: 'skipped',
+        detail: 'auth probe skipped (--fast)',
+      };
+    }
+    const probe = options.codexAuthProbe ?? runCodexAuthProbe;
+    const auth = await withSpinner(options.useSpinner === true, 'Probing Codex authentication', () =>
+      probe({ projectDir: options.projectDir, model: modelName }),
+    );
+    if (auth.ok) {
+      return {
+        backend,
+        model: modelName,
+        status: 'ok',
+        detail: 'local Codex session authenticated',
+      };
+    }
+    return {
+      backend,
+      model: modelName,
+      status: 'fail',
+      detail: auth.message,
+      fix: auth.fix,
+    };
+  }
   return { backend, model, status: 'warn', detail: 'unknown LLM backend' };
 }

@@ -572,6 +625,13 @@ function buildWarnings(
     });
   }

+  if (llm.backend === 'codex') {
+    warnings.push({
+      message: CODEX_ISOLATION_WARNING,
+      fix: CODEX_ISOLATION_WARNING_FIX,
+    });
+  }
+
   return warnings;
 }

@@ -634,6 +694,7 @@ export interface BuildProjectStatusOptions {
   env?: NodeJS.ProcessEnv;
   queryHistoryReadinessProbe?: HistoricSqlReadinessProbe;
   claudeCodeAuthProbe?: ClaudeCodeAuthProbe;
+  codexAuthProbe?: CodexAuthProbe;
   configIssues?: KtxConfigIssue[];
   fast?: boolean;
   useSpinner?: boolean;
@@ -882,6 +943,7 @@ export async function buildProjectStatus(project: KtxLocalProject, options: Buil
     projectDir: project.projectDir,
     env,
     claudeCodeAuthProbe: options.claudeCodeAuthProbe,
+    codexAuthProbe: options.codexAuthProbe,
     fast: options.fast,
     useSpinner: options.useSpinner,
   });

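The status hunks above check the Codex backend in a fixed order: a missing `llm.models.default` fails first, `--fast` skips the auth probe, and only then does the probe run. A minimal sketch of that ordering follows; `planCodexStatus` and `CodexStatusPlan` are hypothetical names introduced for illustration and do not appear in the project.

```typescript
type CodexStatusPlan =
  | { action: 'fail'; detail: string }
  | { action: 'skip'; detail: string }
  | { action: 'probe'; model: string };

// Decide what the status check should do before any (slow) auth probe runs.
function planCodexStatus(model: string | undefined, fast: boolean): CodexStatusPlan {
  // The runtime refuses to run without a default model, so report that first.
  if (!model || model.trim().length === 0) {
    return { action: 'fail', detail: 'llm.models.default is required for backend "codex"' };
  }
  // --fast trades the authentication check for speed.
  if (fast) {
    return { action: 'skip', detail: 'auth probe skipped (--fast)' };
  }
  return { action: 'probe', model };
}
```

Checking the model before the `--fast` shortcut means a fast status run still reports a configuration the runtime would reject.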
@@ -77,9 +77,10 @@ describe('createLocalBundleIngestRuntime', () => {
       }),
     ).toThrow(
       [
-        'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
-        'Configure a local Claude Code session or API-backed LLM, then rerun ingest:',
+        'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
+        'Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:',
         ` ktx setup --project-dir ${project.projectDir} --llm-backend claude-code --no-input`,
+        ` ktx setup --project-dir ${project.projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
         ` ktx setup --project-dir ${project.projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
       ].join('\n'),
     );

188
packages/cli/test/context/llm/codex-exec-events.test.ts
Normal file
@@ -0,0 +1,188 @@
+import { describe, expect, it } from 'vitest';
+import {
+  parseCodexExecEventLine,
+  summarizeCodexExecEvents,
+} from '../../../src/context/llm/codex-exec-events.js';
+
+describe('Codex exec event parsing', () => {
+  it('uses the completed turn as one step when no MCP tools run', () => {
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'thread.started', thread_id: 'thr_1' },
+        { type: 'turn.started' },
+        { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'hello from codex' } },
+        {
+          type: 'turn.completed',
+          usage: {
+            input_tokens: 12,
+            cached_input_tokens: 4,
+            output_tokens: 5,
+            reasoning_output_tokens: 2,
+          },
+        },
+      ],
+      { startedAt: 100, now: () => 125 },
+    );
+
+    expect(summary).toEqual({
+      finalText: 'hello from codex',
+      stopReason: 'natural',
+      usage: { inputTokens: 12, outputTokens: 5, totalTokens: 17 },
+      stepCount: 1,
+      stepBoundariesMs: [25],
+      toolCallCount: 0,
+      toolFailures: [],
+    });
+  });
+
+  it('uses completed MCP tool calls as loop steps', () => {
+    const offsets = [115, 140, 175];
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'turn.started' },
+        {
+          type: 'item.started',
+          item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'in_progress' },
+        },
+        {
+          type: 'item.completed',
+          item: { id: 'call_1', type: 'mcp_tool_call', server: 'ktx', tool: 'search', arguments: {}, status: 'completed' },
+        },
+        {
+          type: 'item.started',
+          item: { id: 'call_2', type: 'mcp_tool_call', server: 'ktx', tool: 'lookup', arguments: {}, status: 'in_progress' },
+        },
+        {
+          type: 'item.completed',
+          item: {
+            id: 'call_2',
+            type: 'mcp_tool_call',
+            server: 'ktx',
+            tool: 'lookup',
+            arguments: {},
+            status: 'failed',
+            error: { message: 'denied' },
+          },
+        },
+        { type: 'item.completed', item: { id: 'item_1', type: 'agent_message', text: 'done' } },
+        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, cached_input_tokens: 0, reasoning_output_tokens: 0 } },
+      ],
+      { startedAt: 100, now: () => offsets.shift() ?? 175 },
+    );
+
+    expect(summary).toEqual({
+      finalText: 'done',
+      stopReason: 'natural',
+      usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
+      stepCount: 2,
+      stepBoundariesMs: [15, 40],
+      toolCallCount: 2,
+      toolFailures: ['lookup: denied'],
+    });
+  });
+
+  it('does not treat a completed MCP tool call as failed when Codex sends error: null', () => {
+    // Captured verbatim from a real @openai/codex-sdk run: successful tool calls
+    // carry `error: null` and `result` alongside `status: "completed"`.
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      {
+        type: 'item.started',
+        item: {
+          id: 'item_1',
+          type: 'mcp_tool_call',
+          server: 'ktx',
+          tool: 'echo_value',
+          arguments: { value: 'ktx_codex_tool_ok' },
+          result: null,
+          error: null,
+          status: 'in_progress',
+        },
+      },
+      {
+        type: 'item.completed',
+        item: {
+          id: 'item_1',
+          type: 'mcp_tool_call',
+          server: 'ktx',
+          tool: 'echo_value',
+          arguments: { value: 'ktx_codex_tool_ok' },
+          result: { content: [{ type: 'text', text: 'echo:ktx_codex_tool_ok' }], structured_content: null },
+          error: null,
+          status: 'completed',
+        },
+      },
+      { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+      { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
+    ]);
+
+    expect(summary.toolFailures).toEqual([]);
+    expect(summary.toolCallCount).toBe(1);
+  });
+
+  it('counts built-in command executions as loop steps without failing the loop', () => {
+    const offsets = [110, 130];
+    const summary = summarizeCodexExecEvents(
+      [
+        { type: 'turn.started' },
+        { type: 'item.completed', item: { id: 'c1', type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
+        { type: 'item.completed', item: { id: 'c2', type: 'command_execution', command: 'cat missing', status: 'failed', exit_code: 1 } },
+        { type: 'item.completed', item: { id: 'm1', type: 'agent_message', text: 'done' } },
+        { type: 'turn.completed', usage: { input_tokens: 2, output_tokens: 1 } },
+      ],
+      { startedAt: 100, now: () => offsets.shift() ?? 130 },
+    );
+
+    expect(summary.stepCount).toBe(2);
+    expect(summary.stepBoundariesMs).toEqual([10, 30]);
+    // A non-zero command exit is normal agent exploration, not a runtime tool failure.
+    expect(summary.toolFailures).toEqual([]);
+    expect(summary.toolCallCount).toBe(0);
+  });
+
+  it('maps turn failures into error stop reason', () => {
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      { type: 'turn.failed', error: { message: 'Codex could not connect to required MCP server' } },
+    ]);
+
+    expect(summary.stopReason).toBe('error');
+    expect(summary.error?.message).toContain('Codex could not connect to required MCP server');
+  });
+
+  it('unwraps the Codex API error envelope into its human-readable message', () => {
+    // Codex serializes API errors as a JSON envelope inside the event message.
+    const apiError = JSON.stringify({
+      type: 'error',
+      status: 400,
+      error: {
+        type: 'invalid_request_error',
+        message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+      },
+    });
+    const summary = summarizeCodexExecEvents([
+      { type: 'thread.started', thread_id: 'thr_1' },
+      { type: 'turn.started' },
+      { type: 'error', message: apiError },
+      { type: 'turn.failed', error: { message: apiError } },
+    ]);
+
+    expect(summary.stopReason).toBe('error');
+    expect(summary.error?.message).toBe(
+      "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
+    );
+  });
+
+  it('maps max-turns terminal reasons into budget stop reason when Codex emits one', () => {
+    const summary = summarizeCodexExecEvents([
+      { type: 'turn.started' },
+      { type: 'turn.completed', reason: 'max_turns', usage: { input_tokens: 1, output_tokens: 1 } },
+    ]);
+
+    expect(summary.stopReason).toBe('budget');
+  });
+
+  it('throws a clear error for malformed JSONL lines', () => {
+    expect(() => parseCodexExecEventLine('{not-json')).toThrow('Codex JSONL event stream was malformed');
+  });
+});

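The tests above pin down the step-counting rules described in the commit message: every completed agent-action item counts as a step, a turn with no such item still counts as one step, and only MCP tool calls feed `toolCallCount` and `toolFailures`. The sketch below is a hypothetical re-implementation of those rules for illustration only. `summarizeSteps`, `CodexEvent`, and `CodexItem` are assumed names and shapes, usage accounting and error unwrapping are left out, and the real `summarizeCodexExecEvents` may differ.

```typescript
// Assumed event shapes, reduced to the fields the tests in this diff exercise.
interface CodexItem {
  type: string;
  text?: string;
  tool?: string;
  status?: string;
  error?: { message: string } | null;
}
interface CodexEvent {
  type: string;
  item?: CodexItem;
  reason?: string;
}
interface StepSummary {
  finalText: string;
  stopReason: 'natural' | 'budget' | 'error';
  stepCount: number;
  stepBoundariesMs: number[];
  toolCallCount: number;
  toolFailures: string[];
}

// Item types the commit message lists as agent steps.
const STEP_ITEM_TYPES = new Set(['command_execution', 'mcp_tool_call', 'file_change', 'web_search']);

function isCompletedAgentStep(event: CodexEvent): boolean {
  return event.type === 'item.completed' && event.item !== undefined && STEP_ITEM_TYPES.has(event.item.type);
}

function summarizeSteps(events: CodexEvent[], clock: { startedAt: number; now: () => number }): StepSummary {
  const summary: StepSummary = {
    finalText: '',
    stopReason: 'natural',
    stepCount: 0,
    stepBoundariesMs: [],
    toolCallCount: 0,
    toolFailures: [],
  };
  for (const event of events) {
    const item = event.item;
    if (isCompletedAgentStep(event) && item) {
      summary.stepCount += 1;
      summary.stepBoundariesMs.push(clock.now() - clock.startedAt);
      // Only MCP tool calls count as tool calls; a failed shell command is not a tool failure.
      if (item.type === 'mcp_tool_call') {
        summary.toolCallCount += 1;
        if (item.status === 'failed') {
          summary.toolFailures.push(`${item.tool ?? 'unknown'}: ${item.error?.message ?? 'failed'}`);
        }
      }
    } else if (event.type === 'item.completed' && item?.type === 'agent_message') {
      summary.finalText = item.text ?? '';
    } else if (event.type === 'turn.failed') {
      summary.stopReason = 'error';
    } else if (event.type === 'turn.completed') {
      summary.stopReason = event.reason === 'max_turns' ? 'budget' : 'natural';
      // A turn that ran no tools still counts as one step.
      if (summary.stepCount === 0) {
        summary.stepCount = 1;
        summary.stepBoundariesMs.push(clock.now() - clock.startedAt);
      }
    }
  }
  return summary;
}
```

Because the clock is injected, a test can feed deterministic timestamps, which is the same approach the `startedAt` / `now` options take in the test file above.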
19
packages/cli/test/context/llm/codex-isolation.test.ts
Normal file
@@ -0,0 +1,19 @@
+import { describe, expect, it } from 'vitest';
+import {
+  CODEX_ISOLATION_WARNING,
+  CODEX_ISOLATION_WARNING_FIX,
+  formatCodexIsolationWarning,
+} from '../../../src/context/llm/codex-isolation.js';
+
+describe('Codex isolation warning', () => {
+  it('documents the enforced and unenforced Codex isolation boundaries', () => {
+    expect(CODEX_ISOLATION_WARNING).toContain('runtime MCP server to the current ktx tool set');
+    expect(CODEX_ISOLATION_WARNING).toContain('disables Codex web search');
+    expect(CODEX_ISOLATION_WARNING).toContain('may still load user Codex config');
+    expect(CODEX_ISOLATION_WARNING).toContain('built-in command execution');
+    expect(CODEX_ISOLATION_WARNING_FIX).toContain('claude-code');
+    expect(formatCodexIsolationWarning()).toBe(
+      `${CODEX_ISOLATION_WARNING} ${CODEX_ISOLATION_WARNING_FIX}`,
+    );
+  });
+});

@@ -0,0 +1,73 @@
+import { describe, expect, it, vi } from 'vitest';
+import { z } from 'zod';
+import {
+  createCodexRuntimeMcpServer,
+  startCodexRuntimeMcpServer,
+} from '../../../src/context/llm/codex-mcp-runtime-server.js';
+
+describe('Codex runtime MCP server', () => {
+  it('registers runtime tools with markdown output', async () => {
+    const registered = new Map<
+      string,
+      {
+        config: { description?: string; inputSchema: unknown };
+        handler: (input: Record<string, unknown>) => Promise<unknown>;
+      }
+    >();
+    const server = createCodexRuntimeMcpServer({
+      server: {
+        registerTool(name, config, handler) {
+          registered.set(name, { config, handler });
+        },
+      },
+      toolSet: {
+        wiki_search: {
+          name: 'wiki_search',
+          description: 'Search the wiki',
+          inputSchema: z.object({ query: z.string() }),
+          execute: vi.fn(async () => ({ markdown: 'result markdown', structured: { matches: 1 } })),
+        },
+      },
+    });
+
+    expect(server).toBeDefined();
+    expect([...registered.keys()]).toEqual(['wiki_search']);
+    expect(registered.get('wiki_search')?.config).toMatchObject({
+      description: 'Search the wiki',
+    });
+    await expect(registered.get('wiki_search')?.handler({ query: 'revenue' })).resolves.toEqual({
+      content: [{ type: 'text', text: 'result markdown' }],
+      structuredContent: { matches: 1 },
+    });
+  });
+
+  it('starts loopback HTTP MCP with a bearer token and reports the runtime URL', async () => {
+    const close = vi.fn(async () => undefined);
+    const runServer = vi.fn(async () => ({
+      server: { address: () => ({ port: 4321 }) },
+      close,
+    }));
+
+    const handle = await startCodexRuntimeMcpServer({
+      projectDir: '/tmp/ktx-project',
+      toolSet: {},
+      runServer: runServer as never,
+    });
+
+    expect(handle.url).toBe('http://127.0.0.1:4321/mcp');
+    expect(handle.bearerTokenEnvVar).toBe('KTX_CODEX_RUNTIME_MCP_TOKEN');
+    expect(handle.bearerToken).toMatch(/^[a-f0-9]{64}$/);
+    expect(runServer).toHaveBeenCalledWith(
+      expect.objectContaining({
+        projectDir: '/tmp/ktx-project',
+        host: '127.0.0.1',
+        port: 0,
+        token: handle.bearerToken,
+        allowedHosts: ['127.0.0.1', 'localhost'],
+        allowedOrigins: [],
+      }),
+    );
+    await handle.close();
+    expect(close).toHaveBeenCalled();
+  });
+});

17
packages/cli/test/context/llm/codex-models.test.ts
Normal file
@ -0,0 +1,17 @@
import { describe, expect, it } from 'vitest';
import { resolveCodexModel } from '../../../src/context/llm/codex-models.js';

describe('resolveCodexModel', () => {
  it.each([
    ['codex', 'gpt-5.5'],
    ['default', 'gpt-5.5'],
    ['gpt-5.3-codex-spark', 'gpt-5.3-codex-spark'],
    ['gpt-5.4', 'gpt-5.4'],
  ])('maps %s to %s', (input, expected) => {
    expect(resolveCodexModel(input)).toBe(expected);
  });

  it.each(['', ' ', 'sonnet', 'claude-sonnet-4-6'])('rejects %s', (input) => {
    expect(() => resolveCodexModel(input)).toThrow('Unsupported Codex model');
  });
});
43
packages/cli/test/context/llm/codex-runtime-config.test.ts
Normal file
@ -0,0 +1,43 @@
import { describe, expect, it } from 'vitest';
import { buildCodexRuntimeConfig } from '../../../src/context/llm/codex-runtime-config.js';

describe('buildCodexRuntimeConfig', () => {
  it('builds generic config without SDK thread-option fields', () => {
    expect(buildCodexRuntimeConfig({ model: 'gpt-5.3-codex' })).toEqual({
      configOverrides: {
        history: { persistence: 'none' },
      },
      env: {},
    });
  });

  it('adds only the temporary ktx MCP server and exact enabled tools', () => {
    expect(
      buildCodexRuntimeConfig({
        model: 'gpt-5.3-codex',
        mcp: {
          url: 'http://127.0.0.1:4567/mcp',
          bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
          bearerToken: 'secret-token',
          toolNames: ['sl_read_source', 'wiki_search'],
        },
      }),
    ).toEqual({
      configOverrides: {
        history: { persistence: 'none' },
        mcp_servers: {
          ktx: {
            url: 'http://127.0.0.1:4567/mcp',
            bearer_token_env_var: 'KTX_CODEX_RUNTIME_MCP_TOKEN',
            enabled_tools: ['sl_read_source', 'wiki_search'],
            default_tools_approval_mode: 'approve',
            required: true,
          },
        },
      },
      env: {
        KTX_CODEX_RUNTIME_MCP_TOKEN: 'secret-token',
      },
    });
  });
});
460
packages/cli/test/context/llm/codex-runtime.test.ts
Normal file
@ -0,0 +1,460 @@
import { describe, expect, it, vi } from 'vitest';
import { z } from 'zod';
import {
  CodexKtxLlmRuntime,
  runCodexAuthProbe,
} from '../../../src/context/llm/codex-runtime.js';

async function* events(items: unknown[]) {
  for (const item of items) {
    yield item;
  }
}

function runner(items: unknown[]) {
  return {
    runStreamed: vi.fn(async () => events(items)),
  };
}

/** Yields the given events, then throws, mirroring the SDK throwing on a non-zero codex exec exit. */
function throwingRunner(items: unknown[], error: Error) {
  return {
    runStreamed: vi.fn(async () =>
      (async function* () {
        for (const item of items) {
          yield item;
        }
        throw error;
      })(),
    ),
  };
}

const MODEL_UNSUPPORTED_API_ERROR = JSON.stringify({
  type: 'error',
  status: 400,
  error: {
    type: 'invalid_request_error',
    message: "The 'gpt-5.3-codex' model is not supported when using Codex with a ChatGPT account.",
  },
});

function budgetRunner() {
  let observedSignal: AbortSignal | undefined;
  return {
    observedSignal: () => observedSignal,
    runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
      observedSignal = input.signal;
      return events([
        { type: 'turn.started' },
        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'in_progress' } },
        { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'first', status: 'completed' } },
        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'in_progress' } },
        { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'second', status: 'completed' } },
        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
      ]);
    }),
  };
}

describe('CodexKtxLlmRuntime', () => {
  it('generates text with the role-selected model and metrics', async () => {
    const onMetrics = vi.fn();
    const fakeRunner = runner([
      { type: 'turn.started' },
      { type: 'item.completed', item: { type: 'agent_message', text: 'hello' } },
      { type: 'turn.completed', usage: { input_tokens: 3, output_tokens: 4, total_tokens: 7 } },
    ]);
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex', triage: 'gpt-5.4' },
      runner: fakeRunner,
    });

    await expect(runtime.generateText({ role: 'triage', system: 'system', prompt: 'prompt', onMetrics })).resolves.toBe('hello');
    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
      expect.objectContaining({
        projectDir: '/tmp/project',
        model: 'gpt-5.4',
        prompt: 'system\n\nprompt',
      }),
    );
    expect(onMetrics).toHaveBeenCalledWith(expect.objectContaining({ usage: { inputTokens: 3, outputTokens: 4, totalTokens: 7 } }));
  });

  it('generates and validates structured output', async () => {
    const fakeRunner = runner([
      { type: 'turn.started' },
      { type: 'item.completed', item: { type: 'agent_message', text: '{"answer":"yes"}' } },
      { type: 'turn.completed' },
    ]);
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: fakeRunner,
    });

    await expect(
      runtime.generateObject({
        role: 'default',
        prompt: 'json',
        schema: z.object({ answer: z.string() }),
      }),
    ).resolves.toEqual({ answer: 'yes' });
    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
      expect.objectContaining({
        outputSchema: expect.objectContaining({ type: 'object' }),
      }),
    );
  });

  it('returns a structured-output error when Codex final text is invalid JSON', async () => {
    const fakeRunner = runner([
      { type: 'turn.started' },
      { type: 'item.completed', item: { type: 'agent_message', text: 'not json' } },
      { type: 'turn.completed' },
    ]);
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: fakeRunner,
    });

    await expect(
      runtime.generateObject({
        role: 'default',
        prompt: 'json',
        schema: z.object({ answer: z.string() }),
      }),
    ).rejects.toThrow('Codex structured output failed validation');
  });

  it('starts and closes a temporary MCP server for tool-backed agent loops', async () => {
    const close = vi.fn(async () => undefined);
    const startMcpServer = vi.fn(async () => ({
      url: 'http://127.0.0.1:4321/mcp',
      bearerTokenEnvVar: 'KTX_CODEX_RUNTIME_MCP_TOKEN' as const,
      bearerToken: 'token',
      close,
    }));
    const fakeRunner = runner([
      { type: 'turn.started' },
      { type: 'item.started', item: { type: 'mcp_tool_call', name: 'wiki_search' } },
      { type: 'item.completed', item: { type: 'agent_message', text: 'done' } },
      { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1, total_tokens: 2 } },
    ]);
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: fakeRunner,
      startMcpServer,
    });
    const onStepFinish = vi.fn();

    const result = await runtime.runAgentLoop({
      modelRole: 'default',
      systemPrompt: 'system',
      userPrompt: 'user',
      stepBudget: 5,
      telemetryTags: {},
      onStepFinish,
      toolSet: {
        aliased_wiki_tool: {
          name: 'wiki_search',
          description: 'Search wiki',
          inputSchema: z.object({ query: z.string() }),
          execute: vi.fn(),
        },
      },
    });

    expect(result.stopReason).toBe('natural');
    expect(result.metrics).toMatchObject({ stepCount: 1, usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 } });
    expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 5 });
    expect(startMcpServer).toHaveBeenCalledWith({ projectDir: '/tmp/project', toolSet: expect.any(Object) });
    expect(fakeRunner.runStreamed).toHaveBeenCalledWith(
      expect.objectContaining({
        env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'token' },
        configOverrides: expect.objectContaining({
          mcp_servers: expect.objectContaining({
            ktx: expect.objectContaining({
              url: 'http://127.0.0.1:4321/mcp',
              enabled_tools: ['wiki_search'],
              required: true,
            }),
          }),
        }),
      }),
    );
    expect(close).toHaveBeenCalled();
  });

  it('returns error stop reason on turn failure', async () => {
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: runner([{ type: 'turn.failed', error: { message: 'boom' } }]),
    });

    const result = await runtime.runAgentLoop({
      modelRole: 'default',
      systemPrompt: 'system',
      userPrompt: 'user',
      stepBudget: 5,
      telemetryTags: {},
      toolSet: {},
    });

    expect(result.stopReason).toBe('error');
    expect(result.error?.message).toBe('boom');
  });

  it('surfaces failed MCP tool calls as agent-loop errors', async () => {
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: runner([
        { type: 'turn.started' },
        { type: 'item.started', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'search', status: 'in_progress' } },
        {
          type: 'item.completed',
          item: {
            type: 'mcp_tool_call',
            server: 'ktx',
            tool: 'search',
            status: 'failed',
            error: { message: 'denied' },
          },
        },
        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
      ]),
    });

    const result = await runtime.runAgentLoop({
      modelRole: 'default',
      systemPrompt: 'system',
      userPrompt: 'user',
      stepBudget: 5,
      telemetryTags: {},
      toolSet: {},
    });

    expect(result.stopReason).toBe('error');
    expect(result.error?.message).toBe('Codex runtime tool call failed: search: denied');
    expect(result.metrics).toMatchObject({
      stepCount: 1,
      usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
    });
  });

  it('returns budget and aborts the Codex stream when local MCP step budget is reached', async () => {
    const fakeRunner = budgetRunner();
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: fakeRunner,
    });
    const onStepFinish = vi.fn();

    const result = await runtime.runAgentLoop({
      modelRole: 'default',
      systemPrompt: 'system',
      userPrompt: 'user',
      stepBudget: 1,
      telemetryTags: {},
      onStepFinish,
      toolSet: {
        first: {
          name: 'first',
          description: 'First tool',
          inputSchema: z.object({}),
          execute: vi.fn(),
        },
      },
    });

    expect(result.stopReason).toBe('budget');
    expect(result.error).toBeUndefined();
    expect(result.metrics).toMatchObject({ stepCount: 1 });
    expect(onStepFinish).toHaveBeenCalledTimes(1);
    expect(onStepFinish).toHaveBeenCalledWith({ stepIndex: 1, stepBudget: 1 });
    expect(fakeRunner.observedSignal()?.aborted).toBe(true);
  });

  it('counts built-in command_execution steps against the budget and aborts the stream', async () => {
    let observedSignal: AbortSignal | undefined;
    const fakeRunner = {
      observedSignal: () => observedSignal,
      runStreamed: vi.fn(async (input: { signal?: AbortSignal }) => {
        observedSignal = input.signal;
        return events([
          { type: 'turn.started' },
          { type: 'item.started', item: { type: 'command_execution', command: 'ls', status: 'in_progress' } },
          { type: 'item.completed', item: { type: 'command_execution', command: 'ls', status: 'completed', exit_code: 0 } },
          { type: 'item.started', item: { type: 'command_execution', command: 'cat a', status: 'in_progress' } },
          { type: 'item.completed', item: { type: 'command_execution', command: 'cat a', status: 'completed', exit_code: 0 } },
          { type: 'item.completed', item: { type: 'command_execution', command: 'cat b', status: 'completed', exit_code: 0 } },
          { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } },
        ]);
      }),
    };
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: fakeRunner,
    });
    const onStepFinish = vi.fn();

    const result = await runtime.runAgentLoop({
      modelRole: 'default',
      systemPrompt: 'system',
      userPrompt: 'user',
      stepBudget: 2,
      telemetryTags: {},
      onStepFinish,
      toolSet: {},
    });

    expect(result.stopReason).toBe('budget');
    expect(result.error).toBeUndefined();
    expect(result.metrics).toMatchObject({ stepCount: 2 });
    expect(onStepFinish).toHaveBeenCalledTimes(2);
    expect(onStepFinish).toHaveBeenLastCalledWith({ stepIndex: 2, stepBudget: 2 });
    expect(fakeRunner.observedSignal()?.aborted).toBe(true);
  });

  it('fires onStepFinish live as each step completes, before the stream drains', async () => {
    const order: string[] = [];
    async function* liveEvents() {
      yield { type: 'turn.started' };
      yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'a', status: 'completed' } };
      order.push('yielded-after-step-1');
      yield { type: 'item.completed', item: { type: 'mcp_tool_call', server: 'ktx', tool: 'b', status: 'completed' } };
      order.push('yielded-after-step-2');
      yield { type: 'item.completed', item: { type: 'agent_message', text: 'done' } };
      yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 1 } };
    }
    const fakeRunner = { runStreamed: vi.fn(async () => liveEvents()) };
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: fakeRunner,
    });

    const result = await runtime.runAgentLoop({
      modelRole: 'default',
      systemPrompt: 'system',
      userPrompt: 'user',
      stepBudget: 10,
      telemetryTags: {},
      onStepFinish: ({ stepIndex }) => {
        order.push(`step-${stepIndex}`);
      },
      toolSet: {},
    });

    expect(result.stopReason).toBe('natural');
    expect(result.metrics).toMatchObject({ stepCount: 2 });
    expect(order).toEqual(['step-1', 'yielded-after-step-1', 'step-2', 'yielded-after-step-2']);
  });

  it('surfaces the real Codex error event even when the SDK stream throws afterward', async () => {
    // The SDK yields the error/turn.failed events on stdout, then throws on the
    // non-zero exit. The masked exit message must not hide the real API error.
    const fakeRunner = throwingRunner(
      [
        { type: 'thread.started', thread_id: 't' },
        { type: 'turn.started' },
        { type: 'error', message: MODEL_UNSUPPORTED_API_ERROR },
        { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
      ],
      new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
    );
    const runtime = new CodexKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'codex' },
      runner: fakeRunner,
    });

    await expect(runtime.generateText({ role: 'default', prompt: 'hi' })).rejects.toThrow(
      'not supported when using Codex with a ChatGPT account',
    );
  });

  it('probes Codex authentication through a minimal non-interactive turn', async () => {
    const fakeRunner = runner([
      { type: 'turn.started' },
      { type: 'item.completed', item: { type: 'agent_message', text: 'ok' } },
      { type: 'turn.completed' },
    ]);

    await expect(
      runCodexAuthProbe({
        projectDir: '/tmp/project',
        model: 'codex',
        runner: fakeRunner,
      }),
    ).resolves.toEqual({ ok: true });
  });

  it('reports an unavailable model without blaming auth when Codex rejects the model', async () => {
    const fakeRunner = throwingRunner(
      [
        { type: 'turn.started' },
        { type: 'turn.failed', error: { message: MODEL_UNSUPPORTED_API_ERROR } },
      ],
      new Error('Codex Exec exited with code 1: Reading prompt from stdin...'),
    );

    const result = await runCodexAuthProbe({
      projectDir: '/tmp/project',
      model: 'gpt-5.3-codex',
      runner: fakeRunner,
    });

    expect(result.ok).toBe(false);
    if (!result.ok) {
      expect(result.message).not.toContain('authentication is not usable');
      expect(result.message).toContain('not available');
      expect(result.message).toContain('gpt-5.3-codex');
      expect(result.message).toContain('not supported when using Codex with a ChatGPT account');
      // A model-access failure must steer the user at the model config, not auth.
      expect(result.fix).toContain('llm.models.default');
      expect(result.fix).not.toContain('Authenticate Codex');
    }
  });

  it('reports an auth failure when Codex exits without an error event', async () => {
    const fakeRunner = throwingRunner(
      [],
      new Error('Codex Exec exited with code 1: Not logged in. Run `codex login`.'),
    );

    const result = await runCodexAuthProbe({
      projectDir: '/tmp/project',
      model: 'gpt-5.5',
      runner: fakeRunner,
    });

    expect(result.ok).toBe(false);
    if (!result.ok) {
      expect(result.message).toContain('authentication is not usable');
      expect(result.message).toContain('Not logged in');
      expect(result.fix).toContain('Authenticate Codex');
    }
  });

  it('rejects an unsupported model id before probing, steering at llm.models.default', async () => {
    const result = await runCodexAuthProbe({
      projectDir: '/tmp/project',
      model: 'not-a-real-model',
    });

    expect(result.ok).toBe(false);
    if (!result.ok) {
      expect(result.message).toContain('Unsupported Codex model');
      expect(result.fix).toContain('llm.models.default');
    }
  });
});
97
packages/cli/test/context/llm/codex-sdk-runner.test.ts
Normal file
@ -0,0 +1,97 @@
import { describe, expect, it, vi } from 'vitest';

const sdkMock = vi.hoisted(() => {
  const events = (async function* () {
    yield { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } };
  })();
  const runStreamed = vi.fn(async () => ({ events }));
  const startThread = vi.fn(() => ({ runStreamed }));
  const Codex = vi.fn(function Codex(this: { startThread: typeof startThread }, options?: unknown) {
    Object.assign(this, { options, startThread });
  });
  return { Codex, startThread, runStreamed };
});

vi.mock('@openai/codex-sdk', () => ({ Codex: sdkMock.Codex }));

import { CodexSdkCliRunner } from '../../../src/context/llm/codex-sdk-runner.js';

async function collectAsync<T>(items: AsyncIterable<T>): Promise<T[]> {
  const collected: T[] = [];
  for await (const item of items) {
    collected.push(item);
  }
  return collected;
}

describe('CodexSdkCliRunner', () => {
  it('passes isolated env through the SDK and runtime controls through thread options', async () => {
    const runner = new CodexSdkCliRunner({
      envBase: {
        HOME: '/home/ktx-user',
        PATH: '/usr/local/bin:/usr/bin',
        CODEX_HOME: '/home/ktx-user/.codex',
        HTTPS_PROXY: 'http://proxy.example',
        KTX_UNRELATED_SECRET: 'must-not-copy', // pragma: allowlist secret
      },
    });
    const previousToken = process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
    process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = 'outer-token';
    const outputSchema = {
      type: 'object',
      properties: { answer: { type: 'string' } },
      required: ['answer'],
      additionalProperties: false,
    };
    const controller = new AbortController();

    try {
      const events = await runner.runStreamed({
        projectDir: '/tmp/ktx-project',
        model: 'gpt-5.3-codex',
        prompt: 'Return JSON.',
        configOverrides: {
          history: { persistence: 'none' },
        },
        env: { KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token' },
        outputSchema,
        signal: controller.signal,
      });

      expect(sdkMock.Codex).toHaveBeenCalledWith({
        config: {
          history: { persistence: 'none' },
        },
        env: {
          HOME: '/home/ktx-user',
          PATH: '/usr/local/bin:/usr/bin',
          CODEX_HOME: '/home/ktx-user/.codex',
          HTTPS_PROXY: 'http://proxy.example',
          KTX_CODEX_RUNTIME_MCP_TOKEN: 'run-token',
        },
      });
      expect(process.env.KTX_CODEX_RUNTIME_MCP_TOKEN).toBe('outer-token');
      expect(sdkMock.startThread).toHaveBeenCalledWith({
        workingDirectory: '/tmp/ktx-project',
        skipGitRepoCheck: true,
        model: 'gpt-5.3-codex',
        sandboxMode: 'read-only',
        webSearchMode: 'disabled',
        approvalPolicy: 'never',
      });
      expect(sdkMock.runStreamed).toHaveBeenCalledWith('Return JSON.', {
        outputSchema,
        signal: controller.signal,
      });
      await expect(collectAsync(events)).resolves.toEqual([
        { type: 'turn.completed', usage: { input_tokens: 1, output_tokens: 2 } },
      ]);
    } finally {
      if (previousToken === undefined) {
        delete process.env.KTX_CODEX_RUNTIME_MCP_TOKEN;
      } else {
        process.env.KTX_CODEX_RUNTIME_MCP_TOKEN = previousToken;
      }
    }
  });
});
@ -22,4 +22,25 @@ describe('local KTX LLM runtime config', () => {
       }),
     ).toBeNull();
   });
+
+  it('creates a Codex runtime for codex backend without creating an AI SDK provider', () => {
+    const runtime = createLocalKtxLlmRuntimeFromConfig(
+      {
+        provider: { backend: 'codex' },
+        models: { default: 'codex', triage: 'gpt-5.4' },
+      },
+      { env: {}, projectDir: '/tmp/project', createCodexRuntime: vi.fn((deps) => ({ deps }) as never) },
+    );
+
+    expect(runtime).toMatchObject({ deps: expect.objectContaining({ projectDir: '/tmp/project' }) });
+  });
+
+  it('returns null from the AI SDK provider factory for codex backend', () => {
+    expect(
+      createLocalKtxLlmProviderFromConfig({
+        provider: { backend: 'codex' },
+        models: { default: 'codex' },
+      }),
+    ).toBeNull();
+  });
 });
@ -231,6 +231,31 @@ llm:
     });
   });
 
+  it('parses Codex as a first-class LLM backend', () => {
+    const config = parseKtxProjectConfig(`
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.3-codex
+    triage: gpt-5.3-codex
+    candidateExtraction: gpt-5.3-codex
+    curator: gpt-5.3-codex
+    reconcile: gpt-5.3-codex
+    repair: gpt-5.3-codex
+`);
+
+    expect(config.llm.provider.backend).toBe('codex');
+    expect(config.llm.models).toEqual({
+      default: 'gpt-5.3-codex',
+      triage: 'gpt-5.3-codex',
+      candidateExtraction: 'gpt-5.3-codex',
+      curator: 'gpt-5.3-codex',
+      reconcile: 'gpt-5.3-codex',
+      repair: 'gpt-5.3-codex',
+    });
+  });
+
   it('parses gateway LLM, OpenAI scan embeddings, and sentence-transformers ingest embeddings', () => {
     const config = parseKtxProjectConfig(`
 llm:
@ -530,7 +555,7 @@ describe('generateKtxProjectConfigJsonSchema', () => {
     const llm = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).llm;
     const provider = llm?.properties?.provider as { properties?: Record<string, unknown> };
     const backend = provider?.properties?.backend as { enum?: readonly string[] };
-    expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code']);
+    expect(backend?.enum).toEqual(['none', 'anthropic', 'vertex', 'gateway', 'claude-code', 'codex']);
 
     const storage = (schema.properties as Record<string, { properties?: Record<string, unknown> }>).storage;
     const state = storage?.properties?.state as { enum?: readonly string[] };
@ -422,6 +422,8 @@ describe('runKtxDoctor', () => {
         'llm:',
         '  provider:',
         '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
         '',
       ].join('\n'),
       'utf-8',
@ -543,6 +545,8 @@ describe('runKtxDoctor', () => {
         'llm:',
         '  provider:',
         '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
         'ingest:',
         '  adapters:',
         '    - live-database',
@ -652,6 +656,8 @@ describe('runKtxDoctor', () => {
         'llm:',
         '  provider:',
         '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
         '',
       ].join('\n'),
       'utf-8',
@ -698,6 +704,8 @@ describe('runKtxDoctor', () => {
         'llm:',
         '  provider:',
         '    backend: anthropic',
+        '  models:',
+        '    default: claude-sonnet-4-5',
         'ingest:',
         '  adapters:',
         '    - live-database',
@ -337,10 +337,13 @@ describe('runKtxIngest', () => {
|
||||||
|
|
||||||
expect(runIo.stdout()).toBe('');
|
expect(runIo.stdout()).toBe('');
|
||||||
expect(runIo.stderr()).toContain(
|
expect(runIo.stderr()).toContain(
|
||||||
'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, or claude-code, or an injected agentRunner.',
|
'ktx ingest requires llm.provider.backend: anthropic, vertex, gateway, claude-code, or codex, or an injected agentRunner.',
|
||||||
);
|
);
|
||||||
expect(runIo.stderr()).toContain('Configure a local Claude Code session or API-backed LLM, then rerun ingest:');
|
expect(runIo.stderr()).toContain('Configure a local Claude Code/Codex session or API-backed LLM, then rerun ingest:');
|
||||||
expect(runIo.stderr()).toContain(`ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`);
|
expect(runIo.stderr()).toContain(`ktx setup --project-dir ${projectDir} --llm-backend claude-code --no-input`);
|
||||||
|
expect(runIo.stderr()).toContain(
|
||||||
|
`ktx setup --project-dir ${projectDir} --llm-backend codex --llm-model gpt-5.5 --no-input`,
|
||||||
|
);
|
||||||
expect(runIo.stderr()).toContain(
|
expect(runIo.stderr()).toContain(
|
||||||
`ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
|
`ktx setup --project-dir ${projectDir} --llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY --llm-model claude-sonnet-4-6 --no-input`,
|
||||||
);
|
);
|
||||||
|
|
|
||||||
|
|
@ -312,4 +312,13 @@ describe('createKtxLlmProvider', () => {
|
||||||
}),
|
}),
|
||||||
).toThrow('claude-code is not an AI SDK LanguageModel backend');
|
).toThrow('claude-code is not an AI SDK LanguageModel backend');
|
||||||
});
|
});
|
||||||
|
|
||||||
|
it('rejects codex as an AI SDK LanguageModel backend', () => {
|
||||||
|
expect(() =>
|
||||||
|
createKtxLlmProvider({
|
||||||
|
backend: 'codex',
|
||||||
|
modelSlots: { default: 'gpt-5.3-codex' },
|
||||||
|
}),
|
||||||
|
).toThrow('codex is not an AI SDK LanguageModel backend');
|
||||||
|
});
|
||||||
});
|
});
|
||||||
|
|
|
||||||
|
|
@ -66,6 +66,7 @@ function makePromptAdapter(options: {
|
||||||
nextProviderChoice === 'anthropic' ||
|
nextProviderChoice === 'anthropic' ||
|
||||||
nextProviderChoice === 'vertex' ||
|
nextProviderChoice === 'vertex' ||
|
||||||
nextProviderChoice === 'claude-code' ||
|
nextProviderChoice === 'claude-code' ||
|
||||||
|
nextProviderChoice === 'codex' ||
|
||||||
nextProviderChoice === 'back'
|
nextProviderChoice === 'back'
|
||||||
) {
|
) {
|
||||||
return selectValues.shift() ?? nextProviderChoice;
|
return selectValues.shift() ?? nextProviderChoice;
|
||||||
|
|
@ -183,6 +184,7 @@ describe('setup Anthropic model step', () => {
|
||||||
message: expect.stringContaining('Which LLM provider should KTX use?'),
|
message: expect.stringContaining('Which LLM provider should KTX use?'),
|
||||||
options: [
|
options: [
|
||||||
{ value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
|
{ value: 'claude-code', label: 'Claude subscription (Pro/Max)' },
|
||||||
|
{ value: 'codex', label: 'Codex subscription' },
|
||||||
{ value: 'anthropic', label: 'Anthropic API key' },
|
{ value: 'anthropic', label: 'Anthropic API key' },
|
||||||
{ value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
|
{ value: 'vertex', label: 'Google Vertex AI for Anthropic Claude' },
|
||||||
{ value: 'back', label: 'Back' },
|
{ value: 'back', label: 'Back' },
|
||||||
|
|
@ -215,6 +217,85 @@ describe('setup Anthropic model step', () => {
|
||||||
expect(authProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'sonnet' }));
|
expect(authProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'sonnet' }));
|
||||||
});
|
});
|
||||||
|
|
||||||
|
it('configures Codex backend and validates local auth', async () => {
|
||||||
|
const io = makeIo();
|
||||||
|
const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
|
||||||
|
|
||||||
|
const result = await runKtxSetupAnthropicModelStep(
|
||||||
|
{
|
||||||
|
projectDir: tempDir,
|
||||||
|
inputMode: 'disabled',
|
||||||
|
llmBackend: 'codex',
|
||||||
|
llmModel: 'gpt-5.5',
|
||||||
|
skipLlm: false,
|
||||||
|
},
|
||||||
|
io.io,
|
||||||
|
{ codexAuthProbe },
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(result.status).toBe('ready');
|
||||||
|
const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
|
||||||
|
expect(config.llm).toMatchObject({
|
||||||
|
provider: { backend: 'codex' },
|
||||||
|
models: { default: 'gpt-5.5' },
|
||||||
|
});
|
||||||
|
expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
|
||||||
|
// The warning carries the clack gutter so it renders inside the setup frame.
|
||||||
|
expect(io.stderr()).toContain('│ Codex backend isolation is limited');
|
||||||
|
expect(io.stderr()).toContain('may still load user Codex config');
|
||||||
|
});
|
||||||
|
|
||||||
|
it('defaults the Codex model to gpt-5.5 when none is provided non-interactively', async () => {
|
||||||
|
const io = makeIo();
|
||||||
|
const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
|
||||||
|
|
||||||
|
const result = await runKtxSetupAnthropicModelStep(
|
||||||
|
{
|
||||||
|
projectDir: tempDir,
|
||||||
|
inputMode: 'disabled',
|
||||||
|
llmBackend: 'codex',
|
||||||
|
skipLlm: false,
|
||||||
|
},
|
||||||
|
io.io,
|
||||||
|
{ codexAuthProbe },
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(result.status).toBe('ready');
|
||||||
|
const config = parseKtxProjectConfig(await readFile(join(tempDir, 'ktx.yaml'), 'utf-8'));
|
||||||
|
expect(config.llm).toMatchObject({
|
||||||
|
provider: { backend: 'codex' },
|
||||||
|
models: { default: 'gpt-5.5' },
|
||||||
|
});
|
||||||
|
expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ projectDir: tempDir, model: 'gpt-5.5' }));
|
||||||
|
});
|
||||||
|
|
||||||
|
it('offers the curated Codex models during interactive setup', async () => {
|
||||||
|
const io = makeIo();
|
||||||
|
const prompts = makePromptAdapter({ selectValues: ['codex', 'gpt-5.5'] });
|
||||||
|
const codexAuthProbe = vi.fn(async () => ({ ok: true as const }));
|
||||||
|
|
||||||
|
const result = await runKtxSetupAnthropicModelStep(
|
||||||
|
{ projectDir: tempDir, inputMode: 'auto', skipLlm: false },
|
||||||
|
io.io,
|
||||||
|
{ prompts, codexAuthProbe },
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(result.status).toBe('ready');
|
||||||
|
expect(prompts.select).toHaveBeenCalledWith(
|
||||||
|
expect.objectContaining({
|
||||||
|
message: expect.stringContaining('Which Codex model should KTX use?'),
|
||||||
|
options: [
|
||||||
|
{ value: 'gpt-5.5', label: 'GPT-5.5', hint: 'recommended' },
|
||||||
|
{ value: 'gpt-5.4', label: 'GPT-5.4' },
|
||||||
|
{ value: 'gpt-5.4-mini', label: 'GPT-5.4 mini' },
|
||||||
|
{ value: 'manual', label: 'Enter a Codex model ID manually' },
|
||||||
|
{ value: 'back', label: 'Back' },
|
||||||
|
],
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
expect(codexAuthProbe).toHaveBeenCalledWith(expect.objectContaining({ model: 'gpt-5.5' }));
|
||||||
|
});
|
||||||
|
|
||||||
it('prompts for the Claude Code model during interactive setup', async () => {
|
it('prompts for the Claude Code model during interactive setup', async () => {
|
||||||
const io = makeIo();
|
const io = makeIo();
|
||||||
const prompts = makePromptAdapter({ selectValues: ['claude-code', 'opus'] });
|
const prompts = makePromptAdapter({ selectValues: ['claude-code', 'opus'] });
|
||||||
|
|
|
||||||
|
|
@ -44,6 +44,17 @@ function withClaudeCodeLlm(config: KtxProjectConfig): KtxProjectConfig {
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function withCodexLlm(config: KtxProjectConfig): KtxProjectConfig {
|
||||||
|
return {
|
||||||
|
...config,
|
||||||
|
llm: {
|
||||||
|
...config.llm,
|
||||||
|
provider: { backend: 'codex' },
|
||||||
|
models: { ...config.llm.models, default: 'gpt-5.5' },
|
||||||
|
},
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
function baseProjectConfig(): KtxProjectConfig {
|
function baseProjectConfig(): KtxProjectConfig {
|
||||||
return withClaudeCodeLlm(buildDefaultKtxProjectConfig());
|
return withClaudeCodeLlm(buildDefaultKtxProjectConfig());
|
||||||
}
|
}
|
||||||
|
|
@ -391,6 +402,126 @@ describe('buildProjectStatus --fast', () => {
|
||||||
});
|
});
|
||||||
});
|
});
|
||||||
|
|
||||||
|
describe('buildProjectStatus codex', () => {
|
||||||
|
it('reports authenticated local Codex session', async () => {
|
||||||
|
const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
|
||||||
|
const status = await buildProjectStatus(project, {
|
||||||
|
codexAuthProbe: async () => ({ ok: true as const }),
|
||||||
|
});
|
||||||
|
|
||||||
|
expect(status.llm).toMatchObject({
|
||||||
|
backend: 'codex',
|
||||||
|
model: 'gpt-5.5',
|
||||||
|
status: 'ok',
|
||||||
|
detail: 'local Codex session authenticated',
|
||||||
|
});
|
||||||
|
expect(status.warnings).toEqual(
|
||||||
|
expect.arrayContaining([
|
||||||
|
expect.objectContaining({
|
||||||
|
message: expect.stringContaining('Codex backend isolation is limited'),
|
||||||
|
fix: expect.stringContaining('claude-code'),
|
||||||
|
}),
|
||||||
|
]),
|
||||||
|
);
|
||||||
|
const rendered = renderProjectStatus(status, { verbose: false, useColor: false });
|
||||||
|
expect(rendered).toContain('Codex backend isolation is limited');
|
||||||
|
});
|
||||||
|
|
||||||
|
it('skips Codex auth probe with --fast', async () => {
|
||||||
|
let probeCalls = 0;
|
||||||
|
const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
|
||||||
|
const status = await buildProjectStatus(project, {
|
||||||
|
fast: true,
|
||||||
|
codexAuthProbe: async () => {
|
||||||
|
probeCalls += 1;
|
||||||
|
return { ok: true };
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
|
expect(probeCalls).toBe(0);
|
||||||
|
expect(status.llm.status).toBe('skipped');
|
||||||
|
expect(status.llm.detail).toMatch(/--fast/);
|
||||||
|
});
|
||||||
|
|
||||||
|
it('surfaces the probe fix for a model-access failure instead of an auth fix', async () => {
|
||||||
|
const project = projectWithConfig(withCodexLlm(buildDefaultKtxProjectConfig()));
|
||||||
|
const status = await buildProjectStatus(project, {
|
||||||
|
codexAuthProbe: async () => ({
|
||||||
|
ok: false,
|
||||||
|
message: 'Codex is authenticated, but the configured model "gpt-5.5" is not available...',
|
||||||
|
fix: 'Run `codex` to see the models your account supports, then set llm.models.default in ktx.yaml (or rerun `ktx setup`).',
|
||||||
|
}),
|
||||||
|
});
|
||||||
|
|
||||||
|
expect(status.llm.status).toBe('fail');
|
||||||
|
expect(status.llm.fix).toContain('llm.models.default');
|
||||||
|
expect(status.llm.fix).not.toContain('Authenticate Codex');
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
describe('buildProjectStatus llm models.default requirement', () => {
|
||||||
|
function withBackendNoModel(
|
||||||
|
backend: KtxProjectConfig['llm']['provider']['backend'],
|
||||||
|
): KtxProjectConfig {
|
||||||
|
const config = buildDefaultKtxProjectConfig();
|
||||||
|
return {
|
||||||
|
...config,
|
||||||
|
llm: { ...config.llm, provider: { backend }, models: {} },
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
it('fails codex without llm.models.default and never probes', async () => {
|
||||||
|
let probeCalls = 0;
|
||||||
|
const project = projectWithConfig(withBackendNoModel('codex'));
|
||||||
|
const status = await buildProjectStatus(project, {
|
||||||
|
codexAuthProbe: async () => {
|
||||||
|
probeCalls += 1;
|
||||||
|
return { ok: true };
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
|
expect(probeCalls).toBe(0);
|
||||||
|
expect(status.llm.status).toBe('fail');
|
||||||
|
expect(status.llm.detail).toContain('llm.models.default');
|
||||||
|
expect(status.verdict).toBe('blocked');
|
||||||
|
});
|
||||||
|
|
||||||
|
it('fails claude-code without llm.models.default and never probes', async () => {
|
||||||
|
let probeCalls = 0;
|
||||||
|
const project = projectWithConfig(withBackendNoModel('claude-code'));
|
||||||
|
const status = await buildProjectStatus(project, {
|
||||||
|
claudeCodeAuthProbe: async () => {
|
||||||
|
probeCalls += 1;
|
||||||
|
return { ok: true };
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
|
expect(probeCalls).toBe(0);
|
||||||
|
expect(status.llm.status).toBe('fail');
|
||||||
|
expect(status.llm.detail).toContain('llm.models.default');
|
||||||
|
expect(status.verdict).toBe('blocked');
|
||||||
|
});
|
||||||
|
|
||||||
|
it('fails anthropic without llm.models.default even when the key is set', async () => {
|
||||||
|
const config = withBackendNoModel('anthropic');
|
||||||
|
const project = projectWithConfig({
|
||||||
|
...config,
|
||||||
|
llm: {
|
||||||
|
...config.llm,
|
||||||
|
provider: { backend: 'anthropic', anthropic: { api_key: 'env:ANTHROPIC_API_KEY' } }, // pragma: allowlist secret
|
||||||
|
models: {},
|
||||||
|
},
|
||||||
|
});
|
||||||
|
const status = await buildProjectStatus(project, {
|
||||||
|
env: { ANTHROPIC_API_KEY: 'sk-test' }, // pragma: allowlist secret
|
||||||
|
});
|
||||||
|
|
||||||
|
expect(status.llm.status).toBe('fail');
|
||||||
|
expect(status.llm.detail).toContain('llm.models.default');
|
||||||
|
expect(status.verdict).toBe('blocked');
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
describe('buildLocalStatsStatus', () => {
|
describe('buildLocalStatsStatus', () => {
|
||||||
let tempDir: string;
|
let tempDir: string;
|
||||||
|
|
||||||
|
|
|
||||||
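Review note: the status tests above pin down an ordering in which a missing `llm.models.default` fails the check without probing, `--fast` skips the probe, and a failed probe passes its own `fix` through instead of a generic auth hint. A minimal sketch of that decision order follows. The names `planLlmCheck` and `statusFromProbe` and their return shapes are hypothetical and are not taken from the ktx source. The relative order of the missing-model and `--fast` checks is also an assumption, since the tests never combine the two.

```javascript
// Hypothetical sketch only: illustrates the ordering the tests assert,
// not the actual buildProjectStatus implementation.
function planLlmCheck(llm, options = {}) {
  const model = llm.models?.default;
  if (!model) {
    // Missing model: fail immediately, so the auth probe is never invoked.
    return {
      action: 'fail',
      detail: `llm.models.default is required for backend ${llm.provider.backend}`,
    };
  }
  if (options.fast) {
    // --fast: report the check as skipped without calling the probe.
    return { action: 'skip', detail: 'auth probe skipped (--fast)' };
  }
  return { action: 'probe', model };
}

// Map a probe result to a status, keeping the probe's own fix text.
function statusFromProbe(result) {
  if (result.ok) {
    return { status: 'ok' };
  }
  return { status: 'fail', detail: result.message, fix: result.fix };
}

const missingModel = planLlmCheck({ provider: { backend: 'codex' }, models: {} });
const fastRun = planLlmCheck(
  { provider: { backend: 'codex' }, models: { default: 'gpt-5.5' } },
  { fast: true },
);
const modelAccessFailure = statusFromProbe({
  ok: false,
  message: 'configured model is not available',
  fix: 'set llm.models.default in ktx.yaml',
});
console.log(missingModel.action, fastRun.action, modelAccessFailure.status);
```

Keeping the planning step separate from the probe call makes the "never probes" assertions easy to check, because the probe is only reachable through the `probe` action.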
79
pnpm-lock.yaml
generated
|
|
@ -158,6 +158,9 @@ importers:
|
||||||
'@notionhq/client':
|
'@notionhq/client':
|
||||||
specifier: ^5.22.0
|
specifier: ^5.22.0
|
||||||
version: 5.22.0
|
version: 5.22.0
|
||||||
|
'@openai/codex-sdk':
|
||||||
|
specifier: ^0.133.0
|
||||||
|
version: 0.133.0
|
||||||
ai:
|
ai:
|
||||||
specifier: ^6.0.188
|
specifier: ^6.0.188
|
||||||
version: 6.0.188(zod@4.4.3)
|
version: 6.0.188(zod@4.4.3)
|
||||||
|
|
@ -1288,6 +1291,51 @@ packages:
|
||||||
'@octokit/types@16.0.0':
|
'@octokit/types@16.0.0':
|
||||||
resolution: {integrity: sha512-sKq+9r1Mm4efXW1FCk7hFSeJo4QKreL/tTbR0rz/qx/r1Oa2VV83LTA/H/MuCOX7uCIJmQVRKBcbmWoySjAnSg==}
|
resolution: {integrity: sha512-sKq+9r1Mm4efXW1FCk7hFSeJo4QKreL/tTbR0rz/qx/r1Oa2VV83LTA/H/MuCOX7uCIJmQVRKBcbmWoySjAnSg==}
|
||||||
|
|
||||||
|
'@openai/codex-sdk@0.133.0':
|
||||||
|
resolution: {integrity: sha512-PB82D/1Q0C7nzaV5O+1O4y5LcVwiUvxyHvCUTfz8Cwztv6bOWQ40gFHE5ZFX1EFPJx1cMV0GPVODWuXIKAuayQ==}
|
||||||
|
engines: {node: '>=18'}
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0':
|
||||||
|
resolution: {integrity: sha512-Gh42kLLBo/6gpnHmDzUWDVvyS57ekCB1+1Dz0RG2oIl3Lhk1uwrjSj/PwaJWWh4Rw/rUp1RqkwrMugFfFEOlqQ==}
|
||||||
|
engines: {node: '>=16'}
|
||||||
|
hasBin: true
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-darwin-arm64':
|
||||||
|
resolution: {integrity: sha512-W7f8+DckLujnqGlptKCzgJU+ooeHKMuk6KYgMFP6A9asn7YUsGUgJqjiBaX8oNcXO6w/pTbKGRARx1kCNS8lIg==}
|
||||||
|
engines: {node: '>=16'}
|
||||||
|
cpu: [arm64]
|
||||||
|
os: [darwin]
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-darwin-x64':
|
||||||
|
resolution: {integrity: sha512-Ek8ikvLOiXZ8emcIJVBXxK6fm8ratBy0kaEt3JNisTNszxGshUHf/R4xxDxIyKNcUkYYXjW7A/rMwW3iu3OFlg==}
|
||||||
|
engines: {node: '>=16'}
|
||||||
|
cpu: [x64]
|
||||||
|
os: [darwin]
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-linux-arm64':
|
||||||
|
resolution: {integrity: sha512-uKXYYSJ3mY16sp4hcG/4BMNRjva/ZS4oARiI1+7k8+NiuoAhdCGWNe5u4KJ3sMuL3tp/IXcmc6B56EFX1+WDBQ==}
|
||||||
|
engines: {node: '>=16'}
|
||||||
|
cpu: [arm64]
|
||||||
|
os: [linux]
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-linux-x64':
|
||||||
|
resolution: {integrity: sha512-9YfyqrfUj/UZ2+aXE4zBz47t6RXbVni95ZorGsNh857vxYK/asVpUtR2cymo9lB3JaI4mQaKFfV/t7IRItqkuA==}
|
||||||
|
engines: {node: '>=16'}
|
||||||
|
cpu: [x64]
|
||||||
|
os: [linux]
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-win32-arm64':
|
||||||
|
resolution: {integrity: sha512-mRzND0PSGHRoLk0X41GTSoc3tFjZSF4HgDlfjU5fiQcWVi0/kLb7Ku6/tPFT/X2hOLa3YdJkbIcHC0Hc9ni80g==}
|
||||||
|
engines: {node: '>=16'}
|
||||||
|
cpu: [arm64]
|
||||||
|
os: [win32]
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-win32-x64':
|
||||||
|
resolution: {integrity: sha512-u3ji78DIPZCGJeELuovsAnaZH+vK9gsA4F6M1y+Uy2s80Sz7/i1S0KL81qGReYji3urSjgBpkQuNP47GXOqxrQ==}
|
||||||
|
engines: {node: '>=16'}
|
||||||
|
cpu: [x64]
|
||||||
|
os: [win32]
|
||||||
|
|
||||||
'@opentelemetry/api@1.9.1':
|
'@opentelemetry/api@1.9.1':
|
||||||
resolution: {integrity: sha512-gLyJlPHPZYdAk1JENA9LeHejZe1Ti77/pTeFm/nMXmQH/HFZlcS/O2XJB+L8fkbrNSqhdtlvjBVjxwUYanNH5Q==}
|
resolution: {integrity: sha512-gLyJlPHPZYdAk1JENA9LeHejZe1Ti77/pTeFm/nMXmQH/HFZlcS/O2XJB+L8fkbrNSqhdtlvjBVjxwUYanNH5Q==}
|
||||||
engines: {node: '>=8.0.0'}
|
engines: {node: '>=8.0.0'}
|
||||||
|
|
@ -7145,6 +7193,37 @@ snapshots:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@octokit/openapi-types': 27.0.0
|
'@octokit/openapi-types': 27.0.0
|
||||||
|
|
||||||
|
'@openai/codex-sdk@0.133.0':
|
||||||
|
dependencies:
|
||||||
|
'@openai/codex': 0.133.0
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0':
|
||||||
|
optionalDependencies:
|
||||||
|
'@openai/codex-darwin-arm64': '@openai/codex@0.133.0-darwin-arm64'
|
||||||
|
'@openai/codex-darwin-x64': '@openai/codex@0.133.0-darwin-x64'
|
||||||
|
'@openai/codex-linux-arm64': '@openai/codex@0.133.0-linux-arm64'
|
||||||
|
'@openai/codex-linux-x64': '@openai/codex@0.133.0-linux-x64'
|
||||||
|
'@openai/codex-win32-arm64': '@openai/codex@0.133.0-win32-arm64'
|
||||||
|
'@openai/codex-win32-x64': '@openai/codex@0.133.0-win32-x64'
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-darwin-arm64':
|
||||||
|
optional: true
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-darwin-x64':
|
||||||
|
optional: true
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-linux-arm64':
|
||||||
|
optional: true
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-linux-x64':
|
||||||
|
optional: true
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-win32-arm64':
|
||||||
|
optional: true
|
||||||
|
|
||||||
|
'@openai/codex@0.133.0-win32-x64':
|
||||||
|
optional: true
|
||||||
|
|
||||||
'@opentelemetry/api@1.9.1': {}
|
'@opentelemetry/api@1.9.1': {}
|
||||||
|
|
||||||
'@orama/orama@3.1.18': {}
|
'@orama/orama@3.1.18': {}
|
||||||
|
|
|
||||||
160
scripts/codex-backend-live-smoke.mjs
Normal file
|
|
@ -0,0 +1,160 @@
|
||||||
|
import { execFile } from 'node:child_process';
|
||||||
|
import { mkdtemp, rm } from 'node:fs/promises';
|
||||||
|
import { tmpdir } from 'node:os';
|
||||||
|
import { dirname, join, resolve } from 'node:path';
|
||||||
|
import { fileURLToPath, pathToFileURL } from 'node:url';
|
||||||
|
import { promisify } from 'node:util';
|
||||||
|
|
||||||
|
const execFileAsync = promisify(execFile);
|
||||||
|
const SCRIPT_DIR = dirname(fileURLToPath(import.meta.url));
|
||||||
|
const ROOT_DIR = resolve(SCRIPT_DIR, '..');
|
||||||
|
const OPT_IN_MESSAGE =
|
||||||
|
'Set KTX_RUN_CODEX_BACKEND_SMOKE=1 or pass --force to run the Codex backend live smoke.';
|
||||||
|
|
||||||
|
export function codexBackendSmokeOptIn(env = process.env, args = process.argv.slice(2)) {
|
||||||
|
if (env.KTX_RUN_CODEX_BACKEND_SMOKE === '1' || args.includes('--force')) {
|
||||||
|
return { run: true };
|
||||||
|
}
|
||||||
|
return { run: false, message: OPT_IN_MESSAGE };
|
||||||
|
}
|
||||||
|
|
||||||
|
async function run(command, args, options = {}) {
|
||||||
|
process.stdout.write(`$ ${command} ${args.join(' ')}\n`);
|
||||||
|
try {
|
||||||
|
const result = await execFileAsync(command, args, {
|
||||||
|
cwd: options.cwd ?? ROOT_DIR,
|
||||||
|
env: { ...process.env, ...(options.env ?? {}) },
|
||||||
|
encoding: 'utf8',
|
||||||
|
maxBuffer: 1024 * 1024 * 20,
|
||||||
|
timeout: options.timeoutMs ?? 300_000,
|
||||||
|
});
|
||||||
|
if (result.stdout) {
|
||||||
|
process.stdout.write(result.stdout);
|
||||||
|
}
|
||||||
|
if (result.stderr) {
|
||||||
|
process.stderr.write(result.stderr);
|
||||||
|
}
|
||||||
|
return { code: 0, stdout: result.stdout, stderr: result.stderr };
|
||||||
|
} catch (error) {
|
||||||
|
const stdout = typeof error.stdout === 'string' ? error.stdout : '';
|
||||||
|
const stderr = typeof error.stderr === 'string' ? error.stderr : error.message;
|
||||||
|
if (stdout) {
|
||||||
|
process.stdout.write(stdout);
|
||||||
|
}
|
||||||
|
if (stderr) {
|
||||||
|
process.stderr.write(stderr);
|
||||||
|
}
|
||||||
|
return {
|
||||||
|
code: typeof error.code === 'number' ? error.code : 1,
|
||||||
|
stdout,
|
||||||
|
stderr,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function requireSuccess(label, result) {
|
||||||
|
if (result.code !== 0) {
|
||||||
|
throw new Error(`${label} failed with code ${result.code}\nstdout:\n${result.stdout}\nstderr:\n${result.stderr}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function runSetupSmoke(projectDir) {
|
||||||
|
const result = await run(
|
||||||
|
'node',
|
||||||
|
[
|
||||||
|
join(ROOT_DIR, 'packages/cli/dist/bin.js'),
|
||||||
|
'setup',
|
||||||
|
'--project-dir',
|
||||||
|
projectDir,
|
||||||
|
'--llm-backend',
|
||||||
|
'codex',
|
||||||
|
'--llm-model',
|
||||||
|
'gpt-5.3-codex',
|
||||||
|
'--no-input',
|
||||||
|
'--yes',
|
||||||
|
'--skip-databases',
|
||||||
|
'--skip-sources',
|
||||||
|
'--skip-agents',
|
||||||
|
],
|
||||||
|
{ timeoutMs: 600_000 },
|
||||||
|
);
|
||||||
|
requireSuccess('ktx setup codex backend', result);
|
||||||
|
if (!result.stdout.includes('LLM ready: yes (codex, gpt-5.3-codex)')) {
|
||||||
|
throw new Error(`setup did not report Codex LLM readiness\nstdout:\n${result.stdout}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function runRuntimeSmoke(projectDir) {
|
||||||
|
const runtimeUrl = pathToFileURL(join(ROOT_DIR, 'packages/cli/dist/context/llm/codex-runtime.js')).href;
|
||||||
|
const zodUrl = pathToFileURL(join(ROOT_DIR, 'packages/cli/node_modules/zod/index.js')).href;
|
||||||
|
const { CodexKtxLlmRuntime } = await import(runtimeUrl);
|
||||||
|
const { z } = await import(zodUrl);
|
||||||
|
const runtime = new CodexKtxLlmRuntime({
|
||||||
|
projectDir,
|
||||||
|
modelSlots: { default: 'gpt-5.3-codex' },
|
||||||
|
});
|
||||||
|
|
||||||
|
const text = await runtime.generateText({
|
||||||
|
role: 'default',
|
||||||
|
prompt: 'Reply with exactly: ktx_codex_text_ok',
|
||||||
|
});
|
||||||
|
if (text.trim() !== 'ktx_codex_text_ok') {
|
||||||
|
throw new Error(`Codex text smoke returned unexpected text: ${text}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
let toolCalls = 0;
|
||||||
|
const loop = await runtime.runAgentLoop({
|
||||||
|
modelRole: 'default',
|
||||||
|
systemPrompt: 'You must use available tools when the user asks for a tool result.',
|
||||||
|
userPrompt:
|
||||||
|
'Call the echo_value tool with {"value":"ktx_codex_tool_ok"}, then finish after the tool returns.',
|
||||||
|
toolSet: {
|
||||||
|
echo_value: {
|
||||||
|
name: 'echo_value',
|
||||||
|
description: 'Return the provided value as markdown.',
|
||||||
|
inputSchema: z.object({ value: z.string() }),
|
||||||
|
execute: async (input) => {
|
||||||
|
toolCalls += 1;
|
||||||
|
return { markdown: `echo:${input.value}` };
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
stepBudget: 4,
|
||||||
|
telemetryTags: {},
|
||||||
|
});
|
||||||
|
|
||||||
|
if (loop.stopReason !== 'natural') {
|
||||||
|
throw new Error(`Codex tool smoke stopped with ${loop.stopReason}: ${loop.error?.message ?? 'no error'}`);
|
||||||
|
}
|
||||||
|
if (toolCalls !== 1) {
|
||||||
|
throw new Error(`Expected Codex to call echo_value exactly once, got ${toolCalls}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
export async function runCodexBackendLiveSmoke() {
|
||||||
|
const projectDir = await mkdtemp(join(tmpdir(), 'ktx-codex-backend-smoke-'));
|
||||||
|
try {
|
||||||
|
requireSuccess(
|
||||||
|
'ktx build',
|
||||||
|
await run('pnpm', ['--filter', '@kaelio/ktx', 'run', 'build'], { timeoutMs: 600_000 }),
|
||||||
|
);
|
||||||
|
await runSetupSmoke(projectDir);
|
||||||
|
await runRuntimeSmoke(projectDir);
|
||||||
|
process.stdout.write(`Codex backend live smoke passed in ${projectDir}\n`);
|
||||||
|
} finally {
|
||||||
|
await rm(projectDir, { recursive: true, force: true });
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function main() {
|
||||||
|
const optIn = codexBackendSmokeOptIn();
|
||||||
|
if (!optIn.run) {
|
||||||
|
process.stdout.write(`${optIn.message}\n`);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
await runCodexBackendLiveSmoke();
|
||||||
|
}
|
||||||
|
|
||||||
|
if (import.meta.url === pathToFileURL(process.argv[1] ?? '').href) {
|
||||||
|
await main();
|
||||||
|
}
|
||||||
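Review note: the `catch` branch of `run()` in the smoke script normalizes a rejected `execFile` call into a `{ code, stdout, stderr }` result. In Node.js, a non-zero exit rejects with a numeric `error.code`, while a spawn failure such as a missing binary carries a string code like `'ENOENT'`, which is why the script falls back to `1`. The sketch below isolates that normalization so it can be exercised without spawning a process. The helper name `normalizeExecFailure` is hypothetical and does not exist in the script.

```javascript
// Hypothetical helper mirroring the catch branch of run(); the name is
// illustrative and is not part of scripts/codex-backend-live-smoke.mjs.
function normalizeExecFailure(error) {
  // execFile attaches stdout/stderr to the error when the child actually ran.
  const stdout = typeof error.stdout === 'string' ? error.stdout : '';
  // When the child never started there is no stderr, so fall back to the message.
  const stderr = typeof error.stderr === 'string' ? error.stderr : error.message;
  // Numeric code: the child's exit status. Anything else (e.g. 'ENOENT') maps to 1.
  const code = typeof error.code === 'number' ? error.code : 1;
  return { code, stdout, stderr };
}

// A command that ran and exited with status 2.
const exitFailure = normalizeExecFailure({
  code: 2,
  stdout: 'partial output',
  stderr: 'boom',
  message: 'Command failed',
});

// A command that could not be spawned at all.
const spawnFailure = normalizeExecFailure({
  code: 'ENOENT',
  message: 'spawn pnpm ENOENT',
});

console.log(exitFailure, spawnFailure);
```

Returning a plain result instead of rethrowing lets `requireSuccess` report the label, exit code, and both output streams in one error message.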
18
scripts/codex-backend-live-smoke.test.mjs
Normal file
|
|
@ -0,0 +1,18 @@
|
||||||
|
import assert from 'node:assert/strict';
|
||||||
|
import test from 'node:test';
|
||||||
|
import { codexBackendSmokeOptIn } from './codex-backend-live-smoke.mjs';
|
||||||
|
|
||||||
|
test('codex backend smoke stays disabled by default', () => {
|
||||||
|
assert.deepEqual(codexBackendSmokeOptIn({}, []), {
|
||||||
|
run: false,
|
||||||
|
message: 'Set KTX_RUN_CODEX_BACKEND_SMOKE=1 or pass --force to run the Codex backend live smoke.',
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
test('codex backend smoke runs with env opt-in', () => {
|
||||||
|
assert.deepEqual(codexBackendSmokeOptIn({ KTX_RUN_CODEX_BACKEND_SMOKE: '1' }, []), { run: true });
|
||||||
|
});
|
||||||
|
|
||||||
|
test('codex backend smoke runs with force flag', () => {
|
||||||
|
assert.deepEqual(codexBackendSmokeOptIn({}, ['--force']), { run: true });
|
||||||
|
});
|
||||||