mirror of
https://github.com/Kaelio/ktx.git
synced 2026-07-01 08:59:39 +02:00
feat: add codex llm backend for ktx runtime work (#253)
* feat: add codex sdk runner foundation
* feat: parse codex runtime events
* feat: expose codex runtime mcp tools
* feat: add codex llm runtime
* feat: wire codex llm backend
* test: avoid Array.fromAsync in codex runner test
* docs: document codex llm backend
* fix: tighten codex runtime config ownership
* fix: use codex sdk env and thread options
* fix: parse codex sdk event shapes
* test: add codex backend live smoke
* docs: clarify codex backend isolation
* fix: drive codex loop metrics from mcp events
* fix: enforce codex local step budget
* docs: disclose codex isolation limits
* fix: count all codex agent steps and stream step callbacks live

The agent-loop step budget only counted completed mcp_tool_call items, so built-in command_execution steps (which the public Codex SDK/CLI surface can still expose) never decremented the budget, letting ingest/reconciliation run past stepBudget until Codex stopped on its own. onStepFinish was also replayed only after the whole stream drained, so live work_unit_step / reconciliation progress appeared stuck until the Codex process exited.

collectEvents is now the single live step accumulator: it counts every completed agent-action item via a shared isCompletedAgentStep predicate (command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish as each step completes, and enforces the budget on that broader count. A no-tool turn still counts as one step. toolFailures stays MCP-specific, since a non-zero command exit is normal agent exploration, not a loop failure.

* test: align ingest llm-guard assertions with codex backend

The skip-llm ingest guard message now lists codex as a valid backend and mentions a Claude Code/Codex session plus a codex setup hint, but this slow suite test still asserted the pre-codex wording. Update it to match the production message (already covered by the local-bundle-runtime unit test) and add the codex setup-line assertion.

* fix: treat codex error:null tool calls as success

The Codex SDK serializes error: null on successful mcp_tool_call items, so the failure check (item.error !== undefined) flagged every successful tool call as failed with the empty-payload default "Codex turn failed". This killed every ingest work unit under the codex backend before it could produce a patch. Key on status === 'failed' (authoritative, always set) and only treat a populated error object as a failure. Add a regression test built from a verbatim real-SDK event capture.

* fix: default codex backend to gpt-5.5 and report real probe errors

The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API rejects under ChatGPT-account (subscription) auth, so codex status/setup failed with a misleading "authentication is not usable" message even though auth was fine.

- Default codex model is now gpt-5.5 (works on both subscription and API-key auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
- runCodexAuthProbe now distinguishes "model not available" from an auth failure and surfaces the real API error: collectEvents retains stream events when the SDK throws on a non-zero exit, and the API error JSON envelope is unwrapped to its human-readable message.
- The Codex isolation warning now renders inside the clack setup frame.
- Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.

* fix: require llm.models.default in status and match codex probe remediation

Status reported a project ready when a non-none LLM backend was configured without llm.models.default, but the runtime (resolveModelSlots) hard-requires it, so ingest/scan/memory threw after `ktx status` said the project was usable. buildLlmStatus now fails for any non-none backend missing models.default and no longer invents a fallback model for claude-code/codex.

Codex probe failures now carry a category-matched fix: a model-access failure steers the user at llm.models.default instead of the auth/install remediation. runCodexAuthProbe returns the fix and status consumes it; the message stays self-sufficient so setup output is unchanged.

Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx states --llm-model only accepts codex/default or gpt-*/codex-* ids.

Repaired four doctor fixtures that configured a backend without models.default (the now-correctly-blocked config) and added coverage for the new behavior.
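
The step counting and the `error: null` fix described in the commit message can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the code from this commit: the event and item shapes are simplified, `collectSteps` and `isFailedMcpToolCall` are hypothetical names (the message itself only names `collectEvents`, `isCompletedAgentStep`, `onStepFinish`, `stepBudget`, and `toolFailures`), and the input is a plain array where the real runner consumes a live stream.

```typescript
// Simplified, assumed shapes; the real Codex SDK event types are richer.
type AgentItem = {
  type: string;
  status?: "in_progress" | "completed" | "failed";
  error?: { message: string } | null;
};
type AgentEvent = { kind: "item.started" | "item.completed"; item: AgentItem };

// Item types the commit message lists as agent actions.
const AGENT_STEP_TYPES = new Set<string>([
  "command_execution",
  "mcp_tool_call",
  "file_change",
  "web_search",
]);

// A step is any finished agent-action item, not only an MCP tool call.
function isCompletedAgentStep(event: AgentEvent): boolean {
  return event.kind === "item.completed" && AGENT_STEP_TYPES.has(event.item.type);
}

// Only MCP tool calls count as loop failures. `error: null` means success,
// so key on status and treat only a populated error object as a failure.
function isFailedMcpToolCall(item: AgentItem): boolean {
  if (item.type !== "mcp_tool_call") return false;
  return item.status === "failed" || item.error != null;
}

// Hypothetical stand-in for collectEvents: counts steps in arrival order,
// fires onStepFinish for each one, and stops once the budget is used up.
function collectSteps(
  events: AgentEvent[],
  stepBudget: number,
  onStepFinish: (stepNumber: number) => void,
): { steps: number; toolFailures: number; budgetExhausted: boolean } {
  let steps = 0;
  let toolFailures = 0;
  for (const event of events) {
    if (!isCompletedAgentStep(event)) continue;
    steps += 1;
    if (isFailedMcpToolCall(event.item)) toolFailures += 1;
    onStepFinish(steps);
    if (steps >= stepBudget) {
      return { steps, toolFailures, budgetExhausted: true };
    }
  }
  // A turn with no tool activity still counts as one step.
  return { steps: Math.max(steps, 1), toolFailures, budgetExhausted: false };
}
```

Keeping the failure check separate from the step predicate mirrors the distinction the message draws: every completed action consumes budget, but only a failed MCP tool call is treated as a loop failure.
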
parent 74c6076b72
commit 494618ab14
41 changed files with 2544 additions and 30 deletions
@@ -51,8 +51,9 @@ prompts.
 
 | Flag | Description |
 |------|-------------|
-| `--llm-backend <backend>` | LLM backend: `anthropic`, `vertex`, or `claude-code` |
+| `--llm-backend <backend>` | LLM backend: `anthropic`, `vertex`, `claude-code`, or `codex` |
 | `--llm-backend claude-code` | Use the local Claude Code session for **ktx** LLM calls |
+| `--llm-backend codex` | Use local Codex authentication for **ktx** LLM calls |
 | `--llm-model <model>` | LLM model ID or backend model alias to validate and save |
 | `--anthropic-api-key-env <name>` | Environment variable containing the Anthropic API key |
 | `--anthropic-api-key-file <path>` | File containing the Anthropic API key |
@@ -62,9 +63,14 @@ prompts.
 
 Choose only one Anthropic credential source. Anthropic credential flags are only
 valid with the Anthropic backend; Vertex flags are only valid with the Vertex
-backend. The `claude-code` backend uses local Claude Code authentication instead
+backend. The `claude-code` and `codex` backends use local authentication instead
 of Anthropic API key or Vertex flags. For Claude Code, `--llm-model` accepts
-`sonnet`, `opus`, `haiku`, or a full Claude model ID.
+`sonnet`, `opus`, `haiku`, or a full Claude model ID. For Codex, `--llm-model`
+accepts `codex`, `default`, or a `gpt-*` / `codex-*` model ID such as
+`gpt-5.5`; any other value is rejected before the auth probe. Run `codex` to
+see the models available to your login, and pick a `gpt-*` / `codex-*` id from
+that list. Note that `*-codex` API-billing model IDs (for example
+`gpt-5.3-codex`) are not available to ChatGPT-subscription logins.
 
 ### Embeddings
 
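
The `--llm-model` rule for Codex stated in the hunk above can be sketched as a small TypeScript check. The function names are hypothetical and the actual **ktx** validation may differ; the sketch only encodes what the documentation text says (the aliases `codex` and `default`, or a `gpt-*` / `codex-*` ID), plus a suffix heuristic for the `*-codex` note.

```typescript
// Values the docs say `--llm-model` accepts for the codex backend:
// the aliases `codex` and `default`, or a `gpt-*` / `codex-*` model ID.
function isAcceptedCodexModel(model: string): boolean {
  if (model === "codex" || model === "default") return true;
  return /^(gpt|codex)-.+/.test(model);
}

// `*-codex` IDs are described as API-billing models, so a ChatGPT-subscription
// login may be refused them. This suffix check is only a heuristic (assumption).
function mayRequireApiKeyAuth(model: string): boolean {
  return model.endsWith("-codex");
}
```

A caller could run the first check before the auth probe, as the text describes, and use the second only to print a hint rather than to reject the value.
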
@@ -191,6 +197,17 @@ ktx setup \
   --llm-backend claude-code \
   --llm-model opus
 
+# Configure **ktx** to use local Codex authentication for LLM work
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+When you choose `--llm-backend codex`, setup prints a warning if the public
+Codex SDK and CLI surface cannot prove full Claude-Code-style isolation. The
+backend restricts **ktx** runtime MCP tools to each run, but Codex may still
+load user Codex config and built-in command execution or read-only file
+capabilities.
+
+```bash
 # Script a Postgres connection that reads its URL from the environment
 ktx setup \
   --project-dir ./analytics \
@@ -21,7 +21,7 @@ ktx status [options]
 | `--json` | Print JSON output | `false` |
 | `-v`, `--verbose` | Show every check, including passing ones | `false` |
 | `--validate` | Only validate the `ktx.yaml` schema; skip readiness checks | `false` |
-| `--fast` | Skip checks that require external communication (query-history readiness probes and Claude Code auth probe) | `false` |
+| `--fast` | Skip checks that require external communication (query-history readiness probes, Claude Code auth probe, and Codex auth probe) | `false` |
 | `--no-input` | Disable interactive terminal input | - |
 
 ## Examples
@@ -39,7 +39,7 @@ ktx status --verbose
 # Validate ktx.yaml without running readiness checks
 ktx status --validate
 
-# Skip slow probes (query-history readiness, Claude Code auth)
+# Skip slow probes (query-history readiness, Claude Code auth, Codex auth)
 ktx status --fast
 
 # Check a project from another directory
@@ -57,6 +57,16 @@ flow, then rerun `ktx status`. Use `--fast` to skip this probe (useful in CI
 or offline contexts); skipped checks render as `-` and carry
 `"status": "skipped"` in JSON output.
 
+For `llm.provider.backend: codex`, `ktx status` runs a minimal non-interactive
+Codex request. If the probe fails, authenticate Codex locally with the Codex CLI
+and verify the Codex CLI installation.
+
+When `llm.provider.backend: codex` is configured, `ktx status` also prints a
+warning when the installed public Codex SDK and CLI surface cannot prove full
+Claude-Code-style isolation. The warning does not block authenticated Codex
+usage, but it marks the project status as partial so you can make an explicit
+runtime-isolation decision.
+
 A `Local data` section summarises what the project has accumulated locally:
 ingest run counts, last completed timestamp per connection, knowledge page
 counts by scope, semantic-layer source and dictionary value counts, and the
@@ -376,13 +376,23 @@ llm:
 
 | Field | Type | Default | Purpose |
 |-------|------|---------|---------|
-| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. |
+| `provider.backend` | `none` \| `anthropic` \| `vertex` \| `gateway` \| `claude-code` \| `codex` | `none` | Selected backend. `none` disables LLM features. `claude-code` uses the local Claude Code session and needs no API key. `codex` uses local Codex authentication and needs no API key. |
 | `provider.anthropic.api_key` | `string` | - | Anthropic API key. Required when `backend: anthropic`. Accepts `env:` or `file:` references. |
 | `provider.anthropic.base_url` | `string` | - | Override the Anthropic API base URL (proxy, self-hosted gateway). |
 | `provider.gateway.api_key` / `base_url` | `string` | - | Credentials for an AI Gateway provider. Required when `backend: gateway`. |
 | `provider.vertex.project` | `string` | - | Google Cloud project ID hosting the Vertex AI endpoint. |
 | `provider.vertex.location` | `string` | - | Vertex AI region (for example `us-east5`). Required when the `vertex` block is present. |
 
+Use `codex` when local Codex authentication should power **ktx** LLM work:
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
 ### Model roles
 
 `models` overrides the per-role model. Keys are fixed; values are
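
The `codex` example in the hunk above sets `llm.models.default`, which the commit message says status now requires for any backend other than `none`. A minimal TypeScript sketch of that rule follows. The type and function names are hypothetical, the config shape is a simplified assumption, and the `skipped` result for `none` is an illustrative choice; the commit names `buildLlmStatus` but does not show its code.

```typescript
// Assumed, simplified view of the `llm` block in ktx.yaml.
type LlmConfig = {
  provider?: { backend?: string };
  models?: { default?: string };
};
type LlmCheck = { status: "ok" | "fail" | "skipped"; message: string };

// Fails for any non-none backend without models.default and never
// substitutes a fallback model.
function checkLlmModels(llm: LlmConfig): LlmCheck {
  const backend = llm.provider?.backend ?? "none"; // documented default is `none`
  if (backend === "none") {
    return { status: "skipped", message: "LLM features are disabled" };
  }
  const model = llm.models?.default?.trim();
  if (!model) {
    return {
      status: "fail",
      message: `llm.models.default is required for backend "${backend}"`,
    };
  }
  return { status: "ok", message: `${backend} will use ${model}` };
}
```
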
@@ -39,8 +39,20 @@ ktx ingest --all
 Enriched ingest needs a configured model and embeddings. Run `ktx setup` first;
 connections without that configuration fail before any work starts.
 
-With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools for the
-current run.
+Local-auth backends keep provider credentials out of `ktx.yaml`:
+
+```bash
+ktx setup --llm-backend claude-code --no-input
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+With `claude-code`, **ktx** agent loops can invoke only the **ktx** MCP tools
+for the current run. With `codex`, **ktx** restricts the temporary runtime MCP
+server to the current run's tool set, disables Codex web search, requests a
+read-only sandbox, and sets `approval_policy=never`. The public Codex SDK and
+CLI surface may still load user Codex config and built-in command execution or
+read-only file capabilities, so use `claude-code` for stricter runtime tool
+isolation.
 
 ## Query history
 
@@ -16,6 +16,7 @@ Set `llm.provider.backend` to one of these values:
 - `gateway`: Use AI Gateway-compatible Anthropic model ids.
 - `claude-code`: Use your local Claude Code session through the Claude Agent
   SDK. **ktx** strips provider-routing environment variables from child processes.
+- `codex`: Use your local Codex authentication through the Codex SDK.
 
 ## Claude Code
 
@@ -47,6 +48,42 @@ model IDs are also accepted.
 metadata may still list host slash commands, skills, and subagents; **ktx** does not
 grant execution access to them.
 
+## Codex backend
+
+Use `codex` when you want **ktx** to run LLM-backed workflows through your
+local Codex authentication instead of a direct provider API key.
+
+```yaml
+llm:
+  provider:
+    backend: codex
+  models:
+    default: gpt-5.5
+```
+
+Configure it non-interactively:
+
+```bash
+ktx setup --llm-backend codex --llm-model gpt-5.5 --no-input
+```
+
+This is separate from Codex agent-client setup. `ktx setup --agents --target
+codex` installs instructions and MCP access for an end-user Codex session.
+`ktx setup --llm-backend codex` makes **ktx** itself execute ingest, scan
+enrichment, memory, and other LLM-backed work through Codex.
+
+During runtime loops, **ktx** starts a temporary loopback MCP server for the
+current run, exposes only the tools passed to that run, asks Codex to use a
+read-only sandbox, sets `approval_policy=never`, auto-approves only those
+run-scoped MCP tools, and disables Codex web search.
+
+Codex backend isolation is currently limited by the public Codex SDK and CLI
+surface. Codex may still load user Codex config and built-in command execution
+or read-only file capabilities. Use `llm.provider.backend: claude-code` when
+you need stricter Claude-Code-style runtime tool isolation, or remove host
+Codex MCP and tool config before running untrusted prompts through the `codex`
+backend.
+
 ## Prompt caching
 
 `llm.promptCaching` has partial parity on `claude-code`. Status and doctor warn
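
The run-scoped tool exposure described in the `Codex backend` section above can be sketched in TypeScript. Everything here is a hypothetical illustration: the `RunSettings` field names are not Codex SDK options, and the real backend serves tools over a temporary loopback MCP server rather than through a direct function call. The sketch only shows the allowlist idea, where a run can reach the tools passed to it and nothing else.

```typescript
type RuntimeTool = { name: string; handler: (input: unknown) => unknown };

// Hypothetical per-run settings mirroring the prose; field names are
// illustrative assumptions, not Codex SDK options.
type RunSettings = {
  sandbox: "read-only";
  approvalPolicy: "never";
  webSearch: false;
  autoApprovedTools: string[];
};

// Auto-approve only the tools that were passed to this run.
function buildRunSettings(runTools: RuntimeTool[]): RunSettings {
  return {
    sandbox: "read-only",
    approvalPolicy: "never",
    webSearch: false,
    autoApprovedTools: runTools.map((tool) => tool.name),
  };
}

// Dispatch a tool call only if the tool belongs to this run's tool set.
function callRunTool(runTools: RuntimeTool[], name: string, input: unknown): unknown {
  const tool = runTools.find((candidate) => candidate.name === name);
  if (!tool) {
    throw new Error(`Tool "${name}" is not exposed to this run`);
  }
  return tool.handler(input);
}
```

An allowlist of this kind limits which **ktx** tools a run can call; it does not address the built-in Codex capabilities that the section says may still be loaded.
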