From f7fbacf229986fa0746ca8cc0770f2d714f3e1c0 Mon Sep 17 00:00:00 2001
From: Andrey Avtomonov <andreybavt@gmail.com>
Date: Fri, 15 May 2026 14:30:02 +0200
Subject: [PATCH] docs: expand claude-code spec to full llm parity

---
 .../2026-05-15-claude-code-backend-design.md  | 345 ++++++++++--------
 1 file changed, 197 insertions(+), 148 deletions(-)

diff --git a/docs/superpowers/specs/2026-05-15-claude-code-backend-design.md b/docs/superpowers/specs/2026-05-15-claude-code-backend-design.md
index 510cd977..7930f7ed 100644
--- a/docs/superpowers/specs/2026-05-15-claude-code-backend-design.md
+++ b/docs/superpowers/specs/2026-05-15-claude-code-backend-design.md
@@ -1,150 +1,175 @@
-# Brainstorm: `claude-code` backend for end-to-end KTX ingest
+# Brainstorm: `claude-code` backend with full KTX LLM parity
 
-Adds a `claude-code` selection that makes **all `ktx ingest` LLM work** run
-through `@anthropic-ai/claude-agent-sdk`, reusing the user's existing local
-Claude Code authentication. The user experience stays the same: users run
-`ktx ingest`; the backend is selected in `ktx.yaml`.
+Adds a `claude-code` backend that gives KTX full parity with the existing
+`ANTHROPIC_API_KEY`-based `anthropic` backend for **all KTX LLM calls**. The
+backend uses `@anthropic-ai/claude-agent-sdk` and reuses the user's existing
+local Claude Code authentication. Users select it in `ktx.yaml`.
 
-This is not an implementation plan. It is the revised design after iterating on
-the brainstorm with the requirement that **all KTX ingest capabilities must work
-with `claude-code`**. The follow-up implementation plan should be written
-separately.
+This is not an implementation plan. It is the revised design after expanding
+the requirement from "`ktx ingest` works with Claude Code" to "every KTX LLM
+call works with Claude Code." The follow-up implementation plan should be
+written separately.
 
 ## Core decision
 
-`claude-code` is no longer only an agent-runner backend. It is an ingest-capable
-LLM runtime that covers both kinds of LLM work used by ingest:
+`claude-code` is a first-class global LLM backend. Any code path that currently
+works with `llm.provider.backend: anthropic` must work with
+`llm.provider.backend: claude-code`, unless it is not an LLM call at all.
 
-- **Agent loops**: work-unit execution, reconciliation, context-candidate
-  curation pagination, and memory-agent ingestion paths that call
-  `agentRunner.runLoop(...)`.
-- **Non-agent generation**: page triage and light extraction, which currently
-  call `KtxLlmProvider` directly through `generateText`.
+This includes:
 
-The implementation must not make page triage silently disappear when the user
-chooses `claude-code`. Today `PageTriageService` is only constructed when
-`resolveAgentRunner(...)` returns an AI SDK `llmProvider`
-(`packages/context/src/ingest/local-bundle-runtime.ts:684-693`). Under the new
-design, ingest gets a generation runtime for `claude-code`, so page triage and
-light extraction still run.
+- Agent loops implemented through `AgentRunnerService.runLoop(...)`.
+- Text generation through `generateKtxText(...)`.
+- Structured object generation through `generateKtxObject(...)`.
+- Local ingest and MCP-triggered local ingest flows.
+- Page triage and light extraction.
+- Context-candidate curation and reconciliation.
+- Memory capture.
+- Scan/enrichment internals and relationship LLM proposals.
+- Future KTX LLM call sites that use the shared runtime boundary.
+
+Commands that do not use LLMs do not need special Claude Code behavior. There
+must be no silent fallback from `claude-code` to gateway, Anthropic API-key
+execution, or deterministic output.
 
 ## Goals
 
-- Let a KTX user run every `ktx ingest` mode against their existing local Claude
-  Code session without provisioning `ANTHROPIC_API_KEY`, Vertex credentials, or
-  an AI Gateway key.
-- Cover scheduled pulls, upload ingest, Metabase fan-out, page triage, light
-  extraction, context-candidate curation, work-unit execution, reconciliation,
-  memory capture invoked from ingest, and source-specific tools such as
-  historic-SQL evidence emission.
-- Preserve KTX's per-stage tool curation. Each stage exposes exactly the KTX
-  tools it already selected; Claude Code built-ins, filesystem-discovered MCP
-  servers, hooks, skills, plugins, agents, and slash commands must not expand
-  the tool surface.
+- Let a KTX user run all KTX LLM-backed behavior through their existing local
+  Claude Code session without provisioning `ANTHROPIC_API_KEY`, Vertex
+  credentials, or an AI Gateway key.
+- Preserve the existing user-facing CLI and MCP behavior. `claude-code` changes
+  how LLM calls execute, not which KTX workflows exist.
+- Preserve role-based model selection. `llm.models.default`, `triage`,
+  `candidateExtraction`, `curator`, `reconcile`, and `repair` remain the source
+  of model selection for every LLM call.
+- Preserve KTX's curated tool boundaries. Claude Code built-ins,
+  filesystem-discovered MCP servers, hooks, skills, plugins, agents, and slash
+  commands must not expand the tool surface for KTX agent loops.
 - Keep embeddings independent. Claude does not provide embeddings; users keep
-  configuring `ingest.embeddings` as they do today.
+  configuring `ingest.embeddings` and scan/enrichment embeddings as they do
+  today.
 - Fail fast with a clear message if local Claude Code authentication is not
   usable.
 
 ## Non-goals
 
-- **Non-ingest LLM surfaces.** The required target is end-to-end `ktx ingest`.
-  Other internal LLM consumers are out of scope for this spec. The config and
-  runtime design must still avoid accidental gateway fall-through when
-  `llm.provider.backend` is `claude-code`.
-- **Tool-call repair parity.** The AI SDK runner uses
+- **Embedding parity.** Embeddings remain separate from LLM execution.
+- **Tool-call repair parity in the first pass.** The AI SDK runner uses
   `experimental_repairToolCall` (`packages/llm/src/repair.ts:35-88`). The Claude
   Agent SDK has no transparent same-step repair hook. MVP behavior is next-turn
   self-correction from schema errors or a normal tool-failure count.
-- **OTEL telemetry parity.** The AI SDK runner uses `experimental_telemetry`.
-  The Agent SDK exposes hooks such as `PostToolUseFailure` and `SessionEnd`, but
-  no drop-in OTEL switch. MVP ships without telemetry parity on this backend.
+- **OTEL telemetry parity in the first pass.** The AI SDK runner uses
+  `experimental_telemetry`. The Agent SDK exposes hooks such as
+  `PostToolUseFailure` and `SessionEnd`, but no drop-in OTEL switch. MVP ships
+  without telemetry parity on this backend.
 - **Productizing Claude subscription limits.** Documentation must frame this as
   "use your own local Claude Code session," not as a third-party Claude Max or
   Claude.ai product feature.
 
 ## Approaches considered
 
-### Recommended: Ingest LLM runtime port
+### Recommended: global LLM runtime port
 
 Introduce a backend-neutral KTX LLM runtime port for operations, not just model
 construction:
 
 ```ts
-interface KtxGenerationPort {
+interface KtxLlmRuntimePort {
   generateText(input: KtxGenerateTextInput): Promise<string>;
   generateObject<T>(input: KtxGenerateObjectInput<T>): Promise<T>;
-}
-
-interface AgentRunnerPort {
-  runLoop(params: RunLoopParams): Promise<RunLoopResult>;
+  runAgentLoop(params: RunLoopParams): Promise<RunLoopResult>;
 }
 ```
 
-The existing AI SDK implementation adapts `KtxLlmProvider` to these ports. The
-new Claude Code implementation uses `query()` from
-`@anthropic-ai/claude-agent-sdk`. Ingest services depend on the ports:
+The existing `anthropic`, `vertex`, and `gateway` backends implement the runtime
+through the AI SDK and existing `KtxLlmProvider`. The new `claude-code` backend
+implements the same runtime through `@anthropic-ai/claude-agent-sdk`.
 
-- `PageTriageService` depends on `KtxGenerationPort`, not raw
-  `KtxLlmProvider`.
-- `generateKtxText` / `generateKtxObject` become thin helpers over the
-  generation port or move behind it.
-- `AgentRunnerService` and `ClaudeAgentSdkRunnerService` both implement
-  `AgentRunnerPort`.
+This is the recommended approach because KTX call sites need operations:
+"generate text," "generate a structured object," and "run an agent loop." They
+do not inherently need direct access to an AI SDK `LanguageModel`. The Agent SDK
+is a session/agent API, not an AI SDK model factory, so the runtime port avoids
+pretending those APIs are the same.
 
-This is the recommended approach because it matches the Agent SDK's actual
-shape. The Agent SDK is an agent/session API, not an AI SDK `LanguageModel`
-factory, so forcing it into `KtxLlmProvider.getModel(...)` would create a false
-abstraction and leave page triage broken.
+### Rejected: fake AI SDK `LanguageModel` for Claude Code
 
-### Rejected: agent-runner-only backend
+Trying to make Claude Code look like an AI SDK `LanguageModel` would be brittle.
+The Agent SDK owns session execution, permissions, MCP tools, structured output,
+and result messages. Those semantics do not map cleanly onto a normal
+`getModel(...)` return value.
 
-This was the previous version of the spec. It made work-unit and reconciliation
-agent loops possible, but it did not cover page triage or light extraction.
-Because `ktx ingest` uses those non-agent LLM calls for document-like sources,
-this does not satisfy the updated requirement.
+### Rejected: branch at every call site
 
-### Rejected for MVP: Claude Code OpenAI proxy
-
-Using a proxy or `claude -p` subprocess would avoid some TypeScript adapter work,
-but it would add another protocol boundary, make tool control harder to prove,
-and move away from the official Agent SDK API.
+Adding `if backend === "claude-code"` around each LLM call would work briefly
+but would duplicate prompt wrapping, structured output handling, debug logging,
+tool conversion, auth checks, and error mapping. It would also make future LLM
+call sites easy to miss.
 
 ## Architecture
 
 ```text
-ktx ingest
-  -> createLocalBundleIngestRuntime(...)
-     -> resolveIngestLlmRuntime(...)
-        -> AI SDK runtime
-           - KtxGenerationPort via generateText / Output.object
-           - AgentRunnerPort via current AgentRunnerService
-        -> Claude Code runtime
-           - KtxGenerationPort via Agent SDK query()
-           - AgentRunnerPort via ClaudeAgentSdkRunnerService
+ktx.yaml
+  llm.provider.backend: anthropic | vertex | gateway | claude-code
+  llm.models.<role>: model alias or model ID
 
-  -> PageTriageService
-     -> generation.generateText({ role: "triage", ... })
+createLocalKtxLlmRuntimeFromConfig(project.config.llm)
+  -> AiSdkKtxLlmRuntime
+     - wraps existing KtxLlmProvider
+     - generateText / Output.object / AgentRunnerService
+  -> ClaudeCodeKtxLlmRuntime
+     - uses @anthropic-ai/claude-agent-sdk query()
+     - implements text, object, and agent-loop operations
 
-  -> IngestBundleRunner stages
-     -> agentRunner.runLoop({ modelRole, toolSet, stepBudget, ... })
+All KTX LLM call sites
+  -> KtxLlmRuntimePort
 ```
 
-The runtime is selected once at the context-runtime DI boundary. The main ingest
-integration point remains `resolveAgentRunner` /
-`createLocalBundleIngestRuntime` in
-`packages/context/src/ingest/local-bundle-runtime.ts`, but the function should
-evolve from "resolve an agent runner plus optional AI SDK provider" into
-"resolve the ingest LLM runtime ports." The memory-agent construction path in
-`packages/context/src/memory/local-memory.ts` needs the same port treatment.
+The runtime is selected at the same boundaries that currently construct an
+`llmProvider` or `AgentRunnerService`:
 
-`packages/cli/src/runtime.ts` is the Python-runtime command handler; it is not
-the agent-runner or generation-runtime integration point.
+- `packages/context/src/llm/local-config.ts`
+- `packages/context/src/ingest/local-bundle-runtime.ts`
+- `packages/context/src/memory/local-memory.ts`
+- `packages/context/src/scan/local-scan.ts`
+- `packages/context/src/mcp/local-project-ports.ts`
+- Any CLI setup/status/doctor code that validates LLM readiness
+
+After the change, services should not need to know whether the configured
+backend is AI SDK based or Claude Code based. They call the runtime operation
+they need.
+
+## LLM call-site migration
+
+The implementation plan must migrate every current KTX LLM call site to the
+runtime port:
+
+- `packages/context/src/llm/generation.ts`: `generateKtxText` and
+  `generateKtxObject` become runtime-backed helpers or are folded into the
+  runtime.
+- `packages/context/src/agent/agent-runner.service.ts`: the AI SDK agent loop
+  becomes the AI SDK implementation of `runAgentLoop`.
+- `packages/context/src/ingest/page-triage/page-triage.service.ts`: page triage
+  and light extraction depend on `KtxLlmRuntimePort`, not raw `KtxLlmProvider`.
+- `packages/context/src/scan/description-generation.ts`: AI descriptions use
+  the runtime text-generation operation.
+- `packages/context/src/scan/relationship-llm-proposal.ts`: relationship
+  proposals use the runtime object-generation operation.
+- `packages/context/src/ingest/stages/stage-3-work-units.ts`,
+  `packages/context/src/ingest/stages/stage-4-reconciliation.ts`,
+  `packages/context/src/ingest/context-candidates/curator-pagination.service.ts`,
+  and `packages/context/src/memory/memory-agent.service.ts`: agent loops use the
+  runtime agent-loop operation or a thin `AgentRunnerPort` backed by it.
+- Test helpers and MCP local project ports that inject `llmProvider` or
+  `agentRunner` must either inject the runtime port or use compatibility test
+  adapters during the migration.
+
+The plan must include a grep-based audit so new or overlooked `getModel(...)`,
+`generateKtxText(...)`, `generateKtxObject(...)`, `AgentRunnerService`, and
+`llmProvider` usages are either migrated or explicitly proven non-runtime.
 
 ## Config design
 
-The plan should make `claude-code` a first-class config value, not a hidden
-side-channel. Recommended shape:
+The config should make `claude-code` a first-class backend:
 
 ```yaml
 llm:
@@ -156,28 +181,28 @@ llm:
     candidateExtraction: sonnet
     curator: sonnet
     reconcile: sonnet
+    repair: sonnet
 ```
 
 Implementation implications:
 
 - Extend `KTX_LLM_BACKENDS` in `packages/context/src/project/config.ts` and
   `KtxLlmBackend` in `packages/llm/src/types.ts`.
-- Update setup, status, doctor, and local provider resolution so
-  `claude-code` does not fall through to `gateway`.
-- For `claude-code`, do not construct a fake AI SDK `LanguageModel`. Construct
-  the Claude Code generation/runtime ports.
+- Update setup, status, doctor, schema generation, examples, and docs so
+  `claude-code` is understood everywhere `anthropic` is understood.
+- Update `createKtxLlmProvider` / `createModelFactory` so unsupported backend
+  values throw instead of falling through to gateway.
 - Keep `llm.models` as the per-role binding source. The Claude Code runtime maps
-  each KTX role to the configured model string for the current call. The plan
-  must decide and test the accepted model aliases, for example `sonnet`,
-  `opus`, `haiku`, or full Claude model IDs supported by the SDK.
-- If non-ingest code sees `backend: claude-code` before it has been ported, it
-  must fail fast with a clear unsupported-backend message. It must not silently
-  route to gateway.
+  each KTX role to the configured model string for the current call.
+- Define accepted model aliases, such as `sonnet`, `opus`, and `haiku`, and full
+  model IDs supported by the pinned SDK version.
 
 ## Claude Agent SDK runtime behavior
 
-Every Agent SDK call must be isolated and deterministic enough for KTX ingest.
-Use explicit options even when SDK defaults currently match the desired value.
+Every Agent SDK call must be isolated enough for KTX execution. Use explicit
+options even when SDK defaults currently match the desired value.
+
+For agent loops with tools:
 
 ```ts
 query({
@@ -213,33 +238,43 @@ query({
 });
 ```
 
+For plain text generation:
+
+- Use the same `query()` runtime with `maxTurns: 1`.
+- Pass `settingSources: []`, `skills: []`, `plugins: []`, `tools: []`, and
+  `permissionMode: "dontAsk"`.
+- Do not expose MCP tools unless the KTX call explicitly passed tools.
+- Return the final result message text.
+
+For structured object generation:
+
+- Use the Agent SDK structured output option for JSON schema output.
+- Convert KTX Zod schemas at the runtime boundary.
+- Parse and validate the returned object with the original KTX schema before
+  returning it to the caller.
+
 The plan must confirm the exact option names against the pinned SDK version, but
 the required outcome is fixed:
 
 - Filesystem settings are not loaded. `settingSources: []` is explicit, and the
   implementation should assert from the SDK init message that no unexpected
-  settings-derived commands, skills, agents, or MCP servers are active.
+  settings-derived commands, skills, agents, plugins, or MCP servers are active.
 - Skills are disabled with `skills: []`, and plugins are disabled with
   `plugins: []`.
-- Only KTX MCP tools are available and auto-approved. `allowedTools` alone is
-  not sufficient because the current SDK docs describe it as auto-approval, not
-  restriction. Use `tools`, `permissionMode: "dontAsk"`, and explicit
-  `disallowedTools` for built-ins.
+- `allowedTools` alone is not sufficient because the current SDK docs describe
+  it as auto-approval, not restriction. Use `tools`, `permissionMode:
+  "dontAsk"`, and explicit `disallowedTools` for built-ins.
 - Built-ins are denied even if a future SDK default changes.
 - `cwd` is `project.projectDir`, resolved at startup via `resolveKtxProjectDir`,
   not `process.cwd()`.
-- Sessions are not persisted for ingest unless the plan identifies a concrete
-  debugging feature that needs persistence.
-
-For non-agent text generation, use the same isolated runtime with no MCP tools,
-`maxTurns: 1`, and no filesystem settings. For structured outputs, use the Agent
-SDK's JSON-schema output format and convert KTX's Zod schemas at the boundary.
+- Sessions are not persisted unless the plan identifies a concrete debugging
+  feature that needs persistence.
 
 ## Tool boundary
 
-The final `RunLoopParams.toolSet` cannot remain a raw AI SDK `Record<string,
-Tool>` if two backends must consume it. The plan must define a backend-neutral
-tool descriptor for the **final** tool map handed to the runner:
+Agent-loop tools cannot remain only raw AI SDK `Record<string, Tool>` values if
+two backends must consume them. The plan must define a backend-neutral tool
+descriptor for the final tool map handed to an agent loop:
 
 ```ts
 interface KtxRuntimeToolDescriptor<TInput, TOutput> {
@@ -254,10 +289,8 @@ Every composed tool entry must preserve the descriptor, including:
 
 - `BaseTool` outputs from factory toolsets.
 - Source-specific raw tools such as `emit_historic_sql_evidence` in
-  `packages/context/src/ingest/local-bundle-runtime.ts:543-556`.
-- Stage-local tools in `buildWuToolSet` and `buildReconcileToolSet`
-  (`packages/context/src/ingest/stages/build-wu-context.ts`,
-  `packages/context/src/ingest/stages/build-reconcile-context.ts`).
+  `packages/context/src/ingest/local-bundle-runtime.ts`.
+- Stage-local tools in `buildWuToolSet` and `buildReconcileToolSet`.
 - Inline `load_skill`, read/raw/span, stage/diff, eviction, and emit tools in
   `packages/context/src/ingest/ingest-bundle.runner.ts`.
 - Memory-agent `load_skill` in
@@ -267,8 +300,8 @@ Every composed tool entry must preserve the descriptor, including:
 The AI SDK adapter converts descriptors to `tool(...)`. The Claude Code adapter
 converts descriptors to Agent SDK `tool(name, description, schema.shape,
 handler)` entries inside `createSdkMcpServer(...)`. KTX tool handlers return
-`{ markdown, structured }`; the Claude adapter returns the markdown as text
-content and may include structured JSON in the text only if a caller needs it.
+`{ markdown, structured }`; the Claude adapter returns markdown as text content
+and may include structured JSON only if a caller needs it.
 
 Non-object schemas are unsupported for `claude-code` and must be rejected at
 startup with a clear error. In practice KTX tool inputs are already `z.object`.
@@ -287,10 +320,14 @@ carry the terminal reason. They remain useful for lifecycle logging. Tool failur
 counting should use `PostToolUseFailure` and feed the same mechanism that
 `stage-3-work-units.ts` checks through `toolFailureCount?(wu.unitKey)`.
 
+For text and object generation, SDK authentication, billing, rate-limit,
+permission, max-turn, structured-output, and execution errors must map to the
+same error surfaces that KTX uses for the Anthropic API-key backend.
+
 ## Auth and setup
 
-`ktx setup` must validate that Claude Code SDK auth is usable, not just that
-`~/.claude/` exists. Acceptable validation strategies:
+`ktx setup`, status, and doctor flows must validate that Claude Code SDK auth is
+usable, not just that `~/.claude/` exists. Acceptable validation strategies:
 
 - A minimal SDK probe call with `settingSources: []`, `tools: []`, and
   `maxTurns: 1`.
@@ -299,17 +336,18 @@ counting should use `PostToolUseFailure` and feed the same mechanism that
   state that it proves auth usability.
 
 Failure copy should tell the user to authenticate Claude Code locally with the
-Claude Code CLI, then rerun setup or ingest.
+Claude Code CLI, then rerun setup or the command they attempted.
 
 ## Documentation impact
 
-Docs updates are required because this changes user-visible setup and ingest
-behavior:
+Docs updates are required because this changes user-visible setup and LLM
+provider behavior:
 
 - `docs-site/content/docs/getting-started/quickstart.mdx`
 - `docs-site/content/docs/cli-reference/ktx-setup.mdx`
 - `docs-site/content/docs/guides/building-context.mdx`
 - Any config reference page that documents `llm.provider.backend`
+- Any status or doctor docs that describe LLM readiness
 
 The docs must say that `claude-code` uses the user's own local Claude Code
 session. Do not describe it as a way for KTX to resell, pool, or productize
@@ -322,17 +360,24 @@ Claude subscription limits.
   (`packages/llm/src/types.ts`, `packages/llm/src/model-provider.ts`).
 - Project config currently accepts `llm.provider.backend: none | anthropic |
   vertex | gateway` (`packages/context/src/project/config.ts`).
-- `resolveAgentRunner(...)` currently requires an AI SDK `llmProvider`, and
-  page triage is only constructed when that provider exists
-  (`packages/context/src/ingest/local-bundle-runtime.ts`).
-- Page triage and light extraction are non-agent LLM calls using
-  `llmProvider.getModel("triage")` and AI SDK `generateText`
+- `generateKtxText` and `generateKtxObject` are shared non-agent generation
+  helpers (`packages/context/src/llm/generation.ts`).
+- `AgentRunnerService` is the shared AI SDK agent-loop implementation
+  (`packages/context/src/agent/agent-runner.service.ts`).
+- Page triage and light extraction currently use raw `KtxLlmProvider`
   (`packages/context/src/ingest/page-triage/page-triage.service.ts`).
-- The Agent SDK TypeScript reference documents `settingSources` defaulting to
-  no filesystem settings, `allowedTools` as auto-approval rather than
-  restriction, `permissionMode: "dontAsk"`, `tools`, `disallowedTools`,
-  `maxTurns`, `mcpServers`, `cwd`, `persistSession`, and SDK result/hook
-  message shapes.
+- Scan/enrichment internals currently use `createLocalKtxLlmProviderFromConfig`,
+  `generateKtxText`, and `generateKtxObject`
+  (`packages/context/src/scan/local-scan.ts`,
+  `packages/context/src/scan/description-generation.ts`,
+  `packages/context/src/scan/relationship-llm-proposal.ts`).
+- Local ingest and MCP local project ports inject `llmProvider` and
+  `agentRunner` today (`packages/context/src/ingest/local-bundle-runtime.ts`,
+  `packages/context/src/mcp/local-project-ports.ts`).
+- The Agent SDK TypeScript reference documents `settingSources` defaulting to no
+  filesystem settings, `allowedTools` as auto-approval rather than restriction,
+  `permissionMode: "dontAsk"`, `tools`, `disallowedTools`, `maxTurns`,
+  `mcpServers`, `cwd`, `persistSession`, and SDK result/hook message shapes.
 - The Agent SDK MCP docs show registering MCP servers in `query()` options and
   using `allowedTools` for MCP tool access.
 - The Agent SDK skills docs say discovered skills can be controlled with the
@@ -343,13 +388,17 @@ Claude subscription limits.
 
 1. Confirm exact TypeScript option names and result-message discriminants
    against the pinned `@anthropic-ai/claude-agent-sdk` version.
-2. Define the final `KtxGenerationPort` and `AgentRunnerPort` file locations and
-   package exports.
+2. Define the final `KtxLlmRuntimePort` file location and package exports.
 3. Define model alias validation for `sonnet`, `opus`, `haiku`, and full model
    IDs.
 4. Define the auth probe and make setup/status/doctor report actionable
    messages.
-5. Write tests that prove page triage is constructed and called under
-   `llm.provider.backend: claude-code`.
-6. Write tests that prove a raw built-in Claude Code tool request is denied and
-   only `mcp__ktx__*` tools are available during ingest.
+5. Run a repo-wide audit for all LLM call sites and migrate each one to the
+   runtime boundary.
+6. Write tests proving `claude-code` works for text generation, structured
+   object generation, and agent-loop execution.
+7. Write tests proving page triage, scan/enrichment internals, memory capture,
+   MCP-triggered local ingest, and normal local ingest all use the
+   `claude-code` runtime when configured.
+8. Write tests proving a raw built-in Claude Code tool request is denied and
+   only `mcp__ktx__*` tools are available during KTX agent loops.