ktx/docs/superpowers/plans/2026-05-15-claude-code-auth-probe-isolation-fix.md
Andrey Avtomonov b565e44a22
feat: add claude-code llm backend with runtime port (#115)
* docs: revise claude-code ingest backend spec

* docs: keep claude-code spec focused on ingest

* docs: expand claude-code spec to full llm parity

* Refine claude-code backend spec after adversarial review iteration 1

* Refine claude-code backend spec after adversarial review iteration 2

* Refine claude-code backend spec after adversarial review iteration 3

* feat: recognize claude-code llm backend

* feat: add ktx llm runtime port

* feat: add claude-code llm runtime

* feat: route non-agent llm calls through runtime

* feat: run ingest agents through llm runtime

* feat: support claude-code setup and status

* test: verify claude-code backend runtime

* docs: add claude-code backend v1 runtime plan

* fix: close claude-code runtime isolation checks

* fix: warn on claude-code prompt caching during setup

* chore: verify claude-code v1 closure

* docs: add claude-code backend v1 isolation closure plan

* fix: update claude-code ingest setup guidance

* docs: add claude-code backend v1 ingest guidance closure plan

* docs: align claude-code isolation spec with sdk metadata

* test: cover claude-code host discovery metadata

* fix: tolerate claude-code host discovery metadata

* docs: clarify claude-code host discovery metadata

* docs: add claude-code auth-probe isolation fix plan

* chore: prepare kaelio ktx rc1 release

* chore: add semantic release workflow

* fix: unblock ci checks

* chore(release): 0.1.0-rc.1

* feat: add Claude Code model selection to setup

* fix: keep git maintenance attached in local repos
2026-05-16 12:06:34 +02:00

26 KiB

Claude Code Auth Probe Isolation Fix Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Make the claude-code auth probe and runtime tolerate host-discovered Claude Code init metadata while preserving KTX-owned tool, MCP, and plugin restrictions.

Architecture: Keep the existing Claude Code runtime and SDK option tuple. Change the init-message assertion from "no host discovery appears" to "only the KTX-controlled execution surface is active." Align the design spec and user docs with the pinned SDK behavior: settingSources: [] disables filesystem settings, skills: [] is a context filter, and deny-by-default canUseTool is the runtime enforcement boundary.

Tech Stack: TypeScript, pnpm, Vitest, Markdown, Fumadocs MDX, @anthropic-ai/claude-agent-sdk@0.3.142.


Audit result

The current strict isolation assertion is a v1-blocking bug. A real authenticated Claude Code host can report non-empty slash_commands, skills, and agents in the SDK init message even when KTX passes settingSources: [], skills: [], plugins: [], tools: [], exact KTX MCP allowedTools, disallowedTools, and deny-by-default canUseTool.

Spec findings:

  • docs/superpowers/specs/2026-05-15-claude-code-backend-design.md:45-47 requires host-discovered capabilities not to expand the KTX agent-loop tool surface. That requirement is about invocation, not necessarily about zero diagnostic metadata in the init message.
  • docs/superpowers/specs/2026-05-15-claude-code-backend-design.md:254-265 overreaches by asking the implementation to assert that unexpected settings-derived commands, skills, agents, plugins, or MCP servers are inactive from the SDK init message. In @anthropic-ai/claude-agent-sdk@0.3.142, the available SDK controls cannot make message.slash_commands, message.skills, or message.agents reliably empty on an authenticated host.
  • docs/superpowers/specs/2026-05-15-claude-code-backend-design.md:266-267 says skills are disabled with skills: []. The pinned SDK type definitions document skills as a context filter, not a sandbox.
  • docs/superpowers/specs/2026-05-15-claude-code-backend-design.md:543-545 correctly requires the auth probe to pass the isolation option tuple and no MCP servers. It does not require failing when host discovery metadata is present.

SDK evidence from node_modules/.pnpm/@anthropic-ai+claude-agent-sdk@0.3.142_zod@4.4.3/node_modules/@anthropic-ai/claude-agent-sdk/sdk.d.ts:

  • Lines 1686-1695: settingSources: [] disables filesystem settings only.
  • Lines 1697-1718: skills: [] is a context filter; unlisted skills are hidden from listing and rejected by the Skill tool, but files remain on disk.
  • Lines 1202-1213: allowedTools is auto-approval, while canUseTool is the permission handler for controlling tool execution.
  • Lines 1224-1228: disallowedTools removes listed tools from context and prevents use.
  • Lines 1255-1264: tools: [] disables built-in tools.
  • Lines 1545-1558: plugins loads plugins when supplied; KTX supplies [].
  • Lines 3465-3489: the init message reports agents, tools, mcp_servers, slash_commands, skills, and plugins.

Implemented plan audit:

  • 2026-05-15-claude-code-backend-v1-runtime.md is implemented for config, runtime port, SDK dependency, model aliases, environment scrubbing, Claude Code text/object/agent execution, setup/status/doctor support, docs, and LLM call-site migration.
  • 2026-05-15-claude-code-backend-v1-isolation-closure.md is implemented, but it converted the spec's ambiguous "assert inactive" line into an impossible assertion against non-empty slash_commands, skills, and agents.
  • 2026-05-15-claude-code-backend-v1-ingest-guidance-closure.md is implemented for the ingest missing-LLM guidance and associated CLI/context tests.

Remaining v1-blocking gaps:

  • packages/context/src/llm/claude-code-runtime.ts:94-101 throws on host-discovered slash commands, skills, and agents.
  • packages/context/src/llm/claude-code-runtime.test.ts:158-178 encodes the wrong behavior by requiring the runtime to reject any init message with discovered agents.
  • The auth probe has no regression coverage for an authenticated host whose init message reports non-empty slash_commands, skills, and agents.
  • User docs under docs-site/content/docs/guides/ say KTX "disables" skills, agents, hooks, and slash commands. That wording is stronger than the SDK contract and must be changed to "not invokable by KTX agent loops."

Non-blocking gaps:

  • Same-step AI SDK tool-call repair parity remains out of scope for v1.
  • OTEL telemetry parity remains out of scope for v1.
  • Embedding parity remains out of scope because embeddings are configured separately.
  • Full prompt-caching parity remains out of scope. V1 keeps warning on ignored prompt-cache fields and avoids AI SDK cache markers on the Claude Code path.

Decision:

  • Choose option (a): relax the assertion in code and align the spec text. Do not rely on an invented SDK mechanism. The pinned type definitions expose settingSources, skills, plugins, tools, allowedTools, disallowedTools, and canUseTool, but they do not expose a query option that disables all host-discovered slash commands or user-level subagent names in the init message.

File structure

Modify these files:

  • docs/superpowers/specs/2026-05-15-claude-code-backend-design.md aligns the design with the real SDK contract.
  • packages/context/src/llm/claude-code-runtime.test.ts adds the failing regression tests for auth probe and runtime init metadata.
  • packages/context/src/llm/claude-code-runtime.ts relaxes init metadata checks while tightening exact tool equality.
  • docs-site/content/docs/guides/llm-configuration.mdx changes user docs from "disabled" to "not invokable."
  • docs-site/content/docs/guides/building-context.mdx applies the same user-facing wording at the ingest guide boundary.

Task 1: Align the design spec with SDK reality

Files:

  • Modify: docs/superpowers/specs/2026-05-15-claude-code-backend-design.md

  • Step 1: Update the tool-boundary goal

Replace the goal bullet at lines 45-47 with:

- Preserve KTX's curated tool boundaries. Claude Code built-ins,
  filesystem-discovered MCP servers, hooks, skills, plugins, agents, and slash
  commands must not become invokable in KTX agent loops. The Agent SDK init
  message may still report host-discovered slash commands, skills, and agents;
  KTX treats that metadata as diagnostic only and restricts execution through
  `tools: []`, exact KTX MCP `allowedTools`, `disallowedTools`, and
  deny-by-default `canUseTool`.
  • Step 2: Replace the over-broad init assertion requirement

Replace the bullet at lines 254-265 with:

- Filesystem settings are not loaded. The SDK's documented default for an
  omitted `settingSources` is `["user", "project", "local"]`
  (`@anthropic-ai/claude-agent-sdk@0.3.142` `sdk.d.ts:1686-1695`),
  which would inherit the user's Claude Code filesystem settings. Every KTX
  `query()` call site - agent loops, text generation, object generation, and
  the auth probe - MUST pass `settingSources: []` explicitly, along with
  `skills: []`, `plugins: []`, `tools: []`, `persistSession: false`, and no
  `mcpServers` entries other than the KTX MCP server (omitted entirely when
  the call site does not expose tools). The implementation MUST assert from
  the SDK init message that the controlled execution surface matches KTX's
  expectations:

  - `message.tools` equals the exact generated KTX MCP tool ids for the current
    call.
  - `message.mcp_servers` equals the expected KTX MCP server set: `[]` when the
    call exposes no tools, or `["ktx"]` when it does.
  - `message.plugins` is empty.

  The implementation MUST NOT reject a run solely because
  `message.slash_commands`, `message.skills`, or `message.agents` contain
  host-discovered names. In `@anthropic-ai/claude-agent-sdk@0.3.142`, those
  fields can report host discovery even when KTX passes the isolation options.
  They are not part of the KTX execution surface when `tools: []`,
  `allowedTools`, `disallowedTools`, and deny-by-default `canUseTool` are set.
  • Step 3: Replace the skills/plugin wording

Replace the bullets at lines 266-289 with:

- `skills: []` is a context filter in the pinned SDK
  (`sdk.d.ts:1697-1718`): unlisted skills are hidden from the model's skill
  listing and rejected by the Skill tool, but discovered skill names may still
  appear in init metadata. KTX must still pass `skills: []`.
- Plugins are disabled with `plugins: []`, and the runtime asserts that
  `message.plugins` is empty in the init message.
- Built-in tools are disabled by setting `tools: []`. The pinned SDK type
  (`@anthropic-ai/claude-agent-sdk@0.3.142`, `sdk.d.ts:1255-1264`) documents
  `tools` as the base set of built-in tools, with `[]` meaning "disable all
  built-ins"; `tools` does not accept MCP tool ids and cannot be used to
  restrict MCP availability.
- MCP tool availability is granted by registering the KTX MCP server through
  `mcpServers`. The SDK does not document a wildcard like `mcp__ktx__*` for
  any tool field; KTX must enumerate exact generated MCP tool ids of the form
  `mcp__ktx__<toolName>` (derived from the tool map handed to
  `createSdkMcpServer`) wherever a list of tool ids is required.
- Pre-approval under `permissionMode: "dontAsk"` is configured by listing those
  same exact `mcp__ktx__<toolName>` ids in `allowedTools` (documented as
  auto-allow without prompting). Treat `allowedTools` as auto-approval, not
  restriction.
- Defense-in-depth restriction uses `canUseTool`. The KTX runtime supplies a
  `canUseTool` handler that allows only tool names in the current KTX MCP tool
  map and denies everything else, so host-discovered slash commands, skills,
  agents, future SDK defaults, or a misconfigured MCP server cannot expand the
  execution surface.
- `disallowedTools` MUST additionally list the current built-in tool names
  (`Agent`, `Task`, `AskUserQuestion`, `Bash`, `Read`, `Edit`, `Write`, `Glob`,
  `Grep`, `WebFetch`, `WebSearch`, `TodoWrite`) as redundant insurance.
  • Step 4: Update auth probe acceptance text

After the auth probe option list at lines 543-545, add:

  The auth probe MUST tolerate init messages with non-empty
  `slash_commands`, `skills`, and `agents` when `message.tools` is empty,
  `message.mcp_servers` is empty, `message.plugins` is empty, and the query
  options contain the KTX isolation tuple. Host discovery metadata is not an
  auth failure.
  • Step 5: Update verified evidence and open items

Replace lines 621-623 with:

- The Agent SDK skills docs say the `skills` option is a context filter rather
  than a sandbox. KTX must pass `skills: []`, but must not assert that
  `message.skills` is empty in the SDK init message.

Replace open item 8 at lines 648-649 with:

8. Write tests proving a raw built-in Claude Code tool request is denied,
   host-discovered Skill/Agent/SlashCommand requests are denied by `canUseTool`,
   and only exact `mcp__ktx__*` tools are allowed during KTX agent loops.

Replace open item 9 at lines 650-654 with:

9. Write a test that asserts every KTX-originated `query()` invocation
   (agent loop, text generation, object generation, auth probe) is called
   with `settingSources: []`, `skills: []`, `plugins: []`, `tools: []`, and
   `persistSession: false`, by spying on the SDK entry point. The test must
   fail if any path falls back to SDK defaults for those fields. The test must
   also prove that non-empty host-discovered `slash_commands`, `skills`, and
   `agents` in the init message do not fail the auth probe or runtime when the
   controlled tool, MCP server, and plugin surfaces match KTX expectations.
  • Step 6: Commit the spec alignment

Run:

git add docs/superpowers/specs/2026-05-15-claude-code-backend-design.md
git commit -m "docs: align claude-code isolation spec with sdk metadata"

Expected: the design spec no longer requires zero host-discovery metadata in the SDK init message.

Task 2: Add regression tests for host-discovered init metadata

Files:

  • Modify: packages/context/src/llm/claude-code-runtime.test.ts

  • Step 1: Replace the invalid agent rejection test

In packages/context/src/llm/claude-code-runtime.test.ts, replace the test named rejects settings-derived agents and non-KTX MCP servers from init messages with these tests:

  it('treats host-discovered commands skills and agents as non-fatal init metadata for text and auth probe', async () => {
    const hostDiscoveredInit = initMessage({
      slash_commands: ['/help', '/compact', '/clear', '/user-command'],
      skills: ['pdf', 'docx'],
      agents: ['claude', 'Explore', 'general-purpose'],
    });
    const textQuery = vi.fn((_input: any) =>
      stream([hostDiscoveredInit, resultMessage({ result: 'hello' })]),
    );
    const runtime = new ClaudeCodeKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'sonnet' },
      query: textQuery,
      env: { ANTHROPIC_API_KEY: 'sk-ant-test', PATH: '/usr/bin' }, // pragma: allowlist secret
    });

    await expect(runtime.generateText({ role: 'default', prompt: 'say hello' })).resolves.toBe('hello');
    const textOptions = textQuery.mock.calls[0][0].options;
    expect(textOptions).toMatchObject({
      settingSources: [],
      skills: [],
      plugins: [],
      tools: [],
      allowedTools: [],
      permissionMode: 'dontAsk',
      persistSession: false,
      env: expect.not.objectContaining({ ANTHROPIC_API_KEY: 'sk-ant-test' }),
    });
    expect(textOptions.disallowedTools).toEqual(expect.arrayContaining(['Agent', 'Task', 'Bash']));
    expect(await textOptions.canUseTool('Agent', {}, { signal: new AbortController().signal, toolUseID: 'agent' })).toMatchObject({
      behavior: 'deny',
      toolUseID: 'agent',
    });
    expect(await textOptions.canUseTool('Skill', {}, { signal: new AbortController().signal, toolUseID: 'skill' })).toMatchObject({
      behavior: 'deny',
      toolUseID: 'skill',
    });
    expect(
      await textOptions.canUseTool('SlashCommand', {}, { signal: new AbortController().signal, toolUseID: 'slash' }),
    ).toMatchObject({
      behavior: 'deny',
      toolUseID: 'slash',
    });

    const probeQuery = vi.fn((_input: any) =>
      stream([hostDiscoveredInit, resultMessage({ result: 'ok' })]),
    );
    await expect(
      runClaudeCodeAuthProbe({
        projectDir: '/tmp/project',
        model: 'sonnet',
        query: probeQuery,
        env: { ANTHROPIC_AUTH_TOKEN: 'token', HOME: '/Users/test' },
      }),
    ).resolves.toEqual({ ok: true });
    expect(probeQuery.mock.calls[0][0].options).toMatchObject({
      settingSources: [],
      skills: [],
      plugins: [],
      tools: [],
      allowedTools: [],
      permissionMode: 'dontAsk',
      persistSession: false,
      env: expect.objectContaining({ HOME: '/Users/test' }),
    });
    expect(probeQuery.mock.calls[0][0].options.env).not.toEqual(
      expect.objectContaining({ ANTHROPIC_AUTH_TOKEN: 'token' }),
    );
  });

  it('allows host-discovered context during agent loops while requiring exact KTX MCP tools and servers', async () => {
    const query = vi.fn((_input: any) =>
      stream([
        initMessage({
          tools: ['mcp__ktx__load_skill'],
          mcp_servers: [{ name: 'ktx', status: 'connected' }],
          slash_commands: ['/help', '/compact', '/clear'],
          skills: ['memory-agent', 'doc-reader'],
          agents: ['claude', 'Plan', 'Explore'],
        }),
        {
          type: 'assistant',
          message: { role: 'assistant', content: [] },
          parent_tool_use_id: null,
          uuid: '00000000-0000-4000-8000-000000000006',
          session_id: 'session-id',
        } as unknown as SDKMessage,
        resultMessage({ subtype: 'error_max_turns', is_error: true }),
      ]),
    );
    const runtime = new ClaudeCodeKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'sonnet' },
      query,
      env: {},
    });

    await expect(
      runtime.runAgentLoop({
        modelRole: 'default',
        systemPrompt: 'system',
        userPrompt: 'user',
        toolSet: {
          load_skill: {
            name: 'load_skill',
            description: 'Load skill.',
            inputSchema: z.object({ name: z.string() }),
            execute: async () => ({ markdown: 'loaded' }),
          },
        },
        stepBudget: 1,
        telemetryTags: { operationName: 'test' },
      }),
    ).resolves.toEqual({ stopReason: 'budget' });

    const options = query.mock.calls[0][0].options;
    expect(options.allowedTools).toEqual(['mcp__ktx__load_skill']);
    expect(await options.canUseTool('mcp__ktx__load_skill', {}, { signal: new AbortController().signal, toolUseID: '1' })).toEqual({
      behavior: 'allow',
      toolUseID: '1',
    });
    expect(await options.canUseTool('Task', {}, { signal: new AbortController().signal, toolUseID: '2' })).toMatchObject({
      behavior: 'deny',
      toolUseID: '2',
    });
    expect(await options.canUseTool('Skill', {}, { signal: new AbortController().signal, toolUseID: '3' })).toMatchObject({
      behavior: 'deny',
      toolUseID: '3',
    });
  });

  it('still rejects unexpected tools, missing KTX tools, plugins, and non-KTX MCP servers from init messages', async () => {
    const query = vi.fn((_input: any) =>
      stream([
        initMessage({
          tools: ['Bash'],
          mcp_servers: [{ name: 'filesystem', status: 'connected' }],
          plugins: [{ name: 'host-plugin', path: '/tmp/plugin' }],
        }),
        resultMessage({ result: 'hello' }),
      ]),
    );
    const runtime = new ClaudeCodeKtxLlmRuntime({
      projectDir: '/tmp/project',
      modelSlots: { default: 'sonnet' },
      query,
      env: {},
    });

    await expect(
      runtime.generateText({
        role: 'default',
        prompt: 'say hello',
        tools: {
          load_skill: {
            name: 'load_skill',
            description: 'Load skill.',
            inputSchema: z.object({ name: z.string() }),
            execute: async () => ({ markdown: 'loaded' }),
          },
        },
      }),
    ).rejects.toThrow(
      /Claude Code runtime isolation failed: .*tools=Bash.*missing_tools=mcp__ktx__load_skill.*mcp_servers=filesystem.*plugins=host-plugin/,
    );
  });
  • Step 2: Run the runtime test to verify it fails

Run:

pnpm --filter @ktx/context exec vitest run src/llm/claude-code-runtime.test.ts

Expected: FAIL. The first new test fails because runClaudeCodeAuthProbe(...) returns { ok: false, ... } and generateText(...) rejects when init metadata contains non-empty slash_commands, skills, or agents. The second new test fails because runAgentLoop(...) returns { stopReason: 'error', ... } for the same reason.

  • Step 3: Commit the failing regression test

Run:

git add packages/context/src/llm/claude-code-runtime.test.ts
git commit -m "test: cover claude-code host discovery metadata"

Expected: the commit contains tests that fail before the runtime assertion is fixed.

Task 3: Relax init metadata assertions to the controlled execution surface

Files:

  • Modify: packages/context/src/llm/claude-code-runtime.ts

  • Step 1: Replace assertInitIsolation

In packages/context/src/llm/claude-code-runtime.ts, replace the full assertInitIsolation(...) function with:

function assertInitIsolation(
  message: SDKMessage,
  allowedToolIds: Set<string>,
  expectedMcpServerNames: Set<string>,
): void {
  if (message.type !== 'system' || message.subtype !== 'init') {
    return;
  }
  const activeToolIds = new Set(message.tools);
  const unexpectedTools = message.tools.filter((toolName) => !allowedToolIds.has(toolName));
  const missingTools = [...allowedToolIds].filter((toolName) => !activeToolIds.has(toolName));
  const activeMcpServerNames = message.mcp_servers.map((server) => server.name);
  const unexpectedMcpServers = activeMcpServerNames.filter((name) => !expectedMcpServerNames.has(name));
  const missingMcpServers = [...expectedMcpServerNames].filter((name) => !activeMcpServerNames.includes(name));
  const unexpectedPlugins = message.plugins.map((plugin) => plugin.name);
  if (
    unexpectedTools.length > 0 ||
    missingTools.length > 0 ||
    unexpectedMcpServers.length > 0 ||
    missingMcpServers.length > 0 ||
    unexpectedPlugins.length > 0
  ) {
    throw new Error(
      `Claude Code runtime isolation failed: tools=${unexpectedTools.join(',') || '(none)'} missing_tools=${
        missingTools.join(',') || '(none)'
      } mcp_servers=${unexpectedMcpServers.join(',') || '(none)'} missing_mcp_servers=${
        missingMcpServers.join(',') || '(none)'
      } plugins=${unexpectedPlugins.join(',') || '(none)'} host_slash_commands=${
        message.slash_commands.length
      } host_skills=${message.skills.length} host_agents=${message.agents?.join(',') || '(none)'}`,
    );
  }
}

This preserves strict checks for the KTX-controlled execution surface:

  • message.tools must exactly equal the generated KTX MCP tool ids for the current call.
  • message.mcp_servers must exactly equal the expected KTX MCP server names.
  • message.plugins must be empty.

It deliberately stops treating message.slash_commands, message.skills, and message.agents as fatal because those fields can contain host-discovered metadata that KTX cannot disable through the pinned SDK options.

  • Step 2: Run the runtime test to verify it passes

Run:

pnpm --filter @ktx/context exec vitest run src/llm/claude-code-runtime.test.ts

Expected: PASS.

  • Step 3: Commit the runtime fix

Run:

git add packages/context/src/llm/claude-code-runtime.ts packages/context/src/llm/claude-code-runtime.test.ts
git commit -m "fix: tolerate claude-code host discovery metadata"

Expected: the auth probe and runtime no longer fail solely because the SDK init message reports host-discovered slash commands, skills, or agents.

Task 4: Correct user-facing docs wording

Files:

  • Modify: docs-site/content/docs/guides/llm-configuration.mdx

  • Modify: docs-site/content/docs/guides/building-context.mdx

  • Step 1: Update the LLM configuration guide wording

In docs-site/content/docs/guides/llm-configuration.mdx, replace lines 39-41 with:

`claude-code` keeps KTX tool boundaries intact. KTX exposes only the MCP tools
needed for the current KTX agent loop, disables Claude Code built-in tools,
keeps plugins empty, and denies every non-KTX tool request through
`canUseTool`. The Claude Agent SDK may still report host-discovered slash
commands, skills, and subagent names in init metadata; that metadata is not an
execution grant for KTX agent loops.
  • Step 2: Update the building context guide wording

In docs-site/content/docs/guides/building-context.mdx, replace lines 61-63 with:

When you use `claude-code`, KTX still controls the tool surface for ingest and
memory capture. Claude Code built-in tools, discovered MCP servers, plugins,
skills, agents, and slash commands are not invokable by KTX agent loops unless
they are exact KTX MCP tools for the current run.
  • Step 3: Run docs tests

Run:

pnpm --filter ktx-docs run test

Expected: PASS.

  • Step 4: Commit docs wording

Run:

git add docs-site/content/docs/guides/llm-configuration.mdx docs-site/content/docs/guides/building-context.mdx
git commit -m "docs: clarify claude-code host discovery metadata"

Expected: user docs describe invocation control rather than promising zero host-discovery metadata.

Task 5: Final verification

Files:

  • Verify: docs/superpowers/specs/2026-05-15-claude-code-backend-design.md

  • Verify: packages/context/src/llm/claude-code-runtime.ts

  • Verify: packages/context/src/llm/claude-code-runtime.test.ts

  • Verify: docs-site/content/docs/guides/llm-configuration.mdx

  • Verify: docs-site/content/docs/guides/building-context.mdx

  • Step 1: Run targeted runtime tests

Run:

pnpm --filter @ktx/context exec vitest run src/llm/claude-code-runtime.test.ts src/llm/runtime-tools.test.ts src/llm/claude-code-env.test.ts

Expected: PASS.

  • Step 2: Run package type-check

Run:

pnpm --filter @ktx/context run type-check

Expected: PASS.

  • Step 3: Run docs verification

Run:

pnpm --filter ktx-docs run test

Expected: PASS.

  • Step 4: Run dead-code checks

Run:

pnpm run dead-code

Expected: PASS or only pre-existing unrelated findings. Investigate and fix any finding caused by the runtime assertion or test changes.

  • Step 5: Inspect git status

Run:

git status --short

Expected: only files from this plan are modified, or the working tree is clean if each task was committed.

Self-review

  • Spec coverage: This plan addresses the v1-blocking auth probe failure, aligns the spec with the SDK contract, preserves the real KTX execution boundary, and adds regression coverage for non-empty host-discovered slash_commands, skills, and agents in both auth probe and runtime paths.
  • Placeholder scan: No placeholder markers remain. Every code-changing step includes exact file paths, code blocks, commands, and expected results.
  • Type consistency: The plan uses existing names from the codebase: ClaudeCodeKtxLlmRuntime, runClaudeCodeAuthProbe, initMessage, resultMessage, assertInitIsolation, mcpToolIds, KtxRuntimeToolSet, and canUseTool.