rowboat/apps/x/packages/shared/src/code-mode.ts

71 lines
2.7 KiB
TypeScript

feat: run code mode on an in-app ACP client with live approvals (#593)

* feat(code-mode): add ACP client engine (Layer 2 core)

Own the Agent Client Protocol client instead of shelling out to `acpx`, so code mode can stream structured events (tool calls, diffs, plan) and surface live permission requests. Headless acpx can't do live approvals (it only supports --approve-all), which is why we drive the agent adapters ourselves.

- code-mode/acp/{agents,client,permission-broker,session-store,manager,types}.ts: headless engine driving the Claude/Codex ACP adapters; one warm session per chat with create-or-resume via session/load; approval policy (ask | auto-approve-reads | yolo) in the broker.
- claude-exec.ts: cross-platform claude resolver (Windows .cmd EINVAL fix + macOS/Linux GUI-PATH safety net) shared with the legacy acpx path in builtin-tools.ts.
- add @agentclientprotocol/sdk + claude/codex adapters to core.

* feat(code-mode): route code mode through code_agent_run tool + live approvals

Replace the acpx shell-out with a structured code_agent_run tool that drives the ACP engine directly, streaming the agent's tool calls / diffs / plan into the chat and surfacing permission requests inline.

- shared: code-mode.ts zod schemas; add code-run-event + code-run-permission-request RunEvent variants (stream to the renderer over the existing runs:events channel); codeRun:resolvePermission IPC channel.
- core: CodePermissionRegistry (promise-based mid-run approvals, because the LLM tool-loop's pre-call gate can't model a mid-execution wait); register codeModeManager + codePermissionRegistry in awilix.
- core: code_agent_run builtin tool (streams via ctx.publish, asks via the registry, cancels on ctx.signal, returns the agent summary). CodeModeConfig.approvalPolicy (ask | auto-approve-reads | yolo; default ask). Exclude the tool from the headless background-task / live-note / inline-task agents so they can't block on an approval.
- main: codeRun:resolvePermission handler -> registry.resolve.
- rewrite the code-with-agents skill and the runtime "Code Mode (Active)" block to call code_agent_run instead of emitting npx acpx commands.

* feat(code-mode): render coding runs inline (live timeline + permission card)

Render a code_agent_run tool call as a live CodingRun block instead of generic tool output: the agent's text, tool-call rows (kind icon + status + changed-file names from diffs), a plan checklist, and resolved-permission lines, plus an inline Allow / Always-allow / Deny card wired to codeRun:resolvePermission.

- chat-conversation.ts: ToolCall carries codeRunEvents + pendingCodePermission; code_agent_run is excluded from tool-grouping so it renders standalone.
- App.tsx: handle code-run-event / code-run-permission-request, clear the pending card on tool-result, handleCodePermissionResponse, render via CodingRunBlock.

* fix(code-mode): run the ACP adapter as Node under Electron + resolve it from main

Two runtime failures that only surfaced inside the packaged/bundled Electron app (the headless harness used real node, so neither showed there):

- "ACP connection closed": the main process spawns the adapter via process.execPath, which inside Electron is the Electron binary, not node, so the child never ran as Node and its ACP stdio stream closed immediately. Set ELECTRON_RUN_AS_NODE=1 on the adapter env (a no-op under real node).
- "Cannot find module '@agentclientprotocol/claude-agent-acp'": the adapters were transitive (core) deps, unreachable from the esbuild-bundled main.cjs. Add them as direct deps of the main app so require.resolve finds them at runtime (and so they ship when packaged).

Also capture the adapter's stderr + exit code and enrich connection errors, so a future failure reports the real cause instead of the opaque "ACP connection closed".

* chore(code-mode): remove dead acpx code paths and stale copy

Code mode now runs through the code_agent_run tool (owning the ACP client), so the legacy acpx shell-out paths are dead. Remove them:

- core: envForCommand (acpx-only CLAUDE_CODE_EXECUTABLE injection) from executeCommand; getCodeModeCommandLabel (acpx run-status label).
- renderer: the acpx-detection "switch agent / auto-flip the code-mode chip" flow: App.tsx executeCommand detection, the permission-request onSwitchAgent button + badge, and the composer's code-mode-detected listener.
- copy: Settings -> Code Mode and the code-with-agents skill summary no longer mention acpx; tidy stale comments (claude-exec, command-executor).

No behavior change for code mode; the general executeCommand tool is unaffected.

* feat(code-mode): approval-policy selector in Settings

Surface the approval policy (Ask every time / Auto-approve reads / YOLO) in Settings -> Code Mode, instead of being config-file only. The broker already reads CodeModeConfig.approvalPolicy; this plumbs it through the codeMode:getConfig / setConfig IPC + main handlers and adds the picker UI (with a one-line explanation of each level). Defaults to "ask".

* fix(code-mode): harden ACP engine: turn-scoped connections, chip-authoritative agent, reliable stop

Three robustness fixes that co-modify manager.runPrompt and the code_agent_run tool, so they land together:

- Lifecycle: scope each ACP adapter connection to the agent turn. Dispose it a short grace (60s) after the turn ends instead of holding it for the app's life; the next turn resumes via session/load (both agents support it). Wire disposeAll() on app quit (was dead code). Fixes the unbounded per-chat leak of booted agent processes.
- Agent selection: make the composer chip the source of truth. Thread codeMode into ToolContext; code_agent_run uses it instead of the model's guessed `agent` arg, which anchored on the thread's earlier agent and ignored a chip change. Prompts updated to match; the run is labelled by the agent that actually ran.
- Stop/abort: guarantee a stopped turn unwinds. On abort the manager sends ACP session/cancel, then force-kills the adapter after a 2s grace and resolves the turn as cancelled, so a wedged adapter can no longer hang the run and lock the chat. code_agent_run returns a clean cancelled result.

* fix(code-mode): hide Codex's native console window on Windows

Codex's engine ships as a native console-subsystem binary (codex.exe). Launched from our console-less Electron process tree, Windows allocated a fresh *visible* console window for it; closing that window wedged the run in a pending state. (Claude Code is a Node CLI, so it never triggers this.) The window is created by @openai/codex's launcher (bin/codex.js), which spawns codex.exe with no windowsHide. Patch it via pnpm to pass windowsHide: true (CREATE_NO_WINDOW) so the console stays hidden: no window, nothing to close.

* refactor(code-mode): move ACP session files out of WorkDir/config

Per-run ACP session state is runtime state that accumulates one file per chat run, not user/app config. Relocate it from WorkDir/config to a dedicated WorkDir/code-mode/sessions/ so it can be listed, cleaned up, and managed on its own without crowding config. Drop the now-redundant codesession- filename prefix (the directory conveys it).
2026-06-05 14:45:08 +05:30
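
The commit message above describes CodePermissionRegistry as a promise-based way to wait for an approval in the middle of a run, with the codeRun:resolvePermission handler calling registry.resolve. The real class is not shown on this page, so the following is only a minimal sketch of that pattern. The class name `PendingPermissions`, its method names, and the `requestId` key are hypothetical; the decision values are copied from the schemas in this file.

```typescript
// Hypothetical sketch of a promise-based approval registry.
// Names here are illustrative and are NOT the project's actual API.
type PermissionDecision = "allow_once" | "allow_always" | "reject";
type Resolution = PermissionDecision | "cancelled";

class PendingPermissions {
    // One parked resolver per outstanding request.
    private readonly waiters = new Map<string, (r: Resolution) => void>();

    // Called by the running tool: returns a promise that stays pending
    // until resolve() or cancelAll() settles it.
    request(requestId: string): Promise<Resolution> {
        if (this.waiters.has(requestId)) {
            throw new Error(`duplicate permission request: ${requestId}`);
        }
        return new Promise<Resolution>((resolve) => {
            this.waiters.set(requestId, resolve);
        });
    }

    // Called from the IPC handler when the user answers the card.
    // Returns false for unknown or already-answered ids.
    resolve(requestId: string, decision: Resolution): boolean {
        const waiter = this.waiters.get(requestId);
        if (waiter === undefined) return false;
        this.waiters.delete(requestId);
        waiter(decision);
        return true;
    }

    // Settle everything still pending, for example when the run is aborted.
    cancelAll(): void {
        for (const id of Array.from(this.waiters.keys())) {
            this.resolve(id, "cancelled");
        }
    }

    get size(): number {
        return this.waiters.size;
    }
}
```

In this sketch the tool would `await registry.request(id)` after publishing the permission request, and the main-process handler would call `registry.resolve(id, decision)`. Settling with "cancelled" on abort keeps a stopped run from waiting forever on a card nobody will answer.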
import z from "zod";
// Shared zod schemas for the ACP code-mode engine. Single source of truth: the
// core engine re-exports the inferred TS types, and runs.ts builds the RunEvent
// variants that carry these to the renderer.
export const CodingAgent = z.enum(["claude", "codex"]);
export type CodingAgent = z.infer<typeof CodingAgent>;
// How the permission broker answers the agent's requests before any per-tool
// "always allow" memory is applied. `yolo` is the safe, scoped equivalent of
// `claude --dangerously-skip-permissions` (our toggle, not a CLI flag).
export const ApprovalPolicy = z.enum(["ask", "auto-approve-reads", "yolo"]);
export type ApprovalPolicy = z.infer<typeof ApprovalPolicy>;
export const PermissionDecision = z.enum(["allow_once", "allow_always", "reject"]);
export type PermissionDecision = z.infer<typeof PermissionDecision>;
// What the UI needs to render a permission card.
export const PermissionAsk = z.object({
    toolCallId: z.string().optional(),
    title: z.string(),
    kind: z.string().optional(), // tool kind, e.g. "edit" | "execute" | "read"
    isRead: z.boolean(),
});
export type PermissionAsk = z.infer<typeof PermissionAsk>;
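
The comments above say the approval policy determines how the broker answers a request before any "always allow" memory is applied, and PermissionAsk carries an `isRead` flag. One plausible reading of the three policies is sketched below. The function name `autoDecision` is hypothetical, the types are hand-written mirrors of the zod schemas, and answering automatic approvals with "allow_once" is an assumption; the actual broker may behave differently.

```typescript
// Hand-written mirrors of the zod-inferred types (assumption: kept in sync manually).
type ApprovalPolicy = "ask" | "auto-approve-reads" | "yolo";
type PermissionDecision = "allow_once" | "allow_always" | "reject";
interface PermissionAsk {
    toolCallId?: string;
    title: string;
    kind?: string;
    isRead: boolean;
}

// Hypothetical helper: returns a decision when the policy can answer on its
// own, or null when the user has to be asked via the permission card.
function autoDecision(policy: ApprovalPolicy, ask: PermissionAsk): PermissionDecision | null {
    if (policy === "yolo") return "allow_once"; // approve everything
    if (policy === "auto-approve-reads" && ask.isRead) return "allow_once"; // reads only
    return null; // "ask", or a non-read under "auto-approve-reads"
}
```

Returning null rather than a fourth decision value keeps the "needs a human" case out of PermissionDecision, so the schema only ever records answers that were actually given.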
// Normalized per-run stream items. The engine maps raw ACP session/update
// notifications onto this union; the renderer renders them.
export const CodeRunEvent = z.discriminatedUnion("type", [
    // role distinguishes the agent's own output from replayed user turns
    // (loadSession streams the whole prior conversation back on resume).
    z.object({ type: z.literal("message"), role: z.enum(["agent", "user"]), text: z.string() }),
    z.object({ type: z.literal("thought") }),
    z.object({
        type: z.literal("tool_call"),
        id: z.string().optional(),
        title: z.string().optional(),
        kind: z.string().optional(),
        status: z.string().optional(),
    }),
    z.object({
        type: z.literal("tool_call_update"),
        id: z.string().optional(),
        status: z.string().optional(),
        diffs: z.array(z.string()),
    }),
    z.object({
        type: z.literal("plan"),
        entries: z.array(z.object({
            content: z.string(),
            status: z.string().optional(),
            priority: z.string().optional(),
        })),
    }),
    z.object({
        type: z.literal("permission"),
        ask: PermissionAsk,
        decision: z.union([PermissionDecision, z.literal("cancelled")]),
        auto: z.boolean(),
    }),
    z.object({ type: z.literal("other"), sessionUpdate: z.string() }),
]);
export type CodeRunEvent = z.infer<typeof CodeRunEvent>;
export const RunPromptResult = z.object({
    stopReason: z.string(),
    sessionId: z.string(),
});
export type RunPromptResult = z.infer<typeof RunPromptResult>;
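
The CodeRunEvent union above is a stream of per-run items, and the commit message describes the renderer showing tool-call rows with a status and changed-file names from diffs. A consumer might fold the `tool_call` and `tool_call_update` events into one row per tool call, roughly as sketched here. `ToolRow` and `foldToolRows` are hypothetical names, the event types are hand-written mirrors of the two relevant union members, and treating each `diffs` entry as a changed-file path is an assumption based on the commit message.

```typescript
// Hand-written mirrors of two CodeRunEvent variants (assumption: kept in sync manually).
type ToolEvent =
    | { type: "tool_call"; id?: string; title?: string; kind?: string; status?: string }
    | { type: "tool_call_update"; id?: string; status?: string; diffs: string[] };

// Hypothetical view model for one row in the run timeline.
interface ToolRow {
    id: string;
    title?: string;
    kind?: string;
    status?: string;
    diffs: string[];
}

// Fold the event stream into one row per tool-call id, in first-seen order.
function foldToolRows(events: ToolEvent[]): ToolRow[] {
    const rows = new Map<string, ToolRow>();
    for (const ev of events) {
        if (ev.id === undefined) continue; // no id, nothing to key the row on
        const row: ToolRow = rows.get(ev.id) ?? { id: ev.id, diffs: [] };
        if (ev.type === "tool_call") {
            if (ev.title !== undefined) row.title = ev.title;
            if (ev.kind !== undefined) row.kind = ev.kind;
        } else {
            // Accumulate diff entries without duplicates.
            for (const d of ev.diffs) {
                if (row.diffs.indexOf(d) === -1) row.diffs.push(d);
            }
        }
        if (ev.status !== undefined) row.status = ev.status; // latest status wins
        rows.set(ev.id, row);
    }
    return Array.from(rows.values());
}
```

Because `id` is optional in the schema, the sketch simply skips events without one; a real renderer might instead show them as standalone rows.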