rowboat/apps/x/packages/shared/src/runs.ts
gagan 372309eb18
feat: run code mode on an in-app ACP client with live approvals (#593)
* feat(code-mode): add ACP client engine (Layer 2 core)

Own the Agent Client Protocol client instead of shelling out to `acpx`, so code
mode can stream structured events (tool calls, diffs, plan) and surface live
permission requests. Headless acpx can't do live approvals (it only supports
--approve-all), which is why we drive the agent adapters ourselves.

- code-mode/acp/{agents,client,permission-broker,session-store,manager,types}.ts:
  headless engine driving the Claude/Codex ACP adapters; one warm session per chat
  with create-or-resume via session/load; approval policy (ask | auto-approve-reads
  | yolo) in the broker.
- claude-exec.ts: cross-platform claude resolver (Windows .cmd EINVAL fix + macOS/Linux
  GUI-PATH safety net) shared with the legacy acpx path in builtin-tools.ts.
- add @agentclientprotocol/sdk + claude/codex adapters to core.

* feat(code-mode): route code mode through code_agent_run tool + live approvals

Replace the acpx shell-out with a structured code_agent_run tool that drives the
ACP engine directly, streaming the agent's tool calls / diffs / plan into the chat
and surfacing permission requests inline.

- shared: code-mode.ts zod schemas; add code-run-event + code-run-permission-request
  RunEvent variants (stream to the renderer over the existing runs:events channel);
  codeRun:resolvePermission IPC channel.
- core: CodePermissionRegistry (promise-based mid-run approvals — the LLM tool-loop's
  pre-call gate can't model a mid-execution wait); register codeModeManager +
  codePermissionRegistry in awilix.
- core: code_agent_run builtin tool (streams via ctx.publish, asks via the registry,
  cancels on ctx.signal, returns the agent summary). CodeModeConfig.approvalPolicy
  (ask | auto-approve-reads | yolo; default ask). Exclude the tool from the headless
  background-task / live-note / inline-task agents so they can't block on an approval.
- main: codeRun:resolvePermission handler -> registry.resolve.
- rewrite the code-with-agents skill and the runtime "Code Mode (Active)" block to call
  code_agent_run instead of emitting npx acpx commands.

* feat(code-mode): render coding runs inline (live timeline + permission card)

Render a code_agent_run tool call as a live CodingRun block instead of generic
tool output: the agent's text, tool-call rows (kind icon + status + changed-file
names from diffs), a plan checklist, and resolved-permission lines — plus an
inline Allow / Always-allow / Deny card wired to codeRun:resolvePermission.

- chat-conversation.ts: ToolCall carries codeRunEvents + pendingCodePermission;
  code_agent_run is excluded from tool-grouping so it renders standalone.
- App.tsx: handle code-run-event / code-run-permission-request, clear the pending
  card on tool-result, handleCodePermissionResponse, render via CodingRunBlock.

* fix(code-mode): run the ACP adapter as Node under Electron + resolve it from main

Two runtime failures that only surfaced inside the packaged/bundled Electron app
(the headless harness used real node, so neither showed there):

- "ACP connection closed": the main process spawns the adapter via
  process.execPath, which inside Electron is the Electron binary, not node — so
  the child never ran as Node and its ACP stdio stream closed immediately. Set
  ELECTRON_RUN_AS_NODE=1 on the adapter env (a no-op under real node).
- "Cannot find module '@agentclientprotocol/claude-agent-acp'": the adapters were
  transitive (core) deps, unreachable from the esbuild-bundled main.cjs. Add them
  as direct deps of the main app so require.resolve finds them at runtime (and so
  they ship when packaged).

Also capture the adapter's stderr + exit code and enrich connection errors, so a
future failure reports the real cause instead of the opaque "ACP connection closed".

* chore(code-mode): remove dead acpx code paths and stale copy

Code mode now runs through the code_agent_run tool (owning the ACP client), so the
legacy acpx shell-out paths are dead. Remove them:

- core: envForCommand (acpx-only CLAUDE_CODE_EXECUTABLE injection) from
  executeCommand; getCodeModeCommandLabel (acpx run-status label).
- renderer: the acpx-detection "switch agent / auto-flip the code-mode chip" flow —
  App.tsx executeCommand detection, the permission-request onSwitchAgent button +
  badge, and the composer's code-mode-detected listener.
- copy: Settings -> Code Mode and the code-with-agents skill summary no longer
  mention acpx; tidy stale comments (claude-exec, command-executor).

No behavior change for code mode; the general executeCommand tool is unaffected.

* feat(code-mode): approval-policy selector in Settings

Surface the approval policy (Ask every time / Auto-approve reads / YOLO) in
Settings -> Code Mode, instead of being config-file only. The broker already
reads CodeModeConfig.approvalPolicy; this plumbs it through the
codeMode:getConfig / setConfig IPC + main handlers and adds the picker UI
(with a one-line explanation of each level). Defaults to "ask".

* fix(code-mode): harden ACP engine — turn-scoped connections, chip-authoritative agent, reliable stop

Three robustness fixes that co-modify manager.runPrompt and the code_agent_run
tool, so they land together:

- Lifecycle: scope each ACP adapter connection to the agent turn. Dispose it a
  short grace (60s) after the turn ends instead of holding it for the app's life;
  the next turn resumes via session/load (both agents support it). Wire
  disposeAll() on app quit (was dead code). Fixes the unbounded per-chat leak of
  booted agent processes.

- Agent selection: make the composer chip the source of truth. Thread codeMode
  into ToolContext; code_agent_run uses it instead of the model's guessed `agent`
  arg, which anchored on the thread's earlier agent and ignored a chip change.
  Prompts updated to match; the run is labelled by the agent that actually ran.

- Stop/abort: guarantee a stopped turn unwinds. On abort the manager sends ACP
  session/cancel, then force-kills the adapter after a 2s grace and resolves the
  turn as cancelled — a wedged adapter can no longer hang the run and lock the
  chat. code_agent_run returns a clean cancelled result.

* fix(code-mode): hide Codex's native console window on Windows

Codex's engine ships as a native console-subsystem binary (codex.exe). Launched
from our console-less Electron process tree, Windows allocated a fresh *visible*
console window for it; closing that window wedged the run in a pending state.
(Claude Code is a Node CLI, so it never triggers this.)

The window is created by @openai/codex's launcher (bin/codex.js), which spawns
codex.exe with no windowsHide. Patch it via pnpm to pass windowsHide: true
(CREATE_NO_WINDOW) so the console stays hidden — no window, nothing to close.

* refactor(code-mode): move ACP session files out of WorkDir/config

Per-run ACP session state is runtime state that accumulates one file per
chat run, not user/app config. Relocate it from WorkDir/config to a
dedicated WorkDir/code-mode/sessions/ so it can be listed, cleaned up, and
managed on its own without crowding config. Drop the now-redundant
codesession- filename prefix (the directory conveys it).
2026-06-05 14:45:08 +05:30

223 lines
6.2 KiB
TypeScript

import { LlmStepStreamEvent } from "./llm-step-events.js";
import { Message, ToolCallPart } from "./message.js";
import { CodeRunEvent as CodeRunEventSchema, PermissionAsk } from "./code-mode.js";
import z from "zod";
const BaseRunEvent = z.object({
runId: z.string(),
ts: z.iso.datetime().optional(),
subflow: z.array(z.string()),
});
export const RunProcessingStartEvent = BaseRunEvent.extend({
type: z.literal("run-processing-start"),
});
export const RunProcessingEndEvent = BaseRunEvent.extend({
type: z.literal("run-processing-end"),
});
export const StartEvent = BaseRunEvent.extend({
type: z.literal("start"),
agentName: z.string(),
model: z.string(),
provider: z.string(),
permissionMode: z.enum(["manual", "auto"]).optional(),
// useCase/subUseCase tag the run for analytics. Optional on read so legacy
// run files written before these fields existed still parse cleanly.
useCase: z.enum([
"copilot_chat",
"live_note_agent",
"background_task_agent",
"meeting_note",
"knowledge_sync",
]).optional(),
subUseCase: z.string().optional(),
});
export const SpawnSubFlowEvent = BaseRunEvent.extend({
type: z.literal("spawn-subflow"),
agentName: z.string(),
toolCallId: z.string(),
});
export const LlmStreamEvent = BaseRunEvent.extend({
type: z.literal("llm-stream-event"),
event: LlmStepStreamEvent,
});
export const MessageEvent = BaseRunEvent.extend({
type: z.literal("message"),
messageId: z.string(),
message: Message,
});
export const ToolInvocationEvent = BaseRunEvent.extend({
type: z.literal("tool-invocation"),
toolCallId: z.string().optional(),
toolName: z.string(),
input: z.string(),
});
export const ToolResultEvent = BaseRunEvent.extend({
type: z.literal("tool-result"),
toolCallId: z.string().optional(),
toolName: z.string(),
result: z.any(),
});
export const ToolOutputStreamEvent = BaseRunEvent.extend({
type: z.literal("tool-output-stream"),
toolCallId: z.string(),
toolName: z.string(),
output: z.string(),
});
export const AskHumanRequestEvent = BaseRunEvent.extend({
type: z.literal("ask-human-request"),
toolCallId: z.string(),
query: z.string(),
options: z.array(z.string()).optional(),
});
export const AskHumanResponseEvent = BaseRunEvent.extend({
type: z.literal("ask-human-response"),
toolCallId: z.string(),
response: z.string(),
});
export const ToolPermissionMetadata = z.discriminatedUnion("kind", [
z.object({
kind: z.literal("command"),
commandNames: z.array(z.string()),
}),
z.object({
kind: z.literal("file"),
operation: z.enum(["read", "list", "search", "write", "delete"]),
paths: z.array(z.string()),
pathPrefix: z.string(),
}),
]);
export const ToolPermissionRequestEvent = BaseRunEvent.extend({
type: z.literal("tool-permission-request"),
toolCall: ToolCallPart,
permission: ToolPermissionMetadata.optional(),
});
export const ToolPermissionResponseEvent = BaseRunEvent.extend({
type: z.literal("tool-permission-response"),
toolCallId: z.string(),
response: z.enum(["approve", "deny"]),
scope: z.enum(["once", "session", "always"]).optional(),
});
// A structured item from a code_agent_run coding turn (tool call, diff, plan,
// message chunk, resolved permission). Fire-and-forget — rendered live.
export const CodeRunStreamEvent = BaseRunEvent.extend({
type: z.literal("code-run-event"),
toolCallId: z.string(),
event: CodeRunEventSchema,
});
// The coding agent is asking for permission mid-turn and the run is BLOCKED until
// the user answers via `codeRun:resolvePermission` (keyed by requestId).
export const CodeRunPermissionRequestEvent = BaseRunEvent.extend({
type: z.literal("code-run-permission-request"),
toolCallId: z.string(),
requestId: z.string(),
ask: PermissionAsk,
});
export const ToolPermissionAutoDecisionEvent = BaseRunEvent.extend({
type: z.literal("tool-permission-auto-decision"),
toolCallId: z.string(),
toolCall: ToolCallPart,
permission: ToolPermissionMetadata.optional(),
decision: z.enum(["allow", "deny"]),
reason: z.string(),
});
export const RunErrorEvent = BaseRunEvent.extend({
type: z.literal("error"),
error: z.string(),
});
export const RunStoppedEvent = BaseRunEvent.extend({
type: z.literal("run-stopped"),
reason: z.enum(["user-requested", "force-stopped"]).optional(),
});
export const RunEvent = z.union([
RunProcessingStartEvent,
RunProcessingEndEvent,
StartEvent,
SpawnSubFlowEvent,
LlmStreamEvent,
MessageEvent,
ToolInvocationEvent,
ToolResultEvent,
ToolOutputStreamEvent,
AskHumanRequestEvent,
AskHumanResponseEvent,
ToolPermissionRequestEvent,
ToolPermissionResponseEvent,
CodeRunStreamEvent,
CodeRunPermissionRequestEvent,
ToolPermissionAutoDecisionEvent,
RunErrorEvent,
RunStoppedEvent,
]);
export const ToolPermissionAuthorizePayload = ToolPermissionResponseEvent.pick({
subflow: true,
toolCallId: true,
response: true,
scope: true,
});
export const AskHumanResponsePayload = AskHumanResponseEvent.pick({
subflow: true,
toolCallId: true,
response: true,
});
export const UseCase = z.enum([
"copilot_chat",
"live_note_agent",
"background_task_agent",
"meeting_note",
"knowledge_sync",
]);
export const Run = z.object({
id: z.string(),
title: z.string().optional(),
createdAt: z.iso.datetime(),
agentId: z.string(),
model: z.string(),
provider: z.string(),
permissionMode: z.enum(["manual", "auto"]).optional(),
useCase: UseCase.optional(),
subUseCase: z.string().optional(),
log: z.array(RunEvent),
});
export const ListRunsResponse = z.object({
runs: z.array(Run.pick({
id: true,
title: true,
createdAt: true,
agentId: true,
})),
nextCursor: z.string().optional(),
});
export const CreateRunOptions = z.object({
agentId: z.string(),
model: z.string().optional(),
provider: z.string().optional(),
permissionMode: z.enum(["manual", "auto"]).optional(),
useCase: UseCase.optional(),
subUseCase: z.string().optional(),
});