2025-12-29 15:30:57 +05:30
import { LlmStepStreamEvent } from "./llm-step-events.js";
import { Message, ToolCallPart } from "./message.js";
feat: run code mode on an in-app ACP client with live approvals (#593)
* feat(code-mode): add ACP client engine (Layer 2 core)
Own the Agent Client Protocol client instead of shelling out to `acpx`, so code
mode can stream structured events (tool calls, diffs, plan) and surface live
permission requests. Headless acpx can't do live approvals (it only supports
--approve-all), which is why we drive the agent adapters ourselves.
- code-mode/acp/{agents,client,permission-broker,session-store,manager,types}.ts:
headless engine driving the Claude/Codex ACP adapters; one warm session per chat
with create-or-resume via session/load; approval policy (ask | auto-approve-reads
| yolo) in the broker.
- claude-exec.ts: cross-platform claude resolver (Windows .cmd EINVAL fix + macOS/Linux
GUI-PATH safety net) shared with the legacy acpx path in builtin-tools.ts.
- add @agentclientprotocol/sdk + claude/codex adapters to core.
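The three approval levels named above (ask | auto-approve-reads | yolo) can be read as a small decision function. The following is a minimal sketch under assumptions: the type and function names are hypothetical, and reducing a request to a single `isRead` flag is a simplification, since the commit does not say how the broker classifies requests.

```typescript
// Hypothetical sketch; only the three policy names come from the commit text.
type ApprovalPolicy = "ask" | "auto-approve-reads" | "yolo";
type BrokerOutcome = "allow" | "ask-user";

// Decide whether a permission request can be answered without the user.
// `isRead` is an assumed classification of the request (read-only or not).
function decideApproval(policy: ApprovalPolicy, isRead: boolean): BrokerOutcome {
    if (policy === "yolo") return "allow";
    if (policy === "auto-approve-reads" && isRead) return "allow";
    return "ask-user";
}
```

Under this reading, "ask" always defers to the user, and "auto-approve-reads" defers only for requests that are not reads.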
* feat(code-mode): route code mode through code_agent_run tool + live approvals
Replace the acpx shell-out with a structured code_agent_run tool that drives the
ACP engine directly, streaming the agent's tool calls / diffs / plan into the chat
and surfacing permission requests inline.
- shared: code-mode.ts zod schemas; add code-run-event + code-run-permission-request
RunEvent variants (stream to the renderer over the existing runs:events channel);
codeRun:resolvePermission IPC channel.
- core: CodePermissionRegistry (promise-based mid-run approvals — the LLM tool-loop's
pre-call gate can't model a mid-execution wait); register codeModeManager +
codePermissionRegistry in awilix.
- core: code_agent_run builtin tool (streams via ctx.publish, asks via the registry,
cancels on ctx.signal, returns the agent summary). CodeModeConfig.approvalPolicy
(ask | auto-approve-reads | yolo; default ask). Exclude the tool from the headless
background-task / live-note / inline-task agents so they can't block on an approval.
- main: codeRun:resolvePermission handler -> registry.resolve.
- rewrite the code-with-agents skill and the runtime "Code Mode (Active)" block to call
code_agent_run instead of emitting npx acpx commands.
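To illustrate the "promise-based mid-run approvals" idea, here is a minimal sketch of a registry that parks a promise per request and settles it when the user answers. The class name, method names, and decision values are assumptions made for illustration; this is not the project's actual CodePermissionRegistry.

```typescript
// Hypothetical sketch of a promise-based approval registry.
type PermissionDecision = "allow" | "always-allow" | "deny";

class PermissionRegistrySketch {
    private pending = new Map<string, (decision: PermissionDecision) => void>();

    // Called by the tool mid-run: the returned promise stays pending until
    // resolve() is called with the same requestId.
    request(requestId: string): Promise<PermissionDecision> {
        return new Promise<PermissionDecision>((settle) => {
            this.pending.set(requestId, settle);
        });
    }

    // Called when the user answers (for example from an IPC handler).
    // Returns false for unknown or already-answered requests.
    resolve(requestId: string, decision: PermissionDecision): boolean {
        const settle = this.pending.get(requestId);
        if (!settle) return false;
        this.pending.delete(requestId);
        settle(decision);
        return true;
    }

    // Number of requests still waiting on the user.
    pendingCount(): number {
        return this.pending.size;
    }
}
```

A tool would `await registry.request(requestId)` in the middle of its execution, which is the wait a pre-call gate cannot express.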
* feat(code-mode): render coding runs inline (live timeline + permission card)
Render a code_agent_run tool call as a live CodingRun block instead of generic
tool output: the agent's text, tool-call rows (kind icon + status + changed-file
names from diffs), a plan checklist, and resolved-permission lines — plus an
inline Allow / Always-allow / Deny card wired to codeRun:resolvePermission.
- chat-conversation.ts: ToolCall carries codeRunEvents + pendingCodePermission;
code_agent_run is excluded from tool-grouping so it renders standalone.
- App.tsx: handle code-run-event / code-run-permission-request, clear the pending
card on tool-result, handleCodePermissionResponse, render via CodingRunBlock.
* fix(code-mode): run the ACP adapter as Node under Electron + resolve it from main
Two runtime failures that only surfaced inside the packaged/bundled Electron app
(the headless harness used real node, so neither showed there):
- "ACP connection closed": the main process spawns the adapter via
process.execPath, which inside Electron is the Electron binary, not node — so
the child never ran as Node and its ACP stdio stream closed immediately. Set
ELECTRON_RUN_AS_NODE=1 on the adapter env (a no-op under real node).
- "Cannot find module '@agentclientprotocol/claude-agent-acp'": the adapters were
transitive (core) deps, unreachable from the esbuild-bundled main.cjs. Add them
as direct deps of the main app so require.resolve finds them at runtime (and so
they ship when packaged).
Also capture the adapter's stderr + exit code and enrich connection errors, so a
future failure reports the real cause instead of the opaque "ACP connection closed".
* chore(code-mode): remove dead acpx code paths and stale copy
Code mode now runs through the code_agent_run tool (owning the ACP client), so the
legacy acpx shell-out paths are dead. Remove them:
- core: envForCommand (acpx-only CLAUDE_CODE_EXECUTABLE injection) from
executeCommand; getCodeModeCommandLabel (acpx run-status label).
- renderer: the acpx-detection "switch agent / auto-flip the code-mode chip" flow —
App.tsx executeCommand detection, the permission-request onSwitchAgent button +
badge, and the composer's code-mode-detected listener.
- copy: Settings -> Code Mode and the code-with-agents skill summary no longer
mention acpx; tidy stale comments (claude-exec, command-executor).
No behavior change for code mode; the general executeCommand tool is unaffected.
* feat(code-mode): approval-policy selector in Settings
Surface the approval policy (Ask every time / Auto-approve reads / YOLO) in
Settings -> Code Mode, instead of being config-file only. The broker already
reads CodeModeConfig.approvalPolicy; this plumbs it through the
codeMode:getConfig / setConfig IPC + main handlers and adds the picker UI
(with a one-line explanation of each level). Defaults to "ask".
* fix(code-mode): harden ACP engine — turn-scoped connections, chip-authoritative agent, reliable stop
Three robustness fixes that co-modify manager.runPrompt and the code_agent_run
tool, so they land together:
- Lifecycle: scope each ACP adapter connection to the agent turn. Dispose it a
short grace (60s) after the turn ends instead of holding it for the app's life;
the next turn resumes via session/load (both agents support it). Wire
disposeAll() on app quit (was dead code). Fixes the unbounded per-chat leak of
booted agent processes.
- Agent selection: make the composer chip the source of truth. Thread codeMode
into ToolContext; code_agent_run uses it instead of the model's guessed `agent`
arg, which anchored on the thread's earlier agent and ignored a chip change.
Prompts updated to match; the run is labelled by the agent that actually ran.
- Stop/abort: guarantee a stopped turn unwinds. On abort the manager sends ACP
session/cancel, then force-kills the adapter after a 2s grace and resolves the
turn as cancelled — a wedged adapter can no longer hang the run and lock the
chat. code_agent_run returns a clean cancelled result.
* fix(code-mode): hide Codex's native console window on Windows
Codex's engine ships as a native console-subsystem binary (codex.exe). Launched
from our console-less Electron process tree, Windows allocated a fresh *visible*
console window for it; closing that window wedged the run in a pending state.
(Claude Code is a Node CLI, so it never triggers this.)
The window is created by @openai/codex's launcher (bin/codex.js), which spawns
codex.exe with no windowsHide. Patch it via pnpm to pass windowsHide: true
(CREATE_NO_WINDOW) so the console stays hidden — no window, nothing to close.
* refactor(code-mode): move ACP session files out of WorkDir/config
Per-run ACP session state is runtime state that accumulates one file per
chat run, not user/app config. Relocate it from WorkDir/config to a
dedicated WorkDir/code-mode/sessions/ so it can be listed, cleaned up, and
managed on its own without crowding config. Drop the now-redundant
codesession- filename prefix (the directory conveys it).
2026-06-05 14:45:08 +05:30
import { CodeRunEvent as CodeRunEventSchema, PermissionAsk } from "./code-mode.js";
2025-12-29 15:30:57 +05:30
import z from "zod";

const BaseRunEvent = z.object({
    runId: z.string(),
    ts: z.iso.datetime().optional(),
    subflow: z.array(z.string()),
});

export const RunProcessingStartEvent = BaseRunEvent.extend({
    type: z.literal("run-processing-start"),
});

export const RunProcessingEndEvent = BaseRunEvent.extend({
    type: z.literal("run-processing-end"),
});

export const StartEvent = BaseRunEvent.extend({
    type: z.literal("start"),
    agentName: z.string(),
freeze model + provider per run at creation time
The model dropdown was broken in two ways: it wrote to ~/.rowboat/config/models.json
(the BYOK creds file, stamped with a fake `flavor: 'openrouter'` to satisfy zod
when signed in), and the runtime ignored that write entirely for signed-in users
because `streamAgent` hard-coded `gpt-5.4`. Model selection was also globally
scoped, so every chat shared one brain.
This change moves model + provider out of the global config and onto the run
itself, resolved once at runs:create and frozen for the run's lifetime.
## Resolution
`runsCore.createRun` resolves per-field, falling through:
run.model = opts.model ?? agent.model ?? defaults.model
run.provider = opts.provider ?? agent.provider ?? defaults.provider
A new `core/models/defaults.ts` is the only place in the codebase that branches
on signed-in state. `getDefaultModelAndProvider()` returns name strings;
`resolveProviderConfig(name)` does the name → full LlmProvider lookup at
runtime. `createProvider` learns about `flavor: 'rowboat'` so the gateway is
just another flavor.
`provider` is stored as a name (e.g. `"rowboat"`, `"openai"`), not a full
LlmProvider object. API keys never get written into the JSONL log; rotating a
key in models.json applies to existing runs without re-creation. Cost: deleting
a provider from settings breaks runs that referenced it (clear error surfaced
via `resolveProviderConfig`).
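The per-field fall-through above can be sketched as follows. This is an illustration only: the interface and function names are hypothetical, and only the `??` order (opts, then agent, then defaults) and the idea of storing names rather than provider objects come from the text.

```typescript
// Hypothetical shapes; both fields are plain name strings, never credentials.
interface ModelPreference {
    model?: string;
    provider?: string;
}

interface ResolvedModel {
    model: string;
    provider: string;
}

// Resolve each field independently: explicit options win, then the agent's
// declared preference, then the defaults.
function resolveRunModel(
    opts: ModelPreference,
    agent: ModelPreference,
    defaults: ResolvedModel,
): ResolvedModel {
    return {
        model: opts.model ?? agent.model ?? defaults.model,
        provider: opts.provider ?? agent.provider ?? defaults.provider,
    };
}

// Resolved once, then frozen so later code cannot change the pair mid-run.
const frozenPair = Object.freeze(
    resolveRunModel({}, { model: "claude-haiku-4.5" }, { model: "gpt-5.4", provider: "rowboat" }),
);
```

Because the fields resolve separately, a run can take its model from the agent and its provider from the defaults, as in the example call.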
## Runtime
`streamAgent` no longer resolves anything — it reads `state.runModel` /
`state.runProvider`, looks up the provider config, instantiates. Subflows
inherit the parent run's pair, so KG / inline-task subagents run on whatever
the main run resolved to at creation. The `knowledgeGraphAgents` array,
`isKgAgent`, and the per-agent default constants are gone.
KG / inline-task / pre-built agents declare their preferred model in YAML
frontmatter (claude-haiku-4.5 / claude-sonnet-4.6) — used at resolution time
when those agents are themselves the top-level agent of a run (background
triggers, scheduled tasks, etc.).
## Standalone callers
Non-run LLM call sites (summarize_meeting, track/routing, builtin-tools
parseFile) and `agent-schedule/runner` were branching on signed-in
independently. They all route through `getDefaultModelAndProvider` +
`resolveProviderConfig` + `createProvider` now; `agent-schedule/runner`
switched from raw `runsRepo.create` to `runsCore.createRun` so resolution
applies to scheduled-agent runs too.
## UI
`chat-input-with-mentions` stops calling `models:saveConfig`. The dropdown
notifies the parent via `onSelectedModelChange` ({provider, model} as names);
App.tsx stashes selection per-tab and passes it to the next `runs:create`.
When a run already exists, the input fetches it and renders a static label —
model can't change mid-run.
## Legacy runs
A lenient zod schema in `repo.ts` (`StartEvent.extend(...optional)` plus
`RunEvent.or(LegacyStartEvent)`) parses pre-existing runs. `repo.fetch` fills
missing model/provider from current defaults and returns the strict canonical
`Run` type. No file-rewriting migration; no impact on the canonical schema in
`@x/shared`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:26:01 +05:30
    model: z.string(),
    provider: z.string(),
2026-06-03 07:57:50 +05:30
    permissionMode: z.enum(["manual", "auto"]).optional(),
2026-04-28 19:53:40 +05:30
    // useCase/subUseCase tag the run for analytics. Optional on read so legacy
    // run files written before these fields existed still parse cleanly.
    useCase: z.enum([
        "copilot_chat",
feat: live notes — single objective per note replaces multi-track model
Folds the multi-`track:`-array model into one `live:` block per note: a single
persistent objective the live-note agent maintains, plus an optional triggers
object (`cronExpr` / `windows` / `eventMatchCriteria`, each independently
optional). A note is now passive or live — no per-track scopes, no section
ownership contract, no `once` trigger. The agent owns the whole body and makes
patch-style incremental edits per run.
Highlights:
- Schema: `track:` array → single `live:` object (`packages/shared/src/live-note.ts`).
- Runtime: scheduler / event processor / runner under `core/knowledge/live-note/`,
with split `lastAttemptAt` (every run, drives 5-min backoff) vs `lastRunAt`
(success only, anchors cycles). `throwOnError` on agent runs surfaces LLM /
billing failures into `lastRunError`.
- Today.md: regenerated by template v2 (single objective covering overview /
calendar / emails / what-you-missed / priorities; existing files renamed to
`Today.md.bkp.<stamp>`).
- Renderer: `LiveNoteSidebar` mounts inside the editor row (no chat overlap,
auto-closes on note switch); toolbar Radio button becomes a status pill;
`LiveNotesView` replaces background-agents view.
- Copilot: new `live-note` skill with act-first stance, default folder/cadence
pickers, and a non-negotiable rule to extend an existing objective rather
than add a second one. Shared `KNOWLEDGE_NOTE_STYLE_GUIDE` enforces
terse-and-scannable writing across `doc-collab` and the live-note agent.
- Analytics: `track_block` use-case → `live_note_agent`; trigger
(`manual` / `cron` / `window` / `event`) becomes the Pass-2 sub-use-case,
alongside `routing` for Pass 1. Legacy run files with the old value are
read-mapped via `LegacyStartEvent` so they stay openable in the runs list.
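The split between `lastAttemptAt` and `lastRunAt` described in the Runtime bullet can be sketched like this. Everything beyond the three field names and the 5-minute figure is an assumption: epoch-millisecond timestamps, the function names, and clearing `lastRunError` on success are illustrative choices.

```typescript
// Hypothetical sketch of the attempt/success bookkeeping.
const BACKOFF_MS = 5 * 60 * 1000; // 5-minute backoff, per the commit text

interface LiveNoteRunState {
    lastAttemptAt?: number; // updated on every run, success or failure
    lastRunAt?: number;     // updated on success only
    lastRunError?: string;  // set when a run fails
}

// Record the outcome of a run that finished at `now` (epoch milliseconds).
function recordRun(state: LiveNoteRunState, now: number, error?: string): LiveNoteRunState {
    if (error !== undefined) {
        // Failure: bump the attempt time but leave lastRunAt untouched.
        return { ...state, lastAttemptAt: now, lastRunError: error };
    }
    // Success: both timestamps advance and the previous error is dropped.
    return { lastAttemptAt: now, lastRunAt: now };
}

// True while a new attempt should be skipped because one ran too recently.
function inBackoff(state: LiveNoteRunState, now: number): boolean {
    return state.lastAttemptAt !== undefined && now - state.lastAttemptAt < BACKOFF_MS;
}
```

Keeping the two timestamps apart means a failing run still triggers the backoff without moving the anchor used for scheduling cycles.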
Hard cutover — no back-compat shims for legacy `track:` frontmatter arrays.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 00:26:46 +05:30
        "live_note_agent",
feat: background tasks
Adds Background Tasks — recurring background agents the user can set up to
either keep a digest current (daily email summary, top HN stories, weather
brief) or perform a recurring action (draft a reply, post to Slack, call an
API). Each task is a persistent set of instructions plus optional triggers
(schedule, time-of-day window, or matching incoming Gmail / calendar event).
The agent reads the verbs in the instructions on every run and picks the
right mode automatically.
User-facing surfaces:
- New "Background tasks" entry in the sidebar, with a table listing every
task, its schedule, last run, and an active toggle.
- A detail page per task with a max-width reader showing the task's
current output and a control sidebar for editing instructions, triggers,
and reviewing run history.
- "New task" can open in a free-form box where the user describes what they
want and Copilot sets it up end-to-end, or in a structured form for
manual setup.
- "Edit with Copilot" hand-off from the detail view, pre-seeded with the
task's context.
Under the hood:
- The event pipeline that previously powered live-notes is now a generic
consumer registry. Live-notes and background tasks both subscribe;
incoming events are routed to candidates from both concurrently.
- Schedule helpers and the agent-message trigger block are factored out of
live-notes into shared modules. Both features use the same building
blocks now.
- Copilot's proactive routing is reframed: anything recurring (cadence
words, watch / monitor verbs, action verbs, event-conditional asks) now
flows to background tasks. Live-notes load only on explicit mention.
- A small reliability fix for the run-creation fallback chain: an
empty-string model/provider passed by an LLM tool call now correctly
falls through to the default instead of being persisted as a real value.
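The reliability fix in the last bullet follows from how `??` works: it falls through only on `null` or `undefined`, so an empty string counts as a real value. A minimal sketch of one way to handle that, with hypothetical helper names (the commit does not show the actual code):

```typescript
// Hypothetical helper: treat an empty string like a missing value.
function emptyToUndefined(value: string | undefined): string | undefined {
    return value === "" ? undefined : value;
}

// Return the first candidate that is neither undefined nor empty,
// or the fallback when none qualifies.
function firstNonEmpty(fallback: string, ...candidates: (string | undefined)[]): string {
    for (const candidate of candidates) {
        const value = emptyToUndefined(candidate);
        if (value !== undefined) return value;
    }
    return fallback;
}

// Plain `??` keeps the empty string, which is the behaviour being fixed.
const emptyFromToolCall: string | undefined = "";
const naive = emptyFromToolCall ?? "gpt-5.4";
const fixed = firstNonEmpty("gpt-5.4", emptyFromToolCall, undefined);
```

Here `naive` stays an empty string while `fixed` resolves to the default.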
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:43:25 +05:30
        "background_task_agent",
2026-04-28 19:53:40 +05:30
        "meeting_note",
        "knowledge_sync",
    ]).optional(),
    subUseCase: z.string().optional(),
2025-12-29 15:30:57 +05:30
});

export const SpawnSubFlowEvent = BaseRunEvent.extend({
    type: z.literal("spawn-subflow"),
    agentName: z.string(),
    toolCallId: z.string(),
});

export const LlmStreamEvent = BaseRunEvent.extend({
    type: z.literal("llm-stream-event"),
    event: LlmStepStreamEvent,
});

export const MessageEvent = BaseRunEvent.extend({
    type: z.literal("message"),
    messageId: z.string(),
    message: Message,
});

export const ToolInvocationEvent = BaseRunEvent.extend({
    type: z.literal("tool-invocation"),
    toolCallId: z.string().optional(),
    toolName: z.string(),
    input: z.string(),
});

export const ToolResultEvent = BaseRunEvent.extend({
    type: z.literal("tool-result"),
    toolCallId: z.string().optional(),
    toolName: z.string(),
    result: z.any(),
});
2026-05-06 19:03:27 +05:30
export const ToolOutputStreamEvent = BaseRunEvent.extend({
    type: z.literal("tool-output-stream"),
    toolCallId: z.string(),
    toolName: z.string(),
    output: z.string(),
});
2025-12-29 15:30:57 +05:30
export const AskHumanRequestEvent = BaseRunEvent.extend({
    type: z.literal("ask-human-request"),
    toolCallId: z.string(),
    query: z.string(),
Code Mode: in-chat toggle, settings tab, and permission/command UX (#572)
* feat: add in-chat code mode toggle with claude/codex swap
* feat: show agent and add swap-and-retry on acpx permission card
* style: reorder permission card buttons (approve, deny, swap)
* feat: add tooltips to composer plus and web search buttons
* feat: add code mode settings tab with agent install/auth checks
* feat: show sign-in command when agent installed but signed out
* style: refine code-mode permission and command block UX
- Render permission block before the command block
- Collapse permission details after a response; click header to expand
- Drop status icons/badge; use minimal green / bold red blocks
- Auto-collapse the running command block once it completes
* feat: rotating progress labels for code-mode commands; darker tool borders
- Code-mode (acpx) command block shows status-aware labels: rotating
'Working on the task…' phrases (5s each, holding on the last) while
running, then 'Completed the task' / "Couldn't complete the task"
- Darken outer border on all tool blocks in light and dark modes
* fix: detect Claude Code sign-in via macOS Keychain
On macOS, Claude Code stores OAuth credentials in the login Keychain
(service 'Claude Code-credentials'), not in ~/.claude/.credentials.json.
Read the Keychain as a fallback so signed-in Mac users are detected.
* feat: persistent per-chat sessions for code-mode coding agents
- Use a named acpx session (rowboat-<runId>) per chat so follow-up
coding requests resume the same agent and keep context
- Create the session once at chat start (sessions new --name), then
prompt with -s <name>; reuse on follow-ups (no re-create)
- Drop the redundant in-chat 'reply yes' confirmation (the executeCommand
permission card is the confirmation)
- Code-mode output uses plain-text paths (overrides global filepath rule)
- On not-installed/auth errors, point user to Settings -> Code Mode
* fix: code-mode session creation uses idempotent ensure, run sequentially
- Use 'sessions ensure --name' instead of 'sessions new' so reopening a
chat resumes the existing session instead of erroring on a name clash
- Create the session and run the prompt as separate sequential calls so
the permission/command blocks render one at a time (not all at once)
* fix: reliable Claude Code session resume on Windows (avoid claude.cmd EINVAL)
Resuming a code-mode chat after restarting the app spawns a fresh ACP
agent. On Windows + Node >=20.12 the bridge spawning claude.cmd throws
EINVAL, so the session queue owner fails to start. Rowboat injects
CLAUDE_CODE_EXECUTABLE=claude.exe to dodge this, but the override didn't
reliably reach the spawn. Windows-only; no-op on macOS/Linux.
- executeCommand now accepts an env override and the non-abortable
fallback path passes it through (was silently dropped)
- resolveClaudeExeOnWindows also scans known npm/pnpm/volta global bin
dirs, not just PATH (Electron's runtime PATH can omit them)
- add --timeout 600 to acpx prompt commands so a genuine stall fails
cleanly instead of hanging on 'Running' forever
2026-05-28 14:52:09 +05:30
    options: z.array(z.string()).optional(),
2025-12-29 15:30:57 +05:30
});

export const AskHumanResponseEvent = BaseRunEvent.extend({
    type: z.literal("ask-human-response"),
    toolCallId: z.string(),
    response: z.string(),
});
Refactor builtin file tools beyond workspace scope
Replace workspace-scoped builtin file tools with general-purpose file-* tools that accept relative, absolute, and ~/ paths. Relative paths still resolve against the configured workdir.
File operations within the workdir are auto-approved. File operations outside the workdir now emit file permission metadata and require user approval, with support for once, session, and persistent grants.
Add a shared filesystem layer for text-focused read/write/edit/list/search operations, including binary-file safeguards for text reads. parseFile and LLMParse continue to read file buffers for document/image parsing.
Update copilot prompts, background/live-note agents, knowledge workflows, and renderer labels/UI to use the new file-* tool surface and permission details.
Add package-local Vitest setup for @x/core with colocated filesystem unit tests covering path resolution, canonical permission paths, binary detection, read/write/edit behavior, glob, and grep.
2026-05-25 16:21:40 +05:30
export const ToolPermissionMetadata = z.discriminatedUnion("kind", [
    z.object({
        kind: z.literal("command"),
        commandNames: z.array(z.string()),
    }),
    z.object({
        kind: z.literal("file"),
        operation: z.enum(["read", "list", "search", "write", "delete"]),
        paths: z.array(z.string()),
        pathPrefix: z.string(),
    }),
]);
2025-12-29 15:30:57 +05:30
export const ToolPermissionRequestEvent = BaseRunEvent.extend({
    type: z.literal("tool-permission-request"),
    toolCall: ToolCallPart,
2026-05-25 16:21:40 +05:30
    permission: ToolPermissionMetadata.optional(),
2025-12-29 15:30:57 +05:30
});

export const ToolPermissionResponseEvent = BaseRunEvent.extend({
    type: z.literal("tool-permission-response"),
    toolCallId: z.string(),
    response: z.enum(["approve", "deny"]),
2026-02-24 13:00:08 +05:30
    scope: z.enum(["once", "session", "always"]).optional(),
2025-12-29 15:30:57 +05:30
});
2026-06-05 14:45:08 +05:30
// A structured item from a code_agent_run coding turn (tool call, diff, plan,
// message chunk, resolved permission). Fire-and-forget — rendered live.
export const CodeRunStreamEvent = BaseRunEvent.extend({
    type: z.literal("code-run-event"),
    toolCallId: z.string(),
    event: CodeRunEventSchema,
});

// The coding agent is asking for permission mid-turn and the run is BLOCKED until
// the user answers via `codeRun:resolvePermission` (keyed by requestId).
export const CodeRunPermissionRequestEvent = BaseRunEvent.extend({
    type: z.literal("code-run-permission-request"),
    toolCallId: z.string(),
    requestId: z.string(),
    ask: PermissionAsk,
});
2026-06-03 07:57:50 +05:30
export const ToolPermissionAutoDecisionEvent = BaseRunEvent.extend({
    type: z.literal("tool-permission-auto-decision"),
    toolCallId: z.string(),
    toolCall: ToolCallPart,
    permission: ToolPermissionMetadata.optional(),
    decision: z.enum(["allow", "deny"]),
    reason: z.string(),
});
2025-12-29 15:30:57 +05:30
export const RunErrorEvent = BaseRunEvent.extend({
    type: z.literal("error"),
    error: z.string(),
});
2026-01-30 06:53:50 +05:30
export const RunStoppedEvent = BaseRunEvent.extend({
    type: z.literal("run-stopped"),
    reason: z.enum(["user-requested", "force-stopped"]).optional(),
});
2025-12-29 15:30:57 +05:30
export const RunEvent = z.union([
    RunProcessingStartEvent,
    RunProcessingEndEvent,
    StartEvent,
    SpawnSubFlowEvent,
    LlmStreamEvent,
    MessageEvent,
    ToolInvocationEvent,
    ToolResultEvent,
2026-05-06 19:03:27 +05:30
    ToolOutputStreamEvent,
2025-12-29 15:30:57 +05:30
    AskHumanRequestEvent,
    AskHumanResponseEvent,
    ToolPermissionRequestEvent,
    ToolPermissionResponseEvent,
2026-06-05 14:45:08 +05:30
    CodeRunStreamEvent,
    CodeRunPermissionRequestEvent,
2026-06-03 07:57:50 +05:30
    ToolPermissionAutoDecisionEvent,
2025-12-29 15:30:57 +05:30
    RunErrorEvent,
2026-01-30 06:53:50 +05:30
    RunStoppedEvent,
2025-12-29 15:30:57 +05:30
]);
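Because every variant in the union carries a distinct `type` literal, a consumer can narrow on that field. The sketch below uses hand-written types that mirror only three of the variants; in the real code the types would presumably be inferred from the zod schemas. The names `SketchRunEvent` and `describeEvent`, and the trimmed field sets, are illustrative assumptions.

```typescript
// Hypothetical, trimmed-down mirrors of three RunEvent variants.
type SketchErrorEvent = { type: "error"; error: string };
type SketchStoppedEvent = {
    type: "run-stopped";
    reason?: "user-requested" | "force-stopped";
};
type SketchPermissionRequest = {
    type: "code-run-permission-request";
    toolCallId: string;
    requestId: string;
};
type SketchRunEvent = SketchErrorEvent | SketchStoppedEvent | SketchPermissionRequest;

// Narrow on the `type` discriminant; each branch sees only its own fields.
function describeEvent(event: SketchRunEvent): string {
    switch (event.type) {
        case "error":
            return `run failed: ${event.error}`;
        case "run-stopped":
            return `run stopped (${event.reason ?? "no reason given"})`;
        case "code-run-permission-request":
            return `waiting on permission ${event.requestId}`;
    }
}
```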
export const ToolPermissionAuthorizePayload = ToolPermissionResponseEvent.pick({
    subflow: true,
    toolCallId: true,
    response: true,
2026-02-24 13:00:08 +05:30
    scope: true,
2025-12-29 15:30:57 +05:30
});

export const AskHumanResponsePayload = AskHumanResponseEvent.pick({
    subflow: true,
    toolCallId: true,
    response: true,
});
2026-04-28 19:53:40 +05:30
export const UseCase = z.enum([
    "copilot_chat",
feat: live notes — single objective per note replaces multi-track model
Folds the multi-`track:`-array model into one `live:` block per note: a single
persistent objective the live-note agent maintains, plus an optional triggers
object (`cronExpr` / `windows` / `eventMatchCriteria`, each independently
optional). A note is now passive or live — no per-track scopes, no section
ownership contract, no `once` trigger. The agent owns the whole body and makes
patch-style incremental edits per run.
Highlights:
- Schema: `track:` array → single `live:` object (`packages/shared/src/live-note.ts`).
- Runtime: scheduler / event processor / runner under `core/knowledge/live-note/`,
with split `lastAttemptAt` (every run, drives 5-min backoff) vs `lastRunAt`
(success only, anchors cycles). `throwOnError` on agent runs surfaces LLM /
billing failures into `lastRunError`.
- Today.md: regenerated by template v2 (single objective covering overview /
calendar / emails / what-you-missed / priorities; existing files renamed to
`Today.md.bkp.<stamp>`).
- Renderer: `LiveNoteSidebar` mounts inside the editor row (no chat overlap,
auto-closes on note switch); toolbar Radio button becomes a status pill;
`LiveNotesView` replaces background-agents view.
- Copilot: new `live-note` skill with act-first stance, default folder/cadence
pickers, and a non-negotiable rule to extend an existing objective rather
than add a second one. Shared `KNOWLEDGE_NOTE_STYLE_GUIDE` enforces
terse-and-scannable writing across `doc-collab` and the live-note agent.
- Analytics: `track_block` use-case → `live_note_agent`; trigger
(`manual` / `cron` / `window` / `event`) becomes the Pass-2 sub-use-case,
alongside `routing` for Pass 1. Legacy run files with the old value are
read-mapped via `LegacyStartEvent` so they stay openable in the runs list.
Hard cutover — no back-compat shims for legacy `track:` frontmatter arrays.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
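The split between `lastAttemptAt` and `lastRunAt` described above can be sketched as follows. This is an illustration under assumptions, not the scheduler's actual code: the helper names `inBackoff` and `recordAttempt`, the `LiveNoteRunState` shape, and the use of epoch milliseconds are hypothetical. Only the 5-minute backoff and the meaning of the two fields come from the commit message.

```typescript
// Hypothetical state shape; timestamps are assumed to be epoch milliseconds.
interface LiveNoteRunState {
    lastAttemptAt?: number; // most recent run, success or failure
    lastRunAt?: number;     // most recent successful run
}

const BACKOFF_MS = 5 * 60 * 1000; // 5-minute backoff from the commit message

// True while the note should not be retried yet.
function inBackoff(state: LiveNoteRunState, now: number): boolean {
    return state.lastAttemptAt !== undefined && now - state.lastAttemptAt < BACKOFF_MS;
}

// Every run updates lastAttemptAt; only a successful run moves lastRunAt.
function recordAttempt(
    state: LiveNoteRunState,
    now: number,
    succeeded: boolean,
): LiveNoteRunState {
    return {
        lastAttemptAt: now,
        lastRunAt: succeeded ? now : state.lastRunAt,
    };
}
```

Keeping the two timestamps separate means a failing note is throttled by its attempts without a failure being mistaken for the start of a new cycle.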
2026-05-09 00:26:46 +05:30
    "live_note_agent",
feat: background tasks
Adds Background Tasks — recurring background agents the user can set up to
either keep a digest current (daily email summary, top HN stories, weather
brief) or perform a recurring action (draft a reply, post to Slack, call an
API). Each task is a persistent set of instructions plus optional triggers
(schedule, time-of-day window, or matching incoming Gmail / calendar event).
The agent reads the verbs in the instructions on every run and picks the
right mode automatically.
User-facing surfaces:
- New "Background tasks" entry in the sidebar, with a table listing every
task, its schedule, last run, and an active toggle.
- A detail page per task with a max-width reader showing the task's
current output and a control sidebar for editing instructions, triggers,
and reviewing run history.
- "New task" can open in a free-form box where the user describes what they
want and Copilot sets it up end-to-end, or in a structured form for
manual setup.
- "Edit with Copilot" hand-off from the detail view, pre-seeded with the
task's context.
Under the hood:
- The event pipeline that previously powered live-notes is now a generic
consumer registry. Live-notes and background tasks both subscribe;
incoming events are routed to candidates from both concurrently.
- Schedule helpers and the agent-message trigger block are factored out of
live-notes into shared modules. Both features use the same building
blocks now.
- Copilot's proactive routing is reframed: anything recurring (cadence
words, watch / monitor verbs, action verbs, event-conditional asks) now
flows to background tasks. Live-notes load only on explicit mention.
- A small reliability fix for the run-creation fallback chain: an
empty-string model/provider passed by an LLM tool call now correctly
falls through to the default instead of being persisted as a real value.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
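The empty-string fix in the last bullet above matters because the nullish-coalescing operator only skips `null` and `undefined`, so an empty string would otherwise be kept as a real value. One way to get the described fall-through is a small helper like the one below. It is a sketch only: `firstNonEmpty` is a hypothetical name and the actual fix may be implemented differently.

```typescript
// Hypothetical helper: treats "" the same as "not provided".
function firstNonEmpty(...values: Array<string | undefined>): string | undefined {
    for (const value of values) {
        if (value !== undefined && value !== "") return value;
    }
    return undefined;
}
```

For example, `firstNonEmpty("", undefined, "fallback-model")` skips the empty string and the missing value and yields the last argument (`"fallback-model"` is a placeholder, not a real model name).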
2026-05-12 17:43:25 +05:30
    "background_task_agent",
2026-04-28 19:53:40 +05:30
    "meeting_note",
    "knowledge_sync",
]);
2025-12-29 15:30:57 +05:30
export const Run = z.object({
    id: z.string(),
2026-01-20 16:36:36 +05:30
    title: z.string().optional(),
2025-12-29 15:30:57 +05:30
    createdAt: z.iso.datetime(),
    agentId: z.string(),
freeze model + provider per run at creation time
The model dropdown was broken in two ways: it wrote to ~/.rowboat/config/models.json
(the BYOK creds file, stamped with a fake `flavor: 'openrouter'` to satisfy zod
when signed in), and the runtime ignored that write entirely for signed-in users
because `streamAgent` hard-coded `gpt-5.4`. Model selection was also globally
scoped, so every chat shared one brain.
This change moves model + provider out of the global config and onto the run
itself, resolved once at runs:create and frozen for the run's lifetime.
## Resolution
`runsCore.createRun` resolves per-field, falling through:
run.model = opts.model ?? agent.model ?? defaults.model
run.provider = opts.provider ?? agent.provider ?? defaults.provider
A new `core/models/defaults.ts` is the only place in the codebase that branches
on signed-in state. `getDefaultModelAndProvider()` returns name strings;
`resolveProviderConfig(name)` does the name → full LlmProvider lookup at
runtime. `createProvider` learns about `flavor: 'rowboat'` so the gateway is
just another flavor.
`provider` is stored as a name (e.g. `"rowboat"`, `"openai"`), not a full
LlmProvider object. API keys never get written into the JSONL log; rotating a
key in models.json applies to existing runs without re-creation. Cost: deleting
a provider from settings breaks runs that referenced it (clear error surfaced
via `resolveProviderConfig`).
## Runtime
`streamAgent` no longer resolves anything — it reads `state.runModel` /
`state.runProvider`, looks up the provider config, instantiates. Subflows
inherit the parent run's pair, so KG / inline-task subagents run on whatever
the main run resolved to at creation. The `knowledgeGraphAgents` array,
`isKgAgent`, and the per-agent default constants are gone.
KG / inline-task / pre-built agents declare their preferred model in YAML
frontmatter (claude-haiku-4.5 / claude-sonnet-4.6) — used at resolution time
when those agents are themselves the top-level agent of a run (background
triggers, scheduled tasks, etc.).
## Standalone callers
Non-run LLM call sites (summarize_meeting, track/routing, builtin-tools
parseFile) and `agent-schedule/runner` were branching on signed-in
independently. They all route through `getDefaultModelAndProvider` +
`resolveProviderConfig` + `createProvider` now; `agent-schedule/runner`
switched from raw `runsRepo.create` to `runsCore.createRun` so resolution
applies to scheduled-agent runs too.
## UI
`chat-input-with-mentions` stops calling `models:saveConfig`. The dropdown
notifies the parent via `onSelectedModelChange` ({provider, model} as names);
App.tsx stashes selection per-tab and passes it to the next `runs:create`.
When a run already exists, the input fetches it and renders a static label —
model can't change mid-run.
## Legacy runs
A lenient zod schema in `repo.ts` (`StartEvent.extend(...optional)` plus
`RunEvent.or(LegacyStartEvent)`) parses pre-existing runs. `repo.fetch` fills
missing model/provider from current defaults and returns the strict canonical
`Run` type. No file-rewriting migration; no impact on the canonical schema in
`@x/shared`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
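The per-field fall-through from the "Resolution" section above can be written out as a small pure function. This is a sketch under assumptions: `resolveRunModel` and the two interfaces are hypothetical names, and the real `runsCore.createRun` presumably does more than this.

```typescript
// Hypothetical shapes: each layer may or may not specify a field.
interface ModelChoice {
    model?: string;
    provider?: string;
}
interface ResolvedChoice {
    model: string;
    provider: string;
}

// Resolve each field independently: run options, then agent, then defaults.
function resolveRunModel(
    opts: ModelChoice,
    agent: ModelChoice,
    defaults: ResolvedChoice,
): ResolvedChoice {
    return {
        model: opts.model ?? agent.model ?? defaults.model,
        provider: opts.provider ?? agent.provider ?? defaults.provider,
    };
}
```

Because the two fields resolve independently, a run can take its model from the caller and its provider from the agent, as in a call like `resolveRunModel({ model: "a" }, { provider: "p" }, defaults)`.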
2026-04-22 12:26:01 +05:30
    model: z.string(),
    provider: z.string(),
2026-06-03 07:57:50 +05:30
    permissionMode: z.enum(["manual", "auto"]).optional(),
2026-04-28 19:53:40 +05:30
    useCase: UseCase.optional(),
    subUseCase: z.string().optional(),
2025-12-29 15:30:57 +05:30
    log: z.array(RunEvent),
});

export const ListRunsResponse = z.object({
    runs: z.array(Run.pick({
        id: true,
2026-01-20 16:36:36 +05:30
        title: true,
2025-12-29 15:30:57 +05:30
        createdAt: true,
        agentId: true,
    })),
    nextCursor: z.string().optional(),
});
2026-04-22 12:26:01 +05:30
export const CreateRunOptions = z.object({
    agentId: z.string(),
    model: z.string().optional(),
    provider: z.string().optional(),
2026-06-03 07:57:50 +05:30
    permissionMode: z.enum(["manual", "auto"]).optional(),
2026-04-28 19:53:40 +05:30
    useCase: UseCase.optional(),
    subUseCase: z.string().optional(),
2026-04-22 12:26:01 +05:30
});