Merge pull request #286 from rowboatlabs/cli

rowboatx
2026-07-12 21:02:17 +02:00 · 2025-11-18 22:13:00 +05:30 · 2025-11-18 22:13:00 +05:30 · 51c7dc3f9d
commit 51c7dc3f9d
parent 476654af80 da5f64e938
35 changed files with 5078 additions and 30 deletions
--- a/README.md
+++ b/README.md
@ -1,6 +1,6 @@
 ![ui](/assets/banner.png)

-<h2 align="center">Let AI build you coworkers</h2>
+<h2 align="center">RowboatX - CLI Tool for Background Agents</h2>
 <h5 align="center">

 <p align="center" style="display: flex; justify-content: center; gap: 20px; align-items: center;">
@ -35,51 +35,59 @@


 </h5>
-<p align="center">
-⚡ Build AI agents instantly with natural language | 🔌 Connect tools with one-click integrations | 📂 Power with knowledge by adding documents for RAG | 🔄 Automate workflows by setting up triggers and actions | 🚀 Deploy anywhere via API or SDK<br><br>
-☁️ Prefer a hosted version? Use our <b><a href="https://rowboatlabs.com">cloud</a></b> to starting building agents right away!
-</p>

+- ✨ **Create background agents with full shell access**
+   - E.g. "Generate a NotebookLM-style podcast from my saved articles every morning"
+- 🔧 **Connect any MCP server to add capabilities**
+   - Add MCP servers and RowboatX handles the integration
+- 🎯 **Let RowboatX control and monitor your background agents**
+   - Easily inspect state on the filesystem 
+
+Inspired by Claude Code, RowboatX brings the same shell-native power to background automations.

 ## Quick start
-1. Set your OpenAI key
+1. Set your LLM API key. Supports OpenAI, Anthropic, Gemini, OpenRouter, LiteLLM, Ollama, and more.
   ```bash
   export OPENAI_API_KEY=your-openai-api-key  
   ```
      
-2. Clone the repository and start Rowboat (requires Docker)
+2. Install RowboatX
   ```bash
-   git clone git@github.com:rowboatlabs/rowboat.git
-   cd rowboat
-   ./start.sh
+   npx @rowboatlabs/rowboatx
   ```

-3. Access the app at [http://localhost:3000](http://localhost:3000).
-
-To add tools, RAG, more LLMs, and  triggers checkout the [Advanced](#advanced) section below.
-
 ## Demos
 #### Meeting-prep assistant
 Chat with the copilot to build a meeting-prep workflow, then add a calendar invite as a trigger. Watch the full demo [here](https://youtu.be/KZTP4xZM2DY).
 [![meeting-prep](https://github.com/user-attachments/assets/27755ef5-6549-476f-b9c0-50bef8770384)](https://youtu.be/KZTP4xZM2DY)

-#### Customer support assistant
-Chat with the copilot to build a customer support assistant, then connect your MCP server, and data for RAG. Watch the full demo [here](https://youtu.be/Xfo-OfgOl8w).
-[![output](https://github.com/user-attachments/assets/97485fd7-64c3-4d60-a627-f756a89dee64)](https://youtu.be/Xfo-OfgOl8w)
+## Examples
+### Add and Manage MCP servers 
+`$ rowboatx`
+- Add MCP: 'Add this MCP server config: \<config\> '
+- Explore tools: 'What tools are there in \<server-name\> '

-#### Personal assistant
-Chat with the copilot to build a personal assistant. Watch the full demo [here](https://youtu.be/6r7P4Vlcn2g).
-[![personal-assistant](https://github.com/user-attachments/assets/0f1c0ffd-23ba-4b49-8bfb-ec7a846f1332)](https://youtu.be/6r7P4Vlcn2g)
+### Create background agents
+`$ rowboatx`
+- 'Create agent to do X.'
+- '... Attach the correct tools from \<mcp-server-name\> to the agent'
+- '... Allow the agent to run shell commands including ffmpeg'

-## Advanced
-1. Native RAG Support: Enable file uploads and URL scraping with Rowboat's built-in RAG capabilities – see [RAG Guide](https://docs.rowboatlabs.com/docs/using-rowboat/rag).
+### Schedule and monitor agents
+`$ rowboatx`
+- 'Make agent \<background-agent-name\> run every day at 10 AM' 
+- 'What agents do I have scheduled to run and at what times'
+- 'When was \<background-agent-name\> last run'
+- 'Are any agents waiting for my input or confirmation'

-2. Custom LLM Providers: Use any LLM provider, including aggregators like OpenRouter and LiteLLM - see [Using more LLM providers](https://docs.rowboatlabs.com/docs/using-rowboat/customise/custom-llms).
+### Run background agents manually
+``` bash
+rowboatx --agent=<agent-name> --input="xyz" --no-interactive=true
+```
+```bash    
+rowboatx --agent=<agent-name> <run_id> # resume from a previous run
+```
+     
+## Rowboat Classic UI

-3. Tools & Triggers: Add tools and event triggers (e.g., Gmail, Slack) for automation – see [Tools](https://docs.rowboatlabs.com/docs/using-rowboat/tools) & [Triggers](https://docs.rowboatlabs.com/docs/using-rowboat/triggers).
-
-4. API & SDK: Integrate Rowboat agents directly into your app – see [API](https://docs.rowboatlabs.com/docs/api-sdk/using_the_api) & [SDK](https://docs.rowboatlabs.com/docs/api-sdk/using_the_sdk) docs.
-
-##
-
-Refer to [Docs](https://docs.rowboatlabs.com/) to learn how to start building agents with Rowboat.
+To use Rowboat Classic UI (not RowboatX), refer to [Classic](https://docs.rowboatlabs.com/). 
--- a/apps/cli/.gitignore
+++ b/apps/cli/.gitignore
@ -0,0 +1,2 @@
+node_modules/
+dist/
--- a/apps/cli/bin/app.js
+++ b/apps/cli/bin/app.js
@ -0,0 +1,55 @@
+#!/usr/bin/env node
+import yargs from 'yargs';
+import { hideBin } from 'yargs/helpers';
+import { app } from '../dist/app.js';
+
+yargs(hideBin(process.argv))
+
+    .command(
+        "$0",
+        "Run rowboatx",
+        (y) => y
+            .option("agent", {
+                type: "string",
+                description: "The agent to run",
+                default: "copilot",
+            })
+            .option("run_id", {
+                type: "string",
+                description: "Continue an existing run",
+            })
+            .option("input", {
+                type: "string",
+                description: "The input to the agent",
+            })
+            .option("no-interactive", {
+                type: "boolean",
+                description: "Do not interact with the user",
+                default: false,
+            }),
+        (argv) => {
+            app({
+                agent: argv.agent,
+                runId: argv.run_id,
+                input: argv.input,
+                noInteractive: argv.noInteractive,
+            });
+        }
+    )
+    .command(
+        "update-state <agent> <run_id>",
+        "Update state for a run",
+        (y) => y
+            .positional("agent", {
+                type: "string",
+                description: "The agent to run",
+            })
+            .positional("run_id", {
+                type: "string",
+                description: "The run id to update",
+            }),
+        (argv) => {
+            updateState(argv.agent, argv.run_id);
+        }
+    )
+    .parse();
--- a/apps/cli/examples/notebooklm-podcast.json
+++ b/apps/cli/examples/notebooklm-podcast.json
@ -0,0 +1,128 @@
+{
+    "name": "podcast",
+    "description": "A workflow to create a podcast",
+    "steps": [
+        {
+            "type": "agent",
+            "id": "arxiv-feed-reader"
+        },
+        {
+            "type": "agent",
+            "id": "summarise-a-few"
+        },
+        {
+            "type": "agent",
+            "id": "podcast_transcript_agent"
+        },
+        {
+            "type": "agent",
+            "id": "elevenlabs_audio_gen"
+        }
+    ]
+}
+
+{
+    "name": "summariser_workflow",
+    "description": "A workflow to summarise an arxiv paper",
+    "steps": [
+        {
+            "type": "agent",
+            "id": "summariser_agent"
+        }
+    ]
+}
+
+{
+    "name": "summariser_agent",
+    "description": "An agent that will summarise an arxiv paper",
+    "model": "gpt-4.1",
+    "instructions": "Your job is to download and summarise an arxiv paper. Use a command like this to do it\n\n curl -L -o paper.pdf https://arxiv.org/pdf/2511.02997 (use the url that the user provides you). Important, just put out the GIST of the paper in two lines. Dont ask a human for inputs - do what you think is best.",
+    "tools": {
+        "bash": {
+            "type": "builtin",
+            "name": "executeCommand"
+        }
+    }
+}
+
+{
+    "name": "arxiv-feed-reader",
+    "description": "A feed reader for the arXiv",
+    "model": "gpt-4.1",
+    "instructions": "Your job is to extract the latest papers from the arXiv feed and summarise them. Use an example curl command like the following to get this done:\n\n! curl -s https://rss.arxiv.org/rss/cs.AI \\\n| yq -p=xml -o=json \\\n| jq -r '.rss.channel.item[] | select(.title | test(\"agent\"; \"i\")) | \"\\(.title)\\n\\(.link)\\n\\(.description)\\n\"' \n\nThis will give you a list of papers that contain the word \"agent\" in the title. You can then summarise these papers using the summariser agent.",
+    "tools": {
+        "bash": {
+            "type": "builtin",
+            "name": "executeCommand"
+        }
+    }
+}
+
+{
+    "name": "summarise-a-few",
+    "description": "An agent that will summarise a few arxiv papers",
+    "model": "gpt-4.1",
+    "instructions": "Your job is to pick 2 interesting papers and related papers on the same topic, and then summarise each of them inidivually using the right tool calls. Make sure to pass in the URL of the paper to the summaurse tool. Don't ask for human input.",
+    "tools": {
+        "summariser": {
+            "type": "agent",
+            "name": "summariser_workflow"
+        }
+    }
+}
+
+{
+    "name": "podcast_transcript_agent",
+    "description": "An agent that will generate a transcript of a podcast",
+    "model": "gpt-4.1",
+    "instructions": "You job is to create a NotebookLM style 1 minute podcast between 2 speakers John and Chloe. Each line should be a new speaker. The podcast should be about the contents of the two papers (that were selected). You can use [sighs], [inhales then exhales], [chuckles], [laughs], [clears throat], [coughs], [sniffs], [pauses] etc. to make the podcast more natural."
+}
+
+{
+    "name": "elevenlabs_audio_gen",
+    "description": "An agent that will generate an audio file from a text",
+    "model": "gpt-4.1",
+    "instructions": "Your job is to take the mutli speaker transcript and generate an audio file from it. Use the elevenlabs text to speech tool to do this. For each speaker turn, you should generate an audio file and then combine them all into a single audio file. Use the voice_name 'Liam' for John and 'Cassidy' for Chloe. Make sure to remove the speaker names from the text before generating the audio files. Use the bash tool to look for the generated audio files and also combine the audio files into a single final podcast audio file. Use 'eleven_v3' for the model_id.",
+    "tools": {
+        "text_to_speech": {
+            "type": "mcp",
+            "name": "text_to_speech",
+            "description": "Generate an audio file from a text",
+            "mcpServerName": "elevenLabs",
+            "inputSchema": {
+                "type": "object",
+                "properties": {
+                    "text": {
+                        "type": "string", 
+                        "description": "The text to generate an audio file from"
+                    }, 
+                    "voice_name": {
+                        "type": "string",
+                        "description": "The voice name to use for the audio file"
+                    }, 
+                    "model_id": {
+                        "type": "string",
+                        "description": "The model id to use for the audio file"
+                    }
+                }
+            }
+        }, 
+        "bash": {
+            "type": "builtin",
+            "name": "executeCommand"
+        }
+    }
+}
+
+{
+  "mcpServers": {
+
+    "elevenLabs": {
+      "command": "uvx",
+      "args": ["elevenlabs-mcp"],
+      "env": {
+        "ELEVENLABS_API_KEY": "sk_42ee2a0a19266552c18b0920b593e22f0185d4b1435b65ed"
+      }
+    }
+  }
+}
--- a/apps/cli/package-lock.json
+++ b/apps/cli/package-lock.json
--- a/apps/cli/package.json
+++ b/apps/cli/package.json
@ -0,0 +1,38 @@
+{
+  "name": "@rowboatlabs/rowboatx",
+  "version": "0.6.0",
+  "main": "index.js",
+  "type": "module",
+  "scripts": {
+    "test": "echo \"Error: no test specified\" && exit 1",
+    "build": "rm -rf dist && tsc",
+    "copilot": "npm run build && node dist/x.js"
+  },
+  "files": [
+    "dist",
+    "bin"
+  ],
+  "bin": {
+    "rowboatx": "bin/app.js"
+  },
+  "keywords": [],
+  "author": "Rowboat Labs",
+  "license": "Apache-2.0",
+  "description": "",
+  "devDependencies": {
+    "@types/node": "^24.9.1",
+    "ts-node": "^10.9.2",
+    "typescript": "^5.9.3"
+  },
+  "dependencies": {
+    "@ai-sdk/anthropic": "^2.0.44",
+    "@ai-sdk/google": "^2.0.25",
+    "@ai-sdk/openai": "^2.0.53",
+    "@modelcontextprotocol/sdk": "^1.20.2",
+    "ai": "^5.0.78",
+    "json-schema-to-zod": "^2.6.1",
+    "nanoid": "^5.1.6",
+    "yargs": "^18.0.0",
+    "zod": "^4.1.12"
+  }
+}
--- a/apps/cli/src/app.ts
+++ b/apps/cli/src/app.ts
@ -0,0 +1,183 @@
+import { AgentState, streamAgent } from "./application/lib/agent.js";
+import { StreamRenderer } from "./application/lib/stream-renderer.js";
+import { stdin as input, stdout as output } from "node:process";
+import fs from "fs";
+import path from "path";
+import { WorkDir } from "./application/config/config.js";
+import { RunEvent } from "./application/entities/run-events.js";
+import { createInterface, Interface } from "node:readline/promises";
+import { ToolCallPart } from "./application/entities/message.js";
+import { z } from "zod";
+
+export async function updateState(agent: string, runId: string) {
+    const state = new AgentState(agent, runId);
+    // If running in a TTY, read run events from stdin line-by-line
+    if (!input.isTTY) {
+        return;
+    }
+
+    const rl = createInterface({ input, crlfDelay: Infinity });
+    try {
+        for await (const line of rl) {
+            if (line.trim() === "") {
+                continue;
+            }
+            const event = RunEvent.parse(JSON.parse(line));
+            state.ingestAndLog(event);
+        }
+    } finally {
+        rl.close();
+    }
+}
+
+export async function app(opts: {
+    agent: string;
+    runId?: string;
+    input?: string;
+    noInteractive?: boolean;
+}) {
+    const renderer = new StreamRenderer();
+    const state = new AgentState(opts.agent, opts.runId);
+
+    // load existing and assemble state if required
+    let runId = opts.runId;
+    if (runId) {
+        console.error("loading run", runId);
+        let stream: fs.ReadStream | null = null;
+        let rl: Interface | null = null;
+        try {
+            const logFile = path.join(WorkDir, "runs", `${runId}.jsonl`);
+            stream = fs.createReadStream(logFile, { encoding: "utf8" });
+            rl = createInterface({ input: stream, crlfDelay: Infinity });
+            for await (const line of rl) {
+                if (line.trim() === "") {
+                    continue;
+                }
+                const parsed = JSON.parse(line);
+                const event = RunEvent.parse(parsed);
+                state.ingest(event);
+            }
+        } finally {
+            stream?.close();
+        }
+    }
+
+    let rl: Interface | null = null;
+    if (!opts.noInteractive) {
+        rl = createInterface({ input, output });
+    }
+    let inputConsumed = false;
+
+    try {
+        while (true) {
+            // ask for pending tool permissions
+            for (const perm of Object.values(state.getPendingPermissions())) {
+                if (opts.noInteractive) {
+                    return;
+                }
+                const response = await getToolCallPermission(perm.toolCall, rl!);
+                state.ingestAndLog({
+                    type: "tool-permission-response",
+                    response,
+                    toolCallId: perm.toolCall.toolCallId,
+                    subflow: perm.subflow,
+                });
+            }
+
+            // ask for pending human input
+            for (const ask of Object.values(state.getPendingAskHumans())) {
+                if (opts.noInteractive) {
+                    return;
+                }
+                const response = await getAskHumanResponse(ask.query, rl!);
+                state.ingestAndLog({
+                    type: "ask-human-response",
+                    response,
+                    toolCallId: ask.toolCallId,
+                    subflow: ask.subflow,
+                });
+            }
+
+            // run one turn
+            for await (const event of streamAgent(state)) {
+                renderer.render(event);
+                if (event?.type === "error") {
+                    process.exitCode = 1;
+                }
+            }
+
+            // if nothing pending, get user input
+            if (state.getPendingPermissions().length === 0 && state.getPendingAskHumans().length === 0) {
+                if (opts.input && !inputConsumed) {
+                    state.ingestAndLog({
+                        type: "message",
+                        message: {
+                            role: "user",
+                            content: opts.input,
+                        },
+                        subflow: [],
+                    });
+                    inputConsumed = true;
+                    continue;
+                }
+                if (opts.noInteractive) {
+                    return;
+                }
+                const response = await getUserInput(rl!);
+                state.ingestAndLog({
+                    type: "message",
+                    message: {
+                        role: "user",
+                        content: response,
+                    },
+                    subflow: [],
+                });
+            }
+        }
+    } finally {
+        rl?.close();
+    }
+}
+
+async function getToolCallPermission(
+    call: z.infer<typeof ToolCallPart>,
+    rl: Interface,
+): Promise<"approve" | "deny"> {
+    const question = `Do you want to allow running the following tool: ${call.toolName}?:
+    
+    Tool name: ${call.toolName}
+    Tool arguments: ${JSON.stringify(call.arguments)}
+
+    Choices: y/n/a/d:
+    - y: approve
+    - n: deny
+    `;
+    const input = await rl.question(question);
+    if (input.toLowerCase() === "y") return "approve";
+    if (input.toLowerCase() === "n") return "deny";
+    return "deny";
+}
+
+async function getAskHumanResponse(
+    query: string,
+    rl: Interface,
+): Promise<string> {
+    const input = await rl.question(`The agent is asking for your help with the following query:
+    
+    Question: ${query}
+
+    Please respond to the question.
+    `);
+    return input;
+}
+
+async function getUserInput(
+    rl: Interface,
+): Promise<string> {
+    const input = await rl.question("You: ");
+    if (["quit", "exit", "q"].includes(input.toLowerCase().trim())) {
+        console.error("Bye!");
+        process.exit(0);
+    }
+    return input;
+}
--- a/apps/cli/src/application/assistant/agent.ts
+++ b/apps/cli/src/application/assistant/agent.ts
@ -0,0 +1,20 @@
+import { Agent, ToolAttachment } from "../entities/agent.js";
+import z from "zod";
+import { CopilotInstructions } from "./instructions.js";
+import { BuiltinTools } from "../lib/builtin-tools.js";
+
+const tools: Record<string, z.infer<typeof ToolAttachment>> = {};
+for (const [name, tool] of Object.entries(BuiltinTools)) {
+    tools[name] = {
+        type: "builtin",
+        name,
+    };
+}
+
+export const CopilotAgent: z.infer<typeof Agent> = {
+    name: "rowboatx",
+    description: "Rowboatx copilot",
+    instructions: CopilotInstructions,
+    model: "gpt-5.1",
+    tools,
+}
--- a/apps/cli/src/application/assistant/instructions.ts
+++ b/apps/cli/src/application/assistant/instructions.ts
@ -0,0 +1,44 @@
+import { skillCatalog } from "./skills/index.js";
+import { WorkDir as BASE_DIR } from "../config/config.js";
+
+export const CopilotInstructions = `You are an intelligent workflow assistant helping users manage their workflows in ${BASE_DIR}
+
+Use the catalog below to decide which skills to load for each user request. Before acting:
+- Call the \`loadSkill\` tool with the skill's name or path so you can read its guidance string.
+- Apply the instructions from every loaded skill while working on the request.
+
+${skillCatalog}
+
+Always consult this catalog first so you load the right skills before taking action.
+
+# Communication & Execution Style
+
+## Communication principles
+- Be concise and direct. Avoid verbose explanations unless the user asks for details.
+- Only show JSON output when explicitly requested by the user. Otherwise, summarize results in plain language.
+- Break complex efforts into clear, sequential steps the user can follow.
+- Explain reasoning briefly as you work, and confirm outcomes before moving on.
+- Be proactive about understanding missing context; ask clarifying questions when needed.
+- Summarize completed work and suggest logical next steps at the end of a task.
+- Always ask for confirmation before taking destructive actions.
+
+## Execution reminders
+- Explore existing files and structure before creating new assets.
+- Use relative paths (no \${BASE_DIR} prefixes) when running commands or referencing files.
+- Keep user data safe—double-check before editing or deleting important resources.
+
+## Builtin Tools vs Shell Commands
+
+**IMPORTANT**: Rowboat provides builtin tools that are internal and do NOT require security allowlist entries:
+- \`deleteFile\`, \`createFile\`, \`updateFile\`, \`readFile\` - File operations
+- \`listFiles\`, \`exploreDirectory\` - Directory exploration
+- \`analyzeAgent\` - Agent analysis
+- \`listMcpServers\`, \`listMcpTools\` - MCP server management
+- \`loadSkill\` - Skill loading
+
+These tools work directly and are NOT filtered by \`.rowboat/config/security.json\`.
+
+**Only \`executeCommand\` (shell/bash commands) is filtered** by the security allowlist. If you need to delete a file, use the \`deleteFile\` builtin tool, not \`executeCommand\` with \`rm\`. If you need to create a file, use \`createFile\`, not \`executeCommand\` with \`touch\` or \`echo >\`.
+
+The security allowlist in \`security.json\` only applies to shell commands executed via \`executeCommand\`, not to Rowboat's internal builtin tools.
+`;
--- a/apps/cli/src/application/assistant/skills/builtin-tools/skill.ts
+++ b/apps/cli/src/application/assistant/skills/builtin-tools/skill.ts
@ -0,0 +1,180 @@
+export const skill = String.raw`
+# Builtin Tools Reference
+
+Load this skill when creating or modifying agents that need access to Rowboat's builtin tools (shell execution, file operations, etc.).
+
+## Available Builtin Tools
+
+Agents can use builtin tools by declaring them in the \`"tools"\` object with \`"type": "builtin"\` and the appropriate \`"name"\`.
+
+### executeCommand
+**The most powerful and versatile builtin tool** - Execute any bash/shell command and get the output.
+
+**Security note:** Commands are filtered through \`.rowboat/config/security.json\`. Populate this file with allowed command names (array or dictionary entries). Any command not present is blocked and returns exit code 126 so the agent knows it violated the policy.
+
+**Agent tool declaration:**
+\`\`\`json
+"tools": {
+  "bash": {
+    "type": "builtin",
+    "name": "executeCommand"
+  }
+}
+\`\`\`
+
+**What it can do:**
+- Run package managers (npm, pip, apt, brew, cargo, go get, etc.)
+- Git operations (clone, commit, push, pull, status, diff, log, etc.)
+- System operations (ps, top, df, du, find, grep, kill, etc.)
+- Build and compilation (make, cargo build, go build, npm run build, etc.)
+- Network operations (curl, wget, ping, ssh, netstat, etc.)
+- Text processing (awk, sed, grep, jq, yq, cut, sort, uniq, etc.)
+- Database operations (psql, mysql, mongo, redis-cli, etc.)
+- Container operations (docker, kubectl, podman, etc.)
+- Testing and debugging (pytest, jest, cargo test, etc.)
+- File operations (cat, head, tail, wc, diff, patch, etc.)
+- Any CLI tool or script execution
+
+**Agent instruction examples:**
+- "Use the bash tool to run git commands for version control operations"
+- "Execute curl commands using the bash tool to fetch data from APIs"
+- "Use bash to run 'npm install' and 'npm test' commands"
+- "Run Python scripts using the bash tool with 'python script.py'"
+- "Use bash to execute 'docker ps' and inspect container status"
+- "Run database queries using 'psql' or 'mysql' commands via bash"
+- "Use bash to execute system monitoring commands like 'top' or 'ps aux'"
+
+**Pro tips for agent instructions:**
+- Commands can be chained with && for sequential execution
+- Use pipes (|) to combine Unix tools (e.g., "cat file.txt | grep pattern | wc -l")
+- Redirect output with > or >> when needed
+- Full bash shell features are available (variables, loops, conditionals, etc.)
+- Tools like jq, yq, awk, sed can parse and transform data
+
+**Example agent with executeCommand:**
+\`\`\`json
+{
+  "name": "arxiv-feed-reader",
+  "description": "A feed reader for the arXiv",
+  "model": "gpt-5.1",
+  "instructions": "Extract latest papers from the arXiv feed and summarize them. Use curl to fetch the RSS feed, then parse it with yq and jq:\n\ncurl -s https://rss.arxiv.org/rss/cs.AI | yq -p=xml -o=json | jq -r '.rss.channel.item[] | select(.title | test(\"agent\"; \"i\")) | \"\\(.title)\\n\\(.link)\\n\\(.description)\\n\"'\n\nThis will give you papers containing 'agent' in the title.",
+  "tools": {
+    "bash": {
+      "type": "builtin",
+      "name": "executeCommand"
+    }
+  }
+}
+\`\`\`
+
+**Another example - System monitoring agent:**
+\`\`\`json
+{
+  "name": "system-monitor",
+  "description": "Monitor system resources and processes",
+  "model": "gpt-5.1",
+  "instructions": "Monitor system resources using bash commands. Use 'df -h' for disk usage, 'free -h' for memory, 'top -bn1' for processes, 'ps aux' for process list. Parse the output and report any issues.",
+  "tools": {
+    "bash": {
+      "type": "builtin",
+      "name": "executeCommand"
+    }
+  }
+}
+\`\`\`
+
+**Another example - Git automation agent:**
+\`\`\`json
+{
+  "name": "git-helper",
+  "description": "Automate git operations",
+  "model": "gpt-5.1",
+  "instructions": "Help with git operations. Use commands like 'git status', 'git log --oneline -10', 'git diff', 'git branch -a' to inspect the repository. Can also run 'git add', 'git commit', 'git push' when instructed.",
+  "tools": {
+    "bash": {
+      "type": "builtin",
+      "name": "executeCommand"
+    }
+  }
+}
+\`\`\`
+
+## Agent-to-Agent Calling
+
+Agents can call other agents as tools to create complex multi-step workflows. This is the core mechanism for building multi-agent systems in the CLI.
+
+**Tool declaration:**
+\`\`\`json
+"tools": {
+  "summariser": {
+    "type": "agent",
+    "name": "summariser_agent"
+  }
+}
+\`\`\`
+
+**When to use:**
+- Breaking complex tasks into specialized sub-agents
+- Creating reusable agent components
+- Orchestrating multi-step workflows
+- Delegating specialized tasks (e.g., summarization, data processing, audio generation)
+
+**How it works:**
+- The agent calls the tool like any other tool
+- The target agent receives the input and processes it
+- Results are returned as tool output
+- The calling agent can then continue processing or delegate further
+
+**Example - Agent that delegates to a summarizer:**
+\`\`\`json
+{
+  "name": "paper_analyzer",
+  "model": "gpt-5.1",
+  "instructions": "Pick 2 interesting papers and summarise each using the summariser tool. Pass the paper URL to the summariser. Don't ask for human input.",
+  "tools": {
+    "summariser": {
+      "type": "agent",
+      "name": "summariser_agent"
+    }
+  }
+}
+\`\`\`
+
+**Tips for agent chaining:**
+- Make instructions explicit about when to call other agents
+- Pass clear, structured data between agents
+- Add "Don't ask for human input" for autonomous workflows
+- Keep each agent focused on a single responsibility
+
+## Additional Builtin Tools
+
+While \`executeCommand\` is the most versatile, other builtin tools exist for specific Rowboat operations (file management, agent inspection, etc.). These are primarily used by the Rowboat copilot itself and are not typically needed in user agents. If you need file operations, consider using bash commands like \`cat\`, \`echo\`, \`tee\`, etc. through \`executeCommand\`.
+
+## Best Practices
+
+1. **Give agents clear examples** in their instructions showing exact bash commands to run
+2. **Explain output parsing** - show how to use jq, yq, grep, awk to extract data
+3. **Chain commands efficiently** - use && for sequences, | for pipes
+4. **Handle errors** - remind agents to check exit codes and stderr
+5. **Be specific** - provide example commands rather than generic descriptions
+6. **Security** - remind agents to validate inputs and avoid dangerous operations
+
+## When to Use Builtin Tools vs MCP Tools vs Agent Tools
+
+- **Use builtin executeCommand** when you need: CLI tools, system operations, data processing, git operations, any shell command
+- **Use MCP tools** when you need: Web scraping (firecrawl), text-to-speech (elevenlabs), specialized APIs, external service integrations
+- **Use agent tools (\`"type": "agent"\`)** when you need: Complex multi-step logic, task delegation, specialized processing that benefits from LLM reasoning
+
+Many tasks can be accomplished with just \`executeCommand\` and common Unix tools - it's incredibly powerful!
+
+## Key Insight: Multi-Agent Workflows
+
+In the CLI, multi-agent workflows are built by:
+1. Creating specialized agents for specific tasks (in \`agents/\` directory)
+2. Creating an orchestrator agent that has other agents in its \`tools\`
+3. Running the orchestrator with \`rowboatx --agent orchestrator_name\`
+
+There are no separate "workflow" files - everything is an agent!
+`;
+
+export default skill;
--- a/apps/cli/src/application/assistant/skills/deletion-guardrails/skill.ts
+++ b/apps/cli/src/application/assistant/skills/deletion-guardrails/skill.ts
@ -0,0 +1,24 @@
+export const skill = String.raw`
+# Deletion Guardrails
+
+Load this skill when a user asks to delete agents or workflows so you follow the required confirmation steps.
+
+## Workflow deletion protocol
+1. Read the workflow file to identify every agent it references.
+2. Report those agents to the user and ask whether they should be deleted too.
+3. Wait for explicit confirmation before deleting anything.
+4. Only remove the workflow and/or agents the user authorizes.
+
+## Agent deletion protocol
+1. Inspect the agent file to discover which workflows reference it.
+2. List those workflows to the user and ask whether they should be updated or deleted.
+3. Pause for confirmation before modifying workflows or removing the agent.
+4. Perform only the deletions the user approves.
+
+## Safety checklist
+- Never delete cascaded resources automatically.
+- Keep a clear audit trail in your responses describing what was removed.
+- If the user’s instructions are ambiguous, ask clarifying questions before taking action.
+`;
+
+export default skill;
--- a/apps/cli/src/application/assistant/skills/index.ts
+++ b/apps/cli/src/application/assistant/skills/index.ts
@ -0,0 +1,151 @@
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+import builtinToolsSkill from "./builtin-tools/skill.js";
+import deletionGuardrailsSkill from "./deletion-guardrails/skill.js";
+import mcpIntegrationSkill from "./mcp-integration/skill.js";
+import workflowAuthoringSkill from "./workflow-authoring/skill.js";
+import workflowRunOpsSkill from "./workflow-run-ops/skill.js";
+
+const CURRENT_FILE = fileURLToPath(import.meta.url);
+const CURRENT_DIR = path.dirname(CURRENT_FILE);
+const CATALOG_PREFIX = "src/application/assistant/skills";
+
+type SkillDefinition = {
+  id: string;
+  title: string;
+  folder: string;
+  summary: string;
+  content: string;
+};
+
+type ResolvedSkill = {
+  id: string;
+  catalogPath: string;
+  content: string;
+};
+
+const definitions: SkillDefinition[] = [
+  {
+    id: "workflow-authoring",
+    title: "Workflow Authoring",
+    folder: "workflow-authoring",
+    summary: "Creating or editing workflows/agents, validating schema rules, and keeping filenames aligned with JSON ids.",
+    content: workflowAuthoringSkill,
+  },
+  {
+    id: "builtin-tools",
+    title: "Builtin Tools Reference",
+    folder: "builtin-tools",
+    summary: "Understanding and using builtin tools (especially executeCommand for bash/shell) in agent definitions.",
+    content: builtinToolsSkill,
+  },
+  {
+    id: "mcp-integration",
+    title: "MCP Integration Guidance",
+    folder: "mcp-integration",
+    summary: "Listing MCP servers/tools and embedding their schemas in agent definitions.",
+    content: mcpIntegrationSkill,
+  },
+  {
+    id: "deletion-guardrails",
+    title: "Deletion Guardrails",
+    folder: "deletion-guardrails",
+    summary: "Following the confirmation process before removing workflows or agents and their dependencies.",
+    content: deletionGuardrailsSkill,
+  },
+  {
+    id: "workflow-run-ops",
+    title: "Workflow Run Operations",
+    folder: "workflow-run-ops",
+    summary: "Commands that list workflow runs, inspect paused executions, or manage cron schedules for workflows.",
+    content: workflowRunOpsSkill,
+  },
+];
+
+const skillEntries = definitions.map((definition) => ({
+  ...definition,
+  catalogPath: `${CATALOG_PREFIX}/${definition.folder}/skill.ts`,
+}));
+
+const catalogSections = skillEntries.map((entry) => [
+  `## ${entry.title}`,
+  `- **Skill file:** \`${entry.catalogPath}\``,
+  `- **Use it for:** ${entry.summary}`,
+].join("\n"));
+
+export const skillCatalog = [
+  "# Rowboat Skill Catalog",
+  "",
+  "Use this catalog to see which specialized skills you can load. Each entry lists the exact skill file plus a short description of when it helps.",
+  "",
+  catalogSections.join("\n\n"),
+].join("\n");
+
+const normalizeIdentifier = (value: string) =>
+  value.trim().replace(/\\/g, "/").replace(/^\.\/+/, "");
+
+const aliasMap = new Map<string, ResolvedSkill>();
+
+const registerAlias = (alias: string, entry: ResolvedSkill) => {
+  const normalized = normalizeIdentifier(alias);
+  if (!normalized) return;
+  aliasMap.set(normalized, entry);
+};
+
+const registerAliasVariants = (alias: string, entry: ResolvedSkill) => {
+  const normalized = normalizeIdentifier(alias);
+  if (!normalized) return;
+
+  const variants = new Set<string>([normalized]);
+
+  if (/\.(ts|js)$/i.test(normalized)) {
+    variants.add(normalized.replace(/\.(ts|js)$/i, ""));
+    variants.add(
+      normalized.endsWith(".ts") ? normalized.replace(/\.ts$/i, ".js") : normalized.replace(/\.js$/i, ".ts"),
+    );
+  } else {
+    variants.add(`${normalized}.ts`);
+    variants.add(`${normalized}.js`);
+  }
+
+  for (const variant of variants) {
+    registerAlias(variant, entry);
+  }
+};
+
+for (const entry of skillEntries) {
+  const absoluteTs = path.join(CURRENT_DIR, entry.folder, "skill.ts");
+  const absoluteJs = path.join(CURRENT_DIR, entry.folder, "skill.js");
+  const resolvedEntry: ResolvedSkill = {
+    id: entry.id,
+    catalogPath: entry.catalogPath,
+    content: entry.content,
+  };
+
+  const baseAliases = [
+    entry.id,
+    entry.folder,
+    `${entry.folder}/skill`,
+    `${entry.folder}/skill.ts`,
+    `${entry.folder}/skill.js`,
+    `skills/${entry.folder}/skill.ts`,
+    `skills/${entry.folder}/skill.js`,
+    `${CATALOG_PREFIX}/${entry.folder}/skill.ts`,
+    `${CATALOG_PREFIX}/${entry.folder}/skill.js`,
+    absoluteTs,
+    absoluteJs,
+  ];
+
+  for (const alias of baseAliases) {
+    registerAliasVariants(alias, resolvedEntry);
+  }
+}
+
+export const availableSkills = skillEntries.map((entry) => entry.id);
+
+export function resolveSkill(identifier: string): ResolvedSkill | null {
+  const normalized = normalizeIdentifier(identifier);
+  if (!normalized) return null;
+
+  return aliasMap.get(normalized) ?? null;
+}
--- a/apps/cli/src/application/assistant/skills/mcp-integration/skill.ts
+++ b/apps/cli/src/application/assistant/skills/mcp-integration/skill.ts
@ -0,0 +1,60 @@
+export const skill = String.raw`
+# MCP Integration Guidance
+
+Load this skill whenever a user asks about external tools, MCP servers, or how to extend an agent’s capabilities.
+
+## Key concepts
+- MCP servers expose tools (web scraping, APIs, databases, etc.) declared in \`config/mcp.json\`.
+- Agents reference MCP tools through the \`"tools"\` block by specifying \`type\`, \`name\`, \`description\`, \`mcpServerName\`, and a full \`inputSchema\`.
+- Tool schemas can include optional property descriptions; only include \`"required"\` when parameters are mandatory.
+
+## Operator actions
+1. Use \`listMcpServers\` to enumerate configured servers.
+2. Use \`listMcpTools\` for a server to understand the available operations and schemas.
+3. Explain which MCP tools match the user’s needs before editing agent definitions.
+4. When adding a tool to an agent, document what it does and ensure the schema mirrors the MCP definition.
+
+## Example snippets to reference
+- Firecrawl search (required param):
+\`\`\`
+"tools": {
+  "search": {
+    "type": "mcp",
+    "name": "firecrawl_search",
+    "description": "Search the web",
+    "mcpServerName": "firecrawl",
+    "inputSchema": {
+      "type": "object",
+      "properties": {
+        "query": {"type": "string", "description": "Search query"},
+        "limit": {"type": "number", "description": "Number of results"}
+      },
+      "required": ["query"]
+    }
+  }
+}
+\`\`\`
+- ElevenLabs text-to-speech (no required array):
+\`\`\`
+"tools": {
+  "text_to_speech": {
+    "type": "mcp",
+    "name": "text_to_speech",
+    "description": "Generate audio from text",
+    "mcpServerName": "elevenLabs",
+    "inputSchema": {
+      "type": "object",
+      "properties": {
+        "text": {"type": "string"}
+      }
+    }
+  }
+}
+\`\`\`
+
+## Safety reminders
+- Only recommend MCP tools that are actually configured.
+- Clarify any missing details (required parameters, server names) before modifying files.
+`;
+
+export default skill;
--- a/apps/cli/src/application/assistant/skills/workflow-authoring/skill.ts
+++ b/apps/cli/src/application/assistant/skills/workflow-authoring/skill.ts
@ -0,0 +1,168 @@
+export const skill = String.raw`
+# Agent and Workflow Authoring
+
+Load this skill whenever a user wants to inspect, create, or update agents inside the Rowboat workspace.
+
+## Core Concepts
+
+**IMPORTANT**: In the CLI, there are NO separate "workflow" files. Everything is an agent.
+
+- **All definitions live in \`agents/*.json\`** - there is no separate workflows folder
+- Agents configure a model, instructions, and the tools they can use
+- Tools can be: builtin (like \`executeCommand\`), MCP integrations, or **other agents**
+- **"Workflows" are just agents that orchestrate other agents** by having them as tools
+
+## How multi-agent workflows work
+
+1. **Create an orchestrator agent** that has other agents in its \`tools\`
+2. **Run the orchestrator**: \`rowboatx --agent orchestrator_name\`
+3. The orchestrator calls other agents as tools when needed
+4. Data flows through tool call parameters and responses
+
+## Agent format
+\`\`\`json
+{
+  "name": "agent_name",
+  "description": "Description of the agent",
+  "model": "gpt-5.1",
+  "instructions": "Instructions for the agent",
+  "tools": {
+    "descriptive_tool_key": {
+      "type": "mcp",
+      "name": "actual_mcp_tool_name",
+      "description": "What the tool does",
+      "mcpServerName": "server_name_from_config",
+      "inputSchema": {
+        "type": "object",
+        "properties": {
+          "param1": {"type": "string", "description": "What the parameter means"}
+        }
+      }
+    }
+  }
+}
+\`\`\`
+
+## Tool types
+
+### Builtin tools
+\`\`\`json
+"bash": {
+  "type": "builtin",
+  "name": "executeCommand"
+}
+\`\`\`
+
+### MCP tools
+\`\`\`json
+"search": {
+  "type": "mcp",
+  "name": "firecrawl_search",
+  "description": "Search the web",
+  "mcpServerName": "firecrawl",
+  "inputSchema": {
+    "type": "object",
+    "properties": {
+      "query": {"type": "string", "description": "Search query"}
+    },
+    "required": ["query"]
+  }
+}
+\`\`\`
+
+### Agent tools (for chaining agents)
+\`\`\`json
+"summariser": {
+  "type": "agent",
+  "name": "summariser_agent"
+}
+\`\`\`
+- Use \`"type": "agent"\` to call other agents as tools
+- The target agent will be invoked with the parameters you pass
+- Results are returned as tool output
+- This is how you build multi-agent workflows
+
+## Complete Multi-Agent Workflow Example
+
+**Podcast creation workflow** - This is all done through agents calling other agents:
+
+**1. Task-specific agent** (does one thing):
+\`\`\`json
+{
+  "name": "summariser_agent",
+  "description": "Summarises an arxiv paper",
+  "model": "gpt-5.1",
+  "instructions": "Download and summarise an arxiv paper. Use curl to fetch the PDF. Output just the GIST in two lines. Don't ask for human input.",
+  "tools": {
+    "bash": {"type": "builtin", "name": "executeCommand"}
+  }
+}
+\`\`\`
+
+**2. Agent that delegates to other agents**:
+\`\`\`json
+{
+  "name": "summarise-a-few",
+  "description": "Summarises multiple arxiv papers",
+  "model": "gpt-5.1",
+  "instructions": "Pick 2 interesting papers and summarise each using the summariser tool. Pass the paper URL to the tool. Don't ask for human input.",
+  "tools": {
+    "summariser": {
+      "type": "agent",
+      "name": "summariser_agent"
+    }
+  }
+}
+\`\`\`
+
+**3. Orchestrator agent** (coordinates the whole workflow):
+\`\`\`json
+{
+  "name": "podcast_workflow",
+  "description": "Create a podcast from arXiv papers",
+  "model": "gpt-5.1",
+  "instructions": "1. Fetch arXiv papers about agents using bash\n2. Pick papers and summarise them using summarise_papers\n3. Create a podcast transcript\n4. Generate audio using text_to_speech\n\nExecute these steps in sequence.",
+  "tools": {
+    "bash": {"type": "builtin", "name": "executeCommand"},
+    "summarise_papers": {
+      "type": "agent",
+      "name": "summarise-a-few"
+    },
+    "text_to_speech": {
+      "type": "mcp",
+      "name": "text_to_speech",
+      "mcpServerName": "elevenLabs",
+      "description": "Generate audio",
+      "inputSchema": { "type": "object", "properties": {...}}
+    }
+  }
+}
+\`\`\`
+
+**To run this workflow**: \`rowboatx --agent podcast_workflow\`
+
+## Naming and organization rules
+- **All agents live in \`agents/*.json\`** - no other location
+- Agent filenames must match the \`"name"\` field exactly
+- When referencing an agent as a tool, use its \`"name"\` value
+- Always keep filenames and \`"name"\` fields perfectly aligned
+- Use relative paths (no \${BASE_DIR} prefixes) when giving examples to users
+
+## Best practices for multi-agent design
+1. **Single responsibility**: Each agent should do one specific thing well
+2. **Clear delegation**: Agent instructions should explicitly say when to call other agents
+3. **Autonomous operation**: Add "Don't ask for human input" for autonomous workflows
+4. **Data passing**: Make it clear what data to extract and pass between agents
+5. **Tool naming**: Use descriptive tool keys (e.g., "summariser", "fetch_data", "analyze")
+6. **Orchestration**: Create a top-level agent that coordinates the workflow
+
+## Capabilities checklist
+1. Explore \`agents/\` directory to understand existing agents before editing
+2. Update files carefully to maintain schema validity
+3. When creating multi-agent workflows, create an orchestrator agent
+4. Add other agents as tools with \`"type": "agent"\` for chaining
+5. List and explore MCP servers/tools when users need new capabilities
+6. Confirm work done and outline next steps once changes are complete
+`;
+
+export default skill;
--- a/apps/cli/src/application/assistant/skills/workflow-run-ops/skill.ts
+++ b/apps/cli/src/application/assistant/skills/workflow-run-ops/skill.ts
@ -0,0 +1,95 @@
+export const skill = String.raw`
+# Agent Run Operations
+
+Package of repeatable commands for running agents, inspecting agent run history under ~/.rowboat/runs, and managing cron schedules. Load this skill whenever a user asks about running agents, execution history, paused runs, or scheduling.
+
+## When to use
+- User wants to run an agent (including multi-agent workflows)
+- User wants to list or filter agent runs (all runs, by agent, time range, or paused for input)
+- User wants to inspect cron jobs or change agent schedules
+- User asks how to set up monitoring for waiting runs
+
+## Running Agents
+
+**To run any agent**:
+\`\`\`bash
+rowboatx --agent <agent-name>
+\`\`\`
+
+**With input**:
+\`\`\`bash
+rowboatx --agent <agent-name> --input "your input here"
+\`\`\`
+
+**Non-interactive** (for automation/cron):
+\`\`\`bash
+rowboatx --agent <agent-name> --input "input" --no-interactive
+\`\`\`
+
+**Note**: Multi-agent workflows are just agents that have other agents in their tools. Run the orchestrator agent to trigger the whole workflow.
+
+## Run monitoring examples
+Operate from ~/.rowboat (Rowboat tools already set this as the working directory). Use executeCommand with the sample Bash snippets below, modifying placeholders as needed.
+
+Each run file name starts with a timestamp like '2025-11-12T08-02-41Z'. You can use this to filter for date/time ranges.
+
+Each line of the run file contains a running log with the first line containing information about the agent run. E.g. '{"type":"start","runId":"2025-11-12T08-02-41Z-0014322-000","agent":"agent_name","interactive":true,"ts":"2025-11-12T08:02:41.168Z"}'
+
+If a run is waiting for human input the last line will contain 'paused_for_human_input'. See examples below.
+
+1. **List all runs**
+   
+   ls ~/.rowboat/runs
+   
+
+2. **Filter by agent**
+   
+   grep -rl '"agent":"<agent-name>"' ~/.rowboat/runs | xargs -n1 basename | sed 's/\.jsonl$//' | sort -r
+   
+   Replace <agent-name> with the desired agent name.
+
+3. **Filter by time window**
+   To the previous commands add the below through unix pipe
+   
+   awk -F'/' '$NF >= "2025-11-12T08-03" && $NF <= "2025-11-12T08-10"'
+   
+   Use the correct timestamps.
+
+4. **Show runs waiting for human input**
+   
+   awk 'FNR==1{if (NR>1) print fn, last; fn=FILENAME} {last=$0} END{print fn, last}' ~/.rowboat/runs/*.jsonl | grep 'pause-for-human-input' | awk '{print $1}'
+   
+   Prints the files whose last line equals 'pause-for-human-input'.
+
+## Cron management examples
+
+For scheduling agents to run automatically at specific times.
+
+1. **View current cron schedule**
+   \`\`\`bash
+   crontab -l 2>/dev/null || echo 'No crontab entries configured.'
+   \`\`\`
+
+2. **Schedule an agent to run periodically**
+   \`\`\`bash
+   (crontab -l 2>/dev/null; echo '0 10 * * * cd /path/to/cli && rowboatx --agent <agent-name> --input "input" --no-interactive >> ~/.rowboat/logs/<agent-name>.log 2>&1') | crontab -
+   \`\`\`
+   
+   Example (runs daily at 10 AM):
+   \`\`\`bash
+   (crontab -l 2>/dev/null; echo '0 10 * * * cd ~/rowboat-V2/apps/cli && rowboatx --agent podcast_workflow --no-interactive >> ~/.rowboat/logs/podcast.log 2>&1') | crontab -
+   \`\`\`
+
+3. **Unschedule/remove an agent**
+   \`\`\`bash
+   crontab -l | grep -v '<agent-name>' | crontab -
+   \`\`\`
+
+## Common cron schedule patterns
+- \`0 10 * * *\` - Daily at 10 AM
+- \`0 */6 * * *\` - Every 6 hours
+- \`0 9 * * 1\` - Every Monday at 9 AM
+- \`*/30 * * * *\` - Every 30 minutes
+`;
+
+export default skill;
--- a/apps/cli/src/application/config/config.ts
+++ b/apps/cli/src/application/config/config.ts
@ -0,0 +1,82 @@
+import path from "path";
+import fs from "fs";
+import { McpServerConfig } from "../entities/mcp.js";
+import { ModelConfig as ModelConfigT } from "../entities/models.js";
+import { z } from "zod";
+import { homedir } from "os";
+
+// Resolve app root relative to compiled file location (dist/...)
+export const WorkDir = path.join(homedir(), ".rowboat");
+
+const baseMcpConfig: z.infer<typeof McpServerConfig> = {
+    mcpServers: {
+        firecrawl: {
+            command: "npx",
+            args: ["-y", "supergateway", "--stdio", "npx -y firecrawl-mcp"],
+            env: {
+                FIRECRAWL_API_KEY: "fc-aaacee4bdd164100a4d83af85bef6fdc",
+            },
+        },
+        test: {
+            url: "http://localhost:3000",
+            headers: {
+                "Authorization": "Bearer test",
+            },
+        },
+    }
+};
+
+const baseModelConfig: z.infer<typeof ModelConfigT> = {
+    providers: {
+        openai: {
+            flavor: "openai",
+        },
+    },
+    defaults: {
+        provider: "openai",
+        model: "gpt-5.1",
+    }
+};
+
+function ensureMcpConfig() {
+    const configPath = path.join(WorkDir, "config", "mcp.json");
+    if (!fs.existsSync(configPath)) {
+        fs.writeFileSync(configPath, JSON.stringify(baseMcpConfig, null, 2));
+    }
+}
+
+function ensureModelConfig() {
+    const configPath = path.join(WorkDir, "config", "models.json");
+    if (!fs.existsSync(configPath)) {
+        fs.writeFileSync(configPath, JSON.stringify(baseModelConfig, null, 2));
+    }
+}
+
+function ensureDirs() {
+    const ensure = (p: string) => { if (!fs.existsSync(p)) fs.mkdirSync(p, { recursive: true }); };
+    ensure(WorkDir);
+    ensure(path.join(WorkDir, "agents"));
+    ensure(path.join(WorkDir, "config"));
+    ensureMcpConfig();
+    ensureModelConfig();
+}
+
+ensureDirs();
+
+function loadMcpServerConfig(): z.infer<typeof McpServerConfig> {
+    const configPath = path.join(WorkDir, "config", "mcp.json");
+    if (!fs.existsSync(configPath)) return { mcpServers: {} };
+    const config = fs.readFileSync(configPath, "utf8");
+    return McpServerConfig.parse(JSON.parse(config));
+}
+
+function loadModelConfig(): z.infer<typeof ModelConfigT> {
+    const configPath = path.join(WorkDir, "config", "models.json");
+    if (!fs.existsSync(configPath)) return baseModelConfig;
+    const config = fs.readFileSync(configPath, "utf8");
+    return ModelConfigT.parse(JSON.parse(config));
+}
+
+const { mcpServers } = loadMcpServerConfig();
+export const McpServers = mcpServers;
+export const ModelConfig = loadModelConfig();
--- a/apps/cli/src/application/config/security.ts
+++ b/apps/cli/src/application/config/security.ts
@ -0,0 +1,101 @@
+import path from "path";
+import fs from "fs";
+import { WorkDir } from "./config.js";
+
+export const SECURITY_CONFIG_PATH = path.join(WorkDir, "config", "security.json");
+
+const DEFAULT_ALLOW_LIST = [
+    "cat",
+    "curl",
+    "date",
+    "echo",
+    "grep",
+    "jq",
+    "ls",
+    "pwd",
+    "yq",
+    "whoami"
+]
+
+let cachedAllowList: string[] | null = null;
+let cachedMtimeMs: number | null = null;
+
+function ensureSecurityConfig() {
+    if (!fs.existsSync(SECURITY_CONFIG_PATH)) {
+        fs.writeFileSync(
+            SECURITY_CONFIG_PATH,
+            JSON.stringify(DEFAULT_ALLOW_LIST, null, 2) + "\n",
+            "utf8",
+        );
+    }
+}
+
+function normalizeList(commands: unknown[]): string[] {
+    const seen = new Set<string>();
+    for (const entry of commands) {
+        if (typeof entry !== "string") continue;
+        const normalized = entry.trim().toLowerCase();
+        if (!normalized) continue;
+        seen.add(normalized);
+    }
+
+    return Array.from(seen);
+}
+
+function parseSecurityPayload(payload: unknown): string[] {
+    if (Array.isArray(payload)) {
+        return normalizeList(payload);
+    }
+
+    if (payload && typeof payload === "object") {
+        const maybeObject = payload as Record<string, unknown>;
+        if (Array.isArray(maybeObject.allowedCommands)) {
+            return normalizeList(maybeObject.allowedCommands);
+        }
+
+        const dynamicList = Object.entries(maybeObject)
+            .filter(([, value]) => Boolean(value))
+            .map(([key]) => key);
+
+        return normalizeList(dynamicList);
+    }
+
+    return [];
+}
+
+function readAllowList(): string[] {
+    ensureSecurityConfig();
+
+    try {
+        const configContent = fs.readFileSync(SECURITY_CONFIG_PATH, "utf8");
+        const parsed = JSON.parse(configContent);
+        return parseSecurityPayload(parsed);
+    } catch (error) {
+        console.warn(`Failed to read security config at ${SECURITY_CONFIG_PATH}: ${error instanceof Error ? error.message : error}`);
+        return DEFAULT_ALLOW_LIST;
+    }
+}
+
+export function getSecurityAllowList(): string[] {
+    ensureSecurityConfig();
+    try {
+        const stats = fs.statSync(SECURITY_CONFIG_PATH);
+        if (cachedAllowList && cachedMtimeMs === stats.mtimeMs) {
+            return cachedAllowList;
+        }
+
+        const allowList = readAllowList();
+        cachedAllowList = allowList;
+        cachedMtimeMs = stats.mtimeMs;
+        return allowList;
+    } catch {
+        cachedAllowList = null;
+        cachedMtimeMs = null;
+        return readAllowList();
+    }
+}
+
+export function resetSecurityAllowListCache() {
+    cachedAllowList = null;
+    cachedMtimeMs = null;
+}
--- a/apps/cli/src/application/entities/agent.ts
+++ b/apps/cli/src/application/entities/agent.ts
@ -0,0 +1,35 @@
+import { z } from "zod";
+
+export const BaseTool = z.object({
+    name: z.string(),
+});
+
+export const BuiltinTool = BaseTool.extend({
+    type: z.literal("builtin"),
+});
+
+export const McpTool = BaseTool.extend({
+    type: z.literal("mcp"),
+    description: z.string(),
+    inputSchema: z.any(),
+    mcpServerName: z.string(),
+});
+
+export const AgentAsATool = BaseTool.extend({
+    type: z.literal("agent"),
+});
+
+export const ToolAttachment = z.discriminatedUnion("type", [
+    BuiltinTool,
+    McpTool,
+    AgentAsATool,
+]);
+
+export const Agent = z.object({
+    name: z.string(),
+    provider: z.string().optional(),
+    model: z.string().optional(),
+    description: z.string(),
+    instructions: z.string(),
+    tools: z.record(z.string(), ToolAttachment).optional(),
+});
--- a/apps/cli/src/application/entities/llm-step-events.ts
+++ b/apps/cli/src/application/entities/llm-step-events.ts
@ -0,0 +1,56 @@
+import { z } from "zod";
+
+export const LlmStepStreamReasoningStartEvent = z.object({
+    type: z.literal("reasoning-start"),
+});
+
+export const LlmStepStreamReasoningDeltaEvent = z.object({
+    type: z.literal("reasoning-delta"),
+    delta: z.string(),
+});
+
+export const LlmStepStreamReasoningEndEvent = z.object({
+    type: z.literal("reasoning-end"),
+});
+
+export const LlmStepStreamTextStartEvent = z.object({
+    type: z.literal("text-start"),
+});
+
+export const LlmStepStreamTextDeltaEvent = z.object({
+    type: z.literal("text-delta"),
+    delta: z.string(),
+});
+
+export const LlmStepStreamTextEndEvent = z.object({
+    type: z.literal("text-end"),
+});
+
+export const LlmStepStreamToolCallEvent = z.object({
+    type: z.literal("tool-call"),
+    toolCallId: z.string(),
+    toolName: z.string(),
+    input: z.any(),
+});
+
+export const LlmStepStreamUsageEvent = z.object({
+    type: z.literal("usage"),
+    usage: z.object({
+        inputTokens: z.number().optional(),
+        outputTokens: z.number().optional(),
+        totalTokens: z.number().optional(),
+        reasoningTokens: z.number().optional(),
+        cachedInputTokens: z.number().optional(),
+    }),
+});
+
+export const LlmStepStreamEvent = z.union([
+    LlmStepStreamReasoningStartEvent,
+    LlmStepStreamReasoningDeltaEvent,
+    LlmStepStreamReasoningEndEvent,
+    LlmStepStreamTextStartEvent,
+    LlmStepStreamTextDeltaEvent,
+    LlmStepStreamTextEndEvent,
+    LlmStepStreamToolCallEvent,
+    LlmStepStreamUsageEvent,
+]);
--- a/apps/cli/src/application/entities/mcp.ts
+++ b/apps/cli/src/application/entities/mcp.ts
@ -0,0 +1,16 @@
+import z from "zod";
+
+const StdioMcpServerConfig = z.object({
+    command: z.string(),
+    args: z.array(z.string()).optional(),
+    env: z.record(z.string(), z.string()).optional(),
+});
+
+const HttpMcpServerConfig = z.object({
+    url: z.string(),
+    headers: z.record(z.string(), z.string()).optional(),
+});
+
+export const McpServerConfig = z.object({
+    mcpServers: z.record(z.string(), z.union([StdioMcpServerConfig, HttpMcpServerConfig])),
+});
--- a/apps/cli/src/application/entities/message.ts
+++ b/apps/cli/src/application/entities/message.ts
@ -0,0 +1,58 @@
+import { z } from "zod";
+
+export const TextPart = z.object({
+    type: z.literal("text"),
+    text: z.string(),
+});
+
+export const ReasoningPart = z.object({
+    type: z.literal("reasoning"),
+    text: z.string(),
+});
+
+export const ToolCallPart = z.object({
+    type: z.literal("tool-call"),
+    toolCallId: z.string(),
+    toolName: z.string(),
+    arguments: z.any(),
+});
+
+export const AssistantContentPart = z.union([
+    TextPart,
+    ReasoningPart,
+    ToolCallPart,
+]);
+
+export const UserMessage = z.object({
+    role: z.literal("user"),
+    content: z.string(),
+});
+
+export const AssistantMessage = z.object({
+    role: z.literal("assistant"),
+    content: z.union([
+        z.string(),
+        z.array(AssistantContentPart),
+    ]),
+});
+
+export const SystemMessage = z.object({
+    role: z.literal("system"),
+    content: z.string(),
+});
+
+export const ToolMessage = z.object({
+    role: z.literal("tool"),
+    content: z.string(),
+    toolCallId: z.string(),
+    toolName: z.string(),
+});
+
+export const Message = z.discriminatedUnion("role", [
+    AssistantMessage,
+    SystemMessage,
+    ToolMessage,
+    UserMessage,
+]);
+
+export const MessageList = z.array(Message);
--- a/apps/cli/src/application/entities/models.ts
+++ b/apps/cli/src/application/entities/models.ts
@ -0,0 +1,15 @@
+import z from "zod";
+
+export const Provider = z.object({
+    flavor: z.enum(["openai", "anthropic", "google"]),
+    apiKey: z.string().optional(),
+    baseURL: z.string().optional(),
+});
+
+export const ModelConfig = z.object({
+    providers: z.record(z.string(), Provider),
+    defaults: z.object({
+        provider: z.string(),
+        model: z.string(),
+    }),
+});
--- a/apps/cli/src/application/entities/run-events.ts
+++ b/apps/cli/src/application/entities/run-events.ts
@ -0,0 +1,85 @@
+import { LlmStepStreamEvent } from "./llm-step-events.js";
+import { Message, ToolCallPart } from "./message.js";
+import { Agent } from "./agent.js";
+import z from "zod";
+
+const BaseRunEvent = z.object({
+    ts: z.iso.datetime().optional(),
+    subflow: z.array(z.string()),
+});
+
+export const StartEvent = BaseRunEvent.extend({
+    type: z.literal("start"),
+    runId: z.string(),
+    agentName: z.string(),
+});
+
+export const SpawnSubFlowEvent = BaseRunEvent.extend({
+    type: z.literal("spawn-subflow"),
+    agentName: z.string(),
+    toolCallId: z.string(),
+});
+
+export const LlmStreamEvent = BaseRunEvent.extend({
+    type: z.literal("llm-stream-event"),
+    event: LlmStepStreamEvent,
+});
+
+export const MessageEvent = BaseRunEvent.extend({
+    type: z.literal("message"),
+    message: Message,
+});
+
+export const ToolInvocationEvent = BaseRunEvent.extend({
+    type: z.literal("tool-invocation"),
+    toolName: z.string(),
+    input: z.string(),
+});
+
+export const ToolResultEvent = BaseRunEvent.extend({
+    type: z.literal("tool-result"),
+    toolName: z.string(),
+    result: z.any(),
+});
+
+export const AskHumanRequestEvent = BaseRunEvent.extend({
+    type: z.literal("ask-human-request"),
+    toolCallId: z.string(),
+    query: z.string(),
+});
+
+export const AskHumanResponseEvent = BaseRunEvent.extend({
+    type: z.literal("ask-human-response"),
+    toolCallId: z.string(),
+    response: z.string(),
+});
+
+export const ToolPermissionRequestEvent = BaseRunEvent.extend({
+    type: z.literal("tool-permission-request"),
+    toolCall: ToolCallPart,
+});
+
+export const ToolPermissionResponseEvent = BaseRunEvent.extend({
+    type: z.literal("tool-permission-response"),
+    toolCallId: z.string(),
+    response: z.enum(["approve", "deny"]),
+});
+
+export const RunErrorEvent = BaseRunEvent.extend({
+    type: z.literal("error"),
+    error: z.string(),
+});
+
+export const RunEvent = z.union([
+    StartEvent,
+    SpawnSubFlowEvent,
+    LlmStreamEvent,
+    MessageEvent,
+    ToolInvocationEvent,
+    ToolResultEvent,
+    AskHumanRequestEvent,
+    AskHumanResponseEvent,
+    ToolPermissionRequestEvent,
+    ToolPermissionResponseEvent,
+    RunErrorEvent,
+]);
--- a/apps/cli/src/application/lib/agent.ts
+++ b/apps/cli/src/application/lib/agent.ts
@ -0,0 +1,653 @@
+import { jsonSchema, ModelMessage } from "ai";
+import fs from "fs";
+import path from "path";
+import { ModelConfig, WorkDir } from "../config/config.js";
+import { Agent, ToolAttachment } from "../entities/agent.js";
+import { AssistantContentPart, AssistantMessage, Message, MessageList, ToolCallPart, ToolMessage, UserMessage } from "../entities/message.js";
+import { runIdGenerator } from "./run-id-gen.js";
+import { LanguageModel, stepCountIs, streamText, tool, Tool, ToolSet } from "ai";
+import { z } from "zod";
+import { getProvider } from "./models.js";
+import { LlmStepStreamEvent } from "../entities/llm-step-events.js";
+import { execTool } from "./exec-tool.js";
+import { AskHumanRequestEvent, RunEvent, ToolPermissionRequestEvent, ToolPermissionResponseEvent } from "../entities/run-events.js";
+import { BuiltinTools } from "./builtin-tools.js";
+import { CopilotAgent } from "../assistant/agent.js";
+import { isBlocked } from "./command-executor.js";
+
+export async function mapAgentTool(t: z.infer<typeof ToolAttachment>): Promise<Tool> {
+    switch (t.type) {
+        case "mcp":
+            return tool({
+                name: t.name,
+                description: t.description,
+                inputSchema: jsonSchema(t.inputSchema),
+            });
+        case "agent":
+            const agent = await loadAgent(t.name);
+            if (!agent) {
+                throw new Error(`Agent ${t.name} not found`);
+            }
+            return tool({
+                name: t.name,
+                description: agent.description,
+                inputSchema: z.object({
+                    message: z.string().describe("The message to send to the workflow"),
+                }),
+            });
+        case "builtin":
+            if (t.name === "ask-human") {
+                return tool({
+                    description: "Ask a human before proceeding",
+                    inputSchema: z.object({
+                        question: z.string().describe("The question to ask the human"),
+                    }),
+                });
+            }
+            const match = BuiltinTools[t.name];
+            if (!match) {
+                throw new Error(`Unknown builtin tool: ${t.name}`);
+            }
+            return tool({
+                description: match.description,
+                inputSchema: match.inputSchema,
+            });
+    }
+}
+
+export class RunLogger {
+    private logFile: string;
+    private fileHandle: fs.WriteStream;
+
+    ensureRunsDir() {
+        const runsDir = path.join(WorkDir, "runs");
+        if (!fs.existsSync(runsDir)) {
+            fs.mkdirSync(runsDir, { recursive: true });
+        }
+    }
+
+    constructor(runId: string) {
+        this.ensureRunsDir();
+        this.logFile = path.join(WorkDir, "runs", `${runId}.jsonl`);
+        this.fileHandle = fs.createWriteStream(this.logFile, {
+            flags: "a",
+            encoding: "utf8",
+        });
+    }
+
+    log(event: z.infer<typeof RunEvent>) {
+        if (event.type !== "llm-stream-event") {
+            this.fileHandle.write(JSON.stringify(event) + "\n");
+        }
+    }
+
+    close() {
+        this.fileHandle.close();
+    }
+}
+
+export class StreamStepMessageBuilder {
+    private parts: z.infer<typeof AssistantContentPart>[] = [];
+    private textBuffer: string = "";
+    private reasoningBuffer: string = "";
+
+    flushBuffers() {
+        // skip reasoning
+        // if (this.reasoningBuffer) {
+        //     this.parts.push({ type: "reasoning", text: this.reasoningBuffer });
+        //     this.reasoningBuffer = "";
+        // }
+        if (this.textBuffer) {
+            this.parts.push({ type: "text", text: this.textBuffer });
+            this.textBuffer = "";
+        }
+    }
+
+    ingest(event: z.infer<typeof LlmStepStreamEvent>) {
+        switch (event.type) {
+            case "reasoning-start":
+            case "reasoning-end":
+            case "text-start":
+            case "text-end":
+                this.flushBuffers();
+                break;
+            case "reasoning-delta":
+                this.reasoningBuffer += event.delta;
+                break;
+            case "text-delta":
+                this.textBuffer += event.delta;
+                break;
+            case "tool-call":
+                this.parts.push({
+                    type: "tool-call",
+                    toolCallId: event.toolCallId,
+                    toolName: event.toolName,
+                    arguments: event.input,
+                });
+                break;
+        }
+    }
+
+    get(): z.infer<typeof AssistantMessage> {
+        this.flushBuffers();
+        return {
+            role: "assistant",
+            content: this.parts,
+        };
+    }
+}
+
+function normaliseAskHumanToolCall(message: z.infer<typeof AssistantMessage>) {
+    if (typeof message.content === "string") {
+        return;
+    }
+    let askHumanToolCall: z.infer<typeof ToolCallPart> | null = null;
+    const newParts = [];
+    for (const part of message.content as z.infer<typeof AssistantContentPart>[]) {
+        if (part.type === "tool-call" && part.toolName === "ask-human") {
+            if (!askHumanToolCall) {
+                askHumanToolCall = part;
+            } else {
+                (askHumanToolCall as z.infer<typeof ToolCallPart>).arguments += "\n" + part.arguments;
+            }
+            break;
+        } else {
+            newParts.push(part);
+        }
+    }
+    if (askHumanToolCall) {
+        newParts.push(askHumanToolCall);
+    }
+    message.content = newParts;
+}
+
+export async function loadAgent(id: string): Promise<z.infer<typeof Agent>> {
+    if (id === "copilot") {
+        return CopilotAgent;
+    }
+    const agentPath = path.join(WorkDir, "agents", `${id}.json`);
+    const agent = fs.readFileSync(agentPath, "utf8");
+    return Agent.parse(JSON.parse(agent));
+}
+
+export function convertFromMessages(messages: z.infer<typeof Message>[]): ModelMessage[] {
+    const result: ModelMessage[] = [];
+    for (const msg of messages) {
+        switch (msg.role) {
+            case "assistant":
+                if (typeof msg.content === 'string') {
+                    result.push({
+                        role: "assistant",
+                        content: msg.content,
+                    });
+                } else {
+                    result.push({
+                        role: "assistant",
+                        content: msg.content.map(part => {
+                            switch (part.type) {
+                                case 'text':
+                                    return part;
+                                case 'reasoning':
+                                    return part;
+                                case 'tool-call':
+                                    return {
+                                        type: 'tool-call',
+                                        toolCallId: part.toolCallId,
+                                        toolName: part.toolName,
+                                        input: part.arguments,
+                                    };
+                            }
+                        }),
+                    });
+                }
+                break;
+            case "system":
+                result.push({
+                    role: "system",
+                    content: msg.content,
+                });
+                break;
+            case "user":
+                result.push({
+                    role: "user",
+                    content: msg.content,
+                });
+                break;
+            case "tool":
+                result.push({
+                    role: "tool",
+                    content: [
+                        {
+                            type: "tool-result",
+                            toolCallId: msg.toolCallId,
+                            toolName: msg.toolName,
+                            output: {
+                                type: "text",
+                                value: msg.content,
+                            },
+                        },
+                    ],
+                });
+                break;
+        }
+    }
+    return result;
+}
+
+async function buildTools(agent: z.infer<typeof Agent>): Promise<ToolSet> {
+    const tools: ToolSet = {};
+    for (const [name, tool] of Object.entries(agent.tools ?? {})) {
+        try {
+            tools[name] = await mapAgentTool(tool);
+        } catch (error) {
+            console.error(`Error mapping tool ${name}:`, error);
+            continue;
+        }
+    }
+    return tools;
+}
+
+export class AgentState {
+    logger: RunLogger | null = null;
+    runId: string | null = null;
+    agent: z.infer<typeof Agent> | null = null;
+    agentName: string;
+    messages: z.infer<typeof MessageList> = [];
+    lastAssistantMsg: z.infer<typeof AssistantMessage> | null = null;
+    subflowStates: Record<string, AgentState> = {};
+    toolCallIdMap: Record<string, z.infer<typeof ToolCallPart>> = {};
+    pendingToolCalls: Record<string, true> = {};
+    pendingToolPermissionRequests: Record<string, z.infer<typeof ToolPermissionRequestEvent>> = {};
+    pendingAskHumanRequests: Record<string, z.infer<typeof AskHumanRequestEvent>> = {};
+    allowedToolCallIds: Record<string, true> = {};
+    deniedToolCallIds: Record<string, true> = {};
+
+    constructor(agentName: string, runId?: string) {
+        this.agentName = agentName;
+        this.runId = runId || runIdGenerator.next();
+        this.logger = new RunLogger(this.runId);
+        if (!runId) {
+            this.logger.log({
+                type: "start",
+                runId: this.runId,
+                agentName: this.agentName,
+                subflow: [],
+            });
+        }
+    }
+
+    getPendingPermissions(): z.infer<typeof ToolPermissionRequestEvent>[] {
+        const response: z.infer<typeof ToolPermissionRequestEvent>[] = [];
+        for (const [id, subflowState] of Object.entries(this.subflowStates)) {
+            for (const perm of subflowState.getPendingPermissions()) {
+                response.push({
+                    ...perm,
+                    subflow: [id, ...perm.subflow],
+                });
+            }
+        }
+        for (const perm of Object.values(this.pendingToolPermissionRequests)) {
+            response.push({
+                ...perm,
+                subflow: [],
+            });
+        }
+        return response;
+    }
+
+    getPendingAskHumans(): z.infer<typeof AskHumanRequestEvent>[] {
+        const response: z.infer<typeof AskHumanRequestEvent>[] = [];
+        for (const [id, subflowState] of Object.entries(this.subflowStates)) {
+            for (const ask of subflowState.getPendingAskHumans()) {
+                response.push({
+                    ...ask,
+                    subflow: [id, ...ask.subflow],
+                });
+            }
+        }
+        for (const ask of Object.values(this.pendingAskHumanRequests)) {
+            response.push({
+                ...ask,
+                subflow: [],
+            });
+        }
+        return response;
+    }
+
+    finalResponse(): string {
+        if (!this.lastAssistantMsg) {
+            return '';
+        }
+        if (typeof this.lastAssistantMsg.content === "string") {
+            return this.lastAssistantMsg.content;
+        }
+        return this.lastAssistantMsg.content.reduce((acc, part) => {
+            if (part.type === "text") {
+                return acc + part.text;
+            }
+            return acc;
+        }, "");
+    }
+
+    ingest(event: z.infer<typeof RunEvent>) {
+        if (event.subflow.length > 0) {
+            const { subflow, ...rest } = event;
+            this.subflowStates[subflow[0]].ingest({
+                ...rest,
+                subflow: subflow.slice(1),
+            });
+            return;
+        }
+        switch (event.type) {
+            case "message":
+                this.messages.push(event.message);
+                if (event.message.content instanceof Array) {
+                    for (const part of event.message.content) {
+                        if (part.type === "tool-call") {
+                            this.toolCallIdMap[part.toolCallId] = part;
+                            this.pendingToolCalls[part.toolCallId] = true;
+                        }
+                    }
+                }
+                if (event.message.role === "tool") {
+                    const message = event.message as z.infer<typeof ToolMessage>;
+                    delete this.pendingToolCalls[message.toolCallId];
+                }
+                if (event.message.role === "assistant") {
+                    this.lastAssistantMsg = event.message;
+                }
+                break;
+            case "spawn-subflow":
+                this.subflowStates[event.toolCallId] = new AgentState(event.agentName);
+                break;
+            case "tool-permission-request":
+                this.pendingToolPermissionRequests[event.toolCall.toolCallId] = event;
+                break;
+            case "tool-permission-response":
+                switch (event.response) {
+                    case "approve":
+                        this.allowedToolCallIds[event.toolCallId] = true;
+                        break;
+                    case "deny":
+                        this.deniedToolCallIds[event.toolCallId] = true;
+                        break;
+                }
+                delete this.pendingToolPermissionRequests[event.toolCallId];
+                break;
+            case "ask-human-request":
+                this.pendingAskHumanRequests[event.toolCallId] = event;
+                break;
+            case "ask-human-response":
+                // console.error('im here', this.agentName, this.runId, event.subflow);
+                const ogEvent = this.pendingAskHumanRequests[event.toolCallId];
+                this.messages.push({
+                    role: "tool",
+                    content: JSON.stringify({
+                        userResponse: event.response,
+                    }),
+                    toolCallId: ogEvent.toolCallId,
+                    toolName: this.toolCallIdMap[ogEvent.toolCallId]!.toolName,
+                });
+                delete this.pendingAskHumanRequests[ogEvent.toolCallId];
+                break;
+        }
+    }
+
+    ingestAndLog(event: z.infer<typeof RunEvent>) {
+        this.ingest(event);
+        this.logger!.log(event);
+    }
+
+    *ingestAndLogAndYield(event: z.infer<typeof RunEvent>): Generator<z.infer<typeof RunEvent>, void, unknown> {
+        this.ingestAndLog(event);
+        yield event;
+    }
+}
+
+export async function* streamAgent(state: AgentState): AsyncGenerator<z.infer<typeof RunEvent>, void, unknown> {
+    // set up agent
+    const agent = await loadAgent(state.agentName);
+
+    // set up tools
+    const tools = await buildTools(agent);
+
+    // set up provider + model
+    const provider = getProvider(agent.provider);
+    const model = provider(agent.model || ModelConfig.defaults.model);
+    let loopCounter = 0;
+
+    while (true) {
+        // console.error(`loop counter: ${loopCounter++}`)
+        // if last response is from assistant and text, so exit
+        const lastMessage = state.messages[state.messages.length - 1];
+        if (lastMessage
+            && lastMessage.role === "assistant"
+            && (typeof lastMessage.content === "string"
+                || !lastMessage.content.some(part => part.type === "tool-call")
+            )
+        ) {
+            // console.error("Nothing to do, exiting (a.)")
+            return;
+        }
+
+        // execute any pending tool calls
+        for (const toolCallId of Object.keys(state.pendingToolCalls)) {
+            const toolCall = state.toolCallIdMap[toolCallId];
+
+            // if ask-human, skip
+            if (toolCall.toolName === "ask-human") {
+                continue;
+            }
+
+            // if tool has been denied, deny
+            if (state.deniedToolCallIds[toolCallId])  {
+                yield* state.ingestAndLogAndYield({
+                    type: "message",
+                    message: {
+                        role: "tool",
+                        content: "Unable to execute this tool: Permission was denied.",
+                        toolCallId: toolCallId,
+                        toolName: toolCall.toolName,
+                    },
+                    subflow: [],
+                });
+                continue;
+            }
+
+            // if permission is pending on this tool call, allow execution
+            if (state.pendingToolPermissionRequests[toolCallId]) {
+                continue;
+            }
+
+            // execute approved tool
+            yield* state.ingestAndLogAndYield({
+                type: "tool-invocation",
+                toolName: toolCall.toolName,
+                input: JSON.stringify(toolCall.arguments),
+                subflow: [],
+            });
+            let result: any = null;
+            if (agent.tools![toolCall.toolName].type === "agent") {
+                let subflowState = state.subflowStates[toolCallId];
+                for await (const event of streamAgent(subflowState)) {
+                    yield* state.ingestAndLogAndYield({
+                        ...event,
+                        subflow: [toolCallId, ...event.subflow],
+                    });
+                }
+                if (!subflowState.getPendingAskHumans().length && !subflowState.getPendingPermissions().length) {
+                    result = subflowState.finalResponse();
+                }
+            } else {
+                result = await execTool(agent.tools![toolCall.toolName], toolCall.arguments);
+            }
+            if (result) {
+                const resultMsg: z.infer<typeof ToolMessage> = {
+                    role: "tool",
+                    content: JSON.stringify(result),
+                    toolCallId: toolCall.toolCallId,
+                    toolName: toolCall.toolName,
+                };
+                yield* state.ingestAndLogAndYield({
+                    type: "tool-result",
+                    toolName: toolCall.toolName,
+                    result: result,
+                    subflow: [],
+                });
+                yield* state.ingestAndLogAndYield({
+                    type: "message",
+                    message: resultMsg,
+                    subflow: [],
+                });
+            }
+        }
+
+        // if pending state, exit
+        if (state.getPendingAskHumans().length || state.getPendingPermissions().length) {
+            // console.error("pending asks or permissions, exiting (b.)")
+            return;
+        }
+
+        // if current message state isn't runnable, exit
+        if (state.messages.length === 0 || state.messages[state.messages.length - 1].role === "assistant") {
+            // console.error("current message state isn't runnable, exiting (c.)")
+            return;
+        }
+
+        // run one LLM turn.
+        // stream agent response and build message
+        const messageBuilder = new StreamStepMessageBuilder();
+        for await (const event of streamLlm(
+            model,
+            state.messages,
+            agent.instructions,
+            tools,
+        )) {
+            messageBuilder.ingest(event);
+            yield* state.ingestAndLogAndYield({
+                type: "llm-stream-event",
+                event: event,
+                subflow: [],
+            });
+        }
+
+        // build and emit final message from agent response
+        const message = messageBuilder.get();
+        yield* state.ingestAndLogAndYield({
+            type: "message",
+            message,
+            subflow: [],
+        });
+
+        // if there were any ask-human calls, emit those events
+        if (message.content instanceof Array) {
+            for (const part of message.content) {
+                if (part.type === "tool-call") {
+                    const underlyingTool = agent.tools![part.toolName];
+                    if (underlyingTool.type === "builtin" && underlyingTool.name === "ask-human") {
+                        yield* state.ingestAndLogAndYield({
+                            type: "ask-human-request",
+                            toolCallId: part.toolCallId,
+                            query: part.arguments.question,
+                            subflow: [],
+                        });
+                    }
+                    if (underlyingTool.type === "builtin" && underlyingTool.name === "executeCommand") {
+                        // if command is blocked, then seek permission
+                        if (isBlocked(part.arguments.command)) {
+                            yield *state.ingestAndLogAndYield({
+                                type: "tool-permission-request",
+                                toolCall: part,
+                                subflow: [],
+                            });
+                        }
+                    }
+                    if (underlyingTool.type === "agent" && underlyingTool.name) {
+                        yield* state.ingestAndLogAndYield({
+                            type: "spawn-subflow",
+                            agentName: underlyingTool.name,
+                            toolCallId: part.toolCallId,
+                            subflow: [],
+                        });
+                        yield* state.ingestAndLogAndYield({
+                            type: "message",
+                            message: {
+                                role: "user",
+                                content: part.arguments.message,
+                            },
+                            subflow: [part.toolCallId],
+                        });
+                    }
+                }
+            }
+        }
+    }
+}
+
+async function* streamLlm(
+    model: LanguageModel,
+    messages: z.infer<typeof MessageList>,
+    instructions: string,
+    tools: ToolSet,
+): AsyncGenerator<z.infer<typeof LlmStepStreamEvent>, void, unknown> {
+    const { fullStream } = streamText({
+        model,
+        messages: convertFromMessages(messages),
+        system: instructions,
+        tools,
+        stopWhen: stepCountIs(1),
+    });
+    for await (const event of fullStream) {
+        // console.log("\n\n\t>>>>\t\tstream event", JSON.stringify(event));
+        switch (event.type) {
+            case "reasoning-start":
+                yield {
+                    type: "reasoning-start",
+                };
+                break;
+            case "reasoning-delta":
+                yield {
+                    type: "reasoning-delta",
+                    delta: event.text,
+                };
+                break;
+            case "reasoning-end":
+                yield {
+                    type: "reasoning-end",
+                };
+                break;
+            case "text-start":
+                yield {
+                    type: "text-start",
+                };
+                break;
+            case "text-delta":
+                yield {
+                    type: "text-delta",
+                    delta: event.text,
+                };
+                break;
+            case "tool-call":
+                yield {
+                    type: "tool-call",
+                    toolCallId: event.toolCallId,
+                    toolName: event.toolName,
+                    input: event.input,
+                };
+                break;
+            case "finish":
+                yield {
+                    type: "usage",
+                    usage: event.totalUsage,
+                };
+                break;
+            default:
+                // console.warn("Unknown event type", event);
+                continue;
+        }
+    }
+}
+export const MappedToolCall = z.object({
+    toolCall: ToolCallPart,
+    agentTool: ToolAttachment,
+});
--- a/apps/cli/src/application/lib/builtin-tools.ts
+++ b/apps/cli/src/application/lib/builtin-tools.ts
@ -0,0 +1,457 @@
+import { z, ZodType } from "zod";
+import * as fs from "fs/promises";
+import * as path from "path";
+import { WorkDir as BASE_DIR } from "../config/config.js";
+import { executeCommand } from "./command-executor.js";
+import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
+import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";
+import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";
+import { Client } from "@modelcontextprotocol/sdk/client";
+import { resolveSkill, availableSkills } from "../assistant/skills/index.js";
+
+const BuiltinToolsSchema = z.record(z.string(), z.object({
+    description: z.string(),
+	inputSchema: z.custom<ZodType>(),
+    execute: z.function({
+        input: z.any(),
+        output: z.promise(z.any()),
+    }),
+}));
+
+export const BuiltinTools: z.infer<typeof BuiltinToolsSchema> = {
+    loadSkill: {
+        description: "Load a Rowboat skill definition into context by fetching its guidance string",
+        inputSchema: z.object({
+            skillName: z.string().describe("Skill identifier or path (e.g., 'workflow-run-ops' or 'src/application/assistant/skills/workflow-run-ops/skill.ts')"),
+        }),
+        execute: async ({ skillName }: { skillName: string }) => {
+            const resolved = resolveSkill(skillName);
+
+            if (!resolved) {
+                return {
+                    success: false,
+                    message: `Skill '${skillName}' not found. Available skills: ${availableSkills.join(", ")}`,
+                };
+            }
+
+            return {
+                success: true,
+                skillName: resolved.id,
+                path: resolved.catalogPath,
+                content: resolved.content,
+            };
+        },
+    },
+
+    exploreDirectory: {
+        description: 'Recursively explore directory structure to understand existing agents and file organization',
+        inputSchema: z.object({
+            subdirectory: z.string().optional().describe('Subdirectory to explore (optional, defaults to root)'),
+            maxDepth: z.number().optional().describe('Maximum depth to traverse (default: 3)'),
+        }),
+        execute: async ({ subdirectory, maxDepth = 3 }: { subdirectory?: string, maxDepth?: number }) => {
+            async function explore(dir: string, depth: number = 0): Promise<any> {
+                if (depth > maxDepth) return null;
+                
+                try {
+                    const entries = await fs.readdir(dir, { withFileTypes: true });
+                    const result: any = { files: [], directories: {} };
+                    
+                    for (const entry of entries) {
+                        const fullPath = path.join(dir, entry.name);
+                        if (entry.isFile()) {
+                            const ext = path.extname(entry.name);
+                            const size = (await fs.stat(fullPath)).size;
+                            result.files.push({
+                                name: entry.name,
+                                type: ext || 'no-extension',
+                                size: size,
+                                relativePath: path.relative(BASE_DIR, fullPath),
+                            });
+                        } else if (entry.isDirectory()) {
+                            result.directories[entry.name] = await explore(fullPath, depth + 1);
+                        }
+                    }
+                    
+                    return result;
+                } catch (error) {
+                    return { error: error instanceof Error ? error.message : 'Unknown error' };
+                }
+            }
+            
+            const dirPath = subdirectory ? path.join(BASE_DIR, subdirectory) : BASE_DIR;
+            const structure = await explore(dirPath);
+            
+            return {
+                success: true,
+                basePath: path.relative(BASE_DIR, dirPath) || '.',
+                structure,
+            };
+        },
+    },
+    
+    readFile: {
+        description: 'Read and parse file contents. For JSON files, provides parsed structure.',
+        inputSchema: z.object({
+            filename: z.string().describe('The name of the file to read (relative to .rowboat directory)'),
+        }),
+        execute: async ({ filename }: { filename: string }) => {
+            try {
+                const filePath = path.join(BASE_DIR, filename);
+                const content = await fs.readFile(filePath, 'utf-8');
+                
+                let parsed = null;
+                let fileType = path.extname(filename);
+                
+                if (fileType === '.json') {
+                    try {
+                        parsed = JSON.parse(content);
+                    } catch {
+                        parsed = { error: 'Invalid JSON' };
+                    }
+                }
+                
+                return {
+                    success: true,
+                    filename,
+                    fileType,
+                    content,
+                    parsed,
+                    path: filePath,
+                    size: content.length,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to read file: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    createFile: {
+        description: 'Create a new file with content. Automatically creates parent directories if needed.',
+        inputSchema: z.object({
+            filename: z.string().describe('The name of the file to create (relative to .rowboat directory)'),
+            content: z.string().describe('The content to write to the file'),
+            description: z.string().optional().describe('Optional description of why this file is being created'),
+        }),
+        execute: async ({ filename, content, description }: { filename: string, content: string, description?: string }) => {
+            try {
+                const filePath = path.join(BASE_DIR, filename);
+                const dir = path.dirname(filePath);
+                
+                // Ensure directory exists
+                await fs.mkdir(dir, { recursive: true });
+                
+                // Write file
+                await fs.writeFile(filePath, content, 'utf-8');
+                
+                return {
+                    success: true,
+                    message: `File '${filename}' created successfully`,
+                    description: description || 'No description provided',
+                    path: filePath,
+                    size: content.length,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to create file: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    updateFile: {
+        description: 'Update or overwrite the contents of an existing file',
+        inputSchema: z.object({
+            filename: z.string().describe('The name of the file to update (relative to .rowboat directory)'),
+            content: z.string().describe('The new content to write to the file'),
+            reason: z.string().optional().describe('Optional reason for the update'),
+        }),
+        execute: async ({ filename, content, reason }: { filename: string, content: string, reason?: string }) => {
+            try {
+                const filePath = path.join(BASE_DIR, filename);
+                
+                // Check if file exists
+                await fs.access(filePath);
+                
+                // Update file
+                await fs.writeFile(filePath, content, 'utf-8');
+                
+                return {
+                    success: true,
+                    message: `File '${filename}' updated successfully`,
+                    reason: reason || 'No reason provided',
+                    path: filePath,
+                    size: content.length,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to update file: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    deleteFile: {
+        description: 'Delete a file from the .rowboat directory',
+        inputSchema: z.object({
+            filename: z.string().describe('The name of the file to delete (relative to .rowboat directory)'),
+        }),
+        execute: async ({ filename }: { filename: string }) => {
+            try {
+                const filePath = path.join(BASE_DIR, filename);
+                await fs.unlink(filePath);
+                
+                return {
+                    success: true,
+                    message: `File '${filename}' deleted successfully`,
+                    path: filePath,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to delete file: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    listFiles: {
+        description: 'List all files and directories in the .rowboat directory or subdirectory',
+        inputSchema: z.object({
+            subdirectory: z.string().optional().describe('Optional subdirectory to list (relative to .rowboat directory)'),
+        }),
+        execute: async ({ subdirectory }: { subdirectory?: string }) => {
+            try {
+                const dirPath = subdirectory ? path.join(BASE_DIR, subdirectory) : BASE_DIR;
+                const entries = await fs.readdir(dirPath, { withFileTypes: true });
+                
+                const files = entries
+                    .filter(entry => entry.isFile())
+                    .map(entry => ({
+                        name: entry.name,
+                        type: path.extname(entry.name) || 'no-extension',
+                        relativePath: path.relative(BASE_DIR, path.join(dirPath, entry.name)),
+                    }));
+                
+                const directories = entries
+                    .filter(entry => entry.isDirectory())
+                    .map(entry => entry.name);
+                
+                return {
+                    success: true,
+                    path: dirPath,
+                    relativePath: path.relative(BASE_DIR, dirPath) || '.',
+                    files,
+                    directories,
+                    totalFiles: files.length,
+                    totalDirectories: directories.length,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to list files: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    analyzeAgent: {
+        description: 'Read and analyze an agent file to understand its structure, tools, and configuration',
+        inputSchema: z.object({
+            agentName: z.string().describe('Name of the agent file to analyze (with or without .json extension)'),
+        }),
+        execute: async ({ agentName }: { agentName: string }) => {
+            try {
+                const filename = agentName.endsWith('.json') ? agentName : `${agentName}.json`;
+                const filePath = path.join(BASE_DIR, 'agents', filename);
+                
+                const content = await fs.readFile(filePath, 'utf-8');
+                const agent = JSON.parse(content);
+                
+                // Extract key information
+                const toolsList = agent.tools ? Object.keys(agent.tools) : [];
+                const agentTools = agent.tools ? Object.entries(agent.tools).map(([key, tool]: [string, any]) => ({
+                    key,
+                    type: tool.type,
+                    name: tool.name || key,
+                })) : [];
+                
+                const analysis = {
+                    name: agent.name,
+                    description: agent.description || 'No description',
+                    model: agent.model || 'Not specified',
+                    toolCount: toolsList.length,
+                    tools: agentTools,
+                    hasOtherAgents: agentTools.some((t: any) => t.type === 'agent'),
+                    structure: agent,
+                };
+                
+                return {
+                    success: true,
+                    filePath: path.relative(BASE_DIR, filePath),
+                    analysis,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to analyze agent: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    listMcpServers: {
+        description: 'List all available MCP servers from the configuration',
+        inputSchema: z.object({}),
+        execute: async (): Promise<{ success: boolean, servers: any[], count: number, message: string }> => {
+            try {
+                const configPath = path.join(BASE_DIR, 'config', 'mcp.json');
+                
+                // Check if config exists
+                try {
+                    await fs.access(configPath);
+                } catch {
+                    return {
+                        success: true,
+                        servers: [],
+						count: 0,
+                        message: 'No MCP servers configured yet',
+                    };
+                }
+                
+                const content = await fs.readFile(configPath, 'utf-8');
+                const config = JSON.parse(content);
+                
+                const servers = Object.keys(config.mcpServers || {}).map(name => {
+                    const server = config.mcpServers[name];
+                    return {
+                        name,
+                        type: 'command' in server ? 'stdio' : 'http',
+                        command: server.command,
+                        url: server.url,
+                    };
+                });
+                
+                return {
+                    success: true,
+                    servers,
+                    count: servers.length,
+                    message: `Found ${servers.length} MCP server(s)`,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+					servers: [],
+					count: 0,
+                    message: `Failed to list MCP servers: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    listMcpTools: {
+        description: 'List all available tools from a specific MCP server',
+        inputSchema: z.object({
+            serverName: z.string().describe('Name of the MCP server to query'),
+        }),
+        execute: async ({ serverName }: { serverName: string }) => {
+            try {
+                const configPath = path.join(BASE_DIR, 'config', 'mcp.json');
+                const content = await fs.readFile(configPath, 'utf-8');
+                const config = JSON.parse(content);
+                
+                const mcpConfig = config.mcpServers[serverName];
+                if (!mcpConfig) {
+                    return {
+                        success: false,
+                        message: `MCP server '${serverName}' not found in configuration`,
+                    };
+                }
+                
+                // Create transport based on config type
+                let transport;
+                if ('command' in mcpConfig) {
+                    transport = new StdioClientTransport({
+                        command: mcpConfig.command,
+                        args: mcpConfig.args || [],
+                        env: mcpConfig.env || {},
+                    });
+                } else {
+                    try {
+                        transport = new StreamableHTTPClientTransport(new URL(mcpConfig.url));
+                    } catch {
+                        transport = new SSEClientTransport(new URL(mcpConfig.url));
+                    }
+                }
+                
+                // Create and connect client
+                const client = new Client({
+                    name: 'rowboat-copilot',
+                    version: '1.0.0',
+                });
+                
+                await client.connect(transport);
+                
+                // List available tools
+                const toolsList = await client.listTools();
+                
+                // Close connection
+                client.close();
+                transport.close();
+                
+                const tools = toolsList.tools.map((t: any) => ({
+                    name: t.name,
+                    description: t.description || 'No description',
+                    inputSchema: t.inputSchema,
+                }));
+                
+                return {
+                    success: true,
+                    serverName,
+                    tools,
+                    count: tools.length,
+                    message: `Found ${tools.length} tool(s) in MCP server '${serverName}'`,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to list MCP tools: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                };
+            }
+        },
+    },
+    
+    executeCommand: {
+        description: 'Execute a shell command and return the output. Use this to run bash/shell commands.',
+        inputSchema: z.object({
+            command: z.string().describe('The shell command to execute (e.g., "ls -la", "cat file.txt")'),
+            cwd: z.string().optional().describe('Working directory to execute the command in (defaults to .rowboat directory)'),
+        }),
+        execute: async ({ command, cwd }: { command: string, cwd?: string }) => {
+            try {
+                const workingDir = cwd ? path.join(BASE_DIR, cwd) : BASE_DIR;
+                const result = await executeCommand(command, { cwd: workingDir });
+                
+                return {
+                    success: result.exitCode === 0,
+                    stdout: result.stdout,
+                    stderr: result.stderr,
+                    exitCode: result.exitCode,
+                    command,
+                    workingDir,
+                };
+            } catch (error) {
+                return {
+                    success: false,
+                    message: `Failed to execute command: ${error instanceof Error ? error.message : 'Unknown error'}`,
+                    command,
+                };
+            }
+        },
+    },
+};
--- a/apps/cli/src/application/lib/command-executor.ts
+++ b/apps/cli/src/application/lib/command-executor.ts
@ -0,0 +1,143 @@
+import { exec, execSync } from 'child_process';
+import { promisify } from 'util';
+import { getSecurityAllowList, SECURITY_CONFIG_PATH } from '../config/security.js';
+
+const execPromise = promisify(exec);
+const COMMAND_SPLIT_REGEX = /(?:\|\||&&|;|\||\n)/;
+const ENV_ASSIGNMENT_REGEX = /^[A-Za-z_][A-Za-z0-9_]*=.*/;
+const WRAPPER_COMMANDS = new Set(['sudo', 'env', 'time', 'command']);
+
+function sanitizeToken(token: string): string {
+  return token.trim().replace(/^['"]+|['"]+$/g, '');
+}
+
+function extractCommandNames(command: string): string[] {
+  const discovered = new Set<string>();
+  const segments = command.split(COMMAND_SPLIT_REGEX);
+
+  for (const segment of segments) {
+    const tokens = segment.trim().split(/\s+/).filter(Boolean);
+    if (!tokens.length) continue;
+
+    let index = 0;
+    while (index < tokens.length && ENV_ASSIGNMENT_REGEX.test(tokens[index])) {
+      index++;
+    }
+
+    if (index >= tokens.length) continue;
+
+    const primary = sanitizeToken(tokens[index]).toLowerCase();
+    if (!primary) continue;
+
+    discovered.add(primary);
+
+    if (WRAPPER_COMMANDS.has(primary) && index + 1 < tokens.length) {
+      const wrapped = sanitizeToken(tokens[index + 1]).toLowerCase();
+      if (wrapped) {
+        discovered.add(wrapped);
+      }
+    }
+  }
+
+  return Array.from(discovered);
+}
+
+function findBlockedCommands(command: string): string[] {
+  const invoked = extractCommandNames(command);
+  if (!invoked.length) return [];
+
+  const allowList = getSecurityAllowList();
+  if (!allowList.length) return invoked;
+
+  const allowSet = new Set(allowList);
+  if (allowSet.has('*')) return [];
+
+  return invoked.filter((cmd) => !allowSet.has(cmd));
+}
+
+// export const BlockedResult = {
+//   stdout: '',
+//   stderr: `Command blocked by security policy. Update ${SECURITY_CONFIG_PATH} to allow them before retrying.`,
+//   exitCode: 126,
+// };
+
+export function isBlocked(command: string): boolean {
+  const blocked = findBlockedCommands(command);
+  return blocked.length > 0;
+}
+
+export interface CommandResult {
+  stdout: string;
+  stderr: string;
+  exitCode: number;
+}
+
+/**
+ * Executes an arbitrary shell command
+ * @param command - The command to execute (e.g., "cat abc.txt | grep 'abc@gmail.com'")
+ * @param options - Optional execution options
+ * @returns Promise with stdout, stderr, and exit code
+ */
+export async function executeCommand(
+  command: string,
+  options?: {
+    cwd?: string;
+    timeout?: number; // timeout in milliseconds
+    maxBuffer?: number; // max buffer size in bytes
+  }
+): Promise<CommandResult> {
+  try {
+    const { stdout, stderr } = await execPromise(command, {
+      cwd: options?.cwd,
+      timeout: options?.timeout,
+      maxBuffer: options?.maxBuffer || 1024 * 1024, // default 1MB
+      shell: '/bin/sh', // use sh for cross-platform compatibility
+    });
+
+    return {
+      stdout: stdout.trim(),
+      stderr: stderr.trim(),
+      exitCode: 0,
+    };
+  } catch (error: any) {
+    // exec throws an error if the command fails or times out
+    return {
+      stdout: error.stdout?.trim() || '',
+      stderr: error.stderr?.trim() || error.message,
+      exitCode: error.code || 1,
+    };
+  }
+}
+
+/**
+ * Executes a command synchronously (blocking)
+ * Use with caution - prefer executeCommand for async execution
+ */
+export function executeCommandSync(
+  command: string,
+  options?: {
+    cwd?: string;
+    timeout?: number;
+  }
+): CommandResult {
+  try {
+    const stdout = execSync(command, {
+      cwd: options?.cwd,
+      timeout: options?.timeout,
+      encoding: 'utf-8',
+      shell: '/bin/sh',
+    });
+
+    return {
+      stdout: stdout.trim(),
+      stderr: '',
+      exitCode: 0,
+    };
+  } catch (error: any) {
+    return {
+      stdout: error.stdout?.toString().trim() || '',
+      stderr: error.stderr?.toString().trim() || error.message,
+      exitCode: error.status || 1,
+    };
+  }
+}
--- a/apps/cli/src/application/lib/exec-tool.ts
+++ b/apps/cli/src/application/lib/exec-tool.ts
@ -0,0 +1,65 @@
+import { ToolAttachment } from "../entities/agent.js";
+import { z } from "zod";
+import { McpServers } from "../config/config.js";
+import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
+import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";
+import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";
+import { Transport } from "@modelcontextprotocol/sdk/shared/transport.js";
+import { Client } from "@modelcontextprotocol/sdk/client";
+import { BuiltinTools } from "./builtin-tools.js";
+
+async function execMcpTool(agentTool: z.infer<typeof ToolAttachment> & { type: "mcp" }, input: any): Promise<any> {
+    // load mcp configuration from the tool
+    const mcpConfig = McpServers[agentTool.mcpServerName];
+    if (!mcpConfig) {
+        throw new Error(`MCP server ${agentTool.mcpServerName} not found`);
+    }
+
+    // create transport
+    let transport: Transport;
+    if ("command" in mcpConfig) {
+        transport = new StdioClientTransport({
+            command: mcpConfig.command,
+            args: mcpConfig.args,
+            env: mcpConfig.env,
+        });
+    } else {
+        // first try streamable http transport
+        try {
+            transport = new StreamableHTTPClientTransport(new URL(mcpConfig.url));
+        } catch (error) {
+            // if that fails, try sse transport
+            transport = new SSEClientTransport(new URL(mcpConfig.url));
+        }
+    }
+
+    if (!transport) {
+        throw new Error(`No transport found for ${agentTool.mcpServerName}`);
+    }
+
+    // create client
+    const client = new Client({
+        name: 'rowboatx',
+        version: '1.0.0',
+    });
+    await client.connect(transport);
+
+    // call tool
+    const result = await client.callTool({ name: agentTool.name, arguments: input });
+    client.close();
+    transport.close();
+    return result;
+}
+
+export async function execTool(agentTool: z.infer<typeof ToolAttachment>, input: any): Promise<any> {
+    switch (agentTool.type) {
+        case "mcp":
+            return execMcpTool(agentTool, input);
+        case "builtin":
+            const builtinTool = BuiltinTools[agentTool.name];
+            if (!builtinTool || !builtinTool.execute) {
+                throw new Error(`Unsupported builtin tool: ${agentTool.name}`);
+            }
+            return builtinTool.execute(input);
+    }
+}
--- a/apps/cli/src/application/lib/mcp.ts
+++ b/apps/cli/src/application/lib/mcp.ts
@ -0,0 +1,31 @@
+import { Client } from '@modelcontextprotocol/sdk/client/index.js';
+import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';
+import { SSEClientTransport } from '@modelcontextprotocol/sdk/client/sse.js';
+
+export async function getMcpClient(serverUrl: string, serverName: string): Promise<Client> {
+    let client: Client | undefined = undefined;
+    const baseUrl = new URL(serverUrl);
+
+    // Try to connect using Streamable HTTP transport
+    try {
+        client = new Client({
+            name: 'streamable-http-client',
+            version: '1.0.0'
+        });
+        const transport = new StreamableHTTPClientTransport(baseUrl);
+        await client.connect(transport);
+        console.log(`[MCP] Connected using Streamable HTTP transport to ${serverName}`);
+        return client;
+    } catch (error) {
+        // If that fails with a 4xx error, try the older SSE transport
+        console.log(`[MCP] Streamable HTTP connection failed, falling back to SSE transport for ${serverName}`);
+        client = new Client({
+            name: 'sse-client',
+            version: '1.0.0'
+        });
+        const sseTransport = new SSEClientTransport(baseUrl);
+        await client.connect(sseTransport);
+        console.log(`[MCP] Connected using SSE transport to ${serverName}`);
+        return client;
+    }
+}
--- a/apps/cli/src/application/lib/models.ts
+++ b/apps/cli/src/application/lib/models.ts
@ -0,0 +1,40 @@
+import { createOpenAI, OpenAIProvider } from "@ai-sdk/openai";
+import { createGoogleGenerativeAI, GoogleGenerativeAIProvider } from "@ai-sdk/google";
+import { AnthropicProvider, createAnthropic } from "@ai-sdk/anthropic";
+import { ModelConfig } from "../config/config.js";
+
+const providerMap: Record<string, OpenAIProvider | GoogleGenerativeAIProvider | AnthropicProvider> = {};
+
+export function getProvider(name: string = "") {
+    if (!name) {
+        name = ModelConfig.defaults.provider;
+    }
+    if (providerMap[name]) {
+        return providerMap[name];
+    }
+    const providerConfig = ModelConfig.providers[name];
+    if (!providerConfig) {
+        throw new Error(`Provider ${name} not found`);
+    }
+    switch (providerConfig.flavor) {
+        case "openai":
+            providerMap[name] = createOpenAI({
+                apiKey: providerConfig.apiKey,
+                baseURL: providerConfig.baseURL,
+            });
+            break;
+        case "anthropic":
+            providerMap[name] = createAnthropic({
+                apiKey: providerConfig.apiKey,
+                baseURL: providerConfig.baseURL,
+            });
+            break;
+        case "google":
+            providerMap[name] = createGoogleGenerativeAI({
+                apiKey: providerConfig.apiKey,
+                baseURL: providerConfig.baseURL,
+            });
+            break;
+    }
+    return providerMap[name];
+}
--- a/apps/cli/src/application/lib/random-id.ts
+++ b/apps/cli/src/application/lib/random-id.ts
@ -0,0 +1,7 @@
+import { customAlphabet } from 'nanoid';
+const alphabet = '0123456789abcdefghijklmnopqrstuvwxyz-';
+const nanoid = customAlphabet(alphabet, 7);
+
+export async function randomId(): Promise<string> {
+    return nanoid();
+}
--- a/apps/cli/src/application/lib/run-id-gen.ts
+++ b/apps/cli/src/application/lib/run-id-gen.ts
@ -0,0 +1,32 @@
+class RunIdGenerator {
+    private lastMs = 0;
+    private seq = 0;
+    private readonly pid: string;
+    private readonly hostTag: string;
+
+    constructor(hostTag: string = "") {
+        this.pid = String(process.pid).padStart(7, "0");
+        this.hostTag = hostTag ? `-${hostTag}` : "";
+    }
+
+    /**
+     * Returns an ISO8601-based, lexicographically sortable id string.
+     * Example: 2025-11-11T04-36-29Z-0001234-h1-000
+     */
+    next(): string {
+        const now = Date.now();
+        const ms = now >= this.lastMs ? now : this.lastMs; // monotonic clamp
+        this.seq = ms === this.lastMs ? this.seq + 1 : 0;
+        this.lastMs = ms;
+
+        // Build ISO string (UTC) and remove milliseconds for cleaner filenames
+        const iso = new Date(ms).toISOString() // e.g. 2025-11-11T04:36:29.123Z
+            .replace(/\.\d{3}Z$/, "Z")           // drop .123 part
+            .replace(/:/g, "-");                 // safe for files: 2025-11-11T04-36-29Z
+
+        const seqStr = String(this.seq).padStart(3, "0");
+        return `${iso}-${this.pid}${this.hostTag}-${seqStr}`;
+    }
+}
+
+export const runIdGenerator = new RunIdGenerator();
--- a/apps/cli/src/application/lib/step.ts
+++ b/apps/cli/src/application/lib/step.ts
@ -0,0 +1,13 @@
+import { MessageList } from "../entities/message.js";
+import { LlmStepStreamEvent } from "../entities/llm-step-events.js";
+import { z } from "zod";
+import { ToolAttachment } from "../entities/agent.js";
+
+export type StepInputT = z.infer<typeof MessageList>;
+export type StepOutputT = AsyncGenerator<z.infer<typeof LlmStepStreamEvent>, void, unknown>;
+
+export interface Step {
+    execute(input: StepInputT): StepOutputT;
+
+    tools(): Record<string, z.infer<typeof ToolAttachment>>;
+}
--- a/apps/cli/src/application/lib/stream-renderer.ts
+++ b/apps/cli/src/application/lib/stream-renderer.ts
@ -0,0 +1,297 @@
+import { z } from "zod";
+import { RunEvent } from "../entities/run-events.js";
+import { LlmStepStreamEvent } from "../entities/llm-step-events.js";
+
+export interface StreamRendererOptions {
+    showHeaders?: boolean;
+    dimReasoning?: boolean;
+    jsonIndent?: number;
+    truncateJsonAt?: number;
+}
+
+export class StreamRenderer {
+    private options: Required<StreamRendererOptions>;
+    private reasoningActive = false;
+    private textActive = false;
+    private firstText = true;
+
+    constructor(options?: StreamRendererOptions) {
+        this.options = {
+            showHeaders: true,
+            dimReasoning: true,
+            jsonIndent: 2,
+            truncateJsonAt: 500,
+            ...options,
+        };
+    }
+
+    render(event: z.infer<typeof RunEvent>) {
+        switch (event.type) {
+            case "start": {
+                this.onStart(event.agentName, event.runId);
+                break;
+            }
+            case "llm-stream-event": {
+                this.renderLlmEvent(event.event);
+                break;
+            }
+            case "message": {
+                // this.onStepMessage(event.stepId, event.message);
+                break;
+            }
+            case "tool-invocation": {
+                this.onStepToolInvocation(event.toolName, event.input);
+                break;
+            }
+            case "tool-result": {
+                this.onStepToolResult(event.toolName, event.result);
+                break;
+            }
+            case "error": {
+                this.onError(event.error);
+                break;
+            }
+        }
+    }
+
+    private renderLlmEvent(event: z.infer<typeof LlmStepStreamEvent>) {
+        switch (event.type) {
+            case "reasoning-start":
+                this.onReasoningStart();
+                break;
+            case "reasoning-delta":
+                this.onReasoningDelta(event.delta);
+                break;
+            case "reasoning-end":
+                this.onReasoningEnd();
+                break;
+            case "text-start":
+                this.onTextStart();
+                break;
+            case "text-delta":
+                this.onTextDelta(event.delta);
+                break;
+            case "text-end":
+                this.onTextEnd();
+                break;
+            case "tool-call":
+                this.onToolCall(event.toolCallId, event.toolName, event.input);
+                break;
+            case "usage":
+                this.onUsage(event.usage);
+                break;
+        }
+    }
+
+    private onStart(agentName: string, runId: string) {
+        this.write("\n");
+        this.write(this.bold(`▶ Agent ${agentName} (run ${runId})`));
+        this.write("\n");
+        this.write(this.dim(`╰─────────────────────────────────────────────────\n`));
+    }
+
+    private onEnd() {
+        this.write("\n");
+        this.write(this.dim("─".repeat(50)));
+        this.write("\n");
+        this.write(this.green(this.bold("✓ Complete")));
+        this.write("\n\n");
+    }
+
+    private onError(error: string) {
+        this.write("\n");
+        this.write(this.red(this.bold("✖ Error")));
+        this.write("\n");
+        this.write(this.red(this.indent(error)));
+        this.write("\n\n");
+    }
+
+    private onStepStart() {
+        this.write("\n");
+        this.write(this.dim("│ "));
+        this.write(this.dim("Step in progress..."));
+        this.write("\n");
+    }
+
+    private onStepEnd() {
+        // More subtle step end - just add a little spacing
+        this.write(this.dim("\n"));
+    }
+
+    private onStepMessage(stepIndex: number, message: any) {
+        const role = message?.role ?? "message";
+        const content = message?.content;
+        this.write(this.bold(`${role}: `));
+        if (typeof content === "string") {
+            this.write(content + "\n");
+        } else {
+            const pretty = this.truncate(JSON.stringify(message, null, this.options.jsonIndent));
+            this.write(this.dim("\n" + this.indent(pretty) + "\n"));
+        }
+    }
+
+    private onStepToolInvocation(toolName: string, input: string) {
+        this.write("\n");
+        this.write(this.cyan("┌─ ") + this.bold(this.cyan(`🔧 ${toolName}`)));
+        this.write("\n");
+        if (input && input.length) {
+            this.write(this.dim("│ ") + this.dim(this.indent(this.truncate(input)).replace(/\n/g, "\n│ ")));
+            this.write("\n");
+        }
+    }
+
+    private onStepToolResult(toolName: string, result: unknown) {
+        const res = this.truncate(JSON.stringify(result, null, this.options.jsonIndent));
+        this.write(this.dim("│\n"));
+        this.write(this.green("└─ ") + this.dim(this.green(`Result`)));
+        this.write("\n");
+        this.write(this.dim("  " + this.indent(res).replace(/\n/g, "\n  ")));
+        this.write("\n");
+    }
+
+    private onReasoningStart() {
+        if (this.reasoningActive) return;
+        this.reasoningActive = true;
+        if (this.options.showHeaders) {
+            this.write("\n");
+            this.write(this.dim("│ "));
+            this.write(this.dim(this.italic("thinking... ")));
+        }
+    }
+
+    private onReasoningDelta(delta: string) {
+        if (!this.reasoningActive) this.onReasoningStart();
+        this.write(this.options.dimReasoning ? this.dim(delta) : delta);
+    }
+
+    private onReasoningEnd() {
+        if (!this.reasoningActive) return;
+        this.reasoningActive = false;
+        this.write("\n");
+    }
+
+    private onTextStart() {
+        if (this.textActive) return;
+        this.textActive = true;
+        if (this.options.showHeaders && this.firstText) {
+            this.write("\n");
+            this.write(this.bold("╭─ ") + this.bold("Response"));
+            this.write("\n");
+            this.write(this.dim("│\n"));
+            this.firstText = false;
+        } else if (this.options.showHeaders) {
+            this.write("\n");
+            this.write(this.dim("│ "));
+        }
+    }
+
+    private onTextDelta(delta: string) {
+        // Add subtle left margin to assistant text for better readability
+        const formattedDelta = this.neutral(delta);
+        if (delta.includes("\n")) {
+            this.write(formattedDelta.replace(/\n/g, "\n  "));
+        } else {
+            this.write(formattedDelta);
+        }
+    }
+
+    private onTextEnd() {
+        if (!this.textActive) return;
+        this.textActive = false;
+        this.write("\n");
+    }
+
+    private onToolCall(toolCallId: string, toolName: string, input: unknown) {
+        const inputStr = this.truncate(JSON.stringify(input, null, this.options.jsonIndent));
+        this.write("\n");
+        this.write(this.magenta("┌─ ") + this.bold(this.magenta(`⚡ ${toolName}`)));
+        this.write(this.dim(` (${toolCallId.slice(0, 8)}...)`));
+        this.write("\n");
+        this.write(this.dim("│ ") + this.dim(this.indent(inputStr).replace(/\n/g, "\n│ ")));
+        this.write("\n");
+        this.write(this.dim("└─────────────\n"));
+    }
+
+    private onPauseForHumanInput(toolCallId: string, question: string) {
+        this.write(this.cyan(`\n→ Pause for human input (${toolCallId})`));
+        this.write("\n");
+        this.write(this.bold("Question: ") + question);
+        this.write("\n");
+    }
+
+    private onUsage(usage: {
+        inputTokens?: number;
+        outputTokens?: number;
+        totalTokens?: number;
+        reasoningTokens?: number;
+        cachedInputTokens?: number;
+    }) {
+        const parts: string[] = [];
+        if (usage.inputTokens !== undefined) parts.push(`${this.dim("in:")} ${usage.inputTokens}`);
+        if (usage.outputTokens !== undefined) parts.push(`${this.dim("out:")} ${usage.outputTokens}`);
+        if (usage.reasoningTokens !== undefined) parts.push(`${this.dim("reasoning:")} ${usage.reasoningTokens}`);
+        if (usage.cachedInputTokens !== undefined) parts.push(`${this.dim("cached:")} ${usage.cachedInputTokens}`);
+        if (usage.totalTokens !== undefined) parts.push(`${this.dim("total:")} ${this.bold(usage.totalTokens.toString())}`);
+        const line = parts.join(this.dim(" | "));
+        this.write("\n");
+        this.write(this.dim("╭─ Usage\n"));
+        this.write(this.dim("│ ") + line);
+        this.write("\n");
+        this.write(this.dim("╰─────────────\n"));
+    }
+
+    // Formatting helpers
+    private write(text: string) {
+        process.stdout.write(text);
+    }
+
+    private indent(text: string): string {
+        return text
+            .split("\n")
+            .map((line) => (line.length ? `  ${line}` : line))
+            .join("\n");
+    }
+
+    private truncate(text: string): string {
+        if (text.length <= this.options.truncateJsonAt) return text;
+        return text.slice(0, this.options.truncateJsonAt) + "…";
+    }
+
+    private bold(text: string): string {
+        return "\x1b[1m" + text + "\x1b[0m";
+    }
+
+    private dim(text: string): string {
+        return "\x1b[2m" + text + "\x1b[0m";
+    }
+
+    private italic(text: string): string {
+        return "\x1b[3m" + text + "\x1b[0m";
+    }
+
+    private cyan(text: string): string {
+        return "\x1b[36m" + text + "\x1b[0m";
+    }
+
+    private green(text: string): string {
+        return "\x1b[32m" + text + "\x1b[0m";
+    }
+
+    private red(text: string): string {
+        return "\x1b[31m" + text + "\x1b[0m";
+    }
+
+    private magenta(text: string): string {
+        return "\x1b[35m" + text + "\x1b[0m";
+    }
+
+    private yellow(text: string): string {
+        return "\x1b[33m" + text + "\x1b[0m";
+    }
+
+    private neutral(text: string): string {
+        return "\x1b[38;5;250m" + text + "\x1b[0m";
+    }
+}
+
+
--- a/apps/cli/todo.md
+++ b/apps/cli/todo.md
@ -0,0 +1,15 @@
+runtime
+---
+o stream out responses
+o terminal logging
+o file logging
+- accept initial user input from CLI
+- mcp tool calls (http + stdio)
+- human input support
+- bash tool support
+- cli wrapper (node commander)
+
+
+rowboat agent
+---
+- create agent
--- a/apps/cli/tsconfig.json
+++ b/apps/cli/tsconfig.json
@ -0,0 +1,20 @@
+{
+  // Visit https://aka.ms/tsconfig to read more about this file
+  "compilerOptions": {
+    "rootDir": "./src",
+    "outDir": "./dist",
+    "module": "nodenext",
+    "target": "esnext",
+    "lib": ["esnext"],
+    "types": ["node"],
+    "strict": true,
+    "esModuleInterop": true,
+    "skipLibCheck": true,
+    "sourceMap": true,
+    "paths": {
+      "@/*": [
+        "./src/*"
+      ]
+    }
+  }
+}