Add OpenClaw + Plano intelligent routing demo (#761)

* Add OpenClaw + Plano intelligent routing demo

Demonstrates preference-based routing for personal AI assistants:
Kimi K2.5 handles conversation and agentic tasks, Claude handles
code generation, testing, and complex reasoning — with zero
application code changes and ~48% cost savings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove redundant provider_interface from Kimi K2.5 config

The openai/ prefix in the model name already sets the provider
interface. Setting provider_interface explicitly conflicts with it
and fails config validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Simplify config to v0.3.0 format, remove explicit Arch-Router entry

Arch-Router is implicit when routing_preferences are defined.
Aligns with the preference_based_routing demo pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Clean up Ollama/Arch-Router references, make Jaeger optional

Router is handled internally by Plano — no need for Ollama or
explicit Arch-Router setup. Jaeger is kept as an optional step
in the README for developers who want tracing visibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove run_demo.sh, use planoai CLI directly

The planoai CLI already handles startup. README now uses
planoai up/down directly instead of a wrapper script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove docker-compose.yaml, use inline docker run for Jaeger

No need for a compose file when Jaeger is the only optional
service. A single docker run command in the README is simpler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Clarify testing: OpenClaw channels vs direct Plano requests

Primary testing is through messaging channels (Telegram, Slack,
etc.) with log monitoring. The test_routing.sh script is now
documented as an optional direct verification tool.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add OpenClaw onboarding instructions to README

Includes install, onboarding wizard, channel setup, doctor
check, and how to point the gateway at Plano.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Use OpenClaw onboarding wizard for Plano provider setup

Replace manual JSON config with instructions to use the
openclaw onboard wizard to set up a custom OpenAI-compatible
provider pointing at Plano.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix README and remove unnecessary testing.sh file

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
This commit is contained in:
Adil Hafeez 2026-02-17 06:34:54 -08:00 committed by GitHub
parent bfbf838b19
commit c69fbd8a4d
3 changed files with 168 additions and 0 deletions


@@ -0,0 +1,135 @@
# OpenClaw + Plano: Smart Model Routing for Personal AI Assistants
<p align="center">
<img src="openclaw_plano.png" alt="OpenClaw + Plano" width="50%">
</p>
OpenClaw is an open-source personal AI assistant that connects to WhatsApp, Telegram, Slack, and Discord. By pointing it at Plano instead of a single LLM provider, every message is automatically routed to the best model — conversational requests go to Kimi K2.5 (cost-effective), while code generation, testing, and complex reasoning go to Claude (most capable) — with zero application code changes.
## Architecture
```
[WhatsApp / Telegram / Slack / Discord]
                |
       [OpenClaw Gateway]
      ws://127.0.0.1:18789
                |
[Plano :12000] ──────────────> Kimi K2.5 (conversation, agentic tasks)
      |                        $0.60/M input tokens
      |──────────────────────> Claude (code, tests, reasoning)
```
Plano uses a [preference-aligned router](https://arxiv.org/abs/2506.16655) to analyze each prompt and select the best backend based on configured routing preferences.
## Prerequisites
- **Docker** running
- **Plano CLI**: `uv tool install planoai` or `pip install planoai`
- **OpenClaw**: `npm install -g openclaw@latest`
- **API keys**:
- `MOONSHOT_API_KEY` — from [Moonshot AI](https://www.moonshot.ai/)
- `ANTHROPIC_API_KEY` — from [Anthropic](https://console.anthropic.com/)
## Quick Start
### 1. Set Environment Variables
```bash
export MOONSHOT_API_KEY="your-moonshot-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
```
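Before starting Plano, you can sanity-check that both keys are present. A small bash sketch (it reports missing variable names only, never their values):

```bash
#!/usr/bin/env bash
# Warn about any required API key that is missing from the environment.
# Uses bash indirect expansion (${!v}), so run this with bash, not sh.
for v in MOONSHOT_API_KEY ANTHROPIC_API_KEY; do
  if [ -z "${!v:-}" ]; then
    echo "missing: $v"
  fi
done
```

If either key is reported missing, export it before running `planoai up`.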
### 2. Start Plano
```bash
cd demos/llm_routing/openclaw_routing
planoai up --service plano --foreground
```
### 3. Set Up OpenClaw
Install OpenClaw (requires Node >= 22):
```bash
npm install -g openclaw@latest
```
Install the gateway daemon and connect your messaging channels:
```bash
openclaw onboard --install-daemon
```
This installs the gateway as a background service (launchd on macOS, systemd on Linux). To connect messaging channels like WhatsApp or Telegram, see the [OpenClaw channel setup docs](https://docs.openclaw.ai/gateway/configuration).
Run `openclaw doctor` to verify everything is working.
### 4. Point OpenClaw at Plano
During the OpenClaw onboarding wizard, when prompted to choose an LLM provider:
1. Select **Custom OpenAI-compatible** as the provider
2. Set the base URL to `http://127.0.0.1:12000/v1`
3. Enter any value for the API key (e.g. `none`) — Plano handles auth to the actual providers
4. Set the context window to at least `128000` tokens
This registers Plano as OpenClaw's LLM backend. All requests go through Plano on port 12000, which routes them to Kimi K2.5 or Claude based on the prompt content.
If you've already onboarded, re-run the wizard to update the provider:
```bash
openclaw onboard --install-daemon
```
### 5. Test Routing Through OpenClaw
Send messages through any connected channel (WhatsApp, Telegram, Slack, etc.) and watch routing decisions in a separate terminal:
```bash
planoai logs --service plano | grep MODEL_RESOLUTION
```
Try these messages to see routing in action:
| # | Message (via your messaging channel) | Expected Route | Why |
|---|---------|---------------|-----|
| 1 | "Hey, what's up? Tell me something interesting." | **Kimi K2.5** | General conversation — cheap and fast |
| 2 | "Remind me tomorrow at 9am and ping Slack about the deploy" | **Kimi K2.5** | Agentic multi-step task orchestration |
| 3 | "Write a Python rate limiter with the token bucket algorithm" | **Claude** | Code generation — needs precision |
| 4 | "Write unit tests for the auth middleware, cover edge cases" | **Claude** | Testing & evaluation — needs thoroughness |
| 5 | "Compare WebSockets vs SSE vs polling for 10K concurrent users" | **Claude** | Complex reasoning — needs deep analysis |
OpenClaw's code doesn't change at all. It points at `http://127.0.0.1:12000/v1` instead of a direct provider URL. Plano's router analyzes each prompt and picks the right backend.
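If you want to verify routing without going through a messaging channel, you can send a request straight to Plano's OpenAI-compatible endpoint. A sketch (assumes Plano is up on `127.0.0.1:12000`; depending on your Plano version the payload may also need a `model` field):

```bash
#!/usr/bin/env bash
# Build a chat-completions payload and send it directly to Plano.
cat > /tmp/route_test.json <<'EOF'
{
  "messages": [
    {"role": "user",
     "content": "Write a Python rate limiter with the token bucket algorithm"}
  ]
}
EOF

# This prompt should be routed to Claude (code generation).
# `|| true` keeps the script going if Plano is not running yet.
curl -s http://127.0.0.1:12000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @/tmp/route_test.json || true
```

Confirm which backend handled the request with `planoai logs --service plano | grep MODEL_RESOLUTION`.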
## Tracing
For fast dev/test cycles, Plano provides built-in tracing to visualize routing decisions and LLM interactions. Start the trace listener in a separate terminal:
```bash
planoai trace
```
Then send requests through OpenClaw. You'll see detailed traces showing:
- Which model was selected and why
- Token usage and latency for each request
- Complete request/response payloads
Learn more about tracing features and configuration in the [Plano tracing guide](https://docs.planoai.dev/guides/observability/tracing.html#tracing-with-the-cli).
## Cost Impact
For a personal assistant handling ~1000 requests/day with a 60/40 conversation-to-code split:
| Without Plano (all Claude) | With Plano (routed) |
|---|---|
| 1000 req x Claude pricing | 600 req x Kimi K2.5 + 400 req x Claude |
| ~$3.00/day input tokens | ~$0.36 + $1.20 = **$1.56/day** (~48% savings) |
Same quality where it matters (code, tests), lower cost where it doesn't (chat).
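The table's arithmetic can be reproduced in a few lines. This is an input-token-only back-of-envelope, assuming roughly 1K input tokens per request (so about $0.003/request for Claude at $3/M and $0.0006/request for Kimi K2.5 at $0.60/M):

```bash
#!/usr/bin/env bash
# Daily input-token cost: 1000 requests, 60/40 conversation-to-code split.
all_claude=$(awk 'BEGIN { printf "%.2f", 1000 * 0.003 }')
routed=$(awk 'BEGIN { printf "%.2f", 600 * 0.0006 + 400 * 0.003 }')
savings=$(awk -v a="$all_claude" -v r="$routed" \
  'BEGIN { printf "%.0f", (a - r) / a * 100 }')
echo "all-Claude: \$${all_claude}/day  routed: \$${routed}/day  savings: ${savings}%"
# → all-Claude: $3.00/day  routed: $1.56/day  savings: 48%
```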
## Stopping the Demo
```bash
planoai down
```


@@ -0,0 +1,33 @@
version: v0.3.0

listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  # Kimi K2.5 — Moonshot AI's open model (1T MoE, 32B active params)
  # Great for general conversation, agentic tasks, and multimodal work
  # OpenAI-compatible API at $0.60/M input, $2.50/M output tokens
  - model: openai/kimi-k2.5
    access_key: $MOONSHOT_API_KEY
    base_url: https://api.moonshot.ai/v1
    default: true
    routing_preferences:
      - name: general conversation
        description: general chat, greetings, casual conversation, Q&A, and everyday questions

  # Claude — Anthropic's most capable model
  # Best for complex reasoning, code, tool use, and evaluation
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: code generation
        description: generating code, writing scripts, implementing functions, and building tool integrations
      - name: testing and evaluation
        description: writing unit tests, covering edge cases, and evaluating code correctness
      - name: complex reasoning
        description: deep technical analysis, design comparisons, and multi-step problem solving

Binary file not shown.
