Add OpenClaw + Plano intelligent routing demo (#761)

* Add OpenClaw + Plano intelligent routing demo

Demonstrates preference-based routing for personal AI assistants:
Kimi K2.5 handles conversation and agentic tasks, Claude handles
code generation, testing, and complex reasoning — with zero
application code changes and ~48% cost savings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove redundant provider_interface from Kimi K2.5 config

The openai/ prefix in the model name already sets the provider
interface. Setting provider_interface explicitly conflicts with it
and fails config validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Simplify config to v0.3.0 format, remove explicit Arch-Router entry

Arch-Router is implicit when routing_preferences are defined.
Aligns with the preference_based_routing demo pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Clean up Ollama/Arch-Router references, make Jaeger optional

Router is handled internally by Plano — no need for Ollama or
explicit Arch-Router setup. Jaeger is kept as an optional step
in the README for developers who want tracing visibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove run_demo.sh, use planoai CLI directly

The planoai CLI already handles startup. README now uses
planoai up/down directly instead of a wrapper script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove docker-compose.yaml, use inline docker run for Jaeger

No need for a compose file when Jaeger is the only optional
service. A single docker run command in the README is simpler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Clarify testing: OpenClaw channels vs direct Plano requests

Primary testing is through messaging channels (Telegram, Slack,
etc.) with log monitoring. The test_routing.sh script is now
documented as an optional direct verification tool.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add OpenClaw onboarding instructions to README

Includes install, onboarding wizard, channel setup, doctor
check, and how to point the gateway at Plano.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Use OpenClaw onboarding wizard for Plano provider setup

Replace manual JSON config with instructions to use the
openclaw onboard wizard to set up a custom OpenAI-compatible
provider pointing at Plano.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix README and remove unnecessary testing.sh file

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
This commit is contained in:
Adil Hafeez 2026-02-17 06:34:54 -08:00 committed by GitHub
parent bfbf838b19
commit c69fbd8a4d
3 changed files with 168 additions and 0 deletions


@@ -0,0 +1,135 @@
# OpenClaw + Plano: Smart Model Routing for Personal AI Assistants
<p align="center">
<img src="openclaw_plano.png" alt="OpenClaw + Plano" width="50%">
</p>
OpenClaw is an open-source personal AI assistant that connects to WhatsApp, Telegram, Slack, and Discord. By pointing it at Plano instead of a single LLM provider, every message is automatically routed to the best model — conversational requests go to Kimi K2.5 (cost-effective), while code generation, testing, and complex reasoning go to Claude (most capable) — with zero application code changes.
## Architecture
```
[WhatsApp / Telegram / Slack / Discord]
                |
       [OpenClaw Gateway]
      ws://127.0.0.1:18789
                |
[Plano :12000] ──────────────> Kimi K2.5 (conversation, agentic tasks)
      |                        $0.60/M input tokens
      |──────────────────────> Claude (code, tests, reasoning)
```
Plano uses a [preference-aligned router](https://arxiv.org/abs/2506.16655) to analyze each prompt and select the best backend based on configured routing preferences.
## Prerequisites
- **Docker** running
- **Plano CLI**: `uv tool install planoai` or `pip install planoai`
- **OpenClaw**: `npm install -g openclaw@latest`
- **API keys**:
- `MOONSHOT_API_KEY` — from [Moonshot AI](https://www.moonshot.ai/)
- `ANTHROPIC_API_KEY` — from [Anthropic](https://console.anthropic.com/)
## Quick Start
### 1. Set Environment Variables
```bash
export MOONSHOT_API_KEY="your-moonshot-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
```
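Before starting Plano, you can sanity-check that both keys are present. A small bash sketch (it reports missing variable names only, never their values):

```bash
#!/usr/bin/env bash
# Warn about any required API key that is missing from the environment.
# Uses bash indirect expansion (${!v}), so run this with bash, not sh.
for v in MOONSHOT_API_KEY ANTHROPIC_API_KEY; do
  if [ -z "${!v:-}" ]; then
    echo "missing: $v"
  fi
done
```

If either key is reported missing, export it before running `planoai up`.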
### 2. Start Plano
```bash
cd demos/llm_routing/openclaw_routing
planoai up --service plano --foreground
```
### 3. Set Up OpenClaw
Install OpenClaw (requires Node >= 22):
```bash
npm install -g openclaw@latest
```
Install the gateway daemon and connect your messaging channels:
```bash
openclaw onboard --install-daemon
```
This installs the gateway as a background service (launchd on macOS, systemd on Linux). To connect messaging channels like WhatsApp or Telegram, see the [OpenClaw channel setup docs](https://docs.openclaw.ai/gateway/configuration).
Run `openclaw doctor` to verify everything is working.
### 4. Point OpenClaw at Plano
During the OpenClaw onboarding wizard, when prompted to choose an LLM provider:
1. Select **Custom OpenAI-compatible** as the provider
2. Set the base URL to `http://127.0.0.1:12000/v1`
3. Enter any value for the API key (e.g. `none`) — Plano handles auth to the actual providers
4. Set the context window to at least `128000` tokens
This registers Plano as OpenClaw's LLM backend. All requests go through Plano on port 12000, which routes them to Kimi K2.5 or Claude based on the prompt content.
If you've already onboarded, re-run the wizard to update the provider:
```bash
openclaw onboard --install-daemon
```
### 5. Test Routing Through OpenClaw
Send messages through any connected channel (WhatsApp, Telegram, Slack, etc.) and watch routing decisions in a separate terminal:
```bash
planoai logs --service plano | grep MODEL_RESOLUTION
```
Try these messages to see routing in action:
| # | Message (via your messaging channel) | Expected Route | Why |
|---|---------|---------------|-----|
| 1 | "Hey, what's up? Tell me something interesting." | **Kimi K2.5** | General conversation — cheap and fast |
| 2 | "Remind me tomorrow at 9am and ping Slack about the deploy" | **Kimi K2.5** | Agentic multi-step task orchestration |
| 3 | "Write a Python rate limiter with the token bucket algorithm" | **Claude** | Code generation — needs precision |
| 4 | "Write unit tests for the auth middleware, cover edge cases" | **Claude** | Testing & evaluation — needs thoroughness |
| 5 | "Compare WebSockets vs SSE vs polling for 10K concurrent users" | **Claude** | Complex reasoning — needs deep analysis |
OpenClaw's code doesn't change at all. It points at `http://127.0.0.1:12000/v1` instead of a direct provider URL. Plano's router analyzes each prompt and picks the right backend.
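If you want to verify routing without going through a messaging channel, you can send a request straight to Plano's OpenAI-compatible endpoint. A sketch (assumes Plano is up on `127.0.0.1:12000`; depending on your Plano version the payload may also need a `model` field):

```bash
#!/usr/bin/env bash
# Build a chat-completions payload and send it directly to Plano.
cat > /tmp/route_test.json <<'EOF'
{
  "messages": [
    {"role": "user",
     "content": "Write a Python rate limiter with the token bucket algorithm"}
  ]
}
EOF

# This prompt should be routed to Claude (code generation).
# `|| true` keeps the script going if Plano is not running yet.
curl -s http://127.0.0.1:12000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @/tmp/route_test.json || true
```

Confirm which backend handled the request with `planoai logs --service plano | grep MODEL_RESOLUTION`.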
## Tracing
For fast dev/test cycles, Plano provides built-in tracing to visualize routing decisions and LLM interactions. Start the trace listener in a separate terminal:
```bash
planoai trace
```
Then send requests through OpenClaw. You'll see detailed traces showing:
- Which model was selected and why
- Token usage and latency for each request
- Complete request/response payloads
Learn more about tracing features and configuration in the [Plano tracing guide](https://docs.planoai.dev/guides/observability/tracing.html#tracing-with-the-cli).
## Cost Impact
For a personal assistant handling ~1000 requests/day with a 60/40 conversation-to-code split:
| Without Plano (all Claude) | With Plano (routed) |
|---|---|
| 1000 req x Claude pricing | 600 req x Kimi K2.5 + 400 req x Claude |
| ~$3.00/day input tokens | ~$0.36 + $1.20 = **$1.56/day** (~48% savings) |
Same quality where it matters (code, tests), lower cost where it doesn't (chat).
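The table's arithmetic can be reproduced in a few lines. This is an input-token-only back-of-envelope, assuming roughly 1K input tokens per request (so about $0.003/request for Claude at $3/M and $0.0006/request for Kimi K2.5 at $0.60/M):

```bash
#!/usr/bin/env bash
# Daily input-token cost: 1000 requests, 60/40 conversation-to-code split.
all_claude=$(awk 'BEGIN { printf "%.2f", 1000 * 0.003 }')
routed=$(awk 'BEGIN { printf "%.2f", 600 * 0.0006 + 400 * 0.003 }')
savings=$(awk -v a="$all_claude" -v r="$routed" \
  'BEGIN { printf "%.0f", (a - r) / a * 100 }')
echo "all-Claude: \$${all_claude}/day  routed: \$${routed}/day  savings: ${savings}%"
# → all-Claude: $3.00/day  routed: $1.56/day  savings: 48%
```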
## Stopping the Demo
```bash
planoai down
```


@@ -0,0 +1,33 @@
version: v0.3.0

listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  # Kimi K2.5 — Moonshot AI's open model (1T MoE, 32B active params)
  # Great for general conversation, agentic tasks, and multimodal work
  # OpenAI-compatible API at $0.60/M input, $2.50/M output tokens
  - model: openai/kimi-k2.5
    access_key: $MOONSHOT_API_KEY
    base_url: https://api.moonshot.ai/v1
    default: true
    routing_preferences:
      - name: general conversation
        description: general chat, greetings, casual conversation, Q&A, and everyday questions

  # Claude — Anthropic's most capable model
  # Best for complex reasoning, code, tool use, and evaluation
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: code generation
        description: generating code, writing scripts, implementing functions, and building tool integrations
      - name: testing and evaluation
        description: writing unit tests, covering edge cases, and evaluating code correctness
      - name: complex reasoning
        description: deep technical analysis, design comparisons, and multi-step problem solving

Binary file not shown.
