mirror of https://github.com/katanemo/plano.git synced 2026-06-26 15:39:40 +02:00

Adil Hafeez 352d60b970 Remove docker-compose.yaml, use inline docker run for Jaeger No need for a compose file when Jaeger is the only optional service. A single docker run command in the README is simpler. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>		2026-02-17 03:30:08 -08:00
..
config.yaml	Simplify config to v0.3.0 format, remove explicit Arch-Router entry	2026-02-17 03:27:05 -08:00
README.md	Remove docker-compose.yaml, use inline docker run for Jaeger	2026-02-17 03:30:08 -08:00
test_routing.sh	Add OpenClaw + Plano intelligent routing demo	2026-02-17 03:16:18 -08:00

README.md

OpenClaw + Plano: Smart Model Routing for Personal AI Assistants

OpenClaw is an open-source personal AI assistant that connects to WhatsApp, Telegram, Slack, and Discord. By pointing it at Plano instead of a single LLM provider, every message is automatically routed to the best model — conversational requests go to Kimi K2.5 (cost-effective), while code generation, testing, and complex reasoning go to Claude (most capable) — with zero application code changes.

Architecture

[WhatsApp / Telegram / Slack / Discord]
                |
        [OpenClaw Gateway]
         ws://127.0.0.1:18789
                |
        [Plano :12000]  ──────────────>  Kimi K2.5  (conversation, agentic tasks)
                |                           $0.60/M input tokens
                |──────────────────────>  Claude     (code, tests, reasoning)

Plano uses a preference-aligned router to analyze each prompt and select the best backend based on configured routing preferences.

Prerequisites

Docker running
Plano CLI: uv tool install planoai or pip install planoai
OpenClaw: npm install -g openclaw@latest
API keys:
- MOONSHOT_API_KEY — from Moonshot AI
- ANTHROPIC_API_KEY — from Anthropic

Quick Start

1. Set Environment Variables

export MOONSHOT_API_KEY="your-moonshot-key"
export ANTHROPIC_API_KEY="your-anthropic-key"

2. Start Plano

cd demos/llm_routing/openclaw_routing
planoai up --service plano --foreground

3. Configure OpenClaw

In ~/.openclaw/openclaw.json, set:

{
  "agent": {
    "model": "kimi-k2.5",
    "baseURL": "http://127.0.0.1:12000/v1"
  }
}

Then run:

openclaw onboard --install-daemon

4. Test Routing

Run the test script to verify routing decisions:

bash test_routing.sh

Demo Scenarios

#	Message	Expected Route	Why
1	"Hey, what's up? Tell me something interesting."	Kimi K2.5	General conversation — cheap and fast
2	"Remind me tomorrow at 9am and ping Slack about the deploy"	Kimi K2.5	Agentic multi-step task orchestration
3	"Write a Python rate limiter with the token bucket algorithm"	Claude	Code generation — needs precision
4	"Write unit tests for the auth middleware, cover edge cases"	Claude	Testing & evaluation — needs thoroughness
5	"Compare WebSockets vs SSE vs polling for 10K concurrent users"	Claude	Complex reasoning — needs deep analysis

OpenClaw's code doesn't change at all. It points at http://127.0.0.1:12000/v1 instead of a direct provider URL. Plano's router analyzes each prompt and picks the right backend.

Monitoring

Routing Decisions

Watch Plano logs for model selection:

docker logs plano 2>&1 | grep MODEL_RESOLUTION

Jaeger Tracing (Optional)

To visualize full request traces and routing decisions, start Jaeger:

docker run -d --name jaeger -p 16686:16686 -p 4317:4317 -p 4318:4318 \
  -e COLLECTOR_OTLP_ENABLED=true jaegertracing/all-in-one:latest

Then open http://localhost:16686 to see traces for each request, including which model was selected and the routing latency.

Cost Impact

For a personal assistant handling ~1000 requests/day with a 60/40 conversation-to-code split:

Without Plano (all Claude)	With Plano (routed)
1000 req x Claude pricing	600 req x Kimi K2.5 + 400 req x Claude
~$3.00/day input tokens	~$0.36 + $1.20 = $1.56/day (~48% savings)

Same quality where it matters (code, tests), lower cost where it doesn't (chat).

Stopping the Demo

planoai down