mirror of https://github.com/katanemo/plano.git
synced 2026-04-25 00:36:34 +02:00

Revert "Add OpenClaw + Plano intelligent routing demo"

This reverts commit 5090000fd2.

parent 5090000fd2 · commit 5d2279a796 · 5 changed files with 0 additions and 297 deletions
# OpenClaw + Plano: Smart Model Routing for Personal AI Assistants

OpenClaw is an open-source personal AI assistant that connects to WhatsApp, Telegram, Slack, and Discord. By pointing it at Plano instead of a single LLM provider, every message is automatically routed to the best model — conversational requests go to Kimi K2.5 (cost-effective), while code generation, testing, and complex reasoning go to Claude (most capable) — with zero application code changes.

## Architecture

```
[WhatsApp / Telegram / Slack / Discord]
                |
       [OpenClaw Gateway]
       ws://127.0.0.1:18789
                |
        [Plano :12000] ──────────────> Kimi K2.5 (conversation, agentic tasks)
                |                       $0.60/M input tokens
                |──────────────────────> Claude (code, tests, reasoning)
                |
        [Arch-Router 1.5B]
        (local via Ollama, ~200ms)
```

Plano's 1.5B [Arch-Router](https://arxiv.org/abs/2506.16655) model analyzes each prompt locally and selects the best backend based on the configured routing preferences.

## Prerequisites

- **Docker** running
- **Ollama** installed ([ollama.com](https://ollama.com))
- **Plano CLI**: `uv tool install planoai` or `pip install planoai`
- **OpenClaw**: `npm install -g openclaw@latest`
- **API keys**:
  - `MOONSHOT_API_KEY` — from [Moonshot AI](https://platform.moonshot.cn/)
  - `ANTHROPIC_API_KEY` — from [Anthropic](https://console.anthropic.com/)

## Quick Start

### 1. Set Environment Variables

```bash
export MOONSHOT_API_KEY="your-moonshot-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
```

### 2. Start the Demo

```bash
cd demos/llm_routing/openclaw_routing
bash run_demo.sh
```

This will:

- Pull the Arch-Router model into Ollama
- Start Jaeger for tracing
- Start Plano on port 12000

### 3. Configure OpenClaw

In `~/.openclaw/openclaw.json`, set:

```json
{
  "agent": {
    "model": "kimi-k2.5",
    "baseURL": "http://127.0.0.1:12000/v1"
  }
}
```
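
A quick way to sanity-check the JSON before restarting OpenClaw — a sketch that writes a scratch copy to `/tmp` (the real file lives at `~/.openclaw/openclaw.json`):

```shell
# Write a scratch copy of the snippet above and confirm it parses as valid JSON.
cat > /tmp/openclaw.json <<'EOF'
{
  "agent": {
    "model": "kimi-k2.5",
    "baseURL": "http://127.0.0.1:12000/v1"
  }
}
EOF
jq -r '.agent.baseURL' /tmp/openclaw.json
```

If `jq` prints the Plano URL rather than a parse error, the config is well-formed.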

Then run:

```bash
openclaw onboard --install-daemon
```

### 4. Test Routing

Run the test script to verify routing decisions:

```bash
bash test_routing.sh
```

## Demo Scenarios

| # | Message | Expected Route | Why |
|---|---------|----------------|-----|
| 1 | "Hey, what's up? Tell me something interesting." | **Kimi K2.5** | General conversation — cheap and fast |
| 2 | "Remind me tomorrow at 9am and ping Slack about the deploy" | **Kimi K2.5** | Agentic multi-step task orchestration |
| 3 | "Write a Python rate limiter with the token bucket algorithm" | **Claude** | Code generation — needs precision |
| 4 | "Write unit tests for the auth middleware, cover edge cases" | **Claude** | Testing & evaluation — needs thoroughness |
| 5 | "Compare WebSockets vs SSE vs polling for 10K concurrent users" | **Claude** | Complex reasoning — needs deep analysis |

OpenClaw's code doesn't change at all. It points at `http://127.0.0.1:12000/v1` instead of a direct provider URL, and Plano's Arch-Router analyzes each prompt in ~200ms and picks the right backend.

## Monitoring

### Routing Decisions

Watch Plano logs for model selection:

```bash
docker logs plano 2>&1 | grep MODEL_RESOLUTION
```

### Jaeger Tracing

Open [http://localhost:16686](http://localhost:16686) to see full traces of each request, including which model was selected and the routing latency.

## Cost Impact

For a personal assistant handling ~1000 requests/day with a 60/40 conversation-to-code split:

| Without Plano (all Claude) | With Plano (routed) |
|---|---|
| 1000 req x Claude pricing | 600 req x Kimi K2.5 + 400 req x Claude |
| ~$3.00/day input tokens | ~$0.36 + $1.20 = **$1.56/day** (~48% savings) |

Same quality where it matters (code, tests), lower cost where it doesn't (chat).
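
The arithmetic behind the table can be reproduced with a quick sketch. Note that the per-request size (~1,000 input tokens) and the Claude input rate ($3.00/M tokens) are assumptions used to back out the figures; only the Kimi K2.5 rate is stated above:

```shell
# Rough daily input-token cost: all-Claude vs. routed.
# Assumed: 1,000 requests/day, ~1,000 input tokens per request,
# Claude at $3.00/M input tokens (assumption), Kimi K2.5 at $0.60/M.
awk 'BEGIN {
  tokens_per_req = 1000
  claude_rate = 3.00 / 1e6   # $ per input token
  kimi_rate   = 0.60 / 1e6
  all_claude = 1000 * tokens_per_req * claude_rate
  routed     = 600 * tokens_per_req * kimi_rate + 400 * tokens_per_req * claude_rate
  printf "all-Claude: $%.2f/day\n", all_claude
  printf "routed:     $%.2f/day (~%.0f%% savings)\n", routed, 100 * (1 - routed / all_claude)
}'
```

Under those assumptions this prints $3.00/day versus $1.56/day, matching the table.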

## Stopping the Demo

```bash
bash run_demo.sh down
```

---

```yaml
version: v0.1.0

routing:
  model: Arch-Router
  llm_provider: arch-router

listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  # Arch Router - the 1.5B preference-aligned routing model (runs locally via Ollama)
  - name: arch-router
    model: arch/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
    base_url: http://host.docker.internal:11434

  # Kimi K2.5 — Moonshot AI's open model (1T MoE, 32B active params)
  # Great for general conversation, agentic tasks, and multimodal work
  # OpenAI-compatible API at $0.60/M input, $2.50/M output tokens
  - model: openai/kimi-k2.5
    access_key: $MOONSHOT_API_KEY
    base_url: https://api.moonshot.ai/v1
    provider_interface: openai
    default: true
    routing_preferences:
      - name: general conversation
        description: general chat, greetings, casual conversation, Q&A, and everyday questions
      - name: agentic tasks
        description: coordinating multi-step workflows, device automation, scheduling, and task orchestration across channels

  # Claude — Anthropic's most capable model
  # Best for complex reasoning, code, tool use, and evaluation
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: testing and evaluation
        description: writing tests, running evaluations, QA checks, verifying correctness, and debugging failures
      - name: code generation
        description: generating code, writing scripts, implementing functions, and building tool integrations
      - name: complex reasoning
        description: multi-step analysis, planning, architectural decisions, and deep problem-solving

tracing:
  random_sampling: 100
```

---

```yaml
services:
  jaeger:
    build:
      context: ../../shared/jaeger
    ports:
      - "16686:16686"
      - "4317:4317"
      - "4318:4318"
```

---

```bash
#!/bin/bash
set -e

echo "=== OpenClaw + Plano Routing Demo ==="

# Check prerequisites
command -v docker >/dev/null || { echo "Error: Docker not found"; exit 1; }
command -v ollama >/dev/null || { echo "Error: Ollama not found. Install from https://ollama.com"; exit 1; }

start_demo() {
  # Check/create .env file
  if [ -f ".env" ]; then
    echo ".env file already exists"
  else
    if [ -z "${MOONSHOT_API_KEY:-}" ]; then
      echo "Error: MOONSHOT_API_KEY not set"
      exit 1
    fi
    if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
      echo "Error: ANTHROPIC_API_KEY not set"
      exit 1
    fi
    echo "Creating .env file..."
    echo "MOONSHOT_API_KEY=$MOONSHOT_API_KEY" > .env
    echo "ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY" >> .env
  fi

  # Pull Arch-Router model if needed
  echo "Pulling Arch-Router model..."
  ollama pull hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M

  # Start Jaeger for tracing
  echo "Starting Jaeger..."
  docker compose up -d

  echo ""
  echo "=== Plano will run on http://localhost:12000 ==="
  echo "=== Jaeger UI at http://localhost:16686 ==="
  echo ""
  echo "Configure OpenClaw to use Plano as its LLM endpoint:"
  echo '  In ~/.openclaw/openclaw.json, set:'
  echo '  { "agent": { "model": "kimi-k2.5", "baseURL": "http://127.0.0.1:12000/v1" } }'
  echo ""
  echo "Then run: openclaw onboard --install-daemon"

  # Start Plano gateway (blocks in the foreground; Ctrl+C or `run_demo.sh down` to stop)
  echo "Starting Plano..."
  planoai up --service plano --foreground
}

stop_demo() {
  docker compose down
  planoai down
}

if [ "${1:-}" == "down" ]; then
  stop_demo
else
  start_demo
fi
```

---

```bash
#!/usr/bin/env bash
set -euo pipefail

PLANO_URL="http://localhost:12000/v1/chat/completions"

echo "=== Testing Plano Routing Decisions ==="
echo ""

# Scenario 1: General conversation -> should route to Kimi K2.5
echo "--- Scenario 1: General Conversation (expect: Kimi K2.5) ---"
curl -s "$PLANO_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Hey! What is the weather like today? Can you tell me a fun fact?"}]
  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
echo ""

# Scenario 2: Agentic task -> should route to Kimi K2.5
echo "--- Scenario 2: Agentic Task (expect: Kimi K2.5) ---"
curl -s "$PLANO_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Schedule a reminder for tomorrow at 9am to review the pull request, then send a message to the team Slack channel about the deployment."}]
  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
echo ""

# Scenario 3: Code generation -> should route to Claude
echo "--- Scenario 3: Code Generation (expect: Claude) ---"
curl -s "$PLANO_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Write a Python function that implements a rate limiter using the token bucket algorithm with async support."}]
  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
echo ""

# Scenario 4: Testing/evaluation -> should route to Claude
echo "--- Scenario 4: Testing & Evaluation (expect: Claude) ---"
curl -s "$PLANO_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Write unit tests for this authentication middleware. Test edge cases: expired tokens, malformed headers, missing credentials, and concurrent requests."}]
  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
echo ""

# Scenario 5: Complex reasoning -> should route to Claude
echo "--- Scenario 5: Complex Reasoning (expect: Claude) ---"
curl -s "$PLANO_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Analyze the trade-offs between using WebSockets vs SSE vs long-polling for real-time notifications in a distributed messaging system with 10K concurrent users."}]
  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
echo ""

echo "=== Check Plano logs for MODEL_RESOLUTION details ==="
echo "Run: docker logs plano 2>&1 | grep MODEL_RESOLUTION"
```