Revert "Add OpenClaw + Plano intelligent routing demo"

This reverts commit 5090000fd2.
2026-06-26 15:39:40 +02:00 · 2026-02-17 03:15:42 -08:00 · 2026-02-17 03:15:42 -08:00 · 5d2279a796
commit 5d2279a796
parent 5090000fd2
5 changed files with 0 additions and 297 deletions
--- a/demos/llm_routing/openclaw_routing/README.md
+++ b/demos/llm_routing/openclaw_routing/README.md
@@ -1,122 +0,0 @@
-# OpenClaw + Plano: Smart Model Routing for Personal AI Assistants
-
-OpenClaw is an open-source personal AI assistant that connects to WhatsApp, Telegram, Slack, and Discord. By pointing it at Plano instead of a single LLM provider, every message is automatically routed to the best model — conversational requests go to Kimi K2.5 (cost-effective), while code generation, testing, and complex reasoning go to Claude (most capable) — with zero application code changes.
-
-## Architecture
-
-```
-[WhatsApp / Telegram / Slack / Discord]
-                |
-        [OpenClaw Gateway]
-         ws://127.0.0.1:18789
-                |
-        [Plano :12000]  ──────────────>  Kimi K2.5  (conversation, agentic tasks)
-                |                           $0.60/M input tokens
-                |──────────────────────>  Claude     (code, tests, reasoning)
-                |
-        [Arch-Router 1.5B]
-        (local via Ollama, ~200ms)
-```
-
-Plano's 1.5B [Arch-Router](https://arxiv.org/abs/2506.16655) model analyzes each prompt locally and selects the best backend based on configured routing preferences.
-
-## Prerequisites
-
-- **Docker** running
-- **Ollama** installed ([ollama.com](https://ollama.com))
-- **Plano CLI**: `uv tool install planoai` or `pip install planoai`
-- **OpenClaw**: `npm install -g openclaw@latest`
-- **API keys**:
-  - `MOONSHOT_API_KEY` — from [Moonshot AI](https://platform.moonshot.cn/)
-  - `ANTHROPIC_API_KEY` — from [Anthropic](https://console.anthropic.com/)
-
-## Quick Start
-
-### 1. Set Environment Variables
-
-```bash
-export MOONSHOT_API_KEY="your-moonshot-key"
-export ANTHROPIC_API_KEY="your-anthropic-key"
-```
-
-### 2. Start the Demo
-
-```bash
-cd demos/llm_routing/openclaw_routing
-bash run_demo.sh
-```
-
-This will:
-- Pull the Arch-Router model into Ollama
-- Start Jaeger for tracing
-- Start Plano on port 12000
-
-### 3. Configure OpenClaw
-
-In `~/.openclaw/openclaw.json`, set:
-
-```json
-{
-  "agent": {
-    "model": "kimi-k2.5",
-    "baseURL": "http://127.0.0.1:12000/v1"
-  }
-}
-```
-
-Then run:
-
-```bash
-openclaw onboard --install-daemon
-```
-
-### 4. Test Routing
-
-Run the test script to verify routing decisions:
-
-```bash
-bash test_routing.sh
-```
-
-## Demo Scenarios
-
-| # | Message | Expected Route | Why |
-|---|---------|---------------|-----|
-| 1 | "Hey, what's up? Tell me something interesting." | **Kimi K2.5** | General conversation — cheap and fast |
-| 2 | "Remind me tomorrow at 9am and ping Slack about the deploy" | **Kimi K2.5** | Agentic multi-step task orchestration |
-| 3 | "Write a Python rate limiter with the token bucket algorithm" | **Claude** | Code generation — needs precision |
-| 4 | "Write unit tests for the auth middleware, cover edge cases" | **Claude** | Testing & evaluation — needs thoroughness |
-| 5 | "Compare WebSockets vs SSE vs polling for 10K concurrent users" | **Claude** | Complex reasoning — needs deep analysis |
-
-OpenClaw's code doesn't change at all. It points at `http://127.0.0.1:12000/v1` instead of a direct provider URL. Plano's Arch-Router analyzes each prompt in ~200ms and picks the right backend.
-
-## Monitoring
-
-### Routing Decisions
-
-Watch Plano logs for model selection:
-
-```bash
-docker logs plano 2>&1 | grep MODEL_RESOLUTION
-```
-
-### Jaeger Tracing
-
-Open [http://localhost:16686](http://localhost:16686) to see full traces of each request, including which model was selected and the routing latency.
-
-## Cost Impact
-
-For a personal assistant handling ~1000 requests/day with a 60/40 conversation-to-code split:
-
-| Without Plano (all Claude) | With Plano (routed) |
-|---|---|
-| 1000 req x Claude pricing | 600 req x Kimi K2.5 + 400 req x Claude |
-| ~$3.00/day input tokens | ~$0.36 + $1.20 = **$1.56/day** (~48% savings) |
-
-Same quality where it matters (code, tests), lower cost where it doesn't (chat).
-
-## Stopping the Demo
-
-```bash
-bash run_demo.sh down
-```
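Note on the removed README's "Cost Impact" table: the figures are internally consistent if each request carries about 1,000 input tokens and Claude input is priced at $3.00 per million tokens. Both values are assumptions inferred from the "~$3.00/day" figure, not stated in the README; the $0.60 per million for Kimi K2.5 is taken from the diagram and config comments. A minimal sketch that reproduces the arithmetic:

```bash
#!/usr/bin/env bash
# Assumptions (not in the README): 1,000 input tokens per request and
# $3.00 per million input tokens for Claude. Kimi K2.5 uses the stated $0.60/M.

# Prints the daily input-token cost in USD.
# Args: requests_per_day tokens_per_request usd_per_million_tokens
daily_input_cost() {
  awk -v r="$1" -v t="$2" -v p="$3" 'BEGIN { printf "%.2f", r * t * p / 1000000 }'
}

all_claude=$(daily_input_cost 1000 1000 3.00)   # every request goes to Claude
kimi_part=$(daily_input_cost 600 1000 0.60)     # 60% conversational traffic
claude_part=$(daily_input_cost 400 1000 3.00)   # 40% code and reasoning traffic

routed=$(awk -v a="$kimi_part" -v b="$claude_part" 'BEGIN { printf "%.2f", a + b }')
savings=$(awk -v a="$all_claude" -v b="$routed" 'BEGIN { printf "%.0f", (1 - b / a) * 100 }')

printf 'all Claude: $%s/day, routed: $%s/day, savings: %s%%\n' \
  "$all_claude" "$routed" "$savings"
```

Output tokens are not modeled here, so real savings would depend on response lengths and on the actual traffic split.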
--- a/demos/llm_routing/openclaw_routing/config.yaml
+++ b/demos/llm_routing/openclaw_routing/config.yaml
@@ -1,48 +0,0 @@
-version: v0.1.0
-
-routing:
-  model: Arch-Router
-  llm_provider: arch-router
-
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
-
-  # Arch Router - the 1.5B preference-aligned routing model (runs locally via Ollama)
-  - name: arch-router
-    model: arch/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
-    base_url: http://host.docker.internal:11434
-
-  # Kimi K2.5 — Moonshot AI's open model (1T MoE, 32B active params)
-  # Great for general conversation, agentic tasks, and multimodal work
-  # OpenAI-compatible API at $0.60/M input, $2.50/M output tokens
-  - model: openai/kimi-k2.5
-    access_key: $MOONSHOT_API_KEY
-    base_url: https://api.moonshot.ai/v1
-    provider_interface: openai
-    default: true
-    routing_preferences:
-      - name: general conversation
-        description: general chat, greetings, casual conversation, Q&A, and everyday questions
-      - name: agentic tasks
-        description: coordinating multi-step workflows, device automation, scheduling, and task orchestration across channels
-
-  # Claude — Anthropic's most capable model
-  # Best for complex reasoning, code, tool use, and evaluation
-  - model: anthropic/claude-sonnet-4-5
-    access_key: $ANTHROPIC_API_KEY
-    routing_preferences:
-      - name: testing and evaluation
-        description: writing tests, running evaluations, QA checks, verifying correctness, and debugging failures
-      - name: code generation
-        description: generating code, writing scripts, implementing functions, and building tool integrations
-      - name: complex reasoning
-        description: multi-step analysis, planning, architectural decisions, and deep problem-solving
-
-tracing:
-  random_sampling: 100
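Note on the removed config.yaml: both hosted providers read their credentials from environment variables through `access_key: $NAME`. The sketch below is a hypothetical helper, not part of the reverted commit, that lists those variable names from a config and reports which ones are not exported. The `referenced_keys` function name and the inline sample are illustrative only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: find the environment variables that access_key
# entries reference, so missing keys can be reported before startup.

# Reads YAML on stdin and prints each NAME from lines of the form
# "access_key: $NAME".
referenced_keys() {
  sed -n 's/^[[:space:]]*access_key:[[:space:]]*\$\([A-Za-z_][A-Za-z0-9_]*\).*$/\1/p'
}

# Inline excerpt of the two provider entries from the removed config.
sample_config='
  - model: openai/kimi-k2.5
    access_key: $MOONSHOT_API_KEY
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
'

missing=0
for name in $(printf '%s\n' "$sample_config" | referenced_keys); do
  # printenv only sees exported variables, which is what a child process gets.
  if [ -z "$(printenv "$name")" ]; then
    echo "missing: $name"
    missing=$((missing + 1))
  fi
done
echo "missing keys: $missing"
```

Against a real file the same function would be fed with `referenced_keys < config.yaml`.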
--- a/demos/llm_routing/openclaw_routing/docker-compose.yaml
+++ b/demos/llm_routing/openclaw_routing/docker-compose.yaml
@@ -1,8 +0,0 @@
-services:
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    ports:
-      - "16686:16686"
-      - "4317:4317"
-      - "4318:4318"
--- a/demos/llm_routing/openclaw_routing/run_demo.sh
+++ b/demos/llm_routing/openclaw_routing/run_demo.sh
@@ -1,59 +0,0 @@
-#!/bin/bash
-set -e
-
-echo "=== OpenClaw + Plano Routing Demo ==="
-
-# Check prerequisites
-command -v docker >/dev/null || { echo "Error: Docker not found"; exit 1; }
-command -v ollama >/dev/null || { echo "Error: Ollama not found. Install from https://ollama.com"; exit 1; }
-
-# Check/create .env file
-if [ -f ".env" ]; then
-  echo ".env file already exists"
-else
-  if [ -z "${MOONSHOT_API_KEY:-}" ]; then
-    echo "Error: MOONSHOT_API_KEY not set"
-    exit 1
-  fi
-  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
-    echo "Error: ANTHROPIC_API_KEY not set"
-    exit 1
-  fi
-  echo "Creating .env file..."
-  echo "MOONSHOT_API_KEY=$MOONSHOT_API_KEY" > .env
-  echo "ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY" >> .env
-fi
-
-# Pull Arch-Router model if needed
-echo "Pulling Arch-Router model..."
-ollama pull hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
-
-start_demo() {
-  # Start Jaeger for tracing
-  echo "Starting Jaeger..."
-  docker compose up -d
-
-  # Start Plano gateway
-  echo "Starting Plano..."
-  planoai up --service plano --foreground
-}
-
-stop_demo() {
-  docker compose down
-  planoai down
-}
-
-if [ "${1:-}" == "down" ]; then
-  stop_demo
-else
-  start_demo
-  echo ""
-  echo "=== Plano is running on http://localhost:12000 ==="
-  echo "=== Jaeger UI at http://localhost:16686 ==="
-  echo ""
-  echo "Configure OpenClaw to use Plano as its LLM endpoint:"
-  echo '  In ~/.openclaw/openclaw.json, set:'
-  echo '    { "agent": { "model": "kimi-k2.5", "baseURL": "http://127.0.0.1:12000/v1" } }'
-  echo ""
-  echo "Then run: openclaw onboard --install-daemon"
-fi
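Note on the removed run_demo.sh: the prerequisite checks, the `.env` creation, and the `ollama pull` all sit at the top level of the script, so they also run for `bash run_demo.sh down`. One way to avoid that is to dispatch on the argument before doing any work. The sketch below shows only the dispatch shape; the three functions are stubs with hypothetical bodies standing in for the real steps.

```bash
#!/usr/bin/env bash
# Sketch of argument-first dispatch. The function bodies are stubs,
# not the real demo steps.

pull_model() { echo "pull model"; }   # stands in for the ollama pull
start_demo() { echo "start"; }        # stands in for docker compose up + planoai up
stop_demo()  { echo "stop"; }         # stands in for docker compose down + planoai down

main() {
  case "${1:-up}" in
    down) stop_demo ;;
    up)   pull_model; start_demo ;;
    *)    echo "usage: $0 [up|down]" >&2; return 2 ;;
  esac
}

main "$@"
```

With this layout, `down` never touches Ollama, and an unknown argument fails with a usage message instead of silently starting the demo.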
--- a/demos/llm_routing/openclaw_routing/test_routing.sh
+++ b/demos/llm_routing/openclaw_routing/test_routing.sh
@@ -1,60 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-PLANO_URL="http://localhost:12000/v1/chat/completions"
-
-echo "=== Testing Plano Routing Decisions ==="
-echo ""
-
-# Scenario 1: General conversation -> should route to Kimi K2.5
-echo "--- Scenario 1: General Conversation (expect: Kimi K2.5) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Hey! What is the weather like today? Can you tell me a fun fact?"}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 2: Agentic task -> should route to Kimi K2.5
-echo "--- Scenario 2: Agentic Task (expect: Kimi K2.5) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Schedule a reminder for tomorrow at 9am to review the pull request, then send a message to the team Slack channel about the deployment."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 3: Code generation -> should route to Claude
-echo "--- Scenario 3: Code Generation (expect: Claude) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Write a Python function that implements a rate limiter using the token bucket algorithm with async support."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 4: Testing/evaluation -> should route to Claude
-echo "--- Scenario 4: Testing & Evaluation (expect: Claude) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Write unit tests for this authentication middleware. Test edge cases: expired tokens, malformed headers, missing credentials, and concurrent requests."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 5: Complex reasoning -> should route to Claude
-echo "--- Scenario 5: Complex Reasoning (expect: Claude) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Analyze the trade-offs between using WebSockets vs SSE vs long-polling for real-time notifications in a distributed messaging system with 10K concurrent users."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-echo "=== Check Plano logs for MODEL_RESOLUTION details ==="
-echo "Run: docker logs plano 2>&1 | grep MODEL_RESOLUTION"
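Note on the removed test_routing.sh: the script prints each response's `model` field but leaves the comparison against the expected route to the reader. The sketch below adds an explicit pass/fail check. It assumes the `model` field in the gateway's response names the backend that actually served the request, which the removed files do not confirm. `route_matches` and `check_scenario` are hypothetical names, and `check_scenario` needs `curl`, `jq`, and a running gateway.

```bash
#!/usr/bin/env bash
PLANO_URL="${PLANO_URL:-http://localhost:12000/v1/chat/completions}"

# Returns 0 when the reported model name contains the expected family name,
# compared case-insensitively (e.g. "claude" vs "claude-sonnet-4-5").
route_matches() {
  local expected actual
  expected=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  actual=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$actual" in
    *"$expected"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sends one prompt through the gateway and prints PASS or FAIL.
# Args: expected_family prompt
check_scenario() {
  local expected="$1" prompt="$2" body model
  # jq builds the request body so the prompt is JSON-escaped correctly.
  body=$(jq -n --arg p "$prompt" \
    '{model: "kimi-k2.5", messages: [{role: "user", content: $p}]}')
  model=$(curl -s "$PLANO_URL" -H "Content-Type: application/json" -d "$body" \
    | jq -r '.model // empty')
  if route_matches "$expected" "$model"; then
    echo "PASS ($model)"
  else
    echo "FAIL (expected $expected, got: ${model:-none})"
  fi
}
```

A call such as `check_scenario claude "Write a Python rate limiter with the token bucket algorithm"` would then cover scenario 3 from the README table.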