diff --git a/demos/llm_routing/openclaw_routing/README.md b/demos/llm_routing/openclaw_routing/README.md
index 77088ec2..7c201687 100644
--- a/demos/llm_routing/openclaw_routing/README.md
+++ b/demos/llm_routing/openclaw_routing/README.md
@@ -1,5 +1,9 @@
 # OpenClaw + Plano: Smart Model Routing for Personal AI Assistants

+
+ OpenClaw + Plano
+
+
 OpenClaw is an open-source personal AI assistant that connects to WhatsApp, Telegram, Slack, and Discord. By pointing it at Plano instead of a single LLM provider, every message is automatically routed to the best model — conversational requests go to Kimi K2.5 (cost-effective), while code generation, testing, and complex reasoning go to Claude (most capable) — with zero application code changes.

 ## Architecture
@@ -97,34 +101,21 @@ Try these messages to see routing in action:

 OpenClaw's code doesn't change at all. It points at `http://127.0.0.1:12000/v1` instead of a direct provider URL. Plano's router analyzes each prompt and picks the right backend.

-### Verify Plano Routing Directly (Optional)

-To test Plano's routing without OpenClaw, run the test script which sends requests directly to the gateway:
+## Tracing
+
+For fast dev/test cycles, Plano provides built-in tracing to visualize routing decisions and LLM interactions. Start the trace listener in a separate terminal:

 ```bash
-bash test_routing.sh
+planoai trace
 ```

-## Monitoring
+Then send requests through OpenClaw. You'll see detailed traces showing:
+- Which model was selected and why
+- Token usage and latency for each request
+- Complete request/response payloads

-### Routing Decisions
-
-Watch Plano logs for model selection:
-
-```bash
-docker logs plano 2>&1 | grep MODEL_RESOLUTION
-```
-
-### Jaeger Tracing (Optional)
-
-To visualize full request traces and routing decisions, start Jaeger:
-
-```bash
-docker run -d --name jaeger -p 16686:16686 -p 4317:4317 -p 4318:4318 \
-  -e COLLECTOR_OTLP_ENABLED=true jaegertracing/all-in-one:latest
-```
-
-Then open [http://localhost:16686](http://localhost:16686) to see traces for each request, including which model was selected and the routing latency.
+Learn more about tracing features and configuration in the [Plano tracing guide](https://docs.planoai.dev/guides/observability/tracing.html#tracing-with-the-cli).

 ## Cost Impact

diff --git a/demos/llm_routing/openclaw_routing/openclaw_plano.png b/demos/llm_routing/openclaw_routing/openclaw_plano.png
new file mode 100644
index 00000000..66b2ee73
Binary files /dev/null and b/demos/llm_routing/openclaw_routing/openclaw_plano.png differ
diff --git a/demos/llm_routing/openclaw_routing/test_routing.sh b/demos/llm_routing/openclaw_routing/test_routing.sh
deleted file mode 100755
index d630f920..00000000
--- a/demos/llm_routing/openclaw_routing/test_routing.sh
+++ /dev/null
@@ -1,60 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-PLANO_URL="http://localhost:12000/v1/chat/completions"
-
-echo "=== Testing Plano Routing Decisions ==="
-echo ""
-
-# Scenario 1: General conversation -> should route to Kimi K2.5
-echo "--- Scenario 1: General Conversation (expect: Kimi K2.5) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Hey! What is the weather like today? Can you tell me a fun fact?"}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 2: Agentic task -> should route to Kimi K2.5
-echo "--- Scenario 2: Agentic Task (expect: Kimi K2.5) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Schedule a reminder for tomorrow at 9am to review the pull request, then send a message to the team Slack channel about the deployment."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 3: Code generation -> should route to Claude
-echo "--- Scenario 3: Code Generation (expect: Claude) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Write a Python function that implements a rate limiter using the token bucket algorithm with async support."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 4: Testing/evaluation -> should route to Claude
-echo "--- Scenario 4: Testing & Evaluation (expect: Claude) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Write unit tests for this authentication middleware. Test edge cases: expired tokens, malformed headers, missing credentials, and concurrent requests."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-# Scenario 5: Complex reasoning -> should route to Claude
-echo "--- Scenario 5: Complex Reasoning (expect: Claude) ---"
-curl -s "$PLANO_URL" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "kimi-k2.5",
-    "messages": [{"role": "user", "content": "Analyze the trade-offs between using WebSockets vs SSE vs long-polling for real-time notifications in a distributed messaging system with 10K concurrent users."}]
-  }' | jq '{model: .model, content: .choices[0].message.content[:100]}'
-echo ""
-
-echo "=== Check Plano logs for MODEL_RESOLUTION details ==="
-echo "Run: docker logs plano 2>&1 | grep MODEL_RESOLUTION"