Mirror of https://github.com/katanemo/plano.git (synced 2026-04-25 00:36:34 +02:00)
Model Routing Service Demo
This demo shows how to use the /routing/v1/* endpoints to get routing decisions without proxying requests to an LLM. The endpoints accept standard LLM request formats and return the model that Plano's router would select.
Setup
Make sure the Plano CLI is installed (pip install planoai or uv tool install planoai), then export your provider API keys:
export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>
Start Plano:
cd demos/llm_routing/model_routing_service
planoai up config.yaml
Run the demo
./demo.sh
Endpoints
All three LLM API formats are supported:
| Endpoint | Format |
|---|---|
| POST /routing/v1/chat/completions | OpenAI Chat Completions |
| POST /routing/v1/messages | Anthropic Messages |
| POST /routing/v1/responses | OpenAI Responses API |
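The three formats differ mainly in where the prompt goes in the request body. The following is a minimal Python sketch of that difference, assuming the routing endpoints accept the same bodies as the upstream OpenAI and Anthropic APIs. The helper name `build_routing_request` and the `max_tokens` value are illustrative and not part of Plano.

```python
import json

# Paths taken from the table above.
ROUTING_PATHS = {
    "openai_chat": "/routing/v1/chat/completions",
    "anthropic_messages": "/routing/v1/messages",
    "openai_responses": "/routing/v1/responses",
}


def build_routing_request(api_format, model, prompt):
    """Return (path, JSON body) for one request format (hypothetical helper)."""
    if api_format not in ROUTING_PATHS:
        raise ValueError(f"unknown format: {api_format}")

    if api_format == "openai_responses":
        # The OpenAI Responses API carries the prompt in "input".
        payload = {"model": model, "input": prompt}
    else:
        # Chat Completions and Anthropic Messages both use a "messages" list.
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
        if api_format == "anthropic_messages":
            # Required by the Anthropic Messages API; assumed here to be
            # accepted (or ignored) by the routing endpoint as well.
            payload["max_tokens"] = 1024

    return ROUTING_PATHS[api_format], json.dumps(payload)
```

Each returned pair can be sent as a POST with a Content-Type: application/json header to the Plano listener.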
Example
curl http://localhost:12000/routing/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Write a Python function for binary search"}]
  }'
Response:
{
  "model": "anthropic/claude-sonnet-4-20250514",
  "route": "code_generation",
  "trace_id": "c16d1096c1af4a17abb48fb182918a88"
}
The response tells you which model would handle this request and which route was matched, without actually making the LLM call.
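The same request can be issued from Python with only the standard library. This is a sketch rather than an official client: the function names are hypothetical, and the base URL assumes the listener from the curl example (localhost:12000).

```python
import json
import urllib.request

PLANO_URL = "http://localhost:12000"  # port taken from the curl example


def build_request(prompt, model="gpt-4o-mini", base_url=PLANO_URL):
    """Build the POST request for the Chat Completions routing endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        base_url + "/routing/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_routing_decision(prompt, model="gpt-4o-mini", base_url=PLANO_URL):
    """Send the request and return the parsed routing decision as a dict."""
    request = build_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With Plano running, `get_routing_decision("Write a Python function for binary search")` should return a dict with the `model`, `route`, and `trace_id` keys shown in the response above.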
Demo Output
=== Model Routing Service Demo ===

--- 1. Code generation query (OpenAI format) ---
{
  "model": "anthropic/claude-sonnet-4-20250514",
  "route": "code_generation",
  "trace_id": "c16d1096c1af4a17abb48fb182918a88"
}

--- 2. Complex reasoning query (OpenAI format) ---
{
  "model": "openai/gpt-4o",
  "route": "complex_reasoning",
  "trace_id": "30795e228aff4d7696f082ed01b75ad4"
}

--- 3. Simple query - no routing match (OpenAI format) ---
{
  "model": "none",
  "route": null,
  "trace_id": "ae0b6c3b220d499fb5298ac63f4eac0e"
}

--- 4. Code generation query (Anthropic format) ---
{
  "model": "anthropic/claude-sonnet-4-20250514",
  "route": "code_generation",
  "trace_id": "26be822bbdf14a3ba19fe198e55ea4a9"
}

=== Demo Complete ===
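Case 3 in the output shows that an unmatched query comes back with "model": "none" and a null route, so a caller needs to handle that case explicitly. One possible way to do so is sketched below. The fallback to a default model is a client-side choice assumed for illustration, not documented Plano behavior, and `resolve_model` is a hypothetical helper.

```python
import json

# Two responses copied from the demo output.
MATCHED = json.loads("""{
  "model": "openai/gpt-4o",
  "route": "complex_reasoning",
  "trace_id": "30795e228aff4d7696f082ed01b75ad4"
}""")
NO_MATCH = json.loads("""{
  "model": "none",
  "route": null,
  "trace_id": "ae0b6c3b220d499fb5298ac63f4eac0e"
}""")


def resolve_model(decision, default_model):
    """Return the routed model, or default_model when no route matched."""
    model = decision.get("model")
    route = decision.get("route")
    # JSON null becomes None; the demo also reports the literal string "none".
    if route is None or model in (None, "none"):
        return default_model
    return model


print(resolve_model(MATCHED, "openai/gpt-4o-mini"))   # openai/gpt-4o
print(resolve_model(NO_MATCH, "openai/gpt-4o-mini"))  # openai/gpt-4o-mini
```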