
Preference-based LLM Routing

This demo shows how you can use user preferences to route prompts to the appropriate LLM. See config.yaml for details on how to define routing preferences.
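A preference route pairs a name and a plain-language usage description with a target model. The exact schema lives in config.yaml; the fragment below is only an illustrative sketch (field names and model assignments are assumptions, not the demo's actual config):

```yaml
# Illustrative sketch only -- see config.yaml for the real schema.
routes:
  - name: code_generation
    description: generating new code from a natural-language request
    model: gpt-4o            # hypothetical assignment
  - name: code_understanding
    description: explaining or reviewing existing code
    model: gpt-4o-mini       # hypothetical assignment
```

The router reads the descriptions, not just the names, so the quality of the plain-language description drives routing accuracy.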

How to start the demo

Make sure you have the Plano CLI installed (pip install planoai==0.4.19 or uv tool install planoai==0.4.19).

cd demos/llm_routing/preference_based_routing
./run_demo.sh

To also start AnythingLLM (chat UI) and Jaeger (tracing):

./run_demo.sh --with-ui

Then open AnythingLLM at http://localhost:3001/

Or start manually:

  1. (Optional) Start AnythingLLM and Jaeger:

     docker compose up -d

  2. Start Plano:

     planoai up config.yaml

  3. Test with curl or open AnythingLLM at http://localhost:3001/
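For the curl step, the request body is a Messages-style payload. A minimal Python sketch of constructing it (field names mirror the curl example later in this README; the model name is only a placeholder, since the router may override it based on the configured preferences):

```python
import json

def build_routing_request(prompt: str, model: str = "gpt-4o-mini",
                          max_tokens: int = 1024) -> str:
    """Build the JSON body sent to Plano's routing endpoint.

    Field names mirror the curl example in this README; the model here
    is a default the router can override via the configured preferences.
    """
    payload = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(payload)

body = build_routing_request("Explain what this Rust function does")
print(body)
```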

Running with local routing model (via Ollama)

By default, Plano uses a hosted Plano-Orchestrator endpoint. To self-host a routing model locally using Ollama:

  1. Install Ollama and pull the model:

     ollama pull hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M

  2. Make sure Ollama is running (ollama serve or the macOS app).

  3. Start Plano with the local config:

     planoai up plano_config_local.yaml

  4. Test routing:
curl -s "http://localhost:12000/routing/v1/messages" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Create a REST API endpoint in Rust using actix-web"}
    ]
  }'

You should see the router select the appropriate model based on the routing preferences defined in plano_config_local.yaml.

Testing out preference-based routing

We have defined two routes: 1. code generation and 2. code understanding.

For a code generation query, the LLM that is better suited for code generation will handle the request.

If you look at the logs, you'll see that the code generation LLM was selected:

...
2025-05-31T01:02:19.382716Z  INFO brightstaff::router::llm_router: router response: {'route': 'code_generation'}, response time: 203ms
...

Now if you send a query related to code understanding, you'll see that the LLM better suited for code understanding handles it:

...
2025-05-31T01:06:33.555680Z  INFO brightstaff::router::llm_router: router response: {'route': 'code_understanding'}, response time: 327ms
...
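The log lines above show the router returning a route name; the gateway then forwards the request to whichever model is configured for that route. Conceptually, that final hop is a lookup, sketched below with hypothetical route-to-model assignments (the real mapping lives in the demo's config.yaml):

```python
# Hypothetical route -> model table; the actual assignments come from
# the routes defined in config.yaml, not from this sketch.
ROUTE_TO_MODEL = {
    "code_generation": "gpt-4o",          # illustrative assignment
    "code_understanding": "gpt-4o-mini",  # illustrative assignment
}

def resolve_model(router_response: dict, default: str = "gpt-4o-mini") -> str:
    """Map the router's {'route': ...} decision to a concrete model name,
    falling back to a default when the route is missing or unknown."""
    return ROUTE_TO_MODEL.get(router_response.get("route", ""), default)

print(resolve_model({"route": "code_generation"}))    # -> gpt-4o
print(resolve_model({"route": "code_understanding"})) # -> gpt-4o-mini
```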