# Preference Based LLM Routing
This demo shows how you can use user preferences to route prompts to the most appropriate LLM. See [config.yaml](config.yaml) for details on how to define user preferences.
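For illustration only, a preference-routed provider list might look like the sketch below. The field names and model identifiers here are assumptions, not the actual schema; [config.yaml](config.yaml) in this directory is the source of truth.

```yaml
# Illustrative sketch -- field names and model ids are placeholders;
# consult the demo's config.yaml for the real schema.
llm_providers:
  - model: provider/codegen-model          # placeholder model id
    routing_preferences:
      - name: code generation
        description: generating new code from a natural language request
  - model: provider/code-explainer-model   # placeholder model id
    routing_preferences:
      - name: code understanding
        description: explaining or summarizing existing code
```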
## How to start the demo
Make sure your machine has the [latest version of plano](https://github.com/katanemo/plano/tree/main?tab=readme-ov-file#prerequisites) installed and that you have activated the virtual environment.
1. Start AnythingLLM
```bash
(venv) $ cd demos/llm_routing/preference_based_routing
(venv) $ docker compose up -d
```
2. Start plano in the foreground
```bash
(venv) $ planoai up --service plano --foreground
# Or if installed with uv: uvx planoai up --service plano --foreground
2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.9
2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/llm_routing/preference_based_routing/config.yaml
2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.9
2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!
...
```
3. Open AnythingLLM at http://localhost:3001/
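Besides the AnythingLLM UI, you can exercise the routes programmatically. The sketch below assumes the gateway exposes an OpenAI-compatible chat completions endpoint; the port and path are assumptions (check your config.yaml for the actual listener address), and routing happens server-side, so the payload does not name a model.

```python
# Hypothetical sketch: GATEWAY_URL (port and path) is an assumption, not
# taken from the demo config -- verify against your config.yaml.
import json
import urllib.request

GATEWAY_URL = "http://localhost:12000/v1/chat/completions"  # assumed endpoint


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style chat payload; the router picks the model."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def send(prompt: str) -> dict:
    """POST the prompt to the gateway and return the parsed JSON response."""
    req = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(send("Write a Python function that reverses a linked list."))
```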
## Testing out preference based routing

We have defined two routes: 1) code generation and 2) code understanding.
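Conceptually, the router classifies each prompt into one of these routes and forwards it to the model registered for that route. The sketch below is only an illustration of that mapping, not plano's actual router (which is LLM-based, as the log lines below show); the keyword heuristic and model names are placeholders.

```python
# Conceptual sketch only: plano's real router is LLM-based; this keyword
# version just illustrates the route -> model mapping. Model ids are
# placeholders.
ROUTE_TO_MODEL = {
    "code_generation": "model-a",       # placeholder model id
    "code_understanding": "model-b",    # placeholder model id
}


def classify(prompt: str) -> str:
    """Naive stand-in for the router: pick a route from keywords."""
    generate_hints = ("write", "generate", "implement", "create")
    if any(hint in prompt.lower() for hint in generate_hints):
        return "code_generation"
    return "code_understanding"


def pick_model(prompt: str) -> str:
    """Resolve a prompt to the model registered for its route."""
    return ROUTE_TO_MODEL[classify(prompt)]
```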
For a code generation query, the LLM better suited for code generation will handle the request. If you look at the logs, you'll see that the code generation route was selected:
```
...
2025-05-31T01:02:19.382716Z INFO brightstaff::router::llm_router: router response: {'route': 'code_generation'}, response time: 203ms
...
```
<img width="1036" alt="image" src="https://github.com/user-attachments/assets/f923944b-ddbe-462e-9fd5-c75504adc8cf" />
Now, if you ask a query related to code understanding, you'll see that the LLM better suited for code understanding handles it:
```
...
2025-05-31T01:06:33.555680Z INFO brightstaff::router::llm_router: router response: {'route': 'code_understanding'}, response time: 327ms
...
```
<img width="1081" alt="image" src="https://github.com/user-attachments/assets/e50d167c-46a0-4e3a-ba77-e84db1bd376d" />