# Preference Based LLM Routing
This demo shows how you can use user preferences to route prompts to the most appropriate LLM. See [config.yaml](config.yaml) for details on how to define user preferences.
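For illustration only, a preference-routed provider list might look like the sketch below. The field names and model identifiers here are assumptions, not the actual schema; [config.yaml](config.yaml) in this directory is the source of truth.

```yaml
# Illustrative sketch -- field names and model ids are placeholders;
# consult the demo's config.yaml for the real schema.
llm_providers:
  - model: provider/codegen-model          # placeholder model id
    routing_preferences:
      - name: code generation
        description: generating new code from a natural language request
  - model: provider/code-explainer-model   # placeholder model id
    routing_preferences:
      - name: code understanding
        description: explaining or summarizing existing code
```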
## How to start the demo
Make sure your machine has the [latest version of plano](https://github.com/katanemo/plano/tree/main?tab=readme-ov-file#prerequisites) installed and that you have activated the virtual environment.
1. Start AnythingLLM
```bash
(venv) $ cd demos/llm_routing/preference_based_routing
(venv) $ docker compose up -d
```
2. Start plano in the foreground
```bash
(venv) $ planoai up --service plano --foreground
# Or if installed with uv: uvx planoai up --service plano --foreground
2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.9
2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/llm_routing/preference_based_routing/config.yaml
2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.9
2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!
...
```
3. Open AnythingLLM at http://localhost:3001/
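Besides the AnythingLLM UI, you can exercise the routes programmatically. The sketch below assumes the gateway exposes an OpenAI-compatible chat completions endpoint; the port and path are assumptions (check your config.yaml for the actual listener address), and routing happens server-side, so the payload does not name a model.

```python
# Hypothetical sketch: GATEWAY_URL (port and path) is an assumption, not
# taken from the demo config -- verify against your config.yaml.
import json
import urllib.request

GATEWAY_URL = "http://localhost:12000/v1/chat/completions"  # assumed endpoint


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style chat payload; the router picks the model."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def send(prompt: str) -> dict:
    """POST the prompt to the gateway and return the parsed JSON response."""
    req = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(send("Write a Python function that reverses a linked list."))
```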
## Testing out preference based routing

We have defined two routes: 1) code generation and 2) code understanding.
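Conceptually, the router classifies each prompt into one of these routes and forwards it to the model registered for that route. The sketch below is only an illustration of that mapping, not plano's actual router (which is LLM-based, as the log lines below show); the keyword heuristic and model names are placeholders.

```python
# Conceptual sketch only: plano's real router is LLM-based; this keyword
# version just illustrates the route -> model mapping. Model ids are
# placeholders.
ROUTE_TO_MODEL = {
    "code_generation": "model-a",       # placeholder model id
    "code_understanding": "model-b",    # placeholder model id
}


def classify(prompt: str) -> str:
    """Naive stand-in for the router: pick a route from keywords."""
    generate_hints = ("write", "generate", "implement", "create")
    if any(hint in prompt.lower() for hint in generate_hints):
        return "code_generation"
    return "code_understanding"


def pick_model(prompt: str) -> str:
    """Resolve a prompt to the model registered for its route."""
    return ROUTE_TO_MODEL[classify(prompt)]
```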
For a code generation query, the LLM better suited for code generation will handle the request. If you look at the logs, you'll see that the code generation route was selected:
```
...
2025-05-31T01:02:19.382716Z INFO brightstaff::router::llm_router: router response: {'route': 'code_generation'}, response time: 203ms
...
```
<img width="1036" alt="image" src="https://github.com/user-attachments/assets/f923944b-ddbe-462e-9fd5-c75504adc8cf" />
Now, if you ask a query related to code understanding, you'll see that the LLM better suited for code understanding handles it:
```
...
2025-05-31T01:06:33.555680Z INFO brightstaff::router::llm_router: router response: {'route': 'code_understanding'}, response time: 327ms
...
```
<img width="1081" alt="image" src="https://github.com/user-attachments/assets/e50d167c-46a0-4e3a-ba77-e84db1bd376d" />