mirror of
https://github.com/katanemo/plano.git
synced 2026-04-25 00:36:34 +02:00
update readme for preference based routing (#496)
parent ed28bbaf04
commit c7a3a668a9
1 changed files with 51 additions and 0 deletions
@@ -1,2 +1,53 @@
# Usage based LLM Routing

This demo shows how to use user preferences to route user prompts to the appropriate LLM. See [arch_config.yaml](arch_config.yaml) for details on how to define user preferences.

## How to start the demo

Make sure your machine is up to date with the [latest version of archgw](https://github.com/katanemo/archgw/tree/main?tab=readme-ov-file#prerequisites) and that you have activated the virtual environment.

1. Start Open WebUI
```bash
(venv) $ cd demos/use_cases/preference_based_routing
(venv) $ docker compose up -d
```
2. Start archgw in the foreground
```bash
(venv) $ archgw up --service archgw --foreground
2025-05-30 18:00:09,953 - cli.main - INFO - Starting archgw cli version: 0.3.1
2025-05-30 18:00:09,953 - cli.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/arch_config.yaml
2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: archgw, tag: katanemo/archgw:0.3.1
2025-05-30 18:00:10,662 - cli.core - INFO - archgw status: running, health status: starting
2025-05-30 18:00:11,712 - cli.core - INFO - archgw status: running, health status: starting
2025-05-30 18:00:12,761 - cli.core - INFO - archgw is running and is healthy!
...
```

3. Open http://localhost:8080/ in your browser to access Open WebUI

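
Before opening the browser, it can help to confirm that the address from step 3 is answering. The following is a minimal sketch using only the Python standard library; the helper name `is_reachable` and the timeout value are illustrative assumptions, not part of archgw or Open WebUI.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` receives any HTTP response.

    Hypothetical helper for this demo; not provided by archgw.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, name resolution failure, timeout, and so on.
        return False


if __name__ == "__main__":
    url = "http://localhost:8080/"  # Open WebUI address from step 3
    print(f"{url} reachable: {is_reachable(url)}")
```

If this reports that the address is not reachable, checking the `docker compose up -d` output from step 1 is a reasonable first step.
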
# Testing out preference based routing

We have defined two routes: (1) code generation and (2) code understanding.

For a code generation query, the LLM that is better suited for code generation will handle the request. If you look at the logs, you will see that the code generation LLM was selected:

```
...
2025-05-31T01:02:19.382716Z INFO brightstaff::router::llm_router: router response: {'route': 'code_generation'}, response time: 203ms
...
```

<img width="1036" alt="image" src="https://github.com/user-attachments/assets/f923944b-ddbe-462e-9fd5-c75504adc8cf" />

Now, if you send a query related to code understanding, you will see that the LLM better suited to code understanding handles the request:

```
...
2025-05-31T01:06:33.555680Z INFO brightstaff::router::llm_router: router response: {'route': 'code_understanding'}, response time: 327ms
...
```

<img width="1081" alt="image" src="https://github.com/user-attachments/assets/e50d167c-46a0-4e3a-ba77-e84db1bd376d" />
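
To check routing decisions without reading the logs by eye, the router lines shown above can be parsed with a short script. This is a sketch that assumes the log format is exactly as it appears in this README (`router response: {...}, response time: NNNms`); the function name `parse_router_line` is hypothetical, and the format may differ in other archgw versions.

```python
import ast
import re

# Matches the router log lines shown in this README (format assumed stable).
ROUTER_LINE = re.compile(r"router response: (\{.*?\}), response time: (\d+)ms")


def parse_router_line(line):
    """Return (route, response_time_ms) for a router log line, else None."""
    match = ROUTER_LINE.search(line)
    if match is None:
        return None
    # The payload is printed as a Python-style dict literal with single quotes.
    payload = ast.literal_eval(match.group(1))
    return payload.get("route"), int(match.group(2))


if __name__ == "__main__":
    sample = (
        "2025-05-31T01:02:19.382716Z INFO brightstaff::router::llm_router: "
        "router response: {'route': 'code_generation'}, response time: 203ms"
    )
    print(parse_router_line(sample))
```

Lines that do not contain a router response, such as the `...` placeholders, return `None`, so the function can be applied to every line of the foreground output.
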