mirror of https://github.com/katanemo/plano.git synced 2026-06-17 15:25:17 +02:00

Adil Hafeez 5fb7ce576c release 0.3.3 (#519 )		2025-07-08 00:59:33 -07:00
hurl_tests	release 0.3.2 (#507 )	2025-06-13 17:02:20 -07:00
arch_config.yaml	make arch-router cluster optional (#518 )	2025-07-08 00:33:40 -07:00
arch_config_local.yaml	make arch-router cluster optional (#518 )	2025-07-08 00:33:40 -07:00
docker-compose.yaml	update code to use new json based system prompt for routing (#493 )	2025-05-30 17:40:46 -07:00
README.md	release 0.3.3 (#519 )	2025-07-08 00:59:33 -07:00
test_router_endpoint.rest	make arch-router cluster optional (#518 )	2025-07-08 00:33:40 -07:00

Usage-based LLM Routing

This demo shows how to use user preferences to route prompts to the appropriate LLM. See arch_config.yaml for details on how to define user preferences.

How to start the demo

Make sure your machine has the latest version of archgw installed and that you have activated the virtual environment.

Start Open WebUI:

(venv) $ cd demos/use_cases/preference_based_routing
(venv) $ docker compose up -d

Start archgw in the foreground:

(venv) $ archgw up --service archgw --foreground
2025-05-30 18:00:09,953 - cli.main - INFO - Starting archgw cli version: 0.3.3
2025-05-30 18:00:09,953 - cli.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/arch_config.yaml
2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: archgw, tag: katanemo/archgw:0.3.3
2025-05-30 18:00:10,662 - cli.core - INFO - archgw status: running, health status: starting
2025-05-30 18:00:11,712 - cli.core - INFO - archgw status: running, health status: starting
2025-05-30 18:00:12,761 - cli.core - INFO - archgw is running and is healthy!
...

Open the Open WebUI interface at http://localhost:8080/

Testing out preference-based routing

We have defined two routes: (1) code generation and (2) code understanding.

For a code generation query, the LLM that is better suited for code generation will handle the request.
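
Conceptually, this kind of routing is a lookup from the route the router picks to the model configured for that route. The sketch below is only an illustration of that idea in Python: the model names, the `pick_model` helper, and the fallback to a default model are all hypothetical and are not taken from arch_config.yaml or from archgw itself.

```python
# Illustrative only: the model names and the default below are hypothetical
# placeholders, not values read from arch_config.yaml.
ROUTE_TO_MODEL = {
    "code_generation": "model-for-code-generation",
    "code_understanding": "model-for-code-understanding",
}
DEFAULT_MODEL = "default-model"  # assumed fallback for unknown routes


def pick_model(route):
    """Return the model mapped to a route, or the default if the route is unknown."""
    return ROUTE_TO_MODEL.get(route, DEFAULT_MODEL)


print(pick_model("code_generation"))
print(pick_model("something_else"))
```

In the demo itself the mapping is declared in arch_config.yaml; the dictionary here only stands in for it.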

If you look at the logs, you will see that the code generation LLM was selected:

...
2025-05-31T01:02:19.382716Z  INFO brightstaff::router::llm_router: router response: {'route': 'code_generation'}, response time: 203ms
...

Now, if you ask a query related to code understanding, you will see that the LLM better suited to code understanding is selected:

...
2025-05-31T01:06:33.555680Z  INFO brightstaff::router::llm_router: router response: {'route': 'code_understanding'}, response time: 327ms
...
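
When testing several prompts, it can be easier to extract the selected route from the logs than to read them by eye. The following Python sketch parses the two `router response` lines shown above. It assumes the log format stays exactly as in those excerpts; the `parse_router_line` helper and the regular expression are illustrative and are not part of archgw.

```python
import ast
import re
from collections import Counter

# Matches the "router response" lines shown in the log excerpts. The payload
# uses single quotes, so it is parsed as a Python literal rather than as JSON.
ROUTER_LINE = re.compile(
    r"router response: (?P<payload>\{.*?\}), response time: (?P<ms>\d+)ms"
)


def parse_router_line(line):
    """Return (route, response_time_ms), or None if the line does not match."""
    match = ROUTER_LINE.search(line)
    if match is None:
        return None
    payload = ast.literal_eval(match.group("payload"))
    return payload.get("route"), int(match.group("ms"))


logs = [
    "2025-05-31T01:02:19.382716Z  INFO brightstaff::router::llm_router: "
    "router response: {'route': 'code_generation'}, response time: 203ms",
    "2025-05-31T01:06:33.555680Z  INFO brightstaff::router::llm_router: "
    "router response: {'route': 'code_understanding'}, response time: 327ms",
]

# Keep only the lines that carry a router decision, then tally the routes.
parsed = [p for p in map(parse_router_line, logs) if p is not None]
route_counts = Counter(route for route, _ in parsed)

print(parsed)
print(route_counts)
```

Lines that do not contain a router response are skipped, so the same function can be applied to the full archgw log output.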