showing the model-based and preference-based routing in consistent yaml

2026-06-17 15:25:17 +02:00 · 2025-07-01 11:53:52 -07:00 · 2025-07-01 11:53:52 -07:00 · 54065cc3be
commit 54065cc3be
parent 73f05174c9
2 changed files with 23 additions and 3 deletions
--- a/README.md
+++ b/README.md
@ -219,9 +219,29 @@ llm_providers:
 #### Preference-based Routing
 Preference-based routing is designed for more dynamic and intelligent selection of models. Instead of static model names, you write plain-language routing policies that describe the type of task or preference — for example:

-    "contract clauses → GPT-4o"
+```yaml
+version: v0.1.0

-    "quick travel tips → Gemini Flash"
+listeners:
+  egress_traffic:
+    address: 0.0.0.0
+    port: 12000
+    message_format: openai
+    timeout: 30s
+
+llm_providers:
+  - name: code_generation
+    access_key: $OPENAI_API_KEY
+    provider_interface: openai
+    model: gpt-4.1
+    usage: generating new code snippets, functions, or boilerplate based on user prompts or requirements
+
+  - name: code_understanding
+    provider_interface: openai
+    access_key: $OPENAI_API_KEY
+    model: gpt-4o-mini
+    usage: understand and explain existing code snippets, functions, or libraries
+```

 Arch uses a lightweight 1.5B autoregressive model to map prompts (and conversation context) to these policies. This approach adapts to intent drift, supports multi-turn conversations, and avoids the brittleness of embedding-based classifiers or manual if/else chains. No retraining is required when adding new models or updating policies — routing is governed entirely by human-readable rules. You can learn more about the design, benchmarks, and methodology behind preference-based routing in our paper:

--- a/demos/use_cases/preference_based_routing/arch_config.yaml
+++ b/demos/use_cases/preference_based_routing/arch_config.yaml
@ -37,7 +37,7 @@ llm_providers:
  - name: code_understanding
    provider_interface: openai
    access_key: $OPENAI_API_KEY
-    model: gpt-4.1
+    model: gpt-4o-mini
    usage: understand and explain existing code snippets, functions, or libraries

 tracing: