Unified overrides for custom router and orchestrator models (#820)

* support configurable orchestrator model via orchestration config section
* add self-hosting docs and demo for Plano-Orchestrator
* list all Plano-Orchestrator model variants in docs
* use overrides for custom routing and orchestration model
* update docs
* update orchestrator model name
* rename arch provider to plano, use llm_routing_model and agent_orchestration_model
* regenerate rendered config reference
2026-07-23 16:51:04 +02:00 · 2026-03-15 09:36:11 -07:00
commit bc059aed4d
parent 785bf7e021
20 changed files with 312 additions and 103 deletions
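
For readers updating an existing config by hand, the two hunks in `llm_router.rst` below follow the same pattern: the `routing` section and the provider's `name` key go away, the provider model gains a `plano/` prefix (replacing `arch/` where it was present), and `overrides.llm_routing_model` points at that model. The following is only a rough sketch of that rewrite on an already-parsed config dict. `migrate_routing_config` is a hypothetical helper, not part of this commit or of Plano, and the prefix handling is an assumption inferred from these two hunks alone.

```python
import copy


def migrate_routing_config(old: dict) -> dict:
    """Hypothetical helper: rewrite an old-style config dict into the
    `overrides` form shown in this diff. Not part of the commit."""
    new = copy.deepcopy(old)
    routing = new.pop("routing", None)
    if not routing:
        return new  # no old-style routing section, nothing to do

    provider_name = routing.get("llm_provider")
    for provider in new.get("model_providers", []):
        if provider_name is None or provider.get("name") != provider_name:
            continue
        model = provider["model"]
        # Assumption: the old "arch/" prefix is dropped and "plano/" applied,
        # matching both hunks (one had "arch/", the other had no prefix).
        if model.startswith("arch/"):
            model = model[len("arch/"):]
        if not model.startswith("plano/"):
            model = "plano/" + model
        provider["model"] = model
        del provider["name"]  # the diff removes the provider's name key
        new.setdefault("overrides", {})["llm_routing_model"] = model
        return new

    raise ValueError("routing.llm_provider does not match any model provider")


# Old-style Ollama example, transcribed from the removed lines of the first hunk.
old_ollama = {
    "routing": {"model": "Arch-Router", "llm_provider": "arch-router"},
    "model_providers": [
        {
            "name": "arch-router",
            "model": "arch/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M",
            "base_url": "http://localhost:11434",
        },
        {"model": "openai/gpt-5.2"},
    ],
}

new_ollama = migrate_routing_config(old_ollama)
```

The helper works on a deep copy, so the input dict is left untouched. Loading and dumping the YAML itself is left out on purpose to keep the sketch free of third-party dependencies.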
--- a/docs/source/guides/llm_router.rst
+++ b/docs/source/guides/llm_router.rst
@@ -253,13 +253,11 @@ Using Ollama (recommended for local development)

   .. code-block:: yaml

-       routing:
-         model: Arch-Router
-         llm_provider: arch-router
+       overrides:
+         llm_routing_model: plano/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M

       model_providers:
-         - name: arch-router
-           model: arch/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
+         - model: plano/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
           base_url: http://localhost:11434

         - model: openai/gpt-5.2
@@ -324,13 +322,11 @@ vLLM provides higher throughput and GPU optimizations suitable for production de

   .. code-block:: yaml

-       routing:
-         model: Arch-Router
-         llm_provider: arch-router
+       overrides:
+         llm_routing_model: plano/Arch-Router

       model_providers:
-         - name: arch-router
-           model: Arch-Router
+         - model: plano/Arch-Router
           base_url: http://<your-server-ip>:10000

         - model: openai/gpt-5.2