mirror of
https://github.com/katanemo/plano.git
synced 2026-04-26 17:26:26 +02:00
Unified overrides for custom router and orchestrator models (#820)
* support configurable orchestrator model via orchestration config section * add self-hosting docs and demo for Plano-Orchestrator * list all Plano-Orchestrator model variants in docs * use overrides for custom routing and orchestration model * update docs * update orchestrator model name * rename arch provider to plano, use llm_routing_model and agent_orchestration_model * regenerate rendered config reference
This commit is contained in:
parent
785bf7e021
commit
bc059aed4d
20 changed files with 312 additions and 103 deletions
|
|
@ -253,13 +253,11 @@ Using Ollama (recommended for local development)
|
|||
|
||||
.. code-block:: yaml
|
||||
|
||||
routing:
|
||||
model: Arch-Router
|
||||
llm_provider: arch-router
|
||||
overrides:
|
||||
llm_routing_model: plano/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
|
||||
|
||||
model_providers:
|
||||
- name: arch-router
|
||||
model: arch/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
|
||||
- model: plano/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
|
||||
base_url: http://localhost:11434
|
||||
|
||||
- model: openai/gpt-5.2
|
||||
|
|
@ -324,13 +322,11 @@ vLLM provides higher throughput and GPU optimizations suitable for production de
|
|||
|
||||
.. code-block:: yaml
|
||||
|
||||
routing:
|
||||
model: Arch-Router
|
||||
llm_provider: arch-router
|
||||
overrides:
|
||||
llm_routing_model: plano/Arch-Router
|
||||
|
||||
model_providers:
|
||||
- name: arch-router
|
||||
model: Arch-Router
|
||||
- model: plano/Arch-Router
|
||||
base_url: http://<your-server-ip>:10000
|
||||
|
||||
- model: openai/gpt-5.2
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue