update orchestrator model name

Adil Hafeez 2026-03-11 16:46:03 -07:00
parent 48bf83fa0d
commit 680dee60a0
3 changed files with 9 additions and 7 deletions

@@ -141,7 +141,7 @@ vllm serve katanemo/Plano-Orchestrator-4B \
--gpu-memory-utilization 0.3 \
--tokenizer katanemo/Plano-Orchestrator-4B \
--chat-template chat_template.jinja \
- --served-model-name Plano-Orchestrator \
+ --served-model-name katanemo/Plano-Orchestrator-4B \
--enable-prefix-caching
```

@@ -1,7 +1,7 @@
version: v0.3.0
overrides:
- orchestrator_model: plano/Plano-Orchestrator
+ orchestrator_model: plano/katanemo/Plano-Orchestrator-4B
agents:
- id: weather_agent
@@ -10,7 +10,7 @@ agents:
url: http://localhost:10520
model_providers:
- - model: plano/Plano-Orchestrator
+ - model: plano/katanemo/Plano-Orchestrator-4B
base_url: http://localhost:8000
- model: openai/gpt-5.2

@@ -379,7 +379,7 @@ Using vLLM
--gpu-memory-utilization 0.3 \
--tokenizer katanemo/Plano-Orchestrator-4B \
--chat-template chat_template.jinja \
- --served-model-name Plano-Orchestrator \
+ --served-model-name katanemo/Plano-Orchestrator-4B \
--enable-prefix-caching
For the 30B-A3B-FP8 model (production):
@@ -394,18 +394,20 @@ Using vLLM
--tokenizer katanemo/Plano-Orchestrator-30B-A3B-FP8 \
--chat-template chat_template.jinja \
--max-model-len 32768 \
- --served-model-name Plano-Orchestrator \
+ --served-model-name katanemo/Plano-Orchestrator-30B-A3B-FP8 \
--enable-prefix-caching
4. **Configure Plano to use the local orchestrator**
+ Use the model name matching your ``--served-model-name``:
.. code-block:: yaml
overrides:
- orchestrator_model: plano/Plano-Orchestrator
+ orchestrator_model: plano/katanemo/Plano-Orchestrator-4B
model_providers:
- - model: plano/Plano-Orchestrator
+ - model: plano/katanemo/Plano-Orchestrator-4B
base_url: http://<your-server-ip>:8000
5. **Verify the server is running**
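
Note on the naming rule this commit applies: in every hunk, the Plano model id is ``plano/`` followed by the value passed to vLLM's ``--served-model-name``. The sketch below shows one way that consistency could be checked. It assumes the Plano config has already been parsed into a plain dict; the helper names ``plano_model_id`` and ``config_matches_server`` are hypothetical and are not part of Plano or vLLM.

```python
def plano_model_id(served_model_name: str, provider_prefix: str = "plano") -> str:
    """Build the model id used in the Plano config from vLLM's --served-model-name.

    Assumption: the id is simply '<provider_prefix>/<served_model_name>',
    which is the pattern visible in this commit's changes.
    """
    return f"{provider_prefix}/{served_model_name}"


def config_matches_server(config: dict, served_model_name: str) -> bool:
    """Return True if both config entries point at the served model name."""
    expected = plano_model_id(served_model_name)
    override = config.get("overrides", {}).get("orchestrator_model")
    provider_models = [p.get("model") for p in config.get("model_providers", [])]
    return override == expected and expected in provider_models


# Example config mirroring the values after this commit.
example_config = {
    "overrides": {"orchestrator_model": "plano/katanemo/Plano-Orchestrator-4B"},
    "model_providers": [
        {
            "model": "plano/katanemo/Plano-Orchestrator-4B",
            "base_url": "http://localhost:8000",
        },
    ],
}

print(config_matches_server(example_config, "katanemo/Plano-Orchestrator-4B"))  # True
```

With the pre-commit served name ``Plano-Orchestrator``, the same config would fail this check, which is the mismatch the commit removes.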