update orchestrator model name

Adil Hafeez 2026-03-11 16:46:03 -07:00
parent 48bf83fa0d
commit 680dee60a0
3 changed files with 9 additions and 7 deletions


@@ -141,7 +141,7 @@ vllm serve katanemo/Plano-Orchestrator-4B \
     --gpu-memory-utilization 0.3 \
     --tokenizer katanemo/Plano-Orchestrator-4B \
     --chat-template chat_template.jinja \
-    --served-model-name Plano-Orchestrator \
+    --served-model-name katanemo/Plano-Orchestrator-4B \
     --enable-prefix-caching
 ```


@@ -1,7 +1,7 @@
 version: v0.3.0
 overrides:
-  orchestrator_model: plano/Plano-Orchestrator
+  orchestrator_model: plano/katanemo/Plano-Orchestrator-4B
 agents:
   - id: weather_agent
@@ -10,7 +10,7 @@ agents:
     url: http://localhost:10520
 model_providers:
-  - model: plano/Plano-Orchestrator
+  - model: plano/katanemo/Plano-Orchestrator-4B
     base_url: http://localhost:8000
   - model: openai/gpt-5.2


@@ -379,7 +379,7 @@ Using vLLM
     --gpu-memory-utilization 0.3 \
     --tokenizer katanemo/Plano-Orchestrator-4B \
     --chat-template chat_template.jinja \
-    --served-model-name Plano-Orchestrator \
+    --served-model-name katanemo/Plano-Orchestrator-4B \
     --enable-prefix-caching
 For the 30B-A3B-FP8 model (production):
@@ -394,18 +394,20 @@ Using vLLM
     --tokenizer katanemo/Plano-Orchestrator-30B-A3B-FP8 \
     --chat-template chat_template.jinja \
     --max-model-len 32768 \
-    --served-model-name Plano-Orchestrator \
+    --served-model-name katanemo/Plano-Orchestrator-30B-A3B-FP8 \
     --enable-prefix-caching
 4. **Configure Plano to use the local orchestrator**
+   Use the model name matching your ``--served-model-name``:
    .. code-block:: yaml
       overrides:
-        orchestrator_model: plano/Plano-Orchestrator
+        orchestrator_model: plano/katanemo/Plano-Orchestrator-4B
       model_providers:
-        - model: plano/Plano-Orchestrator
+        - model: plano/katanemo/Plano-Orchestrator-4B
           base_url: http://<your-server-ip>:8000
 5. **Verify the server is running**
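
Review note: the change relies on the configured model name, minus its `plano/` prefix, being identical to the value passed to vLLM's `--served-model-name`. A minimal sketch of that check follows. It assumes the `plano/` prefix is only a provider namespace and that the remainder is sent to the server unchanged; this is inferred from the diff, not confirmed Plano behavior. The helper names `served_name_from_provider` and `names_match` are hypothetical and exist only for illustration.

```python
def served_name_from_provider(model: str, provider_prefix: str = "plano/") -> str:
    """Return the model name with the provider prefix removed.

    Assumption: the prefix is a namespace only and is not part of the
    name the vLLM server was started with.
    """
    if not model.startswith(provider_prefix):
        raise ValueError(f"expected {provider_prefix!r} prefix in {model!r}")
    return model[len(provider_prefix):]


def names_match(config_model: str, served_model_name: str) -> bool:
    """True when the config entry and --served-model-name agree."""
    return served_name_from_provider(config_model) == served_model_name


if __name__ == "__main__":
    config_model = "plano/katanemo/Plano-Orchestrator-4B"
    # Name used after this commit.
    print(names_match(config_model, "katanemo/Plano-Orchestrator-4B"))
    # Name used before this commit; no longer matches the new config value.
    print(names_match(config_model, "Plano-Orchestrator"))
```

Under these assumptions, a 30B-A3B-FP8 deployment would need `plano/katanemo/Plano-Orchestrator-30B-A3B-FP8` in the config rather than the 4B name shown in the documentation example.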