update orchestrator model name

2026-06-17 15:25:17 +02:00 · 2026-03-11 16:46:03 -07:00 · 2026-03-11 16:46:03 -07:00 · 680dee60a0
commit 680dee60a0
parent 48bf83fa0d
3 changed files with 9 additions and 7 deletions
--- a/demos/agent_orchestration/travel_agents/README.md
+++ b/demos/agent_orchestration/travel_agents/README.md
@ -141,7 +141,7 @@ vllm serve katanemo/Plano-Orchestrator-4B \
    --gpu-memory-utilization 0.3 \
    --tokenizer katanemo/Plano-Orchestrator-4B \
    --chat-template chat_template.jinja \
-    --served-model-name Plano-Orchestrator \
+    --served-model-name katanemo/Plano-Orchestrator-4B \
    --enable-prefix-caching
 ```

--- a/demos/agent_orchestration/travel_agents/config_local_orchestrator.yaml
+++ b/demos/agent_orchestration/travel_agents/config_local_orchestrator.yaml
@ -1,7 +1,7 @@
 version: v0.3.0

 overrides:
-  orchestrator_model: plano/Plano-Orchestrator
+  orchestrator_model: plano/katanemo/Plano-Orchestrator-4B

 agents:
  - id: weather_agent
@ -10,7 +10,7 @@ agents:
    url: http://localhost:10520

 model_providers:
-  - model: plano/Plano-Orchestrator
+  - model: plano/katanemo/Plano-Orchestrator-4B
    base_url: http://localhost:8000

  - model: openai/gpt-5.2
--- a/docs/source/guides/orchestration.rst
+++ b/docs/source/guides/orchestration.rst
@ -379,7 +379,7 @@ Using vLLM
           --gpu-memory-utilization 0.3 \
           --tokenizer katanemo/Plano-Orchestrator-4B \
           --chat-template chat_template.jinja \
-           --served-model-name Plano-Orchestrator \
+           --served-model-name katanemo/Plano-Orchestrator-4B \
           --enable-prefix-caching

   For the 30B-A3B-FP8 model (production):
@ -394,18 +394,20 @@ Using vLLM
           --tokenizer katanemo/Plano-Orchestrator-30B-A3B-FP8 \
           --chat-template chat_template.jinja \
           --max-model-len 32768 \
-           --served-model-name Plano-Orchestrator \
+           --served-model-name katanemo/Plano-Orchestrator-30B-A3B-FP8 \
           --enable-prefix-caching

 4. **Configure Plano to use the local orchestrator**

+   Use the model name matching your ``--served-model-name``:
+
   .. code-block:: yaml

       overrides:
-         orchestrator_model: plano/Plano-Orchestrator
+         orchestrator_model: plano/katanemo/Plano-Orchestrator-4B

       model_providers:
-         - model: plano/Plano-Orchestrator
+         - model: plano/katanemo/Plano-Orchestrator-4B
           base_url: http://<your-server-ip>:8000

 5. **Verify the server is running**