Mirror of https://github.com/katanemo/plano.git, synced 2026-04-25 00:36:34 +02:00
update orchestrator model name
parent 48bf83fa0d
commit 680dee60a0
3 changed files with 9 additions and 7 deletions

@@ -141,7 +141,7 @@ vllm serve katanemo/Plano-Orchestrator-4B \
     --gpu-memory-utilization 0.3 \
     --tokenizer katanemo/Plano-Orchestrator-4B \
     --chat-template chat_template.jinja \
-    --served-model-name Plano-Orchestrator \
+    --served-model-name katanemo/Plano-Orchestrator-4B \
     --enable-prefix-caching
 ```
 

@@ -1,7 +1,7 @@
 version: v0.3.0
 
 overrides:
-  orchestrator_model: plano/Plano-Orchestrator
+  orchestrator_model: plano/katanemo/Plano-Orchestrator-4B
 
 agents:
   - id: weather_agent
@@ -10,7 +10,7 @@ agents:
     url: http://localhost:10520
 
 model_providers:
-  - model: plano/Plano-Orchestrator
+  - model: plano/katanemo/Plano-Orchestrator-4B
     base_url: http://localhost:8000
 
   - model: openai/gpt-5.2

@@ -379,7 +379,7 @@ Using vLLM
          --gpu-memory-utilization 0.3 \
          --tokenizer katanemo/Plano-Orchestrator-4B \
          --chat-template chat_template.jinja \
-         --served-model-name Plano-Orchestrator \
+         --served-model-name katanemo/Plano-Orchestrator-4B \
          --enable-prefix-caching
 
    For the 30B-A3B-FP8 model (production):
@@ -394,18 +394,20 @@ Using vLLM
          --tokenizer katanemo/Plano-Orchestrator-30B-A3B-FP8 \
          --chat-template chat_template.jinja \
          --max-model-len 32768 \
-         --served-model-name Plano-Orchestrator \
+         --served-model-name katanemo/Plano-Orchestrator-30B-A3B-FP8 \
          --enable-prefix-caching
 
 4. **Configure Plano to use the local orchestrator**
 
+   Use the model name matching your ``--served-model-name``:
+
    .. code-block:: yaml
 
       overrides:
-        orchestrator_model: plano/Plano-Orchestrator
+        orchestrator_model: plano/katanemo/Plano-Orchestrator-4B
 
       model_providers:
-        - model: plano/Plano-Orchestrator
+        - model: plano/katanemo/Plano-Orchestrator-4B
           base_url: http://<your-server-ip>:8000
 
 5. **Verify the server is running**
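
Note on the change: every hunk pairs a vLLM ``--served-model-name`` value with a Plano config entry of the form ``plano/<served-model-name>``. The sketch below checks that pairing against a running server. It is an illustration under assumptions, not part of this commit: the helper names are hypothetical, the ``plano/`` prefix rule is inferred only from the values in this diff, and the check relies on vLLM's OpenAI-compatible ``/v1/models`` endpoint listing the served name as the model ``id``.

```python
import json
import urllib.request

# Assumption: Plano addresses the local orchestrator as "plano/" + served name,
# as the config values in this diff suggest.
PLANO_PREFIX = "plano/"


def served_name_from_plano_model(plano_model: str) -> str:
    """Strip the 'plano/' prefix to get the name vLLM is expected to serve."""
    if not plano_model.startswith(PLANO_PREFIX):
        raise ValueError(f"expected a '{PLANO_PREFIX}' prefix, got {plano_model!r}")
    return plano_model[len(PLANO_PREFIX):]


def served_names(models_response: dict) -> list[str]:
    """Collect model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in models_response.get("data", [])]


def fetch_models(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/v1/models; requires a reachable vLLM server."""
    url = f"{base_url.rstrip('/')}/v1/models"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def is_served(plano_model: str, models_response: dict) -> bool:
    """True if the config's model name maps to a name the server reports."""
    return served_name_from_plano_model(plano_model) in served_names(models_response)
```

With a server started as in the hunks above, ``is_served("plano/katanemo/Plano-Orchestrator-4B", fetch_models("http://localhost:8000"))`` would be expected to return ``True``, while the old value ``plano/Plano-Orchestrator`` would no longer match.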