mirror of
https://github.com/katanemo/plano.git
synced 2026-05-09 15:52:44 +02:00
deploy: 897fda2deb
parent 5ede678869
commit 805883eadb
6 changed files with 547 additions and 393 deletions
|
|
@ -1,6 +1,6 @@
|
|||
Plano Docs v0.4.20
|
||||
llms.txt (auto-generated)
|
||||
Generated (UTC): 2026-04-24T19:31:46.972805+00:00
|
||||
Generated (UTC): 2026-04-24T19:32:12.216149+00:00
|
||||
|
||||
Table of contents
|
||||
- Agents (concepts/agents)
|
||||
|
|
@ -1381,7 +1381,9 @@ Complex agents and coding
|
|||
|
||||
Configuration Examples:
|
||||
|
||||
llm_providers:
|
||||
version: v0.4.0
|
||||
|
||||
model_providers:
|
||||
# Configure all Anthropic models with wildcard
|
||||
- model: anthropic/*
|
||||
access_key: $ANTHROPIC_API_KEY
|
||||
|
|
@ -1402,8 +1404,12 @@ llm_providers:
|
|||
|
||||
- model: anthropic/claude-sonnet-4-20250514
|
||||
access_key: $ANTHROPIC_PROD_API_KEY
|
||||
routing_preferences:
|
||||
- name: code_generation
|
||||
|
||||
routing_preferences:
|
||||
- name: code_generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-20250514
|
||||
|
||||
DeepSeek
|
||||
|
||||
|
|
@ -2084,7 +2090,9 @@ Overriding Wildcard Models:
|
|||
|
||||
You can configure specific models with custom settings even when using wildcards. Specific configurations take precedence and are excluded from wildcard expansion:
|
||||
|
||||
llm_providers:
|
||||
version: v0.4.0
|
||||
|
||||
model_providers:
|
||||
# Expand to all Anthropic models
|
||||
- model: anthropic/*
|
||||
access_key: $ANTHROPIC_API_KEY
|
||||
|
|
@ -2093,14 +2101,17 @@ llm_providers:
|
|||
# This model will NOT be included in the wildcard expansion above
|
||||
- model: anthropic/claude-sonnet-4-20250514
|
||||
access_key: $ANTHROPIC_PROD_API_KEY
|
||||
routing_preferences:
|
||||
- name: code_generation
|
||||
priority: 1
|
||||
|
||||
# Another specific override
|
||||
- model: anthropic/claude-3-haiku-20240307
|
||||
access_key: $ANTHROPIC_DEV_API_KEY
|
||||
|
||||
routing_preferences:
|
||||
- name: code_generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-20250514
|
||||
|
||||
Custom Provider Wildcards:
|
||||
|
||||
For providers not in Plano’s registry, wildcards enable dynamic model routing:
|
||||
|
|
@ -2139,22 +2150,33 @@ llm_providers:
|
|||
|
||||
Routing Preferences
|
||||
|
||||
Configure routing preferences for dynamic model selection:
|
||||
Starting in v0.4.0, configure routing preferences at the top level of the config. Each preference declares an ordered models candidate pool; the first entry is primary and the rest are fallbacks the client tries on 429/5xx errors. Multiple providers can serve the same route — just list them all under models. See /guides/llm_router for the full routing model.
|
||||
|
||||
llm_providers:
|
||||
version: v0.4.0
|
||||
|
||||
model_providers:
|
||||
- model: openai/gpt-5.2
|
||||
access_key: $OPENAI_API_KEY
|
||||
routing_preferences:
|
||||
- name: complex_reasoning
|
||||
description: deep analysis, mathematical problem solving, and logical reasoning
|
||||
- name: code_review
|
||||
description: reviewing and analyzing existing code for bugs and improvements
|
||||
|
||||
- model: anthropic/claude-sonnet-4-5
|
||||
access_key: $ANTHROPIC_API_KEY
|
||||
routing_preferences:
|
||||
- name: creative_writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
|
||||
routing_preferences:
|
||||
- name: complex_reasoning
|
||||
description: deep analysis, mathematical problem solving, and logical reasoning
|
||||
models:
|
||||
- openai/gpt-5.2
|
||||
- anthropic/claude-sonnet-4-5
|
||||
- name: code_review
|
||||
description: reviewing and analyzing existing code for bugs and improvements
|
||||
models:
|
||||
- openai/gpt-5.2
|
||||
- name: creative_writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-5
|
||||
|
||||
v0.3.0 configs that declare routing_preferences inline under each model_provider are auto-migrated to this top-level shape by the Plano CLI at compile time, with a deprecation warning. Update to the form above to silence the warning and gain the multi-model fallback behavior.
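The shape change described above can be illustrated with a small Python sketch. This is not the Plano CLI's migration code: the function name `migrate_inline_preferences` and the merge rule (models are appended to a route in the order their providers appear) are assumptions made for illustration only. It operates on an already-parsed config dictionary.

```python
from copy import deepcopy


def migrate_inline_preferences(config: dict) -> dict:
    """Hoist inline routing_preferences into a top-level list.

    Hypothetical helper for illustration; the real CLI may merge differently.
    """
    migrated = deepcopy(config)  # leave the caller's config untouched
    routes = {}  # preference name -> route entry, kept in first-seen order

    for provider in migrated.get("model_providers", []):
        # Remove the inline block from the provider while reading it.
        for pref in provider.pop("routing_preferences", []):
            route = routes.setdefault(
                pref["name"],
                {
                    "name": pref["name"],
                    "description": pref.get("description", ""),
                    "models": [],
                },
            )
            # Assumption: provider order decides primary vs. fallback.
            if provider["model"] not in route["models"]:
                route["models"].append(provider["model"])

    if routes:
        migrated.setdefault("routing_preferences", []).extend(routes.values())
    return migrated
```

If two providers declare the same preference name, this sketch places both in one `models` pool, which is the multi-model form that the top-level layout makes possible.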
|
||||
|
||||
|
||||
|
||||
|
|
@ -4179,37 +4201,51 @@ Plano-Orchestrator analyzes each prompt to infer domain and action, then applies
|
|||
|
||||
Configuration
|
||||
|
||||
To configure preference-aligned dynamic routing, define routing preferences that map domains and actions to specific models:
|
||||
To configure preference-aligned dynamic routing, declare a top-level routing_preferences list and attach an ordered models candidate pool to each route. Starting in v0.4.0, routing_preferences lives at the root of the config (not inline under model_providers), which lets multiple models serve the same route — the first entry in models is primary, the rest are fallbacks that the client tries on 429/5xx errors.
|
||||
|
||||
Preference-Aligned Dynamic Routing Configuration
|
||||
|
||||
version: v0.4.0
|
||||
|
||||
listeners:
|
||||
egress_traffic:
|
||||
- name: egress_traffic
|
||||
type: model
|
||||
address: 0.0.0.0
|
||||
port: 12000
|
||||
message_format: openai
|
||||
timeout: 30s
|
||||
|
||||
llm_providers:
|
||||
model_providers:
|
||||
- model: openai/gpt-5.2
|
||||
access_key: $OPENAI_API_KEY
|
||||
default: true
|
||||
|
||||
- model: openai/gpt-5
|
||||
access_key: $OPENAI_API_KEY
|
||||
routing_preferences:
|
||||
- name: code understanding
|
||||
description: understand and explain existing code snippets, functions, or libraries
|
||||
- name: complex reasoning
|
||||
description: deep analysis, mathematical problem solving, and logical reasoning
|
||||
|
||||
- model: anthropic/claude-sonnet-4-5
|
||||
access_key: $ANTHROPIC_API_KEY
|
||||
routing_preferences:
|
||||
- name: creative writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
- name: code generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts
|
||||
|
||||
routing_preferences:
|
||||
- name: code understanding
|
||||
description: understand and explain existing code snippets, functions, or libraries
|
||||
models:
|
||||
- openai/gpt-5
|
||||
- anthropic/claude-sonnet-4-5
|
||||
- name: complex reasoning
|
||||
description: deep analysis, mathematical problem solving, and logical reasoning
|
||||
models:
|
||||
- openai/gpt-5
|
||||
- name: creative writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-5
|
||||
- name: code generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-5
|
||||
- openai/gpt-5
|
||||
|
||||
Configs still using the v0.3.0 inline style (routing_preferences nested under each model_provider) are auto-migrated to this top-level shape by the Plano CLI at compile time, with a deprecation warning. Update your config to the form above to silence the warning.
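The primary-then-fallback behavior described in this section could look roughly like the following on the client side. This is a sketch under stated assumptions, not Plano's client code: `complete_with_fallback` and the injected `send` callable are hypothetical names, and `send` is assumed to return an `(http_status, body)` pair for a given model name.

```python
from typing import Callable, Sequence, Tuple


def is_retryable(status: int) -> bool:
    """Treat 429 and any 5xx response as a reason to try the next model."""
    return status == 429 or 500 <= status <= 599


def complete_with_fallback(
    models: Sequence[str],
    send: Callable[[str], Tuple[int, str]],
) -> Tuple[str, int, str]:
    """Try models in order; return (model, status, body) of the first
    non-retryable response, or of the last attempt if every model failed.

    `send` is a hypothetical callable supplied by the caller.
    """
    if not models:
        raise ValueError("models must contain at least one entry")

    result = None
    for model in models:  # models[0] is primary, the rest are fallbacks
        status, body = send(model)
        result = (model, status, body)
        if not is_retryable(status):
            break  # success or a client error: do not fall through
    return result
```

Injecting `send` keeps the sketch independent of any HTTP library. Note that a 4xx response other than 429 is returned as-is, because retrying the same request against another model would not fix a malformed request.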
|
||||
|
||||
Client usage
|
||||
|
||||
|
|
@ -4273,6 +4309,8 @@ This downloads the quantized GGUF model from HuggingFace and starts serving on h
|
|||
|
||||
Configure Plano to use local routing model
|
||||
|
||||
version: v0.4.0
|
||||
|
||||
overrides:
|
||||
llm_routing_model: plano/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
|
||||
|
||||
|
|
@ -4286,9 +4324,12 @@ model_providers:
|
|||
|
||||
- model: anthropic/claude-sonnet-4-5
|
||||
access_key: $ANTHROPIC_API_KEY
|
||||
routing_preferences:
|
||||
- name: creative writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
|
||||
routing_preferences:
|
||||
- name: creative writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-5
|
||||
|
||||
Verify the model is running
|
||||
|
||||
|
|
@ -4331,6 +4372,8 @@ vllm serve ${SNAPSHOT_DIR}Arch-Router-1.5B-Q4_K_M.gguf \
|
|||
|
||||
Configure Plano to use the vLLM endpoint
|
||||
|
||||
version: v0.4.0
|
||||
|
||||
overrides:
|
||||
llm_routing_model: plano/Plano-Orchestrator
|
||||
|
||||
|
|
@ -4344,9 +4387,12 @@ model_providers:
|
|||
|
||||
- model: anthropic/claude-sonnet-4-5
|
||||
access_key: $ANTHROPIC_API_KEY
|
||||
routing_preferences:
|
||||
- name: creative writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
|
||||
routing_preferences:
|
||||
- name: creative writing
|
||||
description: creative content generation, storytelling, and writing assistance
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-5
|
||||
|
||||
Verify the server is running
|
||||
|
||||
|
|
@ -4460,22 +4506,30 @@ You can combine static model selection with dynamic routing preferences for maxi
|
|||
|
||||
Hybrid Routing Configuration
|
||||
|
||||
llm_providers:
|
||||
version: v0.4.0
|
||||
|
||||
model_providers:
|
||||
- model: openai/gpt-5.2
|
||||
access_key: $OPENAI_API_KEY
|
||||
default: true
|
||||
|
||||
- model: openai/gpt-5
|
||||
access_key: $OPENAI_API_KEY
|
||||
routing_preferences:
|
||||
- name: complex_reasoning
|
||||
description: deep analysis and complex problem solving
|
||||
|
||||
- model: anthropic/claude-sonnet-4-5
|
||||
access_key: $ANTHROPIC_API_KEY
|
||||
routing_preferences:
|
||||
- name: creative_tasks
|
||||
description: creative writing and content generation
|
||||
|
||||
routing_preferences:
|
||||
- name: complex_reasoning
|
||||
description: deep analysis and complex problem solving
|
||||
models:
|
||||
- openai/gpt-5
|
||||
- anthropic/claude-sonnet-4-5
|
||||
- name: creative_tasks
|
||||
description: creative writing and content generation
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-5
|
||||
- openai/gpt-5
|
||||
|
||||
model_aliases:
|
||||
# Model aliases - friendly names that map to actual provider names
|
||||
|
|
@ -6895,7 +6949,7 @@ where prompts get routed to, apply guardrails, and enable critical agent observa
|
|||
Plano Configuration - Full Reference
|
||||
|
||||
# Plano Gateway configuration version
|
||||
version: v0.3.0
|
||||
version: v0.4.0
|
||||
|
||||
# External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)
|
||||
agents:
|
||||
|
|
@ -6928,17 +6982,8 @@ model_providers:
|
|||
- model: mistral/ministral-3b-latest
|
||||
access_key: $MISTRAL_API_KEY
|
||||
|
||||
# routing_preferences: tags a model with named capabilities so Plano's LLM router
|
||||
# can select the best model for each request based on intent. Requires the
|
||||
# Plano-Orchestrator model (or equivalent) to be configured in overrides.llm_routing_model.
|
||||
# Each preference has a name (short label) and a description (used for intent matching).
|
||||
- model: groq/llama-3.3-70b-versatile
|
||||
access_key: $GROQ_API_KEY
|
||||
routing_preferences:
|
||||
- name: code generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
|
||||
- name: code review
|
||||
description: reviewing, analyzing, and suggesting improvements to existing code
|
||||
|
||||
# passthrough_auth: forwards the client's Authorization header upstream instead of
|
||||
# using the configured access_key. Useful for LiteLLM or similar proxy setups.
|
||||
|
|
@ -6960,6 +7005,29 @@ model_aliases:
|
|||
smart-llm:
|
||||
target: gpt-4o
|
||||
|
||||
# routing_preferences: top-level list that tags named task categories with an
|
||||
# ordered pool of candidate models. Plano's LLM router matches incoming requests
|
||||
# against these descriptions and returns an ordered list of models; the client
|
||||
# uses models[0] as primary and retries with models[1], models[2]... on 429/5xx.
|
||||
# Requires overrides.llm_routing_model to point at Plano-Orchestrator (or equivalent).
|
||||
# Each model in `models` must be declared in model_providers above.
|
||||
# selection_policy is optional: {prefer: cheapest|fastest|none} lets the router
|
||||
# reorder candidates using live cost/latency data from model_metrics_sources.
|
||||
routing_preferences:
|
||||
- name: code generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-0
|
||||
- openai/gpt-4o
|
||||
- groq/llama-3.3-70b-versatile
|
||||
- name: code review
|
||||
description: reviewing, analyzing, and suggesting improvements to existing code
|
||||
models:
|
||||
- anthropic/claude-sonnet-4-0
|
||||
- groq/llama-3.3-70b-versatile
|
||||
selection_policy:
|
||||
prefer: cheapest
|
||||
|
||||
# HTTP listeners - entry points for agent routing, prompt targets, and direct LLM access
|
||||
listeners:
|
||||
# Agent listener for routing requests to multiple agents
|
||||
|
|
|
|||
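The `selection_policy` comment in the full reference says the router can reorder candidates by cost or latency. A minimal sketch of that idea follows, assuming the metrics arrive as plain dictionaries keyed by model name. The function `reorder_candidates`, the metric dictionaries, and the tie-breaking rule are all hypothetical; the source does not describe how Plano implements this.

```python
from typing import Dict, List, Optional


def reorder_candidates(
    models: List[str],
    prefer: str,
    cost: Optional[Dict[str, float]] = None,
    latency: Optional[Dict[str, float]] = None,
) -> List[str]:
    """Return the candidate pool reordered according to `prefer`.

    Hypothetical illustration of prefer: cheapest | fastest | none.
    """
    if prefer == "none":
        return list(models)  # keep the configured order
    if prefer == "cheapest":
        metric = cost or {}
    elif prefer == "fastest":
        metric = latency or {}
    else:
        raise ValueError(f"unknown selection policy: {prefer!r}")

    # sorted() is stable, so models with equal or missing metrics keep
    # their configured relative order; missing metrics sort last.
    return sorted(models, key=lambda m: metric.get(m, float("inf")))


# Made-up cost figures, for illustration only.
pool = ["anthropic/claude-sonnet-4-0", "groq/llama-3.3-70b-versatile"]
example_cost = {
    "anthropic/claude-sonnet-4-0": 3.0,
    "groq/llama-3.3-70b-versatile": 0.6,
}
cheapest_first = reorder_candidates(pool, "cheapest", cost=example_cost)
```

With these invented numbers, `cheapest_first` starts with `groq/llama-3.3-70b-versatile`, so the client would treat that model as primary for the route.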