This commit is contained in:
Spherrrical 2026-04-24 19:32:15 +00:00
parent 5ede678869
commit 805883eadb
6 changed files with 547 additions and 393 deletions

@ -1,6 +1,6 @@
Plano Docs v0.4.20
llms.txt (auto-generated)
Generated (UTC): 2026-04-24T19:31:46.972805+00:00
Generated (UTC): 2026-04-24T19:32:12.216149+00:00
Table of contents
- Agents (concepts/agents)
@ -1381,7 +1381,9 @@ Complex agents and coding
Configuration Examples:
llm_providers:
version: v0.4.0
model_providers:
  # Configure all Anthropic models with wildcard
  - model: anthropic/*
    access_key: $ANTHROPIC_API_KEY
@ -1402,8 +1404,12 @@ llm_providers:
  - model: anthropic/claude-sonnet-4-20250514
    access_key: $ANTHROPIC_PROD_API_KEY
    routing_preferences:
      - name: code_generation
routing_preferences:
  - name: code_generation
    description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
    models:
      - anthropic/claude-sonnet-4-20250514
DeepSeek
@ -2084,7 +2090,9 @@ Overriding Wildcard Models:
You can configure specific models with custom settings even when using wildcards. Specific configurations take precedence and are excluded from wildcard expansion:
llm_providers:
version: v0.4.0
model_providers:
  # Expand to all Anthropic models
  - model: anthropic/*
    access_key: $ANTHROPIC_API_KEY
@ -2093,14 +2101,17 @@ llm_providers:
  # This model will NOT be included in the wildcard expansion above
  - model: anthropic/claude-sonnet-4-20250514
    access_key: $ANTHROPIC_PROD_API_KEY
    routing_preferences:
      - name: code_generation
        priority: 1
  # Another specific override
  - model: anthropic/claude-3-haiku-20240307
    access_key: $ANTHROPIC_DEV_API_KEY
routing_preferences:
  - name: code_generation
    description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
    models:
      - anthropic/claude-sonnet-4-20250514
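The precedence rule above can be sketched in Python (illustrative only; the registry lookup and dict shape are assumptions for this sketch, not Plano's actual implementation):

```python
def expand_wildcards(model_providers, registry):
    """Expand `provider/*` entries against a list of known model names.

    Explicitly configured models shadow the wildcard: they keep their own
    settings (e.g. a different access_key) and are excluded from expansion.
    """
    explicit = {p["model"] for p in model_providers if not p["model"].endswith("/*")}
    expanded = []
    for p in model_providers:
        if p["model"].endswith("/*"):
            prefix = p["model"][:-1]  # "anthropic/*" -> "anthropic/"
            for name in registry:
                if name.startswith(prefix) and name not in explicit:
                    expanded.append({**p, "model": name})
        else:
            expanded.append(p)  # specific config passes through unchanged
    return expanded
```

So with a wildcard plus a specific `anthropic/claude-sonnet-4-20250514` entry, the specific model appears exactly once, with its own key.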
Custom Provider Wildcards:
For providers not in Plano's registry, wildcards enable dynamic model routing:
@ -2139,22 +2150,33 @@ llm_providers:
Routing Preferences
Configure routing preferences for dynamic model selection:
Starting in v0.4.0, configure routing preferences at the top level of the config. Each preference declares an ordered candidate pool under models; the first entry is the primary, and the rest are fallbacks the client tries on 429/5xx errors. Multiple providers can serve the same route: just list them all under models. See /guides/llm_router for the full routing model.
llm_providers:
version: v0.4.0
model_providers:
  - model: openai/gpt-5.2
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: complex_reasoning
        description: deep analysis, mathematical problem solving, and logical reasoning
      - name: code_review
        description: reviewing and analyzing existing code for bugs and improvements
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative_writing
        description: creative content generation, storytelling, and writing assistance
routing_preferences:
  - name: complex_reasoning
    description: deep analysis, mathematical problem solving, and logical reasoning
    models:
      - openai/gpt-5.2
      - anthropic/claude-sonnet-4-5
  - name: code_review
    description: reviewing and analyzing existing code for bugs and improvements
    models:
      - openai/gpt-5.2
  - name: creative_writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
v0.3.0 configs that declare routing_preferences inline under each model_provider are auto-migrated to this top-level shape by the Plano CLI at compile time, with a deprecation warning. Update to the form above to silence the warning and gain the multi-model fallback behavior.
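The client-side fallback contract described above can be sketched as follows (a minimal illustration; `send` stands in for an HTTP call to the gateway and is not a Plano API):

```python
def is_retryable(status: int) -> bool:
    """429 and 5xx responses trigger fallback to the next candidate."""
    return status == 429 or 500 <= status <= 599

def complete_with_fallback(candidates, send):
    """Walk a routing preference's ordered `models` pool.

    candidates[0] is the primary; later entries are fallbacks tried on
    429/5xx. `send(model)` must return a (status, body) tuple.
    """
    last = None
    for model in candidates:
        status, body = send(model)
        if not is_retryable(status):
            return model, status, body  # success or a hard (non-retryable) error
        last = (model, status, body)  # remember the failure, try the next
    return last  # every candidate was rate-limited or erroring
```

With the complex_reasoning pool above, a 429 from openai/gpt-5.2 falls through to anthropic/claude-sonnet-4-5.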
@ -4179,37 +4201,51 @@ Plano-Orchestrator analyzes each prompt to infer domain and action, then applies
Configuration
To configure preference-aligned dynamic routing, define routing preferences that map domains and actions to specific models:
To configure preference-aligned dynamic routing, declare a top-level routing_preferences list and attach an ordered models candidate pool to each route. Starting in v0.4.0, routing_preferences lives at the root of the config (not inline under model_providers), which lets multiple models serve the same route — the first entry in models is primary, the rest are fallbacks that the client tries on 429/5xx errors.
Preference-Aligned Dynamic Routing Configuration
version: v0.4.0
listeners:
  egress_traffic:
    - name: egress_traffic
      type: model
      address: 0.0.0.0
      port: 12000
      message_format: openai
      timeout: 30s
llm_providers:
model_providers:
  - model: openai/gpt-5.2
    access_key: $OPENAI_API_KEY
    default: true
  - model: openai/gpt-5
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
      - name: complex reasoning
        description: deep analysis, mathematical problem solving, and logical reasoning
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative writing
        description: creative content generation, storytelling, and writing assistance
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts
routing_preferences:
  - name: code understanding
    description: understand and explain existing code snippets, functions, or libraries
    models:
      - openai/gpt-5
      - anthropic/claude-sonnet-4-5
  - name: complex reasoning
    description: deep analysis, mathematical problem solving, and logical reasoning
    models:
      - openai/gpt-5
  - name: creative writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
  - name: code generation
    description: generating new code snippets, functions, or boilerplate based on user prompts
    models:
      - anthropic/claude-sonnet-4-5
      - openai/gpt-5
Configs still using the v0.3.0 inline style (routing_preferences nested under each model_provider) are auto-migrated to this top-level shape by the Plano CLI at compile time, with a deprecation warning. Update your config to the form above to silence the warning.
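The auto-migration the CLI performs can be approximated like this (a sketch over plain dicts, assuming the config has already been parsed from YAML; not the actual Plano CLI code):

```python
def migrate_inline_preferences(config):
    """Hoist each model_provider's inline routing_preferences (v0.3.0 style)
    into a top-level list (v0.4.0 style), merging duplicate preference names
    into one entry with an ordered `models` candidate pool."""
    merged = {}
    for provider in config.get("model_providers", []):
        # pop() removes the deprecated inline block from the provider
        for pref in provider.pop("routing_preferences", []):
            entry = merged.setdefault(
                pref["name"],
                {"name": pref["name"], "description": pref.get("description", ""), "models": []},
            )
            if provider["model"] not in entry["models"]:
                entry["models"].append(provider["model"])
    config["routing_preferences"] = list(merged.values())
    return config
```

Providers are visited in declaration order, so a preference shared by several models ends up with a pool ordered the same way the models were declared.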
Client usage
@ -4273,6 +4309,8 @@ This downloads the quantized GGUF model from HuggingFace and starts serving on h
Configure Plano to use local routing model
version: v0.4.0
overrides:
  llm_routing_model: plano/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
@ -4286,9 +4324,12 @@ model_providers:
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative writing
        description: creative content generation, storytelling, and writing assistance
routing_preferences:
  - name: creative writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
Verify the model is running
@ -4331,6 +4372,8 @@ vllm serve ${SNAPSHOT_DIR}Arch-Router-1.5B-Q4_K_M.gguf \
Configure Plano to use the vLLM endpoint
version: v0.4.0
overrides:
  llm_routing_model: plano/Plano-Orchestrator
@ -4344,9 +4387,12 @@ model_providers:
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative writing
        description: creative content generation, storytelling, and writing assistance
routing_preferences:
  - name: creative writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
Verify the server is running
@ -4460,22 +4506,30 @@ You can combine static model selection with dynamic routing preferences for maxi
Hybrid Routing Configuration
llm_providers:
version: v0.4.0
model_providers:
  - model: openai/gpt-5.2
    access_key: $OPENAI_API_KEY
    default: true
  - model: openai/gpt-5
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: complex_reasoning
        description: deep analysis and complex problem solving
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative_tasks
        description: creative writing and content generation
routing_preferences:
  - name: complex_reasoning
    description: deep analysis and complex problem solving
    models:
      - openai/gpt-5
      - anthropic/claude-sonnet-4-5
  - name: creative_tasks
    description: creative writing and content generation
    models:
      - anthropic/claude-sonnet-4-5
      - openai/gpt-5
model_aliases:
# Model aliases - friendly names that map to actual provider names
@ -6895,7 +6949,7 @@ where prompts get routed to, apply guardrails, and enable critical agent observa
Plano Configuration - Full Reference
# Plano Gateway configuration version
version: v0.3.0
version: v0.4.0
# External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)
agents:
@ -6928,17 +6982,8 @@ model_providers:
  - model: mistral/ministral-3b-latest
    access_key: $MISTRAL_API_KEY
  # routing_preferences: tags a model with named capabilities so Plano's LLM router
  # can select the best model for each request based on intent. Requires the
  # Plano-Orchestrator model (or equivalent) to be configured in overrides.llm_routing_model.
  # Each preference has a name (short label) and a description (used for intent matching).
  - model: groq/llama-3.3-70b-versatile
    access_key: $GROQ_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
      - name: code review
        description: reviewing, analyzing, and suggesting improvements to existing code
  # passthrough_auth: forwards the client's Authorization header upstream instead of
  # using the configured access_key. Useful for LiteLLM or similar proxy setups.
@ -6960,6 +7005,29 @@ model_aliases:
  smart-llm:
    target: gpt-4o
# routing_preferences: top-level list that tags named task categories with an
# ordered pool of candidate models. Plano's LLM router matches incoming requests
# against these descriptions and returns an ordered list of models; the client
# uses models[0] as primary and retries with models[1], models[2]... on 429/5xx.
# Requires overrides.llm_routing_model to point at Plano-Orchestrator (or equivalent).
# Each model in `models` must be declared in model_providers above.
# selection_policy is optional: {prefer: cheapest|fastest|none} lets the router
# reorder candidates using live cost/latency data from model_metrics_sources.
routing_preferences:
  - name: code generation
    description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
    models:
      - anthropic/claude-sonnet-4-0
      - openai/gpt-4o
      - groq/llama-3.3-70b-versatile
  - name: code review
    description: reviewing, analyzing, and suggesting improvements to existing code
    models:
      - anthropic/claude-sonnet-4-0
      - groq/llama-3.3-70b-versatile
    selection_policy:
      prefer: cheapest
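The reordering that selection_policy implies can be sketched as follows (the metric field names `cost_per_mtok` and `p50_latency_ms` are invented for illustration and are not Plano's schema; real data comes from model_metrics_sources):

```python
import math

def apply_selection_policy(models, policy, metrics):
    """Reorder a route's candidate pool per selection_policy
    {prefer: cheapest|fastest|none} using live per-model metrics.
    Models missing from `metrics` sort last."""
    prefer = (policy or {}).get("prefer", "none")
    if prefer == "cheapest":
        key = lambda m: metrics.get(m, {}).get("cost_per_mtok", math.inf)
    elif prefer == "fastest":
        key = lambda m: metrics.get(m, {}).get("p50_latency_ms", math.inf)
    else:
        return list(models)  # "none": keep the configured order
    return sorted(models, key=key)  # stable sort preserves config order on ties
```

The client still walks the returned list front to back, so the fallback-on-429/5xx behavior is unchanged; only the ordering of candidates moves with cost or latency.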
# HTTP listeners - entry points for agent routing, prompt targets, and direct LLM access
listeners:
# Agent listener for routing requests to multiple agents