This commit is contained in:
Spherrrical 2026-04-24 19:32:15 +00:00
parent 5ede678869
commit 805883eadb
6 changed files with 547 additions and 393 deletions

@ -1,6 +1,6 @@
Plano Docs v0.4.20
llms.txt (auto-generated)
Generated (UTC): 2026-04-24T19:31:46.972805+00:00
Generated (UTC): 2026-04-24T19:32:12.216149+00:00
Table of contents
- Agents (concepts/agents)
@ -1381,7 +1381,9 @@ Complex agents and coding
Configuration Examples:
llm_providers:
version: v0.4.0
model_providers:
  # Configure all Anthropic models with wildcard
  - model: anthropic/*
    access_key: $ANTHROPIC_API_KEY
@ -1402,8 +1404,12 @@ llm_providers:
  - model: anthropic/claude-sonnet-4-20250514
    access_key: $ANTHROPIC_PROD_API_KEY
    routing_preferences:
      - name: code_generation
routing_preferences:
  - name: code_generation
    description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
    models:
      - anthropic/claude-sonnet-4-20250514
DeepSeek
@ -2084,7 +2090,9 @@ Overriding Wildcard Models:
You can configure specific models with custom settings even when using wildcards. Specific configurations take precedence and are excluded from wildcard expansion:
llm_providers:
version: v0.4.0
model_providers:
  # Expand to all Anthropic models
  - model: anthropic/*
    access_key: $ANTHROPIC_API_KEY
@ -2093,14 +2101,17 @@ llm_providers:
  # This model will NOT be included in the wildcard expansion above
  - model: anthropic/claude-sonnet-4-20250514
    access_key: $ANTHROPIC_PROD_API_KEY
    routing_preferences:
      - name: code_generation
        priority: 1
  # Another specific override
  - model: anthropic/claude-3-haiku-20240307
    access_key: $ANTHROPIC_DEV_API_KEY
routing_preferences:
  - name: code_generation
    description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
    models:
      - anthropic/claude-sonnet-4-20250514
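The precedence rule above can be sketched in Python (illustrative only; the registry lookup and dict shape are assumptions for this sketch, not Plano's actual implementation):

```python
def expand_wildcards(model_providers, registry):
    """Expand `provider/*` entries against a list of known model names.

    Explicitly configured models shadow the wildcard: they keep their own
    settings (e.g. a different access_key) and are excluded from expansion.
    """
    explicit = {p["model"] for p in model_providers if not p["model"].endswith("/*")}
    expanded = []
    for p in model_providers:
        if p["model"].endswith("/*"):
            prefix = p["model"][:-1]  # "anthropic/*" -> "anthropic/"
            for name in registry:
                if name.startswith(prefix) and name not in explicit:
                    expanded.append({**p, "model": name})
        else:
            expanded.append(p)  # specific config passes through unchanged
    return expanded
```

So with a wildcard plus a specific `anthropic/claude-sonnet-4-20250514` entry, the specific model appears exactly once, with its own key.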
Custom Provider Wildcards:
For providers not in Plano's registry, wildcards enable dynamic model routing:
@ -2139,22 +2150,33 @@ llm_providers:
Routing Preferences
Configure routing preferences for dynamic model selection:
Starting in v0.4.0, configure routing preferences at the top level of the config. Each preference declares an ordered candidate pool under models; the first entry is the primary, and the rest are fallbacks the client tries on 429/5xx errors. Multiple providers can serve the same route: just list them all under models. See /guides/llm_router for the full routing model.
llm_providers:
version: v0.4.0
model_providers:
  - model: openai/gpt-5.2
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: complex_reasoning
        description: deep analysis, mathematical problem solving, and logical reasoning
      - name: code_review
        description: reviewing and analyzing existing code for bugs and improvements
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative_writing
        description: creative content generation, storytelling, and writing assistance
routing_preferences:
  - name: complex_reasoning
    description: deep analysis, mathematical problem solving, and logical reasoning
    models:
      - openai/gpt-5.2
      - anthropic/claude-sonnet-4-5
  - name: code_review
    description: reviewing and analyzing existing code for bugs and improvements
    models:
      - openai/gpt-5.2
  - name: creative_writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
v0.3.0 configs that declare routing_preferences inline under each model_provider are auto-migrated to this top-level shape by the Plano CLI at compile time, with a deprecation warning. Update to the form above to silence the warning and gain the multi-model fallback behavior.
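The client-side fallback contract described above can be sketched as follows (a minimal illustration; `send` stands in for an HTTP call to the gateway and is not a Plano API):

```python
def is_retryable(status: int) -> bool:
    """429 and 5xx responses trigger fallback to the next candidate."""
    return status == 429 or 500 <= status <= 599

def complete_with_fallback(candidates, send):
    """Walk a routing preference's ordered `models` pool.

    candidates[0] is the primary; later entries are fallbacks tried on
    429/5xx. `send(model)` must return a (status, body) tuple.
    """
    last = None
    for model in candidates:
        status, body = send(model)
        if not is_retryable(status):
            return model, status, body  # success or a hard (non-retryable) error
        last = (model, status, body)  # remember the failure, try the next
    return last  # every candidate was rate-limited or erroring
```

With the complex_reasoning pool above, a 429 from openai/gpt-5.2 falls through to anthropic/claude-sonnet-4-5.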
@ -4179,37 +4201,51 @@ Plano-Orchestrator analyzes each prompt to infer domain and action, then applies
Configuration
To configure preference-aligned dynamic routing, define routing preferences that map domains and actions to specific models:
To configure preference-aligned dynamic routing, declare a top-level routing_preferences list and attach an ordered models candidate pool to each route. Starting in v0.4.0, routing_preferences lives at the root of the config (not inline under model_providers), which lets multiple models serve the same route — the first entry in models is primary, the rest are fallbacks that the client tries on 429/5xx errors.
Preference-Aligned Dynamic Routing Configuration
version: v0.4.0
listeners:
  egress_traffic:
    - name: egress_traffic
      type: model
      address: 0.0.0.0
      port: 12000
      message_format: openai
      timeout: 30s
llm_providers:
model_providers:
  - model: openai/gpt-5.2
    access_key: $OPENAI_API_KEY
    default: true
  - model: openai/gpt-5
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
      - name: complex reasoning
        description: deep analysis, mathematical problem solving, and logical reasoning
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative writing
        description: creative content generation, storytelling, and writing assistance
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts
routing_preferences:
  - name: code understanding
    description: understand and explain existing code snippets, functions, or libraries
    models:
      - openai/gpt-5
      - anthropic/claude-sonnet-4-5
  - name: complex reasoning
    description: deep analysis, mathematical problem solving, and logical reasoning
    models:
      - openai/gpt-5
  - name: creative writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
  - name: code generation
    description: generating new code snippets, functions, or boilerplate based on user prompts
    models:
      - anthropic/claude-sonnet-4-5
      - openai/gpt-5
Configs still using the v0.3.0 inline style (routing_preferences nested under each model_provider) are auto-migrated to this top-level shape by the Plano CLI at compile time, with a deprecation warning. Update your config to the form above to silence the warning.
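The auto-migration the CLI performs can be approximated like this (a sketch over plain dicts, assuming the config has already been parsed from YAML; not the actual Plano CLI code):

```python
def migrate_inline_preferences(config):
    """Hoist each model_provider's inline routing_preferences (v0.3.0 style)
    into a top-level list (v0.4.0 style), merging duplicate preference names
    into one entry with an ordered `models` candidate pool."""
    merged = {}
    for provider in config.get("model_providers", []):
        # pop() removes the deprecated inline block from the provider
        for pref in provider.pop("routing_preferences", []):
            entry = merged.setdefault(
                pref["name"],
                {"name": pref["name"], "description": pref.get("description", ""), "models": []},
            )
            if provider["model"] not in entry["models"]:
                entry["models"].append(provider["model"])
    config["routing_preferences"] = list(merged.values())
    return config
```

Providers are visited in declaration order, so a preference shared by several models ends up with a pool ordered the same way the models were declared.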
Client usage
@ -4273,6 +4309,8 @@ This downloads the quantized GGUF model from HuggingFace and starts serving on h
Configure Plano to use local routing model
version: v0.4.0
overrides:
  llm_routing_model: plano/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
@ -4286,9 +4324,12 @@ model_providers:
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative writing
        description: creative content generation, storytelling, and writing assistance
routing_preferences:
  - name: creative writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
Verify the model is running
@ -4331,6 +4372,8 @@ vllm serve ${SNAPSHOT_DIR}Arch-Router-1.5B-Q4_K_M.gguf \
Configure Plano to use the vLLM endpoint
version: v0.4.0
overrides:
  llm_routing_model: plano/Plano-Orchestrator
@ -4344,9 +4387,12 @@ model_providers:
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative writing
        description: creative content generation, storytelling, and writing assistance
routing_preferences:
  - name: creative writing
    description: creative content generation, storytelling, and writing assistance
    models:
      - anthropic/claude-sonnet-4-5
Verify the server is running
@ -4460,22 +4506,30 @@ You can combine static model selection with dynamic routing preferences for maxi
Hybrid Routing Configuration
llm_providers:
version: v0.4.0
model_providers:
  - model: openai/gpt-5.2
    access_key: $OPENAI_API_KEY
    default: true
  - model: openai/gpt-5
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: complex_reasoning
        description: deep analysis and complex problem solving
  - model: anthropic/claude-sonnet-4-5
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: creative_tasks
        description: creative writing and content generation
routing_preferences:
  - name: complex_reasoning
    description: deep analysis and complex problem solving
    models:
      - openai/gpt-5
      - anthropic/claude-sonnet-4-5
  - name: creative_tasks
    description: creative writing and content generation
    models:
      - anthropic/claude-sonnet-4-5
      - openai/gpt-5
model_aliases:
# Model aliases - friendly names that map to actual provider names
@ -6895,7 +6949,7 @@ where prompts get routed to, apply guardrails, and enable critical agent observa
Plano Configuration - Full Reference
# Plano Gateway configuration version
version: v0.3.0
version: v0.4.0
# External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)
agents:
@ -6928,17 +6982,8 @@ model_providers:
  - model: mistral/ministral-3b-latest
    access_key: $MISTRAL_API_KEY
  # routing_preferences: tags a model with named capabilities so Plano's LLM router
  # can select the best model for each request based on intent. Requires the
  # Plano-Orchestrator model (or equivalent) to be configured in overrides.llm_routing_model.
  # Each preference has a name (short label) and a description (used for intent matching).
  - model: groq/llama-3.3-70b-versatile
    access_key: $GROQ_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
      - name: code review
        description: reviewing, analyzing, and suggesting improvements to existing code
  # passthrough_auth: forwards the client's Authorization header upstream instead of
  # using the configured access_key. Useful for LiteLLM or similar proxy setups.
@ -6960,6 +7005,29 @@ model_aliases:
  smart-llm:
    target: gpt-4o
# routing_preferences: top-level list that tags named task categories with an
# ordered pool of candidate models. Plano's LLM router matches incoming requests
# against these descriptions and returns an ordered list of models; the client
# uses models[0] as primary and retries with models[1], models[2]... on 429/5xx.
# Requires overrides.llm_routing_model to point at Plano-Orchestrator (or equivalent).
# Each model in `models` must be declared in model_providers above.
# selection_policy is optional: {prefer: cheapest|fastest|none} lets the router
# reorder candidates using live cost/latency data from model_metrics_sources.
routing_preferences:
  - name: code generation
    description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
    models:
      - anthropic/claude-sonnet-4-0
      - openai/gpt-4o
      - groq/llama-3.3-70b-versatile
  - name: code review
    description: reviewing, analyzing, and suggesting improvements to existing code
    models:
      - anthropic/claude-sonnet-4-0
      - groq/llama-3.3-70b-versatile
    selection_policy:
      prefer: cheapest
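The reordering that selection_policy implies can be sketched as follows (the metric field names `cost_per_mtok` and `p50_latency_ms` are invented for illustration and are not Plano's schema; real data comes from model_metrics_sources):

```python
import math

def apply_selection_policy(models, policy, metrics):
    """Reorder a route's candidate pool per selection_policy
    {prefer: cheapest|fastest|none} using live per-model metrics.
    Models missing from `metrics` sort last."""
    prefer = (policy or {}).get("prefer", "none")
    if prefer == "cheapest":
        key = lambda m: metrics.get(m, {}).get("cost_per_mtok", math.inf)
    elif prefer == "fastest":
        key = lambda m: metrics.get(m, {}).get("p50_latency_ms", math.inf)
    else:
        return list(models)  # "none": keep the configured order
    return sorted(models, key=key)  # stable sort preserves config order on ties
```

The client still walks the returned list front to back, so the fallback-on-429/5xx behavior is unchanged; only the ordering of candidates moves with cost or latency.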
# HTTP listeners - entry points for agent routing, prompt targets, and direct LLM access
listeners:
# Agent listener for routing requests to multiple agents