mirror of https://github.com/katanemo/plano.git, synced 2026-05-30 14:25:15 +02:00
fix(routing): auto-migrate v0.3.0 inline routing_preferences to v0.4.0 top-level
Lift inline routing_preferences under each model_provider into the top-level routing_preferences list with merged models[] and bump version to v0.4.0, with a deprecation warning. Existing v0.3.0 demo configs (Claude Code, Codex, preference_based_routing, etc.) keep working unchanged. Schema flags the inline shape as deprecated but still accepts it. Docs and skills updated to canonical top-level multi-model form.
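The migration described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `migrate_inline_preferences`, the plain-dict config shape, and the rule of merging routes that share the same `name` are all assumptions made for the example.

```python
import copy
import warnings


def migrate_inline_preferences(config: dict) -> dict:
    """Hypothetical helper: lift inline routing_preferences to the top level."""
    migrated = copy.deepcopy(config)  # never mutate the caller's config
    providers = migrated.get("model_providers") or []

    # Nothing to do when no provider carries inline preferences.
    if not any(p.get("routing_preferences") for p in providers):
        return migrated

    # Keep any routes that already exist at the top level, keyed by name.
    routes = {}
    for route in migrated.get("routing_preferences") or []:
        route.setdefault("models", [])
        routes[route["name"]] = route

    # Move each inline preference into the top-level list. Assumption:
    # preferences with the same name are merged into one models[] pool,
    # in provider declaration order.
    for provider in providers:
        for pref in provider.pop("routing_preferences", None) or []:
            route = routes.setdefault(
                pref["name"],
                {
                    "name": pref["name"],
                    "description": pref["description"],
                    "models": [],
                },
            )
            if provider["model"] not in route["models"]:
                route["models"].append(provider["model"])

    migrated["routing_preferences"] = list(routes.values())
    migrated["version"] = "v0.4.0"
    warnings.warn(
        "inline routing_preferences under model_providers is deprecated; "
        "use the top-level routing_preferences list",
        DeprecationWarning,
        stacklevel=2,
    )
    return migrated
```

Returning a migrated copy keeps the original config untouched, so a caller can compare the two or log what changed.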
parent b81eb7266c
commit dde90cae82
11 changed files with 693 additions and 224 deletions
@@ -7,67 +7,100 @@ tags: routing, model-selection, preferences, llm-routing
 ## Write Task-Specific Routing Preference Descriptions

-Plano's `plano_orchestrator_v1` router uses a 1.5B preference-aligned LLM to classify incoming requests against your `routing_preferences` descriptions. It routes the request to the first provider whose preferences match. Description quality directly determines routing accuracy.
+Plano's `plano_orchestrator_v1` router uses a 1.5B preference-aligned LLM to classify incoming requests against your `routing_preferences` descriptions. It returns an ordered `models` list for the matched route; the client uses `models[0]` as primary and falls back to `models[1]`, `models[2]`... on `429`/`5xx` errors. Description quality directly determines routing accuracy.
+
+Starting in `v0.4.0`, `routing_preferences` lives at the **top level** of the config and each entry carries its own `models: [...]` candidate pool. Configs still using the legacy v0.3.0 inline shape (under each `model_provider`) are auto-migrated with a deprecation warning — prefer the top-level form below.

 **Incorrect (vague, overlapping descriptions):**

 ```yaml
 version: v0.4.0

 model_providers:
   - model: openai/gpt-4o-mini
     access_key: $OPENAI_API_KEY
     default: true
-    routing_preferences:
-      - name: simple
-        description: easy tasks # Too vague — what is "easy"?

   - model: openai/gpt-4o
     access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: hard
-        description: hard tasks # Too vague — overlaps with "easy"
+
+routing_preferences:
+  - name: simple
+    description: easy tasks # Too vague — what is "easy"?
+    models:
+      - openai/gpt-4o-mini
+  - name: hard
+    description: hard tasks # Too vague — overlaps with "easy"
+    models:
+      - openai/gpt-4o
 ```

-**Correct (specific, distinct task descriptions):**
+**Correct (specific, distinct task descriptions, multi-model fallbacks):**

 ```yaml
 version: v0.4.0

 model_providers:
   - model: openai/gpt-4o-mini
     access_key: $OPENAI_API_KEY
     default: true
-    routing_preferences:
-      - name: summarization
-        description: >
-          Summarizing documents, articles, emails, or meeting transcripts.
-          Extracting key points, generating TL;DR sections, condensing long text.
-      - name: classification
-        description: >
-          Categorizing inputs, sentiment analysis, spam detection,
-          intent classification, labeling structured data fields.
-      - name: translation
-        description: >
-          Translating text between languages, localization tasks.

   - model: openai/gpt-4o
     access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code_generation
-        description: >
-          Writing new functions, classes, or modules from scratch.
-          Implementing algorithms, boilerplate generation, API integrations.
-      - name: code_review
-        description: >
-          Reviewing code for bugs, security vulnerabilities, performance issues.
-          Suggesting refactors, explaining complex code, debugging errors.
-      - name: complex_reasoning
-        description: >
-          Multi-step math problems, logical deduction, strategic planning,
-          research synthesis requiring chain-of-thought reasoning.
+
+  - model: anthropic/claude-sonnet-4-5
+    access_key: $ANTHROPIC_API_KEY
+
+routing_preferences:
+  - name: summarization
+    description: >
+      Summarizing documents, articles, emails, or meeting transcripts.
+      Extracting key points, generating TL;DR sections, condensing long text.
+    models:
+      - openai/gpt-4o-mini
+      - openai/gpt-4o
+  - name: classification
+    description: >
+      Categorizing inputs, sentiment analysis, spam detection,
+      intent classification, labeling structured data fields.
+    models:
+      - openai/gpt-4o-mini
+  - name: translation
+    description: >
+      Translating text between languages, localization tasks.
+    models:
+      - openai/gpt-4o-mini
+      - anthropic/claude-sonnet-4-5
+  - name: code_generation
+    description: >
+      Writing new functions, classes, or modules from scratch.
+      Implementing algorithms, boilerplate generation, API integrations.
+    models:
+      - openai/gpt-4o
+      - anthropic/claude-sonnet-4-5
+  - name: code_review
+    description: >
+      Reviewing code for bugs, security vulnerabilities, performance issues.
+      Suggesting refactors, explaining complex code, debugging errors.
+    models:
+      - anthropic/claude-sonnet-4-5
+      - openai/gpt-4o
+  - name: complex_reasoning
+    description: >
+      Multi-step math problems, logical deduction, strategic planning,
+      research synthesis requiring chain-of-thought reasoning.
+    models:
+      - openai/gpt-4o
+      - anthropic/claude-sonnet-4-5
 ```

 **Key principles for good preference descriptions:**
 - Use concrete action verbs: "writing", "reviewing", "translating", "summarizing"
 - List 3–5 specific sub-tasks or synonyms for each preference
-- Ensure preferences across providers are mutually exclusive in scope
+- Ensure preferences across routes are mutually exclusive in scope
+- Order `models` from most preferred to least — the client will fall back in order on `429`/`5xx`
+- List multiple models under one route to get automatic provider fallback without additional client logic
+- Every model listed in `models` must be declared in `model_providers`
 - Test with representative queries using `planoai trace` and `--where` filters to verify routing decisions

-Reference: https://github.com/katanemo/archgw
+Reference: [Routing API](../../docs/routing-api.md) · https://github.com/katanemo/archgw
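
The client-side fallback that the updated text describes (use `models[0]` first, then move down the list on `429`/`5xx`) could look roughly like the sketch below. It is an illustration under stated assumptions, not Plano's client code: `call_with_fallback`, `is_retryable`, and the `send` callback (which takes a model name and returns a status code and body) are hypothetical names introduced here.

```python
from typing import Callable, Sequence, Tuple


def is_retryable(status: int) -> bool:
    """True for the statuses the text names as fallback triggers: 429 and 5xx."""
    return status == 429 or 500 <= status <= 599


def call_with_fallback(
    models: Sequence[str],
    send: Callable[[str], Tuple[int, str]],
) -> Tuple[str, int, str]:
    """Try each model in preference order; return (model, status, body).

    `send` is a hypothetical callback that performs the request for one
    model and returns (status_code, body).
    """
    if not models:
        raise ValueError("models must contain at least one entry")

    for model in models:
        status, body = send(model)
        if not is_retryable(status):
            # Success, or a non-retryable error such as 400: stop here and
            # hand the response back rather than masking it with a fallback.
            return model, status, body

    raise RuntimeError(
        f"all {len(models)} models failed; last status was {status}"
    )
```

Only `429` and `5xx` advance to the next candidate in this sketch; other client errors are returned unchanged, on the assumption that retrying the same bad request against a different model would not help.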
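
The principle that every model listed in `models` must be declared in `model_providers` lends itself to a simple pre-flight check. The sketch below assumes the config has already been parsed into a plain dict; `find_undeclared_models` is a hypothetical helper, not part of the Plano CLI or schema validator.

```python
def find_undeclared_models(config: dict) -> dict:
    """Map each route name to the models it lists that no provider declares."""
    declared = {p["model"] for p in config.get("model_providers") or []}

    missing = {}
    for route in config.get("routing_preferences") or []:
        unknown = [m for m in route.get("models") or [] if m not in declared]
        if unknown:
            missing[route["name"]] = unknown
    return missing
```

An empty result means every route only references declared providers; a non-empty result names the offending routes, so the error message can point at the exact entry to fix.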