
Claude Code Router - Multi-Model Access with Intelligent Routing

Plano extends Claude Code to access multiple LLM providers through a single interface, offering two key benefits:

  1. Access to Models: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
  2. Intelligent Routing via Preferences for Coding Tasks: Configure which models handle specific development tasks:
    • Code generation and implementation
    • Code reviews and analysis
    • Architecture and system design
    • Debugging and optimization
    • Documentation and explanations

Uses a 1.5B preference-aligned router LLM to automatically select the best model based on your request type.
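The selection step can be sketched as follows. This is a toy keyword-overlap scorer standing in for Plano's actual 1.5B preference-aligned router LLM (the real router is a trained model, not keyword matching); the provider entries mirror the config shape shown later in this README.

```python
# Illustrative sketch only: score each provider's routing preferences
# against the request text and pick the best-matching model.
def select_model(request: str, providers: list[dict]) -> str:
    words = set(request.lower().split())
    best_model, best_score = providers[0]["model"], -1
    for p in providers:
        for pref in p.get("routing_preferences", []):
            text = f"{pref['name']} {pref['description']}".lower()
            score = len(words & set(text.split()))
            if score > best_score:
                best_model, best_score = p["model"], score
    return best_model

providers = [
    {"model": "openai/gpt-4.1", "routing_preferences": [
        {"name": "code generation", "description": "generating new code"}]},
    {"model": "anthropic/claude-3-5-sonnet", "routing_preferences": [
        {"name": "code understanding", "description": "explaining existing code"}]},
]
print(select_model("help explaining this existing function", providers))
# → anthropic/claude-3-5-sonnet
```

The real router scores semantic similarity rather than literal word overlap, so paraphrased requests route correctly too.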

Benefits

  • Single Interface: Access multiple LLM providers through the same Claude Code CLI
  • Task-Aware Routing: Requests are analyzed and routed to models based on task type (code generation, debugging, architecture, documentation)
  • Provider Flexibility: Add or remove LLM providers without changing your workflow
  • Routing Transparency: See which model handles each request and why

How It Works

Plano sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:

Your Request → Plano → Suitable Model → Response
             ↓
    [Task Analysis & Model Selection]

Supported Providers: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See full list of supported providers.

Quick Start (5 minutes)

Prerequisites

# Install Claude Code if you haven't already
npm install -g @anthropic-ai/claude-code

# Install Plano CLI
pip install planoai

Step 1: Get Configuration

# Clone and navigate to demo
git clone https://github.com/katanemo/arch.git
cd arch/demos/llm_routing/claude_code_router

Step 2: Set API Keys

# Copy the sample environment file
cp .env .env.local

# Edit with your actual API keys
export OPENAI_API_KEY="your-openai-key-here"
export ANTHROPIC_API_KEY="your-anthropic-key-here"
# Add other providers as needed
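Before starting Plano, it can help to confirm the keys your config references are actually exported. A minimal preflight check (illustrative, not part of the Plano CLI; the key names here match the exports above):

```python
import os

def missing_keys(env, required):
    """Return the required key names that are unset or empty."""
    return [k for k in required if not env.get(k)]

# Extend this list with whatever providers your config.yaml uses.
required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]
print(missing_keys(os.environ, required))
```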

Step 3: Start Plano

# Install using uv (recommended)
uv tool install planoai
planoai up
# Or if already installed with uv: uvx planoai up

# Or install using pip (traditional)
pip install planoai
planoai up

Step 4: Launch Enhanced Claude Code

# This will launch Claude Code with multi-model routing
planoai cli-agent claude
# Or if installed with uv: uvx planoai cli-agent claude

[claude_code.png: Claude Code running with multi-model routing]

Monitor Model Selection in Real-Time

While using Claude Code, open a second terminal and run this helper script to watch routing decisions. It shows you:

  • Which model was selected for each request
  • Real-time routing decisions as you work

# In a new terminal window (from the same directory)
sh pretty_model_resolution.sh

[model_selection.png: real-time model selection output]

Understanding the Configuration

The config.yaml file defines your multi-model setup:

llm_providers:
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets and functions

  - model: anthropic/claude-3-5-sonnet-20241022
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: code understanding
        description: explaining and analyzing existing code
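A quick way to sanity-check this structure before starting Plano is a small validator. This is a hypothetical helper, not part of Plano (which does its own schema validation); the rules below only encode what the example above shows: each provider needs a model, an env-var access key, and named, described routing preferences.

```python
def validate_providers(providers):
    """Return a list of problems found in the llm_providers entries."""
    errors = []
    for i, p in enumerate(providers):
        if "model" not in p:
            errors.append(f"provider {i}: missing 'model'")
        if not p.get("access_key", "").startswith("$"):
            errors.append(f"provider {i}: access_key should reference an env var")
        for pref in p.get("routing_preferences", []):
            if not pref.get("name") or not pref.get("description"):
                errors.append(f"provider {i}: incomplete routing preference")
    return errors

providers = [
    {"model": "openai/gpt-4.1-2025-04-14", "access_key": "$OPENAI_API_KEY",
     "routing_preferences": [
         {"name": "code generation",
          "description": "generating new code snippets and functions"}]},
]
print(validate_providers(providers))  # → []
```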

Advanced Usage

Override Model Selection

# Force a specific model for this session
planoai cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-coder-v2"}'

### Environment Variables
The system automatically configures these variables for Claude Code:
```bash
ANTHROPIC_BASE_URL=http://127.0.0.1:12000                 # Routes through Plano
ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast    # Uses intelligent alias
```
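Conceptually, launching through `planoai cli-agent claude` amounts to starting Claude Code with those variables set, so its Anthropic traffic flows through the local Plano listener. A sketch of that environment setup (illustrative; `plano_env` is a hypothetical helper, not a real Plano API):

```python
import os

def plano_env(base_env):
    """Copy an environment and point Claude Code at the local Plano listener."""
    env = dict(base_env)
    env["ANTHROPIC_BASE_URL"] = "http://127.0.0.1:12000"
    env["ANTHROPIC_SMALL_FAST_MODEL"] = "arch.claude.code.small.fast"
    return env

launch_env = plano_env(os.environ)
# e.g. subprocess.run(["claude"], env=launch_env)
```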

Custom Routing Configuration

Edit config.yaml to define custom task→model mappings:

llm_providers:
  # OpenAI Models
  - model: openai/gpt-5-2025-08-07
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements

  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries

Technical Details

  • How routing works: Plano intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards each request to the configured model.
  • Research foundation: Built on our research in Preference-Aligned LLM Routing.
  • Documentation: docs.planoai.dev for advanced configuration and API details.