mirror of
https://github.com/katanemo/plano.git
synced 2026-05-24 14:05:14 +02:00
fixing README for claude code and adding a helper script to show model selection (#576)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
parent
f00870dccb
commit
cf23aefddd
4 changed files with 118 additions and 72 deletions
|
|
@@ -1,133 +1,146 @@
|
||||||
# Claude Code Routing with (Preference-aligned) Intelligence
|
# Claude Code Router - Multi-Model Access with Intelligent Routing
|
||||||
|
|
||||||
## Why This Matters
|
Arch Gateway extends Claude Code to access multiple LLM providers through a single interface, offering two key benefits:
|
||||||
|
|
||||||
**Claude Code is powerful, but what if you could access the best of ALL AI models through one familiar interface?**
|
1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
|
||||||
|
2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
|
||||||
|
- Code generation and implementation
|
||||||
|
- Code reviews and analysis
|
||||||
|
- Architecture and system design
|
||||||
|
- Debugging and optimization
|
||||||
|
- Documentation and explanations
|
||||||
|
|
||||||
Instead of being locked into a set of LLMs from one provider, imagine:
|
Arch Gateway uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to automatically select the best model based on your request type.
|
||||||
- Using **DeepSeek's coding expertise** for complex algorithms
|
|
||||||
- Leveraging **GPT-5's reasoning** for architecture decisions
|
|
||||||
- Tapping **Claude's analysis** for code reviews
|
|
||||||
- Accessing **Grok's speed** for quick iterations
|
|
||||||
|
|
||||||
**All through the same Claude Code interface you already love.**
|
## Benefits
|
||||||
|
|
||||||
## The Solution: Intelligent Multi-LLM Routing
|
- **Single Interface**: Access multiple LLM providers through the same Claude Code CLI
|
||||||
|
- **Task-Aware Routing**: Requests are analyzed and routed to models based on task type (code generation, debugging, architecture, documentation)
|
||||||
|
- **Provider Flexibility**: Add or remove LLM providers without changing your workflow
|
||||||
|
- **Routing Transparency**: See which model handles each request and why
|
||||||
|
|
||||||
Arch Gateway transforms Claude Code into a **universal AI development interface** that:
|
## How It Works
|
||||||
|
|
||||||
### 🌐 **Connects to Any LLM Provider**
|
Arch Gateway sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
|
||||||
- **OpenAI**: GPT-4.1, GPT-5, etc.
|
|
||||||
- **Anthropic**: Claude 3.5 Sonnet, Claude 3 Haiku, Claude 4.5
|
|
||||||
- **DeepSeek**: DeepSeek-V3, DeepSeek-Coder-V2
|
|
||||||
- **Grok**: Grok-2, Grok-2-mini
|
|
||||||
- **Others**: Gemini, Llama, Mistral, local models via Ollama
|
|
||||||
|
|
||||||
### 🧠 **Routes Intelligently Based on Task**
|
```
|
||||||
Our research-backed routing system automatically selects the optimal model by analyzing:
|
Your Request → Arch Gateway → Suitable Model → Response
|
||||||
- **Task complexity** (simple refactoring vs. architectural design)
|
↓
|
||||||
- **Content type** (code generation vs. debugging vs. documentation)
|
[Task Analysis & Model Selection]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.archgw.com/concepts/llm_providers/supported_providers.html).
|
||||||
|
|
||||||
|
|
||||||
## Quick Start
|
## Quick Start (5 minutes)
|
||||||
|
|
||||||
### Prerequisites
|
### Prerequisites
|
||||||
- Claude Code installed: `npm install -g @anthropic-ai/claude-code`
|
|
||||||
- Docker running on your system
|
|
||||||
- A Python virtual environment created in your current working directory
|
|
||||||
|
|
||||||
### 1. Get the Configuration File
|
|
||||||
Download the demo configuration file using one of these methods:
|
|
||||||
|
|
||||||
**Option A: Direct download**
|
|
||||||
```bash
|
```bash
|
||||||
curl -O https://raw.githubusercontent.com/katanemo/arch/main/demos/use_cases/claude_code/config.yaml
|
# Install Claude Code if you haven't already
|
||||||
|
npm install -g @anthropic-ai/claude-code
|
||||||
|
|
||||||
|
# Ensure Docker is running
|
||||||
|
docker --version
|
||||||
```
|
```
|
||||||
|
|
||||||
**Option B: Clone the repository**
|
### Step 1: Get Configuration
|
||||||
```bash
|
```bash
|
||||||
|
# Clone and navigate to demo
|
||||||
git clone https://github.com/katanemo/arch.git
|
git clone https://github.com/katanemo/arch.git
|
||||||
cd arch/demos/use_cases/claude_code
|
cd arch/demos/use_cases/claude_code
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Set Up Your API Keys
|
### Step 2: Set API Keys
|
||||||
Set up your environment variables with your actual API keys:
|
|
||||||
```bash
|
```bash
|
||||||
export OPENAI_API_KEY="your-openai-api-key"
|
# Copy the sample environment file
|
||||||
export ANTHROPIC_API_KEY="your-anthropic-api-key"
|
cp .env .env.local
|
||||||
export AZURE_API_KEY="your-azure-api-key" # Optional
|
|
||||||
|
# Edit with your actual API keys
|
||||||
|
export OPENAI_API_KEY="your-openai-key-here"
|
||||||
|
export ANTHROPIC_API_KEY="your-anthropic-key-here"
|
||||||
|
# Add other providers as needed
|
||||||
```
|
```
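
Because `config.yaml` reads its keys from the environment (`$OPENAI_API_KEY`, `$ANTHROPIC_API_KEY`), a quick pre-flight check before starting the gateway may save a failed start. The sketch below is illustrative only: `require_env` is a hypothetical helper name and is not part of `archgw`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper; not part of archgw.
# Prints each missing variable name to stderr and returns non-zero
# if any of the named variables is unset, empty, or not exported.
require_env() {
  missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is what child processes inherit
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the two keys used in this demo before starting the gateway
# require_env OPENAI_API_KEY ANTHROPIC_API_KEY && archgw up
```

Checking with `printenv` rather than a plain variable test is a deliberate choice here: a key that is set but not exported would not reach a child process.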
|
||||||
|
|
||||||
Alternatively, create a `.env` file in your working directory:
|
### Step 3: Start Arch Gateway
|
||||||
```bash
|
|
||||||
echo "OPENAI_API_KEY=your-openai-api-key" > .env
|
|
||||||
echo "ANTHROPIC_API_KEY=your-anthropic-api-key" >> .env
|
|
||||||
```
|
|
||||||
|
|
||||||
### 3. Install and Start Arch Gateway
|
|
||||||
```bash
|
```bash
|
||||||
|
# Install and start the gateway
|
||||||
pip install archgw
|
pip install archgw
|
||||||
archgw up
|
archgw up
|
||||||
```
|
```
|
||||||
|
|
||||||
### 4. Launch Claude Code with Multi-LLM Support
|
### Step 4: Launch Enhanced Claude Code
|
||||||
```bash
|
```bash
|
||||||
|
# This will launch Claude Code with multi-model routing
|
||||||
archgw cli-agent claude
|
archgw cli-agent claude
|
||||||
```
|
```
|
||||||
|

|
||||||
|
|
||||||
That's it! Claude Code now has access to multiple LLM providers with intelligent routing.
|
### Monitor Model Selection in Real-Time
|
||||||
|
|
||||||
## What You'll Experience
|
While using Claude Code, open a **second terminal** and run this helper script to watch routing decisions. This script shows you:
|
||||||
|
- **Which model** was selected for each request
|
||||||
|
- **Real-time routing decisions** as you work
|
||||||
|
|
||||||
### Screenshot Placeholder
|
```bash
|
||||||

|
# In a new terminal window (from the same directory)
|
||||||
*Claude Code interface enhanced with intelligent model routing and multi-provider access*
|
sh pretty_model_resolution.sh
|
||||||
|
```
|
||||||
|

|
||||||
|
|
||||||
### Real-Time Model Selection
|
## Understanding the Configuration
|
||||||
When you interact with Claude Code, you'll get:
|
|
||||||
- **Automatic model selection** based on your query type
|
|
||||||
- **Transparent routing decisions** showing which model was chosen and why
|
|
||||||
- **Seamless failover** if a model becomes unavailable
|
|
||||||
|
|
||||||
## Configuration
|
The `config.yaml` file defines your multi-model setup:
|
||||||
|
|
||||||
The setup uses the included `config.yaml` file which defines:
|
|
||||||
|
|
||||||
### Multi-Provider Access
|
|
||||||
```yaml
|
```yaml
|
||||||
llm_providers:
|
llm_providers:
|
||||||
- model: openai/gpt-4.1-2025-04-14
|
- model: openai/gpt-4.1-2025-04-14
|
||||||
access_key: $OPENAI_API_KEY
|
access_key: $OPENAI_API_KEY
|
||||||
routing_preferences:
|
routing_preferences:
|
||||||
- name: code generation
|
- name: code generation
|
||||||
description: generating new code snippets and functions
|
description: generating new code snippets and functions
|
||||||
|
|
||||||
- model: anthropic/claude-3-5-sonnet-20241022
|
- model: anthropic/claude-3-5-sonnet-20241022
|
||||||
access_key: $ANTHROPIC_API_KEY
|
access_key: $ANTHROPIC_API_KEY
|
||||||
routing_preferences:
|
routing_preferences:
|
||||||
name: code understanding
|
- name: code understanding
|
||||||
description: explaining and analyzing existing code
|
description: explaining and analyzing existing code
|
||||||
```
|
```
|
||||||
|
|
||||||
## Advanced Usage
|
## Advanced Usage
|
||||||
|
|
||||||
### Custom Model Selection
|
### Override Model Selection
|
||||||
```bash
|
```bash
|
||||||
# Force a specific model for this session
|
# Force a specific model for this session
|
||||||
archgw cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-coder-v2"}'
|
archgw cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-coder-v2"}'
|
||||||
|
|
||||||
# Enable detailed routing information
|
|
||||||
archgw cli-agent claude --settings='{"statusLine": {"type": "command", "command": "ccr statusline"}}'
|
|
||||||
```
|
|
||||||
|
|
||||||
### Environment Variables
|
### Environment Variables
|
||||||
The system automatically configures:
|
The system automatically configures these variables for Claude Code:
|
||||||
```bash
|
```bash
|
||||||
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
|
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
|
||||||
ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
|
ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
|
||||||
```
|
```
|
||||||
|
|
||||||
## Real Developer Workflows
|
### Custom Routing Configuration
|
||||||
|
Edit `config.yaml` to define custom task→model mappings:
|
||||||
|
|
||||||
This intelligent routing is powered by our research in preference-aligned LLM routing:
|
```yaml
|
||||||
- **Research Paper**: [Preference-Aligned LLM Router](https://arxiv.org/abs/2506.16655)
|
llm_providers:
|
||||||
- **Technical Docs**: [docs.archgw.com](https://docs.archgw.com)
|
# OpenAI Models
|
||||||
|
- model: openai/gpt-5-2025-08-07
|
||||||
|
access_key: $OPENAI_API_KEY
|
||||||
|
routing_preferences:
|
||||||
|
- name: code generation
|
||||||
|
description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
|
||||||
|
|
||||||
|
- model: openai/gpt-4.1-2025-04-14
|
||||||
|
access_key: $OPENAI_API_KEY
|
||||||
|
routing_preferences:
|
||||||
|
- name: code understanding
|
||||||
|
description: understand and explain existing code snippets, functions, or libraries
|
||||||
|
```
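
After editing `config.yaml`, it can help to list which routing preference maps to which model. The filter below is a rough sketch that assumes the flat layout shown in the example (`- model:` entries followed by `- name:` entries). It is not a YAML parser, and `list_routes` is a hypothetical name, not an `archgw` command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<preference name> -> <model>" pairs from a
# config laid out like the example above. Not a real YAML parser.
list_routes() {
  awk '
    # Remember the most recent "- model:" value
    /^[ \t]*-[ \t]*model:/ {
      sub(/^[ \t]*-[ \t]*model:[ \t]*/, "")
      model = $0
      next
    }
    # Each "- name:" entry belongs to the model seen just before it
    /^[ \t]*-[ \t]*name:/ {
      sub(/^[ \t]*-[ \t]*name:[ \t]*/, "")
      print $0 " -> " model
    }
  '
}

# Usage: list_routes < config.yaml
```

A preference written without the leading `- ` (as a bare `name:` key) would be skipped by this filter, which makes that kind of list-syntax slip easy to spot.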
|
||||||
|
|
||||||
|
## Technical Details
|
||||||
|
|
||||||
|
**How routing works:** Arch intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards each request to the configured model.
|
||||||
|
**Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
|
||||||
|
**Documentation:** [docs.archgw.com](https://docs.archgw.com) for advanced configuration and API details.
|
||||||
|
|
|
||||||
BIN
demos/use_cases/claude_code/claude_code.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 205 KiB |
BIN
demos/use_cases/claude_code/model_selection.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 215 KiB |
33
demos/use_cases/claude_code/pretty_model_resolution.sh
Normal file
|
|
@@ -0,0 +1,33 @@
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
# Pretty-print ArchGW MODEL_RESOLUTION lines from docker logs
|
||||||
|
# - hides Arch-Router
|
||||||
|
# - prints timestamp
|
||||||
|
# - colors MODEL_RESOLUTION red
|
||||||
|
# - colors req_model cyan
|
||||||
|
# - colors resolved_model magenta
|
||||||
|
# - removes provider and streaming
|
||||||
|
|
||||||
|
docker logs -f archgw 2>&1 \
|
||||||
|
| awk '
|
||||||
|
/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
|
||||||
|
# extract timestamp between first [ and ]
|
||||||
|
ts=""
|
||||||
|
if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
|
||||||
|
ts=substr($0, RSTART+1, RLENGTH-2)
|
||||||
|
}
|
||||||
|
|
||||||
|
# split out after MODEL_RESOLUTION:
|
||||||
|
n = split($0, parts, /MODEL_RESOLUTION: */)
|
||||||
|
line = parts[2]
|
||||||
|
|
||||||
|
# remove provider and streaming fields
|
||||||
|
sub(/ *provider='\''[^'\'']+'\''/, "", line)
|
||||||
|
sub(/ *streaming=(true|false)/, "", line)
|
||||||
|
|
||||||
|
# highlight fields
|
||||||
|
gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
|
||||||
|
gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
|
||||||
|
|
||||||
|
# print timestamp + MODEL_RESOLUTION
|
||||||
|
printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
|
||||||
|
}'
|
||||||
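
For a plainer, uncolored view of the same log lines, a `sed` filter can reduce each `MODEL_RESOLUTION` entry to a `requested -> resolved` pair. This is a sketch under assumptions: the sample line below is inferred from the patterns the script matches (`req_model='…'`, `resolved_model='…'`, `provider='…'`, `streaming=…`), so the real log format may differ, and `summarize_resolution` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Assumed sample line, reconstructed from the fields the script above matches;
# the actual archgw log format may differ.
sample="[2025-10-01 12:00:00.123] MODEL_RESOLUTION: req_model='arch.claude.code.small.fast' resolved_model='gpt-4.1-2025-04-14' provider='openai' streaming=true"

# Hypothetical helper: read log lines on stdin, skip Arch-Router lines,
# and print "req_model -> resolved_model" for each MODEL_RESOLUTION line.
summarize_resolution() {
  sed -n \
    -e '/Arch-Router/d' \
    -e "s/.*MODEL_RESOLUTION:.*req_model='\([^']*\)'.*resolved_model='\([^']*\)'.*/\1 -> \2/p"
}

printf '%s\n' "$sample" | summarize_resolution
# prints: arch.claude.code.small.fast -> gpt-4.1-2025-04-14
```

Against a running gateway the same function could be fed from the container logs, for example `docker logs -f archgw 2>&1 | summarize_resolution`.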