plano/demos/use_cases/claude_code/README.md

159 lines
5.4 KiB
Markdown

# Claude Code Routing with Intelligence
## Why This Matters
**Claude Code is powerful, but what if you could access the best of ALL AI models through one familiar interface?**
Instead of being locked into a single provider, imagine:
- Using **DeepSeek's coding expertise** for complex algorithms
- Leveraging **GPT-4's reasoning** for architecture decisions
- Tapping **Claude's analysis** for code reviews
- Accessing **Grok's speed** for quick iterations
**All through the same Claude Code interface you already love.**
## The Problem with Single-Model Development
Most developers are stuck in single-provider silos:
- 🔒 **Vendor Lock-in**: Tied to one model's strengths and weaknesses
- 🎯 **Wrong Tool for the Job**: Using a reasoning model for simple tasks (expensive) or a fast model for complex problems (poor results)
- 🚫 **No Fallbacks**: When your preferred model is down, you're stuck
- 💸 **Suboptimal Costs**: Paying premium prices for tasks that could use cheaper models
## The Solution: Intelligent Multi-LLM Routing
Arch Gateway transforms Claude Code into a **universal AI development interface** that:
### 🌐 **Connects to Any LLM Provider**
- **OpenAI**: GPT-4o, o1-preview, GPT-4o-mini
- **Anthropic**: Claude 3.5 Sonnet, Claude 3 Haiku
- **DeepSeek**: DeepSeek-V3, DeepSeek-Coder-V2
- **Grok**: Grok-2, Grok-2-mini
- **Others**: Gemini, Llama, Mistral, local models via Ollama
### 🧠 **Routes Intelligently Based on Task**
Our research-backed routing system automatically selects the optimal model by analyzing:
- **Task complexity** (simple refactoring vs. architectural design)
- **Content type** (code generation vs. debugging vs. documentation)
- **Performance preferences** (speed vs. quality vs. cost)
- **Real-time availability** (automatic failover when models are down)
### 💡 **Learns Your Preferences**
The system adapts to your coding patterns and preferences over time, ensuring you always get the best model for your specific needs.
## Quick Start
### Prerequisites
- Claude Code installed: `npm install -g @anthropic-ai/claude-code`
- Docker running on your system
### 1. Install and Start Arch Gateway
```bash
pip install archgw
archgw up
```
### 2. Launch Claude Code with Multi-LLM Support
```bash
archgw cli-agent claude
```
That's it! Claude Code now has access to multiple LLM providers with intelligent routing.
## What You'll Experience
### Screenshot Placeholder
![Claude Code with Multi-LLM Routing](screenshot-placeholder.png)
*Claude Code interface enhanced with intelligent model routing and multi-provider access*
### Real-Time Model Selection
When you interact with Claude Code, you'll see:
- **Automatic model selection** based on your query type
- **Transparent routing decisions** showing which model was chosen and why
- **Seamless failover** if a model becomes unavailable
- **Performance metrics** comparing response times and quality
### Example Interactions
**Code Generation Query:**
```
You: "Create a Python function to validate email addresses"
→ Routed to: DeepSeek-Coder-V2 (optimized for code generation)
```
**Architecture Discussion:**
```
You: "How should I structure a microservices backend?"
→ Routed to: Claude 3.5 Sonnet (excellent for architectural reasoning)
```
**Quick Bug Fix:**
```
You: "Fix this syntax error in my JavaScript"
→ Routed to: GPT-4o-mini (fast and cost-effective for simple fixes)
```
## Configuration
The setup uses the included `config.yaml` file which defines:
### Multi-Provider Access
```yaml
llm_providers:
- model: openai/gpt-4.1-2025-04-14
access_key: $OPENAI_API_KEY
routing_preferences:
- name: code generation
description: generating new code snippets and functions
- model: anthropic/claude-3-5-sonnet-20241022
access_key: $ANTHROPIC_API_KEY
routing_preferences:
name: code understanding
description: explaining and analyzing existing code
```
## Advanced Usage
### Custom Model Selection
```bash
# Force a specific model for this session
archgw cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-coder-v2"}'
# Enable detailed routing information
archgw cli-agent claude --settings='{"statusLine": {"type": "command", "command": "ccr statusline"}}'
```
### Environment Variables
The system automatically configures:
```bash
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
ANTHROPIC_SMALL_FAST_MODEL=arch.fast.v1 # Uses intelligent alias
```
## Benefits You'll See Immediately
### 🚀 **Better Performance**
- Right model for each task = better results
- Automatic failover = no interruptions
- Caching = faster repeated queries
### 💰 **Cost Optimization**
- Use expensive models only when needed
- Leverage free/cheap models for simple tasks
- Track usage across all providers
### 🛡️ **Reliability**
- Multiple providers = no single point of failure
- Automatic retry logic
- Graceful degradation when models are unavailable
### 📊 **Insights**
- See which models work best for your coding style
- Track performance metrics across providers
- Optimize your model usage over time
## Real Developer Workflows
This intelligent routing is powered by our research in preference-aligned AI systems:
- **Research Paper**: [Preference-Aligned LLM Router](https://katanemo.com/research)
- **Technical Docs**: [docs.katanemo.com](https://docs.katanemo.com)
- **API Reference**: [docs.katanemo.com/api](https://docs.katanemo.com/api)