fixing README for claude code and adding a helper script to show model selection (#576)

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2026-07-14 16:22:12 +02:00 · 2025-09-29 21:20:52 -07:00 · cf23aefddd
commit cf23aefddd
parent f00870dccb
4 changed files with 118 additions and 72 deletions
--- a/demos/use_cases/claude_code/README.md
+++ b/demos/use_cases/claude_code/README.md
@@ -1,133 +1,146 @@
-# Claude Code Routing with (Preference-aligned) Intelligence
+# Claude Code Router - Multi-Model Access with Intelligent Routing
-## Why This Matters
+Arch Gateway extends Claude Code to access multiple LLM providers through a single interface, offering two key benefits:
-**Claude Code is powerful, but what if you could access the best of ALL AI models through one familiar interface?**
+1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
 2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
   - Code generation and implementation
   - Code reviews and analysis
   - Architecture and system design
   - Debugging and optimization
   - Documentation and explanations
-Instead of being locked into a set of LLMs from one provider, imagine:
+Arch Gateway uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to automatically select the best model based on your request type.
 - Using **DeepSeek's coding expertise** for complex algorithms
 - Leveraging **GPT-5's reasoning** for architecture decisions
 - Tapping **Claude's analysis** for code reviews
 - Accessing **Grok's speed** for quick iterations
-**All through the same Claude Code interface you already love.**
+## Benefits
-## The Solution: Intelligent Multi-LLM Routing
+- **Single Interface**: Access multiple LLM providers through the same Claude Code CLI
 - **Task-Aware Routing**: Requests are analyzed and routed to models based on task type (code generation, debugging, architecture, documentation)
 - **Provider Flexibility**: Add or remove LLM providers without changing your workflow
 - **Routing Transparency**: See which model handles each request and why
-Arch Gateway transforms Claude Code into a **universal AI development interface** that:
+## How It Works
-### 🌐 **Connects to Any LLM Provider**
+Arch Gateway sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
 - **OpenAI**: GPT-4.1, GPT-5, etc.
 - **Anthropic**: Claude 3.5 Sonnet, Claude 3 Haiku, Claude 4.5
 - **DeepSeek**: DeepSeek-V3, DeepSeek-Coder-V2
 - **Grok**: Grok-2, Grok-2-mini
 - **Others**: Gemini, Llama, Mistral, local models via Ollama
-### 🧠 **Routes Intelligently Based on Task**
+```
-Our research-backed routing system automatically selects the optimal model by analyzing:
+Your Request → Arch Gateway → Suitable Model → Response
- **Task complexity** (simple refactoring vs. architectural design)
+             ↓
- **Content type** (code generation vs. debugging vs. documentation)
+    [Task Analysis & Model Selection]
 ```
 **Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.archgw.com/concepts/llm_providers/supported_providers.html).
-## Quick Start
+## Quick Start (5 minutes)
 ### Prerequisites
 - Claude Code installed: `npm install -g @anthropic-ai/claude-code`
 - Docker running on your system
 - Create a python virtual environment in your current working directory
 ### 1. Get the Configuration File
 Download the demo configuration file using one of these methods:
 **Option A: Direct download**
 ```bash
-curl -O https://raw.githubusercontent.com/katanemo/arch/main/demos/use_cases/claude_code/config.yaml
+# Install Claude Code if you haven't already
 npm install -g @anthropic-ai/claude-code
 # Ensure Docker is running
 docker --version
 ```
-**Option B: Clone the repository**
+### Step 1: Get Configuration
 ```bash
 # Clone and navigate to demo
 git clone https://github.com/katanemo/arch.git
 cd arch/demos/use_cases/claude_code
 ```
-### 2. Set Up Your API Keys
+### Step 2: Set API Keys
 Set up your environment variables with your actual API keys:
 ```bash
-export OPENAI_API_KEY="your-openai-api-key"
+# Copy the sample environment file
-export ANTHROPIC_API_KEY="your-anthropic-api-key"
+cp .env .env.local
-export AZURE_API_KEY="your-azure-api-key"  # Optional
+
 # Edit with your actual API keys
 export OPENAI_API_KEY="your-openai-key-here"
 export ANTHROPIC_API_KEY="your-anthropic-key-here"
 # Add other providers as needed
 ```
-Alternatively, create a `.env` file in your working directory:
+### Step 3: Start Arch Gateway
 ```bash
 echo "OPENAI_API_KEY=your-openai-api-key" > .env
 echo "ANTHROPIC_API_KEY=your-anthropic-api-key" >> .env
 ```
 ### 3. Install and Start Arch Gateway
 ```bash
 # Install and start the gateway
 pip install archgw
 archgw up
 ```
-### 4. Launch Claude Code with Multi-LLM Support
+### Step 4: Launch Enhanced Claude Code
 ```bash
 # This will launch Claude Code with multi-model routing
 archgw cli-agent claude
 ```
 ![claude code](claude_code.png)
-That's it! Claude Code now has access to multiple LLM providers with intelligent routing.
+### Monitor Model Selection in Real-Time
-## What You'll Experience
+While using Claude Code, open a **second terminal** and run this helper script to watch routing decisions. This script shows you:
 - **Which model** was selected for each request
 - **Real-time routing decisions** as you work
-### Screenshot Placeholder
+```bash
-![Claude Code with Multi-LLM Routing](screenshot-placeholder.png)
+# In a new terminal window (from the same directory)
-*Claude Code interface enhanced with intelligent model routing and multi-provider access*
+sh pretty_model_resolution.sh
 ```
 ![model_selection](model_selection.png)
-### Real-Time Model Selection
+## Understanding the Configuration
 When you interact with Claude Code, you'll get:
 - **Automatic model selection** based on your query type
 - **Transparent routing decisions** showing which model was chosen and why
 - **Seamless failover** if a model becomes unavailable
-## Configuration
+The `config.yaml` file defines your multi-model setup:
 The setup uses the included `config.yaml` file which defines:
 ### Multi-Provider Access
 ```yaml
 llm_providers:
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
-    - name: code generation
+      - name: code generation
        description: generating new code snippets and functions
  - model: anthropic/claude-3-5-sonnet-20241022
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
-        name: code understanding
+      - name: code understanding
        description: explaining and analyzing existing code
 ```
 ## Advanced Usage
-### Custom Model Selection
+### Override Model Selection
 ```bash
 # Force a specific model for this session
 archgw cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-coder-v2"}'
 # Enable detailed routing information
 archgw cli-agent claude --settings='{"statusLine": {"type": "command", "command": "ccr statusline"}}'
 ```
 ### Environment Variables
-The system automatically configures:
+The system automatically configures these variables for Claude Code:
 ```bash
 ANTHROPIC_BASE_URL=http://127.0.0.1:12000  # Routes through Arch Gateway
 ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast    # Uses intelligent alias
 ```
-## Real Developer Workflows
+### Custom Routing Configuration
 Edit `config.yaml` to define custom task→model mappings:
-This intelligent routing is powered by our research in preference-aligned LLM routing:
+```yaml
- **Research Paper**: [Preference-Aligned LLM Router](https://arxiv.org/abs/2506.16655)
+llm_providers:
- **Technical Docs**: [docs.archgw.com](https://docs.archgw.com)
+  # OpenAI Models
  - model: openai/gpt-5-2025-08-07
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
 ```
 ## Technical Details
 **How routing works:** Arch intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
 **Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
 **Documentation:** [docs.archgw.com](https://docs.archgw.com) for advanced configuration and API details.
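
Both `config.yaml` examples in this README reference API keys as `access_key: $NAME`, so an unset variable would presumably only show up once the gateway starts. The sketch below is not part of this commit; `check_access_keys` is a hypothetical helper that assumes every key is written in exactly that `access_key: $NAME` form. It lists the referenced variables that are unset or empty.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the commit): report every
# environment variable referenced as "access_key: $NAME" in a config file
# that is currently unset or empty.
check_access_keys() (
  config="$1"
  missing=0
  # Extract NAME from lines shaped like "access_key: $NAME".
  for var in $(sed -n 's/^[[:space:]]*access_key:[[:space:]]*\$\([A-Za-z_][A-Za-z0-9_]*\).*$/\1/p' "$config" | sort -u); do
    # var is restricted to identifier characters by the sed pattern above,
    # so this eval only performs a plain variable lookup.
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  # Exit status 0 only when every referenced variable has a value.
  [ "$missing" -eq 0 ]
)
```

One way to use it is `check_access_keys config.yaml && archgw up`, so the gateway starts only when every referenced key has a value. The function body runs in a subshell, which keeps its working variables out of the calling shell.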
--- a/demos/use_cases/claude_code/claude_code.png
+++ b/demos/use_cases/claude_code/claude_code.png
--- a/demos/use_cases/claude_code/model_selection.png
+++ b/demos/use_cases/claude_code/model_selection.png
--- a/demos/use_cases/claude_code/pretty_model_resolution.sh
+++ b/demos/use_cases/claude_code/pretty_model_resolution.sh
@@ -0,0 +1,33 @@
 #!/usr/bin/env bash
 # Pretty-print ArchGW MODEL_RESOLUTION lines from docker logs
 # - hides Arch-Router
 # - prints timestamp
 # - colors MODEL_RESOLUTION red
 # - colors req_model cyan
 # - colors resolved_model magenta
 # - removes provider and streaming
 docker logs -f archgw 2>&1 \
 | awk '
 /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
  # extract timestamp between first [ and ]
  ts=""
  if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
    ts=substr($0, RSTART+1, RLENGTH-2)
  }
  # split out after MODEL_RESOLUTION:
  n = split($0, parts, /MODEL_RESOLUTION: */)
  line = parts[2]
  # remove provider and streaming fields
  sub(/ *provider='\''[^'\'']+'\''/, "", line)
  sub(/ *streaming=(true|false)/, "", line)
  # highlight fields
  gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
  gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
  # print timestamp + MODEL_RESOLUTION
  printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
 }'
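
The script above reads directly from `docker logs -f archgw`, so it cannot be tried without a running container. As a rough sketch (not part of this commit), the same awk logic can be wrapped in a function that reads standard input and emits colors only when stdout is a terminal. The function name `format_model_resolution` is hypothetical, and the sample log line in the usage note is an assumption inferred from the script's regular expressions, not taken from real gateway output.

```bash
#!/usr/bin/env bash
# Sketch only: same filtering idea as pretty_model_resolution.sh, but reading
# stdin so it can be exercised without Docker. Colors are used only when
# stdout is a terminal, so piped or redirected output stays plain text.
format_model_resolution() {
  color=0
  [ -t 1 ] && color=1
  awk -v color="$color" '
  /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
    # extract timestamp between first [ and ]
    ts = ""
    if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
      ts = substr($0, RSTART + 1, RLENGTH - 2)
    }
    # keep only the text after MODEL_RESOLUTION:
    split($0, parts, /MODEL_RESOLUTION: */)
    line = parts[2]
    # remove provider and streaming fields
    sub(/ *provider='\''[^'\'']+'\''/, "", line)
    sub(/ *streaming=(true|false)/, "", line)
    if (color) {
      gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
      gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
      printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
    } else {
      printf "[%s] MODEL_RESOLUTION: %s\n", ts, line
    }
  }'
}
```

With this in place, `docker logs -f archgw 2>&1 | format_model_resolution` should behave like the original script in a terminal. An assumed sample line such as `[2025-09-29 21:20:52.123] [info] MODEL_RESOLUTION: req_model='a' provider='openai' resolved_model='b' streaming=true` can be piped in with `printf` to check the trimming offline.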