Merge branch 'main' into adil/agent_format

This commit is contained in:
Adil Hafeez 2025-09-30 11:39:34 -07:00
commit 2cebc0c85f
33 changed files with 1369 additions and 421 deletions


@ -0,0 +1,146 @@
# Claude Code Router - Multi-Model Access with Intelligent Routing
Arch Gateway lets Claude Code access multiple LLM providers through a single interface. It offers two key benefits:
1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
   - Code generation and implementation
   - Code reviews and analysis
   - Architecture and system design
   - Debugging and optimization
   - Documentation and explanations

Arch Gateway uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to automatically select the best model based on your request type.
## Benefits
- **Single Interface**: Access multiple LLM providers through the same Claude Code CLI
- **Task-Aware Routing**: Requests are analyzed and routed to models based on task type (code generation, debugging, architecture, documentation)
- **Provider Flexibility**: Add or remove LLM providers without changing your workflow
- **Routing Transparency**: See which model handles each request and why
## How It Works
Arch Gateway sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
```
Your Request → Arch Gateway → Suitable Model → Response
     [Task Analysis & Model Selection]
```
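The flow in the diagram can be sketched in a few lines of Python. This is an illustration only, not Arch Gateway's implementation: `route`, `ROUTES`, and `keyword_classify` are hypothetical names, and the keyword matcher merely stands in for the router LLM. The model names are taken from this demo's `config.yaml`, where the Claude Sonnet entry is marked as the default.

```python
from typing import Callable, Dict, Optional

# Hypothetical routing table: task name -> model (names from this demo's config.yaml).
ROUTES: Dict[str, str] = {
    "code generation": "openai/gpt-5-2025-08-07",
    "code understanding": "openai/gpt-4.1-2025-04-14",
}
DEFAULT_MODEL = "anthropic/claude-sonnet-4-5"


def keyword_classify(request: str) -> Optional[str]:
    """Stand-in for the router LLM: a naive keyword match, for illustration only."""
    text = request.lower()
    if "explain" in text or "what does" in text:
        return "code understanding"
    if "write" in text or "implement" in text:
        return "code generation"
    return None


def route(request: str, classify: Callable[[str], Optional[str]]) -> str:
    """Return the model for a request, falling back to the default when no task matches."""
    task = classify(request)
    return ROUTES.get(task, DEFAULT_MODEL) if task else DEFAULT_MODEL
```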
**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.archgw.com/concepts/llm_providers/supported_providers.html).
## Quick Start (5 minutes)
### Prerequisites
```bash
# Install Claude Code if you haven't already
npm install -g @anthropic-ai/claude-code
# Ensure Docker is running
docker --version
```
### Step 1: Get Configuration
```bash
# Clone and navigate to demo
git clone https://github.com/katanemo/arch.git
cd arch/demos/use_cases/claude_code
```
### Step 2: Set API Keys
```bash
# Copy the sample environment file
cp .env .env.local
# Edit with your actual API keys
export OPENAI_API_KEY="your-openai-key-here"
export ANTHROPIC_API_KEY="your-anthropic-key-here"
# Add other providers as needed
```
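Before starting the gateway, it can help to confirm that the keys are actually visible to the process. The following is a minimal sketch; `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and the list of required keys is an assumption based on the two exports above.

```python
from typing import Iterable, List, Mapping

# The two keys exported above; extend the tuple for other providers.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env: Mapping[str, str],
                 required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_keys(os.environ)` returns an empty list when both keys are set.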
### Step 3: Start Arch Gateway
```bash
# Install and start the gateway
pip install archgw
archgw up
```
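If you script the setup, a quick TCP check can confirm that something is accepting connections on the gateway port. This is a hedged sketch: `gateway_is_listening` is a hypothetical helper, port 12000 is the listener port from this demo's configuration, and a successful connection only shows that the port is open, not that the gateway is healthy.

```python
import socket


def gateway_is_listening(host: str = "127.0.0.1", port: int = 12000,
                         timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```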
### Step 4: Launch Enhanced Claude Code
```bash
# This will launch Claude Code with multi-model routing
archgw cli-agent claude
```
![claude code](claude_code.png)
### Monitor Model Selection in Real-Time
While using Claude Code, open a **second terminal** and run this helper script to watch routing decisions. This script shows you:
- **Which model** was selected for each request
- **Real-time routing decisions** as you work
```bash
# In a new terminal window (from the same directory)
sh pretty_model_resolution.sh
```
![model_selection](model_selection.png)
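
To process routing decisions programmatically rather than just watch them, the same log lines can be parsed in Python. This sketch assumes the `key='value'` field layout implied by the regular expressions in `pretty_model_resolution.sh`; the sample line is hypothetical, and `parse_model_resolution` is not part of archgw.

```python
import re
from typing import Dict, Optional

# Field pattern inferred from the helper script's regexes: key='value'.
_FIELD = re.compile(r"(\w+)='([^']+)'")

# Hypothetical log line, shaped to match what the helper script expects.
SAMPLE = (
    "[2025-09-30 11:39:34.123] MODEL_RESOLUTION: "
    "req_model='arch.claude.code.small.fast' "
    "resolved_model='claude-3-haiku-20240307' provider='anthropic' streaming=true"
)


def parse_model_resolution(line: str) -> Optional[Dict[str, str]]:
    """Extract key='value' fields from a MODEL_RESOLUTION line, or None if it does not apply."""
    marker = "MODEL_RESOLUTION:"
    if marker not in line or "Arch-Router" in line:
        return None  # same filter as the helper script
    payload = line.split(marker, 1)[1]
    return dict(_FIELD.findall(payload))
```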
## Understanding the Configuration
The `config.yaml` file defines your multi-model setup:
```yaml
llm_providers:
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets and functions
  - model: anthropic/claude-3-5-sonnet-20241022
    access_key: $ANTHROPIC_API_KEY
    routing_preferences:
      - name: code understanding
        description: explaining and analyzing existing code
```
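One way to read this configuration is as a table from preference name to model. The sketch below builds that table from the structure a YAML parser would produce for the snippet above. `preference_table` is a hypothetical helper, and the duplicate-name check is this sketch's own sanity check rather than a documented Arch Gateway rule.

```python
from typing import Any, Dict

# The YAML above, written as the Python structure a YAML parser would produce.
CONFIG: Dict[str, Any] = {
    "llm_providers": [
        {
            "model": "openai/gpt-4.1-2025-04-14",
            "access_key": "$OPENAI_API_KEY",
            "routing_preferences": [
                {"name": "code generation",
                 "description": "generating new code snippets and functions"},
            ],
        },
        {
            "model": "anthropic/claude-3-5-sonnet-20241022",
            "access_key": "$ANTHROPIC_API_KEY",
            "routing_preferences": [
                {"name": "code understanding",
                 "description": "explaining and analyzing existing code"},
            ],
        },
    ]
}


def preference_table(config: Dict[str, Any]) -> Dict[str, str]:
    """Map each routing preference name to the model configured to handle it."""
    table: Dict[str, str] = {}
    for provider in config.get("llm_providers", []):
        # Providers without routing_preferences simply contribute nothing.
        for pref in provider.get("routing_preferences", []):
            name = pref["name"]
            if name in table:
                raise ValueError(f"preference {name!r} is assigned to more than one model")
            table[name] = provider["model"]
    return table
```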
## Advanced Usage
### Override Model Selection
```bash
# Force a specific model for this session
archgw cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-coder-v2"}'
```
### Environment Variables
The system automatically configures these variables for Claude Code:
```bash
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
```
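For reference, this is roughly what setting those two variables by hand would look like when launching a process from Python. `archgw cli-agent claude` already does this for you; `claude_code_env` is a hypothetical helper, and how the CLI sets the variables internally is an assumption.

```python
import os
from typing import Dict, Mapping, Optional


def claude_code_env(base: Optional[Mapping[str, str]] = None,
                    gateway_url: str = "http://127.0.0.1:12000",
                    small_fast_model: str = "arch.claude.code.small.fast") -> Dict[str, str]:
    """Return a copy of the environment with the two gateway variables set."""
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = gateway_url  # send requests through Arch Gateway
    env["ANTHROPIC_SMALL_FAST_MODEL"] = small_fast_model  # alias resolved by the gateway
    return env
```

The result can be passed as the `env` argument of `subprocess.run` when starting the `claude` binary directly.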
### Custom Routing Configuration
Edit `config.yaml` to define custom task→model mappings:
```yaml
llm_providers:
  # OpenAI Models
  - model: openai/gpt-5-2025-08-07
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
```
## Technical Details
**How routing works:** Arch intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
**Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655).
**Documentation:** See [docs.archgw.com](https://docs.archgw.com) for advanced configuration and API details.

Binary file not shown (image, 205 KiB).


@ -0,0 +1,41 @@
version: v0.1

listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  # OpenAI Models
  - model: openai/gpt-5-2025-08-07
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
  # Anthropic Models
  - model: anthropic/claude-sonnet-4-5
    default: true
    access_key: $ANTHROPIC_API_KEY
  - model: anthropic/claude-3-haiku-20240307
    access_key: $ANTHROPIC_API_KEY
  # Ollama Models
  - model: ollama/llama3.1
    base_url: http://host.docker.internal:11434

# Model aliases - friendly names that map to actual provider names
model_aliases:
  # Alias for a small faster Claude model
  arch.claude.code.small.fast:
    target: claude-3-haiku-20240307

Binary file not shown (image, 215 KiB).


@ -0,0 +1,33 @@
#!/usr/bin/env bash
# Pretty-print ArchGW MODEL_RESOLUTION lines from docker logs
# - hides Arch-Router
# - prints timestamp
# - colors MODEL_RESOLUTION red
# - colors req_model cyan
# - colors resolved_model magenta
# - removes provider and streaming
docker logs -f archgw 2>&1 \
  | awk '
    /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
      # extract timestamp between first [ and ]
      ts=""
      if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
        ts=substr($0, RSTART+1, RLENGTH-2)
      }
      # split out after MODEL_RESOLUTION:
      n = split($0, parts, /MODEL_RESOLUTION: */)
      line = parts[2]
      # remove provider and streaming fields
      sub(/ *provider='\''[^'\'']+'\''/, "", line)
      sub(/ *streaming=(true|false)/, "", line)
      # highlight fields
      gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
      gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
      # print timestamp + MODEL_RESOLUTION
      printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
    }'


@ -24,7 +24,7 @@ llm_providers:
access_key: $OPENAI_API_KEY
# Anthropic Models
- - model: anthropic/claude-3-5-sonnet-20241022
+ - model: anthropic/claude-sonnet-4-20250514
access_key: $ANTHROPIC_API_KEY
- model: anthropic/claude-3-haiku-20240307
@ -56,7 +56,7 @@ model_aliases:
# Alias for creative tasks -> Claude model
arch.creative.v1:
- target: claude-3-5-sonnet-20241022
+ target: claude-sonnet-4-20250514
# Alias for quick responses -> fast model
arch.fast.v1:
@ -67,7 +67,7 @@ model_aliases:
target: gpt-5-mini-2025-08-07
chat-model:
- target: llama3.1
+ target: gpt-5-mini-2025-08-07
creative-model:
- target: claude-3-5-sonnet-20241022
+ target: claude-sonnet-4-20250514


@ -12,7 +12,7 @@ python = ">=3.10,<3.13.3"
pydantic = "^2.0"
openai = "^1.0"
pyyaml = "^6.0"
- archgw ="^0.3.13"
+ archgw ="^0.3.14"
[tool.poetry.group.dev.dependencies]
pytest = "^8.3"


@ -14,9 +14,9 @@ Make sure your machine is up to date with [latest version of archgw]([url](https
2. start archgw in the foreground
```bash
(venv) $ archgw up --service archgw --foreground
- 2025-05-30 18:00:09,953 - cli.main - INFO - Starting archgw cli version: 0.3.13
+ 2025-05-30 18:00:09,953 - cli.main - INFO - Starting archgw cli version: 0.3.14
2025-05-30 18:00:09,953 - cli.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/arch_config.yaml
- 2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: archgw, tag: katanemo/archgw:0.3.13
+ 2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: archgw, tag: katanemo/archgw:0.3.14
2025-05-30 18:00:10,662 - cli.core - INFO - archgw status: running, health status: starting
2025-05-30 18:00:11,712 - cli.core - INFO - archgw status: running, health status: starting
2025-05-30 18:00:12,761 - cli.core - INFO - archgw is running and is healthy!