merge main

Adil Hafeez 2026-03-11 19:15:33 +00:00
commit 692499d910
22 changed files with 1771 additions and 215 deletions

@ -16,6 +16,7 @@ This directory contains demos showcasing Plano's capabilities as an AI-native pr
| [Preference-Based Routing](llm_routing/preference_based_routing/) | Routes prompts to LLMs based on user-defined preferences and task type (e.g. code generation vs. understanding) |
| [Model Alias Routing](llm_routing/model_alias_routing/) | Maps semantic aliases (`arch.summarize.v1`) to provider-specific models for centralized governance |
| [Claude Code Router](llm_routing/claude_code_router/) | Extends Claude Code with multi-provider access and preference-aligned routing for coding tasks |
| [Codex Router](llm_routing/codex_router/) | Extends Codex CLI with multi-provider access and preference-aligned routing for coding tasks |
## Agent Orchestration

@ -0,0 +1,92 @@
# Codex Router - Multi-Model Access with Intelligent Routing
Plano extends Codex CLI to access multiple LLM providers through a single interface. This gives you:
1. **Access to Models**: Connect to OpenAI, Anthropic, xAI, Gemini, and local models via Ollama
2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
   - Code generation and implementation
   - Code understanding and analysis
   - Debugging and optimization
   - Architecture and system design

Plano uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to automatically select the best model based on your request type.
## Benefits
- **Single Interface**: Access multiple LLM providers through the same Codex CLI
- **Task-Aware Routing**: Requests are analyzed and routed to models based on task type (code generation vs code understanding)
- **Provider Flexibility**: Add or remove providers without changing your workflow
- **Routing Transparency**: See which model handles each request and why
## Quick Start
### Prerequisites
```bash
# Install Codex CLI
npm install -g @openai/codex
# Install Plano CLI
pip install planoai
```
### Step 1: Open the Demo
```bash
git clone https://github.com/katanemo/arch.git
cd arch/demos/llm_routing/codex_router
```
### Step 2: Set API Keys
```bash
export OPENAI_API_KEY="your-openai-key-here"
export ANTHROPIC_API_KEY="your-anthropic-key-here"
export GROK_API_KEY="your-xai-key-here"  # config.yaml reads $GROK_API_KEY for the xAI provider
export GEMINI_API_KEY="your-gemini-key-here"
```
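Before starting Plano, it can help to confirm that the keys the demo config actually reads are present. The following is a minimal sketch, assuming only that the active providers in this demo's `config.yaml` reference `$OPENAI_API_KEY` and `$GROK_API_KEY`; `missing_keys` is a hypothetical helper, not part of the Plano CLI.

```bash
# Print the name of every listed environment variable that is unset or empty.
# Sketch only: missing_keys is a hypothetical helper, not part of planoai.
missing_keys() {
  for name in "$@"; do
    # Look the variable up by name; unset and empty are treated the same.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Keys referenced by the active providers in this demo's config.yaml.
missing=$(missing_keys OPENAI_API_KEY GROK_API_KEY)
if [ -n "$missing" ]; then
  echo "Missing API keys:" $missing >&2
fi
```

Add further variable names to the call if you enable the commented-out providers in the config.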
### Step 3: Start Plano
```bash
planoai up
# or: uvx planoai up
```
### Step 4: Launch Codex Through Plano
```bash
planoai cli-agent codex
# or: uvx planoai cli-agent codex
```
By default, `planoai cli-agent codex` starts Codex with `gpt-5.3-codex`. With this demo config:
- `project understanding` prompts are routed to `xai/grok-4-1-fast-non-reasoning`
- `code generation` prompts are routed to `gpt-5.3-codex`
## Monitor Routing Decisions
In a second terminal:
```bash
sh pretty_model_resolution.sh
```
This shows, for each request, the requested model (`req_model`) and the final model selected by Plano's router (`resolved_model`).
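For a quick summary instead of a live stream, the same log lines can be tallied per selected model. This is a sketch under assumptions: the line layout (`MODEL_RESOLUTION: req_model='...' resolved_model='...'`) is inferred from the patterns in `pretty_model_resolution.sh`, the sample lines below are made up for illustration, and `count_resolved` is a hypothetical helper.

```bash
# Count how often each resolved_model appears in MODEL_RESOLUTION log lines.
# count_resolved is a hypothetical helper; it reads log text on stdin.
count_resolved() {
  awk '
    /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
      if (match($0, /resolved_model='\''[^'\'']+'\''/)) {
        # Skip the 16-character prefix (resolved_model= plus the opening
        # quote) and drop the closing quote.
        model = substr($0, RSTART + 16, RLENGTH - 17)
        counts[model]++
      }
    }
    END { for (m in counts) print counts[m], m }
  ' | sort -k2
}

# Illustrative input only; real lines come from the Plano container logs.
printf '%s\n' \
  "[2026-03-11 19:15:33.001] MODEL_RESOLUTION: req_model='gpt-5.3-codex' resolved_model='gpt-5.3-codex' provider='openai' streaming=true" \
  "[2026-03-11 19:15:40.250] MODEL_RESOLUTION: req_model='gpt-5.3-codex' resolved_model='grok-4-1-fast-non-reasoning' provider='xai' streaming=true" \
  "[2026-03-11 19:15:52.730] MODEL_RESOLUTION: req_model='gpt-5.3-codex' resolved_model='gpt-5.3-codex' provider='openai' streaming=true" \
  | count_resolved
```

Against a running demo, the equivalent call would be `docker logs plano 2>&1 | count_resolved`.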
## Configuration Highlights
`config.yaml` demonstrates:
- OpenAI default model for Codex sessions (`gpt-5.3-codex`)
- Routing preference for project understanding (`xai/grok-4-1-fast-non-reasoning`)
- Additional providers (a local Ollama model, plus a commented-out Anthropic entry) to show cross-provider routing support
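To check which providers are actually enabled, you can list the uncommented `- model:` entries in the config. A minimal sketch, assuming each provider is declared on its own `- model: <name>` line as in this demo's `config.yaml`; `active_models` is a hypothetical helper and does not parse YAML properly.

```bash
# List the models that are active (not commented out) in a Plano config file.
# active_models is a hypothetical helper; it only pattern-matches "- model:" lines.
active_models() {
  # Commented entries start with "#", so their first field is never "-".
  awk '$1 == "-" && $2 == "model:" { print $3 }' "$1"
}

# Run against the demo config when it is present in the current directory.
if [ -f config.yaml ]; then
  active_models config.yaml
fi
```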
## Optional Overrides
Set a different Codex session model:
```bash
planoai cli-agent codex --settings='{"CODEX_MODEL":"gpt-5-2025-08-07"}'
```

@ -0,0 +1,38 @@
version: v0.3.0

listeners:
  - type: model
    name: model_listener
    port: 12000

model_providers:
  # OpenAI models used by Codex defaults and preference routing
  - model: openai/gpt-5.3-codex
    default: true
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements

  - model: xai/grok-4-1-fast-non-reasoning
    access_key: $GROK_API_KEY
    routing_preferences:
      - name: project understanding
        description: understand repository structure, codebase, and code files, readmes, and other documentation

  # Additional providers (optional): Codex can route to any configured model
  # - model: anthropic/claude-sonnet-4-5
  #   access_key: $ANTHROPIC_API_KEY
  # - model: xai/grok-4-1-fast-non-reasoning
  #   access_key: $GROK_API_KEY

  - model: ollama/llama3.1
    base_url: http://localhost:11434

model_aliases:
  arch.codex.default:
    target: gpt-5.3-codex

tracing:
  random_sampling: 100

@ -0,0 +1,33 @@
#!/usr/bin/env bash
# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
# - hides Arch-Router
# - prints timestamp
# - colors MODEL_RESOLUTION red
# - colors req_model cyan
# - colors resolved_model magenta
# - removes provider and streaming
docker logs -f plano 2>&1 \
  | awk '
    /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
      # extract timestamp between first [ and ]
      ts=""
      if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
        ts=substr($0, RSTART+1, RLENGTH-2)
      }

      # split out after MODEL_RESOLUTION:
      n = split($0, parts, /MODEL_RESOLUTION: */)
      line = parts[2]

      # remove provider and streaming fields
      sub(/ *provider='\''[^'\'']+'\''/, "", line)
      sub(/ *streaming=(true|false)/, "", line)

      # highlight fields
      gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
      gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)

      # print timestamp + MODEL_RESOLUTION
      printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
    }'

@ -62,4 +62,59 @@ curl -s "$PLANO_URL/routing/v1/messages" \
}' | python3 -m json.tool
echo ""
# --- Example 5: Inline routing policy in request body ---
echo "--- 5. Inline routing_policy (no config needed) ---"
echo ""
curl -s "$PLANO_URL/routing/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Write a quicksort implementation in Go"}
    ],
    "routing_policy": [
      {
        "model": "openai/gpt-4o",
        "routing_preferences": [
          {"name": "coding", "description": "code generation, writing functions, debugging"}
        ]
      },
      {
        "model": "openai/gpt-4o-mini",
        "routing_preferences": [
          {"name": "general", "description": "general questions, simple lookups, casual conversation"}
        ]
      }
    ]
  }' | python3 -m json.tool
echo ""
# --- Example 6: Inline routing policy with Anthropic format ---
echo "--- 6. Inline routing_policy (Anthropic format) ---"
echo ""
curl -s "$PLANO_URL/routing/v1/messages" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather like today?"}
    ],
    "routing_policy": [
      {
        "model": "openai/gpt-4o",
        "routing_preferences": [
          {"name": "coding", "description": "code generation, writing functions, debugging"}
        ]
      },
      {
        "model": "openai/gpt-4o-mini",
        "routing_preferences": [
          {"name": "general", "description": "general questions, simple lookups, casual conversation"}
        ]
      }
    ]
  }' | python3 -m json.tool
echo ""
echo "=== Demo Complete ==="