mirror of https://github.com/katanemo/plano.git
synced 2026-05-15 11:02:39 +02:00

feat: add Codex and OpenCode routers with multi-provider support and updated configurations

parent 69d650a4e5
commit 5d35a3ae18

8 changed files with 347 additions and 4 deletions

@@ -16,6 +16,8 @@ This directory contains demos showcasing Plano's capabilities as an AI-native pr
 | [Preference-Based Routing](llm_routing/preference_based_routing/) | Routes prompts to LLMs based on user-defined preferences and task type (e.g. code generation vs. understanding) |
 | [Model Alias Routing](llm_routing/model_alias_routing/) | Maps semantic aliases (`arch.summarize.v1`) to provider-specific models for centralized governance |
 | [Claude Code Router](llm_routing/claude_code_router/) | Extends Claude Code with multi-provider access and preference-aligned routing for coding tasks |
+| [Codex Router](llm_routing/codex_router/) | Extends Codex CLI with multi-provider access and preference-aligned routing for coding tasks |
+| [OpenCode Router](llm_routing/opencode_router/) | Extends OpenCode CLI with multi-provider access and preference-aligned routing for coding tasks |
 
 ## Agent Orchestration
 

@@ -19,11 +19,11 @@ model_providers:
       - name: code understanding
         description: understand and explain existing code snippets, functions, or libraries
   # Anthropic Models
-  - model: anthropic/claude-sonnet-4-5
-    default: true
+  - model: anthropic/claude-sonnet-4-6
     access_key: $ANTHROPIC_API_KEY
 
-  - model: anthropic/claude-haiku-4-5
+  - model: anthropic/claude-haiku-4-5-20251001
+    default: true
     access_key: $ANTHROPIC_API_KEY
 
   # Ollama Models
@@ -35,7 +35,7 @@ model_providers:
 model_aliases:
   # Alias for a small faster Claude model
   arch.claude.code.small.fast:
-    target: claude-haiku-4-5
+    target: claude-haiku-4-5-20251001
 
 tracing:
   random_sampling: 100

demos/llm_routing/codex_router/README.md (Normal file, 102 lines changed)
@@ -0,0 +1,102 @@
+# Codex Router - Multi-Model Access with Intelligent Routing
+
+Plano extends Codex to access multiple LLM providers through a single interface and route coding requests to the best configured model.
+
+## Benefits
+
+- **Single Interface**: Use Codex while routing through Plano
+- **Task-Aware Routing**: Route requests based on coding task intent
+- **Provider Flexibility**: Mix OpenAI, Anthropic, and local models behind one endpoint
+- **Routing Transparency**: Inspect exactly which model served each request
+
+## How It Works
+
+Plano sits between Codex and configured providers:
+
+```text
+Your Request -> Codex -> Plano -> Selected Model -> Response
+```
+
+## Quick Start
+
+### Prerequisites
+
+```bash
+# Install Codex CLI
+npm install -g @openai/codex
+
+# Ensure Docker is running
+docker --version
+```
+
+### 1) Enter this demo directory
+
+```bash
+cd demos/llm_routing/codex_router
+```
+
+### 2) Set API keys
+
+```bash
+export OPENAI_API_KEY="your-openai-key-here"
+export ANTHROPIC_API_KEY="your-anthropic-key-here"
+```
+
+### 3) Start Plano
+
+```bash
+# Install with uv (recommended)
+uv tool install planoai
+planoai up
+
+# Or if already installed with uv
+uvx planoai up
+```
+
+### 4) Launch Codex through Plano
+
+```bash
+planoai cli-agent codex
+# Or if installed with uv:
+uvx planoai cli-agent codex
+```
+
+The Codex launcher integration configures:
+
+```bash
+OPENAI_BASE_URL=http://127.0.0.1:12000/v1
+OPENAI_API_KEY=test
+```
+
+If `arch.codex.default` exists in `model_aliases`, `planoai cli-agent codex` automatically starts Codex with:
+
+```bash
+codex -m arch.codex.default
+```
+
+## Monitor Routing Decisions
+
+In a second terminal:
+
+```bash
+sh pretty_model_resolution.sh
+```
+
+This prints `MODEL_RESOLUTION` lines so you can see request model -> resolved model mappings in real time.
+
+## Advanced Usage
+
+### Override Codex model for a session
+
+```bash
+planoai cli-agent codex --settings='{"CODEX_MODEL":"openai/gpt-4.1-2025-04-14"}'
+```
+
+### Context window guidance
+
+Codex works best with a large context window. Use models/configuration that support at least 64k context when possible.
+
+## Notes
+
+- Plano's `default: true` model is only used when a client request does not specify a model.
+- If Codex sends an explicit model in requests, aliasing/routing rules decide the final upstream model.

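
The two Notes at the end of the Codex README describe how the final upstream model is chosen. A minimal sketch of that behaviour follows. It is not Plano's implementation: `resolve_model` is a hypothetical helper, the alias table is hard-coded from the demo `config.yaml`, preference-based routing is not modelled, and forwarding an explicit non-alias model unchanged is an assumption.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: models the behaviour described in the README Notes,
# not Plano's actual resolution code.

DEFAULT_MODEL="openai/gpt-5-2025-08-07"   # the provider marked `default: true`

# resolve_model <requested-model>  (hypothetical helper)
resolve_model() {
  requested="$1"
  case "$requested" in
    "")
      # No model in the client request: fall back to the default provider.
      printf '%s\n' "$DEFAULT_MODEL" ;;
    arch.codex.default)
      # Alias from model_aliases: swap in its target.
      printf '%s\n' "gpt-5-2025-08-07" ;;
    *)
      # Assumption: an explicit, non-alias model is forwarded unchanged.
      printf '%s\n' "$requested" ;;
  esac
}

resolve_model ""                            # request without a model
resolve_model "arch.codex.default"          # alias sent by the Codex launcher
resolve_model "openai/gpt-4.1-2025-04-14"   # explicit model
```

Because the launcher starts Codex with `-m arch.codex.default`, requests from that session would take the alias branch rather than the `default: true` branch.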
demos/llm_routing/codex_router/config.yaml (Normal file, 38 lines changed)
@@ -0,0 +1,38 @@
+version: v0.3.0
+
+listeners:
+  - type: model
+    name: model_listener
+    port: 12000
+
+model_providers:
+  # OpenAI Models
+  - model: openai/gpt-5-2025-08-07
+    default: true
+    access_key: $OPENAI_API_KEY
+    routing_preferences:
+      - name: code generation
+        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
+
+  - model: openai/gpt-4.1-2025-04-14
+    access_key: $OPENAI_API_KEY
+    routing_preferences:
+      - name: code understanding
+        description: understand and explain existing code snippets, functions, or libraries
+
+  # Anthropic Model
+  - model: anthropic/claude-sonnet-4-6
+    access_key: $ANTHROPIC_API_KEY
+
+  # Ollama Model (optional local fallback)
+  - model: ollama/llama3.1
+    base_url: http://host.docker.internal:11434
+
+# Model aliases for Codex sessions
+model_aliases:
+  # Default model Codex should request when launched by planoai cli-agent codex
+  arch.codex.default:
+    target: gpt-5-2025-08-07
+
+tracing:
+  random_sampling: 100

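
In the Codex config, the alias target (`gpt-5-2025-08-07`) is the model name of a configured provider without its `openai/` prefix. The sketch below checks that relationship for a config file. It is a rough sanity check under assumptions: the helper names are hypothetical, the matching rule (target equals model name minus provider prefix) is inferred from this one config rather than from Plano's documentation, and the `awk` patterns rely on the simple `key: value` layout used here instead of parsing YAML properly.

```shell
#!/usr/bin/env bash
# Sketch under assumptions: relies on the simple "key: value" layout of the
# demo config and is not a general YAML parser.

# Print every configured model name with its provider prefix removed.
list_models() {
  awk '$1 == "-" && $2 == "model:" {
    m = $3
    i = index(m, "/")
    if (i) m = substr(m, i + 1)
    print m
  }' "$1"
}

# Print every alias target.
list_targets() {
  awk '$1 == "target:" { print $2 }' "$1"
}

# Report each alias target; return non-zero if any matches no configured model.
check_alias_targets() {
  config="$1"
  status=0
  for target in $(list_targets "$config"); do
    if list_models "$config" | grep -Fxq -- "$target"; then
      echo "ok: $target"
    else
      echo "unmatched: $target"
      status=1
    fi
  done
  return "$status"
}

# Demo against a trimmed copy of the config.
demo_config="$(mktemp)"
cat > "$demo_config" <<'EOF'
model_providers:
  - model: openai/gpt-5-2025-08-07
    default: true
  - model: openai/gpt-4.1-2025-04-14
model_aliases:
  arch.codex.default:
    target: gpt-5-2025-08-07
EOF
check_alias_targets "$demo_config"
rm -f "$demo_config"
```

A check like this could run before `planoai up` to catch a renamed model that an alias still points to, as happened with the `claude-haiku-4-5` target updated elsewhere in this commit.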
demos/llm_routing/codex_router/pretty_model_resolution.sh (Executable file, 33 lines changed)
@@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
+# - hides Arch-Router
+# - prints timestamp
+# - colors MODEL_RESOLUTION red
+# - colors req_model cyan
+# - colors resolved_model magenta
+# - removes provider and streaming
+
+docker logs -f plano 2>&1 \
+| awk '
+/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
+  # extract timestamp between first [ and ]
+  ts=""
+  if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
+    ts=substr($0, RSTART+1, RLENGTH-2)
+  }
+
+  # split out after MODEL_RESOLUTION:
+  n = split($0, parts, /MODEL_RESOLUTION: */)
+  line = parts[2]
+
+  # remove provider and streaming fields
+  sub(/ *provider='\''[^'\'']+'\''/, "", line)
+  sub(/ *streaming=(true|false)/, "", line)
+
+  # highlight fields
+  gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
+  gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
+
+  # print timestamp + MODEL_RESOLUTION
+  printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
+}'

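
To see what the filter in `pretty_model_resolution.sh` does without a running `plano` container, the same `awk` logic can be fed a single line from standard input. The sketch below is a color-free variant wrapped in a hypothetical `format_resolution` function. The sample log line is an assumption built to satisfy the script's regular expressions; the real Plano log format may differ.

```shell
#!/usr/bin/env bash
# Sketch only: color-free variant of the filter, reading stdin instead of docker logs.

# Hypothetical log line, shaped to match the regexes in pretty_model_resolution.sh.
sample="[2026-05-15 11:02:39.123] [info] MODEL_RESOLUTION: req_model='arch.codex.default' resolved_model='gpt-5-2025-08-07' provider='openai' streaming=true"

format_resolution() {
  awk '
  /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
    # timestamp is the text inside the first [YYYY-MM-DD HH:MM:SS.mmm] group
    ts = ""
    if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
      ts = substr($0, RSTART + 1, RLENGTH - 2)
    }

    # keep only what follows "MODEL_RESOLUTION:"
    split($0, parts, /MODEL_RESOLUTION: */)
    line = parts[2]

    # drop the provider and streaming fields
    sub(/ *provider='\''[^'\'']+'\''/, "", line)
    sub(/ *streaming=(true|false)/, "", line)

    printf "[%s] MODEL_RESOLUTION: %s\n", ts, line
  }'
}

printf '%s\n' "$sample" | format_resolution
```

Dropping the ANSI escapes makes the output easier to redirect to a file or compare in a test; the original script keeps them for interactive use.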
demos/llm_routing/opencode_router/README.md (Normal file, 97 lines changed)
@@ -0,0 +1,97 @@
+# OpenCode Router - Multi-Model Access with Intelligent Routing
+
+Plano extends OpenCode to access multiple LLM providers through a single interface and route coding requests to the best configured model.
+
+## Benefits
+
+- **Single Interface**: Use OpenCode while routing through Plano
+- **Task-Aware Routing**: Route requests based on coding task intent
+- **Provider Flexibility**: Mix OpenAI, Anthropic, and local models behind one endpoint
+- **Routing Transparency**: Inspect exactly which model served each request
+
+## How It Works
+
+Plano sits between OpenCode and configured providers:
+
+```text
+Your Request -> OpenCode -> Plano -> Selected Model -> Response
+```
+
+## Quick Start
+
+### Prerequisites
+
+- OpenCode CLI installed and available on your `PATH` (`opencode` command)
+- Docker running
+
+### 1) Enter this demo directory
+
+```bash
+cd demos/llm_routing/opencode_router
+```
+
+### 2) Set API keys
+
+```bash
+export OPENAI_API_KEY="your-openai-key-here"
+export ANTHROPIC_API_KEY="your-anthropic-key-here"
+```
+
+### 3) Start Plano
+
+```bash
+# Install with uv (recommended)
+uv tool install planoai
+planoai up
+
+# Or if already installed with uv
+uvx planoai up
+```
+
+### 4) Launch OpenCode through Plano
+
+```bash
+planoai cli-agent opencode
+# Or if installed with uv:
+uvx planoai cli-agent opencode
+```
+
+The OpenCode launcher integration configures:
+
+```bash
+OPENAI_BASE_URL=http://127.0.0.1:12000/v1
+OPENAI_API_KEY=test
+```
+
+If `arch.opencode.default` exists in `model_aliases`, `planoai cli-agent opencode` exports:
+
+```bash
+OPENAI_MODEL=<target-from-arch.opencode.default>
+```
+
+## Monitor Routing Decisions
+
+In a second terminal:
+
+```bash
+sh pretty_model_resolution.sh
+```
+
+This prints `MODEL_RESOLUTION` lines so you can see request model -> resolved model mappings in real time.
+
+## Advanced Usage
+
+### Override OpenCode model for a session
+
+```bash
+planoai cli-agent opencode --settings='{"OPENCODE_MODEL":"openai/gpt-4.1-2025-04-14"}'
+```
+
+### Context window guidance
+
+OpenCode works best with a large context window. Use models/configuration that support at least 64k context when possible.
+
+## Notes
+
+- Plano's `default: true` model is only used when a client request does not specify a model.
+- If OpenCode sends an explicit model in requests, aliasing/routing rules decide the final upstream model.

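
The OpenCode README lists three environment variables that the launcher sets: the base URL of the model listener, a placeholder API key, and (when the alias exists) `OPENAI_MODEL`. The sketch below prints the same set for a given port and alias target. `plano_client_env` is a hypothetical helper, not part of `planoai`, and the variable names and values are taken from the README rather than checked against the launcher's source.

```shell
#!/usr/bin/env bash
# Sketch only: prints the variables the README says the launcher sets.
# plano_client_env is a hypothetical helper, not part of planoai.

plano_client_env() {
  port="${1:-12000}"   # model listener port from config.yaml
  model="${2:-}"       # optional alias target, e.g. from arch.opencode.default
  echo "OPENAI_BASE_URL=http://127.0.0.1:${port}/v1"
  echo "OPENAI_API_KEY=test"
  # OPENAI_MODEL is only emitted when an alias target is supplied.
  if [ -n "$model" ]; then
    echo "OPENAI_MODEL=${model}"
  fi
}

plano_client_env 12000 "gpt-5-2025-08-07"
```

Under these assumptions, `env $(plano_client_env 12000 gpt-5-2025-08-07) opencode` would start OpenCode with that environment by hand, which can help when checking what the launcher is expected to do.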
demos/llm_routing/opencode_router/config.yaml (Normal file, 38 lines changed)
@@ -0,0 +1,38 @@
+version: v0.3.0
+
+listeners:
+  - type: model
+    name: model_listener
+    port: 12000
+
+model_providers:
+  # OpenAI Models
+  - model: openai/gpt-5-2025-08-07
+    default: true
+    access_key: $OPENAI_API_KEY
+    routing_preferences:
+      - name: code generation
+        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
+
+  - model: openai/gpt-4.1-2025-04-14
+    access_key: $OPENAI_API_KEY
+    routing_preferences:
+      - name: code understanding
+        description: understand and explain existing code snippets, functions, or libraries
+
+  # Anthropic Model
+  - model: anthropic/claude-sonnet-4-6
+    access_key: $ANTHROPIC_API_KEY
+
+  # Ollama Model (optional local fallback)
+  - model: ollama/llama3.1
+    base_url: http://host.docker.internal:11434
+
+# Model aliases for OpenCode sessions
+model_aliases:
+  # Default model OpenCode should request when launched by planoai cli-agent opencode
+  arch.opencode.default:
+    target: gpt-5-2025-08-07
+
+tracing:
+  random_sampling: 100

demos/llm_routing/opencode_router/pretty_model_resolution.sh (Executable file, 33 lines changed)
@@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
+# - hides Arch-Router
+# - prints timestamp
+# - colors MODEL_RESOLUTION red
+# - colors req_model cyan
+# - colors resolved_model magenta
+# - removes provider and streaming
+
+docker logs -f plano 2>&1 \
+| awk '
+/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
+  # extract timestamp between first [ and ]
+  ts=""
+  if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
+    ts=substr($0, RSTART+1, RLENGTH-2)
+  }
+
+  # split out after MODEL_RESOLUTION:
+  n = split($0, parts, /MODEL_RESOLUTION: */)
+  line = parts[2]
+
+  # remove provider and streaming fields
+  sub(/ *provider='\''[^'\'']+'\''/, "", line)
+  sub(/ *streaming=(true|false)/, "", line)
+
+  # highlight fields
+  gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
+  gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
+
+  # print timestamp + MODEL_RESOLUTION
+  printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
+}'