feat: add Codex and OpenCode routers with multi-provider support and updated configurations

2026-05-15 11:02:39 +02:00 · 2026-02-25 10:16:09 -08:00 · 2026-02-25 10:16:09 -08:00 · 5d35a3ae18
commit 5d35a3ae18
parent 69d650a4e5
8 changed files with 347 additions and 4 deletions
--- a/demos/README.md
+++ b/demos/README.md
@@ -16,6 +16,8 @@ This directory contains demos showcasing Plano's capabilities as an AI-native pr
 | [Preference-Based Routing](llm_routing/preference_based_routing/) | Routes prompts to LLMs based on user-defined preferences and task type (e.g. code generation vs. understanding) |
 | [Model Alias Routing](llm_routing/model_alias_routing/) | Maps semantic aliases (`arch.summarize.v1`) to provider-specific models for centralized governance |
 | [Claude Code Router](llm_routing/claude_code_router/) | Extends Claude Code with multi-provider access and preference-aligned routing for coding tasks |
 | [Codex Router](llm_routing/codex_router/) | Extends Codex CLI with multi-provider access and preference-aligned routing for coding tasks |
 | [OpenCode Router](llm_routing/opencode_router/) | Extends OpenCode CLI with multi-provider access and preference-aligned routing for coding tasks |
 ## Agent Orchestration
--- a/demos/llm_routing/claude_code_router/config.yaml
+++ b/demos/llm_routing/claude_code_router/config.yaml
@@ -19,11 +19,11 @@ model_providers:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
  # Anthropic Models
-  - model: anthropic/claude-sonnet-4-5
+  - model: anthropic/claude-sonnet-4-6
    default: true
    access_key: $ANTHROPIC_API_KEY
-  - model: anthropic/claude-haiku-4-5
+  - model: anthropic/claude-haiku-4-5-20251001
    default: true
    access_key: $ANTHROPIC_API_KEY
  # Ollama Models
@@ -35,7 +35,7 @@ model_providers:
 model_aliases:
  # Alias for a small faster Claude model
  arch.claude.code.small.fast:
-    target: claude-haiku-4-5
+    target: claude-haiku-4-5-20251001
 tracing:
  random_sampling: 100
--- a/demos/llm_routing/codex_router/README.md
+++ b/demos/llm_routing/codex_router/README.md
@@ -0,0 +1,102 @@
 # Codex Router - Multi-Model Access with Intelligent Routing
 Plano extends Codex to access multiple LLM providers through a single interface and route coding requests to the best configured model.
 ## Benefits
 - **Single Interface**: Use Codex while routing through Plano
 - **Task-Aware Routing**: Route requests based on coding task intent
 - **Provider Flexibility**: Mix OpenAI, Anthropic, and local models behind one endpoint
 - **Routing Transparency**: Inspect exactly which model served each request
 ## How It Works
 Plano sits between Codex and configured providers:
 ```text
 Your Request -> Codex -> Plano -> Selected Model -> Response
 ```
 ## Quick Start
 ### Prerequisites
 ```bash
 # Install Codex CLI
 npm install -g @openai/codex
 # Ensure Docker is running
 docker --version
 ```
 ### 1) Enter this demo directory
 ```bash
 cd demos/llm_routing/codex_router
 ```
 ### 2) Set API keys
 ```bash
 export OPENAI_API_KEY="your-openai-key-here"
 export ANTHROPIC_API_KEY="your-anthropic-key-here"
 ```
 ### 3) Start Plano
 ```bash
 # Install with uv (recommended)
 uv tool install planoai
 planoai up
 # Or run it through uvx without a separate install step
 uvx planoai up
 ```
 ### 4) Launch Codex through Plano
 ```bash
 planoai cli-agent codex
 # Or run it through uvx:
 uvx planoai cli-agent codex
 ```
 The Codex launcher integration configures:
 ```bash
 OPENAI_BASE_URL=http://127.0.0.1:12000/v1
 OPENAI_API_KEY=test
 ```
 If `arch.codex.default` exists in `model_aliases`, `planoai cli-agent codex` automatically starts Codex with:
 ```bash
 codex -m arch.codex.default
 ```
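The launcher's own alias lookup is not shown in this demo. As a rough pre-flight check, the sketch below tests whether an alias key is present under `model_aliases` in a config file. `has_alias` is a hypothetical helper, not a `planoai` command, and it assumes `model_aliases:` starts at column 0 with alias keys indented beneath it, as in this demo's `config.yaml`.

```bash
# Hypothetical helper (not part of planoai): succeed if an alias is defined.
#   $1 = alias name, $2 = path to a Plano config file
has_alias() {
  awk -v key="$1" '
    # Enter the model_aliases block (assumed to start at column 0)
    /^model_aliases:/ { in_aliases = 1; next }
    # Any other top-level key ends the block
    /^[^ #]/          { in_aliases = 0 }
    # Alias keys are indented lines of the form "<name>:"
    in_aliases && $1 == (key ":") { found = 1 }
    END { exit !found }
  ' "$2"
}
```

For example, `has_alias arch.codex.default config.yaml && echo "alias found"` prints the message only when the alias is defined.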
 ## Monitor Routing Decisions
 In a second terminal:
 ```bash
 sh pretty_model_resolution.sh
 ```
 This prints `MODEL_RESOLUTION` lines so you can see request model -> resolved model mappings in real time.
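To see what the filter does without a running container, the sketch below applies the same field-stripping steps to a single sample line, with the coloring and timestamp handling left out. The sample log line is hypothetical: its layout is inferred from the patterns in `pretty_model_resolution.sh`, not copied from real Plano output, and `strip_resolution` is an illustrative name.

```bash
# Hypothetical log line; field layout inferred from the awk patterns in the script
sample="[2026-02-25 10:16:09.123] MODEL_RESOLUTION: req_model='arch.codex.default' resolved_model='gpt-5-2025-08-07' provider='openai' streaming=true"

# Color-free subset of the script: keep only the request/resolved model fields
strip_resolution() {
  awk '
  /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
    # keep everything after the MODEL_RESOLUTION: marker
    split($0, parts, /MODEL_RESOLUTION: */)
    line = parts[2]
    # drop the provider and streaming fields
    sub(/ *provider='\''[^'\'']+'\''/, "", line)
    sub(/ *streaming=(true|false)/, "", line)
    print line
  }'
}

result=$(printf '%s\n' "$sample" | strip_resolution)
echo "$result"
# prints: req_model='arch.codex.default' resolved_model='gpt-5-2025-08-07'
```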
 ## Advanced Usage
 ### Override Codex model for a session
 ```bash
 planoai cli-agent codex --settings='{"CODEX_MODEL":"openai/gpt-4.1-2025-04-14"}'
 ```
 ### Context window guidance
 Codex works best with a large context window. When possible, use models and configurations that support a context window of at least 64k tokens.
 ## Notes
 - Plano's `default: true` model is only used when a client request does not specify a model.
 - If Codex sends an explicit model in requests, aliasing/routing rules decide the final upstream model.
--- a/demos/llm_routing/codex_router/config.yaml
+++ b/demos/llm_routing/codex_router/config.yaml
@@ -0,0 +1,38 @@
 version: v0.3.0
 listeners:
  - type: model
    name: model_listener
    port: 12000
 model_providers:
  # OpenAI Models
  - model: openai/gpt-5-2025-08-07
    default: true
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
  # Anthropic Model
  - model: anthropic/claude-sonnet-4-6
    access_key: $ANTHROPIC_API_KEY
  # Ollama Model (optional local fallback)
  - model: ollama/llama3.1
    base_url: http://host.docker.internal:11434
 # Model aliases for Codex sessions
 model_aliases:
  # Default model Codex should request when launched by planoai cli-agent codex
  arch.codex.default:
    target: gpt-5-2025-08-07
 tracing:
  random_sampling: 100
--- a/demos/llm_routing/codex_router/pretty_model_resolution.sh
+++ b/demos/llm_routing/codex_router/pretty_model_resolution.sh
@@ -0,0 +1,33 @@
 #!/usr/bin/env bash
 # Pretty-print Plano MODEL_RESOLUTION lines from docker logs
 # - hides Arch-Router
 # - prints timestamp
 # - colors MODEL_RESOLUTION red
 # - colors req_model cyan
 # - colors resolved_model magenta
 # - removes provider and streaming
 docker logs -f plano 2>&1 \
 | awk '
 /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
  # extract timestamp between first [ and ]
  ts=""
  if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
    ts=substr($0, RSTART+1, RLENGTH-2)
  }
  # split out after MODEL_RESOLUTION:
  n = split($0, parts, /MODEL_RESOLUTION: */)
  line = parts[2]
  # remove provider and streaming fields
  sub(/ *provider='\''[^'\'']+'\''/, "", line)
  sub(/ *streaming=(true|false)/, "", line)
  # highlight fields
  gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
  gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
  # print timestamp + MODEL_RESOLUTION
  printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
 }'
--- a/demos/llm_routing/opencode_router/README.md
+++ b/demos/llm_routing/opencode_router/README.md
@@ -0,0 +1,97 @@
 # OpenCode Router - Multi-Model Access with Intelligent Routing
 Plano extends OpenCode to access multiple LLM providers through a single interface and route coding requests to the best configured model.
 ## Benefits
 - **Single Interface**: Use OpenCode while routing through Plano
 - **Task-Aware Routing**: Route requests based on coding task intent
 - **Provider Flexibility**: Mix OpenAI, Anthropic, and local models behind one endpoint
 - **Routing Transparency**: Inspect exactly which model served each request
 ## How It Works
 Plano sits between OpenCode and configured providers:
 ```text
 Your Request -> OpenCode -> Plano -> Selected Model -> Response
 ```
 ## Quick Start
 ### Prerequisites
 - OpenCode CLI installed and available on your `PATH` (`opencode` command)
 - Docker running
 ### 1) Enter this demo directory
 ```bash
 cd demos/llm_routing/opencode_router
 ```
 ### 2) Set API keys
 ```bash
 export OPENAI_API_KEY="your-openai-key-here"
 export ANTHROPIC_API_KEY="your-anthropic-key-here"
 ```
 ### 3) Start Plano
 ```bash
 # Install with uv (recommended)
 uv tool install planoai
 planoai up
 # Or run it through uvx without a separate install step
 uvx planoai up
 ```
 ### 4) Launch OpenCode through Plano
 ```bash
 planoai cli-agent opencode
 # Or run it through uvx:
 uvx planoai cli-agent opencode
 ```
 The OpenCode launcher integration configures:
 ```bash
 OPENAI_BASE_URL=http://127.0.0.1:12000/v1
 OPENAI_API_KEY=test
 ```
 If `arch.opencode.default` exists in `model_aliases`, `planoai cli-agent opencode` exports:
 ```bash
 OPENAI_MODEL=<target-from-arch.opencode.default>
 ```
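To preview the value that would end up in `OPENAI_MODEL`, the alias target can be read straight from the config. This is only a sketch: `alias_target` is a hypothetical helper, not how `planoai` resolves the alias internally, and it assumes `model_aliases:` starts at column 0 and that each alias has a `target:` line beneath its key, as in this demo's `config.yaml`.

```bash
# Hypothetical helper (not part of planoai): print the target of an alias.
#   $1 = alias name, $2 = path to a Plano config file
alias_target() {
  awk -v key="$1" '
    # Enter the model_aliases block (assumed to start at column 0)
    /^model_aliases:/ { in_aliases = 1; next }
    # Any other top-level key ends the block
    /^[^ #]/          { in_aliases = 0 }
    # Remember that the requested alias key was seen
    in_aliases && $1 == (key ":") { in_key = 1; next }
    # The first target line after that key is taken as the answer
    in_aliases && in_key && $1 == "target:" { print $2; exit }
  ' "$2"
}
```

With this demo's config, `alias_target arch.opencode.default config.yaml` should print `gpt-5-2025-08-07`; an unknown alias prints nothing.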
 ## Monitor Routing Decisions
 In a second terminal:
 ```bash
 sh pretty_model_resolution.sh
 ```
 This prints `MODEL_RESOLUTION` lines so you can see request model -> resolved model mappings in real time.
 ## Advanced Usage
 ### Override OpenCode model for a session
 ```bash
 planoai cli-agent opencode --settings='{"OPENCODE_MODEL":"openai/gpt-4.1-2025-04-14"}'
 ```
 ### Context window guidance
 OpenCode works best with a large context window. When possible, use models and configurations that support a context window of at least 64k tokens.
 ## Notes
 - Plano's `default: true` model is only used when a client request does not specify a model.
 - If OpenCode sends an explicit model in requests, aliasing/routing rules decide the final upstream model.
--- a/demos/llm_routing/opencode_router/config.yaml
+++ b/demos/llm_routing/opencode_router/config.yaml
@@ -0,0 +1,38 @@
 version: v0.3.0
 listeners:
  - type: model
    name: model_listener
    port: 12000
 model_providers:
  # OpenAI Models
  - model: openai/gpt-5-2025-08-07
    default: true
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code generation
        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
  - model: openai/gpt-4.1-2025-04-14
    access_key: $OPENAI_API_KEY
    routing_preferences:
      - name: code understanding
        description: understand and explain existing code snippets, functions, or libraries
  # Anthropic Model
  - model: anthropic/claude-sonnet-4-6
    access_key: $ANTHROPIC_API_KEY
  # Ollama Model (optional local fallback)
  - model: ollama/llama3.1
    base_url: http://host.docker.internal:11434
 # Model aliases for OpenCode sessions
 model_aliases:
  # Default model OpenCode should request when launched by planoai cli-agent opencode
  arch.opencode.default:
    target: gpt-5-2025-08-07
 tracing:
  random_sampling: 100
--- a/demos/llm_routing/opencode_router/pretty_model_resolution.sh
+++ b/demos/llm_routing/opencode_router/pretty_model_resolution.sh
@@ -0,0 +1,33 @@
 #!/usr/bin/env bash
 # Pretty-print Plano MODEL_RESOLUTION lines from docker logs
 # - hides Arch-Router
 # - prints timestamp
 # - colors MODEL_RESOLUTION red
 # - colors req_model cyan
 # - colors resolved_model magenta
 # - removes provider and streaming
 docker logs -f plano 2>&1 \
 | awk '
 /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
  # extract timestamp between first [ and ]
  ts=""
  if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
    ts=substr($0, RSTART+1, RLENGTH-2)
  }
  # split out after MODEL_RESOLUTION:
  n = split($0, parts, /MODEL_RESOLUTION: */)
  line = parts[2]
  # remove provider and streaming fields
  sub(/ *provider='\''[^'\'']+'\''/, "", line)
  sub(/ *streaming=(true|false)/, "", line)
  # highlight fields
  gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
  gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
  # print timestamp + MODEL_RESOLUTION
  printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
 }'