Rename all arch references to plano (#745)

* Rename all arch references to plano across the codebase

Complete rebrand from "Arch"/"archgw" to "Plano" including:
- Config files: arch_config_schema.yaml, workflow, demo configs
- Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_*
- Python CLI: variables, functions, file paths, docker mounts
- Rust crates: config paths, log messages, metadata keys
- Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore
- Docker Compose: volume mounts and env vars across all demos/tests
- GitHub workflows: job/step names
- Shell scripts: log messages
- Demos: Python code, READMEs, VS Code configs, Grafana dashboard
- Docs: RST includes, code comments, config references
- Package metadata: package.json, pyproject.toml, uv.lock

External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is.
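A sweep like this is easy to sanity-check mechanically: apply the substitution, then search for stragglers. The snippet below is a minimal, self-contained sketch of that check — the scratch file and variable names are made up for illustration, not taken from this PR:

```python
import re
import tempfile
from pathlib import Path

# Minimal sketch of sanity-checking the ARCH_CONFIG_* -> PLANO_CONFIG_*
# env-var rename (file contents here are illustrative, not from the PR).
with tempfile.TemporaryDirectory() as tmp:
    compose = Path(tmp) / "compose.yaml"
    compose.write_text(
        "environment:\n"
        "  - ARCH_CONFIG_PATH=/config/config.yaml\n"
        "  - ARCH_CONFIG_FILE=/app/arch_config.yaml\n"
    )
    # apply the env-var substitution the commit message describes
    renamed = re.sub(r"ARCH_CONFIG_", "PLANO_CONFIG_", compose.read_text())
    compose.write_text(renamed)
    # "grep" for stragglers after the rewrite
    leftover = len(re.findall(r"ARCH_CONFIG_", compose.read_text()))
    print(f"leftover={leftover}")  # → leftover=0
```

The same pattern scales to a whole tree by globbing instead of writing one scratch file.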

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update remaining arch references in docs

- Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_*
- Update label references in request_lifecycle.rst
- Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml
- Update config YAML comments: "Arch creates/uses" → "Plano creates/uses"
- Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst
- Update arch_config_schema.yaml reference in provider_models.py
- Rename arch_agent_router → plano_agent_router in config example

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix remaining arch references found in second pass

- config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE,
  arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs
- config/test_passthrough.yaml: container mount path
- tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml)
- cli/planoai/core.py: comment and log message
- crates/brightstaff/src/tracing/constants.rs: doc comment
- tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages,
  arch_state/arch_messages variables renamed
- tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages
- demos/shared/test_runner/{common,test_demos}.py: same renames
- tests/e2e/test_model_alias_routing.py: docstring
- .dockerignore: archgw_modelserver → plano_modelserver
- demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name

Note: x-arch-* HTTP header values and Rust constant names intentionally
preserved for backwards compatibility with existing deployments.
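The preserved wire-level names matter because deployed clients parse them. As a minimal sketch — assuming the state metadata key is the literal string `x-arch-state` (the tests reference it only via an `ARCH_STATE_HEADER` constant, so the exact value is an assumption here) — a client-side reader like the one in `tests/e2e/common.py` keeps working only if that key is unchanged:

```python
import json

# Hypothetical sketch: the metadata key value is assumed to be "x-arch-state".
# Deployed clients look up this exact key, so the wire-level name is kept even
# though the Python identifiers around it were renamed.
ARCH_STATE_HEADER = "x-arch-state"

def get_plano_messages(response_json):
    """Extract gateway messages from response metadata (mirrors tests/e2e/common.py)."""
    metadata = (response_json or {}).get("metadata", {})
    state = json.loads(metadata.get(ARCH_STATE_HEADER, "{}"))
    return json.loads(state.get("messages", "[]"))

resp = {
    "metadata": {
        ARCH_STATE_HEADER: json.dumps(
            {"messages": json.dumps([{"role": "assistant"}])}
        )
    }
}
print(get_plano_messages(resp))  # → [{'role': 'assistant'}]
```

Renaming the Python-side identifiers (`get_arch_messages` → `get_plano_messages`) is safe precisely because only the metadata key, not the function name, is part of the wire contract.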

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Adil Hafeez 2026-02-13 15:16:56 -08:00 committed by GitHub
parent 0557f7ff98
commit ba651aaf71
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
115 changed files with 504 additions and 505 deletions

@ -18,7 +18,7 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
jaeger:
build:

@ -1 +1 @@
This demo shows how you can use a publicly hosted rest api and interact it using arch gateway.
This demo shows how you can use a publicly hosted REST API and interact with it using the Plano gateway.

@ -10,7 +10,7 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
jaeger:
build:

@ -1,6 +1,6 @@
# Multi-Turn Agentic Demo (RAG)
This demo showcases how the **Arch** can be used to build accurate multi-turn RAG agent by just writing simple APIs.
This demo showcases how **Plano** can be used to build an accurate multi-turn RAG agent by just writing simple APIs.
![Example of Multi-turn Interaction](mutli-turn-example.png)
@ -14,7 +14,7 @@ Provides information about various energy sources and considerations.
# Starting the demo
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
2. Start Arch
2. Start Plano
```sh
sh run_demo.sh
```

@ -21,4 +21,4 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml

@ -18,8 +18,8 @@ start_demo() {
echo ".env file created with OPENAI_API_KEY."
fi
# Step 3: Start Arch
echo "Starting Arch with config.yaml..."
# Step 3: Start Plano
echo "Starting Plano with config.yaml..."
planoai up config.yaml
# Step 4: Start Network Agent
@ -33,8 +33,8 @@ stop_demo() {
echo "Stopping HR Agent using Docker Compose..."
docker compose down -v
# Step 2: Stop Arch
echo "Stopping Arch..."
# Step 2: Stop Plano
echo "Stopping Plano..."
planoai down
}

@ -10,7 +10,7 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
jaeger:
build:

@ -18,8 +18,8 @@ start_demo() {
echo ".env file created with OPENAI_API_KEY."
fi
# Step 3: Start Arch
echo "Starting Arch with config.yaml..."
# Step 3: Start Plano
echo "Starting Plano with config.yaml..."
planoai up config.yaml
# Step 4: Start developer services
@ -33,8 +33,8 @@ stop_demo() {
echo "Stopping Network Agent using Docker Compose..."
docker compose down
# Step 2: Stop Arch
echo "Stopping Arch..."
# Step 2: Stop Plano
echo "Stopping Plano..."
planoai down
}

@ -1,11 +1,11 @@
# Function calling
This demo shows how you can use Arch's core function calling capabilities.
This demo shows how you can use Plano's core function calling capabilities.
# Starting the demo
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
2. Start Arch
2. Start Plano
3. ```sh
sh run_demo.sh
@ -15,14 +15,14 @@ This demo shows how you can use Arch's core function calling capabilities.
# Observability
Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visalize the stats in dashboard. To see grafana dashboard follow instructions below,
Plano gateway publishes a stats endpoint at http://localhost:19901/stats. In this demo we use Prometheus to pull stats from Plano and Grafana to visualize them in a dashboard. To see the Grafana dashboard, follow the instructions below:
1. Start grafana and prometheus using following command
```yaml
docker compose --profile monitoring up
```
2. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
Here is a sample interaction,
<img width="575" alt="image" src="https://github.com/user-attachments/assets/e0929490-3eb2-4130-ae87-a732aea4d059">

@ -20,7 +20,7 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
otel-collector:
build:

@ -20,7 +20,7 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
jaeger:
build:

@ -20,7 +20,7 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
otel-collector:
build:

@ -23,7 +23,7 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
prometheus:
build:

@ -20,4 +20,4 @@ services:
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml

@ -72,8 +72,8 @@ start_demo() {
exit 1
fi
# Step 4: Start Arch
echo "Starting Arch with config.yaml..."
# Step 4: Start Plano
echo "Starting Plano with config.yaml..."
planoai up config.yaml
# Step 5: Start Network Agent with the chosen Docker Compose file
@ -91,8 +91,8 @@ stop_demo() {
docker compose -f "$compose_file" down
done
# Stop Arch
echo "Stopping Arch..."
# Stop Plano
echo "Stopping Plano..."
planoai down
}

@ -15,7 +15,7 @@
"LLM": "1",
"CHAT_COMPLETION_ENDPOINT": "http://localhost:10000/v1",
"STREAMING": "True",
"ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
"PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
}
},
{
@ -29,7 +29,7 @@
"LLM": "1",
"CHAT_COMPLETION_ENDPOINT": "http://localhost:12000/v1",
"STREAMING": "True",
"ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
"PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
}
},
]

@ -37,7 +37,7 @@ def chat(
try:
response = client.chat.completions.create(
# we select model from arch_config file
# we select model from plano_config file
model="None",
messages=history,
temperature=1.0,
@ -86,7 +86,7 @@ def create_gradio_app(demo_description, client):
with gr.Column(scale=2):
chatbot = gr.Chatbot(
label="Arch Chatbot",
label="Plano Chatbot",
elem_classes="chatbot",
)
textbox = gr.Textbox(
@ -110,7 +110,7 @@ def process_stream_chunk(chunk, history):
delta = chunk.choices[0].delta
if delta.role and delta.role != history[-1]["role"]:
# create new history item if role changes
# this is likely due to arch tool call and api response
# this is likely due to Plano tool call and api response
history.append({"role": delta.role})
history[-1]["model"] = chunk.model
@ -159,7 +159,7 @@ def convert_prompt_target_to_openai_format(target):
def get_prompt_targets():
try:
with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
config = yaml.safe_load(file)
available_tools = []
@ -181,7 +181,7 @@ def get_prompt_targets():
def get_llm_models():
try:
with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
config = yaml.safe_load(file)
available_models = [""]

@ -787,7 +787,7 @@
},
"timepicker": {},
"timezone": "browser",
"title": "Arch Gateway Dashboard",
"title": "Plano Gateway Dashboard",
"uid": "adt6uhx5lk8aob",
"version": 1,
"weekStart": ""

@ -19,17 +19,17 @@ def get_data_chunks(stream, n=1):
return chunks
def get_arch_messages(response_json):
arch_messages = []
def get_plano_messages(response_json):
plano_messages = []
if response_json and "metadata" in response_json:
# load arch_state from metadata
arch_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
# parse arch_state into json object
arch_state = json.loads(arch_state_str)
# load messages from arch_state
arch_messages_str = arch_state.get("messages", "[]")
# load plano_state from metadata
plano_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
# parse plano_state into json object
plano_state = json.loads(plano_state_str)
# load messages from plano_state
plano_messages_str = plano_state.get("messages", "[]")
# parse messages into json object
arch_messages = json.loads(arch_messages_str)
# append messages from arch gateway to history
return arch_messages
plano_messages = json.loads(plano_messages_str)
# append messages from plano gateway to history
return plano_messages
return []

@ -1,6 +1,6 @@
import json
import os
from common import get_arch_messages
from common import get_plano_messages
import pytest
import requests
from deepdiff import DeepDiff
@ -46,10 +46,10 @@ def test_demos(test_data):
assert choices[0]["message"]["role"] == "assistant"
assert expected_output_contains.lower() in choices[0]["message"]["content"].lower()
# now verify arch_messages (tool call and api response) that are sent as response metadata
arch_messages = get_arch_messages(response_json)
assert len(arch_messages) == 2
tool_calls_message = arch_messages[0]
# now verify plano_messages (tool call and api response) that are sent as response metadata
plano_messages = get_plano_messages(response_json)
assert len(plano_messages) == 2
tool_calls_message = plano_messages[0]
tool_calls = tool_calls_message.get("tool_calls", [])
assert len(tool_calls) > 0

@ -1,6 +1,6 @@
# Claude Code Router - Multi-Model Access with Intelligent Routing
Arch Gateway extends Claude Code to access multiple LLM providers through a single interface. Offering two key benefits:
Plano extends Claude Code to access multiple LLM providers through a single interface, offering two key benefits:
1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
@ -21,15 +21,15 @@ Uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to
## How It Works
Arch Gateway sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
Plano sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
```
Your Request → Arch Gateway → Suitable Model → Response
Your Request → Plano → Suitable Model → Response
[Task Analysis & Model Selection]
```
**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.archgw.com/concepts/llm_providers/supported_providers.html).
**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.planoai.dev/concepts/llm_providers/supported_providers.html).
## Quick Start (5 minutes)
@ -61,7 +61,7 @@ export ANTHROPIC_API_KEY="your-anthropic-key-here"
# Add other providers as needed
```
### Step 3: Start Arch Gateway
### Step 3: Start Plano
```bash
# Install using uv (recommended)
uv tool install planoai
@ -122,7 +122,7 @@ planoai cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-co
### Environment Variables
The system automatically configures these variables for Claude Code:
```bash
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Plano
ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
```
@ -147,6 +147,6 @@ llm_providers:
## Technical Details
**How routing works:** Arch intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
**How routing works:** Plano intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
**Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
**Documentation:** [docs.archgw.com](https://docs.archgw.com) for advanced configuration and API details.
**Documentation:** [docs.planoai.dev](https://docs.planoai.dev) for advanced configuration and API details.

@ -1,5 +1,5 @@
#!/usr/bin/env bash
# Pretty-print ArchGW MODEL_RESOLUTION lines from docker logs
# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
# - hides Arch-Router
# - prints timestamp
# - colors MODEL_RESOLUTION red
@ -7,7 +7,7 @@
# - colors resolved_model magenta
# - removes provider and streaming
docker logs -f archgw 2>&1 \
docker logs -f plano 2>&1 \
| awk '
/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
# extract timestamp between first [ and ]

@ -19,10 +19,10 @@ services:
- "12000:12000"
- "8001:8001"
environment:
- ARCH_CONFIG_PATH=/config/config.yaml
- PLANO_CONFIG_PATH=/config/config.yaml
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
jaeger:
build:

@ -22,14 +22,14 @@ logger = logging.getLogger(__name__)
### add new setup
app = FastAPI(title="RAG Agent Context Builder", version="1.0.0")
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
RAG_MODEL = "gpt-4o-mini"
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
# Global variable to store the knowledge base
@ -95,15 +95,15 @@ async def find_relevant_passages(
If no passages are relevant, return "NONE"."""
try:
# Call archgw to select relevant passages
logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
# Call Plano to select relevant passages
logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
# Prepare extra headers if traceparent is provided
extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
if traceparent:
extra_headers["traceparent"] = traceparent
response = await archgw_client.chat.completions.create(
response = await plano_client.chat.completions.create(
model=RAG_MODEL,
messages=[{"role": "system", "content": system_prompt}],
temperature=0.1,

@ -22,14 +22,14 @@ logging.basicConfig(
)
logger = logging.getLogger(__name__)
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
GUARD_MODEL = "gpt-4o-mini"
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
app = FastAPI(title="RAG Agent Input Guards", version="1.0.0")
@ -93,13 +93,13 @@ Respond in JSON format:
]
try:
# Call archgw using OpenAI client
# Call Plano using OpenAI client
extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
if traceparent_header:
extra_headers["traceparent"] = traceparent_header
logger.info(f"Validating query scope: '{last_user_message}'")
response = await archgw_client.chat.completions.create(
response = await plano_client.chat.completions.create(
model=GUARD_MODEL,
messages=guard_messages,
temperature=0.1,

@ -20,20 +20,20 @@ logging.basicConfig(
)
logger = logging.getLogger(__name__)
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
QUERY_REWRITE_MODEL = "gpt-4o-mini"
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
app = FastAPI(title="RAG Agent Query Rewriter", version="1.0.0")
async def rewrite_query_with_archgw(
async def rewrite_query_with_plano(
messages: List[ChatMessage],
traceparent_header: Optional[str] = None,
request_id: Optional[str] = None,
@ -59,8 +59,8 @@ Return only the rewritten query, nothing else."""
extra_headers["traceparent"] = traceparent_header
try:
logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
resp = await archgw_client.chat.completions.create(
logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
resp = await plano_client.chat.completions.create(
model=QUERY_REWRITE_MODEL,
messages=rewrite_messages,
temperature=0.3,
@ -96,7 +96,7 @@ async def query_rewriter_http(
else:
logger.info("No traceparent header found")
rewritten_query = await rewrite_query_with_archgw(
rewritten_query = await rewrite_query_with_plano(
messages, traceparent_header, request_id
)
# Create updated messages with the rewritten query

@ -22,7 +22,7 @@ logging.basicConfig(
)
logger = logging.getLogger(__name__)
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
RESPONSE_MODEL = "gpt-4o"
@ -38,10 +38,10 @@ Your response should:
Generate a complete response to assist the user."""
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
# FastAPI app for REST server
@ -95,9 +95,9 @@ async def stream_chat_completions(
response_messages = prepare_response_messages(request_body)
try:
# Call archgw using OpenAI client for streaming
# Call Plano using OpenAI client for streaming
logger.info(
f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
)
# Prepare extra headers if traceparent is provided
@ -105,7 +105,7 @@ async def stream_chat_completions(
if traceparent_header:
extra_headers["traceparent"] = traceparent_header
response_stream = await archgw_client.chat.completions.create(
response_stream = await plano_client.chat.completions.create(
model=RESPONSE_MODEL,
messages=response_messages,
temperature=request_body.temperature or 0.7,

@ -1,15 +1,15 @@
# LLM Routing
This demo shows how you can arch gateway to manage keys and route to upstream LLM.
This demo shows how you can use the Plano gateway to manage keys and route requests to upstream LLMs.
# Starting the demo
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
1. Start Arch
1. Start Plano
```sh
sh run_demo.sh
```
1. Navigate to http://localhost:18080/
Following screen shows an example of interaction with arch gateway showing dynamic routing. You can select between different LLMs using "override model" option in the chat UI.
The following screen shows an example interaction with the Plano gateway demonstrating dynamic routing. You can select between different LLMs using the "override model" option in the chat UI.
![LLM Routing Demo](llm_routing_demo.png)
@ -47,12 +47,12 @@ $ curl --header 'Content-Type: application/json' \
```
# Observability
Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visualize the stats in dashboard. To see grafana dashboard follow instructions below,
Plano gateway publishes a stats endpoint at http://localhost:19901/stats. In this demo we use Prometheus to pull stats from Plano and Grafana to visualize them in a dashboard. To see the Grafana dashboard, follow the instructions below:
1. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
1. For tracing you can head over to http://localhost:16686/ to view recent traces.
Following is a screenshot of tracing UI showing call received by arch gateway and making upstream call to LLM,
The following is a screenshot of the tracing UI showing a call received by the Plano gateway and its upstream call to the LLM:
![Jaeger Tracing](jaeger_tracing_llm_routing.png)

@ -8,11 +8,11 @@ services:
- "12000:12000"
- "12001:12001"
environment:
- ARCH_CONFIG_PATH=/app/arch_config.yaml
- PLANO_CONFIG_PATH=/app/plano_config.yaml
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
- OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
volumes:
- ./config.yaml:/app/arch_config.yaml:ro
- ./config.yaml:/app/plano_config.yaml:ro
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
anythingllm:

@ -18,8 +18,8 @@ start_demo() {
echo ".env file created with OPENAI_API_KEY."
fi
# Step 3: Start Arch
echo "Starting Arch with config.yaml..."
# Step 3: Start Plano
echo "Starting Plano with config.yaml..."
planoai up config.yaml
# Step 4: Start LLM Routing
@ -33,8 +33,8 @@ stop_demo() {
echo "Stopping LLM Routing using Docker Compose..."
docker compose down
# Step 2: Stop Arch
echo "Stopping Arch..."
# Step 2: Stop Plano
echo "Stopping Plano..."
planoai down
}

@ -21,10 +21,10 @@ services:
- "12000:12000"
- "8001:8001"
environment:
- ARCH_CONFIG_PATH=/config/config.yaml
- PLANO_CONFIG_PATH=/config/config.yaml
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
volumes:
- ./config.yaml:/app/arch_config.yaml
- ./config.yaml:/app/plano_config.yaml
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
jaeger:
build:

@ -18,14 +18,14 @@ logging.basicConfig(
logger = logging.getLogger(__name__)
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
RAG_MODEL = "gpt-4o-mini"
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
# Global variable to store the knowledge base
@ -91,8 +91,8 @@ async def find_relevant_passages(
If no passages are relevant, return "NONE"."""
try:
# Call archgw to select relevant passages
logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
# Call Plano to select relevant passages
logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
# Prepare extra headers if traceparent is provided
extra_headers = {
@ -103,7 +103,7 @@ async def find_relevant_passages(
if traceparent:
extra_headers["traceparent"] = traceparent
response = await archgw_client.chat.completions.create(
response = await plano_client.chat.completions.create(
model=RAG_MODEL,
messages=[{"role": "system", "content": system_prompt}],
temperature=0.1,

@ -20,14 +20,14 @@ logging.basicConfig(
)
logger = logging.getLogger(__name__)
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
GUARD_MODEL = "gpt-4o-mini"
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
app = FastAPI()
@ -91,7 +91,7 @@ Respond in JSON format:
]
try:
# Call archgw using OpenAI client
# Call Plano using OpenAI client
extra_headers = {"x-envoy-max-retries": "3"}
if traceparent_header:
extra_headers["traceparent"] = traceparent_header
@ -100,7 +100,7 @@ Respond in JSON format:
extra_headers["x-request-id"] = request_id
logger.info(f"Validating query scope: '{last_user_message}'")
response = await archgw_client.chat.completions.create(
response = await plano_client.chat.completions.create(
model=GUARD_MODEL,
messages=guard_messages,
temperature=0.1,

@ -19,20 +19,20 @@ logging.basicConfig(
)
logger = logging.getLogger(__name__)
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
QUERY_REWRITE_MODEL = "gpt-4o-mini"
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
app = FastAPI()
async def rewrite_query_with_archgw(
async def rewrite_query_with_plano(
messages: List[ChatMessage],
traceparent_header: str,
request_id: Optional[str] = None,
@ -57,14 +57,14 @@ async def rewrite_query_with_archgw(
rewrite_messages.append({"role": msg.role, "content": msg.content})
try:
# Call archgw using OpenAI client
# Call Plano using OpenAI client
extra_headers = {"x-envoy-max-retries": "3"}
if traceparent_header:
extra_headers["traceparent"] = traceparent_header
if request_id:
extra_headers["x-request-id"] = request_id
logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
response = await archgw_client.chat.completions.create(
logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
response = await plano_client.chat.completions.create(
model=QUERY_REWRITE_MODEL,
messages=rewrite_messages,
temperature=0.3,
@ -88,7 +88,7 @@ async def rewrite_query_with_archgw(
async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
"""Chat completions endpoint that rewrites the last user query using archgw.
"""Chat completions endpoint that rewrites the last user query using Plano.
Returns a dict with a 'messages' key containing the updated message list.
"""
@ -104,8 +104,8 @@ async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
else:
logger.info("No traceparent header found")
# Call archgw to rewrite the last user query
rewritten_query = await rewrite_query_with_archgw(
# Call Plano to rewrite the last user query
rewritten_query = await rewrite_query_with_plano(
messages, traceparent_header, request_id
)

@ -22,7 +22,7 @@ logging.basicConfig(
)
logger = logging.getLogger(__name__)
# Configuration for archgw LLM gateway
# Configuration for Plano LLM gateway
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
RESPONSE_MODEL = "gpt-4o"
@ -38,10 +38,10 @@ Your response should:
Generate a complete response to assist the user."""
# Initialize OpenAI client for archgw
archgw_client = AsyncOpenAI(
# Initialize OpenAI client for Plano
plano_client = AsyncOpenAI(
base_url=LLM_GATEWAY_ENDPOINT,
api_key="EMPTY", # archgw doesn't require a real API key
api_key="EMPTY", # Plano doesn't require a real API key
)
# FastAPI app for REST server
@ -94,9 +94,9 @@ async def stream_chat_completions(
response_messages = prepare_response_messages(request_body)
try:
# Call archgw using OpenAI client for streaming
# Call Plano using OpenAI client for streaming
logger.info(
f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
)
logger.info(f"rag_agent - request_id: {request_id}")
@ -107,7 +107,7 @@ async def stream_chat_completions(
if traceparent_header:
extra_headers["traceparent"] = traceparent_header
response_stream = await archgw_client.chat.completions.create(
response_stream = await plano_client.chat.completions.create(
model=RESPONSE_MODEL,
messages=response_messages,
temperature=request_body.temperature or 0.7,

@@ -1,6 +1,6 @@
 # Model Alias Demo Suite
-This directory contains demos for the model alias feature in archgw.
+This directory contains demos for the model alias feature in Plano.
 ## Overview
@@ -48,7 +48,7 @@ model_aliases:
 ```
 ## Prerequisites
-- Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
+- Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
 - Set your API keys in your environment:
 - `export OPENAI_API_KEY=your-openai-key`
 - `export ANTHROPIC_API_KEY=your-anthropic-key` (optional, but recommended for Anthropic tests)
@@ -60,13 +60,13 @@ model_aliases:
 sh run_demo.sh
 ```
 - This will create a `.env` file with your API keys (if not present).
-- Starts Arch Gateway with model alias config (`arch_config_with_aliases.yaml`).
+- Starts Plano gateway with model alias config (`arch_config_with_aliases.yaml`).
 2. To stop the demo:
 ```sh
 sh run_demo.sh down
 ```
-- This will stop Arch Gateway and any related services.
+- This will stop Plano gateway and any related services.
 ## Example Requests
@@ -145,4 +145,4 @@ curl -sS -X POST "http://localhost:12000/v1/messages" \
 ## Troubleshooting
 - Ensure your API keys are set in your environment before running the demo.
 - If you see errors about missing keys, set them and re-run the script.
-- For more details, see the main Arch documentation.
+- For more details, see the main Plano documentation.
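The curl examples this README points at POST an OpenAI-style body to the gateway on port 12000. A minimal sketch of that request payload (hypothetical helper; field names follow the OpenAI chat completions format used by the demo):

```python
import json

def chat_payload(model, prompt):
    # Body for POST http://localhost:12000/v1/chat/completions; "model" may
    # name a provider model or a configured alias.
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
```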


@@ -24,11 +24,11 @@ start_demo() {
 echo ".env file created with API keys."
 fi
-# Step 3: Start Arch
-echo "Starting Arch with arch_config_with_aliases.yaml..."
+# Step 3: Start Plano
+echo "Starting Plano with arch_config_with_aliases.yaml..."
 planoai up arch_config_with_aliases.yaml
-echo "\n\nArch started successfully."
+echo "\n\nPlano started successfully."
 echo "Please run the following CURL command to test model alias routing. Additional instructions are in the README.md file. \n"
 echo "curl -sS -X POST \"http://localhost:12000/v1/chat/completions\" \
 -H \"Authorization: Bearer test-key\" \
@@ -46,8 +46,8 @@ start_demo() {
 # Function to stop the demo
 stop_demo() {
-# Step 2: Stop Arch
-echo "Stopping Arch..."
+# Step 2: Stop Plano
+echo "Stopping Plano..."
 planoai down
 }


@@ -1,6 +1,6 @@
 # Model Choice Newsletter Demo
-This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Arch Gateway (`plano`). It includes both a minimal test harness and a sample proxy configuration.
+This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Plano (`plano`). It includes both a minimal test harness and a sample proxy configuration.
 ---
@@ -85,13 +85,13 @@ See `config.yaml` for a sample configuration mapping aliases to provider models.
 ```
 2. **Install dependencies:**
-- Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
+- Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
 - Then run
 ```sh
 uv sync
 ```
-3. **Start Arch Gateway**
+3. **Start Plano**
 ```sh
 run_demo.sh
 ```


@@ -3,7 +3,7 @@ import json, time, yaml, statistics as stats
 from pydantic import BaseModel, ValidationError
 from openai import OpenAI
-# archgw endpoint (keys are handled by archgw)
+# Plano endpoint (keys are handled by Plano)
 client = OpenAI(base_url="http://localhost:12000/v1", api_key="n/a")
 MODELS = ["arch.summarize.v1", "arch.reason.v1"]
 FIXTURES = "evals_summarize.yaml"
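bench.py exercises the gateway through aliases such as `arch.summarize.v1`. Conceptually, model-alias routing is a lookup from alias to target provider model; a hypothetical stdlib sketch of that lookup (illustrative only, not Plano's actual implementation):

```python
def resolve_model(name, aliases):
    # Map a configured alias to its target model; unknown names pass
    # through unchanged so real provider model IDs still work.
    return aliases.get(name, name)
```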


@@ -1,7 +1,7 @@
 [project]
 name = "model-choice-newsletter-code-snippets"
 version = "0.1.0"
-description = "Benchmarking model alias routing with Arch Gateway."
+description = "Benchmarking model alias routing with Plano."
 authors = [{name = "Your Name", email = "your@email.com"}]
 license = {text = "Apache 2.0"}
 readme = "README.md"


@@ -17,18 +17,18 @@ start_demo() {
 echo ".env file created with API keys."
 fi
-# Step 3: Start Arch
-echo "Starting Arch with arch_config_with_aliases.yaml..."
+# Step 3: Start Plano
+echo "Starting Plano with arch_config_with_aliases.yaml..."
 planoai up arch_config_with_aliases.yaml
-echo "\n\nArch started successfully."
+echo "\n\nPlano started successfully."
 echo "Please run the following command to test the setup: python bench.py\n"
 }
 # Function to stop the demo
 stop_demo() {
-# Step 2: Stop Arch
-echo "Stopping Arch..."
+# Step 2: Stop Plano
+echo "Stopping Plano..."
 planoai down
 }


@@ -8,12 +8,12 @@ services:
 - "8001:8001"
 - "12000:12000"
 environment:
-- ARCH_CONFIG_PATH=/app/arch_config.yaml
+- PLANO_CONFIG_PATH=/app/plano_config.yaml
 - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
 - OTEL_TRACING_GRPC_ENDPOINT=http://jaeger:4317
 - LOG_LEVEL=${LOG_LEVEL:-info}
 volumes:
-- ./config.yaml:/app/arch_config.yaml:ro
+- ./config.yaml:/app/plano_config.yaml:ro
 - /etc/ssl/cert.pem:/etc/ssl/cert.pem
 crewai-flight-agent:

@@ -10,7 +10,7 @@ services:
 extra_hosts:
 - "host.docker.internal:host-gateway"
 volumes:
-- ./config.yaml:/app/arch_config.yaml
+- ./config.yaml:/app/plano_config.yaml
 jaeger:
 build:

@@ -10,7 +10,7 @@ services:
 extra_hosts:
 - "host.docker.internal:host-gateway"
 volumes:
-- ./config.yaml:/app/arch_config.yaml
+- ./config.yaml:/app/plano_config.yaml
 otel-collector:
 build:

@@ -18,8 +18,8 @@ start_demo() {
 echo ".env file created with OPENAI_API_KEY."
 fi
-# Step 3: Start Arch
-echo "Starting Arch with config.yaml..."
+# Step 3: Start Plano
+echo "Starting Plano with config.yaml..."
 planoai up config.yaml
 # Step 4: Start developer services
@@ -33,8 +33,8 @@ stop_demo() {
 echo "Stopping Network Agent using Docker Compose..."
 docker compose down
-# Step 2: Stop Arch
-echo "Stopping Arch..."
+# Step 2: Stop Plano
+echo "Stopping Plano..."
 planoai down
 }


@@ -17,7 +17,7 @@ Make sure your machine is up to date with [latest version of plano]([url](https:
 # Or if installed with uv: uvx planoai up --service plano --foreground
 2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.6
 2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/config.yaml
-2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: plano, tag: katanemo/plano:0.4.6
+2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.6
 2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
 2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
 2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!


@@ -8,14 +8,14 @@ services:
 - "12000:12000"
 - "12001:12001"
 environment:
-- ARCH_CONFIG_PATH=/app/arch_config.yaml
+- PLANO_CONFIG_PATH=/app/plano_config.yaml
 - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
 - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY environment variable is required but not set}
 - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
 - OTEL_TRACING_ENABLED=true
 - RUST_LOG=debug
 volumes:
-- ./config.yaml:/app/arch_config.yaml:ro
+- ./config.yaml:/app/plano_config.yaml:ro
 - /etc/ssl/cert.pem:/etc/ssl/cert.pem
 anythingllm:


@@ -1,6 +1,6 @@
 # Use Case Demo: Bearer Authorization with Spotify APIs
-In this demo, we show how you can use Arch's bearer authorization capability to connect your agentic apps to third-party APIs.
+In this demo, we show how you can use Plano's bearer authorization capability to connect your agentic apps to third-party APIs.
 More specifically, we demonstrate how you can connect to two Spotify APIs:
 - [`/v1/browse/new-releases`](https://developer.spotify.com/documentation/web-api/reference/get-new-releases)
@@ -23,7 +23,7 @@ Where users can engage by asking questions like _"Show me the latest releases in
 SPOTIFY_CLIENT_KEY=your_spotify_api_token
 ```
-3. Start Arch
+3. Start Plano
 ```sh
 sh run_demo.sh
 ```


@@ -10,7 +10,7 @@ services:
 extra_hosts:
 - "host.docker.internal:host-gateway"
 volumes:
-- ./config.yaml:/app/arch_config.yaml
+- ./config.yaml:/app/plano_config.yaml
 jaeger:
 build:


@@ -18,8 +18,8 @@ start_demo() {
 echo ".env file created with OPENAI_API_KEY."
 fi
-# Step 3: Start Arch
-echo "Starting Arch with config.yaml..."
+# Step 3: Start Plano
+echo "Starting Plano with config.yaml..."
 planoai up config.yaml
 # Step 4: Start developer services
@@ -33,8 +33,8 @@ stop_demo() {
 echo "Stopping Network Agent using Docker Compose..."
 docker compose down
-# Step 2: Stop Arch
-echo "Stopping Arch..."
+# Step 2: Stop Plano
+echo "Stopping Plano..."
 planoai down
 }


@@ -8,10 +8,10 @@ services:
 - "12000:12000"
 - "8001:8001"
 environment:
-- ARCH_CONFIG_PATH=/config/config.yaml
+- PLANO_CONFIG_PATH=/config/config.yaml
 - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
 volumes:
-- ./config.yaml:/app/arch_config.yaml
+- ./config.yaml:/app/plano_config.yaml
 - /etc/ssl/cert.pem:/etc/ssl/cert.pem
 weather-agent:
 build:
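Across the compose files in this PR, `ARCH_CONFIG_PATH` becomes `PLANO_CONFIG_PATH` and the mount target moves to `/app/plano_config.yaml`. A hypothetical sketch of how a containerized service might resolve its config path after the rename (the fallback mirrors the mount path above; this helper is illustrative, not code from this PR):

```python
import os

def resolve_config_path(env=None):
    # Prefer the renamed variable; fall back to the default mount path
    # used by most of the compose files (/app/plano_config.yaml).
    env = os.environ if env is None else env
    return env.get("PLANO_CONFIG_PATH", "/app/plano_config.yaml")
```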