Mirror of https://github.com/katanemo/plano.git, synced 2026-04-25 00:36:34 +02:00
Rename all arch references to plano (#745)
* Rename all arch references to plano across the codebase
Complete rebrand from "Arch"/"archgw" to "Plano" including:
- Config files: arch_config_schema.yaml, workflow, demo configs
- Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_*
- Python CLI: variables, functions, file paths, docker mounts
- Rust crates: config paths, log messages, metadata keys
- Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore
- Docker Compose: volume mounts and env vars across all demos/tests
- GitHub workflows: job/step names
- Shell scripts: log messages
- Demos: Python code, READMEs, VS Code configs, Grafana dashboard
- Docs: RST includes, code comments, config references
- Package metadata: package.json, pyproject.toml, uv.lock
External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Update remaining arch references in docs
- Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_*
- Update label references in request_lifecycle.rst
- Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml
- Update config YAML comments: "Arch creates/uses" → "Plano creates/uses"
- Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst
- Update arch_config_schema.yaml reference in provider_models.py
- Rename arch_agent_router → plano_agent_router in config example
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix remaining arch references found in second pass
- config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE,
arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs
- config/test_passthrough.yaml: container mount path
- tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml)
- cli/planoai/core.py: comment and log message
- crates/brightstaff/src/tracing/constants.rs: doc comment
- tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages,
arch_state/arch_messages variables renamed
- tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages
- demos/shared/test_runner/{common,test_demos}.py: same renames
- tests/e2e/test_model_alias_routing.py: docstring
- .dockerignore: archgw_modelserver → plano_modelserver
- demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name
Note: x-arch-* HTTP header values and Rust constant names intentionally
preserved for backwards compatibility with existing deployments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
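The renames described above (`ARCH_CONFIG_*` → `PLANO_CONFIG_*`, `arch_config.yaml` → `plano_config.yaml`) follow a mechanical pattern; a minimal sketch of that substitution on an illustrative compose fragment (the fragment itself is hypothetical, not taken from the repo):

```python
import re

# Illustrative compose fragment using the old names (not an actual repo file).
snippet = """\
environment:
  - ARCH_CONFIG_PATH=/config/config.yaml
volumes:
  - ./config.yaml:/app/arch_config.yaml
"""

# Apply the same two renames the commit performs.
renamed = re.sub(r"ARCH_CONFIG_", "PLANO_CONFIG_", snippet)
renamed = re.sub(r"arch_config\.yaml", "plano_config.yaml", renamed)

print("PLANO_CONFIG_PATH" in renamed)  # True
print("arch_config" in renamed)        # False
```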
This commit is contained in:
parent 0557f7ff98
commit ba651aaf71
115 changed files with 504 additions and 505 deletions
@@ -18,7 +18,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:

@@ -1 +1 @@
-This demo shows how you can use a publicly hosted rest api and interact it using arch gateway.
+This demo shows how you can use a publicly hosted rest api and interact it using Plano gateway.

@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:
@@ -1,6 +1,6 @@
 # Multi-Turn Agentic Demo (RAG)

-This demo showcases how the **Arch** can be used to build accurate multi-turn RAG agent by just writing simple APIs.
+This demo showcases how **Plano** can be used to build accurate multi-turn RAG agent by just writing simple APIs.

 

@@ -14,7 +14,7 @@ Provides information about various energy sources and considerations.

 # Starting the demo
 1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
-2. Start Arch
+2. Start Plano
 ```sh
 sh run_demo.sh
 ```

@@ -21,4 +21,4 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml

     # Step 4: Start Network Agent

@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping HR Agent using Docker Compose..."
     docker compose down -v

-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:

@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml

     # Step 4: Start developer services
@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping Network Agent using Docker Compose..."
     docker compose down

-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }

@@ -1,11 +1,11 @@
 # Function calling

-This demo shows how you can use Arch's core function calling capabilities.
+This demo shows how you can use Plano's core function calling capabilities.

 # Starting the demo

 1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
-2. Start Arch
+2. Start Plano

 3. ```sh
    sh run_demo.sh
@@ -15,14 +15,14 @@ This demo shows how you can use Arch's core function calling capabilities.

 # Observability

-Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visalize the stats in dashboard. To see grafana dashboard follow instructions below,
+Plano gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from Plano and we are using grafana to visalize the stats in dashboard. To see grafana dashboard follow instructions below,

 1. Start grafana and prometheus using following command
 ```yaml
 docker compose --profile monitoring up
 ```
 2. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
-3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
+3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats

 Here is a sample interaction,
 <img width="575" alt="image" src="https://github.com/user-attachments/assets/e0929490-3eb2-4130-ae87-a732aea4d059">

@@ -20,7 +20,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   otel-collector:
     build:
@@ -20,7 +20,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:

@@ -20,7 +20,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   otel-collector:
     build:

@@ -23,7 +23,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   prometheus:
     build:
@@ -20,4 +20,4 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

@@ -72,8 +72,8 @@ start_demo() {
         exit 1
     fi

-    # Step 4: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 4: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml

     # Step 5: Start Network Agent with the chosen Docker Compose file

@@ -91,8 +91,8 @@ stop_demo() {
         docker compose -f "$compose_file" down
     done

-    # Stop Arch
-    echo "Stopping Arch..."
+    # Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
demos/shared/chatbot_ui/.vscode/launch.json (vendored, 4 changes)
@@ -15,7 +15,7 @@
         "LLM": "1",
         "CHAT_COMPLETION_ENDPOINT": "http://localhost:10000/v1",
         "STREAMING": "True",
-        "ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
+        "PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
       }
     },
     {

@@ -29,7 +29,7 @@
         "LLM": "1",
         "CHAT_COMPLETION_ENDPOINT": "http://localhost:12000/v1",
         "STREAMING": "True",
-        "ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
+        "PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
       }
     },
 ]
@@ -37,7 +37,7 @@ def chat(

     try:
         response = client.chat.completions.create(
-            # we select model from arch_config file
+            # we select model from plano_config file
             model="None",
             messages=history,
             temperature=1.0,

@@ -86,7 +86,7 @@ def create_gradio_app(demo_description, client):

         with gr.Column(scale=2):
             chatbot = gr.Chatbot(
-                label="Arch Chatbot",
+                label="Plano Chatbot",
                 elem_classes="chatbot",
             )
             textbox = gr.Textbox(

@@ -110,7 +110,7 @@ def process_stream_chunk(chunk, history):
     delta = chunk.choices[0].delta
     if delta.role and delta.role != history[-1]["role"]:
         # create new history item if role changes
-        # this is likely due to arch tool call and api response
+        # this is likely due to Plano tool call and api response
         history.append({"role": delta.role})

     history[-1]["model"] = chunk.model
@@ -159,7 +159,7 @@ def convert_prompt_target_to_openai_format(target):

 def get_prompt_targets():
     try:
-        with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
+        with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
             config = yaml.safe_load(file)

         available_tools = []

@@ -181,7 +181,7 @@ def get_prompt_targets():

 def get_llm_models():
     try:
-        with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
+        with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
             config = yaml.safe_load(file)

         available_models = [""]
@@ -787,7 +787,7 @@
   },
   "timepicker": {},
   "timezone": "browser",
-  "title": "Arch Gateway Dashboard",
+  "title": "Plano Gateway Dashboard",
   "uid": "adt6uhx5lk8aob",
   "version": 1,
   "weekStart": ""
@@ -19,17 +19,17 @@ def get_data_chunks(stream, n=1):
     return chunks


-def get_arch_messages(response_json):
-    arch_messages = []
+def get_plano_messages(response_json):
+    plano_messages = []
     if response_json and "metadata" in response_json:
-        # load arch_state from metadata
-        arch_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
-        # parse arch_state into json object
-        arch_state = json.loads(arch_state_str)
-        # load messages from arch_state
-        arch_messages_str = arch_state.get("messages", "[]")
+        # load plano_state from metadata
+        plano_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
+        # parse plano_state into json object
+        plano_state = json.loads(plano_state_str)
+        # load messages from plano_state
+        plano_messages_str = plano_state.get("messages", "[]")
         # parse messages into json object
-        arch_messages = json.loads(arch_messages_str)
-        # append messages from arch gateway to history
-        return arch_messages
+        plano_messages = json.loads(plano_messages_str)
+        # append messages from plano gateway to history
+        return plano_messages
     return []
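The renamed helper above only changes identifiers; its decode path (a JSON state blob carried in response metadata, itself containing a JSON-encoded message list) can be sketched standalone. The header value below is a placeholder: the real `ARCH_STATE_HEADER` constant is defined elsewhere in the test suite and, per the commit note, intentionally keeps its arch-prefixed value.

```python
import json

# Placeholder for the real constant, which the commit intentionally leaves arch-prefixed.
ARCH_STATE_HEADER = "x-arch-state"

def get_plano_messages(response_json):
    """Extract gateway messages from the JSON-encoded state blob in response metadata."""
    if not response_json or "metadata" not in response_json:
        return []
    # The state is double-encoded: metadata -> state JSON -> messages JSON.
    plano_state = json.loads(response_json["metadata"].get(ARCH_STATE_HEADER, "{}"))
    return json.loads(plano_state.get("messages", "[]"))

# Example: a response carrying one tool-call message in its state.
state = json.dumps({"messages": json.dumps([{"role": "assistant", "tool_calls": []}])})
print(get_plano_messages({"metadata": {ARCH_STATE_HEADER: state}}))
```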
@@ -1,6 +1,6 @@
 import json
 import os
-from common import get_arch_messages
+from common import get_plano_messages
 import pytest
 import requests
 from deepdiff import DeepDiff

@@ -46,10 +46,10 @@ def test_demos(test_data):
     assert choices[0]["message"]["role"] == "assistant"
     assert expected_output_contains.lower() in choices[0]["message"]["content"].lower()

-    # now verify arch_messages (tool call and api response) that are sent as response metadata
-    arch_messages = get_arch_messages(response_json)
-    assert len(arch_messages) == 2
-    tool_calls_message = arch_messages[0]
+    # now verify plano_messages (tool call and api response) that are sent as response metadata
+    plano_messages = get_plano_messages(response_json)
+    assert len(plano_messages) == 2
+    tool_calls_message = plano_messages[0]
     tool_calls = tool_calls_message.get("tool_calls", [])
     assert len(tool_calls) > 0
@@ -1,6 +1,6 @@
 # Claude Code Router - Multi-Model Access with Intelligent Routing

-Arch Gateway extends Claude Code to access multiple LLM providers through a single interface. Offering two key benefits:
+Plano extends Claude Code to access multiple LLM providers through a single interface. Offering two key benefits:

 1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
 2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:

@@ -21,15 +21,15 @@ Uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to

 ## How It Works

-Arch Gateway sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
+Plano sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:

 ```
-Your Request → Arch Gateway → Suitable Model → Response
+Your Request → Plano → Suitable Model → Response
                ↓
 [Task Analysis & Model Selection]
 ```

-**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.archgw.com/concepts/llm_providers/supported_providers.html).
+**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.planoai.dev/concepts/llm_providers/supported_providers.html).


 ## Quick Start (5 minutes)
@@ -61,7 +61,7 @@ export ANTHROPIC_API_KEY="your-anthropic-key-here"
 # Add other providers as needed
 ```

-### Step 3: Start Arch Gateway
+### Step 3: Start Plano
 ```bash
 # Install using uv (recommended)
 uv tool install planoai

@@ -122,7 +122,7 @@ planoai cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-co
 ### Environment Variables
 The system automatically configures these variables for Claude Code:
 ```bash
-ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
+ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Plano
 ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
 ```

@@ -147,6 +147,6 @@ llm_providers:

 ## Technical Details

-**How routing works:** Arch intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
+**How routing works:** Plano intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
 **Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
-**Documentation:** [docs.archgw.com](https://docs.archgw.com) for advanced configuration and API details.
+**Documentation:** [docs.planoai.dev](https://docs.planoai.dev) for advanced configuration and API details.
@@ -1,5 +1,5 @@
 #!/usr/bin/env bash
-# Pretty-print ArchGW MODEL_RESOLUTION lines from docker logs
+# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
 # - hides Arch-Router
 # - prints timestamp
 # - colors MODEL_RESOLUTION red

@@ -7,7 +7,7 @@
 # - colors resolved_model magenta
 # - removes provider and streaming

-docker logs -f archgw 2>&1 \
+docker logs -f plano 2>&1 \
   | awk '
     /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
       # extract timestamp between first [ and ]
@@ -19,10 +19,10 @@ services:
       - "12000:12000"
       - "8001:8001"
     environment:
-      - ARCH_CONFIG_PATH=/config/config.yaml
+      - PLANO_CONFIG_PATH=/config/config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
   jaeger:
     build:
@@ -22,14 +22,14 @@ logger = logging.getLogger(__name__)
 ### add new setup
 app = FastAPI(title="RAG Agent Context Builder", version="1.0.0")

-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RAG_MODEL = "gpt-4o-mini"

-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )

 # Global variable to store the knowledge base

@@ -95,15 +95,15 @@ async def find_relevant_passages(
     If no passages are relevant, return "NONE"."""

     try:
-        # Call archgw to select relevant passages
-        logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
+        # Call Plano to select relevant passages
+        logger.info(f"Calling Plano to find relevant passages for query: '{query}'")

         # Prepare extra headers if traceparent is provided
         extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
         if traceparent:
             extra_headers["traceparent"] = traceparent

-        response = await archgw_client.chat.completions.create(
+        response = await plano_client.chat.completions.create(
             model=RAG_MODEL,
             messages=[{"role": "system", "content": system_prompt}],
             temperature=0.1,
@@ -22,14 +22,14 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)

-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 GUARD_MODEL = "gpt-4o-mini"

-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )

 app = FastAPI(title="RAG Agent Input Guards", version="1.0.0")

@@ -93,13 +93,13 @@ Respond in JSON format:
     ]

     try:
-        # Call archgw using OpenAI client
+        # Call Plano using OpenAI client
         extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header

         logger.info(f"Validating query scope: '{last_user_message}'")
-        response = await archgw_client.chat.completions.create(
+        response = await plano_client.chat.completions.create(
             model=GUARD_MODEL,
             messages=guard_messages,
             temperature=0.1,
@@ -20,20 +20,20 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)

-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 QUERY_REWRITE_MODEL = "gpt-4o-mini"

-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )

 app = FastAPI(title="RAG Agent Query Rewriter", version="1.0.0")


-async def rewrite_query_with_archgw(
+async def rewrite_query_with_plano(
     messages: List[ChatMessage],
     traceparent_header: Optional[str] = None,
     request_id: Optional[str] = None,

@@ -59,8 +59,8 @@ Return only the rewritten query, nothing else."""
         extra_headers["traceparent"] = traceparent_header

     try:
-        logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
-        resp = await archgw_client.chat.completions.create(
+        logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
+        resp = await plano_client.chat.completions.create(
             model=QUERY_REWRITE_MODEL,
             messages=rewrite_messages,
             temperature=0.3,
@@ -96,7 +96,7 @@ async def query_rewriter_http(
     else:
         logger.info("No traceparent header found")

-    rewritten_query = await rewrite_query_with_archgw(
+    rewritten_query = await rewrite_query_with_plano(
         messages, traceparent_header, request_id
     )
     # Create updated messages with the rewritten query

@@ -22,7 +22,7 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)

-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RESPONSE_MODEL = "gpt-4o"
@@ -38,10 +38,10 @@ Your response should:

 Generate a complete response to assist the user."""

-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )

 # FastAPI app for REST server

@@ -95,9 +95,9 @@ async def stream_chat_completions(
     response_messages = prepare_response_messages(request_body)

     try:
-        # Call archgw using OpenAI client for streaming
+        # Call Plano using OpenAI client for streaming
         logger.info(
-            f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
+            f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
         )

         # Prepare extra headers if traceparent is provided

@@ -105,7 +105,7 @@ async def stream_chat_completions(
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header

-        response_stream = await archgw_client.chat.completions.create(
+        response_stream = await plano_client.chat.completions.create(
             model=RESPONSE_MODEL,
             messages=response_messages,
             temperature=request_body.temperature or 0.7,
@@ -1,15 +1,15 @@
 # LLM Routing
-This demo shows how you can arch gateway to manage keys and route to upstream LLM.
+This demo shows how you can use Plano gateway to manage keys and route to upstream LLM.

 # Starting the demo
 1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
-1. Start Arch
+1. Start Plano
 ```sh
 sh run_demo.sh
 ```
 1. Navigate to http://localhost:18080/

-Following screen shows an example of interaction with arch gateway showing dynamic routing. You can select between different LLMs using "override model" option in the chat UI.
+Following screen shows an example of interaction with Plano gateway showing dynamic routing. You can select between different LLMs using "override model" option in the chat UI.

 

@@ -47,12 +47,12 @@ $ curl --header 'Content-Type: application/json' \
 ```

 # Observability
-Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visualize the stats in dashboard. To see grafana dashboard follow instructions below,
+Plano gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from Plano and we are using grafana to visualize the stats in dashboard. To see grafana dashboard follow instructions below,

 1. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
-1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
+1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
 1. For tracing you can head over to http://localhost:16686/ to view recent traces.

-Following is a screenshot of tracing UI showing call received by arch gateway and making upstream call to LLM,
+Following is a screenshot of tracing UI showing call received by Plano gateway and making upstream call to LLM,

 
@@ -8,11 +8,11 @@ services:
       - "12000:12000"
       - "12001:12001"
     environment:
-      - ARCH_CONFIG_PATH=/app/arch_config.yaml
+      - PLANO_CONFIG_PATH=/app/plano_config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
       - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
     volumes:
-      - ./config.yaml:/app/arch_config.yaml:ro
+      - ./config.yaml:/app/plano_config.yaml:ro
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem

   anythingllm:

@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml

     # Step 4: Start LLM Routing
@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping LLM Routing using Docker Compose..."
     docker compose down

-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }

@@ -21,10 +21,10 @@ services:
       - "12000:12000"
       - "8001:8001"
     environment:
-      - ARCH_CONFIG_PATH=/config/config.yaml
+      - PLANO_CONFIG_PATH=/config/config.yaml
      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
   jaeger:
     build:
@@ -18,14 +18,14 @@ logging.basicConfig(
 logger = logging.getLogger(__name__)


-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RAG_MODEL = "gpt-4o-mini"

-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )

 # Global variable to store the knowledge base

@@ -91,8 +91,8 @@ async def find_relevant_passages(
     If no passages are relevant, return "NONE"."""

     try:
-        # Call archgw to select relevant passages
-        logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
+        # Call Plano to select relevant passages
+        logger.info(f"Calling Plano to find relevant passages for query: '{query}'")

         # Prepare extra headers if traceparent is provided
         extra_headers = {
@@ -103,7 +103,7 @@ async def find_relevant_passages(
         if traceparent:
             extra_headers["traceparent"] = traceparent

-        response = await archgw_client.chat.completions.create(
+        response = await plano_client.chat.completions.create(
             model=RAG_MODEL,
             messages=[{"role": "system", "content": system_prompt}],
             temperature=0.1,

@@ -20,14 +20,14 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)

-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 GUARD_MODEL = "gpt-4o-mini"

-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )

 app = FastAPI()
|
||||
|
|
@ -91,7 +91,7 @@ Respond in JSON format:
|
|||
]
|
||||
|
||||
try:
|
||||
# Call archgw using OpenAI client
|
||||
# Call Plano using OpenAI client
|
||||
extra_headers = {"x-envoy-max-retries": "3"}
|
||||
if traceparent_header:
|
||||
extra_headers["traceparent"] = traceparent_header
|
||||
|
|
@ -100,7 +100,7 @@ Respond in JSON format:
|
|||
extra_headers["x-request-id"] = request_id
|
||||
|
||||
logger.info(f"Validating query scope: '{last_user_message}'")
|
||||
response = await archgw_client.chat.completions.create(
|
||||
response = await plano_client.chat.completions.create(
|
||||
model=GUARD_MODEL,
|
||||
messages=guard_messages,
|
||||
temperature=0.1,
|
||||
|
|
|
|||
|
|
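The `extra_headers` construction touched here appears in several of these services. A self-contained sketch of that pattern, with an illustrative function name not taken from the repo:

```python
from typing import Optional

def build_extra_headers(traceparent: Optional[str] = None,
                        request_id: Optional[str] = None) -> dict:
    """Headers forwarded to the Plano gateway on each call.

    Envoy retries are always requested; the tracing and request-id
    headers are attached only when the caller actually received them.
    """
    headers = {"x-envoy-max-retries": "3"}
    if traceparent:
        headers["traceparent"] = traceparent
    if request_id:
        headers["x-request-id"] = request_id
    return headers
```

Keeping the optional headers conditional means downstream services never see an empty `traceparent`, which some tracing backends reject.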
@@ -19,20 +19,20 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 QUERY_REWRITE_MODEL = "gpt-4o-mini"
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 app = FastAPI()
 
 
-async def rewrite_query_with_archgw(
+async def rewrite_query_with_plano(
     messages: List[ChatMessage],
     traceparent_header: str,
     request_id: Optional[str] = None,

@@ -57,14 +57,14 @@ async def rewrite_query_with_archgw(
         rewrite_messages.append({"role": msg.role, "content": msg.content})
 
     try:
-        # Call archgw using OpenAI client
+        # Call Plano using OpenAI client
         extra_headers = {"x-envoy-max-retries": "3"}
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header
         if request_id:
             extra_headers["x-request-id"] = request_id
-        logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
-        response = await archgw_client.chat.completions.create(
+        logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
+        response = await plano_client.chat.completions.create(
            model=QUERY_REWRITE_MODEL,
            messages=rewrite_messages,
            temperature=0.3,

@@ -88,7 +88,7 @@ async def rewrite_query_with_archgw(
 
 
 async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
-    """Chat completions endpoint that rewrites the last user query using archgw.
+    """Chat completions endpoint that rewrites the last user query using Plano.
 
     Returns a dict with a 'messages' key containing the updated message list.
     """

@@ -104,8 +104,8 @@ async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
     else:
         logger.info("No traceparent header found")
 
-    # Call archgw to rewrite the last user query
-    rewritten_query = await rewrite_query_with_archgw(
+    # Call Plano to rewrite the last user query
+    rewritten_query = await rewrite_query_with_plano(
         messages, traceparent_header, request_id
     )
@@ -22,7 +22,7 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RESPONSE_MODEL = "gpt-4o"
 

@@ -38,10 +38,10 @@ Your response should:
 
 Generate a complete response to assist the user."""
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 # FastAPI app for REST server

@@ -94,9 +94,9 @@ async def stream_chat_completions(
     response_messages = prepare_response_messages(request_body)
 
     try:
-        # Call archgw using OpenAI client for streaming
+        # Call Plano using OpenAI client for streaming
         logger.info(
-            f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
+            f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
         )
 
         logger.info(f"rag_agent - request_id: {request_id}")

@@ -107,7 +107,7 @@ async def stream_chat_completions(
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header
 
-        response_stream = await archgw_client.chat.completions.create(
+        response_stream = await plano_client.chat.completions.create(
            model=RESPONSE_MODEL,
            messages=response_messages,
            temperature=request_body.temperature or 0.7,
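The streaming service above forwards chunks from `response_stream` with `async for`. A stdlib-only sketch of that consumption shape, using a fake generator in place of the real gateway stream (all names here are illustrative):

```python
import asyncio

# Stand-in async generator; in the demo the stream is the object returned
# by plano_client.chat.completions.create(..., stream=True).
async def fake_stream():
    for piece in ["Hel", "lo", "!"]:
        yield piece

async def collect(stream) -> str:
    # Same `async for` iteration the demo uses to forward chunks downstream.
    return "".join([chunk async for chunk in stream])

result = asyncio.run(collect(fake_stream()))
```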
@@ -1,6 +1,6 @@
 # Model Alias Demo Suite
 
-This directory contains demos for the model alias feature in archgw.
+This directory contains demos for the model alias feature in Plano.
 
 ## Overview
 

@@ -48,7 +48,7 @@ model_aliases:
 ```
 
 ## Prerequisites
-- Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
+- Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
 - Set your API keys in your environment:
   - `export OPENAI_API_KEY=your-openai-key`
   - `export ANTHROPIC_API_KEY=your-anthropic-key` (optional, but recommended for Anthropic tests)

@@ -60,13 +60,13 @@ model_aliases:
    sh run_demo.sh
    ```
    - This will create a `.env` file with your API keys (if not present).
-   - Starts Arch Gateway with model alias config (`arch_config_with_aliases.yaml`).
+   - Starts Plano gateway with model alias config (`arch_config_with_aliases.yaml`).
 
 2. To stop the demo:
    ```sh
    sh run_demo.sh down
    ```
-   - This will stop Arch Gateway and any related services.
+   - This will stop Plano gateway and any related services.
 
 ## Example Requests
 

@@ -145,4 +145,4 @@ curl -sS -X POST "http://localhost:12000/v1/messages" \
 ## Troubleshooting
 - Ensure your API keys are set in your environment before running the demo.
 - If you see errors about missing keys, set them and re-run the script.
-- For more details, see the main Arch documentation.
+- For more details, see the main Plano documentation.
@@ -24,11 +24,11 @@ start_demo() {
         echo ".env file created with API keys."
     fi
 
-    # Step 3: Start Arch
-    echo "Starting Arch with arch_config_with_aliases.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with arch_config_with_aliases.yaml..."
     planoai up arch_config_with_aliases.yaml
 
-    echo "\n\nArch started successfully."
+    echo "\n\nPlano started successfully."
     echo "Please run the following CURL command to test model alias routing. Additional instructions are in the README.md file. \n"
     echo "curl -sS -X POST \"http://localhost:12000/v1/chat/completions\" \
     -H \"Authorization: Bearer test-key\" \

@@ -46,8 +46,8 @@ start_demo() {
 
 # Function to stop the demo
 stop_demo() {
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
 
@@ -1,6 +1,6 @@
 # Model Choice Newsletter Demo
 
-This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Arch Gateway (`plano`). It includes both a minimal test harness and a sample proxy configuration.
+This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Plano (`plano`). It includes both a minimal test harness and a sample proxy configuration.
 
 ---
 

@@ -85,13 +85,13 @@ See `config.yaml` for a sample configuration mapping aliases to provider models.
    ```
 
 2. **Install dependencies:**
-   - Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
+   - Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
    - Then run
    ```sh
    uv sync
    ```
 
-3. **Start Arch Gateway**
+3. **Start Plano**
    ```sh
    run_demo.sh
    ```
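For orientation, a `model_aliases` block of the kind this README points at might look roughly like the following. This is a guess assembled from the alias names used in `bench.py`, not the actual schema; consult the demo's `config.yaml` for the real field names:

```yaml
# Hypothetical sketch only; field names may differ from the real schema.
model_aliases:
  arch.summarize.v1:
    target: gpt-4o-mini
  arch.reason.v1:
    target: gpt-4o
```

The point of the indirection is that callers request `arch.summarize.v1` and the gateway decides which provider model serves it, so swapping models never touches application code.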
@@ -3,7 +3,7 @@ import json, time, yaml, statistics as stats
 from pydantic import BaseModel, ValidationError
 from openai import OpenAI
 
-# archgw endpoint (keys are handled by archgw)
+# Plano endpoint (keys are handled by Plano)
 client = OpenAI(base_url="http://localhost:12000/v1", api_key="n/a")
 MODELS = ["arch.summarize.v1", "arch.reason.v1"]
 FIXTURES = "evals_summarize.yaml"
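`bench.py` imports `statistics as stats` for its latency summaries. A small sketch in the same spirit (the helper name and output keys are illustrative, not from the script):

```python
import statistics as stats

def latency_summary(samples_ms: list) -> dict:
    """Summarize per-model request latencies collected from repeated calls."""
    return {
        "p50_ms": stats.median(samples_ms),
        "mean_ms": round(stats.fmean(samples_ms), 2),
    }
```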
@@ -1,7 +1,7 @@
 [project]
 name = "model-choice-newsletter-code-snippets"
 version = "0.1.0"
-description = "Benchmarking model alias routing with Arch Gateway."
+description = "Benchmarking model alias routing with Plano."
 authors = [{name = "Your Name", email = "your@email.com"}]
 license = {text = "Apache 2.0"}
 readme = "README.md"
@@ -17,18 +17,18 @@ start_demo() {
         echo ".env file created with API keys."
     fi
 
-    # Step 3: Start Arch
-    echo "Starting Arch with arch_config_with_aliases.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with arch_config_with_aliases.yaml..."
     planoai up arch_config_with_aliases.yaml
 
-    echo "\n\nArch started successfully."
+    echo "\n\nPlano started successfully."
     echo "Please run the following command to test the setup: python bench.py\n"
 }
 
 # Function to stop the demo
 stop_demo() {
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
 
@@ -8,12 +8,12 @@ services:
       - "8001:8001"
       - "12000:12000"
     environment:
-      - ARCH_CONFIG_PATH=/app/arch_config.yaml
+      - PLANO_CONFIG_PATH=/app/plano_config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
       - OTEL_TRACING_GRPC_ENDPOINT=http://jaeger:4317
       - LOG_LEVEL=${LOG_LEVEL:-info}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml:ro
+      - ./config.yaml:/app/plano_config.yaml:ro
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
 
   crewai-flight-agent:
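Pulling the renamed pieces together, a gateway service block in the new convention looks roughly like this. The service name is illustrative; the env var, mount target, and key requirement are taken from the hunks in this commit:

```yaml
services:
  plano:  # illustrative service name
    ports:
      - "12000:12000"
    environment:
      - PLANO_CONFIG_PATH=/app/plano_config.yaml
      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
    volumes:
      - ./config.yaml:/app/plano_config.yaml:ro
```

Note that the host file can keep any name (`config.yaml` here); only the in-container path must match `PLANO_CONFIG_PATH`.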
@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   jaeger:
     build:
@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   otel-collector:
     build:
@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi
 
-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml
 
     # Step 4: Start developer services

@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping Network Agent using Docker Compose..."
     docker compose down
 
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
 
@@ -17,7 +17,7 @@ Make sure your machine is up to date with [latest version of plano]([url](https:
 # Or if installed with uv: uvx planoai up --service plano --foreground
 2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.6
 2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/config.yaml
-2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: plano, tag: katanemo/plano:0.4.6
+2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.6
 2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
 2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
 2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!
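After a sweeping rename like this one, it is worth auditing for stale identifiers that the log output would otherwise surface at runtime. A hedged sketch against a scratch file (in the repo you would grep the real configs and compose files):

```shell
# Write a sample of renamed settings to a scratch file, then verify no
# old-style ARCH_ identifiers survive in it.
cat > /tmp/plano_rename_check.env <<'EOF'
PLANO_CONFIG_PATH=/app/plano_config.yaml
PLANO_CONFIG_FILE=/app/plano_config.yaml
EOF
if grep -q 'ARCH_' /tmp/plano_rename_check.env; then
  echo "stale references found"
else
  echo "clean"
fi
```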
@@ -8,14 +8,14 @@ services:
       - "12000:12000"
       - "12001:12001"
     environment:
-      - ARCH_CONFIG_PATH=/app/arch_config.yaml
+      - PLANO_CONFIG_PATH=/app/plano_config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
       - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY environment variable is required but not set}
       - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
       - OTEL_TRACING_ENABLED=true
       - RUST_LOG=debug
     volumes:
-      - ./config.yaml:/app/arch_config.yaml:ro
+      - ./config.yaml:/app/plano_config.yaml:ro
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
 
   anythingllm:
@@ -1,6 +1,6 @@
 # Use Case Demo: Bearer Authorization with Spotify APIs
 
-In this demo, we show how you can use Arch's bearer authorization capability to connect your agentic apps to third-party APIs.
+In this demo, we show how you can use Plano's bearer authorization capability to connect your agentic apps to third-party APIs.
 More specifically, we demonstrate how you can connect to two Spotify APIs:
 
 - [`/v1/browse/new-releases`](https://developer.spotify.com/documentation/web-api/reference/get-new-releases)

@@ -23,7 +23,7 @@ Where users can engage by asking questions like _"Show me the latest releases in
    SPOTIFY_CLIENT_KEY=your_spotify_api_token
    ```
 
-3. Start Arch
+3. Start Plano
    ```sh
    sh run_demo.sh
    ```
@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   jaeger:
     build:
@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi
 
-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml
 
     # Step 4: Start developer services

@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping Network Agent using Docker Compose..."
     docker compose down
 
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
 
@@ -8,10 +8,10 @@ services:
       - "12000:12000"
       - "8001:8001"
     environment:
-      - ARCH_CONFIG_PATH=/config/config.yaml
+      - PLANO_CONFIG_PATH=/config/config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
   weather-agent:
     build: