mirror of https://github.com/katanemo/plano.git — synced 2026-04-24 16:26:34 +02:00
Rename all arch references to plano (#745)
* Rename all arch references to plano across the codebase
Complete rebrand from "Arch"/"archgw" to "Plano" including:
- Config files: arch_config_schema.yaml, workflow, demo configs
- Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_*
- Python CLI: variables, functions, file paths, docker mounts
- Rust crates: config paths, log messages, metadata keys
- Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore
- Docker Compose: volume mounts and env vars across all demos/tests
- GitHub workflows: job/step names
- Shell scripts: log messages
- Demos: Python code, READMEs, VS Code configs, Grafana dashboard
- Docs: RST includes, code comments, config references
- Package metadata: package.json, pyproject.toml, uv.lock
External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is.
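The `ARCH_CONFIG_*` → `PLANO_CONFIG_*` rename surfaces in the config generator as plain `os.getenv(name, default)` lookups. A minimal sketch of the new names and defaults as they appear in the diff below, plus a hypothetical bridging shim (not part of this commit) for deployments that still export the legacy variables:

```python
import os

# Hypothetical shim, not part of this commit: let deployments that still
# export the legacy ARCH_* variables feed the new PLANO_* lookups.
for old, new in [
    ("ARCH_CONFIG_FILE", "PLANO_CONFIG_FILE"),
    ("ARCH_CONFIG_SCHEMA_FILE", "PLANO_CONFIG_SCHEMA_FILE"),
]:
    if old in os.environ and new not in os.environ:
        os.environ[new] = os.environ[old]

# New names and defaults, mirroring the config generator in the diff.
PLANO_CONFIG_FILE = os.getenv("PLANO_CONFIG_FILE", "/app/plano_config.yaml")
PLANO_CONFIG_SCHEMA_FILE = os.getenv(
    "PLANO_CONFIG_SCHEMA_FILE", "plano_config_schema.yaml"
)

print(PLANO_CONFIG_FILE, PLANO_CONFIG_SCHEMA_FILE)
```

The shim is optional; the commit itself renames the variables outright, so a deployment must either export the new names or rely on the baked-in defaults.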
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Update remaining arch references in docs
- Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_*
- Update label references in request_lifecycle.rst
- Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml
- Update config YAML comments: "Arch creates/uses" → "Plano creates/uses"
- Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst
- Update arch_config_schema.yaml reference in provider_models.py
- Rename arch_agent_router → plano_agent_router in config example
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix remaining arch references found in second pass
- config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE,
arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs
- config/test_passthrough.yaml: container mount path
- tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml)
- cli/planoai/core.py: comment and log message
- crates/brightstaff/src/tracing/constants.rs: doc comment
- tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages,
arch_state/arch_messages variables renamed
- tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages
- demos/shared/test_runner/{common,test_demos}.py: same renames
- tests/e2e/test_model_alias_routing.py: docstring
- .dockerignore: archgw_modelserver → plano_modelserver
- demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name
Note: x-arch-* HTTP header values and Rust constant names intentionally
preserved for backwards compatibility with existing deployments.
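Because the wire-level `x-arch-*` header values were left untouched, clients that key off that prefix keep working after the rebrand. A minimal sketch of such a prefix check (the concrete header name is hypothetical, for illustration only):

```python
def is_gateway_header(name: str) -> bool:
    # The x-arch-* prefix was intentionally preserved in this commit, so
    # legacy matching logic like this stays correct post-rename.
    return name.lower().startswith("x-arch-")

# "X-Arch-Request-Id" is a hypothetical header name used for illustration.
print(is_gateway_header("X-Arch-Request-Id"))  # True
print(is_gateway_header("content-type"))       # False
```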
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
parent 0557f7ff98
commit ba651aaf71
115 changed files with 504 additions and 505 deletions
.dockerignore
@@ -33,8 +33,8 @@ cli/__pycache__/
 cli/planoai/__pycache__/
 
 # Python model server
-archgw_modelserver/
-arch_tools/
+plano_modelserver/
+plano_tools/
 
 # Misc
 *.md
@@ -44,4 +44,4 @@ turbo.json
 package.json
 *.sh
 !cli/build_cli.sh
-arch_config.yaml_rendered
+plano_config.yaml_rendered
.github/workflows/e2e_plano_tests.yml (vendored, 2 changes)
@@ -28,7 +28,7 @@ jobs:
           python-version: ${{ matrix.python-version }}
           cache: "pip" # auto-caches based on requirements files
 
-      - name: build arch docker image
+      - name: build plano docker image
         run: |
           cd ../../ && docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6 -t katanemo/plano:latest
@@ -38,7 +38,7 @@ jobs:
           curl --location --remote-name https://github.com/Orange-OpenSource/hurl/releases/download/4.0.0/hurl_4.0.0_amd64.deb
           sudo dpkg -i hurl_4.0.0_amd64.deb
 
-      - name: install arch gateway and test dependencies
+      - name: install plano gateway and test dependencies
         run: |
           source venv/bin/activate
           cd cli && echo "installing plano cli" && uv sync && uv tool install .
@@ -22,7 +22,7 @@ jobs:
         with:
           python-version: "3.12"
 
-      - name: build arch docker image
+      - name: build plano docker image
         run: |
           docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6
@@ -38,7 +38,7 @@ jobs:
           curl --location --remote-name https://github.com/Orange-OpenSource/hurl/releases/download/4.0.0/hurl_4.0.0_amd64.deb
           sudo dpkg -i hurl_4.0.0_amd64.deb
 
-      - name: install arch gateway and test dependencies
+      - name: install plano gateway and test dependencies
         run: |
           source venv/bin/activate
           cd cli && echo "installing plano cli" && uv sync && uv tool install .
@@ -1,4 +1,4 @@
-name: arch config tests
+name: plano config tests
 
 on:
   push:
@@ -7,7 +7,7 @@ on:
   pull_request:
 
 jobs:
-  validate_arch_config:
+  validate_plano_config:
     runs-on: ubuntu-latest
     defaults:
       run:
@@ -22,10 +22,10 @@ jobs:
         with:
           python-version: "3.12"
 
-      - name: build arch docker image
+      - name: build plano docker image
         run: |
           docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6
 
-      - name: validate arch config
+      - name: validate plano config
        run: |
          bash config/validate_plano_config.sh
.gitignore (vendored, 14 changes)
@@ -107,30 +107,30 @@ venv.bak/
 
 # =========================================
 
-# Arch
+# Plano
 cli/config
 cli/build
 
-# Archgw - Docs
+# Plano - Docs
 docs/build/
 
-# Archgw - Demos
+# Plano - Demos
 demos/function_calling/ollama/models/
 demos/function_calling/ollama/id_ed*
 demos/function_calling/open-webui/
 demos/shared/signoz/data
 
-# Arch - Miscellaneous
+# Plano - Miscellaneous
 grafana-data
 prom_data
-arch_log/
-arch_logs/
+plano_log/
+plano_logs/
 crates/*/target/
 crates/target/
 build.log
 
-archgw.log
+plano.log
 
 # Next.js / Turborepo
 .next/
@@ -96,7 +96,7 @@ Entry point: `cli/planoai/main.py`. Container lifecycle in `core.py`. Docker ope
 
 ### Configuration System (config/)
 
-- `arch_config_schema.yaml` — JSON Schema (draft-07) for validating user config files
+- `plano_config_schema.yaml` — JSON Schema (draft-07) for validating user config files
 - `envoy.template.yaml` — Jinja2 template rendered into Envoy proxy config
 - `supervisord.conf` — Process supervisor for Envoy + brightstaff in the container
@@ -69,7 +69,7 @@ RUN uv run pip install --no-cache-dir .
 
 COPY cli/planoai planoai/
 COPY config/envoy.template.yaml .
-COPY config/arch_config_schema.yaml .
+COPY config/plano_config_schema.yaml .
 COPY config/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
 
 COPY --from=wasm-builder /arch/target/wasm32-wasip1/release/prompt_gateway.wasm /etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm
@@ -115,7 +115,6 @@ export const siteConfig = {
     // Brand (minimal, necessary)
     "Plano AI",
     "Plano gateway",
-    "Arch gateway",
   ],
   authors: [{ name: "Katanemo", url: "https://github.com/katanemo/plano" }],
   creator: "Katanemo",
@@ -240,7 +239,7 @@ export const pageMetadata = {
     "agentic AI",
     "Plano blog",
     "Plano blog posts",
-    "Arch gateway blog",
+    "Plano gateway blog",
   ],
 }),
@@ -53,34 +53,34 @@ def validate_and_render_schema():
     ENVOY_CONFIG_TEMPLATE_FILE = os.getenv(
         "ENVOY_CONFIG_TEMPLATE_FILE", "envoy.template.yaml"
     )
-    ARCH_CONFIG_FILE = os.getenv("ARCH_CONFIG_FILE", "/app/arch_config.yaml")
-    ARCH_CONFIG_FILE_RENDERED = os.getenv(
-        "ARCH_CONFIG_FILE_RENDERED", "/app/arch_config_rendered.yaml"
+    PLANO_CONFIG_FILE = os.getenv("PLANO_CONFIG_FILE", "/app/plano_config.yaml")
+    PLANO_CONFIG_FILE_RENDERED = os.getenv(
+        "PLANO_CONFIG_FILE_RENDERED", "/app/plano_config_rendered.yaml"
     )
     ENVOY_CONFIG_FILE_RENDERED = os.getenv(
         "ENVOY_CONFIG_FILE_RENDERED", "/etc/envoy/envoy.yaml"
     )
-    ARCH_CONFIG_SCHEMA_FILE = os.getenv(
-        "ARCH_CONFIG_SCHEMA_FILE", "arch_config_schema.yaml"
+    PLANO_CONFIG_SCHEMA_FILE = os.getenv(
+        "PLANO_CONFIG_SCHEMA_FILE", "plano_config_schema.yaml"
     )
 
     env = Environment(loader=FileSystemLoader(os.getenv("TEMPLATE_ROOT", "./")))
     template = env.get_template(ENVOY_CONFIG_TEMPLATE_FILE)
 
     try:
-        validate_prompt_config(ARCH_CONFIG_FILE, ARCH_CONFIG_SCHEMA_FILE)
+        validate_prompt_config(PLANO_CONFIG_FILE, PLANO_CONFIG_SCHEMA_FILE)
     except Exception as e:
         print(str(e))
         exit(1)  # validate_prompt_config failed. Exit
 
-    with open(ARCH_CONFIG_FILE, "r") as file:
-        arch_config = file.read()
+    with open(PLANO_CONFIG_FILE, "r") as file:
+        plano_config = file.read()
 
-    with open(ARCH_CONFIG_SCHEMA_FILE, "r") as file:
-        arch_config_schema = file.read()
+    with open(PLANO_CONFIG_SCHEMA_FILE, "r") as file:
+        plano_config_schema = file.read()
 
-    config_yaml = yaml.safe_load(arch_config)
-    _ = yaml.safe_load(arch_config_schema)
+    config_yaml = yaml.safe_load(plano_config)
+    _ = yaml.safe_load(plano_config_schema)
     inferred_clusters = {}
 
     # Convert legacy llm_providers to model_providers
@@ -145,7 +145,7 @@ def validate_and_render_schema():
             inferred_clusters[name]["port"],
         ) = get_endpoint_and_port(endpoint, protocol)
 
-    print("defined clusters from arch_config.yaml: ", json.dumps(inferred_clusters))
+    print("defined clusters from plano_config.yaml: ", json.dumps(inferred_clusters))
 
     if "prompt_targets" in config_yaml:
         for prompt_target in config_yaml["prompt_targets"]:
@@ -154,13 +154,13 @@ def validate_and_render_schema():
                 continue
             if name not in inferred_clusters:
                 raise Exception(
-                    f"Unknown endpoint {name}, please add it in endpoints section in your arch_config.yaml file"
+                    f"Unknown endpoint {name}, please add it in endpoints section in your plano_config.yaml file"
                 )
 
-    arch_tracing = config_yaml.get("tracing", {})
+    plano_tracing = config_yaml.get("tracing", {})
 
     # Resolution order: config yaml > OTEL_TRACING_GRPC_ENDPOINT env var > hardcoded default
-    opentracing_grpc_endpoint = arch_tracing.get(
+    opentracing_grpc_endpoint = plano_tracing.get(
         "opentracing_grpc_endpoint",
         os.environ.get(
             "OTEL_TRACING_GRPC_ENDPOINT", DEFAULT_OTEL_TRACING_GRPC_ENDPOINT
@@ -172,7 +172,7 @@ def validate_and_render_schema():
     print(
         f"Resolved opentracing_grpc_endpoint to {opentracing_grpc_endpoint} after expanding environment variables"
     )
-    arch_tracing["opentracing_grpc_endpoint"] = opentracing_grpc_endpoint
+    plano_tracing["opentracing_grpc_endpoint"] = opentracing_grpc_endpoint
     # ensure that opentracing_grpc_endpoint is a valid URL if present and start with http and must not have any path
     if opentracing_grpc_endpoint:
         urlparse_result = urlparse(opentracing_grpc_endpoint)
@@ -436,8 +436,8 @@ def validate_and_render_schema():
             f"Model alias 2 - '{alias_name}' targets '{target}' which is not defined as a model. Available models: {', '.join(sorted(model_name_keys))}"
         )
 
-    arch_config_string = yaml.dump(config_yaml)
-    arch_llm_config_string = yaml.dump(config_yaml)
+    plano_config_string = yaml.dump(config_yaml)
+    plano_llm_config_string = yaml.dump(config_yaml)
 
     use_agent_orchestrator = config_yaml.get("overrides", {}).get(
         "use_agent_orchestrator", False
@@ -449,11 +449,11 @@ def validate_and_render_schema():
 
     if len(endpoints) == 0:
         raise Exception(
-            "Please provide agent orchestrator in the endpoints section in your arch_config.yaml file"
+            "Please provide agent orchestrator in the endpoints section in your plano_config.yaml file"
         )
     elif len(endpoints) > 1:
         raise Exception(
-            "Please provide single agent orchestrator in the endpoints section in your arch_config.yaml file"
+            "Please provide single agent orchestrator in the endpoints section in your plano_config.yaml file"
         )
     else:
         agent_orchestrator = list(endpoints.keys())[0]
@@ -463,11 +463,11 @@ def validate_and_render_schema():
     data = {
         "prompt_gateway_listener": prompt_gateway,
         "llm_gateway_listener": llm_gateway,
-        "arch_config": arch_config_string,
-        "arch_llm_config": arch_llm_config_string,
-        "arch_clusters": inferred_clusters,
-        "arch_model_providers": updated_model_providers,
-        "arch_tracing": arch_tracing,
+        "plano_config": plano_config_string,
+        "plano_llm_config": plano_llm_config_string,
+        "plano_clusters": inferred_clusters,
+        "plano_model_providers": updated_model_providers,
+        "plano_tracing": plano_tracing,
         "local_llms": llms_with_endpoint,
         "agent_orchestrator": agent_orchestrator,
         "listeners": listeners,
@@ -479,25 +479,25 @@ def validate_and_render_schema():
     with open(ENVOY_CONFIG_FILE_RENDERED, "w") as file:
         file.write(rendered)
 
-    with open(ARCH_CONFIG_FILE_RENDERED, "w") as file:
-        file.write(arch_config_string)
+    with open(PLANO_CONFIG_FILE_RENDERED, "w") as file:
+        file.write(plano_config_string)
 
 
-def validate_prompt_config(arch_config_file, arch_config_schema_file):
-    with open(arch_config_file, "r") as file:
-        arch_config = file.read()
+def validate_prompt_config(plano_config_file, plano_config_schema_file):
+    with open(plano_config_file, "r") as file:
+        plano_config = file.read()
 
-    with open(arch_config_schema_file, "r") as file:
-        arch_config_schema = file.read()
+    with open(plano_config_schema_file, "r") as file:
+        plano_config_schema = file.read()
 
-    config_yaml = yaml.safe_load(arch_config)
-    config_schema_yaml = yaml.safe_load(arch_config_schema)
+    config_yaml = yaml.safe_load(plano_config)
+    config_schema_yaml = yaml.safe_load(plano_config_schema)
 
     try:
         validate(config_yaml, config_schema_yaml)
     except Exception as e:
         print(
-            f"Error validating arch_config file: {arch_config_file}, schema file: {arch_config_schema_file}, error: {e}"
+            f"Error validating plano_config file: {plano_config_file}, schema file: {plano_config_schema_file}, error: {e}"
         )
         raise e
@@ -24,17 +24,17 @@ from planoai.docker_cli import (
 log = getLogger(__name__)
 
 
-def _get_gateway_ports(arch_config_file: str) -> list[int]:
+def _get_gateway_ports(plano_config_file: str) -> list[int]:
     PROMPT_GATEWAY_DEFAULT_PORT = 10000
     LLM_GATEWAY_DEFAULT_PORT = 12000
 
-    # parse arch_config_file yaml file and get prompt_gateway_port
-    arch_config_dict = {}
-    with open(arch_config_file) as f:
-        arch_config_dict = yaml.safe_load(f)
+    # parse plano_config_file yaml file and get prompt_gateway_port
+    plano_config_dict = {}
+    with open(plano_config_file) as f:
+        plano_config_dict = yaml.safe_load(f)
 
     listeners, _, _ = convert_legacy_listeners(
-        arch_config_dict.get("listeners"), arch_config_dict.get("llm_providers")
+        plano_config_dict.get("listeners"), plano_config_dict.get("llm_providers")
     )
 
     all_ports = [listener.get("port") for listener in listeners]
@@ -45,7 +45,7 @@ def _get_gateway_ports(arch_config_file: str) -> list[int]:
     return all_ports
 
 
-def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
+def start_plano(plano_config_file, env, log_timeout=120, foreground=False):
     """
     Start Docker Compose in detached mode and stream logs until services are healthy.
 
@@ -54,7 +54,7 @@ def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
         log_timeout (int): Time in seconds to show logs before checking for healthy state.
     """
     log.info(
-        f"Starting arch gateway, image name: {PLANO_DOCKER_NAME}, tag: {PLANO_DOCKER_IMAGE}"
+        f"Starting plano gateway, image name: {PLANO_DOCKER_NAME}, tag: {PLANO_DOCKER_IMAGE}"
     )
 
     try:
@@ -64,10 +64,10 @@ def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
         docker_stop_container(PLANO_DOCKER_NAME)
         docker_remove_container(PLANO_DOCKER_NAME)
 
-        gateway_ports = _get_gateway_ports(arch_config_file)
+        gateway_ports = _get_gateway_ports(plano_config_file)
 
         return_code, _, plano_stderr = docker_start_plano_detached(
-            arch_config_file,
+            plano_config_file,
             env,
             gateway_ports,
         )
@@ -117,7 +117,7 @@ def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
             stream_gateway_logs(follow=True)
 
     except KeyboardInterrupt:
-        log.info("Keyboard interrupt received, stopping arch gateway service.")
+        log.info("Keyboard interrupt received, stopping plano gateway service.")
         stop_docker_container()
@@ -144,15 +144,15 @@ def stop_docker_container(service=PLANO_DOCKER_NAME):
         log.info(f"Failed to shut down services: {str(e)}")
 
 
-def start_cli_agent(arch_config_file=None, settings_json="{}"):
+def start_cli_agent(plano_config_file=None, settings_json="{}"):
     """Start a CLI client connected to Plano."""
 
-    with open(arch_config_file, "r") as file:
-        arch_config = file.read()
-    arch_config_yaml = yaml.safe_load(arch_config)
+    with open(plano_config_file, "r") as file:
+        plano_config = file.read()
+    plano_config_yaml = yaml.safe_load(plano_config)
 
     # Get egress listener configuration
-    egress_config = arch_config_yaml.get("listeners", {}).get("egress_traffic", {})
+    egress_config = plano_config_yaml.get("listeners", {}).get("egress_traffic", {})
     host = egress_config.get("host", "127.0.0.1")
     port = egress_config.get("port", 12000)
@@ -167,7 +167,7 @@ def start_cli_agent(arch_config_file=None, settings_json="{}"):
     env = os.environ.copy()
     env.update(
         {
-            "ANTHROPIC_AUTH_TOKEN": "test",  # Use test token for arch
+            "ANTHROPIC_AUTH_TOKEN": "test",  # Use test token for plano
             "ANTHROPIC_API_KEY": "",
             "ANTHROPIC_BASE_URL": f"http://{host}:{port}",
             "NO_PROXY": host,
@@ -184,7 +184,7 @@ def start_cli_agent(arch_config_file=None, settings_json="{}"):
         ]
     else:
         # Check if arch.claude.code.small.fast alias exists in model_aliases
-        model_aliases = arch_config_yaml.get("model_aliases", {})
+        model_aliases = plano_config_yaml.get("model_aliases", {})
         if "arch.claude.code.small.fast" in model_aliases:
             env["ANTHROPIC_SMALL_FAST_MODEL"] = "arch.claude.code.small.fast"
         else:
@@ -220,7 +220,7 @@ def start_cli_agent(arch_config_file=None, settings_json="{}"):
 
     # Use claude from PATH
     claude_path = "claude"
-    log.info(f"Connecting Claude Code Agent to Arch at {host}:{port}")
+    log.info(f"Connecting Claude Code Agent to Plano at {host}:{port}")
 
     try:
         subprocess.run([claude_path] + claude_args, env=env, check=True)
@@ -41,7 +41,7 @@ def docker_remove_container(container: str) -> str:
 
 
 def docker_start_plano_detached(
-    arch_config_file: str,
+    plano_config_file: str,
     env: dict,
     gateway_ports: list[int],
 ) -> str:
@@ -58,7 +58,7 @@ def docker_start_plano_detached(
     port_mappings_args = [item for port in port_mappings for item in ("-p", port)]
 
     volume_mappings = [
-        f"{arch_config_file}:/app/arch_config.yaml:ro",
+        f"{plano_config_file}:/app/plano_config.yaml:ro",
     ]
     volume_mappings_args = [
         item for volume in volume_mappings for item in ("-v", volume)
@@ -115,7 +115,7 @@ def stream_gateway_logs(follow, service="plano"):
         log.info(f"Failed to stream logs: {str(e)}")
 
 
-def docker_validate_plano_schema(arch_config_file):
+def docker_validate_plano_schema(plano_config_file):
     import os
 
     env = os.environ.copy()
@@ -129,7 +129,7 @@ def docker_validate_plano_schema(arch_config_file):
         "--rm",
         *env_args,
         "-v",
-        f"{arch_config_file}:/app/arch_config.yaml:ro",
+        f"{plano_config_file}:/app/plano_config.yaml:ro",
         "--entrypoint",
         "python",
         PLANO_DOCKER_IMAGE,
@@ -22,7 +22,7 @@ from planoai.utils import (
     find_repo_root,
 )
 from planoai.core import (
-    start_arch,
+    start_plano,
     stop_docker_container,
     start_cli_agent,
 )
@@ -200,12 +200,12 @@ def up(file, path, foreground, with_tracing, tracing_port):
     _print_cli_header(console)
 
     # Use the utility function to find config file
-    arch_config_file = find_config_file(path, file)
+    plano_config_file = find_config_file(path, file)
 
     # Check if the file exists
-    if not os.path.exists(arch_config_file):
+    if not os.path.exists(plano_config_file):
         console.print(
-            f"[red]✗[/red] Config file not found: [dim]{arch_config_file}[/dim]"
+            f"[red]✗[/red] Config file not found: [dim]{plano_config_file}[/dim]"
         )
         sys.exit(1)
@@ -216,7 +216,7 @@ def up(file, path, foreground, with_tracing, tracing_port):
         validation_return_code,
         _,
         validation_stderr,
-    ) = docker_validate_plano_schema(arch_config_file)
+    ) = docker_validate_plano_schema(plano_config_file)
 
     if validation_return_code != 0:
         console.print(f"[red]✗[/red] Validation failed")
@@ -234,7 +234,7 @@ def up(file, path, foreground, with_tracing, tracing_port):
     env.pop("PATH", None)
 
     # Check access keys
-    access_keys = get_llm_provider_access_keys(arch_config_file=arch_config_file)
+    access_keys = get_llm_provider_access_keys(plano_config_file=plano_config_file)
     access_keys = set(access_keys)
     access_keys = [item[1:] if item.startswith("$") else item for item in access_keys]
@@ -302,7 +302,7 @@ def up(file, path, foreground, with_tracing, tracing_port):
 
     env.update(env_stage)
     try:
-        start_arch(arch_config_file, env, foreground=foreground)
+        start_plano(plano_config_file, env, foreground=foreground)
 
         # When tracing is enabled but --foreground is not, keep the process
         # alive so the OTLP collector continues to receive spans.
@@ -363,35 +363,35 @@ def generate_prompt_targets(file):
 def logs(debug, follow):
     """Stream logs from access logs services."""
 
-    archgw_process = None
+    plano_process = None
     try:
         if debug:
-            archgw_process = multiprocessing.Process(
+            plano_process = multiprocessing.Process(
                 target=stream_gateway_logs, args=(follow,)
             )
-            archgw_process.start()
+            plano_process.start()
 
-        archgw_access_logs_process = multiprocessing.Process(
+        plano_access_logs_process = multiprocessing.Process(
             target=stream_access_logs, args=(follow,)
         )
-        archgw_access_logs_process.start()
-        archgw_access_logs_process.join()
+        plano_access_logs_process.start()
+        plano_access_logs_process.join()
 
-        if archgw_process:
-            archgw_process.join()
+        if plano_process:
+            plano_process.join()
     except KeyboardInterrupt:
         log.info("KeyboardInterrupt detected. Exiting.")
-        if archgw_access_logs_process.is_alive():
-            archgw_access_logs_process.terminate()
-        if archgw_process and archgw_process.is_alive():
-            archgw_process.terminate()
+        if plano_access_logs_process.is_alive():
+            plano_access_logs_process.terminate()
+        if plano_process and plano_process.is_alive():
+            plano_process.terminate()
 
 
 @click.command()
 @click.argument("type", type=click.Choice(["claude"]), required=True)
 @click.argument("file", required=False)  # Optional file argument
 @click.option(
-    "--path", default=".", help="Path to the directory containing arch_config.yaml"
+    "--path", default=".", help="Path to the directory containing plano_config.yaml"
 )
 @click.option(
     "--settings",
@@ -405,20 +405,20 @@ def cli_agent(type, file, path, settings):
     """
 
     # Check if plano docker container is running
-    archgw_status = docker_container_status(PLANO_DOCKER_NAME)
-    if archgw_status != "running":
-        log.error(f"plano docker container is not running (status: {archgw_status})")
+    plano_status = docker_container_status(PLANO_DOCKER_NAME)
+    if plano_status != "running":
+        log.error(f"plano docker container is not running (status: {plano_status})")
         log.error("Please start plano using the 'planoai up' command.")
         sys.exit(1)
 
-    # Determine arch_config.yaml path
-    arch_config_file = find_config_file(path, file)
-    if not os.path.exists(arch_config_file):
-        log.error(f"Config file not found: {arch_config_file}")
+    # Determine plano_config.yaml path
+    plano_config_file = find_config_file(path, file)
+    if not os.path.exists(plano_config_file):
+        log.error(f"Config file not found: {plano_config_file}")
         sys.exit(1)
 
     try:
-        start_cli_agent(arch_config_file, settings)
+        start_cli_agent(plano_config_file, settings)
     except SystemExit:
         # Re-raise SystemExit to preserve exit codes
         raise
@@ -68,19 +68,19 @@ def find_repo_root(start_path=None):
     return None
 
 
-def has_ingress_listener(arch_config_file):
-    """Check if the arch config file has ingress_traffic listener configured."""
+def has_ingress_listener(plano_config_file):
+    """Check if the plano config file has ingress_traffic listener configured."""
     try:
-        with open(arch_config_file) as f:
-            arch_config_dict = yaml.safe_load(f)
+        with open(plano_config_file) as f:
+            plano_config_dict = yaml.safe_load(f)
 
-        ingress_traffic = arch_config_dict.get("listeners", {}).get(
+        ingress_traffic = plano_config_dict.get("listeners", {}).get(
             "ingress_traffic", {}
         )
 
         return bool(ingress_traffic)
     except Exception as e:
-        log.error(f"Error reading config file {arch_config_file}: {e}")
+        log.error(f"Error reading config file {plano_config_file}: {e}")
         return False
@@ -161,27 +161,27 @@ def convert_legacy_listeners(
     return listeners, llm_gateway_listener, prompt_gateway_listener
 
 
-def get_llm_provider_access_keys(arch_config_file):
-    with open(arch_config_file, "r") as file:
-        arch_config = file.read()
-    arch_config_yaml = yaml.safe_load(arch_config)
+def get_llm_provider_access_keys(plano_config_file):
+    with open(plano_config_file, "r") as file:
+        plano_config = file.read()
+    plano_config_yaml = yaml.safe_load(plano_config)
 
     access_key_list = []
 
     # Convert legacy llm_providers to model_providers
-    if "llm_providers" in arch_config_yaml:
-        if "model_providers" in arch_config_yaml:
+    if "llm_providers" in plano_config_yaml:
+        if "model_providers" in plano_config_yaml:
             raise Exception(
                 "Please provide either llm_providers or model_providers, not both. llm_providers is deprecated, please use model_providers instead"
             )
-        arch_config_yaml["model_providers"] = arch_config_yaml["llm_providers"]
-        del arch_config_yaml["llm_providers"]
+        plano_config_yaml["model_providers"] = plano_config_yaml["llm_providers"]
+        del plano_config_yaml["llm_providers"]
 
     listeners, _, _ = convert_legacy_listeners(
-        arch_config_yaml.get("listeners"), arch_config_yaml.get("model_providers")
+        plano_config_yaml.get("listeners"), plano_config_yaml.get("model_providers")
     )
 
-    for prompt_target in arch_config_yaml.get("prompt_targets", []):
+    for prompt_target in plano_config_yaml.get("prompt_targets", []):
         for k, v in prompt_target.get("endpoint", {}).get("http_headers", {}).items():
             if k.lower() == "authorization":
                 print(
@@ -200,7 +200,7 @@ def get_llm_provider_access_keys(arch_config_file):
             access_key_list.append(access_key)
 
     # Extract environment variables from state_storage.connection_string
-    state_storage = arch_config_yaml.get("state_storage_v1_responses")
+    state_storage = plano_config_yaml.get("state_storage_v1_responses")
     if state_storage:
         connection_string = state_storage.get("connection_string")
         if connection_string and isinstance(connection_string, str):
@@ -251,16 +251,16 @@ def find_config_file(path=".", file=None):
         # If a file is provided, process that file
         return os.path.abspath(file)
     else:
-        # If no file is provided, use the path and look for arch_config.yaml first, then config.yaml for convenience
-        arch_config_file = os.path.abspath(os.path.join(path, "config.yaml"))
-        if not os.path.exists(arch_config_file):
-            arch_config_file = os.path.abspath(os.path.join(path, "arch_config.yaml"))
-        return arch_config_file
+        # If no file is provided, use the path and look for plano_config.yaml first, then config.yaml for convenience
+        plano_config_file = os.path.abspath(os.path.join(path, "config.yaml"))
+        if not os.path.exists(plano_config_file):
+            plano_config_file = os.path.abspath(os.path.join(path, "plano_config.yaml"))
+        return plano_config_file
 
 
 def stream_access_logs(follow):
     """
-    Get the archgw access logs
+    Get the plano access logs
     """
 
     follow_arg = "-f" if follow else ""
@@ -12,14 +12,14 @@ def cleanup_env(monkeypatch):


 def test_validate_and_render_happy_path(monkeypatch):
-    monkeypatch.setenv("ARCH_CONFIG_FILE", "fake_arch_config.yaml")
-    monkeypatch.setenv("ARCH_CONFIG_SCHEMA_FILE", "fake_arch_config_schema.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_FILE", "fake_plano_config.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_SCHEMA_FILE", "fake_plano_config_schema.yaml")
     monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
-    monkeypatch.setenv("ARCH_CONFIG_FILE_RENDERED", "fake_arch_config_rendered.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_FILE_RENDERED", "fake_plano_config_rendered.yaml")
     monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
     monkeypatch.setenv("TEMPLATE_ROOT", "../")

-    arch_config = """
+    plano_config = """
 version: v0.1.0

 listeners:
@@ -50,24 +50,24 @@ llm_providers:
 tracing:
   random_sampling: 100
 """
-    arch_config_schema = ""
-    with open("../config/arch_config_schema.yaml", "r") as file:
-        arch_config_schema = file.read()
+    plano_config_schema = ""
+    with open("../config/plano_config_schema.yaml", "r") as file:
+        plano_config_schema = file.read()

     m_open = mock.mock_open()
     # Provide enough file handles for all open() calls in validate_and_render_schema
     m_open.side_effect = [
         # Removed empty read - was causing validation failures
-        mock.mock_open(read_data=arch_config).return_value,  # ARCH_CONFIG_FILE
+        mock.mock_open(read_data=plano_config).return_value,  # PLANO_CONFIG_FILE
         mock.mock_open(
-            read_data=arch_config_schema
-        ).return_value,  # ARCH_CONFIG_SCHEMA_FILE
-        mock.mock_open(read_data=arch_config).return_value,  # ARCH_CONFIG_FILE
+            read_data=plano_config_schema
+        ).return_value,  # PLANO_CONFIG_SCHEMA_FILE
+        mock.mock_open(read_data=plano_config).return_value,  # PLANO_CONFIG_FILE
         mock.mock_open(
-            read_data=arch_config_schema
-        ).return_value,  # ARCH_CONFIG_SCHEMA_FILE
+            read_data=plano_config_schema
+        ).return_value,  # PLANO_CONFIG_SCHEMA_FILE
         mock.mock_open().return_value,  # ENVOY_CONFIG_FILE_RENDERED (write)
-        mock.mock_open().return_value,  # ARCH_CONFIG_FILE_RENDERED (write)
+        mock.mock_open().return_value,  # PLANO_CONFIG_FILE_RENDERED (write)
     ]
     with mock.patch("builtins.open", m_open):
         with mock.patch("planoai.config_generator.Environment"):
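The `side_effect` list in the test above hands the patched `open()` a fresh mock file handle per call, consumed in call order. A minimal standalone illustration of the pattern (file names here are hypothetical, not the ones the real test uses):

```python
from unittest import mock

m_open = mock.mock_open()
# One pre-built handle per expected open() call, returned in order.
m_open.side_effect = [
    mock.mock_open(read_data="config: a").return_value,  # first open -> config
    mock.mock_open(read_data="schema: b").return_value,  # second open -> schema
]

with mock.patch("builtins.open", m_open):
    with open("plano_config.yaml") as f:
        first = f.read()
    with open("plano_config_schema.yaml") as f:
        second = f.read()
```

This is why the comment in the diff says "provide enough file handles": each `open()` in the code under test pops one entry, and too few entries raises `StopIteration`.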
@@ -75,14 +75,14 @@ tracing:


 def test_validate_and_render_happy_path_agent_config(monkeypatch):
-    monkeypatch.setenv("ARCH_CONFIG_FILE", "fake_arch_config.yaml")
-    monkeypatch.setenv("ARCH_CONFIG_SCHEMA_FILE", "fake_arch_config_schema.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_FILE", "fake_plano_config.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_SCHEMA_FILE", "fake_plano_config_schema.yaml")
     monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
-    monkeypatch.setenv("ARCH_CONFIG_FILE_RENDERED", "fake_arch_config_rendered.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_FILE_RENDERED", "fake_plano_config_rendered.yaml")
     monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
     monkeypatch.setenv("TEMPLATE_ROOT", "../")

-    arch_config = """
+    plano_config = """
 version: v0.3.0

 agents:
@@ -123,35 +123,35 @@ model_providers:
   - access_key: ${OPENAI_API_KEY}
     model: openai/gpt-4o
 """
-    arch_config_schema = ""
-    with open("../config/arch_config_schema.yaml", "r") as file:
-        arch_config_schema = file.read()
+    plano_config_schema = ""
+    with open("../config/plano_config_schema.yaml", "r") as file:
+        plano_config_schema = file.read()

     m_open = mock.mock_open()
     # Provide enough file handles for all open() calls in validate_and_render_schema
     m_open.side_effect = [
         # Removed empty read - was causing validation failures
-        mock.mock_open(read_data=arch_config).return_value,  # ARCH_CONFIG_FILE
+        mock.mock_open(read_data=plano_config).return_value,  # PLANO_CONFIG_FILE
         mock.mock_open(
-            read_data=arch_config_schema
-        ).return_value,  # ARCH_CONFIG_SCHEMA_FILE
-        mock.mock_open(read_data=arch_config).return_value,  # ARCH_CONFIG_FILE
+            read_data=plano_config_schema
+        ).return_value,  # PLANO_CONFIG_SCHEMA_FILE
+        mock.mock_open(read_data=plano_config).return_value,  # PLANO_CONFIG_FILE
         mock.mock_open(
-            read_data=arch_config_schema
-        ).return_value,  # ARCH_CONFIG_SCHEMA_FILE
+            read_data=plano_config_schema
+        ).return_value,  # PLANO_CONFIG_SCHEMA_FILE
         mock.mock_open().return_value,  # ENVOY_CONFIG_FILE_RENDERED (write)
-        mock.mock_open().return_value,  # ARCH_CONFIG_FILE_RENDERED (write)
+        mock.mock_open().return_value,  # PLANO_CONFIG_FILE_RENDERED (write)
     ]
     with mock.patch("builtins.open", m_open):
         with mock.patch("planoai.config_generator.Environment"):
             validate_and_render_schema()


-arch_config_test_cases = [
+plano_config_test_cases = [
     {
         "id": "duplicate_provider_name",
         "expected_error": "Duplicate model_provider name",
-        "arch_config": """
+        "plano_config": """
 version: v0.1.0

 listeners:
@@ -176,7 +176,7 @@ llm_providers:
     {
         "id": "provider_interface_with_model_id",
         "expected_error": "Please provide provider interface as part of model name",
-        "arch_config": """
+        "plano_config": """
 version: v0.1.0

 listeners:
@@ -197,7 +197,7 @@ llm_providers:
     {
         "id": "duplicate_model_id",
         "expected_error": "Duplicate model_id",
-        "arch_config": """
+        "plano_config": """
 version: v0.1.0

 listeners:
@@ -219,7 +219,7 @@ llm_providers:
     {
         "id": "custom_provider_base_url",
         "expected_error": "Must provide base_url and provider_interface",
-        "arch_config": """
+        "plano_config": """
 version: v0.1.0

 listeners:
@@ -237,7 +237,7 @@ llm_providers:
     {
         "id": "base_url_with_path_prefix",
         "expected_error": None,
-        "arch_config": """
+        "plano_config": """
 version: v0.1.0

 listeners:
@@ -258,7 +258,7 @@ llm_providers:
     {
         "id": "duplicate_routeing_preference_name",
         "expected_error": "Duplicate routing preference name",
-        "arch_config": """
+        "plano_config": """
 version: v0.1.0

 listeners:
@@ -295,42 +295,42 @@ tracing:


 @pytest.mark.parametrize(
-    "arch_config_test_case",
-    arch_config_test_cases,
-    ids=[case["id"] for case in arch_config_test_cases],
+    "plano_config_test_case",
+    plano_config_test_cases,
+    ids=[case["id"] for case in plano_config_test_cases],
 )
-def test_validate_and_render_schema_tests(monkeypatch, arch_config_test_case):
-    monkeypatch.setenv("ARCH_CONFIG_FILE", "fake_arch_config.yaml")
-    monkeypatch.setenv("ARCH_CONFIG_SCHEMA_FILE", "fake_arch_config_schema.yaml")
+def test_validate_and_render_schema_tests(monkeypatch, plano_config_test_case):
+    monkeypatch.setenv("PLANO_CONFIG_FILE", "fake_plano_config.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_SCHEMA_FILE", "fake_plano_config_schema.yaml")
     monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
-    monkeypatch.setenv("ARCH_CONFIG_FILE_RENDERED", "fake_arch_config_rendered.yaml")
+    monkeypatch.setenv("PLANO_CONFIG_FILE_RENDERED", "fake_plano_config_rendered.yaml")
     monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
     monkeypatch.setenv("TEMPLATE_ROOT", "../")

-    arch_config = arch_config_test_case["arch_config"]
-    expected_error = arch_config_test_case.get("expected_error")
+    plano_config = plano_config_test_case["plano_config"]
+    expected_error = plano_config_test_case.get("expected_error")

-    arch_config_schema = ""
-    with open("../config/arch_config_schema.yaml", "r") as file:
-        arch_config_schema = file.read()
+    plano_config_schema = ""
+    with open("../config/plano_config_schema.yaml", "r") as file:
+        plano_config_schema = file.read()

     m_open = mock.mock_open()
     # Provide enough file handles for all open() calls in validate_and_render_schema
     m_open.side_effect = [
         mock.mock_open(
-            read_data=arch_config
-        ).return_value,  # validate_prompt_config: ARCH_CONFIG_FILE
+            read_data=plano_config
+        ).return_value,  # validate_prompt_config: PLANO_CONFIG_FILE
         mock.mock_open(
-            read_data=arch_config_schema
-        ).return_value,  # validate_prompt_config: ARCH_CONFIG_SCHEMA_FILE
+            read_data=plano_config_schema
+        ).return_value,  # validate_prompt_config: PLANO_CONFIG_SCHEMA_FILE
         mock.mock_open(
-            read_data=arch_config
-        ).return_value,  # validate_and_render_schema: ARCH_CONFIG_FILE
+            read_data=plano_config
+        ).return_value,  # validate_and_render_schema: PLANO_CONFIG_FILE
         mock.mock_open(
-            read_data=arch_config_schema
-        ).return_value,  # validate_and_render_schema: ARCH_CONFIG_SCHEMA_FILE
+            read_data=plano_config_schema
+        ).return_value,  # validate_and_render_schema: PLANO_CONFIG_SCHEMA_FILE
         mock.mock_open().return_value,  # ENVOY_CONFIG_FILE_RENDERED (write)
-        mock.mock_open().return_value,  # ARCH_CONFIG_FILE_RENDERED (write)
+        mock.mock_open().return_value,  # PLANO_CONFIG_FILE_RENDERED (write)
     ]
     with mock.patch("builtins.open", m_open):
         with mock.patch("planoai.config_generator.Environment"):
cli/uv.lock (generated)
@@ -337,7 +337,7 @@ wheels = [

 [[package]]
 name = "planoai"
-version = "0.4.4"
+version = "0.4.6"
 source = { editable = "." }
 dependencies = [
     { name = "click" },
@@ -8,14 +8,14 @@ services:
       - "12000:12000"
       - "19901:9901"
     volumes:
-      - ${ARCH_CONFIG_FILE:-../demos/samples_python/weather_forecast/arch_config.yaml}:/app/arch_config.yaml
+      - ${PLANO_CONFIG_FILE:-../demos/samples_python/weather_forecast/plano_config.yaml}:/app/plano_config.yaml
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
       - ./envoy.template.yaml:/app/envoy.template.yaml
-      - ./arch_config_schema.yaml:/app/arch_config_schema.yaml
+      - ./plano_config_schema.yaml:/app/plano_config_schema.yaml
       - ../cli/planoai/config_generator.py:/app/planoai/config_generator.py
       - ../crates/target/wasm32-wasip1/release/llm_gateway.wasm:/etc/envoy/proxy-wasm-plugins/llm_gateway.wasm
       - ../crates/target/wasm32-wasip1/release/prompt_gateway.wasm:/etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm
-      - ~/archgw_logs:/var/log/
+      - ~/plano_logs:/var/log/
     extra_hosts:
       - "host.docker.internal:host-gateway"
     environment:
@@ -40,7 +40,7 @@ static_resources:
       - name: envoy.filters.network.http_connection_manager
         typed_config:
           "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
-{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
+{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
           generate_request_id: true
           tracing:
             provider:
@@ -53,7 +53,7 @@ static_resources:
               timeout: 0.250s
               service_name: plano(inbound)
             random_sampling:
-              value: {{ arch_tracing.random_sampling }}
+              value: {{ plano_tracing.random_sampling }}
             operation: "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
{% endif %}
           stat_prefix: plano(inbound)
@@ -114,7 +114,7 @@ static_resources:
               domains:
               - "*"
               routes:
-{% for provider in arch_model_providers %}
+{% for provider in plano_model_providers %}
               # if endpoint is set then use custom cluster for upstream llm
{% if provider.endpoint %}
{% set llm_cluster_name = provider.cluster_name %}
@@ -166,7 +166,7 @@ static_resources:
             configuration:
               "@type": "type.googleapis.com/google.protobuf.StringValue"
               value: |
-                {{ arch_config | indent(32) }}
+                {{ plano_config | indent(32) }}
             vm_config:
               runtime: "envoy.wasm.runtime.v8"
               code:
@@ -183,7 +183,7 @@ static_resources:
             configuration:
               "@type": "type.googleapis.com/google.protobuf.StringValue"
               value: |
-                {{ arch_llm_config | indent(32) }}
+                {{ plano_llm_config | indent(32) }}
             vm_config:
               runtime: "envoy.wasm.runtime.v8"
               code:
@@ -215,7 +215,7 @@ static_resources:
       - name: envoy.filters.network.http_connection_manager
         typed_config:
           "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
-          # {% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
+          # {% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
           # generate_request_id: true
           # tracing:
           #   provider:
@@ -228,7 +228,7 @@ static_resources:
           #       timeout: 0.250s
           #       service_name: tools
           #   random_sampling:
-          #     value: {{ arch_tracing.random_sampling }}
+          #     value: {{ plano_tracing.random_sampling }}
           # {% endif %}
           stat_prefix: outbound_api_traffic
           codec_type: AUTO
@@ -258,7 +258,7 @@ static_resources:
                   auto_host_rewrite: true
                   cluster: bright_staff
                   timeout: 300s
-{% for cluster_name, cluster in arch_clusters.items() %}
+{% for cluster_name, cluster in plano_clusters.items() %}
               - match:
                   prefix: "/"
                   headers:
@@ -290,7 +290,7 @@ static_resources:
       - name: envoy.filters.network.http_connection_manager
         typed_config:
           "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
-{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
+{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
           generate_request_id: true
           tracing:
             provider:
@@ -303,7 +303,7 @@ static_resources:
               timeout: 0.250s
               service_name: plano(inbound)
             random_sampling:
-              value: {{ arch_tracing.random_sampling }}
+              value: {{ plano_tracing.random_sampling }}
             operation: "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
{% endif %}
           stat_prefix: {{ listener.name | replace(" ", "_") }}_traffic
@@ -467,7 +467,7 @@ static_resources:
       - name: envoy.filters.network.http_connection_manager
         typed_config:
           "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
-{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
+{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
           generate_request_id: true
           tracing:
             provider:
@@ -480,7 +480,7 @@ static_resources:
               timeout: 0.250s
               service_name: plano(outbound)
             random_sampling:
-              value: {{ arch_tracing.random_sampling }}
+              value: {{ plano_tracing.random_sampling }}
             operation: "%REQ(:METHOD)% %REQ(:AUTHORITY)%%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
{% endif %}
           stat_prefix: egress_traffic
@@ -501,7 +501,7 @@ static_resources:
               domains:
               - "*"
               routes:
-{% for provider in arch_model_providers %}
+{% for provider in plano_model_providers %}
               # if endpoint is set then use custom cluster for upstream llm
{% if provider.endpoint %}
{% set llm_cluster_name = provider.cluster_name %}
@@ -564,7 +564,7 @@ static_resources:
             configuration:
               "@type": "type.googleapis.com/google.protobuf.StringValue"
               value: |
-                {{ arch_llm_config | indent(32) }}
+                {{ plano_llm_config | indent(32) }}
             vm_config:
               runtime: "envoy.wasm.runtime.v8"
               code:
@@ -879,7 +879,7 @@ static_resources:
                 address: mistral_7b_instruct
                 port_value: 10001
     hostname: "mistral_7b_instruct"
-{% for cluster_name, cluster in arch_clusters.items() %}
+{% for cluster_name, cluster in plano_clusters.items() %}
   - name: {{ cluster_name }}
{% if cluster.connect_timeout -%}
     connect_timeout: {{ cluster.connect_timeout }}
@@ -1013,7 +1013,7 @@ static_resources:
                 port_value: 12001
     hostname: arch_listener_llm

-{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
+{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
   - name: opentelemetry_collector
     type: STRICT_DNS
     dns_lookup_family: V4_ONLY
@@ -1030,7 +1030,7 @@ static_resources:
         - endpoint:
             address:
               socket_address:
-{% set _otel_endpoint = arch_tracing.opentracing_grpc_endpoint | default('host.docker.internal:4317') | replace("http://", "") | replace("https://", "") %}
+{% set _otel_endpoint = plano_tracing.opentracing_grpc_endpoint | default('host.docker.internal:4317') | replace("http://", "") | replace("https://", "") %}
                 address: {{ _otel_endpoint.split(":") | first }}
                 port_value: {{ _otel_endpoint.split(":") | last }}
{% endif %}
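The `_otel_endpoint` template line above applies a default, strips the URL scheme, and splits host from port with Jinja filters. The same transformation in plain Python (the default value comes from the template; the function name is for illustration):

```python
def split_otel_endpoint(endpoint=None):
    """Mirror the Jinja filters: default, strip scheme, split host:port."""
    ep = endpoint or "host.docker.internal:4317"
    ep = ep.replace("http://", "").replace("https://", "")
    host, _, port = ep.partition(":")
    return host, port
```

This is what lets users configure `opentracing_grpc_endpoint` either as a bare `host:port` or as a full `http(s)://` URL.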
@@ -3,9 +3,9 @@ nodaemon=true

 [program:brightstaff]
 command=sh -c "\
-    envsubst < /app/arch_config_rendered.yaml > /app/arch_config_rendered.env_sub.yaml && \
+    envsubst < /app/plano_config_rendered.yaml > /app/plano_config_rendered.env_sub.yaml && \
     RUST_LOG=${LOG_LEVEL:-info} \
-    ARCH_CONFIG_PATH_RENDERED=/app/arch_config_rendered.env_sub.yaml \
+    PLANO_CONFIG_PATH_RENDERED=/app/plano_config_rendered.env_sub.yaml \
     /app/brightstaff 2>&1 | \
     tee /var/log/brightstaff.log | \
     while IFS= read -r line; do echo '[brightstaff]' \"$line\"; done"
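supervisord runs `envsubst` over the rendered config before brightstaff reads it, expanding `${VAR}` references from the environment. A rough Python equivalent using `string.Template` (a sketch of the idea, not the actual tool — note GNU `envsubst` replaces unset variables with an empty string, while `safe_substitute` leaves them untouched):

```python
import os
from string import Template


def envsubst(text: str) -> str:
    """Expand ${VAR} / $VAR from the environment; unknown vars are left as-is."""
    return Template(text).safe_substitute(os.environ)
```

This is why the Rust comment later in the diff says "Environment variables are substituted by envsubst before config is read": brightstaff never sees `${...}` placeholders.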
@@ -7,7 +7,7 @@
 #
 # To test:
 #   docker build -t plano-passthrough-test .
-#   docker run -d -p 10000:10000 -v $(pwd)/config/test_passthrough.yaml:/app/arch_config.yaml plano-passthrough-test
+#   docker run -d -p 10000:10000 -v $(pwd)/config/test_passthrough.yaml:/app/plano_config.yaml plano-passthrough-test
 #
 # curl http://localhost:10000/v1/chat/completions \
 #   -H "Authorization: Bearer sk-your-virtual-key" \
@@ -2,10 +2,10 @@

 failed_files=()

-for file in $(find . -name config.yaml -o -name arch_config_full_reference.yaml); do
+for file in $(find . -name config.yaml -o -name plano_config_full_reference.yaml); do
   echo "Validating ${file}..."
   touch $(pwd)/${file}_rendered
-  if ! docker run --rm -v "$(pwd)/${file}:/app/arch_config.yaml:ro" -v "$(pwd)/${file}_rendered:/app/arch_config_rendered.yaml:rw" --entrypoint /bin/sh katanemo/plano:0.4.6 -c "python -m planoai.config_generator" 2>&1 > /dev/null ; then
+  if ! docker run --rm -v "$(pwd)/${file}:/app/plano_config.yaml:ro" -v "$(pwd)/${file}_rendered:/app/plano_config_rendered.yaml:rw" --entrypoint /bin/sh katanemo/plano:0.4.6 -c "python -m planoai.config_generator" 2>&1 > /dev/null ; then
     echo "Validation failed for $file"
     failed_files+=("$file")
   fi
@@ -210,8 +210,8 @@ async fn llm_chat_inner(
     // Set the model to just the model name (without provider prefix)
     // This ensures upstream receives "gpt-4" not "openai/gpt-4"
     client_request.set_model(model_name_only.clone());
-    if client_request.remove_metadata_key("archgw_preference_config") {
-        debug!("removed archgw_preference_config from metadata");
+    if client_request.remove_metadata_key("plano_preference_config") {
+        debug!("removed plano_preference_config from metadata");
     }

     // === v1/responses state management: Determine upstream API and combine input if needed ===
@@ -78,7 +78,7 @@ pub async fn router_chat_get_upstream_model(
     // Extract usage preferences from metadata
     let usage_preferences_str: Option<String> = routing_metadata.as_ref().and_then(|metadata| {
         metadata
-            .get("archgw_preference_config")
+            .get("plano_preference_config")
             .map(|value| value.to_string())
     });
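The two Rust hunks above are the consumer and producer sides of the same internal key: the router reads `plano_preference_config` from request metadata to pick an upstream model, and the chat path strips it before the request leaves the gateway. The strip step is just a keyed removal; in Python terms (a sketch, not the gateway's API):

```python
def strip_preference_config(metadata: dict) -> bool:
    """Remove the gateway's internal routing key; return True if it was present."""
    return metadata.pop("plano_preference_config", None) is not None
```

Keeping the key internal matters because upstream providers would otherwise receive (and possibly reject or log) gateway-private metadata.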
@@ -52,57 +52,57 @@ fn empty() -> BoxBody<Bytes, hyper::Error> {
 async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
     let bind_address = env::var("BIND_ADDRESS").unwrap_or_else(|_| BIND_ADDRESS.to_string());

-    // loading arch_config.yaml file (before tracing init so we can read tracing config)
-    let arch_config_path = env::var("ARCH_CONFIG_PATH_RENDERED")
-        .unwrap_or_else(|_| "./arch_config_rendered.yaml".to_string());
-    eprintln!("loading arch_config.yaml from {}", arch_config_path);
+    // loading plano_config.yaml file (before tracing init so we can read tracing config)
+    let plano_config_path = env::var("PLANO_CONFIG_PATH_RENDERED")
+        .unwrap_or_else(|_| "./plano_config_rendered.yaml".to_string());
+    eprintln!("loading plano_config.yaml from {}", plano_config_path);

     let config_contents =
-        fs::read_to_string(&arch_config_path).expect("Failed to read arch_config.yaml");
+        fs::read_to_string(&plano_config_path).expect("Failed to read plano_config.yaml");

     let config: Configuration =
-        serde_yaml::from_str(&config_contents).expect("Failed to parse arch_config.yaml");
+        serde_yaml::from_str(&config_contents).expect("Failed to parse plano_config.yaml");

     // Initialize tracing using config.yaml tracing section
     let _tracer_provider = init_tracer(config.tracing.as_ref());
-    info!(path = %arch_config_path, "loaded arch_config.yaml");
+    info!(path = %plano_config_path, "loaded plano_config.yaml");

-    let arch_config = Arc::new(config);
+    let plano_config = Arc::new(config);

     // combine agents and filters into a single list of agents
-    let all_agents: Vec<Agent> = arch_config
+    let all_agents: Vec<Agent> = plano_config
         .agents
         .as_deref()
         .unwrap_or_default()
         .iter()
-        .chain(arch_config.filters.as_deref().unwrap_or_default())
+        .chain(plano_config.filters.as_deref().unwrap_or_default())
         .cloned()
         .collect();

     // Create expanded provider list for /v1/models endpoint
-    let llm_providers = LlmProviders::try_from(arch_config.model_providers.clone())
+    let llm_providers = LlmProviders::try_from(plano_config.model_providers.clone())
         .expect("Failed to create LlmProviders");
     let llm_providers = Arc::new(RwLock::new(llm_providers));
     let combined_agents_filters_list = Arc::new(RwLock::new(Some(all_agents)));
-    let listeners = Arc::new(RwLock::new(arch_config.listeners.clone()));
+    let listeners = Arc::new(RwLock::new(plano_config.listeners.clone()));
     let llm_provider_url =
         env::var("LLM_PROVIDER_ENDPOINT").unwrap_or_else(|_| "http://localhost:12001".to_string());

     let listener = TcpListener::bind(bind_address).await?;
-    let routing_model_name: String = arch_config
+    let routing_model_name: String = plano_config
         .routing
         .as_ref()
         .and_then(|r| r.model.clone())
         .unwrap_or_else(|| DEFAULT_ROUTING_MODEL_NAME.to_string());

-    let routing_llm_provider = arch_config
+    let routing_llm_provider = plano_config
         .routing
         .as_ref()
         .and_then(|r| r.model_provider.clone())
         .unwrap_or_else(|| DEFAULT_ROUTING_LLM_PROVIDER.to_string());

     let router_service: Arc<RouterService> = Arc::new(RouterService::new(
-        arch_config.model_providers.clone(),
+        plano_config.model_providers.clone(),
         format!("{llm_provider_url}{CHAT_COMPLETIONS_PATH}"),
         routing_model_name,
         routing_llm_provider,
@@ -113,19 +113,19 @@ async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
         PLANO_ORCHESTRATOR_MODEL_NAME.to_string(),
     ));

-    let model_aliases = Arc::new(arch_config.model_aliases.clone());
+    let model_aliases = Arc::new(plano_config.model_aliases.clone());

     // Initialize trace collector and start background flusher
-    // Tracing is enabled if the tracing config is present in arch_config.yaml
+    // Tracing is enabled if the tracing config is present in plano_config.yaml
     // Pass Some(true/false) to override, or None to use env var OTEL_TRACING_ENABLED
     // OpenTelemetry automatic instrumentation is configured in utils/tracing.rs

     // Initialize conversation state storage for v1/responses
-    // Configurable via arch_config.yaml state_storage section
+    // Configurable via plano_config.yaml state_storage section
     // If not configured, state management is disabled
     // Environment variables are substituted by envsubst before config is read
     let state_storage: Option<Arc<dyn StateStorage>> =
-        if let Some(storage_config) = &arch_config.state_storage {
+        if let Some(storage_config) = &plano_config.state_storage {
             let storage: Arc<dyn StateStorage> = match storage_config.storage_type {
                 common::configuration::StateStorageType::Memory => {
                     info!(
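brightstaff resolves the rendered config path from `PLANO_CONFIG_PATH_RENDERED` with a local-file default, mirroring `env::var(...).unwrap_or_else(...)` in the Rust hunk above. The same lookup in Python (names taken from the diff):

```python
import os


def rendered_config_path() -> str:
    """Return the rendered config path, preferring the env var over the default."""
    return os.environ.get(
        "PLANO_CONFIG_PATH_RENDERED", "./plano_config_rendered.yaml"
    )
```

The env var is exactly what the supervisord entry earlier in the diff exports, which is how the envsubst'd file and the reader stay in sync.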
@@ -182,7 +182,7 @@ pub mod signals {
 // Operation Names
 // =============================================================================

-/// Canonical operation name components for Arch Gateway
+/// Canonical operation name components for Plano Gateway
 pub mod operation_component {
     /// Inbound request handling
     pub const INBOUND: &str = "plano(inbound)";
@@ -210,7 +210,7 @@ pub mod operation_component {
 ///
 /// Format: `{method} {path} {target}`
 ///
-/// The operation component (e.g., "archgw(llm)") is now part of the service name,
+/// The operation component (e.g., "plano(llm)") is now part of the service name,
 /// so the operation name focuses on the HTTP request details and target.
 ///
 /// # Examples
@@ -218,7 +218,7 @@ pub mod operation_component {
 /// use brightstaff::tracing::OperationNameBuilder;
 ///
 /// // LLM call operation: "POST /v1/chat/completions gpt-4"
-/// // (service name will be "archgw(llm)")
+/// // (service name will be "plano(llm)")
 /// let op = OperationNameBuilder::new()
 ///     .with_method("POST")
 ///     .with_path("/v1/chat/completions")
@@ -226,7 +226,7 @@ pub mod operation_component {
 ///     .build();
 ///
 /// // Agent filter operation: "POST /agents/v1/chat/completions hallucination-detector"
-/// // (service name will be "archgw(agent filter)")
+/// // (service name will be "plano(agent filter)")
 /// let op = OperationNameBuilder::new()
 ///     .with_method("POST")
 ///     .with_path("/agents/v1/chat/completions")
@@ -234,7 +234,7 @@ pub mod operation_component {
 ///     .build();
 ///
 /// // Routing operation: "POST /v1/chat/completions"
-/// // (service name will be "archgw(routing)")
+/// // (service name will be "plano(routing)")
 /// let op = OperationNameBuilder::new()
 ///     .with_method("POST")
 ///     .with_path("/v1/chat/completions")
@@ -493,7 +493,7 @@ mod test {
     #[test]
     fn test_deserialize_configuration() {
         let ref_config = fs::read_to_string(
-            "../../docs/source/resources/includes/arch_config_full_reference_rendered.yaml",
+            "../../docs/source/resources/includes/plano_config_full_reference_rendered.yaml",
         )
         .expect("reference config file not found");

@@ -520,7 +520,7 @@ mod test {
     #[test]
     fn test_tool_conversion() {
         let ref_config = fs::read_to_string(
-            "../../docs/source/resources/includes/arch_config_full_reference_rendered.yaml",
+            "../../docs/source/resources/includes/plano_config_full_reference_rendered.yaml",
         )
         .expect("reference config file not found");
         let config: super::Configuration = serde_yaml::from_str(&ref_config).unwrap();
@@ -990,7 +990,7 @@ impl HttpContext for StreamContext {
                 self.send_server_error(
                     ServerError::BadRequest {
                         why: format!(
-                            "No model specified in request and couldn't determine model name from arch_config. Model name in req: {}, arch_config, provider: {}, model: {:?}",
+                            "No model specified in request and couldn't determine model name from plano_config. Model name in req: {}, plano_config, provider: {}, model: {:?}",
                             model_requested,
                             self.llm_provider().name,
                             self.llm_provider().model
@@ -419,7 +419,7 @@ impl HttpContext for StreamContext {
             );
         }
         let data_serialized = serde_json::to_string(&data).unwrap();
-        info!("archgw <= developer: {}", data_serialized);
+        info!("plano <= developer: {}", data_serialized);
         self.set_http_response_body(0, body_size, data_serialized.as_bytes());
     };
 }
@@ -246,7 +246,7 @@ impl StreamContext {
         let chat_completion_request_json =
             serde_json::to_string(&chat_completion_request).unwrap();
         info!(
-            "archgw => upstream llm request: {}",
+            "plano => upstream llm request: {}",
             chat_completion_request_json
         );
         self.set_http_request_body(
@@ -799,7 +799,7 @@ impl StreamContext {
         };

         let json_resp = serde_json::to_string(&chat_completion_request).unwrap();
-        info!("archgw => (default target) llm request: {}", json_resp);
+        info!("plano => (default target) llm request: {}", json_resp);
         self.set_http_request_body(0, self.request_body_size, json_resp.as_bytes());
         self.resume_http_request();
     }
@@ -18,7 +18,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:
@@ -1 +1 @@
-This demo shows how you can use a publicly hosted rest api and interact it using arch gateway.
+This demo shows how you can use a publicly hosted REST API and interact with it using the Plano gateway.
@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:
@@ -1,6 +1,6 @@
 # Multi-Turn Agentic Demo (RAG)

-This demo showcases how the **Arch** can be used to build accurate multi-turn RAG agent by just writing simple APIs.
+This demo showcases how **Plano** can be used to build an accurate multi-turn RAG agent by just writing simple APIs.

 

@@ -14,7 +14,7 @@ Provides information about various energy sources and considerations.

 # Starting the demo
 1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
-2. Start Arch
+2. Start Plano
    ```sh
    sh run_demo.sh
    ```
@@ -21,4 +21,4 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml

     # Step 4: Start Network Agent
@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping HR Agent using Docker Compose..."
     docker compose down -v

-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
    volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   jaeger:
     build:
@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi
 
-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml
 
     # Step 4: Start developer services

@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping Network Agent using Docker Compose..."
     docker compose down
 
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
@@ -1,11 +1,11 @@
 # Function calling
 
-This demo shows how you can use Arch's core function calling capabilities.
+This demo shows how you can use Plano's core function calling capabilities.
 
 # Starting the demo
 
 1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
-2. Start Arch
+2. Start Plano
 
 3. ```sh
    sh run_demo.sh

@@ -15,14 +15,14 @@ This demo shows how you can use Arch's core function calling capabilities.
 
 # Observability
 
-Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visalize the stats in dashboard. To see grafana dashboard follow instructions below,
+Plano gateway publishes a stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from Plano and we are using grafana to visualize the stats in a dashboard. To see the grafana dashboard follow the instructions below,
 
 1. Start grafana and prometheus using following command
 ```yaml
 docker compose --profile monitoring up
 ```
 2. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
-3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
+3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
 
 Here is a sample interaction,
 <img width="575" alt="image" src="https://github.com/user-attachments/assets/e0929490-3eb2-4130-ae87-a732aea4d059">
@@ -20,7 +20,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   otel-collector:
     build:
@@ -20,7 +20,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   jaeger:
     build:
@@ -20,7 +20,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   otel-collector:
     build:
@@ -23,7 +23,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
 
   prometheus:
     build:
@@ -20,4 +20,4 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
@@ -72,8 +72,8 @@ start_demo() {
         exit 1
     fi
 
-    # Step 4: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 4: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml
 
     # Step 5: Start Network Agent with the chosen Docker Compose file

@@ -91,8 +91,8 @@ stop_demo() {
         docker compose -f "$compose_file" down
     done
 
-    # Stop Arch
-    echo "Stopping Arch..."
+    # Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
demos/shared/chatbot_ui/.vscode/launch.json (vendored)
@@ -15,7 +15,7 @@
 "LLM": "1",
 "CHAT_COMPLETION_ENDPOINT": "http://localhost:10000/v1",
 "STREAMING": "True",
-"ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
+"PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
 }
 },
 {

@@ -29,7 +29,7 @@
 "LLM": "1",
 "CHAT_COMPLETION_ENDPOINT": "http://localhost:12000/v1",
 "STREAMING": "True",
-"ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
+"PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
 }
 },
 ]
@@ -37,7 +37,7 @@ def chat(
 
     try:
         response = client.chat.completions.create(
-            # we select model from arch_config file
+            # we select model from plano_config file
             model="None",
             messages=history,
             temperature=1.0,

@@ -86,7 +86,7 @@ def create_gradio_app(demo_description, client):
 
     with gr.Column(scale=2):
         chatbot = gr.Chatbot(
-            label="Arch Chatbot",
+            label="Plano Chatbot",
             elem_classes="chatbot",
         )
         textbox = gr.Textbox(

@@ -110,7 +110,7 @@ def process_stream_chunk(chunk, history):
     delta = chunk.choices[0].delta
     if delta.role and delta.role != history[-1]["role"]:
         # create new history item if role changes
-        # this is likely due to arch tool call and api response
+        # this is likely due to Plano tool call and api response
         history.append({"role": delta.role})
 
     history[-1]["model"] = chunk.model

@@ -159,7 +159,7 @@ def convert_prompt_target_to_openai_format(target):
 
 def get_prompt_targets():
     try:
-        with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
+        with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
             config = yaml.safe_load(file)
 
     available_tools = []

@@ -181,7 +181,7 @@ def get_prompt_targets():
 
 def get_llm_models():
     try:
-        with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
+        with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
             config = yaml.safe_load(file)
 
     available_models = [""]
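The two loaders above fall back to a local `config.yaml` when the `PLANO_CONFIG` environment variable is unset. A minimal sketch of that lookup (the helper name is hypothetical; only the env-var name and fallback filename come from the diff):

```python
import os

def resolve_config_path() -> str:
    # Falls back to ./config.yaml when PLANO_CONFIG is unset, matching the
    # open(os.getenv("PLANO_CONFIG", "config.yaml")) calls in the chatbot UI.
    return os.getenv("PLANO_CONFIG", "config.yaml")
```

The launch.json entries earlier in this diff set `PLANO_CONFIG` explicitly, which is why the fallback rarely fires during local debugging.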
@@ -787,7 +787,7 @@
 },
 "timepicker": {},
 "timezone": "browser",
-"title": "Arch Gateway Dashboard",
+"title": "Plano Gateway Dashboard",
 "uid": "adt6uhx5lk8aob",
 "version": 1,
 "weekStart": ""
@@ -19,17 +19,17 @@ def get_data_chunks(stream, n=1):
     return chunks
 
 
-def get_arch_messages(response_json):
-    arch_messages = []
+def get_plano_messages(response_json):
+    plano_messages = []
     if response_json and "metadata" in response_json:
-        # load arch_state from metadata
-        arch_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
-        # parse arch_state into json object
-        arch_state = json.loads(arch_state_str)
-        # load messages from arch_state
-        arch_messages_str = arch_state.get("messages", "[]")
+        # load plano_state from metadata
+        plano_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
+        # parse plano_state into json object
+        plano_state = json.loads(plano_state_str)
+        # load messages from plano_state
+        plano_messages_str = plano_state.get("messages", "[]")
         # parse messages into json object
-        arch_messages = json.loads(arch_messages_str)
-        # append messages from arch gateway to history
-        return arch_messages
+        plano_messages = json.loads(plano_messages_str)
+        # append messages from plano gateway to history
+        return plano_messages
     return []
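The helper above unpacks doubly JSON-encoded state: the metadata value is a JSON string, and its `messages` field is itself a JSON-encoded list. A self-contained sketch of the same parsing (the header value here is hypothetical; the tests import the real `ARCH_STATE_HEADER` constant):

```python
import json

# Hypothetical key for illustration; the test suite uses ARCH_STATE_HEADER.
STATE_HEADER = "x-arch-state"

def get_plano_messages(response_json: dict) -> list:
    # State arrives as a JSON string inside response metadata;
    # "messages" inside it is a JSON-encoded list.
    state_str = response_json.get("metadata", {}).get(STATE_HEADER, "{}")
    state = json.loads(state_str)
    return json.loads(state.get("messages", "[]"))
```

Both decoding steps must succeed for messages to flow back into history; missing metadata simply yields an empty list.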
@@ -1,6 +1,6 @@
 import json
 import os
-from common import get_arch_messages
+from common import get_plano_messages
 import pytest
 import requests
 from deepdiff import DeepDiff

@@ -46,10 +46,10 @@ def test_demos(test_data):
     assert choices[0]["message"]["role"] == "assistant"
     assert expected_output_contains.lower() in choices[0]["message"]["content"].lower()
 
-    # now verify arch_messages (tool call and api response) that are sent as response metadata
-    arch_messages = get_arch_messages(response_json)
-    assert len(arch_messages) == 2
-    tool_calls_message = arch_messages[0]
+    # now verify plano_messages (tool call and api response) that are sent as response metadata
+    plano_messages = get_plano_messages(response_json)
+    assert len(plano_messages) == 2
+    tool_calls_message = plano_messages[0]
     tool_calls = tool_calls_message.get("tool_calls", [])
     assert len(tool_calls) > 0
@@ -1,6 +1,6 @@
 # Claude Code Router - Multi-Model Access with Intelligent Routing
 
-Arch Gateway extends Claude Code to access multiple LLM providers through a single interface. Offering two key benefits:
+Plano extends Claude Code to access multiple LLM providers through a single interface, offering two key benefits:
 
 1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
 2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:

@@ -21,15 +21,15 @@ Uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to
 
 ## How It Works
 
-Arch Gateway sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
+Plano sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
 
 ```
-Your Request → Arch Gateway → Suitable Model → Response
+Your Request → Plano → Suitable Model → Response
                   ↓
     [Task Analysis & Model Selection]
 ```
 
-**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.archgw.com/concepts/llm_providers/supported_providers.html).
+**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.planoai.dev/concepts/llm_providers/supported_providers.html).
 
 
 ## Quick Start (5 minutes)

@@ -61,7 +61,7 @@ export ANTHROPIC_API_KEY="your-anthropic-key-here"
 # Add other providers as needed
 ```
 
-### Step 3: Start Arch Gateway
+### Step 3: Start Plano
 ```bash
 # Install using uv (recommended)
 uv tool install planoai

@@ -122,7 +122,7 @@ planoai cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-co
 ### Environment Variables
 The system automatically configures these variables for Claude Code:
 ```bash
-ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
+ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Plano
 ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
 ```

@@ -147,6 +147,6 @@ llm_providers:
 
 ## Technical Details
 
-**How routing works:** Arch intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
+**How routing works:** Plano intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
 
 **Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
 
-**Documentation:** [docs.archgw.com](https://docs.archgw.com) for advanced configuration and API details.
+**Documentation:** [docs.planoai.dev](https://docs.planoai.dev) for advanced configuration and API details.
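The environment-variable hunk above is the whole integration surface: Claude Code just needs its base URL pointed at the local gateway. A sketch of the launch environment as a Python dict (the helper name is hypothetical; the variable names and values are the ones shown in the diff):

```python
def claude_code_env(base_url: str = "http://127.0.0.1:12000") -> dict:
    # Environment the CLI hands to Claude Code so traffic flows
    # through the local gateway instead of api.anthropic.com.
    return {
        "ANTHROPIC_BASE_URL": base_url,  # routes requests through Plano
        "ANTHROPIC_SMALL_FAST_MODEL": "arch.claude.code.small.fast",  # model alias
    }
```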
@@ -1,5 +1,5 @@
 #!/usr/bin/env bash
-# Pretty-print ArchGW MODEL_RESOLUTION lines from docker logs
+# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
 # - hides Arch-Router
 # - prints timestamp
 # - colors MODEL_RESOLUTION red

@@ -7,7 +7,7 @@
 # - colors resolved_model magenta
 # - removes provider and streaming
 
-docker logs -f archgw 2>&1 \
+docker logs -f plano 2>&1 \
   | awk '
     /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
       # extract timestamp between first [ and ]
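The awk program above keeps only `MODEL_RESOLUTION:` lines and drops the ones emitted by the Arch-Router model itself. The same predicate, sketched in Python for readers less familiar with awk:

```python
def keep_line(line: str) -> bool:
    # Mirrors /MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ from the awk filter.
    return "MODEL_RESOLUTION:" in line and "Arch-Router" not in line
```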
@@ -19,10 +19,10 @@ services:
       - "12000:12000"
       - "8001:8001"
     environment:
-      - ARCH_CONFIG_PATH=/config/config.yaml
+      - PLANO_CONFIG_PATH=/config/config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
   jaeger:
     build:
@@ -22,14 +22,14 @@ logger = logging.getLogger(__name__)
 ### add new setup
 app = FastAPI(title="RAG Agent Context Builder", version="1.0.0")
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RAG_MODEL = "gpt-4o-mini"
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 # Global variable to store the knowledge base

@@ -95,15 +95,15 @@ async def find_relevant_passages(
 If no passages are relevant, return "NONE"."""
 
     try:
-        # Call archgw to select relevant passages
-        logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
+        # Call Plano to select relevant passages
+        logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
 
         # Prepare extra headers if traceparent is provided
         extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
         if traceparent:
             extra_headers["traceparent"] = traceparent
 
-        response = await archgw_client.chat.completions.create(
+        response = await plano_client.chat.completions.create(
             model=RAG_MODEL,
             messages=[{"role": "system", "content": system_prompt}],
             temperature=0.1,
@@ -22,14 +22,14 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 GUARD_MODEL = "gpt-4o-mini"
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 app = FastAPI(title="RAG Agent Input Guards", version="1.0.0")

@@ -93,13 +93,13 @@ Respond in JSON format:
 ]
 
     try:
-        # Call archgw using OpenAI client
+        # Call Plano using OpenAI client
         extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header
 
         logger.info(f"Validating query scope: '{last_user_message}'")
-        response = await archgw_client.chat.completions.create(
+        response = await plano_client.chat.completions.create(
             model=GUARD_MODEL,
             messages=guard_messages,
             temperature=0.1,
@@ -20,20 +20,20 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 QUERY_REWRITE_MODEL = "gpt-4o-mini"
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 app = FastAPI(title="RAG Agent Query Rewriter", version="1.0.0")
 
 
-async def rewrite_query_with_archgw(
+async def rewrite_query_with_plano(
     messages: List[ChatMessage],
     traceparent_header: Optional[str] = None,
     request_id: Optional[str] = None,

@@ -59,8 +59,8 @@ Return only the rewritten query, nothing else."""
         extra_headers["traceparent"] = traceparent_header
 
     try:
-        logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
-        resp = await archgw_client.chat.completions.create(
+        logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
+        resp = await plano_client.chat.completions.create(
             model=QUERY_REWRITE_MODEL,
             messages=rewrite_messages,
             temperature=0.3,

@@ -96,7 +96,7 @@ async def query_rewriter_http(
     else:
         logger.info("No traceparent header found")
 
-    rewritten_query = await rewrite_query_with_archgw(
+    rewritten_query = await rewrite_query_with_plano(
         messages, traceparent_header, request_id
     )
     # Create updated messages with the rewritten query
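Each demo service above assembles the same `extra_headers` before calling the gateway: a fixed Envoy retry budget plus optional trace and request-id propagation. A small sketch of that plumbing as a standalone helper (the function name is hypothetical; the header names and retry value come from the diff):

```python
from typing import Optional

def build_extra_headers(traceparent: Optional[str] = None,
                        request_id: Optional[str] = None) -> dict:
    # Always request up to three Envoy-level retries; forward tracing
    # and request-correlation headers only when the caller provides them.
    headers = {"x-envoy-max-retries": "3"}
    if traceparent:
        headers["traceparent"] = traceparent
    if request_id:
        headers["x-request-id"] = request_id
    return headers
```

Centralizing this would avoid the slight drift visible between services (some always send `x-request-id`, others only when it is set).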
@@ -22,7 +22,7 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RESPONSE_MODEL = "gpt-4o"
 

@@ -38,10 +38,10 @@ Your response should:
 
 Generate a complete response to assist the user."""
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 # FastAPI app for REST server

@@ -95,9 +95,9 @@ async def stream_chat_completions(
     response_messages = prepare_response_messages(request_body)
 
     try:
-        # Call archgw using OpenAI client for streaming
+        # Call Plano using OpenAI client for streaming
         logger.info(
-            f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
+            f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
         )
 
         # Prepare extra headers if traceparent is provided

@@ -105,7 +105,7 @@ async def stream_chat_completions(
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header
 
-        response_stream = await archgw_client.chat.completions.create(
+        response_stream = await plano_client.chat.completions.create(
             model=RESPONSE_MODEL,
             messages=response_messages,
             temperature=request_body.temperature or 0.7,
@@ -1,15 +1,15 @@
 # LLM Routing
-This demo shows how you can arch gateway to manage keys and route to upstream LLM.
+This demo shows how you can use the Plano gateway to manage keys and route to upstream LLMs.
 
 # Starting the demo
 1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
-1. Start Arch
+1. Start Plano
 ```sh
 sh run_demo.sh
 ```
 1. Navigate to http://localhost:18080/
 
-Following screen shows an example of interaction with arch gateway showing dynamic routing. You can select between different LLMs using "override model" option in the chat UI.
+Following screen shows an example of interaction with Plano gateway showing dynamic routing. You can select between different LLMs using the "override model" option in the chat UI.
 
 

@@ -47,12 +47,12 @@ $ curl --header 'Content-Type: application/json' \
 ```
 
 # Observability
-Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visualize the stats in dashboard. To see grafana dashboard follow instructions below,
+Plano gateway publishes a stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from Plano and we are using grafana to visualize the stats in a dashboard. To see the grafana dashboard follow the instructions below,
 
 1. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
-1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
+1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
 1. For tracing you can head over to http://localhost:16686/ to view recent traces.
 
-Following is a screenshot of tracing UI showing call received by arch gateway and making upstream call to LLM,
+Following is a screenshot of the tracing UI showing a call received by the Plano gateway and the upstream call it makes to the LLM,
 
 
@@ -8,11 +8,11 @@ services:
       - "12000:12000"
       - "12001:12001"
     environment:
-      - ARCH_CONFIG_PATH=/app/arch_config.yaml
+      - PLANO_CONFIG_PATH=/app/plano_config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
       - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
     volumes:
-      - ./config.yaml:/app/arch_config.yaml:ro
+      - ./config.yaml:/app/plano_config.yaml:ro
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
 
   anythingllm:
@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi
 
-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml
 
     # Step 4: Start LLM Routing

@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping LLM Routing using Docker Compose..."
     docker compose down
 
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }
@@ -21,10 +21,10 @@ services:
       - "12000:12000"
       - "8001:8001"
     environment:
-      - ARCH_CONFIG_PATH=/config/config.yaml
+      - PLANO_CONFIG_PATH=/config/config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
   jaeger:
     build:
@@ -18,14 +18,14 @@ logging.basicConfig(
 logger = logging.getLogger(__name__)
 
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RAG_MODEL = "gpt-4o-mini"
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 # Global variable to store the knowledge base

@@ -91,8 +91,8 @@ async def find_relevant_passages(
 If no passages are relevant, return "NONE"."""
 
     try:
-        # Call archgw to select relevant passages
-        logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
+        # Call Plano to select relevant passages
+        logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
 
         # Prepare extra headers if traceparent is provided
         extra_headers = {

@@ -103,7 +103,7 @@ async def find_relevant_passages(
         if traceparent:
             extra_headers["traceparent"] = traceparent
 
-        response = await archgw_client.chat.completions.create(
+        response = await plano_client.chat.completions.create(
             model=RAG_MODEL,
             messages=[{"role": "system", "content": system_prompt}],
             temperature=0.1,
@@ -20,14 +20,14 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 GUARD_MODEL = "gpt-4o-mini"
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 app = FastAPI()

@@ -91,7 +91,7 @@ Respond in JSON format:
 ]
 
     try:
-        # Call archgw using OpenAI client
+        # Call Plano using OpenAI client
         extra_headers = {"x-envoy-max-retries": "3"}
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header

@@ -100,7 +100,7 @@ Respond in JSON format:
             extra_headers["x-request-id"] = request_id
 
         logger.info(f"Validating query scope: '{last_user_message}'")
-        response = await archgw_client.chat.completions.create(
+        response = await plano_client.chat.completions.create(
             model=GUARD_MODEL,
             messages=guard_messages,
             temperature=0.1,
@@ -19,20 +19,20 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
 
-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 QUERY_REWRITE_MODEL = "gpt-4o-mini"
 
-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )
 
 app = FastAPI()
 
 
-async def rewrite_query_with_archgw(
+async def rewrite_query_with_plano(
     messages: List[ChatMessage],
     traceparent_header: str,
     request_id: Optional[str] = None,

@@ -57,14 +57,14 @@ async def rewrite_query_with_archgw(
         rewrite_messages.append({"role": msg.role, "content": msg.content})
 
     try:
-        # Call archgw using OpenAI client
+        # Call Plano using OpenAI client
         extra_headers = {"x-envoy-max-retries": "3"}
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header
         if request_id:
             extra_headers["x-request-id"] = request_id
-        logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
-        response = await archgw_client.chat.completions.create(
+        logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
+        response = await plano_client.chat.completions.create(
             model=QUERY_REWRITE_MODEL,
             messages=rewrite_messages,
             temperature=0.3,

@@ -88,7 +88,7 @@ async def rewrite_query_with_archgw(
 
 
 async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
-    """Chat completions endpoint that rewrites the last user query using archgw.
+    """Chat completions endpoint that rewrites the last user query using Plano.
 
     Returns a dict with a 'messages' key containing the updated message list.
     """

@@ -104,8 +104,8 @@ async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
     else:
         logger.info("No traceparent header found")
 
-    # Call archgw to rewrite the last user query
-    rewritten_query = await rewrite_query_with_archgw(
+    # Call Plano to rewrite the last user query
+    rewritten_query = await rewrite_query_with_plano(
         messages, traceparent_header, request_id
     )
 
@@ -22,7 +22,7 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)

-# Configuration for archgw LLM gateway
+# Configuration for Plano LLM gateway
 LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
 RESPONSE_MODEL = "gpt-4o"

@@ -38,10 +38,10 @@ Your response should:

 Generate a complete response to assist the user."""

-# Initialize OpenAI client for archgw
-archgw_client = AsyncOpenAI(
+# Initialize OpenAI client for Plano
+plano_client = AsyncOpenAI(
     base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # archgw doesn't require a real API key
+    api_key="EMPTY",  # Plano doesn't require a real API key
 )

 # FastAPI app for REST server

@@ -94,9 +94,9 @@ async def stream_chat_completions(
     response_messages = prepare_response_messages(request_body)

     try:
-        # Call archgw using OpenAI client for streaming
+        # Call Plano using OpenAI client for streaming
         logger.info(
-            f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
+            f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
         )

         logger.info(f"rag_agent - request_id: {request_id}")

@@ -107,7 +107,7 @@ async def stream_chat_completions(
         if traceparent_header:
             extra_headers["traceparent"] = traceparent_header

-        response_stream = await archgw_client.chat.completions.create(
+        response_stream = await plano_client.chat.completions.create(
            model=RESPONSE_MODEL,
            messages=response_messages,
            temperature=request_body.temperature or 0.7,

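The server hunks above forward the W3C `traceparent` header to the gateway only when the incoming request carried one, so a trace started upstream stays stitched together across hops. A minimal sketch of that pattern (the `build_extra_headers` helper is illustrative, not a function in the repo):

```python
from typing import Optional


def build_extra_headers(traceparent_header: Optional[str]) -> dict:
    """Propagate the W3C traceparent header only when one was received."""
    extra_headers = {}
    if traceparent_header:
        extra_headers["traceparent"] = traceparent_header
    return extra_headers


# A request that arrived with a trace context keeps it; one without sends none.
print(build_extra_headers("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
print(build_extra_headers(None))  # → {}
```

The same dict is then passed as `extra_headers` to the OpenAI client call, as the diff shows.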
@@ -1,6 +1,6 @@
 # Model Alias Demo Suite

-This directory contains demos for the model alias feature in archgw.
+This directory contains demos for the model alias feature in Plano.

 ## Overview

@@ -48,7 +48,7 @@ model_aliases:
 ```

 ## Prerequisites
-- Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
+- Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
 - Set your API keys in your environment:
   - `export OPENAI_API_KEY=your-openai-key`
   - `export ANTHROPIC_API_KEY=your-anthropic-key` (optional, but recommended for Anthropic tests)

@@ -60,13 +60,13 @@ model_aliases:
    sh run_demo.sh
    ```
   - This will create a `.env` file with your API keys (if not present).
-  - Starts Arch Gateway with model alias config (`arch_config_with_aliases.yaml`).
+  - Starts Plano gateway with model alias config (`arch_config_with_aliases.yaml`).

 2. To stop the demo:
    ```sh
    sh run_demo.sh down
    ```
-  - This will stop Arch Gateway and any related services.
+  - This will stop Plano gateway and any related services.

 ## Example Requests

@@ -145,4 +145,4 @@ curl -sS -X POST "http://localhost:12000/v1/messages" \
 ## Troubleshooting
 - Ensure your API keys are set in your environment before running the demo.
 - If you see errors about missing keys, set them and re-run the script.
-- For more details, see the main Arch documentation.
+- For more details, see the main Plano documentation.

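The model-alias feature this README describes maps friendly alias names to concrete provider models so callers never hard-code a vendor model. A toy sketch of the lookup (the alias table below is illustrative; the demo's real mapping lives in `arch_config_with_aliases.yaml`, not in code):

```python
# Illustrative alias table; Plano resolves aliases from its YAML config,
# this dict only mimics the lookup for demonstration.
MODEL_ALIASES = {
    "fast-model": "gpt-4o-mini",
    "arch.summarize.v1": "gpt-4o-mini",
    "arch.reason.v1": "o3-mini",
}


def resolve_model(requested: str) -> str:
    """Return the provider model behind an alias; pass unknown names through."""
    return MODEL_ALIASES.get(requested, requested)


print(resolve_model("fast-model"))  # → gpt-4o-mini
print(resolve_model("gpt-4o"))      # non-alias names pass through unchanged
```

Because the alias indirection lives in the gateway config, swapping the underlying model needs no client code change.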
@@ -24,11 +24,11 @@ start_demo() {
         echo ".env file created with API keys."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with arch_config_with_aliases.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with arch_config_with_aliases.yaml..."
     planoai up arch_config_with_aliases.yaml

-    echo "\n\nArch started successfully."
+    echo "\n\nPlano started successfully."
     echo "Please run the following CURL command to test model alias routing. Additional instructions are in the README.md file. \n"
     echo "curl -sS -X POST \"http://localhost:12000/v1/chat/completions\" \
     -H \"Authorization: Bearer test-key\" \

@@ -46,8 +46,8 @@ start_demo() {

 # Function to stop the demo
 stop_demo() {
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }

@@ -1,6 +1,6 @@
 # Model Choice Newsletter Demo

-This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Arch Gateway (`plano`). It includes both a minimal test harness and a sample proxy configuration.
+This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Plano (`plano`). It includes both a minimal test harness and a sample proxy configuration.

 ---

@@ -85,13 +85,13 @@ See `config.yaml` for a sample configuration mapping aliases to provider models.
    ```

 2. **Install dependencies:**
-   - Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
+   - Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
    - Then run
      ```sh
      uv sync
      ```

-3. **Start Arch Gateway**
+3. **Start Plano**
    ```sh
    run_demo.sh
    ```

@@ -3,7 +3,7 @@ import json, time, yaml, statistics as stats
 from pydantic import BaseModel, ValidationError
 from openai import OpenAI

-# archgw endpoint (keys are handled by archgw)
+# Plano endpoint (keys are handled by Plano)
 client = OpenAI(base_url="http://localhost:12000/v1", api_key="n/a")
 MODELS = ["arch.summarize.v1", "arch.reason.v1"]
 FIXTURES = "evals_summarize.yaml"

@@ -1,7 +1,7 @@
 [project]
 name = "model-choice-newsletter-code-snippets"
 version = "0.1.0"
-description = "Benchmarking model alias routing with Arch Gateway."
+description = "Benchmarking model alias routing with Plano."
 authors = [{name = "Your Name", email = "your@email.com"}]
 license = {text = "Apache 2.0"}
 readme = "README.md"

@@ -17,18 +17,18 @@ start_demo() {
         echo ".env file created with API keys."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with arch_config_with_aliases.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with arch_config_with_aliases.yaml..."
     planoai up arch_config_with_aliases.yaml

-    echo "\n\nArch started successfully."
+    echo "\n\nPlano started successfully."
     echo "Please run the following command to test the setup: python bench.py\n"
 }

 # Function to stop the demo
 stop_demo() {
-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }

@@ -8,12 +8,12 @@ services:
       - "8001:8001"
       - "12000:12000"
     environment:
-      - ARCH_CONFIG_PATH=/app/arch_config.yaml
+      - PLANO_CONFIG_PATH=/app/plano_config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
       - OTEL_TRACING_GRPC_ENDPOINT=http://jaeger:4317
       - LOG_LEVEL=${LOG_LEVEL:-info}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml:ro
+      - ./config.yaml:/app/plano_config.yaml:ro
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem

   crewai-flight-agent:

@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:

@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   otel-collector:
     build:

@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml

     # Step 4: Start developer services

@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping Network Agent using Docker Compose..."
     docker compose down

-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }

@@ -17,7 +17,7 @@ Make sure your machine is up to date with [latest version of plano]([url](https:
 # Or if installed with uv: uvx planoai up --service plano --foreground
 2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.6
 2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/config.yaml
-2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: plano, tag: katanemo/plano:0.4.6
+2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.6
 2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
 2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
 2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!

@@ -8,14 +8,14 @@ services:
       - "12000:12000"
       - "12001:12001"
     environment:
-      - ARCH_CONFIG_PATH=/app/arch_config.yaml
+      - PLANO_CONFIG_PATH=/app/plano_config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
       - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY environment variable is required but not set}
      - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
      - OTEL_TRACING_ENABLED=true
      - RUST_LOG=debug
     volumes:
-      - ./config.yaml:/app/arch_config.yaml:ro
+      - ./config.yaml:/app/plano_config.yaml:ro
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem

   anythingllm:

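The compose hunks in this PR consistently switch `ARCH_CONFIG_PATH` to `PLANO_CONFIG_PATH` and remount the config at `/app/plano_config.yaml`. A small sketch of a lookup that prefers the new variable but still honors the legacy one during a rolling migration (a hypothetical helper, not code from this PR):

```python
import os
from typing import Optional


def config_path(env: Optional[dict] = None) -> str:
    """Prefer PLANO_CONFIG_PATH; fall back to the legacy ARCH_CONFIG_PATH,
    then to the in-container default path used by the compose files."""
    env = os.environ if env is None else env
    return (
        env.get("PLANO_CONFIG_PATH")
        or env.get("ARCH_CONFIG_PATH")
        or "/app/plano_config.yaml"
    )


print(config_path({"PLANO_CONFIG_PATH": "/app/plano_config.yaml"}))
print(config_path({"ARCH_CONFIG_PATH": "/app/arch_config.yaml"}))  # legacy still works
print(config_path({}))  # → /app/plano_config.yaml
```

Note the PR itself does a clean cut-over rather than this fallback; the sketch only shows one way deployments could bridge the rename.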
@@ -1,6 +1,6 @@
 # Use Case Demo: Bearer Authorization with Spotify APIs

-In this demo, we show how you can use Arch's bearer authorization capability to connect your agentic apps to third-party APIs.
+In this demo, we show how you can use Plano's bearer authorization capability to connect your agentic apps to third-party APIs.
 More specifically, we demonstrate how you can connect to two Spotify APIs:

 - [`/v1/browse/new-releases`](https://developer.spotify.com/documentation/web-api/reference/get-new-releases)

@@ -23,7 +23,7 @@ Where users can engage by asking questions like _"Show me the latest releases in
    SPOTIFY_CLIENT_KEY=your_spotify_api_token
    ```

-3. Start Arch
+3. Start Plano
    ```sh
    sh run_demo.sh
    ```

@@ -10,7 +10,7 @@ services:
     extra_hosts:
       - "host.docker.internal:host-gateway"
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml

   jaeger:
     build:

@@ -18,8 +18,8 @@ start_demo() {
         echo ".env file created with OPENAI_API_KEY."
     fi

-    # Step 3: Start Arch
-    echo "Starting Arch with config.yaml..."
+    # Step 3: Start Plano
+    echo "Starting Plano with config.yaml..."
     planoai up config.yaml

     # Step 4: Start developer services

@@ -33,8 +33,8 @@ stop_demo() {
     echo "Stopping Network Agent using Docker Compose..."
     docker compose down

-    # Step 2: Stop Arch
-    echo "Stopping Arch..."
+    # Step 2: Stop Plano
+    echo "Stopping Plano..."
     planoai down
 }

@@ -8,10 +8,10 @@ services:
       - "12000:12000"
       - "8001:8001"
     environment:
-      - ARCH_CONFIG_PATH=/config/config.yaml
+      - PLANO_CONFIG_PATH=/config/config.yaml
       - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
     volumes:
-      - ./config.yaml:/app/arch_config.yaml
+      - ./config.yaml:/app/plano_config.yaml
       - /etc/ssl/cert.pem:/etc/ssl/cert.pem
   weather-agent:
     build:

@@ -19,7 +19,7 @@ def _on_build_finished(app: Sphinx, exception: Exception | None) -> None:
         return

     # Source path: provider_models.yaml is copied into the Docker image at /docs/provider_models.yaml
-    # This follows the pattern used for config templates like envoy.template.yaml and arch_config_schema.yaml
+    # This follows the pattern used for config templates like envoy.template.yaml and plano_config_schema.yaml
     docs_root = Path(app.srcdir).parent  # Goes from source/ to docs/
     source_path = docs_root / "provider_models.yaml"

@@ -48,7 +48,7 @@ prompt_targets:
           description: Time range in days for which to gather device statistics. Defaults to 7.
           default: 7

-# Arch creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
+# Plano creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
 endpoints:
   app_server:
     # value could be ip address or a hostname with port

@@ -18,7 +18,7 @@ prompt_targets:
     endpoint:
       name: app_server
       path: /agent/summary
-    # Arch uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
+    # Plano uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
     auto_llm_dispatch_on_response: true
     # override system prompt for this prompt target
     system_prompt: You are a helpful information extraction assistant. Use the information that is provided to you.

@@ -39,7 +39,7 @@ prompt_targets:
           default: false
           enum: [true, false]

-# Arch creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
+# Plano creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
 endpoints:
   app_server:
     # value could be ip address or a hostname with port

@@ -54,7 +54,7 @@ The OpenAI SDK works with any provider through Plano's OpenAI-compatible endpoin
         base_url="http://127.0.0.1:12000/v1"
     )

-    # Use any model configured in your arch_config.yaml
+    # Use any model configured in your plano_config.yaml
     completion = client.chat.completions.create(
         model="gpt-4o-mini",  # Or use :ref:`model aliases <model_aliases>` like "fast-model"
         max_tokens=50,

@@ -231,7 +231,7 @@ The Anthropic SDK works with any provider through Plano's Anthropic-compatible e
         base_url="http://127.0.0.1:12000"
     )

-    # Use any model configured in your arch_config.yaml
+    # Use any model configured in your plano_config.yaml
     message = client.messages.create(
         model="claude-3-5-sonnet-20241022",
         max_tokens=50,

@@ -114,7 +114,7 @@ Example 1: Adjusting Retrieval

     User: What are the benefits of renewable energy?
     **[Plano]**: Check if there is an available <prompt_target> that can handle this user query.
-    **[Plano]**: Found "get_info_for_energy_source" prompt_target in arch_config.yaml. Forward prompt to the endpoint configured in "get_info_for_energy_source"
+    **[Plano]**: Found "get_info_for_energy_source" prompt_target in plano_config.yaml. Forward prompt to the endpoint configured in "get_info_for_energy_source"
     ...
     Assistant: Renewable energy reduces greenhouse gas emissions, lowers air pollution, and provides sustainable power sources like solar and wind.

@@ -130,13 +130,13 @@ Example 2: Switching Intent

     User: What are the symptoms of diabetes?
     **[Plano]**: Check if there is an available <prompt_target> that can handle this user query.
-    **[Plano]**: Found "diseases_symptoms" prompt_target in arch_config.yaml. Forward disease=diabeteres to "diseases_symptoms" prompt target
+    **[Plano]**: Found "diseases_symptoms" prompt_target in plano_config.yaml. Forward disease=diabeteres to "diseases_symptoms" prompt target
     ...
     Assistant: Common symptoms include frequent urination, excessive thirst, fatigue, and blurry vision.

     User: How is it diagnosed?
     **[Plano]**: New intent detected.
-    **[Plano]**: Found "disease_diagnoses" prompt_target in arch_config.yaml. Forward disease=diabeteres to "disease_diagnoses" prompt target
+    **[Plano]**: Found "disease_diagnoses" prompt_target in plano_config.yaml. Forward disease=diabeteres to "disease_diagnoses" prompt target
     ...
     Assistant: Diabetes is diagnosed through blood tests like fasting blood sugar, A1C, or an oral glucose tolerance test.

@@ -172,7 +172,7 @@ Once the prompt targets are configured as above, handle parameters across multi-
 Demo App
 --------

-For your convenience, we've built a `demo app <https://github.com/katanemo/archgw/tree/main/demos/samples_python/multi_turn_rag_agent>`_
+For your convenience, we've built a `demo app <https://github.com/katanemo/plano/tree/main/demos/samples_python/multi_turn_rag_agent>`_
 that you can test and modify locally for multi-turn RAG scenarios.

 .. figure:: ../build_with_plano/includes/multi_turn/mutli-turn-example.png

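The multi-turn examples in these docs show Plano matching a query's intent to a named `prompt_target` and forwarding extracted parameters (such as `disease=...`) to it. A toy sketch of that "match intent, then dispatch parameters to a named target" shape (target names echo the docs; the dispatch table itself is hypothetical, since Plano's real matching is model-driven, not a Python dict):

```python
# Toy dispatch table mirroring the prompt_targets named in the doc examples.
# In Plano the intent-to-target match is done by the gateway's models against
# the YAML config; this dict only illustrates the downstream dispatch shape.
PROMPT_TARGETS = {
    "diseases_symptoms": lambda disease: f"symptoms lookup for {disease}",
    "disease_diagnoses": lambda disease: f"diagnosis lookup for {disease}",
}


def route(target_name: str, **params) -> str:
    """Forward extracted parameters to the handler behind a prompt target."""
    handler = PROMPT_TARGETS[target_name]
    return handler(**params)


print(route("diseases_symptoms", disease="diabetes"))
print(route("disease_diagnoses", disease="diabetes"))
```

The follow-up turn ("How is it diagnosed?") reuses the previously extracted `disease` parameter, which is what makes the multi-turn flow feel stateful to the user.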
@@ -24,7 +24,7 @@ prompt_targets:
       name: app_server
       path: /agent/summary
       http_method: POST
-    # Arch uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
+    # Plano uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
     auto_llm_dispatch_on_response: true
     # override system prompt for this prompt target
     system_prompt: You are a helpful information extraction assistant. Use the information that is provided to you.

@@ -46,7 +46,7 @@ prompt_targets:
           default: false
           enum: [true, false]

-# Arch creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
+# Plano creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
 endpoints:
   app_server:
     # value could be ip address or a hostname with port

@@ -1,4 +1,4 @@
-.. _arch_access_logging:
+.. _plano_access_logging:

 Access Logging
 ==============

@@ -1,4 +1,4 @@
-.. _arch_overview_tracing:
+.. _plano_overview_tracing:

 Tracing
 =======

@@ -57,7 +57,7 @@ Configuration Overview

 State storage is configured in the ``state_storage`` section of your ``plano_config.yaml``:

-.. literalinclude:: ../resources/includes/arch_config_state_storage_example.yaml
+.. literalinclude:: ../resources/includes/plano_config_state_storage_example.yaml
    :language: yaml
    :lines: 21-30
    :linenos:

@@ -4,10 +4,10 @@ Configuration Reference
 =======================

 The following is a complete reference of the ``plano_config.yml`` that controls the behavior of a single instance of
-the Arch gateway. This where you enable capabilities like routing to upstream LLm providers, defining prompt_targets
+the Plano gateway. This where you enable capabilities like routing to upstream LLm providers, defining prompt_targets
 where prompts get routed to, apply guardrails, and enable critical agent observability features.

-.. literalinclude:: includes/arch_config_full_reference.yaml
+.. literalinclude:: includes/plano_config_full_reference.yaml
    :language: yaml
    :linenos:
-   :caption: :download:`Plano Configuration - Full Reference <includes/arch_config_full_reference.yaml>`
+   :caption: :download:`Plano Configuration - Full Reference <includes/plano_config_full_reference.yaml>`

@@ -30,7 +30,7 @@ listeners:
   - type: agent
     name: agent_1
     port: 8001
-    router: arch_agent_router
+    router: plano_agent_router
     agents:
       - id: rag_agent
         description: virtual assistant for retrieval augmented generation tasks

@@ -1,4 +1,4 @@
-# Arch Gateway configuration version
+# Plano Gateway configuration version
 version: v0.3.0

 # External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)

@@ -41,7 +41,7 @@ The request processing path in Plano has three main parts:

 These two subsystems are bridged with either the HTTP router filter, and the cluster manager subsystems of Envoy.

-Also, Plano utilizes `Envoy event-based thread model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <arch_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.
+Also, Plano utilizes `Envoy event-based thread model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <plano_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.

 Worker threads rarely share state and operate in a trivially parallel fashion. This threading model
 enables scaling to very high core count CPUs.

@@ -130,8 +130,8 @@ Once a request completes, the stream is destroyed. The following also takes plac
 * The post-request :ref:`monitoring <monitoring>` are updated (e.g. timing, active requests, upgrades, health checks).
   Some statistics are updated earlier however, during request processing. Stats are batched and written by the main
   thread periodically.
-* :ref:`Access logs <arch_access_logging>` are written to the access log
-* :ref:`Trace <arch_overview_tracing>` spans are finalized. If our example request was traced, a
+* :ref:`Access logs <plano_access_logging>` are written to the access log
+* :ref:`Trace <plano_overview_tracing>` spans are finalized. If our example request was traced, a
   trace span, describing the duration and details of the request would be created by the HCM when
   processing request headers and then finalized by the HCM during post-request processing.

@@ -1,4 +1,4 @@
-.. _arch_overview_threading:
+.. _plano_overview_threading:

 Threading Model
 ===============

package-lock.json (generated, +4 −4)

@@ -1,11 +1,11 @@
 {
-  "name": "archgw-monorepo",
+  "name": "plano-monorepo",
   "version": "0.1.0",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
-      "name": "archgw-monorepo",
+      "name": "plano-monorepo",
       "version": "0.1.0",
       "workspaces": [
         "apps/*",
Some files were not shown because too many files have changed in this diff.