mirror of https://github.com/katanemo/plano.git, synced 2026-04-25 00:36:34 +02:00
Rename all arch references to plano (#745)
* Rename all arch references to plano across the codebase
Complete rebrand from "Arch"/"archgw" to "Plano" including:
- Config files: arch_config_schema.yaml, workflow, demo configs
- Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_*
- Python CLI: variables, functions, file paths, docker mounts
- Rust crates: config paths, log messages, metadata keys
- Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore
- Docker Compose: volume mounts and env vars across all demos/tests
- GitHub workflows: job/step names
- Shell scripts: log messages
- Demos: Python code, READMEs, VS Code configs, Grafana dashboard
- Docs: RST includes, code comments, config references
- Package metadata: package.json, pyproject.toml, uv.lock
External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
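The environment variable rename above (ARCH_CONFIG_* → PLANO_CONFIG_*) is a hard switch in this commit: the old names are no longer read. For deployments that still export the old names, a transitional lookup could be sketched as below. This is only an illustration under stated assumptions: `getenv_renamed` is a hypothetical helper that does not exist in the repository, and the commit itself adds no such fallback.

```python
import os
import warnings


# Hypothetical helper for illustration only; it is NOT part of this commit,
# which renames the variables outright without a fallback.
def getenv_renamed(new_name, old_name, default=None, environ=None):
    """Return the value of new_name, falling back to the legacy old_name."""
    environ = os.environ if environ is None else environ
    if new_name in environ:
        # The new name always wins when both are set.
        return environ[new_name]
    if old_name in environ:
        warnings.warn(
            f"{old_name} is deprecated; set {new_name} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return environ[old_name]
    return default


# Usage: same default path as the renamed lookup shown in the diff.
PLANO_CONFIG_FILE = getenv_renamed(
    "PLANO_CONFIG_FILE", "ARCH_CONFIG_FILE", "/app/plano_config.yaml"
)
```

Preferring the new name when both are set keeps the behavior predictable during a migration, and the deprecation warning gives operators a signal before the legacy name is dropped entirely.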
* Update remaining arch references in docs
- Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_*
- Update label references in request_lifecycle.rst
- Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml
- Update config YAML comments: "Arch creates/uses" → "Plano creates/uses"
- Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst
- Update arch_config_schema.yaml reference in provider_models.py
- Rename arch_agent_router → plano_agent_router in config example
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix remaining arch references found in second pass
- config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE,
arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs
- config/test_passthrough.yaml: container mount path
- tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml)
- cli/planoai/core.py: comment and log message
- crates/brightstaff/src/tracing/constants.rs: doc comment
- tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages,
arch_state/arch_messages variables renamed
- tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages
- demos/shared/test_runner/{common,test_demos}.py: same renames
- tests/e2e/test_model_alias_routing.py: docstring
- .dockerignore: archgw_modelserver → plano_modelserver
- demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name
Note: x-arch-* HTTP header values and Rust constant names intentionally
preserved for backwards compatibility with existing deployments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
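A second pass like the one described above can be approximated with a small scanner that flags leftover legacy names while skipping the references the commit messages say were kept on purpose (x-arch-* headers and the external archgw URLs). The sketch below is an assumption about how such a check might look, not the search the authors actually ran; the regular expressions and the `find_legacy_references` name are made up for illustration, and the preserved Rust constant names would need their own allowlist entries.

```python
import re

# Assumed allowlist, based only on the exceptions named in the commit messages.
PRESERVED = re.compile(
    r"x-arch-[\w-]*|docs\.archgw\.com|github\.com/katanemo/archgw",
    re.IGNORECASE,
)

# "arch" or "archgw" as a standalone name part: not preceded or followed by a
# letter, so words such as "search" or "architecture" are not flagged.
LEGACY = re.compile(r"(?<![A-Za-z])arch(?:gw)?(?![A-Za-z])", re.IGNORECASE)


def find_legacy_references(text):
    """Return (line_number, line) pairs that still contain a legacy name."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Drop intentionally preserved references before searching.
        scrubbed = PRESERVED.sub("", line)
        if LEGACY.search(scrubbed):
            hits.append((lineno, line.strip()))
    return hits


sample = "\n".join(
    [
        "ARCH_CONFIG_FILE=/app/arch_config.yaml",
        "x-arch-llm-provider: openai",
        "see https://docs.archgw.com",
        "search architecture",
        "archgw_logs/",
    ]
)
for lineno, line in find_legacy_references(sample):
    print(f"{lineno}: {line}")
```

Scrubbing the allowlisted substrings before matching keeps the legacy pattern simple, at the cost of also hiding a genuine leftover that happens to sit inside a preserved URL or header name.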
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
parent 0557f7ff98
commit ba651aaf71
115 changed files with 504 additions and 505 deletions
|
|
@ -33,8 +33,8 @@ cli/__pycache__/
|
||||||
cli/planoai/__pycache__/
|
cli/planoai/__pycache__/
|
||||||
|
|
||||||
# Python model server
|
# Python model server
|
||||||
archgw_modelserver/
|
plano_modelserver/
|
||||||
arch_tools/
|
plano_tools/
|
||||||
|
|
||||||
# Misc
|
# Misc
|
||||||
*.md
|
*.md
|
||||||
|
|
@ -44,4 +44,4 @@ turbo.json
|
||||||
package.json
|
package.json
|
||||||
*.sh
|
*.sh
|
||||||
!cli/build_cli.sh
|
!cli/build_cli.sh
|
||||||
arch_config.yaml_rendered
|
plano_config.yaml_rendered
|
||||||
|
|
|
||||||
2
.github/workflows/e2e_plano_tests.yml
vendored
|
|
@ -28,7 +28,7 @@ jobs:
|
||||||
python-version: ${{ matrix.python-version }}
|
python-version: ${{ matrix.python-version }}
|
||||||
cache: "pip" # auto-caches based on requirements files
|
cache: "pip" # auto-caches based on requirements files
|
||||||
|
|
||||||
- name: build arch docker image
|
- name: build plano docker image
|
||||||
run: |
|
run: |
|
||||||
cd ../../ && docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6 -t katanemo/plano:latest
|
cd ../../ && docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6 -t katanemo/plano:latest
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -38,7 +38,7 @@ jobs:
|
||||||
curl --location --remote-name https://github.com/Orange-OpenSource/hurl/releases/download/4.0.0/hurl_4.0.0_amd64.deb
|
curl --location --remote-name https://github.com/Orange-OpenSource/hurl/releases/download/4.0.0/hurl_4.0.0_amd64.deb
|
||||||
sudo dpkg -i hurl_4.0.0_amd64.deb
|
sudo dpkg -i hurl_4.0.0_amd64.deb
|
||||||
|
|
||||||
- name: install arch gateway and test dependencies
|
- name: install plano gateway and test dependencies
|
||||||
run: |
|
run: |
|
||||||
source venv/bin/activate
|
source venv/bin/activate
|
||||||
cd cli && echo "installing plano cli" && uv sync && uv tool install .
|
cd cli && echo "installing plano cli" && uv sync && uv tool install .
|
||||||
|
|
|
||||||
|
|
@ -22,7 +22,7 @@ jobs:
|
||||||
with:
|
with:
|
||||||
python-version: "3.12"
|
python-version: "3.12"
|
||||||
|
|
||||||
- name: build arch docker image
|
- name: build plano docker image
|
||||||
run: |
|
run: |
|
||||||
docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6
|
docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6
|
||||||
|
|
||||||
|
|
@ -38,7 +38,7 @@ jobs:
|
||||||
curl --location --remote-name https://github.com/Orange-OpenSource/hurl/releases/download/4.0.0/hurl_4.0.0_amd64.deb
|
curl --location --remote-name https://github.com/Orange-OpenSource/hurl/releases/download/4.0.0/hurl_4.0.0_amd64.deb
|
||||||
sudo dpkg -i hurl_4.0.0_amd64.deb
|
sudo dpkg -i hurl_4.0.0_amd64.deb
|
||||||
|
|
||||||
- name: install arch gateway and test dependencies
|
- name: install plano gateway and test dependencies
|
||||||
run: |
|
run: |
|
||||||
source venv/bin/activate
|
source venv/bin/activate
|
||||||
cd cli && echo "installing plano cli" && uv sync && uv tool install .
|
cd cli && echo "installing plano cli" && uv sync && uv tool install .
|
||||||
|
|
|
||||||
|
|
@ -1,4 +1,4 @@
|
||||||
name: arch config tests
|
name: plano config tests
|
||||||
|
|
||||||
on:
|
on:
|
||||||
push:
|
push:
|
||||||
|
|
@ -7,7 +7,7 @@ on:
|
||||||
pull_request:
|
pull_request:
|
||||||
|
|
||||||
jobs:
|
jobs:
|
||||||
validate_arch_config:
|
validate_plano_config:
|
||||||
runs-on: ubuntu-latest
|
runs-on: ubuntu-latest
|
||||||
defaults:
|
defaults:
|
||||||
run:
|
run:
|
||||||
|
|
@ -22,10 +22,10 @@ jobs:
|
||||||
with:
|
with:
|
||||||
python-version: "3.12"
|
python-version: "3.12"
|
||||||
|
|
||||||
- name: build arch docker image
|
- name: build plano docker image
|
||||||
run: |
|
run: |
|
||||||
docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6
|
docker build -f Dockerfile . -t katanemo/plano -t katanemo/plano:0.4.6
|
||||||
|
|
||||||
- name: validate arch config
|
- name: validate plano config
|
||||||
run: |
|
run: |
|
||||||
bash config/validate_plano_config.sh
|
bash config/validate_plano_config.sh
|
||||||
14
.gitignore
vendored
|
|
@ -107,30 +107,30 @@ venv.bak/
|
||||||
|
|
||||||
# =========================================
|
# =========================================
|
||||||
|
|
||||||
# Arch
|
# Plano
|
||||||
cli/config
|
cli/config
|
||||||
cli/build
|
cli/build
|
||||||
|
|
||||||
# Archgw - Docs
|
# Plano - Docs
|
||||||
docs/build/
|
docs/build/
|
||||||
|
|
||||||
# Archgw - Demos
|
# Plano - Demos
|
||||||
demos/function_calling/ollama/models/
|
demos/function_calling/ollama/models/
|
||||||
demos/function_calling/ollama/id_ed*
|
demos/function_calling/ollama/id_ed*
|
||||||
demos/function_calling/open-webui/
|
demos/function_calling/open-webui/
|
||||||
demos/function_calling/open-webui/
|
demos/function_calling/open-webui/
|
||||||
demos/shared/signoz/data
|
demos/shared/signoz/data
|
||||||
|
|
||||||
# Arch - Miscellaneous
|
# Plano - Miscellaneous
|
||||||
grafana-data
|
grafana-data
|
||||||
prom_data
|
prom_data
|
||||||
arch_log/
|
plano_log/
|
||||||
arch_logs/
|
plano_logs/
|
||||||
crates/*/target/
|
crates/*/target/
|
||||||
crates/target/
|
crates/target/
|
||||||
build.log
|
build.log
|
||||||
|
|
||||||
archgw.log
|
plano.log
|
||||||
|
|
||||||
# Next.js / Turborepo
|
# Next.js / Turborepo
|
||||||
.next/
|
.next/
|
||||||
|
|
|
||||||
|
|
@ -96,7 +96,7 @@ Entry point: `cli/planoai/main.py`. Container lifecycle in `core.py`. Docker ope
|
||||||
|
|
||||||
### Configuration System (config/)
|
### Configuration System (config/)
|
||||||
|
|
||||||
- `arch_config_schema.yaml` — JSON Schema (draft-07) for validating user config files
|
- `plano_config_schema.yaml` — JSON Schema (draft-07) for validating user config files
|
||||||
- `envoy.template.yaml` — Jinja2 template rendered into Envoy proxy config
|
- `envoy.template.yaml` — Jinja2 template rendered into Envoy proxy config
|
||||||
- `supervisord.conf` — Process supervisor for Envoy + brightstaff in the container
|
- `supervisord.conf` — Process supervisor for Envoy + brightstaff in the container
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -69,7 +69,7 @@ RUN uv run pip install --no-cache-dir .
|
||||||
|
|
||||||
COPY cli/planoai planoai/
|
COPY cli/planoai planoai/
|
||||||
COPY config/envoy.template.yaml .
|
COPY config/envoy.template.yaml .
|
||||||
COPY config/arch_config_schema.yaml .
|
COPY config/plano_config_schema.yaml .
|
||||||
COPY config/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
|
COPY config/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
|
||||||
|
|
||||||
COPY --from=wasm-builder /arch/target/wasm32-wasip1/release/prompt_gateway.wasm /etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm
|
COPY --from=wasm-builder /arch/target/wasm32-wasip1/release/prompt_gateway.wasm /etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm
|
||||||
|
|
|
||||||
|
|
@ -115,7 +115,6 @@ export const siteConfig = {
|
||||||
// Brand (minimal, necessary)
|
// Brand (minimal, necessary)
|
||||||
"Plano AI",
|
"Plano AI",
|
||||||
"Plano gateway",
|
"Plano gateway",
|
||||||
"Arch gateway",
|
|
||||||
],
|
],
|
||||||
authors: [{ name: "Katanemo", url: "https://github.com/katanemo/plano" }],
|
authors: [{ name: "Katanemo", url: "https://github.com/katanemo/plano" }],
|
||||||
creator: "Katanemo",
|
creator: "Katanemo",
|
||||||
|
|
@ -240,7 +239,7 @@ export const pageMetadata = {
|
||||||
"agentic AI",
|
"agentic AI",
|
||||||
"Plano blog",
|
"Plano blog",
|
||||||
"Plano blog posts",
|
"Plano blog posts",
|
||||||
"Arch gateway blog",
|
"Plano gateway blog",
|
||||||
],
|
],
|
||||||
}),
|
}),
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -53,34 +53,34 @@ def validate_and_render_schema():
|
||||||
ENVOY_CONFIG_TEMPLATE_FILE = os.getenv(
|
ENVOY_CONFIG_TEMPLATE_FILE = os.getenv(
|
||||||
"ENVOY_CONFIG_TEMPLATE_FILE", "envoy.template.yaml"
|
"ENVOY_CONFIG_TEMPLATE_FILE", "envoy.template.yaml"
|
||||||
)
|
)
|
||||||
ARCH_CONFIG_FILE = os.getenv("ARCH_CONFIG_FILE", "/app/arch_config.yaml")
|
PLANO_CONFIG_FILE = os.getenv("PLANO_CONFIG_FILE", "/app/plano_config.yaml")
|
||||||
ARCH_CONFIG_FILE_RENDERED = os.getenv(
|
PLANO_CONFIG_FILE_RENDERED = os.getenv(
|
||||||
"ARCH_CONFIG_FILE_RENDERED", "/app/arch_config_rendered.yaml"
|
"PLANO_CONFIG_FILE_RENDERED", "/app/plano_config_rendered.yaml"
|
||||||
)
|
)
|
||||||
ENVOY_CONFIG_FILE_RENDERED = os.getenv(
|
ENVOY_CONFIG_FILE_RENDERED = os.getenv(
|
||||||
"ENVOY_CONFIG_FILE_RENDERED", "/etc/envoy/envoy.yaml"
|
"ENVOY_CONFIG_FILE_RENDERED", "/etc/envoy/envoy.yaml"
|
||||||
)
|
)
|
||||||
ARCH_CONFIG_SCHEMA_FILE = os.getenv(
|
PLANO_CONFIG_SCHEMA_FILE = os.getenv(
|
||||||
"ARCH_CONFIG_SCHEMA_FILE", "arch_config_schema.yaml"
|
"PLANO_CONFIG_SCHEMA_FILE", "plano_config_schema.yaml"
|
||||||
)
|
)
|
||||||
|
|
||||||
env = Environment(loader=FileSystemLoader(os.getenv("TEMPLATE_ROOT", "./")))
|
env = Environment(loader=FileSystemLoader(os.getenv("TEMPLATE_ROOT", "./")))
|
||||||
template = env.get_template(ENVOY_CONFIG_TEMPLATE_FILE)
|
template = env.get_template(ENVOY_CONFIG_TEMPLATE_FILE)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
validate_prompt_config(ARCH_CONFIG_FILE, ARCH_CONFIG_SCHEMA_FILE)
|
validate_prompt_config(PLANO_CONFIG_FILE, PLANO_CONFIG_SCHEMA_FILE)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
print(str(e))
|
print(str(e))
|
||||||
exit(1) # validate_prompt_config failed. Exit
|
exit(1) # validate_prompt_config failed. Exit
|
||||||
|
|
||||||
with open(ARCH_CONFIG_FILE, "r") as file:
|
with open(PLANO_CONFIG_FILE, "r") as file:
|
||||||
arch_config = file.read()
|
plano_config = file.read()
|
||||||
|
|
||||||
with open(ARCH_CONFIG_SCHEMA_FILE, "r") as file:
|
with open(PLANO_CONFIG_SCHEMA_FILE, "r") as file:
|
||||||
arch_config_schema = file.read()
|
plano_config_schema = file.read()
|
||||||
|
|
||||||
config_yaml = yaml.safe_load(arch_config)
|
config_yaml = yaml.safe_load(plano_config)
|
||||||
_ = yaml.safe_load(arch_config_schema)
|
_ = yaml.safe_load(plano_config_schema)
|
||||||
inferred_clusters = {}
|
inferred_clusters = {}
|
||||||
|
|
||||||
# Convert legacy llm_providers to model_providers
|
# Convert legacy llm_providers to model_providers
|
||||||
|
|
@ -145,7 +145,7 @@ def validate_and_render_schema():
|
||||||
inferred_clusters[name]["port"],
|
inferred_clusters[name]["port"],
|
||||||
) = get_endpoint_and_port(endpoint, protocol)
|
) = get_endpoint_and_port(endpoint, protocol)
|
||||||
|
|
||||||
print("defined clusters from arch_config.yaml: ", json.dumps(inferred_clusters))
|
print("defined clusters from plano_config.yaml: ", json.dumps(inferred_clusters))
|
||||||
|
|
||||||
if "prompt_targets" in config_yaml:
|
if "prompt_targets" in config_yaml:
|
||||||
for prompt_target in config_yaml["prompt_targets"]:
|
for prompt_target in config_yaml["prompt_targets"]:
|
||||||
|
|
@ -154,13 +154,13 @@ def validate_and_render_schema():
|
||||||
continue
|
continue
|
||||||
if name not in inferred_clusters:
|
if name not in inferred_clusters:
|
||||||
raise Exception(
|
raise Exception(
|
||||||
f"Unknown endpoint {name}, please add it in endpoints section in your arch_config.yaml file"
|
f"Unknown endpoint {name}, please add it in endpoints section in your plano_config.yaml file"
|
||||||
)
|
)
|
||||||
|
|
||||||
arch_tracing = config_yaml.get("tracing", {})
|
plano_tracing = config_yaml.get("tracing", {})
|
||||||
|
|
||||||
# Resolution order: config yaml > OTEL_TRACING_GRPC_ENDPOINT env var > hardcoded default
|
# Resolution order: config yaml > OTEL_TRACING_GRPC_ENDPOINT env var > hardcoded default
|
||||||
opentracing_grpc_endpoint = arch_tracing.get(
|
opentracing_grpc_endpoint = plano_tracing.get(
|
||||||
"opentracing_grpc_endpoint",
|
"opentracing_grpc_endpoint",
|
||||||
os.environ.get(
|
os.environ.get(
|
||||||
"OTEL_TRACING_GRPC_ENDPOINT", DEFAULT_OTEL_TRACING_GRPC_ENDPOINT
|
"OTEL_TRACING_GRPC_ENDPOINT", DEFAULT_OTEL_TRACING_GRPC_ENDPOINT
|
||||||
|
|
@ -172,7 +172,7 @@ def validate_and_render_schema():
|
||||||
print(
|
print(
|
||||||
f"Resolved opentracing_grpc_endpoint to {opentracing_grpc_endpoint} after expanding environment variables"
|
f"Resolved opentracing_grpc_endpoint to {opentracing_grpc_endpoint} after expanding environment variables"
|
||||||
)
|
)
|
||||||
arch_tracing["opentracing_grpc_endpoint"] = opentracing_grpc_endpoint
|
plano_tracing["opentracing_grpc_endpoint"] = opentracing_grpc_endpoint
|
||||||
# ensure that opentracing_grpc_endpoint is a valid URL if present and start with http and must not have any path
|
# ensure that opentracing_grpc_endpoint is a valid URL if present and start with http and must not have any path
|
||||||
if opentracing_grpc_endpoint:
|
if opentracing_grpc_endpoint:
|
||||||
urlparse_result = urlparse(opentracing_grpc_endpoint)
|
urlparse_result = urlparse(opentracing_grpc_endpoint)
|
||||||
|
|
@ -436,8 +436,8 @@ def validate_and_render_schema():
|
||||||
f"Model alias 2 - '{alias_name}' targets '{target}' which is not defined as a model. Available models: {', '.join(sorted(model_name_keys))}"
|
f"Model alias 2 - '{alias_name}' targets '{target}' which is not defined as a model. Available models: {', '.join(sorted(model_name_keys))}"
|
||||||
)
|
)
|
||||||
|
|
||||||
arch_config_string = yaml.dump(config_yaml)
|
plano_config_string = yaml.dump(config_yaml)
|
||||||
arch_llm_config_string = yaml.dump(config_yaml)
|
plano_llm_config_string = yaml.dump(config_yaml)
|
||||||
|
|
||||||
use_agent_orchestrator = config_yaml.get("overrides", {}).get(
|
use_agent_orchestrator = config_yaml.get("overrides", {}).get(
|
||||||
"use_agent_orchestrator", False
|
"use_agent_orchestrator", False
|
||||||
|
|
@ -449,11 +449,11 @@ def validate_and_render_schema():
|
||||||
|
|
||||||
if len(endpoints) == 0:
|
if len(endpoints) == 0:
|
||||||
raise Exception(
|
raise Exception(
|
||||||
"Please provide agent orchestrator in the endpoints section in your arch_config.yaml file"
|
"Please provide agent orchestrator in the endpoints section in your plano_config.yaml file"
|
||||||
)
|
)
|
||||||
elif len(endpoints) > 1:
|
elif len(endpoints) > 1:
|
||||||
raise Exception(
|
raise Exception(
|
||||||
"Please provide single agent orchestrator in the endpoints section in your arch_config.yaml file"
|
"Please provide single agent orchestrator in the endpoints section in your plano_config.yaml file"
|
||||||
)
|
)
|
||||||
else:
|
else:
|
||||||
agent_orchestrator = list(endpoints.keys())[0]
|
agent_orchestrator = list(endpoints.keys())[0]
|
||||||
|
|
@ -463,11 +463,11 @@ def validate_and_render_schema():
|
||||||
data = {
|
data = {
|
||||||
"prompt_gateway_listener": prompt_gateway,
|
"prompt_gateway_listener": prompt_gateway,
|
||||||
"llm_gateway_listener": llm_gateway,
|
"llm_gateway_listener": llm_gateway,
|
||||||
"arch_config": arch_config_string,
|
"plano_config": plano_config_string,
|
||||||
"arch_llm_config": arch_llm_config_string,
|
"plano_llm_config": plano_llm_config_string,
|
||||||
"arch_clusters": inferred_clusters,
|
"plano_clusters": inferred_clusters,
|
||||||
"arch_model_providers": updated_model_providers,
|
"plano_model_providers": updated_model_providers,
|
||||||
"arch_tracing": arch_tracing,
|
"plano_tracing": plano_tracing,
|
||||||
"local_llms": llms_with_endpoint,
|
"local_llms": llms_with_endpoint,
|
||||||
"agent_orchestrator": agent_orchestrator,
|
"agent_orchestrator": agent_orchestrator,
|
||||||
"listeners": listeners,
|
"listeners": listeners,
|
||||||
|
|
@ -479,25 +479,25 @@ def validate_and_render_schema():
|
||||||
with open(ENVOY_CONFIG_FILE_RENDERED, "w") as file:
|
with open(ENVOY_CONFIG_FILE_RENDERED, "w") as file:
|
||||||
file.write(rendered)
|
file.write(rendered)
|
||||||
|
|
||||||
with open(ARCH_CONFIG_FILE_RENDERED, "w") as file:
|
with open(PLANO_CONFIG_FILE_RENDERED, "w") as file:
|
||||||
file.write(arch_config_string)
|
file.write(plano_config_string)
|
||||||
|
|
||||||
|
|
||||||
def validate_prompt_config(arch_config_file, arch_config_schema_file):
|
def validate_prompt_config(plano_config_file, plano_config_schema_file):
|
||||||
with open(arch_config_file, "r") as file:
|
with open(plano_config_file, "r") as file:
|
||||||
arch_config = file.read()
|
plano_config = file.read()
|
||||||
|
|
||||||
with open(arch_config_schema_file, "r") as file:
|
with open(plano_config_schema_file, "r") as file:
|
||||||
arch_config_schema = file.read()
|
plano_config_schema = file.read()
|
||||||
|
|
||||||
config_yaml = yaml.safe_load(arch_config)
|
config_yaml = yaml.safe_load(plano_config)
|
||||||
config_schema_yaml = yaml.safe_load(arch_config_schema)
|
config_schema_yaml = yaml.safe_load(plano_config_schema)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
validate(config_yaml, config_schema_yaml)
|
validate(config_yaml, config_schema_yaml)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
print(
|
print(
|
||||||
f"Error validating arch_config file: {arch_config_file}, schema file: {arch_config_schema_file}, error: {e}"
|
f"Error validating plano_config file: {plano_config_file}, schema file: {plano_config_schema_file}, error: {e}"
|
||||||
)
|
)
|
||||||
raise e
|
raise e
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -24,17 +24,17 @@ from planoai.docker_cli import (
|
||||||
log = getLogger(__name__)
|
log = getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
def _get_gateway_ports(arch_config_file: str) -> list[int]:
|
def _get_gateway_ports(plano_config_file: str) -> list[int]:
|
||||||
PROMPT_GATEWAY_DEFAULT_PORT = 10000
|
PROMPT_GATEWAY_DEFAULT_PORT = 10000
|
||||||
LLM_GATEWAY_DEFAULT_PORT = 12000
|
LLM_GATEWAY_DEFAULT_PORT = 12000
|
||||||
|
|
||||||
# parse arch_config_file yaml file and get prompt_gateway_port
|
# parse plano_config_file yaml file and get prompt_gateway_port
|
||||||
arch_config_dict = {}
|
plano_config_dict = {}
|
||||||
with open(arch_config_file) as f:
|
with open(plano_config_file) as f:
|
||||||
arch_config_dict = yaml.safe_load(f)
|
plano_config_dict = yaml.safe_load(f)
|
||||||
|
|
||||||
listeners, _, _ = convert_legacy_listeners(
|
listeners, _, _ = convert_legacy_listeners(
|
||||||
arch_config_dict.get("listeners"), arch_config_dict.get("llm_providers")
|
plano_config_dict.get("listeners"), plano_config_dict.get("llm_providers")
|
||||||
)
|
)
|
||||||
|
|
||||||
all_ports = [listener.get("port") for listener in listeners]
|
all_ports = [listener.get("port") for listener in listeners]
|
||||||
|
|
@ -45,7 +45,7 @@ def _get_gateway_ports(arch_config_file: str) -> list[int]:
|
||||||
return all_ports
|
return all_ports
|
||||||
|
|
||||||
|
|
||||||
def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
|
def start_plano(plano_config_file, env, log_timeout=120, foreground=False):
|
||||||
"""
|
"""
|
||||||
Start Docker Compose in detached mode and stream logs until services are healthy.
|
Start Docker Compose in detached mode and stream logs until services are healthy.
|
||||||
|
|
||||||
|
|
@ -54,7 +54,7 @@ def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
|
||||||
log_timeout (int): Time in seconds to show logs before checking for healthy state.
|
log_timeout (int): Time in seconds to show logs before checking for healthy state.
|
||||||
"""
|
"""
|
||||||
log.info(
|
log.info(
|
||||||
f"Starting arch gateway, image name: {PLANO_DOCKER_NAME}, tag: {PLANO_DOCKER_IMAGE}"
|
f"Starting plano gateway, image name: {PLANO_DOCKER_NAME}, tag: {PLANO_DOCKER_IMAGE}"
|
||||||
)
|
)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
|
@ -64,10 +64,10 @@ def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
|
||||||
docker_stop_container(PLANO_DOCKER_NAME)
|
docker_stop_container(PLANO_DOCKER_NAME)
|
||||||
docker_remove_container(PLANO_DOCKER_NAME)
|
docker_remove_container(PLANO_DOCKER_NAME)
|
||||||
|
|
||||||
gateway_ports = _get_gateway_ports(arch_config_file)
|
gateway_ports = _get_gateway_ports(plano_config_file)
|
||||||
|
|
||||||
return_code, _, plano_stderr = docker_start_plano_detached(
|
return_code, _, plano_stderr = docker_start_plano_detached(
|
||||||
arch_config_file,
|
plano_config_file,
|
||||||
env,
|
env,
|
||||||
gateway_ports,
|
gateway_ports,
|
||||||
)
|
)
|
||||||
|
|
@ -117,7 +117,7 @@ def start_arch(arch_config_file, env, log_timeout=120, foreground=False):
|
||||||
stream_gateway_logs(follow=True)
|
stream_gateway_logs(follow=True)
|
||||||
|
|
||||||
except KeyboardInterrupt:
|
except KeyboardInterrupt:
|
||||||
log.info("Keyboard interrupt received, stopping arch gateway service.")
|
log.info("Keyboard interrupt received, stopping plano gateway service.")
|
||||||
stop_docker_container()
|
stop_docker_container()
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -144,15 +144,15 @@ def stop_docker_container(service=PLANO_DOCKER_NAME):
|
||||||
log.info(f"Failed to shut down services: {str(e)}")
|
log.info(f"Failed to shut down services: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
def start_cli_agent(arch_config_file=None, settings_json="{}"):
|
def start_cli_agent(plano_config_file=None, settings_json="{}"):
|
||||||
"""Start a CLI client connected to Plano."""
|
"""Start a CLI client connected to Plano."""
|
||||||
|
|
||||||
with open(arch_config_file, "r") as file:
|
with open(plano_config_file, "r") as file:
|
||||||
arch_config = file.read()
|
plano_config = file.read()
|
||||||
arch_config_yaml = yaml.safe_load(arch_config)
|
plano_config_yaml = yaml.safe_load(plano_config)
|
||||||
|
|
||||||
# Get egress listener configuration
|
# Get egress listener configuration
|
||||||
egress_config = arch_config_yaml.get("listeners", {}).get("egress_traffic", {})
|
egress_config = plano_config_yaml.get("listeners", {}).get("egress_traffic", {})
|
||||||
host = egress_config.get("host", "127.0.0.1")
|
host = egress_config.get("host", "127.0.0.1")
|
||||||
port = egress_config.get("port", 12000)
|
port = egress_config.get("port", 12000)
|
||||||
|
|
||||||
|
|
@ -167,7 +167,7 @@ def start_cli_agent(arch_config_file=None, settings_json="{}"):
|
||||||
env = os.environ.copy()
|
env = os.environ.copy()
|
||||||
env.update(
|
env.update(
|
||||||
{
|
{
|
||||||
"ANTHROPIC_AUTH_TOKEN": "test", # Use test token for arch
|
"ANTHROPIC_AUTH_TOKEN": "test", # Use test token for plano
|
||||||
"ANTHROPIC_API_KEY": "",
|
"ANTHROPIC_API_KEY": "",
|
||||||
"ANTHROPIC_BASE_URL": f"http://{host}:{port}",
|
"ANTHROPIC_BASE_URL": f"http://{host}:{port}",
|
||||||
"NO_PROXY": host,
|
"NO_PROXY": host,
|
||||||
|
|
@ -184,7 +184,7 @@ def start_cli_agent(arch_config_file=None, settings_json="{}"):
|
||||||
]
|
]
|
||||||
else:
|
else:
|
||||||
# Check if arch.claude.code.small.fast alias exists in model_aliases
|
# Check if arch.claude.code.small.fast alias exists in model_aliases
|
||||||
model_aliases = arch_config_yaml.get("model_aliases", {})
|
model_aliases = plano_config_yaml.get("model_aliases", {})
|
||||||
if "arch.claude.code.small.fast" in model_aliases:
|
if "arch.claude.code.small.fast" in model_aliases:
|
||||||
env["ANTHROPIC_SMALL_FAST_MODEL"] = "arch.claude.code.small.fast"
|
env["ANTHROPIC_SMALL_FAST_MODEL"] = "arch.claude.code.small.fast"
|
||||||
else:
|
else:
|
||||||
|
|
@ -220,7 +220,7 @@ def start_cli_agent(arch_config_file=None, settings_json="{}"):
|
||||||
|
|
||||||
# Use claude from PATH
|
# Use claude from PATH
|
||||||
claude_path = "claude"
|
claude_path = "claude"
|
||||||
log.info(f"Connecting Claude Code Agent to Arch at {host}:{port}")
|
log.info(f"Connecting Claude Code Agent to Plano at {host}:{port}")
|
||||||
|
|
||||||
try:
|
try:
|
||||||
subprocess.run([claude_path] + claude_args, env=env, check=True)
|
subprocess.run([claude_path] + claude_args, env=env, check=True)
|
||||||
|
|
|
||||||
|
|
@ -41,7 +41,7 @@ def docker_remove_container(container: str) -> str:
|
||||||
|
|
||||||
|
|
||||||
def docker_start_plano_detached(
|
def docker_start_plano_detached(
|
||||||
arch_config_file: str,
|
plano_config_file: str,
|
||||||
env: dict,
|
env: dict,
|
||||||
gateway_ports: list[int],
|
gateway_ports: list[int],
|
||||||
) -> str:
|
) -> str:
|
||||||
|
|
@ -58,7 +58,7 @@ def docker_start_plano_detached(
|
||||||
port_mappings_args = [item for port in port_mappings for item in ("-p", port)]
|
port_mappings_args = [item for port in port_mappings for item in ("-p", port)]
|
||||||
|
|
||||||
volume_mappings = [
|
volume_mappings = [
|
||||||
f"{arch_config_file}:/app/arch_config.yaml:ro",
|
f"{plano_config_file}:/app/plano_config.yaml:ro",
|
||||||
]
|
]
|
||||||
volume_mappings_args = [
|
volume_mappings_args = [
|
||||||
item for volume in volume_mappings for item in ("-v", volume)
|
item for volume in volume_mappings for item in ("-v", volume)
|
||||||
|
|
@ -115,7 +115,7 @@ def stream_gateway_logs(follow, service="plano"):
|
||||||
log.info(f"Failed to stream logs: {str(e)}")
|
log.info(f"Failed to stream logs: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
def docker_validate_plano_schema(arch_config_file):
|
def docker_validate_plano_schema(plano_config_file):
|
||||||
import os
|
import os
|
||||||
|
|
||||||
env = os.environ.copy()
|
env = os.environ.copy()
|
||||||
|
|
@ -129,7 +129,7 @@ def docker_validate_plano_schema(arch_config_file):
|
||||||
"--rm",
|
"--rm",
|
||||||
*env_args,
|
*env_args,
|
||||||
"-v",
|
"-v",
|
||||||
f"{arch_config_file}:/app/arch_config.yaml:ro",
|
f"{plano_config_file}:/app/plano_config.yaml:ro",
|
||||||
"--entrypoint",
|
"--entrypoint",
|
||||||
"python",
|
"python",
|
||||||
PLANO_DOCKER_IMAGE,
|
PLANO_DOCKER_IMAGE,
|
||||||
|
|
|
||||||
|
|
@ -22,7 +22,7 @@ from planoai.utils import (
|
||||||
find_repo_root,
|
find_repo_root,
|
||||||
)
|
)
|
||||||
from planoai.core import (
|
from planoai.core import (
|
||||||
start_arch,
|
start_plano,
|
||||||
stop_docker_container,
|
stop_docker_container,
|
||||||
start_cli_agent,
|
start_cli_agent,
|
||||||
)
|
)
|
||||||
|
|
@ -200,12 +200,12 @@ def up(file, path, foreground, with_tracing, tracing_port):
|
||||||
_print_cli_header(console)
|
_print_cli_header(console)
|
||||||
|
|
||||||
# Use the utility function to find config file
|
# Use the utility function to find config file
|
||||||
arch_config_file = find_config_file(path, file)
|
plano_config_file = find_config_file(path, file)
|
||||||
|
|
||||||
# Check if the file exists
|
# Check if the file exists
|
||||||
if not os.path.exists(arch_config_file):
|
if not os.path.exists(plano_config_file):
|
||||||
console.print(
|
console.print(
|
||||||
f"[red]✗[/red] Config file not found: [dim]{arch_config_file}[/dim]"
|
f"[red]✗[/red] Config file not found: [dim]{plano_config_file}[/dim]"
|
||||||
)
|
)
|
||||||
sys.exit(1)
|
sys.exit(1)
|
||||||
|
|
||||||
|
|
@ -216,7 +216,7 @@ def up(file, path, foreground, with_tracing, tracing_port):
|
||||||
validation_return_code,
|
validation_return_code,
|
||||||
_,
|
_,
|
||||||
validation_stderr,
|
validation_stderr,
|
||||||
) = docker_validate_plano_schema(arch_config_file)
|
) = docker_validate_plano_schema(plano_config_file)
|
||||||
|
|
||||||
if validation_return_code != 0:
|
if validation_return_code != 0:
|
||||||
console.print(f"[red]✗[/red] Validation failed")
|
console.print(f"[red]✗[/red] Validation failed")
|
||||||
|
|
@ -234,7 +234,7 @@ def up(file, path, foreground, with_tracing, tracing_port):
|
||||||
env.pop("PATH", None)
|
env.pop("PATH", None)
|
||||||
|
|
||||||
# Check access keys
|
# Check access keys
|
||||||
access_keys = get_llm_provider_access_keys(arch_config_file=arch_config_file)
|
access_keys = get_llm_provider_access_keys(plano_config_file=plano_config_file)
|
||||||
access_keys = set(access_keys)
|
access_keys = set(access_keys)
|
||||||
access_keys = [item[1:] if item.startswith("$") else item for item in access_keys]
|
access_keys = [item[1:] if item.startswith("$") else item for item in access_keys]
|
||||||
|
|
||||||
|
|
@ -302,7 +302,7 @@ def up(file, path, foreground, with_tracing, tracing_port):
|
||||||
|
|
||||||
env.update(env_stage)
|
env.update(env_stage)
|
||||||
try:
|
try:
|
||||||
start_arch(arch_config_file, env, foreground=foreground)
|
start_plano(plano_config_file, env, foreground=foreground)
|
||||||
|
|
||||||
# When tracing is enabled but --foreground is not, keep the process
|
# When tracing is enabled but --foreground is not, keep the process
|
||||||
# alive so the OTLP collector continues to receive spans.
|
# alive so the OTLP collector continues to receive spans.
|
||||||
|
|
@ -363,35 +363,35 @@ def generate_prompt_targets(file):
|
||||||
def logs(debug, follow):
|
def logs(debug, follow):
|
||||||
"""Stream logs from access logs services."""
|
"""Stream logs from access logs services."""
|
||||||
|
|
||||||
archgw_process = None
|
plano_process = None
|
||||||
try:
|
try:
|
||||||
if debug:
|
if debug:
|
||||||
archgw_process = multiprocessing.Process(
|
plano_process = multiprocessing.Process(
|
||||||
target=stream_gateway_logs, args=(follow,)
|
target=stream_gateway_logs, args=(follow,)
|
||||||
)
|
)
|
||||||
archgw_process.start()
|
plano_process.start()
|
||||||
|
|
||||||
archgw_access_logs_process = multiprocessing.Process(
|
plano_access_logs_process = multiprocessing.Process(
|
||||||
target=stream_access_logs, args=(follow,)
|
target=stream_access_logs, args=(follow,)
|
||||||
)
|
)
|
||||||
archgw_access_logs_process.start()
|
plano_access_logs_process.start()
|
||||||
archgw_access_logs_process.join()
|
plano_access_logs_process.join()
|
||||||
|
|
||||||
if archgw_process:
|
if plano_process:
|
||||||
archgw_process.join()
|
plano_process.join()
|
||||||
except KeyboardInterrupt:
|
except KeyboardInterrupt:
|
||||||
log.info("KeyboardInterrupt detected. Exiting.")
|
log.info("KeyboardInterrupt detected. Exiting.")
|
||||||
if archgw_access_logs_process.is_alive():
|
if plano_access_logs_process.is_alive():
|
||||||
archgw_access_logs_process.terminate()
|
plano_access_logs_process.terminate()
|
||||||
if archgw_process and archgw_process.is_alive():
|
if plano_process and plano_process.is_alive():
|
||||||
archgw_process.terminate()
|
plano_process.terminate()
|
||||||
|
|
||||||
|
|
||||||
@click.command()
|
@click.command()
|
||||||
@click.argument("type", type=click.Choice(["claude"]), required=True)
|
@click.argument("type", type=click.Choice(["claude"]), required=True)
|
||||||
@click.argument("file", required=False) # Optional file argument
|
@click.argument("file", required=False) # Optional file argument
|
||||||
@click.option(
|
@click.option(
|
||||||
"--path", default=".", help="Path to the directory containing arch_config.yaml"
|
"--path", default=".", help="Path to the directory containing plano_config.yaml"
|
||||||
)
|
)
|
||||||
@click.option(
|
@click.option(
|
||||||
"--settings",
|
"--settings",
|
||||||
|
|
@@ -405,20 +405,20 @@ def cli_agent(type, file, path, settings):
|
||||||
"""
|
"""
|
||||||
|
|
||||||
# Check if plano docker container is running
|
# Check if plano docker container is running
|
||||||
archgw_status = docker_container_status(PLANO_DOCKER_NAME)
|
plano_status = docker_container_status(PLANO_DOCKER_NAME)
|
||||||
if archgw_status != "running":
|
if plano_status != "running":
|
||||||
log.error(f"plano docker container is not running (status: {archgw_status})")
|
log.error(f"plano docker container is not running (status: {plano_status})")
|
||||||
log.error("Please start plano using the 'planoai up' command.")
|
log.error("Please start plano using the 'planoai up' command.")
|
||||||
sys.exit(1)
|
sys.exit(1)
|
||||||
|
|
||||||
# Determine arch_config.yaml path
|
# Determine plano_config.yaml path
|
||||||
arch_config_file = find_config_file(path, file)
|
plano_config_file = find_config_file(path, file)
|
||||||
if not os.path.exists(arch_config_file):
|
if not os.path.exists(plano_config_file):
|
||||||
log.error(f"Config file not found: {arch_config_file}")
|
log.error(f"Config file not found: {plano_config_file}")
|
||||||
sys.exit(1)
|
sys.exit(1)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
start_cli_agent(arch_config_file, settings)
|
start_cli_agent(plano_config_file, settings)
|
||||||
except SystemExit:
|
except SystemExit:
|
||||||
# Re-raise SystemExit to preserve exit codes
|
# Re-raise SystemExit to preserve exit codes
|
||||||
raise
|
raise
|
||||||
|
|
|
||||||
|
|
@@ -68,19 +68,19 @@ def find_repo_root(start_path=None):
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
def has_ingress_listener(arch_config_file):
|
def has_ingress_listener(plano_config_file):
|
||||||
"""Check if the arch config file has ingress_traffic listener configured."""
|
"""Check if the plano config file has ingress_traffic listener configured."""
|
||||||
try:
|
try:
|
||||||
with open(arch_config_file) as f:
|
with open(plano_config_file) as f:
|
||||||
arch_config_dict = yaml.safe_load(f)
|
plano_config_dict = yaml.safe_load(f)
|
||||||
|
|
||||||
ingress_traffic = arch_config_dict.get("listeners", {}).get(
|
ingress_traffic = plano_config_dict.get("listeners", {}).get(
|
||||||
"ingress_traffic", {}
|
"ingress_traffic", {}
|
||||||
)
|
)
|
||||||
|
|
||||||
return bool(ingress_traffic)
|
return bool(ingress_traffic)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
log.error(f"Error reading config file {arch_config_file}: {e}")
|
log.error(f"Error reading config file {plano_config_file}: {e}")
|
||||||
return False
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
|
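The dictionary lookup inside `has_ingress_listener` can be tried without a YAML file. The sketch below assumes the config has already been parsed into a Python object; the helper name `has_ingress_listener_in` is hypothetical, and the logging from the hunk is omitted.

```python
def has_ingress_listener_in(config):
    """Return True when config["listeners"]["ingress_traffic"] is non-empty.

    Any lookup error (for example, listeners given as a list rather than a
    mapping) is swallowed and reported as False, as in the hunk above.
    """
    try:
        ingress_traffic = config.get("listeners", {}).get("ingress_traffic", {})
        return bool(ingress_traffic)
    except Exception:
        return False


print(has_ingress_listener_in({"listeners": {"ingress_traffic": {"port": 10000}}}))
print(has_ingress_listener_in({"listeners": [{"name": "llm", "port": 12000}]}))
```

With this approach a list-style `listeners` section always yields `False`, since a list has no `.get` method and the resulting error is caught.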
@@ -161,27 +161,27 @@ def convert_legacy_listeners(
|
||||||
return listeners, llm_gateway_listener, prompt_gateway_listener
|
return listeners, llm_gateway_listener, prompt_gateway_listener
|
||||||
|
|
||||||
|
|
||||||
def get_llm_provider_access_keys(arch_config_file):
|
def get_llm_provider_access_keys(plano_config_file):
|
||||||
with open(arch_config_file, "r") as file:
|
with open(plano_config_file, "r") as file:
|
||||||
arch_config = file.read()
|
plano_config = file.read()
|
||||||
arch_config_yaml = yaml.safe_load(arch_config)
|
plano_config_yaml = yaml.safe_load(plano_config)
|
||||||
|
|
||||||
access_key_list = []
|
access_key_list = []
|
||||||
|
|
||||||
# Convert legacy llm_providers to model_providers
|
# Convert legacy llm_providers to model_providers
|
||||||
if "llm_providers" in arch_config_yaml:
|
if "llm_providers" in plano_config_yaml:
|
||||||
if "model_providers" in arch_config_yaml:
|
if "model_providers" in plano_config_yaml:
|
||||||
raise Exception(
|
raise Exception(
|
||||||
"Please provide either llm_providers or model_providers, not both. llm_providers is deprecated, please use model_providers instead"
|
"Please provide either llm_providers or model_providers, not both. llm_providers is deprecated, please use model_providers instead"
|
||||||
)
|
)
|
||||||
arch_config_yaml["model_providers"] = arch_config_yaml["llm_providers"]
|
plano_config_yaml["model_providers"] = plano_config_yaml["llm_providers"]
|
||||||
del arch_config_yaml["llm_providers"]
|
del plano_config_yaml["llm_providers"]
|
||||||
|
|
||||||
listeners, _, _ = convert_legacy_listeners(
|
listeners, _, _ = convert_legacy_listeners(
|
||||||
arch_config_yaml.get("listeners"), arch_config_yaml.get("model_providers")
|
plano_config_yaml.get("listeners"), plano_config_yaml.get("model_providers")
|
||||||
)
|
)
|
||||||
|
|
||||||
for prompt_target in arch_config_yaml.get("prompt_targets", []):
|
for prompt_target in plano_config_yaml.get("prompt_targets", []):
|
||||||
for k, v in prompt_target.get("endpoint", {}).get("http_headers", {}).items():
|
for k, v in prompt_target.get("endpoint", {}).get("http_headers", {}).items():
|
||||||
if k.lower() == "authorization":
|
if k.lower() == "authorization":
|
||||||
print(
|
print(
|
||||||
|
|
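The legacy-key conversion at the top of this hunk (`llm_providers` renamed to `model_providers`, with an error when both are present) can be shown in isolation. This is a sketch under assumptions: `migrate_llm_providers` is a hypothetical helper, and it raises `ValueError` where the hunk raises a plain `Exception`.

```python
def migrate_llm_providers(config):
    """Rename the deprecated llm_providers key to model_providers in place."""
    if "llm_providers" in config:
        if "model_providers" in config:
            # The hunk raises a generic Exception; ValueError is used here
            # only as a more specific stand-in.
            raise ValueError(
                "Please provide either llm_providers or model_providers, not both"
            )
        config["model_providers"] = config["llm_providers"]
        del config["llm_providers"]
    return config


legacy = {"version": "v0.1.0", "llm_providers": [{"model": "openai/gpt-4o"}]}
print(migrate_llm_providers(legacy))
```

A config that already uses `model_providers` passes through unchanged.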
@@ -200,7 +200,7 @@ def get_llm_provider_access_keys(arch_config_file):
|
||||||
access_key_list.append(access_key)
|
access_key_list.append(access_key)
|
||||||
|
|
||||||
# Extract environment variables from state_storage.connection_string
|
# Extract environment variables from state_storage.connection_string
|
||||||
state_storage = arch_config_yaml.get("state_storage_v1_responses")
|
state_storage = plano_config_yaml.get("state_storage_v1_responses")
|
||||||
if state_storage:
|
if state_storage:
|
||||||
connection_string = state_storage.get("connection_string")
|
connection_string = state_storage.get("connection_string")
|
||||||
if connection_string and isinstance(connection_string, str):
|
if connection_string and isinstance(connection_string, str):
|
||||||
|
|
@@ -251,16 +251,16 @@ def find_config_file(path=".", file=None):
|
||||||
# If a file is provided, process that file
|
# If a file is provided, process that file
|
||||||
return os.path.abspath(file)
|
return os.path.abspath(file)
|
||||||
else:
|
else:
|
||||||
# If no file is provided, use the path and look for arch_config.yaml first, then config.yaml for convenience
|
# If no file is provided, use the path and look for plano_config.yaml first, then config.yaml for convenience
|
||||||
arch_config_file = os.path.abspath(os.path.join(path, "config.yaml"))
|
plano_config_file = os.path.abspath(os.path.join(path, "config.yaml"))
|
||||||
if not os.path.exists(arch_config_file):
|
if not os.path.exists(plano_config_file):
|
||||||
arch_config_file = os.path.abspath(os.path.join(path, "arch_config.yaml"))
|
plano_config_file = os.path.abspath(os.path.join(path, "plano_config.yaml"))
|
||||||
return arch_config_file
|
return plano_config_file
|
||||||
|
|
||||||
|
|
||||||
def stream_access_logs(follow):
|
def stream_access_logs(follow):
|
||||||
"""
|
"""
|
||||||
Get the archgw access logs
|
Get the plano access logs
|
||||||
"""
|
"""
|
||||||
|
|
||||||
follow_arg = "-f" if follow else ""
|
follow_arg = "-f" if follow else ""
|
||||||
|
|
|
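The lookup order in the `find_config_file` hunk can be exercised against a temporary directory. The sketch below is an approximation: `resolve_config_path` is a hypothetical name, and the `if file:` condition is an assumption, since the hunk does not show that line. Note that the code in the hunk checks `config.yaml` before `plano_config.yaml`, although the adjacent comment lists them in the opposite order.

```python
import os
import tempfile


def resolve_config_path(path=".", file=None):
    """Return an explicit file if given, else config.yaml, else plano_config.yaml."""
    if file:  # assumed condition; not visible in the hunk
        return os.path.abspath(file)
    candidate = os.path.abspath(os.path.join(path, "config.yaml"))
    if not os.path.exists(candidate):
        # Fallback path is returned even if it does not exist either.
        candidate = os.path.abspath(os.path.join(path, "plano_config.yaml"))
    return candidate


with tempfile.TemporaryDirectory() as workdir:
    # Empty directory: the fallback name is returned.
    before = os.path.basename(resolve_config_path(workdir))
    # Once config.yaml exists, it wins.
    open(os.path.join(workdir, "config.yaml"), "w").close()
    after = os.path.basename(resolve_config_path(workdir))

print(before, after)
```

Callers still need their own existence check, as `cli_agent` does with `os.path.exists`, because the fallback path is returned unconditionally.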
||||||
|
|
@@ -12,14 +12,14 @@ def cleanup_env(monkeypatch):
|
||||||
|
|
||||||
|
|
||||||
def test_validate_and_render_happy_path(monkeypatch):
|
def test_validate_and_render_happy_path(monkeypatch):
|
||||||
monkeypatch.setenv("ARCH_CONFIG_FILE", "fake_arch_config.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_FILE", "fake_plano_config.yaml")
|
||||||
monkeypatch.setenv("ARCH_CONFIG_SCHEMA_FILE", "fake_arch_config_schema.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_SCHEMA_FILE", "fake_plano_config_schema.yaml")
|
||||||
monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
|
monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
|
||||||
monkeypatch.setenv("ARCH_CONFIG_FILE_RENDERED", "fake_arch_config_rendered.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_FILE_RENDERED", "fake_plano_config_rendered.yaml")
|
||||||
monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
|
monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
|
||||||
monkeypatch.setenv("TEMPLATE_ROOT", "../")
|
monkeypatch.setenv("TEMPLATE_ROOT", "../")
|
||||||
|
|
||||||
arch_config = """
|
plano_config = """
|
||||||
version: v0.1.0
|
version: v0.1.0
|
||||||
|
|
||||||
listeners:
|
listeners:
|
||||||
|
|
@@ -50,24 +50,24 @@ llm_providers:
|
||||||
tracing:
|
tracing:
|
||||||
random_sampling: 100
|
random_sampling: 100
|
||||||
"""
|
"""
|
||||||
arch_config_schema = ""
|
plano_config_schema = ""
|
||||||
with open("../config/arch_config_schema.yaml", "r") as file:
|
with open("../config/plano_config_schema.yaml", "r") as file:
|
||||||
arch_config_schema = file.read()
|
plano_config_schema = file.read()
|
||||||
|
|
||||||
m_open = mock.mock_open()
|
m_open = mock.mock_open()
|
||||||
# Provide enough file handles for all open() calls in validate_and_render_schema
|
# Provide enough file handles for all open() calls in validate_and_render_schema
|
||||||
m_open.side_effect = [
|
m_open.side_effect = [
|
||||||
# Removed empty read - was causing validation failures
|
# Removed empty read - was causing validation failures
|
||||||
mock.mock_open(read_data=arch_config).return_value, # ARCH_CONFIG_FILE
|
mock.mock_open(read_data=plano_config).return_value, # PLANO_CONFIG_FILE
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config_schema
|
read_data=plano_config_schema
|
||||||
).return_value, # ARCH_CONFIG_SCHEMA_FILE
|
).return_value, # PLANO_CONFIG_SCHEMA_FILE
|
||||||
mock.mock_open(read_data=arch_config).return_value, # ARCH_CONFIG_FILE
|
mock.mock_open(read_data=plano_config).return_value, # PLANO_CONFIG_FILE
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config_schema
|
read_data=plano_config_schema
|
||||||
).return_value, # ARCH_CONFIG_SCHEMA_FILE
|
).return_value, # PLANO_CONFIG_SCHEMA_FILE
|
||||||
mock.mock_open().return_value, # ENVOY_CONFIG_FILE_RENDERED (write)
|
mock.mock_open().return_value, # ENVOY_CONFIG_FILE_RENDERED (write)
|
||||||
mock.mock_open().return_value, # ARCH_CONFIG_FILE_RENDERED (write)
|
mock.mock_open().return_value, # PLANO_CONFIG_FILE_RENDERED (write)
|
||||||
]
|
]
|
||||||
with mock.patch("builtins.open", m_open):
|
with mock.patch("builtins.open", m_open):
|
||||||
with mock.patch("planoai.config_generator.Environment"):
|
with mock.patch("planoai.config_generator.Environment"):
|
||||||
|
|
@@ -75,14 +75,14 @@ tracing:
|
||||||
|
|
||||||
|
|
||||||
def test_validate_and_render_happy_path_agent_config(monkeypatch):
|
def test_validate_and_render_happy_path_agent_config(monkeypatch):
|
||||||
monkeypatch.setenv("ARCH_CONFIG_FILE", "fake_arch_config.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_FILE", "fake_plano_config.yaml")
|
||||||
monkeypatch.setenv("ARCH_CONFIG_SCHEMA_FILE", "fake_arch_config_schema.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_SCHEMA_FILE", "fake_plano_config_schema.yaml")
|
||||||
monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
|
monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
|
||||||
monkeypatch.setenv("ARCH_CONFIG_FILE_RENDERED", "fake_arch_config_rendered.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_FILE_RENDERED", "fake_plano_config_rendered.yaml")
|
||||||
monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
|
monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
|
||||||
monkeypatch.setenv("TEMPLATE_ROOT", "../")
|
monkeypatch.setenv("TEMPLATE_ROOT", "../")
|
||||||
|
|
||||||
arch_config = """
|
plano_config = """
|
||||||
version: v0.3.0
|
version: v0.3.0
|
||||||
|
|
||||||
agents:
|
agents:
|
||||||
|
|
@@ -123,35 +123,35 @@ model_providers:
|
||||||
- access_key: ${OPENAI_API_KEY}
|
- access_key: ${OPENAI_API_KEY}
|
||||||
model: openai/gpt-4o
|
model: openai/gpt-4o
|
||||||
"""
|
"""
|
||||||
arch_config_schema = ""
|
plano_config_schema = ""
|
||||||
with open("../config/arch_config_schema.yaml", "r") as file:
|
with open("../config/plano_config_schema.yaml", "r") as file:
|
||||||
arch_config_schema = file.read()
|
plano_config_schema = file.read()
|
||||||
|
|
||||||
m_open = mock.mock_open()
|
m_open = mock.mock_open()
|
||||||
# Provide enough file handles for all open() calls in validate_and_render_schema
|
# Provide enough file handles for all open() calls in validate_and_render_schema
|
||||||
m_open.side_effect = [
|
m_open.side_effect = [
|
||||||
# Removed empty read - was causing validation failures
|
# Removed empty read - was causing validation failures
|
||||||
mock.mock_open(read_data=arch_config).return_value, # ARCH_CONFIG_FILE
|
mock.mock_open(read_data=plano_config).return_value, # PLANO_CONFIG_FILE
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config_schema
|
read_data=plano_config_schema
|
||||||
).return_value, # ARCH_CONFIG_SCHEMA_FILE
|
).return_value, # PLANO_CONFIG_SCHEMA_FILE
|
||||||
mock.mock_open(read_data=arch_config).return_value, # ARCH_CONFIG_FILE
|
mock.mock_open(read_data=plano_config).return_value, # PLANO_CONFIG_FILE
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config_schema
|
read_data=plano_config_schema
|
||||||
).return_value, # ARCH_CONFIG_SCHEMA_FILE
|
).return_value, # PLANO_CONFIG_SCHEMA_FILE
|
||||||
mock.mock_open().return_value, # ENVOY_CONFIG_FILE_RENDERED (write)
|
mock.mock_open().return_value, # ENVOY_CONFIG_FILE_RENDERED (write)
|
||||||
mock.mock_open().return_value, # ARCH_CONFIG_FILE_RENDERED (write)
|
mock.mock_open().return_value, # PLANO_CONFIG_FILE_RENDERED (write)
|
||||||
]
|
]
|
||||||
with mock.patch("builtins.open", m_open):
|
with mock.patch("builtins.open", m_open):
|
||||||
with mock.patch("planoai.config_generator.Environment"):
|
with mock.patch("planoai.config_generator.Environment"):
|
||||||
validate_and_render_schema()
|
validate_and_render_schema()
|
||||||
|
|
||||||
|
|
||||||
arch_config_test_cases = [
|
plano_config_test_cases = [
|
||||||
{
|
{
|
||||||
"id": "duplicate_provider_name",
|
"id": "duplicate_provider_name",
|
||||||
"expected_error": "Duplicate model_provider name",
|
"expected_error": "Duplicate model_provider name",
|
||||||
"arch_config": """
|
"plano_config": """
|
||||||
version: v0.1.0
|
version: v0.1.0
|
||||||
|
|
||||||
listeners:
|
listeners:
|
||||||
|
|
@@ -176,7 +176,7 @@ llm_providers:
|
||||||
{
|
{
|
||||||
"id": "provider_interface_with_model_id",
|
"id": "provider_interface_with_model_id",
|
||||||
"expected_error": "Please provide provider interface as part of model name",
|
"expected_error": "Please provide provider interface as part of model name",
|
||||||
"arch_config": """
|
"plano_config": """
|
||||||
version: v0.1.0
|
version: v0.1.0
|
||||||
|
|
||||||
listeners:
|
listeners:
|
||||||
|
|
@@ -197,7 +197,7 @@ llm_providers:
|
||||||
{
|
{
|
||||||
"id": "duplicate_model_id",
|
"id": "duplicate_model_id",
|
||||||
"expected_error": "Duplicate model_id",
|
"expected_error": "Duplicate model_id",
|
||||||
"arch_config": """
|
"plano_config": """
|
||||||
version: v0.1.0
|
version: v0.1.0
|
||||||
|
|
||||||
listeners:
|
listeners:
|
||||||
|
|
@@ -219,7 +219,7 @@ llm_providers:
|
||||||
{
|
{
|
||||||
"id": "custom_provider_base_url",
|
"id": "custom_provider_base_url",
|
||||||
"expected_error": "Must provide base_url and provider_interface",
|
"expected_error": "Must provide base_url and provider_interface",
|
||||||
"arch_config": """
|
"plano_config": """
|
||||||
version: v0.1.0
|
version: v0.1.0
|
||||||
|
|
||||||
listeners:
|
listeners:
|
||||||
|
|
@@ -237,7 +237,7 @@ llm_providers:
|
||||||
{
|
{
|
||||||
"id": "base_url_with_path_prefix",
|
"id": "base_url_with_path_prefix",
|
||||||
"expected_error": None,
|
"expected_error": None,
|
||||||
"arch_config": """
|
"plano_config": """
|
||||||
version: v0.1.0
|
version: v0.1.0
|
||||||
|
|
||||||
listeners:
|
listeners:
|
||||||
|
|
@@ -258,7 +258,7 @@ llm_providers:
|
||||||
{
|
{
|
||||||
"id": "duplicate_routeing_preference_name",
|
"id": "duplicate_routeing_preference_name",
|
||||||
"expected_error": "Duplicate routing preference name",
|
"expected_error": "Duplicate routing preference name",
|
||||||
"arch_config": """
|
"plano_config": """
|
||||||
version: v0.1.0
|
version: v0.1.0
|
||||||
|
|
||||||
listeners:
|
listeners:
|
||||||
|
|
@@ -295,42 +295,42 @@ tracing:
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.parametrize(
|
@pytest.mark.parametrize(
|
||||||
"arch_config_test_case",
|
"plano_config_test_case",
|
||||||
arch_config_test_cases,
|
plano_config_test_cases,
|
||||||
ids=[case["id"] for case in arch_config_test_cases],
|
ids=[case["id"] for case in plano_config_test_cases],
|
||||||
)
|
)
|
||||||
def test_validate_and_render_schema_tests(monkeypatch, arch_config_test_case):
|
def test_validate_and_render_schema_tests(monkeypatch, plano_config_test_case):
|
||||||
monkeypatch.setenv("ARCH_CONFIG_FILE", "fake_arch_config.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_FILE", "fake_plano_config.yaml")
|
||||||
monkeypatch.setenv("ARCH_CONFIG_SCHEMA_FILE", "fake_arch_config_schema.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_SCHEMA_FILE", "fake_plano_config_schema.yaml")
|
||||||
monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
|
monkeypatch.setenv("ENVOY_CONFIG_TEMPLATE_FILE", "./envoy.template.yaml")
|
||||||
monkeypatch.setenv("ARCH_CONFIG_FILE_RENDERED", "fake_arch_config_rendered.yaml")
|
monkeypatch.setenv("PLANO_CONFIG_FILE_RENDERED", "fake_plano_config_rendered.yaml")
|
||||||
monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
|
monkeypatch.setenv("ENVOY_CONFIG_FILE_RENDERED", "fake_envoy.yaml")
|
||||||
monkeypatch.setenv("TEMPLATE_ROOT", "../")
|
monkeypatch.setenv("TEMPLATE_ROOT", "../")
|
||||||
|
|
||||||
arch_config = arch_config_test_case["arch_config"]
|
plano_config = plano_config_test_case["plano_config"]
|
||||||
expected_error = arch_config_test_case.get("expected_error")
|
expected_error = plano_config_test_case.get("expected_error")
|
||||||
|
|
||||||
arch_config_schema = ""
|
plano_config_schema = ""
|
||||||
with open("../config/arch_config_schema.yaml", "r") as file:
|
with open("../config/plano_config_schema.yaml", "r") as file:
|
||||||
arch_config_schema = file.read()
|
plano_config_schema = file.read()
|
||||||
|
|
||||||
m_open = mock.mock_open()
|
m_open = mock.mock_open()
|
||||||
# Provide enough file handles for all open() calls in validate_and_render_schema
|
# Provide enough file handles for all open() calls in validate_and_render_schema
|
||||||
m_open.side_effect = [
|
m_open.side_effect = [
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config
|
read_data=plano_config
|
||||||
).return_value, # validate_prompt_config: ARCH_CONFIG_FILE
|
).return_value, # validate_prompt_config: PLANO_CONFIG_FILE
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config_schema
|
read_data=plano_config_schema
|
||||||
).return_value, # validate_prompt_config: ARCH_CONFIG_SCHEMA_FILE
|
).return_value, # validate_prompt_config: PLANO_CONFIG_SCHEMA_FILE
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config
|
read_data=plano_config
|
||||||
).return_value, # validate_and_render_schema: ARCH_CONFIG_FILE
|
).return_value, # validate_and_render_schema: PLANO_CONFIG_FILE
|
||||||
mock.mock_open(
|
mock.mock_open(
|
||||||
read_data=arch_config_schema
|
read_data=plano_config_schema
|
||||||
).return_value, # validate_and_render_schema: ARCH_CONFIG_SCHEMA_FILE
|
).return_value, # validate_and_render_schema: PLANO_CONFIG_SCHEMA_FILE
|
||||||
mock.mock_open().return_value, # ENVOY_CONFIG_FILE_RENDERED (write)
|
mock.mock_open().return_value, # ENVOY_CONFIG_FILE_RENDERED (write)
|
||||||
mock.mock_open().return_value, # ARCH_CONFIG_FILE_RENDERED (write)
|
mock.mock_open().return_value, # PLANO_CONFIG_FILE_RENDERED (write)
|
||||||
]
|
]
|
||||||
with mock.patch("builtins.open", m_open):
|
with mock.patch("builtins.open", m_open):
|
||||||
with mock.patch("planoai.config_generator.Environment"):
|
with mock.patch("planoai.config_generator.Environment"):
|
||||||
|
|
|
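The tests in this file feed `open()` a sequence of fake file handles through `m_open.side_effect`. A minimal standalone sketch of that pattern follows; `read_two` and the file names are hypothetical stand-ins for `validate_and_render_schema` and its inputs.

```python
from unittest import mock


def read_two(path_a, path_b):
    """Open two files in order and return their contents."""
    with open(path_a) as first:
        a = first.read()
    with open(path_b) as second:
        b = second.read()
    return a, b


m_open = mock.mock_open()
# Each call to open() consumes the next handle from this list, so the
# order must match the order of open() calls in the code under test.
m_open.side_effect = [
    mock.mock_open(read_data="first").return_value,
    mock.mock_open(read_data="second").return_value,
]

with mock.patch("builtins.open", m_open):
    contents = read_two("a.yaml", "b.yaml")

print(contents)
```

If the code under test calls `open()` more times than there are handles in the list, the mock raises `StopIteration`, which is why the test comments ask for enough handles to cover every call.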
||||||
2
cli/uv.lock
generated
|
|
@@ -337,7 +337,7 @@ wheels = [
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "planoai"
|
name = "planoai"
|
||||||
version = "0.4.4"
|
version = "0.4.6"
|
||||||
source = { editable = "." }
|
source = { editable = "." }
|
||||||
dependencies = [
|
dependencies = [
|
||||||
{ name = "click" },
|
{ name = "click" },
|
||||||
|
|
|
||||||
|
|
@@ -8,14 +8,14 @@ services:
|
||||||
- "12000:12000"
|
- "12000:12000"
|
||||||
- "19901:9901"
|
- "19901:9901"
|
||||||
volumes:
|
volumes:
|
||||||
- ${ARCH_CONFIG_FILE:-../demos/samples_python/weather_forecast/arch_config.yaml}:/app/arch_config.yaml
|
- ${PLANO_CONFIG_FILE:-../demos/samples_python/weather_forecast/plano_config.yaml}:/app/plano_config.yaml
|
||||||
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
||||||
- ./envoy.template.yaml:/app/envoy.template.yaml
|
- ./envoy.template.yaml:/app/envoy.template.yaml
|
||||||
- ./arch_config_schema.yaml:/app/arch_config_schema.yaml
|
- ./plano_config_schema.yaml:/app/plano_config_schema.yaml
|
||||||
- ../cli/planoai/config_generator.py:/app/planoai/config_generator.py
|
- ../cli/planoai/config_generator.py:/app/planoai/config_generator.py
|
||||||
- ../crates/target/wasm32-wasip1/release/llm_gateway.wasm:/etc/envoy/proxy-wasm-plugins/llm_gateway.wasm
|
- ../crates/target/wasm32-wasip1/release/llm_gateway.wasm:/etc/envoy/proxy-wasm-plugins/llm_gateway.wasm
|
||||||
- ../crates/target/wasm32-wasip1/release/prompt_gateway.wasm:/etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm
|
- ../crates/target/wasm32-wasip1/release/prompt_gateway.wasm:/etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm
|
||||||
- ~/archgw_logs:/var/log/
|
- ~/plano_logs:/var/log/
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
environment:
|
environment:
|
||||||
|
|
|
||||||
|
|
@@ -40,7 +40,7 @@ static_resources:
|
||||||
- name: envoy.filters.network.http_connection_manager
|
- name: envoy.filters.network.http_connection_manager
|
||||||
typed_config:
|
typed_config:
|
||||||
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
||||||
{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
|
{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
|
||||||
generate_request_id: true
|
generate_request_id: true
|
||||||
tracing:
|
tracing:
|
||||||
provider:
|
provider:
|
||||||
|
|
@@ -53,7 +53,7 @@ static_resources:
|
||||||
timeout: 0.250s
|
timeout: 0.250s
|
||||||
service_name: plano(inbound)
|
service_name: plano(inbound)
|
||||||
random_sampling:
|
random_sampling:
|
||||||
value: {{ arch_tracing.random_sampling }}
|
value: {{ plano_tracing.random_sampling }}
|
||||||
operation: "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
|
operation: "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
|
||||||
{% endif %}
|
{% endif %}
|
||||||
stat_prefix: plano(inbound)
|
stat_prefix: plano(inbound)
|
||||||
|
|
@@ -114,7 +114,7 @@ static_resources:
|
||||||
domains:
|
domains:
|
||||||
- "*"
|
- "*"
|
||||||
routes:
|
routes:
|
||||||
{% for provider in arch_model_providers %}
|
{% for provider in plano_model_providers %}
|
||||||
# if endpoint is set then use custom cluster for upstream llm
|
# if endpoint is set then use custom cluster for upstream llm
|
||||||
{% if provider.endpoint %}
|
{% if provider.endpoint %}
|
||||||
{% set llm_cluster_name = provider.cluster_name %}
|
{% set llm_cluster_name = provider.cluster_name %}
|
||||||
|
|
@@ -166,7 +166,7 @@ static_resources:
|
||||||
configuration:
|
configuration:
|
||||||
"@type": "type.googleapis.com/google.protobuf.StringValue"
|
"@type": "type.googleapis.com/google.protobuf.StringValue"
|
||||||
value: |
|
value: |
|
||||||
{{ arch_config | indent(32) }}
|
{{ plano_config | indent(32) }}
|
||||||
vm_config:
|
vm_config:
|
||||||
runtime: "envoy.wasm.runtime.v8"
|
runtime: "envoy.wasm.runtime.v8"
|
||||||
code:
|
code:
|
||||||
|
|
@@ -183,7 +183,7 @@ static_resources:
|
||||||
configuration:
|
configuration:
|
||||||
"@type": "type.googleapis.com/google.protobuf.StringValue"
|
"@type": "type.googleapis.com/google.protobuf.StringValue"
|
||||||
value: |
|
value: |
|
||||||
{{ arch_llm_config | indent(32) }}
|
{{ plano_llm_config | indent(32) }}
|
||||||
vm_config:
|
vm_config:
|
||||||
runtime: "envoy.wasm.runtime.v8"
|
runtime: "envoy.wasm.runtime.v8"
|
||||||
code:
|
code:
|
||||||
|
|
@@ -215,7 +215,7 @@ static_resources:
|
||||||
- name: envoy.filters.network.http_connection_manager
|
- name: envoy.filters.network.http_connection_manager
|
||||||
typed_config:
|
typed_config:
|
||||||
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
||||||
# {% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
|
# {% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
|
||||||
# generate_request_id: true
|
# generate_request_id: true
|
||||||
# tracing:
|
# tracing:
|
||||||
# provider:
|
# provider:
|
||||||
|
|
@@ -228,7 +228,7 @@ static_resources:
|
||||||
# timeout: 0.250s
|
# timeout: 0.250s
|
||||||
# service_name: tools
|
# service_name: tools
|
||||||
# random_sampling:
|
# random_sampling:
|
||||||
# value: {{ arch_tracing.random_sampling }}
|
# value: {{ plano_tracing.random_sampling }}
|
||||||
# {% endif %}
|
# {% endif %}
|
||||||
stat_prefix: outbound_api_traffic
|
stat_prefix: outbound_api_traffic
|
||||||
codec_type: AUTO
|
codec_type: AUTO
|
||||||
|
|
@@ -258,7 +258,7 @@ static_resources:
|
||||||
auto_host_rewrite: true
|
auto_host_rewrite: true
|
||||||
cluster: bright_staff
|
cluster: bright_staff
|
||||||
timeout: 300s
|
timeout: 300s
|
||||||
{% for cluster_name, cluster in arch_clusters.items() %}
|
{% for cluster_name, cluster in plano_clusters.items() %}
|
||||||
- match:
|
- match:
|
||||||
prefix: "/"
|
prefix: "/"
|
||||||
headers:
|
headers:
|
||||||
|
|
@@ -290,7 +290,7 @@ static_resources:
|
||||||
- name: envoy.filters.network.http_connection_manager
|
- name: envoy.filters.network.http_connection_manager
|
||||||
typed_config:
|
typed_config:
|
||||||
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
||||||
{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
|
{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
|
||||||
generate_request_id: true
|
generate_request_id: true
|
||||||
tracing:
|
tracing:
|
||||||
provider:
|
provider:
|
||||||
|
|
@@ -303,7 +303,7 @@ static_resources:
|
||||||
timeout: 0.250s
|
timeout: 0.250s
|
||||||
service_name: plano(inbound)
|
service_name: plano(inbound)
|
||||||
random_sampling:
|
random_sampling:
|
||||||
value: {{ arch_tracing.random_sampling }}
|
value: {{ plano_tracing.random_sampling }}
|
||||||
operation: "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
|
operation: "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
|
||||||
{% endif %}
|
{% endif %}
|
||||||
stat_prefix: {{ listener.name | replace(" ", "_") }}_traffic
|
stat_prefix: {{ listener.name | replace(" ", "_") }}_traffic
|
||||||
|
|
@@ -467,7 +467,7 @@ static_resources:
|
||||||
- name: envoy.filters.network.http_connection_manager
|
- name: envoy.filters.network.http_connection_manager
|
||||||
typed_config:
|
typed_config:
|
||||||
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
|
||||||
{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
|
{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
|
||||||
generate_request_id: true
|
generate_request_id: true
|
||||||
tracing:
|
tracing:
|
||||||
provider:
|
provider:
|
||||||
|
|
@@ -480,7 +480,7 @@ static_resources:
|
||||||
timeout: 0.250s
|
timeout: 0.250s
|
||||||
service_name: plano(outbound)
|
service_name: plano(outbound)
|
||||||
random_sampling:
|
random_sampling:
|
||||||
value: {{ arch_tracing.random_sampling }}
|
value: {{ plano_tracing.random_sampling }}
|
||||||
operation: "%REQ(:METHOD)% %REQ(:AUTHORITY)%%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
|
operation: "%REQ(:METHOD)% %REQ(:AUTHORITY)%%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
|
||||||
{% endif %}
|
{% endif %}
|
||||||
stat_prefix: egress_traffic
|
stat_prefix: egress_traffic
|
||||||
|
|
@@ -501,7 +501,7 @@ static_resources:
|
||||||
domains:
|
domains:
|
||||||
- "*"
|
- "*"
|
||||||
routes:
|
routes:
|
||||||
{% for provider in arch_model_providers %}
|
{% for provider in plano_model_providers %}
|
||||||
# if endpoint is set then use custom cluster for upstream llm
|
# if endpoint is set then use custom cluster for upstream llm
|
||||||
{% if provider.endpoint %}
|
{% if provider.endpoint %}
|
||||||
{% set llm_cluster_name = provider.cluster_name %}
|
{% set llm_cluster_name = provider.cluster_name %}
|
||||||
|
|
@@ -564,7 +564,7 @@ static_resources:
|
||||||
configuration:
|
configuration:
|
||||||
"@type": "type.googleapis.com/google.protobuf.StringValue"
|
"@type": "type.googleapis.com/google.protobuf.StringValue"
|
||||||
value: |
|
value: |
|
||||||
{{ arch_llm_config | indent(32) }}
|
{{ plano_llm_config | indent(32) }}
|
||||||
vm_config:
|
vm_config:
|
||||||
runtime: "envoy.wasm.runtime.v8"
|
runtime: "envoy.wasm.runtime.v8"
|
||||||
code:
|
code:
|
||||||
|
|
@@ -879,7 +879,7 @@ static_resources:
|
||||||
address: mistral_7b_instruct
|
address: mistral_7b_instruct
|
||||||
port_value: 10001
|
port_value: 10001
|
||||||
hostname: "mistral_7b_instruct"
|
hostname: "mistral_7b_instruct"
|
||||||
{% for cluster_name, cluster in arch_clusters.items() %}
|
{% for cluster_name, cluster in plano_clusters.items() %}
|
||||||
- name: {{ cluster_name }}
|
- name: {{ cluster_name }}
|
||||||
{% if cluster.connect_timeout -%}
|
{% if cluster.connect_timeout -%}
|
||||||
connect_timeout: {{ cluster.connect_timeout }}
|
connect_timeout: {{ cluster.connect_timeout }}
|
||||||
|
|
@ -1013,7 +1013,7 @@ static_resources:
|
||||||
port_value: 12001
|
port_value: 12001
|
||||||
hostname: arch_listener_llm
|
hostname: arch_listener_llm
|
||||||
|
|
||||||
{% if "random_sampling" in arch_tracing and arch_tracing["random_sampling"] > 0 %}
|
{% if "random_sampling" in plano_tracing and plano_tracing["random_sampling"] > 0 %}
|
||||||
- name: opentelemetry_collector
|
- name: opentelemetry_collector
|
||||||
type: STRICT_DNS
|
type: STRICT_DNS
|
||||||
dns_lookup_family: V4_ONLY
|
dns_lookup_family: V4_ONLY
|
||||||
|
|
@ -1030,7 +1030,7 @@ static_resources:
|
||||||
- endpoint:
|
- endpoint:
|
||||||
address:
|
address:
|
||||||
socket_address:
|
socket_address:
|
||||||
{% set _otel_endpoint = arch_tracing.opentracing_grpc_endpoint | default('host.docker.internal:4317') | replace("http://", "") | replace("https://", "") %}
|
{% set _otel_endpoint = plano_tracing.opentracing_grpc_endpoint | default('host.docker.internal:4317') | replace("http://", "") | replace("https://", "") %}
|
||||||
address: {{ _otel_endpoint.split(":") | first }}
|
address: {{ _otel_endpoint.split(":") | first }}
|
||||||
port_value: {{ _otel_endpoint.split(":") | last }}
|
port_value: {{ _otel_endpoint.split(":") | last }}
|
||||||
{% endif %}
|
{% endif %}
|
||||||
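Review note: the `_otel_endpoint` filter chain in the hunk above strips an `http://` or `https://` scheme and then takes the first and last `:`-separated pieces as host and port. A minimal Python sketch of the same logic follows, assuming the default shown in the template; the function name `split_otel_endpoint` is hypothetical and is not part of the repository.

```python
from typing import Tuple


def split_otel_endpoint(endpoint: str = "host.docker.internal:4317") -> Tuple[str, int]:
    """Hypothetical helper mirroring the template's filter chain."""
    # Same two replace() filters as the Jinja template: drop the scheme if present.
    stripped = endpoint.replace("http://", "").replace("https://", "")
    # `split(":") | first` gives the host, `split(":") | last` gives the port.
    parts = stripped.split(":")
    return parts[0], int(parts[-1])
```

In the template, an endpoint without a port would render the host as `port_value`, since first and last are then the same piece. In this sketch, `int()` raises `ValueError` in that case instead.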
|
|
|
||||||
|
|
@ -3,9 +3,9 @@ nodaemon=true
|
||||||
|
|
||||||
[program:brightstaff]
|
[program:brightstaff]
|
||||||
command=sh -c "\
|
command=sh -c "\
|
||||||
envsubst < /app/arch_config_rendered.yaml > /app/arch_config_rendered.env_sub.yaml && \
|
envsubst < /app/plano_config_rendered.yaml > /app/plano_config_rendered.env_sub.yaml && \
|
||||||
RUST_LOG=${LOG_LEVEL:-info} \
|
RUST_LOG=${LOG_LEVEL:-info} \
|
||||||
ARCH_CONFIG_PATH_RENDERED=/app/arch_config_rendered.env_sub.yaml \
|
PLANO_CONFIG_PATH_RENDERED=/app/plano_config_rendered.env_sub.yaml \
|
||||||
/app/brightstaff 2>&1 | \
|
/app/brightstaff 2>&1 | \
|
||||||
tee /var/log/brightstaff.log | \
|
tee /var/log/brightstaff.log | \
|
||||||
while IFS= read -r line; do echo '[brightstaff]' \"$line\"; done"
|
while IFS= read -r line; do echo '[brightstaff]' \"$line\"; done"
|
||||||
|
|
|
||||||
|
|
@ -7,7 +7,7 @@
|
||||||
#
|
#
|
||||||
# To test:
|
# To test:
|
||||||
# docker build -t plano-passthrough-test .
|
# docker build -t plano-passthrough-test .
|
||||||
# docker run -d -p 10000:10000 -v $(pwd)/config/test_passthrough.yaml:/app/arch_config.yaml plano-passthrough-test
|
# docker run -d -p 10000:10000 -v $(pwd)/config/test_passthrough.yaml:/app/plano_config.yaml plano-passthrough-test
|
||||||
#
|
#
|
||||||
# curl http://localhost:10000/v1/chat/completions \
|
# curl http://localhost:10000/v1/chat/completions \
|
||||||
# -H "Authorization: Bearer sk-your-virtual-key" \
|
# -H "Authorization: Bearer sk-your-virtual-key" \
|
||||||
|
|
|
||||||
|
|
@ -2,10 +2,10 @@
|
||||||
|
|
||||||
failed_files=()
|
failed_files=()
|
||||||
|
|
||||||
for file in $(find . -name config.yaml -o -name arch_config_full_reference.yaml); do
|
for file in $(find . -name config.yaml -o -name plano_config_full_reference.yaml); do
|
||||||
echo "Validating ${file}..."
|
echo "Validating ${file}..."
|
||||||
touch $(pwd)/${file}_rendered
|
touch $(pwd)/${file}_rendered
|
||||||
if ! docker run --rm -v "$(pwd)/${file}:/app/arch_config.yaml:ro" -v "$(pwd)/${file}_rendered:/app/arch_config_rendered.yaml:rw" --entrypoint /bin/sh katanemo/plano:0.4.6 -c "python -m planoai.config_generator" 2>&1 > /dev/null ; then
|
if ! docker run --rm -v "$(pwd)/${file}:/app/plano_config.yaml:ro" -v "$(pwd)/${file}_rendered:/app/plano_config_rendered.yaml:rw" --entrypoint /bin/sh katanemo/plano:0.4.6 -c "python -m planoai.config_generator" 2>&1 > /dev/null ; then
|
||||||
echo "Validation failed for $file"
|
echo "Validation failed for $file"
|
||||||
failed_files+=("$file")
|
failed_files+=("$file")
|
||||||
fi
|
fi
|
||||||
|
|
|
||||||
|
|
@ -210,8 +210,8 @@ async fn llm_chat_inner(
|
||||||
// Set the model to just the model name (without provider prefix)
|
// Set the model to just the model name (without provider prefix)
|
||||||
// This ensures upstream receives "gpt-4" not "openai/gpt-4"
|
// This ensures upstream receives "gpt-4" not "openai/gpt-4"
|
||||||
client_request.set_model(model_name_only.clone());
|
client_request.set_model(model_name_only.clone());
|
||||||
if client_request.remove_metadata_key("archgw_preference_config") {
|
if client_request.remove_metadata_key("plano_preference_config") {
|
||||||
debug!("removed archgw_preference_config from metadata");
|
debug!("removed plano_preference_config from metadata");
|
||||||
}
|
}
|
||||||
|
|
||||||
// === v1/responses state management: Determine upstream API and combine input if needed ===
|
// === v1/responses state management: Determine upstream API and combine input if needed ===
|
||||||
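Review note: the hunk above sends the upstream provider a bare model name and removes the renamed `plano_preference_config` metadata key before forwarding. The following Python sketch illustrates that behaviour on a plain dict. It assumes the provider prefix is everything before the first `/` (the diff does not show how `model_name_only` is computed), and `prepare_upstream_request` is a hypothetical name.

```python
def prepare_upstream_request(request: dict) -> dict:
    """Hypothetical illustration; the real logic lives in the Rust handler."""
    prepared = dict(request)
    # Assumption: "openai/gpt-4" -> "gpt-4"; a name without "/" is left unchanged.
    prepared["model"] = request["model"].split("/", 1)[-1]
    # Copy the metadata so the caller's dict is not mutated, then drop the routing key.
    metadata = dict(request.get("metadata") or {})
    metadata.pop("plano_preference_config", None)
    prepared["metadata"] = metadata
    return prepared
```

Clients that still send the old `archgw_preference_config` key would not match the renamed lookup, so that key would pass through untouched in this sketch.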
|
|
|
||||||
|
|
@ -78,7 +78,7 @@ pub async fn router_chat_get_upstream_model(
|
||||||
// Extract usage preferences from metadata
|
// Extract usage preferences from metadata
|
||||||
let usage_preferences_str: Option<String> = routing_metadata.as_ref().and_then(|metadata| {
|
let usage_preferences_str: Option<String> = routing_metadata.as_ref().and_then(|metadata| {
|
||||||
metadata
|
metadata
|
||||||
.get("archgw_preference_config")
|
.get("plano_preference_config")
|
||||||
.map(|value| value.to_string())
|
.map(|value| value.to_string())
|
||||||
});
|
});
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -52,57 +52,57 @@ fn empty() -> BoxBody<Bytes, hyper::Error> {
|
||||||
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
|
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
|
||||||
let bind_address = env::var("BIND_ADDRESS").unwrap_or_else(|_| BIND_ADDRESS.to_string());
|
let bind_address = env::var("BIND_ADDRESS").unwrap_or_else(|_| BIND_ADDRESS.to_string());
|
||||||
|
|
||||||
// loading arch_config.yaml file (before tracing init so we can read tracing config)
|
// loading plano_config.yaml file (before tracing init so we can read tracing config)
|
||||||
let arch_config_path = env::var("ARCH_CONFIG_PATH_RENDERED")
|
let plano_config_path = env::var("PLANO_CONFIG_PATH_RENDERED")
|
||||||
.unwrap_or_else(|_| "./arch_config_rendered.yaml".to_string());
|
.unwrap_or_else(|_| "./plano_config_rendered.yaml".to_string());
|
||||||
eprintln!("loading arch_config.yaml from {}", arch_config_path);
|
eprintln!("loading plano_config.yaml from {}", plano_config_path);
|
||||||
|
|
||||||
let config_contents =
|
let config_contents =
|
||||||
fs::read_to_string(&arch_config_path).expect("Failed to read arch_config.yaml");
|
fs::read_to_string(&plano_config_path).expect("Failed to read plano_config.yaml");
|
||||||
|
|
||||||
let config: Configuration =
|
let config: Configuration =
|
||||||
serde_yaml::from_str(&config_contents).expect("Failed to parse arch_config.yaml");
|
serde_yaml::from_str(&config_contents).expect("Failed to parse plano_config.yaml");
|
||||||
|
|
||||||
// Initialize tracing using config.yaml tracing section
|
// Initialize tracing using config.yaml tracing section
|
||||||
let _tracer_provider = init_tracer(config.tracing.as_ref());
|
let _tracer_provider = init_tracer(config.tracing.as_ref());
|
||||||
info!(path = %arch_config_path, "loaded arch_config.yaml");
|
info!(path = %plano_config_path, "loaded plano_config.yaml");
|
||||||
|
|
||||||
let arch_config = Arc::new(config);
|
let plano_config = Arc::new(config);
|
||||||
|
|
||||||
// combine agents and filters into a single list of agents
|
// combine agents and filters into a single list of agents
|
||||||
let all_agents: Vec<Agent> = arch_config
|
let all_agents: Vec<Agent> = plano_config
|
||||||
.agents
|
.agents
|
||||||
.as_deref()
|
.as_deref()
|
||||||
.unwrap_or_default()
|
.unwrap_or_default()
|
||||||
.iter()
|
.iter()
|
||||||
.chain(arch_config.filters.as_deref().unwrap_or_default())
|
.chain(plano_config.filters.as_deref().unwrap_or_default())
|
||||||
.cloned()
|
.cloned()
|
||||||
.collect();
|
.collect();
|
||||||
|
|
||||||
// Create expanded provider list for /v1/models endpoint
|
// Create expanded provider list for /v1/models endpoint
|
||||||
let llm_providers = LlmProviders::try_from(arch_config.model_providers.clone())
|
let llm_providers = LlmProviders::try_from(plano_config.model_providers.clone())
|
||||||
.expect("Failed to create LlmProviders");
|
.expect("Failed to create LlmProviders");
|
||||||
let llm_providers = Arc::new(RwLock::new(llm_providers));
|
let llm_providers = Arc::new(RwLock::new(llm_providers));
|
||||||
let combined_agents_filters_list = Arc::new(RwLock::new(Some(all_agents)));
|
let combined_agents_filters_list = Arc::new(RwLock::new(Some(all_agents)));
|
||||||
let listeners = Arc::new(RwLock::new(arch_config.listeners.clone()));
|
let listeners = Arc::new(RwLock::new(plano_config.listeners.clone()));
|
||||||
let llm_provider_url =
|
let llm_provider_url =
|
||||||
env::var("LLM_PROVIDER_ENDPOINT").unwrap_or_else(|_| "http://localhost:12001".to_string());
|
env::var("LLM_PROVIDER_ENDPOINT").unwrap_or_else(|_| "http://localhost:12001".to_string());
|
||||||
|
|
||||||
let listener = TcpListener::bind(bind_address).await?;
|
let listener = TcpListener::bind(bind_address).await?;
|
||||||
let routing_model_name: String = arch_config
|
let routing_model_name: String = plano_config
|
||||||
.routing
|
.routing
|
||||||
.as_ref()
|
.as_ref()
|
||||||
.and_then(|r| r.model.clone())
|
.and_then(|r| r.model.clone())
|
||||||
.unwrap_or_else(|| DEFAULT_ROUTING_MODEL_NAME.to_string());
|
.unwrap_or_else(|| DEFAULT_ROUTING_MODEL_NAME.to_string());
|
||||||
|
|
||||||
let routing_llm_provider = arch_config
|
let routing_llm_provider = plano_config
|
||||||
.routing
|
.routing
|
||||||
.as_ref()
|
.as_ref()
|
||||||
.and_then(|r| r.model_provider.clone())
|
.and_then(|r| r.model_provider.clone())
|
||||||
.unwrap_or_else(|| DEFAULT_ROUTING_LLM_PROVIDER.to_string());
|
.unwrap_or_else(|| DEFAULT_ROUTING_LLM_PROVIDER.to_string());
|
||||||
|
|
||||||
let router_service: Arc<RouterService> = Arc::new(RouterService::new(
|
let router_service: Arc<RouterService> = Arc::new(RouterService::new(
|
||||||
arch_config.model_providers.clone(),
|
plano_config.model_providers.clone(),
|
||||||
format!("{llm_provider_url}{CHAT_COMPLETIONS_PATH}"),
|
format!("{llm_provider_url}{CHAT_COMPLETIONS_PATH}"),
|
||||||
routing_model_name,
|
routing_model_name,
|
||||||
routing_llm_provider,
|
routing_llm_provider,
|
||||||
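Review note: after this hunk, `main` resolves the rendered config path from `PLANO_CONFIG_PATH_RENDERED` and falls back to `./plano_config_rendered.yaml`. The old `ARCH_CONFIG_PATH_RENDERED` variable is no longer read. A small Python sketch of that lookup is given below; `rendered_config_path` is a hypothetical name, and the optional `env` mapping exists only so the example can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def rendered_config_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper mirroring the env-var lookup in the Rust main()."""
    source = os.environ if env is None else env
    # Only the new variable name is consulted; the default matches the diff.
    return source.get("PLANO_CONFIG_PATH_RENDERED", "./plano_config_rendered.yaml")
```

Deployments that still export only the old variable would silently get the default path, which is worth checking in any compose files or scripts outside this repository.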
|
|
@ -113,19 +113,19 @@ async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
|
||||||
PLANO_ORCHESTRATOR_MODEL_NAME.to_string(),
|
PLANO_ORCHESTRATOR_MODEL_NAME.to_string(),
|
||||||
));
|
));
|
||||||
|
|
||||||
let model_aliases = Arc::new(arch_config.model_aliases.clone());
|
let model_aliases = Arc::new(plano_config.model_aliases.clone());
|
||||||
|
|
||||||
// Initialize trace collector and start background flusher
|
// Initialize trace collector and start background flusher
|
||||||
// Tracing is enabled if the tracing config is present in arch_config.yaml
|
// Tracing is enabled if the tracing config is present in plano_config.yaml
|
||||||
// Pass Some(true/false) to override, or None to use env var OTEL_TRACING_ENABLED
|
// Pass Some(true/false) to override, or None to use env var OTEL_TRACING_ENABLED
|
||||||
// OpenTelemetry automatic instrumentation is configured in utils/tracing.rs
|
// OpenTelemetry automatic instrumentation is configured in utils/tracing.rs
|
||||||
|
|
||||||
// Initialize conversation state storage for v1/responses
|
// Initialize conversation state storage for v1/responses
|
||||||
// Configurable via arch_config.yaml state_storage section
|
// Configurable via plano_config.yaml state_storage section
|
||||||
// If not configured, state management is disabled
|
// If not configured, state management is disabled
|
||||||
// Environment variables are substituted by envsubst before config is read
|
// Environment variables are substituted by envsubst before config is read
|
||||||
let state_storage: Option<Arc<dyn StateStorage>> =
|
let state_storage: Option<Arc<dyn StateStorage>> =
|
||||||
if let Some(storage_config) = &arch_config.state_storage {
|
if let Some(storage_config) = &plano_config.state_storage {
|
||||||
let storage: Arc<dyn StateStorage> = match storage_config.storage_type {
|
let storage: Arc<dyn StateStorage> = match storage_config.storage_type {
|
||||||
common::configuration::StateStorageType::Memory => {
|
common::configuration::StateStorageType::Memory => {
|
||||||
info!(
|
info!(
|
||||||
|
|
|
||||||
|
|
@ -182,7 +182,7 @@ pub mod signals {
|
||||||
// Operation Names
|
// Operation Names
|
||||||
// =============================================================================
|
// =============================================================================
|
||||||
|
|
||||||
/// Canonical operation name components for Arch Gateway
|
/// Canonical operation name components for Plano Gateway
|
||||||
pub mod operation_component {
|
pub mod operation_component {
|
||||||
/// Inbound request handling
|
/// Inbound request handling
|
||||||
pub const INBOUND: &str = "plano(inbound)";
|
pub const INBOUND: &str = "plano(inbound)";
|
||||||
|
|
@ -210,7 +210,7 @@ pub mod operation_component {
|
||||||
///
|
///
|
||||||
/// Format: `{method} {path} {target}`
|
/// Format: `{method} {path} {target}`
|
||||||
///
|
///
|
||||||
/// The operation component (e.g., "archgw(llm)") is now part of the service name,
|
/// The operation component (e.g., "plano(llm)") is now part of the service name,
|
||||||
/// so the operation name focuses on the HTTP request details and target.
|
/// so the operation name focuses on the HTTP request details and target.
|
||||||
///
|
///
|
||||||
/// # Examples
|
/// # Examples
|
||||||
|
|
@ -218,7 +218,7 @@ pub mod operation_component {
|
||||||
/// use brightstaff::tracing::OperationNameBuilder;
|
/// use brightstaff::tracing::OperationNameBuilder;
|
||||||
///
|
///
|
||||||
/// // LLM call operation: "POST /v1/chat/completions gpt-4"
|
/// // LLM call operation: "POST /v1/chat/completions gpt-4"
|
||||||
/// // (service name will be "archgw(llm)")
|
/// // (service name will be "plano(llm)")
|
||||||
/// let op = OperationNameBuilder::new()
|
/// let op = OperationNameBuilder::new()
|
||||||
/// .with_method("POST")
|
/// .with_method("POST")
|
||||||
/// .with_path("/v1/chat/completions")
|
/// .with_path("/v1/chat/completions")
|
||||||
|
|
@ -226,7 +226,7 @@ pub mod operation_component {
|
||||||
/// .build();
|
/// .build();
|
||||||
///
|
///
|
||||||
/// // Agent filter operation: "POST /agents/v1/chat/completions hallucination-detector"
|
/// // Agent filter operation: "POST /agents/v1/chat/completions hallucination-detector"
|
||||||
/// // (service name will be "archgw(agent filter)")
|
/// // (service name will be "plano(agent filter)")
|
||||||
/// let op = OperationNameBuilder::new()
|
/// let op = OperationNameBuilder::new()
|
||||||
/// .with_method("POST")
|
/// .with_method("POST")
|
||||||
/// .with_path("/agents/v1/chat/completions")
|
/// .with_path("/agents/v1/chat/completions")
|
||||||
|
|
@ -234,7 +234,7 @@ pub mod operation_component {
|
||||||
/// .build();
|
/// .build();
|
||||||
///
|
///
|
||||||
/// // Routing operation: "POST /v1/chat/completions"
|
/// // Routing operation: "POST /v1/chat/completions"
|
||||||
/// // (service name will be "archgw(routing)")
|
/// // (service name will be "plano(routing)")
|
||||||
/// let op = OperationNameBuilder::new()
|
/// let op = OperationNameBuilder::new()
|
||||||
/// .with_method("POST")
|
/// .with_method("POST")
|
||||||
/// .with_path("/v1/chat/completions")
|
/// .with_path("/v1/chat/completions")
|
||||||
|
|
|
||||||
|
|
@ -493,7 +493,7 @@ mod test {
|
||||||
#[test]
|
#[test]
|
||||||
fn test_deserialize_configuration() {
|
fn test_deserialize_configuration() {
|
||||||
let ref_config = fs::read_to_string(
|
let ref_config = fs::read_to_string(
|
||||||
"../../docs/source/resources/includes/arch_config_full_reference_rendered.yaml",
|
"../../docs/source/resources/includes/plano_config_full_reference_rendered.yaml",
|
||||||
)
|
)
|
||||||
.expect("reference config file not found");
|
.expect("reference config file not found");
|
||||||
|
|
||||||
|
|
@ -520,7 +520,7 @@ mod test {
|
||||||
#[test]
|
#[test]
|
||||||
fn test_tool_conversion() {
|
fn test_tool_conversion() {
|
||||||
let ref_config = fs::read_to_string(
|
let ref_config = fs::read_to_string(
|
||||||
"../../docs/source/resources/includes/arch_config_full_reference_rendered.yaml",
|
"../../docs/source/resources/includes/plano_config_full_reference_rendered.yaml",
|
||||||
)
|
)
|
||||||
.expect("reference config file not found");
|
.expect("reference config file not found");
|
||||||
let config: super::Configuration = serde_yaml::from_str(&ref_config).unwrap();
|
let config: super::Configuration = serde_yaml::from_str(&ref_config).unwrap();
|
||||||
|
|
|
||||||
|
|
@ -990,7 +990,7 @@ impl HttpContext for StreamContext {
|
||||||
self.send_server_error(
|
self.send_server_error(
|
||||||
ServerError::BadRequest {
|
ServerError::BadRequest {
|
||||||
why: format!(
|
why: format!(
|
||||||
"No model specified in request and couldn't determine model name from arch_config. Model name in req: {}, arch_config, provider: {}, model: {:?}",
|
"No model specified in request and couldn't determine model name from plano_config. Model name in req: {}, plano_config, provider: {}, model: {:?}",
|
||||||
model_requested,
|
model_requested,
|
||||||
self.llm_provider().name,
|
self.llm_provider().name,
|
||||||
self.llm_provider().model
|
self.llm_provider().model
|
||||||
|
|
|
||||||
|
|
@ -419,7 +419,7 @@ impl HttpContext for StreamContext {
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
let data_serialized = serde_json::to_string(&data).unwrap();
|
let data_serialized = serde_json::to_string(&data).unwrap();
|
||||||
info!("archgw <= developer: {}", data_serialized);
|
info!("plano <= developer: {}", data_serialized);
|
||||||
self.set_http_response_body(0, body_size, data_serialized.as_bytes());
|
self.set_http_response_body(0, body_size, data_serialized.as_bytes());
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -246,7 +246,7 @@ impl StreamContext {
|
||||||
let chat_completion_request_json =
|
let chat_completion_request_json =
|
||||||
serde_json::to_string(&chat_completion_request).unwrap();
|
serde_json::to_string(&chat_completion_request).unwrap();
|
||||||
info!(
|
info!(
|
||||||
"archgw => upstream llm request: {}",
|
"plano => upstream llm request: {}",
|
||||||
chat_completion_request_json
|
chat_completion_request_json
|
||||||
);
|
);
|
||||||
self.set_http_request_body(
|
self.set_http_request_body(
|
||||||
|
|
@ -799,7 +799,7 @@ impl StreamContext {
|
||||||
};
|
};
|
||||||
|
|
||||||
let json_resp = serde_json::to_string(&chat_completion_request).unwrap();
|
let json_resp = serde_json::to_string(&chat_completion_request).unwrap();
|
||||||
info!("archgw => (default target) llm request: {}", json_resp);
|
info!("plano => (default target) llm request: {}", json_resp);
|
||||||
self.set_http_request_body(0, self.request_body_size, json_resp.as_bytes());
|
self.set_http_request_body(0, self.request_body_size, json_resp.as_bytes());
|
||||||
self.resume_http_request();
|
self.resume_http_request();
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -18,7 +18,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -1 +1 @@
|
||||||
This demo shows how you can use a publicly hosted rest api and interact it using arch gateway.
|
This demo shows how you can use a publicly hosted rest api and interact it using Plano gateway.
|
||||||
|
|
|
||||||
|
|
@ -10,7 +10,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
# Multi-Turn Agentic Demo (RAG)
|
# Multi-Turn Agentic Demo (RAG)
|
||||||
|
|
||||||
This demo showcases how the **Arch** can be used to build accurate multi-turn RAG agent by just writing simple APIs.
|
This demo showcases how **Plano** can be used to build accurate multi-turn RAG agent by just writing simple APIs.
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
|
|
@ -14,7 +14,7 @@ Provides information about various energy sources and considerations.
|
||||||
|
|
||||||
# Starting the demo
|
# Starting the demo
|
||||||
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
|
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
|
||||||
2. Start Arch
|
2. Start Plano
|
||||||
```sh
|
```sh
|
||||||
sh run_demo.sh
|
sh run_demo.sh
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -21,4 +21,4 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
|
||||||
|
|
@ -18,8 +18,8 @@ start_demo() {
|
||||||
echo ".env file created with OPENAI_API_KEY."
|
echo ".env file created with OPENAI_API_KEY."
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 3: Start Arch
|
# Step 3: Start Plano
|
||||||
echo "Starting Arch with config.yaml..."
|
echo "Starting Plano with config.yaml..."
|
||||||
planoai up config.yaml
|
planoai up config.yaml
|
||||||
|
|
||||||
# Step 4: Start Network Agent
|
# Step 4: Start Network Agent
|
||||||
|
|
@ -33,8 +33,8 @@ stop_demo() {
|
||||||
echo "Stopping HR Agent using Docker Compose..."
|
echo "Stopping HR Agent using Docker Compose..."
|
||||||
docker compose down -v
|
docker compose down -v
|
||||||
|
|
||||||
# Step 2: Stop Arch
|
# Step 2: Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -10,7 +10,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -18,8 +18,8 @@ start_demo() {
|
||||||
echo ".env file created with OPENAI_API_KEY."
|
echo ".env file created with OPENAI_API_KEY."
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 3: Start Arch
|
# Step 3: Start Plano
|
||||||
echo "Starting Arch with config.yaml..."
|
echo "Starting Plano with config.yaml..."
|
||||||
planoai up config.yaml
|
planoai up config.yaml
|
||||||
|
|
||||||
# Step 4: Start developer services
|
# Step 4: Start developer services
|
||||||
|
|
@ -33,8 +33,8 @@ stop_demo() {
|
||||||
echo "Stopping Network Agent using Docker Compose..."
|
echo "Stopping Network Agent using Docker Compose..."
|
||||||
docker compose down
|
docker compose down
|
||||||
|
|
||||||
# Step 2: Stop Arch
|
# Step 2: Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,11 +1,11 @@
|
||||||
# Function calling
|
# Function calling
|
||||||
|
|
||||||
This demo shows how you can use Arch's core function calling capabilities.
|
This demo shows how you can use Plano's core function calling capabilities.
|
||||||
|
|
||||||
# Starting the demo
|
# Starting the demo
|
||||||
|
|
||||||
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
|
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
|
||||||
2. Start Arch
|
2. Start Plano
|
||||||
|
|
||||||
3. ```sh
|
3. ```sh
|
||||||
sh run_demo.sh
|
sh run_demo.sh
|
||||||
|
|
@ -15,14 +15,14 @@ This demo shows how you can use Arch's core function calling capabilities.
|
||||||
|
|
||||||
# Observability
|
# Observability
|
||||||
|
|
||||||
Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visalize the stats in dashboard. To see grafana dashboard follow instructions below,
|
Plano gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from Plano and we are using grafana to visalize the stats in dashboard. To see grafana dashboard follow instructions below,
|
||||||
|
|
||||||
1. Start grafana and prometheus using following command
|
1. Start grafana and prometheus using following command
|
||||||
```yaml
|
```yaml
|
||||||
docker compose --profile monitoring up
|
docker compose --profile monitoring up
|
||||||
```
|
```
|
||||||
2. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
|
2. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
|
||||||
3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
|
3. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
|
||||||
|
|
||||||
Here is a sample interaction,
|
Here is a sample interaction,
|
||||||
<img width="575" alt="image" src="https://github.com/user-attachments/assets/e0929490-3eb2-4130-ae87-a732aea4d059">
|
<img width="575" alt="image" src="https://github.com/user-attachments/assets/e0929490-3eb2-4130-ae87-a732aea4d059">
|
||||||
|
|
|
||||||
|
|
@ -20,7 +20,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
otel-collector:
|
otel-collector:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -20,7 +20,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -20,7 +20,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
otel-collector:
|
otel-collector:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -23,7 +23,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
prometheus:
|
prometheus:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -20,4 +20,4 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
|
||||||
|
|
@ -72,8 +72,8 @@ start_demo() {
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 4: Start Arch
|
# Step 4: Start Plano
|
||||||
echo "Starting Arch with config.yaml..."
|
echo "Starting Plano with config.yaml..."
|
||||||
planoai up config.yaml
|
planoai up config.yaml
|
||||||
|
|
||||||
# Step 5: Start Network Agent with the chosen Docker Compose file
|
# Step 5: Start Network Agent with the chosen Docker Compose file
|
||||||
|
|
@ -91,8 +91,8 @@ stop_demo() {
|
||||||
docker compose -f "$compose_file" down
|
docker compose -f "$compose_file" down
|
||||||
done
|
done
|
||||||
|
|
||||||
# Stop Arch
|
# Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
4
demos/shared/chatbot_ui/.vscode/launch.json
vendored
|
|
@ -15,7 +15,7 @@
|
||||||
"LLM": "1",
|
"LLM": "1",
|
||||||
"CHAT_COMPLETION_ENDPOINT": "http://localhost:10000/v1",
|
"CHAT_COMPLETION_ENDPOINT": "http://localhost:10000/v1",
|
||||||
"STREAMING": "True",
|
"STREAMING": "True",
|
||||||
"ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
|
"PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
|
@ -29,7 +29,7 @@
|
||||||
"LLM": "1",
|
"LLM": "1",
|
||||||
"CHAT_COMPLETION_ENDPOINT": "http://localhost:12000/v1",
|
"CHAT_COMPLETION_ENDPOINT": "http://localhost:12000/v1",
|
||||||
"STREAMING": "True",
|
"STREAMING": "True",
|
||||||
"ARCH_CONFIG": "../../samples_python/weather_forecast/arch_config.yaml"
|
"PLANO_CONFIG": "../../samples_python/weather_forecast/plano_config.yaml"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
|
|
|
||||||
|
|
@ -37,7 +37,7 @@ def chat(
|
||||||
|
|
||||||
try:
|
try:
|
||||||
response = client.chat.completions.create(
|
response = client.chat.completions.create(
|
||||||
# we select model from arch_config file
|
# we select model from plano_config file
|
||||||
model="None",
|
model="None",
|
||||||
messages=history,
|
messages=history,
|
||||||
temperature=1.0,
|
temperature=1.0,
|
||||||
|
|
@ -86,7 +86,7 @@ def create_gradio_app(demo_description, client):
|
||||||
|
|
||||||
with gr.Column(scale=2):
|
with gr.Column(scale=2):
|
||||||
chatbot = gr.Chatbot(
|
chatbot = gr.Chatbot(
|
||||||
label="Arch Chatbot",
|
label="Plano Chatbot",
|
||||||
elem_classes="chatbot",
|
elem_classes="chatbot",
|
||||||
)
|
)
|
||||||
textbox = gr.Textbox(
|
textbox = gr.Textbox(
|
||||||
|
|
@ -110,7 +110,7 @@ def process_stream_chunk(chunk, history):
|
||||||
delta = chunk.choices[0].delta
|
delta = chunk.choices[0].delta
|
||||||
if delta.role and delta.role != history[-1]["role"]:
|
if delta.role and delta.role != history[-1]["role"]:
|
||||||
# create new history item if role changes
|
# create new history item if role changes
|
||||||
# this is likely due to arch tool call and api response
|
# this is likely due to Plano tool call and api response
|
||||||
history.append({"role": delta.role})
|
history.append({"role": delta.role})
|
||||||
|
|
||||||
history[-1]["model"] = chunk.model
|
history[-1]["model"] = chunk.model
|
||||||
|
|
@ -159,7 +159,7 @@ def convert_prompt_target_to_openai_format(target):
|
||||||
|
|
||||||
def get_prompt_targets():
|
def get_prompt_targets():
|
||||||
try:
|
try:
|
||||||
with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
|
with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
|
||||||
config = yaml.safe_load(file)
|
config = yaml.safe_load(file)
|
||||||
|
|
||||||
available_tools = []
|
available_tools = []
|
||||||
|
|
@ -181,7 +181,7 @@ def get_prompt_targets():
|
||||||
|
|
||||||
def get_llm_models():
|
def get_llm_models():
|
||||||
try:
|
try:
|
||||||
with open(os.getenv("ARCH_CONFIG", "config.yaml"), "r") as file:
|
with open(os.getenv("PLANO_CONFIG", "config.yaml"), "r") as file:
|
||||||
config = yaml.safe_load(file)
|
config = yaml.safe_load(file)
|
||||||
|
|
||||||
available_models = [""]
|
available_models = [""]
|
||||||
|
|
|
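Review note: after this change both `get_prompt_targets()` and `get_llm_models()` read only `PLANO_CONFIG`, so a value still exported as `ARCH_CONFIG` is silently ignored and the default `config.yaml` is used. A minimal sketch of the lookup as it now behaves; the helper name `resolve_config_path` and the injectable `env` mapping are illustrative assumptions, not part of the commit:

```python
import os
from typing import Mapping


def resolve_config_path(env: Mapping[str, str] = os.environ) -> str:
    """Mirror os.getenv("PLANO_CONFIG", "config.yaml") from the new side of the diff.

    `env` is injectable only so the behaviour can be checked without
    mutating the real process environment (an assumption for this sketch).
    """
    return env.get("PLANO_CONFIG", "config.yaml")


# The old variable name is no longer consulted after the rename.
print(resolve_config_path({"ARCH_CONFIG": "old.yaml"}))
print(resolve_config_path({"PLANO_CONFIG": "plano.yaml"}))
```

Deployments that still export `ARCH_CONFIG` would need to rename the variable to keep pointing at a non-default file.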
||||||
|
|
@ -787,7 +787,7 @@
|
||||||
},
|
},
|
||||||
"timepicker": {},
|
"timepicker": {},
|
||||||
"timezone": "browser",
|
"timezone": "browser",
|
||||||
"title": "Arch Gateway Dashboard",
|
"title": "Plano Gateway Dashboard",
|
||||||
"uid": "adt6uhx5lk8aob",
|
"uid": "adt6uhx5lk8aob",
|
||||||
"version": 1,
|
"version": 1,
|
||||||
"weekStart": ""
|
"weekStart": ""
|
||||||
|
|
|
||||||
|
|
@ -19,17 +19,17 @@ def get_data_chunks(stream, n=1):
|
||||||
return chunks
|
return chunks
|
||||||
|
|
||||||
|
|
||||||
def get_arch_messages(response_json):
|
def get_plano_messages(response_json):
|
||||||
arch_messages = []
|
plano_messages = []
|
||||||
if response_json and "metadata" in response_json:
|
if response_json and "metadata" in response_json:
|
||||||
# load arch_state from metadata
|
# load plano_state from metadata
|
||||||
arch_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
|
plano_state_str = response_json.get("metadata", {}).get(ARCH_STATE_HEADER, "{}")
|
||||||
# parse arch_state into json object
|
# parse plano_state into json object
|
||||||
arch_state = json.loads(arch_state_str)
|
plano_state = json.loads(plano_state_str)
|
||||||
# load messages from arch_state
|
# load messages from plano_state
|
||||||
arch_messages_str = arch_state.get("messages", "[]")
|
plano_messages_str = plano_state.get("messages", "[]")
|
||||||
# parse messages into json object
|
# parse messages into json object
|
||||||
arch_messages = json.loads(arch_messages_str)
|
plano_messages = json.loads(plano_messages_str)
|
||||||
# append messages from arch gateway to history
|
# append messages from plano gateway to history
|
||||||
return arch_messages
|
return plano_messages
|
||||||
return []
|
return []
|
||||||
|
|
|
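Review note: `get_plano_messages()` decodes JSON twice, first the state string stored under the metadata key, then the `messages` string inside that state. The sketch below reproduces that flow on a hand-built payload. The key value `x-example-state` and the sample messages are placeholders invented for illustration; the real value of `ARCH_STATE_HEADER` is defined outside this hunk (and, as the diff shows, the constant itself keeps its old name).

```python
import json

# Hypothetical key; stands in for ARCH_STATE_HEADER, whose value is not shown in the diff.
STATE_KEY = "x-example-state"


def extract_state_messages(response_json, state_key=STATE_KEY):
    """Return the message list stored as doubly encoded JSON in response metadata."""
    if not response_json or "metadata" not in response_json:
        return []
    # First decode: the state object is stored as a JSON string.
    state_str = response_json.get("metadata", {}).get(state_key, "{}")
    state = json.loads(state_str)
    # Second decode: the messages list is itself a JSON string inside the state.
    messages_str = state.get("messages", "[]")
    return json.loads(messages_str)


# Hand-built sample shaped like the two messages the test expects
# (a tool-call message followed by the API response).
sample = {
    "metadata": {
        STATE_KEY: json.dumps(
            {
                "messages": json.dumps(
                    [
                        {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
                        {"role": "tool", "content": "{}"},
                    ]
                )
            }
        )
    }
}

messages = extract_state_messages(sample)
print(len(messages), messages[0].get("tool_calls", []))
```

A missing key at either level falls back to an empty JSON document, so the function returns `[]` rather than raising.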
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
import json
|
import json
|
||||||
import os
|
import os
|
||||||
from common import get_arch_messages
|
from common import get_plano_messages
|
||||||
import pytest
|
import pytest
|
||||||
import requests
|
import requests
|
||||||
from deepdiff import DeepDiff
|
from deepdiff import DeepDiff
|
||||||
|
|
@ -46,10 +46,10 @@ def test_demos(test_data):
|
||||||
assert choices[0]["message"]["role"] == "assistant"
|
assert choices[0]["message"]["role"] == "assistant"
|
||||||
assert expected_output_contains.lower() in choices[0]["message"]["content"].lower()
|
assert expected_output_contains.lower() in choices[0]["message"]["content"].lower()
|
||||||
|
|
||||||
# now verify arch_messages (tool call and api response) that are sent as response metadata
|
# now verify plano_messages (tool call and api response) that are sent as response metadata
|
||||||
arch_messages = get_arch_messages(response_json)
|
plano_messages = get_plano_messages(response_json)
|
||||||
assert len(arch_messages) == 2
|
assert len(plano_messages) == 2
|
||||||
tool_calls_message = arch_messages[0]
|
tool_calls_message = plano_messages[0]
|
||||||
tool_calls = tool_calls_message.get("tool_calls", [])
|
tool_calls = tool_calls_message.get("tool_calls", [])
|
||||||
assert len(tool_calls) > 0
|
assert len(tool_calls) > 0
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
# Claude Code Router - Multi-Model Access with Intelligent Routing
|
# Claude Code Router - Multi-Model Access with Intelligent Routing
|
||||||
|
|
||||||
Arch Gateway extends Claude Code to access multiple LLM providers through a single interface. Offering two key benefits:
|
Plano extends Claude Code to access multiple LLM providers through a single interface. Offering two key benefits:
|
||||||
|
|
||||||
1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
|
1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
|
||||||
2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
|
2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
|
||||||
|
|
@ -21,15 +21,15 @@ Uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to
|
||||||
|
|
||||||
## How It Works
|
## How It Works
|
||||||
|
|
||||||
Arch Gateway sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
|
Plano sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
|
||||||
|
|
||||||
```
|
```
|
||||||
Your Request → Arch Gateway → Suitable Model → Response
|
Your Request → Plano → Suitable Model → Response
|
||||||
↓
|
↓
|
||||||
[Task Analysis & Model Selection]
|
[Task Analysis & Model Selection]
|
||||||
```
|
```
|
||||||
|
|
||||||
**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.archgw.com/concepts/llm_providers/supported_providers.html).
|
**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.planoai.dev/concepts/llm_providers/supported_providers.html).
|
||||||
|
|
||||||
|
|
||||||
## Quick Start (5 minutes)
|
## Quick Start (5 minutes)
|
||||||
|
|
@ -61,7 +61,7 @@ export ANTHROPIC_API_KEY="your-anthropic-key-here"
|
||||||
# Add other providers as needed
|
# Add other providers as needed
|
||||||
```
|
```
|
||||||
|
|
||||||
### Step 3: Start Arch Gateway
|
### Step 3: Start Plano
|
||||||
```bash
|
```bash
|
||||||
# Install using uv (recommended)
|
# Install using uv (recommended)
|
||||||
uv tool install planoai
|
uv tool install planoai
|
||||||
|
|
@ -122,7 +122,7 @@ planoai cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-co
|
||||||
### Environment Variables
|
### Environment Variables
|
||||||
The system automatically configures these variables for Claude Code:
|
The system automatically configures these variables for Claude Code:
|
||||||
```bash
|
```bash
|
||||||
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Arch Gateway
|
ANTHROPIC_BASE_URL=http://127.0.0.1:12000 # Routes through Plano
|
||||||
ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
|
ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast # Uses intelligent alias
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -147,6 +147,6 @@ llm_providers:
|
||||||
|
|
||||||
## Technical Details
|
## Technical Details
|
||||||
|
|
||||||
**How routing works:** Arch intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
|
**How routing works:** Plano intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
|
||||||
**Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
|
**Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
|
||||||
**Documentation:** [docs.archgw.com](https://docs.archgw.com) for advanced configuration and API details.
|
**Documentation:** [docs.planoai.dev](https://docs.planoai.dev) for advanced configuration and API details.
|
||||||
|
|
|
||||||
|
|
@ -1,5 +1,5 @@
|
||||||
#!/usr/bin/env bash
|
#!/usr/bin/env bash
|
||||||
# Pretty-print ArchGW MODEL_RESOLUTION lines from docker logs
|
# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
|
||||||
# - hides Arch-Router
|
# - hides Arch-Router
|
||||||
# - prints timestamp
|
# - prints timestamp
|
||||||
# - colors MODEL_RESOLUTION red
|
# - colors MODEL_RESOLUTION red
|
||||||
|
|
@ -7,7 +7,7 @@
|
||||||
# - colors resolved_model magenta
|
# - colors resolved_model magenta
|
||||||
# - removes provider and streaming
|
# - removes provider and streaming
|
||||||
|
|
||||||
docker logs -f archgw 2>&1 \
|
docker logs -f plano 2>&1 \
|
||||||
| awk '
|
| awk '
|
||||||
/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
|
/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
|
||||||
# extract timestamp between first [ and ]
|
# extract timestamp between first [ and ]
|
||||||
|
|
|
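Review note: the rename changes the container name passed to `docker logs` but leaves the `Arch-Router` exclusion in the awk filter untouched. For readers less familiar with awk, the visible condition `/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/` can be sketched in Python as below. The sample log lines are invented for illustration and do not reflect the real log format.

```python
def is_model_resolution_line(line: str) -> bool:
    """Keep MODEL_RESOLUTION lines unless they mention Arch-Router."""
    return "MODEL_RESOLUTION:" in line and "Arch-Router" not in line


# Invented sample lines; only the two marker strings come from the script.
sample_lines = [
    "[ts] MODEL_RESOLUTION: resolved_model=example-model",
    "[ts] MODEL_RESOLUTION: resolved_model=Arch-Router",
    "[ts] unrelated log line",
]

for line in sample_lines:
    if is_model_resolution_line(line):
        print(line)
```

Both patterns contain no regex metacharacters, so a plain substring test matches the same lines as the awk expression.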
||||||
|
|
@ -19,10 +19,10 @@ services:
|
||||||
- "12000:12000"
|
- "12000:12000"
|
||||||
- "8001:8001"
|
- "8001:8001"
|
||||||
environment:
|
environment:
|
||||||
- ARCH_CONFIG_PATH=/config/config.yaml
|
- PLANO_CONFIG_PATH=/config/config.yaml
|
||||||
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -22,14 +22,14 @@ logger = logging.getLogger(__name__)
|
||||||
### add new setup
|
### add new setup
|
||||||
app = FastAPI(title="RAG Agent Context Builder", version="1.0.0")
|
app = FastAPI(title="RAG Agent Context Builder", version="1.0.0")
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
RAG_MODEL = "gpt-4o-mini"
|
RAG_MODEL = "gpt-4o-mini"
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
# Global variable to store the knowledge base
|
# Global variable to store the knowledge base
|
||||||
|
|
@ -95,15 +95,15 @@ async def find_relevant_passages(
|
||||||
If no passages are relevant, return "NONE"."""
|
If no passages are relevant, return "NONE"."""
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Call archgw to select relevant passages
|
# Call Plano to select relevant passages
|
||||||
logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
|
logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
|
||||||
|
|
||||||
# Prepare extra headers if traceparent is provided
|
# Prepare extra headers if traceparent is provided
|
||||||
extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
|
extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
|
||||||
if traceparent:
|
if traceparent:
|
||||||
extra_headers["traceparent"] = traceparent
|
extra_headers["traceparent"] = traceparent
|
||||||
|
|
||||||
response = await archgw_client.chat.completions.create(
|
response = await plano_client.chat.completions.create(
|
||||||
model=RAG_MODEL,
|
model=RAG_MODEL,
|
||||||
messages=[{"role": "system", "content": system_prompt}],
|
messages=[{"role": "system", "content": system_prompt}],
|
||||||
temperature=0.1,
|
temperature=0.1,
|
||||||
|
|
|
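Review note: every demo agent touched here builds the same header dictionary before calling the gateway: a fixed `x-envoy-max-retries`, an `x-request-id`, and an optional `traceparent` for trace propagation. A minimal sketch of that pattern as one helper; the name `build_extra_headers` and its `max_retries` parameter are assumptions for illustration and do not exist in the repository:

```python
from typing import Dict, Optional


def build_extra_headers(
    request_id: Optional[str] = None,
    traceparent: Optional[str] = None,
    max_retries: int = 3,
) -> Dict[str, str]:
    """Collect the per-request headers the demo agents send to the gateway."""
    headers = {"x-envoy-max-retries": str(max_retries)}
    if request_id:
        headers["x-request-id"] = request_id
    if traceparent:
        # Forward the caller's trace context unchanged so spans stay linked.
        headers["traceparent"] = traceparent
    return headers


print(build_extra_headers(request_id="req-123"))
```

The resulting dictionary would typically be handed to the OpenAI Python client through the `extra_headers` argument of `chat.completions.create`, which is presumably what the truncated calls in these hunks do.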
||||||
|
|
@ -22,14 +22,14 @@ logging.basicConfig(
|
||||||
)
|
)
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
GUARD_MODEL = "gpt-4o-mini"
|
GUARD_MODEL = "gpt-4o-mini"
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
app = FastAPI(title="RAG Agent Input Guards", version="1.0.0")
|
app = FastAPI(title="RAG Agent Input Guards", version="1.0.0")
|
||||||
|
|
@ -93,13 +93,13 @@ Respond in JSON format:
|
||||||
]
|
]
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Call archgw using OpenAI client
|
# Call Plano using OpenAI client
|
||||||
extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
|
extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
|
||||||
if traceparent_header:
|
if traceparent_header:
|
||||||
extra_headers["traceparent"] = traceparent_header
|
extra_headers["traceparent"] = traceparent_header
|
||||||
|
|
||||||
logger.info(f"Validating query scope: '{last_user_message}'")
|
logger.info(f"Validating query scope: '{last_user_message}'")
|
||||||
response = await archgw_client.chat.completions.create(
|
response = await plano_client.chat.completions.create(
|
||||||
model=GUARD_MODEL,
|
model=GUARD_MODEL,
|
||||||
messages=guard_messages,
|
messages=guard_messages,
|
||||||
temperature=0.1,
|
temperature=0.1,
|
||||||
|
|
|
||||||
|
|
@ -20,20 +20,20 @@ logging.basicConfig(
|
||||||
)
|
)
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
QUERY_REWRITE_MODEL = "gpt-4o-mini"
|
QUERY_REWRITE_MODEL = "gpt-4o-mini"
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
app = FastAPI(title="RAG Agent Query Rewriter", version="1.0.0")
|
app = FastAPI(title="RAG Agent Query Rewriter", version="1.0.0")
|
||||||
|
|
||||||
|
|
||||||
async def rewrite_query_with_archgw(
|
async def rewrite_query_with_plano(
|
||||||
messages: List[ChatMessage],
|
messages: List[ChatMessage],
|
||||||
traceparent_header: Optional[str] = None,
|
traceparent_header: Optional[str] = None,
|
||||||
request_id: Optional[str] = None,
|
request_id: Optional[str] = None,
|
||||||
|
|
@ -59,8 +59,8 @@ Return only the rewritten query, nothing else."""
|
||||||
extra_headers["traceparent"] = traceparent_header
|
extra_headers["traceparent"] = traceparent_header
|
||||||
|
|
||||||
try:
|
try:
|
||||||
logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
|
logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
|
||||||
resp = await archgw_client.chat.completions.create(
|
resp = await plano_client.chat.completions.create(
|
||||||
model=QUERY_REWRITE_MODEL,
|
model=QUERY_REWRITE_MODEL,
|
||||||
messages=rewrite_messages,
|
messages=rewrite_messages,
|
||||||
temperature=0.3,
|
temperature=0.3,
|
||||||
|
|
@ -96,7 +96,7 @@ async def query_rewriter_http(
|
||||||
else:
|
else:
|
||||||
logger.info("No traceparent header found")
|
logger.info("No traceparent header found")
|
||||||
|
|
||||||
rewritten_query = await rewrite_query_with_archgw(
|
rewritten_query = await rewrite_query_with_plano(
|
||||||
messages, traceparent_header, request_id
|
messages, traceparent_header, request_id
|
||||||
)
|
)
|
||||||
# Create updated messages with the rewritten query
|
# Create updated messages with the rewritten query
|
||||||
|
|
|
||||||
|
|
@ -22,7 +22,7 @@ logging.basicConfig(
|
||||||
)
|
)
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
RESPONSE_MODEL = "gpt-4o"
|
RESPONSE_MODEL = "gpt-4o"
|
||||||
|
|
||||||
|
|
@ -38,10 +38,10 @@ Your response should:
|
||||||
|
|
||||||
Generate a complete response to assist the user."""
|
Generate a complete response to assist the user."""
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
# FastAPI app for REST server
|
# FastAPI app for REST server
|
||||||
|
|
@ -95,9 +95,9 @@ async def stream_chat_completions(
|
||||||
response_messages = prepare_response_messages(request_body)
|
response_messages = prepare_response_messages(request_body)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Call archgw using OpenAI client for streaming
|
# Call Plano using OpenAI client for streaming
|
||||||
logger.info(
|
logger.info(
|
||||||
f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
|
f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
|
||||||
)
|
)
|
||||||
|
|
||||||
# Prepare extra headers if traceparent is provided
|
# Prepare extra headers if traceparent is provided
|
||||||
|
|
@ -105,7 +105,7 @@ async def stream_chat_completions(
|
||||||
if traceparent_header:
|
if traceparent_header:
|
||||||
extra_headers["traceparent"] = traceparent_header
|
extra_headers["traceparent"] = traceparent_header
|
||||||
|
|
||||||
response_stream = await archgw_client.chat.completions.create(
|
response_stream = await plano_client.chat.completions.create(
|
||||||
model=RESPONSE_MODEL,
|
model=RESPONSE_MODEL,
|
||||||
messages=response_messages,
|
messages=response_messages,
|
||||||
temperature=request_body.temperature or 0.7,
|
temperature=request_body.temperature or 0.7,
|
||||||
|
|
|
||||||
|
|
@ -1,15 +1,15 @@
|
||||||
# LLM Routing
|
# LLM Routing
|
||||||
This demo shows how you can arch gateway to manage keys and route to upstream LLM.
|
This demo shows how you can use Plano gateway to manage keys and route to upstream LLM.
|
||||||
|
|
||||||
# Starting the demo
|
# Starting the demo
|
||||||
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
|
1. Please make sure the [pre-requisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly
|
||||||
1. Start Arch
|
1. Start Plano
|
||||||
```sh
|
```sh
|
||||||
sh run_demo.sh
|
sh run_demo.sh
|
||||||
```
|
```
|
||||||
1. Navigate to http://localhost:18080/
|
1. Navigate to http://localhost:18080/
|
||||||
|
|
||||||
Following screen shows an example of interaction with arch gateway showing dynamic routing. You can select between different LLMs using "override model" option in the chat UI.
|
Following screen shows an example of interaction with Plano gateway showing dynamic routing. You can select between different LLMs using "override model" option in the chat UI.
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
|
|
@ -47,12 +47,12 @@ $ curl --header 'Content-Type: application/json' \
|
||||||
```
|
```
|
||||||
|
|
||||||
# Observability
|
# Observability
|
||||||
Arch gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from arch and we are using grafana to visualize the stats in dashboard. To see grafana dashboard follow instructions below,
|
Plano gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from Plano and we are using grafana to visualize the stats in dashboard. To see grafana dashboard follow instructions below,
|
||||||
|
|
||||||
1. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
|
1. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
|
||||||
1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view arch gateway stats
|
1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
|
||||||
1. For tracing you can head over to http://localhost:16686/ to view recent traces.
|
1. For tracing you can head over to http://localhost:16686/ to view recent traces.
|
||||||
|
|
||||||
Following is a screenshot of tracing UI showing call received by arch gateway and making upstream call to LLM,
|
Following is a screenshot of tracing UI showing call received by Plano gateway and making upstream call to LLM,
|
||||||
|
|
||||||

|

|
||||||
|
|
|
||||||
|
|
@ -8,11 +8,11 @@ services:
|
||||||
- "12000:12000"
|
- "12000:12000"
|
||||||
- "12001:12001"
|
- "12001:12001"
|
||||||
environment:
|
environment:
|
||||||
- ARCH_CONFIG_PATH=/app/arch_config.yaml
|
- PLANO_CONFIG_PATH=/app/plano_config.yaml
|
||||||
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
||||||
- OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
|
- OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml:ro
|
- ./config.yaml:/app/plano_config.yaml:ro
|
||||||
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
||||||
|
|
||||||
anythingllm:
|
anythingllm:
|
||||||
|
|
|
||||||
|
|
@ -18,8 +18,8 @@ start_demo() {
|
||||||
echo ".env file created with OPENAI_API_KEY."
|
echo ".env file created with OPENAI_API_KEY."
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 3: Start Arch
|
# Step 3: Start Plano
|
||||||
echo "Starting Arch with config.yaml..."
|
echo "Starting Plano with config.yaml..."
|
||||||
planoai up config.yaml
|
planoai up config.yaml
|
||||||
|
|
||||||
# Step 4: Start LLM Routing
|
# Step 4: Start LLM Routing
|
||||||
|
|
@ -33,8 +33,8 @@ stop_demo() {
|
||||||
echo "Stopping LLM Routing using Docker Compose..."
|
echo "Stopping LLM Routing using Docker Compose..."
|
||||||
docker compose down
|
docker compose down
|
||||||
|
|
||||||
# Step 2: Stop Arch
|
# Step 2: Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -21,10 +21,10 @@ services:
|
||||||
- "12000:12000"
|
- "12000:12000"
|
||||||
- "8001:8001"
|
- "8001:8001"
|
||||||
environment:
|
environment:
|
||||||
- ARCH_CONFIG_PATH=/config/config.yaml
|
- PLANO_CONFIG_PATH=/config/config.yaml
|
||||||
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -18,14 +18,14 @@ logging.basicConfig(
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
RAG_MODEL = "gpt-4o-mini"
|
RAG_MODEL = "gpt-4o-mini"
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
# Global variable to store the knowledge base
|
# Global variable to store the knowledge base
|
||||||
|
|
@ -91,8 +91,8 @@ async def find_relevant_passages(
|
||||||
If no passages are relevant, return "NONE"."""
|
If no passages are relevant, return "NONE"."""
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Call archgw to select relevant passages
|
# Call Plano to select relevant passages
|
||||||
logger.info(f"Calling archgw to find relevant passages for query: '{query}'")
|
logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
|
||||||
|
|
||||||
# Prepare extra headers if traceparent is provided
|
# Prepare extra headers if traceparent is provided
|
||||||
extra_headers = {
|
extra_headers = {
|
||||||
|
|
@ -103,7 +103,7 @@ async def find_relevant_passages(
|
||||||
if traceparent:
|
if traceparent:
|
||||||
extra_headers["traceparent"] = traceparent
|
extra_headers["traceparent"] = traceparent
|
||||||
|
|
||||||
response = await archgw_client.chat.completions.create(
|
response = await plano_client.chat.completions.create(
|
||||||
model=RAG_MODEL,
|
model=RAG_MODEL,
|
||||||
messages=[{"role": "system", "content": system_prompt}],
|
messages=[{"role": "system", "content": system_prompt}],
|
||||||
temperature=0.1,
|
temperature=0.1,
|
||||||
|
|
|
||||||
|
|
@ -20,14 +20,14 @@ logging.basicConfig(
|
||||||
)
|
)
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
GUARD_MODEL = "gpt-4o-mini"
|
GUARD_MODEL = "gpt-4o-mini"
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
app = FastAPI()
|
app = FastAPI()
|
||||||
|
|
@ -91,7 +91,7 @@ Respond in JSON format:
|
||||||
]
|
]
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Call archgw using OpenAI client
|
# Call Plano using OpenAI client
|
||||||
extra_headers = {"x-envoy-max-retries": "3"}
|
extra_headers = {"x-envoy-max-retries": "3"}
|
||||||
if traceparent_header:
|
if traceparent_header:
|
||||||
extra_headers["traceparent"] = traceparent_header
|
extra_headers["traceparent"] = traceparent_header
|
||||||
|
|
@ -100,7 +100,7 @@ Respond in JSON format:
|
||||||
extra_headers["x-request-id"] = request_id
|
extra_headers["x-request-id"] = request_id
|
||||||
|
|
||||||
logger.info(f"Validating query scope: '{last_user_message}'")
|
logger.info(f"Validating query scope: '{last_user_message}'")
|
||||||
response = await archgw_client.chat.completions.create(
|
response = await plano_client.chat.completions.create(
|
||||||
model=GUARD_MODEL,
|
model=GUARD_MODEL,
|
||||||
messages=guard_messages,
|
messages=guard_messages,
|
||||||
temperature=0.1,
|
temperature=0.1,
|
||||||
|
|
|
||||||
|
|
@ -19,20 +19,20 @@ logging.basicConfig(
|
||||||
)
|
)
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
QUERY_REWRITE_MODEL = "gpt-4o-mini"
|
QUERY_REWRITE_MODEL = "gpt-4o-mini"
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
app = FastAPI()
|
app = FastAPI()
|
||||||
|
|
||||||
|
|
||||||
async def rewrite_query_with_archgw(
|
async def rewrite_query_with_plano(
|
||||||
messages: List[ChatMessage],
|
messages: List[ChatMessage],
|
||||||
traceparent_header: str,
|
traceparent_header: str,
|
||||||
request_id: Optional[str] = None,
|
request_id: Optional[str] = None,
|
||||||
|
|
@ -57,14 +57,14 @@ async def rewrite_query_with_archgw(
|
||||||
rewrite_messages.append({"role": msg.role, "content": msg.content})
|
rewrite_messages.append({"role": msg.role, "content": msg.content})
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Call archgw using OpenAI client
|
# Call Plano using OpenAI client
|
||||||
extra_headers = {"x-envoy-max-retries": "3"}
|
extra_headers = {"x-envoy-max-retries": "3"}
|
||||||
if traceparent_header:
|
if traceparent_header:
|
||||||
extra_headers["traceparent"] = traceparent_header
|
extra_headers["traceparent"] = traceparent_header
|
||||||
if request_id:
|
if request_id:
|
||||||
extra_headers["x-request-id"] = request_id
|
extra_headers["x-request-id"] = request_id
|
||||||
logger.info(f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to rewrite query")
|
logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
|
||||||
response = await archgw_client.chat.completions.create(
|
response = await plano_client.chat.completions.create(
|
||||||
model=QUERY_REWRITE_MODEL,
|
model=QUERY_REWRITE_MODEL,
|
||||||
messages=rewrite_messages,
|
messages=rewrite_messages,
|
||||||
temperature=0.3,
|
temperature=0.3,
|
||||||
|
|
@ -88,7 +88,7 @@ async def rewrite_query_with_archgw(
|
||||||
|
|
||||||
|
|
||||||
async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
|
async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
|
||||||
"""Chat completions endpoint that rewrites the last user query using archgw.
|
"""Chat completions endpoint that rewrites the last user query using Plano.
|
||||||
|
|
||||||
Returns a dict with a 'messages' key containing the updated message list.
|
Returns a dict with a 'messages' key containing the updated message list.
|
||||||
"""
|
"""
|
||||||
|
|
@ -104,8 +104,8 @@ async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
|
||||||
else:
|
else:
|
||||||
logger.info("No traceparent header found")
|
logger.info("No traceparent header found")
|
||||||
|
|
||||||
# Call archgw to rewrite the last user query
|
# Call Plano to rewrite the last user query
|
||||||
rewritten_query = await rewrite_query_with_archgw(
|
rewritten_query = await rewrite_query_with_plano(
|
||||||
messages, traceparent_header, request_id
|
messages, traceparent_header, request_id
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -22,7 +22,7 @@ logging.basicConfig(
|
||||||
)
|
)
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Configuration for archgw LLM gateway
|
# Configuration for Plano LLM gateway
|
||||||
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
|
||||||
RESPONSE_MODEL = "gpt-4o"
|
RESPONSE_MODEL = "gpt-4o"
|
||||||
|
|
||||||
|
|
@ -38,10 +38,10 @@ Your response should:
|
||||||
|
|
||||||
Generate a complete response to assist the user."""
|
Generate a complete response to assist the user."""
|
||||||
|
|
||||||
# Initialize OpenAI client for archgw
|
# Initialize OpenAI client for Plano
|
||||||
archgw_client = AsyncOpenAI(
|
plano_client = AsyncOpenAI(
|
||||||
base_url=LLM_GATEWAY_ENDPOINT,
|
base_url=LLM_GATEWAY_ENDPOINT,
|
||||||
api_key="EMPTY", # archgw doesn't require a real API key
|
api_key="EMPTY", # Plano doesn't require a real API key
|
||||||
)
|
)
|
||||||
|
|
||||||
# FastAPI app for REST server
|
# FastAPI app for REST server
|
||||||
|
|
@ -94,9 +94,9 @@ async def stream_chat_completions(
|
||||||
response_messages = prepare_response_messages(request_body)
|
response_messages = prepare_response_messages(request_body)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Call archgw using OpenAI client for streaming
|
# Call Plano using OpenAI client for streaming
|
||||||
logger.info(
|
logger.info(
|
||||||
f"Calling archgw at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
|
f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
|
||||||
)
|
)
|
||||||
|
|
||||||
logger.info(f"rag_agent - request_id: {request_id}")
|
logger.info(f"rag_agent - request_id: {request_id}")
|
||||||
|
|
@ -107,7 +107,7 @@ async def stream_chat_completions(
|
||||||
if traceparent_header:
|
if traceparent_header:
|
||||||
extra_headers["traceparent"] = traceparent_header
|
extra_headers["traceparent"] = traceparent_header
|
||||||
|
|
||||||
response_stream = await archgw_client.chat.completions.create(
|
response_stream = await plano_client.chat.completions.create(
|
||||||
model=RESPONSE_MODEL,
|
model=RESPONSE_MODEL,
|
||||||
messages=response_messages,
|
messages=response_messages,
|
||||||
temperature=request_body.temperature or 0.7,
|
temperature=request_body.temperature or 0.7,
|
||||||
|
|
|
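Review note: both agents touched above build an `extra_headers` dict from the incoming `traceparent` and `x-request-id` values before calling the gateway. The sketch below isolates that pattern. `build_extra_headers` is a hypothetical helper name, not a function in the repository.

```python
from typing import Optional


def build_extra_headers(
    traceparent: Optional[str], request_id: Optional[str]
) -> dict[str, str]:
    """Collect the tracing headers to forward to the gateway.

    Hypothetical helper: mirrors the `if traceparent_header:` /
    `if request_id:` checks visible in the hunks above.
    """
    headers: dict[str, str] = {}
    if traceparent:
        headers["traceparent"] = traceparent
    if request_id:
        headers["x-request-id"] = request_id
    return headers
```

The resulting dict could then be passed as `extra_headers=` to `plano_client.chat.completions.create(...)`, assuming the call site forwards headers through the OpenAI SDK's per-request option.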
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
# Model Alias Demo Suite
|
# Model Alias Demo Suite
|
||||||
|
|
||||||
This directory contains demos for the model alias feature in archgw.
|
This directory contains demos for the model alias feature in Plano.
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
|
|
@ -48,7 +48,7 @@ model_aliases:
|
||||||
```
|
```
|
||||||
|
|
||||||
## Prerequisites
|
## Prerequisites
|
||||||
- Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
|
- Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
|
||||||
- Set your API keys in your environment:
|
- Set your API keys in your environment:
|
||||||
- `export OPENAI_API_KEY=your-openai-key`
|
- `export OPENAI_API_KEY=your-openai-key`
|
||||||
- `export ANTHROPIC_API_KEY=your-anthropic-key` (optional, but recommended for Anthropic tests)
|
- `export ANTHROPIC_API_KEY=your-anthropic-key` (optional, but recommended for Anthropic tests)
|
||||||
|
|
@ -60,13 +60,13 @@ model_aliases:
|
||||||
sh run_demo.sh
|
sh run_demo.sh
|
||||||
```
|
```
|
||||||
- This will create a `.env` file with your API keys (if not present).
|
- This will create a `.env` file with your API keys (if not present).
|
||||||
- Starts Arch Gateway with model alias config (`arch_config_with_aliases.yaml`).
|
- Starts Plano gateway with model alias config (`arch_config_with_aliases.yaml`).
|
||||||
|
|
||||||
2. To stop the demo:
|
2. To stop the demo:
|
||||||
```sh
|
```sh
|
||||||
sh run_demo.sh down
|
sh run_demo.sh down
|
||||||
```
|
```
|
||||||
- This will stop Arch Gateway and any related services.
|
- This will stop Plano gateway and any related services.
|
||||||
|
|
||||||
## Example Requests
|
## Example Requests
|
||||||
|
|
||||||
|
|
@ -145,4 +145,4 @@ curl -sS -X POST "http://localhost:12000/v1/messages" \
|
||||||
## Troubleshooting
|
## Troubleshooting
|
||||||
- Ensure your API keys are set in your environment before running the demo.
|
- Ensure your API keys are set in your environment before running the demo.
|
||||||
- If you see errors about missing keys, set them and re-run the script.
|
- If you see errors about missing keys, set them and re-run the script.
|
||||||
- For more details, see the main Arch documentation.
|
- For more details, see the main Plano documentation.
|
||||||
|
|
|
||||||
|
|
@ -24,11 +24,11 @@ start_demo() {
|
||||||
echo ".env file created with API keys."
|
echo ".env file created with API keys."
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 3: Start Arch
|
# Step 3: Start Plano
|
||||||
echo "Starting Arch with arch_config_with_aliases.yaml..."
|
echo "Starting Plano with arch_config_with_aliases.yaml..."
|
||||||
planoai up arch_config_with_aliases.yaml
|
planoai up arch_config_with_aliases.yaml
|
||||||
|
|
||||||
echo "\n\nArch started successfully."
|
echo "\n\nPlano started successfully."
|
||||||
echo "Please run the following CURL command to test model alias routing. Additional instructions are in the README.md file. \n"
|
echo "Please run the following CURL command to test model alias routing. Additional instructions are in the README.md file. \n"
|
||||||
echo "curl -sS -X POST \"http://localhost:12000/v1/chat/completions\" \
|
echo "curl -sS -X POST \"http://localhost:12000/v1/chat/completions\" \
|
||||||
-H \"Authorization: Bearer test-key\" \
|
-H \"Authorization: Bearer test-key\" \
|
||||||
|
|
@ -46,8 +46,8 @@ start_demo() {
|
||||||
|
|
||||||
# Function to stop the demo
|
# Function to stop the demo
|
||||||
stop_demo() {
|
stop_demo() {
|
||||||
# Step 2: Stop Arch
|
# Step 2: Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
# Model Choice Newsletter Demo
|
# Model Choice Newsletter Demo
|
||||||
|
|
||||||
This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Arch Gateway (`plano`). It includes both a minimal test harness and a sample proxy configuration.
|
This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Plano (`plano`). It includes both a minimal test harness and a sample proxy configuration.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@ -85,13 +85,13 @@ See `config.yaml` for a sample configuration mapping aliases to provider models.
|
||||||
```
|
```
|
||||||
|
|
||||||
2. **Install dependencies:**
|
2. **Install dependencies:**
|
||||||
- Install all dependencies as described in the main Arch README ([link](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites))
|
- Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
|
||||||
- Then run
|
- Then run
|
||||||
```sh
|
```sh
|
||||||
uv sync
|
uv sync
|
||||||
```
|
```
|
||||||
|
|
||||||
3. **Start Arch Gateway**
|
3. **Start Plano**
|
||||||
```sh
|
```sh
|
||||||
run_demo.sh
|
run_demo.sh
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -3,7 +3,7 @@ import json, time, yaml, statistics as stats
|
||||||
from pydantic import BaseModel, ValidationError
|
from pydantic import BaseModel, ValidationError
|
||||||
from openai import OpenAI
|
from openai import OpenAI
|
||||||
|
|
||||||
# archgw endpoint (keys are handled by archgw)
|
# Plano endpoint (keys are handled by Plano)
|
||||||
client = OpenAI(base_url="http://localhost:12000/v1", api_key="n/a")
|
client = OpenAI(base_url="http://localhost:12000/v1", api_key="n/a")
|
||||||
MODELS = ["arch.summarize.v1", "arch.reason.v1"]
|
MODELS = ["arch.summarize.v1", "arch.reason.v1"]
|
||||||
FIXTURES = "evals_summarize.yaml"
|
FIXTURES = "evals_summarize.yaml"
|
||||||
|
|
|
||||||
|
|
@ -1,7 +1,7 @@
|
||||||
[project]
|
[project]
|
||||||
name = "model-choice-newsletter-code-snippets"
|
name = "model-choice-newsletter-code-snippets"
|
||||||
version = "0.1.0"
|
version = "0.1.0"
|
||||||
description = "Benchmarking model alias routing with Arch Gateway."
|
description = "Benchmarking model alias routing with Plano."
|
||||||
authors = [{name = "Your Name", email = "your@email.com"}]
|
authors = [{name = "Your Name", email = "your@email.com"}]
|
||||||
license = {text = "Apache 2.0"}
|
license = {text = "Apache 2.0"}
|
||||||
readme = "README.md"
|
readme = "README.md"
|
||||||
|
|
|
||||||
|
|
@ -17,18 +17,18 @@ start_demo() {
|
||||||
echo ".env file created with API keys."
|
echo ".env file created with API keys."
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 3: Start Arch
|
# Step 3: Start Plano
|
||||||
echo "Starting Arch with arch_config_with_aliases.yaml..."
|
echo "Starting Plano with arch_config_with_aliases.yaml..."
|
||||||
planoai up arch_config_with_aliases.yaml
|
planoai up arch_config_with_aliases.yaml
|
||||||
|
|
||||||
echo "\n\nArch started successfully."
|
echo "\n\nPlano started successfully."
|
||||||
echo "Please run the following command to test the setup: python bench.py\n"
|
echo "Please run the following command to test the setup: python bench.py\n"
|
||||||
}
|
}
|
||||||
|
|
||||||
# Function to stop the demo
|
# Function to stop the demo
|
||||||
stop_demo() {
|
stop_demo() {
|
||||||
# Step 2: Stop Arch
|
# Step 2: Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -8,12 +8,12 @@ services:
|
||||||
- "8001:8001"
|
- "8001:8001"
|
||||||
- "12000:12000"
|
- "12000:12000"
|
||||||
environment:
|
environment:
|
||||||
- ARCH_CONFIG_PATH=/app/arch_config.yaml
|
- PLANO_CONFIG_PATH=/app/plano_config.yaml
|
||||||
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
||||||
- OTEL_TRACING_GRPC_ENDPOINT=http://jaeger:4317
|
- OTEL_TRACING_GRPC_ENDPOINT=http://jaeger:4317
|
||||||
- LOG_LEVEL=${LOG_LEVEL:-info}
|
- LOG_LEVEL=${LOG_LEVEL:-info}
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml:ro
|
- ./config.yaml:/app/plano_config.yaml:ro
|
||||||
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
||||||
|
|
||||||
crewai-flight-agent:
|
crewai-flight-agent:
|
||||||
|
|
|
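Review note: renames like `ARCH_CONFIG_PATH` → `PLANO_CONFIG_PATH` and `arch_config.yaml` → `plano_config.yaml` are easy to miss in one file out of many. A small scan along the following lines could flag leftovers. The pattern list and the name `find_legacy_refs` are assumptions for illustration, not tooling from this repository. Files that are intentionally left unrenamed (for example `arch_config_with_aliases.yaml`) would also be reported.

```python
import re

# Hypothetical patterns for pre-rename identifiers; extend as needed.
LEGACY_PATTERNS = re.compile(r"ARCH_CONFIG_\w+|arch_config\w*\.ya?ml|archgw")


def find_legacy_refs(text: str) -> list[tuple[int, str]]:
    """Return (line_number, match) pairs for every legacy reference in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LEGACY_PATTERNS.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Running it over the old and new sides of a compose file should report hits only for the old side.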
||||||
|
|
@ -10,7 +10,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -10,7 +10,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
otel-collector:
|
otel-collector:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -18,8 +18,8 @@ start_demo() {
|
||||||
echo ".env file created with OPENAI_API_KEY."
|
echo ".env file created with OPENAI_API_KEY."
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 3: Start Arch
|
# Step 3: Start Plano
|
||||||
echo "Starting Arch with config.yaml..."
|
echo "Starting Plano with config.yaml..."
|
||||||
planoai up config.yaml
|
planoai up config.yaml
|
||||||
|
|
||||||
# Step 4: Start developer services
|
# Step 4: Start developer services
|
||||||
|
|
@ -33,8 +33,8 @@ stop_demo() {
|
||||||
echo "Stopping Network Agent using Docker Compose..."
|
echo "Stopping Network Agent using Docker Compose..."
|
||||||
docker compose down
|
docker compose down
|
||||||
|
|
||||||
# Step 2: Stop Arch
|
# Step 2: Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -17,7 +17,7 @@ Make sure your machine is up to date with [latest version of plano]([url](https:
|
||||||
# Or if installed with uv: uvx planoai up --service plano --foreground
|
# Or if installed with uv: uvx planoai up --service plano --foreground
|
||||||
2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.6
|
2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.6
|
||||||
2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/config.yaml
|
2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/config.yaml
|
||||||
2025-05-30 18:00:10,422 - cli.core - INFO - Starting arch gateway, image name: plano, tag: katanemo/plano:0.4.6
|
2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.6
|
||||||
2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
|
2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
|
||||||
2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
|
2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
|
||||||
2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!
|
2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!
|
||||||
|
|
|
||||||
|
|
@ -8,14 +8,14 @@ services:
|
||||||
- "12000:12000"
|
- "12000:12000"
|
||||||
- "12001:12001"
|
- "12001:12001"
|
||||||
environment:
|
environment:
|
||||||
- ARCH_CONFIG_PATH=/app/arch_config.yaml
|
- PLANO_CONFIG_PATH=/app/plano_config.yaml
|
||||||
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
||||||
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY environment variable is required but not set}
|
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY environment variable is required but not set}
|
||||||
- OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
|
- OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
|
||||||
- OTEL_TRACING_ENABLED=true
|
- OTEL_TRACING_ENABLED=true
|
||||||
- RUST_LOG=debug
|
- RUST_LOG=debug
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml:ro
|
- ./config.yaml:/app/plano_config.yaml:ro
|
||||||
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
||||||
|
|
||||||
anythingllm:
|
anythingllm:
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
# Use Case Demo: Bearer Authorization with Spotify APIs
|
# Use Case Demo: Bearer Authorization with Spotify APIs
|
||||||
|
|
||||||
In this demo, we show how you can use Arch's bearer authorization capability to connect your agentic apps to third-party APIs.
|
In this demo, we show how you can use Plano's bearer authorization capability to connect your agentic apps to third-party APIs.
|
||||||
More specifically, we demonstrate how you can connect to two Spotify APIs:
|
More specifically, we demonstrate how you can connect to two Spotify APIs:
|
||||||
|
|
||||||
- [`/v1/browse/new-releases`](https://developer.spotify.com/documentation/web-api/reference/get-new-releases)
|
- [`/v1/browse/new-releases`](https://developer.spotify.com/documentation/web-api/reference/get-new-releases)
|
||||||
|
|
@ -23,7 +23,7 @@ Where users can engage by asking questions like _"Show me the latest releases in
|
||||||
SPOTIFY_CLIENT_KEY=your_spotify_api_token
|
SPOTIFY_CLIENT_KEY=your_spotify_api_token
|
||||||
```
|
```
|
||||||
|
|
||||||
3. Start Arch
|
3. Start Plano
|
||||||
```sh
|
```sh
|
||||||
sh run_demo.sh
|
sh run_demo.sh
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -10,7 +10,7 @@ services:
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
|
|
||||||
jaeger:
|
jaeger:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -18,8 +18,8 @@ start_demo() {
|
||||||
echo ".env file created with OPENAI_API_KEY."
|
echo ".env file created with OPENAI_API_KEY."
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Step 3: Start Arch
|
# Step 3: Start Plano
|
||||||
echo "Starting Arch with config.yaml..."
|
echo "Starting Plano with config.yaml..."
|
||||||
planoai up config.yaml
|
planoai up config.yaml
|
||||||
|
|
||||||
# Step 4: Start developer services
|
# Step 4: Start developer services
|
||||||
|
|
@ -33,8 +33,8 @@ stop_demo() {
|
||||||
echo "Stopping Network Agent using Docker Compose..."
|
echo "Stopping Network Agent using Docker Compose..."
|
||||||
docker compose down
|
docker compose down
|
||||||
|
|
||||||
# Step 2: Stop Arch
|
# Step 2: Stop Plano
|
||||||
echo "Stopping Arch..."
|
echo "Stopping Plano..."
|
||||||
planoai down
|
planoai down
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -8,10 +8,10 @@ services:
|
||||||
- "12000:12000"
|
- "12000:12000"
|
||||||
- "8001:8001"
|
- "8001:8001"
|
||||||
environment:
|
environment:
|
||||||
- ARCH_CONFIG_PATH=/config/config.yaml
|
- PLANO_CONFIG_PATH=/config/config.yaml
|
||||||
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
- OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
|
||||||
volumes:
|
volumes:
|
||||||
- ./config.yaml:/app/arch_config.yaml
|
- ./config.yaml:/app/plano_config.yaml
|
||||||
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
- /etc/ssl/cert.pem:/etc/ssl/cert.pem
|
||||||
weather-agent:
|
weather-agent:
|
||||||
build:
|
build:
|
||||||
|
|
|
||||||
|
|
@ -19,7 +19,7 @@ def _on_build_finished(app: Sphinx, exception: Exception | None) -> None:
|
||||||
return
|
return
|
||||||
|
|
||||||
# Source path: provider_models.yaml is copied into the Docker image at /docs/provider_models.yaml
|
# Source path: provider_models.yaml is copied into the Docker image at /docs/provider_models.yaml
|
||||||
# This follows the pattern used for config templates like envoy.template.yaml and arch_config_schema.yaml
|
# This follows the pattern used for config templates like envoy.template.yaml and plano_config_schema.yaml
|
||||||
docs_root = Path(app.srcdir).parent # Goes from source/ to docs/
|
docs_root = Path(app.srcdir).parent # Goes from source/ to docs/
|
||||||
source_path = docs_root / "provider_models.yaml"
|
source_path = docs_root / "provider_models.yaml"
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -48,7 +48,7 @@ prompt_targets:
|
||||||
description: Time range in days for which to gather device statistics. Defaults to 7.
|
description: Time range in days for which to gather device statistics. Defaults to 7.
|
||||||
default: 7
|
default: 7
|
||||||
|
|
||||||
# Arch creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
|
# Plano creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
|
||||||
endpoints:
|
endpoints:
|
||||||
app_server:
|
app_server:
|
||||||
# value could be ip address or a hostname with port
|
# value could be ip address or a hostname with port
|
||||||
|
|
|
||||||
|
|
@ -18,7 +18,7 @@ prompt_targets:
|
||||||
endpoint:
|
endpoint:
|
||||||
name: app_server
|
name: app_server
|
||||||
path: /agent/summary
|
path: /agent/summary
|
||||||
# Arch uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
|
# Plano uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
|
||||||
auto_llm_dispatch_on_response: true
|
auto_llm_dispatch_on_response: true
|
||||||
# override system prompt for this prompt target
|
# override system prompt for this prompt target
|
||||||
system_prompt: You are a helpful information extraction assistant. Use the information that is provided to you.
|
system_prompt: You are a helpful information extraction assistant. Use the information that is provided to you.
|
||||||
|
|
@ -39,7 +39,7 @@ prompt_targets:
|
||||||
default: false
|
default: false
|
||||||
enum: [true, false]
|
enum: [true, false]
|
||||||
|
|
||||||
# Arch creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
|
# Plano creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
|
||||||
endpoints:
|
endpoints:
|
||||||
app_server:
|
app_server:
|
||||||
# value could be ip address or a hostname with port
|
# value could be ip address or a hostname with port
|
||||||
|
|
|
||||||
|
|
@ -54,7 +54,7 @@ The OpenAI SDK works with any provider through Plano's OpenAI-compatible endpoin
|
||||||
base_url="http://127.0.0.1:12000/v1"
|
base_url="http://127.0.0.1:12000/v1"
|
||||||
)
|
)
|
||||||
|
|
||||||
# Use any model configured in your arch_config.yaml
|
# Use any model configured in your plano_config.yaml
|
||||||
completion = client.chat.completions.create(
|
completion = client.chat.completions.create(
|
||||||
model="gpt-4o-mini", # Or use :ref:`model aliases <model_aliases>` like "fast-model"
|
model="gpt-4o-mini", # Or use :ref:`model aliases <model_aliases>` like "fast-model"
|
||||||
max_tokens=50,
|
max_tokens=50,
|
||||||
|
|
@ -231,7 +231,7 @@ The Anthropic SDK works with any provider through Plano's Anthropic-compatible e
|
||||||
base_url="http://127.0.0.1:12000"
|
base_url="http://127.0.0.1:12000"
|
||||||
)
|
)
|
||||||
|
|
||||||
# Use any model configured in your arch_config.yaml
|
# Use any model configured in your plano_config.yaml
|
||||||
message = client.messages.create(
|
message = client.messages.create(
|
||||||
model="claude-3-5-sonnet-20241022",
|
model="claude-3-5-sonnet-20241022",
|
||||||
max_tokens=50,
|
max_tokens=50,
|
||||||
|
|
|
||||||
|
|
@ -114,7 +114,7 @@ Example 1: Adjusting Retrieval
|
||||||
|
|
||||||
User: What are the benefits of renewable energy?
|
User: What are the benefits of renewable energy?
|
||||||
**[Plano]**: Check if there is an available <prompt_target> that can handle this user query.
|
**[Plano]**: Check if there is an available <prompt_target> that can handle this user query.
|
||||||
**[Plano]**: Found "get_info_for_energy_source" prompt_target in arch_config.yaml. Forward prompt to the endpoint configured in "get_info_for_energy_source"
|
**[Plano]**: Found "get_info_for_energy_source" prompt_target in plano_config.yaml. Forward prompt to the endpoint configured in "get_info_for_energy_source"
|
||||||
...
|
...
|
||||||
Assistant: Renewable energy reduces greenhouse gas emissions, lowers air pollution, and provides sustainable power sources like solar and wind.
|
Assistant: Renewable energy reduces greenhouse gas emissions, lowers air pollution, and provides sustainable power sources like solar and wind.
|
||||||
|
|
||||||
|
|
@ -130,13 +130,13 @@ Example 2: Switching Intent
|
||||||
|
|
||||||
User: What are the symptoms of diabetes?
|
User: What are the symptoms of diabetes?
|
||||||
**[Plano]**: Check if there is an available <prompt_target> that can handle this user query.
|
**[Plano]**: Check if there is an available <prompt_target> that can handle this user query.
|
||||||
**[Plano]**: Found "diseases_symptoms" prompt_target in arch_config.yaml. Forward disease=diabetes to "diseases_symptoms" prompt target
|
**[Plano]**: Found "diseases_symptoms" prompt_target in plano_config.yaml. Forward disease=diabetes to "diseases_symptoms" prompt target
|
||||||
...
|
...
|
||||||
Assistant: Common symptoms include frequent urination, excessive thirst, fatigue, and blurry vision.
|
Assistant: Common symptoms include frequent urination, excessive thirst, fatigue, and blurry vision.
|
||||||
|
|
||||||
User: How is it diagnosed?
|
User: How is it diagnosed?
|
||||||
**[Plano]**: New intent detected.
|
**[Plano]**: New intent detected.
|
||||||
**[Plano]**: Found "disease_diagnoses" prompt_target in arch_config.yaml. Forward disease=diabetes to "disease_diagnoses" prompt target
|
**[Plano]**: Found "disease_diagnoses" prompt_target in plano_config.yaml. Forward disease=diabetes to "disease_diagnoses" prompt target
|
||||||
...
|
...
|
||||||
Assistant: Diabetes is diagnosed through blood tests like fasting blood sugar, A1C, or an oral glucose tolerance test.
|
Assistant: Diabetes is diagnosed through blood tests like fasting blood sugar, A1C, or an oral glucose tolerance test.
|
||||||
|
|
||||||
|
|
@ -172,7 +172,7 @@ Once the prompt targets are configured as above, handle parameters across multi-
|
||||||
Demo App
|
Demo App
|
||||||
--------
|
--------
|
||||||
|
|
||||||
For your convenience, we've built a `demo app <https://github.com/katanemo/archgw/tree/main/demos/samples_python/multi_turn_rag_agent>`_
|
For your convenience, we've built a `demo app <https://github.com/katanemo/plano/tree/main/demos/samples_python/multi_turn_rag_agent>`_
|
||||||
that you can test and modify locally for multi-turn RAG scenarios.
|
that you can test and modify locally for multi-turn RAG scenarios.
|
||||||
|
|
||||||
.. figure:: ../build_with_plano/includes/multi_turn/mutli-turn-example.png
|
.. figure:: ../build_with_plano/includes/multi_turn/mutli-turn-example.png
|
||||||
|
|
|
||||||
|
|
@ -24,7 +24,7 @@ prompt_targets:
|
||||||
name: app_server
|
name: app_server
|
||||||
path: /agent/summary
|
path: /agent/summary
|
||||||
http_method: POST
|
http_method: POST
|
||||||
# Arch uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
|
# Plano uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
|
||||||
auto_llm_dispatch_on_response: true
|
auto_llm_dispatch_on_response: true
|
||||||
# override system prompt for this prompt target
|
# override system prompt for this prompt target
|
||||||
system_prompt: You are a helpful information extraction assistant. Use the information that is provided to you.
|
system_prompt: You are a helpful information extraction assistant. Use the information that is provided to you.
|
||||||
|
|
@ -46,7 +46,7 @@ prompt_targets:
|
||||||
default: false
|
default: false
|
||||||
enum: [true, false]
|
enum: [true, false]
|
||||||
|
|
||||||
# Arch creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
|
# Plano creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
|
||||||
endpoints:
|
endpoints:
|
||||||
app_server:
|
app_server:
|
||||||
# value could be ip address or a hostname with port
|
# value could be ip address or a hostname with port
|
||||||
|
|
|
||||||
|
|
@ -1,4 +1,4 @@
|
||||||
.. _arch_access_logging:
|
.. _plano_access_logging:
|
||||||
|
|
||||||
Access Logging
|
Access Logging
|
||||||
==============
|
==============
|
||||||
|
|
|
||||||
|
|
@ -1,4 +1,4 @@
|
||||||
.. _arch_overview_tracing:
|
.. _plano_overview_tracing:
|
||||||
|
|
||||||
Tracing
|
Tracing
|
||||||
=======
|
=======
|
||||||
|
|
|
||||||
|
|
@ -57,7 +57,7 @@ Configuration Overview
|
||||||
|
|
||||||
State storage is configured in the ``state_storage`` section of your ``plano_config.yaml``:
|
State storage is configured in the ``state_storage`` section of your ``plano_config.yaml``:
|
||||||
|
|
||||||
.. literalinclude:: ../resources/includes/arch_config_state_storage_example.yaml
|
.. literalinclude:: ../resources/includes/plano_config_state_storage_example.yaml
|
||||||
:language: yaml
|
:language: yaml
|
||||||
:lines: 21-30
|
:lines: 21-30
|
||||||
:linenos:
|
:linenos:
|
||||||
|
|
|
||||||
|
|
@ -4,10 +4,10 @@ Configuration Reference
|
||||||
=======================
|
=======================
|
||||||
|
|
||||||
The following is a complete reference of the ``plano_config.yml`` that controls the behavior of a single instance of
|
The following is a complete reference of the ``plano_config.yml`` that controls the behavior of a single instance of
|
||||||
the Arch gateway. This is where you enable capabilities like routing to upstream LLM providers, defining prompt_targets
|
the Plano gateway. This is where you enable capabilities like routing to upstream LLM providers, defining prompt_targets
|
||||||
where prompts get routed to, apply guardrails, and enable critical agent observability features.
|
where prompts get routed to, apply guardrails, and enable critical agent observability features.
|
||||||
|
|
||||||
.. literalinclude:: includes/arch_config_full_reference.yaml
|
.. literalinclude:: includes/plano_config_full_reference.yaml
|
||||||
:language: yaml
|
:language: yaml
|
||||||
:linenos:
|
:linenos:
|
||||||
:caption: :download:`Plano Configuration - Full Reference <includes/arch_config_full_reference.yaml>`
|
:caption: :download:`Plano Configuration - Full Reference <includes/plano_config_full_reference.yaml>`
|
||||||
|
|
|
||||||
|
|
@ -30,7 +30,7 @@ listeners:
|
||||||
- type: agent
|
- type: agent
|
||||||
name: agent_1
|
name: agent_1
|
||||||
port: 8001
|
port: 8001
|
||||||
router: arch_agent_router
|
router: plano_agent_router
|
||||||
agents:
|
agents:
|
||||||
- id: rag_agent
|
- id: rag_agent
|
||||||
description: virtual assistant for retrieval augmented generation tasks
|
description: virtual assistant for retrieval augmented generation tasks
|
||||||
|
|
|
||||||
|
|
@ -1,4 +1,4 @@
|
||||||
# Arch Gateway configuration version
|
# Plano Gateway configuration version
|
||||||
version: v0.3.0
|
version: v0.3.0
|
||||||
|
|
||||||
# External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)
|
# External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)
|
||||||
|
|
@ -41,7 +41,7 @@ The request processing path in Plano has three main parts:
|
||||||
|
|
||||||
These two subsystems are bridged by the HTTP router filter and the cluster manager subsystems of Envoy.
|
These two subsystems are bridged by the HTTP router filter and the cluster manager subsystems of Envoy.
|
||||||
|
|
||||||
Also, Plano utilizes `Envoy event-based thread model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <arch_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.
|
Also, Plano utilizes `Envoy event-based thread model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <plano_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.
|
||||||
|
|
||||||
Worker threads rarely share state and operate in a trivially parallel fashion. This threading model
|
Worker threads rarely share state and operate in a trivially parallel fashion. This threading model
|
||||||
enables scaling to very high core count CPUs.
|
enables scaling to very high core count CPUs.
|
||||||
|
|
@@ -130,8 +130,8 @@ Once a request completes, the stream is destroyed. The following also takes plac
 * The post-request :ref:`monitoring <monitoring>` are updated (e.g. timing, active requests, upgrades, health checks).
 Some statistics are updated earlier however, during request processing. Stats are batched and written by the main
 thread periodically.
-* :ref:`Access logs <arch_access_logging>` are written to the access log
-* :ref:`Trace <arch_overview_tracing>` spans are finalized. If our example request was traced, a
+* :ref:`Access logs <plano_access_logging>` are written to the access log
+* :ref:`Trace <plano_overview_tracing>` spans are finalized. If our example request was traced, a
 trace span, describing the duration and details of the request would be created by the HCM when
 processing request headers and then finalized by the HCM during post-request processing.
 
@@ -1,4 +1,4 @@
-.. _arch_overview_threading:
+.. _plano_overview_threading:
 
 Threading Model
 ===============
package-lock.json (generated, 4 changes)
@@ -1,11 +1,11 @@
 {
-"name": "archgw-monorepo",
+"name": "plano-monorepo",
 "version": "0.1.0",
 "lockfileVersion": 3,
 "requires": true,
 "packages": {
 "": {
-"name": "archgw-monorepo",
+"name": "plano-monorepo",
 "version": "0.1.0",
 "workspaces": [
 "apps/*",
Some files were not shown because too many files have changed in this diff.