apunkt/plano

mirror of https://github.com/katanemo/plano.git synced 2026-06-17 15:25:17 +02:00

Adil Hafeez 3f8aa14e4c

create md files for coding agents and for humans

2026-02-09 23:34:18 -08:00

Copilot Instructions for Plano (ArchGW)

System Identity

Plano is an AI-native gateway built on Envoy Proxy. It uses WASM filters for inline request processing and a native Rust service (Brightstaff) for orchestration. All components run in a single container managed by Supervisord.

Critical Architectural Rules

1. Envoy Is the Data Plane — Never Bypass It

All external traffic MUST flow through Envoy. Brightstaff NEVER makes direct outbound HTTP calls to LLM providers or developer APIs. It always routes through Envoy listeners:

LLM requests → localhost:12001 (egress LLM listener with llm_gateway.wasm)
Agent/API requests → localhost:11000 (outbound API listener)

Do not add direct HTTP calls from Brightstaff to external services. Use Envoy's cluster routing via x-arch-* headers instead.
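As a minimal sketch of this rule, the helpers below build outbound request descriptions that always target the local Envoy listeners and carry the routing decision in an x-arch-* header. The `OutboundRequest` type, the function names, and the example header values are hypothetical and are not taken from the Brightstaff source; only the listener ports and header names come from this document.

```rust
/// Envoy listener addresses, as listed in the rule above.
const LLM_LISTENER: &str = "http://localhost:12001";
const API_LISTENER: &str = "http://localhost:11000";

/// Hypothetical description of an outbound call (not a Brightstaff type).
#[derive(Debug, PartialEq)]
pub struct OutboundRequest {
    pub url: String,
    pub headers: Vec<(String, String)>,
}

/// Build an LLM request aimed at the Envoy egress listener, never at the
/// provider's own host. The provider choice travels in a header.
pub fn llm_request(path: &str, provider_hint: &str) -> OutboundRequest {
    OutboundRequest {
        url: format!("{}{}", LLM_LISTENER, path),
        headers: vec![(
            "x-arch-llm-provider-hint".to_string(),
            provider_hint.to_string(),
        )],
    }
}

/// Build an agent/API request aimed at the outbound API listener. The
/// target cluster is named in a header so Envoy can route it.
pub fn agent_request(path: &str, upstream_cluster: &str) -> OutboundRequest {
    OutboundRequest {
        url: format!("{}{}", API_LISTENER, path),
        headers: vec![(
            "x-arch-upstream".to_string(),
            upstream_cluster.to_string(),
        )],
    }
}
```

Keeping the listener address in one place makes an accidental direct call to an external host easier to spot in review.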

2. WASM Crate Constraints

prompt_gateway and llm_gateway compile to wasm32-wasip1. This means:

No tokio, no async/await, no threads, no filesystem, no network sockets
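All I/O goes through the proxy-wasm SDK's dispatch_http_call, which is non-blocking: the response is delivered later through a callback, not through async/await
No crates that pull in std networking features; for example, use governor in its no_std configuration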
The crate-type is ["cdylib"] — these are shared libraries, not binaries
Test with cargo test (native), but build with --target wasm32-wasip1

Do not add dependencies to WASM crates that require std::net, tokio, reqwest, hyper, or any async runtime.
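The callback-based style that replaces async/await in the WASM crates can be illustrated without the SDK. The sketch below is plain std Rust: `MockHost` and `Filter` are hypothetical stand-ins, and their method signatures are simplified compared with the real proxy-wasm traits. It shows only the shape of the pattern: a dispatch returns a token immediately, and the filter keeps per-token state until the matching response callback fires.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the proxy host. It only records dispatches.
#[derive(Default)]
pub struct MockHost {
    next_token: u32,
    pub dispatched: Vec<(u32, String)>,
}

impl MockHost {
    /// Returns at once with a token; no response is available yet.
    pub fn dispatch_http_call(&mut self, upstream: &str) -> u32 {
        let token = self.next_token;
        self.next_token += 1;
        self.dispatched.push((token, upstream.to_string()));
        token
    }
}

/// Hypothetical filter that remembers why each call was made.
#[derive(Default)]
pub struct Filter {
    pending: HashMap<u32, String>,
    /// (purpose, response body length) for every completed call.
    pub completed: Vec<(String, usize)>,
}

impl Filter {
    /// Start a call and store its purpose under the returned token.
    pub fn start_call(&mut self, host: &mut MockHost, upstream: &str, purpose: &str) -> u32 {
        let token = host.dispatch_http_call(upstream);
        self.pending.insert(token, purpose.to_string());
        token
    }

    /// Invoked later by the host; unknown tokens are ignored.
    pub fn on_http_call_response(&mut self, token: u32, body: &[u8]) {
        if let Some(purpose) = self.pending.remove(&token) {
            self.completed.push((purpose, body.len()));
        }
    }
}
```

Because responses may arrive in any order, all continuation state lives in the token map, not on a call stack.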

3. Crate Dependency Direction

prompt_gateway → common
llm_gateway    → common, hermesllm
common         → hermesllm
brightstaff    → common (non-WASM parts), hermesllm
hermesllm      → (standalone, no proxy-wasm)

hermesllm must NEVER depend on proxy-wasm or common — it's a pure Rust library usable outside WASM
common provides the proxy-wasm abstractions — WASM crates use common, not raw proxy-wasm directly (except for the SDK traits)
brightstaff uses hermesllm directly for LLM types but does NOT use common's WASM-specific code (like proxy-wasm Client trait)
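One possible way to make the dependency direction checkable is to encode the allowed edges as data, for example inside a small lint or CI helper. The table below is copied from the list above; the `ALLOWED` constant and `is_allowed` function are hypothetical and do not exist in the repository.

```rust
/// Allowed crate -> dependency edges, copied from the list above.
const ALLOWED: &[(&str, &[&str])] = &[
    ("prompt_gateway", &["common"]),
    ("llm_gateway", &["common", "hermesllm"]),
    ("common", &["hermesllm"]),
    ("brightstaff", &["common", "hermesllm"]),
    ("hermesllm", &[]),
];

/// Returns true only if `from` is permitted to depend on `to`.
/// Unknown crates and unlisted edges are rejected.
pub fn is_allowed(from: &str, to: &str) -> bool {
    ALLOWED
        .iter()
        .any(|(krate, deps)| *krate == from && deps.iter().any(|d| *d == to))
}
```

This covers only workspace-internal edges; the ban on hermesllm depending on proxy-wasm would need a separate check against its external dependencies.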

4. Header-Based Routing Protocol

Envoy routes requests using custom headers. These are the canonical header names defined in common/src/consts.rs:

Header	Purpose	Do NOT change
`x-arch-llm-provider`	Envoy route matching for LLM provider cluster	Used in envoy.template.yaml
`x-arch-llm-provider-hint`	Brightstaff → llm_gateway provider selection	Both sides must agree
`x-arch-upstream`	Targets a specific agent/API cluster in Envoy	Used in envoy.template.yaml
`x-arch-streaming-request`	Signals streaming mode	llm_gateway reads this
`x-arch-state`	Multi-turn conversation state in prompt_gateway	Serialized JSON
`x-arch-tool-call-message`	Tool call metadata	prompt_gateway internal
`x-arch-api-response-message`	Developer API response	prompt_gateway internal
`x-arch-agent-listener-name`	Identifies agent listener	Set by Envoy, read by Brightstaff
`x-arch-llm-route`	LLM route decision result	Brightstaff ↔ llm_gateway

Changing header names requires updating: consts.rs, envoy.template.yaml, and all consumers.
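To illustrate why the names are centralized, here is a sketch of what such a constants module might look like. The string values come from the table above; the Rust constant names, the `ALL_ARCH_HEADERS` array, and the `is_arch_header` helper are assumptions and may not match the real common/src/consts.rs.

```rust
// Constant names are hypothetical; only the string values come from the table.
pub const ARCH_LLM_PROVIDER: &str = "x-arch-llm-provider";
pub const ARCH_LLM_PROVIDER_HINT: &str = "x-arch-llm-provider-hint";
pub const ARCH_UPSTREAM: &str = "x-arch-upstream";
pub const ARCH_STREAMING_REQUEST: &str = "x-arch-streaming-request";
pub const ARCH_STATE: &str = "x-arch-state";
pub const ARCH_TOOL_CALL_MESSAGE: &str = "x-arch-tool-call-message";
pub const ARCH_API_RESPONSE_MESSAGE: &str = "x-arch-api-response-message";
pub const ARCH_AGENT_LISTENER_NAME: &str = "x-arch-agent-listener-name";
pub const ARCH_LLM_ROUTE: &str = "x-arch-llm-route";

/// Every reserved routing header, for validation or logging.
pub const ALL_ARCH_HEADERS: [&str; 9] = [
    ARCH_LLM_PROVIDER,
    ARCH_LLM_PROVIDER_HINT,
    ARCH_UPSTREAM,
    ARCH_STREAMING_REQUEST,
    ARCH_STATE,
    ARCH_TOOL_CALL_MESSAGE,
    ARCH_API_RESPONSE_MESSAGE,
    ARCH_AGENT_LISTENER_NAME,
    ARCH_LLM_ROUTE,
];

/// True if `name` is one of the reserved headers. The comparison ignores
/// ASCII case because HTTP header names are case-insensitive.
pub fn is_arch_header(name: &str) -> bool {
    ALL_ARCH_HEADERS.iter().any(|h| h.eq_ignore_ascii_case(name))
}
```

Referring to the constants everywhere, never to string literals, means a rename shows up as a compile error on the Rust side, although envoy.template.yaml still has to be updated by hand.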

5. Build System

# WASM filters — must use wasm32-wasip1 target
cargo build --release --target wasm32-wasip1 -p prompt_gateway -p llm_gateway

# Brightstaff — native binary
cargo build --release -p brightstaff

The workspace uses Rust edition 2021 and resolver "2". The workspace root is crates/Cargo.toml.

6. Configuration Flow

User config (arch_config.yaml) is validated and rendered by cli/planoai/config_generator.py:

Schema: config/arch_config_schema.yaml
Template: config/envoy.template.yaml (Jinja2)
Output: envoy.yaml (for Envoy) + arch_config_rendered.yaml (for Brightstaff + WASM filter configs)

When adding new config fields: update the schema, the template (if Envoy-relevant), the Python generator, AND the Rust Configuration struct in common/src/configuration.rs.

7. Internal Model Names

These are reserved model names used internally — do not conflict with them:

Arch-Function — intent classification / function calling
Arch-Router — used as a route name prefix, not as a direct model name
Plano-Orchestrator — agent selection orchestrator

8. API Compatibility

Brightstaff exposes the following endpoints, most of which are compatible with the OpenAI or Anthropic APIs:

/v1/chat/completions — Chat Completions API
/v1/messages — Anthropic Messages API compatible
/v1/responses — OpenAI Responses API with state management
/function_calling — Internal Arch-Function endpoint

The /agents/ prefix variants mirror these for agent orchestration.

Do NOT change these path structures without updating consts.rs, Brightstaff router, and envoy.template.yaml.

9. Streaming

LLM responses use SSE (Server-Sent Events) format: data: {json}\n\n
The llm_gateway WASM filter handles SSE stream reassembly across chunk boundaries via SseStreamBuffer
Brightstaff uses mpsc channels for streaming responses back to clients
Bedrock uses AWS Event Stream binary protocol — decoded by hermesllm
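Reassembly across chunk boundaries can be sketched as follows. This is a deliberately minimal, hypothetical `SseBuffer` and not the actual SseStreamBuffer from llm_gateway. It assumes `\n\n` event terminators (no `\r\n`), single-line `data:` fields, and it ignores other SSE fields such as `event:` or `id:`.

```rust
/// Hypothetical minimal SSE reassembly buffer.
#[derive(Default)]
pub struct SseBuffer {
    /// Bytes received so far that do not yet form a complete event.
    pending: Vec<u8>,
}

impl SseBuffer {
    /// Append a network chunk and return the `data:` payload of every
    /// event that is now complete. A partial event stays buffered until
    /// a later chunk supplies its terminating blank line.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.pending.extend_from_slice(chunk);
        let mut events = Vec::new();
        // Look for the "\n\n" terminator; bytes are used so that a chunk
        // boundary inside a multi-byte UTF-8 character is harmless.
        while let Some(end) = self.pending.windows(2).position(|w| w == b"\n\n") {
            let raw: Vec<u8> = self.pending.drain(..end + 2).collect();
            let text = String::from_utf8_lossy(&raw);
            for line in text.lines() {
                if let Some(payload) = line.strip_prefix("data:") {
                    events.push(payload.trim_start().to_string());
                }
            }
        }
        events
    }

    /// Number of bytes still waiting for a terminator.
    pub fn buffered_len(&self) -> usize {
        self.pending.len()
    }
}
```

Feeding the buffer `data: {"a"` and then `:1}\n\n` yields nothing on the first call and the single payload `{"a":1}` on the second.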

10. Testing Conventions

WASM crates: unit tests run natively (cargo test), NOT under WASM runtime
Brightstaff: unit tests with mockito for HTTP mocking
E2E tests: separate tests/ directory, run via GitHub Actions workflows
Config validation tests: cli/test/test_config_generator.py

File Layout Reference

crates/
  Cargo.toml          # Workspace root
  brightstaff/        # Native Rust HTTP server (Axum)
  common/             # Shared types, config, HTTP, rate limiting
  hermesllm/          # LLM protocol translation (pure Rust)
  llm_gateway/        # WASM filter: provider routing, auth, rate limits
  prompt_gateway/     # WASM filter: intent matching, guardrails
config/
  arch_config_schema.yaml   # User config JSON schema
  envoy.template.yaml       # Jinja2 template → envoy.yaml
  docker-compose.dev.yaml   # Dev environment
cli/
  planoai/                  # Python CLI (config generator, Docker management)
