# CLAUDE.md
Plano is an AI-native proxy server and data plane for agentic applications, built on Envoy proxy. It centralizes agent orchestration, LLM routing, observability, and safety guardrails as an out-of-process dataplane.
## Build & Test Commands
```bash
# Rust — WASM plugins (must target wasm32-wasip1)
cd crates && cargo build --release --target=wasm32-wasip1 -p llm_gateway -p prompt_gateway

# Rust — brightstaff binary (native target)
cd crates && cargo build --release -p brightstaff

# Rust — tests, format, lint
cd crates && cargo test --lib
cd crates && cargo fmt --all -- --check
cd crates && cargo clippy --locked --all-targets --all-features -- -D warnings

# Python CLI
cd cli && uv sync && uv run pytest -v

# JS/TS (Turbo monorepo)
npm run build && npm run lint && npm run typecheck

# Pre-commit (fmt, clippy, cargo test, black, yaml)
pre-commit run --all-files

# Docker
docker build -t katanemo/plano:latest .
```
E2E tests require a Docker image and API keys: `tests/e2e/run_e2e_tests.sh`
## Architecture
```
Client → Envoy (prompt_gateway.wasm → llm_gateway.wasm) → Agents/LLM Providers
↕
brightstaff (native binary: state, routing, signals, tracing)
```
### Crates (crates/)
- **prompt_gateway** (WASM) — Proxy-WASM filter for prompt processing, guardrails, filter chains
- **llm_gateway** (WASM) — Proxy-WASM filter for LLM request/response handling and routing
- **brightstaff** (native) — Core server: handlers, router, signals, state, tracing
- **common** (lib) — Shared: config, HTTP, routing, rate limiting, tokenizer, PII, tracing
- **hermesllm** (lib) — LLM API translation between providers. Key types: `ProviderId`, `ProviderRequest`, `ProviderResponse`, `ProviderStreamResponse`
### Python CLI (cli/planoai/)
Entry point: `main.py`. Built with `rich-click`. Commands: `up`, `down`, `build`, `logs`, `trace`, `init`, `cli_agent`, `generate_prompt_targets`.
### Config (config/)
- `plano_config_schema.yaml` — JSON Schema for validating user configs
- `envoy.template.yaml` — Jinja2 template → Envoy config
- `supervisord.conf` — Process supervisor for Envoy + brightstaff
### JS Apps (apps/, packages/)
Turbo monorepo with Next.js 16 / React 19. Not part of the core proxy.
## WASM Plugin Rules
Code in `prompt_gateway` and `llm_gateway` runs in Envoy's WASM sandbox:
- **No std networking/filesystem** — use proxy-wasm host calls only
- **No tokio/async** — synchronous, callback-driven. `Action::Pause` / `Action::Continue` for flow control
- **Lifecycle**: `RootContext` → `on_configure`, `create_http_context`; `HttpContext` → `on_http_request/response_headers/body`
- **HTTP callouts**: `dispatch_http_call()` → store context in `callouts: RefCell<HashMap<u32, CallContext>>` → match in `on_http_call_response()`
- **Config**: `Rc`-wrapped, loaded once in `on_configure()` via `serde_yaml::from_slice()`
- **Dependencies must be no_std compatible** (e.g., `governor` with `features = ["no_std"]`)
- **Crate type**: `cdylib` → produces `.wasm`
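
The HTTP callout rule above can be sketched without the proxy-wasm SDK. This is a std-only simulation, not the plugin code: `FilterSketch`, the `CallContext` variants, and the counter that stands in for the token returned by `dispatch_http_call()` are all hypothetical names chosen for illustration.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

/// Hypothetical per-callout state. The real `CallContext` fields are not
/// shown in this document.
#[derive(Debug, Clone, PartialEq)]
pub enum CallContext {
    Guardrail { prompt: String },
    Embedding { text: String },
}

/// Stand-in for a filter context. In a real plugin the token would come from
/// the host via `dispatch_http_call()`; here a counter simulates it.
pub struct FilterSketch {
    callouts: RefCell<HashMap<u32, CallContext>>,
    next_token: Cell<u32>,
}

impl FilterSketch {
    pub fn new() -> Self {
        FilterSketch {
            callouts: RefCell::new(HashMap::new()),
            next_token: Cell::new(0),
        }
    }

    /// Simulated dispatch: allocate a token and remember why the call was made.
    pub fn dispatch(&self, ctx: CallContext) -> u32 {
        let token = self.next_token.get();
        self.next_token.set(token + 1);
        self.callouts.borrow_mut().insert(token, ctx);
        token
    }

    /// Mirrors the shape of `on_http_call_response()`: look the context up by
    /// token, remove it so it cannot be handled twice, then branch on it.
    pub fn on_http_call_response(&self, token_id: u32) -> Option<String> {
        let ctx = self.callouts.borrow_mut().remove(&token_id)?;
        Some(match ctx {
            CallContext::Guardrail { prompt } => format!("guardrail result for: {prompt}"),
            CallContext::Embedding { text } => format!("embedding result for: {text}"),
        })
    }

    /// Number of callouts still waiting for a response.
    pub fn pending(&self) -> usize {
        self.callouts.borrow().len()
    }
}
```

`RefCell` gives interior mutability behind `&self`, which suits the single-threaded, callback-driven sandbox. Removing the entry on response keeps the map from growing and makes a duplicate response a no-op.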
## Adding a New LLM Provider
1. Add variant to `ProviderId` in `crates/hermesllm/src/providers/id.rs` + `TryFrom<&str>`
2. Create request/response types in `crates/hermesllm/src/apis/` if non-OpenAI format
3. Add variant to `ProviderRequestType`/`ProviderResponseType` enums, update all match arms
4. Add models to `crates/hermesllm/src/providers/provider_models.yaml`
5. Update `SupportedUpstreamAPIs` mapping if needed
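
Step 1 might look roughly like the sketch below. It is an illustration under assumptions, not the contents of `id.rs`: the variants shown (including the new `ExampleAi`), the string keys, and the `String` error type are all hypothetical.

```rust
use std::convert::TryFrom;

/// Hypothetical subset of provider identifiers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderId {
    OpenAI,
    Anthropic,
    /// The newly added provider (hypothetical name).
    ExampleAi,
}

impl TryFrom<&str> for ProviderId {
    // Assumption: a plain String error; the real crate may use its own error type.
    type Error = String;

    fn try_from(value: &str) -> Result<Self, Self::Error> {
        match value {
            "openai" => Ok(ProviderId::OpenAI),
            "anthropic" => Ok(ProviderId::Anthropic),
            // New arm for the new variant; forgetting it makes the provider
            // unreachable from config even though the enum compiles.
            "example_ai" => Ok(ProviderId::ExampleAi),
            other => Err(format!("unknown provider: {other}")),
        }
    }
}
```

Because the string match has a catch-all arm, the compiler will not flag a missing mapping here, so this is the one place in the checklist that needs a manual check or a unit test.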
## Release Process
Update version (e.g., `0.4.11` → `0.4.12`) in all of these files:
- `.github/workflows/ci.yml`, `build_filter_image.sh`, `config/validate_plano_config.sh`
- `cli/planoai/__init__.py`, `cli/planoai/consts.py`, `cli/pyproject.toml`
- `docs/source/conf.py`, `docs/source/get_started/quickstart.rst`, `docs/source/resources/deployment.rst`
- `apps/www/src/components/Hero.tsx`, `demos/llm_routing/preference_based_routing/README.md`

Do NOT change version strings in `*.lock` files or `Cargo.lock`. Commit message: `release X.Y.Z`
## Workflow Preferences
- **Commits:** No `Co-Authored-By`. Short one-line messages. Never push directly to `main` — always feature branch + PR.
- **Branches:** Use `adil/<feature_name>` format.
- **Issues:** When a GitHub issue URL is pasted, fetch all context first. Goal is always a PR with passing tests.
## Key Conventions
- Rust edition 2021, `cargo fmt`, `cargo clippy -D warnings`
- Python: Black. Rust errors: `thiserror` with `#[from]`
- API keys from env vars or `.env` , never hardcoded
- Provider dispatch: `ProviderRequestType`/`ProviderResponseType` enums implementing `ProviderRequest`/`ProviderResponse` traits
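
The provider dispatch convention can be illustrated with a minimal sketch. Only the names `ProviderRequest` and `ProviderRequestType` come from this document. The trait methods (`model`, `is_streaming`), the two request structs, and their fields are hypothetical stand-ins, so treat this as the shape of the pattern rather than the actual hermesllm API.

```rust
/// Hypothetical trait surface; the real `ProviderRequest` trait defines its own methods.
pub trait ProviderRequest {
    fn model(&self) -> &str;
    fn is_streaming(&self) -> bool;
}

/// Hypothetical request shape where streaming is a plain boolean.
pub struct ChatCompletionsRequest {
    pub model: String,
    pub stream: bool,
}

/// Hypothetical request shape where the streaming flag is optional.
pub struct MessagesRequest {
    pub model: String,
    pub stream: Option<bool>,
}

impl ProviderRequest for ChatCompletionsRequest {
    fn model(&self) -> &str {
        &self.model
    }
    fn is_streaming(&self) -> bool {
        self.stream
    }
}

impl ProviderRequest for MessagesRequest {
    fn model(&self) -> &str {
        &self.model
    }
    fn is_streaming(&self) -> bool {
        // A missing flag is treated as non-streaming in this sketch.
        self.stream.unwrap_or(false)
    }
}

/// One variant per wire format; callers hold this enum and never branch on the format.
pub enum ProviderRequestType {
    ChatCompletions(ChatCompletionsRequest),
    Messages(MessagesRequest),
}

impl ProviderRequest for ProviderRequestType {
    fn model(&self) -> &str {
        match self {
            Self::ChatCompletions(r) => r.model(),
            Self::Messages(r) => r.model(),
        }
    }
    fn is_streaming(&self) -> bool {
        match self {
            Self::ChatCompletions(r) => r.is_streaming(),
            Self::Messages(r) => r.is_streaming(),
        }
    }
}
```

Since these matches have no wildcard arm, adding a variant makes every unhandled `match` a compile error, which lets the compiler enforce the "update all match arms" step when adding a provider.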