Update CLAUDE.md with comprehensive codebase documentation

Add repository structure overview, detailed Rust crate file layouts, testing strategy breakdown (unit/E2E/integration/REST/HURL), CI/CD workflow descriptions, toolchain requirements, and expanded key conventions including state backends and hermesllm provider guidance. https://claude.ai/code/session_01FxkSG37u4urpiZBPZcV2ke
2026-06-17 15:25:17 +02:00 · 2026-02-12 00:06:26 +00:00 · 2026-02-12 00:06:26 +00:00 · 123a1ed640
commit 123a1ed640
parent 3c8e899de3
1 changed files with 100 additions and 8 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -6,6 +6,31 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 Plano is an AI-native proxy server and data plane for agentic applications, built on Envoy proxy. It centralizes agent orchestration, LLM routing, observability, and safety guardrails as an out-of-process dataplane.

+## Repository Structure
+
+```
+plano/
+├── crates/                  # Rust workspace (5 crates: WASM plugins + native binary + libraries)
+├── cli/                     # Python CLI tool (planoai) for managing Plano lifecycle
+├── apps/                    # Next.js 16 / React 19 web applications (Turbo monorepo)
+│   ├── www/                 # Main marketing/docs site with Sanity CMS
+│   └── katanemo-www/        # Katanemo company website
+├── packages/                # Shared TypeScript packages
+│   ├── ui/                  # React component library (Radix UI based)
+│   ├── shared-styles/       # Global CSS
+│   ├── tailwind-config/     # Shared Tailwind v4 config
+│   └── tsconfig/            # Shared TypeScript configs
+├── config/                  # Configuration templates and schemas
+├── tests/                   # E2E, integration, REST API, and HURL tests
+├── demos/                   # Sample applications and use-case demos
+├── docs/                    # Sphinx documentation source
+├── .github/workflows/       # CI/CD (14 workflows)
+├── Dockerfile               # Multi-stage production build (Rust 1.93.0, Envoy v1.37.0, Python 3.13)
+├── turbo.json               # Turbo monorepo orchestration
+├── package.json             # Root npm workspace config
+└── .pre-commit-config.yaml  # Pre-commit hooks
+```
+
 ## Build & Test Commands

 ### Rust (crates/)
@ -58,7 +83,7 @@ docker build -t katanemo/plano:latest .

 ### E2E Tests (tests/e2e/)

-E2E tests require a built Docker image and API keys. They run via `tests/e2e/run_e2e_tests.sh` which executes three test suites: `test_prompt_gateway.py`, `test_model_alias_routing.py`, and `test_openai_responses_api_client_with_state.py`.
+E2E tests require a built Docker image and API keys. They run via `tests/e2e/run_e2e_tests.sh` which executes three test suites: `test_prompt_gateway.py`, `test_model_alias_routing.py`, and `test_openai_responses_api_client_with_state.py`. Individual suites can be run with `run_prompt_gateway_tests.sh`, `run_model_alias_tests.sh`, or `run_responses_state_tests.sh`.

 ## Architecture

@ -76,11 +101,21 @@ Client → Envoy (prompt_gateway.wasm → llm_gateway.wasm) → Agents/LLM Provi

 All crates share a Cargo workspace. Two compile to `wasm32-wasip1` for Envoy, the rest are native:

- **prompt_gateway** (WASM) — Proxy-WASM filter for prompt/message processing, guardrails, and filter chains
- **llm_gateway** (WASM) — Proxy-WASM filter for LLM request/response handling and routing
- **brightstaff** (native binary) — Core application server: handlers, router, signals, state management, tracing
- **common** (library) — Shared across all crates: configuration, LLM provider abstractions, HTTP utilities, routing logic, rate limiting, tokenizer, PII detection, tracing
- **hermesllm** (library) — Translates LLM API formats between providers (OpenAI, Anthropic, Gemini, Mistral, Grok, AWS Bedrock, Azure, together.ai). Key types: `ProviderId`, `ProviderRequest`, `ProviderResponse`, `ProviderStreamResponse`
+- **prompt_gateway** (WASM, `crate-type = ["cdylib"]`) — Proxy-WASM filter for prompt/message processing, guardrails, and filter chains. Key files: `lib.rs` (entry point), `http_context.rs`, `filter_context.rs`, `stream_context.rs`, `metrics.rs`
+- **llm_gateway** (WASM, `crate-type = ["cdylib"]`) — Proxy-WASM filter for LLM request/response handling and routing. Same structure as prompt_gateway
+- **brightstaff** (native binary) — Core application server. Key subdirectories:
+  - `handlers/` — Request handling: `function_calling.rs` (largest file), `agent_chat_completions.rs`, `llm.rs`, `pipeline_processor.rs`, `response_handler.rs`, `jsonrpc.rs`
+  - `router/` — Routing: `llm_router.rs`, `plano_orchestrator.rs`, `router_model.rs`/`router_model_v1.rs`, `orchestrator_model.rs`/`orchestrator_model_v1.rs`
+  - `state/` — State management with `memory.rs` (in-memory) and `postgresql.rs` (PostgreSQL) backends
+  - `signals/` — Signal analysis engine (`analyzer.rs`)
+  - `tracing/` — OpenTelemetry instrumentation
+- **common** (library) — Shared across all crates: `configuration.rs`, `llm_providers.rs`, `routing.rs`, `ratelimit.rs`, `tokenizer.rs`, `pii.rs`, `tracing.rs`, `http.rs`. Sub-modules: `api/` (hallucination, prompt_guard, zero_shot), `traces/` (span builders, shapes)
+- **hermesllm** (library) — Translates LLM API formats between providers. Supported providers via `ProviderId`: OpenAI, Anthropic, Gemini, Mistral, Groq, GitHub, AWS Bedrock, Azure, together.ai, Deepseek. Key types: `ProviderRequest`, `ProviderResponse`, `ProviderStreamResponse`. Sub-modules:
+  - `apis/` — Provider-specific API implementations (`openai.rs`, `anthropic.rs`, `amazon_bedrock.rs`, `openai_responses.rs`)
+  - `transforms/request/` — Cross-provider request translation (`from_openai.rs`, `from_anthropic.rs`)
+  - `transforms/response/` — Cross-provider response translation (`to_openai.rs`, `to_anthropic.rs`, `output_to_input.rs`)
+  - `transforms/response_streaming/` — Streaming response translation
+  - `apis/streaming_shapes/` — SSE parsing and streaming buffers per provider

 ### Python CLI (cli/planoai/)

@ -92,19 +127,72 @@ The `planoai` CLI manages the Plano lifecycle. Key commands:
 - `planoai trace` — OTEL trace collection and analysis
 - `planoai init` — Initialize new project

-Entry point: `cli/planoai/main.py`. Container lifecycle in `core.py`. Docker operations in `docker_cli.py`.
+Entry point: `cli/planoai/main.py`. Container lifecycle in `core.py`. Docker operations in `docker_cli.py`. Config generation in `config_generator.py`. Trace analysis in `trace_cmd.py`. Templates for init in `cli/planoai/templates/`.

 ### Configuration System (config/)

 - `arch_config_schema.yaml` — JSON Schema (draft-07) for validating user config files
 - `envoy.template.yaml` — Jinja2 template rendered into Envoy proxy config
 - `supervisord.conf` — Process supervisor for Envoy + brightstaff in the container
+- `docker-compose.dev.yaml` — Development Docker Compose setup
+- `validate_plano_config.sh` — Config validation script

 User configs define: `agents` (id + url), `model_providers` (model + access_key), `listeners` (type: agent/model/prompt, with router strategy), `filters` (filter chains), and `tracing` settings.

 ### JavaScript Apps (apps/, packages/)

-Turbo monorepo with Next.js 16 / React 19 applications and shared packages (UI components, Tailwind config, TypeScript config). Not part of the core proxy — these are web applications.
+Turbo monorepo with Next.js 16 / React 19 applications and shared packages. Not part of the core proxy — these are marketing/documentation websites.
+
+- **apps/www** (`@katanemo/www`) — Main site with blog, research section, Sanity CMS integration
+- **apps/katanemo-www** (`@katanemo/katanemo-www`) — Katanemo company website
+- **packages/ui** (`@katanemo/ui`) — Shared React components (Navbar, Footer, Logo, Radix UI primitives)
+- **packages/shared-styles** — Global CSS custom properties
+- **packages/tailwind-config** — Centralized Tailwind v4 config
+- **packages/tsconfig** — Shared TypeScript configs (`base.json`, `nextjs.json`)
+
+## Testing Strategy
+
+### Unit Tests
+- **Rust**: `cd crates && cargo test --lib` — runs unit tests across all crates
+- **Python CLI**: `cd cli && uv run pytest -v` — tests in `cli/test/` (config generation, init, version checks)
+
+### E2E Tests (tests/e2e/)
+Require a running Docker container with API keys. Orchestrated by shell scripts:
+- `run_e2e_tests.sh` — runs all three suites
+- `test_prompt_gateway.py` — prompt gateway filter behavior
+- `test_model_alias_routing.py` — model alias routing
+- `test_openai_responses_api_client_with_state.py` — stateful OpenAI Responses API
+
+### Integration Tests (tests/archgw/)
+- `test_prompt_gateway.py`, `test_llm_gateway.py` — gateway-level integration tests with Docker Compose
+
+### REST API Tests (tests/rest/)
+- `.rest` files for manual API testing against prompt gateway, LLM gateway, model server, and routing
+
+### HURL Tests (tests/hurl/)
+- HTTP-level API tests for LLM gateway model selection behaviors
+
+## CI/CD Workflows (.github/workflows/)
+
+- **pre-commit.yml** — Pre-commit hook validation on PRs
+- **rust_tests.yml** — Rust unit tests
+- **static.yml** — Static analysis
+- **plano_tools_tests.yml** — Python CLI tests
+- **validate_arch_config.yml** — Configuration schema validation
+- **e2e_tests.yml** — Main E2E test suite
+- **e2e_plano_tests.yml** — Plano-specific E2E tests
+- **e2e_test_currency_convert.yml** / **e2e_test_preference_based_routing.yml** — Demo-specific E2E tests
+- **docker-push-main.yml** / **docker-push-release.yml** — Docker Hub image publishing
+- **ghrc-push-main.yml** / **ghrc-push-release.yml** — GitHub Container Registry publishing
+- **publish-pypi.yml** — PyPI package publishing for planoai CLI
+
+## Toolchain Requirements
+
+- **Rust**: 1.93.0, edition 2021 (must have `wasm32-wasip1` target installed via `rustup target add wasm32-wasip1`)
+- **Python**: >=3.10 (3.13 in Docker), managed with `uv`
+- **Node.js**: >=18.0.0, npm 10.0.0
+- **Docker**: Required for E2E tests and production builds (base image: `envoyproxy/envoy:v1.37.0`)
+- **Pre-commit**: Install with `pip install pre-commit && pre-commit install`

 ## Key Conventions

@ -113,3 +201,7 @@ Turbo monorepo with Next.js 16 / React 19 applications and shared packages (UI c
 - WASM plugins must target `wasm32-wasip1` — they run inside Envoy, not as native binaries
 - The Docker image bundles Envoy + WASM plugins + brightstaff + Python CLI into a single container managed by supervisord
 - API keys come from environment variables or `.env` files, never hardcoded
+- Brightstaff supports two state backends: in-memory (`state/memory.rs`) and PostgreSQL (`state/postgresql.rs`)
+- LLM provider translation is centralized in hermesllm — add new providers there, not in individual crates
+- The Cargo workspace is in `crates/`, not at the repo root — always `cd crates` before running cargo commands
+- Pre-commit hooks must pass before committing (cargo fmt, clippy, cargo test --lib, black, YAML checks)