+```
+
+Start Plano:
+```bash
+cd demos/llm_routing/model_routing_service
+planoai up config.yaml
+```
+
+## Run the demo
+
+```bash
+./demo.sh
+```
+
+## Endpoints
+
+Three LLM API formats are supported:
+
+| Endpoint | Format |
+|---|---|
+| `POST /routing/v1/chat/completions` | OpenAI Chat Completions |
+| `POST /routing/v1/messages` | Anthropic Messages |
+| `POST /routing/v1/responses` | OpenAI Responses API |
+
+## Example
+
+```bash
+curl http://localhost:12000/routing/v1/chat/completions \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "gpt-4o-mini",
+ "messages": [{"role": "user", "content": "Write a Python function for binary search"}]
+ }'
+```
+
+Response:
+```json
+{
+ "model": "anthropic/claude-sonnet-4-20250514",
+ "route": "code_generation",
+ "trace_id": "c16d1096c1af4a17abb48fb182918a88"
+}
+```
+
+The response tells you which model would handle this request and which route was matched, without actually making the LLM call.
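
For scripting, you can fetch the same decision programmatically. Here is a minimal Python sketch (standard library only; the URL matches the curl example above, and the helper names are illustrative, not part of Plano):

```python
import json
import urllib.request

# Assumed local Plano listener, as in the curl example above.
ROUTING_URL = "http://localhost:12000/routing/v1/chat/completions"

def parse_decision(body):
    # The routing endpoints return the chosen model, matched route, and trace id.
    doc = json.loads(body)
    return doc["model"], doc["route"], doc["trace_id"]

def get_decision(prompt, model="gpt-4o-mini"):
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    req = urllib.request.Request(
        ROUTING_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return parse_decision(resp.read())
```

`parse_decision` works on any of the response bodies shown in the demo output above.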
+
+## Demo Output
+
+```
+=== Model Routing Service Demo ===
+
+--- 1. Code generation query (OpenAI format) ---
+{
+ "model": "anthropic/claude-sonnet-4-20250514",
+ "route": "code_generation",
+ "trace_id": "c16d1096c1af4a17abb48fb182918a88"
+}
+
+--- 2. Complex reasoning query (OpenAI format) ---
+{
+ "model": "openai/gpt-4o",
+ "route": "complex_reasoning",
+ "trace_id": "30795e228aff4d7696f082ed01b75ad4"
+}
+
+--- 3. Simple query - no routing match (OpenAI format) ---
+{
+ "model": "none",
+ "route": null,
+ "trace_id": "ae0b6c3b220d499fb5298ac63f4eac0e"
+}
+
+--- 4. Code generation query (Anthropic format) ---
+{
+ "model": "anthropic/claude-sonnet-4-20250514",
+ "route": "code_generation",
+ "trace_id": "26be822bbdf14a3ba19fe198e55ea4a9"
+}
+
+=== Demo Complete ===
+```
diff --git a/demos/llm_routing/model_routing_service/config.yaml b/demos/llm_routing/model_routing_service/config.yaml
new file mode 100644
index 00000000..7b98b25b
--- /dev/null
+++ b/demos/llm_routing/model_routing_service/config.yaml
@@ -0,0 +1,27 @@
+version: v0.3.0
+
+listeners:
+ - type: model
+ name: model_listener
+ port: 12000
+
+model_providers:
+
+ - model: openai/gpt-4o-mini
+ access_key: $OPENAI_API_KEY
+ default: true
+
+ - model: openai/gpt-4o
+ access_key: $OPENAI_API_KEY
+ routing_preferences:
+ - name: complex_reasoning
+ description: complex reasoning tasks, multi-step analysis, or detailed explanations
+
+ - model: anthropic/claude-sonnet-4-20250514
+ access_key: $ANTHROPIC_API_KEY
+ routing_preferences:
+ - name: code_generation
+ description: generating new code, writing functions, or creating boilerplate
+
+tracing:
+ random_sampling: 100
diff --git a/demos/llm_routing/model_routing_service/demo.sh b/demos/llm_routing/model_routing_service/demo.sh
new file mode 100755
index 00000000..3e9b0584
--- /dev/null
+++ b/demos/llm_routing/model_routing_service/demo.sh
@@ -0,0 +1,65 @@
+#!/bin/bash
+set -e
+
+PLANO_URL="${PLANO_URL:-http://localhost:12000}"
+
+echo "=== Model Routing Service Demo ==="
+echo ""
+echo "This demo shows how to use the /routing/v1/* endpoints to get"
+echo "routing decisions without actually proxying the request to an LLM."
+echo ""
+
+# --- Example 1: OpenAI Chat Completions format ---
+echo "--- 1. Code generation query (OpenAI format) ---"
+echo ""
+curl -s "$PLANO_URL/routing/v1/chat/completions" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "gpt-4o-mini",
+ "messages": [
+ {"role": "user", "content": "Write a Python function that implements binary search on a sorted array"}
+ ]
+ }' | python3 -m json.tool
+echo ""
+
+# --- Example 2: Complex reasoning query ---
+echo "--- 2. Complex reasoning query (OpenAI format) ---"
+echo ""
+curl -s "$PLANO_URL/routing/v1/chat/completions" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "gpt-4o-mini",
+ "messages": [
+ {"role": "user", "content": "Explain the trade-offs between microservices and monolithic architectures, considering scalability, team structure, and operational complexity"}
+ ]
+ }' | python3 -m json.tool
+echo ""
+
+# --- Example 3: Simple query (no routing match) ---
+echo "--- 3. Simple query - no routing match (OpenAI format) ---"
+echo ""
+curl -s "$PLANO_URL/routing/v1/chat/completions" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "gpt-4o-mini",
+ "messages": [
+ {"role": "user", "content": "What is the capital of France?"}
+ ]
+ }' | python3 -m json.tool
+echo ""
+
+# --- Example 4: Anthropic Messages format ---
+echo "--- 4. Code generation query (Anthropic format) ---"
+echo ""
+curl -s "$PLANO_URL/routing/v1/messages" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "gpt-4o-mini",
+ "max_tokens": 1024,
+ "messages": [
+ {"role": "user", "content": "Create a REST API endpoint in Rust using actix-web that handles user registration"}
+ ]
+ }' | python3 -m json.tool
+echo ""
+
+echo "=== Demo Complete ==="
diff --git a/demos/llm_routing/openclaw_routing/README.md b/demos/llm_routing/openclaw_routing/README.md
index 7c201687..34ddde47 100644
--- a/demos/llm_routing/openclaw_routing/README.md
+++ b/demos/llm_routing/openclaw_routing/README.md
@@ -23,7 +23,6 @@ Plano uses a [preference-aligned router](https://arxiv.org/abs/2506.16655) to an
## Prerequisites
-- **Docker** running
- **Plano CLI**: `uv tool install planoai` or `pip install planoai`
- **OpenClaw**: `npm install -g openclaw@latest`
- **API keys**:
@@ -43,7 +42,7 @@ export ANTHROPIC_API_KEY="your-anthropic-key"
```bash
cd demos/llm_routing/openclaw_routing
-planoai up --service plano --foreground
+planoai up config.yaml
```
### 3. Set Up OpenClaw
diff --git a/demos/llm_routing/preference_based_routing/README.md b/demos/llm_routing/preference_based_routing/README.md
index e1e16ec0..03d28cee 100644
--- a/demos/llm_routing/preference_based_routing/README.md
+++ b/demos/llm_routing/preference_based_routing/README.md
@@ -3,25 +3,23 @@ This demo shows how you can use user preferences to route user prompts to approp
## How to start the demo
-Make sure your machine is up to date with [latest version of plano]([url](https://github.com/katanemo/plano/tree/main?tab=readme-ov-file#prerequisites)). And you have activated the virtual environment.
+Make sure you have Plano CLI installed (`pip install planoai` or `uv tool install planoai`).
-
-1. start anythingllm
```bash
-(venv) $ cd demos/llm_routing/preference_based_routing
-(venv) $ docker compose up -d
+cd demos/llm_routing/preference_based_routing
+./run_demo.sh
```
-2. start plano in the foreground
+
+Or manually:
+
+1. Start Plano
```bash
-(venv) $ planoai up --service plano --foreground
-# Or if installed with uv: uvx planoai up --service plano --foreground
-2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.8
-2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/llm_routing/preference_based_routing/config.yaml
-2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.8
-2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
-2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
-2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!
-...
+planoai up config.yaml
+```
+
+2. Start AnythingLLM
+```bash
+docker compose up -d
```
3. open AnythingLLM http://localhost:3001/
diff --git a/demos/llm_routing/preference_based_routing/docker-compose.yaml b/demos/llm_routing/preference_based_routing/docker-compose.yaml
index 7c88594a..3273d55a 100644
--- a/demos/llm_routing/preference_based_routing/docker-compose.yaml
+++ b/demos/llm_routing/preference_based_routing/docker-compose.yaml
@@ -1,23 +1,5 @@
services:
- plano:
- build:
- context: ../../../
- dockerfile: Dockerfile
- ports:
- - "12000:12000"
- - "12001:12001"
- environment:
- - PLANO_CONFIG_PATH=/app/plano_config.yaml
- - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
- - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY environment variable is required but not set}
- - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
- - OTEL_TRACING_ENABLED=true
- - RUST_LOG=debug
- volumes:
- - ./config.yaml:/app/plano_config.yaml:ro
- - /etc/ssl/cert.pem:/etc/ssl/cert.pem
-
anythingllm:
image: mintplexlabs/anythingllm
restart: always
@@ -28,7 +10,7 @@ services:
environment:
- STORAGE_DIR=/app/server/storage
- LLM_PROVIDER=generic-openai
- - GENERIC_OPEN_AI_BASE_PATH=http://plano:12000/v1
+ - GENERIC_OPEN_AI_BASE_PATH=http://host.docker.internal:12000/v1
- GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
- GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
- GENERIC_OPEN_AI_API_KEY=sk-placeholder
diff --git a/demos/llm_routing/preference_based_routing/plano_config_local.yaml b/demos/llm_routing/preference_based_routing/plano_config_local.yaml
index 0a3db8bf..dbd287dd 100644
--- a/demos/llm_routing/preference_based_routing/plano_config_local.yaml
+++ b/demos/llm_routing/preference_based_routing/plano_config_local.yaml
@@ -13,7 +13,7 @@ model_providers:
- name: arch-router
model: arch/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
- base_url: http://host.docker.internal:11434
+ base_url: http://localhost:11434
- model: openai/gpt-4o-mini
access_key: $OPENAI_API_KEY
diff --git a/demos/llm_routing/preference_based_routing/run_demo.sh b/demos/llm_routing/preference_based_routing/run_demo.sh
new file mode 100755
index 00000000..c9525c26
--- /dev/null
+++ b/demos/llm_routing/preference_based_routing/run_demo.sh
@@ -0,0 +1,52 @@
+#!/bin/bash
+set -e
+
+# Function to start the demo
+start_demo() {
+ # Step 1: Check if .env file exists
+ if [ -f ".env" ]; then
+ echo ".env file already exists. Skipping creation."
+ else
+ # Step 2: Create `.env` file and set API keys
+ if [ -z "$OPENAI_API_KEY" ]; then
+ echo "Error: OPENAI_API_KEY environment variable is not set for the demo."
+ exit 1
+ fi
+ if [ -z "$ANTHROPIC_API_KEY" ]; then
+ echo "Warning: ANTHROPIC_API_KEY environment variable is not set. Anthropic features may not work."
+ fi
+
+ echo "Creating .env file..."
+ echo "OPENAI_API_KEY=$OPENAI_API_KEY" > .env
+ if [ -n "$ANTHROPIC_API_KEY" ]; then
+ echo "ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY" >> .env
+ fi
+ echo ".env file created with API keys."
+ fi
+
+ # Step 3: Start Plano
+ echo "Starting Plano with config.yaml..."
+ planoai up config.yaml
+
+ # Step 4: Start services
+ echo "Starting services using Docker Compose..."
+ docker compose up -d
+}
+
+# Function to stop the demo
+stop_demo() {
+ # Step 1: Stop Docker Compose services
+ echo "Stopping Docker Compose services..."
+ docker compose down
+
+ # Step 2: Stop Plano
+ echo "Stopping Plano..."
+ planoai down
+}
+
+# Main script logic
+if [ "$1" == "down" ]; then
+ stop_demo
+else
+ start_demo
+fi
diff --git a/demos/shared/test_runner/run_demo_tests.sh b/demos/shared/test_runner/run_demo_tests.sh
index 7feeb9ac..0c098106 100644
--- a/demos/shared/test_runner/run_demo_tests.sh
+++ b/demos/shared/test_runner/run_demo_tests.sh
@@ -21,7 +21,7 @@ do
echo "****************************************"
cd ../../$demo
echo "starting plano"
- planoai up config.yaml
+ planoai up --docker config.yaml
echo "starting docker containers"
# only execute docker compose if demo is llm_routing/preference_based_routing
if [ "$demo" == "llm_routing/preference_based_routing" ]; then
@@ -38,7 +38,7 @@ do
exit 1
fi
echo "stopping docker containers and plano"
- planoai down
+ planoai down --docker
docker compose down -v
cd ../../shared/test_runner
done
diff --git a/docs/source/_static/img/cli-default-command.png b/docs/source/_static/img/cli-default-command.png
new file mode 100644
index 00000000..a69dbe86
Binary files /dev/null and b/docs/source/_static/img/cli-default-command.png differ
diff --git a/docs/source/_static/img/cli-init-command.png b/docs/source/_static/img/cli-init-command.png
new file mode 100644
index 00000000..b9176a29
Binary files /dev/null and b/docs/source/_static/img/cli-init-command.png differ
diff --git a/docs/source/_static/img/cli-trace-command.png b/docs/source/_static/img/cli-trace-command.png
new file mode 100644
index 00000000..0efa04b8
Binary files /dev/null and b/docs/source/_static/img/cli-trace-command.png differ
diff --git a/docs/source/_static/js/fix-copy.js b/docs/source/_static/js/fix-copy.js
new file mode 100644
index 00000000..0999cc3e
--- /dev/null
+++ b/docs/source/_static/js/fix-copy.js
@@ -0,0 +1,18 @@
+/* Fix: Prevent "Copy code" button label from appearing in clipboard content.
+ *
+ * sphinxawesome_theme inserts a copy button inside <pre> elements. When
+ * clipboard.js selects all children of the <pre> to copy, the button's
+ * sr-only text ("Copy code") is included in the selection. This listener
+ * intercepts the copy event and strips that trailing label from the data
+ * written to the clipboard.
+ */
+document.addEventListener('copy', function (e) {
+ if (!e.clipboardData) { return; }
+ var selection = window.getSelection();
+ if (!selection) { return; }
+ var text = selection.toString();
+ var clean = text.replace(/\nCopy code\s*$/, '');
+ if (clean === text) { return; }
+ e.clipboardData.setData('text/plain', clean);
+ e.preventDefault();
+}, true);
diff --git a/docs/source/build_with_plano/includes/agent/function-calling-agent.yaml b/docs/source/build_with_plano/includes/agent/function-calling-agent.yaml
index 904b12ce..1399cb9b 100644
--- a/docs/source/build_with_plano/includes/agent/function-calling-agent.yaml
+++ b/docs/source/build_with_plano/includes/agent/function-calling-agent.yaml
@@ -54,6 +54,6 @@ endpoints:
# value could be ip address or a hostname with port
# this could also be a list of endpoints for load balancing
# for example endpoint: [ ip1:port, ip2:port ]
- endpoint: host.docker.internal:18083
+ endpoint: localhost:18083
# max time to wait for a connection to be established
connect_timeout: 0.005s
diff --git a/docs/source/concepts/llm_providers/model_aliases.rst b/docs/source/concepts/llm_providers/model_aliases.rst
index 2d29be93..5d0a43a4 100644
--- a/docs/source/concepts/llm_providers/model_aliases.rst
+++ b/docs/source/concepts/llm_providers/model_aliases.rst
@@ -32,7 +32,7 @@ Basic Configuration
access_key: $ANTHROPIC_API_KEY
- model: ollama/llama3.1
- base_url: http://host.docker.internal:11434
+ base_url: http://localhost:11434
# Define aliases that map to the models above
model_aliases:
diff --git a/docs/source/concepts/llm_providers/supported_providers.rst b/docs/source/concepts/llm_providers/supported_providers.rst
index 4ad89931..e09061e7 100644
--- a/docs/source/concepts/llm_providers/supported_providers.rst
+++ b/docs/source/concepts/llm_providers/supported_providers.rst
@@ -598,9 +598,9 @@ Ollama
- model: ollama/llama3.1
base_url: http://localhost:11434
- # Ollama in Docker (from host)
+ # Ollama running locally
- model: ollama/codellama
- base_url: http://host.docker.internal:11434
+ base_url: http://localhost:11434
OpenAI-Compatible Providers
diff --git a/docs/source/conf.py b/docs/source/conf.py
index c4f20ea0..ec476136 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -17,7 +17,7 @@ from sphinxawesome_theme.postprocess import Icons
project = "Plano Docs"
copyright = "2025, Katanemo Labs, Inc"
author = "Katanemo Labs, Inc"
-release = " v0.4.8"
+release = " v0.4.11"
# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
@@ -116,6 +116,7 @@ html_theme_options = asdict(theme_options)
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
html_css_files = ["css/custom.css"]
+html_js_files = ["js/fix-copy.js"]
pygments_style = "lovelace"
pygments_style_dark = "github-dark"
diff --git a/docs/source/get_started/quickstart.rst b/docs/source/get_started/quickstart.rst
index e52e349b..279fde2d 100644
--- a/docs/source/get_started/quickstart.rst
+++ b/docs/source/get_started/quickstart.rst
@@ -17,11 +17,17 @@ Follow this guide to learn how to quickly set up Plano and integrate it into you
Prerequisites
-------------
-Before you begin, ensure you have the following:
+Plano runs **natively** by default — no Docker or Rust toolchain required. Pre-compiled binaries are downloaded automatically on first run.
+
+1. `Python <https://www.python.org/downloads/>`_ (v3.10+)
+2. Supported platforms: Linux (x86_64, aarch64), macOS (Apple Silicon)
+
+**Docker mode** (optional):
+
+If you prefer to run inside Docker, add ``--docker`` to ``planoai up`` / ``planoai down``. This requires:
1. `Docker System `_ (v24)
2. `Docker Compose `_ (v2.29)
-3. `Python `_ (v3.10+)
Plano's CLI allows you to manage and interact with the Plano efficiently. To install the CLI, simply run the following command:
@@ -37,7 +43,7 @@ Plano's CLI allows you to manage and interact with the Plano efficiently. To ins
.. code-block:: console
- $ uv tool install planoai==0.4.8
+ $ uv tool install planoai==0.4.11
**Option 2: Install with pip (Traditional)**
@@ -45,7 +51,7 @@ Plano's CLI allows you to manage and interact with the Plano efficiently. To ins
$ python -m venv venv
$ source venv/bin/activate # On Windows, use: venv\Scripts\activate
- $ pip install planoai==0.4.8
+ $ pip install planoai==0.4.11
.. _llm_routing_quickstart:
@@ -84,17 +90,20 @@ Step 2. Start plano
Once the config file is created, ensure that you have environment variables set up for ``ANTHROPIC_API_KEY`` and ``OPENAI_API_KEY`` (or these are defined in a ``.env`` file).
-Start Plano:
-
.. code-block:: console
$ planoai up plano_config.yaml
- # Or if installed with uv tool: uvx planoai up plano_config.yaml
- 2024-12-05 11:24:51,288 - planoai.main - INFO - Starting plano cli version: 0.4.8
- 2024-12-05 11:24:51,825 - planoai.utils - INFO - Schema validation successful!
- 2024-12-05 11:24:51,825 - planoai.main - INFO - Starting plano
- ...
- 2024-12-05 11:25:16,131 - planoai.core - INFO - Container is healthy!
+
+On the first run, Plano automatically downloads Envoy, the WASM plugins, and brightstaff, and caches them at ``~/.plano/``.
+
+To stop Plano, run ``planoai down``.
+
+**Docker mode** (optional):
+
+.. code-block:: console
+
+ $ planoai up plano_config.yaml --docker
+ $ planoai down --docker
Step 3: Interact with LLM
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -185,9 +194,9 @@ Here is a minimal configuration that wires Plano-Orchestrator to two HTTP servic
agents:
- id: flight_agent
- url: http://host.docker.internal:10520 # your flights service
+ url: http://localhost:10520 # your flights service
- id: hotel_agent
- url: http://host.docker.internal:10530 # your hotels service
+ url: http://localhost:10530 # your hotels service
model_providers:
- model: openai/gpt-4o
diff --git a/docs/source/guides/observability/monitoring.rst b/docs/source/guides/observability/monitoring.rst
index 9d497568..736e0a64 100644
--- a/docs/source/guides/observability/monitoring.rst
+++ b/docs/source/guides/observability/monitoring.rst
@@ -59,7 +59,7 @@ are some sample configuration files for both, respectively.
scheme: http
static_configs:
- targets:
- - host.docker.internal:19901
+ - localhost:19901
params:
format: ["prometheus"]
diff --git a/docs/source/guides/observability/tracing.rst b/docs/source/guides/observability/tracing.rst
index 02723b23..950befd2 100644
--- a/docs/source/guides/observability/tracing.rst
+++ b/docs/source/guides/observability/tracing.rst
@@ -142,6 +142,109 @@ In your observability platform (Jaeger, Grafana Tempo, Datadog, etc.), filter tr
For complete details on all available signals, detection methods, and best practices, see the :doc:`../../concepts/signals` guide.
+Custom Span Attributes
+-------------------------------------------
+
+Plano can automatically attach **custom span attributes** derived from request headers and **static** attributes
+defined in configuration. This lets you stamp
+traces with identifiers like workspace, tenant, or user IDs without changing application code or adding
+custom instrumentation.
+
+**Why This Is Useful**
+
+- **Tenant-aware debugging**: Filter traces by ``workspace.id`` or ``tenant.id``.
+- **Customer-specific visibility**: Attribute performance or errors to a specific customer.
+- **Low overhead**: No code changes in agents or clients—just headers.
+
+How It Works
+~~~~~~~~~~~~
+
+You configure one or more header prefixes. Any incoming HTTP header whose name starts with one of these
+prefixes is captured as a span attribute. You can also provide static attributes that are always injected.
+
+- The **prefix is only for matching**, not the resulting attribute key.
+- The attribute key is the header name **with the prefix removed**, lowercased, with hyphens converted to dots.
+
+.. note::
+
+ Custom span attributes are attached to LLM spans when handling ``/v1/...`` requests via ``llm_chat``. For orchestrator requests to ``/agents/...``,
+ these attributes are added to both the orchestrator selection span and to each agent span created by ``agent_chat``.
+
+**Example**
+
+Configured prefix::
+
+ tracing:
+ span_attributes:
+ header_prefixes:
+ - x-katanemo-
+
+Incoming headers::
+
+ X-Katanemo-Workspace-Id: ws_123
+ X-Katanemo-Tenant-Id: ten_456
+
+Resulting span attributes::
+
+ workspace.id = "ws_123"
+ tenant.id = "ten_456"
+
+Configuration
+~~~~~~~~~~~~~
+
+Add the prefix list under ``tracing`` in your config:
+
+.. code-block:: yaml
+
+ tracing:
+ random_sampling: 100
+ span_attributes:
+ header_prefixes:
+ - x-katanemo-
+ static:
+ environment: production
+ service.version: "1.0.0"
+
+Static attributes are always injected alongside any header-derived attributes. If a header-derived
+attribute key matches a static key, the header value overrides the static value.
+
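
The merge behavior can be sketched as follows (an illustrative Python sketch, not Plano's implementation; the helper inlines the prefix-stripping rule for brevity):

```python
def span_attributes(headers: dict, prefixes: list[str], static: dict) -> dict:
    """Combine static attributes with header-derived ones.

    Static attributes are always present; a header-derived key that collides
    with a static key wins, per the override rule described above.
    """
    attrs = dict(static)  # start from the always-injected static attributes
    for header, value in headers.items():
        name = header.lower()
        for prefix in prefixes:
            if name.startswith(prefix.lower()):
                key = name[len(prefix):].replace("-", ".")
                attrs[key] = value  # header value overrides any static value
                break
    return attrs

attrs = span_attributes(
    {"X-Katanemo-Environment": "staging"},
    ["x-katanemo-"],
    {"environment": "production", "service.version": "1.0.0"},
)
print(attrs)  # {'environment': 'staging', 'service.version': '1.0.0'}
```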
+You can provide multiple prefixes:
+
+.. code-block:: yaml
+
+ tracing:
+ span_attributes:
+ header_prefixes:
+ - x-katanemo-
+ - x-tenant-
+ static:
+ environment: production
+ service.version: "1.0.0"
+
+Notes and Examples
+~~~~~~~~~~~~~~~~~~
+
+- **Prefix must match exactly**: ``katanemo-`` does not match ``x-katanemo-`` headers.
+- **Trailing dash is recommended**: Without it, ``x-katanemo`` would match not only ``x-katanemo-foo``
+  but also the unintended ``x-katanemofoo``.
+- **Keys are always strings**: Values are captured as string attributes.
+
+**Prefix mismatch example**
+
+Config::
+
+ tracing:
+ span_attributes:
+ header_prefixes:
+ - x-katanemo-
+
+Request headers::
+
+ X-Other-User-Id: usr_999
+
+Result: no attributes are captured from ``X-Other-User-Id``.
+
+
Benefits of Using ``Traceparent`` Headers
-----------------------------------------
@@ -497,55 +600,7 @@ tools like AWS X-Ray and Datadog, enhancing observability and facilitating faste
Additional Resources
--------------------
-CLI Reference
-~~~~~~~~~~~~~
-
-``planoai trace``
- Trace requests captured by the local OTLP listener.
-
- **Synopsis**
-
- .. code-block:: console
-
- $ planoai trace [TARGET] [OPTIONS]
-
- **Targets**
-
- - ``last`` (default): show the most recent trace.
- - ``any``: allow interactive selection when available.
- - ````: full 32-hex trace ID.
- - ````: first 8 hex characters.
-
- **Options**
-
- - ``--filter ``: limit displayed attributes to matching keys (supports ``*``).
- - ``--where ``: match traces containing a specific attribute (repeatable, AND).
- - ``--list``: list trace IDs only.
- - ``--no-interactive``: disable interactive prompts/selections.
- - ``--limit ``: limit the number of traces returned.
- - ``--since ``: look back window (``5m``, ``2h``, ``1d``).
- - ``--json``: output raw JSON instead of formatted output.
- - ``--verbose, -v``: show all span attributes. By default, inbound/outbound
- spans are displayed in a compact view.
-
- **Environment**
-
- - ``PLANO_TRACE_PORT``: gRPC port used by ``planoai trace`` to query traces
- (defaults to ``4317``).
-
-``planoai trace listen``
- Start a local OTLP/gRPC listener.
-
- **Synopsis**
-
- .. code-block:: console
-
- $ planoai trace listen [OPTIONS]
-
- **Options**
-
- - ``--host ``: bind address (default: ``0.0.0.0``).
- - ``--port ``: gRPC listener port (default: ``4317``).
+For full command documentation (including ``planoai trace`` and all other CLI commands), see :ref:`cli_reference`.
External References
~~~~~~~~~~~~~~~~~~~
diff --git a/docs/source/guides/state.rst b/docs/source/guides/state.rst
index 7cc6b20a..3f875ce1 100644
--- a/docs/source/guides/state.rst
+++ b/docs/source/guides/state.rst
@@ -165,7 +165,7 @@ Then set the environment variable before running Plano:
./plano
.. warning::
- **Special Characters in Passwords**: If your password contains special characters like ``#``, ``@``, or ``&``, you must URL-encode them in the connection string. For example, ``MyPass#123`` becomes ``MyPass%23123``.
+ **Special Characters in Passwords**: If your password contains special characters like ``#``, ``@``, or ``&``, you must URL-encode them in the connection string. For example, ``P@ss#123`` becomes ``P%40ss%23123``.
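
The encoded form can be produced with Python's standard library, for example:

```python
from urllib.parse import quote

password = "P@ss#123"
encoded = quote(password, safe="")  # percent-encode every reserved character
print(encoded)  # P%40ss%23123

# Illustrative connection string using the encoded password (host and
# database names here are placeholders, not real values):
conn = f"postgresql://plano:{encoded}@localhost:5432/plano"
```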
Supabase Connection Strings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -202,14 +202,14 @@ Use the direct connection (port 5432):
state_storage:
type: postgres
- connection_string: "postgresql://postgres.myproject:$DB_PASSWORD@aws-0-us-west-2.pooler.supabase.com:5432/postgres"
+ connection_string: "postgresql://postgres.[YOUR-PROJECT-REF]:$DB_PASSWORD@aws-0-[REGION].pooler.supabase.com:5432/postgres"
Then set the environment variable:
.. code-block:: bash
- # If your password is "MyPass#123", encode it as "MyPass%23123"
- export DB_PASSWORD="MyPass%23123"
+ # If your password is "P@ss#123", encode it as "P%40ss%23123"
+ export DB_PASSWORD="P%40ss%23123"
Troubleshooting
---------------
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 57952c92..7a2e5b60 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -62,4 +62,5 @@ Built by contributors to the widely adopted `Envoy Proxy
+
+   $ planoai up [FILE] [--path <dir>] [--foreground] [--with-tracing] [--tracing-port <port>]
+
+**Arguments**
+
+- ``FILE`` (optional): explicit path to config file.
+
+**Options**
+
+- ``--path <dir>``: directory to search for config (default ``.``).
+- ``--foreground``: run Plano in foreground.
+- ``--with-tracing``: start local OTLP/gRPC trace collector.
+- ``--tracing-port <port>``: collector port (default ``4317``).
+
+.. note::
+
+ If you use ``--with-tracing``, ensure that port 4317 is free and not already in use by Jaeger or any other observability services or processes. If port 4317 is occupied, the command will fail to start the trace collector.
+
+**Examples**
+
+.. code-block:: console
+
+ $ planoai up config.yaml
+ $ planoai up --path ./deploy
+ $ planoai up --with-tracing
+ $ planoai up --with-tracing --tracing-port 4318
+
+
+.. _cli_reference_down:
+
+planoai down
+------------
+
+Stop Plano (container/process stack managed by the CLI).
+
+**Synopsis**
+
+.. code-block:: console
+
+ $ planoai down
+
+
+.. _cli_reference_build:
+
+planoai build
+-------------
+
+Build Plano Docker image from repository source.
+
+**Synopsis**
+
+.. code-block:: console
+
+ $ planoai build
+
+
+.. _cli_reference_logs:
+
+planoai logs
+------------
+
+Stream Plano logs.
+
+**Synopsis**
+
+.. code-block:: console
+
+ $ planoai logs [--follow] [--debug]
+
+**Options**
+
+- ``--follow``: stream logs continuously.
+- ``--debug``: include additional gateway/debug streams.
+
+**Examples**
+
+.. code-block:: console
+
+ $ planoai logs
+ $ planoai logs --follow
+ $ planoai logs --follow --debug
+
+
+.. _cli_reference_init:
+
+planoai init
+------------
+
+Generate a new ``config.yaml`` using an interactive wizard, built-in templates, or a clean empty file.
+
+**Synopsis**
+
+.. code-block:: console
+
+   $ planoai init [--template <id> | --clean] [--output <file>] [--force] [--list-templates]
+
+**Options**
+
+- ``--template <id>``: create config from a built-in template id.
+- ``--clean``: create an empty config file.
+- ``--output, -o <file>``: output path (default ``config.yaml``).
+- ``--force``: overwrite existing output file.
+- ``--list-templates``: print available template IDs and exit.
+
+**Examples**
+
+.. code-block:: console
+
+ $ planoai init
+ $ planoai init --list-templates
+ $ planoai init --template coding_agent_routing
+ $ planoai init --clean --output ./config/config.yaml
+
+.. figure:: /_static/img/cli-init-command.png
+ :width: 100%
+ :alt: planoai init command screenshot
+
+ ``planoai init --list-templates`` showing built-in starter templates.
+
+
+.. _cli_reference_trace:
+
+planoai trace
+-------------
+
+Inspect request traces from the local OTLP listener.
+
+**Synopsis**
+
+.. code-block:: console
+
+ $ planoai trace [TARGET] [OPTIONS]
+
+**Targets**
+
+- ``last`` (default): show most recent trace.
+- ``any``: consider all traces (interactive selection when terminal supports it).
+- ``listen``: start local OTLP listener.
+- ``down``: stop background listener.
+- ``<trace-id>``: full 32-hex trace id.
+- ``<trace-id-prefix>``: first 8 hex chars of trace id.
+
+**Display options**
+
+- ``--filter <pattern>``: keep only matching attribute keys (supports ``*`` glob syntax).
+- ``--where <key=value>``: locate traces containing key/value (repeatable, AND semantics).
+- ``--list``: list trace IDs instead of full trace output (use with ``--no-interactive`` to fetch plain-text trace IDs only).
+- ``--no-interactive``: disable interactive selection prompts.
+- ``--limit <n>``: limit returned traces.
+- ``--since <duration>``: lookback window such as ``5m``, ``2h``, ``1d``.
+- ``--json``: emit JSON payloads.
+- ``--verbose``, ``-v``: show full attribute output (disable compact trimming). Useful for debugging internal attributes.
+
+**Listener options (for ``TARGET=listen``)**
+
+- ``--host <host>``: bind host (default ``0.0.0.0``).
+- ``--port <port>``: bind port (default ``4317``).
+
+.. note::
+
+   When using ``listen``, ensure that port 4317 is free and not already in use by Jaeger or any other observability service; if the port is occupied, the command will fail to start the trace collector.
+
+
+**Environment**
+
+- ``PLANO_TRACE_PORT``: query port used by ``planoai trace`` when reading traces (default ``4317``).
+
+**Examples**
+
+.. code-block:: console
+
+ # Start/stop listener
+ $ planoai trace listen
+ $ planoai trace down
+
+ # Basic inspection
+ $ planoai trace
+ $ planoai trace 7f4e9a1c
+ $ planoai trace 7f4e9a1c0d9d4a0bb9bf5a8a7d13f62a
+
+ # Filtering and automation
+ $ planoai trace --where llm.model=openai/gpt-5.2 --since 30m
+ $ planoai trace --filter "http.*"
+ $ planoai trace --list --limit 5
+ $ planoai trace --where http.status_code=500 --json
+
+.. figure:: /_static/img/cli-trace-command.png
+ :width: 100%
+ :alt: planoai trace command screenshot
+
+ ``planoai trace`` command showing trace inspection and filtering capabilities.
+
+**Operational notes**
+
+- ``--host`` and ``--port`` are valid only when ``TARGET`` is ``listen``.
+- ``--list`` cannot be combined with a specific trace-id target.
+
+
+.. _cli_reference_prompt_targets:
+
+planoai prompt_targets
+----------------------
+
+Generate prompt-target metadata from Python methods.
+
+**Synopsis**
+
+.. code-block:: console
+
+   $ planoai prompt_targets --file <path>
+
+**Options**
+
+- ``--file, --f <path>``: required path to a ``.py`` source file.
+
+
+.. _cli_reference_cli_agent:
+
+planoai cli_agent
+-----------------
+
+Start an interactive CLI agent session against a running Plano deployment.
+
+**Synopsis**
+
+.. code-block:: console
+
+   $ planoai cli_agent claude [FILE] [--path <dir>] [--settings '<json>']
+
+**Arguments**
+
+- ``type``: agent type; currently only ``claude`` is supported.
+- ``FILE`` (optional): config file path.
+
+**Options**
+
+- ``--path <dir>``: directory containing config file.
+- ``--settings <json>``: JSON settings payload for agent startup.
diff --git a/docs/source/resources/db_setup/README.md b/docs/source/resources/db_setup/README.md
index 34aff973..2936d1d6 100644
--- a/docs/source/resources/db_setup/README.md
+++ b/docs/source/resources/db_setup/README.md
@@ -64,8 +64,8 @@ After setting up the database table, configure your application to use Supabase
**Example:**
```bash
-# If your password is "MyPass#123", encode it as "MyPass%23123"
-export DATABASE_URL="postgresql://postgres.myproject:MyPass%23123@aws-0-us-west-2.pooler.supabase.com:5432/postgres"
+# If your password is "P@ss#123", encode it as "P%40ss%23123"
+export DATABASE_URL="postgresql://postgres.[YOUR-PROJECT-REF]:[YOUR-ENCODED-PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres"
```
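+
+To compute the encoding programmatically rather than by hand, Python's standard library can percent-encode the password before it is spliced into the URL (a minimal sketch; substitute your own credentials):
+
+```python
+from urllib.parse import quote
+
+# safe="" forces reserved characters such as "@" and "#" to be escaped
+password = quote("P@ss#123", safe="")
+print(password)  # P%40ss%23123
+```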
### Testing the Connection
diff --git a/docs/source/resources/deployment.rst b/docs/source/resources/deployment.rst
index c9c75886..71452ea3 100644
--- a/docs/source/resources/deployment.rst
+++ b/docs/source/resources/deployment.rst
@@ -3,7 +3,47 @@
Deployment
==========
-This guide shows how to deploy Plano directly using Docker without the ``plano`` CLI, including basic runtime checks for routing and health monitoring.
+Plano can be deployed in two ways: **natively** on the host (default) or inside a **Docker container**.
+
+Native Deployment (Default)
+---------------------------
+
+Plano runs natively by default. Pre-compiled binaries (Envoy, WASM plugins, brightstaff) are automatically downloaded on the first run and cached at ``~/.plano/``.
+
+Supported platforms: Linux (x86_64, aarch64), macOS (Apple Silicon).
+
+Start Plano
+~~~~~~~~~~~~
+
+.. code-block:: bash
+
+ planoai up plano_config.yaml
+
+Options:
+
+- ``--foreground`` — stay attached and stream logs (Ctrl+C to stop)
+- ``--with-tracing`` — start a local OTLP trace collector
+
+Runtime files (rendered configs, logs, PID file) are stored in ``~/.plano/run/``.
+
+Stop Plano
+~~~~~~~~~~
+
+.. code-block:: bash
+
+ planoai down
+
+Build from Source (Developer)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you want to build from source instead of using pre-compiled binaries, you need:
+
+- `Rust <https://rustup.rs>`_ with the ``wasm32-wasip1`` target
+- OpenSSL dev headers (``libssl-dev`` on Debian/Ubuntu, ``openssl`` on macOS)
+
+.. code-block:: bash
+
+ planoai build --native
Docker Deployment
-----------------
@@ -25,7 +65,7 @@ Create a ``docker-compose.yml`` file with the following configuration:
# docker-compose.yml
services:
plano:
- image: katanemo/plano:0.4.8
+ image: katanemo/plano:0.4.11
container_name: plano
ports:
- "10000:10000" # ingress (client -> plano)
@@ -53,6 +93,13 @@ Check container health and logs:
docker compose ps
docker compose logs -f plano
+You can also use the CLI with Docker mode:
+
+.. code-block:: bash
+
+ planoai up plano_config.yaml --docker
+ planoai down --docker
+
Runtime Tests
-------------
diff --git a/docs/source/resources/includes/agents/agents_config.yaml b/docs/source/resources/includes/agents/agents_config.yaml
index 0b6aaba2..ef522337 100644
--- a/docs/source/resources/includes/agents/agents_config.yaml
+++ b/docs/source/resources/includes/agents/agents_config.yaml
@@ -2,9 +2,9 @@ version: v0.3.0
agents:
- id: weather_agent
- url: http://host.docker.internal:10510
+ url: http://localhost:10510
- id: flight_agent
- url: http://host.docker.internal:10520
+ url: http://localhost:10520
model_providers:
- model: openai/gpt-4o
diff --git a/docs/source/resources/includes/agents/flights.py b/docs/source/resources/includes/agents/flights.py
index 69c93f25..069f06dd 100644
--- a/docs/source/resources/includes/agents/flights.py
+++ b/docs/source/resources/includes/agents/flights.py
@@ -28,7 +28,7 @@ EXTRACTION_MODEL = "openai/gpt-4o-mini"
# FlightAware AeroAPI configuration
AEROAPI_BASE_URL = "https://aeroapi.flightaware.com/aeroapi"
-AEROAPI_KEY = os.getenv("AEROAPI_KEY", "ESVFX7TJLxB7OTuayUv0zTQBryA3tOPr")
+AEROAPI_KEY = os.getenv("AEROAPI_KEY")
# HTTP client for API calls
http_client = httpx.AsyncClient(timeout=30.0)
diff --git a/docs/source/resources/includes/plano_config_agents_filters.yaml b/docs/source/resources/includes/plano_config_agents_filters.yaml
index dfc8fe7b..f726b121 100644
--- a/docs/source/resources/includes/plano_config_agents_filters.yaml
+++ b/docs/source/resources/includes/plano_config_agents_filters.yaml
@@ -2,16 +2,16 @@ version: v0.3.0
agents:
- id: rag_agent
- url: http://host.docker.internal:10505
+ url: http://localhost:10505
filters:
- id: query_rewriter
- url: http://host.docker.internal:10501
+ url: http://localhost:10501
# type: mcp # default is mcp
# transport: streamable-http # default is streamable-http
# tool: query_rewriter # default name is the filter id
- id: context_builder
- url: http://host.docker.internal:10502
+ url: http://localhost:10502
model_providers:
- model: openai/gpt-4o-mini
diff --git a/docs/source/resources/includes/plano_config_full_reference.yaml b/docs/source/resources/includes/plano_config_full_reference.yaml
index cc3973e0..a650baea 100644
--- a/docs/source/resources/includes/plano_config_full_reference.yaml
+++ b/docs/source/resources/includes/plano_config_full_reference.yaml
@@ -4,15 +4,15 @@ version: v0.3.0
# External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)
agents:
- id: weather_agent # Example agent for weather
- url: http://host.docker.internal:10510
+ url: http://localhost:10510
- id: flight_agent # Example agent for flights
- url: http://host.docker.internal:10520
+ url: http://localhost:10520
# MCP filters applied to requests/responses (e.g., input validation, query rewriting)
filters:
- id: input_guards # Example filter for input validation
- url: http://host.docker.internal:10500
+ url: http://localhost:10500
# type: mcp (default)
# transport: streamable-http (default)
# tool: input_guards (default - same as filter id)
diff --git a/docs/source/resources/includes/plano_config_full_reference_rendered.yaml b/docs/source/resources/includes/plano_config_full_reference_rendered.yaml
index 68505e83..9717b53a 100644
--- a/docs/source/resources/includes/plano_config_full_reference_rendered.yaml
+++ b/docs/source/resources/includes/plano_config_full_reference_rendered.yaml
@@ -1,31 +1,31 @@
agents:
- id: weather_agent
- url: http://host.docker.internal:10510
+ url: http://localhost:10510
- id: flight_agent
- url: http://host.docker.internal:10520
+ url: http://localhost:10520
endpoints:
app_server:
connect_timeout: 0.005s
endpoint: 127.0.0.1
port: 80
flight_agent:
- endpoint: host.docker.internal
+ endpoint: localhost
port: 10520
protocol: http
input_guards:
- endpoint: host.docker.internal
+ endpoint: localhost
port: 10500
protocol: http
mistral_local:
endpoint: 127.0.0.1
port: 8001
weather_agent:
- endpoint: host.docker.internal
+ endpoint: localhost
port: 10510
protocol: http
filters:
- id: input_guards
- url: http://host.docker.internal:10500
+ url: http://localhost:10500
listeners:
- address: 0.0.0.0
agents:
@@ -65,8 +65,6 @@ listeners:
port: 443
protocol: https
provider_interface: openai
- filter_chain:
- - input_guards
name: model_1
port: 12000
type: model
@@ -132,6 +130,6 @@ prompt_targets:
required: true
type: int
tracing:
- opentracing_grpc_endpoint: http://host.docker.internal:4317
+ opentracing_grpc_endpoint: http://localhost:4317
random_sampling: 100
version: v0.3.0
diff --git a/docs/source/resources/tech_overview/request_lifecycle.rst b/docs/source/resources/tech_overview/request_lifecycle.rst
index 61df546c..9c985a45 100644
--- a/docs/source/resources/tech_overview/request_lifecycle.rst
+++ b/docs/source/resources/tech_overview/request_lifecycle.rst
@@ -46,6 +46,117 @@ Also, Plano utilizes `Envoy event-based thread model