Plano is an AI-native proxy and data plane for agentic apps, with built-in orchestration, safety, observability, and smart LLM routing so you stay focused on your agent's core logic. https://planoai.dev


The AI-native proxy server and data plane for agentic apps.

Plano pulls out the rote plumbing work and decouples you from brittle framework abstractions, centralizing what shouldn't be bespoke in every codebase: agent routing and orchestration, rich agentic signals and traces for continuous improvement, guardrail filters for safety and moderation, and smart LLM routing APIs for model agility. Use any language or AI framework, and deliver agents to production faster.


Overview

Building agentic demos is easy. Shipping agentic applications safely, reliably, and repeatably to production is hard. After the thrill of a quick hack, you end up building the “hidden middleware” to reach production: routing logic to reach the right agent, guardrail hooks for safety and moderation, evaluation and observability glue for continuous learning, and model/provider quirks scattered across frameworks and application code.

Plano solves this by moving core delivery concerns into a unified, out-of-process data plane.

Plano pulls rote plumbing out of your framework so you can stay focused on what matters most: the core product logic of your agentic applications. Plano is backed by industry-leading LLM research and built on Envoy by its core contributors, who built critical infrastructure at scale for modern workloads.

Figure: High-level network sequence diagram of the Plano architecture.

Jump to our docs to learn how you can use Plano to improve the speed, safety, and observability of your agentic applications.

Important

Plano and the Arch family of LLMs (such as Plano-Orchestrator-4B and Arch-Router) are hosted free of charge in the US-central region to give you a great first-run developer experience. To scale and run in production, you can either run these LLMs locally or contact us on Discord for API keys.


Build Agentic Apps with Plano

Skip the plumbing. Plano handles routing, model management, and observability so you can focus on your agent's core logic. Here's a multi-agent travel assistant with zero infrastructure code.

📁 Full working code: See demos/use_cases/travel_agents/ for complete weather and flight agents you can run locally.

1. Define Your Agents in YAML

# config.yaml
version: v0.3.0

# What you declare: Agent URLs and natural language descriptions
# What you don't write: Intent classifiers, routing logic, model fallbacks, provider adapters, or tracing instrumentation

agents:
  - id: weather_agent
    url: http://localhost:10510
  - id: flight_agent
    url: http://localhost:10520

model_providers:
  - model: openai/gpt-4o
    access_key: $OPENAI_API_KEY
    default: true
  - model: anthropic/claude-3-5-sonnet
    access_key: $ANTHROPIC_API_KEY

listeners:
  - type: agent
    name: travel_assistant
    port: 8001
    router: plano_orchestrator_v1  # Powered by our 4B-parameter routing model. You can change this to different models
    agents:
      - id: weather_agent
        description: |
          Gets real-time weather and forecasts for any city worldwide.
          Handles: "What's the weather in Paris?", "Will it rain in Tokyo?"

      - id: flight_agent
        description: |
          Searches flights between airports with live status and schedules.
          Handles: "Flights from NYC to LA", "Show me flights to Seattle"

tracing:
  random_sampling: 100  # Auto-capture traces for evaluation
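
The listener refers to agents by id, so a mistyped id is an easy mistake to make in this config. As a minimal sketch (not part of Plano or its CLI), the check below mirrors the relevant parts of config.yaml as a plain Python dict and reports listener agent ids that have no top-level declaration. The function name `undeclared_agents` and the dict layout are assumptions made for illustration; this README does not say whether Plano runs a similar check itself.

```python
# Hypothetical pre-flight check; not a Plano API.
# The dict mirrors the agents/listeners sections of config.yaml above.
config = {
    "agents": [
        {"id": "weather_agent", "url": "http://localhost:10510"},
        {"id": "flight_agent", "url": "http://localhost:10520"},
    ],
    "listeners": [
        {
            "type": "agent",
            "name": "travel_assistant",
            "port": 8001,
            "agents": [{"id": "weather_agent"}, {"id": "flight_agent"}],
        }
    ],
}


def undeclared_agents(cfg: dict) -> list[str]:
    """Return 'listener: agent_id' entries that lack a top-level agent."""
    declared = {agent["id"] for agent in cfg.get("agents", [])}
    missing = []
    for listener in cfg.get("listeners", []):
        for ref in listener.get("agents", []):
            if ref["id"] not in declared:
                missing.append(f'{listener["name"]}: {ref["id"]}')
    return missing
```

Calling `undeclared_agents(config)` on the dict above returns an empty list. If you load the real file with a YAML parser, the same function should work on the resulting dict, assuming the parser produces this shape.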

2. Write Simple Agent Code

Your agents are just HTTP servers that implement the OpenAI-compatible chat completions endpoint. Use any language or framework:

# weather_agent.py
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

app = FastAPI()

# Point to Plano's LLM gateway - it handles model routing for you
llm = AsyncOpenAI(base_url="http://localhost:12001/v1", api_key="EMPTY")

@app.post("/v1/chat/completions")
async def chat(request: Request):
    body = await request.json()
    messages = body.get("messages", [])

    # Your agent logic: fetch data, call APIs, run tools.
    # get_weather_for_city is not defined in this snippet; see
    # demos/use_cases/travel_agents/ for the full implementation.
    weather_data = await get_weather_for_city(messages)

    # Stream the response back through Plano
    async def generate():
        stream = await llm.chat.completions.create(
            model="openai/gpt-4o",
            messages=[{"role": "system", "content": f"Weather: {weather_data}"}, *messages],
            stream=True
        )
        async for chunk in stream:
            yield f"data: {chunk.model_dump_json()}\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")
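
The `generate()` function above frames each chunk as a server-sent event: a `data: ` prefix, the JSON payload, and a blank line. The sketch below shows that framing and how a client might decode it, using only the standard library. The names `encode_sse` and `decode_sse` are hypothetical, and the payloads are simplified stand-ins for real chunk objects. OpenAI's streaming API ends a stream with a `data: [DONE]` sentinel; the agent snippet above does not emit one, so the decoder here accepts streams with or without it.

```python
import json


def encode_sse(payload: dict) -> str:
    """Frame one JSON payload as a server-sent event."""
    return f"data: {json.dumps(payload)}\n\n"


def decode_sse(raw: str) -> list[dict]:
    """Parse a buffered SSE body into a list of JSON payloads."""
    events = []
    for frame in raw.split("\n\n"):
        frame = frame.strip()
        if not frame.startswith("data:"):
            continue  # skip empty frames and non-data fields
        data = frame[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel, if the server sends one
        events.append(json.loads(data))
    return events
```

This decoder works on a fully buffered body for clarity. A real client would read the response incrementally and split on blank lines as bytes arrive.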

3. Start Plano & Query Your Agents

# Start Plano
planoai up config.yaml

# Query - Plano routes to the right agent automatically
curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}]
  }'
# → Routes to weather_agent ✓

curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Find me flights from NYC to Tokyo"}]
  }'
# → Routes to flight_agent ✓
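
The same query can be sent from Python. This is a minimal sketch using only the standard library. It assumes the listener on port 8001 from config.yaml and a non-streaming JSON response in the OpenAI chat completions shape. The helper names `build_request` and `ask` are illustrative and not part of any Plano SDK.

```python
import json
import urllib.request

# Listener port taken from config.yaml above.
PLANO_URL = "http://localhost:8001/v1/chat/completions"


def build_request(prompt: str, model: str = "gpt-4o") -> urllib.request.Request:
    """Build the same POST request that the curl examples send."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        PLANO_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> dict:
    """Send the prompt to Plano and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt), timeout=30) as resp:
        return json.load(resp)
```

With Plano running, `ask("What is the weather in Paris?")` would go through the same listener as the first curl command.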

4. Get Observability and Model Agility for Free

Every request is traced end-to-end with OpenTelemetry - no instrumentation code needed.

Figure: Automatic tracing.

What You Didn't Have to Build

| Infrastructure Concern | Without Plano | With Plano |
|---|---|---|
| Agent Routing | Write intent classifier + routing logic | Declare agent descriptions in YAML |
| Model Management | Handle each provider's API quirks | Unified OpenAI-compatible interface |
| Model Switching | Redeploy code for model changes | Change model name in request |
| Tracing | Instrument every service with OTEL | Automatic end-to-end traces |
| Evaluation Data | Build pipeline to capture/export spans | Built-in sampling & export |
| Adding Agents | Update routing code, test, redeploy | Add to config, restart |
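
To make the "Agent Routing" row concrete, here is a toy example of the hand-written routing logic that row refers to. It is a deliberately naive keyword matcher written for illustration. It is not how Plano routes requests, and the keyword lists and the `route` function are made up for this sketch.

```python
from typing import Optional

# Hypothetical keyword lists; every new agent means editing this table.
KEYWORDS = {
    "weather_agent": ("weather", "rain", "forecast", "temperature"),
    "flight_agent": ("flight", "flights", "airport", "airline"),
}


def route(prompt: str) -> Optional[str]:
    """Pick the agent whose keywords appear most often, or None."""
    text = prompt.lower()
    scores = {
        agent: sum(keyword in text for keyword in keywords)
        for agent, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Substring matching like this is brittle: "Book a train to the airport" matches "rain" inside "train", ties with the flight keywords, and ends up at weather_agent. Handling cases like that is the maintenance burden that the "With Plano" column replaces with natural language descriptions.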

Why it's efficient: Plano uses purpose-built, lightweight LLMs (like our 4B-parameter orchestrator) instead of heavyweight frameworks or GPT-4 for routing - giving you production-grade routing at a fraction of the cost and latency.


Contact

To get in touch with us, please join our Discord server. We actively monitor it and offer support there.

Getting Started

Ready to try Plano? Check out our comprehensive documentation.

Contribution

We would love feedback on our Roadmap, and we welcome contributions to Plano! Whether you're fixing bugs, adding new features, improving documentation, or creating tutorials, your help is much appreciated. Please visit our Contribution Guide for more details.