diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
new file mode 100644
index 00000000..23120b4a
--- /dev/null
+++ b/docs/ROADMAP.md
@@ -0,0 +1,110 @@
+# Plano Roadmap
+
+This document describes the roadmap for the Plano project — its current focus areas, how features are planned, and how you can participate in shaping its direction.
+
+Plano's roadmap is a **living plan** maintained through the [GitHub Project Board](https://github.com/orgs/katanemo/projects/1), [GitHub Milestones](https://github.com/katanemo/plano/milestones), and the [Plano Enhancement Proposal (PEP)](peps/PEP-0000-process.md) process. This document provides the high-level context; detailed, up-to-date tracking lives in those tools.
+
+## Contributing to the Roadmap
+
+Anyone can propose a feature or improvement:
+
+1. **Small changes** (bug fixes, docs, minor enhancements) — open a [GitHub issue](https://github.com/katanemo/plano/issues) directly.
+2. **Significant features** (new capabilities, architectural changes, new providers) — write a [Plano Enhancement Proposal (PEP)](peps/PEP-0000-process.md) and submit it as a PR to `docs/peps/`.
+3. **Discussion first** — if you're unsure whether something warrants a PEP, start a [GitHub Discussion](https://github.com/katanemo/plano/discussions) or bring it to a [community meeting](#community-meetings).
+
+If your proposal is accepted, a maintainer will assign it to a release milestone and link it on the project board.
+
+### How to Help with Existing Items
+
+- Browse the [project board](https://github.com/orgs/katanemo/projects/1) for items that interest you
+- Look for issues labeled [`help wanted`](https://github.com/katanemo/plano/labels/help%20wanted) or [`good first issue`](https://github.com/katanemo/plano/labels/good%20first%20issue)
+- Comment on any roadmap issue to volunteer or ask questions
+- Attend a [community meeting](#community-meetings) to discuss design or get unblocked
+
+## Current Focus Areas
+
+### Actively Working On
+
+These items are being implemented now. PRs are in flight or imminent.
+
+- **Content guard models via filter chains** — use off-the-shelf small language models (SLMs) such as Llama Guard, ShieldGemma, and WildGuard as content moderation filters for jailbreak detection, toxicity screening, and content safety. The legacy `prompt_guards` config is being deprecated in favor of this composable filter-chain approach.
+- **Gemini native protocol** — full support for Google's native Gemini API (`generateContent`, `streamGenerateContent`) as both a client-facing and upstream protocol, preserving Gemini-specific features that are otherwise lost when requests are translated to another API format
+- **Model fallback & retry** — automatic failover to the next ranked model on provider errors
+- **`prompt_guards` deprecation** — removing the legacy config path in favor of the filter-chain approach
+
+### Next Up
+
+Scoped and ready for contributors. If you want to help, these are the best places to start.
+
+- **Circuit breaking** — per-provider/model circuit breakers to prevent cascading failures
+- **PII detection & redaction** — configurable entity detection as a reference filter implementation
+- **Accurate token counting** — provider-specific tokenizers for correct rate limiting and cost attribution
+- **Response caching** — exact-match cache with configurable TTL, opt-out headers
+- **Full Responses API support** — complete coverage of OpenAI's Responses API tool types
+
+### Future
+
+Planned but not yet scoped in detail. These are good candidates for [PEPs](peps/PEP-0000-process.md).
+
+**Routing Intelligence**
+- Embedding-based semantic routing for high-throughput use cases
+- A/B testing with weighted traffic splitting and automatic metric collection
+- Latency SLO routing based on historical P99 data
+
+**Agentic Protocols**
+- MCP server mode — expose Plano routing and orchestration as MCP tools
+- A2A protocol — agent discovery and communication across platforms
+- Streaming request passthrough for large-context workloads
+
+**Observability & Evaluation**
+- Pre-built Grafana dashboards for Agentic Signals
+- Regression detection when signal quality degrades after changes
+- Evaluation dataset capture for offline eval
+- Prompt versioning correlated with signal quality
+
+**Developer Experience**
+- Client SDKs — typed Python, JavaScript, and Go clients
+- Authentication — built-in API key and JWT validation for multi-tenant deployments
+- Framework integrations for LangChain, CrewAI, Vercel AI SDK, and others
+
+**Extensibility**
+- WASM plugin SDK with stable ABI contract
+- Community plugin registry for guardrails, routers, and provider adapters
+- Python/JS filter runtime to lower the barrier beyond Rust/WASM
+
+## Release Process
+
+Plano follows a **time-based release cadence**, targeting a new release approximately every two weeks. Each release:
+
+- Is tagged and published to [GitHub Releases](https://github.com/katanemo/plano/releases) with release notes
+- Publishes Docker images to Docker Hub, GHCR, and DigitalOcean Container Registry
+- Publishes the `planoai` CLI to [PyPI](https://pypi.org/project/planoai/)
+- Publishes pre-built `brightstaff` binaries
+
+Features land in whichever release they're ready for. Large features that span multiple releases use the PEP process to track progress.
+
+## Community Meetings
+
+We hold regular community meetings open to all contributors:
+
+- **When:** Schedule posted on [Discord](https://discord.gg/pGZf2gcwEc) and GitHub Discussions
+- **Where:** Video link shared in the Discord `#community-meetings` channel
+- **What:** Demo new features, discuss active PEPs, triage roadmap items, answer questions
+- **Notes:** Published to GitHub Discussions after each meeting
+
+## Roadmap History
+
+| Version | Theme | Key Deliverables |
+|---|---|---|
+| v0.4.x | Foundation | Agent orchestration, filter chains, cost/latency routing, 17+ providers, Agentic Signals |
+| v0.5.x | _Planned_ | Gemini native protocol, content guard model demos, model fallback, `prompt_guards` deprecation, caching, PEP process |
+
+## Feedback
+
+Roadmap features and timelines may change based on community feedback, contributor capacity, and ecosystem shifts. If you depend on a specific item, you're encouraged to:
+
+- Comment on the relevant GitHub issue to register interest
+- Attend a community meeting to discuss timelines
+- Contribute directly — the fastest way to get a feature shipped
+
+Questions? Join our [Discord](https://discord.gg/pGZf2gcwEc) or open a [Discussion](https://github.com/katanemo/plano/discussions).
diff --git a/docs/peps/PEP-0000-process.md b/docs/peps/PEP-0000-process.md
new file mode 100644
index 00000000..7407ee81
--- /dev/null
+++ b/docs/peps/PEP-0000-process.md
@@ -0,0 +1,142 @@
+# PEP-0000: Plano Enhancement Proposal Process
+
+| Field | Value |
+|---|---|
+| **PEP** | 0000 |
+| **Title** | Plano Enhancement Proposal Process |
+| **Status** | Active |
+| **Authors** | Plano Maintainers |
+| **Created** | 2026-04-07 |
+
+## What is a PEP?
+
+A **Plano Enhancement Proposal (PEP)** is a design document that describes a significant change to the Plano project. PEPs provide a structured way to propose, discuss, and track major features, architectural changes, and process improvements.
+
+PEPs are inspired by [Kafka's KIP process](https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals), [Kubernetes KEPs](https://github.com/kubernetes/enhancements/tree/master/keps), and [Envoy's design document process](https://github.com/envoyproxy/envoy/blob/main/CONTRIBUTING.md).
+
+## When is a PEP Required?
+
+A PEP is required for changes that:
+
+- Introduce a new user-facing feature or capability
+- Change existing user-facing behavior in a breaking way
+- Add a new subsystem or architectural component
+- Modify the configuration schema in a significant way
+- Add a new LLM provider with non-standard API patterns
+- Change the project's processes or governance
+
+A PEP is **not** required for:
+
+- Bug fixes
+- Documentation improvements
+- Refactoring that doesn't change behavior
+- Adding models to an existing provider
+- Minor CLI improvements
+- Test improvements
+- Dependency updates
+
+When in doubt, open a GitHub issue or Discussion first. A maintainer will let you know if a PEP is warranted.
+
+## PEP Lifecycle
+
+```
+Draft → Under Review → Accepted → Implementing → Complete
+                  ↘ Declined
+                  ↘ Deferred
+                  ↘ Withdrawn
+```
+
+### States
+
+| State | Description |
+|---|---|
+| **Draft** | Author is writing the proposal. Not yet ready for formal review. |
+| **Under Review** | PR is open. Maintainers and community are discussing the design. |
+| **Accepted** | Maintainers have approved the design. Implementation can begin. |
+| **Implementing** | Accepted and actively being built. Linked to tracking issue(s). |
+| **Complete** | Fully implemented and released. |
+| **Declined** | Maintainers have decided not to pursue this proposal. The PEP remains in the repo for historical reference with an explanation of the decision. |
+| **Deferred** | Good idea, but not the right time. Will be reconsidered later. |
+| **Withdrawn** | Author has decided not to pursue this proposal. |
+
+## How to Submit a PEP
+
+### 1. Discuss First (Recommended)
+
+Before writing a full PEP, validate the idea:
+
+- Open a [GitHub Discussion](https://github.com/katanemo/plano/discussions) describing the problem and your proposed approach
+- Or bring it up in a community meeting (the schedule is posted on [Discord](https://discord.gg/pGZf2gcwEc))
+- Or open a GitHub issue tagged `enhancement`
+
+This step saves time by catching fundamental objections early.
+
+### 2. Write the PEP
+
+Copy `docs/peps/PEP-TEMPLATE.md` to `docs/peps/PEP-XXXX-short-title.md` (use the next available number). Fill in all sections. The template is deliberately structured — each section exists for a reason.
+
+Key guidelines:
+
+- **Be specific.** "Add caching" is too vague. "Add exact-match response cache with configurable TTL keyed by model + message hash" is actionable.
+- **Show the config.** If the feature involves user-facing configuration, include the YAML snippet users would write.
+- **Address trade-offs.** Every design has trade-offs. Acknowledging them strengthens the proposal.
+- **Include alternatives.** Explain what other approaches you considered and why you chose this one.
+
+### 3. Submit as a Pull Request
+
+Open a PR adding your PEP file to `docs/peps/`. The PR title should be `PEP-XXXX: Short Title`. Set the status to `Draft` or `Under Review` depending on readiness.
+
+### 4. Review and Discussion
+
+- At least **two maintainers** must review the PEP
+- Community members are encouraged to comment on the PR
+- The author is expected to respond to feedback and revise the proposal
+- Discussion should focus on the **design**, not implementation details (those belong in code review)
+- Complex PEPs may be discussed in a community meeting
+
+### 5. Decision
+
+Maintainers aim to provide **initial feedback within two weeks** of a PEP entering `Under Review`. Complex proposals may take longer, but the author should never be left without a response.
+
+A PEP is **accepted** when at least two maintainers approve the PR and there are no unresolved objections. The accepting maintainer merges the PR with the status set to `Accepted`.
+
+A PEP is **declined** when maintainers determine the proposal doesn't align with the project's direction or has fundamental issues that can't be resolved. The PR is merged (not closed) with the status set to `Declined` and a rationale recorded — declined PEPs remain in the repo as a record.
+
+**Resolving disagreements:** If maintainers disagree on a PEP, the proposal is discussed in a community meeting. If consensus still can't be reached, the project lead makes the final call and records the rationale in the PEP.
+
+### 6. Implementation
+
+Once accepted:
+
+- Create a tracking GitHub issue (or issues) for the implementation
+- Link the issue(s) in the PEP header
+- Update the PEP status to `Implementing`
+- Implementation PRs should reference the PEP number (e.g., "Part of PEP-0042")
+- When all implementation work is merged and released, update status to `Complete`
+
+## PEP Numbering
+
+- PEPs are numbered sequentially starting from 0001
+- PEP-0000 is reserved for this process document
+- The author picks the next available number when submitting
+
+## Roles
+
+| Role | Responsibility |
+|---|---|
+| **Author** | Writes the PEP, responds to review feedback, drives the proposal to a decision |
+| **Sponsor** | A maintainer who shepherds the PEP through review. Required for PEPs from non-maintainers. Find a sponsor by asking in Discord or a community meeting. |
+| **Reviewers** | Maintainers and community members who provide feedback on the design |
+
+## Amending Accepted PEPs
+
+If an accepted PEP needs changes during implementation:
+
+- For minor adjustments (implementation details, clarifications): update the PEP in place via a PR
+- For significant design changes: open a new PEP that supersedes the original, linking back to it
+
+## Index
+
+| PEP | Title | Status | Author |
+|---|---|---|---|
+| [0000](PEP-0000-process.md) | Plano Enhancement Proposal Process | Active | Plano Maintainers |
diff --git a/docs/peps/PEP-TEMPLATE.md b/docs/peps/PEP-TEMPLATE.md
new file mode 100644
index 00000000..0e920b7a
--- /dev/null
+++ b/docs/peps/PEP-TEMPLATE.md
@@ -0,0 +1,171 @@
+# PEP-XXXX: Title
+
+<!--
+Instructions:
+1. Copy this file to PEP-XXXX-short-title.md (pick the next available number)
+2. Fill in all sections below
+3. Submit as a PR to docs/peps/
+4. Delete these instructions before submitting
+-->
+
+| Field | Value |
+|---|---|
+| **PEP** | XXXX |
+| **Title** | |
+| **Status** | Draft |
+| **Author(s)** | Name (@github-handle) |
+| **Sponsor** | _(required for non-maintainers)_ |
+| **Created** | YYYY-MM-DD |
+| **Tracking Issue** | _(filled after acceptance)_ |
+| **Target Release** | _(filled after acceptance)_ |
+
+## Summary
+
+<!--
+One paragraph. What is this proposal, and why should someone care?
+A reader should be able to decide whether to keep reading from this section alone.
+-->
+
+## Motivation
+
+<!--
+Why is this change needed? What problem does it solve? Who benefits?
+Include concrete examples or user stories where possible.
+Link to GitHub issues, Discord discussions, or community meeting notes that motivated this proposal.
+-->
+
+### Goals
+
+<!--
+Bulleted list of what this PEP aims to achieve.
+-->
+
+### Non-Goals
+
+<!--
+Bulleted list of what this PEP explicitly does NOT aim to achieve.
+Being clear about scope prevents scope creep during review and implementation.
+-->
+
+## Design
+
+<!--
+The core of the proposal. Describe the technical design in enough detail that:
+1. Someone familiar with the codebase could implement it
+2. Someone unfamiliar could understand the approach and trade-offs
+
+Structure this section however makes sense for your proposal. Common subsections:
+-->
+
+### User-Facing Configuration
+
+<!--
+If this feature involves configuration changes, show the YAML that users would write.
+Include a complete, working example — not pseudocode.
+-->
+
+```yaml
+# Example configuration
+```
+
+### Architecture
+
+<!--
+How does this fit into Plano's existing architecture?
+Which crates/components are affected? What's the data flow?
+Diagrams are welcome (use Mermaid or link to an image).
+-->
+
+### API Changes
+
+<!--
+Any new or changed HTTP endpoints, headers, or response formats.
+-->
+
+### Behavior
+
+<!--
+Describe the runtime behavior in detail.
+What happens on the happy path? What happens on errors?
+How does this interact with existing features (routing, streaming, tracing, etc.)?
+-->
+
+## Alternatives Considered
+
+<!--
+What other approaches did you evaluate? Why did you choose this design over them?
+This section demonstrates thoroughness and helps reviewers understand the design space.
+
+For each alternative:
+- Brief description of the approach
+- Why it was rejected (trade-offs, complexity, limitations)
+-->
+
+## Compatibility
+
+<!--
+Does this change break any existing behavior? If so:
+- What breaks?
+- What's the migration path?
+- Should there be a deprecation period?
+
+If this is purely additive, say so explicitly.
+-->
+
+## Observability
+
+<!--
+How will operators know this feature is working correctly?
+- New metrics, traces, or log entries?
+- Integration with existing Agentic Signals?
+- Dashboard or alerting recommendations?
+-->
+
+## Security Considerations
+
+<!--
+Does this change affect Plano's security posture?
+- New attack surfaces?
+- Authentication/authorization implications?
+- Data handling (PII, credentials, etc.)?
+
+If not applicable, briefly explain why.
+-->
+
+## Test Plan
+
+<!--
+How will this be tested?
+- Unit tests (which crates?)
+- Integration tests
+- E2E tests (new demo or test scenario?)
+- Performance/load testing considerations
+
+Be specific enough that a reviewer can evaluate coverage.
+-->
+
+## Implementation Plan
+
+<!--
+How will this be implemented? Suggested breakdown:
+- Phases or PRs (if the work is large enough to split)
+- Which crates/files are primarily affected
+- Estimated complexity (small / medium / large)
+- Any dependencies on other work
+-->
+
+## Open Questions
+
+<!--
+Unresolved design questions that you'd like feedback on during review.
+Number them so reviewers can reference specific questions.
+
+Remove this section (or mark all as resolved) before the PEP is accepted.
+-->
+
+## References
+
+<!--
+Links to related GitHub issues, discussions, external documentation,
+research papers, or prior art in other projects.
+-->