add output filter chain (#822)

2026-05-18 13:45:15 +02:00 · 2026-03-18 17:58:20 -07:00 · 1f23c573bf
commit 1f23c573bf
parent de2d8847f3
59 changed files with 2961 additions and 2621 deletions
--- a/docs/source/concepts/filter_chain.rst
+++ b/docs/source/concepts/filter_chain.rst
@@ -31,6 +31,9 @@ Because these behaviors live in the dataplane rather than inside individual agen
 Configuration example
 ---------------------

+Agent listener filter chain
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 The example below shows a configuration where an agent uses a filter chain with two filters: a query rewriter,
 and a context builder that prepares retrieval context before the agent runs.

@@ -46,6 +49,38 @@ In this setup:
 * The ``listeners`` section wires the ``rag_agent`` behind an ``agent`` listener and attaches a ``filter_chain`` with ``query_rewriter`` followed by ``context_builder``.
 * When a request arrives at ``agent_1``, Plano executes the filters in order before handing control to ``rag_agent``.

+Model listener filter chain
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Filter chains can also be attached directly to a **model listener**. This lets you run input guardrails on
+direct LLM proxy requests (``/v1/chat/completions``, ``/v1/responses``, etc.) without an agent layer in between.
+
+.. code-block:: yaml
+    :caption: Model listener with a content-safety filter chain
+
+    filters:
+      - id: content_guard
+        url: http://content-guard:10500
+        type: http
+
+    model_providers:
+      - model: openai/gpt-4o-mini
+        access_key: $OPENAI_API_KEY
+        default: true
+
+    listeners:
+      - type: model
+        name: llm_gateway
+        port: 12000
+        filter_chain:
+          - content_guard
+
+In this setup:
+
+* The ``filter_chain`` is declared at the listener level (not per-agent).
+* When a request arrives at the model listener, Plano executes the filters in order before forwarding the request to the upstream LLM provider.
+* If a filter rejects the request (HTTP 4xx), the error is returned to the caller and the LLM is never called.
+

 Filter Chain Programming Model (HTTP and MCP)
 ---------------------------------------------
--- a/docs/source/concepts/listeners.rst
+++ b/docs/source/concepts/listeners.rst
@@ -57,6 +57,10 @@ Under the hood, Plano opens outbound HTTP(S) connections to upstream LLM provide
 smart model routing. For more details on how Plano talks to models and how providers are configured, see
 :ref:`LLM providers <llm_providers>`.

+Model listeners also support :ref:`Filter Chains <filter_chain>`. By adding a ``filter_chain`` to a model listener,
+you can run input guardrails, content-safety checks, or other preprocessing on direct LLM requests before they reach
+the upstream provider, without requiring an agent layer.
+
 Configure Listeners
 ^^^^^^^^^^^^^^^^^^^