document filter_chain support on model listeners

2026-05-21 13:55:15 +02:00 · 2026-03-12 15:46:07 -07:00 · 2026-03-12 15:46:07 -07:00 · 60a3f0ecab
commit 60a3f0ecab
parent 4e290fb715
3 changed files with 42 additions and 0 deletions
--- a/docs/source/concepts/filter_chain.rst
+++ b/docs/source/concepts/filter_chain.rst
@ -31,6 +31,9 @@ Because these behaviors live in the dataplane rather than inside individual agen
 Configuration example
 ---------------------
 Agent listener filter chain
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 The example below shows a configuration where an agent uses a filter chain with two filters: a query rewriter,
 and a context builder that prepares retrieval context before the agent runs.
@ -46,6 +49,38 @@ In this setup:
 * The ``listeners`` section wires the ``rag_agent`` behind an ``agent`` listener and attaches a ``filter_chain`` with ``query_rewriter`` followed by ``context_builder``.
 * When a request arrives at ``agent_1``, Plano executes the filters in order before handing control to ``rag_agent``.
 Model listener filter chain
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Filter chains can also be attached directly to a **model listener**. This lets you run input guardrails on
 direct LLM proxy requests (``/v1/chat/completions``, ``/v1/responses``, etc.) without an agent layer in between.
 .. code-block:: yaml
    :caption: Model listener with a content-safety filter chain
    filters:
      - id: content_guard
        url: http://content-guard:10500
        type: http
    model_providers:
      - model: openai/gpt-4o-mini
        access_key: $OPENAI_API_KEY
        default: true
    listeners:
      - type: model
        name: llm_gateway
        port: 12000
        filter_chain:
          - content_guard
 In this setup:
 * The ``filter_chain`` is declared at the listener level (not per-agent).
 * When a request arrives at the model listener, Plano executes the filters in order before forwarding the request to the upstream LLM provider.
 * If a filter rejects the request (HTTP 4xx), the error is returned to the caller and the LLM is never called.
 Filter Chain Programming Model (HTTP and MCP)
 ---------------------------------------------
--- a/docs/source/concepts/listeners.rst
+++ b/docs/source/concepts/listeners.rst
@ -57,6 +57,10 @@ Under the hood, Plano opens outbound HTTP(S) connections to upstream LLM provide
 smart model routing. For more details on how Plano talks to models and how providers are configured, see
 :ref:`LLM providers <llm_providers>`.
 Model listeners also support :ref:`Filter Chains <filter_chain>`. By adding a ``filter_chain`` to a model listener
 you can run input guardrails, content-safety checks, or other preprocessing on direct LLM requests before they reach
 the upstream provider — without requiring an agent layer.
 Configure Listeners
 ^^^^^^^^^^^^^^^^^^^
--- a/docs/source/resources/includes/plano_config_full_reference.yaml
+++ b/docs/source/resources/includes/plano_config_full_reference.yaml
@ -66,6 +66,9 @@ listeners:
    name: model_1
    address: 0.0.0.0
    port: 12000
    # Optional: attach a filter chain for input guardrails on direct LLM requests
    # filter_chain:
    #   - input_guards
  # Prompt listener for function calling (for prompt_targets)
  - type: prompt