Merge remote-tracking branch 'origin/main' into musa/cli

This commit is contained in:
Musa 2026-02-17 07:28:07 -08:00
commit 0416573c82
293 changed files with 8728 additions and 9045 deletions

@@ -4,10 +4,10 @@ Configuration Reference
=======================
The following is a complete reference of the ``plano_config.yml`` that controls the behavior of a single instance of
the Arch gateway. This is where you enable capabilities like routing to upstream LLM providers, defining prompt_targets
the Plano gateway. This is where you enable capabilities like routing to upstream LLM providers, defining prompt_targets
where prompts get routed to, applying guardrails, and enabling critical agent observability features.
.. literalinclude:: includes/arch_config_full_reference.yaml
.. literalinclude:: includes/plano_config_full_reference.yaml
:language: yaml
:linenos:
:caption: :download:`Plano Configuration - Full Reference <includes/arch_config_full_reference.yaml>`
:caption: :download:`Plano Configuration - Full Reference <includes/plano_config_full_reference.yaml>`

@@ -25,7 +25,7 @@ Create a ``docker-compose.yml`` file with the following configuration:
# docker-compose.yml
services:
plano:
image: katanemo/plano:0.4.6
image: katanemo/plano:0.4.7
container_name: plano
ports:
- "10000:10000" # ingress (client -> plano)

@@ -30,7 +30,7 @@ listeners:
- type: agent
name: agent_1
port: 8001
router: arch_agent_router
router: plano_agent_router
agents:
- id: rag_agent
description: virtual assistant for retrieval augmented generation tasks

@@ -1,4 +1,4 @@
# Arch Gateway configuration version
# Plano Gateway configuration version
version: v0.3.0
# External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)

@@ -41,7 +41,7 @@ The request processing path in Plano has three main parts:
These two subsystems are bridged by the HTTP router filter and the cluster manager subsystems of Envoy.
Also, Plano utilizes the `Envoy event-based threading model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <arch_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.
Also, Plano utilizes the `Envoy event-based threading model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <plano_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.
Worker threads rarely share state and operate in a trivially parallel fashion. This threading model
enables scaling to very high core count CPUs.
@@ -130,8 +130,8 @@ Once a request completes, the stream is destroyed. The following also takes plac
* The post-request :ref:`monitoring <monitoring>` statistics are updated (e.g. timing, active requests, upgrades, health checks).
Some statistics are updated earlier, however, during request processing. Stats are batched and written by the main
thread periodically.
* :ref:`Access logs <arch_access_logging>` are written to the access log
* :ref:`Trace <arch_overview_tracing>` spans are finalized. If our example request was traced, a
* :ref:`Access logs <plano_access_logging>` are written to the access log
* :ref:`Trace <plano_overview_tracing>` spans are finalized. If our example request was traced, a
trace span, describing the duration and details of the request would be created by the HCM when
processing request headers and then finalized by the HCM during post-request processing.

@@ -1,4 +1,4 @@
.. _arch_overview_threading:
.. _plano_overview_threading:
Threading Model
===============