Mirror of https://github.com/katanemo/plano.git (synced 2026-06-17 15:25:17 +02:00)
Commit 0416573c82: Merge remote-tracking branch 'origin/main' into musa/cli
293 changed files with 8728 additions and 9045 deletions

@@ -4,10 +4,10 @@ Configuration Reference
 =======================
 
 The following is a complete reference of the ``plano_config.yml`` that controls the behavior of a single instance of
-the Arch gateway. This where you enable capabilities like routing to upstream LLm providers, defining prompt_targets
+the Plano gateway. This where you enable capabilities like routing to upstream LLm providers, defining prompt_targets
 where prompts get routed to, apply guardrails, and enable critical agent observability features.
 
-.. literalinclude:: includes/arch_config_full_reference.yaml
+.. literalinclude:: includes/plano_config_full_reference.yaml
     :language: yaml
     :linenos:
-    :caption: :download:`Plano Configuration - Full Reference <includes/arch_config_full_reference.yaml>`
+    :caption: :download:`Plano Configuration - Full Reference <includes/plano_config_full_reference.yaml>`

@@ -25,7 +25,7 @@ Create a ``docker-compose.yml`` file with the following configuration:
 # docker-compose.yml
 services:
   plano:
-    image: katanemo/plano:0.4.6
+    image: katanemo/plano:0.4.7
     container_name: plano
     ports:
       - "10000:10000" # ingress (client -> plano)

@@ -30,7 +30,7 @@ listeners:
   - type: agent
     name: agent_1
     port: 8001
-    router: arch_agent_router
+    router: plano_agent_router
     agents:
       - id: rag_agent
         description: virtual assistant for retrieval augmented generation tasks

@@ -1,4 +1,4 @@
-# Arch Gateway configuration version
+# Plano Gateway configuration version
 version: v0.3.0
 
 # External HTTP agents - API type is controlled by request path (/v1/responses, /v1/messages, /v1/chat/completions)

@@ -41,7 +41,7 @@ The request processing path in Plano has three main parts:
 
 These two subsystems are bridged with either the HTTP router filter, and the cluster manager subsystems of Envoy.
 
-Also, Plano utilizes `Envoy event-based thread model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <arch_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.
+Also, Plano utilizes `Envoy event-based thread model <https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310>`_. A main thread is responsible for the server lifecycle, configuration processing, stats, etc. and some number of :ref:`worker threads <plano_overview_threading>` process requests. All threads operate around an event loop (`libevent <https://libevent.org/>`_) and any given downstream TCP connection will be handled by exactly one worker thread for its lifetime. Each worker thread maintains its own pool of TCP connections to upstream endpoints.
 
 Worker threads rarely share state and operate in a trivially parallel fashion. This threading model
 enables scaling to very high core count CPUs.

@@ -130,8 +130,8 @@ Once a request completes, the stream is destroyed. The following also takes plac
 * The post-request :ref:`monitoring <monitoring>` are updated (e.g. timing, active requests, upgrades, health checks).
   Some statistics are updated earlier however, during request processing. Stats are batched and written by the main
   thread periodically.
-* :ref:`Access logs <arch_access_logging>` are written to the access log
-* :ref:`Trace <arch_overview_tracing>` spans are finalized. If our example request was traced, a
+* :ref:`Access logs <plano_access_logging>` are written to the access log
+* :ref:`Trace <plano_overview_tracing>` spans are finalized. If our example request was traced, a
   trace span, describing the duration and details of the request would be created by the HCM when
   processing request headers and then finalized by the HCM during post-request processing.
 

@@ -1,4 +1,4 @@
-.. _arch_overview_threading:
+.. _plano_overview_threading:
 
 Threading Model
 ===============