plano/docs/source/resources/tech_overview/threading_model.rst
Adil Hafeez db7b9ca2af
Update remaining arch references in docs
2026-02-11 20:26:04 -08:00


.. _plano_overview_threading:

Threading Model
===============

Plano builds on Envoy's single-process, multi-threaded architecture.

A single *primary* thread handles various sporadic coordination tasks, while some number of *worker*
threads perform filtering and forwarding.

Once a connection is accepted, it stays bound to a single worker thread for the rest of its lifetime.
All prompt handling for a downstream client is likewise handled on a worker thread rather than on the
primary thread. This keeps the majority of Plano effectively single-threaded (embarrassingly parallel),
with only a small amount of more complex code coordinating between the worker threads.

Generally, Plano is written to be 100% non-blocking.
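
The connection-to-worker binding described above can be illustrated with a small, self-contained
sketch. This is not Plano or Envoy code: the ``Worker`` class, the ``run_demo`` helper, and the
round-robin assignment are hypothetical names and choices made only for illustration. It shows why
per-connection state needs no locking when each connection is owned by exactly one thread.

.. code-block:: python

   import queue
   import threading


   class Worker(threading.Thread):
       """Hypothetical worker: owns a set of connections for their whole lifetime."""

       def __init__(self, worker_id):
           super().__init__(daemon=True)
           self.worker_id = worker_id
           self.inbox = queue.Queue()
           # Per-connection state. Only this thread ever touches it, so no lock is needed.
           self.handled = {}

       def run(self):
           while True:
               item = self.inbox.get()
               if item is None:  # sentinel: shut down
                   break
               conn_id, event = item
               self.handled.setdefault(conn_id, []).append(event)


   def run_demo(num_workers=2, num_connections=6, events_per_connection=3):
       workers = [Worker(i) for i in range(num_workers)]
       for w in workers:
           w.start()

       # Assumption: round-robin stands in for however a connection reaches a worker.
       # The binding is decided once and never changes afterwards.
       owner = {c: workers[c % num_workers] for c in range(num_connections)}

       for step in range(events_per_connection):
           for conn_id, worker in owner.items():
               worker.inbox.put((conn_id, f"event-{step}"))

       for w in workers:
           w.inbox.put(None)
       for w in workers:
           w.join()
       return workers

After ``run_demo()`` returns, every connection id appears in the ``handled`` map of exactly one
worker, with its events in the order they were sent. The only shared structures are the inbox
queues, which mirrors the idea of keeping cross-thread coordination small.
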

.. tip::

   For most workloads we recommend configuring the number of worker threads to be equal to the number of
   hardware threads on the machine.
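
As a rough aid for applying this recommendation, the sketch below reads the number of hardware
(logical) threads the operating system reports. The ``recommended_worker_threads`` helper is a
hypothetical name, not part of Plano. Envoy itself accepts a worker count through its
``--concurrency`` command-line option; how Plano surfaces that setting is not covered on this page,
so treat the wiring as an assumption to verify against your deployment.

.. code-block:: python

   import os


   def recommended_worker_threads(default=1):
       """Return the logical CPU count, or ``default`` if it cannot be determined."""
       count = os.cpu_count()  # may return None on some platforms
       return count if count else default


   if __name__ == "__main__":
       print(f"Suggested worker threads: {recommended_worker_threads()}")

In containers with CPU limits, ``os.cpu_count()`` typically reports the host's logical CPUs rather
than the container quota, so a lower value may suit those environments better.
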