
.. _arch_overview_threading:

Threading Model
===============

Arch builds on Envoy's single-process, multi-threaded architecture.

A single *primary* thread handles various sporadic coordination tasks, while some number of *worker*
threads perform filtering and forwarding.

Once a connection is accepted, it stays bound to a single worker thread for the rest of its lifetime.
All prompt handling for a downstream client runs in a separate worker thread.
This allows the majority of Arch to be largely single-threaded (embarrassingly parallel), with a small amount
of more complex code handling coordination between the worker threads.

Arch is generally written to be 100% non-blocking.
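
The binding of a connection to one worker can be illustrated with a small, self-contained sketch.
This is not Arch or Envoy code: the ``Worker`` and ``Dispatcher`` classes, the round-robin assignment,
and the use of Python threads and queues are all assumptions made for illustration. In Envoy, worker
threads accept connections themselves and run non-blocking event loops. The sketch only shows the
property described above: every event for a given connection is processed by the same worker, so
per-connection state needs no locking.

.. code-block:: python

   import queue
   import threading


   class Worker(threading.Thread):
       """Hypothetical worker: owns a private event queue and drains it serially."""

       def __init__(self, index):
           super().__init__(name=f"worker-{index}", daemon=True)
           self.events = queue.Queue()
           # (connection_id, payload) pairs; only this thread appends to the list.
           self.handled = []

       def run(self):
           while True:
               event = self.events.get()
               if event is None:  # shutdown sentinel
                   break
               self.handled.append(event)


   class Dispatcher:
       """Hypothetical stand-in for the accept path: binds each connection to one worker."""

       def __init__(self, num_workers):
           self.workers = [Worker(i) for i in range(num_workers)]
           self.bindings = {}  # connection_id -> Worker
           for worker in self.workers:
               worker.start()

       def accept(self, connection_id):
           # Round-robin assignment is an assumption of this sketch.
           worker = self.workers[len(self.bindings) % len(self.workers)]
           self.bindings[connection_id] = worker
           return worker

       def dispatch(self, connection_id, payload):
           # Every event for a connection goes to the worker it was bound to at accept time.
           self.bindings[connection_id].events.put((connection_id, payload))

       def shutdown(self):
           for worker in self.workers:
               worker.events.put(None)
           for worker in self.workers:
               worker.join()

Because each worker touches only its own connections, the only shared state in the sketch is the
binding table and the queues, which mirrors the "small amount of more complex code" that coordinates
between worker threads.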

.. tip::

   For most workloads we recommend configuring the number of worker threads to be equal to the number of
   hardware threads on the machine.
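
As a rough way to find the value the tip refers to, the following sketch reads the number of hardware
threads visible to the operating system. The helper name ``recommended_worker_count`` is hypothetical.
Envoy itself takes a worker count through its ``--concurrency`` option; whether and how Arch exposes
that setting is not covered here.

.. code-block:: python

   import os


   def recommended_worker_count(default=1):
       """Return the number of hardware threads, or ``default`` if it cannot be determined."""
       count = os.cpu_count()  # may return None on some platforms
       return count if count else default

In containers or on machines with CPU limits, the number of usable hardware threads may be lower than
what ``os.cpu_count()`` reports, so treat the result as a starting point.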