mirror of
https://github.com/katanemo/plano.git
synced 2026-04-29 19:06:34 +02:00
Doc Update (#129)
* init update
* Update terminology.rst
* fix the branch to create an index.html, and fix pre-commit issues
* Doc update
* made several changes to the docs after Shuguang's revision
* fixing pre-commit issues
* fixed the reference file to the final prompt config file
* added google analytics

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
parent
2a7b95582c
commit
5c7567584d
49 changed files with 1185 additions and 609 deletions
21
docs/source/concepts/tech_overview/threading_model.rst
Normal file
@@ -0,0 +1,21 @@
.. _arch_overview_threading:

Threading Model
===============

Arch builds on Envoy's single-process, multi-threaded architecture.

A single *primary* thread handles sporadic coordination tasks, while a configurable number of *worker*
threads perform filtering and forwarding.

Once a connection is accepted, it spends the rest of its lifetime bound to a single worker
thread. All functionality around prompt handling for a downstream client runs in a separate worker thread.
This allows the majority of Arch to be single-threaded (embarrassingly parallel), with a small amount
of more complex code handling coordination between the worker threads.

In general, Arch is written to be 100% non-blocking.

.. tip::

   For most workloads, we recommend setting the number of worker threads equal to the number of
   hardware threads on the machine.
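
The connection-to-worker binding described above can be illustrated with a small, self-contained Python sketch. This is not Arch or Envoy code: the ``Worker`` class, ``run_demo``, ``default_worker_count``, and the modulo rule used to bind a connection to a worker are all hypothetical, chosen only to show that every event for a given connection is handled by the same thread and that workers share no connection state. For brevity it uses blocking queues rather than a non-blocking event loop.

```python
import os
import queue
import threading


def default_worker_count():
    """One worker per hardware thread, falling back to 1 if the count is unknown."""
    return os.cpu_count() or 1


class Worker(threading.Thread):
    """Owns a private event queue; connection state is never shared across workers."""

    def __init__(self, index):
        super().__init__(daemon=True)
        self.index = index
        self.events = queue.Queue()
        self.handled = []  # (conn_id, payload) pairs processed by this worker

    def run(self):
        while True:
            event = self.events.get()
            if event is None:  # sentinel: shut this worker down
                break
            self.handled.append(event)


def run_demo(num_workers, events):
    """Route each (conn_id, payload) event to the worker that owns conn_id."""
    workers = [Worker(i) for i in range(num_workers)]
    for w in workers:
        w.start()
    for conn_id, payload in events:
        # Hypothetical binding rule: a connection always maps to the same worker.
        workers[conn_id % num_workers].events.put((conn_id, payload))
    for w in workers:
        w.events.put(None)
        w.join()
    return {w.index: w.handled for w in workers}
```

Calling ``run_demo(default_worker_count(), events)`` mirrors the recommendation in the tip: one worker per hardware thread, with each connection's events staying on a single worker for the connection's lifetime.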