mirror of https://github.com/katanemo/plano.git
synced 2026-04-25 16:56:24 +02:00
deploy: b30ad791f7
parent f4b686c7fc
commit 3e881c6eec
28 changed files with 819 additions and 820 deletions
@@ -101,9 +101,10 @@
<li class="toctree-l2"><a class="reference internal" href="terminology.html">Terminology</a></li>
<li class="toctree-l2 current"><a class="current reference internal" href="#">Threading Model</a></li>
<li class="toctree-l2"><a class="reference internal" href="listener.html">Listener</a></li>
<li class="toctree-l2"><a class="reference internal" href="prompt.html">Prompts</a></li>
<li class="toctree-l2"><a class="reference internal" href="model_serving.html">Model Serving</a></li>
<li class="toctree-l2"><a class="reference internal" href="prompt.html">Prompt</a></li>
<li class="toctree-l2"><a class="reference internal" href="request_lifecycle.html">Request Lifecycle</a></li>
<li class="toctree-l2"><a class="reference internal" href="error_target.html">Error Target</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../llm_provider.html">LLM Provider</a></li>
@@ -128,7 +129,6 @@
<p class="caption" role="heading"><span class="caption-text">Resources</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../resources/configuration_reference.html">Configuration Reference</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../resources/error_target.html">Error Targets</a></li>
</ul>
</nav>
</div>
@@ -161,7 +161,7 @@ threads perform filtering, and forwarding.</p>
thread. All the functionality around prompt handling from a downstream client is handled in a separate worker thread.
This allows the majority of Arch to be largely single threaded (embarrassingly parallel) with a small amount
of more complex code handling coordination between the worker threads.</p>
<p>Generally Arch is written to be 100% non-blocking.</p>
<p>Generally, Arch is written to be 100% non-blocking.</p>
<div class="admonition tip">
<p class="admonition-title">Tip</p>
<p>For most workloads we recommend configuring the number of worker threads to be equal to the number of