mirror of
https://github.com/katanemo/plano.git
synced 2026-04-26 01:06:25 +02:00
deploy: b30ad791f7
This commit is contained in:
parent
f4b686c7fc
commit
3e881c6eec
28 changed files with 819 additions and 820 deletions
|
|
@ -101,9 +101,10 @@
|
|||
<li class="toctree-l2 current"><a class="current reference internal" href="#">Terminology</a></li>
|
||||
<li class="toctree-l2"><a class="reference internal" href="threading_model.html">Threading Model</a></li>
|
||||
<li class="toctree-l2"><a class="reference internal" href="listener.html">Listener</a></li>
|
||||
<li class="toctree-l2"><a class="reference internal" href="prompt.html">Prompts</a></li>
|
||||
<li class="toctree-l2"><a class="reference internal" href="model_serving.html">Model Serving</a></li>
|
||||
<li class="toctree-l2"><a class="reference internal" href="prompt.html">Prompt</a></li>
|
||||
<li class="toctree-l2"><a class="reference internal" href="request_lifecycle.html">Request Lifecycle</a></li>
|
||||
<li class="toctree-l2"><a class="reference internal" href="error_target.html">Error Target</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="toctree-l1"><a class="reference internal" href="../llm_provider.html">LLM Provider</a></li>
|
||||
|
|
@ -128,7 +129,6 @@
|
|||
<p class="caption" role="heading"><span class="caption-text">Resources</span></p>
|
||||
<ul>
|
||||
<li class="toctree-l1"><a class="reference internal" href="../../resources/configuration_reference.html">Configuration Reference</a></li>
|
||||
<li class="toctree-l1"><a class="reference internal" href="../../resources/error_target.html">Error Targets</a></li>
|
||||
</ul>
|
||||
</nav>
|
||||
</div>
|
||||
|
|
@ -160,31 +160,31 @@ to keep things consistent in logs, traces and in code.</p>
|
|||
<p><strong>Upstream (Egress)</strong>: An upstream host that receives connections and prompts from Arch, and returns context or responses for a prompt.</p>
|
||||
<a class="reference internal image-reference" href="../../_images/network-topology-ingress-egress.jpg"><img alt="../../_images/network-topology-ingress-egress.jpg" class="align-center" src="../../_images/network-topology-ingress-egress.jpg" style="width: 100%;"/>
|
||||
</a>
|
||||
<p><strong>Listener</strong>: A listener is a named network location (e.g., port, address, path etc.) that Arch listens on to process prompts
|
||||
<p><strong>Listener</strong>: A <a class="reference internal" href="listener.html#arch-overview-listeners"><span class="std std-ref">listener</span></a> is a named network location (e.g., port, address, path etc.) that Arch listens on to process prompts
|
||||
before forwarding them to your application server endpoints. Arch enables you to configure one listener for downstream connections
|
||||
(like port 80, 443) and creates a separate internal listener for calls that initiate from your application code to LLMs.</p>
|
||||
<div class="admonition note">
|
||||
<p class="admonition-title">Note</p>
|
||||
<p>When you start Arch, you specify a listener address/port that you want to bind downstream. But, Arch uses a predefined port
|
||||
that you can use (<code class="docutils literal notranslate"><span class="pre">127.0.0.1:10000</span></code>) to proxy egress calls originating from your application to LLMs (API-based or hosted).
|
||||
For more details, check out <a class="reference internal" href="../llm_provider.html#llm-provider"><span class="std std-ref">LLM providers</span></a></p>
|
||||
For more details, check out <a class="reference internal" href="../llm_provider.html#llm-provider"><span class="std std-ref">LLM provider</span></a>.</p>
|
||||
</div>
|
||||
<p><strong>Instance</strong>: An instance of the Arch gateway. When you start Arch, it creates at most two processes: one to handle Layer 7
|
||||
networking operations (auth, TLS, observability, etc.) and the second process to serve models that enable it to make smart
|
||||
decisions on how to accept, handle and forward prompts. The second process is optional, as the model serving service could be
|
||||
hosted on a different network (an API call). But these two processes are considered a single instance of Arch.</p>
|
||||
<p><strong>Prompt Targets</strong>: Arch offers a primitive called <code class="docutils literal notranslate"><span class="pre">prompt_targets</span></code> to help separate business logic from undifferentiated
|
||||
<p><strong>Prompt Target</strong>: Arch offers a primitive called <a class="reference internal" href="../prompt_target.html#prompt-target"><span class="std std-ref">prompt_target</span></a> to help separate business logic from undifferentiated
|
||||
work in building generative AI apps. Prompt targets are endpoints that receive prompts that are processed by Arch.
|
||||
For example, Arch enriches incoming prompts with metadata like knowing when a request is a follow-up or clarifying prompt
|
||||
so that you can build faster, more accurate retrieval (RAG) apps. To support agentic apps, like scheduling travel plans or
|
||||
sharing comments on a document - via prompts, Arch uses its function calling abilities to extract critical information from
|
||||
the incoming prompt (or a set of prompts) needed by a downstream backend API or function call before calling it directly.</p>
|
||||
<p><strong>Error Targets</strong>: Error targets are those endpoints that receive forwarded errors from Arch when issues arise,
|
||||
<p><strong>Error Target</strong>: <a class="reference internal" href="error_target.html#error-target"><span class="std std-ref">Error targets</span></a> are those endpoints that receive forwarded errors from Arch when issues arise,
|
||||
such as failing to properly call a function/API, detecting violations of guardrails, or encountering other processing errors.
|
||||
These errors are communicated to the application via headers (X-Arch-[ERROR-TYPE]), allowing it to handle the errors gracefully
|
||||
These errors are communicated to the application via headers <code class="docutils literal notranslate"><span class="pre">X-Arch-[ERROR-TYPE]</span></code>, allowing it to handle the errors gracefully
|
||||
and take appropriate actions.</p>
|
||||
<p><strong>Model Serving</strong>: Arch is a set of <strong>two</strong> self-contained processes that are designed to run alongside your application servers
|
||||
(or on a separate host connected via a network). The <strong>model serving</strong> process helps Arch make intelligent decisions about the
|
||||
<p><strong>Model Serving</strong>: Arch is a set of <cite>two</cite> self-contained processes that are designed to run alongside your application servers
|
||||
(or on a separate host connected via a network). The <a class="reference internal" href="model_serving.html#model-serving"><span class="std std-ref">model serving</span></a> process helps Arch make intelligent decisions about the
|
||||
incoming prompts. The model server is designed to call the (fast) purpose-built LLMs in Arch.</p>
|
||||
</section>
|
||||
</div><div class="flex justify-between items-center pt-6 mt-12 border-t border-border gap-4">
|
||||