mirror of
https://github.com/katanemo/plano.git
synced 2026-05-18 13:45:15 +02:00
deploy: e7b0de2a72
This commit is contained in:
parent
f50f1bb4a6
commit
ed2124f773
29 changed files with 64 additions and 64 deletions
|
|
@ -160,11 +160,11 @@
|
|||
<p>A few definitions before we dive into the main architecture documentation. Also note, Arch borrows from Envoy’s terminology
|
||||
to keep things consistent in logs and traces, and introduces and clarifies concepts as they relate to LLM applications.</p>
|
||||
<p><strong>Agent</strong>: An application that uses LLMs to handle wide-ranging tasks from users via prompts. This could be as simple
|
||||
as retrieving or summarizing data from an API, or being able to trigger compleix actions like adjusting ad campaigns, or
|
||||
as retrieving or summarizing data from an API, or being able to trigger complex actions like adjusting ad campaigns, or
|
||||
changing travel plans via prompts.</p>
|
||||
<p><strong>Arch Config</strong>: Arch operates based on a configuration that controls the behavior of a single instance of the Arch gateway.
|
||||
This is where you enable capabilities like LLM routing, fast function calling (via prompt_targets), applying guardrails, and enabling critical
|
||||
features like metrics and tracing. For the full configuration reference of <cite>arch_config.yaml</cite> see <a class="reference internal" href="../../resources/configuration_reference.html#configuration-refernce"><span class="std std-ref">here</span></a>.</p>
|
||||
features like metrics and tracing. For the full configuration reference of <cite>arch_config.yaml</cite> see <a class="reference internal" href="../../resources/configuration_reference.html#configuration-reference"><span class="std std-ref">here</span></a>.</p>
|
||||
<p><strong>Downstream(Ingress)</strong>: A downstream client (web application, etc.) connects to Arch, sends prompts, and receives responses.</p>
|
||||
<p><strong>Upstream(Egress)</strong>: An upstream host that receives connections and prompts from Arch, and returns context or responses for a prompt.</p>
|
||||
<a class="reference internal image-reference" href="../../_images/network-topology-ingress-egress.jpg"><img alt="../../_images/network-topology-ingress-egress.jpg" class="align-center" src="../../_images/network-topology-ingress-egress.jpg" style="width: 100%;"/>
|
||||
|
|
@ -183,10 +183,10 @@ For more details, check out <a class="reference internal" href="../llm_provider.
|
|||
undifferentiated work in building generative AI apps. Prompt targets are endpoints that receive prompts that are processed by Arch.
|
||||
For example, Arch enriches incoming prompts with metadata like knowing when a request is a follow-up or clarifying prompt so that you
|
||||
can build faster, more accurate retrieval (RAG) apps. To support agentic apps, like scheduling travel plans or sharing comments on a
|
||||
document - via prompts, Arch uses its function calling abilities to extract critical information fromthe incoming prompt (or a set of
|
||||
document - via prompts, Arch uses its function calling abilities to extract critical information from the incoming prompt (or a set of
|
||||
prompts) needed by a downstream backend API or function call before calling it directly.</p>
|
||||
<p><strong>Model Serving</strong>: Arch is a set of <cite>two</cite> self-contained processes that are designed to run alongside your application servers
|
||||
(or on a separate hostconnected via a network).The <a class="reference internal" href="model_serving.html#model-serving"><span class="std std-ref">model serving</span></a> process helps Arch make intelligent decisions
|
||||
(or on a separate host connected via a network). The <a class="reference internal" href="model_serving.html#model-serving"><span class="std std-ref">model serving</span></a> process helps Arch make intelligent decisions
|
||||
about the incoming prompts. The model server is designed to call the (fast) purpose-built LLMs in Arch.</p>
|
||||
<p><strong>Error Target</strong>: <a class="reference internal" href="error_target.html#error-target"><span class="std std-ref">Error targets</span></a> are those endpoints that receive forwarded errors from Arch when issues arise,
|
||||
such as failing to properly call a function/API, detecting violations of guardrails, or encountering other processing errors.
|
||||
|
|
@ -216,7 +216,7 @@ and take appropriate actions.</p>
|
|||
</div><footer class="py-6 border-t border-border md:py-0">
|
||||
<div class="container flex flex-col items-center justify-between gap-4 md:h-24 md:flex-row">
|
||||
<div class="flex flex-col items-center gap-4 px-8 md:flex-row md:gap-2 md:px-0">
|
||||
<p class="text-sm leading-loose text-center text-muted-foreground md:text-left">© 2025, Katanemo Labs, Inc Last updated: Apr 06, 2025. </p>
|
||||
<p class="text-sm leading-loose text-center text-muted-foreground md:text-left">© 2025, Katanemo Labs, Inc Last updated: Apr 13, 2025. </p>
|
||||
</div>
|
||||
</div>
|
||||
</footer>
|
||||
|
|
|
|||