Docs branch - v1 of our tech docs (#69)

* added the first set of docs for our technical docs * more docuemtnation changes * added support for prompt processing and updated life of a request * updated docs to including getting help sections and updated life of a request * committing local changes for getting started guide, sample applications, and full reference spec for prompt-config * updated configuration reference, added sample app skeleton, updated favico * fixed the configuration refernce file, and made minor changes to the intent detection. commit v1 for now --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local> Co-authored-by: Adil Hafeez <adil@katanemo.com>
2026-05-15 11:02:39 +02:00 · 2024-09-20 17:08:42 -07:00 · 2024-09-20 17:08:42 -07:00 · 80c554ce1a
commit 80c554ce1a
parent 233976a568
34 changed files with 1040 additions and 0 deletions
--- a/docs/source/intro/what_is_arch.rst
+++ b/docs/source/intro/what_is_arch.rst
@ -0,0 +1,72 @@
+What is Arch
+============
+
+Arch is an intelligent Layer 7 gateway designed for generative AI apps, agents, and Co-pilots that work 
+with prompts. Written in `Rust <https://www.rust-lang.org/>`_, and engineered with purpose-built 
+:ref:`LLMs <llms_in_arch>`, Arch handles all the critical but undifferentiated tasks related to handling and 
+processing prompts, including rejecting `jailbreak <https://github.com/verazuo/jailbreak_llms>`_ attempts, 
+intelligently calling “backend” APIs to fulfill a user's request represented in a prompt, routing/disaster 
+recovery between upstream LLMs, and managing the observability of prompts and LLM interactions in a centralized way.
+
+The project was born out of the belief that:
+
+  *prompts are nuanced and opaque user requests that need the same capabilities as network requests 
+  in modern (cloud-native) applications, including secure handling, intelligent routing, robust observability, 
+  and integration with backend (API) systems for personalization.*
+
+In practice, achieving the above goal is incredibly difficult. Arch attempts to do so by providing the 
+following high level features:
+
+**Out of process archtiecture, built on Envoy:** Arch is takes a dependency on `Envoy <http://envoyproxy.io/>`_ 
+and is a self-contained process that is designed to run alongside your application servers. Arch uses 
+Envoy's HTTP connection management subsystem and HTTP L7 filtering capabilities to extend its' proxying 
+functionality. This gives Arch several advantages:
+
+* Arch builds on Envoy's success. Envoy is used at masssive sacle by the leading technology companies of 
+  our time including `AirBnB <https://www.airbnb.com>`_, `Dropbox <https://www.dropbox.com>`_, 
+  `Google <https://www.google.com>`_, `Reddit <https://www.reddit.com>`_, `Stripe <https://www.stripe.com>`_, 
+  etc. Its battle tested and scales linearly with usage and enables developers to focus on what really matters: 
+  application and business logic.
+
+* Arch works with any application language. A single Arch deployment can act as gateway for AI applications 
+  written in Python, Java, C++, Go, Php, etc. 
+
+* As anyone that has worked with a modern application architecture knows, deploying library upgrades 
+  can be incredibly painful. Arch can be deployed and upgraded quickly across your infrastructure 
+  transparently.
+
+**Engineered with LLMs:** Arch is engineered with specialized LLMs that are desgined for fast, cost-effective 
+and acurrate handling of prompts. These (sub-billion parameter) :ref:`LLMs <llms_in_arch>` are designed to be 
+best-in-class for critcal but undifferentiated prompt-related tasks like 1) applying guardrails for jailbreak 
+attempts 2) extracting critical information from prompts (like follow-on, clarifying questions, etc.) so that 
+you can improve the speed and accuracy of retrieval, and be able to convert prompts into API sematics when necessary 
+to build text-to-action (or agentic) applications. The focus for Arch is to make prompt processing indistiguishable 
+from the processing of a traditional HTTP request before forwarding it to an application server. With our focus on 
+speed and cost, Arch uses purpose-built LLMs and will continue to invest in those to lower latency (and cost) while 
+maintaining exceptional baseline performance with frontier LLMs like `OpenAI <https:openai.com>`_, and 
+`Anthropic <https:www.anthropic.com>`_.
+
+**Prompt Guardrails:** Arch helps you apply prompt guardrails in a centralized way for better governance 
+hygiene. With prompt guardrails you can prevent `jailbreak <https://github.com/verazuo/jailbreak_llms>`_ 
+attempts or toxicity present in user's prompts without having to write a single line of code. To learn more about 
+how to configure guardrails available in Arch, read :ref:`more <llms_in_arch>`. 
+
+**Function Calling:** Arch helps you personalize GenAI apps by enabling calls to application-specific (API) 
+operations using prompts. This involves any predefined functions or APIs you want to expose to users to 
+perform tasks, gather information, or manipulate data. With function calling, you have flexibilityto support 
+agentic workflows tailored to specific use cases - from updating insurance claims to creating ad campaigns. 
+Arch analyzes prompts, extracts critical information from prompts, engages in lightweight conversation with the 
+user to gather any missing parameters and makes API calls so that you can focus on writing business logic.
+
+**Best-In Class Monitoring & Traffic Management:** Arch offers several monitoring metrics that help you 
+understand three critical aspects of your application: latency, token usage, and error rates by LLM provider. 
+Latency measures the speed at which your application is responding to users, which includes metrics like time 
+to first token (TFT), time per output token (TOT) metrics, and the total latency as perceived by users. In 
+addition, Arch offers several capabilities for calls originating from your applications to upstream LLMs, 
+including a vendor-agnostic SDK to make LLM calls, smart retries on errors from upstream LLMs, and automatic 
+cutover to other LLMs configured for continuous availability and disaster recovery scenarios.
+
+**Front/edge proxy support:** There is substantial benefit in using the same software at the edge (observability, 
+prompt management, load balancing algorithms, etc.) as it is . Arch has a feature set that makes it well suited 
+as an edge proxy for most modern web application use cases. This includes TLS termination, HTTP/1.1 HTTP/2 and 
+HTTP/3 support and prompt-based routing.