Doc Update (#129)

* init update

* Update terminology.rst

* fix the branch to create an index.html, and fix pre-commit issues

* Doc update

* made several changes to the docs after Shuguang's revision

* fixing pre-commit issues

* fixed the reference file to the final prompt config file

* added google analytics

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Shuguang Chen 2024-10-06 16:54:34 -07:00 committed by GitHub
parent 2a7b95582c
commit 5c7567584d
49 changed files with 1185 additions and 609 deletions


@@ -0,0 +1,47 @@
version: "0.1-beta"
listen:
address: 127.0.0.1 | 0.0.0.0
port_value: 8080 # If you configure port 443, you'll need to update the listener with tls_certificates
system_prompt: |
You are a network assistant that just offers facts; not advice on manufacturers or purchasing decisions.
llm_providers:
- name: "OpenAI"
provider: "openai"
access_key: OPENAI_API_KEY
model: gpt-4o
stream: true
prompt_targets:
- name: reboot_devices
description: >
This prompt target handles user requests to reboot devices.
It ensures that when users request to reboot specific devices or device groups, the system processes the reboot commands accurately.
**Examples of user prompts:**
- "Please reboot device 12345."
- "Restart all devices in tenant group tenant-XYZ."
- "I need to reboot devices A, B, and C."
path: /agent/device_reboot
parameters:
- name: "device_ids"
type: list # Options: string | integer | float | list | dictionary | set
description: "A list of device identifiers (IDs) to reboot."
required: false
- name: "device_group"
type: string # Options: string | integer | float | list | dictionary | set
description: "The name of the device group to reboot."
required: false
# Arch creates a round-robin load balancing between different endpoints, managed via the cluster subsystem.
endpoints:
app_server:
# value could be ip address or a hostname with port
# this could also be a list of endpoints for load balancing for example endpoint: [ ip1:port, ip2:port ]
endpoint: "127.0.0.1:80"
# max time to wait for a connection to be established
connect_timeout: 0.005s
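
The ``prompt_targets`` section above is what powers function calling: Arch extracts ``device_ids`` and/or ``device_group`` from the user's prompt and forwards them as structured parameters to the app server at ``path``. A minimal sketch of what the receiving handler could look like (plain Python; only the parameter names come from the config, the handler logic itself is illustrative):

```python
def handle_device_reboot(params: dict) -> dict:
    """Illustrative handler for the reboot_devices prompt target.

    Per the config above, both parameters are optional (required: false),
    so the handler checks that at least one of them arrived.
    """
    device_ids = params.get("device_ids")
    device_group = params.get("device_group")

    if not device_ids and not device_group:
        return {"status": "error",
                "message": "Provide device_ids or device_group."}

    targets = device_ids if device_ids else [device_group]
    # Real logic would issue reboot commands here; this sketch just echoes the plan.
    return {"status": "ok", "rebooting": targets}
```

For example, a prompt like "Please reboot device 12345" would reach this handler as ``{"device_ids": [12345]}``.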


@@ -0,0 +1,90 @@
.. _intro_to_arch:

Intro to Arch
=============

Arch is an intelligent `(Layer 7) <https://www.cloudflare.com/learning/ddos/what-is-layer-7/>`_ gateway
designed for generative AI apps, AI agents, and Co-pilots that work with prompts. Engineered with purpose-built
large language models (LLMs), Arch handles all the critical but undifferentiated tasks related to the handling and
processing of prompts, including detecting and rejecting `jailbreak <https://github.com/verazuo/jailbreak_llms>`_
attempts, intelligently calling “backend” APIs to fulfill the user's request represented in a prompt, routing to
and offering disaster recovery between upstream LLMs, and managing the observability of prompts and LLM interactions
in a centralized way.

.. image:: /_static/img/arch-logo.png
   :width: 100%
   :align: center

**The project was born out of the belief that:**

*Prompts are nuanced and opaque user requests, which require the same capabilities as traditional HTTP requests
including secure handling, intelligent routing, robust observability, and integration with backend (API)
systems for personalization - all outside business logic.*
In practice, achieving the above goal is incredibly difficult. Arch attempts to do so by providing the
following high-level features:
_____________________________________________________________________________________________________________
**Out-of-process architecture, built on** `Envoy <http://envoyproxy.io/>`_: Arch takes a dependency on
Envoy and is a self-contained process designed to run alongside your application servers. Arch uses
Envoy's HTTP connection management subsystem, HTTP L7 filtering and telemetry capabilities to extend the
functionality exclusively for prompts and LLMs. This gives Arch several advantages:
* Arch builds on Envoy's proven success. Envoy is used at massive scale by the leading technology companies of
our time including `AirBnB <https://www.airbnb.com>`_, `Dropbox <https://www.dropbox.com>`_,
`Google <https://www.google.com>`_, `Reddit <https://www.reddit.com>`_, `Stripe <https://www.stripe.com>`_,
etc. It's battle-tested, scales linearly with usage, and lets developers focus on what really matters:
application features and business logic.
* Arch works with any application language. A single Arch deployment can act as a gateway for AI applications
written in Python, Java, C++, Go, PHP, etc.
* Arch can be deployed and upgraded quickly across your infrastructure transparently without the horrid pain
of deploying library upgrades in your applications.
**Engineered with Fast LLMs:** Arch is engineered with specialized (sub-billion parameter) LLMs that are designed for
fast, cost-effective, and accurate handling of prompts. These LLMs are built to be
best-in-class for critical prompt-related tasks like:
* **Function/API Calling:** Arch helps you easily personalize your applications by enabling calls to
application-specific (API) operations via user prompts. This involves any predefined functions or APIs
you want to expose to users to perform tasks, gather information, or manipulate data. With function calling,
you have flexibility to support "agentic" experiences tailored to specific use cases - from updating insurance
claims to creating ad campaigns - via prompts. Arch analyzes prompts, extracts critical information from
prompts, engages in lightweight conversation to gather any missing parameters and makes API calls so that you can
focus on writing business logic. For more details, read :ref:`prompt processing <arch_overview_prompt_handling>`.
* **Prompt Guardrails:** Arch helps you improve the safety of your application by applying prompt guardrails in
a centralized way for better governance hygiene. With prompt guardrails you can prevent `jailbreak <https://github.com/verazuo/jailbreak_llms>`_
attempts or toxicity present in user's prompts without having to write a single line of code. To learn more
about how to configure guardrails available in Arch, read :ref:`prompt processing <arch_overview_prompt_handling>`.
* **[Coming Soon] Intent-Markers:** Developers struggle to handle `follow-up <https://www.reddit.com/r/ChatGPTPromptGenius/comments/17dzmpy/how_to_use_rag_with_conversation_history_for/?>`_,
or `clarifying <https://www.reddit.com/r/LocalLLaMA/comments/18mqwg6/best_practice_for_rag_with_followup_chat/>`_
questions. Specifically, when users ask for modifications or additions to previous responses their AI applications
often generate entirely new responses instead of adjusting the previous ones. Arch offers intent-markers as a
feature so that developers know when the user has shifted away from the previous intent, allowing them to improve
retrieval, lower overall token cost, and dramatically improve the speed and accuracy of responses back
to users. For more details, read :ref:`intent markers <arch_rag_guide>`.
**Traffic Management:** Arch offers several capabilities for LLM calls originating from your applications, including smart
retries on errors from upstream LLMs, and automatic cutover to other LLMs configured in Arch for continuous availability
and disaster recovery scenarios. Arch extends Envoy's `cluster subsystem <https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/cluster_manager>`_
to manage upstream connections to LLMs so that you can build resilient AI applications.
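
Arch performs these retries and cutovers inside the gateway via Envoy's cluster subsystem, so your application code doesn't implement them; conceptually, though, the failover behavior resembles the following sketch (hypothetical helper, not Arch's API):

```python
import time

def call_with_failover(providers, prompt, retries_per_provider=2, backoff_s=0.1):
    """Try each configured LLM provider in order, retrying transient
    errors, then cut over to the next provider when one keeps failing."""
    last_error = None
    for provider in providers:
        for attempt in range(retries_per_provider):
            try:
                return provider(prompt)     # provider: any callable LLM client
            except ConnectionError as err:  # treated as a transient upstream error
                last_error = err
                time.sleep(backoff_s * (attempt + 1))  # simple linear backoff
        # retries exhausted: cut over to the next configured provider
    raise RuntimeError("all upstream LLM providers failed") from last_error
```

The gateway-side version additionally tracks upstream health so that a failing provider is ejected rather than retried on every request.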
**Front/edge Gateway:** There is substantial benefit in using the same software at the edge (observability,
traffic shaping algorithms, applying guardrails, etc.) as for outbound LLM inference use cases. Arch has the feature set
that makes it exceptionally well suited as an edge gateway for AI applications, including TLS termination, applying
guardrails early in the process, intelligent parameter gathering from prompts, and prompt-based routing to backend APIs.
**Best-in-Class Monitoring:** Arch offers several monitoring metrics that help you understand three critical aspects of
your application: latency, token usage, and error rates per upstream LLM provider. Latency measures the speed at which
your application responds to users, including metrics like time to first token (TFT), time per output token (TOT),
and the total latency as perceived by users.
**End-to-End Tracing:** Arch propagates trace context using the W3C Trace Context standard, specifically through the
``traceparent`` header. This allows each component in the system to record its part of the request flow, enabling **end-to-end tracing**
across the entire application. By using OpenTelemetry, Arch ensures that developers can capture this trace data consistently and
in a format compatible with various observability tools. For more details, read :ref:`tracing <arch_overview_tracing>`.
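
For intuition, here is what propagating ``traceparent`` means in practice: per the W3C Trace Context format (``version-traceid-spanid-flags``), each hop preserves the trace-id and mints a fresh span-id for its own span. The sketch below is illustrative, not Arch internals:

```python
import re
import secrets

# W3C traceparent: 2-hex version, 32-hex trace-id, 16-hex span-id, 2-hex flags
TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def child_traceparent(header: str) -> str:
    """Continue a W3C trace: keep the incoming trace-id, mint a new span-id."""
    m = TRACEPARENT.match(header)
    if not m:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, _parent_span_id, flags = m.groups()
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex chars
    return f"{version}-{trace_id}-{new_span_id}-{flags}"
```

Because every component applies the same rule, an observability backend can stitch the spans sharing one trace-id into a single end-to-end view of the request.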


@@ -0,0 +1,91 @@
Overview
============
Welcome to Arch, the intelligent prompt gateway designed to help developers build **fast**, **secure**, and **personalized** generative AI apps at ANY scale.
In this documentation, you will learn how to quickly set up Arch to trigger API calls via prompts, apply prompt guardrails without writing any application-level logic,
simplify the interaction with upstream LLMs, and improve observability all while simplifying your application development process.
Get Started
-----------
This section introduces you to Arch and helps you get set up quickly:
.. grid:: 3
.. grid-item-card:: Overview
:link: overview.html
Overview of Arch and Doc navigation
.. grid-item-card:: Intro to Arch
:link: intro_to_arch.html
Explore Arch's features and developer workflow
.. grid-item-card:: Quickstart
:link: quickstart.html
Learn how to quickly set up and integrate Arch
Concepts
--------
Deep dive into essential ideas and mechanisms behind Arch:
.. grid:: 3
.. grid-item-card:: Tech Overview
:link: ../Concepts/tech_overview/tech_overview.html
Learn about the technology stack
.. grid-item-card:: LLM Provider
:link: ../Concepts/llm_provider.html
Explore Arch's LLM integration options
.. grid-item-card:: Targets
:link: ../Concepts/prompt_target.html
Understand how Arch handles prompts
Guides
------
Step-by-step tutorials for practical Arch use cases and scenarios:
.. grid:: 3
.. grid-item-card:: Prompt Guard
:link: ../guides/tech_overview/tech_overview.html
Instructions on securing and validating prompts
.. grid-item-card:: Function Calling
:link: ../guides/function_calling.html
A guide to effective function calling
.. grid-item-card:: Observability
:link: ../guides/prompt_target.html
Learn to monitor and troubleshoot Arch
Build with Arch
---------------
For developers extending and customizing Arch for specialized needs:
.. grid:: 2
.. grid-item-card:: Agentic Workflow
:link: ../build_with_arch/agent.html
Discover how to create and manage custom agents within Arch
.. grid-item-card:: RAG Application
:link: ../build_with_arch/rag.html
Integrate RAG for knowledge-driven responses


@@ -0,0 +1,84 @@
.. _quickstart:

Quickstart
==========

Follow this guide to learn how to quickly set up Arch and integrate it into your generative AI applications.
Prerequisites
----------------------------
Before you begin, ensure you have the following:
.. vale Vale.Spelling = NO
- ``Docker`` & ``Python`` installed on your system
- ``API Keys`` for LLM providers (if using external LLMs)
The fastest way to get started using Arch is to use `katanemo/arch <https://hub.docker.com/r/katanemo/arch>`_ pre-built binaries.
You can also build it from source.
Step 1: Install Arch
----------------------------
Arch's CLI allows you to manage and interact with the Arch gateway efficiently. To install the CLI, simply
run the following command:
.. code-block:: console

   $ pip install archgw

This will install the ``archgw`` command-line tool globally on your system.
.. tip::

   We recommend that developers create a new Python virtual environment to isolate dependencies before installing Arch.
   This ensures that ``archgw`` and its dependencies do not interfere with other packages on your system.
   To create and activate a virtual environment, you can run the following commands:

   .. code-block:: console

      $ python -m venv venv
      $ source venv/bin/activate  # On Windows, use: venv\Scripts\activate
      $ pip install archgw
Step 2: Configure Arch
----------------------

Arch operates based on a configuration file where you can define LLM providers, prompt targets, guardrails, and more.
Below is an example configuration to get you started, including:
.. vale Vale.Spelling = NO
- ``endpoints``: Specifies where Arch listens for incoming prompts.
- ``system_prompts``: Defines predefined prompts to set the context for interactions.
- ``llm_providers``: Lists the LLM providers Arch can route prompts to.
- ``prompt_guards``: Sets up rules to detect and reject undesirable prompts.
- ``prompt_targets``: Defines endpoints that handle specific types of prompts.
- ``error_target``: Specifies where to route errors for handling.
.. literalinclude:: includes/quickstart.yaml
   :language: yaml
Step 3: Start Arch Gateway
--------------------------
.. code-block:: console

   $ archgw up [path_to_config]
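
With the gateway running, your application can send prompts to the configured listener using only the Python standard library. In this sketch, ``127.0.0.1:8080`` mirrors the example config, while the ``/v1/chat/completions`` path is an assumption about an OpenAI-compatible route:

```python
import json
import urllib.request

# Assumed OpenAI-compatible route on the listener from the example config.
ARCH_URL = "http://127.0.0.1:8080/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "gpt-4o") -> bytes:
    """Build an OpenAI-style chat completion payload for the Arch listener."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(payload).encode("utf-8")

def send_prompt(prompt: str) -> dict:
    """POST a prompt through the gateway and return the parsed response."""
    req = urllib.request.Request(
        ARCH_URL,
        data=build_chat_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

For example, ``send_prompt("Please reboot device 12345.")`` would let Arch route the prompt to the ``reboot_devices`` target defined in your configuration.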
Next Steps
-------------------
Congratulations! You've successfully set up Arch and made your first prompt-based request. To further enhance your GenAI applications, explore the following resources:
- Full Documentation: Comprehensive guides and references.
- `GitHub Repository <https://github.com/katanemo/arch>`_: Access the source code, contribute, and track updates.
- `Support <https://github.com/katanemo/arch#contact>`_: Get help and connect with the Arch community.
With Arch, building scalable, fast, and personalized GenAI applications has never been easier. Dive deeper into Arch's capabilities and start creating innovative AI-driven experiences today!