Salmanap/docs v1 push (#92)

* Updated model serving and config references, revised the architecture docs, and added the llm_provider section

* Several documentation changes to improve sections such as life_of_a_request and the model serving subsystem

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
This commit is contained in:
Salman Paracha 2024-09-27 15:37:49 -07:00 committed by GitHub
parent 8a4e11077c
commit 7168b14ed3
19 changed files with 375 additions and 119 deletions


@ -1,23 +1,37 @@
.. _arch_overview_prompt_handling:
Prompts
-------
Arch's primary design goal is to securely accept, process, and handle prompts. To do that effectively,
Arch relies on Envoy's HTTP `connection management <https://www.envoyproxy.io/docs/envoy/v1.31.2/intro/arch_overview/http/http_connection_management>`_
subsystem and its **prompt handler** subsystem engineered with purpose-built :ref:`LLMs <llms_in_arch>` to
implement critical functionality on behalf of developers so that you can stay focused on business logic.
.. Note::
   Arch's **prompt handler** subsystem interacts with the **model** subsystem through Envoy's cluster manager
   to ensure a robust, resilient, and fault-tolerant experience in managing incoming prompts. Read more
   about the :ref:`model subsystem <arch_model_serving>` and how LLMs are hosted in Arch.
Messages
--------
Arch accepts messages directly from the body of the HTTP request in a format that follows the `Hugging Face Messages API <https://huggingface.co/docs/text-generation-inference/en/messages_api>`_.
This design allows developers to pass a list of messages, where each message is represented as a dictionary
containing two key-value pairs:
- **Role**: Defines the role of the message sender, such as "user" or "assistant".
- **Content**: Contains the actual text of the message.
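A request body in this format can be sketched as follows (the model name and message text are illustrative, not taken from Arch's documentation):

```python
import json

# Messages in the Hugging Face Messages API format: a list of
# dictionaries, each with a "role" and a "content" key.
messages = [
    {"role": "user", "content": "What is the weather in Seattle?"},
    {"role": "assistant", "content": "Let me check that for you."},
    {"role": "user", "content": "Thanks!"},
]

# The HTTP request body Arch would receive (model name is illustrative).
body = json.dumps({"model": "my-model", "messages": messages})
print(body)
```

Each entry carries exactly the two keys described above, so downstream handlers can rely on a uniform shape regardless of who authored the turn.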
Prompt Guardrails
-----------------
Arch is engineered with :ref:`Arch-Guard <llms_in_arch>`, an industry-leading safety layer powered by a
compact, high-performing LLM that monitors incoming prompts to detect and reject jailbreak attempts,
ensuring that unauthorized or harmful behaviors are intercepted early in the process.
To add jailbreak guardrails, see the example below:
.. literalinclude:: /_config/getting-started.yml
:language: yaml
@ -26,9 +40,9 @@ To add prompt guardrails, see example below:
:caption: :download:`arch-getting-started.yml </_config/getting-started.yml>`
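For orientation, such a configuration might look like the sketch below. The field names (``prompt_guards``, ``input_guards``, ``jailbreak``) and the message text are illustrative assumptions, not the authoritative schema; the downloadable getting-started.yml above is the source of truth.

```yaml
# Illustrative sketch only -- field names are assumptions,
# not the shipped schema. See getting-started.yml above.
prompt_guards:
  input_guards:
    jailbreak:
      on_exception:
        message: "Sorry, I can't help with that request."
```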
.. Note::
   As a roadmap item, Arch will expose the ability for developers to define custom guardrails via Arch-Guard-v2,
   adding support for developer-defined safety checks and hazardous categories such as violent crimes,
   privacy violations, and hate speech. To offer feedback on our roadmap, please visit our
   `GitHub page <https://github.com/orgs/katanemo/projects/1>`_.
Prompt Targets
@ -132,7 +146,6 @@ Example: Using OpenAI Client with Arch as an Egress Gateway
print("OpenAI Response:", response.choices[0].text.strip())
In these examples:

- The ArchClient is used to send traffic directly through the Arch egress proxy to the LLM of your choice, such as OpenAI.
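Because the egress proxy speaks an OpenAI-compatible API, the request itself is an ordinary chat completion call aimed at Arch's listener. A minimal stdlib-only sketch is below; the listener address, port, and model name are illustrative assumptions (use the values from your own Arch configuration), and the request is constructed but deliberately not sent:

```python
import json
import urllib.request

# Illustrative egress listener address -- take the real port from
# your Arch configuration.
ARCH_EGRESS = "http://127.0.0.1:12000/v1"

payload = {
    "model": "gpt-4o-mini",  # illustrative; forwarded to the upstream provider
    "messages": [{"role": "user", "content": "Hello!"}],
}

# Build (but do not send) an OpenAI-compatible chat completion
# request routed through the Arch egress proxy.
req = urllib.request.Request(
    url=f"{ARCH_EGRESS}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.method, req.full_url)
```

Sending `req` with `urllib.request.urlopen` (or pointing an OpenAI SDK client's `base_url` at the same listener) would return the provider's response through Arch.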