mirror of
https://github.com/katanemo/plano.git
synced 2026-04-28 18:36:34 +02:00
Salmanap/docs v1 push (#92)
* Updated model serving, updated the config references and architecture docs, and added the llm_provider section
* Several documentation changes to improve sections like life_of_a_request and the model serving subsystem

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
This commit is contained in:
parent
8a4e11077c
commit
7168b14ed3
19 changed files with 375 additions and 119 deletions
.. _arch_overview_prompt_handling:

Prompts
=======

Arch's primary design goal is to securely accept, process, and handle prompts. To do that effectively,
Arch relies on Envoy's HTTP `connection management <https://www.envoyproxy.io/docs/envoy/v1.31.2/intro/arch_overview/http/http_connection_management>`_
subsystem and its **prompt handler** subsystem, engineered with purpose-built :ref:`LLMs <llms_in_arch>`, to
implement critical functionality on behalf of developers so that you can stay focused on business logic.

.. Note::
   Arch's **prompt handler** subsystem interacts with the **model** subsystem through Envoy's cluster manager
   to ensure a robust, resilient, and fault-tolerant experience in managing incoming prompts. Read more
   about the :ref:`model subsystem <arch_model_serving>` and how the LLMs are hosted in Arch.

Messages
--------

Arch accepts messages directly from the body of the HTTP request, in a format that follows the `Hugging Face Messages API <https://huggingface.co/docs/text-generation-inference/en/messages_api>`_.
This design allows developers to pass a list of messages, where each message is represented as a dictionary
containing two key-value pairs:

- **Role**: Defines the role of the message sender, such as "user" or "assistant".
- **Content**: Contains the actual text of the message.

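The message shape above can be sketched as a small payload builder. This is an illustrative snippet, not Arch API code; the example message texts are made up, and the field names simply follow the Hugging Face Messages API format described above.

```python
import json

# Each message is a dictionary with exactly the two keys described above:
# "role" (who is speaking) and "content" (the text of the message).
messages = [
    {"role": "user", "content": "What can you do?"},
    {"role": "assistant", "content": "I can help route and guard your prompts."},
    {"role": "user", "content": "Great, summarize my last request."},
]

# The list travels as the "messages" field in the body of the HTTP
# request sent to Arch.
body = json.dumps({"messages": messages})
print(body[:60])
```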
Prompt Guardrails
-----------------

Arch is engineered with :ref:`Arch-Guard <llms_in_arch>`, an industry-leading safety layer powered by a
compact and high-performing LLM that monitors incoming prompts to detect and reject jailbreak attempts,
ensuring that unauthorized or harmful behaviors are intercepted early in the process.

To add jailbreak guardrails, see the example below:

.. literalinclude:: /_config/getting-started.yml
    :language: yaml
    :caption: :download:`arch-getting-started.yml </_config/getting-started.yml>`
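For orientation, the guardrails portion of such a config typically looks like the sketch below. This is a hedged illustration rather than the verbatim contents of getting-started.yml; the key names and the message text are assumptions, so treat the included file above as the source of truth.

```yaml
prompt_guards:
  input_guards:
    jailbreak:
      on_exception:
        message: "Sorry, I can't help with that request."
```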

.. Note::
   As a roadmap item, Arch will expose the ability for developers to define custom guardrails via Arch-Guard-v2,
   adding support for additional developer-defined safety checks and for hazardous categories such as violent
   crimes, privacy, and hate. To offer feedback on our roadmap, please visit our `GitHub page <https://github.com/orgs/katanemo/projects/1>`_.

Prompt Targets
--------------

Example: Using OpenAI Client with Arch as an Egress Gateway
-----------------------------------------------------------

    print("OpenAI Response:", response.choices[0].text.strip())

In these examples:

The ArchClient is used to send traffic directly through the Arch egress proxy to the LLM of your choice, such as OpenAI.
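The egress flow described above can be sketched with the standard library alone. Everything here is illustrative: the listener address ``127.0.0.1:12000``, the ``/v1/chat/completions`` path, and the model name are assumptions rather than values taken from this page, so substitute the address your Arch egress gateway actually listens on.

```python
import json
import urllib.request

# Build (but do not yet send) a chat completion request aimed at a
# hypothetical local Arch egress listener. Arch would forward this
# call upstream to the configured LLM provider, e.g. OpenAI.
payload = {
    "model": "gpt-4o-mini",  # assumed name; set per your llm_providers config
    "messages": [{"role": "user", "content": "Say hello in one word."}],
}
req = urllib.request.Request(
    url="http://127.0.0.1:12000/v1/chat/completions",  # assumed address
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would perform the call; here we only
# inspect the request we built.
print(req.get_method(), req.full_url)
```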