.. _llms_in_arch:

LLMs
====

Arch uses purpose-built, industry-leading LLMs to handle the crufty and undifferentiated work of
accepting, handling, and processing prompts. The following sections describe the core models that
are built into Arch.

Arch-Guard-v1
-------------

LLM-powered applications are susceptible to prompt attacks: prompts intentionally designed to
subvert the developer's intended behavior of the LLM. Arch-Guard-v1 is a classifier model trained on a large
corpus of attacks, capable of detecting explicitly malicious prompts (and toxicity).

The model is useful as a starting point for identifying and guardrailing against the riskiest realistic
inputs to LLM-powered applications. Our goal in embedding Arch-Guard in the Arch gateway is to enable developers
to focus on their business logic and factor security and safety out of application logic. With Arch-Guard-v1,
developers can significantly reduce prompt attack risk while maintaining control over the user experience.

Below are our test results comparing the strength of our model with Prompt-Guard from `Meta Llama <https://huggingface.co/meta-llama/Prompt-Guard-86M>`_.

.. list-table::
    :header-rows: 1
    :widths: 15 15 10 15 15

    * - Dataset
      - Jailbreak (1 = yes, 0 = no)
      - Samples
      - Prompt-Guard Accuracy
      - Arch-Guard Accuracy
    * - casual_conversation
      - 0
      - 3725
      - 1.00
      - 1.00
    * - commonqa
      - 0
      - 9741
      - 1.00
      - 1.00
    * - financeqa
      - 0
      - 1585
      - 1.00
      - 1.00
    * - instruction
      - 0
      - 5000
      - 1.00
      - 1.00
    * - jailbreak_behavior_benign
      - 0
      - 100
      - 0.10
      - 0.20
    * - jailbreak_behavior_harmful
      - 1
      - 100
      - 0.30
      - 0.52
    * - jailbreak_judge
      - 1
      - 300
      - 0.33
      - 0.49
    * - jailbreak_prompts
      - 1
      - 79
      - 0.99
      - 1.00
    * - jailbreak_tweet
      - 1
      - 1282
      - 0.16
      - 0.35
    * - jailbreak_v
      - 1
      - 20000
      - 0.90
      - 0.93
    * - jailbreak_vigil
      - 1
      - 104
      - 1.00
      - 1.00
    * - mental_health
      - 0
      - 3512
      - 1.00
      - 1.00
    * - telecom
      - 0
      - 4000
      - 1.00
      - 1.00
    * - truthqa
      - 0
      - 817
      - 1.00
      - 0.98
    * - weather
      - 0
      - 3121
      - 1.00
      - 1.00


.. list-table::
    :header-rows: 1
    :widths: 15 20

    * - Statistics
      - Overall performance
    * - Overall accuracy
      - 0.93568 (Prompt-Guard), 0.95267 (Arch-Guard)
    * - True positive rate (TPR)
      - 0.8468 (Prompt-Guard), 0.8887 (Arch-Guard)
    * - True negative rate (TNR)
      - 0.9972 (Prompt-Guard), 0.9970 (Arch-Guard)
    * - False positive rate (FPR)
      - 0.0028 (Prompt-Guard), 0.0030 (Arch-Guard)
    * - False negative rate (FNR)
      - 0.1532 (Prompt-Guard), 0.1113 (Arch-Guard)

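
As a rough consistency check, the overall accuracy figures can be approximated as the sample-weighted mean of the
per-dataset accuracies. The sketch below assumes that this is how the overall figure was aggregated (the source
does not state it), and it works from the rounded per-dataset values, so agreement is only expected to about three
decimal places. The helper name ``weighted_accuracy`` is illustrative.

.. code-block:: python

    # (samples, Prompt-Guard accuracy, Arch-Guard accuracy), copied from the per-dataset table.
    RESULTS = {
        "casual_conversation": (3725, 1.00, 1.00),
        "commonqa": (9741, 1.00, 1.00),
        "financeqa": (1585, 1.00, 1.00),
        "instruction": (5000, 1.00, 1.00),
        "jailbreak_behavior_benign": (100, 0.10, 0.20),
        "jailbreak_behavior_harmful": (100, 0.30, 0.52),
        "jailbreak_judge": (300, 0.33, 0.49),
        "jailbreak_prompts": (79, 0.99, 1.00),
        "jailbreak_tweet": (1282, 0.16, 0.35),
        "jailbreak_v": (20000, 0.90, 0.93),
        "jailbreak_vigil": (104, 1.00, 1.00),
        "mental_health": (3512, 1.00, 1.00),
        "telecom": (4000, 1.00, 1.00),
        "truthqa": (817, 1.00, 0.98),
        "weather": (3121, 1.00, 1.00),
    }


    def weighted_accuracy(results, column):
        """Sample-weighted mean accuracy; column 1 is Prompt-Guard, column 2 is Arch-Guard."""
        total = sum(row[0] for row in results.values())
        correct = sum(row[0] * row[column] for row in results.values())
        return correct / total


    prompt_guard = weighted_accuracy(RESULTS, 1)
    arch_guard = weighted_accuracy(RESULTS, 2)
    print(f"Prompt-Guard: {prompt_guard:.4f}")
    print(f"Arch-Guard:   {arch_guard:.4f}")

Under that assumption the script prints roughly 0.9357 and 0.9527, in line with the reported 0.93568 and 0.95267.
Because ``jailbreak_v`` contributes 20000 of the 53466 samples, it dominates the aggregate, so the smaller
jailbreak datasets are better compared row by row.
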

.. list-table::
    :header-rows: 1
    :widths: 15 20

    * - Metrics
      - Values
    * - AUC
      - 0.857 (Prompt-Guard), 0.880 (Arch-Guard)
    * - Precision
      - 0.715 (Prompt-Guard), 0.761 (Arch-Guard)
    * - Recall
      - 0.999 (Prompt-Guard), 0.999 (Arch-Guard)


Arch-FC
-------

Arch-FC is a lean, powerful, and cost-effective agentic model designed for function calling scenarios.
You can run Arch-FC locally, or use the cloud-hosted version for as little as $0.05 per million tokens (100x cheaper
than GPT-4o), with a p50 latency of 200ms (5x faster than GPT-4o), while meeting frontier model performance.


.. note::
    Function calling helps you personalize the GenAI experience by calling application-specific operations via
    prompts. This includes any predefined functions or APIs you want to expose to perform tasks, gather
    information, or manipulate data via prompts.


You can get started with function calling simply by configuring a prompt target with a name, a description,
and the set of parameters needed by a specific backend function or a hosted API. The name and description help
Arch-FC match a user prompt to a function or API that can process it.
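
To make those three ingredients concrete, here is a minimal Python sketch of the information a prompt target
carries. It is illustrative only: the ``PromptTarget`` and ``Parameter`` classes, their field names, and the
``get_weather`` example are hypothetical, and they are not Arch's configuration schema or API.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class Parameter:
        """One input that the backend function needs (hypothetical shape)."""
        name: str
        description: str
        required: bool = True


    @dataclass
    class PromptTarget:
        """Name, description, and parameters for one backend function (hypothetical shape)."""
        name: str
        description: str
        parameters: List[Parameter] = field(default_factory=list)

        def required_names(self) -> List[str]:
            # Names of the parameters that must be present before the call is made.
            return [p.name for p in self.parameters if p.required]


    weather_target = PromptTarget(
        name="get_weather",
        description="Get the weather forecast for a city.",
        parameters=[
            Parameter("city", "City to look up."),
            Parameter("days", "Number of forecast days.", required=False),
        ],
    )
    print(weather_target.required_names())

A clear, specific description matters more than the exact structure, since the name and description are what
the model uses to match a prompt to a target.
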

By using Arch-FC, Arch enables you to easily build agentic workflows tailored to domain-specific use cases,
from updating insurance claims to creating ad campaigns. Arch-FC analyzes prompts, extracts critical information
from them, and engages in lightweight conversations with the user to gather any missing parameters needed before
handing control back to Arch to make the API call to your hosted backend. Arch-FC handles the muck of information
extraction so that you can focus on the business logic of your application.
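
The gather-then-hand-off flow can be sketched as follows. This is a simplified illustration, not Arch-FC's
implementation: the ``next_step`` function, the dictionary shapes, and the insurance claim fields are
assumptions, and in practice the parameter values are extracted by the model rather than supplied by hand.

.. code-block:: python

    def next_step(required, extracted):
        """Decide whether to ask the user for more input or hand off the call.

        required:  names of the parameters the backend function needs (hypothetical).
        extracted: parameter values pulled from the conversation so far.
        """
        missing = [name for name in required if extracted.get(name) is None]
        if missing:
            # Ask a follow-up question for the first missing parameter.
            return {"action": "ask_user", "parameter": missing[0]}
        # Everything is present: hand control back so the API call can be made.
        return {"action": "call_api", "arguments": {n: extracted[n] for n in required}}


    required = ["claim_id", "status"]
    print(next_step(required, {"claim_id": "C-1001"}))
    print(next_step(required, {"claim_id": "C-1001", "status": "approved"}))

The first call reports that ``status`` is still missing, so the user is asked for it. The second call has every
required value and returns the arguments for the backend call.
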