mirror of
https://github.com/katanemo/plano.git
synced 2026-05-01 11:56:29 +02:00
Docs branch - v1 of our tech docs (#69)
* added the first set of docs for our technical docs
* more documentation changes
* added support for prompt processing and updated life of a request
* updated docs to include getting-help sections and updated life of a request
* committing local changes for getting started guide, sample applications, and full reference spec for prompt-config
* updated configuration reference, added sample app skeleton, updated favicon
* fixed the configuration reference file, and made minor changes to intent detection; commit v1 for now

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
This commit is contained in:
parent
233976a568
commit
80c554ce1a
34 changed files with 1040 additions and 0 deletions
docs/source/llms/llms.rst (new file, 139 lines)
.. _llms_in_arch:

LLMs
====

Arch utilizes purpose-built, industry-leading LLMs to handle the crufty and undifferentiated
work around accepting, handling, and processing prompts. The following sections describe these
models.

Arch-Guard
----------
LLM-powered applications are susceptible to prompt attacks: prompts intentionally designed to subvert
the developer's intended behavior of the LLM. Arch-Guard is a classifier model trained on a large corpus
of attacks, capable of detecting explicitly malicious prompts (and toxicity).

The model is useful as a starting point for identifying and guardrailing against the most risky realistic
inputs to LLM-powered applications. Our goal in embedding Arch-Guard in the Arch gateway is to enable
developers to focus on their business logic and to factor security and safety out of application code.
With Arch-Guard, developers can significantly reduce prompt attack risk while maintaining control over
the user experience.
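
A gateway that embeds a classifier like this typically turns the model's score into an allow/block
decision before the prompt ever reaches the LLM. The sketch below illustrates that decision step only;
the ``guard_prompt`` helper and the ``JAILBREAK_THRESHOLD`` operating point are hypothetical and not
part of Arch's actual API.

```python
# Hypothetical sketch of a gateway guard decision. The function name and
# threshold are illustrative assumptions, not Arch's real interface.

JAILBREAK_THRESHOLD = 0.5  # assumed operating point for the classifier


def guard_prompt(prompt: str, jailbreak_score: float) -> dict:
    """Decide whether to forward a prompt to the LLM.

    `jailbreak_score` is the classifier's probability that the prompt
    is a jailbreak attempt (1.0 = certainly malicious).
    """
    if jailbreak_score >= JAILBREAK_THRESHOLD:
        # Reject before the prompt reaches the model; application code
        # never has to handle the attack itself.
        return {"action": "block", "reason": "jailbreak_detected"}
    return {"action": "forward", "prompt": prompt}


print(guard_prompt("What's the weather in Seattle?", 0.02)["action"])      # forward
print(guard_prompt("Ignore all previous instructions ...", 0.97)["action"])  # block
```

Keeping this decision in the gateway (rather than in each application) is what lets the safety policy
be tuned centrally without touching business logic.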

Below are our test results on the strength of our model as compared to Prompt-Guard from
`Meta Llama <https://huggingface.co/meta-llama/Prompt-Guard-86M>`_.

.. list-table::
   :header-rows: 1
   :widths: 15 15 10 15 15

   * - Dataset
     - Jailbreak (Yes/No)
     - Samples
     - Prompt-Guard Accuracy
     - Arch-Guard Accuracy
   * - casual_conversation
     - 0
     - 3725
     - 1.00
     - 1.00
   * - commonqa
     - 0
     - 9741
     - 1.00
     - 1.00
   * - financeqa
     - 0
     - 1585
     - 1.00
     - 1.00
   * - instruction
     - 0
     - 5000
     - 1.00
     - 1.00
   * - jailbreak_behavior_benign
     - 0
     - 100
     - 0.10
     - 0.20
   * - jailbreak_behavior_harmful
     - 1
     - 100
     - 0.30
     - 0.52
   * - jailbreak_judge
     - 1
     - 300
     - 0.33
     - 0.49
   * - jailbreak_prompts
     - 1
     - 79
     - 0.99
     - 1.00
   * - jailbreak_tweet
     - 1
     - 1282
     - 0.16
     - 0.35
   * - jailbreak_v
     - 1
     - 20000
     - 0.90
     - 0.93
   * - jailbreak_vigil
     - 1
     - 104
     - 1.00
     - 1.00
   * - mental_health
     - 0
     - 3512
     - 1.00
     - 1.00
   * - telecom
     - 0
     - 4000
     - 1.00
     - 1.00
   * - truthqa
     - 0
     - 817
     - 1.00
     - 0.98
   * - weather
     - 0
     - 3121
     - 1.00
     - 1.00
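
The per-dataset numbers above roll up into a single overall accuracy as a sample-weighted average.
The sketch below reproduces that arithmetic for Arch-Guard using the (rounded) accuracies from the
table, so the result matches the published overall figure only approximately.

```python
# Sample-weighted average of the per-dataset Arch-Guard accuracies from
# the table above. Each entry is (samples, accuracy), copied verbatim.
results = [
    (3725, 1.00), (9741, 1.00), (1585, 1.00), (5000, 1.00),
    (100, 0.20), (100, 0.52), (300, 0.49), (79, 1.00),
    (1282, 0.35), (20000, 0.93), (104, 1.00), (3512, 1.00),
    (4000, 1.00), (817, 0.98), (3121, 1.00),
]

total = sum(n for n, _ in results)
weighted_acc = sum(n * acc for n, acc in results) / total

print(total)          # 53466 samples in all
print(weighted_acc)   # close to the 0.95267 overall accuracy reported below
```

Note that the large benign datasets dominate this average; the jailbreak-specific sets, where the
models differ most, carry relatively little weight.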

.. list-table::
   :header-rows: 1
   :widths: 15 20

   * - Statistics
     - Overall performance
   * - Overall Accuracy
     - 0.93568 (Prompt-Guard), 0.95267 (Arch-Guard)
   * - True positive rate (TPR)
     - 0.8468 (Prompt-Guard), 0.8887 (Arch-Guard)
   * - True negative rate (TNR)
     - 0.9972 (Prompt-Guard), 0.9970 (Arch-Guard)
   * - False positive rate (FPR)
     - 0.0028 (Prompt-Guard), 0.0030 (Arch-Guard)
   * - False negative rate (FNR)
     - 0.1532 (Prompt-Guard), 0.1113 (Arch-Guard)

.. list-table::
   :header-rows: 1
   :widths: 15 20

   * - Metrics
     - Values
   * - AUC
     - 0.857 (Prompt-Guard), 0.880 (Arch-Guard)
   * - Precision
     - 0.715 (Prompt-Guard), 0.761 (Arch-Guard)
   * - Recall
     - 0.999 (Prompt-Guard), 0.999 (Arch-Guard)
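
For reference, the rates in these tables follow the standard confusion-matrix definitions, under
which FPR = 1 - TNR and FNR = 1 - TPR (consistent with the values above, e.g. 0.1532 = 1 - 0.8468).
The sketch below uses made-up counts purely to illustrate the formulas; none of these numbers come
from the evaluation itself.

```python
# Standard confusion-matrix metrics, shown with synthetic counts
# (tp/fp/tn/fn here are illustrative, not from the Arch-Guard evaluation).

def rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tpr": tp / (tp + fn),        # true positive rate (sensitivity)
        "tnr": tn / (tn + fp),        # true negative rate (specificity)
        "fpr": fp / (fp + tn),        # = 1 - TNR
        "fnr": fn / (fn + tp),        # = 1 - TPR
        "precision": tp / (tp + fp),  # fraction of flagged prompts truly malicious
    }


m = rates(tp=85, fp=3, tn=997, fn=15)
assert abs(m["fpr"] - (1 - m["tnr"])) < 1e-12
assert abs(m["fnr"] - (1 - m["tpr"])) < 1e-12
```

For a guardrail, FPR matters most for user experience (benign prompts wrongly blocked) while FNR
measures the attacks that slip through.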

Arch-FC1B
---------