Doc Update (#129)

* init update

* Update terminology.rst

* fix the branch to create an index.html, and fix pre-commit issues

* Doc update

* made several changes to the docs after Shuguang's revision

* fixing pre-commit issues

* fixed the reference file to the final prompt config file

* added google analytics

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Shuguang Chen 2024-10-06 16:54:34 -07:00 committed by GitHub
parent 2a7b95582c
commit 5c7567584d
GPG key ID: B5690EEEBB952194
49 changed files with 1185 additions and 609 deletions

@ -0,0 +1,13 @@
Configuration Reference
============================
The following is a complete reference of the ``prompt-config.yml`` file that controls the behavior of a single instance of
the Arch gateway. We've kept things simple (less than 80 lines) and held off on exposing additional functionality (e.g.,
supporting push observability stats, managing prompt endpoints as virtual clusters, exposing more load-balancing
options). We believe that simple things should be simple, so we offer good defaults for developers, letting
them spend more of their time building features unique to their AI experience.

.. literalinclude:: includes/arch_config_full_reference.yaml
    :language: yaml
    :linenos:
    :caption: :download:`Arch Configuration - Full Reference <includes/arch_config_full_reference.yaml>`
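
One relationship in this file that is easy to get wrong is the link between a prompt target and the endpoint it names. The following is a minimal Python sketch, not part of Arch, that checks this cross-reference on a configuration already parsed into a dictionary (for example by a YAML library). The function name ``find_undefined_endpoints`` and the sample dictionary are hypothetical; only the key names (``endpoints``, ``prompt_targets``, ``endpoint.name``) come from the reference file.

```python
from typing import Any


def find_undefined_endpoints(config: dict[str, Any]) -> list[str]:
    """Return names of prompt targets whose endpoint is not declared under `endpoints`."""
    declared = set(config.get("endpoints", {}))
    missing = []
    for target in config.get("prompt_targets", []):
        endpoint_name = target.get("endpoint", {}).get("name")
        if endpoint_name not in declared:
            missing.append(target["name"])
    return missing


# Hand-written sample that mirrors a small part of the reference file.
# "orphan_target" and "not_declared" are made up for illustration.
config = {
    "endpoints": {"app_server": {"endpoint": "127.0.0.1:80"}},
    "prompt_targets": [
        {
            "name": "reboot_network_device",
            "endpoint": {"name": "app_server", "path": "/agent/action"},
        },
        {
            "name": "orphan_target",
            "endpoint": {"name": "not_declared", "path": "/x"},
        },
    ],
}

print(find_undefined_endpoints(config))  # ['orphan_target']
```

A check like this could run in CI before a configuration is deployed; whether Arch performs a similar validation itself is not stated here.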

@ -0,0 +1,58 @@
.. _error_target:

Error Targets
=============

**Error targets** are endpoints designed to capture and manage specific issues or exceptions that occur during Arch's function calls or system execution.
These endpoints receive errors forwarded from Arch when issues arise, such as improper function/API calls, guardrail violations, or other processing errors.
The errors are communicated to the application via headers like ``X-Arch-[ERROR-TYPE]``, enabling you to respond appropriately and handle errors gracefully.

Key Concepts
------------

**Error Type**: Categorizes the nature of the error, such as ``ValidationError`` or ``RuntimeError``. Error types help identify what
kind of issue occurred and provide context for troubleshooting.

**Error Message**: A clear, human-readable message describing the error. It should provide enough detail to inform users or developers of
the root cause or the required action.

**Target Prompt**: The specific prompt or operation where the error occurred. Knowing where the error happened helps with debugging
and pinpointing the source of the problem.

**Parameter-Specific Errors**: Errors that arise from invalid or missing parameters when invoking a function. These errors are critical
for ensuring the correctness of inputs.
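
The concepts above map naturally onto a small data structure. The following is a minimal Python sketch, assuming the error type, message, and target prompt arrive as the ``X-Arch-Error-Type``, ``X-Arch-Error-Message``, and ``X-Arch-Target-Prompt`` headers used in this page's header example. The ``ArchError`` class and ``parse_arch_error`` function are hypothetical names, not part of Arch.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ArchError:
    """Illustrative container for the error concepts described above."""

    error_type: str
    message: str
    target_prompt: Optional[str] = None


def parse_arch_error(headers: Mapping[str, str]) -> Optional[ArchError]:
    """Build an ArchError from X-Arch-* headers, or return None if no error type is present."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    error_type = lowered.get("x-arch-error-type")
    if error_type is None:
        return None
    return ArchError(
        error_type=error_type,
        message=lowered.get("x-arch-error-message", ""),
        target_prompt=lowered.get("x-arch-target-prompt"),
    )


error = parse_arch_error(
    {
        "X-Arch-Error-Type": "FunctionValidationError",
        "X-Arch-Error-Message": "Tools call parsing failure",
        "X-Arch-Target-Prompt": "createUser",
    }
)
print(error)
```

Returning ``None`` when no error type is present lets the caller treat the absence of the header as the normal, error-free path.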

Error Header Example
--------------------

.. code-block:: http

    HTTP/1.1 400 Bad Request
    X-Arch-Error-Type: FunctionValidationError
    X-Arch-Error-Message: Tools call parsing failure
    X-Arch-Target-Prompt: createUser
    Content-Type: application/json

    {
      "messages": [
        {
          "role": "user",
          "content": "Please create a user with the following ID: 1234"
        },
        {
          "role": "system",
          "content": "Expected a string for 'user_id', but got an integer."
        }
      ]
    }
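
The body in the example carries the detailed, parameter-level explanation in a ``system`` message. The following is a minimal Python sketch for pulling that detail out, assuming the body is a JSON object with a top-level ``messages`` list shaped like the example. The function name ``system_details`` is hypothetical.

```python
import json


def system_details(body: str) -> list[str]:
    """Return the content of every system-role message in an error body."""
    payload = json.loads(body)
    return [
        message["content"]
        for message in payload.get("messages", [])
        if message.get("role") == "system"
    ]


# Sample body copied from the example above (wrapped in a JSON object).
body = """
{
  "messages": [
    {"role": "user", "content": "Please create a user with the following ID: 1234"},
    {"role": "system", "content": "Expected a string for 'user_id', but got an integer."}
  ]
}
"""

print(system_details(body))
```

Keeping the extraction separate from header parsing makes it easy to log the detailed message without showing it to end users.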

Best Practices and Tips
-----------------------

- **Graceful Degradation**: If an error occurs, fail gracefully by providing fallback logic or alternative flows when possible.
- **Log Errors**: Always log errors on the server side for later analysis.
- **Client-Side Handling**: Make sure the client can interpret error responses and provide meaningful feedback to the user. Clients should not display raw error codes or stack traces but rather handle them gracefully.
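
The practices above can be sketched in a few lines of Python: log the raw error on the server, then return a curated message to the client, with a generic fallback for unknown error types. This is an illustrative sketch only; the logger name, the ``FRIENDLY_MESSAGES`` table, and ``user_facing_message`` are hypothetical, and the only error type taken from this page is ``FunctionValidationError``.

```python
import logging

logger = logging.getLogger("arch.errors")

# Hypothetical mapping from error type to user-facing text; adjust to your application.
FRIENDLY_MESSAGES = {
    "FunctionValidationError": (
        "Some of the details in your request look invalid. "
        "Please check them and try again."
    ),
}
DEFAULT_MESSAGE = "Something went wrong on our side. Please try again later."


def user_facing_message(error_type: str, error_message: str) -> str:
    """Log the raw error server-side and return text that is safe to show a user."""
    # Keep the raw details in server logs for later analysis.
    logger.error("Arch error type=%s message=%s", error_type, error_message)
    # Fall back to a generic message for error types we do not recognize.
    return FRIENDLY_MESSAGES.get(error_type, DEFAULT_MESSAGE)


print(user_facing_message("FunctionValidationError", "Tools call parsing failure"))
```

The raw ``error_message`` is never returned to the caller, which keeps internal details out of client-facing responses.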

@ -0,0 +1,109 @@
version: "0.1-beta"
listener:
  address: 0.0.0.0 # or 127.0.0.1
  port: 10000
  # Defines how Arch should parse the content from application/json or text/plain Content-Type in the HTTP request
  message_format: huggingface
  common_tls_context: # If you configure port 443, you'll need to update the listener with your TLS certificates
    tls_certificates:
      - certificate_chain:
          filename: "/etc/certs/cert.pem"
        private_key:
          filename: "/etc/certs/key.pem"
# Arch creates round-robin load balancing between different endpoints, managed via the cluster subsystem.
endpoints:
  app_server:
    # value could be an IP address or a hostname with port
    # this could also be a list of endpoints for load balancing
    # for example endpoint: [ ip1:port, ip2:port ]
    endpoint: "127.0.0.1:80"
    # max time to wait for a connection to be established
    connect_timeout: 0.005s
  mistral_local:
    endpoint: "127.0.0.1:8001"
  error_target:
    endpoint: "error_target_1"
# Centralized way to manage LLMs: keys, retry logic, failover, and limits
llm_providers:
  - name: "OpenAI"
    provider: "openai"
    access_key: $OPENAI_API_KEY
    model: gpt-4o
    default: true
    stream: true
    rate_limits:
      selector: # optional headers, to add rate limiting based on HTTP headers like JWT tokens or API keys
        http_header:
          name: "Authorization"
          value: "" # Empty value means each separate value has a separate limit
      limit:
        tokens: 100000 # Tokens per unit
        unit: "minute"
  - name: "Mistral8x7b"
    provider: "mistral"
    access_key: $MISTRAL_API_KEY
    model: "mistral-8x7b"
  - name: "MistralLocal7b"
    provider: "local"
    model: "mistral-7b-instruct"
    endpoint: "mistral_local"
# provides a way to override default settings for the Arch system
overrides:
  # By default Arch uses an NLI + embedding approach to match an incoming prompt to a prompt target.
  # The intent matching threshold is kept at 0.80; you can override this behavior if you would like
  prompt_target_intent_matching_threshold: 0.60
# default system prompt used by all prompt targets
system_prompt: |
  You are a network assistant that just offers facts; not advice on manufacturers or purchasing decisions.
prompt_guards:
  input_guards:
    jailbreak:
      on_exception:
        message: "Looks like you're curious about my abilities, but I can only provide assistance within my programmed parameters."
prompt_targets:
  - name: "reboot_network_device"
    description: "Helps network operators perform device operations like rebooting a device."
    endpoint:
      name: app_server
      path: "/agent/action"
    parameters:
      - name: "device_id"
        # additional type options include: int | float | bool | string | list | dict
        type: "string"
        description: "Identifier of the network device to reboot."
        required: true
      - name: "confirmation"
        type: "string"
        description: "Confirmation flag to proceed with reboot."
        default: "no"
        enum: [yes, no]
  - name: "information_extraction"
    default: true
    description: "This prompt handles all scenarios that are question and answer in nature. Like summarization, information extraction, etc."
    endpoint:
      name: app_server
      path: "/agent/summary"
    # Arch uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
    auto_llm_dispatch_on_response: true
    # override system prompt for this prompt target
    system_prompt: |
      You are a helpful information extraction assistant. Use the information that is provided to you.
error_target:
  endpoint:
    name: error_target_1
    path: /error
tracing: 100 # sampling rate. Note that by default Arch works with OpenTelemetry-compatible tracing.