add precommit check (#97)

* add precommit check

* remove check

* Revert "remove check"

This reverts commit 9987b62b9b.

* fix checks

* fix whitespace errors
This commit is contained in:
Adil Hafeez 2024-09-30 14:54:01 -07:00 committed by GitHub
parent 1e61452310
commit 4182879717
26 changed files with 292 additions and 312 deletions


Agentic (Text-to-Action) Apps
==============================
Arch helps you easily personalize your applications by calling application-specific (API) functions
via user prompts. This involves any predefined functions or APIs you want to expose to users to perform tasks,
gather information, or manipulate data. This capability is generally referred to as **function calling**, where
you have the flexibility to support “agentic” apps tailored to specific use cases - from updating insurance
claims to creating ad campaigns - via prompts.
Arch analyzes prompts, extracts critical information from them, engages in lightweight conversation with
the user to gather any missing parameters and makes API calls so that you can focus on writing business logic.
Arch does this via its purpose-built :ref:`Arch-FC LLM <llms_in_arch>` - the fastest (200ms p90, 10x faster than GPT-4o)
and cheapest (100x cheaper than GPT-4o) function-calling LLM that matches the performance of frontier models.
______________________________________________________________________________________________
.. image:: /_static/img/function-calling-network-flow.jpg
Single Function Call
--------------------
In the most common scenario, users will request a single action via prompts, and Arch efficiently processes the
request by extracting relevant parameters, validating the input, and calling the designated function or API. Here
is how you would go about enabling this scenario with Arch:
Step 1: Define prompt targets with functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
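The YAML include for this step is not reproduced in the diff. As an illustration only, a prompt target for the insurance-claim scenario mentioned above might be sketched like this (the field names here are hypothetical and not the authoritative Arch configuration schema):

```yaml
prompt_targets:
  - name: update_claim
    description: Update an existing insurance claim
    parameters:
      - name: claim_id
        type: str
        description: Identifier of the claim to update
        required: true
      - name: status
        type: str
        description: New status for the claim
    endpoint:
      name: claims_api
      path: /v1/claims/update
```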
Step 2: Process request parameters in Flask
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Once the prompt targets are configured as above, handling those parameters in your application is straightforward:
.. literalinclude:: /_include/parameter_handling_flask.py
   :language: python
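The included file is not reproduced in this diff. As a rough, framework-agnostic sketch of what such a handler does (hypothetical parameter names, no Flask dependency): Arch has already extracted and type-checked the parameters, so the handler only has to apply them.

```python
def handle_update_claim(params: dict) -> dict:
    """Core of a request handler: apply the parameters Arch extracted
    from the user's prompt. Arch validates inputs upstream, but a
    defensive check on required fields still costs nothing."""
    claim_id = params.get("claim_id")
    if claim_id is None:
        return {"error": "missing required parameter: claim_id"}
    status = params.get("status", "open")  # optional, with a default
    # ... business logic would persist the change here ...
    return {"claim_id": claim_id, "status": status, "updated": True}
```

In a real Flask route, ``params`` would come from the request's query string or JSON body; the extraction logic itself is identical.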
Parallel/Multiple Function Calling
-----------------------------------
In more complex use cases, users may request multiple actions or need multiple APIs/functions to be called
simultaneously or sequentially. With Arch, you can handle these scenarios efficiently using parallel or multiple
function calling. This allows your application to engage in a broader range of interactions, such as updating
different datasets, triggering events across systems, or collecting results from multiple services in one prompt.
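When Arch resolves several function calls from a single prompt, the fan-out on the application side can be done concurrently. A minimal standard-library sketch (the function registry and call format here are hypothetical, not Arch's wire format):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical registry of application functions Arch may target.
FUNCTIONS = {
    "get_weather": lambda city: f"72F in {city}",
    "get_traffic": lambda city: f"light traffic in {city}",
}

def dispatch_parallel(calls: list) -> list:
    """Run every requested function call concurrently and return
    results in the order the calls were requested."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(FUNCTIONS[c["name"]], *c["args"]) for c in calls]
        return [f.result() for f in futures]

results = dispatch_parallel([
    {"name": "get_weather", "args": ["Seattle"]},
    {"name": "get_traffic", "args": ["Seattle"]},
])
```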
Arch-FC1B is built to manage these parallel tasks efficiently, ensuring low latency and high throughput, even
when multiple functions are invoked. It provides two mechanisms to handle these cases:
Step 1: Define Multiple Function Targets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When enabling multiple function calling, define the prompt targets in a way that supports multiple functions or
API calls based on the user's prompt. These targets can be triggered in parallel or sequentially, depending on
the user's intent.
Example of Multiple Prompt Targets in YAML:
:language: yaml
:linenos:
:emphasize-lines: 16-37
:caption: Define prompt targets that can enable users to engage with API and backend functions of an app


Retrieval-Augmented (RAG)
=========================
The following section describes how Arch can help you build faster, smarter and more accurate
Retrieval-Augmented Generation (RAG) applications.
Intent-drift Detection
----------------------
Developers struggle to handle `follow-up <https://www.reddit.com/r/ChatGPTPromptGenius/comments/17dzmpy/how_to_use_rag_with_conversation_history_for/?>`_
or `clarifying <https://www.reddit.com/r/LocalLLaMA/comments/18mqwg6/best_practice_for_rag_with_followup_chat/>`_
questions. Specifically, when users ask for changes or additions to previous responses, their AI applications often
generate entirely new responses instead of adjusting previous ones. Arch offers *intent-drift* tracking as a feature so
that developers can know when the user has shifted away from a previous intent, and can thereby dramatically improve
retrieval accuracy, lower overall token cost and improve the speed of their responses back to users.
Arch uses its built-in lightweight NLI and embedding models to know if the user has steered away from an active intent.
Arch's intent-drift detection mechanism is based on its *prompt_targets* primitive. Arch tries to match an incoming
prompt to one of the *prompt_targets* configured in the gateway. Once it detects that the user has moved away from an
active intent, Arch adds the ``x-arch-intent-drift`` header to the request before sending it to your application servers.
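On the application side, reacting to this signal is a small header check. A sketch (the handler shape is hypothetical, and the header value is assumed here to be a boolean-like string):

```python
def detect_intent_drift(headers: dict) -> bool:
    """Return True when Arch has flagged that the user moved away from
    the previously active intent. Header casing can vary by server,
    so normalize keys before the lookup."""
    normalized = {k.lower(): v for k, v in headers.items()}
    return normalized.get("x-arch-intent-drift", "").lower() == "true"

# When drift is detected, retrieval can start from a fresh context
# instead of stitching the new prompt onto the old conversation.
```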
.. literalinclude:: /_include/intent_detection.py
.. Note::
   Arch is (mostly) stateless so that it can scale in an embarrassingly parallel fashion. So, while Arch offers
   intent-drift detection, you still have to maintain conversational state with intent drift as metadata. The
   following code snippets show how easily you can build and enrich conversational history with LangChain (in Python),
   so that you can use the most relevant prompts for your retrieval and for prompting upstream LLMs.
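Stripped of the LangChain dependency, the enrichment step amounts to tagging each turn with the intent it belongs to and bumping the tag on drift. A self-contained sketch (all names hypothetical):

```python
history = []        # conversation turns, each tagged with an intent id
current_intent = 0  # bumped whenever Arch signals drift

def record_turn(role: str, content: str, drifted: bool) -> None:
    """Append a message to the conversation history, starting a new
    intent id whenever Arch flagged drift for this turn."""
    global current_intent
    if drifted:
        current_intent += 1
    history.append({"role": role, "content": content, "intent": current_intent})

record_turn("user", "What is my claim status?", drifted=False)
record_turn("assistant", "Claim C-42 is open.", drifted=False)
record_turn("user", "Actually, start an ad campaign.", drifted=True)
```

With LangChain, the same tag would travel as message metadata inside ``ConversationBufferMemory``.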
Step 2: update ConversationBufferMemory w/ intent
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. literalinclude:: /_include/intent_detection.py
   :linenos:
   :lines: 22-62
Step 3: get Messages based on latest drift
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. literalinclude:: /_include/intent_detection.py
   :lines: 64-76
You can use the last set of messages that match an intent to prompt an LLM, use them with a vector DB for
improved retrieval, and so on. With Arch and a few lines of code, you can improve retrieval accuracy, lower overall
token cost and dramatically improve the speed of your responses back to users.
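Selecting only the turns that belong to the most recent intent keeps both retrieval queries and upstream prompts small. A sketch of that selection (the intent-tagged message structure here is hypothetical):

```python
def messages_for_latest_intent(history: list) -> list:
    """Keep only the turns tagged with the most recent intent id, so
    retrieval and upstream prompting see a focused context."""
    if not history:
        return []
    latest = history[-1]["intent"]
    return [m for m in history if m["intent"] == latest]

history = [
    {"role": "user", "content": "What is my claim status?", "intent": 0},
    {"role": "assistant", "content": "Claim C-42 is open.", "intent": 0},
    {"role": "user", "content": "Start an ad campaign.", "intent": 1},
]
focused = messages_for_latest_intent(history)
```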
Parameter Extraction for RAG
----------------------------
To build RAG (Retrieval-Augmented Generation) applications, you can configure prompt targets with parameters,
enabling Arch to retrieve critical information in a structured way for processing. This approach improves the
retrieval quality and speed of your application. By extracting parameters from the conversation, you can pull
the appropriate chunks from a vector database or SQL-like data store to enhance accuracy. With Arch, you can
streamline data retrieval and processing to build more efficient and precise RAG applications.
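With parameters extracted up front, retrieval becomes a structured lookup rather than a similarity-only search. A sketch with an in-memory stand-in for a vector or SQL store (the field names are hypothetical):

```python
# Stand-in corpus; in practice these would be chunks in a vector
# database or rows in a SQL store, each carrying structured metadata.
CHUNKS = [
    {"text": "Q3 claims report", "region": "west", "quarter": "Q3"},
    {"text": "Q3 ad spend",      "region": "east", "quarter": "Q3"},
    {"text": "Q2 claims report", "region": "west", "quarter": "Q2"},
]

def retrieve(params: dict) -> list:
    """Filter chunks by every extracted parameter before any semantic
    ranking, shrinking the candidate set and improving precision."""
    hits = [c for c in CHUNKS if all(c.get(k) == v for k, v in params.items())]
    return [c["text"] for c in hits]

docs = retrieve({"region": "west", "quarter": "Q3"})
```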
Step 1: Define prompt targets with parameter definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Step 2: Process request parameters in Flask
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Once the prompt targets are configured as above, handling those parameters in your application is straightforward:
.. literalinclude:: /_include/parameter_handling_flask.py
   :language: python
   :linenos:
   :caption: Flask API example for parameter extraction via HTTP request parameters