updated docs for 0.1.8 support (#366)

* updated docs for 0.1.8 support

* updated README on root

* updated version reference to 0.1.8 in other parts of the repo

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Salman Paracha 2025-01-10 16:38:48 -08:00 committed by GitHub
parent dae6239b81
commit ebda682b30
8 changed files with 31 additions and 77 deletions


@@ -19,11 +19,14 @@ Engineered with purpose-built LLMs, Arch handles the critical but undifferentiat
**Core Features**:
- Built on [Envoy](https://envoyproxy.io): Arch runs alongside application servers as a separate containerized process, and builds on top of Envoy's proven HTTP management and scalability features to handle ingress and egress traffic related to prompts and LLMs.
-- Function Calling for fast Agents and RAG apps. Engineered with purpose-built [LLMs](https://huggingface.co/collections/katanemo/arch-function-66f209a693ea8df14317ad68) to handle fast, cost-effective, and accurate prompt-based tasks like function/API calling, and parameter extraction from prompts.
+- Intent Routing & Fast Function Calling. Engineered with purpose-built [LLMs](https://huggingface.co/collections/katanemo/arch-function-66f209a693ea8df14317ad68) to handle fast, cost-effective, and accurate prompt-based tasks like function/API calling, and parameter extraction from prompts to build smarter more accurate agentic applications.
- Prompt [Guard](https://huggingface.co/collections/katanemo/arch-guard-6702bdc08b889e4bce8f446d): Arch centralizes guardrails to prevent jailbreak attempts and ensure safe user interactions without writing a single line of code.
-- Routing & Traffic Management: Arch manages LLM calls, offering smart retries, automatic cutover, and resilient upstream connections for continuous availability.
+- Routing & Traffic Management: Arch centralizes calls to LLMs used by your applications, offering smart retries, automatic cutover, and resilient upstream connections for continuous availability.
- Observability: Arch uses the W3C Trace Context standard to enable complete request tracing across applications, ensuring compatibility with observability tools, and provides metrics to monitor latency, token usage, and error rates, helping optimize AI application performance.
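As a rough illustration of the W3C Trace Context standard named in the Observability item above, the sketch below builds and parses a `traceparent` header in Python. It shows only the header format defined by the standard (`version-trace_id-parent_id-flags`); it is not Arch's API, and how Arch itself generates or forwards the header is not shown in this diff. The helper names `new_traceparent` and `parse_traceparent` are hypothetical.

```python
import re
import secrets

# traceparent layout per W3C Trace Context (version 00):
#   2 hex version - 32 hex trace-id - 16 hex parent-id - 2 hex flags
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent(sampled: bool = True) -> str:
    """Build a fresh version-00 traceparent value (hypothetical helper)."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)   # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"  # lowest bit is the "sampled" flag
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(value: str) -> dict:
    """Split a traceparent value into its fields, rejecting malformed input."""
    match = TRACEPARENT_RE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, parent_id, flags = match.groups()
    # The standard forbids version ff and all-zero trace or parent ids.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError(f"invalid traceparent: {value!r}")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


if __name__ == "__main__":
    header = new_traceparent()
    print(header)
    print(parse_traceparent(header))
```

An application sitting behind a gateway would typically read this header from the incoming request and pass the same `trace_id` on its outgoing calls so that spans from both sides can be joined into one trace.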
+**High-Level Network Flow**:
+![alt text](docs/source/_static/img/arch_network_diagram_high_level.png)
**Jump to our [docs](https://docs.archgw.com)** to learn how you can use Arch to improve the speed, security and personalization of your GenAI apps.
> [!IMPORTANT]
@@ -57,7 +60,7 @@ Arch's CLI allows you to manage and interact with the Arch gateway efficiently.
```console
$ python -m venv venv
$ source venv/bin/activate # On Windows, use: venv\Scripts\activate
-$ pip install archgw==0.1.7
+$ pip install archgw==0.1.8
```
### Build AI Agent with Arch Gateway