mirror of
https://github.com/katanemo/plano.git
synced 2026-04-25 00:36:34 +02:00
updating readme and see how it flows (#556)
* updating readme and see how it flows * fixed links --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
This commit is contained in:
parent
89ab51697a
commit
95d28df725
1 changed files with 71 additions and 71 deletions
142
README.md
142
README.md
|
|
@ -4,14 +4,14 @@
|
|||
<div align="center">
|
||||
|
||||
|
||||
_Arch is a smart proxy server designed as a modular edge and AI gateway for agentic apps_<br><br>
|
||||
_Arch is a smart proxy server designed as a modular edge and AI gateway for agents._<br><br>
|
||||
Arch handles the *pesky low-level work* in building agentic apps — like applying guardrails, clarifying vague user input, routing prompts to the right agent, and unifying access to any LLM. It’s a language and framework friendly infrastructure layer designed to help you build and ship agentic apps faster.
|
||||
|
||||
|
||||
[Quickstart](#Quickstart) •
|
||||
[Demos](#Demos) •
|
||||
[Build agentic apps with Arch](#Build-AI-Agent-with-Arch-Gateway) •
|
||||
[Route LLMs](#Use-Arch-as-a-LLM-Router) •
|
||||
[Build agentic apps with Arch](#Build-Agentic-Apps-with-Arch) •
|
||||
[Documentation](https://docs.archgw.com) •
|
||||
[Contact](#Contact)
|
||||
|
||||
|
|
@ -26,12 +26,12 @@ _Arch is a smart proxy server designed as a modular edge and AI gateway for agen
|
|||
# Overview
|
||||
<a href="https://www.producthunt.com/posts/arch-3?embed=true&utm_source=badge-top-post-badge&utm_medium=badge&utm_souce=badge-arch-3" target="_blank"><img src="https://api.producthunt.com/widgets/embed-image/v1/top-post-badge.svg?post_id=565761&theme=dark&period=daily&t=1742359429995" alt="Arch - Build fast, hyper-personalized agents with intelligent infra | Product Hunt" style="width: 188px; height: 41px;" width="188" height="41" /></a>
|
||||
|
||||
AI demos are easy to build. But past the thrill of a quick hack, you are left building, maintaining and scaling low-level plumbing code for agents that slows down AI innovation. For example:
|
||||
AI demos are easy to hack. But once you move past the prototype stage, you’re stuck building and maintaining low-level plumbing code that slows down real innovation. For example:
|
||||
|
||||
- You want to build specialized agents, but get stuck building **routing and handoff** code.
|
||||
- You want use new LLMs, but struggle to **quickly and safely add LLMs** without writing integration code.
|
||||
- You're bogged down with prompt engineering work to **clarify user intent and validate inputs**.
|
||||
- You're wasting cycles choosing and integrating code for **observability** instead of it happening transparently.
|
||||
- **Routing & orchestration.** Frameworks handle routing and handoffs in tightly coupled ways, so if you want to plug in your own router, planner, or policy engine, you’re stuck with a heavy refactor or brittle overrides.
|
||||
- **Model integration churn.** Frameworks wire LLM integrations directly into code abstractions, making it hard to add or swap models without touching application code — meaning you’ll have to bounce your app every time you want to experiment with a new provider or version.
|
||||
- **Observability & governance.** Logging, tracing, and guardrails are baked in as tightly coupled features, so bringing in best-of-breed solutions is painful and often requires digging through the guts of a framework.
|
||||
- **Prompt engineering overhead**. Input validation, clarifying vague user input, and coercing outputs into the right schema all pile up, turning what should be design work into low-level plumbing work.
|
||||
|
||||
With Arch, you can move faster by focusing on higher-level objectives in a language and framework agnostic way. **Arch** was built by the contributors of [Envoy Proxy](https://www.envoyproxy.io/) with the belief that:
|
||||
|
||||
|
|
@ -39,8 +39,8 @@ With Arch, you can move faster by focusing on higher-level objectives in a langu
|
|||
|
||||
**Core Features**:
|
||||
|
||||
- `🚦 Routing to Agents`. Engineered with purpose-built [LLMs](https://huggingface.co/collections/katanemo/arch-function-66f209a693ea8df14317ad68) for fast (<100ms) agent routing and hand-off scenarios
|
||||
- `🔗 Routing to LLMs`: Unify access and routing to any LLM, including dynamic routing via [preference policies](#Preference-based-Routing).
|
||||
- `🚦 Route to Agents`: Engineered with purpose-built [LLMs](https://huggingface.co/collections/katanemo/arch-function-66f209a693ea8df14317ad68) for fast (<100ms) agent routing and hand-off
|
||||
- `🔗 Route to LLMs`: Unify access to LLMs with support for [dynamic routing](#Preference-based-Routing). Model aliases [coming soon](https://github.com/katanemo/archgw/issues/557)
|
||||
- `⛨ Guardrails`: Centrally configure and prevent harmful outcomes and ensure safe user interactions
|
||||
- `⚡ Tools Use`: For common agentic scenarios let Arch instantly clarify and convert prompts to tools/API calls
|
||||
- `🕵 Observability`: W3C compatible request tracing and LLM metrics that instantly plugin with popular tools
|
||||
|
|
@ -85,7 +85,68 @@ $ source venv/bin/activate # On Windows, use: venv\Scripts\activate
|
|||
$ pip install archgw==0.3.10
|
||||
```
|
||||
|
||||
### Build Agentic Apps with Arch Gateway
|
||||
### Use Arch as a LLM Router
|
||||
Arch supports two primary routing strategies for LLMs: model-based routing and preference-based routing.
|
||||
|
||||
#### Model-based Routing
|
||||
Model-based routing allows you to configure static model names for routing. This is useful when you always want to use a specific model for certain tasks, or manually swap between models. Below an example configuration for model-based routing, and you can follow our [usage guide](demos/use_cases/README.md) on how to get working.
|
||||
|
||||
```yaml
|
||||
version: v0.1.0
|
||||
|
||||
listeners:
|
||||
egress_traffic:
|
||||
address: 0.0.0.0
|
||||
port: 12000
|
||||
message_format: openai
|
||||
timeout: 30s
|
||||
|
||||
llm_providers:
|
||||
- access_key: $OPENAI_API_KEY
|
||||
model: openai/gpt-4o
|
||||
default: true
|
||||
|
||||
- access_key: $MISTRAL_API_KEY
|
||||
model: mistral/mistral-3b-latest
|
||||
```
|
||||
|
||||
#### Preference-based Routing
|
||||
Preference-based routing is designed for more dynamic and intelligent selection of models. Instead of static model names, you write plain-language routing policies that describe the type of task or preference — for example:
|
||||
|
||||
```yaml
|
||||
version: v0.1.0
|
||||
|
||||
listeners:
|
||||
egress_traffic:
|
||||
address: 0.0.0.0
|
||||
port: 12000
|
||||
message_format: openai
|
||||
timeout: 30s
|
||||
|
||||
llm_providers:
|
||||
- model: openai/gpt-4.1
|
||||
access_key: $OPENAI_API_KEY
|
||||
default: true
|
||||
routing_preferences:
|
||||
- name: code generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
|
||||
|
||||
- model: openai/gpt-4o-mini
|
||||
access_key: $OPENAI_API_KEY
|
||||
routing_preferences:
|
||||
- name: code understanding
|
||||
description: understand and explain existing code snippets, functions, or libraries
|
||||
```
|
||||
|
||||
Arch uses a lightweight 1.5B autoregressive model to map prompts (and conversation context) to these policies. This approach adapts to intent drift, supports multi-turn conversations, and avoids the brittleness of embedding-based classifiers or manual if/else chains. No retraining is required when adding new models or updating policies — routing is governed entirely by human-readable rules. You can learn more about the design, benchmarks, and methodology behind preference-based routing in our paper:
|
||||
|
||||
<div align="left">
|
||||
<a href="https://arxiv.org/abs/2506.16655" target="_blank">
|
||||
<img src="docs/source/_static/img/arch_router_paper_preview.png" alt="Arch Router Paper Preview">
|
||||
</a>
|
||||
</div>
|
||||
|
||||
### Build Agentic Apps with Arch
|
||||
|
||||
In following quickstart we will show you how easy it is to build AI agent with Arch gateway. We will build a currency exchange agent using following simple steps. For this demo we will use `https://api.frankfurter.dev/` to fetch latest price for currencies and assume USD as base currency.
|
||||
|
||||
|
|
@ -182,67 +243,6 @@ $ curl --header 'Content-Type: application/json' \
|
|||
|
||||
```
|
||||
|
||||
### Use Arch as a LLM Router
|
||||
Arch supports two primary routing strategies for LLMs: model-based routing and preference-based routing.
|
||||
|
||||
#### Model-based Routing
|
||||
Model-based routing allows you to configure static model names for routing. This is useful when you always want to use a specific model for certain tasks, or manually swap between models. Below an example configuration for model-based routing, and you can follow our [usage guide](demos/use_cases/README.md) on how to get working.
|
||||
|
||||
```yaml
|
||||
version: v0.1.0
|
||||
|
||||
listeners:
|
||||
egress_traffic:
|
||||
address: 0.0.0.0
|
||||
port: 12000
|
||||
message_format: openai
|
||||
timeout: 30s
|
||||
|
||||
llm_providers:
|
||||
- access_key: $OPENAI_API_KEY
|
||||
model: openai/gpt-4o
|
||||
default: true
|
||||
|
||||
- access_key: $MISTRAL_API_KEY
|
||||
model: mistral/mistral-3b-latest
|
||||
```
|
||||
|
||||
#### Preference-based Routing
|
||||
Preference-based routing is designed for more dynamic and intelligent selection of models. Instead of static model names, you write plain-language routing policies that describe the type of task or preference — for example:
|
||||
|
||||
```yaml
|
||||
version: v0.1.0
|
||||
|
||||
listeners:
|
||||
egress_traffic:
|
||||
address: 0.0.0.0
|
||||
port: 12000
|
||||
message_format: openai
|
||||
timeout: 30s
|
||||
|
||||
llm_providers:
|
||||
- model: openai/gpt-4.1
|
||||
access_key: $OPENAI_API_KEY
|
||||
default: true
|
||||
routing_preferences:
|
||||
- name: code generation
|
||||
description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
|
||||
|
||||
- model: openai/gpt-4o-mini
|
||||
access_key: $OPENAI_API_KEY
|
||||
routing_preferences:
|
||||
- name: code understanding
|
||||
description: understand and explain existing code snippets, functions, or libraries
|
||||
```
|
||||
|
||||
Arch uses a lightweight 1.5B autoregressive model to map prompts (and conversation context) to these policies. This approach adapts to intent drift, supports multi-turn conversations, and avoids the brittleness of embedding-based classifiers or manual if/else chains. No retraining is required when adding new models or updating policies — routing is governed entirely by human-readable rules. You can learn more about the design, benchmarks, and methodology behind preference-based routing in our paper:
|
||||
|
||||
<div align="left">
|
||||
<a href="https://arxiv.org/abs/2506.16655" target="_blank">
|
||||
<img src="docs/source/_static/img/arch_router_paper_preview.png" alt="Arch Router Paper Preview">
|
||||
</a>
|
||||
</div>
|
||||
|
||||
## [Observability](https://docs.archgw.com/guides/observability/observability.html)
|
||||
Arch is designed to support best-in class observability by supporting open standards. Please read our [docs](https://docs.archgw.com/guides/observability/observability.html) on observability for more details on tracing, metrics, and logs. The screenshot below is from our integration with Signoz (among others)
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue