fixed test and docs for deployment (#595)

* fixed test and docs for deployment

* updating the main logo image

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Salman Paracha 2025-10-22 14:13:16 -07:00 committed by GitHub
parent 9407ae6af7
commit 7a6f87de3e
7 changed files with 135 additions and 19 deletions

(Binary change: main logo image updated, 217 KiB before → 181 KiB after; image not shown.)

@@ -12,8 +12,8 @@ Prerequisites
 Before you begin, ensure you have the following:
 1. `Docker System <https://docs.docker.com/get-started/get-docker/>`_ (v24)
-2. `Docker compose <https://docs.docker.com/compose/install/>`_ (v2.29)
-3. `Python <https://www.python.org/downloads/>`_ (v3.12)
+2. `Docker Compose <https://docs.docker.com/compose/install/>`_ (v2.29)
+3. `Python <https://www.python.org/downloads/>`_ (v3.10+)
Arch's CLI allows you to manage and interact with the Arch gateway efficiently. To install the CLI, simply run the following command:


@@ -7,14 +7,9 @@ Welcome to Arch!
 .. raw:: html
-   <div style="text-align: center; font-size: 1.25rem;">
-      <br>
-      <p>Build <strong>faster</strong>, <strong>multi-LLM</strong> agents for the <strong>enterprise</strong>.</p>
-   </div>
-   <a href="https://www.producthunt.com/posts/arch-3?embed=true&utm_source=badge-top-post-badge&utm_medium=badge&utm_souce=badge-arch&#0045;3" target="_blank"><img src="https://api.producthunt.com/widgets/embed-image/v1/top-post-badge.svg?post_id=565761&theme=dark&period=daily&t=1742433071161" alt="Arch - Build&#0032;fast&#0044;&#0032;hyper&#0045;personalized&#0032;agents&#0032;with&#0032;intelligent&#0032;infra | Product Hunt" style="width: 250px; height: 54px;" width="250" height="54" /></a>
-`Arch <https://github.com/katanemo/arch>`_ is a smart edge and AI gateway for AI-native apps - one that is natively designed to handle and process prompts, not just network traffic.
+`Arch <https://github.com/katanemo/arch>`_ is a models-native edge and LLM proxy/gateway for AI agents - one that is natively designed to handle and process prompts, not just network traffic.
 Built by contributors to the widely adopted `Envoy Proxy <https://www.envoyproxy.io/>`_, Arch handles the *pesky low-level work* in building agentic apps — like applying guardrails, clarifying vague user input, routing prompts to the right agent, and unifying access to any LLM. It's a language- and framework-friendly infrastructure layer designed to help you build and ship agentic apps faster.
@@ -73,4 +68,5 @@ Built by contributors to the widely adopted `Envoy Proxy <https://www.envoyproxy
    :titlesonly:
    :maxdepth: 2
+   resources/deployment
    resources/configuration_reference


@@ -0,0 +1,121 @@
.. _deployment:

Deployment
==========

This guide shows how to deploy Arch directly using Docker without the archgw CLI, including basic runtime checks for routing and health monitoring.
Docker Deployment
-----------------

Below is a minimal, production-ready example showing how to deploy the Arch Docker image directly and run basic runtime checks. Adjust image names, tags, and the ``arch_config.yaml`` path to match your environment.

.. note::

   You will need to pass all required environment variables that are referenced in your ``arch_config.yaml`` file.

For ``arch_config.yaml``, you can use any sample configuration defined earlier in the documentation. For example, you can try the :ref:`LLM Routing <llm_router>` sample config.

Docker Compose Setup
~~~~~~~~~~~~~~~~~~~~

Create a ``docker-compose.yml`` file with the following configuration:
.. code-block:: yaml

   # docker-compose.yml
   services:
     archgw:
       image: katanemo/archgw:0.3.15
       container_name: archgw
       ports:
         - "10000:10000"   # ingress (client -> arch)
         - "12000:12000"   # egress (arch -> upstream/llm proxy)
       volumes:
         - ./arch_config.yaml:/app/arch_config.yaml:ro
       environment:
         - OPENAI_API_KEY=${OPENAI_API_KEY:?error}
         - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?error}
         - MODEL_SERVER_PORT=51000
Starting the Stack
~~~~~~~~~~~~~~~~~~

Start the services from the directory containing ``docker-compose.yml`` and ``arch_config.yaml``:

.. code-block:: bash

   # Set required environment variables and start services
   OPENAI_API_KEY=xxx ANTHROPIC_API_KEY=yyy docker compose up -d

Check container health and logs:

.. code-block:: bash

   docker compose ps
   docker compose logs -f archgw
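Note that the container can report as running before the listener is actually accepting traffic, so scripts should poll rather than assume readiness. Below is a small POSIX-sh sketch; the port and request body are assumptions based on the compose file above, so adjust them to match your setup.

.. code-block:: bash

   #!/usr/bin/env sh
   # wait_for: retry a command until it succeeds or the attempt budget is spent.
   # Usage: wait_for <max_attempts> <delay_seconds> <command...>
   wait_for() {
     attempts=$1; delay=$2; shift 2
     i=1
     while [ "$i" -le "$attempts" ]; do
       if "$@" >/dev/null 2>&1; then
         echo "ready after $i attempt(s)"
         return 0
       fi
       i=$((i + 1))
       sleep "$delay"
     done
     echo "not ready after $attempts attempt(s)" >&2
     return 1
   }

   # Poll the egress listener from the compose file above (assumed port 12000).
   wait_for 10 2 curl -sf -o /dev/null \
     -H 'Content-Type: application/json' \
     -d '{"messages":[{"role":"user","content":"ping"}],"model":"none"}' \
     http://localhost:12000/v1/chat/completions \
     || echo "gateway not reachable; check 'docker compose logs archgw'"

The retry budget (10 attempts, 2-second delay) is arbitrary; tune it to how long your stack typically takes to come up.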
Runtime Tests
-------------

Perform basic runtime tests to verify routing and functionality.

Gateway Smoke Test
~~~~~~~~~~~~~~~~~~

Test the chat completion endpoint with automatic routing:

.. code-block:: bash

   # Request handled by the gateway. 'model: "none"' lets Arch decide routing
   curl --header 'Content-Type: application/json' \
        --data '{"messages":[{"role":"user","content":"tell me a joke"}], "model":"none"}' \
        http://localhost:12000/v1/chat/completions | jq .model
Expected output:

.. code-block:: json

   "gpt-4o-2024-08-06"
Model-Based Routing
~~~~~~~~~~~~~~~~~~~

Test explicit provider and model routing:

.. code-block:: bash

   curl -s -H "Content-Type: application/json" \
        -d '{"messages":[{"role":"user","content":"Explain quantum computing"}], "model":"anthropic/claude-3-5-sonnet-20241022"}' \
        http://localhost:12000/v1/chat/completions | jq .model

Expected output:

.. code-block:: json

   "claude-3-5-sonnet-20241022"
Troubleshooting
---------------

Common Issues and Solutions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Environment Variables**

Ensure all environment variables (``OPENAI_API_KEY``, ``ANTHROPIC_API_KEY``, etc.) used by ``arch_config.yaml`` are set before starting services.
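One way to fail fast is a small pre-flight check before ``docker compose up``. The POSIX-sh helper below is a sketch; the variable names are the ones used by the compose example in this guide, so substitute whatever your own ``arch_config.yaml`` references.

.. code-block:: bash

   #!/usr/bin/env sh
   # check_required_vars: report any listed environment variables that are
   # unset or empty, returning non-zero so a wrapper script can abort early.
   check_required_vars() {
     missing=""
     for var in "$@"; do
       # Indirect expansion via eval keeps this POSIX-sh compatible.
       eval "value=\${$var:-}"
       [ -n "$value" ] || missing="$missing $var"
     done
     if [ -n "$missing" ]; then
       echo "missing required environment variables:$missing" >&2
       return 1
     fi
     echo "all required environment variables are set"
   }

   # Example: verify the keys used by the compose file before starting the stack.
   check_required_vars OPENAI_API_KEY ANTHROPIC_API_KEY \
     || echo "export the variables above, then re-run 'docker compose up -d'"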
**TLS/Connection Errors**

If you encounter TLS or connection errors to upstream providers:

- Check DNS resolution
- Verify proxy settings
- Confirm the correct protocol and port in your ``arch_config`` endpoints
**Verbose Logging**

To enable more detailed logs for debugging:

- Run archgw with a higher component log level
- See the :ref:`Observability <observability>` guide for logging and monitoring details
- Rebuild the image if required with updated log configuration

**CI/Automated Checks**

For continuous integration or automated testing, you can use the curl commands above as health checks in your deployment pipeline.
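As a sketch, a pipeline step might capture the response body and assert on its ``model`` field without depending on ``jq``. The endpoint and port below are the same assumptions used throughout this guide.

.. code-block:: bash

   #!/usr/bin/env sh
   # assert_model_in_response: pass/fail check on a chat-completion JSON body.
   # A successful completion carries a non-empty "model" field; an error body
   # from the gateway or upstream provider does not.
   assert_model_in_response() {
     printf '%s' "$1" | grep -q '"model"[[:space:]]*:[[:space:]]*"[^"]' || {
       echo "smoke test failed: no model field in response" >&2
       return 1
     }
     echo "smoke test passed"
   }

   # In CI you would capture a live response first, e.g.:
   #   response=$(curl -s -H 'Content-Type: application/json' \
   #     -d '{"messages":[{"role":"user","content":"ping"}],"model":"none"}' \
   #     http://localhost:12000/v1/chat/completions)
   #   assert_model_in_response "$response"

Returning a non-zero status on failure lets the CI runner mark the deployment step as failed without any extra plumbing.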