fixing the README for multi-agent orchestration (#722)

* fixing the README for multi-agent orchestration

* fixed issues where multi-intent queries weren't being properly handled by GPT-4o

* more fixes for the README.md and tracing visuals

* removed remnant Arch README.md

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
This commit is contained in:
Salman Paracha 2026-02-02 14:35:49 -08:00 committed by GitHub
parent e41aa0a617
commit 7cba42f887
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
7 changed files with 39 additions and 246 deletions


@@ -1,112 +0,0 @@
### Use Arch for (Model-based) LLM Routing

### Step 1. Create arch config file
Create a `config.yaml` file with the following content:
```yaml
version: v0.1.0

listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  - access_key: $OPENAI_API_KEY
    model: openai/gpt-4o
    default: true
  - access_key: $MISTRAL_API_KEY
    model: mistral/ministral-3b-latest
```
### Step 2. Start arch gateway
Once the config file is created, ensure that the `MISTRAL_API_KEY` and `OPENAI_API_KEY` environment variables are set (or defined in a `.env` file).
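For example (placeholder values shown below; substitute your real keys):

```shell
# Placeholder values; substitute your real API keys
export OPENAI_API_KEY="sk-your-openai-key"
export MISTRAL_API_KEY="your-mistral-key"
```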
Start the arch gateway:
```
$ planoai up config.yaml
# Or if installed with uv: uvx planoai up config.yaml
2024-12-05 11:24:51,288 - planoai.main - INFO - Starting plano cli version: 0.4.4
2024-12-05 11:24:51,825 - planoai.utils - INFO - Schema validation successful!
2024-12-05 11:24:51,825 - planoai.main - INFO - Starting arch model server and arch gateway
...
2024-12-05 11:25:16,131 - planoai.core - INFO - Container is healthy!
```
### Step 3: Interact with LLM
#### Step 3.1: Using OpenAI python client
Make outbound calls via the Arch gateway:
```python
from openai import OpenAI

# Use the OpenAI client as usual
client = OpenAI(
    # No need to set a real openai.api_key since it's configured in Arch's gateway
    api_key="--",
    # Set the OpenAI API base URL to the Arch gateway endpoint
    base_url="http://127.0.0.1:12000/v1",
)

response = client.chat.completions.create(
    # The model is selected from the arch_config file
    model="None",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print("OpenAI Response:", response.choices[0].message.content)
```
#### Step 3.2: Using curl command
```
$ curl --header 'Content-Type: application/json' \
  --data '{"messages": [{"role": "user","content": "What is the capital of France?"}], "model": "gpt-4o"}' \
  http://localhost:12000/v1/chat/completions
{
  ...
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      ...
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      }
    }
  ],
  ...
}
```
You can override model selection with the `x-arch-llm-provider-hint` header. For example, to use Mistral, run the following curl command:
```
$ curl --header 'Content-Type: application/json' \
  --header 'x-arch-llm-provider-hint: ministral-3b' \
  --data '{"messages": [{"role": "user","content": "What is the capital of France?"}], "model": "gpt-4o"}' \
  http://localhost:12000/v1/chat/completions
{
  ...
  "model": "ministral-3b-latest",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris. It is the most populous city in France and is known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and Notre-Dame Cathedral. Paris is also a major global center for art, fashion, gastronomy, and culture."
      },
      ...
    }
  ],
  ...
}
```


@@ -1,6 +1,6 @@
# Travel Booking Agent Demo
A production-ready multi-agent travel booking system demonstrating Plano's intelligent agent routing. This demo showcases two specialized agents working together to help users plan trips with weather information and flight searches.
A multi-agent travel booking system demonstrating Plano's intelligent agent routing and orchestration capabilities. This demo showcases two specialized agents working together to help users plan trips with weather information and flight searches. All agent interactions are fully traced with OpenTelemetry-compatible tracing for complete observability.
## Overview
@@ -9,7 +9,7 @@ This demo consists of two intelligent agents that work together seamlessly:
- **Weather Agent** - Real-time weather conditions and multi-day forecasts for any city worldwide
- **Flight Agent** - Live flight information between airports with real-time tracking
All agents use Plano's agent router to intelligently route user requests to the appropriate specialized agent based on conversation context and user intent. Both agents run as Docker containers for easy deployment.
All agents use Plano's agent orchestration LLM to intelligently route user requests to the appropriate specialized agent based on conversation context and user intent. Both agents run as Docker containers for easy deployment.
## Features
@@ -17,14 +17,17 @@ All agents use Plano's agent router to intelligently route user requests to the
- **Conversation Context**: Agents understand follow-up questions and references
- **Real-Time Data**: Live weather and flight data from public APIs
- **Multi-Day Forecasts**: Weather agent supports up to 16-day forecasts
- **LLM-Powered**: Uses GPT-4o-mini for extraction and GPT-4o for responses
- **LLM-Powered**: Uses GPT-4o-mini for extraction and GPT-5.2 for responses
- **Streaming Responses**: Real-time streaming for better user experience
## Prerequisites
- Docker and Docker Compose
- [Plano CLI](https://docs.planoai.dev) installed
- OpenAI API key
- [Plano CLI](https://docs.planoai.dev/get_started/quickstart.html#prerequisites) installed
- [OpenAI API key](https://platform.openai.com/api-keys)
- [FlightAware AeroAPI key](https://www.flightaware.com/aeroapi/portal)
> **Note:** You'll need to obtain a FlightAware AeroAPI key for live flight data. Visit [https://www.flightaware.com/aeroapi/portal](https://www.flightaware.com/aeroapi/portal) to get your API key.
## Quick Start
@@ -33,17 +36,11 @@ All agents use Plano's agent router to intelligently route user requests to the
Create a `.env` file or export environment variables:
```bash
export AEROAPI_KEY="your-flightaware-api-key" # Optional, demo key included
export AEROAPI_KEY="your-flightaware-api-key"
export OPENAI_API_KEY="your-openai-api-key"
```
### 2. Start All Agents with Docker
```bash
chmod +x start_agents.sh
./start_agents.sh
```
Or directly:
### 2. Start All Agents & Plano with Docker
```bash
docker compose up --build
@@ -53,50 +50,16 @@ This starts:
- Weather Agent on port 10510
- Flight Agent on port 10520
- Open WebUI on port 8080
### 3. Start Plano Orchestrator
In a new terminal:
```bash
cd /path/to/travel_agents
planoai up config.yaml
# Or if installed with uv: uvx planoai up config.yaml
```
The gateway will start on port 8001 and route requests to the appropriate agents.
- Plano Proxy on port 8001
### 4. Test the System
**Option 1**: Use Open WebUI at http://localhost:8080
Use Open WebUI at http://localhost:8080
**Option 2**: Send requests directly to Plano Orchestrator:
```bash
curl http://localhost:8001/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "What is the weather like in Paris?"}
]
}'
```
> **Note:** The Open WebUI may take a few minutes to start up and be fully ready. Please wait for the container to finish initializing before accessing the interface. Once ready, make sure to select the **gpt-5.2** model from the model dropdown menu in the UI.
## Example Conversations
### Weather Query
```
User: What's the weather in Istanbul?
Assistant: [Weather Agent provides current conditions and forecast]
```
### Flight Search
```
User: What flights go from London to Seattle?
Assistant: [Flight Agent shows available flights with schedules and status]
```
### Multi-Agent Conversation
```
User: What's the weather in Istanbul?
@@ -108,7 +71,7 @@ Assistant: [Flight information from Istanbul to Seattle]
The system understands context and pronouns, automatically routing to the right agent.
### Multi-Intent Queries
### Multi-Intent Single Query
```
User: What's the weather in Seattle, and do any flights go direct to New York?
Assistant: [Both weather_agent and flight_agent respond simultaneously]
@@ -116,20 +79,6 @@ Assistant: [Both weather_agent and flight_agent respond simultaneously]
- Flight Agent: [Flight information from Seattle to New York]
```
The orchestrator can select multiple agents simultaneously for queries containing multiple intents.
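Plano's actual agent selection is LLM-based; as a purely illustrative sketch (the agent registry and keyword matching below are hypothetical, not the repo's code), multi-intent fan-out amounts to selecting every matching agent rather than a single best one:

```python
# Hypothetical sketch: Plano's real router uses an orchestration LLM,
# not keyword matching. This only illustrates the fan-out idea.
AGENTS = {
    "weather_agent": {"keywords": ["weather", "forecast", "temperature"]},
    "flight_agent": {"keywords": ["flight", "airport", "direct"]},
}

def select_agents(query: str) -> list[str]:
    """Pick every agent whose keywords appear in the query (multi-intent)."""
    q = query.lower()
    return [name for name, spec in AGENTS.items()
            if any(kw in q for kw in spec["keywords"])]

query = "What's the weather in Seattle, and do any flights go direct to New York?"
print(select_agents(query))  # → ['weather_agent', 'flight_agent']
```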
## Agent Details
### Weather Agent
- **Port**: 10510
- **API**: Open-Meteo (free, no API key)
- **Capabilities**: Current weather, multi-day forecasts, temperature, conditions, sunrise/sunset
### Flight Agent
- **Port**: 10520
- **API**: FlightAware AeroAPI
- **Capabilities**: Real-time flight status, schedules, delays, gates, terminals, live tracking
## Architecture
```
@@ -138,75 +87,27 @@ The orchestrator can select multiple agents simultaneously for queries containin
Plano (8001)
[Orchestrator]
|
┌────┴────┐
↓ ↓
Weather Flight
Agent Agent
(10510) (10520)
[Docker] [Docker]
```
┌────┴────┐
↓ ↓
Weather Flight
Agent Agent
(10510) (10520)
[Docker] [Docker]
```
Each agent:
1. Extracts intent using GPT-4o-mini (with OpenTelemetry tracing)
2. Fetches real-time data from APIs
3. Generates response using GPT-4o
3. Generates response using GPT-5.2
4. Streams response back to user
Both agents run as Docker containers and communicate with Plano via `host.docker.internal`.
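The two LLM stages above can be sketched as the request payloads each agent builds (the model names match this repo's config; the helper functions themselves are hypothetical, shown only to illustrate the pattern):

```python
EXTRACTION_MODEL = "openai/gpt-4o-mini"  # intent/entity extraction (step 1)
RESPONSE_MODEL = "openai/gpt-5.2"        # final answer generation (step 3)

def build_extraction_request(user_message: str) -> dict:
    """Small, low-temperature call to pull structured intent from the message."""
    return {
        "model": EXTRACTION_MODEL,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.1,
        "max_completion_tokens": 100,
    }

def build_response_request(messages: list[dict]) -> dict:
    """Streaming call that generates the user-facing answer."""
    return {
        "model": RESPONSE_MODEL,
        "messages": messages,
        "temperature": 0.7,
        "max_completion_tokens": 3000,
        "stream": True,
    }
```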
## Project Structure
## Observability
```
travel_agents/
├── config.yaml # Plano configuration
├── docker-compose.yaml # Docker services orchestration
├── Dockerfile # Multi-agent container image
├── start_agents.sh # Quick start script
├── pyproject.toml # Python dependencies
└── src/
└── travel_agents/
├── __init__.py # CLI entry point
├── weather_agent.py # Weather forecast agent (multi-day support)
└── flight_agent.py # Flight information agent
```
This demo includes full OpenTelemetry (OTel) compatible distributed tracing to monitor and debug agent interactions:
The tracing data provides complete visibility into the multi-agent system, making it easy to identify bottlenecks, debug issues, and optimize performance.
## Configuration Files
For more details on setting up and using tracing, see the [Plano Observability documentation](https://docs.planoai.dev/guides/observability/tracing.html).
### config.yaml
Defines the two agents, their descriptions, and routing configuration. The agent router uses these descriptions to intelligently route requests.
### docker-compose.yaml
Orchestrates the deployment of:
- Weather Agent (builds from Dockerfile)
- Flight Agent (builds from Dockerfile)
- Open WebUI (for testing)
- Jaeger (for distributed tracing)
## Troubleshooting
**Docker containers won't start**
- Verify Docker and Docker Compose are installed
- Check that ports 10510, 10520, 8080 are available
- Review container logs: `docker compose logs weather-agent` or `docker compose logs flight-agent`
**Plano won't start**
- Verify Plano is installed: `plano --version`
- Ensure you're in the travel_agents directory
- Check config.yaml is valid
**No response from agents**
- Verify all containers are running: `docker compose ps`
- Check that Plano is running on port 8001
- Review agent logs: `docker compose logs -f`
- Verify `host.docker.internal` resolves correctly (should point to host machine)
## API Endpoints
All agents expose OpenAI-compatible chat completion endpoints:
- `POST /v1/chat/completions` - Chat completion endpoint
- `GET /health` - Health check endpoint
![Distributed tracing visualization](tracing.png)


@@ -7,7 +7,7 @@ agents:
url: http://host.docker.internal:10520
model_providers:
- model: openai/gpt-4o
- model: openai/gpt-5.2
access_key: $OPENAI_API_KEY
default: true
- model: openai/gpt-4o-mini


@@ -49,6 +49,10 @@ services:
- DEFAULT_MODEL=gpt-4o-mini
- ENABLE_OPENAI_API=true
- OPENAI_API_BASE_URL=http://host.docker.internal:8001/v1
- ENABLE_FOLLOW_UP_GENERATION=false
- ENABLE_TITLE_GENERATION=false
- ENABLE_TAGS_GENERATION=false
- ENABLE_AUTOCOMPLETE_GENERATION=false
depends_on:
- weather-agent
- flight-agent


@@ -19,7 +19,7 @@ logger = logging.getLogger(__name__)
LLM_GATEWAY_ENDPOINT = os.getenv(
"LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12000/v1"
)
FLIGHT_MODEL = "openai/gpt-4o"
FLIGHT_MODEL = "openai/gpt-5.2"
EXTRACTION_MODEL = "openai/gpt-4o-mini"
AEROAPI_BASE_URL = "https://aeroapi.flightaware.com/aeroapi"
@@ -82,7 +82,7 @@ async def extract_flight_route(messages: list, request: Request) -> dict:
],
],
temperature=0.1,
max_tokens=100,
max_completion_tokens=100,
extra_headers=extra_headers or None,
)
@@ -124,7 +124,7 @@ async def resolve_airport_code(city_name: str, request: Request) -> Optional[str
{"role": "user", "content": city_name},
],
temperature=0.1,
max_tokens=10,
max_completion_tokens=10,
extra_headers=extra_headers or None,
)
@@ -355,7 +355,7 @@ Ask the user to check the city name or provide a different city."""
model=FLIGHT_MODEL,
messages=response_messages,
temperature=request_body.get("temperature", 0.7),
max_tokens=request_body.get("max_tokens", 1000),
max_completion_tokens=request_body.get("max_tokens", 3000),
stream=True,
extra_headers=extra_headers,
)
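The recurring change in this file swaps the deprecated `max_tokens` argument for `max_completion_tokens`, which newer OpenAI models require. A small shim like the following (a hypothetical helper, not part of the repo) captures the rename for callers that still pass `max_tokens`:

```python
def migrate_token_param(params: dict) -> dict:
    """Rename the deprecated max_tokens kwarg to max_completion_tokens,
    leaving an explicit max_completion_tokens untouched."""
    params = dict(params)  # don't mutate the caller's dict
    if "max_tokens" in params and "max_completion_tokens" not in params:
        params["max_completion_tokens"] = params.pop("max_tokens")
    return params

print(migrate_token_param({"model": "openai/gpt-5.2", "max_tokens": 1000}))
# → {'model': 'openai/gpt-5.2', 'max_completion_tokens': 1000}
```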


@@ -26,7 +26,7 @@ logger = logging.getLogger(__name__)
LLM_GATEWAY_ENDPOINT = os.getenv(
"LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12001/v1"
)
WEATHER_MODEL = "openai/gpt-4o"
WEATHER_MODEL = "openai/gpt-5.2"
LOCATION_MODEL = "openai/gpt-4o-mini"
# Initialize OpenAI client for plano
@@ -117,7 +117,7 @@ If no city can be found, output: NOT_FOUND"""
],
],
temperature=0.1,
max_tokens=10,
max_completion_tokens=10,
extra_headers=extra_headers if extra_headers else None,
)
@@ -372,7 +372,7 @@ Present the weather information to the user in a clear, readable format. If ther
model=WEATHER_MODEL,
messages=response_messages,
temperature=request_body.get("temperature", 0.7),
max_tokens=request_body.get("max_tokens", 1000),
max_completion_tokens=request_body.get("max_tokens", 3000),
stream=True,
extra_headers=extra_headers,
)

Binary file not shown (image, 3 MiB)