fixing the README for multi-agent orchestration (#722)

* fixing the README for multi-agent orchestration

* fixed issues where multi-intent queries weren't being properly handled by GPT-4o

* more fixes for the README.md and tracing visuals

* removed remnant Arch README.md

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
This commit is contained in:
Salman Paracha 2026-02-02 14:35:49 -08:00 committed by GitHub
parent e41aa0a617
commit 7cba42f887
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
7 changed files with 39 additions and 246 deletions


@@ -1,112 +0,0 @@
### Use Arch for (Model-based) LLM Routing

### Step 1. Create arch config file
Create a `config.yaml` file with the following content:
```yaml
version: v0.1.0

listeners:
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

llm_providers:
  - access_key: $OPENAI_API_KEY
    model: openai/gpt-4o
    default: true
  - access_key: $MISTRAL_API_KEY
    model: mistral/ministral-3b-latest
```
### Step 2. Start arch gateway
Once the config file is created, ensure that the `MISTRAL_API_KEY` and `OPENAI_API_KEY` environment variables are set (or defined in a `.env` file).
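For example (placeholder values shown below; substitute your real keys):

```shell
# Placeholder values; substitute your real API keys
export OPENAI_API_KEY="sk-your-openai-key"
export MISTRAL_API_KEY="your-mistral-key"
```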
Start the arch gateway:
```
$ planoai up config.yaml
# Or if installed with uv: uvx planoai up config.yaml
2024-12-05 11:24:51,288 - planoai.main - INFO - Starting plano cli version: 0.4.4
2024-12-05 11:24:51,825 - planoai.utils - INFO - Schema validation successful!
2024-12-05 11:24:51,825 - planoai.main - INFO - Starting arch model server and arch gateway
...
2024-12-05 11:25:16,131 - planoai.core - INFO - Container is healthy!
```
### Step 3: Interact with LLM
#### Step 3.1: Using OpenAI python client
Make outbound calls via the Arch gateway:
```python
from openai import OpenAI

# Use the OpenAI client as usual
client = OpenAI(
    # No need to set a real openai.api_key since it's configured in Arch's gateway
    api_key="--",
    # Set the OpenAI API base URL to the Arch gateway endpoint
    base_url="http://127.0.0.1:12000/v1",
)

response = client.chat.completions.create(
    # The model is selected from the arch_config file
    model="None",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print("OpenAI Response:", response.choices[0].message.content)
```
#### Step 3.2: Using curl command
```
$ curl --header 'Content-Type: application/json' \
  --data '{"messages": [{"role": "user","content": "What is the capital of France?"}], "model": "gpt-4o"}' \
  http://localhost:12000/v1/chat/completions
{
  ...
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      ...
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      }
    }
  ],
  ...
}
```
You can override model selection with the `x-arch-llm-provider-hint` header. For example, to use Mistral, run the following curl command:
```
$ curl --header 'Content-Type: application/json' \
  --header 'x-arch-llm-provider-hint: ministral-3b' \
  --data '{"messages": [{"role": "user","content": "What is the capital of France?"}], "model": "gpt-4o"}' \
  http://localhost:12000/v1/chat/completions
{
  ...
  "model": "ministral-3b-latest",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris. It is the most populous city in France and is known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and Notre-Dame Cathedral. Paris is also a major global center for art, fashion, gastronomy, and culture."
      },
      ...
    }
  ],
  ...
}
```


@@ -1,6 +1,6 @@
# Travel Booking Agent Demo
A production-ready multi-agent travel booking system demonstrating Plano's intelligent agent routing. This demo showcases two specialized agents working together to help users plan trips with weather information and flight searches.
A multi-agent travel booking system demonstrating Plano's intelligent agent routing and orchestration capabilities. This demo showcases two specialized agents working together to help users plan trips with weather information and flight searches. All agent interactions are fully traced with OpenTelemetry-compatible tracing for complete observability.
## Overview
@@ -9,7 +9,7 @@ This demo consists of two intelligent agents that work together seamlessly:
- **Weather Agent** - Real-time weather conditions and multi-day forecasts for any city worldwide
- **Flight Agent** - Live flight information between airports with real-time tracking
All agents use Plano's agent router to intelligently route user requests to the appropriate specialized agent based on conversation context and user intent. Both agents run as Docker containers for easy deployment.
All agents use Plano's agent orchestration LLM to intelligently route user requests to the appropriate specialized agent based on conversation context and user intent. Both agents run as Docker containers for easy deployment.
## Features
@@ -17,14 +17,17 @@ All agents use Plano's agent router to intelligently route user requests to the
- **Conversation Context**: Agents understand follow-up questions and references
- **Real-Time Data**: Live weather and flight data from public APIs
- **Multi-Day Forecasts**: Weather agent supports up to 16-day forecasts
- **LLM-Powered**: Uses GPT-4o-mini for extraction and GPT-4o for responses
- **LLM-Powered**: Uses GPT-4o-mini for extraction and GPT-5.2 for responses
- **Streaming Responses**: Real-time streaming for better user experience
## Prerequisites
- Docker and Docker Compose
- [Plano CLI](https://docs.planoai.dev) installed
- OpenAI API key
- [Plano CLI](https://docs.planoai.dev/get_started/quickstart.html#prerequisites) installed
- [OpenAI API key](https://platform.openai.com/api-keys)
- [FlightAware AeroAPI key](https://www.flightaware.com/aeroapi/portal)
> **Note:** You'll need to obtain a FlightAware AeroAPI key for live flight data. Visit [https://www.flightaware.com/aeroapi/portal](https://www.flightaware.com/aeroapi/portal) to get your API key.
## Quick Start
@@ -33,17 +36,11 @@ All agents use Plano's agent router to intelligently route user requests to the
Create a `.env` file or export environment variables:
```bash
export AEROAPI_KEY="your-flightaware-api-key" # Optional, demo key included
export AEROAPI_KEY="your-flightaware-api-key"
export OPENAI_API_KEY="your-openai-api-key"
```
### 2. Start All Agents with Docker
```bash
chmod +x start_agents.sh
./start_agents.sh
```
Or directly:
### 2. Start All Agents & Plano with Docker
```bash
docker compose up --build
@@ -53,50 +50,16 @@ This starts:
- Weather Agent on port 10510
- Flight Agent on port 10520
- Open WebUI on port 8080
### 3. Start Plano Orchestrator
In a new terminal:
```bash
cd /path/to/travel_agents
planoai up config.yaml
# Or if installed with uv: uvx planoai up config.yaml
```
The gateway will start on port 8001 and route requests to the appropriate agents.
- Plano Proxy on port 8001
### 4. Test the System
**Option 1**: Use Open WebUI at http://localhost:8080
Use Open WebUI at http://localhost:8080
**Option 2**: Send requests directly to Plano Orchestrator:
```bash
curl http://localhost:8001/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "What is the weather like in Paris?"}
]
}'
```
> **Note:** The Open WebUI may take a few minutes to start up and be fully ready. Please wait for the container to finish initializing before accessing the interface. Once ready, make sure to select the **gpt-5.2** model from the model dropdown menu in the UI.
## Example Conversations
### Weather Query
```
User: What's the weather in Istanbul?
Assistant: [Weather Agent provides current conditions and forecast]
```
### Flight Search
```
User: What flights go from London to Seattle?
Assistant: [Flight Agent shows available flights with schedules and status]
```
### Multi-Agent Conversation
```
User: What's the weather in Istanbul?
@@ -108,7 +71,7 @@ Assistant: [Flight information from Istanbul to Seattle]
The system understands context and pronouns, automatically routing to the right agent.
### Multi-Intent Queries
### Multi-Intent Single Query
```
User: What's the weather in Seattle, and do any flights go direct to New York?
Assistant: [Both weather_agent and flight_agent respond simultaneously]
@@ -116,20 +79,6 @@ Assistant: [Both weather_agent and flight_agent respond simultaneously]
- Flight Agent: [Flight information from Seattle to New York]
```
The orchestrator can select multiple agents simultaneously for queries containing multiple intents.
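Plano's actual agent selection is LLM-based; as a purely illustrative sketch (the agent registry and keyword matching below are hypothetical, not the repo's code), multi-intent fan-out amounts to selecting every matching agent rather than a single best one:

```python
# Hypothetical sketch: Plano's real router uses an orchestration LLM,
# not keyword matching. This only illustrates the fan-out idea.
AGENTS = {
    "weather_agent": {"keywords": ["weather", "forecast", "temperature"]},
    "flight_agent": {"keywords": ["flight", "airport", "direct"]},
}

def select_agents(query: str) -> list[str]:
    """Pick every agent whose keywords appear in the query (multi-intent)."""
    q = query.lower()
    return [name for name, spec in AGENTS.items()
            if any(kw in q for kw in spec["keywords"])]

query = "What's the weather in Seattle, and do any flights go direct to New York?"
print(select_agents(query))  # → ['weather_agent', 'flight_agent']
```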
## Agent Details
### Weather Agent
- **Port**: 10510
- **API**: Open-Meteo (free, no API key)
- **Capabilities**: Current weather, multi-day forecasts, temperature, conditions, sunrise/sunset
### Flight Agent
- **Port**: 10520
- **API**: FlightAware AeroAPI
- **Capabilities**: Real-time flight status, schedules, delays, gates, terminals, live tracking
## Architecture
```
@@ -138,75 +87,27 @@ The orchestrator can select multiple agents simultaneously for queries containin
Plano (8001)
[Orchestrator]
|
┌────┴────┐
↓ ↓
Weather Flight
Agent Agent
(10510) (10520)
[Docker] [Docker]
```
┌────┴────┐
↓ ↓
Weather Flight
Agent Agent
(10510) (10520)
[Docker] [Docker]
```
Each agent:
1. Extracts intent using GPT-4o-mini (with OpenTelemetry tracing)
2. Fetches real-time data from APIs
3. Generates response using GPT-4o
3. Generates response using GPT-5.2
4. Streams response back to user
Both agents run as Docker containers and communicate with Plano via `host.docker.internal`.
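The two LLM stages above can be sketched as the request payloads each agent builds (the model names match this repo's config; the helper functions themselves are hypothetical, shown only to illustrate the pattern):

```python
EXTRACTION_MODEL = "openai/gpt-4o-mini"  # intent/entity extraction (step 1)
RESPONSE_MODEL = "openai/gpt-5.2"        # final answer generation (step 3)

def build_extraction_request(user_message: str) -> dict:
    """Small, low-temperature call to pull structured intent from the message."""
    return {
        "model": EXTRACTION_MODEL,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.1,
        "max_completion_tokens": 100,
    }

def build_response_request(messages: list[dict]) -> dict:
    """Streaming call that generates the user-facing answer."""
    return {
        "model": RESPONSE_MODEL,
        "messages": messages,
        "temperature": 0.7,
        "max_completion_tokens": 3000,
        "stream": True,
    }
```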
## Project Structure
## Observability
```
travel_agents/
├── config.yaml # Plano configuration
├── docker-compose.yaml # Docker services orchestration
├── Dockerfile # Multi-agent container image
├── start_agents.sh # Quick start script
├── pyproject.toml # Python dependencies
└── src/
└── travel_agents/
├── __init__.py # CLI entry point
├── weather_agent.py # Weather forecast agent (multi-day support)
└── flight_agent.py # Flight information agent
```
This demo includes full OpenTelemetry (OTel) compatible distributed tracing to monitor and debug agent interactions:
The tracing data provides complete visibility into the multi-agent system, making it easy to identify bottlenecks, debug issues, and optimize performance.
## Configuration Files
For more details on setting up and using tracing, see the [Plano Observability documentation](https://docs.planoai.dev/guides/observability/tracing.html).
### config.yaml
Defines the two agents, their descriptions, and routing configuration. The agent router uses these descriptions to intelligently route requests.
### docker-compose.yaml
Orchestrates the deployment of:
- Weather Agent (builds from Dockerfile)
- Flight Agent (builds from Dockerfile)
- Open WebUI (for testing)
- Jaeger (for distributed tracing)
## Troubleshooting
**Docker containers won't start**
- Verify Docker and Docker Compose are installed
- Check that ports 10510, 10520, 8080 are available
- Review container logs: `docker compose logs weather-agent` or `docker compose logs flight-agent`
**Plano won't start**
- Verify Plano is installed: `plano --version`
- Ensure you're in the travel_agents directory
- Check config.yaml is valid
**No response from agents**
- Verify all containers are running: `docker compose ps`
- Check that Plano is running on port 8001
- Review agent logs: `docker compose logs -f`
- Verify `host.docker.internal` resolves correctly (should point to host machine)
## API Endpoints
All agents expose OpenAI-compatible chat completion endpoints:
- `POST /v1/chat/completions` - Chat completion endpoint
- `GET /health` - Health check endpoint
![Distributed tracing visualization](tracing.png)


@@ -7,7 +7,7 @@ agents:
url: http://host.docker.internal:10520
model_providers:
- model: openai/gpt-4o
- model: openai/gpt-5.2
access_key: $OPENAI_API_KEY
default: true
- model: openai/gpt-4o-mini


@@ -49,6 +49,10 @@ services:
- DEFAULT_MODEL=gpt-4o-mini
- ENABLE_OPENAI_API=true
- OPENAI_API_BASE_URL=http://host.docker.internal:8001/v1
- ENABLE_FOLLOW_UP_GENERATION=false
- ENABLE_TITLE_GENERATION=false
- ENABLE_TAGS_GENERATION=false
- ENABLE_AUTOCOMPLETE_GENERATION=false
depends_on:
- weather-agent
- flight-agent


@@ -19,7 +19,7 @@ logger = logging.getLogger(__name__)
LLM_GATEWAY_ENDPOINT = os.getenv(
"LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12000/v1"
)
FLIGHT_MODEL = "openai/gpt-4o"
FLIGHT_MODEL = "openai/gpt-5.2"
EXTRACTION_MODEL = "openai/gpt-4o-mini"
AEROAPI_BASE_URL = "https://aeroapi.flightaware.com/aeroapi"
@@ -82,7 +82,7 @@ async def extract_flight_route(messages: list, request: Request) -> dict:
],
],
temperature=0.1,
max_tokens=100,
max_completion_tokens=100,
extra_headers=extra_headers or None,
)
@@ -124,7 +124,7 @@ async def resolve_airport_code(city_name: str, request: Request) -> Optional[str
{"role": "user", "content": city_name},
],
temperature=0.1,
max_tokens=10,
max_completion_tokens=10,
extra_headers=extra_headers or None,
)
@@ -355,7 +355,7 @@ Ask the user to check the city name or provide a different city."""
model=FLIGHT_MODEL,
messages=response_messages,
temperature=request_body.get("temperature", 0.7),
max_tokens=request_body.get("max_tokens", 1000),
max_completion_tokens=request_body.get("max_tokens", 3000),
stream=True,
extra_headers=extra_headers,
)
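The recurring change in this file swaps the deprecated `max_tokens` argument for `max_completion_tokens`, which newer OpenAI models require. A small shim like the following (a hypothetical helper, not part of the repo) captures the rename for callers that still pass `max_tokens`:

```python
def migrate_token_param(params: dict) -> dict:
    """Rename the deprecated max_tokens kwarg to max_completion_tokens,
    leaving an explicit max_completion_tokens untouched."""
    params = dict(params)  # don't mutate the caller's dict
    if "max_tokens" in params and "max_completion_tokens" not in params:
        params["max_completion_tokens"] = params.pop("max_tokens")
    return params

print(migrate_token_param({"model": "openai/gpt-5.2", "max_tokens": 1000}))
# → {'model': 'openai/gpt-5.2', 'max_completion_tokens': 1000}
```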


@@ -26,7 +26,7 @@ logger = logging.getLogger(__name__)
LLM_GATEWAY_ENDPOINT = os.getenv(
"LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12001/v1"
)
WEATHER_MODEL = "openai/gpt-4o"
WEATHER_MODEL = "openai/gpt-5.2"
LOCATION_MODEL = "openai/gpt-4o-mini"
# Initialize OpenAI client for plano
@@ -117,7 +117,7 @@ If no city can be found, output: NOT_FOUND"""
],
],
temperature=0.1,
max_tokens=10,
max_completion_tokens=10,
extra_headers=extra_headers if extra_headers else None,
)
@@ -372,7 +372,7 @@ Present the weather information to the user in a clear, readable format. If ther
model=WEATHER_MODEL,
messages=response_messages,
temperature=request_body.get("temperature", 0.7),
max_tokens=request_body.get("max_tokens", 1000),
max_completion_tokens=request_body.get("max_tokens", 3000),
stream=True,
extra_headers=extra_headers,
)

Binary file not shown (image, 3 MiB)