Overhaul demos directory: cleanup, restructure, and standardize configs (#760)

2026-05-21 13:55:15 +02:00 · 2026-02-17 03:09:28 -08:00 · 2026-02-17 03:09:28 -08:00 · 473996d35d
commit 473996d35d
parent c3591bcbf3
205 changed files with 304 additions and 5223 deletions
--- a/demos/use_cases/claude_code_router/README.md
+++ b/demos/use_cases/claude_code_router/README.md
@@ -1,152 +0,0 @@
-# Claude Code Router - Multi-Model Access with Intelligent Routing
-
-Plano extends Claude Code to access multiple LLM providers through a single interface, offering two key benefits:
-
-1. **Access to Models**: Connect to Grok, Mistral, Gemini, DeepSeek, GPT models, Claude, and local models via Ollama
-2. **Intelligent Routing via Preferences for Coding Tasks**: Configure which models handle specific development tasks:
-   - Code generation and implementation
-   - Code reviews and analysis
-   - Architecture and system design
-   - Debugging and optimization
-   - Documentation and explanations
-
-Plano uses a [1.5B preference-aligned router LLM](https://arxiv.org/abs/2506.16655) to automatically select the best model based on your request type.
-
-## Benefits
-
- **Single Interface**: Access multiple LLM providers through the same Claude Code CLI
- **Task-Aware Routing**: Requests are analyzed and routed to models based on task type (code generation, debugging, architecture, documentation)
- **Provider Flexibility**: Add or remove LLM providers without changing your workflow
- **Routing Transparency**: See which model handles each request and why
-
-## How It Works
-
-Plano sits between Claude Code and multiple LLM providers, analyzing each request to route it to the most suitable model:
-
-```
-Your Request → Plano → Suitable Model → Response
-             ↓
-    [Task Analysis & Model Selection]
-```
-
-**Supported Providers**: OpenAI-compatible, Anthropic, DeepSeek, Grok, Gemini, Llama, Mistral, local models via Ollama. See [full list of supported providers](https://docs.planoai.dev/concepts/llm_providers/supported_providers.html).
-
-
-## Quick Start (5 minutes)
-
-### Prerequisites
-```bash
-# Install Claude Code if you haven't already
-npm install -g @anthropic-ai/claude-code
-
-# Ensure Docker is running
-docker --version
-```
-
-### Step 1: Get Configuration
-```bash
-# Clone and navigate to demo
-git clone https://github.com/katanemo/arch.git
-cd arch/demos/use_cases/claude_code_router
-```
-
-### Step 2: Set API Keys
-```bash
-# Copy the sample environment file
-cp .env .env.local
-
-# Edit with your actual API keys
-export OPENAI_API_KEY="your-openai-key-here"
-export ANTHROPIC_API_KEY="your-anthropic-key-here"
-# Add other providers as needed
-```
-
-### Step 3: Start Plano
-```bash
-# Install using uv (recommended)
-uv tool install planoai
-planoai up
-# Or if already installed with uv: uvx planoai up
-
-# Or install using pip (traditional)
-pip install planoai
-planoai up
-```
-
-### Step 4: Launch Enhanced Claude Code
-```bash
-# This will launch Claude Code with multi-model routing
-planoai cli-agent claude
-# Or if installed with uv: uvx planoai cli-agent claude
-```
-![claude code](claude_code.png)
-
-### Monitor Model Selection in Real-Time
-
-While using Claude Code, open a **second terminal** and run this helper script to watch routing decisions. This script shows you:
- **Which model** was selected for each request
- **Real-time routing decisions** as you work
-
-```bash
-# In a new terminal window (from the same directory)
-sh pretty_model_resolution.sh
-```
-![model_selection](model_selection.png)
-
-## Understanding the Configuration
-
-The `config.yaml` file defines your multi-model setup:
-
-```yaml
-llm_providers:
-  - model: openai/gpt-4.1-2025-04-14
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code generation
-        description: generating new code snippets and functions
-
-  - model: anthropic/claude-3-5-sonnet-20241022
-    access_key: $ANTHROPIC_API_KEY
-    routing_preferences:
-      - name: code understanding
-        description: explaining and analyzing existing code
-```
-
-## Advanced Usage
-
-### Override Model Selection
-```bash
-# Force a specific model for this session
-planoai cli-agent claude --settings='{"ANTHROPIC_SMALL_FAST_MODEL": "deepseek-coder-v2"}'
-```
-
-### Environment Variables
-The system automatically configures these variables for Claude Code:
-```bash
-ANTHROPIC_BASE_URL=http://127.0.0.1:12000  # Routes through Plano
-ANTHROPIC_SMALL_FAST_MODEL=arch.claude.code.small.fast    # Uses intelligent alias
-```
-
-### Custom Routing Configuration
-Edit `config.yaml` to define custom task→model mappings:
-
-```yaml
-llm_providers:
-  # OpenAI Models
-  - model: openai/gpt-5-2025-08-07
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code generation
-        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
-
-  - model: openai/gpt-4.1-2025-04-14
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code understanding
-        description: understand and explain existing code snippets, functions, or libraries
-```
-
-## Technical Details
-
-**How routing works:** Plano intercepts Claude Code requests, analyzes the content using preference-aligned routing, and forwards to the configured model.
-**Research foundation:** Built on our research in [Preference-Aligned LLM Routing](https://arxiv.org/abs/2506.16655)
-**Documentation:** [docs.planoai.dev](https://docs.planoai.dev) for advanced configuration and API details.
--- a/demos/use_cases/claude_code_router/claude_code.png
+++ b/demos/use_cases/claude_code_router/claude_code.png
--- a/demos/use_cases/claude_code_router/config.yaml
+++ b/demos/use_cases/claude_code_router/config.yaml
@@ -1,43 +0,0 @@
-version: v0.1
-
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
-  # OpenAI Models
-  - model: openai/gpt-5-2025-08-07
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code generation
-        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
-
-  - model: openai/gpt-4.1-2025-04-14
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code understanding
-        description: understand and explain existing code snippets, functions, or libraries
-  # Anthropic Models
-  - model: anthropic/claude-sonnet-4-5
-    default: true
-    access_key: $ANTHROPIC_API_KEY
-
-  - model: anthropic/claude-haiku-4-5
-    access_key: $ANTHROPIC_API_KEY
-
-  # Ollama Models
-  - model: ollama/llama3.1
-    base_url: http://host.docker.internal:11434
-
-
-# Model aliases - friendly names that map to actual provider names
-model_aliases:
-  # Alias for a small faster Claude model
-  arch.claude.code.small.fast:
-    target: claude-haiku-4-5
-
-tracing:
-  random_sampling: 100
--- a/demos/use_cases/claude_code_router/model_selection.png
+++ b/demos/use_cases/claude_code_router/model_selection.png
--- a/demos/use_cases/claude_code_router/pretty_model_resolution.sh
+++ b/demos/use_cases/claude_code_router/pretty_model_resolution.sh
@@ -1,33 +0,0 @@
-#!/usr/bin/env bash
-# Pretty-print Plano MODEL_RESOLUTION lines from docker logs
-# - hides Arch-Router
-# - prints timestamp
-# - colors MODEL_RESOLUTION red
-# - colors req_model cyan
-# - colors resolved_model magenta
-# - removes provider and streaming
-
-docker logs -f plano 2>&1 \
-| awk '
-/MODEL_RESOLUTION:/ && $0 !~ /Arch-Router/ {
-  # extract timestamp between first [ and ]
-  ts=""
-  if (match($0, /\[[0-9-]+ [0-9:.]+\]/)) {
-    ts=substr($0, RSTART+1, RLENGTH-2)
-  }
-
-  # split out after MODEL_RESOLUTION:
-  n = split($0, parts, /MODEL_RESOLUTION: */)
-  line = parts[2]
-
-  # remove provider and streaming fields
-  sub(/ *provider='\''[^'\'']+'\''/, "", line)
-  sub(/ *streaming=(true|false)/, "", line)
-
-  # highlight fields
-  gsub(/req_model='\''[^'\'']+'\''/, "\033[36m&\033[0m", line)
-  gsub(/resolved_model='\''[^'\'']+'\''/, "\033[35m&\033[0m", line)
-
-  # print timestamp + MODEL_RESOLUTION
-  printf "\033[90m[%s]\033[0m \033[31mMODEL_RESOLUTION\033[0m: %s\n", ts, line
-}'
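Reviewer note: for readers less familiar with awk, the filtering and coloring done by the removed `pretty_model_resolution.sh` can be sketched in Python roughly as follows. This is an illustrative approximation, not part of the commit. The function name `format_resolution` is hypothetical, and the log-line layout (single-quoted `key='value'` fields after `MODEL_RESOLUTION:`) is assumed from the patterns in the script rather than from real log output.

```python
import re

# Patterns mirror the awk script; field values are assumed to be single-quoted.
TS_RE = re.compile(r"\[([0-9-]+ [0-9:.]+)\]")
PROVIDER_RE = re.compile(r" *provider='[^']+'")
STREAMING_RE = re.compile(r" *streaming=(?:true|false)")
FIELD_COLORS = {"req_model": "\033[36m", "resolved_model": "\033[35m"}  # cyan, magenta


def format_resolution(line):
    """Return a colorized MODEL_RESOLUTION line, or None if the line is skipped."""
    # Keep only MODEL_RESOLUTION lines that do not mention Arch-Router.
    if "MODEL_RESOLUTION:" not in line or "Arch-Router" in line:
        return None

    # Timestamp is the first "[date time]" group, if any.
    match = TS_RE.search(line)
    ts = match.group(1) if match else ""

    # Everything after the first "MODEL_RESOLUTION:" marker.
    rest = re.split(r"MODEL_RESOLUTION: *", line, maxsplit=1)[1]

    # Drop the provider and streaming fields (first occurrence only, like awk sub()).
    rest = PROVIDER_RE.sub("", rest, count=1)
    rest = STREAMING_RE.sub("", rest, count=1)

    # Highlight req_model and resolved_model (all occurrences, like awk gsub()).
    for field, color in FIELD_COLORS.items():
        rest = re.sub(
            rf"{field}='[^']+'",
            lambda m, c=color: f"{c}{m.group(0)}\033[0m",
            rest,
        )

    return f"\033[90m[{ts}]\033[0m \033[31mMODEL_RESOLUTION\033[0m: {rest}"
```

To reproduce the script's behavior, each line of `docker logs -f plano 2>&1` would be passed through this function and the non-`None` results printed.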
--- a/demos/use_cases/http_filter/Dockerfile
+++ b/demos/use_cases/http_filter/Dockerfile
@@ -1,26 +0,0 @@
-FROM python:3.14-slim
-
-WORKDIR /app
-
-# Install bash and uv
-RUN apt-get update && apt-get install -y bash && rm -rf /var/lib/apt/lists/*
-RUN pip install --no-cache-dir uv
-
-# Copy dependency files
-COPY pyproject.toml README.md ./
-
-# Copy source code
-COPY src/ ./src/
-COPY start_agents.sh ./
-
-# Install dependencies using uv
-RUN uv pip install --system --no-cache click fastmcp pydantic fastapi uvicorn openai
-
-# Make start script executable
-RUN chmod +x start_agents.sh
-
-# Expose ports for all agents
-EXPOSE 10500 10501 10502 10505
-
-# Run the start script with bash
-CMD ["bash", "./start_agents.sh"]
--- a/demos/use_cases/http_filter/README.md
+++ b/demos/use_cases/http_filter/README.md
@@ -1,128 +0,0 @@
-# RAG Agent Demo
-
-A multi-agent RAG system demonstrating plano's agent filter chain with MCP protocol.
-
-## Architecture
-
-This demo consists of four components:
-1. **Input Guards** (MCP filter) - Validates queries are within TechCorp's domain
-2. **Query Rewriter** (MCP filter) - Rewrites user queries for better retrieval
-3. **Context Builder** (MCP filter) - Retrieves relevant context from knowledge base
-4. **RAG Agent** (REST) - Generates final responses based on augmented context
-
-## Components
-
-### Input Guards Filter (MCP)
- **Port**: 10500
- **Tool**: `input_guards`
- Validates queries are within TechCorp's domain
- Rejects queries about other companies or unrelated topics
-
-### Query Rewriter Filter (MCP)
- **Port**: 10501
- **Tool**: `query_rewriter`
- Improves queries using LLM before retrieval
-
-### Context Builder Filter (MCP)
- **Port**: 10502
- **Tool**: `context_builder`
- Augments queries with relevant passages from knowledge base
-
-### RAG Agent (REST/OpenAI)
- **Port**: 10505
- **Endpoint**: `/v1/chat/completions`
- Generates responses using OpenAI-compatible API
-
-## Quick Start
-
-### 1. Start everything with Docker Compose
-```bash
-docker compose up --build
-```
-
-This brings up:
- Input Guards MCP server on port 10500
- Query Rewriter MCP server on port 10501
- Context Builder MCP server on port 10502
- RAG Agent REST server on port 10505
- Plano listener on port 8001 (and gateway on 12000)
- Jaeger UI for viewing traces at http://localhost:16686
- AnythingLLM at http://localhost:3001 for interactive queries
-
-> Set `OPENAI_API_KEY` in your environment before running; `LLM_GATEWAY_ENDPOINT` defaults to `http://host.docker.internal:12000/v1`.
-
-### 2. Test the system
-
-**Option A: Using AnythingLLM (recommended)**
-
-Navigate to http://localhost:3001 and send queries through the chat interface.
-
-**Option B: Using curl**
-```bash
-curl -X POST http://localhost:8001/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-4o",
-    "messages": [{"role": "user", "content": "What is the guaranteed uptime for TechCorp?"}]
-  }'
-```
-
-## Configuration
-
-The `config.yaml` defines how agents are connected:
-
-```yaml
-filters:
-  - id: input_guards
-    url: http://host.docker.internal:10500
-    # type: mcp (default)
-    # tool: input_guards (default - same as filter id)
-
-  - id: query_rewriter
-    url: http://host.docker.internal:10501
-    # type: mcp (default)
-
-  - id: context_builder
-    url: http://host.docker.internal:10502
-```
-
-## How It Works
-
-1. User sends request to plano listener on port 8001
-2. Request passes through MCP filter chain:
-   - **Input Guards** validates the query is within TechCorp's domain
-   - **Query Rewriter** rewrites the query for better retrieval
-   - **Context Builder** augments query with relevant knowledge base passages
-3. Augmented request is forwarded to **RAG Agent** REST endpoint
-4. RAG Agent generates final response using LLM
-
-## Additional Configuration
-
-See `config.yaml` for the complete filter chain setup. The MCP filters use default settings:
- `type: mcp` (default)
- `transport: streamable-http` (default)
- Tool name defaults to filter ID
-
-See `sample_queries.md` for example queries to test the RAG system.
-
-Example request:
-```bash
-curl -X POST http://localhost:8001/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-4o",
-    "messages": [
-      {
-        "role": "user",
-        "content": "What is the guaranteed uptime for TechCorp?"
-      }
-    ]
-  }'
-```
- `LLM_GATEWAY_ENDPOINT` - Plano endpoint (default: `http://localhost:12000/v1`)
- `OPENAI_API_KEY` - OpenAI API key for model providers
-
-## Additional Resources
-
- See `sample_queries.md` for more example queries
- See `config.yaml` for complete configuration details
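Reviewer note: the "How It Works" steps in the removed README describe a sequential filter chain, where each filter receives the message list, may rewrite it or reject the request, and passes the result on. A minimal in-process sketch of that idea follows. It is not part of the commit and not Plano's implementation: in the demo the filters are separate network services, whereas here they are plain functions. `QueryRejected`, `run_filter_chain`, the helper functions, and the toy filter bodies are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Filter = Callable[[List[Message]], List[Message]]


class QueryRejected(Exception):
    """Raised by a filter to stop the chain before the agent is called."""


def _last_user(messages: List[Message]) -> str:
    """Return the content of the last user message, or an empty string."""
    return next((m["content"] for m in reversed(messages) if m["role"] == "user"), "")


def _replace_last_user(messages: List[Message], new_content: str) -> List[Message]:
    """Return a copy of messages with the last user message's content replaced."""
    out = [dict(m) for m in messages]
    for m in reversed(out):
        if m["role"] == "user":
            m["content"] = new_content
            break
    return out


def input_guards(messages: List[Message]) -> List[Message]:
    # Toy domain check (assumption): the real filter's logic is not shown in the diff.
    if "techcorp" not in _last_user(messages).lower():
        raise QueryRejected("query is outside TechCorp's domain")
    return messages


def query_rewriter(messages: List[Message]) -> List[Message]:
    # Stand-in rewrite: normalize whitespace instead of calling an LLM.
    return _replace_last_user(messages, " ".join(_last_user(messages).split()))


def context_builder(messages: List[Message]) -> List[Message]:
    # Stand-in retrieval: append a placeholder passage to the query.
    passage = "(hypothetical passage from the knowledge base)"
    return _replace_last_user(messages, f"{_last_user(messages)}\n\nContext:\n{passage}")


def run_filter_chain(messages: List[Message], chain: List[Filter]) -> List[Message]:
    """Apply each filter in order; the output of one is the input of the next."""
    for step in chain:
        messages = step(messages)
    return messages
```

Calling `run_filter_chain(messages, [input_guards, query_rewriter, context_builder])` mirrors the `filter_chain` order listed for `rag_agent`; the augmented messages would then be forwarded to the agent endpoint.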
--- a/demos/use_cases/http_filter/config.yaml
+++ b/demos/use_cases/http_filter/config.yaml
@@ -1,50 +0,0 @@
-version: v0.3.0
-
-agents:
-  - id: rag_agent
-    url: http://rag-agents:10505
-
-filters:
-  - id: input_guards
-    url: http://rag-agents:10500
-    type: http
-    # type: mcp (default)
-    # transport: streamable-http (default)
-    # tool: input_guards (default - same as filter id)
-  - id: query_rewriter
-    url: http://rag-agents:10501
-    type: http
-    # type: mcp (default)
-    # transport: streamable-http (default)
-    # tool: query_rewriter (default - same as filter id)
-  - id: context_builder
-    url: http://rag-agents:10502
-    type: http
-
-model_providers:
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY
-    default: true
-  - model: openai/gpt-4o
-    access_key: $OPENAI_API_KEY
-
-model_aliases:
-  fast-llm:
-    target: gpt-4o-mini
-  smart-llm:
-    target: gpt-4o
-
-listeners:
-  - type: agent
-    name: agent_1
-    port: 8001
-    router: plano_orchestrator_v1
-    agents:
-      - id: rag_agent
-        description: virtual assistant for retrieval augmented generation tasks
-        filter_chain:
-          - input_guards
-          - query_rewriter
-          - context_builder
-tracing:
-  random_sampling: 100
--- a/demos/use_cases/http_filter/docker-compose.yaml
+++ b/demos/use_cases/http_filter/docker-compose.yaml
@@ -1,47 +0,0 @@
-services:
-  rag-agents:
-    build:
-      context: .
-      dockerfile: Dockerfile
-    ports:
-      - "10500:10500"
-      - "10501:10501"
-      - "10502:10502"
-      - "10505:10505"
-    environment:
-      - LLM_GATEWAY_ENDPOINT=${LLM_GATEWAY_ENDPOINT:-http://host.docker.internal:12000/v1}
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-  plano:
-    build:
-      context: ../../../
-      dockerfile: Dockerfile
-    ports:
-      - "12000:12000"
-      - "8001:8001"
-    environment:
-      - PLANO_CONFIG_PATH=/config/config.yaml
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml
-      - /etc/ssl/cert.pem:/etc/ssl/cert.pem
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    ports:
-      - "16686:16686"
-      - "4317:4317"
-      - "4318:4318"
-  anythingllm:
-    image: mintplexlabs/anythingllm
-    restart: always
-    ports:
-      - "3001:3001"
-    cap_add:
-      - SYS_ADMIN
-    environment:
-      - STORAGE_DIR=/app/server/storage
-      - LLM_PROVIDER=generic-openai
-      - GENERIC_OPEN_AI_BASE_PATH=http://plano:8001/v1
-      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
-      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
-      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
--- a/demos/use_cases/http_filter/http.rest
+++ b/demos/use_cases/http_filter/http.rest
@@ -1,77 +0,0 @@
-# http.rest (replacement for mcp_query.rest)
-
-@host = http://localhost
-@model = gpt-4o-mini
-
-# Filter endpoints (HTTP-only)
-@inputGuards = {{host}}:10500
-@queryRewriter = {{host}}:10501
-@contextBuilder = {{host}}:10502
-
-# Plano agent listener (the endpoint AnythingLLM calls)
-@planoAgent = {{host}}:8001
-
-
-POST {{queryRewriter}}/
-Content-Type: application/json
-
-[
-  {
-    "role": "user",
-    "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-  }
-]
-
-
-
-POST {{inputGuards}}/
-Content-Type: application/json
-
-[
-  {
-    "role": "user",
-    "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-  }
-]
-
-
-
-POST {{contextBuilder}}/
-Content-Type: application/json
-
-[
-  {
-    "role": "user",
-    "content": "What is TechCorp's uptime SLA?"
-  }
-]
-
-
-POST {{planoAgent}}/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "fast-llm",
-  "messages": [
-    {
-      "role": "user",
-      "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-    }
-  ],
-  "stream": false
-}
-
-
-POST {{planoAgent}}/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "fast-llm",
-  "messages": [
-    {
-      "role": "user",
-      "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-    }
-  ],
-  "stream": true
-}
--- a/demos/use_cases/http_filter/mcp_query.rest
+++ b/demos/use_cases/http_filter/mcp_query.rest
@@ -1,86 +0,0 @@
-### Initialize MCP Session (SSE)
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-
-{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"capabilities":{},"protocolVersion":"2024-11-05","clientInfo":{"name":"test","version":"1.0.0"}}}
-
-### Send Initialized Notification
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-mcp-session-id: 35d455dc07b8400887f86668590f12bb
-
-{
-  "jsonrpc": "2.0",
-  "method": "notifications/initialized"
-}
-
-### List Tools
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-mcp-session-id: eb10a691b36e4547b6c93c5dc5b47e11
-
-{
-  "jsonrpc": "2.0",
-  "id": "list-tools-1",
-  "method": "tools/list"
-}
-
-### Call Query Rewriter Tool
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-mcp-session-id: 6b95ff75825a402b90eb3ea07e23fbce
-
-{
-  "jsonrpc": "2.0",
-  "id": "3d3b886a-6216-4a26-a422-7a972529c0e7",
-  "method": "tools/call",
-  "params": {
-    "arguments": {
-      "messages": [
-        {
-          "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?",
-          "role": "user"
-        }
-      ]
-    },
-    "name": "query_rewriter"
-  }
-}
-
-### another test
-
-# Content-Type: application/json
-# Accept: application/json, text/event-stream
-# mcp-session-id: ed7a81a1d39549ecaadb867a6b2daf1e
-
-POST http://localhost:10501/mcp
-content-type: application/json
-mcp-session-id: e4ec1ae904e14e06b7d194da10e5f74c
-accept: application/json, text/event-stream
-
-{"jsonrpc":"2.0","id":"4bb1043a-2953-4bcd-b801-f270b0ae8c39","method":"tools/call","params":{"arguments":{"messages":[{"content":"What is the guaranteed uptime percentage for TechCorp's cloud services?","role":"user"}]},"name":"query_rewriter"}}
-
-
-
-### stream test
-
-POST http://localhost:10501/mcp
-content-type: application/json
-mcp-session-id: 35d455dc07b8400887f86668590f12bb
-accept: application/json, text/event-stream
-
-{
-  "jsonrpc": "2.0",
-  "id": 1,
-  "method": "tools/call",
-  "params": {
-    "name": "long_job",
-    "arguments": {
-      "n": 3
-    }
-  }
-}
--- a/demos/use_cases/http_filter/pyproject.toml
+++ b/demos/use_cases/http_filter/pyproject.toml
@@ -1,22 +0,0 @@
-[project]
-name = "rag_agent"
-version = "0.1.0"
-description = "RAG Agent"
-readme = "README.md"
-requires-python = ">=3.10"
-dependencies = [
-    "click>=8.2.1",
-    "mcp>=1.13.1",
-    "fastmcp>=2.14",
-    "pydantic>=2.11.7",
-    "fastapi>=0.104.1",
-    "uvicorn>=0.24.0",
-    "openai==2.13.0",
-]
-
-[project.scripts]
-rag_agent = "rag_agent:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
--- a/demos/use_cases/http_filter/sample_queries.md
+++ b/demos/use_cases/http_filter/sample_queries.md
@@ -1,64 +0,0 @@
-# Sample Queries for Knowledge Base RAG Agent
-
-## Service Level Agreement Queries
- What is the guaranteed uptime percentage for TechCorp's cloud services?
- What remedies are available if the API response time exceeds the agreed threshold?
- How quickly must TechCorp respond to critical support issues?
- What monitoring and reporting requirements are specified in the SLA?
- When was the TechCorp service agreement signed and by whom?
-
-## Privacy Policy Queries
- What encryption methods does DataSecure use to protect data?
- How long does DataSecure retain personal data after account deletion?
- What rights do users have regarding their personal information?
- Can DataSecure sell user data to third parties for marketing?
- Who should be contacted for privacy-related concerns at DataSecure?
-
-## Supply Chain Agreement Queries
- What types of automotive components does PrecisionParts supply?
- What are the payment terms and volume discount structure?
- What quality standards must the supplied components meet?
- What are the penalties for late delivery?
- What insurance coverage requirements apply to the supplier?
-
-## Student Data Management Queries
- What federal laws must EduTech comply with regarding student data?
- What security measures are in place to protect student information?
- How long are student records retained after graduation?
- What consent is required for students under 13 years old?
- Who can access student educational records?
-
-## Investment Advisory Queries
- What is FinanceFirst's management fee structure?
- What types of investments are included in the advisory services?
- What regulatory body oversees FinanceFirst Advisors?
- How often are portfolio reviews conducted?
- What are the client's responsibilities under this agreement?
-
-## Healthcare Standards Queries
- What is the target response time for emergency code teams?
- What hand hygiene compliance rate is required?
- How quickly must medical records be completed after patient encounters?
- What continuing education requirements apply to nursing staff?
- What patient safety protocols are mandatory upon admission?
-
-## Cross-Document Queries
- Which agreements include confidentiality or data protection provisions?
- What are the common termination notice periods across different contract types?
- Which documents specify insurance or liability coverage requirements?
- What compliance and regulatory requirements are mentioned across agreements?
- Which contracts include performance metrics or service level commitments?
-
-## Complex Analysis Queries
- Compare the data retention policies across the privacy policy and student data management documents.
- What are the different approaches to risk management across the supply chain and investment advisory agreements?
- How do the security measures in the healthcare standards compare to those in the privacy policy?
- Which agreements provide the most detailed compliance and regulatory frameworks?
- What common themes exist in the quality assurance requirements across different industries?
-
-## Document-Specific Detail Queries
- List all the specific percentages, timeframes, and numerical requirements mentioned in the SLA.
- What are all the contact persons and their roles mentioned across the documents?
- Identify all the compliance standards and certifications referenced in the supply chain agreement.
- What are the specific consequences or penalties mentioned for non-compliance across agreements?
- List all the third-party systems, tools, or services mentioned in the documents.
--- a/demos/use_cases/http_filter/src/rag_agent/init.py
+++ b/demos/use_cases/http_filter/src/rag_agent/init.py
@@ -1,109 +0,0 @@
-import click
-from fastmcp import FastMCP
-
-mcp = None
-
-
-@click.command()
-@click.option(
-    "--transport",
-    "transport",
-    default="streamable-http",
-    help="Transport type: stdio, sse, or streamable-http",
-)
-@click.option("--host", "host", default="localhost", help="Host to bind MCP server to")
-@click.option("--port", "port", type=int, default=10500, help="Port for MCP server")
-@click.option(
-    "--agent",
-    "agent",
-    required=True,
-    help="Agent name: input_guards, query_rewriter, context_builder, or response_generator",
-)
-@click.option(
-    "--name",
-    "agent_name",
-    default=None,
-    help="Custom MCP server name (defaults to agent type)",
-)
-@click.option(
-    "--rest-server",
-    "rest_server",
-    is_flag=True,
-    help="Start REST server instead of MCP server",
-)
-@click.option("--rest-port", "rest_port", default=8000, help="Port for REST server")
-def main(host, port, agent, transport, agent_name, rest_server, rest_port):
-    """Start a RAG agent as an MCP server or REST server."""
-
-    # Map friendly names to agent modules
-    agent_map = {
-        "input_guards": ("rag_agent.input_guards", "Input Guards Agent"),
-        "query_rewriter": ("rag_agent.query_rewriter", "Query Rewriter Agent"),
-        "context_builder": ("rag_agent.context_builder", "Context Builder Agent"),
-        "response_generator": (
-            "rag_agent.rag_agent",
-            "Response Generator Agent",
-        ),
-    }
-
-    if agent not in agent_map:
-        print(f"Error: Unknown agent '{agent}'")
-        print(f"Available agents: {', '.join(agent_map.keys())}")
-        return
-
-    module_name, default_name = agent_map[agent]
-    mcp_name = agent_name or default_name
-
-    if rest_server:
-        # REST server mode - supported for query_rewriter and response_generator
-        if agent == "response_generator":
-            print(f"Starting REST server on {host}:{rest_port} for agent: {agent}")
-            from rag_agent.rag_agent import start_server
-
-            start_server(host=host, port=rest_port)
-            return
-        elif agent == "query_rewriter":
-            print(f"Starting REST server on {host}:{rest_port} for agent: {agent}")
-            from rag_agent.query_rewriter import start_server
-
-            start_server(host=host, port=rest_port)
-            return
-        else:
-            print(f"Error: Agent '{agent}' does not support REST server mode.")
-            print(
-                "REST server is only supported for: query_rewriter, response_generator"
-            )
-            print(f"Remove --rest-server flag to start {agent} as an MCP server.")
-            return
-    else:
-        # Only input_guards, query_rewriter and context_builder support MCP
-        if agent not in ["input_guards", "query_rewriter", "context_builder"]:
-            print(f"Error: Agent '{agent}' does not support MCP mode.")
-            print(
-                "MCP is only supported for: input_guards, query_rewriter, context_builder"
-            )
-            print(f"Use --rest-server flag to start {agent} as a REST server.")
-            return
-
-        global mcp
-        mcp = FastMCP(mcp_name, host=host, port=port)
-
-        print(f"Starting MCP server: {mcp_name}")
-        print(f"  Agent: {agent}")
-        print(f"  Transport: {transport}")
-        print(f"  Host: {host}")
-        print(f"  Port: {port}")
-
-        # Import the agent module to register its tools
-        import importlib
-
-        importlib.import_module(module_name)
-
-        print(f"Agent '{agent}' loaded successfully")
-        print(f"MCP server ready on {transport}://{host}:{port}")
-
-        mcp.run(transport=transport)
-
-
-if __name__ == "__main__":
-    main()
--- a/demos/use_cases/http_filter/src/rag_agent/main.py
+++ b/demos/use_cases/http_filter/src/rag_agent/main.py
@@ -1,4 +0,0 @@
-from . import main
-
-if __name__ == "__main__":
-    main()
--- a/demos/use_cases/http_filter/src/rag_agent/api.py
+++ b/demos/use_cases/http_filter/src/rag_agent/api.py
@@ -1,36 +0,0 @@
-from pydantic import BaseModel
-from typing import List, Optional, Dict, Any
-
-
-class ChatMessage(BaseModel):
-    role: str
-    content: str
-
-
-class ChatCompletionRequest(BaseModel):
-    model: str
-    messages: List[ChatMessage]
-    temperature: Optional[float] = 1.0
-    max_tokens: Optional[int] = None
-    top_p: Optional[float] = 1.0
-    frequency_penalty: Optional[float] = 0.0
-    presence_penalty: Optional[float] = 0.0
-    stream: Optional[bool] = False
-    stop: Optional[List[str]] = None
-
-
-class ChatCompletionResponse(BaseModel):
-    id: str
-    object: str = "chat.completion"
-    created: int
-    model: str
-    choices: List[Dict[str, Any]]
-    usage: Dict[str, int]
-
-
-class ChatCompletionStreamResponse(BaseModel):
-    id: str
-    object: str = "chat.completion.chunk"
-    created: int
-    model: str
-    choices: List[Dict[str, Any]]
--- a/demos/use_cases/http_filter/src/rag_agent/context_builder.py
+++ b/demos/use_cases/http_filter/src/rag_agent/context_builder.py
@@ -1,228 +0,0 @@
-import json
-from typing import List, Optional, Dict, Any
-from openai import AsyncOpenAI
-import os
-import logging
-import csv
-from pathlib import Path
-
-from .api import ChatMessage
-from fastapi import Request, FastAPI
-
-# from . import mcp
-# from fastmcp.server.dependencies import get_http_headers
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [CONTEXT_BUILDER]    - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-# FastAPI app that exposes the context builder over HTTP
-app = FastAPI(title="RAG Agent Context Builder", version="1.0.0")
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-RAG_MODEL = "gpt-4o-mini"
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-# Global variable to store the knowledge base
-knowledge_base = []
-
-
-def load_knowledge_base():
-    """Load the sample_knowledge_base.csv file into memory on startup."""
-    global knowledge_base
-
-    # Get the path to the CSV file relative to this script
-    current_dir = Path(__file__).parent
-    csv_path = current_dir / "sample_knowledge_base.csv"
-
-    logger.info(f"Loading knowledge base from {csv_path}")
-
-    try:
-        knowledge_base = []
-        with open(csv_path, "r", encoding="utf-8-sig") as file:
-            csv_reader = csv.DictReader(file)
-            for row in csv_reader:
-                knowledge_base.append({"path": row["path"], "content": row["content"]})
-
-        logger.info(f"Loaded {len(knowledge_base)} documents from knowledge base")
-
-    except Exception as e:
-        logger.error(f"Error loading knowledge base: {e}")
-        knowledge_base = []
-
-
-async def find_relevant_passages(
-    query: str,
-    traceparent: Optional[str] = None,
-    request_id: Optional[str] = None,
-    top_k: int = 3,
-) -> List[Dict[str, str]]:
-    """Use the LLM to find the most relevant passages from the knowledge base."""
-
-    if not knowledge_base:
-        logger.warning("Knowledge base is empty")
-        return []
-
-    # Create a system prompt for passage selection
-    system_prompt = f"""You are a retrieval assistant that selects the most relevant document passages for a given query.
-
-                    Given a user query and a list of document passages, identify the {top_k} most relevant passages that would help answer the query.
-
-                    Query: {query}
-
-                    Available passages:
-                    """
-
-    # Add all passages with indices
-    for i, doc in enumerate(knowledge_base):
-        system_prompt += (
-            f"\n[{i}] Path: {doc['path']}\nContent: {doc['content'][:500]}...\n"
-        )
-
-    system_prompt += f"""
-
-        Please respond with ONLY the indices of the {top_k} most relevant passages, separated by commas (e.g., "0,3,7").
-        If fewer than {top_k} passages are relevant, return only the relevant ones.
-        If no passages are relevant, return "NONE"."""
-
-    try:
-        # Call Plano to select relevant passages
-        logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
-
-        # Prepare extra headers if traceparent is provided
-        extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
-        if traceparent:
-            extra_headers["traceparent"] = traceparent
-
-        response = await plano_client.chat.completions.create(
-            model=RAG_MODEL,
-            messages=[{"role": "system", "content": system_prompt}],
-            temperature=0.1,
-            max_tokens=50,
-            extra_headers=extra_headers,
-        )
-
-        result = response.choices[0].message.content.strip()
-        logger.info(f"LLM selected passages: {result}")
-
-        # Parse the indices
-        if result.upper() == "NONE":
-            return []
-
-        selected_passages = []
-        indices = [
-            int(idx.strip()) for idx in result.split(",") if idx.strip().isdigit()
-        ]
-
-        for idx in indices:
-            if 0 <= idx < len(knowledge_base):
-                selected_passages.append(knowledge_base[idx])
-
-        logger.info(f"Selected {len(selected_passages)} relevant passages")
-        return selected_passages
-
-    except Exception as e:
-        logger.error(f"Error finding relevant passages: {e}")
-        return []
-
-
-async def augment_query_with_context(
-    messages: List[ChatMessage],
-    traceparent: Optional[str] = None,
-    request_id: Optional[str] = None,
-) -> List[ChatMessage]:
-    """Extract user query, find relevant context, and augment the messages."""
-
-    # Find the last user message
-    last_user_message = None
-    last_user_index = -1
-
-    for i in range(len(messages) - 1, -1, -1):
-        if messages[i].role == "user":
-            last_user_message = messages[i].content
-            last_user_index = i
-            break
-
-    if not last_user_message:
-        logger.warning("No user message found in conversation")
-        return messages
-
-    logger.info(f"Processing user query: '{last_user_message}'")
-
-    # Find relevant passages
-    relevant_passages = await find_relevant_passages(
-        last_user_message, traceparent, request_id
-    )
-
-    if not relevant_passages:
-        logger.info("No relevant passages found, returning original messages")
-        return messages
-
-    # Build context from relevant passages
-    context_parts = []
-    for i, passage in enumerate(relevant_passages):
-        context_parts.append(
-            f"Document {i+1} ({passage['path']}):\n{passage['content']}"
-        )
-
-    context = "\n\n".join(context_parts)
-
-    # Create augmented content with original query and context
-    augmented_content = f"""{last_user_message} RELEVANT CONTEXT:
-    {context}"""
-
-    # Create updated messages with the augmented query
-    updated_messages = messages.copy()
-    updated_messages[last_user_index] = ChatMessage(
-        role="user", content=augmented_content
-    )
-
-    logger.info(f"Augmented user query with {len(relevant_passages)} relevant passages")
-
-    return updated_messages
-
-
-# Load knowledge base on module import
-load_knowledge_base()
-
-
-@app.post("/")
-async def context_builder(
-    messages: List[ChatMessage], request: Request
-) -> List[ChatMessage]:
-    """MCP tool that augments user queries with relevant context from the knowledge base."""
-    logger.info(f"Received chat completion request with {len(messages)} messages")
-
-    # Get tracing headers from the incoming HTTP request
-    # headers = get_http_headers()
-    # traceparent_header = headers.get("traceparent")
-
-    traceparent_header = request.headers.get("traceparent")
-    request_id = request.headers.get("x-request-id")
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    # Augment the user query with relevant context
-    updated_messages = await augment_query_with_context(
-        messages, traceparent_header, request_id
-    )
-
-    # Return as dict to minimize text serialization
-    return [{"role": msg.role, "content": msg.content} for msg in updated_messages]
-
-
-# Register MCP tool only if mcp is available
-# if mcp is not None:
-#     mcp.tool()(context_builder)
--- a/demos/use_cases/http_filter/src/rag_agent/input_guards.py
+++ b/demos/use_cases/http_filter/src/rag_agent/input_guards.py
@ -1,172 +0,0 @@
-import asyncio
-import json
-import time
-from typing import List, Optional, Dict, Any
-import uuid
-from fastapi import FastAPI, Depends, Request, HTTPException
-
-# from fastmcp.exceptions import ToolError
-from openai import AsyncOpenAI
-import os
-import logging
-
-from .api import ChatCompletionRequest, ChatCompletionResponse, ChatMessage
-from . import mcp
-
-# from fastmcp.server.dependencies import get_http_headers
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [INPUT_GUARDS]       - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-GUARD_MODEL = "gpt-4o-mini"
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-app = FastAPI(title="RAG Agent Input Guards", version="1.0.0")
-
-
-async def validate_query_scope(
-    messages: List[ChatMessage],
-    traceparent_header: Optional[str] = None,
-    request_id: Optional[str] = None,
-) -> Dict[str, Any]:
-    """Validate that the user query is within TechCorp's domain.
-
-    Returns a dict with:
-        - is_valid: bool indicating if query is within scope
-        - reason: str explaining why query is out of scope (if applicable)
-    """
-    system_prompt = """You are an input validation guard for TechCorp's customer support system.
-
-Your job is to determine if a user's query is related to TechCorp and its services/products.
-
-TechCorp is a technology company that provides:
- Cloud services and infrastructure
- SaaS products
- Technical support
- Service level agreements (SLAs)
- Uptime guarantees
- Enterprise solutions
-
-ALLOW queries about:
- TechCorp's services, products, or offerings
- TechCorp's pricing, SLAs, uptime, or policies
- Technical support for TechCorp products
- General questions about TechCorp as a company
-
-REJECT queries about:
- Other companies or their products
- General knowledge questions unrelated to TechCorp
- Personal advice or topics outside TechCorp's domain
- Anything that doesn't relate to TechCorp's business
-
-Respond in JSON format:
-{
-    "is_valid": true/false,
-    "reason": "brief explanation if invalid"
-}"""
-
-    # Get the last user message for validation
-    last_user_message = None
-    for msg in reversed(messages):
-        if msg.role == "user":
-            last_user_message = msg.content
-            break
-
-    if not last_user_message:
-        return {"is_valid": True, "reason": ""}
-
-    # Prepare messages for the guard
-    guard_messages = [
-        {"role": "system", "content": system_prompt},
-        {"role": "user", "content": f"Query to validate: {last_user_message}"},
-    ]
-
-    try:
-        # Call Plano using OpenAI client
-        extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
-        if traceparent_header:
-            extra_headers["traceparent"] = traceparent_header
-
-        logger.info(f"Validating query scope: '{last_user_message}'")
-        response = await plano_client.chat.completions.create(
-            model=GUARD_MODEL,
-            messages=guard_messages,
-            temperature=0.1,
-            max_tokens=150,
-            extra_headers=extra_headers,
-        )
-
-        result_text = response.choices[0].message.content.strip()
-
-        # Parse JSON response
-        try:
-            result = json.loads(result_text)
-            logger.info(f"Validation result: {result}")
-            return result
-        except json.JSONDecodeError:
-            logger.error(f"Failed to parse validation response: {result_text}")
-            # Default to allowing if parsing fails
-            return {"is_valid": True, "reason": ""}
-
-    except Exception as e:
-        logger.error(f"Error validating query: {e}")
-        # Default to allowing if validation fails
-        return {"is_valid": True, "reason": ""}
-
-
-# @mcp.tool
-@app.post("/")
-async def input_guards(
-    messages: List[ChatMessage], request: Request
-) -> List[ChatMessage]:
-    """Input guard that validates queries are within TechCorp's domain.
-
-    If the query is out of scope, replaces the user message with a rejection notice.
-    """
-    logger.info(f"Received request with {len(messages)} messages")
-
-    # Get traceparent and request-id headers from the incoming HTTP request
-    # headers = get_http_headers()
-    # traceparent_header = headers.get("traceparent")
-    traceparent_header = request.headers.get("traceparent")
-    request_id = request.headers.get("x-request-id")
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    # Validate the query scope
-    validation_result = await validate_query_scope(
-        messages, traceparent_header, request_id
-    )
-
-    if not validation_result.get("is_valid", True):
-        reason = validation_result.get("reason", "Query is outside TechCorp's domain")
-        logger.warning(f"Query rejected: {reason}")
-
-        # Reject out-of-scope queries with an HTTP 400 response
-        error_message = f"I apologize, but I can only assist with questions related to TechCorp and its services. Your query appears to be outside this scope. {reason}\n\nPlease ask me about TechCorp's products, services, pricing, SLAs, or technical support."
-        # raise ToolError(error_message)
-        raise HTTPException(
-            status_code=400, detail={"error": "out_of_scope", "message": error_message}
-        )
-
-    logger.info("Query validation passed - forwarding to next filter")
-    return messages
-
-
-@app.get("/health")
-async def health():
-    return {"status": "healthy"}
--- a/demos/use_cases/http_filter/src/rag_agent/query_rewriter.py
+++ b/demos/use_cases/http_filter/src/rag_agent/query_rewriter.py
@ -1,133 +0,0 @@
-import asyncio
-import json
-import time
-from typing import List, Optional, Dict, Any
-import uuid
-from fastapi import FastAPI, Depends, Request
-from openai import AsyncOpenAI
-import os
-import logging
-
-from .api import ChatCompletionRequest, ChatCompletionResponse, ChatMessage
-
-# from . import mcp
-# from fastmcp.server.dependencies import get_http_headers
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [QUERY_REWRITER]     - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-QUERY_REWRITE_MODEL = "gpt-4o-mini"
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-app = FastAPI(title="RAG Agent Query Rewriter", version="1.0.0")
-
-
-async def rewrite_query_with_plano(
-    messages: List[ChatMessage],
-    traceparent_header: Optional[str] = None,
-    request_id: Optional[str] = None,
-) -> str:
-    """Rewrite the last user message for better retrieval. Returns the rewritten query."""
-    system_prompt = """You are a query rewriter that improves user queries for better retrieval.
-
-Given a conversation history, rewrite the last user message to be more specific and context-aware.
-The rewritten query should:
-1. Include relevant context from previous messages
-2. Be clear and specific for information retrieval
-3. Maintain the user's intent
-4. Be concise but comprehensive
-
-Return only the rewritten query, nothing else."""
-
-    rewrite_messages = [{"role": "system", "content": system_prompt}]
-    for msg in messages:
-        rewrite_messages.append({"role": msg.role, "content": msg.content})
-
-    extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
-    if traceparent_header:
-        extra_headers["traceparent"] = traceparent_header
-
-    try:
-        logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
-        resp = await plano_client.chat.completions.create(
-            model=QUERY_REWRITE_MODEL,
-            messages=rewrite_messages,
-            temperature=0.3,
-            max_tokens=200,
-            extra_headers=extra_headers,
-        )
-        rewritten = resp.choices[0].message.content.strip()
-        logger.info(f"Query rewritten successfully: '{rewritten}'")
-        return rewritten
-    except Exception as e:
-        logger.error(f"Error rewriting query: {e}")
-
-    # Fallback: return the original last user message
-    for m in reversed(messages):
-        if m.role == "user":
-            logger.info("Falling back to original user message")
-            return m.content
-    return ""
-
-
-@app.post("/")
-async def query_rewriter_http(
-    messages: List[ChatMessage], request: Request
-) -> List[ChatMessage]:
-    """HTTP filter endpoint used by Plano (type: http)."""
-    logger.info(f"Received request with {len(messages)} messages")
-
-    traceparent_header = request.headers.get("traceparent")
-    request_id = request.headers.get("x-request-id")
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    rewritten_query = await rewrite_query_with_plano(
-        messages, traceparent_header, request_id
-    )
-    # Create updated messages with the rewritten query
-    updated_messages = messages.copy()
-
-    # Find and update the last user message with the rewritten query
-    for i in range(len(updated_messages) - 1, -1, -1):
-        if updated_messages[i].role == "user":
-            original_query = updated_messages[i].content
-            updated_messages[i] = ChatMessage(role="user", content=rewritten_query)
-            logger.info(
-                f"Updated user query from '{original_query}' to '{rewritten_query}'"
-            )
-            break
-    updated_messages_data = [
-        {"role": msg.role, "content": msg.content} for msg in updated_messages
-    ]
-    updated_messages = [ChatMessage(**msg) for msg in updated_messages_data]
-
-    logger.info("Returning rewritten chat completion response")
-    return updated_messages
-
-
-@app.get("/health")
-async def health():
-    return {"status": "healthy"}
-
-
-def start_server(host: str = "0.0.0.0", port: int = 10501):
-    """Start the FastAPI server for query rewriter."""
-    import uvicorn
-
-    logger.info(f"Starting Query Rewriter REST server on {host}:{port}")
-    uvicorn.run(app, host=host, port=port)
--- a/demos/use_cases/http_filter/src/rag_agent/rag_agent.py
+++ b/demos/use_cases/http_filter/src/rag_agent/rag_agent.py
@ -1,221 +0,0 @@
-import json
-from fastapi import FastAPI, Request
-from fastapi.responses import StreamingResponse
-from openai import AsyncOpenAI
-import os
-import logging
-import time
-import uuid
-import uvicorn
-import asyncio
-
-from .api import (
-    ChatCompletionRequest,
-    ChatCompletionResponse,
-    ChatCompletionStreamResponse,
-)
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [RESPONSE_GENERATOR] - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-RESPONSE_MODEL = "gpt-4o"
-
-# System prompt for response generation
-SYSTEM_PROMPT = """You are a helpful assistant that generates coherent, contextual responses.
-
-Given a conversation history, generate a helpful and relevant response based on all the context available in the messages.
-Your response should:
-1. Be contextually aware of the entire conversation
-2. Address the user's needs appropriately
-3. Be helpful and informative
-4. Maintain a natural conversational tone
-
-Generate a complete response to assist the user."""
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-# FastAPI app for REST server
-app = FastAPI(title="RAG Agent Response Generator", version="1.0.0")
-
-
-def prepare_response_messages(request_body: ChatCompletionRequest):
-    """Prepare messages for response generation by adding system prompt."""
-    response_messages = [{"role": "system", "content": SYSTEM_PROMPT}]
-
-    # Add conversation history
-    for msg in request_body.messages:
-        response_messages.append({"role": msg.role, "content": msg.content})
-
-    return response_messages
-
-
-@app.post("/v1/chat/completions")
-async def chat_completion_http(request: Request, request_body: ChatCompletionRequest):
-    """HTTP endpoint for chat completions with streaming support."""
-    logger.info(
-        f"Received chat completion request with {len(request_body.messages)} messages"
-    )
-
-    # Get traceparent header from HTTP request
-    traceparent_header = request.headers.get("traceparent")
-    request_id = request.headers.get("x-request-id") or f"req-{uuid.uuid4().hex}"
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    return StreamingResponse(
-        stream_chat_completions(request_body, traceparent_header, request_id),
-        media_type="text/event-stream",
-        headers={
-            "content-type": "text/event-stream",
-            "x-request-id": request_id,
-        },
-    )
-
-
-async def stream_chat_completions(
-    request_body: ChatCompletionRequest,
-    traceparent_header: str = None,
-    request_id: str = None,
-):
-    """Generate streaming chat completions."""
-    # Prepare messages for response generation
-    response_messages = prepare_response_messages(request_body)
-
-    try:
-        # Call Plano using OpenAI client for streaming
-        logger.info(
-            f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
-        )
-
-        # Prepare extra headers if traceparent is provided
-        extra_headers = {"x-envoy-max-retries": "3", "x-request-id": request_id}
-        if traceparent_header:
-            extra_headers["traceparent"] = traceparent_header
-
-        response_stream = await plano_client.chat.completions.create(
-            model=RESPONSE_MODEL,
-            messages=response_messages,
-            temperature=0.7 if request_body.temperature is None else request_body.temperature,
-            max_tokens=request_body.max_tokens or 1000,
-            stream=True,
-            extra_headers=extra_headers,
-        )
-
-        completion_id = f"chatcmpl-{uuid.uuid4().hex[:8]}"
-        created_time = int(time.time())
-        collected_content = []
-
-        async for chunk in response_stream:
-            if chunk.choices and chunk.choices[0].delta.content:
-                content = chunk.choices[0].delta.content
-                collected_content.append(content)
-
-                # Create streaming response chunk
-                stream_chunk = ChatCompletionStreamResponse(
-                    id=completion_id,
-                    created=created_time,
-                    model=request_body.model,
-                    choices=[
-                        {
-                            "index": 0,
-                            "delta": {"content": content},
-                            "finish_reason": None,
-                        }
-                    ],
-                )
-
-                yield f"data: {stream_chunk.model_dump_json()}\n\n"
-
-        # Send final chunk with complete response in expected format
-        full_response = "".join(collected_content)
-        updated_history = [{"role": "assistant", "content": full_response}]
-
-        final_chunk = ChatCompletionStreamResponse(
-            id=completion_id,
-            created=created_time,
-            model=request_body.model,
-            choices=[
-                {
-                    "index": 0,
-                    "delta": {},
-                    "finish_reason": "stop",
-                    "message": {
-                        "role": "assistant",
-                        "content": json.dumps(updated_history),
-                    },
-                }
-            ],
-        )
-
-        yield f"data: {final_chunk.model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-
-    except Exception as e:
-        logger.error(f"Error generating streaming response: {e}")
-
-        # Send error as streaming response
-        error_chunk = ChatCompletionStreamResponse(
-            id=f"chatcmpl-{uuid.uuid4().hex[:8]}",
-            created=int(time.time()),
-            model=request_body.model,
-            choices=[
-                {
-                    "index": 0,
-                    "delta": {
-                        "content": "I apologize, but I'm having trouble generating a response right now. Please try again."
-                    },
-                    "finish_reason": "stop",
-                }
-            ],
-        )
-
-        yield f"data: {error_chunk.model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-
-
-@app.get("/health")
-async def health_check():
-    """Health check endpoint."""
-    return {"status": "healthy"}
-
-
-def start_server(host: str = "localhost", port: int = 8000):
-    """Start the REST server."""
-    uvicorn.run(
-        app,
-        host=host,
-        port=port,
-        log_config={
-            "version": 1,
-            "disable_existing_loggers": False,
-            "formatters": {
-                "default": {
-                    "format": "%(asctime)s - [RESPONSE_GENERATOR] - %(levelname)s - %(message)s",
-                },
-            },
-            "handlers": {
-                "default": {
-                    "formatter": "default",
-                    "class": "logging.StreamHandler",
-                    "stream": "ext://sys.stdout",
-                },
-            },
-            "root": {
-                "level": "INFO",
-                "handlers": ["default"],
-            },
-        },
-    )
--- a/demos/use_cases/http_filter/src/rag_agent/sample_knowledge_base.csv
+++ b/demos/use_cases/http_filter/src/rag_agent/sample_knowledge_base.csv
@ -1,257 +0,0 @@
-path,content
-TechCorp_CloudServices_SLA_Agreement_2024,"SERVICE LEVEL AGREEMENT
-This Service Level Agreement (""SLA"") is entered into on March 15, 2024, between TechCorp Solutions Inc., a Delaware corporation (""Provider""), and CloudFirst Enterprises LLC (""Customer"").
-
-DEFINITIONS
-Service Availability: The percentage of time during which the cloud services are operational and accessible.
-Downtime: Any period when the services are unavailable or inaccessible to Customer.
-Response Time: The time between service request submission and initial response from Provider.
-
-SERVICE COMMITMENTS
-Provider guarantees 99.9% uptime for all cloud infrastructure services during any calendar month.
-Average response time for API calls shall not exceed 200 milliseconds under normal operating conditions.
-Customer support response times: Critical issues within 1 hour, Standard issues within 4 hours.
-
-REMEDIES
-For each full percentage point below 99.9% availability, Customer receives 10% credit on monthly fees.
-If response times exceed 500ms for more than 5 minutes in any hour, Customer receives 5% monthly credit.
-
-MONITORING AND REPORTING
-Provider will maintain real-time monitoring systems and provide monthly performance reports.
-All metrics will be measured from Provider's monitoring systems located in primary data centers.
-
-This SLA remains in effect for the duration of the underlying service agreement.
-
-Executed by:
-TechCorp Solutions Inc.
-Sarah Mitchell, VP Operations
-Date: March 15, 2024
-
-CloudFirst Enterprises LLC
-Robert Chen, CTO
-Date: March 16, 2024"
-
-DataSecure_Privacy_Policy_v3.2,"PRIVACY POLICY
-DataSecure Analytics, Inc. (""Company"") Privacy Policy
-Effective Date: January 1, 2024
-Last Updated: February 28, 2024
-
-INFORMATION COLLECTION
-We collect information you provide directly, such as account details, usage preferences, and communication records.
-Automatically collected data includes IP addresses, browser types, device information, and service interaction logs.
-Third-party integrations may provide additional user behavior and demographic information with consent.
-
-DATA USAGE
-Personal information is used to provide services, improve user experience, and communicate service updates.
-Aggregated, non-identifiable data may be used for analytics, research, and service enhancement.
-We do not sell personal information to third parties for marketing purposes.
-
-DATA PROTECTION
-All data is encrypted in transit using TLS 1.3 and at rest using AES-256 encryption.
-Access controls limit data access to authorized personnel only on a need-to-know basis.
-Regular security audits and penetration testing ensure ongoing protection measures.
-
-DATA RETENTION
-Personal data is retained for the duration of active service plus 24 months.
-Logs and analytics data are retained for 12 months unless legally required otherwise.
-Upon account deletion, personal data is permanently removed within 30 days.
-
-USER RIGHTS
-Users may request access to, correction of, or deletion of their personal information.
-Data portability requests will be fulfilled in standard formats within 30 days.
-Marketing communications can be opted out of at any time.
-
-CONTACT
-For privacy concerns, contact: privacy@datasecure.com
-Data Protection Officer: Jennifer Walsh, jwalsh@datasecure.com"
-
-GlobalManufacturing_SupplyChain_Contract_Q2_2024,"SUPPLY CHAIN AGREEMENT
-This Supply Chain Agreement is entered into between GlobalManufacturing Corp (""Buyer"") and PrecisionParts Ltd (""Supplier"") effective April 1, 2024.
-
-SCOPE OF SERVICES
-Supplier will provide automotive components including brake assemblies, suspension parts, and electrical harnesses.
-All products must meet ISO 9001 quality standards and automotive industry specifications.
-Delivery schedule: Weekly shipments every Tuesday, with 48-hour advance shipping notifications.
-
-PRICING AND PAYMENT
-Component pricing is fixed for initial 6-month term with quarterly price review thereafter.
-Payment terms: Net 45 days from invoice date via electronic transfer.
-Volume discounts apply: 5% for orders exceeding 10,000 units per month, 8% for orders exceeding 25,000 units.
-
-QUALITY REQUIREMENTS
-All components must pass incoming inspection with less than 0.1% defect rate.
-Supplier maintains quality certifications including IATF 16949 and environmental compliance.
-Batch tracking and traceability required for all delivered components.
-
-LOGISTICS AND DELIVERY
-Supplier responsible for packaging, labeling, and delivery to Buyer's distribution centers.
-Delivery windows: 8 AM - 4 PM, Monday through Friday, with advance appointment scheduling.
-Late delivery penalties: 2% of shipment value for each day beyond scheduled delivery.
-
-RISK MANAGEMENT
-Supplier maintains business continuity plans and alternative sourcing strategies.
-Force majeure events must be reported within 24 hours with mitigation plans.
-Insurance requirements: $5M general liability, $2M product liability coverage.
-
-INTELLECTUAL PROPERTY
-All custom tooling and specifications remain property of Buyer.
-Supplier grants license to use necessary patents for component manufacturing.
-
-This agreement shall remain in effect for 24 months with automatic renewal unless terminated.
-
-GlobalManufacturing Corp
-Michael Rodriguez, Supply Chain Director
-Date: April 1, 2024
-
-PrecisionParts Ltd
-Amanda Foster, VP Sales
-Date: April 2, 2024"
-
-EduTech_StudentData_Management_Policy_2024,"STUDENT DATA MANAGEMENT POLICY
-EduTech Learning Platform - Data Management and Protection Policy
-Document Version: 2.1
-Effective Date: August 15, 2024
-
-SCOPE AND PURPOSE
-This policy governs the collection, use, storage, and protection of student educational records and personal information.
-Applies to all employees, contractors, and third-party service providers accessing student data.
-Compliance with FERPA, COPPA, and state student privacy laws is mandatory.
-
-DATA CLASSIFICATION
-Educational Records: Grades, attendance, assignments, and academic progress information.
-Personal Information: Names, addresses, contact details, and demographic information.
-Behavioral Data: Learning patterns, platform usage, and engagement metrics.
-
-COLLECTION PRINCIPLES
-Data collection is limited to educational purposes and service improvement only.
-Parental consent required for students under 13 years of age.
-Students and parents have right to review and request corrections to educational records.
-
-ACCESS CONTROLS
-Role-based access ensures personnel see only data necessary for their functions.
-Multi-factor authentication required for all system access.
-Access logs maintained and reviewed monthly for unauthorized activity.
-
-DATA SHARING
-Educational records shared only with authorized school personnel and parents/students.
-No data sharing with third parties for commercial purposes without explicit consent.
-Research data must be de-identified and aggregated before external sharing.
-
-SECURITY MEASURES
-Data encrypted using industry-standard protocols during transmission and storage.
-Regular security assessments and vulnerability testing conducted quarterly.
-Incident response plan includes notification procedures for data breaches.
-
-RETENTION AND DISPOSAL
-Student records retained according to school district policies, typically 5-7 years post-graduation.
-Inactive accounts and associated data purged after 2 years of non-use.
-Secure data destruction protocols ensure complete removal of sensitive information.
-
-COMPLIANCE MONITORING
-Annual privacy training required for all staff handling student data.
-Regular audits ensure ongoing compliance with applicable privacy regulations.
-Privacy impact assessments conducted for new features or data uses.
-
-Contact: Dr. Lisa Thompson, Chief Privacy Officer
-Email: privacy@edutech-learning.com
-Phone: (555) 123-4567"
-
-FinanceFirst_Investment_Advisory_Agreement_2024,"INVESTMENT ADVISORY AGREEMENT
-This Investment Advisory Agreement is entered into between FinanceFirst Advisors LLC (""Advisor"") and Madison Investment Group (""Client"") on May 20, 2024.
-
-ADVISORY SERVICES
-Advisor will provide comprehensive investment management and financial planning services.
-Services include portfolio construction, asset allocation, risk assessment, and performance monitoring.
-Regular portfolio reviews conducted quarterly with detailed performance reporting.
-
-INVESTMENT AUTHORITY
-Client grants Advisor discretionary authority to make investment decisions within agreed parameters.
-Investment universe includes stocks, bonds, ETFs, mutual funds, and alternative investments as appropriate.
-All trades executed through qualified broker-dealers with best execution practices.
-
-FEE STRUCTURE
-Management fee: 1.25% annually on assets under management, calculated and billed quarterly.
-Performance fee: 15% of returns exceeding S&P 500 benchmark, calculated annually.
-Additional fees may apply for specialized services such as tax planning or estate planning.
-
-CLIENT RESPONSIBILITIES
-Client must provide accurate financial information and promptly communicate changes in circumstances.
-Investment objectives and risk tolerance should be reviewed and updated annually.
-Client responsible for reviewing and approving investment policy statement.
-
-RISK DISCLOSURE
-All investments carry risk of loss, and past performance does not guarantee future results.
-Diversification does not ensure profit or protect against loss in declining markets.
-Alternative investments may have limited liquidity and higher volatility.
-
-REGULATORY COMPLIANCE
-Advisor is registered with the Securities and Exchange Commission as an investment advisor.
-All activities conducted in accordance with Investment Advisers Act of 1940 and applicable regulations.
-Form ADV Part 2 brochure provided annually with material updates.
-
-CONFIDENTIALITY
-All client information treated as confidential and shared only as necessary for service provision.
-Third-party service providers bound by confidentiality agreements.
-Client data protected through secure systems and access controls.
-
-TERMINATION
-Either party may terminate agreement with 30 days written notice.
-Upon termination, Advisor will assist with orderly transfer of assets to new custodian or advisor.
-Final fee calculation prorated to date of termination.
-
-FinanceFirst Advisors LLC
-Thomas Anderson, Managing Partner
-Date: May 20, 2024
-
-Madison Investment Group
-Rebecca Martinez, Chief Investment Officer
-Date: May 21, 2024"
-
-HealthSystem_PatientCare_Standards_2024,"PATIENT CARE STANDARDS AND PROTOCOLS
-Metropolitan Health System - Clinical Care Standards
-Document ID: MHS-PCS-2024-001
-Effective Date: June 1, 2024
-
-PATIENT SAFETY PROTOCOLS
-All patients must have proper identification verification using two unique identifiers.
-Medication administration requires independent double-check for high-risk medications.
-Fall risk assessments completed within 4 hours of admission with appropriate interventions.
-
-CLINICAL DOCUMENTATION
-Medical records must be completed within 24 hours of patient encounter.
-All entries require electronic signature with timestamp and provider identification.
-Critical values and abnormal results must be communicated and documented immediately.
-
-INFECTION CONTROL
-Hand hygiene compliance monitored with target rate of 95% or higher.
-Personal protective equipment used according to transmission-based precautions.
-Isolation procedures implemented within 2 hours of identification of infectious conditions.
-
-EMERGENCY RESPONSE
-Code team response time target: 3 minutes from activation to arrival.
-Crash cart and emergency equipment checks performed daily and documented.
-All staff required to maintain current CPR and emergency response certifications.
-
-PATIENT COMMUNICATION
-Patient rights and responsibilities communicated upon admission.
-Informed consent obtained and documented prior to procedures and treatments.
-Family involvement encouraged with respect for patient privacy preferences.
-
-QUALITY MEASURES
-Patient satisfaction scores monitored monthly with target of 4.5/5.0 or higher.
-Medication error rates tracked with goal of less than 1 per 1000 patient days.
-Hospital-acquired infection rates measured and benchmarked against national standards.
-
-STAFF COMPETENCY
-Annual competency assessments required for all clinical staff.
-Continuing education requirements: 24 hours annually for nurses, 40 hours for physicians.
-Specialty certifications maintained according to department and role requirements.
-
-TECHNOLOGY STANDARDS
-Electronic health record system used for all patient documentation.
-Telemedicine capabilities available for remote consultations and monitoring.
-Clinical decision support tools integrated to assist with diagnosis and treatment decisions.
-
-Contact: Dr. Patricia Williams, Chief Medical Officer
-Email: pwilliams@metrohealthsystem.org
-Phone: (555) 987-6543"
--- a/demos/use_cases/http_filter/start_agents.sh
+++ b/demos/use_cases/http_filter/start_agents.sh
@ -1,78 +0,0 @@
-# #!/bin/bash
-# set -e
-
-# WAIT_FOR_PIDS=()
-
-# log() {
-#   timestamp=$(python3 -c 'from datetime import datetime; print(datetime.now().strftime("%Y-%m-%d %H:%M:%S,%f")[:23])')
-#   message="$*"
-#   echo "$timestamp - $message"
-# }
-
-# cleanup() {
-#     log "Caught signal, terminating all user processes ..."
-#     for PID in "${WAIT_FOR_PIDS[@]}"; do
-#         if kill $PID 2> /dev/null; then
-#             log "killed process: $PID"
-#         fi
-#     done
-#     exit 1
-# }
-
-# trap cleanup EXIT
-
-# log "Starting input_guards agent on port 10500/mcp..."
-# uv run python -m rag_agent --rest-server --host 0.0.0.0 --rest-port 10500 --agent input_guards &
-# WAIT_FOR_PIDS+=($!)
-
-# log "Starting query_rewriter agent on port 10501/mcp..."
-# uv run python -m rag_agent --rest-server --host 0.0.0.0 --rest-port 10501 --agent query_rewriter &
-# WAIT_FOR_PIDS+=($!)
-
-# log "Starting context_builder agent on port 10502/mcp..."
-# uv run python -m rag_agent --rest-server --host 0.0.0.0 --rest-port 10502 --agent context_builder &
-# WAIT_FOR_PIDS+=($!)
-
-# # log "Starting response_generator agent on port 10400..."
-# # uv run python -m rag_agent --host 0.0.0.0 --port 10400 --agent response_generator &
-# # WAIT_FOR_PIDS+=($!)
-
-# log "Starting response_generator agent on port 10505..."
-# uv run python -m rag_agent --rest-server --host 0.0.0.0 --rest-port 10505 --agent response_generator &
-# WAIT_FOR_PIDS+=($!)
-
-# for PID in "${WAIT_FOR_PIDS[@]}"; do
-#     wait "$PID"
-# done
-
-
-
-
-#!/bin/bash
-set -e
-
-export PYTHONPATH=/app/src
-
-pids=()
-
-log() { echo "$(date '+%F %T') - $*"; }
-
-log "Starting input_guards HTTP server on :10500"
-uv run uvicorn rag_agent.input_guards:app --host 0.0.0.0 --port 10500 &
-pids+=($!)
-
-log "Starting query_rewriter HTTP server on :10501"
-uv run uvicorn rag_agent.query_rewriter:app --host 0.0.0.0 --port 10501 &
-pids+=($!)
-
-log "Starting context_builder HTTP server on :10502"
-uv run uvicorn rag_agent.context_builder:app --host 0.0.0.0 --port 10502 &
-pids+=($!)
-
-log "Starting response_generator (OpenAI-compatible) on :10505"
-uv run uvicorn rag_agent.rag_agent:app --host 0.0.0.0 --port 10505 &
-pids+=($!)
-
-for PID in "${pids[@]}"; do
-    wait "$PID"
-done
--- a/demos/use_cases/http_filter/test.rest
+++ b/demos/use_cases/http_filter/test.rest
@ -1,92 +0,0 @@
-@baseUrl = http://0.0.0.0:10502
-@model = gpt-4o
-
-# Health Check
-GET {{baseUrl}}/health
-
-###
-
-# Test 1: Simple Non-Streaming Chat Completion
-POST {{baseUrl}}/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "{{model}}",
-  "messages": [
-    {
-      "role": "user",
-      "content": "Hello! Can you help me understand what machine learning is?"
-    }
-  ]
-}
-
-###
-
-# Test 2: Simple Streaming Chat Completion
-POST {{baseUrl}}/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "{{model}}",
-  "messages": [
-    {
-      "role": "user",
-      "content": "Explain the concept of artificial intelligence in simple terms."
-    }
-  ],
-  "stream": true
-}
-
-### Test 3
-POST http://localhost:8001/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "{{model}}",
-  "messages": [
-    {
-      "role": "user",
-      "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-    }
-  ],
-  "stream": true
-}
-
-### send request to query_rewriter agent
-POST http://localhost:10500/
-Content-Type: application/json
-
-[
-  {
-    "role": "user",
-    "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-  }
-]
-
-### test fast-llm
-POST http://localhost:12000/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "fast-llm",
-  "messages": [
-    {
-      "role": "user",
-      "content": "hello"
-    }
-  ]
-}
-
-### test smart-llm
-POST http://localhost:12000/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "smart-llm",
-  "messages": [
-    {
-      "role": "user",
-      "content": "hello"
-    }
-  ]
-}
--- a/demos/use_cases/http_filter/uv.lock
+++ b/demos/use_cases/http_filter/uv.lock
--- a/demos/use_cases/llm_routing/README.md
+++ b/demos/use_cases/llm_routing/README.md
@ -1,58 +0,0 @@
-# LLM Routing
-This demo shows how to use the Plano gateway to manage API keys and route requests to upstream LLMs.
-
-# Starting the demo
-1. Make sure the [prerequisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed
-1. Start Plano
-   ```sh
-   sh run_demo.sh
-   ```
-1. Navigate to http://localhost:18080/
-
-The following screenshot shows an example interaction with the Plano gateway that demonstrates dynamic routing. You can select between different LLMs using the "override model" option in the chat UI.
-
-![LLM Routing Demo](llm_routing_demo.png)
-
-You can also pass a header to override the model when sending a prompt. The following example shows how to use the `x-arch-llm-provider-hint` header to override model selection:
-
-```bash
-
-$ curl --header 'Content-Type: application/json' \
-  --header 'x-arch-llm-provider-hint: mistral/ministral-3b' \
-  --data '{"messages": [{"role": "user","content": "hello"}], "model": "gpt-4o"}' \
-  http://localhost:12000/v1/chat/completions 2> /dev/null | jq .
-{
-  "id": "xxx",
-  "object": "chat.completion",
-  "created": 1737760394,
-  "model": "ministral-3b-latest",
-  "choices": [
-    {
-      "index": 0,
-      "message": {
-        "role": "assistant",
-        "tool_calls": null,
-        "content": "Hello! How can I assist you today? Let's chat about anything you'd like. 😊"
-      },
-      "finish_reason": "stop"
-    }
-  ],
-  "usage": {
-    "prompt_tokens": 4,
-    "total_tokens": 25,
-    "completion_tokens": 21
-  }
-}
-
-```
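
The same header override can be sketched in Python using only the standard library. This is a minimal illustration, not part of the demo: the helper name `build_request` is hypothetical, and the URL, header name, and payload shape are copied from the curl example above.

```python
import json
import urllib.request
from typing import Optional

# Gateway address taken from the curl example above; adjust for your setup.
GATEWAY_URL = "http://localhost:12000/v1/chat/completions"


def build_request(
    prompt: str, model: str, provider_hint: Optional[str] = None
) -> urllib.request.Request:
    """Build a chat completion request, optionally overriding the model via header."""
    headers = {"Content-Type": "application/json"}
    if provider_hint:
        # Header name as used in the curl example above.
        headers["x-arch-llm-provider-hint"] = provider_hint
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    )
    return urllib.request.Request(
        GATEWAY_URL, data=body.encode("utf-8"), headers=headers, method="POST"
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (requires a running gateway)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the demo running, `send(build_request("hello", "gpt-4o", "mistral/ministral-3b"))["model"]` should report the model that actually served the request.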
-
-# Observability
-The Plano gateway publishes a stats endpoint at http://localhost:19901/stats. This demo uses Prometheus to pull stats from Plano and Grafana to visualize them in a dashboard. To view the Grafana dashboard, follow the steps below:
-
-1. Navigate to http://localhost:3000/ to open the Grafana UI (use admin/grafana as credentials)
-1. In the Grafana left nav, click Dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
-1. For tracing, head over to http://localhost:16686/ to view recent traces
-
-The following screenshot of the tracing UI shows a call received by the Plano gateway and the resulting upstream call to the LLM:
-
-![Jaeger Tracing](jaeger_tracing_llm_routing.png)
--- a/demos/use_cases/llm_routing/config.yaml
+++ b/demos/use_cases/llm_routing/config.yaml
@ -1,51 +0,0 @@
-version: v0.3.0
-
-listeners:
-  - type: model
-    name: model_1
-    address: 0.0.0.0
-    port: 12000
-    max_retries: 3
-
-model_providers:
-
-  - access_key: $OPENAI_API_KEY
-    model: openai/gpt-4o-mini
-
-  - access_key: $OPENAI_API_KEY
-    model: openai/gpt-4.1
-
-  - access_key: $OPENAI_API_KEY
-    model: openai/gpt-4o
-    default: true
-
-  - access_key: $MISTRAL_API_KEY
-    model: mistral/ministral-3b-latest
-
-  - access_key: $ANTHROPIC_API_KEY
-    model: anthropic/claude-3-7-sonnet-latest
-
-  - access_key: $ANTHROPIC_API_KEY
-    model: anthropic/claude-sonnet-4-0
-
-  - access_key: $DEEPSEEK_API_KEY
-    model: deepseek/deepseek-reasoner
-
-  - access_key: $GROQ_API_KEY
-    model: groq/llama-3.1-8b-instant
-
-  - access_key: $GEMINI_API_KEY
-    model: gemini/gemini-1.5-pro-latest
-
-  - model: xai/grok-4-latest
-    access_key: $GROK_API_KEY
-
-  - model: together_ai/openai/gpt-oss-20b
-    access_key: $TOGETHER_API_KEY
-
-  - model: custom/test-model
-    base_url: http://host.docker.internal:11223
-    provider_interface: openai
-
-tracing:
-  random_sampling: 100
--- a/demos/use_cases/llm_routing/docker-compose.yaml
+++ b/demos/use_cases/llm_routing/docker-compose.yaml
@ -1,49 +0,0 @@
-services:
-
-  plano:
-    build:
-      context: ../../../
-      dockerfile: Dockerfile
-    ports:
-      - "12000:12000"
-      - "12001:12001"
-    environment:
-      - PLANO_CONFIG_PATH=/app/plano_config.yaml
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-      - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml:ro
-      - /etc/ssl/cert.pem:/etc/ssl/cert.pem
-
-  anythingllm:
-    image: mintplexlabs/anythingllm
-    restart: always
-    ports:
-      - "3001:3001"
-    cap_add:
-      - SYS_ADMIN
-    environment:
-      - STORAGE_DIR=/app/server/storage
-      - LLM_PROVIDER=generic-openai
-      - GENERIC_OPEN_AI_BASE_PATH=http://plano:12000/v1
-      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
-      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
-      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
-
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    ports:
-      - "16686:16686"
-      - "4317:4317"
-      - "4318:4318"
-
-  prometheus:
-    build:
-      context: ../../shared/prometheus
-
-  grafana:
-    build:
-      context: ../../shared/grafana
-    ports:
-      - "3000:3000"
--- a/demos/use_cases/llm_routing/jaeger_tracing_llm_routing.png
+++ b/demos/use_cases/llm_routing/jaeger_tracing_llm_routing.png
--- a/demos/use_cases/llm_routing/llm_routing_demo.png
+++ b/demos/use_cases/llm_routing/llm_routing_demo.png
--- a/demos/use_cases/llm_routing/run_demo.sh
+++ b/demos/use_cases/llm_routing/run_demo.sh
@ -1,47 +0,0 @@
-#!/bin/bash
-set -e
-
-# Function to start the demo
-start_demo() {
-  # Step 1: Check if .env file exists
-  if [ -f ".env" ]; then
-    echo ".env file already exists. Skipping creation."
-  else
-    # Step 2: Create `.env` file and set OpenAI key
-    if [ -z "$OPENAI_API_KEY" ]; then
-      echo "Error: OPENAI_API_KEY environment variable is not set for the demo."
-      exit 1
-    fi
-
-    echo "Creating .env file..."
-    echo "OPENAI_API_KEY=$OPENAI_API_KEY" > .env
-    echo ".env file created with OPENAI_API_KEY."
-  fi
-
-  # Step 3: Start Plano
-  echo "Starting Plano with config.yaml..."
-  planoai up config.yaml
-
-  # Step 4: Start LLM Routing
-  echo "Starting LLM Routing using Docker Compose..."
-  docker compose up -d  # Run in detached mode
-}
-
-# Function to stop the demo
-stop_demo() {
-  # Step 1: Stop Docker Compose services
-  echo "Stopping LLM Routing using Docker Compose..."
-  docker compose down
-
-  # Step 2: Stop Plano
-  echo "Stopping Plano..."
-  planoai down
-}
-
-# Main script logic
-if [ "$1" == "down" ]; then
-  stop_demo
-else
-  # Default action is to bring the demo up
-  start_demo
-fi
--- a/demos/use_cases/mcp_filter/Dockerfile
+++ b/demos/use_cases/mcp_filter/Dockerfile
@ -1,26 +0,0 @@
-FROM python:3.14-slim
-
-WORKDIR /app
-
-# Install bash and uv
-RUN apt-get update && apt-get install -y bash && rm -rf /var/lib/apt/lists/*
-RUN pip install --no-cache-dir uv
-
-# Copy dependency files
-COPY pyproject.toml README.md ./
-
-# Copy source code
-COPY src/ ./src/
-COPY start_agents.sh ./
-
-# Install dependencies using uv
-RUN uv pip install --system --no-cache click fastmcp pydantic fastapi uvicorn openai
-
-# Make start script executable
-RUN chmod +x start_agents.sh
-
-# Expose ports for all agents
-EXPOSE 10500 10501 10502 10505
-
-# Run the start script with bash
-CMD ["bash", "./start_agents.sh"]
--- a/demos/use_cases/mcp_filter/README.md
+++ b/demos/use_cases/mcp_filter/README.md
@ -1,128 +0,0 @@
-# RAG Agent Demo
-
-A multi-agent RAG system demonstrating Plano's agent filter chain with the MCP protocol.
-
-## Architecture
-
-This demo consists of four components:
-1. **Input Guards** (MCP filter) - Validates queries are within TechCorp's domain
-2. **Query Rewriter** (MCP filter) - Rewrites user queries for better retrieval
-3. **Context Builder** (MCP filter) - Retrieves relevant context from knowledge base
-4. **RAG Agent** (REST) - Generates final responses based on augmented context
-
-## Components
-
-### Input Guards Filter (MCP)
- **Port**: 10500
- **Tool**: `input_guards`
- Validates queries are within TechCorp's domain
- Rejects queries about other companies or unrelated topics
-
-### Query Rewriter Filter (MCP)
- **Port**: 10501
- **Tool**: `query_rewriter`
- Improves queries using LLM before retrieval
-
-### Context Builder Filter (MCP)
- **Port**: 10502
- **Tool**: `context_builder`
- Augments queries with relevant passages from knowledge base
-
-### RAG Agent (REST/OpenAI)
- **Port**: 10505
- **Endpoint**: `/v1/chat/completions`
- Generates responses using OpenAI-compatible API
-
-## Quick Start
-
-### 1. Start everything with Docker Compose
-```bash
-docker compose up --build
-```
-
-This brings up:
- Input Guards MCP server on port 10500
- Query Rewriter MCP server on port 10501
- Context Builder MCP server on port 10502
- RAG Agent REST server on port 10505
- Plano listener on port 8001 (and gateway on 12000)
- Jaeger UI for viewing traces at http://localhost:16686
- AnythingLLM at http://localhost:3001 for interactive queries
-
-> Set `OPENAI_API_KEY` in your environment before running; `LLM_GATEWAY_ENDPOINT` defaults to `http://host.docker.internal:12000/v1`.
-
-### 2. Test the system
-
-**Option A: Using AnythingLLM (recommended)**
-
-Navigate to http://localhost:3001 and send queries through the chat interface.
-
-**Option B: Using curl**
-```bash
-curl -X POST http://localhost:8001/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-4o",
-    "messages": [{"role": "user", "content": "What is the guaranteed uptime for TechCorp?"}]
-  }'
-```
-
-## Configuration
-
-The `config.yaml` defines how agents are connected:
-
-```yaml
-filters:
-  - id: input_guards
-    url: http://host.docker.internal:10500
-    # type: mcp (default)
-    # tool: input_guards (default - same as filter id)
-
-  - id: query_rewriter
-    url: http://host.docker.internal:10501
-    # type: mcp (default)
-
-  - id: context_builder
-    url: http://host.docker.internal:10502
-```
-
-## How It Works
-
-1. User sends a request to the Plano listener on port 8001
-2. Request passes through MCP filter chain:
-   - **Input Guards** validates the query is within TechCorp's domain
-   - **Query Rewriter** rewrites the query for better retrieval
-   - **Context Builder** augments query with relevant knowledge base passages
-3. Augmented request is forwarded to **RAG Agent** REST endpoint
-4. RAG Agent generates final response using LLM
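
The steps above amount to passing the message list through each filter in order, where any filter may reject or rewrite the request before it reaches the agent. The sketch below illustrates that idea only. Every function body is a hypothetical stand-in (the real filters call an LLM and a knowledge base), and `run_filter_chain` is not Plano's actual implementation.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Filter = Callable[[List[Message]], List[Message]]


class QueryRejected(Exception):
    """Raised by a filter to stop the chain (hypothetical)."""


def _replace_last(messages: List[Message], content: str) -> List[Message]:
    # Return a new list with only the last message's content changed.
    return messages[:-1] + [{**messages[-1], "content": content}]


def input_guards(messages: List[Message]) -> List[Message]:
    # Stand-in domain check: accept only queries that mention TechCorp.
    if "techcorp" not in messages[-1]["content"].lower():
        raise QueryRejected("query is outside TechCorp's domain")
    return messages


def query_rewriter(messages: List[Message]) -> List[Message]:
    # Stand-in rewrite: collapse repeated whitespace.
    return _replace_last(messages, " ".join(messages[-1]["content"].split()))


def context_builder(messages: List[Message]) -> List[Message]:
    # Stand-in retrieval: prepend a placeholder context block.
    context = "Context: (retrieved passages would go here)"
    return _replace_last(messages, f"{context}\n\n{messages[-1]['content']}")


def run_filter_chain(messages: List[Message], chain: List[Filter]) -> List[Message]:
    """Apply each filter in order; the result is what the agent would receive."""
    for apply_filter in chain:
        messages = apply_filter(messages)
    return messages
```

The order matters: guarding first avoids spending rewrite and retrieval work on queries that would be rejected anyway.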
-
-## Additional Configuration
-
-See `config.yaml` for the complete filter chain setup. The MCP filters use default settings:
- `type: mcp` (default)
- `transport: streamable-http` (default)
- Tool name defaults to filter ID
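
Because the tool name defaults to the filter ID, a call to a filter is an MCP `tools/call` request whose `name` matches the filter's `id`. The sketch below builds such a payload and the accompanying headers, modeled on the requests in `mcp_query.rest`. The helper names are hypothetical, and the session ID is assumed to come from a prior `initialize` exchange.

```python
import json
import uuid
from typing import Dict, List, Optional


def build_tool_call(
    filter_id: str, messages: List[Dict[str, str]], request_id: Optional[str] = None
) -> dict:
    """Build a JSON-RPC 2.0 tools/call payload for an MCP filter."""
    return {
        "jsonrpc": "2.0",
        "id": request_id or str(uuid.uuid4()),
        "method": "tools/call",
        "params": {
            # Tool name defaults to the filter ID, e.g. "query_rewriter".
            "name": filter_id,
            "arguments": {"messages": messages},
        },
    }


def mcp_headers(session_id: str) -> Dict[str, str]:
    """Headers as used in mcp_query.rest for streamable-http requests."""
    return {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        "mcp-session-id": session_id,
    }


payload = build_tool_call(
    "query_rewriter",
    [{"role": "user", "content": "What is the guaranteed uptime for TechCorp?"}],
)
print(json.dumps(payload, indent=2))
```

The payload would be POSTed to the filter's `/mcp` endpoint, for example `http://localhost:10501/mcp` for the query rewriter.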
-
-See `sample_queries.md` for example queries to test the RAG system.
-
-Example request:
-```bash
-curl -X POST http://localhost:8001/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-4o",
-    "messages": [
-      {
-        "role": "user",
-        "content": "What is the guaranteed uptime for TechCorp?"
-      }
-    ]
-  }'
-```
- `LLM_GATEWAY_ENDPOINT` - Plano endpoint (default: `http://localhost:12000/v1`)
- `OPENAI_API_KEY` - OpenAI API key for model providers
-
-## Additional Resources
-
- See `sample_queries.md` for more example queries
- See `config.yaml` for complete configuration details
--- a/demos/use_cases/mcp_filter/config.yaml
+++ b/demos/use_cases/mcp_filter/config.yaml
@ -1,47 +0,0 @@
-version: v0.3.0
-
-agents:
-  - id: rag_agent
-    url: http://host.docker.internal:10505
-
-filters:
-  - id: input_guards
-    url: http://host.docker.internal:10500
-    # type: mcp (default)
-    # transport: streamable-http (default)
-    # tool: input_guards (default - same as filter id)
-  - id: query_rewriter
-    url: http://host.docker.internal:10501
-    # type: mcp (default)
-    # transport: streamable-http (default)
-    # tool: query_rewriter (default - same as filter id)
-  - id: context_builder
-    url: http://host.docker.internal:10502
-
-model_providers:
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY
-    default: true
-  - model: openai/gpt-4o
-    access_key: $OPENAI_API_KEY
-
-model_aliases:
-  fast-llm:
-    target: gpt-4o-mini
-  smart-llm:
-    target: gpt-4o
-
-listeners:
-  - type: agent
-    name: agent_1
-    port: 8001
-    router: plano_orchestrator_v1
-    agents:
-      - id: rag_agent
-        description: virtual assistant for retrieval augmented generation tasks
-        filter_chain:
-          - input_guards
-          - query_rewriter
-          - context_builder
-tracing:
-  random_sampling: 100
--- a/demos/use_cases/mcp_filter/docker-compose.yaml
+++ b/demos/use_cases/mcp_filter/docker-compose.yaml
@ -1,49 +0,0 @@
-services:
-  rag-agents:
-    build:
-      context: .
-      dockerfile: Dockerfile
-    ports:
-      - "10500:10500"
-      - "10501:10501"
-      - "10502:10502"
-      - "10505:10505"
-    environment:
-      - LLM_GATEWAY_ENDPOINT=${LLM_GATEWAY_ENDPOINT:-http://host.docker.internal:12000/v1}
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-  plano:
-    build:
-      context: ../../../
-      dockerfile: Dockerfile
-    ports:
-      - "11000:11000"
-      - "12001:12001"
-      - "12000:12000"
-      - "8001:8001"
-    environment:
-      - PLANO_CONFIG_PATH=/app/plano_config.yaml
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml
-      - /etc/ssl/cert.pem:/etc/ssl/cert.pem
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    ports:
-      - "16686:16686"
-      - "4317:4317"
-      - "4318:4318"
-  anythingllm:
-    image: mintplexlabs/anythingllm
-    restart: always
-    ports:
-      - "3001:3001"
-    cap_add:
-      - SYS_ADMIN
-    environment:
-      - STORAGE_DIR=/app/server/storage
-      - LLM_PROVIDER=generic-openai
-      - GENERIC_OPEN_AI_BASE_PATH=http://plano:8001/v1
-      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
-      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
-      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
--- a/demos/use_cases/mcp_filter/mcp_query.rest
+++ b/demos/use_cases/mcp_filter/mcp_query.rest
@ -1,86 +0,0 @@
-### Initialize MCP Session (SSE)
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-
-{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"capabilities":{},"protocolVersion":"2024-11-05","clientInfo":{"name":"test","version":"1.0.0"}}}
-
-### Send Initialized Notification
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-mcp-session-id: 35d455dc07b8400887f86668590f12bb
-
-{
-  "jsonrpc": "2.0",
-  "method": "notifications/initialized"
-}
-
-### List Tools
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-mcp-session-id: eb10a691b36e4547b6c93c5dc5b47e11
-
-{
-  "jsonrpc": "2.0",
-  "id": "list-tools-1",
-  "method": "tools/list"
-}
-
-### Call Query Rewriter Tool
-POST http://localhost:10501/mcp
-Content-Type: application/json
-Accept: application/json, text/event-stream
-mcp-session-id: 6b95ff75825a402b90eb3ea07e23fbce
-
-{
-  "jsonrpc": "2.0",
-  "id": "3d3b886a-6216-4a26-a422-7a972529c0e7",
-  "method": "tools/call",
-  "params": {
-    "arguments": {
-      "messages": [
-        {
-          "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?",
-          "role": "user"
-        }
-      ]
-    },
-    "name": "query_rewriter"
-  }
-}
-
-### another test
-
-# Content-Type: application/json
-# Accept: application/json, text/event-stream
-# mcp-session-id: ed7a81a1d39549ecaadb867a6b2daf1e
-
-POST http://localhost:10501/mcp
-content-type: application/json
-mcp-session-id: e4ec1ae904e14e06b7d194da10e5f74c
-accept: application/json, text/event-stream
-
-{"jsonrpc":"2.0","id":"4bb1043a-2953-4bcd-b801-f270b0ae8c39","method":"tools/call","params":{"arguments":{"messages":[{"content":"What is the guaranteed uptime percentage for TechCorp's cloud services?","role":"user"}]},"name":"query_rewriter"}}
-
-
-
-### stream test
-
-POST http://localhost:10501/mcp
-content-type: application/json
-mcp-session-id: 35d455dc07b8400887f86668590f12bb
-accept: application/json, text/event-stream
-
-{
-  "jsonrpc": "2.0",
-  "id": 1,
-  "method": "tools/call",
-  "params": {
-    "name": "long_job",
-    "arguments": {
-      "n": 3
-    }
-  }
-}
--- a/demos/use_cases/mcp_filter/pyproject.toml
+++ b/demos/use_cases/mcp_filter/pyproject.toml
@ -1,22 +0,0 @@
-[project]
-name = "rag_agent"
-version = "0.1.0"
-description = "RAG Agent"
-readme = "README.md"
-requires-python = ">=3.10"
-dependencies = [
-    "click>=8.2.1",
-    "mcp>=1.13.1",
-    "fastmcp>=2.14",
-    "pydantic>=2.11.7",
-    "fastapi>=0.104.1",
-    "uvicorn>=0.24.0",
-    "openai==2.13.0",
-]
-
-[project.scripts]
-rag_agent = "rag_agent:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
--- a/demos/use_cases/mcp_filter/sample_queries.md
+++ b/demos/use_cases/mcp_filter/sample_queries.md
@ -1,64 +0,0 @@
-# Sample Queries for Knowledge Base RAG Agent
-
-## Service Level Agreement Queries
- What is the guaranteed uptime percentage for TechCorp's cloud services?
- What remedies are available if the API response time exceeds the agreed threshold?
- How quickly must TechCorp respond to critical support issues?
- What monitoring and reporting requirements are specified in the SLA?
- When was the TechCorp service agreement signed and by whom?
-
-## Privacy Policy Queries
- What encryption methods does DataSecure use to protect data?
- How long does DataSecure retain personal data after account deletion?
- What rights do users have regarding their personal information?
- Can DataSecure sell user data to third parties for marketing?
- Who should be contacted for privacy-related concerns at DataSecure?
-
-## Supply Chain Agreement Queries
- What types of automotive components does PrecisionParts supply?
- What are the payment terms and volume discount structure?
- What quality standards must the supplied components meet?
- What are the penalties for late delivery?
- What insurance coverage requirements apply to the supplier?
-
-## Student Data Management Queries
- What federal laws must EduTech comply with regarding student data?
- What security measures are in place to protect student information?
- How long are student records retained after graduation?
- What consent is required for students under 13 years old?
- Who can access student educational records?
-
-## Investment Advisory Queries
- What is FinanceFirst's management fee structure?
- What types of investments are included in the advisory services?
- What regulatory body oversees FinanceFirst Advisors?
- How often are portfolio reviews conducted?
- What are the client's responsibilities under this agreement?
-
-## Healthcare Standards Queries
- What is the target response time for emergency code teams?
- What hand hygiene compliance rate is required?
- How quickly must medical records be completed after patient encounters?
- What continuing education requirements apply to nursing staff?
- What patient safety protocols are mandatory upon admission?
-
-## Cross-Document Queries
- Which agreements include confidentiality or data protection provisions?
- What are the common termination notice periods across different contract types?
- Which documents specify insurance or liability coverage requirements?
- What compliance and regulatory requirements are mentioned across agreements?
- Which contracts include performance metrics or service level commitments?
-
-## Complex Analysis Queries
- Compare the data retention policies across the privacy policy and student data management documents.
- What are the different approaches to risk management across the supply chain and investment advisory agreements?
- How do the security measures in the healthcare standards compare to those in the privacy policy?
- Which agreements provide the most detailed compliance and regulatory frameworks?
- What common themes exist in the quality assurance requirements across different industries?
-
-## Document-Specific Detail Queries
- List all the specific percentages, timeframes, and numerical requirements mentioned in the SLA.
- What are all the contact persons and their roles mentioned across the documents?
- Identify all the compliance standards and certifications referenced in the supply chain agreement.
- What are the specific consequences or penalties mentioned for non-compliance across agreements?
- List all the third-party systems, tools, or services mentioned in the documents.
--- a/demos/use_cases/mcp_filter/src/rag_agent/__init__.py
+++ b/demos/use_cases/mcp_filter/src/rag_agent/__init__.py
@ -1,109 +0,0 @@
-import click
-from fastmcp import FastMCP
-
-mcp = None
-
-
-@click.command()
-@click.option(
-    "--transport",
-    "transport",
-    default="streamable-http",
-    help="Transport type: stdio, sse, or streamable-http",
-)
-@click.option("--host", "host", default="localhost", help="Host to bind MCP server to")
-@click.option("--port", "port", type=int, default=10500, help="Port for MCP server")
-@click.option(
-    "--agent",
-    "agent",
-    required=True,
-    help="Agent name: input_guards, query_rewriter, context_builder, or response_generator",
-)
-@click.option(
-    "--name",
-    "agent_name",
-    default=None,
-    help="Custom MCP server name (defaults to agent type)",
-)
-@click.option(
-    "--rest-server",
-    "rest_server",
-    is_flag=True,
-    help="Start REST server instead of MCP server",
-)
-@click.option("--rest-port", "rest_port", default=8000, help="Port for REST server")
-def main(host, port, agent, transport, agent_name, rest_server, rest_port):
-    """Start a RAG agent as an MCP server or REST server."""
-
-    # Map friendly names to agent modules
-    agent_map = {
-        "input_guards": ("rag_agent.input_guards", "Input Guards Agent"),
-        "query_rewriter": ("rag_agent.query_rewriter", "Query Rewriter Agent"),
-        "context_builder": ("rag_agent.context_builder", "Context Builder Agent"),
-        "response_generator": (
-            "rag_agent.rag_agent",
-            "Response Generator Agent",
-        ),
-    }
-
-    if agent not in agent_map:
-        print(f"Error: Unknown agent '{agent}'")
-        print(f"Available agents: {', '.join(agent_map.keys())}")
-        return
-
-    module_name, default_name = agent_map[agent]
-    mcp_name = agent_name or default_name
-
-    if rest_server:
-        # REST server mode - supported for query_rewriter and response_generator
-        if agent == "response_generator":
-            print(f"Starting REST server on {host}:{rest_port} for agent: {agent}")
-            from rag_agent.rag_agent import start_server
-
-            start_server(host=host, port=rest_port)
-            return
-        elif agent == "query_rewriter":
-            print(f"Starting REST server on {host}:{rest_port} for agent: {agent}")
-            from rag_agent.query_rewriter import start_server
-
-            start_server(host=host, port=rest_port)
-            return
-        else:
-            print(f"Error: Agent '{agent}' does not support REST server mode.")
-            print(
-                "REST server is only supported for: query_rewriter, response_generator"
-            )
-            print(f"Remove --rest-server flag to start {agent} as an MCP server.")
-            return
-    else:
-        # Only input_guards, query_rewriter and context_builder support MCP
-        if agent not in ["input_guards", "query_rewriter", "context_builder"]:
-            print(f"Error: Agent '{agent}' does not support MCP mode.")
-            print(
-                "MCP is only supported for: input_guards, query_rewriter, context_builder"
-            )
-            print(f"Use --rest-server flag to start {agent} as a REST server.")
-            return
-
-        global mcp
-        mcp = FastMCP(mcp_name, host=host, port=port)
-
-        print(f"Starting MCP server: {mcp_name}")
-        print(f"  Agent: {agent}")
-        print(f"  Transport: {transport}")
-        print(f"  Host: {host}")
-        print(f"  Port: {port}")
-
-        # Import the agent module to register its tools
-        import importlib
-
-        importlib.import_module(module_name)
-
-        print(f"Agent '{agent}' loaded successfully")
-        print(f"MCP server ready on {transport}://{host}:{port}")
-
-        mcp.run(transport=transport)
-
-
-if __name__ == "__main__":
-    main()
--- a/demos/use_cases/mcp_filter/src/rag_agent/main.py
+++ b/demos/use_cases/mcp_filter/src/rag_agent/main.py
@ -1,4 +0,0 @@
-from . import main
-
-if __name__ == "__main__":
-    main()
--- a/demos/use_cases/mcp_filter/src/rag_agent/api.py
+++ b/demos/use_cases/mcp_filter/src/rag_agent/api.py
@ -1,36 +0,0 @@
-from pydantic import BaseModel
-from typing import List, Optional, Dict, Any
-
-
-class ChatMessage(BaseModel):
-    role: str
-    content: str
-
-
-class ChatCompletionRequest(BaseModel):
-    model: str
-    messages: List[ChatMessage]
-    temperature: Optional[float] = 1.0
-    max_tokens: Optional[int] = None
-    top_p: Optional[float] = 1.0
-    frequency_penalty: Optional[float] = 0.0
-    presence_penalty: Optional[float] = 0.0
-    stream: Optional[bool] = False
-    stop: Optional[List[str]] = None
-
-
-class ChatCompletionResponse(BaseModel):
-    id: str
-    object: str = "chat.completion"
-    created: int
-    model: str
-    choices: List[Dict[str, Any]]
-    usage: Dict[str, int]
-
-
-class ChatCompletionStreamResponse(BaseModel):
-    id: str
-    object: str = "chat.completion.chunk"
-    created: int
-    model: str
-    choices: List[Dict[str, Any]]
--- a/demos/use_cases/mcp_filter/src/rag_agent/context_builder.py
+++ b/demos/use_cases/mcp_filter/src/rag_agent/context_builder.py
@ -1,224 +0,0 @@
-import json
-from typing import List, Optional, Dict, Any
-from openai import AsyncOpenAI
-import os
-import logging
-import csv
-from pathlib import Path
-
-from .api import ChatMessage
-from . import mcp
-from fastmcp.server.dependencies import get_http_headers
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [CONTEXT_BUILDER]    - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-RAG_MODEL = "gpt-4o-mini"
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-# Global variable to store the knowledge base
-knowledge_base = []
-
-
-def load_knowledge_base():
-    """Load the sample_knowledge_base.csv file into memory on startup."""
-    global knowledge_base
-
-    # Get the path to the CSV file relative to this script
-    current_dir = Path(__file__).parent
-    csv_path = current_dir / "sample_knowledge_base.csv"
-
-    print(f"Loading knowledge base from {csv_path}")
-
-    try:
-        knowledge_base = []
-        with open(csv_path, "r", encoding="utf-8-sig") as file:
-            csv_reader = csv.DictReader(file)
-            for row in csv_reader:
-                knowledge_base.append({"path": row["path"], "content": row["content"]})
-
-        logger.info(f"Loaded {len(knowledge_base)} documents from knowledge base")
-
-    except Exception as e:
-        logger.error(f"Error loading knowledge base: {e}")
-        knowledge_base = []
-
-
-async def find_relevant_passages(
-    query: str,
-    traceparent: Optional[str] = None,
-    request_id: Optional[str] = None,
-    top_k: int = 3,
-) -> List[Dict[str, str]]:
-    """Use the LLM to find the most relevant passages from the knowledge base."""
-
-    if not knowledge_base:
-        logger.warning("Knowledge base is empty")
-        return []
-
-    # Create a system prompt for passage selection
-    system_prompt = f"""You are a retrieval assistant that selects the most relevant document passages for a given query.
-
-                    Given a user query and a list of document passages, identify the {top_k} most relevant passages that would help answer the query.
-
-                    Query: {query}
-
-                    Available passages:
-                    """
-
-    # Add all passages with indices
-    for i, doc in enumerate(knowledge_base):
-        system_prompt += (
-            f"\n[{i}] Path: {doc['path']}\nContent: {doc['content'][:500]}...\n"
-        )
-
-    system_prompt += f"""
-
-        Please respond with ONLY the indices of the {top_k} most relevant passages, separated by commas (e.g., "0,3,7").
-        If fewer than {top_k} passages are relevant, return only the relevant ones.
-        If no passages are relevant, return "NONE"."""
-
-    try:
-        # Call Plano to select relevant passages
-        logger.info(f"Calling Plano to find relevant passages for query: '{query}'")
-
-        # Always ask Envoy to retry; forward request ID and traceparent when provided
-        extra_headers = {
-            "x-envoy-max-retries": "3",
-        }
-        if request_id:
-            extra_headers["x-request-id"] = request_id
-        if traceparent:
-            extra_headers["traceparent"] = traceparent
-
-        response = await plano_client.chat.completions.create(
-            model=RAG_MODEL,
-            messages=[{"role": "system", "content": system_prompt}],
-            temperature=0.1,
-            max_tokens=50,
-            extra_headers=extra_headers,
-        )
-
-        result = response.choices[0].message.content.strip()
-        logger.info(f"LLM selected passages: {result}")
-
-        # Parse the indices
-        if result.upper() == "NONE":
-            return []
-
-        selected_passages = []
-        indices = [
-            int(idx.strip()) for idx in result.split(",") if idx.strip().isdigit()
-        ]
-
-        for idx in indices:
-            if 0 <= idx < len(knowledge_base):
-                selected_passages.append(knowledge_base[idx])
-
-        logger.info(f"Selected {len(selected_passages)} relevant passages")
-        return selected_passages
-
-    except Exception as e:
-        logger.error(f"Error finding relevant passages: {e}")
-        return []
-
-
-async def augment_query_with_context(
-    messages: List[ChatMessage],
-    traceparent: Optional[str] = None,
-    request_id: Optional[str] = None,
-) -> List[ChatMessage]:
-    """Extract user query, find relevant context, and augment the messages."""
-
-    # Find the last user message
-    last_user_message = None
-    last_user_index = -1
-
-    for i in range(len(messages) - 1, -1, -1):
-        if messages[i].role == "user":
-            last_user_message = messages[i].content
-            last_user_index = i
-            break
-
-    if not last_user_message:
-        logger.warning("No user message found in conversation")
-        return messages
-
-    logger.info(f"Processing user query: '{last_user_message}'")
-
-    # Find relevant passages
-    relevant_passages = await find_relevant_passages(
-        last_user_message, traceparent, request_id
-    )
-
-    if not relevant_passages:
-        logger.info("No relevant passages found, returning original messages")
-        return messages
-
-    # Build context from relevant passages
-    context_parts = []
-    for i, passage in enumerate(relevant_passages):
-        context_parts.append(
-            f"Document {i+1} ({passage['path']}):\n{passage['content']}"
-        )
-
-    context = "\n\n".join(context_parts)
-
-    # Create augmented content with original query and context
-    augmented_content = f"""{last_user_message} RELEVANT CONTEXT:
-    {context}"""
-
-    # Create updated messages with the augmented query
-    updated_messages = messages.copy()
-    updated_messages[last_user_index] = ChatMessage(
-        role="user", content=augmented_content
-    )
-
-    logger.info(f"Augmented user query with {len(relevant_passages)} relevant passages")
-
-    return updated_messages
-
-
-# Load knowledge base on module import
-load_knowledge_base()
-
-
-async def context_builder(messages: List[ChatMessage]) -> List[ChatMessage]:
-    """MCP tool that augments user queries with relevant context from the knowledge base."""
-    logger.info(f"Received chat completion request with {len(messages)} messages")
-
-    # Get traceparent header from MCP request
-    headers = get_http_headers()
-    traceparent_header = headers.get("traceparent")
-    request_id = headers.get("x-request-id")
-    logger.info(f"Received request ID: {request_id}")
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    # Augment the user query with relevant context
-    updated_messages = await augment_query_with_context(
-        messages, traceparent_header, request_id
-    )
-
-    # Return plain dicts (not ChatMessage objects) to minimize text serialization
-    return [{"role": msg.role, "content": msg.content} for msg in updated_messages]
-
-
-# Register MCP tool only if mcp is available
-if mcp is not None:
-    mcp.tool()(context_builder)
--- a/demos/use_cases/mcp_filter/src/rag_agent/input_guards.py
+++ b/demos/use_cases/mcp_filter/src/rag_agent/input_guards.py
@ -1,161 +0,0 @@
-import asyncio
-import json
-import time
-from typing import List, Optional, Dict, Any
-import uuid
-from fastapi import FastAPI, Depends, Request
-from fastmcp.exceptions import ToolError
-from openai import AsyncOpenAI
-import os
-import logging
-
-from .api import ChatCompletionRequest, ChatCompletionResponse, ChatMessage
-from . import mcp
-from fastmcp.server.dependencies import get_http_headers
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [INPUT_GUARDS]       - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-GUARD_MODEL = "gpt-4o-mini"
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-app = FastAPI()
-
-
-async def validate_query_scope(
-    messages: List[ChatMessage],
-    traceparent_header: str,
-    request_id: Optional[str] = None,
-) -> Dict[str, Any]:
-    """Validate that the user query is within TechCorp's domain.
-
-    Returns a dict with:
-        - is_valid: bool indicating if query is within scope
-        - reason: str explaining why query is out of scope (if applicable)
-    """
-    system_prompt = """You are an input validation guard for TechCorp's customer support system.
-
-Your job is to determine if a user's query is related to TechCorp and its services/products.
-
-TechCorp is a technology company that provides:
- Cloud services and infrastructure
- SaaS products
- Technical support
- Service level agreements (SLAs)
- Uptime guarantees
- Enterprise solutions
-
-ALLOW queries about:
- TechCorp's services, products, or offerings
- TechCorp's pricing, SLAs, uptime, or policies
- Technical support for TechCorp products
- General questions about TechCorp as a company
-
-REJECT queries about:
- Other companies or their products
- General knowledge questions unrelated to TechCorp
- Personal advice or topics outside TechCorp's domain
- Anything that doesn't relate to TechCorp's business
-
-Respond in JSON format:
-{
-    "is_valid": true/false,
-    "reason": "brief explanation if invalid"
-}"""
-
-    # Get the last user message for validation
-    last_user_message = None
-    for msg in reversed(messages):
-        if msg.role == "user":
-            last_user_message = msg.content
-            break
-
-    if not last_user_message:
-        return {"is_valid": True, "reason": ""}
-
-    # Prepare messages for the guard
-    guard_messages = [
-        {"role": "system", "content": system_prompt},
-        {"role": "user", "content": f"Query to validate: {last_user_message}"},
-    ]
-
-    try:
-        # Call Plano using OpenAI client
-        extra_headers = {"x-envoy-max-retries": "3"}
-        if traceparent_header:
-            extra_headers["traceparent"] = traceparent_header
-
-        if request_id:
-            extra_headers["x-request-id"] = request_id
-
-        logger.info(f"Validating query scope: '{last_user_message}'")
-        response = await plano_client.chat.completions.create(
-            model=GUARD_MODEL,
-            messages=guard_messages,
-            temperature=0.1,
-            max_tokens=150,
-            extra_headers=extra_headers,
-        )
-
-        result_text = response.choices[0].message.content.strip()
-
-        # Parse JSON response
-        try:
-            result = json.loads(result_text)
-            logger.info(f"Validation result: {result}")
-            return result
-        except json.JSONDecodeError:
-            logger.error(f"Failed to parse validation response: {result_text}")
-            # Default to allowing if parsing fails
-            return {"is_valid": True, "reason": ""}
-
-    except Exception as e:
-        logger.error(f"Error validating query: {e}")
-        # Default to allowing if validation fails
-        return {"is_valid": True, "reason": ""}
-
-
-@mcp.tool
-async def input_guards(messages: List[ChatMessage]) -> List[ChatMessage]:
-    """Input guard that validates queries are within TechCorp's domain.
-
-    If the query is out of scope, replaces the user message with a rejection notice.
-    """
-    logger.info(f"Received request with {len(messages)} messages")
-
-    # Get traceparent header from HTTP request using FastMCP's dependency function
-    headers = get_http_headers()
-    traceparent_header = headers.get("traceparent")
-    request_id = headers.get("x-request-id")
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    # Validate the query scope
-    validation_result = await validate_query_scope(
-        messages, traceparent_header, request_id
-    )
-
-    if not validation_result.get("is_valid", True):
-        reason = validation_result.get("reason", "Query is outside TechCorp's domain")
-        logger.warning(f"Query rejected: {reason}")
-
-        # Reject the query by raising ToolError
-        error_message = f"I apologize, but I can only assist with questions related to TechCorp and its services. Your query appears to be outside this scope. {reason}\n\nPlease ask me about TechCorp's products, services, pricing, SLAs, or technical support."
-        raise ToolError(error_message)
-
-    logger.info("Query validation passed - forwarding to next filter")
-    return messages
--- a/demos/use_cases/mcp_filter/src/rag_agent/query_rewriter.py
+++ b/demos/use_cases/mcp_filter/src/rag_agent/query_rewriter.py
@ -1,165 +0,0 @@
-import asyncio
-import json
-import time
-from typing import List, Optional, Dict, Any
-import uuid
-from fastapi import FastAPI, Depends, Request
-from openai import AsyncOpenAI
-import os
-import logging
-
-from .api import ChatCompletionRequest, ChatCompletionResponse, ChatMessage
-from . import mcp
-from fastmcp.server.dependencies import get_http_headers
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [QUERY_REWRITER]     - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-QUERY_REWRITE_MODEL = "gpt-4o-mini"
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-app = FastAPI()
-
-
-async def rewrite_query_with_plano(
-    messages: List[ChatMessage],
-    traceparent_header: str,
-    request_id: Optional[str] = None,
-) -> str:
-    """Rewrite the user query using LLM for better retrieval."""
-    system_prompt = """You are a query rewriter that improves user queries for better retrieval.
-
-    Given a conversation history, rewrite the last user message to be more specific and context-aware.
-    The rewritten query should:
-    1. Include relevant context from previous messages
-    2. Be clear and specific for information retrieval
-    3. Maintain the user's intent
-    4. Be concise but comprehensive
-
-    Return only the rewritten query, nothing else."""
-
-    # Prepare messages for the query rewriter - just add system prompt to existing messages
-    rewrite_messages = [{"role": "system", "content": system_prompt}]
-
-    # Add conversation history
-    for msg in messages:
-        rewrite_messages.append({"role": msg.role, "content": msg.content})
-
-    try:
-        # Call Plano using OpenAI client
-        extra_headers = {"x-envoy-max-retries": "3"}
-        if traceparent_header:
-            extra_headers["traceparent"] = traceparent_header
-        if request_id:
-            extra_headers["x-request-id"] = request_id
-        logger.info(f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to rewrite query")
-        response = await plano_client.chat.completions.create(
-            model=QUERY_REWRITE_MODEL,
-            messages=rewrite_messages,
-            temperature=0.3,
-            max_tokens=200,
-            extra_headers=extra_headers,
-        )
-
-        rewritten_query = response.choices[0].message.content.strip()
-        logger.info(f"Query rewritten successfully: '{rewritten_query}'")
-        return rewritten_query
-
-    except Exception as e:
-        logger.error(f"Error rewriting query: {e}")
-
-    # If rewriting fails, return the original last user message
-    logger.info("Falling back to original user message")
-    for message in reversed(messages):
-        if message.role == "user":
-            return message.content
-    return ""
-
-
-async def query_rewriter(messages: List[ChatMessage]) -> List[ChatMessage]:
-    """MCP tool that rewrites the last user query using Plano.
-
-    Returns the updated message list as a list of role/content dicts.
-    """
-    logger.info(f"Received chat completion request with {len(messages)} messages")
-
-    # Get traceparent header from HTTP request using FastMCP's dependency function
-    headers = get_http_headers()
-    traceparent_header = headers.get("traceparent")
-    request_id = headers.get("x-request-id")
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    # Call Plano to rewrite the last user query
-    rewritten_query = await rewrite_query_with_plano(
-        messages, traceparent_header, request_id
-    )
-
-    # Create updated messages with the rewritten query
-    updated_messages = messages.copy()
-
-    # Find and update the last user message with the rewritten query
-    for i in range(len(updated_messages) - 1, -1, -1):
-        if updated_messages[i].role == "user":
-            original_query = updated_messages[i].content
-            updated_messages[i] = ChatMessage(role="user", content=rewritten_query)
-            logger.info(
-                f"Updated user query from '{original_query}' to '{rewritten_query}'"
-            )
-            break
-
-    # Return plain dicts (not ChatMessage objects) to minimize text serialization
-    return [{"role": msg.role, "content": msg.content} for msg in updated_messages]
-
-
-# Register MCP tool only if mcp is available
-if mcp is not None:
-    mcp.tool()(query_rewriter)
-
-
-@app.post("/")
-async def chat_completions_endpoint(
-    request_messages: List[ChatMessage], request: Request
-) -> List[ChatMessage]:
-    """FastAPI endpoint for chat completions with query rewriting."""
-    logger.info(
-        f"Received REST request on / with {len(request_messages)} messages"
-    )
-
-    # Extract traceparent header
-    traceparent_header = request.headers.get("traceparent")
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    # Call the query rewriter tool
-    updated_messages_data = await query_rewriter(request_messages)
-
-    # Convert back to ChatMessage objects
-    updated_messages = [ChatMessage(**msg) for msg in updated_messages_data]
-
-    logger.info("Returning rewritten chat completion response")
-    return updated_messages
-
-
-def start_server(host: str = "0.0.0.0", port: int = 10501):
-    """Start the FastAPI server for query rewriter."""
-    import uvicorn
-
-    logger.info(f"Starting Query Rewriter REST server on {host}:{port}")
-    uvicorn.run(app, host=host, port=port)
--- a/demos/use_cases/mcp_filter/src/rag_agent/rag_agent.py
+++ b/demos/use_cases/mcp_filter/src/rag_agent/rag_agent.py
@ -1,223 +0,0 @@
-import json
-from fastapi import FastAPI, Request
-from fastapi.responses import StreamingResponse
-from openai import AsyncOpenAI
-import os
-import logging
-import time
-import uuid
-import uvicorn
-import asyncio
-
-from .api import (
-    ChatCompletionRequest,
-    ChatCompletionResponse,
-    ChatCompletionStreamResponse,
-)
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [RESPONSE_GENERATOR] - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-# Configuration for Plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv("LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1")
-RESPONSE_MODEL = "gpt-4o"
-
-# System prompt for response generation
-SYSTEM_PROMPT = """You are a helpful assistant that generates coherent, contextual responses.
-
-Given a conversation history, generate a helpful and relevant response based on all the context available in the messages.
-Your response should:
-1. Be contextually aware of the entire conversation
-2. Address the user's needs appropriately
-3. Be helpful and informative
-4. Maintain a natural conversational tone
-
-Generate a complete response to assist the user."""
-
-# Initialize OpenAI client for Plano
-plano_client = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",  # Plano doesn't require a real API key
-)
-
-# FastAPI app for REST server
-app = FastAPI(title="RAG Agent Response Generator", version="1.0.0")
-
-
-def prepare_response_messages(request_body: ChatCompletionRequest):
-    """Prepare messages for response generation by adding system prompt."""
-    response_messages = [{"role": "system", "content": SYSTEM_PROMPT}]
-
-    # Add conversation history
-    for msg in request_body.messages:
-        response_messages.append({"role": msg.role, "content": msg.content})
-
-    return response_messages
-
-
-@app.post("/v1/chat/completions")
-async def chat_completion_http(request: Request, request_body: ChatCompletionRequest):
-    """HTTP endpoint for chat completions with streaming support."""
-    logger.info(
-        f"Received chat completion request with {len(request_body.messages)} messages"
-    )
-
-    # Get traceparent header from HTTP request
-    traceparent_header = request.headers.get("traceparent")
-    request_id = request.headers.get("x-request-id")
-
-    if traceparent_header:
-        logger.info(f"Received traceparent header: {traceparent_header}")
-    else:
-        logger.info("No traceparent header found")
-
-    return StreamingResponse(
-        stream_chat_completions(request_body, traceparent_header, request_id),
-        media_type="text/event-stream",
-        headers={
-            "content-type": "text/event-stream",
-        },
-    )
-
-
-async def stream_chat_completions(
-    request_body: ChatCompletionRequest,
-    traceparent_header: str = None,
-    request_id: str = None,
-):
-    """Generate streaming chat completions."""
-    # Prepare messages for response generation
-    response_messages = prepare_response_messages(request_body)
-
-    try:
-        # Call Plano using OpenAI client for streaming
-        logger.info(
-            f"Calling Plano at {LLM_GATEWAY_ENDPOINT} to generate streaming response"
-        )
-
-        logger.info(f"rag_agent - request_id: {request_id}")
-        # Always ask Envoy to retry; forward request ID and traceparent when provided
-        extra_headers = {"x-envoy-max-retries": "3"}
-        if request_id:
-            extra_headers["x-request-id"] = request_id
-        if traceparent_header:
-            extra_headers["traceparent"] = traceparent_header
-
-        response_stream = await plano_client.chat.completions.create(
-            model=RESPONSE_MODEL,
-            messages=response_messages,
-            temperature=request_body.temperature or 0.7,
-            max_tokens=request_body.max_tokens or 1000,
-            stream=True,
-            extra_headers=extra_headers,
-        )
-
-        completion_id = f"chatcmpl-{uuid.uuid4().hex[:8]}"
-        created_time = int(time.time())
-        collected_content = []
-
-        async for chunk in response_stream:
-            if chunk.choices and chunk.choices[0].delta.content:
-                content = chunk.choices[0].delta.content
-                collected_content.append(content)
-
-                # Create streaming response chunk
-                stream_chunk = ChatCompletionStreamResponse(
-                    id=completion_id,
-                    created=created_time,
-                    model=request_body.model,
-                    choices=[
-                        {
-                            "index": 0,
-                            "delta": {"content": content},
-                            "finish_reason": None,
-                        }
-                    ],
-                )
-
-                yield f"data: {stream_chunk.model_dump_json()}\n\n"
-
-        # Send final chunk with complete response in expected format
-        full_response = "".join(collected_content)
-        updated_history = [{"role": "assistant", "content": full_response}]
-
-        final_chunk = ChatCompletionStreamResponse(
-            id=completion_id,
-            created=created_time,
-            model=request_body.model,
-            choices=[
-                {
-                    "index": 0,
-                    "delta": {},
-                    "finish_reason": "stop",
-                    "message": {
-                        "role": "assistant",
-                        "content": json.dumps(updated_history),
-                    },
-                }
-            ],
-        )
-
-        yield f"data: {final_chunk.model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-
-    except Exception as e:
-        logger.error(f"Error generating streaming response: {e}")
-
-        # Send error as streaming response
-        error_chunk = ChatCompletionStreamResponse(
-            id=f"chatcmpl-{uuid.uuid4().hex[:8]}",
-            created=int(time.time()),
-            model=request_body.model,
-            choices=[
-                {
-                    "index": 0,
-                    "delta": {
-                        "content": "I apologize, but I'm having trouble generating a response right now. Please try again."
-                    },
-                    "finish_reason": "stop",
-                }
-            ],
-        )
-
-        yield f"data: {error_chunk.model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-
-
-@app.get("/health")
-async def health_check():
-    """Health check endpoint."""
-    return {"status": "healthy"}
-
-
-def start_server(host: str = "localhost", port: int = 8000):
-    """Start the REST server."""
-    uvicorn.run(
-        app,
-        host=host,
-        port=port,
-        log_config={
-            "version": 1,
-            "disable_existing_loggers": False,
-            "formatters": {
-                "default": {
-                    "format": "%(asctime)s - [RESPONSE_GENERATOR] - %(levelname)s - %(message)s",
-                },
-            },
-            "handlers": {
-                "default": {
-                    "formatter": "default",
-                    "class": "logging.StreamHandler",
-                    "stream": "ext://sys.stdout",
-                },
-            },
-            "root": {
-                "level": "INFO",
-                "handlers": ["default"],
-            },
-        },
-    )
--- a/demos/use_cases/mcp_filter/src/rag_agent/sample_knowledge_base.csv
+++ b/demos/use_cases/mcp_filter/src/rag_agent/sample_knowledge_base.csv
@ -1,257 +0,0 @@
-path,content
-TechCorp_CloudServices_SLA_Agreement_2024,"SERVICE LEVEL AGREEMENT
-This Service Level Agreement (""SLA"") is entered into on March 15, 2024, between TechCorp Solutions Inc., a Delaware corporation (""Provider""), and CloudFirst Enterprises LLC (""Customer"").
-
-DEFINITIONS
-Service Availability: The percentage of time during which the cloud services are operational and accessible.
-Downtime: Any period when the services are unavailable or inaccessible to Customer.
-Response Time: The time between service request submission and initial response from Provider.
-
-SERVICE COMMITMENTS
-Provider guarantees 99.9% uptime for all cloud infrastructure services during any calendar month.
-Average response time for API calls shall not exceed 200 milliseconds under normal operating conditions.
-Customer support response times: Critical issues within 1 hour, Standard issues within 4 hours.
-
-REMEDIES
-For each full percentage point below 99.9% availability, Customer receives 10% credit on monthly fees.
-If response times exceed 500ms for more than 5 minutes in any hour, Customer receives 5% monthly credit.
-
-MONITORING AND REPORTING
-Provider will maintain real-time monitoring systems and provide monthly performance reports.
-All metrics will be measured from Provider's monitoring systems located in primary data centers.
-
-This SLA remains in effect for the duration of the underlying service agreement.
-
-Executed by:
-TechCorp Solutions Inc.
-Sarah Mitchell, VP Operations
-Date: March 15, 2024
-
-CloudFirst Enterprises LLC
-Robert Chen, CTO
-Date: March 16, 2024"
-
-DataSecure_Privacy_Policy_v3.2,"PRIVACY POLICY
-DataSecure Analytics, Inc. (""Company"") Privacy Policy
-Effective Date: January 1, 2024
-Last Updated: February 28, 2024
-
-INFORMATION COLLECTION
-We collect information you provide directly, such as account details, usage preferences, and communication records.
-Automatically collected data includes IP addresses, browser types, device information, and service interaction logs.
-Third-party integrations may provide additional user behavior and demographic information with consent.
-
-DATA USAGE
-Personal information is used to provide services, improve user experience, and communicate service updates.
-Aggregated, non-identifiable data may be used for analytics, research, and service enhancement.
-We do not sell personal information to third parties for marketing purposes.
-
-DATA PROTECTION
-All data is encrypted in transit using TLS 1.3 and at rest using AES-256 encryption.
-Access controls limit data access to authorized personnel only on a need-to-know basis.
-Regular security audits and penetration testing ensure ongoing protection measures.
-
-DATA RETENTION
-Personal data is retained for the duration of active service plus 24 months.
-Logs and analytics data are retained for 12 months unless legally required otherwise.
-Upon account deletion, personal data is permanently removed within 30 days.
-
-USER RIGHTS
-Users may request access to, correction of, or deletion of their personal information.
-Data portability requests will be fulfilled in standard formats within 30 days.
-Marketing communications can be opted out of at any time.
-
-CONTACT
-For privacy concerns, contact: privacy@datasecure.com
-Data Protection Officer: Jennifer Walsh, jwalsh@datasecure.com"
-
-GlobalManufacturing_SupplyChain_Contract_Q2_2024,"SUPPLY CHAIN AGREEMENT
-This Supply Chain Agreement is entered into between GlobalManufacturing Corp (""Buyer"") and PrecisionParts Ltd (""Supplier"") effective April 1, 2024.
-
-SCOPE OF SERVICES
-Supplier will provide automotive components including brake assemblies, suspension parts, and electrical harnesses.
-All products must meet ISO 9001 quality standards and automotive industry specifications.
-Delivery schedule: Weekly shipments every Tuesday, with 48-hour advance shipping notifications.
-
-PRICING AND PAYMENT
-Component pricing is fixed for initial 6-month term with quarterly price review thereafter.
-Payment terms: Net 45 days from invoice date via electronic transfer.
-Volume discounts apply: 5% for orders exceeding 10,000 units per month, 8% for orders exceeding 25,000 units.
-
-QUALITY REQUIREMENTS
-All components must pass incoming inspection with less than 0.1% defect rate.
-Supplier maintains quality certifications including IATF 16949 and environmental compliance.
-Batch tracking and traceability required for all delivered components.
-
-LOGISTICS AND DELIVERY
-Supplier responsible for packaging, labeling, and delivery to Buyer's distribution centers.
-Delivery windows: 8 AM - 4 PM, Monday through Friday, with advance appointment scheduling.
-Late delivery penalties: 2% of shipment value for each day beyond scheduled delivery.
-
-RISK MANAGEMENT
-Supplier maintains business continuity plans and alternative sourcing strategies.
-Force majeure events must be reported within 24 hours with mitigation plans.
-Insurance requirements: $5M general liability, $2M product liability coverage.
-
-INTELLECTUAL PROPERTY
-All custom tooling and specifications remain property of Buyer.
-Supplier grants license to use necessary patents for component manufacturing.
-
-This agreement shall remain in effect for 24 months with automatic renewal unless terminated.
-
-GlobalManufacturing Corp
-Michael Rodriguez, Supply Chain Director
-Date: April 1, 2024
-
-PrecisionParts Ltd
-Amanda Foster, VP Sales
-Date: April 2, 2024"
-
-EduTech_StudentData_Management_Policy_2024,"STUDENT DATA MANAGEMENT POLICY
-EduTech Learning Platform - Data Management and Protection Policy
-Document Version: 2.1
-Effective Date: August 15, 2024
-
-SCOPE AND PURPOSE
-This policy governs the collection, use, storage, and protection of student educational records and personal information.
-Applies to all employees, contractors, and third-party service providers accessing student data.
-Compliance with FERPA, COPPA, and state student privacy laws is mandatory.
-
-DATA CLASSIFICATION
-Educational Records: Grades, attendance, assignments, and academic progress information.
-Personal Information: Names, addresses, contact details, and demographic information.
-Behavioral Data: Learning patterns, platform usage, and engagement metrics.
-
-COLLECTION PRINCIPLES
-Data collection is limited to educational purposes and service improvement only.
-Parental consent required for students under 13 years of age.
-Students and parents have right to review and request corrections to educational records.
-
-ACCESS CONTROLS
-Role-based access ensures personnel see only data necessary for their functions.
-Multi-factor authentication required for all system access.
-Access logs maintained and reviewed monthly for unauthorized activity.
-
-DATA SHARING
-Educational records shared only with authorized school personnel and parents/students.
-No data sharing with third parties for commercial purposes without explicit consent.
-Research data must be de-identified and aggregated before external sharing.
-
-SECURITY MEASURES
-Data encrypted using industry-standard protocols during transmission and storage.
-Regular security assessments and vulnerability testing conducted quarterly.
-Incident response plan includes notification procedures for data breaches.
-
-RETENTION AND DISPOSAL
-Student records retained according to school district policies, typically 5-7 years post-graduation.
-Inactive accounts and associated data purged after 2 years of non-use.
-Secure data destruction protocols ensure complete removal of sensitive information.
-
-COMPLIANCE MONITORING
-Annual privacy training required for all staff handling student data.
-Regular audits ensure ongoing compliance with applicable privacy regulations.
-Privacy impact assessments conducted for new features or data uses.
-
-Contact: Dr. Lisa Thompson, Chief Privacy Officer
-Email: privacy@edutech-learning.com
-Phone: (555) 123-4567"
-
-FinanceFirst_Investment_Advisory_Agreement_2024,"INVESTMENT ADVISORY AGREEMENT
-This Investment Advisory Agreement is entered into between FinanceFirst Advisors LLC (""Advisor"") and Madison Investment Group (""Client"") on May 20, 2024.
-
-ADVISORY SERVICES
-Advisor will provide comprehensive investment management and financial planning services.
-Services include portfolio construction, asset allocation, risk assessment, and performance monitoring.
-Regular portfolio reviews conducted quarterly with detailed performance reporting.
-
-INVESTMENT AUTHORITY
-Client grants Advisor discretionary authority to make investment decisions within agreed parameters.
-Investment universe includes stocks, bonds, ETFs, mutual funds, and alternative investments as appropriate.
-All trades executed through qualified broker-dealers with best execution practices.
-
-FEE STRUCTURE
-Management fee: 1.25% annually on assets under management, calculated and billed quarterly.
-Performance fee: 15% of returns exceeding S&P 500 benchmark, calculated annually.
-Additional fees may apply for specialized services such as tax planning or estate planning.
-
-CLIENT RESPONSIBILITIES
-Client must provide accurate financial information and promptly communicate changes in circumstances.
-Investment objectives and risk tolerance should be reviewed and updated annually.
-Client responsible for reviewing and approving investment policy statement.
-
-RISK DISCLOSURE
-All investments carry risk of loss, and past performance does not guarantee future results.
-Diversification does not ensure profit or protect against loss in declining markets.
-Alternative investments may have limited liquidity and higher volatility.
-
-REGULATORY COMPLIANCE
-Advisor is registered with the Securities and Exchange Commission as an investment advisor.
-All activities conducted in accordance with Investment Advisers Act of 1940 and applicable regulations.
-Form ADV Part 2 brochure provided annually with material updates.
-
-CONFIDENTIALITY
-All client information treated as confidential and shared only as necessary for service provision.
-Third-party service providers bound by confidentiality agreements.
-Client data protected through secure systems and access controls.
-
-TERMINATION
-Either party may terminate agreement with 30 days written notice.
-Upon termination, Advisor will assist with orderly transfer of assets to new custodian or advisor.
-Final fee calculation prorated to date of termination.
-
-FinanceFirst Advisors LLC
-Thomas Anderson, Managing Partner
-Date: May 20, 2024
-
-Madison Investment Group
-Rebecca Martinez, Chief Investment Officer
-Date: May 21, 2024"
-
-HealthSystem_PatientCare_Standards_2024,"PATIENT CARE STANDARDS AND PROTOCOLS
-Metropolitan Health System - Clinical Care Standards
-Document ID: MHS-PCS-2024-001
-Effective Date: June 1, 2024
-
-PATIENT SAFETY PROTOCOLS
-All patients must have proper identification verification using two unique identifiers.
-Medication administration requires independent double-check for high-risk medications.
-Fall risk assessments completed within 4 hours of admission with appropriate interventions.
-
-CLINICAL DOCUMENTATION
-Medical records must be completed within 24 hours of patient encounter.
-All entries require electronic signature with timestamp and provider identification.
-Critical values and abnormal results must be communicated and documented immediately.
-
-INFECTION CONTROL
-Hand hygiene compliance monitored with target rate of 95% or higher.
-Personal protective equipment used according to transmission-based precautions.
-Isolation procedures implemented within 2 hours of identification of infectious conditions.
-
-EMERGENCY RESPONSE
-Code team response time target: 3 minutes from activation to arrival.
-Crash cart and emergency equipment checks performed daily and documented.
-All staff required to maintain current CPR and emergency response certifications.
-
-PATIENT COMMUNICATION
-Patient rights and responsibilities communicated upon admission.
-Informed consent obtained and documented prior to procedures and treatments.
-Family involvement encouraged with respect for patient privacy preferences.
-
-QUALITY MEASURES
-Patient satisfaction scores monitored monthly with target of 4.5/5.0 or higher.
-Medication error rates tracked with goal of less than 1 per 1000 patient days.
-Hospital-acquired infection rates measured and benchmarked against national standards.
-
-STAFF COMPETENCY
-Annual competency assessments required for all clinical staff.
-Continuing education requirements: 24 hours annually for nurses, 40 hours for physicians.
-Specialty certifications maintained according to department and role requirements.
-
-TECHNOLOGY STANDARDS
-Electronic health record system used for all patient documentation.
-Telemedicine capabilities available for remote consultations and monitoring.
-Clinical decision support tools integrated to assist with diagnosis and treatment decisions.
-
-Contact: Dr. Patricia Williams, Chief Medical Officer
-Email: pwilliams@metrohealthsystem.org
-Phone: (555) 987-6543"
--- a/demos/use_cases/mcp_filter/start_agents.sh
+++ b/demos/use_cases/mcp_filter/start_agents.sh
@ -1,46 +0,0 @@
-#!/bin/bash
-set -e
-
-WAIT_FOR_PIDS=()
-
-log() {
-  timestamp=$(python3 -c 'from datetime import datetime; print(datetime.now().strftime("%Y-%m-%d %H:%M:%S,%f")[:23])')
-  message="$*"
-  echo "$timestamp - $message"
-}
-
-cleanup() {
-    log "Caught signal, terminating all user processes ..."
-    for PID in "${WAIT_FOR_PIDS[@]}"; do
-        if kill "$PID" 2> /dev/null; then
-            log "killed process: $PID"
-        fi
-    done
-    exit 1
-}
-
-trap cleanup EXIT
-
-log "Starting input_guards agent on port 10500/mcp..."
-uv run python -m rag_agent --host 0.0.0.0 --port 10500 --agent input_guards &
-WAIT_FOR_PIDS+=($!)
-
-log "Starting query_rewriter agent on port 10501/mcp..."
-uv run python -m rag_agent --host 0.0.0.0 --port 10501 --agent query_rewriter &
-WAIT_FOR_PIDS+=($!)
-
-log "Starting context_builder agent on port 10502/mcp..."
-uv run python -m rag_agent --host 0.0.0.0 --port 10502 --agent context_builder &
-WAIT_FOR_PIDS+=($!)
-
-# log "Starting response_generator agent on port 10400..."
-# uv run python -m rag_agent --host 0.0.0.0 --port 10400 --agent response_generator &
-# WAIT_FOR_PIDS+=($!)
-
-log "Starting response_generator agent on port 10505..."
-uv run python -m rag_agent --rest-server --host 0.0.0.0 --rest-port 10505 --agent response_generator &
-WAIT_FOR_PIDS+=($!)
-
-for PID in "${WAIT_FOR_PIDS[@]}"; do
-    wait "$PID"
-done
--- a/demos/use_cases/mcp_filter/test.rest
+++ b/demos/use_cases/mcp_filter/test.rest
@ -1,92 +0,0 @@
-@baseUrl = http://0.0.0.0:10502
-@model = gpt-4o
-
-# Health Check
-GET {{baseUrl}}/health
-
-###
-
-# Test 1: Simple Non-Streaming Chat Completion
-POST {{baseUrl}}/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "{{model}}",
-  "messages": [
-    {
-      "role": "user",
-      "content": "Hello! Can you help me understand what machine learning is?"
-    }
-  ]
-}
-
-###
-
-# Test 2: Simple Streaming Chat Completion
-POST {{baseUrl}}/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "{{model}}",
-  "messages": [
-    {
-      "role": "user",
-      "content": "Explain the concept of artificial intelligence in simple terms."
-    }
-  ],
-  "stream": true
-}
-
-### Test 3
-POST http://localhost:8001/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "{{model}}",
-  "messages": [
-    {
-      "role": "user",
-      "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-    }
-  ],
-  "stream": true
-}
-
-### send request to query_rewriter agent
-POST http://localhost:10500/
-Content-Type: application/json
-
-[
-  {
-    "role": "user",
-    "content": "What is the guaranteed uptime percentage for TechCorp's cloud services?"
-  }
-]
-
-### test fast-llm
-POST http://localhost:12000/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "fast-llm",
-  "messages": [
-    {
-      "role": "user",
-      "content": "hello"
-    }
-  ]
-}
-
-### test smart-llm
-POST http://localhost:12000/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "smart-llm",
-  "messages": [
-    {
-      "role": "user",
-      "content": "hello"
-    }
-  ]
-}
--- a/demos/use_cases/mcp_filter/uv.lock
+++ b/demos/use_cases/mcp_filter/uv.lock
--- a/demos/use_cases/model_alias_routing/README.md
+++ b/demos/use_cases/model_alias_routing/README.md
@ -1,148 +0,0 @@
-# Model Alias Demo Suite
-
-This directory contains demos for the model alias feature in Plano.
-
-## Overview
-
-Model aliases allow clients to use friendly, semantic names instead of provider-specific model names. For example:
- `arch.summarize.v1` → `gpt-4o-mini` (fast, cheap model for summaries)
- `arch.reasoning.v1` → `gpt-4o` (capable model for complex reasoning)
- `creative-model` → `claude-3-5-sonnet` (creative tasks)
-
-## Configuration
-
-The `config_with_aliases.yaml` file defines several aliases:
-
-```yaml
-# Model aliases - friendly names that map to actual provider names
-model_aliases:
-  # Alias for summarization tasks -> fast/cheap model
-  arch.summarize.v1:
-    target: gpt-4o-mini
-
-  # Alias for general purpose tasks -> latest model
-  arch.v1:
-    target: o3
-
-  # Alias for reasoning tasks -> capable model
-  arch.reasoning.v1:
-    target: gpt-4o
-
-  # Alias for creative tasks -> Claude model
-  arch.creative.v1:
-    target: claude-3-5-sonnet-20241022
-
-  # Alias for quick responses -> fast model
-  arch.fast.v1:
-    target: claude-3-haiku-20240307
-
-  # Semantic aliases
-  summary-model:
-    target: gpt-4o-mini
-
-  chat-model:
-    target: gpt-4o
-
-  creative-model:
-    target: claude-3-5-sonnet-20241022
-```
-
-## Prerequisites
- Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
- Set your API keys in your environment:
-  - `export OPENAI_API_KEY=your-openai-key`
-  - `export ANTHROPIC_API_KEY=your-anthropic-key` (optional, but recommended for Anthropic tests)
-
-## How to Run
-
-1. Start the demo:
-   ```sh
-   sh run_demo.sh
-   ```
-   - This creates a `.env` file with your API keys (if not present).
-   - It then starts the Plano gateway with the model alias config (`config_with_aliases.yaml`).
-
-2. To stop the demo:
-   ```sh
-   sh run_demo.sh down
-   ```
-   - This will stop Plano gateway and any related services.
-
-## Example Requests
-
-### OpenAI client with alias `arch.summarize.v1`
-```sh
-curl -sS -X POST "http://localhost:12000/v1/chat/completions" \
-  -H "Authorization: Bearer test-key" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "arch.summarize.v1",
-    "max_tokens": 50,
-    "messages": [
-      { "role": "user",
-        "content": "Hello, please respond with exactly: Hello from alias arch.summarize.v1!"
-      }
-    ]
-  }' | jq .
-```
-
-### OpenAI client with alias `arch.v1`
-```sh
-curl -sS -X POST "http://localhost:12000/v1/chat/completions" \
-  -H "Authorization: Bearer test-key" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "arch.v1",
-    "max_tokens": 50,
-    "messages": [
-      { "role": "user",
-        "content": "Hello, please respond with exactly: Hello from alias arch.v1!"
-      }
-    ]
-  }' | jq .
-```
-
-### Anthropic client with alias `arch.summarize.v1`
-```sh
-curl -sS -X POST "http://localhost:12000/v1/messages" \
-  -H "x-api-key: test-key" \
-  -H "anthropic-version: 2023-06-01" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "arch.summarize.v1",
-    "max_tokens": 50,
-    "messages": [
-      { "role": "user",
-        "content": "Hello, please respond with exactly: Hello from alias arch.summarize.v1 via Anthropic!"
-      }
-    ]
-  }' | jq .
-```
-
-### Anthropic client with alias `arch.v1`
-```sh
-curl -sS -X POST "http://localhost:12000/v1/messages" \
-  -H "x-api-key: test-key" \
-  -H "anthropic-version: 2023-06-01" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "arch.v1",
-    "max_tokens": 50,
-    "messages": [
-      { "role": "user",
-        "content": "Hello, please respond with exactly: Hello from alias arch.v1 via Anthropic!"
-      }
-    ]
-  }' | jq .
-```
-
-## Notes
- The `.env` file will be created automatically if missing, with your API keys.
- If `ANTHROPIC_API_KEY` is not set, Anthropic requests will not work.
- You can add more aliases in `config_with_aliases.yaml`.
- All curl examples use `jq .` for pretty-printing JSON responses.
-
-## Troubleshooting
- Ensure your API keys are set in your environment before running the demo.
- If you see errors about missing keys, set them and re-run the script.
- For more details, see the main Plano documentation.
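The alias requests in the README above can also be built programmatically. The following is a minimal sketch, assuming the gateway address and headers shown in the curl examples; `build_request` and `BASE_URL` are hypothetical names introduced for illustration and are not part of Plano. The function only constructs the request and never sends it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:12000"  # gateway address taken from the curl examples


def build_request(alias: str, prompt: str, api: str = "openai") -> urllib.request.Request:
    """Build (but do not send) a chat request that addresses a model by alias."""
    if api == "openai":
        path = "/v1/chat/completions"
        headers = {"Authorization": "Bearer test-key"}
    elif api == "anthropic":
        path = "/v1/messages"
        headers = {"x-api-key": "test-key", "anthropic-version": "2023-06-01"}
    else:
        raise ValueError(f"unknown api: {api}")
    headers["Content-Type"] = "application/json"

    # Same body shape as the README examples: the alias goes in the "model" field.
    body = {
        "model": alias,
        "max_tokens": 50,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it, which requires the demo gateway to be running.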
--- a/demos/use_cases/model_alias_routing/config_with_aliases.yaml
+++ b/demos/use_cases/model_alias_routing/config_with_aliases.yaml
@ -1,97 +0,0 @@
-version: v0.1
-
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
-
-  # OpenAI Models
-  - model: openai/gpt-5-mini-2025-08-07
-    access_key: $OPENAI_API_KEY
-    default: true
-
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY
-
-  - model: openai/o3
-    access_key: $OPENAI_API_KEY
-
-  - model: openai/gpt-4o
-    access_key: $OPENAI_API_KEY
-
-  - model: openai/*
-    access_key: $OPENAI_API_KEY
-
-  # Anthropic - support all Claude models
-  - model: anthropic/*
-    access_key: $ANTHROPIC_API_KEY
-
-  - model: anthropic/claude-sonnet-4-20250514
-    access_key: $ANTHROPIC_API_KEY
-
-  - model: anthropic/claude-3-haiku-20240307
-    access_key: $ANTHROPIC_API_KEY
-
-  # Azure OpenAI Models
-  - model: azure_openai/gpt-5-mini
-    access_key: $AZURE_API_KEY
-    base_url: https://katanemo.openai.azure.com
-
-  - model: amazon_bedrock/us.amazon.nova-premier-v1:0
-    access_key: $AWS_BEARER_TOKEN_BEDROCK
-    base_url: https://bedrock-runtime.us-west-2.amazonaws.com
-
-  - model: amazon_bedrock/us.amazon.nova-pro-v1:0
-    access_key: $AWS_BEARER_TOKEN_BEDROCK
-    base_url: https://bedrock-runtime.us-west-2.amazonaws.com
-
-  # Ollama Models
-  - model: ollama/llama3.1
-    base_url: http://host.docker.internal:11434
-
-  # Grok (xAI) Models
-  - model: xai/grok-4-0709
-    access_key: $GROK_API_KEY
-
-# Model aliases - friendly names that map to actual provider names
-model_aliases:
-  # Alias for summarization tasks -> fast/cheap model
-  arch.summarize.v1:
-    target: gpt-5-mini-2025-08-07
-
-  # Alias for general purpose tasks -> latest model
-  arch.v1:
-    target: o3
-
-  # Alias for reasoning tasks -> capable model
-  arch.reasoning.v1:
-    target: gpt-4o
-
-  # Alias for creative tasks -> Claude model
-  arch.creative.v1:
-    target: claude-sonnet-4-20250514
-
-  # Alias for quick responses -> fast model
-  arch.fast.v1:
-    target: claude-3-haiku-20240307
-
-  # Semantic aliases
-  summary-model:
-    target: gpt-5-mini-2025-08-07
-
-  chat-model:
-    target: gpt-5-mini-2025-08-07
-
-  creative-model:
-    target: claude-sonnet-4-20250514
-
-  coding-model:
-    target: us.amazon.nova-premier-v1:0
-
-  # Alias for grok testing
-  arch.grok.v1:
-    target: grok-4-0709
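The mapping in `model_aliases` above can be read as a two-step lookup: alias to target name, then target name to a provider-qualified entry in `llm_providers`. The sketch below illustrates that reading with a subset of the config. It is an assumption about the intent of the file, not Plano's actual resolver, and `resolve` is a hypothetical helper.

```python
# Illustration only: a subset of the config above, not Plano's resolver.
MODEL_ALIASES = {
    "arch.summarize.v1": "gpt-5-mini-2025-08-07",
    "arch.v1": "o3",
    "arch.creative.v1": "claude-sonnet-4-20250514",
}

PROVIDER_MODELS = [
    "openai/gpt-5-mini-2025-08-07",
    "openai/o3",
    "anthropic/claude-sonnet-4-20250514",
]


def resolve(model: str) -> str:
    """Map an alias to a provider-qualified model; pass unknown names through."""
    target = MODEL_ALIASES.get(model, model)
    for entry in PROVIDER_MODELS:
        _provider, _, name = entry.partition("/")
        if name == target:
            return entry
    return target
```

Wildcard entries such as `openai/*` are deliberately left out of this sketch because their matching rules are not stated in the config.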
--- a/demos/use_cases/model_alias_routing/run_demo.sh
+++ b/demos/use_cases/model_alias_routing/run_demo.sh
@ -1,60 +0,0 @@
-#!/bin/bash
-set -e
-
-# Function to start the demo
-start_demo() {
-  # Step 1: Check if .env file exists
-  if [ -f ".env" ]; then
-    echo ".env file already exists. Skipping creation."
-  else
-    # Step 2: Create `.env` file and set API keys
-    if [ -z "$OPENAI_API_KEY" ]; then
-      echo "Error: OPENAI_API_KEY environment variable is not set for the demo."
-      exit 1
-    fi
-    if [ -z "$ANTHROPIC_API_KEY" ]; then
-      echo "Warning: ANTHROPIC_API_KEY environment variable is not set. Anthropic features may not work."
-    fi
-
-    echo "Creating .env file..."
-    echo "OPENAI_API_KEY=$OPENAI_API_KEY" > .env
-    if [ -n "$ANTHROPIC_API_KEY" ]; then
-      echo "ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY" >> .env
-    fi
-    echo ".env file created with API keys."
-  fi
-
-  # Step 3: Start Plano
-  echo "Starting Plano with config_with_aliases.yaml..."
-  planoai up config_with_aliases.yaml
-
-  printf '\n\nPlano started successfully.\n'
-  printf 'Please run the following curl command to test model alias routing. Additional instructions are in the README.md file.\n\n'
-  echo "curl -sS -X POST \"http://localhost:12000/v1/chat/completions\" \
-    -H \"Authorization: Bearer test-key\" \
-    -H \"Content-Type: application/json\" \
-    -d '{
-      \"model\": \"arch.summarize.v1\",
-      \"max_tokens\": 50,
-      \"messages\": [
-        { \"role\": \"user\",
-          \"content\": \"Hello, please respond with exactly: Hello from alias arch.summarize.v1!\"
-        }
-      ]
-    }' | jq ."
-}
-
-# Function to stop the demo
-stop_demo() {
-  # Step 2: Stop Plano
-  echo "Stopping Plano..."
-  planoai down
-}
-
-# Main script logic
-if [ "$1" = "down" ]; then
-  stop_demo
-else
-  # Default action is to bring the demo up
-  start_demo
-fi
--- a/demos/use_cases/model_choice_with_test_harness/README.md
+++ b/demos/use_cases/model_choice_with_test_harness/README.md
@ -1,119 +0,0 @@
-# Model Choice Newsletter Demo
-
-This folder demonstrates a practical workflow for rapid model adoption and safe model switching using Plano (`plano`). It includes both a minimal test harness and a sample proxy configuration.
-
---
-
-## Step-by-Step Walkthrough: Adopting New Models
-
-### Part 1 — Testing Infrastructure
-
-**Goal:** Quickly evaluate candidate models for a task using a repeatable, automated harness.
-
-#### 1. Write Test Fixtures
-
-Create a YAML file (`evals_summarize.yaml`) with real examples for your task. Each fixture includes:
- `input`: The prompt or scenario.
- `must_include`: List of anchor words that must appear in the output.
- `schema`: The expected output schema.
-
-Example:
-```yaml
-# evals_summarize.yaml
-task: summarize
-fixtures:
-  - id: sum-001
-    input: "Thread about a billing dispute…"
-    must_include: ["invoice"]
-    schema: SummarizeOut
-  - id: sum-002
-    input: "Thread about a shipping delay…"
-    must_include: ["status"]
-    schema: SummarizeOut
-```
-
-#### 2. Candidate Models
-
-List the model aliases (e.g., `arch.summarize.v1`, `arch.reason.v1`) you want to test. The harness will route requests through `plano`, so you don’t need provider API keys in your code.
-
-#### 3. Minimal Python Harness
-
-See `bench.py` for a complete example. It:
- Loads fixtures.
- Sends requests to each candidate model via `plano`.
- Validates output against schema and anchor words.
- Reports success rate and latency.
-
-Example usage:
-```sh
-uv sync
-python bench.py
-```
-
-**Benchmarks:**
- ≥90% schema-valid
- ≥80% anchors present
- Latency within SLO
- Cost within budget
-
---
-
-### Part 2 — Network Infrastructure
-
-**Goal:** Use a proxy server (`plano`) to decouple your app from vendor-specific model names and centralize control.
-
-#### Why Use a Proxy?
-
- Consistent API across providers
- Centralized key management
- Unified logging, metrics, and guardrails
- Intent-based model aliases (e.g., `arch.summarize.v1`)
- Safe model promotions and rollbacks
- Central governance and observability
-
-#### Example Proxy Config
-
-See `plano_config_with_aliases.yaml` for a sample configuration mapping aliases to provider models.
-
---
-
-## How to Run This Demo
-
-1. **Install uv** (if not already installed):
-   ```sh
-   curl -LsSf https://astral.sh/uv/install.sh | sh
-   ```
-
-2. **Install dependencies:**
-  - Install all dependencies as described in the main Plano README ([link](https://github.com/katanemo/plano/?tab=readme-ov-file#prerequisites))
-  - Then run
-    ```sh
-    uv sync
-    ```
-
-3. **Start Plano**
-   ```sh
-   sh run_demo.sh
-   ```
-
-4. **Run the test harness:**
-   ```sh
-   python bench.py
-   ```
-
---
-
-## Files in This Folder
-
- `bench.py` — Minimal Python test harness
- `evals_summarize.yaml` — Example test fixtures
- `pyproject.toml` — Python project configuration
- `plano_config_with_aliases.yaml` — Sample Plano config
-
---
-
-## Troubleshooting
-
- If you see `Success: 0/2 (0%)`, check your anchor words and prompt clarity.
- Make sure plano is running and accessible at `http://localhost:12000/`.
- For schema validation errors, ensure your prompt instructs the model to output the correct JSON structure.
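The benchmark thresholds listed in the README above (≥90% schema-valid, ≥80% anchors present) can be turned into a simple pass/fail gate. This is a minimal sketch, assuming result dictionaries shaped like the ones `bench.py` returns, with a `why` field holding `schema`, `anchors`, or `json decode`. `promotion_gate` is a hypothetical helper, and treating a JSON decode failure as failing both checks is an assumption.

```python
def promotion_gate(results, schema_min=0.90, anchor_min=0.80):
    """Return True when a model clears both README thresholds."""
    total = len(results)
    if total == 0:
        return False  # nothing measured, so nothing to promote

    def failed(result, reason):
        why = result.get("why", "")
        # Assumption: undecodable output counts against both checks.
        return why == "json decode" or reason in why.split(";")

    schema_rate = sum(not failed(r, "schema") for r in results) / total
    anchor_rate = sum(not failed(r, "anchors") for r in results) / total
    return schema_rate >= schema_min and anchor_rate >= anchor_min
```

Latency and cost budgets are left out because the README does not give numeric limits for them.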
--- a/demos/use_cases/model_choice_with_test_harness/bench.py
+++ b/demos/use_cases/model_choice_with_test_harness/bench.py
@ -1,86 +0,0 @@
-# bench.py
-import json, time, yaml, statistics as stats
-from pydantic import BaseModel, ValidationError
-from openai import OpenAI
-
-# Plano endpoint (keys are handled by Plano)
-client = OpenAI(base_url="http://localhost:12000/v1", api_key="n/a")
-MODELS = ["arch.summarize.v1", "arch.reason.v1"]
-FIXTURES = "evals_summarize.yaml"
-
-
-# Expected output shape
-class SummarizeOut(BaseModel):
-    title: str
-    bullets: list[str]
-    next_actions: list[str]
-
-
-def load_fixtures(path):
-    with open(path, "r") as f:
-        return yaml.safe_load(f)["fixtures"]
-
-
-def must_contain(text: str, anchors: list[str]) -> bool:
-    t = text.lower()
-    return all(a.lower() in t for a in anchors)
-
-
-def schema_fmt(model: type[BaseModel]):
-    return {"type": "json_object"}  # Simplified for broad compatibility
-
-
-def run_case(model, fx):
-    t0 = time.perf_counter()
-    schema = SummarizeOut.model_json_schema()
-    resp = client.chat.completions.create(
-        model=model,
-        messages=[
-            {
-                "role": "system",
-                "content": f"Be concise. Output valid JSON matching this schema:\n{json.dumps(schema)}",
-            },
-            {"role": "user", "content": fx["input"]},
-        ],
-        response_format=schema_fmt(SummarizeOut),
-    )
-    dt = time.perf_counter() - t0
-
-    content = resp.choices[0].message.content or "{}"
-    passed, reasons = True, []
-
-    try:
-        data = json.loads(content)
-    except json.JSONDecodeError:
-        return {"ok": False, "lat": dt, "why": "json decode"}
-
-    try:
-        SummarizeOut(**data)
-    except ValidationError:
-        passed = False
-        reasons.append("schema")
-    if not must_contain(json.dumps(data), fx.get("must_include", [])):
-        passed = False
-        reasons.append("anchors")
-
-    return {"ok": passed, "lat": dt, "why": ";".join(reasons)}
-
-
-def main():
-    fixtures = load_fixtures(FIXTURES)
-    for model in MODELS:
-        results = [run_case(model, fx) for fx in fixtures]
-        ok = sum(r["ok"] for r in results)
-        total = len(results)
-        latencies = [r["lat"] for r in results]
-
-        print(f"\n››› {model}")
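-        if not results:
-            print("  No fixtures found.")
-            continue
-        print(f"  Success: {ok}/{total} ({ok/total:.0%})")
-        avg_lat = stats.mean(latencies)
-        # quantiles() needs at least two samples on Python < 3.13
-        p95_lat = stats.quantiles(latencies, n=100)[94] if total > 1 else latencies[0]
-        print(f"  Latency (ms): avg={avg_lat*1000:.0f}, p95={p95_lat*1000:.0f}")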
-
-
-if __name__ == "__main__":
-    main()
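With only a handful of fixtures, an interpolated p95 such as the one from `statistics.quantiles` can land beyond the slowest observed sample. A nearest-rank percentile is one alternative that always returns an observed latency. This is a minimal sketch of that alternative, not the method `bench.py` uses, and `p95_nearest_rank` is a hypothetical helper.

```python
def p95_nearest_rank(latencies: list[float]) -> float:
    """Nearest-rank 95th percentile: the smallest sample with >= 95% of data at or below it."""
    if not latencies:
        raise ValueError("no latencies recorded")
    ordered = sorted(latencies)
    # Integer arithmetic for ceil(0.95 * n) avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100  # 1-based rank
    return ordered[rank - 1]
```

For a single sample this returns that sample, and for two samples it returns the larger one.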
--- a/demos/use_cases/model_choice_with_test_harness/evals_summarize.yaml
+++ b/demos/use_cases/model_choice_with_test_harness/evals_summarize.yaml
@ -1,11 +0,0 @@
-# evals_summarize.yaml
-task: summarize
-fixtures:
-  - id: sum-001
-    input: "Thread about a billing dispute…"
-    must_include: ["invoice"]
-    schema: SummarizeOut
-  - id: sum-002
-    input: "Thread about a shipping delay…"
-    must_include: ["status"]
-    schema: SummarizeOut
--- a/demos/use_cases/model_choice_with_test_harness/plano_config_with_aliases.yaml
+++ b/demos/use_cases/model_choice_with_test_harness/plano_config_with_aliases.yaml
@ -1,22 +0,0 @@
-version: v0.1.0
-
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY
-    default: true
-
-  - model: openai/o3
-    access_key: $OPENAI_API_KEY
-
-model_aliases:
-  arch.summarize.v1:
-    target: gpt-4o-mini
-  arch.reason.v1:
-    target: o3
--- a/demos/use_cases/model_choice_with_test_harness/pyproject.toml
+++ b/demos/use_cases/model_choice_with_test_harness/pyproject.toml
@ -1,26 +0,0 @@
-[project]
-name = "model-choice-newsletter-code-snippets"
-version = "0.1.0"
-description = "Benchmarking model alias routing with Plano."
-authors = [{name = "Your Name", email = "your@email.com"}]
-license = {text = "Apache 2.0"}
-readme = "README.md"
-requires-python = ">=3.10"
-dependencies = [
-    "pydantic>=2.0",
-    "openai>=1.0",
-    "pyyaml>=6.0",
-    "planoai>=0.4.1",
-]
-
-[project.optional-dependencies]
-dev = [
-    "pytest>=8.3",
-]
-
-[tool.hatch.build.targets.wheel]
-packages = ["."]
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
--- a/demos/use_cases/model_choice_with_test_harness/run_demo.sh
+++ b/demos/use_cases/model_choice_with_test_harness/run_demo.sh
@ -1,41 +0,0 @@
-#!/bin/bash
-set -e
-
-# Function to start the demo
-start_demo() {
-  # Step 1: Check if .env file exists
-  if [ -f ".env" ]; then
-    echo ".env file already exists. Skipping creation."
-  else
-    # Step 2: Create `.env` file and set API keys
-    if [ -z "$OPENAI_API_KEY" ]; then
-      echo "Error: OPENAI_API_KEY environment variable is not set for the demo."
-      exit 1
-    fi
-    echo "Creating .env file..."
-    echo "OPENAI_API_KEY=$OPENAI_API_KEY" > .env
-    echo ".env file created with API keys."
-  fi
-
-  # Step 3: Start Plano
-  echo "Starting Plano with plano_config_with_aliases.yaml..."
-  planoai up plano_config_with_aliases.yaml
-
-  printf '\n\nPlano started successfully.\n'
-  printf 'Please run the following command to test the setup: python bench.py\n\n'
-}
-
-# Function to stop the demo
-stop_demo() {
-  # Step 2: Stop Plano
-  echo "Stopping Plano..."
-  planoai down
-}
-
-# Main script logic
-if [ "$1" = "down" ]; then
-  stop_demo
-else
-  # Default action is to bring the demo up
-  start_demo
-fi
--- a/demos/use_cases/model_choice_with_test_harness/uv.lock
+++ b/demos/use_cases/model_choice_with_test_harness/uv.lock
--- a/demos/use_cases/multi_agent_with_crewai_langchain/Dockerfile
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/Dockerfile
@ -1,24 +0,0 @@
-FROM python:3.14-slim
-
-WORKDIR /app
-
-# Install system dependencies
-RUN apt-get update && \
-    apt-get install -y --no-install-recommends bash && \
-    rm -rf /var/lib/apt/lists/* && \
-    pip install --no-cache-dir uv
-
-# Install Python dependencies
-COPY pyproject.toml README.md ./
-RUN uv pip install --system .
-
-# Copy application code
-COPY openai_protocol.py ./
-COPY crewai/ ./crewai/
-COPY langchain/ ./langchain/
-
-# Runtime configuration
-ENV PYTHONUNBUFFERED=1 \
-    PYTHONPATH=/app
-
-CMD ["uv", "run", "python", "crewai/flight_agent.py"]
--- a/demos/use_cases/multi_agent_with_crewai_langchain/README.md
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/README.md
@ -1,132 +0,0 @@
-# Travel Agents in CrewAI and LangChain - with Plano
-
-**What you'll see:** A travel assistant that seamlessly combines flight booking (CrewAI) and weather forecasts (LangChain) in a single conversation - with unified routing, orchestration, moderation, and observability across both frameworks.
-
-## The Problem
-
-Building multi-agent systems today forces developers to:
- **Pick one framework** - can't mix CrewAI, LangChain, or custom agents easily
- **Write plumbing code** - authentication, request routing, error handling
- **Rebuild for changes** - want to swap frameworks? Start over
- **Limited observability** - no unified view across different agent frameworks
-
-## Plano's Solution
-
-Plano acts as a **framework-agnostic proxy and data plane** that:
- Routes requests to the right agent(s), in the right order (CrewAI, LangChain, or custom)
- Normalizes requests/responses across frameworks automatically
- Provides unified authentication, tracing, and logs
- Lets you mix and match frameworks without coupling them, so you can add or swap agents independently
-
-## How To Run
-
-### Prerequisites
-
-1. **Install Plano CLI**
-   ```bash
-   uv tool install planoai
-   ```
-
-2. **Set Environment Variables**
-   ```bash
-   export OPENAI_API_KEY=your_key_here
-   export AEROAPI_KEY=your_key_here  # Get your free API key at https://flightaware.com/aeroapi/
-   ```
-
-### Start the Demo
-
-```bash
-# From the demo directory
-cd demos/use_cases/multi_agent_with_crewai_langchain
-
-# Build and start all services
-docker-compose up -d
-```
-
-This starts:
- **Plano** (ports 12000, 8001) - routing and orchestration
- **CrewAI Flight Agent** (port 10520) - flight search
- **LangChain Weather Agent** (port 10510) - weather forecasts
- **AnythingLLM** (port 3001) - chat interface
- **Jaeger** (port 16686) - distributed tracing
-
-### Try It Out
-
-1. **Open the Chat Interface**
-   - Navigate to [http://localhost:3001](http://localhost:3001)
-   - Create an account (stored locally)
-
-2. **Ask Multi-Agent Questions**
-   ```
-   "What's the weather in San Francisco and can you find flights from Seattle to San Francisco?"
-   ```
-
-   Plano automatically:
-   - Routes the weather part to the LangChain agent
-   - Routes the flight part to the CrewAI agent
-   - Combines responses seamlessly
-
-3. **View Distributed Traces**
-   - Open [http://localhost:16686](http://localhost:16686) (Jaeger UI)
-   - See how requests flow through both agents
-
-   ![Tracing Example](./traces.png)
-
-## Architecture
-
-```
-┌──────────────┐
-│ AnythingLLM  │ (Chat Interface)
-└──────┬───────┘
-       │
-       v
-┌─────────────┐
-│    Plano    │ (Orchestration & DataPlane)
-└──────┬──────┘
-       │
-       ├──────────────┬──────────────┐
-       v              v              v
-┌────────────┐ ┌────────────┐ ┌──────────┐
-│  CrewAI    │ │ LangChain  │ │  Jaeger  │
-│   Flight   │ │  Weather   │ │ (Traces) │
-│   Agent    │ │   Agent    │ └──────────┘
-└────────────┘ └────────────┘
-       ├──────────────├
-       v              v
-┌─────────────┐
-│    Plano    │ (Proxy LLM calls)
-└──────┬──────┘
-```
-
-
-## Travel Agents
-
-### Flight Agent
- Framework: CrewAI
- Capabilities: Flight search, itinerary planning
- Tools: `resolve_airport_code`, `search_flights`
- Data Source: FlightAware AeroAPI
-
-### Weather Agent
- Framework: LangChain
- Capabilities: Weather forecasts, conditions
- Tools: `get_weather_forecast`
- Data Source: Open-Meteo API
-
-## Cleanup
-
-```bash
-docker-compose down
-```
-
-## Next Steps
-
- **Add your own agent** - any framework, just expose the OpenAI-compatible endpoint
- **Custom routing** - modify `config.yaml` to change agent selection logic
- **Production deployment** - see [Plano docs](https://docs.katanemo.com) for scaling guidance
-
-## Learn More
-
- [Plano Documentation](https://docs.planoai.dev)
- [CrewAI Documentation](https://docs.crewai.com)
- [LangChain Documentation](https://python.langchain.com)
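
Note on the removed README: its "Try It Out" steps go through the AnythingLLM UI, but the agent listener can also be exercised directly. The following is a minimal sketch, assuming the listener on port 8001 accepts OpenAI-style `POST /v1/chat/completions` requests (the compose file in this commit points AnythingLLM at `http://plano:8001/v1`). The helper names `build_chat_request` and `ask` are hypothetical, and the `gpt-4o-mini` model name simply mirrors the value the compose file passes to AnythingLLM.

```python
import json
import urllib.request

# Assumption: Plano's agent listener from config.yaml, reached from the host.
PLANO_AGENT_LISTENER = "http://localhost:8001/v1"


def build_chat_request(
    prompt: str,
    base_url: str = PLANO_AGENT_LISTENER,
    model: str = "gpt-4o-mini",
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response instead of SSE
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and return the assistant's text (needs the demo running)."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the demo running, `ask("What's the weather in San Francisco?")` would be expected to return the combined agent answer as plain text; building the request separately keeps the payload easy to inspect without a live stack.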
--- a/demos/use_cases/multi_agent_with_crewai_langchain/config.yaml
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/config.yaml
@ -1,57 +0,0 @@
-version: v0.3.0
-
-agents:
-  - id: weather_agent
-    url: http://langchain-weather-agent:10510
-  - id: flight_agent
-    url: http://crewai-flight-agent:10520
-
-model_providers:
-  - model: openai/gpt-4o
-    access_key: $OPENAI_API_KEY
-    default: true
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY # smaller, faster, cheaper model for extracting entities like location
-
-listeners:
-  - type: agent
-    name: travel_booking_service
-    port: 8001
-    router: plano_orchestrator_v1
-    agents:
-      - id: weather_agent
-        description: |
-
-          WeatherAgent is a specialized AI assistant for real-time weather information and forecasts. It provides accurate weather data for any city worldwide using the Open-Meteo API, helping travelers plan their trips with up-to-date weather conditions.
-
-          Capabilities:
-            * Get real-time weather conditions and multi-day forecasts for any city worldwide using Open-Meteo API (free, no API key needed)
-            * Provides current temperature
-            * Provides multi-day forecasts
-            * Provides weather conditions
-            * Provides sunrise/sunset times
-            * Provides detailed weather information
-            * Understands conversation context to resolve location references from previous messages
-            * Handles weather-related questions including "What's the weather in [city]?", "What's the forecast for [city]?", "How's the weather in [city]?"
-            * When queries include both weather and other travel questions (e.g., flights, currency), this agent answers ONLY the weather part
-
-      - id: flight_agent
-        description: |
-
-          FlightAgent is an AI-powered tool specialized in providing live flight information between airports. It leverages the FlightAware AeroAPI to deliver real-time flight status, gate information, and delay updates.
-
-          Capabilities:
-            * Get live flight information between airports using FlightAware AeroAPI
-            * Shows real-time flight status
-            * Shows scheduled/estimated/actual departure and arrival times
-            * Shows gate and terminal information
-            * Shows delays
-            * Shows aircraft type
-            * Shows flight status
-            * Automatically resolves city names to airport codes (IATA/ICAO)
-            * Understands conversation context to infer origin/destination from follow-up questions
-            * Handles flight-related questions including "What flights go from [city] to [city]?", "Do flights go to [city]?", "Are there direct flights from [city]?"
-            * When queries include both flight and other travel questions (e.g., weather, currency), this agent answers ONLY the flight part
-
-tracing:
-  random_sampling: 100
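
Note on the removed config: agents are declared twice, once under the top-level `agents` key (id and URL) and again under the listener (id and description). The following is a minimal sketch of a consistency check between the two, assuming the YAML has already been parsed into a dict. `undeclared_agent_ids` is a hypothetical helper for illustration, not part of Plano, and the sample dict only mirrors the fields relevant to the check.

```python
from typing import Any


def undeclared_agent_ids(config: dict[str, Any]) -> set[str]:
    """Return agent ids used by a listener but missing from top-level `agents`."""
    declared = {agent["id"] for agent in config.get("agents", [])}
    referenced = {
        agent["id"]
        for listener in config.get("listeners", [])
        for agent in listener.get("agents", [])
    }
    return referenced - declared


# Trimmed-down mirror of the config.yaml shown in this diff.
config = {
    "agents": [
        {"id": "weather_agent", "url": "http://langchain-weather-agent:10510"},
        {"id": "flight_agent", "url": "http://crewai-flight-agent:10520"},
    ],
    "listeners": [
        {
            "type": "agent",
            "name": "travel_booking_service",
            "port": 8001,
            "agents": [{"id": "weather_agent"}, {"id": "flight_agent"}],
        }
    ],
}

missing = undeclared_agent_ids(config)
```

An empty result means every agent the listener routes to has a URL declared; a non-empty set names the ids that would need a top-level entry.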
--- a/demos/use_cases/multi_agent_with_crewai_langchain/crewai/flight_agent.py
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/crewai/flight_agent.py
@ -1,430 +0,0 @@
-import json
-import os
-import logging
-import time
-import uuid
-import httpx
-import uvicorn
-from datetime import datetime
-from typing import Optional
-
-from fastapi import FastAPI, Request
-from fastapi.responses import JSONResponse, StreamingResponse
-from openai import AsyncOpenAI
-from opentelemetry.propagate import extract, inject
-from crewai import Agent, Task, Crew, LLM
-from crewai.tools import tool
-
-from openai_protocol import create_chat_completion_chunk
-
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [FLIGHT_AGENT] - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-LLM_GATEWAY_ENDPOINT = os.getenv(
-    "LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12000/v1"
-)
-FLIGHT_MODEL = "openai/gpt-4o"
-EXTRACTION_MODEL = "openai/gpt-4o-mini"
-
-AEROAPI_BASE_URL = "https://aeroapi.flightaware.com/aeroapi"
-AEROAPI_KEY = os.getenv("AEROAPI_KEY")
-
-http_client = httpx.AsyncClient(timeout=30.0)
-openai_client = AsyncOpenAI(base_url=LLM_GATEWAY_ENDPOINT, api_key="EMPTY")
-
-
-SYSTEM_PROMPT = """You are a travel planning assistant specializing in flight information and travel conditions.
-
-CRITICAL: You MUST respond with ONLY the final answer to the user.
-
-DO NOT OUTPUT:
- "Thought:" or any internal thinking
- "Action:" or tool names
- "Action Input:" or parameters
- "Observation:" or tool results
- Any reasoning steps, planning, or internal deliberation
-
-FORMATTING RULES:
- Respond in natural, conversational text ONLY
- NEVER use JSON, code blocks, or technical formats
- Present flight information in a clean, bullet-point list
- Use plain text with proper spacing and line breaks
-
-Flight Information Format:
- Airline Name (Flight Number)
- Departure: Time from Airport Code, Gate info
- Arrival: Time at Airport Code
- Aircraft: Model name
- Status: Current status
-
-Weather Information (when available):
- Present weather data in a clear, readable format
- Include temperature, conditions, and any travel advisories
- Integrate weather context with flight information naturally
- Mention how weather might affect travel plans if relevant
-
-Your task:
-1. Use tools silently (don't mention them to the user)
-2. Convert technical data into friendly, readable text
-3. Use 12-hour time format (e.g., "9:00 AM")
-4. Organize flights chronologically by departure time
-5. Include terminal/gate info when available
-6. When weather data is provided, summarize it clearly and relate it to the travel plans
-7. NOTE (Multi-agent context): If the conversation includes information from other sources (weather, hotels, etc.), incorporate it naturally and cohesively in your response."""
-
-
-def build_flight_crew(
-    request: Request,
-    request_body: dict,
-    streaming: bool,
-):
-    ctx = extract(request.headers)
-    extra_headers = {"x-envoy-max-retries": "3"}
-    request_id = request.headers.get("x-request-id")
-    if request_id:
-        extra_headers["x-request-id"] = request_id
-    inject(extra_headers, context=ctx)
-
-    @tool("resolve_airport_code")
-    async def resolve_airport_code_tool(city_name: str) -> str:
-        """Convert a city name to its primary airport IATA code.
-
-        Args:
-            city_name: Name of the city (e.g., 'Seattle', 'Atlanta', 'Karachi', 'Dubai')
-
-        Returns:
-            3-letter IATA airport code (e.g., 'SEA', 'ATL', 'KHI', 'DXB')
-
-        Examples:
-            Seattle → SEA
-            Atlanta → ATL
-            New York → JFK
-            Dubai → DXB
-            Karachi → KHI
-            Lahore → LHE
-        """
-        code = await resolve_airport_code(city_name, request)
-        if not code:
-            return f"Error: Could not resolve airport code for '{city_name}'"
-        return code
-
-    @tool("search_flights")
-    async def search_flights(
-        origin_code: str, destination_code: str, travel_date: Optional[str] = None
-    ):
-        """Search for flights between two airports using their IATA codes.
-
-        Args:
-            origin_code: Origin airport IATA code (3 letters, e.g., 'SEA', 'KHI')
-            destination_code: Destination airport IATA code (3 letters, e.g., 'ATL', 'DXB')
-            travel_date: Travel date in YYYY-MM-DD format. If not provided, defaults to TODAY.
-
-        Note: Flight data is only available for today and up to 2 days ahead.
-
-        IMPORTANT: Use the resolve_airport_code tool first if you only have city names.
-        """
-        # Default to today's date if not provided
-        if not travel_date:
-            travel_date = datetime.now().strftime("%Y-%m-%d")
-
-        # Validate that we have proper IATA codes (3 letters)
-        if len(origin_code) != 3 or len(destination_code) != 3:
-            return {
-                "error": f"Invalid airport codes. Expected 3-letter IATA codes, got origin='{origin_code}' and destination='{destination_code}'. Use resolve_airport_code tool first to convert city names to codes.",
-            }
-
-        flight_data = await fetch_flights(origin_code, destination_code, travel_date)
-        return {
-            "origin_code": origin_code,
-            "destination_code": destination_code,
-            "travel_date": travel_date,
-            "flights": flight_data.get("flights", []),
-            "count": flight_data.get("count", 0),
-            "error": flight_data.get("error"),
-        }
-
-    llm = LLM(
-        model=FLIGHT_MODEL,
-        api_key="EMPTY",
-        base_url=LLM_GATEWAY_ENDPOINT,
-        temperature=request_body.get("temperature", 0.7),
-        max_tokens=request_body.get("max_tokens", 1000),
-        stream=streaming,
-        extra_headers=extra_headers,
-    )
-
-    agent = Agent(
-        role="Flight Information Specialist",
-        goal="Provide accurate, clear flight options and details for travelers.",
-        backstory=SYSTEM_PROMPT,
-        tools=[resolve_airport_code_tool, search_flights],
-        llm=llm,
-        verbose=True,
-        reasoning=False,
-    )
-
-    task = Task(
-        description=(
-            "Answer the user's request based on this conversation:\n{conversation}\n\n"
-            "CRITICAL: NOTE you are part of a multi-agent setup, so if the conversation includes information from other sources that are not flight-related, incorporate it naturally.\n"
-            "Output ONLY your final answer to the user. Do NOT show:\n"
-            "- Thought, Action, Action Input, Observation, or any reasoning steps\n"
-            "- Tool names, parameters, or results\n"
-            "- Planning or internal deliberation\n\n"
-            "Tool workflow (execute silently):\n"
-            "1. City names → use resolve_airport_code to get IATA codes\n"
-            "2. Use search_flights with the codes\n"
-            "3. Present results conversationally\n\n"
-            "Output requirements:\n"
-            "- Natural conversational text only\n"
-            "- NO JSON, code blocks, or technical formatting\n"
-            "- Clean bullet points with readable times (9:00 AM format)\n"
-            "- Direct answer with no reasoning shown"
-        ),
-        expected_output=(
-            "A direct answer to the user in plain text with flight options. "
-            "NO Thought/Action/Observation. NO code blocks. NO JSON. "
-            "Just natural language with bullet points."
-        ),
-        agent=agent,
-    )
-
-    return Crew(agents=[agent], tasks=[task], stream=streaming, verbose=False)
-
-
-async def resolve_airport_code(city_name: str, request: Request) -> Optional[str]:
-    if not city_name:
-        return None
-
-    try:
-        ctx = extract(request.headers)
-        extra_headers = {}
-        inject(extra_headers, context=ctx)
-
-        response = await openai_client.chat.completions.create(
-            model=EXTRACTION_MODEL,
-            messages=[
-                {
-                    "role": "system",
-                    "content": "Convert city names to primary airport IATA codes. Return only the 3-letter code. Examples: Seattle→SEA, Atlanta→ATL, New York→JFK, Dubai→DXB, Lahore→LHE",
-                },
-                {"role": "user", "content": city_name},
-            ],
-            temperature=0.1,
-            max_tokens=10,
-            extra_headers=extra_headers or None,
-        )
-
-        code = response.choices[0].message.content.strip().upper()
-        code = code.strip("\"'`.,!? \n\t")
-        return code if len(code) == 3 else None
-
-    except Exception as e:
-        logger.error(f"Error resolving airport code for {city_name}: {e}")
-        return None
-
-
-async def fetch_flights(
-    origin_code: str, dest_code: str, travel_date: Optional[str] = None
-) -> dict:
-    """Fetch flights between two airports. Note: FlightAware limits to 2 days ahead."""
-    search_date = travel_date or datetime.now().strftime("%Y-%m-%d")
-
-    search_date_obj = datetime.strptime(search_date, "%Y-%m-%d")
-    today = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
-    days_ahead = (search_date_obj - today).days
-
-    if days_ahead > 2:
-        logger.warning(
-            f"Date {search_date} is {days_ahead} days ahead, exceeds FlightAware limit"
-        )
-        return {
-            "origin_code": origin_code,
-            "destination_code": dest_code,
-            "flights": [],
-            "count": 0,
-            "error": f"FlightAware API only provides data up to 2 days ahead. Requested date ({search_date}) is {days_ahead} days away.",
-        }
-
-    try:
-        url = f"{AEROAPI_BASE_URL}/airports/{origin_code}/flights/to/{dest_code}"
-        headers = {"x-apikey": AEROAPI_KEY}
-        params = {
-            "start": f"{search_date}T00:00:00Z",
-            "end": f"{search_date}T23:59:59Z",
-            "connection": "nonstop",
-            "max_pages": 1,
-        }
-
-        response = await http_client.get(url, headers=headers, params=params)
-
-        if response.status_code != 200:
-            logger.error(
-                f"FlightAware API error {response.status_code}: {response.text}"
-            )
-            return {
-                "origin_code": origin_code,
-                "destination_code": dest_code,
-                "flights": [],
-                "count": 0,
-                "error": f"FlightAware API returned status {response.status_code}.",
-            }
-
-        data = response.json()
-        flights = []
-
-        for flight_group in data.get("flights", [])[:5]:
-            segments = flight_group.get("segments", [])
-            if not segments:
-                continue
-
-            flight = segments[0]
-            flights.append(
-                {
-                    "airline": flight.get("operator"),
-                    "flight_number": flight.get("ident_iata") or flight.get("ident"),
-                    "departure_time": flight.get("scheduled_out"),
-                    "arrival_time": flight.get("scheduled_in"),
-                    "origin": flight["origin"].get("code_iata")
-                    if isinstance(flight.get("origin"), dict)
-                    else None,
-                    "destination": flight["destination"].get("code_iata")
-                    if isinstance(flight.get("destination"), dict)
-                    else None,
-                    "aircraft_type": flight.get("aircraft_type"),
-                    "status": flight.get("status"),
-                    "terminal_origin": flight.get("terminal_origin"),
-                    "gate_origin": flight.get("gate_origin"),
-                }
-            )
-
-        logger.info(f"Found {len(flights)} flights from {origin_code} to {dest_code}")
-        return {
-            "origin_code": origin_code,
-            "destination_code": dest_code,
-            "flights": flights,
-            "count": len(flights),
-        }
-
-    except Exception as e:
-        logger.error(f"Error fetching flights: {e}")
-        return {
-            "origin_code": origin_code,
-            "destination_code": dest_code,
-            "flights": [],
-            "count": 0,
-            "error": "Could not fetch flight data due to an unexpected error.",
-        }
-
-
-app = FastAPI(title="Flight Information Agent", version="1.0.0")
-
-
-@app.post("/v1/chat/completions")
-async def handle_request(request: Request):
-    request_body = await request.json()
-    is_streaming = request_body.get("stream", True)
-    model = request_body.get("model", FLIGHT_MODEL)
-
-    if is_streaming:
-        return StreamingResponse(
-            invoke_flight_agent_stream(request, request_body, model),
-            media_type="text/event-stream",
-            headers={"content-type": "text/event-stream"},
-        )
-
-    content = await invoke_flight_agent(request, request_body)
-    return JSONResponse(
-        {
-            "id": f"chatcmpl-{uuid.uuid4().hex[:8]}",
-            "object": "chat.completion",
-            "created": int(time.time()),
-            "model": model,
-            "choices": [
-                {
-                    "index": 0,
-                    "message": {"role": "assistant", "content": content},
-                    "finish_reason": "stop",
-                }
-            ],
-        }
-    )
-
-
-async def invoke_flight_agent(request: Request, request_body: dict):
-    """Generate flight information using a CrewAI agent."""
-    messages = request_body.get("messages", [])
-    crew = build_flight_crew(request, request_body, streaming=False)
-    conversation = json.dumps(messages, indent=2)
-
-    try:
-        result = await crew.kickoff_async(inputs={"conversation": conversation})
-        if hasattr(result, "raw"):
-            return result.raw
-        return str(result)
-    except Exception as e:
-        logger.error(f"Error generating response: {e}")
-        return "I'm having trouble retrieving flight information right now. Please try again."
-
-
-async def invoke_flight_agent_stream(
-    request: Request,
-    request_body: dict,
-    model: str,
-):
-    messages = request_body.get("messages", [])
-    crew = build_flight_crew(request, request_body, streaming=True)
-    conversation = json.dumps(messages, indent=2)
-
-    try:
-        streaming = crew.kickoff(inputs={"conversation": conversation})
-        for chunk in streaming:
-            content = getattr(chunk, "content", None)
-            if content is None:
-                content = str(chunk)
-            if not content:
-                continue
-            yield f"data: {create_chat_completion_chunk(model, content).model_dump_json()}\n\n"
-
-        yield f"data: {create_chat_completion_chunk(model, '', 'stop').model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-    except Exception as e:
-        logger.error(f"Error streaming response: {e}")
-        error_message = "I'm having trouble retrieving flight information right now. Please try again."
-        yield f"data: {create_chat_completion_chunk(model, error_message, 'stop').model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-
-
-@app.get("/health")
-async def health_check():
-    return {"status": "healthy", "agent": "flight_information"}
-
-
-def start_server(host: str = "0.0.0.0", port: int = 10520):
-    uvicorn.run(
-        app,
-        host=host,
-        port=port,
-        log_config={
-            "version": 1,
-            "disable_existing_loggers": False,
-            "formatters": {
-                "default": {
-                    "format": "%(asctime)s - [FLIGHT_AGENT] - %(levelname)s - %(message)s"
-                }
-            },
-            "handlers": {
-                "default": {
-                    "formatter": "default",
-                    "class": "logging.StreamHandler",
-                    "stream": "ext://sys.stdout",
-                }
-            },
-            "root": {"level": "INFO", "handlers": ["default"]},
-        },
-    )
-
-
-if __name__ == "__main__":
-    start_server()
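
Note on the removed `flight_agent.py`: it imports `create_chat_completion_chunk` from `openai_protocol.py`, which is not shown in this part of the diff. The following is a rough sketch of the server-sent-events framing that `invoke_flight_agent_stream` relies on, using plain dicts in place of that helper. `chat_chunk` and `sse_stream` are hypothetical names, and the chunk fields follow the OpenAI `chat.completion.chunk` shape rather than the removed module's actual code.

```python
import json
import time
import uuid
from typing import Iterable, Iterator, Optional


def chat_chunk(model: str, content: str, finish_reason: Optional[str] = None) -> dict:
    """Build one OpenAI-style streaming chunk as a plain dict."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:8]}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "delta": {"content": content},
                "finish_reason": finish_reason,
            }
        ],
    }


def sse_stream(model: str, pieces: Iterable[str]) -> Iterator[str]:
    """Frame text pieces as SSE events, then a stop chunk and the [DONE] marker."""
    for piece in pieces:
        if not piece:
            continue  # skip empty pieces, as the agent's streaming loop does
        yield f"data: {json.dumps(chat_chunk(model, piece))}\n\n"
    yield f"data: {json.dumps(chat_chunk(model, '', 'stop'))}\n\n"
    yield "data: [DONE]\n\n"
```

Each event is a `data:` line followed by a blank line, and the final `[DONE]` sentinel is what OpenAI-compatible clients use to detect the end of the stream.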
--- a/demos/use_cases/multi_agent_with_crewai_langchain/docker-compose.yaml
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/docker-compose.yaml
@ -1,62 +0,0 @@
-
-services:
-  plano:
-    build:
-      context: ../../../
-      dockerfile: Dockerfile
-    ports:
-      - "8001:8001"
-      - "12000:12000"
-    environment:
-      - PLANO_CONFIG_PATH=/app/plano_config.yaml
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-      - OTEL_TRACING_GRPC_ENDPOINT=http://jaeger:4317
-      - LOG_LEVEL=${LOG_LEVEL:-info}
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml:ro
-      - /etc/ssl/cert.pem:/etc/ssl/cert.pem
-
-  crewai-flight-agent:
-    build:
-      dockerfile: Dockerfile
-    restart: always
-    ports:
-      - "10520:10520"
-    environment:
-      - LLM_GATEWAY_ENDPOINT=http://plano:12000/v1
-      - AEROAPI_KEY=${AEROAPI_KEY:?AEROAPI_KEY environment variable is required but not set}
-      - PYTHONUNBUFFERED=1
-    command: ["python", "-u", "crewai/flight_agent.py"]
-
-  langchain-weather-agent:
-    build:
-      dockerfile: Dockerfile
-    restart: always
-    ports:
-      - "10510:10510"
-    environment:
-      - LLM_GATEWAY_ENDPOINT=http://plano:12000/v1
-    command: ["python", "-u", "langchain/weather_agent.py"]
-
-  anythingllm:
-    image: mintplexlabs/anythingllm
-    restart: always
-    ports:
-      - "3001:3001"
-    cap_add:
-      - SYS_ADMIN
-    environment:
-      - STORAGE_DIR=/app/server/storage
-      - LLM_PROVIDER=generic-openai
-      - GENERIC_OPEN_AI_BASE_PATH=http://plano:8001/v1
-      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
-      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
-      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
-
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    restart: always
-    ports:
-      - "16686:16686"  # Jaeger UI
-      - "4317:4317"    # OTLP gRPC receiver
--- a/demos/use_cases/multi_agent_with_crewai_langchain/langchain/weather_agent.py
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/langchain/weather_agent.py
@ -1,459 +0,0 @@
-import json
-import logging
-import os
-import time
-import uuid
-from datetime import datetime
-from typing import Optional
-from urllib.parse import quote
-
-import httpx
-import uvicorn
-from fastapi import FastAPI, Request
-from fastapi.responses import JSONResponse, StreamingResponse
-from langchain.agents import create_agent
-from langchain_core.tools import tool
-from langchain_openai import ChatOpenAI
-from openai import AsyncOpenAI
-from opentelemetry.propagate import extract, inject
-from pydantic import BaseModel, Field
-
-from openai_protocol import create_chat_completion_chunk
-
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [WEATHER_AGENT] - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-LLM_GATEWAY_ENDPOINT = os.getenv(
-    "LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12000/v1"
-)
-WEATHER_MODEL = "gpt-4o"
-LOCATION_MODEL = "gpt-4o-mini"
-
-openai_client_via_plano = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",
-)
-
-app = FastAPI(title="Weather Forecast Agent", version="1.0.0")
-
-http_client = httpx.AsyncClient(timeout=10.0)
-
-
-def celsius_to_fahrenheit(temp_c: Optional[float]) -> Optional[float]:
-    return round(temp_c * 9 / 5 + 32, 1) if temp_c is not None else None
-
-
-async def get_weather_data(
-    request: Request,
-    messages: list,
-    days: int = 1,
-    request_id: Optional[str] = None,
-    city_override: Optional[str] = None,
-):
-    instructions = """You are a city name extractor. Look at the FINAL user message ONLY and extract the city name.
-
-    The FINAL user message will be the LAST message with role "user" in the conversation.
-
-    IMPORTANT: Use earlier messages only to resolve references such as "there". Otherwise focus ONLY on the FINAL user message.
-
-    Examples of what to extract from the FINAL user message:
-    - "What's the weather in Seattle?" -> Seattle
-    - "What's the weather in San Francisco?" -> San Francisco
-    - "What about Dubai?" -> Dubai
-    - "How's the weather in Tokyo today?" -> Tokyo
-    - "Tell me about Lahore" -> Lahore
-    - "What about there?" -> Look at conversation for the last mentioned city
-
-    Output ONLY the city name. Nothing else. One word or city name only.
-    If no city can be found, output: NOT_FOUND"""
-
-    location = city_override
-    if not location:
-        try:
-            user_messages = [
-                msg.get("content") for msg in messages if msg.get("role") == "user"
-            ]
-
-            if not user_messages:
-                location = "New York"
-            else:
-                ctx = extract(request.headers)
-                extra_headers = {}
-                if request_id:
-                    extra_headers["x-request-id"] = request_id
-                inject(extra_headers, context=ctx)
-                response = await openai_client_via_plano.chat.completions.create(
-                    model=LOCATION_MODEL,
-                    messages=[
-                        {"role": "system", "content": instructions},
-                        *[
-                            {"role": msg.get("role"), "content": msg.get("content")}
-                            for msg in messages
-                        ],
-                    ],
-                    temperature=0.1,
-                    max_tokens=10,
-                    extra_headers=extra_headers if extra_headers else None,
-                )
-
-                location = response.choices[0].message.content.strip().strip("\"'`.,!?")
-
-                if not location or location.upper() == "NOT_FOUND":
-                    location = "New York"
-                    logger.info("Location not found, defaulting to: %s", location)
-
-        except Exception as e:
-            logger.error("Error extracting location: %s", e)
-            location = "New York"
-
-    logger.info("Fetching weather for location: '%s' (days: %s)", location, days)
-
-    try:
-        geocode_url = (
-            "https://geocoding-api.open-meteo.com/v1/search?"
-            f"name={quote(location)}&count=1&language=en&format=json"
-        )
-        geocode_response = await http_client.get(geocode_url)
-
-        if geocode_response.status_code != 200 or not geocode_response.json().get(
-            "results"
-        ):
-            logger.warning("Could not geocode %s, using New York", location)
-            location = "New York"
-            geocode_url = (
-                "https://geocoding-api.open-meteo.com/v1/search?"
-                f"name={quote(location)}&count=1&language=en&format=json"
-            )
-            geocode_response = await http_client.get(geocode_url)
-
-        geocode_data = geocode_response.json()
-        if not geocode_data.get("results"):
-            return {
-                "location": location,
-                "weather": {
-                    "date": datetime.now().strftime("%Y-%m-%d"),
-                    "day_name": datetime.now().strftime("%A"),
-                    "temperature_c": None,
-                    "temperature_f": None,
-                    "weather_code": None,
-                    "error": "Could not retrieve weather data",
-                },
-            }
-
-        result = geocode_data["results"][0]
-        location_name = result.get("name", location)
-        latitude = result["latitude"]
-        longitude = result["longitude"]
-
-        logger.info(
-            "Geocoded '%s' to %s (%s, %s)", location, location_name, latitude, longitude
-        )
-
-        weather_url = (
-            "https://api.open-meteo.com/v1/forecast?"
-            f"latitude={latitude}&longitude={longitude}&"
-            "current=temperature_2m&"
-            "daily=sunrise,sunset,temperature_2m_max,temperature_2m_min,weather_code&"
-            f"forecast_days={days}&timezone=auto"
-        )
-
-        weather_response = await http_client.get(weather_url)
-        if weather_response.status_code != 200:
-            return {
-                "location": location_name,
-                "weather": {
-                    "date": datetime.now().strftime("%Y-%m-%d"),
-                    "day_name": datetime.now().strftime("%A"),
-                    "temperature_c": None,
-                    "temperature_f": None,
-                    "weather_code": None,
-                    "error": "Could not retrieve weather data",
-                },
-            }
-
-        weather_data = weather_response.json()
-        current_temp = weather_data.get("current", {}).get("temperature_2m")
-        daily = weather_data.get("daily", {})
-
-        forecast = []
-        for i in range(min(days, len(daily.get("time", [])))):
-            date_str = daily["time"][i]
-            date_obj = datetime.fromisoformat(date_str.replace("Z", "+00:00"))
-
-            temp_max = (
-                daily.get("temperature_2m_max", [])[i]
-                if daily.get("temperature_2m_max")
-                else None
-            )
-            temp_min = (
-                daily.get("temperature_2m_min", [])[i]
-                if daily.get("temperature_2m_min")
-                else None
-            )
-            weather_code = (
-                daily.get("weather_code", [0])[i] if daily.get("weather_code") else 0
-            )
-            sunrise = daily.get("sunrise", [])[i] if daily.get("sunrise") else None
-            sunset = daily.get("sunset", [])[i] if daily.get("sunset") else None
-
-            temp_c = (
-                temp_max
-                if temp_max is not None
-                else (current_temp if i == 0 and current_temp is not None else temp_min)
-            )
-
-            forecast.append(
-                {
-                    "date": date_str.split("T")[0],
-                    "day_name": date_obj.strftime("%A"),
-                    "temperature_c": round(temp_c, 1) if temp_c is not None else None,
-                    "temperature_f": celsius_to_fahrenheit(temp_c),
-                    "temperature_max_c": (
-                        round(temp_max, 1) if temp_max is not None else None
-                    ),
-                    "temperature_min_c": (
-                        round(temp_min, 1) if temp_min is not None else None
-                    ),
-                    "weather_code": weather_code,
-                    "sunrise": sunrise.split("T")[1] if sunrise else None,
-                    "sunset": sunset.split("T")[1] if sunset else None,
-                }
-            )
-
-        return {"location": location_name, "forecast": forecast}
-
-    except Exception as e:
-        logger.error("Error getting weather data: %s", e)
-        return {
-            "location": location,
-            "weather": {
-                "date": datetime.now().strftime("%Y-%m-%d"),
-                "day_name": datetime.now().strftime("%A"),
-                "temperature_c": None,
-                "temperature_f": None,
-                "weather_code": None,
-                "error": "Could not retrieve weather data",
-            },
-        }
-
-
-class WeatherToolInput(BaseModel):
-    city: str = Field(..., description="City name to look up weather for")
-    days: int = Field(
-        1,
-        ge=1,
-        le=16,
-        description="Number of forecast days (1-16). Defaults to 1 (current).",
-    )
-
-
-WEATHER_SYSTEM_PROMPT = """You are a weather and travel conditions assistant in a multi-agent system. You will receive weather data in JSON format with these fields:
-
-    - "location": City name
-    - "forecast": Array of weather objects, each with date, day_name, temperature_c, temperature_f, temperature_max_c, temperature_min_c, weather_code, sunrise, sunset
-    - weather_code: WMO code (0=clear, 1-3=partly cloudy, 45-48=fog, 51-67=rain, 71-86=snow, 95-99=thunderstorm)
-
-    Your task:
-    1. Present the weather/forecast clearly for the location
-    2. For single day: show current conditions
-    3. For multi-day: show each day with date and conditions
-    4. Include temperature in both Celsius and Fahrenheit
-    5. Describe conditions naturally based on weather_code
-    6. Use conversational language
-    7. When flight information is present in the conversation, summarize it clearly:
-       - Present flight details in a readable format (airline, times, gates, status)
-       - Integrate flight and weather information cohesively
-       - Mention how weather might affect the flights if relevant
-    8. NOTE (Multi-agent context): If the conversation includes information from other agents and sources (flights, hotels, etc.), incorporate it naturally and provide a comprehensive travel summary.
-
-    Remember: Only use the provided data. If fields are null, mention data is unavailable."""
-
-
-def build_weather_agent(
-    request: Request,
-    request_body: dict,
-    streaming: bool,
-):
-    messages = request_body.get("messages", [])
-    ctx = extract(request.headers)
-    extra_headers = {"x-envoy-max-retries": "3"}
-    request_id = request.headers.get("x-request-id")
-    if request_id:
-        extra_headers["x-request-id"] = request_id
-        logger.debug("Request ID set: [redacted]")
-    inject(extra_headers, context=ctx)
-
-    @tool("get_weather_forecast", args_schema=WeatherToolInput)
-    async def get_weather_forecast(city: str, days: int = 1):
-        """Fetch a structured weather forecast for a city."""
-        return await get_weather_data(
-            request,
-            messages,
-            days,
-            request_id=request_id,
-            city_override=city,
-        )
-
-    llm = ChatOpenAI(
-        model=WEATHER_MODEL,
-        api_key="EMPTY",
-        base_url=LLM_GATEWAY_ENDPOINT,
-        temperature=request_body.get("temperature", 0.7),
-        max_tokens=request_body.get("max_tokens", 1000),
-        streaming=streaming,
-        default_headers=extra_headers,
-    )
-
-    return create_agent(
-        model=llm,
-        tools=[get_weather_forecast],
-        system_prompt=WEATHER_SYSTEM_PROMPT,
-    )
-
-
-@app.post("/v1/chat/completions")
-async def handle_request(request: Request):
-    request_body = await request.json()
-    is_streaming = request_body.get("stream", True)
-
-    try:
-        model = request_body.get("model", WEATHER_MODEL)
-
-        if is_streaming:
-            return StreamingResponse(
-                invoke_weather_agent_stream(request, request_body, model),
-                media_type="text/event-stream",
-                headers={"content-type": "text/event-stream"},
-            )
-
-        content = await invoke_weather_agent(request, request_body)
-        return JSONResponse(
-            {
-                "id": f"chatcmpl-{uuid.uuid4().hex[:8]}",
-                "object": "chat.completion",
-                "created": int(time.time()),
-                "model": model,
-                "choices": [
-                    {
-                        "index": 0,
-                        "message": {"role": "assistant", "content": content},
-                        "finish_reason": "stop",
-                    }
-                ],
-            }
-        )
-    except Exception as e:
-        logger.error("Error generating weather response: %s", e)
-        if is_streaming:
-            return StreamingResponse(
-                invoke_weather_agent_error_stream(
-                    request_body,
-                    "I'm having trouble retrieving weather information right now. Please try again.",
-                ),
-                media_type="text/event-stream",
-                headers={"content-type": "text/event-stream"},
-            )
-        return JSONResponse(
-            {
-                "error": {
-                    "message": "I'm having trouble retrieving weather information right now. Please try again.",
-                    "type": "server_error",
-                }
-            },
-            status_code=500,
-        )
-
-
-async def invoke_weather_agent(
-    request: Request,
-    request_body: dict,
-):
-    messages = request_body.get("messages", [])
-    agent = build_weather_agent(request, request_body, streaming=False)
-
-    result = await agent.ainvoke({"messages": messages})
-    final_message = result["messages"][-1]
-    return (
-        final_message.content
-        if hasattr(final_message, "content")
-        else str(final_message)
-    )
-
-
-async def invoke_weather_agent_stream(
-    request: Request,
-    request_body: dict,
-    model: str,
-):
-    messages = request_body.get("messages", [])
-    agent = build_weather_agent(request, request_body, streaming=True)
-
-    try:
-        async for event in agent.astream_events(
-            {"messages": messages},
-            version="v2",
-        ):
-            if event.get("event") != "on_chat_model_stream":
-                continue
-            chunk = event.get("data", {}).get("chunk")
-            content = getattr(chunk, "content", None)
-            if not content:
-                continue
-            if isinstance(content, list):
-                content = "".join(
-                    piece for piece in content if isinstance(piece, str)
-                ).strip()
-                if not content:
-                    continue
-            yield f"data: {create_chat_completion_chunk(model, content).model_dump_json()}\n\n"
-
-        yield f"data: {create_chat_completion_chunk(model, '', 'stop').model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-    except Exception as e:
-        logger.error("Error streaming weather response: %s", e)
-        error_message = "I'm having trouble retrieving weather information right now. Please try again."
-        yield f"data: {create_chat_completion_chunk(model, error_message, 'stop').model_dump_json()}\n\n"
-        yield "data: [DONE]\n\n"
-
-
-async def invoke_weather_agent_error_stream(request_body: dict, error_message: str):
-    model = request_body.get("model", WEATHER_MODEL)
-    yield f"data: {create_chat_completion_chunk(model, error_message, 'stop').model_dump_json()}\n\n"
-    yield "data: [DONE]\n\n"
-
-
-@app.get("/health")
-async def health_check():
-    return {"status": "healthy", "agent": "weather_forecast"}
-
-
-def start_server(host: str = "localhost", port: int = 10510):
-    uvicorn.run(
-        app,
-        host=host,
-        port=port,
-        log_config={
-            "version": 1,
-            "disable_existing_loggers": False,
-            "formatters": {
-                "default": {
-                    "format": "%(asctime)s - [WEATHER_AGENT] - %(levelname)s - %(message)s",
-                }
-            },
-            "handlers": {
-                "default": {
-                    "formatter": "default",
-                    "class": "logging.StreamHandler",
-                    "stream": "ext://sys.stdout",
-                }
-            },
-            "root": {"level": "INFO", "handlers": ["default"]},
-        },
-    )
-
-
-if __name__ == "__main__":
-    start_server(host="0.0.0.0", port=10510)
--- a/demos/use_cases/multi_agent_with_crewai_langchain/openai_protocol.py
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/openai_protocol.py
@ -1,36 +0,0 @@
-"""OpenAI API protocol utilities for standardized response formatting."""
-
-import time
-from typing import Optional
-from openai.types.chat import ChatCompletionChunk
-from openai.types.chat.chat_completion_chunk import Choice, ChoiceDelta
-
-
-def create_chat_completion_chunk(
-    model: str,
-    content: str,
-    finish_reason: Optional[str] = None,
-) -> ChatCompletionChunk:
-    """Create an OpenAI-compatible streaming chat completion chunk.
-
-    Args:
-        model: Model identifier to include in the response
-        content: Content text for this chunk
-        finish_reason: Optional finish reason ('stop', 'length', etc.)
-
-    Returns:
-        ChatCompletionChunk object from OpenAI SDK
-    """
-    return ChatCompletionChunk(
-        id=f"chatcmpl-{int(time.time() * 1000000)}",
-        object="chat.completion.chunk",
-        created=int(time.time()),
-        model=model,
-        choices=[
-            Choice(
-                index=0,
-                delta=ChoiceDelta(content=content) if content else ChoiceDelta(),
-                finish_reason=finish_reason,
-            )
-        ],
-    )
--- a/demos/use_cases/multi_agent_with_crewai_langchain/pyproject.toml
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/pyproject.toml
@ -1,26 +0,0 @@
-[project]
-name = "multi-framework-agents"
-version = "0.1.0"
-description = "Multi-Framework Travel Agents - CrewAI and LangChain integration"
-readme = "README.md"
-requires-python = ">=3.10"
-dependencies = [
-    "click>=8.2.1",
-    "pydantic>=2.11.7",
-    "fastapi>=0.115.0",
-    "uvicorn>=0.30.0",
-    "openai>=1.0.0",
-    "httpx>=0.24.0",
-    "opentelemetry-api>=1.20.0",
-    "crewai[tools]>=0.70.0",
-    "langchain>=1.0.0",
-    "langchain-core>=1.0.0",
-    "langchain-openai>=0.3.0",
-]
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
-
-[tool.hatch.build.targets.wheel]
-packages = ["crewai", "langchain"]
--- a/demos/use_cases/multi_agent_with_crewai_langchain/traces.png
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/traces.png
--- a/demos/use_cases/multi_agent_with_crewai_langchain/uv.lock
+++ b/demos/use_cases/multi_agent_with_crewai_langchain/uv.lock
--- a/demos/use_cases/ollama/README.md
+++ b/demos/use_cases/ollama/README.md
@ -1,3 +0,0 @@
-This demo shows how to use Ollama as the upstream LLM.
-
-Before starting the demo, make sure Ollama is up and running. You can run `ollama run llama3.2` to start the Llama 3.2 (3B) model locally on port `11434`.
--- a/demos/use_cases/ollama/config.yaml
+++ b/demos/use_cases/ollama/config.yaml
@ -1,48 +0,0 @@
-version: v0.1.0
-
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
-
-  - model: my_llm_provider/llama3.2
-    provider_interface: openai
-    base_url: http://host.docker.internal:11434
-    default: true
-
-system_prompt: |
-  You are a helpful assistant.
-
-prompt_targets:
-  - name: currency_exchange
-    description: Get currency exchange rate from USD to other currencies
-    parameters:
-      - name: currency_symbol
-        description: the currency that needs conversion
-        required: true
-        type: str
-        in_path: true
-    endpoint:
-      name: frankfurther_api
-      path: /v1/latest?base=USD&symbols={currency_symbol}
-    system_prompt: |
-      You are a helpful assistant. Show me the currency symbol you want to convert from USD.
-
-  - name: get_supported_currencies
-    description: Get list of supported currencies for conversion
-    endpoint:
-      name: frankfurther_api
-      path: /v1/currencies
-
-endpoints:
-  frankfurther_api:
-    endpoint: api.frankfurter.dev:443
-    protocol: https
-
-tracing:
-  random_sampling: 100
-  trace_arch_internal: true
--- a/demos/use_cases/ollama/docker-compose.yaml
+++ b/demos/use_cases/ollama/docker-compose.yaml
@ -1,21 +0,0 @@
-services:
-  chatbot_ui:
-    build:
-      context: ../../shared/chatbot_ui
-    ports:
-      - "18080:8080"
-    environment:
-      # this is only because we are running the sample app in the same docker container environment as archgw
-      - CHAT_COMPLETION_ENDPOINT=http://host.docker.internal:12000/v1
-    extra_hosts:
-      - "host.docker.internal:host-gateway"
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml
-
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    ports:
-      - "16686:16686"
-      - "4317:4317"
-      - "4318:4318"
--- a/demos/use_cases/ollama/docker-compose_honeycomb.yaml
+++ b/demos/use_cases/ollama/docker-compose_honeycomb.yaml
@ -1,26 +0,0 @@
-services:
-  chatbot_ui:
-    build:
-      context: ../../shared/chatbot_ui
-    ports:
-      - "18080:8080"
-    environment:
-      # this is only because we are running the sample app in the same docker container environment as archgw
-      - CHAT_COMPLETION_ENDPOINT=http://host.docker.internal:10000/v1
-    extra_hosts:
-      - "host.docker.internal:host-gateway"
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml
-
-  otel-collector:
-    build:
-      context: ../../shared/honeycomb/
-    ports:
-      - "4317:4317"
-      - "4318:4318"
-    volumes:
-      - ../../shared/honeycomb/otel-collector-config.yaml:/etc/otel-collector-config.yaml
-    env_file:
-      - .env
-    environment:
-      - HONEYCOMB_API_KEY=${HONEYCOMB_API_KEY:?error}
--- a/demos/use_cases/ollama/run_demo.sh
+++ b/demos/use_cases/ollama/run_demo.sh
@ -1,47 +0,0 @@
-#!/bin/bash
-set -e
-
-# Function to start the demo
-start_demo() {
-  # Step 1: Check if .env file exists
-  if [ -f ".env" ]; then
-    echo ".env file already exists. Skipping creation."
-  else
-    # Step 2: Create `.env` file and set OpenAI key
-    if [ -z "$OPENAI_API_KEY" ]; then
-      echo "Error: OPENAI_API_KEY environment variable is not set for the demo."
-      exit 1
-    fi
-
-    echo "Creating .env file..."
-    echo "OPENAI_API_KEY=$OPENAI_API_KEY" > .env
-    echo ".env file created with OPENAI_API_KEY."
-  fi
-
-  # Step 3: Start Plano
-  echo "Starting Plano with config.yaml..."
-  planoai up config.yaml
-
-  # Step 4: Start developer services
-  echo "Starting Network Agent using Docker Compose..."
-  docker compose up -d  # Run in detached mode
-}
-
-# Function to stop the demo
-stop_demo() {
-  # Step 1: Stop Docker Compose services
-  echo "Stopping Network Agent using Docker Compose..."
-  docker compose down
-
-  # Step 2: Stop Plano
-  echo "Stopping Plano..."
-  planoai down
-}
-
-# Main script logic
-if [ "$1" == "down" ]; then
-  stop_demo
-else
-  # Default action is to bring the demo up
-  start_demo
-fi
--- a/demos/use_cases/preference_based_routing/README.md
+++ b/demos/use_cases/preference_based_routing/README.md
@ -1,54 +0,0 @@
-# Usage based LLM Routing
-This demo shows how you can use user preferences to route user prompts to the appropriate LLM. See [config.yaml](config.yaml) for details on how to define user preferences.
-
-## How to start the demo
-
-Make sure your machine is up to date with the [latest version of plano](https://github.com/katanemo/plano/tree/main?tab=readme-ov-file#prerequisites) and that you have activated the virtual environment.
-
-
-1. Start AnythingLLM
-```bash
-(venv) $ cd demos/use_cases/preference_based_routing
-(venv) $ docker compose up -d
-```
-2. Start plano in the foreground
-```bash
-(venv) $ planoai up --service plano --foreground
-# Or if installed with uv: uvx planoai up --service plano --foreground
-2025-05-30 18:00:09,953 - planoai.main - INFO - Starting plano cli version: 0.4.6
-2025-05-30 18:00:09,953 - planoai.main - INFO - Validating /Users/adilhafeez/src/intelligent-prompt-gateway/demos/use_cases/preference_based_routing/config.yaml
-2025-05-30 18:00:10,422 - cli.core - INFO - Starting plano gateway, image name: plano, tag: katanemo/plano:0.4.6
-2025-05-30 18:00:10,662 - cli.core - INFO - plano status: running, health status: starting
-2025-05-30 18:00:11,712 - cli.core - INFO - plano status: running, health status: starting
-2025-05-30 18:00:12,761 - cli.core - INFO - plano is running and is healthy!
-...
-```
-
-3. Open AnythingLLM at http://localhost:3001/
-
-# Testing out preference based routing
-
-We have defined two routes: 1. code generation and 2. code understanding.
-
-For a code generation query, the LLM that is better suited for code generation will handle the request.
-
-
-If you look at the logs, you'll see that the code generation LLM was selected:
-
-```
-...
-2025-05-31T01:02:19.382716Z  INFO brightstaff::router::llm_router: router response: {'route': 'code_generation'}, response time: 203ms
-...
-```
-
-<img width="1036" alt="image" src="https://github.com/user-attachments/assets/f923944b-ddbe-462e-9fd5-c75504adc8cf" />
-
-Now if you ask a query related to code understanding, you'll see that the LLM better suited to code understanding handles the request:
-
-```
-...
-2025-05-31T01:06:33.555680Z  INFO brightstaff::router::llm_router: router response: {'route': 'code_understanding'}, response time: 327ms
-...
-```
-
-<img width="1081" alt="image" src="https://github.com/user-attachments/assets/e50d167c-46a0-4e3a-ba77-e84db1bd376d" />
--- a/demos/use_cases/preference_based_routing/config.yaml
+++ b/demos/use_cases/preference_based_routing/config.yaml
@ -1,29 +0,0 @@
-version: v0.1.0
-
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
-
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY
-    default: true
-
-  - model: openai/gpt-4o
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code understanding
-        description: understand and explain existing code snippets, functions, or libraries
-
-  - model: anthropic/claude-sonnet-4-20250514
-    access_key: $ANTHROPIC_API_KEY
-    routing_preferences:
-      - name: code generation
-        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
-
-tracing:
-  random_sampling: 100
--- a/demos/use_cases/preference_based_routing/docker-compose.yaml
+++ b/demos/use_cases/preference_based_routing/docker-compose.yaml
@ -1,52 +0,0 @@
-services:
-
-  plano:
-    build:
-      context: ../../../
-      dockerfile: Dockerfile
-    ports:
-      - "12000:12000"
-      - "12001:12001"
-    environment:
-      - PLANO_CONFIG_PATH=/app/plano_config.yaml
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY environment variable is required but not set}
-      - OTEL_TRACING_GRPC_ENDPOINT=http://host.docker.internal:4317
-      - OTEL_TRACING_ENABLED=true
-      - RUST_LOG=debug
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml:ro
-      - /etc/ssl/cert.pem:/etc/ssl/cert.pem
-
-  anythingllm:
-    image: mintplexlabs/anythingllm
-    restart: always
-    ports:
-      - "3001:3001"
-    cap_add:
-      - SYS_ADMIN
-    environment:
-      - STORAGE_DIR=/app/server/storage
-      - LLM_PROVIDER=generic-openai
-      - GENERIC_OPEN_AI_BASE_PATH=http://plano:12000/v1
-      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
-      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
-      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
-
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    ports:
-      - "16686:16686"
-      - "4317:4317"
-      - "4318:4318"
-
-  # prometheus:
-  #   build:
-  #     context: ../../shared/prometheus
-
-  # grafana:
-  #   build:
-  #     context: ../../shared/grafana
-  #   ports:
-  #     - "3000:3000"
--- a/demos/use_cases/preference_based_routing/hurl_tests/simple.hurl
+++ b/demos/use_cases/preference_based_routing/hurl_tests/simple.hurl
@ -1,19 +0,0 @@
-POST http://localhost:12000/v1/chat/completions
-Content-Type: application/json
-
-{
-  "model": "openai/gpt-4o-mini",
-  "messages": [
-    {
-      "role": "user",
-      "content": "hi"
-    }
-  ]
-}
-HTTP 200
-[Asserts]
-header "content-type" == "application/json"
-jsonpath "$.model" matches /^gpt-4o-mini/
-jsonpath "$.usage" != null
-jsonpath "$.choices[0].message.content" != null
-jsonpath "$.choices[0].message.role" == "assistant"
--- a/demos/use_cases/preference_based_routing/hurl_tests/simple_stream.hurl
+++ b/demos/use_cases/preference_based_routing/hurl_tests/simple_stream.hurl
@ -1,17 +0,0 @@
-POST http://localhost:12000/v1/chat/completions
-Content-Type: application/json
-
-{
-  "messages": [
-    {
-      "role": "user",
-      "content": "Can you explain what this Python function does?\n\ndef fibonacci(n):\n    if n <= 1:\n        return n\n    return fibonacci(n-1) + fibonacci(n-2)"
-    }
-  ],
-  "model": "openai/gpt-4o-mini",
-  "stream": true
-}
-HTTP 200
-[Asserts]
-header "content-type" matches /text\/event-stream/
-body matches /^data: .*?gpt-4o.*?\n/
--- a/demos/use_cases/preference_based_routing/plano_config_local.yaml
+++ b/demos/use_cases/preference_based_routing/plano_config_local.yaml
@ -1,37 +0,0 @@
-version: v0.1.0
-
-routing:
-  model: Arch-Router
-  llm_provider: arch-router
-
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
-
-  - name: arch-router
-    model: arch/hf.co/katanemo/Arch-Router-1.5B.gguf:Q4_K_M
-    base_url: http://host.docker.internal:11434
-
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY
-    default: true
-
-  - model: openai/gpt-4o
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code understanding
-        description: understand and explain existing code snippets, functions, or libraries
-
-  - model: openai/gpt-4.1
-    access_key: $OPENAI_API_KEY
-    routing_preferences:
-      - name: code generation
-        description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
-
-tracing:
-  random_sampling: 100
--- a/demos/use_cases/preference_based_routing/test_router_endpoint.rest
+++ b/demos/use_cases/preference_based_routing/test_router_endpoint.rest
@ -1,59 +0,0 @@
-@arch_llm_router_endpoint = http://35.192.87.187:8000
-
-POST https://archfc.katanemo.dev/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-
-{
-  "model": "cotran2/qwen-4-epoch-2600",
-  "messages": [
-    {
-      "role": "user",
-      "content": "You are an advanced Routing Assistant designed to select the optimal route based on user requests. \nYour task is to analyze conversations and match them to the most appropriate predefined route.\nReview the available routes config:\n\n# ROUTES CONFIG START\n- name: gpt-4o()\n  description: \"complex reasoning problem, require multi step answer\\n\"\n- name: o4-mini()\n  description: \"simple requests, basic fact retrieval, easy to answer\\n\"\n\n# ROUTES CONFIG END\n\nExamine the following conversation between a user and an assistant:\n\n# CONVERSATION START\n\nuser: Hello\nassistant: Hi! How can I assist you today?\nuser: List us presidents who are born in odd years and are still alive. Order them by their age and I also know what is their home city they were born. And what year they became president. Also give me summary of which president was the best for economy of the US.\n\n# CONVERSATION END\n\nYour goal is to identify the most appropriate route that matches the user's LATEST intent. Follow these steps:\n\n1. Carefully read and analyze the provided conversation, focusing on the user's latest request and the conversation scenario.\n2. Check if the user's request and scenario matches any of the routes in the routing configuration (focus on the description).\n3. Find the route that best matches.\n4. Use context clues from the entire conversation to determine the best fit.\n5. Return the best match possible. You only response the name of the route that best matches the user's request, use the exact name in the routes config.\n6. If no route relatively close to matches the user's latest intent or user last message is thank you or greeting, return an empty route ''. \n\n\n# OUTPUT FORMAT\nYour final output must follow this JSON format:\n{\n  \"route\": \"route_name\" # The matched route name, or empty string '' if no match\n}\n\nBased on your analysis, provide only the JSON object as your final output with no additional text, explanations, or whitespace."
-    }
-  ]
-}
-
-### test 2
-
-POST {{arch_llm_router_endpoint}}/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-
-{"model":"cotran2/llama-1b-4-26","messages":[{"role":"user","content":"\nYou are an advanced Routing Assistant designed to select the optimal route based on user requests. \nYour task is to analyze conversations and match them to the most appropriate predefined route.\nReview the available routes config:\n\n# ROUTES CONFIG START\n- name: gpt-4o\n  description: simple requests, basic fact retrieval, easy to answer\n- name: o4-mini()\n  description: complex reasoning problem, require multi step answer\n# ROUTES CONFIG END\n\nExamine the following conversation between a user and an assistant:\n\n# CONVERSATION START\n[{\"role\":\"user\",\"content\":\"What is the capital of France?\"}]\n# CONVERSATION END\n\nYour goal is to identify the most appropriate route that matches the user's LATEST intent. Follow these steps:\n\n1. Carefully read and analyze the provided conversation, focusing on the user's latest request and the conversation scenario.\n2. Check if the user's request and scenario matches any of the routes in the routing configuration (focus on the description).\n3. Find the route that best matches.\n4. Use context clues from the entire conversation to determine the best fit.\n5. Return the best match possible. You only response the name of the route that best matches the user's request, use the exact name in the routes config.\n6. If no route relatively close to matches the user's latest intent or user last message is thank you or greeting, return an empty route ''. \n\n# OUTPUT FORMAT\nYour final output must follow this JSON format:\n{\n  \"route\": \"route_name\" # The matched route name, or empty string '' if no match\n}\n\nBased on your analysis, provide only the JSON object as your final output with no additional text, explanations, or whitespace.\n"}],"stream":false}
-
-### get model list from arch-function
-GET https://archfc.katanemo.dev/v1/models HTTP/1.1
-model: Arch-Router
-
-### get model list from Arch-Router (notice model header)
-GET https://archfc.katanemo.dev/v1/models HTTP/1.1
-model: Arch-Router
-
-
-### test try code generating
-POST http://localhost:12000/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-
-{
-  "model": "gpt-4o",
-  "messages": [
-    {
-      "role": "user",
-      "content": "write code in python to generate a random number between 1 and 10"
-    }
-  ]
-}
-
-
-### test try code understanding
-POST http://localhost:12000/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-
-{
-  "model": "gpt-4o",
-  "messages": [
-    {
-      "role": "user",
-      "content": "help me understand this python code:\n\nimport random\n\ndef generate_random_number():\n    return random.randint(1, 10)\n\nprint(generate_random_number())"
-    }
-  ]
-}
--- a/demos/use_cases/spotify_bearer_auth/README.md
+++ b/demos/use_cases/spotify_bearer_auth/README.md
@ -1,31 +0,0 @@
-# Use Case Demo: Bearer Authorization with Spotify APIs
-
-In this demo, we show how you can use Plano's bearer authorization capability to connect your agentic apps to third-party APIs.
-More specifically, we demonstrate how you can connect to two Spotify APIs:
-
-- [`/v1/browse/new-releases`](https://developer.spotify.com/documentation/web-api/reference/get-new-releases)
-- [`/v1/artists/{artist_id}/top-tracks`](https://developer.spotify.com/documentation/web-api/reference/get-an-artists-top-tracks)
-
-Users can engage by asking questions like _"Show me the latest releases in the US"_, followed by queries like _"Show me top tracks from Taylor Swift"_.
-
-![Example of Bearer Authorization with Spotify APIs](spotify_bearer_auth.png)
-
-## Starting the demo
-
-1. Ensure the [prerequisites](https://github.com/katanemo/arch/?tab=readme-ov-file#prerequisites) are installed correctly.
-2. Create an `.env` file with API keys for OpenAI and Spotify.
-   - Sign up for an OpenAI API key at [https://platform.openai.com/signup/](https://platform.openai.com/signup/)
-   - Sign up for a Spotify Client Key/Secret by following instructions at [https://developer.spotify.com/dashboard/](https://developer.spotify.com/dashboard/)
-   - Generate a Spotify token using the [`https://accounts.spotify.com/api/token` API](https://accounts.spotify.com/api/token) with `curl` or a similar command.
-   - Add the following keys to the `.env` file:
-   ```
-   OPENAI_API_KEY=your_openai_api_key
-   SPOTIFY_CLIENT_KEY=your_spotify_api_token
-   ```
-
-3. Start Plano
-   ```sh
-   sh run_demo.sh
-   ```
-4. Navigate to http://localhost:18080
-5. Ask "show me new album releases in the US"
--- a/demos/use_cases/spotify_bearer_auth/config.yaml
+++ b/demos/use_cases/spotify_bearer_auth/config.yaml
@ -1,123 +0,0 @@
-version: v0.1.0
-
-listeners:
-  ingress_traffic:
-    address: 0.0.0.0
-    port: 10000
-    message_format: openai
-    timeout: 30s
-
-overrides:
-  optimize_context_window: true
-
-endpoints:
-  spotify:
-    endpoint: api.spotify.com
-    protocol: https
-
-system_prompt: |
-  I have the following JSON data representing a list of albums from Spotify:
-
-  {
-  "items": [
-    {
-      "album_type": "album",
-      "artists": [
-        {
-          "external_urls": {
-            "spotify": "https://open.spotify.com/artist/06HL4z0CvFAxyc27GXpf02"
-          },
-          "href": "https://api.spotify.com/v1/artists/06HL4z0CvFAxyc27GXpf02",
-          "id": "06HL4z0CvFAxyc27GXpf02",
-          "name": "Taylor Swift",
-          "type": "artist",
-          "uri": "spotify:artist:06HL4z0CvFAxyc27GXpf02"
-        }
-      ],
-      "available_markets": [ /* ... markets omitted for brevity ... */ ],
-      "external_urls": {
-        "spotify": "https://open.spotify.com/album/1Mo4aZ8pdj6L1jx8zSwJnt"
-      },
-      "href": "https://api.spotify.com/v1/albums/1Mo4aZ8pdj6L1jx8zSwJnt",
-      "id": "1Mo4aZ8pdj6L1jx8zSwJnt",
-      "images": [
-        {
-          "height": 300,
-          "url": "https://i.scdn.co/image/ab67616d00001e025076e4160d018e378f488c33",
-          "width": 300
-        },
-        {
-          "height": 64,
-          "url": "https://i.scdn.co/image/ab67616d000048515076e4160d018e378f488c33",
-          "width": 64
-        },
-        {
-          "height": 640,
-          "url": "https://i.scdn.co/image/ab67616d0000b2735076e4160d018e378f488c33",
-          "width": 640
-        }
-      ],
-      "name": "THE TORTURED POETS DEPARTMENT",
-      "release_date": "2024-04-18",
-      "release_date_precision": "day",
-      "total_tracks": 16,
-      "type": "album",
-      "uri": "spotify:album:1Mo4aZ8pdj6L1jx8zSwJnt"
-    }
-  ]
-  }
-
-  Please convert this JSON into Markdown with the following layout for each album:
-
-  - Display the album image (using Markdown image syntax) first.
-  - On the next line immediately after the image, display the album title, artist name (use the first artist listed), and the release date, all separated by a hyphen or another clear delimiter.
-  - On the next line, provide the Spotify link (using Markdown link syntax).
-
-  For example, the output should look similar to this (using the data above):
-
-  ![Album Image](https://i.scdn.co/image/ab67616d00001e025076e4160d018e378f488c33)
-  **THE TORTURED POETS DEPARTMENT**
-  Taylor Swift - 2024-04-18
-  [Listen on Spotify](https://open.spotify.com/album/1Mo4aZ8pdj6L1jx8zSwJnt)
-  Artist Id: 06HL4z0CvFAxyc27GXpf02
-  <hr>
-
-  Make sure your output is valid Markdown. And don't say "formatted in Markdown". Thanks!
-
-llm_providers:
-  - access_key: $OPENAI_API_KEY
-    model: openai/gpt-4o
-    default: true
-
-prompt_targets:
-  - name: get_new_releases
-    description: Get a list of new album releases featured in Spotify (shown, for example, on a Spotify player’s “Browse” tab).
-    parameters:
-      - name: country
-        description: the country where the album is released
-        required: true
-        type: str
-        in_path: true
-      - name: limit
-        type: integer
-        description: The maximum number of results to return
-        default: "5"
-    endpoint:
-      name: spotify
-      path: /v1/browse/new-releases
-      http_headers:
-        Authorization: "Bearer $SPOTIFY_CLIENT_KEY"
-
-  - name: get_artist_top_tracks
-    description: Get information about an artist's top tracks
-    parameters:
-      - name: artist_id
-        description: The ID of the artist.
-        required: true
-        type: str
-        in_path: true
-    endpoint:
-      name: spotify
-      path: /v1/artists/{artist_id}/top-tracks
-      http_headers:
-        Authorization: "Bearer $SPOTIFY_CLIENT_KEY"
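To make the `prompt_targets` entries above concrete, here is a minimal sketch of how a `{name}` path placeholder and a `$VAR` header value could be expanded into an outgoing request. This is an illustration only: `render_target` is a hypothetical helper, and it is not a description of how Plano resolves these fields internally.

```python
import re
from string import Template
from urllib.parse import quote


def render_target(path: str, headers: dict, params: dict, env: dict) -> tuple:
    """Fill {name} path placeholders from params and $VAR header values from env."""

    def fill(match: re.Match) -> str:
        # URL-quote the value so it cannot break out of its path segment.
        return quote(str(params[match.group(1)]), safe="")

    url_path = re.sub(r"\{(\w+)\}", fill, path)
    rendered = {name: Template(value).substitute(env) for name, value in headers.items()}
    return url_path, rendered


path, headers = render_target(
    "/v1/artists/{artist_id}/top-tracks",
    {"Authorization": "Bearer $SPOTIFY_CLIENT_KEY"},
    {"artist_id": "06HL4z0CvFAxyc27GXpf02"},
    {"SPOTIFY_CLIENT_KEY": "example-token"},  # placeholder, not a real token
)
print(path)
print(headers)
```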
--- a/demos/use_cases/spotify_bearer_auth/docker-compose.yaml
+++ b/demos/use_cases/spotify_bearer_auth/docker-compose.yaml
@ -1,21 +0,0 @@
-services:
-  chatbot_ui:
-    build:
-      context: ../../shared/chatbot_ui
-    ports:
-      - "18080:8080"
-    environment:
-      # needed only because the sample app runs in the same Docker environment as archgw
-      - CHAT_COMPLETION_ENDPOINT=http://host.docker.internal:10000/v1
-    extra_hosts:
-      - "host.docker.internal:host-gateway"
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml
-
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    ports:
-      - "16686:16686"
-      - "4317:4317"
-      - "4318:4318"
--- a/demos/use_cases/spotify_bearer_auth/run_demo.sh
+++ b/demos/use_cases/spotify_bearer_auth/run_demo.sh
@ -1,47 +0,0 @@
-#!/bin/bash
-set -e
-
-# Function to start the demo
-start_demo() {
-  # Step 1: Check if .env file exists
-  if [ -f ".env" ]; then
-    echo ".env file already exists. Skipping creation."
-  else
-    # Step 2: Create `.env` file and set OpenAI key
-    if [ -z "$OPENAI_API_KEY" ]; then
-      echo "Error: OPENAI_API_KEY environment variable is not set for the demo."
-      exit 1
-    fi
-
-    echo "Creating .env file..."
-    echo "OPENAI_API_KEY=$OPENAI_API_KEY" > .env
-    echo ".env file created with OPENAI_API_KEY."
-  fi
-
-  # Step 3: Start Plano
-  echo "Starting Plano with config.yaml..."
-  planoai up config.yaml
-
-  # Step 4: Start developer services
-  echo "Starting chatbot UI and Jaeger using Docker Compose..."
-  docker compose up -d  # Run in detached mode
-}
-
-# Function to stop the demo
-stop_demo() {
-  # Step 1: Stop Docker Compose services
-  echo "Stopping chatbot UI and Jaeger using Docker Compose..."
-  docker compose down
-
-  # Step 2: Stop Plano
-  echo "Stopping Plano..."
-  planoai down
-}
-
-# Main script logic
-if [ "$1" == "down" ]; then
-  stop_demo
-else
-  # Default action is to bring the demo up
-  start_demo
-fi
--- a/demos/use_cases/spotify_bearer_auth/spotify_bearer_auth.png
+++ b/demos/use_cases/spotify_bearer_auth/spotify_bearer_auth.png
--- a/demos/use_cases/travel_agents/Dockerfile
+++ b/demos/use_cases/travel_agents/Dockerfile
@ -1,22 +0,0 @@
-FROM python:3.14-slim
-
-WORKDIR /app
-
-# Install bash and uv
-RUN apt-get update && apt-get install -y bash && rm -rf /var/lib/apt/lists/*
-RUN pip install --no-cache-dir uv
-
-# Copy dependency files
-COPY pyproject.toml README.md ./
-
-# Install dependencies (without lock file to resolve fresh)
-RUN uv sync --no-dev
-
-# Copy application code
-COPY src/ ./src/
-
-# Set environment variables
-ENV PYTHONUNBUFFERED=1
-
-# Default command (will be overridden in docker-compose)
-CMD ["uv", "run", "python", "src/travel_agents/weather_agent.py"]
--- a/demos/use_cases/travel_agents/README.md
+++ b/demos/use_cases/travel_agents/README.md
@ -1,113 +0,0 @@
-# Travel Booking Agent Demo
-
-A multi-agent travel booking system demonstrating Plano's intelligent agent routing and orchestration capabilities. This demo showcases two specialized agents working together to help users plan trips with weather information and flight searches. All agent interactions are fully traced with OpenTelemetry-compatible tracing for complete observability.
-
-## Overview
-
-This demo consists of two intelligent agents that work together seamlessly:
-
-- **Weather Agent** - Real-time weather conditions and multi-day forecasts for any city worldwide
-- **Flight Agent** - Live flight information between airports with real-time tracking
-
-Plano's agent orchestration LLM routes each user request to the appropriate specialized agent based on conversation context and user intent. Both agents run as Docker containers for easy deployment.
-
-## Features
-
-- **Intelligent Routing**: Plano automatically routes requests to the right agent
-- **Conversation Context**: Agents understand follow-up questions and references
-- **Real-Time Data**: Live weather and flight data from public APIs
-- **Multi-Day Forecasts**: Weather agent supports up to 16-day forecasts
-- **LLM-Powered**: Uses GPT-4o-mini for extraction and GPT-5.2 for responses
-- **Streaming Responses**: Real-time streaming for better user experience
-
-## Prerequisites
-
-- Docker and Docker Compose
-- [Plano CLI](https://docs.planoai.dev/get_started/quickstart.html#prerequisites) installed
-- [OpenAI API key](https://platform.openai.com/api-keys)
-- [FlightAware AeroAPI key](https://www.flightaware.com/aeroapi/portal)
-
-> **Note:** You'll need to obtain a FlightAware AeroAPI key for live flight data. Visit [https://www.flightaware.com/aeroapi/portal](https://www.flightaware.com/aeroapi/portal) to get your API key.
-
-## Quick Start
-
-### 1. Set Environment Variables
-
-Create a `.env` file or export environment variables:
-
-```bash
-export AEROAPI_KEY="your-flightaware-api-key"
-export OPENAI_API_KEY="your-openai-api-key"
-```
-
-### 2. Start All Agents & Plano with Docker
-
-```bash
-docker compose up --build
-```
-
-This starts:
-- Weather Agent on port 10510
-- Flight Agent on port 10520
-- Open WebUI on port 8080
-- Plano Proxy on port 8001
-
-### 3. Test the System
-
-Use Open WebUI at http://localhost:8080
-
-> **Note:** The Open WebUI may take a few minutes to start up and be fully ready. Please wait for the container to finish initializing before accessing the interface. Once ready, make sure to select the **gpt-5.2** model from the model dropdown menu in the UI.
-
-## Example Conversations
-
-### Multi-Agent Conversation
-```
-User: What's the weather in Istanbul?
-Assistant: [Weather information]
-
-User: Do they fly out from Seattle?
-Assistant: [Flight information from Seattle to Istanbul]
-```
-
-The system understands context and pronouns, automatically routing to the right agent.
-
-### Multi-Intent Single Query
-```
-User: What's the weather in Seattle, and do any flights go direct to New York?
-Assistant: [Both weather_agent and flight_agent respond simultaneously]
-  - Weather Agent: [Weather information for Seattle]
-  - Flight Agent: [Flight information from Seattle to New York]
-```
-
-## Architecture
-
-```
-    User Request
-         ↓
-    Plano (8001)
-     [Orchestrator]
-         |
-    ┌────┴─────┐
-    ↓          ↓
- Weather     Flight
-  Agent       Agent
- (10510)     (10520)
- [Docker]    [Docker]
-```
-
-Each agent:
-1. Extracts intent using GPT-4o-mini (with OpenTelemetry tracing)
-2. Fetches real-time data from APIs
-3. Generates response using GPT-5.2
-4. Streams response back to user
-
-Both agents run as Docker containers and communicate with Plano via `host.docker.internal`.
-
-## Observability
-
-This demo includes full OpenTelemetry (OTel) compatible distributed tracing to monitor and debug agent interactions.
-The tracing data provides complete visibility into the multi-agent system, making it easy to identify bottlenecks, debug issues, and optimize performance.
-
-For more details on setting up and using tracing, see the [Plano Observability documentation](https://docs.planoai.dev/guides/observability/tracing.html).
-
-![Distributed trace of a multi-agent request](tracing.png)
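As an alternative to Open WebUI, the system can be exercised from a script. The sketch below builds an OpenAI-style chat completion request for the Plano listener on port 8001. The `/v1/chat/completions` path follows from the `OPENAI_API_BASE_URL` used in this demo's compose file. The model name `gpt-5.2` and the helper `build_chat_request` are assumptions for illustration.

```python
import json
import urllib.request

# Plano agent listener from this demo (port 8001); adjust if your setup differs.
PLANO_URL = "http://localhost:8001/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "gpt-5.2") -> urllib.request.Request:
    """Build (but do not send) a streaming chat completion request."""
    payload = {
        "model": model,  # assumed model name; match what your config exposes
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    return urllib.request.Request(
        PLANO_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "What's the weather in Seattle, and do any flights go direct to New York?"
)
# With the demo running, urllib.request.urlopen(request) returns the event stream.
```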
--- a/demos/use_cases/travel_agents/config.yaml
+++ b/demos/use_cases/travel_agents/config.yaml
@ -1,57 +0,0 @@
-version: v0.3.0
-
-agents:
-  - id: weather_agent
-    url: http://host.docker.internal:10510
-  - id: flight_agent
-    url: http://host.docker.internal:10520
-
-model_providers:
-  - model: openai/gpt-5.2
-    access_key: $OPENAI_API_KEY
-    default: true
-  - model: openai/gpt-4o-mini
-    access_key: $OPENAI_API_KEY # smaller, faster, cheaper model for extracting entities like location
-
-listeners:
-  - type: agent
-    name: travel_booking_service
-    port: 8001
-    router: plano_orchestrator_v1
-    agents:
-      - id: weather_agent
-        description: |
-
-          WeatherAgent is a specialized AI assistant for real-time weather information and forecasts. It provides accurate weather data for any city worldwide using the Open-Meteo API, helping travelers plan their trips with up-to-date weather conditions.
-
-          Capabilities:
-            * Get real-time weather conditions and multi-day forecasts for any city worldwide using Open-Meteo API (free, no API key needed)
-            * Provides current temperature
-            * Provides multi-day forecasts
-            * Provides weather conditions
-            * Provides sunrise/sunset times
-            * Provides detailed weather information
-            * Understands conversation context to resolve location references from previous messages
-            * Handles weather-related questions including "What's the weather in [city]?", "What's the forecast for [city]?", "How's the weather in [city]?"
-            * When queries include both weather and other travel questions (e.g., flights, currency), this agent answers ONLY the weather part
-
-      - id: flight_agent
-        description: |
-
-          FlightAgent is an AI-powered tool specialized in providing live flight information between airports. It leverages the FlightAware AeroAPI to deliver real-time flight status, gate information, and delay updates.
-
-          Capabilities:
-            * Get live flight information between airports using FlightAware AeroAPI
-            * Shows real-time flight status
-            * Shows scheduled/estimated/actual departure and arrival times
-            * Shows gate and terminal information
-            * Shows delays
-            * Shows aircraft type
-            * Shows flight status
-            * Automatically resolves city names to airport codes (IATA/ICAO)
-            * Understands conversation context to infer origin/destination from follow-up questions
-            * Handles flight-related questions including "What flights go from [city] to [city]?", "Do flights go to [city]?", "Are there direct flights from [city]?"
-            * When queries include both flight and other travel questions (e.g., weather, currency), this agent answers ONLY the flight part
-
-tracing:
-  random_sampling: 100
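In the config above, each listener refers to agents by `id`, and those ids must match entries in the top-level `agents` list. The following is a minimal sketch of a consistency check over the parsed config, written against a plain dict so it needs no YAML library. `undeclared_agents` is a hypothetical helper, not a Plano feature.

```python
def undeclared_agents(config: dict) -> list:
    """Return (listener name, agent id) pairs that a listener references
    but the top-level `agents` list does not declare."""
    declared = {agent["id"] for agent in config.get("agents", [])}
    missing = []
    for listener in config.get("listeners", []):
        for agent in listener.get("agents", []):
            if agent["id"] not in declared:
                missing.append((listener.get("name"), agent["id"]))
    return missing


# A trimmed-down dict mirroring the structure of the YAML above.
config = {
    "agents": [
        {"id": "weather_agent", "url": "http://host.docker.internal:10510"},
        {"id": "flight_agent", "url": "http://host.docker.internal:10520"},
    ],
    "listeners": [
        {
            "type": "agent",
            "name": "travel_booking_service",
            "port": 8001,
            "agents": [{"id": "weather_agent"}, {"id": "flight_agent"}],
        }
    ],
}

print(undeclared_agents(config))
```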
--- a/demos/use_cases/travel_agents/docker-compose.yaml
+++ b/demos/use_cases/travel_agents/docker-compose.yaml
@ -1,67 +0,0 @@
-
-services:
-  plano:
-    build:
-      context: ../../../
-      dockerfile: Dockerfile
-    ports:
-      - "12000:12000"
-      - "8001:8001"
-    environment:
-      - PLANO_CONFIG_PATH=/config/config.yaml
-      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY environment variable is required but not set}
-    volumes:
-      - ./config.yaml:/app/plano_config.yaml
-      - /etc/ssl/cert.pem:/etc/ssl/cert.pem
-  weather-agent:
-    build:
-      context: .
-      dockerfile: Dockerfile
-    container_name: weather-agent
-    restart: always
-    ports:
-      - "10510:10510"
-    environment:
-      - LLM_GATEWAY_ENDPOINT=http://host.docker.internal:12000/v1
-    command: ["uv", "run", "python", "src/travel_agents/weather_agent.py"]
-    extra_hosts:
-      - "host.docker.internal:host-gateway"
-  flight-agent:
-    build:
-      context: .
-      dockerfile: Dockerfile
-    container_name: flight-agent
-    restart: always
-    ports:
-      - "10520:10520"
-    environment:
-      - LLM_GATEWAY_ENDPOINT=http://host.docker.internal:12000/v1
-      - AEROAPI_KEY=${AEROAPI_KEY:? AEROAPI_KEY environment variable is required but not set}
-    command: ["uv", "run", "python", "src/travel_agents/flight_agent.py"]
-    extra_hosts:
-      - "host.docker.internal:host-gateway"
-  open-web-ui:
-    image: dyrnq/open-webui:main
-    restart: always
-    ports:
-      - "8080:8080"
-    environment:
-      - DEFAULT_MODEL=gpt-4o-mini
-      - ENABLE_OPENAI_API=true
-      - OPENAI_API_BASE_URL=http://host.docker.internal:8001/v1
-      - ENABLE_FOLLOW_UP_GENERATION=false
-      - ENABLE_TITLE_GENERATION=false
-      - ENABLE_TAGS_GENERATION=false
-      - ENABLE_AUTOCOMPLETE_GENERATION=false
-    depends_on:
-      - weather-agent
-      - flight-agent
-  jaeger:
-    build:
-      context: ../../shared/jaeger
-    container_name: jaeger
-    restart: always
-    ports:
-      - "16686:16686"  # Jaeger UI
-      - "4317:4317"    # OTLP gRPC receiver
-      - "4318:4318"    # OTLP HTTP receiver
--- a/demos/use_cases/travel_agents/pyproject.toml
+++ b/demos/use_cases/travel_agents/pyproject.toml
@ -1,25 +0,0 @@
-[project]
-name = "travel-agents"
-version = "0.1.0"
-description = "Travel Booking Agents - Weather and Flight"
-readme = "README.md"
-requires-python = ">=3.10"
-dependencies = [
-    "click>=8.2.1",
-    "pydantic>=2.11.7",
-    "fastapi>=0.115.0",
-    "uvicorn>=0.30.0",
-    "openai>=1.0.0",
-    "httpx>=0.24.0",
-    "opentelemetry-api>=1.20.0",
-]
-
-[project.scripts]
-travel_agents = "travel_agents:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
-
-[tool.hatch.build.targets.wheel]
-packages = ["src/travel_agents"]
--- a/demos/use_cases/travel_agents/src/travel_agents/flight_agent.py
+++ b/demos/use_cases/travel_agents/src/travel_agents/flight_agent.py
@ -1,405 +0,0 @@
-import json
-from fastapi import FastAPI, Request
-from fastapi.responses import StreamingResponse
-from openai import AsyncOpenAI
-import os
-import logging
-import uvicorn
-from datetime import datetime
-import httpx
-from typing import Optional
-from opentelemetry.propagate import extract, inject
-
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [FLIGHT_AGENT] - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-LLM_GATEWAY_ENDPOINT = os.getenv(
-    "LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12000/v1"
-)
-FLIGHT_MODEL = "openai/gpt-5.2"
-EXTRACTION_MODEL = "openai/gpt-4o-mini"
-
-AEROAPI_BASE_URL = "https://aeroapi.flightaware.com/aeroapi"
-AEROAPI_KEY = os.getenv("AEROAPI_KEY")
-
-http_client = httpx.AsyncClient(timeout=30.0)
-openai_client = AsyncOpenAI(base_url=LLM_GATEWAY_ENDPOINT, api_key="EMPTY")
-
-SYSTEM_PROMPT = """You are a travel planning assistant specializing in flight information. You support both direct flights AND multi-leg connecting flights.
-
-Flight data fields:
-- airline: Full airline name (e.g., "Delta Air Lines")
-- flight_number: Flight identifier (e.g., "DL123")
-- departure_time/arrival_time: ISO 8601 timestamps
-- origin/destination: Airport IATA codes
-- aircraft_type: Aircraft model code (e.g., "B739")
-- status: Flight status (e.g., "Scheduled", "Delayed")
-- terminal_origin/gate_origin: Departure terminal and gate (may be null)
-
-Your task:
-1. Present flights clearly with airline, flight number, readable times, airports, and aircraft
-2. Organize chronologically by departure time
-3. Convert ISO timestamps to readable format (e.g., "11:00 AM")
-4. Include terminal/gate info when available
-5. For multi-leg flights: present each leg separately with connection timing
-
-Multi-agent context: If the conversation includes information from other sources, incorporate it naturally into your response."""
-
-ROUTE_EXTRACTION_PROMPT = """Extract flight route and travel date. Support direct AND multi-leg flights.
-
-Rules:
-1. Patterns: "flight from X to Y", "X to Y to Z", "fly from X through Y to Z"
-2. For multi-leg (e.g., "Seattle to Dubai to Lahore"), extract ALL cities in order
-3. Extract dates: "tomorrow", "next week", "December 25", "12/25", "on Monday"
-4. Use conversation context for missing details
-
-Output format: {"cities": ["City1", "City2", ...], "date": "YYYY-MM-DD" or null}
-
-Examples:
-- "Flight from Seattle to Atlanta tomorrow" → {"cities": ["Seattle", "Atlanta"], "date": "2026-01-07"}
-- "Seattle to Dubai to Lahore" → {"cities": ["Seattle", "Dubai", "Lahore"], "date": null}
-- "Flights from LA through Chicago to NYC" → {"cities": ["LA", "Chicago", "NYC"], "date": null}
-
-Today is January 6, 2026. Extract flight route:"""
-
-
-async def extract_flight_route(messages: list, request: Request) -> dict:
-    try:
-        ctx = extract(request.headers)
-        extra_headers = {}
-        inject(extra_headers, context=ctx)
-
-        response = await openai_client.chat.completions.create(
-            model=EXTRACTION_MODEL,
-            messages=[
-                {"role": "system", "content": ROUTE_EXTRACTION_PROMPT},
-                *[
-                    {"role": m.get("role"), "content": m.get("content")}
-                    for m in messages[-5:]
-                ],
-            ],
-            temperature=0.1,
-            max_completion_tokens=100,
-            extra_headers=extra_headers or None,
-        )
-
-        result = response.choices[0].message.content.strip()
-        if "```json" in result:
-            result = result.split("```json")[1].split("```")[0].strip()
-        elif "```" in result:
-            result = result.split("```")[1].split("```")[0].strip()
-
-        route = json.loads(result)
-        cities = route.get("cities", [])
-
-        if not cities and (route.get("origin") or route.get("destination")):
-            cities = [c for c in [route.get("origin"), route.get("destination")] if c]
-
-        return {"cities": cities, "date": route.get("date")}
-
-    except Exception as e:
-        logger.error(f"Error extracting flight route: {e}")
-        return {"cities": [], "date": None}
-
-
-async def resolve_airport_code(city_name: str, request: Request) -> Optional[str]:
-    if not city_name:
-        return None
-
-    try:
-        ctx = extract(request.headers)
-        extra_headers = {}
-        inject(extra_headers, context=ctx)
-
-        response = await openai_client.chat.completions.create(
-            model=EXTRACTION_MODEL,
-            messages=[
-                {
-                    "role": "system",
-                    "content": "Convert city names to primary airport IATA codes. Return only the 3-letter code. Examples: Seattle→SEA, Atlanta→ATL, New York→JFK, Dubai→DXB, Lahore→LHE",
-                },
-                {"role": "user", "content": city_name},
-            ],
-            temperature=0.1,
-            max_completion_tokens=10,
-            extra_headers=extra_headers or None,
-        )
-
-        code = response.choices[0].message.content.strip().upper()
-        code = code.strip("\"'`.,!? \n\t")
-        return code if len(code) == 3 else None
-
-    except Exception as e:
-        logger.error(f"Error resolving airport code for {city_name}: {e}")
-        return None
-
-
-async def fetch_flights(
-    origin_code: str, dest_code: str, travel_date: Optional[str] = None
-) -> dict:
-    """Fetch flights between two airports. Note: FlightAware limits to 2 days ahead."""
-    search_date = travel_date or datetime.now().strftime("%Y-%m-%d")
-
-    search_date_obj = datetime.strptime(search_date, "%Y-%m-%d")
-    today = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
-    days_ahead = (search_date_obj - today).days
-
-    if days_ahead > 2:
-        logger.warning(
-            f"Date {search_date} is {days_ahead} days ahead, exceeds FlightAware limit"
-        )
-        return {
-            "origin_code": origin_code,
-            "destination_code": dest_code,
-            "flights": [],
-            "count": 0,
-            "error": f"FlightAware API only provides data up to 2 days ahead. Requested date ({search_date}) is {days_ahead} days away.",
-        }
-
-    try:
-        url = f"{AEROAPI_BASE_URL}/airports/{origin_code}/flights/to/{dest_code}"
-        headers = {"x-apikey": AEROAPI_KEY}
-        params = {
-            "start": f"{search_date}T00:00:00Z",
-            "end": f"{search_date}T23:59:59Z",
-            "connection": "nonstop",
-            "max_pages": 1,
-        }
-
-        response = await http_client.get(url, headers=headers, params=params)
-
-        if response.status_code != 200:
-            logger.error(
-                f"FlightAware API error {response.status_code}: {response.text}"
-            )
-            return {
-                "origin_code": origin_code,
-                "destination_code": dest_code,
-                "flights": [],
-                "count": 0,
-            }
-
-        data = response.json()
-        flights = []
-
-        for flight_group in data.get("flights", [])[:5]:
-            segments = flight_group.get("segments", [])
-            if not segments:
-                continue
-
-            flight = segments[0]
-            flights.append(
-                {
-                    "airline": flight.get("operator"),
-                    "flight_number": flight.get("ident_iata") or flight.get("ident"),
-                    "departure_time": flight.get("scheduled_out"),
-                    "arrival_time": flight.get("scheduled_in"),
-                    "origin": flight["origin"].get("code_iata")
-                    if isinstance(flight.get("origin"), dict)
-                    else None,
-                    "destination": flight["destination"].get("code_iata")
-                    if isinstance(flight.get("destination"), dict)
-                    else None,
-                    "aircraft_type": flight.get("aircraft_type"),
-                    "status": flight.get("status"),
-                    "terminal_origin": flight.get("terminal_origin"),
-                    "gate_origin": flight.get("gate_origin"),
-                }
-            )
-
-        logger.info(f"Found {len(flights)} flights from {origin_code} to {dest_code}")
-        return {
-            "origin_code": origin_code,
-            "destination_code": dest_code,
-            "flights": flights,
-            "count": len(flights),
-        }
-
-    except Exception as e:
-        logger.error(f"Error fetching flights: {e}")
-        return {
-            "origin_code": origin_code,
-            "destination_code": dest_code,
-            "flights": [],
-            "count": 0,
-        }
-
-
-def build_flight_context(cities: list, airport_codes: list, legs_data: list) -> str:
-    if len(cities) == 2:
-        leg = legs_data[0]
-        flight_data = {
-            "flights": leg["flights"],
-            "count": len(leg["flights"]),
-            "origin_code": leg["origin_code"],
-            "destination_code": leg["dest_code"],
-        }
-        if leg["flights"]:
-            return f"""
-Flight search results from {leg['origin']} ({leg['origin_code']}) to {leg['destination']} ({leg['dest_code']}):
-
-Flight data in JSON format:
-{json.dumps(flight_data, indent=2)}
-
-Present these {len(leg['flights'])} flight(s) to the user clearly."""
-        else:
-            error = leg.get("error") or "No direct flights found"
-            return f"""
-Flight search from {leg['origin']} ({leg['origin_code']}) to {leg['destination']} ({leg['dest_code']}):
-
-Result: {error}
-
-Let the user know and suggest alternatives if appropriate."""
-
-    route_str = " → ".join(
-        [f"{city} ({code})" for city, code in zip(cities, airport_codes)]
-    )
-    context = f"\nMulti-leg flight search: {route_str}\n\n"
-
-    for leg in legs_data:
-        context += f"**Leg {leg['leg']}: {leg['origin']} ({leg['origin_code']}) → {leg['destination']} ({leg['dest_code']})**\n"
-        if leg["flights"]:
-            leg_data = {"flights": leg["flights"], "count": len(leg["flights"])}
-            context += f"Flight data:\n{json.dumps(leg_data, indent=2)}\n\n"
-        elif leg.get("error"):
-            context += f"Error: {leg['error']}\n\n"
-        else:
-            context += "No direct flights found for this leg.\n\n"
-
-    context += "Present this itinerary clearly. For each leg, show available flights by departure time. Note connection timing between legs."
-    return context
-
-
-app = FastAPI(title="Flight Information Agent", version="1.0.0")
-
-
-@app.post("/v1/chat/completions")
-async def handle_request(request: Request):
-    request_body = await request.json()
-    return StreamingResponse(
-        invoke_flight_agent(request, request_body),
-        media_type="text/event-stream",
-    )
-
-
-async def invoke_flight_agent(request: Request, request_body: dict):
-    messages = request_body.get("messages", [])
-
-    route = await extract_flight_route(messages, request)
-    cities = route.get("cities", [])
-    travel_date = route.get("date")
-
-    # Build context based on what we could extract
-    if len(cities) < 2:
-        flight_context = """
-Could not extract a complete flight route from the user's request.
-
-Ask the user to provide both origin and destination cities.
-Example: 'Flights from Seattle to Atlanta' or 'Seattle to Dubai to Lahore'"""
-        airport_codes = []
-        legs_data = []
-    else:
-        airport_codes = []
-        failed_city = None
-        for city in cities:
-            code = await resolve_airport_code(city, request)
-            if not code:
-                failed_city = city
-                break
-            airport_codes.append(code)
-
-        if failed_city:
-            flight_context = f"""
-Could not find airport code for "{failed_city}".
-
-Ask the user to check the city name or provide a different city."""
-            legs_data = []
-        else:
-            legs_data = []
-            for i in range(len(cities) - 1):
-                flight_data = await fetch_flights(
-                    airport_codes[i], airport_codes[i + 1], travel_date
-                )
-                legs_data.append(
-                    {
-                        "leg": i + 1,
-                        "origin": cities[i],
-                        "origin_code": airport_codes[i],
-                        "destination": cities[i + 1],
-                        "dest_code": airport_codes[i + 1],
-                        "flights": flight_data.get("flights", []),
-                        "error": flight_data.get("error"),
-                    }
-                )
-
-            flight_context = build_flight_context(cities, airport_codes, legs_data)
-
-    response_messages = [{"role": "system", "content": SYSTEM_PROMPT}]
-    for i, msg in enumerate(messages):
-        content = msg.get("content", "")
-        if i == len(messages) - 1 and msg.get("role") == "user":
-            content += flight_context
-        response_messages.append({"role": msg.get("role"), "content": content})
-
-    logger.info(f"Sending {len(response_messages)} messages to LLM")
-
-    try:
-        ctx = extract(request.headers)
-        extra_headers = {"x-envoy-max-retries": "3"}
-        inject(extra_headers, context=ctx)
-
-        stream = await openai_client.chat.completions.create(
-            model=FLIGHT_MODEL,
-            messages=response_messages,
-            temperature=request_body.get("temperature", 0.7),
-            max_completion_tokens=request_body.get("max_tokens", 3000),
-            stream=True,
-            extra_headers=extra_headers,
-        )
-
-        async for chunk in stream:
-            if chunk.choices:
-                yield f"data: {chunk.model_dump_json()}\n\n"
-
-        yield "data: [DONE]\n\n"
-
-    except Exception as e:
-        logger.error(f"Error generating response: {e}")
-        yield "data: [DONE]\n\n"
-
-
-@app.get("/health")
-async def health_check():
-    return {"status": "healthy", "agent": "flight_information"}
-
-
-def start_server(host: str = "0.0.0.0", port: int = 10520):
-    uvicorn.run(
-        app,
-        host=host,
-        port=port,
-        log_config={
-            "version": 1,
-            "disable_existing_loggers": False,
-            "formatters": {
-                "default": {
-                    "format": "%(asctime)s - [FLIGHT_AGENT] - %(levelname)s - %(message)s"
-                }
-            },
-            "handlers": {
-                "default": {
-                    "formatter": "default",
-                    "class": "logging.StreamHandler",
-                    "stream": "ext://sys.stdout",
-                }
-            },
-            "root": {"level": "INFO", "handlers": ["default"]},
-        },
-    )
-
-
-if __name__ == "__main__":
-    start_server()
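The agent above streams its answer as server-sent events: one `data: <chunk JSON>` line per chunk, terminated by `data: [DONE]`. A client-side sketch for reassembling the text from such a stream follows. It assumes the usual OpenAI chat completion chunk shape (`choices[].delta.content`), and `iter_sse_text` is an illustrative name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from an SSE stream of chat completion chunks."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip the blank separator lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel emitted by the agent
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices": [{"delta": {"content": "Two nonstop "}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "flights found."}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_sse_text(sample)))
```

Note that the agent also emits `data: [DONE]` when generation fails, so an empty result can mean an upstream error rather than an empty answer.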
--- a/demos/use_cases/travel_agents/src/travel_agents/weather_agent.py
+++ b/demos/use_cases/travel_agents/src/travel_agents/weather_agent.py
@ -1,443 +0,0 @@
-import json
-import re
-from fastapi import FastAPI, Request
-from fastapi.responses import StreamingResponse
-from openai import AsyncOpenAI
-import os
-import logging
-import time
-import uuid
-import uvicorn
-from datetime import datetime, timedelta
-import httpx
-from typing import Optional
-from urllib.parse import quote
-from opentelemetry.propagate import extract, inject
-
-# Set up logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s - [WEATHER_AGENT] - %(levelname)s - %(message)s",
-)
-logger = logging.getLogger(__name__)
-
-
-# Configuration for plano LLM gateway
-LLM_GATEWAY_ENDPOINT = os.getenv(
-    "LLM_GATEWAY_ENDPOINT", "http://host.docker.internal:12001/v1"
-)
-WEATHER_MODEL = "openai/gpt-5.2"
-LOCATION_MODEL = "openai/gpt-4o-mini"
-
-# Initialize OpenAI client for plano
-openai_client_via_plano = AsyncOpenAI(
-    base_url=LLM_GATEWAY_ENDPOINT,
-    api_key="EMPTY",
-)
-
-# FastAPI app for REST server
-app = FastAPI(title="Weather Forecast Agent", version="1.0.0")
-
-# HTTP client for API calls
-http_client = httpx.AsyncClient(timeout=10.0)
-
-
-# Utility functions
-def celsius_to_fahrenheit(temp_c: Optional[float]) -> Optional[float]:
-    """Convert Celsius to Fahrenheit."""
-    return round(temp_c * 9 / 5 + 32, 1) if temp_c is not None else None
-
-
-def get_user_messages(messages: list) -> list:
-    """Extract user messages from message list."""
-    return [msg for msg in messages if msg.get("role") == "user"]
-
-
-def get_last_user_content(messages: list) -> str:
-    """Get the content of the most recent user message."""
-    for msg in reversed(messages):
-        if msg.get("role") == "user":
-            return msg.get("content", "").lower()
-    return ""
-
-
-async def get_weather_data(
-    request: Request,
-    messages: list,
-    days: int = 1,
-    traceparent_header: Optional[str] = None,
-    request_id: Optional[str] = None,
-):
-    """Extract location from user's conversation and fetch weather data from Open-Meteo API.
-
-    This function does two things:
-    1. Uses an LLM to extract the location from the user's message
-    2. Fetches weather data for that location from Open-Meteo
-
-    On success, returns `days` daily entries under "forecast" (1 by default, i.e. today only).
-    On failure, returns a single "weather" entry that carries an "error" field.
-    """
-    instructions = """You are a city name extractor. Look at the FINAL user message ONLY and extract the city name.
-
-The FINAL user message will be the LAST message with role "user" in the conversation.
-
-IMPORTANT: Focus on the FINAL user message. Use earlier messages only to resolve references such as "there".
-
-Examples of what to extract from the FINAL user message:
- "What's the weather in Seattle?" → Seattle
- "What's the weather in San Francisco?" → San Francisco
- "What about Dubai?" → Dubai
- "How's the weather in Tokyo today?" → Tokyo
- "Tell me about Lahore" → Lahore
- "What about there?" → Look at conversation for the last mentioned city
-
-Output ONLY the city name. Nothing else. One word or city name only.
-If no city can be found, output: NOT_FOUND"""
-
-    try:
-        user_messages = [
-            msg.get("content") for msg in messages if msg.get("role") == "user"
-        ]
-
-        if not user_messages:
-            location = "New York"
-        else:
-            ctx = extract(request.headers)
-            extra_headers = {}
-            if request_id:
-                extra_headers["x-request-id"] = request_id
-            inject(extra_headers, context=ctx)
-            # For location extraction, pass full conversation for context (e.g., "there" = previous destination)
-            response = await openai_client_via_plano.chat.completions.create(
-                model=LOCATION_MODEL,
-                messages=[
-                    {"role": "system", "content": instructions},
-                    *[
-                        {"role": msg.get("role"), "content": msg.get("content")}
-                        for msg in messages
-                    ],
-                ],
-                temperature=0.1,
-                max_completion_tokens=10,
-                extra_headers=extra_headers if extra_headers else None,
-            )
-
-            location = response.choices[0].message.content.strip().strip("\"'`.,!?")
-            logger.info(f"Location extraction result: '{location}'")
-
-            if not location or location.upper() == "NOT_FOUND":
-                location = "New York"
-                logger.info(f"Location not found, defaulting to: {location}")
-
-    except Exception as e:
-        logger.error(f"Error extracting location: {e}")
-        location = "New York"
-
-    logger.info(f"Fetching weather for location: '{location}' (days: {days})")
-
-    # Step 2: Fetch weather data for the extracted location
-    try:
-        # Geocode city to get coordinates
-        geocode_url = f"https://geocoding-api.open-meteo.com/v1/search?name={quote(location)}&count=1&language=en&format=json"
-        geocode_response = await http_client.get(geocode_url)
-
-        if geocode_response.status_code != 200 or not geocode_response.json().get(
-            "results"
-        ):
-            logger.warning(f"Could not geocode {location}, using New York")
-            location = "New York"
-            geocode_url = f"https://geocoding-api.open-meteo.com/v1/search?name={quote(location)}&count=1&language=en&format=json"
-            geocode_response = await http_client.get(geocode_url)
-
-        geocode_data = geocode_response.json()
-        if not geocode_data.get("results"):
-            return {
-                "location": location,
-                "weather": {
-                    "date": datetime.now().strftime("%Y-%m-%d"),
-                    "day_name": datetime.now().strftime("%A"),
-                    "temperature_c": None,
-                    "temperature_f": None,
-                    "weather_code": None,
-                    "error": "Could not retrieve weather data",
-                },
-            }
-
-        result = geocode_data["results"][0]
-        location_name = result.get("name", location)
-        latitude = result["latitude"]
-        longitude = result["longitude"]
-
-        logger.info(
-            f"Geocoded '{location}' to {location_name} ({latitude}, {longitude})"
-        )
-
-        # Get weather forecast
-        weather_url = (
-            f"https://api.open-meteo.com/v1/forecast?"
-            f"latitude={latitude}&longitude={longitude}&"
-            f"current=temperature_2m&"
-            f"daily=sunrise,sunset,temperature_2m_max,temperature_2m_min,weather_code&"
-            f"forecast_days={days}&timezone=auto"
-        )
-
-        weather_response = await http_client.get(weather_url)
-        if weather_response.status_code != 200:
-            return {
-                "location": location_name,
-                "weather": {
-                    "date": datetime.now().strftime("%Y-%m-%d"),
-                    "day_name": datetime.now().strftime("%A"),
-                    "temperature_c": None,
-                    "temperature_f": None,
-                    "weather_code": None,
-                    "error": "Could not retrieve weather data",
-                },
-            }
-
-        weather_data = weather_response.json()
-        current_temp = weather_data.get("current", {}).get("temperature_2m")
-        daily = weather_data.get("daily", {})
-
-        # Build forecast for requested number of days
-        forecast = []
-        for i in range(days):
-            date_str = daily["time"][i]
-            date_obj = datetime.fromisoformat(date_str.replace("Z", "+00:00"))
-
-            temp_max = (
-                daily.get("temperature_2m_max", [])[i]
-                if daily.get("temperature_2m_max")
-                else None
-            )
-            temp_min = (
-                daily.get("temperature_2m_min", [])[i]
-                if daily.get("temperature_2m_min")
-                else None
-            )
-            weather_code = (
-                daily.get("weather_code", [0])[i] if daily.get("weather_code") else 0
-            )
-            sunrise = daily.get("sunrise", [])[i] if daily.get("sunrise") else None
-            sunset = daily.get("sunset", [])[i] if daily.get("sunset") else None
-
-            # Prefer the daily max; fall back to the current temp (today only), then the daily min
-            temp_c = (
-                temp_max
-                if temp_max is not None
-                else (
-                    current_temp
-                    if i == 0 and current_temp is not None
-                    else temp_min
-                )
-            )
-
-            forecast.append(
-                {
-                    "date": date_str.split("T")[0],
-                    "day_name": date_obj.strftime("%A"),
-                    "temperature_c": round(temp_c, 1) if temp_c is not None else None,
-                    "temperature_f": celsius_to_fahrenheit(temp_c),
-                    "temperature_max_c": (
-                        round(temp_max, 1) if temp_max is not None else None
-                    ),
-                    "temperature_min_c": (
-                        round(temp_min, 1) if temp_min is not None else None
-                    ),
-                    "weather_code": weather_code,
-                    "sunrise": sunrise.split("T")[1] if sunrise else None,
-                    "sunset": sunset.split("T")[1] if sunset else None,
-                }
-            )
-
-        return {"location": location_name, "forecast": forecast}
-
-    except Exception as e:
-        logger.error(f"Error getting weather data: {e}")
-        return {
-            "location": location,
-            "weather": {
-                "date": datetime.now().strftime("%Y-%m-%d"),
-                "day_name": datetime.now().strftime("%A"),
-                "temperature_c": None,
-                "temperature_f": None,
-                "weather_code": None,
-                "error": "Could not retrieve weather data",
-            },
-        }
-
-
-@app.post("/v1/chat/completions")
-async def handle_request(request: Request):
-    """HTTP endpoint for chat completions with streaming support."""
-
-    request_body = await request.json()
-    messages = request_body.get("messages", [])
-    # NOTE: the stream flag is read here but not yet honored; the response is always streamed as SSE
-    is_streaming = request_body.get("stream", True)
-
-    logger.info(
-        "messages detail json dumps: %s",
-        json.dumps(messages, indent=2),
-    )
-
-    traceparent_header = request.headers.get("traceparent")
-    request_id = request.headers.get("x-request-id")
-
-    return StreamingResponse(
-        invoke_weather_agent(request, request_body, traceparent_header, request_id),
-        media_type="text/event-stream",
-    )
-
-
-async def invoke_weather_agent(
-    request: Request,
-    request_body: dict,
-    traceparent_header: Optional[str] = None,
-    request_id: Optional[str] = None,
-):
-    """Generate streaming chat completions."""
-    messages = request_body.get("messages", [])
-
-    # Detect if user wants multi-day forecast
-    last_user_msg = get_last_user_content(messages)
-    days = 1
-
-    if "forecast" in last_user_msg or "week" in last_user_msg:
-        days = 7
-    elif "tomorrow" in last_user_msg:
-        days = 2
-
-    # Extract specific number of days if mentioned (e.g., "5 day forecast");
-    # `re` is already imported at module level
-    day_match = re.search(r"(\d{1,2})\s+day", last_user_msg)
-    if day_match:
-        requested_days = int(day_match.group(1))
-        days = min(requested_days, 16)  # API supports max 16 days
-
-    # Get live weather data (location extraction happens inside this function)
-    weather_data = await get_weather_data(
-        request, messages, days, traceparent_header, request_id
-    )
-
-    # Create weather context to append to user message
-    forecast_type = "forecast" if days > 1 else "current weather"
-    weather_context = f"""
-
-Weather data for {weather_data['location']} ({forecast_type}):
-{json.dumps(weather_data, indent=2)}
-
-Present the weather information to the user in a clear, readable format. If there is information from other agents, start your response with a summary of that information."""
-
-    # System prompt for weather agent
-    instructions = """You are a weather assistant in a multi-agent system. You will receive weather data in JSON format with these fields:
-
-    - "location": City name
-    - "forecast": Array of weather objects, each with date, day_name, temperature_c, temperature_f, temperature_max_c, temperature_min_c, weather_code, sunrise, sunset
-    - weather_code: WMO code (0=clear, 1-3=partly cloudy, 45-48=fog, 51-67=rain, 71-86=snow, 95-99=thunderstorm)
-
-    Your task:
-    1. Present the weather/forecast clearly for the location
-    2. For single day: show current conditions
-    3. For multi-day: show each day with date and conditions
-    4. Include temperature in both Celsius and Fahrenheit
-    5. Describe conditions naturally based on weather_code
-    6. Use conversational language
-
-    Multi-agent context: You are part of a larger system. If the conversation includes additional context or information from other sources, acknowledge and incorporate it naturally into your response. Your primary focus is weather, but be aware of the full conversation context.
-
-    Remember: Only use the provided data. If fields are null, mention data is unavailable."""
-
-    # Build message history with weather data appended to the last user message
-    response_messages = [{"role": "system", "content": instructions}]
-
-    for i, msg in enumerate(messages):
-        # Append weather data to the last user message
-        if i == len(messages) - 1 and msg.get("role") == "user":
-            response_messages.append(
-                {"role": "user", "content": msg.get("content") + weather_context}
-            )
-        else:
-            response_messages.append(
-                {"role": msg.get("role"), "content": msg.get("content")}
-            )
-
-    try:
-        ctx = extract(request.headers)
-        extra_headers = {"x-envoy-max-retries": "3"}
-        if request_id:
-            extra_headers["x-request-id"] = request_id
-        inject(extra_headers, context=ctx)
-
-        stream = await openai_client_via_plano.chat.completions.create(
-            model=WEATHER_MODEL,
-            messages=response_messages,
-            temperature=request_body.get("temperature", 0.7),
-            max_completion_tokens=request_body.get("max_tokens", 3000),
-            stream=True,
-            extra_headers=extra_headers,
-        )
-
-        async for chunk in stream:
-            if chunk.choices:
-                yield f"data: {chunk.model_dump_json()}\n\n"
-
-        yield "data: [DONE]\n\n"
-
-    except Exception as e:
-        logger.error(f"Error generating weather response: {e}")
-        error_chunk = {
-            "id": f"chatcmpl-{uuid.uuid4().hex[:8]}",
-            "object": "chat.completion.chunk",
-            "created": int(time.time()),
-            "model": request_body.get("model", WEATHER_MODEL),
-            "choices": [
-                {
-                    "index": 0,
-                    "delta": {
-                        "content": "I apologize, but I'm having trouble retrieving weather information right now. Please try again."
-                    },
-                    "finish_reason": "stop",
-                }
-            ],
-        }
-        yield f"data: {json.dumps(error_chunk)}\n\n"
-        yield "data: [DONE]\n\n"
-
-
-@app.get("/health")
-async def health_check():
-    """Health check endpoint."""
-    return {"status": "healthy", "agent": "weather_forecast"}
-
-
-def start_server(host: str = "localhost", port: int = 10510):
-    """Start the REST server."""
-    uvicorn.run(
-        app,
-        host=host,
-        port=port,
-        log_config={
-            "version": 1,
-            "disable_existing_loggers": False,
-            "formatters": {
-                "default": {
-                    "format": "%(asctime)s - [WEATHER_AGENT] - %(levelname)s - %(message)s",
-                },
-            },
-            "handlers": {
-                "default": {
-                    "formatter": "default",
-                    "class": "logging.StreamHandler",
-                    "stream": "ext://sys.stdout",
-                },
-            },
-            "root": {
-                "level": "INFO",
-                "handlers": ["default"],
-            },
-        },
-    )
-
-
-if __name__ == "__main__":
-    start_server(host="0.0.0.0", port=10510)
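Reviewer note: the removed `invoke_weather_agent` picks the forecast length from keywords in the last user message before calling Open-Meteo. The sketch below isolates that rule as a pure function so it can be tested on its own. The function name `detect_forecast_days` and the constant `MAX_FORECAST_DAYS` are hypothetical; the keyword rules, the regex, and the 16-day cap are copied from the removed code.

```python
import re

MAX_FORECAST_DAYS = 16  # cap stated in a comment in the removed code


def detect_forecast_days(last_user_msg: str) -> int:
    """Return how many forecast days a message asks for (default 1).

    Hypothetical refactoring of the inline logic in invoke_weather_agent.
    """
    text = last_user_msg.lower()
    days = 1
    if "forecast" in text or "week" in text:
        days = 7
    elif "tomorrow" in text:
        days = 2  # today plus tomorrow

    # An explicit count such as "5 day forecast" overrides the keyword guess.
    match = re.search(r"(\d{1,2})\s+day", text)
    if match:
        days = min(int(match.group(1)), MAX_FORECAST_DAYS)
    return days


print(detect_forecast_days("5 day forecast for Lahore"))
```

Because the pattern requires whitespace between the number and "day", a hyphenated request such as "3-day forecast" is not matched by the regex and falls back to the keyword rule.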
--- a/demos/use_cases/travel_agents/test.rest
+++ b/demos/use_cases/travel_agents/test.rest
@ -1,43 +0,0 @@
-@llm_endpoint = http://localhost:12000
-
-### Travel Agent Chat Completion Request
-POST {{llm_endpoint}}/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-
-{
-  "model": "gpt-4o",
-  "messages": [
-    {
-      "role": "user",
-      "content": "What's the weather in Seattle?"
-    },
-    {
-      "role": "assistant",
-      "content": "The weather in Seattle is sunny with a temperature of 60 degrees Fahrenheit."
-    },
-    {
-      "role": "user",
-      "content": "What is one Alaska flight that goes direct to Atlanta from Seattle?"
-    }
-  ],
-  "max_tokens": 1000,
-  "stream": false,
-  "temperature": 1.0
-}
-
-
-### test 8001
-
-### test upstream llm
-POST http://localhost:8001/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-
-{
-   "messages": [
-      {
-         "role": "system",
-         "content": "\nCurrent weather data for Seattle:\n\n{\n  \"location\": \"Seattle\",\n  \"forecast\": [\n    {\n      \"date\": \"2025-12-22\",\n      \"day_name\": \"Monday\",\n      \"temperature_c\": 8.3,\n      \"temperature_f\": 46.9,\n      \"temperature_max_c\": 8.3,\n      \"temperature_min_c\": 2.8,\n      \"condition\": \"Rainy\",\n      \"sunrise\": \"07:55\",\n      \"sunset\": \"16:20\"\n    }\n  ]\n}\n\nUse this data to answer the user's weather query."
-      }
-   ],
-   "model": "gpt-4o",
-}
--- a/demos/use_cases/travel_agents/tracing.png
+++ b/demos/use_cases/travel_agents/tracing.png
--- a/demos/use_cases/travel_agents/travel_agent_request.rest
+++ b/demos/use_cases/travel_agents/travel_agent_request.rest
@ -1,30 +0,0 @@
-@llm_endpoint = http://localhost:12000
-
-### Travel Agent Chat Completion - Full Conversation
-POST {{llm_endpoint}}/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-
-{
-  "model": "gpt-4o",
-  "messages": [
-    {
-      "role": "system",
-      "content": "You are a professional travel planner assistant. Your role is to provide accurate, clear, and helpful information about weather and flights based on the structured data provided to you.\n\nCRITICAL INSTRUCTIONS:\n\n1. DATA STRUCTURE:\n   \n   WEATHER DATA:\n   - You will receive weather data as JSON in a system message\n   - The data contains a \"location\" field (string) and a \"forecast\" array\n   - Each forecast entry has: date, day_name, temperature_c, temperature_f, temperature_max_c, temperature_min_c, condition, sunrise, sunset\n   - Some fields may be null/None - handle these gracefully\n   \n   FLIGHT DATA:\n   - You will receive flight information in a system message\n   - Flight data includes: airline, flight number, departure time, arrival time, origin airport, destination airport, aircraft type, status, gate, terminal\n   - Information may include both scheduled and estimated times\n   - Some fields may be unavailable - handle these gracefully\n\n2. WEATHER HANDLING:\n   - For single-day queries: Use temperature_c/temperature_f (current/primary temperature)\n   - For multi-day forecasts: Use temperature_max_c and temperature_min_c when available\n   - Always provide temperatures in both Celsius and Fahrenheit when available\n   - If temperature is null, say \"temperature data unavailable\" rather than making up numbers\n   - Use exact condition descriptions provided (e.g., \"Clear sky\", \"Rainy\", \"Partly Cloudy\")\n   - Add helpful context when appropriate (e.g., \"perfect for outdoor activities\" for clear skies)\n\n3. 
FLIGHT HANDLING:\n   - Present flight information clearly with airline name and flight number\n   - Include departure and arrival times with time zones when provided\n   - Mention origin and destination airports with their codes\n   - Include gate and terminal information when available\n   - Note aircraft type if relevant to the query\n   - Highlight any status updates (delays, early arrivals, etc.)\n   - For multiple flights, list them in chronological order by departure time\n   - If specific details are missing, acknowledge this rather than inventing information\n\n4. MULTI-PART QUERIES:\n   - Users may ask about both weather and flights in one message\n   - Answer ALL parts of the query that you have data for\n   - Organize your response logically - typically weather first, then flights, or vice versa based on the query\n   - Provide complete information for each topic without mentioning other agents\n   - If you receive data for only one topic but the user asked about multiple, answer what you can with the provided data\n\n5. ERROR HANDLING:\n   - If weather forecast contains an \"error\" field, acknowledge the issue politely\n   - If temperature or condition is null/None, mention that specific data is unavailable\n   - If flight details are incomplete, state which information is unavailable\n   - Never invent or guess weather or flight data - only use what's provided\n   - If location couldn't be determined, acknowledge this but still provide available data\n\n6. 
RESPONSE FORMAT:\n   \n   For Weather:\n   - Single-day queries: Provide current conditions, temperature, and condition\n   - Multi-day forecasts: List each day with date, day name, high/low temps, and condition\n   - Include sunrise/sunset times when available and relevant\n   \n   For Flights:\n   - List flights with clear numbering or bullet points\n   - Include key details: airline, flight number, departure/arrival times, airports\n   - Add gate, terminal, and status information when available\n   - For multiple flights, organize chronologically\n   \n   General:\n   - Use natural, conversational language\n   - Be concise but complete\n   - Format dates and times clearly\n   - Use bullet points or numbered lists for clarity\n\n7. LOCATION HANDLING:\n   - Always mention location names from the data\n   - For flights, clearly state origin and destination cities/airports\n   - If locations differ from what the user asked, acknowledge this politely\n\n8. RESPONSE STYLE:\n   - Be friendly and professional\n   - Use natural language, not technical jargon\n   - Provide information in a logical, easy-to-read format\n   - When answering multi-part queries, create a cohesive response that addresses all aspects\n\nRemember: Only use the data provided. Never fabricate weather or flight information. If data is missing, clearly state what's unavailable. Answer all parts of the user's query that you have data for."
-    },
-    {
-      "role": "assistant",
-      "content": "Current weather data for Seattle:\n\n{\n  \"location\": \"Seattle\",\n  \"forecast\": [\n    {\n      \"date\": \"2025-12-22\",\n      \"day_name\": \"Monday\",\n      \"temperature_c\": 8.3,\n      \"temperature_f\": 46.9,\n      \"temperature_max_c\": 8.3,\n      \"temperature_min_c\": 2.8,\n      \"condition\": \"Rainy\",\n      \"sunrise\": \"07:55\",\n      \"sunset\": \"16:20\"\n    }\n  ]\n}\n\nUse this data to answer the user's weather query."
-    },
-    {
-      "role": "assistant",
-      "content": "Here are some direct flights from Seattle to Atlanta on December 23, 2025:\n\n1. **Delta Airlines Flight DL552**\n   - **Departure:** Scheduled at 3:47 PM (Seattle Time), from Seattle-Tacoma Intl (SEA)\n   - **Arrival:** Scheduled at 8:31 PM (Atlanta Time), at Hartsfield-Jackson Intl (ATL)\n   - **Aircraft:** Boeing 737-900 (B739)\n   - **Status:** Scheduled\n   - **Terminal at Atlanta:** S\n   - **Estimated arrival slightly early**: 8:26 PM\n\n2. **Delta Airlines Flight DL542**\n   - **Departure:** Scheduled at 12:00 PM (Seattle Time), Gate A4, from Seattle-Tacoma Intl (SEA)\n   - **Arrival:** Scheduled at 4:49 PM (Atlanta Time), at Hartsfield-Jackson Intl (ATL)\n   - **Aircraft:** Boeing 737-900 (B739)\n   - **Status:** Scheduled\n   - **Gate at Atlanta:** E10, Terminal: S\n   - **Estimated early arrival**: 4:44 PM\n\n3. **Delta Airlines Flight DL554**\n   - **Departure:** Scheduled at 10:15 AM (Seattle Time), Gate A10, from Seattle-Tacoma Intl (SEA)\n   - **Arrival:** Scheduled at 4:05 PM (Atlanta Time), at Hartsfield-Jackson Intl (ATL)\n   - **Aircraft:** Boeing 737-900 (B739)\n   - **Status:** Scheduled\n   - **Gate at Atlanta:** B19, Terminal: S\n   - **Estimated late arrival**: 4:06 PM\n\n4. **Alaska Airlines Flight AS334**\n   - **Departure:** Scheduled at 9:16 AM (Seattle Time), Gate C20, from Seattle-Tacoma Intl (SEA)\n   - **Arrival:** Scheduled at 5:08 PM (Atlanta Time), at Hartsfield-Jackson Intl (ATL)\n   - **Aircraft:** Boeing 737-900 (B739)\n   - **Status:** Scheduled\n   - **Gate at Atlanta:** C5, Terminal: N\n\nThese are just a few of the direct flights available. Please let me know if you need more details on any other specific flight."
-    },
-    {
-      "role": "user",
-      "content": "What is the weather in Atlanta and what flight goes from Seattle to Atlanta?"
-    }
-  ],
-  "max_tokens": 1000,
-  "stream": false,
-  "temperature": 1.0
-}
--- a/demos/use_cases/travel_agents/uv.lock
+++ b/demos/use_cases/travel_agents/uv.lock
@ -1,524 +0,0 @@
-version = 1
-revision = 3
-requires-python = ">=3.10"
-
-[[package]]
-name = "annotated-doc"
-version = "0.0.4"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/57/ba/046ceea27344560984e26a590f90bc7f4a75b06701f653222458922b558c/annotated_doc-0.0.4.tar.gz", hash = "sha256:fbcda96e87e9c92ad167c2e53839e57503ecfda18804ea28102353485033faa4", size = 7288, upload-time = "2025-11-10T22:07:42.062Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/1e/d3/26bf1008eb3d2daa8ef4cacc7f3bfdc11818d111f7e2d0201bc6e3b49d45/annotated_doc-0.0.4-py3-none-any.whl", hash = "sha256:571ac1dc6991c450b25a9c2d84a3705e2ae7a53467b5d111c24fa8baabbed320", size = 5303, upload-time = "2025-11-10T22:07:40.673Z" },
-]
-
-[[package]]
-name = "annotated-types"
-version = "0.7.0"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/ee/67/531ea369ba64dcff5ec9c3402f9f51bf748cec26dde048a2f973a4eea7f5/annotated_types-0.7.0.tar.gz", hash = "sha256:aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89", size = 16081, upload-time = "2024-05-20T21:33:25.928Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643, upload-time = "2024-05-20T21:33:24.1Z" },
-]
-
-[[package]]
-name = "anyio"
-version = "4.12.0"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "exceptiongroup", marker = "python_full_version < '3.11'" },
-    { name = "idna" },
-    { name = "typing-extensions", marker = "python_full_version < '3.13'" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/16/ce/8a777047513153587e5434fd752e89334ac33e379aa3497db860eeb60377/anyio-4.12.0.tar.gz", hash = "sha256:73c693b567b0c55130c104d0b43a9baf3aa6a31fc6110116509f27bf75e21ec0", size = 228266, upload-time = "2025-11-28T23:37:38.911Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/7f/9c/36c5c37947ebfb8c7f22e0eb6e4d188ee2d53aa3880f3f2744fb894f0cb1/anyio-4.12.0-py3-none-any.whl", hash = "sha256:dad2376a628f98eeca4881fc56cd06affd18f659b17a747d3ff0307ced94b1bb", size = 113362, upload-time = "2025-11-28T23:36:57.897Z" },
-]
-
-[[package]]
-name = "certifi"
-version = "2025.11.12"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/a2/8c/58f469717fa48465e4a50c014a0400602d3c437d7c0c468e17ada824da3a/certifi-2025.11.12.tar.gz", hash = "sha256:d8ab5478f2ecd78af242878415affce761ca6bc54a22a27e026d7c25357c3316", size = 160538, upload-time = "2025-11-12T02:54:51.517Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/70/7d/9bc192684cea499815ff478dfcdc13835ddf401365057044fb721ec6bddb/certifi-2025.11.12-py3-none-any.whl", hash = "sha256:97de8790030bbd5c2d96b7ec782fc2f7820ef8dba6db909ccf95449f2d062d4b", size = 159438, upload-time = "2025-11-12T02:54:49.735Z" },
-]
-
-[[package]]
-name = "click"
-version = "8.3.1"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "colorama", marker = "sys_platform == 'win32'" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/3d/fa/656b739db8587d7b5dfa22e22ed02566950fbfbcdc20311993483657a5c0/click-8.3.1.tar.gz", hash = "sha256:12ff4785d337a1bb490bb7e9c2b1ee5da3112e94a8622f26a6c77f5d2fc6842a", size = 295065, upload-time = "2025-11-15T20:45:42.706Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/98/78/01c019cdb5d6498122777c1a43056ebb3ebfeef2076d9d026bfe15583b2b/click-8.3.1-py3-none-any.whl", hash = "sha256:981153a64e25f12d547d3426c367a4857371575ee7ad18df2a6183ab0545b2a6", size = 108274, upload-time = "2025-11-15T20:45:41.139Z" },
-]
-
-[[package]]
-name = "colorama"
-version = "0.4.6"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/d8/53/6f443c9a4a8358a93a6792e2acffb9d9d5cb0a5cfd8802644b7b1c9a02e4/colorama-0.4.6.tar.gz", hash = "sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44", size = 27697, upload-time = "2022-10-25T02:36:22.414Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6", size = 25335, upload-time = "2022-10-25T02:36:20.889Z" },
-]
-
-[[package]]
-name = "distro"
-version = "1.9.0"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/fc/f8/98eea607f65de6527f8a2e8885fc8015d3e6f5775df186e443e0964a11c3/distro-1.9.0.tar.gz", hash = "sha256:2fa77c6fd8940f116ee1d6b94a2f90b13b5ea8d019b98bc8bafdcabcdd9bdbed", size = 60722, upload-time = "2023-12-24T09:54:32.31Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/12/b3/231ffd4ab1fc9d679809f356cebee130ac7daa00d6d6f3206dd4fd137e9e/distro-1.9.0-py3-none-any.whl", hash = "sha256:7bffd925d65168f85027d8da9af6bddab658135b840670a223589bc0c8ef02b2", size = 20277, upload-time = "2023-12-24T09:54:30.421Z" },
-]
-
-[[package]]
-name = "exceptiongroup"
-version = "1.3.1"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "typing-extensions", marker = "python_full_version < '3.13'" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/50/79/66800aadf48771f6b62f7eb014e352e5d06856655206165d775e675a02c9/exceptiongroup-1.3.1.tar.gz", hash = "sha256:8b412432c6055b0b7d14c310000ae93352ed6754f70fa8f7c34141f91c4e3219", size = 30371, upload-time = "2025-11-21T23:01:54.787Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/8a/0e/97c33bf5009bdbac74fd2beace167cab3f978feb69cc36f1ef79360d6c4e/exceptiongroup-1.3.1-py3-none-any.whl", hash = "sha256:a7a39a3bd276781e98394987d3a5701d0c4edffb633bb7a5144577f82c773598", size = 16740, upload-time = "2025-11-21T23:01:53.443Z" },
-]
-
-[[package]]
-name = "fastapi"
-version = "0.125.0"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "annotated-doc" },
-    { name = "pydantic" },
-    { name = "starlette" },
-    { name = "typing-extensions" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/17/71/2df15009fb4bdd522a069d2fbca6007c6c5487fce5cb965be00fc335f1d1/fastapi-0.125.0.tar.gz", hash = "sha256:16b532691a33e2c5dee1dac32feb31dc6eb41a3dd4ff29a95f9487cb21c054c0", size = 370550, upload-time = "2025-12-17T21:41:44.15Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/34/2f/ff2fcc98f500713368d8b650e1bbc4a0b3ebcdd3e050dcdaad5f5a13fd7e/fastapi-0.125.0-py3-none-any.whl", hash = "sha256:2570ec4f3aecf5cca8f0428aed2398b774fcdfee6c2116f86e80513f2f86a7a1", size = 112888, upload-time = "2025-12-17T21:41:41.286Z" },
-]
-
-[[package]]
-name = "h11"
-version = "0.16.0"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/01/ee/02a2c011bdab74c6fb3c75474d40b3052059d95df7e73351460c8588d963/h11-0.16.0.tar.gz", hash = "sha256:4e35b956cf45792e4caa5885e69fba00bdbc6ffafbfa020300e549b208ee5ff1", size = 101250, upload-time = "2025-04-24T03:35:25.427Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/04/4b/29cac41a4d98d144bf5f6d33995617b185d14b22401f75ca86f384e87ff1/h11-0.16.0-py3-none-any.whl", hash = "sha256:63cf8bbe7522de3bf65932fda1d9c2772064ffb3dae62d55932da54b31cb6c86", size = 37515, upload-time = "2025-04-24T03:35:24.344Z" },
-]
-
-[[package]]
-name = "httpcore"
-version = "1.0.9"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "certifi" },
-    { name = "h11" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/06/94/82699a10bca87a5556c9c59b5963f2d039dbd239f25bc2a63907a05a14cb/httpcore-1.0.9.tar.gz", hash = "sha256:6e34463af53fd2ab5d807f399a9b45ea31c3dfa2276f15a2c3f00afff6e176e8", size = 85484, upload-time = "2025-04-24T22:06:22.219Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/7e/f5/f66802a942d491edb555dd61e3a9961140fd64c90bce1eafd741609d334d/httpcore-1.0.9-py3-none-any.whl", hash = "sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55", size = 78784, upload-time = "2025-04-24T22:06:20.566Z" },
-]
-
-[[package]]
-name = "httpx"
-version = "0.28.1"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "anyio" },
-    { name = "certifi" },
-    { name = "httpcore" },
-    { name = "idna" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/b1/df/48c586a5fe32a0f01324ee087459e112ebb7224f646c0b5023f5e79e9956/httpx-0.28.1.tar.gz", hash = "sha256:75e98c5f16b0f35b567856f597f06ff2270a374470a5c2392242528e3e3e42fc", size = 141406, upload-time = "2024-12-06T15:37:23.222Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/2a/39/e50c7c3a983047577ee07d2a9e53faf5a69493943ec3f6a384bdc792deb2/httpx-0.28.1-py3-none-any.whl", hash = "sha256:d909fcccc110f8c7faf814ca82a9a4d816bc5a6dbfea25d6591d6985b8ba59ad", size = 73517, upload-time = "2024-12-06T15:37:21.509Z" },
-]
-
-[[package]]
-name = "idna"
-version = "3.11"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/6f/6d/0703ccc57f3a7233505399edb88de3cbd678da106337b9fcde432b65ed60/idna-3.11.tar.gz", hash = "sha256:795dafcc9c04ed0c1fb032c2aa73654d8e8c5023a7df64a53f39190ada629902", size = 194582, upload-time = "2025-10-12T14:55:20.501Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/0e/61/66938bbb5fc52dbdf84594873d5b51fb1f7c7794e9c0f5bd885f30bc507b/idna-3.11-py3-none-any.whl", hash = "sha256:771a87f49d9defaf64091e6e6fe9c18d4833f140bd19464795bc32d966ca37ea", size = 71008, upload-time = "2025-10-12T14:55:18.883Z" },
-]
-
-[[package]]
-name = "importlib-metadata"
-version = "8.7.1"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "zipp" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/f3/49/3b30cad09e7771a4982d9975a8cbf64f00d4a1ececb53297f1d9a7be1b10/importlib_metadata-8.7.1.tar.gz", hash = "sha256:49fef1ae6440c182052f407c8d34a68f72efc36db9ca90dc0113398f2fdde8bb", size = 57107, upload-time = "2025-12-21T10:00:19.278Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/fa/5e/f8e9a1d23b9c20a551a8a02ea3637b4642e22c2626e3a13a9a29cdea99eb/importlib_metadata-8.7.1-py3-none-any.whl", hash = "sha256:5a1f80bf1daa489495071efbb095d75a634cf28a8bc299581244063b53176151", size = 27865, upload-time = "2025-12-21T10:00:18.329Z" },
-]
-
-[[package]]
-name = "jiter"
-version = "0.12.0"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/45/9d/e0660989c1370e25848bb4c52d061c71837239738ad937e83edca174c273/jiter-0.12.0.tar.gz", hash = "sha256:64dfcd7d5c168b38d3f9f8bba7fc639edb3418abcc74f22fdbe6b8938293f30b", size = 168294, upload-time = "2025-11-09T20:49:23.302Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/3b/91/13cb9505f7be74a933f37da3af22e029f6ba64f5669416cb8b2774bc9682/jiter-0.12.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:e7acbaba9703d5de82a2c98ae6a0f59ab9770ab5af5fa35e43a303aee962cf65", size = 316652, upload-time = "2025-11-09T20:46:41.021Z" },
-    { url = "https://files.pythonhosted.org/packages/4e/76/4e9185e5d9bb4e482cf6dec6410d5f78dfeb374cfcecbbe9888d07c52daa/jiter-0.12.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:364f1a7294c91281260364222f535bc427f56d4de1d8ffd718162d21fbbd602e", size = 319829, upload-time = "2025-11-09T20:46:43.281Z" },
-    { url = "https://files.pythonhosted.org/packages/86/af/727de50995d3a153138139f259baae2379d8cb0522c0c00419957bc478a6/jiter-0.12.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:85ee4d25805d4fb23f0a5167a962ef8e002dbfb29c0989378488e32cf2744b62", size = 350568, upload-time = "2025-11-09T20:46:45.075Z" },
-    { url = "https://files.pythonhosted.org/packages/6a/c1/d6e9f4b7a3d5ac63bcbdfddeb50b2dcfbdc512c86cffc008584fdc350233/jiter-0.12.0-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:796f466b7942107eb889c08433b6e31b9a7ed31daceaecf8af1be26fb26c0ca8", size = 369052, upload-time = "2025-11-09T20:46:46.818Z" },
-    { url = "https://files.pythonhosted.org/packages/eb/be/00824cd530f30ed73fa8a4f9f3890a705519e31ccb9e929f1e22062e7c76/jiter-0.12.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:35506cb71f47dba416694e67af996bbdefb8e3608f1f78799c2e1f9058b01ceb", size = 481585, upload-time = "2025-11-09T20:46:48.319Z" },
-    { url = "https://files.pythonhosted.org/packages/74/b6/2ad7990dff9504d4b5052eef64aa9574bd03d722dc7edced97aad0d47be7/jiter-0.12.0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:726c764a90c9218ec9e4f99a33d6bf5ec169163f2ca0fc21b654e88c2abc0abc", size = 380541, upload-time = "2025-11-09T20:46:49.643Z" },
-    { url = "https://files.pythonhosted.org/packages/b5/c7/f3c26ecbc1adbf1db0d6bba99192143d8fe8504729d9594542ecc4445784/jiter-0.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:baa47810c5565274810b726b0dc86d18dce5fd17b190ebdc3890851d7b2a0e74", size = 364423, upload-time = "2025-11-09T20:46:51.731Z" },
-    { url = "https://files.pythonhosted.org/packages/18/51/eac547bf3a2d7f7e556927278e14c56a0604b8cddae75815d5739f65f81d/jiter-0.12.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:f8ec0259d3f26c62aed4d73b198c53e316ae11f0f69c8fbe6682c6dcfa0fcce2", size = 389958, upload-time = "2025-11-09T20:46:53.432Z" },
-    { url = "https://files.pythonhosted.org/packages/2c/1f/9ca592e67175f2db156cff035e0d817d6004e293ee0c1d73692d38fcb596/jiter-0.12.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:79307d74ea83465b0152fa23e5e297149506435535282f979f18b9033c0bb025", size = 522084, upload-time = "2025-11-09T20:46:54.848Z" },
-    { url = "https://files.pythonhosted.org/packages/83/ff/597d9cdc3028f28224f53e1a9d063628e28b7a5601433e3196edda578cdd/jiter-0.12.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:cf6e6dd18927121fec86739f1a8906944703941d000f0639f3eb6281cc601dca", size = 513054, upload-time = "2025-11-09T20:46:56.487Z" },
-    { url = "https://files.pythonhosted.org/packages/24/6d/1970bce1351bd02e3afcc5f49e4f7ef3dabd7fb688f42be7e8091a5b809a/jiter-0.12.0-cp310-cp310-win32.whl", hash = "sha256:b6ae2aec8217327d872cbfb2c1694489057b9433afce447955763e6ab015b4c4", size = 206368, upload-time = "2025-11-09T20:46:58.638Z" },
-    { url = "https://files.pythonhosted.org/packages/e3/6b/eb1eb505b2d86709b59ec06681a2b14a94d0941db091f044b9f0e16badc0/jiter-0.12.0-cp310-cp310-win_amd64.whl", hash = "sha256:c7f49ce90a71e44f7e1aa9e7ec415b9686bbc6a5961e57eab511015e6759bc11", size = 204847, upload-time = "2025-11-09T20:47:00.295Z" },
-    { url = "https://files.pythonhosted.org/packages/32/f9/eaca4633486b527ebe7e681c431f529b63fe2709e7c5242fc0f43f77ce63/jiter-0.12.0-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:d8f8a7e317190b2c2d60eb2e8aa835270b008139562d70fe732e1c0020ec53c9", size = 316435, upload-time = "2025-11-09T20:47:02.087Z" },
-    { url = "https://files.pythonhosted.org/packages/10/c1/40c9f7c22f5e6ff715f28113ebaba27ab85f9af2660ad6e1dd6425d14c19/jiter-0.12.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:2218228a077e784c6c8f1a8e5d6b8cb1dea62ce25811c356364848554b2056cd", size = 320548, upload-time = "2025-11-09T20:47:03.409Z" },
-    { url = "https://files.pythonhosted.org/packages/6b/1b/efbb68fe87e7711b00d2cfd1f26bb4bfc25a10539aefeaa7727329ffb9cb/jiter-0.12.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9354ccaa2982bf2188fd5f57f79f800ef622ec67beb8329903abf6b10da7d423", size = 351915, upload-time = "2025-11-09T20:47:05.171Z" },
-    { url = "https://files.pythonhosted.org/packages/15/2d/c06e659888c128ad1e838123d0638f0efad90cc30860cb5f74dd3f2fc0b3/jiter-0.12.0-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:8f2607185ea89b4af9a604d4c7ec40e45d3ad03ee66998b031134bc510232bb7", size = 368966, upload-time = "2025-11-09T20:47:06.508Z" },
-    { url = "https://files.pythonhosted.org/packages/6b/20/058db4ae5fb07cf6a4ab2e9b9294416f606d8e467fb74c2184b2a1eeacba/jiter-0.12.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:3a585a5e42d25f2e71db5f10b171f5e5ea641d3aa44f7df745aa965606111cc2", size = 482047, upload-time = "2025-11-09T20:47:08.382Z" },
-    { url = "https://files.pythonhosted.org/packages/49/bb/dc2b1c122275e1de2eb12905015d61e8316b2f888bdaac34221c301495d6/jiter-0.12.0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:bd9e21d34edff5a663c631f850edcb786719c960ce887a5661e9c828a53a95d9", size = 380835, upload-time = "2025-11-09T20:47:09.81Z" },
-    { url = "https://files.pythonhosted.org/packages/23/7d/38f9cd337575349de16da575ee57ddb2d5a64d425c9367f5ef9e4612e32e/jiter-0.12.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4a612534770470686cd5431478dc5a1b660eceb410abade6b1b74e320ca98de6", size = 364587, upload-time = "2025-11-09T20:47:11.529Z" },
-    { url = "https://files.pythonhosted.org/packages/f0/a3/b13e8e61e70f0bb06085099c4e2462647f53cc2ca97614f7fedcaa2bb9f3/jiter-0.12.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:3985aea37d40a908f887b34d05111e0aae822943796ebf8338877fee2ab67725", size = 390492, upload-time = "2025-11-09T20:47:12.993Z" },
-    { url = "https://files.pythonhosted.org/packages/07/71/e0d11422ed027e21422f7bc1883c61deba2d9752b720538430c1deadfbca/jiter-0.12.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:b1207af186495f48f72529f8d86671903c8c10127cac6381b11dddc4aaa52df6", size = 522046, upload-time = "2025-11-09T20:47:14.6Z" },
-    { url = "https://files.pythonhosted.org/packages/9f/59/b968a9aa7102a8375dbbdfbd2aeebe563c7e5dddf0f47c9ef1588a97e224/jiter-0.12.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:ef2fb241de583934c9915a33120ecc06d94aa3381a134570f59eed784e87001e", size = 513392, upload-time = "2025-11-09T20:47:16.011Z" },
-    { url = "https://files.pythonhosted.org/packages/ca/e4/7df62002499080dbd61b505c5cb351aa09e9959d176cac2aa8da6f93b13b/jiter-0.12.0-cp311-cp311-win32.whl", hash = "sha256:453b6035672fecce8007465896a25b28a6b59cfe8fbc974b2563a92f5a92a67c", size = 206096, upload-time = "2025-11-09T20:47:17.344Z" },
-    { url = "https://files.pythonhosted.org/packages/bb/60/1032b30ae0572196b0de0e87dce3b6c26a1eff71aad5fe43dee3082d32e0/jiter-0.12.0-cp311-cp311-win_amd64.whl", hash = "sha256:ca264b9603973c2ad9435c71a8ec8b49f8f715ab5ba421c85a51cde9887e421f", size = 204899, upload-time = "2025-11-09T20:47:19.365Z" },
-    { url = "https://files.pythonhosted.org/packages/49/d5/c145e526fccdb834063fb45c071df78b0cc426bbaf6de38b0781f45d956f/jiter-0.12.0-cp311-cp311-win_arm64.whl", hash = "sha256:cb00ef392e7d684f2754598c02c409f376ddcef857aae796d559e6cacc2d78a5", size = 188070, upload-time = "2025-11-09T20:47:20.75Z" },
-    { url = "https://files.pythonhosted.org/packages/92/c9/5b9f7b4983f1b542c64e84165075335e8a236fa9e2ea03a0c79780062be8/jiter-0.12.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:305e061fa82f4680607a775b2e8e0bcb071cd2205ac38e6ef48c8dd5ebe1cf37", size = 314449, upload-time = "2025-11-09T20:47:22.999Z" },
-    { url = "https://files.pythonhosted.org/packages/98/6e/e8efa0e78de00db0aee82c0cf9e8b3f2027efd7f8a71f859d8f4be8e98ef/jiter-0.12.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:5c1860627048e302a528333c9307c818c547f214d8659b0705d2195e1a94b274", size = 319855, upload-time = "2025-11-09T20:47:24.779Z" },
-    { url = "https://files.pythonhosted.org/packages/20/26/894cd88e60b5d58af53bec5c6759d1292bd0b37a8b5f60f07abf7a63ae5f/jiter-0.12.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:df37577a4f8408f7e0ec3205d2a8f87672af8f17008358063a4d6425b6081ce3", size = 350171, upload-time = "2025-11-09T20:47:26.469Z" },
-    { url = "https://files.pythonhosted.org/packages/f5/27/a7b818b9979ac31b3763d25f3653ec3a954044d5e9f5d87f2f247d679fd1/jiter-0.12.0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:75fdd787356c1c13a4f40b43c2156276ef7a71eb487d98472476476d803fb2cf", size = 365590, upload-time = "2025-11-09T20:47:27.918Z" },
-    { url = "https://files.pythonhosted.org/packages/ba/7e/e46195801a97673a83746170b17984aa8ac4a455746354516d02ca5541b4/jiter-0.12.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1eb5db8d9c65b112aacf14fcd0faae9913d07a8afea5ed06ccdd12b724e966a1", size = 479462, upload-time = "2025-11-09T20:47:29.654Z" },
-    { url = "https://files.pythonhosted.org/packages/ca/75/f833bfb009ab4bd11b1c9406d333e3b4357709ed0570bb48c7c06d78c7dd/jiter-0.12.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:73c568cc27c473f82480abc15d1301adf333a7ea4f2e813d6a2c7d8b6ba8d0df", size = 378983, upload-time = "2025-11-09T20:47:31.026Z" },
-    { url = "https://files.pythonhosted.org/packages/71/b3/7a69d77943cc837d30165643db753471aff5df39692d598da880a6e51c24/jiter-0.12.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4321e8a3d868919bcb1abb1db550d41f2b5b326f72df29e53b2df8b006eb9403", size = 361328, upload-time = "2025-11-09T20:47:33.286Z" },
-    { url = "https://files.pythonhosted.org/packages/b0/ac/a78f90caf48d65ba70d8c6efc6f23150bc39dc3389d65bbec2a95c7bc628/jiter-0.12.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:0a51bad79f8cc9cac2b4b705039f814049142e0050f30d91695a2d9a6611f126", size = 386740, upload-time = "2025-11-09T20:47:34.703Z" },
-    { url = "https://files.pythonhosted.org/packages/39/b6/5d31c2cc8e1b6a6bcf3c5721e4ca0a3633d1ab4754b09bc7084f6c4f5327/jiter-0.12.0-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:2a67b678f6a5f1dd6c36d642d7db83e456bc8b104788262aaefc11a22339f5a9", size = 520875, upload-time = "2025-11-09T20:47:36.058Z" },
-    { url = "https://files.pythonhosted.org/packages/30/b5/4df540fae4e9f68c54b8dab004bd8c943a752f0b00efd6e7d64aa3850339/jiter-0.12.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:efe1a211fe1fd14762adea941e3cfd6c611a136e28da6c39272dbb7a1bbe6a86", size = 511457, upload-time = "2025-11-09T20:47:37.932Z" },
-    { url = "https://files.pythonhosted.org/packages/07/65/86b74010e450a1a77b2c1aabb91d4a91dd3cd5afce99f34d75fd1ac64b19/jiter-0.12.0-cp312-cp312-win32.whl", hash = "sha256:d779d97c834b4278276ec703dc3fc1735fca50af63eb7262f05bdb4e62203d44", size = 204546, upload-time = "2025-11-09T20:47:40.47Z" },
-    { url = "https://files.pythonhosted.org/packages/1c/c7/6659f537f9562d963488e3e55573498a442503ced01f7e169e96a6110383/jiter-0.12.0-cp312-cp312-win_amd64.whl", hash = "sha256:e8269062060212b373316fe69236096aaf4c49022d267c6736eebd66bbbc60bb", size = 205196, upload-time = "2025-11-09T20:47:41.794Z" },
-    { url = "https://files.pythonhosted.org/packages/21/f4/935304f5169edadfec7f9c01eacbce4c90bb9a82035ac1de1f3bd2d40be6/jiter-0.12.0-cp312-cp312-win_arm64.whl", hash = "sha256:06cb970936c65de926d648af0ed3d21857f026b1cf5525cb2947aa5e01e05789", size = 186100, upload-time = "2025-11-09T20:47:43.007Z" },
-    { url = "https://files.pythonhosted.org/packages/3d/a6/97209693b177716e22576ee1161674d1d58029eb178e01866a0422b69224/jiter-0.12.0-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:6cc49d5130a14b732e0612bc76ae8db3b49898732223ef8b7599aa8d9810683e", size = 313658, upload-time = "2025-11-09T20:47:44.424Z" },
-    { url = "https://files.pythonhosted.org/packages/06/4d/125c5c1537c7d8ee73ad3d530a442d6c619714b95027143f1b61c0b4dfe0/jiter-0.12.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:37f27a32ce36364d2fa4f7fdc507279db604d27d239ea2e044c8f148410defe1", size = 318605, upload-time = "2025-11-09T20:47:45.973Z" },
-    { url = "https://files.pythonhosted.org/packages/99/bf/a840b89847885064c41a5f52de6e312e91fa84a520848ee56c97e4fa0205/jiter-0.12.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bbc0944aa3d4b4773e348cda635252824a78f4ba44328e042ef1ff3f6080d1cf", size = 349803, upload-time = "2025-11-09T20:47:47.535Z" },
-    { url = "https://files.pythonhosted.org/packages/8a/88/e63441c28e0db50e305ae23e19c1d8fae012d78ed55365da392c1f34b09c/jiter-0.12.0-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:da25c62d4ee1ffbacb97fac6dfe4dcd6759ebdc9015991e92a6eae5816287f44", size = 365120, upload-time = "2025-11-09T20:47:49.284Z" },
-    { url = "https://files.pythonhosted.org/packages/0a/7c/49b02714af4343970eb8aca63396bc1c82fa01197dbb1e9b0d274b550d4e/jiter-0.12.0-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:048485c654b838140b007390b8182ba9774621103bd4d77c9c3f6f117474ba45", size = 479918, upload-time = "2025-11-09T20:47:50.807Z" },
-    { url = "https://files.pythonhosted.org/packages/69/ba/0a809817fdd5a1db80490b9150645f3aae16afad166960bcd562be194f3b/jiter-0.12.0-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:635e737fbb7315bef0037c19b88b799143d2d7d3507e61a76751025226b3ac87", size = 379008, upload-time = "2025-11-09T20:47:52.211Z" },
-    { url = "https://files.pythonhosted.org/packages/5f/c3/c9fc0232e736c8877d9e6d83d6eeb0ba4e90c6c073835cc2e8f73fdeef51/jiter-0.12.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4e017c417b1ebda911bd13b1e40612704b1f5420e30695112efdbed8a4b389ed", size = 361785, upload-time = "2025-11-09T20:47:53.512Z" },
-    { url = "https://files.pythonhosted.org/packages/96/61/61f69b7e442e97ca6cd53086ddc1cf59fb830549bc72c0a293713a60c525/jiter-0.12.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:89b0bfb8b2bf2351fba36bb211ef8bfceba73ef58e7f0c68fb67b5a2795ca2f9", size = 386108, upload-time = "2025-11-09T20:47:54.893Z" },
-    { url = "https://files.pythonhosted.org/packages/e9/2e/76bb3332f28550c8f1eba3bf6e5efe211efda0ddbbaf24976bc7078d42a5/jiter-0.12.0-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:f5aa5427a629a824a543672778c9ce0c5e556550d1569bb6ea28a85015287626", size = 519937, upload-time = "2025-11-09T20:47:56.253Z" },
-    { url = "https://files.pythonhosted.org/packages/84/d6/fa96efa87dc8bff2094fb947f51f66368fa56d8d4fc9e77b25d7fbb23375/jiter-0.12.0-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:ed53b3d6acbcb0fd0b90f20c7cb3b24c357fe82a3518934d4edfa8c6898e498c", size = 510853, upload-time = "2025-11-09T20:47:58.32Z" },
-    { url = "https://files.pythonhosted.org/packages/8a/28/93f67fdb4d5904a708119a6ab58a8f1ec226ff10a94a282e0215402a8462/jiter-0.12.0-cp313-cp313-win32.whl", hash = "sha256:4747de73d6b8c78f2e253a2787930f4fffc68da7fa319739f57437f95963c4de", size = 204699, upload-time = "2025-11-09T20:47:59.686Z" },
-    { url = "https://files.pythonhosted.org/packages/c4/1f/30b0eb087045a0abe2a5c9c0c0c8da110875a1d3be83afd4a9a4e548be3c/jiter-0.12.0-cp313-cp313-win_amd64.whl", hash = "sha256:e25012eb0c456fcc13354255d0338cd5397cce26c77b2832b3c4e2e255ea5d9a", size = 204258, upload-time = "2025-11-09T20:48:01.01Z" },
-    { url = "https://files.pythonhosted.org/packages/2c/f4/2b4daf99b96bce6fc47971890b14b2a36aef88d7beb9f057fafa032c6141/jiter-0.12.0-cp313-cp313-win_arm64.whl", hash = "sha256:c97b92c54fe6110138c872add030a1f99aea2401ddcdaa21edf74705a646dd60", size = 185503, upload-time = "2025-11-09T20:48:02.35Z" },
-    { url = "https://files.pythonhosted.org/packages/39/ca/67bb15a7061d6fe20b9b2a2fd783e296a1e0f93468252c093481a2f00efa/jiter-0.12.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:53839b35a38f56b8be26a7851a48b89bc47e5d88e900929df10ed93b95fea3d6", size = 317965, upload-time = "2025-11-09T20:48:03.783Z" },
-    { url = "https://files.pythonhosted.org/packages/18/af/1788031cd22e29c3b14bc6ca80b16a39a0b10e611367ffd480c06a259831/jiter-0.12.0-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:94f669548e55c91ab47fef8bddd9c954dab1938644e715ea49d7e117015110a4", size = 345831, upload-time = "2025-11-09T20:48:05.55Z" },
-    { url = "https://files.pythonhosted.org/packages/05/17/710bf8472d1dff0d3caf4ced6031060091c1320f84ee7d5dcbed1f352417/jiter-0.12.0-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:351d54f2b09a41600ffea43d081522d792e81dcfb915f6d2d242744c1cc48beb", size = 361272, upload-time = "2025-11-09T20:48:06.951Z" },
-    { url = "https://files.pythonhosted.org/packages/fb/f1/1dcc4618b59761fef92d10bcbb0b038b5160be653b003651566a185f1a5c/jiter-0.12.0-cp313-cp313t-win_amd64.whl", hash = "sha256:2a5e90604620f94bf62264e7c2c038704d38217b7465b863896c6d7c902b06c7", size = 204604, upload-time = "2025-11-09T20:48:08.328Z" },
-    { url = "https://files.pythonhosted.org/packages/d9/32/63cb1d9f1c5c6632a783c0052cde9ef7ba82688f7065e2f0d5f10a7e3edb/jiter-0.12.0-cp313-cp313t-win_arm64.whl", hash = "sha256:88ef757017e78d2860f96250f9393b7b577b06a956ad102c29c8237554380db3", size = 185628, upload-time = "2025-11-09T20:48:09.572Z" },
-    { url = "https://files.pythonhosted.org/packages/a8/99/45c9f0dbe4a1416b2b9a8a6d1236459540f43d7fb8883cff769a8db0612d/jiter-0.12.0-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:c46d927acd09c67a9fb1416df45c5a04c27e83aae969267e98fba35b74e99525", size = 312478, upload-time = "2025-11-09T20:48:10.898Z" },
-    { url = "https://files.pythonhosted.org/packages/4c/a7/54ae75613ba9e0f55fcb0bc5d1f807823b5167cc944e9333ff322e9f07dd/jiter-0.12.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:774ff60b27a84a85b27b88cd5583899c59940bcc126caca97eb2a9df6aa00c49", size = 318706, upload-time = "2025-11-09T20:48:12.266Z" },
-    { url = "https://files.pythonhosted.org/packages/59/31/2aa241ad2c10774baf6c37f8b8e1f39c07db358f1329f4eb40eba179c2a2/jiter-0.12.0-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c5433fab222fb072237df3f637d01b81f040a07dcac1cb4a5c75c7aa9ed0bef1", size = 351894, upload-time = "2025-11-09T20:48:13.673Z" },
-    { url = "https://files.pythonhosted.org/packages/54/4f/0f2759522719133a9042781b18cc94e335b6d290f5e2d3e6899d6af933e3/jiter-0.12.0-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f8c593c6e71c07866ec6bfb790e202a833eeec885022296aff6b9e0b92d6a70e", size = 365714, upload-time = "2025-11-09T20:48:15.083Z" },
-    { url = "https://files.pythonhosted.org/packages/dc/6f/806b895f476582c62a2f52c453151edd8a0fde5411b0497baaa41018e878/jiter-0.12.0-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:90d32894d4c6877a87ae00c6b915b609406819dce8bc0d4e962e4de2784e567e", size = 478989, upload-time = "2025-11-09T20:48:16.706Z" },
-    { url = "https://files.pythonhosted.org/packages/86/6c/012d894dc6e1033acd8db2b8346add33e413ec1c7c002598915278a37f79/jiter-0.12.0-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:798e46eed9eb10c3adbbacbd3bdb5ecd4cf7064e453d00dbef08802dae6937ff", size = 378615, upload-time = "2025-11-09T20:48:18.614Z" },
-    { url = "https://files.pythonhosted.org/packages/87/30/d718d599f6700163e28e2c71c0bbaf6dace692e7df2592fd793ac9276717/jiter-0.12.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b3f1368f0a6719ea80013a4eb90ba72e75d7ea67cfc7846db2ca504f3df0169a", size = 364745, upload-time = "2025-11-09T20:48:20.117Z" },
-    { url = "https://files.pythonhosted.org/packages/8f/85/315b45ce4b6ddc7d7fceca24068543b02bdc8782942f4ee49d652e2cc89f/jiter-0.12.0-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:65f04a9d0b4406f7e51279710b27484af411896246200e461d80d3ba0caa901a", size = 386502, upload-time = "2025-11-09T20:48:21.543Z" },
-    { url = "https://files.pythonhosted.org/packages/74/0b/ce0434fb40c5b24b368fe81b17074d2840748b4952256bab451b72290a49/jiter-0.12.0-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:fd990541982a24281d12b67a335e44f117e4c6cbad3c3b75c7dea68bf4ce3a67", size = 519845, upload-time = "2025-11-09T20:48:22.964Z" },
-    { url = "https://files.pythonhosted.org/packages/e8/a3/7a7a4488ba052767846b9c916d208b3ed114e3eb670ee984e4c565b9cf0d/jiter-0.12.0-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:b111b0e9152fa7df870ecaebb0bd30240d9f7fff1f2003bcb4ed0f519941820b", size = 510701, upload-time = "2025-11-09T20:48:24.483Z" },
-    { url = "https://files.pythonhosted.org/packages/c3/16/052ffbf9d0467b70af24e30f91e0579e13ded0c17bb4a8eb2aed3cb60131/jiter-0.12.0-cp314-cp314-win32.whl", hash = "sha256:a78befb9cc0a45b5a5a0d537b06f8544c2ebb60d19d02c41ff15da28a9e22d42", size = 205029, upload-time = "2025-11-09T20:48:25.749Z" },
-    { url = "https://files.pythonhosted.org/packages/e4/18/3cf1f3f0ccc789f76b9a754bdb7a6977e5d1d671ee97a9e14f7eb728d80e/jiter-0.12.0-cp314-cp314-win_amd64.whl", hash = "sha256:e1fe01c082f6aafbe5c8faf0ff074f38dfb911d53f07ec333ca03f8f6226debf", size = 204960, upload-time = "2025-11-09T20:48:27.415Z" },
-    { url = "https://files.pythonhosted.org/packages/02/68/736821e52ecfdeeb0f024b8ab01b5a229f6b9293bbdb444c27efade50b0f/jiter-0.12.0-cp314-cp314-win_arm64.whl", hash = "sha256:d72f3b5a432a4c546ea4bedc84cce0c3404874f1d1676260b9c7f048a9855451", size = 185529, upload-time = "2025-11-09T20:48:29.125Z" },
-    { url = "https://files.pythonhosted.org/packages/30/61/12ed8ee7a643cce29ac97c2281f9ce3956eb76b037e88d290f4ed0d41480/jiter-0.12.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:e6ded41aeba3603f9728ed2b6196e4df875348ab97b28fc8afff115ed42ba7a7", size = 318974, upload-time = "2025-11-09T20:48:30.87Z" },
-    { url = "https://files.pythonhosted.org/packages/2d/c6/f3041ede6d0ed5e0e79ff0de4c8f14f401bbf196f2ef3971cdbe5fd08d1d/jiter-0.12.0-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a947920902420a6ada6ad51892082521978e9dd44a802663b001436e4b771684", size = 345932, upload-time = "2025-11-09T20:48:32.658Z" },
-    { url = "https://files.pythonhosted.org/packages/d5/5d/4d94835889edd01ad0e2dbfc05f7bdfaed46292e7b504a6ac7839aa00edb/jiter-0.12.0-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:add5e227e0554d3a52cf390a7635edaffdf4f8fce4fdbcef3cc2055bb396a30c", size = 367243, upload-time = "2025-11-09T20:48:34.093Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/76/0051b0ac2816253a99d27baf3dda198663aff882fa6ea7deeb94046da24e/jiter-0.12.0-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:3f9b1cda8fcb736250d7e8711d4580ebf004a46771432be0ae4796944b5dfa5d", size = 479315, upload-time = "2025-11-09T20:48:35.507Z" },
-    { url = "https://files.pythonhosted.org/packages/70/ae/83f793acd68e5cb24e483f44f482a1a15601848b9b6f199dacb970098f77/jiter-0.12.0-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:deeb12a2223fe0135c7ff1356a143d57f95bbf1f4a66584f1fc74df21d86b993", size = 380714, upload-time = "2025-11-09T20:48:40.014Z" },
-    { url = "https://files.pythonhosted.org/packages/b1/5e/4808a88338ad2c228b1126b93fcd8ba145e919e886fe910d578230dabe3b/jiter-0.12.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c596cc0f4cb574877550ce4ecd51f8037469146addd676d7c1a30ebe6391923f", size = 365168, upload-time = "2025-11-09T20:48:41.462Z" },
-    { url = "https://files.pythonhosted.org/packages/0c/d4/04619a9e8095b42aef436b5aeb4c0282b4ff1b27d1db1508df9f5dc82750/jiter-0.12.0-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:5ab4c823b216a4aeab3fdbf579c5843165756bd9ad87cc6b1c65919c4715f783", size = 387893, upload-time = "2025-11-09T20:48:42.921Z" },
-    { url = "https://files.pythonhosted.org/packages/17/ea/d3c7e62e4546fdc39197fa4a4315a563a89b95b6d54c0d25373842a59cbe/jiter-0.12.0-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:e427eee51149edf962203ff8db75a7514ab89be5cb623fb9cea1f20b54f1107b", size = 520828, upload-time = "2025-11-09T20:48:44.278Z" },
-    { url = "https://files.pythonhosted.org/packages/cc/0b/c6d3562a03fd767e31cb119d9041ea7958c3c80cb3d753eafb19b3b18349/jiter-0.12.0-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:edb868841f84c111255ba5e80339d386d937ec1fdce419518ce1bd9370fac5b6", size = 511009, upload-time = "2025-11-09T20:48:45.726Z" },
-    { url = "https://files.pythonhosted.org/packages/aa/51/2cb4468b3448a8385ebcd15059d325c9ce67df4e2758d133ab9442b19834/jiter-0.12.0-cp314-cp314t-win32.whl", hash = "sha256:8bbcfe2791dfdb7c5e48baf646d37a6a3dcb5a97a032017741dea9f817dca183", size = 205110, upload-time = "2025-11-09T20:48:47.033Z" },
-    { url = "https://files.pythonhosted.org/packages/b2/c5/ae5ec83dec9c2d1af805fd5fe8f74ebded9c8670c5210ec7820ce0dbeb1e/jiter-0.12.0-cp314-cp314t-win_amd64.whl", hash = "sha256:2fa940963bf02e1d8226027ef461e36af472dea85d36054ff835aeed944dd873", size = 205223, upload-time = "2025-11-09T20:48:49.076Z" },
-    { url = "https://files.pythonhosted.org/packages/97/9a/3c5391907277f0e55195550cf3fa8e293ae9ee0c00fb402fec1e38c0c82f/jiter-0.12.0-cp314-cp314t-win_arm64.whl", hash = "sha256:506c9708dd29b27288f9f8f1140c3cb0e3d8ddb045956d7757b1fa0e0f39a473", size = 185564, upload-time = "2025-11-09T20:48:50.376Z" },
-    { url = "https://files.pythonhosted.org/packages/fe/54/5339ef1ecaa881c6948669956567a64d2670941925f245c434f494ffb0e5/jiter-0.12.0-graalpy311-graalpy242_311_native-macosx_10_12_x86_64.whl", hash = "sha256:4739a4657179ebf08f85914ce50332495811004cc1747852e8b2041ed2aab9b8", size = 311144, upload-time = "2025-11-09T20:49:10.503Z" },
-    { url = "https://files.pythonhosted.org/packages/27/74/3446c652bffbd5e81ab354e388b1b5fc1d20daac34ee0ed11ff096b1b01a/jiter-0.12.0-graalpy311-graalpy242_311_native-macosx_11_0_arm64.whl", hash = "sha256:41da8def934bf7bec16cb24bd33c0ca62126d2d45d81d17b864bd5ad721393c3", size = 305877, upload-time = "2025-11-09T20:49:12.269Z" },
-    { url = "https://files.pythonhosted.org/packages/a1/f4/ed76ef9043450f57aac2d4fbeb27175aa0eb9c38f833be6ef6379b3b9a86/jiter-0.12.0-graalpy311-graalpy242_311_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9c44ee814f499c082e69872d426b624987dbc5943ab06e9bbaa4f81989fdb79e", size = 340419, upload-time = "2025-11-09T20:49:13.803Z" },
-    { url = "https://files.pythonhosted.org/packages/21/01/857d4608f5edb0664aa791a3d45702e1a5bcfff9934da74035e7b9803846/jiter-0.12.0-graalpy311-graalpy242_311_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:cd2097de91cf03eaa27b3cbdb969addf83f0179c6afc41bbc4513705e013c65d", size = 347212, upload-time = "2025-11-09T20:49:15.643Z" },
-    { url = "https://files.pythonhosted.org/packages/cb/f5/12efb8ada5f5c9edc1d4555fe383c1fb2eac05ac5859258a72d61981d999/jiter-0.12.0-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:e8547883d7b96ef2e5fe22b88f8a4c8725a56e7f4abafff20fd5272d634c7ecb", size = 309974, upload-time = "2025-11-09T20:49:17.187Z" },
-    { url = "https://files.pythonhosted.org/packages/85/15/d6eb3b770f6a0d332675141ab3962fd4a7c270ede3515d9f3583e1d28276/jiter-0.12.0-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:89163163c0934854a668ed783a2546a0617f71706a2551a4a0666d91ab365d6b", size = 304233, upload-time = "2025-11-09T20:49:18.734Z" },
-    { url = "https://files.pythonhosted.org/packages/8c/3e/e7e06743294eea2cf02ced6aa0ff2ad237367394e37a0e2b4a1108c67a36/jiter-0.12.0-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d96b264ab7d34bbb2312dedc47ce07cd53f06835eacbc16dde3761f47c3a9e7f", size = 338537, upload-time = "2025-11-09T20:49:20.317Z" },
-    { url = "https://files.pythonhosted.org/packages/2f/9c/6753e6522b8d0ef07d3a3d239426669e984fb0eba15a315cdbc1253904e4/jiter-0.12.0-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c24e864cb30ab82311c6425655b0cdab0a98c5d973b065c66a3f020740c2324c", size = 346110, upload-time = "2025-11-09T20:49:21.817Z" },
-]
-
-[[package]]
-name = "openai"
-version = "2.13.0"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "anyio" },
-    { name = "distro" },
-    { name = "httpx" },
-    { name = "jiter" },
-    { name = "pydantic" },
-    { name = "sniffio" },
-    { name = "tqdm" },
-    { name = "typing-extensions" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/0f/39/8e347e9fda125324d253084bb1b82407e5e3c7777a03dc398f79b2d95626/openai-2.13.0.tar.gz", hash = "sha256:9ff633b07a19469ec476b1e2b5b26c5ef700886524a7a72f65e6f0b5203142d5", size = 626583, upload-time = "2025-12-16T18:19:44.387Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/bb/d5/eb52edff49d3d5ea116e225538c118699ddeb7c29fa17ec28af14bc10033/openai-2.13.0-py3-none-any.whl", hash = "sha256:746521065fed68df2f9c2d85613bb50844343ea81f60009b60e6a600c9352c79", size = 1066837, upload-time = "2025-12-16T18:19:43.124Z" },
-]
-
-[[package]]
-name = "opentelemetry-api"
-version = "1.39.1"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "importlib-metadata" },
-    { name = "typing-extensions" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/97/b9/3161be15bb8e3ad01be8be5a968a9237c3027c5be504362ff800fca3e442/opentelemetry_api-1.39.1.tar.gz", hash = "sha256:fbde8c80e1b937a2c61f20347e91c0c18a1940cecf012d62e65a7caf08967c9c", size = 65767, upload-time = "2025-12-11T13:32:39.182Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/cf/df/d3f1ddf4bb4cb50ed9b1139cc7b1c54c34a1e7ce8fd1b9a37c0d1551a6bd/opentelemetry_api-1.39.1-py3-none-any.whl", hash = "sha256:2edd8463432a7f8443edce90972169b195e7d6a05500cd29e6d13898187c9950", size = 66356, upload-time = "2025-12-11T13:32:17.304Z" },
-]
-
-[[package]]
-name = "pydantic"
-version = "2.12.5"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "annotated-types" },
-    { name = "pydantic-core" },
-    { name = "typing-extensions" },
-    { name = "typing-inspection" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/69/44/36f1a6e523abc58ae5f928898e4aca2e0ea509b5aa6f6f392a5d882be928/pydantic-2.12.5.tar.gz", hash = "sha256:4d351024c75c0f085a9febbb665ce8c0c6ec5d30e903bdb6394b7ede26aebb49", size = 821591, upload-time = "2025-11-26T15:11:46.471Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/5a/87/b70ad306ebb6f9b585f114d0ac2137d792b48be34d732d60e597c2f8465a/pydantic-2.12.5-py3-none-any.whl", hash = "sha256:e561593fccf61e8a20fc46dfc2dfe075b8be7d0188df33f221ad1f0139180f9d", size = 463580, upload-time = "2025-11-26T15:11:44.605Z" },
-]
-
-[[package]]
-name = "pydantic-core"
-version = "2.41.5"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "typing-extensions" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/71/70/23b021c950c2addd24ec408e9ab05d59b035b39d97cdc1130e1bce647bb6/pydantic_core-2.41.5.tar.gz", hash = "sha256:08daa51ea16ad373ffd5e7606252cc32f07bc72b28284b6bc9c6df804816476e", size = 460952, upload-time = "2025-11-04T13:43:49.098Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/c6/90/32c9941e728d564b411d574d8ee0cf09b12ec978cb22b294995bae5549a5/pydantic_core-2.41.5-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:77b63866ca88d804225eaa4af3e664c5faf3568cea95360d21f4725ab6e07146", size = 2107298, upload-time = "2025-11-04T13:39:04.116Z" },
-    { url = "https://files.pythonhosted.org/packages/fb/a8/61c96a77fe28993d9a6fb0f4127e05430a267b235a124545d79fea46dd65/pydantic_core-2.41.5-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:dfa8a0c812ac681395907e71e1274819dec685fec28273a28905df579ef137e2", size = 1901475, upload-time = "2025-11-04T13:39:06.055Z" },
-    { url = "https://files.pythonhosted.org/packages/5d/b6/338abf60225acc18cdc08b4faef592d0310923d19a87fba1faf05af5346e/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5921a4d3ca3aee735d9fd163808f5e8dd6c6972101e4adbda9a4667908849b97", size = 1918815, upload-time = "2025-11-04T13:39:10.41Z" },
-    { url = "https://files.pythonhosted.org/packages/d1/1c/2ed0433e682983d8e8cba9c8d8ef274d4791ec6a6f24c58935b90e780e0a/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e25c479382d26a2a41b7ebea1043564a937db462816ea07afa8a44c0866d52f9", size = 2065567, upload-time = "2025-11-04T13:39:12.244Z" },
-    { url = "https://files.pythonhosted.org/packages/b3/24/cf84974ee7d6eae06b9e63289b7b8f6549d416b5c199ca2d7ce13bbcf619/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:f547144f2966e1e16ae626d8ce72b4cfa0caedc7fa28052001c94fb2fcaa1c52", size = 2230442, upload-time = "2025-11-04T13:39:13.962Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/21/4e287865504b3edc0136c89c9c09431be326168b1eb7841911cbc877a995/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6f52298fbd394f9ed112d56f3d11aabd0d5bd27beb3084cc3d8ad069483b8941", size = 2350956, upload-time = "2025-11-04T13:39:15.889Z" },
-    { url = "https://files.pythonhosted.org/packages/a8/76/7727ef2ffa4b62fcab916686a68a0426b9b790139720e1934e8ba797e238/pydantic_core-2.41.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:100baa204bb412b74fe285fb0f3a385256dad1d1879f0a5cb1499ed2e83d132a", size = 2068253, upload-time = "2025-11-04T13:39:17.403Z" },
-    { url = "https://files.pythonhosted.org/packages/d5/8c/a4abfc79604bcb4c748e18975c44f94f756f08fb04218d5cb87eb0d3a63e/pydantic_core-2.41.5-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:05a2c8852530ad2812cb7914dc61a1125dc4e06252ee98e5638a12da6cc6fb6c", size = 2177050, upload-time = "2025-11-04T13:39:19.351Z" },
-    { url = "https://files.pythonhosted.org/packages/67/b1/de2e9a9a79b480f9cb0b6e8b6ba4c50b18d4e89852426364c66aa82bb7b3/pydantic_core-2.41.5-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:29452c56df2ed968d18d7e21f4ab0ac55e71dc59524872f6fc57dcf4a3249ed2", size = 2147178, upload-time = "2025-11-04T13:39:21Z" },
-    { url = "https://files.pythonhosted.org/packages/16/c1/dfb33f837a47b20417500efaa0378adc6635b3c79e8369ff7a03c494b4ac/pydantic_core-2.41.5-cp310-cp310-musllinux_1_1_armv7l.whl", hash = "sha256:d5160812ea7a8a2ffbe233d8da666880cad0cbaf5d4de74ae15c313213d62556", size = 2341833, upload-time = "2025-11-04T13:39:22.606Z" },
-    { url = "https://files.pythonhosted.org/packages/47/36/00f398642a0f4b815a9a558c4f1dca1b4020a7d49562807d7bc9ff279a6c/pydantic_core-2.41.5-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:df3959765b553b9440adfd3c795617c352154e497a4eaf3752555cfb5da8fc49", size = 2321156, upload-time = "2025-11-04T13:39:25.843Z" },
-    { url = "https://files.pythonhosted.org/packages/7e/70/cad3acd89fde2010807354d978725ae111ddf6d0ea46d1ea1775b5c1bd0c/pydantic_core-2.41.5-cp310-cp310-win32.whl", hash = "sha256:1f8d33a7f4d5a7889e60dc39856d76d09333d8a6ed0f5f1190635cbec70ec4ba", size = 1989378, upload-time = "2025-11-04T13:39:27.92Z" },
-    { url = "https://files.pythonhosted.org/packages/76/92/d338652464c6c367e5608e4488201702cd1cbb0f33f7b6a85a60fe5f3720/pydantic_core-2.41.5-cp310-cp310-win_amd64.whl", hash = "sha256:62de39db01b8d593e45871af2af9e497295db8d73b085f6bfd0b18c83c70a8f9", size = 2013622, upload-time = "2025-11-04T13:39:29.848Z" },
-    { url = "https://files.pythonhosted.org/packages/e8/72/74a989dd9f2084b3d9530b0915fdda64ac48831c30dbf7c72a41a5232db8/pydantic_core-2.41.5-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:a3a52f6156e73e7ccb0f8cced536adccb7042be67cb45f9562e12b319c119da6", size = 2105873, upload-time = "2025-11-04T13:39:31.373Z" },
-    { url = "https://files.pythonhosted.org/packages/12/44/37e403fd9455708b3b942949e1d7febc02167662bf1a7da5b78ee1ea2842/pydantic_core-2.41.5-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7f3bf998340c6d4b0c9a2f02d6a400e51f123b59565d74dc60d252ce888c260b", size = 1899826, upload-time = "2025-11-04T13:39:32.897Z" },
-    { url = "https://files.pythonhosted.org/packages/33/7f/1d5cab3ccf44c1935a359d51a8a2a9e1a654b744b5e7f80d41b88d501eec/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:378bec5c66998815d224c9ca994f1e14c0c21cb95d2f52b6021cc0b2a58f2a5a", size = 1917869, upload-time = "2025-11-04T13:39:34.469Z" },
-    { url = "https://files.pythonhosted.org/packages/6e/6a/30d94a9674a7fe4f4744052ed6c5e083424510be1e93da5bc47569d11810/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e7b576130c69225432866fe2f4a469a85a54ade141d96fd396dffcf607b558f8", size = 2063890, upload-time = "2025-11-04T13:39:36.053Z" },
-    { url = "https://files.pythonhosted.org/packages/50/be/76e5d46203fcb2750e542f32e6c371ffa9b8ad17364cf94bb0818dbfb50c/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6cb58b9c66f7e4179a2d5e0f849c48eff5c1fca560994d6eb6543abf955a149e", size = 2229740, upload-time = "2025-11-04T13:39:37.753Z" },
-    { url = "https://files.pythonhosted.org/packages/d3/ee/fed784df0144793489f87db310a6bbf8118d7b630ed07aa180d6067e653a/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:88942d3a3dff3afc8288c21e565e476fc278902ae4d6d134f1eeda118cc830b1", size = 2350021, upload-time = "2025-11-04T13:39:40.94Z" },
-    { url = "https://files.pythonhosted.org/packages/c8/be/8fed28dd0a180dca19e72c233cbf58efa36df055e5b9d90d64fd1740b828/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f31d95a179f8d64d90f6831d71fa93290893a33148d890ba15de25642c5d075b", size = 2066378, upload-time = "2025-11-04T13:39:42.523Z" },
-    { url = "https://files.pythonhosted.org/packages/b0/3b/698cf8ae1d536a010e05121b4958b1257f0b5522085e335360e53a6b1c8b/pydantic_core-2.41.5-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:c1df3d34aced70add6f867a8cf413e299177e0c22660cc767218373d0779487b", size = 2175761, upload-time = "2025-11-04T13:39:44.553Z" },
-    { url = "https://files.pythonhosted.org/packages/b8/ba/15d537423939553116dea94ce02f9c31be0fa9d0b806d427e0308ec17145/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:4009935984bd36bd2c774e13f9a09563ce8de4abaa7226f5108262fa3e637284", size = 2146303, upload-time = "2025-11-04T13:39:46.238Z" },
-    { url = "https://files.pythonhosted.org/packages/58/7f/0de669bf37d206723795f9c90c82966726a2ab06c336deba4735b55af431/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_armv7l.whl", hash = "sha256:34a64bc3441dc1213096a20fe27e8e128bd3ff89921706e83c0b1ac971276594", size = 2340355, upload-time = "2025-11-04T13:39:48.002Z" },
-    { url = "https://files.pythonhosted.org/packages/e5/de/e7482c435b83d7e3c3ee5ee4451f6e8973cff0eb6007d2872ce6383f6398/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:c9e19dd6e28fdcaa5a1de679aec4141f691023916427ef9bae8584f9c2fb3b0e", size = 2319875, upload-time = "2025-11-04T13:39:49.705Z" },
-    { url = "https://files.pythonhosted.org/packages/fe/e6/8c9e81bb6dd7560e33b9053351c29f30c8194b72f2d6932888581f503482/pydantic_core-2.41.5-cp311-cp311-win32.whl", hash = "sha256:2c010c6ded393148374c0f6f0bf89d206bf3217f201faa0635dcd56bd1520f6b", size = 1987549, upload-time = "2025-11-04T13:39:51.842Z" },
-    { url = "https://files.pythonhosted.org/packages/11/66/f14d1d978ea94d1bc21fc98fcf570f9542fe55bfcc40269d4e1a21c19bf7/pydantic_core-2.41.5-cp311-cp311-win_amd64.whl", hash = "sha256:76ee27c6e9c7f16f47db7a94157112a2f3a00e958bc626e2f4ee8bec5c328fbe", size = 2011305, upload-time = "2025-11-04T13:39:53.485Z" },
-    { url = "https://files.pythonhosted.org/packages/56/d8/0e271434e8efd03186c5386671328154ee349ff0354d83c74f5caaf096ed/pydantic_core-2.41.5-cp311-cp311-win_arm64.whl", hash = "sha256:4bc36bbc0b7584de96561184ad7f012478987882ebf9f9c389b23f432ea3d90f", size = 1972902, upload-time = "2025-11-04T13:39:56.488Z" },
-    { url = "https://files.pythonhosted.org/packages/5f/5d/5f6c63eebb5afee93bcaae4ce9a898f3373ca23df3ccaef086d0233a35a7/pydantic_core-2.41.5-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:f41a7489d32336dbf2199c8c0a215390a751c5b014c2c1c5366e817202e9cdf7", size = 2110990, upload-time = "2025-11-04T13:39:58.079Z" },
-    { url = "https://files.pythonhosted.org/packages/aa/32/9c2e8ccb57c01111e0fd091f236c7b371c1bccea0fa85247ac55b1e2b6b6/pydantic_core-2.41.5-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:070259a8818988b9a84a449a2a7337c7f430a22acc0859c6b110aa7212a6d9c0", size = 1896003, upload-time = "2025-11-04T13:39:59.956Z" },
-    { url = "https://files.pythonhosted.org/packages/68/b8/a01b53cb0e59139fbc9e4fda3e9724ede8de279097179be4ff31f1abb65a/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e96cea19e34778f8d59fe40775a7a574d95816eb150850a85a7a4c8f4b94ac69", size = 1919200, upload-time = "2025-11-04T13:40:02.241Z" },
-    { url = "https://files.pythonhosted.org/packages/38/de/8c36b5198a29bdaade07b5985e80a233a5ac27137846f3bc2d3b40a47360/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:ed2e99c456e3fadd05c991f8f437ef902e00eedf34320ba2b0842bd1c3ca3a75", size = 2052578, upload-time = "2025-11-04T13:40:04.401Z" },
-    { url = "https://files.pythonhosted.org/packages/00/b5/0e8e4b5b081eac6cb3dbb7e60a65907549a1ce035a724368c330112adfdd/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:65840751b72fbfd82c3c640cff9284545342a4f1eb1586ad0636955b261b0b05", size = 2208504, upload-time = "2025-11-04T13:40:06.072Z" },
-    { url = "https://files.pythonhosted.org/packages/77/56/87a61aad59c7c5b9dc8caad5a41a5545cba3810c3e828708b3d7404f6cef/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e536c98a7626a98feb2d3eaf75944ef6f3dbee447e1f841eae16f2f0a72d8ddc", size = 2335816, upload-time = "2025-11-04T13:40:07.835Z" },
-    { url = "https://files.pythonhosted.org/packages/0d/76/941cc9f73529988688a665a5c0ecff1112b3d95ab48f81db5f7606f522d3/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:eceb81a8d74f9267ef4081e246ffd6d129da5d87e37a77c9bde550cb04870c1c", size = 2075366, upload-time = "2025-11-04T13:40:09.804Z" },
-    { url = "https://files.pythonhosted.org/packages/d3/43/ebef01f69baa07a482844faaa0a591bad1ef129253ffd0cdaa9d8a7f72d3/pydantic_core-2.41.5-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d38548150c39b74aeeb0ce8ee1d8e82696f4a4e16ddc6de7b1d8823f7de4b9b5", size = 2171698, upload-time = "2025-11-04T13:40:12.004Z" },
-    { url = "https://files.pythonhosted.org/packages/b1/87/41f3202e4193e3bacfc2c065fab7706ebe81af46a83d3e27605029c1f5a6/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:c23e27686783f60290e36827f9c626e63154b82b116d7fe9adba1fda36da706c", size = 2132603, upload-time = "2025-11-04T13:40:13.868Z" },
-    { url = "https://files.pythonhosted.org/packages/49/7d/4c00df99cb12070b6bccdef4a195255e6020a550d572768d92cc54dba91a/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:482c982f814460eabe1d3bb0adfdc583387bd4691ef00b90575ca0d2b6fe2294", size = 2329591, upload-time = "2025-11-04T13:40:15.672Z" },
-    { url = "https://files.pythonhosted.org/packages/cc/6a/ebf4b1d65d458f3cda6a7335d141305dfa19bdc61140a884d165a8a1bbc7/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:bfea2a5f0b4d8d43adf9d7b8bf019fb46fdd10a2e5cde477fbcb9d1fa08c68e1", size = 2319068, upload-time = "2025-11-04T13:40:17.532Z" },
-    { url = "https://files.pythonhosted.org/packages/49/3b/774f2b5cd4192d5ab75870ce4381fd89cf218af999515baf07e7206753f0/pydantic_core-2.41.5-cp312-cp312-win32.whl", hash = "sha256:b74557b16e390ec12dca509bce9264c3bbd128f8a2c376eaa68003d7f327276d", size = 1985908, upload-time = "2025-11-04T13:40:19.309Z" },
-    { url = "https://files.pythonhosted.org/packages/86/45/00173a033c801cacf67c190fef088789394feaf88a98a7035b0e40d53dc9/pydantic_core-2.41.5-cp312-cp312-win_amd64.whl", hash = "sha256:1962293292865bca8e54702b08a4f26da73adc83dd1fcf26fbc875b35d81c815", size = 2020145, upload-time = "2025-11-04T13:40:21.548Z" },
-    { url = "https://files.pythonhosted.org/packages/f9/22/91fbc821fa6d261b376a3f73809f907cec5ca6025642c463d3488aad22fb/pydantic_core-2.41.5-cp312-cp312-win_arm64.whl", hash = "sha256:1746d4a3d9a794cacae06a5eaaccb4b8643a131d45fbc9af23e353dc0a5ba5c3", size = 1976179, upload-time = "2025-11-04T13:40:23.393Z" },
-    { url = "https://files.pythonhosted.org/packages/87/06/8806241ff1f70d9939f9af039c6c35f2360cf16e93c2ca76f184e76b1564/pydantic_core-2.41.5-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:941103c9be18ac8daf7b7adca8228f8ed6bb7a1849020f643b3a14d15b1924d9", size = 2120403, upload-time = "2025-11-04T13:40:25.248Z" },
-    { url = "https://files.pythonhosted.org/packages/94/02/abfa0e0bda67faa65fef1c84971c7e45928e108fe24333c81f3bfe35d5f5/pydantic_core-2.41.5-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:112e305c3314f40c93998e567879e887a3160bb8689ef3d2c04b6cc62c33ac34", size = 1896206, upload-time = "2025-11-04T13:40:27.099Z" },
-    { url = "https://files.pythonhosted.org/packages/15/df/a4c740c0943e93e6500f9eb23f4ca7ec9bf71b19e608ae5b579678c8d02f/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0cbaad15cb0c90aa221d43c00e77bb33c93e8d36e0bf74760cd00e732d10a6a0", size = 1919307, upload-time = "2025-11-04T13:40:29.806Z" },
-    { url = "https://files.pythonhosted.org/packages/9a/e3/6324802931ae1d123528988e0e86587c2072ac2e5394b4bc2bc34b61ff6e/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:03ca43e12fab6023fc79d28ca6b39b05f794ad08ec2feccc59a339b02f2b3d33", size = 2063258, upload-time = "2025-11-04T13:40:33.544Z" },
-    { url = "https://files.pythonhosted.org/packages/c9/d4/2230d7151d4957dd79c3044ea26346c148c98fbf0ee6ebd41056f2d62ab5/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:dc799088c08fa04e43144b164feb0c13f9a0bc40503f8df3e9fde58a3c0c101e", size = 2214917, upload-time = "2025-11-04T13:40:35.479Z" },
-    { url = "https://files.pythonhosted.org/packages/e6/9f/eaac5df17a3672fef0081b6c1bb0b82b33ee89aa5cec0d7b05f52fd4a1fa/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:97aeba56665b4c3235a0e52b2c2f5ae9cd071b8a8310ad27bddb3f7fb30e9aa2", size = 2332186, upload-time = "2025-11-04T13:40:37.436Z" },
-    { url = "https://files.pythonhosted.org/packages/cf/4e/35a80cae583a37cf15604b44240e45c05e04e86f9cfd766623149297e971/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:406bf18d345822d6c21366031003612b9c77b3e29ffdb0f612367352aab7d586", size = 2073164, upload-time = "2025-11-04T13:40:40.289Z" },
-    { url = "https://files.pythonhosted.org/packages/bf/e3/f6e262673c6140dd3305d144d032f7bd5f7497d3871c1428521f19f9efa2/pydantic_core-2.41.5-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:b93590ae81f7010dbe380cdeab6f515902ebcbefe0b9327cc4804d74e93ae69d", size = 2179146, upload-time = "2025-11-04T13:40:42.809Z" },
-    { url = "https://files.pythonhosted.org/packages/75/c7/20bd7fc05f0c6ea2056a4565c6f36f8968c0924f19b7d97bbfea55780e73/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:01a3d0ab748ee531f4ea6c3e48ad9dac84ddba4b0d82291f87248f2f9de8d740", size = 2137788, upload-time = "2025-11-04T13:40:44.752Z" },
-    { url = "https://files.pythonhosted.org/packages/3a/8d/34318ef985c45196e004bc46c6eab2eda437e744c124ef0dbe1ff2c9d06b/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:6561e94ba9dacc9c61bce40e2d6bdc3bfaa0259d3ff36ace3b1e6901936d2e3e", size = 2340133, upload-time = "2025-11-04T13:40:46.66Z" },
-    { url = "https://files.pythonhosted.org/packages/9c/59/013626bf8c78a5a5d9350d12e7697d3d4de951a75565496abd40ccd46bee/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:915c3d10f81bec3a74fbd4faebe8391013ba61e5a1a8d48c4455b923bdda7858", size = 2324852, upload-time = "2025-11-04T13:40:48.575Z" },
-    { url = "https://files.pythonhosted.org/packages/1a/d9/c248c103856f807ef70c18a4f986693a46a8ffe1602e5d361485da502d20/pydantic_core-2.41.5-cp313-cp313-win32.whl", hash = "sha256:650ae77860b45cfa6e2cdafc42618ceafab3a2d9a3811fcfbd3bbf8ac3c40d36", size = 1994679, upload-time = "2025-11-04T13:40:50.619Z" },
-    { url = "https://files.pythonhosted.org/packages/9e/8b/341991b158ddab181cff136acd2552c9f35bd30380422a639c0671e99a91/pydantic_core-2.41.5-cp313-cp313-win_amd64.whl", hash = "sha256:79ec52ec461e99e13791ec6508c722742ad745571f234ea6255bed38c6480f11", size = 2019766, upload-time = "2025-11-04T13:40:52.631Z" },
-    { url = "https://files.pythonhosted.org/packages/73/7d/f2f9db34af103bea3e09735bb40b021788a5e834c81eedb541991badf8f5/pydantic_core-2.41.5-cp313-cp313-win_arm64.whl", hash = "sha256:3f84d5c1b4ab906093bdc1ff10484838aca54ef08de4afa9de0f5f14d69639cd", size = 1981005, upload-time = "2025-11-04T13:40:54.734Z" },
-    { url = "https://files.pythonhosted.org/packages/ea/28/46b7c5c9635ae96ea0fbb779e271a38129df2550f763937659ee6c5dbc65/pydantic_core-2.41.5-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:3f37a19d7ebcdd20b96485056ba9e8b304e27d9904d233d7b1015db320e51f0a", size = 2119622, upload-time = "2025-11-04T13:40:56.68Z" },
-    { url = "https://files.pythonhosted.org/packages/74/1a/145646e5687e8d9a1e8d09acb278c8535ebe9e972e1f162ed338a622f193/pydantic_core-2.41.5-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:1d1d9764366c73f996edd17abb6d9d7649a7eb690006ab6adbda117717099b14", size = 1891725, upload-time = "2025-11-04T13:40:58.807Z" },
-    { url = "https://files.pythonhosted.org/packages/23/04/e89c29e267b8060b40dca97bfc64a19b2a3cf99018167ea1677d96368273/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:25e1c2af0fce638d5f1988b686f3b3ea8cd7de5f244ca147c777769e798a9cd1", size = 1915040, upload-time = "2025-11-04T13:41:00.853Z" },
-    { url = "https://files.pythonhosted.org/packages/84/a3/15a82ac7bd97992a82257f777b3583d3e84bdb06ba6858f745daa2ec8a85/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:506d766a8727beef16b7adaeb8ee6217c64fc813646b424d0804d67c16eddb66", size = 2063691, upload-time = "2025-11-04T13:41:03.504Z" },
-    { url = "https://files.pythonhosted.org/packages/74/9b/0046701313c6ef08c0c1cf0e028c67c770a4e1275ca73131563c5f2a310a/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:4819fa52133c9aa3c387b3328f25c1facc356491e6135b459f1de698ff64d869", size = 2213897, upload-time = "2025-11-04T13:41:05.804Z" },
-    { url = "https://files.pythonhosted.org/packages/8a/cd/6bac76ecd1b27e75a95ca3a9a559c643b3afcd2dd62086d4b7a32a18b169/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2b761d210c9ea91feda40d25b4efe82a1707da2ef62901466a42492c028553a2", size = 2333302, upload-time = "2025-11-04T13:41:07.809Z" },
-    { url = "https://files.pythonhosted.org/packages/4c/d2/ef2074dc020dd6e109611a8be4449b98cd25e1b9b8a303c2f0fca2f2bcf7/pydantic_core-2.41.5-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:22f0fb8c1c583a3b6f24df2470833b40207e907b90c928cc8d3594b76f874375", size = 2064877, upload-time = "2025-11-04T13:41:09.827Z" },
-    { url = "https://files.pythonhosted.org/packages/18/66/e9db17a9a763d72f03de903883c057b2592c09509ccfe468187f2a2eef29/pydantic_core-2.41.5-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:2782c870e99878c634505236d81e5443092fba820f0373997ff75f90f68cd553", size = 2180680, upload-time = "2025-11-04T13:41:12.379Z" },
-    { url = "https://files.pythonhosted.org/packages/d3/9e/3ce66cebb929f3ced22be85d4c2399b8e85b622db77dad36b73c5387f8f8/pydantic_core-2.41.5-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:0177272f88ab8312479336e1d777f6b124537d47f2123f89cb37e0accea97f90", size = 2138960, upload-time = "2025-11-04T13:41:14.627Z" },
-    { url = "https://files.pythonhosted.org/packages/a6/62/205a998f4327d2079326b01abee48e502ea739d174f0a89295c481a2272e/pydantic_core-2.41.5-cp314-cp314-musllinux_1_1_armv7l.whl", hash = "sha256:63510af5e38f8955b8ee5687740d6ebf7c2a0886d15a6d65c32814613681bc07", size = 2339102, upload-time = "2025-11-04T13:41:16.868Z" },
-    { url = "https://files.pythonhosted.org/packages/3c/0d/f05e79471e889d74d3d88f5bd20d0ed189ad94c2423d81ff8d0000aab4ff/pydantic_core-2.41.5-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:e56ba91f47764cc14f1daacd723e3e82d1a89d783f0f5afe9c364b8bb491ccdb", size = 2326039, upload-time = "2025-11-04T13:41:18.934Z" },
-    { url = "https://files.pythonhosted.org/packages/ec/e1/e08a6208bb100da7e0c4b288eed624a703f4d129bde2da475721a80cab32/pydantic_core-2.41.5-cp314-cp314-win32.whl", hash = "sha256:aec5cf2fd867b4ff45b9959f8b20ea3993fc93e63c7363fe6851424c8a7e7c23", size = 1995126, upload-time = "2025-11-04T13:41:21.418Z" },
-    { url = "https://files.pythonhosted.org/packages/48/5d/56ba7b24e9557f99c9237e29f5c09913c81eeb2f3217e40e922353668092/pydantic_core-2.41.5-cp314-cp314-win_amd64.whl", hash = "sha256:8e7c86f27c585ef37c35e56a96363ab8de4e549a95512445b85c96d3e2f7c1bf", size = 2015489, upload-time = "2025-11-04T13:41:24.076Z" },
-    { url = "https://files.pythonhosted.org/packages/4e/bb/f7a190991ec9e3e0ba22e4993d8755bbc4a32925c0b5b42775c03e8148f9/pydantic_core-2.41.5-cp314-cp314-win_arm64.whl", hash = "sha256:e672ba74fbc2dc8eea59fb6d4aed6845e6905fc2a8afe93175d94a83ba2a01a0", size = 1977288, upload-time = "2025-11-04T13:41:26.33Z" },
-    { url = "https://files.pythonhosted.org/packages/92/ed/77542d0c51538e32e15afe7899d79efce4b81eee631d99850edc2f5e9349/pydantic_core-2.41.5-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:8566def80554c3faa0e65ac30ab0932b9e3a5cd7f8323764303d468e5c37595a", size = 2120255, upload-time = "2025-11-04T13:41:28.569Z" },
-    { url = "https://files.pythonhosted.org/packages/bb/3d/6913dde84d5be21e284439676168b28d8bbba5600d838b9dca99de0fad71/pydantic_core-2.41.5-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:b80aa5095cd3109962a298ce14110ae16b8c1aece8b72f9dafe81cf597ad80b3", size = 1863760, upload-time = "2025-11-04T13:41:31.055Z" },
-    { url = "https://files.pythonhosted.org/packages/5a/f0/e5e6b99d4191da102f2b0eb9687aaa7f5bea5d9964071a84effc3e40f997/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3006c3dd9ba34b0c094c544c6006cc79e87d8612999f1a5d43b769b89181f23c", size = 1878092, upload-time = "2025-11-04T13:41:33.21Z" },
-    { url = "https://files.pythonhosted.org/packages/71/48/36fb760642d568925953bcc8116455513d6e34c4beaa37544118c36aba6d/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:72f6c8b11857a856bcfa48c86f5368439f74453563f951e473514579d44aa612", size = 2053385, upload-time = "2025-11-04T13:41:35.508Z" },
-    { url = "https://files.pythonhosted.org/packages/20/25/92dc684dd8eb75a234bc1c764b4210cf2646479d54b47bf46061657292a8/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5cb1b2f9742240e4bb26b652a5aeb840aa4b417c7748b6f8387927bc6e45e40d", size = 2218832, upload-time = "2025-11-04T13:41:37.732Z" },
-    { url = "https://files.pythonhosted.org/packages/e2/09/f53e0b05023d3e30357d82eb35835d0f6340ca344720a4599cd663dca599/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:bd3d54f38609ff308209bd43acea66061494157703364ae40c951f83ba99a1a9", size = 2327585, upload-time = "2025-11-04T13:41:40Z" },
-    { url = "https://files.pythonhosted.org/packages/aa/4e/2ae1aa85d6af35a39b236b1b1641de73f5a6ac4d5a7509f77b814885760c/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2ff4321e56e879ee8d2a879501c8e469414d948f4aba74a2d4593184eb326660", size = 2041078, upload-time = "2025-11-04T13:41:42.323Z" },
-    { url = "https://files.pythonhosted.org/packages/cd/13/2e215f17f0ef326fc72afe94776edb77525142c693767fc347ed6288728d/pydantic_core-2.41.5-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d0d2568a8c11bf8225044aa94409e21da0cb09dcdafe9ecd10250b2baad531a9", size = 2173914, upload-time = "2025-11-04T13:41:45.221Z" },
-    { url = "https://files.pythonhosted.org/packages/02/7a/f999a6dcbcd0e5660bc348a3991c8915ce6599f4f2c6ac22f01d7a10816c/pydantic_core-2.41.5-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:a39455728aabd58ceabb03c90e12f71fd30fa69615760a075b9fec596456ccc3", size = 2129560, upload-time = "2025-11-04T13:41:47.474Z" },
-    { url = "https://files.pythonhosted.org/packages/3a/b1/6c990ac65e3b4c079a4fb9f5b05f5b013afa0f4ed6780a3dd236d2cbdc64/pydantic_core-2.41.5-cp314-cp314t-musllinux_1_1_armv7l.whl", hash = "sha256:239edca560d05757817c13dc17c50766136d21f7cd0fac50295499ae24f90fdf", size = 2329244, upload-time = "2025-11-04T13:41:49.992Z" },
-    { url = "https://files.pythonhosted.org/packages/d9/02/3c562f3a51afd4d88fff8dffb1771b30cfdfd79befd9883ee094f5b6c0d8/pydantic_core-2.41.5-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:2a5e06546e19f24c6a96a129142a75cee553cc018ffee48a460059b1185f4470", size = 2331955, upload-time = "2025-11-04T13:41:54.079Z" },
-    { url = "https://files.pythonhosted.org/packages/5c/96/5fb7d8c3c17bc8c62fdb031c47d77a1af698f1d7a406b0f79aaa1338f9ad/pydantic_core-2.41.5-cp314-cp314t-win32.whl", hash = "sha256:b4ececa40ac28afa90871c2cc2b9ffd2ff0bf749380fbdf57d165fd23da353aa", size = 1988906, upload-time = "2025-11-04T13:41:56.606Z" },
-    { url = "https://files.pythonhosted.org/packages/22/ed/182129d83032702912c2e2d8bbe33c036f342cc735737064668585dac28f/pydantic_core-2.41.5-cp314-cp314t-win_amd64.whl", hash = "sha256:80aa89cad80b32a912a65332f64a4450ed00966111b6615ca6816153d3585a8c", size = 1981607, upload-time = "2025-11-04T13:41:58.889Z" },
-    { url = "https://files.pythonhosted.org/packages/9f/ed/068e41660b832bb0b1aa5b58011dea2a3fe0ba7861ff38c4d4904c1c1a99/pydantic_core-2.41.5-cp314-cp314t-win_arm64.whl", hash = "sha256:35b44f37a3199f771c3eaa53051bc8a70cd7b54f333531c59e29fd4db5d15008", size = 1974769, upload-time = "2025-11-04T13:42:01.186Z" },
-    { url = "https://files.pythonhosted.org/packages/11/72/90fda5ee3b97e51c494938a4a44c3a35a9c96c19bba12372fb9c634d6f57/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-macosx_10_12_x86_64.whl", hash = "sha256:b96d5f26b05d03cc60f11a7761a5ded1741da411e7fe0909e27a5e6a0cb7b034", size = 2115441, upload-time = "2025-11-04T13:42:39.557Z" },
-    { url = "https://files.pythonhosted.org/packages/1f/53/8942f884fa33f50794f119012dc6a1a02ac43a56407adaac20463df8e98f/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-macosx_11_0_arm64.whl", hash = "sha256:634e8609e89ceecea15e2d61bc9ac3718caaaa71963717bf3c8f38bfde64242c", size = 1930291, upload-time = "2025-11-04T13:42:42.169Z" },
-    { url = "https://files.pythonhosted.org/packages/79/c8/ecb9ed9cd942bce09fc888ee960b52654fbdbede4ba6c2d6e0d3b1d8b49c/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:93e8740d7503eb008aa2df04d3b9735f845d43ae845e6dcd2be0b55a2da43cd2", size = 1948632, upload-time = "2025-11-04T13:42:44.564Z" },
-    { url = "https://files.pythonhosted.org/packages/2e/1b/687711069de7efa6af934e74f601e2a4307365e8fdc404703afc453eab26/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f15489ba13d61f670dcc96772e733aad1a6f9c429cc27574c6cdaed82d0146ad", size = 2138905, upload-time = "2025-11-04T13:42:47.156Z" },
-    { url = "https://files.pythonhosted.org/packages/09/32/59b0c7e63e277fa7911c2fc70ccfb45ce4b98991e7ef37110663437005af/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:7da7087d756b19037bc2c06edc6c170eeef3c3bafcb8f532ff17d64dc427adfd", size = 2110495, upload-time = "2025-11-04T13:42:49.689Z" },
-    { url = "https://files.pythonhosted.org/packages/aa/81/05e400037eaf55ad400bcd318c05bb345b57e708887f07ddb2d20e3f0e98/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:aabf5777b5c8ca26f7824cb4a120a740c9588ed58df9b2d196ce92fba42ff8dc", size = 1915388, upload-time = "2025-11-04T13:42:52.215Z" },
-    { url = "https://files.pythonhosted.org/packages/6e/0d/e3549b2399f71d56476b77dbf3cf8937cec5cd70536bdc0e374a421d0599/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c007fe8a43d43b3969e8469004e9845944f1a80e6acd47c150856bb87f230c56", size = 1942879, upload-time = "2025-11-04T13:42:56.483Z" },
-    { url = "https://files.pythonhosted.org/packages/f7/07/34573da085946b6a313d7c42f82f16e8920bfd730665de2d11c0c37a74b5/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:76d0819de158cd855d1cbb8fcafdf6f5cf1eb8e470abe056d5d161106e38062b", size = 2139017, upload-time = "2025-11-04T13:42:59.471Z" },
-    { url = "https://files.pythonhosted.org/packages/e6/b0/1a2aa41e3b5a4ba11420aba2d091b2d17959c8d1519ece3627c371951e73/pydantic_core-2.41.5-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:b5819cd790dbf0c5eb9f82c73c16b39a65dd6dd4d1439dcdea7816ec9adddab8", size = 2103351, upload-time = "2025-11-04T13:43:02.058Z" },
-    { url = "https://files.pythonhosted.org/packages/a4/ee/31b1f0020baaf6d091c87900ae05c6aeae101fa4e188e1613c80e4f1ea31/pydantic_core-2.41.5-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:5a4e67afbc95fa5c34cf27d9089bca7fcab4e51e57278d710320a70b956d1b9a", size = 1925363, upload-time = "2025-11-04T13:43:05.159Z" },
-    { url = "https://files.pythonhosted.org/packages/e1/89/ab8e86208467e467a80deaca4e434adac37b10a9d134cd2f99b28a01e483/pydantic_core-2.41.5-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ece5c59f0ce7d001e017643d8d24da587ea1f74f6993467d85ae8a5ef9d4f42b", size = 2135615, upload-time = "2025-11-04T13:43:08.116Z" },
-    { url = "https://files.pythonhosted.org/packages/99/0a/99a53d06dd0348b2008f2f30884b34719c323f16c3be4e6cc1203b74a91d/pydantic_core-2.41.5-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:16f80f7abe3351f8ea6858914ddc8c77e02578544a0ebc15b4c2e1a0e813b0b2", size = 2175369, upload-time = "2025-11-04T13:43:12.49Z" },
-    { url = "https://files.pythonhosted.org/packages/6d/94/30ca3b73c6d485b9bb0bc66e611cff4a7138ff9736b7e66bcf0852151636/pydantic_core-2.41.5-pp310-pypy310_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:33cb885e759a705b426baada1fe68cbb0a2e68e34c5d0d0289a364cf01709093", size = 2144218, upload-time = "2025-11-04T13:43:15.431Z" },
-    { url = "https://files.pythonhosted.org/packages/87/57/31b4f8e12680b739a91f472b5671294236b82586889ef764b5fbc6669238/pydantic_core-2.41.5-pp310-pypy310_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:c8d8b4eb992936023be7dee581270af5c6e0697a8559895f527f5b7105ecd36a", size = 2329951, upload-time = "2025-11-04T13:43:18.062Z" },
-    { url = "https://files.pythonhosted.org/packages/7d/73/3c2c8edef77b8f7310e6fb012dbc4b8551386ed575b9eb6fb2506e28a7eb/pydantic_core-2.41.5-pp310-pypy310_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:242a206cd0318f95cd21bdacff3fcc3aab23e79bba5cac3db5a841c9ef9c6963", size = 2318428, upload-time = "2025-11-04T13:43:20.679Z" },
-    { url = "https://files.pythonhosted.org/packages/2f/02/8559b1f26ee0d502c74f9cca5c0d2fd97e967e083e006bbbb4e97f3a043a/pydantic_core-2.41.5-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:d3a978c4f57a597908b7e697229d996d77a6d3c94901e9edee593adada95ce1a", size = 2147009, upload-time = "2025-11-04T13:43:23.286Z" },
-    { url = "https://files.pythonhosted.org/packages/5f/9b/1b3f0e9f9305839d7e84912f9e8bfbd191ed1b1ef48083609f0dabde978c/pydantic_core-2.41.5-pp311-pypy311_pp73-macosx_10_12_x86_64.whl", hash = "sha256:b2379fa7ed44ddecb5bfe4e48577d752db9fc10be00a6b7446e9663ba143de26", size = 2101980, upload-time = "2025-11-04T13:43:25.97Z" },
-    { url = "https://files.pythonhosted.org/packages/a4/ed/d71fefcb4263df0da6a85b5d8a7508360f2f2e9b3bf5814be9c8bccdccc1/pydantic_core-2.41.5-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:266fb4cbf5e3cbd0b53669a6d1b039c45e3ce651fd5442eff4d07c2cc8d66808", size = 1923865, upload-time = "2025-11-04T13:43:28.763Z" },
-    { url = "https://files.pythonhosted.org/packages/ce/3a/626b38db460d675f873e4444b4bb030453bbe7b4ba55df821d026a0493c4/pydantic_core-2.41.5-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:58133647260ea01e4d0500089a8c4f07bd7aa6ce109682b1426394988d8aaacc", size = 2134256, upload-time = "2025-11-04T13:43:31.71Z" },
-    { url = "https://files.pythonhosted.org/packages/83/d9/8412d7f06f616bbc053d30cb4e5f76786af3221462ad5eee1f202021eb4e/pydantic_core-2.41.5-pp311-pypy311_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:287dad91cfb551c363dc62899a80e9e14da1f0e2b6ebde82c806612ca2a13ef1", size = 2174762, upload-time = "2025-11-04T13:43:34.744Z" },
-    { url = "https://files.pythonhosted.org/packages/55/4c/162d906b8e3ba3a99354e20faa1b49a85206c47de97a639510a0e673f5da/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:03b77d184b9eb40240ae9fd676ca364ce1085f203e1b1256f8ab9984dca80a84", size = 2143141, upload-time = "2025-11-04T13:43:37.701Z" },
-    { url = "https://files.pythonhosted.org/packages/1f/f2/f11dd73284122713f5f89fc940f370d035fa8e1e078d446b3313955157fe/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:a668ce24de96165bb239160b3d854943128f4334822900534f2fe947930e5770", size = 2330317, upload-time = "2025-11-04T13:43:40.406Z" },
-    { url = "https://files.pythonhosted.org/packages/88/9d/b06ca6acfe4abb296110fb1273a4d848a0bfb2ff65f3ee92127b3244e16b/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:f14f8f046c14563f8eb3f45f499cc658ab8d10072961e07225e507adb700e93f", size = 2316992, upload-time = "2025-11-04T13:43:43.602Z" },
-    { url = "https://files.pythonhosted.org/packages/36/c7/cfc8e811f061c841d7990b0201912c3556bfeb99cdcb7ed24adc8d6f8704/pydantic_core-2.41.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:56121965f7a4dc965bff783d70b907ddf3d57f6eba29b6d2e5dabfaf07799c51", size = 2145302, upload-time = "2025-11-04T13:43:46.64Z" },
-]
-
-[[package]]
-name = "sniffio"
-version = "1.3.1"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/a2/87/a6771e1546d97e7e041b6ae58d80074f81b7d5121207425c964ddf5cfdbd/sniffio-1.3.1.tar.gz", hash = "sha256:f4324edc670a0f49750a81b895f35c3adb843cca46f0530f79fc1babb23789dc", size = 20372, upload-time = "2024-02-25T23:20:04.057Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" },
-]
-
-[[package]]
-name = "starlette"
-version = "0.50.0"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "anyio" },
-    { name = "typing-extensions", marker = "python_full_version < '3.13'" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/ba/b8/73a0e6a6e079a9d9cfa64113d771e421640b6f679a52eeb9b32f72d871a1/starlette-0.50.0.tar.gz", hash = "sha256:a2a17b22203254bcbc2e1f926d2d55f3f9497f769416b3190768befe598fa3ca", size = 2646985, upload-time = "2025-11-01T15:25:27.516Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/d9/52/1064f510b141bd54025f9b55105e26d1fa970b9be67ad766380a3c9b74b0/starlette-0.50.0-py3-none-any.whl", hash = "sha256:9e5391843ec9b6e472eed1365a78c8098cfceb7a74bfd4d6b1c0c0095efb3bca", size = 74033, upload-time = "2025-11-01T15:25:25.461Z" },
-]
-
-[[package]]
-name = "tqdm"
-version = "4.67.1"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "colorama", marker = "sys_platform == 'win32'" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/a8/4b/29b4ef32e036bb34e4ab51796dd745cdba7ed47ad142a9f4a1eb8e0c744d/tqdm-4.67.1.tar.gz", hash = "sha256:f8aef9c52c08c13a65f30ea34f4e5aac3fd1a34959879d7e59e63027286627f2", size = 169737, upload-time = "2024-11-24T20:12:22.481Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/d0/30/dc54f88dd4a2b5dc8a0279bdd7270e735851848b762aeb1c1184ed1f6b14/tqdm-4.67.1-py3-none-any.whl", hash = "sha256:26445eca388f82e72884e0d580d5464cd801a3ea01e63e5601bdff9ba6a48de2", size = 78540, upload-time = "2024-11-24T20:12:19.698Z" },
-]
-
-[[package]]
-name = "travel-agents"
-version = "0.1.0"
-source = { editable = "." }
-dependencies = [
-    { name = "click" },
-    { name = "fastapi" },
-    { name = "httpx" },
-    { name = "openai" },
-    { name = "opentelemetry-api" },
-    { name = "pydantic" },
-    { name = "uvicorn" },
-]
-
-[package.metadata]
-requires-dist = [
-    { name = "click", specifier = ">=8.2.1" },
-    { name = "fastapi", specifier = ">=0.115.0" },
-    { name = "httpx", specifier = ">=0.24.0" },
-    { name = "openai", specifier = ">=1.0.0" },
-    { name = "opentelemetry-api", specifier = ">=1.20.0" },
-    { name = "pydantic", specifier = ">=2.11.7" },
-    { name = "uvicorn", specifier = ">=0.30.0" },
-]
-
-[[package]]
-name = "typing-extensions"
-version = "4.15.0"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/72/94/1a15dd82efb362ac84269196e94cf00f187f7ed21c242792a923cdb1c61f/typing_extensions-4.15.0.tar.gz", hash = "sha256:0cea48d173cc12fa28ecabc3b837ea3cf6f38c6d1136f85cbaaf598984861466", size = 109391, upload-time = "2025-08-25T13:49:26.313Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/18/67/36e9267722cc04a6b9f15c7f3441c2363321a3ea07da7ae0c0707beb2a9c/typing_extensions-4.15.0-py3-none-any.whl", hash = "sha256:f0fa19c6845758ab08074a0cfa8b7aecb71c999ca73d62883bc25cc018c4e548", size = 44614, upload-time = "2025-08-25T13:49:24.86Z" },
-]
-
-[[package]]
-name = "typing-inspection"
-version = "0.4.2"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "typing-extensions" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/55/e3/70399cb7dd41c10ac53367ae42139cf4b1ca5f36bb3dc6c9d33acdb43655/typing_inspection-0.4.2.tar.gz", hash = "sha256:ba561c48a67c5958007083d386c3295464928b01faa735ab8547c5692e87f464", size = 75949, upload-time = "2025-10-01T02:14:41.687Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/dc/9b/47798a6c91d8bdb567fe2698fe81e0c6b7cb7ef4d13da4114b41d239f65d/typing_inspection-0.4.2-py3-none-any.whl", hash = "sha256:4ed1cacbdc298c220f1bd249ed5287caa16f34d44ef4e9c3d0cbad5b521545e7", size = 14611, upload-time = "2025-10-01T02:14:40.154Z" },
-]
-
-[[package]]
-name = "uvicorn"
-version = "0.38.0"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "click" },
-    { name = "h11" },
-    { name = "typing-extensions", marker = "python_full_version < '3.11'" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/cb/ce/f06b84e2697fef4688ca63bdb2fdf113ca0a3be33f94488f2cadb690b0cf/uvicorn-0.38.0.tar.gz", hash = "sha256:fd97093bdd120a2609fc0d3afe931d4d4ad688b6e75f0f929fde1bc36fe0e91d", size = 80605, upload-time = "2025-10-18T13:46:44.63Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/ee/d9/d88e73ca598f4f6ff671fb5fde8a32925c2e08a637303a1d12883c7305fa/uvicorn-0.38.0-py3-none-any.whl", hash = "sha256:48c0afd214ceb59340075b4a052ea1ee91c16fbc2a9b1469cca0e54566977b02", size = 68109, upload-time = "2025-10-18T13:46:42.958Z" },
-]
-
-[[package]]
-name = "zipp"
-version = "3.23.0"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/e3/02/0f2892c661036d50ede074e376733dca2ae7c6eb617489437771209d4180/zipp-3.23.0.tar.gz", hash = "sha256:a07157588a12518c9d4034df3fbbee09c814741a33ff63c05fa29d26a2404166", size = 25547, upload-time = "2025-06-08T17:06:39.4Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/2e/54/647ade08bf0db230bfea292f893923872fd20be6ac6f53b2b936ba839d75/zipp-3.23.0-py3-none-any.whl", hash = "sha256:071652d6115ed432f5ce1d34c336c0adfd6a884660d1e9712a256d3d3bd4b14e", size = 10276, upload-time = "2025-06-08T17:06:38.034Z" },
-]