RAG Agent Demo

A multi-agent RAG system that demonstrates archgw's agent filter chain using the Model Context Protocol (MCP).

Architecture

This demo consists of three components:

  1. Query Rewriter (MCP filter) - Rewrites user queries for better retrieval
  2. Context Builder (MCP filter) - Retrieves relevant context from the knowledge base
  3. RAG Agent (REST) - Generates the final response from the augmented context

Components

Query Rewriter Filter (MCP)

  • Port: 10501
  • Tool: query_rewriter
  • Uses an LLM to improve queries before retrieval

Context Builder Filter (MCP)

  • Port: 10502
  • Tool: context_builder
  • Augments queries with relevant passages from the knowledge base

RAG Agent (REST/OpenAI)

  • Port: 10505
  • Endpoint: /v1/chat/completions
  • Generates responses using an OpenAI-compatible API

Quick Start

1. Start all agents

./start_agents.sh

This starts:

  • Query Rewriter MCP server on port 10501
  • Context Builder MCP server on port 10502
  • RAG Agent REST server on port 10505
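
To confirm that the three servers are actually accepting connections before starting archgw, a small check like the following may help. It is a minimal sketch, not part of the demo: the helper names `is_listening` and `check_agents` are hypothetical, and it assumes the agents listen on localhost at the ports listed above.

```python
import socket

# Ports taken from the list above; the dictionary keys are illustrative labels.
AGENT_PORTS = {
    "query_rewriter": 10501,
    "context_builder": 10502,
    "rag_agent": 10505,
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_agents(ports=None) -> dict:
    """Map each agent label to whether its port accepts connections."""
    ports = AGENT_PORTS if ports is None else ports
    return {name: is_listening(port) for name, port in ports.items()}
```

Calling `check_agents()` after `./start_agents.sh` returns a dictionary of label to boolean. A `False` entry only means the TCP port is not accepting connections; it says nothing about whether the MCP tool behind it is healthy.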

2. Start archgw

archgw up --foreground

3. Test the system

curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the guaranteed uptime for TechCorp?"}]
  }'
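
The same request can be sent from Python using only the standard library. This is a sketch under the assumption that archgw is listening on port 8001 as in the curl command above; `build_request` and `ask` are hypothetical helper names, not part of the demo.

```python
import json
import urllib.request

# Listener address taken from the curl example above.
ARCHGW_URL = "http://localhost:8001/v1/chat/completions"


def build_request(question: str, model: str = "gpt-4o",
                  url: str = ARCHGW_URL) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> dict:
    """Send the request and decode the JSON reply.

    Requires archgw and all three agents to be running.
    """
    with urllib.request.urlopen(build_request(question), timeout=60) as resp:
        return json.load(resp)
```

Because the endpoint is described as OpenAI-compatible, the reply returned by `ask(...)` presumably follows the chat completions response shape, but that is an assumption worth verifying against an actual response.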

Configuration

The arch_config.yaml file defines how the agents are connected:

agent_filters:
  - id: query_rewriter
    url: mcp://host.docker.internal:10501
    tool: rewrite_query_with_archgw  # MCP tool name

  - id: context_builder
    url: mcp://host.docker.internal:10502
    tool: chat_completions
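
Each filter URL carries the host and port of an MCP server, so a quick way to sanity-check a config entry is to split the URL and compare the port against what the agent was started on. The sketch below uses the standard library only; `endpoint_of` is a hypothetical helper, and it makes no claim about how archgw itself parses these URLs.

```python
from urllib.parse import urlsplit


def endpoint_of(filter_url: str) -> tuple:
    """Split an mcp:// filter URL into (host, port)."""
    parts = urlsplit(filter_url)
    if parts.scheme != "mcp":
        raise ValueError(f"expected an mcp:// URL, got {filter_url!r}")
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"URL must include host and port: {filter_url!r}")
    return parts.hostname, parts.port


# Example call using a URL from the excerpt above:
host, port = endpoint_of("mcp://host.docker.internal:10501")
```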

How It Works

  1. The user sends a request to the archgw listener on port 8001
  2. The request passes through the MCP filter chain:
    • Query Rewriter rewrites the query for better retrieval
    • Context Builder augments the query with relevant knowledge base passages
  3. The augmented request is forwarded to the RAG Agent REST endpoint
  4. The RAG Agent generates the final response using an LLM
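
The flow above can be pictured as a list of functions applied in order before the final agent runs. The following is an in-process sketch only: the real filters are MCP servers that call an LLM and a knowledge base, whereas these stand-ins use whitespace cleanup and keyword matching, and the one-entry `KNOWLEDGE_BASE` is a hypothetical placeholder.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Filter = Callable[[List[Message]], List[Message]]

# Hypothetical placeholder; the demo's real knowledge base is not shown here.
KNOWLEDGE_BASE = ["placeholder passage about uptime guarantees"]


def query_rewriter(messages: List[Message]) -> List[Message]:
    """Stand-in rewriter: collapses whitespace in the last message."""
    *head, last = messages
    rewritten = " ".join(last["content"].split())
    return head + [{**last, "content": rewritten}]


def context_builder(messages: List[Message]) -> List[Message]:
    """Stand-in retriever: prepends passages sharing a word with the query."""
    *head, last = messages
    words = set(last["content"].lower().split())
    hits = [p for p in KNOWLEDGE_BASE if words & set(p.split())]
    context = "\n".join(hits)
    augmented = f"Context:\n{context}\n\nQuestion: {last['content']}"
    return head + [{**last, "content": augmented}]


def rag_agent(messages: List[Message]) -> str:
    """Stand-in agent: echoes the augmented prompt it would answer from."""
    return "ANSWER USING:\n" + messages[-1]["content"]


def run_chain(messages: List[Message], filters: List[Filter],
              agent: Callable[[List[Message]], str]) -> str:
    """Apply each filter in order, then hand the result to the agent."""
    for apply_filter in filters:
        messages = apply_filter(messages)
    return agent(messages)
```

Each filter takes the message list and returns a new one, so filters can be reordered or removed without touching the agent, which is the property the filter chain design relies on.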

Configuration

See arch_config.yaml for the complete filter chain setup and sample_queries.md for example queries to test the RAG system. Unless overridden, each MCP filter uses these defaults:

  • type: mcp
  • transport: streamable-http
  • tool: the filter ID
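
The defaults listed above mean that a filter entry only needs an id and a url. The sketch below shows one way such defaulting could be modeled; `FilterSpec` is a hypothetical class for illustration, not archgw's actual config schema, and the id and URL in the example are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterSpec:
    """Illustrative model of one agent_filters entry with its defaults."""
    id: str
    url: str
    type: str = "mcp"
    transport: str = "streamable-http"
    tool: Optional[str] = None

    def __post_init__(self) -> None:
        # The tool name falls back to the filter ID when not set explicitly.
        if self.tool is None:
            self.tool = self.id


# Hypothetical entries: one relying on defaults, one overriding the tool name.
minimal = FilterSpec(id="example_filter", url="mcp://localhost:9999")
explicit = FilterSpec(id="example_filter", url="mcp://localhost:9999",
                      tool="custom_tool_name")
```

Here `minimal.tool` resolves to the filter ID, while `explicit.tool` keeps the name it was given.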

Example request:

curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "What is the guaranteed uptime for TechCorp?"
      }
    ]
  }'

Environment variables:

  • LLM_GATEWAY_ENDPOINT - archgw endpoint (default: http://localhost:12000/v1)
  • OPENAI_API_KEY - OpenAI API key for model providers
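
A minimal sketch of how a script might read these two variables, applying the documented default for the endpoint. The function name `load_settings` and the returned key names are hypothetical; the demo's own agents may read the variables differently.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the two variables above, falling back to the documented default."""
    env = os.environ if env is None else env
    return {
        "llm_gateway_endpoint": env.get(
            "LLM_GATEWAY_ENDPOINT", "http://localhost:12000/v1"
        ),
        # No default is documented for the key, so it stays None when unset.
        "openai_api_key": env.get("OPENAI_API_KEY"),
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise in tests without touching the real environment.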

Additional Resources

  • See sample_queries.md for more example queries
  • See arch_config.yaml for complete configuration details