Tech spec first pass

2026-07-01 09:29:38 +02:00 · 2025-08-26 22:38:17 +01:00 · 2025-08-26 22:38:17 +01:00 · c28569fc73
commit c28569fc73
parent e5b9b4976a
1 changed files with 354 additions and 0 deletions
--- a/docs/tech-specs/multi-agent-planning-framework.md
+++ b/docs/tech-specs/multi-agent-planning-framework.md
@ -0,0 +1,354 @@
 # TechSpec: Multi-Agent Planning Framework (MAPF)
 ## Overview
 The Multi-Agent Planning Framework (MAPF) is a confidence-driven agent architecture that complements the existing ReAct framework. While ReAct uses iterative reasoning cycles, MAPF generates upfront execution plans with confidence thresholds and executes them through specialized microservices.
 ## Architecture Design
 ### Core Philosophy
 - **Confidence-Driven Execution**: Each step has confidence thresholds that determine progression
 - **Upfront Planning**: Generate complete execution plans before execution begins
 - **Specialized Functions**: Domain-specific microservices for different operation types
 - **Fault Tolerance**: Retry mechanisms and fallback strategies based on confidence scores
 ### System Components
 #### 1. Planning Service (`trustgraph-mapf-planning`)
 **Responsibility**: Generate execution plans from user requests
 **Input Schema**:
 ```python
@dataclasses.dataclass
 class PlanRequest:
    user_request: str
    context: Optional[dict] = None
    constraints: Optional[dict] = None
 ```
 **Output Schema**:
 ```python
@dataclasses.dataclass
 class ExecutionPlan:
    plan_id: str
    steps: List[PlanStep]
    confidence_threshold: float = 0.7
    max_retries: int = 3
@dataclasses.dataclass
 class PlanStep:
    step_id: str
    function_type: str  # query, external_query, compare, evaluate, etc.
    parameters: dict
    confidence_threshold: float
    dependencies: List[str]  # Previous step IDs this depends on
    retry_strategy: Optional[dict] = None
 ```
 **Implementation**:
 - Uses LLM to decompose complex requests into structured plans
 - Assigns confidence thresholds based on task complexity
 - Determines step dependencies and execution order
 #### 2. Flow Controller Service (`trustgraph-mapf-controller`)
 **Responsibility**: Orchestrate plan execution across microservices
 **Key Features**:
 - Step dependency resolution
 - Confidence score validation
 - Retry logic and failure handling
 - Context management between steps
 **Execution Logic**:
 ```python
 async def execute_plan(self, plan: ExecutionPlan):
    context = ExecutionContext(plan_id=plan.plan_id)
    for step in plan.steps:
        if not self.dependencies_satisfied(step, context):
            await self.wait_for_dependencies(step, context)
        result = await self.execute_step(step, context)
        if result.confidence < step.confidence_threshold:
            if context.retries[step.step_id] < plan.max_retries:
                await self.retry_step(step, context)
            else:
                await self.handle_failure(step, context)
        else:
            context.add_result(step.step_id, result)
 ```
 #### 3. Function Services
 Specialized microservices for different operation types:
 ##### Query Service (`trustgraph-mapf-query`)
 - Internal knowledge base queries
 - Uses existing graph RAG infrastructure
 - Handles collection-specific queries
 ##### External Query Service (`trustgraph-mapf-external`)
 - Web searches, API calls
 - Rate limiting and error handling
 - Response validation and confidence scoring
 ##### Processing Services:
 - **Compare Service** (`trustgraph-mapf-compare`): Data comparison operations
 - **Evaluate Service** (`trustgraph-mapf-evaluate`): Assessment and scoring
 - **Compute Service** (`trustgraph-mapf-compute`): Mathematical operations
 - **Filter Service** (`trustgraph-mapf-filter`): Data filtering and selection
 - **Prioritize Service** (`trustgraph-mapf-prioritize`): Ranking and ordering
 - **Deduce Service** (`trustgraph-mapf-deduce`): Logical inference
 ##### Document Service (`trustgraph-mapf-document`)
 - Final output generation
 - Report compilation
 - Format conversion
 #### 4. Data Store Service (`trustgraph-mapf-datastore`)
 **Responsibility**: Execution context and intermediate results
 **Features**:
 - Step result storage
 - Context sharing between services
 - Execution history and audit trails
 ### Message Flow Architecture
 #### Pulsar Topics:
 - `mapf-plan-requests`
 - `mapf-execution-plans` 
 - `mapf-step-requests`
 - `mapf-step-responses`
 - `mapf-completion-notices`
 #### Message Schemas:
 ```python
@dataclasses.dataclass
 class StepRequest:
    step_id: str
    plan_id: str
    function_type: str
    parameters: dict
    context: dict
    timeout: int = 300
@dataclasses.dataclass  
 class StepResponse:
    step_id: str
    plan_id: str
    success: bool
    result: Any
    confidence: float
    execution_time: float
    error: Optional[str] = None
 ```
 ### Integration with Existing Infrastructure
 #### Shared Components:
 - **Pulsar Messaging**: Reuse existing pub/sub infrastructure
 - **Graph RAG**: Query service leverages existing knowledge graph
 - **Prompt Templates**: Planning service uses existing template system
 - **Configuration**: Extend existing config management
 #### Service Registration:
 ```python
 # In service.py for each MAPF service
 class MapfPlanningService(BaseService):
    def __init__(self):
        super().__init__()
        self.register_specification(
            PlanRequestSpec(
                request_name="mapf-plan-request",
                response_name="mapf-execution-plan"
            )
        )
 ```
 ### Configuration Schema
 ```yaml
 mapf:
  planning:
    model: "gpt-4"
    default_confidence_threshold: 0.7
    max_plan_complexity: 20
  execution:
    max_concurrent_steps: 5
    step_timeout: 300
    retry_backoff: exponential
  functions:
    query:
      collections: ["default", "research", "technical"]
    external:
      rate_limit: 10 # requests per minute
      timeout: 30
 ```
 ### Deployment Considerations
 #### Docker Services:
 - Each function service as separate container
 - Shared base image with common utilities
 - Horizontal scaling based on load
 #### Development Workflow:
 1. Add new function services in `trustgraph-mapf-functions/`
 2. Implement service interface and Pulsar specs
 3. Add to docker-compose and deployment scripts
 4. Create integration tests
 ### Comparison with ReAct
 | Aspect | ReAct | MAPF |
 |--------|-------|------|
 | Planning | Iterative, step-by-step | Upfront, complete plan |
 | Execution | Sequential reasoning cycles | Parallel, dependency-based |
 | Error Handling | Human-readable error messages | Confidence-based retries |
 | Use Cases | Interactive problem solving | Batch processing, complex workflows |
 | LLM Usage | High (every iteration) | Medium (planning phase) |
 ### Implementation Phases
 #### Phase 1: Core Infrastructure
 - Planning service with basic plan generation
 - Flow controller with sequential execution
 - Query service integration
 - Basic Pulsar messaging
 #### Phase 2: Function Services
 - Implement processing services (compare, evaluate, etc.)
 - External query service with web search
 - Document service for output generation
 - Confidence scoring mechanisms
 #### Phase 3: Advanced Features
 - Parallel execution optimization
 - Dynamic plan modification
 - Performance monitoring and analytics
 - Integration with existing agent tools
 ### Success Metrics
 - **Plan Success Rate**: % of plans that complete successfully
 - **Confidence Accuracy**: How well confidence scores predict success
 - **Execution Time**: Average time from request to completion
 - **Resource Utilization**: Service load balancing and efficiency
 - **Error Recovery**: Success rate of retry mechanisms
 ### Example Plan Generation
 For a request like "Compare the top 3 machine learning frameworks and recommend the best for a startup":
 ```json
 {
  "plan_id": "plan-12345",
  "confidence_threshold": 0.7,
  "max_retries": 3,
  "steps": [
    {
      "step_id": "step-1",
      "function_type": "query",
      "parameters": {
        "question": "What are the top machine learning frameworks?",
        "collection": "technical"
      },
      "confidence_threshold": 0.8,
      "dependencies": []
    },
    {
      "step_id": "step-2",
      "function_type": "external_query",
      "parameters": {
        "query": "latest ML framework popularity statistics 2024",
        "sources": ["web", "github"]
      },
      "confidence_threshold": 0.6,
      "dependencies": []
    },
    {
      "step_id": "step-3",
      "function_type": "filter",
      "parameters": {
        "input": "${step-1.result}",
        "criteria": "top_3_by_popularity",
        "additional_data": "${step-2.result}"
      },
      "confidence_threshold": 0.7,
      "dependencies": ["step-1", "step-2"]
    },
    {
      "step_id": "step-4",
      "function_type": "compare",
      "parameters": {
        "items": "${step-3.result}",
        "criteria": ["ease_of_use", "community_support", "performance", "cost"]
      },
      "confidence_threshold": 0.7,
      "dependencies": ["step-3"]
    },
    {
      "step_id": "step-5",
      "function_type": "evaluate",
      "parameters": {
        "comparison": "${step-4.result}",
        "context": "startup with limited resources"
      },
      "confidence_threshold": 0.75,
      "dependencies": ["step-4"]
    },
    {
      "step_id": "step-6",
      "function_type": "document",
      "parameters": {
        "type": "recommendation_report",
        "evaluation": "${step-5.result}",
        "comparison": "${step-4.result}",
        "format": "markdown"
      },
      "confidence_threshold": 0.8,
      "dependencies": ["step-5"]
    }
  ]
 }
 ```
 ### API Endpoints
 The framework will expose the following REST API endpoints through the gateway:
 - `POST /mapf/plan` - Submit a new planning request
 - `GET /mapf/plan/{plan_id}` - Get plan status and details
 - `GET /mapf/plan/{plan_id}/steps/{step_id}` - Get specific step results
 - `POST /mapf/plan/{plan_id}/retry` - Retry a failed plan
 - `DELETE /mapf/plan/{plan_id}` - Cancel an executing plan
 ### Testing Strategy
 #### Unit Tests:
 - Planning service plan generation
 - Individual function service logic
 - Confidence scoring algorithms
 - Message serialization/deserialization
 #### Integration Tests:
 - End-to-end plan execution
 - Service communication via Pulsar
 - Error handling and retry logic
 - Context persistence
 #### Performance Tests:
 - Plan execution throughput
 - Service scaling behavior
 - Message queue performance
 - Resource utilization
 This framework provides a complementary approach to the ReAct system, offering structured planning and execution for complex, multi-step tasks while leveraging the existing TrustGraph infrastructure.