Update tech spec

This commit is contained in:
Cyber MacGeddon 2025-09-03 11:52:14 +01:00
parent 66116eb875
commit 3353d4305b

View file

@ -30,13 +30,13 @@ This document specifies a new agent architecture for TrustGraph that introduces
│ │ │ │
│ ┌──────────────┐ ┌─────────────────┐ ┌────────────────┐ │ │ ┌──────────────┐ ┌─────────────────┐ ┌────────────────┐ │
│ │ Planner │ │ Flow Controller │ │ Confidence │ │ │ │ Planner │ │ Flow Controller │ │ Confidence │ │
│ │ Module │─▶│ Module │─▶│ Evaluator │ │ │ │ Module │─▶│ Module │─▶│ Evaluator │ │
│ └──────────────┘ └─────────────────┘ └────────────────┘ │ │ └──────────────┘ └─────────────────┘ └────────────────┘ │
│ │ │ │ │ │ │ │ │ │
│ ▼ ▼ ▼ │ │ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌───────────────┐ ┌────────────────┐ │ │ ┌──────────────┐ ┌───────────────┐ ┌────────────────┐ │
│ │ Execution │ │ Memory │ │ Audit │ │ │ │ Execution │ │ Memory │ │ Audit │ │
│ │ Engine │◄──│ Manager │ │ Logger │ │ │ │ Engine │◄──│ Manager │ │ Logger │ │
│ └──────────────┘ └───────────────┘ └────────────────┘ │ │ └──────────────┘ └───────────────┘ └────────────────┘ │
└──────────────────────────────────────────────────────────────────┘ └──────────────────────────────────────────────────────────────────┘
@ -233,52 +233,204 @@ The confidence agent reuses existing tool implementations:
No changes required to existing tools. No changes required to existing tools.
### 5. Execution Flow ### 5. End-to-End Execution Flow
#### 5.1 Request Processing #### 5.1 Module Interaction Overview
When an `AgentRequest` arrives, the confidence agent orchestrates the following flow:
1. **Service Entry**: The main service receives the `AgentRequest` via Pulsar
2. **Planning Phase**: Service invokes Planner Module to generate an `ExecutionPlan`
3. **Execution Loop**: Service passes plan to Flow Controller, which:
- Resolves step dependencies
- For each step, calls Executor with context from Memory Manager
- Evaluator assesses confidence after each execution
- Retry logic triggered if confidence below threshold
4. **Response Stream**: Service sends `AgentResponse` messages at key points
5. **Audit Trail**: Logger records all decisions and confidence scores
#### 5.2 Detailed Message Flow
```mermaid ```mermaid
sequenceDiagram sequenceDiagram
participant Client participant Client
participant Gateway participant Service as ConfidenceAgent<br/>Service
participant ConfidenceAgent
participant Planner participant Planner
participant FlowController participant FlowCtrl as Flow<br/>Controller
participant Memory
participant Executor participant Executor
participant Evaluator
participant Tools participant Tools
Client->>Gateway: Request Client->>Service: AgentRequest
Gateway->>ConfidenceAgent: ConfidenceAgentRequest Service->>Service: Parse request,<br/>extract config
ConfidenceAgent->>Planner: Generate plan
Planner->>ConfidenceAgent: ExecutionPlan
loop For each step %% Planning Phase
ConfidenceAgent->>FlowController: Execute step Service->>Planner: generate_plan(request)
FlowController->>Executor: Run tool Planner->>Tools: Query available tools
Executor->>Tools: Tool invocation Planner->>Planner: LLM generates<br/>ExecutionPlan
Tools->>Executor: Result Planner-->>Service: ExecutionPlan
Executor->>FlowController: StepResult + Confidence Service->>Client: AgentResponse<br/>(planning thought)
%% Execution Phase
Service->>FlowCtrl: execute_plan(plan)
loop For each ExecutionStep
FlowCtrl->>Memory: get_context(step)
Memory-->>FlowCtrl: context + dependencies
alt Confidence >= Threshold FlowCtrl->>Executor: execute_step(step, context)
FlowController->>ConfidenceAgent: Continue Executor->>Tools: invoke_tool(name, args)
else Confidence < Threshold Tools-->>Executor: raw_result
FlowController->>FlowController: Retry logic
Executor->>Evaluator: evaluate(result)
Evaluator-->>Executor: ConfidenceMetrics
alt Confidence >= threshold
Executor-->>FlowCtrl: StepResult (success)
FlowCtrl->>Memory: store_result(step, result)
FlowCtrl->>Service: Send progress
Service->>Client: AgentResponse<br/>(step observation)
else Confidence < threshold
FlowCtrl->>FlowCtrl: Retry with backoff
Note over FlowCtrl: Max 3 retries by default
alt After max retries
FlowCtrl->>Service: Request override
Service->>Client: AgentResponse<br/>(override request)
end
end end
end end
ConfidenceAgent->>Gateway: ConfidenceAgentResponse FlowCtrl-->>Service: All StepResults
Gateway->>Client: Response Service->>Service: Generate final answer
Service->>Client: AgentResponse<br/>(final answer)
``` ```
#### 5.2 Confidence-Based Control Flow #### 5.3 Confidence Decision Points
The control flow implements a retry loop with exponential backoff: The confidence mechanism affects execution at three critical points:
1. Execute step and evaluate confidence **1. Planning Confidence**
2. If confidence meets threshold, proceed to next step - Planner assigns confidence thresholds to each step based on:
3. If below threshold, retry with backoff delay - Operation criticality (graph mutations = higher threshold)
4. After max retries, either request user override or fail - Tool reliability history
5. Log all attempts and decisions for audit trail - Query complexity
- Default thresholds: GraphQuery (0.8), TextCompletion (0.7), McpTool (0.6)
**2. Execution Confidence**
- After each tool execution, Evaluator calculates confidence based on:
- Output completeness and structure
- Consistency with expected schemas
- Semantic coherence (for text outputs)
- Result size and validity (for graph queries)
**3. Retry Decision**
- If confidence < threshold:
- First retry: Same parameters with backoff
- Second retry: Adjusted parameters (e.g., broader query)
- Third retry: Simplified approach
- After max retries: User override or graceful failure
#### 5.4 Example: Graph Query with Low Confidence
**Scenario**: User asks "What are the connections between Entity X and Entity Y?"
**Step 1: Planning**
```
AgentRequest arrives:
question: "What are the connections between Entity X and Entity Y?"
Planner generates ExecutionPlan:
Step 1: GraphQuery
function: "GraphQuery"
arguments: {"query": "MATCH path=(x:Entity {name:'X'})-[*..3]-(y:Entity {name:'Y'}) RETURN path"}
confidence_threshold: 0.8
```
**Step 2: First Execution**
```
Executor runs GraphQuery:
Result: Empty result set []
Evaluator assesses confidence:
Score: 0.3 (low - empty results suspicious)
Reasoning: "Empty result may indicate entities don't exist or query too restrictive"
Flow Controller decides:
0.3 < 0.8 threshold RETRY
```
**Step 3: Retry with Adjusted Query**
```
Flow Controller adjusts parameters:
New query: "MATCH (x:Entity), (y:Entity) WHERE x.name CONTAINS 'X' AND y.name CONTAINS 'Y' RETURN x, y"
Executor runs adjusted query:
Result: Found 2 entities but no connections
Evaluator assesses confidence:
Score: 0.85
Reasoning: "Entities exist but genuinely unconnected"
Flow Controller decides:
0.85 >= 0.8 threshold → SUCCESS
```
**Step 4: Response Stream**
```
AgentResponse 1 (planning):
thought: "Planning graph traversal query to find connections"
observation: "Generated query with 3-hop path search"
AgentResponse 2 (retry):
thought: "Initial query returned empty, adjusting search parameters"
observation: "Retrying with broader entity matching"
AgentResponse 3 (final):
answer: "Entity X and Entity Y exist in the graph but have no direct or indirect connections within 3 hops"
thought: "Query successful with high confidence after parameter adjustment"
observation: "Confidence: 0.85 - Entities verified to exist but unconnected"
```
#### 5.5 Example: Multi-Step Plan with Dependencies
**Scenario**: "Summarize the main topics discussed about AI regulation"
**ExecutionPlan Generated**:
```
Step 1: GraphQuery - Find documents about AI regulation
confidence_threshold: 0.75
Step 2: TextCompletion - Extract key topics from documents
dependencies: [Step 1]
confidence_threshold: 0.7
Step 3: TextCompletion - Generate summary
dependencies: [Step 2]
confidence_threshold: 0.8
```
**Execution Flow**:
1. **Step 1 Success** (confidence: 0.9)
- Found 15 relevant documents
- Memory Manager stores document list
2. **Step 2 Initial Failure** (confidence: 0.5)
- Topics extraction unclear
- Retry with more specific prompt
- **Retry Success** (confidence: 0.75)
- Memory Manager stores topics list
3. **Step 3 Success** (confidence: 0.85)
- Uses topics from memory
- Generates coherent summary
**Total AgentResponses sent**: 6
- 1 for planning
- 2 for Step 1 (execution + success)
- 2 for Step 2 (failure + retry success)
- 1 for Step 3
- 1 final response
### 6. Monitoring and Observability ### 6. Monitoring and Observability