From 1515dbaf08965c4ac80f583b59ffbba2d68a14a9 Mon Sep 17 00:00:00 2001 From: cybermaggedon Date: Mon, 13 Apr 2026 17:29:24 +0100 Subject: [PATCH] Updated tech specs for agent & explainability changes (#796) --- docs/tech-specs/agent-explainability.md | 361 ++++++++---------- docs/tech-specs/agent-orchestration.md | 2 +- .../extraction-provenance-subgraph.md | 4 + docs/tech-specs/extraction-time-provenance.md | 7 +- docs/tech-specs/python-api-refactor.md | 6 +- docs/tech-specs/query-time-explainability.md | 24 ++ docs/tech-specs/streaming-llm-responses.md | 34 +- 7 files changed, 205 insertions(+), 233 deletions(-) diff --git a/docs/tech-specs/agent-explainability.md b/docs/tech-specs/agent-explainability.md index 3dee0ac2..f7900e75 100644 --- a/docs/tech-specs/agent-explainability.md +++ b/docs/tech-specs/agent-explainability.md @@ -1,272 +1,211 @@ # Agent Explainability: Provenance Recording +## Status + +Implemented + ## Overview -Add provenance recording to the React agent loop so agent sessions can be traced and debugged using the same explainability infrastructure as GraphRAG. +Agent sessions are traced and debugged using the same explainability infrastructure as GraphRAG and Document RAG. Provenance is written to `urn:graph:retrieval` and delivered inline on the explain stream. -**Design Decisions:** -- Write to `urn:graph:retrieval` (generic explainability graph) -- Linear dependency chain for now (analysis N → wasDerivedFrom → analysis N-1) -- Tools are opaque black boxes (record input/output only) -- DAG support deferred to future iteration +The canonical vocabulary for all predicates and types is published as an OWL ontology at `specs/ontology/trustgraph.ttl`. ## Entity Types -Both GraphRAG and Agent use PROV-O as the base ontology with TrustGraph-specific subtypes: +All services use PROV-O as the base ontology with TrustGraph-specific subtypes. 
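The typing convention above can be sketched as triple generation: each entity gets one `rdf:type` triple for its PROV-O base type plus one per TrustGraph subtype. This is an illustrative sketch only; the `Triple` tuple, namespace constants, and helper name are assumptions, not the real `trustgraph-base` provenance API.

```python
# Sketch: rdf:type triples for a PROV-O entity with TG subtypes.
# Names here are illustrative, not the trustgraph-base helpers.
from collections import namedtuple

Triple = namedtuple("Triple", ["s", "p", "o"])

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PROV = "http://www.w3.org/ns/prov#"
TG = "https://trustgraph.ai/ns/"

def type_triples(uri, prov_type, *tg_types):
    """One rdf:type triple for the PROV-O base type, one per TG subtype."""
    triples = [Triple(uri, RDF_TYPE, PROV + prov_type)]
    triples += [Triple(uri, RDF_TYPE, TG + t) for t in tg_types]
    return triples

# A GraphRAG Question carries tg:Question plus its mechanism subtype.
q = type_triples("urn:trustgraph:question:1234", "Entity",
                 "Question", "GraphRagQuestion")
```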
### GraphRAG Types -| Entity | PROV-O Type | TG Types | Description | -|--------|-------------|----------|-------------| -| Question | `prov:Activity` | `tg:Question`, `tg:GraphRagQuestion` | The user's query | -| Exploration | `prov:Entity` | `tg:Exploration` | Edges retrieved from knowledge graph | -| Focus | `prov:Entity` | `tg:Focus` | Selected edges with reasoning | -| Synthesis | `prov:Entity` | `tg:Synthesis` | Final answer | - -### Agent Types -| Entity | PROV-O Type | TG Types | Description | -|--------|-------------|----------|-------------| -| Question | `prov:Activity` | `tg:Question`, `tg:AgentQuestion` | The user's query | -| Analysis | `prov:Entity` | `tg:Analysis` | Each think/act/observe cycle | -| Conclusion | `prov:Entity` | `tg:Conclusion` | Final answer | +| Entity | TG Types | Description | +|--------|----------|-------------| +| Question | `tg:Question`, `tg:GraphRagQuestion` | The user's query | +| Grounding | `tg:Grounding` | Concept extraction from query | +| Exploration | `tg:Exploration` | Edges retrieved from knowledge graph | +| Focus | `tg:Focus` | Selected edges with reasoning | +| Synthesis | `tg:Synthesis`, `tg:Answer` | Final answer | ### Document RAG Types -| Entity | PROV-O Type | TG Types | Description | -|--------|-------------|----------|-------------| -| Question | `prov:Activity` | `tg:Question`, `tg:DocRagQuestion` | The user's query | -| Exploration | `prov:Entity` | `tg:Exploration` | Chunks retrieved from document store | -| Synthesis | `prov:Entity` | `tg:Synthesis` | Final answer | +| Entity | TG Types | Description | +|--------|----------|-------------| +| Question | `tg:Question`, `tg:DocRagQuestion` | The user's query | +| Grounding | `tg:Grounding` | Concept extraction from query | +| Exploration | `tg:Exploration` | Chunks retrieved from document store | +| Synthesis | `tg:Synthesis`, `tg:Answer` | Final answer | -**Note:** Document RAG uses a subset of GraphRAG's types (no Focus step since there's no edge 
selection/reasoning phase). +### Agent Types (React) +| Entity | TG Types | Description | +|--------|----------|-------------| +| Question | `tg:Question`, `tg:AgentQuestion` | The user's query (session start) | +| PatternDecision | `tg:PatternDecision` | Meta-router routing decision | +| Analysis | `tg:Analysis`, `tg:ToolUse` | One think/act cycle | +| Thought | `tg:Reflection`, `tg:Thought` | Agent reasoning (sub-entity of Analysis) | +| Observation | `tg:Reflection`, `tg:Observation` | Tool result (standalone entity) | +| Conclusion | `tg:Conclusion`, `tg:Answer` | Final answer | + +### Agent Types (Orchestrator — Plan) +| Entity | TG Types | Description | +|--------|----------|-------------| +| Plan | `tg:Plan` | Structured plan of steps | +| StepResult | `tg:StepResult`, `tg:Answer` | Result from executing one plan step | +| Synthesis | `tg:Synthesis`, `tg:Answer` | Final synthesised answer | + +### Agent Types (Orchestrator — Supervisor) +| Entity | TG Types | Description | +|--------|----------|-------------| +| Decomposition | `tg:Decomposition` | Question decomposed into sub-goals | +| Finding | `tg:Finding`, `tg:Answer` | Result from a sub-agent | +| Synthesis | `tg:Synthesis`, `tg:Answer` | Final synthesised answer | + +### Mixin Types +| Type | Description | +|------|-------------| +| `tg:Answer` | Unifying type for terminal answers (Synthesis, Conclusion, Finding, StepResult) | +| `tg:Reflection` | Unifying type for intermediate commentary (Thought, Observation) | +| `tg:ToolUse` | Applied to Analysis when a tool is invoked | +| `tg:Error` | Applied to Observation events where a failure occurred (tool error or LLM parse error) | ### Question Subtypes -All Question entities share `tg:Question` as a base type but have a specific subtype to identify the retrieval mechanism: - | Subtype | URI Pattern | Mechanism | |---------|-------------|-----------| | `tg:GraphRagQuestion` | `urn:trustgraph:question:{uuid}` | Knowledge graph RAG | | `tg:DocRagQuestion` | 
`urn:trustgraph:docrag:{uuid}` | Document/chunk RAG | -| `tg:AgentQuestion` | `urn:trustgraph:agent:{uuid}` | ReAct agent | +| `tg:AgentQuestion` | `urn:trustgraph:agent:session:{uuid}` | Agent orchestrator | -This allows querying all questions via `tg:Question` while filtering by specific mechanism via the subtype. +## Provenance Chains -## Provenance Model +All chains use `prov:wasDerivedFrom` links. Each entity is a `prov:Entity`. + +### GraphRAG ``` -Question (urn:trustgraph:agent:{uuid}) - │ - │ tg:query = "User's question" - │ prov:startedAtTime = timestamp - │ rdf:type = prov:Activity, tg:Question - │ - ↓ prov:wasDerivedFrom - │ -Analysis1 (urn:trustgraph:agent:{uuid}/i1) - │ - │ tg:thought = "I need to query the knowledge base..." - │ tg:action = "knowledge-query" - │ tg:arguments = {"question": "..."} - │ tg:observation = "Result from tool..." - │ rdf:type = prov:Entity, tg:Analysis - │ - ↓ prov:wasDerivedFrom - │ -Analysis2 (urn:trustgraph:agent:{uuid}/i2) - │ ... - ↓ prov:wasDerivedFrom - │ -Conclusion (urn:trustgraph:agent:{uuid}/final) - │ - │ tg:answer = "The final response..." - │ rdf:type = prov:Entity, tg:Conclusion +Question → Grounding → Exploration → Focus → Synthesis ``` -### Document RAG Provenance Model +### Document RAG ``` -Question (urn:trustgraph:docrag:{uuid}) - │ - │ tg:query = "User's question" - │ prov:startedAtTime = timestamp - │ rdf:type = prov:Activity, tg:Question - │ - ↓ prov:wasGeneratedBy - │ -Exploration (urn:trustgraph:docrag:{uuid}/exploration) - │ - │ tg:chunkCount = 5 - │ tg:selectedChunk = "chunk-id-1" - │ tg:selectedChunk = "chunk-id-2" - │ ... - │ rdf:type = prov:Entity, tg:Exploration - │ - ↓ prov:wasDerivedFrom - │ -Synthesis (urn:trustgraph:docrag:{uuid}/synthesis) - │ - │ tg:content = "The synthesized answer..." - │ rdf:type = prov:Entity, tg:Synthesis +Question → Grounding → Exploration → Synthesis ``` -## Changes Required +### Agent React -### 1. 
Schema Changes - -**File:** `trustgraph-base/trustgraph/schema/services/agent.py` - -Add `session_id` and `collection` fields to `AgentRequest`: -```python -@dataclass -class AgentRequest: - question: str = "" - state: str = "" - group: list[str] | None = None - history: list[AgentStep] = field(default_factory=list) - user: str = "" - collection: str = "default" # NEW: Collection for provenance traces - streaming: bool = False - session_id: str = "" # NEW: For provenance tracking across iterations +``` +Question → PatternDecision → Analysis(1) → Observation(1) → Analysis(2) → ... → Conclusion ``` -**File:** `trustgraph-base/trustgraph/messaging/translators/agent.py` +The PatternDecision entity records which execution pattern the meta-router selected. It is only emitted on the first iteration when routing occurs. -Update translator to handle `session_id` and `collection` in both `to_pulsar()` and `from_pulsar()`. +Thought sub-entities derive from their parent Analysis. Observation entities derive from their parent Analysis (or from a sub-trace entity if the tool produced its own explainability, e.g. a GraphRAG query). -### 2. Add Explainability Producer to Agent Service +### Agent Plan-then-Execute -**File:** `trustgraph-flow/trustgraph/agent/react/service.py` - -Register an "explainability" producer (same pattern as GraphRAG): -```python -from ... base import ProducerSpec -from ... schema import Triples - -# In __init__: -self.register_specification( - ProducerSpec( - name = "explainability", - schema = Triples, - ) -) +``` +Question → PatternDecision → Plan → StepResult(0) → StepResult(1) → ... → Synthesis ``` -### 3. 
Provenance Triple Generation +### Agent Supervisor -**File:** `trustgraph-base/trustgraph/provenance/agent.py` - -Create helper functions (similar to GraphRAG's `question_triples`, `exploration_triples`, etc.): -```python -def agent_session_triples(session_uri, query, timestamp): - """Generate triples for agent Question.""" - return [ - Triple(s=session_uri, p=RDF_TYPE, o=PROV_ACTIVITY), - Triple(s=session_uri, p=RDF_TYPE, o=TG_QUESTION), - Triple(s=session_uri, p=TG_QUERY, o=query), - Triple(s=session_uri, p=PROV_STARTED_AT_TIME, o=timestamp), - ] - -def agent_iteration_triples(iteration_uri, parent_uri, thought, action, arguments, observation): - """Generate triples for one Analysis step.""" - return [ - Triple(s=iteration_uri, p=RDF_TYPE, o=PROV_ENTITY), - Triple(s=iteration_uri, p=RDF_TYPE, o=TG_ANALYSIS), - Triple(s=iteration_uri, p=TG_THOUGHT, o=thought), - Triple(s=iteration_uri, p=TG_ACTION, o=action), - Triple(s=iteration_uri, p=TG_ARGUMENTS, o=json.dumps(arguments)), - Triple(s=iteration_uri, p=TG_OBSERVATION, o=observation), - Triple(s=iteration_uri, p=PROV_WAS_DERIVED_FROM, o=parent_uri), - ] - -def agent_final_triples(final_uri, parent_uri, answer): - """Generate triples for Conclusion.""" - return [ - Triple(s=final_uri, p=RDF_TYPE, o=PROV_ENTITY), - Triple(s=final_uri, p=RDF_TYPE, o=TG_CONCLUSION), - Triple(s=final_uri, p=TG_ANSWER, o=answer), - Triple(s=final_uri, p=PROV_WAS_DERIVED_FROM, o=parent_uri), - ] +``` +Question → PatternDecision → Decomposition → [fan-out sub-agents] + → Finding(0) → Finding(1) → ... → Synthesis ``` -### 4. Type Definitions +Each sub-agent runs its own session with `wasDerivedFrom` linking back to the parent's Decomposition. Findings derive from their sub-agent's Conclusion. 
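The linear chains above reduce to a sequence of `prov:wasDerivedFrom` links, each entity pointing at its predecessor. A minimal sketch for the React chain, with assumed URI suffixes (`/pattern`, `/i1`, `/final`) chosen for illustration rather than taken from the implementation:

```python
# Sketch: linear prov:wasDerivedFrom chain for a React agent session.
# The URI suffixes below are illustrative placeholders.
PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"

def derivation_chain(uris):
    """Link each entity to its predecessor with prov:wasDerivedFrom."""
    return [(child, PROV_WAS_DERIVED_FROM, parent)
            for parent, child in zip(uris, uris[1:])]

session = "urn:trustgraph:agent:session:abcd"
chain = derivation_chain([
    session,                      # Question
    session + "/pattern",         # PatternDecision
    session + "/i1",              # Analysis(1)
    session + "/i1/observation",  # Observation(1)
    session + "/final",           # Conclusion
])
```

A DAG (e.g. a Finding deriving from a sub-agent's Conclusion as well as the Decomposition) would simply emit more than one `wasDerivedFrom` triple per entity.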
-**File:** `trustgraph-base/trustgraph/provenance/namespaces.py` +## Predicates -Add explainability entity types and agent predicates: -```python -# Explainability entity types (used by both GraphRAG and Agent) -TG_QUESTION = TG + "Question" -TG_EXPLORATION = TG + "Exploration" -TG_FOCUS = TG + "Focus" -TG_SYNTHESIS = TG + "Synthesis" -TG_ANALYSIS = TG + "Analysis" -TG_CONCLUSION = TG + "Conclusion" +### Session / Question +| Predicate | Type | Description | +|-----------|------|-------------| +| `tg:query` | string | The user's query text | +| `prov:startedAtTime` | string | ISO timestamp | -# Agent predicates -TG_THOUGHT = TG + "thought" -TG_ACTION = TG + "action" -TG_ARGUMENTS = TG + "arguments" -TG_OBSERVATION = TG + "observation" -TG_ANSWER = TG + "answer" -``` +### Pattern Decision +| Predicate | Type | Description | +|-----------|------|-------------| +| `tg:pattern` | string | Selected pattern (react, plan-then-execute, supervisor) | +| `tg:taskType` | string | Identified task type (general, research, etc.) 
| -## Files Modified +### Analysis (Iteration) +| Predicate | Type | Description | +|-----------|------|-------------| +| `tg:action` | string | Tool name selected by the agent | +| `tg:arguments` | string | JSON-encoded arguments | +| `tg:thought` | IRI | Link to Thought sub-entity | +| `tg:toolCandidate` | string | Tool name available to the LLM (one per candidate) | +| `tg:stepNumber` | integer | 1-based iteration counter | +| `tg:llmDurationMs` | integer | LLM call duration in milliseconds | +| `tg:inToken` | integer | Input token count | +| `tg:outToken` | integer | Output token count | +| `tg:llmModel` | string | Model identifier | -| File | Change | -|------|--------| -| `trustgraph-base/trustgraph/schema/services/agent.py` | Add session_id and collection to AgentRequest | -| `trustgraph-base/trustgraph/messaging/translators/agent.py` | Update translator for new fields | -| `trustgraph-base/trustgraph/provenance/namespaces.py` | Add entity types, agent predicates, and Document RAG predicates | -| `trustgraph-base/trustgraph/provenance/triples.py` | Add TG types to GraphRAG triple builders, add Document RAG triple builders | -| `trustgraph-base/trustgraph/provenance/uris.py` | Add Document RAG URI generators | -| `trustgraph-base/trustgraph/provenance/__init__.py` | Export new types, predicates, and Document RAG functions | -| `trustgraph-base/trustgraph/schema/services/retrieval.py` | Add explain_id, explain_graph, and explain_triples to DocumentRagResponse | -| `trustgraph-base/trustgraph/messaging/translators/retrieval.py` | Update DocumentRagResponseTranslator for explainability fields including inline triples | -| `trustgraph-flow/trustgraph/agent/react/service.py` | Add explainability producer + recording logic | -| `trustgraph-flow/trustgraph/retrieval/document_rag/document_rag.py` | Add explainability callback and emit provenance triples | -| `trustgraph-flow/trustgraph/retrieval/document_rag/rag.py` | Add explainability producer and wire up callback 
| -| `trustgraph-cli/trustgraph/cli/show_explain_trace.py` | Handle agent trace types | -| `trustgraph-cli/trustgraph/cli/list_explain_traces.py` | List agent sessions alongside GraphRAG | +### Observation +| Predicate | Type | Description | +|-----------|------|-------------| +| `tg:document` | IRI | Librarian document reference | +| `tg:toolDurationMs` | integer | Tool execution time in milliseconds | +| `tg:toolError` | string | Error message (tool failure or LLM parse error) | -## Files Created +When `tg:toolError` is present, the Observation also carries the `tg:Error` mixin type. -| File | Purpose | -|------|---------| -| `trustgraph-base/trustgraph/provenance/agent.py` | Agent-specific triple generators | +### Conclusion / Synthesis +| Predicate | Type | Description | +|-----------|------|-------------| +| `tg:document` | IRI | Librarian document reference | +| `tg:terminationReason` | string | Why the loop stopped | +| `tg:inToken` | integer | Input token count (synthesis LLM call) | +| `tg:outToken` | integer | Output token count | +| `tg:llmModel` | string | Model identifier | -## CLI Updates +Termination reason values: +- `final-answer` -- LLM produced a confident answer (react) +- `plan-complete` -- all plan steps executed (plan-then-execute) +- `subagents-complete` -- all sub-agents reported back (supervisor) -**Detection:** Both GraphRAG and Agent Questions have `tg:Question` type. Distinguished by: -1. URI pattern: `urn:trustgraph:agent:` vs `urn:trustgraph:question:` -2. 
Derived entities: `tg:Analysis` (agent) vs `tg:Exploration` (GraphRAG) +### Decomposition +| Predicate | Type | Description | +|-----------|------|-------------| +| `tg:subagentGoal` | string | Goal assigned to a sub-agent (one per goal) | +| `tg:inToken` | integer | Input token count | +| `tg:outToken` | integer | Output token count | -**`list_explain_traces.py`:** -- Shows Type column (Agent vs GraphRAG) +### Plan +| Predicate | Type | Description | +|-----------|------|-------------| +| `tg:planStep` | string | Goal for a plan step (one per step) | +| `tg:inToken` | integer | Input token count | +| `tg:outToken` | integer | Output token count | -**`show_explain_trace.py`:** -- Auto-detects trace type -- Agent rendering shows: Question → Analysis step(s) → Conclusion +### Token Counts on RAG Events -## Backwards Compatibility +Grounding, Focus, and Synthesis events on GraphRAG and Document RAG also carry `tg:inToken`, `tg:outToken`, and `tg:llmModel` for the LLM calls associated with that step. -- `session_id` defaults to `""` - old requests work, just won't have provenance -- `collection` defaults to `"default"` - reasonable fallback -- CLI gracefully handles both trace types +## Error Handling + +Tool execution errors and LLM parse errors are captured as Observation events rather than crashing the agent: + +- The error message is recorded on `tg:toolError` +- The Observation carries the `tg:Error` mixin type +- The error text becomes the observation content, visible to the LLM on the next iteration +- The provenance chain is preserved (Observation derives from Analysis) +- The agent gets another iteration to retry or choose a different approach + +## Vocabulary Reference + +The full OWL ontology covering all classes and predicates is at `specs/ontology/trustgraph.ttl`. ## Verification ```bash -# Run an agent query -tg-invoke-agent -q "What is the capital of France?" +# Run an agent query with explainability +tg-invoke-agent -q "What is quantum computing?" 
-x -# List traces (should show agent sessions with Type column) -tg-list-explain-traces -U trustgraph -C default +# Run with token usage +tg-invoke-agent -q "What is quantum computing?" --show-usage -# Show agent trace -tg-show-explain-trace "urn:trustgraph:agent:xxx" +# GraphRAG with explainability +tg-invoke-graph-rag -q "Tell me about AI" -x + +# Document RAG with explainability +tg-invoke-document-rag -q "Summarize the findings" -x ``` - -## Future Work (Not This PR) - -- DAG dependencies (when analysis N uses results from multiple prior analyses) -- Tool-specific provenance linking (KnowledgeQuery → its GraphRAG trace) -- Streaming provenance emission (emit as we go, not batch at end) diff --git a/docs/tech-specs/agent-orchestration.md b/docs/tech-specs/agent-orchestration.md index c93388ed..90723368 100644 --- a/docs/tech-specs/agent-orchestration.md +++ b/docs/tech-specs/agent-orchestration.md @@ -862,7 +862,7 @@ independently. Response chunk fields: message_id UUID for this message (groups chunks) session_id Which agent session produced this chunk - chunk_type "thought" | "observation" | "answer" | ... + message_type "thought" | "observation" | "answer" | ... content The chunk text end_of_message True on the final chunk of this message end_of_dialog True on the final message of the entire execution diff --git a/docs/tech-specs/extraction-provenance-subgraph.md b/docs/tech-specs/extraction-provenance-subgraph.md index ba0d3e50..27a3775b 100644 --- a/docs/tech-specs/extraction-provenance-subgraph.md +++ b/docs/tech-specs/extraction-provenance-subgraph.md @@ -203,3 +203,7 @@ def subgraph_provenance_triples( This is a breaking change to the provenance model. Provenance has not been released, so no migration is needed. The old `tg:reifies` / `statement_uri` code can be removed outright. + +## Vocabulary Reference + +The full OWL ontology covering all extraction and query-time classes and predicates is at `specs/ontology/trustgraph.ttl`. 
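The response chunk fields described in the agent-orchestration change above (`message_type`, `content`, `end_of_message`) imply a simple client-side reassembly loop: buffer content until `end_of_message`, then emit one complete message per type. A sketch under the assumption that chunks arrive as dicts (the real schema is a Pulsar record):

```python
# Sketch: reassembling streamed chunks into complete messages using
# message_type / end_of_message. Chunk dicts are illustrative; the
# real wire format is a Pulsar Record.
def assemble(chunks):
    messages, buffer = [], []
    for c in chunks:
        buffer.append(c["content"])
        if c["end_of_message"]:
            messages.append((c["message_type"], "".join(buffer)))
            buffer = []
    return messages

stream = [
    {"message_type": "thought", "content": "I need to",
     "end_of_message": False},
    {"message_type": "thought", "content": " search...",
     "end_of_message": True},
    {"message_type": "answer", "content": "Paris.",
     "end_of_message": True},
]
result = assemble(stream)
# → [("thought", "I need to search..."), ("answer", "Paris.")]
```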
diff --git a/docs/tech-specs/extraction-time-provenance.md b/docs/tech-specs/extraction-time-provenance.md index d70c8c11..782bf5de 100644 --- a/docs/tech-specs/extraction-time-provenance.md +++ b/docs/tech-specs/extraction-time-provenance.md @@ -612,8 +612,13 @@ Link embedding entity IDs to chunk. | Triple store | Use reification for triple → chunk provenance | | Embedding provenance | Link entity ID → chunk ID | +## Vocabulary Reference + +The full OWL ontology covering all extraction and query-time classes and predicates is at `specs/ontology/trustgraph.ttl`. + ## References -- Query-time provenance: `docs/tech-specs/query-time-provenance.md` +- Query-time provenance: `docs/tech-specs/query-time-explainability.md` +- Agent explainability: `docs/tech-specs/agent-explainability.md` - PROV-O standard for provenance modeling - Existing source metadata in knowledge graph (needs audit) diff --git a/docs/tech-specs/python-api-refactor.md b/docs/tech-specs/python-api-refactor.md index 6fcf2f22..215ed3d6 100644 --- a/docs/tech-specs/python-api-refactor.md +++ b/docs/tech-specs/python-api-refactor.md @@ -704,17 +704,17 @@ class StreamingChunk: @dataclasses.dataclass class AgentThought(StreamingChunk): """Agent reasoning chunk""" - chunk_type: str = "thought" + message_type: str = "thought" @dataclasses.dataclass class AgentObservation(StreamingChunk): """Agent tool observation chunk""" - chunk_type: str = "observation" + message_type: str = "observation" @dataclasses.dataclass class AgentAnswer(StreamingChunk): """Agent final answer chunk""" - chunk_type: str = "final-answer" + message_type: str = "final-answer" end_of_dialog: bool = False @dataclasses.dataclass diff --git a/docs/tech-specs/query-time-explainability.md b/docs/tech-specs/query-time-explainability.md index 69cb45ac..052b9619 100644 --- a/docs/tech-specs/query-time-explainability.md +++ b/docs/tech-specs/query-time-explainability.md @@ -138,6 +138,25 @@ Defined in 
`trustgraph-base/trustgraph/provenance/namespaces.py`: | `TG_REASONING` | `https://trustgraph.ai/ns/reasoning` | | `TG_CONTENT` | `https://trustgraph.ai/ns/content` | | `TG_DOCUMENT` | `https://trustgraph.ai/ns/document` | +| `TG_IN_TOKEN` | `https://trustgraph.ai/ns/inToken` | +| `TG_OUT_TOKEN` | `https://trustgraph.ai/ns/outToken` | +| `TG_LLM_MODEL` | `https://trustgraph.ai/ns/llmModel` | + +### Token Usage on Events + +Grounding, Focus, and Synthesis events carry per-event LLM token counts: + +| Predicate | Type | Present on | +|-----------|------|------------| +| `tg:inToken` | integer | Grounding, Focus, Synthesis | +| `tg:outToken` | integer | Grounding, Focus, Synthesis | +| `tg:llmModel` | string | Grounding, Focus, Synthesis | + +- **Grounding**: tokens from the extract-concepts LLM call +- **Focus**: summed tokens from edge-scoring + edge-reasoning LLM calls +- **Synthesis**: tokens from the synthesis LLM call + +Values are absent (not zero) when token counts are unavailable. ## GraphRagResponse Schema @@ -261,8 +280,13 @@ Based on the provided knowledge statements... | `trustgraph-flow/trustgraph/query/triples/cassandra/service.py` | Quoted triple query support | | `trustgraph-cli/trustgraph/cli/invoke_graph_rag.py` | CLI with explainability display | +## Vocabulary Reference + +The full OWL ontology covering all classes and predicates is at `specs/ontology/trustgraph.ttl`. 
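The absent-not-zero rule for token usage can be made concrete with a small sketch: emit `tg:inToken` / `tg:outToken` / `tg:llmModel` triples only when values are known, so consumers can distinguish "no usage data" from "zero tokens". The helper name is an assumption; the predicate URIs follow the namespace table above.

```python
# Sketch of the absent-not-zero rule: token-usage triples are only
# emitted when the counts are available. Helper name is illustrative.
TG = "https://trustgraph.ai/ns/"

def token_usage_triples(event_uri, in_tokens=None, out_tokens=None,
                        model=None):
    triples = []
    if in_tokens is not None:
        triples.append((event_uri, TG + "inToken", str(in_tokens)))
    if out_tokens is not None:
        triples.append((event_uri, TG + "outToken", str(out_tokens)))
    if model is not None:
        triples.append((event_uri, TG + "llmModel", model))
    return triples

# Counts unavailable: no triples at all, rather than zeros.
empty = token_usage_triples("urn:trustgraph:question:1/synthesis")
full = token_usage_triples("urn:trustgraph:question:1/synthesis",
                           in_tokens=120, out_tokens=45, model="gpt-4o")
```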
+ ## References - PROV-O (W3C Provenance Ontology): https://www.w3.org/TR/prov-o/ - RDF-star: https://w3c.github.io/rdf-star/ - Extraction-time provenance: `docs/tech-specs/extraction-time-provenance.md` +- Agent explainability: `docs/tech-specs/agent-explainability.md` diff --git a/docs/tech-specs/streaming-llm-responses.md b/docs/tech-specs/streaming-llm-responses.md index 5733315a..950610ee 100644 --- a/docs/tech-specs/streaming-llm-responses.md +++ b/docs/tech-specs/streaming-llm-responses.md @@ -393,28 +393,28 @@ The agent produces multiple types of output during its reasoning cycle: - Answer (final response) - Errors -Since `chunk_type` identifies what kind of content is being sent, the separate +Since `message_type` identifies what kind of content is being sent, the separate `answer`, `error`, `thought`, and `observation` fields can be collapsed into a single `content` field: ```python class AgentResponse(Record): - chunk_type = String() # "thought", "action", "observation", "answer", "error" - content = String() # The actual content (interpretation depends on chunk_type) + message_type = String() # "thought", "action", "observation", "answer", "error" + content = String() # The actual content (interpretation depends on message_type) end_of_message = Boolean() # Current thought/action/observation/answer is complete end_of_dialog = Boolean() # Entire agent dialog is complete ``` **Field Semantics:** -- `chunk_type`: Indicates what type of content is in the `content` field +- `message_type`: Indicates what type of content is in the `content` field - `"thought"`: Agent reasoning/thinking - `"action"`: Tool/action being invoked - `"observation"`: Result from tool execution - `"answer"`: Final answer to the user's question - `"error"`: Error message -- `content`: The actual streamed content, interpreted based on `chunk_type` +- `content`: The actual streamed content, interpreted based on `message_type` - `end_of_message`: When `true`, the current chunk type is 
complete - Example: All tokens for the current thought have been sent @@ -428,27 +428,27 @@ class AgentResponse(Record): When `streaming=true`: 1. **Thought streaming**: - - Multiple chunks with `chunk_type="thought"`, `end_of_message=false` + - Multiple chunks with `message_type="thought"`, `end_of_message=false` - Final thought chunk has `end_of_message=true` 2. **Action notification**: - - Single chunk with `chunk_type="action"`, `end_of_message=true` + - Single chunk with `message_type="action"`, `end_of_message=true` 3. **Observation**: - - Chunk(s) with `chunk_type="observation"`, final has `end_of_message=true` + - Chunk(s) with `message_type="observation"`, final has `end_of_message=true` 4. **Repeat** steps 1-3 as the agent reasons 5. **Final answer**: - - `chunk_type="answer"` with the final response in `content` + - `message_type="answer"` with the final response in `content` - Last chunk has `end_of_message=true`, `end_of_dialog=true` **Example Stream Sequence:** ``` -{chunk_type: "thought", content: "I need to", end_of_message: false, end_of_dialog: false} -{chunk_type: "thought", content: " search for...", end_of_message: true, end_of_dialog: false} -{chunk_type: "action", content: "search", end_of_message: true, end_of_dialog: false} -{chunk_type: "observation", content: "Found: ...", end_of_message: true, end_of_dialog: false} -{chunk_type: "thought", content: "Based on this", end_of_message: false, end_of_dialog: false} -{chunk_type: "thought", content: " I can answer...", end_of_message: true, end_of_dialog: false} -{chunk_type: "answer", content: "The answer is...", end_of_message: true, end_of_dialog: true} +{message_type: "thought", content: "I need to", end_of_message: false, end_of_dialog: false} +{message_type: "thought", content: " search for...", end_of_message: true, end_of_dialog: false} +{message_type: "action", content: "search", end_of_message: true, end_of_dialog: false} +{message_type: "observation", content: "Found: ...", 
end_of_message: true, end_of_dialog: false} +{message_type: "thought", content: "Based on this", end_of_message: false, end_of_dialog: false} +{message_type: "thought", content: " I can answer...", end_of_message: true, end_of_dialog: false} +{message_type: "answer", content: "The answer is...", end_of_message: true, end_of_dialog: true} ``` When `streaming=false`: @@ -541,7 +541,7 @@ The following questions were resolved during specification: populated and no other fields are needed. An error is always the final communication - no subsequent messages are permitted or expected after an error. For LLM/Prompt streams, `end_of_stream=true`. For Agent streams, - `chunk_type="error"` with `end_of_dialog=true`. + `message_type="error"` with `end_of_dialog=true`. 3. **Partial Response Recovery**: The messaging protocol (Pulsar) is resilient, so message-level retry is not needed. If a client loses track of the stream