Additional agent DAG tests (#750)
- test_agent_provenance.py: test_session_parent_uri,
test_session_no_parent_uri, and 6 synthesis tests (types,
single/multiple parents, document, label)
- test_on_action_callback.py: 3 tests — fires before tool, skipped
for Final, works when None
- test_callback_message_id.py: 7 tests — message_id on think/observe/
answer callbacks (streaming + non-streaming) and
send_final_response
- test_parse_chunk_message_id.py (5 tests) - _parse_chunk propagates
message_id for thought, observation, answer; handles missing
gracefully
- test_explainability_parsing.py (+1) -
test_dispatches_analysis_with_tooluse - Analysis+ToolUse mixin still
dispatches to Analysis
- test_explainability.py (+1) -
test_observation_found_via_subtrace_synthesis - chain walker follows
from sub-trace Synthesis to find Observation and Conclusion in
correct order
- test_agent_provenance.py (+8) - session parent_uri (2), synthesis
single/multiple parents, types, document, label (6)
2026-04-01 13:59:34 +01:00
"""
Tests that streaming callbacks set message_id on AgentResponse.
"""

import pytest
from unittest.mock import AsyncMock, MagicMock

from trustgraph.agent.orchestrator.pattern_base import PatternBase
from trustgraph.schema import AgentResponse


@pytest.fixture
def pattern():
    processor = MagicMock()
    return PatternBase(processor)


class TestThinkCallbackMessageId:

    @pytest.mark.asyncio
    async def test_streaming_think_has_message_id(self, pattern):
        responses = []
        async def capture(r):
            responses.append(r)

        msg_id = "urn:trustgraph:agent:sess/i1/thought"
        think = pattern.make_think_callback(capture, streaming=True, message_id=msg_id)
        await think("hello", is_final=False)

        assert len(responses) == 1
        assert responses[0].message_id == msg_id
Add agent explainability instrumentation and unify envelope field naming (#795)
Addresses recommendations from the UX developer's agent experience report.
Adds provenance predicates, DAG structure changes, error resilience, and
a published OWL ontology.
Explainability additions:
- Tool candidates: tg:toolCandidate on Analysis events lists the tools
visible to the LLM for each iteration (names only, descriptions in config)
- Termination reason: tg:terminationReason on Conclusion/Synthesis events
(final-answer, plan-complete, subagents-complete)
- Step counter: tg:stepNumber on iteration events
- Pattern decision: new tg:PatternDecision entity in the DAG between
session and first iteration, carrying tg:pattern and tg:taskType
- Latency: tg:llmDurationMs on Analysis events, tg:toolDurationMs on
Observation events
- Token counts on events: tg:inToken/tg:outToken/tg:llmModel on
Grounding, Focus, Synthesis, and Analysis events
- Tool/parse errors: tg:toolError on Observation events with tg:Error
mixin type. Parse failures return as error observations instead of
crashing the agent, giving it a chance to retry.
Envelope unification:
- Rename chunk_type to message_type across AgentResponse schema,
translator, SDK types, socket clients, CLI, and all tests.
Agent and RAG services now both use message_type on the wire.
Ontology:
- specs/ontology/trustgraph.ttl — OWL vocabulary covering all 26 classes,
7 object properties, and 36+ datatype properties including new predicates.
DAG structure tests:
- tests/unit/test_provenance/test_dag_structure.py verifies the
wasDerivedFrom chain for GraphRAG, DocumentRAG, and all three agent
patterns (react, plan, supervisor) including the pattern-decision link.
2026-04-13 16:16:42 +01:00
        assert responses[0].message_type == "thought"

    @pytest.mark.asyncio
    async def test_non_streaming_think_has_message_id(self, pattern):
        responses = []
        async def capture(r):
            responses.append(r)

        msg_id = "urn:trustgraph:agent:sess/i1/thought"
        think = pattern.make_think_callback(capture, streaming=False, message_id=msg_id)
        await think("hello")

        assert responses[0].message_id == msg_id
        assert responses[0].end_of_message is True


class TestObserveCallbackMessageId:

    @pytest.mark.asyncio
    async def test_streaming_observe_has_message_id(self, pattern):
        responses = []
        async def capture(r):
            responses.append(r)

        msg_id = "urn:trustgraph:agent:sess/i1/observation"
        observe = pattern.make_observe_callback(capture, streaming=True, message_id=msg_id)
        await observe("result", is_final=True)

        assert responses[0].message_id == msg_id
        assert responses[0].message_type == "observation"


class TestAnswerCallbackMessageId:

    @pytest.mark.asyncio
    async def test_streaming_answer_has_message_id(self, pattern):
        responses = []
        async def capture(r):
            responses.append(r)

        msg_id = "urn:trustgraph:agent:sess/final"
        answer = pattern.make_answer_callback(capture, streaming=True, message_id=msg_id)
        await answer("the answer")

        assert responses[0].message_id == msg_id
        assert responses[0].message_type == "answer"

    @pytest.mark.asyncio
    async def test_no_message_id_default(self, pattern):
        responses = []
        async def capture(r):
            responses.append(r)

        answer = pattern.make_answer_callback(capture, streaming=True)
        await answer("the answer")

        assert responses[0].message_id == ""


class TestSendFinalResponseMessageId:

    @pytest.mark.asyncio
    async def test_streaming_final_has_message_id(self, pattern):
        responses = []
        async def capture(r):
            responses.append(r)

        msg_id = "urn:trustgraph:agent:sess/final"
        await pattern.send_final_response(
            capture, streaming=True, answer_text="answer",
            message_id=msg_id,
        )

        # Should get content chunk + end-of-dialog marker
        assert all(r.message_id == msg_id for r in responses)

    @pytest.mark.asyncio
    async def test_non_streaming_final_has_message_id(self, pattern):
        responses = []
        async def capture(r):
            responses.append(r)

        msg_id = "urn:trustgraph:agent:sess/final"
        await pattern.send_final_response(
            capture, streaming=False, answer_text="answer",
            message_id=msg_id,
        )

        assert len(responses) == 1
        assert responses[0].message_id == msg_id
        assert responses[0].end_of_dialog is True
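
The tests above all exercise one pattern: a factory returns an async callback, and every response that callback emits carries the message_id given to the factory (or an empty string when none is given). The sketch below illustrates that pattern in isolation. It is not the PatternBase implementation: FakeResponse and make_callback are hypothetical stand-ins, and the rule used here for end_of_message is an assumption chosen only to mirror what the tests assert.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResponse:
    # Hypothetical stand-in for AgentResponse; the field names mirror
    # those asserted in the tests, the defaults are assumptions.
    message_type: str
    content: str
    message_id: str = ""
    end_of_message: bool = False


def make_callback(respond, message_type, streaming, message_id=""):
    # Hypothetical factory: returns an async callback that wraps each
    # chunk of text in a FakeResponse carrying the same message_id.
    async def callback(text, is_final=False):
        await respond(FakeResponse(
            message_type=message_type,
            content=text,
            message_id=message_id,
            # Assumption: a non-streaming callback sends one complete
            # message, so it always marks end_of_message.
            end_of_message=is_final or not streaming,
        ))
    return callback


async def demo():
    captured = []

    async def capture(r):
        captured.append(r)

    think = make_callback(capture, "thought", streaming=True,
                          message_id="urn:example:thought")
    await think("hel", is_final=False)
    await think("lo", is_final=True)
    return captured


chunks = asyncio.run(demo())
```

Because the identifier is bound once in the factory's closure, a client can group the chunks of one streamed thought, observation, or answer by message_id without tracking any extra state.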