mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-04-25 00:16:23 +02:00
Adding explainability to the ReACT agent (#689)
* Added tech spec
* Add provenance recording to ReAct agent loop
Enables agent sessions to be traced and debugged using the same
explainability infrastructure as GraphRAG. Agent traces record:
- Session start with query and timestamp
- Each iteration's thought, action, arguments, and observation
- Final answer with derivation chain
Changes:
- Add session_id and collection fields to AgentRequest schema
- Add agent predicates (TG_THOUGHT, TG_ACTION, etc.) to namespaces
- Create agent provenance triple generators in provenance/agent.py
- Register explainability producer in agent service
- Emit provenance triples during agent execution
- Update CLI tools to detect and render agent traces alongside GraphRAG
* Updated explainability taxonomy:
GraphRAG: tg:Question → tg:Exploration → tg:Focus → tg:Synthesis
Agent: tg:Question → tg:Analysis (one per iteration) → tg:Conclusion
All entities also have their PROV-O type (prov:Activity or prov:Entity).
Updated commit message:
Add provenance recording to ReAct agent loop
Enables agent sessions to be traced and debugged using the same
explainability infrastructure as GraphRAG.
Entity types follow human reasoning patterns:
- tg:Question - the user's query (shared with GraphRAG)
- tg:Analysis - each think/act/observe cycle
- tg:Conclusion - the final answer
Also adds explicit TG types to GraphRAG entities:
- tg:Question, tg:Exploration, tg:Focus, tg:Synthesis
All types retain their PROV-O base types (prov:Activity, prov:Entity).
Changes:
- Add session_id and collection fields to AgentRequest schema
- Add explainability entity types to namespaces.py
- Create agent provenance triple generators
- Register explainability producer in agent service
- Emit provenance triples during agent execution
- Update CLI tools to detect and render both trace types
* Add Document RAG explainability. Summary of changes:
Schema Changes:
- trustgraph-base/trustgraph/schema/services/retrieval.py: Added
explain_id and explain_graph fields to DocumentRagResponse
- trustgraph-base/trustgraph/messaging/translators/retrieval.py:
Updated translator to handle explainability fields
Provenance Changes:
- trustgraph-base/trustgraph/provenance/namespaces.py: Added
TG_CHUNK_COUNT and TG_SELECTED_CHUNK predicates
- trustgraph-base/trustgraph/provenance/uris.py: Added
docrag_question_uri, docrag_exploration_uri, docrag_synthesis_uri
generators
- trustgraph-base/trustgraph/provenance/triples.py: Added
docrag_question_triples, docrag_exploration_triples,
docrag_synthesis_triples builders
- trustgraph-base/trustgraph/provenance/__init__.py: Exported all
new Document RAG functions and predicates
Service Changes:
- trustgraph-flow/trustgraph/retrieval/document_rag/document_rag.py:
Added explainability callback support and triple emission at each
phase (Question → Exploration → Synthesis)
- trustgraph-flow/trustgraph/retrieval/document_rag/rag.py:
Registered explainability producer and wired up the callback
Documentation:
- docs/tech-specs/agent-explainability.md: Added Document RAG entity
types and provenance model documentation
Document RAG Provenance Model:
Question (urn:trustgraph:docrag:{uuid})
│
│ tg:query, prov:startedAtTime
│ rdf:type = prov:Activity, tg:Question
│
↓ prov:wasGeneratedBy
│
Exploration (urn:trustgraph:docrag:{uuid}/exploration)
│
│ tg:chunkCount, tg:selectedChunk (multiple)
│ rdf:type = prov:Entity, tg:Exploration
│
↓ prov:wasDerivedFrom
│
Synthesis (urn:trustgraph:docrag:{uuid}/synthesis)
│
│ tg:content = "The answer..."
│ rdf:type = prov:Entity, tg:Synthesis
* Add a specific question subtype for each system, so the retrieval
mechanism is immediately obvious from the type:
System        | TG types on Question             | URI pattern
GraphRAG      | tg:Question, tg:GraphRagQuestion | urn:trustgraph:question:{uuid}
Document RAG  | tg:Question, tg:DocRagQuestion   | urn:trustgraph:docrag:{uuid}
Agent         | tg:Question, tg:AgentQuestion    | urn:trustgraph:agent:{uuid}
Files modified:
- trustgraph-base/trustgraph/provenance/namespaces.py - Added
TG_GRAPH_RAG_QUESTION, TG_DOC_RAG_QUESTION, TG_AGENT_QUESTION
- trustgraph-base/trustgraph/provenance/triples.py - Added subtype to
question_triples and docrag_question_triples
- trustgraph-base/trustgraph/provenance/agent.py - Added subtype to
agent_session_triples
- trustgraph-base/trustgraph/provenance/__init__.py - Exported new types
- docs/tech-specs/agent-explainability.md - Documented the subtypes
This allows:
- Query all questions: ?q rdf:type tg:Question
- Query only GraphRAG: ?q rdf:type tg:GraphRagQuestion
- Query only Document RAG: ?q rdf:type tg:DocRagQuestion
- Query only Agent: ?q rdf:type tg:AgentQuestion
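The type patterns listed here can be illustrated without a triple store. The following is a minimal in-memory sketch, assuming triples are plain (subject, predicate, object) tuples and using prefixed names such as tg:Question as stand-ins for full IRIs; `subjects_of_type` and the sample data are hypothetical and not part of TrustGraph.

```python
RDF_TYPE = "rdf:type"

# Hypothetical sample data: each question carries the shared tg:Question
# type plus one mechanism-specific subtype, as described in the message.
triples = [
    ("urn:trustgraph:question:q1", RDF_TYPE, "tg:Question"),
    ("urn:trustgraph:question:q1", RDF_TYPE, "tg:GraphRagQuestion"),
    ("urn:trustgraph:docrag:q2", RDF_TYPE, "tg:Question"),
    ("urn:trustgraph:docrag:q2", RDF_TYPE, "tg:DocRagQuestion"),
    ("urn:trustgraph:agent:q3", RDF_TYPE, "tg:Question"),
    ("urn:trustgraph:agent:q3", RDF_TYPE, "tg:AgentQuestion"),
]


def subjects_of_type(triples, type_name):
    """Return subjects matching the pattern `?q rdf:type <type_name>`."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == type_name)


# The shared type selects every question; a subtype selects one mechanism.
all_questions = subjects_of_type(triples, "tg:Question")
agent_questions = subjects_of_type(triples, "tg:AgentQuestion")
```

Because every question keeps the shared type, existing consumers that match on tg:Question should continue to see all three kinds of trace.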
* Fixed tests
parent a53ed41da2
commit 312174eb88
17 changed files with 1269 additions and 44 deletions
@@ -13,7 +13,9 @@ class AgentRequestTranslator(MessageTranslator):
             group=data.get("group", None),
             history=data.get("history", []),
             user=data.get("user", "trustgraph"),
-            streaming=data.get("streaming", False)
+            collection=data.get("collection", "default"),
+            streaming=data.get("streaming", False),
+            session_id=data.get("session_id", ""),
         )

     def from_pulsar(self, obj: AgentRequest) -> Dict[str, Any]:
@@ -23,7 +25,9 @@ class AgentRequestTranslator(MessageTranslator):
             "group": obj.group,
             "history": obj.history,
             "user": obj.user,
-            "streaming": getattr(obj, "streaming", False)
+            "collection": getattr(obj, "collection", "default"),
+            "streaming": getattr(obj, "streaming", False),
+            "session_id": getattr(obj, "session_id", ""),
         }
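The defaulting behaviour of the AgentRequestTranslator change can be sketched in isolation. This is a minimal illustration, assuming a cut-down stand-in dataclass; `AgentRequestSketch`, `to_request`, and `to_dict` are hypothetical names, and only the default values are taken from the diff.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class AgentRequestSketch:
    """Hypothetical stand-in carrying only fields touched by this diff."""
    user: str = ""
    collection: str = "default"
    streaming: bool = False
    session_id: str = ""


def to_request(data: Dict[str, Any]) -> AgentRequestSketch:
    # Same defaults as the data.get(...) calls in the diff.
    return AgentRequestSketch(
        user=data.get("user", "trustgraph"),
        collection=data.get("collection", "default"),
        streaming=data.get("streaming", False),
        session_id=data.get("session_id", ""),
    )


def to_dict(obj: Any) -> Dict[str, Any]:
    # getattr with a default tolerates objects that lack the new fields.
    return {
        "user": obj.user,
        "collection": getattr(obj, "collection", "default"),
        "streaming": getattr(obj, "streaming", False),
        "session_id": getattr(obj, "session_id", ""),
    }


legacy = to_request({"user": "alice"})  # caller unaware of the new fields
traced = to_request({"session_id": "abc-123", "collection": "research"})
```

A caller that never sends `collection` or `session_id` still gets a valid request, which suggests the new fields were meant to be backward compatible.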
@@ -38,6 +38,16 @@ class DocumentRagResponseTranslator(MessageTranslator):
         if obj.response is not None:
             result["response"] = obj.response

+        # Include explain_id for explain messages
+        explain_id = getattr(obj, "explain_id", None)
+        if explain_id:
+            result["explain_id"] = explain_id
+
+        # Include explain_graph for explain messages (named graph filter)
+        explain_graph = getattr(obj, "explain_graph", None)
+        if explain_graph is not None:
+            result["explain_graph"] = explain_graph
+
         # Include end_of_stream flag
         result["end_of_stream"] = getattr(obj, "end_of_stream", False)
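The two explainability fields in the DocumentRagResponseTranslator change are guarded differently: `explain_id` by truthiness and `explain_graph` by `is not None`. The sketch below shows the observable difference, assuming a stand-in dataclass and a result dict that starts empty; `ResponseSketch` and `to_dict` are hypothetical names, not the real translator.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ResponseSketch:
    """Hypothetical stand-in for the response fields used here."""
    response: Optional[str] = ""
    end_of_stream: bool = False
    explain_id: Optional[str] = None
    explain_graph: Optional[str] = None


def to_dict(obj: ResponseSketch) -> Dict[str, Any]:
    result: Dict[str, Any] = {}
    if obj.response is not None:
        result["response"] = obj.response
    explain_id = getattr(obj, "explain_id", None)
    if explain_id:  # truthiness check: None and "" are both dropped
        result["explain_id"] = explain_id
    explain_graph = getattr(obj, "explain_graph", None)
    if explain_graph is not None:  # identity check: "" is kept
        result["explain_graph"] = explain_graph
    result["end_of_stream"] = getattr(obj, "end_of_stream", False)
    return result


answer = to_dict(ResponseSketch(response="42", end_of_stream=True))
explain = to_dict(ResponseSketch(
    response=None,
    explain_id="urn:trustgraph:docrag:1234/synthesis",
    explain_graph="",
))
```

Keeping an empty `explain_graph` may be deliberate, since namespaces.py defines `GRAPH_DEFAULT` as the empty string; that reading is an inference from the diff, not something the commit states.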
@@ -40,11 +40,19 @@ from . uris import (
     activity_uri,
     statement_uri,
     agent_uri,
-    # Query-time provenance URIs
+    # Query-time provenance URIs (GraphRAG)
     question_uri,
     exploration_uri,
     focus_uri,
     synthesis_uri,
+    # Agent provenance URIs
+    agent_session_uri,
+    agent_iteration_uri,
+    agent_final_uri,
+    # Document RAG provenance URIs
+    docrag_question_uri,
+    docrag_exploration_uri,
+    docrag_synthesis_uri,
 )

 # Namespace constants
@@ -63,8 +71,17 @@ from . namespaces import (
     TG_CHUNK_SIZE, TG_CHUNK_OVERLAP, TG_COMPONENT_VERSION,
     TG_LLM_MODEL, TG_ONTOLOGY, TG_EMBEDDING_MODEL,
     TG_SOURCE_TEXT, TG_SOURCE_CHAR_OFFSET, TG_SOURCE_CHAR_LENGTH,
-    # Query-time provenance predicates
+    # Query-time provenance predicates (GraphRAG)
     TG_QUERY, TG_EDGE_COUNT, TG_SELECTED_EDGE, TG_REASONING, TG_CONTENT,
+    # Query-time provenance predicates (DocumentRAG)
+    TG_CHUNK_COUNT, TG_SELECTED_CHUNK,
+    # Explainability entity types
+    TG_QUESTION, TG_EXPLORATION, TG_FOCUS, TG_SYNTHESIS,
+    TG_ANALYSIS, TG_CONCLUSION,
+    # Question subtypes (to distinguish retrieval mechanism)
+    TG_GRAPH_RAG_QUESTION, TG_DOC_RAG_QUESTION, TG_AGENT_QUESTION,
+    # Agent provenance predicates
+    TG_THOUGHT, TG_ACTION, TG_ARGUMENTS, TG_OBSERVATION, TG_ANSWER,
     # Named graphs
     GRAPH_DEFAULT, GRAPH_SOURCE, GRAPH_RETRIEVAL,
 )
@@ -74,15 +91,26 @@ from . triples import (
     document_triples,
     derived_entity_triples,
     triple_provenance_triples,
-    # Query-time provenance triple builders
+    # Query-time provenance triple builders (GraphRAG)
     question_triples,
     exploration_triples,
     focus_triples,
     synthesis_triples,
+    # Query-time provenance triple builders (DocumentRAG)
+    docrag_question_triples,
+    docrag_exploration_triples,
+    docrag_synthesis_triples,
     # Utility
     set_graph,
 )

+# Agent provenance triple builders
+from . agent import (
+    agent_session_triples,
+    agent_iteration_triples,
+    agent_final_triples,
+)
+
 # Vocabulary bootstrap
 from . vocabulary import (
     get_vocabulary_triples,
@@ -107,6 +135,14 @@ __all__ = [
     "exploration_uri",
     "focus_uri",
     "synthesis_uri",
+    # Agent provenance URIs
+    "agent_session_uri",
+    "agent_iteration_uri",
+    "agent_final_uri",
+    # Document RAG provenance URIs
+    "docrag_question_uri",
+    "docrag_exploration_uri",
+    "docrag_synthesis_uri",
     # Namespaces
     "PROV", "PROV_ENTITY", "PROV_ACTIVITY", "PROV_AGENT",
     "PROV_WAS_DERIVED_FROM", "PROV_WAS_GENERATED_BY",
@@ -118,19 +154,36 @@ __all__ = [
     "TG_CHUNK_SIZE", "TG_CHUNK_OVERLAP", "TG_COMPONENT_VERSION",
     "TG_LLM_MODEL", "TG_ONTOLOGY", "TG_EMBEDDING_MODEL",
     "TG_SOURCE_TEXT", "TG_SOURCE_CHAR_OFFSET", "TG_SOURCE_CHAR_LENGTH",
-    # Query-time provenance predicates
+    # Query-time provenance predicates (GraphRAG)
     "TG_QUERY", "TG_EDGE_COUNT", "TG_SELECTED_EDGE", "TG_REASONING", "TG_CONTENT",
+    # Query-time provenance predicates (DocumentRAG)
+    "TG_CHUNK_COUNT", "TG_SELECTED_CHUNK",
+    # Explainability entity types
+    "TG_QUESTION", "TG_EXPLORATION", "TG_FOCUS", "TG_SYNTHESIS",
+    "TG_ANALYSIS", "TG_CONCLUSION",
+    # Question subtypes
+    "TG_GRAPH_RAG_QUESTION", "TG_DOC_RAG_QUESTION", "TG_AGENT_QUESTION",
+    # Agent provenance predicates
+    "TG_THOUGHT", "TG_ACTION", "TG_ARGUMENTS", "TG_OBSERVATION", "TG_ANSWER",
     # Named graphs
     "GRAPH_DEFAULT", "GRAPH_SOURCE", "GRAPH_RETRIEVAL",
     # Triple builders
     "document_triples",
     "derived_entity_triples",
     "triple_provenance_triples",
-    # Query-time provenance triple builders
+    # Query-time provenance triple builders (GraphRAG)
     "question_triples",
     "exploration_triples",
     "focus_triples",
     "synthesis_triples",
+    # Query-time provenance triple builders (DocumentRAG)
+    "docrag_question_triples",
+    "docrag_exploration_triples",
+    "docrag_synthesis_triples",
+    # Agent provenance triple builders
+    "agent_session_triples",
+    "agent_iteration_triples",
+    "agent_final_triples",
     # Utility
     "set_graph",
     # Vocabulary
trustgraph-base/trustgraph/provenance/agent.py (new file, 141 lines)
@@ -0,0 +1,141 @@
+"""
+Helper functions to build PROV-O triples for agent provenance.
+
+Agent provenance tracks the reasoning trace of ReAct agent sessions:
+- Question: The root activity with query and timestamp
+- Analysis: Each think/act/observe cycle
+- Conclusion: The final answer
+"""
+
+import json
+from datetime import datetime
+from typing import List, Optional, Dict, Any
+
+from .. schema import Triple, Term, IRI, LITERAL
+
+from . namespaces import (
+    RDF_TYPE, RDFS_LABEL,
+    PROV_ACTIVITY, PROV_ENTITY, PROV_WAS_DERIVED_FROM, PROV_STARTED_AT_TIME,
+    TG_QUERY, TG_THOUGHT, TG_ACTION, TG_ARGUMENTS, TG_OBSERVATION, TG_ANSWER,
+    TG_QUESTION, TG_ANALYSIS, TG_CONCLUSION,
+    TG_AGENT_QUESTION,
+)
+
+
+def _iri(uri: str) -> Term:
+    """Create an IRI term."""
+    return Term(type=IRI, iri=uri)
+
+
+def _literal(value) -> Term:
+    """Create a literal term."""
+    return Term(type=LITERAL, value=str(value))
+
+
+def _triple(s: str, p: str, o_term: Term) -> Triple:
+    """Create a triple with IRI subject and predicate."""
+    return Triple(s=_iri(s), p=_iri(p), o=o_term)
+
+
+def agent_session_triples(
+    session_uri: str,
+    query: str,
+    timestamp: Optional[str] = None,
+) -> List[Triple]:
+    """
+    Build triples for an agent session start (Question).
+
+    Creates:
+    - Activity declaration with tg:Question type
+    - Query text and timestamp
+
+    Args:
+        session_uri: URI of the session (from agent_session_uri)
+        query: The user's query text
+        timestamp: ISO timestamp (defaults to now)
+
+    Returns:
+        List of Triple objects
+    """
+    if timestamp is None:
+        timestamp = datetime.utcnow().isoformat() + "Z"
+
+    return [
+        _triple(session_uri, RDF_TYPE, _iri(PROV_ACTIVITY)),
+        _triple(session_uri, RDF_TYPE, _iri(TG_QUESTION)),
+        _triple(session_uri, RDF_TYPE, _iri(TG_AGENT_QUESTION)),
+        _triple(session_uri, RDFS_LABEL, _literal("Agent Question")),
+        _triple(session_uri, PROV_STARTED_AT_TIME, _literal(timestamp)),
+        _triple(session_uri, TG_QUERY, _literal(query)),
+    ]
+
+
+def agent_iteration_triples(
+    iteration_uri: str,
+    parent_uri: str,
+    thought: str,
+    action: str,
+    arguments: Dict[str, Any],
+    observation: str,
+) -> List[Triple]:
+    """
+    Build triples for one agent iteration (Analysis - think/act/observe cycle).
+
+    Creates:
+    - Entity declaration with tg:Analysis type
+    - wasDerivedFrom link to parent (previous iteration or session)
+    - Thought, action, arguments, and observation data
+
+    Args:
+        iteration_uri: URI of this iteration (from agent_iteration_uri)
+        parent_uri: URI of the parent (previous iteration or session)
+        thought: The agent's reasoning/thought
+        action: The tool/action name
+        arguments: Arguments passed to the tool (will be JSON-encoded)
+        observation: The result/observation from the tool
+
+    Returns:
+        List of Triple objects
+    """
+    triples = [
+        _triple(iteration_uri, RDF_TYPE, _iri(PROV_ENTITY)),
+        _triple(iteration_uri, RDF_TYPE, _iri(TG_ANALYSIS)),
+        _triple(iteration_uri, RDFS_LABEL, _literal(f"Analysis: {action}")),
+        _triple(iteration_uri, PROV_WAS_DERIVED_FROM, _iri(parent_uri)),
+        _triple(iteration_uri, TG_THOUGHT, _literal(thought)),
+        _triple(iteration_uri, TG_ACTION, _literal(action)),
+        _triple(iteration_uri, TG_ARGUMENTS, _literal(json.dumps(arguments))),
+        _triple(iteration_uri, TG_OBSERVATION, _literal(observation)),
+    ]
+
+    return triples
+
+
+def agent_final_triples(
+    final_uri: str,
+    parent_uri: str,
+    answer: str,
+) -> List[Triple]:
+    """
+    Build triples for an agent final answer (Conclusion).
+
+    Creates:
+    - Entity declaration with tg:Conclusion type
+    - wasDerivedFrom link to parent (last iteration or session)
+    - The answer text
+
+    Args:
+        final_uri: URI of the final answer (from agent_final_uri)
+        parent_uri: URI of the parent (last iteration or session if no iterations)
+        answer: The final answer text
+
+    Returns:
+        List of Triple objects
+    """
+    return [
+        _triple(final_uri, RDF_TYPE, _iri(PROV_ENTITY)),
+        _triple(final_uri, RDF_TYPE, _iri(TG_CONCLUSION)),
+        _triple(final_uri, RDFS_LABEL, _literal("Conclusion")),
+        _triple(final_uri, PROV_WAS_DERIVED_FROM, _iri(parent_uri)),
+        _triple(final_uri, TG_ANSWER, _literal(answer)),
+    ]
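The parent-linking convention in agent.py (each Analysis derives from the previous iteration or the session, and the Conclusion derives from the last iteration) can be sketched without the TrustGraph schema classes. This is a minimal illustration, assuming plain (subject, predicate, object) tuples and prefixed predicate names; `build_trace` and `derivation_chain` are hypothetical helpers, and only the URI formats are taken from the commit.

```python
import json

PROV_WAS_DERIVED_FROM = "prov:wasDerivedFrom"  # prefixed stand-in for the full IRI


def session_uri(session_id):
    return f"urn:trustgraph:agent:{session_id}"


def iteration_uri(session_id, n):
    return f"urn:trustgraph:agent:{session_id}/i{n}"


def final_uri(session_id):
    return f"urn:trustgraph:agent:{session_id}/final"


def build_trace(session_id, query, steps, answer):
    """Emit (s, p, o) tuples for Question -> Analysis... -> Conclusion."""
    parent = session_uri(session_id)
    triples = [(parent, "tg:query", query)]
    for n, step in enumerate(steps, start=1):
        uri = iteration_uri(session_id, n)
        triples += [
            (uri, PROV_WAS_DERIVED_FROM, parent),
            (uri, "tg:thought", step["thought"]),
            (uri, "tg:action", step["action"]),
            (uri, "tg:arguments", json.dumps(step["arguments"])),
            (uri, "tg:observation", step["observation"]),
        ]
        parent = uri  # the next node derives from this iteration
    triples.append((final_uri(session_id), PROV_WAS_DERIVED_FROM, parent))
    triples.append((final_uri(session_id), "tg:answer", answer))
    return triples


def derivation_chain(triples, start):
    """Follow prov:wasDerivedFrom links from `start` back to the root."""
    parents = {s: o for s, p, o in triples if p == PROV_WAS_DERIVED_FROM}
    chain = [start]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


steps = [
    {"thought": "Need data", "action": "search",
     "arguments": {"q": "x"}, "observation": "found"},
    {"thought": "Check it", "action": "lookup",
     "arguments": {"id": 1}, "observation": "ok"},
]
trace = build_trace("s1", "What is x?", steps, "x is ok")
chain = derivation_chain(trace, final_uri("s1"))
```

Walking the chain from the final answer back to the session is what a trace viewer would presumably do to render the derivation in order.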
@@ -59,7 +59,7 @@ TG_SOURCE_TEXT = TG + "sourceText"
 TG_SOURCE_CHAR_OFFSET = TG + "sourceCharOffset"
 TG_SOURCE_CHAR_LENGTH = TG + "sourceCharLength"

-# Query-time provenance predicates
+# Query-time provenance predicates (GraphRAG)
 TG_QUERY = TG + "query"
 TG_EDGE_COUNT = TG + "edgeCount"
 TG_SELECTED_EDGE = TG + "selectedEdge"
@@ -68,6 +68,30 @@ TG_REASONING = TG + "reasoning"
 TG_CONTENT = TG + "content"
 TG_DOCUMENT = TG + "document"  # Reference to document in librarian

+# Query-time provenance predicates (DocumentRAG)
+TG_CHUNK_COUNT = TG + "chunkCount"
+TG_SELECTED_CHUNK = TG + "selectedChunk"
+
+# Explainability entity types (shared)
+TG_QUESTION = TG + "Question"
+TG_EXPLORATION = TG + "Exploration"
+TG_FOCUS = TG + "Focus"
+TG_SYNTHESIS = TG + "Synthesis"
+TG_ANALYSIS = TG + "Analysis"
+TG_CONCLUSION = TG + "Conclusion"
+
+# Question subtypes (to distinguish retrieval mechanism)
+TG_GRAPH_RAG_QUESTION = TG + "GraphRagQuestion"
+TG_DOC_RAG_QUESTION = TG + "DocRagQuestion"
+TG_AGENT_QUESTION = TG + "AgentQuestion"
+
+# Agent provenance predicates
+TG_THOUGHT = TG + "thought"
+TG_ACTION = TG + "action"
+TG_ARGUMENTS = TG + "arguments"
+TG_OBSERVATION = TG + "observation"
+TG_ANSWER = TG + "answer"
+
 # Named graph URIs for RDF datasets
 # These separate different types of data while keeping them in the same collection
 GRAPH_DEFAULT = ""  # Core knowledge facts (triples extracted from documents)
@@ -17,9 +17,15 @@ from . namespaces import (
     TG_CHUNK_INDEX, TG_CHAR_OFFSET, TG_CHAR_LENGTH,
     TG_CHUNK_SIZE, TG_CHUNK_OVERLAP, TG_COMPONENT_VERSION,
     TG_LLM_MODEL, TG_ONTOLOGY, TG_REIFIES,
-    # Query-time provenance predicates
+    # Query-time provenance predicates (GraphRAG)
     TG_QUERY, TG_EDGE_COUNT, TG_SELECTED_EDGE, TG_EDGE, TG_REASONING, TG_CONTENT,
     TG_DOCUMENT,
+    # Query-time provenance predicates (DocumentRAG)
+    TG_CHUNK_COUNT, TG_SELECTED_CHUNK,
+    # Explainability entity types
+    TG_QUESTION, TG_EXPLORATION, TG_FOCUS, TG_SYNTHESIS,
+    # Question subtypes
+    TG_GRAPH_RAG_QUESTION, TG_DOC_RAG_QUESTION,
 )

 from . uris import activity_uri, agent_uri, edge_selection_uri
@@ -310,7 +316,9 @@ def question_triples(

     return [
         _triple(question_uri, RDF_TYPE, _iri(PROV_ACTIVITY)),
-        _triple(question_uri, RDFS_LABEL, _literal("GraphRAG question")),
+        _triple(question_uri, RDF_TYPE, _iri(TG_QUESTION)),
+        _triple(question_uri, RDF_TYPE, _iri(TG_GRAPH_RAG_QUESTION)),
+        _triple(question_uri, RDFS_LABEL, _literal("GraphRAG Question")),
         _triple(question_uri, PROV_STARTED_AT_TIME, _literal(timestamp)),
         _triple(question_uri, TG_QUERY, _literal(query)),
     ]
@@ -339,6 +347,7 @@ def exploration_triples(
     """
     return [
         _triple(exploration_uri, RDF_TYPE, _iri(PROV_ENTITY)),
+        _triple(exploration_uri, RDF_TYPE, _iri(TG_EXPLORATION)),
         _triple(exploration_uri, RDFS_LABEL, _literal("Exploration")),
         _triple(exploration_uri, PROV_WAS_GENERATED_BY, _iri(question_uri)),
         _triple(exploration_uri, TG_EDGE_COUNT, _literal(edge_count)),
@@ -383,6 +392,7 @@ def focus_triples(
     """
     triples = [
         _triple(focus_uri, RDF_TYPE, _iri(PROV_ENTITY)),
+        _triple(focus_uri, RDF_TYPE, _iri(TG_FOCUS)),
         _triple(focus_uri, RDFS_LABEL, _literal("Focus")),
         _triple(focus_uri, PROV_WAS_DERIVED_FROM, _iri(exploration_uri)),
     ]
@@ -443,6 +453,7 @@ def synthesis_triples(
     """
     triples = [
         _triple(synthesis_uri, RDF_TYPE, _iri(PROV_ENTITY)),
+        _triple(synthesis_uri, RDF_TYPE, _iri(TG_SYNTHESIS)),
         _triple(synthesis_uri, RDFS_LABEL, _literal("Synthesis")),
         _triple(synthesis_uri, PROV_WAS_DERIVED_FROM, _iri(focus_uri)),
     ]
@@ -455,3 +466,120 @@ def synthesis_triples(
         triples.append(_triple(synthesis_uri, TG_CONTENT, _literal(answer_text)))

     return triples
+
+
+# Document RAG provenance triple builders
+#
+# Document RAG uses a subset of GraphRAG's model:
+# Question - What was asked
+# Exploration - Chunks retrieved from document store
+# Synthesis - The final answer (no Focus step)
+
+def docrag_question_triples(
+    question_uri: str,
+    query: str,
+    timestamp: Optional[str] = None,
+) -> List[Triple]:
+    """
+    Build triples for a document RAG question activity.
+
+    Creates:
+    - Activity declaration with tg:Question type
+    - Query text and timestamp
+
+    Args:
+        question_uri: URI of the question (from docrag_question_uri)
+        query: The user's query text
+        timestamp: ISO timestamp (defaults to now)
+
+    Returns:
+        List of Triple objects
+    """
+    if timestamp is None:
+        timestamp = datetime.utcnow().isoformat() + "Z"
+
+    return [
+        _triple(question_uri, RDF_TYPE, _iri(PROV_ACTIVITY)),
+        _triple(question_uri, RDF_TYPE, _iri(TG_QUESTION)),
+        _triple(question_uri, RDF_TYPE, _iri(TG_DOC_RAG_QUESTION)),
+        _triple(question_uri, RDFS_LABEL, _literal("DocumentRAG Question")),
+        _triple(question_uri, PROV_STARTED_AT_TIME, _literal(timestamp)),
+        _triple(question_uri, TG_QUERY, _literal(query)),
+    ]
+
+
+def docrag_exploration_triples(
+    exploration_uri: str,
+    question_uri: str,
+    chunk_count: int,
+    chunk_ids: Optional[List[str]] = None,
+) -> List[Triple]:
+    """
+    Build triples for a document RAG exploration entity (chunks retrieved).
+
+    Creates:
+    - Entity declaration with tg:Exploration type
+    - wasGeneratedBy link to question
+    - Chunk count and optional chunk references
+
+    Args:
+        exploration_uri: URI of the exploration entity
+        question_uri: URI of the parent question
+        chunk_count: Number of chunks retrieved
+        chunk_ids: Optional list of chunk URIs/IDs
+
+    Returns:
+        List of Triple objects
+    """
+    triples = [
+        _triple(exploration_uri, RDF_TYPE, _iri(PROV_ENTITY)),
+        _triple(exploration_uri, RDF_TYPE, _iri(TG_EXPLORATION)),
+        _triple(exploration_uri, RDFS_LABEL, _literal("Exploration")),
+        _triple(exploration_uri, PROV_WAS_GENERATED_BY, _iri(question_uri)),
+        _triple(exploration_uri, TG_CHUNK_COUNT, _literal(chunk_count)),
+    ]
+
+    # Add references to selected chunks
+    if chunk_ids:
+        for chunk_id in chunk_ids:
+            triples.append(_triple(exploration_uri, TG_SELECTED_CHUNK, _iri(chunk_id)))
+
+    return triples
+
+
+def docrag_synthesis_triples(
+    synthesis_uri: str,
+    exploration_uri: str,
+    answer_text: str = "",
+    document_id: Optional[str] = None,
+) -> List[Triple]:
+    """
+    Build triples for a document RAG synthesis entity (final answer).
+
+    Creates:
+    - Entity declaration with tg:Synthesis type
+    - wasDerivedFrom link to exploration (skips focus step)
+    - Either document reference or inline content
+
+    Args:
+        synthesis_uri: URI of the synthesis entity
+        exploration_uri: URI of the parent exploration entity
+        answer_text: The synthesized answer text (used if no document_id)
+        document_id: Optional librarian document ID (preferred over inline content)
+
+    Returns:
+        List of Triple objects
+    """
+    triples = [
+        _triple(synthesis_uri, RDF_TYPE, _iri(PROV_ENTITY)),
+        _triple(synthesis_uri, RDF_TYPE, _iri(TG_SYNTHESIS)),
+        _triple(synthesis_uri, RDFS_LABEL, _literal("Synthesis")),
+        _triple(synthesis_uri, PROV_WAS_DERIVED_FROM, _iri(exploration_uri)),
+    ]
+
+    if document_id:
+        triples.append(_triple(synthesis_uri, TG_DOCUMENT, _iri(document_id)))
+    elif answer_text:
+        triples.append(_triple(synthesis_uri, TG_CONTENT, _literal(answer_text)))
+
+    return triples
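The closing branch of docrag_synthesis_triples prefers a librarian document reference over inline text and emits neither when both are empty. The sketch below isolates that selection, assuming plain tuples and prefixed predicate names; `synthesis_payload` is a hypothetical helper that mirrors only the `if`/`elif`, not the full builder.

```python
from typing import List, Optional, Tuple


def synthesis_payload(
    synthesis_uri: str,
    answer_text: str = "",
    document_id: Optional[str] = None,
) -> List[Tuple[str, str, str]]:
    """Mirror the if/elif at the end of docrag_synthesis_triples."""
    if document_id:
        # A document reference wins, even when answer text is also supplied.
        return [(synthesis_uri, "tg:document", document_id)]
    if answer_text:
        return [(synthesis_uri, "tg:content", answer_text)]
    return []  # neither supplied: only the type and derivation triples remain


uri = "urn:trustgraph:docrag:1234/synthesis"
by_reference = synthesis_payload(uri, answer_text="inline", document_id="doc-9")
inline = synthesis_payload(uri, answer_text="The answer...")
empty = synthesis_payload(uri)
```

Storing a reference instead of the text keeps large answers out of the graph, which is one plausible reason for the preference stated in the docstring.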
@@ -138,3 +138,94 @@ def edge_selection_uri(session_id: str, edge_index: int) -> str:
         URN in format: urn:trustgraph:prov:edge:{uuid}:{index}
     """
     return f"urn:trustgraph:prov:edge:{session_id}:{edge_index}"
+
+
+# Agent provenance URIs
+# These URIs use the urn:trustgraph:agent: namespace to distinguish agent
+# provenance from GraphRAG question provenance
+
+def agent_session_uri(session_id: str = None) -> str:
+    """
+    Generate URI for an agent session.
+
+    Args:
+        session_id: Optional UUID string. Auto-generates if not provided.
+
+    Returns:
+        URN in format: urn:trustgraph:agent:{uuid}
+    """
+    if session_id is None:
+        session_id = str(uuid.uuid4())
+    return f"urn:trustgraph:agent:{session_id}"
+
+
+def agent_iteration_uri(session_id: str, iteration_num: int) -> str:
+    """
+    Generate URI for an agent iteration.
+
+    Args:
+        session_id: The session UUID.
+        iteration_num: 1-based iteration number.
+
+    Returns:
+        URN in format: urn:trustgraph:agent:{uuid}/i{num}
+    """
+    return f"urn:trustgraph:agent:{session_id}/i{iteration_num}"
+
+
+def agent_final_uri(session_id: str) -> str:
+    """
+    Generate URI for an agent final answer.
+
+    Args:
+        session_id: The session UUID.
+
+    Returns:
+        URN in format: urn:trustgraph:agent:{uuid}/final
+    """
+    return f"urn:trustgraph:agent:{session_id}/final"
+
+
+# Document RAG provenance URIs
+# These URIs use the urn:trustgraph:docrag: namespace to distinguish
+# document RAG provenance from graph RAG provenance
+
+def docrag_question_uri(session_id: str = None) -> str:
+    """
+    Generate URI for a document RAG question activity.
+
+    Args:
+        session_id: Optional UUID string. Auto-generates if not provided.
+
+    Returns:
+        URN in format: urn:trustgraph:docrag:{uuid}
+    """
+    if session_id is None:
+        session_id = str(uuid.uuid4())
+    return f"urn:trustgraph:docrag:{session_id}"
+
+
+def docrag_exploration_uri(session_id: str) -> str:
+    """
+    Generate URI for a document RAG exploration entity (chunks retrieved).
+
+    Args:
+        session_id: The session UUID.
+
+    Returns:
+        URN in format: urn:trustgraph:docrag:{uuid}/exploration
+    """
+    return f"urn:trustgraph:docrag:{session_id}/exploration"
+
+
+def docrag_synthesis_uri(session_id: str) -> str:
+    """
+    Generate URI for a document RAG synthesis entity (final answer).
+
+    Args:
+        session_id: The session UUID.
+
+    Returns:
+        URN in format: urn:trustgraph:docrag:{uuid}/synthesis
+    """
+    return f"urn:trustgraph:docrag:{session_id}/synthesis"
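Because the agent URIs follow a fixed pattern, a consumer can recover the session and position of a node from its URI alone. The following is a minimal sketch of that idea, assuming the three formats documented in the docstrings (`{uuid}`, `{uuid}/i{num}`, `{uuid}/final`); `parse_agent_uri` and its regular expression are hypothetical and not part of the commit.

```python
import re
import uuid

# Matches urn:trustgraph:agent:{id}, optionally followed by /i{num} or /final.
AGENT_URI = re.compile(
    r"^urn:trustgraph:agent:(?P<session>[^/]+)(?:/(?P<part>i\d+|final))?$"
)


def parse_agent_uri(uri):
    """Return (session_id, kind, iteration_num), or None if not an agent URI."""
    m = AGENT_URI.match(uri)
    if m is None:
        return None
    part = m.group("part")
    if part is None:
        return (m.group("session"), "session", None)
    if part == "final":
        return (m.group("session"), "final", None)
    return (m.group("session"), "iteration", int(part[1:]))


sid = str(uuid.uuid4())
session = parse_agent_uri(f"urn:trustgraph:agent:{sid}")
third = parse_agent_uri(f"urn:trustgraph:agent:{sid}/i3")
final = parse_agent_uri(f"urn:trustgraph:agent:{sid}/final")
other = parse_agent_uri(f"urn:trustgraph:docrag:{sid}")
```

The distinct `agent:` and `docrag:` namespaces mean a Document RAG URI is rejected here rather than misread as an agent session.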
@@ -23,7 +23,9 @@ class AgentRequest:
     group: list[str] | None = None
     history: list[AgentStep] = field(default_factory=list)
     user: str = ""  # User context for multi-tenancy
-    streaming: bool = False  # NEW: Enable streaming response delivery (default false)
+    collection: str = "default"  # Collection for provenance traces
+    streaming: bool = False  # Enable streaming response delivery (default false)
+    session_id: str = ""  # For provenance tracking across iterations

 @dataclass
 class AgentResponse:
@@ -42,5 +42,7 @@ class DocumentRagQuery:
 @dataclass
 class DocumentRagResponse:
     error: Error | None = None
-    response: str = ""
+    response: str | None = ""
     end_of_stream: bool = False
+    explain_id: str | None = None  # Single explain URI (announced as created)
+    explain_graph: str | None = None  # Named graph where explain was stored (e.g., urn:graph:retrieval)