Update tech spec

2026-05-25 23:35:12 +02:00 · 2026-03-10 10:06:57 +00:00 · 2026-03-10 10:06:57 +00:00 · 34968c801b
commit 34968c801b
parent 7a6197d8c3
2 changed files with 263 additions and 282 deletions
--- a/docs/tech-specs/query-time-explainability.md
+++ b/docs/tech-specs/query-time-explainability.md
@@ -0,0 +1,263 @@
 # Query-Time Explainability
 ## Status
 Implemented
 ## Overview
 This specification describes how GraphRAG records and communicates explainability data during query execution. The goal is full traceability: from final answer back through selected edges to source documents.
 Query-time explainability captures what the GraphRAG pipeline did during reasoning. It connects to extraction-time provenance, which records where knowledge graph facts originated.
 ## Terminology
 | Term | Definition |
 |------|------------|
 | **Explainability** | The record of how a result was derived |
 | **Session** | A single GraphRAG query execution |
 | **Edge Selection** | LLM-driven selection of relevant edges with reasoning |
 | **Provenance Chain** | Path from edge → chunk → page → document |
 ## Architecture
 ### Explainability Flow
 ```
 GraphRAG Query
    │
    ├─► Session Activity
    │       └─► Query text, timestamp
    │
    ├─► Retrieval Entity
    │       └─► All edges retrieved from subgraph
    │
    ├─► Selection Entity
    │       └─► Selected edges with LLM reasoning
    │           └─► Each edge links to extraction provenance
    │
    └─► Answer Entity
            └─► Reference to synthesized response (in librarian)
 ```
 ### Two-Stage GraphRAG Pipeline
 1. **Edge Selection**: LLM selects relevant edges from subgraph, providing reasoning for each
 2. **Synthesis**: LLM generates answer from selected edges only
 This separation enables explainability: because synthesis sees only the selected edges, we know exactly which edges contributed to the answer.
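
The two stages can be sketched as follows. This is a minimal illustration and not the TrustGraph implementation: `select_edges`, `synthesize`, and the two stand-in functions that replace the LLM calls are hypothetical names introduced here.

```python
from typing import Callable

Edge = tuple[str, str, str]

def select_edges(
    subgraph: list[Edge],
    choose: Callable[[list[Edge]], dict[int, str]],
) -> list[tuple[Edge, str]]:
    """Stage 1: keep only the edges the selector picked, with its reasoning."""
    picks = choose(subgraph)  # hypothetical LLM call: edge index -> reasoning
    return [(subgraph[i], why) for i, why in sorted(picks.items())]

def synthesize(
    selected: list[tuple[Edge, str]],
    generate: Callable[[list[Edge]], str],
) -> str:
    """Stage 2: the generator sees the selected edges and nothing else."""
    return generate([edge for edge, _ in selected])

# Stand-ins for the two LLM calls, so the sketch runs without a model.
def fake_choose(edges: list[Edge]) -> dict[int, str]:
    return {1: "Mentions the queried entity directly"}

def fake_generate(edges: list[Edge]) -> str:
    return f"Answer built from {len(edges)} edge(s)"

subgraph = [
    ("A", "relatedTo", "B"),
    ("B", "definition", "A thing"),
    ("C", "partOf", "D"),
]
selected = select_edges(subgraph, fake_choose)
answer = synthesize(selected, fake_generate)
```

Because `synthesize` never receives the full subgraph, the list returned by `select_edges` is a complete record of the evidence behind the answer.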
 ### Storage
 - Explainability triples are stored in a configurable collection (default: `explainability`)
 - Provenance relationships use the PROV-O ontology
 - Edge references use RDF-star reification
 - Answer content is stored in the librarian service rather than inline, because it is too large
 ### Real-Time Streaming
 Explainability events stream to the client as the query executes:
 1. Session created → event emitted
 2. Edges retrieved → event emitted
 3. Edges selected with reasoning → event emitted
 4. Answer synthesized → event emitted
 The client receives `explain_id` and `explain_collection`, which it can use to fetch full details.
 ## URI Structure
 All URIs use the `urn:trustgraph:` namespace with UUIDs:
 | Entity | URI Pattern |
 |--------|-------------|
 | Session | `urn:trustgraph:session:{uuid}` |
 | Retrieval | `urn:trustgraph:prov:retrieval:{uuid}` |
 | Selection | `urn:trustgraph:prov:selection:{uuid}` |
 | Answer | `urn:trustgraph:prov:answer:{uuid}` |
 | Edge Selection | `urn:trustgraph:prov:edge:{uuid}:{index}` |
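
A minimal sketch of generators for the patterns in the table. The function names are hypothetical and need not match those in `uris.py`. Sharing one UUID across all URIs of a session is an assumption, suggested by the CLI output example but not stated by the spec.

```python
import uuid

def new_session_id() -> str:
    # Random UUID; assumed here to be shared by all URIs of one session.
    return str(uuid.uuid4())

def session_uri(session_id: str) -> str:
    return f"urn:trustgraph:session:{session_id}"

def retrieval_uri(session_id: str) -> str:
    return f"urn:trustgraph:prov:retrieval:{session_id}"

def selection_uri(session_id: str) -> str:
    return f"urn:trustgraph:prov:selection:{session_id}"

def answer_uri(session_id: str) -> str:
    return f"urn:trustgraph:prov:answer:{session_id}"

def edge_selection_uri(session_id: str, index: int) -> str:
    # One URI per selected edge, distinguished by its position in the selection.
    return f"urn:trustgraph:prov:edge:{session_id}:{index}"

sid = new_session_id()
session_uris = [session_uri(sid), retrieval_uri(sid), selection_uri(sid), answer_uri(sid)]
```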
 ## RDF Model (PROV-O)
 ### Session Activity
 ```turtle
 <session-uri> a prov:Activity ;
    rdfs:label "GraphRAG query session" ;
    prov:startedAtTime "2024-01-15T10:30:00Z" ;
    tg:query "What was the War on Terror?" .
 ```
 ### Retrieval Entity
 ```turtle
 <retrieval-uri> a prov:Entity ;
    rdfs:label "Retrieved edges" ;
    prov:wasGeneratedBy <session-uri> ;
    tg:edgeCount 50 .
 ```
 ### Selection Entity
 ```turtle
 <selection-uri> a prov:Entity ;
    rdfs:label "Selected edges" ;
    prov:wasDerivedFrom <retrieval-uri> ;
    tg:selectedEdge <edge-sel-0> ;
    tg:selectedEdge <edge-sel-1> .
 <edge-sel-0> tg:edge << <s> <p> <o> >> ;
    tg:reasoning "This edge establishes the key relationship..." .
 ```
 ### Answer Entity
 ```turtle
 <answer-uri> a prov:Entity ;
    rdfs:label "GraphRAG answer" ;
    prov:wasDerivedFrom <selection-uri> ;
    tg:document <urn:trustgraph:answer:{uuid}> .
 ```
 The `tg:document` references the answer stored in the librarian service.
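
To make the model concrete, the sketch below assembles the Selection Entity as plain Python tuples. It is an illustration under stated assumptions: `selection_triples` is a hypothetical helper, not a function from `triples.py`, and a quoted triple is represented as a nested tuple in the object position rather than through a real RDF library.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
PROV_ENTITY = "http://www.w3.org/ns/prov#Entity"
PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"
TG_SELECTED_EDGE = "https://trustgraph.ai/ns/selectedEdge"
TG_EDGE = "https://trustgraph.ai/ns/edge"
TG_REASONING = "https://trustgraph.ai/ns/reasoning"

def selection_triples(session_id, selected):
    """Build triples for a selection; `selected` is a list of ((s, p, o), reasoning)."""
    sel = f"urn:trustgraph:prov:selection:{session_id}"
    ret = f"urn:trustgraph:prov:retrieval:{session_id}"
    triples = [
        (sel, RDF_TYPE, PROV_ENTITY),
        (sel, RDFS_LABEL, "Selected edges"),
        (sel, PROV_WAS_DERIVED_FROM, ret),
    ]
    for index, (edge, reasoning) in enumerate(selected):
        edge_sel = f"urn:trustgraph:prov:edge:{session_id}:{index}"
        triples.append((sel, TG_SELECTED_EDGE, edge_sel))
        # The nested tuple stands in for the RDF-star quoted triple << s p o >>.
        triples.append((edge_sel, TG_EDGE, edge))
        triples.append((edge_sel, TG_REASONING, reasoning))
    return triples

example = selection_triples(
    "abc123",
    [(("http://example.org/s", "http://example.org/p", "value"),
      "This edge establishes the key relationship")],
)
```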
 ## Namespace Constants
 Defined in `trustgraph-base/trustgraph/provenance/namespaces.py`:
 | Constant | URI |
 |----------|-----|
 | `TG_QUERY` | `https://trustgraph.ai/ns/query` |
 | `TG_EDGE_COUNT` | `https://trustgraph.ai/ns/edgeCount` |
 | `TG_SELECTED_EDGE` | `https://trustgraph.ai/ns/selectedEdge` |
 | `TG_EDGE` | `https://trustgraph.ai/ns/edge` |
 | `TG_REASONING` | `https://trustgraph.ai/ns/reasoning` |
 | `TG_CONTENT` | `https://trustgraph.ai/ns/content` |
 | `TG_DOCUMENT` | `https://trustgraph.ai/ns/document` |
 ## GraphRagResponse Schema
 ```python
 @dataclass
 class GraphRagResponse:
    error: Error | None = None
    response: str = ""
    end_of_stream: bool = False
    explain_id: str | None = None
    explain_collection: str | None = None
    message_type: str = ""  # "chunk" or "explain"
    end_of_session: bool = False
 ```
 ### Message Types
 | message_type | Purpose |
 |--------------|---------|
 | `chunk` | Response text (streaming or final) |
 | `explain` | Explainability event with IRI reference |
 ### Session Lifecycle
 1. Multiple `explain` messages (session, retrieval, selection, answer)
 2. Multiple `chunk` messages (streaming response)
 3. Final `chunk` with `end_of_session=True`
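
A client loop for this lifecycle might look like the sketch below. `Message` is a local stand-in that carries only the `GraphRagResponse` fields used here, and `consume` is a hypothetical helper. How messages arrive from the transport is outside the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class Message:
    """Local stand-in for GraphRagResponse with only the fields used here."""
    message_type: str = ""
    response: str = ""
    explain_id: Optional[str] = None
    explain_collection: Optional[str] = None
    end_of_session: bool = False

def consume(messages: Iterable[Message]):
    """Collect explain references and response text until the session ends."""
    explain_refs = []
    text_parts = []
    for msg in messages:
        if msg.message_type == "explain":
            explain_refs.append((msg.explain_collection, msg.explain_id))
        elif msg.message_type == "chunk":
            text_parts.append(msg.response)
        if msg.end_of_session:
            break  # anything after the final chunk is ignored
    return explain_refs, "".join(text_parts)

stream = [
    Message("explain", explain_id="urn:trustgraph:session:abc123", explain_collection="explainability"),
    Message("explain", explain_id="urn:trustgraph:prov:answer:abc123", explain_collection="explainability"),
    Message("chunk", response="Based on "),
    Message("chunk", response="the statements...", end_of_session=True),
    Message("chunk", response="never read"),
]
refs, text = consume(stream)
```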
 ## Edge Selection Format
 The LLM returns JSONL, one selected edge per line:
 ```jsonl
 {"id": "edge-hash-1", "reasoning": "This edge shows the key relationship..."}
 {"id": "edge-hash-2", "reasoning": "Provides supporting evidence..."}
 ```
 The `id` is a hash of `(labeled_s, labeled_p, labeled_o)` computed by `edge_id()`.
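
The sketch below shows one way to parse this output defensively. The spec does not say which hash `edge_id()` uses, so the truncated SHA-256 here is an assumption for illustration only, and `parse_selection` is a hypothetical helper name.

```python
import hashlib
import json

def edge_id(s: str, p: str, o: str) -> str:
    # Assumption: SHA-256 over the JSON-encoded labels, truncated to 16 hex
    # characters. The real edge_id() may hash differently.
    payload = json.dumps([s, p, o], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

def parse_selection(jsonl_text: str, known_ids: set) -> list:
    """Return (id, reasoning) pairs for lines that name a known edge."""
    selected = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON lines in LLM output
        if isinstance(record, dict) and record.get("id") in known_ids:
            selected.append((record["id"], record.get("reasoning", "")))
    return selected

id_a = edge_id("Guantanamo", "definition", "A detention facility")
id_b = edge_id("A", "relatedTo", "B")
llm_output = "\n".join([
    json.dumps({"id": id_a, "reasoning": "Key relationship"}),
    "Here are some more edges:",
    json.dumps({"id": "not-a-known-edge", "reasoning": "Hallucinated"}),
    json.dumps({"id": id_b, "reasoning": "Supporting evidence"}),
])
picked = parse_selection(llm_output, {id_a, id_b})
```

Dropping ids that were never offered to the LLM keeps hallucinated edges out of the selection.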
 ## URI Preservation
 ### The Problem
 GraphRAG displays human-readable labels to the LLM, but explainability needs original URIs for provenance tracing.
 ### Solution
 `get_labelgraph()` returns both:
 - `labeled_edges`: List of `(label_s, label_p, label_o)` for LLM
 - `uri_map`: Dict mapping `edge_id(labels)` → `(uri_s, uri_p, uri_o)`
 When storing explainability data, URIs from `uri_map` are used.
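
The sketch below illustrates the two return values. `build_label_view` is a hypothetical stand-in for the relevant part of `get_labelgraph()`, the `edge_id` hash shown is an assumption, and falling back to the raw term when no label exists is a choice made for this example.

```python
import hashlib
import json

def edge_id(s: str, p: str, o: str) -> str:
    # Assumed hash for illustration; the real edge_id() may differ.
    return hashlib.sha256(json.dumps([s, p, o]).encode("utf-8")).hexdigest()[:16]

def build_label_view(uri_edges, labels):
    """Return (labeled_edges, uri_map) for a list of URI triples."""
    labeled_edges = []
    uri_map = {}
    for s, p, o in uri_edges:
        # Fall back to the raw term when no label is known (e.g. literals).
        labeled = tuple(labels.get(term, term) for term in (s, p, o))
        labeled_edges.append(labeled)
        uri_map[edge_id(*labeled)] = (s, p, o)
    return labeled_edges, uri_map

uri_edges = [
    ("http://example.org/guantanamo", "http://example.org/definition", "A detention facility"),
]
labels = {
    "http://example.org/guantanamo": "Guantanamo",
    "http://example.org/definition": "definition",
}
labeled_edges, uri_map = build_label_view(uri_edges, labels)
```

In this sketch, two distinct URI triples that resolve to identical labels would share one id, with the later entry overwriting the earlier one in `uri_map`.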
 ## Provenance Tracing
 ### From Edge to Source
 Selected edges can be traced back to source documents:
 1. Query for reifying statement: `?stmt tg:reifies <<s p o>>`
 2. Follow `prov:wasDerivedFrom` chain to root document
 3. The chain runs chunk → page → document
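
The trace can be sketched over an in-memory list of triples. The full URI used for `tg:reifies` is an assumption, since that predicate is not listed in the namespace table. `trace_to_source` is a hypothetical helper, and a real implementation would query the triple store instead of scanning a list.

```python
PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"
TG_REIFIES = "https://trustgraph.ai/ns/reifies"  # assumed URI for tg:reifies

def trace_to_source(triples, edge):
    """Return one provenance chain per statement that reifies `edge`."""
    chains = []
    for stmt, predicate, obj in triples:
        if predicate != TG_REIFIES or obj != edge:
            continue
        chain, node, seen = [], stmt, {stmt}
        while True:
            parents = [o for s, p, o in triples
                       if s == node and p == PROV_WAS_DERIVED_FROM]
            # Stop at the root document, or if the data contains a cycle.
            if not parents or parents[0] in seen:
                break
            node = parents[0]
            seen.add(node)
            chain.append(node)
        chains.append(chain)
    return chains

edge = ("http://example.org/s", "http://example.org/p", "value")
triples = [
    ("urn:stmt:1", TG_REIFIES, edge),
    ("urn:stmt:1", PROV_WAS_DERIVED_FROM, "urn:chunk:1"),
    ("urn:chunk:1", PROV_WAS_DERIVED_FROM, "urn:page:2"),
    ("urn:page:2", PROV_WAS_DERIVED_FROM, "urn:doc:1"),
]
chains = trace_to_source(triples, edge)
```

The sketch follows only the first `prov:wasDerivedFrom` parent at each step, which matches the linear chunk → page → document chain described above.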
 ### Cassandra Quoted Triple Support
 The Cassandra query service supports matching quoted triples:
 ```python
 # In get_term_value():
 elif term.type == TRIPLE:
    return serialize_triple(term.triple)
 ```
 This enables queries like:
 ```
 ?stmt tg:reifies <<http://example.org/s http://example.org/p "value">>
 ```
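
The idea is that a quoted triple is reduced to a single comparable value, so it can be matched like any other term. The sketch below illustrates that idea only: the `Term` class, the type tags, and the JSON encoding are assumptions, and the names `get_term_value` and `serialize_triple` are reused from the excerpt purely for readability.

```python
import json
from dataclasses import dataclass
from typing import Optional

IRI, LITERAL, TRIPLE = "iri", "literal", "triple"  # assumed type tags

@dataclass(frozen=True)
class Term:
    type: str
    value: str = ""
    triple: Optional[tuple] = None  # (Term, Term, Term) when type == TRIPLE

def serialize_triple(triple) -> str:
    # Assumed encoding: a JSON array of the three term values.
    return json.dumps([get_term_value(t) for t in triple])

def get_term_value(term: Term) -> str:
    if term.type == TRIPLE:
        return serialize_triple(term.triple)
    return term.value

quoted = Term(TRIPLE, triple=(
    Term(IRI, "http://example.org/s"),
    Term(IRI, "http://example.org/p"),
    Term(LITERAL, "value"),
))
stored_value = get_term_value(quoted)
```

This encoding does not distinguish an IRI from a literal with the same text, which a production serializer would need to do.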
 ## CLI Usage
 ```bash
 tg-invoke-graph-rag --explainable -q "What was the War on Terror?"
 ```
 ### Output Format
 ```
 [session] urn:trustgraph:session:abc123
 [retrieval] urn:trustgraph:prov:retrieval:abc123
 [selection] urn:trustgraph:prov:selection:abc123
    Selected 12 edge(s)
      Edge: (Guantanamo, definition, A detention facility...)
        Reason: Directly connects Guantanamo to the War on Terror
        Source: Chunk 1 → Page 2 → Beyond the Vigilant State
 [answer] urn:trustgraph:prov:answer:abc123
 Based on the provided knowledge statements...
 ```
 ### Features
 - Real-time explainability events during query
 - Label resolution for edge components via `rdfs:label`
 - Source chain tracing via `prov:wasDerivedFrom`
 - Label caching to avoid repeated queries
 ## Files Implemented
 | File | Purpose |
 |------|---------|
 | `trustgraph-base/trustgraph/provenance/uris.py` | URI generators |
 | `trustgraph-base/trustgraph/provenance/namespaces.py` | RDF namespace constants |
 | `trustgraph-base/trustgraph/provenance/triples.py` | Triple builders |
 | `trustgraph-base/trustgraph/schema/services/retrieval.py` | GraphRagResponse schema |
 | `trustgraph-flow/trustgraph/retrieval/graph_rag/graph_rag.py` | Core GraphRAG with URI preservation |
 | `trustgraph-flow/trustgraph/retrieval/graph_rag/rag.py` | Service with librarian integration |
 | `trustgraph-flow/trustgraph/query/triples/cassandra/service.py` | Quoted triple query support |
 | `trustgraph-cli/trustgraph/cli/invoke_graph_rag.py` | CLI with explainability display |
 ## References
 - PROV-O (W3C Provenance Ontology): https://www.w3.org/TR/prov-o/
 - RDF-star: https://w3c.github.io/rdf-star/
 - Extraction-time provenance: `docs/tech-specs/extraction-time-provenance.md`
--- a/docs/tech-specs/query-time-provenance.md
+++ b/docs/tech-specs/query-time-provenance.md
@@ -1,282 +0,0 @@
 # Query-Time Provenance: Agent Explainability
 ## Status
 Draft - Gathering Requirements
 ## Overview
 This specification defines how the agent framework records and communicates provenance during query execution. The goal is full explainability: tracing how a result was obtained, from final answer back through reasoning steps to source data.
 Query-time provenance captures the "inference layer": what the agent did during reasoning. It connects to extraction-time provenance (the source layer), which records where facts originally came from.
 ## Terminology
 | Term | Definition |
 |------|------------|
 | **Provenance** | The record of how a result was derived |
 | **Provenance Node** | A single step or artifact in the provenance DAG |
 | **Provenance DAG** | Directed Acyclic Graph of provenance relationships |
 | **Query-time Provenance** | Provenance generated during agent reasoning |
 | **Extraction-time Provenance** | Provenance from data ingestion (source metadata) - separate spec |
 ## Architecture
 ### Two Provenance Contexts
 1. **Extraction-time** (out of scope for this spec):
   - Generated when data is ingested (PDF extraction, web scraping, etc.)
   - Records: source URL, extraction method, timestamps, funding, authorship
   - Already partially implemented via source metadata in knowledge graph
   - See: `docs/tech-specs/extraction-time-provenance.md` (notes)
 2. **Query-time** (this spec):
   - Generated during agent reasoning
   - Records: tool invocations, retrieval results, LLM reasoning, final conclusions
   - Links to extraction-time provenance for retrieved facts
 ### Provenance Flow
 ```
 Agent Session
    │
    ├─► Tool: Knowledge Query
    │       │
    │       ├─► Retrieved Fact A ──► [link to extraction provenance]
    │       └─► Retrieved Fact B ──► [link to extraction provenance]
    │
    ├─► LLM Reasoning Step
    │       │
    │       └─► "Combined A and B to conclude X"
    │
    └─► Final Answer
            │
            └─► Derived from reasoning step above
 ```
 ### Storage
 - Provenance stored in knowledge graph infrastructure
 - Segregated in a **separate collection** for distinct retrieval patterns
 - Query-time provenance references extraction-time provenance nodes via IRIs
 - Persists beyond agent session (reusable, auditable)
 ### Real-Time Streaming
 Provenance events stream back to the client as the agent works:
 1. Agent invokes tool
 2. Tool generates provenance data
 3. Provenance stored in graph
 4. Provenance event sent to client
 5. UX builds provenance visualization incrementally
 ## Provenance Node Structure
 Each provenance node represents a step in the reasoning process.
 ### Node Identity
 Provenance nodes are identified by IRIs containing UUIDs, consistent with the RDF-style knowledge graph:
 ```
 urn:trustgraph:prov:550e8400-e29b-41d4-a716-446655440000
 ```
 ### Core Fields
 | Field | Description |
 |-------|-------------|
 | `id` | IRI with UUID (e.g., `urn:trustgraph:prov:{uuid}`) |
 | `session_id` | Agent session this belongs to |
 | `timestamp` | When this step occurred |
 | `type` | Node type (see below) |
 | `derived_from` | List of parent node IRIs (DAG edges) |
 ### Node Types
 | Type | Description | Additional Fields |
 |------|-------------|-------------------|
 | `retrieval` | Facts retrieved from knowledge graph | `facts`, `source_refs` |
 | `tool_invocation` | Tool was called | `tool_name`, `input`, `output` |
 | `reasoning` | LLM reasoning step | `prompt_summary`, `conclusion` |
 | `answer` | Final answer produced | `content` |
 ### Example Provenance Nodes
 ```json
 {
  "id": "urn:trustgraph:prov:550e8400-e29b-41d4-a716-446655440001",
  "session_id": "urn:trustgraph:session:7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "timestamp": "2024-01-15T10:30:00Z",
  "type": "retrieval",
  "derived_from": [],
  "facts": [
    {
      "id": "urn:trustgraph:fact:9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d",
      "content": "Swallow airspeed is 8.5 m/s"
    }
  ],
  "source_refs": ["urn:trustgraph:extract:1b9d6bcd-bbfd-4b2d-9b5d-ab8dfbbd4bed"]
 }
 ```
 ```json
 {
  "id": "urn:trustgraph:prov:550e8400-e29b-41d4-a716-446655440002",
  "session_id": "urn:trustgraph:session:7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "timestamp": "2024-01-15T10:30:01Z",
  "type": "reasoning",
  "derived_from": ["urn:trustgraph:prov:550e8400-e29b-41d4-a716-446655440001"],
  "prompt_summary": "Asked to determine average swallow speed",
  "conclusion": "Based on retrieved data, average speed is 8.5 m/s"
 }
 ```
 ## Provenance Events
 Events streamed to the client during agent execution.
 ### Design: Lightweight Reference Events
 Provenance events are lightweight: they reference provenance nodes by IRI rather than embedding full provenance data. This keeps the stream efficient while allowing the client to fetch full details if needed.
 A single agent step may create or modify multiple provenance objects. The event references all of them.
 ### Event Structure
 ```json
 {
  "provenance_refs": [
    "urn:trustgraph:prov:550e8400-e29b-41d4-a716-446655440001",
    "urn:trustgraph:prov:550e8400-e29b-41d4-a716-446655440002"
  ]
 }
 ```
 ### Integration with Agent Response
 Provenance events extend `AgentResponse` with a new `chunk_type: "provenance"`:
 ```json
 {
  "chunk_type": "provenance",
  "content": "",
  "provenance_refs": ["urn:trustgraph:prov:..."],
  "end_of_message": false
 }
 ```
 This allows provenance updates to flow alongside existing chunk types (`thought`, `observation`, `answer`, `error`).
 ## Tool Provenance Reporting
 Tools report provenance as part of their execution.
 ### Minimum Reporting (all tools)
 Every tool can report at minimum:
 - Tool name
 - Input arguments
 - Output result
 ### Enhanced Reporting (tools that can describe more)
 Tools that understand their internals can report:
 - What sources were consulted
 - What reasoning/transformation was applied
 - Confidence scores
 - Links to extraction-time provenance
 ### Graceful Degradation
 Tools that can't provide detailed provenance still participate:
 ```json
 {
  "type": "tool_invocation",
  "tool_name": "calculator",
  "input": {"expression": "8 + 5"},
  "output": "13",
  "detail_level": "basic"
 }
 ```
 ## Design Decisions
 ### Provenance Node Identity: IRIs with UUIDs
 Provenance nodes use IRIs containing UUIDs, consistent with the RDF-style knowledge graph:
 - Format: `urn:trustgraph:prov:{uuid}`
 - Globally unique, persistent across sessions
 - Can be dereferenced to retrieve full node data
 ### Storage Segregation: Separate Collection
 Provenance is stored in a separate collection within the knowledge graph infrastructure. This allows:
 - Distinct retrieval patterns for provenance vs. data
 - Independent scaling/retention policies
 - Clear separation of concerns
 ### Client Protocol: Extended AgentResponse
 Provenance events extend `AgentResponse` with `chunk_type: "provenance"`. Events are lightweight, containing only IRI references to provenance nodes created/modified in the step.
 ### Retrieval Granularity: Flexible, Multiple Objects Per Step
 A single agent step can create multiple provenance objects. The provenance event references all objects created or modified. This handles cases like:
 - Retrieval returning multiple facts (each gets a provenance node)
 - Tool invocation creating both an invocation node and result nodes
 ### Graph Structure: True DAG
 The provenance structure is a DAG (not a tree):
 - A provenance node can have multiple parents (e.g., reasoning combines facts A and B)
 - Extraction-time nodes can be referenced by multiple query-time sessions
 - Enables proper modeling of how conclusions derive from multiple sources
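
The DAG property can be illustrated with a short traversal over `derived_from` links. This is a sketch that assumes nodes are held in a dict mapping each node IRI to its list of parent IRIs. The `ancestors` helper and the short IRIs are hypothetical.

```python
def ancestors(derived_from: dict, node_id: str) -> set:
    """Return every node reachable from `node_id` via derived_from links."""
    visited = set()
    stack = list(derived_from.get(node_id, []))
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # a shared parent is visited once, as expected in a DAG
        visited.add(current)
        stack.extend(derived_from.get(current, []))
    return visited

derived_from = {
    "urn:trustgraph:prov:answer": ["urn:trustgraph:prov:reasoning"],
    "urn:trustgraph:prov:reasoning": [
        "urn:trustgraph:prov:fact-a",
        "urn:trustgraph:prov:fact-b",
    ],
    "urn:trustgraph:prov:fact-a": [],
    "urn:trustgraph:prov:fact-b": [],
}
answer_sources = ancestors(derived_from, "urn:trustgraph:prov:answer")
```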
 ### Linking to Extraction Provenance: Direct IRI Reference
 Query-time provenance references extraction-time provenance via direct IRI links in the `source_refs` field. No separate linking mechanism needed.
 ## Open Questions
 ### Provenance Retrieval API
 Base layer uses the existing knowledge graph API to query the provenance collection. A higher-level service may be added to provide convenience methods. Details TBD during implementation.
 ### Provenance Node Granularity
 Placeholder to explore: What level of detail should different node types capture?
 - Should `reasoning` nodes include the full LLM prompt, or just a summary?
 - How much of tool input/output to store?
 - Trade-offs between completeness and storage/performance
 ### Provenance Retention
 TBD - retention policy to be determined:
 - Indefinitely?
 - Tied to session retention?
 - Configurable per collection?
 ## Implementation Considerations
 ### Files Likely Affected
 | Area | Changes |
 |------|---------|
 | Agent service | Generate provenance events |
 | Tool implementations | Report provenance data |
 | Agent response schema | Add provenance event type |
 | Knowledge graph | Provenance storage/retrieval |
 ### Backward Compatibility
 - Existing agent clients continue to work (provenance is additive)
 - Tools that don't report provenance still function
 ## References
 - PROV-O (PROV-Ontology): W3C standard for provenance modeling
 - Current agent implementation: `trustgraph-flow/trustgraph/agent/react/`
 - Agent schemas: `trustgraph-base/trustgraph/schema/services/agent.py`
 - Extraction-time provenance notes: `docs/tech-specs/extraction-time-provenance.md`