Wire the FlashRank reranker subsystem from #1005 into Document-RAG: after
vector retrieval, over-fetch a wider candidate pool, rerank with the
cross-encoder, and keep the top doc_limit chunks for synthesis.
Per maintainer review, the fetch and select sizes are two caller-controlled
limits rather than one internal heuristic:
- doc_limit: chunks selected into the synthesis prompt (unchanged meaning).
- fetch_limit: candidate pool pulled from the vector store before reranking.
0 = derive (OVERFETCH_FACTOR x doc_limit); values below doc_limit are
raised to it. Lets the caller control how hard the reranker has to work.
Details:
- schema: DocumentRagQuery.fetch_limit (additive, backward compatible).
- document_rag.py / rag.py: fetch_limit resolved in the processor (mirrors
doc_limit); the core applies the heuristic default and derives synthesis
provenance from the chunk-selection focus when reranking ran.
- provenance: tg:ChunkSelection focus stage (mirrors tg:EdgeSelection).
- request translator + client SDKs + CLI: fetch-limit / --fetch-limit,
threaded exactly like doc_limit and the GraphRAG limits.
- tests: no-op identity, over-fetch/narrow, explicit fetch_limit, heuristic
default, floor-at-doc_limit, provenance lineage, cross-repo topic wiring.
Reranking is skipped byte-identically when no reranker role is wired.
Requires the companion trustgraph-templates change wiring the reranker
topics into the document-rag flow (mirrors #279 for GraphRAG).
Replace the three-prompt LLM scoring pipeline (kg-edge-scoring,
kg-edge-reasoning, kg-edge-selection) with a cross-encoder reranker
service backed by FlashRank. The new hop_and_filter() method performs
iterative graph traversal with semantic scoring at each hop, replacing
the previous follow_edges/get_subgraph approach.
- Add reranker service (trustgraph-base client/service, FlashRank processor)
- Add gateway dispatch for reranker via API and WebSocket
- Rewrite GraphRAG pipeline: hop_and_filter() with per-hop cross-encoder scoring
- Remove kg_prompt() and edge_score_limit from prompt client
- Update provenance: add tg:EdgeSelection type, tg:concept, tg:score predicates
- Update CLIs (tg-invoke-graph-rag, tg-show-explain-trace) for new metadata
- Add tg-invoke-reranker CLI tool
- Add tech spec and UX developer guidance
- Update all unit and integration tests
Python 3.14 deprecates datetime.utcnow(). Replace all 9 occurrences with
datetime.now(timezone.utc) and normalize the output to preserve the existing
ISO-8601 "Z"-suffixed format so downstream parsers are unaffected.
Fixes#814
Addresses recommendations from the UX developer's agent experience report.
Adds provenance predicates, DAG structure changes, error resilience, and
a published OWL ontology.
Explainability additions:
- Tool candidates: tg:toolCandidate on Analysis events lists the tools
visible to the LLM for each iteration (names only, descriptions in config)
- Termination reason: tg:terminationReason on Conclusion/Synthesis events
(final-answer, plan-complete, subagents-complete)
- Step counter: tg:stepNumber on iteration events
- Pattern decision: new tg:PatternDecision entity in the DAG between
session and first iteration, carrying tg:pattern and tg:taskType
- Latency: tg:llmDurationMs on Analysis events, tg:toolDurationMs on
Observation events
- Token counts on events: tg:inToken/tg:outToken/tg:llmModel on
Grounding, Focus, Synthesis, and Analysis events
- Tool/parse errors: tg:toolError on Observation events with tg:Error
mixin type. Parse failures return as error observations instead of
crashing the agent, giving it a chance to retry.
Envelope unification:
- Rename chunk_type to message_type across AgentResponse schema,
translator, SDK types, socket clients, CLI, and all tests.
Agent and RAG services now both use message_type on the wire.
Ontology:
- specs/ontology/trustgraph.ttl — OWL vocabulary covering all 26 classes,
7 object properties, and 36+ datatype properties including new predicates.
DAG structure tests:
- tests/unit/test_provenance/test_dag_structure.py verifies the
wasDerivedFrom chain for GraphRAG, DocumentRAG, and all three agent
patterns (react, plan, supervisor) including the pattern-decision link.
The triples client returns Uri/Literal (str subclasses), not Term
objects. _quoted_triple() treated all values as IRIs, so literal
objects like skos:definition values were mistyped in focus
provenance events, and trace_source_documents could not match
them in the store.
Added to_term() to convert Uri/Literal back to Term, threaded a
term_map from follow_edges_batch through
get_subgraph/get_labelgraph into uri_map, and updated
_quoted_triple to accept Term objects directly.
Wire message_id on all answer chunks, fix DAG structure message_id:
- Add message_id to AgentAnswer dataclass and propagate in
socket_client._parse_chunk
- Wire message_id into answer callbacks and send_final_response
for all three patterns (react, plan-then-execute, supervisor)
- Supervisor decomposition thought and synthesis answer chunks
now carry message_id
DAG structure fixes:
- Observation derives from sub-trace Synthesis (not Analysis)
when a tool produces a sub-trace; tracked via
last_sub_explain_uri on context
- Subagent sessions derive from parent's Decomposition via
parent_uri on agent_session_triples
- Findings derive from subagent Conclusions (not Decomposition)
- Synthesis derives from all findings (multiple wasDerivedFrom)
ensuring single terminal node
- agent_synthesis_triples accepts list of parent URIs
- Explainability chain walker follows from sub-trace terminal
to find downstream Observation
Emit Analysis before tool execution:
- Add on_action callback to react() in agent_manager.py, called
after reason() but before tool invocation
- Orchestrator and old service emit Analysis+ToolUse triples via
on_action so sub-traces appear after their parent in the stream
Refactor agent provenance so that the decision (thought + tool
selection) and the result (observation) are separate DAG entities:
Question ← Analysis+ToolUse ← Observation ← ... ← Conclusion
Analysis gains tg:ToolUse as a mixin RDF type and is emitted
before tool execution via an on_action callback in react().
This ensures sub-traces (e.g. GraphRAG) appear after their
parent Analysis in the streaming event order.
Observation becomes a standalone prov:Entity with tg:Observation
type, emitted after tool execution. The linear DAG chain runs
through Observation — subsequent iterations and the Conclusion
derive from it, not from the Analysis.
message_id is populated on streaming AgentResponse for thought
and observation chunks, using the provenance URI of the entity
being built. This lets clients group streamed chunks by entity.
Wire changes:
- provenance/agent.py: Add ToolUse type, new
agent_observation_triples(), remove observation from iteration
- agent_manager.py: Add on_action callback between reason() and
tool execution
- orchestrator/pattern_base.py: Split emit, wire message_id,
chain through observation URIs
- orchestrator/react_pattern.py: Emit Analysis via on_action
before tool runs
- agent/react/service.py: Same for non-orchestrator path
- api/explainability.py: New Observation class, updated dispatch
and chain walker
- api/types.py: Add message_id to AgentThought/AgentObservation
- cli: Render Observation separately, [analysis: tool] labels
agent-orchestrator: add explainability provenance for all agent
patterns
Extend the provenance/explainability system to provide
human-readable reasoning traces for the orchestrator's three
agent patterns. Previously only ReAct emitted provenance
(session, iteration, conclusion). Now each pattern records its
cognitive steps as typed RDF entities in the knowledge graph,
using composable mixin types (e.g. Finding + Answer).
New provenance chains:
- Supervisor: Question → Decomposition → Finding ×N → Synthesis
- Plan-then-Execute: Question → Plan → StepResult ×N → Synthesis
- ReAct: Question → Analysis ×N → Conclusion (unchanged)
New RDF types: Decomposition, Finding, Plan, StepResult.
New predicates: tg:subagentGoal, tg:planStep.
Reuses existing Synthesis + Answer mixin for final answers.
Provenance library (trustgraph-base):
- Triple builders, URI generators, vocabulary labels for new types
- Client dataclasses with from_triples() dispatch
- fetch_agent_trace() follows branching provenance chains
- API exports updated
Orchestrator (trustgraph-flow):
- PatternBase emit methods for decomposition, finding, plan, step result, and synthesis
- SupervisorPattern emits decomposition during fan-out
- PlanThenExecutePattern emits plan and step results
- Service emits finding triples on subagent completion
- Synthesis provenance replaces generic final triples
CLI (trustgraph-cli):
- invoke_agent -x displays new entity types inline
Add universal document decoder with multi-format support
using 'unstructured'.
New universal decoder service powered by the unstructured
library, handling DOCX, XLSX, PPTX, HTML, Markdown, CSV, RTF,
ODT, EPUB and more through a single service. Tables are preserved
as HTML markup for better downstream extraction. Images are
stored in the librarian but excluded from the text
pipeline. Configurable section grouping strategies
(whole-document, heading, element-type, count, size) for non-page
formats. Page-based formats (PDF, PPTX, XLSX) are automatically
grouped by page.
All four decoders (PDF, Mistral OCR, Tesseract OCR, universal)
now share the "document-decoder" ident so they are
interchangeable. PDF-only decoders fetch document metadata to
check MIME type and gracefully skip unsupported formats.
Librarian changes: removed MIME type whitelist validation so any
document format can be ingested. Simplified routing so text/plain
goes to text-load and everything else goes to document-load.
Removed dual inline/streaming data paths — documents always use
document_id for content retrieval.
New provenance entity types (tg:Section, tg:Image) and metadata
predicates (tg:elementTypes, tg:tableCount, tg:imageCount) for
richer explainability.
Universal decoder is in its own package (trustgraph-unstructured)
and container image (trustgraph-unstructured).
Page and chunk document IDs were deterministic ({doc_id}/p{num},
{doc_id}/p{num}/c{num}), causing "Document already exists" errors
when reprocessing documents through different flows. Content may
differ between runs due to different parameters or extractors, so
deterministic IDs are incorrect.
Pages now use urn:page:{uuid}, chunks use
urn:chunk:{uuid}. Parent- child relationships are tracked via
librarian metadata and provenance triples.
Also brings Mistral OCR and Tesseract OCR decoders up to parity
with the PDF decoder: librarian fetch/save support, per-page
output with unique IDs, and provenance triple emission. Fixes
Mistral OCR bug where only the first 5 pages were processed.
The subjectOf triples were redundant with the subgraph provenance model
introduced in e8407b34. Entity-to-source lineage can be traced via
tg:contains -> subgraph -> prov:wasDerivedFrom -> chunk, making the
direct subjectOf edges unnecessary metadata polluting the knowledge graph.
Removed from all three extractors (agent, definitions, relationships),
cleaned up the SUBJECT_OF constant and vocabulary label, and updated
tests accordingly.
Replace per-triple provenance reification with subgraph model
Extraction provenance previously created a full reification (statement
URI, activity, agent) for every single extracted triple, producing ~13
provenance triples per knowledge triple. Since each chunk is processed
by a single LLM call, this was both redundant and semantically
inaccurate.
Now one subgraph object is created per chunk extraction, with
tg:contains linking to each extracted triple. For 20 extractions from
a chunk this reduces provenance from ~260 triples to ~33.
- Rename tg:reifies -> tg:contains, stmt_uri -> subgraph_uri
- Replace triple_provenance_triples() with subgraph_provenance_triples()
- Refactor kg-extract-definitions and kg-extract-relationships to
generate provenance once per chunk instead of per triple
- Add subgraph provenance to kg-extract-ontology and kg-extract-agent
(previously had none)
- Update CLI tools and tech specs to match
Also rename tg-show-document-hierarchy to tg-show-extraction-provenance.
Added extra typing for extraction provenance, fixed extraction prov CLI
Add unified explainability support and librarian storage for all retrieval engines
Implements consistent explainability/provenance tracking
across GraphRAG, DocumentRAG, and Agent retrieval
engines. All large content (answers, thoughts, observations)
is now stored in librarian rather than as inline literals in
the knowledge graph.
Explainability API:
- New explainability.py module with entity classes (Question,
Exploration, Focus, Synthesis, Analysis, Conclusion) and
ExplainabilityClient
- Quiescence-based eventual consistency handling for trace
fetching
- Content fetching from librarian with retry logic
CLI updates:
- tg-invoke-graph-rag -x/--explainable flag returns
explain_id
- tg-invoke-document-rag -x/--explainable flag returns
explain_id
- tg-invoke-agent -x/--explainable flag returns explain_id
- tg-list-explain-traces uses new explainability API
- tg-show-explain-trace handles all three trace types
Agent provenance:
- Records session, iterations (think/act/observe), and conclusion
- Stores thoughts and observations in librarian with document
references
- New predicates: tg:thoughtDocument, tg:observationDocument
DocumentRAG provenance:
- Records question, exploration (chunk retrieval), and synthesis
- Stores answers in librarian with document references
Schema changes:
- AgentResponse: added explain_id, explain_graph fields
- RetrievalResponse: added explain_id, explain_graph fields
- agent_iteration_triples: supports thought_document_id,
observation_document_id
Update tests.
Terminology Rename, and named-graphs for explainability data
Changed terminology:
- session -> question
- retrieval -> exploration
- selection -> focus
- answer -> synthesis
- uris.py: Renamed query_session_uri → question_uri,
retrieval_uri → exploration_uri, selection_uri → focus_uri,
answer_uri → synthesis_uri
- triples.py: Renamed corresponding triple generation functions with
updated labels ("GraphRAG question", "Exploration", "Focus",
"Synthesis")
- namespaces.py: Added named graph constants GRAPH_DEFAULT,
GRAPH_SOURCE, GRAPH_RETRIEVAL
- init.py: Updated exports
- graph_rag.py: Updated to use new terminology
- invoke_graph_rag.py: Updated CLI to display new stage names
(Question, Exploration, Focus, Synthesis)
Query-Time Explainability → Named Graph
- triples.py: Added set_graph() helper function to set named graph
on triples
- graph_rag.py: All explainability triples now use GRAPH_RETRIEVAL
named graph
- rag.py: Explainability triples stored in user's collection (not
separate collection) with named graph
Extraction Provenance → Named Graph
- relationships/extract.py: Provenance triples use GRAPH_SOURCE
named graph
- definitions/extract.py: Provenance triples use GRAPH_SOURCE
named graph
- chunker.py: Provenance triples use GRAPH_SOURCE named graph
- pdf_decoder.py: Provenance triples use GRAPH_SOURCE named graph
CLI Updates
- show_graph.py: Added -g/--graph option to filter by named graph and
--show-graph to display graph column
Also:
- Fix knowledge core schemas
Quoted triple fixes, including...
1. Updated triple_provenance_triples() in triples.py:
- Now accepts a Triple object directly
- Creates the reification triple using TRIPLE term type: stmt_uri tg:reifies
<<extracted_triple>>
- Includes it in the returned provenance triples
2. Updated definitions extractor:
- Added imports for provenance functions and component version
- Added ParameterSpec for optional llm-model and ontology flow parameters
- For each definition triple, generates provenance with reification
3. Updated relationships extractor:
- Same changes as definitions extractor