trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-07-09 13:22:10 +02:00

Author	SHA1	Message	Date
cybermaggedon	7a6197d8c3	GraphRAG Query-Time Explainability (#677 ) Implements full explainability pipeline for GraphRAG queries, enabling traceability from answers back to source documents. Renamed throughout for clarity: - provenance_callback → explain_callback - provenance_id → explain_id - provenance_collection → explain_collection - message_type "provenance" → "explain" - Queue name "provenance" → "explainability" GraphRAG queries now emit explainability events as they execute: 1. Session - query text and timestamp 2. Retrieval - edges retrieved from subgraph 3. Selection - selected edges with LLM reasoning (JSONL with id + reasoning) 4. Answer - reference to synthesized response Events stream via explain_callback during query(), enabling real-time UX. - Answers stored in librarian service (not inline in graph - too large) - Document ID as URN: urn:trustgraph:answer:{session_id} - Graph stores tg:document reference (IRI) to librarian document - Added librarian producer/consumer to graph-rag service - get_labelgraph() now returns (labeled_edges, uri_map) - uri_map maps edge_id(label_s, label_p, label_o) → (uri_s, uri_p, uri_o) - Explainability data stores original URIs, not labels - Enables tracing edges back to reifying statements via tg:reifies - Added serialize_triple() to query service (matches storage format) - get_term_value() now handles TRIPLE type terms - Enables querying by quoted triple in object position: ?stmt tg:reifies <<s p o>> - Displays real-time explainability events during query - Resolves rdfs:label for edge components (s, p, o) - Traces source chain via prov:wasDerivedFrom to root document - Output: "Source: Chunk 1 → Page 2 → Document Title" - Label caching to avoid repeated queries GraphRagResponse: - explain_id: str | None - explain_collection: str | None - message_type: str ("chunk" or "explain") - end_of_session: bool trustgraph-base/trustgraph/provenance/: - namespaces.py - Added TG_DOCUMENT predicate - triples.py - answer_triples() supports document_id reference - uris.py - Added edge_selection_uri() trustgraph-base/trustgraph/schema/services/retrieval.py: - GraphRagResponse with explain_id, explain_collection, end_of_session trustgraph-flow/trustgraph/retrieval/graph_rag/: - graph_rag.py - URI preservation, streaming answer accumulation - rag.py - Librarian integration, real-time explain emission trustgraph-flow/trustgraph/query/triples/cassandra/service.py: - Quoted triple serialization for query matching trustgraph-cli/trustgraph/cli/invoke_graph_rag.py: - Full explainability display with label resolution and source tracing	2026-03-10 10:00:01 +00:00
cybermaggedon	d2d71f859d	Feature/streaming triples (#676 ) * Streaming triples * Also GraphRAG service uses this * Updated tests	2026-03-09 15:46:33 +00:00
cybermaggedon	f2ae0e8623	Embeddings API scores (#671 ) - Put scores in all responses - Remove unused 'middle' vector layer. Vector of texts -> vector of (vector embedding)	2026-03-09 10:53:44 +00:00
cybermaggedon	4fa7cc7d7c	Fix/embeddings integration 2 (#670 )	2026-03-08 19:42:26 +00:00
cybermaggedon	3bf8a65409	Fix tests (#666 )	2026-03-07 23:38:09 +00:00
cybermaggedon	1809c1f56d	Structured data 2 (#645 ) * Structured data refactor - multi-index tables, remove need for manual mods to the Cassandra tables * Tech spec updated to track implementation	2026-02-23 15:56:29 +00:00
cybermaggedon	d886358be6	Entity & triple batch size limits (#635 ) * Entities and triples are emitted in batches with a batch limit to manage overloading downstream. * Update tests	2026-02-16 17:38:03 +00:00
cybermaggedon	8574861196	Protect null embeddings - v2.0 (#627 ) * Don't emit graph embeddings if there aren't any. * Don't store graph embeddings in a knowledge store if there's an empty list. * Translate between Cassandra's 'null' representing an empty list and an empty list which is what the surrounding code wants (and stored in the first place). * Avoid emitting empty embedding lists * Avoid outputting empty triple lists * Fix tests	2026-02-09 14:57:36 +00:00
cybermaggedon	cf0daedefa	Changed schema for Value -> Term, majorly breaking change (#622 ) * Changed schema for Value -> Term, majorly breaking change * Following the schema change, Value -> Term into all processing * Updated Cassandra for g, p, s, o index patterns (7 indexes) * Reviewed and updated all tests * Neo4j, Memgraph and FalkorDB remain broken, will look at once settled down	2026-01-27 13:48:08 +00:00
cybermaggedon	e214eb4e02	Feature/prompts jsonl (#619 ) * Tech spec * JSONL implementation complete * Updated prompt client users * Fix tests	2026-01-26 17:38:00 +00:00
cybermaggedon	16a5cf966a	Fix agent streaming tool failure (#602 ) * Fix agent streaming linkage * Update tests	2026-01-06 23:00:50 +00:00
cybermaggedon	f79d0603f7	Update to add streaming tests (#600 )	2026-01-06 21:48:05 +00:00
cybermaggedon	ae13190093	Address legacy issues in storage management (#595 ) * Removed legacy storage management cruft. Tidied tech specs. * Fix deletion of last collection * Storage processor ignores data on the queue which is for a deleted collection * Updated tests	2026-01-05 13:45:14 +00:00
cybermaggedon	5304f96fe6	Fix tests (#593 ) * Fix unit/integration/contract tests which were broken by messaging fabric work	2025-12-19 08:53:21 +00:00
cybermaggedon	7d07f802a8	Basic multitenant support (#583 ) * Tech spec * Address multi-tenant queue option problems in CLI * Modified collection service to use config * Changed storage management to use the config service definition	2025-12-05 21:45:30 +00:00
cybermaggedon	72cb1c98e0	Fix tests (#571 )	2025-11-28 16:37:01 +00:00
cybermaggedon	e24de6081f	Fix streaming agent interactions (#570 ) * Fix observer, thought streaming * Fix end of message indicators * Remove double-delivery of answer	2025-11-28 16:25:57 +00:00
cybermaggedon	1948edaa50	Streaming rag responses (#568 ) * Tech spec for streaming RAG * Support for streaming Graph/Doc RAG	2025-11-26 19:47:39 +00:00
cybermaggedon	b1cc724f7d	Streaming LLM part 2 (#567 ) * Updates for agent API with streaming support * Added tg-dump-queues tool to dump Pulsar queues to a log * Updated tg-invoke-agent, incremental output * Queue dumper CLI - might be useful for debug * Updating for tests	2025-11-26 15:16:17 +00:00
cybermaggedon	310a2deb06	Feature/streaming llm phase 1 (#566 ) * Tidy up duplicate tech specs in doc directory * Streaming LLM text-completion service tech spec. * text-completion and prompt interfaces * streaming change applied to all LLMs, so far tested with VertexAI * Skip Pinecone unit tests, upstream module issue is affecting things, tests are passing again * Added agent streaming, not working and has broken tests	2025-11-26 09:59:10 +00:00
cybermaggedon	51107008fd	master -> 1.5 (README updates) (#552 )	2025-10-11 11:46:03 +01:00
cybermaggedon	52b133fc86	Collection delete pt. 3 (#542 ) * Fixing collection deletion * Fixing collection management param error * Always test for collections * Add Cassandra collection table * Updated tech spec for explicit creation/deletion * Remove implicit collection creation * Fix up collection tracking in all processors	2025-09-30 16:02:33 +01:00
cybermaggedon	43cfcb18a0	More LLM param test coverage (#535 ) * More LLM tests * Fixing tests	2025-09-26 01:00:30 +01:00
cybermaggedon	b0a3716b0e	Tests are failing (#534 ) * Fix tests, update to new model parameter usage	2025-09-25 21:32:19 +01:00
cybermaggedon	13ff7d765d	Collection management (#520 ) * Tech spec * Refactored Cassandra knowledge graph for single table * Collection management, librarian services to manage metadata and collection deletion	2025-09-18 15:57:52 +01:00
cybermaggedon	f22bf13aa6	Extend use of user + collection fields (#503 ) * Collection+user fields in structured query * User/collection in structured query & agent	2025-09-08 18:28:38 +01:00
cybermaggedon	5537fac731	Structured data, minor features (#500 ) - Sorted out confusing --auto mode with tg-load-structured-data - Fixed tests & added CLI tests	2025-09-05 17:25:12 +01:00
cybermaggedon	0b7620bc04	Object batching (#499 ) * Object batching * Update tests	2025-09-05 15:59:06 +01:00
cybermaggedon	50c37407c5	Fix/sys integration issues (#494 ) * Fix integration issues * Fix query defaults * Fix tests	2025-09-05 08:38:15 +01:00
cybermaggedon	ed0e02791d	Feature/structured query tool integration (#493 ) * Agent integration to structured query * Update tests	2025-09-04 16:23:43 +01:00
cybermaggedon	a6d9f5e849	Structured query support (#492 ) * Tweak the structured query schema * Structured query service * Gateway support for nlp-query and structured-query * API support * Added CLI * Update tests * More tests	2025-09-04 16:06:18 +01:00
cybermaggedon	85e669c763	Fixing more Cassandra consistency issues (#488 ) * Fixing more Cassandra work * Fix tests	2025-09-04 00:58:11 +01:00
cybermaggedon	ccaec88a72	Feature/consolidate cassandra config (#483 ) * Cassandra consolidation of parameters * New Cassandra configuration helper * Implemented Cassandra config refactor * New tests	2025-09-03 23:41:22 +01:00
cybermaggedon	e74eb5d1ff	Feature/tool group (#484 ) * Tech spec for tool group * Partial tool group implementation * Tool group tests	2025-09-03 23:39:49 +01:00
cybermaggedon	672e358b2f	Feature/graphql table query (#486 ) * Tech spec * Object query service for Cassandra * Gateway support for objects-query * GraphQL query utility * Filters, ordering	2025-09-03 23:39:11 +01:00
cybermaggedon	96c2b73457	Fix import export graceful shutdown (#476 ) * Tech spec for graceful shutdown * Graceful shutdown of importers/exporters * Update socket to include graceful shutdown orchestration * Adding tests for conditions tracked in this PR	2025-08-28 13:39:28 +01:00
cybermaggedon	e5b9b4976a	Fix agent knowledge query initialisation failure (#469 ) * Back out agent change * Fixed broken tests	2025-08-26 19:41:04 +01:00
cybermaggedon	6e9e2a11b1	Fix knowledge query ignoring the collection (#467 ) * Fix knowledge query ignoring the collection * Updated the agent_manager.py to properly pass config parameters when instantiating tool implementations * Added tests for agent collection parameter	2025-08-26 19:05:48 +01:00
cybermaggedon	28190fea8a	More config cli (#466 ) * Extra config CLI tech spec * Describe packaging * Added CLI commands * Add tests	2025-08-22 13:36:10 +01:00
cybermaggedon	83f0c1e7f3	Structure data mvp (#452 ) * Structured data tech spec * Architecture principles * New schemas * Updated schemas and specs * Object extractor * Add .coveragerc * New tests * Cassandra object storage * Trying to get object extraction working, issues exist	2025-08-07 20:47:20 +01:00
cybermaggedon	dd70aade11	Implement logging strategy (#444 ) * Logging strategy and convert all print() calls to logging invocations	2025-07-30 23:18:38 +01:00
cybermaggedon	d83e4e3d59	Update to enable knowledge extraction using the agent framework (#439 ) * Implement KG extraction agent (kg-extract-agent) * Using ReAct framework (agent-manager-react) * ReAct manager had an issue when emitting JSON, which conflicts with the ReAct manager's own JSON messages, so refactored ReAct manager to use traditional ReAct messages, non-JSON structure. * Minor refactor to take the prompt template client out of prompt-template so it can be more readily used by other modules. kg-extract-agent uses this framework.	2025-07-21 14:31:57 +01:00
cybermaggedon	81c7c1181b	Updated CLI invocation and config model for tools and mcp (#438 ) * Updated CLI invocation and config model for tools and mcp * CLI anomalies * Tweaked the MCP tool implementation for new model * Update agent implementation to match the new model * Fix agent tools, now all tested * Fixed integration tests * Fix MCP delete tool params	2025-07-16 23:09:32 +01:00
cybermaggedon	f37decea2b	Increase storage test coverage (#435 ) * Fixing storage and adding tests * PR pipeline only runs quick tests	2025-07-15 09:33:35 +01:00
cybermaggedon	2f7fddd206	Test suite executed from CI pipeline (#433 ) * Test strategy & test cases * Unit tests * Integration tests	2025-07-14 14:57:44 +01:00

45 commits
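
The most recent commit (7a6197d8c3) describes a GraphRagResponse that interleaves answer chunks with explainability events, and an answer document ID of the form urn:trustgraph:answer:{session_id}. The sketch below is one way a client might separate the two kinds of message. It is not taken from the repository: the four explain-related field names and the URN pattern come from the commit message, while the `response` text field, the default values, and the helpers `answer_urn()` and `collect()` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class GraphRagResponse:
    # "response" and all defaults are assumptions; the other four field
    # names and types are the ones listed in the commit message.
    response: str = ""
    explain_id: Optional[str] = None
    explain_collection: Optional[str] = None
    message_type: str = "chunk"  # "chunk" or "explain"
    end_of_session: bool = False


def answer_urn(session_id: str) -> str:
    # Hypothetical helper: builds the document ID pattern quoted in the commit.
    return f"urn:trustgraph:answer:{session_id}"


def collect(stream: Iterable[GraphRagResponse]) -> Tuple[str, List[str]]:
    # Hypothetical helper: accumulate answer text from "chunk" messages and
    # gather explain IDs from "explain" messages until the session ends.
    answer_parts: List[str] = []
    explain_ids: List[str] = []
    for msg in stream:
        if msg.message_type == "explain":
            if msg.explain_id is not None:
                explain_ids.append(msg.explain_id)
        else:
            answer_parts.append(msg.response)
        if msg.end_of_session:
            break
    return "".join(answer_parts), explain_ids


if __name__ == "__main__":
    # Made-up messages, purely to exercise the helpers above.
    demo = [
        GraphRagResponse(message_type="explain", explain_id="explain-1",
                         explain_collection="default"),
        GraphRagResponse(response="Partial "),
        GraphRagResponse(response="answer.", end_of_session=True),
    ]
    print(collect(demo))
    print(answer_urn("session-123"))
```

Keeping explain events out of the accumulated answer text mirrors the commit's note that events stream during query() for real-time display, while the answer itself is stored separately in the librarian service.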