trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-04-26 17:06:22 +02:00

Author	SHA1	Message	Date
cybermaggedon	4fb0b4d8e8	Pub/sub abstraction: decouple from Pulsar (#751 ) Remove Pulsar-specific concepts from application code so that the pub/sub backend is swappable via configuration. Rename translators: - to_pulsar/from_pulsar → decode/encode across all translator classes, dispatch handlers, and tests (55+ files) - from_response_with_completion → encode_with_completion - Remove pulsar.schema.Record from translator base class Queue naming (CLASS:TOPICSPACE:TOPIC): - Replace topic() helper with queue() using new format: flow:tg:name, request:tg:name, response:tg:name, state:tg:name - Queue class implies persistence/TTL (no QoS in names) - Update Pulsar backend map_topic() to parse new format - Librarian queues use flow class (persistent, for chunking) - Config push uses state class (persistent, last-value) - Remove 15 dead topic imports from schema files - Update init_trustgraph.py namespace: config → state Confine Pulsar to pulsar_backend.py: - Delete legacy PulsarClient class from pubsub.py - Move add_args to add_pubsub_args() with standalone flag for CLI tools (defaults to localhost) - PulsarBackendConsumer.receive() catches _pulsar.Timeout, raises standard TimeoutError - Remove Pulsar imports from: async_processor, flow_processor, log_level, all 11 client files, 4 storage writers, gateway service, gateway config receiver - Remove log_level/LoggerLevel from client API - Rewrite tg-monitor-prompts to use backend abstraction - Update tg-dump-queues to use add_pubsub_args Also: pubsub-abstraction.md tech spec covering problem statement, design goals, as-is requirements, candidate broker assessment, approach, and implementation order.	2026-04-01 20:16:53 +01:00
cybermaggedon	816a8cfcf6	Update tests for agent-orchestrator (#745 ) Add 96 tests covering the orchestrator's aggregation, provenance, routing, and explainability parsing. These verify the supervisor fan-out/fan-in lifecycle, the new RDF provenance types (Decomposition, Finding, Plan, StepResult, Synthesis), and their round-trip through the wire format. Unit tests (84): - Aggregator: register, record completion, peek, build synthesis, cleanup - Provenance triple builders: types, provenance links, goals/steps, labels - Explainability parsing: from_triples dispatch, field extraction for all new entity types, precedence over existing types - PatternBase: is_subagent detection, emit_subagent_completion message shape - Completion dispatch: detection logic, full aggregator integration flow, synthesis request not re-intercepted as completion - MetaRouter: task type identification, pattern selection, valid_patterns constraints, fallback on LLM error or unknown response Contract tests (12): - Orchestration fields on AgentRequest round-trip correctly - subagent-completion and synthesise step types in request history - Plan steps with status and dependencies - Provenance triple builder → wire format → from_triples round-trip for all five new entity types	2026-03-31 13:12:26 +01:00
cybermaggedon	849987f0e6	Add multi-pattern orchestrator with plan-then-execute and supervisor (#739 ) Introduce an agent orchestrator service that supports three execution patterns (ReAct, plan-then-execute, supervisor) with LLM-based meta-routing to select the appropriate pattern and task type per request. Update the agent schema to support orchestration fields (correlation, sub-agents, plan steps) and remove legacy response fields (answer, thought, observation).	2026-03-31 00:32:49 +01:00
cybermaggedon	35128ff019	Add unified explainability support and librarian storage for (#693 ) Add unified explainability support and librarian storage for all retrieval engines Implements consistent explainability/provenance tracking across GraphRAG, DocumentRAG, and Agent retrieval engines. All large content (answers, thoughts, observations) is now stored in librarian rather than as inline literals in the knowledge graph. Explainability API: - New explainability.py module with entity classes (Question, Exploration, Focus, Synthesis, Analysis, Conclusion) and ExplainabilityClient - Quiescence-based eventual consistency handling for trace fetching - Content fetching from librarian with retry logic CLI updates: - tg-invoke-graph-rag -x/--explainable flag returns explain_id - tg-invoke-document-rag -x/--explainable flag returns explain_id - tg-invoke-agent -x/--explainable flag returns explain_id - tg-list-explain-traces uses new explainability API - tg-show-explain-trace handles all three trace types Agent provenance: - Records session, iterations (think/act/observe), and conclusion - Stores thoughts and observations in librarian with document references - New predicates: tg:thoughtDocument, tg:observationDocument DocumentRAG provenance: - Records question, exploration (chunk retrieval), and synthesis - Stores answers in librarian with document references Schema changes: - AgentResponse: added explain_id, explain_graph fields - RetrievalResponse: added explain_id, explain_graph fields - agent_iteration_triples: supports thought_document_id, observation_document_id Update tests.	2026-03-12 21:40:09 +00:00
cybermaggedon	aa4f5c6c00	Remove redundant metadata (#685 ) The metadata field (list of triples) in the pipeline Metadata class was redundant. Document metadata triples already flow directly from librarian to triple-store via emit_document_provenance() - they don't need to pass through the extraction pipeline. Additionally, chunker and PDF decoder were overwriting metadata to [] anyway, so any metadata passed through the pipeline was being discarded. Changes: - Remove metadata field from Metadata dataclass (schema/core/metadata.py) - Update all Metadata instantiations to remove metadata=[] parameter - Remove metadata handling from translators (document_loading, knowledge) - Remove metadata consumption from extractors (ontology, agent) - Update gateway serializers and import handlers - Update all unit, integration, and contract tests	2026-03-11 10:51:39 +00:00
cybermaggedon	7a6197d8c3	GraphRAG Query-Time Explainability (#677 ) Implements full explainability pipeline for GraphRAG queries, enabling traceability from answers back to source documents. Renamed throughout for clarity: - provenance_callback → explain_callback - provenance_id → explain_id - provenance_collection → explain_collection - message_type "provenance" → "explain" - Queue name "provenance" → "explainability" GraphRAG queries now emit explainability events as they execute: 1. Session - query text and timestamp 2. Retrieval - edges retrieved from subgraph 3. Selection - selected edges with LLM reasoning (JSONL with id + reasoning) 4. Answer - reference to synthesized response Events stream via explain_callback during query(), enabling real-time UX. - Answers stored in librarian service (not inline in graph - too large) - Document ID as URN: urn:trustgraph:answer:{session_id} - Graph stores tg:document reference (IRI) to librarian document - Added librarian producer/consumer to graph-rag service - get_labelgraph() now returns (labeled_edges, uri_map) - uri_map maps edge_id(label_s, label_p, label_o) → (uri_s, uri_p, uri_o) - Explainability data stores original URIs, not labels - Enables tracing edges back to reifying statements via tg:reifies - Added serialize_triple() to query service (matches storage format) - get_term_value() now handles TRIPLE type terms - Enables querying by quoted triple in object position: ?stmt tg:reifies <<s p o>> - Displays real-time explainability events during query - Resolves rdfs:label for edge components (s, p, o) - Traces source chain via prov:wasDerivedFrom to root document - Output: "Source: Chunk 1 → Page 2 → Document Title" - Label caching to avoid repeated queries GraphRagResponse: - explain_id: str \| None - explain_collection: str \| None - message_type: str ("chunk" or "explain") - end_of_session: bool trustgraph-base/trustgraph/provenance/: - namespaces.py - Added TG_DOCUMENT predicate - triples.py - answer_triples() supports document_id reference - uris.py - Added edge_selection_uri() trustgraph-base/trustgraph/schema/services/retrieval.py: - GraphRagResponse with explain_id, explain_collection, end_of_session trustgraph-flow/trustgraph/retrieval/graph_rag/: - graph_rag.py - URI preservation, streaming answer accumulation - rag.py - Librarian integration, real-time explain emission trustgraph-flow/trustgraph/query/triples/cassandra/service.py: - Quoted triple serialization for query matching trustgraph-cli/trustgraph/cli/invoke_graph_rag.py: - Full explainability display with label resolution and source tracing	2026-03-10 10:00:01 +00:00
cybermaggedon	f2ae0e8623	Embeddings API scores (#671 ) - Put scores in all responses - Remove unused 'middle' vector layer. Vector of texts -> vector of (vector embedding)	2026-03-09 10:53:44 +00:00
cybermaggedon	3bf8a65409	Fix tests (#666 )	2026-03-07 23:38:09 +00:00
cybermaggedon	1809c1f56d	Structured data 2 (#645 ) * Structured data refactor - multi-index tables, remove need for manual mods to the Cassandra tables * Tech spec updated to track implementation	2026-02-23 15:56:29 +00:00
cybermaggedon	cf0daedefa	Changed schema for Value -> Term, majorly breaking change (#622 ) * Changed schema for Value -> Term, majorly breaking change * Following the schema change, Value -> Term into all processing * Updated Cassandra for g, p, s, o index patterns (7 indexes) * Reviewed and updated all tests * Neo4j, Memgraph and FalkorDB remain broken, will look at once settled down	2026-01-27 13:48:08 +00:00
cybermaggedon	807f6cc4e2	Fix non streaming RAG problems (#607 ) * Fix non-streaming failure in RAG services * Fix non-streaming failure in API * Fix agent non-streaming messaging * Agent messaging unit & contract tests	2026-01-12 18:45:52 +00:00
cybermaggedon	5304f96fe6	Fix tests (#593 ) * Fix unit/integration/contract tests which were broken by messaging fabric work	2025-12-19 08:53:21 +00:00
cybermaggedon	0b7620bc04	Object batching (#499 ) * Object batching * Update tests	2025-09-05 15:59:06 +01:00
cybermaggedon	a6d9f5e849	Structured query support (#492 ) * Tweak the structured query schema * Structure query service * Gateway support for nlp-query and structured-query * API support * Added CLI * Update tests * More tests	2025-09-04 16:06:18 +01:00
cybermaggedon	e74eb5d1ff	Feature/tool group (#484 ) * Tech spec for tool group * Partial tool group implementation * Tool group tests	2025-09-03 23:39:49 +01:00
cybermaggedon	672e358b2f	Feature/graphql table query (#486 ) * Tech spec * Object query service for Cassandra * Gateway support for objects-query * GraphQL query utility * Filters, ordering	2025-09-03 23:39:11 +01:00
cybermaggedon	38826c7de1	trustgraph-base .chunks / .documents confusion in the API (#481 ) * trustgraph-base .chunks / .documents confusion in the API * Added tests, fixed test failures in code * Fix file dup error * Fix contract error	2025-09-02 17:58:53 +01:00
cybermaggedon	83f0c1e7f3	Structure data mvp (#452 ) * Structured data tech spec * Architecture principles * New schemas * Updated schemas and specs * Object extractor * Add .coveragerc * New tests * Cassandra object storage * Trying to object extraction working, issues exist	2025-08-07 20:47:20 +01:00
cybermaggedon	4daa54abaf	Extending test coverage (#434 ) * Contract tests * Testing embeedings * Agent unit tests * Knowledge pipeline tests * Turn on contract tests	2025-07-14 17:54:04 +01:00

19 commits