trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-07-10 05:42:12 +02:00

cybermaggedon 286f762369 The id field in pipeline Metadata was being overwritten at each processing (#686)	2026-03-11 12:16:39 +00:00

The id field in pipeline Metadata was being overwritten at each processing stage (document → page → chunk), causing knowledge storage to create separate cores per chunk instead of grouping by document.

Add a root field that:
- Is set by librarian to the original document ID
- Is copied unchanged through PDF decoder, chunkers, and extractors
- Is used by knowledge storage for document_id grouping (with fallback to id)

Changes:
- Add root field to Metadata schema with empty string default
- Set root=document.id in librarian when initiating document processing
- Copy root through PDF decoder, recursive chunker, and all extractors
- Update knowledge storage to use root (or id as fallback) for grouping
- Add root handling to translators and gateway serialization
- Update test mock Metadata class to include root parameter
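
The root-field behaviour described in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not TrustGraph's actual code: the Metadata dataclass, start_processing, derive, and group_key are hypothetical names invented for the example, and only the id and root fields (and the empty-string default) come from the commit message.

```python
from collections import defaultdict
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Metadata:
    # Hypothetical stand-in for the pipeline Metadata schema.
    id: str
    root: str = ""  # empty string default, as stated in the commit message


def start_processing(document_id: str) -> Metadata:
    # Librarian step (assumed shape): root is set to the original document ID.
    return Metadata(id=document_id, root=document_id)


def derive(parent: Metadata, child_id: str) -> Metadata:
    # Decoder / chunker / extractor step (assumed shape):
    # id is replaced for the new unit, root is copied unchanged.
    return replace(parent, id=child_id)


def group_key(meta: Metadata) -> str:
    # Knowledge storage step (assumed shape):
    # group by root, falling back to id when root is empty.
    return meta.root or meta.id


# document -> page -> chunk: id changes at each stage, root does not.
doc = start_processing("doc-1")
page = derive(doc, "doc-1/p1")
chunks = [derive(page, f"doc-1/p1/c{i}") for i in range(3)]

groups = defaultdict(list)
for chunk in chunks:
    groups[group_key(chunk)].append(chunk.id)

# Metadata produced without a root still groups by its own id.
legacy = Metadata(id="doc-2/p1/c0")
```

With this shape, all three chunks land in one group keyed by the original document ID, whereas grouping on id alone would produce one group per chunk, which is the problem the commit describes.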
__init__.py	Row embeddings APIs exposed (#646)	2026-02-23 21:52:56 +00:00
agent.py	Fix non streaming RAG problems (#607)	2026-01-12 18:45:52 +00:00
base.py	Feature/translator classes (#414)	2025-06-20 16:59:55 +01:00
collection.py	Fix/queue configurations (#585)	2025-12-06 14:54:47 +00:00
config.py	Empty configuration is returned as empty list, previously was not in response (#436)	2025-07-15 14:30:37 +01:00
diagnosis.py	Fix tests (#593)	2025-12-19 08:53:21 +00:00
document_loading.py	The id field in pipeline Metadata was being overwritten at each processing (#686)	2026-03-11 12:16:39 +00:00
embeddings.py	Batch embeddings (#668)	2026-03-08 18:36:54 +00:00
embeddings_query.py	Embeddings API scores (#671)	2026-03-09 10:53:44 +00:00
flow.py	Fix config inconsistency (#609)	2026-01-14 12:31:40 +00:00
knowledge.py	The id field in pipeline Metadata was being overwritten at each processing (#686)	2026-03-11 12:16:39 +00:00
library.py	Fix/librarian broken (#674)	2026-03-09 13:36:24 +00:00
metadata.py	Incremental / large document loading (#659)	2026-03-04 16:57:58 +00:00
nlp_query.py	Structured query support (#492)	2025-09-04 16:06:18 +01:00
primitives.py	Fix/extraction prov (#662)	2026-03-06 12:23:58 +00:00
prompt.py	Fix streaming API niggles (#599)	2026-01-06 16:41:35 +00:00
retrieval.py	Terminology Rename, and named-graphs for explainability (#682)	2026-03-10 14:35:21 +00:00
rows_query.py	Structured data 2 (#645)	2026-02-23 15:56:29 +00:00
structured_query.py	Extend use of user + collection fields (#503)	2025-09-08 18:28:38 +01:00
text_completion.py	Update to add streaming tests (#600)	2026-01-06 21:48:05 +00:00
tool.py	MCP client support (#427)	2025-07-07 23:52:23 +01:00
triples.py	Feature/streaming triples (#676)	2026-03-09 15:46:33 +00:00