trustgraph/tests/unit
cybermaggedon 286f762369
The id field in pipeline Metadata was being overwritten at each processing (#686)
At each processing stage (document → page → chunk) the id field in pipeline
Metadata was replaced with that stage's own identifier, so knowledge storage
created a separate core per chunk instead of grouping chunks by document.

Add a root field that:
- Is set by librarian to the original document ID
- Is copied unchanged through PDF decoder, chunkers, and extractors
- Is used by knowledge storage for document_id grouping (with fallback to id)
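
The propagation rule above can be sketched as follows. This is a minimal illustration, not the actual TrustGraph schema: the `Metadata` dataclass, the `start_processing` and `derive` helpers, and the ID strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Metadata:
    """Illustrative stand-in for the pipeline Metadata record (hypothetical)."""
    id: str          # identifier of the current item; changes at each stage
    root: str = ""   # original document ID; empty string by default


def start_processing(document_id: str) -> Metadata:
    # Librarian step: root is set to the original document ID.
    return Metadata(id=document_id, root=document_id)


def derive(parent: Metadata, new_id: str) -> Metadata:
    # Decoder / chunker / extractor step: id changes, root is copied unchanged.
    return replace(parent, id=new_id)


doc = start_processing("doc-1")
page = derive(doc, "doc-1/p1")
chunk = derive(page, "doc-1/p1/c1")
```

Here `chunk.id` differs from `doc.id`, while `chunk.root` still names the original document, which is the property knowledge storage relies on.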

Changes:
- Add root field to Metadata schema with empty string default
- Set root=document.id in librarian when initiating document processing
- Copy root through PDF decoder, recursive chunker, and all extractors
- Update knowledge storage to use root (or id as fallback) for grouping
- Add root handling to translators and gateway serialization
- Update test mock Metadata class to include root parameter
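
The grouping change in knowledge storage might look roughly like the sketch below. It assumes messages can be reduced to `(root, item_id, payload)` tuples; the function names and sample values are hypothetical and only illustrate the "root, or id as fallback" rule.

```python
from collections import defaultdict


def document_id_for(root: str, item_id: str) -> str:
    # Prefer root; fall back to the item's own id when root is empty
    # (the schema default), e.g. if a producer did not set it.
    return root or item_id


def group_by_document(messages):
    # messages: iterable of (root, item_id, payload) tuples (assumed shape)
    groups = defaultdict(list)
    for root, item_id, payload in messages:
        groups[document_id_for(root, item_id)].append(payload)
    return dict(groups)


messages = [
    ("doc-1", "doc-1/p1/c1", "triples-a"),
    ("doc-1", "doc-1/p1/c2", "triples-b"),
    ("", "legacy-7", "triples-c"),  # no root set: grouped under its own id
]
grouped = group_by_document(messages)
```

With these inputs both chunks of `doc-1` land in one group, while the message without a root keeps the pre-change behaviour of being keyed by its id.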
2026-03-11 12:16:39 +00:00
test_agent Tool services - dynamically pluggable tool implementations for agent frameworks (#658) 2026-03-04 14:51:32 +00:00
test_base Embeddings API scores (#671) 2026-03-09 10:53:44 +00:00
test_chunking Remove redundant metadata (#685) 2026-03-11 10:51:39 +00:00
test_cli Fix/tests (#647) 2026-02-23 22:01:47 +00:00
test_clients Embeddings API scores (#671) 2026-03-09 10:53:44 +00:00
test_config Structure data mvp (#452) 2025-08-07 20:47:20 +01:00
test_cores Remove redundant metadata (#685) 2026-03-11 10:51:39 +00:00
test_decoding Extract-time provenance (#661) 2026-03-05 18:36:10 +00:00
test_direct Fix Cassandra schema and graph filter semantics (#680) 2026-03-10 12:52:51 +00:00
test_embeddings Embeddings API scores (#671) 2026-03-09 10:53:44 +00:00
test_extract Changed schema for Value -> Term, majorly breaking change (#622) 2026-01-27 13:48:08 +00:00
test_gateway Remove redundant metadata (#685) 2026-03-11 10:51:39 +00:00
test_knowledge_graph The id field in pipeline Metadata was being overwritten at each processing (#686) 2026-03-11 12:16:39 +00:00
test_query Knowledge core processing updated for embeddings interface change (#681) 2026-03-10 13:28:16 +00:00
test_retrieval Terminology Rename, and named-graphs for explainability (#682) 2026-03-10 14:35:21 +00:00
test_rev_gateway Fix tests (#593) 2025-12-19 08:53:21 +00:00
test_storage Remove redundant metadata (#685) 2026-03-11 10:51:39 +00:00
test_text_completion Structured data 2 (#645) 2026-02-23 15:56:29 +00:00
__init__.py Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
test_prompt_manager.py Feature/prompts jsonl (#619) 2026-01-26 17:38:00 +00:00
test_prompt_manager_edge_cases.py Update to enable knowledge extraction using the agent framework (#439) 2025-07-21 14:31:57 +01:00
test_python_api_client.py Structured data 2 (#645) 2026-02-23 15:56:29 +00:00