trustgraph/trustgraph-base/trustgraph/schema/core/metadata.py at aa4f5c6c00402ca8fa379cf6019a4978b8d1aab7 - apunkt/trustgraph

apunkt/trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-05-01 11:26:22 +02:00

cybermaggedon aa4f5c6c00

Remove redundant metadata (#685)

The metadata field (list of triples) in the pipeline Metadata class
was redundant. Document metadata triples already flow directly from
librarian to triple-store via emit_document_provenance() - they don't
need to pass through the extraction pipeline.

Additionally, the chunker and PDF decoder were already overwriting
metadata with [], so any metadata passed through the pipeline was
being discarded.

Changes:
- Remove metadata field from Metadata dataclass
  (schema/core/metadata.py)
- Update all Metadata instantiations to remove metadata=[]
  parameter
- Remove metadata handling from translators (document_loading,
  knowledge)
- Remove metadata consumption from extractors (ontology, agent)
- Update gateway serializers and import handlers
- Update all unit, integration, and contract tests

2026-03-11 10:51:39 +00:00

10 lines

176 B

Python

from dataclasses import dataclass

@dataclass
class Metadata:
    # Source identifier
    id: str = ""

    # Collection management
    user: str = ""
    collection: str = ""
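
A minimal sketch of what the change means for call sites, assuming the dataclass is exactly as shown above. The class is redefined locally so the snippet runs without the trustgraph package installed, and the values "doc-1", "alice", and "default" are hypothetical, chosen only for illustration. It shows that the three remaining fields can be set by keyword, and that a stale call still passing metadata=[] is rejected by the dataclass-generated __init__.

```python
from dataclasses import dataclass, fields


# Local copy of the dataclass shown above, so this sketch is self-contained.
@dataclass
class Metadata:
    # Source identifier
    id: str = ""

    # Collection management
    user: str = ""
    collection: str = ""


# Hypothetical values, for illustration only.
m = Metadata(id="doc-1", user="alice", collection="default")

# The remaining fields, in declaration order.
field_names = [f.name for f in fields(Metadata)]

# A call site that still passes the removed field fails with TypeError,
# because the generated __init__ no longer accepts a 'metadata' keyword.
try:
    Metadata(id="doc-1", metadata=[])
    stale_call_rejected = False
except TypeError:
    stale_call_rejected = True
```

Because the removed keyword is rejected at construction time rather than silently ignored, any instantiation missed during the cleanup should surface as an immediate error in tests.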