mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-04-25 00:16:23 +02:00
Agent explainability tech specs (#655)
* Query time provenance tech spec * Extraction provenance placeholder
parent
88fe8468bc
commit
4d31cd4c03
2 changed files with 331 additions and 0 deletions
49
docs/tech-specs/extraction-time-provenance.md
Normal file
@@ -0,0 +1,49 @@
# Extraction-Time Provenance: Source Layer

## Status

Notes - Not yet started

## Overview

This document captures notes on extraction-time provenance for future specification work. Extraction-time provenance records the "source layer": where data originally came from, and how it was extracted and transformed.

This is separate from query-time provenance (see `query-time-provenance.md`), which records agent reasoning.

## Current State

Source metadata is already partially stored in the knowledge graph (~40% solved):

- Documents have source URLs and timestamps
- Some extraction metadata exists

## Scope

Extraction-time provenance should capture:

### Source Layer (Origin)

- URL / file path
- Retrieval timestamp
- Funding sources
- Authorship / authority
- Document metadata (title, date, version)

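The origin fields listed above could be grouped into a single record per ingested document. The following is a minimal sketch only, not a proposed schema: `SourceRecord`, its field names, and `missing_fields` are hypothetical names introduced here for illustration, and the example URL is a placeholder.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SourceRecord:
    """Hypothetical origin record for one ingested document."""
    location: str                     # URL or file path
    retrieved_at: datetime            # retrieval timestamp
    title: Optional[str] = None       # document metadata: title
    published: Optional[str] = None   # document metadata: date, as stated by the source
    version: Optional[str] = None     # document metadata: version
    authors: Tuple[str, ...] = ()     # authorship / authority
    funding: Tuple[str, ...] = ()     # funding sources


def missing_fields(record: SourceRecord) -> List[str]:
    """Return the names of origin fields that were not captured."""
    return [name for name, value in asdict(record).items()
            if value is None or value == ()]


record = SourceRecord(
    location="https://example.org/report.pdf",  # placeholder URL
    retrieved_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
    title="Example report",
)
print(missing_fields(record))
# ['published', 'version', 'authors', 'funding']
```

A helper like `missing_fields` may be useful when auditing how much origin metadata is actually captured for existing documents.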
### Transformation Layer (Extraction)

- Extraction tool used (e.g., PDF parser, table extractor)
- Extraction method / version
- Confidence scores
- Raw-to-structured mapping
- Parent-child relationships (PDF → table → row → fact)

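The parent-child chain (PDF → table → row → fact) can be illustrated by giving each extracted item a link to its parent and walking those links back to the source. This is a sketch under assumptions: `ExtractionNode`, its fields, the `lineage` helper, and the tool names and confidence values are all hypothetical, and a single `parent_id` models a tree rather than a general DAG.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ExtractionNode:
    """Hypothetical node in an extraction lineage."""
    node_id: str
    kind: str                            # e.g. "pdf", "table", "row", "fact"
    parent_id: Optional[str] = None      # None marks the original source
    tool: Optional[str] = None           # extraction tool that produced this node
    tool_version: Optional[str] = None   # extraction method / version
    confidence: Optional[float] = None   # confidence score, if the tool reports one


def lineage(nodes: Dict[str, ExtractionNode], node_id: str) -> List[ExtractionNode]:
    """Walk parent links from a node back to its source; return source first."""
    chain: List[ExtractionNode] = []
    seen = set()
    current: Optional[str] = node_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        node = nodes[current]
        chain.append(node)
        current = node.parent_id
    chain.reverse()
    return chain


# Illustrative data only; tool names and scores are made up.
nodes = {n.node_id: n for n in [
    ExtractionNode("doc-1", "pdf"),
    ExtractionNode("tbl-1", "table", parent_id="doc-1",
                   tool="table-extractor", tool_version="0.1", confidence=0.92),
    ExtractionNode("row-3", "row", parent_id="tbl-1",
                   tool="table-extractor", tool_version="0.1", confidence=0.90),
    ExtractionNode("fact-7", "fact", parent_id="row-3",
                   tool="fact-extractor", tool_version="0.1", confidence=0.81),
]}

print(" → ".join(n.kind for n in lineage(nodes, "fact-7")))
# pdf → table → row → fact
```

If a fact can be derived from several parents, `parent_id` would need to become a collection of parent IDs and the walk a graph traversal.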
## Key Questions for Future Spec

1. What metadata is already captured today?
2. What gaps exist?
3. How should the extraction DAG be structured?
4. How does query-time provenance link to extraction-time nodes?
5. What storage format should be used: RDF triples, or a separate schema?

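To make the RDF option in the storage-format question concrete, one extraction step could be written as triples using PROV-O terms. This is a sketch only and not a decision on storage format: the `EX` namespace, the `extraction_triples` and `derived_from` helpers, and the node IDs are hypothetical, and triples are modelled as plain string tuples rather than with any RDF library. The PROV-O terms used (`prov:Entity`, `prov:Activity`, `prov:used`, `prov:wasGeneratedBy`, `prov:wasDerivedFrom`) are part of the W3C vocabulary.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/prov/"  # hypothetical namespace for illustration


def extraction_triples(source: str, derived: str, activity: str) -> List[Triple]:
    """Describe one extraction step: `activity` used `source` and produced `derived`."""
    return [
        (EX + source, RDF_TYPE, PROV + "Entity"),
        (EX + derived, RDF_TYPE, PROV + "Entity"),
        (EX + activity, RDF_TYPE, PROV + "Activity"),
        (EX + activity, PROV + "used", EX + source),
        (EX + derived, PROV + "wasGeneratedBy", EX + activity),
        (EX + derived, PROV + "wasDerivedFrom", EX + source),
    ]


def derived_from(triples: List[Triple], entity: str) -> List[str]:
    """Return the entities that `entity` was directly derived from."""
    return [o for s, p, o in triples
            if s == entity and p == PROV + "wasDerivedFrom"]


triples = extraction_triples("doc-1", "tbl-1", "run-1")
print(derived_from(triples, EX + "tbl-1"))
# ['http://example.org/prov/doc-1']
```

Under this sketch, a query-time node could in principle point at an extraction-time entity with the same `prov:wasDerivedFrom` relation, which is one possible answer to the linking question above.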
## References

- Query-time provenance: `docs/tech-specs/query-time-provenance.md`
- W3C PROV-O ontology for provenance modeling
- Existing source metadata in the knowledge graph (needs audit)