--- layout: default title: "Asili ya Utoaji: Mfumo wa Subgraph" parent: "Swahili (Beta)" --- # Asili ya Utoaji: Mfumo wa Subgraph > **Beta Translation:** This document was translated via Machine Learning and as such may not be 100% accurate. All non-English languages are currently classified as Beta. ## Tatizo <<<<<<< HEAD Hivi sasa, utoaji wa wakati wa uondoaji huunda uelekezaji kamili kwa kila triple iliyoundwa: `stmt_uri`, `activity_uri`, na metadata inayohusiana ya PROV-O kwa kila ukweli wa maarifa. Kushughulikia sehemu moja ambayo hutoa uhusiano wa 20 hutoa triples ~220 za utoaji pamoja na triples ~20 za maarifa - mzigo wa takriban 10:1. Hii ni ghali (uhifadhi, uwekaji wa indexi, usambazaji) na pia si sahihi kimaana. Kila sehemu hushughulikiwa na simu moja ya LLM ambayo hutoa triples zake zote katika mshono mmoja. Mfumo wa sasa wa kila triple huficha hili kwa kuunda udanganyifu wa matukio 20 ya uondoaji huru. Zaidi ya hayo, vichakavu viwili vya nne vya uondoaji (kg-extract-ontology, kg-extract-agent) havina utoaji wowote, na hivyo kuacha pengo katika ======= Hivi sasa, utoaji wa taarifa wakati wa utoaji huunda uelekezaji kamili kwa kila triple iliyotoa: `stmt_uri`, `activity_uri`, na metadata inayohusiana ya PROV-O kwa kila ukweli wa maarifa. Kushughulikia sehemu moja ambayo hutoa uhusiano wa 20 hutoa triples ~220 za taarifa juu ya triples ~20 za maarifa - mzigo wa takriban 10:1. Hii ni ghali (uhifadhi, urekebishaji, usambazaji) na pia si sahihi kimaana. Kila sehemu hushughulikiwa na simu moja ya LLM ambayo hutoa triples zake zote katika mshughuliko mmoja. Mfumo wa sasa wa kila triple huficha hili kwa kuunda udanganyifu wa matukio 20 ya kujitenga ya utoaji. Zaidi ya hayo, vichakavu viwili vya utoaji vifo (kg-extract-ontology, kg-extract-agent) havina taarifa zozote, na hivyo kuacha pengo katika >>>>>>> 82edf2d (New md files from RunPod) njia ya ukaguzi. ## Suluhisho Badilisha uelekezaji wa kila triple na **mfumo wa subgraph**: rekodi moja <<<<<<< HEAD ya utoaji kwa kila uondoaji wa sehemu, inayoshirikiwa na triples zote ======= ya taarifa kwa kila utoaji wa sehemu, inayoshirikiwa na triples zote >>>>>>> 82edf2d (New md files from RunPod) zilizozalishwa kutoka sehemu hiyo. ### Mabadiliko ya Dhana | Zamani | Mpya | |-----|-----| | `stmt_uri` (`https://trustgraph.ai/stmt/{uuid}`) | `subgraph_uri` (`https://trustgraph.ai/subgraph/{uuid}`) | | `statement_uri()` | `subgraph_uri()` | <<<<<<< HEAD | `tg:reifies` (1:1, utambulisho) | `tg:contains` (1:wengi, uwezeshaji) | ### Muundo Unaolengwa Triples zote za utoaji huwekwa katika grafu iliyoitwa `urn:graph:source`. ======= | `tg:reifies` (1:1, utambulisho) | `tg:contains` (1:wengi, kuingia) | ### Muundo Unaolengwa Triples zote za taarifa huwekwa katika grafu iliyoitwa `urn:graph:source`. >>>>>>> 82edf2d (New md files from RunPod) ``` # Subgraph contains each extracted triple (RDF-star quoted triples) tg:contains <> . tg:contains <> . tg:contains <> . # Derivation from source chunk prov:wasDerivedFrom . prov:wasGeneratedBy . # Activity: one per chunk extraction rdf:type prov:Activity . rdfs:label "{component_name} extraction" . prov:used . prov:wasAssociatedWith . prov:startedAtTime "2026-03-13T10:00:00Z" . tg:componentVersion "0.25.0" . tg:llmModel "gpt-4" . # if available tg:ontology . # if available # Agent: stable per component rdf:type prov:Agent . rdfs:label "{component_name}" . ``` ### Kulinganisha Kiasi Kwa kila sehemu inayozalisha triples tatu zilizochukuliwa: | | Zamani (kwa kila triple) | Mpya (subgraph) | |---|---|---| | `tg:contains` / `tg:reifies` | N | N | | Triples za shughuli | ~9 x N | ~9 | | Triples za wakala | 2 x N | 2 | | Metadata ya taarifa/subgraph | 2 x N | 2 | | **Triples tatu za jumla za asili** | **~13N** | **N + 13** | | **Mfano (N=20)** | **~260** | **33** | ## Upeo ### Wasindikaji ambao Watasasishwa (asili iliyopo, kwa kila triple) **kg-extract-definitions** (`trustgraph-flow/trustgraph/extract/kg/definitions/extract.py`) Hivi sasa huita `statement_uri()` + `triple_provenance_triples()` ndani ya loop ya kila ufafanuzi. Mabadiliko: Hamisha `subgraph_uri()` na `activity_uri()` kabla ya loop Kusanya triples za `tg:contains` ndani ya loop Toa kundi la shughuli/wakala/uzalishaji mara moja baada ya loop **kg-extract-relationships** (`trustgraph-flow/trustgraph/extract/kg/relationships/extract.py`) Mfano sawa na ufafanuzi. Mabadiliko sawa. <<<<<<< HEAD ### Wasindikaji ambao Watasasishwa ili Kuongeza Asili (sasa hayapo) ======= ### Wasindikaji ambao Wataongezwa Asili (sasa hayapo) >>>>>>> 82edf2d (New md files from RunPod) **kg-extract-ontology** (`trustgraph-flow/trustgraph/extract/kg/ontology/extract.py`) Hivi sasa hutoa triples bila asili. Ongeza asili ya subgraph kwa kutumia mfano sawa: subgraph moja kwa kila sehemu, `tg:contains` kwa kila triple iliyochukuliwa. **kg-extract-agent** (`trustgraph-flow/trustgraph/extract/kg/agent/extract.py`) Hivi sasa hutoa triples bila asili. Ongeza asili ya subgraph kwa kutumia mfano sawa. <<<<<<< HEAD ### Mabadiliko ya Maktaba ya Asili iliyoshirikiwa ======= ### Mabadiliko ya Maktaba ya Asili Iliyoshirikiwa >>>>>>> 82edf2d (New md files from RunPod) **`trustgraph-base/trustgraph/provenance/triples.py`** Badilisha `triple_provenance_triples()` na `subgraph_provenance_triples()` Kazi mpya inakubali orodha ya triples zilizochukuliwa badala ya moja Inazalisha `tg:contains` moja kwa kila triple, kundi la shughuli/wakala lililoshirikiwa Ondoa `triple_provenance_triples()` ya zamani **`trustgraph-base/trustgraph/provenance/uris.py`** Badilisha `statement_uri()` na `subgraph_uri()` **`trustgraph-base/trustgraph/provenance/namespaces.py`** Badilisha `TG_REIFIES` na `TG_CONTAINS` <<<<<<< HEAD ### Hayajajumuishwa katika Upeo ======= ### Hayako Katika Upeo >>>>>>> 82edf2d (New md files from RunPod) **kg-extract-topics**: wasindikaji wa mtindo wa zamani, hawatumiki kwa sasa katika mtiririko wa kawaida **kg-extract-rows**: hutoa mistari si triples, mfumo tofauti wa asili **Asili ya wakati wa swali** (`urn:graph:retrieval`): suala tofauti, tayari hutumia mfumo tofauti (swali/uchunguzi/lengo/muhtasari) **Asili ya hati/ukurasa/sehemu** (dekoda ya PDF, kichunguzi): tayari hutumia `derived_entity_triples()` ambayo ni kwa kila kitu, si kwa kila triple — hakuna suala la ziada ## Maelezo ya Utendaji ### Upangaji Upya wa Loop ya Msindikaji Kabla (kwa kila triple, katika uhusiano): ```python for rel in rels: # ... build relationship_triple ... stmt_uri = statement_uri() prov_triples = triple_provenance_triples( stmt_uri=stmt_uri, extracted_triple=relationship_triple, ... ) triples.extend(set_graph(prov_triples, GRAPH_SOURCE)) ``` Baada ya (mfumo mdogo): ```python sg_uri = subgraph_uri() for rel in rels: # ... build relationship_triple ... extracted_triples.append(relationship_triple) prov_triples = subgraph_provenance_triples( subgraph_uri=sg_uri, extracted_triples=extracted_triples, chunk_uri=chunk_uri, component_name=default_ident, component_version=COMPONENT_VERSION, llm_model=llm_model, ontology_uri=ontology_uri, ) triples.extend(set_graph(prov_triples, GRAPH_SOURCE)) ``` ### Saini Mpya ya Msaidizi ```python def subgraph_provenance_triples( subgraph_uri: str, extracted_triples: List[Triple], chunk_uri: str, component_name: str, component_version: str, llm_model: Optional[str] = None, ontology_uri: Optional[str] = None, timestamp: Optional[str] = None, ) -> List[Triple]: """ Build provenance triples for a subgraph of extracted knowledge. Creates: - tg:contains link for each extracted triple (RDF-star quoted) - One prov:wasDerivedFrom link to source chunk - One activity with agent metadata """ ``` ### Mabadiliko Makubwa <<<<<<< HEAD Hii ni mabadiliko makubwa kwa mfumo wa asili ya data. Asili ya data haijatolewa, kwa hivyo hakuna uhamishaji unaohitajika. Msimbo wa zamani wa ⟦CODE_0⟧ / ======= Hii ni mabadiliko makubwa kwa mfumo wa uhakikisho. Uhakikisho haujatolewa, kwa hivyo hakuna uhamishaji unaohitajika. Msimbo wa zamani wa ⟦CODE_0⟧ / >>>>>>> 82edf2d (New md files from RunPod) `tg:reifies` unaweza kuondolewa kabisa. Msimbo `statement_uri` unaweza kufutwa kabisa.