Mirror of https://github.com/trustgraph-ai/trustgraph.git (synced 2026-05-11 00:02:37 +02:00)
fix: ontology extractor reads .objects, not .object, from PromptResult
The extract-with-ontologies prompt is a JSONL prompt, which means the prompt service returns a PromptResult with response_type="jsonl" and the parsed items in `.objects` (plural). The ontology extractor was reading `.object` (singular), the field used for response_type="json", which is always None for JSONL prompts.

Effect: the parser received None on every chunk, hit its "Unexpected response type: <class 'NoneType'>" branch, returned no ExtractionResult, and extract_with_simplified_format returned []. Every extraction silently produced zero triples. Graphs populated only with the seed ontology schema (TBox) and document/chunk provenance, with no instance triples at all. The e2e test threshold of >=100 edges per collection was met by schema + provenance alone, so the failure mode was invisible until RAG queries couldn't find any content.

Regression introduced in v2.3 with the token-usage work (commits 56d700f3/14e49d83), when PromptClient.prompt() began returning a PromptResult wrapper instead of the raw text/dict/list. All other call sites of .prompt() across retrieval/, agent/, orchestrator/ were already reading the correct field for their prompt's response_type; ontology extraction was the sole stranded caller.

Also adds tests/unit/test_extract/test_ontology/test_extract_with_simplified_format.py covering:

- happy path: populated .objects produces non-empty triples
- production failure shape: .objects=None returns [] cleanly
- empty .objects returns [] without raising
- defensive: do not silently fall back to .object for a JSONL prompt
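The field mix-up above can be illustrated with a minimal sketch. The `.object`/`.objects` field names and the response_type values come from the commit message; the rest of the PromptResult shape here is an assumption for illustration, not the project's actual class.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Minimal stand-in for the PromptResult wrapper described in the commit.
# Only one payload field is populated, depending on response_type.
@dataclass
class PromptResult:
    response_type: str                  # "text", "json", or "jsonl"
    text: Optional[str] = None          # response_type="text"
    object: Optional[dict] = None       # response_type="json" only
    objects: Optional[list] = None      # response_type="jsonl" only

# A JSONL prompt response: parsed items land in .objects, .object stays None.
result = PromptResult(
    response_type="jsonl",
    objects=[{"s": "a", "p": "b", "o": "c"}],
)

# The bug: reading the singular field on a JSONL result always yields None.
assert result.object is None
assert result.objects == [{"s": "a", "p": "b", "o": "c"}]
```

Any caller that reads the field for the wrong response_type gets None rather than an error, which is why the failure was silent.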
This commit is contained in:
parent 424ace44c4
commit bcd7a1694a

2 changed files with 207 additions and 1 deletion
@@ -380,7 +380,13 @@ class Processor(FlowProcessor):
                 id="extract-with-ontologies",
                 variables=prompt_variables
             )
-            extraction_response = result.object
+
+            # extract-with-ontologies is a JSONL prompt, so PromptResult
+            # always populates .objects (a list of dicts). Reading .object
+            # (singular) silently gives None for JSONL responses and drops
+            # every extraction.
+            extraction_response = result.objects
+
             logger.debug(f"Simplified extraction response: {extraction_response}")

             # Parse response into structured format
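The test cases listed in the commit message can be sketched as below. The real parser and ExtractionResult types are not shown in this diff, so `parse_extraction` here is a hypothetical stand-in that mirrors the behaviour the commit describes: a None or non-list response falls through the "unexpected response type" branch and yields no triples.

```python
# Hypothetical stand-in for the parser behaviour described in the commit:
# None (the production failure shape) and any non-list input return [],
# a populated list of {"s", "p", "o"} dicts yields triples.
def parse_extraction(extraction_response):
    if not isinstance(extraction_response, list):
        # Corresponds to the "Unexpected response type" branch.
        return []
    return [
        (item["s"], item["p"], item["o"])
        for item in extraction_response
        if {"s", "p", "o"} <= item.keys()
    ]

assert parse_extraction(None) == []      # production failure shape: .objects=None
assert parse_extraction([]) == []        # empty .objects, no exception
assert parse_extraction([{"s": "x", "p": "y", "o": "z"}]) == [("x", "y", "z")]
```

This matches the happy-path, None, and empty-list cases from the test file; the defensive no-fallback-to-.object case depends on the real PromptResult type and is omitted here.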