trustgraph/specs/api/components/schemas/rag/GraphRagRequest.yaml
cybermaggedon 68e816e65c
feat: filter and cap GraphRAG reranker input across full stack (#1021)
- Filter out RDF/RDFS/OWL schema predicates (rdfs:domain, owl:inverseOf,
  etc.) from hop traversal, keeping rdf:type for data signal
- Skip edges where reranker-visible components are unlabeled IRIs, since
  the cross-encoder cannot meaningfully score raw URIs
- Add max-reranker-input safety cap (default 350) to prevent overloading
  the reranker, applied after filtering for maximum useful candidates
- Expose max-reranker-input as per-request parameter through schema,
  translator, REST API, socket client, CLI, and OpenAPI spec
- Update tests
- Update tech spec
2026-07-03 15:51:04 +01:00
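
The filter-then-cap order described in the commit message can be sketched roughly as follows. This is a minimal illustration, not TrustGraph's actual code: the `Edge` type, `prepare_reranker_input`, the IRI test, and the exact predicate set are all hypothetical names and assumptions (only `rdfs:domain`, `owl:inverseOf`, and the kept `rdf:type` come from the message above).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
RDF_TYPE = RDF + "type"  # deliberately NOT in the filter set: it carries data signal

# Illustrative subset only; the real filter list is an assumption here.
SCHEMA_PREDICATES = frozenset({
    RDFS + "domain",
    RDFS + "range",
    RDFS + "subPropertyOf",
    OWL + "inverseOf",
})


@dataclass(frozen=True)
class Edge:
    """Hypothetical edge: subject, predicate, object, plus optional labels."""
    s: str
    p: str
    o: str
    s_label: Optional[str] = None
    p_label: Optional[str] = None
    o_label: Optional[str] = None


def is_unlabeled_iri(value: str, label: Optional[str]) -> bool:
    # Assumption: anything with a common IRI scheme and no label is a raw URI.
    return label is None and value.startswith(("http://", "https://", "urn:"))


def prepare_reranker_input(
    edges: Iterable[Edge], max_reranker_input: int = 350
) -> List[Edge]:
    kept = []
    for e in edges:
        # 1. Drop schema-level predicates.
        if e.p in SCHEMA_PREDICATES:
            continue
        # 2. Drop edges with any component the reranker would see as a raw IRI.
        parts = ((e.s, e.s_label), (e.p, e.p_label), (e.o, e.o_label))
        if any(is_unlabeled_iri(value, label) for value, label in parts):
            continue
        kept.append(e)
    # 3. Apply the cap last, so it counts only useful candidates.
    return kept[:max_reranker_input]
```

Capping after filtering means the limit is spent entirely on edges the cross-encoder can score, rather than on candidates that would have been discarded anyway.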


type: object
description: |
  Graph RAG (Retrieval-Augmented Generation) query request.
  Searches knowledge graph and generates answer using retrieved subgraph.
required:
  - query
properties:
  query:
    type: string
    description: User query or question
    example: What connections exist between quantum physics and computer science?
  collection:
    type: string
    description: Collection to search within
    default: default
    example: research
  entity-limit:
    type: integer
    description: Maximum number of entities to retrieve
    default: 50
    minimum: 1
    maximum: 200
    example: 30
  triple-limit:
    type: integer
    description: Maximum number of triples to retrieve per entity
    default: 30
    minimum: 1
    maximum: 100
    example: 20
  max-subgraph-size:
    type: integer
    description: Maximum total subgraph size (triples)
    default: 1000
    minimum: 10
    maximum: 5000
    example: 500
  max-path-length:
    type: integer
    description: Maximum path length for graph traversal
    default: 2
    minimum: 1
    maximum: 5
    example: 3
  max-reranker-input:
    type: integer
    description: Maximum candidate edges sent to the reranker per hop
    default: 350
    minimum: 1
    maximum: 1000
    example: 350
  streaming:
    type: boolean
    description: Enable streaming response delivery
    default: false
    example: true
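
As a rough illustration of how a client might assemble a request body that respects the bounds in this schema, here is a minimal sketch. The helper `build_graph_rag_request` and the `LIMITS` table are hypothetical and not part of any TrustGraph client; the field names, defaults, and ranges are copied from the schema above, and no endpoint or transport is assumed.

```python
import json

# (minimum, maximum, default) for each integer field, taken from the schema.
LIMITS = {
    "entity-limit": (1, 200, 50),
    "triple-limit": (1, 100, 30),
    "max-subgraph-size": (10, 5000, 1000),
    "max-path-length": (1, 5, 2),
    "max-reranker-input": (1, 1000, 350),
}


def build_graph_rag_request(query, collection="default", streaming=False, **overrides):
    """Return a dict shaped like GraphRagRequest, or raise ValueError.

    Keyword overrides use underscores (max_reranker_input=200) and are
    mapped to the hyphenated field names used by the schema.
    """
    if not isinstance(query, str):
        raise ValueError("query is required and must be a string")

    body = {"query": query, "collection": collection, "streaming": bool(streaming)}
    # Start from the schema defaults so the payload is fully explicit.
    for field, (_, _, default) in LIMITS.items():
        body[field] = default

    for name, value in overrides.items():
        field = name.replace("_", "-")
        if field not in LIMITS:
            raise ValueError(f"unknown field: {field}")
        low, high, _ = LIMITS[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{field} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{field} must be between {low} and {high}")
        body[field] = value
    return body


if __name__ == "__main__":
    request = build_graph_rag_request(
        "What connections exist between quantum physics and computer science?",
        collection="research",
        max_reranker_input=200,
    )
    print(json.dumps(request, indent=2))
```

Checking the ranges on the client side is optional, since a server validating against the OpenAPI spec would presumably reject out-of-range values as well; it simply surfaces the error earlier.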