trustgraph/trustgraph-flow/trustgraph
Sunny 6c9a545a06
feat: add cross-encoder reranking to Document-RAG with two-limit control (#878) (#1011)
Wire the FlashRank reranker subsystem from #1005 into Document-RAG: after
vector retrieval, over-fetch a wider candidate pool, rerank with the
cross-encoder, and keep the top doc_limit chunks for synthesis.
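
The over-fetch, rerank, select flow above can be sketched roughly as follows.
This is a minimal illustration, not the code from this change: select_chunks,
the Scorer type, and the word-overlap scorer are hypothetical stand-ins, and
the real FlashRank cross-encoder is not called here.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical scorer signature: (query, chunk) -> relevance score.
Scorer = Callable[[str, str], float]


def select_chunks(
    query: str,
    candidates: Sequence[str],
    doc_limit: int,
    scorer: Optional[Scorer] = None,
) -> List[str]:
    """Pick the chunks that go into the synthesis prompt."""
    if scorer is None:
        # Assumption: with no reranker wired, vector-store order is kept as-is.
        return list(candidates[:doc_limit])
    # Rerank the wider candidate pool, then keep the top doc_limit chunks.
    ranked = sorted(candidates, key=lambda c: scorer(query, c), reverse=True)
    return ranked[:doc_limit]


def overlap(query: str, chunk: str) -> float:
    """Toy stand-in for a cross-encoder: count of shared words."""
    return float(len(set(query.split()) & set(chunk.split())))


pool = ["cats purr", "dogs bark loudly", "birds sing"]
reranked = select_chunks("dogs bark", pool, doc_limit=2, scorer=overlap)
passthrough = select_chunks("dogs bark", pool, doc_limit=2)
```

With the toy scorer, the chunk sharing the most words with the query moves to
the front; without a scorer, the first doc_limit candidates pass through in
their original order.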

Per maintainer review, the fetch and select sizes are two caller-controlled
limits rather than one internal heuristic:

- doc_limit:   chunks selected into the synthesis prompt (unchanged meaning).
- fetch_limit: candidate pool pulled from the vector store before reranking.
  0 = derive (OVERFETCH_FACTOR x doc_limit); values below doc_limit are
  raised to it. Lets the caller control how hard the reranker has to work.
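
As a rough sketch of the resolution rule described above (the function name
resolve_fetch_limit and the factor value are assumptions; the commit names
OVERFETCH_FACTOR but does not state its value):

```python
# Hypothetical value, chosen only for illustration.
OVERFETCH_FACTOR = 4


def resolve_fetch_limit(doc_limit: int, fetch_limit: int = 0) -> int:
    """Return the candidate-pool size to pull before reranking."""
    if fetch_limit == 0:
        # 0 means "derive": widen the pool by the over-fetch factor.
        return OVERFETCH_FACTOR * doc_limit
    # Never fetch fewer candidates than will be selected for synthesis.
    return max(fetch_limit, doc_limit)
```

Under these assumptions, a caller that passes doc_limit=5 and leaves
fetch_limit at 0 gets a pool of OVERFETCH_FACTOR x 5 candidates, while an
explicit fetch_limit=3 is raised to 5.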

Details:
- schema: DocumentRagQuery.fetch_limit (additive, backward compatible).
- document_rag.py / rag.py: fetch_limit resolved in the processor (mirrors
  doc_limit); the core applies the heuristic default and derives synthesis
  provenance from the chunk-selection focus when reranking ran.
- provenance: tg:ChunkSelection focus stage (mirrors tg:EdgeSelection).
- request translator + client SDKs + CLI: fetch-limit / --fetch-limit,
  threaded exactly like doc_limit and the GraphRAG limits.
- tests: no-op identity, over-fetch/narrow, explicit fetch_limit, heuristic
  default, floor-at-doc_limit, provenance lineage, cross-repo topic wiring.

When no reranker role is wired, reranking is skipped and the output is
byte-identical to the previous behavior.
Requires the companion trustgraph-templates change wiring the reranker
topics into the document-rag flow (mirrors #279 for GraphRAG).
2026-07-02 09:50:13 +01:00
agent Per-flow librarian clients and per-workspace response queues (#865) 2026-05-06 12:01:01 +01:00
bootstrap feat: make bootstrapper initialiser timeouts configurable (#999) 2026-06-30 09:37:22 +01:00
chunking Per-flow librarian clients and per-workspace response queues (#865) 2026-05-06 12:01:01 +01:00
config/service fix: wire replication params through YAML/params path for Cassandra and Qdrant (#976) 2026-06-04 12:36:36 +01:00
cores fix: wire replication params through YAML/params path for Cassandra and Qdrant (#976) 2026-06-04 12:36:36 +01:00
decoding fix: reject invalid PDF decoder input (#977) 2026-06-09 16:37:39 +01:00
direct feat: data store replication configuration and TLS upgrade (#975) 2026-06-04 11:49:29 +01:00
embeddings Fix Ollama async issue (#854) 2026-04-28 15:43:04 +01:00
external Implement logging strategy (#444) 2025-07-30 23:18:38 +01:00
extract Fix ontology selector defaults, add bypass mode, enforce domain/range (#929) 2026-05-16 15:13:38 +01:00
flow Per-flow librarian clients and per-workspace response queues (#865) 2026-05-06 12:01:01 +01:00
gateway feat: replace LLM edge scoring with cross-encoder reranker in GraphRAG (#1005) 2026-06-30 14:36:37 +01:00
iam feat: replace LLM edge scoring with cross-encoder reranker in GraphRAG (#1005) 2026-06-30 14:36:37 +01:00
librarian fix: wire replication params through YAML/params path for Cassandra and Qdrant (#976) 2026-06-04 12:36:36 +01:00
metering feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840) 2026-04-21 23:23:01 +01:00
model fix: simplify dashscope variant and route API calls through variants (#1012) 2026-07-02 09:12:55 +01:00
processing Fix/startup failure (#445) 2025-07-30 23:42:11 +01:00
prompt feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840) 2026-04-21 23:23:01 +01:00
query fix: structured data query and auth fixes (#978) 2026-06-08 15:22:11 +01:00
reranker feat: replace LLM edge scoring with cross-encoder reranker in GraphRAG (#1005) 2026-06-30 14:36:37 +01:00
retrieval feat: add cross-encoder reranking to Document-RAG with two-limit control (#878) (#1011) 2026-07-02 09:50:13 +01:00
rev_gateway Update rev-gateway for IAM integration (#940) 2026-05-19 21:45:43 +01:00
storage fix: structured data query and auth fixes (#978) 2026-06-08 15:22:11 +01:00
tables feat: global usernames and rename workspace to default_workspace (#1001) 2026-06-25 16:34:31 +01:00
template fix: avoid swallowing prompt manager interrupts 2026-05-26 10:37:21 -04:00
tool_service feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840) 2026-04-21 23:23:01 +01:00
__init__.py Feature/pkgsplit (#83) 2024-09-30 19:36:09 +01:00