trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-07-03 15:01:00 +02:00

Sunny 6c9a545a06 feat: add cross-encoder reranking to Document-RAG with two-limit control (#878) (#1011)	2026-07-02 09:50:13 +01:00

Wire the FlashRank reranker subsystem from #1005 into Document-RAG: after vector retrieval, over-fetch a wider candidate pool, rerank with the cross-encoder, and keep the top doc_limit chunks for synthesis.

Per maintainer review, the fetch and select sizes are two caller-controlled limits rather than one internal heuristic:

- doc_limit: chunks selected into the synthesis prompt (unchanged meaning).
- fetch_limit: candidate pool pulled from the vector store before reranking. 0 = derive (OVERFETCH_FACTOR x doc_limit); values below doc_limit are raised to it. Lets the caller control how hard the reranker has to work.

Details:

- schema: DocumentRagQuery.fetch_limit (additive, backward compatible).
- document_rag.py / rag.py: fetch_limit resolved in the processor (mirrors doc_limit); the core applies the heuristic default and derives synthesis provenance from the chunk-selection focus when reranking ran.
- provenance: tg:ChunkSelection focus stage (mirrors tg:EdgeSelection).
- request translator + client SDKs + CLI: fetch-limit / --fetch-limit, threaded exactly like doc_limit and the GraphRAG limits.
- tests: no-op identity, over-fetch/narrow, explicit fetch_limit, heuristic default, floor-at-doc_limit, provenance lineage, cross-repo topic wiring.

Reranking is skipped byte-identically when no reranker role is wired.

Requires the companion trustgraph-templates change wiring the reranker topics into the document-rag flow (mirrors #279 for GraphRAG).
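
The two-limit rule described in the commit message can be illustrated with a short sketch. This is not the code from document_rag.py or rag.py: the function names resolve_fetch_limit and select_chunks are hypothetical, the value of OVERFETCH_FACTOR is assumed (the commit names the constant but does not give its value), treating negative values like 0 is an assumption, and the scoring callable merely stands in for the cross-encoder reranker.

```python
from typing import Callable, List, Optional

# Assumed value for illustration only: the commit message names
# OVERFETCH_FACTOR but does not state what it is set to.
OVERFETCH_FACTOR = 4


def resolve_fetch_limit(doc_limit: int, fetch_limit: int = 0) -> int:
    """Hypothetical helper: size of the candidate pool to fetch.

    0 means "derive" (OVERFETCH_FACTOR x doc_limit); values below
    doc_limit are raised to doc_limit. Treating negative values the
    same as 0 is an assumption of this sketch.
    """
    if fetch_limit <= 0:
        return OVERFETCH_FACTOR * doc_limit
    return max(fetch_limit, doc_limit)


def select_chunks(
    candidates: List[str],
    doc_limit: int,
    score: Optional[Callable[[str], float]] = None,
) -> List[str]:
    """Hypothetical helper: keep the top doc_limit chunks.

    `score` stands in for the cross-encoder reranker. When it is None
    (no reranker wired), this sketch keeps retrieval order unchanged.
    """
    if score is None:
        return candidates[:doc_limit]
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:doc_limit]


if __name__ == "__main__":
    pool_size = resolve_fetch_limit(doc_limit=5)                  # derived default
    floored = resolve_fetch_limit(doc_limit=5, fetch_limit=3)     # raised to doc_limit
    explicit = resolve_fetch_limit(doc_limit=5, fetch_limit=50)   # caller's value kept
    print(pool_size, floored, explicit)
```

A larger fetch_limit gives the reranker more candidates to score at the cost of more work per query, which is the trade-off the commit leaves to the caller.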
..
__init__.py	feat: replace LLM edge scoring with cross-encoder reranker in GraphRAG (#1005)	2026-06-30 14:36:37 +01:00
agent.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00
base.py	Pub/sub abstraction: decouple from Pulsar (#751)	2026-04-01 20:16:53 +01:00
collection.py	Per-workspace queue routing for workspace-scoped services (#862)	2026-05-04 10:30:03 +01:00
config.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00
diagnosis.py	Pub/sub abstraction: decouple from Pulsar (#751)	2026-04-01 20:16:53 +01:00
document_loading.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00
embeddings.py	Pub/sub abstraction: decouple from Pulsar (#751)	2026-04-01 20:16:53 +01:00
embeddings_query.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00
flow.py	Per-workspace queue routing for workspace-scoped services (#862)	2026-05-04 10:30:03 +01:00
iam.py	feat: global usernames and rename workspace to default_workspace (#1001)	2026-06-25 16:34:31 +01:00
knowledge.py	feat: complete knowledge core storage — named graphs, provenance, source material (#973)	2026-06-03 10:46:52 +01:00
library.py	Per-workspace queue routing for workspace-scoped services (#862)	2026-05-04 10:30:03 +01:00
metadata.py	Per-workspace queue routing for workspace-scoped services (#862)	2026-05-04 10:30:03 +01:00
nlp_query.py	Pub/sub abstraction: decouple from Pulsar (#751)	2026-04-01 20:16:53 +01:00
primitives.py	Pub/sub abstraction: decouple from Pulsar (#751)	2026-04-01 20:16:53 +01:00
prompt.py	Expose LLM token usage across all service layers (#782)	2026-04-13 14:38:34 +01:00
reranker.py	feat: replace LLM edge scoring with cross-encoder reranker in GraphRAG (#1005)	2026-06-30 14:36:37 +01:00
retrieval.py	feat: add cross-encoder reranking to Document-RAG with two-limit control (#878) (#1011)	2026-07-02 09:50:13 +01:00
rows_query.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00
sparql_query.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00
structured_query.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00
text_completion.py	Expose LLM token usage across all service layers (#782)	2026-04-13 14:38:34 +01:00
tool.py	Pub/sub abstraction: decouple from Pulsar (#751)	2026-04-01 20:16:53 +01:00
triples.py	feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)	2026-04-21 23:23:01 +01:00