perf(kb-search): offload sync embed_texts to thread

embed_texts takes a threading.Lock and makes a synchronous embedding
call. It is invoked from search_knowledge_base, an async coroutine on
the KB priority middleware critical path, so calling it directly blocks
the event loop and stalls every other coroutine on the worker (SSE
keepalives, concurrent chat requests, background tasks). Wrap the call
in asyncio.to_thread so the embed runs on the default executor pool
while the loop keeps serving.
CREDO23 2026-05-20 10:02:38 +02:00
parent 32f6766cb6
commit 4fa85a9a94


@@ -457,7 +457,7 @@ async def search_knowledge_base(
     if not query:
         return []
-    [embedding] = embed_texts([query])
+    [embedding] = await asyncio.to_thread(embed_texts, [query])
     doc_types = _resolve_search_types(available_connectors, available_document_types)
     retriever_top_k = min(top_k * 3, 30)
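
The effect of the change above can be reproduced in isolation. The sketch below is not the project's code: embed_texts here is a hypothetical stand-in that holds a lock and sleeps for 0.2 s in place of a real embedding call, and heartbeat and run are illustrative names. It counts how often another coroutine gets to run while the embed is in flight, first with a direct call and then with asyncio.to_thread (Python 3.9+).

```python
import asyncio
import threading
import time

_lock = threading.Lock()


def embed_texts(texts):
    # Hypothetical stand-in for the real embed_texts: takes a lock and
    # blocks the calling thread, as a sync embedding call would.
    with _lock:
        time.sleep(0.2)
        return [[float(len(t))] for t in texts]


async def heartbeat(ticks):
    # Stands in for other work on the loop (SSE keepalives, other requests).
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def run(offload):
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat task start
    if offload:
        # Runs on the loop's default executor; the loop stays free.
        [embedding] = await asyncio.to_thread(embed_texts, ["query"])
    else:
        # Runs on the event loop thread; nothing else is scheduled meanwhile.
        [embedding] = embed_texts(["query"])
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return embedding, len(ticks)


if __name__ == "__main__":
    for offload in (False, True):
        embedding, n_ticks = asyncio.run(run(offload))
        print(f"offload={offload}: heartbeat ran {n_ticks} time(s)")
```

With the direct call the heartbeat is starved for the duration of the embed; with asyncio.to_thread it keeps ticking. Note that the lock still serializes concurrent embeds across executor threads, so this frees the loop but does not by itself raise embedding throughput.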