Fix ontology RAG pipeline + add query concurrency (#691)

- Fix ontology RAG pipeline: embeddings API, chunker provenance, and query concurrency

- Fix ontology embeddings to use correct response shape from embed()
  API (returns list of vectors, not list of list of vectors).
- Simplify chunker URI logic to append /c{index} to parent ID
  instead of parsing page/doc URI structure which was fragile.

- Add provenance tracking and librarian integration to token
  chunker, matching recursive chunker capabilities.

- Add configurable concurrency (default 10) to Cassandra, Qdrant,
  and embeddings query services.
This commit is contained in:
cybermaggedon 2026-03-12 11:34:42 +00:00 committed by GitHub
parent 312174eb88
commit 45e6ad4abc
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
9 changed files with 148 additions and 50 deletions

View file

@ -176,6 +176,9 @@ class TestTokenChunkerSimple(IsolatedAsyncioTestCase):
processor = Processor(**config)
# Mock save_child_document to avoid librarian producer interactions
processor.save_child_document = AsyncMock(return_value="chunk-id")
# Mock message with TextDocument
mock_message = MagicMock()
mock_text_doc = MagicMock()
@ -191,11 +194,13 @@ class TestTokenChunkerSimple(IsolatedAsyncioTestCase):
# Mock consumer and flow with parameter overrides
mock_consumer = MagicMock()
mock_producer = AsyncMock()
mock_triples_producer = AsyncMock()
mock_flow = MagicMock()
mock_flow.side_effect = lambda param: {
"chunk-size": 400,
"chunk-overlap": 40,
"output": mock_producer
"output": mock_producer,
"triples": mock_triples_producer,
}.get(param)
# Act