Fix RabbitMQ request/response race and chunker Flow API drift (#780)

Two unrelated regressions surfaced after the v2.2 queue class refactor. Bundled here because both are small and both block production. 1. Request/response race against ephemeral RabbitMQ response queues Commit feeb92b3 switched response/notify queues to per-subscriber auto-delete exclusive queues. That fixed orphaned-queue accumulation but introduced a setup race: Subscriber.start() created the run() task and returned immediately, while the underlying RabbitMQ consumer only declared and bound its queue lazily on the first receive() call. RequestResponse.request() therefore published the request before any queue was bound to the matching routing key, and the broker dropped the reply. Symptoms: "Failed to fetch config on notify" / "Request timeout exception" repeating roughly every 10s in api-gateway, document-embeddings and any other service exercising the config notify path. Fix: * Add ensure_connected() to the BackendConsumer protocol; implement it on RabbitMQBackendConsumer (calls _connect synchronously, declaring and binding the queue) and as a no-op on PulsarBackendConsumer (Pulsar's client.subscribe is already synchronous at construction). * Convert Subscriber's readiness signal from a non-existent Event to an asyncio.Future created in start(). run() calls consumer.ensure_connected() immediately after create_consumer() and sets _ready.set_result(None) on first successful bind. start() awaits the future via asyncio.wait so it returns only once the consumer is fully bound. Any reply published after start() returns is therefore guaranteed to land in a bound queue. * First-attempt connection failures call _ready.set_exception(e) and exit run() so start() unblocks with the error rather than hanging forever — the existing higher-level retry pattern in fetch_and_apply_config takes over from there. Runtime failures after a successful start still go through the existing retry-with-backoff path. * Update the two existing graceful-shutdown tests that monkey-patch Subscriber.run with a custom coroutine to honor the new contract by signalling _ready themselves. * Add tests/unit/test_base/test_subscriber_readiness.py with five regression tests pinning the readiness contract: ensure_connected must be called before start() returns; start() must block while ensure_connected runs (race-condition guard with a threading.Event gate); first-attempt create_consumer and ensure_connected failures must propagate to start() instead of hanging; ensure_connected must run before any receive() call. 2. Chunker Flow parameter lookup using the wrong attribute trustgraph-base/trustgraph/base/chunking_service.py was reading flow.parameters.get("chunk-size") and chunk-overlap, but the Flow class has no `parameters` attribute — parameter lookup is exposed through Flow.__call__ (flow("chunk-size") returns the resolved value or None). The exception was caught and logged as a WARNING, so chunking continued with the default sizes and any configured chunk-size / chunk-overlap was silently ignored: chunker - WARNING - Could not parse chunk-size parameter: 'Flow' object has no attribute 'parameters' The chunker tests didn't catch this because they constructed mock_flow = MagicMock() and configured mock_flow.parameters.get.side_effect = ..., which is the same phantom attribute MagicMock auto-creates on demand. Tests and production agreed on the wrong API. Fix: switch chunking_service.py to flow("chunk-size") / flow("chunk-overlap"). Update both chunker test files to mock the __call__ side_effect instead of the phantom parameters.get, merging parameter values into the existing flow() lookup the on_message tests already used for producer resolution.
2026-04-25 16:36:21 +02:00 · 2026-04-11 01:29:33 +01:00 · 2026-04-11 01:29:33 +01:00 · dfb6d26a56
commit dfb6d26a56
parent 7f5f2f955d
9 changed files with 329 additions and 24 deletions
--- a/tests/unit/test_chunking/test_token_chunker.py
+++ b/tests/unit/test_chunking/test_token_chunker.py
@ -70,11 +70,12 @@ class TestTokenChunkerSimple(IsolatedAsyncioTestCase):
        # Mock message and flow
        mock_message = MagicMock()
        mock_consumer = MagicMock()
+        # Flow exposes parameter lookup via __call__: flow("chunk-size")
        mock_flow = MagicMock()
-        mock_flow.parameters.get.side_effect = lambda param: {
+        mock_flow.side_effect = lambda key: {
            "chunk-size": 400,  # Override chunk size
            "chunk-overlap": None  # Use default chunk overlap
-        }.get(param)
+        }.get(key)

        # Act
        chunk_size, chunk_overlap = await processor.chunk_document(
@ -105,10 +106,10 @@ class TestTokenChunkerSimple(IsolatedAsyncioTestCase):
        mock_message = MagicMock()
        mock_consumer = MagicMock()
        mock_flow = MagicMock()
-        mock_flow.parameters.get.side_effect = lambda param: {
+        mock_flow.side_effect = lambda key: {
            "chunk-size": None,  # Use default chunk size
            "chunk-overlap": 25  # Override chunk overlap
-        }.get(param)
+        }.get(key)

        # Act
        chunk_size, chunk_overlap = await processor.chunk_document(
@ -139,10 +140,10 @@ class TestTokenChunkerSimple(IsolatedAsyncioTestCase):
        mock_message = MagicMock()
        mock_consumer = MagicMock()
        mock_flow = MagicMock()
-        mock_flow.parameters.get.side_effect = lambda param: {
+        mock_flow.side_effect = lambda key: {
            "chunk-size": 350,    # Override chunk size
            "chunk-overlap": 30   # Override chunk overlap
-        }.get(param)
+        }.get(key)

        # Act
        chunk_size, chunk_overlap = await processor.chunk_document(
@ -195,15 +196,15 @@ class TestTokenChunkerSimple(IsolatedAsyncioTestCase):
        mock_consumer = MagicMock()
        mock_producer = AsyncMock()
        mock_triples_producer = AsyncMock()
+        # Flow.__call__ resolves parameters and producers/consumers from the
+        # same dict — merge both kinds here.
        mock_flow = MagicMock()
-        mock_flow.parameters.get.side_effect = lambda param: {
+        mock_flow.side_effect = lambda key: {
            "chunk-size": 400,
            "chunk-overlap": 40,
-        }.get(param)
-        mock_flow.side_effect = lambda name: {
            "output": mock_producer,
            "triples": mock_triples_producer,
-        }.get(name)
+        }.get(key)

        # Act
        await processor.on_message(mock_message, mock_consumer, mock_flow)
@ -245,7 +246,7 @@ class TestTokenChunkerSimple(IsolatedAsyncioTestCase):
        mock_message = MagicMock()
        mock_consumer = MagicMock()
        mock_flow = MagicMock()
-        mock_flow.parameters.get.return_value = None  # No overrides
+        mock_flow.side_effect = lambda key: None  # No overrides

        # Act
        chunk_size, chunk_overlap = await processor.chunk_document(