mirror of https://github.com/trustgraph-ai/trustgraph.git
synced 2026-04-25 00:16:23 +02:00
RabbitMQ pub/sub backend with topic exchange architecture (#752)
Adds a RabbitMQ backend as an alternative to Pulsar, selectable via
PUBSUB_BACKEND=rabbitmq. Both backends implement the same PubSubBackend
protocol; no application code changes needed to switch.

RabbitMQ topology:
- Single topic exchange per topicspace (e.g. 'tg')
- Routing key derived from queue class and topic name
- Shared consumers: named queue bound to exchange (competing, round-robin)
- Exclusive consumers: anonymous auto-delete queue (broadcast, each gets
  every message). Used by Subscriber and config push consumer.
- Thread-local producer connections (pika is not thread-safe)
- Push-based consumption via basic_consume with process_data_events for
  heartbeat processing

Consumer model changes:
- Consumer class creates one backend consumer per concurrent task
  (required for pika thread safety, harmless for Pulsar)
- Consumer class accepts consumer_type parameter
- Subscriber passes consumer_type='exclusive' for broadcast semantics
- Config push consumer uses consumer_type='exclusive' so every processor
  instance receives config updates
- handle_one_from_queue receives consumer as parameter for correct
  per-connection ack/nack

LibrarianClient:
- New shared client class replacing duplicated librarian request-response
  code across 6+ services (chunking, decoders, RAG, etc.)
- Uses stream-document instead of get-document-content for fetching
  document content in 1MB chunks (avoids broker message size limits)
- Standalone object (self.librarian = LibrarianClient(...)) not a mixin
- get-document-content marked deprecated in schema and OpenAPI spec

Serialisation:
- Extracted dataclass_to_dict/dict_to_dataclass to shared serialization.py
  (used by both Pulsar and RabbitMQ backends)

Librarian queues:
- Changed from flow class (persistent) back to request/response class now
  that stream-document eliminates large single messages
- API upload chunk size reduced from 5MB to 3MB to stay under broker
  limits after base64 encoding

Factory and CLI:
- get_pubsub() handles 'rabbitmq' backend with RabbitMQ connection params
- add_pubsub_args() includes RabbitMQ options (host, port, credentials)
- add_pubsub_args(standalone=True) defaults to localhost for CLI tools
- init_trustgraph skips Pulsar admin setup for non-Pulsar backends
- tg-dump-queues and tg-monitor-prompts use backend abstraction
- BaseClient and ConfigClient accept generic pubsub config
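
The upload chunk size change noted in the commit message (5MB down to 3MB) follows from base64 overhead, which adds roughly one third to the payload. The sketch below only illustrates that arithmetic; it is not code from this commit. It assumes "MB" means MiB, and `base64_size` is a hypothetical helper name. The actual broker limit is not stated in the commit, so none is assumed here.

```python
import base64

MIB = 1024 * 1024  # assumption: "MB" in the commit message means MiB


def base64_size(raw_bytes: int) -> int:
    """Length in bytes of the standard padded base64 encoding of raw_bytes bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_bytes + 2) // 3)


# Sanity-check the formula against the real encoder on a small payload.
assert base64_size(10) == len(base64.b64encode(b"x" * 10))

for chunk_mib in (5, 3):
    encoded = base64_size(chunk_mib * MIB)
    print(f"{chunk_mib} MiB raw -> {encoded} bytes ({encoded / MIB:.2f} MiB) encoded")
```

Under these assumptions a 3 MiB chunk encodes to exactly 4 MiB, while a 5 MiB chunk grows to about 6.67 MiB before any message envelope is added.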
parent 4fb0b4d8e8
commit 24f0190ce7
36 changed files with 1277 additions and 1313 deletions
@@ -3,6 +3,9 @@ description: |
Librarian service request for document library management.

Operations: add-document, remove-document, list-documents,
get-document-metadata, stream-document, add-child-document,
list-children, begin-upload, upload-chunk, complete-upload,
abort-upload, get-upload-status, list-uploads,
start-processing, stop-processing, list-processing
required:
- operation
@@ -13,6 +16,17 @@ properties:
- add-document
- remove-document
- list-documents
- get-document-metadata
- get-document-content
- stream-document
- add-child-document
- list-children
- begin-upload
- upload-chunk
- complete-upload
- abort-upload
- get-upload-status
- list-uploads
- start-processing
- stop-processing
- list-processing
@@ -21,6 +35,21 @@ properties:
- `add-document`: Add document to library
- `remove-document`: Remove document from library
- `list-documents`: List documents in library
- `get-document-metadata`: Get document metadata
- `get-document-content`: Get full document content in a single response.
**Deprecated** — use `stream-document` instead. Fails for documents
exceeding the broker's max message size.
- `stream-document`: Stream document content in chunks. Each response
includes `chunk_index` and `is_final`. Preferred over `get-document-content`
for all document sizes.
- `add-child-document`: Add a child document (e.g. page, chunk)
- `list-children`: List child documents of a parent
- `begin-upload`: Start a chunked upload session
- `upload-chunk`: Upload a chunk of data
- `complete-upload`: Finalize a chunked upload
- `abort-upload`: Cancel a chunked upload
- `get-upload-status`: Check upload progress
- `list-uploads`: List active upload sessions
- `start-processing`: Start processing library documents
- `stop-processing`: Stop library processing
- `list-processing`: List processing status
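
The `stream-document` description in the diff says each response carries `chunk_index` and `is_final`. A client could use those two fields to reassemble a document roughly as sketched below. This is an illustration under stated assumptions, not the LibrarianClient implementation: the `content` field name, zero-based indexing, in-order delivery, and the `reassemble` helper are all hypothetical, and the responses are plain dicts rather than real broker messages.

```python
from typing import Iterable, Mapping


def reassemble(responses: Iterable[Mapping]) -> bytes:
    """Join stream-document responses into a single payload.

    `chunk_index` and `is_final` come from the spec description; the
    `content` field name and zero-based indexing are assumptions.
    """
    parts = []
    expected = 0
    for resp in responses:
        # Refuse gaps or reordering rather than silently corrupting the document.
        if resp["chunk_index"] != expected:
            raise ValueError(
                f"expected chunk {expected}, got {resp['chunk_index']}"
            )
        parts.append(resp["content"])
        expected += 1
        if resp["is_final"]:
            return b"".join(parts)
    raise ValueError("stream ended before a chunk with is_final set")


# Usage with stand-in responses (not real broker traffic).
fake_stream = [
    {"chunk_index": 0, "content": b"hello ", "is_final": False},
    {"chunk_index": 1, "content": b"world", "is_final": True},
]
document = reassemble(fake_stream)
```

Stopping at `is_final` instead of waiting for the stream to close means the caller never depends on a total chunk count, which the spec text shown here does not mention.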