Pub/sub abstraction: decouple from Pulsar (#751)

Remove Pulsar-specific concepts from application code so that
the pub/sub backend is swappable via configuration.

Rename translators:
- to_pulsar/from_pulsar → decode/encode across all translator
  classes, dispatch handlers, and tests (55+ files)
- from_response_with_completion → encode_with_completion
- Remove pulsar.schema.Record from translator base class

Queue naming (CLASS:TOPICSPACE:TOPIC):
- Replace topic() helper with queue() using new format:
  flow:tg:name, request:tg:name, response:tg:name, state:tg:name
- Queue class implies persistence/TTL (no QoS in names)
- Update Pulsar backend map_topic() to parse new format
- Librarian queues use flow class (persistent, for chunking)
- Config push uses state class (persistent, last-value)
- Remove 15 dead topic imports from schema files
- Update init_trustgraph.py namespace: config → state

Confine Pulsar to pulsar_backend.py:
- Delete legacy PulsarClient class from pubsub.py
- Move add_args to add_pubsub_args() with standalone flag
  for CLI tools (defaults to localhost)
- PulsarBackendConsumer.receive() catches _pulsar.Timeout,
  raises standard TimeoutError
- Remove Pulsar imports from: async_processor, flow_processor,
  log_level, all 11 client files, 4 storage writers, gateway
  service, gateway config receiver
- Remove log_level/LoggerLevel from client API
- Rewrite tg-monitor-prompts to use backend abstraction
- Update tg-dump-queues to use add_pubsub_args

Also: pubsub-abstraction.md tech spec covering problem statement,
design goals, as-is requirements, candidate broker assessment,
approach, and implementation order.
cybermaggedon 2026-04-01 20:16:53 +01:00 committed by GitHub
parent dbf8daa74a
commit 4fb0b4d8e8
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
106 changed files with 1269 additions and 788 deletions

# Pub/Sub Abstraction: Broker-Independent Messaging
## Problem
TrustGraph's messaging infrastructure is deeply coupled to Apache Pulsar in ways that go beyond the transport layer. This coupling creates several concrete problems.
### 1. Schema system is Pulsar-native
Every message type in the system is defined as a `pulsar.schema.Record` subclass using Pulsar field types (`String()`, `Integer()`, `Boolean()`, etc.). This means:
- The `pulsar` Python package is a build dependency for `trustgraph-base`, even though `trustgraph-base` contains no transport logic
- Any code that imports a message schema transitively depends on Pulsar
- The schema definitions cannot be reused with a different broker without the Pulsar library installed
- What's actually happening on the wire is JSON serialisation — the Pulsar schema machinery adds complexity without adding value over plain JSON encode/decode
### 2. Translators are named after the broker
The translator layer that converts between internal Python objects and wire format uses methods called `to_pulsar()` and `from_pulsar()`. These are really just JSON encode/decode operations — they have nothing to do with Pulsar specifically. The naming creates a false impression that the translation is broker-specific, when in reality any broker that carries JSON payloads would use identical logic.
### 3. Queue names use Pulsar URI format
Queue identifiers throughout the codebase use Pulsar's `persistent://tenant/namespace/topic` or `non-persistent://tenant/namespace/topic` URI format. These are hardcoded in schema definitions and referenced across services. RabbitMQ, Redis Streams, or any other broker would use completely different naming conventions. There is no abstraction between the logical identity of a queue and its broker-specific address.
### 4. Broker selection is not configurable
There is no mechanism to select a different pub/sub backend at deployment time. The Pulsar client is instantiated directly in the gateway and via `PulsarClient` in the base processor. Switching to a different broker would require code changes across multiple packages, not a configuration change.
### 5. Architectural requirements are implicit
TrustGraph relies on specific pub/sub behaviours — shared subscriptions for load balancing, message acknowledgement for reliability, message properties for correlation — but these requirements are not documented. This makes it difficult to evaluate whether a candidate broker (RabbitMQ, Redis Streams, NATS, etc.) actually satisfies the system's needs, or where the gaps would be.
## Design Goals
### Goal 1: Remove the link between Pulsar schemas and application code
Message types should be plain Python objects (dataclasses) that know how to serialise to and from JSON. The `pulsar.schema.Record` base class and Pulsar field types should not appear in schema definitions. The pub/sub transport layer sends and receives JSON bytes; the schema layer handles the mapping between JSON and typed Python objects independently.
### Goal 2: Remove `to_pulsar` / `from_pulsar` naming
The translator methods should reflect what they actually do: encode a Python object to a JSON-compatible dict, and decode a JSON-compatible dict back to a Python object. The naming should be broker-neutral (e.g. `encode` / `decode`, or `to_dict` / `from_dict`).
### Goal 3: Schema objects provide encode/decode
Each message type should be a Python dataclass (or similar) with a well-defined mapping to and from JSON. For example:
```python
from dataclasses import dataclass

@dataclass
class TextCompletionRequest:
    system: str
    prompt: str
    streaming: bool = False
```
Given `{"system": "You are helpful", "prompt": "Hello", "streaming": false}` on the wire, decoding produces an object where `request.system` is `"You are helpful"`, `request.prompt` is `"Hello"`, and `request.streaming` is `False`. Encoding does the reverse. This is the schema's concern, not the broker's.
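The round-trip can be sketched with stdlib `dataclasses`; the `encode` / `decode` helpers below are illustrative stand-ins, not the actual TrustGraph API:

```python
from dataclasses import dataclass, asdict, fields

@dataclass
class TextCompletionRequest:
    system: str
    prompt: str
    streaming: bool = False

def encode(obj) -> dict:
    # Dataclass -> JSON-compatible dict; this is all the schema layer does.
    return asdict(obj)

def decode(cls, data: dict):
    # JSON-compatible dict -> dataclass. Unknown keys are dropped and
    # dataclass defaults fill in missing optional fields.
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in known})

req = decode(TextCompletionRequest,
             {"system": "You are helpful", "prompt": "Hello", "streaming": False})
assert req.system == "You are helpful" and req.streaming is False
assert encode(req) == {"system": "You are helpful", "prompt": "Hello",
                       "streaming": False}
```

No broker concept appears anywhere in this round-trip, which is the point of the goal.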
### Goal 4: Abstract queue naming
Queue identifiers should not use Pulsar URI format (`persistent://tg/flow/topic`). A broker-neutral naming scheme is needed so that each backend can map logical queue names to its native format. The right approach here is not yet clear and needs to be worked through — considerations include how to express quality-of-service, multi-tenancy, and namespace separation without leaking broker concepts.
### Goal 5: Document pub/sub architectural requirements
TrustGraph's actual requirements from the pub/sub layer need to be formally specified. This includes:
- **Delivery semantics**: Which queues need at-least-once delivery? Are any fire-and-forget?
- **Consumer patterns**: Shared subscriptions (competing consumers for load balancing), exclusive subscriptions, fan-out/broadcast
- **Message acknowledgement**: Positive ack, negative ack (redelivery), timeout-based redelivery
- **Message properties**: Key-value metadata on messages used for correlation (e.g. request IDs, flow routing)
- **Ordering guarantees**: Per-topic ordering, per-key ordering, or no ordering required
- **Message size**: Typical and maximum message sizes (some payloads include base64-encoded documents)
- **Persistence**: Which messages must survive broker restarts
- **Consumer positioning**: Ability to consume from earliest (replay) vs latest (live tail)
- **Connection model**: Long-lived connections with reconnection, or transient
Documenting these requirements makes it possible to evaluate RabbitMQ or any other candidate against concrete criteria rather than discovering gaps during implementation.
## Pub/Sub Architectural Requirements (As-Is)
This section documents what TrustGraph currently needs from its pub/sub layer. These are the as-is requirements — some may be revisited or relaxed in a future design if it makes broker portability easier.
### Consumer model
All consumers use **shared subscriptions** (competing consumers). Multiple instances of the same processor read from the same subscription, and each message is delivered to exactly one instance. This is the load-balancing mechanism.
No exclusive or failover subscriptions are used anywhere in the codebase, despite infrastructure support for them.
Consumers support configurable concurrency — multiple async tasks within a single process can independently call `receive()` on the same subscription.
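As an illustration of that concurrency model, here is a stand-in consumer backed by `asyncio.Queue` (names are hypothetical; the real consumer delegates `receive()` to the backend abstraction). Each message is delivered to exactly one of the competing tasks:

```python
import asyncio

class FakeConsumer:
    """Stand-in for a shared-subscription consumer: each receive()
    delivers one message to exactly one caller."""
    def __init__(self, messages):
        self.queue = asyncio.Queue()
        for m in messages:
            self.queue.put_nowait(m)

    async def receive(self):
        return await self.queue.get()

async def worker(consumer, handled, n):
    # One of several concurrent handler tasks within a single process.
    for _ in range(n):
        msg = await consumer.receive()
        handled.append(msg)

async def main():
    consumer = FakeConsumer(range(6))
    handled = []
    # Three tasks share one subscription; no message is seen twice.
    await asyncio.gather(*(worker(consumer, handled, 2) for _ in range(3)))
    return handled

handled = asyncio.run(main())
assert sorted(handled) == [0, 1, 2, 3, 4, 5]
```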
### Delivery semantics
Almost all queues are **non-persistent / best-effort (q0)**. The only persistent queue is `config_push_queue` (q2, exactly-once), which pushes full configuration state to processors. Since config pushes are idempotent (full state, not deltas), the persistence requirement here is about surviving broker restarts, not about exactly-once semantics per se.
Flow processing queues (request/response pairs for LLM, RAG, agent, etc.) are all non-persistent. Messages in flight are lost on broker restart. This is acceptable because:
- Requests originate from a client that will time out and retry
- There is no durable work-in-progress that would be corrupted by message loss
- The system is designed for real-time query processing, not batch pipelines
### Message acknowledgement
**Positive acknowledgement**: After successful handler execution, the message is acknowledged. This removes it from the subscription.
**Negative acknowledgement**: On handler failure (unhandled exception or rate-limit timeout), the message is negatively acknowledged, which triggers redelivery by the broker. Rate-limited messages retry for up to 7200 seconds before giving up and negatively acknowledging.
**Orphaned messages**: In the request-response subscriber pattern, messages that arrive with no matching waiter (e.g. the requester timed out) are positively acknowledged and discarded. This prevents redelivery storms.
### Message properties
Messages carry a small set of key-value string properties as metadata, separate from the payload. The primary use is a `"id"` property for request-response correlation — the requester generates a unique ID, attaches it as a property, and the responder echoes it back so the subscriber can match responses to waiters.
Agent orchestration correlation (`correlation_id`, `parent_session_id`) is carried in the message payload, not in properties.
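A minimal sketch of the correlation mechanism, with hypothetical helper names (the real subscriber additionally acknowledges orphaned responses before discarding them):

```python
import uuid

def make_request(payload: dict):
    # Requester generates a unique ID and attaches it as a message
    # property, separate from the payload.
    correlation_id = str(uuid.uuid4())
    properties = {"id": correlation_id}
    return correlation_id, properties, payload

def dispatch_response(waiters: dict, properties: dict, payload: dict):
    # Subscriber matches the echoed "id" property to a waiter.
    # No matching waiter means the requester timed out: discard.
    waiter = waiters.pop(properties.get("id"), None)
    if waiter is None:
        return None
    return (waiter, payload)

cid, props, _ = make_request({"prompt": "Hello"})
waiters = {cid: "waiter-1"}
assert dispatch_response(waiters, {"id": cid}, {"answer": "Hi"}) == \
    ("waiter-1", {"answer": "Hi"})
assert dispatch_response(waiters, {"id": "unknown"}, {}) is None
```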
### Consumer positioning
Two modes are used:
- **Earliest**: The configuration consumer starts from the beginning of the topic to receive full configuration history on startup. This is the only use of earliest positioning.
- **Latest** (default): All flow consumers start from the current position, processing only new messages.
### Message ordering
**Not required.** The codebase explicitly does not depend on message ordering:
- Shared subscriptions distribute messages across consumers without ordering guarantees
- Concurrent handler tasks within a consumer process messages in arbitrary order
- Request-response correlation uses IDs, not positional ordering
- The supervisor fan-out/fan-in pattern collects results in a dictionary, order-independent
- Configuration pushes are full state snapshots, not ordered deltas
### Message sizes
Most messages are small JSON payloads (< 10KB). The exceptions:
- **Document content**: Large documents (PDFs, text files) can be sent through the chunking service with base64 encoding. Pulsar's chunking feature (`chunking_enabled`) handles automatic splitting of oversized messages.
- **Agent observations**: LLM-generated text can be several KB but rarely exceeds typical message size limits.
A replacement broker needs to either support large messages natively or provide a chunking/streaming mechanism. Alternatively, the large-document path could be refactored to use a side-channel (e.g. object store reference) instead of inline payload.
### Fan-out patterns
**Supervisor fan-out**: One supervisor request decomposes into N independent sub-agent requests, each emitted as a separate message on the agent request queue. Different agent instances pick them up via the shared subscription. A correlation ID links the completions back to the original decomposition. This is not pub/sub fan-out (one message to many consumers) — it's application-level fan-out (many messages to one queue).
**Request-response isolation**: Each client creates a unique subscription name on response queues so it only receives its own responses. This means the response queue effectively has many independent subscribers, each seeing a filtered subset of messages based on the `"id"` property match.
### Reconnection and resilience
Reconnection logic lives in the Consumer/Producer/Publisher/Subscriber classes, not in the broker client. These classes handle:
- Automatic reconnection on connection loss
- Retry loops with backoff
- Graceful shutdown (unsubscribe, close)
The broker client itself is expected to provide a basic connection that can fail, and the wrapper classes handle recovery. This is important for the abstraction — the backend interface can be simple because resilience is handled above it.
### Queue inventory
| Queue | Persistence | Purpose |
|-------|-------------|---------|
| config push | Persistent (q2) | Full configuration state broadcast |
| config request/response | Non-persistent | Configuration queries |
| flow request/response | Non-persistent | Flow management |
| knowledge request/response | Non-persistent | Knowledge graph operations |
| librarian request/response | Non-persistent | Document storage operations |
| document embeddings request/response | Non-persistent | Document vector queries |
| row embeddings request/response | Non-persistent | Row vector queries |
| collection request/response | Non-persistent | Collection management |
Additionally, each processing service (LLM, RAG, agent, prompt, embeddings, etc.) has dynamically defined request/response queue pairs configured at deployment time.
### Summary of hard requirements for a replacement broker
1. **Shared subscription / competing consumers** — multiple consumers on one queue, each message delivered to exactly one
2. **Message acknowledgement** — positive ack (remove from queue) and negative ack (trigger redelivery)
3. **Message properties** — key-value metadata on messages, at minimum a string `"id"` field
4. **Two consumer start positions** — from beginning of topic and from current position
5. **Persistence for at least one queue** — config state must survive broker restart
6. **Messages up to several MB** — or a chunking mechanism for large payloads
7. **No ordering requirement** — simplifies broker selection significantly
## Candidate Brokers
A quick assessment of alternatives against the hard requirements above.
### RabbitMQ
The primary candidate. Mature, widely deployed, well understood.
- **Competing consumers**: Yes — multiple consumers on a queue, round-robin delivery. This is RabbitMQ's native model.
- **Acknowledgement**: Yes — `basic.ack` and `basic.nack` with requeue flag.
- **Message properties**: Yes — headers and properties on every message. The `correlation_id` and `message_id` fields are first-class concepts.
- **Consumer positioning**: Yes, via RabbitMQ Streams (3.9+). Streams are append-only logs that support reading from any offset — beginning, end, or timestamp. Classic queues are consumed destructively (no replay), but streams solve this cleanly. The `state` queue class maps to a RabbitMQ stream. Additionally, the Last Value Cache Exchange plugin can retain the most recent message per routing key for new consumers.
- **Persistence**: Yes — durable queues and persistent messages survive broker restart.
- **Large messages**: No hard limit but not designed for very large payloads. Practical limit around 128MB with default config. Adequate for current use.
- **Ordering**: FIFO per queue (stronger than required).
- **Operational complexity**: Low. Single binary, no ZooKeeper/BookKeeper dependencies. Significantly simpler to operate than Pulsar.
- **Ecosystem**: Excellent client libraries, management UI, mature tooling.
**Gaps**: None significant. RabbitMQ Streams cover the replay/earliest positioning requirement.
### Apache Kafka
High-throughput distributed log. More infrastructure than TrustGraph likely needs.
- **Competing consumers**: Yes — consumer groups with partition assignment.
- **Acknowledgement**: Yes — offset commits. No per-message negative ack; failed messages require application-level retry or dead-letter handling.
- **Message properties**: Yes — message headers (key-value byte arrays).
- **Consumer positioning**: Yes — seek to earliest or latest offset. Supports full replay.
- **Persistence**: Yes — all messages are persisted to the log by default.
- **Large messages**: Configurable (`max.message.bytes`), default 1MB, can be increased. Large payloads are discouraged by design.
- **Ordering**: Per-partition ordering (stronger than required).
- **Operational complexity**: High. Requires ZooKeeper (or KRaft), partition management, replication config. Overkill for typical TrustGraph deployments.
- **Ecosystem**: Excellent client libraries, schema registry, Connect framework.
**Gaps**: No native negative acknowledgement. Operational complexity is high for small-to-medium deployments. Partition count must be planned upfront for parallelism.
### Redis Streams
Lightweight option using Redis as a message broker.
- **Competing consumers**: Yes — consumer groups with `XREADGROUP`.
- **Acknowledgement**: Yes — `XACK`. Pending entries list tracks unacknowledged messages. No explicit negative ack but unacknowledged messages can be claimed after timeout via `XAUTOCLAIM`.
- **Message properties**: No native separation between properties and payload. Would need to encode properties as fields within the stream entry or in the payload.
- **Consumer positioning**: Yes — `0` (earliest) or `$` (latest) on group creation.
- **Persistence**: Yes — Redis persistence (RDB/AOF), though Redis is primarily an in-memory system.
- **Large messages**: Practical limit tied to Redis memory. Not suited for large payloads.
- **Ordering**: Per-stream ordering (stronger than required).
- **Operational complexity**: Low if Redis is already in the stack. No additional infrastructure.
**Gaps**: No native message properties. Memory-bound. Persistence depends on Redis configuration. Not a natural fit for message broker patterns.
### NATS / NATS JetStream
Lightweight, high-performance messaging. JetStream adds persistence.
- **Competing consumers**: Yes — queue groups in core NATS; consumer groups in JetStream.
- **Acknowledgement**: JetStream only — `Ack`, `Nak` (with redelivery), `InProgress` (extend timeout).
- **Message properties**: Yes — message headers (key-value).
- **Consumer positioning**: JetStream — deliver all, deliver last, deliver new, deliver by sequence/time.
- **Persistence**: JetStream only. Core NATS is fire-and-forget.
- **Large messages**: Default 1MB, configurable up to 64MB.
- **Ordering**: Per-subject ordering.
- **Operational complexity**: Very low. Single binary, no dependencies. Clustering is straightforward.
**Gaps**: Requires JetStream for persistence and acknowledgement. Smaller ecosystem than RabbitMQ/Kafka.
### Assessment Summary
| Requirement | RabbitMQ | Kafka | Redis Streams | NATS JetStream |
|---|---|---|---|---|
| Competing consumers | Yes | Yes | Yes | Yes |
| Positive/negative ack | Yes | Partial | Partial | Yes |
| Message properties | Yes | Yes | No | Yes |
| Earliest positioning | Yes (Streams) | Yes | Yes | Yes |
| Persistence | Yes | Yes | Partial | Yes |
| Large messages | Yes | Configurable | No | Configurable |
| Operational simplicity | Good | Poor | Good | Good |
**RabbitMQ** is the strongest candidate given TrustGraph's requirements and deployment profile. The one requirement that needs care, earliest consumer positioning for the config queue, is covered by RabbitMQ Streams or by having the config service re-publish state on startup. Operational simplicity is a significant advantage over Pulsar.
## Approach
### Current state
The codebase has already undergone a partial abstraction. The picture is better than the problem statement might suggest:
- **Backend abstraction exists**: `backend.py` defines Protocol-based interfaces (`PubSubBackend`, `BackendProducer`, `BackendConsumer`, `Message`). The Pulsar implementation lives in `pulsar_backend.py`.
- **Schemas are already dataclasses**: Message types in `schema/services/*.py` are plain Python dataclasses with type hints, not Pulsar `Record` subclasses. This was the hardest part of the old spec and it's done.
- **Serialization is JSON-based**: `pulsar_backend.py` contains `dataclass_to_dict()` and `dict_to_dataclass()` helpers that handle the round-trip. The wire format is JSON.
- **Factory pattern exists**: `pubsub.py` has `get_pubsub()` which creates a backend from configuration. Currently only Pulsar is implemented.
- **Consumer/Producer/Publisher/Subscriber are backend-agnostic**: These classes accept a `backend` parameter and delegate transport operations to it. They own retry, reconnection, metrics, and concurrency.
What remains is cleanup, not a rewrite.
### What needs to change
#### 1. Rename translator methods
The translator base class (`messaging/translators/base.py`) defines `to_pulsar()` and `from_pulsar()` as abstract methods. Every translator implements these. The methods convert between external API dicts and internal dataclass objects — nothing Pulsar-specific happens in them.
**Change**: Rename to `decode()` (external dict → dataclass) and `encode()` (dataclass → external dict). Update all translator subclasses and all call sites.
This is a mechanical rename. The method bodies don't change.
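An abbreviated, illustrative sketch of the renamed base class with a trivial subclass (the real class in `messaging/translators/base.py` carries more machinery):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

class MessageTranslator(ABC):
    """Translates between external API dicts and internal schema
    objects. Nothing broker-specific happens here."""

    @abstractmethod
    def decode(self, data: dict):
        """External JSON-compatible dict -> internal dataclass."""

    @abstractmethod
    def encode(self, obj) -> dict:
        """Internal dataclass -> external JSON-compatible dict."""

# Hypothetical message type and translator, for illustration only.
@dataclass
class Ping:
    text: str

class PingTranslator(MessageTranslator):
    def decode(self, data: dict) -> Ping:
        return Ping(text=data["text"])

    def encode(self, obj: Ping) -> dict:
        return {"text": obj.text}

t = PingTranslator()
assert t.encode(t.decode({"text": "hi"})) == {"text": "hi"}
```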
#### 2. Rename translator base classes
The base classes `Translator`, `MessageTranslator`, and `SendTranslator` reference "pulsar" in docstrings and parameter names. Clean these up so the naming reflects what the layer actually does: translating between the external API representation (JSON dicts from HTTP/WebSocket) and the internal schema (dataclasses).
#### 3. Move serialization out of the Pulsar backend
`dataclass_to_dict()` and `dict_to_dataclass()` currently live in `pulsar_backend.py` but are not Pulsar-specific. They handle the conversion between dataclasses and JSON-compatible dicts, which every backend needs.
**Change**: Move these to a shared location (e.g. `trustgraph/base/serialization.py` or alongside the schema definitions). The backend interface sends and receives dicts; serialization to/from dataclasses happens at a layer above.
This means the backend Protocol simplifies: `send()` accepts a dict and properties, `value()` returns a dict. The Consumer/Producer layer handles dataclass ↔ dict conversion using the shared serializers.
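A simplified sketch of what these helpers do (the real implementations also handle `Optional`, lists, and maps of dataclasses; the field logic here is an assumption for illustration):

```python
from dataclasses import dataclass, asdict, fields, is_dataclass
from typing import Any

def dataclass_to_dict(obj: Any) -> dict:
    # asdict already recurses into nested dataclasses, lists, and dicts.
    return asdict(obj)

def dict_to_dataclass(cls, data: dict):
    # Recurse into fields whose declared type is itself a dataclass.
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            continue
        value = data[f.name]
        if is_dataclass(f.type) and isinstance(value, dict):
            value = dict_to_dataclass(f.type, value)
        kwargs[f.name] = value
    return cls(**kwargs)

# Hypothetical nested schema, for illustration.
@dataclass
class Inner:
    n: int

@dataclass
class Outer:
    name: str
    inner: Inner

o = dict_to_dataclass(Outer, {"name": "x", "inner": {"n": 1}})
assert o.inner.n == 1
assert dataclass_to_dict(o) == {"name": "x", "inner": {"n": 1}}
```

Because these helpers only touch `dataclasses` and plain dicts, every backend can share them.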
#### 4. Abstract queue naming
Queue names currently use the format `q0/tg/flow/queue-name` or `q2/tg/config/queue-name`, which the Pulsar backend maps to `non-persistent://tg/flow/queue-name` or `persistent://tg/config/queue-name`.
This is an open design question. Options:
**Option A: Simple string names.** Queues are just strings like `"text-completion-request"`. The backend is responsible for mapping to its native format (Pulsar adds `persistent://tg/flow/` prefix, RabbitMQ uses the string as-is or adds a vhost prefix). Persistence and namespace are configuration concerns, not embedded in the name.
**Option B: Structured queue descriptor.** A small object that carries the logical name plus metadata:
```python
from dataclasses import dataclass

@dataclass
class QueueDescriptor:
    name: str                 # e.g. "text-completion-request"
    namespace: str = "flow"   # logical grouping
    persistent: bool = False  # must survive broker restart
```
The backend maps this to its native format.
**Option C: Keep the current format** (`q0/tg/flow/name`) but document it as a TrustGraph convention, not a Pulsar convention. Backends parse it.
Option B is the most explicit. Option A is the simplest. Either is workable. The key constraint is that persistence is a property of the queue definition, not a runtime choice — the config push queue is persistent, everything else is not.
#### 5. Implement RabbitMQ backend
Write `rabbitmq_backend.py` implementing the `PubSubBackend` Protocol:
- **`create_producer()`**: Creates a channel and declares the target queue. `send()` publishes to the default exchange with the queue name as routing key. Properties map to AMQP basic properties (specifically `message_id` for the `"id"` property).
- **`create_consumer()`**: Declares the queue and starts consuming with `basic_consume`. Shared subscription is the default RabbitMQ model — multiple consumers on one queue get round-robin delivery. `acknowledge()` maps to `basic_ack`, `negative_acknowledge()` maps to `basic_nack` with `requeue=True`.
- **Persistence**: For persistent queues, declare as durable with `delivery_mode=2` on messages. For non-persistent queues, declare as non-durable.
- **Consumer positioning**: RabbitMQ queues are consumed destructively, so "earliest" doesn't apply in the Pulsar sense. For the config push use case, use a **fanout exchange with per-consumer exclusive queues** — each new processor gets its own queue that receives all config publishes, plus the last-value can be handled by having the config service re-publish on startup.
- **Large messages**: RabbitMQ handles messages up to `rabbit.max_message_size` (default 128MB). No chunking needed.
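The persistence rule can be sketched as a pure-Python mapping from a queue definition to the AMQP durability settings the backend would apply (illustrative only; the real backend would feed these into the `pika` declare and publish calls):

```python
def amqp_queue_settings(name: str, persistent: bool) -> dict:
    """Map a TrustGraph queue definition to AMQP settings: 'durable'
    goes to queue_declare, 'delivery_mode' to message properties.
    Hypothetical helper, for illustration."""
    return {
        "queue": name,
        "durable": persistent,                    # queue survives restart
        "delivery_mode": 2 if persistent else 1,  # persistent vs transient
    }

assert amqp_queue_settings("config-push", True) == {
    "queue": "config-push", "durable": True, "delivery_mode": 2,
}
assert amqp_queue_settings("text-completion-request", False)["durable"] is False
```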
The factory in `pubsub.py` gets a new branch:
```python
if backend_type == 'rabbitmq':
    return RabbitMQBackend(
        host=config.get('rabbitmq_host'),
        port=config.get('rabbitmq_port'),
        username=config.get('rabbitmq_username'),
        password=config.get('rabbitmq_password'),
        vhost=config.get('rabbitmq_vhost', '/'),
    )
```
Backend selection via `PUBSUB_BACKEND=rabbitmq` environment variable or `--pubsub-backend rabbitmq` CLI flag.
#### 6. Clean up remaining Pulsar references
After the above changes, Pulsar-specific code should be confined to:
- `pulsar_backend.py` — the Pulsar implementation
- `pubsub.py` — the factory that imports it
Audit and remove any remaining Pulsar imports, Pulsar exception handling, or Pulsar-specific concepts from:
- `async_processor.py` (currently catches `_pulsar.Interrupted`)
- `consumer.py`, `subscriber.py` (if any Pulsar exceptions leak through)
- Schema files (should be clean already, but verify)
- Gateway service (currently instantiates Pulsar client directly)
The gateway is a special case — it currently bypasses the abstraction layer and creates a Pulsar client directly for dispatching API requests. It should use the same `get_pubsub()` factory as everything else.
### What stays the same
- **Schema definitions**: Already dataclasses. No changes needed.
- **Consumer/Producer/Publisher/Subscriber**: Already backend-agnostic. No changes to their core logic.
- **FlowProcessor and spec wiring**: Already uses `processor.pubsub` to create backend instances. No changes.
- **Backend Protocol**: The interface in `backend.py` is sound. Minor refinement possible (dict vs dataclass at the boundary) but the shape is right.
### Concrete cleanups
The following files have Pulsar-specific imports that should not be there after the abstraction is complete. Pulsar imports should be confined to `pulsar_backend.py` and the factory in `pubsub.py`.
**Dead imports (unused, can just be removed):**
- `trustgraph-base/trustgraph/base/pubsub.py` — `from pulsar.schema import JsonSchema`, `import pulsar`, `import _pulsar`. The `JsonSchema` import is unused since the switch to `BytesSchema`. The `pulsar`/`_pulsar` imports are only used by the legacy `PulsarClient` class which should be removed (superseded by `PulsarBackend`).
- `trustgraph-base/trustgraph/base/flow_processor.py` — `from pulsar.schema import JsonSchema`. Unused.
**Legacy `PulsarClient` class:**
- `trustgraph-base/trustgraph/base/pubsub.py` — The `PulsarClient` class is a leftover from before the backend abstraction. `get_pubsub()` still references `PulsarClient.default_pulsar_host` for defaults. Move the defaults to `PulsarBackend` or to environment variable reads in the factory, then delete `PulsarClient`.
**Client libraries using Pulsar directly:**
- `trustgraph-base/trustgraph/clients/base.py` — `import pulsar`, `import _pulsar`, `from pulsar.schema import JsonSchema`. This is the base class for the old synchronous client library. These clients predate the backend abstraction and use Pulsar directly.
- `trustgraph-base/trustgraph/clients/embeddings_client.py` — `from pulsar.schema import JsonSchema`, `import _pulsar`.
- `trustgraph-base/trustgraph/clients/*.py` (agent, config, document_embeddings, document_rag, graph_embeddings, graph_rag, llm, prompt, row_embeddings, triples_query) — all import `_pulsar` for exception handling.
These clients are the internal request-response clients used by processors. They need to be migrated to use the backend abstraction or their Pulsar exception handling needs to be wrapped behind a backend-agnostic exception type.
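The wrapping pattern is straightforward: the backend catches broker-specific exceptions at its boundary and re-raises standard ones, so client code never imports `_pulsar`. A sketch, using a hypothetical broker exception as a stand-in:

```python
class BrokerTimeout(Exception):
    """Stand-in for a broker-specific timeout (e.g. _pulsar.Timeout)."""

def broker_receive():
    # Stand-in for the broker client's blocking receive call.
    raise BrokerTimeout()

def receive(timeout_ms: int = 1000):
    # Backend boundary: translate broker exceptions into standard
    # Python exceptions that callers can catch without importing
    # anything broker-specific.
    try:
        return broker_receive()
    except BrokerTimeout as e:
        raise TimeoutError("no message received") from e

caught = False
try:
    receive()
except TimeoutError:
    caught = True
assert caught
```

Callers then catch plain `TimeoutError`, and swapping the backend never touches their exception handling.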
**Translator base class:**
- `trustgraph-base/trustgraph/messaging/translators/base.py` — `from pulsar.schema import Record`. Used in type hints. Should be removed when `to_pulsar`/`from_pulsar` are renamed.
**Gateway service (bypasses abstraction):**
- `trustgraph-flow/trustgraph/gateway/service.py` — `import pulsar`. Creates a Pulsar client directly.
- `trustgraph-flow/trustgraph/gateway/config/receiver.py` — `import pulsar`. Direct Pulsar usage.
The gateway should use `get_pubsub()` like everything else.
**Storage writers:**
- `trustgraph-flow/trustgraph/storage/triples/neo4j/write.py` — `import pulsar`
- `trustgraph-flow/trustgraph/storage/triples/memgraph/write.py` — `import pulsar`
- `trustgraph-flow/trustgraph/storage/triples/falkordb/write.py` — `import pulsar`
- `trustgraph-flow/trustgraph/storage/triples/cassandra/write.py` — `import pulsar`
These need investigation — likely Pulsar exception handling or direct client usage that should go through the abstraction.
**Log level:**
- `trustgraph-base/trustgraph/log_level.py` — `import _pulsar`. Used to set Pulsar's log level. Should be moved into `pulsar_backend.py`.
### Queue naming
The current scheme encodes QoS, tenant, namespace, and queue name into a slash-separated string (`q0/tg/request/config`) which the Pulsar backend parses and maps to a Pulsar URI (`non-persistent://tg/request/config`). This was an attempt at abstraction but it has problems:
- QoS in the name was a mistake — it's a property of the queue definition, not something that belongs in the name. A queue is either persistent or it isn't; that's decided once when the queue is defined.
- The tenant/namespace structure mirrors Pulsar's model. RabbitMQ doesn't use this — it has vhosts and exchange/queue names. Pretending the naming isn't TrustGraph-specific just leaks Pulsar concepts.
- The `topic()` helper generates these strings, and the backend parses them apart. This is unnecessary indirection.
There are two categories of queue in TrustGraph:
**Infrastructure queues** — defined in code, used for system services. These are fixed and well-known:
| Queue | Persistent | Purpose |
|-------|------------|---------|
| `config-request` | No | Config queries |
| `config-response` | No | Config query responses |
| `config-push` | Yes | Config state broadcast |
| `flow-request` | No | Flow management queries |
| `flow-response` | No | Flow management responses |
| `librarian-request` | No | Document storage operations |
| `librarian-response` | No | Document storage responses |
| `knowledge-request` | No | Knowledge graph operations |
| `knowledge-response` | No | Knowledge graph responses |
| `document-embeddings-request` | No | Document vector queries |
| `document-embeddings-response` | No | Document vector responses |
| `row-embeddings-request` | No | Row vector queries |
| `row-embeddings-response` | No | Row vector responses |
| `collection-request` | No | Collection management |
| `collection-response` | No | Collection management responses |
**Flow queues** — defined in configuration, created dynamically per flow. The queue names come from the config service (e.g. `text-completion-request`, `graph-rag-request`, `agent-request`). Each flow instance has its own set of these queues.
For infrastructure queues, the name is just a string. Persistence is a property of the queue definition, not encoded in the name. The backend maps the name to whatever its native format requires.
For flow queues, the name comes from configuration. The config service already distributes queue names as strings — the backend just needs to be able to use them.
#### Proposed scheme: CLASS:TOPICSPACE:TOPIC
A queue name has three parts separated by colons:
- **CLASS** — a small enum that defines the queue's operational characteristics. The backend knows what each class means in terms of persistence, TTL, memory limits, etc. There are only four classes:
| Class | Persistent | TTL | Behaviour |
|-------|------------|-----|-----------|
| `flow` | Yes | Long | Processing pipeline queues. Messages survive broker restart. |
| `request` | No | Short | Transient request-response. Low TTL, no persistence needed — clients retry on failure. |
| `response` | No | Short | Same as request, for the response side. |
| `state` | Yes | Retained | Last-value state broadcast. Consumers need the most recent value on startup, plus any future updates. Config push is the primary example. |
- **TOPICSPACE** — deployment isolation. Keeps different TrustGraph deployments separate when sharing the same pub/sub infrastructure. Most deployments just use `tg`. Avoids the overloaded terms "tenant" and "namespace".
- **TOPIC** — the logical queue identity. What the queue is for.
**Examples:**
```
flow:tg:text-completion-request
flow:tg:graph-rag-request
flow:tg:agent-request
request:tg:librarian
response:tg:librarian
request:tg:config
response:tg:config
state:tg:config
request:tg:flow
response:tg:flow
```
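Parsing and constructing these names takes only a few lines. A minimal sketch under assumed names (`queue`, `parse_queue`, `QueueSpec` are illustrative, not the actual TrustGraph API):

```python
# Hypothetical sketch of the CLASS:TOPICSPACE:TOPIC convention.
from dataclasses import dataclass

VALID_CLASSES = {"flow", "request", "response", "state"}

@dataclass(frozen=True)
class QueueSpec:
    klass: str       # one of: flow, request, response, state
    topicspace: str  # deployment isolation, usually "tg"
    topic: str       # logical queue identity

def parse_queue(name: str) -> QueueSpec:
    # Split on the first two colons only, so the topic part is unrestricted
    klass, topicspace, topic = name.split(":", 2)
    if klass not in VALID_CLASSES:
        raise ValueError(f"unknown queue class: {klass!r}")
    return QueueSpec(klass, topicspace, topic)

def queue(klass: str, topic: str, topicspace: str = "tg") -> str:
    # Inverse of parse_queue: build a queue name from its parts
    return f"{klass}:{topicspace}:{topic}"
```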
**Backend mapping:**
Each backend parses the three parts and maps them to its native concepts:
- **Pulsar**: `flow:tg:text-completion-request``persistent://tg/flow/text-completion-request`. Class maps to persistent/non-persistent and namespace. State class uses persistent topic with earliest consumer positioning.
- **RabbitMQ**: Topicspace maps to vhost. Class determines queue durability and TTL policy. State class uses a last-value queue (via plugin) or a fanout exchange pattern where each consumer gets the retained state on connect.
- **Kafka**: `flow.tg.text-completion-request` as topic name. Class determines retention and compaction policy. State class maps to a compacted topic (last value per key).
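As an illustration of the Pulsar mapping above, a `map_topic`-style function might look like this (a sketch only; the real backend implementation may differ):

```python
# Assumed class-to-persistence mapping, per the class table above
PULSAR_PERSISTENT = {"flow": True, "request": False, "response": False, "state": True}

def map_topic(name: str) -> str:
    # CLASS:TOPICSPACE:TOPIC -> scheme://tenant/namespace/topic
    klass, topicspace, topic = name.split(":", 2)
    scheme = "persistent" if PULSAR_PERSISTENT[klass] else "non-persistent"
    # Topicspace becomes the Pulsar tenant, class becomes the namespace
    return f"{scheme}://{topicspace}/{klass}/{topic}"
```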
**Why this works:**
- The class enum is small and stable — adding a new class is rare and deliberate
- Queue properties (persistence, TTL) are implied by class, not encoded in the name
- Dynamic registration works naturally — the config service publishes `flow:tg:text-completion-request` and the backend knows how to declare it from the `flow` class
- The colon separator is unambiguous, easy to split, doesn't conflict with URIs or path separators that backends use internally
- No pretence of being generic — this is a TrustGraph convention, and that's fine
### Serialization boundary
**Decision: the backend owns the wire format.**
The contract between the Consumer/Producer layer and the backend is dataclass objects in, dataclass objects out:
- `send()` accepts a dataclass instance and properties dict
- `receive()` returns a message whose `value()` is a dataclass instance
What happens on the wire is the backend's concern. The Pulsar backend uses JSON (via `dataclass_to_dict` / `dict_to_dataclass`). A RabbitMQ backend would likely also use JSON. A future backend could use Protobuf, MessagePack, or Avro if the broker benefits from it.
The serialization helpers stay inside the backend that uses them — they are not shared infrastructure. Each backend brings its own serialization strategy. The Consumer/Producer layer never thinks about wire format.
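A sketch of what "the backend owns the wire format" means for a JSON backend. The helper names echo the `dataclass_to_dict` / `dict_to_dataclass` helpers mentioned above, but the signatures and the producer class here are assumptions:

```python
import dataclasses
import json

def dataclass_to_dict(obj):
    # Recursive dataclass -> plain dict, suitable for JSON
    return dataclasses.asdict(obj)

def dict_to_dataclass(cls, data):
    # Plain dict -> dataclass instance of the expected type
    return cls(**data)

class JsonWireBackendProducer:
    """Accepts dataclasses; JSON only exists inside the backend."""

    def __init__(self, raw_send):
        self.raw_send = raw_send  # broker-specific byte-level send

    def send(self, msg, properties=None):
        payload = json.dumps(dataclass_to_dict(msg)).encode("utf-8")
        self.raw_send(payload, properties or {})
```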
### Gateway service
**Decision: the gateway uses the backend abstraction like any other component.**
The gateway currently bridges WebSocket/REST to Pulsar directly, bypassing the abstraction layer. It translates incoming API JSON to Pulsar schema objects, sends them, receives responses as Pulsar schema objects, and translates back to API JSON. Since the wire format is JSON in both directions, this is effectively a no-op round trip through the schema machinery.
With the backend abstraction, the gateway follows the same pattern as every other component:
1. Incoming API JSON → translator `decode()` → dataclass
2. Dataclass → backend `send()` (backend handles wire format)
3. Backend `receive()` → dataclass
4. Dataclass → translator `encode()` → API JSON → WebSocket/REST client
This is architecturally simple — one code path, no special cases. The gateway depends on the schema dataclasses and the translator layer, which it already does. The overhead of deserialize-then-reserialize is negligible for the message sizes involved. And it keeps all options open — if a future backend uses a non-JSON wire format, the gateway still works without changes.
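The four steps can be sketched as a single handler. The translator, producer, and consumer objects are stand-ins for the real gateway components, not the actual API:

```python
async def handle_api_request(api_json, request_translator,
                             response_translator, producer, consumer):
    request = request_translator.decode(api_json)   # 1: API JSON -> dataclass
    await producer.send(request)                    # 2: backend owns wire format
    message = await consumer.receive()              # 3: dataclass back out
    response = message.value()
    return response_translator.encode(response)     # 4: dataclass -> API JSON
```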
## Implementation Order
### Phase 1: Rename translators
Rename `to_pulsar()``decode()`, `from_pulsar()``encode()` across all translator classes and call sites. Remove `from pulsar.schema import Record` from the translator base class. Mechanical find-and-replace, no behavioural changes.
### Phase 2: Queue naming
Replace the `topic()` helper with the CLASS:TOPICSPACE:TOPIC scheme. Update all queue definitions in `schema/services/*.py` and `schema/knowledge/*.py`. Update `PulsarBackend.map_topic()` to parse the new format. Verify all existing functionality still works with Pulsar.
### Phase 3: Clean up Pulsar leaks
Work through the concrete cleanups list: remove dead imports, delete the legacy `PulsarClient` class, migrate the client libraries and gateway to use the backend abstraction. After this phase, `pulsar` imports exist only in `pulsar_backend.py`.
### Phase 4: RabbitMQ backend
Implement `rabbitmq_backend.py` against the existing `PubSubBackend` Protocol. Map queue classes to RabbitMQ concepts: `flow` → durable queues, `request`/`response` → non-durable queues with TTL, `state` → RabbitMQ streams. Add `rabbitmq` as a backend option in the factory. Test end-to-end with `PUBSUB_BACKEND=rabbitmq`.
Phases 1-3 are safe to do on main — they don't change behaviour, just clean up. Phase 4 is additive — it adds a new backend without touching the existing one.
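The Phase 4 class mapping could be expressed as a small lookup. This is a sketch only: the TTL value and the helper name are illustrative, not settled design.

```python
def rabbitmq_declare_args(klass: str) -> dict:
    if klass == "flow":
        # Durable pipeline queues: messages survive broker restart
        return {"durable": True, "arguments": {}}
    if klass in ("request", "response"):
        # Transient request/response: short per-message TTL, no durability
        return {"durable": False, "arguments": {"x-message-ttl": 30_000}}
    if klass == "state":
        # Last-value state broadcast mapped to a RabbitMQ stream
        return {"durable": True, "arguments": {"x-queue-type": "stream"}}
    raise ValueError(f"unknown queue class: {klass!r}")
```

`x-message-ttl` and `x-queue-type` are standard RabbitMQ queue arguments; the specific TTL chosen here is arbitrary.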
### Config distribution on RabbitMQ
The `state` queue class needs "start from earliest" semantics — a newly started processor must receive the current configuration state.
RabbitMQ Streams (available since 3.9) solve this directly. Streams are persistent, append-only logs that support consumer offset positioning. The RabbitMQ backend maps the `state` class to a stream, and consumers attach with offset `first` to read from the beginning, or `last` to read the most recent entry plus future updates.
Since config pushes are full state snapshots (not deltas), a consumer only needs the most recent entry. The RabbitMQ backend can use `last` offset positioning for `state` class consumers, which delivers the last message in the stream followed by any new messages. This matches the current behaviour where processors read config on startup and then react to updates.
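A sketch of last-value config consumption over a stream. The channel is duck-typed against pika's `BlockingChannel`; the queue name and callback wiring are illustrative, while `x-queue-type=stream` and the `x-stream-offset` consumer argument are real RabbitMQ stream features:

```python
def consume_config_state(channel, on_config, queue="state:tg:config"):
    # Streams require a prefetch limit and manual acknowledgement
    channel.basic_qos(prefetch_count=100)
    channel.queue_declare(
        queue=queue,
        durable=True,
        arguments={"x-queue-type": "stream"},
    )

    def callback(ch, method, properties, body):
        on_config(body)  # each config push is a full state snapshot
        ch.basic_ack(delivery_tag=method.delivery_tag)

    # "last" delivers the newest stream entry, then all future updates;
    # "first" would replay the whole log instead
    channel.basic_consume(
        queue=queue,
        on_message_callback=callback,
        arguments={"x-stream-offset": "last"},
    )
```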


```diff
@@ -38,7 +38,7 @@ class TestDocumentEmbeddingsRequestContract:
         assert request.user == "test_user"
         assert request.collection == "test_collection"
 
-    def test_request_translator_to_pulsar(self):
+    def test_request_translator_decode(self):
         """Test request translator converts dict to Pulsar schema"""
         translator = DocumentEmbeddingsRequestTranslator()
@@ -49,7 +49,7 @@ class TestDocumentEmbeddingsRequestContract:
             "collection": "custom_collection"
         }
 
-        result = translator.to_pulsar(data)
+        result = translator.decode(data)
 
         assert isinstance(result, DocumentEmbeddingsRequest)
         assert result.vector == [0.1, 0.2, 0.3, 0.4]
@@ -57,7 +57,7 @@ class TestDocumentEmbeddingsRequestContract:
         assert result.user == "custom_user"
         assert result.collection == "custom_collection"
 
-    def test_request_translator_to_pulsar_with_defaults(self):
+    def test_request_translator_decode_with_defaults(self):
         """Test request translator uses correct defaults"""
         translator = DocumentEmbeddingsRequestTranslator()
@@ -66,7 +66,7 @@ class TestDocumentEmbeddingsRequestContract:
             # No limit, user, or collection provided
         }
 
-        result = translator.to_pulsar(data)
+        result = translator.decode(data)
 
         assert isinstance(result, DocumentEmbeddingsRequest)
         assert result.vector == [0.1, 0.2]
@@ -74,7 +74,7 @@ class TestDocumentEmbeddingsRequestContract:
         assert result.user == "trustgraph"  # Default
         assert result.collection == "default"  # Default
 
-    def test_request_translator_from_pulsar(self):
+    def test_request_translator_encode(self):
         """Test request translator converts Pulsar schema to dict"""
         translator = DocumentEmbeddingsRequestTranslator()
@@ -85,7 +85,7 @@ class TestDocumentEmbeddingsRequestContract:
             collection="test_collection"
         )
 
-        result = translator.from_pulsar(request)
+        result = translator.encode(request)
 
         assert isinstance(result, dict)
         assert result["vector"] == [0.5, 0.6]
@@ -134,7 +134,7 @@ class TestDocumentEmbeddingsResponseContract:
         assert response.error == error
         assert response.chunks == []
 
-    def test_response_translator_from_pulsar_with_chunks(self):
+    def test_response_translator_encode_with_chunks(self):
         """Test response translator converts Pulsar schema with chunks to dict"""
         translator = DocumentEmbeddingsResponseTranslator()
@@ -147,7 +147,7 @@ class TestDocumentEmbeddingsResponseContract:
             ]
         )
 
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         assert isinstance(result, dict)
         assert "chunks" in result
@@ -155,7 +155,7 @@ class TestDocumentEmbeddingsResponseContract:
         assert result["chunks"][0]["chunk_id"] == "doc1/c1"
         assert result["chunks"][0]["score"] == 0.95
 
-    def test_response_translator_from_pulsar_with_empty_chunks(self):
+    def test_response_translator_encode_with_empty_chunks(self):
         """Test response translator handles empty chunks list"""
         translator = DocumentEmbeddingsResponseTranslator()
@@ -164,25 +164,25 @@ class TestDocumentEmbeddingsResponseContract:
             chunks=[]
         )
 
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         assert isinstance(result, dict)
         assert "chunks" in result
         assert result["chunks"] == []
 
-    def test_response_translator_from_pulsar_with_none_chunks(self):
+    def test_response_translator_encode_with_none_chunks(self):
         """Test response translator handles None chunks"""
         translator = DocumentEmbeddingsResponseTranslator()
 
         response = MagicMock()
         response.chunks = None
 
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         assert isinstance(result, dict)
         assert "chunks" not in result or result.get("chunks") is None
 
-    def test_response_translator_from_response_with_completion(self):
+    def test_response_translator_encode_with_completion(self):
         """Test response translator with completion flag"""
         translator = DocumentEmbeddingsResponseTranslator()
@@ -194,7 +194,7 @@ class TestDocumentEmbeddingsResponseContract:
             ]
         )
 
-        result, is_final = translator.from_response_with_completion(response)
+        result, is_final = translator.encode_with_completion(response)
 
         assert isinstance(result, dict)
         assert "chunks" in result
@@ -202,12 +202,12 @@ class TestDocumentEmbeddingsResponseContract:
         assert result["chunks"][0]["chunk_id"] == "chunk1"
         assert is_final is True  # Document embeddings responses are always final
 
-    def test_response_translator_to_pulsar_not_implemented(self):
-        """Test that to_pulsar raises NotImplementedError for responses"""
+    def test_response_translator_decode_not_implemented(self):
+        """Test that decode raises NotImplementedError for responses"""
         translator = DocumentEmbeddingsResponseTranslator()
 
         with pytest.raises(NotImplementedError):
-            translator.to_pulsar({"chunks": [{"chunk_id": "test", "score": 0.9}]})
+            translator.decode({"chunks": [{"chunk_id": "test", "score": 0.9}]})
 
 
 class TestDocumentEmbeddingsMessageCompatibility:
@@ -225,7 +225,7 @@ class TestDocumentEmbeddingsMessageCompatibility:
         # Convert to Pulsar request
         req_translator = DocumentEmbeddingsRequestTranslator()
-        pulsar_request = req_translator.to_pulsar(request_data)
+        pulsar_request = req_translator.decode(request_data)
 
         # Simulate service processing and creating response
         response = DocumentEmbeddingsResponse(
@@ -238,7 +238,7 @@ class TestDocumentEmbeddingsMessageCompatibility:
         # Convert response back to dict
         resp_translator = DocumentEmbeddingsResponseTranslator()
-        response_data = resp_translator.from_pulsar(response)
+        response_data = resp_translator.encode(response)
 
         # Verify data integrity
         assert isinstance(pulsar_request, DocumentEmbeddingsRequest)
@@ -261,7 +261,7 @@ class TestDocumentEmbeddingsMessageCompatibility:
         # Convert response to dict
         translator = DocumentEmbeddingsResponseTranslator()
-        response_data = translator.from_pulsar(response)
+        response_data = translator.encode(response)
 
         # Verify error handling
         assert isinstance(response_data, dict)
```


```diff
@@ -33,7 +33,7 @@ class TestRAGTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is True, "is_final must be True when end_of_session=True"
@@ -57,7 +57,7 @@ class TestRAGTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is False, "is_final must be False when end_of_session=False"
@@ -80,7 +80,7 @@ class TestRAGTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is False
@@ -103,7 +103,7 @@ class TestRAGTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is False, "end_of_stream=True should NOT make is_final=True"
@@ -125,7 +125,7 @@ class TestRAGTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is True, "is_final must be True when end_of_session=True"
@@ -147,7 +147,7 @@ class TestRAGTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is False, "end_of_stream=True should NOT make is_final=True"
@@ -168,7 +168,7 @@ class TestRAGTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is False, "is_final must be False when end_of_stream=False"
@@ -195,7 +195,7 @@ class TestAgentTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is True, "is_final must be True when end_of_dialog=True"
@@ -217,7 +217,7 @@ class TestAgentTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is False, "is_final must be False when end_of_dialog=False"
@@ -240,7 +240,7 @@ class TestAgentTranslatorCompletionFlags:
         )
 
         # Act
-        thought_dict, thought_is_final = translator.from_response_with_completion(thought_response)
+        thought_dict, thought_is_final = translator.encode_with_completion(thought_response)
 
         # Assert
         assert thought_is_final is False, "Thought message must not be final"
@@ -254,7 +254,7 @@ class TestAgentTranslatorCompletionFlags:
         )
 
         # Act
-        obs_dict, obs_is_final = translator.from_response_with_completion(observation_response)
+        obs_dict, obs_is_final = translator.encode_with_completion(observation_response)
 
         # Assert
         assert obs_is_final is False, "Observation message must not be final"
@@ -275,7 +275,7 @@ class TestAgentTranslatorCompletionFlags:
         )
 
         # Act
-        response_dict, is_final = translator.from_response_with_completion(response)
+        response_dict, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is True, "Streaming format must use end_of_dialog for is_final"
```


```diff
@@ -21,17 +21,15 @@ class TestSyncDocumentEmbeddingsClient:
 
         # Act
         client = DocumentEmbeddingsClient(
-            log_level=1,
             subscriber="test-subscriber",
             input_queue="test-input",
             output_queue="test-output",
             pulsar_host="pulsar://test:6650",
             pulsar_api_key="test-key"
         )
 
         # Assert
         mock_base_init.assert_called_once_with(
-            log_level=1,
             subscriber="test-subscriber",
             input_queue="test-input",
             output_queue="test-output",
```


```diff
@@ -49,7 +49,7 @@ class TestConfigRequestor:
         mock_translator_registry.get_response_translator.return_value = Mock()
 
         # Setup translator response
-        mock_request_translator.to_pulsar.return_value = "translated_request"
+        mock_request_translator.decode.return_value = "translated_request"
 
         # Patch ServiceRequestor async methods with regular mocks (not AsyncMock)
         with patch.object(ServiceRequestor, 'start', return_value=None), \
@@ -64,7 +64,7 @@ class TestConfigRequestor:
             result = requestor.to_request({"test": "body"})
 
             # Verify translator was called correctly
-            mock_request_translator.to_pulsar.assert_called_once_with({"test": "body"})
+            mock_request_translator.decode.assert_called_once_with({"test": "body"})
             assert result == "translated_request"
 
     @patch('trustgraph.gateway.dispatch.config.TranslatorRegistry')
@@ -76,7 +76,7 @@ class TestConfigRequestor:
         mock_translator_registry.get_response_translator.return_value = mock_response_translator
 
         # Setup translator response
-        mock_response_translator.from_response_with_completion.return_value = "translated_response"
+        mock_response_translator.encode_with_completion.return_value = "translated_response"
 
         requestor = ConfigRequestor(
             backend=Mock(),
@@ -89,5 +89,5 @@ class TestConfigRequestor:
         result = requestor.from_response(mock_message)
 
         # Verify translator was called correctly
-        mock_response_translator.from_response_with_completion.assert_called_once_with(mock_message)
+        mock_response_translator.encode_with_completion.assert_called_once_with(mock_message)
         assert result == "translated_response"
```


```diff
@@ -25,7 +25,7 @@ from trustgraph.schema import (
 class TestGraphRagResponseTranslator:
     """Test GraphRagResponseTranslator streaming behavior"""
 
-    def test_from_pulsar_with_empty_response(self):
+    def test_encode_with_empty_response(self):
         """Test that empty response strings are preserved"""
         # Arrange
         translator = GraphRagResponseTranslator()
@@ -36,14 +36,14 @@ class TestGraphRagResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert - Empty string should be included in result
         assert "response" in result
         assert result["response"] == ""
         assert result["end_of_stream"] is True
 
-    def test_from_pulsar_with_non_empty_response(self):
+    def test_encode_with_non_empty_response(self):
         """Test that non-empty responses work correctly"""
         # Arrange
         translator = GraphRagResponseTranslator()
@@ -54,13 +54,13 @@ class TestGraphRagResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert result["response"] == "Some text"
         assert result["end_of_stream"] is False
 
-    def test_from_pulsar_with_none_response(self):
+    def test_encode_with_none_response(self):
         """Test that None response is handled correctly"""
         # Arrange
         translator = GraphRagResponseTranslator()
@@ -71,14 +71,14 @@ class TestGraphRagResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert - None should not be included
         assert "response" not in result
         assert result["end_of_stream"] is True
 
-    def test_from_response_with_completion_returns_correct_flag(self):
-        """Test that from_response_with_completion returns correct is_final flag"""
+    def test_encode_with_completion_returns_correct_flag(self):
+        """Test that encode_with_completion returns correct is_final flag"""
         # Arrange
         translator = GraphRagResponseTranslator()
@@ -90,7 +90,7 @@ class TestGraphRagResponseTranslator:
         )
 
         # Act
-        result, is_final = translator.from_response_with_completion(response_chunk)
+        result, is_final = translator.encode_with_completion(response_chunk)
 
         # Assert
         assert is_final is False
@@ -105,7 +105,7 @@ class TestGraphRagResponseTranslator:
         )
 
         # Act
-        result, is_final = translator.from_response_with_completion(final_response)
+        result, is_final = translator.encode_with_completion(final_response)
 
         # Assert - is_final is based on end_of_session, not end_of_stream
         assert is_final is True
@@ -116,7 +116,7 @@
 class TestDocumentRagResponseTranslator:
     """Test DocumentRagResponseTranslator streaming behavior"""
 
-    def test_from_pulsar_with_empty_response(self):
+    def test_encode_with_empty_response(self):
         """Test that empty response strings are preserved"""
         # Arrange
         translator = DocumentRagResponseTranslator()
@@ -127,14 +127,14 @@ class TestDocumentRagResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert "response" in result
         assert result["response"] == ""
         assert result["end_of_stream"] is True
 
-    def test_from_pulsar_with_non_empty_response(self):
+    def test_encode_with_non_empty_response(self):
         """Test that non-empty responses work correctly"""
         # Arrange
         translator = DocumentRagResponseTranslator()
@@ -145,7 +145,7 @@ class TestDocumentRagResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert result["response"] == "Document content"
@@ -155,7 +155,7 @@
 class TestPromptResponseTranslator:
     """Test PromptResponseTranslator streaming behavior"""
 
-    def test_from_pulsar_with_empty_text(self):
+    def test_encode_with_empty_text(self):
         """Test that empty text strings are preserved"""
         # Arrange
         translator = PromptResponseTranslator()
@@ -167,14 +167,14 @@ class TestPromptResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert "text" in result
         assert result["text"] == ""
         assert result["end_of_stream"] is True
 
-    def test_from_pulsar_with_non_empty_text(self):
+    def test_encode_with_non_empty_text(self):
         """Test that non-empty text works correctly"""
         # Arrange
         translator = PromptResponseTranslator()
@@ -186,13 +186,13 @@ class TestPromptResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert result["text"] == "Some prompt response"
         assert result["end_of_stream"] is False
 
-    def test_from_pulsar_with_none_text(self):
+    def test_encode_with_none_text(self):
         """Test that None text is handled correctly"""
         # Arrange
         translator = PromptResponseTranslator()
@@ -204,14 +204,14 @@ class TestPromptResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert "text" not in result
         assert "object" in result
         assert result["end_of_stream"] is True
 
-    def test_from_pulsar_includes_end_of_stream(self):
+    def test_encode_includes_end_of_stream(self):
         """Test that end_of_stream flag is always included"""
         # Arrange
         translator = PromptResponseTranslator()
@@ -225,7 +225,7 @@ class TestPromptResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert "end_of_stream" in result
@@ -235,7 +235,7 @@
 class TestTextCompletionResponseTranslator:
     """Test TextCompletionResponseTranslator streaming behavior"""
 
-    def test_from_pulsar_always_includes_response(self):
+    def test_encode_always_includes_response(self):
         """Test that response field is always included, even if empty"""
         # Arrange
         translator = TextCompletionResponseTranslator()
@@ -249,13 +249,13 @@ class TestTextCompletionResponseTranslator:
         )
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert - Response should always be present
         assert "response" in result
         assert result["response"] == ""
 
-    def test_from_response_with_completion_with_empty_final(self):
+    def test_encode_with_completion_with_empty_final(self):
         """Test that empty final response is handled correctly"""
         # Arrange
         translator = TextCompletionResponseTranslator()
@@ -269,7 +269,7 @@ class TestTextCompletionResponseTranslator:
         )
 
         # Act
-        result, is_final = translator.from_response_with_completion(response)
+        result, is_final = translator.encode_with_completion(response)
 
         # Assert
         assert is_final is True
@@ -297,7 +297,7 @@ class TestStreamingProtocolCompliance:
         response = response_class(**kwargs)
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert field_name in result, f"{translator_class.__name__} should include '{field_name}' field even when empty"
@@ -320,7 +320,7 @@ class TestStreamingProtocolCompliance:
         response = response_class(**kwargs)
 
         # Act
-        result = translator.from_pulsar(response)
+        result = translator.encode(response)
 
         # Assert
         assert "end_of_stream" in result, f"{translator_class.__name__} should include 'end_of_stream' flag"
```
@@ -8,11 +8,11 @@ from trustgraph.messaging.translators.document_loading import TextDocumentTransl
 class TestTextDocumentTranslator:

-    def test_to_pulsar_decodes_base64_text(self):
+    def test_decode_decodes_base64_text(self):
         translator = TextDocumentTranslator()
         payload = "Cancer survival: 2.74× higher hazard ratio"
-        msg = translator.to_pulsar(
+        msg = translator.decode(
             {
                 "id": "doc-1",
                 "user": "alice",
@@ -27,11 +27,11 @@ class TestTextDocumentTranslator:
         assert msg.metadata.collection == "research"
         assert msg.text == payload.encode("utf-8")

-    def test_to_pulsar_accepts_raw_utf8_text(self):
+    def test_decode_accepts_raw_utf8_text(self):
         translator = TextDocumentTranslator()
         payload = "Cancer survival: 2.74× higher hazard ratio"
-        msg = translator.to_pulsar(
+        msg = translator.decode(
             {
                 "charset": "utf-8",
                 "text": payload,
@@ -40,11 +40,11 @@ class TestTextDocumentTranslator:
         assert msg.text == payload.encode("utf-8")

-    def test_to_pulsar_falls_back_to_raw_non_base64_ascii(self):
+    def test_decode_falls_back_to_raw_non_base64_ascii(self):
         translator = TextDocumentTranslator()
         payload = "plain-text payload"
-        msg = translator.to_pulsar(
+        msg = translator.decode(
             {
                 "charset": "utf-8",
                 "text": payload,
@@ -0,0 +1,133 @@
+"""
+Tests for queue naming and topic mapping.
+"""
+
+import pytest
+import argparse
+
+from trustgraph.schema.core.topic import queue
+from trustgraph.base.pubsub import get_pubsub, add_pubsub_args
+from trustgraph.base.pulsar_backend import PulsarBackend
+
+
+class TestQueueFunction:
+
+    def test_flow_default(self):
+        assert queue('text-completion-request') == 'flow:tg:text-completion-request'
+
+    def test_request_class(self):
+        assert queue('config', cls='request') == 'request:tg:config'
+
+    def test_response_class(self):
+        assert queue('config', cls='response') == 'response:tg:config'
+
+    def test_state_class(self):
+        assert queue('config', cls='state') == 'state:tg:config'
+
+    def test_custom_topicspace(self):
+        assert queue('config', cls='request', topicspace='prod') == 'request:prod:config'
+
+    def test_default_class_is_flow(self):
+        result = queue('something')
+        assert result.startswith('flow:')
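The tests above pin down the `queue()` helper completely: it builds a `CLASS:TOPICSPACE:TOPIC` name with `flow` and `tg` as defaults. A minimal implementation consistent with these tests might look like the following sketch (the actual trustgraph code may differ in details):

```python
def queue(name, cls='flow', topicspace='tg'):
    """Build a CLASS:TOPICSPACE:TOPIC queue name.

    The class implies persistence/TTL semantics (flow and state are
    persistent; request and response are not); the name itself carries
    no QoS information.
    """
    return f"{cls}:{topicspace}:{name}"
```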
+
+
+class TestPulsarMapTopic:
+
+    @pytest.fixture
+    def backend(self):
+        """Create a PulsarBackend without connecting."""
+        b = object.__new__(PulsarBackend)
+        return b
+
+    def test_flow_maps_to_persistent(self, backend):
+        assert backend.map_topic('flow:tg:text-completion-request') == \
+            'persistent://tg/flow/text-completion-request'
+
+    def test_state_maps_to_persistent(self, backend):
+        assert backend.map_topic('state:tg:config') == \
+            'persistent://tg/state/config'
+
+    def test_request_maps_to_non_persistent(self, backend):
+        assert backend.map_topic('request:tg:config') == \
+            'non-persistent://tg/request/config'
+
+    def test_response_maps_to_non_persistent(self, backend):
+        assert backend.map_topic('response:tg:librarian') == \
+            'non-persistent://tg/response/librarian'
+
+    def test_passthrough_pulsar_uri(self, backend):
+        uri = 'persistent://tg/flow/something'
+        assert backend.map_topic(uri) == uri
+
+    def test_invalid_format_raises(self, backend):
+        with pytest.raises(ValueError, match="Invalid queue format"):
+            backend.map_topic('bad-format')
+
+    def test_invalid_class_raises(self, backend):
+        with pytest.raises(ValueError, match="Invalid queue class"):
+            backend.map_topic('unknown:tg:topic')
+
+    def test_custom_topicspace(self, backend):
+        assert backend.map_topic('flow:prod:my-queue') == \
+            'persistent://prod/flow/my-queue'
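These tests fully specify the backend-side mapping: the queue class selects Pulsar persistence, the topicspace becomes the tenant, and already-formed Pulsar URIs pass through untouched. A standalone sketch of `map_topic` satisfying them (the method form on `PulsarBackend` is assumed to be equivalent):

```python
# Persistence implied by queue class, per the CLASS:TOPICSPACE:TOPIC scheme
PERSISTENCE = {
    'flow': 'persistent',
    'state': 'persistent',
    'request': 'non-persistent',
    'response': 'non-persistent',
}

def map_topic(name):
    """Map a broker-neutral queue name to a Pulsar topic URI."""
    # Anything that is already a Pulsar URI passes through unchanged
    if name.startswith('persistent://') or name.startswith('non-persistent://'):
        return name
    parts = name.split(':')
    if len(parts) != 3:
        raise ValueError(f"Invalid queue format: {name}")
    cls, topicspace, topic = parts
    if cls not in PERSISTENCE:
        raise ValueError(f"Invalid queue class: {cls}")
    return f"{PERSISTENCE[cls]}://{topicspace}/{cls}/{topic}"
```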
+
+
+class TestGetPubsubDispatch:
+
+    def test_unknown_backend_raises(self):
+        with pytest.raises(ValueError, match="Unknown pub/sub backend"):
+            get_pubsub(pubsub_backend='redis')
+
+
+class TestAddPubsubArgs:
+
+    def test_standalone_defaults_to_localhost(self):
+        parser = argparse.ArgumentParser()
+        add_pubsub_args(parser, standalone=True)
+        args = parser.parse_args([])
+        assert args.pulsar_host == 'pulsar://localhost:6650'
+        assert args.pulsar_listener == 'localhost'
+
+    def test_non_standalone_defaults_to_container(self):
+        parser = argparse.ArgumentParser()
+        add_pubsub_args(parser, standalone=False)
+        args = parser.parse_args([])
+        assert 'pulsar:6650' in args.pulsar_host
+        assert args.pulsar_listener is None
+
+    def test_cli_override_respected(self):
+        parser = argparse.ArgumentParser()
+        add_pubsub_args(parser, standalone=True)
+        args = parser.parse_args(['--pulsar-host', 'pulsar://custom:6650'])
+        assert args.pulsar_host == 'pulsar://custom:6650'
+
+    def test_pubsub_backend_default(self):
+        parser = argparse.ArgumentParser()
+        add_pubsub_args(parser)
+        args = parser.parse_args([])
+        assert args.pubsub_backend == 'pulsar'
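A sketch of `add_pubsub_args` consistent with these tests follows; only the behaviour the tests assert is grounded, and the exact argument set and env-var handling are assumptions about the real helper:

```python
import argparse
import os

def add_pubsub_args(parser, standalone=False):
    """Add pub/sub CLI arguments to an argument parser.

    Standalone CLI tools default to a localhost broker; in-cluster
    services default to the 'pulsar' container host.
    """
    if standalone:
        default_host = 'pulsar://localhost:6650'
        default_listener = 'localhost'
    else:
        default_host = os.getenv('PULSAR_HOST', 'pulsar://pulsar:6650')
        default_listener = None
    parser.add_argument(
        '--pubsub-backend',
        default=os.getenv('PUBSUB_BACKEND', 'pulsar'),
        help='Pub/sub backend (default: pulsar)',
    )
    parser.add_argument('--pulsar-host', default=default_host)
    parser.add_argument('--pulsar-api-key', default=os.getenv('PULSAR_API_KEY'))
    parser.add_argument('--pulsar-listener', default=default_listener)
```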
+
+
+class TestQueueDefinitions:
+    """Verify the actual queue constants produce correct names."""
+
+    def test_config_request(self):
+        from trustgraph.schema.services.config import config_request_queue
+        assert config_request_queue == 'request:tg:config'
+
+    def test_config_response(self):
+        from trustgraph.schema.services.config import config_response_queue
+        assert config_response_queue == 'response:tg:config'
+
+    def test_config_push(self):
+        from trustgraph.schema.services.config import config_push_queue
+        assert config_push_queue == 'state:tg:config'
+
+    def test_librarian_request_is_persistent(self):
+        from trustgraph.schema.services.library import librarian_request_queue
+        assert librarian_request_queue.startswith('flow:')
+
+    def test_knowledge_request(self):
+        from trustgraph.schema.knowledge.knowledge import knowledge_request_queue
+        assert knowledge_request_queue == 'request:tg:knowledge'
@@ -28,21 +28,21 @@ def triple_tx():
 class TestTermTranslatorIri:

-    def test_iri_to_pulsar(self, term_tx):
+    def test_iri_decode(self, term_tx):
         data = {"t": "i", "i": "http://example.org/Alice"}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.type == IRI
         assert term.iri == "http://example.org/Alice"

-    def test_iri_from_pulsar(self, term_tx):
+    def test_iri_encode(self, term_tx):
         term = Term(type=IRI, iri="http://example.org/Bob")
-        wire = term_tx.from_pulsar(term)
+        wire = term_tx.encode(term)
         assert wire == {"t": "i", "i": "http://example.org/Bob"}

     def test_iri_round_trip(self, term_tx):
         original = Term(type=IRI, iri="http://example.org/round")
-        wire = term_tx.from_pulsar(original)
-        restored = term_tx.to_pulsar(wire)
+        wire = term_tx.encode(original)
+        restored = term_tx.decode(wire)
         assert restored == original
@@ -52,21 +52,21 @@ class TestTermTranslatorIri:
 class TestTermTranslatorBlank:

-    def test_blank_to_pulsar(self, term_tx):
+    def test_blank_decode(self, term_tx):
         data = {"t": "b", "d": "_:b42"}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.type == BLANK
         assert term.id == "_:b42"

-    def test_blank_from_pulsar(self, term_tx):
+    def test_blank_encode(self, term_tx):
         term = Term(type=BLANK, id="_:node1")
-        wire = term_tx.from_pulsar(term)
+        wire = term_tx.encode(term)
         assert wire == {"t": "b", "d": "_:node1"}

     def test_blank_round_trip(self, term_tx):
         original = Term(type=BLANK, id="_:x")
-        wire = term_tx.from_pulsar(original)
-        restored = term_tx.to_pulsar(wire)
+        wire = term_tx.encode(original)
+        restored = term_tx.decode(wire)
         assert restored == original
@@ -76,29 +76,29 @@ class TestTermTranslatorBlank:
 class TestTermTranslatorTypedLiteral:

-    def test_plain_literal_to_pulsar(self, term_tx):
+    def test_plain_literal_decode(self, term_tx):
         data = {"t": "l", "v": "hello"}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.type == LITERAL
         assert term.value == "hello"
         assert term.datatype == ""
         assert term.language == ""

-    def test_xsd_integer_to_pulsar(self, term_tx):
+    def test_xsd_integer_decode(self, term_tx):
         data = {
             "t": "l", "v": "42",
             "dt": "http://www.w3.org/2001/XMLSchema#integer",
         }
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.value == "42"
         assert term.datatype.endswith("#integer")

-    def test_typed_literal_from_pulsar(self, term_tx):
+    def test_typed_literal_encode(self, term_tx):
         term = Term(
             type=LITERAL, value="3.14",
             datatype="http://www.w3.org/2001/XMLSchema#double",
         )
-        wire = term_tx.from_pulsar(term)
+        wire = term_tx.encode(term)
         assert wire["t"] == "l"
         assert wire["v"] == "3.14"
         assert wire["dt"] == "http://www.w3.org/2001/XMLSchema#double"
@@ -109,13 +109,13 @@ class TestTermTranslatorTypedLiteral:
             type=LITERAL, value="true",
             datatype="http://www.w3.org/2001/XMLSchema#boolean",
         )
-        wire = term_tx.from_pulsar(original)
-        restored = term_tx.to_pulsar(wire)
+        wire = term_tx.encode(original)
+        restored = term_tx.decode(wire)
         assert restored == original

     def test_plain_literal_omits_dt_and_ln(self, term_tx):
         term = Term(type=LITERAL, value="x")
-        wire = term_tx.from_pulsar(term)
+        wire = term_tx.encode(term)
         assert "dt" not in wire
         assert "ln" not in wire
@@ -126,22 +126,22 @@ class TestTermTranslatorTypedLiteral:
 class TestTermTranslatorLangLiteral:

-    def test_language_tag_to_pulsar(self, term_tx):
+    def test_language_tag_decode(self, term_tx):
         data = {"t": "l", "v": "bonjour", "ln": "fr"}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.value == "bonjour"
         assert term.language == "fr"

-    def test_language_tag_from_pulsar(self, term_tx):
+    def test_language_tag_encode(self, term_tx):
         term = Term(type=LITERAL, value="colour", language="en-GB")
-        wire = term_tx.from_pulsar(term)
+        wire = term_tx.encode(term)
         assert wire["ln"] == "en-GB"
         assert "dt" not in wire  # No datatype

     def test_language_tag_round_trip(self, term_tx):
         original = Term(type=LITERAL, value="hola", language="es")
-        wire = term_tx.from_pulsar(original)
-        restored = term_tx.to_pulsar(wire)
+        wire = term_tx.encode(original)
+        restored = term_tx.decode(wire)
         assert restored == original
@@ -151,7 +151,7 @@ class TestTermTranslatorLangLiteral:
 class TestTermTranslatorQuotedTriple:

-    def test_quoted_triple_to_pulsar(self, term_tx):
+    def test_quoted_triple_decode(self, term_tx):
         data = {
             "t": "t",
             "tr": {
@@ -160,20 +160,20 @@ class TestTermTranslatorQuotedTriple:
                 "o": {"t": "i", "i": "http://example.org/Bob"},
             },
         }
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.type == TRIPLE
         assert term.triple is not None
         assert term.triple.s.iri == "http://example.org/Alice"
         assert term.triple.o.iri == "http://example.org/Bob"

-    def test_quoted_triple_from_pulsar(self, term_tx):
+    def test_quoted_triple_encode(self, term_tx):
         inner = Triple(
             s=Term(type=IRI, iri="http://example.org/s"),
             p=Term(type=IRI, iri="http://example.org/p"),
             o=Term(type=LITERAL, value="val"),
         )
         term = Term(type=TRIPLE, triple=inner)
-        wire = term_tx.from_pulsar(term)
+        wire = term_tx.encode(term)
         assert wire["t"] == "t"
         assert "tr" in wire
         assert wire["tr"]["s"]["i"] == "http://example.org/s"
@@ -186,18 +186,18 @@ class TestTermTranslatorQuotedTriple:
             o=Term(type=LITERAL, value="C", language="en"),
         )
         original = Term(type=TRIPLE, triple=inner)
-        wire = term_tx.from_pulsar(original)
-        restored = term_tx.to_pulsar(wire)
+        wire = term_tx.encode(original)
+        restored = term_tx.decode(wire)
         assert restored.type == TRIPLE
         assert restored.triple.s == original.triple.s
         assert restored.triple.o == original.triple.o

     def test_quoted_triple_none_triple(self, term_tx):
         term = Term(type=TRIPLE, triple=None)
-        wire = term_tx.from_pulsar(term)
+        wire = term_tx.encode(term)
         assert wire == {"t": "t"}
         # And back
-        restored = term_tx.to_pulsar(wire)
+        restored = term_tx.decode(wire)
         assert restored.type == TRIPLE
         assert restored.triple is None
@@ -210,7 +210,7 @@ class TestTermTranslatorQuotedTriple:
                 "o": {"t": "l", "v": "A feeling of expectation"},
             },
         }
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.triple.o.type == LITERAL
         assert term.triple.o.value == "A feeling of expectation"
@@ -223,22 +223,22 @@ class TestTermTranslatorEdgeCases:

     def test_unknown_type(self, term_tx):
         data = {"t": "z"}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.type == "z"

     def test_empty_type(self, term_tx):
         data = {}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.type == ""

     def test_missing_iri_field(self, term_tx):
         data = {"t": "i"}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.iri == ""

     def test_missing_literal_fields(self, term_tx):
         data = {"t": "l"}
-        term = term_tx.to_pulsar(data)
+        term = term_tx.decode(data)
         assert term.value == ""
         assert term.datatype == ""
         assert term.language == ""
@@ -250,24 +250,24 @@ class TestTermTranslatorEdgeCases:
 class TestTripleTranslator:

-    def test_triple_to_pulsar(self, triple_tx):
+    def test_triple_decode(self, triple_tx):
         data = {
             "s": {"t": "i", "i": "http://example.org/s"},
             "p": {"t": "i", "i": "http://example.org/p"},
             "o": {"t": "l", "v": "object"},
         }
-        triple = triple_tx.to_pulsar(data)
+        triple = triple_tx.decode(data)
         assert triple.s.iri == "http://example.org/s"
         assert triple.o.value == "object"
         assert triple.g is None

-    def test_triple_from_pulsar(self, triple_tx):
+    def test_triple_encode(self, triple_tx):
         triple = Triple(
             s=Term(type=IRI, iri="http://example.org/A"),
             p=Term(type=IRI, iri="http://example.org/B"),
             o=Term(type=LITERAL, value="C"),
         )
-        wire = triple_tx.from_pulsar(triple)
+        wire = triple_tx.encode(triple)
         assert wire["s"]["t"] == "i"
         assert wire["o"]["v"] == "C"
         assert "g" not in wire
@@ -279,17 +279,17 @@ class TestTripleTranslator:
             "o": {"t": "l", "v": "val"},
             "g": "urn:graph:source",
         }
-        quad = triple_tx.to_pulsar(data)
+        quad = triple_tx.decode(data)
         assert quad.g == "urn:graph:source"

-    def test_quad_from_pulsar_includes_graph(self, triple_tx):
+    def test_quad_encode_includes_graph(self, triple_tx):
         quad = Triple(
             s=Term(type=IRI, iri="http://example.org/s"),
             p=Term(type=IRI, iri="http://example.org/p"),
             o=Term(type=LITERAL, value="v"),
             g="urn:graph:retrieval",
         )
-        wire = triple_tx.from_pulsar(quad)
+        wire = triple_tx.encode(quad)
         assert wire["g"] == "urn:graph:retrieval"

     def test_quad_round_trip(self, triple_tx):
@@ -299,8 +299,8 @@ class TestTripleTranslator:
             o=Term(type=LITERAL, value="v"),
             g="urn:graph:source",
         )
-        wire = triple_tx.from_pulsar(original)
-        restored = triple_tx.to_pulsar(wire)
+        wire = triple_tx.encode(original)
+        restored = triple_tx.decode(wire)
         assert restored == original

     def test_none_graph_omitted_from_wire(self, triple_tx):
@@ -310,12 +310,12 @@ class TestTripleTranslator:
             o=Term(type=LITERAL, value="v"),
             g=None,
         )
-        wire = triple_tx.from_pulsar(triple)
+        wire = triple_tx.encode(triple)
         assert "g" not in wire

     def test_missing_terms_handled(self, triple_tx):
         data = {}
-        triple = triple_tx.to_pulsar(data)
+        triple = triple_tx.decode(data)
         assert triple.s is None
         assert triple.p is None
         assert triple.o is None
@@ -342,16 +342,16 @@ class TestSubgraphTranslator:
                 g="urn:graph:source",
             ),
         ]
-        wire_list = tx.from_pulsar(triples)
+        wire_list = tx.encode(triples)
         assert len(wire_list) == 2
         assert wire_list[1]["g"] == "urn:graph:source"
-        restored = tx.to_pulsar(wire_list)
+        restored = tx.decode(wire_list)
         assert len(restored) == 2
         assert restored[0] == triples[0]
         assert restored[1] == triples[1]

     def test_empty_subgraph(self):
         tx = SubgraphTranslator()
-        assert tx.to_pulsar([]) == []
-        assert tx.from_pulsar([]) == []
+        assert tx.decode([]) == []
+        assert tx.encode([]) == []
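As the renamed tests above show, the direction convention is uniform across all translators: `decode()` takes the external wire/API representation and builds the internal message object, while `encode()` does the reverse, so `decode(encode(x)) == x` round-trips. A toy sketch of that contract (the base-class and method details beyond what the tests assert are assumptions):

```python
class MessageTranslator:
    """Base translator contract: decode() maps the wire/API form to an
    internal object; encode() maps the internal object back to wire form.
    No broker-specific types appear in the interface."""

    def decode(self, data):
        raise NotImplementedError

    def encode(self, obj):
        raise NotImplementedError


class UpperTranslator(MessageTranslator):
    # Toy subclass demonstrating the round-trip property
    def decode(self, data):
        return data["text"].upper()

    def encode(self, obj):
        return {"text": obj.lower()}
```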
@@ -35,7 +35,7 @@ class TestDocumentMetadataTranslator:
             "parent-id": "doc-100",
             "document-type": "page",
         }
-        obj = self.tx.to_pulsar(data)
+        obj = self.tx.decode(data)
         assert obj.id == "doc-123"
         assert obj.time == 1710000000
         assert obj.kind == "application/pdf"
@@ -45,14 +45,14 @@ class TestDocumentMetadataTranslator:
         assert obj.parent_id == "doc-100"
         assert obj.document_type == "page"

-        wire = self.tx.from_pulsar(obj)
+        wire = self.tx.encode(obj)
         assert wire["id"] == "doc-123"
         assert wire["user"] == "alice"
         assert wire["parent-id"] == "doc-100"
         assert wire["document-type"] == "page"

     def test_defaults_for_missing_fields(self):
-        obj = self.tx.to_pulsar({})
+        obj = self.tx.decode({})
         assert obj.parent_id == ""
         assert obj.document_type == "source"
@@ -63,25 +63,25 @@ class TestDocumentMetadataTranslator:
             "o": {"t": "i", "i": "http://example.org/o"},
         }]
         data = {"metadata": triple_wire}
-        obj = self.tx.to_pulsar(data)
+        obj = self.tx.decode(data)
         assert len(obj.metadata) == 1
         assert obj.metadata[0].s.iri == "http://example.org/s"

     def test_none_metadata_handled(self):
         data = {"metadata": None}
-        obj = self.tx.to_pulsar(data)
+        obj = self.tx.decode(data)
         assert obj.metadata == []

     def test_empty_tags_preserved(self):
         data = {"tags": []}
-        obj = self.tx.to_pulsar(data)
-        wire = self.tx.from_pulsar(obj)
+        obj = self.tx.decode(data)
+        wire = self.tx.encode(obj)
         assert wire["tags"] == []

     def test_falsy_fields_omitted_from_wire(self):
         """Empty string fields should be omitted from wire format."""
         obj = DocumentMetadata(id="", time=0, user="")
-        wire = self.tx.from_pulsar(obj)
+        wire = self.tx.encode(obj)
         assert "id" not in wire
         assert "user" not in wire
@@ -105,7 +105,7 @@ class TestProcessingMetadataTranslator:
             "collection": "my-collection",
             "tags": ["tag1"],
         }
-        obj = self.tx.to_pulsar(data)
+        obj = self.tx.decode(data)
         assert obj.id == "proc-1"
         assert obj.document_id == "doc-123"
         assert obj.flow == "default"
@@ -113,32 +113,32 @@ class TestProcessingMetadataTranslator:
         assert obj.collection == "my-collection"
         assert obj.tags == ["tag1"]

-        wire = self.tx.from_pulsar(obj)
+        wire = self.tx.encode(obj)
         assert wire["id"] == "proc-1"
         assert wire["document-id"] == "doc-123"
         assert wire["user"] == "alice"
         assert wire["collection"] == "my-collection"

     def test_missing_fields_use_defaults(self):
-        obj = self.tx.to_pulsar({})
+        obj = self.tx.decode({})
         assert obj.id is None
         assert obj.user is None
         assert obj.collection is None

     def test_tags_none_omitted(self):
         obj = ProcessingMetadata(tags=None)
-        wire = self.tx.from_pulsar(obj)
+        wire = self.tx.encode(obj)
         assert "tags" not in wire

     def test_tags_empty_list_preserved(self):
         obj = ProcessingMetadata(tags=[])
-        wire = self.tx.from_pulsar(obj)
+        wire = self.tx.encode(obj)
         assert wire["tags"] == []

     def test_user_and_collection_preserved(self):
         """Core pipeline routing fields must survive round-trip."""
         data = {"user": "bob", "collection": "research"}
-        obj = self.tx.to_pulsar(data)
-        wire = self.tx.from_pulsar(obj)
+        obj = self.tx.decode(data)
+        wire = self.tx.encode(obj)
         assert wire["user"] == "bob"
         assert wire["collection"] == "research"
@@ -28,7 +28,7 @@ class TestRequestTranslation:
         }

         # Translate to Pulsar
-        pulsar_msg = translator.to_pulsar(api_data)
+        pulsar_msg = translator.decode(api_data)
         assert pulsar_msg.operation == "schema-selection"
         assert pulsar_msg.sample == "test data sample"
@@ -46,7 +46,7 @@ class TestRequestTranslation:
             "options": {"delimiter": ","}
         }

-        pulsar_msg = translator.to_pulsar(api_data)
+        pulsar_msg = translator.decode(api_data)
         assert pulsar_msg.operation == "generate-descriptor"
         assert pulsar_msg.sample == "csv data"
@@ -70,7 +70,7 @@ class TestResponseTranslation:
         )

         # Translate to API format
-        api_data = translator.from_pulsar(pulsar_response)
+        api_data = translator.encode(pulsar_response)
         assert api_data["operation"] == "schema-selection"
         assert api_data["schema-matches"] == ["products", "inventory", "catalog"]
@@ -86,7 +86,7 @@ class TestResponseTranslation:
             error=None
         )

-        api_data = translator.from_pulsar(pulsar_response)
+        api_data = translator.encode(pulsar_response)
         assert api_data["operation"] == "schema-selection"
         assert api_data["schema-matches"] == []
@@ -103,7 +103,7 @@ class TestResponseTranslation:
             error=None
         )

-        api_data = translator.from_pulsar(pulsar_response)
+        api_data = translator.encode(pulsar_response)
         assert api_data["operation"] == "detect-type"
         assert api_data["detected-type"] == "xml"
@@ -123,7 +123,7 @@ class TestResponseTranslation:
             )
         )

-        api_data = translator.from_pulsar(pulsar_response)
+        api_data = translator.encode(pulsar_response)
         assert api_data["operation"] == "schema-selection"
         # Error objects are typically handled separately by the gateway
@@ -146,7 +146,7 @@ class TestResponseTranslation:
             error=None
         )

-        api_data = translator.from_pulsar(pulsar_response)
+        api_data = translator.encode(pulsar_response)
         assert api_data["operation"] == "diagnose"
         assert api_data["detected-type"] == "csv"
@@ -165,7 +165,7 @@ class TestResponseTranslation:
             error=None
         )

-        api_data, is_final = translator.from_response_with_completion(pulsar_response)
+        api_data, is_final = translator.encode_with_completion(pulsar_response)
         assert is_final is True  # Structured-diag responses are always final
         assert api_data["operation"] == "schema-selection"
@@ -1,5 +1,5 @@
-from . pubsub import PulsarClient, get_pubsub
+from . pubsub import get_pubsub, add_pubsub_args
 from . async_processor import AsyncProcessor
 from . consumer import Consumer
 from . producer import Producer
@@ -6,7 +6,6 @@
 import asyncio
 import argparse
-import _pulsar
 import time
 import uuid
 import logging
@@ -15,7 +14,7 @@ from prometheus_client import start_http_server, Info
 from .. schema import ConfigPush, config_push_queue
 from .. log_level import LogLevel
-from . pubsub import PulsarClient, get_pubsub
+from . pubsub import get_pubsub, add_pubsub_args
 from . producer import Producer
 from . consumer import Consumer
 from . metrics import ProcessorMetrics, ConsumerMetrics
@@ -223,8 +222,8 @@ class AsyncProcessor:
             logger.info("Keyboard interrupt.")
             return

-        except _pulsar.Interrupted:
-            logger.info("Pulsar Interrupted.")
+        except KeyboardInterrupt:
+            logger.info("Interrupted.")
             return

         # Exceptions from a taskgroup come in as an exception group
@@ -250,15 +249,7 @@ class AsyncProcessor:
     @staticmethod
     def add_args(parser):

-        # Pub/sub backend selection
-        parser.add_argument(
-            '--pubsub-backend',
-            default=os.getenv('PUBSUB_BACKEND', 'pulsar'),
-            choices=['pulsar', 'mqtt'],
-            help='Pub/sub backend (default: pulsar, env: PUBSUB_BACKEND)',
-        )
-        PulsarClient.add_args(parser)
+        add_pubsub_args(parser)

         add_logging_args(parser)

         parser.add_argument(
@@ -6,8 +6,6 @@
 import json
 import logging

-from pulsar.schema import JsonSchema
-
 from .. schema import Error
 from .. schema import config_request_queue, config_response_queue
 from .. schema import config_push_queue
@@ -1,110 +1,72 @@
 import os
-import pulsar
-import _pulsar
-import uuid
-from pulsar.schema import JsonSchema
 import logging

-from .. log_level import LogLevel
+from .pulsar_backend import PulsarBackend

 logger = logging.getLogger(__name__)

+# Default connection settings from environment
+DEFAULT_PULSAR_HOST = os.getenv("PULSAR_HOST", 'pulsar://pulsar:6650')
+DEFAULT_PULSAR_API_KEY = os.getenv("PULSAR_API_KEY", None)
+
 def get_pubsub(**config):
     """
     Factory function to create a pub/sub backend based on configuration.

     Args:
-        config: Configuration dictionary from command-line args
-            Must include 'pubsub_backend' key
+        config: Configuration dictionary from command-line args.
+            Key 'pubsub_backend' selects the backend (default: 'pulsar').

     Returns:
-        Backend instance (PulsarBackend, MQTTBackend, etc.)
-
-    Example:
-        backend = get_pubsub(
-            pubsub_backend='pulsar',
-            pulsar_host='pulsar://localhost:6650'
-        )
+        Backend instance implementing the PubSubBackend protocol.
     """
     backend_type = config.get('pubsub_backend', 'pulsar')

     if backend_type == 'pulsar':
-        from .pulsar_backend import PulsarBackend
         return PulsarBackend(
-            host=config.get('pulsar_host', PulsarClient.default_pulsar_host),
-            api_key=config.get('pulsar_api_key', PulsarClient.default_pulsar_api_key),
+            host=config.get('pulsar_host', DEFAULT_PULSAR_HOST),
+            api_key=config.get('pulsar_api_key', DEFAULT_PULSAR_API_KEY),
             listener=config.get('pulsar_listener'),
         )
-    elif backend_type == 'mqtt':
-        # TODO: Implement MQTT backend
-        raise NotImplementedError("MQTT backend not yet implemented")
     else:
         raise ValueError(f"Unknown pub/sub backend: {backend_type}")

-class PulsarClient:
-
-    default_pulsar_host = os.getenv("PULSAR_HOST", 'pulsar://pulsar:6650')
-    default_pulsar_api_key = os.getenv("PULSAR_API_KEY", None)
-
-    def __init__(self, **params):
-
-        self.client = None
-
-        pulsar_host = params.get("pulsar_host", self.default_pulsar_host)
-        pulsar_listener = params.get("pulsar_listener", None)
-        pulsar_api_key = params.get(
-            "pulsar_api_key",
-            self.default_pulsar_api_key
-        )
-
-        self.pulsar_host = pulsar_host
-        self.pulsar_api_key = pulsar_api_key
-
-        # Hard-code Pulsar logging to ERROR level to minimize noise
-        if pulsar_api_key:
-            auth = pulsar.AuthenticationToken(pulsar_api_key)
-            self.client = pulsar.Client(
-                pulsar_host,
-                authentication=auth,
-                logger=pulsar.ConsoleLogger(_pulsar.LoggerLevel.Error)
-            )
-        else:
-            self.client = pulsar.Client(
-                pulsar_host,
-                listener_name=pulsar_listener,
-                logger=pulsar.ConsoleLogger(_pulsar.LoggerLevel.Error)
-            )
-
-        self.pulsar_listener = pulsar_listener
-
-    def close(self):
-        self.client.close()
-
-    def __del__(self):
-        if hasattr(self, "client"):
-            if self.client:
-                self.client.close()
-
-    @staticmethod
-    def add_args(parser):
-
-        parser.add_argument(
-            '-p', '--pulsar-host',
-            default=__class__.default_pulsar_host,
-            help=f'Pulsar host (default: {__class__.default_pulsar_host})',
-        )
-
-        parser.add_argument(
-            '--pulsar-api-key',
-            default=__class__.default_pulsar_api_key,
-            help=f'Pulsar API key',
-        )
-
-        parser.add_argument(
-            '--pulsar-listener',
-            help=f'Pulsar listener (default: none)',
-        )
+STANDALONE_PULSAR_HOST = 'pulsar://localhost:6650'
+
+def add_pubsub_args(parser, standalone=False):
+    """Add pub/sub CLI arguments to an argument parser.
+
+    Args:
+        parser: argparse.ArgumentParser
+        standalone: If True, default host is localhost (for CLI tools
+            that run outside containers)
+    """
+    host = STANDALONE_PULSAR_HOST if standalone else DEFAULT_PULSAR_HOST
+    listener_default = 'localhost' if standalone else None
+
+    parser.add_argument(
+        '--pubsub-backend',
+        default=os.getenv('PUBSUB_BACKEND', 'pulsar'),
+        help='Pub/sub backend (default: pulsar, env: PUBSUB_BACKEND)',
+    )
+
+    parser.add_argument(
+        '-p', '--pulsar-host',
+        default=host,
+        help=f'Pulsar host (default: {host})',
+    )
+
+    parser.add_argument(
+        '--pulsar-api-key',
+        default=DEFAULT_PULSAR_API_KEY,
+        help='Pulsar API key',
+    )
+
+    parser.add_argument(
+        '--pulsar-listener',
+        default=listener_default,
+        help=f'Pulsar listener (default: {listener_default or "none"})',
+    )
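The standalone flag above changes only the defaults, not the flags themselves. A minimal, self-contained sketch of how a CLI tool would consume this (the helper is re-declared here so the snippet runs on its own; names mirror the diff, and the simplified helper registers only two of the four arguments):

```python
import argparse
import os

# Defaults mirroring the diff above; re-declared locally for a runnable sketch.
DEFAULT_PULSAR_HOST = os.getenv("PULSAR_HOST", "pulsar://pulsar:6650")
STANDALONE_PULSAR_HOST = "pulsar://localhost:6650"

def add_pubsub_args(parser, standalone=False):
    # Standalone CLI tools default to a localhost broker.
    host = STANDALONE_PULSAR_HOST if standalone else DEFAULT_PULSAR_HOST
    parser.add_argument("--pubsub-backend",
                        default=os.getenv("PUBSUB_BACKEND", "pulsar"))
    parser.add_argument("-p", "--pulsar-host", default=host)

parser = argparse.ArgumentParser()
add_pubsub_args(parser, standalone=True)
args = parser.parse_args([])        # no flags given: take defaults
print(args.pulsar_host)             # pulsar://localhost:6650
print(args.pubsub_backend)          # pulsar, unless PUBSUB_BACKEND is set
```

The resulting namespace can then be passed as keyword arguments into `get_pubsub(**vars(args))`, keeping backend selection entirely in configuration.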

View file

@@ -181,8 +181,11 @@ class PulsarBackendConsumer:
         self._schema_cls = schema_cls

     def receive(self, timeout_millis: int = 2000) -> Message:
-        """Receive a message."""
-        pulsar_msg = self._consumer.receive(timeout_millis=timeout_millis)
+        """Receive a message. Raises TimeoutError if no message available."""
+        try:
+            pulsar_msg = self._consumer.receive(timeout_millis=timeout_millis)
+        except _pulsar.Timeout:
+            raise TimeoutError("No message received within timeout")
         return PulsarMessage(pulsar_msg, self._schema_cls)
def acknowledge(self, message: Message) -> None: def acknowledge(self, message: Message) -> None:
@@ -237,38 +240,44 @@ class PulsarBackend:
         self.client = pulsar.Client(**client_args)
         logger.info(f"Pulsar client connected to {host}")

-    def map_topic(self, generic_topic: str) -> str:
+    def map_topic(self, queue_id: str) -> str:
         """
-        Map generic topic format to Pulsar URI.
+        Map queue identifier to Pulsar URI.

-        Format: qos/tenant/namespace/queue
-        Example: q1/tg/flow/my-queue -> persistent://tg/flow/my-queue
+        Format: class:topicspace:topic
+        Example: flow:tg:text-completion-request -> persistent://tg/flow/text-completion-request

         Args:
-            generic_topic: Generic topic string or already-formatted Pulsar URI
+            queue_id: Queue identifier string or already-formatted Pulsar URI

         Returns:
             Pulsar topic URI
         """
         # If already a Pulsar URI, return as-is
-        if '://' in generic_topic:
-            return generic_topic
+        if '://' in queue_id:
+            return queue_id

-        parts = generic_topic.split('/', 3)
-        if len(parts) != 4:
-            raise ValueError(f"Invalid topic format: {generic_topic}, expected qos/tenant/namespace/queue")
+        parts = queue_id.split(':', 2)
+        if len(parts) != 3:
+            raise ValueError(
+                f"Invalid queue format: {queue_id}, "
+                f"expected class:topicspace:topic"
+            )

-        qos, tenant, namespace, queue = parts
+        cls, topicspace, topic = parts

-        # Map QoS to persistence
-        if qos == 'q0':
-            persistence = 'non-persistent'
-        elif qos in ['q1', 'q2']:
+        # Map class to Pulsar persistence and namespace
+        if cls in ('flow', 'state'):
             persistence = 'persistent'
+        elif cls in ('request', 'response'):
+            persistence = 'non-persistent'
         else:
-            raise ValueError(f"Invalid QoS level: {qos}, expected q0, q1, or q2")
+            raise ValueError(
+                f"Invalid queue class: {cls}, "
+                f"expected flow, request, response, or state"
+            )

-        return f"{persistence}://{tenant}/{namespace}/{queue}"
+        return f"{persistence}://{topicspace}/{cls}/{topic}"

     def create_producer(self, topic: str, schema: type, **options) -> BackendProducer:
         """

View file

@@ -1,5 +1,4 @@
-import _pulsar

 from .. schema import AgentRequest, AgentResponse
 from .. schema import agent_request_queue
@@ -7,15 +6,11 @@ from .. schema import agent_response_queue

 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class AgentClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -27,7 +22,6 @@ class AgentClient(BaseClient):
         if output_queue is None: output_queue = agent_response_queue

         super(AgentClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,10 +1,6 @@
-import pulsar
-import _pulsar
-import hashlib
 import uuid
 import time
-from pulsar.schema import JsonSchema

 from .. exceptions import *
 from ..base.pubsub import get_pubsub
@@ -12,16 +8,11 @@ from ..base.pubsub import get_pubsub
 # Default timeout for a request/response. In seconds.
 DEFAULT_TIMEOUT=300

-# Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug
-
 class BaseClient:

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -87,7 +78,7 @@ class BaseClient:
         try:
             msg = self.consumer.receive(timeout_millis=2500)
-        except pulsar.exceptions.Timeout:
+        except TimeoutError:
             continue

         mid = msg.properties()["id"]
@@ -139,4 +130,3 @@ class BaseClient:
         if hasattr(self, "backend"):
             self.backend.close()
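Catching the built-in `TimeoutError` instead of `pulsar.exceptions.Timeout` is what lets this loop compile without any Pulsar import. An illustrative poll loop in the same shape, using a hypothetical stub consumer in place of the backend:

```python
# Illustrative poll loop: the backend consumer now raises the standard
# TimeoutError, so callers need no broker-specific exception type.
def poll(consumer, max_polls=3):
    """Poll until a message arrives or give up. `consumer` is any object
    with receive(timeout_millis=...); the stub below stands in for it."""
    for _ in range(max_polls):
        try:
            return consumer.receive(timeout_millis=2500)
        except TimeoutError:
            continue                  # no message yet; poll again
    return None

class StubConsumer:
    """Hypothetical stand-in: times out twice, then yields a message."""
    def __init__(self):
        self.calls = 0
    def receive(self, timeout_millis):
        self.calls += 1
        if self.calls < 3:
            raise TimeoutError("No message received within timeout")
        return {"id": "m1"}

print(poll(StubConsumer()))   # {'id': 'm1'}
```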

View file

@@ -1,5 +1,4 @@
-import _pulsar

 import json
 import dataclasses
@@ -9,10 +8,6 @@ from .. schema import config_response_queue
 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 @dataclasses.dataclass
 class Definition:
@@ -34,7 +29,7 @@ class Topic:
 class ConfigClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -50,7 +45,6 @@ class ConfigClient(BaseClient):
             output_queue = config_response_queue

         super(ConfigClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,5 +1,4 @@
-import _pulsar

 from .. schema import DocumentEmbeddingsRequest, DocumentEmbeddingsResponse
 from .. schema import document_embeddings_request_queue
@@ -7,15 +6,11 @@ from .. schema import document_embeddings_response_queue

 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class DocumentEmbeddingsClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -30,7 +25,6 @@ class DocumentEmbeddingsClient(BaseClient):
             output_queue = document_embeddings_response_queue

         super(DocumentEmbeddingsClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,21 +1,15 @@
-import _pulsar

 from .. schema import DocumentRagQuery, DocumentRagResponse
 from .. schema import document_rag_request_queue, document_rag_response_queue

 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class DocumentRagClient(BaseClient):

     def __init__(
         self,
-        log_level=ERROR,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -30,7 +24,6 @@ class DocumentRagClient(BaseClient):
             output_queue = document_rag_response_queue

         super(DocumentRagClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,20 +1,14 @@
-from pulsar.schema import JsonSchema

 from .. schema import EmbeddingsRequest, EmbeddingsResponse

 from . base import BaseClient

-import _pulsar

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class EmbeddingsClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         input_queue=None,
         output_queue=None,
         subscriber=None,
@@ -23,7 +17,6 @@ class EmbeddingsClient(BaseClient):
     ):
         super(EmbeddingsClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,5 +1,4 @@
-import _pulsar

 from .. schema import GraphEmbeddingsRequest, GraphEmbeddingsResponse
 from .. schema import graph_embeddings_request_queue
@@ -7,15 +6,11 @@ from .. schema import graph_embeddings_response_queue

 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class GraphEmbeddingsClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -30,7 +25,6 @@ class GraphEmbeddingsClient(BaseClient):
             output_queue = graph_embeddings_response_queue

         super(GraphEmbeddingsClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,21 +1,15 @@
-import _pulsar

 from .. schema import GraphRagQuery, GraphRagResponse
 from .. schema import graph_rag_request_queue, graph_rag_response_queue

 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class GraphRagClient(BaseClient):

     def __init__(
         self,
-        log_level=ERROR,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -30,7 +24,6 @@ class GraphRagClient(BaseClient):
             output_queue = graph_rag_response_queue

         super(GraphRagClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,5 +1,4 @@
-import _pulsar

 from .. schema import TextCompletionRequest, TextCompletionResponse
 from .. schema import text_completion_request_queue
@@ -8,15 +7,11 @@ from . base import BaseClient
 from .. exceptions import LlmError

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class LlmClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -28,7 +23,6 @@ class LlmClient(BaseClient):
         if output_queue is None: output_queue = text_completion_response_queue

         super(LlmClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,5 +1,4 @@
-import _pulsar

 import json
 import dataclasses
@@ -9,10 +8,6 @@ from .. schema import prompt_response_queue
 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 @dataclasses.dataclass
 class Definition:
@@ -34,7 +29,7 @@ class Topic:
 class PromptClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -49,7 +44,6 @@ class PromptClient(BaseClient):
             output_queue = prompt_response_queue

         super(PromptClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,5 +1,4 @@
-import _pulsar

 from .. schema import RowEmbeddingsRequest, RowEmbeddingsResponse
 from .. schema import row_embeddings_request_queue
@@ -7,15 +6,11 @@ from .. schema import row_embeddings_response_queue

 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class RowEmbeddingsClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -30,7 +25,6 @@ class RowEmbeddingsClient(BaseClient):
             output_queue = row_embeddings_response_queue

         super(RowEmbeddingsClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,6 +1,5 @@
 #!/usr/bin/env python3

-import _pulsar

 from .. schema import TriplesQueryRequest, TriplesQueryResponse, Term, IRI, LITERAL
 from .. schema import triples_request_queue
@@ -8,15 +7,11 @@ from .. schema import triples_response_queue

 from . base import BaseClient

 # Ugly
-ERROR=_pulsar.LoggerLevel.Error
-WARN=_pulsar.LoggerLevel.Warn
-INFO=_pulsar.LoggerLevel.Info
-DEBUG=_pulsar.LoggerLevel.Debug

 class TriplesQueryClient(BaseClient):

     def __init__(
-        self, log_level=ERROR,
+        self,
         subscriber=None,
         input_queue=None,
         output_queue=None,
@@ -31,7 +26,6 @@ class TriplesQueryClient(BaseClient):
             output_queue = triples_response_queue

         super(TriplesQueryClient, self).__init__(
-            log_level=log_level,
             subscriber=subscriber,
             input_queue=input_queue,
             output_queue=output_queue,

View file

@@ -1,6 +1,6 @@
 from enum import Enum
-import _pulsar

 class LogLevel(Enum):
     DEBUG = 'debug'
@@ -10,11 +10,3 @@ class LogLevel(Enum):
     def __str__(self):
         return self.value
-
-    def to_pulsar(self):
-        if self == LogLevel.DEBUG: return _pulsar.LoggerLevel.Debug
-        if self == LogLevel.INFO: return _pulsar.LoggerLevel.Info
-        if self == LogLevel.WARN: return _pulsar.LoggerLevel.Warn
-        if self == LogLevel.ERROR: return _pulsar.LoggerLevel.Error
-        raise RuntimeError("Log level mismatch")

View file

@@ -6,7 +6,7 @@ from .base import MessageTranslator
 class AgentRequestTranslator(MessageTranslator):
     """Translator for AgentRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> AgentRequest:
+    def decode(self, data: Dict[str, Any]) -> AgentRequest:
         return AgentRequest(
             question=data["question"],
             state=data.get("state", None),
@@ -26,7 +26,7 @@ class AgentRequestTranslator(MessageTranslator):
             expected_siblings=data.get("expected_siblings", 0),
         )

-    def from_pulsar(self, obj: AgentRequest) -> Dict[str, Any]:
+    def encode(self, obj: AgentRequest) -> Dict[str, Any]:
         return {
             "question": obj.question,
             "state": obj.state,
@@ -50,10 +50,10 @@ class AgentRequestTranslator(MessageTranslator):
 class AgentResponseTranslator(MessageTranslator):
     """Translator for AgentResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> AgentResponse:
+    def decode(self, data: Dict[str, Any]) -> AgentResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: AgentResponse) -> Dict[str, Any]:
+    def encode(self, obj: AgentResponse) -> Dict[str, Any]:
         result = {}

         if obj.chunk_type:
@@ -81,7 +81,7 @@ class AgentResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: AgentResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: AgentResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
         is_final = getattr(obj, 'end_of_dialog', False)
-        return self.from_pulsar(obj), is_final
+        return self.encode(obj), is_final
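The `(payload, is_final)` contract of `encode_with_completion` lets a stream handler stay translator-agnostic. A runnable sketch, assuming a simplified stand-in for `AgentResponse` and a trimmed-down translator (both hypothetical, not the real schema classes):

```python
from dataclasses import dataclass

# Simplified stand-in for the real AgentResponse schema class.
@dataclass
class AgentResponse:
    chunk: str
    end_of_dialog: bool = False

class AgentResponseTranslator:
    def encode(self, obj):
        return {"chunk": obj.chunk}
    def encode_with_completion(self, obj):
        # (payload, is_final): final when the dialog has ended
        return self.encode(obj), getattr(obj, "end_of_dialog", False)

def drain(translator, responses):
    """Emit each streamed chunk, stopping after the final one."""
    out = []
    for r in responses:
        payload, is_final = translator.encode_with_completion(r)
        out.append(payload)
        if is_final:
            break
    return out

chunks = drain(AgentResponseTranslator(),
               [AgentResponse("a"), AgentResponse("b", True)])
print(chunks)   # [{'chunk': 'a'}, {'chunk': 'b'}]
```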

View file

@@ -1,43 +1,46 @@
 from abc import ABC, abstractmethod
 from typing import Dict, Any, Tuple
-from pulsar.schema import Record

 class Translator(ABC):
-    """Base class for bidirectional Pulsar ↔ dict translation"""
+    """Base class for bidirectional schema ↔ dict translation.
+
+    Translates between external API dicts (JSON from HTTP/WebSocket)
+    and internal schema objects (dataclasses).
+    """

     @abstractmethod
-    def to_pulsar(self, data: Dict[str, Any]) -> Record:
-        """Convert dict to Pulsar schema object"""
+    def decode(self, data: Dict[str, Any]) -> Any:
+        """Convert external dict to schema object."""
         pass

     @abstractmethod
-    def from_pulsar(self, obj: Record) -> Dict[str, Any]:
-        """Convert Pulsar schema object to dict"""
+    def encode(self, obj: Any) -> Dict[str, Any]:
+        """Convert schema object to external dict."""
         pass

 class MessageTranslator(Translator):
-    """For complete request/response message translation"""
+    """For complete request/response message translation."""

-    def from_response_with_completion(self, obj: Record) -> Tuple[Dict[str, Any], bool]:
-        """Returns (response_dict, is_final) - for streaming responses"""
-        return self.from_pulsar(obj), True
+    def encode_with_completion(self, obj: Any) -> Tuple[Dict[str, Any], bool]:
+        """Returns (response_dict, is_final) — for streaming responses."""
+        return self.encode(obj), True

 class SendTranslator(Translator):
-    """For fire-and-forget send operations (like ServiceSender)"""
+    """For fire-and-forget send operations."""

-    def from_pulsar(self, obj: Record) -> Dict[str, Any]:
-        """Usually not needed for send-only operations"""
-        raise NotImplementedError("Send translators typically don't need from_pulsar")
+    def encode(self, obj: Any) -> Dict[str, Any]:
+        """Usually not needed for send-only operations."""
+        raise NotImplementedError("Send translators don't need encode")

-def handle_optional_fields(obj: Record, fields: list) -> Dict[str, Any]:
-    """Helper to extract optional fields from Pulsar object"""
+def handle_optional_fields(obj: Any, fields: list) -> Dict[str, Any]:
+    """Helper to extract optional fields from a schema object."""
     result = {}
     for field in fields:
         value = getattr(obj, field, None)
         if value is not None:
             result[field] = value
     return result
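With `pulsar.schema.Record` gone from the signatures, any plain object works as the schema type. A minimal concrete translator in the shape of the base class above, using an illustrative dataclass (names here are hypothetical, not from the codebase):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict

# Re-declared base class so the sketch is self-contained.
class Translator(ABC):
    @abstractmethod
    def decode(self, data: Dict[str, Any]) -> Any: ...
    @abstractmethod
    def encode(self, obj: Any) -> Dict[str, Any]: ...

# Illustrative schema object: just a dataclass, no broker dependency.
@dataclass
class Ping:
    text: str

class PingTranslator(Translator):
    def decode(self, data):
        return Ping(text=data["text"])
    def encode(self, obj):
        return {"text": obj.text}

t = PingTranslator()
roundtrip = t.encode(t.decode({"text": "hello"}))
print(roundtrip)   # {'text': 'hello'}
```

The decode/encode naming also reads correctly from the gateway's point of view regardless of which broker sits behind the abstraction.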

View file

@@ -6,7 +6,7 @@ from .base import MessageTranslator
 class CollectionManagementRequestTranslator(MessageTranslator):
     """Translator for CollectionManagementRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> CollectionManagementRequest:
+    def decode(self, data: Dict[str, Any]) -> CollectionManagementRequest:
         return CollectionManagementRequest(
             operation=data.get("operation"),
             user=data.get("user"),
@@ -19,7 +19,7 @@ class CollectionManagementRequestTranslator(MessageTranslator):
             limit=data.get("limit")
         )

-    def from_pulsar(self, obj: CollectionManagementRequest) -> Dict[str, Any]:
+    def encode(self, obj: CollectionManagementRequest) -> Dict[str, Any]:
         result = {}

         if obj.operation is not None:
@@ -47,7 +47,7 @@ class CollectionManagementRequestTranslator(MessageTranslator):
 class CollectionManagementResponseTranslator(MessageTranslator):
     """Translator for CollectionManagementResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> CollectionManagementResponse:
+    def decode(self, data: Dict[str, Any]) -> CollectionManagementResponse:
         # Handle error
         error = None
@@ -76,7 +76,7 @@ class CollectionManagementResponseTranslator(MessageTranslator):
             collections=collections
         )

-    def from_pulsar(self, obj: CollectionManagementResponse) -> Dict[str, Any]:
+    def encode(self, obj: CollectionManagementResponse) -> Dict[str, Any]:
         result = {}

         print("COLLECTIONMGMT", obj, flush=True)

View file

@@ -6,7 +6,7 @@ from .base import MessageTranslator
 class ConfigRequestTranslator(MessageTranslator):
     """Translator for ConfigRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> ConfigRequest:
+    def decode(self, data: Dict[str, Any]) -> ConfigRequest:
         keys = None
         if "keys" in data:
             keys = [
@@ -35,7 +35,7 @@ class ConfigRequestTranslator(MessageTranslator):
             values=values
         )

-    def from_pulsar(self, obj: ConfigRequest) -> Dict[str, Any]:
+    def encode(self, obj: ConfigRequest) -> Dict[str, Any]:
         result = {}

         if obj.operation is not None:
@@ -69,10 +69,10 @@ class ConfigRequestTranslator(MessageTranslator):
 class ConfigResponseTranslator(MessageTranslator):
     """Translator for ConfigResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> ConfigResponse:
+    def decode(self, data: Dict[str, Any]) -> ConfigResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: ConfigResponse) -> Dict[str, Any]:
+    def encode(self, obj: ConfigResponse) -> Dict[str, Any]:
         result = {}

         if obj.version is not None:
@@ -96,6 +96,6 @@ class ConfigResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: ConfigResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: ConfigResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), True
+        return self.encode(obj), True

View file

@ -7,7 +7,7 @@ from .base import MessageTranslator
class StructuredDataDiagnosisRequestTranslator(MessageTranslator): class StructuredDataDiagnosisRequestTranslator(MessageTranslator):
"""Translator for StructuredDataDiagnosisRequest schema objects""" """Translator for StructuredDataDiagnosisRequest schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> StructuredDataDiagnosisRequest: def decode(self, data: Dict[str, Any]) -> StructuredDataDiagnosisRequest:
return StructuredDataDiagnosisRequest( return StructuredDataDiagnosisRequest(
operation=data["operation"], operation=data["operation"],
sample=data["sample"], sample=data["sample"],
@ -16,7 +16,7 @@ class StructuredDataDiagnosisRequestTranslator(MessageTranslator):
options=data.get("options", {}) options=data.get("options", {})
) )
def from_pulsar(self, obj: StructuredDataDiagnosisRequest) -> Dict[str, Any]: def encode(self, obj: StructuredDataDiagnosisRequest) -> Dict[str, Any]:
result = { result = {
"operation": obj.operation, "operation": obj.operation,
"sample": obj.sample, "sample": obj.sample,
@ -36,10 +36,10 @@ class StructuredDataDiagnosisRequestTranslator(MessageTranslator):
class StructuredDataDiagnosisResponseTranslator(MessageTranslator): class StructuredDataDiagnosisResponseTranslator(MessageTranslator):
"""Translator for StructuredDataDiagnosisResponse schema objects""" """Translator for StructuredDataDiagnosisResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> StructuredDataDiagnosisResponse: def decode(self, data: Dict[str, Any]) -> StructuredDataDiagnosisResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: StructuredDataDiagnosisResponse) -> Dict[str, Any]: def encode(self, obj: StructuredDataDiagnosisResponse) -> Dict[str, Any]:
result = { result = {
"operation": obj.operation "operation": obj.operation
} }
@ -64,6 +64,6 @@ class StructuredDataDiagnosisResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: StructuredDataDiagnosisResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: StructuredDataDiagnosisResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
return self.from_pulsar(obj), True return self.encode(obj), True
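The rename above fixes a mechanical pattern repeated across every translator: `to_pulsar` becomes `decode` (wire-format dict to schema object), `from_pulsar` becomes `encode` (schema object to wire-format dict), and `from_response_with_completion` becomes `encode_with_completion`. A minimal sketch of the renamed contract, using an illustrative `EchoRequest` type that is not from the repository:

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class EchoRequest:
    """Stand-in for a generated schema class (illustrative only)."""
    operation: str


class EchoRequestTranslator:
    """Sketch of the broker-neutral translator contract after the rename."""

    def decode(self, data: Dict[str, Any]) -> EchoRequest:
        # Wire-format dict -> schema object (formerly to_pulsar).
        return EchoRequest(operation=data["operation"])

    def encode(self, obj: EchoRequest) -> Dict[str, Any]:
        # Schema object -> wire-format dict (formerly from_pulsar).
        return {"operation": obj.operation}

    def encode_with_completion(self, obj: EchoRequest) -> Tuple[Dict[str, Any], bool]:
        """Returns (response_dict, is_final)."""
        return self.encode(obj), True
```

The new names carry no broker vocabulary, which is the point of the change: application code talks in encode/decode terms while the Pulsar specifics stay confined to the backend module.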
View file
@@ -30,7 +30,7 @@ def _decode_text_payload(payload: str | bytes, charset: str) -> str:
 class DocumentTranslator(SendTranslator):
     """Translator for Document schema objects (PDF docs etc.)"""

-    def to_pulsar(self, data: Dict[str, Any]) -> Document:
+    def decode(self, data: Dict[str, Any]) -> Document:
         # Handle base64 content validation
         doc = base64.b64decode(data["data"])
@@ -45,7 +45,7 @@ class DocumentTranslator(SendTranslator):
             data=base64.b64encode(doc).decode("utf-8")
         )

-    def from_pulsar(self, obj: Document) -> Dict[str, Any]:
+    def encode(self, obj: Document) -> Dict[str, Any]:
         result = {
             "data": obj.data
         }
@@ -69,7 +69,7 @@ class DocumentTranslator(SendTranslator):
 class TextDocumentTranslator(SendTranslator):
     """Translator for TextDocument schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> TextDocument:
+    def decode(self, data: Dict[str, Any]) -> TextDocument:
         charset = data.get("charset", "utf-8")
         text = _decode_text_payload(data["text"], charset)
@@ -85,7 +85,7 @@ class TextDocumentTranslator(SendTranslator):
             text=text.encode("utf-8")
         )

-    def from_pulsar(self, obj: TextDocument) -> Dict[str, Any]:
+    def encode(self, obj: TextDocument) -> Dict[str, Any]:
         result = {
             "text": obj.text.decode("utf-8") if isinstance(obj.text, bytes) else obj.text
         }
@@ -109,7 +109,7 @@ class TextDocumentTranslator(SendTranslator):
 class ChunkTranslator(SendTranslator):
     """Translator for Chunk schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> Chunk:
+    def decode(self, data: Dict[str, Any]) -> Chunk:
         from ...schema import Metadata
         return Chunk(
             metadata=Metadata(
@@ -121,7 +121,7 @@ class ChunkTranslator(SendTranslator):
             chunk=data["chunk"].encode("utf-8") if isinstance(data["chunk"], str) else data["chunk"]
         )

-    def from_pulsar(self, obj: Chunk) -> Dict[str, Any]:
+    def encode(self, obj: Chunk) -> Dict[str, Any]:
         result = {
             "chunk": obj.chunk.decode("utf-8") if isinstance(obj.chunk, bytes) else obj.chunk
         }
@@ -145,7 +145,7 @@ class ChunkTranslator(SendTranslator):
 class DocumentEmbeddingsTranslator(SendTranslator):
     """Translator for DocumentEmbeddings schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> DocumentEmbeddings:
+    def decode(self, data: Dict[str, Any]) -> DocumentEmbeddings:
         metadata = data.get("metadata", {})
         chunks = [
@@ -167,7 +167,7 @@ class DocumentEmbeddingsTranslator(SendTranslator):
             chunks=chunks
         )

-    def from_pulsar(self, obj: DocumentEmbeddings) -> Dict[str, Any]:
+    def encode(self, obj: DocumentEmbeddings) -> Dict[str, Any]:
         result = {
             "chunks": [
                 {
View file
@@ -6,12 +6,12 @@ from .base import MessageTranslator
 class EmbeddingsRequestTranslator(MessageTranslator):
     """Translator for EmbeddingsRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> EmbeddingsRequest:
+    def decode(self, data: Dict[str, Any]) -> EmbeddingsRequest:
         return EmbeddingsRequest(
             texts=data["texts"]
         )

-    def from_pulsar(self, obj: EmbeddingsRequest) -> Dict[str, Any]:
+    def encode(self, obj: EmbeddingsRequest) -> Dict[str, Any]:
         return {
             "texts": obj.texts
         }
@@ -20,14 +20,14 @@ class EmbeddingsRequestTranslator(MessageTranslator):
 class EmbeddingsResponseTranslator(MessageTranslator):
     """Translator for EmbeddingsResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> EmbeddingsResponse:
+    def decode(self, data: Dict[str, Any]) -> EmbeddingsResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: EmbeddingsResponse) -> Dict[str, Any]:
+    def encode(self, obj: EmbeddingsResponse) -> Dict[str, Any]:
         return {
             "vectors": obj.vectors
         }

-    def from_response_with_completion(self, obj: EmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: EmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), True
+        return self.encode(obj), True
View file
@@ -11,7 +11,7 @@ from .primitives import ValueTranslator
 class DocumentEmbeddingsRequestTranslator(MessageTranslator):
     """Translator for DocumentEmbeddingsRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> DocumentEmbeddingsRequest:
+    def decode(self, data: Dict[str, Any]) -> DocumentEmbeddingsRequest:
         return DocumentEmbeddingsRequest(
             vector=data["vector"],
             limit=int(data.get("limit", 10)),
@@ -19,7 +19,7 @@ class DocumentEmbeddingsRequestTranslator(MessageTranslator):
             collection=data.get("collection", "default")
         )

-    def from_pulsar(self, obj: DocumentEmbeddingsRequest) -> Dict[str, Any]:
+    def encode(self, obj: DocumentEmbeddingsRequest) -> Dict[str, Any]:
         return {
             "vector": obj.vector,
             "limit": obj.limit,
@@ -31,10 +31,10 @@ class DocumentEmbeddingsRequestTranslator(MessageTranslator):
 class DocumentEmbeddingsResponseTranslator(MessageTranslator):
     """Translator for DocumentEmbeddingsResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> DocumentEmbeddingsResponse:
+    def decode(self, data: Dict[str, Any]) -> DocumentEmbeddingsResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: DocumentEmbeddingsResponse) -> Dict[str, Any]:
+    def encode(self, obj: DocumentEmbeddingsResponse) -> Dict[str, Any]:
         result = {}

         if obj.chunks is not None:
@@ -48,15 +48,15 @@ class DocumentEmbeddingsResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: DocumentEmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: DocumentEmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), True
+        return self.encode(obj), True

 class GraphEmbeddingsRequestTranslator(MessageTranslator):
     """Translator for GraphEmbeddingsRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> GraphEmbeddingsRequest:
+    def decode(self, data: Dict[str, Any]) -> GraphEmbeddingsRequest:
         return GraphEmbeddingsRequest(
             vector=data["vector"],
             limit=int(data.get("limit", 10)),
@@ -64,7 +64,7 @@ class GraphEmbeddingsRequestTranslator(MessageTranslator):
             collection=data.get("collection", "default")
         )

-    def from_pulsar(self, obj: GraphEmbeddingsRequest) -> Dict[str, Any]:
+    def encode(self, obj: GraphEmbeddingsRequest) -> Dict[str, Any]:
         return {
             "vector": obj.vector,
             "limit": obj.limit,
@@ -79,16 +79,16 @@ class GraphEmbeddingsResponseTranslator(MessageTranslator):
     def __init__(self):
         self.value_translator = ValueTranslator()

-    def to_pulsar(self, data: Dict[str, Any]) -> GraphEmbeddingsResponse:
+    def decode(self, data: Dict[str, Any]) -> GraphEmbeddingsResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: GraphEmbeddingsResponse) -> Dict[str, Any]:
+    def encode(self, obj: GraphEmbeddingsResponse) -> Dict[str, Any]:
         result = {}

         if obj.entities is not None:
             result["entities"] = [
                 {
-                    "entity": self.value_translator.from_pulsar(match.entity),
+                    "entity": self.value_translator.encode(match.entity),
                     "score": match.score
                 }
                 for match in obj.entities
@@ -96,15 +96,15 @@ class GraphEmbeddingsResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: GraphEmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: GraphEmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), True
+        return self.encode(obj), True

 class RowEmbeddingsRequestTranslator(MessageTranslator):
     """Translator for RowEmbeddingsRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> RowEmbeddingsRequest:
+    def decode(self, data: Dict[str, Any]) -> RowEmbeddingsRequest:
         return RowEmbeddingsRequest(
             vector=data["vector"],
             limit=int(data.get("limit", 10)),
@@ -114,7 +114,7 @@ class RowEmbeddingsRequestTranslator(MessageTranslator):
             index_name=data.get("index_name")
         )

-    def from_pulsar(self, obj: RowEmbeddingsRequest) -> Dict[str, Any]:
+    def encode(self, obj: RowEmbeddingsRequest) -> Dict[str, Any]:
         result = {
             "vector": obj.vector,
             "limit": obj.limit,
@@ -130,10 +130,10 @@ class RowEmbeddingsRequestTranslator(MessageTranslator):
 class RowEmbeddingsResponseTranslator(MessageTranslator):
     """Translator for RowEmbeddingsResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> RowEmbeddingsResponse:
+    def decode(self, data: Dict[str, Any]) -> RowEmbeddingsResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: RowEmbeddingsResponse) -> Dict[str, Any]:
+    def encode(self, obj: RowEmbeddingsResponse) -> Dict[str, Any]:
         result = {}

         if obj.error is not None:
@@ -155,6 +155,6 @@ class RowEmbeddingsResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: RowEmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: RowEmbeddingsResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), True
+        return self.encode(obj), True
View file
@@ -6,7 +6,7 @@ from .base import MessageTranslator
 class FlowRequestTranslator(MessageTranslator):
     """Translator for FlowRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> FlowRequest:
+    def decode(self, data: Dict[str, Any]) -> FlowRequest:
         return FlowRequest(
             operation=data.get("operation"),
             blueprint_name=data.get("blueprint-name"),
@@ -16,7 +16,7 @@ class FlowRequestTranslator(MessageTranslator):
             parameters=data.get("parameters")
         )

-    def from_pulsar(self, obj: FlowRequest) -> Dict[str, Any]:
+    def encode(self, obj: FlowRequest) -> Dict[str, Any]:
         result = {}

         if obj.operation is not None:
@@ -38,10 +38,10 @@ class FlowRequestTranslator(MessageTranslator):
 class FlowResponseTranslator(MessageTranslator):
     """Translator for FlowResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> FlowResponse:
+    def decode(self, data: Dict[str, Any]) -> FlowResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: FlowResponse) -> Dict[str, Any]:
+    def encode(self, obj: FlowResponse) -> Dict[str, Any]:
         result = {}

         if obj.blueprint_names is not None:
@@ -59,6 +59,6 @@ class FlowResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: FlowResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: FlowResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), True
+        return self.encode(obj), True
View file
@@ -14,7 +14,7 @@ class KnowledgeRequestTranslator(MessageTranslator):
         self.value_translator = ValueTranslator()
         self.subgraph_translator = SubgraphTranslator()

-    def to_pulsar(self, data: Dict[str, Any]) -> KnowledgeRequest:
+    def decode(self, data: Dict[str, Any]) -> KnowledgeRequest:
         triples = None
         if "triples" in data:
             triples = Triples(
@@ -24,7 +24,7 @@ class KnowledgeRequestTranslator(MessageTranslator):
                     user=data["triples"]["metadata"]["user"],
                     collection=data["triples"]["metadata"]["collection"]
                 ),
-                triples=self.subgraph_translator.to_pulsar(data["triples"]["triples"]),
+                triples=self.subgraph_translator.decode(data["triples"]["triples"]),
             )

         graph_embeddings = None
@@ -38,7 +38,7 @@ class KnowledgeRequestTranslator(MessageTranslator):
                 ),
                 entities=[
                     EntityEmbeddings(
-                        entity=self.value_translator.to_pulsar(ent["entity"]),
+                        entity=self.value_translator.decode(ent["entity"]),
                         vectors=ent["vectors"],
                     )
                     for ent in data["graph-embeddings"]["entities"]
@@ -55,7 +55,7 @@ class KnowledgeRequestTranslator(MessageTranslator):
             graph_embeddings=graph_embeddings,
         )

-    def from_pulsar(self, obj: KnowledgeRequest) -> Dict[str, Any]:
+    def encode(self, obj: KnowledgeRequest) -> Dict[str, Any]:
         result = {}

         if obj.operation:
@@ -77,7 +77,7 @@ class KnowledgeRequestTranslator(MessageTranslator):
                     "user": obj.triples.metadata.user,
                     "collection": obj.triples.metadata.collection,
                 },
-                "triples": self.subgraph_translator.from_pulsar(obj.triples.triples),
+                "triples": self.subgraph_translator.encode(obj.triples.triples),
             }

         if obj.graph_embeddings:
@@ -91,7 +91,7 @@ class KnowledgeRequestTranslator(MessageTranslator):
                 "entities": [
                     {
                         "vector": entity.vector,
-                        "entity": self.value_translator.from_pulsar(entity.entity),
+                        "entity": self.value_translator.encode(entity.entity),
                     }
                     for entity in obj.graph_embeddings.entities
                 ],
@@ -107,10 +107,10 @@ class KnowledgeResponseTranslator(MessageTranslator):
         self.value_translator = ValueTranslator()
         self.subgraph_translator = SubgraphTranslator()

-    def to_pulsar(self, data: Dict[str, Any]) -> KnowledgeResponse:
+    def decode(self, data: Dict[str, Any]) -> KnowledgeResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: KnowledgeResponse) -> Dict[str, Any]:
+    def encode(self, obj: KnowledgeResponse) -> Dict[str, Any]:
         # Response to list operation
         if obj.ids is not None:
             return {"ids": obj.ids}
@@ -125,7 +125,7 @@ class KnowledgeResponseTranslator(MessageTranslator):
                     "user": obj.triples.metadata.user,
                     "collection": obj.triples.metadata.collection,
                 },
-                "triples": self.subgraph_translator.from_pulsar(obj.triples.triples),
+                "triples": self.subgraph_translator.encode(obj.triples.triples),
             }
         }
@@ -142,7 +142,7 @@ class KnowledgeResponseTranslator(MessageTranslator):
                 "entities": [
                     {
                         "vector": entity.vector,
-                        "entity": self.value_translator.from_pulsar(entity.entity),
+                        "entity": self.value_translator.encode(entity.entity),
                     }
                     for entity in obj.graph_embeddings.entities
                 ],
@@ -156,9 +156,9 @@ class KnowledgeResponseTranslator(MessageTranslator):
         # Empty response (successful delete)
         return {}

-    def from_response_with_completion(self, obj: KnowledgeResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: KnowledgeResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        response = self.from_pulsar(obj)
+        response = self.encode(obj)

         # Check if this is a final response
         is_final = (
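Unlike the single-shot translators, `KnowledgeResponseTranslator.encode_with_completion` derives `is_final` from the message itself rather than hard-coding `True`, so a multi-message response stream can keep the request open until the last chunk. An illustrative sketch of that pattern (the `PageOfIds` type and its `eos` flag are hypothetical, not repo code):

```python
from typing import Any, Dict, List, Tuple


class PageOfIds:
    """Illustrative streamed-response object: one page of results plus an end-of-stream flag."""

    def __init__(self, ids: List[str], eos: bool):
        self.ids = ids
        self.eos = eos


class PagedResponseTranslator:
    def encode(self, obj: PageOfIds) -> Dict[str, Any]:
        return {"ids": obj.ids}

    def encode_with_completion(self, obj: PageOfIds) -> Tuple[Dict[str, Any], bool]:
        # is_final comes from the message, so intermediate pages return
        # (payload, False) and only the last page returns (payload, True).
        return self.encode(obj), obj.eos
```

A gateway consuming this can forward each `(payload, is_final)` pair to the client and close the request only when `is_final` is true.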
View file
@@ -11,16 +11,16 @@ class LibraryRequestTranslator(MessageTranslator):
         self.doc_metadata_translator = DocumentMetadataTranslator()
         self.proc_metadata_translator = ProcessingMetadataTranslator()

-    def to_pulsar(self, data: Dict[str, Any]) -> LibrarianRequest:
+    def decode(self, data: Dict[str, Any]) -> LibrarianRequest:
         # Document metadata
         doc_metadata = None
         if "document-metadata" in data:
-            doc_metadata = self.doc_metadata_translator.to_pulsar(data["document-metadata"])
+            doc_metadata = self.doc_metadata_translator.decode(data["document-metadata"])

         # Processing metadata
         proc_metadata = None
         if "processing-metadata" in data:
-            proc_metadata = self.proc_metadata_translator.to_pulsar(data["processing-metadata"])
+            proc_metadata = self.proc_metadata_translator.decode(data["processing-metadata"])

         # Criteria
         criteria = []
@@ -61,7 +61,7 @@ class LibraryRequestTranslator(MessageTranslator):
             include_children=data.get("include-children", False),
         )

-    def from_pulsar(self, obj: LibrarianRequest) -> Dict[str, Any]:
+    def encode(self, obj: LibrarianRequest) -> Dict[str, Any]:
         result = {}

         if obj.operation:
@@ -71,9 +71,9 @@ class LibraryRequestTranslator(MessageTranslator):
         if obj.processing_id:
             result["processing-id"] = obj.processing_id
         if obj.document_metadata:
-            result["document-metadata"] = self.doc_metadata_translator.from_pulsar(obj.document_metadata)
+            result["document-metadata"] = self.doc_metadata_translator.encode(obj.document_metadata)
         if obj.processing_metadata:
-            result["processing-metadata"] = self.proc_metadata_translator.from_pulsar(obj.processing_metadata)
+            result["processing-metadata"] = self.proc_metadata_translator.encode(obj.processing_metadata)
         if obj.content:
             result["content"] = obj.content.decode("utf-8") if isinstance(obj.content, bytes) else obj.content
         if obj.user:
@@ -100,10 +100,10 @@ class LibraryResponseTranslator(MessageTranslator):
         self.doc_metadata_translator = DocumentMetadataTranslator()
         self.proc_metadata_translator = ProcessingMetadataTranslator()

-    def to_pulsar(self, data: Dict[str, Any]) -> LibrarianResponse:
+    def decode(self, data: Dict[str, Any]) -> LibrarianResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: LibrarianResponse) -> Dict[str, Any]:
+    def encode(self, obj: LibrarianResponse) -> Dict[str, Any]:
         result = {}

         if obj.error:
@@ -113,20 +113,20 @@ class LibraryResponseTranslator(MessageTranslator):
             }

         if obj.document_metadata:
-            result["document-metadata"] = self.doc_metadata_translator.from_pulsar(obj.document_metadata)
+            result["document-metadata"] = self.doc_metadata_translator.encode(obj.document_metadata)
         if obj.content:
             result["content"] = obj.content.decode("utf-8") if isinstance(obj.content, bytes) else obj.content
         if obj.document_metadatas is not None:
             result["document-metadatas"] = [
-                self.doc_metadata_translator.from_pulsar(dm)
+                self.doc_metadata_translator.encode(dm)
                 for dm in obj.document_metadatas
             ]
         if obj.processing_metadatas is not None:
             result["processing-metadatas"] = [
-                self.proc_metadata_translator.from_pulsar(pm)
+                self.proc_metadata_translator.encode(pm)
                 for pm in obj.processing_metadatas
             ]
@@ -172,6 +172,6 @@ class LibraryResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: LibrarianResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: LibrarianResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), obj.is_final
+        return self.encode(obj), obj.is_final
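The library translators show the composite pattern used throughout: a parent translator holds child translators and delegates nested fields to their `decode`/`encode` methods, so the rename had to be applied consistently at every level of the tree. A self-contained sketch of that delegation, with illustrative `PointTranslator`/`PathTranslator` names rather than repo classes:

```python
from typing import Any, Dict, List, Tuple


class PointTranslator:
    """Leaf translator for a nested value (illustrative)."""

    def decode(self, data: Dict[str, Any]) -> Tuple[int, int]:
        return (data["x"], data["y"])

    def encode(self, pt: Tuple[int, int]) -> Dict[str, Any]:
        return {"x": pt[0], "y": pt[1]}


class PathTranslator:
    """Composite translator delegating each nested field to a child translator."""

    def __init__(self):
        self.point = PointTranslator()

    def decode(self, data: List[Dict[str, Any]]) -> List[Tuple[int, int]]:
        # Parent decode calls child decode for every nested element.
        return [self.point.decode(p) for p in data]

    def encode(self, pts: List[Tuple[int, int]]) -> List[Dict[str, Any]]:
        # Parent encode calls child encode symmetrically.
        return [self.point.encode(p) for p in pts]
```

Because the parent only ever calls `decode`/`encode` on its children, renaming the leaf methods forces a matching rename at every call site, which is why this change touches 55+ files.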
View file
@@ -10,7 +10,7 @@ class DocumentMetadataTranslator(Translator):
     def __init__(self):
         self.subgraph_translator = SubgraphTranslator()

-    def to_pulsar(self, data: Dict[str, Any]) -> DocumentMetadata:
+    def decode(self, data: Dict[str, Any]) -> DocumentMetadata:
         metadata = data.get("metadata", [])
         return DocumentMetadata(
             id=data.get("id"),
@@ -18,14 +18,14 @@ class DocumentMetadataTranslator(Translator):
             kind=data.get("kind"),
             title=data.get("title"),
             comments=data.get("comments"),
-            metadata=self.subgraph_translator.to_pulsar(metadata) if metadata is not None else [],
+            metadata=self.subgraph_translator.decode(metadata) if metadata is not None else [],
             user=data.get("user"),
             tags=data.get("tags"),
             parent_id=data.get("parent-id", ""),
             document_type=data.get("document-type", "source"),
         )

-    def from_pulsar(self, obj: DocumentMetadata) -> Dict[str, Any]:
+    def encode(self, obj: DocumentMetadata) -> Dict[str, Any]:
         result = {}

         if obj.id:
@@ -39,7 +39,7 @@ class DocumentMetadataTranslator(Translator):
         if obj.comments:
             result["comments"] = obj.comments
         if obj.metadata is not None:
-            result["metadata"] = self.subgraph_translator.from_pulsar(obj.metadata)
+            result["metadata"] = self.subgraph_translator.encode(obj.metadata)
         if obj.user:
             result["user"] = obj.user
         if obj.tags is not None:
@@ -55,7 +55,7 @@ class DocumentMetadataTranslator(Translator):
 class ProcessingMetadataTranslator(Translator):
     """Translator for ProcessingMetadata schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> ProcessingMetadata:
+    def decode(self, data: Dict[str, Any]) -> ProcessingMetadata:
         return ProcessingMetadata(
             id=data.get("id"),
             document_id=data.get("document-id"),
@@ -66,7 +66,7 @@ class ProcessingMetadataTranslator(Translator):
             tags=data.get("tags")
         )

-    def from_pulsar(self, obj: ProcessingMetadata) -> Dict[str, Any]:
+    def encode(self, obj: ProcessingMetadata) -> Dict[str, Any]:
         result = {}

         if obj.id:
View file
@@ -6,13 +6,13 @@ from .base import MessageTranslator
 class QuestionToStructuredQueryRequestTranslator(MessageTranslator):
     """Translator for QuestionToStructuredQueryRequest schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> QuestionToStructuredQueryRequest:
+    def decode(self, data: Dict[str, Any]) -> QuestionToStructuredQueryRequest:
         return QuestionToStructuredQueryRequest(
             question=data.get("question", ""),
             max_results=data.get("max_results", 100)
         )

-    def from_pulsar(self, obj: QuestionToStructuredQueryRequest) -> Dict[str, Any]:
+    def encode(self, obj: QuestionToStructuredQueryRequest) -> Dict[str, Any]:
         return {
             "question": obj.question,
             "max_results": obj.max_results
@@ -22,10 +22,10 @@ class QuestionToStructuredQueryRequestTranslator(MessageTranslator):
 class QuestionToStructuredQueryResponseTranslator(MessageTranslator):
     """Translator for QuestionToStructuredQueryResponse schema objects"""

-    def to_pulsar(self, data: Dict[str, Any]) -> QuestionToStructuredQueryResponse:
+    def decode(self, data: Dict[str, Any]) -> QuestionToStructuredQueryResponse:
         raise NotImplementedError("Response translation to Pulsar not typically needed")

-    def from_pulsar(self, obj: QuestionToStructuredQueryResponse) -> Dict[str, Any]:
+    def encode(self, obj: QuestionToStructuredQueryResponse) -> Dict[str, Any]:
         result = {
             "graphql_query": obj.graphql_query,
             "variables": dict(obj.variables) if obj.variables else {},
@@ -42,6 +42,6 @@ class QuestionToStructuredQueryResponseTranslator(MessageTranslator):
         return result

-    def from_response_with_completion(self, obj: QuestionToStructuredQueryResponse) -> Tuple[Dict[str, Any], bool]:
+    def encode_with_completion(self, obj: QuestionToStructuredQueryResponse) -> Tuple[Dict[str, Any], bool]:
         """Returns (response_dict, is_final)"""
-        return self.from_pulsar(obj), True
+        return self.encode(obj), True
View file
@@ -17,7 +17,7 @@ class TermTranslator(Translator):
     - "tr": triple (for TRIPLE type, nested)
     """

-    def to_pulsar(self, data: Dict[str, Any]) -> Term:
+    def decode(self, data: Dict[str, Any]) -> Term:
         term_type = data.get("t", "")

         if term_type == IRI:
@@ -38,7 +38,7 @@ class TermTranslator(Translator):
             # Nested triple - use TripleTranslator
             triple_data = data.get("tr")
             if triple_data:
-                triple = _triple_translator_to_pulsar(triple_data)
+                triple = _triple_translator_decode(triple_data)
             else:
                 triple = None
             return Term(type=TRIPLE, triple=triple)
@@ -47,7 +47,7 @@ class TermTranslator(Translator):
         # Unknown or empty type
         return Term(type=term_type)

-    def from_pulsar(self, obj: Term) -> Dict[str, Any]:
+    def encode(self, obj: Term) -> Dict[str, Any]:
         result: Dict[str, Any] = {"t": obj.type}

         if obj.type == IRI:
@@ -65,33 +65,33 @@ class TermTranslator(Translator):
         elif obj.type == TRIPLE:
             if obj.triple:
-                result["tr"] = _triple_translator_from_pulsar(obj.triple)
+                result["tr"] = _triple_translator_encode(obj.triple)

         return result

 # Module-level helper functions to avoid circular instantiation

-def _triple_translator_to_pulsar(data: Dict[str, Any]) -> Triple:
+def _triple_translator_decode(data: Dict[str, Any]) -> Triple:
     term_translator = TermTranslator()
     return Triple(
-        s=term_translator.to_pulsar(data["s"]) if data.get("s") else None,
-        p=term_translator.to_pulsar(data["p"]) if data.get("p") else None,
-        o=term_translator.to_pulsar(data["o"]) if data.get("o") else None,
+        s=term_translator.decode(data["s"]) if data.get("s") else None,
+        p=term_translator.decode(data["p"]) if data.get("p") else None,
+        o=term_translator.decode(data["o"]) if data.get("o") else None,
         g=data.get("g"),
     )

-def _triple_translator_from_pulsar(obj: Triple) -> Dict[str, Any]:
+def _triple_translator_encode(obj: Triple) -> Dict[str, Any]:
     """Convert Triple object to wire format dict."""
     term_translator = TermTranslator()
     result: Dict[str, Any] = {}

     if obj.s:
-        result["s"] = term_translator.from_pulsar(obj.s)
+        result["s"] = term_translator.encode(obj.s)
     if obj.p:
-        result["p"] = term_translator.from_pulsar(obj.p)
+        result["p"] = term_translator.encode(obj.p)
     if obj.o:
-        result["o"] = term_translator.from_pulsar(obj.o)
+        result["o"] = term_translator.encode(obj.o)
     if obj.g:
         result["g"] = obj.g
@@ -104,23 +104,23 @@ class TripleTranslator(Translator):
     def __init__(self):
         self.term_translator = TermTranslator()

-    def to_pulsar(self, data: Dict[str, Any]) -> Triple:
+    def decode(self, data: Dict[str, Any]) -> Triple:
         return Triple(
-            s=self.term_translator.to_pulsar(data["s"]) if data.get("s") else None,
-            p=self.term_translator.to_pulsar(data["p"]) if data.get("p") else None,
-            o=self.term_translator.to_pulsar(data["o"]) if data.get("o") else None,
+            s=self.term_translator.decode(data["s"]) if data.get("s") else None,
+            p=self.term_translator.decode(data["p"]) if data.get("p") else None,
+            o=self.term_translator.decode(data["o"]) if data.get("o") else None,
             g=data.get("g"),
         )

-    def from_pulsar(self, obj: Triple) -> Dict[str, Any]:
+    def encode(self, obj: Triple) -> Dict[str, Any]:
         result: Dict[str, Any] = {}

         if obj.s:
-            result["s"] = self.term_translator.from_pulsar(obj.s)
+            result["s"] = self.term_translator.encode(obj.s)
         if obj.p:
-            result["p"] = self.term_translator.from_pulsar(obj.p)
+            result["p"] = self.term_translator.encode(obj.p)
         if obj.o:
-            result["o"] = self.term_translator.from_pulsar(obj.o)
+            result["o"] = self.term_translator.encode(obj.o)
         if obj.g:
             result["g"] = obj.g
@@ -137,17 +137,17 @@ class SubgraphTranslator(Translator):
     def __init__(self):
         self.triple_translator = TripleTranslator()

-    def to_pulsar(self, data: List[Dict[str, Any]]) -> List[Triple]:
-        return [self.triple_translator.to_pulsar(t) for t in data]
+    def decode(self, data: List[Dict[str, Any]]) -> List[Triple]:
+        return [self.triple_translator.decode(t) for t in data]

-    def from_pulsar(self, obj: List[Triple]) -> List[Dict[str, Any]]:
+    def encode(self, obj: List[Triple]) -> List[Dict[str, Any]]:
return [self.triple_translator.from_pulsar(t) for t in obj] return [self.triple_translator.encode(t) for t in obj]
class RowSchemaTranslator(Translator): class RowSchemaTranslator(Translator):
"""Translator for RowSchema objects""" """Translator for RowSchema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> RowSchema: def decode(self, data: Dict[str, Any]) -> RowSchema:
"""Convert dict to RowSchema Pulsar object""" """Convert dict to RowSchema Pulsar object"""
fields = [] fields = []
for field_data in data.get("fields", []): for field_data in data.get("fields", []):
@ -169,7 +169,7 @@ class RowSchemaTranslator(Translator):
fields=fields fields=fields
) )
def from_pulsar(self, obj: RowSchema) -> Dict[str, Any]: def encode(self, obj: RowSchema) -> Dict[str, Any]:
"""Convert RowSchema Pulsar object to JSON-serializable dictionary""" """Convert RowSchema Pulsar object to JSON-serializable dictionary"""
result = { result = {
"name": obj.name, "name": obj.name,
@ -200,7 +200,7 @@ class RowSchemaTranslator(Translator):
class FieldTranslator(Translator): class FieldTranslator(Translator):
"""Translator for Field objects""" """Translator for Field objects"""
def to_pulsar(self, data: Dict[str, Any]) -> Field: def decode(self, data: Dict[str, Any]) -> Field:
"""Convert dict to Field Pulsar object""" """Convert dict to Field Pulsar object"""
return Field( return Field(
name=data.get("name", ""), name=data.get("name", ""),
@ -213,7 +213,7 @@ class FieldTranslator(Translator):
enum_values=data.get("enum_values", []) enum_values=data.get("enum_values", [])
) )
def from_pulsar(self, obj: Field) -> Dict[str, Any]: def encode(self, obj: Field) -> Dict[str, Any]:
"""Convert Field Pulsar object to JSON-serializable dictionary""" """Convert Field Pulsar object to JSON-serializable dictionary"""
result = { result = {
"name": obj.name, "name": obj.name,


@ -7,7 +7,7 @@ from .base import MessageTranslator
class PromptRequestTranslator(MessageTranslator): class PromptRequestTranslator(MessageTranslator):
"""Translator for PromptRequest schema objects""" """Translator for PromptRequest schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> PromptRequest: def decode(self, data: Dict[str, Any]) -> PromptRequest:
# Handle both "terms" and "variables" input keys # Handle both "terms" and "variables" input keys
terms = data.get("terms", {}) terms = data.get("terms", {})
if "variables" in data: if "variables" in data:
@ -23,7 +23,7 @@ class PromptRequestTranslator(MessageTranslator):
streaming=data.get("streaming", False) streaming=data.get("streaming", False)
) )
def from_pulsar(self, obj: PromptRequest) -> Dict[str, Any]: def encode(self, obj: PromptRequest) -> Dict[str, Any]:
result = {} result = {}
if obj.id: if obj.id:
@ -37,10 +37,10 @@ class PromptRequestTranslator(MessageTranslator):
class PromptResponseTranslator(MessageTranslator): class PromptResponseTranslator(MessageTranslator):
"""Translator for PromptResponse schema objects""" """Translator for PromptResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> PromptResponse: def decode(self, data: Dict[str, Any]) -> PromptResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: PromptResponse) -> Dict[str, Any]: def encode(self, obj: PromptResponse) -> Dict[str, Any]:
result = {} result = {}
# Include text field if present (even if empty string) # Include text field if present (even if empty string)
@ -55,8 +55,8 @@ class PromptResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: PromptResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: PromptResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
# Check end_of_stream field to determine if this is the final message # Check end_of_stream field to determine if this is the final message
is_final = getattr(obj, 'end_of_stream', True) is_final = getattr(obj, 'end_of_stream', True)
return self.from_pulsar(obj), is_final return self.encode(obj), is_final
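The rename above makes the translator protocol direction-neutral: `decode` turns a wire-format dict into a schema object, `encode` does the reverse, and `encode_with_completion` adds a finality flag for streaming responses. A minimal sketch of that convention (a hypothetical base with a non-streaming default, not the project's actual base class):

```python
from typing import Any, Dict, Tuple

class MessageTranslator:
    """Direction-neutral translator: wire-format dicts <-> schema objects."""

    def decode(self, data: Dict[str, Any]) -> Any:
        """Convert a wire-format dict into a schema object."""
        raise NotImplementedError

    def encode(self, obj: Any) -> Dict[str, Any]:
        """Convert a schema object into a wire-format dict."""
        raise NotImplementedError

    def encode_with_completion(self, obj: Any) -> Tuple[Dict[str, Any], bool]:
        """Returns (response_dict, is_final); non-streaming default is final."""
        return self.encode(obj), True
```

Streaming translators override `encode_with_completion` to derive `is_final` from a field such as `end_of_stream` or `end_of_session`, as the diffs below show.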


@ -6,7 +6,7 @@ from .base import MessageTranslator
class DocumentRagRequestTranslator(MessageTranslator): class DocumentRagRequestTranslator(MessageTranslator):
"""Translator for DocumentRagQuery schema objects""" """Translator for DocumentRagQuery schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> DocumentRagQuery: def decode(self, data: Dict[str, Any]) -> DocumentRagQuery:
return DocumentRagQuery( return DocumentRagQuery(
query=data["query"], query=data["query"],
user=data.get("user", "trustgraph"), user=data.get("user", "trustgraph"),
@ -15,7 +15,7 @@ class DocumentRagRequestTranslator(MessageTranslator):
streaming=data.get("streaming", False) streaming=data.get("streaming", False)
) )
def from_pulsar(self, obj: DocumentRagQuery) -> Dict[str, Any]: def encode(self, obj: DocumentRagQuery) -> Dict[str, Any]:
return { return {
"query": obj.query, "query": obj.query,
"user": obj.user, "user": obj.user,
@ -28,10 +28,10 @@ class DocumentRagRequestTranslator(MessageTranslator):
class DocumentRagResponseTranslator(MessageTranslator): class DocumentRagResponseTranslator(MessageTranslator):
"""Translator for DocumentRagResponse schema objects""" """Translator for DocumentRagResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> DocumentRagResponse: def decode(self, data: Dict[str, Any]) -> DocumentRagResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: DocumentRagResponse) -> Dict[str, Any]: def encode(self, obj: DocumentRagResponse) -> Dict[str, Any]:
result = {} result = {}
# Include message_type for distinguishing chunk vs explain messages # Include message_type for distinguishing chunk vs explain messages
@ -65,17 +65,17 @@ class DocumentRagResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: DocumentRagResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: DocumentRagResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
# Session is complete when end_of_session is True # Session is complete when end_of_session is True
is_final = getattr(obj, 'end_of_session', False) is_final = getattr(obj, 'end_of_session', False)
return self.from_pulsar(obj), is_final return self.encode(obj), is_final
class GraphRagRequestTranslator(MessageTranslator): class GraphRagRequestTranslator(MessageTranslator):
"""Translator for GraphRagQuery schema objects""" """Translator for GraphRagQuery schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> GraphRagQuery: def decode(self, data: Dict[str, Any]) -> GraphRagQuery:
return GraphRagQuery( return GraphRagQuery(
query=data["query"], query=data["query"],
user=data.get("user", "trustgraph"), user=data.get("user", "trustgraph"),
@ -89,7 +89,7 @@ class GraphRagRequestTranslator(MessageTranslator):
streaming=data.get("streaming", False) streaming=data.get("streaming", False)
) )
def from_pulsar(self, obj: GraphRagQuery) -> Dict[str, Any]: def encode(self, obj: GraphRagQuery) -> Dict[str, Any]:
return { return {
"query": obj.query, "query": obj.query,
"user": obj.user, "user": obj.user,
@ -107,10 +107,10 @@ class GraphRagRequestTranslator(MessageTranslator):
class GraphRagResponseTranslator(MessageTranslator): class GraphRagResponseTranslator(MessageTranslator):
"""Translator for GraphRagResponse schema objects""" """Translator for GraphRagResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> GraphRagResponse: def decode(self, data: Dict[str, Any]) -> GraphRagResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: GraphRagResponse) -> Dict[str, Any]: def encode(self, obj: GraphRagResponse) -> Dict[str, Any]:
result = {} result = {}
# Include message_type # Include message_type
@ -144,8 +144,8 @@ class GraphRagResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: GraphRagResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: GraphRagResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
# Session is complete when end_of_session is True # Session is complete when end_of_session is True
is_final = getattr(obj, 'end_of_session', False) is_final = getattr(obj, 'end_of_session', False)
return self.from_pulsar(obj), is_final return self.encode(obj), is_final


@ -7,7 +7,7 @@ import json
class RowsQueryRequestTranslator(MessageTranslator): class RowsQueryRequestTranslator(MessageTranslator):
"""Translator for RowsQueryRequest schema objects""" """Translator for RowsQueryRequest schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> RowsQueryRequest: def decode(self, data: Dict[str, Any]) -> RowsQueryRequest:
return RowsQueryRequest( return RowsQueryRequest(
user=data.get("user", "trustgraph"), user=data.get("user", "trustgraph"),
collection=data.get("collection", "default"), collection=data.get("collection", "default"),
@ -16,7 +16,7 @@ class RowsQueryRequestTranslator(MessageTranslator):
operation_name=data.get("operation_name", None) operation_name=data.get("operation_name", None)
) )
def from_pulsar(self, obj: RowsQueryRequest) -> Dict[str, Any]: def encode(self, obj: RowsQueryRequest) -> Dict[str, Any]:
result = { result = {
"user": obj.user, "user": obj.user,
"collection": obj.collection, "collection": obj.collection,
@ -33,10 +33,10 @@ class RowsQueryRequestTranslator(MessageTranslator):
class RowsQueryResponseTranslator(MessageTranslator): class RowsQueryResponseTranslator(MessageTranslator):
"""Translator for RowsQueryResponse schema objects""" """Translator for RowsQueryResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> RowsQueryResponse: def decode(self, data: Dict[str, Any]) -> RowsQueryResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: RowsQueryResponse) -> Dict[str, Any]: def encode(self, obj: RowsQueryResponse) -> Dict[str, Any]:
result = {} result = {}
# Handle GraphQL response data # Handle GraphQL response data
@ -74,6 +74,6 @@ class RowsQueryResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: RowsQueryResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: RowsQueryResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
return self.from_pulsar(obj), True return self.encode(obj), True


@ -7,14 +7,14 @@ import json
class StructuredQueryRequestTranslator(MessageTranslator): class StructuredQueryRequestTranslator(MessageTranslator):
"""Translator for StructuredQueryRequest schema objects""" """Translator for StructuredQueryRequest schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> StructuredQueryRequest: def decode(self, data: Dict[str, Any]) -> StructuredQueryRequest:
return StructuredQueryRequest( return StructuredQueryRequest(
question=data.get("question", ""), question=data.get("question", ""),
user=data.get("user", "trustgraph"), # Default fallback user=data.get("user", "trustgraph"), # Default fallback
collection=data.get("collection", "default") # Default fallback collection=data.get("collection", "default") # Default fallback
) )
def from_pulsar(self, obj: StructuredQueryRequest) -> Dict[str, Any]: def encode(self, obj: StructuredQueryRequest) -> Dict[str, Any]:
return { return {
"question": obj.question, "question": obj.question,
"user": obj.user, "user": obj.user,
@ -25,10 +25,10 @@ class StructuredQueryRequestTranslator(MessageTranslator):
class StructuredQueryResponseTranslator(MessageTranslator): class StructuredQueryResponseTranslator(MessageTranslator):
"""Translator for StructuredQueryResponse schema objects""" """Translator for StructuredQueryResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> StructuredQueryResponse: def decode(self, data: Dict[str, Any]) -> StructuredQueryResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: StructuredQueryResponse) -> Dict[str, Any]: def encode(self, obj: StructuredQueryResponse) -> Dict[str, Any]:
result = {} result = {}
# Handle structured query response data # Handle structured query response data
@ -55,6 +55,6 @@ class StructuredQueryResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: StructuredQueryResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: StructuredQueryResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
return self.from_pulsar(obj), True return self.encode(obj), True


@ -6,14 +6,14 @@ from .base import MessageTranslator
class TextCompletionRequestTranslator(MessageTranslator): class TextCompletionRequestTranslator(MessageTranslator):
"""Translator for TextCompletionRequest schema objects""" """Translator for TextCompletionRequest schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> TextCompletionRequest: def decode(self, data: Dict[str, Any]) -> TextCompletionRequest:
return TextCompletionRequest( return TextCompletionRequest(
system=data["system"], system=data["system"],
prompt=data["prompt"], prompt=data["prompt"],
streaming=data.get("streaming", False) streaming=data.get("streaming", False)
) )
def from_pulsar(self, obj: TextCompletionRequest) -> Dict[str, Any]: def encode(self, obj: TextCompletionRequest) -> Dict[str, Any]:
return { return {
"system": obj.system, "system": obj.system,
"prompt": obj.prompt "prompt": obj.prompt
@ -23,10 +23,10 @@ class TextCompletionRequestTranslator(MessageTranslator):
class TextCompletionResponseTranslator(MessageTranslator): class TextCompletionResponseTranslator(MessageTranslator):
"""Translator for TextCompletionResponse schema objects""" """Translator for TextCompletionResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> TextCompletionResponse: def decode(self, data: Dict[str, Any]) -> TextCompletionResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: TextCompletionResponse) -> Dict[str, Any]: def encode(self, obj: TextCompletionResponse) -> Dict[str, Any]:
result = {"response": obj.response} result = {"response": obj.response}
if obj.in_token: if obj.in_token:
@ -41,8 +41,8 @@ class TextCompletionResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: TextCompletionResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: TextCompletionResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
# Check end_of_stream field to determine if this is the final message # Check end_of_stream field to determine if this is the final message
is_final = getattr(obj, 'end_of_stream', True) is_final = getattr(obj, 'end_of_stream', True)
return self.from_pulsar(obj), is_final return self.encode(obj), is_final


@ -6,7 +6,7 @@ from .base import MessageTranslator
class ToolRequestTranslator(MessageTranslator): class ToolRequestTranslator(MessageTranslator):
"""Translator for ToolRequest schema objects""" """Translator for ToolRequest schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> ToolRequest: def decode(self, data: Dict[str, Any]) -> ToolRequest:
# Handle both "name" and "parameters" input keys # Handle both "name" and "parameters" input keys
name = data.get("name", "") name = data.get("name", "")
if "parameters" in data: if "parameters" in data:
@ -19,7 +19,7 @@ class ToolRequestTranslator(MessageTranslator):
parameters = parameters, parameters = parameters,
) )
def from_pulsar(self, obj: ToolRequest) -> Dict[str, Any]: def encode(self, obj: ToolRequest) -> Dict[str, Any]:
result = {} result = {}
if obj.name: if obj.name:
@ -32,10 +32,10 @@ class ToolRequestTranslator(MessageTranslator):
class ToolResponseTranslator(MessageTranslator): class ToolResponseTranslator(MessageTranslator):
"""Translator for ToolResponse schema objects""" """Translator for ToolResponse schema objects"""
def to_pulsar(self, data: Dict[str, Any]) -> ToolResponse: def decode(self, data: Dict[str, Any]) -> ToolResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: ToolResponse) -> Dict[str, Any]: def encode(self, obj: ToolResponse) -> Dict[str, Any]:
result = {} result = {}
@ -46,6 +46,6 @@ class ToolResponseTranslator(MessageTranslator):
return result return result
def from_response_with_completion(self, obj: ToolResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: ToolResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
return self.from_pulsar(obj), True return self.encode(obj), True


@ -10,10 +10,10 @@ class TriplesQueryRequestTranslator(MessageTranslator):
def __init__(self): def __init__(self):
self.value_translator = ValueTranslator() self.value_translator = ValueTranslator()
def to_pulsar(self, data: Dict[str, Any]) -> TriplesQueryRequest: def decode(self, data: Dict[str, Any]) -> TriplesQueryRequest:
s = self.value_translator.to_pulsar(data["s"]) if "s" in data else None s = self.value_translator.decode(data["s"]) if "s" in data else None
p = self.value_translator.to_pulsar(data["p"]) if "p" in data else None p = self.value_translator.decode(data["p"]) if "p" in data else None
o = self.value_translator.to_pulsar(data["o"]) if "o" in data else None o = self.value_translator.decode(data["o"]) if "o" in data else None
g = data.get("g") # None=default graph, "*"=all graphs g = data.get("g") # None=default graph, "*"=all graphs
return TriplesQueryRequest( return TriplesQueryRequest(
@ -28,7 +28,7 @@ class TriplesQueryRequestTranslator(MessageTranslator):
batch_size=int(data.get("batch-size", 20)), batch_size=int(data.get("batch-size", 20)),
) )
def from_pulsar(self, obj: TriplesQueryRequest) -> Dict[str, Any]: def encode(self, obj: TriplesQueryRequest) -> Dict[str, Any]:
result = { result = {
"limit": obj.limit, "limit": obj.limit,
"user": obj.user, "user": obj.user,
@ -38,11 +38,11 @@ class TriplesQueryRequestTranslator(MessageTranslator):
} }
if obj.s: if obj.s:
result["s"] = self.value_translator.from_pulsar(obj.s) result["s"] = self.value_translator.encode(obj.s)
if obj.p: if obj.p:
result["p"] = self.value_translator.from_pulsar(obj.p) result["p"] = self.value_translator.encode(obj.p)
if obj.o: if obj.o:
result["o"] = self.value_translator.from_pulsar(obj.o) result["o"] = self.value_translator.encode(obj.o)
if obj.g is not None: if obj.g is not None:
result["g"] = obj.g result["g"] = obj.g
@ -55,14 +55,14 @@ class TriplesQueryResponseTranslator(MessageTranslator):
def __init__(self): def __init__(self):
self.subgraph_translator = SubgraphTranslator() self.subgraph_translator = SubgraphTranslator()
def to_pulsar(self, data: Dict[str, Any]) -> TriplesQueryResponse: def decode(self, data: Dict[str, Any]) -> TriplesQueryResponse:
raise NotImplementedError("Response translation to Pulsar not typically needed") raise NotImplementedError("Response translation to Pulsar not typically needed")
def from_pulsar(self, obj: TriplesQueryResponse) -> Dict[str, Any]: def encode(self, obj: TriplesQueryResponse) -> Dict[str, Any]:
return { return {
"response": self.subgraph_translator.from_pulsar(obj.triples) "response": self.subgraph_translator.encode(obj.triples)
} }
def from_response_with_completion(self, obj: TriplesQueryResponse) -> Tuple[Dict[str, Any], bool]: def encode_with_completion(self, obj: TriplesQueryResponse) -> Tuple[Dict[str, Any], bool]:
"""Returns (response_dict, is_final)""" """Returns (response_dict, is_final)"""
return self.from_pulsar(obj), obj.is_final return self.encode(obj), obj.is_final

View file

@ -1,23 +1,26 @@
def topic(queue_name, qos='q1', tenant='tg', namespace='flow'): def queue(topic, cls='flow', topicspace='tg'):
""" """
Create a generic topic identifier that can be mapped by backends. Create a queue identifier in CLASS:TOPICSPACE:TOPIC format.
Args: Args:
queue_name: The queue/topic name topic: The logical queue name (e.g. 'config', 'librarian')
qos: Quality of service cls: Queue class determining operational characteristics:
- 'q0' = best-effort (no ack) - 'flow' = persistent processing pipeline queue
- 'q1' = at-least-once (ack required) - 'request' = non-persistent, short TTL request queue
- 'q2' = exactly-once (two-phase ack) - 'response' = non-persistent, short TTL response queue
tenant: Tenant identifier for multi-tenancy - 'state' = persistent, last-value state broadcast
namespace: Namespace within tenant topicspace: Deployment isolation prefix (default: 'tg')
Returns: Returns:
Generic topic string: qos/tenant/namespace/queue_name Queue identifier string: cls:topicspace:topic
Examples: Examples:
topic('my-queue') # q1/tg/flow/my-queue queue('text-completion-request')
topic('config', qos='q2', namespace='config') # q2/tg/config/config # flow:tg:text-completion-request
queue('config', cls='request')
# request:tg:config
queue('config', cls='state')
# state:tg:config
""" """
return f"{qos}/{tenant}/{namespace}/{queue_name}" return f"{cls}:{topicspace}:{topic}"
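The new helper reduces to a one-line f-string; a quick sanity check of the naming scheme, using the examples from the docstring above:

```python
def queue(topic, cls='flow', topicspace='tg'):
    """Build a CLASS:TOPICSPACE:TOPIC queue identifier."""
    return f"{cls}:{topicspace}:{topic}"

# Queue class selects operational behaviour; the name carries no QoS.
assert queue('text-completion-request') == 'flow:tg:text-completion-request'
assert queue('config', cls='request') == 'request:tg:config'
assert queue('config', cls='state') == 'state:tg:config'
```

Because class and topicspace are colon-separated prefixes, a backend's `map_topic()` can recover all three components with a single `split(':', 2)`.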


@ -1,7 +1,6 @@
from dataclasses import dataclass from dataclasses import dataclass
from ..core.metadata import Metadata from ..core.metadata import Metadata
from ..core.topic import topic
############################################################################ ############################################################################


@ -2,7 +2,6 @@ from dataclasses import dataclass, field
from ..core.metadata import Metadata from ..core.metadata import Metadata
from ..core.primitives import Term, RowSchema from ..core.primitives import Term, RowSchema
from ..core.topic import topic
############################################################################ ############################################################################


@ -2,7 +2,6 @@ from dataclasses import dataclass, field
from ..core.primitives import Term, Triple from ..core.primitives import Term, Triple
from ..core.metadata import Metadata from ..core.metadata import Metadata
from ..core.topic import topic
############################################################################ ############################################################################


@ -1,6 +1,6 @@
from dataclasses import dataclass, field from dataclasses import dataclass, field
from ..core.primitives import Triple, Error from ..core.primitives import Triple, Error
from ..core.topic import topic from ..core.topic import queue
from ..core.metadata import Metadata from ..core.metadata import Metadata
from .document import Document, TextDocument from .document import Document, TextDocument
from .graph import Triples from .graph import Triples
@ -52,9 +52,5 @@ class KnowledgeResponse:
triples: Triples | None = None triples: Triples | None = None
graph_embeddings: GraphEmbeddings | None = None graph_embeddings: GraphEmbeddings | None = None
knowledge_request_queue = topic( knowledge_request_queue = queue('knowledge', cls='request')
'knowledge', qos='q0', namespace='request' knowledge_response_queue = queue('knowledge', cls='response')
)
knowledge_response_queue = topic(
'knowledge', qos='q0', namespace='response',
)


@ -1,6 +1,5 @@
from dataclasses import dataclass from dataclasses import dataclass
from ..core.topic import topic
############################################################################ ############################################################################


@ -1,7 +1,6 @@
from dataclasses import dataclass, field from dataclasses import dataclass, field
from ..core.metadata import Metadata from ..core.metadata import Metadata
from ..core.topic import topic
############################################################################ ############################################################################


@ -2,7 +2,6 @@ from dataclasses import dataclass, field
from ..core.metadata import Metadata from ..core.metadata import Metadata
from ..core.primitives import RowSchema from ..core.primitives import RowSchema
from ..core.topic import topic
############################################################################ ############################################################################


@ -1,7 +1,6 @@
from dataclasses import dataclass, field from dataclasses import dataclass, field
from ..core.metadata import Metadata from ..core.metadata import Metadata
from ..core.topic import topic
############################################################################ ############################################################################


@ -2,7 +2,6 @@
from dataclasses import dataclass, field from dataclasses import dataclass, field
from typing import Optional from typing import Optional
from ..core.topic import topic
from ..core.primitives import Error from ..core.primitives import Error
############################################################################ ############################################################################


@ -2,7 +2,7 @@ from dataclasses import dataclass, field
from datetime import datetime from datetime import datetime
from ..core.primitives import Error from ..core.primitives import Error
from ..core.topic import topic from ..core.topic import queue
############################################################################ ############################################################################
@ -50,10 +50,6 @@ class CollectionManagementResponse:
# Topics # Topics
collection_request_queue = topic( collection_request_queue = queue('collection', cls='request')
'collection', qos='q0', namespace='request' collection_response_queue = queue('collection', cls='response')
)
collection_response_queue = topic(
'collection', qos='q0', namespace='response'
)


@ -1,7 +1,7 @@
from dataclasses import dataclass, field from dataclasses import dataclass, field
from ..core.topic import topic from ..core.topic import queue
from ..core.primitives import Error from ..core.primitives import Error
############################################################################ ############################################################################
@ -60,15 +60,9 @@ class ConfigPush:
version: int = 0 version: int = 0
config: dict[str, dict[str, str]] = field(default_factory=dict) config: dict[str, dict[str, str]] = field(default_factory=dict)
config_request_queue = topic( config_request_queue = queue('config', cls='request')
'config', qos='q0', namespace='request' config_response_queue = queue('config', cls='response')
) config_push_queue = queue('config', cls='state')
config_response_queue = topic(
'config', qos='q0', namespace='response'
)
config_push_queue = topic(
'config', qos='q2', namespace='config'
)
############################################################################ ############################################################################


@ -1,7 +1,7 @@
from dataclasses import dataclass, field from dataclasses import dataclass, field
from ..core.topic import topic from ..core.topic import queue
from ..core.primitives import Error from ..core.primitives import Error
############################################################################ ############################################################################
@ -61,12 +61,8 @@ class FlowResponse:
# Everything # Everything
error: Error | None = None error: Error | None = None
flow_request_queue = topic( flow_request_queue = queue('flow', cls='request')
'flow', qos='q0', namespace='request' flow_response_queue = queue('flow', cls='response')
)
flow_response_queue = topic(
'flow', qos='q0', namespace='response'
)
############################################################################ ############################################################################


@@ -1,6 +1,6 @@
 from dataclasses import dataclass, field
 from ..core.primitives import Triple, Error
-from ..core.topic import topic
+from ..core.topic import queue
 from ..core.metadata import Metadata
 # Note: Document imports will be updated after knowledge schemas are converted
@@ -220,9 +220,5 @@ class LibrarianResponse:
 # FIXME: Is this right? Using persistence on librarian so that
 # message chunking works
-librarian_request_queue = topic(
-    'librarian', qos='q1', namespace='request'
-)
-librarian_response_queue = topic(
-    'librarian', qos='q1', namespace='response',
-)
+librarian_request_queue = queue('librarian-request', cls='flow')
+librarian_response_queue = queue('librarian-response', cls='flow')

View file

@@ -1,7 +1,6 @@
 from dataclasses import dataclass, field
-from ..core.topic import topic
 from ..core.primitives import Error
 ############################################################################

View file

@@ -1,7 +1,6 @@
 from dataclasses import dataclass
 from ..core.primitives import Error, Term, Triple
-from ..core.topic import topic
 from ..core.metadata import Metadata
 ############################################################################

View file

@@ -1,7 +1,6 @@
 from dataclasses import dataclass, field
 from ..core.primitives import Error
-from ..core.topic import topic
 ############################################################################

View file

@@ -1,7 +1,6 @@
 from dataclasses import dataclass, field
 from ..core.primitives import Error
-from ..core.topic import topic
 ############################################################################

View file

@@ -1,7 +1,7 @@
 from dataclasses import dataclass, field
 from ..core.primitives import Error, Term, Triple
-from ..core.topic import topic
+from ..core.topic import queue
 ############################################################################
@@ -69,12 +69,8 @@ class DocumentEmbeddingsResponse:
     error: Error | None = None
     chunks: list[ChunkMatch] = field(default_factory=list)
-document_embeddings_request_queue = topic(
-    "document-embeddings-request", qos='q0', tenant='trustgraph', namespace='flow'
-)
-document_embeddings_response_queue = topic(
-    "document-embeddings-response", qos='q0', tenant='trustgraph', namespace='flow'
-)
+document_embeddings_request_queue = queue('document-embeddings', cls='request')
+document_embeddings_response_queue = queue('document-embeddings', cls='response')
 ############################################################################
@@ -104,9 +100,5 @@ class RowEmbeddingsResponse:
     error: Error | None = None
     matches: list[RowIndexMatch] = field(default_factory=list)
-row_embeddings_request_queue = topic(
-    "row-embeddings-request", qos='q0', tenant='trustgraph', namespace='flow'
-)
-row_embeddings_response_queue = topic(
-    "row-embeddings-response", qos='q0', tenant='trustgraph', namespace='flow'
-)
+row_embeddings_request_queue = queue('row-embeddings', cls='request')
+row_embeddings_response_queue = queue('row-embeddings', cls='response')
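The hunks above all follow the same pattern: the old `topic()` call, which encoded QoS, tenant, and namespace, is replaced by a `queue()` call that only takes a name and a class. A minimal sketch of what such a helper might look like (the function body and `topicspace` default here are illustrative, not the actual implementation):

```python
# Hypothetical sketch of the queue() helper used in the hunks above.
# It builds names in the CLASS:TOPICSPACE:TOPIC format; the class
# (flow, request, response, state) implies persistence/TTL, so QoS
# no longer appears in the name itself.

def queue(name: str, cls: str, topicspace: str = "tg") -> str:
    """Return a queue name in CLASS:TOPICSPACE:TOPIC format."""
    if cls not in ("flow", "request", "response", "state"):
        raise ValueError(f"unknown queue class: {cls}")
    return f"{cls}:{topicspace}:{name}"

print(queue("document-embeddings", cls="request"))
# request:tg:document-embeddings
```

Under this scheme the backend's `map_topic()` parses the class prefix to decide persistence, rather than reading it out of a `non-persistent://` URI.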

View file

@@ -1,5 +1,4 @@
 from dataclasses import dataclass
-from ..core.topic import topic
 from ..core.primitives import Error, Term
 ############################################################################

View file

@@ -2,7 +2,6 @@ from dataclasses import dataclass, field
 from typing import Optional
 from ..core.primitives import Error
-from ..core.topic import topic
 ############################################################################

View file

@@ -1,7 +1,6 @@
 from dataclasses import dataclass, field
 from ..core.primitives import Error
-from ..core.topic import topic
 ############################################################################

View file

@@ -8,8 +8,6 @@ message flows, diagnosing stuck services, and understanding system behavior.
 Uses TrustGraph's Subscriber abstraction for future-proof pub/sub compatibility.
 """
-import pulsar
-from pulsar.schema import BytesSchema
 import sys
 import json
 import asyncio
@@ -17,7 +15,7 @@ from datetime import datetime
 import argparse
 from trustgraph.base.subscriber import Subscriber
-from trustgraph.base.pubsub import get_pubsub
+from trustgraph.base.pubsub import get_pubsub, add_pubsub_args
 def decode_json_strings(obj):
     """Recursively decode JSON-encoded string values within a dict/list."""
@@ -172,15 +170,13 @@ async def log_writer(central_queue, file_handle, shutdown_event, console_output=
         break
-async def async_main(queues, output_file, pulsar_host, listener_name, subscriber_name, append_mode):
+async def async_main(queues, output_file, subscriber_name, append_mode, **pubsub_config):
     """
     Main async function to monitor multiple queues concurrently.
     Args:
         queues: List of queue names to monitor
         output_file: Path to output file
-        pulsar_host: Pulsar connection URL
-        listener_name: Pulsar listener name
         subscriber_name: Base name for subscribers
         append_mode: Whether to append to existing file
     """
@@ -194,9 +190,9 @@ async def async_main(queues, output_file, subscriber_name, append_mode, **pubsub
     # Create backend connection
     try:
-        backend = get_pubsub(pulsar_host=pulsar_host, pulsar_listener=listener_name, pubsub_backend='pulsar')
+        backend = get_pubsub(**pubsub_config)
     except Exception as e:
-        print(f"Error connecting to backend at {pulsar_host}: {e}", file=sys.stderr)
+        print(f"Error connecting to backend: {e}", file=sys.stderr)
         sys.exit(1)
     # Create Subscribers and central queue
@@ -291,25 +287,20 @@ def main():
     description='Monitor and dump messages from multiple Pulsar queues',
     epilog="""
 Examples:
-  # Monitor agent and prompt queues
-  tg-dump-queues non-persistent://tg/request/agent:default \\
-      non-persistent://tg/request/prompt:default
+  # Monitor agent and prompt flow queues
+  tg-dump-queues flow:tg:agent-request:default \\
+      flow:tg:prompt-request:default
   # Monitor with custom output file
-  tg-dump-queues non-persistent://tg/request/agent:default \\
+  tg-dump-queues flow:tg:agent-request:default \\
       --output debug.log
   # Append to existing log file
-  tg-dump-queues non-persistent://tg/request/agent:default \\
+  tg-dump-queues flow:tg:agent-request:default \\
      --output queue.log --append
-Common queue patterns:
-  - Agent requests: non-persistent://tg/request/agent:default
-  - Agent responses: non-persistent://tg/response/agent:default
-  - Prompt requests: non-persistent://tg/request/prompt:default
-  - Prompt responses: non-persistent://tg/response/prompt:default
-  - LLM requests: non-persistent://tg/request/text-completion:default
-  - LLM responses: non-persistent://tg/response/text-completion:default
+  # Raw Pulsar URIs also accepted
+  tg-dump-queues persistent://tg/flow/agent-request:default
 IMPORTANT:
 This tool subscribes to queues without a schema (schema-less mode). To avoid
@@ -340,17 +331,7 @@ IMPORTANT:
         help='Append to output file instead of overwriting'
     )
-    parser.add_argument(
-        '--pulsar-host',
-        default='pulsar://localhost:6650',
-        help='Pulsar host URL (default: pulsar://localhost:6650)'
-    )
-    parser.add_argument(
-        '--listener-name',
-        default='localhost',
-        help='Pulsar listener name (default: localhost)'
-    )
+    add_pubsub_args(parser, standalone=True)
     parser.add_argument(
         '--subscriber',
@@ -371,10 +352,12 @@ IMPORTANT:
         asyncio.run(async_main(
             queues=queues,
             output_file=args.output,
-            pulsar_host=args.pulsar_host,
-            listener_name=args.listener_name,
             subscriber_name=args.subscriber,
-            append_mode=args.append
+            append_mode=args.append,
+            pubsub_backend=args.pubsub_backend,
+            pulsar_host=args.pulsar_host,
+            pulsar_api_key=args.pulsar_api_key,
+            pulsar_listener=args.pulsar_listener,
         ))
     except KeyboardInterrupt:
         # Already handled in async_main
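The diff above moves the per-tool `--pulsar-host`/`--listener-name` arguments into a shared `add_pubsub_args(parser, standalone=True)` call. A hedged sketch of what such a helper might register, based only on the flag names visible in this diff (`--pubsub-backend`, `--pulsar-host`, `--pulsar-api-key`, `--pulsar-listener`); the defaults and `standalone` behaviour shown here are assumptions, not the actual implementation:

```python
# Illustrative sketch of an add_pubsub_args() helper with a standalone
# flag: standalone CLI tools get localhost defaults, while services can
# leave the host unset and fail fast if unconfigured.
import argparse

def add_pubsub_args(parser: argparse.ArgumentParser, standalone: bool = False):
    parser.add_argument("--pubsub-backend", default="pulsar",
                        help="Pub/sub backend to use (default: pulsar)")
    parser.add_argument("--pulsar-host",
                        default="pulsar://localhost:6650" if standalone else None,
                        help="Pulsar host URL")
    parser.add_argument("--pulsar-api-key", default=None,
                        help="Pulsar API key, if the broker requires one")
    parser.add_argument("--pulsar-listener",
                        default="localhost" if standalone else None,
                        help="Pulsar listener name")

parser = argparse.ArgumentParser()
add_pubsub_args(parser, standalone=True)
args = parser.parse_args([])
```

The resulting namespace can then be forwarded wholesale to `get_pubsub(...)` as keyword arguments, which is how `async_main` above passes them through.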

View file

@@ -137,7 +137,7 @@ def init(
         }
     })
-    ensure_namespace(pulsar_admin_url, tenant, "config", {
+    ensure_namespace(pulsar_admin_url, tenant, "state", {
         "retention_policies": {
             "retentionSizeInMB": 10,
             "retentionTimeInMinutes": -1,

View file

@@ -1,7 +1,7 @@
 """
 Monitor prompt request/response queues and log activity with timing.
-Subscribes to prompt request and response Pulsar queues, correlates
+Subscribes to prompt request and response queues, correlates
 them by message ID, and logs a summary of each request/response with
 elapsed time. Streaming responses are accumulated and shown once at
 completion.
@@ -19,8 +19,7 @@ import argparse
 from datetime import datetime
 from collections import OrderedDict
-import pulsar
-from pulsar.schema import BytesSchema
+from trustgraph.base.pubsub import get_pubsub, add_pubsub_args
 default_flow = "default"
@@ -85,7 +84,7 @@ def format_terms(terms, max_lines, max_width):
 def parse_raw_message(msg):
-    """Parse a raw Pulsar message into (correlation_id, body_dict)."""
+    """Parse a raw message into (correlation_id, body_dict)."""
     try:
         props = msg.properties()
         corr_id = props.get("id", "")
@@ -94,53 +93,46 @@ def parse_raw_message(msg):
     try:
         value = msg.value()
-        if isinstance(value, bytes):
-            value = value.decode("utf-8")
-        body = json.loads(value) if isinstance(value, str) else {}
+        if isinstance(value, dict):
+            body = value
+        elif isinstance(value, bytes):
+            body = json.loads(value.decode("utf-8"))
+        elif isinstance(value, str):
+            body = json.loads(value)
+        else:
+            body = {}
     except Exception:
         body = {}
     return corr_id, body
-def receive_with_timeout(consumer, timeout_ms=500):
-    """Receive a message with timeout, returning None on timeout."""
-    try:
-        return consumer.receive(timeout_millis=timeout_ms)
-    except Exception:
-        return None
-async def monitor(flow, queue_type, max_lines, max_width,
-        pulsar_host, listener_name):
-    request_queue = f"non-persistent://tg/request/{queue_type}:{flow}"
-    response_queue = f"non-persistent://tg/response/{queue_type}:{flow}"
+async def monitor(flow, queue_type, max_lines, max_width, **config):
+    request_queue = f"request:tg:{queue_type}:{flow}"
+    response_queue = f"response:tg:{queue_type}:{flow}"
     print(f"Monitoring prompt queues:")
     print(f"  Request: {request_queue}")
     print(f"  Response: {response_queue}")
     print(f"Press Ctrl+C to stop\n")
-    client = pulsar.Client(
-        pulsar_host,
-        listener_name=listener_name,
-    )
-    req_consumer = client.subscribe(
-        request_queue,
-        subscription_name="prompt-monitor-req",
-        consumer_type=pulsar.ConsumerType.Shared,
-        schema=BytesSchema(),
-        initial_position=pulsar.InitialPosition.Latest,
-    )
-    resp_consumer = client.subscribe(
-        response_queue,
-        subscription_name="prompt-monitor-resp",
-        consumer_type=pulsar.ConsumerType.Shared,
-        schema=BytesSchema(),
-        initial_position=pulsar.InitialPosition.Latest,
-    )
+    backend = get_pubsub(**config)
+    req_consumer = backend.create_consumer(
+        topic=request_queue,
+        subscription="prompt-monitor-req",
+        schema=None,
+        consumer_type='shared',
+        initial_position='latest',
+    )
+    resp_consumer = backend.create_consumer(
+        topic=response_queue,
+        subscription="prompt-monitor-resp",
+        schema=None,
+        consumer_type='shared',
+        initial_position='latest',
+    )
     # Track in-flight requests: corr_id -> (timestamp, template_id)
@@ -156,8 +148,8 @@ async def monitor(flow, queue_type, max_lines, max_width,
         got_message = False
         # Poll request queue
-        msg = receive_with_timeout(req_consumer, 100)
-        if msg:
+        try:
+            msg = req_consumer.receive(timeout_millis=100)
             got_message = True
             timestamp = datetime.now()
             corr_id, body = parse_raw_message(msg)
@@ -182,10 +174,12 @@ async def monitor(flow, queue_type, max_lines, max_width,
                 print(format_terms(terms, max_lines, max_width))
             req_consumer.acknowledge(msg)
+        except TimeoutError:
+            pass
         # Poll response queue
-        msg = receive_with_timeout(resp_consumer, 100)
-        if msg:
+        try:
+            msg = resp_consumer.receive(timeout_millis=100)
             got_message = True
             timestamp = datetime.now()
             corr_id, body = parse_raw_message(msg)
@@ -265,6 +259,8 @@ async def monitor(flow, queue_type, max_lines, max_width,
                 print(f"  {truncated}")
             resp_consumer.acknowledge(msg)
+        except TimeoutError:
+            pass
         if not got_message:
             await asyncio.sleep(0.05)
@@ -274,7 +270,7 @@ async def monitor(flow, queue_type, max_lines, max_width,
     finally:
         req_consumer.close()
         resp_consumer.close()
-        client.close()
+        backend.close()
 def main():
@@ -310,17 +306,7 @@ def main():
         help=f"Max width per line (default: {default_max_width})",
     )
-    parser.add_argument(
-        "--pulsar-host",
-        default="pulsar://localhost:6650",
-        help="Pulsar host URL (default: pulsar://localhost:6650)",
-    )
-    parser.add_argument(
-        "--listener-name",
-        default="localhost",
-        help="Pulsar listener name (default: localhost)",
-    )
+    add_pubsub_args(parser, standalone=True)
     args = parser.parse_args()
@@ -331,7 +317,9 @@ def main():
         max_lines=args.max_lines,
         max_width=args.max_width,
         pulsar_host=args.pulsar_host,
-        listener_name=args.listener_name,
+        pulsar_api_key=args.pulsar_api_key,
+        pulsar_listener=args.pulsar_listener,
+        pubsub_backend=args.pubsub_backend,
     ))
     except KeyboardInterrupt:
         pass
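The polling rewrite above relies on the backend consumer raising the standard `TimeoutError` instead of Pulsar's `_pulsar.Timeout`, which is what lets the tool drop its `receive_with_timeout` wrapper. A self-contained sketch of that adapter pattern; the class names here are stand-ins, not the actual `PulsarBackendConsumer`:

```python
# Sketch of the receive() adaptation: catch the broker-specific timeout
# exception and re-raise the standard TimeoutError, so callers poll
# without importing pulsar. All names below are illustrative.

class FakePulsarTimeout(Exception):
    """Stand-in for the broker-specific _pulsar.Timeout exception."""

class BackendConsumer:
    def __init__(self, raw_consumer):
        self.raw = raw_consumer

    def receive(self, timeout_millis=None):
        try:
            return self.raw.receive(timeout_millis=timeout_millis)
        except FakePulsarTimeout:
            # Translate to the standard library exception
            raise TimeoutError("receive timed out")

class IdleRaw:
    """Fake raw consumer with nothing to deliver."""
    def receive(self, timeout_millis=None):
        raise FakePulsarTimeout()

consumer = BackendConsumer(IdleRaw())
try:
    consumer.receive(timeout_millis=100)
except TimeoutError:
    print("timeout")  # caller handles a broker-agnostic exception
```

This is why the monitor loop above can simply wrap each poll in `try: ... except TimeoutError: pass` with no Pulsar import in sight.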

View file

@@ -145,7 +145,7 @@ class Processor(FlowProcessor):
     try:
         # Convert Pulsar RowSchema to JSON-serializable dict
-        schema_dict = row_schema_translator.from_pulsar(schema)
+        schema_dict = row_schema_translator.encode(schema)
     # Use prompt client to extract rows based on schema
     objects = await flow("prompt-request").extract_objects(

View file

@@ -23,7 +23,6 @@ import uuid
 logger = logging.getLogger(__name__)
 import json
-import pulsar
 from prometheus_client import start_http_server
 from ... schema import ConfigPush, config_push_queue

View file

@@ -25,8 +25,8 @@ class AgentRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("agent")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -29,8 +29,8 @@ class CollectionManagementRequestor(ServiceRequestor):
 def to_request(self, body):
     print("REQUEST", body, flush=True)
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
     print("RESPONSE", message, flush=True)
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -30,8 +30,8 @@ class ConfigRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("config")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -44,7 +44,7 @@ class DocumentEmbeddingsImport:
 async def receive(self, msg):
     data = msg.json()
-    elt = self.translator.to_pulsar(data)
+    elt = self.translator.decode(data)
     await self.publisher.send(None, elt)
 async def run(self):

View file

@@ -25,7 +25,7 @@ class DocumentEmbeddingsQueryRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("document-embeddings-query")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -23,5 +23,5 @@ class DocumentLoad(ServiceSender):
 def to_request(self, body):
     logger.info("Document received")
-    return self.translator.to_pulsar(body)
+    return self.translator.decode(body)

View file

@@ -25,8 +25,8 @@ class DocumentRagRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("document-rag")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -25,8 +25,8 @@ class EmbeddingsRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("embeddings")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -30,8 +30,8 @@ class FlowRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("flow")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -25,8 +25,8 @@ class GraphEmbeddingsQueryRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("graph-embeddings-query")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -25,8 +25,8 @@ class GraphRagRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("graph-rag")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -33,8 +33,8 @@ class KnowledgeRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("knowledge")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -40,8 +40,8 @@ class LibrarianRequestor(ServiceRequestor):
     body = body.copy()
     body["content"] = content
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -25,8 +25,8 @@ class McpToolRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("tool")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -24,7 +24,7 @@ class NLPQueryRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("nlp-query")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -27,8 +27,8 @@ class PromptRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("prompt")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -25,7 +25,7 @@ class RowEmbeddingsQueryRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("row-embeddings-query")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -24,7 +24,7 @@ class RowsQueryRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("rows-query")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -11,22 +11,22 @@ _triple_translator = TripleTranslator()
 def to_value(x):
     """Convert dict to Term. Delegates to TermTranslator."""
-    return _term_translator.to_pulsar(x)
+    return _term_translator.decode(x)
 def to_subgraph(x):
     """Convert list of dicts to list of Triples. Delegates to TripleTranslator."""
-    return [_triple_translator.to_pulsar(t) for t in x]
+    return [_triple_translator.decode(t) for t in x]
 def serialize_value(v):
     """Convert Term to dict. Delegates to TermTranslator."""
-    return _term_translator.from_pulsar(v)
+    return _term_translator.encode(v)
 def serialize_triple(t):
     """Convert Triple to dict. Delegates to TripleTranslator."""
-    return _triple_translator.from_pulsar(t)
+    return _triple_translator.encode(t)
 def serialize_subgraph(sg):
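The rename running through all of these hunks follows one convention: `decode()` turns a wire-format dict into an internal message object (formerly `to_pulsar`), and `encode()` does the reverse (formerly `from_pulsar`). A minimal self-contained illustration of the roundtrip; the `Term` fields and dict keys here are invented for the example, not the real schema:

```python
# Minimal sketch of the decode/encode translator convention.
# decode(): wire dict -> internal object; encode(): internal object -> dict.
from dataclasses import dataclass

@dataclass
class Term:
    value: str
    is_uri: bool = False

class TermTranslator:
    def decode(self, d: dict) -> Term:    # was to_pulsar()
        return Term(value=d["v"], is_uri=d.get("e", False))

    def encode(self, t: Term) -> dict:    # was from_pulsar()
        return {"v": t.value, "e": t.is_uri}

tr = TermTranslator()
term = tr.decode({"v": "http://example.org/x", "e": True})
assert tr.encode(term) == {"v": "http://example.org/x", "e": True}
```

The point of the rename is exactly what this sketch shows: neither method name mentions the broker, so the same translator works unchanged against any backend.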

View file

@@ -24,7 +24,7 @@ class StructuredDiagRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("structured-diag")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -24,7 +24,7 @@ class StructuredQueryRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("structured-query")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -25,8 +25,8 @@ class TextCompletionRequestor(ServiceRequestor):
     self.response_translator = TranslatorRegistry.get_response_translator("text-completion")
 def to_request(self, body):
-    return self.request_translator.to_pulsar(body)
+    return self.request_translator.decode(body)
 def from_response(self, message):
-    return self.response_translator.from_response_with_completion(message)
+    return self.response_translator.encode_with_completion(message)

View file

@@ -23,5 +23,5 @@ class TextLoad(ServiceSender):
 def to_request(self, body):
     logger.info("Text document received")
-    return self.translator.to_pulsar(body)
+    return self.translator.decode(body)

Some files were not shown because too many files have changed in this diff.