trustgraph/docs/tech-specs/pubsub.md
cybermaggedon 34eb083836
Messaging fabric plugins (#592)
* Plugin architecture for messaging fabric

* Schemas use a technology neutral expression

* Schemas strictness has uncovered some incorrect schema use which is fixed
2025-12-17 21:40:43 +00:00

Pub/Sub Infrastructure

Overview

This document catalogs all connections between the TrustGraph codebase and the pub/sub infrastructure. Currently, the system is hardcoded to use Apache Pulsar. This analysis identifies all integration points to inform future refactoring toward a configurable pub/sub abstraction.

Current State: Pulsar Integration Points

1. Direct Pulsar Client Usage

Location: trustgraph-flow/trustgraph/gateway/service.py

The API gateway directly imports and instantiates the Pulsar client:

  • Line 20: import pulsar
  • Lines 54-61: Direct instantiation of pulsar.Client() with optional pulsar.AuthenticationToken()
  • Lines 33-35: Default Pulsar host configuration from environment variables
  • Lines 178-192: CLI arguments for --pulsar-host, --pulsar-api-key, and --pulsar-listener
  • Lines 78, 124: Passes pulsar_client to ConfigReceiver and DispatcherManager

This is the only location that directly instantiates a Pulsar client outside of the abstraction layer.

2. Base Processor Framework

Location: trustgraph-base/trustgraph/base/async_processor.py

The base class for all processors provides Pulsar connectivity:

  • Line 9: import _pulsar (for exception handling)
  • Line 18: from . pubsub import PulsarClient
  • Line 38: Creates pulsar_client_object = PulsarClient(**params)
  • Lines 104-108: Properties exposing pulsar_host and pulsar_client
  • Line 250: Static method add_args() calls PulsarClient.add_args(parser) for CLI arguments
  • Lines 223-225: Exception handling for _pulsar.Interrupted

All processors inherit from AsyncProcessor, making this the central integration point.

3. Consumer Abstraction

Location: trustgraph-base/trustgraph/base/consumer.py

Consumes messages from queues and invokes handler functions:

Pulsar imports:

  • Line 12: from pulsar.schema import JsonSchema
  • Line 13: import pulsar
  • Line 14: import _pulsar

Pulsar-specific usage:

  • Lines 100, 102: pulsar.InitialPosition.Earliest / pulsar.InitialPosition.Latest
  • Line 108: JsonSchema(self.schema) wrapper
  • Line 110: pulsar.ConsumerType.Shared
  • Lines 104-111: self.client.subscribe() with Pulsar-specific parameters
  • Lines 143, 150, 65: consumer.unsubscribe() and consumer.close() methods
  • Line 162: _pulsar.Timeout exception
  • Lines 182, 205, 232: consumer.acknowledge() / consumer.negative_acknowledge()

Spec file: trustgraph-base/trustgraph/base/consumer_spec.py

  • Line 22: References processor.pulsar_client

4. Producer Abstraction

Location: trustgraph-base/trustgraph/base/producer.py

Sends messages to queues:

Pulsar imports:

  • Line 2: from pulsar.schema import JsonSchema

Pulsar-specific usage:

  • Line 49: JsonSchema(self.schema) wrapper
  • Lines 47-51: self.client.create_producer() with Pulsar-specific parameters (topic, schema, chunking_enabled)
  • Lines 31, 76: producer.close() method
  • Lines 64-65: producer.send() with message and properties

Spec file: trustgraph-base/trustgraph/base/producer_spec.py

  • Line 18: References processor.pulsar_client

5. Publisher Abstraction

Location: trustgraph-base/trustgraph/base/publisher.py

Asynchronous message publishing with queue buffering:

Pulsar imports:

  • Line 2: from pulsar.schema import JsonSchema
  • Line 6: import pulsar

Pulsar-specific usage:

  • Line 52: JsonSchema(self.schema) wrapper
  • Lines 50-54: self.client.create_producer() with Pulsar-specific parameters
  • Lines 101, 103: producer.send() with message and optional properties
  • Lines 106-107: producer.flush() and producer.close() methods

6. Subscriber Abstraction

Location: trustgraph-base/trustgraph/base/subscriber.py

Provides multi-recipient message distribution from queues:

Pulsar imports:

  • Line 6: from pulsar.schema import JsonSchema
  • Line 8: import _pulsar

Pulsar-specific usage:

  • Line 55: JsonSchema(self.schema) wrapper
  • Line 57: self.client.subscribe(**subscribe_args)
  • Lines 101, 136, 160, 167-172: Pulsar exceptions: _pulsar.Timeout, _pulsar.InvalidConfiguration, _pulsar.AlreadyClosed
  • Lines 159, 166, 170: Consumer methods: negative_acknowledge(), unsubscribe(), close()
  • Lines 247, 251: Message acknowledgment: acknowledge(), negative_acknowledge()

Spec file: trustgraph-base/trustgraph/base/subscriber_spec.py

  • Line 19: References processor.pulsar_client

7. Schema System (Heart of Darkness)

Location: trustgraph-base/trustgraph/schema/

Every message schema in the system is defined using Pulsar's schema framework.

Core primitives: schema/core/primitives.py

  • Line 2: from pulsar.schema import Record, String, Boolean, Array, Integer
  • All schemas inherit from Pulsar's Record base class
  • All field types are Pulsar types: String(), Integer(), Boolean(), Array(), Map(), Double()

Example schemas:

  • schema/services/llm.py (Line 2): from pulsar.schema import Record, String, Array, Double, Integer, Boolean
  • schema/services/config.py (Line 2): from pulsar.schema import Record, Bytes, String, Boolean, Array, Map, Integer

Topic naming: schema/core/topic.py

  • Lines 2-3: Topic format: {kind}://{tenant}/{namespace}/{topic}
  • This URI structure is Pulsar-specific (e.g., persistent://tg/flow/config)
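
Based on that format, the helper presumably looks something like the following (a reconstruction for illustration only; the actual parameter names and defaults in topic.py may differ):

```python
# Hypothetical reconstruction of the helper in schema/core/topic.py,
# based on the documented {kind}://{tenant}/{namespace}/{topic} format.
def topic(name, kind="persistent", tenant="tg", namespace="flow"):
    return f"{kind}://{tenant}/{namespace}/{name}"
```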

Impact:

  • All request/response message definitions throughout the codebase use Pulsar schemas
  • This includes services for: config, flow, llm, prompt, query, storage, agent, collection, diagnosis, library, lookup, nlp_query, objects_query, retrieval, structured_query
  • Schema definitions are imported and used extensively across all processors and services

Summary

Pulsar Dependencies by Category

  1. Client instantiation:

    • Direct: gateway/service.py
    • Abstracted: async_processor.py → pubsub.py (PulsarClient)
  2. Message transport:

    • Consumer: consumer.py, consumer_spec.py
    • Producer: producer.py, producer_spec.py
    • Publisher: publisher.py
    • Subscriber: subscriber.py, subscriber_spec.py
  3. Schema system:

    • Base types: schema/core/primitives.py
    • All service schemas: schema/services/*.py
    • Topic naming: schema/core/topic.py
  4. Pulsar-specific concepts required:

    • Topic-based messaging
    • Schema system (Record, field types)
    • Shared subscriptions
    • Message acknowledgment (positive/negative)
    • Consumer positioning (earliest/latest)
    • Message properties
    • Initial positions and consumer types
    • Chunking support
    • Persistent vs non-persistent topics

Refactoring Challenges

The good news: The abstraction layer (Consumer, Producer, Publisher, Subscriber) provides a clean encapsulation of most Pulsar interactions.

The challenges:

  1. Schema system pervasiveness: Every message definition uses pulsar.schema.Record and Pulsar field types
  2. Pulsar-specific enums: InitialPosition, ConsumerType
  3. Pulsar exceptions: _pulsar.Timeout, _pulsar.Interrupted, _pulsar.InvalidConfiguration, _pulsar.AlreadyClosed
  4. Method signatures: acknowledge(), negative_acknowledge(), subscribe(), create_producer(), etc.
  5. Topic URI format: Pulsar's kind://tenant/namespace/topic structure

Next Steps

To make the pub/sub infrastructure configurable, we need to:

  1. Create an abstraction interface for the client/schema system
  2. Abstract Pulsar-specific enums and exceptions
  3. Create schema wrappers or alternative schema definitions
  4. Implement the interface for both Pulsar and alternative systems (Kafka, RabbitMQ, Redis Streams, etc.)
  5. Update pubsub.py to be configurable and support multiple backends
  6. Provide migration path for existing deployments

Approach Draft 1: Adapter Pattern with Schema Translation Layer

Key Insight

The schema system is the deepest integration point - everything else flows from it. We need to solve this first, or we'll be rewriting the entire codebase.

Strategy: Minimal Disruption with Adapters

1. Keep Pulsar schemas as the internal representation

  • Don't rewrite all the schema definitions
  • Schemas remain pulsar.schema.Record internally
  • Use adapters to translate at the boundary between our code and the pub/sub backend

2. Create a pub/sub abstraction layer:

┌─────────────────────────────────────┐
│   Existing Code (unchanged)         │
│   - Uses Pulsar schemas internally  │
│   - Consumer/Producer/Publisher     │
└──────────────┬──────────────────────┘
               │
┌──────────────┴──────────────────────┐
│   PubSubFactory (configurable)      │
│   - Creates backend-specific client │
└──────────────┬──────────────────────┘
               │
        ┌──────┴──────┐
        │             │
┌───────▼──────┐  ┌───▼──────────┐
│ PulsarAdapter│  │ KafkaAdapter │  etc...
│ (passthrough)│  │ (translates) │
└──────────────┘  └──────────────┘

3. Define abstract interfaces:

  • PubSubClient - client connection
  • PubSubProducer - sending messages
  • PubSubConsumer - receiving messages
  • SchemaAdapter - translating Pulsar schemas to/from JSON or backend-specific formats

4. Implementation details:

For Pulsar adapter: Nearly passthrough, minimal translation

For other backends (Kafka, RabbitMQ, etc.):

  • Serialize Pulsar Record objects to JSON/bytes
  • Map concepts like:
    • InitialPosition.Earliest/Latest → Kafka's auto.offset.reset
    • acknowledge() → Kafka's commit
    • negative_acknowledge() → Re-queue or DLQ pattern
    • Topic URIs → Backend-specific topic names
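
As a concrete illustration of that mapping, a hypothetical KafkaAdapter might derive its consumer configuration from the Pulsar-style options. This is plain dict translation only; no Kafka client library is involved, and the helper name is an assumption:

```python
# Sketch: translate Pulsar-style consumer options into Kafka
# consumer configuration keys (hypothetical KafkaAdapter internals).
def kafka_consumer_config(subscription: str,
                          initial_position: str = "latest") -> dict:
    return {
        # Pulsar subscription name ~ Kafka consumer group
        "group.id": subscription,
        # InitialPosition.Earliest/Latest ~ auto.offset.reset
        "auto.offset.reset": {
            "earliest": "earliest",
            "latest": "latest",
        }[initial_position],
        # acknowledge() maps to an explicit commit, so disable auto-commit
        "enable.auto.commit": False,
    }
```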

Analysis

Pros:

  • Minimal code changes to existing services
  • Schemas stay as-is (no massive rewrite)
  • Gradual migration path
  • Pulsar users see no difference
  • New backends added via adapters

Cons:

  • ⚠️ Still carries Pulsar dependency (for schema definitions)
  • ⚠️ Some impedance mismatch translating concepts

Alternative Consideration

Create a TrustGraph schema system that's pub/sub agnostic (using dataclasses or Pydantic), then generate Pulsar/Kafka/etc schemas from it. This requires rewriting every schema file and potentially breaking changes.

Recommendation for Draft 1

Start with the adapter approach because:

  1. It's pragmatic - works with existing code
  2. Proves the concept with minimal risk
  3. Can evolve to a native schema system later if needed
  4. Configuration-driven: one env var switches backends

Approach Draft 2: Backend-Agnostic Schema System with Dataclasses

Core Concept

Use Python dataclasses as the neutral schema definition format. Each pub/sub backend provides its own serialization/deserialization for dataclasses, eliminating the need for Pulsar schemas to remain in the codebase.

Schema Polymorphism at the Factory Level

Instead of translating Pulsar schemas, each backend provides its own schema handling that works with standard Python dataclasses.

Publisher Flow

# 1. Get the configured backend from factory
pubsub = get_pubsub()  # Returns PulsarBackend, MQTTBackend, etc.

# 2. Get schema class from the backend
# (Can be imported directly - backend-agnostic)
from trustgraph.schema.services.llm import TextCompletionRequest

# 3. Create a producer/publisher for a specific topic
producer = pubsub.create_producer(
    topic="text-completion-requests",
    schema=TextCompletionRequest  # Tells backend what schema to use
)

# 4. Create message instances (same API regardless of backend)
request = TextCompletionRequest(
    system="You are helpful",
    prompt="Hello world",
    streaming=False
)

# 5. Send the message
producer.send(request)  # Backend serializes appropriately

Consumer Flow

# 1. Get the configured backend
pubsub = get_pubsub()

# 2. Create a consumer
consumer = pubsub.subscribe(
    topic="text-completion-requests",
    schema=TextCompletionRequest  # Tells backend how to deserialize
)

# 3. Receive and deserialize
msg = consumer.receive()
request = msg.value()  # Returns TextCompletionRequest dataclass instance

# 4. Use the data (type-safe access)
print(request.system)   # "You are helpful"
print(request.prompt)   # "Hello world"
print(request.streaming)  # False

What Happens Behind the Scenes

For Pulsar backend:

  • create_producer() → creates Pulsar producer with JSON schema or dynamically generated Record
  • send(request) → serializes dataclass to JSON/Pulsar format, sends to Pulsar
  • receive() → gets Pulsar message, deserializes back to dataclass

For MQTT backend:

  • create_producer() → connects to MQTT broker, no schema registration needed
  • send(request) → converts dataclass to JSON, publishes to MQTT topic
  • receive() → subscribes to MQTT topic, deserializes JSON to dataclass

For Kafka backend:

  • create_producer() → creates Kafka producer, registers Avro schema if needed
  • send(request) → serializes dataclass to Avro format, sends to Kafka
  • receive() → gets Kafka message, deserializes Avro back to dataclass

Key Design Points

  1. Schema object creation: The dataclass instance (TextCompletionRequest(...)) is identical regardless of backend
  2. Backend handles encoding: Each backend knows how to serialize its dataclass to the wire format
  3. Schema definition at creation: When creating producer/consumer, you specify the schema type
  4. Type safety preserved: You get back a proper TextCompletionRequest object, not a dict
  5. No backend leakage: Application code never imports backend-specific libraries

Example Transformation

Current (Pulsar-specific):

# schema/services/llm.py
from pulsar.schema import Record, String, Boolean, Integer

class TextCompletionRequest(Record):
    system = String()
    prompt = String()
    streaming = Boolean()

New (Backend-agnostic):

# schema/services/llm.py
from dataclasses import dataclass

@dataclass
class TextCompletionRequest:
    system: str
    prompt: str
    streaming: bool = False

Backend Integration

Each backend handles serialization/deserialization of dataclasses:

Pulsar backend:

  • Dynamically generate pulsar.schema.Record classes from dataclasses
  • Or serialize dataclasses to JSON and use Pulsar's JSON schema
  • Maintains compatibility with existing Pulsar deployments

MQTT/Redis backend:

  • Direct JSON serialization of dataclass instances
  • Use dataclasses.asdict() for serialization and a small from_dict() helper (not in the stdlib) for deserialization
  • Lightweight, no schema registry needed

Kafka backend:

  • Generate Avro schemas from dataclass definitions
  • Use Confluent's schema registry
  • Type-safe serialization with schema evolution support
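
For illustration, the dataclass-to-Avro mapping for flat records could look like this. It is a sketch under stated assumptions: avro_schema_for is a hypothetical helper, and a real backend would also need nested records, optional fields, and registry integration:

```python
import dataclasses
import typing

# Primitive type mapping only; nested and optional types need more cases.
AVRO_PRIMITIVES = {str: "string", int: "long", bool: "boolean",
                   float: "double", bytes: "bytes"}

def avro_schema_for(dc: type) -> dict:
    """Derive a flat Avro record schema (as a dict) from a dataclass."""
    hints = typing.get_type_hints(dc)
    return {
        "type": "record",
        "name": dc.__name__,
        "fields": [
            {"name": f.name, "type": AVRO_PRIMITIVES[hints[f.name]]}
            for f in dataclasses.fields(dc)
        ],
    }
```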

Architecture

┌─────────────────────────────────────┐
│   Application Code                  │
│   - Uses dataclass schemas          │
│   - Backend-agnostic                │
└──────────────┬──────────────────────┘
               │
┌──────────────┴──────────────────────┐
│   PubSubFactory (configurable)      │
│   - get_pubsub() returns backend    │
└──────────────┬──────────────────────┘
               │
        ┌──────┴──────┐
        │             │
┌───────▼─────────┐  ┌────▼──────────────┐
│ PulsarBackend   │  │ MQTTBackend       │
│ - JSON schema   │  │ - JSON serialize  │
│ - or dynamic    │  │ - Simple queues   │
│   Record gen    │  │                   │
└─────────────────┘  └───────────────────┘

Implementation Details

1. Schema definitions: Plain dataclasses with type hints

  • str, int, bool, float for primitives
  • list[T] for arrays
  • dict[str, T] for maps
  • Nested dataclasses for complex types

2. Each backend provides:

  • Serializer: dataclass → bytes/wire format
  • Deserializer: bytes/wire format → dataclass
  • Schema registration (if needed, like Pulsar/Kafka)

3. Consumer/Producer abstraction:

  • Already exists (consumer.py, producer.py)
  • Update to use backend's serialization
  • Remove direct Pulsar imports

4. Type mappings:

  • Pulsar String() → Python str
  • Pulsar Integer() → Python int
  • Pulsar Boolean() → Python bool
  • Pulsar Array(T) → Python list[T]
  • Pulsar Map(K, V) → Python dict[K, V]
  • Pulsar Double() → Python float
  • Pulsar Bytes() → Python bytes
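
Given these type mappings, deserialization back into nested dataclasses can be driven entirely by the type hints. A minimal sketch (stdlib only, no validation; from_dict is a hypothetical helper, not part of the codebase):

```python
import dataclasses
import typing

def from_dict(cls, data):
    """Recursively build a value of type `cls` from plain JSON-style
    data, following the type hints: nested dataclasses, list[T] and
    dict[K, V] are handled; primitives pass through unchanged."""
    if dataclasses.is_dataclass(cls):
        hints = typing.get_type_hints(cls)
        return cls(**{
            f.name: from_dict(hints[f.name], data[f.name])
            for f in dataclasses.fields(cls)
            if f.name in data  # missing fields fall back to defaults
        })
    origin = typing.get_origin(cls)
    if origin is list:
        (item_type,) = typing.get_args(cls)
        return [from_dict(item_type, v) for v in data]
    if origin is dict:
        _, value_type = typing.get_args(cls)
        return {k: from_dict(value_type, v) for k, v in data.items()}
    return data  # str, int, bool, float, bytes pass through
```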

Migration Path

  1. Create dataclass versions of all schemas in trustgraph/schema/
  2. Update backend classes (Consumer, Producer, Publisher, Subscriber) to use backend-provided serialization
  3. Implement PulsarBackend with JSON schema or dynamic Record generation
  4. Test with Pulsar to ensure backward compatibility with existing deployments
  5. Add new backends (MQTT, Kafka, Redis, etc.) as needed
  6. Remove Pulsar imports from schema files

Benefits

  • No pub/sub dependency in schema definitions
  • Standard Python: easy to understand, type-check, and document
  • Modern tooling: works with mypy, IDE autocomplete, linters
  • Backend-optimized: each backend uses its native serialization
  • No translation overhead: direct serialization, no adapters
  • Type safety: real objects with proper types
  • Easy validation: can layer on Pydantic if needed

Challenges & Solutions

Challenge: Pulsar's Record has runtime field validation
Solution: Use Pydantic dataclasses for validation if needed, or standard dataclass __post_init__ hooks

Challenge: Some Pulsar-specific features (like the Bytes type)
Solution: Map to the bytes type in the dataclass; the backend handles encoding appropriately

Challenge: Topic naming (persistent://tenant/namespace/topic)
Solution: Abstract topic names in schema definitions; the backend converts them to its native format

Challenge: Schema evolution and versioning
Solution: Each backend handles this according to its capabilities (Pulsar schema versions, Kafka schema registry, etc.)

Challenge: Nested complex types
Solution: Use nested dataclasses; backends recursively serialize/deserialize

Design Decisions

  1. Plain dataclasses or Pydantic?

    • Decision: Use plain Python dataclasses
    • Simpler, no additional dependencies
    • Validation not required in practice
    • Easier to understand and maintain
  2. Schema evolution:

    • Decision: No versioning mechanism needed
    • Schemas are stable and long-lasting
    • Updates typically add new fields (backward compatible)
    • Backends handle schema evolution according to their capabilities
  3. Backward compatibility:

    • Decision: Major version change, no backward compatibility required
    • Will be a breaking change with migration instructions
    • Clean break allows for better design
    • Migration guide will be provided for existing deployments
  4. Nested types and complex structures:

    • Decision: Use nested dataclasses naturally
    • Python dataclasses handle nesting perfectly
    • list[T] for arrays, dict[K, V] for maps
    • Backends recursively serialize/deserialize
    • Example:
      @dataclass
      class Value:
          value: str
          is_uri: bool
      
      @dataclass
      class Triple:
          s: Value              # Nested dataclass
          p: Value
          o: Value
      
      @dataclass
      class GraphQuery:
          triples: list[Triple]  # Array of nested dataclasses
          metadata: dict[str, str]
      
  5. Default values and optional fields:

    • Decision: Mix of required, defaults, and optional fields
    • Required fields: No default value
    • Fields with defaults: Always present, have sensible default
    • Truly optional fields: T | None = None, omitted from serialization when None
    • Example:
      @dataclass
      class TextCompletionRequest:
          system: str              # Required, no default
          prompt: str              # Required, no default
          streaming: bool = False  # Optional with default value
          metadata: dict | None = None  # Truly optional, can be absent
      

    Important serialization semantics:

    When metadata = None:

    {
        "system": "...",
        "prompt": "...",
        "streaming": false
        // metadata field NOT PRESENT
    }
    

    When metadata = {} (explicitly empty):

    {
        "system": "...",
        "prompt": "...",
        "streaming": false,
        "metadata": {}  // Field PRESENT but empty
    }
    

    Key distinction:

    • None → field absent from JSON (not serialized)
    • Empty value ({}, [], "") → field present with empty value
    • This matters semantically: "not provided" vs "explicitly empty"
    • Serialization backends must skip None fields, not encode as null

Approach Draft 3: Implementation Details

Generic Queue Naming Format

Replace backend-specific queue names with a generic format that backends can map appropriately.

Format: {qos}/{tenant}/{namespace}/{queue-name}

Where:

  • qos: Quality of Service level
    • q0 = best-effort (fire and forget, no acknowledgment)
    • q1 = at-least-once (requires acknowledgment)
    • q2 = exactly-once (two-phase acknowledgment)
  • tenant: Logical grouping for multi-tenancy
  • namespace: Sub-grouping within tenant
  • queue-name: Actual queue/topic name

Examples:

q1/tg/flow/text-completion-requests
q2/tg/config/config-push
q0/tg/metrics/stats

Backend Topic Mapping

Each backend maps the generic format to its native format:

Pulsar Backend:

def map_topic(self, generic_topic: str) -> str:
    # Parse: q1/tg/flow/text-completion-requests
    qos, tenant, namespace, queue = generic_topic.split('/', 3)

    # Map QoS to persistence
    persistence = 'persistent' if qos in ['q1', 'q2'] else 'non-persistent'

    # Return Pulsar URI: persistent://tg/flow/text-completion-requests
    return f"{persistence}://{tenant}/{namespace}/{queue}"

MQTT Backend:

def map_topic(self, generic_topic: str) -> tuple[str, int]:
    # Parse: q1/tg/flow/text-completion-requests
    qos, tenant, namespace, queue = generic_topic.split('/', 3)

    # Map QoS level
    qos_level = {'q0': 0, 'q1': 1, 'q2': 2}[qos]

    # Build MQTT topic including tenant/namespace for proper namespacing
    mqtt_topic = f"{tenant}/{namespace}/{queue}"

    return mqtt_topic, qos_level

Updated Topic Helper Function

# schema/core/topic.py
def topic(queue_name, qos='q1', tenant='tg', namespace='flow'):
    """
    Create a generic topic identifier that can be mapped by backends.

    Args:
        queue_name: The queue/topic name
        qos: Quality of service
             - 'q0' = best-effort (no ack)
             - 'q1' = at-least-once (ack required)
             - 'q2' = exactly-once (two-phase ack)
        tenant: Tenant identifier for multi-tenancy
        namespace: Namespace within tenant

    Returns:
        Generic topic string: qos/tenant/namespace/queue_name

    Examples:
        topic('my-queue')  # q1/tg/flow/my-queue
        topic('config', qos='q2', namespace='config')  # q2/tg/config/config
    """
    return f"{qos}/{tenant}/{namespace}/{queue_name}"

Configuration and Initialization

Command-Line Arguments + Environment Variables:

# In base/async_processor.py - add_args() method
@staticmethod
def add_args(parser):
    # Pub/sub backend selection
    parser.add_argument(
        '--pubsub-backend',
        default=os.getenv('PUBSUB_BACKEND', 'pulsar'),
        choices=['pulsar', 'mqtt'],
        help='Pub/sub backend (default: pulsar, env: PUBSUB_BACKEND)'
    )

    # Pulsar-specific configuration
    parser.add_argument(
        '--pulsar-host',
        default=os.getenv('PULSAR_HOST', 'pulsar://localhost:6650'),
        help='Pulsar host (default: pulsar://localhost:6650, env: PULSAR_HOST)'
    )

    parser.add_argument(
        '--pulsar-api-key',
        default=os.getenv('PULSAR_API_KEY', None),
        help='Pulsar API key (env: PULSAR_API_KEY)'
    )

    parser.add_argument(
        '--pulsar-listener',
        default=os.getenv('PULSAR_LISTENER', None),
        help='Pulsar listener name (env: PULSAR_LISTENER)'
    )

    # MQTT-specific configuration
    parser.add_argument(
        '--mqtt-host',
        default=os.getenv('MQTT_HOST', 'localhost'),
        help='MQTT broker host (default: localhost, env: MQTT_HOST)'
    )

    parser.add_argument(
        '--mqtt-port',
        type=int,
        default=int(os.getenv('MQTT_PORT', '1883')),
        help='MQTT broker port (default: 1883, env: MQTT_PORT)'
    )

    parser.add_argument(
        '--mqtt-username',
        default=os.getenv('MQTT_USERNAME', None),
        help='MQTT username (env: MQTT_USERNAME)'
    )

    parser.add_argument(
        '--mqtt-password',
        default=os.getenv('MQTT_PASSWORD', None),
        help='MQTT password (env: MQTT_PASSWORD)'
    )

Factory Function:

# In base/pubsub.py or base/pubsub_factory.py
def get_pubsub(**config) -> PubSubBackend:
    """
    Create and return a pub/sub backend based on configuration.

    Args:
        config: Configuration dict from command-line args
                Must include 'pubsub_backend' key

    Returns:
        Backend instance (PulsarBackend, MQTTBackend, etc.)
    """
    backend_type = config.get('pubsub_backend', 'pulsar')

    if backend_type == 'pulsar':
        return PulsarBackend(
            host=config.get('pulsar_host'),
            api_key=config.get('pulsar_api_key'),
            listener=config.get('pulsar_listener'),
        )
    elif backend_type == 'mqtt':
        return MQTTBackend(
            host=config.get('mqtt_host'),
            port=config.get('mqtt_port'),
            username=config.get('mqtt_username'),
            password=config.get('mqtt_password'),
        )
    else:
        raise ValueError(f"Unknown pub/sub backend: {backend_type}")

Usage in AsyncProcessor:

# In async_processor.py
class AsyncProcessor:
    def __init__(self, **params):
        self.id = params.get("id")

        # Create backend from config (replaces PulsarClient)
        self.pubsub = get_pubsub(**params)

        # Rest of initialization...

Backend Interface

class PubSubBackend(Protocol):
    """Protocol defining the interface all pub/sub backends must implement."""

    def create_producer(self, topic: str, schema: type, **options) -> BackendProducer:
        """
        Create a producer for a topic.

        Args:
            topic: Generic topic format (qos/tenant/namespace/queue)
            schema: Dataclass type for messages
            options: Backend-specific options (e.g., chunking_enabled)

        Returns:
            Backend-specific producer instance
        """
        ...

    def create_consumer(
        self,
        topic: str,
        subscription: str,
        schema: type,
        initial_position: str = 'latest',
        consumer_type: str = 'shared',
        **options
    ) -> BackendConsumer:
        """
        Create a consumer for a topic.

        Args:
            topic: Generic topic format (qos/tenant/namespace/queue)
            subscription: Subscription/consumer group name
            schema: Dataclass type for messages
            initial_position: 'earliest' or 'latest' (MQTT may ignore)
            consumer_type: 'shared', 'exclusive', 'failover' (MQTT may ignore)
            options: Backend-specific options

        Returns:
            Backend-specific consumer instance
        """
        ...

    def close(self) -> None:
        """Close the backend connection."""
        ...

class BackendProducer(Protocol):
    """Protocol for backend-specific producer."""

    def send(self, message: Any, properties: dict | None = None) -> None:
        """Send a message (dataclass instance) with optional properties."""
        ...

    def flush(self) -> None:
        """Flush any buffered messages."""
        ...

    def close(self) -> None:
        """Close the producer."""
        ...

class BackendConsumer(Protocol):
    """Protocol for backend-specific consumer."""

    def receive(self, timeout_millis: int = 2000) -> Message:
        """
        Receive a message from the topic.

        Raises:
            TimeoutError: If no message received within timeout
        """
        ...

    def acknowledge(self, message: Message) -> None:
        """Acknowledge successful processing of a message."""
        ...

    def negative_acknowledge(self, message: Message) -> None:
        """Negative acknowledge - triggers redelivery."""
        ...

    def unsubscribe(self) -> None:
        """Unsubscribe from the topic."""
        ...

    def close(self) -> None:
        """Close the consumer."""
        ...

class Message(Protocol):
    """Protocol for a received message."""

    def value(self) -> Any:
        """Get the deserialized message (dataclass instance)."""
        ...

    def properties(self) -> dict:
        """Get message properties/metadata."""
        ...

Existing Classes Refactoring

The existing Consumer, Producer, Publisher, Subscriber classes remain largely intact:

Current responsibilities (keep):

  • Async threading model and taskgroups
  • Reconnection logic and retry handling
  • Metrics collection
  • Rate limiting
  • Concurrency management

Changes needed:

  • Remove direct Pulsar imports (pulsar.schema, pulsar.InitialPosition, etc.)
  • Accept BackendProducer/BackendConsumer instead of Pulsar client
  • Delegate actual pub/sub operations to backend instances
  • Map generic concepts to backend calls

Example refactoring:

# OLD - consumer.py
class Consumer:
    def __init__(self, client, topic, subscriber, schema, ...):
        self.client = client  # Direct Pulsar client
        # ...

    async def consumer_run(self):
        # Uses pulsar.InitialPosition, pulsar.ConsumerType
        self.consumer = self.client.subscribe(
            topic=self.topic,
            schema=JsonSchema(self.schema),
            initial_position=pulsar.InitialPosition.Earliest,
            consumer_type=pulsar.ConsumerType.Shared,
        )

# NEW - consumer.py
class Consumer:
    def __init__(self, backend_consumer, schema, ...):
        self.backend_consumer = backend_consumer  # Backend-specific consumer
        self.schema = schema
        # ...

    async def consumer_run(self):
        # Backend consumer already created with right settings
        # Just use it directly
        while self.running:
            msg = await asyncio.to_thread(
                self.backend_consumer.receive,
                timeout_millis=2000
            )
            await self.handle_message(msg)

Backend-Specific Behaviors

Pulsar Backend:

  • Maps q0 → non-persistent://, q1/q2 → persistent://
  • Supports all consumer types (shared, exclusive, failover)
  • Supports initial position (earliest/latest)
  • Native message acknowledgment
  • Schema registry support

MQTT Backend:

  • Maps q0/q1/q2 → MQTT QoS levels 0/1/2
  • Includes tenant/namespace in topic path for namespacing
  • Auto-generates client IDs from subscription names
  • Ignores initial position (no message history in basic MQTT)
  • Ignores consumer type (MQTT uses client IDs, not consumer groups)
  • Simple publish/subscribe model

Design Decisions Summary

  1. Generic queue naming: qos/tenant/namespace/queue-name format
  2. QoS in queue ID: Determined by queue definition, not configuration
  3. Reconnection: Handled by Consumer/Producer classes, not backends
  4. MQTT topics: Include tenant/namespace for proper namespacing
  5. Message history: MQTT ignores initial_position parameter (future enhancement)
  6. Client IDs: MQTT backend auto-generates from subscription name

Future Enhancements

MQTT message history:

  • Could add optional persistence layer (e.g., retained messages, external store)
  • Would allow supporting initial_position='earliest'
  • Not required for initial implementation