trustgraph/docs/tech-specs/active-flow-key-restructure.md

Flow service lifecycle management (#822)

feat: separate flow service from config service with explicit queue lifecycle management

The flow service is now an independent service that owns the lifecycle of flow and blueprint queues. System services own their own queues. Consumers never create queues.

Flow service separation:
- New service at trustgraph-flow/trustgraph/flow/service/
- Uses async ConfigClient (RequestResponse pattern) to talk to config service
- Config service stripped of all flow handling

Queue lifecycle management:
- PubSubBackend protocol gains create_queue, delete_queue, queue_exists, ensure_queue (all async)
- RabbitMQ: implements via pika with asyncio.to_thread internally
- Pulsar: stubs for future admin REST API implementation
- Consumer _connect() no longer creates queues (passive=True for named queues)
- System services call ensure_queue on startup
- Flow service creates queues on flow start, deletes on flow stop
- Flow service ensures queues for pre-existing flows on startup

Two-phase flow stop:
- Phase 1: set flow status to "stopping", delete processor config entries
- Phase 2: retry queue deletion, then delete flow record

Config restructure:
- active-flow config replaced with processor:{name} types
- Each processor has its own config type, each flow variant is a key
- Flow start/stop use batch put/delete, giving a single config push per operation
- FlowProcessor subscribes to its own type only

Blueprint format:
- Processor entries split into topics and parameters dicts
- Flow interfaces use {"flow": "topic"} instead of bare strings
- Specs (ConsumerSpec, ProducerSpec, etc.) read from definition["topics"]

Tests updated
2026-04-16 17:19:39 +01:00
---
layout: default
title: "Active-Flow Key Restructure"
parent: "Tech Specs"
---
# Active-Flow Key Restructure
## Problem
Active-flow config uses `('active-flow', processor)` as its key, where
each processor's value is a JSON blob containing all flow variants
assigned to that processor:
```
('active-flow', 'chunker') -> { "default": {...}, "flow2": {...} }
```
This causes two problems:
1. **Read-modify-write on every change.** Starting or stopping a flow
requires fetching the processor's current blob, parsing it, adding
or removing a variant, serialising it, and writing it back. The
cycle is not atomic, so if two flow operations target the same
processor simultaneously, one write can overwrite the other's change.
2. **Noisy config pushes.** Config subscribers subscribe to a type,
not a specific key. Every active-flow write triggers a config push
that causes every processor in the system to fetch the full config
and re-evaluate, even though only one processor's config changed.
With N processors in a blueprint, a single flow start/stop causes
N writes, and each write prompts all N processors to fetch, giving
N^2 config fetches across the system.
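
The read-modify-write hazard in the first point can be illustrated with a small sketch. It does not use the real config service: `store`, `read_blob`, and `write_blob` are hypothetical stand-ins for an in-memory store, and the interleaving of the two operations is written out by hand to keep the outcome deterministic.

```python
import json

# Hypothetical in-memory stand-in for the config store:
# (type, key) -> JSON string. Not the real config service API.
store = {("active-flow", "chunker"): json.dumps({"default": {}})}


def read_blob(processor):
    """Fetch and parse the per-processor blob (empty if absent)."""
    return json.loads(store.get(("active-flow", processor), "{}"))


def write_blob(processor, blob):
    """Serialise and write back the whole per-processor blob."""
    store[("active-flow", processor)] = json.dumps(blob)


# Two flow operations read the same starting blob...
blob_a = read_blob("chunker")
blob_b = read_blob("chunker")

# ...each adds its own variant to its private copy...
blob_a["flow2"] = {}
blob_b["flow3"] = {}

# ...and each writes the whole blob back. The second write
# replaces the first, so "flow2" is silently lost.
write_blob("chunker", blob_a)
write_blob("chunker", blob_b)

final = read_blob("chunker")
print(sorted(final))  # ['default', 'flow3']
```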
## Proposed Change
Restructure the key to `('active-flow', 'processor:variant')` where
each key holds a single flow variant's configuration:
```
('active-flow', 'chunker:default') -> { "topics": {...}, "parameters": {...} }
('active-flow', 'chunker:flow2') -> { "topics": {...}, "parameters": {...} }
```
Starting a flow is a set of clean puts. Stopping a flow is a set of
clean deletes. No read-modify-write. No JSON blob merging.
The config push problem (all processors fetching on every change)
remains. That is a limitation of the config subscription model, and
solving it would require per-key subscriptions. Eliminating the
read-modify-write nonetheless removes the concurrency hazard and
simplifies the flow service code.
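
As a rough illustration of "clean puts" and "clean deletes", the sketch below models the proposed key layout against a plain dictionary. The names `store`, `flow_key`, `start_flow`, and `stop_flow` are hypothetical and are not the flow service's actual functions; only the `processor:variant` key format and the `topics`/`parameters` value shape come from this spec.

```python
# Hypothetical in-memory config store keyed by (type, key).
store = {}


def flow_key(processor, variant):
    """Build the proposed ('active-flow', 'processor:variant') key."""
    return ("active-flow", f"{processor}:{variant}")


def start_flow(variant, processors):
    """Write one key per processor; no existing value is read or merged.

    processors maps processor name -> {"topics": ..., "parameters": ...}.
    """
    for processor, definition in processors.items():
        store[flow_key(processor, variant)] = definition


def stop_flow(variant, processors):
    """Delete one key per processor; other variants are untouched."""
    for processor in processors:
        store.pop(flow_key(processor, variant), None)


blueprint = {
    "chunker": {"topics": {}, "parameters": {}},
    "embedder": {"topics": {}, "parameters": {}},
}

start_flow("default", blueprint)
start_flow("flow2", blueprint)
stop_flow("default", blueprint)

remaining = sorted(key for (_, key) in store)
print(remaining)  # ['chunker:flow2', 'embedder:flow2']
```

Because each variant lives under its own key, stopping `default` never touches the entries for `flow2`, so there is nothing for two concurrent operations to overwrite.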
## What Changes
- **Flow service** (`flow.py`): `handle_start_flow` writes individual
keys per processor:variant instead of merging into per-processor
blobs. `handle_stop_flow` deletes individual keys instead of
performing a read-modify-write.
- **FlowProcessor** (`flow_processor.py`): `on_configure_flows`
currently looks up `config["active-flow"][self.id]` to find a JSON
blob of all its variants. It needs to scan all active-flow keys for
entries whose key starts with `self.id` followed by `:` and assemble
its flow list from those.
- **Config client**: The client may benefit from a prefix-scan or
pattern-match query to support the FlowProcessor lookup efficiently.
- **Initial config / bootstrapping**: Any code that seeds active-flow
entries at deployment time needs to use the new key format.
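
The FlowProcessor lookup described above could be sketched as a client-side prefix scan. This is a minimal illustration, not the actual `on_configure_flows` code: the helper name `flows_for_processor` is hypothetical, and it assumes that `config["active-flow"]` is a mapping from key to JSON-encoded string and that processor ids contain no `:`.

```python
import json


def flows_for_processor(active_flow, processor_id):
    """Collect this processor's variants from 'processor:variant' keys.

    active_flow: mapping of key -> JSON string (an assumption about the
    shape of config["active-flow"]). Returns variant name -> decoded
    definition.
    """
    # Including the ":" in the prefix stops "chunk" from matching
    # keys that belong to "chunker".
    prefix = processor_id + ":"
    return {
        key[len(prefix):]: json.loads(value)
        for key, value in active_flow.items()
        if key.startswith(prefix)
    }


definition = json.dumps({"topics": {}, "parameters": {}})
config = {
    "active-flow": {
        "chunker:default": definition,
        "chunker:flow2": definition,
        "chunk:default": definition,
        "embedder:default": definition,
    }
}

flows = flows_for_processor(config["active-flow"], "chunker")
print(sorted(flows))  # ['default', 'flow2']
```

This scan walks every active-flow key on each config push, which is the cost a prefix-scan query in the config client would avoid.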