| layout | title | parent |
|---|---|---|
| default | Active-Flow Key Restructure | Tech Specs |
Active-Flow Key Restructure
Problem
Active-flow config uses ('active-flow', processor) as its key, where
each processor's value is a JSON blob containing all flow variants
assigned to that processor:
('active-flow', 'chunker') -> { "default": {...}, "flow2": {...} }
This causes two problems:
- Read-modify-write on every change. Starting or stopping a flow requires fetching the processor's current blob, parsing it, adding or removing a variant, serialising it, and writing it back. This is a concurrency hazard: if two flow operations target the same processor simultaneously, one write can overwrite the other.
- Noisy config pushes. Config subscribers subscribe to a type, not a specific key. Every active-flow write triggers a config push that causes every processor in the system to fetch the full config and re-evaluate, even though only one processor's config changed. With N processors in a blueprint, a single flow start/stop causes N writes, and because each write triggers a fetch by all N processors, N^2 config fetches across the system. For example, 10 processors produce 10 writes and 100 fetches.
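The concurrency hazard in the first point can be illustrated with a minimal sketch. It assumes a plain in-memory dict standing in for the config store; `store`, `read_variants`, and `write_variants` are hypothetical names for illustration, not the flow service's actual API.

```python
import json

# Hypothetical in-memory stand-in for the config store.
# Keys are (type, key) tuples; values are serialised JSON blobs.
store = {("active-flow", "chunker"): json.dumps({"default": {}})}

def read_variants(processor):
    """Fetch and parse the per-processor blob (the 'read' step)."""
    return json.loads(store.get(("active-flow", processor), "{}"))

def write_variants(processor, variants):
    """Serialise and write back the whole blob (the 'write' step)."""
    store[("active-flow", processor)] = json.dumps(variants)

# Two flow operations read the same blob before either writes back.
a = read_variants("chunker")
b = read_variants("chunker")

a["flow2"] = {}  # operation A starts flow2
b["flow3"] = {}  # operation B starts flow3

write_variants("chunker", a)
write_variants("chunker", b)  # overwrites A's write; flow2 is lost

final = read_variants("chunker")
```

After both writes, `final` contains `default` and `flow3` but not `flow2`: operation A's update has been silently dropped.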
Proposed Change
Restructure the key to ('active-flow', 'processor:variant') where
each key holds a single flow variant's configuration:
('active-flow', 'chunker:default') -> { "topics": {...}, "parameters": {...} }
('active-flow', 'chunker:flow2') -> { "topics": {...}, "parameters": {...} }
Starting a flow is a set of clean puts. Stopping a flow is a set of clean deletes. No read-modify-write. No JSON blob merging.
The config push problem (all processors fetching on every change) remains. It is a limitation of the config subscription model, and solving it would require per-key subscriptions. But eliminating the read-modify-write removes the concurrency hazard and simplifies the flow service code.
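As a rough sketch of what "clean puts" and "clean deletes" could look like under the new key format, again assuming an in-memory dict in place of the real config store (`store`, `flow_key`, `start_flow`, and `stop_flow` are hypothetical names, not the functions in flow.py):

```python
# Hypothetical in-memory stand-in for the config store.
store = {}

def flow_key(processor, variant):
    """Build the proposed ('active-flow', 'processor:variant') key."""
    return ("active-flow", f"{processor}:{variant}")

def start_flow(variant, per_processor_config):
    """Start a flow: one independent put per processor, no prior read."""
    for processor, cfg in per_processor_config.items():
        store[flow_key(processor, variant)] = cfg

def stop_flow(variant, processors):
    """Stop a flow: one independent delete per processor, no prior read."""
    for processor in processors:
        store.pop(flow_key(processor, variant), None)

# Usage: start two flows, then stop one of them.
start_flow("default", {
    "chunker": {"topics": {}, "parameters": {}},
    "embedder": {"topics": {}, "parameters": {}},
})
start_flow("flow2", {"chunker": {"topics": {}, "parameters": {}}})
stop_flow("default", ["chunker", "embedder"])

remaining = sorted(store)
```

Because each operation touches only its own keys, starting `flow2` and stopping `default` on the same processor cannot overwrite each other, whatever order they run in.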
What Changes
- Flow service (flow.py): handle_start_flow writes individual keys per processor:variant instead of merging into per-processor blobs. handle_stop_flow deletes individual keys instead of doing a read-modify-write.
- FlowProcessor (flow_processor.py): on_configure_flows currently looks up config["active-flow"][self.id] to find a JSON blob of all its variants. It needs to scan all active-flow keys for entries prefixed with self.id plus a colon and assemble its flow list from those.
- Config client: May benefit from a prefix-scan or pattern-match query to support the FlowProcessor lookup efficiently.
- Initial config / bootstrapping: Any code that seeds active-flow entries at deployment time needs to use the new key format.
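The FlowProcessor lookup described above might be sketched as follows. This assumes the active-flow section arrives as a dict mapping 'processor:variant' strings to already-parsed values, and that processor IDs never contain a colon; `flows_for_processor` is a hypothetical helper, not an existing method.

```python
def flows_for_processor(active_flow, processor_id):
    """Collect this processor's variants from the flat active-flow section.

    The colon is part of the prefix so that 'chunker' does not also
    match keys belonging to a processor named 'chunker-v2'.
    """
    prefix = processor_id + ":"
    return {
        key[len(prefix):]: value  # strip 'processor:' to leave the variant name
        for key, value in active_flow.items()
        if key.startswith(prefix)
    }

# Usage: a hypothetical active-flow section holding three processors' entries.
active_flow = {
    "chunker:default": {"topics": {}, "parameters": {}},
    "chunker:flow2": {"topics": {}, "parameters": {}},
    "chunker-v2:default": {"topics": {}, "parameters": {}},
    "embedder:default": {"topics": {}, "parameters": {}},
}

flows = flows_for_processor(active_flow, "chunker")
```

Here `flows` holds the `default` and `flow2` variants only. The scan is linear in the number of active-flow keys, which is why a prefix-scan query in the config client could help for large deployments.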