mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-04-24 16:06:22 +02:00
feat: pluggable bootstrap framework with ordered initialisers (#847)
A generic, long-running bootstrap processor that converges a
deployment to its configured initial state and then idles.
Replaces the previous one-shot `tg-init-trustgraph` container model
and provides an extension point for enterprise / third-party
initialisers.
See docs/tech-specs/bootstrap.md for the full design.
Bootstrapper
------------
A single AsyncProcessor (trustgraph.bootstrap.bootstrapper.Processor)
that:
* Reads a list of initialiser specifications (class, name, flag,
params) from either a direct `initialisers` parameter
(processor-group embedding) or a YAML/JSON file (`-c`, CLI).
* On each wake, runs a cheap service-gate (config-svc +
flow-svc round-trips), then iterates the initialiser list,
running each whose configured flag differs from the one stored
in __system__/init-state/<name>.
* Stores per-initialiser completion state in the reserved
__system__ workspace.
* Adapts cadence: ~5s on gate failure, ~15s while converging,
~300s in steady state.
* Isolates failures — one initialiser's exception does not block
others in the same cycle; the failed one retries next wake.
Initialiser contract
--------------------
* Subclass trustgraph.bootstrap.base.Initialiser.
* Implement async run(ctx, old_flag, new_flag).
* Opt out of the service gate with class attr
wait_for_services=False (only used by PulsarTopology, since
config-svc cannot come up until Pulsar namespaces exist).
* ctx carries short-lived config and flow-svc clients plus a
scoped logger.
Core initialisers (trustgraph.bootstrap.initialisers.*)
-------------------------------------------------------
* PulsarTopology — creates Pulsar tenant + namespaces
(pre-gate, blocking HTTP offloaded to
executor).
* TemplateSeed — seeds __template__ from an external JSON
file; re-run is upsert-missing by default,
overwrite-all opt-in.
* WorkspaceInit — populates a named workspace from either
the full contents of __template__ or a
seed file; raises cleanly if the template
isn't seeded yet so the bootstrapper retries
on the next cycle.
* DefaultFlowStart — starts a specific flow in a workspace;
no-ops if the flow is already running.
Enterprise or third-party initialisers plug in via fully-qualified
dotted class paths in the bootstrapper's configuration — no core
code change required.
Config service
--------------
* push(): filter out reserved workspaces (ids starting with "_")
from the change notifications. Stored config is preserved; only
the broadcast is suppressed, so bootstrap / template state lives
in config-svc without live processors ever reacting to it.
Config client
-------------
* ConfigClient.get_all(workspace): wraps the existing `config`
operation to return {type: {key: value}} for a workspace.
WorkspaceInit uses it to copy __template__ without needing a
hardcoded types list.
pyproject.toml
--------------
* Adds a `bootstrap` console script pointing at the new Processor.
* Removes tg-init-trustgraph, superseded by the bootstrap processor
This commit is contained in:
parent
31027e30ae
commit
ae9936c9cc
17 changed files with 1312 additions and 273 deletions
297
docs/tech-specs/bootstrap.md
Normal file
@ -0,0 +1,297 @@
---
layout: default
title: "Bootstrap Framework Technical Specification"
parent: "Tech Specs"
---

# Bootstrap Framework Technical Specification

## Overview

A generic, pluggable framework for running one-time initialisation steps
against a TrustGraph deployment — replacing the dedicated
`tg-init-trustgraph` container with a long-running processor that
converges the system to a desired initial state and then idles.

The framework is content-agnostic. It knows how to run, retry,
mark-as-done, and surface failures; the actual init work lives in
small pluggable classes called **initialisers**. Core initialisers
ship in the `trustgraph-flow` package; enterprise and third-party
initialisers can be loaded by dotted path without any core code
change.

## Motivation

The existing `tg-init-trustgraph` is a one-shot CLI run in its own
container. It performs two very different jobs (Pulsar topology
setup and config seeding) in a single script, is wasteful as a whole
container, cannot handle partial-success states, and has no way to
extend the boot process with enterprise-specific concerns (user
provisioning, workspace initialisation, IAM scaffolding) without
forking the tool.

A pluggable, long-running reconciler addresses all of this and slots
naturally into the existing processor-group model.

## Design

### Bootstrapper Processor

A single `AsyncProcessor` subclass. One entry in a processor group.
Parameters include the processor's own identity and a list of
**initialiser specifications** — each spec names a class (by dotted
path), a unique instance name, a flag string, and the parameters
that will be passed to the initialiser's constructor.

On each wake the bootstrapper does the following, in order:

1. Open a short-lived context (config client, flow-svc client,
   logger). The context is torn down at the end of the wake so
   steady-state idle cost is effectively nil.
2. Run all **pre-service initialisers** (those that opt out of the
   service gate — principally `PulsarTopology`, which must run
   before the services it gates on can even come up).
3. Check the **service gate**: cheap round-trips to config-svc and
   flow-svc. If either fails, skip to the sleep step using the
   short gate-retry cadence.
4. Run all **post-service initialisers** that haven't already
   completed at the currently-configured flag.
5. Sleep. Cadence adapts to state (see below).
### Initialiser Contract

An initialiser is a class with:

- A class-level `name` identifier, unique within the bootstrapper's
  configuration. This is the key under which completion state is
  stored.
- A class-level `wait_for_services` flag. When `True` (the default)
  the initialiser runs only after the service gate passes. When
  `False`, it runs before the gate, on every wake.
- A constructor that accepts the initialiser's own params as kwargs.
- An async `run(ctx, old_flag, new_flag)` method that performs the
  init work and returns on success. Any raised exception is
  logged and treated as a transient failure — the stored flag is
  not updated and the initialiser will re-run on the next cycle.

`old_flag` is the previously-stored flag string, or `None` if the
initialiser has never successfully run in this deployment. `new_flag`
is the flag the operator has configured for this run. This pair
lets an initialiser distinguish a clean first-run from a migration
between flag versions and behave accordingly (see "Flag change and
re-run safety" below).
### Context

The context is the bootstrapper-owned object passed to every
initialiser's `run()` method. Its fields are deliberately narrow:

| Field | Purpose |
|---|---|
| `logger` | A child logger named for the initialiser instance |
| `config` | A short-lived `ConfigClient` for config-svc reads/writes |
| `flow` | A short-lived `RequestResponse` client for flow-svc |

The context is always fully-populated regardless of which services
a given initialiser uses, for symmetry. Additional fields may be
added in future without breaking existing initialisers. Clients are
started at the beginning of a wake cycle and stopped at the end.

Initialisers that need services beyond config-svc and flow-svc are
responsible for their own readiness checks and for raising cleanly
when a prerequisite is not met.

### Completion State

Per-initialiser completion state is stored in the reserved
`__system__` workspace, under a dedicated config type for bootstrap
state. The stored value is the flag string that was configured when
the initialiser last succeeded.

On each cycle, for each initialiser, the bootstrapper reads the
stored flag and compares it to the currently-configured flag. If
they match, the initialiser is skipped silently. If they differ,
the initialiser runs; on success, the stored flag is updated.

Because the state lives in a reserved (`_`-prefixed) workspace, it
is stored by config-svc but excluded from the config push broadcast.
Live processors never see it and cannot act on it.
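The compare-and-skip logic reduces to a filter over stored state. In this sketch a plain dict stands in for the `__system__`/`init-state` store:

```python
def pending_initialisers(specs, stored):
    """Initialisers that must run this cycle: those whose stored flag
    (None if never run) differs from the configured flag. Entries
    that match are skipped silently."""
    return [s for s in specs if stored.get(s["name"]) != s["flag"]]

def record_success(stored, spec):
    # Written only on a clean return from run(); a failure leaves the
    # stored flag untouched so the initialiser retries next cycle.
    stored[spec["name"]] = spec["flag"]
```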
### The Service Gate

The gate is a cheap, bootstrapper-internal check that config-svc
and flow-svc are both reachable and responsive. It is intentionally
a simple pair of low-cost round-trips — a config list against
`__system__` and a flow-svc `list-blueprints` — rather than any
deeper health check.

Its purpose is to avoid filling logs with noise and to concentrate
retry effort during the brief window when services are coming up.
The gate is applied only to initialisers with
`wait_for_services=True` (the default); `False` is reserved for
initialisers that set up infrastructure the gate itself depends on.
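The two round-trips might be sketched as below; note the client method names (`config.list`, `flow.list_blueprints`) are placeholders, not the real client API.

```python
import asyncio

async def service_gate(config, flow, timeout=5.0):
    """True when both services answer a cheap request in time.
    config.list(...) and flow.list_blueprints() are assumed method
    names standing in for the two round-trips described above."""
    try:
        await asyncio.wait_for(config.list(workspace="__system__"), timeout)
        await asyncio.wait_for(flow.list_blueprints(), timeout)
        return True
    except Exception:
        # Any error or timeout closes the gate; the bootstrapper then
        # sleeps on the short gate-backoff cadence.
        return False
```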
### Adaptive Cadence

The sleep between wake cycles is chosen from three tiers based on
observed state:

| Tier | Duration | When |
|---|---|---|
| Gate backoff | ~5 s | Services not responding — concentrate retry during startup |
| Init retry | ~15 s | Gate passes but at least one initialiser is not yet at its configured flag — transient failures, waiting on prereqs, recently-bumped flag not yet applied |
| Steady | ~300 s | All configured initialisers at their configured flag; gate passes; nothing to do |

The short tiers ensure a fresh deployment converges quickly;
steady state costs a single round-trip per initialiser every few
minutes.
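The tier selection itself is a two-branch decision; a sketch, using the constants from the implementation:

```python
GATE_BACKOFF, INIT_RETRY, STEADY_INTERVAL = 5, 15, 300  # seconds

def next_sleep(gate_ok, converged):
    """Pick the sleep tier from the cycle's observed state."""
    if not gate_ok:
        return GATE_BACKOFF      # services still coming up; retry soon
    if not converged:
        return INIT_RETRY        # something ran or failed; converge fast
    return STEADY_INTERVAL       # all at target flag; idle cheaply
```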
### Failure Handling

An initialiser raising an exception does not stop the bootstrapper
or block other initialisers. Each initialiser in the cycle is
attempted independently; failures are logged and retried on the next
cycle. This means there is no ordered-DAG enforcement: order of
initialisers in the configuration determines the attempt order
within a cycle, but a dependency between two initialisers is
expressed by the dependant raising cleanly when its prerequisite
isn't satisfied. Over successive cycles the system converges.
### Flag Change and Re-run Safety

Each initialiser's completion state is a string flag chosen by the
operator. Typically these follow a simple version pattern
(`v1`, `v2`, ...), but the bootstrapper imposes no format.

Changing the flag in the group configuration causes the
corresponding initialiser to re-run on the next cycle. Initialisers
must be written so that re-running after a flag bump is safe — they
receive both the previous and the new flag and are responsible for
either cleanly re-applying the work or performing a step-change
migration from the prior state.

This gives operators an explicit, visible mechanism for triggering
re-initialisation. Re-runs are never implicit.
## Core Initialisers

The following initialisers ship in `trustgraph.bootstrap.initialisers`
and cover the base deployment case.

### PulsarTopology

Creates the Pulsar tenant and the four namespaces
(`flow`, `request`, `response`, `notify`) with appropriate
retention policies if they don't exist.

Opts out of the service gate (`wait_for_services = False`) because
config-svc and flow-svc cannot come online until the Pulsar
namespaces exist.

Parameters: Pulsar admin URL, tenant name.

Idempotent via the admin API (GET-then-PUT). Flag change causes
re-evaluation of all namespaces; any absent are created.

### TemplateSeed

Populates the reserved `__template__` workspace from an external
JSON seed file. The seed file has the standard shape of
`{config-type: {config-key: value}}`.

Runs post-gate. Parameters: path to the seed file, overwrite
policy (upsert-missing only, or overwrite-all).

On clean run, writes the whole file. On flag change, behaviour
depends on the overwrite policy — typically upsert-missing so
that operator-customised keys are preserved across seed-file
upgrades.

### WorkspaceInit

Creates a named workspace and populates it from the seed file or
from the full contents of the `__template__` workspace.

Runs post-gate. Parameters: workspace name, source (seed file or
`__template__`), optional `seed_file` path, `overwrite` flag.

When `source` is `template`, the initialiser copies every config
type and key present in `__template__` — there is no per-type
selection. Deployments that want to seed only a subset should
either curate the seed file they feed to `TemplateSeed` or use
`source: seed-file` directly here.

Raises cleanly if its source does not exist — depends on
`TemplateSeed` having run in the same cycle or a prior one.

### DefaultFlowStart

Starts a specific flow in a specific workspace using a specific
blueprint.

Runs post-gate. Parameters: workspace name, flow id, blueprint
name, description, optional parameter overrides.

Separated from `WorkspaceInit` deliberately so that deployments
which want a workspace without an auto-started flow can simply omit
this initialiser from their bootstrap configuration.
## Extensibility

New initialisers are added by:

1. Subclassing the initialiser base class.
2. Implementing `run(ctx, old_flag, new_flag)`.
3. Choosing `wait_for_services` (almost always `True`).
4. Adding an entry in the bootstrapper's configuration with the new
   class's dotted path.

No core code changes are required to add an enterprise or third-party
initialiser. Enterprise builds ship their own package with their own
initialiser classes (e.g. `CreateAdminUser`, `ProvisionWorkspaces`)
and reference them in the bootstrapper config alongside the core
initialisers.
## Reserved Workspaces

This specification relies on the "reserved workspace" convention:

- Any workspace id beginning with `_` is reserved.
- Reserved workspaces are stored normally by config-svc but never
  appear in the config push broadcast.
- Live processors cannot react to reserved-workspace state.

The bootstrapper uses two reserved workspaces:

- `__template__` — factory-default seed config, readable by
  initialisers that copy-from-template.
- `__system__` — bootstrapper completion state (under the
  `init-state` config type) and any other system-internal bookkeeping.

See the reserved-workspace convention in the config service for
the general rule and its enforcement.

## Non-Goals

- No DAG scheduling across initialisers. Dependencies are expressed
  by the dependant failing cleanly until its prerequisite is met,
  and convergence over subsequent cycles.
- No parallel execution of initialisers within a cycle. A cycle runs
  each initialiser sequentially.
- No implicit re-runs. Re-running an initialiser requires an explicit
  flag change by the operator.
- No cross-initialiser atomicity. Each initialiser's completion is
  recorded independently on its own success.

## Operational Notes

- Running the bootstrapper as a processor-group entry replaces the
  previous `tg-init-trustgraph` container. The bootstrapper is also
  CLI-invocable directly for standalone testing via
  `Processor.launch(...)`.
- First-boot convergence is typically a handful of short cycles
  followed by a transition to the steady cadence. Deployments
  should expect the first few minutes of logs to show
  initialisation activity, thereafter effective silence.
- Bumping a flag is a deliberate operational act. The log line
  emitted on re-run makes the event visible for audit.
@ -848,7 +848,6 @@ service, not in the config service. Reasons:

- **API key scoping.** API keys could be scoped to specific collections
  within a workspace rather than granting workspace-wide access. To be
  designed when the need arises.
- **tg-init-trustgraph** only initialises a single workspace.

## References
@ -84,6 +84,18 @@ class ConfigClient(RequestResponse):
        )
        return resp.directory

    async def get_all(self, workspace, timeout=CONFIG_TIMEOUT):
        """Return every config entry in ``workspace`` as a nested dict
        ``{type: {key: value}}``. Values are returned as the raw
        strings stored by config-svc (typically JSON); callers parse
        as needed. An empty dict means the workspace has no config."""
        resp = await self._request(
            operation="config",
            workspace=workspace,
            timeout=timeout,
        )
        return resp.config

    async def workspaces_for_type(self, type, timeout=CONFIG_TIMEOUT):
        """Return the set of distinct workspaces with any config of
        the given type."""
@ -40,7 +40,6 @@ tg-get-flow-blueprint = "trustgraph.cli.get_flow_blueprint:main"
tg-get-kg-core = "trustgraph.cli.get_kg_core:main"
tg-get-document-content = "trustgraph.cli.get_document_content:main"
tg-graph-to-turtle = "trustgraph.cli.graph_to_turtle:main"
tg-init-trustgraph = "trustgraph.cli.init_trustgraph:main"
tg-invoke-agent = "trustgraph.cli.invoke_agent:main"
tg-invoke-document-rag = "trustgraph.cli.invoke_document_rag:main"
tg-invoke-graph-rag = "trustgraph.cli.invoke_graph_rag:main"
@ -1,271 +0,0 @@
"""
Initialises TrustGraph pub/sub infrastructure and pushes initial config.

For Pulsar: creates tenant, namespaces, and retention policies.
For RabbitMQ: queues are auto-declared, so only config push is needed.
"""

import requests
import time
import argparse
import json

from trustgraph.clients.config_client import ConfigClient
from trustgraph.base.pubsub import add_pubsub_args

default_pulsar_admin_url = "http://pulsar:8080"
subscriber = "tg-init-pubsub"


def get_clusters(url):

    print("Get clusters...", flush=True)

    resp = requests.get(f"{url}/admin/v2/clusters")

    if resp.status_code != 200: raise RuntimeError("Could not fetch clusters")

    return resp.json()

def ensure_tenant(url, tenant, clusters):

    resp = requests.get(f"{url}/admin/v2/tenants/{tenant}")

    if resp.status_code == 200:
        print(f"Tenant {tenant} already exists.", flush=True)
        return

    resp = requests.put(
        f"{url}/admin/v2/tenants/{tenant}",
        json={
            "adminRoles": [],
            "allowedClusters": clusters,
        }
    )

    if resp.status_code != 204:
        print(resp.text, flush=True)
        raise RuntimeError("Tenant creation failed.")

    print(f"Tenant {tenant} created.", flush=True)

def ensure_namespace(url, tenant, namespace, config):

    resp = requests.get(f"{url}/admin/v2/namespaces/{tenant}/{namespace}")

    if resp.status_code == 200:
        print(f"Namespace {tenant}/{namespace} already exists.", flush=True)
        return

    resp = requests.put(
        f"{url}/admin/v2/namespaces/{tenant}/{namespace}",
        json=config,
    )

    if resp.status_code != 204:
        print(resp.status_code, flush=True)
        print(resp.text, flush=True)
        raise RuntimeError(f"Namespace {tenant}/{namespace} creation failed.")

    print(f"Namespace {tenant}/{namespace} created.", flush=True)

def ensure_config(config, workspace="default", **pubsub_config):

    cli = ConfigClient(
        subscriber=subscriber,
        workspace=workspace,
        **pubsub_config,
    )

    while True:

        try:

            print("Get current config...", flush=True)
            current, version = cli.config(timeout=5)

        except Exception as e:

            print("Exception:", e, flush=True)
            time.sleep(2)
            print("Retrying...", flush=True)
            continue

        print("Current config version is", version, flush=True)

        if version != 0:
            print("Already updated, not updating config. Done.", flush=True)
            return

        print("Config is version 0, updating...", flush=True)

        batch = []

        for type in config:
            for key in config[type]:
                print(f"Adding {type}/{key} to update.", flush=True)
                batch.append({
                    "type": type,
                    "key": key,
                    "value": json.dumps(config[type][key]),
                })

        try:
            cli.put(batch, timeout=10)
            print("Update succeeded.", flush=True)
            break
        except Exception as e:
            print("Exception:", e, flush=True)
            time.sleep(2)
            print("Retrying...", flush=True)
            continue

def init_pulsar(pulsar_admin_url, tenant):
    """Pulsar-specific setup: create tenant, namespaces, retention policies."""

    clusters = get_clusters(pulsar_admin_url)

    ensure_tenant(pulsar_admin_url, tenant, clusters)

    ensure_namespace(pulsar_admin_url, tenant, "flow", {})

    ensure_namespace(pulsar_admin_url, tenant, "request", {})

    ensure_namespace(pulsar_admin_url, tenant, "response", {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 30,
        }
    })

    ensure_namespace(pulsar_admin_url, tenant, "notify", {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 5,
        }
    })


def push_config(config_json, config_file, workspace="default",
                **pubsub_config):
    """Push initial config if provided."""

    if config_json is not None:

        try:
            print("Decoding config...", flush=True)
            dec = json.loads(config_json)
            print("Decoded.", flush=True)
        except Exception as e:
            print("Exception:", e, flush=True)
            raise e

        ensure_config(dec, workspace=workspace, **pubsub_config)

    elif config_file is not None:

        try:
            print("Decoding config...", flush=True)
            dec = json.load(open(config_file))
            print("Decoded.", flush=True)
        except Exception as e:
            print("Exception:", e, flush=True)
            raise e

        ensure_config(dec, workspace=workspace, **pubsub_config)

    else:
        print("No config to update.", flush=True)


def main():

    parser = argparse.ArgumentParser(
        prog='tg-init-trustgraph',
        description=__doc__,
    )

    parser.add_argument(
        '--pulsar-admin-url',
        default=default_pulsar_admin_url,
        help=f'Pulsar admin URL (default: {default_pulsar_admin_url})',
    )

    parser.add_argument(
        '-c', '--config',
        help='Initial configuration to load',
    )

    parser.add_argument(
        '-C', '--config-file',
        help='Initial configuration to load from file',
    )

    parser.add_argument(
        '-t', '--tenant',
        default="tg",
        help='Tenant (default: tg)',
    )

    parser.add_argument(
        '-w', '--workspace',
        default="default",
        help='Workspace (default: default)',
    )

    add_pubsub_args(parser)

    args = parser.parse_args()

    backend_type = args.pubsub_backend

    # Extract pubsub config from args
    pubsub_config = {
        k: v for k, v in vars(args).items()
        if k not in (
            'pulsar_admin_url', 'config', 'config_file', 'tenant',
            'workspace',
        )
    }

    while True:

        try:

            # Pulsar-specific setup (tenants, namespaces)
            if backend_type == 'pulsar':
                print(flush=True)
                print(
                    f"Initialising Pulsar at {args.pulsar_admin_url}...",
                    flush=True,
                )
                init_pulsar(args.pulsar_admin_url, args.tenant)
            else:
                print(flush=True)
                print(
                    f"Using {backend_type} backend (no admin setup needed).",
                    flush=True,
                )

            # Push config (works with any backend)
            push_config(
                args.config, args.config_file,
                workspace=args.workspace,
                **pubsub_config,
            )

            print("Initialisation complete.", flush=True)
            break

        except Exception as e:

            print("Exception:", e, flush=True)

        print("Sleeping...", flush=True)
        time.sleep(2)
        print("Will retry...", flush=True)

if __name__ == "__main__":
    main()
@ -60,6 +60,7 @@ agent-orchestrator = "trustgraph.agent.orchestrator:run"
api-gateway = "trustgraph.gateway:run"
chunker-recursive = "trustgraph.chunking.recursive:run"
chunker-token = "trustgraph.chunking.token:run"
bootstrap = "trustgraph.bootstrap.bootstrapper:run"
config-svc = "trustgraph.config.service:run"
flow-svc = "trustgraph.flow.service:run"
doc-embeddings-query-milvus = "trustgraph.query.doc_embeddings.milvus:run"
0
trustgraph-flow/trustgraph/bootstrap/__init__.py
Normal file
68
trustgraph-flow/trustgraph/bootstrap/base.py
Normal file
@ -0,0 +1,68 @@
"""
Bootstrap framework: Initialiser base class and per-wake context.

See docs/tech-specs/bootstrap.md for the full design.
"""

import logging
from dataclasses import dataclass
from typing import Any


@dataclass
class InitContext:
    """Shared per-wake context passed to each initialiser.

    The bootstrapper constructs one of these on every wake cycle,
    tears it down at cycle end, and passes it into each initialiser's
    ``run()`` method. Fields are short-lived and safe to use during
    a single cycle only.
    """

    logger: logging.Logger
    config: Any  # ConfigClient
    flow: Any    # RequestResponse client for flow-svc


class Initialiser:
    """Base class for bootstrap initialisers.

    Subclasses implement :meth:`run`. The bootstrapper manages
    completion state, flag comparison, retry and error handling —
    subclasses describe only the work to perform.

    Class attributes:

    * ``wait_for_services`` (bool, default ``True``): when ``True`` the
      initialiser only runs after the bootstrapper's service gate has
      passed (config-svc and flow-svc reachable). Set ``False`` for
      initialisers that bring up infrastructure the gate itself
      depends on — principally Pulsar topology, without which
      config-svc cannot come online.
    """

    wait_for_services: bool = True

    def __init__(self, **params):
        # Subclasses should consume their own params via keyword
        # arguments in their own __init__ signatures. This catch-all
        # is here so any kwargs that filter through unnoticed don't
        # raise TypeError on construction.
        pass

    async def run(self, ctx, old_flag, new_flag):
        """Perform initialisation work.

        :param ctx: :class:`InitContext` with logger, config client,
            flow-svc client.
        :param old_flag: Previously-stored flag string, or ``None`` if
            this initialiser has never successfully completed in this
            deployment.
        :param new_flag: Currently-configured flag. A string chosen
            by the operator; typically something like ``"v1"``.

        :raises: Any exception on failure. The bootstrapper catches,
            logs, and re-runs on the next cycle; completion state is
            only written on clean return.
        """
        raise NotImplementedError
@ -0,0 +1 @@
from . service import *
@ -0,0 +1,6 @@
#!/usr/bin/env python3

from . service import run

if __name__ == '__main__':
    run()
414
trustgraph-flow/trustgraph/bootstrap/bootstrapper/service.py
Normal file
@ -0,0 +1,414 @@
|
|||
"""
|
||||
Bootstrapper processor.
|
||||
|
||||
Runs a pluggable list of initialisers in a reconciliation loop.
|
||||
Each initialiser's completion state is recorded in the reserved
|
||||
``__system__`` workspace under the ``init-state`` config type.
|
||||
|
||||
See docs/tech-specs/bootstrap.md for the full design.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import importlib
|
||||
import json
|
||||
import logging
|
||||
import uuid
|
||||
from argparse import ArgumentParser
|
||||
from dataclasses import dataclass
|
||||
|
||||
from trustgraph.base import AsyncProcessor
|
||||
from trustgraph.base import ProducerMetrics, SubscriberMetrics
|
||||
from trustgraph.base.config_client import ConfigClient
|
||||
from trustgraph.base.request_response_spec import RequestResponse
|
||||
from trustgraph.schema import (
|
||||
ConfigRequest, ConfigResponse,
|
||||
config_request_queue, config_response_queue,
|
||||
)
|
||||
from trustgraph.schema import (
|
||||
FlowRequest, FlowResponse,
|
||||
flow_request_queue, flow_response_queue,
|
||||
)
|
||||
|
||||
from .. base import Initialiser, InitContext
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
default_ident = "bootstrap"
|
||||
|
||||
# Reserved workspace + config type under which completion state is
|
||||
# stored. Reserved (`_`-prefix) workspaces are excluded from the
|
||||
# config push broadcast — live processors never see these keys.
|
||||
SYSTEM_WORKSPACE = "__system__"
|
||||
INIT_STATE_TYPE = "init-state"
|
||||
|
||||
# Cadence tiers.
|
||||
GATE_BACKOFF = 5 # Services not responding; retry soon.
|
||||
INIT_RETRY = 15 # Gate passed but something ran/failed;
|
||||
# converge quickly.
|
||||
STEADY_INTERVAL = 300 # Everything at target flag; idle cheaply.
|
||||
|
||||
|
||||
@dataclass
|
||||
class InitialiserSpec:
|
||||
"""One entry in the bootstrapper's configured list of initialisers."""
|
||||
name: str
|
||||
flag: str
|
||||
instance: Initialiser
|
||||
|
||||
|
||||
def _resolve_class(dotted):
|
||||
"""Import and return a class by its dotted path."""
|
||||
module_path, _, class_name = dotted.rpartition(".")
|
||||
if not module_path:
|
||||
raise ValueError(
|
||||
f"Initialiser class must be a dotted path, got {dotted!r}"
|
||||
)
|
||||
module = importlib.import_module(module_path)
|
||||
return getattr(module, class_name)
|
||||
|
||||
|
||||
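`_resolve_class` is plain `importlib` machinery; the same idea as a standalone, runnable sketch:

```python
import importlib

def resolve_class(dotted):
    # Split "pkg.module.Class" into module path and class name.
    module_path, _, class_name = dotted.rpartition(".")
    if not module_path:
        raise ValueError(f"expected a dotted path, got {dotted!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# Any importable class works, e.g. a stdlib one:
cls = resolve_class("collections.OrderedDict")
print(cls.__name__)  # -> OrderedDict
```

This is what lets enterprise or third-party initialisers be referenced from config by fully-qualified dotted path without the bootstrapper importing them up front.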
def _load_initialisers_file(path):
    """Load the initialisers spec list from a YAML or JSON file.

    File shape:

    .. code-block:: yaml

        initialisers:
          - class: trustgraph.bootstrap.initialisers.PulsarTopology
            name: pulsar-topology
            flag: v1
            params:
              admin_url: http://pulsar:8080
              tenant: tg
          - ...
    """
    with open(path) as f:
        content = f.read()
    if path.endswith((".yaml", ".yml")):
        import yaml
        doc = yaml.safe_load(content)
    else:
        doc = json.loads(content)
    if not isinstance(doc, dict) or "initialisers" not in doc:
        raise RuntimeError(
            f"{path}: expected a mapping with an 'initialisers' key"
        )
    return doc["initialisers"]
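A quick sanity check of the JSON branch, using an illustrative spec matching the documented shape (the spec content below is an example, not a required configuration):

```python
import json
import tempfile

spec = {
    "initialisers": [
        {
            "class": "trustgraph.bootstrap.initialisers.PulsarTopology",
            "name": "pulsar-topology",
            "flag": "v1",
            "params": {"admin_url": "http://pulsar:8080", "tenant": "tg"},
        },
    ],
}

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(spec, f)
    path = f.name

# Mirror of the loader's non-YAML branch and its validation.
with open(path) as f:
    doc = json.load(f)
assert isinstance(doc, dict) and "initialisers" in doc
print(doc["initialisers"][0]["name"])  # -> pulsar-topology
```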
class Processor(AsyncProcessor):

    def __init__(self, **params):

        super().__init__(**params)

        # Source the initialisers list either from a direct parameter
        # (processor-group embedding) or from a file (CLI launch).
        inits = params.get("initialisers")
        if inits is None:
            inits_file = params.get("initialisers_file")
            if inits_file is None:
                raise RuntimeError(
                    "Bootstrapper requires either the 'initialisers' "
                    "parameter or --initialisers-file"
                )
            inits = _load_initialisers_file(inits_file)

        self.specs = []
        names = set()

        for entry in inits:
            if not isinstance(entry, dict):
                raise RuntimeError(
                    f"Initialiser entry must be a mapping, got: {entry!r}"
                )
            for required in ("class", "name", "flag"):
                if required not in entry:
                    raise RuntimeError(
                        f"Initialiser entry missing required field "
                        f"{required!r}: {entry!r}"
                    )

            name = entry["name"]
            if name in names:
                raise RuntimeError(f"Duplicate initialiser name {name!r}")
            names.add(name)

            cls = _resolve_class(entry["class"])

            try:
                instance = cls(**entry.get("params", {}))
            except Exception as e:
                raise RuntimeError(
                    f"Failed to instantiate initialiser "
                    f"{entry['class']!r} as {name!r}: "
                    f"{type(e).__name__}: {e}"
                ) from e

            self.specs.append(InitialiserSpec(
                name=name,
                flag=entry["flag"],
                instance=instance,
            ))

        logger.info(
            f"Bootstrapper: loaded {len(self.specs)} initialisers"
        )
    # ------------------------------------------------------------------
    # Client construction (short-lived per wake cycle).
    # ------------------------------------------------------------------

    def _make_config_client(self):
        rr_id = str(uuid.uuid4())
        return ConfigClient(
            backend=self.pubsub_backend,
            subscription=f"{self.id}--config--{rr_id}",
            consumer_name=self.id,
            request_topic=config_request_queue,
            request_schema=ConfigRequest,
            request_metrics=ProducerMetrics(
                processor=self.id, flow=None, name="config-request",
            ),
            response_topic=config_response_queue,
            response_schema=ConfigResponse,
            response_metrics=SubscriberMetrics(
                processor=self.id, flow=None, name="config-response",
            ),
        )

    def _make_flow_client(self):
        rr_id = str(uuid.uuid4())
        return RequestResponse(
            backend=self.pubsub_backend,
            subscription=f"{self.id}--flow--{rr_id}",
            consumer_name=self.id,
            request_topic=flow_request_queue,
            request_schema=FlowRequest,
            request_metrics=ProducerMetrics(
                processor=self.id, flow=None, name="flow-request",
            ),
            response_topic=flow_response_queue,
            response_schema=FlowResponse,
            response_metrics=SubscriberMetrics(
                processor=self.id, flow=None, name="flow-response",
            ),
        )

    async def _open_clients(self):
        config = self._make_config_client()
        flow = self._make_flow_client()
        await config.start()
        try:
            await flow.start()
        except Exception:
            await self._safe_stop(config)
            raise
        return config, flow

    async def _safe_stop(self, client):
        try:
            await client.stop()
        except Exception:
            pass
    # ------------------------------------------------------------------
    # Service gate.
    # ------------------------------------------------------------------

    async def _gate_ready(self, config, flow):
        try:
            await config.keys(SYSTEM_WORKSPACE, INIT_STATE_TYPE)
        except Exception as e:
            logger.info(
                f"Gate: config-svc not ready ({type(e).__name__}: {e})"
            )
            return False

        try:
            resp = await flow.request(
                FlowRequest(
                    operation="list-blueprints",
                    workspace=SYSTEM_WORKSPACE,
                ),
                timeout=5,
            )
            if resp.error:
                logger.info(
                    f"Gate: flow-svc error: "
                    f"{resp.error.type}: {resp.error.message}"
                )
                return False
        except Exception as e:
            logger.info(
                f"Gate: flow-svc not ready ({type(e).__name__}: {e})"
            )
            return False

        return True
    # ------------------------------------------------------------------
    # Completion state.
    # ------------------------------------------------------------------

    async def _stored_flag(self, config, name):
        raw = await config.get(SYSTEM_WORKSPACE, INIT_STATE_TYPE, name)
        if raw is None:
            return None
        try:
            return json.loads(raw)
        except Exception:
            return raw

    async def _store_flag(self, config, name, flag):
        await config.put(
            SYSTEM_WORKSPACE, INIT_STATE_TYPE, name,
            json.dumps(flag),
        )
    # ------------------------------------------------------------------
    # Per-spec execution.
    # ------------------------------------------------------------------

    async def _run_spec(self, spec, config, flow):
        """Run a single initialiser spec.

        Returns one of:
        - ``"skip"``: stored flag already matches target, nothing to do.
        - ``"ran"``: initialiser ran and completion state was updated.
        - ``"failed"``: initialiser raised.
        - ``"failed-state-write"``: initialiser succeeded but we could
          not persist the new flag (transient; will re-run next cycle).
        """

        try:
            old_flag = await self._stored_flag(config, spec.name)
        except Exception as e:
            logger.warning(
                f"{spec.name}: could not read stored flag "
                f"({type(e).__name__}: {e})"
            )
            return "failed"

        if old_flag == spec.flag:
            return "skip"

        child_logger = logger.getChild(spec.name)
        child_ctx = InitContext(
            logger=child_logger,
            config=config,
            flow=flow,
        )

        child_logger.info(
            f"Running (old_flag={old_flag!r} -> new_flag={spec.flag!r})"
        )

        try:
            await spec.instance.run(child_ctx, old_flag, spec.flag)
        except Exception as e:
            child_logger.error(
                f"Failed: {type(e).__name__}: {e}", exc_info=True,
            )
            return "failed"

        try:
            await self._store_flag(config, spec.name, spec.flag)
        except Exception as e:
            child_logger.warning(
                f"Completed but could not persist state flag "
                f"({type(e).__name__}: {e}); will re-run next cycle"
            )
            return "failed-state-write"

        child_logger.info(f"Completed (flag={spec.flag!r})")
        return "ran"
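The flag-gated contract above reduces to a small state machine; a self-contained sketch with an in-memory stand-in for the `__system__`/`init-state` store (names here are illustrative, not the real client API):

```python
import asyncio

# In-memory stand-in for the __system__/init-state store.
state = {}

async def run_spec(name, target_flag, init_fn):
    """Run init_fn only when the stored flag differs from target_flag."""
    if state.get(name) == target_flag:
        return "skip"
    try:
        await init_fn(state.get(name), target_flag)
    except Exception:
        return "failed"           # retried on the next wake
    state[name] = target_flag     # persist completion
    return "ran"

async def demo():
    async def ok(old, new): pass
    async def boom(old, new): raise RuntimeError("transient")
    r1 = await run_spec("seed", "v1", ok)     # first run
    r2 = await run_spec("seed", "v1", ok)     # already converged
    r3 = await run_spec("bad", "v1", boom)    # failure leaves no flag
    return r1, r2, r3

print(asyncio.run(demo()))  # -> ('ran', 'skip', 'failed')
```

Because a failure never writes the flag, the failed initialiser is picked up again on the next wake, which is exactly what makes the loop a convergence process rather than a one-shot script.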
    # ------------------------------------------------------------------
    # Main loop.
    # ------------------------------------------------------------------

    async def run(self):

        logger.info(
            f"Bootstrapper starting with {len(self.specs)} initialisers"
        )

        while self.running:

            sleep_for = STEADY_INTERVAL

            try:
                config, flow = await self._open_clients()
            except Exception as e:
                logger.info(
                    f"Failed to open clients "
                    f"({type(e).__name__}: {e}); retry in {GATE_BACKOFF}s"
                )
                await asyncio.sleep(GATE_BACKOFF)
                continue

            try:
                # Phase 1: pre-service initialisers run unconditionally.
                pre_specs = [
                    s for s in self.specs
                    if not s.instance.wait_for_services
                ]
                pre_results = {}
                for spec in pre_specs:
                    pre_results[spec.name] = await self._run_spec(
                        spec, config, flow,
                    )

                # Phase 2: gate.
                gate_ok = await self._gate_ready(config, flow)

                # Phase 3: post-service initialisers, if gate passed.
                post_results = {}
                if gate_ok:
                    post_specs = [
                        s for s in self.specs
                        if s.instance.wait_for_services
                    ]
                    for spec in post_specs:
                        post_results[spec.name] = await self._run_spec(
                            spec, config, flow,
                        )

                # Cadence selection.
                if not gate_ok:
                    sleep_for = GATE_BACKOFF
                else:
                    all_results = {**pre_results, **post_results}
                    if any(r != "skip" for r in all_results.values()):
                        sleep_for = INIT_RETRY
                    else:
                        sleep_for = STEADY_INTERVAL

            finally:
                await self._safe_stop(config)
                await self._safe_stop(flow)

            await asyncio.sleep(sleep_for)
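The cadence selection at the bottom of the loop is a pure function of the gate result and the per-initialiser results; a sketch using the same tier constants:

```python
GATE_BACKOFF, INIT_RETRY, STEADY_INTERVAL = 5, 15, 300

def next_interval(gate_ok, results):
    """Pick the sleep interval for the next wake cycle."""
    if not gate_ok:
        return GATE_BACKOFF        # services down: probe again soon
    if any(r != "skip" for r in results.values()):
        return INIT_RETRY          # something ran or failed: converge quickly
    return STEADY_INTERVAL         # everything at target flag: idle cheaply

print(next_interval(False, {}))                   # -> 5
print(next_interval(True, {"seed": "ran"}))       # -> 15
print(next_interval(True, {"seed": "skip"}))      # -> 300
```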
    # ------------------------------------------------------------------
    # CLI arg plumbing.
    # ------------------------------------------------------------------

    @staticmethod
    def add_args(parser: ArgumentParser) -> None:

        AsyncProcessor.add_args(parser)

        parser.add_argument(
            '-c', '--initialisers-file',
            help='Path to YAML or JSON file describing the '
                 'initialisers to run. Ignored when the '
                 "'initialisers' parameter is provided directly "
                 '(e.g. when running inside a processor group).',
        )


def run():
    Processor.launch(default_ident, __doc__)
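The initialiser contract is deliberately small; a self-contained sketch of it (the `Initialiser` base below is a stand-in for `trustgraph.bootstrap.base.Initialiser`, whose real definition may differ, and `HelloInit` is purely illustrative):

```python
import asyncio

class Initialiser:
    """Minimal stand-in for trustgraph.bootstrap.base.Initialiser."""
    wait_for_services = True   # set False to run pre-gate

    async def run(self, ctx, old_flag, new_flag):
        raise NotImplementedError

class HelloInit(Initialiser):
    """Illustrative third-party initialiser."""
    def __init__(self, greeting="hello"):
        self.greeting = greeting

    async def run(self, ctx, old_flag, new_flag):
        # The real ctx carries config/flow clients and a scoped
        # logger; here we just record the flag transition.
        ctx.append((self.greeting, old_flag, new_flag))

events = []
asyncio.run(HelloInit().run(events, None, "v1"))
print(events)  # -> [('hello', None, 'v1')]
```

A subclass like this, referenced by dotted path in the bootstrapper's config with a `name`, `flag`, and `params`, is all an extension package needs to provide.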
@ -0,0 +1,20 @@
"""
Core bootstrap initialisers.

These cover the base TrustGraph deployment case.  Enterprise or
third-party initialisers live in their own packages and are
referenced in the bootstrapper's config by fully-qualified dotted
path.
"""

from . pulsar_topology import PulsarTopology
from . template_seed import TemplateSeed
from . workspace_init import WorkspaceInit
from . default_flow_start import DefaultFlowStart

__all__ = [
    "PulsarTopology",
    "TemplateSeed",
    "WorkspaceInit",
    "DefaultFlowStart",
]
@ -0,0 +1,101 @@
"""
DefaultFlowStart initialiser: starts a named flow in a workspace
using a specified blueprint.

Separated from WorkspaceInit so deployments that want a workspace
without an auto-started flow can simply omit this initialiser.

Parameters
----------
workspace : str (default "default")
    Workspace in which to start the flow.
flow_id : str (default "default")
    Identifier for the started flow.
blueprint : str (required)
    Blueprint name (must already exist in the workspace's config,
    typically via TemplateSeed -> WorkspaceInit).
description : str (default "Default")
    Human-readable description passed to flow-svc.
parameters : dict (optional)
    Optional parameter overrides passed to start-flow.
"""

from trustgraph.schema import FlowRequest

from .. base import Initialiser


class DefaultFlowStart(Initialiser):

    def __init__(
        self,
        workspace="default",
        flow_id="default",
        blueprint=None,
        description="Default",
        parameters=None,
        **kwargs,
    ):
        super().__init__(**kwargs)
        if not blueprint:
            raise ValueError(
                "DefaultFlowStart requires 'blueprint'"
            )
        self.workspace = workspace
        self.flow_id = flow_id
        self.blueprint = blueprint
        self.description = description
        self.parameters = dict(parameters) if parameters else {}

    async def run(self, ctx, old_flag, new_flag):

        # Check whether the flow already exists.  Belt-and-braces
        # beyond the flag gate: if an operator stops and restarts the
        # bootstrapper after the flow is already running, we don't
        # want to blindly try to start it again.
        list_resp = await ctx.flow.request(
            FlowRequest(
                operation="list-flows",
                workspace=self.workspace,
            ),
            timeout=10,
        )
        if list_resp.error:
            raise RuntimeError(
                f"list-flows failed: "
                f"{list_resp.error.type}: {list_resp.error.message}"
            )

        if self.flow_id in (list_resp.flow_ids or []):
            ctx.logger.info(
                f"Flow {self.flow_id!r} already running in workspace "
                f"{self.workspace!r}; nothing to do"
            )
            return

        ctx.logger.info(
            f"Starting flow {self.flow_id!r} "
            f"(blueprint={self.blueprint!r}) "
            f"in workspace {self.workspace!r}"
        )

        resp = await ctx.flow.request(
            FlowRequest(
                operation="start-flow",
                workspace=self.workspace,
                flow_id=self.flow_id,
                blueprint_name=self.blueprint,
                description=self.description,
                parameters=self.parameters,
            ),
            timeout=30,
        )
        if resp.error:
            raise RuntimeError(
                f"start-flow failed: "
                f"{resp.error.type}: {resp.error.message}"
            )

        ctx.logger.info(
            f"Flow {self.flow_id!r} started"
        )
@ -0,0 +1,131 @@
"""
PulsarTopology initialiser: creates the Pulsar tenant and namespaces
with their retention policies.

Runs pre-gate (``wait_for_services = False``) because config-svc and
flow-svc can't connect to Pulsar until these namespaces exist.
Admin-API calls are idempotent so re-runs on flag change are safe.
"""

import asyncio
import requests

from .. base import Initialiser

# Namespace configs.  flow/request take broker defaults.  response
# and notify get aggressive retention; those classes carry
# short-lived request/response and notification traffic only.
NAMESPACE_CONFIG = {
    "flow": {},
    "request": {},
    "response": {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 30,
        },
    },
    "notify": {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 5,
        },
    },
}

REQUEST_TIMEOUT = 10


class PulsarTopology(Initialiser):

    wait_for_services = False

    def __init__(
        self,
        admin_url="http://pulsar:8080",
        tenant="tg",
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.admin_url = admin_url.rstrip("/")
        self.tenant = tenant

    async def run(self, ctx, old_flag, new_flag):
        # requests is blocking; offload to an executor so the event
        # loop stays responsive.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, self._reconcile_sync, ctx.logger)

    # ------------------------------------------------------------------
    # Sync admin-API calls.
    # ------------------------------------------------------------------

    def _get_clusters(self):
        resp = requests.get(
            f"{self.admin_url}/admin/v2/clusters",
            timeout=REQUEST_TIMEOUT,
        )
        resp.raise_for_status()
        return resp.json()

    def _tenant_exists(self):
        resp = requests.get(
            f"{self.admin_url}/admin/v2/tenants/{self.tenant}",
            timeout=REQUEST_TIMEOUT,
        )
        return resp.status_code == 200

    def _create_tenant(self, clusters):
        resp = requests.put(
            f"{self.admin_url}/admin/v2/tenants/{self.tenant}",
            json={"adminRoles": [], "allowedClusters": clusters},
            timeout=REQUEST_TIMEOUT,
        )
        if resp.status_code != 204:
            raise RuntimeError(
                f"Tenant {self.tenant!r} create failed: "
                f"{resp.status_code} {resp.text}"
            )

    def _namespace_exists(self, namespace):
        resp = requests.get(
            f"{self.admin_url}/admin/v2/namespaces/"
            f"{self.tenant}/{namespace}",
            timeout=REQUEST_TIMEOUT,
        )
        return resp.status_code == 200

    def _create_namespace(self, namespace, config):
        resp = requests.put(
            f"{self.admin_url}/admin/v2/namespaces/"
            f"{self.tenant}/{namespace}",
            json=config,
            timeout=REQUEST_TIMEOUT,
        )
        if resp.status_code != 204:
            raise RuntimeError(
                f"Namespace {self.tenant}/{namespace} create failed: "
                f"{resp.status_code} {resp.text}"
            )

    def _reconcile_sync(self, logger):
        if not self._tenant_exists():
            clusters = self._get_clusters()
            logger.info(
                f"Creating tenant {self.tenant!r} with clusters {clusters}"
            )
            self._create_tenant(clusters)
        else:
            logger.debug(f"Tenant {self.tenant!r} already exists")

        for namespace, config in NAMESPACE_CONFIG.items():
            if self._namespace_exists(namespace):
                logger.debug(
                    f"Namespace {self.tenant}/{namespace} already exists"
                )
                continue
            logger.info(
                f"Creating namespace {self.tenant}/{namespace}"
            )
            self._create_namespace(namespace, config)
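The executor-offload pattern used in `run` keeps the event loop responsive while blocking HTTP happens on a worker thread; a minimal self-contained sketch (the blocking probe stands in for the `requests` admin-API calls):

```python
import asyncio
import time

def blocking_probe():
    # Stand-in for a blocking requests.put() admin-API call.
    time.sleep(0.05)
    return 204

async def main():
    # Offload the blocking call so the event loop stays responsive,
    # as PulsarTopology.run does for its admin-API sweep.
    loop = asyncio.get_running_loop()
    status = await loop.run_in_executor(None, blocking_probe)
    return status

print(asyncio.run(main()))  # -> 204
```

`asyncio.to_thread` is an equivalent shorthand on modern Python; either way the point is that no coroutine is starved while the HTTP round-trips complete.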
@ -0,0 +1,93 @@
"""
TemplateSeed initialiser: populates the reserved ``__template__``
workspace from an external JSON seed file.

Seed file shape:

.. code-block:: json

    {
        "flow-blueprint": {
            "ontology": { ... },
            "agent": { ... }
        },
        "prompt": {
            ...
        },
        ...
    }

Top-level keys are config types; nested keys are config entries.
Values are arbitrary JSON (they'll be ``json.dumps()``'d on write).

Parameters
----------
config_file : str
    Path to the seed file on disk.
overwrite : bool (default False)
    On re-run (flag change), if True overwrite all keys; if False,
    upsert-missing-only (preserves any operator customisation of
    the template).
"""

import json

from .. base import Initialiser

TEMPLATE_WORKSPACE = "__template__"


class TemplateSeed(Initialiser):

    def __init__(self, config_file, overwrite=False, **kwargs):
        super().__init__(**kwargs)
        if not config_file:
            raise ValueError("TemplateSeed requires 'config_file'")
        self.config_file = config_file
        self.overwrite = overwrite

    async def run(self, ctx, old_flag, new_flag):

        with open(self.config_file) as f:
            seed = json.load(f)

        if old_flag is None:
            # Clean first run: write every entry.
            await self._write_all(ctx, seed)
            return

        # Re-run after flag change.
        if self.overwrite:
            await self._write_all(ctx, seed)
        else:
            await self._upsert_missing(ctx, seed)

    async def _write_all(self, ctx, seed):
        values = []
        for type_name, entries in seed.items():
            for key, value in entries.items():
                values.append((type_name, key, json.dumps(value)))
        if values:
            await ctx.config.put_many(TEMPLATE_WORKSPACE, values)
        ctx.logger.info(
            f"Template seeded with {len(values)} entries"
        )

    async def _upsert_missing(self, ctx, seed):
        written = 0
        for type_name, entries in seed.items():
            existing = set(
                await ctx.config.keys(TEMPLATE_WORKSPACE, type_name)
            )
            values = []
            for key, value in entries.items():
                if key not in existing:
                    values.append(
                        (type_name, key, json.dumps(value))
                    )
            if values:
                await ctx.config.put_many(TEMPLATE_WORKSPACE, values)
                written += len(values)
        ctx.logger.info(
            f"Template upsert-missing: {written} new entries"
        )
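The upsert-missing semantics shared by TemplateSeed and WorkspaceInit can be sketched as one pure planning function (an illustrative reduction, not the class API; the sample seed data is made up):

```python
import json

def plan_upsert(seed, existing, overwrite=False):
    """Decide which (type, key, json) triples to write.

    seed:     {type: {key: value}} tree, as TemplateSeed loads it.
    existing: {type: set-of-keys} already present in the workspace.
    """
    values = []
    for type_name, entries in seed.items():
        present = existing.get(type_name, set())
        for key, value in entries.items():
            if overwrite or key not in present:
                values.append((type_name, key, json.dumps(value)))
    return values

seed = {"prompt": {"qa": {"t": 1}, "extract": {"t": 2}}}
existing = {"prompt": {"qa"}}       # operator already customised "qa"
print(plan_upsert(seed, existing))  # -> [('prompt', 'extract', '{"t": 2}')]
```

With `overwrite=False` (the default) operator edits survive re-seeding; `overwrite=True` is the explicit opt-in to stamp the seed over everything.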
@ -0,0 +1,138 @@
"""
WorkspaceInit initialiser: creates a workspace and populates it from
either the ``__template__`` workspace or a seed file on disk.

Parameters
----------
workspace : str
    Target workspace to create / populate.
source : str
    Either ``"template"`` (copy the full contents of the
    ``__template__`` workspace) or ``"seed-file"`` (read from
    ``seed_file``).
seed_file : str (required when source=="seed-file")
    Path to a JSON seed file with the same shape TemplateSeed consumes.
overwrite : bool (default False)
    On re-run (flag change), if True overwrite all keys; if False,
    upsert-missing-only (preserves in-workspace customisations).

Raises (in ``run``)
-------------------
When source is ``"template"``, raises ``RuntimeError`` if the
``__template__`` workspace is empty, which indicates that TemplateSeed
hasn't run yet.  The bootstrapper's retry loop will re-attempt on
the next cycle once the prerequisite is satisfied.
"""

import json

from .. base import Initialiser

TEMPLATE_WORKSPACE = "__template__"


class WorkspaceInit(Initialiser):

    def __init__(
        self,
        workspace="default",
        source="template",
        seed_file=None,
        overwrite=False,
        **kwargs,
    ):
        super().__init__(**kwargs)

        if source not in ("template", "seed-file"):
            raise ValueError(
                f"WorkspaceInit: source must be 'template' or "
                f"'seed-file', got {source!r}"
            )
        if source == "seed-file" and not seed_file:
            raise ValueError(
                "WorkspaceInit: seed_file required when source='seed-file'"
            )

        self.workspace = workspace
        self.source = source
        self.seed_file = seed_file
        self.overwrite = overwrite

    async def run(self, ctx, old_flag, new_flag):
        if self.source == "seed-file":
            tree = self._load_seed_file()
        else:
            tree = await self._load_from_template(ctx)

        if old_flag is None or self.overwrite:
            await self._write_all(ctx, tree)
        else:
            await self._upsert_missing(ctx, tree)

    def _load_seed_file(self):
        with open(self.seed_file) as f:
            return json.load(f)

    async def _load_from_template(self, ctx):
        """Build a seed tree from the entire ``__template__`` workspace.

        Raises if the workspace is empty, so the bootstrapper knows
        the prerequisite isn't met yet.
        """
        raw_tree = await ctx.config.get_all(TEMPLATE_WORKSPACE)

        tree = {}
        total = 0
        for type_name, entries in raw_tree.items():
            parsed = {}
            for key, raw in entries.items():
                if raw is None:
                    continue
                try:
                    parsed[key] = json.loads(raw)
                except Exception:
                    parsed[key] = raw
                total += 1
            if parsed:
                tree[type_name] = parsed

        if total == 0:
            raise RuntimeError(
                "Template workspace is empty; has TemplateSeed run yet?"
            )

        ctx.logger.debug(
            f"Loaded {total} template entries across {len(tree)} types"
        )
        return tree

    async def _write_all(self, ctx, tree):
        values = []
        for type_name, entries in tree.items():
            for key, value in entries.items():
                values.append((type_name, key, json.dumps(value)))
        if values:
            await ctx.config.put_many(self.workspace, values)
        ctx.logger.info(
            f"Workspace {self.workspace!r} populated with "
            f"{len(values)} entries"
        )

    async def _upsert_missing(self, ctx, tree):
        written = 0
        for type_name, entries in tree.items():
            existing = set(
                await ctx.config.keys(self.workspace, type_name)
            )
            values = []
            for key, value in entries.items():
                if key not in existing:
                    values.append(
                        (type_name, key, json.dumps(value))
                    )
            if values:
                await ctx.config.put_many(self.workspace, values)
                written += len(values)
        ctx.logger.info(
            f"Workspace {self.workspace!r} upsert-missing: "
            f"{written} new entries"
        )
@ -24,6 +24,21 @@ logger = logging.getLogger(__name__)

default_ident = "config-svc"


def is_reserved_workspace(workspace):
    """Reserved workspaces are storage-only.

    Any workspace id beginning with ``_`` is reserved for internal use
    (e.g. ``__template__`` holding factory-default seed config).
    Reads and writes work normally so bootstrap and provisioning code
    can use the standard config API, but **change notifications for
    reserved workspaces are suppressed**.  Services subscribed to the
    config push therefore never see reserved-workspace events and
    cannot accidentally act on template content as if it were live
    state.
    """
    return workspace.startswith("_")


default_config_request_queue = config_request_queue
default_config_response_queue = config_response_queue
default_config_push_queue = config_push_queue
@ -130,6 +145,21 @@ class Processor(AsyncProcessor):

    async def push(self, changes=None):

        # Suppress notifications from reserved workspaces (ids starting
        # with "_", e.g. "__template__").  Stored config is preserved;
        # only the broadcast is filtered.  Keeps services oblivious to
        # template / bootstrap state.
        if changes:
            filtered = {}
            for type_name, workspaces in changes.items():
                visible = [
                    w for w in workspaces
                    if not is_reserved_workspace(w)
                ]
                if visible:
                    filtered[type_name] = visible
            changes = filtered

        version = await self.config.get_version()

        resp = ConfigPush(
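The suppression in `push` is a pure transformation of the changes mapping; a standalone sketch (sample workspace ids are illustrative):

```python
def is_reserved_workspace(workspace):
    # Any "_"-prefixed workspace id is reserved (e.g. "__template__").
    return workspace.startswith("_")

def filter_changes(changes):
    """Drop reserved workspaces from a {type: [workspace, ...]} change set."""
    filtered = {}
    for type_name, workspaces in changes.items():
        visible = [w for w in workspaces if not is_reserved_workspace(w)]
        if visible:
            filtered[type_name] = visible
    return filtered

changes = {
    "prompt": ["default", "__template__"],
    "flow-blueprint": ["__system__"],
}
print(filter_changes(changes))  # -> {'prompt': ['default']}
```

A type whose only changed workspaces are reserved drops out entirely, so subscribers never receive an empty notification for it.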