mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-04-25 00:16:23 +02:00
feat: pluggable bootstrap framework with ordered initialisers (#847)
A generic, long-running bootstrap processor that converges a
deployment to its configured initial state and then idles.
Replaces the previous one-shot `tg-init-trustgraph` container model
and provides an extension point for enterprise / third-party
initialisers.
See docs/tech-specs/bootstrap.md for the full design.
Bootstrapper
------------
A single AsyncProcessor (trustgraph.bootstrap.bootstrapper.Processor)
that:
* Reads a list of initialiser specifications (class, name, flag,
params) from either a direct `initialisers` parameter
(processor-group embedding) or a YAML/JSON file (`-c`, CLI).
* On each wake, runs a cheap service-gate (config-svc +
flow-svc round-trips), then iterates the initialiser list,
running each whose configured flag differs from the one stored
in __system__/init-state/<name>.
* Stores per-initialiser completion state in the reserved
__system__ workspace.
* Adapts cadence: ~5s on gate failure, ~15s while converging,
~300s in steady state.
* Isolates failures — one initialiser's exception does not block
others in the same cycle; the failed one retries next wake.
Initialiser contract
--------------------
* Subclass trustgraph.bootstrap.base.Initialiser.
* Implement async run(ctx, old_flag, new_flag).
* Opt out of the service gate with class attr
wait_for_services=False (only used by PulsarTopology, since
config-svc cannot come up until Pulsar namespaces exist).
* ctx carries short-lived config and flow-svc clients plus a
scoped logger.
Core initialisers (trustgraph.bootstrap.initialisers.*)
-------------------------------------------------------
* PulsarTopology — creates Pulsar tenant + namespaces
(pre-gate, blocking HTTP offloaded to
executor).
* TemplateSeed — seeds __template__ from an external JSON
file; re-run is upsert-missing by default,
overwrite-all opt-in.
* WorkspaceInit — populates a named workspace from either
the full contents of __template__ or a
seed file; raises cleanly if the template
isn't seeded yet so the bootstrapper retries
on the next cycle.
* DefaultFlowStart — starts a specific flow in a workspace;
no-ops if the flow is already running.
Enterprise or third-party initialisers plug in via fully-qualified
dotted class paths in the bootstrapper's configuration — no core
code change required.
Config service
--------------
* push(): filter out reserved workspaces (ids starting with "_")
from the change notifications. Stored config is preserved; only
the broadcast is suppressed, so bootstrap / template state lives
in config-svc without live processors ever reacting to it.
Config client
-------------
* ConfigClient.get_all(workspace): wraps the existing `config`
operation to return {type: {key: value}} for a workspace.
WorkspaceInit uses it to copy __template__ without needing a
hardcoded types list.
pyproject.toml
--------------
* Adds a `bootstrap` console script pointing at the new Processor.
* Removes `tg-init-trustgraph`, superseded by the bootstrap processor
This commit is contained in:
parent
31027e30ae
commit
ae9936c9cc
17 changed files with 1312 additions and 273 deletions
297  docs/tech-specs/bootstrap.md  Normal file
@ -0,0 +1,297 @@
---
layout: default
title: "Bootstrap Framework Technical Specification"
parent: "Tech Specs"
---

# Bootstrap Framework Technical Specification

## Overview

A generic, pluggable framework for running one-time initialisation steps
against a TrustGraph deployment — replacing the dedicated
`tg-init-trustgraph` container with a long-running processor that
converges the system to a desired initial state and then idles.

The framework is content-agnostic. It knows how to run, retry,
mark-as-done, and surface failures; the actual init work lives in
small pluggable classes called **initialisers**. Core initialisers
ship in the `trustgraph-flow` package; enterprise and third-party
initialisers can be loaded by dotted path without any core code
change.

## Motivation

The existing `tg-init-trustgraph` is a one-shot CLI run in its own
container. It performs two very different jobs (Pulsar topology
setup and config seeding) in a single script, is wasteful as a whole
container, cannot handle partial-success states, and has no way to
extend the boot process with enterprise-specific concerns (user
provisioning, workspace initialisation, IAM scaffolding) without
forking the tool.

A pluggable, long-running reconciler addresses all of this and slots
naturally into the existing processor-group model.

## Design

### Bootstrapper Processor

A single `AsyncProcessor` subclass. One entry in a processor group.
Parameters include the processor's own identity and a list of
**initialiser specifications** — each spec names a class (by dotted
path), a unique instance name, a flag string, and the parameters
that will be passed to the initialiser's constructor.

On each wake the bootstrapper does the following, in order:

1. Open a short-lived context (config client, flow-svc client,
   logger). The context is torn down at the end of the wake so
   steady-state idle cost is effectively nil.
2. Run all **pre-service initialisers** (those that opt out of the
   service gate — principally `PulsarTopology`, which must run
   before the services it gates on can even come up).
3. Check the **service gate**: cheap round-trips to config-svc and
   flow-svc. If either fails, skip to the sleep step using the
   short gate-retry cadence.
4. Run all **post-service initialisers** that haven't already
   completed at the currently-configured flag.
5. Sleep. Cadence adapts to state (see below).

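The five steps can be sketched as a single function. This is an illustrative reconstruction, not the actual `Processor` code: `Spec`, `gate_ok`, and the `state` dict are assumptions standing in for the real specification objects and the `__system__/init-state` storage.

```python
import asyncio
from dataclasses import dataclass, field

GATE_BACKOFF, INIT_RETRY, STEADY = 5, 15, 300  # approximate cadence tiers (s)

@dataclass
class Spec:
    """One initialiser specification: class, unique name, flag, ctor params."""
    cls: type
    name: str
    flag: str
    params: dict = field(default_factory=dict)

async def wake_cycle(ctx, gate_ok, specs, state):
    """One bootstrapper wake; returns the sleep before the next wake.

    `state` maps initialiser name -> flag stored at last success."""

    async def attempt(spec):
        try:
            init = spec.cls(**spec.params)
            await init.run(ctx, state.get(spec.name), spec.flag)
            state[spec.name] = spec.flag  # recorded on clean return only
        except Exception as exc:
            # Failure is isolated: logged, flag unchanged, retried next wake.
            ctx.logger.warning("initialiser %s failed: %s", spec.name, exc)

    # Step 2: pre-service initialisers run on every wake, before the gate.
    for spec in specs:
        if not spec.cls.wait_for_services:
            await attempt(spec)

    # Step 3: service gate; on failure, retry on the short cadence.
    if not await gate_ok():
        return GATE_BACKOFF

    # Step 4: post-gate initialisers whose stored flag differs.
    pending = [s for s in specs
               if s.cls.wait_for_services and state.get(s.name) != s.flag]
    for spec in pending:
        await attempt(spec)

    # Step 5: adaptive cadence.
    return INIT_RETRY if pending else STEADY
```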
### Initialiser Contract

An initialiser is a class with:

- A class-level `name` identifier, unique within the bootstrapper's
  configuration. This is the key under which completion state is
  stored.
- A class-level `wait_for_services` flag. When `True` (the default)
  the initialiser runs only after the service gate passes. When
  `False`, it runs before the gate, on every wake.
- A constructor that accepts the initialiser's own params as kwargs.
- An async `run(ctx, old_flag, new_flag)` method that performs the
  init work and returns on success. Any raised exception is
  logged and treated as a transient failure — the stored flag is
  not updated and the initialiser will re-run on the next cycle.

`old_flag` is the previously-stored flag string, or `None` if the
initialiser has never successfully run in this deployment. `new_flag`
is the flag the operator has configured for this run. This pair
lets an initialiser distinguish a clean first-run from a migration
between flag versions and behave accordingly (see "Flag change and
re-run safety" below).

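A minimal initialiser under this contract might look like the sketch below. The base class here is a local stand-in so the example is self-contained; in the real package you would subclass `trustgraph.bootstrap.base.Initialiser`. The workspace/key parameters and the `ctx.config.put` call shape are hypothetical, not the actual client API.

```python
class Initialiser:
    """Local stand-in for trustgraph.bootstrap.base.Initialiser."""
    wait_for_services = True
    def __init__(self, **params):
        pass
    async def run(self, ctx, old_flag, new_flag):
        raise NotImplementedError

class MarkerInit(Initialiser):
    """Hypothetical initialiser: writes a marker key into a workspace."""

    def __init__(self, workspace="default", key="bootstrapped", **params):
        super().__init__(**params)
        self.workspace = workspace
        self.key = key

    async def run(self, ctx, old_flag, new_flag):
        if old_flag is None:
            ctx.logger.info("first run at flag %s", new_flag)
        else:
            # A flag bump: re-apply or migrate from the prior state.
            ctx.logger.info("migrating %s -> %s", old_flag, new_flag)
        # Illustrative write; the real ConfigClient API may differ.
        await ctx.config.put(self.workspace, self.key, new_flag)
```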
### Context

The context is the bootstrapper-owned object passed to every
initialiser's `run()` method. Its fields are deliberately narrow:

| Field | Purpose |
|---|---|
| `logger` | A child logger named for the initialiser instance |
| `config` | A short-lived `ConfigClient` for config-svc reads/writes |
| `flow` | A short-lived `RequestResponse` client for flow-svc |

The context is always fully-populated regardless of which services
a given initialiser uses, for symmetry. Additional fields may be
added in future without breaking existing initialisers. Clients are
started at the beginning of a wake cycle and stopped at the end.

Initialisers that need services beyond config-svc and flow-svc are
responsible for their own readiness checks and for raising cleanly
when a prerequisite is not met.

### Completion State

Per-initialiser completion state is stored in the reserved
`__system__` workspace, under a dedicated config type for bootstrap
state. The stored value is the flag string that was configured when
the initialiser last succeeded.

On each cycle, for each initialiser, the bootstrapper reads the
stored flag and compares it to the currently-configured flag. If
they match, the initialiser is skipped silently. If they differ,
the initialiser runs; on success, the stored flag is updated.

Because the state lives in a reserved (`_`-prefixed) workspace, it
is stored by config-svc but excluded from the config push broadcast.
Live processors never see it and cannot act on it.

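In terms of the `{type: {key: value}}` shape that `ConfigClient.get_all` returns, the skip/run decision reduces to a flag comparison. The `init-state` type name is from this spec; the helper names are illustrative.

```python
SYSTEM_WS = "__system__"    # reserved workspace holding bootstrap state
STATE_TYPE = "init-state"   # config type under which flags are stored

def stored_flag(system_cfg, name):
    """Last successfully-applied flag for `name`, or None if never run.

    `system_cfg` is the {type: {key: value}} mapping for __system__."""
    return system_cfg.get(STATE_TYPE, {}).get(name)

def needs_run(system_cfg, name, configured_flag):
    """Skip silently when flags match; run (then update) when they differ."""
    return stored_flag(system_cfg, name) != configured_flag
```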
### The Service Gate

The gate is a cheap, bootstrapper-internal check that config-svc
and flow-svc are both reachable and responsive. It is intentionally
a simple pair of low-cost round-trips — a config list against
`__system__` and a flow-svc `list-blueprints` — rather than any
deeper health check.

Its purpose is to avoid filling logs with noise and to concentrate
retry effort during the brief window when services are coming up.
The gate is applied only to initialisers with
`wait_for_services=True` (the default); `False` is reserved for
initialisers that set up infrastructure the gate itself depends on.

### Adaptive Cadence

The sleep between wake cycles is chosen from three tiers based on
observed state:

| Tier | Duration | When |
|---|---|---|
| Gate backoff | ~5 s | Services not responding — concentrate retry during startup |
| Init retry | ~15 s | Gate passes but at least one initialiser is not yet at its configured flag — transient failures, waiting on prereqs, recently-bumped flag not yet applied |
| Steady | ~300 s | All configured initialisers at their configured flag; gate passes; nothing to do |

The short tiers ensure a fresh deployment converges quickly;
steady state costs a single round-trip per initialiser every few
minutes.

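The tier choice reduces to a two-input function. A sketch: the 5/15/300 second values are the approximate tiers from the table, not exact constants from the implementation.

```python
GATE_BACKOFF, INIT_RETRY, STEADY = 5.0, 15.0, 300.0  # seconds, approximate

def next_sleep(gate_passed, any_pending):
    """Pick the cadence tier for the next wake cycle."""
    if not gate_passed:
        return GATE_BACKOFF          # concentrate retries during startup
    return INIT_RETRY if any_pending else STEADY
```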
### Failure Handling

An initialiser raising an exception does not stop the bootstrapper
or block other initialisers. Each initialiser in the cycle is
attempted independently; failures are logged and retried on the next
cycle. This means there is no ordered-DAG enforcement: order of
initialisers in the configuration determines the attempt order
within a cycle, but a dependency between two initialisers is
expressed by the dependant raising cleanly when its prerequisite
isn't satisfied. Over successive cycles the system converges.

### Flag Change and Re-run Safety

Each initialiser's completion state is a string flag chosen by the
operator. Typically these follow a simple version pattern
(`v1`, `v2`, ...), but the bootstrapper imposes no format.

Changing the flag in the group configuration causes the
corresponding initialiser to re-run on the next cycle. Initialisers
must be written so that re-running after a flag bump is safe — they
receive both the previous and the new flag and are responsible for
either cleanly re-applying the work or performing a step-change
migration from the prior state.

This gives operators an explicit, visible mechanism for triggering
re-initialisation. Re-runs are never implicit.

## Core Initialisers

The following initialisers ship in `trustgraph.bootstrap.initialisers`
and cover the base deployment case.

### PulsarTopology

Creates the Pulsar tenant and the four namespaces
(`flow`, `request`, `response`, `notify`) with appropriate
retention policies if they don't exist.

Opts out of the service gate (`wait_for_services = False`) because
config-svc and flow-svc cannot come online until the Pulsar
namespaces exist.

Parameters: Pulsar admin URL, tenant name.

Idempotent via the admin API (GET-then-PUT). Flag change causes
re-evaluation of all namespaces; any absent are created.

### TemplateSeed

Populates the reserved `__template__` workspace from an external
JSON seed file. The seed file has the standard shape of
`{config-type: {config-key: value}}`.

Runs post-gate. Parameters: path to the seed file, overwrite
policy (upsert-missing only, or overwrite-all).

On clean run, writes the whole file. On flag change, behaviour
depends on the overwrite policy — typically upsert-missing so
that operator-customised keys are preserved across seed-file
upgrades.

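The two overwrite policies can be sketched as a pure merge over the `{config-type: {config-key: value}}` shape. This is illustrative; the function and parameter names are not taken from the implementation.

```python
def merge_seed(existing, seed, overwrite=False):
    """Merge a seed file's {type: {key: value}} contents into existing config.

    Default (upsert-missing): only keys absent from `existing` are added,
    so operator-customised keys survive seed-file upgrades. With
    overwrite=True, every seed key is (re)written."""
    merged = {t: dict(keys) for t, keys in existing.items()}
    for t, keys in seed.items():
        dst = merged.setdefault(t, {})
        for k, v in keys.items():
            if overwrite or k not in dst:
                dst[k] = v
    return merged
```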
### WorkspaceInit

Creates a named workspace and populates it from the seed file or
from the full contents of the `__template__` workspace.

Runs post-gate. Parameters: workspace name, source (seed file or
`__template__`), optional `seed_file` path, `overwrite` flag.

When `source` is `template`, the initialiser copies every config
type and key present in `__template__` — there is no per-type
selection. Deployments that want to seed only a subset should
either curate the seed file they feed to `TemplateSeed` or use
`source: seed-file` directly here.

Raises cleanly if its source does not exist — depends on
`TemplateSeed` having run in the same cycle or a prior one.

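The copy-from-template path can be sketched against `ConfigClient.get_all`. The batch `put` call here is hypothetical (the real write API may differ); raising on an empty template is the retry-next-cycle behaviour described above.

```python
async def copy_template(config, workspace):
    """Populate `workspace` from the full contents of __template__.

    `config` is assumed to expose get_all() as added in this commit,
    plus a hypothetical batch put(); the real write API may differ."""
    template = await config.get_all("__template__")  # {type: {key: value}}
    if not template:
        # Template not seeded yet: raise cleanly so the bootstrapper
        # retries on a later cycle, after TemplateSeed has run.
        raise RuntimeError("__template__ not seeded yet")
    batch = [
        {"type": t, "key": k, "value": v}
        for t, keys in template.items() for k, v in keys.items()
    ]
    await config.put(workspace, batch)
    return len(batch)
```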
### DefaultFlowStart

Starts a specific flow in a specific workspace using a specific
blueprint.

Runs post-gate. Parameters: workspace name, flow id, blueprint
name, description, optional parameter overrides.

Separated from `WorkspaceInit` deliberately so that deployments
which want a workspace without an auto-started flow can simply omit
this initialiser from their bootstrap configuration.

## Extensibility

New initialisers are added by:

1. Subclassing the initialiser base class.
2. Implementing `run(ctx, old_flag, new_flag)`.
3. Choosing `wait_for_services` (almost always `True`).
4. Adding an entry in the bootstrapper's configuration with the new
   class's dotted path.

No core code changes are required to add an enterprise or third-party
initialiser. Enterprise builds ship their own package with their own
initialiser classes (e.g. `CreateAdminUser`, `ProvisionWorkspaces`)
and reference them in the bootstrapper config alongside the core
initialisers.

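Resolving a fully-qualified dotted class path is the standard `importlib` idiom; a sketch (the bootstrapper's actual loader may differ in detail, and the enterprise path in the docstring is hypothetical):

```python
from importlib import import_module

def load_class(dotted_path):
    """Resolve e.g. 'acme_tg.initialisers.CreateAdminUser' to a class."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"not a dotted path: {dotted_path!r}")
    return getattr(import_module(module_name), class_name)
```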
## Reserved Workspaces

This specification relies on the "reserved workspace" convention:

- Any workspace id beginning with `_` is reserved.
- Reserved workspaces are stored normally by config-svc but never
  appear in the config push broadcast.
- Live processors cannot react to reserved-workspace state.

The bootstrapper uses two reserved workspaces:

- `__template__` — factory-default seed config, readable by
  initialisers that copy-from-template.
- `__system__` — bootstrapper completion state (under the
  `init-state` config type) and any other system-internal bookkeeping.

See the reserved-workspace convention in the config service for
the general rule and its enforcement.

## Non-Goals

- No DAG scheduling across initialisers. Dependencies are expressed
  by the dependant failing cleanly until its prerequisite is met,
  and convergence over subsequent cycles.
- No parallel execution of initialisers within a cycle. A cycle runs
  each initialiser sequentially.
- No implicit re-runs. Re-running an initialiser requires an explicit
  flag change by the operator.
- No cross-initialiser atomicity. Each initialiser's completion is
  recorded independently on its own success.

## Operational Notes

- Running the bootstrapper as a processor-group entry replaces the
  previous `tg-init-trustgraph` container. The bootstrapper is also
  CLI-invocable directly for standalone testing via
  `Processor.launch(...)`.
- First-boot convergence is typically a handful of short cycles
  followed by a transition to the steady cadence. Deployments
  should expect the first few minutes of logs to show
  initialisation activity, thereafter effective silence.
- Bumping a flag is a deliberate operational act. The log line
  emitted on re-run makes the event visible for audit.

@ -848,7 +848,6 @@ service, not in the config service. Reasons:
- **API key scoping.** API keys could be scoped to specific collections
  within a workspace rather than granting workspace-wide access. To be
  designed when the need arises.
- **tg-init-trustgraph** only initialises a single workspace.

## References

@ -84,6 +84,18 @@ class ConfigClient(RequestResponse):
        )
        return resp.directory

    async def get_all(self, workspace, timeout=CONFIG_TIMEOUT):
        """Return every config entry in ``workspace`` as a nested dict
        ``{type: {key: value}}``. Values are returned as the raw
        strings stored by config-svc (typically JSON); callers parse
        as needed. An empty dict means the workspace has no config."""
        resp = await self._request(
            operation="config",
            workspace=workspace,
            timeout=timeout,
        )
        return resp.config

    async def workspaces_for_type(self, type, timeout=CONFIG_TIMEOUT):
        """Return the set of distinct workspaces with any config of
        the given type."""

@ -40,7 +40,6 @@ tg-get-flow-blueprint = "trustgraph.cli.get_flow_blueprint:main"
tg-get-kg-core = "trustgraph.cli.get_kg_core:main"
tg-get-document-content = "trustgraph.cli.get_document_content:main"
tg-graph-to-turtle = "trustgraph.cli.graph_to_turtle:main"
tg-init-trustgraph = "trustgraph.cli.init_trustgraph:main"
tg-invoke-agent = "trustgraph.cli.invoke_agent:main"
tg-invoke-document-rag = "trustgraph.cli.invoke_document_rag:main"
tg-invoke-graph-rag = "trustgraph.cli.invoke_graph_rag:main"

@ -1,271 +0,0 @@
"""
Initialises TrustGraph pub/sub infrastructure and pushes initial config.

For Pulsar: creates tenant, namespaces, and retention policies.
For RabbitMQ: queues are auto-declared, so only config push is needed.
"""

import requests
import time
import argparse
import json

from trustgraph.clients.config_client import ConfigClient
from trustgraph.base.pubsub import add_pubsub_args

default_pulsar_admin_url = "http://pulsar:8080"
subscriber = "tg-init-pubsub"


def get_clusters(url):

    print("Get clusters...", flush=True)

    resp = requests.get(f"{url}/admin/v2/clusters")

    if resp.status_code != 200: raise RuntimeError("Could not fetch clusters")

    return resp.json()

def ensure_tenant(url, tenant, clusters):

    resp = requests.get(f"{url}/admin/v2/tenants/{tenant}")

    if resp.status_code == 200:
        print(f"Tenant {tenant} already exists.", flush=True)
        return

    resp = requests.put(
        f"{url}/admin/v2/tenants/{tenant}",
        json={
            "adminRoles": [],
            "allowedClusters": clusters,
        }
    )

    if resp.status_code != 204:
        print(resp.text, flush=True)
        raise RuntimeError("Tenant creation failed.")

    print(f"Tenant {tenant} created.", flush=True)

def ensure_namespace(url, tenant, namespace, config):

    resp = requests.get(f"{url}/admin/v2/namespaces/{tenant}/{namespace}")

    if resp.status_code == 200:
        print(f"Namespace {tenant}/{namespace} already exists.", flush=True)
        return

    resp = requests.put(
        f"{url}/admin/v2/namespaces/{tenant}/{namespace}",
        json=config,
    )

    if resp.status_code != 204:
        print(resp.status_code, flush=True)
        print(resp.text, flush=True)
        raise RuntimeError(f"Namespace {tenant}/{namespace} creation failed.")

    print(f"Namespace {tenant}/{namespace} created.", flush=True)

def ensure_config(config, workspace="default", **pubsub_config):

    cli = ConfigClient(
        subscriber=subscriber,
        workspace=workspace,
        **pubsub_config,
    )

    while True:

        try:
            print("Get current config...", flush=True)
            current, version = cli.config(timeout=5)
        except Exception as e:
            print("Exception:", e, flush=True)
            time.sleep(2)
            print("Retrying...", flush=True)
            continue

        print("Current config version is", version, flush=True)

        if version != 0:
            print("Already updated, not updating config. Done.", flush=True)
            return

        print("Config is version 0, updating...", flush=True)

        batch = []

        for type in config:
            for key in config[type]:
                print(f"Adding {type}/{key} to update.", flush=True)
                batch.append({
                    "type": type,
                    "key": key,
                    "value": json.dumps(config[type][key]),
                })

        try:
            cli.put(batch, timeout=10)
            print("Update succeeded.", flush=True)
            break
        except Exception as e:
            print("Exception:", e, flush=True)
            time.sleep(2)
            print("Retrying...", flush=True)
            continue

def init_pulsar(pulsar_admin_url, tenant):
    """Pulsar-specific setup: create tenant, namespaces, retention policies."""

    clusters = get_clusters(pulsar_admin_url)

    ensure_tenant(pulsar_admin_url, tenant, clusters)

    ensure_namespace(pulsar_admin_url, tenant, "flow", {})

    ensure_namespace(pulsar_admin_url, tenant, "request", {})

    ensure_namespace(pulsar_admin_url, tenant, "response", {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 30,
        }
    })

    ensure_namespace(pulsar_admin_url, tenant, "notify", {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 5,
        }
    })


def push_config(config_json, config_file, workspace="default",
                **pubsub_config):
    """Push initial config if provided."""

    if config_json is not None:

        try:
            print("Decoding config...", flush=True)
            dec = json.loads(config_json)
            print("Decoded.", flush=True)
        except Exception as e:
            print("Exception:", e, flush=True)
            raise e

        ensure_config(dec, workspace=workspace, **pubsub_config)

    elif config_file is not None:

        try:
            print("Decoding config...", flush=True)
            dec = json.load(open(config_file))
            print("Decoded.", flush=True)
        except Exception as e:
            print("Exception:", e, flush=True)
            raise e

        ensure_config(dec, workspace=workspace, **pubsub_config)

    else:
        print("No config to update.", flush=True)


def main():

    parser = argparse.ArgumentParser(
        prog='tg-init-trustgraph',
        description=__doc__,
    )

    parser.add_argument(
        '--pulsar-admin-url',
        default=default_pulsar_admin_url,
        help=f'Pulsar admin URL (default: {default_pulsar_admin_url})',
    )

    parser.add_argument(
        '-c', '--config',
        help='Initial configuration to load',
    )

    parser.add_argument(
        '-C', '--config-file',
        help='Initial configuration to load from file',
    )

    parser.add_argument(
        '-t', '--tenant',
        default="tg",
        help='Tenant (default: tg)',
    )

    parser.add_argument(
        '-w', '--workspace',
        default="default",
        help='Workspace (default: default)',
    )

    add_pubsub_args(parser)

    args = parser.parse_args()

    backend_type = args.pubsub_backend

    # Extract pubsub config from args
    pubsub_config = {
        k: v for k, v in vars(args).items()
        if k not in (
            'pulsar_admin_url', 'config', 'config_file', 'tenant',
            'workspace',
        )
    }

    while True:

        try:

            # Pulsar-specific setup (tenants, namespaces)
            if backend_type == 'pulsar':
                print(flush=True)
                print(
                    f"Initialising Pulsar at {args.pulsar_admin_url}...",
                    flush=True,
                )
                init_pulsar(args.pulsar_admin_url, args.tenant)
            else:
                print(flush=True)
                print(
                    f"Using {backend_type} backend (no admin setup needed).",
                    flush=True,
                )

            # Push config (works with any backend)
            push_config(
                args.config, args.config_file,
                workspace=args.workspace,
                **pubsub_config,
            )

            print("Initialisation complete.", flush=True)
            break

        except Exception as e:

            print("Exception:", e, flush=True)

            print("Sleeping...", flush=True)
            time.sleep(2)
            print("Will retry...", flush=True)

if __name__ == "__main__":
    main()

@ -60,6 +60,7 @@ agent-orchestrator = "trustgraph.agent.orchestrator:run"
|
||||||
api-gateway = "trustgraph.gateway:run"
|
api-gateway = "trustgraph.gateway:run"
|
||||||
chunker-recursive = "trustgraph.chunking.recursive:run"
|
chunker-recursive = "trustgraph.chunking.recursive:run"
|
||||||
chunker-token = "trustgraph.chunking.token:run"
|
chunker-token = "trustgraph.chunking.token:run"
|
||||||
|
bootstrap = "trustgraph.bootstrap.bootstrapper:run"
|
||||||
config-svc = "trustgraph.config.service:run"
|
config-svc = "trustgraph.config.service:run"
|
||||||
flow-svc = "trustgraph.flow.service:run"
|
flow-svc = "trustgraph.flow.service:run"
|
||||||
doc-embeddings-query-milvus = "trustgraph.query.doc_embeddings.milvus:run"
|
doc-embeddings-query-milvus = "trustgraph.query.doc_embeddings.milvus:run"
|
||||||
|
|
|
||||||
0  trustgraph-flow/trustgraph/bootstrap/__init__.py  Normal file
68  trustgraph-flow/trustgraph/bootstrap/base.py  Normal file
@@ -0,0 +1,68 @@
"""
Bootstrap framework: Initialiser base class and per-wake context.

See docs/tech-specs/bootstrap.md for the full design.
"""

import logging

from dataclasses import dataclass
from typing import Any


@dataclass
class InitContext:
    """Shared per-wake context passed to each initialiser.

    The bootstrapper constructs one of these on every wake cycle,
    tears it down at cycle end, and passes it into each initialiser's
    ``run()`` method. Fields are short-lived and safe to use during
    a single cycle only.
    """

    logger: logging.Logger
    config: Any   # ConfigClient
    flow: Any     # RequestResponse client for flow-svc


class Initialiser:
    """Base class for bootstrap initialisers.

    Subclasses implement :meth:`run`. The bootstrapper manages
    completion state, flag comparison, retry and error handling —
    subclasses describe only the work to perform.

    Class attributes:

    * ``wait_for_services`` (bool, default ``True``): when ``True`` the
      initialiser only runs after the bootstrapper's service gate has
      passed (config-svc and flow-svc reachable). Set ``False`` for
      initialisers that bring up infrastructure the gate itself
      depends on — principally Pulsar topology, without which
      config-svc cannot come online.
    """

    wait_for_services: bool = True

    def __init__(self, **params):
        # Subclasses should consume their own params via keyword
        # arguments in their own __init__ signatures. This catch-all
        # is here so any kwargs that filter through unnoticed don't
        # raise TypeError on construction.
        pass

    async def run(self, ctx, old_flag, new_flag):
        """Perform initialisation work.

        :param ctx: :class:`InitContext` with logger, config client,
            flow-svc client.
        :param old_flag: Previously-stored flag string, or ``None`` if
            this initialiser has never successfully completed in this
            deployment.
        :param new_flag: Currently-configured flag. A string chosen
            by the operator; typically something like ``"v1"``.

        :raises: Any exception on failure. The bootstrapper catches,
            logs, and re-runs on the next cycle; completion state is
            only written on clean return.
        """
        raise NotImplementedError
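The initialiser contract can be exercised with a minimal standalone sketch. The base class and context are reproduced inline so it runs on its own; in a real deployment you would import them from `trustgraph.bootstrap.base`. `HelloInit` and its `message` parameter are hypothetical, purely for illustration.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Any

@dataclass
class InitContext:
    logger: logging.Logger
    config: Any
    flow: Any

class Initialiser:
    wait_for_services = True
    def __init__(self, **params):
        pass  # catch-all, as in base.py above
    async def run(self, ctx, old_flag, new_flag):
        raise NotImplementedError

class HelloInit(Initialiser):
    """Hypothetical initialiser: records the flag transition it saw."""
    def __init__(self, message="hello", **kwargs):
        super().__init__(**kwargs)
        self.message = message
        self.seen = None
    async def run(self, ctx, old_flag, new_flag):
        # old_flag is None on first-ever run; new_flag is the
        # operator-chosen target flag (e.g. "v1").
        ctx.logger.info("running %s", self.message)
        self.seen = (old_flag, new_flag)

ctx = InitContext(logger=logging.getLogger("demo"), config=None, flow=None)
init = HelloInit(message="hi")
asyncio.run(init.run(ctx, None, "v1"))
print(init.seen)  # (None, 'v1')
```

Note that the subclass never touches completion state: the bootstrapper compares and persists flags around the call.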
@@ -0,0 +1 @@
from . service import *
@@ -0,0 +1,6 @@
#!/usr/bin/env python3

from . service import run

if __name__ == '__main__':
    run()
414  trustgraph-flow/trustgraph/bootstrap/bootstrapper/service.py   Normal file
@@ -0,0 +1,414 @@
"""
Bootstrapper processor.

Runs a pluggable list of initialisers in a reconciliation loop.
Each initialiser's completion state is recorded in the reserved
``__system__`` workspace under the ``init-state`` config type.

See docs/tech-specs/bootstrap.md for the full design.
"""

import asyncio
import importlib
import json
import logging
import uuid
from argparse import ArgumentParser
from dataclasses import dataclass

from trustgraph.base import AsyncProcessor
from trustgraph.base import ProducerMetrics, SubscriberMetrics
from trustgraph.base.config_client import ConfigClient
from trustgraph.base.request_response_spec import RequestResponse
from trustgraph.schema import (
    ConfigRequest, ConfigResponse,
    config_request_queue, config_response_queue,
)
from trustgraph.schema import (
    FlowRequest, FlowResponse,
    flow_request_queue, flow_response_queue,
)

from .. base import Initialiser, InitContext

logger = logging.getLogger(__name__)

default_ident = "bootstrap"

# Reserved workspace + config type under which completion state is
# stored. Reserved (`_`-prefix) workspaces are excluded from the
# config push broadcast — live processors never see these keys.
SYSTEM_WORKSPACE = "__system__"
INIT_STATE_TYPE = "init-state"

# Cadence tiers.
GATE_BACKOFF = 5        # Services not responding; retry soon.
INIT_RETRY = 15         # Gate passed but something ran/failed;
                        # converge quickly.
STEADY_INTERVAL = 300   # Everything at target flag; idle cheaply.

@dataclass
class InitialiserSpec:
    """One entry in the bootstrapper's configured list of initialisers."""
    name: str
    flag: str
    instance: Initialiser


def _resolve_class(dotted):
    """Import and return a class by its dotted path."""
    module_path, _, class_name = dotted.rpartition(".")
    if not module_path:
        raise ValueError(
            f"Initialiser class must be a dotted path, got {dotted!r}"
        )
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def _load_initialisers_file(path):
    """Load the initialisers spec list from a YAML or JSON file.

    File shape:

    .. code-block:: yaml

        initialisers:
          - class: trustgraph.bootstrap.initialisers.PulsarTopology
            name: pulsar-topology
            flag: v1
            params:
              admin_url: http://pulsar:8080
              tenant: tg
          - ...
    """
    with open(path) as f:
        content = f.read()
    if path.endswith((".yaml", ".yml")):
        import yaml
        doc = yaml.safe_load(content)
    else:
        doc = json.loads(content)
    if not isinstance(doc, dict) or "initialisers" not in doc:
        raise RuntimeError(
            f"{path}: expected a mapping with an 'initialisers' key"
        )
    return doc["initialisers"]

class Processor(AsyncProcessor):

    def __init__(self, **params):

        super().__init__(**params)

        # Source the initialisers list either from a direct parameter
        # (processor-group embedding) or from a file (CLI launch).
        inits = params.get("initialisers")
        if inits is None:
            inits_file = params.get("initialisers_file")
            if inits_file is None:
                raise RuntimeError(
                    "Bootstrapper requires either the 'initialisers' "
                    "parameter or --initialisers-file"
                )
            inits = _load_initialisers_file(inits_file)

        self.specs = []
        names = set()

        for entry in inits:
            if not isinstance(entry, dict):
                raise RuntimeError(
                    f"Initialiser entry must be a mapping, got: {entry!r}"
                )
            for required in ("class", "name", "flag"):
                if required not in entry:
                    raise RuntimeError(
                        f"Initialiser entry missing required field "
                        f"{required!r}: {entry!r}"
                    )

            name = entry["name"]
            if name in names:
                raise RuntimeError(f"Duplicate initialiser name {name!r}")
            names.add(name)

            cls = _resolve_class(entry["class"])

            try:
                instance = cls(**entry.get("params", {}))
            except Exception as e:
                raise RuntimeError(
                    f"Failed to instantiate initialiser "
                    f"{entry['class']!r} as {name!r}: "
                    f"{type(e).__name__}: {e}"
                ) from e

            self.specs.append(InitialiserSpec(
                name=name,
                flag=entry["flag"],
                instance=instance,
            ))

        logger.info(
            f"Bootstrapper: loaded {len(self.specs)} initialisers"
        )

    # ------------------------------------------------------------------
    # Client construction (short-lived per wake cycle).
    # ------------------------------------------------------------------

    def _make_config_client(self):
        rr_id = str(uuid.uuid4())
        return ConfigClient(
            backend=self.pubsub_backend,
            subscription=f"{self.id}--config--{rr_id}",
            consumer_name=self.id,
            request_topic=config_request_queue,
            request_schema=ConfigRequest,
            request_metrics=ProducerMetrics(
                processor=self.id, flow=None, name="config-request",
            ),
            response_topic=config_response_queue,
            response_schema=ConfigResponse,
            response_metrics=SubscriberMetrics(
                processor=self.id, flow=None, name="config-response",
            ),
        )

    def _make_flow_client(self):
        rr_id = str(uuid.uuid4())
        return RequestResponse(
            backend=self.pubsub_backend,
            subscription=f"{self.id}--flow--{rr_id}",
            consumer_name=self.id,
            request_topic=flow_request_queue,
            request_schema=FlowRequest,
            request_metrics=ProducerMetrics(
                processor=self.id, flow=None, name="flow-request",
            ),
            response_topic=flow_response_queue,
            response_schema=FlowResponse,
            response_metrics=SubscriberMetrics(
                processor=self.id, flow=None, name="flow-response",
            ),
        )

    async def _open_clients(self):
        config = self._make_config_client()
        flow = self._make_flow_client()
        await config.start()
        try:
            await flow.start()
        except Exception:
            await self._safe_stop(config)
            raise
        return config, flow

    async def _safe_stop(self, client):
        try:
            await client.stop()
        except Exception:
            pass

    # ------------------------------------------------------------------
    # Service gate.
    # ------------------------------------------------------------------

    async def _gate_ready(self, config, flow):
        try:
            await config.keys(SYSTEM_WORKSPACE, INIT_STATE_TYPE)
        except Exception as e:
            logger.info(
                f"Gate: config-svc not ready ({type(e).__name__}: {e})"
            )
            return False

        try:
            resp = await flow.request(
                FlowRequest(
                    operation="list-blueprints",
                    workspace=SYSTEM_WORKSPACE,
                ),
                timeout=5,
            )
            if resp.error:
                logger.info(
                    f"Gate: flow-svc error: "
                    f"{resp.error.type}: {resp.error.message}"
                )
                return False
        except Exception as e:
            logger.info(
                f"Gate: flow-svc not ready ({type(e).__name__}: {e})"
            )
            return False

        return True

    # ------------------------------------------------------------------
    # Completion state.
    # ------------------------------------------------------------------

    async def _stored_flag(self, config, name):
        raw = await config.get(SYSTEM_WORKSPACE, INIT_STATE_TYPE, name)
        if raw is None:
            return None
        try:
            return json.loads(raw)
        except Exception:
            return raw

    async def _store_flag(self, config, name, flag):
        await config.put(
            SYSTEM_WORKSPACE, INIT_STATE_TYPE, name,
            json.dumps(flag),
        )

    # ------------------------------------------------------------------
    # Per-spec execution.
    # ------------------------------------------------------------------

    async def _run_spec(self, spec, config, flow):
        """Run a single initialiser spec.

        Returns one of:

        - ``"skip"``: stored flag already matches target, nothing to do.
        - ``"ran"``: initialiser ran and completion state was updated.
        - ``"failed"``: initialiser raised.
        - ``"failed-state-write"``: initialiser succeeded but we could
          not persist the new flag (transient — will re-run next cycle).
        """

        try:
            old_flag = await self._stored_flag(config, spec.name)
        except Exception as e:
            logger.warning(
                f"{spec.name}: could not read stored flag "
                f"({type(e).__name__}: {e})"
            )
            return "failed"

        if old_flag == spec.flag:
            return "skip"

        child_logger = logger.getChild(spec.name)
        child_ctx = InitContext(
            logger=child_logger,
            config=config,
            flow=flow,
        )

        child_logger.info(
            f"Running (old_flag={old_flag!r} -> new_flag={spec.flag!r})"
        )

        try:
            await spec.instance.run(child_ctx, old_flag, spec.flag)
        except Exception as e:
            child_logger.error(
                f"Failed: {type(e).__name__}: {e}", exc_info=True,
            )
            return "failed"

        try:
            await self._store_flag(config, spec.name, spec.flag)
        except Exception as e:
            child_logger.warning(
                f"Completed but could not persist state flag "
                f"({type(e).__name__}: {e}); will re-run next cycle"
            )
            return "failed-state-write"

        child_logger.info(f"Completed (flag={spec.flag!r})")
        return "ran"

    # ------------------------------------------------------------------
    # Main loop.
    # ------------------------------------------------------------------

    async def run(self):

        logger.info(
            f"Bootstrapper starting with {len(self.specs)} initialisers"
        )

        while self.running:

            sleep_for = STEADY_INTERVAL

            try:
                config, flow = await self._open_clients()
            except Exception as e:
                logger.info(
                    f"Failed to open clients "
                    f"({type(e).__name__}: {e}); retry in {GATE_BACKOFF}s"
                )
                await asyncio.sleep(GATE_BACKOFF)
                continue

            try:
                # Phase 1: pre-service initialisers run unconditionally.
                pre_specs = [
                    s for s in self.specs
                    if not s.instance.wait_for_services
                ]
                pre_results = {}
                for spec in pre_specs:
                    pre_results[spec.name] = await self._run_spec(
                        spec, config, flow,
                    )

                # Phase 2: gate.
                gate_ok = await self._gate_ready(config, flow)

                # Phase 3: post-service initialisers, if gate passed.
                post_results = {}
                if gate_ok:
                    post_specs = [
                        s for s in self.specs
                        if s.instance.wait_for_services
                    ]
                    for spec in post_specs:
                        post_results[spec.name] = await self._run_spec(
                            spec, config, flow,
                        )

                # Cadence selection.
                if not gate_ok:
                    sleep_for = GATE_BACKOFF
                else:
                    all_results = {**pre_results, **post_results}
                    if any(r != "skip" for r in all_results.values()):
                        sleep_for = INIT_RETRY
                    else:
                        sleep_for = STEADY_INTERVAL

            finally:
                await self._safe_stop(config)
                await self._safe_stop(flow)

            await asyncio.sleep(sleep_for)

    # ------------------------------------------------------------------
    # CLI arg plumbing.
    # ------------------------------------------------------------------

    @staticmethod
    def add_args(parser: ArgumentParser) -> None:

        AsyncProcessor.add_args(parser)

        parser.add_argument(
            '-c', '--initialisers-file',
            help='Path to YAML or JSON file describing the '
                 'initialisers to run. Ignored when the '
                 "'initialisers' parameter is provided directly "
                 '(e.g. when running inside a processor group).',
        )


def run():
    Processor.launch(default_ident, __doc__)

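The spec-file handling above can be sketched standalone using the JSON branch (so no YAML dependency is needed). The parsing and validation below mirror `_load_initialisers_file` and the `Processor.__init__` required-field checks; the spec contents are the same example shown in the docstring.

```python
import json
import os
import tempfile

# Same file shape _load_initialisers_file expects, JSON variant.
spec = {
    "initialisers": [
        {
            "class": "trustgraph.bootstrap.initialisers.PulsarTopology",
            "name": "pulsar-topology",
            "flag": "v1",
            "params": {"admin_url": "http://pulsar:8080", "tenant": "tg"},
        },
    ],
}

# Write it to a temporary .json file, as an operator would via -c.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump(spec, f)

# Parse and validate, mirroring the bootstrapper's checks.
with open(path) as f:
    doc = json.load(f)
assert isinstance(doc, dict) and "initialisers" in doc
entries = doc["initialisers"]
for entry in entries:
    for required in ("class", "name", "flag"):
        assert required in entry, f"missing {required!r}"

print([e["name"] for e in entries])  # ['pulsar-topology']
os.unlink(path)
```

The real loader goes one step further and imports each `class` by dotted path with `importlib` before instantiating it with `params`.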
@@ -0,0 +1,20 @@
"""
Core bootstrap initialisers.

These cover the base TrustGraph deployment case. Enterprise or
third-party initialisers live in their own packages and are
referenced in the bootstrapper's config by fully-qualified dotted
path.
"""

from . pulsar_topology import PulsarTopology
from . template_seed import TemplateSeed
from . workspace_init import WorkspaceInit
from . default_flow_start import DefaultFlowStart

__all__ = [
    "PulsarTopology",
    "TemplateSeed",
    "WorkspaceInit",
    "DefaultFlowStart",
]

@@ -0,0 +1,101 @@
"""
DefaultFlowStart initialiser — starts a named flow in a workspace
using a specified blueprint.

Separated from WorkspaceInit so deployments that want a workspace
without an auto-started flow can simply omit this initialiser.

Parameters
----------
workspace : str (default "default")
    Workspace in which to start the flow.
flow_id : str (default "default")
    Identifier for the started flow.
blueprint : str (required)
    Blueprint name (must already exist in the workspace's config,
    typically via TemplateSeed -> WorkspaceInit).
description : str (default "Default")
    Human-readable description passed to flow-svc.
parameters : dict (optional)
    Optional parameter overrides passed to start-flow.
"""

from trustgraph.schema import FlowRequest

from .. base import Initialiser


class DefaultFlowStart(Initialiser):

    def __init__(
        self,
        workspace="default",
        flow_id="default",
        blueprint=None,
        description="Default",
        parameters=None,
        **kwargs,
    ):
        super().__init__(**kwargs)
        if not blueprint:
            raise ValueError(
                "DefaultFlowStart requires 'blueprint'"
            )
        self.workspace = workspace
        self.flow_id = flow_id
        self.blueprint = blueprint
        self.description = description
        self.parameters = dict(parameters) if parameters else {}

    async def run(self, ctx, old_flag, new_flag):

        # Check whether the flow already exists. Belt-and-braces
        # beyond the flag gate: if an operator stops and restarts the
        # bootstrapper after the flow is already running, we don't
        # want to blindly try to start it again.
        list_resp = await ctx.flow.request(
            FlowRequest(
                operation="list-flows",
                workspace=self.workspace,
            ),
            timeout=10,
        )
        if list_resp.error:
            raise RuntimeError(
                f"list-flows failed: "
                f"{list_resp.error.type}: {list_resp.error.message}"
            )

        if self.flow_id in (list_resp.flow_ids or []):
            ctx.logger.info(
                f"Flow {self.flow_id!r} already running in workspace "
                f"{self.workspace!r}; nothing to do"
            )
            return

        ctx.logger.info(
            f"Starting flow {self.flow_id!r} "
            f"(blueprint={self.blueprint!r}) "
            f"in workspace {self.workspace!r}"
        )

        resp = await ctx.flow.request(
            FlowRequest(
                operation="start-flow",
                workspace=self.workspace,
                flow_id=self.flow_id,
                blueprint_name=self.blueprint,
                description=self.description,
                parameters=self.parameters,
            ),
            timeout=30,
        )
        if resp.error:
            raise RuntimeError(
                f"start-flow failed: "
                f"{resp.error.type}: {resp.error.message}"
            )

        ctx.logger.info(
            f"Flow {self.flow_id!r} started"
        )

@@ -0,0 +1,131 @@
"""
PulsarTopology initialiser — creates Pulsar tenant and namespaces
with their retention policies.

Runs pre-gate (``wait_for_services = False``) because config-svc and
flow-svc can't connect to Pulsar until these namespaces exist.
Admin-API calls are idempotent so re-runs on flag change are safe.
"""

import asyncio
import requests

from .. base import Initialiser

# Namespace configs. flow/request take broker defaults. response
# and notify get aggressive retention — those classes carry short-lived
# request/response and notification traffic only.
NAMESPACE_CONFIG = {
    "flow": {},
    "request": {},
    "response": {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 30,
        },
    },
    "notify": {
        "retention_policies": {
            "retentionSizeInMB": -1,
            "retentionTimeInMinutes": 3,
            "subscriptionExpirationTimeMinutes": 5,
        },
    },
}

REQUEST_TIMEOUT = 10


class PulsarTopology(Initialiser):

    wait_for_services = False

    def __init__(
        self,
        admin_url="http://pulsar:8080",
        tenant="tg",
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.admin_url = admin_url.rstrip("/")
        self.tenant = tenant

    async def run(self, ctx, old_flag, new_flag):
        # requests is blocking; offload to executor so the loop stays
        # responsive.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, self._reconcile_sync, ctx.logger)

    # ------------------------------------------------------------------
    # Sync admin-API calls.
    # ------------------------------------------------------------------

    def _get_clusters(self):
        resp = requests.get(
            f"{self.admin_url}/admin/v2/clusters",
            timeout=REQUEST_TIMEOUT,
        )
        resp.raise_for_status()
        return resp.json()

    def _tenant_exists(self):
        resp = requests.get(
            f"{self.admin_url}/admin/v2/tenants/{self.tenant}",
            timeout=REQUEST_TIMEOUT,
        )
        return resp.status_code == 200

    def _create_tenant(self, clusters):
        resp = requests.put(
            f"{self.admin_url}/admin/v2/tenants/{self.tenant}",
            json={"adminRoles": [], "allowedClusters": clusters},
            timeout=REQUEST_TIMEOUT,
        )
        if resp.status_code != 204:
            raise RuntimeError(
                f"Tenant {self.tenant!r} create failed: "
                f"{resp.status_code} {resp.text}"
            )

    def _namespace_exists(self, namespace):
        resp = requests.get(
            f"{self.admin_url}/admin/v2/namespaces/"
            f"{self.tenant}/{namespace}",
            timeout=REQUEST_TIMEOUT,
        )
        return resp.status_code == 200

    def _create_namespace(self, namespace, config):
        resp = requests.put(
            f"{self.admin_url}/admin/v2/namespaces/"
            f"{self.tenant}/{namespace}",
            json=config,
            timeout=REQUEST_TIMEOUT,
        )
        if resp.status_code != 204:
            raise RuntimeError(
                f"Namespace {self.tenant}/{namespace} create failed: "
                f"{resp.status_code} {resp.text}"
            )

    def _reconcile_sync(self, logger):
        if not self._tenant_exists():
            clusters = self._get_clusters()
            logger.info(
                f"Creating tenant {self.tenant!r} with clusters {clusters}"
            )
            self._create_tenant(clusters)
        else:
            logger.debug(f"Tenant {self.tenant!r} already exists")

        for namespace, config in NAMESPACE_CONFIG.items():
            if self._namespace_exists(namespace):
                logger.debug(
                    f"Namespace {self.tenant}/{namespace} already exists"
                )
                continue
            logger.info(
                f"Creating namespace {self.tenant}/{namespace}"
            )
            self._create_namespace(namespace, config)

@@ -0,0 +1,93 @@
"""
TemplateSeed initialiser — populates the reserved ``__template__``
workspace from an external JSON seed file.

Seed file shape:

.. code-block:: json

    {
        "flow-blueprint": {
            "ontology": { ... },
            "agent": { ... }
        },
        "prompt": {
            ...
        },
        ...
    }

Top-level keys are config types; nested keys are config entries.
Values are arbitrary JSON (they'll be ``json.dumps()``'d on write).

Parameters
----------
config_file : str
    Path to the seed file on disk.
overwrite : bool (default False)
    On re-run (flag change), if True overwrite all keys; if False
    upsert-missing-only (preserves any operator customisation of
    the template).
"""

import json

from .. base import Initialiser

TEMPLATE_WORKSPACE = "__template__"


class TemplateSeed(Initialiser):

    def __init__(self, config_file, overwrite=False, **kwargs):
        super().__init__(**kwargs)
        if not config_file:
            raise ValueError("TemplateSeed requires 'config_file'")
        self.config_file = config_file
        self.overwrite = overwrite

    async def run(self, ctx, old_flag, new_flag):

        with open(self.config_file) as f:
            seed = json.load(f)

        if old_flag is None:
            # Clean first run — write every entry.
            await self._write_all(ctx, seed)
            return

        # Re-run after flag change.
        if self.overwrite:
            await self._write_all(ctx, seed)
        else:
            await self._upsert_missing(ctx, seed)

    async def _write_all(self, ctx, seed):
        values = []
        for type_name, entries in seed.items():
            for key, value in entries.items():
                values.append((type_name, key, json.dumps(value)))
        if values:
            await ctx.config.put_many(TEMPLATE_WORKSPACE, values)
        ctx.logger.info(
            f"Template seeded with {len(values)} entries"
        )

    async def _upsert_missing(self, ctx, seed):
        written = 0
        for type_name, entries in seed.items():
            existing = set(
                await ctx.config.keys(TEMPLATE_WORKSPACE, type_name)
            )
            values = []
            for key, value in entries.items():
                if key not in existing:
                    values.append(
                        (type_name, key, json.dumps(value))
                    )
            if values:
                await ctx.config.put_many(TEMPLATE_WORKSPACE, values)
                written += len(values)
        ctx.logger.info(
            f"Template upsert-missing: {written} new entries"
        )

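The re-run semantics can be illustrated with an in-memory stand-in for the config client. `FakeConfig` and `seed` below are illustrative only (the real client is async and broker-backed); the point is the behavioural difference: upsert-missing preserves an operator's edit to a template entry, overwrite resets it to the seed value.

```python
import json

class FakeConfig:
    """In-memory stand-in for the config client (illustrative only)."""
    def __init__(self):
        self.store = {}  # (workspace, type, key) -> raw JSON string
    def keys(self, workspace, type_name):
        return [k for (w, t, k) in self.store
                if w == workspace and t == type_name]
    def put_many(self, workspace, values):
        for type_name, key, raw in values:
            self.store[(workspace, type_name, key)] = raw

def seed(config, workspace, tree, overwrite):
    # Mirrors _write_all (overwrite=True) / _upsert_missing (False).
    values = []
    for type_name, entries in tree.items():
        existing = set() if overwrite else set(
            config.keys(workspace, type_name))
        for key, value in entries.items():
            if key not in existing:
                values.append((type_name, key, json.dumps(value)))
    config.put_many(workspace, values)

tree = {"prompt": {"greet": "hello"}}
cfg = FakeConfig()
seed(cfg, "__template__", tree, overwrite=False)   # first run
# Operator customises the entry...
cfg.store[("__template__", "prompt", "greet")] = json.dumps("hi there")
# Re-run with upsert-missing: the customisation survives.
seed(cfg, "__template__", tree, overwrite=False)
print(json.loads(cfg.store[("__template__", "prompt", "greet")]))
# Re-run with overwrite: reset to the seed value.
seed(cfg, "__template__", tree, overwrite=True)
print(json.loads(cfg.store[("__template__", "prompt", "greet")]))  # hello
```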
@ -0,0 +1,138 @@
|
||||||
|
"""
|
||||||
|
WorkspaceInit initialiser — creates a workspace and populates it from
|
||||||
|
either the ``__template__`` workspace or a seed file on disk.
|
||||||
|
|
||||||
|
Parameters
|
||||||
|
----------
|
||||||
|
workspace : str
|
||||||
|
Target workspace to create / populate.
|
||||||
|
source : str
|
||||||
|
Either ``"template"`` (copy the full contents of the
|
||||||
|
``__template__`` workspace) or ``"seed-file"`` (read from
|
||||||
|
``seed_file``).
|
||||||
|
seed_file : str (required when source=="seed-file")
|
||||||
|
Path to a JSON seed file with the same shape TemplateSeed consumes.
|
||||||
|
overwrite : bool (default False)
|
||||||
|
On re-run (flag change), if True overwrite all keys; if False,
|
||||||
|
upsert-missing-only (preserves in-workspace customisations).
|
||||||
|
|
||||||
|
Raises (in ``run``)
|
||||||
|
-------------------
|
||||||
|
When source is ``"template"``, raises ``RuntimeError`` if the
|
||||||
|
``__template__`` workspace is empty — indicating that TemplateSeed
|
||||||
|
hasn't run yet. The bootstrapper's retry loop will re-attempt on
|
||||||
|
the next cycle once the prerequisite is satisfied.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import json
|
||||||
|
|
||||||
|
from .. base import Initialiser
|
||||||
|
|
||||||
|
TEMPLATE_WORKSPACE = "__template__"
|
||||||
|
|
||||||
|
|
||||||
|
class WorkspaceInit(Initialiser):
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
workspace="default",
|
||||||
|
source="template",
|
||||||
|
seed_file=None,
|
||||||
|
overwrite=False,
|
||||||
|
**kwargs,
|
||||||
|
):
|
||||||
|
super().__init__(**kwargs)
|
||||||
|
|
||||||
|
        if source not in ("template", "seed-file"):
            raise ValueError(
                f"WorkspaceInit: source must be 'template' or "
                f"'seed-file', got {source!r}"
            )
        if source == "seed-file" and not seed_file:
            raise ValueError(
                "WorkspaceInit: seed_file required when source='seed-file'"
            )

        self.workspace = workspace
        self.source = source
        self.seed_file = seed_file
        self.overwrite = overwrite

    async def run(self, ctx, old_flag, new_flag):
        if self.source == "seed-file":
            tree = self._load_seed_file()
        else:
            tree = await self._load_from_template(ctx)

        if old_flag is None or self.overwrite:
            await self._write_all(ctx, tree)
        else:
            await self._upsert_missing(ctx, tree)

    def _load_seed_file(self):
        with open(self.seed_file) as f:
            return json.load(f)

    async def _load_from_template(self, ctx):
        """Build a seed tree from the entire ``__template__`` workspace.

        Raises if the workspace is empty, so the bootstrapper knows
        the prerequisite isn't met yet."""

        raw_tree = await ctx.config.get_all(TEMPLATE_WORKSPACE)

        tree = {}
        total = 0
        for type_name, entries in raw_tree.items():
            parsed = {}
            for key, raw in entries.items():
                if raw is None:
                    continue
                try:
                    parsed[key] = json.loads(raw)
                except Exception:
                    parsed[key] = raw
                total += 1
            if parsed:
                tree[type_name] = parsed

        if total == 0:
            raise RuntimeError(
                "Template workspace is empty — has TemplateSeed run yet?"
            )

        ctx.logger.debug(
            f"Loaded {total} template entries across {len(tree)} types"
        )
        return tree

    async def _write_all(self, ctx, tree):
        values = []
        for type_name, entries in tree.items():
            for key, value in entries.items():
                values.append((type_name, key, json.dumps(value)))
        if values:
            await ctx.config.put_many(self.workspace, values)
        ctx.logger.info(
            f"Workspace {self.workspace!r} populated with "
            f"{len(values)} entries"
        )

    async def _upsert_missing(self, ctx, tree):
        written = 0
        for type_name, entries in tree.items():
            existing = set(
                await ctx.config.keys(self.workspace, type_name)
            )
            values = []
            for key, value in entries.items():
                if key not in existing:
                    values.append(
                        (type_name, key, json.dumps(value))
                    )
            if values:
                await ctx.config.put_many(self.workspace, values)
                written += len(values)
        ctx.logger.info(
            f"Workspace {self.workspace!r} upsert-missing: "
            f"{written} new entries"
        )
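The overwrite/upsert split driven by `old_flag` and `self.overwrite` above can be sketched against a plain dict standing in for the config client. This is an illustrative stand-in only — `write_all` / `upsert_missing` here are hypothetical free functions, not the real `ctx.config` API:

```python
import json

def write_all(store, workspace, tree):
    # Overwrite mode: every (type, key) pair in the tree is written,
    # clobbering any existing value.
    ws = store.setdefault(workspace, {})
    for type_name, entries in tree.items():
        for key, value in entries.items():
            ws[(type_name, key)] = json.dumps(value)

def upsert_missing(store, workspace, tree):
    # Re-run mode: only keys not already present are written, so
    # operator edits made after the first bootstrap survive.
    ws = store.setdefault(workspace, {})
    written = 0
    for type_name, entries in tree.items():
        for key, value in entries.items():
            if (type_name, key) not in ws:
                ws[(type_name, key)] = json.dumps(value)
                written += 1
    return written
```

A first run followed by a re-run with an extra key writes only the new key and leaves a locally edited value untouched, which is the default convergence behaviour described for WorkspaceInit.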
@@ -24,6 +24,21 @@ logger = logging.getLogger(__name__)

default_ident = "config-svc"


def is_reserved_workspace(workspace):
    """Reserved workspaces are storage-only.

    Any workspace id beginning with ``_`` is reserved for internal use
    (e.g. ``__template__`` holding factory-default seed config).
    Reads and writes work normally so bootstrap and provisioning code
    can use the standard config API, but **change notifications for
    reserved workspaces are suppressed**. Services subscribed to the
    config push therefore never see reserved-workspace events and
    cannot accidentally act on template content as if it were live
    state.
    """
    return workspace.startswith("_")


default_config_request_queue = config_request_queue
default_config_response_queue = config_response_queue
default_config_push_queue = config_push_queue
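The docstring's convention can be demonstrated in isolation — a single leading underscore is enough to mark a workspace reserved, so both dunder-style ids and plain underscore-prefixed ids are caught (a trivial sketch of the same predicate):

```python
def is_reserved_workspace(workspace):
    # Any "_"-prefixed workspace id is reserved: stored normally,
    # but excluded from config push notifications.
    return workspace.startswith("_")

# "__template__" and "_internal" are reserved; ordinary ids are not.
```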
@@ -130,6 +145,21 @@ class Processor(AsyncProcessor):

    async def push(self, changes=None):

        # Suppress notifications from reserved workspaces (ids starting
        # with "_", e.g. "__template__"). Stored config is preserved;
        # only the broadcast is filtered. Keeps services oblivious to
        # template / bootstrap state.
        if changes:
            filtered = {}
            for type_name, workspaces in changes.items():
                visible = [
                    w for w in workspaces
                    if not is_reserved_workspace(w)
                ]
                if visible:
                    filtered[type_name] = visible
            changes = filtered

        version = await self.config.get_version()

        resp = ConfigPush(
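The suppression step in `push` can be exercised standalone as a pure function (a minimal sketch under the same "_"-prefix convention; `filter_reserved` is a hypothetical name, not part of the diff):

```python
def filter_reserved(changes):
    # Drop reserved workspace ids ("_"-prefixed) from a change map.
    # A type whose every changed workspace is reserved disappears
    # entirely, so subscribers receive no event for it at all.
    filtered = {}
    for type_name, workspaces in changes.items():
        visible = [w for w in workspaces if not w.startswith("_")]
        if visible:
            filtered[type_name] = visible
    return filtered
```

Given changes touching both `__template__` and a live workspace, only the live workspace survives, and types touched solely via `__template__` vanish from the broadcast.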