mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-04-25 00:16:23 +02:00
A generic, long-running bootstrap processor that converges a
deployment to its configured initial state and then idles.
Replaces the previous one-shot `tg-init-trustgraph` container model
and provides an extension point for enterprise / third-party
initialisers.
See docs/tech-specs/bootstrap.md for the full design.
Bootstrapper
------------
A single AsyncProcessor (trustgraph.bootstrap.bootstrapper.Processor)
that:
* Reads a list of initialiser specifications (class, name, flag,
params) from either a direct `initialisers` parameter
(processor-group embedding) or a YAML/JSON file (`-c`, CLI).
* On each wake, runs a cheap service-gate (config-svc +
flow-svc round-trips), then iterates the initialiser list,
running each whose configured flag differs from the one stored
in __system__/init-state/<name>.
* Stores per-initialiser completion state in the reserved
__system__ workspace.
* Adapts cadence: ~5s on gate failure, ~15s while converging,
~300s in steady state.
* Isolates failures — one initialiser's exception does not block
others in the same cycle; the failed one retries next wake.
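
A minimal sketch of one wake cycle as described above, not the actual Processor code. The function name `run_cycle`, the tuple layout of `specs`, and the plain dict standing in for `__system__/init-state/<name>` are assumptions made for illustration. The sketch also assumes that any cycle in which an initialiser ran or failed counts as "converging", and it leaves out pre-gate initialisers.

```python
import asyncio

GATE_FAIL_DELAY = 5     # seconds: service gate failed
CONVERGING_DELAY = 15   # seconds: at least one initialiser ran or failed
STEADY_DELAY = 300      # seconds: every stored flag matches its configured flag


async def run_cycle(specs, stored_flags, gate_ok):
    """Run one wake cycle and return the delay before the next wake.

    specs:        list of (name, flag, run) tuples; run is an async callable
                  taking (old_flag, new_flag). Hypothetical layout.
    stored_flags: dict standing in for __system__/init-state/<name>.
    gate_ok:      async callable standing in for the service gate.
    """
    if not await gate_ok():
        return GATE_FAIL_DELAY

    touched = False
    for name, flag, run in specs:
        old_flag = stored_flags.get(name)
        if old_flag == flag:
            continue  # already converged for this flag
        touched = True
        try:
            await run(old_flag, flag)
            stored_flags[name] = flag  # record completion only on success
        except Exception as exc:
            # Isolate the failure: later initialisers still get their turn,
            # and this one is retried on the next wake because its stored
            # flag was not updated.
            print(f"initialiser {name} failed: {exc}")

    return CONVERGING_DELAY if touched else STEADY_DELAY
```

Because the stored flag is only written after a successful run, a failed initialiser is picked up again on the next wake without any separate retry bookkeeping.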
Initialiser contract
--------------------
* Subclass trustgraph.bootstrap.base.Initialiser.
* Implement async run(ctx, old_flag, new_flag).
* Opt out of the service gate with class attr
wait_for_services=False (only used by PulsarTopology, since
config-svc cannot come up until Pulsar namespaces exist).
* ctx carries short-lived config and flow-svc clients plus a
scoped logger.
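
The contract above might look roughly like this in practice. To keep the sketch self-contained, `Initialiser` here is a local stand-in for trustgraph.bootstrap.base.Initialiser, and `BannerInit`, `PreGateInit`, and the `ctx.logger` attribute name are hypothetical; the real base class and context object may differ.

```python
import asyncio
import logging
from types import SimpleNamespace


class Initialiser:
    """Local stand-in for trustgraph.bootstrap.base.Initialiser."""

    # Class attribute: True means the bootstrapper waits for the service gate.
    wait_for_services = True

    async def run(self, ctx, old_flag, new_flag):
        raise NotImplementedError


class BannerInit(Initialiser):
    """Hypothetical initialiser that records one banner per applied flag."""

    def __init__(self, text):
        self.text = text
        self.applied = []

    async def run(self, ctx, old_flag, new_flag):
        # In this sketch old_flag is None when nothing has been stored yet.
        ctx.logger.info("applying banner (%s -> %s)", old_flag, new_flag)
        self.applied.append((new_flag, self.text))


class PreGateInit(Initialiser):
    """Hypothetical initialiser that opts out of the service gate."""

    wait_for_services = False

    async def run(self, ctx, old_flag, new_flag):
        ctx.logger.info("running before services are reachable")


if __name__ == "__main__":
    # ctx is faked with a namespace carrying only a logger (assumed name).
    ctx = SimpleNamespace(logger=logging.getLogger("bootstrap.demo"))
    asyncio.run(BannerInit("hello").run(ctx, None, "v1"))
```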
Core initialisers (trustgraph.bootstrap.initialisers.*)
-------------------------------------------------------
* PulsarTopology: creates Pulsar tenant + namespaces (pre-gate,
blocking HTTP offloaded to executor).
* TemplateSeed: seeds __template__ from an external JSON file;
re-run is upsert-missing by default, overwrite-all opt-in.
* WorkspaceInit: populates a named workspace from either the full
contents of __template__ or a seed file; raises cleanly if the
template isn't seeded yet so the bootstrapper retries on the next
cycle.
* DefaultFlowStart: starts a specific flow in a workspace; no-ops if
the flow is already running.
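
To illustrate the difference between the upsert-missing and overwrite-all re-run modes mentioned for TemplateSeed, here is a small sketch over plain nested dicts. The function name `apply_seed`, the `overwrite` parameter, and the `{type: {key: value}}` in-memory shape are assumptions for illustration, not the initialiser's actual interface.

```python
def apply_seed(existing, seed, overwrite=False):
    """Merge seed entries into existing config and return a new mapping.

    Both arguments are {type: {key: value}} dicts. With overwrite=False
    only missing keys are added (upsert-missing); with overwrite=True
    every seed entry replaces the stored one (overwrite-all).
    """
    # Copy one level down so the caller's dicts are not mutated.
    result = {type_: dict(items) for type_, items in existing.items()}
    for type_, items in seed.items():
        target = result.setdefault(type_, {})
        for key, value in items.items():
            if overwrite or key not in target:
                target[key] = value
    return result


if __name__ == "__main__":
    stored = {"prompt": {"greeting": "edited by operator"}}
    seed = {"prompt": {"greeting": "default", "farewell": "bye"}}
    print(apply_seed(stored, seed))
    print(apply_seed(stored, seed, overwrite=True))
```

Under the default mode, values an operator has edited since the first seed survive a re-run, which is presumably why it is the default.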
Enterprise or third-party initialisers plug in via fully-qualified
dotted class paths in the bootstrapper's configuration — no core
code change required.
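
Resolving a fully-qualified dotted class path is usually done with `importlib`. The following is a sketch of that general technique, not the bootstrapper's actual loader; `load_class` and `build_initialiser` are hypothetical names, and passing `params` as keyword arguments to the constructor is an assumption.

```python
import importlib


def load_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"not a fully-qualified class path: {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def build_initialiser(spec):
    """Instantiate the class named in a spec dict.

    spec uses the fields named in the commit message (class, name, flag,
    params); params is assumed to map directly onto constructor kwargs.
    """
    cls = load_class(spec["class"])
    return cls(**spec.get("params", {}))


if __name__ == "__main__":
    # A stdlib class is used here so the example runs without trustgraph.
    print(load_class("collections.OrderedDict"))
```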
Config service
--------------
* push(): filter out reserved workspaces (ids starting with "_")
from the change notifications. Stored config is preserved; only
the broadcast is suppressed, so bootstrap / template state lives
in config-svc without live processors ever reacting to it.
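
The filtering rule can be sketched as follows. The helper names are hypothetical and the real push() operates on the config service's own notification structures; only the "ids starting with an underscore are reserved" rule comes from the description above.

```python
def is_reserved(workspace_id):
    """Reserved workspaces are those whose id starts with an underscore."""
    return workspace_id.startswith("_")


def workspaces_to_notify(changed_workspaces):
    """Drop reserved workspaces from a change notification.

    Stored config is untouched; only the broadcast list is filtered.
    """
    return [ws for ws in changed_workspaces if not is_reserved(ws)]


if __name__ == "__main__":
    changed = ["default", "__system__", "__template__", "acme"]
    print(workspaces_to_notify(changed))
```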
Config client
-------------
* ConfigClient.get_all(workspace): wraps the existing `config`
operation to return {type: {key: value}} for a workspace.
WorkspaceInit uses it to copy __template__ without needing a
hardcoded types list.
pyproject.toml
--------------
* Adds a `bootstrap` console script pointing at the new Processor.
* Removes tg-init-trustgraph, superseded by the bootstrap processor.

| Name | | |
|---|---|---|
| ar | ||
| es | ||
| he | ||
| hi | ||
| pt | ||
| ru | ||
| sw | ||
| tr | ||
| zh-cn | ||
| __TEMPLATE.md | ||
| active-flow-key-restructure.md | ||
| agent-explainability.md | ||
| agent-orchestration.md | ||
| architecture-principles.md | ||
| bootstrap.md | ||
| cassandra-consolidation.md | ||
| cassandra-performance-refactor.md | ||
| collection-management.md | ||
| config-push-poke.md | ||
| data-ownership-model.md | ||
| document-embeddings-chunk-id.md | ||
| embeddings-batch-processing.md | ||
| entity-centric-graph.md | ||
| explainability-cli.md | ||
| extraction-flows.md | ||
| extraction-provenance-subgraph.md | ||
| extraction-time-provenance.md | ||
| flow-class-definition.md | ||
| flow-configurable-parameters.md | ||
| flow-service-queue-lifecycle.md | ||
| graph-contexts.md | ||
| graphql-query.md | ||
| graphrag-performance-optimization.md | ||
| iam.md | ||
| import-export-graceful-shutdown.md | ||
| jsonl-prompt-output.md | ||
| kafka-backend.md | ||
| large-document-loading.md | ||
| logging-strategy.md | ||
| mcp-tool-arguments.md | ||
| mcp-tool-bearer-token.md | ||
| minio-to-s3-migration.md | ||
| more-config-cli.md | ||
| multi-tenant-support.md | ||
| neo4j-user-collection-isolation.md | ||
| ontology-extract-phase-2.md | ||
| ontology.md | ||
| ontorag.md | ||
| openapi-spec.md | ||
| pubsub-abstraction.md | ||
| pubsub.md | ||
| python-api-refactor.md | ||
| query-time-explainability.md | ||
| rag-streaming-support.md | ||
| schema-refactoring-proposal.md | ||
| sparql-query.md | ||
| streaming-llm-responses.md | ||
| structured-data-2.md | ||
| structured-data-descriptor.md | ||
| structured-data-schemas.md | ||
| structured-data.md | ||
| structured-diag-service.md | ||
| tool-group.md | ||
| tool-services.md | ||
| universal-decoder.md | ||
| vector-store-lifecycle.md | ||