trustgraph/trustgraph-flow/pyproject.toml

[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "trustgraph-flow"
dynamic = ["version"]
authors = [{name = "trustgraph.ai", email = "security@trustgraph.ai"}]
description = "TrustGraph provides a means to run a pipeline of flexible AI processing components in a flexible means to achieve a processing pipeline."
readme = "README.md"
requires-python = ">=3.8"
dependencies = [
"trustgraph-base>=2.4,<2.5",
"aiohttp",
"anthropic",
"scylla-driver",
"cohere",
"cryptography",
"faiss-cpu",
"falkordb",
"fastembed",
"ibis",
"jsonschema",
"langchain",
"langchain-community",
"langchain-core",
"langchain-text-splitters",
"mcp",
"minio",
"mistralai<2.0.0",
"neo4j",
"nltk",
"ollama",
"openai",
"pinecone[grpc]",
"prometheus-client",
"pulsar-client",
"pymilvus",
"pypdf",
"pyyaml",
"qdrant-client",
"rdflib",
"requests",
"strawberry-graphql",
"tabulate",
"tiktoken",
"urllib3",
]
classifiers = [
"Programming Language :: Python :: 3",
"Operating System :: OS Independent",
]
[project.urls]
Homepage = "https://github.com/trustgraph-ai/trustgraph"
[project.scripts]
agent-manager-react = "trustgraph.agent.react:run"
agent-orchestrator = "trustgraph.agent.orchestrator:run"
api-gateway = "trustgraph.gateway:run"
chunker-recursive = "trustgraph.chunking.recursive:run"
chunker-token = "trustgraph.chunking.token:run"
bootstrap = "trustgraph.bootstrap.bootstrapper:run"
config-svc = "trustgraph.config.service:run"
flow-svc = "trustgraph.flow.service:run"
iam-svc = "trustgraph.iam.service:run"
doc-embeddings-query-milvus = "trustgraph.query.doc_embeddings.milvus:run"
doc-embeddings-query-pinecone = "trustgraph.query.doc_embeddings.pinecone:run"
doc-embeddings-query-qdrant = "trustgraph.query.doc_embeddings.qdrant:run"
doc-embeddings-write-milvus = "trustgraph.storage.doc_embeddings.milvus:run"
doc-embeddings-write-pinecone = "trustgraph.storage.doc_embeddings.pinecone:run"
doc-embeddings-write-qdrant = "trustgraph.storage.doc_embeddings.qdrant:run"
document-embeddings = "trustgraph.embeddings.document_embeddings:run"
document-rag = "trustgraph.retrieval.document_rag:run"
embeddings-fastembed = "trustgraph.embeddings.fastembed:run"
embeddings-ollama = "trustgraph.embeddings.ollama:run"
graph-embeddings-query-milvus = "trustgraph.query.graph_embeddings.milvus:run"
graph-embeddings-query-pinecone = "trustgraph.query.graph_embeddings.pinecone:run"
graph-embeddings-query-qdrant = "trustgraph.query.graph_embeddings.qdrant:run"
graph-embeddings-write-milvus = "trustgraph.storage.graph_embeddings.milvus:run"
graph-embeddings-write-pinecone = "trustgraph.storage.graph_embeddings.pinecone:run"
graph-embeddings-write-qdrant = "trustgraph.storage.graph_embeddings.qdrant:run"
graph-embeddings = "trustgraph.embeddings.graph_embeddings:run"
graph-rag = "trustgraph.retrieval.graph_rag:run"
kg-extract-agent = "trustgraph.extract.kg.agent:run"
kg-extract-definitions = "trustgraph.extract.kg.definitions:run"
kg-extract-rows = "trustgraph.extract.kg.rows:run"
kg-extract-relationships = "trustgraph.extract.kg.relationships:run"
kg-extract-topics = "trustgraph.extract.kg.topics:run"
kg-extract-ontology = "trustgraph.extract.kg.ontology:run"
kg-manager = "trustgraph.cores:run"
kg-store = "trustgraph.storage.knowledge:run"
librarian = "trustgraph.librarian:run"
mcp-tool = "trustgraph.agent.mcp_tool:run"
metering = "trustgraph.metering:run"
nlp-query = "trustgraph.retrieval.nlp_query:run"
rows-write-cassandra = "trustgraph.storage.rows.cassandra:run"
rows-query-cassandra = "trustgraph.query.rows.cassandra:run"
row-embeddings = "trustgraph.embeddings.row_embeddings:run"
row-embeddings-write-qdrant = "trustgraph.storage.row_embeddings.qdrant:run"
row-embeddings-query-qdrant = "trustgraph.query.row_embeddings.qdrant:run"
pdf-decoder = "trustgraph.decoding.pdf:run"
pdf-ocr-mistral = "trustgraph.decoding.mistral_ocr:run"
prompt-template = "trustgraph.prompt.template:run"
rev-gateway = "trustgraph.rev_gateway:run"
run-processing = "trustgraph.processing:run"
sparql-query = "trustgraph.query.sparql:run"
structured-query = "trustgraph.retrieval.structured_query:run"
structured-diag = "trustgraph.retrieval.structured_diag:run"
text-completion-azure = "trustgraph.model.text_completion.azure:run"
text-completion-azure-openai = "trustgraph.model.text_completion.azure_openai:run"
text-completion-claude = "trustgraph.model.text_completion.claude:run"
text-completion-cohere = "trustgraph.model.text_completion.cohere:run"
text-completion-llamafile = "trustgraph.model.text_completion.llamafile:run"
text-completion-lmstudio = "trustgraph.model.text_completion.lmstudio:run"
text-completion-mistral = "trustgraph.model.text_completion.mistral:run"
text-completion-ollama = "trustgraph.model.text_completion.ollama:run"
text-completion-openai = "trustgraph.model.text_completion.openai:run"
text-completion-tgi = "trustgraph.model.text_completion.tgi:run"
text-completion-vllm = "trustgraph.model.text_completion.vllm:run"
triples-query-cassandra = "trustgraph.query.triples.cassandra:run"
triples-query-falkordb = "trustgraph.query.triples.falkordb:run"
triples-query-memgraph = "trustgraph.query.triples.memgraph:run"
triples-query-neo4j = "trustgraph.query.triples.neo4j:run"
triples-write-cassandra = "trustgraph.storage.triples.cassandra:run"
triples-write-falkordb = "trustgraph.storage.triples.falkordb:run"
triples-write-memgraph = "trustgraph.storage.triples.memgraph:run"
triples-write-neo4j = "trustgraph.storage.triples.neo4j:run"
wikipedia-lookup = "trustgraph.external.wikipedia:run"
joke-service = "trustgraph.tool_service.joke:run"
[tool.setuptools.packages.find]
include = ["trustgraph*"]
[tool.setuptools.dynamic]
version = {attr = "trustgraph.flow_version.__version__"}