The context development platform. Store, enrich, and retrieve structured knowledge with graph-native infrastructure, semantic retrieval, and portable context cores. https://trustgraph.ai
cybermaggedon ffe310af7c
Fix RabbitMQ request/response race and chunker Flow API drift (#779)
* Fix Metadata/EntityEmbeddings schema migration tail and add regression tests (#776)

The Metadata dataclass dropped its `metadata: list[Triple]` field
and EntityEmbeddings/ChunkEmbeddings settled on a singular
`vector: list[float]` field, but several call sites kept passing
`Metadata(metadata=...)` and `EntityEmbeddings(vectors=...)`. The
bugs were latent until a websocket client first hit
`/api/v1/flow/default/import/entity-contexts`, at which point the
dispatcher TypeError'd on construction.
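The failure mode can be sketched with a simplified stand-in dataclass (not the real TrustGraph schema); the point is that a field rename only bites when a stale call site is actually executed:

```python
from dataclasses import dataclass, field

# Hypothetical, simplified stand-in for the real schema class.
@dataclass
class EntityEmbeddings:
    entity: str = ""
    vector: list[float] = field(default_factory=list)  # was: vectors

# A call site written against the old field name imports cleanly and
# fails only at construction time, i.e. when traffic first hits it:
try:
    EntityEmbeddings(entity="e1", vectors=[0.1, 0.2])
except TypeError:
    pass  # unexpected keyword argument 'vectors'
```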

Production fixes (5 call sites on the same migration tail):

  * trustgraph-flow gateway dispatchers entity_contexts_import.py
    and graph_embeddings_import.py — drop the stale
    Metadata(metadata=...)  kwarg; switch graph_embeddings_import
    to the singular `vector` wire key.
  * trustgraph-base messaging translators knowledge.py and
    document_loading.py — fix decode side to read the singular
    `"vector"` key, matching what their own encode sides have
    always written.
  * trustgraph-flow tables/knowledge.py — fix Cassandra row
    deserialiser to construct EntityEmbeddings(vector=...)
    instead of vectors=.
  * trustgraph-flow gateway core_import/core_export — switch the
    kg-core msgpack wire format to the singular `"v"`/`"vector"`
    key and drop the dead `m["m"]` envelope field that referenced
    the removed Metadata.metadata triples list (it was a
    guaranteed KeyError on the export side).
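The singular-key discipline on the wire can be sketched with plain dicts; the msgpack packing step is omitted, and every envelope key other than the singular `"v"` vector key is an illustrative assumption, not the real format:

```python
def encode_entity_embeddings(entity: str, vector: list[float]) -> dict:
    # Singular "v" key; the old plural key and the dead "m" metadata
    # envelope field are gone.
    return {"e": entity, "v": vector}

def decode_entity_embeddings(msg: dict) -> tuple[str, list[float]]:
    # Decode must read the same singular key the encoder writes;
    # reading a stale key here is a guaranteed KeyError.
    return msg["e"], msg["v"]

assert decode_entity_embeddings(
    encode_entity_embeddings("e1", [0.1, 0.2])
) == ("e1", [0.1, 0.2])
```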

Defense-in-depth regression coverage (32 new tests across 7 files):

  * tests/contract/test_schema_field_contracts.py — pin the field
    set of Metadata, EntityEmbeddings, ChunkEmbeddings,
    EntityContext so any future schema rename fails CI loudly
    with a clear diff.
  * tests/unit/test_translators/test_knowledge_translator_roundtrip.py
    and test_document_embeddings_translator_roundtrip.py —
    encode→decode round-trip the affected translators end to end,
    locking in the singular `"vector"` wire key.
  * tests/unit/test_gateway/test_entity_contexts_import_dispatcher.py
    and test_graph_embeddings_import_dispatcher.py — exercise the
    websocket dispatchers' receive() path with realistic
    payloads, the direct regression test for the original
    production crash.
  * tests/unit/test_gateway/test_core_import_export_roundtrip.py
    — pack/unpack the kg-core msgpack format through the real
    dispatcher classes (with KnowledgeRequestor mocked),
    including a full export→import round-trip.
  * tests/unit/test_tables/test_knowledge_table_store.py —
    exercise the Cassandra row → schema conversion via __new__ to
    bypass the live cluster connection.
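The field-pinning idea behind the contract tests can be sketched with `dataclasses.fields`; the class here is a simplified stand-in, not the real schema:

```python
from dataclasses import dataclass, field, fields

@dataclass
class EntityEmbeddings:  # simplified stand-in
    entity: str = ""
    vector: list[float] = field(default_factory=list)

def test_entity_embeddings_field_contract():
    # Pin the exact field set so any future rename fails CI loudly
    # with a clear set diff instead of a latent TypeError in production.
    assert {f.name for f in fields(EntityEmbeddings)} == {"entity", "vector"}

test_entity_embeddings_field_contract()
```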

Also fixes an unrelated leaked-coroutine RuntimeWarning in
test_gateway/test_service.py::test_run_method_calls_web_run_app: the
mocked aiohttp.web.run_app now closes the coroutine that Api.run() hands
it, mirroring what the real run_app would do, instead of leaving it for
the GC to complain about.

* Fix RabbitMQ request/response race and chunker Flow API drift

Two unrelated regressions surfaced after the v2.2 queue class
refactor.  Bundled here because both are small and both block
production.

1. Request/response race against ephemeral RabbitMQ response
queues

Commit feeb92b3 switched response/notify queues to per-subscriber
auto-delete exclusive queues. That fixed orphaned-queue
accumulation but introduced a setup race: Subscriber.start()
created the run() task and returned immediately, while the
underlying RabbitMQ consumer only declared and bound its queue
lazily on the first receive() call.  RequestResponse.request()
therefore published the request before any queue was bound to the
matching routing key, and the broker dropped the reply. Symptoms:
"Failed to fetch config on notify" / "Request timeout exception"
repeating roughly every 10s in api-gateway, document-embeddings
and any other service exercising the config notify path.

Fix:

  * Add ensure_connected() to the BackendConsumer protocol;
    implement it on RabbitMQBackendConsumer (calls _connect
    synchronously, declaring and binding the queue) and as a
    no-op on PulsarBackendConsumer (Pulsar's client.subscribe is
    already synchronous at construction).

  * Convert Subscriber's readiness signal from a non-existent
    Event to an asyncio.Future created in start(). run() calls
    consumer.ensure_connected() immediately after
    create_consumer() and sets _ready.set_result(None) on first
    successful bind. start() awaits the future via asyncio.wait
    so it returns only once the consumer is fully bound. Any
    reply published after start() returns is therefore guaranteed
    to land in a bound queue.

  * First-attempt connection failures call
    _ready.set_exception(e) and exit run() so start() unblocks
    with the error rather than hanging forever — the existing
    higher-level retry pattern in fetch_and_apply_config takes
    over from there. Runtime failures after a successful start
    still go through the existing retry-with-backoff path.

  * Update the two existing graceful-shutdown tests that
    monkey-patch Subscriber.run with a custom coroutine to honor
    the new contract by signalling _ready themselves.

  * Add tests/unit/test_base/test_subscriber_readiness.py with
    five regression tests pinning the readiness contract:
    ensure_connected must be called before start() returns;
    start() must block while ensure_connected runs
    (race-condition guard with a threading.Event gate);
    first-attempt create_consumer and ensure_connected failures
    must propagate to start() instead of hanging;
    ensure_connected must run before any receive() call.
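The readiness handshake described above can be sketched like this; method names follow the commit, but all queue mechanics are reduced to a fake consumer, and `start()` awaits the future directly rather than via asyncio.wait:

```python
import asyncio

class Subscriber:
    """Simplified sketch of the start()/run() readiness handshake."""

    def __init__(self, create_consumer):
        self.create_consumer = create_consumer

    async def start(self):
        self._ready = asyncio.get_running_loop().create_future()
        self._task = asyncio.create_task(self.run())
        # Return only once the consumer is fully bound (or the first
        # attempt failed), so no request can outrun its reply queue.
        await self._ready

    async def run(self):
        try:
            consumer = self.create_consumer()
            # Declare and bind the queue now, not lazily on first receive().
            await consumer.ensure_connected()
        except Exception as e:
            # First-attempt failure: unblock start() with the error
            # instead of hanging forever.
            self._ready.set_exception(e)
            return
        self._ready.set_result(None)
        # ... normal receive loop (with retry-with-backoff) would follow ...

class FakeConsumer:
    """Stand-in; the real implementations are the RabbitMQ/Pulsar backends."""
    connected = False

    async def ensure_connected(self):
        self.connected = True

async def demo():
    consumer = FakeConsumer()
    sub = Subscriber(lambda: consumer)
    await sub.start()
    assert consumer.connected  # bound before start() returned

asyncio.run(demo())
```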

2. Chunker Flow parameter lookup using the wrong attribute

trustgraph-base/trustgraph/base/chunking_service.py was reading
flow.parameters.get("chunk-size") and chunk-overlap, but the Flow
class has no `parameters` attribute — parameter lookup is exposed
through Flow.__call__ (flow("chunk-size") returns the resolved
value or None).  The exception was caught and logged as a
WARNING, so chunking continued with the default sizes and any
configured chunk-size / chunk-overlap was silently ignored:

    chunker - WARNING - Could not parse chunk-size parameter:
    'Flow' object has no attribute 'parameters'

The chunker tests didn't catch this because they constructed
mock_flow = MagicMock() and configured
mock_flow.parameters.get.side_effect = ..., which is the same
phantom attribute MagicMock auto-creates on demand. Tests and
production agreed on the wrong API.

Fix: switch chunking_service.py to flow("chunk-size") /
flow("chunk-overlap"). Update both chunker test files to mock the
__call__ side_effect instead of the phantom parameters.get,
merging parameter values into the existing flow() lookup the
on_message tests already used for producer resolution.
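The wrong-attribute trap and its fix can be sketched like this; the Flow class here is a minimal stand-in exposing only the `__call__` lookup the commit describes:

```python
from unittest.mock import MagicMock

class Flow:
    """Simplified stand-in: parameter lookup is exposed through __call__."""
    def __init__(self, params):
        self._params = params
    def __call__(self, key):
        return self._params.get(key)  # resolved value or None

flow = Flow({"chunk-size": 512, "chunk-overlap": 64})

# Correct lookup:
assert flow("chunk-size") == 512

# The buggy lookup raises on the real class ...
try:
    flow.parameters.get("chunk-size")
except AttributeError:
    pass  # 'Flow' object has no attribute 'parameters'

# ... but a bare MagicMock happily auto-creates the phantom attribute,
# so tests written against the mock agree with production on the wrong API:
mock_flow = MagicMock()
mock_flow.parameters.get.return_value = 512
assert mock_flow.parameters.get("chunk-size") == 512  # passes, wrongly
```

Mocking with `spec=Flow` (or autospec) would have made the mock reject the phantom `parameters` attribute the same way the real class does.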
2026-04-11 01:29:38 +01:00
.github/workflows Open 2.3 release branch (#775) 2026-04-10 14:42:19 +01:00
containers Add missing pdf extra to unstructured dependency (#728) 2026-03-29 20:22:45 +01:00
dev-tools Added Explainable AI agent demo in Typescript (#770) 2026-04-08 14:16:14 +01:00
docs Update docs for 2.2 release (#766) 2026-04-07 22:24:59 +01:00
specs Update docs for 2.2 release (#766) 2026-04-07 22:24:59 +01:00
test-api Knowledge core CLI (#368) 2025-05-07 00:20:59 +01:00
tests Fix RabbitMQ request/response race and chunker Flow API drift (#779) 2026-04-11 01:29:38 +01:00
tests.manual Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
trustgraph Start 1.8 release branch 2025-12-17 21:32:13 +00:00
trustgraph-base Fix RabbitMQ request/response race and chunker Flow API drift (#779) 2026-04-11 01:29:38 +01:00
trustgraph-bedrock Open 2.3 release branch (#775) 2026-04-10 14:42:19 +01:00
trustgraph-cli Open 2.3 release branch (#775) 2026-04-10 14:42:19 +01:00
trustgraph-embeddings-hf Open 2.3 release branch (#775) 2026-04-10 14:42:19 +01:00
trustgraph-flow Fix Metadata/EntityEmbeddings schema migration tail and add regression tests (#777) 2026-04-10 20:43:45 +01:00
trustgraph-mcp Add GATEWAY_SECRET support for MCP server to API gateway auth (#721) 2026-03-26 10:49:28 +00:00
trustgraph-ocr Open 2.3 release branch (#775) 2026-04-10 14:42:19 +01:00
trustgraph-unstructured Open 2.3 release branch (#775) 2026-04-10 14:42:19 +01:00
trustgraph-vertexai Open 2.3 release branch (#775) 2026-04-10 14:42:19 +01:00
.coveragerc Structure data mvp (#452) 2025-08-07 20:47:20 +01:00
.gitignore Add universal document decoder with multi-format support (#705) 2026-03-23 12:56:35 +00:00
check_imports.py Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
context7.json Merge master into release/v2.1 (#652) 2026-02-28 11:07:03 +00:00
DEVELOPER_GUIDE.md Developer guide into 0.11 branch (#101) 2024-10-03 17:50:25 +01:00
install_packages.sh Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
LICENSE Apache 2 (#373) 2025-05-08 18:59:58 +01:00
Makefile SPARQL query service (#754) 2026-04-02 17:21:39 +01:00
ontology-prompt.md Feature/improve ontology extract (#576) 2025-12-03 13:36:10 +00:00
product-platform-diagram.svg master -> 1.5 (README updates) (#552) 2025-10-11 11:46:03 +01:00
prompt.txt Structured data loader cli (#498) 2025-09-05 15:38:18 +01:00
README.md master -> release/v2.3 (#774) 2026-04-10 14:38:46 +01:00
requirements.txt Loki logging (#586) 2025-12-09 23:24:41 +00:00
run_tests.sh Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
schema.ttl Feature/doc metadata labels (#130) 2024-10-29 21:18:02 +00:00
SECURITY.md master -> release/v2.2 (#732) 2026-03-29 20:26:26 +01:00
TEST_CASES.md Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
TEST_SETUP.md Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
TEST_STRATEGY.md Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
TESTS.md Test suite executed from CI pipeline (#433) 2025-07-14 14:57:44 +01:00
TG-fullname-logo.svg Reconcile master with 1.6 (#563) 2025-11-24 10:02:30 +00:00
TG-hero-diagram.svg Reconcile master with 1.6 (#563) 2025-11-24 10:02:30 +00:00


Website | Docs | YouTube | Configuration Terminal | Discord | Blog


The context development platform

Building applications that need to know things requires more than a database. TrustGraph is the context development platform: graph-native infrastructure for storing, enriching, and retrieving structured knowledge at any scale. Think Supabase, but built around context graphs: multi-model storage, semantic retrieval pipelines, portable context cores, and a full developer toolkit out of the box. Deploy locally or in the cloud. No unnecessary API keys. Just context, engineered.

The platform:

  • Multi-model and multimodal database system
    • Tabular/relational, key-value
    • Document, graph, and vectors
    • Images, video, and audio
  • Automated data ingest and loading
    • Quick ingest with semantic similarity retrieval
    • Ontology structuring for precision retrieval
  • Out-of-the-box RAG pipelines
    • DocumentRAG
    • GraphRAG
    • OntologyRAG
  • 3D GraphViz for exploring context
  • Fully Agentic System
    • Single Agent
    • Multi Agent
    • MCP integration
  • Run anywhere
    • Deploy locally with Docker
    • Deploy in cloud with Kubernetes
  • Support for all major LLMs
    • API support for Anthropic, Cohere, Gemini, Mistral, OpenAI, and others
    • Model inferencing with vLLM, Ollama, TGI, LM Studio, and Llamafiles
  • Developer friendly

No API Keys Required

How many times have you cloned a repo and opened the .env.example to find dozens of API keys for 3rd party dependencies needed to make the services work? There are only 3 things in TrustGraph that might need an API key:

  • 3rd party LLM services like Anthropic, Cohere, Gemini, Mistral, OpenAI, etc.
  • 3rd party OCR like Mistral OCR
  • The API key you set for the TrustGraph API gateway

Everything else is included.

Quickstart

There's no need to clone this repo unless you want to build from source. TrustGraph is a fully containerized app that deploys as a set of Docker containers. To configure TrustGraph on the command line:

npx @trustgraph/config

The config process will generate an app config that can be run locally with Docker, Podman, or Minikube. The process will output:

  • deploy.zip containing either a docker-compose.yaml file for Docker/Podman or a resources.yaml file for Kubernetes
  • Deployment instructions as INSTALLATION.md

For a browser-based configuration, try the Configuration Terminal.

Watch: What is a Context Graph?

Watch: Context Graphs in Action with TrustGraph

Getting Started with TrustGraph

Workbench

The Workbench provides tools for all major TrustGraph features and is served on port 8888 by default.

  • Vector Search: Search the installed knowledge bases
  • Agentic, GraphRAG and LLM Chat: Chat interface for agents, GraphRAG queries, or direct to LLMs
  • Relationships: Analyze deep relationships in the installed knowledge bases
  • Graph Visualizer: 3D GraphViz of the installed knowledge bases
  • Library: Staging area for installing knowledge bases
  • Flow Classes: Workflow preset configurations
  • Flows: Create custom workflows and adjust LLM parameters during runtime
  • Knowledge Cores: Manage reusable knowledge bases
  • Prompts: Manage and adjust prompts during runtime
  • Schemas: Define custom schemas for structured data knowledge bases
  • Ontologies: Define custom ontologies for unstructured data knowledge bases
  • Agent Tools: Define tools with collections, knowledge cores, MCP connections, and tool groups
  • MCP Tools: Connect to MCP servers

TypeScript Library for UIs

There are 3 libraries for quick UI integration of TrustGraph services.

Context Cores

A Context Core is a portable, versioned bundle of context that you can ship between projects and environments, pin in production, and reuse across agents. It packages the “stuff agents need to know” (structured knowledge + embeddings + evidence + policies) into a single artifact, so you can treat context like code: build it, test it, version it, promote it, and roll it back. TrustGraph is built to support this kind of end-to-end context engineering and orchestration workflow.

What's inside a Context Core

A Context Core typically includes:

  • Ontology (your domain schema) and mappings
  • Context Graph (entities, relationships, supporting evidence)
  • Embeddings / vector indexes for fast semantic entry-point lookup
  • Source manifests + provenance (where facts came from, when, and how they were derived)
  • Retrieval policies (traversal rules, freshness, authority ranking)

Tech Stack

TrustGraph provides component flexibility to optimize agent workflows.

LLM APIs
  • Anthropic
  • AWS Bedrock
  • AzureAI
  • AzureOpenAI
  • Cohere
  • Google AI Studio
  • Google VertexAI
  • Mistral
  • OpenAI
LLM Orchestration
  • LM Studio
  • Llamafiles
  • Ollama
  • TGI
  • vLLM
Multi-model storage
  • Apache Cassandra
VectorDB
  • Qdrant
File and Object Storage
  • Garage
Observability
  • Prometheus
  • Grafana
  • Loki
Data Streaming
  • Apache Pulsar
Clouds
  • AWS
  • Azure
  • Google Cloud
  • OVHcloud
  • Scaleway

Observability & Telemetry

Once the platform is running, access the Grafana dashboard at:

http://localhost:3000

Default credentials are:

user: admin
password: admin

The default Grafana dashboard tracks the following:

Telemetry
  • LLM Latency
  • Error Rate
  • Service Request Rates
  • Queue Backlogs
  • Chunking Histogram
  • Error Source by Service
  • Rate Limit Events
  • CPU usage by Service
  • Memory usage by Service
  • Models Deployed
  • Token Throughput (Tokens/second)
  • Cost Throughput (Cost/second)

Contributing

Developer's Guide

License

TrustGraph is licensed under Apache 2.0.

Copyright 2024-2025 TrustGraph

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Support & Community

  • Bug Reports & Feature Requests: Discord
  • Discussions & Questions: Discord
  • Documentation: Docs