master -> 1.5 (README updates) (#552)

2026-07-01 09:29:38 +02:00 · 2025-10-11 11:46:03 +01:00 · 2025-10-11 11:46:03 +01:00 · 51107008fd
commit 51107008fd
parent ad35656811
18 changed files with 891 additions and 120 deletions
--- a/README.md
+++ b/README.md
@@ -1,30 +1,33 @@
-<img src="tg-adapter.png" width=100% />
-
 <div align="center">

-## The Sovereign Universal AI Adapter
+## The Agentic AI Platform for Enterprise Availability, Scalability, and Security

-[![PyPI version](https://img.shields.io/pypi/v/trustgraph.svg)](https://pypi.org/project/trustgraph/) [![Discord](https://img.shields.io/discord/1251652173201149994
-)](https://discord.gg/sQMwkRz5GX)
+<img src="product-platform-diagram.svg" width=100% />

-[Full Docs](https://docs.trustgraph.ai/docs/TrustGraph) | [YouTube](https://www.youtube.com/@TrustGraphAI?sub_confirmation=1) | [Configuration Builder](https://config-ui.demo.trustgraph.ai/) | [API Docs](docs/apis/README.md) | [CLI Docs](docs/cli/README.md) | [Discord](https://discord.gg/sQMwkRz5GX) | [Blog](https://blog.trustgraph.ai/subscribe)
+---
+
+[![PyPI version](https://img.shields.io/pypi/v/trustgraph.svg)](https://pypi.org/project/trustgraph/) ![E2E Tests](https://github.com/trustgraph-ai/trustgraph/actions/workflows/release.yaml/badge.svg)
+[![Discord](https://img.shields.io/discord/1251652173201149994
+)](https://discord.gg/sQMwkRz5GX) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/trustgraph-ai/trustgraph)
+
+[**Docs**](https://docs.trustgraph.ai) | [**YouTube**](https://www.youtube.com/@TrustGraphAI?sub_confirmation=1) | [**Configuration Builder**](https://config-ui.demo.trustgraph.ai/) | [**Discord**](https://discord.gg/sQMwkRz5GX) | [**Blog**](https://blog.trustgraph.ai/subscribe)

 </div>

-Take control of your data and AI future with **TrustGraph**. Universal connectors can call the latest LLMs or deploy models on your hardware. **TrustGraph** future-proofs your AI strategy with graph driven intelligence that can deploy in any environment.
-
---
+**TrustGraph** is an agentic AI platform built to meet enterprise demands for availability, scalability, and security. It meets these demands by combining the enterprise-grade data streaming platform [Apache Pulsar](https://github.com/apache/pulsar/) with knowledge graphs, structured data storage, VectorDBs, and MCP interoperability, all in a single containerized platform.

 <details>
 <summary>Table of Contents</summary>
 <br>

+- [**Key Features**](#key-features)<br>
 - [**Why TrustGraph?**](#why-trustgraph)<br>
+- [**Agentic MCP Demo**](#agentic-mcp-demo)<br>
 - [**Getting Started**](#getting-started)<br>
 - [**Configuration Builder**](#configuration-builder)<br>
- [**GraphRAG**](#graphrag)<br>
- [**Knowledge Packages**](#knowledge-packages)<br>
- [**Architecture**](#architecture)<br>
+- [**Context Engineering**](#context-engineering)<br>
+- [**Knowledge Cores**](#knowledge-cores)<br>
+- [**Platform Architecture**](#platform-architecture)<br>
 - [**Integrations**](#integrations)<br>
 - [**Observability & Telemetry**](#observability--telemetry)<br>
 - [**Contributing**](#contributing)<br>
@@ -33,40 +36,37 @@ Take control of your data and AI future with **TrustGraph**. Universal connector

 </details>

---
+## Key Features
+
+To meet the demands of enterprises, a platform needs to enable multi-tenancy, user and agentic access controls, data management, and total data privacy. TrustGraph enables these capabilities with:
+
+- **Flows and Flow Classes -> Multi-tenancy**. *Flow classes are sets of processing components that can be combined into logically separate flows for both users and agents.*
+- **Collections -> User/agent access controls and data management**. *Collections enable grouping data with custom labels that can be used for limiting data access to both users and agents. Collections can be added, deleted, and listed.*
+- **Tool Groups -> Multi-agent**. *Create groups for agent tools for multi-agent flows within a single deployment.*
+- **Knowledge Cores -> Data management and data privacy**. *Knowledge cores are modular and reusable components of knowledge graphs and vector embeddings that can serve as "long-term memory".*
+- **Fully Containerized Platform with Private Model Serving -> Total data privacy**. *The entire TrustGraph platform can be deployed in any environment while managing the deployment of private LLMs for total data sovereignty.*
+- **No-LLM Knowledge Graph Retrieval -> Deterministic Natural Language Graph Retrieval**. *TrustGraph does **not** use LLMs for knowledge graph retrieval. Natural language queries use semantic similarity search as the basis for building graph queries without LLMs, enabling true graph-enhanced agentic flows.*

 ## Why TrustGraph?

-If you want to build powerful, intelligent AI applications without getting bogged down by complex infrastructure, brittle data pipelines, or opaque "black box" systems, TrustGraph is the platform that accelerates your AI transformation by solving these core problems.
+[![Why TrustGraph?](https://img.youtube.com/vi/Norboj8YP2M/maxresdefault.jpg)](https://www.youtube.com/watch?v=Norboj8YP2M)

- **Go Beyond Basic RAG with GraphRAG**: Stop building agents that just retrieve text snippets. TrustGraph provides the tooling to automatically build and query Knowledge Graphs combined with Vector Embeddings, enabling you to create applications with deep contextual reasoning and higher accuracy.
- **Decouple Your App from the AI Stack**: Our modular, containerized architecture lets you deploy anywhere (Docker, K8s, bare-metal) and swap out components (LLMs, vector DBs, graph DBs) without re-architecting your core application. Write your app once, knowing the underlying AI stack can evolve.
- **Automate the Knowledge Pipeline**: Focus on building your application's logic, not on writing ETL scripts for AI. TrustGraph provides a unified platform to ingest data from silos, transform it into structured Knowledge Packages, and deliver it to your AI – streamlining the entire "knowledge supply chain."
- **Enjoy Full Transparency & Control**: As an open-source platform, you get complete visibility into the system's inner workings. Debug more effectively, customize components to your needs, and maintain total control over your application's data flow and security, eliminating vendor lock-in.
+## Agentic MCP Demo
+
+[![Agentic MCP Demo](https://img.youtube.com/vi/mUCL1b1lmbA/maxresdefault.jpg)](https://www.youtube.com/watch?v=mUCL1b1lmbA)

 ## Getting Started

-This is a very-quickstart.  See [other installation options](docs/README.md#ways-to-deploy).
+- [**Quickstart Guide**](https://docs.trustgraph.ai/getting-started/)
+- [**Configuration Builder**](#configuration-builder)
+- [**Workbench**](#workbench)
+- [**Developer APIs and CLI**](https://docs.trustgraph.ai/reference/)
+- [**Example Notebooks**](https://github.com/trustgraph-ai/example-notebooks)
+- [**Deployment Guide**](https://docs.trustgraph.ai/deployment/)

- [Configuration Builder](#configuration-builder)
- [Install the CLI](#install-the-trustgraph-cli)
- [Test Suite](#test-suite)
+### Watch TrustGraph 101

-### Developer APIs and CLI
-
- [**REST API**](docs/apis/README.md#rest-apis)
- [**Websocket API**](docs/apis/README.md#websocket-api)
- [**Python SDK**](https://trustgraph.ai/docs/api/apistarted)
- [**TrustGraph CLI**](https://trustgraph.ai/docs/running/cli)
-
-### Install the TrustGraph CLI
-
-```
-pip3 install trustgraph-cli==<trustgraph-version>
-```
-
-> [!CAUTION]
-> The `trustgraph-cli` version *must* match the selected **TrustGraph** release version. 
+[![TrustGraph 101](https://img.youtube.com/vi/rWYl_yhKCng/maxresdefault.jpg)](https://www.youtube.com/watch?v=rWYl_yhKCng)

 ## Configuration Builder

@@ -74,55 +74,51 @@ The [**Configuration Builder**](https://config-ui.demo.trustgraph.ai/) assembles

 - **Version**: Select the version of TrustGraph you'd like to deploy
 - **Component Selection**: Choose from the available deployment platforms, LLMs, graph store, VectorDB, chunking algorithm, chunking parameters, and LLM parameters
- **Customization**: Customize the prompts for the LLM System, Data Extraction Agents, and Agent Flow
+- **Customization**: Enable OCR pipelines and custom embeddings models
 - **Finish Deployment**: Download the launch `YAML` files with deployment instructions

-### Test Suite
+## Workbench

-If added to the build in the **Configuration Builder**, the **Test Suite** will be available at port `8888`. The **Test Suite** has the following capabilities:
+The **Workbench** is a UI that provides tools for interacting with all major features of the platform. The **Workbench** is enabled by default in the **Configuration Builder** and is available at port `8888` on deployment. The **Workbench** has the following capabilities:

- **GraphRAG Chat**: GraphRAG queries in a chat interface
- **Vector Search**: Semantic similarity search with cosine similarity scores
- **Semantic Relationships**: See semantic relationships in a list structure
- **Graph Visualizer**: Visualize semantic relationships in **3D**
- **Data Loader**: Directly load `.pdf`, `.txt`, or `.md` into the system with document metadata
+- **Agentic, GraphRAG and LLM Chat**: Chat interface for agentic flows, GraphRAG queries, or directly interfacing with an LLM
+- **Semantic Discovery**: Analyze semantic relationships with vector search, knowledge graph relationships, and 3D graph visualization
+- **Data Management**: Load data into the **Librarian** for processing, create and upload **Knowledge Packages**
+- **Flow Management**: Create and delete processing flow patterns
+- **Prompt Management**: Edit all LLM prompts used in the platform during runtime
+- **Agent Tools**: Define tools used by the Agent Flow including MCP tools
+- **MCP Tools**: Connect to MCP servers

-## GraphRAG
+## Context Engineering

-TrustGraph features an advanced GraphRAG approach that automatically constructs Knowledge Graphs with mapped Vector Embeddings to provide richer and more accurate context to LLMs for trustworthy agents.
-
-**How TrustGraph's GraphRAG Works:**
+TrustGraph features a complete context engineering solution combining the power of Knowledge Graphs and VectorDBs. Connect your data to automatically construct Knowledge Graphs with mapped Vector Embeddings to deliver richer and more accurate context to LLMs for trustworthy agents.

 - **Automated Knowledge Graph Construction:** Data Transformation Agents process source data to automatically **extract key entities, topics, and the relationships** connecting them. Vector embeddings are then mapped to these semantic relationships for context retrieval.
- **Hybrid Retrieval:** When an agent needs to perform deep research, it first performs a **cosine similarity search** on the vector embeddings to identify potentially relevant concepts and relationships within the knowledge graph. This initial vector search **pinpoints relevant entry points** within the structured Knowledge Graph.
+- **Deterministic Graph Retrieval:** Semantic relationships are retrieved from the knowledge graph *without* the use of LLMs. When an agent needs to perform deep research, it first performs a **cosine similarity search** on the vector embeddings to identify potentially relevant concepts and relationships within the knowledge graph. This initial vector search **pinpoints relevant entry points** within the structured Knowledge Graph. These entry points are then built into graph queries, again *without* LLMs, that retrieve the relevant subgraphs.
 - **Context Generation via Subgraph Traversal:** Based on the ranked results from the similarity search, agents are provided with only the relevant subgraphs for **deep context**. Users can configure the **number of 'hops'** (relationship traversals) to extend the depth of knowledge available to the agents. This structured **subgraph**, containing entities and their relationships, forms a highly relevant and context-aware input prompt for the LLM, with configurable limits on the number of entities, relationships, and overall subgraph size.
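
The retrieval flow described in the list above can be illustrated with a small, self-contained sketch. This is not TrustGraph's implementation: the function names (`cosine`, `entry_points`, `subgraph`), the in-memory data layout, and the `top_k`, `hops`, and `max_edges` parameters are all assumptions made for illustration. It only shows the general idea of ranking entities by cosine similarity and then collecting edges within a fixed number of hops, with no LLM involved.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def entry_points(query_vec, embeddings, top_k=2):
    """Rank entities by similarity to the query and keep the best top_k."""
    ranked = sorted(
        embeddings,
        key=lambda entity: cosine(query_vec, embeddings[entity]),
        reverse=True,
    )
    return ranked[:top_k]


def subgraph(edges, seeds, hops=1, max_edges=50):
    """Breadth-first collection of edges within `hops` traversals of the seeds."""
    visited = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    selected = []
    while frontier:
        node, depth = frontier.popleft()
        if depth >= hops:
            continue  # hop budget spent for this node
        for edge in edges:
            subject, _, obj = edge
            if node not in (subject, obj) or edge in selected:
                continue
            if len(selected) >= max_edges:
                return selected  # overall subgraph size limit reached
            selected.append(edge)
            neighbour = obj if node == subject else subject
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return selected


# Hypothetical toy data: entity -> embedding, and (subject, predicate, object) edges.
embeddings = {"pump": [1.0, 0.0], "valve": [0.9, 0.1], "invoice": [0.0, 1.0]}
edges = [
    ("pump", "feeds", "valve"),
    ("valve", "controls", "tank"),
    ("invoice", "bills", "customer"),
]

seeds = entry_points([1.0, 0.05], embeddings, top_k=1)
context = subgraph(edges, seeds, hops=2)
```

Raising `hops` widens the context handed to the LLM, while `max_edges` caps the overall subgraph size, mirroring the configuration options mentioned above.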

-## Knowledge Packages
+## Knowledge Cores

-One of the biggest challenges currently facing RAG architectures is the ability to quickly reuse and integrate knowledge sets. **TrustGraph** solves this problem by storing the results of the data ingestion process in reusable Knowledge Packages. Being able to store and reuse the Knowledge Packages means the data transformation process has to be run only once. These reusable Knowledge Packages can be loaded back into **TrustGraph** and used for GraphRAG.
+One of the biggest challenges currently facing RAG architectures is the ability to quickly reuse and integrate knowledge sets like long-term memory for LLMs. **TrustGraph** solves this problem by storing the results of the data ingestion process in reusable Knowledge Cores. Being able to store and reuse the Knowledge Cores means the data transformation process has to be run only once. These reusable Knowledge Cores can be loaded back into **TrustGraph** and used for GraphRAG. Some sample knowledge cores are available for download [here](https://github.com/trustgraph-ai/catalog/tree/master/v3).

-A Knowledge Package has two components:
+A Knowledge Core has two components:

 - Set of Graph Edges
 - Set of mapped Vector Embeddings

-When a Knowledge Package is loaded into TrustGraph, the corresponding graph edges and vector embeddings are queued and loaded into the chosen graph and vector stores.
+When a Knowledge Core is loaded into TrustGraph, the corresponding graph edges and vector embeddings are queued and loaded into the chosen graph and vector stores.
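
As a rough illustration of the two-component structure and the queued load described above, the sketch below models a core as a set of graph edges plus mapped embeddings and pushes both through queues into stand-in stores. The `KnowledgeCore` class, `enqueue_core`, `drain`, and the in-memory stores are hypothetical names chosen for this example. They are not the TrustGraph API, and a real deployment would use Pulsar queues and the configured graph and vector stores.

```python
from dataclasses import dataclass, field
from queue import Queue


@dataclass
class KnowledgeCore:
    """Hypothetical container for the two components of a core."""
    edges: list = field(default_factory=list)       # (subject, predicate, object)
    embeddings: dict = field(default_factory=dict)  # entity -> vector


def enqueue_core(core, edge_queue, vector_queue):
    """Queue every edge and every embedding for loading."""
    for edge in core.edges:
        edge_queue.put(edge)
    for entity, vector in core.embeddings.items():
        vector_queue.put((entity, vector))


def drain(queue, store_fn):
    """Consume queued items into a store; returns the number of items loaded."""
    count = 0
    while not queue.empty():
        store_fn(queue.get())
        count += 1
    return count


# Stand-ins for the chosen graph and vector stores.
graph_store = set()
vector_store = {}

core = KnowledgeCore(
    edges=[("pump", "feeds", "valve")],
    embeddings={"pump": [1.0, 0.0], "valve": [0.9, 0.1]},
)

edge_queue, vector_queue = Queue(), Queue()
enqueue_core(core, edge_queue, vector_queue)
edges_loaded = drain(edge_queue, graph_store.add)
vectors_loaded = drain(vector_queue, lambda item: vector_store.update([item]))
```

Because the core already holds the extracted edges and embeddings, loading it is a replay into the stores and does not rerun the data transformation step.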

-## Architecture
-
-The platform contains the services, stores, control plane, and API gateway needed to connect your data to intelligent agents.
-
-![architecture](TG-platform-diagram.svg)
+## Platform Architecture

 The platform orchestrates a comprehensive suite of services to transform external data into intelligent, actionable outputs for AI agents and users. It interacts with external data sources and external services (like LLM APIs) via an **API Gateway**.

 Within the **TrustGraph** Platform, the services are grouped as follows:

- **Data Orchestration:** This crucial set of services manages the entire lifecycle of ingesting and preparing data to become AI-ready knowledge. It includes **Data Ingest** capabilities for various data types, a *Data Librarian* for managing and cataloging this information, *Data Transformation* services to clean, structure, and refine raw data, and ultimately produces consumable *Knowledge Packages* – the structured, enriched knowledge artifacts for AI.
+- **Data Orchestration:** This crucial set of services manages the entire lifecycle of ingesting and preparing data to become AI-ready knowledge. It includes **Data Ingest** capabilities for various data types, a *Data Librarian* for managing and cataloging this information, *Data Transformation* services to clean, structure, and refine raw data, and ultimately produces consumable *Knowledge Cores* – the structured, enriched knowledge artifacts for AI.
 - **Data Storage:** The platform relies on a flexible storage layer designed to handle the diverse needs of AI applications. This includes dedicated storage for *Knowledge Graphs* (to represent interconnected relationships), *VectorDBs* (for efficient semantic similarity search on embeddings), and *Tabular Datastores* (for structured data).
- **Intelligence Orchestration:** This is the core reasoning engine of the platform. It leverages the structured knowledge from the Storage layer to perform *Deep Knowledge Retrieval* (advanced search and context discovery beyond simple keyword matching) and facilitate *Agentic Thinking*, enabling AI agents to process information and form complex responses or action plans.
+- **Context Orchestration:** This is the core reasoning engine of the platform. It leverages the structured knowledge from the Storage layer to perform *Deep Knowledge Retrieval* (advanced search and context discovery beyond simple keyword matching) and facilitate *Agentic Thinking*, enabling AI agents to process information and form complex responses or action plans.
 - **Agent Orchestration:** This group of services is dedicated to managing and empowering the AI agents themselves. The *Agent Manager* handles the lifecycle, configuration, and operation of agents, while *Agent Tools* provide a framework or library of capabilities that agents can utilize to perform actions or interact with other systems.
- **Model Orchestration:** This layer is responsible for the deployment, management, and operationalization of the various AI models TrustGraph uses or provides to agents. This includes *LLM Deployment*, *Embeddings Deployment*, and *OCR Deployment*. Crucially, it features *Cross Hardware Support*, indicating the platform's ability to run these models across diverse computing environments.
+- **Private Model Serving:** This layer is responsible for the deployment, management, and operationalization of the various AI models TrustGraph uses or provides to agents. This includes *LLM Deployment*, *Embeddings Deployment*, and *OCR Deployment*. Crucially, it features *Cross Hardware Support*, indicating the platform's ability to run these models across diverse computing environments.
 - **Prompt Management:** Effective interaction with AI, especially LLMs and agents, requires precise instruction. This service centralizes the management of all prompt types: *LLM System Prompts* (to define an LLM's persona or core instructions), *Data Transformation Prompts* (to guide AI in structuring data), **RAG Context** generation (providing relevant intelligence to LLMs), and *Agent Definitions* (the core instructions and goals for AI agents).
 - **Platform Services:** These foundational services provide the essential operational backbone for the entire TrustGraph platform, ensuring it runs securely, reliably, and efficiently. This includes *Access Controls* (for security and permissions), *Secrets Management* (for handling sensitive credentials), *Logging* (for audit and diagnostics), *Observability* (for monitoring platform health and performance), *Realtime Cost Observability* (for tracking resource consumption expenses), and *Hardware Resource Management* (for optimizing the use of underlying compute).

@@ -197,44 +193,11 @@ TrustGraph provides maximum flexibility so your agents are always powered by the
 - Azure<br>
 - Google Cloud<br>
 - Intel Tiber Cloud<br>
+- OVHcloud<br>
 - Scaleway<br>

 </details>

-### Pulsar Control Plane
-
- For flows, Pulsar accepts the output of a processing module and queues it for input to the next subscribed module.
- For services such as LLMs and embeddings, Pulsar provides a client/server model.  A Pulsar queue is used as the input to the service.  When processed, the output is then delivered to a separate queue where a client subscriber can request that output.
-
-PDF file:
-```
-tg-load-pdf <document.pdf>
-```
-
-Text or Markdown file:
-```
-tg-load-text <document.txt>
-```
-
-### GraphRAG Queries
-
-Once the knowledge graph and embeddings have been built or a cognitive core has been loaded, RAG queries are launched with a single line:
-
-```
-tg-invoke-graph-rag -q "What are the top 3 takeaways from the document?"
-```
-
-### Agent Flow
-
-Invoking the Agent Flow will use a ReAct style approach the combines Graph RAG and text completion requests to think through a problem solution.
-
-```
-tg-invoke-agent -v -q "Write a blog post on the top 3 takeaways from the document."
-```
-
-> [!TIP]
-> Adding `-v` to the agent request will return all of the agent manager's thoughts and observations that led to the final response.
-
 ## Observability & Telemetry

 Once the platform is running, access the Grafana dashboard at:
@@ -252,22 +215,28 @@ password: admin

 The default Grafana dashboard tracks the following:

- LLM Latency
- Error Rate
- Service Request Rates
- Queue Backlogs
- Chunking Histogram
- Error Source by Service
- Rate Limit Events
- CPU usage by Service
- Memory usage by Service
- Models Deployed
- Token Throughput (Tokens/second)
- Cost Throughput (Cost/second)
+<details>
+<summary>Telemetry</summary>
+<br>
+
+- LLM Latency<br>
+- Error Rate<br>
+- Service Request Rates<br>
+- Queue Backlogs<br>
+- Chunking Histogram<br>
+- Error Source by Service<br>
+- Rate Limit Events<br>
+- CPU usage by Service<br>
+- Memory usage by Service<br>
+- Models Deployed<br>
+- Token Throughput (Tokens/second)<br>
+- Cost Throughput (Cost/second)<br>
+   
+</details>

 ## Contributing

-[Developing for TrustGraph](docs/README.development.md)
+[Developer's Guide](https://docs.trustgraph.ai/community/developer.html)

 ## License

@@ -290,4 +259,4 @@ The default Grafana dashboard tracks the following:
 ## Support & Community
 - Bug Reports & Feature Requests: [Discord](https://discord.gg/sQMwkRz5GX)
 - Discussions & Questions: [Discord](https://discord.gg/sQMwkRz5GX)
- Documentation: [Docs](https://docs.trustgraph.ai/docs/getstarted)
+- Documentation: [Docs](https://docs.trustgraph.ai/)