<img src="TG-ship.jpg" width=100% />

<div align="center">

## The Agent Intelligence Platform

[![PyPI version](https://img.shields.io/pypi/v/trustgraph.svg)](https://pypi.org/project/trustgraph/) [![Discord](https://img.shields.io/discord/1251652173201149994)](https://discord.gg/sQMwkRz5GX)

📑 [Full Docs](https://docs.trustgraph.ai/docs/TrustGraph) 📺 [YouTube](https://www.youtube.com/@TrustGraphAI?sub_confirmation=1) 🔧 [Configuration Builder](https://config-ui.demo.trustgraph.ai/) ⚙️ [API Docs](docs/apis/README.md) 🧑‍💻 [CLI Docs](https://docs.trustgraph.ai/docs/running/cli) 💬 [Discord](https://discord.gg/sQMwkRz5GX) 📖 [Blog](https://blog.trustgraph.ai/subscribe)

</div>

Define and deploy trustworthy, intelligent AI agents. **TrustGraph** overcomes the "black box" limitations of other platforms by providing a transparent, deploy-anywhere solution with sophisticated GraphRAG that grounds agent responses in access-controlled, modular knowledge packages built from your data.

---

<details>
<summary>Table of Contents</summary>
<br>

- 🎯 [**Why TrustGraph?**](#-why-trustgraph)<br>
- 🚀 [**Getting Started**](#-getting-started)<br>
- 🔧 [**Configuration Builder**](#-configuration-builder)<br>
- 🔎 [**GraphRAG**](#-graphrag)<br>
- 🧠 [**Knowledge Packages**](#-knowledge-packages)<br>
- 📐 [**Architecture**](#-architecture)<br>
- 🧩 [**Integrations**](#-integrations)<br>
- 📊 [**Observability & Telemetry**](#-observability--telemetry)<br>
- 🤝 [**Contributing**](#-contributing)<br>
- 📄 [**License**](#-license)<br>
- 📞 [**Support & Community**](#-support--community)<br>

</details>

---

## 🎯 Why TrustGraph?

* **Unified Knowledge:** Define and deploy complete knowledge environments, including models, dependencies, and tooling, as a single, manageable unit.
* **No-code TrustRAG Pipelines:** Deploy full end-to-end RAG pipelines using unique TrustGraph algorithms that leverage both Knowledge Graphs and VectorDBs.
* **Environment-Agnostic Deployment:** Provision consistently across diverse infrastructures (Cloud, On-Prem, Edge, Dev environments). Build once, provision anywhere.
* **Trusted & Secure Delivery:** Focuses on providing a secure supply chain for AI components.
* **Simplified Operations:** Reduce the complexity and time required to stand up and manage sophisticated AI stacks. Get operational faster.
* **Open Source & Extensible:** Built with transparency and community collaboration in mind. Easily inspect, modify, and extend the platform to meet your specific provisioning needs.
* **Component Flexibility:** Avoid component lock-in. TrustGraph integrates multiple options for all system components.

## 🚀 Getting Started

- [Install the CLI](#install-the-trustgraph-cli)
- [Configuration Builder](#-configuration-builder)
- [Platform Restarts](#platform-restarts)
- [Test Suite](#test-suite)
- [Example Notebooks](#example-trustgraph-notebooks)

### Developer APIs and CLI

- [**REST API**](docs/apis/README.md#rest-apis)
- [**Websocket API**](docs/apis/README.md#websocket-api)
- [**Python SDK**](https://trustgraph.ai/docs/api/apistarted)
- [**TrustGraph CLI**](https://trustgraph.ai/docs/running/cli)

See the [API Developer's Guide](docs/apis/README.md) for more information.

For users, **TrustGraph** has the following interfaces:

- [**Configuration Builder**](#-configuration-builder)
- [**Test Suite**](#test-suite)

The `trustgraph-cli` package installs the Python SDK along with the commands for interacting with a running TrustGraph deployment. The **Configuration Builder** customizes a TrustGraph deployment before launch. The **REST API** is served on port `8088` of the TrustGraph host machine and uses JSON request and response bodies.
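
As a rough illustration of the REST interface, the sketch below builds a JSON `POST` request aimed at port `8088`. The endpoint path `/api/v1/graph-rag` and the `query` field are assumptions made for this example and are not confirmed by this README; the helper names `build_request` and `post_json` are likewise hypothetical. Check the [API Docs](docs/apis/README.md) for the actual routes and payloads.

```python
import json
import urllib.request

# Assumed values for illustration only: this README confirms port 8088 and
# JSON bodies, but not this path or the "query" field name.
API_BASE = "http://localhost:8088"
GRAPH_RAG_PATH = "/api/v1/graph-rag"  # hypothetical endpoint path


def build_request(path: str, payload: dict, base: str = API_BASE) -> urllib.request.Request:
    """Build a JSON POST request for the TrustGraph REST gateway."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, base: str = API_BASE, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body.

    Needs a running TrustGraph deployment reachable at `base`.
    """
    request = build_request(path, payload, base)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a deployment running locally, a call such as `post_json(GRAPH_RAG_PATH, {"query": "What are the top 3 takeaways?"})` would send the request; the shape of the response depends on the real API.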

### Install the TrustGraph CLI

```
pip3 install trustgraph-cli==<trustgraph-version>
```

> [!CAUTION]
> The `trustgraph-cli` version *must* match the selected **TrustGraph** release version.

## 🔧 Configuration Builder

TrustGraph is customized by editing its `YAML` resource files. The **Configuration Builder** builds a custom configuration that deploys with your selected orchestration method in your target environment.

- [**Configuration Builder** 🚀](https://config-ui.demo.trustgraph.ai/)

The **Configuration Builder** has five sections:

- 🚢 **TrustGraph Version**: Select the version of TrustGraph you'd like to deploy
- **Component Selection**: Choose from the available deployment platforms, LLMs, graph store, VectorDB, chunking algorithm, chunking parameters, and LLM parameters
- 🧰 **Customization**: Customize the prompts for the LLM System, Data Extraction Agents, and Agent Flow
- 🕵️ **Test Suite**: Add the **Test Suite** to the configuration, served on port `8888`
- 🚀 **Finish Deployment**: Download the launch `YAML` files with deployment instructions

The **Configuration Builder** generates the `YAML` files in `deploy.zip`. After downloading and unzipping `deploy.zip`, launch TrustGraph by navigating to the `deploy` directory and running:

```
docker compose up -d
```

> [!TIP]
> Docker is the recommended container orchestration platform for getting started with TrustGraph.

When finished, shut down TrustGraph with:

```
docker compose down -v
```

### Platform Restarts

The `-v` flag destroys all data on shutdown. To restart the system with its data intact, the volumes must be kept, so shut down without the `-v` flag:

```
docker compose down
```

With the volumes preserved, restart the system with:

```
docker compose up -d
```

All data previously in TrustGraph is preserved and usable after the restart.

### Test Suite

If added to the build in the **Configuration Builder**, the **Test Suite** will be available at port `8888`. The **Test Suite** has the following capabilities:

- **Graph RAG Chat** 💬: Graph RAG queries in a chat interface
- **Vector Search** 🔎: Semantic similarity search with cosine similarity scores
- **Semantic Relationships** 🕵️: See semantic relationships in a list structure
- **Graph Visualizer** 🌐: Visualize semantic relationships in **3D**
- **Data Loader** 📂: Directly load `.pdf`, `.txt`, or `.md` into the system with document metadata

### Example TrustGraph Notebooks
- [**REST API Notebooks**](https://github.com/trustgraph-ai/example-notebooks/tree/master/api-examples)
- [**Python SDK Notebooks**](https://github.com/trustgraph-ai/example-notebooks/tree/master/api-library)

TrustGraph is fully containerized and is launched with a `YAML` configuration file. Unzipping `deploy.zip` creates the `deploy` directory with the following subdirectories:

- `docker-compose`
- `minikube-k8s`
- `gcp-k8s`

> [!NOTE]
> As more integrations have been added, the number of possible combinations of configurations has become quite large. It is recommended to use the `Configuration Builder` to build your deployment configuration. Each directory contains `YAML` configuration files for the default component selections.

**Docker**:

```
docker compose -f <launch-file.yaml> up -d
```

**Kubernetes**:

```
kubectl apply -f <launch-file.yaml>
```

TrustGraph is designed to be modular to support as many LLMs and environments as possible. A natural fit for a modular architecture is to decompose functions into a set of modules connected through a pub/sub backbone. [Apache Pulsar](https://github.com/apache/pulsar/) serves as this pub/sub backbone, acting as the data broker that manages the data processing queues connected to the processing modules.

## 🔎 GraphRAG

TrustGraph incorporates **TrustRAG**, an advanced RAG approach that leverages automatically constructed Knowledge Graphs to provide richer and more accurate context to LLMs. Instead of relying solely on unstructured text chunks, TrustRAG understands and utilizes the relationships *between* pieces of information.

**How TrustGraph's GraphRAG Works:**

1. **Automated Knowledge Graph Construction:**
    * TrustGraph processes source data to automatically **extract key entities, topics, and the relationships** connecting them.
    * It then maps these extracted **semantic relationships and concepts to high-dimensional vector embeddings**, capturing the nuanced meaning beyond simple keyword matching.
2. **Hybrid Retrieval Process:**
    * When a query is received, TrustRAG first performs a **cosine similarity search** on the vector embeddings to identify potentially relevant concepts and relationships within the knowledge graph.
    * This initial vector search **pinpoints relevant entry points** within the structured Knowledge Graph.
3. **Context Generation via Subgraph Traversal:**
    * Based on the ranked results from the similarity search, TrustRAG dynamically **generates relevant subgraphs**.
    * It starts from the identified entry points and traverses the connections within the Knowledge Graph. Users can configure the **number of 'hops'** (relationship traversals) to expand the contextual window, gathering interconnected information.
    * This structured **subgraph**, containing entities and their relationships, forms a highly relevant and context-aware input prompt for the LLM. The prompt is configurable through limits on the number of entities, relationships, and overall subgraph size.
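
The retrieval steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not TrustGraph's implementation: the function names, the dictionary of embeddings, the `(subject, predicate, object)` edge tuples, and the `top_k`, `max_hops`, and `max_edges` parameters are all hypothetical stand-ins for the vector store, graph store, and configuration options.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def entry_points(query_vector, embeddings, top_k):
    """Rank entities by similarity to the query and keep the best top_k."""
    ranked = sorted(
        embeddings,
        key=lambda entity: cosine_similarity(query_vector, embeddings[entity]),
        reverse=True,
    )
    return ranked[:top_k]


def subgraph(edges, starts, max_hops, max_edges):
    """Breadth-first traversal from the entry points, bounded by hops and size."""
    adjacency = {}
    for edge in edges:
        subject, _, obj = edge
        adjacency.setdefault(subject, []).append(edge)
        adjacency.setdefault(obj, []).append(edge)

    seen_nodes = set(starts)
    seen_edges = set()
    collected = []
    frontier = deque((node, 0) for node in starts)
    while frontier and len(collected) < max_edges:
        node, depth = frontier.popleft()
        if depth >= max_hops:
            continue  # this node is already at the hop limit
        for edge in adjacency.get(node, []):
            if edge in seen_edges:
                continue
            seen_edges.add(edge)
            collected.append(edge)
            if len(collected) >= max_edges:
                break
            for neighbor in (edge[0], edge[2]):
                if neighbor not in seen_nodes:
                    seen_nodes.add(neighbor)
                    frontier.append((neighbor, depth + 1))
    return collected
```

The collected edges would then be serialized into the prompt context; raising `max_hops` widens the context window at the cost of a larger prompt.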
## 🧠 Knowledge Packages

One of the biggest challenges facing RAG architectures is quickly reusing and integrating knowledge sets. **TrustGraph** solves this problem by storing the results of the data ingestion process in reusable Knowledge Packages. Because Knowledge Packages can be stored and reused, the data transformation process has to run only once. These reusable Knowledge Packages can be loaded back into **TrustGraph** and used for GraphRAG.

A Knowledge Package has two components:

- Set of Graph Edges
- Set of mapped Vector Embeddings

When a Knowledge Package is loaded into TrustGraph, the corresponding graph edges and vector embeddings are queued and loaded into the chosen graph and vector stores.
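
To make the two components concrete, here is a minimal sketch of a Knowledge Package as a data structure, with a queue-then-load step mirroring the description above. The class, field names, and the in-memory `set` and `dict` standing in for the graph and vector stores are assumptions for illustration; the actual package format is not described in this README.

```python
from collections import deque
from dataclasses import dataclass, field

Edge = tuple[str, str, str]  # assumed shape: (subject, predicate, object)


@dataclass
class KnowledgePackage:
    """Hypothetical in-memory view of a package: edges plus mapped embeddings."""
    edges: list[Edge] = field(default_factory=list)
    embeddings: dict[str, list[float]] = field(default_factory=dict)


def enqueue(package: KnowledgePackage) -> deque:
    """Queue every edge and embedding for loading."""
    pending = deque()
    for edge in package.edges:
        pending.append(("edge", edge))
    for entity, vector in package.embeddings.items():
        pending.append(("embedding", (entity, vector)))
    return pending


def drain(pending: deque, graph_store: set, vector_store: dict) -> None:
    """Move queued items into the stand-in graph and vector stores."""
    while pending:
        kind, item = pending.popleft()
        if kind == "edge":
            graph_store.add(item)
        else:
            entity, vector = item
            vector_store[entity] = vector
```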
## 📐 Architecture

TrustGraph provides all the services, stores, control plane, and API gateway needed to connect your data to intelligent agents.

![architecture](TG-platform-diagram.svg)

## 🧩 Integrations

TrustGraph provides maximum flexibility so your agents are always powered by the latest and greatest components.

- LLM APIs: **Anthropic**, **AWS Bedrock**, **AzureAI**, **AzureOpenAI**, **Cohere**, **Google AI Studio**, **Google VertexAI**, **Mistral**, and **OpenAI**
- LLM Orchestration: **LM Studio**, **Llamafiles**, **Ollama**, **TGI**, and **vLLM**
- Vector Databases: **Qdrant**, **Pinecone**, and **Milvus**
- Knowledge Graphs: **Memgraph**, **Neo4j**, and **FalkorDB**
- Data Stores: **Apache Cassandra**
- Observability: **Prometheus** and **Grafana**
- Control Plane: **Apache Pulsar**
- Clouds: **AWS**, **Azure**, **Google Cloud**, **Scaleway**, and **Intel Tiber Cloud**

### Pulsar Control Plane

- For flows, Pulsar accepts the output of a processing module and queues it for input to the next subscribed module.
- For services such as LLMs and embeddings, Pulsar provides a client/server model. A Pulsar queue is used as the input to the service. Once a request is processed, the output is delivered to a separate queue from which a client subscriber can retrieve it.
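
The two messaging patterns can be illustrated with an in-memory stand-in. The sketch below uses Python's standard `queue` module rather than the Pulsar client, so it shows only the shape of the data flow; `run_flow`, `QueueService`, and the request-id matching are hypothetical names and choices for this example, not TrustGraph or Pulsar APIs.

```python
import queue
import uuid


def run_flow(documents, chunker, extractor):
    """Flow pattern: one module's output is queued as the next module's input."""
    chunk_queue = queue.Queue()
    for document in documents:
        for chunk in chunker(document):
            chunk_queue.put(chunk)
    results = []
    while not chunk_queue.empty():
        results.append(extractor(chunk_queue.get()))
    return results


class QueueService:
    """Client/server pattern: a request queue in, a separate response queue out."""

    def __init__(self, handler):
        self.requests = queue.Queue()
        self.responses = queue.Queue()
        self.handler = handler

    def submit(self, payload):
        """Client side: queue a request and return its id."""
        request_id = str(uuid.uuid4())
        self.requests.put((request_id, payload))
        return request_id

    def process_pending(self):
        """Service side: consume requests and publish outputs."""
        while not self.requests.empty():
            request_id, payload = self.requests.get()
            self.responses.put((request_id, self.handler(payload)))

    def fetch(self, request_id):
        """Client side: pick out the matching response, re-queue the rest."""
        unmatched = []
        found = False
        result = None
        while not self.responses.empty():
            response_id, value = self.responses.get()
            if response_id == request_id and not found:
                result, found = value, True
            else:
                unmatched.append((response_id, value))
        for item in unmatched:
            self.responses.put(item)
        if not found:
            raise KeyError(request_id)
        return result
```

In a real deployment the broker, not the caller, decouples producers from consumers, which is what lets modules be swapped or scaled independently.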
### Data Transformation Agents

TrustGraph transforms data into an ultra-dense knowledge graph using 3 autonomous data transformation agents. Each agent focuses on one of the elements needed to build the knowledge graph. The agents are:

- Topic Extraction Agent
- Entity Extraction Agent
- Relationship Extraction Agent

The agent prompts are built through templates, enabling customized data extraction agents for a specific use case. The data extraction agents are launched automatically with the loader commands.

PDF file:

```
tg-load-pdf <document.pdf>
```

Text or Markdown file:

```
tg-load-text <document.txt>
```
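
When loading many files, the choice between the two loader commands can be scripted. The sketch below is a hypothetical helper that picks `tg-load-pdf` or `tg-load-text` from the file extension and passes only the document path, exactly as in the commands above; the function names and the extension table are assumptions for this example.

```python
import subprocess
from pathlib import Path

# Extension-to-command table based on the two loader commands shown above.
LOADERS = {
    ".pdf": "tg-load-pdf",
    ".txt": "tg-load-text",
    ".md": "tg-load-text",
}


def loader_command(document):
    """Return the argv list for loading `document`, chosen by file extension."""
    suffix = Path(document).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"no loader for file type: {suffix or '(none)'}")
    return [LOADERS[suffix], str(document)]


def load_document(document):
    """Run the loader; needs trustgraph-cli installed and TrustGraph running."""
    subprocess.run(loader_command(document), check=True)
```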
### GraphRAG Queries

Once the knowledge graph and embeddings have been built or a Knowledge Package has been loaded, RAG queries are launched with a single line:

```
tg-invoke-graph-rag -q "What are the top 3 takeaways from the document?"
```

### Agent Flow

Invoking the Agent Flow uses a ReAct-style approach that combines Graph RAG and text completion requests to think through a solution to the problem.

```
tg-invoke-agent -v -q "Write a blog post on the top 3 takeaways from the document."
```

> [!TIP]
> Adding `-v` to the agent request will return all of the agent manager's thoughts and observations that led to the final response.
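
For readers unfamiliar with the ReAct pattern, the loop below is a generic, heavily simplified sketch: a decision step either picks a tool and an input or returns a final answer, and each tool result is recorded as an observation. It is not the TrustGraph agent manager. The `decide` callback, the tool names, and the stub tools are hypothetical; in a real agent the decision step is made by an LLM.

```python
def react_agent(question, decide, tools, max_steps=5):
    """Generic ReAct-style loop.

    `decide(question, history)` returns ("final", answer) to stop, or
    (tool_name, tool_input) to call a tool. `history` holds the
    (tool_name, tool_input, observation) triples gathered so far.
    """
    history = []
    for _ in range(max_steps):
        action, argument = decide(question, history)
        if action == "final":
            return argument, history
        observation = tools[action](argument)
        history.append((action, argument, observation))
    raise RuntimeError("no final answer within max_steps")


# Stub tools and a scripted decision function, for demonstration only.
def stub_graph_rag(query):
    return f"facts about: {query}"


def stub_text_completion(prompt):
    return f"draft based on: {prompt}"


def scripted_decide(question, history):
    if not history:
        return "graph-rag", question
    if len(history) == 1:
        return "text-completion", history[0][2]
    return "final", history[-1][2]
```

The recorded `history` plays the role of the thoughts and observations that a verbose run exposes.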
## 📊 Observability & Telemetry
Once the platform is running, access the Grafana dashboard at:
```
http://localhost:3000
```
Default credentials are:
```
user: admin
password: admin
```
The default Grafana dashboard tracks the following:
- LLM Latency
- Error Rate
- Service Request Rates
- Queue Backlogs
- Chunking Histogram
- Error Source by Service
- Rate Limit Events
- CPU usage by Service
- Memory usage by Service
- Models Deployed
- Token Throughput (Tokens/second)
- Cost Throughput (Cost/second)
## 🤝 Contributing
[Developing for TrustGraph](docs/README.development.md)
## 📄 License

**TrustGraph** is licensed under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).

Copyright 2024-2025 TrustGraph

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

## 📞 Support & Community
- Bug Reports & Feature Requests: [Discord](https://discord.gg/sQMwkRz5GX)
- Discussions & Questions: [Discord](https://discord.gg/sQMwkRz5GX)
- Documentation: [Docs](https://docs.trustgraph.ai/docs/getstarted)