

# TrustGraph

![TrustGraph banner](TG_Banner_readme.png)

🚀 [Full Documentation](https://trustgraph.ai/docs/getstarted)
💬 [Join the Discord](https://discord.gg/AXpxVjwzAw)
📖 [Read the Blog](https://blog.trustgraph.ai)
📺 [YouTube](https://www.youtube.com/@TrustGraph)

## Introduction

TrustGraph deploys a full end-to-end (E2E) AI solution with native GraphRAG in minutes. Autonomous Knowledge Agents build ultra-dense knowledge graphs to fully capture all knowledge context. TrustGraph is designed for maximum flexibility and modularity, whether it's calling Cloud LLMs or deploying SLMs On-Device. TrustGraph ingests data to build an RDF-style knowledge graph that enables accurate and private `RAG` responses using only the knowledge you want, when you want.

The pipeline processing components are interconnected with a pub/sub engine to maximize modularity for agent integration. The core processing components decode documents, chunk text, create mapped embeddings, generate an RDF knowledge graph, and generate AI predictions from either a Cloud LLM or an On-Device SLM.

The processing showcases the reliability and efficiencies of GraphRAG algorithms, which can capture contextual language flags that are missed in conventional RAG approaches. Graph querying algorithms retrieve not just relevant knowledge but also the language cues essential to understanding semantic uses unique to a text corpus.

## Deploy in Minutes

TrustGraph is designed to deploy all the services and stores needed for a scalable GraphRAG infrastructure as quickly and simply as possible.

### Install Requirements

```
python3 -m venv env
. env/bin/activate
pip3 install pulsar-client
pip3 install cassandra-driver
export PYTHONPATH=.
```

### Download TrustGraph

```
git clone https://github.com/trustgraph-ai/trustgraph trustgraph
cd trustgraph
```

TrustGraph is fully containerized and is launched with a Docker Compose `YAML` file. These files are prebuilt and included in the main directory of the download. Select the file that matches your desired model deployment and graph store configuration.

| Model Deployment | Graph Store | Launch File |
| ---------------- | ------------ | ----------- |
| AWS Bedrock | Cassandra | `tg-launch-bedrock-cassandra.yaml` |
| AWS Bedrock | Neo4j | `tg-launch-bedrock-neo4j.yaml` |
| AzureAI Serverless Endpoint | Cassandra | `tg-launch-azure-cassandra.yaml` |
| AzureAI Serverless Endpoint | Neo4j | `tg-launch-azure-neo4j.yaml` |
| Anthropic API | Cassandra | `tg-launch-claude-cassandra.yaml` |
| Anthropic API | Neo4j | `tg-launch-claude-neo4j.yaml` |
| Cohere API | Cassandra | `tg-launch-cohere-cassandra.yaml` |
| Cohere API | Neo4j | `tg-launch-cohere-neo4j.yaml` |
| Llamafile | Cassandra | `tg-launch-llamafile-cassandra.yaml` |
| Llamafile | Neo4j | `tg-launch-llamafile-neo4j.yaml` |
| Mixed Deployment | Cassandra | `tg-launch-mix-cassandra.yaml` |
| Mixed Deployment | Neo4j | `tg-launch-mix-neo4j.yaml` |
| Ollama | Cassandra | `tg-launch-ollama-cassandra.yaml` |
| Ollama | Neo4j | `tg-launch-ollama-neo4j.yaml` |
| OpenAI | Cassandra | `tg-launch-openai-cassandra.yaml` |
| OpenAI | Neo4j | `tg-launch-openai-neo4j.yaml` |
| VertexAI | Cassandra | `tg-launch-vertexai-cassandra.yaml` |
| VertexAI | Neo4j | `tg-launch-vertexai-neo4j.yaml` |

Launching TrustGraph is as simple as running one line:

```
docker compose -f <launch-file> up -d
```
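
The launch files in the table follow a regular naming pattern, `tg-launch-<model>-<store>.yaml`. As a minimal sketch, assuming that pattern holds for every row, the helper below builds the file name and the matching `docker compose` argument list. The functions `launch_file` and `launch_command` are hypothetical and not part of TrustGraph.

```python
# Hypothetical helper, not part of TrustGraph.
# The name fragments below are copied from the launch-file table above.
MODELS = {
    "bedrock", "azure", "claude", "cohere", "llamafile",
    "mix", "ollama", "openai", "vertexai",
}
STORES = {"cassandra", "neo4j"}


def launch_file(model: str, store: str) -> str:
    """Return the launch file name for a model deployment and graph store."""
    model, store = model.lower(), store.lower()
    if model not in MODELS:
        raise ValueError(f"unknown model deployment: {model!r}")
    if store not in STORES:
        raise ValueError(f"unknown graph store: {store!r}")
    return f"tg-launch-{model}-{store}.yaml"


def launch_command(model: str, store: str) -> list[str]:
    """Build the argument list for the one-line launch command."""
    return ["docker", "compose", "-f", launch_file(model, store), "up", "-d"]
```

For example, `launch_command("ollama", "neo4j")` produces the arguments for `docker compose -f tg-launch-ollama-neo4j.yaml up -d`, which could be passed to `subprocess.run` from the repository's main directory.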

## Core TrustGraph Features

- PDF decoding
- Text chunking
- On-Device SLM inference with [Ollama](https://ollama.com) or [Llamafile](https://github.com/Mozilla-Ocho/llamafile)
- Cloud LLM inference: `AWS Bedrock`, `AzureAI`, `Anthropic`, `Cohere`, `OpenAI`, and `VertexAI`
- Chunk-mapped vector embeddings with [HuggingFace](https://hf.co) models
- [RDF](https://www.w3.org/TR/rdf12-schema/) Knowledge Extraction Agents
- [Apache Cassandra](https://github.com/apache/cassandra) or [Neo4j](https://neo4j.com/) as the graph store
- [Qdrant](https://qdrant.tech/) as the VectorDB
- Build and load [Knowledge Cores](https://trustgraph.ai/docs/category/knowledge-cores)
- GraphRAG query service
- [Grafana](https://github.com/grafana/) telemetry dashboard
- Module integration with [Apache Pulsar](https://github.com/apache/pulsar/)
- Container orchestration with `Docker` or [Podman](http://podman.io/)

## Architecture

![architecture](architecture_0.8.0.png)

TrustGraph is designed to be modular to support as many Language Models and environments as possible. A natural fit for a modular architecture is to decompose functions into a set of modules connected through a pub/sub backbone. [Apache Pulsar](https://github.com/apache/pulsar/) serves as this pub/sub backbone. Pulsar acts as the data broker, managing the data processing queues connected to the processing modules.

### Pulsar Workflows

- For processing flows, Pulsar accepts the output of a processing module and queues it for input to the next subscribed module.
- For services such as LLMs and embeddings, Pulsar provides a client/server model.  A Pulsar queue is used as the input to the service.  When processed, the output is then delivered to a separate queue where a client subscriber can request that output.
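
The two patterns above can be sketched without a running broker. The example below is an illustration only: it uses Python's standard `queue.Queue` as an in-memory stand-in for Pulsar queues, and every name in it (`chunker`, `drain_chunks`, `word_count_service`, `call_service`) is hypothetical rather than a TrustGraph or Pulsar API. The word-count "service" simply stands in for an LLM or embeddings service.

```python
import queue

# In-memory stand-ins for Pulsar queues (assumption: illustration only).
chunk_queue = queue.Queue()      # processing flow: module output -> next module
request_queue = queue.Queue()    # service pattern: input queue of the service
response_queue = queue.Queue()   # service pattern: separate output queue


def chunker(text, size):
    """Processing module: publish fixed-size chunks for the next module."""
    for start in range(0, len(text), size):
        chunk_queue.put(text[start:start + size])


def drain_chunks():
    """Next subscribed module: consume everything queued so far."""
    received = []
    while not chunk_queue.empty():
        received.append(chunk_queue.get())
    return received


def word_count_service():
    """Service: read one request from the input queue, reply on the output queue."""
    request = request_queue.get()
    words = len(request["text"].split())
    response_queue.put({"id": request["id"], "words": words})


def call_service(request_id, text):
    """Client: send a request, then collect the matching response."""
    request_queue.put({"id": request_id, "text": text})
    word_count_service()  # a real service would run as its own process
    response = response_queue.get()
    if response["id"] != request_id:
        raise RuntimeError("response does not match request")
    return response["words"]
```

Tagging each request with an identifier is one simple way for a client to match a response on the shared output queue to the request it sent.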

## Knowledge Agents

TrustGraph extracts knowledge from a text corpus (PDF or text) into an ultra-dense knowledge graph using three autonomous knowledge agents. Each agent focuses on an individual element needed to build the RDF knowledge graph. The agents are:

- Topic Extraction Agent
- Entity Extraction Agent
- Node Connection Agent

The agent prompts are built through templates, enabling customized extraction agents for a specific use case. The extraction agents are launched automatically with either of the following commands, pointing to the path of a desired text corpus or the included sample files:

PDF file:
```
scripts/load-pdf -f sample-text-corpus.pdf
```

Text file:
```
scripts/load-text -f sample-text-corpus.txt
```
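
When loading a mix of files, the choice between the two loaders can be made from the file extension. This is a minimal sketch, assuming that `.pdf` files go to `scripts/load-pdf`, `.txt` files go to `scripts/load-text`, and the commands are run from the repository's main directory. The function `load_command` is hypothetical and not part of TrustGraph.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the loader scripts shown above.
LOADERS = {
    ".pdf": "scripts/load-pdf",
    ".txt": "scripts/load-text",
}


def load_command(path):
    """Build the loader argument list for one corpus file."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"no loader for file type: {suffix!r}")
    return [LOADERS[suffix], "-f", str(path)]
```

Each resulting list, such as the one from `load_command("sample-text-corpus.pdf")`, could then be run with `subprocess.run(..., check=True)`.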

## GraphRAG Queries

Once the knowledge graph has been built or a knowledge core has been loaded, GraphRAG queries are launched with a single line:

```
scripts/query-graph-rag -q "Write a blog post about the 5 key takeaways from SB1047 and how they will impact AI development."
```

## Deploy and Manage TrustGraph

[🚀 Full Deployment Guide 🚀](https://trustgraph.ai/docs/getstarted)

## TrustGraph Developer's Guide

[Developing for TrustGraph](docs/README.development.md)