Update docs for API/CLI changes in 1.0 (#421)

* Update some API basics for the 0.23/1.0 API change
2026-05-05 05:12:36 +02:00 · 2025-07-03 14:58:32 +01:00 · 2025-07-03 14:58:32 +01:00 · 44bdd29f51
commit 44bdd29f51
parent f907ea7db8
69 changed files with 19981 additions and 407 deletions
--- a/docs/apis/README.md
+++ b/docs/apis/README.md
@ -3,8 +3,10 @@

 ## Overview

-If you want to interact with TrustGraph through APIs, there are 3
-forms of API which may be of interest to you:
+If you want to interact with TrustGraph through APIs, there are 4
+forms of API which may be of interest to you. All four mechanisms
+invoke the same underlying TrustGraph functionality but are made
+available for integration in different ways:

 ### Pulsar APIs

@ -56,6 +58,31 @@ Cons:
    using a basic REST API, particular if you want to cover all of the error
    scenarios well

+### Python SDK API
+
+The `trustgraph-base` package provides a Python SDK that wraps the underlying
+service invocations in a convenient Python API.
+
+Pros:
+  - Native Python integration with type hints and documentation
+  - Simplified service invocation without manual message handling
+  - Built-in error handling and response parsing
+  - Convenient for Python-based applications and scripts
+
+Cons:
+  - Python-specific, not available for other programming languages
+  - Requires Python environment and trustgraph-base package installation
+  - Less control over low-level message handling
+
+## Flow-hosted APIs
+
+There are two types of APIs: Flow-hosted which need a flow to be running
+to operate.  Non-flow-hosted which are core to the system, and can
+be seen as 'global' - they are not dependent on a flow to be running.
+
+Knowledge, Librarian, Config and Flow APIs fall into the latter
+category.
+
 ## See also

 - [TrustGraph websocket overview](websocket.md)
@ -64,9 +91,19 @@ Cons:
  - [Text completion](api-text-completion.md)
  - [Prompt completion](api-prompt.md)
  - [Graph RAG](api-graph-rag.md)
+  - [Document RAG](api-document-rag.md)
  - [Agent](api-agent.md)
  - [Embeddings](api-embeddings.md)
  - [Graph embeddings](api-graph-embeddings.md)
+  - [Document embeddings](api-document-embeddings.md)
+  - [Entity contexts](api-entity-contexts.md)
  - [Triples query](api-triples-query.md)
  - [Document load](api-document-load.md)
+  - [Text load](api-text-load.md)
+  - [Config](api-config.md)
+  - [Flow](api-flow.md)
+  - [Librarian](api-librarian.md)
+  - [Knowledge](api-knowledge.md)
+  - [Metrics](api-metrics.md)
+  - [Core import/export](api-core-import-export.md)

--- a/docs/apis/api-agent.md
+++ b/docs/apis/api-agent.md
@ -18,7 +18,7 @@ The request contains the following fields:

 ### Response

-The request contains the following fields:
+The response contains the following fields:
 - `thought`: Optional, a string, provides an interim agent thought
 - `observation`: Optional, a string, provides an interim agent thought
 - `answer`: Optional, a string, provides the final answer
@ -61,6 +61,7 @@ Request:
 {
    "id": "blrqotfefnmnh7de-20",
    "service": "agent",
+    "flow": "default",
    "request": {
        "question": "What does NASA stand for?"
    }
--- a/docs/apis/api-config.md
+++ b/docs/apis/api-config.md
@ -0,0 +1,261 @@
+# TrustGraph Config API
+
+This API provides centralized configuration management for TrustGraph components.
+Configuration data is organized hierarchically by type and key, with support for
+persistent storage and push notifications.
+
+## Request/response
+
+### Request
+
+The request contains the following fields:
+- `operation`: The operation to perform (`get`, `list`, `getvalues`, `put`, `delete`, `config`)
+- `keys`: Array of ConfigKey objects (for `get`, `delete` operations)
+- `type`: Configuration type (for `list`, `getvalues` operations)
+- `values`: Array of ConfigValue objects (for `put` operation)
+
+### Response
+
+The response contains the following fields:
+- `version`: Version number for tracking changes
+- `values`: Array of ConfigValue objects returned by operations
+- `directory`: Array of key names returned by `list` operation
+- `config`: Full configuration map returned by `config` operation
+- `error`: Error information if operation fails
+
+## Operations
+
+### PUT - Store Configuration Values
+
+Request:
+```json
+{
+    "operation": "put",
+    "values": [
+        {
+            "type": "test",
+            "key": "key1",
+            "value": "value1"
+        }
+    ]
+}
+```
+
+Response:
+```json
+{
+    "version": 123
+}
+```
+
+### GET - Retrieve Configuration Values
+
+Request:
+```json
+{
+    "operation": "get",
+    "keys": [
+        {
+            "type": "test",
+            "key": "key1"
+        }
+    ]
+}
+```
+
+Response:
+```json
+{
+    "version": 123,
+    "values": [
+        {
+            "type": "test",
+            "key": "key1",
+            "value": "value1"
+        }
+    ]
+}
+```
+
+### LIST - List Keys by Type
+
+Request:
+```json
+{
+    "operation": "list",
+    "type": "test"
+}
+```
+
+Response:
+```json
+{
+    "version": 123,
+    "directory": ["key1", "key2", "key3"]
+}
+```
+
+### GETVALUES - Get All Values by Type
+
+Request:
+```json
+{
+    "operation": "getvalues",
+    "type": "test"
+}
+```
+
+Response:
+```json
+{
+    "version": 123,
+    "values": [
+        {
+            "type": "test",
+            "key": "key1",
+            "value": "value1"
+        },
+        {
+            "type": "test",
+            "key": "key2",
+            "value": "value2"
+        }
+    ]
+}
+```
+
+### CONFIG - Get Entire Configuration
+
+Request:
+```json
+{
+    "operation": "config"
+}
+```
+
+Response:
+```json
+{
+    "version": 123,
+    "config": {
+        "test": {
+            "key1": "value1",
+            "key2": "value2"
+        }
+    }
+}
+```
+
+### DELETE - Remove Configuration Values
+
+Request:
+```json
+{
+    "operation": "delete",
+    "keys": [
+        {
+            "type": "test",
+            "key": "key1"
+        }
+    ]
+}
+```
+
+Response:
+```json
+{
+    "version": 124
+}
+```
+
+## REST service
+
+The REST service is available at `/api/v1/config` and accepts the above request formats.
+
+## Websocket
+
+Requests have a `request` object containing the operation fields.
+Responses have a `response` object containing the response fields.
+
+Request:
+```json
+{
+    "id": "unique-request-id",
+    "service": "config",
+    "request": {
+        "operation": "get",
+        "keys": [
+            {
+                "type": "test",
+                "key": "key1"
+            }
+        ]
+    }
+}
+```
+
+Response:
+```json
+{
+    "id": "unique-request-id",
+    "response": {
+        "version": 123,
+        "values": [
+            {
+                "type": "test",
+                "key": "key1",
+                "value": "value1"
+            }
+        ]
+    },
+    "complete": true
+}
+```
+
+## Pulsar
+
+The Pulsar schema for the Config API is defined in Python code here:
+
+https://github.com/trustgraph-ai/trustgraph/blob/master/trustgraph-base/trustgraph/schema/config.py
+
+Default request queue:
+`non-persistent://tg/request/config`
+
+Default response queue:
+`non-persistent://tg/response/config`
+
+Request schema:
+`trustgraph.schema.ConfigRequest`
+
+Response schema:
+`trustgraph.schema.ConfigResponse`
+
+## Python SDK
+
+The Python SDK provides convenient access to the Config API:
+
+```python
+from trustgraph.api.config import ConfigClient
+
+client = ConfigClient()
+
+# Put a value
+await client.put("test", "key1", "value1")
+
+# Get a value
+value = await client.get("test", "key1")
+
+# List keys
+keys = await client.list("test")
+
+# Get all values for a type
+values = await client.get_values("test")
+```
+
+## Features
+
+- **Hierarchical Organization**: Configuration organized by type and key
+- **Versioning**: Each operation returns a version number for change tracking
+- **Persistent Storage**: Data stored in Cassandra for persistence
+- **Push Notifications**: Configuration changes pushed to subscribers
+- **Multiple Access Methods**: Available via Pulsar, REST, WebSocket, and Python SDK
--- a/docs/apis/api-core-import-export.md
+++ b/docs/apis/api-core-import-export.md
@ -0,0 +1,324 @@
+# TrustGraph Core Import/Export API
+
+This API provides bulk import and export capabilities for TrustGraph knowledge cores. 
+It handles efficient transfer of both RDF triples and graph embeddings using MessagePack 
+binary format for high-performance data exchange.
+
+## Overview
+
+The Core Import/Export API enables:
+- **Bulk Import**: Import large knowledge cores from binary streams
+- **Bulk Export**: Export knowledge cores as binary streams  
+- **Efficient Format**: Uses MessagePack for compact, fast serialization
+- **Dual Data Types**: Handles both RDF triples and graph embeddings
+- **Streaming**: Supports streaming for large datasets
+
+## Import Endpoint
+
+**Endpoint:** `POST /api/v1/import-core`
+
+**Query Parameters:**
+- `id`: Knowledge core identifier
+- `user`: User identifier
+
+**Content-Type:** `application/octet-stream`
+
+**Request Body:** MessagePack-encoded binary stream
+
+### Import Process
+
+1. **Stream Processing**: Reads binary data in 128KB chunks
+2. **MessagePack Decoding**: Unpacks binary data into structured messages
+3. **Knowledge Storage**: Stores data via Knowledge API
+4. **Response**: Returns success/error status
+
+### Import Data Format
+
+The import stream contains MessagePack-encoded tuples with type indicators:
+
+#### Triples Data
+```python
+("t", {
+    "m": {  # metadata
+        "i": "core-id",
+        "m": [],  # metadata triples
+        "u": "user",
+        "c": "collection"
+    },
+    "t": [  # triples array
+        {
+            "s": {"value": "subject", "is_uri": true},
+            "p": {"value": "predicate", "is_uri": true}, 
+            "o": {"value": "object", "is_uri": false}
+        }
+    ]
+})
+```
+
+#### Graph Embeddings Data
+```python
+("ge", {
+    "m": {  # metadata
+        "i": "core-id",
+        "m": [],  # metadata triples
+        "u": "user", 
+        "c": "collection"
+    },
+    "e": [  # entities array
+        {
+            "e": {"value": "entity", "is_uri": true},
+            "v": [[0.1, 0.2, 0.3]]  # vectors
+        }
+    ]
+})
+```
+
+## Export Endpoint
+
+**Endpoint:** `GET /api/v1/export-core`
+
+**Query Parameters:**
+- `id`: Knowledge core identifier  
+- `user`: User identifier
+
+**Content-Type:** `application/octet-stream`
+
+**Response Body:** MessagePack-encoded binary stream
+
+### Export Process
+
+1. **Knowledge Retrieval**: Fetches data via Knowledge API
+2. **MessagePack Encoding**: Encodes data into binary format
+3. **Streaming Response**: Sends data as binary stream
+4. **Type Identification**: Uses type prefixes for data classification
+
+## Usage Examples
+
+### Import Knowledge Core
+
+```bash
+# Import from file
+curl -X POST \
+  -H "Authorization: Bearer your-token" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary @knowledge-core.msgpack \
+  "http://api-gateway:8080/api/v1/import-core?id=core-123&user=alice"
+```
+
+### Export Knowledge Core
+
+```bash
+# Export to file
+curl -H "Authorization: Bearer your-token" \
+  "http://api-gateway:8080/api/v1/export-core?id=core-123&user=alice" \
+  -o knowledge-core.msgpack
+```
+
+## Python Integration
+
+### Import Example
+
+```python
+import msgpack
+import requests
+
+def import_knowledge_core(core_id, user, triples_data, embeddings_data, token):
+    # Prepare data
+    messages = []
+    
+    # Add triples
+    if triples_data:
+        messages.append(("t", {
+            "m": {
+                "i": core_id,
+                "m": [],
+                "u": user,
+                "c": "default"
+            },
+            "t": triples_data
+        }))
+    
+    # Add embeddings
+    if embeddings_data:
+        messages.append(("ge", {
+            "m": {
+                "i": core_id,
+                "m": [],
+                "u": user,
+                "c": "default"
+            },
+            "e": embeddings_data
+        }))
+    
+    # Pack data
+    binary_data = b''.join(msgpack.packb(msg) for msg in messages)
+    
+    # Upload
+    response = requests.post(
+        f"http://api-gateway:8080/api/v1/import-core?id={core_id}&user={user}",
+        headers={
+            "Authorization": f"Bearer {token}",
+            "Content-Type": "application/octet-stream"
+        },
+        data=binary_data
+    )
+    
+    return response.status_code == 200
+
+# Usage
+triples = [
+    {
+        "s": {"value": "Person1", "is_uri": True},
+        "p": {"value": "hasName", "is_uri": True},
+        "o": {"value": "John Doe", "is_uri": False}
+    }
+]
+
+embeddings = [
+    {
+        "e": {"value": "Person1", "is_uri": True},
+        "v": [[0.1, 0.2, 0.3, 0.4]]
+    }
+]
+
+success = import_knowledge_core("core-123", "alice", triples, embeddings, "your-token")
+```
+
+### Export Example
+
+```python
+import msgpack
+import requests
+
+def export_knowledge_core(core_id, user, token):
+    response = requests.get(
+        f"http://api-gateway:8080/api/v1/export-core?id={core_id}&user={user}",
+        headers={"Authorization": f"Bearer {token}"}
+    )
+    
+    if response.status_code != 200:
+        return None
+    
+    # Decode MessagePack stream
+    data = response.content
+    unpacker = msgpack.Unpacker()
+    unpacker.feed(data)
+    
+    triples = []
+    embeddings = []
+    
+    for unpacked in unpacker:
+        msg_type, msg_data = unpacked
+        
+        if msg_type == "t":
+            triples.extend(msg_data["t"])
+        elif msg_type == "ge":
+            embeddings.extend(msg_data["e"])
+    
+    return {
+        "triples": triples,
+        "embeddings": embeddings
+    }
+
+# Usage
+data = export_knowledge_core("core-123", "alice", "your-token")
+if data:
+    print(f"Exported {len(data['triples'])} triples")
+    print(f"Exported {len(data['embeddings'])} embeddings")
+```
+
+## Data Format Specification
+
+### MessagePack Tuples
+
+Each message is a tuple: `(type_indicator, data_object)`
+
+**Type Indicators:**
+- `"t"`: RDF triples data
+- `"ge"`: Graph embeddings data
+
+### Metadata Structure
+
+```python
+{
+    "i": "core-identifier",     # ID
+    "m": [...],                 # Metadata triples array
+    "u": "user-identifier",     # User
+    "c": "collection-name"      # Collection
+}
+```
+
+### Triple Structure
+
+```python
+{
+    "s": {"value": "subject", "is_uri": boolean},
+    "p": {"value": "predicate", "is_uri": boolean},
+    "o": {"value": "object", "is_uri": boolean}
+}
+```
+
+### Entity Embedding Structure
+
+```python
+{
+    "e": {"value": "entity", "is_uri": boolean},
+    "v": [[float, float, ...]]  # Array of vectors
+}
+```
+
+## Performance Characteristics
+
+### Import Performance
+- **Streaming**: Processes data in 128KB chunks
+- **Memory Efficient**: Incremental unpacking
+- **Concurrent**: Multiple imports can run simultaneously
+- **Error Handling**: Robust error recovery
+
+### Export Performance  
+- **Direct Streaming**: Data streamed directly from knowledge store
+- **Efficient Encoding**: MessagePack for minimal overhead
+- **Large Dataset Support**: Handles cores of any size
+
+## Error Handling
+
+### Import Errors
+- **Format Errors**: Invalid MessagePack data
+- **Type Errors**: Unknown type indicators
+- **Storage Errors**: Knowledge API failures
+- **Authentication**: Invalid user credentials
+
+### Export Errors
+- **Not Found**: Core ID doesn't exist
+- **Access Denied**: User lacks permissions
+- **System Errors**: Knowledge API failures
+
+### Error Responses
+- **HTTP 400**: Bad request (invalid parameters)
+- **HTTP 401**: Unauthorized access
+- **HTTP 404**: Core not found
+- **HTTP 500**: Internal server error
+
+## Use Cases
+
+### Data Migration
+- **System Upgrades**: Export/import during system migrations
+- **Environment Sync**: Copy cores between environments
+- **Backup/Restore**: Full knowledge core backup operations
+
+### Batch Processing
+- **Bulk Loading**: Load large knowledge datasets efficiently
+- **Data Integration**: Merge knowledge from multiple sources
+- **ETL Pipelines**: Extract-Transform-Load operations
+
+### Performance Optimization
+- **Faster Than REST**: Binary format reduces transfer time
+- **Atomic Operations**: Complete import/export as single operation
+- **Resource Efficient**: Minimal memory footprint during transfer
+
+## Security Considerations
+
+- **Authentication Required**: Bearer token authentication
+- **User Isolation**: Access restricted to user's own cores
+- **Data Validation**: Input validation on import
+- **Audit Logging**: Operations logged for security auditing
--- a/docs/apis/api-document-embeddings.md
+++ b/docs/apis/api-document-embeddings.md
@ -0,0 +1,252 @@
+# TrustGraph Document Embeddings API
+
+This API provides import, export, and query capabilities for document embeddings. It handles 
+document chunks with their vector embeddings and metadata, supporting both real-time WebSocket 
+operations and request/response patterns.
+
+## Schema Overview
+
+### DocumentEmbeddings Structure
+- `metadata`: Document metadata (ID, user, collection, RDF triples)
+- `chunks`: Array of document chunks with embeddings
+
+### ChunkEmbeddings Structure
+- `chunk`: Text chunk as bytes
+- `vectors`: Array of vector embeddings (Array of Array of Double)
+
+### DocumentEmbeddingsRequest Structure
+- `vectors`: Query vector embeddings
+- `limit`: Maximum number of results
+- `user`: User identifier
+- `collection`: Collection identifier
+
+### DocumentEmbeddingsResponse Structure
+- `error`: Error information if operation fails
+- `documents`: Array of matching documents as bytes
+
+## Import/Export Operations
+
+### Import - WebSocket Endpoint
+
+**Endpoint:** `/api/v1/flow/{flow}/import/document-embeddings`
+
+**Method:** WebSocket connection
+
+**Request Format:**
+```json
+{
+    "metadata": {
+        "id": "doc-123",
+        "user": "alice",
+        "collection": "research",
+        "metadata": [
+            {
+                "s": {"v": "doc-123", "e": true},
+                "p": {"v": "dc:title", "e": true},
+                "o": {"v": "Research Paper", "e": false}
+            }
+        ]
+    },
+    "chunks": [
+        {
+            "chunk": "This is the first chunk of the document...",
+            "vectors": [
+                [0.1, 0.2, 0.3, 0.4],
+                [0.5, 0.6, 0.7, 0.8]
+            ]
+        },
+        {
+            "chunk": "This is the second chunk...",
+            "vectors": [
+                [0.9, 0.8, 0.7, 0.6],
+                [0.5, 0.4, 0.3, 0.2]
+            ]
+        }
+    ]
+}
+```
+
+**Response:** Import operations are fire-and-forget with no response payload.
+
+### Export - WebSocket Endpoint
+
+**Endpoint:** `/api/v1/flow/{flow}/export/document-embeddings`
+
+**Method:** WebSocket connection
+
+The export endpoint streams document embeddings data in real-time. Each message contains:
+
+```json
+{
+    "metadata": {
+        "id": "doc-123",
+        "user": "alice",
+        "collection": "research",
+        "metadata": [
+            {
+                "s": {"v": "doc-123", "e": true},
+                "p": {"v": "dc:title", "e": true},
+                "o": {"v": "Research Paper", "e": false}
+            }
+        ]
+    },
+    "chunks": [
+        {
+            "chunk": "Decoded text content of chunk",
+            "vectors": [[0.1, 0.2, 0.3, 0.4]]
+        }
+    ]
+}
+```
+
+## Query Operations
+
+### Query Document Embeddings
+
+**Purpose:** Find documents similar to provided vector embeddings
+
+**Request:**
+```json
+{
+    "vectors": [
+        [0.1, 0.2, 0.3, 0.4, 0.5],
+        [0.6, 0.7, 0.8, 0.9, 1.0]
+    ],
+    "limit": 10,
+    "user": "alice",
+    "collection": "research"
+}
+```
+
+**Response:**
+```json
+{
+    "documents": [
+        "base64-encoded-document-1",
+        "base64-encoded-document-2"
+    ]
+}
+```
+
+## WebSocket Usage Examples
+
+### Importing Document Embeddings
+
+```javascript
+// Connect to import endpoint
+const ws = new WebSocket('ws://api-gateway:8080/api/v1/flow/my-flow/import/document-embeddings');
+
+// Send document embeddings
+ws.send(JSON.stringify({
+    metadata: {
+        id: "doc-123",
+        user: "alice",
+        collection: "research"
+    },
+    chunks: [
+        {
+            chunk: "Document content chunk 1",
+            vectors: [[0.1, 0.2, 0.3]]
+        }
+    ]
+}));
+```
+
+### Exporting Document Embeddings
+
+```javascript
+// Connect to export endpoint
+const ws = new WebSocket('ws://api-gateway:8080/api/v1/flow/my-flow/export/document-embeddings');
+
+// Listen for exported data
+ws.onmessage = (event) => {
+    const documentEmbeddings = JSON.parse(event.data);
+    console.log('Received document:', documentEmbeddings.metadata.id);
+    console.log('Chunks:', documentEmbeddings.chunks.length);
+};
+```
+
+## Data Format Details
+
+### Metadata Format
+Each metadata triple contains:
+- `s`: Subject (object with `v` for value and `e` for is_entity boolean)
+- `p`: Predicate (object with `v` for value and `e` for is_entity boolean)
+- `o`: Object (object with `v` for value and `e` for is_entity boolean)
+
+### Vector Format
+- Vectors are arrays of floating-point numbers
+- Each chunk can have multiple vectors (different embedding models)
+- Vectors should be consistently dimensioned within a collection
+
+### Text Encoding
+- Chunk text is handled as UTF-8 encoded bytes internally
+- WebSocket API accepts/returns plain text strings
+- Base64 encoding used for binary data in query responses
+
+## Python SDK
+
+```python
+from trustgraph.clients.document_embeddings_client import DocumentEmbeddingsClient
+
+# Create client
+client = DocumentEmbeddingsClient()
+
+# Query similar documents
+request = {
+    "vectors": [[0.1, 0.2, 0.3, 0.4]],
+    "limit": 5,
+    "user": "alice",
+    "collection": "research"
+}
+
+response = await client.query(request)
+documents = response.documents
+```
+
+## Integration with TrustGraph
+
+### Storage Integration
+- Document embeddings are stored in vector databases
+- Metadata is cross-referenced with knowledge graph
+- Supports multi-tenant isolation by user and collection
+
+### Processing Pipeline
+1. **Document Ingestion**: Text documents loaded via text-load API
+2. **Chunking**: Documents split into manageable chunks
+3. **Embedding Generation**: Vector embeddings created for each chunk
+4. **Storage**: Embeddings stored via import API
+5. **Retrieval**: Similar documents found via query API
+
+### Use Cases
+- **Semantic Search**: Find documents similar to query embeddings
+- **RAG Systems**: Retrieve relevant document chunks for question answering
+- **Document Clustering**: Group similar documents using embeddings
+- **Content Recommendations**: Suggest related documents to users
+- **Knowledge Discovery**: Find connections between document collections
+
+## Error Handling
+
+Common error scenarios:
+- Invalid vector dimensions
+- Missing required metadata fields
+- User/collection access restrictions
+- WebSocket connection failures
+- Malformed JSON data
+
+Errors are returned in the response `error` field:
+```json
+{
+    "error": {
+        "type": "ValidationError",
+        "message": "Invalid vector dimensions"
+    }
+}
+```
+
+## Performance Considerations
+
+- **Batch Processing**: Import multiple documents in single WebSocket session
+- **Vector Dimensions**: Consistent embedding dimensions improve performance
+- **Collection Sizing**: Limit collections to reasonable sizes for query performance
+- **Real-time vs Batch**: Choose appropriate method based on use case requirements
--- a/docs/apis/api-document-rag.md
+++ b/docs/apis/api-document-rag.md
@ -0,0 +1,96 @@
+# TrustGraph Document RAG API
+
+This presents a prompt to the Document RAG service and retrieves the answer.
+This makes use of a number of the other APIs behind the scenes:
+Embeddings, Document Embeddings, Prompt, TextCompletion, Triples Query.
+
+## Request/response
+
+### Request
+
+The request contains the following fields:
+- `query`: The question to answer
+
+### Response
+
+The response contains the following fields:
+- `response`: LLM response
+
+## REST service
+
+The REST service accepts a request object containing the `query` field.
+The response is a JSON object containing the `response` field.
+
+e.g.
+
+Request:
+```
+{
+    "query": "What does NASA stand for?"
+}
+```
+
+Response:
+
+```
+{
+    "response": "National Aeronautics and Space Administration"
+}
+```
+
+## Websocket
+
+Requests have a `request` object containing the `query` field.
+Responses have a `response` object containing `response` field.
+
+e.g.
+
+Request:
+
+```
+{
+    "id": "blrqotfefnmnh7de-14",
+    "service": "document-rag",
+    "flow": "default",
+    "request": {
+        "query": "What does NASA stand for?"
+    }
+}
+```
+
+Response:
+
+```
+{
+    "id": "blrqotfefnmnh7de-14",
+    "response": {
+        "response": "National Aeronautics and Space Administration"
+    },
+    "complete": true
+}
+```
+
+## Pulsar
+
+The Pulsar schema for the Document RAG API is defined in Python code here:
+
+https://github.com/trustgraph-ai/trustgraph/blob/master/trustgraph-base/trustgraph/schema/retrieval.py
+
+Default request queue:
+`non-persistent://tg/request/document-rag`
+
+Default response queue:
+`non-persistent://tg/response/document-rag`
+
+Request schema:
+`trustgraph.schema.DocumentRagQuery`
+
+Response schema:
+`trustgraph.schema.DocumentRagResponse`
+
+## Pulsar Python client
+
+The client class is
+`trustgraph.clients.DocumentRagClient`
+
+https://github.com/trustgraph-ai/trustgraph/blob/master/trustgraph-base/trustgraph/clients/document_rag_client.py
--- a/docs/apis/api-embeddings.md
+++ b/docs/apis/api-embeddings.md
@ -10,7 +10,7 @@ The request contains the following fields:

 ### Response

-The request contains the following fields:
+The response contains the following fields:
 - `vectors`: Embeddings response, an array of arrays.  An embedding is
  an array of floating-point numbers.  As multiple embeddings may be
  returned, an array of embeddings is returned, hence an array
@ -51,6 +51,7 @@ Request:
 {
    "id": "qgzw1287vfjc8wsk-2",
    "service": "embeddings",
+    "flow": "default",
    "request": {
        "text": "What is a cat?"
    }
--- a/docs/apis/api-entity-contexts.md
+++ b/docs/apis/api-entity-contexts.md
@ -0,0 +1,259 @@
+# TrustGraph Entity Contexts API
+
+This API provides import and export capabilities for entity contexts data. Entity contexts 
+associate entities with their textual context information, commonly used for entity 
+descriptions, definitions, or explanatory text in knowledge graphs.
+
+## Schema Overview
+
+### EntityContext Structure
+- `entity`: Entity identifier (Value object with value, is_uri, type)
+- `context`: Textual context or description string
+
+### EntityContexts Structure
+- `metadata`: Metadata including ID, user, collection, and RDF triples
+- `entities`: Array of EntityContext objects
+
+### Value Structure
+- `value`: The entity value as string
+- `is_uri`: Boolean indicating if the value is a URI
+- `type`: Data type of the value (optional)
+
+## Import/Export Operations
+
+### Import - WebSocket Endpoint
+
+**Endpoint:** `/api/v1/flow/{flow}/import/entity-contexts`
+
+**Method:** WebSocket connection
+
+**Request Format:**
+```json
+{
+    "metadata": {
+        "id": "context-batch-123",
+        "user": "alice",
+        "collection": "research",
+        "metadata": [
+            {
+                "s": {"value": "source-doc", "is_uri": true},
+                "p": {"value": "dc:title", "is_uri": true},
+                "o": {"value": "Research Paper", "is_uri": false}
+            }
+        ]
+    },
+    "entities": [
+        {
+            "entity": {
+                "v": "https://example.com/Person/JohnDoe",
+                "e": true
+            },
+            "context": "John Doe is a researcher at MIT specializing in artificial intelligence and machine learning."
+        },
+        {
+            "entity": {
+                "v": "https://example.com/Organization/MIT",
+                "e": true
+            },
+            "context": "Massachusetts Institute of Technology (MIT) is a private research university in Cambridge, Massachusetts."
+        },
+        {
+            "entity": {
+                "v": "machine learning",
+                "e": false
+            },
+            "context": "Machine learning is a method of data analysis that automates analytical model building using algorithms."
+        }
+    ]
+}
+```
+
+**Response:** Import operations are fire-and-forget with no response payload.
+
+### Export - WebSocket Endpoint
+
+**Endpoint:** `/api/v1/flow/{flow}/export/entity-contexts`
+
+**Method:** WebSocket connection
+
+The export endpoint streams entity contexts data in real-time. Each message contains:
+
+```json
+{
+    "metadata": {
+        "id": "context-batch-123",
+        "user": "alice",
+        "collection": "research",
+        "metadata": [
+            {
+                "s": {"value": "source-doc", "is_uri": true},
+                "p": {"value": "dc:title", "is_uri": true},
+                "o": {"value": "Research Paper", "is_uri": false}
+            }
+        ]
+    },
+    "entities": [
+        {
+            "entity": {
+                "v": "https://example.com/Person/JohnDoe",
+                "e": true
+            },
+            "context": "John Doe is a researcher at MIT specializing in artificial intelligence."
+        }
+    ]
+}
+```
+
+## WebSocket Usage Examples
+
+### Importing Entity Contexts
+
+```javascript
+// Connect to import endpoint
+const ws = new WebSocket('ws://api-gateway:8080/api/v1/flow/my-flow/import/entity-contexts');
+
+// Send entity contexts
+ws.send(JSON.stringify({
+    metadata: {
+        id: "context-batch-1",
+        user: "alice",
+        collection: "research"
+    },
+    entities: [
+        {
+            entity: {
+                v: "Albert Einstein",
+                e: false
+            },
+            context: "Albert Einstein was a German-born theoretical physicist widely acknowledged to be one of the greatest physicists of all time."
+        }
+    ]
+}));
+```
+
+### Exporting Entity Contexts
+
+```javascript
+// Connect to export endpoint
+const ws = new WebSocket('ws://api-gateway:8080/api/v1/flow/my-flow/export/entity-contexts');
+
+// Listen for exported data
+ws.onmessage = (event) => {
+    const entityContexts = JSON.parse(event.data);
+    console.log('Received contexts for', entityContexts.entities.length, 'entities');
+    
+    entityContexts.entities.forEach(item => {
+        console.log('Entity:', item.entity.v);
+        console.log('Context:', item.context);
+    });
+};
+```
+
+## Data Format Details
+
+### Entity Format
+The `entity` field uses the Value structure:
+- `v`: The entity value (name, URI, identifier)
+- `e`: Boolean indicating if it's a URI entity (true) or literal (false)
+- `type`: Optional data type specification
+
+### Context Format
+- Plain text string providing description or context
+- Can include definitions, explanations, or background information
+- Supports multi-sentence descriptions and detailed context
+
+### Metadata Format
+Each metadata triple contains:
+- `s`: Subject (object with `value` and `is_uri` fields)
+- `p`: Predicate (object with `value` and `is_uri` fields)
+- `o`: Object (object with `value` and `is_uri` fields)
+
+## Integration with TrustGraph
+
+### Storage Integration
+- Entity contexts are stored in graph databases
+- Links entities to their descriptive text
+- Supports multi-tenant isolation by user and collection
+
+### Processing Pipeline
+1. **Text Analysis**: Extract entities from documents
+2. **Context Extraction**: Identify descriptive text for entities
+3. **Entity Linking**: Associate entities with their contexts
+4. **Import**: Store entity-context pairs via import API
+5. **Knowledge Enhancement**: Use contexts for better entity understanding
+
+### Use Cases
+- **Entity Disambiguation**: Provide context to distinguish similar entities
+- **Knowledge Base Enhancement**: Add descriptive information to entities
+- **Question Answering**: Use entity contexts to provide detailed answers
+- **Entity Summarization**: Generate summaries based on collected contexts
+- **Knowledge Graph Visualization**: Display rich entity information
+
+## Authentication
+
+Both import and export endpoints support authentication:
+- API token authentication via Authorization header
+- Flow-based access control
+- User and collection isolation
+
+## Error Handling
+
+Common error scenarios:
+- Invalid JSON format
+- Missing required metadata fields
+- User/collection access restrictions
+- WebSocket connection failures
+- Invalid entity value formats
+
+Errors are typically handled at the WebSocket connection level with connection termination or error messages.
+
+## Performance Considerations
+
+- **Batch Processing**: Import multiple entity contexts in single messages
+- **Context Length**: Balance detailed context with performance
+- **Flow Capacity**: Ensure target flow can handle entity context volume
+- **Real-time vs Batch**: Choose appropriate method based on use case
+
+## Python Integration
+
+While no direct Python SDK is mentioned in the codebase, integration can be achieved through:
+
+```python
+import websocket
+import json
+
+# Connect to import endpoint
+def import_entity_contexts(flow_id, contexts_data):
+    ws_url = f"ws://api-gateway:8080/api/v1/flow/{flow_id}/import/entity-contexts"
+    ws = websocket.create_connection(ws_url)
+    
+    # Send data
+    ws.send(json.dumps(contexts_data))
+    ws.close()
+
+# Usage example
+contexts = {
+    "metadata": {
+        "id": "batch-1",
+        "user": "alice",
+        "collection": "research"
+    },
+    "entities": [
+        {
+            "entity": {"v": "Neural Networks", "e": False},
+            "context": "Neural networks are computing systems inspired by biological neural networks."
+        }
+    ]
+}
+
+import_entity_contexts("my-flow", contexts)
+```
+
+## Features
+
+- **Real-time Streaming**: WebSocket-based import/export for live data flow
+- **Batch Operations**: Process multiple entity contexts efficiently
+- **Rich Metadata**: Full metadata support with RDF triples
+- **Entity Types**: Support for both URI entities and literal values
+- **Flow Integration**: Direct integration with TrustGraph processing flows
+- **Multi-tenant Support**: User and collection-based data isolation
--- a/docs/apis/api-flow.md
+++ b/docs/apis/api-flow.md
@ -0,0 +1,252 @@
+# TrustGraph Flow API
+
+This API provides workflow management for TrustGraph components. It manages flow classes 
+(workflow templates) and flow instances (active running workflows) that orchestrate 
+complex data processing pipelines.
+
+## Request/response
+
+### Request
+
+The request contains the following fields:
+- `operation`: The operation to perform (see operations below)
+- `class_name`: Flow class name (for class operations and start-flow)
+- `class_definition`: Flow class definition JSON (for put-class)
+- `description`: Flow description (for start-flow)
+- `flow_id`: Flow instance ID (for flow instance operations)
+
+### Response
+
+The response contains the following fields:
+- `class_names`: Array of flow class names (returned by list-classes)
+- `flow_ids`: Array of active flow IDs (returned by list-flows)
+- `class_definition`: Flow class definition JSON (returned by get-class)
+- `flow`: Flow instance JSON (returned by get-flow)
+- `description`: Flow description (returned by get-flow)
+- `error`: Error information if operation fails
+
+## Operations
+
+### Flow Class Operations
+
+#### LIST-CLASSES - List All Flow Classes
+
+Request:
+```json
+{
+    "operation": "list-classes"
+}
+```
+
+Response:
+```json
+{
+    "class_names": ["pdf-processor", "text-analyzer", "knowledge-extractor"]
+}
+```
+
+#### GET-CLASS - Get Flow Class Definition
+
+Request:
+```json
+{
+    "operation": "get-class",
+    "class_name": "pdf-processor"
+}
+```
+
+Response:
+```json
+{
+    "class_definition": "{\"interfaces\": {\"text-completion\": {\"request\": \"persistent://tg/request/text-completion\", \"response\": \"persistent://tg/response/text-completion\"}}, \"description\": \"PDF processing workflow\"}"
+}
+```
+
+#### PUT-CLASS - Create/Update Flow Class
+
+Request:
+```json
+{
+    "operation": "put-class",
+    "class_name": "pdf-processor",
+    "class_definition": "{\"interfaces\": {\"text-completion\": {\"request\": \"persistent://tg/request/text-completion\", \"response\": \"persistent://tg/response/text-completion\"}}, \"description\": \"PDF processing workflow\"}"
+}
+```
+
+Response:
+```json
+{}
+```
+
+#### DELETE-CLASS - Remove Flow Class
+
+Request:
+```json
+{
+    "operation": "delete-class",
+    "class_name": "pdf-processor"
+}
+```
+
+Response:
+```json
+{}
+```
+
+### Flow Instance Operations
+
+#### LIST-FLOWS - List Active Flow Instances
+
+Request:
+```json
+{
+    "operation": "list-flows"
+}
+```
+
+Response:
+```json
+{
+    "flow_ids": ["flow-123", "flow-456", "flow-789"]
+}
+```
+
+#### GET-FLOW - Get Flow Instance
+
+Request:
+```json
+{
+    "operation": "get-flow",
+    "flow_id": "flow-123"
+}
+```
+
+Response:
+```json
+{
+    "flow": "{\"interfaces\": {\"text-completion\": {\"request\": \"persistent://tg/request/text-completion-flow-123\", \"response\": \"persistent://tg/response/text-completion-flow-123\"}}}",
+    "description": "PDF processing workflow instance"
+}
+```
+
+#### START-FLOW - Start Flow Instance
+
+Request:
+```json
+{
+    "operation": "start-flow",
+    "class_name": "pdf-processor",
+    "flow_id": "flow-123",
+    "description": "Processing document batch 1"
+}
+```
+
+Response:
+```json
+{}
+```
+
+#### STOP-FLOW - Stop Flow Instance
+
+Request:
+```json
+{
+    "operation": "stop-flow",
+    "flow_id": "flow-123"
+}
+```
+
+Response:
+```json
+{}
+```
+
+## REST service
+
+The REST service is available at `/api/v1/flow` and accepts the above request formats.
+
+## Websocket
+
+Requests have a `request` object containing the operation fields.
+Responses have a `response` object containing the response fields.
+
+Request:
+```json
+{
+    "id": "unique-request-id",
+    "service": "flow",
+    "request": {
+        "operation": "list-classes"
+    }
+}
+```
+
+Response:
+```json
+{
+    "id": "unique-request-id",
+    "response": {
+        "class_names": ["pdf-processor", "text-analyzer"]
+    },
+    "complete": true
+}
+```
+
+## Pulsar
+
+The Pulsar schema for the Flow API is defined in Python code here:
+
+https://github.com/trustgraph-ai/trustgraph/blob/master/trustgraph-base/trustgraph/schema/flows.py
+
+Default request queue:
+`non-persistent://tg/request/flow`
+
+Default response queue:
+`non-persistent://tg/response/flow`
+
+Request schema:
+`trustgraph.schema.FlowRequest`
+
+Response schema:
+`trustgraph.schema.FlowResponse`
+
+## Python SDK
+
+The Python SDK provides convenient access to the Flow API:
+
+```python
+from trustgraph.api.flow import FlowClient
+
+client = FlowClient()
+
+# List all flow classes
+classes = await client.list_classes()
+
+# Get a flow class definition
+definition = await client.get_class("pdf-processor")
+
+# Start a flow instance
+await client.start_flow("pdf-processor", "flow-123", "Processing batch 1")
+
+# List active flows
+flows = await client.list_flows()
+
+# Stop a flow instance
+await client.stop_flow("flow-123")
+```
+
+## Features
+
+- **Flow Classes**: Templates that define workflow structure and interfaces
+- **Flow Instances**: Active running workflows based on flow classes
+- **Dynamic Management**: Flows can be started/stopped dynamically
+- **Template Processing**: Uses template replacement for customizing flow instances
+- **Integration**: Works with TrustGraph ecosystem for data processing pipelines
+- **Persistent Storage**: Flow definitions and instances stored for reliability
+
+## Use Cases
+
+- **Document Processing**: Orchestrating PDF processing through chunking, extraction, and storage
+- **Knowledge Extraction**: Managing workflows for relationship and definition extraction
+- **Data Pipelines**: Coordinating complex multi-step data processing workflows
+- **Resource Management**: Dynamically scaling processing flows based on demand
--- a/docs/apis/api-graph-embeddings.md
+++ b/docs/apis/api-graph-embeddings.md
@ -17,7 +17,7 @@ The request contains the following fields:

 ### Response

-The request contains the following fields:
+The response contains the following fields:
 - `entities`: An array of graph entities.  The entity type is described here:

 TrustGraph uses the same schema for knowledge graph elements:
@ -85,6 +85,7 @@ Request:
 {
    "id": "qgzw1287vfjc8wsk-3",
    "service": "graph-embeddings-query",
+    "flow": "default",
    "request": {
        "vectors": [
          [
--- a/docs/apis/api-graph-rag.md
+++ b/docs/apis/api-graph-rag.md
@ -14,7 +14,7 @@ The request contains the following fields:

 ### Response

-The request contains the following fields:
+The response contains the following fields:
 - `response`: LLM response

 ## REST service
@ -52,6 +52,7 @@ Request:
 {
    "id": "blrqotfefnmnh7de-14",
    "service": "graph-rag",
+    "flow": "default",
    "request": {
        "query": "What does NASA stand for?"
    }
--- a/docs/apis/api-knowledge.md
+++ b/docs/apis/api-knowledge.md
@ -0,0 +1,310 @@
+# TrustGraph Knowledge API
+
+This API provides knowledge graph management for TrustGraph. It handles storage, retrieval, 
+and flow integration of knowledge cores containing RDF triples and graph embeddings with 
+multi-tenant support.
+
+## Request/response
+
+### Request
+
+The request contains the following fields:
+- `operation`: The operation to perform (see operations below)
+- `user`: User identifier (for user-specific operations)
+- `id`: Knowledge core identifier
+- `flow`: Flow identifier (for load operations)
+- `collection`: Collection identifier (for load operations)
+- `triples`: RDF triples data (for put operations)
+- `graph_embeddings`: Graph embeddings data (for put operations)
+
+### Response
+
+The response contains the following fields:
+- `error`: Error information if operation fails
+- `ids`: Array of knowledge core IDs (returned by list operation)
+- `eos`: End of stream indicator for streaming responses
+- `triples`: RDF triples data (returned by get operation)
+- `graph_embeddings`: Graph embeddings data (returned by get operation)
+
+## Operations
+
+### PUT-KG-CORE - Store Knowledge Core
+
+Request:
+```json
+{
+    "operation": "put-kg-core",
+    "user": "alice",
+    "id": "core-123",
+    "triples": {
+        "metadata": {
+            "id": "core-123",
+            "user": "alice",
+            "collection": "research"
+        },
+        "triples": [
+            {
+                "s": {"value": "Person1", "is_uri": true},
+                "p": {"value": "hasName", "is_uri": true},
+                "o": {"value": "John Doe", "is_uri": false}
+            },
+            {
+                "s": {"value": "Person1", "is_uri": true},
+                "p": {"value": "worksAt", "is_uri": true},
+                "o": {"value": "Company1", "is_uri": true}
+            }
+        ]
+    },
+    "graph_embeddings": {
+        "metadata": {
+            "id": "core-123",
+            "user": "alice",
+            "collection": "research"
+        },
+        "entities": [
+            {
+                "entity": {"value": "Person1", "is_uri": true},
+                "vectors": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
+            }
+        ]
+    }
+}
+```
+
+Response:
+```json
+{}
+```
+
+### GET-KG-CORE - Retrieve Knowledge Core
+
+Request:
+```json
+{
+    "operation": "get-kg-core",
+    "id": "core-123"
+}
+```
+
+Response:
+```json
+{
+    "triples": {
+        "metadata": {
+            "id": "core-123",
+            "user": "alice",
+            "collection": "research"
+        },
+        "triples": [
+            {
+                "s": {"value": "Person1", "is_uri": true},
+                "p": {"value": "hasName", "is_uri": true},
+                "o": {"value": "John Doe", "is_uri": false}
+            }
+        ]
+    },
+    "graph_embeddings": {
+        "metadata": {
+            "id": "core-123",
+            "user": "alice",
+            "collection": "research"
+        },
+        "entities": [
+            {
+                "entity": {"value": "Person1", "is_uri": true},
+                "vectors": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
+            }
+        ]
+    }
+}
+```
+
+### LIST-KG-CORES - List Knowledge Cores
+
+Request:
+```json
+{
+    "operation": "list-kg-cores",
+    "user": "alice"
+}
+```
+
+Response:
+```json
+{
+    "ids": ["core-123", "core-456", "core-789"]
+}
+```
+
+### DELETE-KG-CORE - Delete Knowledge Core
+
+Request:
+```json
+{
+    "operation": "delete-kg-core",
+    "user": "alice",
+    "id": "core-123"
+}
+```
+
+Response:
+```json
+{}
+```
+
+### LOAD-KG-CORE - Load Knowledge Core into Flow
+
+Request:
+```json
+{
+    "operation": "load-kg-core",
+    "id": "core-123",
+    "flow": "qa-flow",
+    "collection": "research"
+}
+```
+
+Response:
+```json
+{}
+```
+
+### UNLOAD-KG-CORE - Unload Knowledge Core from Flow
+
+Request:
+```json
+{
+    "operation": "unload-kg-core",
+    "id": "core-123"
+}
+```
+
+Response:
+```json
+{}
+```
+
+## Data Structures
+
+### Triple Structure
+Each RDF triple contains:
+- `s`: Subject (Value object)
+- `p`: Predicate (Value object)
+- `o`: Object (Value object)
+
+### Value Structure
+- `value`: The actual value as string
+- `is_uri`: Boolean indicating if value is a URI
+- `type`: Data type of the value (optional)
+
+### Triples Structure
+- `metadata`: Metadata including ID, user, collection
+- `triples`: Array of Triple objects
+
+### Graph Embeddings Structure
+- `metadata`: Metadata including ID, user, collection
+- `entities`: Array of EntityEmbeddings objects
+
+### Entity Embeddings Structure
+- `entity`: The entity being embedded (Value object)
+- `vectors`: Array of vector embeddings (Array of Array of Double)
+
+## REST service
+
+The REST service is available at `/api/v1/knowledge` and accepts the above request formats.
+
+## Websocket
+
+Requests have a `request` object containing the operation fields.
+Responses have a `response` object containing the response fields.
+
+Request:
+```json
+{
+    "id": "unique-request-id",
+    "service": "knowledge",
+    "request": {
+        "operation": "list-kg-cores",
+        "user": "alice"
+    }
+}
+```
+
+Response:
+```json
+{
+    "id": "unique-request-id",
+    "response": {
+        "ids": ["core-123", "core-456"]
+    },
+    "complete": true
+}
+```
+
+## Pulsar
+
+The Pulsar schema for the Knowledge API is defined in Python code here:
+
+https://github.com/trustgraph-ai/trustgraph/blob/master/trustgraph-base/trustgraph/schema/knowledge.py
+
+Default request queue:
+`non-persistent://tg/request/knowledge`
+
+Default response queue:
+`non-persistent://tg/response/knowledge`
+
+Request schema:
+`trustgraph.schema.KnowledgeRequest`
+
+Response schema:
+`trustgraph.schema.KnowledgeResponse`
+
+## Python SDK
+
+The Python SDK provides convenient access to the Knowledge API:
+
+```python
+from trustgraph.api.knowledge import KnowledgeClient
+
+client = KnowledgeClient()
+
+# List knowledge cores
+cores = await client.list_kg_cores("alice")
+
+# Get a knowledge core
+core = await client.get_kg_core("core-123")
+
+# Store a knowledge core
+await client.put_kg_core(
+    user="alice",
+    id="core-123",
+    triples=triples_data,
+    graph_embeddings=embeddings_data
+)
+
+# Load core into flow
+await client.load_kg_core("core-123", "qa-flow", "research")
+
+# Delete a knowledge core
+await client.delete_kg_core("alice", "core-123")
+```
+
+## Features
+
+- **Knowledge Core Management**: Store, retrieve, list, and delete knowledge cores
+- **Dual Data Types**: Support for both RDF triples and graph embeddings
+- **Flow Integration**: Load knowledge cores into processing flows
+- **Multi-tenant Support**: User-specific knowledge cores with isolation
+- **Streaming Support**: Efficient transfer of large knowledge cores
+- **Collection Organization**: Group knowledge cores by collection
+- **Semantic Reasoning**: RDF triples enable symbolic reasoning
+- **Vector Similarity**: Graph embeddings enable neural approaches
+
+## Use Cases
+
+- **Knowledge Base Construction**: Build semantic knowledge graphs from documents
+- **Question Answering**: Load knowledge cores for graph-based QA systems
+- **Semantic Search**: Use embeddings for similarity-based knowledge retrieval
+- **Multi-domain Knowledge**: Organize knowledge by user and collection
+- **Hybrid Reasoning**: Combine symbolic (triples) and neural (embeddings) approaches
+- **Knowledge Transfer**: Export and import knowledge cores between systems
--- a/docs/apis/api-librarian.md
+++ b/docs/apis/api-librarian.md
@ -0,0 +1,360 @@
+# TrustGraph Librarian API
+
+This API provides document library management for TrustGraph. It handles document storage, 
+metadata management, and processing orchestration using hybrid storage (MinIO for content, 
+Cassandra for metadata) with multi-user support.
+
+## Request/response
+
+### Request
+
+The request contains the following fields:
+- `operation`: The operation to perform (see operations below)
+- `document_id`: Document identifier (for document operations)
+- `document_metadata`: Document metadata object (for add/update operations)
+- `content`: Document content as base64-encoded bytes (for add operations)
+- `processing_id`: Processing job identifier (for processing operations)
+- `processing_metadata`: Processing metadata object (for add-processing)
+- `user`: User identifier (required for most operations)
+- `collection`: Collection filter (optional for list operations)
+- `criteria`: Query criteria array (for filtering operations)
+
+### Response
+
+The response contains the following fields:
+- `error`: Error information if operation fails
+- `document_metadata`: Single document metadata (for get operations)
+- `content`: Document content as base64-encoded bytes (for get-content)
+- `document_metadatas`: Array of document metadata (for list operations)
+- `processing_metadatas`: Array of processing metadata (for list-processing)
+
+## Document Operations
+
+### ADD-DOCUMENT - Add Document to Library
+
+Request:
+```json
+{
+    "operation": "add-document",
+    "document_metadata": {
+        "id": "doc-123",
+        "time": 1640995200000,
+        "kind": "application/pdf",
+        "title": "Research Paper",
+        "comments": "Important research findings",
+        "user": "alice",
+        "tags": ["research", "ai", "machine-learning"],
+        "metadata": [
+            {
+                "subject": "doc-123",
+                "predicate": "dc:creator",
+                "object": "Dr. Smith"
+            }
+        ]
+    },
+    "content": "JVBERi0xLjQKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMiAwIFIKPj4KZW5kb2JqCg=="
+}
+```
+
+Response:
+```json
+{}
+```
+
+### GET-DOCUMENT-METADATA - Get Document Metadata
+
+Request:
+```json
+{
+    "operation": "get-document-metadata",
+    "document_id": "doc-123",
+    "user": "alice"
+}
+```
+
+Response:
+```json
+{
+    "document_metadata": {
+        "id": "doc-123",
+        "time": 1640995200000,
+        "kind": "application/pdf",
+        "title": "Research Paper",
+        "comments": "Important research findings",
+        "user": "alice",
+        "tags": ["research", "ai", "machine-learning"],
+        "metadata": [
+            {
+                "subject": "doc-123",
+                "predicate": "dc:creator",
+                "object": "Dr. Smith"
+            }
+        ]
+    }
+}
+```
+
+### GET-DOCUMENT-CONTENT - Get Document Content
+
+Request:
+```json
+{
+    "operation": "get-document-content",
+    "document_id": "doc-123",
+    "user": "alice"
+}
+```
+
+Response:
+```json
+{
+    "content": "JVBERi0xLjQKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMiAwIFIKPj4KZW5kb2JqCg=="
+}
+```
+
+### LIST-DOCUMENTS - List User's Documents
+
+Request:
+```json
+{
+    "operation": "list-documents",
+    "user": "alice",
+    "collection": "research"
+}
+```
+
+Response:
+```json
+{
+    "document_metadatas": [
+        {
+            "id": "doc-123",
+            "time": 1640995200000,
+            "kind": "application/pdf",
+            "title": "Research Paper",
+            "comments": "Important research findings",
+            "user": "alice",
+            "tags": ["research", "ai"]
+        },
+        {
+            "id": "doc-124",
+            "time": 1640995300000,
+            "kind": "text/plain",
+            "title": "Meeting Notes",
+            "comments": "Team meeting discussion",
+            "user": "alice",
+            "tags": ["meeting", "notes"]
+        }
+    ]
+}
+```
+
+### UPDATE-DOCUMENT - Update Document Metadata
+
+Request:
+```json
+{
+    "operation": "update-document",
+    "document_metadata": {
+        "id": "doc-123",
+        "title": "Updated Research Paper",
+        "comments": "Updated findings and conclusions",
+        "user": "alice",
+        "tags": ["research", "ai", "machine-learning", "updated"]
+    }
+}
+```
+
+Response:
+```json
+{}
+```
+
+### REMOVE-DOCUMENT - Remove Document
+
+Request:
+```json
+{
+    "operation": "remove-document",
+    "document_id": "doc-123",
+    "user": "alice"
+}
+```
+
+Response:
+```json
+{}
+```
+
+## Processing Operations
+
+### ADD-PROCESSING - Start Document Processing
+
+Request:
+```json
+{
+    "operation": "add-processing",
+    "processing_metadata": {
+        "id": "proc-456",
+        "document_id": "doc-123",
+        "time": 1640995400000,
+        "flow": "pdf-extraction",
+        "user": "alice",
+        "collection": "research",
+        "tags": ["extraction", "nlp"]
+    }
+}
+```
+
+Response:
+```json
+{}
+```
+
+### LIST-PROCESSING - List Processing Jobs
+
+Request:
+```json
+{
+    "operation": "list-processing",
+    "user": "alice",
+    "collection": "research"
+}
+```
+
+Response:
+```json
+{
+    "processing_metadatas": [
+        {
+            "id": "proc-456",
+            "document_id": "doc-123",
+            "time": 1640995400000,
+            "flow": "pdf-extraction",
+            "user": "alice",
+            "collection": "research",
+            "tags": ["extraction", "nlp"]
+        }
+    ]
+}
+```
+
+### REMOVE-PROCESSING - Stop Processing Job
+
+Request:
+```json
+{
+    "operation": "remove-processing",
+    "processing_id": "proc-456",
+    "user": "alice"
+}
+```
+
+Response:
+```json
+{}
+```
+
+## REST service
+
+The REST service is available at `/api/v1/librarian` and accepts the above request formats.
+
+## Websocket
+
+Requests have a `request` object containing the operation fields.
+Responses have a `response` object containing the response fields.
+
+Request:
+```json
+{
+    "id": "unique-request-id",
+    "service": "librarian",
+    "request": {
+        "operation": "list-documents",
+        "user": "alice"
+    }
+}
+```
+
+Response:
+```json
+{
+    "id": "unique-request-id",
+    "response": {
+        "document_metadatas": [...]
+    },
+    "complete": true
+}
+```
+
+## Pulsar
+
+The Pulsar schema for the Librarian API is defined in Python code here:
+
+https://github.com/trustgraph-ai/trustgraph/blob/master/trustgraph-base/trustgraph/schema/library.py
+
+Default request queue:
+`non-persistent://tg/request/librarian`
+
+Default response queue:
+`non-persistent://tg/response/librarian`
+
+Request schema:
+`trustgraph.schema.LibrarianRequest`
+
+Response schema:
+`trustgraph.schema.LibrarianResponse`
+
+## Python SDK
+
+The Python SDK provides convenient access to the Librarian API:
+
+```python
+from trustgraph.api.library import LibrarianClient
+
+client = LibrarianClient()
+
+# Add a document
+with open("document.pdf", "rb") as f:
+    content = f.read()
+    
+await client.add_document(
+    doc_id="doc-123",
+    title="Research Paper",
+    content=content,
+    user="alice",
+    tags=["research", "ai"]
+)
+
+# Get document metadata
+metadata = await client.get_document_metadata("doc-123", "alice")
+
+# List documents
+documents = await client.list_documents("alice", collection="research")
+
+# Start processing
+await client.add_processing(
+    processing_id="proc-456",
+    document_id="doc-123",
+    flow="pdf-extraction",
+    user="alice"
+)
+```
+
+## Features
+
+- **Hybrid Storage**: MinIO for content, Cassandra for metadata
+- **Multi-user Support**: User-based document ownership and access control
+- **Rich Metadata**: RDF-style metadata triples and tagging system
+- **Processing Integration**: Automatic triggering of document processing workflows
+- **Content Types**: Support for multiple document formats (PDF, text, etc.)
+- **Collection Management**: Optional document grouping by collection
+- **Metadata Search**: Query documents by metadata criteria
+
+## Use Cases
+
+- **Document Management**: Store and organize documents with rich metadata
+- **Knowledge Extraction**: Process documents to extract structured knowledge
+- **Research Libraries**: Manage collections of research papers and documents
+- **Content Processing**: Orchestrate document processing workflows
+- **Multi-tenant Systems**: Support multiple users with isolated document libraries
--- a/docs/apis/api-metrics.md
+++ b/docs/apis/api-metrics.md
@ -0,0 +1,313 @@
+# TrustGraph Metrics API
+
+This API provides access to TrustGraph system metrics through a Prometheus proxy endpoint. 
+It allows authenticated access to monitoring and observability data from the TrustGraph 
+system components.
+
+## Overview
+
+The Metrics API is implemented as a proxy to a Prometheus metrics server, providing:
+- System performance metrics
+- Service health information  
+- Resource utilization data
+- Request/response statistics
+- Error rates and latency metrics
+
+## Authentication
+
+All metrics endpoints require Bearer token authentication:
+
+```
+Authorization: Bearer <your-api-token>
+```
+
+Unauthorized requests return HTTP 401.
+
+## Endpoint
+
+**Base Path:** `/api/metrics`
+
+**Method:** GET
+
+**Description:** Proxies requests to the underlying Prometheus API
+
+## Usage Examples
+
+### Query Current Metrics
+
+```bash
+# Get all available metrics
+curl -H "Authorization: Bearer your-token" \
+  "http://api-gateway:8080/api/metrics/query?query=up"
+
+# Get specific metric with time range
+curl -H "Authorization: Bearer your-token" \
+  "http://api-gateway:8080/api/metrics/query_range?query=cpu_usage&start=1640995200&end=1640998800&step=60"
+
+# Get metric metadata
+curl -H "Authorization: Bearer your-token" \
+  "http://api-gateway:8080/api/metrics/metadata"
+```
+
+### Common Prometheus API Endpoints
+
+The metrics API supports all standard Prometheus API endpoints:
+
+#### Instant Queries
+```
+GET /api/metrics/query?query=<prometheus_query>
+```
+
+#### Range Queries  
+```
+GET /api/metrics/query_range?query=<query>&start=<timestamp>&end=<timestamp>&step=<duration>
+```
+
+#### Metadata
+```
+GET /api/metrics/metadata
+GET /api/metrics/metadata?metric=<metric_name>
+```
+
+#### Series
+```
+GET /api/metrics/series?match[]=<series_selector>
+```
+
+#### Label Values
+```
+GET /api/metrics/label/<label_name>/values
+```
+
+#### Targets
+```
+GET /api/metrics/targets
+```
+
+## Example Queries
+
+### System Health
+```bash
+# Check if services are up
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=up"
+
+# Get service uptime
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=time()-process_start_time_seconds"
+```
+
+### Performance Metrics
+```bash
+# CPU usage
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=rate(cpu_seconds_total[5m])"
+
+# Memory usage
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=process_resident_memory_bytes"
+
+# Request rate
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=rate(http_requests_total[5m])"
+```
+
+### TrustGraph-Specific Metrics
+```bash
+# Document processing rate
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=rate(trustgraph_documents_processed_total[5m])"
+
+# Knowledge graph size
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=trustgraph_triples_count"
+
+# Embedding generation rate
+curl -H "Authorization: Bearer token" \
+  "http://api-gateway:8080/api/metrics/query?query=rate(trustgraph_embeddings_generated_total[5m])"
+```
+
+## Response Format
+
+Responses follow the standard Prometheus API format:
+
+### Successful Query Response
+```json
+{
+    "status": "success",
+    "data": {
+        "resultType": "vector",
+        "result": [
+            {
+                "metric": {
+                    "__name__": "up",
+                    "instance": "api-gateway:8080",
+                    "job": "trustgraph"
+                },
+                "value": [1640995200, "1"]
+            }
+        ]
+    }
+}
+```
+
+### Range Query Response
+```json
+{
+    "status": "success", 
+    "data": {
+        "resultType": "matrix",
+        "result": [
+            {
+                "metric": {
+                    "__name__": "cpu_usage",
+                    "instance": "worker-1"
+                },
+                "values": [
+                    [1640995200, "0.15"],
+                    [1640995260, "0.18"],
+                    [1640995320, "0.12"]
+                ]
+            }
+        ]
+    }
+}
+```
+
+### Error Response
+```json
+{
+    "status": "error",
+    "errorType": "bad_data",
+    "error": "invalid query syntax"
+}
+```
+
+## Available Metrics
+
+### Standard System Metrics
+- `up`: Service availability (1 = up, 0 = down)
+- `process_resident_memory_bytes`: Memory usage
+- `process_cpu_seconds_total`: CPU time
+- `http_requests_total`: HTTP request count
+- `http_request_duration_seconds`: Request latency
+
+### TrustGraph-Specific Metrics
+- `trustgraph_documents_processed_total`: Documents processed count
+- `trustgraph_triples_count`: Knowledge graph triple count
+- `trustgraph_embeddings_generated_total`: Embeddings generated count
+- `trustgraph_flow_executions_total`: Flow execution count
+- `trustgraph_pulsar_messages_total`: Pulsar message count
+- `trustgraph_errors_total`: Error count by component
+
+## Time Series Queries
+
+### Time Ranges
+Use standard Prometheus time range formats:
+- `5m`: 5 minutes
+- `1h`: 1 hour  
+- `1d`: 1 day
+- `1w`: 1 week
+
+### Rate Calculations
+```bash
+# 5-minute rate
+rate(metric_name[5m])
+
+# Increase over time
+increase(metric_name[1h])
+```
+
+### Aggregations
+```bash
+# Sum across instances
+sum(metric_name)
+
+# Average by label
+avg by (instance) (metric_name)
+
+# Top 5 values
+topk(5, metric_name)
+```
+
+## Integration Examples
+
+### Python Integration
+```python
+import requests
+
+def query_metrics(token, query):
+    headers = {"Authorization": f"Bearer {token}"}
+    params = {"query": query}
+    
+    response = requests.get(
+        "http://api-gateway:8080/api/metrics/query",
+        headers=headers,
+        params=params
+    )
+    
+    return response.json()
+
+# Get system uptime
+uptime = query_metrics("your-token", "time() - process_start_time_seconds")
+```
+
+### JavaScript Integration
+```javascript
+async function queryMetrics(token, query) {
+    const response = await fetch(
+        `http://api-gateway:8080/api/metrics/query?query=${encodeURIComponent(query)}`,
+        {
+            headers: {
+                'Authorization': `Bearer ${token}`
+            }
+        }
+    );
+    
+    return await response.json();
+}
+
+// Get request rate
+const requestRate = await queryMetrics('your-token', 'rate(http_requests_total[5m])');
+```
+
+## Error Handling
+
+### Common HTTP Status Codes
+- `200`: Success
+- `400`: Bad request (invalid query)
+- `401`: Unauthorized (invalid/missing token)
+- `422`: Unprocessable entity (query execution error)
+- `500`: Internal server error
+
+### Error Types
+- `bad_data`: Invalid query syntax
+- `timeout`: Query execution timeout
+- `canceled`: Query was canceled
+- `execution`: Query execution error
+
+## Best Practices
+
+### Query Optimization
+- Use appropriate time ranges to limit data volume
+- Apply label filters to reduce result sets
+- Use recording rules for frequently accessed metrics
+
+### Rate Limiting
+- Avoid high-frequency polling
+- Cache results when appropriate
+- Use appropriate step sizes for range queries
+
+### Security
+- Keep API tokens secure
+- Use HTTPS in production
+- Rotate tokens regularly
+
+## Use Cases
+
+- **System Monitoring**: Track system health and performance
+- **Capacity Planning**: Monitor resource utilization trends  
+- **Alerting**: Set up alerts based on metric thresholds
+- **Performance Analysis**: Analyze system performance over time
+- **Debugging**: Investigate issues using detailed metrics
+- **Business Intelligence**: Track document processing and knowledge extraction metrics
--- a/docs/apis/api-prompt.md
+++ b/docs/apis/api-prompt.md
@ -15,7 +15,7 @@ The request contains the following fields:

 ### Response

-The request contains either of these fields:
+The response contains either of these fields:
 - `text`: A plain text response
 - `object`: A structured object, JSON-encoded

@ -60,6 +60,7 @@ Request:
 {
    "id": "akshfkiehfkseffh-142",
    "service": "prompt",
+    "flow": "default",
    "request": {
        "id": "extract-definitions",
        "variables": {
--- a/docs/apis/api-text-completion.md
+++ b/docs/apis/api-text-completion.md
@ -19,7 +19,7 @@ The request contains the following fields:

 ### Response

-The request contains the following fields:
+The response contains the following fields:
 - `response`: LLM response

 ## REST service
@ -59,6 +59,7 @@ Request:
 {
    "id": "blrqotfefnmnh7de-1",
    "service": "text-completion",
+    "flow": "default",
    "request": {
        "system": "You are a helpful agent",
        "prompt": "What does NASA stand for?"
--- a/docs/apis/api-text-load.md
+++ b/docs/apis/api-text-load.md
@ -0,0 +1,168 @@
+# TrustGraph Text Load API
+
+This API loads text documents into TrustGraph processing pipelines. It's a sender API 
+that accepts text documents with metadata and queues them for processing through 
+specified flows.
+
+## Request Format
+
+The text-load API accepts a JSON request with the following fields:
+- `id`: Document identifier (typically a URI)
+- `metadata`: Array of RDF triples providing document metadata
+- `charset`: Character encoding (defaults to "utf-8")
+- `text`: Base64-encoded text content
+- `user`: User identifier (defaults to "trustgraph")
+- `collection`: Collection identifier (defaults to "default")
+
+## Request Example
+
+```json
+{
+    "id": "https://example.com/documents/research-paper-123",
+    "metadata": [
+        {
+            "s": {"v": "https://example.com/documents/research-paper-123", "e": true},
+            "p": {"v": "http://purl.org/dc/terms/title", "e": true},
+            "o": {"v": "Machine Learning in Healthcare", "e": false}
+        },
+        {
+            "s": {"v": "https://example.com/documents/research-paper-123", "e": true},
+            "p": {"v": "http://purl.org/dc/terms/creator", "e": true},
+            "o": {"v": "Dr. Jane Smith", "e": false}
+        },
+        {
+            "s": {"v": "https://example.com/documents/research-paper-123", "e": true},
+            "p": {"v": "http://purl.org/dc/terms/subject", "e": true},
+            "o": {"v": "Healthcare AI", "e": false}
+        }
+    ],
+    "charset": "utf-8",
+    "text": "VGhpcyBpcyBhIHNhbXBsZSByZXNlYXJjaCBwYXBlciBhYm91dCBtYWNoaW5lIGxlYXJuaW5nIGluIGhlYWx0aGNhcmUuLi4=",
+    "user": "researcher",
+    "collection": "healthcare-research"
+}
+```
+
+## Response
+
+The text-load API is a sender API with no response body. Success is indicated by HTTP status code 200.
+
+## REST service
+
+The text-load service is available at:
+`POST /api/v1/flow/{flow-id}/service/text-load`
+
+Where `{flow-id}` is the identifier of the flow that will process the document.
+
+Example:
+```bash
+curl -X POST \
+  -H "Content-Type: application/json" \
+  -d @document.json \
+  http://api-gateway:8080/api/v1/flow/pdf-processing/service/text-load
+```
+
+## Metadata Format
+
+Each metadata triple contains:
+- `s`: Subject (object with `v` for value and `e` for is_entity boolean)
+- `p`: Predicate (object with `v` for value and `e` for is_entity boolean)
+- `o`: Object (object with `v` for value and `e` for is_entity boolean)
+
+The `e` field indicates whether the value should be treated as an entity (true) or literal (false).
+
+## Common Metadata Properties
+
+### Document Properties
+- `http://purl.org/dc/terms/title`: Document title
+- `http://purl.org/dc/terms/creator`: Document author
+- `http://purl.org/dc/terms/subject`: Document subject/topic
+- `http://purl.org/dc/terms/description`: Document description
+- `http://purl.org/dc/terms/date`: Publication date
+- `http://purl.org/dc/terms/language`: Document language
+
+### Organizational Properties
+- `http://xmlns.com/foaf/0.1/name`: Organization name
+- `http://www.w3.org/2006/vcard/ns#hasAddress`: Organization address
+- `http://xmlns.com/foaf/0.1/homepage`: Organization website
+
+### Publication Properties
+- `http://purl.org/ontology/bibo/doi`: DOI identifier
+- `http://purl.org/ontology/bibo/isbn`: ISBN identifier
+- `http://purl.org/ontology/bibo/volume`: Publication volume
+- `http://purl.org/ontology/bibo/issue`: Publication issue
+
+## Text Encoding
+
+The `text` field must contain base64-encoded content. To encode text:
+
+```bash
+# Command line encoding
+echo "Your text content here" | base64
+
+# Python encoding
+import base64
+encoded_text = base64.b64encode("Your text content here".encode('utf-8')).decode('utf-8')
+```
+
+## Integration with Processing Flows
+
+Once loaded, text documents are processed through the specified flow, which typically includes:
+
+1. **Text Chunking**: Breaking documents into manageable chunks
+2. **Embedding Generation**: Creating vector embeddings for semantic search
+3. **Knowledge Extraction**: Extracting entities and relationships
+4. **Graph Storage**: Storing extracted knowledge in the knowledge graph
+5. **Indexing**: Making content searchable for RAG queries
+
+## Error Handling
+
+Common errors include:
+- Invalid base64 encoding in text field
+- Missing required fields (id, text)
+- Invalid metadata triple format
+- Flow not found or inactive
+
+## Python SDK
+
+```python
+import base64
+from trustgraph.api.text_load import TextLoadClient
+
+client = TextLoadClient()
+
+# Prepare document
+document = {
+    "id": "https://example.com/doc-123",
+    "metadata": [
+        {
+            "s": {"v": "https://example.com/doc-123", "e": True},
+            "p": {"v": "http://purl.org/dc/terms/title", "e": True},
+            "o": {"v": "Sample Document", "e": False}
+        }
+    ],
+    "charset": "utf-8",
+    "text": base64.b64encode("Document content here".encode('utf-8')).decode('utf-8'),
+    "user": "alice",
+    "collection": "research"
+}
+
+# Load document
+await client.load_text_document("my-flow", document)
+```
+
+## Use Cases
+
+- **Research Paper Ingestion**: Load academic papers with rich metadata
+- **Document Processing**: Ingest documents for knowledge extraction
+- **Content Management**: Build searchable document repositories
+- **RAG System Population**: Load content for question-answering systems
+- **Knowledge Base Construction**: Convert documents into structured knowledge
+
+## Features
+
+- **Rich Metadata**: Full RDF metadata support for semantic annotation
+- **Flow Integration**: Direct integration with TrustGraph processing flows
+- **Multi-tenant**: User and collection-based document organization
+- **Encoding Support**: Flexible character encoding support
+- **No Response Required**: Fire-and-forget operation for high throughput
--- a/docs/apis/api-triples-query.md
+++ b/docs/apis/api-triples-query.md
@ -21,7 +21,7 @@ Returned triples will match all of `s`, `p` and `o` where provided.

 ### Response

-The request contains the following fields:
+The response contains the following fields:
 - `response`: A list of triples.

 Each triple contains `s`, `p` and `o` fields describing the
@ -33,15 +33,53 @@ Each triple element uses the same schema:
 - `is_uri`: A boolean value which is true if this is a graph entity i.e.
  `value` is a URI, not a literal value.

+## Data Format Details
+
+### Triple Element Format
+
+To reduce the size of JSON messages, triple elements (subject, predicate, object) are encoded using a compact format:
+
+- `v`: The value as a string (maps to `value` in the full schema)
+- `e`: Boolean indicating if this is an entity/URI (maps to `is_uri` in the full schema)
+
+Each triple element (`s`, `p`, `o`) contains:
+- `v`: The actual value as a string
+- `e`: Boolean indicating the value type
+  - `true`: The value is a URI/entity (e.g., `"http://example.com/Person1"`)
+  - `false`: The value is a literal (e.g., `"John Doe"`, `"42"`, `"2023-01-01"`)
+
+### Examples
+
+**URI/Entity Element:**
+```json
+{
+    "v": "http://trustgraph.ai/e/space-station-modules",
+    "e": true
+}
+```
+
+**Literal Element:**
+```json
+{
+    "v": "space station modules", 
+    "e": false
+}
+```
+
+**Numeric Literal:**
+```json
+{
+    "v": "42",
+    "e": false
+}
+```
+
 ## REST service

 The REST service accepts a request object containing the `s`, `p`, `o`
 and `limit` fields.
 The response is a JSON object containing the `response` field.

-To reduce the size of the JSON, the graph entities are encoded as an
-object with `value` and `is_uri` mapped to `v` and `e` respectively.
-
 e.g.

 This example query matches triples with a subject of
@ -58,6 +96,7 @@ Request:
 {
    "id": "qgzw1287vfjc8wsk-4",
    "service": "triples-query",
+    "flow": "default",
    "request": {
        "s": {
            "v": "http://trustgraph.ai/e/space-station-modules",
@ -97,13 +136,9 @@ Response:

 ## Websocket

-Requests have a `request` object containing the `system` and
-`prompt` fields.
+Requests have a `request` object containing the query fields (`s`, `p`, `o`, `limit`).
 Responses have a `response` object containing `response` field.

-To reduce the size of the JSON, the graph entities are encoded as an
-object with `value` and `is_uri` mapped to `v` and `e` respectively.
-
 e.g.

 Request:
@ -178,10 +213,3 @@ The client class is

 https://github.com/trustgraph-ai/trustgraph/blob/master/trustgraph-base/trustgraph/clients/triples_query_client.py

-
-
-
-
-
-
-
--- a/docs/apis/pulsar.md
+++ b/docs/apis/pulsar.md
@ -1,3 +1,230 @@
+# TrustGraph Pulsar API

-Coming soon
+Apache Pulsar is the underlying message queue system used by TrustGraph for inter-component communication. Understanding Pulsar queue names is essential for direct integration with TrustGraph services.

+## Overview
+
+TrustGraph uses two types of APIs with different queue naming patterns:
+
+1. **Global Services**: Fixed queue names, not dependent on flows
+2. **Flow-Hosted Services**: Dynamic queue names that depend on the specific flow configuration
+
+## Global Services (Fixed Queue Names)
+
+These services run independently and have fixed Pulsar queue names:
+
+### Config API
+- **Request Queue**: `non-persistent://tg/request/config`
+- **Response Queue**: `non-persistent://tg/response/config`
+- **Push Queue**: `persistent://tg/config/config`
+
+### Flow API
+- **Request Queue**: `non-persistent://tg/request/flow`
+- **Response Queue**: `non-persistent://tg/response/flow`
+
+### Knowledge API
+- **Request Queue**: `non-persistent://tg/request/knowledge`
+- **Response Queue**: `non-persistent://tg/response/knowledge`
+
+### Librarian API
+- **Request Queue**: `non-persistent://tg/request/librarian`
+- **Response Queue**: `non-persistent://tg/response/librarian`
+
+## Flow-Hosted Services (Dynamic Queue Names)
+
+These services are hosted within specific flows and have queue names that depend on the flow configuration:
+
+- Agent API
+- Document RAG API
+- Graph RAG API
+- Text Completion API
+- Prompt API
+- Embeddings API
+- Graph Embeddings API
+- Triples Query API
+- Text Load API
+- Document Load API
+
+## Discovering Flow-Hosted Queue Names
+
+To find the queue names for flow-hosted services, you need to query the flow configuration using the Config API.
+
+### Method 1: Using the Config API
+
+Query for the flow configuration:
+
+**Request:**
+```json
+{
+    "operation": "get",
+    "keys": [
+        {
+            "type": "flows",
+            "key": "your-flow-name"
+        }
+    ]
+}
+```
+
+**Response:**
+The response will contain a flow definition with an "interfaces" object that lists all queue names.
+
+### Method 2: Using the CLI
+
+Use the TrustGraph CLI to dump the configuration:
+
+```bash
+tg-show-config
+```
+
+## Flow Interface Types
+
+Flow configurations define two types of service interfaces:
+
+### 1. Request/Response Interfaces
+
+Services that accept a request and return a response:
+
+```json
+{
+    "graph-rag": {
+        "request": "non-persistent://tg/request/graph-rag:document-rag+graph-rag",
+        "response": "non-persistent://tg/response/graph-rag:document-rag+graph-rag"
+    }
+}
+```
+
+**Examples**: agent, document-rag, graph-rag, text-completion, prompt, embeddings, graph-embeddings, triples
+
+### 2. Fire-and-Forget Interfaces  
+
+Services that accept data but don't return a response:
+
+```json
+{
+    "text-load": "persistent://tg/flow/text-document-load:default"
+}
+```
+
+**Examples**: text-load, document-load, triples-store, graph-embeddings-store, document-embeddings-store, entity-contexts-load
+
+## Example Flow Configuration
+
+Here's an example of a complete flow configuration showing queue names:
+
+```json
+{
+    "class-name": "document-rag+graph-rag",
+    "description": "Default processing flow", 
+    "interfaces": {
+        "agent": {
+            "request": "non-persistent://tg/request/agent:default",
+            "response": "non-persistent://tg/response/agent:default"
+        },
+        "document-rag": {
+            "request": "non-persistent://tg/request/document-rag:document-rag+graph-rag",
+            "response": "non-persistent://tg/response/document-rag:document-rag+graph-rag"
+        },
+        "graph-rag": {
+            "request": "non-persistent://tg/request/graph-rag:document-rag+graph-rag", 
+            "response": "non-persistent://tg/response/graph-rag:document-rag+graph-rag"
+        },
+        "text-completion": {
+            "request": "non-persistent://tg/request/text-completion:document-rag+graph-rag",
+            "response": "non-persistent://tg/response/text-completion:document-rag+graph-rag"
+        },
+        "embeddings": {
+            "request": "non-persistent://tg/request/embeddings:document-rag+graph-rag",
+            "response": "non-persistent://tg/response/embeddings:document-rag+graph-rag"
+        },
+        "triples": {
+            "request": "non-persistent://tg/request/triples:document-rag+graph-rag",
+            "response": "non-persistent://tg/response/triples:document-rag+graph-rag"
+        },
+        "text-load": "persistent://tg/flow/text-document-load:default",
+        "document-load": "persistent://tg/flow/document-load:default",
+        "triples-store": "persistent://tg/flow/triples-store:default",
+        "graph-embeddings-store": "persistent://tg/flow/graph-embeddings-store:default"
+    }
+}
+```
+
+## Queue Naming Patterns
+
+### Global Services
+- **Pattern**: `{persistence}://tg/{namespace}/{service-name}`
+- **Example**: `non-persistent://tg/request/config`
+
+### Flow-Hosted Request/Response
+- **Pattern**: `{persistence}://tg/{namespace}/{service-name}:{flow-identifier}`
+- **Example**: `non-persistent://tg/request/graph-rag:document-rag+graph-rag`
+
+### Flow-Hosted Fire-and-Forget
+- **Pattern**: `{persistence}://tg/flow/{service-name}:{flow-identifier}`
+- **Example**: `persistent://tg/flow/text-document-load:default`
+
+## Persistence Types
+
+- **non-persistent**: Messages are not persisted to disk, faster but less reliable
+- **persistent**: Messages are persisted to disk, slower but more reliable
+
+## Practical Usage
+
+### Python Example
+
+```python
+import pulsar
+from trustgraph.schema import ConfigRequest, ConfigResponse
+
+# Connect to Pulsar
+client = pulsar.Client('pulsar://localhost:6650')
+
+# Create producer for config requests
+producer = client.create_producer(
+    'non-persistent://tg/request/config',
+    schema=pulsar.schema.AvroSchema(ConfigRequest)
+)
+
+# Create consumer for config responses  
+consumer = client.subscribe(
+    'non-persistent://tg/response/config',
+    subscription_name='my-subscription',
+    schema=pulsar.schema.AvroSchema(ConfigResponse)
+)
+
+# Send request
+request = ConfigRequest(operation='list-classes')
+producer.send(request)
+
+# Receive response
+response = consumer.receive()
+print(response.value())
+```
+
+### Flow Service Example
+
+```python
+# First, get the flow configuration to find queue names
+config_request = ConfigRequest(
+    operation='get',
+    keys=[ConfigKey(type='flows', key='my-flow')]
+)
+
+# Use the returned interface information to determine queue names
+# Then connect to the appropriate queues for the service you need
+```
+
+## Best Practices
+
+1. **Query Flow Configuration**: Always query the current flow configuration to get accurate queue names
+2. **Handle Dynamic Names**: Flow-hosted service queue names can change when flows are reconfigured
+3. **Choose Appropriate Persistence**: Use persistent queues for critical data, non-persistent for performance
+4. **Schema Validation**: Use the appropriate Pulsar schema for each service
+5. **Error Handling**: Implement proper error handling for queue connection and message failures
+
+## Security Considerations
+
+- Pulsar access should be restricted in production environments
+- Use appropriate authentication and authorization mechanisms
+- Monitor queue access and message patterns for security anomalies
+- Consider encryption for sensitive data in messages
--- a/docs/apis/websocket.md
+++ b/docs/apis/websocket.md
@ -18,13 +18,16 @@ When hosted using docker compose, you can access the service at

 ## Request

-A request message is a JSON message containing 3 fields:
+A request message is a JSON message containing 3/4 fields:

 - `id`: A unique ID which is used to correlate requests and responses.
  You should make sure it is unique.
 - `service`: The name of the service to invoke.
 - `request`: The request body which is passed to the service - this is
  defined in the API documentation for that service.
+- `flow`: Some APIs are supported by processors launched within a flow,
+  are are dependent on a flow running.  For such APIs, the flow identifier
+  needs to be provided.

 e.g.

@ -32,6 +35,7 @@ e.g.
 {
    "id": "qgzw1287vfjc8wsk-1",
    "service": "graph-rag",
+    "flow": "default",
    "request": {
        "query": "What does NASA stand for?"
    }
@ -86,6 +90,7 @@ Request:
 {
    "id": "blrqotfefnmnh7de-20",
    "service": "agent",
+    "flow": "default",
    "request": {
        "question": "What does NASA stand for?"
    }