feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)

Introduces `workspace` as the isolation boundary for config, flows,
library, and knowledge data. Removes `user` as a schema-level field
throughout the code, API specs, and tests. Workspace provides the
same separation, but it is applied at the trusted flow.workspace
layer rather than read from client-supplied message fields.

Design
------
- IAM tech spec (docs/tech-specs/iam.md) documents current state,
  proposed auth/access model, and migration direction.
- Data ownership model (docs/tech-specs/data-ownership-model.md)
  captures the workspace/collection/flow hierarchy.

Schema + messaging
------------------
- Drop `user` field from AgentRequest/Step, GraphRagQuery,
  DocumentRagQuery, Triples/Graph/Document/Row EmbeddingsRequest,
  Sparql/Rows/Structured QueryRequest, ToolServiceRequest.
- Keep collection/workspace routing via flow.workspace at the
  service layer.
- Translators updated to not serialise/deserialise user.

API specs
---------
- OpenAPI schemas and path examples cleaned of user fields.
- Websocket async-api messages updated.
- Removed the unused parameters/User.yaml.

Services + base
---------------
- Librarian, collection manager, knowledge, config: all operations
  scoped by workspace. Config client API takes workspace as first
  positional arg.
- `flow.workspace` set at flow start time by the infrastructure;
  no longer pass-through from clients.
- Tool service drops user-personalisation passthrough.

CLI + SDK
---------
- tg-init-workspace and workspace-aware import/export.
- All tg-* commands drop user args; accept --workspace.
- Python API/SDK (flow, socket_client, async_*, explainability,
  library) drop user kwargs from every method signature.
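
On the SDK side, the `Api.request` hunk in this diff stamps the
client's configured workspace onto each REST payload unless the
caller already set one. A minimal standalone sketch of that merge
rule follows; the helper name `with_workspace` is hypothetical and
is not part of the SDK.

```python
def with_workspace(request, workspace="default"):
    """Return the payload with a workspace key added if it lacks one.

    Hypothetical helper mirroring the rule in the Api.request hunk:
    a workspace already present in the payload takes precedence, and
    non-dict payloads are passed through untouched.
    """
    if isinstance(request, dict) and "workspace" not in request:
        # Build a new dict so the caller's payload is not mutated.
        return {**request, "workspace": workspace}
    return request


if __name__ == "__main__":
    print(with_workspace({"operation": "list"}, "acme"))
    print(with_workspace({"operation": "list", "workspace": "other"}, "acme"))
```

Copying into a new dict rather than mutating keeps the caller's
request object unchanged across retries.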

MCP server
----------
- All tool endpoints drop user parameters; socket_manager no longer
  keyed per user.

Flow service
------------
- Closure-based topic cleanup on flow stop: only delete topics
  whose blueprint template was parameterised AND no remaining
  live flow (across all workspaces) still resolves to that topic.
  Three scopes fall out naturally from template analysis:
    * {id} -> per-flow, deleted on stop
    * {blueprint} -> per-blueprint, kept while any flow of the
      same blueprint exists
    * {workspace} -> per-workspace, kept while any flow in the
      workspace exists
    * literal -> global, never deleted (e.g. tg.request.librarian)
  Fixes a bug where stopping a flow silently destroyed the global
  librarian exchange, wedging all library operations until manual
  restart.
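
The cleanup rule above can be sketched roughly as follows. This is
an illustration under assumptions, not the flow service's code: the
`Flow` dataclass, the `blueprints` mapping, and every topic name
except tg.request.librarian are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    id: str
    blueprint: str
    workspace: str


def resolve(template: str, flow: Flow) -> str:
    """Substitute {id}, {blueprint} and {workspace} into a topic template."""
    return template.format(
        id=flow.id, blueprint=flow.blueprint, workspace=flow.workspace
    )


def topics_to_delete(stopped: Flow, live_flows, blueprints) -> set:
    """Topics that are safe to delete once `stopped` has been stopped.

    blueprints maps a blueprint name to its list of topic templates
    (an assumed shape). live_flows are the flows still running across
    all workspaces, excluding the stopped one.
    """
    # Every topic that some remaining live flow still resolves to.
    still_used = {
        resolve(template, flow)
        for flow in live_flows
        for template in blueprints[flow.blueprint]
    }
    return {
        resolve(template, stopped)
        for template in blueprints[stopped.blueprint]
        # Literal templates are global and are never deleted.
        if "{" in template and resolve(template, stopped) not in still_used
    }


if __name__ == "__main__":
    blueprints = {
        "rag": [
            "tg.flow.{id}.in",           # hypothetical per-flow topic
            "tg.bp.{blueprint}.shared",  # hypothetical per-blueprint topic
            "tg.ws.{workspace}.events",  # hypothetical per-workspace topic
            "tg.request.librarian",      # literal, global
        ]
    }
    stopped = Flow("f1", "rag", "acme")
    sibling = Flow("f2", "rag", "acme")
    print(sorted(topics_to_delete(stopped, [sibling], blueprints)))
```

No scope is encoded explicitly here: the per-flow, per-blueprint and
per-workspace behaviour follows from which variables each template
uses and which live flows still resolve to the same name.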

RabbitMQ backend
----------------
- Set heartbeat=60 and blocked_connection_timeout=300 (seconds).
  Catches silently dead connections (broker restart, orphaned
  channels, network partitions) within ~2 heartbeat windows, so the
  consumer reconnects and re-binds its queue rather than sitting
  forever on a zombie connection.
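
As a rough illustration of these settings, the sketch below collects
them as keyword arguments and works out the approximate detection
window. It assumes the backend uses a pika-style client whose
connection parameters accept `heartbeat` and
`blocked_connection_timeout` in seconds; the helper names are
hypothetical and the sketch deliberately avoids importing any client
library.

```python
HEARTBEAT_SECONDS = 60
BLOCKED_CONNECTION_TIMEOUT_SECONDS = 300


def connection_kwargs(host="localhost"):
    """Keyword arguments for a pika-style ConnectionParameters (assumed)."""
    return {
        "host": host,
        "heartbeat": HEARTBEAT_SECONDS,
        "blocked_connection_timeout": BLOCKED_CONNECTION_TIMEOUT_SECONDS,
    }


def detection_window_seconds(missed_heartbeats=2):
    """Approximate time before a silently dead connection is noticed."""
    return missed_heartbeats * HEARTBEAT_SECONDS


if __name__ == "__main__":
    # If pika is the client in use, these would be passed as
    # pika.ConnectionParameters(**connection_kwargs()).
    print(connection_kwargs())
    print(detection_window_seconds())
```

With a 60-second heartbeat, two missed windows put the detection
delay at roughly two minutes, after which the consumer can reconnect
and re-bind its queue.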

Tests
-----
- Full test refresh: unit, integration, contract, provenance.
- Dropped user-field assertions and constructor kwargs across
  ~100 test files.
- Renamed user-collection isolation tests to workspace-collection.
cybermaggedon 2026-04-21 23:23:01 +01:00 committed by GitHub
parent 9332089b3d
commit d35473f7f7
377 changed files with 6868 additions and 5785 deletions

View file

@ -27,7 +27,6 @@ Quick Start:
# Execute a graph RAG query
response = flow.graph_rag(
query="What are the main topics?",
user="trustgraph",
collection="default"
)
```
@ -38,7 +37,7 @@ For streaming and async operations:
socket = api.socket()
flow = socket.flow("default")
for chunk in flow.agent(question="Hello", user="trustgraph"):
for chunk in flow.agent(question="Hello"):
print(chunk.content)
# Async operations

View file

@ -50,7 +50,7 @@ class Api:
token: Optional bearer token for authentication
"""
def __init__(self, url="http://localhost:8088/", timeout=60, token: Optional[str] = None):
def __init__(self, url="http://localhost:8088/", timeout=60, token: Optional[str] = None, workspace: str = "default"):
"""
Initialize the TrustGraph API client.
@ -82,6 +82,7 @@ class Api:
self.timeout = timeout
self.token = token
self.workspace = workspace
# Lazy initialization for new clients
self._socket_client = None
@ -137,7 +138,7 @@ class Api:
config.put([ConfigValue(type="llm", key="model", value="gpt-4")])
```
"""
return Config(api=self)
return Config(api=self, workspace=self.workspace)
def knowledge(self):
"""
@ -151,10 +152,10 @@ class Api:
knowledge = api.knowledge()
# List available KG cores
cores = knowledge.list_kg_cores(user="trustgraph")
cores = knowledge.list_kg_cores()
# Load a KG core
knowledge.load_kg_core(id="core-123", user="trustgraph")
knowledge.load_kg_core(id="core-123")
```
"""
return Knowledge(api=self)
@ -191,6 +192,12 @@ class Api:
if self.token:
headers["Authorization"] = f"Bearer {self.token}"
# Ensure every REST request carries the workspace so services can
# scope their behaviour. Callers that already set workspace in the
# payload (e.g. Library client) take precedence.
if isinstance(request, dict) and "workspace" not in request:
request = {**request, "workspace": self.workspace}
# Invoke the API, input is passed as JSON
resp = requests.post(url, json=request, timeout=self.timeout, headers=headers)
@ -227,13 +234,12 @@ class Api:
document=b"Document content",
id="doc-123",
metadata=[],
user="trustgraph",
title="My Document",
comments="Test document"
)
# List documents
docs = library.get_documents(user="trustgraph")
docs = library.get_documents()
```
"""
return Library(self)
@ -253,11 +259,10 @@ class Api:
collection = api.collection()
# List collections
colls = collection.list_collections(user="trustgraph")
colls = collection.list_collections()
# Update collection metadata
collection.update_collection(
user="trustgraph",
collection="default",
name="Default Collection",
description="Main data collection"
@ -286,7 +291,6 @@ class Api:
# Stream agent responses
for chunk in flow.agent(
question="Explain quantum computing",
user="trustgraph",
streaming=True
):
if hasattr(chunk, 'content'):
@ -297,7 +301,10 @@ class Api:
from . socket_client import SocketClient
# Extract base URL (remove api/v1/ suffix)
base_url = self.url.rsplit("api/v1/", 1)[0].rstrip("/")
self._socket_client = SocketClient(base_url, self.timeout, self.token)
self._socket_client = SocketClient(
base_url, self.timeout, self.token,
workspace=self.workspace,
)
return self._socket_client
def bulk(self):
@ -406,7 +413,6 @@ class Api:
# Stream agent responses
async for chunk in flow.agent(
question="Explain quantum computing",
user="trustgraph",
streaming=True
):
if hasattr(chunk, 'content'):
@ -417,7 +423,10 @@ class Api:
from . async_socket_client import AsyncSocketClient
# Extract base URL (remove api/v1/ suffix)
base_url = self.url.rsplit("api/v1/", 1)[0].rstrip("/")
self._async_socket_client = AsyncSocketClient(base_url, self.timeout, self.token)
self._async_socket_client = AsyncSocketClient(
base_url, self.timeout, self.token,
workspace=self.workspace,
)
return self._async_socket_client
def async_bulk(self):

View file

@ -326,9 +326,7 @@ class AsyncFlow:
# Use flow services
result = await flow.graph_rag(
query="What is TrustGraph?",
user="trustgraph",
collection="default"
query="What is TrustGraph?",collection="default"
)
```
"""
@ -385,7 +383,7 @@ class AsyncFlowInstance:
"""
return await self.flow.request(f"flow/{self.flow_id}/service/{service}", request_data)
async def agent(self, question: str, user: str, state: Optional[Dict] = None,
async def agent(self, question: str, state: Optional[Dict] = None,
group: Optional[str] = None, history: Optional[List] = None, **kwargs: Any) -> Dict[str, Any]:
"""
Execute an agent operation (non-streaming).
@ -399,7 +397,6 @@ class AsyncFlowInstance:
Args:
question: User question or instruction
user: User identifier
state: Optional state dictionary for conversation context
group: Optional group identifier for session management
history: Optional conversation history list
@ -416,14 +413,12 @@ class AsyncFlowInstance:
# Execute agent
result = await flow.agent(
question="What is the capital of France?",
user="trustgraph"
)
)
print(f"Answer: {result.get('response')}")
```
"""
request_data = {
"question": question,
"user": user,
"streaming": False # REST doesn't support streaming
}
if state is not None:
@ -481,7 +476,7 @@ class AsyncFlowInstance:
model=result.get("model"),
)
async def graph_rag(self, query: str, user: str, collection: str,
async def graph_rag(self, query: str, collection: str,
max_subgraph_size: int = 1000, max_subgraph_count: int = 5,
max_entity_distance: int = 3, **kwargs: Any) -> str:
"""
@ -496,7 +491,6 @@ class AsyncFlowInstance:
Args:
query: User query text
user: User identifier
collection: Collection identifier containing the knowledge graph
max_subgraph_size: Maximum number of triples per subgraph (default: 1000)
max_subgraph_count: Maximum number of subgraphs to retrieve (default: 5)
@ -513,9 +507,7 @@ class AsyncFlowInstance:
# Query knowledge graph
response = await flow.graph_rag(
query="What are the relationships between these entities?",
user="trustgraph",
collection="medical-kb",
query="What are the relationships between these entities?",collection="medical-kb",
max_subgraph_count=3
)
print(response)
@ -523,7 +515,6 @@ class AsyncFlowInstance:
"""
request_data = {
"query": query,
"user": user,
"collection": collection,
"max-subgraph-size": max_subgraph_size,
"max-subgraph-count": max_subgraph_count,
@ -535,7 +526,7 @@ class AsyncFlowInstance:
result = await self.request("graph-rag", request_data)
return result.get("response", "")
async def document_rag(self, query: str, user: str, collection: str,
async def document_rag(self, query: str, collection: str,
doc_limit: int = 10, **kwargs: Any) -> str:
"""
Execute document-based RAG query (non-streaming).
@ -549,7 +540,6 @@ class AsyncFlowInstance:
Args:
query: User query text
user: User identifier
collection: Collection identifier containing documents
doc_limit: Maximum number of document chunks to retrieve (default: 10)
**kwargs: Additional service-specific parameters
@ -564,9 +554,7 @@ class AsyncFlowInstance:
# Query documents
response = await flow.document_rag(
query="What does the documentation say about authentication?",
user="trustgraph",
collection="docs",
query="What does the documentation say about authentication?",collection="docs",
doc_limit=5
)
print(response)
@ -574,7 +562,6 @@ class AsyncFlowInstance:
"""
request_data = {
"query": query,
"user": user,
"collection": collection,
"doc-limit": doc_limit,
"streaming": False
@ -584,7 +571,7 @@ class AsyncFlowInstance:
result = await self.request("document-rag", request_data)
return result.get("response", "")
async def graph_embeddings_query(self, text: str, user: str, collection: str, limit: int = 10, **kwargs: Any):
async def graph_embeddings_query(self, text: str, collection: str, limit: int = 10, **kwargs: Any):
"""
Query graph embeddings for semantic entity search.
@ -593,7 +580,6 @@ class AsyncFlowInstance:
Args:
text: Query text for semantic search
user: User identifier
collection: Collection identifier containing graph embeddings
limit: Maximum number of results to return (default: 10)
**kwargs: Additional service-specific parameters
@ -608,9 +594,7 @@ class AsyncFlowInstance:
# Find related entities
results = await flow.graph_embeddings_query(
text="machine learning algorithms",
user="trustgraph",
collection="tech-kb",
text="machine learning algorithms",collection="tech-kb",
limit=5
)
@ -624,7 +608,6 @@ class AsyncFlowInstance:
request_data = {
"vector": vector,
"user": user,
"collection": collection,
"limit": limit
}
@ -663,7 +646,7 @@ class AsyncFlowInstance:
return await self.request("embeddings", request_data)
async def triples_query(self, s=None, p=None, o=None, user=None, collection=None, limit=100, **kwargs: Any):
async def triples_query(self, s=None, p=None, o=None, collection=None, limit=100, **kwargs: Any):
"""
Query RDF triples using pattern matching.
@ -674,7 +657,6 @@ class AsyncFlowInstance:
s: Subject pattern (None for wildcard)
p: Predicate pattern (None for wildcard)
o: Object pattern (None for wildcard)
user: User identifier (None for all users)
collection: Collection identifier (None for all collections)
limit: Maximum number of triples to return (default: 100)
**kwargs: Additional service-specific parameters
@ -689,9 +671,7 @@ class AsyncFlowInstance:
# Find all triples with a specific predicate
results = await flow.triples_query(
p="knows",
user="trustgraph",
collection="social",
p="knows",collection="social",
limit=50
)
@ -706,15 +686,13 @@ class AsyncFlowInstance:
request_data["p"] = str(p)
if o is not None:
request_data["o"] = str(o)
if user is not None:
request_data["user"] = user
if collection is not None:
request_data["collection"] = collection
request_data.update(kwargs)
return await self.request("triples", request_data)
async def rows_query(self, query: str, user: str, collection: str, variables: Optional[Dict] = None,
async def rows_query(self, query: str, collection: str, variables: Optional[Dict] = None,
operation_name: Optional[str] = None, **kwargs: Any):
"""
Execute a GraphQL query on stored rows.
@ -724,7 +702,6 @@ class AsyncFlowInstance:
Args:
query: GraphQL query string
user: User identifier
collection: Collection identifier containing rows
variables: Optional GraphQL query variables
operation_name: Optional operation name for multi-operation queries
@ -750,9 +727,7 @@ class AsyncFlowInstance:
'''
result = await flow.rows_query(
query=query,
user="trustgraph",
collection="users",
query=query,collection="users",
variables={"status": "active"}
)
@ -762,7 +737,6 @@ class AsyncFlowInstance:
"""
request_data = {
"query": query,
"user": user,
"collection": collection
}
if variables:
@ -774,7 +748,7 @@ class AsyncFlowInstance:
return await self.request("rows", request_data)
async def row_embeddings_query(
self, text: str, schema_name: str, user: str = "trustgraph",
self, text: str, schema_name: str,
collection: str = "default", index_name: Optional[str] = None,
limit: int = 10, **kwargs: Any
):
@ -788,7 +762,6 @@ class AsyncFlowInstance:
Args:
text: Query text for semantic search
schema_name: Schema name to search within
user: User identifier (default: "trustgraph")
collection: Collection identifier (default: "default")
index_name: Optional index name to filter search to specific index
limit: Maximum number of results to return (default: 10)
@ -806,9 +779,7 @@ class AsyncFlowInstance:
# Search for customers by name similarity
results = await flow.row_embeddings_query(
text="John Smith",
schema_name="customers",
user="trustgraph",
collection="sales",
schema_name="customers",collection="sales",
limit=5
)
@ -823,7 +794,6 @@ class AsyncFlowInstance:
request_data = {
"vector": vector,
"schema_name": schema_name,
"user": user,
"collection": collection,
"limit": limit
}

View file

@ -22,10 +22,14 @@ class AsyncSocketClient:
Or call connect()/aclose() manually.
"""
def __init__(self, url: str, timeout: int, token: Optional[str]):
def __init__(
self, url: str, timeout: int, token: Optional[str],
workspace: str = "default",
):
self.url = self._convert_to_ws_url(url)
self.timeout = timeout
self.token = token
self.workspace = workspace
self._request_counter = 0
self._socket = None
self._connect_cm = None
@ -117,6 +121,7 @@ class AsyncSocketClient:
try:
message = {
"id": request_id,
"workspace": self.workspace,
"service": service,
"request": request
}
@ -149,6 +154,7 @@ class AsyncSocketClient:
try:
message = {
"id": request_id,
"workspace": self.workspace,
"service": service,
"request": request
}
@ -251,13 +257,12 @@ class AsyncSocketFlowInstance:
self.client = client
self.flow_id = flow_id
async def agent(self, question: str, user: str, state: Optional[Dict[str, Any]] = None,
async def agent(self, question: str, state: Optional[Dict[str, Any]] = None,
group: Optional[str] = None, history: Optional[list] = None,
streaming: bool = False, **kwargs) -> Union[Dict[str, Any], AsyncIterator]:
"""Agent with optional streaming"""
request = {
"question": question,
"user": user,
"streaming": streaming
}
if state is not None:
@ -303,13 +308,12 @@ class AsyncSocketFlowInstance:
if isinstance(chunk, RAGChunk):
yield chunk
async def graph_rag(self, query: str, user: str, collection: str,
async def graph_rag(self, query: str, collection: str,
max_subgraph_size: int = 1000, max_subgraph_count: int = 5,
max_entity_distance: int = 3, streaming: bool = False, **kwargs):
"""Graph RAG with optional streaming"""
request = {
"query": query,
"user": user,
"collection": collection,
"max-subgraph-size": max_subgraph_size,
"max-subgraph-count": max_subgraph_count,
@ -330,12 +334,11 @@ class AsyncSocketFlowInstance:
if hasattr(chunk, 'content'):
yield chunk.content
async def document_rag(self, query: str, user: str, collection: str,
async def document_rag(self, query: str, collection: str,
doc_limit: int = 10, streaming: bool = False, **kwargs):
"""Document RAG with optional streaming"""
request = {
"query": query,
"user": user,
"collection": collection,
"doc-limit": doc_limit,
"streaming": streaming
@ -375,14 +378,13 @@ class AsyncSocketFlowInstance:
if hasattr(chunk, 'content'):
yield chunk.content
async def graph_embeddings_query(self, text: str, user: str, collection: str, limit: int = 10, **kwargs):
async def graph_embeddings_query(self, text: str, collection: str, limit: int = 10, **kwargs):
"""Query graph embeddings for semantic search"""
emb_result = await self.embeddings(texts=[text])
vector = emb_result.get("vectors", [[]])[0]
request = {
"vector": vector,
"user": user,
"collection": collection,
"limit": limit
}
@ -397,7 +399,7 @@ class AsyncSocketFlowInstance:
return await self.client._send_request("embeddings", self.flow_id, request)
async def triples_query(self, s=None, p=None, o=None, user=None, collection=None, limit=100, **kwargs):
async def triples_query(self, s=None, p=None, o=None, collection=None, limit=100, **kwargs):
"""Triple pattern query"""
request = {"limit": limit}
if s is not None:
@ -406,20 +408,17 @@ class AsyncSocketFlowInstance:
request["p"] = str(p)
if o is not None:
request["o"] = str(o)
if user is not None:
request["user"] = user
if collection is not None:
request["collection"] = collection
request.update(kwargs)
return await self.client._send_request("triples", self.flow_id, request)
async def rows_query(self, query: str, user: str, collection: str, variables: Optional[Dict] = None,
async def rows_query(self, query: str, collection: str, variables: Optional[Dict] = None,
operation_name: Optional[str] = None, **kwargs):
"""GraphQL query against structured rows"""
request = {
"query": query,
"user": user,
"collection": collection
}
if variables:
@ -441,7 +440,7 @@ class AsyncSocketFlowInstance:
return await self.client._send_request("mcp-tool", self.flow_id, request)
async def row_embeddings_query(
self, text: str, schema_name: str, user: str = "trustgraph",
self, text: str, schema_name: str,
collection: str = "default", index_name: Optional[str] = None,
limit: int = 10, **kwargs
):
@ -452,7 +451,6 @@ class AsyncSocketFlowInstance:
request = {
"vector": vector,
"schema_name": schema_name,
"user": user,
"collection": collection,
"limit": limit
}

View file

@ -85,7 +85,7 @@ class BulkClient:
Args:
flow: Flow identifier
triples: Iterator yielding Triple objects
metadata: Metadata dict with id, metadata, user, collection
metadata: Metadata dict with id, metadata, collection
batch_size: Number of triples per batch (default 100)
**kwargs: Additional parameters (reserved for future use)
@ -105,7 +105,7 @@ class BulkClient:
bulk.import_triples(
flow="default",
triples=triple_generator(),
metadata={"id": "doc1", "metadata": [], "user": "user1", "collection": "default"}
metadata={"id": "doc1", "metadata": [], "collection": "default"}
)
```
"""
@ -121,7 +121,7 @@ class BulkClient:
ws_url = f"{ws_url}?token={self.token}"
if metadata is None:
metadata = {"id": "", "metadata": [], "user": "trustgraph", "collection": "default"}
metadata = {"id": "", "metadata": [], "collection": "default"}
async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
batch = []
@ -418,7 +418,7 @@ class BulkClient:
Args:
flow: Flow identifier
contexts: Iterator yielding context dictionaries
metadata: Metadata dict with id, metadata, user, collection
metadata: Metadata dict with id, metadata, collection
batch_size: Number of contexts per batch (default 100)
**kwargs: Additional parameters (reserved for future use)
@ -435,7 +435,7 @@ class BulkClient:
bulk.import_entity_contexts(
flow="default",
contexts=context_generator(),
metadata={"id": "doc1", "metadata": [], "user": "user1", "collection": "default"}
metadata={"id": "doc1", "metadata": [], "collection": "default"}
)
```
"""
@ -451,7 +451,7 @@ class BulkClient:
ws_url = f"{ws_url}?token={self.token}"
if metadata is None:
metadata = {"id": "", "metadata": [], "user": "trustgraph", "collection": "default"}
metadata = {"id": "", "metadata": [], "collection": "default"}
async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
batch = []

View file

@ -2,11 +2,9 @@
TrustGraph Collection Management
This module provides interfaces for managing data collections in TrustGraph.
Collections provide logical grouping and isolation for documents and knowledge
graph data.
Collections provide logical grouping within a workspace.
"""
import datetime
import logging
from . types import CollectionMetadata
@ -18,10 +16,9 @@ class Collection:
"""
Collection management client.
Provides methods for managing data collections, including listing,
updating metadata, and deleting collections. Collections organize
documents and knowledge graph data into logical groupings for
isolation and access control.
Provides methods for managing data collections within the configured
workspace, including listing, updating metadata, and deleting
collections.
"""
def __init__(self, api):
@ -45,45 +42,20 @@ class Collection:
"""
return self.api.request(f"collection-management", request)
def list_collections(self, user, tag_filter=None):
def list_collections(self, tag_filter=None):
"""
List all collections for a user.
Retrieves metadata for all collections owned by the specified user,
with optional filtering by tags.
List all collections in this workspace.
Args:
user: User identifier
tag_filter: Optional list of tags to filter collections (default: None)
tag_filter: Optional list of tags to filter collections
Returns:
list[CollectionMetadata]: List of collection metadata objects
Raises:
ProtocolException: If response format is invalid
Example:
```python
collection = api.collection()
# List all collections
all_colls = collection.list_collections(user="trustgraph")
for coll in all_colls:
print(f"{coll.collection}: {coll.name}")
print(f" Description: {coll.description}")
print(f" Tags: {', '.join(coll.tags)}")
# List collections with specific tags
research_colls = collection.list_collections(
user="trustgraph",
tag_filter=["research", "published"]
)
```
"""
input = {
"operation": "list-collections",
"user": user,
"workspace": self.api.workspace,
}
if tag_filter:
@ -92,7 +64,6 @@ class Collection:
object = self.request(input)
try:
# Handle case where collections might be None or missing
if object is None or "collections" not in object:
return []
@ -102,7 +73,6 @@ class Collection:
return [
CollectionMetadata(
user = v["user"],
collection = v["collection"],
name = v["name"],
description = v["description"],
@ -114,15 +84,11 @@ class Collection:
logger.error("Failed to parse collection list response", exc_info=True)
raise ProtocolException(f"Response not formatted correctly")
def update_collection(self, user, collection, name=None, description=None, tags=None):
def update_collection(self, collection, name=None, description=None, tags=None):
"""
Update collection metadata.
Updates the name, description, and/or tags for an existing collection.
Only provided fields are updated; others remain unchanged.
Args:
user: User identifier
collection: Collection identifier
name: New collection name (optional)
description: New collection description (optional)
@ -130,35 +96,11 @@ class Collection:
Returns:
CollectionMetadata: Updated collection metadata, or None if not found
Raises:
ProtocolException: If response format is invalid
Example:
```python
collection_api = api.collection()
# Update collection metadata
updated = collection_api.update_collection(
user="trustgraph",
collection="default",
name="Default Collection",
description="Main data collection for general use",
tags=["default", "production"]
)
# Update only specific fields
updated = collection_api.update_collection(
user="trustgraph",
collection="research",
description="Updated description"
)
```
"""
input = {
"operation": "update-collection",
"user": user,
"workspace": self.api.workspace,
"collection": collection,
}
@ -175,7 +117,6 @@ class Collection:
if "collections" in object and object["collections"]:
v = object["collections"][0]
return CollectionMetadata(
user = v["user"],
collection = v["collection"],
name = v["name"],
description = v["description"],
@ -186,37 +127,23 @@ class Collection:
logger.error("Failed to parse collection update response", exc_info=True)
raise ProtocolException(f"Response not formatted correctly")
def delete_collection(self, user, collection):
def delete_collection(self, collection):
"""
Delete a collection.
Removes a collection and all its associated data from the system.
Args:
user: User identifier
collection: Collection identifier to delete
Returns:
dict: Empty response object
Example:
```python
collection_api = api.collection()
# Delete a collection
collection_api.delete_collection(
user="trustgraph",
collection="old-collection"
)
```
"""
input = {
"operation": "delete-collection",
"user": user,
"workspace": self.api.workspace,
"collection": collection,
}
object = self.request(input)
self.request(input)
return {}
return {}

View file

@ -21,14 +21,16 @@ class Config:
and list operations.
"""
def __init__(self, api):
def __init__(self, api, workspace="default"):
"""
Initialize Config client.
Args:
api: Parent Api instance for making requests
workspace: Workspace to scope all config operations to
"""
self.api = api
self.workspace = workspace
def request(self, request):
"""
@ -75,9 +77,9 @@ class Config:
```
"""
# The input consists of system and prompt strings
input = {
"operation": "get",
"workspace": self.workspace,
"keys": [
{ "type": k.type, "key": k.key }
for k in keys
@ -123,9 +125,9 @@ class Config:
```
"""
# The input consists of system and prompt strings
input = {
"operation": "put",
"workspace": self.workspace,
"values": [
{ "type": v.type, "key": v.key, "value": v.value }
for v in values
@ -157,9 +159,9 @@ class Config:
```
"""
# The input consists of system and prompt strings
input = {
"operation": "delete",
"workspace": self.workspace,
"keys": [
{ "type": v.type, "key": v.key }
for v in keys
@ -195,9 +197,9 @@ class Config:
```
"""
# The input consists of system and prompt strings
input = {
"operation": "list",
"workspace": self.workspace,
"type": type,
}
@ -235,9 +237,9 @@ class Config:
```
"""
# The input consists of system and prompt strings
input = {
"operation": "getvalues",
"workspace": self.workspace,
"type": type,
}
@ -255,6 +257,46 @@ class Config:
except:
raise ProtocolException(f"Response not formatted correctly")
def get_values_all_workspaces(self, type):
"""
Get all configuration values of a given type across all workspaces.
Unlike get_values(), this is not scoped to a single workspace;
it returns every entry of the given type in the system. Each
returned ConfigValue includes its workspace field. Used by
shared processors to load type-scoped config at startup.
Args:
type: Configuration type (e.g. "prompt", "schema")
Returns:
list[ConfigValue]: Values across all workspaces; each has
its workspace field populated.
Raises:
ProtocolException: If response format is invalid
"""
input = {
"operation": "getvalues-all-ws",
"type": type,
}
object = self.request(input)
try:
return [
ConfigValue(
type = v["type"],
key = v["key"],
value = v["value"],
workspace = v.get("workspace", ""),
)
for v in object["values"]
]
except Exception:
raise ProtocolException("Response not formatted correctly")
def all(self):
"""
Get complete configuration and version.
@ -279,9 +321,9 @@ class Config:
```
"""
# The input consists of system and prompt strings
input = {
"operation": "config"
"operation": "config",
"workspace": self.workspace,
}
object = self.request(input)

View file

@ -486,7 +486,6 @@ class ExplainabilityClient:
self,
uri: str,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None
) -> Optional[ExplainEntity]:
"""
@ -502,7 +501,6 @@ class ExplainabilityClient:
Args:
uri: The entity URI to fetch
graph: Named graph to query (e.g., "urn:graph:retrieval")
user: User/keyspace identifier
collection: Collection identifier
Returns:
@ -515,7 +513,6 @@ class ExplainabilityClient:
wire_triples = self.flow.triples_query(
s=uri,
g=graph,
user=user,
collection=collection,
limit=100
)
@ -548,7 +545,7 @@ class ExplainabilityClient:
if prev_triples:
# Re-fetch and parse
wire_triples = self.flow.triples_query(
s=uri, g=graph, user=user, collection=collection, limit=100
s=uri, g=graph, collection=collection, limit=100
)
if wire_triples:
triples = wire_triples_to_tuples(wire_triples)
@ -560,7 +557,6 @@ class ExplainabilityClient:
self,
uri: str,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None
) -> Optional[EdgeSelection]:
"""
@ -569,7 +565,6 @@ class ExplainabilityClient:
Args:
uri: The edge selection URI
graph: Named graph to query
user: User/keyspace identifier
collection: Collection identifier
Returns:
@ -578,7 +573,6 @@ class ExplainabilityClient:
wire_triples = self.flow.triples_query(
s=uri,
g=graph,
user=user,
collection=collection,
limit=100
)
@ -593,7 +587,6 @@ class ExplainabilityClient:
self,
uri: str,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None
) -> Optional[Focus]:
"""
@ -602,20 +595,19 @@ class ExplainabilityClient:
Args:
uri: The Focus entity URI
graph: Named graph to query
user: User/keyspace identifier
collection: Collection identifier
Returns:
Focus with populated edge_selections, or None
"""
entity = self.fetch_entity(uri, graph, user, collection)
entity = self.fetch_entity(uri, graph, collection)
if not isinstance(entity, Focus):
return None
# Fetch each edge selection
for edge_uri in entity.selected_edge_uris:
edge_sel = self.fetch_edge_selection(edge_uri, graph, user, collection)
edge_sel = self.fetch_edge_selection(edge_uri, graph, collection)
if edge_sel:
entity.edge_selections.append(edge_sel)
@ -624,7 +616,6 @@ class ExplainabilityClient:
def resolve_label(
self,
uri: str,
user: Optional[str] = None,
collection: Optional[str] = None
) -> str:
"""
@ -632,7 +623,6 @@ class ExplainabilityClient:
Args:
uri: The URI to get label for
user: User/keyspace identifier
collection: Collection identifier
Returns:
@ -647,7 +637,6 @@ class ExplainabilityClient:
wire_triples = self.flow.triples_query(
s=uri,
p=RDFS_LABEL,
user=user,
collection=collection,
limit=1
)
@ -665,7 +654,6 @@ class ExplainabilityClient:
def resolve_edge_labels(
self,
edge: Dict[str, str],
user: Optional[str] = None,
collection: Optional[str] = None
) -> Tuple[str, str, str]:
"""
@ -673,22 +661,20 @@ class ExplainabilityClient:
Args:
edge: Dict with "s", "p", "o" keys
user: User/keyspace identifier
collection: Collection identifier
Returns:
Tuple of (s_label, p_label, o_label)
"""
s_label = self.resolve_label(edge.get("s", ""), user, collection)
p_label = self.resolve_label(edge.get("p", ""), user, collection)
o_label = self.resolve_label(edge.get("o", ""), user, collection)
s_label = self.resolve_label(edge.get("s", ""), collection)
p_label = self.resolve_label(edge.get("p", ""), collection)
o_label = self.resolve_label(edge.get("o", ""), collection)
return (s_label, p_label, o_label)
def fetch_document_content(
self,
document_uri: str,
api: Any,
user: Optional[str] = None,
max_content: int = 10000
) -> str:
"""
@ -697,7 +683,6 @@ class ExplainabilityClient:
Args:
document_uri: The document URI in the librarian
api: TrustGraph Api instance for librarian access
user: User identifier for librarian
max_content: Maximum content length to return
Returns:
@ -712,7 +697,7 @@ class ExplainabilityClient:
for attempt in range(self.max_retries):
try:
library = api.library()
content_bytes = library.get_document_content(user=user, id=doc_id)
content_bytes = library.get_document_content(id=doc_id)
# Decode as text
try:
@ -736,7 +721,6 @@ class ExplainabilityClient:
self,
question_uri: str,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None,
api: Any = None,
max_content: int = 10000
@ -749,7 +733,6 @@ class ExplainabilityClient:
Args:
question_uri: The question entity URI
graph: Named graph (default: urn:graph:retrieval)
user: User/keyspace identifier
collection: Collection identifier
api: TrustGraph Api instance for librarian access (optional)
max_content: Maximum content length for synthesis
@ -769,7 +752,7 @@ class ExplainabilityClient:
}
# Fetch question
question = self.fetch_entity(question_uri, graph, user, collection)
question = self.fetch_entity(question_uri, graph, collection)
if not isinstance(question, Question):
return trace
trace["question"] = question
@ -779,7 +762,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=question_uri,
g=graph,
user=user,
collection=collection,
limit=10
)
@ -790,7 +772,7 @@ class ExplainabilityClient:
for t in grounding_triples
]
for gnd_uri in grounding_uris:
grounding = self.fetch_entity(gnd_uri, graph, user, collection)
grounding = self.fetch_entity(gnd_uri, graph, collection)
if isinstance(grounding, Grounding):
trace["grounding"] = grounding
break
@ -803,7 +785,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=trace["grounding"].uri,
g=graph,
user=user,
collection=collection,
limit=10
)
@ -814,7 +795,7 @@ class ExplainabilityClient:
for t in exploration_triples
]
for exp_uri in exploration_uris:
exploration = self.fetch_entity(exp_uri, graph, user, collection)
exploration = self.fetch_entity(exp_uri, graph, collection)
if isinstance(exploration, Exploration):
trace["exploration"] = exploration
break
@ -827,7 +808,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=trace["exploration"].uri,
g=graph,
user=user,
collection=collection,
limit=10
)
@ -838,7 +818,7 @@ class ExplainabilityClient:
for t in focus_triples
]
for focus_uri in focus_uris:
focus = self.fetch_focus_with_edges(focus_uri, graph, user, collection)
focus = self.fetch_focus_with_edges(focus_uri, graph, collection)
if focus:
trace["focus"] = focus
break
@ -851,7 +831,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=trace["focus"].uri,
g=graph,
user=user,
collection=collection,
limit=10
)
@ -862,7 +841,7 @@ class ExplainabilityClient:
for t in synthesis_triples
]
for synth_uri in synthesis_uris:
synthesis = self.fetch_entity(synth_uri, graph, user, collection)
synthesis = self.fetch_entity(synth_uri, graph, collection)
if isinstance(synthesis, Synthesis):
trace["synthesis"] = synthesis
break
@ -873,7 +852,6 @@ class ExplainabilityClient:
self,
question_uri: str,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None,
api: Any = None,
max_content: int = 10000
@ -887,7 +865,6 @@ class ExplainabilityClient:
Args:
question_uri: The question entity URI
graph: Named graph (default: urn:graph:retrieval)
user: User/keyspace identifier
collection: Collection identifier
api: TrustGraph Api instance for librarian access (optional)
max_content: Maximum content length for synthesis
@ -906,7 +883,7 @@ class ExplainabilityClient:
}
# Fetch question
question = self.fetch_entity(question_uri, graph, user, collection)
question = self.fetch_entity(question_uri, graph, collection)
if not isinstance(question, Question):
return trace
trace["question"] = question
@ -916,7 +893,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=question_uri,
g=graph,
user=user,
collection=collection,
limit=10
)
@ -927,7 +903,7 @@ class ExplainabilityClient:
for t in grounding_triples
]
for gnd_uri in grounding_uris:
grounding = self.fetch_entity(gnd_uri, graph, user, collection)
grounding = self.fetch_entity(gnd_uri, graph, collection)
if isinstance(grounding, Grounding):
trace["grounding"] = grounding
break
@ -940,7 +916,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=trace["grounding"].uri,
g=graph,
user=user,
collection=collection,
limit=10
)
@ -951,7 +926,7 @@ class ExplainabilityClient:
for t in exploration_triples
]
for exp_uri in exploration_uris:
exploration = self.fetch_entity(exp_uri, graph, user, collection)
exploration = self.fetch_entity(exp_uri, graph, collection)
if isinstance(exploration, Exploration):
trace["exploration"] = exploration
break
@ -964,7 +939,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=trace["exploration"].uri,
g=graph,
user=user,
collection=collection,
limit=10
)
@ -975,7 +949,7 @@ class ExplainabilityClient:
for t in synthesis_triples
]
for synth_uri in synthesis_uris:
synthesis = self.fetch_entity(synth_uri, graph, user, collection)
synthesis = self.fetch_entity(synth_uri, graph, collection)
if isinstance(synthesis, Synthesis):
trace["synthesis"] = synthesis
break
@ -986,7 +960,6 @@ class ExplainabilityClient:
self,
session_uri: str,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None,
api: Any = None,
max_content: int = 10000
@ -1002,7 +975,6 @@ class ExplainabilityClient:
Args:
session_uri: The agent session/question URI
graph: Named graph (default: urn:graph:retrieval)
user: User/keyspace identifier
collection: Collection identifier
api: TrustGraph Api instance for librarian access (optional)
max_content: Maximum content length for conclusion
@ -1019,21 +991,21 @@ class ExplainabilityClient:
}
# Fetch question/session
question = self.fetch_entity(session_uri, graph, user, collection)
question = self.fetch_entity(session_uri, graph, collection)
if not isinstance(question, Question):
return trace
trace["question"] = question
# Follow the provenance chain from the question
self._follow_provenance_chain(
session_uri, trace, graph, user, collection,
session_uri, trace, graph, collection,
max_depth=50,
)
return trace
def _follow_provenance_chain(
self, current_uri, trace, graph, user, collection,
self, current_uri, trace, graph, collection,
max_depth=50,
):
"""Recursively follow the provenance chain, handling branches."""
@ -1044,7 +1016,7 @@ class ExplainabilityClient:
derived_triples = self.flow.triples_query(
p=PROV_WAS_DERIVED_FROM,
o=current_uri,
g=graph, user=user, collection=collection,
g=graph, collection=collection,
limit=20
)
@ -1060,7 +1032,7 @@ class ExplainabilityClient:
if not derived_uri:
continue
entity = self.fetch_entity(derived_uri, graph, user, collection)
entity = self.fetch_entity(derived_uri, graph, collection)
if entity is None:
continue
@ -1070,7 +1042,7 @@ class ExplainabilityClient:
# Continue following from this entity
self._follow_provenance_chain(
derived_uri, trace, graph, user, collection,
derived_uri, trace, graph, collection,
max_depth=max_depth - 1,
)
@ -1079,11 +1051,11 @@ class ExplainabilityClient:
# Fetch the full sub-trace and embed it.
if entity.question_type == "graph-rag":
sub_trace = self.fetch_graphrag_trace(
derived_uri, graph, user, collection,
derived_uri, graph, collection,
)
elif entity.question_type == "document-rag":
sub_trace = self.fetch_docrag_trace(
derived_uri, graph, user, collection,
derived_uri, graph, collection,
)
else:
sub_trace = None
@ -1100,7 +1072,7 @@ class ExplainabilityClient:
terminal = sub_trace.get("synthesis")
if terminal:
self._follow_provenance_chain(
terminal.uri, trace, graph, user, collection,
terminal.uri, trace, graph, collection,
max_depth=max_depth - 1,
)
@ -1110,7 +1082,6 @@ class ExplainabilityClient:
def list_sessions(
self,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None,
limit: int = 50
) -> List[Question]:
@ -1119,7 +1090,6 @@ class ExplainabilityClient:
Args:
graph: Named graph (default: urn:graph:retrieval)
user: User/keyspace identifier
collection: Collection identifier
limit: Maximum number of sessions to return
@ -1133,7 +1103,6 @@ class ExplainabilityClient:
query_triples = self.flow.triples_query(
p=TG_QUERY,
g=graph,
user=user,
collection=collection,
limit=limit
)
@ -1142,7 +1111,7 @@ class ExplainabilityClient:
for t in query_triples:
question_uri = extract_term_value(t.get("s", {}))
if question_uri:
entity = self.fetch_entity(question_uri, graph, user, collection)
entity = self.fetch_entity(question_uri, graph, collection)
if isinstance(entity, Question):
questions.append(entity)
@ -1154,7 +1123,6 @@ class ExplainabilityClient:
s=q.uri,
p=PROV_WAS_DERIVED_FROM,
g=graph,
user=user,
collection=collection,
limit=1
)
@ -1170,7 +1138,6 @@ class ExplainabilityClient:
self,
session_uri: str,
graph: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None
) -> str:
"""
@ -1179,7 +1146,6 @@ class ExplainabilityClient:
Args:
session_uri: The session/question URI
graph: Named graph
user: User/keyspace identifier
collection: Collection identifier
Returns:
@ -1201,7 +1167,6 @@ class ExplainabilityClient:
p=PROV_WAS_DERIVED_FROM,
o=session_uri,
g=graph,
user=user,
collection=collection,
limit=5
)
@ -1212,7 +1177,7 @@ class ExplainabilityClient:
]
for child_uri in all_child_uris:
entity = self.fetch_entity(child_uri, graph, user, collection)
entity = self.fetch_entity(child_uri, graph, collection)
if isinstance(entity, (Analysis, Decomposition, Plan)):
return "agent"
if isinstance(entity, Exploration):
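
Review note: in the explainability client, `user` used to sit between `graph` and `collection` (or directly after `uri` in `resolve_label`), so callers that passed these arguments positionally are affected by its removal. The sketch below illustrates the hazard with two hypothetical stand-in functions, `resolve_label_old` and `resolve_label_new`. They are not part of the SDK and only mirror the before/after parameter order shown in this diff.

```python
from typing import Optional


def resolve_label_old(uri: str, user: Optional[str] = None,
                      collection: Optional[str] = None) -> dict:
    """Hypothetical stand-in mirroring the pre-change parameter order."""
    return {"uri": uri, "user": user, "collection": collection}


def resolve_label_new(uri: str, collection: Optional[str] = None) -> dict:
    """Hypothetical stand-in mirroring the post-change parameter order."""
    return {"uri": uri, "collection": collection}


uri = "urn:example:entity"

# A positional call written against the old signature...
old = resolve_label_old(uri, "trustgraph")

# ...still runs against the new one, but the user value lands in collection.
shifted = resolve_label_new(uri, "trustgraph")

# Three positional values are rejected outright by the new signature.
try:
    resolve_label_new(uri, "trustgraph", "default")
    too_many_args_rejected = False
except TypeError:
    too_many_args_rejected = True

# Keyword arguments avoid the ambiguity entirely.
safe = resolve_label_new(uri, collection="default")
```

Callers that already used `collection=` keywords should be unaffected. Two-argument positional calls are the ones worth searching for, since they do not raise.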

@ -115,72 +115,32 @@ class Flow:
return FlowInstance(api=self, id=id)
def list_blueprints(self):
"""
List all available flow blueprints.
"""List blueprints in the current workspace."""
Returns:
list[str]: List of blueprint names
Example:
```python
blueprints = api.flow().list_blueprints()
print(blueprints) # ['default', 'custom-flow', ...]
```
"""
# The input consists of system and prompt strings
input = {
"operation": "list-blueprints",
"workspace": self.api.workspace,
}
return self.request(request = input)["blueprint-names"]
def get_blueprint(self, blueprint_name):
"""
Get a flow blueprint definition by name.
"""Get a flow blueprint definition by name."""
Args:
blueprint_name: Name of the blueprint to retrieve
Returns:
dict: Blueprint definition as a dictionary
Example:
```python
blueprint = api.flow().get_blueprint("default")
print(blueprint) # Blueprint configuration
```
"""
# The input consists of system and prompt strings
input = {
"operation": "get-blueprint",
"workspace": self.api.workspace,
"blueprint-name": blueprint_name,
}
return json.loads(self.request(request = input)["blueprint-definition"])
def put_blueprint(self, blueprint_name, definition):
"""
Create or update a flow blueprint.
"""Create or update a flow blueprint."""
Args:
blueprint_name: Name for the blueprint
definition: Blueprint definition dictionary
Example:
```python
definition = {
"services": ["text-completion", "graph-rag"],
"parameters": {"model": "gpt-4"}
}
api.flow().put_blueprint("my-blueprint", definition)
```
"""
# The input consists of system and prompt strings
input = {
"operation": "put-blueprint",
"workspace": self.api.workspace,
"blueprint-name": blueprint_name,
"blueprint-definition": json.dumps(definition),
}
@ -188,96 +148,43 @@ class Flow:
self.request(request = input)
def delete_blueprint(self, blueprint_name):
"""
Delete a flow blueprint.
"""Delete a flow blueprint."""
Args:
blueprint_name: Name of the blueprint to delete
Example:
```python
api.flow().delete_blueprint("old-blueprint")
```
"""
# The input consists of system and prompt strings
input = {
"operation": "delete-blueprint",
"workspace": self.api.workspace,
"blueprint-name": blueprint_name,
}
self.request(request = input)
def list(self):
"""
List all active flow instances.
"""List flow instances in the current workspace."""
Returns:
list[str]: List of flow instance IDs
Example:
```python
flows = api.flow().list()
print(flows) # ['default', 'flow-1', 'flow-2', ...]
```
"""
# The input consists of system and prompt strings
input = {
"operation": "list-flows",
"workspace": self.api.workspace,
}
return self.request(request = input)["flow-ids"]
def get(self, id):
"""
Get the definition of a running flow instance.
"""Get the definition of a flow instance."""
Args:
id: Flow instance ID
Returns:
dict: Flow instance definition
Example:
```python
flow_def = api.flow().get("default")
print(flow_def)
```
"""
# The input consists of system and prompt strings
input = {
"operation": "get-flow",
"workspace": self.api.workspace,
"flow-id": id,
}
return json.loads(self.request(request = input)["flow"])
def start(self, blueprint_name, id, description, parameters=None):
"""
Start a new flow instance from a blueprint.
"""Start a new flow instance from a blueprint."""
Args:
blueprint_name: Name of the blueprint to instantiate
id: Unique identifier for the flow instance
description: Human-readable description
parameters: Optional parameters dictionary
Example:
```python
api.flow().start(
blueprint_name="default",
id="my-flow",
description="My custom flow",
parameters={"model": "gpt-4"}
)
```
"""
# The input consists of system and prompt strings
input = {
"operation": "start-flow",
"workspace": self.api.workspace,
"flow-id": id,
"blueprint-name": blueprint_name,
"description": description,
@ -289,21 +196,11 @@ class Flow:
self.request(request = input)
def stop(self, id):
"""
Stop a running flow instance.
"""Stop a running flow instance."""
Args:
id: Flow instance ID to stop
Example:
```python
api.flow().stop("my-flow")
```
"""
# The input consists of system and prompt strings
input = {
"operation": "stop-flow",
"workspace": self.api.workspace,
"flow-id": id,
}
@ -349,6 +246,13 @@ class FlowInstance:
Returns:
dict: Service response
"""
# Inject workspace so the gateway can route to the right
# workspace's flow. If already present, keep the caller's value.
if isinstance(request, dict) and "workspace" not in request:
request = {
"workspace": self.api.api.workspace,
**request,
}
return self.api.request(path = f"{self.id}/{path}", request = request)
def text_completion(self, system, prompt):
@ -392,7 +296,7 @@ class FlowInstance:
model=result.get("model"),
)
def agent(self, question, user="trustgraph", state=None, group=None, history=None):
def agent(self, question, state=None, group=None, history=None):
"""
Execute an agent operation with reasoning and tool use capabilities.
@ -401,7 +305,6 @@ class FlowInstance:
Args:
question: User question or instruction
user: User identifier (default: "trustgraph")
state: Optional state dictionary for stateful conversations
group: Optional group identifier for multi-user contexts
history: Optional conversation history as list of message dicts
@ -416,8 +319,7 @@ class FlowInstance:
# Simple question
answer = flow.agent(
question="What is the capital of France?",
user="trustgraph"
)
)
# With conversation history
history = [
@ -425,9 +327,7 @@ class FlowInstance:
{"role": "assistant", "content": "Hi! How can I help?"}
]
answer = flow.agent(
question="Tell me about Paris",
user="trustgraph",
history=history
question="Tell me about Paris", history=history
)
```
"""
@ -435,7 +335,6 @@ class FlowInstance:
# The input consists of a question and optional context
input = {
"question": question,
"user": user,
}
# Only include state if it has a value
@ -455,7 +354,7 @@ class FlowInstance:
)["answer"]
def graph_rag(
self, query, user="trustgraph", collection="default",
self, query, collection="default",
entity_limit=50, triple_limit=30, max_subgraph_size=150,
max_path_length=2, edge_score_limit=30, edge_limit=25,
):
@ -467,7 +366,6 @@ class FlowInstance:
Args:
query: Natural language query
user: User/keyspace identifier (default: "trustgraph")
collection: Collection identifier (default: "default")
entity_limit: Maximum entities to retrieve (default: 50)
triple_limit: Maximum triples per entity (default: 30)
@ -483,9 +381,7 @@ class FlowInstance:
```python
flow = api.flow().id("default")
response = flow.graph_rag(
query="Tell me about Marie Curie's discoveries",
user="trustgraph",
collection="scientists",
query="Tell me about Marie Curie's discoveries", collection="scientists",
entity_limit=20,
max_path_length=3
)
@ -496,7 +392,6 @@ class FlowInstance:
# The input consists of a question
input = {
"query": query,
"user": user,
"collection": collection,
"entity-limit": entity_limit,
"triple-limit": triple_limit,
@ -519,7 +414,7 @@ class FlowInstance:
)
def document_rag(
self, query, user="trustgraph", collection="default",
self, query, collection="default",
doc_limit=10,
):
"""
@ -530,7 +425,6 @@ class FlowInstance:
Args:
query: Natural language query
user: User/keyspace identifier (default: "trustgraph")
collection: Collection identifier (default: "default")
doc_limit: Maximum document chunks to retrieve (default: 10)
@ -541,9 +435,7 @@ class FlowInstance:
```python
flow = api.flow().id("default")
response = flow.document_rag(
query="Summarize the key findings",
user="trustgraph",
collection="research-papers",
query="Summarize the key findings", collection="research-papers",
doc_limit=5
)
print(response)
@ -553,7 +445,6 @@ class FlowInstance:
# The input consists of a question
input = {
"query": query,
"user": user,
"collection": collection,
"doc-limit": doc_limit,
}
@ -600,7 +491,7 @@ class FlowInstance:
input
)["vectors"]
def graph_embeddings_query(self, text, user, collection, limit=10):
def graph_embeddings_query(self, text, collection, limit=10):
"""
Query knowledge graph entities using semantic similarity.
@ -609,7 +500,6 @@ class FlowInstance:
Args:
text: Query text for semantic search
user: User/keyspace identifier
collection: Collection identifier
limit: Maximum number of results (default: 10)
@ -620,9 +510,7 @@ class FlowInstance:
```python
flow = api.flow().id("default")
results = flow.graph_embeddings_query(
text="physicist who discovered radioactivity",
user="trustgraph",
collection="scientists",
text="physicist who discovered radioactivity", collection="scientists",
limit=5
)
# results contains {"entities": [{"entity": {...}, "score": 0.95}, ...]}
@ -636,7 +524,6 @@ class FlowInstance:
# Query graph embeddings for semantic search
input = {
"vector": vector,
"user": user,
"collection": collection,
"limit": limit
}
@ -646,7 +533,7 @@ class FlowInstance:
input
)
def document_embeddings_query(self, text, user, collection, limit=10):
def document_embeddings_query(self, text, collection, limit=10):
"""
Query document chunks using semantic similarity.
@ -655,7 +542,6 @@ class FlowInstance:
Args:
text: Query text for semantic search
user: User/keyspace identifier
collection: Collection identifier
limit: Maximum number of results (default: 10)
@ -666,9 +552,7 @@ class FlowInstance:
```python
flow = api.flow().id("default")
results = flow.document_embeddings_query(
text="machine learning algorithms",
user="trustgraph",
collection="research-papers",
text="machine learning algorithms", collection="research-papers",
limit=5
)
# results contains {"chunks": [{"chunk_id": "doc1/p0/c0", "score": 0.95}, ...]}
@ -682,7 +566,6 @@ class FlowInstance:
# Query document embeddings for semantic search
input = {
"vector": vector,
"user": user,
"collection": collection,
"limit": limit
}
@ -805,7 +688,7 @@ class FlowInstance:
def triples_query(
self, s=None, p=None, o=None,
user=None, collection=None, limit=10000
collection=None, limit=10000
):
"""
Query knowledge graph triples using pattern matching.
@ -817,7 +700,6 @@ class FlowInstance:
s: Subject URI (optional, use None for wildcard)
p: Predicate URI (optional, use None for wildcard)
o: Object URI or Literal (optional, use None for wildcard)
user: User/keyspace identifier (optional)
collection: Collection identifier (optional)
limit: Maximum results to return (default: 10000)
@ -835,9 +717,7 @@ class FlowInstance:
# Find all triples about a specific subject
triples = flow.triples_query(
s=Uri("http://example.org/person/marie-curie"),
user="trustgraph",
collection="scientists"
s=Uri("http://example.org/person/marie-curie"), collection="scientists"
)
# Find all instances of a specific relationship
@ -851,10 +731,6 @@ class FlowInstance:
input = {
"limit": limit
}
if user:
input["user"] = user
if collection:
input["collection"] = collection
@ -888,7 +764,7 @@ class FlowInstance:
]
def load_document(
self, document, id=None, metadata=None, user=None,
self, document, id=None, metadata=None,
collection=None,
):
"""
@ -901,7 +777,6 @@ class FlowInstance:
document: Document content as bytes
id: Optional document identifier (auto-generated if None)
metadata: Optional metadata (list of Triples or object with emit method)
user: User/keyspace identifier (optional)
collection: Collection identifier (optional)
Returns:
@ -918,9 +793,7 @@ class FlowInstance:
with open("research.pdf", "rb") as f:
result = flow.load_document(
document=f.read(),
id="research-001",
user="trustgraph",
collection="papers"
id="research-001", collection="papers"
)
```
"""
@ -955,10 +828,6 @@ class FlowInstance:
"metadata": triples,
"data": base64.b64encode(document).decode("utf-8"),
}
if user:
input["user"] = user
if collection:
input["collection"] = collection
@ -969,7 +838,7 @@ class FlowInstance:
def load_text(
self, text, id=None, metadata=None, charset="utf-8",
user=None, collection=None,
collection=None,
):
"""
Load text content for processing.
@ -982,7 +851,6 @@ class FlowInstance:
id: Optional document identifier (auto-generated if None)
metadata: Optional metadata (list of Triples or object with emit method)
charset: Character encoding (default: "utf-8")
user: User/keyspace identifier (optional)
collection: Collection identifier (optional)
Returns:
@ -1000,9 +868,7 @@ class FlowInstance:
result = flow.load_text(
text=text_content,
id="text-001",
charset="utf-8",
user="trustgraph",
collection="documents"
charset="utf-8", collection="documents"
)
```
"""
@ -1035,10 +901,6 @@ class FlowInstance:
"charset": charset,
"text": base64.b64encode(text).decode("utf-8"),
}
if user:
input["user"] = user
if collection:
input["collection"] = collection
@ -1048,7 +910,7 @@ class FlowInstance:
)
def rows_query(
self, query, user="trustgraph", collection="default",
self, query, collection="default",
variables=None, operation_name=None
):
"""
@ -1059,7 +921,6 @@ class FlowInstance:
Args:
query: GraphQL query string
user: User/keyspace identifier (default: "trustgraph")
collection: Collection identifier (default: "default")
variables: Optional query variables dictionary
operation_name: Optional operation name for multi-operation documents
@ -1085,9 +946,7 @@ class FlowInstance:
}
'''
result = flow.rows_query(
query=query,
user="trustgraph",
collection="scientists"
query=query, collection="scientists"
)
# Query with variables
@ -1109,7 +968,6 @@ class FlowInstance:
# The input consists of a GraphQL query and optional variables
input = {
"query": query,
"user": user,
"collection": collection,
}
@ -1145,7 +1003,7 @@ class FlowInstance:
return result
def sparql_query(
self, query, user="trustgraph", collection="default",
self, query, collection="default",
limit=10000
):
"""
@ -1153,7 +1011,6 @@ class FlowInstance:
Args:
query: SPARQL 1.1 query string
user: User/keyspace identifier (default: "trustgraph")
collection: Collection identifier (default: "default")
limit: Safety limit on results (default: 10000)
@ -1169,7 +1026,6 @@ class FlowInstance:
input = {
"query": query,
"user": user,
"collection": collection,
"limit": limit,
}
@ -1213,14 +1069,13 @@ class FlowInstance:
return response
def structured_query(self, question, user="trustgraph", collection="default"):
def structured_query(self, question, collection="default"):
"""
Execute a natural language question against structured data.
Combines NLP query conversion and GraphQL execution.
Args:
question: Natural language question
user: Cassandra keyspace identifier (default: "trustgraph")
collection: Data collection identifier (default: "default")
Returns:
@ -1229,7 +1084,6 @@ class FlowInstance:
input = {
"question": question,
"user": user,
"collection": collection
}
@ -1383,7 +1237,7 @@ class FlowInstance:
return response["schema-matches"]
def row_embeddings_query(
self, text, schema_name, user="trustgraph", collection="default",
self, text, schema_name, collection="default",
index_name=None, limit=10
):
"""
@ -1396,7 +1250,6 @@ class FlowInstance:
Args:
text: Query text for semantic search
schema_name: Schema name to search within
user: User/keyspace identifier (default: "trustgraph")
collection: Collection identifier (default: "default")
index_name: Optional index name to filter search to specific index
limit: Maximum number of results (default: 10)
@ -1412,9 +1265,7 @@ class FlowInstance:
# Search for customers by name similarity
results = flow.row_embeddings_query(
text="John Smith",
schema_name="customers",
user="trustgraph",
collection="sales",
schema_name="customers", collection="sales",
limit=5
)
@ -1436,7 +1287,6 @@ class FlowInstance:
input = {
"vector": vector,
"schema_name": schema_name,
"user": user,
"collection": collection,
"limit": limit
}
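
Review note: the workspace injection added to `FlowInstance.request` in this file can be sketched in isolation. `inject_workspace` below is a hypothetical helper, not an SDK function. It assumes the same rule as the diff: only dict requests are touched, and a `workspace` key supplied by the caller is kept.

```python
from typing import Any


def inject_workspace(request: Any, workspace: str) -> Any:
    """Return request with a workspace key added when it is a dict lacking one."""
    if isinstance(request, dict) and "workspace" not in request:
        # Build a new dict so the caller's object is not mutated.
        return {"workspace": workspace, **request}
    return request


original = {"query": "Tell me about Marie Curie", "collection": "scientists"}

# No workspace in the request: the client's workspace is added.
injected = inject_workspace(original, "acme")

# Workspace already present: the caller's value is kept.
explicit = inject_workspace({"workspace": "other", "query": "q"}, "acme")

# Non-dict requests pass through untouched.
passthrough = inject_workspace(None, "acme")
```

The check is on key presence, so a request carrying `"workspace": None` would also be forwarded unchanged under this rule.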

@ -63,105 +63,50 @@ class Knowledge:
"""
return self.api.request(f"knowledge", request)
def list_kg_cores(self, user="trustgraph"):
def list_kg_cores(self):
"""
List all available knowledge graph cores.
Retrieves the IDs of all KG cores available for the specified user.
Args:
user: User identifier (default: "trustgraph")
List all available knowledge graph cores in this workspace.
Returns:
list[str]: List of KG core identifiers
Example:
```python
knowledge = api.knowledge()
# List available KG cores
cores = knowledge.list_kg_cores(user="trustgraph")
print(f"Available KG cores: {cores}")
```
"""
# The input consists of system and prompt strings
input = {
"operation": "list-kg-cores",
"user": user,
"workspace": self.api.workspace,
}
return self.request(request = input)["ids"]
def delete_kg_core(self, id, user="trustgraph"):
def delete_kg_core(self, id):
"""
Delete a knowledge graph core.
Removes a KG core from storage. This does not affect currently loaded
cores in flows.
Delete a knowledge graph core in this workspace.
Args:
id: KG core identifier to delete
user: User identifier (default: "trustgraph")
Example:
```python
knowledge = api.knowledge()
# Delete a KG core
knowledge.delete_kg_core(id="medical-kb-v1", user="trustgraph")
```
"""
# The input consists of system and prompt strings
input = {
"operation": "delete-kg-core",
"user": user,
"workspace": self.api.workspace,
"id": id,
}
self.request(request = input)
def load_kg_core(self, id, user="trustgraph", flow="default",
collection="default"):
def load_kg_core(self, id, flow="default", collection="default"):
"""
Load a knowledge graph core into a flow.
Makes a KG core available for use in queries and RAG operations within
the specified flow and collection.
Args:
id: KG core identifier to load
user: User identifier (default: "trustgraph")
flow: Flow instance to load into (default: "default")
collection: Collection to associate with (default: "default")
Example:
```python
knowledge = api.knowledge()
# Load a medical knowledge base into the default flow
knowledge.load_kg_core(
id="medical-kb-v1",
user="trustgraph",
flow="default",
collection="medical"
)
# Now the flow can use this KG core for RAG queries
flow = api.flow().id("default")
response = flow.graph_rag(
query="What are the symptoms of diabetes?",
user="trustgraph",
collection="medical"
)
```
"""
# The input consists of system and prompt strings
input = {
"operation": "load-kg-core",
"user": user,
"workspace": self.api.workspace,
"id": id,
"flow": flow,
"collection": collection,
@ -169,35 +114,18 @@ class Knowledge:
self.request(request = input)
def unload_kg_core(self, id, user="trustgraph", flow="default"):
def unload_kg_core(self, id, flow="default"):
"""
Unload a knowledge graph core from a flow.
Removes a KG core from active use in the specified flow, freeing
resources while keeping the core available in storage.
Args:
id: KG core identifier to unload
user: User identifier (default: "trustgraph")
flow: Flow instance to unload from (default: "default")
Example:
```python
knowledge = api.knowledge()
# Unload a KG core when no longer needed
knowledge.unload_kg_core(
id="medical-kb-v1",
user="trustgraph",
flow="default"
)
```
"""
# The input consists of system and prompt strings
input = {
"operation": "unload-kg-core",
"user": user,
"workspace": self.api.workspace,
"id": id,
"flow": flow,
}
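
Review note: for the knowledge operations, the request payload keeps its shape and only swaps the caller-supplied `user` key for a `workspace` key taken from the API client. A minimal sketch of the resulting `load-kg-core` payload follows. `build_load_kg_core_request` is a hypothetical function written for illustration; the key names are the ones visible in this diff.

```python
def build_load_kg_core_request(workspace: str, core_id: str,
                               flow: str = "default",
                               collection: str = "default") -> dict:
    """Assemble a load-kg-core payload scoped by workspace rather than user."""
    return {
        "operation": "load-kg-core",
        "workspace": workspace,  # from the client, not a method argument
        "id": core_id,
        "flow": flow,
        "collection": collection,
    }


payload = build_load_kg_core_request("acme", "medical-kb-v1",
                                     collection="medical")
```

Since the workspace comes from the client object, switching workspaces would mean constructing a client for the other workspace rather than passing a different argument per call.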

@ -94,7 +94,7 @@ class Library:
return self.api.request(f"librarian", request)
def add_document(
self, document, id, metadata, user, title, comments,
self, document, id, metadata, title, comments,
kind="text/plain", tags=[], on_progress=None,
):
"""
@ -108,7 +108,6 @@ class Library:
document: Document content as bytes
id: Document identifier (auto-generated if None)
metadata: Document metadata as list of Triple objects or object with emit method
user: User/owner identifier
title: Document title
comments: Document description or comments
kind: MIME type of the document (default: "text/plain")
@ -131,7 +130,6 @@ class Library:
document=f.read(),
id="research-001",
metadata=[],
user="trustgraph",
title="Research Paper",
comments="Key findings in quantum computing",
kind="application/pdf",
@ -147,7 +145,6 @@ class Library:
document=f.read(),
id="large-doc-001",
metadata=[],
user="trustgraph",
title="Large Document",
comments="A very large document",
kind="application/pdf",
@ -176,7 +173,6 @@ class Library:
document=document,
id=id,
metadata=metadata,
user=user,
title=title,
comments=comments,
kind=kind,
@ -213,6 +209,7 @@ class Library:
input = {
"operation": "add-document",
"workspace": self.api.workspace,
"document-metadata": {
"id": id,
"time": int(time.time()),
@ -220,7 +217,7 @@ class Library:
"title": title,
"comments": comments,
"metadata": triples,
"user": user,
"workspace": self.api.workspace,
"tags": tags
},
"content": base64.b64encode(document).decode("utf-8"),
@ -229,7 +226,7 @@ class Library:
return self.request(input)
def _add_document_chunked(
self, document, id, metadata, user, title, comments,
self, document, id, metadata, title, comments,
kind, tags, on_progress=None,
):
"""
@ -245,13 +242,14 @@ class Library:
# Begin upload session
begin_request = {
"operation": "begin-upload",
"workspace": self.api.workspace,
"document-metadata": {
"id": id,
"time": int(time.time()),
"kind": kind,
"title": title,
"comments": comments,
"user": user,
"workspace": self.api.workspace,
"tags": tags,
},
"total-size": total_size,
@ -279,10 +277,10 @@ class Library:
chunk_request = {
"operation": "upload-chunk",
"workspace": self.api.workspace,
"upload-id": upload_id,
"chunk-index": chunk_index,
"content": base64.b64encode(chunk_data).decode("utf-8"),
"user": user,
}
chunk_response = self.request(chunk_request)
@ -298,8 +296,8 @@ class Library:
# Complete upload
complete_request = {
"operation": "complete-upload",
"workspace": self.api.workspace,
"upload-id": upload_id,
"user": user,
}
complete_response = self.request(complete_request)
@ -314,8 +312,8 @@ class Library:
try:
abort_request = {
"operation": "abort-upload",
"workspace": self.api.workspace,
"upload-id": upload_id,
"user": user,
}
self.request(abort_request)
logger.info(f"Aborted failed upload {upload_id}")
@ -323,15 +321,13 @@ class Library:
logger.warning(f"Failed to abort upload: {abort_error}")
raise
def get_documents(self, user, include_children=False):
def get_documents(self, include_children=False):
"""
List all documents for a user.
List all documents in the current workspace.
Retrieves metadata for all documents owned by the specified user.
By default, only returns top-level documents (not child/extracted documents).
Args:
user: User identifier
include_children: If True, also include child documents (default: False)
Returns:
@ -345,7 +341,7 @@ class Library:
library = api.library()
# Get only top-level documents
docs = library.get_documents(user="trustgraph")
docs = library.get_documents()
for doc in docs:
print(f"{doc.id}: {doc.title} ({doc.kind})")
@ -353,13 +349,13 @@ class Library:
print(f" Tags: {', '.join(doc.tags)}")
# Get all documents including extracted pages
all_docs = library.get_documents(user="trustgraph", include_children=True)
all_docs = library.get_documents(include_children=True)
```
"""
input = {
"operation": "list-documents",
"user": user,
"workspace": self.api.workspace,
"include-children": include_children,
}
@ -381,7 +377,7 @@ class Library:
)
for w in v["metadata"]
],
user = v["user"],
workspace = v.get("workspace", ""),
tags = v["tags"],
parent_id = v.get("parent-id", ""),
document_type = v.get("document-type", "source"),
@ -392,14 +388,13 @@ class Library:
logger.error("Failed to parse document list response", exc_info=True)
raise ProtocolException("Response not formatted correctly")
def get_document(self, user, id):
def get_document(self, id):
"""
Get metadata for a specific document.
Retrieves the metadata for a single document by ID.
Args:
user: User identifier
id: Document identifier
Returns:
@ -411,7 +406,7 @@ class Library:
Example:
```python
library = api.library()
doc = library.get_document(user="trustgraph", id="doc-123")
doc = library.get_document(id="doc-123")
print(f"Title: {doc.title}")
print(f"Comments: {doc.comments}")
```
@ -419,7 +414,7 @@ class Library:
input = {
"operation": "get-document",
"user": user,
"workspace": self.api.workspace,
"document-id": id,
}
@ -441,7 +436,7 @@ class Library:
)
for w in doc["metadata"]
],
user = doc["user"],
workspace = doc.get("workspace", ""),
tags = doc["tags"],
parent_id = doc.get("parent-id", ""),
document_type = doc.get("document-type", "source"),
@ -450,14 +445,13 @@ class Library:
logger.error("Failed to parse document response", exc_info=True)
raise ProtocolException("Response not formatted correctly")
def update_document(self, user, id, metadata):
def update_document(self, id, metadata):
"""
Update document metadata.
Updates the metadata for an existing document in the library.
Args:
user: User identifier
id: Document identifier
metadata: Updated DocumentMetadata object
@ -472,7 +466,7 @@ class Library:
library = api.library()
# Get existing document
doc = library.get_document(user="trustgraph", id="doc-123")
doc = library.get_document(id="doc-123")
# Update metadata
doc.title = "Updated Title"
@ -481,7 +475,6 @@ class Library:
# Save changes
updated_doc = library.update_document(
user="trustgraph",
id="doc-123",
metadata=doc
)
@ -490,8 +483,9 @@ class Library:
input = {
"operation": "update-document",
"workspace": self.api.workspace,
"document-metadata": {
"user": user,
"workspace": self.api.workspace,
"document-id": id,
"time": metadata.time,
"title": metadata.title,
@ -526,21 +520,20 @@ class Library:
)
for w in doc["metadata"]
],
user = doc["user"],
workspace = doc.get("workspace", ""),
tags = doc["tags"]
)
except Exception as e:
logger.error("Failed to parse document update response", exc_info=True)
raise ProtocolException("Response not formatted correctly")
def remove_document(self, user, id):
def remove_document(self, id):
"""
Remove a document from the library.
Deletes a document and its metadata from the library.
Args:
user: User identifier
id: Document identifier to remove
Returns:
@ -549,13 +542,13 @@ class Library:
Example:
```python
library = api.library()
library.remove_document(user="trustgraph", id="doc-123")
library.remove_document(id="doc-123")
```
"""
input = {
"operation": "remove-document",
"user": user,
"workspace": self.api.workspace,
"document-id": id,
}
@ -565,7 +558,7 @@ class Library:
def start_processing(
self, id, document_id, flow="default",
user="trustgraph", collection="default", tags=[],
collection="default", tags=[],
):
"""
Start a document processing workflow.
@ -577,7 +570,6 @@ class Library:
id: Unique processing job identifier
document_id: ID of the document to process
flow: Flow instance to use for processing (default: "default")
user: User identifier (default: "trustgraph")
collection: Target collection for processed data (default: "default")
tags: List of tags for the processing job (default: [])
@ -593,7 +585,6 @@ class Library:
id="proc-001",
document_id="doc-123",
flow="default",
user="trustgraph",
collection="research",
tags=["automated", "extract"]
)
@ -602,12 +593,13 @@ class Library:
input = {
"operation": "add-processing",
"workspace": self.api.workspace,
"processing-metadata": {
"id": id,
"document-id": document_id,
"time": int(time.time()),
"flow": flow,
"user": user,
"workspace": self.api.workspace,
"collection": collection,
"tags": tags,
}
@ -618,7 +610,7 @@ class Library:
return {}
def stop_processing(
self, id, user="trustgraph",
self, id,
):
"""
Stop a running document processing job.
@ -627,7 +619,6 @@ class Library:
Args:
id: Processing job identifier to stop
user: User identifier (default: "trustgraph")
Returns:
dict: Empty response object
@ -635,29 +626,26 @@ class Library:
Example:
```python
library = api.library()
library.stop_processing(id="proc-001", user="trustgraph")
library.stop_processing(id="proc-001")
```
"""
input = {
"operation": "remove-processing",
"workspace": self.api.workspace,
"processing-id": id,
"user": user,
}
object = self.request(input)
return {}
def get_processings(self, user="trustgraph"):
def get_processings(self):
"""
List all active document processing jobs.
Retrieves metadata for all currently running document processing workflows
for the specified user.
Args:
user: User identifier (default: "trustgraph")
in the current workspace.
Returns:
list[ProcessingMetadata]: List of processing job metadata objects
@ -668,7 +656,7 @@ class Library:
Example:
```python
library = api.library()
jobs = library.get_processings(user="trustgraph")
jobs = library.get_processings()
for job in jobs:
print(f"Job {job.id}:")
@ -681,7 +669,7 @@ class Library:
input = {
"operation": "list-processing",
"user": user,
"workspace": self.api.workspace,
}
object = self.request(input)
@ -693,7 +681,7 @@ class Library:
document_id = v["document-id"],
time = datetime.datetime.fromtimestamp(v["time"]),
flow = v["flow"],
user = v["user"],
workspace = v.get("workspace", ""),
collection = v["collection"],
tags = v["tags"],
)
@ -705,23 +693,20 @@ class Library:
# Chunked upload management methods
def get_pending_uploads(self, user):
def get_pending_uploads(self):
"""
List all pending (in-progress) uploads for a user.
List all pending (in-progress) uploads in the current workspace.
Retrieves information about chunked uploads that have been started
but not yet completed.
Args:
user: User identifier
Returns:
list[dict]: List of pending upload information
Example:
```python
library = api.library()
pending = library.get_pending_uploads(user="trustgraph")
pending = library.get_pending_uploads()
for upload in pending:
print(f"Upload {upload['upload_id']}:")
@ -731,14 +716,14 @@ class Library:
"""
input = {
"operation": "list-uploads",
"user": user,
"workspace": self.api.workspace,
}
response = self.request(input)
return response.get("upload-sessions", [])
def get_upload_status(self, upload_id, user):
def get_upload_status(self, upload_id):
"""
Get the status of a specific upload.
@ -747,7 +732,6 @@ class Library:
Args:
upload_id: Upload session identifier
user: User identifier
Returns:
dict: Upload status information including:
@ -763,10 +747,7 @@ class Library:
Example:
```python
library = api.library()
status = library.get_upload_status(
upload_id="abc-123",
user="trustgraph"
)
status = library.get_upload_status(upload_id="abc-123")
if status['state'] == 'in-progress':
print(f"Missing chunks: {status['missing_chunks']}")
@ -774,13 +755,13 @@ class Library:
"""
input = {
"operation": "get-upload-status",
"workspace": self.api.workspace,
"upload-id": upload_id,
"user": user,
}
return self.request(input)
def abort_upload(self, upload_id, user):
def abort_upload(self, upload_id):
"""
Abort an in-progress upload.
@ -788,7 +769,6 @@ class Library:
Args:
upload_id: Upload session identifier
user: User identifier
Returns:
dict: Empty response on success
@ -796,18 +776,18 @@ class Library:
Example:
```python
library = api.library()
library.abort_upload(upload_id="abc-123", user="trustgraph")
library.abort_upload(upload_id="abc-123")
```
"""
input = {
"operation": "abort-upload",
"workspace": self.api.workspace,
"upload-id": upload_id,
"user": user,
}
return self.request(input)
def resume_upload(self, upload_id, document, user, on_progress=None):
def resume_upload(self, upload_id, document, on_progress=None):
"""
Resume an interrupted upload.
@ -817,7 +797,6 @@ class Library:
Args:
upload_id: Upload session identifier to resume
document: Complete document content as bytes
user: User identifier
on_progress: Optional callback(bytes_sent, total_bytes) for progress updates
Returns:
@ -828,23 +807,19 @@ class Library:
library = api.library()
# Check what's missing
status = library.get_upload_status(
upload_id="abc-123",
user="trustgraph"
)
status = library.get_upload_status(upload_id="abc-123")
if status['state'] == 'in-progress':
# Resume with the same document
with open("large_document.pdf", "rb") as f:
library.resume_upload(
upload_id="abc-123",
document=f.read(),
user="trustgraph"
document=f.read()
)
```
"""
# Get current status
status = self.get_upload_status(upload_id, user)
status = self.get_upload_status(upload_id)
if status.get("upload-state") == "expired":
raise RuntimeError("Upload session has expired, please start a new upload")
@ -867,10 +842,10 @@ class Library:
chunk_request = {
"operation": "upload-chunk",
"workspace": self.api.workspace,
"upload-id": upload_id,
"chunk-index": chunk_index,
"content": base64.b64encode(chunk_data).decode("utf-8"),
"user": user,
}
self.request(chunk_request)
@ -886,8 +861,8 @@ class Library:
# Complete upload
complete_request = {
"operation": "complete-upload",
"workspace": self.api.workspace,
"upload-id": upload_id,
"user": user,
}
return self.request(complete_request)
@ -895,7 +870,7 @@ class Library:
# Child document methods
def add_child_document(
self, document, id, parent_id, user, title, comments,
self, document, id, parent_id, title, comments,
kind="text/plain", tags=[], metadata=None,
):
"""
@ -909,7 +884,6 @@ class Library:
document: Document content as bytes
id: Document identifier (auto-generated if None)
parent_id: Parent document identifier (required)
user: User/owner identifier
title: Document title
comments: Document description or comments
kind: MIME type of the document (default: "text/plain")
@ -931,7 +905,6 @@ class Library:
document=page_text.encode('utf-8'),
id="doc-123-page-1",
parent_id="doc-123",
user="trustgraph",
title="Page 1 of Research Paper",
comments="First page extracted from PDF",
kind="text/plain",
@ -964,6 +937,7 @@ class Library:
input = {
"operation": "add-child-document",
"workspace": self.api.workspace,
"document-metadata": {
"id": id,
"time": int(time.time()),
@ -971,7 +945,7 @@ class Library:
"title": title,
"comments": comments,
"metadata": triples,
"user": user,
"workspace": self.api.workspace,
"tags": tags,
"parent-id": parent_id,
"document-type": "extracted",
@ -981,13 +955,12 @@ class Library:
return self.request(input)
def list_children(self, document_id, user):
def list_children(self, document_id):
"""
List all child documents for a given parent document.
Args:
document_id: Parent document identifier
user: User identifier
Returns:
list[DocumentMetadata]: List of child document metadata objects
@ -995,10 +968,7 @@ class Library:
Example:
```python
library = api.library()
children = library.list_children(
document_id="doc-123",
user="trustgraph"
)
children = library.list_children(document_id="doc-123")
for child in children:
print(f"{child.id}: {child.title}")
@ -1006,8 +976,8 @@ class Library:
"""
input = {
"operation": "list-children",
"workspace": self.api.workspace,
"document-id": document_id,
"user": user,
}
response = self.request(input)
@ -1028,7 +998,7 @@ class Library:
)
for w in v.get("metadata", [])
],
user=v["user"],
workspace=v.get("workspace", ""),
tags=v.get("tags", []),
parent_id=v.get("parent-id", ""),
document_type=v.get("document-type", "source"),
@ -1039,14 +1009,13 @@ class Library:
logger.error("Failed to parse children response", exc_info=True)
raise ProtocolException("Response not formatted correctly")
def get_document_content(self, user, id):
def get_document_content(self, id):
"""
Get the content of a document.
Retrieves the full content of a document as bytes.
Args:
user: User identifier
id: Document identifier
Returns:
@ -1055,10 +1024,7 @@ class Library:
Example:
```python
library = api.library()
content = library.get_document_content(
user="trustgraph",
id="doc-123"
)
content = library.get_document_content(id="doc-123")
# Write to file
with open("output.pdf", "wb") as f:
@ -1067,7 +1033,7 @@ class Library:
"""
input = {
"operation": "get-document-content",
"user": user,
"workspace": self.api.workspace,
"document-id": id,
}
@ -1076,7 +1042,7 @@ class Library:
return base64.b64decode(content_b64)
def stream_document_to_file(self, user, id, file_path, chunk_size=1024*1024, on_progress=None):
def stream_document_to_file(self, id, file_path, chunk_size=1024*1024, on_progress=None):
"""
Stream document content to a file.
@ -1084,7 +1050,6 @@ class Library:
enabling memory-efficient handling of large documents.
Args:
user: User identifier
id: Document identifier
file_path: Path to write the document content
chunk_size: Size of each chunk to download (default 1MB)
@ -1101,7 +1066,6 @@ class Library:
print(f"Downloaded {received}/{total} bytes")
library.stream_document_to_file(
user="trustgraph",
id="large-doc-123",
file_path="/tmp/document.pdf",
on_progress=progress
@ -1116,7 +1080,7 @@ class Library:
while True:
input = {
"operation": "stream-document",
"user": user,
"workspace": self.api.workspace,
"document-id": id,
"chunk-index": chunk_index,
"chunk-size": chunk_size,

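The Library hunks above all follow one pattern: methods stop accepting `user`, and each request dict takes `workspace` from the client object instead. The sketch below illustrates that shape only. `FakeApi` and `LibrarySketch` are hypothetical stand-ins rather than the SDK classes, and `request` records payloads instead of sending them.

```python
class FakeApi:
    """Hypothetical stand-in for the SDK client; it only carries a workspace."""

    def __init__(self, workspace="default"):
        self.workspace = workspace


class LibrarySketch:
    """Illustrative only: mirrors two request shapes from the diff."""

    def __init__(self, api):
        self.api = api
        self.sent = []  # payloads are recorded here rather than sent

    def request(self, payload):
        self.sent.append(payload)
        return {}

    def remove_document(self, id):
        # No `user` argument: the workspace comes from the client object.
        return self.request({
            "operation": "remove-document",
            "workspace": self.api.workspace,
            "document-id": id,
        })

    def get_upload_status(self, upload_id):
        return self.request({
            "operation": "get-upload-status",
            "workspace": self.api.workspace,
            "upload-id": upload_id,
        })


library = LibrarySketch(FakeApi(workspace="research"))
library.remove_document(id="doc-123")
library.get_upload_status(upload_id="abc-123")
```

Because the workspace is bound once on the client, call sites cannot mix tenants by passing a different identifier per call, which is the property the commit message describes.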

@ -84,10 +84,14 @@ class SocketClient:
for streaming responses.
"""
def __init__(self, url: str, timeout: int, token: Optional[str]) -> None:
def __init__(
self, url: str, timeout: int, token: Optional[str],
workspace: str = "default",
) -> None:
self.url: str = self._convert_to_ws_url(url)
self.timeout: int = timeout
self.token: Optional[str] = token
self.workspace: str = workspace
self._request_counter: int = 0
self._lock: Lock = Lock()
self._loop: Optional[asyncio.AbstractEventLoop] = None
@ -251,6 +255,7 @@ class SocketClient:
try:
message = {
"id": request_id,
"workspace": self.workspace,
"service": service,
"request": request
}
@ -290,6 +295,7 @@ class SocketClient:
try:
message = {
"id": request_id,
"workspace": self.workspace,
"service": service,
"request": request
}
@ -328,6 +334,7 @@ class SocketClient:
try:
message = {
"id": request_id,
"workspace": self.workspace,
"service": service,
"request": request
}
@ -488,7 +495,6 @@ class SocketFlowInstance:
def agent(
self,
question: str,
user: str,
state: Optional[Dict[str, Any]] = None,
group: Optional[str] = None,
history: Optional[List[Dict[str, Any]]] = None,
@ -498,7 +504,6 @@ class SocketFlowInstance:
"""Execute an agent operation with streaming support."""
request = {
"question": question,
"user": user,
"streaming": streaming
}
if state is not None:
@ -514,7 +519,6 @@ class SocketFlowInstance:
def agent_explain(
self,
question: str,
user: str,
collection: str,
state: Optional[Dict[str, Any]] = None,
group: Optional[str] = None,
@ -524,7 +528,6 @@ class SocketFlowInstance:
"""Execute an agent operation with explainability support."""
request = {
"question": question,
"user": user,
"collection": collection,
"streaming": True
}
@ -574,7 +577,6 @@ class SocketFlowInstance:
def graph_rag(
self,
query: str,
user: str,
collection: str,
entity_limit: int = 50,
triple_limit: int = 30,
@ -592,7 +594,6 @@ class SocketFlowInstance:
"""
request = {
"query": query,
"user": user,
"collection": collection,
"entity-limit": entity_limit,
"triple-limit": triple_limit,
@ -619,7 +620,6 @@ class SocketFlowInstance:
def graph_rag_explain(
self,
query: str,
user: str,
collection: str,
entity_limit: int = 50,
triple_limit: int = 30,
@ -632,7 +632,6 @@ class SocketFlowInstance:
"""Execute graph-based RAG query with explainability support."""
request = {
"query": query,
"user": user,
"collection": collection,
"entity-limit": entity_limit,
"triple-limit": triple_limit,
@ -653,7 +652,6 @@ class SocketFlowInstance:
def document_rag(
self,
query: str,
user: str,
collection: str,
doc_limit: int = 10,
streaming: bool = False,
@ -666,7 +664,6 @@ class SocketFlowInstance:
"""
request = {
"query": query,
"user": user,
"collection": collection,
"doc-limit": doc_limit,
"streaming": streaming
@ -688,7 +685,6 @@ class SocketFlowInstance:
def document_rag_explain(
self,
query: str,
user: str,
collection: str,
doc_limit: int = 10,
**kwargs: Any
@ -696,7 +692,6 @@ class SocketFlowInstance:
"""Execute document-based RAG query with explainability support."""
request = {
"query": query,
"user": user,
"collection": collection,
"doc-limit": doc_limit,
"streaming": True,
@ -748,7 +743,6 @@ class SocketFlowInstance:
def graph_embeddings_query(
self,
text: str,
user: str,
collection: str,
limit: int = 10,
**kwargs: Any
@ -759,7 +753,6 @@ class SocketFlowInstance:
request = {
"vector": vector,
"user": user,
"collection": collection,
"limit": limit
}
@ -770,7 +763,6 @@ class SocketFlowInstance:
def document_embeddings_query(
self,
text: str,
user: str,
collection: str,
limit: int = 10,
**kwargs: Any
@ -781,7 +773,6 @@ class SocketFlowInstance:
request = {
"vector": vector,
"user": user,
"collection": collection,
"limit": limit
}
@ -802,7 +793,6 @@ class SocketFlowInstance:
p: Optional[Union[str, Dict[str, Any]]] = None,
o: Optional[Union[str, Dict[str, Any]]] = None,
g: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None,
limit: int = 100,
**kwargs: Any
@ -822,8 +812,6 @@ class SocketFlowInstance:
request["o"] = o_term
if g is not None:
request["g"] = g
if user is not None:
request["user"] = user
if collection is not None:
request["collection"] = collection
request.update(kwargs)
@ -839,7 +827,6 @@ class SocketFlowInstance:
p: Optional[Union[str, Dict[str, Any]]] = None,
o: Optional[Union[str, Dict[str, Any]]] = None,
g: Optional[str] = None,
user: Optional[str] = None,
collection: Optional[str] = None,
limit: int = 100,
batch_size: int = 20,
@ -864,8 +851,6 @@ class SocketFlowInstance:
request["o"] = o_term
if g is not None:
request["g"] = g
if user is not None:
request["user"] = user
if collection is not None:
request["collection"] = collection
request.update(kwargs)
@ -879,7 +864,6 @@ class SocketFlowInstance:
def sparql_query_stream(
self,
query: str,
user: str = "trustgraph",
collection: str = "default",
limit: int = 10000,
batch_size: int = 20,
@ -888,7 +872,6 @@ class SocketFlowInstance:
"""Execute a SPARQL query with streaming batches."""
request = {
"query": query,
"user": user,
"collection": collection,
"limit": limit,
"streaming": True,
@ -904,7 +887,6 @@ class SocketFlowInstance:
def rows_query(
self,
query: str,
user: str,
collection: str,
variables: Optional[Dict[str, Any]] = None,
operation_name: Optional[str] = None,
@ -913,7 +895,6 @@ class SocketFlowInstance:
"""Execute a GraphQL query against structured rows."""
request = {
"query": query,
"user": user,
"collection": collection
}
if variables:
@ -943,7 +924,6 @@ class SocketFlowInstance:
self,
text: str,
schema_name: str,
user: str = "trustgraph",
collection: str = "default",
index_name: Optional[str] = None,
limit: int = 10,
@ -956,7 +936,6 @@ class SocketFlowInstance:
request = {
"vector": vector,
"schema_name": schema_name,
"user": user,
"collection": collection,
"limit": limit
}

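In the SocketClient hunks above, `workspace` moves into the message envelope while the per-service request bodies lose their `user` key. A rough sketch of the resulting wire shape follows, using only the fields visible in the diff. `build_envelope` and `graph_rag_request` are hypothetical helpers, and the `"graph-rag"` service name is an assumption made for illustration.

```python
import json
from typing import Any, Dict


def build_envelope(
    request_id: str, workspace: str, service: str, request: Dict[str, Any],
) -> Dict[str, Any]:
    """Hypothetical helper: the envelope fields shown in the SocketClient hunks."""
    return {
        "id": request_id,
        "workspace": workspace,
        "service": service,
        "request": request,
    }


def graph_rag_request(
    query: str, collection: str, entity_limit: int = 50, triple_limit: int = 30,
) -> Dict[str, Any]:
    """Hypothetical helper: a graph RAG request body with no `user` key."""
    return {
        "query": query,
        "collection": collection,
        "entity-limit": entity_limit,
        "triple-limit": triple_limit,
    }


envelope = build_envelope(
    "req-1", "default", "graph-rag",  # service name is an assumption
    graph_rag_request("What is X?", "research"),
)
wire = json.dumps(envelope)
```

Keeping the workspace on the envelope means it is set once per client connection object rather than repeated, and possibly contradicted, inside each request body.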

@ -45,10 +45,13 @@ class ConfigValue:
type: Configuration type/category
key: Specific configuration key
value: Configuration value as string
workspace: Workspace the value belongs to. Only populated for
responses to getvalues-all-ws; empty otherwise.
"""
type : str
key : str
value : str
workspace : str = ""
@dataclasses.dataclass
class DocumentMetadata:
@ -62,7 +65,7 @@ class DocumentMetadata:
title: Document title
comments: Additional comments or description
metadata: List of RDF triples providing structured metadata
user: User/owner identifier
workspace: Workspace the document belongs to
tags: List of tags for categorization
parent_id: Parent document ID for child documents (empty for top-level docs)
document_type: "source" for uploaded documents, "extracted" for derived content
@ -73,7 +76,7 @@ class DocumentMetadata:
title : str
comments : str
metadata : List[Triple]
user : str
workspace : str
tags : List[str]
parent_id : str = ""
document_type : str = "source"
@ -88,7 +91,7 @@ class ProcessingMetadata:
document_id: ID of the document being processed
time: Processing start timestamp
flow: Flow instance handling the processing
user: User identifier
workspace: Workspace the processing job belongs to
collection: Target collection for processed data
tags: List of tags for categorization
"""
@ -96,7 +99,7 @@ class ProcessingMetadata:
document_id : str
time : datetime.datetime
flow : str
user : str
workspace : str
collection : str
tags : List[str]
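The metadata types swap `user` for `workspace`, and the Library parsing code reads it with `v.get("workspace", "")`, so a response that lacks the field still parses. The sketch below shows that tolerant parsing. `ProcessingRecord` and `parse_processing` are hypothetical names, and the `id` field is assumed from the constructor call shown in the Library hunk.

```python
import dataclasses
import datetime
from typing import Any, Dict, List


@dataclasses.dataclass
class ProcessingRecord:
    """Hypothetical mirror of the fields listed for ProcessingMetadata."""
    id: str
    document_id: str
    time: datetime.datetime
    flow: str
    workspace: str
    collection: str
    tags: List[str]


def parse_processing(v: Dict[str, Any]) -> ProcessingRecord:
    """Parse one wire record, tolerating a missing workspace field."""
    return ProcessingRecord(
        id=v["id"],
        document_id=v["document-id"],
        time=datetime.datetime.fromtimestamp(v["time"]),
        flow=v["flow"],
        workspace=v.get("workspace", ""),  # empty when the field is absent
        collection=v["collection"],
        tags=v["tags"],
    )


base = {
    "id": "proc-001", "document-id": "doc-123", "time": 1700000000,
    "flow": "default", "collection": "research", "tags": ["automated"],
}
without_workspace = parse_processing(base)
with_workspace = parse_processing({**base, "workspace": "research"})
```

The other keys are still read with `v[...]`, so a record missing any of them raises `KeyError`; only `workspace` is optional in this sketch.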
@ -105,17 +108,15 @@ class CollectionMetadata:
"""
Metadata for a data collection.
Collections provide logical grouping and isolation for documents and
knowledge graph data.
Collections provide logical grouping within a workspace for documents
and knowledge graph data.
Attributes:
user: User/owner identifier
collection: Collection identifier
name: Human-readable collection name
description: Collection description
tags: List of tags for categorization
"""
user : str
collection : str
name : str
description : str