2026-06-20 20:28:06 +02:00
162 changed files with 4308 additions and 7161 deletions
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@

 <img src="TG-fullname-logo.svg" width=100% />

-[![PyPI version](https://img.shields.io/pypi/v/trustgraph.svg)](https://pypi.org/project/trustgraph/) ![License](https://img.shields.io/badge/license-Apache%202.0-blue) ![E2E Tests](https://github.com/trustgraph-ai/trustgraph/actions/workflows/release.yaml/badge.svg)
+[![PyPI version](https://img.shields.io/pypi/v/trustgraph.svg)](https://pypi.org/project/trustgraph/) [![License](https://img.shields.io/github/license/trustgraph-ai/trustgraph?color=blue)](LICENSE) ![E2E Tests](https://github.com/trustgraph-ai/trustgraph/actions/workflows/release.yaml/badge.svg)
 [![Discord](https://img.shields.io/discord/1251652173201149994
 )](https://discord.gg/sQMwkRz5GX) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/trustgraph-ai/trustgraph)

@@ -11,89 +11,44 @@

 <a href="https://trendshift.io/repositories/17291" target="_blank"><img src="https://trendshift.io/api/badge/repositories/17291" alt="trustgraph-ai%2Ftrustgraph | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

-# Write context once. Run agents anywhere.
+# The agent runtime platform

 </div>

-Stop rebuilding context from scratch. TrustGraph treats context as a holon — a modular, independent whole that naturally snaps into a larger domain-wide intelligence layer. By deploying context as holonic context graphs, TrustGraph powers multi-tenant agent workflows, dramatically reduces token consumption, and aligns with semantic web standards (RDF, OWL, SKOS, SHACL). Version your context, share it across teams, and scale with full provenance.
+TrustGraph is an agent runtime platform built around context graphs — structured, queryable representations of your domain knowledge that ground every agent query in verified, explainable facts in private deployments with sovereign control. The platform is the full stack for agentic systems: context graphs, memory, retrieval, orchestration, and inference for precision-critical agent workloads.

-## What TrustGraph Does
-
-TrustGraph is a complete holonic context harness for all LLMs. It provides the full infrastructure layer underneath your agents: knowledge ingestion, structured storage, graph-grounded retrieval, agent orchestration, and a full LLM inferencing stack.
-
-TrustGraph relies on absolutely no 3rd party services aside from optional API integrations to cloud-hosted LLMs. Whether you are using Anthropic's or OpenAI's API, or self-hosting Qwen3.7 via vLLM, TrustGraph handles it all with pre-built API connectors and a full LLM inferencing stack to enrich the models with a sovereign, private holonic system that grounds your agents in reality.
-
-## The Problem: Why Agents Break
-
-When you build an AI agent today, you spend most of your time fighting context:
-
-- **RAG retrieves fragments, not meaning**. Chunks of text have no structure. Relationships between facts are invisible. Your agent guesses at the connections.
-
-- **Context is disposable**. What the agent learned in one session is gone in the next. There is no persistent, structured knowledge layer underneath.
-
-- **Answers aren't traceable**. You can't explain why the agent said what it said, which means you can't trust it in production.
-
-- **Knowledge can't be reused**. You rebuild the same context pipelines for every new project, every new agent, every new environment.
-
-These aren't retrieval problems. They are structural problems. Context needs to be organized, versioned, and composable — exactly the way software infrastructure is.
-
-## The Solution: A Holonic Context System
-The philosopher Arthur Koestler coined the word [holon](https://en.wikipedia.org/wiki/Holon_(philosophy)) to describe something that is simultaneously a whole in itself and a part of something larger. A fact is whole. It is also part of a domain. A domain is whole. It is also part of an organization's knowledge.
-
-AI agents break down because this holonic structure is never built. Context gets shoved into flat text windows, scattered across vector stores, or hardwired into one-off prompts. Facts lose their relationships.
-
-TrustGraph solves this by organizing your domain into holonic context graphs. Entities, relationships, and evidence are treated as first-class objects. Every agent query is grounded against these holons—marrying symbolic graph structures with vector embeddings. Every answer carries provenance. Every fact is traceable.
-
-## Context Cores: Knowledge as a First-Class Citizen
-
-A Context Core is the deployable unit of knowledge in TrustGraph. It packages everything an agent needs to reason reliably over a domain into a single, portable artifact.
-
-### What's inside a Context Core
-- **Ontology** — your domain schema and entity mappings
-- **Holon** — entities, relationships, and supporting evidence
-- **Embeddings** — vector indexes for fast semantic entry-point lookup
-- **Provenance** — where every fact came from, when, and how it was derived
-- **Retrieval policies** — traversal rules, freshness controls, authority ranking
-
-Context Cores decouple what agents know from how agents are deployed. Build once. Run in Docker locally, Kubernetes in production, or on any cloud. Pin a version. Roll back. Promote across environments. This is context engineering — and it works because knowledge is finally treated like the infrastructure it is.
-
-## Explainability: Trust Your Agents in Production
-LLMs are black boxes, and traditional RAG makes it worse. When an agent pulls flat text chunks from a vector store, you have no idea how it connected those fragments to form an answer. You cannot ship agents to production if you can't explain why they said what they said.
-
-### How TrustGraph makes agents explainable:
-
-- **Traceable Reasoning Paths**: Instead of guessing at connections between text chunks, TrustGraph traverses explicit relationship paths in the holonic context graph. You can inspect exactly which entities, relationships, and sub-graphs were pulled into the LLM's context window to generate a given response.
-- **Fact-Level Provenance**: Every node and edge in the graph carries strict provenance. When an agent makes a claim, you can trace it back to the exact source document, the time it was ingested, and the extraction method used to derive it.
-- **No Black-Box Guesses**: By grounding the LLM in a structured, symbolic graph, you eliminate the hallucinations that occur when models are forced to infer relationships from unstructured text. If a fact isn't in the graph, the agent doesn't use it.
-
-TrustGraph doesn't just give you answers - it gives you the receipt. Every fact is traceable, every connection is visible, and every output is verifiable.
-
-## Workspaces, Collections, and Flows
-
-TrustGraph has a [three-level system](https://docs.trustgraph.ai/overview/workspaces) for organizing and isolating knowledge. 
-
-A `Workspace` is the outermost boundary — a fully isolated tenancy scope where all data, users, configuration, and pipelines live independently from every other workspace. Isolation is structural: enforced at the pub/sub queue, storage, and API gateway layers, not by trusting a field in a message body.
-
-Within a workspace, a `Collection` groups related holons, graph structures, embeddings, and documents together — think of it as a dedicated shelf in a library, scoped to a specific domain, project, or customer.
-
-A `Flow` is a running data processing pipeline that defines how raw data moves through ingestion, extraction, structuring, and storage — the assembly line that turns documents into queryable knowledge. Together, the three layers let you run multiple isolated tenants on a single deployment, separate knowledge by domain within each tenant, and process that knowledge through fully configurable pipelines — all without restarting the system or rebuilding your infrastructure.
-
-## The Full Stack
-TrustGraph is not a wrapper around a graph database. It is the complete backend for production agentic systems.
-
-- **Holonic context graph engine**: automated entity and relationship extraction, ontology-driven graph construction, graph-grounded retrieval for explainable outputs
-- **Multi-model database**: tabular/relational, key-value, document, graph, vectors, images, video, and audio — all managed in Cassandra and S3-compatible Garage
-- **Out-of-the-box RAG pipelines**: DocumentRAG, GraphRAG, and OntologyRAG ready to deploy
-- **Fully agentic orchestration**: single or multi-agent, ReAct, Plan-then-Execute, Supervisor patterns, and MCP integration
-- **3D Knowledge Explorer**: interactive graph visualization with BFS neighborhood extraction and edge pulse animation
-- **Automated data ingest**: quick ingest with semantic similarity or ontology-structured precision retrieval
-- **Run anywhere**: Docker/Podman locally, Kubernetes in the cloud
-
-All major LLMs — Anthropic, Cohere, Gemini, Mistral, OpenAI, and more via API.
-
-vLLM, Ollama, TGI, LM Studio, and Llamafiles for fully local inferencing.
-
-Verified cloud deployments for Alibaba Cloud, AWS, Azure, GCP, OVHcloud, and Scaleway.
+The platform:
+- [x] Multi-model and multimodal database system
+  - [x] Tabular/relational, key-value
+  - [x] Document, graph, and vectors
+  - [x] Images, video, and audio
+- [x] Context Graph engine
+  - [x] Automated entity and relationship extraction
+  - [x] Ontology-driven graph construction
+  - [x] Graph-grounded retrieval for explainable outputs
+- [x] Automated data ingest and loading
+  - [x] Quick ingest with semantic similarity retrieval
+  - [x] Ontology structuring for precision retrieval
+- [x] Out-of-the-box RAG pipelines
+  - [x] DocumentRAG
+  - [x] GraphRAG
+  - [x] OntologyRAG     
+- [x] 3D GraphViz for exploring context
+- [x] Fully Agentic System
+  - [x] Single or Multi Agent
+  - [x] ReAct, Plan-then-Execute, and Supervisor patterns
+  - [x] MCP integration 
+- [x] Run anywhere
+  - [x] Deploy locally with Docker
+  - [x] Deploy in cloud with Kubernetes
+- [x] Support for all major LLMs
+  - [x] API support for Anthropic, Cohere, Gemini, Mistral, OpenAI, and others
+  - [x] Model inferencing with vLLM, Ollama, TGI, LM Studio, and Llamafiles
+- [x] Developer friendly
+  - [x] REST API [Docs](https://docs.trustgraph.ai/reference/apis/rest.html)
+  - [x] Websocket API [Docs](https://docs.trustgraph.ai/reference/apis/websocket.html)
+  - [x] Python API [Docs](https://docs.trustgraph.ai/reference/apis/python)
+  - [x] CLI [Docs](https://docs.trustgraph.ai/reference/cli/)
     
 ## No API Keys Required

@@ -107,12 +62,12 @@ Everything else is included.
 - [x] Managed Multi-model storage in [Cassandra](https://cassandra.apache.org/_/index.html)
 - [x] Managed Vector embedding storage in [Qdrant](https://github.com/qdrant/qdrant)
 - [x] Managed File and Object storage in [Garage](https://github.com/deuxfleurs-org/garage) (S3 compatible)
-- [x] Managed High-speed Pub/Sub messaging fabric with [Pulsar](https://github.com/apache/pulsar) or [RabbitMQ](https://www.rabbitmq.com/)
+- [x] Managed High-speed Pub/Sub messaging fabric with [Pulsar](https://github.com/apache/pulsar)
 - [x] Complete LLM inferencing stack for open LLMs with [vLLM](https://github.com/vllm-project/vllm), [TGI](https://github.com/huggingface/text-generation-inference), [Ollama](https://github.com/ollama/ollama), [LM Studio](https://github.com/lmstudio-ai), and [Llamafiles](https://github.com/mozilla-ai/llamafile) 

 ## Quickstart

-No need to clone the repo unless you are building from source. TrustGraph deploys as a set of Docker containers. Configure it on the command line in one step:
+There's no need to clone this repo, unless you want to build from source. TrustGraph is a fully containerized app that deploys as a set of Docker containers. To configure TrustGraph on the command line:

 ```
 npx @trustgraph/config
@@ -123,39 +78,44 @@ The config process will generate an app config that can be run locally with Dock
 - Deployment instructions as `INSTALLATION.md`

 <p align="center">
-  <video src="https://github.com/user-attachments/assets/33434c3c-f586-4610-8bb2-d7b7b586a672"
+  <video src="https://github.com/user-attachments/assets/2978a6aa-4c9c-4d7c-ad02-8f3d01a1c602"
 width="80%" controls></video>
 </p>

 For a browser based configuration, try the [Configuration Terminal](https://config-ui.demo.trustgraph.ai/). 

-## Watch What is a Holonic Context Graph?
+## Watch What is a Context Graph?

 [![What is a Context Graph?](https://img.youtube.com/vi/gZjlt5WcWB4/maxresdefault.jpg)](https://www.youtube.com/watch?v=gZjlt5WcWB4) 

-## Watch Holonic Context Graphs in Action
+## Watch Context Graphs in Action

 [![Context Graphs in Action with TrustGraph](https://img.youtube.com/vi/sWc7mkhITIo/maxresdefault.jpg)](https://www.youtube.com/watch?v=sWc7mkhITIo)

 ## Getting Started with TrustGraph

 - [**Getting Started Guides**](https://docs.trustgraph.ai/getting-started)
+- [**Using the Workbench**](#workbench)
 - [**Developer APIs and CLI**](https://docs.trustgraph.ai/reference)
 - [**Deployment Guides**](https://docs.trustgraph.ai/deployment)

-## TrustGraph UI
+## Workbench

-<img width="1389" height="961" alt="Image" src="https://github.com/user-attachments/assets/35c9250d-0f01-40cb-9294-1ee8fd9a1b56" />
+The **Workbench** provides tools for all major features of TrustGraph. The **Workbench** is on port `8888` by default.

-The UI provides tools for all major features of TrustGraph. The UI deploys on port `8888` by default.
-
-- **Agent Console** — Query your agents directly with streaming responses and live explainability event tracking, so you can watch reasoning unfold in real time
-- **GraphRAG View** — Interactive graph RAG queries with a visual explainability DAG and inline provenance display, making it easy to see exactly where answers came from
-- **Context Explorer** — An interactive 3D context graph explorer with dynamic graph loading, BFS neighborhood extraction, edge pulse animation, and multiple navigation views
-- **Document Ingestion** — A complete upload and submission workflow with page and chunk inspection and document structure browsing
-- **Ontology Workbench** — A full ontology editor with class and property trees, OWL/XML and Turtle import/export with round-trip fidelity, circular dependency detection, and safe-delete confirmation dialogs
-- **Schema Workbench** — Interactive schema management with list, create, edit, and delete operations including field and index management
-- **Prompt Editor** — A dedicated prompt editing workflow
+- **Vector Search**: Search the installed knowledge bases
+- **Agentic, GraphRAG and LLM Chat**: Chat interface for agents, GraphRAG queries, or direct to LLMs
+- **Relationships**: Analyze deep relationships in the installed knowledge bases
+- **Graph Visualizer**: 3D GraphViz of the installed knowledge bases
+- **Library**: Staging area for installing knowledge bases
+- **Flow Classes**: Workflow preset configurations
+- **Flows**: Create custom workflows and adjust LLM parameters during runtime
+- **Knowledge Cores**: Manage reusable knowledge bases
+- **Prompts**: Manage and adjust prompts during runtime
+- **Schemas**: Define custom schemas for structured data knowledge bases
+- **Ontologies**: Define custom ontologies for unstructured data knowledge bases
+- **Agent Tools**: Define tools with collections, knowledge cores, MCP connections, and tool groups
+- **MCP Tools**: Connect to MCP servers

 ## TypeScript Library for UIs

@@ -165,6 +125,134 @@ There are 3 libraries for quick UI integration of TrustGraph services.
 - [@trustgraph/react-state](https://www.npmjs.com/package/@trustgraph/react-state)
 - [@trustgraph/react-provider](https://www.npmjs.com/package/@trustgraph/react-provider)

+## Context Cores
+
+Context Cores are how TrustGraph treats context like code. A Context Core is a **portable, versioned bundle of context** that you can ship between projects and environments, pin in production, and reuse across agents. It packages the “stuff agents need to know” (structured knowledge + embeddings + evidence + policies) into a single artifact, so you can treat context like code: build it, test it, version it, promote it, and roll it back. TrustGraph is built to support this kind of end-to-end context engineering and orchestration workflow.
+
+### What’s inside a Context Core
+A Context Core typically includes:
+- Ontology (your domain schema) and mappings
+- Context Graph (entities, relationships, supporting evidence)
+- Embeddings / vector indexes for fast semantic entry-point lookup
+- Source manifests + provenance (where facts came from, when, and how they were derived)
+- Retrieval policies (traversal rules, freshness, authority ranking)
+
+## Tech Stack
+TrustGraph provides component flexibility to optimize agent workflows.
+
+<details>
+<summary>LLM APIs</summary>
+<br>
+
+- Anthropic<br>
+- AWS Bedrock<br>
+- AzureAI<br>
+- AzureOpenAI<br>
+- Cohere<br>
+- Google AI Studio<br>
+- Google VertexAI<br>
+- Mistral<br>
+- OpenAI<br>
+
+</details>
+<details>
+<summary>LLM Orchestration</summary>
+<br>
+
+- LM Studio<br>
+- Llamafiles<br>
+- Ollama<br>
+- TGI<br>
+- vLLM<br>
+
+</details>
+<details>
+<summary>Multi-model storage</summary>
+<br>
+
+- Apache Cassandra<br>
+
+</details>
+<details>
+<summary>VectorDB</summary>
+<br>
+
+- Qdrant<br>
+
+</details>
+<details>
+<summary>File and Object Storage</summary>
+<br>
+
+- Garage<br>
+
+</details>
+<details>
+<summary>Observability</summary>
+<br>  
+
+- Prometheus<br>
+- Grafana<br>
+- Loki<br>
+
+</details>
+<details>
+<summary>Data Streaming</summary>
+<br>
+
+- Apache Pulsar<br>
+- RabbitMQ<br>
+- Apache Kafka<br>
+
+</details>
+<details>
+<summary>Clouds</summary>
+<br>
+
+- AWS<br>
+- Azure<br>
+- Google Cloud<br>
+- OVHcloud<br>
+- Scaleway<br>
+
+</details>
+
+## Observability & Telemetry
+
+Once the platform is running, access the Grafana dashboard at:
+
+```
+http://localhost:3000
+```
+
+Default credentials are:
+
+```
+user: admin
+password: admin
+```
+
+The default Grafana dashboard tracks the following:
+
+<details>
+<summary>Telemetry</summary>
+<br>
+
+- LLM Latency<br>
+- Error Rate<br>
+- Service Request Rates<br>
+- Queue Backlogs<br>
+- Chunking Histogram<br>
+- Error Source by Service<br>
+- Rate Limit Events<br>
+- CPU usage by Service<br>
+- Memory usage by Service<br>
+- Models Deployed<br>
+- Token Throughput (Tokens/second)<br>
+- Cost Throughput (Cost/second)<br>
+   
+</details>
+
 ## Contributing

 [Developer's Guide](https://docs.trustgraph.ai/guides/building/introduction.html)
@@ -173,7 +261,7 @@ There are 3 libraries for quick UI integration of TrustGraph services.

 **TrustGraph** is licensed under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).

-   Copyright 2024-2026 TrustGraph
+   Copyright 2024-2025 TrustGraph

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
--- a/containers/Containerfile.hf
+++ b/containers/Containerfile.hf
@@ -23,7 +23,7 @@ RUN pip3 install --no-cache-dir \
    langchain==1.2.16 langchain-core==1.3.2 langchain-huggingface==1.2.2 \
    langchain-community==0.4.1 \
    sentence-transformers==5.4.1 transformers==5.7.0 \
-    huggingface-hub==1.13.0 click \
+    huggingface-hub==1.13.0 \
    pulsar-client==3.11.0

 # Most commonly used embeddings model, just build it into the container
--- a/containers/Containerfile.unstructured
+++ b/containers/Containerfile.unstructured
@@ -7,7 +7,7 @@ FROM docker.io/fedora:42 AS base

 ENV PIP_BREAK_SYSTEM_PACKAGES=1

-RUN dnf install -y python3.13 libxcb mesa-libGL poppler-utils && \
+RUN dnf install -y python3.13 libxcb mesa-libGL && \
  alternatives --install /usr/bin/python python /usr/bin/python3.13 1 && \
  python -m ensurepip --upgrade && \
  pip3 install --no-cache-dir --upgrade 'pip>=26.0' 'setuptools>=78.1.1' && \
--- a/dev-tools/library_client.py
+++ b/dev-tools/library_client.py
@@ -25,7 +25,7 @@ BUCKET_URL = "https://storage.googleapis.com/trustgraph-library"
 INDEX_URL = f"{BUCKET_URL}/index.json"

 default_url = os.getenv("TRUSTGRAPH_URL", "http://localhost:8088/")
-default_workspace = os.getenv("TRUSTGRAPH_WORKSPACE", "default")
+default_user = "trustgraph"
 default_token = os.getenv("TRUSTGRAPH_TOKEN", None)


@@ -113,7 +113,7 @@ def convert_metadata(metadata_json):
    return triples


-def load_document(api, doc_entry):
+def load_document(api, user, doc_entry):
    """Fetch metadata and content for a document, then load into TrustGraph."""
    doc_id = doc_entry["id"]
    title = doc_entry["title"]
@@ -133,6 +133,7 @@ def load_document(api, doc_entry):
    api.add_document(
        id=doc["id"],
        metadata=metadata,
+        user=user,
        kind=doc["kind"],
        title=doc["title"],
        comments=doc["comments"],
@@ -143,12 +144,12 @@ def load_document(api, doc_entry):
    print(f"    done.")


-def load_documents(api, docs):
+def load_documents(api, user, docs):
    """Load a list of documents."""
    print(f"Loading {len(docs)} document(s)...\n")
    for doc in docs:
        try:
-            load_document(api, doc)
+            load_document(api, user, doc)
        except Exception as e:
            print(f"    FAILED: {e}", file=sys.stderr)
        print()
@@ -165,8 +166,8 @@ def main():
        help=f"TrustGraph API URL (default: {default_url})",
    )
    parser.add_argument(
-        "-w", "--workspace", default=default_workspace,
-        help=f"Workspace (default: {default_workspace})",
+        "-U", "--user", default=default_user,
+        help=f"User ID (default: {default_user})",
    )
    parser.add_argument(
        "-t", "--token", default=default_token,
@@ -211,22 +212,22 @@ def main():
        return

    # Load commands need the API
-    api = Api(args.url, token=args.token, workspace=args.workspace).library()
+    api = Api(args.url, token=args.token).library()

    if args.command == "load-all":
-        load_documents(api, index)
+        load_documents(api, args.user, index)

    elif args.command == "load-doc":
        matches = [d for d in index if str(d.get("id")) == args.id]
        if not matches:
            print(f"No document with ID '{args.id}' found.", file=sys.stderr)
            sys.exit(1)
-        load_documents(api, matches)
+        load_documents(api, args.user, matches)

    elif args.command == "load-match":
        results = search_index(index, args.query)
        if results:
-            load_documents(api, results)
+            load_documents(api, args.user, results)
        else:
            print("No matches found.", file=sys.stderr)
            sys.exit(1)
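
Review note on `dev-tools/library_client.py`: the hunks above replace the workspace argument on `Api(...)` with a `user` value that is threaded through `load_documents` and `load_document` into `add_document`. A minimal sketch of that call chain follows, assuming a hypothetical `StubLibrary` in place of the real `Api(...).library()` object and a reduced set of `add_document` keyword arguments (the real call also passes `metadata`, `kind`, `comments`, and others).

```python
import argparse

DEFAULT_USER = "trustgraph"  # mirrors default_user in the diff


class StubLibrary:
    """Hypothetical stand-in for the object returned by Api(...).library()."""

    def __init__(self):
        self.calls = []

    def add_document(self, **kwargs):
        # Record the keyword arguments instead of contacting a server.
        self.calls.append(kwargs)


def load_document(api, user, doc):
    # The user ID is now an explicit keyword argument on every upload.
    api.add_document(id=doc["id"], user=user, title=doc["title"])


def load_documents(api, user, docs):
    for doc in docs:
        load_document(api, user, doc)


def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-U", "--user", default=DEFAULT_USER,
        help=f"User ID (default: {DEFAULT_USER})",
    )
    return parser.parse_args(argv)


def run(argv, docs):
    """Parse the CLI arguments and load docs through the stub library."""
    args = parse_args(argv)
    api = StubLibrary()
    load_documents(api, args.user, docs)
    return api.calls
```

With no `-U` flag every recorded call carries the default user; passing `-U alice` changes the value on every document in the batch.
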
--- a/docs/api.html
+++ b/docs/api.html
--- a/docs/python-api.md
+++ b/docs/python-api.md
--- a/docs/tech-specs/capabilities.md
+++ b/docs/tech-specs/capabilities.md
@@ -100,7 +100,6 @@ multi-word subsystems.
 | `users:admin` | Assign / remove roles on users within the workspace |
 | `keys:self` | Create / revoke / list **own** API keys |
 | `keys:admin` | Create / revoke / list **any user's** API keys within the workspace |
-| `workspaces:list-own` | List workspaces the caller has access to |
 | `workspaces:admin` | Create / delete / disable workspaces (system-level) |
 | `iam:admin` | JWT signing-key rotation, IAM-level operations |
 | `metrics:read` | Prometheus metrics proxy |
@@ -111,7 +110,7 @@ The open-source edition ships three roles:

 | Role | Capabilities |
 |---|---|
-| `reader` | `agent`, `graph:read`, `documents:read`, `rows:read`, `llm`, `embeddings`, `mcp`, `collections:read`, `knowledge:read`, `flows:read`, `config:read`, `keys:self`, `workspaces:list-own` |
+| `reader` | `agent`, `graph:read`, `documents:read`, `rows:read`, `llm`, `embeddings`, `mcp`, `collections:read`, `knowledge:read`, `flows:read`, `config:read`, `keys:self` |
 | `writer` | everything in `reader` **+** `graph:write`, `documents:write`, `rows:write`, `collections:write`, `knowledge:write` |
 | `admin` | everything in `writer` **+** `config:write`, `flows:write`, `users:read`, `users:write`, `users:admin`, `keys:admin`, `workspaces:admin`, `iam:admin`, `metrics:read` |
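
Review note on the role table: each role is defined as the previous role plus extra capabilities, so the three roles form nested sets. The sketch below models that with frozensets, using the capability names from the post-change table; `ROLES` and `has_capability` are illustrative names, not part of the TrustGraph codebase.

```python
# Capability sets as listed in the table after this change.
READER = frozenset({
    "agent", "graph:read", "documents:read", "rows:read", "llm",
    "embeddings", "mcp", "collections:read", "knowledge:read",
    "flows:read", "config:read", "keys:self",
})

# writer = everything in reader + write capabilities
WRITER = READER | {
    "graph:write", "documents:write", "rows:write",
    "collections:write", "knowledge:write",
}

# admin = everything in writer + administrative capabilities
ADMIN = WRITER | {
    "config:write", "flows:write", "users:read", "users:write",
    "users:admin", "keys:admin", "workspaces:admin", "iam:admin",
    "metrics:read",
}

ROLES = {"reader": READER, "writer": WRITER, "admin": ADMIN}


def has_capability(role: str, capability: str) -> bool:
    """Return True if the named role includes the capability."""
    return capability in ROLES[role]
```

Because `workspaces:list-own` is removed from `reader`, it also disappears from `writer` and `admin`, which inherit from it.
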

--- a/docs/tech-specs/iam-protocol.md
+++ b/docs/tech-specs/iam-protocol.md
@@ -224,7 +224,6 @@ class ApiKeyRecord:
 | `enable-user` | `user_id`, `workspace` (optional integrity check) | — | Re-enables a previously disabled user; does not restore API keys. |
 | `delete-user` | `user_id`, `workspace` (optional integrity check) | — | Hard-delete; removes user record, username lookup, and all the user's API keys. |
 | `create-workspace` | `workspace_record` | `workspace` | System-level. |
-| `list-my-workspaces` | `actor` (gateway-injected) | `workspaces` | Returns the workspaces the calling user has access to. OSS: the user's home workspace; if the caller holds the `admin` role, returns all workspaces instead. Enterprise regimes return whatever workspaces the user has been granted access to. |
 | `list-workspaces` | — | `workspaces` | System-level. |
 | `get-workspace` | `workspace_record` (id only) | `workspace` | System-level. |
 | `update-workspace` | `workspace_record` | `workspace` | System-level. |
--- a/docs/tech-specs/knowledge-core-completeness.md
+++ b/docs/tech-specs/knowledge-core-completeness.md
@@ -1,535 +0,0 @@
----
-layout: default
-title: "Knowledge Core Completeness"
-parent: "Tech Specs"
----
-
-# Knowledge Core Completeness
-
-## Overview
-
-Knowledge cores are portable snapshots of extracted knowledge: triples, graph
-embeddings, and document embeddings stored in Cassandra's `knowledge` keyspace.
-They can be downloaded as files, transferred between TrustGraph instances, and
-loaded back into vector and graph stores.
-
-Recent additions to TrustGraph — explainability/provenance and named graphs —
-were not carried through to the knowledge core system. This means that
-exporting and re-importing a core loses provenance links, graph assignments,
-and source material, breaking the explainability chain.
-
-This specification addresses three gaps:
-
-1. **Named graphs not stored** — The `g` (graph name) field on triples is
-   silently dropped when writing to the core store and comes back as `None`
-   on read.
-2. **Provenance triples not captured** — Provenance triples (PROV-O) are
-   generated during extraction and flow to graph stores, but never enter
-   the knowledge core store. It is unclear whether they arrive at the store
-   in the correct form.
-3. **Source material not included** — Documents, text pages, and chunks in
-   the librarian's bucket store are not part of the core. After loading a
-   core on a different instance, provenance links to source material point
-   at nothing.
-
-## Goals
-
-- **Self-contained cores**: A downloaded knowledge core file contains
-  everything needed to reconstruct the full knowledge graph including
-  provenance and source attribution on a fresh instance.
-- **Named graph preservation**: Round-tripping a core preserves graph
-  assignments on all triples.
-- **Backward compatibility**: Existing core files (without graph names or
-  source material) can still be uploaded and loaded. New fields are optional
-  on import.
-- **No change to core identity**: A core is still identified by its document
-  ID. The additional data is associated with the same core ID.
-- **Minimal file format changes**: Extend the existing msgpack record format
-  with new record types rather than restructuring existing ones.
-
-## Background
-
-### Current Lifecycle
-
-```
-Extraction pipeline
-    │
-    ├─ triples ──────────────────► knowledge core store (Cassandra)
-    ├─ graph embeddings ─────────► knowledge core store (Cassandra)
-    ├─ document embeddings ──────► knowledge core store (Cassandra)
-    ├─ provenance triples ───────► graph store (only)
-    └─ source documents ─────────► librarian bucket store (only)
-
-Download:  Cassandra ──► knowledge manager ──► API gateway ──► client file
-Upload:    client file ──► API gateway ──► knowledge manager ──► Cassandra
-Load:      Cassandra ──► knowledge manager ──► Pulsar topics ──► graph/vector stores
-```
-
-### Current Core File Format (msgpack)
-
-A core file is a sequence of concatenated msgpack records. Each record is a
-2-element tuple: `(type_tag, payload)`.
-
-| Type tag | Payload | Description |
-|----------|---------|-------------|
-| `"t"` | `{"m": {id, root, collection}, "t": [triple_dicts]}` | Triple batch |
-| `"ge"` | `{"m": {id, root, collection}, "e": [{entity, vector}]}` | Graph embedding batch |
-
-### What's Missing
-
-#### Named Graphs
-
-The `Triple` dataclass has a `g: str | None` field (graph name IRI), used to
-separate provenance graphs (`urn:graph:source`, `urn:graph:retrieval`) from
-the default graph. However:
-
-- **Cassandra schema** (`knowledge.triples` table): stores a 6-tuple per
-  triple `(s_val, s_is_uri, p_val, p_is_uri, o_val, o_is_uri)` — no graph
-  field.
-- **`add_triples()`** (`tables/knowledge.py:231`): destructures only `s`,
-  `p`, `o` — `g` is discarded.
-- **`get_triples()`** (`tables/knowledge.py:396`): reconstructs `Triple`
-  with `g` defaulting to `None`.
-- **Core file format**: triple dicts do not include a graph field.
-
-#### Provenance Triples
-
-Provenance triples are generated in the extraction pipeline
-(`trustgraph-base/trustgraph/provenance/triples.py`) and published to graph
-store topics. They use named graphs (`urn:graph:source`,
-`urn:graph:retrieval`) and PROV-O vocabulary.
-
-The knowledge core store processor (`storage/knowledge/store.py`) listens on
-`triples-input` and `graph-embeddings-input`. Whether provenance triples
-arrive on the same `triples-input` topic or a separate one needs
-verification. Even if they do arrive, the graph name would be lost (per
-above).
-
-#### Source Material
-
-The librarian stores the full document hierarchy in a separate system:
-
-- **Blob store** (S3/MinIO): original documents, text pages, chunks —
-  keyed by object UUID under `doc/{object_id}`.
-- **Cassandra `library` keyspace**: document metadata including `id`,
-  `kind` (MIME type), `title`, `parent_id`, `document_type`
-  (`source`/`extracted`), `object_id` (blob reference).
-
-Provenance triples link extracted facts back to chunk/page/document IDs.
-Those IDs resolve through the librarian. When a core is loaded on a
-different instance, the librarian has no matching documents, so the entire
-provenance chain is broken.
-
-### Key Source Files
-
-| Component | File | Purpose |
-|-----------|------|---------|
-| Core Cassandra schema | `trustgraph-flow/trustgraph/tables/knowledge.py` | Table definitions, read/write |
-| Core manager | `trustgraph-flow/trustgraph/cores/knowledge.py` | API operations, load-to-store |
-| Core store processor | `trustgraph-flow/trustgraph/storage/knowledge/store.py` | Extraction → Cassandra |
-| CLI download | `trustgraph-cli/trustgraph/cli/get_kg_core.py` | Core → msgpack file |
-| CLI upload | `trustgraph-cli/trustgraph/cli/put_kg_core.py` | Msgpack file → core |
-| CLI load | `trustgraph-cli/trustgraph/cli/load_kg_core.py` | Core → graph/vector stores |
-| API client | `trustgraph-base/trustgraph/api/knowledge.py` | Client-side knowledge API |
-| Triple schema | `trustgraph-base/trustgraph/schema/core/primitives.py` | Triple dataclass with `g` field |
-| Provenance generation | `trustgraph-base/trustgraph/provenance/triples.py` | PROV-O triple creation |
-| Librarian | `trustgraph-flow/trustgraph/librarian/librarian.py` | Document storage service |
-| Library tables | `trustgraph-flow/trustgraph/tables/library.py` | Document metadata in Cassandra |
-| Blob store | `trustgraph-flow/trustgraph/librarian/blob_store.py` | S3/MinIO object storage |
-
-## Technical Design
-
-### Change 1: Named Graph Field in Core Storage
-
-#### Cassandra Schema
-
-Extend the `triples` tuple from 6 to 7 elements, adding the graph name:
-
-```
-triples list<tuple<
-    text, boolean,       -- s_val, s_is_uri
-    text, boolean,       -- p_val, p_is_uri
-    text, boolean,       -- o_val, o_is_uri
-    text                 -- graph name (empty string = default graph)
->>
-```
-
-**Migration**: The schema change uses `ALTER TABLE` or is handled by
-creating a new table version. Existing rows with 6-element tuples must be
-handled gracefully on read — if the tuple has 6 elements, treat graph as
-default.
-
-#### Write Path (`add_triples`)
-
-Change `tables/knowledge.py:add_triples()` to include `triple.g`:
-
-```python
-triples = [
-    (
-        *term_to_tuple(v.s), *term_to_tuple(v.p), *term_to_tuple(v.o),
-        v.g or ""
-    )
-    for v in m.triples
-]
-```
-
-#### Read Path (`get_triples`)
-
-Change `tables/knowledge.py:get_triples()` to restore the graph name:
-
-```python
-Triple(
-    s = tuple_to_term(elt[0], elt[1]),
-    p = tuple_to_term(elt[2], elt[3]),
-    o = tuple_to_term(elt[4], elt[5]),
-    g = elt[6] if len(elt) > 6 and elt[6] else None,
-)
-```
-
-The `len(elt) > 6` guard provides backward compatibility with existing
-6-element rows.
-
-#### Core File Format
-
-Extend triple dicts in the `"t"` record to include the graph name:
-
-```python
-# In get_kg_core.py write_triple — each triple dict gains "g" key
-{"s": ..., "p": ..., "o": ..., "g": "urn:graph:source"}
-```
-
-On read (`put_kg_core.py`), treat missing `"g"` key as default graph for
-backward compatibility with old core files.
-
-### Change 2: Provenance Triples in Cores
-
-#### Investigation Required
-
-Before implementation, verify:
-
-1. Whether provenance triples arrive on the `triples-input` topic that the
-   knowledge core store processor already listens on.
-2. If not, which topic they use, and whether the store processor should
-   subscribe to it.
-
-#### If provenance triples already arrive at the store
-
-The only change needed is Change 1 (named graphs) — the provenance triples
-are already being stored, just without their graph name. Once graph names
-are preserved, provenance triples will round-trip correctly.
-
-#### If provenance triples do NOT arrive at the store
-
-Two options:
-
-**Option A — Route provenance to the existing store topic**: Configure the
-flow so provenance triples are published to the same `triples-input` topic.
-This is the simpler approach and keeps the store processor unchanged.
-
-**Option B — Add a subscription**: Add a new `ConsumerSpec` in the store
-processor for the provenance topic. This keeps provenance routing
-independent but adds complexity.
-
-Recommendation: Option A, unless there is a reason provenance triples are
-intentionally kept off the core store topic.
-
-### Change 3: Source Material in Cores
-
-This is the largest change. The goal is that when a core is loaded on a
-fresh instance, provenance links to source material resolve.
-
-#### Architecture
-
-Source material is **not stored in the knowledge core tables**. It lives in
-the librarian (Cassandra `library` keyspace + S3/MinIO blob store) and is
-fetched on demand via the librarian's existing service API.
-
-The knowledge manager acts as a **client of the librarian service** — it
-calls the librarian's request/response API over pub/sub to retrieve document
-metadata and content. It does not access the library's Cassandra tables or
-blob store directly.
-
-#### Transport
-
-The librarian's pub/sub API already handles chunking of large documents.
-This chunking is designed to be websocket-friendly, so library content
-flowing through the API gateway to external clients does not require
-re-chunking. The API gateway remains a transport layer.
-
-```
-Download:
-  Knowledge manager ──pub/sub──► Librarian (fetch metadata + content)
-  Knowledge manager ──pub/sub──► API gateway ──websocket──► Client
-
-Upload:
-  Client ──websocket──► API gateway ──pub/sub──► Knowledge manager
-  Knowledge manager ──pub/sub──► Librarian (store metadata + content)
-```
-
-#### What to Include
-
-The provenance chain links facts → chunks → pages → documents. For the
-chain to resolve, the core must include:
-
-1. **Document metadata** — the library record for each document in the
-   hierarchy (id, kind, title, parent_id, document_type, etc.)
-2. **Document content** — the blob data for each document (original file,
-   extracted text pages, text chunks)
-
-Including the full hierarchy is necessary because:
-- A user viewing provenance needs to traverse fact → chunk → page → document
-- The chunk text is needed to show what text a fact was extracted from
-- The page text provides broader context
-- The original document is needed for full source attribution
-
-#### Size Implications
-
-Source material will significantly increase core file sizes. A rough model:
-
-| Component | Typical size per document |
-|-----------|-------------------------|
-| Triples + embeddings (current) | 1-10 MB |
-| Chunk text (all chunks) | ~same as original document |
-| Page text (all pages) | ~same as original document |
-| Original document (PDF, etc.) | Varies widely (KB to hundreds of MB) |
-
-For a 10 MB PDF, the core could grow from ~5 MB to ~25 MB (original +
-derived text + existing data). For large document sets, cores could become
-very large.
-
-**Decision needed**: Whether to include original documents or just derived
-text (pages + chunks). Including only derived text still allows provenance
-display but loses the ability to serve the original file.
-
-#### New Core File Record Types
-
-Add new msgpack record types for library content:
-
-| Type tag | Payload | Description |
-|----------|---------|-------------|
-| `"lm"` | `{"id", "kind", "title", "parent_id", "document_type", "comments", "tags", "metadata"}` | Library document metadata |
-| `"lb"` | `{"id", "data"}` | Library document blob content (chunked by pub/sub layer) |
-
-These are emitted after the existing `"t"` and `"ge"` records during
-download and processed during upload.
-
-#### Download Path
-
-Extend `KnowledgeManager.get_kg_core()` to:
-
-1. Stream triples and graph embeddings from the core store (existing
-   behavior).
-2. Use the librarian service API to retrieve documents associated with
-   this core ID:
-   a. Fetch the root document metadata and content.
-   b. Use `list-children` to discover child documents (pages, chunks).
-   c. Recursively fetch metadata and content for each child.
-3. Stream each document as `"lm"` (metadata) and `"lb"` (content) records.
-
-The knowledge manager gains the librarian service as a pub/sub dependency.
-Large document content is chunked by the librarian's existing pub/sub
-transport — the knowledge manager receives and forwards these chunks without
-buffering the full blob in memory.
-
-#### Upload Path
-
-Extend `KnowledgeManager.put_kg_core()` to handle the new record types:
-
-1. For `"lm"` records: call the librarian service API to create/update
-   the document metadata.
-2. For `"lb"` records: call the librarian service API to store the
-   document content.
-
-Parent-child relationships are preserved because `parent_id` is stored in
-the metadata. Documents should be processed in hierarchy order (parent
-before child) to satisfy any ordering constraints.
-
-#### Load Path
-
-The load path (`_load_kg_core`) publishes triples and embeddings to Pulsar
-topics for ingestion into graph/vector stores. Source material does not need
-to flow through the load path — it is already in the librarian after the
-upload step and can be accessed directly by services that need it.
-
-No changes to the load path for source material.
-
-#### CLI Changes
-
-**`tg-get-kg-core`**: Add handling for `"lm"` and `"lb"` record types in
-the file writer.
-
-**`tg-put-kg-core`**: Add handling for `"lm"` and `"lb"` record types in
-the file reader. Send library records to the knowledge manager alongside
-triple/embedding records.
-
-#### Associating Documents with Cores
-
-The core ID is `metadata.root`, which is the root document ID from the
-librarian. This provides a natural join: the core's root document and all
-its children (pages, chunks) are the source material for that core.
-
-The librarian's `list-children` API provides the child documents. A
-recursive traversal from the root document collects the full hierarchy.
-
-### API Changes
-
-#### KnowledgeResponse Schema
-
-Add optional fields to `KnowledgeResponse` for library data:
-
-```python
-@dataclass
-class KnowledgeResponse:
-    error: Error | None = None
-    ids: list | None = None
-    eos: bool = False
-    triples: Triples | None = None
-    graph_embeddings: GraphEmbeddings | None = None
-    document_embeddings: DocumentEmbeddings | None = None
-    library_metadata: LibraryMetadata | None = None    # new
-    library_blob: LibraryBlob | None = None            # new
-```
-
-#### New Schema Types
-
-```python
-@dataclass
-class LibraryMetadata:
-    id: str
-    kind: str | None = None
-    title: str | None = None
-    parent_id: str | None = None
-    document_type: str | None = None
-    comments: str | None = None
-    tags: list[str] | None = None
-    metadata: list[Triple] | None = None
-
-@dataclass
-class LibraryBlob:
-    id: str
-    data: bytes
-```
-
-#### Socket API
-
-The existing streaming protocol for `get-kg-core` / `put-kg-core` carries
-these new fields naturally — responses already stream multiple record types.
-
-### Dependencies Between Changes
-
-```
-Change 1 (named graphs)  ◄── Change 2 depends on this
-         │
-         └── Change 2 (provenance triples)
-                      │
-                      └── Change 3 (source material) is independent
-```
-
-Change 1 is a prerequisite for Change 2 (provenance triples use named
-graphs). Change 3 is independent and can be implemented in parallel.
-
-## Security Considerations
-
-- **Workspace isolation**: Core download/upload must respect workspace
-  boundaries. Source material from the librarian must only be included if
-  it belongs to the same workspace as the core. This is already enforced
-  by the existing workspace-scoped queries.
-- **Large blob transfer**: Streaming large documents through the API
-  is handled by the librarian's existing pub/sub chunking, which is
-  designed to be websocket-friendly. No additional chunking layer is
-  needed.
-- **Cross-instance trust**: When uploading a core from an external source,
-  the library content should be treated as untrusted input. Document
-  metadata and blob content should be validated before insertion.
-
-## Performance Considerations
-
-- **Core file size**: Including source material will significantly increase
-  core file sizes. Consider adding a flag to download/upload commands to
-  optionally exclude source material for use cases where only the knowledge
-  graph is needed.
-- **Streaming**: All paths already use streaming (paged Cassandra queries,
-  msgpack record-at-a-time). Library content should follow the same pattern.
-- **Cassandra schema migration**: Changing the tuple width in the `triples`
-  table requires careful handling. Cassandra frozen tuples cannot be altered
-  in place — a migration strategy is needed (see Migration Plan).
-
-## Testing Strategy
-
-- **Unit tests**: Triple round-trip with graph name (write → read →
-  verify `g` field preserved). Backward compatibility with 6-element tuples.
-- **Integration tests**: Full lifecycle — extract with provenance → download
-  core → upload to fresh instance → load → verify provenance chain resolves.
-- **File format tests**: Read old-format core files (no graph name, no
-  library records) and verify they load without error.
-- **Library inclusion tests**: Download core with source material → upload →
-  verify documents accessible through librarian.
-
-## Migration Plan
-
-### Cassandra Schema
-
-The `triples` table stores tuples in a `list<tuple<...>>` column. Cassandra
-does not support altering the type of an existing column. Options:
-
-**Option A — New table**: Create a `triples_v2` table with the 7-element
-tuple. Migrate data from `triples` to `triples_v2`. The read path checks
-both tables during a transition period, then the old table is dropped.
-
-**Option B — Dual read**: Keep the existing table. The read path handles
-both 6-element and 7-element tuples by checking length. New writes use
-7-element tuples. This works if Cassandra accepts variable-length tuples in
-a list — **needs verification**.
-
-**Option C — Separate graph column**: Instead of extending the tuple, add a
-parallel `graphs list<text>` column where `graphs[i]` corresponds to
-`triples[i]`. This avoids tuple migration entirely but requires keeping the
-two lists in sync.
-
-Recommendation: Verify Option B first (simplest). Fall back to Option A if
-Cassandra rejects mixed tuple lengths.
-
-### Core File Format
-
-Backward compatible by design:
-- Old files lack `"g"` in triple dicts and have no `"lm"`/`"lb"` records →
-  handled by defaults.
-- New files read by old code → old code ignores unknown record types (the
-  existing `read_message` raises on unknown types, so this needs a small
-  fix to skip unknown types gracefully).
-
-## Open Questions
-
-1. **Provenance topic routing**: Do provenance triples currently arrive at
-   the `triples-input` topic consumed by the knowledge core store? If not,
-   what topic are they on?
-
-2. **Include original documents?**: Should cores include the original
-   uploaded document (e.g. PDF), or only derived text (pages + chunks)?
-   Including originals makes cores fully self-contained but potentially
-   very large. Excluding them preserves provenance text display but loses
-   the ability to serve the original file.
-
-3. **Optional source material**: Should there be a flag on download/upload
-   to include or exclude source material? This would let users choose
-   between compact cores (knowledge only) and complete cores (knowledge +
-   sources).
-
-4. **Cassandra tuple migration**: Can Cassandra handle mixed-length tuples
-   in a `list<tuple<...>>` column, or is a table migration required?
-
-5. **Document embedding cores**: DE cores are managed alongside KG cores.
-   Do they need the same treatment (source material inclusion)?  The
-   document embeddings reference chunk IDs — the same provenance chain
-   applies.
-
-6. **Core versioning**: Should the core file include a version marker so
-   readers can distinguish old-format from new-format files without
-   trial-and-error parsing?
-
-## References
-
-- Extraction-time provenance: `docs/tech-specs/extraction-time-provenance.md`
-- Query-time explainability: `docs/tech-specs/query-time-explainability.md`
-- Agent explainability: `docs/tech-specs/agent-explainability.md`
-- Data ownership model: `docs/tech-specs/data-ownership-model.md`
--- a/docs/websocket.html
+++ b/docs/websocket.html
--- a/specs/README.md
+++ b/specs/README.md
@@ -28,9 +28,8 @@ specs/
 Location: `specs/api/openapi.yaml`

 The REST API specification documents:
-- **Global Services**: IAM (user management, authentication)
-- **5 Workspace-Scoped Services**: config, flow, librarian, knowledge, collection-management
-- **16 Flow-Scoped Services**: agent, RAG, embeddings, queries, loading, tools
+- **5 Global Services**: config, flow, librarian, knowledge, collection-management
+- **16 Flow-Hosted Services**: agent, RAG, embeddings, queries, loading, tools
 - **Import/Export**: Bulk data operations
 - **Metrics**: Prometheus monitoring

--- a/specs/api/README.md
+++ b/specs/api/README.md
@@ -2,55 +2,6 @@

 This directory contains the modular OpenAPI 3.1 specification for the TrustGraph REST API Gateway.

-## Authentication
-
-Clients authenticate by passing an opaque bearer token in the
-`Authorization` header.  The gateway resolves the token to an
-authenticated identity and an associated workspace.  Tokens are
-obtained via the IAM service (e.g. `tg-login` or `tg-create-api-key`).
-
-## Service Tiers
-
-API services are organized into three tiers based on their scoping:
-
-### Global services
-
-These services are not scoped to a workspace.  They manage
-system-wide resources.
-
-- **IAM** — user management, authentication, API key lifecycle
-
-### Workspace-scoped services
-
-These services operate within the workspace associated with the
-authenticated token.  The workspace is resolved by the gateway from
-the bearer token — it is not passed as an explicit parameter.
-
-- **Config** — configuration management (prompts, token costs, etc.)
-- **Librarian** — document library management
-- **Knowledge** — knowledge graph core management
-- **Collection Management** — collection metadata
-- **Flow** — flow lifecycle and blueprint management
-
-### Flow-scoped services
-
-These services require a `flow` parameter identifying the processing
-flow to use, in addition to the workspace context from the token.
-
-- **Agent** — agentic AI interactions
-- **Document RAG** — retrieval-augmented generation over documents
-- **Graph RAG** — retrieval-augmented generation over knowledge graphs
-- **Text Completion** — LLM text completion
-- **Prompt** — prompt template expansion
-- **Embeddings** — vector embedding generation
-- **SPARQL Query** — SPARQL queries against the knowledge graph
-- **Graph Embeddings** — knowledge graph embedding queries
-- **Document Embeddings** — document embedding queries
-- **Structured Query** — structured data queries
-- **Row Embeddings** — structured data embedding queries
-- **Rows Query** — row-level data queries
-- **Triples Query** — knowledge graph triple queries
-
 ## Structure

 ```
--- a/specs/api/components/schemas/collection/CollectionRequest.yaml
+++ b/specs/api/components/schemas/collection/CollectionRequest.yaml
@@ -14,7 +14,7 @@ properties:
      - delete-collection
    description: |
      Collection operation:
-      - `list-collections`: List collections in the current workspace (resolved from token)
+      - `list-collections`: List collections in workspace
      - `update-collection`: Create or update collection metadata
      - `delete-collection`: Delete collection
  collection:
--- a/specs/api/components/schemas/iam/ApiKeyInput.yaml
+++ b/specs/api/components/schemas/iam/ApiKeyInput.yaml
@@ -1,21 +0,0 @@
-type: object
-description: |
-  API key creation fields.  Used with `create-api-key`.
-properties:
-  user_id:
-    type: string
-    description: User to create the key for.
-    examples:
-      - usr_abc123
-  name:
-    type: string
-    description: Operator-facing label for the key (e.g. "laptop", "CI").
-    examples:
-      - laptop
-  expires:
-    type: string
-    description: |
-      Optional expiry timestamp in ISO-8601 UTC.  Empty string or
-      omitted means the key does not expire.
-    examples:
-      - "2027-01-01T00:00:00Z"
--- a/specs/api/components/schemas/iam/ApiKeyRecord.yaml
+++ b/specs/api/components/schemas/iam/ApiKeyRecord.yaml
@@ -1,38 +0,0 @@
-type: object
-description: API key record returned by IAM operations.
-properties:
-  id:
-    type: string
-    description: Key identifier.
-    examples:
-      - key_xyz789
-  user_id:
-    type: string
-    description: Owning user identifier.
-    examples:
-      - usr_abc123
-  name:
-    type: string
-    description: Operator-facing label.
-    examples:
-      - laptop
-  prefix:
-    type: string
-    description: |
-      First 4 characters of the plaintext key, for identification
-      in listings.  Never enough to reconstruct the key.
-    examples:
-      - tg_a
-  expires:
-    type: string
-    description: Expiry timestamp (ISO-8601 UTC).  Empty if no expiry.
-    examples:
-      - "2027-01-01T00:00:00Z"
-  created:
-    type: string
-    description: Creation timestamp (ISO-8601 UTC).
-    examples:
-      - "2026-01-15T10:30:00Z"
-  last_used:
-    type: string
-    description: Last-used timestamp (ISO-8601 UTC).  Empty if never used.
--- a/specs/api/components/schemas/iam/IamRequest.yaml
+++ b/specs/api/components/schemas/iam/IamRequest.yaml
@@ -1,106 +0,0 @@
-type: object
-description: |
-  IAM service request.
-
-  The IAM service is a **global service** — it operates at system level,
-  not scoped to a specific workspace.  All operations are dispatched via
-  the `operation` field.
-
-  Some operations require admin capabilities; others (like `whoami` and
-  `list-my-workspaces`) are available to any authenticated user.  See
-  the capability vocabulary for details.
-
-  The `actor` field is injected by the gateway and cannot be set by
-  the client.  It identifies the authenticated caller.
-required:
-  - operation
-properties:
-  operation:
-    type: string
-    enum:
-      - whoami
-      - list-my-workspaces
-      - create-user
-      - list-users
-      - get-user
-      - update-user
-      - disable-user
-      - enable-user
-      - delete-user
-      - create-workspace
-      - list-workspaces
-      - get-workspace
-      - update-workspace
-      - disable-workspace
-      - create-api-key
-      - list-api-keys
-      - revoke-api-key
-      - reset-password
-      - rotate-signing-key
-    description: |
-      Operation to perform.
-
-      **Any authenticated user:**
-      - `whoami`: Return the caller's own user record
-      - `list-my-workspaces`: List workspaces the caller has access to
-
-      **User management (requires `users:read`/`users:write`/`users:admin`):**
-      - `create-user`: Create a new user in a workspace
-      - `list-users`: List users (optionally filtered by workspace)
-      - `get-user`: Get a specific user record
-      - `update-user`: Update user fields (name, email, roles, enabled)
-      - `disable-user`: Soft-disable a user and revoke their API keys
-      - `enable-user`: Re-enable a previously disabled user
-      - `delete-user`: Hard-delete a user and their API keys
-
-      **Workspace management (requires `workspaces:admin`):**
-      - `create-workspace`: Create a new workspace
-      - `list-workspaces`: List all workspaces (admin view)
-      - `get-workspace`: Get a specific workspace record
-      - `update-workspace`: Update workspace name or enabled state
-      - `disable-workspace`: Disable workspace and all its users
-
-      **API key management (requires `keys:self` or `keys:admin`):**
-      - `create-api-key`: Create an API key for a user
-      - `list-api-keys`: List API keys for a user
-      - `revoke-api-key`: Revoke (delete) an API key
-
-      **Password management:**
-      - `reset-password`: Admin-initiated password reset (requires `users:admin`)
-
-      **System (requires `iam:admin`):**
-      - `rotate-signing-key`: Rotate the JWT signing key
-  workspace:
-    type: string
-    description: |
-      Workspace scope.  Required on workspace-scoped operations
-      (e.g. `create-user`).  Acts as an optional integrity check on
-      operations that target a user or key — when supplied, the target's
-      home workspace must match.
-
-      Omitted for system-level operations (`list-workspaces`,
-      `rotate-signing-key`) and for identity-resolution operations
-      (`whoami`, `list-my-workspaces`).
-    examples:
-      - default
-      - production
-  user_id:
-    type: string
-    description: |
-      Target user identifier.  Required for operations that act on a
-      specific user: `get-user`, `update-user`, `disable-user`,
-      `enable-user`, `delete-user`, `reset-password`, `list-api-keys`.
-    examples:
-      - usr_abc123
-  user:
-    $ref: './UserInput.yaml'
-  workspace_record:
-    $ref: './WorkspaceInput.yaml'
-  key:
-    $ref: './ApiKeyInput.yaml'
-  key_id:
-    type: string
-    description: |
-      API key identifier.  Required for `revoke-api-key`.
-    examples:
-      - key_xyz789
--- a/specs/api/components/schemas/iam/IamResponse.yaml
+++ b/specs/api/components/schemas/iam/IamResponse.yaml
@@ -1,51 +0,0 @@
-type: object
-description: |
-  IAM service response.  Fields are populated depending on the
-  operation that was invoked.
-properties:
-  user:
-    $ref: './UserRecord.yaml'
-  users:
-    type: array
-    description: List of user records (populated by `list-users`).
-    items:
-      $ref: './UserRecord.yaml'
-  workspace:
-    $ref: './WorkspaceRecord.yaml'
-  workspaces:
-    type: array
-    description: |
-      List of workspace records (populated by `list-workspaces` and
-      `list-my-workspaces`).
-    items:
-      $ref: './WorkspaceRecord.yaml'
-  api_key_plaintext:
-    type: string
-    description: |
-      Plaintext API key.  Returned **once** by `create-api-key`.
-      Never populated on any other operation.  The caller must
-      capture this value — it cannot be retrieved again.
-  api_key:
-    $ref: './ApiKeyRecord.yaml'
-  api_keys:
-    type: array
-    description: List of API key records (populated by `list-api-keys`).
-    items:
-      $ref: './ApiKeyRecord.yaml'
-  temporary_password:
-    type: string
-    description: |
-      Temporary password returned once by `reset-password`.
-  error:
-    type: object
-    description: Error details (present on failure).
-    properties:
-      type:
-        type: string
-        description: |
-          Error type.  One of: `invalid-argument`, `not-found`,
-          `duplicate`, `auth-failed`, `weak-password`, `disabled`,
-          `operation-not-permitted`, `internal-error`.
-      message:
-        type: string
-        description: Human-readable error description (not surfaced to end users).
--- a/specs/api/components/schemas/iam/UserInput.yaml
+++ b/specs/api/components/schemas/iam/UserInput.yaml
@@ -1,42 +0,0 @@
-type: object
-description: |
-  User creation/update fields.  Used with `create-user` and `update-user`.
-  The `password` field is only accepted on `create-user`.
-properties:
-  username:
-    type: string
-    description: Login username.  Unique within a workspace.
-    examples:
-      - alice
-  name:
-    type: string
-    description: Display name.
-    examples:
-      - Alice Smith
-  email:
-    type: string
-    description: Email address.
-    examples:
-      - alice@example.com
-  password:
-    type: string
-    description: |
-      Initial password.  Only accepted on `create-user`; rejected on
-      `update-user`.  Use `reset-password` or `change-password` to
-      modify passwords.
-  roles:
-    type: array
-    items:
-      type: string
-    description: |
-      Roles to assign.  Open-source roles: `reader`, `writer`, `admin`.
-    examples:
-      - - reader
-  enabled:
-    type: boolean
-    description: Whether the user is enabled.
-    default: true
-  must_change_password:
-    type: boolean
-    description: Force password change on next login.
-    default: false
--- a/specs/api/components/schemas/iam/UserRecord.yaml
+++ b/specs/api/components/schemas/iam/UserRecord.yaml
@@ -1,46 +0,0 @@
-type: object
-description: User record returned by IAM operations.
-properties:
-  id:
-    type: string
-    description: Unique user identifier.
-    examples:
-      - usr_abc123
-  workspace:
-    type: string
-    description: User's home workspace.
-    examples:
-      - default
-  username:
-    type: string
-    description: Login username (unique within workspace).
-    examples:
-      - alice
-  name:
-    type: string
-    description: Display name.
-    examples:
-      - Alice Smith
-  email:
-    type: string
-    description: Email address.
-    examples:
-      - alice@example.com
-  roles:
-    type: array
-    items:
-      type: string
-    description: Assigned roles.
-    examples:
-      - - reader
-  enabled:
-    type: boolean
-    description: Whether the user is enabled.
-  must_change_password:
-    type: boolean
-    description: Whether the user must change password on next login.
-  created:
-    type: string
-    description: Creation timestamp (ISO-8601 UTC).
-    examples:
-      - "2026-01-15T10:30:00Z"
--- a/specs/api/components/schemas/iam/WorkspaceInput.yaml
+++ b/specs/api/components/schemas/iam/WorkspaceInput.yaml
@@ -1,23 +0,0 @@
-type: object
-description: |
-  Workspace creation/update fields.  Used with `create-workspace` and
-  `update-workspace`.
-properties:
-  id:
-    type: string
-    description: |
-      Workspace identifier.  Required for all workspace operations.
-      Immutable after creation.
-    examples:
-      - default
-      - production
-  name:
-    type: string
-    description: Human-readable workspace name.
-    examples:
-      - Default Workspace
-      - Production
-  enabled:
-    type: boolean
-    description: Whether the workspace is enabled.
-    default: true
--- a/specs/api/components/schemas/iam/WorkspaceRecord.yaml
+++ b/specs/api/components/schemas/iam/WorkspaceRecord.yaml
@@ -1,21 +0,0 @@
-type: object
-description: Workspace record returned by IAM operations.
-properties:
-  id:
-    type: string
-    description: Workspace identifier.
-    examples:
-      - default
-  name:
-    type: string
-    description: Human-readable workspace name.
-    examples:
-      - Default Workspace
-  enabled:
-    type: boolean
-    description: Whether the workspace is enabled.
-  created:
-    type: string
-    description: Creation timestamp (ISO-8601 UTC).
-    examples:
-      - "2026-01-01T00:00:00Z"
--- a/specs/api/components/schemas/knowledge/KnowledgeRequest.yaml
+++ b/specs/api/components/schemas/knowledge/KnowledgeRequest.yaml
@@ -18,7 +18,7 @@ properties:
      - unload-kg-core
    description: |
      Knowledge core operation:
-      - `list-kg-cores`: List knowledge cores in the current workspace (resolved from token)
+      - `list-kg-cores`: List knowledge cores in workspace
      - `get-kg-core`: Get knowledge core by ID
      - `put-kg-core`: Store triples and/or embeddings
      - `delete-kg-core`: Delete knowledge core by ID
--- a/specs/api/openapi.yaml
+++ b/specs/api/openapi.yaml
@@ -2,44 +2,21 @@ openapi: 3.1.0

 info:
  title: TrustGraph API Gateway
-  version: "2.4"
+  version: "2.2"
  description: |
    REST API for TrustGraph - an AI-powered knowledge graph and RAG system.

    ## Overview

    The API provides access to:
-    - **Global Services**: IAM (user management, authentication)
-    - **Workspace-Scoped Services**: Configuration, flow management, knowledge storage, library management
-    - **Flow-Scoped Services**: AI services like RAG, text completion, embeddings (require running flow)
+    - **Global Services**: Configuration, flow management, knowledge storage, library management
+    - **Flow-Hosted Services**: AI services like RAG, text completion, embeddings (require running flow)
    - **Import/Export**: Bulk data operations for triples, embeddings, entity contexts
    - **WebSocket**: Multiplexed interface for all services

-    ## Authentication
-
-    Clients authenticate by passing an opaque bearer token in the
-    `Authorization` header.  The token is obtained via the IAM service
-    (e.g. `tg-login` or `tg-create-api-key`).
-
-    ```
-    Authorization: Bearer <token>
-    ```
-
-    The gateway resolves the token to an authenticated identity and an
-    associated workspace.  The token is an opaque string — clients must
-    not make assumptions about its internal structure.
-
-    ## Service Tiers
+    ## Service Types

    ### Global Services
-    System-wide services with no workspace scoping:
-    - `iam` - User management, authentication, API key lifecycle
-
-    ### Workspace-Scoped Services
-    Operate within the workspace associated with the authenticated
-    token.  The workspace is resolved by the gateway — it is not
-    passed as an explicit parameter.
-
    Fixed endpoints accessible via `/api/v1/{kind}`:
    - `config` - Configuration management
    - `flow` - Flow lifecycle and blueprints
@@ -47,17 +24,24 @@ info:
    - `knowledge` - Knowledge graph core management
    - `collection-management` - Collection metadata

-    ### Flow-Scoped Services
-    Require a `flow` parameter identifying the processing flow to use.
-    Workspace context comes from the authenticated token.
-
-    Accessed via `/api/v1/flow/{flow}/service/{kind}`:
+    ### Flow-Hosted Services
+    Require a running flow instance; accessed via `/api/v1/flow/{flow}/service/{kind}`:
    - AI services: agent, text-completion, prompt, RAG (document/graph)
    - Embeddings: embeddings, graph-embeddings, document-embeddings
    - Query: triples, rows, nlp-query, structured-query, sparql-query, row-embeddings
    - Data loading: text-load, document-load
    - Utilities: mcp-tool, structured-diag
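The two URL shapes listed above (fixed endpoints for global services, flow-qualified endpoints for flow-hosted services) can be sketched as a small helper. This is an illustrative sketch only: the function name `service_url` and the HTTP base address are assumptions, not part of the spec.

```python
from typing import Optional

# Assumed gateway address for illustration; adjust to your deployment.
BASE_URL = "http://localhost:8088"

# Global services named in the spec (no flow parameter).
GLOBAL_SERVICES = {
    "config", "flow", "librarian", "knowledge", "collection-management",
}


def service_url(kind: str, flow: Optional[str] = None, base: str = BASE_URL) -> str:
    """Build the REST URL for a service (hypothetical helper)."""
    if kind in GLOBAL_SERVICES:
        if flow is not None:
            raise ValueError(f"{kind!r} is a global service and takes no flow")
        return f"{base}/api/v1/{kind}"
    if flow is None:
        raise ValueError(f"{kind!r} is flow-hosted and requires a flow ID")
    return f"{base}/api/v1/flow/{flow}/service/{kind}"


print(service_url("config"))
print(service_url("agent", flow="my-flow"))
```

Anything not in the global set is treated as flow-hosted here, which keeps the sketch short but does not validate the service name.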

+    ## Authentication
+
+    Bearer token authentication is enabled when the `GATEWAY_SECRET` environment variable is set.
+    Include the token in the `Authorization` header:
+    ```
+    Authorization: Bearer <token>
+    ```
+
+    If `GATEWAY_SECRET` is not set, the API runs without authentication (development mode).
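A client-side sketch of the rule above: send the `Authorization` header only when a secret is configured. It assumes the client reads the same `GATEWAY_SECRET` value from its own environment and presents it as the bearer token; the helper name `auth_headers` is hypothetical.

```python
import os
from typing import Dict, Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return request headers, adding a bearer token if a secret is set."""
    headers = {"Content-Type": "application/json"}
    # Assumption: the client sees the same secret the gateway was started with.
    secret = env.get("GATEWAY_SECRET")
    if secret:
        headers["Authorization"] = f"Bearer {secret}"
    return headers


# With no secret configured, no Authorization header is sent.
print(auth_headers({}))
print(auth_headers({"GATEWAY_SECRET": "your-secret-token"}))
```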
+
    ## Field Naming

    All JSON fields use **kebab-case**: `flow-id`, `blueprint-name`, `doc-limit`, etc.
@ -89,20 +73,18 @@ security:
  - bearerAuth: []

 tags:
-  - name: IAM
-    description: Identity and access management (global)
  - name: Config
-    description: Configuration management (workspace-scoped)
+    description: Configuration management (global service)
  - name: Flow
-    description: Flow lifecycle and blueprint management (workspace-scoped)
+    description: Flow lifecycle and blueprint management (global service)
  - name: Librarian
-    description: Document library management (workspace-scoped)
+    description: Document library management (global service)
  - name: Knowledge
-    description: Knowledge graph core management (workspace-scoped)
+    description: Knowledge graph core management (global service)
  - name: Collection
-    description: Collection metadata management (workspace-scoped)
+    description: Collection metadata management (global service)
  - name: Flow Services
-    description: AI and query services hosted within flow instances (flow-scoped)
+    description: Services hosted within flow instances
  - name: Import/Export
    description: Bulk data import and export
  - name: WebSocket
@ -111,11 +93,6 @@ tags:
    description: System metrics and monitoring

 paths:
-  # Global services
-  /api/v1/iam:
-    $ref: './paths/iam.yaml'
-
-  # Workspace-scoped services
  /api/v1/config:
    $ref: './paths/config.yaml'
  /api/v1/flow:
--- a/specs/api/paths/collection-management.yaml
+++ b/specs/api/paths/collection-management.yaml
@ -1,13 +1,10 @@
 post:
  tags:
    - Collection
-  summary: Collection metadata management (workspace-scoped)
+  summary: Collection metadata management
  description: |
    Manage collection metadata for organizing documents and knowledge.

-    This is a **workspace-scoped** service. All operations apply to the
-    workspace associated with the authenticated bearer token.
-
    ## Collections

    Collections are organizational units for grouping:
--- a/specs/api/paths/config.yaml
+++ b/specs/api/paths/config.yaml
@ -1,13 +1,9 @@
 post:
  tags:
    - Config
-  summary: Configuration service (workspace-scoped)
+  summary: Configuration service
  description: |
-    Manage TrustGraph configuration including flows, prompts, token costs,
-    parameter types, and more.
-
-    This is a **workspace-scoped** service. All operations apply to the
-    workspace associated with the authenticated bearer token.
+    Manage TrustGraph configuration including flows, prompts, token costs, parameter types, and more.

    ## Operations

--- a/specs/api/paths/flow.yaml
+++ b/specs/api/paths/flow.yaml
@ -1,13 +1,10 @@
 post:
  tags:
    - Flow
-  summary: Flow lifecycle and blueprint management (workspace-scoped)
+  summary: Flow lifecycle and blueprint management
  description: |
    Manage flow instances and blueprints.

-    This is a **workspace-scoped** service. All operations apply to the
-    workspace associated with the authenticated bearer token.
-
    ## Important Distinction

    The **flow service** manages *running flow instances*.
--- a/specs/api/paths/flow/agent.yaml
+++ b/specs/api/paths/flow/agent.yaml
@ -5,10 +5,6 @@ post:
  description: |
    AI agent that can understand questions, reason about them, and take actions.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Agent Overview

    The agent service provides a conversational AI that:
--- a/specs/api/paths/flow/document-embeddings.yaml
+++ b/specs/api/paths/flow/document-embeddings.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Query document embeddings to find similar text chunks by vector similarity.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Document Embeddings Query Overview

    Find document chunks semantically similar to a query vector:
--- a/specs/api/paths/flow/document-load.yaml
+++ b/specs/api/paths/flow/document-load.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Load binary documents (PDF, Word, etc.) into processing pipeline.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Document Load Overview

    Fire-and-forget binary document loading:
--- a/specs/api/paths/flow/document-rag.yaml
+++ b/specs/api/paths/flow/document-rag.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Retrieval-Augmented Generation over document embeddings.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Document RAG Overview

    Document RAG combines:
--- a/specs/api/paths/flow/embeddings.yaml
+++ b/specs/api/paths/flow/embeddings.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Convert text to embedding vectors for semantic similarity search.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Embeddings Overview

    Embeddings transform text into dense vector representations that:
--- a/specs/api/paths/flow/graph-embeddings.yaml
+++ b/specs/api/paths/flow/graph-embeddings.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Query graph embeddings to find similar entities by vector similarity.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Graph Embeddings Query Overview

    Find entities semantically similar to a query vector:
--- a/specs/api/paths/flow/graph-rag.yaml
+++ b/specs/api/paths/flow/graph-rag.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Retrieval-Augmented Generation over knowledge graph.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Graph RAG Overview

    Graph RAG combines:
--- a/specs/api/paths/flow/mcp-tool.yaml
+++ b/specs/api/paths/flow/mcp-tool.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Execute MCP (Model Context Protocol) tools for agent capabilities.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## MCP Tool Overview

    MCP tools provide agent capabilities through standardized protocol:
--- a/specs/api/paths/flow/nlp-query.yaml
+++ b/specs/api/paths/flow/nlp-query.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Convert natural language questions to structured GraphQL queries.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## NLP Query Overview

    Transforms user questions into executable GraphQL:
--- a/specs/api/paths/flow/prompt.yaml
+++ b/specs/api/paths/flow/prompt.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Execute stored prompt templates with variable substitution.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Prompt Service Overview

    The prompt service enables:
--- a/specs/api/paths/flow/row-embeddings.yaml
+++ b/specs/api/paths/flow/row-embeddings.yaml
@ -4,11 +4,6 @@ post:
  summary: Row Embeddings Query - semantic search on structured data
  description: |
    Query row embeddings to find similar rows by vector similarity on indexed fields.
-
-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    Enables fuzzy/semantic matching on structured data.

    ## Row Embeddings Query Overview
--- a/specs/api/paths/flow/rows.yaml
+++ b/specs/api/paths/flow/rows.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Query structured data using GraphQL for row-oriented data access.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Rows Query Overview

    GraphQL interface to structured data:
--- a/specs/api/paths/flow/sparql-query.yaml
+++ b/specs/api/paths/flow/sparql-query.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Execute a SPARQL 1.1 query against the knowledge graph.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Supported Query Types

    - **SELECT**: Returns variable bindings as a table of results
--- a/specs/api/paths/flow/structured-diag.yaml
+++ b/specs/api/paths/flow/structured-diag.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Analyze and understand structured data (CSV, JSON, XML).

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Structured Diag Overview

    Helps process unknown structured data:
--- a/specs/api/paths/flow/structured-query.yaml
+++ b/specs/api/paths/flow/structured-query.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Ask natural language questions and get results directly.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Structured Query Overview

    Combines two operations in one call:
--- a/specs/api/paths/flow/text-completion.yaml
+++ b/specs/api/paths/flow/text-completion.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Direct text completion using LLM without retrieval augmentation.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Text Completion Overview

    Pure LLM generation for:
--- a/specs/api/paths/flow/text-load.yaml
+++ b/specs/api/paths/flow/text-load.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Load text documents into processing pipeline for indexing and embedding.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Text Load Overview

    Fire-and-forget document loading:
--- a/specs/api/paths/flow/triples.yaml
+++ b/specs/api/paths/flow/triples.yaml
@ -5,10 +5,6 @@ post:
  description: |
    Query knowledge graph using subject-predicate-object patterns.

-    This is a **flow-scoped** service. It requires a flow instance
-    and operates within the workspace associated with the
-    authenticated bearer token.
-
    ## Triples Query Overview

    Query RDF triples with flexible pattern matching:
--- a/specs/api/paths/iam.yaml
+++ b/specs/api/paths/iam.yaml
@ -1,206 +0,0 @@
-post:
-  tags:
-    - IAM
-  summary: IAM service (global)
-  description: |
-    Identity and access management service.
-
-    This is a **global service** — it operates at system level, not
-    scoped to a specific workspace.  The `workspace` field in the
-    request body is used as a scope filter or integrity check on
-    certain operations, not as an addressing component.
-
-    ## Authentication
-
-    Most operations require a bearer token.  The gateway resolves the
-    token to an authenticated identity and injects the `actor` field
-    (the caller's user ID) into the request.  Clients cannot set
-    `actor` — the gateway overwrites it.
-
-    ## Operations by Capability
-
-    ### Any authenticated user
-    - `whoami`: Return the caller's own user record
-    - `list-my-workspaces`: List workspaces the caller has access to.
-      For open-source IAM: returns the caller's home workspace, or all
-      workspaces if the caller has the `admin` role.
-
-    ### User management (`users:read` / `users:write` / `users:admin`)
-    - `create-user`: Create a new user in a workspace
-    - `list-users`: List users, optionally filtered by workspace
-    - `get-user`: Get a user record by ID
-    - `update-user`: Update user fields (name, email, roles, enabled)
-    - `disable-user`: Soft-disable a user and revoke their API keys
-    - `enable-user`: Re-enable a disabled user
-    - `delete-user`: Hard-delete a user and their API keys
-
-    ### Workspace management (`workspaces:admin`)
-    - `create-workspace`: Create a new workspace
-    - `list-workspaces`: List all workspaces (admin view)
-    - `get-workspace`: Get a workspace record
-    - `update-workspace`: Update workspace name or enabled state
-    - `disable-workspace`: Disable a workspace and all its users
-
-    ### API key management (`keys:self` / `keys:admin`)
-    - `create-api-key`: Create an API key (plaintext returned once)
-    - `list-api-keys`: List API keys for a user
-    - `revoke-api-key`: Revoke (delete) an API key
-
-    ### Password management (`users:admin`)
-    - `reset-password`: Admin-initiated password reset (returns temporary password)
-
-    ### System (`iam:admin`)
-    - `rotate-signing-key`: Rotate the JWT signing key
-
-  operationId: iamService
-  security:
-    - bearerAuth: []
-  requestBody:
-    required: true
-    content:
-      application/json:
-        schema:
-          $ref: '../components/schemas/iam/IamRequest.yaml'
-        examples:
-          whoami:
-            summary: Get the caller's own user record
-            value:
-              operation: whoami
-          listMyWorkspaces:
-            summary: List workspaces the caller has access to
-            value:
-              operation: list-my-workspaces
-          createUser:
-            summary: Create a new user
-            value:
-              operation: create-user
-              workspace: default
-              user:
-                username: alice
-                name: Alice Smith
-                email: alice@example.com
-                password: changeme123
-                roles:
-                  - writer
-          listUsers:
-            summary: List users in a workspace
-            value:
-              operation: list-users
-              workspace: default
-          getUser:
-            summary: Get a specific user
-            value:
-              operation: get-user
-              user_id: usr_abc123
-          updateUser:
-            summary: Update a user's roles
-            value:
-              operation: update-user
-              user_id: usr_abc123
-              user:
-                roles:
-                  - admin
-          disableUser:
-            summary: Disable a user
-            value:
-              operation: disable-user
-              user_id: usr_abc123
-          createWorkspace:
-            summary: Create a workspace
-            value:
-              operation: create-workspace
-              workspace_record:
-                id: production
-                name: Production Workspace
-          listWorkspaces:
-            summary: List all workspaces (admin)
-            value:
-              operation: list-workspaces
-          createApiKey:
-            summary: Create an API key
-            value:
-              operation: create-api-key
-              key:
-                user_id: usr_abc123
-                name: laptop
-                expires: "2027-01-01T00:00:00Z"
-          listApiKeys:
-            summary: List a user's API keys
-            value:
-              operation: list-api-keys
-              user_id: usr_abc123
-          revokeApiKey:
-            summary: Revoke an API key
-            value:
-              operation: revoke-api-key
-              key_id: key_xyz789
-          resetPassword:
-            summary: Admin-initiated password reset
-            value:
-              operation: reset-password
-              user_id: usr_abc123
-  responses:
-    '200':
-      description: Successful response
-      content:
-        application/json:
-          schema:
-            $ref: '../components/schemas/iam/IamResponse.yaml'
-          examples:
-            whoami:
-              summary: Caller's user record
-              value:
-                user:
-                  id: usr_abc123
-                  workspace: default
-                  username: alice
-                  name: Alice Smith
-                  email: alice@example.com
-                  roles:
-                    - writer
-                  enabled: true
-                  must_change_password: false
-                  created: "2026-01-15T10:30:00Z"
-            listMyWorkspaces:
-              summary: Workspaces the caller can access
-              value:
-                workspaces:
-                  - id: default
-                    name: Default Workspace
-                    enabled: true
-                    created: "2026-01-01T00:00:00Z"
-            listUsers:
-              summary: Users in a workspace
-              value:
-                users:
-                  - id: usr_abc123
-                    workspace: default
-                    username: alice
-                    name: Alice Smith
-                    roles:
-                      - writer
-                    enabled: true
-                    created: "2026-01-15T10:30:00Z"
-            createApiKey:
-              summary: New API key (plaintext returned once)
-              value:
-                api_key_plaintext: tg_aBcDeFgHiJkLmNoPqRsTuVwXyZ
-                api_key:
-                  id: key_xyz789
-                  user_id: usr_abc123
-                  name: laptop
-                  prefix: tg_a
-                  expires: "2027-01-01T00:00:00Z"
-                  created: "2026-05-29T14:00:00Z"
-            resetPassword:
-              summary: Temporary password (returned once)
-              value:
-                temporary_password: tmp_xK9mQ2pL
-    '400':
-      description: Bad request (unknown operation, missing required fields)
-    '401':
-      $ref: '../components/responses/Unauthorized.yaml'
-    '403':
-      description: Access denied (insufficient capabilities)
-    '500':
-      $ref: '../components/responses/Error.yaml'
--- a/specs/api/paths/knowledge.yaml
+++ b/specs/api/paths/knowledge.yaml
@ -1,13 +1,9 @@
 post:
  tags:
    - Knowledge
-  summary: Knowledge graph core management (workspace-scoped)
+  summary: Knowledge graph core management
  description: |
-    Manage knowledge graph cores - persistent storage of triples and
-    embeddings.
-
-    This is a **workspace-scoped** service. All operations apply to the
-    workspace associated with the authenticated bearer token.
+    Manage knowledge graph cores - persistent storage of triples and embeddings.

    ## Knowledge Cores

--- a/specs/api/paths/librarian.yaml
+++ b/specs/api/paths/librarian.yaml
@ -1,13 +1,9 @@
 post:
  tags:
    - Librarian
-  summary: Document library management (workspace-scoped)
+  summary: Document library management
  description: |
-    Manage document library: add, remove, list documents, and control
-    processing.
-
-    This is a **workspace-scoped** service. All operations apply to the
-    workspace associated with the authenticated bearer token.
+    Manage document library: add, remove, list documents, and control processing.

    ## Document Library

--- a/specs/api/paths/websocket.yaml
+++ b/specs/api/paths/websocket.yaml
@ -26,7 +26,7 @@ get:

    ### Request Message Format

-    **Workspace-Scoped Service Request** (no flow parameter):
+    **Global Service Request** (no flow parameter):
    ```json
    {
      "id": "req-123",
@ -38,7 +38,7 @@ get:
    }
    ```

-    **Flow-Scoped Service Request** (with flow parameter):
+    **Flow-Hosted Service Request** (with flow parameter):
    ```json
    {
      "id": "req-456",
@ -54,7 +54,7 @@ get:
    **Request Fields**:
    - `id` (string, required): Client-generated unique identifier for this request within the session. Used to match responses to requests.
    - `service` (string, required): Service identifier (e.g., "config", "agent", "document-rag"). Same as `{kind}` in REST URLs.
-    - `flow` (string, optional): Flow ID for flow-scoped services. Omit for workspace-scoped and global services.
+    - `flow` (string, optional): Flow ID for flow-hosted services. Omit for global services.
    - `request` (object, required): Service-specific request payload. Same structure as REST API request body.
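The request fields above can be assembled as follows. This is a minimal sketch: `make_request` is a hypothetical helper, and the payloads passed in are placeholders, since the real payload shape is service-specific.

```python
import json
from typing import Any, Dict, Optional


def make_request(
    req_id: str,
    service: str,
    request: Dict[str, Any],
    flow: Optional[str] = None,
) -> str:
    """Serialize a WebSocket request envelope to JSON."""
    message: Dict[str, Any] = {"id": req_id, "service": service}
    # `flow` is included only for flow-hosted services.
    if flow is not None:
        message["flow"] = flow
    message["request"] = request
    return json.dumps(message)


# Global service: no flow field. The empty payload is a placeholder.
print(make_request("req-123", "config", {}))
# Flow-hosted service: flow field present.
print(make_request("req-456", "agent", {}, flow="my-flow"))
```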

    ### Response Message Format
@ -96,14 +96,14 @@ get:
    | `POST /api/v1/config` | `{"service": "config"}` |
    | `POST /api/v1/flow/{flow}/service/agent` | `{"service": "agent", "flow": "my-flow"}` |

-    **Workspace-Scoped Services** (no `flow` parameter, workspace from token):
+    **Global Services** (no `flow` parameter):
    - `config` - Configuration management
    - `flow` - Flow lifecycle and blueprints
    - `librarian` - Document library management
    - `knowledge` - Knowledge graph core management
    - `collection-management` - Collection metadata

-    **Flow-Scoped Services** (require `flow` parameter, workspace from token):
+    **Flow-Hosted Services** (require `flow` parameter):
    - AI services: `agent`, `text-completion`, `prompt`, `document-rag`, `graph-rag`
    - Embeddings: `embeddings`, `graph-embeddings`, `document-embeddings`
    - Query: `triples`, `objects`, `nlp-query`, `structured-query`
@ -146,11 +146,9 @@ get:

    ## Authentication

-    The `/api/v1/socket` endpoint uses in-band authentication.
-    The WebSocket handshake is accepted unconditionally.  After
-    connecting, the client sends a bearer token as the first frame.
-    The gateway resolves the token to an identity and workspace.
-    All subsequent requests operate within that workspace context.
+    When `GATEWAY_SECRET` is set, include the bearer token:
+    - As a query parameter: `ws://localhost:8088/api/v1/socket?token=<token>`
+    - Or in the WebSocket subprotocol header
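The query-parameter form can be built with the standard library so that the token is URL-encoded. This sketch covers only the query-parameter option; the helper name `socket_url` is an assumption, and the host default is taken from the example URL above.

```python
from typing import Optional
from urllib.parse import urlencode


def socket_url(token: Optional[str] = None, host: str = "localhost:8088") -> str:
    """Build the WebSocket URL, appending the token when one is given."""
    url = f"ws://{host}/api/v1/socket"
    if token:
        # urlencode escapes characters that are not safe in a query string.
        url += "?" + urlencode({"token": token})
    return url


print(socket_url())
print(socket_url("your-secret-token"))
```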

    ## Benefits Over REST

--- a/specs/api/security/bearerAuth.yaml
+++ b/specs/api/security/bearerAuth.yaml
@ -3,19 +3,10 @@ scheme: bearer
 description: |
  Bearer token authentication.

-  Clients authenticate by passing an opaque token in the
-  `Authorization` header. The token is treated as an opaque string by
-  clients — its internal structure is a gateway implementation detail
-  and must not be relied upon.
-
-  The gateway resolves the token to an authenticated identity and an
-  associated workspace. All workspace-scoped and flow-scoped operations
-  then execute within that workspace context.
-
-  Tokens are obtained via the IAM service (e.g. `tg-login` or
-  `tg-create-api-key`).
+  The token is set via the `GATEWAY_SECRET` environment variable on the gateway.
+  If `GATEWAY_SECRET` is not set, authentication is disabled (development mode).

  Example:
  ```
-  Authorization: Bearer <token>
+  Authorization: Bearer your-secret-token
  ```
--- a/specs/build-docs.sh
+++ b/specs/build-docs.sh
@ -24,7 +24,7 @@ echo
 # Build WebSocket API documentation
 echo "Building WebSocket API documentation (AsyncAPI)..."
 cd ../websocket
-npx --yes @asyncapi/cli generate fromTemplate asyncapi.yaml @asyncapi/html-template -o /tmp/asyncapi-build -p singleFile=true --force-write --use-new-generator
+npx --yes -p @asyncapi/cli asyncapi generate fromTemplate asyncapi.yaml @asyncapi/html-template -o /tmp/asyncapi-build -p singleFile=true --force-write
 mv /tmp/asyncapi-build/index.html ../../docs/websocket.html
 rm -rf /tmp/asyncapi-build
 echo "✓ WebSocket API docs generated: docs/websocket.html"
--- a/specs/websocket/asyncapi.yaml
+++ b/specs/websocket/asyncapi.yaml
@ -2,7 +2,7 @@ asyncapi: 3.0.0

 info:
  title: TrustGraph WebSocket API
-  version: "2.4"
+  version: "2.2"
  description: |
    WebSocket API for TrustGraph - providing multiplexed, asynchronous access to all services.

@ -14,35 +14,21 @@ info:
    - **Efficient**: Lower overhead than HTTP REST
    - **Streaming**: Real-time progressive responses

-    ## Authentication
-
-    The `/api/v1/socket` endpoint uses **in-band authentication**.
-    The WebSocket handshake is accepted unconditionally. The client
-    must authenticate by sending a bearer token as the first message
-    after connecting.  The gateway resolves the token to an
-    authenticated identity and workspace.
-
-    All subsequent requests execute within the workspace context
-    established by the authentication frame.
-
    ## Protocol Summary

    All messages are JSON with:
    - `id`: Client-generated unique identifier for request/response correlation
    - `service`: Service identifier (e.g., "config", "agent", "document-rag")
-    - `flow`: Optional flow ID for flow-scoped services
+    - `flow`: Optional flow ID for flow-hosted services
    - `request`/`response`: Service-specific payload (identical to REST API schemas)
    - `error`: Error information on failure

-    ## Service Tiers
+    ## Service Types

-    **Global Services** (no workspace scoping):
-    - iam
-
-    **Workspace-Scoped Services** (workspace resolved from token):
+    **Global Services** (no `flow` parameter):
    - config, flow, librarian, knowledge, collection-management

-    **Flow-Scoped Services** (require `flow` parameter, workspace from token):
+    **Flow-Hosted Services** (require `flow` parameter):
    - agent, text-completion, prompt, document-rag, graph-rag
    - embeddings, graph-embeddings, document-embeddings
    - triples, rows, nlp-query, structured-query, sparql-query, structured-diag, row-embeddings
@ -78,14 +64,11 @@ components:
  securitySchemes:
    bearerAuth:
      type: httpApiKey
-      name: Authorization
-      in: header
+      name: token
+      in: query
      description: |
-        Bearer token authentication.  The `/api/v1/socket` endpoint
-        uses in-band authentication: the WebSocket handshake is
-        accepted unconditionally and the client sends a bearer token
-        as the first frame after connecting.  The token is an opaque
-        string obtained via the IAM service.
+        Bearer token authentication, used when `GATEWAY_SECRET` is configured.
+        Include the token as a query parameter: `ws://localhost:8088/api/v1/socket?token=<token>`

  messages:
    ServiceRequest:
--- a/specs/websocket/channels/socket.yaml
+++ b/specs/websocket/channels/socket.yaml
@ -3,16 +3,8 @@ description: |
  Primary WebSocket channel for all TrustGraph services.

  This single channel provides multiplexed access to:
-  - Global services (IAM)
-  - Workspace-scoped services (config, flow, librarian, knowledge, collection-management)
-  - Flow-scoped services (agent, RAG, embeddings, queries, loading, etc.)
-
-  ## Authentication
-
-  The handshake is accepted unconditionally.  The client must send a
-  bearer token as the first frame after connecting (in-band auth).
-  The gateway resolves the token to an identity and workspace.  All
-  subsequent requests execute within that workspace context.
+  - All global services (config, flow, librarian, knowledge, collection-management)
+  - All flow-hosted services (agent, RAG, embeddings, queries, loading, etc.)

  ## Multiplexing

@ -21,17 +13,16 @@ description: |

  ## Message Flow

-  1. Client connects and sends bearer token as first frame (authentication)
-  2. Client sends requests with unique `id`, `service`, optional `flow`, and `request` payload
-  3. Server processes request asynchronously
-  4. Server sends response(s) with matching `id` and either `response` or `error`
-  5. For streaming services, multiple responses may be sent with the same `id`
+  1. Client sends request with unique `id`, `service`, optional `flow`, and `request` payload
+  2. Server processes request asynchronously
+  3. Server sends response(s) with matching `id` and either `response` or `error`
+  4. For streaming services, multiple responses may be sent with the same `id`
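Because responses arrive asynchronously and a streaming service may send several with the same `id`, a client has to group incoming messages by `id`. The sketch below shows one way to do that. The class name `ResponseCollector` and the sample payloads are hypothetical, and it does not detect when a stream has finished, since that is not described here.

```python
import json
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List


class ResponseCollector:
    """Group incoming WebSocket messages by request id."""

    def __init__(self) -> None:
        self.responses: DefaultDict[str, List[Any]] = defaultdict(list)
        self.errors: Dict[str, Any] = {}

    def handle(self, raw: str) -> None:
        message = json.loads(raw)
        req_id = message["id"]
        # Each message carries either `response` or `error`.
        if "error" in message:
            self.errors[req_id] = message["error"]
        else:
            self.responses[req_id].append(message["response"])


collector = ResponseCollector()
# Interleaved messages for two requests; payloads are placeholders.
collector.handle('{"id": "req-1", "response": {"part": 1}}')
collector.handle('{"id": "req-2", "error": {"message": "example failure"}}')
collector.handle('{"id": "req-1", "response": {"part": 2}}')
print(dict(collector.responses), collector.errors)
```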

  ## Service Routing

  Messages are routed to services based on:
  - `service`: Service identifier (required)
-  - `flow`: Flow ID (required for flow-scoped services, omitted for workspace-scoped and global services)
+  - `flow`: Flow ID (required for flow-hosted services, omitted for global services)

 messages:
  request:
--- a/specs/websocket/components/messages/ServiceRequest.yaml
+++ b/specs/websocket/components/messages/ServiceRequest.yaml
@ -9,17 +9,14 @@ description: |
 payload:
  description: Service request envelope with id, service, optional flow, and service-specific request payload
  oneOf:
-    # Global services
-    - $ref: './requests/IamRequest.yaml'
-
-    # Workspace-scoped services (no flow parameter)
+    # Global services (no flow parameter)
    - $ref: './requests/ConfigRequest.yaml'
    - $ref: './requests/FlowRequest.yaml'
    - $ref: './requests/LibrarianRequest.yaml'
    - $ref: './requests/KnowledgeRequest.yaml'
    - $ref: './requests/CollectionManagementRequest.yaml'

-    # Flow-scoped services (require flow parameter)
+    # Flow-hosted services (require flow parameter)
    - $ref: './requests/AgentRequest.yaml'
    - $ref: './requests/DocumentRagRequest.yaml'
    - $ref: './requests/GraphRagRequest.yaml'
--- a/specs/websocket/components/messages/requests/AgentRequest.yaml
+++ b/specs/websocket/components/messages/requests/AgentRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for agent service (flow-scoped service)
+description: WebSocket request for agent service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/CollectionManagementRequest.yaml
+++ b/specs/websocket/components/messages/requests/CollectionManagementRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for collection-management service (workspace-scoped service)
+description: WebSocket request for collection-management service (global service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/ConfigRequest.yaml
+++ b/specs/websocket/components/messages/requests/ConfigRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for config service (workspace-scoped service)
+description: WebSocket request for config service (global service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/DocumentEmbeddingsRequest.yaml
+++ b/specs/websocket/components/messages/requests/DocumentEmbeddingsRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for document-embeddings service (flow-scoped service)
+description: WebSocket request for document-embeddings service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/DocumentLoadRequest.yaml
+++ b/specs/websocket/components/messages/requests/DocumentLoadRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for document-load service (flow-scoped service)
+description: WebSocket request for document-load service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/DocumentRagRequest.yaml
+++ b/specs/websocket/components/messages/requests/DocumentRagRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for document-rag service (flow-scoped service)
+description: WebSocket request for document-rag service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/EmbeddingsRequest.yaml
+++ b/specs/websocket/components/messages/requests/EmbeddingsRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for embeddings service (flow-scoped service)
+description: WebSocket request for embeddings service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/FlowRequest.yaml
+++ b/specs/websocket/components/messages/requests/FlowRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for flow service (workspace-scoped service)
+description: WebSocket request for flow service (global service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/GraphEmbeddingsRequest.yaml
+++ b/specs/websocket/components/messages/requests/GraphEmbeddingsRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for graph-embeddings service (flow-scoped service)
+description: WebSocket request for graph-embeddings service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/GraphRagRequest.yaml
+++ b/specs/websocket/components/messages/requests/GraphRagRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for graph-rag service (flow-scoped service)
+description: WebSocket request for graph-rag service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/IamRequest.yaml
+++ b/specs/websocket/components/messages/requests/IamRequest.yaml
@ -1,25 +0,0 @@
-type: object
-description: WebSocket request for IAM service (global service)
-required:
-  - id
-  - service
-  - request
-properties:
-  id:
-    type: string
-    description: Unique request identifier
-  service:
-    type: string
-    const: iam
-    description: Service identifier for IAM service
-  request:
-    $ref: '../../../../api/components/schemas/iam/IamRequest.yaml'
-examples:
-  - id: req-1
-    service: iam
-    request:
-      operation: whoami
-  - id: req-2
-    service: iam
-    request:
-      operation: list-my-workspaces
--- a/specs/websocket/components/messages/requests/KnowledgeRequest.yaml
+++ b/specs/websocket/components/messages/requests/KnowledgeRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for knowledge service (workspace-scoped service)
+description: WebSocket request for knowledge service (global service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/LibrarianRequest.yaml
+++ b/specs/websocket/components/messages/requests/LibrarianRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for librarian service (workspace-scoped service)
+description: WebSocket request for librarian service (global service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/McpToolRequest.yaml
+++ b/specs/websocket/components/messages/requests/McpToolRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for mcp-tool service (flow-scoped service)
+description: WebSocket request for mcp-tool service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/NlpQueryRequest.yaml
+++ b/specs/websocket/components/messages/requests/NlpQueryRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for nlp-query service (flow-scoped service)
+description: WebSocket request for nlp-query service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/PromptRequest.yaml
+++ b/specs/websocket/components/messages/requests/PromptRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for prompt service (flow-scoped service)
+description: WebSocket request for prompt service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/RowEmbeddingsRequest.yaml
+++ b/specs/websocket/components/messages/requests/RowEmbeddingsRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for row-embeddings service (flow-scoped service)
+description: WebSocket request for row-embeddings service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/RowsRequest.yaml
+++ b/specs/websocket/components/messages/requests/RowsRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for rows service (flow-scoped service)
+description: WebSocket request for rows service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/SparqlQueryRequest.yaml
+++ b/specs/websocket/components/messages/requests/SparqlQueryRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for sparql-query service (flow-scoped service)
+description: WebSocket request for sparql-query service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/StructuredDiagRequest.yaml
+++ b/specs/websocket/components/messages/requests/StructuredDiagRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for structured-diag service (flow-scoped service)
+description: WebSocket request for structured-diag service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/StructuredQueryRequest.yaml
+++ b/specs/websocket/components/messages/requests/StructuredQueryRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for structured-query service (flow-scoped service)
+description: WebSocket request for structured-query service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/TextCompletionRequest.yaml
+++ b/specs/websocket/components/messages/requests/TextCompletionRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for text-completion service (flow-scoped service)
+description: WebSocket request for text-completion service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/TextLoadRequest.yaml
+++ b/specs/websocket/components/messages/requests/TextLoadRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for text-load service (flow-scoped service)
+description: WebSocket request for text-load service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/messages/requests/TriplesRequest.yaml
+++ b/specs/websocket/components/messages/requests/TriplesRequest.yaml
@ -1,5 +1,5 @@
 type: object
-description: WebSocket request for triples service (flow-scoped service)
+description: WebSocket request for triples service (flow-hosted service)
 required:
  - id
  - service
--- a/specs/websocket/components/schemas/RequestEnvelope.yaml
+++ b/specs/websocket/components/schemas/RequestEnvelope.yaml
@ -23,9 +23,8 @@ properties:
    description: |
      Service identifier. Same as {kind} in REST API URLs.

-      Global services: iam
-      Workspace-scoped services: config, flow, librarian, knowledge, collection-management
-      Flow-scoped services: agent, text-completion, prompt, document-rag, graph-rag,
+      Global services: config, flow, librarian, knowledge, collection-management
+      Flow-hosted services: agent, text-completion, prompt, document-rag, graph-rag,
      embeddings, graph-embeddings, document-embeddings, triples, objects,
      nlp-query, structured-query, structured-diag, text-load, document-load, mcp-tool
    examples:
@ -35,12 +34,10 @@ properties:
  flow:
    type: string
    description: |
-      Flow ID for flow-scoped services. Required for services accessed via
+      Flow ID for flow-hosted services. Required for services accessed via
      /api/v1/flow/{flow}/service/{kind} in REST API.

-      Omit for global services (iam) and workspace-scoped services
-      (config, flow, librarian, knowledge, collection-management).
-      Workspace context is resolved from the authenticated token.
+      Omit this field for global services (config, flow, librarian, knowledge, collection-management).
    examples:
      - my-flow
      - production-flow
--- a/tests/unit/test_base/test_cassandra_config.py
+++ b/tests/unit/test_base/test_cassandra_config.py
@ -410,56 +410,3 @@ class TestEdgeCases:
        assert hosts == ['mixed-host']
        assert username is None  # Stays None
        assert password == 'mixed-pass'
-
-
-class TestReplicationFactorParamPath:
-
-    def test_explicit_kwarg(self):
-        with patch.dict(os.environ, {}, clear=True):
-            _, _, _, _, rf = resolve_cassandra_config(
-                replication_factor=3,
-            )
-            assert rf == 3
-
-    def test_kwarg_overrides_env(self):
-        with patch.dict(os.environ, {'CASSANDRA_REPLICATION_FACTOR': '5'}, clear=True):
-            _, _, _, _, rf = resolve_cassandra_config(
-                replication_factor=3,
-            )
-            assert rf == 3
-
-    def test_env_fallback_when_kwarg_none(self):
-        with patch.dict(os.environ, {'CASSANDRA_REPLICATION_FACTOR': '5'}, clear=True):
-            _, _, _, _, rf = resolve_cassandra_config(
-                replication_factor=None,
-            )
-            assert rf == 5
-
-    def test_default_when_no_kwarg_no_env(self):
-        with patch.dict(os.environ, {}, clear=True):
-            _, _, _, _, rf = resolve_cassandra_config()
-            assert rf == 1
-
-    def test_params_dict_path(self):
-        with patch.dict(os.environ, {}, clear=True):
-            params = {'cassandra_replication_factor': 3}
-            _, _, _, _, rf = resolve_cassandra_config(
-                replication_factor=params.get('cassandra_replication_factor'),
-            )
-            assert rf == 3
-
-    def test_params_dict_overrides_env(self):
-        with patch.dict(os.environ, {'CASSANDRA_REPLICATION_FACTOR': '5'}, clear=True):
-            params = {'cassandra_replication_factor': 3}
-            _, _, _, _, rf = resolve_cassandra_config(
-                replication_factor=params.get('cassandra_replication_factor'),
-            )
-            assert rf == 3
-
-    def test_params_dict_missing_falls_to_env(self):
-        with patch.dict(os.environ, {'CASSANDRA_REPLICATION_FACTOR': '5'}, clear=True):
-            params = {}
-            _, _, _, _, rf = resolve_cassandra_config(
-                replication_factor=params.get('cassandra_replication_factor'),
-            )
-            assert rf == 5
--- a/tests/unit/test_base/test_qdrant_config.py
+++ b/tests/unit/test_base/test_qdrant_config.py
@ -1,136 +0,0 @@
-
-import os
-import pytest
-from unittest.mock import patch
-
-from trustgraph.base.qdrant_config import (
-    get_qdrant_defaults,
-    resolve_qdrant_config,
-)
-
-
-class TestGetQdrantDefaults:
-
-    def test_defaults_with_no_env_vars(self):
-        with patch.dict(os.environ, {}, clear=True):
-            defaults = get_qdrant_defaults()
-            assert defaults['url'] == 'http://localhost:6333'
-            assert defaults['api_key'] is None
-            assert defaults['replication_factor'] == 1
-            assert defaults['shard_number'] == 1
-
-    def test_defaults_from_env(self):
-        env = {
-            'QDRANT_URL': 'http://qdrant:6333',
-            'QDRANT_API_KEY': 'secret',
-            'QDRANT_REPLICATION_FACTOR': '3',
-            'QDRANT_SHARD_NUMBER': '5',
-        }
-        with patch.dict(os.environ, env, clear=True):
-            defaults = get_qdrant_defaults()
-            assert defaults['url'] == 'http://qdrant:6333'
-            assert defaults['api_key'] == 'secret'
-            assert defaults['replication_factor'] == 3
-            assert defaults['shard_number'] == 5
-
-
-class TestResolveQdrantConfig:
-
-    def test_defaults(self):
-        with patch.dict(os.environ, {}, clear=True):
-            url, api_key, rf, sn = resolve_qdrant_config()
-            assert url == 'http://localhost:6333'
-            assert api_key is None
-            assert rf == 1
-            assert sn == 1
-
-    def test_explicit_kwargs(self):
-        with patch.dict(os.environ, {}, clear=True):
-            url, api_key, rf, sn = resolve_qdrant_config(
-                url='http://custom:6333',
-                api_key='key',
-                replication_factor=3,
-                shard_number=5,
-            )
-            assert url == 'http://custom:6333'
-            assert api_key == 'key'
-            assert rf == 3
-            assert sn == 5
-
-    def test_kwargs_override_env(self):
-        env = {
-            'QDRANT_URL': 'http://env:6333',
-            'QDRANT_REPLICATION_FACTOR': '10',
-            'QDRANT_SHARD_NUMBER': '10',
-        }
-        with patch.dict(os.environ, env, clear=True):
-            url, _, rf, sn = resolve_qdrant_config(
-                url='http://explicit:6333',
-                replication_factor=3,
-                shard_number=5,
-            )
-            assert url == 'http://explicit:6333'
-            assert rf == 3
-            assert sn == 5
-
-    def test_env_fallback_when_kwargs_none(self):
-        env = {
-            'QDRANT_URL': 'http://env:6333',
-            'QDRANT_REPLICATION_FACTOR': '3',
-            'QDRANT_SHARD_NUMBER': '5',
-        }
-        with patch.dict(os.environ, env, clear=True):
-            url, _, rf, sn = resolve_qdrant_config()
-            assert url == 'http://env:6333'
-            assert rf == 3
-            assert sn == 5
-
-    def test_params_dict_path(self):
-        with patch.dict(os.environ, {}, clear=True):
-            params = {
-                'store_uri': 'http://params:6333',
-                'api_key': 'pkey',
-                'qdrant_replication_factor': 3,
-                'qdrant_shard_number': 5,
-            }
-            url, api_key, rf, sn = resolve_qdrant_config(
-                url=params.get('store_uri'),
-                api_key=params.get('api_key'),
-                replication_factor=params.get('qdrant_replication_factor'),
-                shard_number=params.get('qdrant_shard_number'),
-            )
-            assert url == 'http://params:6333'
-            assert api_key == 'pkey'
-            assert rf == 3
-            assert sn == 5
-
-    def test_params_dict_overrides_env(self):
-        env = {
-            'QDRANT_REPLICATION_FACTOR': '10',
-            'QDRANT_SHARD_NUMBER': '10',
-        }
-        with patch.dict(os.environ, env, clear=True):
-            params = {
-                'qdrant_replication_factor': 3,
-                'qdrant_shard_number': 5,
-            }
-            _, _, rf, sn = resolve_qdrant_config(
-                replication_factor=params.get('qdrant_replication_factor'),
-                shard_number=params.get('qdrant_shard_number'),
-            )
-            assert rf == 3
-            assert sn == 5
-
-    def test_params_dict_missing_falls_to_env(self):
-        env = {
-            'QDRANT_REPLICATION_FACTOR': '3',
-            'QDRANT_SHARD_NUMBER': '5',
-        }
-        with patch.dict(os.environ, env, clear=True):
-            params = {}
-            _, _, rf, sn = resolve_qdrant_config(
-                replication_factor=params.get('qdrant_replication_factor'),
-                shard_number=params.get('qdrant_shard_number'),
-            )
-            assert rf == 3
-            assert sn == 5
--- a/tests/unit/test_cores/test_knowledge_manager.py
+++ b/tests/unit/test_cores/test_knowledge_manager.py
@ -11,12 +11,7 @@ from unittest.mock import AsyncMock, Mock, patch, MagicMock
 from unittest.mock import call

 from trustgraph.cores.knowledge import KnowledgeManager
-from trustgraph.schema import (
-    KnowledgeResponse, Triples, GraphEmbeddings, Metadata, Triple, Term,
-    EntityEmbeddings, IRI, LITERAL,
-    LibraryMetadata, LibraryBlob,
-    LibrarianResponse, DocumentMetadata,
-)
+from trustgraph.schema import KnowledgeResponse, Triples, GraphEmbeddings, Metadata, Triple, Term, EntityEmbeddings, IRI, LITERAL


@pytest.fixture
@ -386,244 +381,3 @@ class TestKnowledgeManagerOtherMethods:
        mock_respond.assert_called_once()
        response = mock_respond.call_args[0][0]
        assert response.error is None
-
-
-class TestKnowledgeManagerLibraryDownload:
-    """Test get_kg_core streaming of library documents."""
-
-    @pytest.fixture
-    def manager_with_librarian(self, mock_flow_config):
-        with patch('trustgraph.cores.knowledge.KnowledgeTableStore'):
-            mock_librarian = AsyncMock()
-            manager = KnowledgeManager(
-                cassandra_host=["localhost"],
-                cassandra_username="test_user",
-                cassandra_password="test_pass",
-                keyspace="test_keyspace",
-                flow_config=mock_flow_config,
-                librarian=mock_librarian,
-            )
-            manager.table_store = AsyncMock()
-            return manager
-
-    @pytest.mark.asyncio
-    async def test_get_kg_core_streams_library_docs(self, manager_with_librarian):
-        mock_request = Mock()
-        mock_request.id = "root-doc"
-        mock_respond = AsyncMock()
-
-        manager_with_librarian.table_store.get_triples = AsyncMock()
-        manager_with_librarian.table_store.get_graph_embeddings = AsyncMock()
-
-        root_meta = DocumentMetadata(
-            id="root-doc", kind="application/pdf", title="Test PDF",
-            document_type="source",
-        )
-        child_meta = DocumentMetadata(
-            id="chunk-1", kind="text/plain", title="Chunk 1",
-            parent_id="root-doc", document_type="chunk",
-        )
-
-        manager_with_librarian.librarian.fetch_document_metadata.return_value = root_meta
-        manager_with_librarian.librarian.request.return_value = LibrarianResponse(
-            document_metadatas=[child_meta],
-        )
-        manager_with_librarian.librarian.fetch_document_content.side_effect = [
-            b"cm9vdCBjb250ZW50",
-            b"Y2h1bmsgY29udGVudA==",
-        ]
-
-        await manager_with_librarian.get_kg_core(
-            mock_request, mock_respond, "test-user"
-        )
-
-        responses = [c[0][0] for c in mock_respond.call_args_list]
-
-        lm_responses = [r for r in responses if r.library_metadata is not None]
-        lb_responses = [r for r in responses if r.library_blob is not None]
-        eos_responses = [r for r in responses if r.eos is True]
-
-        assert len(lm_responses) == 2
-        assert lm_responses[0].library_metadata.id == "root-doc"
-        assert lm_responses[0].library_metadata.document_type == "source"
-        assert lm_responses[1].library_metadata.id == "chunk-1"
-        assert lm_responses[1].library_metadata.parent_id == "root-doc"
-
-        assert len(lb_responses) == 2
-        assert lb_responses[0].library_blob.id == "root-doc"
-        assert lb_responses[0].library_blob.data == b"cm9vdCBjb250ZW50"
-        assert lb_responses[1].library_blob.id == "chunk-1"
-
-        assert len(eos_responses) == 1
-
-    @pytest.mark.asyncio
-    async def test_get_kg_core_no_librarian_skips_library(self, mock_flow_config):
-        with patch('trustgraph.cores.knowledge.KnowledgeTableStore'):
-            manager = KnowledgeManager(
-                cassandra_host=["localhost"],
-                cassandra_username="u", cassandra_password="p",
-                keyspace="ks", flow_config=mock_flow_config,
-            )
-            manager.table_store = AsyncMock()
-            manager.table_store.get_triples = AsyncMock()
-            manager.table_store.get_graph_embeddings = AsyncMock()
-
-        mock_request = Mock()
-        mock_request.id = "doc-1"
-        mock_respond = AsyncMock()
-
-        await manager.get_kg_core(mock_request, mock_respond, "w")
-
-        responses = [c[0][0] for c in mock_respond.call_args_list]
-        assert all(r.library_metadata is None for r in responses)
-        assert all(r.library_blob is None for r in responses)
-
-    @pytest.mark.asyncio
-    async def test_get_kg_core_librarian_metadata_failure_is_graceful(
-        self, manager_with_librarian,
-    ):
-        mock_request = Mock()
-        mock_request.id = "missing-doc"
-        mock_respond = AsyncMock()
-
-        manager_with_librarian.table_store.get_triples = AsyncMock()
-        manager_with_librarian.table_store.get_graph_embeddings = AsyncMock()
-        manager_with_librarian.librarian.fetch_document_metadata.side_effect = (
-            RuntimeError("not found")
-        )
-
-        await manager_with_librarian.get_kg_core(
-            mock_request, mock_respond, "test-user"
-        )
-
-        responses = [c[0][0] for c in mock_respond.call_args_list]
-        assert all(r.library_metadata is None for r in responses)
-        assert any(r.eos for r in responses)
-
-
-class TestKnowledgeManagerLibraryUpload:
-    """Test put_kg_core handling of library metadata and blob records."""
-
-    @pytest.fixture
-    def manager_with_librarian(self, mock_flow_config):
-        with patch('trustgraph.cores.knowledge.KnowledgeTableStore'):
-            mock_librarian = AsyncMock()
-            manager = KnowledgeManager(
-                cassandra_host=["localhost"],
-                cassandra_username="u", cassandra_password="p",
-                keyspace="ks", flow_config=mock_flow_config,
-                librarian=mock_librarian,
-            )
-            manager.table_store = AsyncMock()
-            return manager
-
-    @pytest.mark.asyncio
-    async def test_put_metadata_then_blob_calls_librarian(
-        self, manager_with_librarian,
-    ):
-        mock_respond = AsyncMock()
-        manager_with_librarian.librarian.request.return_value = LibrarianResponse()
-
-        # First call: metadata
-        req_meta = Mock()
-        req_meta.triples = None
-        req_meta.graph_embeddings = None
-        req_meta.library_metadata = LibraryMetadata(
-            id="doc-1", kind="application/pdf", title="Test",
-            document_type="source",
-        )
-        req_meta.library_blob = None
-        await manager_with_librarian.put_kg_core(req_meta, mock_respond, "ws")
-
-        # Metadata is buffered, librarian not called yet
-        manager_with_librarian.librarian.request.assert_not_called()
-
-        # Second call: blob
-        req_blob = Mock()
-        req_blob.triples = None
-        req_blob.graph_embeddings = None
-        req_blob.library_metadata = None
-        req_blob.library_blob = LibraryBlob(
-            id="doc-1", data=b"dGVzdA==",
-        )
-        await manager_with_librarian.put_kg_core(req_blob, mock_respond, "ws")
-
-        # Now librarian should have been called with add-document
-        manager_with_librarian.librarian.request.assert_called_once()
-        call_args = manager_with_librarian.librarian.request.call_args[0][0]
-        assert call_args.operation == "add-document"
-        assert call_args.document_metadata.id == "doc-1"
-        assert call_args.document_metadata.kind == "application/pdf"
-        assert call_args.content == b"dGVzdA=="
-
-    @pytest.mark.asyncio
-    async def test_put_child_document_uses_add_child_operation(
-        self, manager_with_librarian,
-    ):
-        mock_respond = AsyncMock()
-        manager_with_librarian.librarian.request.return_value = LibrarianResponse()
-
-        req_meta = Mock()
-        req_meta.triples = None
-        req_meta.graph_embeddings = None
-        req_meta.library_metadata = LibraryMetadata(
-            id="chunk-1", kind="text/plain", title="Chunk",
-            parent_id="doc-1", document_type="chunk",
-        )
-        req_meta.library_blob = None
-        await manager_with_librarian.put_kg_core(req_meta, mock_respond, "ws")
-
-        req_blob = Mock()
-        req_blob.triples = None
-        req_blob.graph_embeddings = None
-        req_blob.library_metadata = None
-        req_blob.library_blob = LibraryBlob(id="chunk-1", data=b"Y2h1bms=")
-        await manager_with_librarian.put_kg_core(req_blob, mock_respond, "ws")
-
-        call_args = manager_with_librarian.librarian.request.call_args[0][0]
-        assert call_args.operation == "add-child-document"
-        assert call_args.document_metadata.parent_id == "doc-1"
-
-    @pytest.mark.asyncio
-    async def test_put_blob_without_metadata_logs_warning(
-        self, manager_with_librarian,
-    ):
-        mock_respond = AsyncMock()
-
-        req_blob = Mock()
-        req_blob.triples = None
-        req_blob.graph_embeddings = None
-        req_blob.library_metadata = None
-        req_blob.library_blob = LibraryBlob(id="orphan", data=b"data")
-        await manager_with_librarian.put_kg_core(req_blob, mock_respond, "ws")
-
-        # Librarian should not be called for orphan blob
-        manager_with_librarian.librarian.request.assert_not_called()
-
-    @pytest.mark.asyncio
-    async def test_put_existing_document_is_graceful(
-        self, manager_with_librarian,
-    ):
-        mock_respond = AsyncMock()
-        manager_with_librarian.librarian.request.side_effect = RuntimeError(
-            "Document already exists"
-        )
-
-        req_meta = Mock()
-        req_meta.triples = None
-        req_meta.graph_embeddings = None
-        req_meta.library_metadata = LibraryMetadata(
-            id="doc-1", kind="application/pdf", title="Test",
-            document_type="source",
-        )
-        req_meta.library_blob = None
-        await manager_with_librarian.put_kg_core(req_meta, mock_respond, "ws")
-
-        req_blob = Mock()
-        req_blob.triples = None
-        req_blob.graph_embeddings = None
-        req_blob.library_metadata = None
-        req_blob.library_blob = LibraryBlob(id="doc-1", data=b"data")
-        await manager_with_librarian.put_kg_core(req_blob, mock_respond, "ws")
-
-        # Should not raise — "already exists" is handled gracefully
--- a/tests/unit/test_decoding/test_pdf_decoder.py
+++ b/tests/unit/test_decoding/test_pdf_decoder.py
@ -49,7 +49,7 @@ class TestPdfDecoderProcessor(IsolatedAsyncioTestCase):
    async def test_on_message_success(self, mock_pdf_loader_class, mock_producer, mock_consumer):
        """Test successful PDF processing"""
        # Mock PDF content
-        pdf_content = b"%PDF-1.7\nfake pdf content"
+        pdf_content = b"fake pdf content"
        pdf_base64 = base64.b64encode(pdf_content).decode('utf-8')

        # Mock PyPDFLoader
@ -88,55 +88,13 @@ class TestPdfDecoderProcessor(IsolatedAsyncioTestCase):
        # Verify triples were sent for each page (provenance)
        assert mock_triples_flow.send.call_count == 2

-    @patch('trustgraph.base.librarian_client.Consumer')
-    @patch('trustgraph.base.librarian_client.Producer')
-    @patch('trustgraph.decoding.pdf.pdf_decoder.PyPDFLoader')
-    @patch('trustgraph.base.async_processor.AsyncProcessor', MockAsyncProcessor)
-    async def test_on_message_rejects_librarian_content_that_is_not_pdf(self, mock_pdf_loader_class, mock_producer, mock_consumer):
-        """Test rejecting non-PDF content before invoking the PDF loader"""
-        html_content = b"<html><body>Not found</body></html>"
-        html_base64 = base64.b64encode(html_content)
-
-        mock_metadata = Metadata(id="test-doc")
-        mock_document = Document(metadata=mock_metadata, document_id="doc-123")
-        mock_msg = MagicMock()
-        mock_msg.value.return_value = mock_document
-
-        mock_output_flow = AsyncMock()
-        mock_triples_flow = AsyncMock()
-        mock_flow = MagicMock(side_effect=lambda name: {
-            "output": mock_output_flow,
-            "triples": mock_triples_flow,
-        }.get(name))
-        mock_flow.librarian.fetch_document_metadata = AsyncMock(
-            return_value=MagicMock(kind="application/pdf")
-        )
-        mock_flow.librarian.fetch_document_content = AsyncMock(
-            return_value=html_base64
-        )
-        mock_flow.librarian.save_child_document = AsyncMock()
-
-        config = {
-            'id': 'test-pdf-decoder',
-            'taskgroup': AsyncMock()
-        }
-
-        processor = Processor(**config)
-
-        await processor.on_message(mock_msg, None, mock_flow)
-
-        mock_pdf_loader_class.assert_not_called()
-        mock_output_flow.send.assert_not_called()
-        mock_triples_flow.send.assert_not_called()
-        mock_flow.librarian.save_child_document.assert_not_called()
-
    @patch('trustgraph.base.librarian_client.Consumer')
    @patch('trustgraph.base.librarian_client.Producer')
    @patch('trustgraph.decoding.pdf.pdf_decoder.PyPDFLoader')
    @patch('trustgraph.base.async_processor.AsyncProcessor', MockAsyncProcessor)
    async def test_on_message_empty_pdf(self, mock_pdf_loader_class, mock_producer, mock_consumer):
        """Test handling of empty PDF"""
-        pdf_content = b"%PDF-1.7\nfake pdf content"
+        pdf_content = b"fake pdf content"
        pdf_base64 = base64.b64encode(pdf_content).decode('utf-8')

        mock_loader = MagicMock()
@ -168,7 +126,7 @@ class TestPdfDecoderProcessor(IsolatedAsyncioTestCase):
    @patch('trustgraph.base.async_processor.AsyncProcessor', MockAsyncProcessor)
    async def test_on_message_unicode_content(self, mock_pdf_loader_class, mock_producer, mock_consumer):
        """Test handling of unicode content in PDF"""
-        pdf_content = b"%PDF-1.7\nfake pdf content"
+        pdf_content = b"fake pdf content"
        pdf_base64 = base64.b64encode(pdf_content).decode('utf-8')

        mock_loader = MagicMock()
--- a/tests/unit/test_embeddings/test_huggingface_dynamic_model.py
+++ b/tests/unit/test_embeddings/test_huggingface_dynamic_model.py
@ -18,7 +18,7 @@ from trustgraph.embeddings.hf.hf import Processor
 class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
    """Test HuggingFace dynamic model loading and caching"""

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_default_model_loaded_on_init(self, mock_embeddings_init, mock_async_init, mock_hf_class):
@ -39,7 +39,7 @@ class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
        assert processor.cached_model_name == "test-model"
        assert processor.embeddings is not None

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_model_caching_avoids_reload(self, mock_embeddings_init, mock_async_init, mock_hf_class):
@ -63,7 +63,7 @@ class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
        mock_hf_class.assert_not_called()
        assert processor.cached_model_name == "test-model"

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_model_reload_on_name_change(self, mock_embeddings_init, mock_async_init, mock_hf_class):
@ -84,7 +84,7 @@ class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
        mock_hf_class.assert_called_once_with(model_name="different-model")
        assert processor.cached_model_name == "different-model"

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_on_embeddings_uses_default_model(self, mock_embeddings_init, mock_async_init, mock_hf_class):
@ -107,7 +107,7 @@ class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
        assert processor.cached_model_name == "test-model"  # Still using default
        assert result == [[0.1, 0.2, 0.3, 0.4, 0.5]]

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_on_embeddings_uses_specified_model(self, mock_embeddings_init, mock_async_init, mock_hf_class):
@ -130,7 +130,7 @@ class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
        assert processor.cached_model_name == "custom-model"
        mock_hf_instance.embed_documents.assert_called_once_with(["test text"])

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_multiple_model_switches(self, mock_embeddings_init, mock_async_init, mock_hf_class):
@ -164,7 +164,7 @@ class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
        assert call_count_after_b == initial_call_count + 2  # Reload for model-b
        assert call_count_after_a_again == initial_call_count + 3  # Reload back to model-a

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_none_model_uses_default(self, mock_embeddings_init, mock_async_init, mock_hf_class):
@ -187,7 +187,7 @@ class TestHuggingFaceDynamicModelLoading(IsolatedAsyncioTestCase):
        assert mock_hf_class.call_count == initial_count
        assert processor.cached_model_name == "test-model"

-    @patch('langchain_huggingface.HuggingFaceEmbeddings')
+    @patch('trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings')
    @patch('trustgraph.base.async_processor.AsyncProcessor.__init__')
    @patch('trustgraph.base.embeddings_service.EmbeddingsService.__init__')
    async def test_initialization_without_model_uses_default(self, mock_embeddings_init, mock_async_init, mock_hf_class):
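
Note on the patch-target change above: every decorator now patches `trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings` rather than `langchain_huggingface.HuggingFaceEmbeddings`. This is consistent with the usual `unittest.mock` rule of patching a name where it is looked up, which applies if the module under test binds the class with a `from ... import ...` statement (an assumption; the module source is not shown in this diff). The following is a minimal sketch of that rule using two hypothetical in-memory modules, `fake_lib` and `fake_consumer`, which are stand-ins and not part of TrustGraph.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for a third-party library module.
fake_lib = types.ModuleType("fake_lib")


class Embeddings:
    def __init__(self, model_name):
        self.model_name = model_name


fake_lib.Embeddings = Embeddings
sys.modules["fake_lib"] = fake_lib

# Hypothetical consumer module that binds the class name at import time,
# the way "from fake_lib import Embeddings" would in a real source file.
fake_consumer = types.ModuleType("fake_consumer")
exec(
    "from fake_lib import Embeddings\n"
    "def load(name):\n"
    "    return Embeddings(model_name=name)\n",
    fake_consumer.__dict__,
)
sys.modules["fake_consumer"] = fake_consumer

# Patching the defining module does not affect the name that the
# consumer already holds, so the mock is never called.
with patch("fake_lib.Embeddings") as mock_cls:
    fake_consumer.load("test-model")
    patched_at_source_hit = mock_cls.called

# Patching the name inside the consumer's namespace intercepts the call.
with patch("fake_consumer.Embeddings") as mock_cls:
    fake_consumer.load("test-model")
    patched_at_use_hit = mock_cls.called
    mock_cls.assert_called_once_with(model_name="test-model")
```

In this sketch only the second form intercepts the constructor call, which is why assertions such as `assert_called_once_with(model_name=...)` depend on the patch target matching the namespace where the name is resolved.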
--- a/tests/unit/test_prompt_manager.py
+++ b/tests/unit/test_prompt_manager.py
@ -7,7 +7,7 @@ including template rendering, term merging, JSON validation, and error handling.

 import pytest
 import json
-from unittest.mock import AsyncMock
+from unittest.mock import AsyncMock, MagicMock, patch

 from trustgraph.template.prompt_manager import PromptManager, PromptConfiguration, Prompt

@ -344,42 +344,6 @@ class TestPromptManager:
        assert pm.terms == {}  # Default empty terms
        assert len(pm.prompts) == 0

-    def test_load_config_does_not_swallow_keyboard_interrupt(self, monkeypatch):
-        """KeyboardInterrupt should propagate out of config parsing."""
-        pm = PromptManager()
-
-        def interrupt(_value):
-            raise KeyboardInterrupt
-
-        monkeypatch.setattr("trustgraph.template.prompt_manager.json.loads", interrupt)
-
-        with pytest.raises(KeyboardInterrupt):
-            pm.load_config({"system": json.dumps("Test")})
-
-    @pytest.mark.asyncio
-    async def test_json_parse_does_not_swallow_system_exit(self):
-        """SystemExit should propagate out of JSON response parsing."""
-        pm = PromptManager()
-        config = {
-            "system": json.dumps("Test"),
-            "template-index": json.dumps(["json_response"]),
-            "template.json_response": json.dumps({
-                "prompt": "Generate JSON",
-                "response-type": "json"
-            })
-        }
-        pm.load_config(config)
-
-        def exit_parse(_text):
-            raise SystemExit(2)
-
-        pm.parse_json = exit_parse
-        mock_llm = AsyncMock()
-        mock_llm.return_value = "{}"
-
-        with pytest.raises(SystemExit):
-            await pm.invoke("json_response", {}, mock_llm)
-

 @pytest.mark.unit
 class TestPromptManagerJsonl:
--- a/tests/unit/test_python_api_client.py
+++ b/tests/unit/test_python_api_client.py
@ -8,7 +8,6 @@ import pytest
 from unittest.mock import Mock, patch, MagicMock, call
 import json

-from trustgraph.api.socket_client import SocketClient
 from trustgraph.api import (
    Api,
    Triple,
@ -223,82 +222,6 @@ class TestSocketClient:
        for method in expected_methods:
            assert hasattr(flow_instance, method), f"Missing method: {method}"

-    def test_socket_client_close_does_not_swallow_base_exceptions(self):
-        """Test close cleanup does not suppress process-level interrupts."""
-
-        class InterruptingLoop:
-            def is_closed(self):
-                return False
-
-            def run_until_complete(self, awaitable):
-                if hasattr(awaitable, "close"):
-                    awaitable.close()
-                raise SystemExit("stop")
-
-        socket = SocketClient(url="http://test/", timeout=60, token=None)
-        socket._loop = InterruptingLoop()
-
-        with pytest.raises(SystemExit):
-            socket.close()
-
-    @pytest.mark.parametrize(
-        ("generator_method", "async_method"),
-        [
-            ("_streaming_generator", "_send_request_async_streaming"),
-            ("_streaming_generator_raw", "_send_request_async_streaming_raw"),
-        ],
-    )
-    def test_socket_client_streaming_cleanup_does_not_swallow_base_exceptions(
-        self, generator_method, async_method
-    ):
-        """Test streaming cleanup does not suppress process-level interrupts."""
-
-        class FakeAsyncGenerator:
-            def __anext__(self):
-                return "next"
-
-            def aclose(self):
-                return "close"
-
-        class InterruptingLoop:
-            def run_until_complete(self, awaitable):
-                if awaitable == "next":
-                    raise StopAsyncIteration
-                if awaitable == "close":
-                    raise SystemExit("stop")
-                raise AssertionError(f"unexpected awaitable: {awaitable!r}")
-
-        socket = SocketClient(url="http://test/", timeout=60, token=None)
-        setattr(socket, async_method, lambda *args, **kwargs: FakeAsyncGenerator())
-        generator = getattr(socket, generator_method)(
-            "agent", "default", {}, InterruptingLoop()
-        )
-
-        with pytest.raises(SystemExit):
-            next(generator)
-
-    @pytest.mark.asyncio
-    async def test_socket_client_reader_does_not_swallow_base_exceptions(self):
-        """Test reader error fanout does not suppress process-level interrupts."""
-
-        class FailingSocket:
-            def __aiter__(self):
-                return self
-
-            async def __anext__(self):
-                raise ValueError("reader failed")
-
-        class InterruptingQueue:
-            async def put(self, message):
-                raise SystemExit("stop")
-
-        socket = SocketClient(url="http://test/", timeout=60, token=None)
-        socket._socket = FailingSocket()
-        socket._pending = {"req-1": InterruptingQueue()}
-
-        with pytest.raises(SystemExit):
-            await socket._reader()
-

 class TestBulkClient:
    """Test bulk operations client"""
--- a/tests/unit/test_query/test_ontology_monitoring.py
+++ b/tests/unit/test_query/test_ontology_monitoring.py
@ -1,56 +0,0 @@
-"""
-Tests for ontology monitoring metrics.
-"""
-
-from trustgraph.query.ontology.monitoring import (
-    PerformanceMonitor,
-    _extract_metric_label,
-)
-
-
-def test_extract_metric_label_reads_unquoted_label_value():
-    metric_name = "cache_requests_total{cache_type=entity,component=ontology}"
-
-    assert _extract_metric_label(metric_name, "cache_type") == "entity"
-
-
-def test_extract_metric_label_reads_quoted_label_value():
-    metric_name = 'cache_requests_total{cache_type="entity",component="ontology"}'
-
-    assert _extract_metric_label(metric_name, "cache_type") == "entity"
-
-
-def test_extract_metric_label_returns_none_when_label_missing():
-    metric_name = "cache_requests_total{component=ontology}"
-
-    assert _extract_metric_label(metric_name, "cache_type") is None
-
-
-def test_performance_report_ignores_counters_without_cache_type_label():
-    monitor = PerformanceMonitor({"enabled": False})
-    monitor.metrics_collector.increment(
-        "cache_requests_total",
-        labels={"component": "ontology"},
-    )
-    monitor.metrics_collector.increment(
-        "cache_type=not_a_label",
-        labels={"component": "ontology"},
-    )
-    monitor.metrics_collector.increment(
-        "cache_requests_total",
-        labels={"cache_type": "entity"},
-    )
-    monitor.metrics_collector.increment(
-        "cache_hits_total",
-        labels={"cache_type": "entity"},
-    )
-
-    report = monitor.get_performance_report()
-
-    assert report["cache_performance"] == {
-        "entity": {
-            "hit_rate": 1.0,
-            "total_requests": 1.0,
-            "total_hits": 1.0,
-        }
-    }
--- a/tests/unit/test_query/test_rows_cassandra_query.py
+++ b/tests/unit/test_query/test_rows_cassandra_query.py
@ -333,8 +333,8 @@ class TestUnifiedTableQueries:
    """Test queries against the unified rows table"""

    @pytest.mark.asyncio
-    @patch('trustgraph.query.rows.cassandra.service.async_execute_paged', new_callable=AsyncMock)
-    async def test_query_with_index_match(self, mock_async_execute_paged):
+    @patch('trustgraph.query.rows.cassandra.service.async_execute', new_callable=AsyncMock)
+    async def test_query_with_index_match(self, mock_async_execute):
        """Test query execution with matching index"""
        processor = MagicMock()
        processor.session = MagicMock()
@ -344,10 +344,10 @@ class TestUnifiedTableQueries:
        processor.find_matching_index = Processor.find_matching_index.__get__(processor, Processor)
        processor.query_cassandra = Processor.query_cassandra.__get__(processor, Processor)

-        # Mock async_execute_paged to return test data (list of pages)
+        # Mock async_execute to return test data
        mock_row = MagicMock()
        mock_row.data = {"id": "123", "name": "Test Product", "category": "electronics"}
-        mock_async_execute_paged.return_value = [[mock_row]]
+        mock_async_execute.return_value = [mock_row]

        schema = RowSchema(
            name="products",
@ -370,10 +370,10 @@ class TestUnifiedTableQueries:

        # Verify Cassandra was connected and queried
        processor.connect_cassandra.assert_called_once()
-        mock_async_execute_paged.assert_called_once()
+        mock_async_execute.assert_called_once()

        # Verify query structure - should query unified rows table
-        call_args = mock_async_execute_paged.call_args
+        call_args = mock_async_execute.call_args
        query = call_args[0][1]
        params = call_args[0][2]

@ -394,8 +394,8 @@ class TestUnifiedTableQueries:
        assert results[0]["category"] == "electronics"

    @pytest.mark.asyncio
-    @patch('trustgraph.query.rows.cassandra.service.async_scan', new_callable=AsyncMock)
-    async def test_query_without_index_match(self, mock_async_scan):
+    @patch('trustgraph.query.rows.cassandra.service.async_execute', new_callable=AsyncMock)
+    async def test_query_without_index_match(self, mock_async_execute):
        """Test query execution without matching index (scan mode)"""
        processor = MagicMock()
        processor.session = MagicMock()
@ -406,10 +406,12 @@ class TestUnifiedTableQueries:
        processor._matches_filters = Processor._matches_filters.__get__(processor, Processor)
        processor.query_cassandra = Processor.query_cassandra.__get__(processor, Processor)

-        # Mock async_scan to return filtered test data
+        # Mock async_execute to return test data
        mock_row1 = MagicMock()
        mock_row1.data = {"id": "1", "name": "Product A", "price": "100"}
-        mock_async_scan.return_value = [mock_row1]
+        mock_row2 = MagicMock()
+        mock_row2.data = {"id": "2", "name": "Product B", "price": "200"}
+        mock_async_execute.return_value = [mock_row1, mock_row2]

        schema = RowSchema(
            name="products",
@ -430,16 +432,13 @@ class TestUnifiedTableQueries:
            limit=10
        )

-        # Verify async_scan was called
-        mock_async_scan.assert_called_once()
-
-        # Verify query structure
-        call_args = mock_async_scan.call_args
+        # Query should use ALLOW FILTERING for scan
+        call_args = mock_async_execute.call_args
        query = call_args[0][1]

        assert "ALLOW FILTERING" in query

-        # Should return filtered results
+        # Should post-filter results
        assert len(results) == 1
        assert results[0]["name"] == "Product A"

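
Note on the scan-mode test above: it now mocks a single `async_execute` call that returns two rows and expects one row back after post-filtering. The sketch below illustrates that testing pattern in isolation. The function `scan_with_post_filter`, its parameters, and the query string are hypothetical and written only for this example; they are not the actual `Processor.query_cassandra` implementation.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def scan_with_post_filter(execute, session, query, params, filters, limit):
    """Hypothetical helper: run one query, then filter rows in Python."""
    rows = await execute(session, query, params)
    matched = []
    for row in rows:
        # Keep a row only if every requested field equals the filter value.
        if all(row.data.get(key) == value for key, value in filters.items()):
            matched.append(row.data)
            if len(matched) >= limit:
                break
    return matched


# Two mock rows, mirroring the shape used in the test above.
row_a = MagicMock()
row_a.data = {"id": "1", "name": "Product A", "price": "100"}
row_b = MagicMock()
row_b.data = {"id": "2", "name": "Product B", "price": "200"}

# AsyncMock makes the stand-in executor awaitable.
mock_execute = AsyncMock(return_value=[row_a, row_b])

results = asyncio.run(
    scan_with_post_filter(
        mock_execute,
        MagicMock(),  # stand-in session object
        "SELECT data FROM rows WHERE collection = %s ALLOW FILTERING",
        ("test_collection",),
        filters={"name": "Product A"},
        limit=10,
    )
)

# The query text is the second positional argument of the mocked call.
sent_query = mock_execute.call_args[0][1]
```

Returning both rows from the mock and asserting on a single result checks the filtering step itself, rather than relying on the mock to hand back pre-filtered data.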
--- a/tests/unit/test_reliability/test_null_embedding_protection.py
+++ b/tests/unit/test_reliability/test_null_embedding_protection.py
@ -259,8 +259,6 @@ class TestGraphEmbeddingsNullProtection:
        proc.collection_exists = MagicMock(return_value=True)
        proc._cache_lock = asyncio.Lock()
        proc._known_collections = set()
-        proc.replication_factor = 1
-        proc.shard_number = 1

        msg = MagicMock()
        msg.metadata.collection = "graphs"
--- a/tests/unit/test_tables/test_knowledge_table_store.py
+++ b/tests/unit/test_tables/test_knowledge_table_store.py
@ -35,9 +35,9 @@ def _make_store():
 class TestGetGraphEmbeddings:

    @pytest.mark.asyncio
-    @patch('trustgraph.tables.knowledge.async_execute_paged', new_callable=AsyncMock)
+    @patch('trustgraph.tables.knowledge.async_execute', new_callable=AsyncMock)
    async def test_row_converts_to_entity_embeddings_with_singular_vector(
-        self, mock_async_execute_paged
+        self, mock_async_execute
    ):
        """
        Cassandra rows return entities as a list of [entity_tuple, vector]
@ -57,7 +57,7 @@ class TestGetGraphEmbeddings:
        store = _make_store()
        store.cassandra = Mock()
        store.get_graph_embeddings_stmt = Mock()
-        mock_async_execute_paged.return_value = [[fake_row]]
+        mock_async_execute.return_value = [fake_row]

        received = []

@ -66,7 +66,7 @@ class TestGetGraphEmbeddings:

        await store.get_graph_embeddings("alice", "doc-1", receiver)

-        mock_async_execute_paged.assert_called_once_with(
+        mock_async_execute.assert_called_once_with(
            store.cassandra,
            store.get_graph_embeddings_stmt,
            ("alice", "doc-1"),
@ -96,8 +96,8 @@ class TestGetGraphEmbeddings:
        assert ge.entities[2].entity.value == "a literal entity"

    @pytest.mark.asyncio
-    @patch('trustgraph.tables.knowledge.async_execute_paged', new_callable=AsyncMock)
-    async def test_empty_entities_blob_yields_empty_list(self, mock_async_execute_paged):
+    @patch('trustgraph.tables.knowledge.async_execute', new_callable=AsyncMock)
+    async def test_empty_entities_blob_yields_empty_list(self, mock_async_execute):
        """row[3] being None / empty must produce a GraphEmbeddings with
        no entities, not raise."""
        fake_row = (None, None, None, None)
@ -105,7 +105,7 @@ class TestGetGraphEmbeddings:
        store = _make_store()
        store.cassandra = Mock()
        store.get_graph_embeddings_stmt = Mock()
-        mock_async_execute_paged.return_value = [[fake_row]]
+        mock_async_execute.return_value = [fake_row]

        received = []

@ -118,8 +118,8 @@ class TestGetGraphEmbeddings:
        assert received[0].entities == []

    @pytest.mark.asyncio
-    @patch('trustgraph.tables.knowledge.async_execute_paged', new_callable=AsyncMock)
-    async def test_multiple_rows_each_emit_one_message(self, mock_async_execute_paged):
+    @patch('trustgraph.tables.knowledge.async_execute', new_callable=AsyncMock)
+    async def test_multiple_rows_each_emit_one_message(self, mock_async_execute):
        fake_rows = [
            (None, None, None, [
                (("http://example.org/a", True), [1.0]),
@ -132,7 +132,7 @@ class TestGetGraphEmbeddings:
        store = _make_store()
        store.cassandra = Mock()
        store.get_graph_embeddings_stmt = Mock()
-        mock_async_execute_paged.return_value = [fake_rows]
+        mock_async_execute.return_value = fake_rows

        received = []

@ -153,9 +153,9 @@ class TestGetTriples:
    the same Metadata construction. Cover it for parity."""

    @pytest.mark.asyncio
-    @patch('trustgraph.tables.knowledge.async_execute_paged', new_callable=AsyncMock)
-    async def test_row_converts_to_triples(self, mock_async_execute_paged):
-        # row[3] is a list of (s_val, s_uri, p_val, p_uri, o_val, o_uri, graph)
+    @patch('trustgraph.tables.knowledge.async_execute', new_callable=AsyncMock)
+    async def test_row_converts_to_triples(self, mock_async_execute):
+        # row[3] is a list of (s_val, s_uri, p_val, p_uri, o_val, o_uri)
        fake_row = (
            None, None, None,
            [
@ -163,7 +163,6 @@ class TestGetTriples:
                    "http://example.org/alice", True,
                    "http://example.org/knows", True,
                    "http://example.org/bob", True,
-                    "urn:graph:source",
                ),
            ],
        )
@ -171,7 +170,7 @@ class TestGetTriples:
        store = _make_store()
        store.cassandra = Mock()
        store.get_triples_stmt = Mock()
-        mock_async_execute_paged.return_value = [[fake_row]]
+        mock_async_execute.return_value = [fake_row]

        received = []

@ -192,33 +191,3 @@ class TestGetTriples:
        assert t.s.iri == "http://example.org/alice"
        assert t.p.iri == "http://example.org/knows"
        assert t.o.iri == "http://example.org/bob"
-        assert t.g == "urn:graph:source"
-
-    @pytest.mark.asyncio
-    @patch('trustgraph.tables.knowledge.async_execute_paged', new_callable=AsyncMock)
-    async def test_empty_graph_name_becomes_none(self, mock_async_execute_paged):
-        fake_row = (
-            None, None, None,
-            [
-                (
-                    "http://example.org/alice", True,
-                    "http://example.org/knows", True,
-                    "http://example.org/bob", True,
-                    "",
-                ),
-            ],
-        )
-
-        store = _make_store()
-        store.cassandra = Mock()
-        store.get_triples_stmt = Mock()
-        mock_async_execute_paged.return_value = [[fake_row]]
-
-        received = []
-
-        async def receiver(msg):
-            received.append(msg)
-
-        await store.get_triples("w", "d", receiver)
-
-        assert received[0].triples[0].g is None
--- a/tests/unit/test_translators/test_knowledge_translator_roundtrip.py
+++ b/tests/unit/test_translators/test_knowledge_translator_roundtrip.py
@ -1,6 +1,5 @@
 """
-Round-trip unit tests for KnowledgeRequestTranslator and
-KnowledgeResponseTranslator.
+Round-trip unit tests for KnowledgeRequestTranslator.

 Regression coverage: a previous version of the decode side constructed
 EntityEmbeddings(vectors=...) — the schema field is `vector` (singular),
@ -16,13 +15,9 @@ Triples breaks the test.

 import pytest

-from trustgraph.messaging.translators.knowledge import (
-    KnowledgeRequestTranslator,
-    KnowledgeResponseTranslator,
-)
+from trustgraph.messaging.translators.knowledge import KnowledgeRequestTranslator
 from trustgraph.schema import (
    KnowledgeRequest,
-    KnowledgeResponse,
    GraphEmbeddings,
    EntityEmbeddings,
    Triples,
@ -30,8 +25,6 @@ from trustgraph.schema import (
    Metadata,
    Term,
    IRI,
-    LibraryMetadata,
-    LibraryBlob,
 )


@ -152,161 +145,3 @@ class TestKnowledgeRequestTranslatorTriples:
        assert t.s.iri == "http://example.org/alice"
        assert t.p.iri == "http://example.org/knows"
        assert t.o.iri == "http://example.org/bob"
-
-
-class TestKnowledgeRequestTranslatorLibrary:
-
-    def test_roundtrip_preserves_library_metadata(self, translator):
-        request = KnowledgeRequest(
-            operation="put-kg-core",
-            id="doc-1",
-            library_metadata=LibraryMetadata(
-                id="doc-1",
-                kind="application/pdf",
-                title="Test Document",
-                parent_id="",
-                document_type="source",
-                comments="test comments",
-                tags=["tag1", "tag2"],
-            ),
-        )
-
-        encoded = translator.encode(request)
-        assert "library-metadata" in encoded
-        lm = encoded["library-metadata"]
-        assert lm["id"] == "doc-1"
-        assert lm["kind"] == "application/pdf"
-        assert lm["title"] == "Test Document"
-        assert lm["parent-id"] == ""
-        assert lm["document-type"] == "source"
-        assert lm["comments"] == "test comments"
-        assert lm["tags"] == ["tag1", "tag2"]
-
-        decoded = translator.decode(encoded)
-        assert decoded.library_metadata is not None
-        assert decoded.library_metadata.id == "doc-1"
-        assert decoded.library_metadata.kind == "application/pdf"
-        assert decoded.library_metadata.title == "Test Document"
-        assert decoded.library_metadata.parent_id == ""
-        assert decoded.library_metadata.document_type == "source"
-        assert decoded.library_metadata.comments == "test comments"
-        assert decoded.library_metadata.tags == ["tag1", "tag2"]
-
-    def test_roundtrip_preserves_child_document_metadata(self, translator):
-        request = KnowledgeRequest(
-            operation="put-kg-core",
-            id="doc-1",
-            library_metadata=LibraryMetadata(
-                id="chunk-1",
-                kind="text/plain",
-                title="Chunk 1",
-                parent_id="doc-1",
-                document_type="chunk",
-            ),
-        )
-
-        encoded = translator.encode(request)
-        decoded = translator.decode(encoded)
-
-        assert decoded.library_metadata.parent_id == "doc-1"
-        assert decoded.library_metadata.document_type == "chunk"
-
-    def test_roundtrip_preserves_library_blob(self, translator):
-        request = KnowledgeRequest(
-            operation="put-kg-core",
-            id="doc-1",
-            library_blob=LibraryBlob(
-                id="doc-1",
-                data=b"SGVsbG8gV29ybGQ=",
-            ),
-        )
-
-        encoded = translator.encode(request)
-        assert "library-blob" in encoded
-        assert encoded["library-blob"]["id"] == "doc-1"
-        assert encoded["library-blob"]["data"] == "SGVsbG8gV29ybGQ="
-
-        decoded = translator.decode(encoded)
-        assert decoded.library_blob is not None
-        assert decoded.library_blob.id == "doc-1"
-        assert decoded.library_blob.data == "SGVsbG8gV29ybGQ="
-
-    def test_absent_library_fields_decode_as_none(self, translator):
-        decoded = translator.decode({
-            "operation": "get-kg-core",
-            "id": "doc-1",
-        })
-        assert decoded.library_metadata is None
-        assert decoded.library_blob is None
-
-
-class TestKnowledgeResponseTranslatorLibrary:
-
-    @pytest.fixture
-    def response_translator(self):
-        return KnowledgeResponseTranslator()
-
-    def test_encode_library_metadata(self, response_translator):
-        response = KnowledgeResponse(
-            ids=None,
-            library_metadata=LibraryMetadata(
-                id="doc-1",
-                kind="application/pdf",
-                title="Test",
-                parent_id="",
-                document_type="source",
-                comments="",
-                tags=[],
-            ),
-        )
-        encoded = response_translator.encode(response)
-        assert "library-metadata" in encoded
-        assert encoded["library-metadata"]["id"] == "doc-1"
-        assert encoded["library-metadata"]["kind"] == "application/pdf"
-        assert encoded["library-metadata"]["document-type"] == "source"
-
-    def test_encode_library_blob_bytes_to_string(self, response_translator):
-        response = KnowledgeResponse(
-            ids=None,
-            library_blob=LibraryBlob(
-                id="doc-1",
-                data=b"dGVzdCBkYXRh",
-            ),
-        )
-        encoded = response_translator.encode(response)
-        assert "library-blob" in encoded
-        assert encoded["library-blob"]["id"] == "doc-1"
-        assert encoded["library-blob"]["data"] == "dGVzdCBkYXRh"
-        assert isinstance(encoded["library-blob"]["data"], str)
-
-    def test_encode_library_blob_string_passthrough(self, response_translator):
-        response = KnowledgeResponse(
-            ids=None,
-            library_blob=LibraryBlob(
-                id="doc-1",
-                data="already-a-string",
-            ),
-        )
-        encoded = response_translator.encode(response)
-        assert encoded["library-blob"]["data"] == "already-a-string"
-
-    def test_library_metadata_is_not_final(self, response_translator):
-        response = KnowledgeResponse(
-            ids=None,
-            library_metadata=LibraryMetadata(id="doc-1"),
-        )
-        _, is_final = response_translator.encode_with_completion(response)
-        assert is_final is False
-
-    def test_library_blob_is_not_final(self, response_translator):
-        response = KnowledgeResponse(
-            ids=None,
-            library_blob=LibraryBlob(id="doc-1", data=b"data"),
-        )
-        _, is_final = response_translator.encode_with_completion(response)
-        assert is_final is False
-
-    def test_eos_is_final(self, response_translator):
-        response = KnowledgeResponse(eos=True)
-        _, is_final = response_translator.encode_with_completion(response)
-        assert is_final is True
--- a/trustgraph-base/trustgraph/api/api.py
+++ b/trustgraph-base/trustgraph/api/api.py
@ -337,7 +337,7 @@ class Api:
            from . bulk_client import BulkClient
            # Extract base URL (remove api/v1/ suffix)
            base_url = self.url.rsplit("api/v1/", 1)[0].rstrip("/")
-            self._bulk_client = BulkClient(base_url, self.timeout, self.token, workspace=self.workspace)
+            self._bulk_client = BulkClient(base_url, self.timeout, self.token)
        return self._bulk_client

    def metrics(self):
@ -462,7 +462,7 @@ class Api:
            from . async_bulk_client import AsyncBulkClient
            # Extract base URL (remove api/v1/ suffix)
            base_url = self.url.rsplit("api/v1/", 1)[0].rstrip("/")
-            self._async_bulk_client = AsyncBulkClient(base_url, self.timeout, self.token, workspace=self.workspace)
+            self._async_bulk_client = AsyncBulkClient(base_url, self.timeout, self.token)
        return self._async_bulk_client

    def async_metrics(self):
--- a/trustgraph-base/trustgraph/api/async_bulk_client.py
+++ b/trustgraph-base/trustgraph/api/async_bulk_client.py
@ -9,11 +9,10 @@ from . types import Triple
 class AsyncBulkClient:
    """Asynchronous bulk operations client"""

-    def __init__(self, url: str, timeout: int, token: Optional[str], workspace: str = "default") -> None:
+    def __init__(self, url: str, timeout: int, token: Optional[str]) -> None:
        self.url: str = self._convert_to_ws_url(url)
        self.timeout: int = timeout
        self.token: Optional[str] = token
-        self.workspace: str = workspace

    def _convert_to_ws_url(self, url: str) -> str:
        """Convert HTTP URL to WebSocket URL"""
@ -26,21 +25,11 @@ class AsyncBulkClient:
        else:
            return f"ws://{url}"

-    def _build_ws_url(self, path: str) -> str:
-        """Build a WebSocket URL with token and workspace query params."""
-        ws_url = f"{self.url}{path}"
-        params = []
-        if self.token:
-            params.append(f"token={self.token}")
-        if self.workspace:
-            params.append(f"workspace={self.workspace}")
-        if params:
-            ws_url = f"{ws_url}?{'&'.join(params)}"
-        return ws_url
-
    async def import_triples(self, flow: str, triples: AsyncIterator[Triple], **kwargs: Any) -> None:
        """Bulk import triples via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/triples")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/triples"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for triple in triples:
@ -53,7 +42,9 @@ class AsyncBulkClient:

    async def export_triples(self, flow: str, **kwargs: Any) -> AsyncIterator[Triple]:
        """Bulk export triples via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/triples")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/triples"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -66,7 +57,9 @@ class AsyncBulkClient:

    async def import_graph_embeddings(self, flow: str, embeddings: AsyncIterator[Dict[str, Any]], **kwargs: Any) -> None:
        """Bulk import graph embeddings via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/graph-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/graph-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for embedding in embeddings:
@ -74,7 +67,9 @@ class AsyncBulkClient:

    async def export_graph_embeddings(self, flow: str, **kwargs: Any) -> AsyncIterator[Dict[str, Any]]:
        """Bulk export graph embeddings via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/graph-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/graph-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -82,7 +77,9 @@ class AsyncBulkClient:

    async def import_document_embeddings(self, flow: str, embeddings: AsyncIterator[Dict[str, Any]], **kwargs: Any) -> None:
        """Bulk import document embeddings via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/document-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/document-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for embedding in embeddings:
@ -90,7 +87,9 @@ class AsyncBulkClient:

    async def export_document_embeddings(self, flow: str, **kwargs: Any) -> AsyncIterator[Dict[str, Any]]:
        """Bulk export document embeddings via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/document-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/document-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -98,7 +97,9 @@ class AsyncBulkClient:

    async def import_entity_contexts(self, flow: str, contexts: AsyncIterator[Dict[str, Any]], **kwargs: Any) -> None:
        """Bulk import entity contexts via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/entity-contexts")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/entity-contexts"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for context in contexts:
@ -106,7 +107,9 @@ class AsyncBulkClient:

    async def export_entity_contexts(self, flow: str, **kwargs: Any) -> AsyncIterator[Dict[str, Any]]:
        """Bulk export entity contexts via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/entity-contexts")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/entity-contexts"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -114,7 +117,9 @@ class AsyncBulkClient:

    async def import_rows(self, flow: str, rows: AsyncIterator[Dict[str, Any]], **kwargs: Any) -> None:
        """Bulk import rows via WebSocket"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/rows")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/rows"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for row in rows:
--- a/trustgraph-base/trustgraph/api/async_socket_client.py
+++ b/trustgraph-base/trustgraph/api/async_socket_client.py
@ -30,7 +30,6 @@ class AsyncSocketClient:
        self.timeout = timeout
        self.token = token
        self.workspace = workspace
-        self._workspace_explicit = workspace != "default"
        self._request_counter = 0
        self._socket = None
        self._connect_cm = None
@ -93,7 +92,6 @@ class AsyncSocketClient:
            )

        if resp.get("type") == "auth-ok":
-            if not self._workspace_explicit:
            self.workspace = resp.get("workspace", self.workspace)
        elif resp.get("type") == "auth-failed":
            await self._socket.close()
--- a/trustgraph-base/trustgraph/api/bulk_client.py
+++ b/trustgraph-base/trustgraph/api/bulk_client.py
@ -34,7 +34,7 @@ class BulkClient:
    Note: For true async support, use AsyncBulkClient instead.
    """

-    def __init__(self, url: str, timeout: int, token: Optional[str], workspace: str = "default") -> None:
+    def __init__(self, url: str, timeout: int, token: Optional[str]) -> None:
        """
        Initialize synchronous bulk client.

@ -42,12 +42,10 @@ class BulkClient:
            url: Base URL for TrustGraph API (HTTP/HTTPS will be converted to WS/WSS)
            timeout: WebSocket timeout in seconds
            token: Optional bearer token for authentication
-            workspace: Workspace for data isolation
        """
        self.url: str = self._convert_to_ws_url(url)
        self.timeout: int = timeout
        self.token: Optional[str] = token
-        self.workspace: str = workspace

    def _convert_to_ws_url(self, url: str) -> str:
        """Convert HTTP URL to WebSocket URL"""
@ -60,18 +58,6 @@ class BulkClient:
        else:
            return f"ws://{url}"

-    def _build_ws_url(self, path: str) -> str:
-        """Build a WebSocket URL with token and workspace query params."""
-        ws_url = f"{self.url}{path}"
-        params = []
-        if self.token:
-            params.append(f"token={self.token}")
-        if self.workspace:
-            params.append(f"workspace={self.workspace}")
-        if params:
-            ws_url = f"{ws_url}?{'&'.join(params)}"
-        return ws_url
-
    def _run_async(self, coro: Coroutine[Any, Any, Any]) -> Any:
        """Run async coroutine synchronously"""
        try:
@ -130,7 +116,9 @@ class BulkClient:
        metadata: Optional[Dict[str, Any]], batch_size: int
    ) -> None:
        """Async implementation of triple import"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/triples")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/triples"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        if metadata is None:
            metadata = {"id": "", "metadata": [], "collection": "default"}
@ -206,7 +194,9 @@ class BulkClient:

    async def _export_triples_async(self, flow: str) -> Iterator[Triple]:
        """Async implementation of triple export"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/triples")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/triples"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -248,7 +238,9 @@ class BulkClient:

    async def _import_graph_embeddings_async(self, flow: str, embeddings: Iterator[Dict[str, Any]]) -> None:
        """Async implementation of graph embeddings import"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/graph-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/graph-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            for embedding in embeddings:
@ -304,7 +296,9 @@ class BulkClient:

    async def _export_graph_embeddings_async(self, flow: str) -> Iterator[Dict[str, Any]]:
        """Async implementation of graph embeddings export"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/graph-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/graph-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -342,7 +336,9 @@ class BulkClient:

    async def _import_document_embeddings_async(self, flow: str, embeddings: Iterator[Dict[str, Any]]) -> None:
        """Async implementation of document embeddings import"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/document-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/document-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            for embedding in embeddings:
@ -398,7 +394,9 @@ class BulkClient:

    async def _export_document_embeddings_async(self, flow: str) -> Iterator[Dict[str, Any]]:
        """Async implementation of document embeddings export"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/document-embeddings")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/document-embeddings"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -448,7 +446,9 @@ class BulkClient:
        metadata: Optional[Dict[str, Any]], batch_size: int
    ) -> None:
        """Async implementation of entity contexts import"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/entity-contexts")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/entity-contexts"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        if metadata is None:
            metadata = {"id": "", "metadata": [], "collection": "default"}
@ -522,7 +522,9 @@ class BulkClient:

    async def _export_entity_contexts_async(self, flow: str) -> Iterator[Dict[str, Any]]:
        """Async implementation of entity contexts export"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/export/entity-contexts")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/export/entity-contexts"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            async for raw_message in websocket:
@ -560,7 +562,9 @@ class BulkClient:

    async def _import_rows_async(self, flow: str, rows: Iterator[Dict[str, Any]]) -> None:
        """Async implementation of rows import"""
-        ws_url = self._build_ws_url(f"/api/v1/flow/{flow}/import/rows")
+        ws_url = f"{self.url}/api/v1/flow/{flow}/import/rows"
+        if self.token:
+            ws_url = f"{ws_url}?token={self.token}"

        async with websockets.connect(ws_url, ping_interval=20, ping_timeout=self.timeout) as websocket:
            for row in rows:
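
Review note: the inlined pattern in the hunks above interpolates `self.token` into the query string as-is, so a token containing `&`, `=` or `+` would not be percent-encoded. Below is a minimal sketch of one way to build the same URL with encoding. It assumes a hypothetical helper named `build_ws_url`, which is not part of this change or of the TrustGraph codebase.

```python
from typing import Optional
from urllib.parse import urlencode


def build_ws_url(base_url: str, path: str, token: Optional[str] = None) -> str:
    """Hypothetical helper: join base URL and path, appending an encoded token."""
    ws_url = f"{base_url}{path}"
    if token:
        # urlencode percent-escapes reserved characters such as '&' and '='
        ws_url = f"{ws_url}?{urlencode({'token': token})}"
    return ws_url
```

With no token the URL is returned unchanged, which matches the `if self.token:` guard used throughout the diff.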
--- a/trustgraph-base/trustgraph/api/socket_client.py
+++ b/trustgraph-base/trustgraph/api/socket_client.py
@ -11,7 +11,6 @@ multiplexes requests by ID.
 import json
 import asyncio
 import websockets
-from websockets.exceptions import ConnectionClosed
 from typing import Optional, Dict, Any, Iterator, Union, List
 from threading import Lock

@ -167,7 +166,6 @@ class SocketClient:
            )

        if resp.get("type") == "auth-ok":
-            if self.workspace == "default":
            self.workspace = resp.get("workspace", self.workspace)
        elif resp.get("type") == "auth-failed":
            await self._socket.close()
@ -193,13 +191,13 @@ class SocketClient:
                if request_id and request_id in self._pending:
                    await self._pending[request_id].put(response)

-        except ConnectionClosed:
+        except websockets.exceptions.ConnectionClosed:
            pass
        except Exception as e:
            for queue in self._pending.values():
                try:
                    await queue.put({"error": str(e)})
-                except Exception:
+                except:
                    pass
        finally:
            self._connected = False
@ -252,7 +250,7 @@ class SocketClient:
        finally:
            try:
                loop.run_until_complete(async_gen.aclose())
-            except Exception:
+            except:
                pass

    def _streaming_generator_raw(
@ -275,7 +273,7 @@ class SocketClient:
        finally:
            try:
                loop.run_until_complete(async_gen.aclose())
-            except Exception:
+            except:
                pass

    async def _send_request_async_streaming_raw(
@ -502,7 +500,6 @@ class SocketClient:

    def put_kg_core(
        self, id: str, triples=None, graph_embeddings=None,
-        library_metadata=None, library_blob=None,
    ) -> Dict[str, Any]:
        request = {
            "operation": "put-kg-core",
@ -513,10 +510,6 @@ class SocketClient:
            request["triples"] = triples
        if graph_embeddings is not None:
            request["graph-embeddings"] = graph_embeddings
-        if library_metadata is not None:
-            request["library-metadata"] = library_metadata
-        if library_blob is not None:
-            request["library-blob"] = library_blob
        return self._send_request_sync("knowledge", None, request)

    def get_de_core(self, id: str) -> Iterator[Dict[str, Any]]:
@ -549,7 +542,7 @@ class SocketClient:
        if self._loop and not self._loop.is_closed():
            try:
                self._loop.run_until_complete(self._close_async())
-            except Exception:
+            except:
                pass

    async def _close_async(self):
--- a/trustgraph-base/trustgraph/base/cassandra_config.py
+++ b/trustgraph-base/trustgraph/base/cassandra_config.py
@ -103,19 +103,35 @@ def resolve_cassandra_config(
    host: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
-    default_keyspace: Optional[str] = None,
-    replication_factor: Optional[int] = None,
+    default_keyspace: Optional[str] = None
 ) -> Tuple[List[str], Optional[str], Optional[str], Optional[str], int]:
+    """
+    Resolve Cassandra configuration from various sources.
+
+    Can accept either argparse args object or explicit parameters.
+    Converts host string to list format for Cassandra driver.
+
+    Args:
+        args: Optional argparse namespace with cassandra_host, cassandra_username, cassandra_password, cassandra_keyspace, cassandra_replication_factor
+        host: Optional explicit host parameter (overrides args)
+        username: Optional explicit username parameter (overrides args)
+        password: Optional explicit password parameter (overrides args)
+        default_keyspace: Optional default keyspace if not specified elsewhere
+
+    Returns:
+        tuple: (hosts_list, username, password, keyspace, replication_factor)
+    """
+    # If args provided, extract values
    keyspace = None
+    replication_factor = 1
    if args is not None:
        host = host or getattr(args, 'cassandra_host', None)
        username = username or getattr(args, 'cassandra_username', None)
        password = password or getattr(args, 'cassandra_password', None)
        keyspace = getattr(args, 'cassandra_keyspace', None)
-        replication_factor = replication_factor or getattr(
-            args, 'cassandra_replication_factor', None
-        )
+        replication_factor = getattr(args, 'cassandra_replication_factor', 1)

+    # Apply defaults if still None
    defaults = get_cassandra_defaults()
    host = host or defaults['host']
    username = username or defaults['username']
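
Review note: for host, username and password, the hunk above resolves each value in the order explicit parameter, then argparse attribute, then default. The sketch below illustrates that precedence on its own. `resolve_value` and the `"localhost"` default are hypothetical names chosen for the example, not part of `cassandra_config.py`.

```python
from argparse import Namespace
from typing import Any, Optional


def resolve_value(
    explicit: Optional[Any],
    args: Optional[Namespace],
    attr: str,
    default: Any,
) -> Any:
    """Hypothetical illustration of the precedence: explicit > args > default."""
    value = explicit
    if args is not None:
        # Fall back to the argparse attribute only if no explicit value was given
        value = value or getattr(args, attr, None)
    # Apply the default if the value is still unset
    return value or default


args = Namespace(cassandra_host="db1")
host_from_args = resolve_value(None, args, "cassandra_host", "localhost")
host_explicit = resolve_value("db2", args, "cassandra_host", "localhost")
host_default = resolve_value(None, None, "cassandra_host", "localhost")
```

Because the chain uses `or`, any falsy value (such as an empty string) is treated as unset and falls through to the next source.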
--- a/trustgraph-base/trustgraph/base/iam_client.py
+++ b/trustgraph-base/trustgraph/base/iam_client.py
@ -300,14 +300,6 @@ class IamClient(RequestResponse):
        )
        return resp.workspace

-    async def list_my_workspaces(self, actor="", timeout=IAM_TIMEOUT):
-        resp = await self._request(
-            operation="list-my-workspaces",
-            actor=actor,
-            timeout=timeout,
-        )
-        return list(resp.workspaces)
-
    async def list_workspaces(self, actor="", timeout=IAM_TIMEOUT):
        resp = await self._request(
            operation="list-workspaces",
--- a/trustgraph-base/trustgraph/base/logging.py
+++ b/trustgraph-base/trustgraph/base/logging.py
@ -11,7 +11,6 @@ Supports dual output to console and Loki for centralized log aggregation.
 import contextvars
 import logging
 import logging.handlers
-import uuid
 from argparse import ArgumentParser
 from queue import Queue
 from typing import Any
@ -133,12 +132,14 @@ def setup_logging(args: dict[str, Any]) -> None:
        try:
            from logging_loki import LokiHandler

-            instance_id = str(uuid.uuid4())[:8]
-
+            # Create Loki handler with optional authentication.  The
+            # processor label is NOT baked in here — it's stamped onto
+            # each record by _ProcessorIdFilter reading the task-local
+            # contextvar, and logging_loki's emitter reads record.tags
+            # to build per-record Loki labels.
            loki_handler_kwargs = {
                'url': loki_url,
                'version': "1",
-                'tags': {'instance': instance_id},
            }

            if loki_username and loki_password:
--- a/trustgraph-base/trustgraph/base/qdrant_config.py
+++ b/trustgraph-base/trustgraph/base/qdrant_config.py
@ -1,87 +0,0 @@
-
-import os
-import argparse
-from typing import Optional, Any, Tuple
-
-
-def get_qdrant_defaults() -> dict:
-    return {
-        'url': os.getenv('QDRANT_URL', 'http://localhost:6333'),
-        'api_key': os.getenv('QDRANT_API_KEY'),
-        'replication_factor': int(os.getenv('QDRANT_REPLICATION_FACTOR', '1')),
-        'shard_number': int(os.getenv('QDRANT_SHARD_NUMBER', '1')),
-    }
-
-
-def add_qdrant_args(parser: argparse.ArgumentParser) -> None:
-    defaults = get_qdrant_defaults()
-
-    url_help = f"Qdrant URL (default: {defaults['url']})"
-    if 'QDRANT_URL' in os.environ:
-        url_help += " [from QDRANT_URL]"
-
-    api_key_help = "Qdrant API key"
-    if defaults['api_key']:
-        api_key_help += " (default: <set>)"
-        if 'QDRANT_API_KEY' in os.environ:
-            api_key_help += " [from QDRANT_API_KEY]"
-
-    replication_help = f"Qdrant collection replication factor (default: {defaults['replication_factor']})"
-    if 'QDRANT_REPLICATION_FACTOR' in os.environ:
-        replication_help += " [from QDRANT_REPLICATION_FACTOR]"
-
-    shard_help = f"Qdrant collection shard number (default: {defaults['shard_number']})"
-    if 'QDRANT_SHARD_NUMBER' in os.environ:
-        shard_help += " [from QDRANT_SHARD_NUMBER]"
-
-    parser.add_argument(
-        '--store-uri',
-        default=defaults['url'],
-        help=url_help,
-    )
-
-    parser.add_argument(
-        '--api-key',
-        default=defaults['api_key'],
-        help=api_key_help,
-    )
-
-    parser.add_argument(
-        '--qdrant-replication-factor',
-        type=int,
-        default=defaults['replication_factor'],
-        help=replication_help,
-    )
-
-    parser.add_argument(
-        '--qdrant-shard-number',
-        type=int,
-        default=defaults['shard_number'],
-        help=shard_help,
-    )
-
-
-def resolve_qdrant_config(
-    args: Optional[Any] = None,
-    url: Optional[str] = None,
-    api_key: Optional[str] = None,
-    replication_factor: Optional[int] = None,
-    shard_number: Optional[int] = None,
-) -> Tuple[str, Optional[str], int, int]:
-    if args is not None:
-        url = url or getattr(args, 'store_uri', None)
-        api_key = api_key or getattr(args, 'api_key', None)
-        replication_factor = replication_factor or getattr(
-            args, 'qdrant_replication_factor', None
-        )
-        shard_number = shard_number or getattr(
-            args, 'qdrant_shard_number', None
-        )
-
-    defaults = get_qdrant_defaults()
-    url = url or defaults['url']
-    api_key = api_key or defaults['api_key']
-    replication_factor = replication_factor or defaults['replication_factor']
-    shard_number = shard_number or defaults['shard_number']
-
-    return url, api_key, replication_factor, shard_number
--- a/trustgraph-base/trustgraph/messaging/translators/knowledge.py
+++ b/trustgraph-base/trustgraph/messaging/translators/knowledge.py
@ -2,8 +2,7 @@ from typing import Dict, Any, Tuple, Optional
 from ...schema import (
    KnowledgeRequest, KnowledgeResponse, Triples, GraphEmbeddings,
    DocumentEmbeddings, ChunkEmbeddings,
-    Metadata, EntityEmbeddings,
-    LibraryMetadata, LibraryBlob,
+    Metadata, EntityEmbeddings
 )
 from .base import MessageTranslator
 from .primitives import ValueTranslator, SubgraphTranslator
@ -62,27 +61,6 @@ class KnowledgeRequestTranslator(MessageTranslator):
                ]
            )

-        library_metadata = None
-        if "library-metadata" in data:
-            lm = data["library-metadata"]
-            library_metadata = LibraryMetadata(
-                id=lm.get("id", ""),
-                kind=lm.get("kind", ""),
-                title=lm.get("title", ""),
-                parent_id=lm.get("parent-id", ""),
-                document_type=lm.get("document-type", ""),
-                comments=lm.get("comments", ""),
-                tags=lm.get("tags", []),
-            )
-
-        library_blob = None
-        if "library-blob" in data:
-            lb = data["library-blob"]
-            library_blob = LibraryBlob(
-                id=lb.get("id", ""),
-                data=lb.get("data", b""),
-            )
-
        return KnowledgeRequest(
            operation=data.get("operation"),
            id=data.get("id"),
@ -91,8 +69,6 @@ class KnowledgeRequestTranslator(MessageTranslator):
            triples=triples,
            graph_embeddings=graph_embeddings,
            document_embeddings=document_embeddings,
-            library_metadata=library_metadata,
-            library_blob=library_blob,
        )

    def encode(self, obj: KnowledgeRequest) -> Dict[str, Any]:
@ -149,26 +125,6 @@ class KnowledgeRequestTranslator(MessageTranslator):
                ],
            }

-        if obj.library_metadata:
-            result["library-metadata"] = {
-                "id": obj.library_metadata.id,
-                "kind": obj.library_metadata.kind,
-                "title": obj.library_metadata.title,
-                "parent-id": obj.library_metadata.parent_id,
-                "document-type": obj.library_metadata.document_type,
-                "comments": obj.library_metadata.comments,
-                "tags": obj.library_metadata.tags,
-            }
-
-        if obj.library_blob:
-            data = obj.library_blob.data
-            if isinstance(data, bytes):
-                data = data.decode("utf-8")
-            result["library-blob"] = {
-                "id": obj.library_blob.id,
-                "data": data,
-            }
-
        return result


@ -238,32 +194,6 @@ class KnowledgeResponseTranslator(MessageTranslator):
                }
            }

-        # Streaming library metadata response
-        if obj.library_metadata:
-            return {
-                "library-metadata": {
-                    "id": obj.library_metadata.id,
-                    "kind": obj.library_metadata.kind,
-                    "title": obj.library_metadata.title,
-                    "parent-id": obj.library_metadata.parent_id,
-                    "document-type": obj.library_metadata.document_type,
-                    "comments": obj.library_metadata.comments,
-                    "tags": obj.library_metadata.tags,
-                }
-            }
-
-        # Streaming library blob response
-        if obj.library_blob:
-            data = obj.library_blob.data
-            if isinstance(data, bytes):
-                data = data.decode("utf-8")
-            return {
-                "library-blob": {
-                    "id": obj.library_blob.id,
-                    "data": data,
-                }
-            }
-
        # End of stream marker
        if obj.eos is True:
            return {"eos": True}
@ -279,9 +209,7 @@ class KnowledgeResponseTranslator(MessageTranslator):
        is_final = (
            obj.ids is not None or  # List response
            obj.eos is True or      # End of stream
-            (not obj.triples and not obj.graph_embeddings
-             and not obj.document_embeddings
-             and not obj.library_metadata and not obj.library_blob)  # Empty response
+            (not obj.triples and not obj.graph_embeddings and not obj.document_embeddings)  # Empty response
        )
        
        return response, is_final