trustgraph/specs/api/paths/knowledge.yaml
cybermaggedon d35473f7f7
feat: workspace-based multi-tenancy, replacing user as tenancy axis (#840)
Introduces `workspace` as the isolation boundary for config, flows,
library, and knowledge data. Removes `user` as a schema-level field
throughout the code, API specs, and tests; workspace provides the
same separation more cleanly at the trusted flow.workspace layer
rather than through client-supplied message fields.

Design
------
- IAM tech spec (docs/tech-specs/iam.md) documents current state,
  proposed auth/access model, and migration direction.
- Data ownership model (docs/tech-specs/data-ownership-model.md)
  captures the workspace/collection/flow hierarchy.

Schema + messaging
------------------
- Drop `user` field from AgentRequest/Step, GraphRagQuery,
  DocumentRagQuery, Triples/Graph/Document/Row EmbeddingsRequest,
  Sparql/Rows/Structured QueryRequest, ToolServiceRequest.
- Keep collection/workspace routing via flow.workspace at the
  service layer.
- Translators updated to not serialise/deserialise user.

API specs
---------
- OpenAPI schemas and path examples cleaned of user fields.
- Websocket async-api messages updated.
- Removed the unused parameters/User.yaml.

Services + base
---------------
- Librarian, collection manager, knowledge, config: all operations
  scoped by workspace. Config client API takes workspace as first
  positional arg.
- `flow.workspace` set at flow start time by the infrastructure;
  no longer pass-through from clients.
- Tool service drops user-personalisation passthrough.

CLI + SDK
---------
- tg-init-workspace and workspace-aware import/export.
- All tg-* commands drop user args; accept --workspace.
- Python API/SDK (flow, socket_client, async_*, explainability,
  library) drop user kwargs from every method signature.

MCP server
----------
- All tool endpoints drop user parameters; socket_manager no longer
  keyed per user.

Flow service
------------
- Closure-based topic cleanup on flow stop: only delete topics
  whose blueprint template was parameterised AND no remaining
  live flow (across all workspaces) still resolves to that topic.
  Four scopes fall out naturally from template analysis:
    * {id} -> per-flow, deleted on stop
    * {blueprint} -> per-blueprint, kept while any flow of the
      same blueprint exists
    * {workspace} -> per-workspace, kept while any flow in the
      workspace exists
    * literal -> global, never deleted (e.g. tg.request.librarian)
  Fixes a bug where stopping a flow silently destroyed the global
  librarian exchange, wedging all library operations until manual
  restart.
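
A minimal sketch of the cleanup rule above, not the actual flow service
code. The flow records, the `templates_by_blueprint` mapping, and every
topic name except `tg.request.librarian` are hypothetical and exist only
for illustration.

```python
import re

# A template is "parameterised" if it contains at least one of these placeholders.
PLACEHOLDER = re.compile(r"\{(id|blueprint|workspace)\}")


def resolve(template, flow):
    """Substitute the flow's id, blueprint and workspace into a topic template."""
    for key in ("id", "blueprint", "workspace"):
        template = template.replace("{" + key + "}", flow[key])
    return template


def topics_to_delete(stopping, live_flows, templates_by_blueprint):
    """Return the topics that are safe to delete when `stopping` stops.

    `live_flows` holds the flows that remain running, across all workspaces.
    """
    # Every topic that some remaining live flow still resolves to.
    still_used = set()
    for flow in live_flows:
        for template in templates_by_blueprint[flow["blueprint"]]:
            still_used.add(resolve(template, flow))

    doomed = set()
    for template in templates_by_blueprint[stopping["blueprint"]]:
        if not PLACEHOLDER.search(template):
            continue  # literal topic: global, never deleted
        topic = resolve(template, stopping)
        if topic not in still_used:
            doomed.add(topic)
    return doomed


# Hypothetical blueprint with one template per scope.
templates = {
    "rag": [
        "tg.flow.{id}.input",        # per-flow
        "tg.bp.{blueprint}.shared",  # per-blueprint
        "tg.ws.{workspace}.events",  # per-workspace
        "tg.request.librarian",      # literal, global
    ]
}
flow_a = {"id": "f1", "blueprint": "rag", "workspace": "w1"}
flow_b = {"id": "f2", "blueprint": "rag", "workspace": "w2"}

# Stopping flow_a while flow_b (same blueprint, other workspace) keeps running.
print(sorted(topics_to_delete(flow_a, [flow_b], templates)))
# -> ['tg.flow.f1.input', 'tg.ws.w1.events']
```

The per-blueprint topic survives because `flow_b` still resolves to it,
and the literal librarian topic is never a deletion candidate.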

RabbitMQ backend
----------------
- heartbeat=60, blocked_connection_timeout=300. Catches silently
  dead connections (broker restart, orphaned channels, network
  partitions) within ~2 heartbeat windows, so the consumer
  reconnects and re-binds its queue rather than sitting forever
  on a zombie connection.

Tests
-----
- Full test refresh: unit, integration, contract, provenance.
- Dropped user-field assertions and constructor kwargs across
  ~100 test files.
- Renamed user-collection isolation tests to workspace-collection.
2026-04-21 23:23:01 +01:00


post:
  tags:
    - Knowledge
  summary: Knowledge graph core management
  description: |
    Manage knowledge graph cores - persistent storage of triples and embeddings.

    ## Knowledge Cores

    Knowledge cores are the foundational storage units for:
    - **Triples**: RDF triples representing knowledge graph data
    - **Graph Embeddings**: Vector embeddings for entities
    - **Metadata**: Descriptive information about the knowledge

    Each core has an ID and collection for organization (within the workspace).

    ## Operations

    ### list-kg-cores
    List all knowledge cores in the workspace. Returns array of core IDs.

    ### get-kg-core
    Retrieve a knowledge core by ID. Returns triples and/or graph embeddings.
    Response is streamed - may receive multiple messages followed by EOS marker.

    ### put-kg-core
    Store triples and/or graph embeddings. Creates new core or updates existing.
    Can store triples only, embeddings only, or both together.

    ### delete-kg-core
    Delete a knowledge core by ID. Removes all associated data.

    ### load-kg-core
    Load a knowledge core into a running flow's collection.
    Makes the data available for querying within that flow instance.

    ### unload-kg-core
    Unload a knowledge core from a flow's collection.
    Removes data from flow instance but doesn't delete the core.

    ## Streaming Responses

    The `get-kg-core` operation streams data in chunks:
    1. Multiple messages with `triples` or `graph-embeddings`
    2. Final message with `eos: true` to signal completion
  operationId: knowledgeService
  security:
    - bearerAuth: []
  requestBody:
    required: true
    content:
      application/json:
        schema:
          $ref: '../components/schemas/knowledge/KnowledgeRequest.yaml'
        examples:
          listKnowledgeCores:
            summary: List knowledge cores
            value:
              operation: list-kg-cores
          getKnowledgeCore:
            summary: Get knowledge core
            value:
              operation: get-kg-core
              id: core-123
          putTriplesOnly:
            summary: Store triples
            value:
              operation: put-kg-core
              triples:
                metadata:
                  id: core-123
                  collection: default
                  metadata:
                    - s: {v: "https://example.com/core-123", e: true}
                      p: {v: "https://www.w3.org/1999/02/22-rdf-syntax-ns#type", e: true}
                      o: {v: "https://trustgraph.ai/e/knowledge-core", e: true}
                triples:
                  - s: {v: "https://example.com/entity1", e: true}
                    p: {v: "https://www.w3.org/2000/01/rdf-schema#label", e: true}
                    o: {v: "Entity 1", e: false}
                  - s: {v: "https://example.com/entity1", e: true}
                    p: {v: "https://example.com/relatedTo", e: true}
                    o: {v: "https://example.com/entity2", e: true}
          putEmbeddingsOnly:
            summary: Store embeddings
            value:
              operation: put-kg-core
              graph-embeddings:
                metadata:
                  id: core-123
                  collection: default
                  metadata: []
                entities:
                  - entity: {v: "https://example.com/entity1", e: true}
                    vectors: [0.1, 0.2, 0.3, 0.4, 0.5]
                  - entity: {v: "https://example.com/entity2", e: true}
                    vectors: [0.6, 0.7, 0.8, 0.9, 1.0]
          putTriplesAndEmbeddings:
            summary: Store triples and embeddings together
            value:
              operation: put-kg-core
              triples:
                metadata:
                  id: core-456
                  collection: research
                  metadata: []
                triples:
                  - s: {v: "https://example.com/doc1", e: true}
                    p: {v: "http://purl.org/dc/terms/title", e: true}
                    o: {v: "Research Paper", e: false}
              graph-embeddings:
                metadata:
                  id: core-456
                  collection: research
                  metadata: []
                entities:
                  - entity: {v: "https://example.com/doc1", e: true}
                    vectors: [0.11, 0.22, 0.33]
          deleteKnowledgeCore:
            summary: Delete knowledge core
            value:
              operation: delete-kg-core
              id: core-123
          loadKnowledgeCore:
            summary: Load core into flow
            value:
              operation: load-kg-core
              id: core-123
              flow: my-flow
              collection: default
          unloadKnowledgeCore:
            summary: Unload core from flow
            value:
              operation: unload-kg-core
              id: core-123
  responses:
    '200':
      description: Successful response
      content:
        application/json:
          schema:
            $ref: '../components/schemas/knowledge/KnowledgeResponse.yaml'
          examples:
            listKnowledgeCores:
              summary: List of knowledge cores
              value:
                ids:
                  - core-123
                  - core-456
                  - core-789
            getKnowledgeCoreTriples:
              summary: Knowledge core triples (streaming)
              value:
                triples:
                  metadata:
                    id: core-123
                    collection: default
                    metadata:
                      - s: {v: "https://example.com/core-123", e: true}
                        p: {v: "https://www.w3.org/1999/02/22-rdf-syntax-ns#type", e: true}
                        o: {v: "https://trustgraph.ai/e/knowledge-core", e: true}
                  triples:
                    - s: {v: "https://example.com/entity1", e: true}
                      p: {v: "https://www.w3.org/2000/01/rdf-schema#label", e: true}
                      o: {v: "Entity 1", e: false}
            getKnowledgeCoreEmbeddings:
              summary: Knowledge core embeddings (streaming)
              value:
                graph-embeddings:
                  metadata:
                    id: core-123
                    collection: default
                    metadata: []
                  entities:
                    - entity: {v: "https://example.com/entity1", e: true}
                      vectors: [0.1, 0.2, 0.3, 0.4, 0.5]
            endOfStream:
              summary: End of stream marker
              value:
                eos: true
            deleteSuccess:
              summary: Delete successful (empty response)
              value: {}
    '401':
      $ref: '../components/responses/Unauthorized.yaml'
    '500':
      $ref: '../components/responses/Error.yaml'
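
The streaming behaviour described for `get-kg-core` could be consumed
roughly as follows. This is a sketch under assumptions, not part of the
spec or the SDK: `collect_kg_core` is a hypothetical helper, and it
assumes each streamed message has already been decoded into a Python
dict shaped like the response examples in this file.

```python
def collect_kg_core(messages):
    """Accumulate streamed get-kg-core messages until the EOS marker.

    `messages` is any iterable of decoded response dicts. Returns the
    collected triples and embedding entities, plus a flag saying whether
    the `eos: true` marker was actually seen.
    """
    triples = []
    entities = []
    for msg in messages:
        if msg.get("eos"):
            # Final message: the stream is complete.
            return {"triples": triples, "entities": entities, "complete": True}
        if "triples" in msg:
            triples.extend(msg["triples"].get("triples", []))
        if "graph-embeddings" in msg:
            entities.extend(msg["graph-embeddings"].get("entities", []))
    # The iterable ended without an EOS marker, so the data may be partial.
    return {"triples": triples, "entities": entities, "complete": False}


# Messages modelled on the response examples in this file.
stream = [
    {"triples": {
        "metadata": {"id": "core-123", "collection": "default", "metadata": []},
        "triples": [{
            "s": {"v": "https://example.com/entity1", "e": True},
            "p": {"v": "https://www.w3.org/2000/01/rdf-schema#label", "e": True},
            "o": {"v": "Entity 1", "e": False},
        }],
    }},
    {"graph-embeddings": {
        "metadata": {"id": "core-123", "collection": "default", "metadata": []},
        "entities": [{
            "entity": {"v": "https://example.com/entity1", "e": True},
            "vectors": [0.1, 0.2, 0.3, 0.4, 0.5],
        }],
    }},
    {"eos": True},
]

result = collect_kg_core(stream)
print(len(result["triples"]), len(result["entities"]), result["complete"])
# -> 1 1 True
```

Returning a `complete` flag rather than raising lets the caller decide
how to treat a stream that ended before the EOS marker arrived.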