trustgraph/specs/api/paths/knowledge.yaml

post:
tags:
- Knowledge
summary: Knowledge graph core management
description: |
Manage knowledge graph cores - persistent storage of triples and embeddings.
## Knowledge Cores
Knowledge cores are the foundational storage units for:
- **Triples**: RDF triples representing knowledge graph data
- **Graph Embeddings**: Vector embeddings for entities
- **Metadata**: Descriptive information about the knowledge
Each core has an ID and a collection, which are used to organize cores within the workspace.
## Operations
### list-kg-cores
List all knowledge cores in the workspace. Returns an array of core IDs.
### get-kg-core
Retrieve a knowledge core by ID. Returns triples and/or graph embeddings.
The response is streamed: the client may receive multiple messages, followed by an end-of-stream (EOS) marker.
### put-kg-core
Store triples and/or graph embeddings. Creates a new core or updates an existing one.
Can store triples only, embeddings only, or both together.
### delete-kg-core
Delete a knowledge core by ID. Removes all associated data.
### load-kg-core
Load a knowledge core into a running flow's collection.
Makes the data available for querying within that flow instance.
### unload-kg-core
Unload a knowledge core from a flow's collection.
Removes the data from the flow instance but does not delete the core.
## Streaming Responses
The `get-kg-core` operation streams data in chunks:
1. Multiple messages with `triples` or `graph-embeddings`
2. Final message with `eos: true` to signal completion
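The two steps above can be sketched on the client side as follows. This is a minimal illustration, not part of the API: it assumes each streamed message has already been decoded into a Python dict by whatever transport the client uses, and `collect_kg_core` is a hypothetical helper name. Only the field names `triples`, `graph-embeddings`, `entities`, and `eos` come from this spec.

```python
from typing import Any, Dict, Iterable, List, Tuple


def collect_kg_core(
    messages: Iterable[Dict[str, Any]],
) -> Tuple[List[dict], List[dict]]:
    """Accumulate a streamed get-kg-core response until the EOS marker.

    `messages` is assumed to be an iterable of already-decoded response
    messages (hypothetical; how they are received is transport-specific).
    """
    triples: List[dict] = []
    entities: List[dict] = []
    for msg in messages:
        # The final message carries `eos: true` and no payload.
        if msg.get("eos"):
            return triples, entities
        # A data message carries `triples` and/or `graph-embeddings`.
        if "triples" in msg:
            triples.extend(msg["triples"].get("triples", []))
        if "graph-embeddings" in msg:
            entities.extend(msg["graph-embeddings"].get("entities", []))
    # The stream ended without the completion marker.
    raise RuntimeError("stream ended without an eos marker")
```

Raising when the stream ends without `eos: true` is a design choice of this sketch, so that a truncated stream is not mistaken for a complete core.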
operationId: knowledgeService
security:
- bearerAuth: []
requestBody:
required: true
content:
application/json:
schema:
$ref: '../components/schemas/knowledge/KnowledgeRequest.yaml'
examples:
listKnowledgeCores:
summary: List knowledge cores
value:
operation: list-kg-cores
getKnowledgeCore:
summary: Get knowledge core
value:
operation: get-kg-core
id: core-123
putTriplesOnly:
summary: Store triples
value:
operation: put-kg-core
triples:
metadata:
id: core-123
collection: default
metadata:
- s: {v: "https://example.com/core-123", e: true}
p: {v: "https://www.w3.org/1999/02/22-rdf-syntax-ns#type", e: true}
o: {v: "https://trustgraph.ai/e/knowledge-core", e: true}
triples:
- s: {v: "https://example.com/entity1", e: true}
p: {v: "https://www.w3.org/2000/01/rdf-schema#label", e: true}
o: {v: "Entity 1", e: false}
- s: {v: "https://example.com/entity1", e: true}
p: {v: "https://example.com/relatedTo", e: true}
o: {v: "https://example.com/entity2", e: true}
putEmbeddingsOnly:
summary: Store embeddings
value:
operation: put-kg-core
graph-embeddings:
metadata:
id: core-123
collection: default
metadata: []
entities:
- entity: {v: "https://example.com/entity1", e: true}
vectors: [0.1, 0.2, 0.3, 0.4, 0.5]
- entity: {v: "https://example.com/entity2", e: true}
vectors: [0.6, 0.7, 0.8, 0.9, 1.0]
putTriplesAndEmbeddings:
summary: Store triples and embeddings together
value:
operation: put-kg-core
triples:
metadata:
id: core-456
collection: research
metadata: []
triples:
- s: {v: "https://example.com/doc1", e: true}
p: {v: "http://purl.org/dc/terms/title", e: true}
o: {v: "Research Paper", e: false}
graph-embeddings:
metadata:
id: core-456
collection: research
metadata: []
entities:
- entity: {v: "https://example.com/doc1", e: true}
vectors: [0.11, 0.22, 0.33]
deleteKnowledgeCore:
summary: Delete knowledge core
value:
operation: delete-kg-core
id: core-123
loadKnowledgeCore:
summary: Load core into flow
value:
operation: load-kg-core
id: core-123
flow: my-flow
collection: default
unloadKnowledgeCore:
summary: Unload core from flow
value:
operation: unload-kg-core
id: core-123
responses:
'200':
description: Successful response
content:
application/json:
schema:
$ref: '../components/schemas/knowledge/KnowledgeResponse.yaml'
examples:
listKnowledgeCores:
summary: List of knowledge cores
value:
ids:
- core-123
- core-456
- core-789
getKnowledgeCoreTriples:
summary: Knowledge core triples (streaming)
value:
triples:
metadata:
id: core-123
collection: default
metadata:
- s: {v: "https://example.com/core-123", e: true}
p: {v: "https://www.w3.org/1999/02/22-rdf-syntax-ns#type", e: true}
o: {v: "https://trustgraph.ai/e/knowledge-core", e: true}
triples:
- s: {v: "https://example.com/entity1", e: true}
p: {v: "https://www.w3.org/2000/01/rdf-schema#label", e: true}
o: {v: "Entity 1", e: false}
getKnowledgeCoreEmbeddings:
summary: Knowledge core embeddings (streaming)
value:
graph-embeddings:
metadata:
id: core-123
collection: default
metadata: []
entities:
- entity: {v: "https://example.com/entity1", e: true}
vectors: [0.1, 0.2, 0.3, 0.4, 0.5]
endOfStream:
summary: End of stream marker
value:
eos: true
deleteSuccess:
summary: Delete successful (empty response)
value: {}
'401':
$ref: '../components/responses/Unauthorized.yaml'
'500':
$ref: '../components/responses/Error.yaml'