Update API specs for 2.1 (#699)

* Updating API specs for 2.1

* Updated API and SDK docs
cybermaggedon 2026-03-17 20:36:31 +00:00 committed by GitHub
parent c387670944
commit 664d1d0384
GPG key ID: B5690EEEBB952194
19 changed files with 4280 additions and 1949 deletions

View file

@ -0,0 +1,108 @@
# API Gateway Changes: v1.8 to v2.1
## Summary
The API gateway gained new WebSocket service dispatchers for embeddings
queries, a new REST streaming endpoint for document content, and underwent
a significant wire format change from `Value` to `Term`. The "objects"
service was renamed to "rows".
---
## New WebSocket Service Dispatchers
These are new request/response services available through the WebSocket
multiplexer at `/api/v1/socket` (flow-scoped):
| Service Key | Description |
|-------------|-------------|
| `document-embeddings` | Queries document chunks by text similarity. Request/response uses `DocumentEmbeddingsRequest`/`DocumentEmbeddingsResponse` schemas. |
| `row-embeddings` | Queries structured data rows by text similarity on indexed fields. Request/response uses `RowEmbeddingsRequest`/`RowEmbeddingsResponse` schemas. |
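These join the existing `graph-embeddings` dispatcher, which was already
present in v1.8. Its documented response schema changed in the updated specs:
each matching entity is now returned together with a similarity score.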
### Full list of WebSocket flow service dispatchers (v2.1)
Request/response services (via `/api/v1/flow/{flow}/service/{kind}` or
WebSocket mux):
- `agent`, `text-completion`, `prompt`, `mcp-tool`
- `graph-rag`, `document-rag`
- `embeddings`, `graph-embeddings`, `document-embeddings`
- `triples`, `rows`, `nlp-query`, `structured-query`, `structured-diag`
- `row-embeddings`
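
As a rough illustration of how a client might target these services over REST, the sketch below builds the `/api/v1/flow/{flow}/service/{kind}` path and rejects keys that are not in the v2.1 list. The helper name `flow_service_path`, the `RENAMED_KEYS` table, and the choice to silently map the legacy `objects` key to `rows` are assumptions made for this example; they are not part of the gateway.

```python
from urllib.parse import quote

# Service keys listed above for v2.1.
FLOW_SERVICES_V21 = frozenset({
    "agent", "text-completion", "prompt", "mcp-tool",
    "graph-rag", "document-rag",
    "embeddings", "graph-embeddings", "document-embeddings",
    "triples", "rows", "nlp-query", "structured-query", "structured-diag",
    "row-embeddings",
})

# Keys renamed between v1.8 and v2.1 (mapping them is an assumption of this sketch).
RENAMED_KEYS = {"objects": "rows"}


def flow_service_path(flow: str, kind: str) -> str:
    """Return the REST path for a flow-scoped service (hypothetical helper)."""
    kind = RENAMED_KEYS.get(kind, kind)
    if kind not in FLOW_SERVICES_V21:
        raise ValueError(f"unknown flow service: {kind!r}")
    # Percent-encode the flow name so it is safe as a single path segment.
    return f"/api/v1/flow/{quote(flow, safe='')}/service/{kind}"


print(flow_service_path("default", "objects"))
```

A stricter client could raise on `objects` instead of translating it, which would surface stale v1.8 callers earlier.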
---
## New REST Endpoint
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/api/v1/document-stream` | Streams document content from the library as raw bytes. Query parameters: `user` (required), `document-id` (required), `chunk-size` (optional, default 1 MB, i.e. 1048576 bytes). The gateway decodes the stored base64 content internally and returns the raw bytes using chunked transfer encoding. |
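
A minimal sketch of building a request URL for this endpoint follows. The helper name `document_stream_url` and the base URL `http://localhost:8088` are assumptions for illustration only; the query parameter names come from the table above.

```python
from urllib.parse import urlencode


def document_stream_url(base_url, user, document_id, chunk_size=None):
    """Build a document-stream URL (hypothetical helper, not a TrustGraph API)."""
    params = {"user": user, "document-id": document_id}
    if chunk_size is not None:
        # Optional: the gateway falls back to its own default when omitted.
        params["chunk-size"] = str(chunk_size)
    return base_url.rstrip("/") + "/api/v1/document-stream?" + urlencode(params)


# The base URL is an assumed local gateway address.
print(document_stream_url("http://localhost:8088", "trustgraph",
                          "urn:trustgraph:doc:abc123", chunk_size=65536))
```

The response body is raw bytes, so a client would typically read it incrementally with whatever HTTP library it already uses and write the pieces to a file or stdout.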
---
## Renamed Service: "objects" to "rows"
| v1.8 | v2.1 | Notes |
|------|------|-------|
| `objects_query.py` / `ObjectsQueryRequestor` | `rows_query.py` / `RowsQueryRequestor` | Schema changed from `ObjectsQueryRequest`/`ObjectsQueryResponse` to `RowsQueryRequest`/`RowsQueryResponse`. |
| `objects_import.py` / `ObjectsImport` | `rows_import.py` / `RowsImport` | Import dispatcher for structured data. |
The WebSocket service key changed from `"objects"` to `"rows"`, and the
import dispatcher key similarly changed from `"objects"` to `"rows"`.
---
## Wire Format Change: Value to Term
The serialization layer (`serialize.py`) was rewritten to use the new `Term`
type instead of the old `Value` type.
### Old format (v1.8 — `Value`)
```json
{"v": "http://example.org/entity", "e": true}
```
- `v`: the value (string)
- `e`: boolean flag indicating whether the value is a URI
### New format (v2.1 — `Term`)
IRIs:
```json
{"t": "i", "i": "http://example.org/entity"}
```
Literals:
```json
{"t": "l", "v": "some text", "d": "datatype-uri", "l": "en"}
```
Quoted triples (RDF-star):
```json
{"t": "r", "r": {"s": {...}, "p": {...}, "o": {...}}}
```
- `t`: type discriminator — `"i"` (IRI), `"l"` (literal), `"r"` (quoted triple), `"b"` (blank node)
- Serialization now delegates to `TermTranslator` and `TripleTranslator` from `trustgraph.messaging.translators.primitives`
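
For clients that still hold v1.8 payloads, the mapping between the two formats shown above can be sketched as below. The function names `value_to_term` and `triple_to_v21` are hypothetical; this is not the gateway's `TermTranslator`. The old format carries no datatype or language tag, so converted literals only get a `v` field.

```python
def value_to_term(value: dict) -> dict:
    """Convert a v1.8 Value dict ({"v", "e"}) to a v2.1 Term dict."""
    if value.get("e"):
        # e=true meant "this value is a URI", which maps to an IRI term.
        return {"t": "i", "i": value["v"]}
    # Everything else was a literal; no datatype/language is recoverable.
    return {"t": "l", "v": value["v"]}


def triple_to_v21(triple: dict) -> dict:
    """Convert the s/p/o members of a v1.8 triple to Term dicts."""
    return {k: value_to_term(triple[k]) for k in ("s", "p", "o")}


old = {
    "s": {"v": "http://example.org/entity", "e": True},
    "p": {"v": "http://schema.org/name", "e": True},
    "o": {"v": "some text", "e": False},
}
print(triple_to_v21(old))
```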
### Other serialization changes
| Field | v1.8 | v2.1 |
|-------|------|------|
| Metadata | `metadata.metadata` (subgraph) | `metadata.root` (simple value) |
| Graph embeddings entity | `entity.vectors` (plural) | `entity.vector` (singular) |
| Document embeddings chunk | `chunk.vectors` + `chunk.chunk` (text) | `chunk.vector` + `chunk.chunk_id` (ID reference) |
---
## Breaking Changes
- **`Value` to `Term` wire format**: All clients sending/receiving triples, embeddings, or entity contexts through the gateway must update to the new Term format.
- **`objects` to `rows` rename**: WebSocket service key and import key changed.
- **Metadata field change**: `metadata.metadata` (a serialized subgraph) replaced by `metadata.root` (a simple value).
- **Embeddings field changes**: `vectors` (plural) became `vector` (singular); document embeddings now reference `chunk_id` instead of inline `chunk` text.
- **New `/api/v1/document-stream` endpoint**: Additive, not breaking.


View file

@ -0,0 +1,112 @@
# CLI Changes: v1.8 to v2.1
## Summary
The CLI (`trustgraph-cli`) has significant additions focused on three themes:
**explainability/provenance**, **embeddings access**, and **graph querying**.
Two legacy tools were removed, one was renamed, and several existing tools
gained new capabilities.
---
## New CLI Tools
### Explainability & Provenance
| Command | Description |
|---------|-------------|
| `tg-list-explain-traces` | Lists all explainability sessions (GraphRAG and Agent) in a collection, showing session IDs, type, question text, and timestamps. |
| `tg-show-explain-trace` | Displays the full explainability trace for a session. For GraphRAG: Question, Exploration, Focus, Synthesis stages. For Agent: Session, Iterations (thought/action/observation), Final Answer. Auto-detects trace type. Supports `--show-provenance` to trace edges back to source documents. |
| `tg-show-extraction-provenance` | Given a document ID, traverses the provenance chain: Document -> Pages -> Chunks -> Edges, using `prov:wasDerivedFrom` relationships. Supports `--show-content` and `--max-content` options. |
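
To make the provenance traversal more concrete, here is a small in-memory sketch of walking `prov:wasDerivedFrom` links outward from a document. It assumes triples are already available as Term-format dictionaries, that the derived item is the subject and its source is the object (the usual PROV-O reading), and that every node is an IRI term. The function name `derived_levels` is hypothetical, and the real tool queries the triple store rather than a list.

```python
from collections import defaultdict

PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"


def derived_levels(triples, document_iri):
    """Return lists of IRIs derived from document_iri, one list per hop."""
    children = defaultdict(list)
    for t in triples:
        s, p, o = t["s"], t["p"], t["o"]
        # Simplification: only IRI subjects and objects are followed.
        if (p.get("i") == PROV_WAS_DERIVED_FROM
                and s.get("t") == "i" and o.get("t") == "i"):
            children[o["i"]].append(s["i"])

    levels, frontier, seen = [], [document_iri], {document_iri}
    while frontier:
        nxt = []
        for node in frontier:
            for child in children.get(node, []):
                if child not in seen:  # guard against cycles
                    seen.add(child)
                    nxt.append(child)
        if nxt:
            levels.append(nxt)
        frontier = nxt
    return levels


def iri(u):
    return {"t": "i", "i": u}


sample = [
    {"s": iri("urn:page:1"), "p": iri(PROV_WAS_DERIVED_FROM), "o": iri("urn:doc:1")},
    {"s": iri("urn:chunk:1"), "p": iri(PROV_WAS_DERIVED_FROM), "o": iri("urn:page:1")},
]
print(derived_levels(sample, "urn:doc:1"))
```

With the sample data, the first level holds the page and the second the chunk, mirroring the Document -> Pages -> Chunks ordering described in the table.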
### Embeddings
| Command | Description |
|---------|-------------|
| `tg-invoke-embeddings` | Converts text to a vector embedding via the embeddings service. Accepts one or more text inputs, returns vectors as lists of floats. |
| `tg-invoke-graph-embeddings` | Queries graph entities by text similarity using vector embeddings. Returns matching entities with similarity scores. |
| `tg-invoke-document-embeddings` | Queries document chunks by text similarity using vector embeddings. Returns matching chunk IDs with similarity scores. |
| `tg-invoke-row-embeddings` | Queries structured data rows by text similarity on indexed fields. Returns matching rows with index values and scores. Requires `--schema-name` and supports `--index-name`. |
### Graph Querying
| Command | Description |
|---------|-------------|
| `tg-query-graph` | Pattern-based triple store query. Unlike `tg-show-graph` (which dumps everything), this allows selective queries by any combination of subject, predicate, object, and graph. Auto-detects value types: IRIs (`http://...`, `urn:...`, `<...>`), quoted triples (`<<s p o>>`), and literals. |
| `tg-get-document-content` | Retrieves document content from the library by document ID. Can output to file or stdout, handles both text and binary content. |
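
The value auto-detection described for `tg-query-graph` can be pictured roughly as follows. This is an illustrative sketch rather than the tool's actual code: `classify_value` is a hypothetical name, treating `https://` like `http://` is an assumption, and the order of the checks is a guess. The return values reuse the Term type discriminators (`r`, `i`, `l`).

```python
def classify_value(text: str) -> str:
    """Guess the Term type of a command-line value: 'r', 'i', or 'l'."""
    # Quoted triples must be checked first, since "<<...>>" also
    # starts with "<" and ends with ">".
    if text.startswith("<<") and text.endswith(">>"):
        return "r"
    if text.startswith("<") and text.endswith(">"):
        return "i"
    if text.startswith(("http://", "https://", "urn:")):
        return "i"
    return "l"


for sample_value in ("http://example.org/a", "<urn:x>", "<<s p o>>", "John Doe"):
    print(sample_value, "->", classify_value(sample_value))
```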
---
## Removed CLI Tools
| Command | Notes |
|---------|-------|
| `tg-load-pdf` | Removed. Document loading is now handled through the library/processing pipeline. |
| `tg-load-text` | Removed. Document loading is now handled through the library/processing pipeline. |
---
## Renamed CLI Tools
| Old Name | New Name | Notes |
|----------|----------|-------|
| `tg-invoke-objects-query` | `tg-invoke-rows-query` | Reflects the terminology rename from "objects" to "rows" for structured data. |
---
## Significant Changes to Existing Tools
### `tg-invoke-graph-rag`
- **Explainability support**: Now supports a 4-stage explainability pipeline (Question, Grounding/Exploration, Focus, Synthesis) with inline provenance event display.
- **Streaming**: Uses WebSocket streaming for real-time output.
- **Provenance tracing**: Can trace selected edges back to source documents via reification and `prov:wasDerivedFrom` chains.
- Grew from ~30 lines to ~760 lines to accommodate the full explainability pipeline.
### `tg-invoke-document-rag`
- **Explainability support**: Added `question_explainable()` mode that streams Document RAG responses with inline provenance events (Question, Grounding, Exploration, Synthesis stages).
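
The Document RAG response schema updated in this commit distinguishes `chunk` messages (carrying `response` text) from `explain` messages (carrying `explain_id` and `explain_graph`), and marks the end with `end_of_session`. A client-side loop over such messages might look like the sketch below. The function name `collect_document_rag` is hypothetical, messages are assumed to arrive as already-decoded dictionaries, and treating a missing `message_type` as `chunk` is an assumption.

```python
def collect_document_rag(messages):
    """Split a message stream into answer text and explainability references."""
    parts, explain_refs = [], []
    for msg in messages:
        kind = msg.get("message_type", "chunk")  # assumed default
        if kind == "explain":
            explain_refs.append((msg.get("explain_id"), msg.get("explain_graph")))
        elif kind == "chunk":
            parts.append(msg.get("response") or "")
        if msg.get("end_of_session"):
            break  # nothing further is expected after this flag
    return "".join(parts), explain_refs


stream = [
    {"message_type": "explain", "explain_id": "urn:trustgraph:question:abc123",
     "explain_graph": "urn:graph:retrieval"},
    {"message_type": "chunk", "response": "Customers can return items "},
    {"message_type": "chunk", "response": "within 30 days.", "end-of-stream": True},
    {"message_type": "explain", "explain_id": "urn:trustgraph:question:def456",
     "explain_graph": "urn:graph:retrieval", "end_of_session": True},
]
answer, refs = collect_document_rag(stream)
print(answer)
print(refs)
```

The collected `explain_id` values could then be looked up in the named graph given by `explain_graph` to display the provenance stages.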
### `tg-invoke-agent`
- **Explainability support**: Added `question_explainable()` mode showing provenance events inline during agent execution (Question, Analysis, Conclusion, AgentThought, AgentObservation, AgentAnswer).
- Verbose mode shows thought/observation streams with emoji prefixes.
### `tg-show-graph`
- **Streaming mode**: Now uses `triples_query_stream()` with configurable batch sizes for lower time-to-first-result and reduced memory overhead.
- **Named graph support**: New `--graph` filter option. Recognises named graphs:
  - Default graph (empty): Core knowledge facts
  - `urn:graph:source`: Extraction provenance
  - `urn:graph:retrieval`: Query-time explainability
- **Show graph column**: New `--show-graph` flag to display the named graph for each triple.
- **Configurable limits**: New `--limit` and `--batch-size` options.
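
The triples query schema in this commit describes the graph filter as three-valued: omitted means all graphs, an empty string means the default graph only, and a URI selects one named graph. A client-side equivalent over already-fetched triples could be sketched like this; `filter_by_graph` is a hypothetical helper, and treating an absent `g` field the same as an empty one follows the schema's "empty/absent" wording.

```python
def filter_by_graph(triples, graph=None):
    """Apply the three-valued named graph filter to a list of triple dicts."""
    if graph is None:
        return list(triples)  # no filter: every graph
    # "" selects the default graph, i.e. triples with no (or empty) "g".
    return [t for t in triples if (t.get("g") or "") == graph]


triples = [
    {"s": "a", "p": "b", "o": "c"},                           # default graph
    {"s": "d", "p": "e", "o": "f", "g": "urn:graph:source"},  # extraction provenance
]
print(len(filter_by_graph(triples)))
print(filter_by_graph(triples, ""))
print(filter_by_graph(triples, "urn:graph:source"))
```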
### `tg-graph-to-turtle`
- **RDF-star support**: Now handles quoted triples (RDF-star reification).
- **Streaming mode**: Uses streaming for lower time-to-first-processing.
- **Wire format handling**: Updated to use the new term wire format (`{"t": "i", "i": uri}` for IRIs, `{"t": "l", "v": value}` for literals, `{"t": "r", "r": {...}}` for quoted triples).
- **Named graph support**: New `--graph` filter option.
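
To show how the new term wire format lines up with Turtle syntax, the sketch below renders a Term dictionary as a Turtle-style token, including RDF-star quoted triples. It is not the code used by `tg-graph-to-turtle`: the names `escape_literal` and `term_to_turtle` are hypothetical, only a few escape sequences are handled, and blank nodes are rejected because their wire-format field is not described here.

```python
def escape_literal(text: str) -> str:
    """Escape the characters that would break a double-quoted literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def term_to_turtle(term: dict) -> str:
    """Render a v2.1 Term dict as a Turtle / RDF-star token."""
    kind = term["t"]
    if kind == "i":
        return f"<{term['i']}>"
    if kind == "l":
        out = f'"{escape_literal(term["v"])}"'
        if term.get("l"):   # language tag takes precedence in this sketch
            return f"{out}@{term['l']}"
        if term.get("d"):
            return f"{out}^^<{term['d']}>"
        return out
    if kind == "r":
        inner = term["r"]
        parts = " ".join(term_to_turtle(inner[k]) for k in ("s", "p", "o"))
        return f"<< {parts} >>"
    raise ValueError(f"unsupported term type: {kind!r}")


quoted = {"t": "r", "r": {
    "s": {"t": "i", "i": "http://example.com/Person1"},
    "p": {"t": "i", "i": "http://schema.org/name"},
    "o": {"t": "l", "v": "John Doe"},
}}
print(term_to_turtle(quoted))
```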
### `tg-set-tool`
- **New tool type**: `row-embeddings-query` for semantic search on structured data indexes.
- **New options**: `--schema-name`, `--index-name`, `--limit` for configuring row embeddings query tools.
### `tg-show-tools`
- Displays the new `row-embeddings-query` tool type with its `schema-name`, `index-name`, and `limit` fields.
### `tg-load-knowledge`
- **Progress reporting**: Now counts and reports triples and entity contexts loaded per file and in total.
- **Term format update**: Entity contexts now use the new Term format (`{"t": "i", "i": uri}`) instead of the old Value format (`{"v": entity, "e": True}`).
---
## Breaking Changes
- **Terminology rename**: The `Value` schema was renamed to `Term` across the system (PR #622). This affects the wire format used by CLI tools that interact with the graph store. The new format uses `{"t": "i", "i": uri}` for IRIs and `{"t": "l", "v": value}` for literals, replacing the old `{"v": ..., "e": ...}` format.
- **`tg-invoke-objects-query` renamed** to `tg-invoke-rows-query`.
- **`tg-load-pdf` and `tg-load-text` removed**.


View file

@ -1,14 +1,60 @@
 type: object
-description: RDF value - can be entity/URI or literal
-required:
-  - v
-  - e
+description: |
+  RDF Term - typed representation of a value in the knowledge graph.
+
+  Term types (discriminated by `t` field):
+  - `i`: IRI (URI reference)
+  - `l`: Literal (string value, optionally with datatype or language tag)
+  - `r`: Quoted triple (RDF-star reification)
+  - `b`: Blank node
 properties:
+  t:
+    type: string
+    description: Term type discriminator
+    enum: [i, l, r, b]
+    example: i
+  i:
+    type: string
+    description: IRI value (when t=i)
+    example: http://example.com/Person1
   v:
     type: string
-    description: Value (URI or literal text)
-    example: https://example.com/entity1
-  e:
-    type: boolean
-    description: True if entity/URI, false if literal
-    example: true
+    description: Literal value (when t=l)
+    example: John Doe
+  d:
+    type: string
+    description: Datatype IRI for literal (when t=l, optional)
+    example: http://www.w3.org/2001/XMLSchema#integer
+  l:
+    type: string
+    description: Language tag for literal (when t=l, optional)
+    example: en
+  r:
+    type: object
+    description: Quoted triple (when t=r) - contains s, p, o as nested Term objects with the same structure
+    properties:
+      s:
+        type: object
+        description: Subject term
+      p:
+        type: object
+        description: Predicate term
+      o:
+        type: object
+        description: Object term
+required:
+  - t
+examples:
+  - description: IRI term
+    value:
+      t: i
+      i: http://schema.org/name
+  - description: Literal term
+    value:
+      t: l
+      v: John Doe
+  - description: Literal with language tag
+    value:
+      t: l
+      v: Bonjour
+      l: fr

View file

@ -1,5 +1,6 @@
 type: object
-description: RDF triple (subject-predicate-object)
+description: |
+  RDF triple (subject-predicate-object), optionally scoped to a named graph.
 required:
   - s
   - p
@ -14,3 +15,7 @@ properties:
   o:
     $ref: './RdfValue.yaml'
     description: Object
+  g:
+    type: string
+    description: Named graph URI (optional)
+    example: urn:graph:source

View file

@ -9,12 +9,26 @@ properties:
       - action
       - observation
       - answer
+      - final-answer
       - error
     example: answer
   content:
     type: string
     description: Chunk content (streaming mode only)
     example: Paris is the capital of France.
+  message_type:
+    type: string
+    description: Message type - "chunk" for agent chunks, "explain" for explainability events
+    enum: [chunk, explain]
+    example: chunk
+  explain_id:
+    type: string
+    description: Explainability node URI (for explain messages)
+    example: urn:trustgraph:agent:abc123
+  explain_graph:
+    type: string
+    description: Named graph containing the explainability data
+    example: urn:graph:retrieval
   end-of-message:
     type: boolean
     description: Current chunk type is complete (streaming mode)

View file

@ -1,21 +1,60 @@
 type: object
 description: |
-  RDF value - represents either a URI/entity or a literal value.
+  RDF Term - typed representation of a value in the knowledge graph.

-  When `e` is true, `v` must be a full URI (e.g., http://schema.org/name).
-  When `e` is false, `v` is a literal value (string, number, etc.).
+  Term types (discriminated by `t` field):
+  - `i`: IRI (URI reference)
+  - `l`: Literal (string value, optionally with datatype or language tag)
+  - `r`: Quoted triple (RDF-star reification)
+  - `b`: Blank node
 properties:
+  t:
+    type: string
+    description: Term type discriminator
+    enum: [i, l, r, b]
+    example: i
+  i:
+    type: string
+    description: IRI value (when t=i)
+    example: http://example.com/Person1
   v:
     type: string
-    description: The value - full URI when e=true, literal when e=false
-    example: http://example.com/Person1
-  e:
-    type: boolean
-    description: True if entity/URI, false if literal value
-    example: true
+    description: Literal value (when t=l)
+    example: John Doe
+  d:
+    type: string
+    description: Datatype IRI for literal (when t=l, optional)
+    example: http://www.w3.org/2001/XMLSchema#integer
+  l:
+    type: string
+    description: Language tag for literal (when t=l, optional)
+    example: en
+  r:
+    type: object
+    description: Quoted triple (when t=r) - contains s, p, o as nested Term objects with the same structure
+    properties:
+      s:
+        type: object
+        description: Subject term
+      p:
+        type: object
+        description: Predicate term
+      o:
+        type: object
+        description: Object term
 required:
-  - v
-  - e
-example:
-  v: http://schema.org/name
-  e: true
+  - t
+examples:
+  - description: IRI term
+    value:
+      t: i
+      i: http://schema.org/name
+  - description: Literal term
+    value:
+      t: l
+      v: John Doe
+  - description: Literal with language tag
+    value:
+      t: l
+      v: Bonjour
+      l: fr

View file

@ -1,6 +1,7 @@
 type: object
 description: |
-  RDF triple representing a subject-predicate-object statement in the knowledge graph.
+  RDF triple representing a subject-predicate-object statement in the knowledge graph,
+  optionally scoped to a named graph.
   Example: (Person1) -[has name]-> ("John Doe")
 properties:
@ -13,17 +14,26 @@ properties:
   o:
     $ref: './RdfValue.yaml'
     description: Object - the value or target entity
+  g:
+    type: string
+    description: |
+      Named graph URI (optional). When absent, the triple is in the default graph.
+      Well-known graphs:
+      - (empty/absent): Core knowledge facts
+      - urn:graph:source: Extraction provenance
+      - urn:graph:retrieval: Query-time explainability
+    example: urn:graph:source
 required:
   - s
   - p
   - o
 example:
   s:
-    v: http://example.com/Person1
-    e: true
+    t: i
+    i: http://example.com/Person1
   p:
-    v: http://schema.org/name
-    e: true
+    t: i
+    i: http://schema.org/name
   o:
+    t: l
     v: John Doe
-    e: false

View file

@ -1,12 +1,22 @@
 type: object
-description: Document embeddings query response
+description: Document embeddings query response with matching chunks and similarity scores
 properties:
   chunks:
     type: array
-    description: Similar document chunks (text strings)
+    description: Matching document chunks with similarity scores
     items:
-      type: string
+      type: object
+      properties:
+        chunk_id:
+          type: string
+          description: Chunk identifier URI
+          example: "urn:trustgraph:chunk:abc123"
+        score:
+          type: number
+          description: Similarity score (higher is more similar)
+          example: 0.89
     example:
-      - "Quantum computing uses quantum mechanics principles for computation..."
-      - "Neural networks are computing systems inspired by biological neurons..."
-      - "Machine learning algorithms learn patterns from data..."
+      - chunk_id: "urn:trustgraph:chunk:abc123"
+        score: 0.95
+      - chunk_id: "urn:trustgraph:chunk:def456"
+        score: 0.82

View file

@ -1,12 +1,21 @@
 type: object
-description: Graph embeddings query response
+description: Graph embeddings query response with matching entities and similarity scores
 properties:
   entities:
     type: array
-    description: Similar entities (RDF values)
+    description: Matching graph entities with similarity scores
     items:
-      $ref: '../../common/RdfValue.yaml'
+      type: object
+      properties:
+        entity:
+          $ref: '../../common/RdfValue.yaml'
+          description: Matching graph entity
+        score:
+          type: number
+          description: Similarity score (higher is more similar)
+          example: 0.92
     example:
-      - {v: "https://example.com/person/alice", e: true}
-      - {v: "https://example.com/person/bob", e: true}
-      - {v: "https://example.com/concept/quantum", e: true}
+      - entity: {t: i, i: "https://example.com/person/alice"}
+        score: 0.95
+      - entity: {t: i, i: "https://example.com/concept/quantum"}
+        score: 0.82

View file

@ -28,3 +28,23 @@ properties:
     description: Collection to query
     default: default
     example: research
+  g:
+    type: string
+    description: |
+      Named graph filter (optional).
+      - Omitted/null: all graphs
+      - Empty string: default graph only
+      - URI string: specific named graph (e.g., urn:graph:source, urn:graph:retrieval)
+    example: urn:graph:source
+  streaming:
+    type: boolean
+    description: Enable streaming response delivery
+    default: false
+    example: true
+  batch-size:
+    type: integer
+    description: Number of triples per streaming batch
+    default: 20
+    minimum: 1
+    maximum: 1000
+    example: 50

View file

@ -1,13 +1,31 @@
 type: object
-description: Document RAG response
+description: Document RAG response message
 properties:
+  message_type:
+    type: string
+    description: Type of message - "chunk" for LLM response chunks, "explain" for explainability events
+    enum: [chunk, explain]
+    example: chunk
   response:
     type: string
-    description: Generated response based on retrieved documents
-    example: The research papers found three key findings...
+    description: Generated response text (for chunk messages)
+    example: Based on the policy documents, customers can return items within 30 days...
+  explain_id:
+    type: string
+    description: Explainability node URI (for explain messages)
+    example: urn:trustgraph:question:abc123
+  explain_graph:
+    type: string
+    description: Named graph containing the explainability data
+    example: urn:graph:retrieval
   end-of-stream:
     type: boolean
-    description: Indicates streaming is complete (streaming mode)
+    description: Indicates LLM response stream is complete
+    default: false
+    example: true
+  end_of_session:
+    type: boolean
+    description: Indicates entire session is complete (all messages sent)
     default: false
     example: true
   error:

View file

@ -3,17 +3,21 @@ description: Graph RAG response message
 properties:
   message_type:
     type: string
-    description: Type of message - "chunk" for LLM response chunks, "provenance" for provenance announcements
-    enum: [chunk, provenance]
+    description: Type of message - "chunk" for LLM response chunks, "explain" for explainability events
+    enum: [chunk, explain]
     example: chunk
   response:
     type: string
     description: Generated response text (for chunk messages)
     example: Quantum physics and computer science intersect in quantum computing...
-  provenance_id:
+  explain_id:
     type: string
-    description: Provenance node URI (for provenance messages)
-    example: urn:trustgraph:session:abc123
+    description: Explainability node URI (for explain messages)
+    example: urn:trustgraph:question:abc123
+  explain_graph:
+    type: string
+    description: Named graph containing the explainability data
+    example: urn:graph:retrieval
   end_of_stream:
     type: boolean
     description: Indicates LLM response stream is complete

View file

@ -2,7 +2,7 @@ openapi: 3.1.0
 info:
   title: TrustGraph API Gateway
-  version: "1.8"
+  version: "2.1"
   description: |
     REST API for TrustGraph - an AI-powered knowledge graph and RAG system.
@ -28,7 +28,7 @@ info:
     Require running flow instance, accessed via `/api/v1/flow/{flow}/service/{kind}`:
     - AI services: agent, text-completion, prompt, RAG (document/graph)
     - Embeddings: embeddings, graph-embeddings, document-embeddings
-    - Query: triples, objects, nlp-query, structured-query
+    - Query: triples, rows, nlp-query, structured-query, row-embeddings
     - Data loading: text-load, document-load
     - Utilities: mcp-tool, structured-diag
@ -140,6 +140,10 @@ paths:
   /api/v1/flow/{flow}/service/document-load:
     $ref: './paths/flow/document-load.yaml'
+  # Document streaming
+  /api/v1/document-stream:
+    $ref: './paths/document-stream.yaml'
   # Import/Export endpoints
   /api/v1/import-core:
     $ref: './paths/import-core.yaml'

View file

@ -0,0 +1,53 @@
get:
  tags:
    - Import/Export
  summary: Stream document content from library
  description: |
    Streams the raw content of a document stored in the library.
    Returns the document content in chunked transfer encoding.
    ## Parameters
    - `user`: User identifier (required)
    - `document-id`: Document IRI to retrieve (required)
    - `chunk-size`: Size of each response chunk in bytes (optional, default: 1MB)
  operationId: documentStream
  security:
    - bearerAuth: []
  parameters:
    - name: user
      in: query
      required: true
      schema:
        type: string
      description: User identifier
      example: trustgraph
    - name: document-id
      in: query
      required: true
      schema:
        type: string
      description: Document IRI to retrieve
      example: "urn:trustgraph:doc:abc123"
    - name: chunk-size
      in: query
      required: false
      schema:
        type: integer
        default: 1048576
      description: Chunk size in bytes (default 1MB)
  responses:
    '200':
      description: Document content streamed as raw bytes
      content:
        application/octet-stream:
          schema:
            type: string
            format: binary
    '400':
      description: Missing required parameters
    '401':
      $ref: '../components/responses/Unauthorized.yaml'
    '500':
      $ref: '../components/responses/Error.yaml'

View file

@ -24,7 +24,7 @@ echo
 # Build WebSocket API documentation
 echo "Building WebSocket API documentation (AsyncAPI)..."
 cd ../websocket
-npx --yes -p @asyncapi/cli asyncapi generate fromTemplate asyncapi.yaml @asyncapi/html-template@3.0.0 --use-new-generator -o /tmp/asyncapi-build -p singleFile=true --force-write
+npx --yes -p @asyncapi/cli asyncapi generate fromTemplate asyncapi.yaml @asyncapi/html-template -o /tmp/asyncapi-build -p singleFile=true --force-write
 mv /tmp/asyncapi-build/index.html ../../docs/websocket.html
 rm -rf /tmp/asyncapi-build
 echo "✓ WebSocket API docs generated: docs/websocket.html"

View file

@ -2,7 +2,7 @@ asyncapi: 3.0.0
 info:
   title: TrustGraph WebSocket API
-  version: "1.8"
+  version: "2.1"
   description: |
     WebSocket API for TrustGraph - providing multiplexed, asynchronous access to all services.
@ -31,7 +31,7 @@ info:
     **Flow-Hosted Services** (require `flow` parameter):
     - agent, text-completion, prompt, document-rag, graph-rag
     - embeddings, graph-embeddings, document-embeddings
-    - triples, objects, nlp-query, structured-query, structured-diag
+    - triples, rows, nlp-query, structured-query, structured-diag, row-embeddings
     - text-load, document-load, mcp-tool
     ## Schema Reuse