diff --git a/docs/tech-specs/openapi-spec.md b/docs/tech-specs/openapi-spec.md
new file mode 100644
index 00000000..ec82681d
--- /dev/null
+++ b/docs/tech-specs/openapi-spec.md
@@ -0,0 +1,231 @@
+# OpenAPI Specification - Technical Spec
+
+## Goal
+
+Create a comprehensive, modular OpenAPI 3.1 specification for the TrustGraph REST API Gateway that:
+- Documents all REST endpoints
+- Uses external `$ref` for modularity and maintainability
+- Maps directly to the message translator code
+- Provides accurate request/response schemas
+
+## Source of Truth
+
+The API is defined by:
+- **Message Translators**: `trustgraph-base/trustgraph/messaging/translators/*.py`
+- **Dispatcher Manager**: `trustgraph-flow/trustgraph/gateway/dispatch/manager.py`
+- **Endpoint Manager**: `trustgraph-flow/trustgraph/gateway/endpoint/manager.py`
+
+## Directory Structure
+
+```
+openapi/
+├── openapi.yaml                  # Main entry point
+├── paths/
+│   ├── config.yaml               # Global services
+│   ├── flow.yaml
+│   ├── librarian.yaml
+│   ├── knowledge.yaml
+│   ├── collection-management.yaml
+│   ├── flow-services/            # Flow-hosted services
+│   │   ├── agent.yaml
+│   │   ├── document-rag.yaml
+│   │   ├── graph-rag.yaml
+│   │   ├── text-completion.yaml
+│   │   ├── prompt.yaml
+│   │   ├── embeddings.yaml
+│   │   ├── mcp-tool.yaml
+│   │   ├── triples.yaml
+│   │   ├── objects.yaml
+│   │   ├── nlp-query.yaml
+│   │   ├── structured-query.yaml
+│   │   ├── structured-diag.yaml
+│   │   ├── graph-embeddings.yaml
+│   │   ├── document-embeddings.yaml
+│   │   ├── text-load.yaml
+│   │   └── document-load.yaml
+│   ├── import-export/
+│   │   ├── core-import.yaml
+│   │   ├── core-export.yaml
+│   │   └── flow-import-export.yaml  # WebSocket import/export
+│   ├── websocket.yaml
+│   └── metrics.yaml
+├── components/
+│   ├── schemas/
+│   │   ├── config/
+│   │   ├── flow/
+│   │   ├── librarian/
+│   │   ├── knowledge/
+│   │   ├── collection/
+│   │   ├── ai-services/
+│   │   ├── common/
+│   │   └── errors/
+│   ├── parameters/
+│   ├── responses/
+│   └── examples/
+└── security/
+    └── bearerAuth.yaml
+```
+
+## Service Mapping
+
+### Global Services (`/api/v1/{kind}`)
+- `config` - Configuration management
+- `flow` - Flow lifecycle
+- `librarian` - Document library
+- `knowledge` - Knowledge cores
+- `collection-management` - Collection metadata
+
+### Flow-Hosted Services (`/api/v1/flow/{flow}/service/{kind}`)
+
+**Request/Response:**
+- `agent`, `text-completion`, `prompt`, `mcp-tool`
+- `graph-rag`, `document-rag`
+- `embeddings`, `graph-embeddings`, `document-embeddings`
+- `triples`, `objects`, `nlp-query`, `structured-query`, `structured-diag`
+
+**Fire-and-Forget:**
+- `text-load`, `document-load`
+
+### Import/Export
+- `/api/v1/import-core` (POST)
+- `/api/v1/export-core` (GET)
+- `/api/v1/flow/{flow}/import/{kind}` (WebSocket)
+- `/api/v1/flow/{flow}/export/{kind}` (WebSocket)
+
+### Other
+- `/api/v1/socket` (WebSocket multiplexed)
+- `/api/metrics` (Prometheus)
+
+## Approach
+
+### Phase 1: Setup
+1. Create directory structure
+2. Create main `openapi.yaml` with metadata, servers, security
+3. Create reusable components (errors, common parameters, security schemes)
+
+### Phase 2: Common Schemas
+Create shared schemas used across services:
+- `RdfValue`, `Triple` - RDF/triple structures
+- `ErrorObject` - Error response
+- `DocumentMetadata`, `ProcessingMetadata` - Metadata structures
+- Common parameters: `FlowId`, `User`, `Collection`
+
+### Phase 3: Global Services
+For each global service (config, flow, librarian, knowledge, collection-management):
+1. Create path file in `paths/`
+2. Create request schema in `components/schemas/{service}/`
+3. Create response schema
+4. Add examples
+5. Reference from main `openapi.yaml`
+
+### Phase 4: Flow-Hosted Services
+For each flow-hosted service:
+1. Create path file in `paths/flow-services/`
+2. Create request/response schemas in `components/schemas/ai-services/`
+3. Add streaming flag documentation where applicable
+4. Reference from main `openapi.yaml`
+
+### Phase 5: Import/Export & WebSocket
+1. Document core import/export endpoints
+2. Document WebSocket protocol patterns
+3. Document flow-level import/export WebSocket endpoints
+
+### Phase 6: Validation
+1. Validate with OpenAPI validator tools
+2. Test with Swagger UI
+3. Verify all translators are covered
+
+## Field Naming Convention
+
+All JSON fields use **kebab-case**:
+- `flow-id`, `blueprint-name`, `doc-limit`, `entity-limit`, etc.
+
+## Creating Schema Files
+
+For each translator in `trustgraph-base/trustgraph/messaging/translators/`:
+
+1. **Read translator `to_pulsar()` method** - Defines request schema
+2. **Read translator `from_pulsar()` method** - Defines response schema
+3. **Extract field names and types**
+4. **Create OpenAPI schema** with:
+   - Field names (kebab-case)
+   - Types (string, integer, boolean, object, array)
+   - Required fields
+   - Defaults
+   - Descriptions
+
+### Example Mapping Process
+
+```python
+# From retrieval.py DocumentRagRequestTranslator
+def to_pulsar(self, data: Dict[str, Any]) -> DocumentRagQuery:
+    return DocumentRagQuery(
+        query=data["query"],                            # required string
+        user=data.get("user", "trustgraph"),            # optional string, default "trustgraph"
+        collection=data.get("collection", "default"),   # optional string, default "default"
+        doc_limit=int(data.get("doc-limit", 20)),       # optional integer, default 20
+        streaming=data.get("streaming", False)          # optional boolean, default false
+    )
+```
+
+Maps to:
+
+```yaml
+# components/schemas/ai-services/DocumentRagRequest.yaml
+type: object
+required:
+  - query
+properties:
+  query:
+    type: string
+    description: Search query
+  user:
+    type: string
+    default: trustgraph
+  collection:
+    type: string
+    default: default
+  doc-limit:
+    type: integer
+    default: 20
+    description: Maximum number of documents to retrieve
+  streaming:
+    type: boolean
+    default: false
+    description: Enable streaming responses
+```
+
+## Streaming Responses
+
+Services that support streaming return multiple responses with `end_of_stream` flag:
+- `agent`, `text-completion`, `prompt`
+- `document-rag`, `graph-rag`
+
+Document this pattern in each service's response schema.
+
+## Error Responses
+
+All services can return:
+```yaml
+error:
+  oneOf:
+    - type: string
+    - $ref: '#/components/schemas/ErrorObject'
+```
+
+Where `ErrorObject` is:
+```yaml
+type: object
+properties:
+  type:
+    type: string
+  message:
+    type: string
+```
+
+## References
+
+- Translators: `trustgraph-base/trustgraph/messaging/translators/`
+- Dispatcher mapping: `trustgraph-flow/trustgraph/gateway/dispatch/manager.py`
+- Endpoint routing: `trustgraph-flow/trustgraph/gateway/endpoint/manager.py`
+- Service summary: `API_SERVICES_SUMMARY.md`
diff --git a/specs/api/README.md b/specs/api/README.md
new file mode 100644
index 00000000..b9335579
--- /dev/null
+++ b/specs/api/README.md
@@ -0,0 +1,84 @@
+# TrustGraph OpenAPI Specification
+
+This directory contains the modular OpenAPI 3.1 specification for the TrustGraph REST API Gateway.
+
+## Structure
+
+```
+specs/api/
+├── openapi.yaml              # Main entry point
+├── paths/                    # Endpoint definitions
+│   ├── config.yaml
+│   ├── flow.yaml
+│   ├── flow-services/        # Flow-hosted services
+│   └── import-export/
+├── components/
+│   ├── schemas/              # Request/response schemas
+│   │   ├── config/
+│   │   ├── flow/
+│   │   ├── ai-services/
+│   │   ├── common/
+│   │   └── errors/
+│   ├── parameters/           # Reusable parameters
+│   ├── responses/            # Reusable responses
+│   └── examples/             # Example payloads
+└── security/                 # Security schemes
+    └── bearerAuth.yaml
+```
+
+## Viewing the Spec
+
+### Swagger UI
+
+```bash
+# Install swagger-ui
+npm install -g swagger-ui-watcher
+
+# View in browser
+swagger-ui-watcher specs/api/openapi.yaml
+```
+
+### Redoc
+
+```bash
+# Install redoc-cli
+npm install -g redoc-cli
+
+# Generate static HTML
+redoc-cli bundle specs/api/openapi.yaml -o api-docs.html
+
+# View
+open api-docs.html
+```
+
+### Online Validators
+
+Upload `openapi.yaml` to:
+- https://editor.swagger.io/
+- https://redocly.com/redoc/
+
+## Validation
+
+```bash
+# Install openapi-spec-validator
+pip install openapi-spec-validator
+
+# Validate
+openapi-spec-validator specs/api/openapi.yaml
+```
+
+## Development
+
+When adding a new service:
+
+1. Create schema files in `components/schemas/{service}/`
+2. Create path file in `paths/` or `paths/flow-services/`
+3. Add examples if needed
+4. Reference from `openapi.yaml`
+5. Validate
+
+## References
+
+- [OpenAPI 3.1 Specification](https://spec.openapis.org/oas/v3.1.0)
+- [TrustGraph Tech Spec](../../docs/tech-specs/openapi-spec.md)
+- [API Services Summary](../../API_SERVICES_SUMMARY.md)
diff --git a/specs/api/components/common/DocumentMetadata.yaml b/specs/api/components/common/DocumentMetadata.yaml
new file mode 100644
index 00000000..43edc273
--- /dev/null
+++ b/specs/api/components/common/DocumentMetadata.yaml
@@ -0,0 +1,23 @@
+type: object
+description: Document metadata
+properties:
+  url:
+    type: string
+    description: Document URL
+    example: https://example.com/document.pdf
+  title:
+    type: string
+    description: Document title
+    example: Example Document
+  author:
+    type: string
+    description: Document author
+    example: John Doe
+  metadata:
+    type: object
+    description: Additional metadata
+    additionalProperties:
+      type: string
+    example:
+      department: Engineering
+      category: Technical
diff --git a/specs/api/components/common/ProcessingMetadata.yaml b/specs/api/components/common/ProcessingMetadata.yaml
new file mode 100644
index 00000000..8f141383
--- /dev/null
+++ b/specs/api/components/common/ProcessingMetadata.yaml
@@ -0,0 +1,21 @@
+type: object
+description: Processing task metadata
+properties:
+  flow:
+    type: string
+    description: Flow ID
+    example: my-flow
+  collection:
+    type: string
+    description: Collection identifier
+    example: default
+  status:
+    type: string
+    description: Processing status
+    enum: [pending, processing, completed, failed]
+    example: processing
+  timestamp:
+    type: string
+    description: ISO timestamp
+    format: date-time
+    example: "2024-01-15T10:30:00Z"
diff --git a/specs/api/components/common/RdfValue.yaml b/specs/api/components/common/RdfValue.yaml
new file mode 100644
index 00000000..5ed7c992
--- /dev/null
+++ b/specs/api/components/common/RdfValue.yaml
@@ -0,0 +1,14 @@
+type: object
+description: RDF value - can be entity/URI or literal
+required:
+  - v
+  - e
+properties:
+  v:
+    type: string
+    description: Value (URI or literal text)
+    example: https://example.com/entity1
+  e:
+    type: boolean
+    description: True if entity/URI, false if literal
+    example: true
diff --git a/specs/api/components/common/Triple.yaml b/specs/api/components/common/Triple.yaml
new file mode 100644
index 00000000..142be0e9
--- /dev/null
+++ b/specs/api/components/common/Triple.yaml
@@ -0,0 +1,16 @@
+type: object
+description: RDF triple (subject-predicate-object)
+required:
+  - s
+  - p
+  - o
+properties:
+  s:
+    $ref: './RdfValue.yaml'
+    description: Subject
+  p:
+    $ref: './RdfValue.yaml'
+    description: Predicate
+  o:
+    $ref: './RdfValue.yaml'
+    description: Object
diff --git a/specs/api/components/parameters/Collection.yaml b/specs/api/components/parameters/Collection.yaml
new file mode 100644
index 00000000..ecbb0836
--- /dev/null
+++ b/specs/api/components/parameters/Collection.yaml
@@ -0,0 +1,8 @@
+name: collection
+in: query
+required: false
+schema:
+  type: string
+  default: default
+description: Collection identifier
+example: default
diff --git a/specs/api/components/parameters/FlowId.yaml b/specs/api/components/parameters/FlowId.yaml
new file mode 100644
index 00000000..98f6e149
--- /dev/null
+++ b/specs/api/components/parameters/FlowId.yaml
@@ -0,0 +1,7 @@
+name: flow
+in: path
+required: true
+schema:
+  type: string
+description: Flow instance ID
+example: my-flow
diff --git a/specs/api/components/parameters/User.yaml b/specs/api/components/parameters/User.yaml
new file mode 100644
index 00000000..ad0657ca
--- /dev/null
+++ b/specs/api/components/parameters/User.yaml
@@ -0,0 +1,8 @@
+name: user
+in: query
+required: false
+schema:
+  type: string
+  default: trustgraph
+description: User identifier
+example: alice
diff --git a/specs/api/components/responses/Error.yaml b/specs/api/components/responses/Error.yaml
new file mode 100644
index 00000000..c3dbe5aa
--- /dev/null
+++ b/specs/api/components/responses/Error.yaml
@@ -0,0 +1,23 @@
+description: Error response
+content:
+  application/json:
+    schema:
+      type: object
+      properties:
+        error:
+          oneOf:
+            - type: string
+              description: Simple error message
+            - $ref: '../schemas/errors/ErrorObject.yaml'
+              description: Structured error with type and message
+    examples:
+      simpleError:
+        summary: Simple error message
+        value:
+          error: Invalid flow ID
+      structuredError:
+        summary: Structured error
+        value:
+          error:
+            type: gateway-error
+            message: Timeout
diff --git a/specs/api/components/responses/Unauthorized.yaml b/specs/api/components/responses/Unauthorized.yaml
new file mode 100644
index 00000000..6f903c39
--- /dev/null
+++ b/specs/api/components/responses/Unauthorized.yaml
@@ -0,0 +1,9 @@
+description: Unauthorized - Invalid or missing bearer token
+content:
+  application/json:
+    schema:
+      type: object
+      properties:
+        error:
+          type: string
+          example: Unauthorized
diff --git a/specs/api/components/schemas/agent/AgentRequest.yaml b/specs/api/components/schemas/agent/AgentRequest.yaml
new file mode 100644
index 00000000..ddf2019a
--- /dev/null
+++ b/specs/api/components/schemas/agent/AgentRequest.yaml
@@ -0,0 +1,59 @@
+type: object
+description: |
+  Agent service request - conversational AI agent that can reason and take actions.
+required:
+  - question
+properties:
+  question:
+    type: string
+    description: User question or prompt for the agent
+    example: What is the capital of France?
+  state:
+    type: string
+    description: Agent state for continuation (optional, for multi-turn)
+    example: agent-state-12345
+  group:
+    type: array
+    description: Group identifiers for collaborative agents (optional)
+    items:
+      type: string
+    example: ["research-team"]
+  history:
+    type: array
+    description: Conversation history (optional, list of previous agent steps)
+    items:
+      type: object
+      properties:
+        thought:
+          type: string
+          description: Agent's reasoning
+          example: I need to search for information about Paris
+        action:
+          type: string
+          description: Action taken
+          example: search
+        arguments:
+          type: object
+          description: Action arguments
+          additionalProperties:
+            type: string
+          example:
+            query: "capital of France"
+        observation:
+          type: string
+          description: Result of the action
+          example: "Paris is the capital of France"
+        user:
+          type: string
+          description: User context for this step
+          example: alice
+  user:
+    type: string
+    description: User identifier for multi-tenancy
+    default: trustgraph
+    example: alice
+  streaming:
+    type: boolean
+    description: Enable streaming response delivery
+    default: false
+    example: true
diff --git a/specs/api/components/schemas/agent/AgentResponse.yaml b/specs/api/components/schemas/agent/AgentResponse.yaml
new file mode 100644
index 00000000..86d636b5
--- /dev/null
+++ b/specs/api/components/schemas/agent/AgentResponse.yaml
@@ -0,0 +1,51 @@
+type: object
+description: Agent service response (streaming or legacy format)
+properties:
+  chunk-type:
+    type: string
+    description: Type of streaming chunk (streaming mode only)
+    enum:
+      - thought
+      - action
+      - observation
+      - answer
+      - error
+    example: answer
+  content:
+    type: string
+    description: Chunk content (streaming mode only)
+    example: Paris is the capital of France.
+  end-of-message:
+    type: boolean
+    description: Current chunk type is complete (streaming mode)
+    default: false
+    example: true
+  end-of-dialog:
+    type: boolean
+    description: Entire agent dialog is complete (streaming mode)
+    default: false
+    example: true
+  answer:
+    type: string
+    description: Final answer (legacy non-streaming format)
+    example: Paris is the capital of France.
+  thought:
+    type: string
+    description: Agent reasoning (legacy format)
+    example: I should search for information about the capital of France.
+  observation:
+    type: string
+    description: Observation from actions (legacy format)
+    example: Found information about Paris being the capital.
+  error:
+    type: object
+    description: Error details if request failed
+    properties:
+      message:
+        type: string
+        description: Error message
+        example: Failed to process agent request
+      code:
+        type: string
+        description: Error code
+        example: AGENT_ERROR
diff --git a/specs/api/components/schemas/collection/CollectionRequest.yaml b/specs/api/components/schemas/collection/CollectionRequest.yaml
new file mode 100644
index 00000000..bf3ab7d4
--- /dev/null
+++ b/specs/api/components/schemas/collection/CollectionRequest.yaml
@@ -0,0 +1,58 @@
+type: object
+description: |
+  Collection management request.
+
+  Operations: list-collections, update-collection, delete-collection
+required:
+  - operation
+properties:
+  operation:
+    type: string
+    enum:
+      - list-collections
+      - update-collection
+      - delete-collection
+    description: |
+      Collection operation:
+      - `list-collections`: List collections for user
+      - `update-collection`: Create or update collection metadata
+      - `delete-collection`: Delete collection
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection identifier (for update, delete)
+    example: research
+  timestamp:
+    type: string
+    description: ISO timestamp
+    format: date-time
+    example: "2024-01-15T10:30:00Z"
+  name:
+    type: string
+    description: Human-readable collection name (for update)
+    example: Research Papers
+  description:
+    type: string
+    description: Collection description (for update)
+    example: Academic research papers on AI and ML
+  tags:
+    type: array
+    description: Collection tags for organization (for update)
+    items:
+      type: string
+    example: ["research", "AI", "academic"]
+  tag-filter:
+    type: array
+    description: Filter collections by tags (for list)
+    items:
+      type: string
+    example: ["research"]
+  limit:
+    type: integer
+    description: Maximum number of results (for list)
+    default: 0
+    example: 100
diff --git a/specs/api/components/schemas/collection/CollectionResponse.yaml b/specs/api/components/schemas/collection/CollectionResponse.yaml
new file mode 100644
index 00000000..f924cbf5
--- /dev/null
+++ b/specs/api/components/schemas/collection/CollectionResponse.yaml
@@ -0,0 +1,39 @@
+type: object
+description: Collection management response
+properties:
+  timestamp:
+    type: string
+    description: ISO timestamp
+    format: date-time
+    example: "2024-01-15T10:30:00Z"
+  collections:
+    type: array
+    description: List of collections (returned by list-collections)
+    items:
+      type: object
+      required:
+        - user
+        - collection
+      properties:
+        user:
+          type: string
+          description: User identifier
+          example: alice
+        collection:
+          type: string
+          description: Collection identifier
+          example: research
+        name:
+          type: string
+          description: Human-readable collection name
+          example: Research Papers
+        description:
+          type: string
+          description: Collection description
+          example: Academic research papers on AI and ML
+        tags:
+          type: array
+          description: Collection tags
+          items:
+            type: string
+          example: ["research", "AI", "academic"]
diff --git a/specs/api/components/schemas/common/DocumentMetadata.yaml b/specs/api/components/schemas/common/DocumentMetadata.yaml
new file mode 100644
index 00000000..77e2206e
--- /dev/null
+++ b/specs/api/components/schemas/common/DocumentMetadata.yaml
@@ -0,0 +1,26 @@
+type: object
+description: Document metadata for library management
+properties:
+  url:
+    type: string
+    description: Document URL or identifier
+    example: https://example.com/document.pdf
+  title:
+    type: string
+    description: Document title
+    example: Example Document
+  author:
+    type: string
+    description: Document author
+    example: John Doe
+  date:
+    type: string
+    description: Document date
+    example: "2024-01-15"
+  metadata:
+    type: object
+    description: Additional metadata fields
+    additionalProperties: true
+    example:
+      department: Engineering
+      category: Technical
diff --git a/specs/api/components/schemas/common/ProcessingMetadata.yaml b/specs/api/components/schemas/common/ProcessingMetadata.yaml
new file mode 100644
index 00000000..d74a0efa
--- /dev/null
+++ b/specs/api/components/schemas/common/ProcessingMetadata.yaml
@@ -0,0 +1,25 @@
+type: object
+description: Processing metadata for library document processing
+properties:
+  flow:
+    type: string
+    description: Flow ID
+    example: my-flow
+  collection:
+    type: string
+    description: Collection identifier
+    example: default
+  status:
+    type: string
+    description: Processing status
+    enum: [pending, processing, completed, failed]
+    example: completed
+  timestamp:
+    type: string
+    format: date-time
+    description: Processing timestamp
+    example: "2024-01-15T10:30:00Z"
+  error:
+    type: string
+    description: Error message if processing failed
+    example: Failed to extract text from PDF
diff --git a/specs/api/components/schemas/common/RdfValue.yaml b/specs/api/components/schemas/common/RdfValue.yaml
new file mode 100644
index 00000000..ce8b4c08
--- /dev/null
+++ b/specs/api/components/schemas/common/RdfValue.yaml
@@ -0,0 +1,21 @@
+type: object
+description: |
+  RDF value - represents either a URI/entity or a literal value.
+
+  When `e` is true, `v` must be a full URI (e.g., http://schema.org/name).
+  When `e` is false, `v` is a literal value (string, number, etc.).
+properties:
+  v:
+    type: string
+    description: The value - full URI when e=true, literal when e=false
+    example: http://example.com/Person1
+  e:
+    type: boolean
+    description: True if entity/URI, false if literal value
+    example: true
+required:
+  - v
+  - e
+example:
+  v: http://schema.org/name
+  e: true
diff --git a/specs/api/components/schemas/common/Triple.yaml b/specs/api/components/schemas/common/Triple.yaml
new file mode 100644
index 00000000..1f72b89a
--- /dev/null
+++ b/specs/api/components/schemas/common/Triple.yaml
@@ -0,0 +1,29 @@
+type: object
+description: |
+  RDF triple representing a subject-predicate-object statement in the knowledge graph.
+
+  Example: (Person1) -[has name]-> ("John Doe")
+properties:
+  s:
+    $ref: './RdfValue.yaml'
+    description: Subject - the entity the statement is about
+  p:
+    $ref: './RdfValue.yaml'
+    description: Predicate - the property or relationship
+  o:
+    $ref: './RdfValue.yaml'
+    description: Object - the value or target entity
+required:
+  - s
+  - p
+  - o
+example:
+  s:
+    v: http://example.com/Person1
+    e: true
+  p:
+    v: http://schema.org/name
+    e: true
+  o:
+    v: John Doe
+    e: false
diff --git a/specs/api/components/schemas/config/ConfigRequest.yaml b/specs/api/components/schemas/config/ConfigRequest.yaml
new file mode 100644
index 00000000..aa39e519
--- /dev/null
+++ b/specs/api/components/schemas/config/ConfigRequest.yaml
@@ -0,0 +1,67 @@
+type: object
+description: |
+  Configuration service request.
+
+  Supports operations: config, list, get, put, delete
+required:
+  - operation
+properties:
+  operation:
+    type: string
+    enum: [config, list, get, put, delete]
+    description: |
+      Operation to perform:
+      - `config`: Get complete configuration
+      - `list`: List all items of a specific type
+      - `get`: Get specific configuration items
+      - `put`: Set/update configuration values
+      - `delete`: Delete configuration items
+    example: config
+  type:
+    type: string
+    description: |
+      Configuration type (required for list, get, put, delete operations).
+      Common types: flow, prompt, token-cost, parameter-type, interface-description
+    example: flow
+  keys:
+    type: array
+    description: Keys to retrieve (for get operation) or delete (for delete operation)
+    items:
+      type: object
+      required:
+        - type
+        - key
+      properties:
+        type:
+          type: string
+          description: Configuration type
+          example: flow
+        key:
+          type: string
+          description: Configuration key
+          example: my-flow
+  values:
+    type: array
+    description: Values to set/update (for put operation)
+    items:
+      type: object
+      required:
+        - type
+        - key
+        - value
+      properties:
+        type:
+          type: string
+          description: Configuration type
+          example: flow
+        key:
+          type: string
+          description: Configuration key
+          example: my-flow
+        value:
+          type: object
+          description: Configuration value (structure depends on type)
+          additionalProperties: true
+          example:
+            blueprint-name: document-rag
+            description: My RAG flow
diff --git a/specs/api/components/schemas/config/ConfigResponse.yaml b/specs/api/components/schemas/config/ConfigResponse.yaml
new file mode 100644
index 00000000..9815c51e
--- /dev/null
+++ b/specs/api/components/schemas/config/ConfigResponse.yaml
@@ -0,0 +1,49 @@
+type: object
+description: Configuration service response
+properties:
+  version:
+    type: integer
+    description: Configuration version number
+    example: 42
+  config:
+    type: object
+    description: Complete configuration (returned by 'config' operation)
+    additionalProperties: true
+    example:
+      flow:
+        default:
+          blueprint-name: document-rag+graph-rag
+          description: Default flow
+      prompt:
+        system: You are a helpful AI assistant
+      token-cost:
+        gpt-4:
+          prompt: 0.03
+          completion: 0.06
+  directory:
+    type: array
+    description: List of keys (returned by 'list' operation)
+    items:
+      type: string
+    example:
+      - default
+      - production
+      - my-flow
+  values:
+    type: array
+    description: Retrieved configuration values (returned by 'get' operation)
+    items:
+      type: object
+      properties:
+        type:
+          type: string
+          example: flow
+        key:
+          type: string
+          example: default
+        value:
+          type: object
+          additionalProperties: true
+          example:
+            blueprint-name: document-rag+graph-rag
+            description: Default flow
diff --git a/specs/api/components/schemas/diag/StructuredDiagRequest.yaml b/specs/api/components/schemas/diag/StructuredDiagRequest.yaml
new file mode 100644
index 00000000..cb692e19
--- /dev/null
+++ b/specs/api/components/schemas/diag/StructuredDiagRequest.yaml
@@ -0,0 +1,46 @@
+type: object
+description: |
+  Structured data diagnosis request - analyze and understand structured data formats.
+
+  Operations: detect-type, generate-descriptor, diagnose, schema-selection
+required:
+  - operation
+  - sample
+properties:
+  operation:
+    type: string
+    enum:
+      - detect-type
+      - generate-descriptor
+      - diagnose
+      - schema-selection
+    description: |
+      Diagnosis operation:
+      - `detect-type`: Identify data format (CSV, JSON, XML)
+      - `generate-descriptor`: Create schema descriptor for data
+      - `diagnose`: Full analysis (detect + generate descriptor)
+      - `schema-selection`: Find matching schemas for data
+  sample:
+    type: string
+    description: Data sample to analyze (text content)
+    example: |
+      name,age,email
+      Alice,30,alice@example.com
+      Bob,25,bob@example.com
+  type:
+    type: string
+    description: Data type (required for generate-descriptor)
+    enum: [csv, json, xml]
+    example: csv
+  schema-name:
+    type: string
+    description: Target schema name for descriptor generation (optional)
+    example: person-records
+  options:
+    type: object
+    description: Format-specific options (e.g., CSV delimiter)
+    additionalProperties:
+      type: string
+    example:
+      delimiter: ","
+      has_header: "true"
diff --git a/specs/api/components/schemas/diag/StructuredDiagResponse.yaml b/specs/api/components/schemas/diag/StructuredDiagResponse.yaml
new file mode 100644
index 00000000..e41009a4
--- /dev/null
+++ b/specs/api/components/schemas/diag/StructuredDiagResponse.yaml
@@ -0,0 +1,49 @@
+type: object
+description: Structured data diagnosis response
+required:
+  - operation
+properties:
+  operation:
+    type: string
+    description: Operation that was performed
+    example: diagnose
+  detected-type:
+    type: string
+    description: Detected data format (for detect-type/diagnose)
+    enum: [csv, json, xml]
+    example: csv
+  confidence:
+    type: number
+    description: Detection confidence score (0.0-1.0)
+    minimum: 0.0
+    maximum: 1.0
+    example: 0.95
+  descriptor:
+    type: object
+    description: Generated schema descriptor (for generate-descriptor/diagnose)
+    additionalProperties: {}
+    example:
+      schema_name: person-records
+      type: csv
+      fields:
+        - name: name
+          type: string
+        - name: age
+          type: integer
+        - name: email
+          type: string
+  metadata:
+    type: object
+    description: Additional analysis metadata
+    additionalProperties:
+      type: string
+    example:
+      field_count: "3"
+      record_count: "2"
+      has_header: "true"
+  schema-matches:
+    type: array
+    description: Matching schema IDs (for schema-selection)
+    items:
+      type: string
+    example: ["person-schema-v1", "contact-schema-v2"]
diff --git a/specs/api/components/schemas/embeddings-query/DocumentEmbeddingsQueryRequest.yaml b/specs/api/components/schemas/embeddings-query/DocumentEmbeddingsQueryRequest.yaml
new file mode 100644
index 00000000..f2d0aec2
--- /dev/null
+++ b/specs/api/components/schemas/embeddings-query/DocumentEmbeddingsQueryRequest.yaml
@@ -0,0 +1,29 @@
+type: object
+description: |
+  Document embeddings query request - find similar documents by vector similarity.
+required:
+  - vectors
+properties:
+  vectors:
+    type: array
+    description: Query embedding vector
+    items:
+      type: number
+    example: [0.023, -0.142, 0.089, 0.234, -0.067, 0.156]
+  limit:
+    type: integer
+    description: Maximum number of document chunks to return
+    default: 10
+    minimum: 1
+    maximum: 1000
+    example: 20
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection to search
+    default: default
+    example: research
diff --git a/specs/api/components/schemas/embeddings-query/DocumentEmbeddingsQueryResponse.yaml b/specs/api/components/schemas/embeddings-query/DocumentEmbeddingsQueryResponse.yaml
new file mode 100644
index 00000000..6b1d811d
--- /dev/null
+++ b/specs/api/components/schemas/embeddings-query/DocumentEmbeddingsQueryResponse.yaml
@@ -0,0 +1,12 @@
+type: object
+description: Document embeddings query response
+properties:
+  chunks:
+    type: array
+    description: Similar document chunks (text strings)
+    items:
+      type: string
+    example:
+      - "Quantum computing uses quantum mechanics principles for computation..."
+      - "Neural networks are computing systems inspired by biological neurons..."
+      - "Machine learning algorithms learn patterns from data..."
diff --git a/specs/api/components/schemas/embeddings-query/GraphEmbeddingsQueryRequest.yaml b/specs/api/components/schemas/embeddings-query/GraphEmbeddingsQueryRequest.yaml
new file mode 100644
index 00000000..6cf60bbd
--- /dev/null
+++ b/specs/api/components/schemas/embeddings-query/GraphEmbeddingsQueryRequest.yaml
@@ -0,0 +1,29 @@
+type: object
+description: |
+  Graph embeddings query request - find similar entities by vector similarity.
+required:
+  - vectors
+properties:
+  vectors:
+    type: array
+    description: Query embedding vector
+    items:
+      type: number
+    example: [0.023, -0.142, 0.089, 0.234, -0.067, 0.156]
+  limit:
+    type: integer
+    description: Maximum number of entities to return
+    default: 10
+    minimum: 1
+    maximum: 1000
+    example: 20
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection to search
+    default: default
+    example: research
diff --git a/specs/api/components/schemas/embeddings-query/GraphEmbeddingsQueryResponse.yaml b/specs/api/components/schemas/embeddings-query/GraphEmbeddingsQueryResponse.yaml
new file mode 100644
index 00000000..80692a12
--- /dev/null
+++ b/specs/api/components/schemas/embeddings-query/GraphEmbeddingsQueryResponse.yaml
@@ -0,0 +1,12 @@
+type: object
+description: Graph embeddings query response
+properties:
+  entities:
+    type: array
+    description: Similar entities (RDF values)
+    items:
+      $ref: '../../common/RdfValue.yaml'
+    example:
+      - {v: "https://example.com/person/alice", e: true}
+      - {v: "https://example.com/person/bob", e: true}
+      - {v: "https://example.com/concept/quantum", e: true}
diff --git a/specs/api/components/schemas/embeddings/EmbeddingsRequest.yaml b/specs/api/components/schemas/embeddings/EmbeddingsRequest.yaml
new file mode 100644
index 00000000..94369108
--- /dev/null
+++ b/specs/api/components/schemas/embeddings/EmbeddingsRequest.yaml
@@ -0,0 +1,10 @@
+type: object
+description: |
+  Embeddings request - convert text to vector embedding.
+required:
+  - text
+properties:
+  text:
+    type: string
+    description: Text to convert to embedding vector
+    example: Quantum computing uses quantum mechanics principles for computation.
diff --git a/specs/api/components/schemas/embeddings/EmbeddingsResponse.yaml b/specs/api/components/schemas/embeddings/EmbeddingsResponse.yaml
new file mode 100644
index 00000000..8a5c01cd
--- /dev/null
+++ b/specs/api/components/schemas/embeddings/EmbeddingsResponse.yaml
@@ -0,0 +1,11 @@
+type: object
+description: Embeddings response
+required:
+  - vectors
+properties:
+  vectors:
+    type: array
+    description: Embedding vector (array of floats)
+    items:
+      type: number
+    example: [0.023, -0.142, 0.089, 0.234, -0.067, 0.156]
diff --git a/specs/api/components/schemas/errors/ErrorObject.yaml b/specs/api/components/schemas/errors/ErrorObject.yaml
new file mode 100644
index 00000000..3a93a7dc
--- /dev/null
+++ b/specs/api/components/schemas/errors/ErrorObject.yaml
@@ -0,0 +1,14 @@
+type: object
+description: Structured error response with type and message
+properties:
+  type:
+    type: string
+    description: Error type identifier
+    example: gateway-error
+  message:
+    type: string
+    description: Human-readable error message
+    example: Timeout
+required:
+  - type
+  - message
diff --git a/specs/api/components/schemas/flow/FlowRequest.yaml b/specs/api/components/schemas/flow/FlowRequest.yaml
new file mode 100644
index 00000000..8cff7955
--- /dev/null
+++ b/specs/api/components/schemas/flow/FlowRequest.yaml
@@ -0,0 +1,76 @@
+type: object
+description: |
+  Flow service request for managing flow instances and blueprints.
+
+  Operations: start-flow, stop-flow, list-flows, get-flow,
+  list-blueprints, get-blueprint, put-blueprint, delete-blueprint
+required:
+  - operation
+properties:
+  operation:
+    type: string
+    enum:
+      - start-flow
+      - stop-flow
+      - list-flows
+      - get-flow
+      - list-blueprints
+      - get-blueprint
+      - put-blueprint
+      - delete-blueprint
+    description: |
+      Flow operation:
+      - `start-flow`: Start a new flow instance from a blueprint
+      - `stop-flow`: Stop a running flow instance
+      - `list-flows`: List all running flow instances
+      - `get-flow`: Get details of a running flow
+      - `list-blueprints`: List available flow blueprints
+      - `get-blueprint`: Get blueprint definition
+      - `put-blueprint`: Create/update blueprint definition
+      - `delete-blueprint`: Delete blueprint definition
+  flow-id:
+    type: string
+    description: Flow instance ID (required for start-flow, stop-flow, get-flow)
+    example: my-flow
+  blueprint-name:
+    type: string
+    description: Flow blueprint name (required for start-flow, get-blueprint, put-blueprint, delete-blueprint)
+    example: document-rag
+  blueprint-definition:
+    type: object
+    description: Flow blueprint definition (required for put-blueprint)
+    additionalProperties: true
+    example:
+      description: Custom RAG pipeline
+      parameters:
+        model:
+          type: llm-model
+          description: LLM model for processing
+          order: 1
+      class:
+        text-completion:{class}:
+          request: non-persistent://tg/request/text-completion:{class}
+          response: non-persistent://tg/response/text-completion:{class}
+      flow:
+        chunker:{id}:
+          input: persistent://tg/flow/chunk:{id}
+          output: persistent://tg/flow/chunk-load:{id}
+      interfaces:
+        agent:
+          request: non-persistent://tg/request/agent:{id}
+          response: non-persistent://tg/response/agent:{id}
+  description:
+    type: string
+    description: Flow description (optional for start-flow)
+    example: My document processing flow
+  parameters:
+    type: object
+    description: |
+      Flow parameters (for start-flow).
+      All values are stored as strings, regardless of input type.
+    additionalProperties:
+      type: string
+    example:
+      model: gpt-4
+      temperature: "0.7"
+      chunk-size: "1000"
diff --git a/specs/api/components/schemas/flow/FlowResponse.yaml b/specs/api/components/schemas/flow/FlowResponse.yaml
new file mode 100644
index 00000000..c93ae42c
--- /dev/null
+++ b/specs/api/components/schemas/flow/FlowResponse.yaml
@@ -0,0 +1,82 @@
+type: object
+description: Flow service response
+properties:
+  flow-id:
+    type: string
+    description: Flow instance ID (returned by start-flow)
+    example: my-flow
+  flow-ids:
+    type: array
+    description: List of running flow IDs (returned by list-flows)
+    items:
+      type: string
+    example:
+      - default
+      - production
+      - my-flow
+  blueprint-names:
+    type: array
+    description: List of available blueprint names (returned by list-blueprints)
+    items:
+      type: string
+    example:
+      - document-rag
+      - graph-rag
+      - document-rag+graph-rag
+  blueprint-definition:
+    type: object
+    description: Blueprint definition (returned by get-blueprint)
+    additionalProperties: true
+    example:
+      description: Standard RAG pipeline
+      parameters:
+        model:
+          type: llm-model
+          order: 1
+      class:
+        text-completion:{class}:
+          request: non-persistent://tg/request/text-completion:{class}
+          response: non-persistent://tg/response/text-completion:{class}
+      flow:
+        chunker:{id}:
+          input: persistent://tg/flow/chunk:{id}
+          output: persistent://tg/flow/chunk-load:{id}
+      interfaces:
+        agent:
+          request: non-persistent://tg/request/agent:{id}
+          response: non-persistent://tg/response/agent:{id}
+  flow:
+    type: object
+    description: Flow instance details (returned by get-flow)
+    properties:
+      blueprint-name:
+        type: string
+        example: document-rag
+      description:
+        type: string
+        example: My document processing flow
+      parameters:
+        type: object
+        description: Flow parameters (all values are strings)
+        additionalProperties:
+          type: string
+        example:
+          model: gpt-4
+          temperature: "0.7"
+      interfaces:
+        type: object
+        description: Service interfaces with resolved queue names
+        additionalProperties: true
+        example:
+          agent:
+            request: non-persistent://tg/request/agent:my-flow
+            response: non-persistent://tg/response/agent:my-flow
+          text-load: persistent://tg/flow/text-document-load:my-flow
+  description:
+    type: string
+    description: Description
+  parameters:
+    type: object
+    description: Parameters
+    additionalProperties:
+      type: string
diff --git a/specs/api/components/schemas/knowledge/KnowledgeRequest.yaml b/specs/api/components/schemas/knowledge/KnowledgeRequest.yaml
new file mode 100644
index 00000000..5c40e118
--- /dev/null
+++ b/specs/api/components/schemas/knowledge/KnowledgeRequest.yaml
@@ -0,0 +1,128 @@
+type: object
+description: |
+  Knowledge graph core management request.
+
+  Operations: list-kg-cores, get-kg-core, put-kg-core, delete-kg-core,
+  load-kg-core, unload-kg-core
+required:
+  - operation
+properties:
+  operation:
+    type: string
+    enum:
+      - list-kg-cores
+      - get-kg-core
+      - put-kg-core
+      - delete-kg-core
+      - load-kg-core
+      - unload-kg-core
+    description: |
+      Knowledge core operation:
+      - `list-kg-cores`: List knowledge cores for user
+      - `get-kg-core`: Get knowledge core by ID
+      - `put-kg-core`: Store triples and/or embeddings
+      - `delete-kg-core`: Delete knowledge core by ID
+      - `load-kg-core`: Load knowledge core into flow
+      - `unload-kg-core`: Unload knowledge core from flow
+  user:
+    type: string
+    description: User identifier (for list-kg-cores, put-kg-core, delete-kg-core)
+    default: trustgraph
+    example: alice
+  id:
+    type: string
+    description: Knowledge core ID (for get, put, delete, load, unload)
+    example: core-123
+  flow:
+    type: string
+    description: Flow ID (for load-kg-core)
+    example: my-flow
+  collection:
+    type: string
+    description: Collection identifier (for load-kg-core)
+    default: default
+    example: default
+  triples:
+    type: object
+    description: Triples to store (for put-kg-core)
+    required:
+      - metadata
+      - triples
+    properties:
+      metadata:
+        type: object
+        required:
+          - id
+          - user
+          - collection
+        properties:
+          id:
+            type: string
+            description: Knowledge core ID
+            example: core-123
+          user:
+            type: string
+            description: User identifier
+            example: alice
+          collection:
+            type: string
+            description: Collection identifier
+            example: default
+          metadata:
+            type: array
+            description: Metadata triples
+            items:
+              $ref: '../../common/Triple.yaml'
+      triples:
+        type: array
+        description: Knowledge triples
+        items:
+          $ref: '../../common/Triple.yaml'
+  graph-embeddings:
+    type: object
+    description: Graph embeddings to store (for put-kg-core)
+    required:
+      - metadata
+      - entities
+    properties:
+      metadata:
+        type: object
+        required:
+          - id
+          - user
+          - collection
+        properties:
+          id:
+            type: string
+            description: Knowledge core ID
+            example: core-123
+          user:
+            type: string
+            description: User identifier
+            example: alice
+          collection:
+            type: string
+            description: Collection identifier
+            example: default
+          metadata:
+            type: array
+            description: Metadata triples
+            items:
+              $ref: '../../common/Triple.yaml'
+      entities:
+        type: array
+        description: Entity embeddings
+        items:
+          type: object
+          required:
+            - entity
+            - vectors
+          properties:
+            entity:
+              $ref: '../../common/RdfValue.yaml'
+            vectors:
+              type: array
+              description: Embedding vectors
+              items:
+                type: number
+              example: [0.1, 0.2, 0.3]
diff --git a/specs/api/components/schemas/knowledge/KnowledgeResponse.yaml b/specs/api/components/schemas/knowledge/KnowledgeResponse.yaml
new file mode 100644
index 00000000..229233ca
--- /dev/null
+++ b/specs/api/components/schemas/knowledge/KnowledgeResponse.yaml
@@ -0,0 +1,91 @@
+type: object
+description: Knowledge service response
+properties:
+  ids:
+    type: array
+    description: List of knowledge core IDs (returned by list-kg-cores)
+    items:
+      type: string
+    example: ["core-123", "core-456"]
+  triples:
+    type: object
+    description: Triples data (returned by get-kg-core, streamed)
+    properties:
+      metadata:
+        type: object
+        required:
+          - id
+          - user
+          - collection
+        properties:
+          id:
+            type: string
+            description: Knowledge core ID
+            example: core-123
+          user:
+            type: string
+            description: User identifier
+            example: alice
+          collection:
+            type: string
+            description: Collection identifier
+            example: default
+          metadata:
+            type: array
+            description: Metadata triples
+            items:
+              $ref: '../../common/Triple.yaml'
+      triples:
+        type: array
+        description: Knowledge triples
+        items:
+          $ref: '../../common/Triple.yaml'
+  graph-embeddings:
+    type: object
+    description: Graph embeddings data (returned by get-kg-core, streamed)
+    properties:
+      metadata:
+        type: object
+        required:
+          - id
+          - user
+          - collection
+        properties:
+          id:
+            type: string
+            description: Knowledge core ID
+            example: core-123
+          user:
+            type: string
+            description: User identifier
+            example: alice
+          collection:
+            type: string
+            description: Collection identifier
+            example: default
+          metadata:
+            type: array
+            description: Metadata triples
+            items:
+              $ref: '../../common/Triple.yaml'
+      entities:
+        type: array
+        description: Entity embeddings
+        items:
+          type: object
+          required:
+            - entity
+            - vectors
+          properties:
+            entity:
+              $ref: '../../common/RdfValue.yaml'
+            vectors:
+              type: array
+              description: Embedding vectors
+              items:
+                type: number
+              example: [0.1, 0.2, 0.3]
+  eos:
+    type: boolean
+    description: End of stream marker (for streaming responses)
+    example: true
diff --git a/specs/api/components/schemas/librarian/LibrarianRequest.yaml b/specs/api/components/schemas/librarian/LibrarianRequest.yaml
new file mode 100644
index 00000000..18aa94b1
--- /dev/null
+++ b/specs/api/components/schemas/librarian/LibrarianRequest.yaml
@@ -0,0 +1,79 @@
+type: object
+description: |
+  Librarian service request for document library management.
+
+  Operations: add-document, remove-document, list-documents,
+  start-processing, stop-processing, list-processing
+required:
+  - operation
+properties:
+  operation:
+    type: string
+    enum:
+      - add-document
+      - remove-document
+      - list-documents
+      - start-processing
+      - stop-processing
+      - list-processing
+    description: |
+      Library operation:
+      - `add-document`: Add document to library
+      - `remove-document`: Remove document from library
+      - `list-documents`: List documents in library
+      - `start-processing`: Start processing library documents
+      - `stop-processing`: Stop library processing
+      - `list-processing`: List processing status
+  flow:
+    type: string
+    description: Flow ID
+    example: my-flow
+  collection:
+    type: string
+    description: Collection identifier
+    default: default
+    example: default
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  document-id:
+    type: string
+    description: Document identifier
+    example: doc-123
+  processing-id:
+    type: string
+    description: Processing task identifier
+    example: proc-456
+  document-metadata:
+    $ref: '../common/DocumentMetadata.yaml'
+  processing-metadata:
+    $ref: '../common/ProcessingMetadata.yaml'
+  content:
+    type: string
+    description: Document content (for add-document with inline content)
+    example: This is the document content...
+  criteria:
+    type: array
+    description: Search criteria for filtering documents
+    items:
+      type: object
+      required:
+        - key
+        - value
+        - operator
+      properties:
+        key:
+          type: string
+          description: Metadata field name
+          example: author
+        value:
+          type: string
+          description: Value to match
+          example: John Doe
+        operator:
+          type: string
+          enum: [eq, ne, gt, lt, contains]
+          description: Comparison operator
+          example: eq
diff --git a/specs/api/components/schemas/librarian/LibrarianResponse.yaml b/specs/api/components/schemas/librarian/LibrarianResponse.yaml
new file mode 100644
index 00000000..caa84628
--- /dev/null
+++ b/specs/api/components/schemas/librarian/LibrarianResponse.yaml
@@ -0,0 +1,18 @@
+type: object
+description: Librarian service response
+properties:
+  document-metadata:
+    $ref: '../common/DocumentMetadata.yaml'
+  content:
+    type: string
+    description: Document content
+  document-metadatas:
+    type: array
+    description: List of documents (returned by list-documents)
+    items:
+      $ref: '../common/DocumentMetadata.yaml'
+  processing-metadatas:
+    type: array
+    description: List of processing tasks (returned by list-processing)
+    items:
+      $ref: '../common/ProcessingMetadata.yaml'
diff --git a/specs/api/components/schemas/loading/DocumentLoadRequest.yaml b/specs/api/components/schemas/loading/DocumentLoadRequest.yaml
new file mode 100644
index 00000000..45bbe428
--- /dev/null
+++ b/specs/api/components/schemas/loading/DocumentLoadRequest.yaml
@@ -0,0 +1,32 @@
+type: object
+description: |
+  Document load request - load binary document (PDF, etc.) into processing pipeline.
+
+  Fire-and-forget operation (no response).
+required:
+  - data
+properties:
+  data:
+    type: string
+    description: Document data (base64 encoded)
+    format: byte
+    example: JVBERi0xLjQKJeLjz9MKMSAwIG9iago8PC9UeXBlL...
+  id:
+    type: string
+    description: Document identifier
+    example: doc-456
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection for document
+    default: default
+    example: research
+  metadata:
+    type: array
+    description: Document metadata as RDF triples
+    items:
+      $ref: '../../common/Triple.yaml'
diff --git a/specs/api/components/schemas/loading/TextLoadRequest.yaml b/specs/api/components/schemas/loading/TextLoadRequest.yaml
new file mode 100644
index 00000000..4ded87d5
--- /dev/null
+++ b/specs/api/components/schemas/loading/TextLoadRequest.yaml
@@ -0,0 +1,37 @@
+type: object
+description: |
+  Text load request - load text document into processing pipeline.
+
+  Fire-and-forget operation (no response).
+required:
+  - text
+properties:
+  text:
+    type: string
+    description: Text content (base64 encoded)
+    format: byte
+    example: VGhpcyBpcyB0aGUgZG9jdW1lbnQgdGV4dC4uLg==
+  id:
+    type: string
+    description: Document identifier
+    example: doc-123
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection for document
+    default: default
+    example: research
+  charset:
+    type: string
+    description: Text character encoding
+    default: utf-8
+    example: utf-8
+  metadata:
+    type: array
+    description: Document metadata as RDF triples
+    items:
+      $ref: '../../common/Triple.yaml'
diff --git a/specs/api/components/schemas/mcp-tool/McpToolRequest.yaml b/specs/api/components/schemas/mcp-tool/McpToolRequest.yaml
new file mode 100644
index 00000000..b9c6ee4a
--- /dev/null
+++ b/specs/api/components/schemas/mcp-tool/McpToolRequest.yaml
@@ -0,0 +1,17 @@
+type: object
+description: |
+  MCP tool request - execute Model Context Protocol tool.
+required:
+  - name
+properties:
+  name:
+    type: string
+    description: Tool name to execute
+    example: search
+  parameters:
+    type: object
+    description: Tool parameters (JSON object, auto-converted to string internally)
+    additionalProperties: {}
+    example:
+      query: quantum computing
+      limit: 10
diff --git a/specs/api/components/schemas/mcp-tool/McpToolResponse.yaml b/specs/api/components/schemas/mcp-tool/McpToolResponse.yaml
new file mode 100644
index 00000000..2c1b4974
--- /dev/null
+++ b/specs/api/components/schemas/mcp-tool/McpToolResponse.yaml
@@ -0,0 +1,15 @@
+type: object
+description: MCP tool response
+properties:
+  text:
+    type: string
+    description: Text response from tool
+    example: Found 10 results for quantum computing...
+  object:
+    type: object
+    description: Structured response from tool (JSON object)
+    additionalProperties: {}
+    example:
+      results:
+        - title: Introduction to Quantum Computing
+          url: https://example.com/qc-intro
diff --git a/specs/api/components/schemas/prompt/PromptRequest.yaml b/specs/api/components/schemas/prompt/PromptRequest.yaml
new file mode 100644
index 00000000..7b181016
--- /dev/null
+++ b/specs/api/components/schemas/prompt/PromptRequest.yaml
@@ -0,0 +1,32 @@
+type: object
+description: |
+  Prompt service request - template-based text generation.
+
+  Execute a stored prompt template with variable substitution.
+required:
+  - id
+properties:
+  id:
+    type: string
+    description: Prompt template ID (stored in config)
+    example: summarize-document
+  terms:
+    type: object
+    description: Template variables as key-value pairs (values are JSON strings)
+    additionalProperties:
+      type: string
+    example:
+      document: '"This is the document text to summarize..."'
+      max_length: '"200"'
+  variables:
+    type: object
+    description: Alternative to terms - variables as native JSON values (auto-converted)
+    additionalProperties: {}
+    example:
+      document: This is the document text to summarize...
+      max_length: 200
+  streaming:
+    type: boolean
+    description: Enable streaming response delivery
+    default: false
+    example: true
diff --git a/specs/api/components/schemas/prompt/PromptResponse.yaml b/specs/api/components/schemas/prompt/PromptResponse.yaml
new file mode 100644
index 00000000..fbe5559b
--- /dev/null
+++ b/specs/api/components/schemas/prompt/PromptResponse.yaml
@@ -0,0 +1,16 @@
+type: object
+description: Prompt service response
+properties:
+  text:
+    type: string
+    description: Generated text response
+    example: This document discusses quantum computing and its applications...
+  object:
+    type: string
+    description: Structured response (JSON string) if prompt produces objects
+    example: '{"summary": "Quantum computing overview", "key_points": [...]}'
+  end-of-stream:
+    type: boolean
+    description: Indicates streaming is complete (streaming mode)
+    default: false
+    example: true
diff --git a/specs/api/components/schemas/query/NlpQueryRequest.yaml b/specs/api/components/schemas/query/NlpQueryRequest.yaml
new file mode 100644
index 00000000..2ef72e61
--- /dev/null
+++ b/specs/api/components/schemas/query/NlpQueryRequest.yaml
@@ -0,0 +1,17 @@
+type: object
+description: |
+  NLP query request - convert natural language question to structured query.
+required:
+  - question
+properties:
+  question:
+    type: string
+    description: Natural language question
+    example: Who does Alice know that works in engineering?
+  max-results:
+    type: integer
+    description: Maximum results to return when query is executed
+    default: 100
+    minimum: 1
+    maximum: 10000
+    example: 50
diff --git a/specs/api/components/schemas/query/NlpQueryResponse.yaml b/specs/api/components/schemas/query/NlpQueryResponse.yaml
new file mode 100644
index 00000000..91795c9b
--- /dev/null
+++ b/specs/api/components/schemas/query/NlpQueryResponse.yaml
@@ -0,0 +1,47 @@
+type: object
+description: NLP query response
+required:
+  - graphql-query
+  - variables
+properties:
+  graphql-query:
+    type: string
+    description: Generated GraphQL query
+    example: |
+      query GetConnections($person: ID!) {
+        person(id: $person) {
+          knows {
+            name
+            worksFor { department }
+          }
+        }
+      }
+  variables:
+    type: object
+    description: Query variables
+    additionalProperties:
+      type: string
+    example:
+      person: "https://example.com/person/alice"
+  detected-schemas:
+    type: array
+    description: Detected schema types used in query
+    items:
+      type: string
+    example: ["Person", "Organization"]
+  confidence:
+    type: number
+    description: Confidence score for query generation (0.0-1.0)
+    minimum: 0.0
+    maximum: 1.0
+    example: 0.87
+  error:
+    type: object
+    description: Error if query generation failed
+    properties:
+      type:
+        type: string
+        example: PARSE_ERROR
+      message:
+        type: string
+        example: Could not understand question structure
diff --git a/specs/api/components/schemas/query/ObjectsQueryRequest.yaml b/specs/api/components/schemas/query/ObjectsQueryRequest.yaml
new file mode 100644
index 00000000..775bbc4b
--- /dev/null
+++ b/specs/api/components/schemas/query/ObjectsQueryRequest.yaml
@@ -0,0 +1,40 @@
+type: object
+description: |
+  Objects query request - GraphQL query over knowledge graph.
+required:
+  - query
+properties:
+  query:
+    type: string
+    description: GraphQL query string
+    example: |
+      query GetPerson($id: ID!) {
+        person(id: $id) {
+          name
+          email
+          knows {
+            name
+          }
+        }
+      }
+  variables:
+    type: object
+    description: GraphQL query variables
+    additionalProperties:
+      type: string
+    example:
+      id: "https://example.com/person/alice"
+  operation-name:
+    type: string
+    description: Operation name (for multi-operation documents)
+    example: GetPerson
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection to query
+    default: default
+    example: research
diff --git a/specs/api/components/schemas/query/ObjectsQueryResponse.yaml b/specs/api/components/schemas/query/ObjectsQueryResponse.yaml
new file mode 100644
index 00000000..8fd9b6a6
--- /dev/null
+++ b/specs/api/components/schemas/query/ObjectsQueryResponse.yaml
@@ -0,0 +1,54 @@
+type: object
+description: Objects query response (GraphQL format)
+properties:
+  data:
+    description: GraphQL response data (JSON object or null)
+    oneOf:
+      - type: object
+        additionalProperties: {}
+      - type: "null"
+    example:
+      person:
+        name: Alice
+        email: alice@example.com
+        knows:
+          - name: Bob
+          - name: Carol
+  errors:
+    type: array
+    description: GraphQL field-level errors
+    items:
+      type: object
+      properties:
+        message:
+          type: string
+          description: Error message
+          example: Cannot query field 'age' on type 'Person'
+        path:
+          type: array
+          description: Path to error location
+          items:
+            type: string
+          example: ["person", "age"]
+        extensions:
+          type: object
+          description: Additional error metadata
+          additionalProperties:
+            type: string
+  extensions:
+    type: object
+    description: Query metadata (execution time, etc.)
+    additionalProperties:
+      type: string
+    example:
+      execution_time_ms: "42"
+  error:
+    type: object
+    description: System-level error (connection, timeout, etc.)
+    properties:
+      type:
+        type: string
+        example: TIMEOUT_ERROR
+      message:
+        type: string
+        example: Query execution timeout
diff --git a/specs/api/components/schemas/query/StructuredQueryRequest.yaml b/specs/api/components/schemas/query/StructuredQueryRequest.yaml
new file mode 100644
index 00000000..ae564c0a
--- /dev/null
+++ b/specs/api/components/schemas/query/StructuredQueryRequest.yaml
@@ -0,0 +1,22 @@
+type: object
+description: |
+  Structured query request - natural language question with automatic execution.
+
+  Combines NLP query generation and execution in one call.
+required:
+  - question
+properties:
+  question:
+    type: string
+    description: Natural language question
+    example: Who does Alice know that works in engineering?
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection to query
+    default: default
+    example: research
diff --git a/specs/api/components/schemas/query/StructuredQueryResponse.yaml b/specs/api/components/schemas/query/StructuredQueryResponse.yaml
new file mode 100644
index 00000000..4ce73685
--- /dev/null
+++ b/specs/api/components/schemas/query/StructuredQueryResponse.yaml
@@ -0,0 +1,34 @@
+type: object
+description: Structured query response
+properties:
+  data:
+    description: Query results (JSON object or null)
+    oneOf:
+      - type: object
+        additionalProperties: {}
+      - type: "null"
+    example:
+      person:
+        name: Alice
+        knows:
+          - name: Bob
+            worksFor: {name: Acme Corp, department: Engineering}
+          - name: Carol
+            worksFor: {name: Tech Inc, department: Engineering}
+  errors:
+    type: array
+    description: Query errors (array of error strings)
+    items:
+      type: string
+    example:
+      - Could not resolve field 'age' on type 'Person'
+  error:
+    type: object
+    description: System-level error
+    properties:
+      type:
+        type: string
+        example: QUERY_GENERATION_ERROR
+      message:
+        type: string
+        example: Failed to generate query from question
diff --git a/specs/api/components/schemas/query/TriplesQueryRequest.yaml b/specs/api/components/schemas/query/TriplesQueryRequest.yaml
new file mode 100644
index 00000000..88b0a1eb
--- /dev/null
+++ b/specs/api/components/schemas/query/TriplesQueryRequest.yaml
@@ -0,0 +1,30 @@
+type: object
+description: |
+  Triples query request - query knowledge graph by subject/predicate/object pattern.
+properties:
+  s:
+    $ref: '../../common/RdfValue.yaml'
+    description: Subject filter (optional)
+  p:
+    $ref: '../../common/RdfValue.yaml'
+    description: Predicate filter (optional)
+  o:
+    $ref: '../../common/RdfValue.yaml'
+    description: Object filter (optional)
+  limit:
+    type: integer
+    description: Maximum number of triples to return
+    default: 10000
+    minimum: 1
+    maximum: 100000
+    example: 100
+  user:
+    type: string
+    description: User identifier
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection to query
+    default: default
+    example: research
diff --git a/specs/api/components/schemas/query/TriplesQueryResponse.yaml b/specs/api/components/schemas/query/TriplesQueryResponse.yaml
new file mode 100644
index 00000000..3d804c41
--- /dev/null
+++ b/specs/api/components/schemas/query/TriplesQueryResponse.yaml
@@ -0,0 +1,10 @@
+type: object
+description: Triples query response
+required:
+  - response
+properties:
+  response:
+    type: array
+    description: Matching triples
+    items:
+      $ref: '../../common/Triple.yaml'
diff --git a/specs/api/components/schemas/rag/DocumentRagRequest.yaml b/specs/api/components/schemas/rag/DocumentRagRequest.yaml
new file mode 100644
index 00000000..97a9d2ff
--- /dev/null
+++ b/specs/api/components/schemas/rag/DocumentRagRequest.yaml
@@ -0,0 +1,33 @@
+type: object
+description: |
+  Document RAG (Retrieval-Augmented Generation) query request.
+  Searches document embeddings and generates answer using retrieved context.
+required:
+  - query
+properties:
+  query:
+    type: string
+    description: User query or question
+    example: What are the key findings in the research papers?
+  user:
+    type: string
+    description: User identifier for multi-tenancy
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection to search within
+    default: default
+    example: research
+  doc-limit:
+    type: integer
+    description: Maximum number of documents to retrieve
+    default: 20
+    minimum: 1
+    maximum: 100
+    example: 10
+  streaming:
+    type: boolean
+    description: Enable streaming response delivery
+    default: false
+    example: true
diff --git a/specs/api/components/schemas/rag/DocumentRagResponse.yaml b/specs/api/components/schemas/rag/DocumentRagResponse.yaml
new file mode 100644
index 00000000..6a0166e7
--- /dev/null
+++ b/specs/api/components/schemas/rag/DocumentRagResponse.yaml
@@ -0,0 +1,24 @@
+type: object
+description: Document RAG response
+properties:
+  response:
+    type: string
+    description: Generated response based on retrieved documents
+    example: The research papers found three key findings...
+  end-of-stream:
+    type: boolean
+    description: Indicates streaming is complete (streaming mode)
+    default: false
+    example: true
+  error:
+    type: object
+    description: Error details if request failed
+    properties:
+      message:
+        type: string
+        description: Error message
+        example: Failed to retrieve documents
+      type:
+        type: string
+        description: Error type
+        example: RETRIEVAL_ERROR
diff --git a/specs/api/components/schemas/rag/GraphRagRequest.yaml b/specs/api/components/schemas/rag/GraphRagRequest.yaml
new file mode 100644
index 00000000..733dd7c1
--- /dev/null
+++ b/specs/api/components/schemas/rag/GraphRagRequest.yaml
@@ -0,0 +1,54 @@
+type: object
+description: |
+  Graph RAG (Retrieval-Augmented Generation) query request.
+  Searches knowledge graph and generates answer using retrieved subgraph.
+required:
+  - query
+properties:
+  query:
+    type: string
+    description: User query or question
+    example: What connections exist between quantum physics and computer science?
+  user:
+    type: string
+    description: User identifier for multi-tenancy
+    default: trustgraph
+    example: alice
+  collection:
+    type: string
+    description: Collection to search within
+    default: default
+    example: research
+  entity-limit:
+    type: integer
+    description: Maximum number of entities to retrieve
+    default: 50
+    minimum: 1
+    maximum: 200
+    example: 30
+  triple-limit:
+    type: integer
+    description: Maximum number of triples to retrieve per entity
+    default: 30
+    minimum: 1
+    maximum: 100
+    example: 20
+  max-subgraph-size:
+    type: integer
+    description: Maximum total subgraph size (triples)
+    default: 1000
+    minimum: 10
+    maximum: 5000
+    example: 500
+  max-path-length:
+    type: integer
+    description: Maximum path length for graph traversal
+    default: 2
+    minimum: 1
+    maximum: 5
+    example: 3
+  streaming:
+    type: boolean
+    description: Enable streaming response delivery
+    default: false
+    example: true
diff --git a/specs/api/components/schemas/rag/GraphRagResponse.yaml b/specs/api/components/schemas/rag/GraphRagResponse.yaml
new file mode 100644
index 00000000..75f4f059
--- /dev/null
+++ b/specs/api/components/schemas/rag/GraphRagResponse.yaml
@@ -0,0 +1,24 @@
+type: object
+description: Graph RAG response
+properties:
+  response:
+    type: string
+    description: Generated response based on retrieved knowledge graph
+    example: Quantum physics and computer science intersect in quantum computing...
+  end-of-stream:
+    type: boolean
+    description: Indicates streaming is complete (streaming mode)
+    default: false
+    example: true
+  error:
+    type: object
+    description: Error details if request failed
+    properties:
+      message:
+        type: string
+        description: Error message
+        example: Failed to retrieve graph data
+      type:
+        type: string
+        description: Error type
+        example: GRAPH_ERROR
diff --git a/specs/api/components/schemas/text-completion/TextCompletionRequest.yaml b/specs/api/components/schemas/text-completion/TextCompletionRequest.yaml
new file mode 100644
index 00000000..95c5a30d
--- /dev/null
+++ b/specs/api/components/schemas/text-completion/TextCompletionRequest.yaml
@@ -0,0 +1,20 @@
+type: object
+description: |
+  Text completion request - direct LLM completion without RAG.
+required:
+  - system
+  - prompt
+properties:
+  system:
+    type: string
+    description: System prompt that sets behavior and context for the LLM
+    example: You are a helpful assistant that provides concise answers.
+  prompt:
+    type: string
+    description: User prompt or question
+    example: Explain the concept of recursion in programming.
+  streaming:
+    type: boolean
+    description: Enable streaming response delivery
+    default: false
+    example: true
diff --git a/specs/api/components/schemas/text-completion/TextCompletionResponse.yaml b/specs/api/components/schemas/text-completion/TextCompletionResponse.yaml
new file mode 100644
index 00000000..b97573c7
--- /dev/null
+++ b/specs/api/components/schemas/text-completion/TextCompletionResponse.yaml
@@ -0,0 +1,26 @@
+type: object
+description: Text completion response
+required:
+  - response
+properties:
+  response:
+    type: string
+    description: Generated text response
+    example: Recursion is a programming technique where a function calls itself...
+ in-token: + type: integer + description: Number of input tokens consumed + example: 45 + out-token: + type: integer + description: Number of output tokens generated + example: 128 + model: + type: string + description: Model used for completion + example: gpt-4 + end-of-stream: + type: boolean + description: Indicates streaming is complete (streaming mode) + default: false + example: true diff --git a/specs/api/openapi.yaml b/specs/api/openapi.yaml new file mode 100644 index 00000000..b3258d14 --- /dev/null +++ b/specs/api/openapi.yaml @@ -0,0 +1,160 @@ +openapi: 3.1.0 + +info: + title: TrustGraph API Gateway + version: 1.8.0 + description: | + REST API for TrustGraph - an AI-powered knowledge graph and RAG system. + + ## Overview + + The API provides access to: + - **Global Services**: Configuration, flow management, knowledge storage, library management + - **Flow-Hosted Services**: AI services like RAG, text completion, embeddings (require running flow) + - **Import/Export**: Bulk data operations for triples, embeddings, entity contexts + - **WebSocket**: Multiplexed interface for all services + + ## Service Types + + ### Global Services + Fixed endpoints accessible via `/api/v1/{kind}`: + - `config` - Configuration management + - `flow` - Flow lifecycle and blueprints + - `librarian` - Document library management + - `knowledge` - Knowledge graph core management + - `collection-management` - Collection metadata + + ### Flow-Hosted Services + Require running flow instance, accessed via `/api/v1/flow/{flow}/service/{kind}`: + - AI services: agent, text-completion, prompt, RAG (document/graph) + - Embeddings: embeddings, graph-embeddings, document-embeddings + - Query: triples, objects, nlp-query, structured-query + - Data loading: text-load, document-load + - Utilities: mcp-tool, structured-diag + + ## Authentication + + Bearer token authentication when `GATEWAY_SECRET` environment variable is set. 
+ Include the token in the Authorization header: + ``` + Authorization: Bearer <token> + ``` + + If `GATEWAY_SECRET` is not set, the API runs without authentication (development mode). + + ## Field Naming + + All JSON fields use **kebab-case**: `flow-id`, `blueprint-name`, `doc-limit`, etc. + + ## Error Responses + + All endpoints may return errors in this format: + ```json + { + "error": { + "type": "gateway-error", + "message": "Timeout" + } + } + ``` + + contact: + name: TrustGraph Project + url: https://trustgraph.ai + license: + name: Apache 2.0 + url: https://www.apache.org/licenses/LICENSE-2.0.html + +servers: + - url: http://localhost:8088 + description: Local development server + +security: + - bearerAuth: [] + +tags: + - name: Config + description: Configuration management (global service) + - name: Flow + description: Flow lifecycle and blueprint management (global service) + - name: Librarian + description: Document library management (global service) + - name: Knowledge + description: Knowledge graph core management (global service) + - name: Collection + description: Collection metadata management (global service) + - name: Flow Services + description: Services hosted within flow instances + - name: Import/Export + description: Bulk data import and export + - name: WebSocket + description: WebSocket interfaces + - name: Metrics + description: System metrics and monitoring + +paths: + /api/v1/config: + $ref: './paths/config.yaml' + /api/v1/flow: + $ref: './paths/flow.yaml' + /api/v1/librarian: + $ref: './paths/librarian.yaml' + /api/v1/knowledge: + $ref: './paths/knowledge.yaml' + /api/v1/collection-management: + $ref: './paths/collection-management.yaml' + + # Flow-hosted services (require running flow instance) + /api/v1/flow/{flow}/service/agent: + $ref: './paths/flow/agent.yaml' + /api/v1/flow/{flow}/service/document-rag: + $ref: './paths/flow/document-rag.yaml' + /api/v1/flow/{flow}/service/graph-rag: + $ref: './paths/flow/graph-rag.yaml' + 
/api/v1/flow/{flow}/service/text-completion: + $ref: './paths/flow/text-completion.yaml' + /api/v1/flow/{flow}/service/prompt: + $ref: './paths/flow/prompt.yaml' + /api/v1/flow/{flow}/service/embeddings: + $ref: './paths/flow/embeddings.yaml' + /api/v1/flow/{flow}/service/mcp-tool: + $ref: './paths/flow/mcp-tool.yaml' + /api/v1/flow/{flow}/service/triples: + $ref: './paths/flow/triples.yaml' + /api/v1/flow/{flow}/service/objects: + $ref: './paths/flow/objects.yaml' + /api/v1/flow/{flow}/service/nlp-query: + $ref: './paths/flow/nlp-query.yaml' + /api/v1/flow/{flow}/service/structured-query: + $ref: './paths/flow/structured-query.yaml' + /api/v1/flow/{flow}/service/structured-diag: + $ref: './paths/flow/structured-diag.yaml' + /api/v1/flow/{flow}/service/graph-embeddings: + $ref: './paths/flow/graph-embeddings.yaml' + /api/v1/flow/{flow}/service/document-embeddings: + $ref: './paths/flow/document-embeddings.yaml' + /api/v1/flow/{flow}/service/text-load: + $ref: './paths/flow/text-load.yaml' + /api/v1/flow/{flow}/service/document-load: + $ref: './paths/flow/document-load.yaml' + + # Import/Export endpoints + /api/v1/import-core: + $ref: './paths/import-core.yaml' + /api/v1/export-core: + $ref: './paths/export-core.yaml' + + # WebSocket endpoints + /api/v1/socket: + $ref: './paths/websocket.yaml' + + # Metrics endpoint + /api/metrics: + $ref: './paths/metrics.yaml' + /api/metrics/{path}: + $ref: './paths/metrics-path.yaml' + +components: + securitySchemes: + bearerAuth: + $ref: './security/bearerAuth.yaml' diff --git a/specs/api/paths/collection-management.yaml b/specs/api/paths/collection-management.yaml new file mode 100644 index 00000000..7dffd4e0 --- /dev/null +++ b/specs/api/paths/collection-management.yaml @@ -0,0 +1,108 @@ +post: + tags: + - Collection + summary: Collection metadata management + description: | + Manage collection metadata for organizing documents and knowledge. 
+ + ## Collections + + Collections are organizational units for grouping: + - Documents in the librarian + - Knowledge cores + - User data + + Each collection has: + - **user**: Owner identifier + - **collection**: Unique collection ID + - **name**: Human-readable display name + - **description**: Purpose and contents + - **tags**: Labels for filtering and organization + + ## Operations + + ### list-collections + List all collections for a user. Optionally filter by tags and limit results. + Returns array of collection metadata. + + ### update-collection + Create or update collection metadata. If collection doesn't exist, it's created. + If it exists, metadata is updated. Allows setting name, description, and tags. + + ### delete-collection + Delete a collection by user and collection ID. This removes the metadata but + typically does not delete the associated data (documents, knowledge cores). + + operationId: collectionManagementService + security: + - bearerAuth: [] + requestBody: + required: true + content: + application/json: + schema: + $ref: '../components/schemas/collection/CollectionRequest.yaml' + examples: + listCollections: + summary: List all collections for user + value: + operation: list-collections + user: alice + listCollectionsFiltered: + summary: List collections filtered by tags + value: + operation: list-collections + user: alice + tag-filter: ["research", "AI"] + limit: 50 + updateCollection: + summary: Create/update collection + value: + operation: update-collection + user: alice + collection: research + name: Research Papers + description: Academic research papers on AI and ML + tags: ["research", "AI", "academic"] + timestamp: "2024-01-15T10:30:00Z" + deleteCollection: + summary: Delete collection + value: + operation: delete-collection + user: alice + collection: research + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../components/schemas/collection/CollectionResponse.yaml' + 
examples: + listCollections: + summary: List of collections + value: + timestamp: "2024-01-15T10:30:00Z" + collections: + - user: alice + collection: research + name: Research Papers + description: Academic research papers on AI and ML + tags: ["research", "AI", "academic"] + - user: alice + collection: personal + name: Personal Documents + description: Personal notes and documents + tags: ["personal"] + updateSuccess: + summary: Update successful + value: + timestamp: "2024-01-15T10:30:00Z" + deleteSuccess: + summary: Delete successful + value: + timestamp: "2024-01-15T10:30:00Z" + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/config.yaml b/specs/api/paths/config.yaml new file mode 100644 index 00000000..ef95498b --- /dev/null +++ b/specs/api/paths/config.yaml @@ -0,0 +1,165 @@ +post: + tags: + - Config + summary: Configuration service + description: | + Manage TrustGraph configuration including flows, prompts, token costs, parameter types, and more. + + ## Operations + + ### config + Get the complete system configuration including all flows, prompts, token costs, etc. + + ### list + List all configuration items of a specific type (e.g., all flows, all prompts). + + ### get + Retrieve specific configuration items by type and key. + + ### put + Create or update configuration values. + + ### delete + Delete configuration items. + + ## Configuration Types + + - `flow` - Flow instance definitions + - `flow-blueprint` - Flow blueprint definitions (stored separately from flow instances) + - `prompt` - Prompt templates + - `token-cost` - Model token pricing + - `parameter-type` - Parameter type definitions + - `interface-description` - Interface descriptions + - Custom types as needed + + ## Important Distinction + + The **config service** manages *stored configuration*. + The **flow service** (`/api/v1/flow`) manages *running flow instances*. 
+ + - Use config service to store/retrieve flow definitions + - Use flow service to start/stop/manage running flows + + operationId: configService + security: + - bearerAuth: [] + requestBody: + required: true + content: + application/json: + schema: + $ref: '../components/schemas/config/ConfigRequest.yaml' + examples: + getCompleteConfig: + summary: Get complete configuration + value: + operation: config + listFlows: + summary: List all stored flow definitions + value: + operation: list + type: flow + listPrompts: + summary: List all prompts + value: + operation: list + type: prompt + getFlow: + summary: Get specific flow definition + value: + operation: get + keys: + - type: flow + key: default + putFlow: + summary: Create/update flow definition + value: + operation: put + values: + - type: flow + key: my-flow + value: + blueprint-name: document-rag + description: My RAG flow + parameters: + model: gpt-4 + putPrompt: + summary: Set system prompt + value: + operation: put + values: + - type: prompt + key: system + value: You are a helpful AI assistant specialized in data analysis + putTokenCost: + summary: Set token costs for a model + value: + operation: put + values: + - type: token-cost + key: gpt-4 + value: + prompt: 0.03 + completion: 0.06 + deleteFlow: + summary: Delete flow definition + value: + operation: delete + keys: + - type: flow + key: my-flow + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../components/schemas/config/ConfigResponse.yaml' + examples: + completeConfig: + summary: Complete configuration + value: + version: 42 + config: + flow: + default: + blueprint-name: document-rag+graph-rag + description: Default flow + interfaces: + agent: + request: non-persistent://tg/request/agent:default + response: non-persistent://tg/response/agent:default + prompt: + system: You are a helpful AI assistant + token-cost: + gpt-4: + prompt: 0.03 + completion: 0.06 + listFlows: + summary: List of flow 
definition keys + value: + directory: + - default + - production + - my-flow + getFlow: + summary: Retrieved flow definition + value: + values: + - type: flow + key: default + value: + blueprint-name: document-rag+graph-rag + description: Default flow + putSuccess: + summary: Put operation success + value: + version: 43 + deleteSuccess: + summary: Delete operation success + value: + version: 44 + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/export-core.yaml b/specs/api/paths/export-core.yaml new file mode 100644 index 00000000..e7dc06b0 --- /dev/null +++ b/specs/api/paths/export-core.yaml @@ -0,0 +1,108 @@ +get: + tags: + - Import/Export + summary: Export Core - bulk export triples and embeddings + description: | + Export knowledge cores in bulk using streaming MessagePack format. + + ## Export Core Overview + + Bulk data export for knowledge graph: + - **Format**: MessagePack streaming + - **Content**: Triples and graph embeddings + - **Source**: Global knowledge storage + - **Use**: Backups, data migration, archival + + ## MessagePack Protocol + + Response body is MessagePack stream with message tuples: + + ### Triple Message + ``` + ("t", { + "m": { // Metadata + "i": "core-id", // Knowledge core ID + "m": [...], // Metadata triples array + "u": "user", // User + "c": "collection" // Collection + }, + "t": [...] // Triples array + }) + ``` + + ### Graph Embeddings Message + ``` + ("ge", { + "m": { // Metadata + "i": "core-id", + "m": [...], + "u": "user", + "c": "collection" + }, + "e": [ // Entities array + { + "e": {"v": "uri", "e": true}, // Entity RdfValue + "v": [0.1, 0.2, ...] 
// Vectors + } + ] + }) + ``` + + ### End of Stream Message + ``` + ("eos", {}) + ``` + + ## Query Parameters + + - **id**: Knowledge core ID to export + - **user**: User identifier + + ## Streaming + + Data streamed incrementally: + - Triples sent first + - Graph embeddings sent next + - EOS marker signals completion + + Client should process messages as received. + + ## Use Cases + + - **Backups**: Export for disaster recovery + - **Data migration**: Move to another system + - **Archival**: Long-term storage + - **Replication**: Copy knowledge cores + - **Analysis**: External processing + + operationId: exportCore + security: + - bearerAuth: [] + parameters: + - name: id + in: query + required: true + schema: + type: string + description: Knowledge core ID to export + example: core-123 + - name: user + in: query + required: true + schema: + type: string + description: User identifier + example: alice + responses: + '200': + description: Export stream + content: + application/msgpack: + schema: + type: string + format: binary + description: MessagePack stream of knowledge data + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/flow.yaml b/specs/api/paths/flow.yaml new file mode 100644 index 00000000..181e03bf --- /dev/null +++ b/specs/api/paths/flow.yaml @@ -0,0 +1,194 @@ +post: + tags: + - Flow + summary: Flow lifecycle and blueprint management + description: | + Manage flow instances and blueprints. + + ## Important Distinction + + The **flow service** manages *running flow instances*. + The **config service** (`/api/v1/config`) manages *stored configuration*. + + - Use flow service to start/stop/manage running flows + - Use config service to store/retrieve flow definitions + + ## Flow Instance Operations + + ### start-flow + Start a new flow instance from a blueprint. The blueprint must exist (either built-in or created via put-blueprint). 
+ + Parameters are resolved from: + 1. User-provided values (--param) + 2. Default values from parameter type definitions + 3. Controlled-by relationships + + ### stop-flow + Stop a running flow instance. This terminates all processors and releases resources. + + ### list-flows + List all currently running flow instances. + + ### get-flow + Get details of a running flow including its configuration, parameters, and interface queue names. + + ## Blueprint Operations + + ### list-blueprints + List all available flow blueprints (built-in and custom). + + ### get-blueprint + Retrieve a blueprint definition showing its structure, parameters, processors, and interfaces. + + ### put-blueprint + Create or update a flow blueprint definition. + + Blueprints define: + - **Class processors**: Shared across all instances of this blueprint + - **Flow processors**: Unique to each flow instance + - **Interfaces**: Entry points for external systems + - **Parameters**: Configurable values for customization + + ### delete-blueprint + Delete a custom blueprint definition. Built-in blueprints cannot be deleted. 
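A minimal sketch of how a client might build a `start-flow` request for this endpoint, assuming the local development server URL listed under `servers` in `openapi.yaml` and a bearer token that matches `GATEWAY_SECRET`. The helper name `build_flow_request` and the token value are hypothetical and not part of TrustGraph; the request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: the local development server from the `servers` section.
BASE_URL = "http://localhost:8088"


def build_flow_request(operation, token=None, **fields):
    """Build (but do not send) a POST request for the flow service.

    Hypothetical helper for illustration only. Keyword names use
    underscores and are converted to the kebab-case field names the
    API expects (flow_id -> flow-id).
    """
    body = {"operation": operation}
    for name, value in fields.items():
        body[name.replace("_", "-")] = value

    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Only needed when GATEWAY_SECRET is set on the gateway.
        headers["Authorization"] = f"Bearer {token}"

    return urllib.request.Request(
        f"{BASE_URL}/api/v1/flow",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_flow_request(
    "start-flow",
    token="example-token",  # hypothetical token value
    flow_id="my-flow",
    blueprint_name="document-rag",
    parameters={"model": "gpt-4", "temperature": "0.7"},
)
```

Passing `request` to `urllib.request.urlopen` would send it. The same helper covers the other operations (`stop-flow`, `get-blueprint`, and so on) because they differ only in the `operation` value and the accompanying fields.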
+ + operationId: flowService + security: + - bearerAuth: [] + requestBody: + required: true + content: + application/json: + schema: + $ref: '../components/schemas/flow/FlowRequest.yaml' + examples: + startFlow: + summary: Start a flow instance + value: + operation: start-flow + flow-id: my-flow + blueprint-name: document-rag + description: My document processing flow + parameters: + model: gpt-4 + temperature: "0.7" + startFlowMinimal: + summary: Start flow with defaults + value: + operation: start-flow + flow-id: my-flow + blueprint-name: document-rag + stopFlow: + summary: Stop a flow instance + value: + operation: stop-flow + flow-id: my-flow + listFlows: + summary: List running flows + value: + operation: list-flows + getFlow: + summary: Get flow details + value: + operation: get-flow + flow-id: my-flow + listBlueprints: + summary: List available blueprints + value: + operation: list-blueprints + getBlueprint: + summary: Get blueprint definition + value: + operation: get-blueprint + blueprint-name: document-rag + putBlueprint: + summary: Create/update blueprint + value: + operation: put-blueprint + blueprint-name: my-custom-rag + blueprint-definition: + description: Custom RAG pipeline + parameters: + model: + type: llm-model + description: LLM model + order: 1 + class: + text-completion:{class}: + request: non-persistent://tg/request/text-completion:{class} + response: non-persistent://tg/response/text-completion:{class} + flow: + chunker:{id}: + input: persistent://tg/flow/chunk:{id} + output: persistent://tg/flow/chunk-load:{id} + interfaces: + agent: + request: non-persistent://tg/request/agent:{id} + response: non-persistent://tg/response/agent:{id} + deleteBlueprint: + summary: Delete blueprint + value: + operation: delete-blueprint + blueprint-name: my-custom-rag + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../components/schemas/flow/FlowResponse.yaml' + examples: + startFlow: + summary: Flow 
started + value: + flow-id: my-flow + listFlows: + summary: Running flows + value: + flow-ids: + - default + - production + - my-flow + getFlow: + summary: Flow details + value: + flow: + blueprint-name: document-rag + description: My document processing flow + parameters: + model: gpt-4 + temperature: "0.7" + interfaces: + agent: + request: non-persistent://tg/request/agent:my-flow + response: non-persistent://tg/response/agent:my-flow + text-load: persistent://tg/flow/text-document-load:my-flow + listBlueprints: + summary: Available blueprints + value: + blueprint-names: + - document-rag + - graph-rag + - document-rag+graph-rag + - my-custom-rag + getBlueprint: + summary: Blueprint definition + value: + blueprint-definition: + description: Standard RAG pipeline + parameters: + model: + type: llm-model + order: 1 + class: + text-completion:{class}: + request: non-persistent://tg/request/text-completion:{class} + response: non-persistent://tg/response/text-completion:{class} + interfaces: + agent: + request: non-persistent://tg/request/agent:{id} + response: non-persistent://tg/response/agent:{id} + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/agent.yaml b/specs/api/paths/flow/agent.yaml new file mode 100644 index 00000000..91f92ebd --- /dev/null +++ b/specs/api/paths/flow/agent.yaml @@ -0,0 +1,130 @@ +post: + tags: + - Flow Services + summary: Agent service - conversational AI with reasoning + description: | + AI agent that can understand questions, reason about them, and take actions. 
+ + ## Agent Overview + + The agent service provides a conversational AI that: + - Understands natural language questions + - Reasons about problems using thoughts + - Takes actions to gather information + - Provides coherent answers + + ## Request Format + + Send a question with optional: + - **state**: Continue from previous conversation + - **history**: Previous agent steps for context + - **group**: Collaborative agent identifiers + - **streaming**: Enable streaming responses + + ## Response Modes + + ### Streaming Mode (streaming: true) + Responses arrive as chunks with `chunk-type`: + - `thought`: Agent's reasoning process + - `action`: Action being taken + - `observation`: Result from action + - `answer`: Final response to user + - `error`: Error occurred + + Each chunk may have multiple messages. Check flags: + - `end-of-message`: Current chunk type complete + - `end-of-dialog`: Entire conversation complete + + ### Legacy Mode (streaming: false) + Single response with: + - `answer`: Complete answer + - `thought`: Reasoning (if any) + - `observation`: Observations (if any) + + ## Multi-turn Conversations + + Include `history` array with previous steps to maintain context. + Each step has: thought, action, arguments, observation. + + operationId: agentService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/agent/AgentRequest.yaml' + examples: + simpleQuestion: + summary: Simple question + value: + question: What is the capital of France? + user: alice + streamingQuestion: + summary: Question with streaming enabled + value: + question: Explain quantum computing + user: alice + streaming: true + conversationWithHistory: + summary: Multi-turn conversation + value: + question: And what about its population? 
+ user: alice + history: + - thought: User is asking about the capital of France + action: search + arguments: + query: "capital of France" + observation: "Paris is the capital of France" + user: alice + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/agent/AgentResponse.yaml' + examples: + streamingThought: + summary: Streaming thought chunk + value: + chunk-type: thought + content: I need to search for information about quantum computing + end-of-message: false + end-of-dialog: false + streamingAnswer: + summary: Streaming answer chunk + value: + chunk-type: answer + content: Quantum computing uses quantum mechanics principles... + end-of-message: false + end-of-dialog: false + streamingComplete: + summary: Streaming complete marker + value: + chunk-type: answer + content: "" + end-of-message: true + end-of-dialog: true + legacyResponse: + summary: Legacy non-streaming response + value: + answer: Paris is the capital of France. + thought: User is asking about the capital of France + observation: "" + end-of-message: false + end-of-dialog: false + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/document-embeddings.yaml b/specs/api/paths/flow/document-embeddings.yaml new file mode 100644 index 00000000..dbab2f92 --- /dev/null +++ b/specs/api/paths/flow/document-embeddings.yaml @@ -0,0 +1,103 @@ +post: + tags: + - Flow Services + summary: Document Embeddings Query - find similar text chunks + description: | + Query document embeddings to find similar text chunks by vector similarity. + + ## Document Embeddings Query Overview + + Find document chunks semantically similar to a query vector: + - **Input**: Query embedding vector + - **Search**: Compare against stored chunk embeddings + - **Output**: Most similar text chunks + + Core component of document RAG retrieval. 
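To make the input, search, and output steps above concrete, here is a small in-memory sketch of similarity ranking. It assumes cosine similarity over plain Python lists; the function names and the `stored` list of `(chunk_text, embedding)` pairs are hypothetical, and the vector store TrustGraph actually uses is not described here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(query_vector, stored, limit):
    """Return the text of the `limit` chunks most similar to the query.

    `stored` is a list of (chunk_text, embedding) pairs. Only the text
    is returned, mirroring a response that carries no explicit scores.
    """
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:limit]]


stored = [
    ("about cats", [1.0, 0.0]),
    ("about dogs", [0.0, 1.0]),
    ("cats and dogs", [1.0, 1.0]),
]
results = top_chunks([1.0, 0.1], stored, limit=2)
```

A production store would use an approximate nearest-neighbour index rather than sorting every embedding, but the ranking rule is the same.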
+ + ## Use Cases + + - **Document retrieval**: Find relevant passages + - **Semantic search**: Search by meaning not keywords + - **Context gathering**: Get text for RAG + - **Similar content**: Discover related documents + + ## Process + + 1. Obtain query embedding (via embeddings service) + 2. Query stored document chunk embeddings + 3. Calculate cosine similarity + 4. Return top N most similar chunks + 5. Use chunks as context for generation + + ## Chunking + + Documents are split into chunks during indexing: + - Typical size: 200-1000 tokens + - Overlap between chunks for continuity + - Each chunk has own embedding + + Queries return individual chunks, not full documents. + + ## Similarity Scoring + + Uses cosine similarity: + - Results ordered by similarity + - No explicit scores in response + - Limit controls result count + + ## Output Format + + Returns text chunks as strings: + - Raw chunk text + - No metadata (source, position, etc.) + - Use for LLM context directly + + operationId: documentEmbeddingsQueryService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/embeddings-query/DocumentEmbeddingsQueryRequest.yaml' + examples: + basicQuery: + summary: Find similar chunks + value: + vectors: [0.023, -0.142, 0.089, 0.234, -0.067, 0.156, 0.201, -0.178] + limit: 10 + user: alice + collection: research + largeQuery: + summary: Larger result set + value: + vectors: [0.1, -0.2, 0.3, -0.4, 0.5] + limit: 30 + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/embeddings-query/DocumentEmbeddingsQueryResponse.yaml' + examples: + similarChunks: + summary: Similar document chunks + value: + chunks: + - "Quantum computing uses quantum mechanics principles like superposition 
and entanglement for computation. Unlike classical bits, quantum bits (qubits) can exist in multiple states simultaneously." + - "Neural networks are computing systems inspired by biological neural networks. They consist of interconnected nodes organized in layers that process information through weighted connections." + - "Machine learning algorithms learn patterns from data without being explicitly programmed. They improve their performance through experience and exposure to training data." + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/document-load.yaml b/specs/api/paths/flow/document-load.yaml new file mode 100644 index 00000000..09ddc09f --- /dev/null +++ b/specs/api/paths/flow/document-load.yaml @@ -0,0 +1,119 @@ +post: + tags: + - Flow Services + summary: Document Load - load binary documents (PDF, etc.) + description: | + Load binary documents (PDF, Word, etc.) into processing pipeline. + + ## Document Load Overview + + Fire-and-forget binary document loading: + - **Input**: Document data (base64 encoded) + - **Process**: Extract text, chunk, embed, store + - **Output**: None (202 Accepted) + + Asynchronous processing for PDF and other binary formats. + + ## Processing Pipeline + + Documents go through: + 1. **Text extraction**: PDF→text, DOCX→text, etc. + 2. **Chunking**: Split into overlapping chunks + 3. **Embedding**: Generate vectors for each chunk + 4. **Storage**: Store chunks + embeddings + 5. **Indexing**: Make searchable + + Pipeline runs asynchronously. + + ## Supported Formats + + - **PDF**: Portable Document Format + - **DOCX**: Microsoft Word + - **HTML**: Web pages + - Other formats via extractors + + Format detected from content, not extension. 
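As a rough illustration of the fire-and-forget input described above, the sketch below builds a request body from raw document bytes. It assumes the body carries `data`, `id`, `user`, and `collection` fields, as in this endpoint's request examples; the helper name `build_document_load_body` is hypothetical.

```python
import base64
import json


def build_document_load_body(doc_bytes, doc_id, user, collection):
    """Return a JSON-serializable body for the document-load service.

    Hypothetical helper: the binary document is base64 encoded so it
    can travel inside a JSON string.
    """
    return {
        "data": base64.b64encode(doc_bytes).decode("utf-8"),
        "id": doc_id,
        "user": user,
        "collection": collection,
    }


# A tiny stand-in for real PDF content.
body = build_document_load_body(b"%PDF-1.4\n", "doc-789", "alice", "research")
payload = json.dumps(body)
```

Because the service answers before processing finishes, a client that needs confirmation would have to query the collection later rather than inspect the response.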
+ + ## Binary Encoding + + Documents must be base64 encoded: + ```python + import base64 + + with open('document.pdf', 'rb') as f: + doc_bytes = f.read() + encoded = base64.b64encode(doc_bytes).decode('utf-8') + ``` + + ## Metadata + + Optional RDF triples: + - Document properties + - Source information + - Custom attributes + + ## Use Cases + + - **PDF ingestion**: Process research papers + - **Document libraries**: Index document collections + - **Content migration**: Import from other systems + - **Automated processing**: Batch document loading + + ## No Response Data + + Returns 202 Accepted immediately: + - Document queued + - Processing happens asynchronously + - No status tracking + - Query later to verify indexed + + operationId: documentLoadService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/loading/DocumentLoadRequest.yaml' + examples: + loadPdf: + summary: Load PDF document + value: + data: JVBERi0xLjQKJeLjz9MKMSAwIG9iago8PC9UeXBlL0NhdGFsb2cvUGFnZXMgMiAwIFI+PmVuZG9iagoyIDAgb2JqCjw8L1R5cGUvUGFnZXMvS2lkc1szIDAgUl0vQ291bnQgMT4+ZW5kb2JqCg== + id: doc-789 + user: alice + collection: research + withMetadata: + summary: Load with metadata + value: + data: JVBERi0xLjQKJeLjz9MK...
+ id: doc-101112 + user: bob + collection: papers + metadata: + - s: {v: "doc-101112", e: false} + p: {v: "http://purl.org/dc/terms/title", e: true} + o: {v: "Quantum Entanglement Research", e: false} + - s: {v: "doc-101112", e: false} + p: {v: "http://purl.org/dc/terms/date", e: true} + o: {v: "2024-01-15", e: false} + responses: + '202': + description: Document accepted for processing + content: + application/json: + schema: + type: object + properties: {} + example: {} + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/document-rag.yaml b/specs/api/paths/flow/document-rag.yaml new file mode 100644 index 00000000..fd738f33 --- /dev/null +++ b/specs/api/paths/flow/document-rag.yaml @@ -0,0 +1,107 @@ +post: + tags: + - Flow Services + summary: Document RAG - retrieve and generate from documents + description: | + Retrieval-Augmented Generation over document embeddings. + + ## Document RAG Overview + + Document RAG combines: + 1. **Retrieval**: Search document embeddings using semantic similarity + 2. **Generation**: Use LLM to synthesize answer from retrieved documents + + This provides grounded answers based on your document corpus. + + ## Query Process + + 1. Convert query to embedding + 2. Search document embeddings for most similar chunks + 3. Retrieve top N document chunks (configurable via doc-limit) + 4. Pass query + retrieved context to LLM + 5. Generate answer grounded in documents + + ## Streaming + + Enable `streaming: true` to receive the answer as it's generated: + - Multiple messages with `response` content + - Final message with `end-of-stream: true` + + Without streaming, returns complete answer in single response. 
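The streaming behaviour above can be sketched as a small client-side accumulator. It assumes each message is an already-decoded JSON object with a `response` string and an optional `end-of-stream` flag; how the messages arrive over the transport is not specified here, and `assemble_response` is a hypothetical name.

```python
def assemble_response(messages):
    """Concatenate `response` fragments until `end-of-stream` is true.

    `messages` is any iterable of dicts; messages after the
    end-of-stream marker are ignored.
    """
    parts = []
    for message in messages:
        parts.append(message.get("response", ""))
        if message.get("end-of-stream", False):
            break
    return "".join(parts)


messages = [
    {"response": "The research papers present three", "end-of-stream": False},
    {"response": " key findings.", "end-of-stream": False},
    {"response": "", "end-of-stream": True},
]
answer = assemble_response(messages)
```

The same function also handles the non-streaming case, where a single message carries the whole answer.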
+ + ## Parameters + + - **doc-limit**: Controls retrieval depth (1-100, default 20) + - Higher = more context but slower + - Lower = faster but may miss relevant info + - **collection**: Target specific document collection + - **user**: Multi-tenant isolation + + operationId: documentRagService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/rag/DocumentRagRequest.yaml' + examples: + basicQuery: + summary: Basic document query + value: + query: What are the key findings in the research papers? + user: alice + collection: research + streamingQuery: + summary: Streaming query + value: + query: Summarize the main conclusions + user: alice + collection: research + doc-limit: 15 + streaming: true + limitedRetrieval: + summary: Query with limited retrieval + value: + query: What is quantum entanglement? + doc-limit: 5 + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/rag/DocumentRagResponse.yaml' + examples: + completeResponse: + summary: Complete non-streaming response + value: + response: | + The research papers present three key findings: + 1. Quantum entanglement exhibits non-local correlations + 2. Bell's inequality is violated in experimental tests + 3. 
Applications in quantum cryptography are promising + end-of-stream: false + streamingChunk: + summary: Streaming response chunk + value: + response: "The research papers present three" + end-of-stream: false + streamingComplete: + summary: Streaming complete marker + value: + response: "" + end-of-stream: true + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/embeddings.yaml b/specs/api/paths/flow/embeddings.yaml new file mode 100644 index 00000000..e7c7a3f5 --- /dev/null +++ b/specs/api/paths/flow/embeddings.yaml @@ -0,0 +1,85 @@ +post: + tags: + - Flow Services + summary: Embeddings - text to vector conversion + description: | + Convert text to embedding vectors for semantic similarity search. + + ## Embeddings Overview + + Embeddings transform text into dense vector representations that: + - Capture semantic meaning + - Enable similarity comparisons via cosine distance + - Support semantic search and retrieval + - Power RAG systems + + ## Use Cases + + - **Document indexing**: Convert documents to vectors for storage + - **Query encoding**: Convert search queries for similarity matching + - **Semantic similarity**: Find related texts via vector distance + - **Clustering**: Group similar content + - **Classification**: Use as features for ML models + + ## Vector Dimensions + + Dimension count depends on embedding model: + - text-embedding-ada-002: 1536 dimensions + - text-embedding-3-small: 1536 dimensions + - text-embedding-3-large: 3072 dimensions + - Custom models: Varies + + ## Single Request + + Unlike batch embedding APIs, this endpoint processes one text at a time. + For bulk operations, use document-load or text-load services. 
+ + operationId: embeddingsService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/embeddings/EmbeddingsRequest.yaml' + examples: + shortText: + summary: Short text embedding + value: + text: Machine learning + sentence: + summary: Sentence embedding + value: + text: Quantum computing uses quantum mechanics principles for computation. + paragraph: + summary: Paragraph embedding + value: + text: | + Neural networks are computing systems inspired by biological neural networks. + They consist of interconnected nodes (neurons) organized in layers. + Through training, they learn to recognize patterns and make predictions. + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/embeddings/EmbeddingsResponse.yaml' + examples: + embeddingVector: + summary: Embedding vector + value: + vectors: [0.023, -0.142, 0.089, 0.234, -0.067, 0.156, 0.201, -0.178, 0.045, 0.312] + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/graph-embeddings.yaml b/specs/api/paths/flow/graph-embeddings.yaml new file mode 100644 index 00000000..277659de --- /dev/null +++ b/specs/api/paths/flow/graph-embeddings.yaml @@ -0,0 +1,95 @@ +post: + tags: + - Flow Services + summary: Graph Embeddings Query - find similar entities + description: | + Query graph embeddings to find similar entities by vector similarity. + + ## Graph Embeddings Query Overview + + Find entities semantically similar to a query vector: + - **Input**: Query embedding vector + - **Search**: Compare against stored entity embeddings + - **Output**: Most similar entities (RDF URIs) + + Core component of graph RAG retrieval. 
+ + ## Use Cases + + - **Entity discovery**: Find related entities + - **Concept expansion**: Discover similar concepts + - **Graph exploration**: Navigate by semantic similarity + - **RAG retrieval**: Get entities for context + + ## Process + + 1. Obtain query embedding (via embeddings service) + 2. Query stored entity embeddings + 3. Calculate cosine similarity + 4. Return top N most similar entities + 5. Use entities to retrieve triples/subgraph + + ## Similarity Scoring + + Uses cosine similarity between vectors: + - Results ordered by similarity (most similar first) + - No explicit similarity scores returned + - Limit controls result count + + ## Entity Format + + Returns RDF values (entities): + - URI entities: `{v: "https://...", e: true}` + - These are references to knowledge graph entities + - Use with triples query to get entity details + + operationId: graphEmbeddingsQueryService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/embeddings-query/GraphEmbeddingsQueryRequest.yaml' + examples: + basicQuery: + summary: Find similar entities + value: + vectors: [0.023, -0.142, 0.089, 0.234, -0.067, 0.156, 0.201, -0.178] + limit: 10 + user: alice + collection: research + largeQuery: + summary: Larger result set + value: + vectors: [0.1, -0.2, 0.3, -0.4, 0.5] + limit: 50 + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/embeddings-query/GraphEmbeddingsQueryResponse.yaml' + examples: + similarEntities: + summary: Similar entities found + value: + entities: + - {v: "https://example.com/person/alice", e: true} + - {v: "https://example.com/person/bob", e: true} + - {v: "https://example.com/concept/quantum-computing", e: true} + - {v: 
"https://example.com/concept/machine-learning", e: true} + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/graph-rag.yaml b/specs/api/paths/flow/graph-rag.yaml new file mode 100644 index 00000000..9bcb6940 --- /dev/null +++ b/specs/api/paths/flow/graph-rag.yaml @@ -0,0 +1,127 @@ +post: + tags: + - Flow Services + summary: Graph RAG - retrieve and generate from knowledge graph + description: | + Retrieval-Augmented Generation over knowledge graph. + + ## Graph RAG Overview + + Graph RAG combines: + 1. **Retrieval**: Find relevant entities and subgraph from knowledge graph + 2. **Generation**: Use LLM to reason over graph structure and generate answer + + This provides graph-aware answers that leverage relationships and structure. + + ## Query Process + + 1. Identify relevant entities from query (using embeddings) + 2. Retrieve connected subgraph around entities + 3. Optionally traverse paths up to max-path-length hops + 4. Limit subgraph size to stay within context window + 5. Pass query + graph structure to LLM + 6. Generate answer incorporating graph relationships + + ## Streaming + + Enable `streaming: true` to receive the answer as it's generated: + - Multiple messages with `response` content + - Final message with `end-of-stream: true` + + Without streaming, returns complete answer in single response. 
+ + ## Parameters + + Control retrieval scope with multiple knobs: + - **entity-limit**: How many starting entities to find (1-200, default 50) + - **triple-limit**: Triples per entity (1-100, default 30) + - **max-subgraph-size**: Total subgraph cap (10-5000, default 1000) + - **max-path-length**: Graph traversal depth (1-5, default 2) + + Higher limits = more context but: + - Slower retrieval + - Larger context for LLM + - May hit context window limits + + ## Use Cases + + Best for queries requiring: + - Relationship understanding ("How are X and Y connected?") + - Multi-hop reasoning ("What's the path from A to B?") + - Structural analysis ("What are the main entities related to X?") + + operationId: graphRagService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/rag/GraphRagRequest.yaml' + examples: + basicQuery: + summary: Basic graph query + value: + query: What connections exist between quantum physics and computer science? + user: alice + collection: research + streamingQuery: + summary: Streaming query with custom limits + value: + query: Trace the historical development of AI from Turing to modern LLMs + user: alice + collection: research + entity-limit: 40 + triple-limit: 25 + max-subgraph-size: 800 + max-path-length: 3 + streaming: true + focusedQuery: + summary: Focused query with tight limits + value: + query: What is the immediate relationship between entity A and B? 
+ entity-limit: 10 + triple-limit: 15 + max-subgraph-size: 200 + max-path-length: 1 + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/rag/GraphRagResponse.yaml' + examples: + completeResponse: + summary: Complete non-streaming response + value: + response: | + Quantum physics and computer science intersect primarily through quantum computing. + The knowledge graph shows connections through: + - Quantum algorithms (Shor's algorithm, Grover's algorithm) + - Quantum information theory + - Computational complexity theory + end-of-stream: false + streamingChunk: + summary: Streaming response chunk + value: + response: "Quantum physics and computer science intersect" + end-of-stream: false + streamingComplete: + summary: Streaming complete marker + value: + response: "" + end-of-stream: true + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/mcp-tool.yaml b/specs/api/paths/flow/mcp-tool.yaml new file mode 100644 index 00000000..9f53df36 --- /dev/null +++ b/specs/api/paths/flow/mcp-tool.yaml @@ -0,0 +1,119 @@ +post: + tags: + - Flow Services + summary: MCP Tool - execute Model Context Protocol tools + description: | + Execute MCP (Model Context Protocol) tools for agent capabilities. + + ## MCP Tool Overview + + MCP tools provide agent capabilities through a standardized protocol: + - **Search tools**: Web search, document search + - **Data tools**: Database queries, API calls + - **Action tools**: File operations, system commands + - **Integration tools**: Third-party service connectors + + Tools extend agent capabilities beyond pure LLM generation. + + ## Tool Execution + + Tools are: + 1. Registered via MCP + 2. Discovered by agent + 3. Called with structured parameters + 4.
Return text or structured results + + ## Request Format + + - **name**: Tool identifier (e.g., "search", "calculator", "weather") + - **parameters**: Tool-specific arguments as JSON object + + ## Response Format + + Tools can return: + - **text**: Plain text result (simple tools) + - **object**: Structured JSON result (complex tools) + + ## Tool Registration + + Tools are registered via MCP server configuration: + - Define tool schema (name, parameters, description) + - Implement tool handler + - Register with MCP server + - Agent discovers and uses tool + + ## Use Cases + + - **Web search**: Find external information + - **Calculator**: Perform calculations + - **Database query**: Retrieve structured data + - **API integration**: Call external services + - **File operations**: Read/write files + - **Code execution**: Run scripts + + operationId: mcpToolService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/mcp-tool/McpToolRequest.yaml' + examples: + searchTool: + summary: Search tool execution + value: + name: search + parameters: + query: quantum computing + limit: 10 + calculatorTool: + summary: Calculator tool + value: + name: calculator + parameters: + expression: (42 * 7) + 15 + weatherTool: + summary: Weather tool + value: + name: weather + parameters: + location: San Francisco + units: celsius + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/mcp-tool/McpToolResponse.yaml' + examples: + textResponse: + summary: Text result + value: + text: The result is 309 + objectResponse: + summary: Structured result + value: + object: + results: + - title: Introduction to Quantum Computing + url: https://example.com/qc-intro + snippet: Quantum computing uses quantum 
mechanics... + - title: Quantum Algorithms + url: https://example.com/qc-algos + snippet: Key algorithms include Shor's and Grover's... + total: 10 + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/nlp-query.yaml b/specs/api/paths/flow/nlp-query.yaml new file mode 100644 index 00000000..7032b5b9 --- /dev/null +++ b/specs/api/paths/flow/nlp-query.yaml @@ -0,0 +1,148 @@ +post: + tags: + - Flow Services + summary: NLP Query - natural language to structured query + description: | + Convert natural language questions to structured GraphQL queries. + + ## NLP Query Overview + + Transforms user questions into executable GraphQL: + - **Natural input**: Ask questions in plain English + - **Structured output**: Get GraphQL query + variables + - **Schema-aware**: Uses knowledge graph schema + - **Confidence scoring**: Know how well question was understood + + Enables non-technical users to query knowledge graph. + + ## Process + + 1. Parse natural language question + 2. Identify entities and relationships + 3. Map to GraphQL schema types + 4. Generate query with variables + 5. Return query + confidence score + + ## Using Results + + Generated query can be: + - Executed via objects query service + - Inspected and modified if needed + - Cached for similar questions + + Example workflow: + ``` + 1. User asks: "Who does Alice know?" + 2. NLP Query generates GraphQL + 3. Execute via /api/v1/flow/{flow}/service/objects + 4. Return results to user + ``` + + ## Schema Detection + + Response includes `detected-schemas` array showing: + - Which types were identified + - What entities were matched + - Schema coverage of question + + Helps understand query scope. 
+ + ## Confidence Scores + + - **0.9-1.0**: High confidence, likely correct + - **0.7-0.9**: Good confidence, probably correct + - **0.5-0.7**: Medium confidence, may need review + - **< 0.5**: Low confidence, likely incorrect + + Low scores suggest: + - Ambiguous question + - Missing schema coverage + - Complex query structure + + operationId: nlpQueryService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/query/NlpQueryRequest.yaml' + examples: + simpleQuestion: + summary: Simple relationship question + value: + question: Who does Alice know? + max-results: 50 + complexQuestion: + summary: Multi-hop relationship + value: + question: What companies employ people that Alice knows? + max-results: 100 + filterQuestion: + summary: Question with filters + value: + question: Which engineers does Bob collaborate with? + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/query/NlpQueryResponse.yaml' + examples: + successfulQuery: + summary: Successful query generation + value: + graphql-query: | + query GetConnections($person: ID!) { + person(id: $person) { + knows { name email } + } + } + variables: + person: "https://example.com/person/alice" + detected-schemas: ["Person"] + confidence: 0.92 + complexQuery: + summary: Complex multi-hop query + value: + graphql-query: | + query GetCompanies($person: ID!) 
{ + person(id: $person) { + knows { + worksFor { + name + industry + } + } + } + } + variables: + person: "https://example.com/person/alice" + detected-schemas: ["Person", "Organization"] + confidence: 0.85 + lowConfidence: + summary: Low confidence result + value: + graphql-query: | + query Search { + search(term: "unknown entities") { + results + } + } + variables: {} + detected-schemas: [] + confidence: 0.43 + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/objects.yaml b/specs/api/paths/flow/objects.yaml new file mode 100644 index 00000000..ac94a353 --- /dev/null +++ b/specs/api/paths/flow/objects.yaml @@ -0,0 +1,166 @@ +post: + tags: + - Flow Services + summary: Objects query - GraphQL over knowledge graph + description: | + Query knowledge graph using GraphQL for object-oriented data access. + + ## Objects Query Overview + + GraphQL interface to knowledge graph: + - **Schema-driven**: Predefined types and relationships + - **Flexible queries**: Request exactly what you need + - **Nested data**: Traverse relationships in single query + - **Type-safe**: Strong typing with introspection + + Abstracts RDF triples into familiar object model. + + ## GraphQL Benefits + + Compared to triples query: + - **Developer-friendly**: Objects instead of triples + - **Efficient**: Get related data in one query + - **Typed**: Schema defines available fields + - **Discoverable**: Introspection for tooling + + ## Query Structure + + Standard GraphQL query format: + ```graphql + query OperationName($var: Type!) { + fieldName(arg: $var) { + subField1 + subField2 + nestedObject { + nestedField + } + } + } + ``` + + ## Variables + + Pass variables for parameterized queries: + ```json + { + "query": "query GetPerson($id: ID!) 
{ person(id: $id) { name } }", + "variables": {"id": "https://example.com/person/alice"} + } + ``` + + ## Error Handling + + GraphQL distinguishes: + - **Field errors**: Invalid query, missing fields (in `errors` array) + - **System errors**: Connection issues, timeouts (in `error` object) + + Partial data may be returned with field errors. + + ## Schema Definition + + Schema defines available types via config service. + Use introspection query to discover schema. + + operationId: objectsQueryService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/query/ObjectsQueryRequest.yaml' + examples: + simpleQuery: + summary: Simple query + value: + query: | + { + person(id: "https://example.com/person/alice") { + name + email + } + } + user: alice + collection: research + queryWithVariables: + summary: Query with variables + value: + query: | + query GetPerson($id: ID!) 
{ + person(id: $id) { + name + email + knows { + name + } + } + } + variables: + id: "https://example.com/person/alice" + operation-name: GetPerson + nestedQuery: + summary: Nested relationship query + value: + query: | + { + person(id: "https://example.com/person/alice") { + name + knows { + name + worksFor { + name + location + } + } + } + } + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/query/ObjectsQueryResponse.yaml' + examples: + successfulQuery: + summary: Successful query + value: + data: + person: + name: Alice + email: alice@example.com + knows: + - name: Bob + - name: Carol + extensions: + execution_time_ms: "42" + queryWithFieldErrors: + summary: Query with field errors + value: + data: + person: + name: Alice + email: null + errors: + - message: Cannot query field 'nonexistent' on type 'Person' + path: ["person", "nonexistent"] + systemError: + summary: System error + value: + data: null + error: + type: TIMEOUT_ERROR + message: Query execution timeout after 30s + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/prompt.yaml b/specs/api/paths/flow/prompt.yaml new file mode 100644 index 00000000..84a49f81 --- /dev/null +++ b/specs/api/paths/flow/prompt.yaml @@ -0,0 +1,143 @@ +post: + tags: + - Flow Services + summary: Prompt service - template-based generation + description: | + Execute stored prompt templates with variable substitution. 
+ + ## Prompt Service Overview + + The prompt service enables: + - Reusable prompt templates stored in configuration + - Variable substitution for dynamic prompts + - Consistent prompt engineering across requests + - Text or structured object outputs + + ## Template System + + Prompts are stored via config service (`/api/v1/config`) with: + - **id**: Unique prompt identifier + - **template**: Prompt text with `{variable}` placeholders + - **system**: Optional system prompt + - **output_format**: "text" or "object" + + Example template: + ``` + Summarize the following document in {max_length} words: + + {document} + ``` + + ## Variable Substitution + + Two ways to pass variables: + + 1. **terms** (explicit JSON strings): + ```json + { + "terms": { + "document": "\"Text here...\"", + "max_length": "\"200\"" + } + } + ``` + + 2. **variables** (auto-converted): + ```json + { + "variables": { + "document": "Text here...", + "max_length": 200 + } + } + ``` + + ## Output Types + + - **text**: Plain text response in `text` field + - **object**: Structured JSON in `object` field (as string) + + ## Streaming + + Enable `streaming: true` to receive response incrementally. + + ## Use Cases + + - Document summarization + - Entity extraction + - Classification tasks + - Data transformation + - Any repeatable LLM task with consistent prompting + + operationId: promptService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/prompt/PromptRequest.yaml' + examples: + withTerms: + summary: Using terms (JSON strings) + value: + id: summarize-document + terms: + document: '"This document discusses quantum computing, covering qubits, superposition, and entanglement. 
Applications include cryptography and optimization."' + max_length: '"50"' + withVariables: + summary: Using variables (auto-converted) + value: + id: extract-entities + variables: + text: A paper by Einstein on relativity published in 1905. + entity_types: ["person", "year", "topic"] + streaming: + summary: Streaming response + value: + id: generate-report + variables: + data: {revenue: 1000000, growth: 15} + format: executive summary + streaming: true + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/prompt/PromptResponse.yaml' + examples: + textResponse: + summary: Text output + value: + text: This document provides an overview of quantum computing fundamentals and cryptographic applications. + end-of-stream: false + objectResponse: + summary: Structured output + value: + object: '{"entities": [{"type": "person", "value": "Einstein"}, {"type": "year", "value": "1905"}, {"type": "topic", "value": "relativity"}]}' + end-of-stream: false + streamingChunk: + summary: Streaming chunk + value: + text: This document provides an overview + end-of-stream: false + streamingComplete: + summary: Streaming complete + value: + text: "" + end-of-stream: true + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/structured-diag.yaml b/specs/api/paths/flow/structured-diag.yaml new file mode 100644 index 00000000..9de56152 --- /dev/null +++ b/specs/api/paths/flow/structured-diag.yaml @@ -0,0 +1,172 @@ +post: + tags: + - Flow Services + summary: Structured Diag - analyze structured data formats + description: | + Analyze and understand structured data (CSV, JSON, XML). 
+ + ## Structured Diag Overview + + Helps process unknown structured data: + - **Detect format**: Identify CSV, JSON, or XML + - **Generate schema**: Create descriptor from sample + - **Match schemas**: Find existing schemas that fit data + - **Full diagnosis**: Complete analysis in one call + + Essential for data ingestion pipelines. + + ## Operations + + ### detect-type + Identify data format from sample: + - Input: Data sample + - Output: Format (csv/json/xml) + confidence + - Use when: Format is unknown + + ### generate-descriptor + Create schema descriptor: + - Input: Sample + known type + - Output: Field definitions, types, structure + - Use when: Need to understand data structure + + ### diagnose (recommended) + Combined analysis: + - Input: Data sample + - Output: Format + descriptor + metadata + - Use when: Starting from scratch + + ### schema-selection + Find matching schemas: + - Input: Data sample + - Output: List of schema IDs that match + - Use when: Have existing schemas, need to match data + + ## Data Types + + Supported formats: + - **CSV**: Comma-separated values (or custom delimiter) + - **JSON**: JSON objects or arrays + - **XML**: XML documents + + ## Options + + Format-specific options: + - **CSV**: delimiter, has_header, quote_char + - **JSON**: array_path (for nested arrays) + - **XML**: root_element, record_path + + ## Workflow Example + + 1. Receive unknown data file + 2. Call diagnose operation with sample + 3. Get format + schema descriptor + 4. Use descriptor to process full dataset + 5. 
Load data via document-load or text-load + + operationId: structuredDiagService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/diag/StructuredDiagRequest.yaml' + examples: + detectType: + summary: Detect data type + value: + operation: detect-type + sample: | + name,age,email + Alice,30,alice@example.com + Bob,25,bob@example.com + generateDescriptor: + summary: Generate schema descriptor + value: + operation: generate-descriptor + sample: | + name,age,email + Alice,30,alice@example.com + type: csv + schema-name: person-records + options: + delimiter: "," + has_header: "true" + diagnose: + summary: Full diagnosis + value: + operation: diagnose + sample: | + [ + {"name": "Alice", "age": 30}, + {"name": "Bob", "age": 25} + ] + schemaSelection: + summary: Find matching schemas + value: + operation: schema-selection + sample: | + name,email,phone + Alice,alice@example.com,555-1234 + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/diag/StructuredDiagResponse.yaml' + examples: + detectedType: + summary: Type detection result + value: + operation: detect-type + detected-type: csv + confidence: 0.95 + generatedDescriptor: + summary: Generated descriptor + value: + operation: generate-descriptor + descriptor: + schema_name: person-records + type: csv + fields: + - {name: name, type: string} + - {name: age, type: integer} + - {name: email, type: string} + metadata: + field_count: "3" + has_header: "true" + fullDiagnosis: + summary: Complete diagnosis + value: + operation: diagnose + detected-type: json + confidence: 0.98 + descriptor: + type: json + structure: array_of_objects + fields: + - {name: name, type: string} + - {name: age, type: integer} + metadata: + record_count: 
"2" + schemaMatches: + summary: Schema selection results + value: + operation: schema-selection + schema-matches: + - person-schema-v1 + - contact-schema-v2 + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/structured-query.yaml b/specs/api/paths/flow/structured-query.yaml new file mode 100644 index 00000000..c094c50a --- /dev/null +++ b/specs/api/paths/flow/structured-query.yaml @@ -0,0 +1,134 @@ +post: + tags: + - Flow Services + summary: Structured Query - question to results (all-in-one) + description: | + Ask natural language questions and get results directly. + + ## Structured Query Overview + + Combines two operations in one call: + 1. **NLP Query**: Generate GraphQL from question + 2. **Objects Query**: Execute generated query + 3. **Return Results**: Direct answer data + + Simplest way to query knowledge graph with natural language. + + ## Comparison with Other Services + + ### Structured Query (this service) + - **Input**: Natural language question + - **Output**: Query results (data) + - **Use when**: Want simple, direct answers + + ### NLP Query + Objects Query (separate calls) + - **Step 1**: Convert question → GraphQL + - **Step 2**: Execute GraphQL → results + - **Use when**: Need to inspect/modify query before execution + + ### Triples Query (low-level) + - **Input**: RDF pattern + - **Output**: Matching triples + - **Use when**: Need precise control over graph queries + + ## Response Format + + Returns standard GraphQL response: + - **data**: Query results (null if error) + - **errors**: Field-level errors (array of strings) + - **error**: System-level error (generation or execution failure) + + ## Error Handling + + Three types of errors: + 1. **Query generation failed**: Couldn't understand question + - Error in `error` object + - data = null + 2. 
**Query execution failed**: Generated query had errors + - Errors in `errors` array + - data may be partial + 3. **System error**: Infrastructure issue + - Error in `error` object + + ## Performance + + Convenience vs control trade-off: + - **Faster development**: One call instead of two + - **Less control**: Can't inspect/modify generated query + - **Simpler code**: No need to handle intermediate steps + + operationId: structuredQueryService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/query/StructuredQueryRequest.yaml' + examples: + simpleQuestion: + summary: Simple relationship question + value: + question: Who does Alice know? + user: alice + collection: research + complexQuestion: + summary: Complex multi-hop question + value: + question: What companies employ engineers that Bob collaborates with? + user: bob + collection: work + filterQuestion: + summary: Question with implicit filters + value: + question: Which researchers work on quantum computing? 
+ responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/query/StructuredQueryResponse.yaml' + examples: + successfulQuery: + summary: Successful query with results + value: + data: + person: + name: Alice + knows: + - name: Bob + email: bob@example.com + - name: Carol + email: carol@example.com + errors: [] + partialResults: + summary: Partial results with errors + value: + data: + person: + name: Alice + knows: null + errors: + - Cannot query field 'nonexistent' on type 'Person' + generationFailed: + summary: Query generation failed + value: + data: null + errors: [] + error: + type: QUERY_GENERATION_ERROR + message: Could not understand question structure + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/text-completion.yaml b/specs/api/paths/flow/text-completion.yaml new file mode 100644 index 00000000..7526d0c1 --- /dev/null +++ b/specs/api/paths/flow/text-completion.yaml @@ -0,0 +1,125 @@ +post: + tags: + - Flow Services + summary: Text completion - direct LLM generation + description: | + Direct text completion using LLM without retrieval augmentation. + + ## Text Completion Overview + + Pure LLM generation for: + - General knowledge questions + - Creative writing + - Code generation + - Analysis and reasoning + - Any task not requiring specific document/graph context + + ## System vs Prompt + + - **system**: Sets LLM behavior, role, constraints + - "You are a helpful assistant" + - "You are an expert Python developer" + - "Respond in JSON format" + - **prompt**: The actual user request/question + + ## Streaming + + Enable `streaming: true` to receive tokens as generated: + - Multiple messages with partial `response` + - Final message with `end-of-stream: true` + + Without streaming, returns complete response in single message. 
+ + ## Token Counting + + Response includes token usage: + - `in-token`: Input tokens (system + prompt) + - `out-token`: Generated tokens + - Useful for cost tracking and optimization + + ## When to Use + + Use text-completion when: + - No specific context needed (general knowledge) + - System prompt provides sufficient context + - Want direct control over prompting + + Use document-rag/graph-rag when: + - Need to ground response in specific documents + - Want to leverage knowledge graph relationships + - Require citations or provenance + + operationId: textCompletionService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/text-completion/TextCompletionRequest.yaml' + examples: + basicCompletion: + summary: Basic text completion + value: + system: You are a helpful assistant that provides concise answers. + prompt: Explain the concept of recursion in programming. + codeGeneration: + summary: Code generation with streaming + value: + system: You are an expert Python developer. Provide clean, well-documented code. + prompt: Write a function to calculate the Fibonacci sequence using memoization. + streaming: true + jsonResponse: + summary: Structured output request + value: + system: You are a JSON API. Respond only with valid JSON, no other text. + prompt: | + Extract key information from this text and return as JSON with fields: + title, author, year, summary. + + Text: "The Theory of Everything by Stephen Hawking (2006) explores..." 
+ responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/text-completion/TextCompletionResponse.yaml' + examples: + completeResponse: + summary: Complete non-streaming response + value: + response: | + Recursion is a programming technique where a function calls itself + to solve a problem by breaking it down into smaller, similar subproblems. + Each recursive call works on a simpler version until reaching a base case. + in-token: 45 + out-token: 128 + model: gpt-4 + end-of-stream: false + streamingChunk: + summary: Streaming response chunk + value: + response: "Recursion is a programming technique" + end-of-stream: false + streamingComplete: + summary: Streaming complete with tokens + value: + response: "" + in-token: 45 + out-token: 128 + model: gpt-4 + end-of-stream: true + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/text-load.yaml b/specs/api/paths/flow/text-load.yaml new file mode 100644 index 00000000..5f918a3a --- /dev/null +++ b/specs/api/paths/flow/text-load.yaml @@ -0,0 +1,111 @@ +post: + tags: + - Flow Services + summary: Text Load - load text documents + description: | + Load text documents into processing pipeline for indexing and embedding. + + ## Text Load Overview + + Fire-and-forget document loading: + - **Input**: Text content (base64 encoded) + - **Process**: Chunk, embed, store + - **Output**: None (202 Accepted) + + Asynchronous processing - document queued for background processing. + + ## Processing Pipeline + + Text documents go through: + 1. **Chunking**: Split into overlapping chunks + 2. **Embedding**: Generate vectors for each chunk + 3. **Storage**: Store chunks + embeddings + 4. **Indexing**: Make searchable via document-embeddings query + + Pipeline runs asynchronously after request returns. 
+ + ## Text Format + + Text must be base64 encoded: + ```python + import base64 + text_content = "This is the document..." + encoded = base64.b64encode(text_content.encode('utf-8')).decode('ascii') # JSON needs str, not bytes + ``` + + Default charset is UTF-8; specify `charset` if different. + + ## Metadata + + Optional RDF triples describing document: + - Title, author, date + - Source URL + - Custom properties + - Used for organization and retrieval + + ## Use Cases + + - **Document ingestion**: Add documents to knowledge base + - **Bulk loading**: Process multiple documents + - **Content updates**: Replace existing documents + - **Library integration**: Load from document library + + ## No Response Data + + Returns 202 Accepted immediately: + - Document queued for processing + - No synchronous result + - No processing status + - Query document-embeddings later to verify the document was indexed + + operationId: textLoadService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/loading/TextLoadRequest.yaml' + examples: + simpleLoad: + summary: Load text document + value: + text: VGhpcyBpcyB0aGUgZG9jdW1lbnQgdGV4dC4uLg== + id: doc-123 + user: alice + collection: research + withMetadata: + summary: Load with RDF metadata + value: + text: UXVhbnR1bSBjb21wdXRpbmcgdXNlcyBxdWFudHVtIG1lY2hhbmljcyBwcmluY2lwbGVzLi4u + id: doc-456 + user: alice + collection: research + metadata: + - s: {v: "doc-456", e: false} + p: {v: "http://purl.org/dc/terms/title", e: true} + o: {v: "Introduction to Quantum Computing", e: false} + - s: {v: "doc-456", e: false} + p: {v: "http://purl.org/dc/terms/creator", e: true} + o: {v: "Dr.
Alice Smith", e: false} + responses: + '202': + description: Document accepted for processing + content: + application/json: + schema: + type: object + properties: {} + example: {} + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/flow/triples.yaml b/specs/api/paths/flow/triples.yaml new file mode 100644 index 00000000..5557ea5a --- /dev/null +++ b/specs/api/paths/flow/triples.yaml @@ -0,0 +1,129 @@ +post: + tags: + - Flow Services + summary: Triples query - pattern-based graph queries + description: | + Query knowledge graph using subject-predicate-object patterns. + + ## Triples Query Overview + + Query RDF triples with flexible pattern matching: + - Specify subject, predicate, and/or object + - Any combination of filters (all optional) + - Returns matching triples up to limit + + ## Pattern Matching + + Pattern syntax supports: + - **All triples**: Omit all filters (returns everything up to limit) + - **Subject match**: Specify `s` only (all triples about that subject) + - **Predicate match**: Specify `p` only (all uses of that property) + - **Object match**: Specify `o` only (all triples with that value) + - **Combinations**: Any combination of s/p/o + + ## RDF Value Format + + Each component (s/p/o) uses RdfValue format: + - **Entity/URI**: `{"v": "https://example.com/entity", "e": true}` + - **Literal**: `{"v": "Some text", "e": false}` + + ## Query Examples + + Find all properties of an entity: + ```json + {"s": {"v": "https://example.com/person/alice", "e": true}} + ``` + + Find all instances of a type: + ```json + { + "p": {"v": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "e": true}, + "o": {"v": "https://example.com/type/Person", "e": true} + } + ``` + + Find specific relationship: + ```json + { + "s": {"v": "https://example.com/person/alice", "e": true}, + "p": {"v": "https://example.com/knows", "e": true} + } + ``` + + ## Performance + + - Default 
limit: 10,000 triples + - Max limit: 100,000 triples + - More specific patterns = faster queries + - Set an explicit `limit` for large result sets + + operationId: triplesQueryService + security: + - bearerAuth: [] + parameters: + - name: flow + in: path + required: true + schema: + type: string + description: Flow instance ID + example: my-flow + requestBody: + required: true + content: + application/json: + schema: + $ref: '../../components/schemas/query/TriplesQueryRequest.yaml' + examples: + allTriplesAboutEntity: + summary: All triples about an entity + value: + s: + v: https://example.com/person/alice + e: true + user: alice + collection: research + limit: 100 + allInstancesOfType: + summary: Find all instances of a type + value: + p: + v: http://www.w3.org/1999/02/22-rdf-syntax-ns#type + e: true + o: + v: https://example.com/type/Person + e: true + limit: 50 + specificRelationship: + summary: Find specific relationships + value: + p: + v: https://example.com/knows + e: true + user: alice + limit: 200 + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../../components/schemas/query/TriplesQueryResponse.yaml' + examples: + matchingTriples: + summary: Matching triples + value: + response: + - s: {v: "https://example.com/person/alice", e: true} + p: {v: "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", e: true} + o: {v: "https://example.com/type/Person", e: true} + - s: {v: "https://example.com/person/alice", e: true} + p: {v: "http://www.w3.org/2000/01/rdf-schema#label", e: true} + o: {v: "Alice", e: false} + - s: {v: "https://example.com/person/alice", e: true} + p: {v: "https://example.com/knows", e: true} + o: {v: "https://example.com/person/bob", e: true} + '401': + $ref: '../../components/responses/Unauthorized.yaml' + '500': + $ref: '../../components/responses/Error.yaml' diff --git a/specs/api/paths/import-core.yaml b/specs/api/paths/import-core.yaml new file mode 100644 index 00000000..38c99bf0 ---
/dev/null +++ b/specs/api/paths/import-core.yaml @@ -0,0 +1,106 @@ +post: + tags: + - Import/Export + summary: Import Core - bulk import triples and embeddings + description: | + Import knowledge cores in bulk using streaming MessagePack format. + + ## Import Core Overview + + Bulk data import for knowledge graph: + - **Format**: MessagePack streaming + - **Content**: Triples and/or graph embeddings + - **Target**: Global knowledge storage + - **Use**: Backup restoration, data migration, bulk loading + + ## MessagePack Protocol + + Request body is MessagePack stream with message tuples: + + ### Triple Message + ``` + ("t", { + "m": { // Metadata + "i": "core-id", // Knowledge core ID + "m": [...], // Metadata triples array + "u": "user", // User + "c": "collection" // Collection + }, + "t": [...] // Triples array + }) + ``` + + ### Graph Embeddings Message + ``` + ("ge", { + "m": { // Metadata + "i": "core-id", + "m": [...], + "u": "user", + "c": "collection" + }, + "e": [ // Entities array + { + "e": {"v": "uri", "e": true}, // Entity RdfValue + "v": [0.1, 0.2, ...] // Vectors + } + ] + }) + ``` + + ## Query Parameters + + - **id**: Knowledge core ID + - **user**: User identifier + + ## Streaming + + Multiple messages can be sent in stream. + Each message processed as received. + No response body - returns 202 Accepted. 
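The message tuples above can be sketched as plain Python data before encoding. The helper names (`rdf_value`, `core_metadata`, `triple_message`, `graph_embeddings_message`) are hypothetical, and the `s`/`p`/`o` shape of each triple is borrowed from the JSON examples elsewhere in this spec, so treat it as an assumption for the MessagePack stream.

```python
from typing import Iterable, List, Optional, Tuple


def rdf_value(value: str, is_entity: bool) -> dict:
    """Build an RdfValue: `e` is True for a URI, False for a literal."""
    return {"v": value, "e": is_entity}


def core_metadata(core_id: str, user: str, collection: str,
                  metadata_triples: Optional[List[dict]] = None) -> dict:
    """Build the `m` metadata block shared by both message kinds."""
    return {
        "i": core_id,
        "m": list(metadata_triples or []),
        "u": user,
        "c": collection,
    }


def triple_message(metadata: dict, triples: Iterable[dict]) -> Tuple[str, dict]:
    """Build a ("t", {...}) triple message."""
    return ("t", {"m": metadata, "t": list(triples)})


def graph_embeddings_message(
    metadata: dict, entities: Iterable[Tuple[dict, List[float]]]
) -> Tuple[str, dict]:
    """Build a ("ge", {...}) graph embeddings message."""
    return ("ge", {
        "m": metadata,
        "e": [{"e": entity, "v": vectors} for entity, vectors in entities],
    })


meta = core_metadata("core-123", "alice", "default")
alice = rdf_value("https://example.com/person/alice", True)
messages = [
    triple_message(meta, [{
        "s": alice,
        "p": rdf_value("https://example.com/knows", True),
        "o": rdf_value("https://example.com/person/bob", True),
    }]),
    graph_embeddings_message(meta, [(alice, [0.1, 0.2, 0.3])]),
]
```

Each tuple could then be serialized in order with a MessagePack library (for example `packb` from the third-party `msgpack` package, which encodes a tuple as an array) and written to the request body.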
+ + ## Use Cases + + - **Backup restoration**: Restore from export + - **Data migration**: Move data between systems + - **Bulk loading**: Initial knowledge base population + - **Replication**: Copy knowledge cores + + operationId: importCore + security: + - bearerAuth: [] + parameters: + - name: id + in: query + required: true + schema: + type: string + description: Knowledge core ID to import + example: core-123 + - name: user + in: query + required: true + schema: + type: string + description: User identifier + example: alice + requestBody: + required: true + content: + application/msgpack: + schema: + type: string + format: binary + description: MessagePack stream of knowledge data + responses: + '202': + description: Import accepted and processing + content: + application/json: + schema: + type: object + properties: {} + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/knowledge.yaml b/specs/api/paths/knowledge.yaml new file mode 100644 index 00000000..71bba496 --- /dev/null +++ b/specs/api/paths/knowledge.yaml @@ -0,0 +1,196 @@ +post: + tags: + - Knowledge + summary: Knowledge graph core management + description: | + Manage knowledge graph cores - persistent storage of triples and embeddings. + + ## Knowledge Cores + + Knowledge cores are the foundational storage units for: + - **Triples**: RDF triples representing knowledge graph data + - **Graph Embeddings**: Vector embeddings for entities + - **Metadata**: Descriptive information about the knowledge + + Each core has an ID, user, and collection for organization. + + ## Operations + + ### list-kg-cores + List all knowledge cores for a user. Returns array of core IDs. + + ### get-kg-core + Retrieve a knowledge core by ID. Returns triples and/or graph embeddings. + Response is streamed - may receive multiple messages followed by EOS marker. + + ### put-kg-core + Store triples and/or graph embeddings. 
Creates a new core or updates an existing one. + Can store triples only, embeddings only, or both together. + + ### delete-kg-core + Delete a knowledge core by ID. Removes all associated data. + + ### load-kg-core + Load a knowledge core into a running flow's collection. + Makes the data available for querying within that flow instance. + + ### unload-kg-core + Unload a knowledge core from a flow's collection. + Removes the data from the flow instance but does not delete the core. + + ## Streaming Responses + + The `get-kg-core` operation streams data in chunks: + 1. Multiple messages with `triples` or `graph-embeddings` + 2. Final message with `eos: true` to signal completion + + operationId: knowledgeService + security: + - bearerAuth: [] + requestBody: + required: true + content: + application/json: + schema: + $ref: '../components/schemas/knowledge/KnowledgeRequest.yaml' + examples: + listKnowledgeCores: + summary: List knowledge cores + value: + operation: list-kg-cores + user: alice + getKnowledgeCore: + summary: Get knowledge core + value: + operation: get-kg-core + id: core-123 + putTriplesOnly: + summary: Store triples + value: + operation: put-kg-core + triples: + metadata: + id: core-123 + user: alice + collection: default + metadata: + - s: {v: "https://example.com/core-123", e: true} + p: {v: "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", e: true} + o: {v: "https://trustgraph.ai/e/knowledge-core", e: true} + triples: + - s: {v: "https://example.com/entity1", e: true} + p: {v: "http://www.w3.org/2000/01/rdf-schema#label", e: true} + o: {v: "Entity 1", e: false} + - s: {v: "https://example.com/entity1", e: true} + p: {v: "https://example.com/relatedTo", e: true} + o: {v: "https://example.com/entity2", e: true} + putEmbeddingsOnly: + summary: Store embeddings + value: + operation: put-kg-core + graph-embeddings: + metadata: + id: core-123 + user: alice + collection: default + metadata: [] + entities: + - entity: {v: "https://example.com/entity1", e: true} + vectors:
[0.1, 0.2, 0.3, 0.4, 0.5] + - entity: {v: "https://example.com/entity2", e: true} + vectors: [0.6, 0.7, 0.8, 0.9, 1.0] + putTriplesAndEmbeddings: + summary: Store triples and embeddings together + value: + operation: put-kg-core + triples: + metadata: + id: core-456 + user: bob + collection: research + metadata: [] + triples: + - s: {v: "https://example.com/doc1", e: true} + p: {v: "http://purl.org/dc/terms/title", e: true} + o: {v: "Research Paper", e: false} + graph-embeddings: + metadata: + id: core-456 + user: bob + collection: research + metadata: [] + entities: + - entity: {v: "https://example.com/doc1", e: true} + vectors: [0.11, 0.22, 0.33] + deleteKnowledgeCore: + summary: Delete knowledge core + value: + operation: delete-kg-core + id: core-123 + user: alice + loadKnowledgeCore: + summary: Load core into flow + value: + operation: load-kg-core + id: core-123 + flow: my-flow + collection: default + unloadKnowledgeCore: + summary: Unload core from flow + value: + operation: unload-kg-core + id: core-123 + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../components/schemas/knowledge/KnowledgeResponse.yaml' + examples: + listKnowledgeCores: + summary: List of knowledge cores + value: + ids: + - core-123 + - core-456 + - core-789 + getKnowledgeCoreTriples: + summary: Knowledge core triples (streaming) + value: + triples: + metadata: + id: core-123 + user: alice + collection: default + metadata: + - s: {v: "https://example.com/core-123", e: true} + p: {v: "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", e: true} + o: {v: "https://trustgraph.ai/e/knowledge-core", e: true} + triples: + - s: {v: "https://example.com/entity1", e: true} + p: {v: "http://www.w3.org/2000/01/rdf-schema#label", e: true} + o: {v: "Entity 1", e: false} + getKnowledgeCoreEmbeddings: + summary: Knowledge core embeddings (streaming) + value: + graph-embeddings: + metadata: + id: core-123 + user: alice + collection: default +
metadata: [] + entities: + - entity: {v: "https://example.com/entity1", e: true} + vectors: [0.1, 0.2, 0.3, 0.4, 0.5] + endOfStream: + summary: End of stream marker + value: + eos: true + deleteSuccess: + summary: Delete successful (empty response) + value: {} + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/librarian.yaml b/specs/api/paths/librarian.yaml new file mode 100644 index 00000000..ffbc6d9c --- /dev/null +++ b/specs/api/paths/librarian.yaml @@ -0,0 +1,153 @@ +post: + tags: + - Librarian + summary: Document library management + description: | + Manage document library: add, remove, list documents, and control processing. + + ## Document Library + + The librarian service manages a persistent library of documents that can be: + - Added with metadata for organization + - Queried and filtered by criteria + - Processed through flows on-demand or continuously + - Tracked for processing status + + ## Operations + + ### add-document + Add a document to the library with metadata (URL, title, author, etc.). + Documents can be added by URL or with inline content. + + ### remove-document + Remove a document from the library by document ID or URL. + + ### list-documents + List all documents in the library, optionally filtered by criteria. + + ### start-processing + Start processing library documents through a flow. Documents are queued + for processing and handled asynchronously. + + ### stop-processing + Stop ongoing library document processing. + + ### list-processing + List current processing tasks and their status. 
+ + operationId: librarianService + security: + - bearerAuth: [] + requestBody: + required: true + content: + application/json: + schema: + $ref: '../components/schemas/librarian/LibrarianRequest.yaml' + examples: + addDocumentByUrl: + summary: Add document by URL + value: + operation: add-document + flow: my-flow + collection: default + document-metadata: + url: https://example.com/document.pdf + title: Example Document + author: John Doe + metadata: + department: Engineering + category: Technical + addDocumentInline: + summary: Add document with inline content + value: + operation: add-document + flow: my-flow + collection: default + content: "This is the document content..." + document-metadata: + title: Inline Document + author: Jane Smith + removeDocument: + summary: Remove document + value: + operation: remove-document + flow: my-flow + collection: default + document-metadata: + url: https://example.com/document.pdf + listDocuments: + summary: List all documents + value: + operation: list-documents + flow: my-flow + collection: default + listDocumentsFiltered: + summary: List documents with criteria + value: + operation: list-documents + flow: my-flow + collection: default + criteria: + - key: author + value: John Doe + operator: eq + - key: department + value: Engineering + operator: eq + startProcessing: + summary: Start processing library documents + value: + operation: start-processing + flow: my-flow + collection: default + stopProcessing: + summary: Stop processing + value: + operation: stop-processing + flow: my-flow + collection: default + listProcessing: + summary: List processing status + value: + operation: list-processing + flow: my-flow + collection: default + responses: + '200': + description: Successful response + content: + application/json: + schema: + $ref: '../components/schemas/librarian/LibrarianResponse.yaml' + examples: + listDocuments: + summary: List of documents + value: + document-metadatas: + - url: https://example.com/doc1.pdf + 
title: Document 1 + author: John Doe + metadata: + department: Engineering + - url: https://example.com/doc2.pdf + title: Document 2 + author: Jane Smith + metadata: + department: Research + listProcessing: + summary: Processing status + value: + processing-metadatas: + - flow: my-flow + collection: default + status: processing + timestamp: "2024-01-15T10:30:00Z" + - flow: my-flow + collection: default + status: completed + timestamp: "2024-01-15T10:25:00Z" + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/metrics-path.yaml b/specs/api/paths/metrics-path.yaml new file mode 100644 index 00000000..50c8b840 --- /dev/null +++ b/specs/api/paths/metrics-path.yaml @@ -0,0 +1,29 @@ +get: + tags: + - Metrics + summary: Metrics - Prometheus metrics with path + description: | + Proxy to Prometheus metrics with optional path parameter. + + operationId: getMetricsPath + security: + - bearerAuth: [] + parameters: + - name: path + in: path + required: true + schema: + type: string + description: Path to specific metrics endpoint + example: query + responses: + '200': + description: Prometheus metrics + content: + text/plain: + schema: + type: string + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/metrics.yaml b/specs/api/paths/metrics.yaml new file mode 100644 index 00000000..0fe65438 --- /dev/null +++ b/specs/api/paths/metrics.yaml @@ -0,0 +1,71 @@ +get: + tags: + - Metrics + summary: Metrics - Prometheus metrics endpoint + description: | + Proxy to Prometheus metrics for system monitoring. 
+ + ## Metrics Overview + + Exposes system metrics via Prometheus format: + - **Gateway metrics**: Request rates, latencies, errors + - **Flow metrics**: Processing throughput, queue depths + - **System metrics**: Resource usage, health status + + ## Prometheus Format + + Returns metrics in Prometheus text exposition format: + ``` + # HELP metric_name Description + # TYPE metric_name counter + metric_name{label="value"} 123.45 + ``` + + ## Available Metrics + + Common metrics include: + - Request count and rates + - Response times (histograms) + - Error rates + - Active connections + - Queue depths + - Processing latencies + + ## Integration + + Standard Prometheus scraping: + - Configure Prometheus to scrape `/api/metrics` + - Set appropriate scrape interval + - Use bearer token if authentication enabled + + ## Path Parameter + + The `{path}` parameter allows querying specific Prometheus endpoints + or metrics if the backend Prometheus supports it. + + operationId: getMetrics + security: + - bearerAuth: [] + responses: + '200': + description: Prometheus metrics + content: + text/plain: + schema: + type: string + example: | + # HELP http_requests_total Total HTTP requests + # TYPE http_requests_total counter + http_requests_total{method="POST",endpoint="/api/v1/flow/my-flow/service/agent"} 1234 + + # HELP http_request_duration_seconds HTTP request latency + # TYPE http_request_duration_seconds histogram + http_request_duration_seconds_bucket{le="0.1"} 500 + http_request_duration_seconds_bucket{le="0.5"} 950 + http_request_duration_seconds_bucket{le="1.0"} 990 + http_request_duration_seconds_sum 450.5 + http_request_duration_seconds_count 1000 + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/paths/websocket.yaml b/specs/api/paths/websocket.yaml new file mode 100644 index 00000000..168ee1e4 --- /dev/null +++ b/specs/api/paths/websocket.yaml @@ -0,0 +1,185 @@ +get: + tags: + - 
WebSocket + summary: WebSocket - multiplexed service interface + description: | + WebSocket interface providing multiplexed access to all TrustGraph services over a single persistent connection. + + ## Overview + + The WebSocket API provides access to the same services as the REST API but with: + - **Multiplexed**: Multiple concurrent requests over one connection + - **Asynchronous**: Non-blocking request/response with ID matching + - **Efficient**: Reduced overhead compared to HTTP + - **Real-time**: Low latency bidirectional communication + + ## Connection + + Establish WebSocket connection to: + ``` + ws://localhost:8088/api/v1/socket + ``` + + ## Message Protocol + + All messages are JSON objects with the following structure: + + ### Request Message Format + + **Global Service Request** (no flow parameter): + ```json + { + "id": "req-123", + "service": "config", + "request": { + "operation": "list", + "type": "flow" + } + } + ``` + + **Flow-Hosted Service Request** (with flow parameter): + ```json + { + "id": "req-456", + "service": "agent", + "flow": "my-flow", + "request": { + "question": "What is quantum computing?", + "streaming": true + } + } + ``` + + **Request Fields**: + - `id` (string, required): Client-generated unique identifier for this request within the session. Used to match responses to requests. + - `service` (string, required): Service identifier (e.g., "config", "agent", "document-rag"). Same as `{kind}` in REST URLs. + - `flow` (string, optional): Flow ID for flow-hosted services. Omit for global services. + - `request` (object, required): Service-specific request payload. Same structure as REST API request body. 
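As a rough illustration of the request fields above, the envelope can be built with a small helper. This is a sketch only: `build_request` is a hypothetical name, and generating the `id` from a UUID is one possible choice, since the spec only requires uniqueness within the session.

```python
import json
import uuid
from typing import Optional


def build_request(service: str, request: dict,
                  flow: Optional[str] = None,
                  request_id: Optional[str] = None) -> str:
    """Serialize a WebSocket request message as a JSON string."""
    message = {
        # The id only needs to be unique within the session.
        "id": request_id or f"req-{uuid.uuid4().hex}",
        "service": service,
        "request": request,
    }
    # `flow` is included only for flow-hosted services.
    if flow is not None:
        message["flow"] = flow
    return json.dumps(message)


global_msg = build_request(
    "config", {"operation": "list", "type": "flow"}, request_id="req-123"
)
flow_msg = build_request(
    "agent",
    {"question": "What is quantum computing?", "streaming": True},
    flow="my-flow",
    request_id="req-456",
)
```

The resulting strings match the two request examples above and can be sent as text frames by any WebSocket client.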
+ + ### Response Message Format + + **Success Response**: + ```json + { + "id": "req-123", + "response": { + "chunk-type": "answer", + "content": "Quantum computing uses...", + "end-of-stream": false + } + } + ``` + + **Error Response**: + ```json + { + "id": "req-123", + "error": { + "type": "gateway-error", + "message": "Flow not found" + } + } + ``` + + **Response Fields**: + - `id` (string, required): Matches the `id` from the request. Client uses this to correlate responses. + - `response` (object, conditional): Service-specific response payload. Same structure as REST API response. Present on success. + - `error` (object, conditional): Error information with `type` and `message` fields. Present on failure. + + ## Service Routing + + The WebSocket protocol routes to services using message parameters instead of URL paths: + + | REST Endpoint | WebSocket Message | + |--------------|-------------------| + | `POST /api/v1/config` | `{"service": "config"}` | + | `POST /api/v1/flow/{flow}/service/agent` | `{"service": "agent", "flow": "my-flow"}` | + + **Global Services** (no `flow` parameter): + - `config` - Configuration management + - `flow` - Flow lifecycle and blueprints + - `librarian` - Document library management + - `knowledge` - Knowledge graph core management + - `collection-management` - Collection metadata + + **Flow-Hosted Services** (require `flow` parameter): + - AI services: `agent`, `text-completion`, `prompt`, `document-rag`, `graph-rag` + - Embeddings: `embeddings`, `graph-embeddings`, `document-embeddings` + - Query: `triples`, `objects`, `nlp-query`, `structured-query` + - Data loading: `text-load`, `document-load` + - Utilities: `mcp-tool`, `structured-diag` + + ## Request/Response Schemas + + The `request` and `response` fields use **identical schemas** to the REST API for each service. + See individual service documentation for detailed request/response formats. 
+ + ## Multiplexing and Asynchronous Operation + + Multiple requests can be in flight simultaneously: + - Client sends requests with unique `id` values + - Server processes requests concurrently + - Responses arrive asynchronously and may be out of order + - Client matches responses to requests using the `id` field + - No head-of-line blocking + + **Example concurrent requests**: + ```json + {"id": "req-1", "service": "config", "request": {...}} + {"id": "req-2", "service": "agent", "flow": "f1", "request": {...}} + {"id": "req-3", "service": "document-rag", "flow": "f2", "request": {...}} + ``` + + Responses may arrive in any order: `req-2`, `req-1`, `req-3` + + ## Streaming Responses + + Services that support streaming (e.g., agent, RAG) send multiple response messages with the same `id`: + ```json + {"id": "req-1", "response": {"chunk-type": "thought", "content": "...", "end-of-stream": false}} + {"id": "req-1", "response": {"chunk-type": "answer", "content": "...", "end-of-stream": false}} + {"id": "req-1", "response": {"chunk-type": "answer", "content": "...", "end-of-stream": true}} + ``` + + The `end-of-stream` flag (or service-specific completion flag) indicates the final message. 
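The correlation and streaming rules above can be sketched as a small collector that groups incoming messages by `id`. The class name `ResponseCollector` is hypothetical, and the sketch only recognizes the `end-of-stream` flag; a service with a different completion flag, or a single non-streamed response without the flag, would need extra handling.

```python
import json
from collections import defaultdict


class ResponseCollector:
    """Group interleaved WebSocket responses by request id."""

    def __init__(self) -> None:
        self.pending = defaultdict(list)  # id -> chunks received so far
        self.completed = {}               # id -> all chunks, in arrival order
        self.errors = {}                  # id -> error object

    def feed(self, raw: str) -> None:
        """Process one raw JSON message from the socket."""
        msg = json.loads(raw)
        request_id = msg["id"]
        if "error" in msg:
            # An error ends the request; drop any partial chunks.
            self.errors[request_id] = msg["error"]
            self.pending.pop(request_id, None)
            return
        response = msg["response"]
        self.pending[request_id].append(response)
        if response.get("end-of-stream"):
            self.completed[request_id] = self.pending.pop(request_id)
```

Because every message carries its own `id`, chunks for different requests can be fed in any interleaving without affecting the per-request order.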
+ + ## Authentication + + When `GATEWAY_SECRET` is set, include bearer token: + - As query parameter: `ws://localhost:8088/api/v1/socket?token=` + - Or in WebSocket subprotocol header + + ## Benefits Over REST + + - **Lower latency**: No TCP/TLS handshake per request + - **Connection reuse**: Single persistent connection + - **Reduced overhead**: No HTTP headers per message + - **True streaming**: Bidirectional real-time communication + - **Efficient multiplexing**: Concurrent operations without connection pooling + + operationId: websocketConnection + security: + - bearerAuth: [] + parameters: + - name: Upgrade + in: header + required: true + schema: + type: string + enum: [websocket] + description: WebSocket upgrade header + - name: Connection + in: header + required: true + schema: + type: string + enum: [Upgrade] + description: Connection upgrade header + responses: + '101': + description: Switching Protocols - WebSocket connection established + '401': + $ref: '../components/responses/Unauthorized.yaml' + '500': + $ref: '../components/responses/Error.yaml' diff --git a/specs/api/security/bearerAuth.yaml b/specs/api/security/bearerAuth.yaml new file mode 100644 index 00000000..ca776645 --- /dev/null +++ b/specs/api/security/bearerAuth.yaml @@ -0,0 +1,12 @@ +type: http +scheme: bearer +description: | + Bearer token authentication. + + Set via `GATEWAY_SECRET` environment variable on the gateway. + If `GATEWAY_SECRET` is not set, authentication is disabled (development mode). + + Example: + ``` + Authorization: Bearer your-secret-token + ```