Quoted triple fixes, including...
1. Updated triple_provenance_triples() in triples.py:
- Now accepts a Triple object directly
- Creates the reification triple using TRIPLE term type: stmt_uri tg:reifies
<<extracted_triple>>
- Includes it in the returned provenance triples
2. Updated definitions extractor:
- Added imports for provenance functions and component version
- Added ParameterSpec for optional llm-model and ontology flow parameters
- For each definition triple, generates provenance with reification
3. Updated relationships extractor:
- Same changes as definitions extractor
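A minimal sketch of the reification shape described in item 1, for readers unfamiliar with quoted triples. The `Term` and `Triple` classes, the `source_uri` argument, and the `prov:wasDerivedFrom` predicate are illustrative assumptions and may not match the definitions in triples.py; only the `stmt_uri tg:reifies <<extracted_triple>>` pattern comes from the notes above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical term kinds; the real schema may name these differently.
IRI, LITERAL, TRIPLE = "iri", "literal", "triple"


@dataclass(frozen=True)
class Term:
    type: str
    value: Optional[str] = None        # used by IRI and LITERAL terms
    triple: Optional["Triple"] = None  # used by TRIPLE (quoted triple) terms


@dataclass(frozen=True)
class Triple:
    s: Term
    p: Term
    o: Term


def iri(value: str) -> Term:
    return Term(type=IRI, value=value)


def triple_provenance_triples(
    stmt_uri: str, extracted: Triple, source_uri: str
) -> List[Triple]:
    """Return provenance triples for one extracted triple (illustrative)."""
    # stmt_uri tg:reifies <<extracted_triple>>
    reification = Triple(
        s=iri(stmt_uri),
        p=iri("tg:reifies"),
        o=Term(type=TRIPLE, triple=extracted),
    )
    # Assumed extra provenance statement; the predicate is not from the notes.
    derived = Triple(
        s=iri(stmt_uri),
        p=iri("prov:wasDerivedFrom"),
        o=iri(source_uri),
    )
    return [reification, derived]
```

The object of the reification triple carries the whole extracted triple, so a store that understands the TRIPLE term type can attach provenance to the statement itself rather than to its subject.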
Tech spec
BlobStore (trustgraph-flow/trustgraph/librarian/blob_store.py):
- get_stream() - yields document content in chunks for streaming retrieval
- create_multipart_upload() - initializes S3 multipart upload, returns
upload_id
- upload_part() - uploads a single part, returns etag
- complete_multipart_upload() - finalizes upload with part etags
- abort_multipart_upload() - cancels and cleans up
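The call sequence for the multipart methods listed above can be sketched with an in-memory stand-in. This is not the real BlobStore: the class name, the argument lists, and the use of a SHA-256 digest as the etag are all assumptions made so the example runs without S3.

```python
import hashlib
import uuid
from typing import Dict, Iterator, List, Tuple


class InMemoryBlobStore:
    """Toy stand-in for an S3-backed store (hypothetical signatures)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._uploads: Dict[str, Dict[int, bytes]] = {}

    def create_multipart_upload(self, object_id: str) -> str:
        upload_id = uuid.uuid4().hex
        self._uploads[upload_id] = {}
        return upload_id

    def upload_part(
        self, object_id: str, upload_id: str, part_number: int, data: bytes
    ) -> str:
        self._uploads[upload_id][part_number] = data
        return hashlib.sha256(data).hexdigest()  # stand-in for an S3 etag

    def complete_multipart_upload(
        self, object_id: str, upload_id: str, parts: List[Tuple[int, str]]
    ) -> None:
        stored = self._uploads[upload_id]
        ordered = sorted(parts)
        # Check every etag before assembling, so a bad part leaves the
        # upload session intact for a retry.
        for number, etag in ordered:
            if hashlib.sha256(stored[number]).hexdigest() != etag:
                raise ValueError(f"etag mismatch for part {number}")
        self._objects[object_id] = b"".join(stored[n] for n, _ in ordered)
        del self._uploads[upload_id]

    def abort_multipart_upload(self, object_id: str, upload_id: str) -> None:
        self._uploads.pop(upload_id, None)

    def get_stream(self, object_id: str, chunk_size: int = 1024) -> Iterator[bytes]:
        data = self._objects[object_id]
        for start in range(0, len(data), chunk_size):
            yield data[start:start + chunk_size]
```

A caller would create the upload, send each part while collecting `(part_number, etag)` pairs, then either complete with those pairs or abort to discard the partial data.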
Cassandra schema (trustgraph-flow/trustgraph/tables/library.py):
- New upload_session table with 24-hour TTL
- Index on user for listing sessions
- Prepared statements for all operations
- Methods: create_upload_session(), get_upload_session(),
update_upload_session_chunk(), delete_upload_session(),
list_upload_sessions()
- Schema extended with UploadSession, UploadProgress, and new
request/response fields
- Librarian methods: begin_upload, upload_chunk, complete_upload,
abort_upload, get_upload_status, list_uploads
- Service routing for all new operations
- Python SDK with transparent chunked upload:
- add_document() auto-switches to chunked for files > 10MB
- Progress callback support (on_progress)
- get_pending_uploads(), get_upload_status(), abort_upload(),
resume_upload()
- Document table: Added parent_id and document_type columns with index
- Document schema (knowledge/document.py): Added document_id field for
streaming retrieval
- Librarian operations:
- add-child-document for extracted PDF pages
- list-children to get child documents
- stream-document for chunked content retrieval
- Cascade delete removes children when parent is deleted
- list-documents filters children by default
- PDF decoder (decoding/pdf/pdf_decoder.py): Updated to stream large
documents from librarian API to temp file
- Librarian service (librarian/service.py): Sends document_id instead of
content for large PDFs (>2MB)
- Deprecated tools (load_pdf.py, load_text.py): Added deprecation
warnings directing users to tg-add-library-document +
tg-start-library-processing
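The transparent switch to chunked upload in the Python SDK item above can be illustrated roughly as follows. Only the 10MB threshold and the `on_progress` callback name come from the notes; the `send_whole` and `send_chunk` callables, the 5MB chunk size, and the returned mode string are hypothetical stand-ins for the real transport.

```python
from typing import Callable, List, Optional

CHUNKED_THRESHOLD = 10 * 1024 * 1024  # files larger than 10MB go chunked
CHUNK_SIZE = 5 * 1024 * 1024          # assumed chunk size, not from the notes


def split_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def add_document(
    data: bytes,
    send_whole: Callable[[bytes], None],
    send_chunk: Callable[[int, bytes], None],
    on_progress: Optional[Callable[[int, int], None]] = None,
    threshold: int = CHUNKED_THRESHOLD,
    chunk_size: int = CHUNK_SIZE,
) -> str:
    """Upload in one request when small, otherwise chunk by chunk."""
    if len(data) <= threshold:
        send_whole(data)
        return "single"

    sent = 0
    for index, chunk in enumerate(split_chunks(data, chunk_size)):
        send_chunk(index, chunk)
        sent += len(chunk)
        if on_progress is not None:
            on_progress(sent, len(data))  # bytes sent so far, total bytes
    return "chunked"
```

Because the caller sees a single `add_document()` entry point either way, existing code that uploads small files keeps working unchanged.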
Remove load_pdf and load_text utils
Move chunker/librarian comms to base class
Updating tests
* Added CLI tools tg-invoke-graph-embeddings, tg-invoke-document-embeddings,
and tg-invoke-embeddings. Mainly useful for diagnostics.
* Fix tg-load-knowledge
* Changed schema from Value to Term; this is a major breaking change
* Following the schema change, propagated Value -> Term through all processing
* Updated Cassandra for g, p, s, o index patterns (7 indexes)
* Reviewed and updated all tests
* Neo4j, Memgraph and FalkorDB remain broken; will revisit once the schema has settled down
* Plugin architecture for messaging fabric
* Schemas use a technology-neutral expression
* Schema strictness uncovered some incorrect schema use, which is now fixed
* Tech spec
* Python CLI utilities updated to use the API including streaming features
* Added type safety to Python API
* Completed missing auth token support in CLI
* Tweak the structured query schema
* Structured query service
* Gateway support for nlp-query and structured-query
* API support
* Added CLI
* Update tests
* More tests
Key Features
- MCP Tool Integration: Added core MCP tool support with ToolClientSpec and ToolClient classes
- API Enhancement: New mcp_tool method for flow-specific tool invocation
- CLI Tooling: New tg-invoke-mcp-tool command for testing MCP integration
- React Agent Enhancement: Fixed and improved multi-tool invocation capabilities
- Tool Management: Enhanced CLI for tool configuration and management
Changes
- Added MCP tool invocation to API with flow-specific integration
- Implemented ToolClientSpec and ToolClient for tool call handling
- Updated agent-manager-react to invoke MCP tools with configurable types
- Enhanced CLI with new commands and improved help text
- Added comprehensive documentation for new CLI commands
- Improved tool configuration management
Testing
- Added tg-invoke-mcp-tool CLI command for isolated MCP integration testing
- Enhanced agent capability to invoke multiple tools simultaneously
* Working mux socket
* Change API to incorporate flow
* Add Flow ID to all relevant CLIs (not yet fully implemented)
* Change tg-processor-state to use API gateway
* Updated all CLIs
* New tg-show-flow-state command
* tg-show-flow-state shows classes too
* Fixed error reporting in config
* Updated tg-init-pulsar to be able to load initial config to config-svc
* Tweaked API naming and added more config calls
* Tools to dump out prompts and agent tools
The configuration service provides an API to change configuration. The complete configuration is pushed down a config queue so that consumers always have a complete copy of the config object.
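The push model described above, where every change publishes the complete configuration rather than a delta, might look roughly like this. The class, the `version` counter, the two-level `type`/`key` layout, and the use of `queue.Queue` in place of a Pulsar topic are all assumptions for illustration.

```python
import copy
import queue
from typing import Any, Dict, List


class ConfigService:
    """Toy config service that pushes full snapshots to subscribers."""

    def __init__(self) -> None:
        self._config: Dict[str, Dict[str, Any]] = {}
        self._version = 0
        self._subscribers: List[queue.Queue] = []

    def subscribe(self) -> queue.Queue:
        q: queue.Queue = queue.Queue()
        self._subscribers.append(q)
        self._push_to(q)  # assumption: new subscribers get a snapshot at once
        return q

    def put(self, type_: str, key: str, value: Any) -> None:
        self._config.setdefault(type_, {})[key] = value
        self._version += 1
        for q in self._subscribers:
            self._push_to(q)

    def _push_to(self, q: queue.Queue) -> None:
        # Deep copy so later changes cannot mutate a snapshot already queued.
        q.put({"version": self._version, "config": copy.deepcopy(self._config)})
```

Pushing whole snapshots means a consumer that misses a message can recover from the next one without replaying history, at the cost of larger messages.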
* Change document-rag and graph-rag processing so that the user can
specify parameters. Changes in Pulsar services, Pulsar message
schemas, gateway and command-line tools. User-visible changes in
new parameters on command-line tools.
* Fix bugs, graph-rag working
* Get subgraph truncation in the right place
* Graph RAG and document RAG working and configurable
* Multi-hop path traversal GraphRAG
* Add safety valve for path_size set too high
* Acknowledge messages from Pulsar, doh!
* Change API to deliver a boolean e indicating whether a value is an entity
* Change loaders to use new API
* Changes, entity-aware API is complete
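The multi-hop traversal, the path_size safety valve, and the subgraph truncation mentioned above can be sketched together as a bounded breadth-first walk. This is a hypothetical illustration, not the GraphRAG implementation: the `MAX_PATH_SIZE` ceiling of 5, the `max_subgraph_size` parameter, and the in-memory edge list are assumptions.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (subject, predicate, object)

MAX_PATH_SIZE = 5  # assumed ceiling; the real limit is not stated in the notes


def multi_hop_subgraph(
    edges: List[Edge],
    seeds: List[str],
    path_size: int,
    max_subgraph_size: int = 100,
) -> List[Edge]:
    """Collect edges reachable from the seeds within a bounded hop count."""
    # Safety valve: clamp an oversized path_size instead of trusting it.
    hops = max(0, min(path_size, MAX_PATH_SIZE))

    by_subject: Dict[str, List[Edge]] = {}
    for edge in edges:
        by_subject.setdefault(edge[0], []).append(edge)

    visited: Set[str] = set(seeds)
    seen_edges: Set[Edge] = set()
    frontier = list(seeds)
    subgraph: List[Edge] = []

    for _ in range(hops):
        next_frontier: List[str] = []
        for node in frontier:
            for edge in by_subject.get(node, []):
                if edge in seen_edges:
                    continue
                # Truncate during traversal so the walk stops early.
                if len(subgraph) >= max_subgraph_size:
                    return subgraph
                seen_edges.add(edge)
                subgraph.append(edge)
                if edge[2] not in visited:
                    visited.add(edge[2])
                    next_frontier.append(edge[2])
        frontier = next_frontier
    return subgraph
```

Clamping the hop count keeps a misconfigured path_size from fanning out across the whole graph, and checking the size limit inside the loop avoids collecting edges that would only be discarded afterwards.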