mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-05-18 20:05:13 +02:00
* CLI auth migration, document embeddings core lifecycle (#913) Migrate get_kg_core and put_kg_core CLI tools to use Api/SocketClient with first-frame auth (fixes broken raw websocket path). Fix wire format field names (root/vector). Remove ~600 lines of dead raw websocket code from invoke_graph_rag.py. Add document embeddings core lifecycle to the knowledge service: list/get/put/delete/load operations across schema, translator, Cassandra table store, knowledge manager, gateway registry, REST API, socket client, and CLI (tg-get-de-core, tg-put-de-core). Fix delete_kg_core to also clean up document embeddings rows. * Remove spurious workspace parameter from SPARQL algebra evaluator (#915) Fix threading of workspace paramater: - The SPARQL algebra evaluator was threading a workspace parameter through every function and passing it to TriplesClient.query(), which doesn't accept it. Workspace isolation is handled by pub/sub topic routing — the TriplesClient is already scoped to a workspace-specific flow, same as GraphRAG. Passing workspace explicitly was both incorrect and unnecessary. Update tests: - tests/unit/test_query/test_sparql_algebra.py (new) — Tests _query_pattern, _eval_bgp, and evaluate() with various algebra nodes. Key tests assert workspace is never in tc.query() kwargs, plus correctness tests for BGP, JOIN, UNION, SLICE, DISTINCT, and edge cases. - tests/unit/test_retrieval/test_graph_rag.py — Added test_triples_query_never_passes_workspace (checks query()) and test_follow_edges_never_passes_workspace (checks query_stream()). * Make all Cassandra and Qdrant I/O async-safe with proper concurrency controls (#916) Cassandra triples services were using syncronous EntityCentricKnowledgeGraph methods from async contexts, and connection state was managed with threading.local which is wrong for asyncio coroutines sharing a single thread. Qdrant services had no async wrapping at all, blocking the event loop on every network call. Rows services had unprotected shared state mutations across concurrent coroutines. - Add async methods to EntityCentricKnowledgeGraph (async_insert, async_get_s/p/o/sp/po/os/spo/all, async_collection_exists, async_create_collection, async_delete_collection) using the existing cassandra_async.async_execute bridge - Rewrite triples write + query services: replace threading.local with asyncio.Lock + dict cache for per-workspace connections, use async ECKG methods for all data operations, keep asyncio.to_thread only for one-time blocking ECKG construction - Wrap all Qdrant calls in asyncio.to_thread across all 6 services (doc/graph/row embeddings write + query), add asyncio.Lock + set cache for collection existence checks - Add asyncio.Lock to rows write + query services to protect shared state (schemas, sessions, config caches) from concurrent mutation - Update all affected tests to match new async patterns * Fixed error only returning a page of results (#921) The root cause: async_execute only materialises the first result page (by design — it says so in its docstring). The streaming query set fetch_size=20 and expected to iterate all results, but only got the first 20 rows back. The fix uses asyncio.to_thread(lambda: list(tg.session.execute(...))) which lets the sync driver iterate all pages in a worker thread — exactly what the pre-async code did. * Optional test warning suppression (#923) * Fix test collection module errors & silence upstream Pytest warnings (#823) * chore: add virtual environment and .env directories to gitignore * test: filter upstream DeprecationWarning and UserWarning messages * fix(namespace): remove empty __init__.py files to fix PEP 420 implicit namespace routing for trustgraph sub-packages * Revert __init__.py deletions * Add .ini changes but commented out, will be useful at times --------- Co-authored-by: Salil M <d2kyt@protonmail.com> * fix(openai): fail fast on unrecoverable RateLimitError codes (#901) (#904) (#925) Co-authored-by: Sahil Yadav <sahilyadav.sy2004@gmail.com> * Ensure retry exception is properly raised (#926) * fix: library API get/update document round-trip bugs (#893) (#928) Fix 5 cascading bugs in the Library API wrapper that prevented the get_documents → update_document round-trip from working: - Tolerate missing title field in document metadata (use .get()) - Use attribute access on Triple objects instead of subscript - Serialize datetime to int seconds for JSON compatibility - Handle empty server response on successful update - Send both id and document-id keys in update request Added library API tests * Fix ontology selector defaults, add bypass mode, enforce domain/range (#929) - Align similarity_threshold default to 0.3 everywhere (class signature had stale 0.7). Fix matching contradiction in tech-spec. - Add bypass_selector_below parameter (default 5) to skip vector similarity selection when ontology element count is small enough. - Enforce domain/range constraints in TripleConverter for object properties and datatype properties, with subclass hierarchy support. Properties with no declared domain/range pass through unchanged. - Add unit tests for domain/range validation, subclass acceptance, polymorphic pass-through, and selector bypass. Fixes #908, #920 * Close producers on flow stop to prevent stale non-persistent topics (#930) Flow.stop() only stopped consumers, leaving response producers connected to non-persistent Pulsar topics. After flow restart, the orphaned producers held stale broker routing state, causing response messages to never reach new consumers — manifesting as 120s timeouts on document-embeddings and similar RPC paths. Fix: Flow.stop() now explicitly stops all producers. Producer.stop() closes the underlying Pulsar producer connection rather than just setting a flag. Fixes #906 * fix(gateway): propagate --timeout flag to per-service dispatchers (#931) The api-gateway accepts a --timeout flag (default 600s) but the value was not propagated into DispatcherManager, which hard-coded timeout=120 for every per-service dispatcher (graph-rag, document-rag, text-completion, embeddings, librarian, etc.). This meant any synchronous request taking more than 120 seconds would always return a Timeout error at the 120s mark, regardless of the --timeout value set on the gateway. Changes: - Add timeout parameter to DispatcherManager.__init__ (default: 120 for backward compatibility) - Store self.timeout in DispatcherManager - Replace both hardcoded timeout=120 with self.timeout in invoke_global_service and invoke_flow_service - Pass self.timeout from Api to DispatcherManager in service.py - Document the timeout parameter in the docstring Fixes #894 --------- Co-authored-by: Salil M <d2kyt@protonmail.com> Co-authored-by: Sahil Yadav <sahilyadav.sy2004@gmail.com> Co-authored-by: Mister Lobster <jlaportebot@gmail.com> |
||
|---|---|---|
| .. | ||
| tech-specs | ||
| api-gateway-changes-v1.8-to-v2.1.ar.md | ||
| api-gateway-changes-v1.8-to-v2.1.es.md | ||
| api-gateway-changes-v1.8-to-v2.1.he.md | ||
| api-gateway-changes-v1.8-to-v2.1.hi.md | ||
| api-gateway-changes-v1.8-to-v2.1.pt.md | ||
| api-gateway-changes-v1.8-to-v2.1.ru.md | ||
| api-gateway-changes-v1.8-to-v2.1.sw.md | ||
| api-gateway-changes-v1.8-to-v2.1.tr.md | ||
| api-gateway-changes-v1.8-to-v2.1.zh-cn.md | ||
| api.html | ||
| cli-changes-v1.8-to-v2.1.ar.md | ||
| cli-changes-v1.8-to-v2.1.es.md | ||
| cli-changes-v1.8-to-v2.1.he.md | ||
| cli-changes-v1.8-to-v2.1.hi.md | ||
| cli-changes-v1.8-to-v2.1.pt.md | ||
| cli-changes-v1.8-to-v2.1.ru.md | ||
| cli-changes-v1.8-to-v2.1.sw.md | ||
| cli-changes-v1.8-to-v2.1.tr.md | ||
| cli-changes-v1.8-to-v2.1.zh-cn.md | ||
| contributor-licence-agreement.ar.md | ||
| contributor-licence-agreement.es.md | ||
| contributor-licence-agreement.he.md | ||
| contributor-licence-agreement.hi.md | ||
| contributor-licence-agreement.md | ||
| contributor-licence-agreement.pt.md | ||
| contributor-licence-agreement.ru.md | ||
| contributor-licence-agreement.sw.md | ||
| contributor-licence-agreement.tr.md | ||
| contributor-licence-agreement.zh-cn.md | ||
| generate-api-docs.py | ||
| lang-index-ar.md | ||
| lang-index-es.md | ||
| lang-index-he.md | ||
| lang-index-hi.md | ||
| lang-index-pt.md | ||
| lang-index-ru.md | ||
| lang-index-sw.md | ||
| lang-index-tr.md | ||
| lang-index-zh-cn.md | ||
| python-api.ar.md | ||
| python-api.es.md | ||
| python-api.he.md | ||
| python-api.hi.md | ||
| python-api.md | ||
| python-api.pt.md | ||
| python-api.ru.md | ||
| python-api.sw.md | ||
| python-api.tr.md | ||
| python-api.zh-cn.md | ||
| README.api-docs.ar.md | ||
| README.api-docs.es.md | ||
| README.api-docs.he.md | ||
| README.api-docs.hi.md | ||
| README.api-docs.md | ||
| README.api-docs.pt.md | ||
| README.api-docs.ru.md | ||
| README.api-docs.sw.md | ||
| README.api-docs.tr.md | ||
| README.api-docs.zh-cn.md | ||
| README.ar.md | ||
| README.cats | ||
| README.challenger | ||
| README.es.md | ||
| README.he.md | ||
| README.hi.md | ||
| README.md | ||
| README.pt.md | ||
| README.ru.md | ||
| README.sw.md | ||
| README.tr.md | ||
| README.zh-cn.md | ||
| websocket.html | ||
| layout | title | nav_order |
|---|---|---|
| default | Home | 1 |
TrustGraph Documentation
Welcome to TrustGraph! For comprehensive documentation, please visit:
📖 https://docs.trustgraph.ai
The main documentation site includes:
- Overview - Introduction to TrustGraph concepts and architecture
- Guides - Step-by-step tutorials and how-to guides
- Deployment - Deployment options and configuration
- Reference - API specifications and CLI documentation
Getting Started
New to TrustGraph? Start with the Overview to understand the system.
Ready to deploy? Check out the Deployment Guide.
Integrating with code? See the API Reference for REST, WebSocket, and SDK documentation.