* CLI auth migration, document embeddings core lifecycle (#913)
Migrate get_kg_core and put_kg_core CLI tools to use Api/SocketClient
with first-frame auth (fixes broken raw websocket path). Fix wire
format field names (root/vector). Remove ~600 lines of dead raw
websocket code from invoke_graph_rag.py.
Add document embeddings core lifecycle to the knowledge service:
list/get/put/delete/load operations across schema, translator,
Cassandra table store, knowledge manager, gateway registry, REST API,
socket client, and CLI (tg-get-de-core, tg-put-de-core).
Fix delete_kg_core to also clean up document embeddings rows.
* Remove spurious workspace parameter from SPARQL algebra evaluator (#915)
Fix threading of workspace parameter:
- The SPARQL algebra evaluator was threading a workspace parameter
through every function and passing it to TriplesClient.query(),
which doesn't accept it. Workspace isolation is handled by pub/sub
topic routing — the TriplesClient is already scoped to a
workspace-specific flow, same as GraphRAG. Passing workspace
explicitly was both incorrect and unnecessary.
Update tests:
- tests/unit/test_query/test_sparql_algebra.py (new) — Tests
_query_pattern, _eval_bgp, and evaluate() with various algebra
nodes. Key tests assert workspace is never in tc.query() kwargs,
plus correctness tests for BGP, JOIN, UNION, SLICE, DISTINCT, and
edge cases.
- tests/unit/test_retrieval/test_graph_rag.py — Added
test_triples_query_never_passes_workspace (checks query()) and
test_follow_edges_never_passes_workspace (checks query_stream()).
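The kind of assertion these tests make can be sketched as follows. This is a minimal illustration, not the actual test code: `query_pattern` is a hypothetical stand-in for the evaluator helper, and the mock replaces a real TriplesClient, whose exact query() signature is assumed here.

```python
import asyncio
from unittest.mock import AsyncMock


async def query_pattern(tc, s=None, p=None, o=None, limit=100):
    # Hypothetical stand-in for the evaluator's pattern-query helper.
    # It forwards only the triple pattern and a limit: no workspace argument,
    # because the client is assumed to be scoped to a workspace already.
    return await tc.query(s=s, p=p, o=o, limit=limit)


def check_no_workspace_kwarg():
    tc = AsyncMock()              # mock client; tc.query becomes an AsyncMock
    tc.query.return_value = []    # awaiting tc.query(...) yields an empty list

    asyncio.run(query_pattern(tc, s="http://example.org/a"))

    tc.query.assert_awaited_once()
    _, kwargs = tc.query.call_args
    # The key regression check: workspace must never reach query().
    assert "workspace" not in kwargs
    return kwargs


kwargs = check_no_workspace_kwarg()
```

Checking the recorded kwargs rather than the return value keeps the test independent of what the triple store actually contains.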
* Make all Cassandra and Qdrant I/O async-safe with proper concurrency controls (#916)
Cassandra triples services were using synchronous EntityCentricKnowledgeGraph
methods from async contexts, and connection state was managed with
threading.local which is wrong for asyncio coroutines sharing a single
thread. Qdrant services had no async wrapping at all, blocking the event
loop on every network call. Rows services had unprotected shared state
mutations across concurrent coroutines.
- Add async methods to EntityCentricKnowledgeGraph (async_insert,
async_get_s/p/o/sp/po/os/spo/all, async_collection_exists,
async_create_collection, async_delete_collection) using the existing
cassandra_async.async_execute bridge
- Rewrite triples write + query services: replace threading.local with
asyncio.Lock + dict cache for per-workspace connections, use async
ECKG methods for all data operations, keep asyncio.to_thread only for
one-time blocking ECKG construction
- Wrap all Qdrant calls in asyncio.to_thread across all 6 services
(doc/graph/row embeddings write + query), add asyncio.Lock + set cache
for collection existence checks
- Add asyncio.Lock to rows write + query services to protect shared
state (schemas, sessions, config caches) from concurrent mutation
- Update all affected tests to match new async patterns
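The connection-cache pattern described above (asyncio.Lock + dict cache, with asyncio.to_thread reserved for blocking construction) might look roughly like this. It is a sketch under assumptions: `ConnectionCache` and `factory` are illustrative names, not identifiers from the services, and the factory stands in for the one-time blocking ECKG construction.

```python
import asyncio
import time


class ConnectionCache:
    """Per-workspace cache guarded by an asyncio.Lock (illustrative only)."""

    def __init__(self, factory):
        self._factory = factory        # blocking constructor, run off-loop
        self._lock = asyncio.Lock()
        self._cache = {}

    async def get(self, workspace):
        # The lock ensures concurrent coroutines asking for the same
        # workspace trigger only one construction.
        async with self._lock:
            conn = self._cache.get(workspace)
            if conn is None:
                # Blocking work goes to a worker thread so the event loop
                # stays responsive.
                conn = await asyncio.to_thread(self._factory, workspace)
                self._cache[workspace] = conn
            return conn


async def demo():
    calls = []

    def factory(workspace):
        calls.append(workspace)
        time.sleep(0.01)               # simulate a slow, blocking connect
        return object()

    cache = ConnectionCache(factory)   # created inside the running loop
    conns = await asyncio.gather(*(cache.get("ws1") for _ in range(5)))
    return len(calls), all(c is conns[0] for c in conns)


result = asyncio.run(demo())
```

Holding a single lock across construction also serialises first-time connections for different workspaces; that trade-off is acceptable when construction happens once per workspace, which is the assumption made here.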
* Fix streaming query returning only the first page of results (#921)
The root cause: async_execute only materialises the first result
page (by design — it says so in its docstring). The streaming query
set fetch_size=20 and expected to iterate all results, but only got
the first 20 rows back.
The fix uses
asyncio.to_thread(lambda: list(tg.session.execute(...)))
which lets the sync driver iterate all pages in a worker thread,
exactly what the pre-async code did.
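The difference between reading one page and draining all pages can be illustrated with a small simulation. `FakeSession` and `FakePagedResult` are hypothetical stand-ins for a sync driver session and its lazily paged result set; they are not the Cassandra driver's classes, and only the `asyncio.to_thread(lambda: list(...))` shape is taken from the fix described above.

```python
import asyncio


class FakePagedResult:
    """Stand-in for a sync result set that fetches pages lazily."""

    def __init__(self, rows, fetch_size):
        self._rows = rows
        self._fetch_size = fetch_size

    @property
    def current_rows(self):
        # Only the first page, analogous to what a first-page-only
        # async bridge would hand back.
        return self._rows[:self._fetch_size]

    def __iter__(self):
        # Iterating walks every page; each slice represents one
        # blocking page fetch in a real driver.
        for start in range(0, len(self._rows), self._fetch_size):
            yield from self._rows[start:start + self._fetch_size]


class FakeSession:
    def __init__(self, rows):
        self._rows = rows

    def execute(self, query, fetch_size=20):
        return FakePagedResult(self._rows, fetch_size)


async def fetch_all(session, query):
    # Drain the whole result set in a worker thread so that blocking
    # page fetches never run on the event loop.
    return await asyncio.to_thread(lambda: list(session.execute(query)))


async def demo():
    session = FakeSession(list(range(45)))
    first_page = session.execute("SELECT * FROM t").current_rows
    all_rows = await fetch_all(session, "SELECT * FROM t")
    return len(first_page), len(all_rows)


page_len, total_len = asyncio.run(demo())
```

With 45 rows and a fetch size of 20, the first page holds 20 rows while the threaded `list(...)` call returns all 45, which mirrors the symptom and the fix.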
* Optional test warning suppression (#923)
* Fix test collection module errors & silence upstream Pytest warnings (#823)
* chore: add virtual environment and .env directories to gitignore
* test: filter upstream DeprecationWarning and UserWarning messages
* fix(namespace): remove empty __init__.py files to fix PEP 420 implicit namespace routing for trustgraph sub-packages
* Revert __init__.py deletions
* Add .ini changes but commented out, will be useful at times
---------
Co-authored-by: Salil M <d2kyt@protonmail.com>
* Changed schema for Value -> Term; this is a major breaking change
* Propagated the Value -> Term schema change through all processing
* Updated Cassandra for g, p, s, o index patterns (7 indexes)
* Reviewed and updated all tests
* Neo4j, Memgraph and FalkorDB remain broken; will revisit once things have settled down