* CLI auth migration, document embeddings core lifecycle (#913)
Migrate get_kg_core and put_kg_core CLI tools to use Api/SocketClient
with first-frame auth (fixes broken raw websocket path). Fix wire
format field names (root/vector). Remove ~600 lines of dead raw
websocket code from invoke_graph_rag.py.
Add document embeddings core lifecycle to the knowledge service:
list/get/put/delete/load operations across schema, translator,
Cassandra table store, knowledge manager, gateway registry, REST API,
socket client, and CLI (tg-get-de-core, tg-put-de-core).
Fix delete_kg_core to also clean up document embeddings rows.
* Remove spurious workspace parameter from SPARQL algebra evaluator (#915)
Fix threading of the workspace parameter:
- The SPARQL algebra evaluator was threading a workspace parameter
through every function and passing it to TriplesClient.query(),
which doesn't accept it. Workspace isolation is handled by pub/sub
topic routing — the TriplesClient is already scoped to a
workspace-specific flow, same as GraphRAG. Passing workspace
explicitly was both incorrect and unnecessary.
Update tests:
- tests/unit/test_query/test_sparql_algebra.py (new) — Tests
_query_pattern, _eval_bgp, and evaluate() with various algebra
nodes. Key tests assert workspace is never in tc.query() kwargs,
plus correctness tests for BGP, JOIN, UNION, SLICE, DISTINCT, and
edge cases.
- tests/unit/test_retrieval/test_graph_rag.py — Added
test_triples_query_never_passes_workspace (checks query()) and
test_follow_edges_never_passes_workspace (checks query_stream()).
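A minimal sketch of the assertion style these tests use, with a mocked TriplesClient; the evaluator stand-in and all names here are illustrative, not the real test code:

```python
import asyncio
from unittest.mock import AsyncMock

async def eval_bgp(tc, patterns):
    # Stand-in for the algebra evaluator: it must call tc.query()
    # with pattern terms only -- workspace isolation comes from
    # pub/sub topic routing, not from a kwarg.
    results = []
    for s, p, o in patterns:
        results.append(await tc.query(s=s, p=p, o=o))
    return results

tc = AsyncMock()
tc.query.return_value = []
asyncio.run(eval_bgp(tc, [("s1", "p1", "o1")]))

# The key assertion: workspace never appears in any call's kwargs.
assert all("workspace" not in c.kwargs for c in tc.query.call_args_list)
```

The same check works against `query_stream()` by swapping the mocked method.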
* Make all Cassandra and Qdrant I/O async-safe with proper concurrency controls (#916)
Cassandra triples services were using synchronous EntityCentricKnowledgeGraph
methods from async contexts, and connection state was managed with
threading.local which is wrong for asyncio coroutines sharing a single
thread. Qdrant services had no async wrapping at all, blocking the event
loop on every network call. Rows services had unprotected shared state
mutations across concurrent coroutines.
- Add async methods to EntityCentricKnowledgeGraph (async_insert,
async_get_s/p/o/sp/po/os/spo/all, async_collection_exists,
async_create_collection, async_delete_collection) using the existing
cassandra_async.async_execute bridge
- Rewrite triples write + query services: replace threading.local with
asyncio.Lock + dict cache for per-workspace connections, use async
ECKG methods for all data operations, keep asyncio.to_thread only for
one-time blocking ECKG construction
- Wrap all Qdrant calls in asyncio.to_thread across all 6 services
(doc/graph/row embeddings write + query), add asyncio.Lock + set cache
for collection existence checks
- Add asyncio.Lock to rows write + query services to protect shared
state (schemas, sessions, config caches) from concurrent mutation
- Update all affected tests to match new async patterns
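The per-workspace connection cache pattern described above can be sketched like this; the class and factory names are illustrative, not the actual service code:

```python
import asyncio

class ConnectionCache:
    """Per-workspace connection cache.

    threading.local is wrong here: many coroutines share one thread,
    so they would all see the same 'local' slot. An asyncio.Lock
    guarding a plain dict gives each workspace one connection without
    blocking the event loop.
    """
    def __init__(self, factory):
        self.factory = factory          # one-time blocking constructor
        self.cache = {}                 # workspace -> connection
        self.lock = asyncio.Lock()

    async def get(self, workspace):
        async with self.lock:
            if workspace not in self.cache:
                # Blocking construction goes to a worker thread.
                self.cache[workspace] = await asyncio.to_thread(
                    self.factory, workspace
                )
            return self.cache[workspace]

async def main():
    cache = ConnectionCache(lambda ws: f"conn:{ws}")
    first = await cache.get("alpha")
    second = await cache.get("alpha")
    assert first is second      # same cached connection per workspace

asyncio.run(main())
```

The same lock-plus-set shape covers the Qdrant collection-existence cache.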
* Fix streaming queries returning only the first page of results (#921)
The root cause: async_execute only materialises the first result
page (by design — it says so in its docstring). The streaming query
set fetch_size=20 and expected to iterate all results, but only got
the first 20 rows back.
The fix uses
asyncio.to_thread(lambda: list(tg.session.execute(...)))
which lets the sync driver iterate all pages in a worker thread,
exactly what the pre-async code did.
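A minimal reproduction of the pattern, with fake driver objects standing in for the cassandra-driver session and its paged ResultSet:

```python
import asyncio

class FakeResultSet:
    """Stands in for a paged cassandra-driver ResultSet."""
    def __init__(self, rows):
        self._rows = rows
    def __iter__(self):
        # The real driver fetches later pages lazily during iteration,
        # so list() on it walks every page, not just the first.
        return iter(self._rows)

class FakeSession:
    def execute(self, stmt):
        return FakeResultSet(range(50))   # pretend 50 rows across pages

async def fetch_all(session, stmt):
    # asyncio.to_thread keeps the event loop free while the sync
    # driver iterates every page in a worker thread.
    return await asyncio.to_thread(lambda: list(session.execute(stmt)))

rows = asyncio.run(fetch_all(FakeSession(), "SELECT ..."))
assert len(rows) == 50   # all rows, not just the first fetch_size page
```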
* Optional test warning suppression (#923)
* Fix test collection module errors & silence upstream Pytest warnings (#823)
* chore: add virtual environment and .env directories to gitignore
* test: filter upstream DeprecationWarning and UserWarning messages
* fix(namespace): remove empty __init__.py files to fix PEP 420 implicit namespace routing for trustgraph sub-packages
* Revert __init__.py deletions
* Add .ini changes but commented out, will be useful at times
---------
Co-authored-by: Salil M <d2kyt@protonmail.com>
* Expose LLM token usage (in_token, out_token, model) across all
service layers
Propagate token counts from LLM services through the prompt,
text-completion, graph-RAG, document-RAG, and agent orchestrator
pipelines to the API gateway and Python SDK. All fields are Optional
— None means "not available", distinguishing from a real zero count.
Key changes:
- Schema: Add in_token/out_token/model to TextCompletionResponse,
PromptResponse, GraphRagResponse, DocumentRagResponse,
AgentResponse
- TextCompletionClient: New TextCompletionResult return type. Split
into text_completion() (non-streaming) and
text_completion_stream() (streaming with per-chunk handler
callback)
- PromptClient: New PromptResult with response_type
(text/json/jsonl), typed fields (text/object/objects), and token
usage. All callers updated.
- RAG services: Accumulate token usage across all prompt calls
(extract-concepts, edge-scoring, edge-reasoning,
synthesis). Non-streaming path sends single combined response
instead of chunk + end_of_session.
- Agent orchestrator: UsageTracker accumulates tokens across
meta-router, pattern prompt calls, and react reasoning. Attached
to end_of_dialog.
- Translators: Encode token fields only when they are not None (an
  is-not-None check, not truthiness, so zero counts survive)
- Python SDK: RAG and text-completion methods return
TextCompletionResult (non-streaming) or RAGChunk/AgentAnswer with
token fields (streaming)
- CLI: --show-usage flag on tg-invoke-llm, tg-invoke-prompt,
tg-invoke-graph-rag, tg-invoke-document-rag, tg-invoke-agent
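The translator rule above can be sketched as follows; the dict shape and helper name are illustrative, not the actual translator code:

```python
def encode_usage(resp, in_token=None, out_token=None, model=None):
    # "is not None", not truthiness: 0 is a real, meaningful count
    # and must still be encoded; only None means "not available".
    if in_token is not None:
        resp["in_token"] = in_token
    if out_token is not None:
        resp["out_token"] = out_token
    if model is not None:
        resp["model"] = model
    return resp

assert encode_usage({}, 0, 0, "some-model") == {
    "in_token": 0, "out_token": 0, "model": "some-model"
}
assert encode_usage({}) == {}   # all None: nothing encoded
```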
* Fix _quoted_triple treating literal values as IRIs
The triples client returns Uri/Literal (str subclasses), not Term
objects. _quoted_triple() treated all values as IRIs, so literal
objects like skos:definition values were mistyped in focus
provenance events, and trace_source_documents could not match
them in the store.
Added to_term() to convert Uri/Literal back to Term, threaded a
term_map from follow_edges_batch through
get_subgraph/get_labelgraph into uri_map, and updated
_quoted_triple to accept Term objects directly.
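A minimal sketch of why an explicit conversion is needed; the Uri, Literal, and Term stand-ins below are simplified, not the actual trustgraph classes:

```python
class Uri(str):
    """Stand-in for the triples client's IRI string subclass."""

class Literal(str):
    """Stand-in for the triples client's literal string subclass."""

class Term:
    def __init__(self, value, is_uri):
        self.value = value
        self.is_uri = is_uri

def to_term(v):
    # Uri and Literal are both str subclasses, so a plain "treat
    # everything as an IRI" path mistypes literals; an explicit
    # isinstance check recovers the distinction.
    return Term(str(v), isinstance(v, Uri))

assert to_term(Uri("http://example.org/s")).is_uri
assert not to_term(Literal("a skos:definition value")).is_uri
```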
* Terminology rename, and named graphs for explainability data
Changed terminology:
- session -> question
- retrieval -> exploration
- selection -> focus
- answer -> synthesis
- uris.py: Renamed query_session_uri → question_uri,
retrieval_uri → exploration_uri, selection_uri → focus_uri,
answer_uri → synthesis_uri
- triples.py: Renamed corresponding triple generation functions with
updated labels ("GraphRAG question", "Exploration", "Focus",
"Synthesis")
- namespaces.py: Added named graph constants GRAPH_DEFAULT,
GRAPH_SOURCE, GRAPH_RETRIEVAL
- __init__.py: Updated exports
- graph_rag.py: Updated to use new terminology
- invoke_graph_rag.py: Updated CLI to display new stage names
(Question, Exploration, Focus, Synthesis)
Query-Time Explainability → Named Graph
- triples.py: Added set_graph() helper function to set named graph
on triples
- graph_rag.py: All explainability triples now use GRAPH_RETRIEVAL
named graph
- rag.py: Explainability triples stored in user's collection (not
separate collection) with named graph
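The set_graph() idea can be sketched as extending (s, p, o) triples into quads tagged with a named-graph constant; the constant value and triple shape here are illustrative:

```python
GRAPH_RETRIEVAL = "http://example.org/graph/retrieval"   # illustrative value

def set_graph(triples, graph):
    # Extend (s, p, o) triples into (s, p, o, g) quads so
    # explainability data can live in the user's own collection
    # while remaining filterable by named graph.
    return [(s, p, o, graph) for (s, p, o) in triples]

quads = set_graph([("question1", "hasFocus", "entity42")], GRAPH_RETRIEVAL)
assert quads[0][3] == GRAPH_RETRIEVAL
```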
Extraction Provenance → Named Graph
- relationships/extract.py: Provenance triples use GRAPH_SOURCE
named graph
- definitions/extract.py: Provenance triples use GRAPH_SOURCE
named graph
- chunker.py: Provenance triples use GRAPH_SOURCE named graph
- pdf_decoder.py: Provenance triples use GRAPH_SOURCE named graph
CLI Updates
- show_graph.py: Added -g/--graph option to filter by named graph and
--show-graph to display graph column
Also:
- Fix knowledge core schemas