trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-07-01 09:29:38 +02:00

Author	SHA1	Message	Date
Cyber MacGeddon	4e3bd85abc	Merge branch 'master' into release/v2.5	2026-05-19 18:57:26 +01:00
Cyber MacGeddon	668b64742f	Merge branch 'release/v2.4'	2026-05-19 18:01:35 +01:00
cybermaggedon	66912d9f55	Open 2.5 release branch (#939 )	2026-05-19 16:07:27 +01:00
cybermaggedon	fd6e3e1269	fix: stop dropping messages on Pulsar flow restarts (#938 ) consumer.py called unsubscribe() on every flow stop, deleting the server-side subscription cursor. On restart, initial_position='latest' skipped any messages published during the gap — causing intermittent data loss (e.g. graph embeddings silently never reaching Qdrant). Replace unsubscribe() with close() so the cursor survives restarts. Move subscription cleanup to where it belongs: the Pulsar backend's delete_topic(), called by the flow controller on deliberate flow deletion. This was previously a no-op TODO.	2026-05-19 13:26:39 +01:00
cybermaggedon	47dfc30c1c	fix: suppress Pulsar C++ client log noise (#936 ) Revert consumer receive timeout from 100ms back to the original 2000ms. The 100ms change was based on a misunderstanding — receive() is a blocking call that returns immediately when a message arrives, so the timeout only affects how quickly a consumer checks the shutdown flag during idle periods. 100ms generated ~200 WARN lines/sec from the C++ client with no latency benefit. Also set the Pulsar C++ client logger to Error level so residual timeout warnings from the subscriber (250ms) don't produce noise. Update poll timeout test to match reverted 2000ms value	2026-05-18 22:08:52 +01:00
cybermaggedon	29d3100c46	fix: IAM bootstrap atomicity and bootstrapper startup ordering (#935 ) IAM auto-bootstrap could get permanently stuck in a half-done state: _seed_tables wrote the workspace first, so any_workspace_exists() returned true on restart even when user/key/signing-key creation had failed. Remove workspace creation from _seed_tables (WorkspaceInit handles it) and use any_signing_key_exists() as the completion check since the signing key is the last thing written. Run pre-service initialisers (PulsarTopology) in start() before opening pub/sub connections, breaking the chicken-and-egg where the bootstrapper needed Pulsar namespaces that it was responsible for creating. Guard against empty cluster list when broker isn't ready.	2026-05-18 22:08:12 +01:00
cybermaggedon	76e3358ed3	fix: guard against empty query in SPARQL generator (#934 ) Split the query once and check the parts list before indexing, preventing an IndexError if the LLM returns an empty or whitespace-only string. Fixes #870.	2026-05-18 14:19:19 +01:00
cybermaggedon	da7d10e995	feat: add no-auth IAM regime as a drop-in replacement for iam-svc (#933 ) Adds `no-auth-svc`, a lightweight IAM service that permits all access unconditionally — no database, no bootstrap, no signing keys. Deploy it in place of `iam-svc` for development, demos, and single-user setups where authentication overhead is unwanted. The gateway no longer hard-codes a 401 on missing credentials. Instead it asks the IAM regime via a new `authenticate-anonymous` operation whether token-free access is allowed. This keeps the gateway regime-agnostic: `iam-svc` rejects anonymous auth (preserving existing security), while `no-auth-svc` grants it with a configurable default user and workspace. Includes a tech spec (docs/tech-specs/no-auth-regime.md) and tests that pin the safety boundary — malformed tokens never fall through to the anonymous path, and a contract test ensures the full iam-svc always rejects `authenticate-anonymous`.	2026-05-18 14:10:05 +01:00
cybermaggedon	71517e6417	release/v2.4 -> master (#932 ) * CLI auth migration, document embeddings core lifecycle (#913) Migrate get_kg_core and put_kg_core CLI tools to use Api/SocketClient with first-frame auth (fixes broken raw websocket path). Fix wire format field names (root/vector). Remove ~600 lines of dead raw websocket code from invoke_graph_rag.py. Add document embeddings core lifecycle to the knowledge service: list/get/put/delete/load operations across schema, translator, Cassandra table store, knowledge manager, gateway registry, REST API, socket client, and CLI (tg-get-de-core, tg-put-de-core). Fix delete_kg_core to also clean up document embeddings rows. * Remove spurious workspace parameter from SPARQL algebra evaluator (#915) Fix threading of workspace parameter: - The SPARQL algebra evaluator was threading a workspace parameter through every function and passing it to TriplesClient.query(), which doesn't accept it. Workspace isolation is handled by pub/sub topic routing — the TriplesClient is already scoped to a workspace-specific flow, same as GraphRAG. Passing workspace explicitly was both incorrect and unnecessary. Update tests: - tests/unit/test_query/test_sparql_algebra.py (new) — Tests _query_pattern, _eval_bgp, and evaluate() with various algebra nodes. Key tests assert workspace is never in tc.query() kwargs, plus correctness tests for BGP, JOIN, UNION, SLICE, DISTINCT, and edge cases. - tests/unit/test_retrieval/test_graph_rag.py — Added test_triples_query_never_passes_workspace (checks query()) and test_follow_edges_never_passes_workspace (checks query_stream()). * Make all Cassandra and Qdrant I/O async-safe with proper concurrency controls (#916) Cassandra triples services were using synchronous EntityCentricKnowledgeGraph methods from async contexts, and connection state was managed with threading.local which is wrong for asyncio coroutines sharing a single thread.
Qdrant services had no async wrapping at all, blocking the event loop on every network call. Rows services had unprotected shared state mutations across concurrent coroutines. - Add async methods to EntityCentricKnowledgeGraph (async_insert, async_get_s/p/o/sp/po/os/spo/all, async_collection_exists, async_create_collection, async_delete_collection) using the existing cassandra_async.async_execute bridge - Rewrite triples write + query services: replace threading.local with asyncio.Lock + dict cache for per-workspace connections, use async ECKG methods for all data operations, keep asyncio.to_thread only for one-time blocking ECKG construction - Wrap all Qdrant calls in asyncio.to_thread across all 6 services (doc/graph/row embeddings write + query), add asyncio.Lock + set cache for collection existence checks - Add asyncio.Lock to rows write + query services to protect shared state (schemas, sessions, config caches) from concurrent mutation - Update all affected tests to match new async patterns * Fixed error only returning a page of results (#921) The root cause: async_execute only materialises the first result page (by design — it says so in its docstring). The streaming query set fetch_size=20 and expected to iterate all results, but only got the first 20 rows back. The fix uses asyncio.to_thread(lambda: list(tg.session.execute(...))) which lets the sync driver iterate all pages in a worker thread — exactly what the pre-async code did. 
* Optional test warning suppression (#923) * Fix test collection module errors & silence upstream Pytest warnings (#823) * chore: add virtual environment and .env directories to gitignore * test: filter upstream DeprecationWarning and UserWarning messages * fix(namespace): remove empty __init__.py files to fix PEP 420 implicit namespace routing for trustgraph sub-packages * Revert __init__.py deletions * Add .ini changes but commented out, will be useful at times --------- Co-authored-by: Salil M <d2kyt@protonmail.com> * fix(openai): fail fast on unrecoverable RateLimitError codes (#901) (#904) (#925) Co-authored-by: Sahil Yadav <sahilyadav.sy2004@gmail.com> * Ensure retry exception is properly raised (#926) * fix: library API get/update document round-trip bugs (#893) (#928) Fix 5 cascading bugs in the Library API wrapper that prevented the get_documents → update_document round-trip from working: - Tolerate missing title field in document metadata (use .get()) - Use attribute access on Triple objects instead of subscript - Serialize datetime to int seconds for JSON compatibility - Handle empty server response on successful update - Send both id and document-id keys in update request Added library API tests * Fix ontology selector defaults, add bypass mode, enforce domain/range (#929) - Align similarity_threshold default to 0.3 everywhere (class signature had stale 0.7). Fix matching contradiction in tech-spec. - Add bypass_selector_below parameter (default 5) to skip vector similarity selection when ontology element count is small enough. - Enforce domain/range constraints in TripleConverter for object properties and datatype properties, with subclass hierarchy support. Properties with no declared domain/range pass through unchanged. - Add unit tests for domain/range validation, subclass acceptance, polymorphic pass-through, and selector bypass. 
Fixes #908, #920 * Close producers on flow stop to prevent stale non-persistent topics (#930) Flow.stop() only stopped consumers, leaving response producers connected to non-persistent Pulsar topics. After flow restart, the orphaned producers held stale broker routing state, causing response messages to never reach new consumers — manifesting as 120s timeouts on document-embeddings and similar RPC paths. Fix: Flow.stop() now explicitly stops all producers. Producer.stop() closes the underlying Pulsar producer connection rather than just setting a flag. Fixes #906 * fix(gateway): propagate --timeout flag to per-service dispatchers (#931) The api-gateway accepts a --timeout flag (default 600s) but the value was not propagated into DispatcherManager, which hard-coded timeout=120 for every per-service dispatcher (graph-rag, document-rag, text-completion, embeddings, librarian, etc.). This meant any synchronous request taking more than 120 seconds would always return a Timeout error at the 120s mark, regardless of the --timeout value set on the gateway. Changes: - Add timeout parameter to DispatcherManager.__init__ (default: 120 for backward compatibility) - Store self.timeout in DispatcherManager - Replace both hardcoded timeout=120 with self.timeout in invoke_global_service and invoke_flow_service - Pass self.timeout from Api to DispatcherManager in service.py - Document the timeout parameter in the docstring Fixes #894 --------- Co-authored-by: Salil M <d2kyt@protonmail.com> Co-authored-by: Sahil Yadav <sahilyadav.sy2004@gmail.com> Co-authored-by: Mister Lobster <jlaportebot@gmail.com>	2026-05-18 09:46:58 +01:00
Mister Lobster	ab83c81d8a	fix(gateway): propagate --timeout flag to per-service dispatchers (#931 ) The api-gateway accepts a --timeout flag (default 600s) but the value was not propagated into DispatcherManager, which hard-coded timeout=120 for every per-service dispatcher (graph-rag, document-rag, text-completion, embeddings, librarian, etc.). This meant any synchronous request taking more than 120 seconds would always return a Timeout error at the 120s mark, regardless of the --timeout value set on the gateway. Changes: - Add timeout parameter to DispatcherManager.__init__ (default: 120 for backward compatibility) - Store self.timeout in DispatcherManager - Replace both hardcoded timeout=120 with self.timeout in invoke_global_service and invoke_flow_service - Pass self.timeout from Api to DispatcherManager in service.py - Document the timeout parameter in the docstring Fixes #894	2026-05-18 09:44:37 +01:00
cybermaggedon	2b70a1ea8e	Close producers on flow stop to prevent stale non-persistent topics (#930 ) Flow.stop() only stopped consumers, leaving response producers connected to non-persistent Pulsar topics. After flow restart, the orphaned producers held stale broker routing state, causing response messages to never reach new consumers — manifesting as 120s timeouts on document-embeddings and similar RPC paths. Fix: Flow.stop() now explicitly stops all producers. Producer.stop() closes the underlying Pulsar producer connection rather than just setting a flag. Fixes #906	2026-05-16 16:07:16 +01:00
cybermaggedon	38d9c746a8	Fix ontology selector defaults, add bypass mode, enforce domain/range (#929 ) - Align similarity_threshold default to 0.3 everywhere (class signature had stale 0.7). Fix matching contradiction in tech-spec. - Add bypass_selector_below parameter (default 5) to skip vector similarity selection when ontology element count is small enough. - Enforce domain/range constraints in TripleConverter for object properties and datatype properties, with subclass hierarchy support. Properties with no declared domain/range pass through unchanged. - Add unit tests for domain/range validation, subclass acceptance, polymorphic pass-through, and selector bypass. Fixes #908, #920	2026-05-16 15:13:38 +01:00
cybermaggedon	aea4c2df8e	fix: library API get/update document round-trip bugs (#893 ) (#928 ) Fix 5 cascading bugs in the Library API wrapper that prevented the get_documents → update_document round-trip from working: - Tolerate missing title field in document metadata (use .get()) - Use attribute access on Triple objects instead of subscript - Serialize datetime to int seconds for JSON compatibility - Handle empty server response on successful update - Send both id and document-id keys in update request Added library API tests	2026-05-16 11:32:51 +01:00
cybermaggedon	913f610db5	Ensure retry exception is properly raised (#926 )	2026-05-15 13:35:04 +01:00
cybermaggedon	58b5c5c8d5	fix(openai): fail fast on unrecoverable RateLimitError codes (#901 ) (#904 ) (#925 ) Co-authored-by: Sahil Yadav <sahilyadav.sy2004@gmail.com>	2026-05-15 13:32:30 +01:00
cybermaggedon	142dd0231c	release/v2.4 -> master (#924 ) * CLI auth migration, document embeddings core lifecycle (#913) Migrate get_kg_core and put_kg_core CLI tools to use Api/SocketClient with first-frame auth (fixes broken raw websocket path). Fix wire format field names (root/vector). Remove ~600 lines of dead raw websocket code from invoke_graph_rag.py. Add document embeddings core lifecycle to the knowledge service: list/get/put/delete/load operations across schema, translator, Cassandra table store, knowledge manager, gateway registry, REST API, socket client, and CLI (tg-get-de-core, tg-put-de-core). Fix delete_kg_core to also clean up document embeddings rows. * Remove spurious workspace parameter from SPARQL algebra evaluator (#915) Fix threading of workspace parameter: - The SPARQL algebra evaluator was threading a workspace parameter through every function and passing it to TriplesClient.query(), which doesn't accept it. Workspace isolation is handled by pub/sub topic routing — the TriplesClient is already scoped to a workspace-specific flow, same as GraphRAG. Passing workspace explicitly was both incorrect and unnecessary. Update tests: - tests/unit/test_query/test_sparql_algebra.py (new) — Tests _query_pattern, _eval_bgp, and evaluate() with various algebra nodes. Key tests assert workspace is never in tc.query() kwargs, plus correctness tests for BGP, JOIN, UNION, SLICE, DISTINCT, and edge cases. - tests/unit/test_retrieval/test_graph_rag.py — Added test_triples_query_never_passes_workspace (checks query()) and test_follow_edges_never_passes_workspace (checks query_stream()). * Make all Cassandra and Qdrant I/O async-safe with proper concurrency controls (#916) Cassandra triples services were using synchronous EntityCentricKnowledgeGraph methods from async contexts, and connection state was managed with threading.local which is wrong for asyncio coroutines sharing a single thread.
Qdrant services had no async wrapping at all, blocking the event loop on every network call. Rows services had unprotected shared state mutations across concurrent coroutines. - Add async methods to EntityCentricKnowledgeGraph (async_insert, async_get_s/p/o/sp/po/os/spo/all, async_collection_exists, async_create_collection, async_delete_collection) using the existing cassandra_async.async_execute bridge - Rewrite triples write + query services: replace threading.local with asyncio.Lock + dict cache for per-workspace connections, use async ECKG methods for all data operations, keep asyncio.to_thread only for one-time blocking ECKG construction - Wrap all Qdrant calls in asyncio.to_thread across all 6 services (doc/graph/row embeddings write + query), add asyncio.Lock + set cache for collection existence checks - Add asyncio.Lock to rows write + query services to protect shared state (schemas, sessions, config caches) from concurrent mutation - Update all affected tests to match new async patterns * Fixed error only returning a page of results (#921) The root cause: async_execute only materialises the first result page (by design — it says so in its docstring). The streaming query set fetch_size=20 and expected to iterate all results, but only got the first 20 rows back. The fix uses asyncio.to_thread(lambda: list(tg.session.execute(...))) which lets the sync driver iterate all pages in a worker thread — exactly what the pre-async code did. 
* Optional test warning suppression (#923) * Fix test collection module errors & silence upstream Pytest warnings (#823) * chore: add virtual environment and .env directories to gitignore * test: filter upstream DeprecationWarning and UserWarning messages * fix(namespace): remove empty __init__.py files to fix PEP 420 implicit namespace routing for trustgraph sub-packages * Revert __init__.py deletions * Add .ini changes but commented out, will be useful at times --------- Co-authored-by: Salil M <d2kyt@protonmail.com>	2026-05-15 13:02:51 +01:00
cybermaggedon	01b1fd849d	Optional test warning suppression (#923 ) * Fix test collection module errors & silence upstream Pytest warnings (#823) * chore: add virtual environment and .env directories to gitignore * test: filter upstream DeprecationWarning and UserWarning messages * fix(namespace): remove empty __init__.py files to fix PEP 420 implicit namespace routing for trustgraph sub-packages * Revert __init__.py deletions * Add .ini changes but commented out, will be useful at times --------- Co-authored-by: Salil M <d2kyt@protonmail.com>	2026-05-15 12:58:12 +01:00
cybermaggedon	846282c375	Fixed error only returning a page of results (#921 ) The root cause: async_execute only materialises the first result page (by design — it says so in its docstring). The streaming query set fetch_size=20 and expected to iterate all results, but only got the first 20 rows back. The fix uses asyncio.to_thread(lambda: list(tg.session.execute(...))) which lets the sync driver iterate all pages in a worker thread — exactly what the pre-async code did.	2026-05-14 21:03:09 +01:00
cybermaggedon	a2dde9cafb	Make all Cassandra and Qdrant I/O async-safe with proper concurrency controls (#916 ) Cassandra triples services were using synchronous EntityCentricKnowledgeGraph methods from async contexts, and connection state was managed with threading.local which is wrong for asyncio coroutines sharing a single thread. Qdrant services had no async wrapping at all, blocking the event loop on every network call. Rows services had unprotected shared state mutations across concurrent coroutines. - Add async methods to EntityCentricKnowledgeGraph (async_insert, async_get_s/p/o/sp/po/os/spo/all, async_collection_exists, async_create_collection, async_delete_collection) using the existing cassandra_async.async_execute bridge - Rewrite triples write + query services: replace threading.local with asyncio.Lock + dict cache for per-workspace connections, use async ECKG methods for all data operations, keep asyncio.to_thread only for one-time blocking ECKG construction - Wrap all Qdrant calls in asyncio.to_thread across all 6 services (doc/graph/row embeddings write + query), add asyncio.Lock + set cache for collection existence checks - Add asyncio.Lock to rows write + query services to protect shared state (schemas, sessions, config caches) from concurrent mutation - Update all affected tests to match new async patterns	2026-05-14 16:00:54 +01:00
cybermaggedon	bb1109963c	Remove spurious workspace parameter from SPARQL algebra evaluator (#915 ) Fix threading of workspace parameter: - The SPARQL algebra evaluator was threading a workspace parameter through every function and passing it to TriplesClient.query(), which doesn't accept it. Workspace isolation is handled by pub/sub topic routing — the TriplesClient is already scoped to a workspace-specific flow, same as GraphRAG. Passing workspace explicitly was both incorrect and unnecessary. Update tests: - tests/unit/test_query/test_sparql_algebra.py (new) — Tests _query_pattern, _eval_bgp, and evaluate() with various algebra nodes. Key tests assert workspace is never in tc.query() kwargs, plus correctness tests for BGP, JOIN, UNION, SLICE, DISTINCT, and edge cases. - tests/unit/test_retrieval/test_graph_rag.py — Added test_triples_query_never_passes_workspace (checks query()) and test_follow_edges_never_passes_workspace (checks query_stream()).	2026-05-14 12:03:43 +01:00
cybermaggedon	f0ad282708	CLI auth migration, document embeddings core lifecycle (#913 ) Migrate get_kg_core and put_kg_core CLI tools to use Api/SocketClient with first-frame auth (fixes broken raw websocket path). Fix wire format field names (root/vector). Remove ~600 lines of dead raw websocket code from invoke_graph_rag.py. Add document embeddings core lifecycle to the knowledge service: list/get/put/delete/load operations across schema, translator, Cassandra table store, knowledge manager, gateway registry, REST API, socket client, and CLI (tg-get-de-core, tg-put-de-core). Fix delete_kg_core to also clean up document embeddings rows.	2026-05-14 10:30:21 +01:00
elpresidank	ffd97375a8	saving	2026-05-12 08:06:58 -05:00
elpresidank	e8c7a4f6e0	Merge remote-tracking branch 'origin/master' into ts-port	2026-05-11 19:46:08 -05:00
elpresidank	a20dd1999c	saving	2026-05-11 19:44:40 -05:00
Cyber MacGeddon	159b1e2824	Merge branch 'release/v2.4'	2026-05-11 15:15:50 +01:00
KOTHA-SRIVIBHU	dd974b0cac	fix: replace bare excepts in NLTK initialization (#896 )	2026-05-11 15:12:25 +01:00
KOTHA-SRIVIBHU	ab02c02b33	fix: replace bare excepts in NLTK initialization (#896 )	2026-05-11 15:11:30 +01:00
Sahil Yadav	d08ec56a73	fix: resolve publisher resource leak and field parse validation (#886 )	2026-05-11 15:06:54 +01:00
Sahil Yadav	c2f1759bdf	fix: resolve publisher resource leak and field parse validation (#886 )	2026-05-11 15:06:24 +01:00
Jack Colquitt	80a7579639	Enhance README.md descriptions for TrustGraph and Context Core (#892 ) Refined the description of TrustGraph and updated the Context Core explanation for clarity.	2026-05-08 13:45:06 -07:00
cybermaggedon	fd8d5b2c42	Recent fixes -> release/v2.4 (#891 ) * Fix publisher resource leak in librarian submit_document (#883) Wrap pub.start()/pub.send() in try/finally to guarantee pub.stop() is called on error. Remove unnecessary asyncio.sleep(1) kludge. * Make Cassandra replication factor configurable (issue #787) (#887) Add CASSANDRA_REPLICATION_FACTOR environment variable and --cassandra-replication-factor CLI argument to cassandra_config.py. Update all four table store constructors (ConfigTableStore, KnowledgeTableStore, LibraryTableStore, IamTableStore) to accept an optional replication_factor parameter and use it in keyspace creation CQL queries. Thread the replication factor through all service constructors: Configuration, KnowledgeManager, Librarian, IamService, and knowledge store Processor. * Update tests --------- Co-authored-by: gittihub-jpg <rico@springer-mail.net>	2026-05-08 19:48:12 +01:00
gittihub-jpg	e23d4a5b58	Make Cassandra replication factor configurable (issue #787 ) (#887 ) Add CASSANDRA_REPLICATION_FACTOR environment variable and --cassandra-replication-factor CLI argument to cassandra_config.py. Update all four table store constructors (ConfigTableStore, KnowledgeTableStore, LibraryTableStore, IamTableStore) to accept an optional replication_factor parameter and use it in keyspace creation CQL queries. Thread the replication factor through all service constructors: Configuration, KnowledgeManager, Librarian, IamService, and knowledge store Processor.	2026-05-08 19:31:58 +01:00
gittihub-jpg	f9d6606423	Fix publisher resource leak in librarian submit_document (#883 ) Wrap pub.start()/pub.send() in try/finally to guarantee pub.stop() is called on error. Remove unnecessary asyncio.sleep(1) kludge.	2026-05-08 19:22:48 +01:00
cybermaggedon	fe542b3d33	Remove race condition in workspace initialisation if iam-svc is up (#867 ) Remove race condition in workspace initialisation if iam-svc is up before config-svc. iam.py — handle_create_workspace: - Config registration (_on_workspace_created) moves before the IAM table write, so it's a prerequisite. If the config put fails, the exception propagates and the IAM create doesn't happen. - On duplicate, the IAM table write is skipped but config registration still runs (idempotent put). Returns the existing record with no error instead of returning _err("duplicate", ...). service.py — _announce_workspace_created → _ensure_workspace_registered: - Renamed to reflect the new semantics. - Exceptions propagate instead of being swallowed — if config registration fails, the caller sees the error.	2026-05-06 21:21:48 +01:00
cybermaggedon	d282d72db1	Fixed document-rag workspace problem (#866 ) - Fixed document-rag workspace problem - OpenAI text-completion processor now puts 'not-set' in the token if no token is set (the new OpenAI library requires it to be set to something). - Update tests	2026-05-06 14:55:21 +01:00
cybermaggedon	03cc5ac80f	Per-flow librarian clients and per-workspace response queues (#865 ) Replace singleton LibrarianClient with per-flow instances via the new LibrarianSpec, giving each flow its own librarian tied to the workspace-scoped request/response queues from the blueprint. Move all workspace-scoped services (config, flow, librarian, knowledge) from a single base-queue response producer to per-workspace response producers created alongside the existing per-workspace request consumers. Update the gateway dispatcher and bootstrapper flow client to subscribe to the matching workspace-scoped response queues. Fix WorkspaceInit to register workspaces through the IAM create-workspace API so they appear in __workspaces__ and are visible to the gateway. Simplify the bootstrapper gate to only check config-svc reachability. Updated tests accordingly.	2026-05-06 12:01:01 +01:00
Jack Colquitt	1ffae12559	Revise README to reflect agent runtime platform (#864 ) Updated platform description and added messaging systems.	2026-05-05 11:47:19 -07:00
cybermaggedon	01bf1d89d5	Fixed a circular dependency causing bootstrap to fail (#863 ) TemplateSeed and WorkspaceInit now run pre-gate. They'll write templates and register the default workspace before the gate checks flow-svc, breaking the circular dependency.	2026-05-05 16:00:21 +01:00
cybermaggedon	9f2bfbce0c	Per-workspace queue routing for workspace-scoped services (#862 ) Workspace identity is now determined by queue infrastructure instead of message body fields, closing a privilege-escalation vector where a caller could spoof workspace in the request payload. - Add WorkspaceProcessor base class: discovers workspaces from config at startup, creates per-workspace consumers (queue:workspace), and manages consumer lifecycle on workspace create/delete events - Roll out to librarian, flow-svc, knowledge cores, and config-svc - Config service gets a dual-queue regime: a system queue for cross-workspace ops (getvalues-all-ws, bootstrapper writes to __workspaces__) and per-workspace queues for tenant-scoped ops, with workspace discovery from its own Cassandra store - Remove workspace field from request schemas (FlowRequest, LibrarianRequest, KnowledgeRequest, CollectionManagementRequest) and from DocumentMetadata / ProcessingMetadata — table stores now accept workspace as an explicit parameter - Strip workspace encode/decode from all message translators and gateway serializers - Gateway enforces workspace existence: reject requests targeting non-existent workspaces instead of routing to queues with no consumer - Config service provisions new workspaces from __template__ on creation - Add workspace lifecycle hooks to AsyncProcessor so any processor can react to workspace create/delete without subclassing WorkspaceProcessor	2026-05-04 10:30:03 +01:00
elpresidank	54a6e49bd3	Merge remote-tracking branch 'origin/master' into ts-port	2026-05-01 22:16:51 -05:00
elpresidank	6ac5446a76	feat(mcp-tool): wire McpToolService into deploy stack Three pieces, all required for an end-to-end MCP tool call: * McpToolService used generic spec names "request"/"response" instead of "mcp-tool-request"/"mcp-tool-response", so RequestResponseSpec's flow-config topic lookup never matched and consumers bound to literal subjects nobody else publishes to. * Add entrypoints/mcp-tool.mjs (mirrors agent/librarian entrypoints) so the service can be launched in the prebuilt trustgraph-ts image. * Add a `mcp-tool` service block to deploy/docker-compose.yml. With these three fixes plus a `mcp-tool-request`/`mcp-tool-response` entry in each flow's topics map, the agent ReAct loop can now invoke remote MCP tools (verified end-to-end against Brave Search and FireCrawl). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 22:16:37 -05:00
elpresidank	4c356cd24c	fix(client): use correct put/delete config wire shape ConfigApi.putConfig and deleteConfig (and the duplicate in FlowsApi) sent a flat values:[{type,key,value}] array and a keys:{type,key} object — neither matches the ConfigService schema, which requires keys:[namespace, ...innerKeys] and values:Record<string,unknown>. Every save in the workbench /mcp-tools page returned `Put requires at least one key (namespace)`. putConfig now groups items by type (namespace) and issues one put per group; deleteConfig sends keys:[type, key]. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 22:16:28 -05:00
cybermaggedon	9be257ceee	Update packages with vulns in container builds (#861 ) * Fix vulns-flagged imports * Fix archaic pulls in the "trustgraph" package * Add unstructured to meta package	2026-04-30 20:02:53 +01:00
cybermaggedon	89f058d35b	Fix erroneous handling of __template__ and __system__ workspaces (#860 ) __template__ and __system__ are not real workspaces but config zones for workspace template and init handling. They should not be processed as workspaces. Now ignores workspaces beginning with the '_' prefix.	2026-04-30 09:53:58 +01:00
cybermaggedon	69b0b9b326	Add missing websockets dep (#859 ) Added 'websockets' to pyproject.toml in trustgraph-base	2026-04-30 09:53:32 +01:00
Cyber MacGeddon	c112af0ab0	align chunker + googleaistudio fixes with release/v2.4 Master had a parallel sibling fix for issue #821 (PR #828) using self.RecursiveCharacterTextSplitter / self.TokenTextSplitter; release branches converged on the bare module-level form. Adopt release/v2.4's version so downstream branches don't drift further.	2026-04-29 18:01:31 +01:00
Cyber MacGeddon	f3434307c5	Merge branch 'release/v2.4'	2026-04-29 17:57:24 +01:00
Jack Colquitt	627cb1e0d8	Remove Cossmology badge from README (#857 ) Removed the badge for trustgraph on Cossmology from the README.	2026-04-28 16:54:23 -07:00
cybermaggedon	d0850ff381	Delete some stuff to free up disk space (#856 )	2026-04-28 22:46:02 +01:00
cybermaggedon	9fc1d4527b	iam: self-service ops, optional workspace filters, Mux service routing (#855 ) Three threads, all reinforcing the contract's system-level vs. workspace-association distinction. WS Mux service routing - tg-show-flows (and any workspace-level service over the WS) was failing with "unknown service" because the post-refactor Mux unconditionally looked up flow-service:<kind>. Now branches on the envelope's flow field: with flow → flow-service:<kind>; without flow → <kind>:<op> from the inner body; with bare op lookup for service=iam. Resource and parameters come from the matched op's own extractors — same path the HTTP endpoints take. Optional workspace on system-level user/key ops - list-users returns the deployment-wide list when no workspace is supplied, filters when one is. get-user, update-user, disable-user, enable-user, delete-user, reset-password, create-api-key, list-api-keys, revoke-api-key all treat workspace as an optional integrity check rather than a required argument. - create-user keeps workspace required — there it's the new user's home-workspace binding, a parameter rather than an address. - API keys reclassified as SYSTEM-level resources. By the same reasoning that makes users system-level, an API key is a credential record on a deployment-wide registry; the workspace it authenticates to is a property, not a containment. Self-service surface - whoami: returns the caller's own user record. AUTHENTICATED-only; no users:read capability required. Foundation for UI affordances that depend on the caller's permissions. - bootstrap-status: POST /api/v1/auth/bootstrap-status, PUBLIC, side-effect-free. Returns {bootstrap_available: bool} so a first-run UI can decide whether to render setup without consuming the bootstrap op. - Gateway now injects actor=identity.handle on every authenticated forward to iam-svc (IamEndpoint and WS Mux iam path), overwriting any caller-supplied value. 
Underpins whoami, audit logging, and future regime-side decisions that need actor identity. - tg-whoami and tg-update-user CLIs. Spec polish - iam-contract.md: actor-injection rule documented; whoami / bootstrap-status added to operations list; permission-scope framing tightened (workspace scope is a property of the grant, not the user or role). - iam.md: self-service section; gateway flow gains the actor-injection step; role section reframed so iam-svc constraints don't leak into contract-level prose. - iam-protocol.md: ops table updated for whoami, bootstrap-status, optional-workspace pattern; bootstrap_available added to the IamResponse listing.	2026-04-28 22:13:12 +01:00
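
Note on #916: the "asyncio.Lock + dict cache for per-workspace connections" pattern named in that commit might look roughly like the sketch below. This is an illustration under stated assumptions, not TrustGraph's code: ConnectionCache, its get() method, and the factory argument are hypothetical names, and the blocking factory stands in for the one-time ECKG construction that the commit keeps in asyncio.to_thread.

```python
import asyncio


class ConnectionCache:
    """Hypothetical per-workspace connection cache (not TrustGraph's class)."""

    def __init__(self, factory):
        # factory is an assumed blocking callable: workspace -> connection
        self._factory = factory
        self._lock = asyncio.Lock()
        self._cache = {}

    async def get(self, workspace):
        # The lock serialises cache checks and construction, so coroutines
        # sharing one thread never build the same connection twice.
        async with self._lock:
            conn = self._cache.get(workspace)
            if conn is None:
                # Run the blocking constructor in a worker thread so the
                # event loop is not blocked while it connects.
                conn = await asyncio.to_thread(self._factory, workspace)
                self._cache[workspace] = conn
            return conn
```

A plain dict keyed by workspace replaces threading.local here because all coroutines run on one thread, so thread-local storage would give every workspace the same slot. asyncio.to_thread requires Python 3.9 or later.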


1478 commits
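
Note on #934: the guard described in that commit (split the query once, then check the parts list before indexing) could be sketched as below. The function name leading_keyword and its return convention are assumptions made for illustration; the actual SPARQL generator code is not shown in this log.

```python
from typing import Optional


def leading_keyword(query: str) -> Optional[str]:
    """Hypothetical helper: first token of a generated query, upper-cased."""
    # Split once; str.split() with no arguments discards all whitespace,
    # so empty and whitespace-only strings yield an empty list.
    parts = query.split()
    if not parts:
        # Nothing to index: signal "no query" instead of raising IndexError.
        return None
    return parts[0].upper()
```

Checking the list rather than the raw string covers both the empty and the whitespace-only case with a single test.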