trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-07-03 23:11:00 +02:00

Author	SHA1	Message	Date
cybermaggedon	c05296376e	fix: remove test import (#1017 ) Convention in the tests is to just import the libraries as production code would. This is fragile, and could possibly be used to inject malicious code in the CI environment.	2026-07-03 13:44:09 +01:00
YingzuoLiu	f04ae5331d	Add diversity-aware selection after Document-RAG reranking (#1014 ) * Add Document-RAG diversity selection helper * Add optional MMR diversity selection after reranking * Fix Document-RAG diversity test method signatures	2026-07-03 13:35:42 +01:00
cybermaggedon	db7fdbc652	feat: direction-aware reranker text in GraphRAG hop-and-filter (#1016 ) The reranker document text now reflects the traversal direction, showing only the new information relative to the frontier entity: - From S (subject is frontier): text = "{predicate} {object}" - From O (object is frontier): text = "{subject} {predicate}" - From P (predicate is frontier): text = "{subject} {object}" This eliminates duplicate reranker texts when traversing inward from shared object nodes (e.g. 18 CPUs all producing identical "hasSubcategory Processors" text when the subject was dropped). execute_batch_triple_queries now returns (triple, direction) tuples so hop_and_filter can select the appropriate text format. Updates tech spec to document the direction-aware approach. Adds unit tests for direction tracking and reranker text construction.	2026-07-02 21:14:47 +01:00
cybermaggedon	9cf7dcb578	fix: wire variant into remaining streaming integration test mocks (#1013 ) Three more streaming tests were missing _wire_variant after the async for change in create_completion_stream.	2026-07-02 11:14:54 +01:00
Sunny	6c9a545a06	feat: add cross-encoder reranking to Document-RAG with two-limit control (#878 ) (#1011 ) Wire the FlashRank reranker subsystem from #1005 into Document-RAG: after vector retrieval, over-fetch a wider candidate pool, rerank with the cross-encoder, and keep the top doc_limit chunks for synthesis. Per maintainer review, the fetch and select sizes are two caller-controlled limits rather than one internal heuristic: - doc_limit: chunks selected into the synthesis prompt (unchanged meaning). - fetch_limit: candidate pool pulled from the vector store before reranking. 0 = derive (OVERFETCH_FACTOR x doc_limit); values below doc_limit are raised to it. Lets the caller control how hard the reranker has to work. Details: - schema: DocumentRagQuery.fetch_limit (additive, backward compatible). - document_rag.py / rag.py: fetch_limit resolved in the processor (mirrors doc_limit); the core applies the heuristic default and derives synthesis provenance from the chunk-selection focus when reranking ran. - provenance: tg:ChunkSelection focus stage (mirrors tg:EdgeSelection). - request translator + client SDKs + CLI: fetch-limit / --fetch-limit, threaded exactly like doc_limit and the GraphRAG limits. - tests: no-op identity, over-fetch/narrow, explicit fetch_limit, heuristic default, floor-at-doc_limit, provenance lineage, cross-repo topic wiring. Reranking is skipped byte-identically when no reranker role is wired. Requires the companion trustgraph-templates change wiring the reranker topics into the document-rag flow (mirrors #279 for GraphRAG).	2026-07-02 09:50:13 +01:00
cybermaggedon	f18d48dc39	fix: simplify dashscope variant and route API calls through variants (#1012 ) Replace the client.post()/httpx bypass with standard SDK extra_body, confirmed working against DashScope. Make DashScope the base variant with Qwen as a subclass alias. Route all API calls through variant create_completion/create_completion_stream methods.	2026-07-02 09:12:55 +01:00
cybermaggedon	6887076ce0	feat: add dashscope variant for Alibaba Cloud DashScope API (#1010 ) DashScope uses enable_thinking as a top-level parameter rather than inside extra_body as the Qwen docs suggest.	2026-07-01 16:50:47 +01:00
cybermaggedon	55e2a2a3ce	feat: add guided macOS installer and developer install guide (#1003 ) Interactive bash installer (install_trustgraph.sh) that detects hardware, recommends an LLM mode (OpenAI or Ollama), installs missing prerequisites via Homebrew, sets up a Python venv, runs the test suite, generates a deployment via npx @trustgraph/config, starts the Docker Compose stack, health-checks the API gateway, and opens the Workbench UI. Includes README.dev-install.md with usage documentation covering CLI options, environment variables, LLM mode selection, non-interactive/CI usage, uninstall, and troubleshooting. Currently macOS only.	2026-07-01 16:50:14 +01:00
cybermaggedon	11ca7c89c4	feat: add GLM (Zhipu AI) variant for OpenAI processor (#1009 )	2026-07-01 16:20:43 +01:00
cybermaggedon	656ca430b9	fix: wire variant into text-completion integration test mocks (#1008 ) Tests using MagicMock processors need the variant, thinking mode, and _build_kwargs/_extract_content methods bound to work with the new variant-based API kwargs construction.	2026-07-01 15:40:23 +01:00
cybermaggedon	f20b50cfb2	feat: add API variant profiles and thinking support to OpenAI processor (#1007 ) Add a --variant flag (openai, deepseek, qwen, mistral, llama) that encapsulates provider-specific API differences: output token parameter names, thinking/reasoning toggles, temperature rules, and thinking output extraction. Add --thinking flag (off, low, medium, high) to control reasoning effort.	2026-07-01 14:48:32 +01:00
cybermaggedon	01cc8dbc64	feat: replace LLM edge scoring with cross-encoder reranker in GraphRAG (#1005 ) Replace the three-prompt LLM scoring pipeline (kg-edge-scoring, kg-edge-reasoning, kg-edge-selection) with a cross-encoder reranker service backed by FlashRank. The new hop_and_filter() method performs iterative graph traversal with semantic scoring at each hop, replacing the previous follow_edges/get_subgraph approach. - Add reranker service (trustgraph-base client/service, FlashRank processor) - Add gateway dispatch for reranker via API and WebSocket - Rewrite GraphRAG pipeline: hop_and_filter() with per-hop cross-encoder scoring - Remove kg_prompt() and edge_score_limit from prompt client - Update provenance: add tg:EdgeSelection type, tg:concept, tg:score predicates - Update CLIs (tg-invoke-graph-rag, tg-show-explain-trace) for new metadata - Add tg-invoke-reranker CLI tool - Add tech spec and UX developer guidance - Update all unit and integration tests	2026-06-30 14:36:37 +01:00
corvus-0x	1aa9549912	feat: make bootstrapper initialiser timeouts configurable (#999 ) * feat: make bootstrapper initialiser timeouts configurable DefaultFlowStart and WorkspaceInit hardcoded the request timeouts for their flow-svc and IAM calls, leaving operators no way to tune them for high-latency environments (#874). Expose them as constructor parameters threaded through the existing initialiser `params:` mechanism, defaulting to the current values so behaviour is unchanged unless explicitly overridden: - DefaultFlowStart: list_timeout=10 (list-flows), start_timeout=30 (start-flow) - WorkspaceInit: iam_timeout=10 (create-workspace) Add unit tests for the defaults, override storage, and that configured values reach the underlying request calls. * test: mark async bootstrap test with @pytest.mark.asyncio Addresses review feedback on PR #999: add the explicit @pytest.mark.asyncio decorator to test_run_forwards_configured_timeouts so it does not rely on asyncio_mode=auto and stays consistent with the rest of the suite.	2026-06-30 09:37:22 +01:00
cybermaggedon	5cb4f83afa	fix: list-my-workspaces permissions were broken (#1002 ) list-my-workspaces has AUTHENTICATED scope, so anyone is permitted to run the operation. No specific permission grant is needed.	2026-06-29 09:13:05 +01:00
cybermaggedon	0a828379be	feat: global usernames and rename workspace to default_workspace (#1001 ) Users are global entities, not scoped to workspaces. This change: Track A — Global usernames: - Change iam_users_by_username to PRIMARY KEY (username), removing workspace from the lookup key - Login looks up username globally, no workspace required - Username uniqueness is enforced globally, not per-workspace - Login -w now overrides the JWT workspace (session workspace) rather than selecting which user registry to search Track B — Rename workspace to default_workspace: - UserRecord.workspace → UserRecord.default_workspace - Identity.workspace → Identity.default_workspace - JWT claim "workspace" → "default_workspace" - IamResponse.resolved_workspace → resolved_default_workspace - WebSocket auth-ok frame field → default_workspace - Socket clients read default_workspace from auth-ok - _user_record_to_dict wire key → default_workspace - CLI help text and output updated throughout - Test files updated for renamed fields	2026-06-25 16:34:31 +01:00
cybermaggedon	16f8cfd972	fix: use envelope workspace for mux authorisation, not inner request body (#1000 ) The mux was extracting the authorisation resource workspace from the inner request body via registry extractors. But workspace-scoped services (config, flow, librarian, etc.) receive workspace from the queue identity, not the message body — the inner workspace field is a dead field that no service handler reads. This caused access-denied errors when the inner body's workspace (e.g. CLI default "default") disagreed with the caller's assigned workspace, even though the envelope workspace was correct. Fix: resolve workspace from the envelope only. Split the non-flow authorisation path by resource level — WORKSPACE ops use the envelope workspace directly; SYSTEM ops (IAM) still use registry extractors since they legitimately read operation-specific body fields.	2026-06-25 13:44:57 +01:00
Cyber MacGeddon	a3df4f62bb	Merge branch 'master' into release/v2.6	2026-06-22 21:20:29 +01:00
cybermaggedon	09b8a1d347	feat: fine-grained capabilities and enterprise IAM schema extensions (#996 ) Split coarse gateway capabilities into fine-grained variants to support per-operation access control in the enterprise IAM regime. Add additive schema fields for enterprise group and grant management. Capability split (gateway registry): - graph:read -> triples:read, sparql:read, graph-rag:read, graph-embeddings:read - graph:write -> triples:write, graph-embeddings:write, entity-contexts:write - documents:read -> documents:read, document-rag:read, document-embeddings:read, entity-contexts:read - documents:write -> documents:write, document-embeddings:write - rows:read -> rows:read, nlp-query:read, structured-query:read, row-embeddings:read OSS role definitions expanded to include all new fine-grained capability names — no behavioral change for OSS deployments. Schema additions (IamRequest): - group_id, member_type, member_id for group membership operations - group (GroupInput), grant (GrantInput) for create/update payloads - Decoder now handles capability, resource_json, parameters_json, authorise_checks (previously missing from translator) Schema additions (IamResponse): - group_json, groups_json, members_json, grants_json, effective_permissions_json for enterprise operation responses - Encoder now emits authorise decision fields Gateway registry: - 16 enterprise IAM operations registered (create-group, add-group-member, add-user-grant, etc.) under iam:admin capability	2026-06-22 20:23:34 +01:00
Jack Colquitt	fa264ded46	Update section titles for Holonic Context Graph (#995 )	2026-06-18 19:56:44 -07:00
Jack Colquitt	cae931409a	Update TrustGraph description in README (#994 ) Clarified the description of TrustGraph's capabilities and API integrations.	2026-06-18 19:46:25 -07:00
Jack Colquitt	6b0475e315	Revise README for clarity on TrustGraph features (#993 ) Updated the README to clarify the concept of holons and the functionality of TrustGraph. Improved the structure and flow of information regarding context management and agent explainability.	2026-06-18 19:42:16 -07:00
Jack Colquitt	cb0ad1a450	Change video link in README (#992 ) Updated video source link in README.md.	2026-06-17 17:52:30 -07:00
Jack Colquitt	fc0ecc770a	Format terms as code in README.md (#991 )	2026-06-17 16:53:22 -07:00
Jack Colquitt	345da375b1	Document Workspaces, Collections, and Flows in README (#990 ) Added section on Workspaces, Collections, and Flows to explain the organizational structure of TrustGraph.	2026-06-17 16:48:22 -07:00
Jack Colquitt	0ba1eeeda0	Enhance README with token consumption details (#989 ) Added a note about reducing token consumption in context management.	2026-06-17 16:27:16 -07:00
Jack Colquitt	eb1e38d7d0	Add hyperlink to 'holon' in README.md (#988 )	2026-06-17 16:13:41 -07:00
Jack Colquitt	b8770a6005	Update README with new context and features (#987 )	2026-06-17 16:08:22 -07:00
Jack Colquitt	28802a644a	Update license badge in README.md (#986 )	2026-06-11 20:45:39 -07:00
cybermaggedon	8797d9d9ff	feat: per-caller Bearer token auth and new query tools for MCP server (#984 ) Replace the broken GATEWAY_SECRET auth (token was sent as a query parameter, silently ignored by the gateway) with end-to-end Bearer token forwarding. Each MCP caller gets a dedicated WebSocket authenticated via the gateway's in-band first-frame protocol, with whoami verification on first connect. Also fix and extend the tool surface: - embeddings: accept list of texts (was single string) - triples_query: use Term wire format with compact keys (was legacy Value format), add collection and graph parameters - sparql_query: new tool for SPARQL SELECT/ASK/CONSTRUCT/DESCRIBE - graphql_query: new tool for structured data (rows) GraphQL queries - all tools: add optional workspace parameter	2026-06-10 14:11:49 +01:00
cybermaggedon	627c669097	feat: per-caller Bearer token auth and new query tools for MCP server (#984 ) Replace the broken GATEWAY_SECRET auth (token was sent as a query parameter, silently ignored by the gateway) with end-to-end Bearer token forwarding. Each MCP caller gets a dedicated WebSocket authenticated via the gateway's in-band first-frame protocol, with whoami verification on first connect. Also fix and extend the tool surface: - embeddings: accept list of texts (was single string) - triples_query: use Term wire format with compact keys (was legacy Value format), add collection and graph parameters - sparql_query: new tool for SPARQL SELECT/ASK/CONSTRUCT/DESCRIBE - graphql_query: new tool for structured data (rows) GraphQL queries - all tools: add optional workspace parameter	2026-06-10 14:10:43 +01:00
cybermaggedon	8b0619e5d8	Bump version numbers to 2.6 (#983 )	2026-06-09 20:03:14 +01:00
cybermaggedon	e3f9f8c357	Merge pull request #982 from trustgraph-ai/master master -> release/v2.6	2026-06-09 19:46:50 +01:00
Cyber MacGeddon	81d57826c8	Merge branch 'release/v2.5'	2026-06-09 19:43:31 +01:00
Jacob Molz	79d7ef6a90	fix: reject invalid PDF decoder input (#977 )	2026-06-09 16:37:39 +01:00
Jacob Molz	28a51c244f	fix: reject invalid PDF decoder input (#977 )	2026-06-09 16:37:10 +01:00
Cyber MacGeddon	fa5ebe2393	Merge branch 'release/v2.5'	2026-06-09 16:34:20 +01:00
cybermaggedon	e1c9351454	fix: update row query tests to mock async_execute_paged and async_scan (#979 ) The query service now uses async_execute_paged (indexed path) and async_scan (scan path) instead of async_execute. Tests were mocking the old function, causing them to hang indefinitely.	2026-06-09 16:29:32 +01:00
cybermaggedon	dbc21c0bb9	fix: structured data query and auth fixes (#978 ) - Pass auth token to schema discovery and descriptor generation in tg-load-structured-data, fixing 401 errors with IAM enabled - Fix row query pagination: replace single-page async_execute with async_scan that streams pages and applies filters without materialising the full result set (OOM on large datasets) - Add missing filter operators (not, startsWith, endsWith, not_in) to row query post-filter matching - Fall back to scan path when an indexed field is queried with an empty string value, since empty index values are not stored - Revert top-level indexes array support — the current table schema overwrites rows with duplicate index values, so only primary_key fields are safe to index until the schema is redesigned	2026-06-08 15:22:11 +01:00
cybermaggedon	08bfec1539	fix: wire replication params through YAML/params path for Cassandra and Qdrant (#976 ) resolve_cassandra_config did not accept replication_factor as a kwarg, so cassandra_replication_factor from YAML params was silently ignored by all 6 callers. Add the kwarg and pass it from every caller. Same fix for Qdrant: 3 writers now pass qdrant_replication_factor and qdrant_shard_number from params. Add tests covering the params path for both helpers.	2026-06-04 12:36:36 +01:00
cybermaggedon	4913f8c2eb	feat: data store replication configuration and TLS upgrade (#975 ) - Add centralised qdrant_config.py helper with env-var fallback for QDRANT_URL, QDRANT_API_KEY, QDRANT_REPLICATION_FACTOR, QDRANT_SHARD_NUMBER - Update all 6 Qdrant processors to use the helper; writers pass replication_factor and shard_number to create_collection - Fix hardcoded Cassandra replication_factor=1 in cassandra_kg.py, write.py, and sparql_cassandra.py to respect CASSANDRA_REPLICATION_FACTOR - Upgrade Cassandra TLS from deprecated PROTOCOL_TLSv1_2 to ssl.create_default_context() across all connectors	2026-06-04 11:49:29 +01:00
cybermaggedon	acf182c265	feat: add env-var fallback for librarian object-store config (#974 ) The librarian now reads OBJECT_STORE_ENDPOINT, OBJECT_STORE_ACCESS_KEY, OBJECT_STORE_SECRET_KEY, OBJECT_STORE_REGION, and OBJECT_STORE_USE_SSL from the environment when not set via params. This lets K8s Secrets supply credentials without them appearing in launch.yaml.	2026-06-03 10:59:58 +01:00
cybermaggedon	6df7471a55	feat: complete knowledge core storage — named graphs, provenance, source material (#973 ) Implements all three changes from the knowledge-core-completeness tech spec: 1. Named graph field preserved through Cassandra storage (7-element tuple), enabling provenance triples to retain their graph URIs on round-trip. 2. Provenance triples already arrive on triples-input — no routing change needed; Change 1 was sufficient. 3. Source material (library documents) streamed alongside triples and embeddings during core download/upload. The knowledge manager fetches the document hierarchy from the librarian on download and recreates it on upload, preserving the full provenance chain across instances.	2026-06-03 10:46:52 +01:00
cybermaggedon	aa158e1ba3	fix: skip authorise() for AUTHENTICATED/PUBLIC sentinels in WebSocket mux (#972 ) The mux unconditionally called auth.authorise() for every operation, passing capability sentinels like AUTHENTICATED ("__authenticated__") to the IAM regime. Since no role grants "__authenticated__", the regime denied the request — breaking whoami (and any future AUTHENTICATED-only operation) over the WebSocket path while the HTTP endpoints worked fine. Match the guard pattern used by iam_endpoint.py and registry_endpoint.py: only call authorise() for real capability strings, not sentinels.	2026-06-03 09:45:53 +01:00
cybermaggedon	60f861bac4	Added an instance tag ID (#971 )	2026-06-02 14:49:24 +01:00
cybermaggedon	00bb964e93	fix: route workspace through bulk WebSocket clients and merge query params (#970 ) Bulk clients (sync and async) were not forwarding the workspace parameter, causing all bulk operations to hit the default workspace regardless of the Api instance's workspace setting. Also fixes the gateway socket endpoint to pass query parameters (including workspace) to the dispatcher, and prevents the auth handshake from overwriting an explicitly set workspace. Updates knowledge table store tests for paged query interface.	2026-06-02 14:19:15 +01:00
cybermaggedon	6b1dd16f9f	fix: large document handling and Cassandra query pagination (#969 ) - Paginate heavy Cassandra reads (triples, graph/document embeddings) using synchronous session.execute() in run_in_executor with fetch_size paging, preventing materialization hang on large result sets - Fix document stream endpoint to use workspace-scoped librarian queues - Add decoder error handling for PDF/OCR/unstructured processors - Add WebSocket mux guards for missing auth fields - Add null check in librarian document streaming - Rewrite get_document_content CLI to stream via librarian - Add Poppler dependency to unstructured container	2026-06-01 22:39:30 +01:00
Jack Colquitt	97453d9b83	Change project title to 'The semantic deployment platform' (#968 ) Updated the project title in the README.	2026-06-01 14:08:30 -07:00
cybermaggedon	7e1fb76bc9	Fix HF embeddings tests (#967 ) The tests were patching trustgraph.embeddings.hf.hf.HuggingFaceEmbeddings - a module-level attribute that doesn't exist because HuggingFaceEmbeddings is imported locally inside _load_model. Changed all 8 occurrences to patch langchain_huggingface.HuggingFaceEmbeddings, which is the actual import source the code uses at runtime.	2026-06-01 12:35:09 +01:00
cybermaggedon	e6dfccc56d	fix: WebSocket auth handshake overwriting explicit workspace (#966 ) The auth-ok response includes the token's bound workspace, and AsyncSocketClient was unconditionally adopting it — clobbering any workspace the caller explicitly requested via the constructor.	2026-06-01 12:25:19 +01:00
cybermaggedon	d1e6b99e96	fix: CLI tools ignoring -w flag for workspace routing (#964 ) Several CLI commands silently routed requests to the default workspace regardless of the -w flag: show-flows, show-flow-blueprints, show-parameter-types, set-prompt --system, and load-structured-data. The workspace was sent in the inner request body but not on the WebSocket envelope or API client constructor, so the gateway always dispatched to the default workspace queue.	2026-06-01 09:53:28 +01:00
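
The direction-aware reranker text described in commit db7fdbc652 can be sketched as below. This is a minimal illustration of the three text formats listed in that commit message, not the repository's code: the `Triple` type, the `reranker_text` function name, and the single-letter direction strings are assumptions made for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Illustrative triple type; hypothetical, not TrustGraph's own class."""
    s: str
    p: str
    o: str


def reranker_text(triple: Triple, direction: str) -> str:
    """Return only the parts of the triple that are new relative to the frontier."""
    if direction == "S":  # subject is the frontier entity
        return f"{triple.p} {triple.o}"
    if direction == "O":  # object is the frontier entity
        return f"{triple.s} {triple.p}"
    if direction == "P":  # predicate is the frontier entity
        return f"{triple.s} {triple.o}"
    raise ValueError(f"unknown direction: {direction!r}")


# Two triples reached inward from the shared object node "Processors".
hits = [
    (Triple("cpu-a", "hasSubcategory", "Processors"), "O"),
    (Triple("cpu-b", "hasSubcategory", "Processors"), "O"),
]
texts = [reranker_text(t, d) for t, d in hits]
print(texts)
```

Because the subject is kept when traversing from the object side, the two texts differ, which is the duplicate-text problem the commit message says it eliminates.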

1449 commits
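
The two-limit rule from commit 6c9a545a06 (a `fetch_limit` of 0 derives the pool size from `doc_limit`, and values below `doc_limit` are raised to it) can be sketched as follows. The function name `resolve_fetch_limit` is hypothetical, and the value of `OVERFETCH_FACTOR` is an assumption: the commit message names the constant but does not state its value.

```python
OVERFETCH_FACTOR = 3  # assumed value; the commit message does not give the real one


def resolve_fetch_limit(doc_limit: int, fetch_limit: int = 0) -> int:
    """Resolve the candidate pool size fetched before reranking."""
    if fetch_limit == 0:
        # 0 means "derive": over-fetch a multiple of the chunks to be kept.
        return OVERFETCH_FACTOR * doc_limit
    # Never fetch fewer candidates than will be selected for synthesis.
    return max(fetch_limit, doc_limit)


print(resolve_fetch_limit(10))      # derived from doc_limit
print(resolve_fetch_limit(10, 50))  # explicit pool size kept
print(resolve_fetch_limit(10, 4))   # raised to doc_limit
```

Keeping the two limits separate lets a caller widen the candidate pool without changing how many chunks reach the synthesis prompt.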