Commit graph

1489 commits

Author SHA1 Message Date
elpresidank
580ee319a3 fix: prevent dispatcher race condition via promise-based lazy init
Store the initialization Promise in the requestors map synchronously
before yielding, so concurrent callers for the same key await the same
instance — prevents orphaned RequestResponse objects and duplicate NATS
subscriptions. Mirrors upstream fix 8f18ba02.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:11:11 -05:00
elpresidank
2a2e8e76a3 Merge remote-tracking branch 'origin/master' into ts-port 2026-04-07 10:51:24 -05:00
elpresidank
5e3929a883 fix: comprehensive QA audit — light mode, accessibility, error handling, code quality
- Fix light mode: theme-aware graph node labels, remove prose-invert for
  theme-safe markdown, add brand/semantic color overrides for light backgrounds
- Add 404 catch-all route redirecting unknown paths to /chat
- FalkorDB: add .catch() to connectPromise, add ensureConnected() to all
  store methods (createLiteral, relateNode, relateLiteral, deleteCollection)
- Accessibility: dialog role/aria-modal, toast aria-live, dismiss/zoom/search
  button aria-labels, close panel aria-label
- Lazy-load ForceGraph2D (splits 189KB into separate chunk, main bundle -26%)
- Cap conversation localStorage at 200 messages to prevent quota overflow
- Fix pnpm test: add --passWithNoTests to cli/mcp packages
- Add upload error notification instead of silent catch
- Remove unused class-variance-authority dep and dead tabs.tsx component
- Add @types/node to flow package devDependencies
- Remove stale FIXME comment in messages.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:15:59 -05:00
cybermaggedon
c20e6540ec
Subscriber resilience and RabbitMQ fixes (#765)
Subscriber resilience: recreate consumer after connection failure

- Move consumer creation from Subscriber.start() into the run() loop,
  matching the pattern used by Consumer. If the connection drops and the
  consumer is closed in the finally block, the loop now recreates it on
  the next iteration instead of spinning forever on a None consumer.

Consumer thread safety:
- Dedicated ThreadPoolExecutor per consumer so all pika operations
  (create, receive, acknowledge, negative_acknowledge) run on the
  same thread — pika BlockingConnection is not thread-safe
- Applies to both Consumer and Subscriber classes

Config handler type audit — fix four mismatched type registrations:
- librarian: was ["librarian"] (non-existent type), now ["flow",
  "active-flow"] (matches config["flow"] that the handler reads)
- cores/service: was ["kg-core"], now ["flow"] (reads
  config["flow"])
- metering/counter: was ["token-costs"], now ["token-cost"]
  (singular)
- agent/mcp_tool: was ["mcp-tool"], now ["mcp"] (reads
  config["mcp"])

Update tests
2026-04-07 14:51:14 +01:00
elpresidank
9ef9ef854f fix: iterative QA pass — resolve remaining bugs, UX and accessibility improvements
Three QA iterations to convergence (zero issues remaining):

Workbench UI:
- Connection badge: amber "Connected (no auth)" for unauthenticated state
- Theme persistence: restore script in index.html + localStorage sync
- Settings About section: add bottom padding so content isn't clipped
- Clear messages: cancel in-flight requests when clearing chat
- Feature switch labels: proper casing + acronym handling (MCP, LLM)
- Token Cost badge: hidden during loading state
- ARIA: role="switch", aria-checked on toggles, aria-labels on buttons
- ConfigApi: null-safe chaining for getPrompts/getSystemPrompt

Grafana dashboards:
- Auto-refresh 30s on all 3 dashboards
- Panel heights reduced to fit viewport without scrolling
- Anonymous role upgraded to Editor for Explore access

Infrastructure:
- Nginx: DNS resolver with variable-based upstream (prevents crash loop)
- Workbench port set to 3002 in .env

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:33:22 -05:00
cybermaggedon
ddd4bd7790
Deliver explainability triples inline in retrieval response stream (#763)
Provenance triples are now included directly in explain messages from
GraphRAG, DocumentRAG, and Agent services, eliminating the need for
follow-up knowledge graph queries to retrieve explainability details.

Each explain message in the response stream now carries:
- explain_id: root URI for this provenance step (unchanged)
- explain_graph: named graph where triples are stored (unchanged)
- explain_triples: the actual provenance triples for this step (new)

Changes across the stack:
- Schema: added explain_triples field to GraphRagResponse,
  DocumentRagResponse, and AgentResponse
- Services: all explain message call sites pass triples through
  (graph_rag, document_rag, agent react, agent orchestrator)
- Translators: encode explain_triples via TripleTranslator for
  gateway wire format
- Python SDK: ProvenanceEvent now includes parsed ExplainEntity
  and raw triples; expanded event_type detection
- CLI: invoke_graph_rag, invoke_agent, invoke_document_rag use
  inline entity when available, fall back to graph query
- Tech specs updated

Additional explainability test
2026-04-07 12:19:05 +01:00
cybermaggedon
2f8d6a3ffb
Fix agent config handler registration, remove debug prints, disable RabbitMQ heartbeats (#764)
- Fix agent react and orchestrator services appending bare methods
  to config_handlers instead of using register_config_handler() —
  caused 'method object is not subscriptable' on config notify
- Add exc_info to config fetch retry logging for proper tracebacks
- Remove debug print statements from collection management
  dispatcher and translator
- Disable RabbitMQ heartbeats (heartbeat=0) to prevent broker
  closing idle producer connections that can't process heartbeat
  frames from BlockingConnection
2026-04-07 12:11:12 +01:00
Sreeram Venkatasubramanian
c737e8c356
fix: reduce consumer poll timeout from 2000ms to 100ms (#761) 2026-04-07 12:09:20 +01:00
Sreeram Venkatasubramanian
f0c9039b76 fix: reduce consumer poll timeout from 2000ms to 100ms 2026-04-07 12:02:27 +01:00
elpresidank
3a80872482 fix: comprehensive QA — resolve 13 bugs, add UX improvements across all services
Client SDK: add .catch() to graphRagStreaming/documentRagStreaming (silent timeout),
null-guard JSON.parse in getPrompts/getSystemPrompt/getPrompt.

Backend: implement "getvalues" config operation for token costs, null-check
createTerm() in FalkorDB triples query, add knowledge-cores service entrypoint
and Docker entry, return proper HTTP 400/404 for gateway error responses.

Workbench: cancel button + elapsed timer for chat, clear agent spinner on error,
flow dialog inline validation, responsive header wrapping, knowledge cores
loading timeout, sidebar/page naming consistency, theme toggle indicator.

Infrastructure: enable Grafana Explore for viewers, add gateway Prometheus
scrape target, fix RAG pipeline dashboard layout (6 panels visible),
filter Service Health to configured targets only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 05:20:10 -05:00
elpresidank
72870a7e2e feat: add unit tests, Docker polish, and workbench UX improvements
Unit tests: Consumer class (7), recursive-splitter (10), parseJsonResponse (11) — 28 total.
Docker: add 5 commented LLM provider services, dev compose override, .env.example.
Workbench: chat persistence, error boundary, disconnect banner, prompts error handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 03:51:29 -05:00
elpresidank
c7eefee607 feat: add Docker entrypoints, LLM providers, pipeline hardening, workbench pages
Phase 9 — four parallel workstreams:

- Stream A: 14 Docker entrypoints for containerized deployment
- Stream B: Pipeline hardening — robust JSON parsing, LLM retry logic,
  consumer negative-ack, FalkorDB test import fix
- Stream C: Azure OpenAI, OpenAI-compatible, and Mistral LLM providers
- Stream D: Workbench Prompts, Token Cost, Knowledge Cores pages +
  Settings feature switches

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 03:22:55 -05:00
elpresidank
50fb311d2d feat: real PDF pipeline test — end-to-end knowledge extraction working
Add full pipeline test that generates a real PDF, processes it through
the entire pipeline, and verifies knowledge lands in FalkorDB:

- Create test PDF generator using pdf-lib (2-page doc about Acme Corp)
- Add testFullPipeline() to integration tests with store verification
- Fix FalkorDB client connect() — createClient returns unconnected client
  in both TriplesStore and TriplesQuery classes

Results: PDF decoded (2 pages) → chunked (2 chunks) → extracted
(4 relationships) → 16 triples stored in FalkorDB including:
  alice-johnson → is-a-senior-engineer → acme-corporation
  cloudsync → uses-aws-for-hosting → amazon-web-services
  provenance: pages → prov:wasDerivedFrom → source document

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 02:19:12 -05:00
elpresidank
5bc7a1b6fc fix: resolve FlowProcessor topic collisions, librarian timeout, tests
Two bugs found during end-to-end testing:

1. FlowProcessor never restarted flows when config changed — it only
   started them once. Stale NATS JetStream data from previous sessions
   caused services to bind to wrong topics. Fix: stop and restart flows
   on every config push that includes flow definitions.

2. Gateway publishToTopic sent messages without an id property. Pipeline
   FlowProcessor handlers check properties.id and silently return if
   missing. Fix: auto-generate a message id when publishing to topics.

Both fixes validated: 13/13 integration tests passing, PDF decoder
correctly receives and processes document messages through the pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 01:53:55 -05:00
elpresidank
c545213224 feat: add query/retrieval FlowProcessor services and missing runner scripts
Wire up the query and retrieval side of the pipeline so the agent can
answer questions from stored knowledge:

- Triples query service (FalkorDB) — all SPO pattern queries via NATS
- Graph embeddings query service (Qdrant) — entity vector similarity
- Document embeddings query service (Qdrant) — chunk vector similarity
- Graph RAG service — full concept→entity→traverse→score→synthesize pipeline
- Document RAG service — embed→find chunks→synthesize pipeline
- Runner scripts for chunker, extractor, embeddings (missing from Phase 5)
- Add DocumentEmbeddingsRequest/Response schema types
- Add RAG prompt templates (extract-concepts, edge-scoring, synthesize)
- Add graph/doc embeddings query topics to seed config + flow manager
- Add all pipeline/query/retrieval services to docker-compose
- 8 new runner scripts, 8 new pnpm script aliases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 01:05:54 -05:00
elpresidank
8f7008822a feat: add document pipeline — PDF decoder, Ollama LLM, storage services
Add end-to-end document processing pipeline:
- PDF decoder service (pdfjs-dist) extracts text per page from librarian docs
- Ollama native LLM service for local model inference
- FalkorDB triples store FlowProcessor consumer
- Qdrant graph embeddings store FlowProcessor consumer
- Fix spec name collisions in chunker/extractor (input→chunk-input, etc.)
- Gateway /load endpoint to trigger document processing
- Align flow manager blueprint and seed config with full pipeline topics
- Add runner scripts and test coverage for document load

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:47:43 -05:00
elpresidank
8f9de7604e fix: make abstract class constructors protected
Marks FlowProcessor and EmbeddingsService constructors as protected
since these classes should only be instantiated via subclasses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:52:00 -05:00
cybermaggedon
4acd853023
Config push notify pattern: replace stateful pub/sub with signal+ fetch (#760)
Replace the config push mechanism that broadcast the full config
blob on a 'state' class pub/sub queue with a lightweight notify
signal containing only the version number and affected config
types. Processors fetch the full config via request/response from
the config service when notified.

This eliminates the need for the pub/sub 'state' queue class and
stateful pub/sub services entirely. The config push queue moves
from 'state' to 'flow' class — a simple transient signal rather
than a retained message.  This solves the RabbitMQ
late-subscriber problem where restarting processes never received
the current config because their fresh queue had no historical
messages.

Key changes:
- ConfigPush schema: config dict replaced with types list
- Subscribe-then-fetch startup with retry: processors subscribe
  to notify queue, fetch config via request/response, then
  process buffered notifies with version comparison to avoid race
  conditions
- register_config_handler() accepts optional types parameter so
  handlers only fire when their config types change
- Short-lived config request/response clients to avoid subscriber
  contention on non-persistent response topics
- Config service passes affected types through put/delete/flow
  operations
- Gateway ConfigReceiver rewritten with same notify pattern and
  retry loop

Tests updated

New tests:
- register_config_handler: without types, with types, multiple
  types, multiple handlers
- on_config_notify: old/same version skipped, irrelevant types
  skipped (version still updated), relevant type triggers fetch,
  handler without types always called, mixed handler filtering,
  empty types invokes all, fetch failure handled gracefully
- fetch_config: returns config+version, raises on error response,
  stops client even on exception
- fetch_and_apply_config: applies to all handlers on startup,
  retries on failure
2026-04-06 16:57:27 +01:00
V.Sreeram
d4723566cb fix: prevent duplicate dispatcher creation race condition in invoke_global_service (#715)
* fix: prevent duplicate dispatcher creation race condition in invoke_global_service

Concurrent coroutines could all pass the `if key in self.dispatchers` check
before any of them wrote the result back, because `await dispatcher.start()`
yields to the event loop. This caused multiple Pulsar consumers to be created
on the same shared subscription, distributing responses round-robin and
dropping ~2/3 of them — manifesting as a permanent spinner in the Workbench UI.

Apply a double-checked asyncio.Lock in both `invoke_global_service` and
`invoke_flow_service` so only one dispatcher is ever created per service key.

* test: add concurrent-dispatch tests for race condition fix

Add asyncio.gather-based tests that verify invoke_global_service and
invoke_flow_service create exactly one dispatcher under concurrent calls,
preventing the duplicate Pulsar consumer bug.
2026-04-06 11:14:32 +01:00
V.Sreeram
8f18ba0257
fix: prevent duplicate dispatcher creation race condition in invoke_global_service (#715)
* fix: prevent duplicate dispatcher creation race condition in invoke_global_service

Concurrent coroutines could all pass the `if key in self.dispatchers` check
before any of them wrote the result back, because `await dispatcher.start()`
yields to the event loop. This caused multiple Pulsar consumers to be created
on the same shared subscription, distributing responses round-robin and
dropping ~2/3 of them — manifesting as a permanent spinner in the Workbench UI.

Apply a double-checked asyncio.Lock in both `invoke_global_service` and
`invoke_flow_service` so only one dispatcher is ever created per service key.

* test: add concurrent-dispatch tests for race condition fix

Add asyncio.gather-based tests that verify invoke_global_service and
invoke_flow_service create exactly one dispatcher under concurrent calls,
preventing the duplicate Pulsar consumer bug.
2026-04-06 11:13:59 +01:00
Alex Jenkins
10a931f04c Feat: Auto-pull missing Ollama models (#757)
* fix deadlink in readme

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>

* feat: Auto-pull Ollama models

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>

* fix: Restore namespace __init__.py files for package resolution

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>

* fix CI

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>
2026-04-06 11:10:53 +01:00
Alex Jenkins
7daa06e9e4
Feat: Auto-pull missing Ollama models (#757)
* fix deadlink in readme

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>

* feat: Auto-pull Ollama models

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>

* fix: Restore namespace __init__.py files for package resolution

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>

* fix CI

Signed-off-by: Jenkins, Kenneth Alexander <kjenkins60@gatech.edu>
2026-04-06 11:10:14 +01:00
elpresidank
25d4227cb5 fix: resolve FlowProcessor topic collisions, librarian timeout, tests
Fix critical bug where all FlowProcessor services shared the same spec
names ("request"/"response"), causing them to steal each other's NATS
topics. Now each service uses unique spec names matching the flow config
topic keys (e.g., "text-completion-request", "prompt-request",
"agent-request").

Fix librarian NATS consumer timeout (500ms → 2000ms, below NATS minimum).

Update seed-config and test-pipeline with correct flow topic mappings.
Add prompt template runner script.

Smoke test results: 11/11 passing (config CRUD, WebSocket, LLM,
librarian CRUD). Agent routing verified via manual curl test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:02:10 -05:00
elpresidank
515fc0c264 fix: Docker build fixes, add agent/librarian/flow-manager to compose
Fix Containerfiles:
- Move tsconfig.json to workspace config layer for early availability
- Add missing workspace package.json entries for pnpm lockfile resolution

Docker Compose:
- Move Grafana from port 3000 to 3030 (avoid conflicts)
- Add agent, librarian, and flow-manager app services
- Add librarian-data volume for document persistence

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:41:01 -05:00
elpresidank
7db5a1023e feat: add flow manager, config seeding, and expanded integration tests
Flow Management Service:
- FlowManagerService (AsyncProcessor) handling list/get/start/stop flows
  and list/get blueprints via kebab-case wire format
- Default blueprint with all service topic mappings
- Pushes flow config to config service on start/stop

Config Seeding:
- seed-config.ts script pushes prompt templates (extract-relationships,
  extract-definitions, document-prompt, kg-prompt) and default flow
  definition via gateway REST API

Integration Tests:
- Librarian CRUD: add-document, list-documents, get-content, delete
- Agent query: verifies routing through gateway to agent service
- Skip flags: SKIP_LIBRARIAN=1, SKIP_AGENT=1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:37:03 -05:00
elpresidank
d1f24cf759 feat: add Docker deployment with Containerfile, entrypoints, and nginx
Multi-stage Containerfile for all Node.js services (single image,
different CMD per docker-compose service). ESM entrypoints for gateway,
config, text-completion, prompt, embeddings, agent, and librarian.

Workbench gets a separate Containerfile (nginx:alpine) with SPA routing
and API/WebSocket proxy to gateway.

Docker Compose updated with 6 app services (gateway, config-service,
text-completion, prompt, embeddings, workbench) using shared
trustgraph-ts:local image.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:21:00 -05:00
elpresidank
f09ef4de45 feat: add document pipeline, ReAct agent, and knowledge core services
Document Pipeline (Team A):
- LibrarianService: document storage with filesystem backend, metadata
  persistence, child document hierarchy, collection management
- ChunkingService: recursive character text splitter with configurable
  chunk size/overlap, FlowProcessor pattern
- KnowledgeExtractService: combined relationship + definition extraction
  using prompt service and LLM, emits RDF triples and entity contexts
- KnowledgeCoreService: knowledge core CRUD with streaming export and
  flow-based loading

ReAct Agent (Team B):
- StreamingReActParser: state machine for parsing LLM output into
  Thought/Action/ActionInput/FinalAnswer sections
- Three MVP tools: KnowledgeQuery (GraphRAG), DocumentQuery (DocRAG),
  TriplesQuery with RequestResponse clients
- AgentService FlowProcessor with ReAct loop, tool execution, and
  streaming chunk responses (thought/observation/answer)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:19:37 -05:00
elpresidank
5ed3f0e2d8 feat: add schema foundation for document pipeline, agent, and deployment
Add missing topics (librarian, knowledge, collection-management, flow),
pipeline message types (TextDocument, Chunk, Triples, EntityContexts),
service message types (Librarian, Knowledge, Collection, Flow CRUD),
and update AgentResponse for streaming chunk format.

Add RequestResponseSpec enabling flow-scoped request/response calls
(needed by knowledge extraction and agent services). Add requestor
registry to Flow class with proper lifecycle management.

Add end_of_dialog to gateway's isComplete() check for agent streaming.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:11:29 -05:00
elpresidank
28747e1a92 fix: NATS pipeline bugs, add integration tests and service runners
Fix three critical bugs preventing the NATS message pipeline from working:

- FlowProcessor now subscribes to config-push topic (was missing entirely),
  using DeliverPolicy.All to replay config on service restart
- NATS streams use wildcard subjects (tg.flow.>) instead of per-topic
  narrow filters that caused 503 errors on publish
- Subscriber dispatch loop has exponential backoff on errors to prevent
  tight error loops

Add service runner scripts (gateway, config, LLM) and a 7-test
integration suite that verifies config CRUD, WebSocket round-trip,
and full LLM text-completion through the NATS pipeline.

Fix Docker Compose infra: pin Tempo to v2.6.1, remove deprecated Loki
config fields, add user:0 for volume permissions, remap conflicting
ports (FalkorDB 6380, OTLP 4327/4328).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:41:39 -05:00
elpresidank
0042f9259c fix: linter cleanup on flow service implementations
Minor fixes from linter: readonly modifiers, unused parameter prefixes,
type narrowing in graph-rag BFS traversal and edge scoring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:52:40 -05:00
elpresidank
b6536eca38 init 2026-04-05 22:44:45 -05:00
elpresidank
c386f68743 Merge commit '74cc8a4685' as 'ai-context/trustgraph-templates' 2026-04-05 21:09:49 -05:00
elpresidank
74cc8a4685 Squashed 'ai-context/trustgraph-templates/' content from commit 42a5fd1b
git-subtree-dir: ai-context/trustgraph-templates
git-subtree-split: 42a5fd1b678f32be378062e30451e2052ccb95dd
2026-04-05 21:09:49 -05:00
elpresidank
e26caa0b12 saving 2026-04-05 21:09:33 -05:00
elpresidank
9e9307a2aa Merge commit 'ad40332d56' as 'ai-context/trustgraph-templates' 2026-04-05 21:08:57 -05:00
elpresidank
ad40332d56 Squashed 'ai-context/trustgraph-templates/' content from commit 338a8ffa
git-subtree-dir: ai-context/trustgraph-templates
git-subtree-split: 338a8ffadb1439013071ae922e55ed2421f17025
2026-04-05 21:08:57 -05:00
elpresidank
ecaf3489f1 Merge commit '9b2f675702' as 'ai-context/context-graph-demo' 2026-04-05 21:08:35 -05:00
elpresidank
9b2f675702 Squashed 'ai-context/context-graph-demo/' content from commit 338a8ffa
git-subtree-dir: ai-context/context-graph-demo
git-subtree-split: 338a8ffadb1439013071ae922e55ed2421f17025
2026-04-05 21:08:35 -05:00
elpresidank
1a72bfdec0 Merge commit 'a8390532f7' as 'ai-context/workbench-ui' 2026-04-05 21:08:02 -05:00
elpresidank
a8390532f7 Squashed 'ai-context/workbench-ui/' content from commit 32e36a5c
git-subtree-dir: ai-context/workbench-ui
git-subtree-split: 32e36a5c2131e429a7081cfaf67dabad3193cda3
2026-04-05 21:08:02 -05:00
elpresidank
05d87964c2 Merge commit 'deff028fed' as 'ai-context/trustgraph-client' 2026-04-05 21:07:35 -05:00
elpresidank
deff028fed Squashed 'ai-context/trustgraph-client/' content from commit 908f18cf
git-subtree-dir: ai-context/trustgraph-client
git-subtree-split: 908f18cf814470ec3b72cc336bb945fb792ffdec
2026-04-05 21:07:35 -05:00
Jack Colquitt
be443a1679
Refine README content and remove Table of Contents (#759)
Updated the README to improve clarity and remove the Table of Contents section.
2026-04-04 13:40:12 -07:00
Jack Colquitt
8d1a4ae3bf
Revise quickstart instructions in README.md (#758)
Updated the README to clarify the configuration process and improve wording.
2026-04-04 13:34:12 -07:00
cybermaggedon
ee65d90fdd
SPARQL service supports batching/streaming (#755) 2026-04-02 17:54:07 +01:00
cybermaggedon
d9dc4cbab5
SPARQL query service (#754)
SPARQL 1.1 query service wrapping pub/sub triples interface

Add a backend-agnostic SPARQL query service that parses SPARQL
queries using rdflib, decomposes them into triple pattern lookups
via the existing TriplesClient pub/sub interface, and performs
in-memory joins, filters, and projections.

Includes:
- SPARQL parser, algebra evaluator, expression evaluator, solution
  sequence operations (BGP, JOIN, OPTIONAL, UNION, FILTER, BIND,
  VALUES, GROUP BY, ORDER BY, LIMIT/OFFSET, DISTINCT, aggregates)
- FlowProcessor service with TriplesClientSpec
- Gateway dispatcher, request/response translators, API spec
- Python SDK method (FlowInstance.sparql_query)
- CLI command (tg-invoke-sparql-query)
- Tech spec (docs/tech-specs/sparql-query.md)

New unit tests for SPARQL query
2026-04-02 17:21:39 +01:00
cybermaggedon
62c30a3a50
Skip Pulsar check in tg-verify-system-status (#753) 2026-04-02 13:20:39 +01:00
cybermaggedon
24f0190ce7
RabbitMQ pub/sub backend with topic exchange architecture (#752)
Adds a RabbitMQ backend as an alternative to Pulsar, selectable via
PUBSUB_BACKEND=rabbitmq. Both backends implement the same PubSubBackend
protocol — no application code changes needed to switch.

RabbitMQ topology:
- Single topic exchange per topicspace (e.g. 'tg')
- Routing key derived from queue class and topic name
- Shared consumers: named queue bound to exchange (competing, round-robin)
- Exclusive consumers: anonymous auto-delete queue (broadcast, each gets
  every message). Used by Subscriber and config push consumer.
- Thread-local producer connections (pika is not thread-safe)
- Push-based consumption via basic_consume with process_data_events
  for heartbeat processing

Consumer model changes:
- Consumer class creates one backend consumer per concurrent task
  (required for pika thread safety, harmless for Pulsar)
- Consumer class accepts consumer_type parameter
- Subscriber passes consumer_type='exclusive' for broadcast semantics
- Config push consumer uses consumer_type='exclusive' so every
  processor instance receives config updates
- handle_one_from_queue receives consumer as parameter for correct
  per-connection ack/nack

LibrarianClient:
- New shared client class replacing duplicated librarian request-response
  code across 6+ services (chunking, decoders, RAG, etc.)
- Uses stream-document instead of get-document-content for fetching
  document content in 1MB chunks (avoids broker message size limits)
- Standalone object (self.librarian = LibrarianClient(...)) not a mixin
- get-document-content marked deprecated in schema and OpenAPI spec

Serialisation:
- Extracted dataclass_to_dict/dict_to_dataclass to shared
  serialization.py (used by both Pulsar and RabbitMQ backends)

Librarian queues:
- Changed from flow class (persistent) back to request/response class
  now that stream-document eliminates large single messages
- API upload chunk size reduced from 5MB to 3MB to stay under broker
  limits after base64 encoding

Factory and CLI:
- get_pubsub() handles 'rabbitmq' backend with RabbitMQ connection params
- add_pubsub_args() includes RabbitMQ options (host, port, credentials)
- add_pubsub_args(standalone=True) defaults to localhost for CLI tools
- init_trustgraph skips Pulsar admin setup for non-Pulsar backends
- tg-dump-queues and tg-monitor-prompts use backend abstraction
- BaseClient and ConfigClient accept generic pubsub config
2026-04-02 12:47:16 +01:00
cybermaggedon
4fb0b4d8e8
Pub/sub abstraction: decouple from Pulsar (#751)
Remove Pulsar-specific concepts from application code so that
the pub/sub backend is swappable via configuration.

Rename translators:
- to_pulsar/from_pulsar → decode/encode across all translator
  classes, dispatch handlers, and tests (55+ files)
- from_response_with_completion → encode_with_completion
- Remove pulsar.schema.Record from translator base class

Queue naming (CLASS:TOPICSPACE:TOPIC):
- Replace topic() helper with queue() using new format:
  flow:tg:name, request:tg:name, response:tg:name, state:tg:name
- Queue class implies persistence/TTL (no QoS in names)
- Update Pulsar backend map_topic() to parse new format
- Librarian queues use flow class (persistent, for chunking)
- Config push uses state class (persistent, last-value)
- Remove 15 dead topic imports from schema files
- Update init_trustgraph.py namespace: config → state

Confine Pulsar to pulsar_backend.py:
- Delete legacy PulsarClient class from pubsub.py
- Move add_args to add_pubsub_args() with standalone flag
  for CLI tools (defaults to localhost)
- PulsarBackendConsumer.receive() catches _pulsar.Timeout,
  raises standard TimeoutError
- Remove Pulsar imports from: async_processor, flow_processor,
  log_level, all 11 client files, 4 storage writers, gateway
  service, gateway config receiver
- Remove log_level/LoggerLevel from client API
- Rewrite tg-monitor-prompts to use backend abstraction
- Update tg-dump-queues to use add_pubsub_args

Also: pubsub-abstraction.md tech spec covering problem statement,
design goals, as-is requirements, candidate broker assessment,
approach, and implementation order.
2026-04-01 20:16:53 +01:00
cybermaggedon
dbf8daa74a Additional agent DAG tests (#750)
- test_agent_provenance.py: test_session_parent_uri,
  test_session_no_parent_uri, and 6 synthesis tests (types,
  single/multiple parents, document, label)
- test_on_action_callback.py: 3 tests — fires before tool, skipped
  for Final, works when None
- test_callback_message_id.py: 7 tests — message_id on think/observe/
  answer callbacks (streaming + non-streaming) and
  send_final_response
- test_parse_chunk_message_id.py (5 tests) - _parse_chunk propagates
  message_id for thought, observation, answer; handles missing
  gracefully
- test_explainability_parsing.py (+1) -
  test_dispatches_analysis_with_tooluse - Analysis+ToolUse mixin still
  dispatches to Analysis
- test_explainability.py (+1) -
  test_observation_found_via_subtrace_synthesis
- chain walker follows from sub-trace Synthesis to find Observation
  and
  Conclusion in correct order
- test_agent_provenance.py (+8) - session parent_uri (2), synthesis
  single/multiple parents, types, document, label (6)
2026-04-01 13:59:58 +01:00