trustgraph

mirror of https://github.com/trustgraph-ai/trustgraph.git synced 2026-07-01 09:29:38 +02:00

Author	SHA1	Message	Date
elpresidank	50fb311d2d	feat: real PDF pipeline test — end-to-end knowledge extraction working Add full pipeline test that generates a real PDF, processes it through the entire pipeline, and verifies knowledge lands in FalkorDB: - Create test PDF generator using pdf-lib (2-page doc about Acme Corp) - Add testFullPipeline() to integration tests with store verification - Fix FalkorDB client connect() — createClient returns unconnected client in both TriplesStore and TriplesQuery classes Results: PDF decoded (2 pages) → chunked (2 chunks) → extracted (4 relationships) → 16 triples stored in FalkorDB including: alice-johnson → is-a-senior-engineer → acme-corporation cloudsync → uses-aws-for-hosting → amazon-web-services provenance: pages → prov:wasDerivedFrom → source document Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 02:19:12 -05:00
elpresidank	5bc7a1b6fc	fix: resolve FlowProcessor topic collisions, librarian timeout, tests Two bugs found during end-to-end testing: 1. FlowProcessor never restarted flows when config changed — it only started them once. Stale NATS JetStream data from previous sessions caused services to bind to wrong topics. Fix: stop and restart flows on every config push that includes flow definitions. 2. Gateway publishToTopic sent messages without an id property. Pipeline FlowProcessor handlers check properties.id and silently return if missing. Fix: auto-generate a message id when publishing to topics. Both fixes validated: 13/13 integration tests passing, PDF decoder correctly receives and processes document messages through the pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 01:53:55 -05:00
elpresidank	c545213224	feat: add query/retrieval FlowProcessor services and missing runner scripts Wire up the query and retrieval side of the pipeline so the agent can answer questions from stored knowledge: - Triples query service (FalkorDB) — all SPO pattern queries via NATS - Graph embeddings query service (Qdrant) — entity vector similarity - Document embeddings query service (Qdrant) — chunk vector similarity - Graph RAG service — full concept→entity→traverse→score→synthesize pipeline - Document RAG service — embed→find chunks→synthesize pipeline - Runner scripts for chunker, extractor, embeddings (missing from Phase 5) - Add DocumentEmbeddingsRequest/Response schema types - Add RAG prompt templates (extract-concepts, edge-scoring, synthesize) - Add graph/doc embeddings query topics to seed config + flow manager - Add all pipeline/query/retrieval services to docker-compose - 8 new runner scripts, 8 new pnpm script aliases Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 01:05:54 -05:00
elpresidank	8f7008822a	feat: add document pipeline — PDF decoder, Ollama LLM, storage services Add end-to-end document processing pipeline: - PDF decoder service (pdfjs-dist) extracts text per page from librarian docs - Ollama native LLM service for local model inference - FalkorDB triples store FlowProcessor consumer - Qdrant graph embeddings store FlowProcessor consumer - Fix spec name collisions in chunker/extractor (input→chunk-input, etc.) - Gateway /load endpoint to trigger document processing - Align flow manager blueprint and seed config with full pipeline topics - Add runner scripts and test coverage for document load Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 23:47:43 -05:00
elpresidank	25d4227cb5	fix: resolve FlowProcessor topic collisions, librarian timeout, tests Fix critical bug where all FlowProcessor services shared the same spec names ("request"/"response"), causing them to steal each other's NATS topics. Now each service uses unique spec names matching the flow config topic keys (e.g., "text-completion-request", "prompt-request", "agent-request"). Fix librarian NATS consumer timeout (raised from 500ms, which was below the NATS minimum, to 2000ms). Update seed-config and test-pipeline with correct flow topic mappings. Add prompt template runner script. Smoke test results: 11/11 passing (config CRUD, WebSocket, LLM, librarian CRUD). Agent routing verified via manual curl test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 01:02:10 -05:00
elpresidank	7db5a1023e	feat: add flow manager, config seeding, and expanded integration tests Flow Management Service: - FlowManagerService (AsyncProcessor) handling list/get/start/stop flows and list/get blueprints via kebab-case wire format - Default blueprint with all service topic mappings - Pushes flow config to config service on start/stop Config Seeding: - seed-config.ts script pushes prompt templates (extract-relationships, extract-definitions, document-prompt, kg-prompt) and default flow definition via gateway REST API Integration Tests: - Librarian CRUD: add-document, list-documents, get-content, delete - Agent query: verifies routing through gateway to agent service - Skip flags: SKIP_LIBRARIAN=1, SKIP_AGENT=1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 00:37:03 -05:00
elpresidank	f09ef4de45	feat: add document pipeline, ReAct agent, and knowledge core services Document Pipeline (Team A): - LibrarianService: document storage with filesystem backend, metadata persistence, child document hierarchy, collection management - ChunkingService: recursive character text splitter with configurable chunk size/overlap, FlowProcessor pattern - KnowledgeExtractService: combined relationship + definition extraction using prompt service and LLM, emits RDF triples and entity contexts - KnowledgeCoreService: knowledge core CRUD with streaming export and flow-based loading ReAct Agent (Team B): - StreamingReActParser: state machine for parsing LLM output into Thought/Action/ActionInput/FinalAnswer sections - Three MVP tools: KnowledgeQuery (GraphRAG), DocumentQuery (DocRAG), TriplesQuery with RequestResponse clients - AgentService FlowProcessor with ReAct loop, tool execution, and streaming chunk responses (thought/observation/answer) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 00:19:37 -05:00
elpresidank	5ed3f0e2d8	feat: add schema foundation for document pipeline, agent, and deployment Add missing topics (librarian, knowledge, collection-management, flow), pipeline message types (TextDocument, Chunk, Triples, EntityContexts), service message types (Librarian, Knowledge, Collection, Flow CRUD), and update AgentResponse for streaming chunk format. Add RequestResponseSpec enabling flow-scoped request/response calls (needed by knowledge extraction and agent services). Add requestor registry to Flow class with proper lifecycle management. Add end_of_dialog to gateway's isComplete() check for agent streaming. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 00:11:29 -05:00
elpresidank	0042f9259c	fix: linter cleanup on flow service implementations Minor fixes from linter: readonly modifiers, unused parameter prefixes, type narrowing in graph-rag BFS traversal and edge scoring. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 22:52:40 -05:00
elpresidank	b6536eca38	init	2026-04-05 22:44:45 -05:00
elpresidank	e26caa0b12	saving	2026-04-05 21:09:33 -05:00

11 commits