Commit graph

6 commits

Author SHA1 Message Date
elpresidank
8f7008822a feat: add document pipeline — PDF decoder, Ollama LLM, storage services
Add end-to-end document processing pipeline:
- PDF decoder service (pdfjs-dist) extracts text per page from librarian docs
- Ollama native LLM service for local model inference
- FalkorDB triples store FlowProcessor consumer
- Qdrant graph embeddings store FlowProcessor consumer
- Fix spec name collisions in chunker/extractor (input→chunk-input, etc.)
- Gateway /load endpoint to trigger document processing
- Align flow manager blueprint and seed config with full pipeline topics
- Add runner scripts and test coverage for document load

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:47:43 -05:00
elpresidank
25d4227cb5 fix: resolve FlowProcessor topic collisions, librarian timeout, tests
Fix critical bug where all FlowProcessor services shared the same spec
names ("request"/"response"), causing them to steal each other's NATS
topics. Now each service uses unique spec names matching the flow config
topic keys (e.g., "text-completion-request", "prompt-request",
"agent-request").

Fix librarian NATS consumer timeout (500ms → 2000ms, below NATS minimum).

Update seed-config and test-pipeline with correct flow topic mappings.
Add prompt template runner script.

Smoke test results: 11/11 passing (config CRUD, WebSocket, LLM,
librarian CRUD). Agent routing verified via manual curl test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:02:10 -05:00
elpresidank
7db5a1023e feat: add flow manager, config seeding, and expanded integration tests
Flow Management Service:
- FlowManagerService (AsyncProcessor) handling list/get/start/stop flows
  and list/get blueprints via kebab-case wire format
- Default blueprint with all service topic mappings
- Pushes flow config to config service on start/stop

Config Seeding:
- seed-config.ts script pushes prompt templates (extract-relationships,
  extract-definitions, document-prompt, kg-prompt) and default flow
  definition via gateway REST API

Integration Tests:
- Librarian CRUD: add-document, list-documents, get-content, delete
- Agent query: verifies routing through gateway to agent service
- Skip flags: SKIP_LIBRARIAN=1, SKIP_AGENT=1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:37:03 -05:00
elpresidank
f09ef4de45 feat: add document pipeline, ReAct agent, and knowledge core services
Document Pipeline (Team A):
- LibrarianService: document storage with filesystem backend, metadata
  persistence, child document hierarchy, collection management
- ChunkingService: recursive character text splitter with configurable
  chunk size/overlap, FlowProcessor pattern
- KnowledgeExtractService: combined relationship + definition extraction
  using prompt service and LLM, emits RDF triples and entity contexts
- KnowledgeCoreService: knowledge core CRUD with streaming export and
  flow-based loading

ReAct Agent (Team B):
- StreamingReActParser: state machine for parsing LLM output into
  Thought/Action/ActionInput/FinalAnswer sections
- Three MVP tools: KnowledgeQuery (GraphRAG), DocumentQuery (DocRAG),
  TriplesQuery with RequestResponse clients
- AgentService FlowProcessor with ReAct loop, tool execution, and
  streaming chunk responses (thought/observation/answer)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:19:37 -05:00
elpresidank
28747e1a92 fix: NATS pipeline bugs, add integration tests and service runners
Fix three critical bugs preventing the NATS message pipeline from working:

- FlowProcessor now subscribes to config-push topic (was missing entirely),
  using DeliverPolicy.All to replay config on service restart
- NATS streams use wildcard subjects (tg.flow.>) instead of per-topic
  narrow filters that caused 503 errors on publish
- Subscriber dispatch loop has exponential backoff on errors to prevent
  tight error loops

Add service runner scripts (gateway, config, LLM) and a 7-test
integration suite that verifies config CRUD, WebSocket round-trip,
and full LLM text-completion through the NATS pipeline.

Fix Docker Compose infra: pin Tempo to v2.6.1, remove deprecated Loki
config fields, add user:0 for volume permissions, remap conflicting
ports (FalkorDB 6380, OTLP 4327/4328).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:41:39 -05:00
elpresidank
e26caa0b12 saving 2026-04-05 21:09:33 -05:00