mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-07-01 09:29:38 +02:00
feat: add document pipeline — PDF decoder, Ollama LLM, storage services
Add end-to-end document processing pipeline:

- PDF decoder service (pdfjs-dist) extracts text per page from librarian docs
- Ollama native LLM service for local model inference
- FalkorDB triples store FlowProcessor consumer
- Qdrant graph embeddings store FlowProcessor consumer
- Fix spec name collisions in chunker/extractor (input→chunk-input, etc.)
- Gateway /load endpoint to trigger document processing
- Align flow manager blueprint and seed config with full pipeline topics
- Add runner scripts and test coverage for document load

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 8f9de7604e
commit 8f7008822a
20 changed files with 894 additions and 37 deletions
14
ts/scripts/run-pdf-decoder.ts
Normal file
@@ -0,0 +1,14 @@
/**
 * Start the PDF decoder service.
 *
 * Usage: pnpm tsx scripts/run-pdf-decoder.ts
 *
 * Env:
 *   NATS_URL (default: nats://localhost:4222)
 */
import { run } from "../packages/flow/src/decoding/pdf-decoder.js";

run().catch((err) => {
  console.error("PDF decoder service failed:", err);
  process.exit(1);
});
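The header comment documents a `NATS_URL` environment variable with a default of `nats://localhost:4222`, but this diff does not show how `run()` reads it. As a minimal sketch of that kind of fallback, assuming a hypothetical helper named `resolveNatsUrl` (not part of this commit) and assuming that an empty or whitespace-only value should count as unset:

```typescript
// Hypothetical helper for illustration only; the real lookup lives inside
// pdf-decoder.ts and is not shown in this diff.
type Env = Record<string, string | undefined>;

// Default taken from the runner script's header comment.
const DEFAULT_NATS_URL = "nats://localhost:4222";

function resolveNatsUrl(env: Env): string {
  // Assumption: empty or whitespace-only values fall back to the default.
  const value = env.NATS_URL?.trim();
  return value ? value : DEFAULT_NATS_URL;
}
```

In a Node runner this would typically be called as `resolveNatsUrl(process.env)`. Taking the environment as a parameter, rather than reading `process.env` directly, keeps the fallback easy to test.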