Commit graph

10 commits

Author SHA1 Message Date
cybermaggedon
f4733021c5
Fix Mistral OCR ident to be standard pdf-decoder (#450)
* Fix Mistral OCR ident to be standard pdf-decoder

* Correct test
2025-08-04 14:01:36 +01:00
cybermaggedon
7e0d831026
Fixed Mistral OCR to use current API (#448)
* Fixed Mistral OCR to use current API

* Added PDF decoder tests
2025-08-04 10:08:16 +01:00
cybermaggedon
dd70aade11
Implement logging strategy (#444)
* Logging strategy and convert all print() calls to logging invocations
2025-07-30 23:18:38 +01:00
cybermaggedon
9508ac6c69
Port metering to new API, not tested. (#354)
- Port metering to new API
- Moved price list to configuration
- Added tg-set-token-costs and tg-show-token-costs utils.
2025-04-28 21:26:38 +01:00
cybermaggedon
a9197d11ee
Feature/configure flows (#345)
- Keeps processing in different flows separate so that data can go to different stores / collections etc.
- Potentially supports different processing flows
- Tidies the processing API with common base classes (e.g. for LLMs) and automatic configuration of 'clients' to use the right queue names in a flow
2025-04-22 20:21:38 +01:00
cybermaggedon
482592b976
Added Mistral OCR client (#326)
- Added Mistral OCR client
- Template updates for pdf-ocr
- Template updates for pdf-ocr-mistral
2025-03-22 00:27:20 +00:00
cybermaggedon
f350abb415
Maint/asyncio (#305)
* Move to asyncio services, even though everything is largely sync
2025-02-11 23:24:46 +00:00
cybermaggedon
7954e863cc
Feature: document metadata (#123)
* Rework metadata structure in processing messages to be a subgraph
* Add subgraph creation for tg-load-pdf and tg-load-text based on command-line passing of doc attributes
* Document metadata is added to knowledge graph with subjectOf linkage to extracted entities
2024-10-23 18:04:04 +01:00
cybermaggedon
b0f4c58200
Feature / collections (#96)
* Update schema defs for source -> metadata
* Migrate to use metadata part of schema, also add metadata to triples & vecs
* Add user/collection metadata to query
* Use user/collection in RAG
* Write and query working on triples
2024-10-02 18:14:29 +01:00
cybermaggedon
9b91d5eee3
Feature/pkgsplit (#83)
* Starting to spawn base package
* More package hacking
* Bedrock and VertexAI
* Parquet split
* Updated templates
* Utils
2024-09-30 19:36:09 +01:00