Commit graph

36 commits

Author SHA1 Message Date
cybermaggedon
cf0daedefa
Changed schema for Value -> Term, majorly breaking change (#622)
* Changed schema for Value -> Term, majorly breaking change

* Following the schema change, Value -> Term into all processing

* Updated Cassandra for g, p, s, o index patterns (7 indexes)

* Reviewed and updated all tests

* Neo4j, Memgraph and FalkorDB remain broken, will look at once settled down
2026-01-27 13:48:08 +00:00
cybermaggedon
11f41b07ab
Get neo4j to use limit (#618)
* Get neo4j to use limit

* Fix tests - they were exact-matching on query strings
2026-01-22 15:16:34 +00:00
cybermaggedon
aebdf9444b
Fix incorrect Cassandra config invocation (#587) 2025-12-10 10:55:14 +00:00
cybermaggedon
7d07f802a8
Basic multitenant support (#583)
* Tech spec

* Address multi-tenant queue option problems in CLI

* Modified collection service to use config

* Changed storage management to use the config service definition
2025-12-05 21:45:30 +00:00
cybermaggedon
7501db01f1
Reconcile master with 1.6 (#563)
- Reconcile all master changes (documentation) to the 1.6 release branch
2025-11-24 10:02:30 +00:00
cybermaggedon
c69f5207a4
OntoRAG: Ontology-Based Knowledge Extraction and Query Technical Specification (#523)
* Onto-rag tech spec

* New processor kg-extract-ontology, use 'ontology' objects from config to guide triple extraction

* Also entity contexts

* Integrate with ontology extractor from workbench

This is first phase, the extraction is tested and working, also GraphRAG with the extracted knowledge works
2025-11-12 20:38:08 +00:00
cybermaggedon
6129bb68c1
Fix hard coded vector size (#555)
* Fixed hard-coded embeddings store size

* Vector store lazy-creates collections, different collections for
  different dimension lengths.

* Added tech spec for vector store lifecycle

* Fixed some tests for the new spec
2025-11-10 16:56:51 +00:00
cybermaggedon
51107008fd
master -> 1.5 (README updates) (#552) 2025-10-11 11:46:03 +01:00
cybermaggedon
52b133fc86
Collection delete pt. 3 (#542)
* Fixing collection deletion

* Fixing collection management param error

* Always test for collections

* Add Cassandra collection table

* Updated tech spec for explicit creation/deletion

* Remove implicit collection creation

* Fix up collection tracking in all processors
2025-09-30 16:02:33 +01:00
cybermaggedon
fcd15d1833
Collection management part 2 (#522)
* Plumb collection manager into librarian

* Test end-to-end
2025-09-19 16:08:47 +01:00
cybermaggedon
13ff7d765d
Collection management (#520)
* Tech spec

* Refactored Cassandra knowledge graph for single table

* Collection management, librarian services to manage metadata and collection deletion
2025-09-18 15:57:52 +01:00
cybermaggedon
0f1d3ce8cf
Vector stores will create collections on query (#512) 2025-09-11 00:15:46 +01:00
cybermaggedon
7f57bc6a0a
Feature/memgraph user collection isolation (#510)
* User/collection processing in memgraph

* Update tests
2025-09-10 22:11:35 +01:00
cybermaggedon
c694b12e9c
Feature/neo4j user collection isolation (#509)
* Tech spec

* User/collection separation

* Update tests
2025-09-10 22:11:21 +01:00
cybermaggedon
314ce76b81
Feature/fix milvus (#507)
- Remove object embeddings, which were broken and not used
- Fixed Milvus collection names

* Updating tests

* Remove unused entrypoint
2025-09-09 21:44:55 +01:00
cybermaggedon
a92050c411
Fix Prometheus incorrect metric name (#502)
* Fix Prometheus incorrect metric name

* Remove unnecessary changes
2025-09-06 18:37:01 +01:00
cybermaggedon
27d657c58d
Remove graphql collection param (#489)
* Remove GraphQL collection parameter

* Update tech spec to mark query service complete
2025-09-04 10:04:09 +01:00
cybermaggedon
85e669c763
Fixing more Cassandra consistency issues (#488)
* Fixing more Cassandra work

* Fix tests
2025-09-04 00:58:11 +01:00
cybermaggedon
ccaec88a72
Feature/consolidate cassandra config (#483)
* Cassandra consolidation of parameters

* New Cassandra configuration helper

* Implemented Cassandra config refactor

* New tests
2025-09-03 23:41:22 +01:00
cybermaggedon
672e358b2f
Feature/graphql table query (#486)
* Tech spec

* Object query service for Cassandra

* Gateway support for objects-query

* GraphQL query utility

* Filters, ordering
2025-09-03 23:39:11 +01:00
cybermaggedon
dd70aade11
Implement logging strategy (#444)
* Logging strategy and convert all print() calls to logging invocations
2025-07-30 23:18:38 +01:00
cybermaggedon
f37decea2b
Increase storage test coverage (#435)
* Fixing storage and adding tests

* PR pipeline only runs quick tests
2025-07-15 09:33:35 +01:00
cybermaggedon
a9197d11ee
Feature/configure flows (#345)
- Keeps processing in different flows separate so that data can go to different stores / collections etc.
- Potentially supports different processing flows
- Tidies the processing API with common base-classes for e.g. LLMs, and automatic configuration of 'clients' to use the right queue names in a flow
2025-04-22 20:21:38 +01:00
cybermaggedon
f350abb415
Maint/asyncio (#305)
* Move to asyncio services, even though everything is largely sync
2025-02-11 23:24:46 +00:00
Tyler Oliver
e99c0ac238
Add support for Qdrant API Auth (#300)
Added the necessary changes to support API Key in Qdrant Client Query+Storage
- Doc Embeddings
- Graph Embeddings
2025-02-08 11:46:22 +00:00
Tyler Oliver
41ccb6c976
Add user and password auth for Cassandra (#301) 2025-02-08 11:42:14 +00:00
cybermaggedon
03b6b45725
- Fix FalkorDB query API invocations (#214)
- Shift FalkorDB internal web manager to port 3010 so it doesn't clash with
  Grafana.
2024-12-19 17:32:05 +00:00
cybermaggedon
a4afff59a0
wip integrate falkordb (#211) (#213)
Co-authored-by: Avi Avni <avi.avni@gmail.com>
2024-12-19 16:17:07 +00:00
cybermaggedon
07f9b1f244
From vector DB, often get dupes, which means we end up returning (#210)
less than top_k elements.  So, fetch top_k=(2 * limit) and limit to
just (limit)
2024-12-10 22:37:54 +00:00
cybermaggedon
a714221b22
Add memgraph cypher LIMIT support (#200) 2024-12-07 00:16:52 +00:00
cybermaggedon
bffaf62490
Feature/memgraph optim (#193)
* Separate memgraph query/write modules to optimise for memgraph
* Used 1GB memory for Memgraph
* Deployed specialised memgraph query/write processors, created memgraph indexes
* One triple is loaded as a single transaction
* Fixed index creation
2024-12-06 00:12:49 +00:00
cybermaggedon
f24eed3023
Fix/pinecone de (#187)
* Fix Google AI Studio settings
* Fix pinecone startup params
2024-12-03 09:51:33 +00:00
cybermaggedon
9c97ca32f6
Feature/memgraph (#182)
* Add database override to bolt output, default is neo4j

* Add memgraph templates
2024-11-28 19:21:28 +00:00
cybermaggedon
319f9ac04a
Feature/pinecone integration (#170)
* Added Pinecone for GE write & query

* Add templates

* Doc embedding support
2024-11-22 23:48:21 +00:00
cybermaggedon
b0f4c58200
Feature / collections (#96)
* Update schema defs for source -> metadata
* Migrate to use metadata part of schema, also add metadata to triples & vecs
* Add user/collection metadata to query
* Use user/collection in RAG
* Write and query working on triples
2024-10-02 18:14:29 +01:00
cybermaggedon
9b91d5eee3
Feature/pkgsplit (#83)
* Starting to spawn base package
* More package hacking
* Bedrock and VertexAI
* Parquet split
* Updated templates
* Utils
2024-09-30 19:36:09 +01:00