Reconfigure so that AZURE_TOKEN, AZURE_MODEL and AZURE_ENDPOINT
can be used to set the token/model/endpoint parameters. This allows it to
be deployed in K8s, using secrets to set these environment variables.
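
A minimal sketch of how settings like these might be resolved from the environment. Only the three variable names come from the note above. The `AzureSettings` class, the `resolve_azure_settings` helper and the rule that explicit arguments take precedence over the environment are assumptions made for illustration, not the project's actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class AzureSettings:
    """Hypothetical container for the three Azure parameters."""
    token: str
    model: str
    endpoint: str


def resolve_azure_settings(
    token: Optional[str] = None,
    model: Optional[str] = None,
    endpoint: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> AzureSettings:
    """Use explicit arguments when given, else fall back to the environment.

    The precedence order is an assumption for this sketch.
    """
    values = {
        "AZURE_TOKEN": token or env.get("AZURE_TOKEN"),
        "AZURE_MODEL": model or env.get("AZURE_MODEL"),
        "AZURE_ENDPOINT": endpoint or env.get("AZURE_ENDPOINT"),
    }
    # Fail early and name every missing setting, so a misconfigured
    # deployment is easy to diagnose.
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError("missing Azure settings: " + ", ".join(missing))
    return AzureSettings(
        token=values["AZURE_TOKEN"],
        model=values["AZURE_MODEL"],
        endpoint=values["AZURE_ENDPOINT"],
    )
```

In a K8s deployment, the container's environment would typically be populated from a Secret, so the process picks the values up without any command-line arguments.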
* Change document-rag and graph-rag processing so that the user can
specify parameters. This changes the Pulsar services, Pulsar message
schemas, gateway and command-line tools. The user-visible change is
new parameters on the command-line tools.
* Fix bugs, graph-rag working
* Get subgraph truncation in the right place
* Graph RAG and document RAG working and configurable
* Multi-hop path traversal GraphRAG
* Add safety valve for path_size set too high
* Bring Qdrant up-to-date
* Tables for data from queue outputs
- Pass single Pulsar client to everything in gateway & librarian
- Pulsar listener-name support in gateway
- PDF and text load working in librarian
* Complete Cassandra schema
* Add librarian support to templates
- Refactored retry for rate limits into the base class
- ConsumerProducer is derived from Consumer to simplify code
- Added rate_limit_count metrics for rate limit events
* Add rate limit events to VertexAI and Google AI Studio
* Added Grafana rate limit dashboard
* Add rate limit handling to all LLMs
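
The rate-limit items above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `RateLimitError`, `LlmServiceBase`, `handle`, the retry count and the exponential backoff are all hypothetical, and a plain integer stands in for the `rate_limit_count` metric.

```python
import time
from typing import Callable


class RateLimitError(Exception):
    """Hypothetical error a backend raises when the provider throttles it."""


class LlmServiceBase:
    """Hypothetical base class: subclasses only implement generate()."""

    def __init__(
        self,
        max_retries: int = 3,
        backoff_seconds: float = 1.0,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.max_retries = max_retries
        self.backoff_seconds = backoff_seconds
        self.sleep = sleep
        # Plain counter standing in for a rate_limit_count metric.
        self.rate_limit_count = 0

    def generate(self, prompt: str) -> str:
        raise NotImplementedError

    def handle(self, prompt: str) -> str:
        """Call generate(), retrying with backoff on rate-limit errors."""
        for attempt in range(self.max_retries + 1):
            try:
                return self.generate(prompt)
            except RateLimitError:
                # Count every rate-limit event, including the final one.
                self.rate_limit_count += 1
                if attempt == self.max_retries:
                    raise
                # Exponential backoff (an assumption for this sketch).
                self.sleep(self.backoff_seconds * (2 ** attempt))
        raise AssertionError("unreachable")
```

Keeping the retry loop in the base class means each LLM backend only has to translate its provider's throttling response into the shared exception.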
- Removed unused LLM client configuration from agent-manager-react
- Change agent-manager-react template to use prompt-rag instead of
prompt
- Changed TextCompletion tool to use 'question' instead of 'computation'
for its parameter.
* Make schema changes
* Core entity context flow in place
* extract-def outputs entity contexts
* Refactored qdrant write
* Refactoring of all vector stores in place
* Fix invalid variable name invocation
* Fix error responses in websockets
* Increase websocket maximum message size to 50MB. Turn on Pulsar chunking by default.
similar to ServiceRequestor, but one-way.
- This means these two services are now available over websocket with
document-load and text-load service IDs.
* Split API endpoint into endpoint and requestor
* Service/endpoint separation
* Call out to multiple services working
* Add ID field
* Add mux service on websocket, calls out to all services
* Separate Memgraph query/write modules to optimise for Memgraph
* Used 1GB memory for Memgraph
* Deployed specialised Memgraph query/write processors, created Memgraph indexes
* One triple is loaded as a single transaction
* Fixed index creation