* Tidy up duplicate tech specs in doc directory
* Streaming LLM text-completion service tech spec.
* text-completion and prompt interfaces
* streaming change applied to all LLMs, so far tested with VertexAI
* Skip Pinecone unit tests, upstream module issue is affecting things, tests are passing again
* Added agent streaming, not working and has broken tests
* Added Anthropic support for VertexAI
* Update tests to match code
* Fixed private.json usage with Anthropic (I think).
* Fixed test
---------
Co-authored-by: Cyber MacGeddon <cybermaggedon@gmail.com>
- Changed GoogleAIStudio LLM code to match latest documentation
- Very minor tweak to vertexai LLM code - just matching what's in SDK docs
no actual change to implementation.
- Tweaked VertexAI container build to speed up in dev
- Comments in LLM code to mention which docs it was built from. Google
SDKs are confusing ATM.
- Keeps processing in different flows separate so that data can go to different stores / collections etc.
- Potentially supports different processing flows
- Tidies the processing API with common base-classes for e.g. LLMs, and automatic configuration of 'clients' to use the right queue names in a flow
* - Refactored retry for rate limits into the base class
- ConsumerProducer is derived from Consumer to simplify code
- Added rate_limit_count metrics for rate limit events
* Add rate limit events to VertexAI and Google AI Studio
* Added Grafana rate limit dashboard
* Add rate limit handling to all LLMs
- Change templates to interpolate environment variables in docker compose
- Change templates to invoke secrets for environment variable credentials in K8s configuration
- Update LLMs to pull in credentials from environment variables if not specified
* - Locked 0.11 packages to 0.11 deps
- Added 'trustgraph' uber-package which installs the rest
- Added dependency to set package versions before building packages
* Bump version