4 commits

Author SHA1 Message Date
cybermaggedon
0e03bc05a4
Refactor rate limit handling (#280)
* Refactored retry for rate limits into the base class
* ConsumerProducer is derived from Consumer to simplify code
* Added rate_limit_count metrics for rate limit events

* Add rate limit events to VertexAI and Google AI Studio

* Added Grafana rate limit dashboard

* Add rate limit handling to all LLMs
2025-01-27 17:04:49 +00:00
cybermaggedon
56a9ac3ba9
Change LLM latency dashboard to be rate & bump version (#92)
2024-10-01 21:04:55 +01:00
cybermaggedon
ef1b8b5a13
Feature/metering dashboard (#89)
* Bump version

* Added Prom metrics to metering, added dashboard

* Update YAMLs

* Add $ on axis

* Tweak dashboard
2024-10-01 06:46:41 +01:00
cybermaggedon
f661791bbf
K8s (#58)
Added templates which produce K8s resources. With the provided GCP wrapper, it works on a GCP K8s cluster. This isn't stable enough for other folks to use, so it will need more piloting before it can be documented and released.
2024-09-07 18:59:38 +01:00
Renamed from grafana/dashboard.json