mirror of https://github.com/katanemo/plano.git (synced 2026-05-02 04:12:56 +02:00)
add DigitalOcean pricing, startup validation, and demo update
- MetricsSource::DigitalOceanPricing variant: fetch public DO Gen-AI pricing, normalize as lowercase(creator)/model_id, cost = input + output per million
- cost_metrics endpoint format updated to { "model": { "input_per_million": X, "output_per_million": Y } }
- Startup errors: prefer:cheapest requires cost source, prefer:fastest requires prometheus
- Startup warning: models with no pricing/latency data ranked last
- One-per-type enforcement: digitalocean_pricing; error if cost_metrics + digitalocean_pricing both configured
- cost_snapshot() / latency_snapshot() on ModelMetricsService for startup checks
- Demo config updated to v0.4.0 top-level routing_preferences with cheapest + fastest policies
- docker-compose.yaml + prometheus.yaml + metrics_server.py for demo latency metrics
- Schema and docs updated
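
The rules in this commit message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the commit: the function names (`normalize_key`, `total_cost`, `validate_startup`, `rank_cheapest`) and the error strings are hypothetical, and the source-type strings are taken from the bullets above.

```python
from typing import Dict, List, Set

# Assumed in-memory shape, mirroring the cost_metrics payload described above:
# { "model": { "input_per_million": X, "output_per_million": Y } }
Pricing = Dict[str, Dict[str, float]]


def normalize_key(creator: str, model_id: str) -> str:
    """Build the lookup key as lowercase(creator)/model_id."""
    return f"{creator.lower()}/{model_id}"


def total_cost(entry: Dict[str, float]) -> float:
    """Ranking cost: input price plus output price, both per million tokens."""
    return entry["input_per_million"] + entry["output_per_million"]


def validate_startup(policies: Set[str], sources: Set[str]) -> List[str]:
    """Return startup errors for a set of policies and configured source types."""
    errors: List[str] = []
    cost_sources = sources & {"cost_metrics", "digitalocean_pricing"}
    if len(cost_sources) > 1:
        errors.append("cost_metrics and digitalocean_pricing cannot both be configured")
    if "cheapest" in policies and not cost_sources:
        errors.append("prefer: cheapest requires a cost source")
    if "fastest" in policies and "prometheus" not in sources:
        errors.append("prefer: fastest requires prometheus")
    return errors


def rank_cheapest(models: List[str], pricing: Pricing) -> List[str]:
    """Sort by total cost; models with no pricing data are ranked last."""
    def sort_key(model: str):
        missing = model not in pricing
        return (missing, 0.0 if missing else total_cost(pricing[model]))
    return sorted(models, key=sort_key)
```

For example, `rank_cheapest(["a", "b", "c"], pricing)` with pricing for only `a` and `c` puts `b` at the end, which corresponds to the startup warning rather than an error.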
parent 76b1f37052
commit bd7afd911e
10 changed files with 427 additions and 80 deletions
demos/llm_routing/model_routing_service/docker-compose.yaml (normal file, 17 lines added)
@@ -0,0 +1,17 @@
+services:
+  prometheus:
+    image: prom/prometheus:latest
+    ports:
+      - "9090:9090"
+    volumes:
+      - ./prometheus.yaml:/etc/prometheus/prometheus.yml:ro
+    depends_on:
+      - model-metrics
+
+  model-metrics:
+    image: python:3.11-slim
+    ports:
+      - "8080:8080"
+    volumes:
+      - ./metrics_server.py:/metrics_server.py:ro
+    command: python /metrics_server.py
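
The `model-metrics` service runs `python /metrics_server.py` on a bare `python:3.11-slim` image with no install step, which suggests a standard-library server on port 8080 that Prometheus scrapes. The contents of `metrics_server.py` are not shown here, so the following is only a guess at the general shape: the metric name `model_latency_seconds`, the `/metrics` path, the sample model names, and the latency values are all hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict

# Hypothetical metric name and sample data; the real metrics_server.py may differ.
METRIC_NAME = "model_latency_seconds"
LATENCIES: Dict[str, float] = {"example/model-a": 0.42, "example/model-b": 1.3}


def render_metrics(latencies: Dict[str, float]) -> str:
    """Render one gauge sample per model in the Prometheus text format."""
    lines = [f"# TYPE {METRIC_NAME} gauge"]
    for model, seconds in sorted(latencies.items()):
        lines.append(f'{METRIC_NAME}{{model="{model}"}} {seconds}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Serve metrics only on the assumed /metrics path.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(LATENCIES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Block forever, serving metrics on the port published in the compose file."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` at the bottom of the script would start the server. Inside the compose network, Prometheus would reach it through the service name `model-metrics` on port 8080, assuming `prometheus.yaml` is configured with that target.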