add DigitalOcean pricing, startup validation, and demo update

- MetricsSource::DigitalOceanPricing variant: fetch public DO Gen-AI pricing, normalize as lowercase(creator)/model_id, cost = input + output per million
- cost_metrics endpoint format updated to { "model": { "input_per_million": X, "output_per_million": Y } }
- Startup errors: prefer:cheapest requires cost source, prefer:fastest requires prometheus
- Startup warning: models with no pricing/latency data ranked last
- One-per-type enforcement: digitalocean_pricing; error if cost_metrics + digitalocean_pricing both configured
- cost_snapshot() / latency_snapshot() on ModelMetricsService for startup checks
- Demo config updated to v0.4.0 top-level routing_preferences with cheapest + fastest policies
- docker-compose.yaml + prometheus.yaml + metrics_server.py for demo latency metrics
- Schema and docs updated
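
For readers skimming the rules above, here is a rough Python sketch of the key normalization, the summed cost, the ranked-last behavior, and the startup checks. It is illustrative only and is not the Rust implementation in this commit: the function names (`normalize_key`, `costs_from_endpoint`, `rank_cheapest`, `startup_errors`) and the plain-string source labels are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set


def normalize_key(creator: str, model_id: str) -> str:
    """Build the lookup key as lowercase(creator)/model_id.

    Only the creator is lowercased; model_id is kept as given.
    """
    return f"{creator.lower()}/{model_id}"


def costs_from_endpoint(payload: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Collapse the cost_metrics payload into one number per model.

    The payload shape follows the commit message:
    { "model": { "input_per_million": X, "output_per_million": Y } }
    and cost = input + output per million.
    """
    return {
        model: price["input_per_million"] + price["output_per_million"]
        for model, price in payload.items()
    }


def rank_cheapest(models: Iterable[str], costs: Dict[str, float]) -> List[str]:
    """Sort by cost; models with no pricing data are ranked last."""
    return sorted(models, key=lambda m: (m not in costs, costs.get(m, 0.0)))


def startup_errors(preferences: Set[str], sources: Set[str]) -> List[str]:
    """Return startup errors for a config (hypothetical string labels)."""
    errors: List[str] = []
    cost_sources = sources & {"cost_metrics", "digitalocean_pricing"}
    if len(cost_sources) > 1:
        errors.append("cost_metrics and digitalocean_pricing are both configured")
    if "cheapest" in preferences and not cost_sources:
        errors.append("prefer:cheapest requires a cost source")
    if "fastest" in preferences and "prometheus" not in sources:
        errors.append("prefer:fastest requires prometheus")
    return errors


if __name__ == "__main__":
    payload = {
        normalize_key("OpenAI", "gpt-4o"): {
            "input_per_million": 2.5,
            "output_per_million": 10.0,
        },
    }
    costs = costs_from_endpoint(payload)
    print(costs)
    print(rank_cheapest(["unpriced/model", "openai/gpt-4o"], costs))
    print(startup_errors({"cheapest", "fastest"}, {"digitalocean_pricing"}))
```

The tuple sort key puts every unpriced model after all priced ones while keeping the priced ones ordered by cost, which matches the "ranked last" warning rather than dropping those models from routing.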
2026-05-02 04:12:56 +02:00 · 2026-03-27 16:54:37 -07:00 · 2026-03-27 16:54:37 -07:00 · bd7afd911e
commit bd7afd911e
parent 76b1f37052
10 changed files with 427 additions and 80 deletions
--- a/demos/llm_routing/model_routing_service/docker-compose.yaml
+++ b/demos/llm_routing/model_routing_service/docker-compose.yaml
@@ -0,0 +1,17 @@
+services:
+  prometheus:
+    image: prom/prometheus:latest
+    ports:
+      - "9090:9090"
+    volumes:
+      - ./prometheus.yaml:/etc/prometheus/prometheus.yml:ro
+    depends_on:
+      - model-metrics
+
+  model-metrics:
+    image: python:3.11-slim
+    ports:
+      - "8080:8080"
+    volumes:
+      - ./metrics_server.py:/metrics_server.py:ro
+    command: python /metrics_server.py