mirror of
https://github.com/katanemo/plano.git
synced 2026-05-02 20:32:42 +02:00
- MetricsSource::DigitalOceanPricing variant: fetch public DO Gen-AI pricing, normalize as lowercase(creator)/model_id, cost = input + output per million
- cost_metrics endpoint format updated to { "model": { "input_per_million": X, "output_per_million": Y } }
- Startup errors: prefer:cheapest requires cost source, prefer:fastest requires prometheus
- Startup warning: models with no pricing/latency data ranked last
- One-per-type enforcement: digitalocean_pricing; error if cost_metrics + digitalocean_pricing both configured
- cost_snapshot() / latency_snapshot() on ModelMetricsService for startup checks
- Demo config updated to v0.4.0 top-level routing_preferences with cheapest + fastest policies
- docker-compose.yaml + prometheus.yaml + metrics_server.py for demo latency metrics
- Schema and docs updated
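
The pricing rules in the notes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this change: `ModelPricing`, `pricing_key`, `combined_cost`, `rank_cheapest`, and `check_cost_sources` are hypothetical names, and the error strings are made up. Only the key format (`lowercase(creator)/model_id`), the cost rule (input + output per million), the "no data ranks last" behavior, and the two startup conditions come from the notes.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Hypothetical per-model pricing record, mirroring the
/// { "input_per_million": X, "output_per_million": Y } shape.
#[derive(Debug, Clone, PartialEq)]
struct ModelPricing {
    input_per_million: f64,
    output_per_million: f64,
}

/// Build the lookup key as lowercase(creator)/model_id.
fn pricing_key(creator: &str, model_id: &str) -> String {
    format!("{}/{}", creator.to_lowercase(), model_id)
}

/// Collapse pricing into one scalar: input + output price per million.
fn combined_cost(p: &ModelPricing) -> f64 {
    p.input_per_million + p.output_per_million
}

/// Order models cheapest first. Models with no pricing entry go last,
/// keeping their original relative order (the sort is stable).
fn rank_cheapest(models: &[&str], prices: &HashMap<String, ModelPricing>) -> Vec<String> {
    let mut ranked: Vec<(String, Option<f64>)> = models
        .iter()
        .map(|m| (m.to_string(), prices.get(*m).map(combined_cost)))
        .collect();
    ranked.sort_by(|a, b| match (a.1, b.1) {
        (Some(x), Some(y)) => x.total_cmp(&y),
        (Some(_), None) => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (None, None) => Ordering::Equal,
    });
    ranked.into_iter().map(|(m, _)| m).collect()
}

/// Startup checks (assumed shape): at most one cost source, and
/// prefer:cheapest needs one of them to be configured.
fn check_cost_sources(
    prefer_cheapest: bool,
    cost_metrics: bool,
    digitalocean_pricing: bool,
) -> Result<(), String> {
    if cost_metrics && digitalocean_pricing {
        return Err("cost_metrics and digitalocean_pricing cannot both be configured".to_string());
    }
    if prefer_cheapest && !(cost_metrics || digitalocean_pricing) {
        return Err("prefer:cheapest requires a cost source".to_string());
    }
    Ok(())
}
```

For example, `pricing_key("Acme", "small-1")` yields `acme/small-1`, and a model missing from the price map is sorted after every priced model rather than being dropped.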
| File |
|---|
| docker-compose.dev.yaml |
| env.list |
| envoy.template.yaml |
| plano_config_schema.yaml |
| README.md |
| requirements.txt |
| supervisord.conf |
| test_passthrough.yaml |
| validate_plano_config.sh |
Envoy filter code for gateway
Add toolchain
$ rustup target add wasm32-wasip1
Building
$ cargo build --target wasm32-wasip1 --release
Testing
$ cargo test
Local development
- Build the docker image for Plano. Note that this only needs to be built once.
  $ sh build_filter_image.sh
- Build the filter binary:
  $ cargo build --target wasm32-wasip1 --release
- Start Envoy with config.yaml and test:
  $ docker compose -f docker-compose.dev.yaml up plano
- The dev version of the docker-compose file mounts the following files inside the container, so no docker rebuild is needed when any of them changes. Just restart the container and the change will be picked up:
  - envoy.template.yaml
  - intelligent_prompt_gateway.wasm