Add retry policy configuration types to support automatic retry and
failover for LLM requests:
- RetryPolicy: top-level config with fallback_models, default_strategy,
default_max_attempts, and per-status-code overrides
- BackoffConfig: exponential backoff with base_ms, max_ms, jitter, and
scope (per-model, per-provider, or global)
- RetryAfterConfig: Retry-After header handling with block scope and
duration limits
- HighLatencyConfig: latency-based blocking with threshold, measurement
type, and trigger conditions
- LatencyTriggerConfig: min_triggers and trigger_window for debouncing
- RetryStrategy enum: same_model, same_provider, different_provider
- StatusCodeEntry: flexible status code matching (single, range, list)
Also add retry_policy field to GatewayConfig with Default impl.
Signed-off-by: Troy Mitchell <i@troy-y.org>
* cleaning up plano cli commands
* adding support for wildcard model providers
* fixing compile errors
* fixing bugs related to default model provider, provider hint and duplicates in the model provider list
* fixed cargo fmt issues
* updating tests to always include the model id
* using default for the prompt_gateway path
* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config
* making sure that all aliases and models match the config
* fixed the config generator to allow for base_url providers LLMs to include wildcard models
* re-ran the models list utility and added a shell script to run it
* updating docs to mention wildcard model providers
* updated provider_models.json to yaml, added that file to our docs for reference
* updating the build docs to use the new root-based build
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>