- `_fetch_loaded_models_internal` now writes `_loaded_error_cache[endpoint] = time.time()` when the `/api/ps` or `/v1/models` request fails, and clears the entry on success.
- `choose_endpoint` now filters out candidates whose loaded-models error is still fresh (recorded less than 300s ago).
- `/health` now probes both `/api/version` and `/api/ps` for Ollama endpoints.
- dashboard adapted to the changes above
relates to #83