From 539d5f98a21cc01f7f44eeef5b48a75770eb7199 Mon Sep 17 00:00:00 2001
From: alpha nerd
Date: Mon, 18 May 2026 17:03:04 +0200
Subject: [PATCH] doc: update on /health and /api/config endpoints

---
 doc/architecture.md | 2 ++
 doc/monitoring.md   | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/doc/architecture.md b/doc/architecture.md
index f725573..c2408d8 100644
--- a/doc/architecture.md
+++ b/doc/architecture.md
@@ -206,6 +206,8 @@ The `/health` endpoint provides comprehensive health status:
 }
 ```
 
+For Ollama endpoints the probe is a parallel check of `/api/version` (liveness) and `/api/ps` (the route used by `choose_endpoint` when selecting a backend for a request). Reporting `ok` only when both succeed prevents the router from advertising an endpoint as healthy while completion calls dead-end on `/api/ps`. The same dual probe backs `/api/config`, which the dashboard uses to render endpoint health.
+
 ## Database Schema
 
 The router uses SQLite for persistent storage:
diff --git a/doc/monitoring.md b/doc/monitoring.md
index ab75d25..9ce25ec 100644
--- a/doc/monitoring.md
+++ b/doc/monitoring.md
@@ -29,6 +29,10 @@ Response:
 - `200`: All endpoints healthy
 - `503`: One or more endpoints unhealthy
 
+**Probe scope per endpoint**:
+- **Ollama endpoints** are probed at both `/api/version` (liveness) and `/api/ps` (model-introspection used by the router). If either fails the endpoint is reported as `error`; the response still includes `version` when the daemon is reachable so operators can tell a partial failure from a full outage. The `detail` field names the failing probe, e.g. `"/api/ps: 502 …"`.
+- **OpenAI-compatible / llama-server endpoints** are probed at `/models`.
+
 ### Current Usage
 
 ```bash
@@ -133,6 +137,8 @@ Response:
 }
 ```
 
+Uses the same dual-probe logic as `/health` (Ollama: `/api/version` + `/api/ps`; OpenAI-compatible: `/models`). An endpoint will report `error` whenever either probe fails. The dashboard renders the `detail` field as a tooltip on the status cell.
+
 ### Cache Statistics
 
 ```bash