feat: visualization of conversation affinity in dashboard

Alpha Nerd 2026-05-13 13:38:37 +02:00
parent 4acbaeb29c
commit aa7ec6354a
Signed by: alpha-nerd
SSH key fingerprint: SHA256:QkkAgVoYi9TQ0UKPkiKSfnerZy2h4qhi3SVPXJmBN+M
5 changed files with 306 additions and 19 deletions


@@ -166,6 +166,39 @@ curl -X POST http://localhost:12434/api/cache/invalidate
Clears all cached entries and resets hit/miss counters.
### Affinity Stats (Conversation Affinity)
```bash
curl http://localhost:12434/api/affinity_stats
```
Response when [`conversation_affinity`](configuration.md#conversation_affinity) is enabled:
```json
{
"enabled": true,
"ttl": 300,
"entries": [
{ "endpoint": "http://gpu-primary:11434", "model": "llama3.2:latest", "remaining": 287.4 },
{ "endpoint": "http://gpu-primary:11434", "model": "llama3.2:latest", "remaining": 113.0 },
{ "endpoint": "http://gpu-secondary:11434", "model": "qwen2.5-coder:7b", "remaining": 44.8 }
]
}
```
Response when the feature is disabled:
```json
{ "enabled": false, "ttl": 300, "entries": [] }
```
- Each element of `entries` represents one **live pinned conversation**. It carries no fingerprints or content, only the endpoint and model the pin points to and the number of seconds left before the pin expires.
- Aggregation by `(endpoint, model)` is left to the consumer: the dashboard does this client-side.
- The endpoint is gated by the same `nomyo-router-api-key` middleware as the rest of `/api/*`.
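Since aggregation is left to the consumer, a client might group the flat `entries` list itself. The following is a minimal sketch of that grouping, not the dashboard's actual code: the helper name `group_pins` is hypothetical, and the sample payload simply mirrors the response shown above.

```python
from collections import defaultdict


def group_pins(stats):
    """Group live pins by (endpoint, model).

    `stats` is the parsed JSON body of /api/affinity_stats.
    Returns a dict mapping (endpoint, model) to the list of
    remaining-seconds values, one per live pin. Returns an empty
    dict when the feature is disabled.
    """
    if not stats.get("enabled"):
        return {}
    groups = defaultdict(list)
    for entry in stats.get("entries", []):
        key = (entry["endpoint"], entry["model"])
        groups[key].append(entry["remaining"])
    return dict(groups)


# Sample payload copied from the documented response.
sample = {
    "enabled": True,
    "ttl": 300,
    "entries": [
        {"endpoint": "http://gpu-primary:11434", "model": "llama3.2:latest", "remaining": 287.4},
        {"endpoint": "http://gpu-primary:11434", "model": "llama3.2:latest", "remaining": 113.0},
        {"endpoint": "http://gpu-secondary:11434", "model": "qwen2.5-coder:7b", "remaining": 44.8},
    ],
}

for (endpoint, model), pins in group_pins(sample).items():
    print(endpoint, model, len(pins))
```

With the sample payload this yields two groups: two pins for `llama3.2:latest` on `gpu-primary` and one pin for `qwen2.5-coder:7b` on `gpu-secondary`.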
The dashboard's **Running Models (PS) → Affinity** column is rendered from this data. The column auto-hides when `enabled` is `false`. Each row shows one dot per live pin against that `(endpoint, model)` pair. Dot opacity is `remaining / ttl` with a floor of 0.15, so freshly routed pins are solid and pins close to expiry fade out. A `+N` overflow badge appears once a single `(endpoint, model)` pair holds more than 12 active pins; an em-dash (`—`) marks a pair with no live pins.
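The rendering rules for one Affinity cell can be sketched as follows. This is an illustration under stated assumptions rather than the dashboard's implementation: the names `dot_opacity` and `affinity_cell` are hypothetical, opacity is clamped to 1.0 as a safety measure, and the sketch shows the first 12 pins in input order because the text does not say which pins are drawn when a cell overflows.

```python
MAX_DOTS = 12       # more pins than this produce a "+N" badge
MIN_OPACITY = 0.15  # opacity floor for pins close to expiry


def dot_opacity(remaining, ttl):
    """Opacity = remaining / ttl, floored at MIN_OPACITY.

    Clamping to 1.0 and the ttl <= 0 guard are assumptions.
    """
    if ttl <= 0:
        return MIN_OPACITY
    return max(MIN_OPACITY, min(1.0, remaining / ttl))


def affinity_cell(remainings, ttl):
    """Describe one cell for a single (endpoint, model) pair.

    `remainings` holds the remaining seconds of each live pin.
    """
    if not remainings:
        # No live pins: the cell shows an em-dash placeholder.
        return {"dots": [], "overflow": 0, "placeholder": "\u2014"}
    shown = remainings[:MAX_DOTS]  # assumption: keep input order
    return {
        "dots": [dot_opacity(r, ttl) for r in shown],
        "overflow": max(0, len(remainings) - MAX_DOTS),
        "placeholder": None,
    }


cell = affinity_cell([287.4, 113.0], ttl=300)
print(cell["dots"], cell["overflow"])
```

A pin with 10 seconds left out of a 300-second TTL would compute to roughly 0.03 and is therefore drawn at the 0.15 floor, which keeps it visible until it expires.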
> Seeing multiple dots for what looks like "one chat window" is normal: most chat UIs (Open WebUI, LibreChat, …) fire auxiliary requests (title generation, follow-up suggestions, tag extraction), each with its own first-turn fingerprint and therefore its own pin. See [Conversation Affinity → Why the dashboard may show more than one dot per visible conversation](configuration.md#conversation_affinity) for the details.
### Real-time Usage Stream
```bash