mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-05-25 19:15:18 +02:00
Splits the OpenAI-family gate into per-param predicates so AZURE and AZURE_OPENAI configs now receive prompt_cache_key for backend routing affinity (Microsoft auto-caches GPT-4o+ deployments at >=1024 tokens; the key clusters same-prefix requests on the same GPU pool and raises hit rate on turn 2+). prompt_cache_retention stays opted out for Azure because litellm 1.83.14's Azure transformer would drop it silently; revisit when Azure's supported params list is updated. |
||
|---|---|---|
| .. | ||
| alembic | ||
| app | ||
| scripts | ||
| tests | ||
| .dockerignore | ||
| .env.example | ||
| .gitignore | ||
| .python-version | ||
| alembic.ini | ||
| celery_worker.py | ||
| Dockerfile | ||
| main.py | ||
| pyproject.toml | ||
| uv.lock | ||