SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-07-10 22:32:16 +02:00

CREDO23 db8bffab38 perf(prompt-cache): enable Azure prompt_cache_key routing hint Splits the OpenAI-family gate into per-param predicates so AZURE and AZURE_OPENAI configs now receive prompt_cache_key for backend routing affinity (Microsoft auto-caches GPT-4o+ deployments at >=1024 tokens; the key clusters same-prefix requests on the same GPU pool and raises hit rate on turn 2+). prompt_cache_retention stays opted out for Azure because litellm 1.83.14's Azure transformer would drop it silently; revisit when Azure's supported params list is updated.		2026-05-20 11:58:15 +02:00
..
multi_agent_chat	chore: linting	2026-05-15 17:33:44 -07:00
new_chat	perf(prompt-cache): enable Azure prompt_cache_key routing hint	2026-05-20 11:58:15 +02:00
__init__.py	feat: updated agent harness	2026-04-28 09:22:19 -07:00