SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-07-10 22:32:16 +02:00

CREDO23 db8bffab38 perf(prompt-cache): enable Azure prompt_cache_key routing hint Splits the OpenAI-family gate into per-param predicates so AZURE and AZURE_OPENAI configs now receive prompt_cache_key for backend routing affinity (Microsoft auto-caches GPT-4o+ deployments at >=1024 tokens; the key clusters same-prefix requests on the same GPU pool and raises hit rate on turn 2+). prompt_cache_retention stays opted out for Azure because litellm 1.83.14's Azure transformer would drop it silently; revisit when Azure's supported params list is updated.		2026-05-20 11:58:15 +02:00
..
autocomplete	Merge commit '`61f4d05cd1`' into dev_mod	2026-04-28 09:25:41 -07:00
multi_agent_chat	perf(kb-planner): route internal planner calls to dedicated small/fast LLM	2026-05-20 11:42:52 +02:00
new_chat	perf(prompt-cache): enable Azure prompt_cache_key routing hint	2026-05-20 11:58:15 +02:00
podcaster	cloud: added openrouter integration with global configs	2026-04-15 23:46:29 -07:00
video_presentation	cloud: added openrouter integration with global configs	2026-04-15 23:46:29 -07:00
__init__.py	feat: Added chat_history to researcher agent	2025-05-10 20:06:19 -07:00