mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-05-25 19:15:18 +02:00
Splits the OpenAI-family gate into per-param predicates so AZURE and AZURE_OPENAI configs now receive prompt_cache_key for backend routing affinity (Microsoft auto-caches GPT-4o+ deployments at >=1024 tokens; the key clusters same-prefix requests on the same GPU pool and raises hit rate on turn 2+). prompt_cache_retention stays opted out for Azure because litellm 1.83.14's Azure transformer would drop it silently; revisit when Azure's supported params list is updated. |
||
|---|---|---|
| .. | ||
| adapters | ||
| agents | ||
| connector_indexers | ||
| connectors | ||
| db | ||
| e2e_fakes | ||
| etl_pipeline | ||
| google_unification | ||
| indexing_pipeline | ||
| middleware | ||
| observability | ||
| routes | ||
| services | ||
| tasks | ||
| utils | ||
| __init__.py | ||
| test_error_contract.py | ||
| test_obsidian_plugin_indexer.py | ||
| test_stream_new_chat_contract.py | ||