* add pluggable session cache with Redis backend
* add Redis session affinity demos (Docker Compose and Kubernetes)
* address PR review feedback on session cache
* document Redis session cache backend for model affinity
* sync rendered config reference with session_cache addition
* add tenant-scoped Redis session cache keys and remove dead log_affinity_hit
- Add tenant_header to SessionCacheConfig; when set, cache keys are scoped
as plano:affinity:{tenant_id}:{session_id} for multi-tenant isolation
- Thread tenant_id through RouterService, routing_service, and llm handlers
- Use Cow<'_, str> in session_key to avoid allocation when no tenant is set
- Remove unused log_affinity_hit (logging was already inlined at call sites)
* remove session_affinity_redis and session_affinity_redis_k8s demos
* support configurable orchestrator model via orchestration config section
* add self-hosting docs and demo for Plano-Orchestrator
* list all Plano-Orchestrator model variants in docs
* use overrides for custom routing and orchestration model
* update docs
* update orchestrator model name
* rename arch provider to plano, use llm_routing_model and agent_orchestration_model
* regenerate rendered config reference
* first commit with tests to enable state mamangement via memory
* fixed logs to follow the conversational flow a bit better
* added support for supabase
* added the state_storage_v1_responses flag, and use that to store state appropriately
* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo
* fixed mixed inputs from openai v1/responses api (#632)
* fixed mixed inputs from openai v1/responses api
* removing tracing from model-alias-rouing
* handling additional input types from openairs
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* resolving PR comments
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* adding function_calling functionality via rust
* fixed rendered YAML file
* removed model_server from envoy.template and forwarding traffic to bright_staff
* fixed bugs in function_calling.rs that were breaking tests. All good now
* updating e2e test to clean up disk usage
* removing Arch* models to be used as a default model if one is not specified
* if the user sets arch-function base_url we should honor it
* fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build
* adding a constant for Arch-Function model name
* fixing some edge cases with calls made to Arch-Function
* fixed JSON parsing issues in function_calling.rs
* fixed bug where the raw response from Arch-Function was re-encoded
* removed debug from supervisord.conf
* commenting out disk cleanup
* adding back disk space
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* Fix link issues and add icons
* Improve Doc
* fix test
* making minor modifications to shuguangs' doc changes
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
* init update
* Update terminology.rst
* fix the branch to create an index.html, and fix pre-commit issues
* Doc update
* made several changes to the docs after Shuguang's revision
* fixing pre-commit issues
* fixed the reference file to the final prompt config file
* added google analytics
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>