* add pluggable session cache with Redis backend
* add Redis session affinity demos (Docker Compose and Kubernetes)
* address PR review feedback on session cache
* document Redis session cache backend for model affinity
* sync rendered config reference with session_cache addition
* add tenant-scoped Redis session cache keys and remove dead log_affinity_hit
- Add tenant_header to SessionCacheConfig; when set, cache keys are scoped
as plano:affinity:{tenant_id}:{session_id} for multi-tenant isolation
- Thread tenant_id through RouterService, routing_service, and llm handlers
- Use Cow<'_, str> in session_key to avoid allocation when no tenant is set
- Remove unused log_affinity_hit (logging was already inlined at call sites)
* remove session_affinity_redis and session_affinity_redis_k8s demos
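
The tenant-scoped key scheme described in the session cache commits above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `AFFINITY_PREFIX` constant, the `redis_key` helper, and the shape of the key when no tenant is set (`plano:affinity:{session_id}`) are assumptions; only the `plano:affinity:{tenant_id}:{session_id}` shape and the use of `Cow<'_, str>` come from the commit message.

```rust
use std::borrow::Cow;

/// Hypothetical key prefix, inferred from the key shape in the commit message.
const AFFINITY_PREFIX: &str = "plano:affinity:";

/// Returns the part of the cache key that follows the prefix.
/// Borrows `session_id` when no tenant is set, so that path does not allocate.
fn session_key<'a>(tenant_id: Option<&str>, session_id: &'a str) -> Cow<'a, str> {
    match tenant_id {
        Some(tenant) => Cow::Owned(format!("{}:{}", tenant, session_id)),
        None => Cow::Borrowed(session_id),
    }
}

/// Hypothetical helper that builds the full Redis key; this always allocates.
fn redis_key(tenant_id: Option<&str>, session_id: &str) -> String {
    format!("{}{}", AFFINITY_PREFIX, session_key(tenant_id, session_id))
}
```

Returning `Cow` lets the single-tenant path hand back the caller's `&str` unchanged, while the multi-tenant path pays for one `format!` per lookup.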
* feat(provider): add xiaomi as first-class provider
* feat(demos): add xiaomi mimo integration demo
* refactor(demos): remove Xiaomi MiMo integration demo and update documentation
* updating model list and adding the xiaomi models
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>
* fix: route Perplexity OpenAI paths without /v1
* add tests for Perplexity provider handling in LLM module
* refactor: use constant for Perplexity provider prefix in LLM module
* moving const to top of file
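
The Perplexity routing fix above might look something like the sketch below, assuming the provider is identified by a `perplexity/` prefix on the model id and that its OpenAI-compatible paths are served without a leading `/v1`. The constant's name and value and the `upstream_path` function are hypothetical and may not match the LLM module.

```rust
/// Hypothetical constant; the actual name and value in the LLM module may differ.
const PERPLEXITY_PROVIDER_PREFIX: &str = "perplexity/";

/// Returns the upstream path for a request, dropping a leading `/v1`
/// when the model id belongs to the Perplexity provider.
fn upstream_path<'a>(model_id: &str, path: &'a str) -> &'a str {
    if !model_id.starts_with(PERPLEXITY_PROVIDER_PREFIX) {
        return path;
    }
    match path.strip_prefix("/v1") {
        // Only strip when `/v1` is a whole path segment (not `/v1beta`, etc.).
        Some(rest) if rest.starts_with('/') => rest,
        _ => path,
    }
}
```

Under these assumptions, `upstream_path("perplexity/sonar", "/v1/chat/completions")` yields `/chat/completions`, and paths for other providers pass through untouched.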
* support configurable orchestrator model via orchestration config section
* add self-hosting docs and demo for Plano-Orchestrator
* list all Plano-Orchestrator model variants in docs
* use overrides for custom routing and orchestration model
* update docs
* update orchestrator model name
* rename arch provider to plano, use llm_routing_model and agent_orchestration_model
* regenerate rendered config reference
* Add Codex CLI support; xAI response improvements
* Add native Plano running check and update CLI agent error handling
* adding PR suggestions for transformations and code quality
* message extraction logic in ResponsesAPIRequest
* xAI support for Responses API by routing to native endpoint + refactor code
* cleaning up plano cli commands
* adding support for wildcard model providers
* fixing compile errors
* fixing bugs related to default model provider, provider hint and duplicates in the model provider list
* fixed cargo fmt issues
* updating tests to always include the model id
* using default for the prompt_gateway path
* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config
* making sure that all aliases and models match the config
* fixed the config generator to allow for base_url providers LLMs to include wildcard models
* re-ran the models list utility and added a shell script to run it
* updating docs to mention wildcard model providers
* updated provider_models.json to yaml, added that file to our docs for reference
* updating the build docs to use the new root-based build
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
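
As a rough illustration of the wildcard model provider support in the commits above, the sketch below matches a requested model id against a configured entry. The `provider/*` syntax and the `entry_matches` function are assumptions made for this example; the actual config format and matching rules may differ.

```rust
/// Returns true if `model_id` is covered by `entry`, where `entry` is either an
/// exact model id or a provider wildcard such as `openai/*` (assumed syntax).
fn entry_matches(entry: &str, model_id: &str) -> bool {
    match entry.strip_suffix("/*") {
        // Wildcard: the model must sit under the same provider and be non-empty.
        Some(provider) => model_id
            .strip_prefix(provider)
            .and_then(|rest| rest.strip_prefix('/'))
            .map_or(false, |model| !model.is_empty()),
        // No wildcard: require an exact match.
        None => entry == model_id,
    }
}
```

Requiring the `/` separator after the provider name keeps `openai/*` from accidentally matching a differently named provider that merely shares the same leading characters.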
* adding support for signals
* reducing false positives for signals like positive interaction
* adding docs. Still need to fix the messages list, but waiting on PR #621
* Improve frustration detection: normalize contractions and refine punctuation
* Further refine test cases with longer messages
* minor doc changes
* fixing echo statement for build
* fixing the messages construction and using the trait for signals
* update signals docs
* fixed some minor doc changes
* added more tests and fixed documentation. PR 100% ready
* made fixes based on PR comments
* Optimize latency
1. replace sliding window approach with trigram containment check
2. add code to pre-compute ngrams for patterns
* removed some debug statements to make tests easier to read
* PR comments to make ObservableStreamProcessor accept optional Vec<Messages>
* fixed PR comments
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
Co-authored-by: MeiyuZhong <mariazhong9612@gmail.com>
Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>
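
The latency optimization in the signals commits above (trigram containment with pre-computed n-grams for patterns) can be sketched as follows. This is one plausible reading of that description, not the code from the PR: the `Pattern` type, the `matches_any` function, the use of character-level trigrams, and the threshold parameter are all assumptions.

```rust
use std::collections::HashSet;

/// Character trigrams of a lowercased string.
fn trigrams(text: &str) -> HashSet<[char; 3]> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    chars.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// A pattern whose trigrams are computed once, up front.
struct Pattern {
    grams: HashSet<[char; 3]>,
}

impl Pattern {
    fn new(text: &str) -> Self {
        Self { grams: trigrams(text) }
    }

    /// Fraction of the pattern's trigrams that also occur in `message_grams`.
    fn containment(&self, message_grams: &HashSet<[char; 3]>) -> f64 {
        if self.grams.is_empty() {
            return 0.0;
        }
        let shared = self
            .grams
            .iter()
            .filter(|g| message_grams.contains(*g))
            .count();
        shared as f64 / self.grams.len() as f64
    }
}

/// True if any pattern's containment in the message meets `threshold`.
fn matches_any(patterns: &[Pattern], message: &str, threshold: f64) -> bool {
    // The message's trigrams are computed once and reused for every pattern.
    let message_grams = trigrams(message);
    patterns
        .iter()
        .any(|p| p.containment(&message_grams) >= threshold)
}
```

Compared with sliding a window over the message for each pattern, this does one pass to build the message's trigram set and then one set lookup per pattern trigram.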
* fixed reasoning failures
* adding debugging
* made several fixes for transmission issues for SSeEvents, incomplete handling of json types by anthropic, and wrote a bunch of tests
* removed debugging from supervisord.conf
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* agents framework demo
* more changes
* add more changes
* pending changes
* fix tests
* fix more
* rebase with main and better handle error from mcp
* add trace for filters
* add test for client error, server error and for mcp error
* update schema validate code and rename kind => type in agent_filter
* fix agent description and pre-commit
* fix tests
* add provider specific request parsing in agents chat
* fix precommit and tests
* cleanup demo
* update readme
* fix pre-commit
* refactor tracing
* fix fmt
* fix: handle MessageContent enum in responses API conversion
- Update request.rs to handle new MessageContent enum structure from main
- MessageContent can now be Text(String) or Items(Vec<InputContent>)
- Handle new InputItem variants (ItemReference, FunctionCallOutput)
- Fixes compilation error after merging latest main (#632)
* address pr feedback
* fix span
* fix build
* update openai version
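
The `MessageContent` handling mentioned in the responses API conversion fix above could be approached along these lines. The types here are simplified stand-ins: only the `Text(String)` and `Items(Vec<InputContent>)` variants come from the commit message, while the `InputContent` variants, their fields, and the `content_text` function are hypothetical.

```rust
/// Simplified stand-in for the real type, which likely has more variants and fields.
#[derive(Debug, Clone, PartialEq)]
enum InputContent {
    InputText { text: String },
    // Hypothetical non-text variant, included to show that such items are skipped.
    InputImage { image_url: String },
}

/// The two content shapes named in the commit message.
#[derive(Debug, Clone, PartialEq)]
enum MessageContent {
    Text(String),
    Items(Vec<InputContent>),
}

/// Flattens either content shape into a single string, ignoring non-text items.
fn content_text(content: &MessageContent) -> String {
    match content {
        MessageContent::Text(text) => text.clone(),
        MessageContent::Items(items) => items
            .iter()
            .filter_map(|item| match item {
                InputContent::InputText { text } => Some(text.as_str()),
                InputContent::InputImage { .. } => None,
            })
            .collect::<Vec<_>>()
            .join("\n"),
    }
}
```

Matching exhaustively on both enums means that adding a new variant upstream produces a compile error at the conversion site, which is the kind of failure the commit describes fixing after the merge.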
* first commit with tests to enable state management via memory
* fixed logs to follow the conversational flow a bit better
* added support for supabase
* added the state_storage_v1_responses flag, and use that to store state appropriately
* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo
* fixed mixed inputs from openai v1/responses api (#632)
* fixed mixed inputs from openai v1/responses api
* removing tracing from model-alias-routing
* handling additional input types from openairs
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* resolving PR comments
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* adding canonical tracing support via bright-staff
* improved formatting for tools in the traces
* removing anthropic from the currency exchange demo
* using Envoy to transport traces, not calling OTEL directly
* moving otel collector cluster outside tracing if/else
* minor fixes to not write to the OTEL collector if tracing is disabled
* fixed PR comments and added more trace attributes
* more fixes based on PR comments
* more clean up based on PR comments
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>