* feat(provider): add xiaomi as first-class provider
* feat(demos): add xiaomi mimo integration demo
* refactor(demos): remove Xiaomi MiMo integration demo and update documentation
* updating model list and adding the Xiaomi models
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>
* support configurable orchestrator model via orchestration config section
* add self-hosting docs and demo for Plano-Orchestrator
* list all Plano-Orchestrator model variants in docs
* use overrides for custom routing and orchestration model
* update docs
* update orchestrator model name
* rename arch provider to plano, use llm_routing_model and agent_orchestration_model
* regenerate rendered config reference
* Add Codex CLI support; xAI response improvements
* Add native Plano running check and update CLI agent error handling
* adding PR suggestions for transformations and code quality
* message extraction logic in ResponsesAPIRequest
* xAI support for Responses API by routing to native endpoint + refactor code
* cleaning up plano cli commands
* adding support for wildcard model providers
* fixing compile errors
* fixing bugs related to default model provider, provider hint and duplicates in the model provider list
* fixed cargo fmt issues
* updating tests to always include the model id
* using default for the prompt_gateway path
* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config
* making sure that all aliases and models match the config
* fixed the config generator to allow LLMs from base_url providers to include wildcard models
* re-ran the models list utility and added a shell script to run it
* updating docs to mention wildcard model providers
* updated provider_models.json to yaml, added that file to our docs for reference
* updating the build docs to use the new root-based build
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* making first commit. still need to work on streaming responses
* stream buffer implementation with tests
* adding grok API keys to workflow
* fixed changes based on code review
* adding support for bedrock models
* fixed issues with translation to claude code
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* first commit to get Bedrock Converse API working. Next commit: support for streaming and binary frames
* adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent
* Claude Code works with Amazon Bedrock
* added tests for openai streaming from bedrock
* PR comments fixed
* adding support for bedrock in docs as supported provider
* cargo fmt
* reverted to chatgpt models for claude code routing
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
* adding support for Qwen models and fixed issue with passing PATH variable
* don't need to have qwen in the model alias routing example
* fixed base_url for qwen
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
* adding support for moonshot and z-ai
* Revert unwanted changes to arch_config.yaml
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
* fixed docs and added ollama as a first-class LLM provider
* matching the LLM routing section on the README.md to the docs
* updated the section on preference-based routing
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>
* pushing draft PR
* transformations are working. Now need to add some tests next
* updated tests and added necessary response transformations for Anthropic's message response object
* fixed bugs for integration tests
* fixed doc tests
* fixed serialization issues with enums on response
* adding some debug logs to help
* fixed issues with non-streaming responses
* updated the stream_context to update response bytes
* the serialized bytes length must be set in the response side
* fixed the debug statement that was causing the integration tests for wasm to fail
* fixing json parsing errors
* intentionally removing the headers
* making sure that we convert the raw bytes to the correct provider type upstream
* fixing non-streaming responses to transform correctly
* /v1/messages works with transformations to and from /v1/chat/completions
* updating the CLI and demos to support anthropic vs. claude
* adding the anthropic key to the preference based routing tests
* fixed test cases and added more structured logs
* fixed integration tests and cleaned up logs
* added python client tests for anthropic and openai
* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo
* fixing the tests. python dependency order was broken
* updated the openAI client to fix demos
* removed the raw response debug statement
* fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits
* fixing logs
* moved away from string literals to consts
* fixed streaming from Anthropic Client to OpenAI
* removed debug statement that would likely trip up integration tests
* fixed integration tests for llm_gateway
* cleaned up test cases and removed unnecessary crates
* fixing comments from PR
* fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
* updating the implementation of /v1/chat/completions to use the generic provider interfaces
* saving changes, although we will need a small re-factor after this as well
* more refactoring changes, getting close
* more refactoring changes to avoid unnecessary re-direction and duplication
* more clean up
* more refactoring
* more refactoring to clean code and make stream_context.rs work
* removing unnecessary trait implementations
* some more clean-up
* fixed bugs
* fixing test cases, and making sure all references to the ChatCompletions* objects point to the new types
* refactored changes to support enum dispatch
* replaced the dependency on try_streaming_from_bytes with a try_from trait implementation
* updated readme based on new usage
* updated code based on code review comments
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-2.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>