* cleaning up plano cli commands
* adding support for wildcard model providers
* fixing compile errors
* fixing bugs related to default model provider, provider hint and duplicates in the model provider list
* fixed cargo fmt issues
* updating tests to always include the model id
* using default for the prompt_gateway path
* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config
* making sure that all aliases and models match the config
* fixed the config generator to allow base_url provider LLMs to include wildcard models
* re-ran the models list utility and added a shell script to run it
* updating docs to mention wildcard model providers
* converted provider_models.json to yaml and added that file to our docs for reference
* updating the build docs to use the new root-based build
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* adding function_calling functionality via rust
* fixed rendered YAML file
* removed model_server from envoy.template and forwarding traffic to bright_staff
* fixed bugs in function_calling.rs that were breaking tests. All good now
* updating e2e test to clean up disk usage
* removing Arch* models as the default model when none is specified
* if the user sets arch-function base_url we should honor it
* fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build
* adding a constant for Arch-Function model name
* fixing some edge cases with calls made to Arch-Function
* fixed JSON parsing issues in function_calling.rs
* fixed bug where the raw response from Arch-Function was re-encoded
* removed debug from supervisord.conf
* commenting out disk cleanup
* adding back disk space
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
* fixes for claude code routing. first commit
* removing redundant enum tags for cache_control
* making sure that claude code can run via the archgw cli
* fixing broken config
* adding a README.md and updated the cli to use more of our defined patterns for params
* fixed config.yaml
* minor fixes to make sure PR is clean. Ready to ship
* adding claude-sonnet-4-5 to the config
* fixes based on PR
* fixed alias for README
* fixed 400 error handling tests, now that we set temperature to 1.0 for GPT-5
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
* pushing draft PR
* transformations are working. Now need to add some tests next
* updated tests and added necessary response transformations for Anthropic's message response object
* fixed bugs for integration tests
* fixed doc tests
* fixed serialization issues with enums on response
* adding some debug logs to help
* fixed issues with non-streaming responses
* updated the stream_context to update response bytes
* the serialized bytes length must be set in the response side
* fixed the debug statement that was causing the integration tests for wasm to fail
* fixing json parsing errors
* intentionally removing the headers
* making sure that we convert the raw bytes to the correct provider type upstream
* fixing non-streaming responses to transform correctly
* /v1/messages works with transformations to and from /v1/chat/completions
* updating the CLI and demos to support anthropic vs. claude
* adding the anthropic key to the preference based routing tests
* fixed test cases and added more structured logs
* fixed integration tests and cleaned up logs
* added python client tests for anthropic and openai
* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo
* fixing the tests. python dependency order was broken
* updated the openAI client to fix demos
* removed the raw response debug statement
* fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits
* fixing logs
* moved away from string literals to consts
* fixed streaming from Anthropic Client to OpenAI
* removed debug statement that would likely trip up integration tests
* fixed integration tests for llm_gateway
* cleaned up test cases and removed unnecessary crates
* fixing comments from PR
* fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
* fixed issue with groq LLMs that require the openai prefix in the /v1/chat/completions path. My first change
* updated the GH actions with keys for Groq
* adding missing groq API keys
* add llama-3.2-3b-preview to the models after adding groq to the demo
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>