58 commits

Author SHA1 Message Date
Salman Paracha
566e7b9c09
fixed bug in Bedrock translation code and dramatically improved tracing for outbound LLM traffic (#601)
* dramatically improve LLM traces and fixed bug with Bedrock translation from claude code

* addressing comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-24 14:07:05 -07:00
Salman Paracha
9407ae6af7
Add support for Amazon Bedrock Converse and ConverseStream (#588)
* first commit to get Bedrock Converse API working. Next commit support for streaming and binary frames

* adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent

* Claude Code works with Amazon Bedrock

* added tests for openai streaming from bedrock

* PR comments fixed

* adding support for bedrock in docs as supported provider

* cargo fmt

* reverted to chatgpt models for claude code routing

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
2025-10-22 11:31:21 -07:00
Adil Hafeez
96e0732089
add support for agents (#564) 2025-10-14 14:01:11 -07:00
Salman Paracha
226139e907
adding support for Qwen models and fixed issue with passing PATH vari… (#583)
* adding support for Qwen models and fixed issue with passing PATH variable

* don't need to have qwen in the model alias routing example

* fixed base_url for qwen

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-01 21:57:58 -07:00
Salman Paracha
045a5e9751
adding support for moonshot and z-ai (#578)
* adding support for moonshot and z-ai

* Revert unwanted changes to arch_config.yaml

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-09-30 12:24:06 -07:00
Salman Paracha
fbe82351c0
Salmanap/fix docs new providers model alias (#571)
* fixed docs and added ollama as a first-class LLM provider

* matching the LLM routing section on the README.md to the docs

* updated the section on preference-based routing

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>
2025-09-19 10:19:57 -07:00
Salman Paracha
8d0b468345
draft commit to add support for xAI, TogetherAI, AzureOpenAI (#570)
* draft commit to add support for xAI, LambdaAI, TogetherAI, AzureOpenAI

* fixing failing tests and updating rendered config file

* Update arch_config_with_aliases.yaml

* adding the AZURE_API_KEY to the GH workflow for e2e

* fixing GH secrets

* adding validation for azure_openai

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>
2025-09-18 18:36:30 -07:00
Salman Paracha
4eb2b410c5
adding support for model aliases in archgw (#566)
* adding support for model aliases in archgw

* fixed PR based on feedback

* removing README. Not relevant for PR

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
2025-09-16 11:12:08 -07:00
Salman Paracha
fb0581fd39
add support for v1/messages and transformations (#558)
* pushing draft PR

* transformations are working. Now need to add some tests next

* updated tests and added necessary response transformations for Anthropic's message response object

* fixed bugs for integration tests

* fixed doc tests

* fixed serialization issues with enums on response

* adding some debug logs to help

* fixed issues with non-streaming responses

* updated the stream_context to update response bytes

* the serialized bytes length must be set in the response side

* fixed the debug statement that was causing the integration tests for wasm to fail

* fixing json parsing errors

* intentionally removing the headers

* making sure that we convert the raw bytes to the correct provider type upstream

* fixing non-streaming responses to transform correctly

* /v1/messages works with transformations to and from /v1/chat/completions

* updating the CLI and demos to support anthropic vs. claude

* adding the anthropic key to the preference based routing tests

* fixed test cases and added more structured logs

* fixed integration tests and cleaned up logs

* added python client tests for anthropic and openai

* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo

* fixing the tests. python dependency order was broken

* updated the openAI client to fix demos

* removed the raw response debug statement

* fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits

* fixing logs

* moved away from string literals to consts

* fixed streaming from Anthropic Client to OpenAI

* removed debug statement that would likely trip up integration tests

* fixed integration tests for llm_gateway

* cleaned up test cases and removed unnecessary crates

* fixing comments from PR

* fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
2025-09-10 07:40:30 -07:00
Salman Paracha
89ab51697a
updating the implementation of /v1/chat/completions to use the generi… (#548)
* updating the implementation of /v1/chat/completions to use the generic provider interfaces

* saving changes, although we will need a small re-factor after this as well

* more refactoring changes, getting close

* more refactoring changes to avoid unnecessary re-direction and duplication

* more clean up

* more refactoring

* more refactoring to clean code and make stream_context.rs work

* removing unnecessary trait implementations

* some more clean-up

* fixed bugs

* fixing test cases, and making sure all references to the ChatCompletions* objects point to the new types

* refactored changes to support enum dispatch

* removed the dependency on try_streaming_from_bytes into a try_from trait implementation

* updated readme based on new usage

* updated code based on code review comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-2.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
2025-08-20 12:55:29 -07:00
Adil Hafeez
d341f4365b
In request path use same format for usage preferences as arch_config (#533) 2025-07-21 18:31:19 -07:00
Adil Hafeez
a7fddf30f9
better model names (#517) 2025-07-11 16:42:16 -07:00
Adil Hafeez
147908ba7e
make arch-router cluster optional (#518) 2025-07-08 00:33:40 -07:00
Adil Hafeez
00dc95e034
Add support for updating model preferences (#510) 2025-07-02 14:08:19 -07:00
Adil Hafeez
7baec20772
release 0.3.2 (#507) 2025-06-13 17:02:20 -07:00
Adil Hafeez
aa9d747fa9
add support for gemini (#505) 2025-06-11 15:15:00 -07:00
Adil Hafeez
6c53510f49
Introduce hermesllm library to handle llm message translation (#501) 2025-06-10 12:53:27 -07:00
Adil Hafeez
0d190a6e5c
update code to use new json based system prompt for routing (#493) 2025-05-30 17:40:46 -07:00
Adil Hafeez
8d12a9a6e0
add arch provider (#494) 2025-05-30 17:12:52 -07:00
Adil Hafeez
176f039bbc
fix model warning and use openwebui for preference based router demo 2025-05-30 12:29:56 -07:00
Adil Hafeez
470cdf9843
use provider_name as model_id /v1/models api (#490) 2025-05-29 11:23:18 -07:00
Adil Hafeez
9c4733590f
add support for openwebui (#487) 2025-05-28 19:08:00 -07:00
Adil Hafeez
d050dfb85a
When router usage is defined ensure that router model is defined too (#481) 2025-05-23 08:46:12 -07:00
Adil Hafeez
218e9c540d
Add support for json based content types in Message (#480) 2025-05-23 00:51:53 -07:00
Adil Hafeez
f5e77bbe65
add support for claude and add first class support for groq and deepseek (#479) 2025-05-22 22:55:46 -07:00
Adil Hafeez
27c0f2fdce
Introduce brightstaff a new terminal service for llm routing (#477) 2025-05-19 09:59:22 -07:00
Shuguang Chen
7d4b261a68
Integrate Arch-Function-Chat (#449) 2025-04-15 14:39:12 -07:00
Salman Paracha
f31aa59fac
fixed issue with groq LLMs that require the openai in the /v1/chat/co… (#460)
* fixed issue with groq LLMs that require the openai in the /v1/chat/completions path. My first change

* updated the GH actions with keys for Groq

* adding missing groq API keys

* add llama-3.2-3b-preview to the model based on adding groq to the demo

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-04-13 14:00:16 -07:00
Adil Hafeez
de221525de
Use better logs (#452) 2025-03-27 10:40:20 -07:00
Adil Hafeez
eb48f3d5bb
use passed in model name in chat completion request (#445) 2025-03-21 15:56:17 -07:00
Adil Hafeez
84cd1df7bf
add preliminary support for llm agents (#432) 2025-03-19 15:21:34 -07:00
Shuguang Chen
e77fc47225
Handle intent matching better in arch gateway (#391) 2025-03-04 12:49:13 -08:00
Adil Hafeez
10cad4d0b7
add health check endpoint for llm gateway (#420)
* add health check endpoint for llm gateway

* fix rust tests
2025-03-03 13:11:57 -08:00
Adil Hafeez
e40b13be05
Update arch_config and add tests for arch config file (#407) 2025-02-14 19:28:10 -08:00
Adil Hafeez
8de6eacfbd
spotify demo with optimized context window code change (#397) 2025-02-07 19:14:15 -08:00
Adil Hafeez
2bd61d628c
add ability to specify custom http headers in api endpoint (#386) 2025-02-06 11:48:09 -08:00
Adil Hafeez
e82f8f216f
Encode parameter values in http path and ... (#395)
* Encode parameter values in http path and ...

- don't send param values in request body in http get request
- send param values in http post request

* rust tests

* refactor code

* add tests
2025-02-06 11:00:47 -08:00
Adil Hafeez
39266b5084
log improvements and some code refactor (#379) 2025-01-31 10:37:53 -08:00
Adil Hafeez
07ef3149b8
add support for using custom upstream llm (#365) 2025-01-17 18:25:55 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway (#328) 2024-12-20 13:25:01 -08:00
José Ulises Niño Rivera
cd1b561192
Break apart metrics into their own module (#335)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-12-09 10:46:46 -08:00
José Ulises Niño Rivera
d002b2042a
Break apart common_types mod (#334)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-12-06 17:25:42 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples (#330) 2024-12-06 14:37:33 -08:00
José Ulises Niño Rivera
be8c3c9ea3
Remove blanket unused imports from the common crate (#292)
* Remove blanket unused imports from the common crate

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

* update

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

---------

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-11-25 17:19:06 -08:00
Adil Hafeez
36489b4adc
use envoy to publish traces (#270) 2024-11-18 17:55:39 -08:00
Aayush
5993e36f22
Update arch stats (#250) 2024-11-12 15:03:26 -08:00
Adil Hafeez
30647fd508
Add service to stream custom otel traces to otel-collector (#262) 2024-11-12 11:09:40 -08:00
Adil Hafeez
9081eb0f7f
obfuscate auth header (#254) 2024-11-08 15:17:39 -06:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing (#229) 2024-11-07 22:11:00 -06:00
José Ulises Niño Rivera
662a840ac5
Add support for streaming and fixes few issues (see description) (#202) 2024-10-28 17:05:06 -07:00