Salman Paracha
2de75d18db
fixing logs
2025-09-05 21:52:46 -07:00
Salman Paracha
e8881c7b8a
fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits
2025-09-05 21:18:55 -07:00
Salman Paracha
2895a07088
cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo
2025-09-05 09:09:17 -07:00
Salman Paracha
00c543667d
fixed integration tests and cleaned up logs
2025-09-04 21:49:54 -07:00
Salman Paracha
ee52c608f7
fixed test cases and added more structured logs
2025-09-04 19:28:47 -07:00
Salman Paracha
ecf453ed70
/v1/messages works with transformations to and from /v1/chat/completions
2025-09-04 15:13:53 -07:00
Salman Paracha
2813a8cfa5
fixing non-streaming responses to transform correctly
2025-09-02 17:42:02 -07:00
Salman Paracha
d4dfbe600f
making sure that we convert the raw bytes to the correct provider type upstream
2025-09-02 16:19:45 -07:00
Salman Paracha
c55979307e
intentionally removing the headers
2025-08-30 23:00:04 -07:00
Salman Paracha
7c4174a821
fixing json parsing errors
2025-08-30 12:52:59 -07:00
Salman Paracha
041a9eda3a
fixed the debug statement that was causing the integration tests for wasm to fail
2025-08-29 18:33:18 -07:00
Salman Paracha
0a0d2c95a3
the serialized bytes length must be set in the response side
2025-08-29 18:18:32 -07:00
Salman Paracha
e7238fb7fd
updated the stream_context to update response bytes
2025-08-28 22:55:12 -07:00
Salman Paracha
9f6d2464f6
fixed issues with non-streaming responses
2025-08-24 18:52:48 -07:00
Salman Paracha
0b41496c45
fixed serialization issues with enums on response
2025-08-24 13:12:15 -07:00
Salman Paracha
7345657612
fixed bugs for integration tests
2025-08-23 16:37:52 -07:00
Salman Paracha
e73a9eb61c
transformations are working. Now need to add some tests next
2025-08-22 14:36:46 -07:00
Salman Paracha
0aa9243093
pushing draft PR
2025-08-21 22:24:07 -07:00
Salman Paracha
89ab51697a
updating the implementation of /v1/chat/completions to use the generi… ( #548 )
* updating the implementation of /v1/chat/completions to use the generic provider interfaces
* saving changes, although we will need a small re-factor after this as well
* more refactoring changes, getting close
* more refactoring changes to avoid unnecessary re-direction and duplication
* more clean up
* more refactoring
* more refactoring to clean code and make stream_context.rs work
* removing unnecessary trait implementations
* some more clean-up
* fixed bugs
* fixing test cases, and making sure all references to the ChatCompletions* objects point to the new types
* refactored changes to support enum dispatch
* removed the dependency on try_streaming_from_bytes into a try_from trait implementation
* updated readme based on new usage
* updated code based on code review comments
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-2.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
2025-08-20 12:55:29 -07:00
Adil Hafeez
a7fddf30f9
better model names ( #517 )
2025-07-11 16:42:16 -07:00
Adil Hafeez
147908ba7e
make arch-router cluster optional ( #518 )
2025-07-08 00:33:40 -07:00
Adil Hafeez
00dc95e034
Add support for updating model preferences ( #510 )
2025-07-02 14:08:19 -07:00
Adil Hafeez
aa9d747fa9
add support for gemini ( #505 )
2025-06-11 15:15:00 -07:00
Adil Hafeez
6c53510f49
Introduce hermesllm library to handle llm message translation ( #501 )
2025-06-10 12:53:27 -07:00
Adil Hafeez
218e9c540d
Add support for json based content types in Message ( #480 )
2025-05-23 00:51:53 -07:00
Adil Hafeez
f5e77bbe65
add support for claude and add first class support for groq and deepseek ( #479 )
2025-05-22 22:55:46 -07:00
Adil Hafeez
27c0f2fdce
Introduce brightstaff, a new terminal service for llm routing ( #477 )
2025-05-19 09:59:22 -07:00
Shuguang Chen
7d4b261a68
Integrate Arch-Function-Chat ( #449 )
2025-04-15 14:39:12 -07:00
Salman Paracha
f31aa59fac
fixed issue with groq LLMs that require the openai in the /v1/chat/co… ( #460 )
* fixed issue with groq LLMs that require the openai in the /v1/chat/completions path. My first change
* updated the GH actions with keys for Groq
* adding missing groq API keys
* add llama-3.2-3b-preview to the model based on adding groq to the demo
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-04-13 14:00:16 -07:00
Adil Hafeez
de221525de
Use better logs ( #452 )
2025-03-27 10:40:20 -07:00
Adil Hafeez
76ec5cda68
fix ollama demo ( #450 )
2025-03-26 11:01:32 -07:00
Adil Hafeez
eb48f3d5bb
use passed in model name in chat completion request ( #445 )
2025-03-21 15:56:17 -07:00
Adil Hafeez
84cd1df7bf
add preliminary support for llm agents ( #432 )
2025-03-19 15:21:34 -07:00
Adil Hafeez
ed3845040e
add demo for deepseek ( #426 )
2025-03-05 14:08:06 -08:00
Adil Hafeez
10cad4d0b7
add health check endpoint for llm gateway ( #420 )
* add health check endpoint for llm gateway
* fix rust tests
2025-03-03 13:11:57 -08:00
Adil Hafeez
e40b13be05
Update arch_config and add tests for arch config file ( #407 )
2025-02-14 19:28:10 -08:00
Adil Hafeez
a62f906432
remove unused cargo.lock files ( #396 )
2025-02-05 20:25:41 -08:00
Adil Hafeez
39266b5084
log improvements and some code refactor ( #379 )
2025-01-31 10:37:53 -08:00
Adil Hafeez
6887d52750
When using ollama token count was not coming in ( #375 )
When using ollama, the token count was not coming in, causing the token count and other metrics to show up as zero. This was not causing tracing to break.
2025-01-21 18:01:56 -08:00
Adil Hafeez
07ef3149b8
add support for using custom upstream llm ( #365 )
2025-01-17 18:25:55 -08:00
José Ulises Niño Rivera
cd1b561192
Break apart metrics into their own module ( #335 )
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-12-09 10:46:46 -08:00
José Ulises Niño Rivera
d002b2042a
Break apart common_types mod ( #334 )
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-12-06 17:25:42 -08:00
Adil Hafeez
36489b4adc
use envoy to publish traces ( #270 )
2024-11-18 17:55:39 -08:00
Adil Hafeez
097513ee60
fix start time of llm filter ( #278 )
* fix start time of llm filter
* fix int tests
2024-11-17 17:01:19 -08:00
Adil Hafeez
d3c17c7abd
move custom tracer to llm filter ( #267 )
2024-11-15 10:44:01 -08:00
Aayush
1d229cba8f
Add in tpot ( #269 )
* add in tpot and tokens per second
* add in debug logs for new stats and update integration tests
* update shared dashboard to include new stats
2024-11-14 15:03:08 -08:00
Aayush
5993e36f22
Update arch stats ( #250 )
2024-11-12 15:03:26 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 ( #255 )
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13
* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
9081eb0f7f
obfuscate auth header ( #254 )
2024-11-08 15:17:39 -06:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing ( #229 )
2024-11-07 22:11:00 -06:00