Commit graph

64 commits

Author SHA1 Message Date
Adil Hafeez
0d190a6e5c
update code to use new json based system prompt for routing (#493) 2025-05-30 17:40:46 -07:00
Adil Hafeez
8d12a9a6e0
add arch provider (#494) 2025-05-30 17:12:52 -07:00
Adil Hafeez
176f039bbc
fix model warning and use openwebui for preference based router demo 2025-05-30 12:29:56 -07:00
Adil Hafeez
fffa837a06
separate out currency exchange and preference based routing (#491) 2025-05-30 02:14:37 -07:00
Adil Hafeez
470cdf9843
use provider_name as model_id /v1/models api (#490) 2025-05-29 11:23:18 -07:00
Adil Hafeez
9c4733590f
add support for openwebui (#487) 2025-05-28 19:08:00 -07:00
Adil Hafeez
d29eba4102
trim conversation if it exceeds max limit of what router model can handle (#488) 2025-05-27 20:28:22 -07:00
Adil Hafeez
99dd900a34
fix panic in brightstaff (#485)
make router section optional in arch_config
2025-05-23 09:37:25 -07:00
Adil Hafeez
d050dfb85a
When router usage is defined ensure that router model is defined too (#481) 2025-05-23 08:46:12 -07:00
Adil Hafeez
218e9c540d
Add support for json based content types in Message (#480) 2025-05-23 00:51:53 -07:00
Adil Hafeez
f5e77bbe65
add support for claude and add first class support for groq and deepseek (#479) 2025-05-22 22:55:46 -07:00
Adil Hafeez
27c0f2fdce
Introduce brightstaff a new terminal service for llm routing (#477) 2025-05-19 09:59:22 -07:00
Shuguang Chen
7d4b261a68
Integrate Arch-Function-Chat (#449) 2025-04-15 14:39:12 -07:00
Salman Paracha
f31aa59fac
fixed issue with groq LLMs that require the openai in the /v1/chat/co… (#460)
* fixed issue with groq LLMs that require the openai in the /v1/chat/completions path. My first change

* updated the GH actions with keys for Groq

* adding missing groq API keys

* add llama-3.2-3b-preview to the model based on adding groq to the demo

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-04-13 14:00:16 -07:00
Adil Hafeez
de221525de
Use better logs (#452) 2025-03-27 10:40:20 -07:00
Adil Hafeez
76ec5cda68
fix ollama demo (#450) 2025-03-26 11:01:32 -07:00
Adil Hafeez
eb48f3d5bb
use passed in model name in chat completion request (#445) 2025-03-21 15:56:17 -07:00
Adil Hafeez
84cd1df7bf
add preliminary support for llm agents (#432) 2025-03-19 15:21:34 -07:00
Adil Hafeez
ed3845040e
add demo for deepseek (#426) 2025-03-05 14:08:06 -08:00
Shuguang Chen
e77fc47225
Handle intent matching better in arch gateway (#391) 2025-03-04 12:49:13 -08:00
Adil Hafeez
10cad4d0b7
add health check endpoint for llm gateway (#420)
* add health check endpoint for llm gateway

* fix rust tests
2025-03-03 13:11:57 -08:00
Adil Hafeez
e40b13be05
Update arch_config and add tests for arch config file (#407) 2025-02-14 19:28:10 -08:00
Adil Hafeez
8de6eacfbd
spotify demo with optimized context window code change (#397) 2025-02-07 19:14:15 -08:00
Adil Hafeez
2bd61d628c
add ability to specify custom http headers in api endpoint (#386) 2025-02-06 11:48:09 -08:00
Adil Hafeez
e82f8f216f
Encode parameter values in http path and ... (#395)
* Encode parameter values in http path and ...

- don't send param values in request body in http get request
- send param values in http post request

* rust tests

* refactor code

* add tests
2025-02-06 11:00:47 -08:00
Adil Hafeez
a62f906432
remove unused cargo.lock files (#396) 2025-02-05 20:25:41 -08:00
Adil Hafeez
39266b5084
log improvements and some code refactor (#379) 2025-01-31 10:37:53 -08:00
Adil Hafeez
6887d52750
When using ollama token count was not coming in (#375)
When using ollama, the token count was not coming in, resulting in token count and other metrics showing up as zero. This was not causing tracing to break.
2025-01-21 18:01:56 -08:00
Adil Hafeez
07ef3149b8
add support for using custom upstream llm (#365) 2025-01-17 18:25:55 -08:00
Adil Hafeez
3fc21de60c
Send per prompt target system prompt (#368)
* update prompt target name after arch_fc has identified tool

* add test for currency exchange
2025-01-16 15:11:37 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway (#328) 2024-12-20 13:25:01 -08:00
José Ulises Niño Rivera
cd1b561192
Break apart metrics into their own module (#335)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-12-09 10:46:46 -08:00
José Ulises Niño Rivera
d002b2042a
Break apart common_types mod (#334)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-12-06 17:25:42 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples (#330) 2024-12-06 14:37:33 -08:00
José Ulises Niño Rivera
be8c3c9ea3
Remove blanket unused imports from the common crate (#292)
* Remove blanket unused imports from the common crate

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

* update

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

---------

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-11-25 17:19:06 -08:00
Adil Hafeez
9c6fcdb771
use fix prompt guards (#303) 2024-11-25 17:16:35 -08:00
Adil Hafeez
36489b4adc
use envoy to publish traces (#270) 2024-11-18 17:55:39 -08:00
Adil Hafeez
097513ee60
fix start time of llm filter (#278)
* fix start time of llm filter

* fix int tests
2024-11-17 17:01:19 -08:00
Adil Hafeez
d3c17c7abd
move custom tracer to llm filter (#267) 2024-11-15 10:44:01 -08:00
Aayush
1d229cba8f
Add in tpot (#269)
* add in tpot and tokens per second

* add in debug logs for new stats and update integration tests

* update shared dashboard to include new stats
2024-11-14 15:03:08 -08:00
Aayush
5993e36f22
Update arch stats (#250) 2024-11-12 15:03:26 -08:00
Adil Hafeez
30647fd508
Add service to stream custom otel traces to otel-collector (#262) 2024-11-12 11:09:40 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 (#255)
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13

* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
9081eb0f7f
obfuscate auth header (#254) 2024-11-08 15:17:39 -06:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing (#229) 2024-11-07 22:11:00 -06:00
Ikko Eltociear Ashimine
f48489f7c0
chore: update stream_context.rs (#248)
initalize -> initialize
2024-11-05 10:18:33 -08:00
Adil Hafeez
9a6ae2efee
retry embeddings fetch (#245) 2024-11-05 10:04:36 -08:00
Adil Hafeez
e462e393b1
Use large github action machine to run e2e tests (#230) 2024-10-30 17:54:51 -07:00
Salman Paracha
bb882fb59b
Updated hr_agent to be full stack: gradio + fastAPI (#235)
* committing to remove

* fix

* updating hr_agent

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-30 15:05:34 -07:00
Adil Hafeez
60299244b9
Improve Gradio UI and fix arch_state bug (#227) 2024-10-29 11:27:13 -07:00