Adil Hafeez
fb34dbdf6c
remove unnecessary rust files
2025-06-10 12:54:31 -07:00
Adil Hafeez
6c53510f49
Introduce hermesllm library to handle llm message translation ( #501 )
2025-06-10 12:53:27 -07:00
Adil Hafeez
96b583c819
make model required in readme and rst files ( #503 )
2025-06-05 20:14:13 -07:00
Adil Hafeez
e43d41ba32
add support for brotli compression ( #502 )
2025-06-05 17:00:14 -07:00
Dougal Ballantyne
93224ed551
Update Dockerfile to fix warnings ( #500 )
2025-05-31 21:27:29 -07:00
Adil Hafeez
2e47d41a8c
Add ARCH_API_KEY in preference based routing demo ( #498 )
2025-05-31 01:52:25 -07:00
Adil Hafeez
aff389d342
don't run docker compose up for preference based router e2e demo tests ( #499 )
2025-05-31 01:16:17 -07:00
Adil Hafeez
0f139baf13
use consistent version across all arch_config files ( #497 )
2025-05-31 01:11:14 -07:00
Adil Hafeez
c7a3a668a9
update readme for preference based routing ( #496 )
2025-05-30 18:09:10 -07:00
Adil Hafeez
ed28bbaf04
release 0.3.1 ( #495 )
2025-05-30 17:47:59 -07:00
Adil Hafeez
0d190a6e5c
update code to use new json based system prompt for routing ( #493 )
2025-05-30 17:40:46 -07:00
Adil Hafeez
8d12a9a6e0
add arch provider ( #494 )
2025-05-30 17:12:52 -07:00
CTran
6a01eea813
LLM Router api doc ( #492 )
* Create router.rst
* add doc
* update api
* update api
* Update docs/source/guides/llm_router.rst
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update docs/source/guides/llm_router.rst
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix whitespace
* Update llm_router.rst
* remove feature and align examples
* remove feature and align examples
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Shuguang Chen <54548843+nehcgs@users.noreply.github.com>
2025-05-30 16:15:26 -07:00
Adil Hafeez
176f039bbc
fix model warning and use openwebui for preference based router demo
2025-05-30 12:29:56 -07:00
Adil Hafeez
fffa837a06
separate out currency exchange and preference based routing ( #491 )
2025-05-30 02:14:37 -07:00
Adil Hafeez
470cdf9843
use provider_name as model_id /v1/models api ( #490 )
2025-05-29 11:23:18 -07:00
Adil Hafeez
9c4733590f
add support for openwebui ( #487 )
2025-05-28 19:08:00 -07:00
Adil Hafeez
4899117876
add compress/decompress filter to llm listener ( #489 )
2025-05-28 15:06:52 -07:00
Adil Hafeez
d29eba4102
trim conversation if it exceeds the max limit of what the router model can handle ( #488 )
2025-05-27 20:28:22 -07:00
Adil Hafeez
79cbcb5fe1
add claude-4 in llm_routing demo ( #486 )
2025-05-23 10:21:21 -07:00
Adil Hafeez
dc271f1f76
release 0.3.0 ( #483 )
2025-05-23 09:52:23 -07:00
Adil Hafeez
99dd900a34
fix panic in brightstaff ( #485 )
make router section optional in arch_config
2025-05-23 09:37:25 -07:00
Adil Hafeez
21faae605f
correctly map envoy stats to host ( #484 )
host port 19901 -> envoy container port 9901
2025-05-23 09:37:15 -07:00
Adil Hafeez
a0d10153f9
update archgw logs file to stream access logs from container ( #482 )
2025-05-23 09:15:44 -07:00
Adil Hafeez
d050dfb85a
When router usage is defined ensure that router model is defined too ( #481 )
2025-05-23 08:46:12 -07:00
Adil Hafeez
218e9c540d
Add support for json based content types in Message ( #480 )
2025-05-23 00:51:53 -07:00
Adil Hafeez
f5e77bbe65
add support for claude and add first class support for groq and deepseek ( #479 )
2025-05-22 22:55:46 -07:00
Adil Hafeez
27c0f2fdce
Introduce brightstaff, a new terminal service for llm routing ( #477 )
2025-05-19 09:59:22 -07:00
Adil Hafeez
1f95fac4af
update arch_config sample on readme to match with new format ( #475 )
2025-04-29 12:36:46 -07:00
Salman Paracha
9659b2baf6
updating README based on reddit feedback ( #474 )
* updating README based on reddit feedback
* minor edits
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-329.local>
2025-04-27 23:09:29 -07:00
Adil Hafeez
2e346143dd
use separate host port for chat ui and for app_server ( #473 )
We were using the same host port for both chatui and app_server, which caused a conflict. This change updates the host port for app_server to 18083 and updates arch_config to match.
2025-04-23 14:05:48 -07:00
Adil Hafeez
9c803f4d69
release 0.2.8 ( #472 )
2025-04-21 17:02:36 -07:00
Adil Hafeez
5fe2444341
use archfc v1.1 on archfc.katanemo.dev ( #471 )
2025-04-21 16:27:17 -07:00
Adil Hafeez
00fb1be8a0
release 0.2.7 ( #469 )
2025-04-16 13:55:24 -07:00
Adil Hafeez
6d6c03a7e8
fix docker hub release tag source image name ( #468 )
2025-04-16 13:08:43 -07:00
Adil Hafeez
3eb438550a
fix source name for docker images ( #467 )
2025-04-16 12:24:17 -07:00
Adil Hafeez
e17d5fb2eb
test docker rel ( #466 )
2025-04-16 12:18:03 -07:00
Adil Hafeez
3cda4d6b69
fix docker hub tag ( #465 )
2025-04-16 11:46:12 -07:00
Adil Hafeez
ceca553399
fix release image ( #464 )
2025-04-16 11:34:45 -07:00
Adil Hafeez
c7c0553427
release 0.2.6 ( #463 )
2025-04-15 14:50:09 -07:00
Shuguang Chen
7d4b261a68
Integrate Arch-Function-Chat ( #449 )
2025-04-15 14:39:12 -07:00
Salman Paracha
f31aa59fac
fixed issue with groq LLMs that require the openai in the /v1/chat/co… ( #460 )
* fixed issue with groq LLMs that require the openai in the /v1/chat/completions path. My first change
* updated the GH actions with keys for Groq
* adding missing groq API keys
* add llama-3.2-3b-preview to the model based on adding groq to the demo
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-04-13 14:00:16 -07:00
Mat Sylvia
e7b0de2a72
Tweak readme docs for minor nits ( #461 )
Co-authored-by: darkdatter <msylvia@tradestax.io>
2025-04-12 23:52:20 -07:00
Adil Hafeez
4d2d8bd7a1
release 0.2.5 ( #457 )
2025-04-06 01:24:01 -07:00
Joseph D Alchemist
8ba1f71430
remove typo ( #456 )
2025-04-03 11:34:57 -07:00
Ikko Eltociear Ashimine
49e8216061
docs: update llm_provider.rst ( #448 )
minor fix
2025-03-28 14:35:55 -07:00
Adil Hafeez
de221525de
Use better logs ( #452 )
2025-03-27 10:40:20 -07:00
Adil Hafeez
76ec5cda68
fix ollama demo ( #450 )
2025-03-26 11:01:32 -07:00
Adil Hafeez
9f59943041
update code to use 0.2.4 release ( #446 )
* update code to use 0.2.4 release
* update lock file
2025-03-21 16:08:59 -07:00
Adil Hafeez
eb48f3d5bb
use passed in model name in chat completion request ( #445 )
2025-03-21 15:56:17 -07:00