Adil Hafeez
8d12a9a6e0
add arch provider ( #494 )
2025-05-30 17:12:52 -07:00
Adil Hafeez
9c4733590f
add support for openwebui ( #487 )
2025-05-28 19:08:00 -07:00
Adil Hafeez
4899117876
add compress/decompress filter to llm listener ( #489 )
2025-05-28 15:06:52 -07:00
Adil Hafeez
dc271f1f76
release 0.3.0 ( #483 )
2025-05-23 09:52:23 -07:00
Adil Hafeez
21faae605f
correctly map envoy stats to host ( #484 )
...
host port 19901 -> envoy container port 9901
2025-05-23 09:37:15 -07:00
Adil Hafeez
a0d10153f9
update archgw logs file to stream access logs from container ( #482 )
2025-05-23 09:15:44 -07:00
Adil Hafeez
d050dfb85a
When router usage is defined ensure that router model is defined too ( #481 )
2025-05-23 08:46:12 -07:00
Adil Hafeez
218e9c540d
Add support for json based content types in Message ( #480 )
2025-05-23 00:51:53 -07:00
Adil Hafeez
f5e77bbe65
add support for claude and add first class support for groq and deepseek ( #479 )
2025-05-22 22:55:46 -07:00
Adil Hafeez
27c0f2fdce
Introduce brightstaff a new terminal service for llm routing ( #477 )
2025-05-19 09:59:22 -07:00
Adil Hafeez
9c803f4d69
release 0.2.8 ( #472 )
2025-04-21 17:02:36 -07:00
Adil Hafeez
00fb1be8a0
release 0.2.7 ( #469 )
2025-04-16 13:55:24 -07:00
Adil Hafeez
c7c0553427
release 0.2.6 ( #463 )
2025-04-15 14:50:09 -07:00
Adil Hafeez
4d2d8bd7a1
release 0.2.5 ( #457 )
2025-04-06 01:24:01 -07:00
Adil Hafeez
de221525de
Use better logs ( #452 )
2025-03-27 10:40:20 -07:00
Adil Hafeez
9f59943041
update code to use 0.2.4 release ( #446 )
...
* update code to use 0.2.4 release
* update lock file
2025-03-21 16:08:59 -07:00
Adil Hafeez
eb48f3d5bb
use passed in model name in chat completion request ( #445 )
2025-03-21 15:56:17 -07:00
Adil Hafeez
84cd1df7bf
add preliminary support for llm agents ( #432 )
2025-03-19 15:21:34 -07:00
Adil Hafeez
e8dc7f18d3
start using base_url in place of endpoint ( #430 )
2025-03-05 17:20:04 -08:00
Adil Hafeez
d8b833fe69
release 0.2.3 ( #423 )
2025-03-04 14:30:44 -08:00
Adil Hafeez
10cad4d0b7
add health check endpoint for llm gateway ( #420 )
...
* add health check endpoint for llm gateway
* fix rust tests
2025-03-03 13:11:57 -08:00
Sid Golestane
a402fee13b
fix: add --type=container to docker inspect to prevent Podman conflicts ( #418 )
...
* fix: add --type=container to docker inspect to prevent Podman conflicts
Adding `--type=container` ensures `docker inspect` targets containers
specifically, preventing conflicts with images in Podman.
* Format Python code using pre-commit hook
---------
Co-authored-by: Sid Golestaneh <sid@golestaneh.com>
2025-02-28 17:03:21 -08:00
Adil Hafeez
ae6b2bef59
Fix compatibility issues with podman system ( #415 )
...
- "dokcer inspect" doesn't return State/Status if container is not running
- "docker remove" is not a command supported by podman
- "docker logs" expect -f to be passed before container name
2025-02-20 16:19:48 -08:00
Adil Hafeez
1bbc5d2233
release 0.2.2 ( #413 )
2025-02-14 20:02:59 -08:00
Adil Hafeez
e40b13be05
Update arch_config and add tests for arch config file ( #407 )
2025-02-14 19:28:10 -08:00
Adil Hafeez
d0a783cca8
use docker cli to communicate to docker sub system ( #412 )
2025-02-14 17:46:58 -08:00
Adil Hafeez
0ea237fbac
release 0.2.1 ( #399 )
2025-02-07 19:21:20 -08:00
Adil Hafeez
8de6eacfbd
spotify demo with optimized context window code change ( #397 )
2025-02-07 19:14:15 -08:00
Salman Paracha
b3c95a6698
refactor demos ( #398 )
2025-02-07 18:45:42 -08:00
Adil Hafeez
2bd61d628c
add ability to specify custom http headers in api endpoint ( #386 )
2025-02-06 11:48:09 -08:00
Adil Hafeez
962727f244
Infer port from protocol if port is not specified and add ability to override hostname in clusters def ( #389 )
2025-02-03 14:51:59 -08:00
Adil Hafeez
7830f4b431
release 0.2.0 ( #384 )
...
* release 0.2.0
* update versions
2025-01-24 17:31:48 -08:00
Adil Hafeez
38f7691163
add support for custom llm with ssl support ( #380 )
...
* add support for custom llm with ssl support
Add support for using custom llm that are served through https protocol.
* add instructions on how to add custom inference endpoint
* fix formatting
* add more details
* Apply suggestions from code review
Co-authored-by: Salman Paracha <salman.paracha@gmail.com>
* Apply suggestions from code review
* fix precommit
---------
Co-authored-by: Salman Paracha <salman.paracha@gmail.com>
2025-01-24 17:14:24 -08:00
Adil Hafeez
452084423c
add PR to release 0.1.9 ( #371 )
2025-01-17 18:47:26 -08:00
Adil Hafeez
07ef3149b8
add support for using custom upstream llm ( #365 )
2025-01-17 18:25:55 -08:00
Shuguang Chen
88a02dc478
Some fixes on model server ( #362 )
...
* Some fixes on model server
* Remove prompt_prefilling message
* Fix logging
* Fix poetry issues
* Improve logging and update the support for text truncation
* Fix tests
* Fix tests
* Fix tests
* Fix modelserver tests
* Update modelserver tests
2025-01-10 16:45:36 -08:00
Salman Paracha
ebda682b30
updated docs for 0.1.8 support ( #366 )
...
* updated docs for 0.1.8 support
* updated REAMDE on root
* updated version reference to 0.1.8 in other parts of the repo
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-01-10 16:38:48 -08:00
Adil Hafeez
dae6239b81
use per user docker socket if system docker socket doesn't exist ( #361 )
...
* use per user docker socket if system docker socket doesn't exist
* add retry
2025-01-08 14:55:42 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway ( #328 )
2024-12-20 13:25:01 -08:00
Aayush
67b8fd635e
add more granular bucket sizes for ttft ( #343 )
...
* add more granular bucket sizes for ttft
2024-12-12 14:38:36 -08:00
Adil Hafeez
af0e7d178b
update cli to 0.1.6 ( #338 )
2024-12-06 15:48:07 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples ( #330 )
2024-12-06 14:37:33 -08:00
Adil Hafeez
ec5326250e
correctly map stats port to host ( #311 )
2024-11-27 11:28:41 -08:00
Adil Hafeez
704b928d61
release 0.1.5 ( #307 )
2024-11-26 13:28:52 -08:00
Adil Hafeez
0ff3d43008
remove dependency on docker-compose when starting up archgw ( #305 )
2024-11-26 13:13:02 -08:00
Adil Hafeez
726f1a3185
add schema change to use enum in arch_config ( #304 )
2024-11-25 17:51:25 -08:00
Adil Hafeez
9c6fcdb771
use fix prompt guards ( #303 )
2024-11-25 17:16:35 -08:00
Adil Hafeez
36489b4adc
use envoy to publish traces ( #270 )
2024-11-18 17:55:39 -08:00
Adil Hafeez
9cee04ed31
release 0.1.3 ( #280 )
...
* release 0.1.3
* udpate ver
2024-11-17 17:12:01 -08:00
Adil Hafeez
d3c17c7abd
move custom tracer to llm filter ( #267 )
2024-11-15 10:44:01 -08:00