Adil Hafeez
4d2d8bd7a1
release 0.2.5 ( #457 )
2025-04-06 01:24:01 -07:00
Adil Hafeez
de221525de
Use better logs ( #452 )
2025-03-27 10:40:20 -07:00
Adil Hafeez
9f59943041
update code to use 0.2.4 release ( #446 )
...
* update code to use 0.2.4 release
* update lock file
2025-03-21 16:08:59 -07:00
Adil Hafeez
eb48f3d5bb
use passed in model name in chat completion request ( #445 )
2025-03-21 15:56:17 -07:00
Adil Hafeez
84cd1df7bf
add preliminary support for llm agents ( #432 )
2025-03-19 15:21:34 -07:00
Adil Hafeez
e8dc7f18d3
start using base_url in place of endpoint ( #430 )
2025-03-05 17:20:04 -08:00
Adil Hafeez
d8b833fe69
release 0.2.3 ( #423 )
2025-03-04 14:30:44 -08:00
Adil Hafeez
10cad4d0b7
add health check endpoint for llm gateway ( #420 )
...
* add health check endpoint for llm gateway
* fix rust tests
2025-03-03 13:11:57 -08:00
Sid Golestane
a402fee13b
fix: add --type=container to docker inspect to prevent Podman conflicts ( #418 )
...
* fix: add --type=container to docker inspect to prevent Podman conflicts
Adding `--type=container` ensures `docker inspect` targets containers
specifically, preventing conflicts with images in Podman.
* Format Python code using pre-commit hook
---------
Co-authored-by: Sid Golestaneh <sid@golestaneh.com>
2025-02-28 17:03:21 -08:00
Adil Hafeez
ae6b2bef59
Fix compatibility issues with podman system ( #415 )
...
- "dokcer inspect" doesn't return State/Status if container is not running
- "docker remove" is not a command supported by podman
- "docker logs" expect -f to be passed before container name
2025-02-20 16:19:48 -08:00
Adil Hafeez
1bbc5d2233
release 0.2.2 ( #413 )
2025-02-14 20:02:59 -08:00
Adil Hafeez
e40b13be05
Update arch_config and add tests for arch config file ( #407 )
2025-02-14 19:28:10 -08:00
Adil Hafeez
d0a783cca8
use docker cli to communicate to docker sub system ( #412 )
2025-02-14 17:46:58 -08:00
Adil Hafeez
0ea237fbac
release 0.2.1 ( #399 )
2025-02-07 19:21:20 -08:00
Adil Hafeez
8de6eacfbd
spotify demo with optimized context window code change ( #397 )
2025-02-07 19:14:15 -08:00
Salman Paracha
b3c95a6698
refactor demos ( #398 )
2025-02-07 18:45:42 -08:00
Adil Hafeez
2bd61d628c
add ability to specify custom http headers in api endpoint ( #386 )
2025-02-06 11:48:09 -08:00
Adil Hafeez
962727f244
Infer port from protocol if port is not specified and add ability to override hostname in clusters def ( #389 )
2025-02-03 14:51:59 -08:00
Adil Hafeez
7830f4b431
release 0.2.0 ( #384 )
...
* release 0.2.0
* update versions
2025-01-24 17:31:48 -08:00
Adil Hafeez
38f7691163
add support for custom llm with ssl support ( #380 )
...
* add support for custom llm with ssl support
Add support for using custom llm that are served through https protocol.
* add instructions on how to add custom inference endpoint
* fix formatting
* add more details
* Apply suggestions from code review
Co-authored-by: Salman Paracha <salman.paracha@gmail.com>
* Apply suggestions from code review
* fix precommit
---------
Co-authored-by: Salman Paracha <salman.paracha@gmail.com>
2025-01-24 17:14:24 -08:00
Adil Hafeez
452084423c
add PR to release 0.1.9 ( #371 )
2025-01-17 18:47:26 -08:00
Adil Hafeez
07ef3149b8
add support for using custom upstream llm ( #365 )
2025-01-17 18:25:55 -08:00
Shuguang Chen
88a02dc478
Some fixes on model server ( #362 )
...
* Some fixes on model server
* Remove prompt_prefilling message
* Fix logging
* Fix poetry issues
* Improve logging and update the support for text truncation
* Fix tests
* Fix tests
* Fix tests
* Fix modelserver tests
* Update modelserver tests
2025-01-10 16:45:36 -08:00
Salman Paracha
ebda682b30
updated docs for 0.1.8 support ( #366 )
...
* updated docs for 0.1.8 support
* updated REAMDE on root
* updated version reference to 0.1.8 in other parts of the repo
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-01-10 16:38:48 -08:00
Adil Hafeez
dae6239b81
use per user docker socket if system docker socket doesn't exist ( #361 )
...
* use per user docker socket if system docker socket doesn't exist
* add retry
2025-01-08 14:55:42 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway ( #328 )
2024-12-20 13:25:01 -08:00
Aayush
67b8fd635e
add more granular bucket sizes for ttft ( #343 )
...
* add more granular bucket sizes for ttft
2024-12-12 14:38:36 -08:00
Adil Hafeez
af0e7d178b
update cli to 0.1.6 ( #338 )
2024-12-06 15:48:07 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples ( #330 )
2024-12-06 14:37:33 -08:00
Adil Hafeez
ec5326250e
correctly map stats port to host ( #311 )
2024-11-27 11:28:41 -08:00
Adil Hafeez
704b928d61
release 0.1.5 ( #307 )
2024-11-26 13:28:52 -08:00
Adil Hafeez
0ff3d43008
remove dependency on docker-compose when starting up archgw ( #305 )
2024-11-26 13:13:02 -08:00
Adil Hafeez
726f1a3185
add schema change to use enum in arch_config ( #304 )
2024-11-25 17:51:25 -08:00
Adil Hafeez
9c6fcdb771
use fix prompt guards ( #303 )
2024-11-25 17:16:35 -08:00
Adil Hafeez
36489b4adc
use envoy to publish traces ( #270 )
2024-11-18 17:55:39 -08:00
Adil Hafeez
9cee04ed31
release 0.1.3 ( #280 )
...
* release 0.1.3
* udpate ver
2024-11-17 17:12:01 -08:00
Adil Hafeez
d3c17c7abd
move custom tracer to llm filter ( #267 )
2024-11-15 10:44:01 -08:00
Adil Hafeez
d1dd8710a4
release 0.1.2 ( #266 )
2024-11-12 23:56:33 -08:00
Adil Hafeez
30647fd508
Add service to stream custom otel traces to otel-collector ( #262 )
2024-11-12 11:09:40 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 ( #255 )
...
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13
* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
6b62662e01
update docs with weather_forecast path ( #253 )
2024-11-08 10:00:15 -08:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing ( #229 )
2024-11-07 22:11:00 -06:00
Adil Hafeez
8c6ad87c1c
release 0.1.0 ( #239 )
...
* set version to 0.1.0
* update readme
2024-10-30 18:56:49 -07:00
Adil Hafeez
e462e393b1
Use large github action machine to run e2e tests ( #230 )
2024-10-30 17:54:51 -07:00
José Ulises Niño Rivera
662a840ac5
Add support for streaming and fixes few issues (see description) ( #202 )
2024-10-28 17:05:06 -07:00
Salman Paracha
7a5852b401
fixing discord link and moving contributing guide to root ( #215 )
...
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-23 15:45:49 -07:00
Salman Paracha
708fa15a9b
HR agent demo ( #206 )
...
* commiting my hr_agent branch
* updating the HR agent config
* pushing to remote
* fix hr agent
* committing to merge with main
* updating to merge from main
* updating the demo and model-server-tests to pull from poetry
* updating the poetry.lock files
* updating based on feedback
* updated sysmte prompt for hr_agent
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-23 14:32:40 -07:00
Adil Hafeez
faf64960df
update observability and dashboards ( #198 )
2024-10-18 15:07:49 -07:00
José Ulises Niño Rivera
62a000036e
Update arch Dockerfile ( #200 )
2024-10-18 16:15:19 -04:00
Adil Hafeez
28421353fd
Update vscode workspce ( #199 )
...
- add recommended extensions
- set python interpreter path for all python projects to be venv/bin/python
- update project structure in workspace
- rename project file from gatewa -> archgw
2024-10-18 12:57:58 -07:00