Commit graph

109 commits

Author SHA1 Message Date
Adil Hafeez
fe328b09e1
merge main 2025-02-14 18:34:40 -08:00
Adil Hafeez
0977bf5b34
Merge branch 'main' into adil/update_arch_config_format 2025-02-14 18:18:58 -08:00
Adil Hafeez
d0a783cca8
use docker cli to communicate to docker sub system (#412) 2025-02-14 17:46:58 -08:00
Adil Hafeez
bf706fcc6f
fix names 2025-02-14 17:41:39 -08:00
Adil Hafeez
136daa2d3c
use ingress/egress 2025-02-14 16:43:51 -08:00
Adil Hafeez
4a957f2b86
remove jaeger and podman changes 2025-02-13 11:02:09 -08:00
Adil Hafeez
0dc8585a6e
fix docker 2025-02-12 17:51:53 -08:00
Adil Hafeez
b754613a49
fix rust test 2025-02-12 17:50:59 -08:00
Adil Hafeez
57ffaf7431
fix tests 2025-02-12 17:44:51 -08:00
Adil Hafeez
bc329a4421
use bash 2025-02-12 17:27:41 -08:00
Adil Hafeez
f18919ff15
run tests 2025-02-12 17:25:02 -08:00
Adil Hafeez
9cb04756c5
fix more 2025-02-12 14:48:23 -08:00
Adil Hafeez
d2ad943f63
use docker.it 2025-02-12 12:19:31 -08:00
Adil Hafeez
0ea237fbac
release 0.2.1 (#399) 2025-02-07 19:21:20 -08:00
Adil Hafeez
8de6eacfbd
spotify demo with optimized context window code change (#397) 2025-02-07 19:14:15 -08:00
Salman Paracha
b3c95a6698
refactor demos (#398) 2025-02-07 18:45:42 -08:00
Adil Hafeez
2bd61d628c
add ability to specify custom http headers in api endpoint (#386) 2025-02-06 11:48:09 -08:00
Adil Hafeez
962727f244
Infer port from protocol if port is not specified and add ability to override hostname in clusters def (#389) 2025-02-03 14:51:59 -08:00
Adil Hafeez
7830f4b431
release 0.2.0 (#384)
* release 0.2.0

* update versions
2025-01-24 17:31:48 -08:00
Adil Hafeez
38f7691163
add support for custom llm with ssl support (#380)
* add support for custom llm with ssl support

Add support for using custom llm that are served through https protocol.

* add instructions on how to add custom inference endpoint

* fix formatting

* add more details

* Apply suggestions from code review

Co-authored-by: Salman Paracha <salman.paracha@gmail.com>

* Apply suggestions from code review

* fix precommit

---------

Co-authored-by: Salman Paracha <salman.paracha@gmail.com>
2025-01-24 17:14:24 -08:00
Adil Hafeez
452084423c
add PR to release 0.1.9 (#371) 2025-01-17 18:47:26 -08:00
Adil Hafeez
07ef3149b8
add support for using custom upstream llm (#365) 2025-01-17 18:25:55 -08:00
Shuguang Chen
88a02dc478
Some fixes on model server (#362)
* Some fixes on model server

* Remove prompt_prefilling message

* Fix logging

* Fix poetry issues

* Improve logging and update the support for text truncation

* Fix tests

* Fix tests

* Fix tests

* Fix modelserver tests

* Update modelserver tests
2025-01-10 16:45:36 -08:00
Salman Paracha
ebda682b30
updated docs for 0.1.8 support (#366)
* updated docs for 0.1.8 support

* updated README on root

* updated version reference to 0.1.8 in other parts of the repo

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-01-10 16:38:48 -08:00
Adil Hafeez
dae6239b81
use per user docker socket if system docker socket doesn't exist (#361)
* use per user docker socket if system docker socket doesn't exist

* add retry
2025-01-08 14:55:42 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway (#328) 2024-12-20 13:25:01 -08:00
Aayush
67b8fd635e
add more granular bucket sizes for ttft (#343)
* add more granular bucket sizes for ttft
2024-12-12 14:38:36 -08:00
Adil Hafeez
af0e7d178b
update cli to 0.1.6 (#338) 2024-12-06 15:48:07 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples (#330) 2024-12-06 14:37:33 -08:00
Adil Hafeez
ec5326250e
correctly map stats port to host (#311) 2024-11-27 11:28:41 -08:00
Adil Hafeez
704b928d61
release 0.1.5 (#307) 2024-11-26 13:28:52 -08:00
Adil Hafeez
0ff3d43008
remove dependency on docker-compose when starting up archgw (#305) 2024-11-26 13:13:02 -08:00
Adil Hafeez
726f1a3185
add schema change to use enum in arch_config (#304) 2024-11-25 17:51:25 -08:00
Adil Hafeez
9c6fcdb771
use fix prompt guards (#303) 2024-11-25 17:16:35 -08:00
Adil Hafeez
36489b4adc
use envoy to publish traces (#270) 2024-11-18 17:55:39 -08:00
Adil Hafeez
9cee04ed31
release 0.1.3 (#280)
* release 0.1.3

* update ver
2024-11-17 17:12:01 -08:00
Adil Hafeez
d3c17c7abd
move custom tracer to llm filter (#267) 2024-11-15 10:44:01 -08:00
Adil Hafeez
d1dd8710a4
release 0.1.2 (#266) 2024-11-12 23:56:33 -08:00
Adil Hafeez
30647fd508
Add service to stream custom otel traces to otel-collector (#262) 2024-11-12 11:09:40 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 (#255)
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13

* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
6b62662e01
update docs with weather_forecast path (#253) 2024-11-08 10:00:15 -08:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing (#229) 2024-11-07 22:11:00 -06:00
Adil Hafeez
8c6ad87c1c
release 0.1.0 (#239)
* set version to 0.1.0

* update readme
2024-10-30 18:56:49 -07:00
Adil Hafeez
e462e393b1
Use large github action machine to run e2e tests (#230) 2024-10-30 17:54:51 -07:00
José Ulises Niño Rivera
662a840ac5
Add support for streaming and fixes few issues (see description) (#202) 2024-10-28 17:05:06 -07:00
Salman Paracha
7a5852b401
fixing discord link and moving contributing guide to root (#215)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-23 15:45:49 -07:00
Salman Paracha
708fa15a9b
HR agent demo (#206)
* committing my hr_agent branch

* updating the HR agent config

* pushing to remote

* fix hr agent

* committing to merge with main

* updating to merge from main

* updating the demo and model-server-tests to pull from poetry

* updating the poetry.lock files

* updating based on feedback

* updated system prompt for hr_agent

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-23 14:32:40 -07:00
Adil Hafeez
faf64960df
update observability and dashboards (#198) 2024-10-18 15:07:49 -07:00
José Ulises Niño Rivera
62a000036e
Update arch Dockerfile (#200) 2024-10-18 16:15:19 -04:00
Adil Hafeez
28421353fd
Update vscode workspace (#199)
- add recommended extensions
- set python interpreter path for all python projects to be venv/bin/python
- update project structure in workspace
- rename project file from gatewa -> archgw
2024-10-18 12:57:58 -07:00