Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway ( #328 )
2024-12-20 13:25:01 -08:00
Aayush
67b8fd635e
add more granular bucket sizes for ttft ( #343 )
...
* add more granular bucket sizes for ttft
2024-12-12 14:38:36 -08:00
Adil Hafeez
af0e7d178b
update cli to 0.1.6 ( #338 )
2024-12-06 15:48:07 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples ( #330 )
2024-12-06 14:37:33 -08:00
Adil Hafeez
ec5326250e
correctly map stats port to host ( #311 )
2024-11-27 11:28:41 -08:00
Adil Hafeez
704b928d61
release 0.1.5 ( #307 )
2024-11-26 13:28:52 -08:00
Adil Hafeez
0ff3d43008
remove dependency on docker-compose when starting up archgw ( #305 )
2024-11-26 13:13:02 -08:00
Adil Hafeez
726f1a3185
add schema change to use enum in arch_config ( #304 )
2024-11-25 17:51:25 -08:00
Adil Hafeez
9c6fcdb771
use fix prompt guards ( #303 )
2024-11-25 17:16:35 -08:00
Adil Hafeez
36489b4adc
use envoy to publish traces ( #270 )
2024-11-18 17:55:39 -08:00
Adil Hafeez
9cee04ed31
release 0.1.3 ( #280 )
...
* release 0.1.3
* update ver
2024-11-17 17:12:01 -08:00
Adil Hafeez
d3c17c7abd
move custom tracer to llm filter ( #267 )
2024-11-15 10:44:01 -08:00
Adil Hafeez
d1dd8710a4
release 0.1.2 ( #266 )
2024-11-12 23:56:33 -08:00
Adil Hafeez
30647fd508
Add service to stream custom otel traces to otel-collector ( #262 )
2024-11-12 11:09:40 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 ( #255 )
...
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13
* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
6b62662e01
update docs with weather_forecast path ( #253 )
2024-11-08 10:00:15 -08:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing ( #229 )
2024-11-07 22:11:00 -06:00
Adil Hafeez
8c6ad87c1c
release 0.1.0 ( #239 )
...
* set version to 0.1.0
* update readme
2024-10-30 18:56:49 -07:00
Adil Hafeez
e462e393b1
Use large github action machine to run e2e tests ( #230 )
2024-10-30 17:54:51 -07:00
José Ulises Niño Rivera
662a840ac5
Add support for streaming and fixes few issues (see description) ( #202 )
2024-10-28 17:05:06 -07:00
Salman Paracha
7a5852b401
fixing discord link and moving contributing guide to root ( #215 )
...
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-23 15:45:49 -07:00
Salman Paracha
708fa15a9b
HR agent demo ( #206 )
...
* committing my hr_agent branch
* updating the HR agent config
* pushing to remote
* fix hr agent
* committing to merge with main
* updating to merge from main
* updating the demo and model-server-tests to pull from poetry
* updating the poetry.lock files
* updating based on feedback
* updated system prompt for hr_agent
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-23 14:32:40 -07:00
Adil Hafeez
faf64960df
update observability and dashboards ( #198 )
2024-10-18 15:07:49 -07:00
José Ulises Niño Rivera
62a000036e
Update arch Dockerfile ( #200 )
2024-10-18 16:15:19 -04:00
Adil Hafeez
28421353fd
Update vscode workspace ( #199 )
...
- add recommended extensions
- set python interpreter path for all python projects to be venv/bin/python
- update project structure in workspace
- rename project file from gatewa -> archgw
2024-10-18 12:57:58 -07:00
Salman Paracha
6fb63510b3
fix cli models and logs ( #196 )
...
* removing unnecessary setup.py files
* updated the cli for debug and access logs
* ran the pre-commit locally to fix pull request
* fixed bug where if archgw_process is None we didn't handle it gracefully
* Apply suggestions from code review
Co-authored-by: Adil Hafeez <adil@katanemo.com>
* fixed changes based on PR
* fixed version not found message
* fixed message based on PR feedback
* adding poetry lock
* fixed pre-commit
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-18 12:09:45 -07:00
Adil Hafeez
6cd05572c4
update lock file ( #192 )
...
```
Installing dependencies from lock file
pyproject.toml changed significantly since poetry.lock was last generated. Run `poetry lock [--no-update]` to fix the lock file.
Error installing model server dependencies: Command '['poetry', 'install', '--no-cache']' returned non-zero exit status 1.
```
2024-10-17 10:42:15 -07:00
Adil Hafeez
21e7fe2cef
Split arch wasm filter code into prompt and llm gateway filters ( #190 )
2024-10-17 10:16:40 -07:00
Adil Hafeez
3bd2ffe9fb
split wasm filter ( #186 )
...
* split wasm filter
* fix int and unit tests
* rename public_types => common and move common code there
* rename
* fix int test
2024-10-16 14:20:26 -07:00
Co Tran
b1746b38b4
concatenate history of user messages for hallucination ( #177 )
...
* concatenate history of user messages for hallucination
* add history of messages
* fix gpt to not arch
* add model prefix
* fix
* correct init of user_messages
* fmt
* fix test
2024-10-15 11:43:05 -07:00
Adil Hafeez
7d5f760884
Improve cli ( #179 )
2024-10-10 17:44:41 -07:00
Co Tran
2c45de26e6
fix for linux ( #175 )
...
* fix for linux
* fix pre commit
* fix
* fix extra white space
* fix commit
2024-10-10 14:56:23 -07:00
Adil Hafeez
2b501d10bd
update lock file
2024-10-09 21:01:12 -07:00
Salman Paracha
95a0f1be5b
updated archgw cli to pull from archgw_modelserver from pypi ( #169 )
...
* updated archgw cli to pull from archgw_modelserver from pypi
* fix image name
* update rev
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-09 21:00:26 -07:00
Adil Hafeez
6b70768170
make ratelimit section optional ( #168 )
2024-10-09 19:53:00 -07:00
Co Tran
f9e3a052fc
change nli model ( #167 )
...
* change nli model
* Fix bug in hallucination
---------
Co-authored-by: Shuguang Chen <54548843+nehcgs@users.noreply.github.com>
2024-10-09 19:10:08 -07:00
Adil Hafeez
71cdf69f77
dont send default target to archfc ( #166 )
2024-10-09 17:43:02 -07:00
Adil Hafeez
3e9327cf36
fix bug in jinja template for tracing
2024-10-09 16:44:50 -07:00
Salman Paracha
0ed88def8f
updated setuptools packages
2024-10-09 16:31:35 -07:00
Adil Hafeez
6991fbb7a7
rename
2024-10-09 16:24:40 -07:00
Adil Hafeez
c254dfb16a
update cli and update docs ( #161 )
...
* add services to cli
* more changes
2024-10-09 16:22:27 -07:00
Salman Paracha
1acf43ff7a
fixed cli to use poetry as well. this way we make it easy to have the… ( #160 )
2024-10-09 15:53:12 -07:00
Adil Hafeez
e81ca8d5cf
llm listener split ( #155 )
2024-10-09 15:47:32 -07:00
Co Tran
8b5db45507
Fix gpu dependency and only leverage onnx when GPU is available ( #157 )
...
* replacing appending instead of write
* fix eetq dependency
* gpu guard required eetq
* fix bug when gpu is available
* fix for gpu device
* reverse
* fix
* replace gpu -> cuda
2024-10-09 11:42:05 -07:00
Co Tran
5c4a6bc8ff
lint + formatting with black ( #158 )
...
* lint + formatting with black
* add black as pre commit
2024-10-09 11:25:07 -07:00
Salman Paracha
498e7f9724
minor fixes to README ( #156 )
...
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-09 10:12:36 -07:00
Salman Paracha
b63a01fe82
Salmanap/fix network agent demo ( #153 )
...
* staging my changes to re-based from main
* adding debug statements to rust
* merged with main
* ready to push network agent
* removed the incomplete sql example
---------
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-08 22:19:20 -07:00
Adil Hafeez
6acfea7787
bug fix - send all parameters irrespective of type
...
earlier we were only sending a parameter if its type was string
2024-10-08 20:28:32 -07:00
Adil Hafeez
e08d406be5
Update README.md
2024-10-08 17:20:51 -07:00
Adil Hafeez
ede125a4f3
ensure that tracing is optional in arch_config ( #149 )
2024-10-08 17:15:40 -07:00