plano/model_server
CTran fb67788be0
add prefill and test (#236)
* add prefill and test

* fix stream

* fix

* feedback

* address comments

* update

* add e2e test

* fix e2e test

* update fix

* fix

* address cmt

* address cmt
2024-11-07 11:59:29 -08:00
.vscode Use large github action machine to run e2e tests (#230) 2024-10-30 17:54:51 -07:00
app add prefill and test (#236) 2024-11-07 11:59:29 -08:00
__init__.py model server build (#127) 2024-10-06 18:21:43 -07:00
Dockerfile Cotran/onnx conversion (#145) 2024-10-08 14:37:48 -07:00
Dockerfile.gpu Cotran/onnx conversion (#145) 2024-10-08 14:37:48 -07:00
poetry.lock Use large github action machine to run e2e tests (#230) 2024-10-30 17:54:51 -07:00
pyproject.toml release 0.1.0 (#239) 2024-10-30 18:56:49 -07:00
README.md fixed cli to use poetry as well. this way we make it easy to have the… (#160) 2024-10-09 15:53:12 -07:00

Model Server Package

This model server package is a dependency of the Arch intelligent prompt gateway and should not be used on its own. Refer to the quickstart-guide for details on how to get started with Arch.