plano/model_server/Dockerfile

FROM python:3.10 AS builder

COPY requirements.txt .
RUN pip install --prefix=/runtime -r requirements.txt

FROM python:3.10-slim AS output

# curl is required by the health check defined in the docker-compose file
RUN apt-get update && apt-get install -y curl && apt-get clean && rm -rf /var/lib/apt/lists/*
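
# Sketch only (an assumption, not part of the original setup): the same curl-based
# check could be declared in the image itself rather than in docker-compose.
# The /healthz path is hypothetical; substitute whatever endpoint app.main
# actually exposes. Port 80 matches the uvicorn CMD at the end of this file.
# HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
#   CMD curl -f http://localhost:80/healthz || exit 1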

COPY --from=builder /runtime /usr/local

WORKDIR /src

# MODELS is a comma-separated list of the models to include in the image.
# The following models have been tested with this image:
# "sentence-transformers/all-MiniLM-L6-v2,sentence-transformers/all-mpnet-base-v2,thenlper/gte-base,thenlper/gte-large,thenlper/gte-small"
ENV MODELS="katanemo/bge-large-en-v1.5-onnx"

COPY ./app ./app
COPY ./app/guard_model_config.yaml .
COPY ./app/openai_params.yaml .

# The model download step below is commented out for now so that the model is
# not downloaded on every image build. Instead, the host cache is mounted into
# the container; see the docker-compose file for details.

# RUN python install.py && \
#   find /root/.cache/torch/sentence_transformers/ -name onnx -exec rm -rf {} +

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]