add Dockerfile #70

mpadge · 2023-05-12T10:39:43Z

Thanks for the invitation, @imartinez. This is only a draft for now, because we need to decide whether it should have endpoints exposed.

thebigbone · 2023-05-12T11:51:32Z

I would suggest using an Alpine image instead of Ubuntu. Alpine is lightweight, has no bloatware, and is a lot faster than Ubuntu.

Polpetta · 2023-05-12T12:31:46Z

I'd also suggest using the COPY directive instead of cloning this repository with git. It makes the build faster and doesn't need to connect to GitHub's servers; instead, the build sources directly from the local copy.

mpadge · 2023-05-12T12:50:31Z

@imartinez The current form works and provides at least a simple start. Would you like me to update the docs as well before merging, or after?

@Polpetta Great idea, feel free to add it to the PR.

@thebigbone Ubuntu is a safe fallback because it's an image that people are far more likely to already have than any other. Size is not an issue because the image ends up > 15GB anyway, so the starting size is irrelevant. But it's @imartinez's repo anyway, and they ultimately get to decide which base image to use.

vilaca · 2023-05-12T12:58:35Z

The problem with using Ubuntu images is that, being bigger, they also introduce more potential vulnerabilities.

Polpetta

These are my 2 cents. I'd also suggest creating a .dockerignore file to exclude files and folders that could bloat the build context and slow the build (like models, db).
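
A minimal sketch of such a file, written as a shell heredoc so it can be pasted into a terminal at the repository root. The models and db entries come from the comment above; the .git entry is an added assumption, not something proposed in this thread.

```shell
# Create a .dockerignore in the current directory (assumed to be the repo root).
cat > .dockerignore <<'EOF'
# Large local artifacts named in the review comment
models/
db/
# Assumption: the git history is not needed inside the image
.git/
EOF
```

With a file like this in place, the listed paths are left out of the build context, so locally downloaded model files are not sent to the Docker daemon on every build.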

Polpetta · 2023-05-12T14:03:43Z

Dockerfile

+RUN cd home \
+    && git clone https://github.com/imartinez/privateGPT.git \
+    && cd privateGPT \
+    && pip install -r requirements.txt
+
+RUN echo "PERSIST_DIRECTORY=db\nLLAMA_EMBEDDINGS_MODEL=models/ggml-model-q4_0.bin\nMODEL_TYPE=GPT4All\nMODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin\nMODEL_N_CTX=1000" > home/privateGPT/.env \
+    && chmod a+x home/privateGPT/.env
+
+RUN mkdir home/privateGPT/models \
+    && cd home/privateGPT/models \
+    && wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin \
+    && wget https://huggingface.co/Pi3141/alpaca-native-7B-ggml/resolve/397e872bf4c83f4c642317a5bf65ce84a105786e/ggml-model-q4_0.bin


I'd also check the order of the statements, so that subsequent builds are quicker thanks to Docker's build cache.

Suggested change

-RUN cd home \
-    && git clone https://github.com/imartinez/privateGPT.git \
-    && cd privateGPT \
-    && pip install -r requirements.txt
-
-RUN echo "PERSIST_DIRECTORY=db\nLLAMA_EMBEDDINGS_MODEL=models/ggml-model-q4_0.bin\nMODEL_TYPE=GPT4All\nMODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin\nMODEL_N_CTX=1000" > home/privateGPT/.env \
-    && chmod a+x home/privateGPT/.env
-
-RUN mkdir home/privateGPT/models \
-    && cd home/privateGPT/models \
-    && wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin \
-    && wget https://huggingface.co/Pi3141/alpaca-native-7B-ggml/resolve/397e872bf4c83f4c642317a5bf65ce84a105786e/ggml-model-q4_0.bin
+WORKDIR /privateGPT/
+
+RUN mkdir models \
+    && cd models \
+    && wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin \
+    && wget https://huggingface.co/Pi3141/alpaca-native-7B-ggml/resolve/397e872bf4c83f4c642317a5bf65ce84a105786e/ggml-model-q4_0.bin
+
+RUN echo "PERSIST_DIRECTORY=db\nLLAMA_EMBEDDINGS_MODEL=models/ggml-model-q4_0.bin\nMODEL_TYPE=GPT4All\nMODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin\nMODEL_N_CTX=1000" > .env \
+    && chmod a+x .env
+
+COPY . .
+
+RUN pip install -r requirements.txt
+
+ENTRYPOINT ["/usr/bin/python", "/privateGPT/privateGPT.py"]

I went by heart so please check it locally! 👍

@Polpetta I didn't add an entrypoint yet, because that would require exposing the source_directory in this repo as a .env var, so the whole thing could be run with a local volume mounted to source_directory, or elsewhere. Any clever ideas for how to expose that while allowing a local mount to fill or replace source_directory?

You can always bind a local path or a volume of your choice when calling docker run, like docker run -v /your/local/sources:/privateGPT/source_directory imartinez/privateGPT; no need to set a .env var imo.

Dockerfile

Co-authored-by: Davide Polonio <poloniodavide@gmail.com>

tgh19 · 2023-05-13T20:48:31Z

Doing the Lord's work here, @mpadge

imartinez · 2023-05-14T07:51:41Z

@mpadge thanks for the work! Please update the readme, presenting this as an alternative way of getting the project running. Try to make it super clear, taking into account that there are a lot of people checking out this repo, and not everyone is an experienced SW dev. We can merge it once you've got it. Thanks!!

hanwsf · 2023-05-14T13:42:08Z

#90 (comment)
ubuntu:latest doesn't work, tested.

JulienA · 2023-05-14T13:46:57Z

Hello, I made another one using a Dockerfile and Compose: #120

mpadge · 2023-05-15T07:47:53Z

@imartinez I'm going to close this in favour of #120 from @JulienA. docker-compose is definitely the way to go, to separate the install and ingest steps.

mpadge added 2 commits May 12, 2023 12:37

add Dockerfile

37d70e1

clone repo in docker home dir

36e1d57

mpadge added 2 commits May 12, 2023 13:57

minor docker tweaks

ca6a566

typo

e48efd8

mpadge marked this pull request as ready for review May 12, 2023 12:45

Polpetta reviewed May 12, 2023


Update Dockerfile

72c3069

Co-authored-by: Davide Polonio <poloniodavide@gmail.com>

mpadge closed this May 15, 2023
