
docker file and compose #120

Closed
wants to merge 20 commits

Conversation

JulienA

@JulienA JulienA commented May 14, 2023

No description provided.

@thebigbone

Why not build the image in docker-compose directly?

Dockerfile Outdated
@JulienA
Author

JulienA commented May 15, 2023

Pushed both changes.

@mdeweerd
Contributor

Thanks for sharing.
Could you allow the use of a .env file to avoid modifying the repo's yaml file?

For instance, use:

 ${MODELS:-./models}

to set the models directory so that it can be set in the .env file.

@JulienA
Author

JulienA commented May 15, 2023

Thanks for sharing. Could you allow the use of a .env file to avoid modifying the repo's yaml file?

For instance, use:

 ${MODELS:-./models}

to set the models directory so that it can be set in the .env file.

I'm not sure I understand correctly, but setting load_dotenv(override=True) would override the docker-compose environment variables with the .env file, and there is no .env file at the moment.

@mdeweerd
Contributor

There is no .env file in the repo, but we can set one locally.

By setting .env as follows, I successfully used my E: drive for the models. A user that does not have a local .env should be using ./models instead.

MODELS=E:/

This avoids changing any git-controlled file to adapt to the local setup. I already had some models on my E: drive.
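As a minimal sketch (service and container paths here are placeholders, not necessarily the ones used in this PR), the compose file would declare the volume with a default that a local .env can override:

```yaml
# Illustrative docker-compose.yaml fragment; the default ./models is used when no .env sets MODELS.
services:
  privategpt:
    volumes:
      - ${MODELS:-./models}:/home/privategpt/models
```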

@JulienA
Author

JulienA commented May 15, 2023

@mdeweerd reviewed in b4aad15

@JulienA JulienA requested a review from Rots May 15, 2023 17:45
@mdeweerd
Contributor

I was able to use "MODEL_MOUNT".

I suggest converting the line endings of these files to LF.

When I applied a local pre-commit configuration, it detected that the line endings of the yaml files (and the Dockerfile) are CRLF. yamllint suggests LF line endings, and yamlfix can format the files automatically.
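For reference, a minimal .pre-commit-config.yaml along those lines could look as follows (the hook revisions are placeholders and should be pinned to current tags):

```yaml
# Sketch of a local pre-commit configuration enforcing LF endings and yaml formatting.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0           # placeholder revision
    hooks:
      - id: mixed-line-ending
        args: [--fix=lf]  # convert CRLF to LF
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.32.0          # placeholder revision
    hooks:
      - id: yamllint
  - repo: https://github.com/lyz-code/yamlfix
    rev: 1.13.0           # placeholder revision
    hooks:
      - id: yamlfix
```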

I am still struggling to get an answer to my question - the container stops at some point. Maybe this has to do with memory - the container limit is 7.448 GiB.

@mdeweerd
Contributor

FYI, I've set the memory for WSL2 to 12GB, which allowed me to get an answer to a question.

My .wslconfig now looks like:

[wsl2]
memory=12GB

During compilation I noticed some references to nvidia, so I wondered if the image should be based on some cuda image.

I tried FROM wallies/python-cuda:3.10-cuda11.6-runtime but did not see an impact on performance - it may be helpful in the future.

@mdeweerd
Contributor

The two docker-compose*.yaml files share elements; the duplication could be avoided by merging both into a single docker-compose.yaml file and using 'extends:'.

That also avoids having to specify which docker-compose*.yaml file to use.

You can have a look at https://github.com/mdeweerd/MetersToHA/blob/meters-to-ha/docker-compose.yml for some hints.
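A rough sketch of the idea, with illustrative service names (not necessarily the ones from this PR):

```yaml
# Single docker-compose.yaml; the ingest service reuses the base service via 'extends'.
services:
  privategpt:
    build: .
    environment:
      - MODEL_PATH=${MODEL_PATH:-/home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin}
    volumes:
      - ${MODEL_MOUNT:-./models}:/home/privategpt/models

  privategpt-ingest:
    extends:
      service: privategpt            # inherits build, environment and volumes
    command: [python, src/ingest.py]
    volumes:
      - ./source_documents:/home/privategpt/source_documents
```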

@mdeweerd
Contributor

FYI, I tried to enable 'cuda' and got some kind of success: I got a cuda related error message:

nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.7, please update your driver to a newer version, or use an earlier cuda container: unknown

In the Dockerfile I used:

FROM wallies/python-cuda:3.10-cuda11.7-runtime

and in the docker-compose-ingest.yaml file, I added:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

@JulienA
Author

JulienA commented May 16, 2023

FYI, I tried to enable 'cuda' and got some kind of success: I got a cuda related error message:

nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.7, please update your driver to a newer version, or use an earlier cuda container: unknown

In the Dockerfile I used:

FROM wallies/python-cuda:3.10-cuda11.7-runtime

and in the docker-compose-ingest.yaml file, I added:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

I may be wrong, but the requirements use llama.cpp, so even if you use CUDA-related stuff it won't be used, since the cpp one only uses the CPU?

@mdeweerd
Contributor

I may be wrong, but the requirements use llama.cpp, so even if you use CUDA-related stuff it won't be used, since the cpp one only uses the CPU?

When I run the app and use "docker stats", the cpu use exceeds 100%, so it's using more than 1 core (but only 1 cpu).

So the latest release has support for cuda.

Dockerfile Outdated
@@ -0,0 +1,12 @@
FROM python:3.10.11

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I'd suggest having a look at https://github.com/GoogleContainerTools/distroless too.

It provides different base images, python3 included, that are very small and already have a user inside them. It could be very effective for slimming down the image size as much as possible!

Author

Hello, I already tried some light/distroless images, but requirements.txt pulls in a lot of dependencies (around 8 GB) and needs a GCC compiler, possibly CUDA and other stuff.

You would save maybe 200 MB (2%) of the total image size and would probably need to add gcc/python-dev/cuda manually.

If you have a working Dockerfile using a distroless/light base image, don't hesitate to contribute.


Yeah, you're right, I didn't take into consideration the CUDA dependencies, which are very heavy. It would make little sense size-wise. It could still make some sense security-wise, though, because there is no default shell. But I guess it depends on which environment you are going to use it in (and since it is a CLI, I think it doesn't really make sense).


@mdeweerd
Contributor

mdeweerd commented May 17, 2023

I am making progress with CUDA and moved everything to a single docker-compose.yaml.

I proposed a PR for https://github.com/mdeweerd/privateGPT/tree/cuda in your fork.

@mdeweerd mdeweerd mentioned this pull request May 18, 2023
@mdeweerd
Contributor

I had added the source_documents mount to the privateGPT service because I did not want to repeat it on every ingest service - I try to be DRY. I now remembered the name of the mechanism I was looking for: anchors and aliases.

This is essentially a suggestion - maybe I'll look into it, but I have to attend to some other stuff...
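A minimal sketch of anchors and aliases for that mount (service names are illustrative):

```yaml
# Define the source_documents mount once and reuse it via a YAML alias.
x-source-documents-volume: &source_documents_volume
  ./source_documents:/home/privategpt/source_documents

services:
  privategpt-ingest:
    volumes:
      - *source_documents_volume
  privategpt-cuda-ingest:
    volumes:
      - *source_documents_volume
```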

@JulienA
Author

JulienA commented May 18, 2023

Since source_documents is only needed at ingest time, I try to avoid mounting it when it's not needed.
Like this (d4cfac2), you only have it in the ingest services and the cuda override only changes the image - is that ok?

@mdeweerd
Contributor

Since source_documents is only needed at ingest time, I try to avoid mounting it when it's not needed. Like this (d4cfac2), you only have it in the ingest services and the cuda override only changes the image - is that ok?

Yes, that's perfect.

@mdeweerd mdeweerd mentioned this pull request May 19, 2023
Dockerfile Outdated
Comment on lines 5 to 18
ARG BASEIMAGE
FROM $BASEIMAGE

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt

COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
RUN pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log

COPY ./src src


Suggested change
ARG BASEIMAGE
FROM $BASEIMAGE
RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt
COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
RUN pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log
COPY ./src src
ARG BASEIMAGE
FROM $BASEIMAGE
RUN groupadd -g 10009 -o privategpt \
&& useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt
COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
RUN (${LLAMA_CMAKE} pip install $(grep llama-cpp-python src/requirements.txt) 2>&1 | tee llama-build.log) \
&& sleep 10 \
&& pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log
COPY ./src src

Contributor

Proposed update in PR JulienA#2.

mdeweerd
mdeweerd previously approved these changes May 19, 2023
- MODEL_PATH=${MODEL_PATH:-/home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin}
- MODEL_N_CTX=${MODEL_N_CTX:-1000}
volumes:
- ${CACHE_MOUNT:-./cache}:/home/privategpt/.cache/torch
Contributor

The local .cache may be populated with other subdirectories, so mapping that entire directory to torch is not ok.

This is why I mapped only the "torch" directory, where the models seem to be downloaded, and I mapped it from a "cache" directory in the models path, because this is essentially a cache of models.

To avoid having an extra path to specify, I did not add another variable such as "MODEL_CACHE_MOUNT".

docker compose run --rm --build privategpt-ingest
```

2. With Cuda 11.6 or 11.7
Contributor

Maybe add a note:

:warning: The use of CUDA is not fully validated yet.  Also the CUDA version on your host is important and must be at least the version used in the container.  You can check your version with `docker compose run --rm --build check-cuda-version`

:information_source: Get a recent CUDA version from https://developer.nvidia.com/cuda-downloads.

Dockerfile Outdated
COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
Contributor

I forgot to remove the && sleep 10 here, which I added only to visually verify that the executed command was ok.

This && sleep 10 can be removed.

Dockerfile Outdated
Comment on lines 5 to 18
ARG BASEIMAGE
FROM $BASEIMAGE

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt

COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
RUN pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log

COPY ./src src
Contributor

Proposed update in PR JulienA#2.

Add pip upgrade to avoid sha256 mismatches, also cleanup cache
```sh
docker compose run --rm --build privategpt-cuda-11.6-ingest

docker compose run --rm --build privategpt-cuda-11.7-ingest


I get an error when running this command; not sure if it's related to the docker compose version?

PS C:\Users\bme\projects\privateGPT> docker compose run --rm --build privategpt-cuda-11.7-ingest
unknown flag: --build

Contributor

@mdeweerd mdeweerd May 19, 2023

I am on Win11 and the flag is ok (tested from a PowerShell prompt):

> docker --version
Docker version 23.0.5, build bc4487a
> docker compose version
Docker Compose version v2.17.3

Added in Docker Compose 2.13.0: docker/docs@b00b1d2.


Thanks

> docker --version
Docker version 20.10.14, build a224086
> docker compose version
Docker Compose version v2.5.1

After doing a build, it runs and gives this error:

> docker compose run --rm privategpt-cuda-11.7-ingest
Traceback (most recent call last):
  File "/home/privategpt/src/ingest.py", line 4, in <module>
    from dotenv import load_dotenv
ModuleNotFoundError: No module named 'dotenv'

I see python-dotenv==1.0.0 in the requirements.txt and the pip install succeeded in the docker build (presumably, because the build completed and ran).

Contributor

This branch needs a merge from main.

You can do this locally for now:

git remote add upstream https://github.com/imartinez/privateGPT
git fetch upstream
git checkout -b local-merge
git merge upstream/main
git add README.md
git commit -m "Ignore README conflicts"


Rebuilt and it's working.

Contributor

I also received this error; I think it's a red herring caused by pip's output being hidden and piped into a log file. Because the pipeline's exit status is that of tee rather than pip, the build doesn't fail when pip can't find a package etc., and it falls through to trying to run the script.

A solution would be to remove the pipe into the log file in the Dockerfile (or enable pipefail in the RUN command).

Comment on lines +1 to +4
#FROM python:3.10.11
#FROM wallies/python-cuda:3.10-cuda11.6-runtime

# Using argument for base image to avoid multiplying Dockerfiles


♻️

Suggested change
#FROM python:3.10.11
#FROM wallies/python-cuda:3.10-cuda11.6-runtime
# Using argument for base image to avoid multiplying Dockerfiles
# Using argument for base image to avoid multiplying Dockerfiles


COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)


♻️

Suggested change
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)


COPY ./src src

# ENTRYPOINT ["python", "src/privateGPT.py"]


♻️

Suggested change
# ENTRYPOINT ["python", "src/privateGPT.py"]

FROM $BASEIMAGE

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt


Tried running the ingest docker container on Linux and am getting this error:

$ sudo docker compose run --rm privategpt-ingest
[sudo] password for brian:
Loading documents from /home/privategpt/source_documents
Loaded 1 documents from /home/privategpt/source_documents
Split into 90 chunks of text (max. 500 characters each)
Traceback (most recent call last):
  File "/home/privategpt/src/ingest.py", line 97, in <module>
    main()
  File "/home/privategpt/src/ingest.py", line 88, in main
    embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
  File "/home/privategpt/.local/lib/python3.10/site-packages/langchain/embeddings/huggingface.py", line 54, in __init__
    self.client = sentence_transformers.SentenceTransformer(
  File "/home/privategpt/.local/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 87, in __init__
    snapshot_download(model_name_or_path,
  File "/home/privategpt/.local/lib/python3.10/site-packages/sentence_transformers/util.py", line 476, in snapshot_download
    os.makedirs(nested_dirname, exist_ok=True)
  File "/usr/local/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/local/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/privategpt/.cache/torch/sentence_transformers'

Googling it suggests that it's related to the Dockerfile USER not having the correct permissions, but I'm not sure.

Do you know what could be causing this? I ran it fine on Windows through WSL2 Docker Desktop, but get this error when running on a Linux machine.

Contributor

It's related to the fact that you're running docker as root: the mounted host directories get created/owned by root, and the unprivileged container user can't create directories inside them.

For this particular error, mkdir cache; chmod 777 cache should do the trick; you also need to do this for the 'db' directory.


Thank you, that fixed it. Maybe worth noting in the readme? Not sure how many other people will hit this.

@macropin

You might want to consider reworking this as a cog.yml. Cog is a machine-learning domain-specific tool for creating and running containers: https://github.com/replicate/cog/

@BaileyJM02
Contributor

BaileyJM02 commented May 27, 2023

Just dropping a comment here, this doesn't work out of the box on Apple M1 due to pypandoc-binary not resolving. See #226.

Short term solution appears to be this: #226 (comment)

@Rots

Rots commented May 31, 2023

After changing the permissions and running the ingest, I get a missing model file:

$ chmod 777 models cache db
$ docker-compose run --rm privategpt-ingest
Creating privategpt_privategpt-ingest_run ... done
Loading documents from /home/privategpt/source_documents
Loading document: /home/privategpt/source_documents/state_of_the_union.txt
Loaded 1 documents from /home/privategpt/source_documents
Split into 90 chunks of text (max. 500 characters each)
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
$ docker-compose run --rm privategpt      
Creating privategpt_privategpt_run ... done
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
Traceback (most recent call last):
  File "/home/privategpt/src/privateGPT.py", line 57, in <module>
    main()
  File "/home/privategpt/src/privateGPT.py", line 30, in main
    llm = GPT4All(model=model_path, n_ctx=model_n_ctx, backend='gptj', callbacks=callbacks, verbose=False)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/home/privategpt/.local/lib/python3.10/site-packages/langchain/llms/gpt4all.py", line 169, in validate_environment
    values["client"] = GPT4AllModel(
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygpt4all/models/gpt4all_j.py", line 47, in __init__
    super(GPT4All_J, self).__init__(model_path=model_path,
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygptj/model.py", line 58, in __init__     
    raise Exception(f"File {model_path} not found!")
Exception: File /home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin not found!
ERROR: 1

@denis-ev

denis-ev commented Jun 17, 2023

After changing the permissions and running the ingest, I get a missing model file:

$ chmod 777 models cache db
$ docker-compose run --rm privategpt-ingest
Creating privategpt_privategpt-ingest_run ... done
Loading documents from /home/privategpt/source_documents
Loading document: /home/privategpt/source_documents/state_of_the_union.txt
Loaded 1 documents from /home/privategpt/source_documents
Split into 90 chunks of text (max. 500 characters each)
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
$ docker-compose run --rm privategpt      
Creating privategpt_privategpt_run ... done
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
Traceback (most recent call last):
  File "/home/privategpt/src/privateGPT.py", line 57, in <module>
    main()
  File "/home/privategpt/src/privateGPT.py", line 30, in main
    llm = GPT4All(model=model_path, n_ctx=model_n_ctx, backend='gptj', callbacks=callbacks, verbose=False)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/home/privategpt/.local/lib/python3.10/site-packages/langchain/llms/gpt4all.py", line 169, in validate_environment
    values["client"] = GPT4AllModel(
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygpt4all/models/gpt4all_j.py", line 47, in __init__
    super(GPT4All_J, self).__init__(model_path=model_path,
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygptj/model.py", line 58, in __init__     
    raise Exception(f"File {model_path} not found!")
Exception: File /home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin not found!
ERROR: 1

The model is not downloaded automatically.

you need to download it from
https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin
or
wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin -O models/ggml-gpt4all-j-v1.3-groovy.bin

docker-compose.yml

---
version: '3.9'

x-ingest: &ingest
  environment:
    - COMMAND=python src/ingest.py  # Specify the command
...

services:
  privategpt:
...
    #command: [ python, src/privateGPT.py ]
    environment:
      - COMMAND=python src/privateGPT.py  # Specify the command
...

I changed some code to automatically check for the model
Dockerfile:

#FROM python:3.10.11
#FROM wallies/python-cuda:3.10-cuda11.6-runtime

# Using argument for base image to avoid multiplying Dockerfiles
ARG BASEIMAGE
FROM $BASEIMAGE

# Copy the entrypoint script
COPY entrypoint.sh /entrypoint.sh

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt \
    && chown privategpt:privategpt /entrypoint.sh && chmod +x /entrypoint.sh
USER privategpt
WORKDIR /home/privategpt

COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)

# Add the line to modify the PATH environment variable
ENV PATH="$PATH:/home/privategpt/.local/bin"

RUN pip install --upgrade pip \
    && ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) \
    && ( pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log ) \
    && pip cache purge

COPY ./src src

# Set the entrypoint command
ENTRYPOINT ["/entrypoint.sh"]

entrypoint.sh:

#!/bin/bash

MODEL_FILE="models/ggml-gpt4all-j-v1.3-groovy.bin"
MODEL_URL="https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin"

# Check if the model file exists
if [ ! -f "$MODEL_FILE" ]; then
    echo "Model file not found. Downloading..."
    wget "$MODEL_URL" -O "$MODEL_FILE"
    echo "Model downloaded."
fi

# Check if the command is provided through environment variables
if [ -z "$COMMAND" ]; then
    # No command specified, fallback to default
    COMMAND=("python" "src/privateGPT.py")
else
    # Split the command string into an array
    IFS=' ' read -ra COMMAND <<< "$COMMAND"
fi

# Execute the command
"${COMMAND[@]}"

@k00ni

k00ni commented Jul 14, 2023

LGTM

@imartinez imartinez added the primordial Related to the primordial version of PrivateGPT, which is now frozen in favour of the new PrivateGPT label Oct 19, 2023
@wmhartl

wmhartl commented Nov 3, 2023

Came looking for an updated Dockerfile that doesn't have the old --chown on the COPY lines and found this PR. What's the thought on merging @denis-ev's approach?

@KPHIBYE

KPHIBYE commented Nov 21, 2023

I wanted to chime in regarding a CUDA container for running PrivateGPT locally in Docker with the NVIDIA Container Toolkit.

I combined elements from:

An official NVIDIA CUDA image is used as base. The drawback of this is that ubuntu22.04 is the highest available version for the container, and thus python3.11 has to be installed from an external repository. CUDA version 11.8.0 was chosen as the default since it is the newest version that does not require a driver version >=525.60.13, according to NVIDIA. The worker user was included since it is also present in the Dockerfile of @pabloogc, which is currently in main.

The resulting image has a size of 8.5 GB. It expects two mounted volumes, one to /home/worker/app/local_data and one to /home/worker/app/models. Both should have uid 101 as owner. The name of the model file, which should be located directly in the mounted models folder, can be specified with the PGPT_HF_MODEL_FILE environment variable. The name of the Hugging Face repository of the embedding model, which should be cloned to a folder named embedding inside the models folder, can be specified with the PGPT_EMBEDDING_HF_MODEL_NAME environment variable.

At least this is what I think these two environment variables are used for after looking at imartinez/privateGPT/scripts/setup and imartinez/privateGPT/settings-docker.yaml. Specifying the model name with PGPT_HF_MODEL_FILE works, but although the repository of the embedding model is present in models/embedding, the embedding files seem to be downloaded again on first start.

This is the Dockerfile I came up with:

ARG UBUNTU_VERSION=22.04
ARG CUDA_VERSION=11.8.0
ARG CUDA_DOCKER_ARCH=all
ARG APP_DIR=/home/worker/app



### Build Image ###
FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${UBUNTU_VERSION} as builder

ARG CUDA_DOCKER_ARCH
ARG APP_DIR

ENV DEBIAN_FRONTEND=noninteractive \
    CUDA_DOCKER_ARCH=${CUDA_DOCKER_ARCH} \
    LLAMA_CUBLAS=1 \
    CMAKE_ARGS="-DLLAMA_CUBLAS=on" \
    FORCE_CMAKE=1 \
    POETRY_VIRTUALENVS_IN_PROJECT=true

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends software-properties-common && \
    add-apt-repository ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        python3.11 \
        python3.11-dev \
        python3.11-venv \
        build-essential \
        git && \
    python3.11 -m ensurepip && \
    python3.11 -m pip install pipx && \
    python3.11 -m pipx ensurepath && \
    pipx install poetry

ENV PATH="/root/.local/bin:$PATH"

WORKDIR $APP_DIR

RUN git clone https://github.com/imartinez/privateGPT.git . --depth 1

RUN poetry install --with local && \
    poetry install --with ui

RUN mkdir build_artifacts && \
    cp -r .venv private_gpt docs *.yaml *.md build_artifacts/



### Runtime Image ###
FROM nvidia/cuda:${CUDA_VERSION}-runtime-ubuntu${UBUNTU_VERSION} as runtime

ARG APP_DIR

ENV DEBIAN_FRONTEND=noninteractive \
    PYTHONUNBUFFERED=1 \
    PGPT_PROFILES=docker,local

EXPOSE 8080

RUN adduser --system worker

WORKDIR $APP_DIR

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends software-properties-common && \
    add-apt-repository ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        python3.11 \
        python3.11-venv \
        curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* && \
    mkdir local_data models && \
    chown worker local_data models

COPY --chown=worker --from=builder $APP_DIR/build_artifacts ./

USER worker

HEALTHCHECK --start-period=1m --interval=5m --timeout=3s \
    CMD curl --head --silent --fail --show-error http://localhost:8080 || exit 1

ENTRYPOINT [".venv/bin/python", "-m", "private_gpt"]
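For completeness, a hedged sketch of how this image could be wired into docker-compose with the NVIDIA runtime (the service name, host paths and port mapping are assumptions based on the description above, not part of this PR):

```yaml
# Illustrative compose service for the CUDA image above; host paths and names are assumptions.
services:
  private-gpt-cuda:
    build: .
    ports:
      - "8080:8080"
    environment:
      - PGPT_HF_MODEL_FILE=${PGPT_HF_MODEL_FILE}                   # model file placed in ./models
      - PGPT_EMBEDDING_HF_MODEL_NAME=${PGPT_EMBEDDING_HF_MODEL_NAME}
    volumes:
      - ./local_data:/home/worker/app/local_data   # should be owned by uid 101
      - ./models:/home/worker/app/models            # should be owned by uid 101
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```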

@imartinez imartinez closed this Dec 4, 2023
Labels
primordial Related to the primordial version of PrivateGPT, which is now frozen in favour of the new PrivateGPT