
docker file and compose #120

Closed
wants to merge 20 commits

Conversation

JulienA

@JulienA JulienA commented May 14, 2023

No description provided.

@thebigbone

Why not build the image in docker-compose directly?

Dockerfile Outdated
@JulienA
Author

JulienA commented May 15, 2023

Pushed both changes.

@mdeweerd
Contributor

Thanks for sharing.
Could you allow the use of a .env file to avoid modifying the repo's yaml file?

For instance, use:

 ${MODELS:-./models}

to set the models directory so that it can be set in the .env file.

@JulienA
Author

JulienA commented May 15, 2023

Thanks for sharing. Could you allow the use of a .env file to avoid modifying the repo's yaml file?

For instance, use:

 ${MODELS:-./models}

to set the models directory so that it can be set in the .env file.

I'm not sure I understand correctly, but setting load_dotenv(override=True) would override the docker-compose environment variables with the .env file, and there is no .env file at the moment.

@mdeweerd
Contributor

There is no .env file in the repo, but we can set one locally.

By setting .env as follows, I successfully used my E: drive for the models. A user that does not have a local .env should be using ./models instead.

MODELS=E:/

This avoids changing any git-controlled file to adapt to the local setup. I already had some models on my E: drive.
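As a minimal sketch (service and container paths here are placeholders, not necessarily the ones used in this PR), the compose file would declare the volume with a default that a local .env can override:

```yaml
# Illustrative docker-compose.yaml fragment; the default ./models is used when no .env sets MODELS.
services:
  privategpt:
    volumes:
      - ${MODELS:-./models}:/home/privategpt/models
```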

@JulienA
Author

JulienA commented May 15, 2023

@mdeweerd reviewed in b4aad15

@JulienA JulienA requested a review from Rots May 15, 2023 17:45
@mdeweerd
Contributor

I was able to use "MODEL_MOUNT".

I suggest converting the line endings of these files to LF.

When I applied a local pre-commit configuration, it detected that the line endings of the yaml files (and the Dockerfile) are CRLF. yamllint suggests LF line endings, and yamlfix can format the files automatically.
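For reference, a minimal .pre-commit-config.yaml along those lines could look as follows (the hook revisions are placeholders and should be pinned to current tags):

```yaml
# Sketch of a local pre-commit configuration enforcing LF endings and yaml formatting.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0           # placeholder revision
    hooks:
      - id: mixed-line-ending
        args: [--fix=lf]  # convert CRLF to LF
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.32.0          # placeholder revision
    hooks:
      - id: yamllint
  - repo: https://github.com/lyz-code/yamlfix
    rev: 1.13.0           # placeholder revision
    hooks:
      - id: yamlfix
```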

I am still struggling to get an answer to my question - the container stops at some point. Maybe this has to do with memory - the container limit is 7.448 GiB.

@mdeweerd
Contributor

FYI, I've set the memory for WSL2 to 12GB, which allowed me to get an answer to a question.

My .wslconfig now looks like:

[wsl2]
memory=12GB

During compilation I noticed some references to nvidia, so I wondered if the image should be based on some cuda image.

I tried FROM wallies/python-cuda:3.10-cuda11.6-runtime but did not see an impact on performance - it may be helpful in the future.

@mdeweerd
Contributor

The two docker-compose*.yaml files share elements; the duplication could be avoided by merging both into a single docker-compose.yaml file and using 'extends:'.

That also avoids having to specify which docker-compose*.yaml file to use.

You can have a look at https://github.com/mdeweerd/MetersToHA/blob/meters-to-ha/docker-compose.yml for some hints.
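A rough sketch of the idea, with illustrative service names (not necessarily the ones from this PR):

```yaml
# Single docker-compose.yaml; the ingest service reuses the base service via 'extends'.
services:
  privategpt:
    build: .
    environment:
      - MODEL_PATH=${MODEL_PATH:-/home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin}
    volumes:
      - ${MODEL_MOUNT:-./models}:/home/privategpt/models

  privategpt-ingest:
    extends:
      service: privategpt            # inherits build, environment and volumes
    command: [python, src/ingest.py]
    volumes:
      - ./source_documents:/home/privategpt/source_documents
```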

@mdeweerd
Contributor

FYI, I tried to enable 'cuda' and got some kind of success: I got a cuda related error message:

nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.7, please update your driver to a newer version, or use an earlier cuda container: unknown

In the Dockerfile I used:

FROM wallies/python-cuda:3.10-cuda11.7-runtime

and in the docker-compose-ingest.yaml file, I added:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

@JulienA
Author

JulienA commented May 16, 2023

FYI, I tried to enable 'cuda' and got some kind of success: I got a cuda related error message:

nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.7, please update your driver to a newer version, or use an earlier cuda container: unknown

In the Dockerfile I used:

FROM wallies/python-cuda:3.10-cuda11.7-runtime

and in the docker-compose-ingest.yaml file, I added:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

I may be wrong, but the requirements use llama.cpp, so even if you use CUDA-related stuff it won't be used, since the cpp one only uses the CPU?

@mdeweerd
Contributor

I may be wrong, but the requirements use llama.cpp, so even if you use CUDA-related stuff it won't be used, since the cpp one only uses the CPU?

When I run the app and use "docker stats", the cpu use exceeds 100%, so it's using more than 1 core (but only 1 cpu).

So the latest release has support for cuda.

Dockerfile Outdated
@@ -0,0 +1,12 @@
FROM python:3.10.11

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I'd suggest having a look at https://github.com/GoogleContainerTools/distroless too.

It provides different base images, python3 included, that are very small and already have a user inside them. It could be very effective for slimming down the image size as much as possible!

Author

Hello, I already tried some light/distroless images, but requirements.txt pulls in a lot of dependencies (around 8 GB) and needs a GCC compiler, possibly CUDA and other stuff.

You would save maybe 200 MB (2%) of the total image size and would probably need to add gcc/python-dev/cuda manually.

If you have a working Dockerfile using a distroless/light base image, don't hesitate to contribute.


Yeah, you're right, I didn't take into consideration the CUDA dependencies, which are very heavy. It would make little sense size-wise. It could still make some sense security-wise, though, because there is no default shell. But I guess it depends on which environment you are going to use it in (and since it is a CLI, I think it doesn't really make sense).


@mdeweerd
Contributor

mdeweerd commented May 17, 2023

I am making progress with CUDA and moved everything to a single docker-compose.yaml.

I proposed a PR for https://github.com/mdeweerd/privateGPT/tree/cuda in your fork.

@mdeweerd mdeweerd mentioned this pull request May 18, 2023
@mdeweerd
Contributor

I had added the source_documents mount to the privateGPT service because I did not want to repeat it on every ingest service - I try to be DRY. I now remembered the name of the mechanism I was looking for: anchors and aliases.

This is essentially a suggestion - maybe I'll look into it, but I have to attend to some other stuff...
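A minimal sketch of anchors and aliases for that mount (service names are illustrative):

```yaml
# Define the source_documents mount once and reuse it via a YAML alias.
x-source-documents-volume: &source_documents_volume
  ./source_documents:/home/privategpt/source_documents

services:
  privategpt-ingest:
    volumes:
      - *source_documents_volume
  privategpt-cuda-ingest:
    volumes:
      - *source_documents_volume
```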

@JulienA
Author

JulienA commented May 18, 2023

Since source_documents is only needed at ingest time, I try to avoid mounting it when it's not needed.
Like this (d4cfac2), you only have it in the ingest services and the cuda override only changes the image - is that ok?

@mdeweerd
Contributor

Since source_documents is only needed at ingest time, I try to avoid mounting it when it's not needed. Like this (d4cfac2), you only have it in the ingest services and the cuda override only changes the image - is that ok?

Yes, that's perfect.

@mdeweerd mdeweerd mentioned this pull request May 19, 2023
Dockerfile Outdated
Comment on lines 5 to 18
ARG BASEIMAGE
FROM $BASEIMAGE

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt

COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
RUN pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log

COPY ./src src


Suggested change
ARG BASEIMAGE
FROM $BASEIMAGE
RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt
COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
RUN pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log
COPY ./src src
ARG BASEIMAGE
FROM $BASEIMAGE
RUN groupadd -g 10009 -o privategpt \
&& useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt
COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
RUN (${LLAMA_CMAKE} pip install $(grep llama-cpp-python src/requirements.txt) 2>&1 | tee llama-build.log) \
&& sleep 10 \
&& pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log
COPY ./src src

Contributor

Proposed update in PR JulienA#2.

mdeweerd
mdeweerd previously approved these changes May 19, 2023
- MODEL_PATH=${MODEL_PATH:-/home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin}
- MODEL_N_CTX=${MODEL_N_CTX:-1000}
volumes:
- ${CACHE_MOUNT:-./cache}:/home/privategpt/.cache/torch
Contributor

The local .cache may be populated with other subdirectories, so mapping that entire directory to torch is not ok.

This is why I mapped only the "torch" directory, where the models seem to be downloaded, and I mapped it from a "cache" directory in the models path, because this is essentially a cache of models.

To avoid having an extra path to specify, I did not add another variable such as "MODEL_CACHE_MOUNT".

docker compose run --rm --build privategpt-ingest
```

2. With Cuda 11.6 or 11.7
Contributor

Maybe add a note:

:warning: The use of CUDA is not fully validated yet.  Also the CUDA version on your host is important and must be at least the version used in the container.  You can check your version with `docker compose run --rm --build check-cuda-version`

:information_source: Get a recent CUDA version from https://developer.nvidia.com/cuda-downloads.

Dockerfile Outdated
COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
Contributor

I forgot to remove the && sleep 10 here, which I added only to visually verify that the executed command was ok.

This && sleep 10 can be removed.

Dockerfile Outdated
Comment on lines 5 to 18
ARG BASEIMAGE
FROM $BASEIMAGE

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt
WORKDIR /home/privategpt

COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)
RUN ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) && sleep 10
RUN pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log

COPY ./src src
Contributor

Proposed update in PR JulienA#2.

Add pip upgrade to avoid sha256 mismatches, also cleanup cache
```sh
docker compose run --rm --build privategpt-cuda-11.6-ingest

docker compose run --rm --build privategpt-cuda-11.7-ingest


I get an error when running this command; not sure if it's related to the docker compose version?

PS C:\Users\bme\projects\privateGPT> docker compose run --rm --build privategpt-cuda-11.7-ingest
unknown flag: --build

Contributor

@mdeweerd mdeweerd May 19, 2023

I am on Win11 and the flag is ok (tested from a PowerShell prompt):

> docker --version
Docker version 23.0.5, build bc4487a
> docker compose version
Docker Compose version v2.17.3

Added in Docker Compose 2.13.0: docker/docs@b00b1d2.


Thanks

> docker --version
Docker version 20.10.14, build a224086
> docker compose version
Docker Compose version v2.5.1

After doing a build, it runs and gives this error:

> docker compose run --rm privategpt-cuda-11.7-ingest
Traceback (most recent call last):
  File "/home/privategpt/src/ingest.py", line 4, in <module>
    from dotenv import load_dotenv
ModuleNotFoundError: No module named 'dotenv'

I see python-dotenv==1.0.0 in the requirements.txt and the pip install succeeded in the docker build (presumably, because the build completed and ran).

Contributor

This branch needs a merge from main.

You can do this locally for now:

git remote add upstream https://github.com/imartinez/privateGPT
git fetch upstream
git checkout -b local-merge
git merge upstream/main
git add README.md
git commit -m "Ignore README conflicts"


Rebuilt and it's working.

Contributor

I also received this error; I think it's a red herring caused by pip's output being hidden and piped into a log file. Because the pipeline's exit status is that of tee rather than pip, the build doesn't fail when pip can't find a package etc., and it falls through to trying to run the script.

A solution would be to remove the pipe into the log file in the Dockerfile (or enable pipefail in the RUN command).

Comment on lines +1 to +4
#FROM python:3.10.11
#FROM wallies/python-cuda:3.10-cuda11.6-runtime

# Using argument for base image to avoid multiplying Dockerfiles


♻️

Suggested change
#FROM python:3.10.11
#FROM wallies/python-cuda:3.10-cuda11.6-runtime
# Using argument for base image to avoid multiplying Dockerfiles
# Using argument for base image to avoid multiplying Dockerfiles


COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)


♻️

Suggested change
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)


COPY ./src src

# ENTRYPOINT ["python", "src/privateGPT.py"]


♻️

Suggested change
# ENTRYPOINT ["python", "src/privateGPT.py"]

FROM $BASEIMAGE

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt
USER privategpt


Tried running the ingest docker container on Linux and am getting this error:

$ sudo docker compose run --rm privategpt-ingest
[sudo] password for brian:
Loading documents from /home/privategpt/source_documents
Loaded 1 documents from /home/privategpt/source_documents
Split into 90 chunks of text (max. 500 characters each)
Traceback (most recent call last):
  File "/home/privategpt/src/ingest.py", line 97, in <module>
    main()
  File "/home/privategpt/src/ingest.py", line 88, in main
    embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
  File "/home/privategpt/.local/lib/python3.10/site-packages/langchain/embeddings/huggingface.py", line 54, in __init__
    self.client = sentence_transformers.SentenceTransformer(
  File "/home/privategpt/.local/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 87, in __init__
    snapshot_download(model_name_or_path,
  File "/home/privategpt/.local/lib/python3.10/site-packages/sentence_transformers/util.py", line 476, in snapshot_download
    os.makedirs(nested_dirname, exist_ok=True)
  File "/usr/local/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/local/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/privategpt/.cache/torch/sentence_transformers'

Googling it suggests that it's related to the Dockerfile USER not having the correct permissions, but I'm not sure.

Do you know what could be causing this? I ran it fine on Windows through WSL2 Docker Desktop, but get this error when running on a Linux machine.

Contributor

It's related to the fact that you're running docker as root: the mounted host directories get created/owned by root, and the unprivileged container user can't create directories inside them.

For this particular error, mkdir cache; chmod 777 cache should do the trick; you also need to do this for the 'db' directory.


Thank you, that fixed it. Maybe worth noting in the readme? Not sure how many other people will hit this.

@macropin

You might want to consider reworking this as a cog.yml. Cog is a machine-learning domain-specific tool for creating and running containers: https://github.com/replicate/cog/

@BaileyJM02
Contributor

BaileyJM02 commented May 27, 2023

Just dropping a comment here, this doesn't work out of the box on Apple M1 due to pypandoc-binary not resolving. See #226.

Short term solution appears to be this: #226 (comment)

@Rots

Rots commented May 31, 2023

After changing the permissions and running the ingest, I get a missing model file:

$ chmod 777 models cache db
$ docker-compose run --rm privategpt-ingest
Creating privategpt_privategpt-ingest_run ... done
Loading documents from /home/privategpt/source_documents
Loading document: /home/privategpt/source_documents/state_of_the_union.txt
Loaded 1 documents from /home/privategpt/source_documents
Split into 90 chunks of text (max. 500 characters each)
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
$ docker-compose run --rm privategpt      
Creating privategpt_privategpt_run ... done
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
Traceback (most recent call last):
  File "/home/privategpt/src/privateGPT.py", line 57, in <module>
    main()
  File "/home/privategpt/src/privateGPT.py", line 30, in main
    llm = GPT4All(model=model_path, n_ctx=model_n_ctx, backend='gptj', callbacks=callbacks, verbose=False)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/home/privategpt/.local/lib/python3.10/site-packages/langchain/llms/gpt4all.py", line 169, in validate_environment
    values["client"] = GPT4AllModel(
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygpt4all/models/gpt4all_j.py", line 47, in __init__
    super(GPT4All_J, self).__init__(model_path=model_path,
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygptj/model.py", line 58, in __init__     
    raise Exception(f"File {model_path} not found!")
Exception: File /home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin not found!
ERROR: 1

@denis-ev

denis-ev commented Jun 17, 2023

After changing the permissions and running the ingest, I get a missing model file:

$ chmod 777 models cache db
$ docker-compose run --rm privategpt-ingest
Creating privategpt_privategpt-ingest_run ... done
Loading documents from /home/privategpt/source_documents
Loading document: /home/privategpt/source_documents/state_of_the_union.txt
Loaded 1 documents from /home/privategpt/source_documents
Split into 90 chunks of text (max. 500 characters each)
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
$ docker-compose run --rm privategpt      
Creating privategpt_privategpt_run ... done
Using embedded DuckDB with persistence: data will be stored in: /home/privategpt/db
Traceback (most recent call last):
  File "/home/privategpt/src/privateGPT.py", line 57, in <module>
    main()
  File "/home/privategpt/src/privateGPT.py", line 30, in main
    llm = GPT4All(model=model_path, n_ctx=model_n_ctx, backend='gptj', callbacks=callbacks, verbose=False)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/home/privategpt/.local/lib/python3.10/site-packages/langchain/llms/gpt4all.py", line 169, in validate_environment
    values["client"] = GPT4AllModel(
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygpt4all/models/gpt4all_j.py", line 47, in __init__
    super(GPT4All_J, self).__init__(model_path=model_path,
  File "/home/privategpt/.local/lib/python3.10/site-packages/pygptj/model.py", line 58, in __init__     
    raise Exception(f"File {model_path} not found!")
Exception: File /home/privategpt/models/ggml-gpt4all-j-v1.3-groovy.bin not found!
ERROR: 1

The model is not downloaded automatically.

you need to download it from
https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin
or
wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin -O models/ggml-gpt4all-j-v1.3-groovy.bin

docker-compose.yml

---
version: '3.9'

x-ingest: &ingest
  environment:
    - COMMAND=python src/ingest.py  # Specify the command
...

services:
  privategpt:
...
    #command: [ python, src/privateGPT.py ]
    environment:
      - COMMAND=python src/privateGPT.py  # Specify the command
...

I changed some code to automatically check for the model
Dockerfile:

#FROM python:3.10.11
#FROM wallies/python-cuda:3.10-cuda11.6-runtime

# Using argument for base image to avoid multiplying Dockerfiles
ARG BASEIMAGE
FROM $BASEIMAGE

# Copy the entrypoint script
COPY entrypoint.sh /entrypoint.sh

RUN groupadd -g 10009 -o privategpt && useradd -m -u 10009 -g 10009 -o -s /bin/bash privategpt \
    && chown privategpt:privategpt /entrypoint.sh && chmod +x /entrypoint.sh
USER privategpt
WORKDIR /home/privategpt

COPY ./src/requirements.txt src/requirements.txt
ARG LLAMA_CMAKE
#RUN CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install $(grep llama-cpp-python src/requirements.txt)

# Add the line to modify the PATH environment variable
ENV PATH="$PATH:/home/privategpt/.local/bin"

RUN pip install --upgrade pip \
    && ( /bin/bash -c "${LLAMA_CMAKE} pip install \$(grep llama-cpp-python src/requirements.txt)" 2>&1 | tee llama-build.log ) \
    && ( pip install --no-cache-dir -r src/requirements.txt 2>&1 | tee pip-install.log ) \
    && pip cache purge

COPY ./src src

# Set the entrypoint command
ENTRYPOINT ["/entrypoint.sh"]

entrypoint.sh:

#!/bin/bash

MODEL_FILE="models/ggml-gpt4all-j-v1.3-groovy.bin"
MODEL_URL="https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin"

# Check if the model file exists
if [ ! -f "$MODEL_FILE" ]; then
    echo "Model file not found. Downloading..."
    wget "$MODEL_URL" -O "$MODEL_FILE"
    echo "Model downloaded."
fi

# Check if the command is provided through environment variables
if [ -z "$COMMAND" ]; then
    # No command specified, fallback to default
    COMMAND=("python" "src/privateGPT.py")
else
    # Split the command string into an array
    IFS=' ' read -ra COMMAND <<< "$COMMAND"
fi

# Execute the command
"${COMMAND[@]}"

@k00ni

k00ni commented Jul 14, 2023

LGTM

@imartinez imartinez added the primordial Related to the primordial version of PrivateGPT, which is now frozen in favour of the new PrivateGPT label Oct 19, 2023
@wmhartl

wmhartl commented Nov 3, 2023

Came looking for an updated Dockerfile that doesn't have the old --chown on the COPY lines and found this PR. What's the thought on merging @denis-ev's approach?

@KPHIBYE

KPHIBYE commented Nov 21, 2023

I wanted to chime in regarding a CUDA container for running PrivateGPT locally in Docker with the NVIDIA Container Toolkit.

I combined elements from:

An official NVIDIA CUDA image is used as base. The drawback of this is that ubuntu22.04 is the highest available version for the container, and thus python3.11 has to be installed from an external repository. CUDA version 11.8.0 was chosen as the default since it is the newest version that does not require a driver version >=525.60.13, according to NVIDIA. The worker user was included since it is also present in the Dockerfile of @pabloogc, which is currently in main.

The resulting image has a size of 8.5 GB. It expects two mounted volumes, one to /home/worker/app/local_data and one to /home/worker/app/models. Both should have uid 101 as owner. The name of the model file, which should be located directly in the mounted models folder, can be specified with the PGPT_HF_MODEL_FILE environment variable. The name of the Hugging Face repository of the embedding model, which should be cloned to a folder named embedding inside the models folder, can be specified with the PGPT_EMBEDDING_HF_MODEL_NAME environment variable.

At least this is what I think these two environment variables are used for after looking at imartinez/privateGPT/scripts/setup and imartinez/privateGPT/settings-docker.yaml. Specifying the model name with PGPT_HF_MODEL_FILE works, but although the repository of the embedding model is present in models/embedding, the embedding files seem to be downloaded again on first start.

This is the Dockerfile I came up with:

ARG UBUNTU_VERSION=22.04
ARG CUDA_VERSION=11.8.0
ARG CUDA_DOCKER_ARCH=all
ARG APP_DIR=/home/worker/app



### Build Image ###
FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${UBUNTU_VERSION} as builder

ARG CUDA_DOCKER_ARCH
ARG APP_DIR

ENV DEBIAN_FRONTEND=noninteractive \
    CUDA_DOCKER_ARCH=${CUDA_DOCKER_ARCH} \
    LLAMA_CUBLAS=1 \
    CMAKE_ARGS="-DLLAMA_CUBLAS=on" \
    FORCE_CMAKE=1 \
    POETRY_VIRTUALENVS_IN_PROJECT=true

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends software-properties-common && \
    add-apt-repository ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        python3.11 \
        python3.11-dev \
        python3.11-venv \
        build-essential \
        git && \
    python3.11 -m ensurepip && \
    python3.11 -m pip install pipx && \
    python3.11 -m pipx ensurepath && \
    pipx install poetry

ENV PATH="/root/.local/bin:$PATH"

WORKDIR $APP_DIR

RUN git clone https://github.com/imartinez/privateGPT.git . --depth 1

RUN poetry install --with local && \
    poetry install --with ui

RUN mkdir build_artifacts && \
    cp -r .venv private_gpt docs *.yaml *.md build_artifacts/



### Runtime Image ###
FROM nvidia/cuda:${CUDA_VERSION}-runtime-ubuntu${UBUNTU_VERSION} as runtime

ARG APP_DIR

ENV DEBIAN_FRONTEND=noninteractive \
    PYTHONUNBUFFERED=1 \
    PGPT_PROFILES=docker,local

EXPOSE 8080

RUN adduser --system worker

WORKDIR $APP_DIR

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends software-properties-common && \
    add-apt-repository ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        python3.11 \
        python3.11-venv \
        curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* && \
    mkdir local_data models && \
    chown worker local_data models

COPY --chown=worker --from=builder $APP_DIR/build_artifacts ./

USER worker

HEALTHCHECK --start-period=1m --interval=5m --timeout=3s \
    CMD curl --head --silent --fail --show-error http://localhost:8080 || exit 1

ENTRYPOINT [".venv/bin/python", "-m", "private_gpt"]
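For completeness, a hedged sketch of how this image could be wired into docker-compose with the NVIDIA runtime (the service name, host paths and port mapping are assumptions based on the description above, not part of this PR):

```yaml
# Illustrative compose service for the CUDA image above; host paths and names are assumptions.
services:
  private-gpt-cuda:
    build: .
    ports:
      - "8080:8080"
    environment:
      - PGPT_HF_MODEL_FILE=${PGPT_HF_MODEL_FILE}                   # model file placed in ./models
      - PGPT_EMBEDDING_HF_MODEL_NAME=${PGPT_EMBEDDING_HF_MODEL_NAME}
    volumes:
      - ./local_data:/home/worker/app/local_data   # should be owned by uid 101
      - ./models:/home/worker/app/models            # should be owned by uid 101
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```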

@imartinez imartinez closed this Dec 4, 2023
Labels
primordial Related to the primordial version of PrivateGPT, which is now frozen in favour of the new PrivateGPT