
Docker layer caching-friendly workflow with pipenv #3285

Closed
slint opened this issue Nov 22, 2018 · 19 comments
Labels
Category: Docker Issue affects docker builds. Type: Discussion This issue is open for discussion. Type: Documentation 📖 This issue relates to documentation of pipenv.

Comments

@slint

slint commented Nov 22, 2018

The usual way to create a Docker-based deployment (e.g. for deploying on Kubernetes) for a Python application has looked something like this, using a requirements.txt produced by pip freeze or pip-compile:

FROM python:3.6

RUN mkdir /app
WORKDIR /app

# Only copy application dependencies to take advantage of image layer caching,
# i.e. if the "requirements.txt" file doesn't change, the layer is cached
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

# Install the actual application code now. Usually, this is the part of the code
# that changes, thus invalidating the image layer cache.
ADD . /app

# We actually need this step because our application code might have a "setup.py"
# which defines "entry_points". If that weren't the case, it could be optional
RUN pip install .

# Run the app...
CMD ["python", "-m", "myapp.run", ... ]

Using pipenv, I would imagine the equivalent would be something like:

FROM python:3.6

# Install "pipenv"
RUN pip install pipenv

RUN mkdir /app
WORKDIR /app

# In a similar fashion as before if the "Pipfile.lock" doesn't change, the
# image layer is going to be cached.
COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --deploy --system

ADD . /app
RUN pip install .

# Run the app...
CMD ["python", "-m", "myapp.run", ... ]

If this is something that others have come across and consider a best practice, I believe it would be useful to make it part of the official documentation, since pipenv is meant to be a solution for applications.

For example, before that, I thought that the logical thing would be doing something like:

# Install my application dependencies
pipenv install requests flask celery ...
pipenv install --dev pytest ...

# ...develop my app...

# Add the app to the Pipfile
pipenv install -e .

# commit everything
git add .
git commit ...

This would make it easy for someone to install the application locally with just a pipenv install --dev. The problem is that once the application package is part of the Pipfile, Docker layer caching is thrown out the window (i.e. one has to do ADD . /app much earlier so that pipenv install --system --deploy can find the application's setup.py).
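For context, pipenv install -e . records the application itself in the Pipfile, roughly like this (the package name is illustrative):

```toml
[packages]
myapp = {editable = true, path = "."}
```

Because this entry points at the local path, the sources (including setup.py) must already be in the image before pipenv install can run.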

(This issue is in no way meant to be a complaint or a back in the "pip"-days I used to...-kind of rant. I'm really just hoping for a good discussion with practical advice and seeing how others tackle this issue using "pipenv")

@techalchemy
Member

techalchemy commented Nov 24, 2018

I'm pretty sure the answer you want is to use a .dockerignore file:

.git
.svn
.cvs
.hg
README*.md
!README*.md

And yeah, we can definitely use a documentation note about this. I'm sure we already have some Docker info in the docs, so feel free to propose the changes you want to see once you work out the best way to structure things. (I don't use Docker directly, so I'll leave it to the other maintainers to verify.)

@techalchemy techalchemy added Type: Documentation 📖 This issue relates to documentation of pipenv. Category: Docker Issue affects docker builds. Type: Discussion This issue is open for discussion. labels Nov 24, 2018
@haizaar

haizaar commented Dec 3, 2018

@slint ,
I personally still find it easier to do it in two steps, as per your initial post.
One of the reasons is the outstanding pipenv bug #3148.

More details on how I do it in a Dockerfile: https://tech.zarmory.com/2018/09/docker-multi-stage-builds-for-python-app.html#pipenv

@fusillicode

fusillicode commented Dec 5, 2018

Hi @haizaar, sorry to bother you, but I'm really interested in what you achieved, i.e. slim & lean Docker images with pipenv through multi-stage builds.

I tried to adapt the example you linked, but without success.

In particular, it's not clear to me where the step that includes the app sources in the last stage of your build is. Am I missing something? 🤔

Btw, my setup only includes a Pipfile & a Pipfile.lock in the root of my repo, alongside the app sources, which reside in the app directory.

@Fongshway

Would using the ONBUILD convention outlined in https://github.com/pypa/pipenv/blob/master/Dockerfile work for your use case?

@fusillicode

I don't think so @Fongshway as my goal is to "cook" a lean & custom image with just what I need to run my application :(

@haizaar

haizaar commented Dec 7, 2018

@fusillicode
Maybe you've missed the part where your own app has a proper setup.py and installs under $PYROOT together with its dependencies - this is what pip install --user . does.

So eventually both your app and its dependencies end up under $PYROOT and you just grab them in the final stage.

I prefer to have a setup.py for my app, since it ensures a clean install - just copying files has a high chance of including irrelevant pieces, e.g. tests. Yes, .dockerignore is one way to solve that, but it's still tedious for all developers to have to remember about .dockerignore; and .dockerignore is ignored if you decide to tar your context because you want symbolic links resolved moby/moby#18789 (comment).

Hope this helps.
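For concreteness, here is a minimal multi-stage sketch of the $PYROOT side-install idea described above. It assumes pipenv's pip calls honor PIP_USER/PYTHONUSERBASE (as in the linked post); image names, paths, and the myapp module are illustrative, not anyone's actual setup:

```dockerfile
# --- Build stage: everything installs under $PYROOT via pip's "user" scheme
FROM python:3.6 AS builder
ENV PYROOT=/pyroot PYTHONUSERBASE=/pyroot
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
# PIP_USER=1 routes the installs into $PYROOT instead of system site-packages
RUN PIP_USER=1 pipenv install --system --deploy
# The app's own setup.py installs it next to its dependencies
COPY . .
RUN PIP_USER=1 pip install .

# --- Final stage: grab only $PYROOT; pipenv and build tooling stay behind
FROM python:3.6-slim
ENV PYTHONUSERBASE=/pyroot PATH=/pyroot/bin:$PATH
COPY --from=builder /pyroot /pyroot
CMD ["python", "-m", "myapp.run"]
```

Setting PYTHONUSERBASE in the final stage makes the interpreter pick up /pyroot/lib/python3.6/site-packages via the user site path, so nothing else needs to be copied.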

@haizaar

haizaar commented Dec 7, 2018

P.S. @fusillicode, if you really want to go lean, with final images under 40 MB, you can use python-minimal images.

@techalchemy
Member

FYI @haizaar, I didn't forget your issue; it just wound up being super complicated on the resolver side for some reason. I am very interested in any updates to the Docker documentation, as I am using pipenv in Docker myself now and am not that great with it yet :)

@haizaar

haizaar commented Dec 8, 2018 via email

@fusillicode

@haizaar I'm really sorry for the lateness of my reply 🙇‍♂️

Thanks a lot for pointing out the possible problem and solution, but even more for sharing your experience in this matter!

Actually, I solved my problem by generating, inside the "builder" stage of my multi-stage Dockerfile, a requirements.txt via pipenv, and then using it directly with pip. It seems to be working pretty well, considering that I ended up with a slimmer image than my first production one :)

If it would be helpful, I'll share the Dockerfile :)

Thanks a lot for your support, I really appreciate it! 🍻
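(For reference, the builder-stage conversion fusillicode describes might look roughly like this. This is a sketch, not his actual Dockerfile, and it assumes the 2018-era pipenv lock -r flag for exporting the lock file as requirements; image and path names are illustrative:)

```dockerfile
# --- Builder stage: only used to turn Pipfile.lock into a requirements.txt
FROM python:3.6 AS builder
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
RUN pipenv lock -r > requirements.txt

# --- Release stage: plain pip install, no pipenv in the final image
FROM python:3.6-slim
WORKDIR /app
COPY --from=builder /app/requirements.txt .
RUN pip install -r requirements.txt
COPY app/ app/
CMD ["python", "-m", "app"]
```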

@slint
Author

slint commented Dec 8, 2018

Another workflow we have been investigating is building two images:

A dependencies-only image, tagged with the _meta.hash.sha256 key from the Pipfile.lock:

# ./Dockerfile.deps

FROM python:3.6

RUN pip install --upgrade pip pipenv setuptools wheel

RUN mkdir /app
WORKDIR /app

COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --deploy --system

To build the image:

$ deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
$ docker build -t myapp-deps:$deps_version -f Dockerfile.deps .

This image can then be built on a regular basis by some cronjob/CI workflow, since dependencies often don't change as frequently as the actual application code.

For the application image now, we can use ARG before the first FROM to specify the exact tag of our dependencies image:

# ./Dockerfile

ARG DEPS_VERSION=latest
FROM myapp-deps:${DEPS_VERSION}

COPY . /app
RUN pip install .

CMD ["python", "-m", "myapp.run"]

To build the image we need to pass the dependencies version via --build-arg:

$ deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
$ docker build -t myapp:1.2.0 --build-arg DEPS_VERSION=$deps_version .

One caveat of this method is that the specific dependencies image might not have been built already, so you have to check your registry and trigger a build if needed before building the application image.
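That registry check could be sketched as a CI helper like the following (the registry and image names are hypothetical, and it assumes jq is available and the registry supports docker manifest inspect):

```shell
# Hypothetical CI helper: build and push the deps image only when the tag
# derived from Pipfile.lock's content hash is not in the registry yet.
build_deps_if_missing() {
  deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
  if ! docker manifest inspect "registry.example.org/myapp-deps:${deps_version}" >/dev/null 2>&1; then
    docker build -t "registry.example.org/myapp-deps:${deps_version}" -f Dockerfile.deps .
    docker push "registry.example.org/myapp-deps:${deps_version}"
  fi
  # Print the tag so the caller can feed it to the application image build
  echo "${deps_version}"
}
```

The application image can then be built with docker build --build-arg DEPS_VERSION=$(build_deps_if_missing) -t myapp:1.2.0 . as in the snippets above.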

@haizaar

haizaar commented Dec 9, 2018

@fusillicode
You are welcome. May I ask why you ended up with a Pipfile -> requirements.txt conversion? I guess you do pipenv lock -r | pip install -r /dev/stdin; if so, you know you can do pipenv install --system --deploy, or, using my side-install method, PIP_USER=1 pipenv install --system --deploy, right?
The bonus of using pipenv is when you have installs from private repos - pipenv will install them for you, while converting to pip will require additional credentials/configuration setup.

@haizaar

haizaar commented Dec 9, 2018

@slint
I like the meta-hash trick.
Though, what benefit does this approach have compared to doing both steps in a single Dockerfile? If the dependencies, i.e. Pipfile.lock, don't change, then the Docker cache will be reused, yielding practically the same build speed. Do you see something different?

@slint
Author

slint commented Dec 9, 2018

@haizaar This is an additional optimization in case you're not building your images locally, but on e.g. a CI/CD service, which runs your docker build ... scripts on a random VM/container every time and thus cannot benefit from image layer caching. If you always build/push your images from your local machine (which is perfectly fine), then your multi-stage build is actually faster and less complex to execute.

@haizaar

haizaar commented Dec 10, 2018

Thanks @slint, I see your point.

@lukasz-madon

I'd recommend multistage builds with docker and pipenv

FROM python:3.7 AS base

ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8

WORKDIR /src

FROM base AS build

RUN pip install pipenv

...

# -- Adding Pipfiles, changes should rebuild whole layer
COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock

# -- Install dependencies: --deploy aborts if Pipfile.lock is out of date or the Python version is wrong
RUN pipenv install --dev --deploy --system


# use alpine for production to keep the image size small
FROM python:3.7-alpine AS release

...
# install dependencies from Pipfile.lock for reproducible builds
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --system --deploy --ignore-pipfile

@HectorOrdonezCergentis

@lukasz-madon Thanks for sharing! I don't think I understand your solution; you are installing dependencies twice, once with dev and once without. I also do not see how you are using the build stage - I don't see any interactions with it. Could you elaborate? Thank you!

@oz123
Contributor

oz123 commented Jan 23, 2022

Note added here:
https://pipenv.pypa.io/en/latest/basics/#pipenv-and-docker-conatiners

This is a recommendation, not an absolute truth.

@oz123 oz123 closed this as completed Jan 23, 2022
@aehlke

aehlke commented Feb 16, 2022


9 participants