
Docker layer caching-friendly workflow with pipenv #3285

Closed
slint opened this issue Nov 22, 2018 · 19 comments
Labels
Category: Docker Issue affects docker builds. Type: Discussion This issue is open for discussion. Type: Documentation 📖 This issue relates to documentation of pipenv.

Comments

@slint

slint commented Nov 22, 2018

The usual way to create a Docker-based deployment (e.g. for deploying on Kubernetes) for a Python application has looked something like this, using a requirements.txt produced by pip freeze or pip-compile:

FROM python:3.6

RUN mkdir /app
WORKDIR /app

# Only copy application dependencies to take advantage of image layer caching,
# i.e. if the "requirements.txt" file doesn't change, the layer is cached
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

# Install the actual application code now. Usually, this is the part of the code
# that changes, thus invalidating the image layer cache.
ADD . /app

# We actually need this step because our application code might have a "setup.py"
# which defines "entry_points". If that weren't the case, it could be optional
RUN pip install .

# Run the app...
CMD ["python", "-m", "myapp.run", ... ]

Using pipenv, I would imagine the equivalent would be something like:

FROM python:3.6

# Install "pipenv"
RUN pip install pipenv

RUN mkdir /app
WORKDIR /app

# In a similar fashion as before if the "Pipfile.lock" doesn't change, the
# image layer is going to be cached.
COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --deploy --system

ADD . /app
RUN pip install .

# Run the app...
CMD ["python", "-m", "myapp.run", ... ]

If this is something that others have come across and consider a best practice, I believe it would be useful to make it part of the official documentation, since pipenv is meant to be a solution for applications.

For example, before that, I thought that the logical thing would be doing something like:

# Install my application dependencies
pipenv install requests flask celery ...
pipenv install --dev pytest ...

# ...develop my app...

# Add the app to the Pipfile
pipenv install -e .

# commit everything
git add .
git commit ...

This would make it easy for someone to install the application locally with just a pipenv install --dev. The problem is that once the application package is part of the Pipfile, Docker layer caching is thrown out the window (i.e. one has to do ADD . /app much earlier so that pipenv install --system --deploy can find the application's setup.py).
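For context, pipenv install -e . records the application itself in the Pipfile, roughly like this (the package name is illustrative):

```toml
[packages]
myapp = {editable = true, path = "."}
```

Because this entry points at the local path, the sources (including setup.py) must already be in the image before pipenv install can run.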

(This issue is in no way meant to be a complaint or a back in the "pip"-days I used to...-kind of rant. I'm really just hoping for a good discussion with practical advice and seeing how others tackle this issue using "pipenv")

@techalchemy
Member

techalchemy commented Nov 24, 2018

I'm pretty sure the answer you want is to use a .dockerignore file:

.git
.svn
.cvs
.hg
README*.md
!README*.md

And yeah, we can definitely use a documentation note about this. I'm sure we already have some Docker info in the docs, so feel free to propose the changes you want to see once you work out the best way to structure things. (I don't use Docker directly, so I'll leave it to the other maintainers to verify.)

@techalchemy techalchemy added Type: Documentation 📖 This issue relates to documentation of pipenv. Category: Docker Issue affects docker builds. Type: Discussion This issue is open for discussion. labels Nov 24, 2018
@haizaar

haizaar commented Dec 3, 2018

@slint ,
I personally still find it easier to do it in two steps, as per your initial post.
One of the reasons is the outstanding pipenv bug #3148.

More details on how I do it in a Dockerfile: https://tech.zarmory.com/2018/09/docker-multi-stage-builds-for-python-app.html#pipenv

@fusillicode

fusillicode commented Dec 5, 2018

Hi @haizaar, sorry to bother you, but I'm really interested in what you achieved, i.e. slim & lean Docker images with pipenv through multi-stage builds.

I tried to adapt the example you linked, but without success.

In particular, it's not clear to me where the step that includes the app sources in the last stage of your build is. Am I missing something? 🤔

Btw, my setup only includes a Pipfile & a Pipfile.lock in the root of my repo, alongside the app sources, which reside in the app directory.

@Fongshway

Would using the ONBUILD convention outlined in https://github.com/pypa/pipenv/blob/master/Dockerfile work for your use case?

@fusillicode

I don't think so @Fongshway as my goal is to "cook" a lean & custom image with just what I need to run my application :(

@haizaar

haizaar commented Dec 7, 2018

@fusillicode
Maybe you've missed the part where your own app has a proper setup.py and installs under $PYROOT together with its dependencies - this is what pip install --user . does.

So eventually both your app and its dependencies end up under $PYROOT and you just grab them in the final stage.

I prefer to have a setup.py for my app, since it ensures a clean install - just copying files has a high chance of including irrelevant pieces, e.g. tests. Yes, .dockerignore is one way to solve that, but it's still tedious for all developers to have to remember about .dockerignore; and .dockerignore is ignored if you decide to tar your context because you want symbolic links resolved moby/moby#18789 (comment).

Hope this helps.
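For concreteness, here is a minimal multi-stage sketch of the $PYROOT side-install idea described above. It assumes pipenv's pip calls honor PIP_USER/PYTHONUSERBASE (as in the linked post); image names, paths, and the myapp module are illustrative, not anyone's actual setup:

```dockerfile
# --- Build stage: everything installs under $PYROOT via pip's "user" scheme
FROM python:3.6 AS builder
ENV PYROOT=/pyroot PYTHONUSERBASE=/pyroot
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
# PIP_USER=1 routes the installs into $PYROOT instead of system site-packages
RUN PIP_USER=1 pipenv install --system --deploy
# The app's own setup.py installs it next to its dependencies
COPY . .
RUN PIP_USER=1 pip install .

# --- Final stage: grab only $PYROOT; pipenv and build tooling stay behind
FROM python:3.6-slim
ENV PYTHONUSERBASE=/pyroot PATH=/pyroot/bin:$PATH
COPY --from=builder /pyroot /pyroot
CMD ["python", "-m", "myapp.run"]
```

Setting PYTHONUSERBASE in the final stage makes the interpreter pick up /pyroot/lib/python3.6/site-packages via the user site path, so nothing else needs to be copied.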

@haizaar

haizaar commented Dec 7, 2018

P.S. @fusillicode, if you really want to go lean, with final images under 40 MB, you can use python-minimal images.

@techalchemy
Member

FYI @haizaar, I didn't forget your issue; it just wound up being super complicated on the resolver side for some reason. I am very interested in any updates to the Docker documentation, as I am using pipenv in Docker myself now and am not that great with it yet :)

@haizaar

haizaar commented Dec 8, 2018 via email

@fusillicode

@haizaar I'm really sorry for the lateness of my reply 🙇‍♂️

Thanks a lot for pointing out the possible problem and solution, but even more for sharing your experience in this matter!

Actually, I solved my problem by generating, inside the "builder" stage of my multi-stage Dockerfile, a requirements.txt via pipenv, and then using it directly with pip. It seems to be working pretty well, considering that I ended up with a slimmer image than my first production one :)

If it would be helpful, I'll share the Dockerfile :)

Thanks a lot for your support, I really appreciate it! 🍻
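(For reference, the builder-stage conversion fusillicode describes might look roughly like this. This is a sketch, not his actual Dockerfile, and it assumes the 2018-era pipenv lock -r flag for exporting the lock file as requirements; image and path names are illustrative:)

```dockerfile
# --- Builder stage: only used to turn Pipfile.lock into a requirements.txt
FROM python:3.6 AS builder
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
RUN pipenv lock -r > requirements.txt

# --- Release stage: plain pip install, no pipenv in the final image
FROM python:3.6-slim
WORKDIR /app
COPY --from=builder /app/requirements.txt .
RUN pip install -r requirements.txt
COPY app/ app/
CMD ["python", "-m", "app"]
```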

@slint
Author

slint commented Dec 8, 2018

Another workflow we have been investigating is building two images:

A dependencies-only image, tagged with the _meta.hash.sha256 key from the Pipfile.lock:

# ./Dockerfile.deps

FROM python:3.6

RUN pip install --upgrade pip pipenv setuptools wheel

RUN mkdir /app
WORKDIR /app

COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --deploy --system

To build the image:

$ deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
$ docker build -t myapp-deps:$deps_version -f Dockerfile.deps .

This image can then be built on a regular basis by some cronjob/CI workflow, since dependencies often don't change as frequently as the actual application code.

For the application image now, we can use ARG before the first FROM to specify the exact tag of our dependencies image:

# ./Dockerfile

ARG DEPS_VERSION=latest
FROM myapp-deps:${DEPS_VERSION}

COPY . /app
RUN pip install .

CMD ["python", "-m", "myapp.run"]

To build the image we need to pass the dependencies version via --build-arg:

$ deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
$ docker build -t myapp:1.2.0 --build-arg DEPS_VERSION=$deps_version .

One caveat of this method is that the specific dependencies image might not have been built already, so you have to check your registry and trigger a build if needed before building the application image.
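That registry check could be sketched as a CI helper like the following (the registry and image names are hypothetical, and it assumes jq is available and the registry supports docker manifest inspect):

```shell
# Hypothetical CI helper: build and push the deps image only when the tag
# derived from Pipfile.lock's content hash is not in the registry yet.
build_deps_if_missing() {
  deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
  if ! docker manifest inspect "registry.example.org/myapp-deps:${deps_version}" >/dev/null 2>&1; then
    docker build -t "registry.example.org/myapp-deps:${deps_version}" -f Dockerfile.deps .
    docker push "registry.example.org/myapp-deps:${deps_version}"
  fi
  # Print the tag so the caller can feed it to the application image build
  echo "${deps_version}"
}
```

The application image can then be built with docker build --build-arg DEPS_VERSION=$(build_deps_if_missing) -t myapp:1.2.0 . as in the snippets above.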

@haizaar

haizaar commented Dec 9, 2018

@fusillicode
You are welcome. May I ask why you ended up with a Pipfile -> requirements.txt conversion? I guess you do pipenv lock -r | pip install -r /dev/stdin; if so, you know you can do pipenv install --system --deploy, or, using my side-install method, PIP_USER=1 pipenv install --system --deploy, right?
The bonus of using pipenv is when you have installs from private repos - pipenv will install them for you, while converting to pip will require additional credentials/configuration setup.

@haizaar

haizaar commented Dec 9, 2018

@slint
I like the meta-hash trick.
Though, what benefit does this approach have compared to doing both steps in a single Dockerfile? If the dependencies, i.e. Pipfile.lock, don't change, then the Docker cache will be reused, yielding practically the same build speed. Do you see something different?

@slint
Author

slint commented Dec 9, 2018

@haizaar This is an additional optimization in case you're not building your images locally, but on e.g. a CI/CD service, which runs your docker build ... scripts on a random VM/container every time and thus cannot benefit from image layer caching. If you always build/push your images from your local machine (which is perfectly fine), then your multi-stage build is actually faster and less complex to execute.

@haizaar

haizaar commented Dec 10, 2018

Thanks @slint, I see your point.

@lukasz-madon

I'd recommend multistage builds with docker and pipenv

FROM python:3.7 AS base

ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8

WORKDIR /src

FROM base AS build

RUN pip install pipenv

...

# -- Adding Pipfiles, changes should rebuild whole layer
COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock

# -- Install dependencies: --deploy aborts if Pipfile.lock is out of date or the Python version is wrong
RUN pipenv install --dev --deploy --system


# use alpine for production to keep the image size small
FROM python:3.7-alpine AS release

...
# install dependencies from Pipfile.lock for reproducible builds
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --system --deploy --ignore-pipfile

@HectorOrdonezCergentis

@lukasz-madon Thanks for sharing! I don't think I understand your solution; you are installing dependencies twice, once with dev and once without. I also do not see how you are using the build stage - I don't see any interactions with it. Could you elaborate? Thank you!

@oz123
Contributor

oz123 commented Jan 23, 2022

Note added here:
https://pipenv.pypa.io/en/latest/basics/#pipenv-and-docker-conatiners

This is a recommendation, not an absolute truth.

@oz123 oz123 closed this as completed Jan 23, 2022
@aehlke

aehlke commented Feb 16, 2022


9 participants