poetry install --no-root with only poetry.lock #1301

Open
2 tasks done
eliasmistler opened this issue Aug 15, 2019 · 44 comments
Labels
area/installer Related to the dependency installer kind/feature Feature requests/implementations

Comments

@eliasmistler

  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have searched the documentation and believe that my question is not covered.

Feature Request

When doing poetry install, both pyproject.toml and poetry.lock are required. This makes sense as the package itself is installed along with its dependencies.

However, this breaks caching in Docker. Let's say I have these lines in docker:

COPY pyproject.toml poetry.lock README.md /
RUN  poetry install --no-dev

(Note how I also need the README.md as that is referenced in the pyproject.toml)

The main problem here is: On every build, I bump the version using poetry version .... This changes the pyproject.toml file, therefore breaks the docker caching mechanism. In comparison, with an additional requirements.txt, I can do something like:

COPY requirements.txt /
RUN pip install -r requirements.txt
COPY pyproject.toml poetry.lock README.md /
RUN  poetry install --no-dev

In this case, all the dependencies (which quite often do not change between builds) can be cached. This speeds up my docker build by 80-90%, but it's obviously ugly to have to rely on requirements.txt.

FYI: I have a workaround for requirements.txt, where I simply generate it from poetry.lock:

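import toml  # third-party TOML parser (pip install toml)


def _poetry_lock_to_requirements_txt(start_dir):
    # Read the lock file that Poetry generated for the project.
    with open(f'{start_dir}/poetry.lock') as f:
        lock = toml.load(f)

    # Map each locked package name to its exact pinned version.
    requirements = {package['name']: package['version']
                    for package in lock['package']}

    # Write one "name==version" line per package, sorted by name.
    with open(f'{start_dir}/requirements.txt', 'w') as f:
        f.writelines(f'{name}=={version}\n' for name, version in sorted(requirements.items()))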

Alternatively, I could do some trickery around when to (re-)set the version, but this is getting uglier and uglier [plus I need requirements.txt anyway due to another issue with poetry that I find really hard to pin down - more on that soon]

I would suggest adding the flag poetry install --dependencies-only which only requires poetry.lock, not pyproject.toml
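
One possible form of the version "trickery" mentioned above, as a rough sketch and not something Poetry provides: before the early COPY, write out a copy of pyproject.toml whose [tool.poetry] version is replaced by a fixed placeholder, so the file content stays identical across version bumps and the cached layer can be reused. The function name neutralize_version and the "0.0.0" placeholder are made up for illustration, and the line-based parsing assumes a conventionally formatted file (section headers on their own line, no trailing comments on them).

```python
import re

# Matches table headers such as [tool.poetry] or [[tool.poetry.source]].
SECTION_RE = re.compile(r"^\s*\[+([^\[\]]+)\]+\s*$")
# Matches a top-level `version = "..."` assignment on a single line.
VERSION_RE = re.compile(r'^(\s*version\s*=\s*)"[^"]*"(.*)$')


def neutralize_version(pyproject_text: str, placeholder: str = "0.0.0") -> str:
    """Return the text with the [tool.poetry] version set to a placeholder."""
    section = None
    out = []
    for line in pyproject_text.splitlines():
        header = SECTION_RE.match(line)
        if header:
            # Remember which table the following keys belong to.
            section = header.group(1).strip()
        elif section == "tool.poetry":
            # Only the project version is touched; dependency constraints
            # live in other tables and are left alone.
            line = VERSION_RE.sub(
                lambda m: f'{m.group(1)}"{placeholder}"{m.group(2)}', line
            )
        out.append(line)
    return "\n".join(out) + "\n"
```

The neutralized file would be the one copied in before installing dependencies; the real pyproject.toml is copied in later together with the source.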

@dmontagu

dmontagu commented Aug 15, 2019

In the preview version there is an export command to have poetry generate the requirements.txt; I’ve been using that in my own projects (and otherwise following almost exactly the same steps you described).

That said, I would also prefer to make use of this feature if it existed.

@brycedrennan brycedrennan added the kind/feature Feature requests/implementations label Aug 16, 2019
@a-recknagel

Honest question, why would you use poetry to set up a container? pip (with export from preview builds, as dmontagu pointed out) is enough. See here for a Stack Overflow post where someone had a different problem with using poetry in docker, where just using pip seems a good alternative.

@iwoloschin

I'm using poetry in a docker container because of subtle bugs like #1129. I probably could work around it by not having optional git dependencies or by creating an internal package repo and building/publishing packages, or manually editing the broken METADATA file, but it's a lot easier to just install the couple of internal dependencies straight out of our git hosting by using poetry instead of pip. Silly, but effective, and since my container is a multibuilder pattern poetry doesn't even wind up in the actual deployed image.

Separately, there's a bigger question of ergonomics. Typing an extra poetry export ... command isn't really a big deal, but it's one more thing to forget before building a container. I could write my own tooling to manage that, but this could also be a great use case for a simple poetry plugin that always runs poetry export before docker build, or something like that. No need to go crazy here, but if this is a common use case then it probably makes sense to have an obvious, standard way to handle it.

@a-recknagel

This isn't related to the main post that closely, but a well-implemented solution to #537 with a poetry bundle command that allows a deployment-focused build of a poetry project would take care of your specific issue for sure (and also allow me to use poetry instead of pip in my docker containers, which I'd prefer but currently can't).

But yeah, it probably won't help with the docker cache, since a bundle wouldn't distinguish between dependencies (which are mostly static between builds) and the package that is under development and changes in every build...

@dmontagu

dmontagu commented Aug 28, 2019

@a-recknagel I actually currently use poetry in all my containers used to deploy my poetry projects (despite also using requirements.txt) -- like @iwoloschin I've run into subtle issues when not using poetry directly; in particular, I previously had issues with the exported requirements.txt generating an invalid environment (I think there was an issue where the platform flags were generated wrong and I was missing one required dependency). Also, in some of my projects, I use poetry to perform C-extension compilation; I develop on macOS so I also use poetry install to build the extensions in the (non-macOS) deployment container.

I just use the requirements.txt to facilitate build caching, then add the pyproject.toml and poetry.lock and run poetry install (which runs much faster with the dependencies already pip installed, and the pip install step can generally stay cached for a while).

From more of an abstract perspective, since I use poetry to set up my development environment, I just use poetry to set up the deployment environment to ensure it is as similar as possible (with minimal effort). If I could spare the time/energy to optimize container size or maximally lock down the image, I would certainly try to remove poetry from the final image, but this approach has been most efficient for me in development so far.

@dmontagu

dmontagu commented Aug 28, 2019

@iwoloschin On the topic of export ergonomics, I have the following commands in a lock.sh script:

poetry lock
poetry export -f requirements.txt --without-hashes > requirements_tmp.txt
mv requirements_tmp.txt requirements/requirements-poetry.txt

(I use requirements-poetry.txt so that I can include it into a requirements.txt which also lists some pre-built wheels for local path-based installation, which I generally update more frequently than I update the non-local dependencies.)

(Also, I think the without-hashes is safe since I follow up with a poetry install during the image-building process anyway, which should check the hashes.)

In the dockerfile, it looks like:

RUN pip install "poetry==${POETRY_VERSION}"

COPY ./app/requirements/requirements-poetry.txt /app/requirements/
RUN pip install -r /app/requirements/requirements-poetry.txt

COPY ./app/requirements /app/requirements
RUN pip install \
    --find-links=/app/requirements/wheels \
    -r /app/requirements/requirements.txt

COPY ./app /app
RUN poetry install --no-dev

Since the export happens automatically whenever I lock the dependencies, I never really have issues with forgetting a step prior to building the container.

@a-recknagel

a-recknagel commented Aug 28, 2019

@dmontagu I tried using both pip and poetry in my CI before, and I had a bad time due to poetry implicitly creating virtual envs on a fresh poetry install. I think it should work if the pre-poetry dependency installation with pip creates a virtualenv before, but at that point I just dropped poetry because it felt like too much work to accommodate it (and because I also like my containers as small as possible). Have you found a way around that?

I can see why having an "as similar as possible to dev" approach to build and deployment environments is a good idea, so I'd like to improve there.

@dmontagu

dmontagu commented Aug 28, 2019

@a-recknagel
You can prevent poetry from creating a virtual environment by setting the environment variable POETRY_VIRTUALENVS_CREATE=false (I think this might be preview only; not sure when this was added). Note that this can cause problems if you call poetry install --no-dev, as the no-dev will cause it to remove packages from the environment if they are dev-only in your project. In particular, this can break poetry by removing one of its own dependencies if this is performed in the same environment in which poetry was installed in your image (at least, if you use pip to install it in the image; maybe the recommended install script wouldn't have this problem).

When I have run into this problem in the past, I've worked around it by just manually pip-reinstalling the uninstalled packages after the poetry install --no-dev call.


Another approach I have used is to add a placeholder pyproject.toml prior to the pip install -r requirements.txt with these contents:

[tool.poetry]
name = "app"
version = "0.1.0"
description = "Placeholder"
authors = ["Placeholder"]

Then run poetry install (I think...) to create the virtual environment, then follow all the steps from the dockerfile listed above (using poetry run or modifying the path where necessary). Ultimately it results in the pyproject.toml being replaced, but the same virtual environment is still used (and plays nice with the docker cache).


After reminding myself of all of these tricks necessary to get this to work nicely, I'm definitely in favor of a --dependencies-only flag to poetry install that would simplify this whole process!

@a-recknagel

POETRY_VIRTUALENVS_CREATE=false

Didn't know that, nifty. I'm using preview anyways for poetry export so that won't be a problem.

The --no-dev install will work nicely with the script-installed poetry, it vendors all its dependencies for cases like this one. I think this tips the scale for me, --dependencies-only will make everything nicer but I can keep a hack or two around until something like it has arrived.

@dmontagu

@a-recknagel Any suggestions for how to make use of the installer script in a container? Would you just call the curl -sSL command in the dockerfile? Store it in the repo and add it to the container in the dockerfile? (For some reason it feels more dangerous to me than relying on PyPI, I'm not sure that's justified though.)

Obviously there are many ways it could be done, but I'd be interested to hear if anyone had thought through the "best" way to accomplish this.

@iwoloschin

@dmontagu I just do this:

# Install Poetry & ensure it is in $PATH
RUN curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | POETRY_PREVIEW=1 python
ENV PATH "/root/.poetry/bin:/opt/venv/bin:${PATH}"

As you can see, I'm also using a virtual environment, because then later on I just copy the entire virtual environment out of the builder into the final container, leaving Poetry (and any other build dependencies!) behind. This is an artifact of me using an Alpine base, which makes wheels hard to use, so I need to install gcc & friends, but obviously do not want to keep those all around in the final image!

Generally the installer script portion is cached unless I do a --no-cache build. I find this to be an acceptable risk for me. I could see a healthy debate about that subject, but it'd quickly turn into "vendor all of Poetry!", which seems unreasonable from my perspective.

@a-recknagel

a-recknagel commented Aug 28, 2019

@dmontagu I was going to comment on the downside of the script installer, namely having curl in the image. Since it's just ~220KB I don't care that much about it, but since I already have a script folder in all my projects I might as well throw get-poetry.py in there and add it that way instead. Both sound a little ugly but OK to me. Since it's advertised as the recommended way to install poetry, I'd expect stability as good as or better than PyPI.


Now that I read it, @iwoloschin's approach sounds best. After installing poetry, do you run the dependency install with poetry run pip install -r requirements.txt? Mind also telling us how you copy the env? Is it more complicated than moving the site-packages over into the system python?

@iwoloschin

Sure, here's a minimal version of my dockerfile:

FROM python:3-alpine as python_builder
RUN apk update && \
  apk add \
  build-base \
  curl \
  git \
  libffi-dev \
  openssh-client \
  postgresql-dev

# Install Poetry & ensure it is in $PATH
RUN curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | POETRY_PREVIEW=1 python
ENV PATH "/root/.poetry/bin:/opt/venv/bin:${PATH}"

# Install any dependencies, this step will be cached so long as pyproject.toml is unchanged
# This could (should?) probably be changed to do a "pip install -r requirements.txt" in order
# to remain more cacheable, but it really depends on how long your dependencies take to install
# With more than one source file, the COPY destination must be a directory ending in "/"
COPY poetry.lock pyproject.toml /opt/project/
RUN python -m venv /opt/venv && \
  source /opt/venv/bin/activate && \
  pip install -U pip && \
  cd /opt/project && \
  poetry install --no-dev --no-interaction

# Install the project itself (this is almost never cached)
COPY . /opt/project
RUN source /opt/venv/bin/activate && \
  cd /opt/project && \
  poetry install --no-dev --no-interaction

# Below this line is now creating the deployed container
# Anything installed above but not explicitly copied below is *not* available in the final container!
FROM python:3-alpine

# Any general deployed container setup that you want cached should go here

# Copy Project Virtual Environment from python_builder
COPY --from=python_builder /opt /opt

# Add the VirtualEnv to the beginning of $PATH
ENV PATH="/opt/venv/bin:$PATH"

CMD ...

This allows my container to be moderately cacheable, so long as pyproject.toml and poetry.lock remain unchanged. It could be improved by generating requirements.txt and pip install -r requirements.txt instead of the first poetry install ..., and in fact perhaps I'll go play around with @dmontagu's lock.sh idea and see if that helps me out at all. In theory, the second poetry install ... could handle any updated dependencies until you've got time to trigger a --no-cache build.

Of course, it goes without saying that the primary benefit here is that you can be "lazy" and use poetry to install your project in your container, but also get the benefit of not having poetry, or any other build-time dependencies in your final container. Eventually this should all just be replaced by a pip install command, but until Poetry matures a bit more this is an acceptable solution, particularly because it means I get to use Poetry to manage my dependencies in my development environment and not manually generate a requirements.txt file, and that is awesome!

@jackemuk

jackemuk commented Aug 30, 2019

Issue #1333 was closed because I was told this issue was "very similar".

The "poetry install" command documentation states that the command is supposed to read the pyproject.toml file from the current project, resolve the dependencies, and install them. Poetry should NOT be installing the current project as a dependency.

The project is not a dependency to itself.

If the current package is to be installed then that's a deployment, not dependency install. poetry install should only read the pyproject.toml and install dependencies. If a poetry.lock file exists, it could obey the locked dependencies. If the poetry.lock file does or does not exist, running poetry install should only install dependencies.

That's why #1333 is really a bug and only partially related and should not have been closed.

@a-recknagel

a-recknagel commented Aug 31, 2019

@jackemuk Of course a project depends on itself during development, poetry behaving like that isn't a bug. How are you going to test your library if it's not installed?

@brycedrennan
Contributor

@a-recknagel I don't yet know enough about this topic to have a position but generally I'd test my library by running the test suite or installing it as a dependency of a different application.

@a-recknagel

a-recknagel commented Sep 1, 2019

@brycedrennan and the test suite should run against an installed version of your library, and not a bunch of source files in your working directory that may or may not behave similarly. Having a dev-install of the library you are working on is essential in order to have behavior that is as close to deployment as possible.

Other reasons were also discussed in one of the earliest tickets in this project where people asked for a self install of the package in development - a request that is pretty much the polar opposite of #1333.

@brycedrennan
Contributor

brycedrennan commented Sep 2, 2019

For many basic python libraries "installation" is just putting source files on the python path.

I also use pip install -e sometimes. It's super useful. But I wouldn't want to be obligated to use it every time. As noted in this and the other ticket, it's not always desired.

I agree that this poetry behavior is not a bug since it was quite intentionally made to behave this way. I do find it to be unexpected default behavior though.

@jackemuk

jackemuk commented Sep 2, 2019

@a-recknagel That's why there is a build command, to build and install the package/application. The application or library should not be installed to the development environment if the wheel or dist package has not been created.

When another developer checks out the repo and installs the dependencies, the application that's being developed should not be installed automatically. It's not dependent on itself, because the source is in the development environment. This creates a conflict between the source and the package installed in site-packages.

We test our application either by running it or by running the test suite in the development environment, or we build a wheel to deploy to a separate testing environment.

The default should not install the package/application that is inside the pyproject.toml. If the package/application you're working on is a dependency, then it should be listed in the dependency section in the pyproject.toml file.

Just because it was an early request doesn't make it right as the default behavior. Make self-install an option of the install command, and add a user-defined setting that makes the install command self-install the package by default. That solves both issues.

But, again, self install should not be the default when installing the dependencies listed in the pyproject.toml file.

@brycedrennan a bug is unexpected behavior. The documentation doesn't say that it self installs the current environment, and therefore I consider it a bug. Either way, it shouldn't be the default behavior, but having the ability to configure the default behavior of the install command would solve the issue.

Making the default behavior be the equivalent to "pip install -e ." is contrary to working in a development environment, especially when there is no option to turn this behavior off.

@a-recknagel

I don't want to hijack this comment section any longer to discuss a different issue, so this will be my last post regarding it. We're probably just going to have to disagree.

The application or library should not be installed to the development environment if the wheel or dist package has not been created.

A dev install doesn't need to build a wheel/dist/sdist/whatever, it just needs an .egg-info in the source package and an .egg-link in site-packages. No packaging takes place, it links to the source files directly.

[Installing the application in development] creates a conflict between the source and the package installed in site-packages.

No, it doesn't. You shouldn't have been consulting your source files in the first place when testing the behavior of a library, so there is no conflict. It will work in most cases because python does nice things like adding the current directory to the PYTHONPATH, but it leads to things like "but it works on my computer" down the line. Just treat your library like a distribution (read: make a dev-install for test suites and the like) and avoid that can of worms. I seriously don't see why it bothers you so much.

We test our application either by running it or by running the test suite in the development environment, or we build a wheel to deploy to a separate testing environment.

In that case use pip to install it, not poetry. poetry is a package manager targeting development, not build/test/deploy environments. It will always cater to dev environment first and foremost.

If the package/application you're working on is a dependency, then it should be listed in the dependency section in the pyproject.toml file.

This just feels like a philosophical issue. Yeah, maybe it should be. But it wasn't, maybe because it was considered redundant to say that a library needs its own packaged code/config files/binaries in order to be valid, who knows.

Just because it was an early request, doesn't make it right being the default behavior.

Fair enough. But it means that you need a lot of public support to change it.

The documentation doesn't say that it self installs the current environment

I'd agree here, but that point alone is hardly worth this much noise.
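
For readers following the source-tree versus installed-package argument above, a small diagnostic sketch can show where Python would actually import a module from. It uses only the standard library; the function name import_location and its labels are invented for illustration, and the classification is a heuristic (an editable install, for instance, shows up as "elsewhere" because it points back at the source tree).

```python
import importlib.util
import sysconfig
from pathlib import Path


def import_location(module_name: str) -> str:
    """Classify where `import module_name` would load its code from."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return "not found"
    if spec.origin is None or not Path(spec.origin).exists():
        # Namespace packages, built-in and frozen modules have no file on disk.
        return "no file"
    origin = Path(spec.origin).resolve()
    paths = sysconfig.get_paths()
    # Check site-packages before the stdlib, because site-packages
    # usually lives inside the stdlib directory.
    for label, key in (
        ("site-packages", "purelib"),
        ("site-packages", "platlib"),
        ("stdlib", "stdlib"),
    ):
        if Path(paths[key]).resolve() in origin.parents:
            return label
    # Anything else: the working directory, an editable install, etc.
    return "elsewhere"
```

Calling import_location with the name of the package under development, once from the project root and once from another directory, is one way to see whether tests are exercising the working copy or an installed distribution.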

@jackemuk

jackemuk commented Sep 5, 2019

@a-recknagel
Your statement proves my point!

Poetry is a package manager targeting development, not build/test/deploy environments. It will always cater to dev environment first and foremost.

Therefore it should not build and install the current development package as the default behavior.

I don't want to hijack this comment section any longer to discuss a different issue...

That's because my issue was closed and the discussion was redirected to this issue. So unless #1333 is opened again for discussion by @brycedrennan, this is unfortunately where it landed.

@trim21
Contributor

trim21 commented Sep 8, 2019

It's not a good idea to change the default behavior; we could add an option --dependencies-only or an environment variable to configure it.

The export command does solve this problem, but maintaining a requirements file alongside poetry.lock is quite unnecessary, as all the information we need already exists in the lock file.

Also, in some situations we can't export before some other system (like a CI/CD pipeline) runs docker build, so we may need to keep requirements files in the repo and keep them in sync with the lock file.

I forgot about docker multi-stage builds.

I'm now doing the same thing as eliasmistler: using a python script to read the lock file and export a requirements file with dev dependencies (in CI) or without (in production), without installing poetry in my docker image, only the toml package. This requirements file only changes when the lock file changes.
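
A script along the lines described above might look like the following sketch. It is not trim21's actual script. It assumes the lock file marks dev-only packages with category = "dev" (lock files written by other Poetry versions may not carry that key, in which case everything is treated as a main dependency), and it ignores markers, extras, and non-PyPI sources. The function names are hypothetical.

```python
from typing import Any, Dict, List


def requirements_from_lock(lock: Dict[str, Any], include_dev: bool = False) -> List[str]:
    """Turn parsed poetry.lock data into sorted 'name==version' lines."""
    lines = []
    for package in lock.get("package", []):
        # Assumption: dev-only packages are marked with category = "dev";
        # packages without a category are treated as main dependencies.
        if not include_dev and package.get("category", "main") == "dev":
            continue
        lines.append(f"{package['name']}=={package['version']}")
    return sorted(lines)


def write_requirements(lock_path: str, out_path: str, include_dev: bool = False) -> None:
    """Read poetry.lock from disk and write a requirements file next to it."""
    import toml  # third-party parser, the same one used earlier in this thread

    with open(lock_path) as f:
        lock = toml.load(f)
    with open(out_path, "w") as f:
        f.writelines(line + "\n" for line in requirements_from_lock(lock, include_dev))
```

In CI this would be called with include_dev=True, and for the production image with the default.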

@dvf

dvf commented Nov 21, 2019

If you're just trying to get your CI environment to build fast, (CircleCI, GitHub Actions, Travis etc.) you can take advantage of file system caching the venv by using the hash of poetry.lock.

I wrote a tutorial on how to do this if you're interested.
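
The idea of keying a cached virtual environment on the hash of poetry.lock can be sketched in a few lines. This is an illustration of the general technique, not the content of the linked tutorial; the function name venv_cache_key and the "venv" prefix are arbitrary choices.

```python
import hashlib
from pathlib import Path


def venv_cache_key(lock_path: str, prefix: str = "venv") -> str:
    """Build a cache key that changes only when the lock file's bytes change."""
    digest = hashlib.sha256(Path(lock_path).read_bytes()).hexdigest()
    return f"{prefix}-{digest}"
```

A CI job would restore the cached environment stored under this key if one exists, and otherwise install the dependencies and save the environment under it. Most CI systems offer a built-in file-checksum helper that does the same thing in their configuration syntax.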

@mcouthon
Contributor

mcouthon commented Apr 4, 2020

This issue is related to #1899. I posted my current solution for caching there, and I'll post it here too, if it might help anyone:

# Only copying these files here in order to take advantage of Docker cache. We only want the
# next stage (poetry install) to run if these files change, but not the rest of the app.
COPY pyproject.toml poetry.lock ./

# Currently poetry install is significantly slower than pip install, so we're creating a
# requirements.txt output and running pip install with it.
# Follow this issue: https://github.com/python-poetry/poetry/issues/338
# Setting --without-hashes because of this issue: https://github.com/pypa/pip/issues/4995
RUN poetry config virtualenvs.create false \
                && poetry export --without-hashes -f requirements.txt --dev \
                |  poetry run pip install -r /dev/stdin \
                && poetry debug

COPY  . ./

# Because initially we only copy the lock and pyproject file, we can only install the dependencies
# in the RUN above, as the `packages` portion of the pyproject.toml file is not
# available at this point. Now, after the whole package has been copied in, we run `poetry install`
# again to only install packages, scripts, etc. (and thus it should be very quick).
# See this issue for more context: https://github.com/python-poetry/poetry/issues/1899
RUN poetry install --no-interaction

# We're setting the entrypoint to `poetry run` because poetry installed entry points aren't
# available in the PATH by default, but it is available for `poetry run`
ENTRYPOINT ["poetry", "run"]

@pikeas

pikeas commented Aug 8, 2020

I agree with most of what's been said here regarding build cache breakage. Fundamentally, it doesn't seem like pyproject.toml should be required to install dependencies. That file is becoming a de facto standard for general Python tooling, such as black, isort, and pylint. These tools have a lifecycle for iteration and updates that is completely unrelated to package versioning, which is managed in the same file.

I wish I could do:

COPY --chown=app:app poetry.lock ./
RUN poetry install --no-root --no-dev # Bonus points: make this a combined flag like --only-prod-deps
COPY --chown=app:app pyproject.toml ./
COPY --chown=app:app src/ ./src
RUN poetry install --no-dev
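
To illustrate the point about unrelated settings sharing one file, here is a sketch of a fingerprint computed only from the dependency-related parts of an already parsed pyproject.toml, so that a version bump or a changed black setting leaves it untouched. The choice of keys and the function name dependency_fingerprint are assumptions made for this example, not a description of how Poetry decides anything.

```python
import hashlib
import json
from typing import Any, Dict

# Assumption: these [tool.poetry] keys are the ones that affect what gets installed.
RELEVANT_KEYS = ("dependencies", "dev-dependencies", "extras", "source")


def dependency_fingerprint(pyproject: Dict[str, Any]) -> str:
    """Hash only the dependency-related tables of a parsed pyproject.toml."""
    poetry = pyproject.get("tool", {}).get("poetry", {})
    relevant = {key: poetry.get(key) for key in RELEVANT_KEYS}
    # sort_keys makes the serialization independent of key order.
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

A build script could compare this fingerprint between commits to decide whether the dependency layer needs rebuilding at all.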

@TBBle
Contributor

TBBle commented Jan 4, 2021

The poetry install help implies that pyproject.toml is not required if poetry.lock is present:

The install command reads the poetry.lock file from
the current directory, processes it, and downloads and installs all the
libraries and dependencies outlined in that file. If the file does not
exist it will look for pyproject.toml and do the same.

i.e. it suggests that only if poetry.lock is not found, will it go looking for pyproject.toml.

So that would definitely be a huge step forward.

@robertlagrant

@TBBle yes, that's always been a bit confusing! I've never seen an explanation as to why both files are necessary; without that, any arguments for needing both because poetry is a "Dev first" tool seem a bit insubstantial. If there's a reason for needing both, then that sort of prioritisation call is necessary. If not, then why not just solely rely on poetry.lock if it's present?

@akpircher

I was looking for something else when I saw this, and wanted to offer my workaround for the caching, as I think it results in a smaller docker image and might be faster for rebuilds.

I didn't think any of my docker images actually needed poetry installed, just the requirements export from it (and maybe the built wheel), so I did all of my poetry-dependent steps in a "builder" stage:

FROM python:3 as builder

SHELL ["/bin/bash", "-xeuo", "pipefail", "-c"]
RUN curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python - \
    && ln -s "$HOME/.poetry/bin/poetry" /usr/local/bin/poetry

WORKDIR /app
COPY poetry.lock pyproject.toml ./
RUN mkdir -p /usr/local/build \
    && poetry export -o /usr/local/build/requirements.txt --without-hashes # --dev

# If you don't need the package itself installed, skip this step
COPY . /app
RUN rm -rf dist \
    && poetry build -f wheel \
    && mv dist/*.whl /usr/local/build \
    && echo "package-name @ file://$(ls /usr/local/build/*.whl)" >/usr/local/build/package.txt

FROM python:3

COPY --from=builder /usr/local/build/requirements.txt /usr/local/build/
RUN pip3 install -r /usr/local/build/requirements.txt \
    && rm -rf ~/.cache/pip

# or alternatively copy your working directory here
COPY --from=builder /usr/local/build/ /usr/local/build/
RUN pip3 install -r /usr/local/build/package.txt \
    && rm -rf ~/.cache/pip

CMD [ "python3", "-m", "package_name.entrypoint" ]
# CMD [ "python", "run.py"]

I hope this helps someone. I re-use this pattern in several projects at work, rebuilds take seconds (after making changes), but I also don't need poetry in my final docker image.
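
The echo line in the builder stage above writes a "name @ file://..." direct reference that pip can install from. As a hedged illustration of what that line contains, the same string can be derived from the wheel's path. The function name direct_reference is made up, the sketch assumes POSIX absolute paths, and it relies on the wheel filename convention that the distribution name is the part before the first hyphen.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def direct_reference(wheel_path: str) -> str:
    """Build a 'name @ file:///abs/path.whl' requirement line for a local wheel."""
    path = PurePosixPath(wheel_path)
    if not path.is_absolute():
        raise ValueError("wheel_path must be absolute to form a file:// URL")
    # Wheel filenames start with the distribution name, followed by '-'.
    name = path.name.split("-")[0]
    # Percent-encode unusual characters; '/' is left as-is by default.
    return f"{name} @ file://{quote(str(path))}"
```

Unlike the hard-coded "package-name" in the Dockerfile, this takes the name from the wheel itself, which may differ in spelling (underscores instead of hyphens) but refers to the same distribution.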

@remram44
Contributor

remram44 commented Jul 2, 2021

@akpircher Unfortunately this fails if you have path dependencies:

 => ERROR [builder 5/5] RUN $HOME/.poetry/bin/poetry export -o /root/requ  1.1s 
------                                                                          
 > [builder 5/5] RUN $HOME/.poetry/bin/poetry export -o /root/requirements.txt: 
#12 0.986                                                                       
#12 0.986   ValueError
#12 0.986 
#12 0.986   Directory lib_profiler does not exist

I am using a custom script to read the poetry.lock in my project because of that and that feels very unnecessary.
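
A custom script of that kind might separate path dependencies from pinned ones, roughly as sketched below. This is not remram44's script. It assumes the lock file records a path dependency with a source table whose type is "directory" or "file" and whose url holds the path; that layout should be checked against the lock file actually in use. The function name split_lock_packages is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def split_lock_packages(lock: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Split parsed poetry.lock data into pinned requirements and local paths."""
    pinned: List[str] = []
    local: List[str] = []
    for package in lock.get("package", []):
        source = package.get("source") or {}
        # Assumption: path dependencies carry source.type "directory" or "file".
        if source.get("type") in ("directory", "file"):
            local.append(source.get("url", ""))
        else:
            pinned.append(f"{package['name']}=={package['version']}")
    return sorted(pinned), sorted(local)
```

The pinned list can go into a requirements file that is installed in an early, cacheable layer, while the local paths are installed after the corresponding directories have been copied into the image.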

(see #936 (comment))

@akpircher

akpircher commented Jul 8, 2021

@remram44 I'm actually pretty familiar with that. Back in poetry version 1.0.10, local dependencies were fine, because poetry used a relative path in the export. (I opened #3189 because the upgrade to poetry 1.1.2 broke my Dockerfile. For a hot minute, I had been purposefully downgrading to 1.0.10.)

My solution to it is kinda dumb, but it takes advantage of multi-stage dockerfiles. It basically boils down to consistent path names.

WORKDIR /usr/local/build/
COPY pyproject.toml poetry.lock ./
COPY dependencies dependencies/
RUN poetry export -f requirements.txt -o requirements.txt --without-hashes

# It's actually several copy commands of individual folders, so I don't copy the wheels over, but it's the gist
WORKDIR /usr/local/application-name
COPY . /usr/local/application-name

# I don't do this for this particular project, but I would if I actually needed to build it as a python wheel
RUN mkdir -p /usr/local/build-package \
    && poetry build -f wheel \
    && mv dist/*.whl /usr/local/build-package \
    && echo "package-name @ file://$(ls /usr/local/build-package/*.whl)" >/usr/local/build-package/package.txt
    
FROM python:3
# it HAS to be the same path.
COPY --from=builder /usr/local/build/ /usr/local/build/
RUN pip3 install --no-cache-dir -r /usr/local/build/requirements.txt \
    && rm -rf /usr/local/build/ ~/.cache/pip

# if needed, again the path MUST be the same
COPY --from=builder /usr/local/build-package/ /usr/local/build-package/
RUN pip3 install --no-cache-dir -r /usr/local/build-package/package.txt \
    && rm -rf /usr/local/build-package/ ~/.cache/pip

# Keep the non-"build" files somewhere else
COPY --from=builder /usr/local/application-name/ /usr/local/application-name/

The copy of individual folders is a bit tedious, you could arguably copy the entire directory and then remove the third-party/dependency folder manually, but the idea behind taking care of the copies in the builder is (ultimately) to reduce how many layers I'm pushing to a private repository.

@remram44
Contributor

remram44 commented Jul 8, 2021

So you copy the files into that first stage so export works, and then start over, installing the exported requirement first and the files (again) second in the other stage?

I suppose that works but I'd much rather someone fixed the export command.

@akpircher

akpircher commented Jul 8, 2021

I only install once, in the final stage. I also only export once in the builder stage. You don't need to install in order to export.

In the builder stage, I:

  1. copy pyproject.toml and poetry.lock over (along with the dependencies folder if needed; in that case I copy pyproject.toml and poetry.lock into that build folder)
  2. export the requirements.txt
  3. copy the application code over
  4. optionally build the wheel (which doesn't require installing) and create a different package.txt for the wheel

In the final stage, I:

  1. copy the /usr/local/build directory over
  2. pip install the requirements
  3. copy the application code over
  4. optionally install the built wheel, if needed

The general workflow isn't too different from what @mcouthon offered. The primary difference is that I use multi-staged dockerfiles so that I never need to run poetry install.

It seems like people in this thread wanted the actual package installed, so the "build wheel" step satisfies that, but it is completely optional. The use case I have for it (in some scenarios) is that the built wheel carries a version from setuptools-scm, which is displayed as the version when it's run.

Otherwise, I don't really see the point of it.

@akpircher

@remram44 I've created a dummy repo showing a concrete example of this. https://github.com/akpircher/multistage-dockerfile-demo.

It would be nice for the export command to be fixed/be able to handle relative paths, but if you need something now, this works, caches the builds, and doesn't require running poetry install at all. It's the solution I've come up with because I needed a solution 8 months ago.

@JohnPreston

JohnPreston commented Aug 8, 2021

As posted in #3374 and similar to @akpircher above.

A multi-stage build lets me install just what I need for the application by generating the .whl and passing it to the final image.

https://github.com/compose-x/ecs_composex/blob/main/cli.Dockerfile

@scratchmex

After seeing all these comments I did my research and I think I finally arrived at a solution to this caching problem.

As I understand it, the problem arises when you change something in pyproject.toml that does not actually affect the dependencies, so poetry.lock doesn't change. Even when it does change, we should still have a cache for the dependencies that did not change. Well, according to #3374 (comment), in buildx you can use --mount=type=cache.

I also tried to merge all the opinionated practices here and set up a template here: https://github.com/scratchmex/poetry-docker-template. If you have opinions let me know.

@neersighted neersighted changed the title poetry install is not Dockerfile cache compatible poetry install --no-root with only poetry.lock Oct 4, 2022
@neersighted neersighted added the area/installer Related to the dependency installer label Oct 4, 2022
@neersighted
Member

Renamed to better capture what makes this different from #4036 + best practices (like use of cache mounts).

@varun-seth

@mcouthon If pyproject.toml file has any change (like a change in the version of the project) then Docker cannot use cache for the subsequent lines, regardless of the method of installation (from inline-exported requirements or from the lock files). The following line breaks docker-caching.

COPY pyproject.toml poetry.lock ./

@mcouthon
Contributor

@varun-seth that's true, but I'm not sure what to do about it. Do you have any ideas?

@varun-seth

@mcouthon This script derives a minimal pyproject.toml from poetry.lock, using dummy values for name, version, description, and authors. Thus, the Docker layer cache is preserved.

COPY poetry_to_pyproject.py poetry.lock ./
RUN python poetry_to_pyproject.py
RUN poetry install

poetry_to_pyproject.py

@m-roberts

@varun-seth That's great! I have left a comment on your gist suggesting turning this into a Poetry plugin, to avoid the need to bundle the script as part of the Dockerfile.

@lokielse

lokielse commented Dec 7, 2023

It took me a long time to figure this out before I found this thread.
I agree that only the poetry.lock file should be needed.
It would be nice to have a feature similar to https://pnpm.io/cli/fetch

Here is the workaround I'm using:

FROM busybox:1.35.0 as lockfile
WORKDIR /app
COPY pyproject.toml ./
# replace first matched version in pyproject.toml to 0.0.0
RUN awk '/version = "[^"]*"/ && !done {sub(/version = "[^"]*"/, "version = \"0.0.0\""); done=1} 1' pyproject.toml > tmpfile && mv tmpfile pyproject.toml
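# Second stage: the FROM line is missing from the snippet as posted.
# Assumption: any base image with Python and Poetry available works here.
FROM python:3
RUN pip install poetry
WORKDIR /app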

COPY --from=lockfile /app/pyproject.toml ./
COPY poetry.lock ./
RUN touch README.md

RUN poetry install -vv --without dev

COPY . .
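
The awk step above rewrites only the first version assignment, so a version bump no longer changes the file that the dependency layer depends on. A rough Python equivalent is sketched below for cases where awk is not convenient. `neutralize_version` is a hypothetical helper name, and unlike the awk pattern it only matches at the start of a line, which is an assumption meant to avoid touching keys such as `target-version`.

```python
import re

# Matches a `version = "..."` assignment that starts a line.
VERSION_LINE = re.compile(r'^version\s*=\s*"[^"]*"', re.MULTILINE)


def neutralize_version(pyproject_text: str, placeholder: str = "0.0.0") -> str:
    """Replace the first version assignment with a fixed placeholder."""
    return VERSION_LINE.sub(f'version = "{placeholder}"', pyproject_text, count=1)
```

Running it twice yields the same text, so the resulting file only changes when something other than the version changes.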

@ZlatyChlapec

For anyone still looking for this functionality, you can use https://pdm-project.org/latest/. It has a pdm sync command, which installs dependencies from pdm.lock even when pyproject.toml is not present.

@jaklan

jaklan commented Aug 25, 2024

No activity for the last year (and no solution - I don't count creative workarounds - for more than 5...), so bumping - the issue is still fully valid. Other tools like pdm were able to solve it, what is the main blocker to fix it in case of Poetry?

@radoering
Member

what is the main blocker to fix it in case of Poetry?

Poetry re-resolves at install time with the lock file as the only source, which means it needs the dependency specification from pyproject.toml.

#9427 might be a game changer: building on it, this feature request might be easier to implement.
