
Poetry package install time seems longer than installing with pip #338

Closed

MichaelAquilina opened this issue Jul 26, 2018 · 31 comments

Labels: area/installer (Related to the dependency installer), kind/feature (Feature requests/implementations)

@MichaelAquilina (Contributor) commented Jul 26, 2018

  • I am on the latest Poetry version.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option). (Does not apply)

Issue

I am reasonably sure that installing packages using poetry install instead of pip install takes significantly longer.

To compare: here is a pip install taking 35 seconds: https://circleci.com/gh/MichaelAquilina/S4/464#action-103

Here is another build with poetry install and the same requirements taking 1 minute 27 seconds: https://circleci.com/gh/MichaelAquilina/S4/521#action-104

In both cases, both dev and non-dev requirements were installed.

@sdispater (Member)

Thanks for your interest in Poetry!

This is expected: Poetry orders the packages so that the deepest packages in the dependency graph are installed first, to avoid errors at installation time. This requires installing the packages sequentially, which takes longer but is more "secure".

Also, Poetry checks the hashes of installed packages for security reasons, and due to the way pip works, Poetry has to generate a temporary requirements.txt file to make pip check hashes. This adds overhead, which explains the difference between the two tools.
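
For illustration, a hash-pinned requirements file looks like the fragment below (the package pin and digest are placeholders, not real values); when hashes are present, pip verifies every downloaded artifact against its pin before installing:

# hypothetical fragment of the temporary requirements file
foobar==10.0 \
    --hash=sha256:0000000000000000000000000000000000000000000000000000000000000000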

@MichaelAquilina (Contributor, Author)

Thanks for taking the time to explain :)

Would it be possible for Poetry to download the target packages in parallel and then install them in sequence, to speed up the process?
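
A rough sketch of that idea (my illustration, not Poetry's code; it assumes the wheel URLs are already resolved and topologically ordered):

import os
import subprocess
import tempfile
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def download_wheel(url):
    # fetch one wheel into a temporary file and return its path
    fd, path = tempfile.mkstemp(suffix=".whl")
    os.close(fd)
    urllib.request.urlretrieve(url, path)
    return path

def install_all(wheel_urls):
    # download everything in parallel...
    with ThreadPoolExecutor(max_workers=8) as pool:
        paths = list(pool.map(download_wheel, wheel_urls))
    # ...then install one by one, preserving the resolved order
    for path in paths:
        subprocess.run(["pip", "install", "--no-deps", path], check=True)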

@sdispater (Member)

I intend to improve the installation part of Poetry sometime in the future to speed things up, yes. I can't give you an ETA, though; the only thing I can tell you is that it will likely land after the 1.0 milestone, since I want to stabilize Poetry before changing critical parts like this one.

I'll be sure to keep you posted if anything changes on that front.

@sdispater added the area/installer (Related to the dependency installer) and kind/feature (Feature requests/implementations) labels on Jul 26, 2018
@MichaelAquilina (Contributor, Author)

That sounds great @sdispater. Of course I fully understand your reasoning and agree that ensuring stability is a lot more important than performance.

Thanks for the great work on Poetry!

@dbarrosop

Let me start by saying I recently discovered poetry and I think it's awesome :)

Now, is there any progress on this? I just migrated a couple of projects to poetry, and my container builds went from 2s to 5 minutes each. The reason it's so slow is twofold:

  1. When building containers I'd do something like this:
ADD requirements.txt /tmp
RUN pip install -r requirements.txt

ADD . /project
...

so I could cache that layer. With poetry I have to do:

ADD . /project
RUN cd /project && poetry install

This means I need to install the dependencies over and over, even though I haven't changed a single dependency. I am trying to do something equivalent with:

ADD poetry.lock /tmp
ADD pyproject.toml /tmp
RUN cd /tmp && /root/.poetry/bin/poetry update --no-dev

I am not sure this is the right thing to do, but it does the job. It feels extremely hacky, so I'd appreciate some advice :P (a possible alternative is sketched at the end of this comment).

  2. The second issue is the one described here. A clean install takes ~30s with pip and ~5 minutes with poetry :( Monitoring the network, I can see pip downloading everything very fast at a few MB/s, while poetry downloads packages one by one, never reaching more than a dozen KB/s.
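
Regarding point 1, a possibly less hacky pattern (a sketch, not official guidance; it assumes the poetry export command is available in your version and that installing from an exported requirements file is acceptable) is to copy only pyproject.toml and poetry.lock first, export, and let pip install in a cacheable layer:

ADD pyproject.toml poetry.lock /tmp/
RUN cd /tmp \
    && poetry export -f requirements.txt -o requirements.txt --without-hashes \
    && pip install -r requirements.txt

ADD . /project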

@DanCardin (Contributor)

Re this previous comment: it would be ideal if there were an interface to install when only the lock file is available, so that we can continue to make good use of Docker layer caching; I haven't been able to get an install to succeed when the source isn't also available.

@hvdklauw commented May 24, 2019

Just did a comparison for our 190-package project (according to poetry), and the timings for poetry, pipenv, and pip were the following:

Poetry (prerelease): 201.54 real   138.98 user   43.05 sys
Pipenv:               75.50 real   200.14 user   51.65 sys
Pip with hashes:      83.60 real    53.16 user   21.66 sys

Now yes, pipenv probably doesn't take installation order into account and just installs everything concurrently, and yes, I have had the problem where a package needed its dependencies present to install correctly (which shouldn't be the case, but sometimes you don't have control). Still, if you look at the dependency tree, you can install a lot of leaves and branches in parallel without any issue, because they don't meet until near the top of the graph.

Still, pip with a requirements file with hashes is a lot faster...

@pmav99 (Contributor) commented May 24, 2019

Still, pip with a requirement file with hashes is a lot faster

It feels like you are comparing apples to oranges here. Pip doesn't resolve dependencies. Poetry install does.

@aequitas

However, those dependencies are only resolved during the lock phase; the install phase (which runs more often than locking) just installs the packages.

Could we do the expensive ordering/parallelization calculations during the lock phase?

@hvdklauw

Yeah, dependency resolution taking long is fine. Even so, doing a poetry export -f requirements.txt to generate a requirements file with all dependencies and installing that with pip is faster than just using pure poetry.

Like I said, that skips the install ordering, which only seems to be needed when skipping it actually causes issues.

Maybe the solution would be to give the install command a flag (and environment variable) to install concurrently. The default would stay slow and safe, but if you know ordering isn't needed, it would speed up CI build times significantly.

@hvdklauw

Also, I just saw in the pip documentation that pip also installs packages in order. Now I'm really confused about what poetry is doing that makes it so much slower...

@matthiasgoergens

@hvdklauw The pip documentation just says that pip installs in topological order. That would still allow parallelism for packages that are not forced into a specific order by a dependency relationship.

(But looking at the discussions around pip, they don't do parallel installs either.)

@exhuma commented Sep 6, 2019

I've dug into this a bit, and it seems that the slowness originates from the fact that poetry runs a new pip subprocess for each package. This incurs the Python interpreter startup time for each package, which adds up when there are many packages to install. I noticed this using pycallgraph, which showed me this:

[pycallgraph output: the call graph shows one pip subprocess being spawned per installed package]

As I understand @sdispater, this is done to ensure the ordering of packages. However, pip as of version 6.0 installs packages in topological order, which IMO is good enough, so there is no longer a need to do the package ordering inside poetry.

Another reason is hash-checking, which is also supported by pip.

Hence, I don't see any benefit from running separate pip processes other than nicer console output.

@exhuma commented Sep 6, 2019

Another idea worth investigating is calling pip as an API. Something like this should be possible instead of using a subprocess:

from pip._internal.commands.install import InstallCommand

# equivalent to `pip install foobar==10.0`
cmd = InstallCommand()
cmd.main(['foobar==10.0'])

This would avoid the repeated Python startup time.

There are some issues with this though:

  • It imports an underscored (private) package name.
  • Without any modification, this would run inside the environment from which poetry is executed, which is unlikely to be the one we want to install the packages into.

This came out of a curiosity-driven investigation into pip and poetry, and the second bullet point alone makes me think that this is really not so easy to implement. Still, the pip options -t and --prefix may be worth investigating for this.
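
If --prefix pans out, the call might look like this (untested; it assumes InstallCommand.main accepts the same arguments as the pip install command line, and the target path is made up):

from pip._internal.commands.install import InstallCommand

cmd = InstallCommand()
# hypothetical: direct the install into the project's virtualenv
# rather than into the environment running poetry itself
cmd.main(['--prefix', '/path/to/project-venv', 'foobar==10.0'])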

If those don't pan out, I think it's still absolutely acceptable to call pip only once with all the dependencies (given that it supports topological ordering and hash verification).

@lephuongbg

We are currently using the following script in CI to make installation faster (2x in one of our projects):

poetry export -f requirements.txt --dev | poetry run -- pip install -r /dev/stdin

@petergaultney commented Nov 25, 2019

I realize this is probably naive, but having had some experience building my own tooling around pip install, I'm curious about something here.

Since poetry resolves dependencies upfront and then, at least for the pip_installer, uses pip install --no-deps to install each package independently of all the others (this is what my own build tools have done), it seems to me to be safe to execute multiple pip install operations in parallel.

From my own testing, changing this for loop in installer.py into a ThreadPool.map effectively gives a speedup relative to the number of packages you're trying to install, for already-locked dependencies.

        self._io.write_line("")
        # before: operations executed one at a time
        # for op in ops:
        #     self._execute(op)
        # naive change: one thread per operation (requires `import multiprocessing.pool`)
        multiprocessing.pool.ThreadPool(len(ops)).map(self._execute, ops)

This takes me from ~40s for a fresh install (including new venv creation) of about 30 dependencies down to about 10s. For comparison, a serial pip install -r requirements.txt -t test --no-deps using the exported requirements.txt file takes about 15s.

I don't know if the lack of parallelization here is because removing or updating things might be more sensitive to some kind of cross-dependency race, or if there are other reasons why a parallel per-dependency install isn't good standard practice. But in my limited testing, that very simple change makes poetry ~4x faster for a decently sized application.

@ewjoachim commented Dec 27, 2019

Ok, sorry, long post.

I had the opportunity to discuss the issue with @sdispater recently, and we came to the conclusion that one way to dramatically speed up the download time would be to download fewer bytes, and in particular to download just the ones we need.

Let's add a few things:

  • Random access is possible for zip files (wheels) but not for tar archives, which makes the method usable for wheels only.
  • All the info we need is in the .dist-info/WHEEL file
  • HTTP allows fetching only a given range of bytes. Of course, the server needs to implement this for it to work, but the CDN behind PyPI implements it
  • Given the file object interface can be implemented on any object, it's clearly possible to write a wrapper that will lazily download the needed byte ranges and nothing else
  • With that in mind, we just need to find the laziest way to extract the wheel in terms of bytes downloaded and number of requests.

I've done a POC for just this (not trying to integrate this into poetry, just seeing if I could achieve anything noteworthy by playing around).
The result is https://github.com/ewjoachim/quickread

My first conclusions:

  • The approach seems to work: I was able to extract the full dependencies of Django 3.0.1 by downloading 316.82kb (4.37% of the full wheel). The byte ranges I had to download were: RangeSet{Range[6964008, 6965518), Range[7105384, 7428297)}
    This is 1.5kb of (compressed) file and 315kb of central directory (because there are thousands of files in the zip)

Given the zip file format, at first I thought it wouldn't be possible to improve on that, but then I looked again, and now I think it's feasible to do much, much better.

More details on the problems with the ZIP format follow. (Technical details, not needed to understand the whole problem, but here if you're interested.) First things first: I'm not an expert on the zip format. Most of this is what I discovered today by reading Google and Wikipedia. If you think there's a mistake, you're most probably right, so please say so.

A zip file is composed of:

  • the compressed files (each one with a header)
  • then, at the end of the file, a central directory with information on all the files (including each file's name and offset)
  • and then information on the central directory itself (the offset of its start).

If you want to read a single file, here are the minimal steps one needs to take:

  • Open the file, then move the cursor (seek) to the end of the file
  • We need to identify the beginning of the "End of central directory record", which tells us where the central directory starts. If we're lucky and there's no "comment" in the comment field, it's at byte -22. If we're not lucky, there's a comment up to 64kb long, so we need to search backwards for the signature that begins the record, 0x06054b50, and then check that the length of the comment section (offset 20) matches the offset we found.
  • From there, we get the location of the start of the central directory. Let's seek there.
  • Now we have to read every file header. Each header tells us how long its variable-length fields are, so in order not to get lost we need to read the whole thing.
  • At some point, one of the file headers has the name we're looking for, and the header tells us where the file is. We seek to that location.
  • Then we read the local file header, which tells us how long the file is
  • Then we read the file bytes
  • Then we decompress the file. Phew.

# But

That being said, we know that wheels are most likely created by the wheel tool, so we can make assumptions, plenty of them. This helps because:

  • We know the end of central directory record is 22 bytes long, without comment
  • We know that the files are always added to the zip in the same order. In particular, the last 4 files added are:
    • name-version.dist-info/WHEEL (the file we want to read)
    • name-version.dist-info/entry_points
    • name-version.dist-info/top_level.txt
    • name-version.dist-info/RECORD
  • As long as we can derive the length of name-version, we know exactly the offset we need to read to get the location of the wheels file. This is given by the formula: offset_from_end_of_file = - 248 - 4 * len(name_dash_version) where name_dash_version is the variable part of the .dist-info folder name, which is also present in the wheel filename.
  • Ideally, the length to read is 20 + len(name_dash_version), so that we get the file offset in the first 4 bytes and then name-version.dist-info/WHEEL, which lets us check that the wheel follows our assumptions. The thing is, we cannot request both a length and a negative offset in the HTTP spec (as far as I understand), but that just means we'll have to read 300 bytes instead of 30, which really doesn't make a difference.
  • Then we read the first 4 bytes and get an offset; let X be the offset
  • Then I think the best option is to read 5kb at that location and hope we got enough (check the length at offset 18)
  • And then we extract the compressed file, decompress it with zlib.decompress from the stdlib, and then we have the WHEEL file

This method would require only 2 requests and 5kb (or we could make it 3 requests and probably 1kb), so for Django we're talking about dividing the download time by about 10,000. That should speed up poetry.
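
As a minimal sketch of the first step (my illustration, not the POC code; it assumes the server honors Range headers and that the archive has no trailing comment, so the End of central directory record is exactly the last 22 bytes):

import struct
import requests

def central_directory_location(url):
    # fetch only the last 22 bytes: the "End of central directory record"
    response = requests.get(url, headers={"Range": "bytes=-22"})
    response.raise_for_status()
    (signature, _disk, _cd_disk, _n_disk, n_entries,
     cd_size, cd_offset, comment_len) = struct.unpack("<IHHHHIIH", response.content)
    if signature != 0x06054B50 or comment_len != 0:
        raise ValueError("unexpected layout; fall back to downloading the whole wheel")
    # a second Range request for bytes cd_offset..cd_offset+cd_size-1
    # would then fetch the central directory itself
    return cd_offset, cd_size, n_entries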

And the best part is that if there's no wheel, or the wheel isn't in the expected format, we can just detect that and fall back to today's strategy.

I'm going to try and upgrade the POC to showcase reading the WHEEL file given a wheel URL.

@ewjoachim commented Dec 27, 2019

Another, possibly simpler solution: read the last, say, 3kb of the zip and look for 0x04034b50 (which marks the beginning of a new file entry), then read the file name. If it's WHEEL, we've found it; if it's entry_points, top_level or RECORD, we've gone too far and can restart with the last 10kb; if it's anything else, continue reading.

Reading efficiently through the bytes can be done with the struct module; we can even re-use the struct definitions from the zipfile module.

EDIT: yeah, but... we have no idea how long the central directory is (it may be anywhere between 1kb and 1MB), so by default this is hard. We're probably better off reading the central directory to know where to search.

@ewjoachim

Ok, POC updated at https://github.com/ewjoachim/quickread/blob/master/quickread/wheel.py (usage at https://github.com/ewjoachim/quickread/blob/master/script2.py).

We can get the full requirements for a wheel with 2 requests, each downloading 0.5 to 2 kb.

I'll try to see if there's a way to integrate this into poetry.

@ewjoachim commented Dec 29, 2019

Hm :/

I implemented it on a branch, and the results are not as impressive as expected.

Running on poetry itself, in the dev env:

$ # On master
$ poetry run poetry update -vvv
   ...
   1: Version solving took 3.156 seconds.
   1: Tried 1 solutions.
   0: Complete version solving took 52.984 seconds for 4 branches
   0: Resolved for branches: (>=2.7,<2.8 || >=3.4,<3.5), (>=2.7,<2.8), (>=3.4,<3.5), (>=3.5,<4.0)

$ # On my branch
$ poetry run poetry update -vvv
   ...
   1: Version solving took 2.874 seconds.
   1: Tried 1 solutions.
   0: Complete version solving took 45.825 seconds for 4 branches
   0: Resolved for branches: (>=2.7,<2.8 || >=3.4,<3.5), (>=2.7,<2.8), (>=3.4,<3.5), (>=3.5,<4.0)

Full output: https://gist.github.com/ewjoachim/6fed55fe84da9c90d6452b73ed64cdbd

  • Either I've done something wrong
  • Or this wasn't actually the bottleneck
  • Or the bottleneck was in connection establishment more than in download time (I have a fiber connection)
  • Or the PyPI CDN is not that much faster on range requests
  • Or something else.

Additional ideas for speeding up:

  • I wonder if there's really nothing we can do with sdists, but that means studying the .tar and .gz formats in depth.
  • I noticed we're using requests.get everywhere and never creating a requests.Session. Using a session should speed things up, I guess (a sketch below).
  • Parallelization, as mentioned above.
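
For the Session point, the change is small; a sketch (the URL list is a placeholder argument):

import requests

def fetch_all(urls):
    # one Session reuses the underlying TCP/TLS connection across requests
    session = requests.Session()
    return [session.get(url).content for url in urls]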

I'm posting my branch (#1803), but probably not spending a lot more time on that unless someone has a very smart idea :)

@chrisbennight

@ewjoachim
Apologies if I'm missing something here, but it looks like the solution you are investigating would potentially have an impact on an add/update (i.e. the resolution process), but not on the install process (which is the focus of this issue?).

Definitely not being dismissive of the work, as I think anything that speeds up solving time is great (during development I definitely spend more time solving than installing); I just don't think it's relevant to this particular thread.

--

I suspect the other posts calling out the per-module pip subprocess are the primary cause of the difference between pip and poetry on install. I think in some cases the pure parallelism approach would break, due mostly to modules doing custom things in setup.py (which implicitly require dependencies to be installed). Something that parallelized each layer of depth n up the dependency graph, starting with the deepest, would probably still always work; I haven't looked through the lockfile structure to see if the information needed to do this is present there.
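
A sketch of that layer-by-layer idea (the deps mapping and the install_one callable are hypothetical stand-ins, not Poetry internals):

from concurrent.futures import ThreadPoolExecutor

def install_in_layers(deps, install_one):
    # deps: package name -> set of package names it depends on
    # install_one: callable that installs a single package
    remaining = set(deps)
    installed = set()
    while remaining:
        # a package is ready once everything it depends on is installed
        layer = {pkg for pkg in remaining if deps[pkg] <= installed}
        if not layer:
            raise RuntimeError("dependency cycle detected")
        with ThreadPoolExecutor() as pool:
            list(pool.map(install_one, layer))  # whole layer in parallel
        installed |= layer
        remaining -= layer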

@ewjoachim

Hm, you're right... This stems from a discussion I had with @sdispater, and I mistook this ticket for the one regarding the problem we discussed :(

Sorry for the noise. I'll try to find the proper ticket or create one.

@jtratner

From reading this thread, could someone clarify why poetry needs to install each package individually? If you've already computed the dependency closure, what is the additional bookkeeping vs. generating multiple requirements.txt files and installing in parallel, similar to pipenv?

We've really enjoyed the usability boost of poetry commands, but in our CI system, builds can take up to 20 minutes to install even if everything is already downloaded (I assume because of subprocessing out to pip). I'm thinking of setting up our CI to do poetry export -f requirements.txt, but I'd love not to have to :)

@hvdklauw

Yeah, I tested it again yesterday in our CI environment: poetry install takes minutes longer than doing the export and then using pip, both with and without the packages cached in a folder.

@PetterS
Copy link
Contributor

PetterS commented May 1, 2020

Installing packages in parallel would be really nice.

Pipenv does this in a really bad way: ignoring package dependencies completely. Sometimes installation fails and those packages are simply retried in the end. Not a good approach.

But since we have the complete dependency graph we could find the packages that are safe to install in parallel and do that.

@jtratner commented May 2, 2020

Pipenv does this in a really bad way: ignoring package dependencies completely. Sometimes installation fails and those packages are simply retried in the end. Not a good approach.
But since we have the complete dependency graph we could find the packages that are safe to install in parallel and do that.

This safe way is definitely ideal BUT the dumb way prob works in 90% of cases (and is significantly simpler). I wonder if you could get a tradeoff by topologically sorting, then doing parallel installs from deps up to top level requirements? (possibly weighting shared dependencies higher)

@PetterS (Contributor) commented May 2, 2020

This safe way is definitely ideal BUT the dumb way prob works in 90% of cases (and is significantly simpler). I wonder if you could get a tradeoff by topologically sorting, then doing parallel installs from deps up to top level requirements? (possibly weighting shared dependencies higher)

This is a quick hack that uses the depths for parallel installation: #2374. The idea is that it should be safe.

EDIT: It is about four times faster on my computer, while still being safe in that it respects the dependencies of the packages.

@hauntsaninja (Contributor)

For anyone else stumbling on this thread, and since it hasn't been mentioned so far, it looks like #2595 has been merged and should help with this issue: #2595 (comment)

@earonesty commented Jul 23, 2020

Just curious: does the new installer pre-build for the local env, cache the built dists, and soft-link to them rather than copying them (like pip-accel did)?

@jomach commented Sep 7, 2023

@finswimmer, why is this closed? I still see the problem on my side.


This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions bot locked as resolved and limited conversation to collaborators on Feb 29, 2024