Speed up build environment creation (re-usable, faster) #7294
Comments
Having a concrete target can help. Can you provide an example of a slow command, the time it took, and how long you're expecting it to take? The network access issue can be overcome by invoking pip like |
My issue raised here was exactly that pip wheel builds a build environment from scratch every time via the network, which is quite expensive and could be avoided.
Thanks for the example! On my machine running this script I'm showing it takes about two seconds to build tox: script.sh
Output
The important lines are:
How long does it take for you? If you're seeing much longer times, it may be caused by something other than the build environment setup (e.g. #2195). If you run Sorry if I'm asking something that seems obvious, I'm trying to cover all bases so we can say without a doubt that a change in this area would help and identify how much it would help. That helps with prioritization since there are several existing issues related to pip being slow to install (#4768, #4497 kind of, #825). |
(just noting -- @gaborbernat is a virtualenv maintainer) The build environments could be created from cached wheels -- optimizing things to minimize network requests is a good idea. However as #7132 and similar show, it is kinda tricky to do. Note that these are intended as optimizations, so we'd want to not have subtle+nuanced behavior changes. If someone wants to implement the approach I propose there and a way for pip to aggressively use the cache, that'd be a good way to avoid network requests in this specific scenario. |
This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further. |
@chrahunt those timings are platform dependent; it's much slower on Windows. I personally find 2s a bit slow, when it could be 0.1s 👍 I'd expect most projects to have the same build dependencies, so we could keep reusing the envs instead of recreating them. They could, for example, be cached by the build-dependency names and versions. |
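To make the "cached by the build-dependency names and versions" idea concrete, here is a minimal sketch of how such a cache key could be derived. It is not pip's implementation. The function name `build_env_cache_key` is hypothetical, and the sketch assumes the build requirements have already been resolved to pinned `name==version` strings.

```python
import hashlib


def build_env_cache_key(requirements):
    """Return a stable key for a set of pinned build requirements.

    Hypothetical helper: assumes each entry is a pinned ``name==version``
    string, so lowercasing and sorting cannot change its meaning.
    """
    # Normalize: strip whitespace, lowercase, drop blanks and duplicates,
    # then sort so the key does not depend on the order of the entries.
    normalized = sorted({req.strip().lower() for req in requirements if req.strip()})
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()
    # A shortened digest is enough for a directory name in a local cache.
    return digest[:16]
```

Two projects that pin the same setuptools and wheel versions would then map to the same key, and so could share one environment, while any version change produces a different key.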
I will second @gaborbernat's comments here. Creating build environments is a non-trivial overhead in certain environments, Windows being very much an example of how bad it can get (particularly in corporate environments, where aggressive antivirus can result in all of the file copying being very slow). On my work PC, a tox build took about 40 seconds elapsed time. From the build log, about half of that was setting up the build environment. I'd kill for a 2-second build 😉 "Make it go faster" without a clear target is always a difficult thing to ask, but conversely, the existing development was very much done on the basis of "get it working before worrying about performance", and I think we should now at a minimum review where the performance bottlenecks¹ are and look to improve them. ¹ Specifically the bottlenecks in pip's code - obviously a lot of the time in a from-source build is in the actual build, but that doesn't mean that we should ignore the overheads that pip adds to the process. |
I don't disagree with this idea (and implemented similar environment sharing to speed up our tests in #7276, on top of the environment sharing that we already do), I'm just trying to make sure we take an effective approach. Ideas are important, but without any data or test cases I don't see how we can prioritize or properly weigh the implementation complexity or robustness trade-offs against the benefit. Can we:
We can extract some code from chrahunt#1 for getting the profiling details and I can write up a (probably extremely basic) analysis workflow if it would help. |
I agree with all of this (although it's not easy to do and I don't have time to offer much help on it) but I think it's also worth looking at where we could simply avoid work altogether. As @gaborbernat suggests, re-using build environments (or copying a well-known master) could be a useful avenue to explore.
While this is true in principle, I think there are some reasonable compromises we could make in practice. For example, we potentially build multiple environments in one run of pip. Currently I'd expect nearly all of them to be based on setuptools+wheel. At the moment, I assume we hit the network every time just in case a new release of setuptools occurs between environment builds. I don't think it would be unreasonable to not do that check, and copy one environment (or even just link to it with a Basically, at the moment I'm more interested in exploring design-level ideas than digging into the weeds of the detail. IMO, the implementation of PEP 517 and 518 spent a lot of our budget of "acceptable performance waste" (creation of lots of isolated environments, lots of subprocess calls, etc). At some point we should look at recovering some of that overspend, as with any other form of technical debt. |
I'm also very much in favor of speed ups here. All of this would be fairly tricky though, so I concur with @pfmoore that we should think about what this looks like and the implications of various choices, before diving into implementation (though, I don't want that to block someone else from diving into them, just do that in a new PR or different issue). |
Can anyone for whom pip is running unacceptably slowly post a corresponding pip log file? |
Unacceptably slow is relative. I don't think pip is there, but this thread is about pip doing a lot of wasteful operations here and there (such as always creating isolated build environments from scratch) that add up when used multiple times (e.g. inside tox). Granted, we should measure. Sadly, with the virtualenv and tox rewrites I'm engaged in, I can't dedicate time to this now, but I wanted to start a discussion and see if anyone else concurs with my goal of making things work faster. |
Here you are @chrahunt. toxbuild.log |
Contributions are always welcomed, especially since this issue is tagged help wanted :) |
Currently pip always creates all build environments from scratch, which requires network access. This adds significant overhead to creating isolated build environments. Can we somehow speed up this operation?
Idea: Creating isolated build environments from cached wheels only? Do not recreate isolated build environments from scratch, but have a master copy that we just copy.
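The "master copy that we just copy" idea could look roughly like the sketch below. It is only an illustration under stated assumptions, not how pip builds its environments: `clone_build_env` and `master_dir` are hypothetical names, and the master directory is assumed to have been populated once already (for example from cached wheels) with the build dependencies.

```python
import shutil
import tempfile
from pathlib import Path


def clone_build_env(master_dir):
    """Copy a pre-populated master environment into a fresh temporary directory.

    Hypothetical helper: ``master_dir`` is assumed to already contain the
    installed build dependencies, so no network access is needed here.
    """
    # mkdtemp gives a unique parent directory; copytree requires that the
    # target itself does not exist yet, so copy into a new child of it.
    target = Path(tempfile.mkdtemp(prefix="build-env-")) / "env"
    shutil.copytree(master_dir, target)
    return target
```

A plain copy still pays the file-copying cost that is reported as slow on Windows in the comments above, so whether it beats a fresh install would need to be measured on each platform.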