Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need for wheel caching: new pyproject.toml build isolation can cause excessive unnecessary compilation times with Cython & external .pxd import targets #6411

Open
ghost opened this issue Apr 15, 2019 · 20 comments
Labels
C: PEP 517 impact Affected by PEP 517 processing state: awaiting PR Feature discussed, PR is needed type: enhancement Improvements to functionality

Comments

@ghost
Copy link

ghost commented Apr 15, 2019

Disclaimer: I am not a super packaging expert. It's very possible I'm missing something and this is all a non-issue. If that's the case I'm sorry I wasted your time with my wall of text 😬 but from my limited knowledge this really seems to be a real problem

The new pyproject.toml build isolation apparently can add many minutes unnecessary build time to Cython projects which have larger external/other Cython projects as dependencies from which they cimport .pxd files. (which means these dependencies are not only needed to launch the installed library or program, but needed during build_ext already)

The reason is that in earlier packages, AFAIK it worked like this:

  1. Put your .pxd import target dependency into install_requires and setup_requires
  2. The external .pxd dep in setup_requires will be installed as soon as your install/upgrade starts, but only if not present
  3. The build_ext & everything else to do the actual build runs
  4. install_requires is processed, but because setup_requires already contained our .pxd target it's not reinstalled

--> .pxd dependency is built one or zero times (latter on upgrades)

Now with pyproject.toml it appears to work like this:

  1. The only option I'm aware of to specify something to be around during build is pyproject.toml's build-system.requires
  2. If you launch your install/upgrade, the external .dep is not around due to the new build isolation. It will be therefore fetched and fully compiled (wheel) even if already present
  3. The build_ext & everything else to do the actual build runs
  4. install_requires is processed, and since the previous wheel built was inside the build isolation it will install the dep again, and this on every single install

--> .pxd dependency is built two or one time(s) (latter on upgrades)

Please note that some Cython projects (e.g. mine) may need multiple minutes to build from scratch, so this difference can be substantial. The practical impact is especially horrible in tox where this time is added to each single test build run

Expected behavior: a .pxd dep will run through the build_ext step once or never (latter for upgrades) as before. I believe this is possible without breaking build isolation, see below

Cause in detail (I think):
There is no way to say "have this package around source-only" but I always need to install & compile it fully to get the .pxd files. (.pxd are like C/C++ headers, they don't need compilation, so being forced to compile/do a wheel/build_ext instead of just e.g. an sdist extract is the core issue.) With caching & no build isolation this wasn't too bad since rebuilds are avoided so much, but this is now no longer the case. So now this is suddenly a problem!

Related Cython ticket since this probably can only be improved in some sort of cooperation: cython/cython#2924

Solution ideas:
The fully build isolation-preserving idea I have:

  1. Add some way to specify that a build-time dep is needed as special "sdist dep" that is extracted source-only in some central pip-managed place alike to site-packages, with a similar search mechanism like imports but as callable search function which Cython could use to find files from a specific sdist source

The partially build isolation-preserving idea I have:

  1. Allow a special declarative flag in pyproject.toml for a build time dependency saying "this one is really heavy to install, pip should use that one from external site-packages even in build isolation to avoid excessive rebuild time"

The "screw build isolation" solution:

  1. Require all affected Cython projects to just never use build isolation (is that even possible right now? If not that might need a new pyproject.toml flag to disable it)
@cjerdonek
Copy link
Member

Would issue #6144 help at all with this?

@ghost
Copy link
Author

ghost commented Apr 15, 2019

@cjerdonek they appear to be discussing multiple variants, one being "ignore package if not already around" (which wouldn't help) and the other being "build this dependency but not under isolation/grab from outer site-packages if present" (which would help). However, that would correspond to the partially build isolation-preserving idea above, so not the optimal solution where we keep the full build isolation. Still better than nothing, but not the ideal outcome in terms of functionality

@jdemeyer
Copy link

Would issue #6144 help at all with this?

That seems to be a different issue, but depending on the resolution, it may or may not help.

Personally, I would simply suggest to not use pyproject.toml at all. If that doesn't work, don't use it.

@ghost
Copy link
Author

ghost commented Apr 18, 2019

That is impossible. setup_requires/easy_install breaks cython in complex scenarios, so there is actually no alternative. One of the big reasons why I made this ticket

@jdemeyer
Copy link

setup_requires/easy_install breaks cython in complex scenarios

easy_install is broken, but pip should work.

I wonder what kind of "complex scenarios" you mean that pip + setup.py can't handle correctly.

@ghost
Copy link
Author

ghost commented Apr 18, 2019

The use case that breaks is the one described in this ticket's initial desc: you cannot use .pxd dependencies without using setup_requires, and setup_requires still uses easy_install behind the scenes. easy_install's sandbox attempts to reload Cython in a fundamentally impossible manner (as I understand it because native modules cannot be fully unloaded so it carries over invalid state) which with deep enough setup_requires chains makes Cython compilation fail. The only workaround is to override build_ext in an ugly way in the affected dependency packages themselves to run each cythonize call in an ugly subprocess wrap to force separate clean state, or to use pyproject.toml instead.

@jdemeyer
Copy link

This issue is really two independent issues:

  1. Some build-system dependencies may take a while to build and install. It would be nice to be able to use an existing already-installed version in site-packages instead.

  2. When a package is both a build-time and a run-time dependency, it's built twice.

@ghost
Copy link
Author

ghost commented Apr 18, 2019

I would say 1. is also just a workaround, the actual issue is that build-system.requires is the only way to have a dependency package around (during build_ext that is) and it implies it needs to be built, but it wouldn't actually need to be (technically only the .pxd files are needed which are in an sdist). However, that is a very cython-specific nuance, and being able to just access it from outside site-packages could already make this much less painful/less relevant in most cases

Edit: sorry had written python in one place instead of cython, please don't be confused (corrected now)

@jdemeyer
Copy link

setup_requires still uses easy_install behind the scenes

If so, that's a problem indeed. I assumed that setup_requires installs dependencies the same way that pip does.

@jdemeyer
Copy link

Actually, let's take a step back: which is the package that easy_install does not install correctly? That might be fixable.

@ghost
Copy link
Author

ghost commented Apr 18, 2019

one of the cython devs seemed to think it is an inherent problem, details are here: cython/cython#2730 the current state of the issue is, as far as I understand it, that the dev put in a change to make this particular instance of the error less likely but that the whole way easy_install's sandbox attempts to unload & reload modules just won't safely work with how cython is designed. but I might be wrong, feel free to read it yourself

@jdemeyer
Copy link

Yeah OK. I always hated easy_install and this is yet another reminder why...

@cjerdonek
Copy link
Member

I haven't developed an understanding of this particular issue, but just to be sure, are you also aware of the --no-build-isolation and --no-use-pep517 options to pip install?

@jdemeyer
Copy link

I haven't developed an understanding of this particular issue

There is a lot of noise on this page, but it's really simple: #6411 (comment)

@ghost
Copy link
Author

ghost commented Apr 18, 2019

@cjerdonek this affects all users of my library, I guess I could write this into the readme and everyone who doesn't know will wait many additional minutes - but that seems quite ugly to me. What would help on the other hand is actually a pyproject.toml option requesting that this project not be built with build isolation, or that a specific build-system.requires dep should be accessed from outside of the otherwise in effect build isolation. however, what would REALLY help is a build-system.sdist_requires option that cython could then possibly learn how to use, because a part of this mess is just that both setup_requires and build-system.requires aren't actually modelling the use case of .pxd external imports very well

@pfmoore
Copy link
Member

pfmoore commented Apr 18, 2019

If so, that's a problem indeed. I assumed that setup_requires installs dependencies the same way that pip does.

No, setup_requires is a setuptools-specific argument, that is not handled by pip at all. The standardised replacement for setup_requires is PEP 518, which pip does support.

Some build-system dependencies may take a while to build and install. It would be nice to be able to use an existing already-installed version in site-packages instead.

That's what --no-build-isolation is for. It's a bit of a blunt instrument, as it requires you to manage all of your build dependencies. It might be nice to be able to say "just get this specific build requirement from site-packages", but I'm not even sure how such an option would function. It's definitely a feature request, though.

When a package is both a build-time and a run-time dependency, it's built twice.

... well, yes? Build time and run time potentially happen on completely different machines. I guess when the package is built and installed in the build-time environment, the wheel could be cached and reused for the later install step. I don't know if that would help here, but again it's a feature that needs a bit of work to design, as I'm pretty sure it wouldn't be as simple as it seems at first glance.

@ghost
Copy link
Author

ghost commented Apr 18, 2019

wheel caching I think should actually help with this very effectively. what I like about this solution is that it doesn't need a special build-system.sdist_requires (and doesn't need modifications in cython), and also leaves the build isolation intact. so IMHO that sounds like an awesome idea, of course ignoring for a moment the complexity of implementing this which I simply cannot judge

@jdemeyer
Copy link

@pfmoore pfmoore added the state: awaiting PR Feature discussed, PR is needed label Apr 18, 2019
@pfmoore
Copy link
Member

pfmoore commented Apr 18, 2019

IMHO that sounds like an awesome idea, of course ignoring for a moment the complexity of implementing this which I simply cannot judge

Well, someone will need to do the investigation and provide a PR, so if you think this would address your issue then I'll update this issue with the "Awaiting PR" label.

@cjerdonek cjerdonek added the type: enhancement Improvements to functionality label Apr 19, 2019
@ghost ghost changed the title New pyproject.toml build isolation can cause excessive unnecessary compilation times with Cython & external .pxd import targets Need for wheel caching: new pyproject.toml build isolation can cause excessive unnecessary compilation times with Cython & external .pxd import targets Apr 19, 2019
@ghost
Copy link
Author

ghost commented Apr 19, 2019

@pfmoore I tried to update the ticket title such that it matches better what you suggested as PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: PEP 517 impact Affected by PEP 517 processing state: awaiting PR Feature discussed, PR is needed type: enhancement Improvements to functionality
Projects
None yet
Development

No branches or pull requests

3 participants