Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editable installs don't work with setuptools 47.2.0+ and multi-buildpack #1006

Closed
edmorley opened this issue Jul 23, 2020 · 2 comments · Fixed by #1251, #1252 or #1253
Closed

Editable installs don't work with setuptools 47.2.0+ and multi-buildpack #1006

edmorley opened this issue Jul 23, 2020 · 2 comments · Fixed by #1251, #1252 or #1253
Assignees
Labels

Comments

@edmorley
Copy link
Member

edmorley commented Jul 23, 2020

In https://heroku.support/890604 an app using pip editable installs combined with multiple buildpacks started seeing build failures when using newer setuptools.

The output from a reduced testcase looks like:

remote: Traceback (most recent call last):
remote:   File "/app/.heroku/python/bin/demo_say_hello", line 33, in <module>
remote:     sys.exit(load_entry_point('demobuildissue', 'console_scripts', 'demo_say_hello')())
remote:   File "/app/.heroku/python/bin/demo_say_hello", line 22, in importlib_load_entry_point
remote:     for entry_point in distribution(dist_name).entry_points
remote:   File "/app/.heroku/python/lib/python3.8/importlib/metadata.py", line 504, in distribution
remote:     return Distribution.from_name(distribution_name)
remote:   File "/app/.heroku/python/lib/python3.8/importlib/metadata.py", line 177, in from_name
remote:     raise PackageNotFoundError(name)
remote: importlib.metadata.PackageNotFoundError: demobuildissue

Investigating I found that:

The reason this causes breakage in the multi-buildpack case is:

  • the build system runs builds in a different directory (/tmp/build_*) from which the app will be run at runtime (/app)
  • much of Python + its tooling expects it's location not to change after install (ie: it's not relocatable) and as such the Python buildpack has to use several path rewriting hacks to make this work
  • one such hack is rewriting the .pth files added to site-packages for editable installs (installs where the package is effectively symlinked from its source directory rather than being packaged and copied into site-packages) so that the absolute paths inside reference /app instead of the /tmp/... paths of BUILD_DIR
  • this rewrite occurs right at the end of the Python compile step, since at that point it's presumed all Python commands have been run (eg: the bin/post_compile hook has already been called)
  • however if the app is using multiple buildpacks, this leaves the contents of BUILD_DIR in a state that only works at runtime, and not for subsequent buildpacks - which causes issues if it uses say the NodeJS buildpack's heroku-postbuild hook to run a script that calls a Python package installed in editable mode
  • the only reason this issue wasn't seen before, is that the Python buildpack doesn't rewrite the paths in the ez_install .egg-link files that are added to site-packages (meaning they still use the BUILD_DIR paths) and old setuptools used to check the egg-link files in addition to the pth files when performing package resolution, but now it only checks the latter.

The path rewriting in question is here:

# rewrite build dir in egg links to /app so things are found at runtime
find .heroku/python/lib/python*/site-packages/ -name "*.pth" -print0 2> /dev/null | xargs -r -0 -n 1 sed -i -e "s#$(pwd)#/app#" &> /dev/null

None of the solutions are great for this, since the buildpack API doesn't provide a way to "run this script after all other buildpacks, but before the end of the build" (and such a feature probably isn't a good idea anyway).

Ideas so far:

  1. Move Python’s .pth rewriting to be a runtime .profile.d/ script (since it’s not until runtime that we can guarantee that no further buildpacks will run)
  2. Skip the .pth rewriting entirely, and instead use a .profile.d/ script to (at runtime) add a symlink back from /tmp/build*/ to /app, such that any stray references to the build dir “just work”
  3. Continue to use .pth rewriting, but add more bandaids (eg instead of replacing the /tmp/build*/... paths, duplicate them with the /app/... variants, and hope that having the redundant paths doesn’t break anything else). (I checked whether pth supported relative paths but it doesn't appear to.)
  4. Spend some time seeing how much actually breaks if we were to switch builds back to running in /app (though I already know of several buildpacks that will, so this would be a substantial project of feature flags, communication, outreach etc)

Ideally we'd choose (4), but that's not a quick change to make.

GUS-W-7828034.

edmorley added a commit that referenced this issue Jul 23, 2020
The versions installed by the buildpack have been updated as follows:
* pip:
  - If using Python 3.4: No change (already using the last to support 3.4)
  - If using pipenv: No change (need to update to a newer pipenv first)
  - For everything else: `20.0.2` -> `20.1.1`
* setuptools:
  - If using Python 3.4: `39.0.1` -> `43.0.0` (latest for 3.4)
  - If using Python 2.7: `39.0.1` -> `44.1.1` (latest for 2.7)
  - For everything else: `39.0.1` -> `47.1.1` (until #1006 fixed)
* wheel:
  - If using Python 3.4: `unpinned` -> `0.33.6`
  - For everything else: `unpinned` -> `0.34.2`

This fixes #949 and fixes #1005, and means packages that rely on newer
setuptools will now install successfully.

Changelogs:
https://pip.pypa.io/en/stable/news/
https://setuptools.readthedocs.io/en/latest/history.html#v47-1-1
https://wheel.readthedocs.io/en/latest/news.html

In addition:
* Installed versions are now deterministic (fixes #1000, fixes #1003)
* The build output now includes the versions used, making it easier to
  debug future upgrades (closes #939)
* Errors during pip/setuptools/wheel install now correctly fail the
  build, and stderr is no longer sent to `/dev/null` (fixes #1002)
* Setuptools is no longer installed twice (fixes #1001)
* Everything that is downloaded is now used (fixes #999)
* `--no-cache` and `--disable-version-check` are now used, saving
  unnecessary work and preventing creation of unwanted files in `/app`
* The `PIP_UPDATE` env var no longer leaks into subprocesses.

As part of fixing version pinning, we now use pip itself to determine
whether the installed packages are up to date, since parsing pip's
output is fragile (eg #1003).

This means `pip install` is now called every time, however this is a
no-op for repeat builds where the versions have not changed, since
unless `--upgrade` is specified pip does not hit the index (PyPI) if
requirements are satisfied.

For the installation itself `get-pip.py` is no longer used, since:
- It uses `--force-reinstall`, which is unnecessary here and would slow
  down repeat builds (given we call pip install every time now).
  Trying to work around this by using `get-pip.py` only for the initial
  install, and real pip for subsequent updates would mean we lose
  protection against cached broken installs, plus significantly
  increase the version combinations test matrix.
- It means downloading pip twice (once embedded in `get-pip.py`, and
  again during the install, since `get-pip.py` can't install the
  embedded version directly)
- We would still have to manage several versions of get-pip.py, to
  support older Pythons.

We don't use `ensurepip` since:
- Not all of the previously generated Python runtimes on S3 include it
- We would still have to upgrade pip afterwards
- The versions of pip/setuptools bundled with ensurepip differ greatly
  depending on Python version, and we could easily start using a CLI
  flag for the first pip install before upgrade that isn't supported
  on all versions, without even knowing it (unless we test against
  hundreds of Python archives).

The new pip wheel assets on S3 were generated using:

```
$ pip download --no-cache pip==19.1.1
Collecting pip==19.1.1
  Downloading pip-19.1.1-py2.py3-none-any.whl (1.4 MB)
  Saved ./pip-19.1.1-py2.py3-none-any.whl
Successfully downloaded pip

$ pip download --no-cache pip==20.1.1
Collecting pip==20.1.1
  Downloading pip-20.1.1-py2.py3-none-any.whl (1.5 MB)
  Saved ./pip-20.1.1-py2.py3-none-any.whl
Successfully downloaded pip

$ aws s3 sync . s3://lang-python/common/ --exclude "*" --include "*.whl" --acl public-read --dryrun
(dryrun) upload: ./pip-19.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-19.1.1-py2.py3-none-any.whl
(dryrun) upload: ./pip-20.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-20.1.1-py2.py3-none-any.whl

$ aws s3 sync . s3://lang-python/common/ --exclude "*" --include "*.whl" --acl public-read
upload: ./pip-19.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-19.1.1-py2.py3-none-any.whl
upload: ./pip-20.1.1-py2.py3-none-any.whl to s3://lang-python/common/pip-20.1.1-py2.py3-none-any.whl
```
edmorley added a commit that referenced this issue Jul 30, 2020
Upgrades setuptools from 39.0.1 to:
- 44.1.1 for Python 2.7 (since it's the last supported version)
- 43.0.0 for Python 3.4 (since it's the last supported version)
- 47.1.1 for Python 3.5+ (since we can't use 47.2.0+ until #1006 fixed)

https://setuptools.readthedocs.io/en/latest/history.html#v47-1-1

Fixes #949.
Closes #973.
edmorley added a commit that referenced this issue Aug 3, 2020
Upgrades setuptools from 39.0.1 to:
- 44.1.1 for Python 2.7 (since it's the last supported version)
- 43.0.0 for Python 3.4 (since it's the last supported version)
- 47.1.1 for Python 3.5+ (since we can't use 47.2.0+ until #1006 fixed)

https://setuptools.readthedocs.io/en/latest/history.html#v47-1-1

Fixes #949.
Closes #973.
edmorley added a commit that referenced this issue Aug 3, 2020
Upgrades setuptools from 39.0.1 to:
- 44.1.1 for Python 2.7 (since it's the last supported version)
- 43.0.0 for Python 3.4 (since it's the last supported version)
- 47.1.1 for Python 3.5+ (since we can't use 47.2.0+ until #1006 fixed)

https://setuptools.readthedocs.io/en/latest/history.html#v47-1-1

Fixes #949.
Closes #973.
dryan pushed a commit to dryan/heroku-buildpack-python that referenced this issue Nov 19, 2020
Upgrades setuptools from 39.0.1 to:
- 44.1.1 for Python 2.7 (since it's the last supported version)
- 43.0.0 for Python 3.4 (since it's the last supported version)
- 47.1.1 for Python 3.5+ (since we can't use 47.2.0+ until heroku#1006 fixed)

https://setuptools.readthedocs.io/en/latest/history.html#v47-1-1

Fixes heroku#949.
Closes heroku#973.
@edmorley
Copy link
Member Author

edmorley commented Apr 8, 2021

The project to make builds be performed from /app is in progress. Once that's complete, this bug can be fixed by removing all rewriting entirely.

edmorley added a commit that referenced this issue Oct 18, 2021
Previously the editable installation mode test only checked whether
the build completed successfully, not whether the installed editable
packages worked.

Now the packages are tested via their entrypoints in the following scenarios:
1. During the `bin/post_compile` hook
2. By a later buildpack in the build
3. At runtime

In particular, (2) catches the issue described in #1006 when setuptools is upgraded,
and (1) + (3) will be useful to verify the solution to #1006 hasn't broken the other cases.

GUS-W-10047026.

[skip-changelog]
@edmorley
Copy link
Member Author

edmorley commented Oct 18, 2021

I've added additional tests in #1251 which now catch the issues described here. After attempting a setuptools upgrade, here is the failure from those new tests:
https://app.circleci.com/pipelines/github/heroku/heroku-buildpack-python/221/workflows/5ba7fb0e-663b-4d76-ad34-ed7e1b37ba3b/jobs/946

@edmorley edmorley self-assigned this Oct 18, 2021
edmorley added a commit that referenced this issue Oct 18, 2021
Previously the editable installation mode test only checked whether
the build completed successfully, not whether the installed editable
packages worked.

Now the packages are tested via their entrypoints in the following scenarios:
1. During the `bin/post_compile` hook
2. By a later buildpack in the build
3. At runtime

In particular, (2) catches the issue described in #1006 when setuptools is upgraded,
and (1) + (3) will be useful to verify the solution to #1006 hasn't broken the other cases.

GUS-W-10047026.

[skip-changelog]
edmorley added a commit that referenced this issue Oct 19, 2021
Currently the build system performs builds in a different directory
(`/tmp/build_<hash>`) to which the app will be run at runtime (`/app`).
This means that any hardcoded paths in the slug must be rewritten by the
buildpack, so the app still works after being moved.

One such case of hardcoded paths, are the `.pth` and `.egg-link` files
that are created in the `site-packages` directory by Pip/setuptools for
editable package installs (aka develop mode). The most common way
someone might use editable mode is via `-e <package specifier>` entries
in their `requirements.txt` file.

Until now, the Python buildpack rewrote paths inside these files during
the compile itself, which meant the build-time paths were no longer
present when subsequent buildpacks ran. This happened to work due to
an interaction of legacy setuptools behaviour and a buildpack bug, but
stops working in setuptools 47.2.0 or later - as described in #1006.

Longer term we would like to stop building in one location and running
the app from another, so that the path rewriting isn't required at all.
However that change breaks some other buildpacks so requires a long-term
transition plan.

In the meantime, this change moves path rewriting to a `.profile.d/`
script, so that it occurs at runtime, and so after all other buildpacks
have run.

Additional test coverage of editable packages was added previously in #1251
and #1253, and has confirmed that this new `profile.d/` script approach will
prevent the issues in #1006 when setuptools is updated in a future PR.

There is one subtle implication of moving this path rewriting, in that
subsequent cached builds will no longer see the existing package as
being already installed, so won't uninstall if before reinstalling it (as
seen in the test log output change). However this is not believed to
have any significant impact.

Fixes #1006.
(And so unblocks updating to the newer setuptools required for #1248.)
edmorley added a commit that referenced this issue Oct 19, 2021
…#1252)

Currently the build system performs builds in a different directory
(`/tmp/build_<hash>`) to which the app will be run at runtime (`/app`).
This means that any hardcoded paths in the slug must be rewritten by the
buildpack, so the app still works after being moved.

One such case of hardcoded paths, are the `.pth` and `.egg-link` files
that are created in the `site-packages` directory by Pip/setuptools for
editable package installs (aka develop mode). The most common way
someone might use editable mode is via `-e <package specifier>` entries
in their `requirements.txt` file.

Until now, the Python buildpack rewrote paths inside these files during
the compile itself, which meant the build-time paths were no longer
present when subsequent buildpacks ran. This happened to work due to
an interaction of legacy setuptools behaviour and a buildpack bug, but
stops working in setuptools 47.2.0 or later - as described in #1006.

Longer term we would like to stop building in one location and running
the app from another, so that the path rewriting isn't required at all.
However that change breaks some other buildpacks so requires a long-term
transition plan.

In the meantime, this change moves path rewriting to a `.profile.d/`
script, so that it occurs at runtime, and so after all other buildpacks
have run.

Additional test coverage of editable packages was added previously in #1251
and #1253, and has confirmed that this new `profile.d/` script approach will
prevent the issues in #1006 when setuptools is updated in a future PR.

There is one subtle implication of moving this path rewriting, in that
subsequent cached builds will no longer see the existing package as
being already installed, so won't uninstall if before reinstalling it (as
seen in the test log output change). However this is not believed to
have any significant impact.

Fixes #1006. (And so unblocks updating to the newer setuptools required for #1248.)
GUS-W-7828034.
edmorley added a commit that referenced this issue Oct 19, 2021
Updates:
- setuptools from 47.1.1 to:
  - 50.3.2 for Python 3.5
  - 57.5.0 for Python 3.6+
- wheel from 0.36.2 to 0.37.0.

Of note, the newer setuptools is fully compatible with Python 3.10, thereby fixing
#1248. Updating to newer setuptools was blocked on #1006, but that's now been
fixed by #1252.

The setuptools version hasn't been updated all the way to the latest (58.2.0), since 
v58 dropped support for 2to3, which caused breakage in a few packages, so I would
rather hold off as long as possible (and there are no fixes that we need since then).

Release notes:
https://setuptools.pypa.io/en/latest/history.html#v57-5-0
https://wheel.readthedocs.io/en/stable/news.html

Full changelogs:
pypa/setuptools@v47.1.1...v57.5.0
pypa/wheel@0.36.2...0.37.0

Fixes #1248.
GUS-W-10052807.
edmorley added a commit that referenced this issue Oct 19, 2021
Updates:
- setuptools from 47.1.1 to:
  - 50.3.2 for Python 3.5
  - 57.5.0 for Python 3.6+
- wheel from 0.36.2 to 0.37.0.

Of note, the newer setuptools is fully compatible with Python 3.10, thereby fixing
#1248. Updating to newer setuptools was blocked on #1006, but that's now been
fixed by #1252.

The setuptools version hasn't been updated all the way to the latest (58.2.0), since 
v58 dropped support for 2to3, which caused breakage in a few packages, so I would
rather hold off as long as possible (and there are no fixes that we need since then).

Release notes:
https://setuptools.pypa.io/en/latest/history.html#v57-5-0
https://wheel.readthedocs.io/en/stable/news.html

Full changelogs:
pypa/setuptools@v47.1.1...v57.5.0
pypa/wheel@0.36.2...0.37.0

Fixes #1248.
GUS-W-10052807.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment