Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tensorflow wheel incompatible with rules_pyenv 0.8.0 seemingly due to pkginfo 1.8.2 upgrade #671

Closed
lazcamus opened this issue Mar 22, 2022 · 11 comments

Comments

@lazcamus
Copy link
Contributor

lazcamus commented Mar 22, 2022

🐞 bug report

Affected Rule

The issue is caused by the rule:

pip_install()

Is this a regression?

Yes. It does not occur with rules_python 0.6.0, but does occur on 0.8.0.

I found the commit that broke it (see below), so I can say it doesn't affect 0.7.0.

Description

I upgraded rules_python to 0.8.0 and ran into wheel install failure that dumps FileExistsError: [Errno 17] File exists: 'pypi__gast'

I don't have gast in my requirements.txt, but after some binary searching I tracked it down to tensorflow==2.8.0 (intel) or 2.7.0 (arm64). I replaced my requirements.txt with just the dependencies of tensorflow, but not tensorflow itself, and it works. Just tensorflow by itself causes the failure, so it seems to be related to interaction between the tensorflow wheel and the gast wheel.

I ended up testing 2 tensorflow versions while tracking this down:

  • tensorflow 2.8.0 + gast 0.5.3 (x86)
  • tensorflow 2.7.0 + gast 0.4.0 (arm64)

🔬 Minimal Reproduction

Make a stub repo, then bazel build //...

cat > WORKSPACE << "EOF"
workspace(name = "required")
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_python",
    sha256 = "9fcf91dbcc31fde6d1edb15f117246d912c33c36f44cf681976bd886538deba6",
    strip_prefix = "rules_python-0.8.0",
    url = "https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.0.tar.gz",
)
load("@rules_python//python:repositories.bzl", "python_register_toolchains")

python_register_toolchains(
    name = "python39",
    python_version = "3.9",
)

load("@python39//:defs.bzl", "interpreter")
load("@rules_python//python:pip.bzl", "pip_install")

pip_install(
    name = "common_deps",
    python_interpreter_target = interpreter,
    requirements = "//:requirements.txt",
)
EOF
cat > requirements.txt << "EOF"
tensorflow
EOF
cat > BUILD << "EOF"
load("@common_deps//:requirements.bzl", "requirement")
load("@rules_python//python:defs.bzl", "py_binary")

py_binary(
    name = "hello",
    srcs = ["hello.py"],
    deps = [requirement("tensorflow")],
)
EOF

If you git bisect between 0.6.0 and 0.8.0 rules_python releases, it breaks at commit fe30f15 (PR #661)

If you pare it down to just a change to python/pip_install/repositories.bzl, and binary search on the package upgrades in the file, it's specifically broken by the upgrade of pkginfo from 1.7.1 to 1.8.2.

🔥 Exception or Error

 (Traceback (most recent call last):
  File "/home/builder/.cache/bazel/_bazel_laz/99bd544777c52d5a0d8a66ea3b05626a/external/python3_aarch64-unknown-linux-gnu/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/builder/.cache/bazel/_bazel_laz/99bd544777c52d5a0d8a66ea3b05626a/external/python3_aarch64-unknown-linux-gnu/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/builder/.cache/bazel/_bazel_laz/99bd544777c52d5a0d8a66ea3b05626a/external/rules_python/python/pip_install/extract_wheels/__main__.py", line 5, in <module>
    main()
  File "/home/builder/.cache/bazel/_bazel_laz/99bd544777c52d5a0d8a66ea3b05626a/external/rules_python/python/pip_install/extract_wheels/__init__.py", line 110, in main
    targets = [
  File "/home/builder/.cache/bazel/_bazel_laz/99bd544777c52d5a0d8a66ea3b05626a/external/rules_python/python/pip_install/extract_wheels/__init__.py", line 113, in <listcomp>
    bazel.extract_wheel(
  File "/home/builder/.cache/bazel/_bazel_laz/99bd544777c52d5a0d8a66ea3b05626a/external/rules_python/python/pip_install/extract_wheels/lib/bazel.py", line 377, in extract_wheel
    os.mkdir(directory)
FileExistsError: [Errno 17] File exists: 'pypi__gast'
)

🌍 Your Environment

Operating System:
Played with this across Linux x86, OSX aarch64, Linux arm64

Output of bazel version:

% bazel version
Bazelisk version: v1.5.0
Build label: 4.2.2
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Dec 2 18:15:58 2021 (1638468958)
Build timestamp: 1638468958
Build timestamp as int: 1638468958

but in testing the test repo above, it fetches:

% bazel version
Bazelisk version: v1.5.0
Build label: 5.0.0
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Jan 19 14:08:54 2022 (1642601334)
Build timestamp: 1642601334
Build timestamp as int: 1642601334

Rules_python version:

0.8.0

Anything else relevant?

Can be hacked around with this patch applied to 0.8.0 of rules_python:

diff --git a/python/pip_install/repositories.bzl b/python/pip_install/repositories.bzl
index 352f663..22e9e37 100644
--- a/python/pip_install/repositories.bzl
+++ b/python/pip_install/repositories.bzl
@@ -29,8 +29,8 @@ _RULE_DEPS = [
     ),
     (
         "pypi__pkginfo",
-        "https://files.pythonhosted.org/packages/cd/00/49f59cdd2c6a52e6665fda4de671dac5614366dc827e050c55428241b929/pkginfo-1.8.2-py2.py3-none-any.whl",
-        "c24c487c6a7f72c66e816ab1796b96ac6c3d14d49338293d2141664330b55ffc",
+        "https://files.pythonhosted.org/packages/77/83/1ef010f7c4563e218854809c0dff9548de65ebec930921dedf6ee5981f27/pkginfo-1.7.1-py2.py3-none-any.whl",
+        "37ecd857b47e5f55949c41ed061eb51a0bee97a87c969219d144c0e023982779",
     ),
     (
         "pypi__setuptools",
@lazcamus lazcamus changed the title tensorflow wheel incompatible with rules_pyenv 0.8.0 tensorflow wheel incompatible with rules_pyenv 0.8.0 seemingly due to pkginfo 1.8.2 upgrade Mar 22, 2022
@thundergolfer
Copy link
Collaborator

thundergolfer commented Mar 23, 2022

Thanks for your investigation @lazcamus.

Also seeing issues in a repo when attempting to upgrade to 0.8.0:

...
Building wheels for collected packages: alembic, click-default-group, future, ipdb, langdetect, logbook, mashumaro, minimal-snowplow-tracker, pandocfilters, pathtools, promise, pynndescent, pyperclip, python-etcd, shap, subprocess32, termcolor, wrapt, aiobotocore
  Building wheel for alembic (setup.py): started
...

Even if building all these wheels succeeded the behavior would be undesirable, but some wheel builds fail:

...
1a792c0160ef8055c34/.eggs/numpy-1.22.3-py3.9-macosx-10.6-x86_64.egg/numpy/core/include -c shap/cext/_cext.cc -o build/temp.macosx-10.6-x86_64-3.9/shap/cext/_cext.o
      clang-11: warning: argument unused during compilation: '-fno-strict-overflow' [-Wunused-command-line-argument]
      In file included from shap/cext/_cext.cc:5:
      shap/cext/tree_shap.h:8:10: fatal error: 'algorithm' file not found
      #include <algorithm>
               ^~~~~~~~~~~
      1 error generated.
      error: command '/nix/store/a2sz5pfcilnp7hw9hhqipvs3rvq3s59v-clang-wrapper-11.1.0/bin/clang' failed with exit code 1
      [end of output]
...

Upgrading to just 0.6.0 removes the issue on my end, which accords with your diagnosis.

@thundergolfer
Copy link
Collaborator

thundergolfer commented Mar 23, 2022

pkginfo 1.8.x parsing incorrect metadata from tensorflow wheel? -> https://bugs.launchpad.net/pkginfo/+bug/1953227

Tensorflow embeds a tensorflow/include/external/gast_archive/PKG-INFO file which the bug reporter thinks is confusing the newest version of pkginfo.

I've followed their reproduction instructions and can get the error result: pkginfo returns info about gast when you pass it the Tensorflow wheel.

@thundergolfer
Copy link
Collaborator

thundergolfer commented Mar 23, 2022

So we have the problem that version 1.8.0 of pkginfo is the minimum version that supports Python 3.10, but a bug intro'd in that minor version means pip_install doesn't work with a pretty major Python package, Tensorflow. It also potentially doesn't work with other packages.

cc @UebelAndre

@lazcamus
Copy link
Contributor Author

So we have the problem that version 1.8.0 of pkginfo is the minimum version that supports Python 3.10, but a bug intro'd in that minor version means pip_install doesn't work with a pretty major Python package, Tensorflow. It also potentially doesn't work with other packages.

cc @UebelAndre

IWFM YMMV etc etc, but I'm "successfully" using python 3.10.2 with rules_pyenv 0.8.0 and the pkginfo 1.7.1 downgrade patch I included above. My repo builds docker images that consume python wheels, but it doesn't build any wheels itself.

@UebelAndre
Copy link
Contributor

Sounds reasonable to roll back pkginfo if this is blocking the use of major packages. I'd probably repurpose this issue to track what work would be required to update it again.

@thundergolfer
Copy link
Collaborator

rules_pyenv

Referring to https://github.com/digital-plumbers-union/rules_pyenv? Does that project depend on rules_python's pkg install rules?

We can double check, but if 3.10 is actually feasible without pkginfo 1.8.0 then it's preferable to rollback to 1.7.0 as @UebelAndre suggests.

@groodt groodt mentioned this issue May 14, 2022
12 tasks
@groodt
Copy link
Collaborator

groodt commented May 15, 2022

I can confirm that this PR fixes the issue: #700
(we had an identical issue with tensorflow at $dayjob)

@groodt
Copy link
Collaborator

groodt commented May 23, 2022

@lazcamus Can you try with latest code on main? I believe this is now fixed.

@laz-
Copy link

laz- commented May 25, 2022

I can confirm that it's fixed on HEAD. Thanks @groodt :)

@laz-
Copy link

laz- commented May 25, 2022

It looks like I don't have the permissions to close the issue ... can someone with the appropriate powers mark this as fixed?

@mattem
Copy link
Collaborator

mattem commented May 25, 2022

Fixed via #700

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants