Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[24.2] pip check flags packages with bad WHEEL metadata #12884

Open
1 task done
filbranden opened this issue Jul 30, 2024 · 8 comments
Open
1 task done

[24.2] pip check flags packages with bad WHEEL metadata #12884

filbranden opened this issue Jul 30, 2024 · 8 comments
Labels
C: check Checking dependency graph for consistency type: enhancement Improvements to functionality

Comments

@filbranden
Copy link

Description

Originally reported here

We ran into issues with this new pip check feature in some packages we push.

First three are on Python 3.10 on Linux

  • catboost-1.1.1.dist-info/WHEEL lists no Tag's:
Wheel-Version: 1.0
Root-Is-Purelib: false
  • xgboost-1.6.1.dist-info/WHEEL has Tag's for cp39-cp39-manylinux_2_17_x86_64 and cp39-cp39-manylinux2014_x86_64, but looking at the list in get_supported() Python 3.10 only lists cp39 with abi3 as the second component, it has cp39-abi3-manylinux_2_17_x86_64 and cp39-abi3-manylinux2014_x86_64 which do not match exactly. Contents of xgboost-1.6.1.dist-info/WHEEL below:
Wheel-Version: 1.0
Generator: bdist_wheel (0.37.1)
Root-Is-Purelib: false
Tag: cp39-cp39-manylinux_2_17_x86_64
Tag: cp39-cp39-manylinux2014_x86_64
  • ninja-1.11.1.1.dist-info/WHEEL has a newline above the Tag's, which makes email.parser used in pip return no Tag's since it's expecting no blank lines between header lines:
Wheel-Version: 1.0
Generator: skbuild 0.17.6
Root-Is-Purelib: false

Tag: py2-none-manylinux1_x86_64
Tag: py2-none-manylinux_2_5_x86_64
Tag: py3-none-manylinux1_x86_64
Tag: py3-none-manylinux_2_5_x86_64
  • We also encountered this issue with the extra blank line on frozendict-2.3.8.dist-info/WHEEL on a Python 3.11 setup:
Wheel-Version: 1.0
Generator: bdist_wheel (0.40.0)
Root-Is-Purelib: true

Tag: py311-none-any

I understand some of these could be blamed on the packages and how they were built, but it's still unfortunate that we'll start getting pip check warnings for these, so I thought I would report my findings here. (Also, it was not very easy to troubleshoot the issue, essentially I had to reproduce the commands in this PR to understand what was really going on, since there was no useful output or debug logging to help understand the breakage.)

Thank you!

Expected behavior

Some suggestions:

  • If tag list is empty, consider that as a "no information available" rather than "this supports no platform"
  • Be more lenient with parsing, in a way that blank lines in the WHEEL file are ignored
  • Consider matching in a way that equivalent platforms (such as cp39-cp39-manylinux_2_17_x86_64 and cp39-abi3-manylinux_2_17_x86_64, assuming these are indeed equivalent) would compare the same

And it would be useful to preserve information that might be useful for debugging, perhaps as a separate debug log (dumping all supported platform information in the output would be too much, for sure.)

Of course this is easier said than done, but I still wanted to be able to try and give constructive suggestions here.

pip version

24.2

Python version

3.10

OS

Linux (Debian Bullseye)

How to Reproduce

Run pip check including the packages mentioned above

Output

catboost 1.1.1 is not supported on this platform
ninja 1.11.1.1 is not supported on this platform
xgboost 1.6.1 is not supported on this platform
frozendict 2.3.8 is not supported on this platform

Code of Conduct

@filbranden filbranden added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Jul 30, 2024
@ichard26 ichard26 added type: enhancement Improvements to functionality C: check Checking dependency graph for consistency and removed type: bug A confirmed bug or unintended behavior S: needs triage Issues/PRs that need to be triaged labels Jul 31, 2024
@ichard26
Copy link
Member

Thanks for the report, apologies that this is causing issues for you.

If tag list is empty, consider that as a "no information available" rather than "this supports no platform"

This sounds reasonable enough. AFAIU, this would also fix situations where a newline precedes the tag headers 👍

Be more lenient with parsing, in a way that blank lines in the WHEEL file are ignored

I can sympathize, but our general stance is that pip should be stricter over lenient when dealing with metadata. As pretty much everyone and their dog uses pip (directly or indirectly), "whatever pip accepts" often becomes the defacto standard, even if it violates the actual standards. By being stricter, other consumers can follow the specifications and expect that it should "just work". Evidently, packaging metadata is still not quite there so pip continues to be lenient in some portions, but we are slowly transitioning towards being stricter. The goal of the PR wasn't to make pip stricter, but it is a natural consequence and it matches with our overall goals.

Consider matching in a way that equivalent platforms (such as cp39-cp39-manylinux_2_17_x86_64 and cp39-abi3-manylinux_2_17_x86_64, assuming these are indeed equivalent) would compare the same.

I'm confused, in the ninja example, the cp39-cp39-manylinux_2_17_x86_64 and cp39-cp39-manylinux2014_x86_64 tags are not supported on a CPython 3.10 build. The CPython ABI is not stable between minor releases:

CPython’s Application Binary Interface (ABI) is forward- and backwards-compatible across a minor release (if these are compiled the same way; see Platform Considerations below). So, code compiled for Python 3.10.0 will work on 3.10.8 and vice versa, but will need to be compiled separately for 3.9.x and 3.11.x.

get_supported() rightfully only includes the stable abi3 for any CPython version that isn't 3.10.

(Also, it was not very easy to troubleshoot the issue, essentially I had to reproduce the commands in this PR to understand what was really going on, since there was no useful output or debug logging to help understand the breakage.)

Suggestions to improve the warning wording would be welcome, although any tag specific logic would likely be nontrivial (although honestly, such logic would be needed anyway if we wanted to raise nicer errors when the user asks pip to install a local but unsupported wheel).

@ichard26
Copy link
Member

ichard26 commented Jul 31, 2024

By the way, if anyone wants to play around with pip's logic that emits this warning, this is an extracted version of it:

from email.parser import Parser
from functools import reduce
from pip._internal.utils.compatibility_tags import get_supported
from pip._vendor.packaging.tags import parse_tag

supported = get_supported()
wheel = """
Wheel-Version: 1.0
Generator: bdist_wheel (0.37.1)
Root-Is-Purelib: false
Tag: cp39-cp39-manylinux_2_17_x86_64
Tag: cp39-cp39-manylinux2014_x86_64
""".lstrip()
wheel_tags = reduce(
    frozenset.union,
    map(parse_tag, Parser().parsestr(wheel).get_all("Tag", [])),
    frozenset(),
)
print("specified tags", wheel_tags)
print("matching tags:", wheel_tags.intersection(supported))
print("supported:", not wheel_tags.isdisjoint(supported))

@pradyunsg
Copy link
Member

pradyunsg commented Jul 31, 2024

TL;DR: I think the things being flagged are correctly flagged as issues, and improving messaging around these seems like a good idea.

If tag list is empty, consider that as a "no information available" rather than "this supports no platform"

This sounds reasonable enough. AFAIU, this would also fix situations where a newline precedes the tag headers 👍

I strongly disagree. I don't think that would be the right thing to do... The wheel spec states what this is supposed to be:

Tag is the wheel’s expanded compatibility tags; in the example the filename would contain py2.py3-none-any.

So... I think it's correct for pip to flag this as bad -- it could probably be a better message, saying "declared no tags" instead but it should absolutely be getting flagged by pip check IMO.

Be more lenient with parsing, in a way that blank lines in the WHEEL file are ignored

I don't think this would be the right thing to do -- the WHEEL file has a well-specified format and if a package deviates from it, it's 100% OK for pip to not be happy with it (non-compliant inputs should be flagged as garbage). I do think this should result in a "This WHEEL file is wonky" message though rather than somehow trying to "repair"/accomodate for the bad structure.

Consider matching in a way that equivalent platforms (such as cp39-cp39-manylinux_2_17_x86_64 and cp39-abi3-manylinux_2_17_x86_64, assuming these are indeed equivalent) would compare the same

Flagging such incompatibility/inconsistencies was the reason that this feature was introduced IIUC ("tell me about incompatible packages based on their recorded metadata"). @ichard26 used more words to add additional context above about how these tags are absolutely not equivalent and that this is the correct behaviour.


from pip._internal.utils.compatibility_tags import get_supported
from pip._vendor.packaging.tags import parse_tag

I wanna call out that this is importing from _ prefixed namespaces that have no guarentees around stability or compatibility. They're not public interfaces that you should rely on (although using them for a one-off analysis is fine obviously).

@pradyunsg pradyunsg changed the title pip check supported platform check from pip 24.2 fails on some packages with inaccurate/incomplete/malformed wheel metadata [24.2] pip check flags packages with bad WHEEL metadata Jul 31, 2024
@pradyunsg
Copy link
Member

pradyunsg commented Jul 31, 2024

Retitled so that I can read this because the old title was too long for my brain to parse. 🙈

@ichard26
Copy link
Member

ichard26 commented Aug 1, 2024

Thinking about this more after @pradyunsg's reply, I retract my earlier suggestion. I'm on board to emit a specific message if no tags are declared in the wheel metadata. It's logically more consistent with my later point sooo.. yeaa, I'm a hypocrite.

@ssbarnea
Copy link
Contributor

ssbarnea commented Aug 1, 2024

Apparently one such, very popular, package that causes this is ruamel.yaml.clib (direct dependency of ruamel.yaml).

$ pip install --force ruamel.yaml.clib
Collecting ruamel.yaml.clib
  Downloading ruamel.yaml.clib-0.2.8-cp312-cp312-manylinux_2_24_aarch64.whl.metadata (2.2 kB)
Downloading ruamel.yaml.clib-0.2.8-cp312-cp312-manylinux_2_24_aarch64.whl (643 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 643.9/643.9 kB 29.2 MB/s eta 0:00:00
Installing collected packages: ruamel.yaml.clib
  Attempting uninstall: ruamel.yaml.clib
    Found existing installation: ruamel.yaml.clib 0.2.8
    Uninstalling ruamel.yaml.clib-0.2.8:
      Successfully uninstalled ruamel.yaml.clib-0.2.8
Successfully installed ruamel.yaml.clib-0.2.8

$ pip check
ruamel.yaml.clib 0.2.8 is not supported on this platform

$ uname -a
Linux u1 6.8.0-39-generic #39-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul  6 02:50:39 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

References:

@danielhollas
Copy link

+1 on making the errors more specific to aid debugging. As a further datapoint, we've hit this in the conda-forge ecosystem, the failing packages had an incorrect (overly restrictive) Python tag, e.g. cp37 instead of py3. Simply rebuilding them fixed the issue. So in this case, a more specific message that the Python version does not match what was expected would be nice).

I don't think this would be the right thing to do -- the WHEEL file has a well-specified format and if a package deviates from it, it's 100% OK for pip to not be happy with it (non-compliant inputs should be flagged as garbage). I do think this should result in a "This WHEEL file is wonky" message though rather than somehow trying to "repair"/accomodate for the bad structure.

Sorry for asking a perhaps obvious question, but is the same validation done when uploading to PyPI?

@Davros666
Copy link

The writer of ninja.1.1.1.1.1.1.1.1 may be a Russian who was killed in the war in Ukraine and so now is unable to edit the extra \n out of his wheel file, except via a ouija board.
Very sad.
Perhaps someone should fix this for him without the need for spirit-guides and etherial keyboards that plug into the USB socket often found on modern gravestones?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: check Checking dependency graph for consistency type: enhancement Improvements to functionality
Projects
None yet
Development

No branches or pull requests

6 participants