-
Notifications
You must be signed in to change notification settings - Fork 994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Post release source dist gets grouped with stable release wheel #15749
Comments
Looks like we have a few of these: warehouse=> SELECT count(rf.filename)
FROM release_files rf
JOIN releases r ON rf.release_id = r.id
JOIN projects p ON r.project_id = p.id
WHERE rf.filename ILIKE '%post%'
AND p.name NOT ILIKE '%post%'
AND r.version NOT ILIKE '%post%';
count
-------
674
(1 row) They don't happen very often but have been happening relatively consistently: warehouse=> SELECT DATE_TRUNC('year', rf.upload_time) AS upload_year,
COUNT(*) AS file_count_per_year
FROM release_files rf
JOIN releases r ON rf.release_id = r.id
JOIN projects p ON r.project_id = p.id
WHERE rf.filename ILIKE '%post%'
AND p.name NOT ILIKE '%post%'
AND r.version NOT ILIKE '%post%'
GROUP BY upload_year
ORDER BY upload_year;
upload_year | file_count_per_year
---------------------+---------------------
2009-01-01 00:00:00 | 1
2010-01-01 00:00:00 | 2
2015-01-01 00:00:00 | 52
2016-01-01 00:00:00 | 24
2017-01-01 00:00:00 | 95
2018-01-01 00:00:00 | 115
2019-01-01 00:00:00 | 96
2020-01-01 00:00:00 | 139
2021-01-01 00:00:00 | 111
2022-01-01 00:00:00 | 20
2023-01-01 00:00:00 | 17
2024-01-01 00:00:00 | 2
(12 rows) We now take the version from the metadata for the file: warehouse/warehouse/forklift/legacy.py Line 647 in beb1f3f
If we look at the artifact in question, we see that the version in the metadata is wrong: https://inspector.pypi.io/project/docutils/0.21/packages/67/9a/ff2ff8e922f3b97c4b4864ca6c78d76ca5969bd730560001167b7054ac48/docutils-0.21.post1.tar.gz/docutils-0.21/PKG-INFO#line.3 So the issue is that we aren't rejecting files whose filename versions disagree with their internal metadata, which we can't do until we only accept normalized sdist filenames (PEP 625) because otherwise the filenames can be ambiguous. Given that, I think this is a duplicate of #12245, albeit with a nice real-world example of this happening. |
#12245 is about insisting that all sdists are pep-625 compliant, and looks semi-permanently blocked on the fear of breaking users who are uploading non-compliant sdists. but it would still have value today to write code that asked: if pep-625 compliant, does the version (and name) match the metadata? |
Is it possible to check after you know the package name, or is that what #12245 is about? Something like this: from packaging.version import Version, InvalidVersion
def validate(name: str, version: Version, filename: str) -> bool:
after_name = filename[len(name):].removesuffix(".tar.gz").removesuffix(".zip")
if not after_name.startswith("-"):
return False
file_version = after_name[1:]
try:
file_version = Version(file_version)
except InvalidVersion:
return False
return file_version == version
if __name__ == '__main__':
print(validate("docutils", Version("0.21"), "docutils-0.21.tar.gz"))
print(validate("docutils", Version("0.21"), "docutils-0.21.post1.tar.gz"))
# Not sensitive to name normalization ...
print(validate("flit_core", Version("0.1"), "flit-core-0.1.tar.gz"))
print(validate("flit_core", Version("0.1"), "flit_core-0.1.tar.gz"))
print(validate("flit_core", Version("0.1"), "flit.core-0.1.tar.gz"))
# ... or to version normalization
print(validate("flit_core", Version("0.1a1"), "flit-core-0.1.alpha1.tar.gz"))
print(validate("flit_core", Version("0.1a1"), "flit_core-0.1alpha1.tar.gz"))
print(validate("flit_core", Version("0.1a1"), "flit.core-0.1a1.tar.gz")) This is basically what we do in uv currently (https://github.com/astral-sh/uv/blob/8d721830db8ad75b8b7ef38edc0e346696c52e3d/crates/distribution-filename/src/source_dist.rs#L73-L157) |
just splitting on the last hyphen should be good enough to extract the version, even in the face of non-PEP625 distributions edit to expand on this a bit. Here is what PEP625 said (a few years ago but changes since then are likely to have been in the right direction):
so two possible approaches that are not blocked on #12245
|
@konstin it occurs to me that the code you posted might be wrong because normalization eg of It is not clear to me whether you are comparing normalized things throughout, I did not check whether this is a problem in |
You are correct, the name input should be the normalized name. I think our parsing in uv (which uses the normalized name) works in practice because we don't have source dists with more than one consecutive |
Looks like the docutils maintainers has recognized the issue, and have corrected it in 0.21.1 Another command that can show the actual rejection message: $ pip install --no-binary docutils "docutils>=0.21,<0.21.1"
Collecting docutils<0.21.1,>=0.21
Using cached docutils-0.21.post1.tar.gz (2.2 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Discarding https://files.pythonhosted.org/packages/67/9a/ff2ff8e922f3b97c4b4864ca6c78d76ca5969bd730560001167b7054ac48/docutils-0.21.post1.tar.gz (from https://pypi.org/simple/docutils/) (requires-python:>=3.9): Requested docutils<0.21.1,>=0.21 from https://files.pythonhosted.org/packages/67/9a/ff2ff8e922f3b97c4b4864ca6c78d76ca5969bd730560001167b7054ac48/docutils-0.21.post1.tar.gz has inconsistent version: expected '0.21.post1', but metadata has '0.21'
ERROR: Could not find a version that satisfies the requirement docutils<0.21.1,>=0.21 (from versions: 0.3, 0.3.5, 0.3.7, 0.3.9, 0.4, 0.5, 0.6, 0.7, 0.8, 0.8.1, 0.9, 0.9.1, 0.10, 0.11, 0.12, 0.13.1, 0.14rc1, 0.14rc2, 0.14, 0.15, 0.15.1.post1, 0.15.2, 0.16b0.dev0, 0.16rc1, 0.16, 0.17b1, 0.17, 0.17.1b1.dev0, 0.17.1, 0.18b1, 0.18, 0.18.1b0, 0.18.1, 0.19b1, 0.19, 0.20rc1, 0.20, 0.20.1, 0.21rc1, 0.21.post1, 0.21.1)
ERROR: No matching distribution found for docutils<0.21.1,>=0.21 Looks like PEP625 implementation is the answer here. |
just want to reiterate that real improvement could be made first, without being blocked on full PEP625 |
@dimbleby please send a PR for review, happy to take a look. |
can you point at roughly where such a check should be made? thanks |
@dimbleby Per the dev docs, you might want to look into forklift to start: https://warehouse.pypa.io/application.html#file-and-directory-structure |
This release is bodged, and causes Poetry to fail when it tries to fetch the tar: python-poetry/poetry#9293 (comment) It is being tracked: pypi/warehouse#15749 A fix is inbound here: pypi/warehouse#15795 This is a hotfix for this particular package, until the fixes from `pypi/warehouse` are out. The `post` notation is very unique: pypi/warehouse#15749 (comment) and is being used in 0.004% of the releases.
This release is bodged, and causes Poetry to fail when it tries to fetch the tar: python-poetry/poetry#9293 (comment) It is being tracked: pypi/warehouse#15749 A fix is inbound here: pypi/warehouse#15795 This is a hotfix for this particular package, until the fixes from `pypi/warehouse` are out. The `post` notation is very unique: pypi/warehouse#15749 (comment) and is being used in 0.004% of the releases.
This release is bodged, and causes Poetry to fail when it tries to fetch the tar: python-poetry/poetry#9293 (comment) It is being tracked: pypi/warehouse#15749 A fix is inbound here: pypi/warehouse#15795 This is a hotfix for this particular package, until the fixes from `pypi/warehouse` are out. The `post` notation is very unique: pypi/warehouse#15749 (comment) and is being used in 0.004% of the releases.
This release is bodged, and causes Poetry to fail when it tries to fetch the tar: python-poetry/poetry#9293 (comment) It is being tracked: pypi/warehouse#15749 A fix is inbound here: pypi/warehouse#15795 This is a hotfix for this particular package, until the fixes from `pypi/warehouse` are out. The `post` notation is very unique: pypi/warehouse#15749 (comment) and is being used in 0.004% of the releases.
This release is bodged, and causes Poetry to fail when it tries to fetch the tar: python-poetry/poetry#9293 (comment) It is being tracked: pypi/warehouse#15749 A fix is inbound here: pypi/warehouse#15795 This is a hotfix for this particular package, until the fixes from `pypi/warehouse` are out. The `post` notation is very unique: pypi/warehouse#15749 (comment) and is being used in 0.004% of the releases.
This version causes some issues with Poetry/Dependabot. See: python-poetry/poetry#9293 See: pypi/warehouse#15749 See: apache/iceberg-python#615
Describe the bug
The docutils 0.21 has two files (https://inspector.pypi.io/project/docutils/0.21/, https://pypi.org/project/docutils/0.21/#files):
The source distributions is part of the 0.21 release, while it's actually of version 0.21.post1. Pip agrees that this version is 0.21.post1:
Under the right circumstances, this can break uv too.
Expected behavior
Either
docutils-0.21.post1.tar.gz
has its own version 0.21.post1, or source dists with this naming scheme are rejected by pypi.To Reproduce
See description
My Platform
Ubuntu 22.04, Python 3.12
Additional context
n/a
The text was updated successfully, but these errors were encountered: