Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement PEP 639, Metadata 2.4 #16949

Merged
merged 23 commits into from
Nov 13, 2024
Merged

Implement PEP 639, Metadata 2.4 #16949

merged 23 commits into from
Nov 13, 2024

Conversation

ewdurbin
Copy link
Member

@ewdurbin ewdurbin commented Oct 22, 2024

Closes #16620

Depends on merge/release of pypa/packaging#828
TODO: Move to release of packaging with license-expression parser

Copy link
Member

@miketheman miketheman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking cool, and coming along nicely! I commented a bit inside.

One thing I'm curious about - at what point should we be looking to consolidate any file operations where we pop open the file and read stuff from it, so we only have to open/iterate once?

warehouse/packaging/models.py Outdated Show resolved Hide resolved
warehouse/forklift/legacy.py Outdated Show resolved Hide resolved
…rectory" from PEP 639

> The directory under which license files are stored in a project source tree, distribution archive or installed project. Also, the root directory that their paths recorded in the License-File Core Metadata field are relative to. Defined to be the project root directory for a project source tree or source distribution; and a subdirectory named licenses of the directory containing the built metadata— i.e., the .dist-info/licenses directory— for a Built Distribution or installed project.
@ewdurbin
Copy link
Member Author

ewdurbin commented Oct 23, 2024

One thing I'm curious about - at what point should we be looking to consolidate any file operations where we pop open the file and read stuff from it, so we only have to open/iterate once?

I thought about this a little and can imagine an approach of stashing a list of filenames when we iterate through to in _is_valid_dist_file, and fetching any files we actually need to read... at the moment that would only be .dist-info/METADATA. I wonder if a simple utility class wrapper that does this on __init__ would allow for us to make calls to get_wheel_metadata and contains or something would work... will poke a bit.

Edit: On second consideration... nearly all file uploads are dominated by either the advisory lock on journals or s3.putobject. Not sure this optimization is worth the effort at this point.

@ewdurbin
Copy link
Member Author

@befeleme would you be able to take a look at this to review the implementation details?

@befeleme
Copy link

@befeleme would you be able to take a look at this to review the implementation details?

Yes, hopefully tomorrow.

@befeleme
Copy link

After reading the code I've got just one question: I can't locate the validation of the License-Expression per https://peps.python.org/pep-0639/#add-license-expression-field ("[Python Package Index (PyPI) MUST validate that they contain a valid, case-normalized license expression with valid identifiers and MUST reject uploads that do not"). Or did I miss it?

@ewdurbin
Copy link
Member Author

After reading the code I've got just one question: I can't locate the validation of the License-Expression per https://peps.python.org/pep-0639/#add-license-expression-field ("[Python Package Index (PyPI) MUST validate that they contain a valid, case-normalized license expression with valid identifiers and MUST reject uploads that do not"). Or did I miss it?

That's implemented upstream in pypa/packaging, see pypa/packaging#828

@befeleme
Copy link

befeleme commented Oct 31, 2024

Ah, looking at the whole warehouse/forklift/metadata.py I see packaging validations are applied automatically and no additional changes regarding that are needed here. Makes sense.
I've got no other comments, it LGTM.

@ewdurbin ewdurbin marked this pull request as ready for review November 8, 2024 12:55
@ewdurbin ewdurbin requested a review from a team as a code owner November 8, 2024 12:55
Copy link
Member

@miketheman miketheman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay, we're almost there! I didn't re-review a lot of what I'd already covered, so I focused on the migration and testing, and looks like everything generally conforms to the PEP.

I'm thinking that tracking usage/progress of this can be deferred, but that might be a nice metric to emit via DB query later - I can imagine wanting to see new projects using the syntax, older projects converting to the new syntax, etc - but definitely beyond scope of this changeset.

Approving to you can decide when you want to run the migration, but please include some timeouts based on some guesswork from TestPyPI

warehouse/forklift/legacy.py Outdated Show resolved Hide resolved
warehouse/forklift/legacy.py Outdated Show resolved Hide resolved
@ewdurbin ewdurbin requested a review from di November 13, 2024 16:34
@ewdurbin
Copy link
Member Author

Re @miketheman:

I'm thinking that tracking usage/progress of this can be deferred, but that might be a nice metric to emit via DB query later - I can imagine wanting to see new projects using the syntax, older projects converting to the new syntax, etc - but definitely beyond scope of this changeset.

That is all technically captured in the immutable release metadata, so aside from watching a pot boil in real-time in datadog I agree adoption metrics are well beyond scope.

@ewdurbin
Copy link
Member Author

Migrations are going out for this ahead of merge during this window to avoid unnecessary disruption: http://status.python.org/incidents/q5q5347n3s2b

@ewdurbin ewdurbin merged commit 60187ca into main Nov 13, 2024
20 checks passed
@ewdurbin ewdurbin deleted the pep_639 branch November 13, 2024 19:32
@ofek
Copy link
Contributor

ofek commented Nov 13, 2024

Thanks everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement PEP 639
6 participants