-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PEP 716: Normalization of Project Names in Metadata and Filenames #3171
base: main
Are you sure you want to change the base?
Conversation
Y'all are gonna get sick of me soon, but here's another one :) |
We'll just reserve the rest of the 7xx series for you ;) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some typos via https://github.com/python/peps/pull/3171/checks#step:5:93, you can also run make spellcheck
locally.
pep-0716.rst
Outdated
optional spelling of ``_``, which would normalize to ``-``. | ||
3. :pep:`503` was accepted, which specified a normalization of the project name | ||
*when* querying the Simple API for a project. | ||
4. The spec for ``.dist-info`` required normalization ofthe name (using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4. The spec for ``.dist-info`` required normalization ofthe name (using the | |
4. The spec for ``.dist-info`` required normalization ofthe name (using the |
that are not normalized, and thus instructs tools to expect ``.dist-info`` | ||
directories with unnormalized values, but that all tools must write normalized | ||
values going forward. | ||
5. It was `noted <https://discuss.python.org/t/5605>`__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5. It was `noted <https://discuss.python.org/t/5605>`__ | |
5. It was `noted <https://discuss.python.org/t/5605>`__ |
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
I've committed most of the suggestions.. I think the remaining ones are just adding line breaks between entries in the numbered list? If y'all think that makes it clearer I'll add them, but the suggestions only did some of them, so I think it makes sense to do either some of them or none of them? Otherwise it feels weird to me to have half of them with extra line breaks, and half not. WDYT? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Standard reminder: You can directly apply all the suggestions you want in one go by going to Files changed
-> Clicking Add to batch on each suggestion -> When done, clicking Commit
Top-level comments:
- Might be consistent to mention that RFC 2119 terms are used here too, like you do in PEP 717
- In the spec, missing a clear definition or at least a reference to what the actual "normalization" being mandated or prohibited is (i.e.
:ref:
packaging:name-normalization`). - Seems like it might be at least worth mentioning whether there will be security implications here, since normalization or lack thereof can have security-relevant properties in terms of dependency confusion, etc.
- Per the above discussion, adding spaces between list items for lists that have multi-paragraph items is both clearer and is syntactically required in some cases (or the rendered lists do not have the correct spacing, as you can see in the PEP preview), so I suggest doing so in a commit (at least previously, whitespace around GitHub suggestions tends to get stripped when applying them, so I would not try to apply Hugo's suggestions and instead just do this locally.
The argument for changing filename normalization rules once again and mandating filenames not be normalized seems rather weak to me relative to the benefits, but I'll save that for the discussion thread, of course.
Title: Normalization of Project Names in Metadata and Filenames | ||
Author: Donald Stufft <donald@stufft.io> | ||
PEP-Delegate: Paul Moore <p.f.moore@gmail.com> | ||
Discussions-To: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Standard reminder to update Discussions-To
and Post-History
with the PEP discussion thread created just after merging this
Abstract | ||
======== | ||
|
||
This PEP standardizes on where and when project names should and should not be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PEP standardizes on where and when project names should and should not be | |
This PEP standardizes where and when project names should and should not be |
Elide unnecessary word
Motivation | ||
========== | ||
|
||
Historically there was effectively little to no requirements on the valid values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Historically there was effectively little to no requirements on the valid values | |
Historically, there was effectively little to no restriction on the valid values |
Fix grammar errors (agreement)
|
||
Over the intervening years, various PEPs have ratcheted down various pieces of | ||
metadata such as version numbers (:pep:`440`), filenames in bdists (:pep:`427`), | ||
names in the Simple API (:pep:`503`), and filenames in sdists (:pep:`625`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
names in the Simple API (:pep:`503`), and filenames in sdists (:pep:`625`). | |
names in the Simple Repository API (:pep:`503`), and filenames in sdists (:pep:`625`). |
Specify the actual name of the API, and be clear at a glance what API it actually is
names in the Simple API (:pep:`503`), and filenames in sdists (:pep:`625`). | ||
|
||
Unfortunately, a complex interaction between these various standards *and* | ||
changes made to the specifications without an associated PEP, have created a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes made to the specifications without an associated PEP, have created a | |
changes made to the specifications without an associated PEP have created a |
Stray comma
* Tools that are currently emitting the names in the simple API (outside of the URL | ||
itself) as normalized, which is either allowed or required by the spec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Tools that are currently emitting the names in the simple API (outside of the URL | |
itself) as normalized, which is either allowed or required by the spec | |
* Tools that are currently emitting normalized project names in the Simple Repository API | |
(outside of the URL itself), which is either allowed or required by the spec |
- Clarify hard to parse and potentially-confusing awkward phrasing
- Use the same full, title-cased form of the Simple Repo API name as used elsewhere, for clarity and consistency
|
||
* Tools that are currently emitting the names in the simple API (outside of the URL | ||
itself) as normalized, which is either allowed or required by the spec | ||
currently are immediately not longer complaint and must be updated to emit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently are immediately not longer complaint and must be updated to emit | |
currently are immediately no longer complaint and must be updated to emit |
Fix typo
Require Normalization Everywhere | ||
-------------------------------- | ||
|
||
One other possible idea is to simply require normalization everywhere, however |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One other possible idea is to simply require normalization everywhere, however | |
One other possible idea is to simply require normalization everywhere; however, |
Fix grammar error (comma splice)
Require Normalization in Filenames | ||
---------------------------------- | ||
|
||
Filenames sit in a weird place, in most cases they are produced by software |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filenames sit in a weird place, in most cases they are produced by software | |
Filenames sit in a weird place; in most cases they are produced by software |
Fix another comma splice grammar error
|
||
Although they are often a software-to-software identifier, they are also used by | ||
humans when sharing and manually downloading the software. They appear in places | ||
like the PyPI UI, GitHub Releases, downstream Linux repositories, etc. In some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downstream Linux repositories? Huh? Linux distros as a rule don't expose to users files straight off PyPI, they repack them into their own archive format following their own (nearly always normalized) naming convention. Furthermore, AFAIK most if not nearly all have naming and normalization policies similar or substantially stricter than PEP 503's for the canonical user-facing name of their packages, not just the file names. Therefore, I'm rather confused how this is relevant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the wheel filename was explicitly designed to be for machine consumption, so saying it's not a software-to-software identifier is misrepresenting the intention of the design. (Thanks to compatibility tags, no-one would say wheel filenames are for human consumption 😉)
IMO, this whole argument is the weakest point of the whole PEP, and the overall proposal would be changed very little if it was changed to require normalisation of the project name in wheel and sdist filenames, but was left otherwise unchanged. If that isn't actually the case, then this is the section that should explain what the actual problem with normalising is. That would be a far better rejection reason than the current (frankly, largely subjective) argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, entirely agreed—I'd briefly mentioned it in my top-level review, but particularly as a PEP editor decided to save my own detailed commentary on this for the PEP discussion thread. However, as it ended up getting discussed at some length on Barry's thread that kicked off this PEP, it sounds like Donald mentioned there that he'd be revising this accordingly (hopefully to require normalization, which seemed to be the general consensus, rather than just allow it).
Yeah, currently only some of them have breaks so looks uneven: https://pep-previews--3171.org.readthedocs.build/pep-0716/#motivation My suggestions were to add the missing ones between 3-4 and 5-6, so they're all consistent. Either that, or remove them from the others. (It's a big list so I favour extra space.) |
Rationale | ||
========= | ||
|
||
This PEP follows two guiding principals: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PEP follows two guiding principals: | |
This PEP follows two guiding principles: |
|
||
Tools that read an arbitrary ``.dist-info`` directory **MUST** be prepared to | ||
accept unnormalized values, however tools that work only on *new* ``.dist-info`` | ||
directories **SHOULD** validate that all values are normalized. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not grammar, but rather a point of semantics (which may need to be raised on Discourse, but if I don't mention it now, I'll forget 🙂). I'd prefer this to be MAY. We should never insist (even mildly) on validation from consumers. Suggesting it is fine.
respectively). | ||
|
||
The ``name`` field **MUST** be non-normalized, with the exception that any ``-`` | ||
**MUST** be escaped to be ``_``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to object to this on Discourse... At a minimum, there should be an explanation here of why you propose this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I have an update to this PEP that changes the proposal, this was my first cut, and I didn't get a chance to publish it before the direction of the conversation changed. So hold off till I get an update written for this :)
|
||
Tools that accept an arbitrary distribution **MUST** be prepared to accept both | ||
non-normalized and normalized filenames. However, tools that only work on *new* | ||
distributions **SHOULD** validate that the distribution filenames are not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, MAY rather than SHOULD IMO.
Simple Repository API | ||
--------------------- | ||
|
||
The project name, when returned in the "index" URL (e.g. ``/simple/``) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should clarify where exactly, as I had to check PEP 503 to understand it. Also "in the index URL" is confusing.
The project name, when returned in the "index" URL (e.g. ``/simple/``) | |
The project name, when used in the *text* of the anchor tag in the page served at the root URL (e.g. ``/simple/``) |
The project name when used in the URL (e.g. ``/simple/$project/``) **MUST** be | ||
normalized. | ||
|
||
The project name, when used on the Project detail page |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the HTML version of the index, the project name is never used (except in any additional content not covered by the spec). So this requirement can be omitted - unless you want to cover the JSON form as defined in PEP 691, in which case
The project name, when used on the Project detail page | |
The project name, when referenced in the ``"name"`` field of the JSON format project detail page |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Fixed mistaken single backticks in the above comment)
* Tools that are currently emitting the names in the simple API (outside of the URL | ||
itself) as normalized, which is either allowed or required by the spec | ||
currently are immediately not longer complaint and must be updated to emit | ||
non-normalized names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PEP should probably list which tools this is known to affect. After reading the PEP to here, I'm sufficiently overwhelmed as to be unsure whether this is referring to PyPI or Artifactory, or some other index software.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression here was that this specific bit referred to neither, and rather referred to installers using the normalized rather than the unnormalized name in user-facing UI text—but clearly, given there is evident confusion here it would be good to clarify appropriately.
Fixed the merge conflict, but there's a lot of comments above to address. |
Basic requirements (all PEP Types)
pep-NNNN.rst
), PR title (PEP 123: <Title of PEP>
) andPEP
headerAuthor
orSponsor
, and formally confirmed their approvalAuthor
,Status
(Draft
),Type
andCreated
headers filled out correctlyPEP-Delegate
,Topic
,Requires
andReplaces
headers completed if appropriate.github/CODEOWNERS
for the PEPStandards Track requirements
Security ImplicationsHow to Teach ThisReference ImplementationOpen IssuesPython-Version
set to valid (pre-beta) future Python version, if relevantDiscussions-To
andPost-History
📚 Documentation preview 📚: https://pep-previews--3171.org.readthedocs.build/pep-0716/