The metadata "homepage" link rendered twice on PyPI #11220
Comments
The issue is that we're getting both of these metadata fields from the distributions:
PEP 621 doesn't seem to explain what tools should do with the old It's also not clear to me how PyPI should expect to de-duplicate this. A
Produces this metadata:
So it looks like it's just choosing the last one, which seems confusing. FWIW,
Which IMO is the correct behavior here and probably what setuptools should be doing, instead of PyPI introducing workarounds for its behavior. CC @abravalheri for your thoughts. |
There is a discussion regarding this in:
The path I chose was maximum backward compatibility, so I purposefully decided to backfill the I agree with Dustin that in the case of multiple candidates choosing the last option might be confusing. Would it be better if instead we stick with the first? |
Regarding the deduplication, if you write:
[project.urls]
Homepage = "https://github.com/pypa/warehouse/issues/11220"
PyPI will display only one URL. @di in the case of
If the approach chosen by PyPI is to not do any de-duplication, does it also make sense to not automatically change the case of the word? |
I think the best thing would be for setuptools to just stop duplicating the field. I'm not sure what would be depending on maintaining this backwards compatibility, but it's not PyPI and nothing else comes to mind.
I'm not sure I understand your question! I don't think PyPI should do any modification to what the user has put in this field. I don't think tools should be either. |
I agree that PyPI should not be doing modifications to what has been provided to us. I think that as the core metadata for I think that there's a good argument to be made that I also think there is a good argument that the existing url fields that aren't As far as what is valid for setuptools to do here, tools are generally free to generate the metadata using their inputs as they see fit. PEP 621 defines a standard for how to generate some of those fields, from a specific shared input (in this case, 1:1 mapping with no transformation) but anything not covered by PEP 621.. isn't covered by PEP 621. Like if we remove PEP 621 from the equation, and someone wrote:
from setuptools import setup

setup(
    ...,
    project_urls={
        "Homepage": "...",
    },
)
Would it be valid for setuptools to backfill the Does it make sense for setuptools to do this? Eh, personally I don't think that field matters and having it duplicated probably confuses some people somewhere so I wouldn't bother if I were setuptools. I don't feel strongly about it personally though. I think it probably does make sense to deprecate the two old URL fields and in that case PyPI could just stop showing that field and setuptools can easily justify no longer emitting it. |
I understand and sympathise with the point of view you guys are exposing here. Probably in the future we will remove the duplication, but there are a few other changes that might need to happen first, before we reach that point. For example, If Footnotes
|
What is the current approach taken by PyPI for deduplication? Based on previous experiments, it seems that PyPI will deduplicate |
Regardless, this is not a real issue with this project, and In terms of the setuptools implementation, the answer would be: it's complicated. Right now, since core metadata is not technically incorrect when duplicating |
From what I can tell, in the database it will store the duplicated URLs since it's storing it as a row per entry, with a singular string column for When we load that data, our data model makes it available in two forms:
For actually sending that data to the end user, for the HTML pages and the JSON API we use the mapping in (2). In the XMLRPC API we return the list from (1).
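The difference between the two forms can be illustrated with a minimal sketch. It assumes each stored entry is a single "Label, URL" string, as in the core metadata Project-URL field. The function name `urls_as_mapping` and the example URLs are hypothetical, and this is not Warehouse's actual code.

```python
def urls_as_mapping(project_url_entries):
    """Collapse 'Label, URL' strings into a dict keyed by label.

    Assumption for illustration: when a label appears more than once,
    the later entry silently replaces the earlier one.
    """
    mapping = {}
    for entry in project_url_entries:
        label, _, url = entry.partition(",")
        mapping[label.strip()] = url.strip()
    return mapping


# Form (1): the raw list keeps every row, duplicates included.
entries = [
    "Homepage, https://example.com/a",
    "Homepage, https://example.com/b",
    "Docs, https://example.com/docs",
]

# Form (2): the mapping can only hold one URL per label.
mapping = urls_as_mapping(entries)
```

Under these assumptions a consumer of the list sees both "Homepage" rows, while a consumer of the mapping sees only one. That would be consistent with the HTML pages and JSON API looking de-duplicated while the XMLRPC API does not.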
It being deprecated or not isn't really important. It's not a required field. They're of course free to be stricter on what they require than the metadata spec requires, but per the spec itself it's perfectly valid to emit metadata without that field regardless of this issue. The only required fields are Name, Version, and Metadata-Version. |
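As a rough illustration of that point, the sketch below checks a PKG-INFO style document for the three required fields only, using the standard library email parser (core metadata uses RFC 822 style headers). The helper name `missing_required_fields` and the sample values are hypothetical.

```python
from email import message_from_string

# The three fields named above as required by the core metadata spec.
REQUIRED_FIELDS = ("Metadata-Version", "Name", "Version")


def missing_required_fields(pkg_info_text):
    """Return the required field names that are absent from the metadata."""
    msg = message_from_string(pkg_info_text)
    return [field for field in REQUIRED_FIELDS if msg.get(field) is None]


# Metadata without Home-page still passes this check.
minimal = "Metadata-Version: 2.1\nName: issue11220\nVersion: 0.1\n"
```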
Thank you very much for the explanation @dstufft, now it does make sense the behaviour I was observing with
This is more or less what is happening right now. Metadata will be emitted, but |
I was mostly pointing it out because I'm pretty sure those warnings date back to when at least Download-URL was a mandatory value, I don't recall if Home-page was mandatory or not. It being made optional was a (relatively speaking) recent change (8? years 10? years ago). |
Here is what I think setuptools should do:
Here is what I think distutils should do:
Here is what I think warehouse should do:
Maybe setuptools could patch their vendored distutils to not warn on missing Home-page, if it's easy enough, but that's not really particularly important (distutils sdist builds already spam ignorable warnings about a bunch of other stuff) Here is what I do in my build backend, if Home-page is duplicated in a Project-URL, then the Home-page gets kicked out. |
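The "kick out the duplicated Home-page" rule described above could look roughly like the following. This is a sketch under assumptions, not the commenter's actual build backend: it treats Home-page as duplicated when its URL exactly matches the URL part of any Project-URL entry, and the function name `drop_duplicate_home_page` is hypothetical.

```python
from email import message_from_string


def drop_duplicate_home_page(pkg_info_text):
    """Remove Home-page when the same URL already appears in a Project-URL."""
    msg = message_from_string(pkg_info_text)
    home_page = msg.get("Home-page")
    # Each Project-URL value has the form "Label, URL"; keep only the URL part.
    project_urls = [
        value.partition(",")[2].strip()
        for value in msg.get_all("Project-URL", [])
    ]
    if home_page is not None and home_page.strip() in project_urls:
        del msg["Home-page"]
    return msg


PKG_INFO = (
    "Metadata-Version: 2.1\n"
    "Name: issue11220\n"
    "Version: 0.1\n"
    "Home-page: https://example.com/\n"
    "Project-URL: Homepage, https://example.com/\n"
)
cleaned = drop_duplicate_home_page(PKG_INFO)
```

Comparing URLs rather than labels avoids having to decide which label spellings count as "the homepage", at the cost of missing near-duplicates such as a trailing slash difference.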
Note: In the Web UI PyPI does not have the concept of a "Homepage" link or a "Download" link. It has a Project URL mapping that gets rendered. PyPI will inject the Home-page and Download-URL metadata into that Project URL mapping at render time (but if the Project URLs contain "Homepage" or "Download", the injected values get overwritten). The JSON/XMLRPC APIs differentiate between project URLs and Home-page/Download-URL and should not munge data between them. Those APIs are to get access to the underlying data, and should faithfully reproduce the data as given. |
Yes, I understand. So my suggestion is just to extend the "injected values get overwritten" part to also overwrite with a project URL of homepage or home-page or hoMepAGE (any one of them, first, last, doesn't matter as long as it is deterministic). It should not also render the "homepage" project URL separately if that got used to overwrite. Happy to prepare a PR if we have some consensus on that 3rd bullet point.
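A minimal sketch of that suggestion follows. It is not Warehouse's rendering code. The function names and the normalization rule (lowercase, hyphens removed) are assumptions chosen so that "homepage", "home-page" and "hoMepAGE" all compare equal.

```python
def _normalize(label):
    """Hypothetical rule: compare labels case-insensitively, ignoring hyphens."""
    return label.lower().replace("-", "")


def urls_for_rendering(home_page, download_url, project_urls):
    """Build the mapping of links to render.

    Home-page and Download-URL are injected first. Any project URL whose
    normalized label matches an existing entry replaces it, so the same
    link is never rendered twice. The last matching label wins.
    """
    rendered = {}
    if home_page:
        rendered["Homepage"] = home_page
    if download_url:
        rendered["Download"] = download_url
    for label, url in project_urls.items():
        for existing in list(rendered):
            if _normalize(existing) == _normalize(label):
                del rendered[existing]
        rendered[label] = url
    return rendered
```

For example, calling `urls_for_rendering("https://a.example/", None, {"hoMepAGE": "https://b.example/"})` yields a single entry under the label the publisher wrote, instead of both an injected "Homepage" and a separate "hoMepAGE".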
Sorry to be contrary, but I think no, actually. Because the core metadata spec field is Home-page not Homepage. I think it should not backfill at all, but I could be convinced for it to backfill a Project-URL also called the Home-page into the metadata field Home-page. @abravalheri where does this backfill happen in setuptools? I was spelunking through the code but could not find it.. |
Hi @wimglenn. When I was collecting feedback about PEP 621 implementation, cryptic distutils warnings were constantly mentioned by the users. Right now, we avoid messages about the url corresponding to Please note that I am not against the changes proposed. I am just saying that the process is a bit more indirect and goes through either changing distutils first or creating the patches you propose. (It would also be helpful if there is consensus in the community and the My personal plan for the moment in terms of contributions to setuptools is to focus on other changes that I judge more urgent/important (unfortunately we all have to prioritize with the volunteering time we have available). So I invite anyone who is interested in pushing this issue forward on the setuptools/distutils side to propose the changes via code contributions. |
@abravalheri Thanks - was not aware of pypa/distutils, and just assumed that setuptools was vendoring the stdlib code. Good to know! I'll prepare a PR when I find the time, totally agree that it's not an urgent/important issue. |
…cified in the pkginfo twice (in the Home-page or Download-URL field and again in one of the Project-URL fields). closes pypi#11220
The first two PRs should fix the root cause |
…cified in the pkginfo twice (in the Home-page or Download-URL field and again in one of the Project-URL fields). closes pypi#11220 (pypi#11273)
With the most current tools, and following the recommendations of PEP 621 – Storing project metadata in pyproject.toml, package publishers end up with duplicate homepage links rendered on the UI: https://test.pypi.org/project/issue11220/
This is probably actually the fault of setuptools and/or the spec, but perhaps warehouse can work around it by ignoring duplicates? What do you think? Otherwise we might start seeing it everywhere as people adopt PEP 621 recommendations.. :-\
Note: The package was produced from running python -m build on this source, a stripped down version of the example in the PEP: