-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metadata backend with importlib.metadata #10709
Conversation
8c7e867
to
73b236e
Compare
Some thoughts after playing with this during the weekend. The implementation we have right now actually works quite well—only two test failures remaining (discussed below) and I’d say this already works for all regular, modern usages. The two tests are failing because importlib.metadata does not support Any thoughts on this? I think I can produce an implementation that passes all tests with one or two more weekends of work, although I can’t guarantee when those weekends might happen (I think I’m free next weekend but things don’t always work out). |
4e25f68
to
ce08fc5
Compare
I added egg “support” by falling back to the old backend. I still feel we likely should deprecate and remove discovering egg presence alltogether (it’s not used anywhere in any mainstream environments, from what I know), but this should ease migration. The next step is probably to design feature flags for this; I’m thinking in the first phase we should
|
I think this is reviewable. How the feature flags should be designed can be discussed (or even implemented) separately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. I've mostly skimmed the meaty parts of this PR, and the overall structure looks good to me!
I do think it'd be useful to encode some of our assumptions as assertions in this code, both because (a) it'll be something that would get checked when we run the code, (b) ensure that there's an explicit failure when these assumptions aren't valid and (c) encode these assumptions so that they're explicitly communicated rather than being in comments or whatnot. This likely only makes sense for the "opt-in-to-use" phase of this, but I do think there's still a whole lot of value in doing this with early adopters at least.
so we do this to avoid having to rewrite too many things. Hopefully we can | ||
eliminate this some day. | ||
""" | ||
return getattr(d, "_path", None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a warning message here, for when this doesn't get a PathDistribution
object? Keep a note to remove this, once we decide to expose this as the default, as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function does expect non-PathDistribution
objects though, and simply returns None for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @jaraco @warsaw @brettcannon for awareness about this hack.
Personally, I'm fine with this -- I don't imagine there being too many code changes in importlib-metadata, in the 3.11+ standard library. If there are, hopefully, we'll be able to deal with them in an expedient manner on pip's side. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer that pip determine why it cares about the "location" of the metadata and ask what interfaces would be appropriate for a package not on the file system to satisfy those needs. But for now, this hack is probably acceptable. Out of curiousity, what breaks if a Distribution doesn't have a _path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Distributions without _path
(in this implementation) won’t be able to be uninstalled (or upgraded, because upgrade in pip is just uninstall + install). That’s the only thing pip needs from the metadata location (aside from error messages in some edge cases).
ac085bf
to
658b826
Compare
Anyone fancy progressing this? I have a mind to start thinking about how we can start moving off LegacyVersion after this. |
658b826
to
7895a86
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a rebase, and a few code changes to pacify the linters.
This looks good to me, based on a desk review. Let's fix the linters and see what the CI says. |
@functools.lru_cache(maxsize=None) | ||
def select_backend() -> Backend: | ||
if os.environ.get("_PIP_METADATA_BACKEND_IMPORTLIB"): | ||
from . import importlib | ||
|
||
return cast(Backend, importlib) | ||
from . import pkg_resources | ||
|
||
return cast(Backend, pkg_resources) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a note for the future -- this is intended to be a temporary measure. We'll swap this out for a more "sane" selection logic in the future.
I'm not seeing any documentation on this change. Am I missing it or is it all internal? |
It’s internal and experiemental at this point. Also there should be no user-facing behavioural changes (except bugs) after we switch to this. A change log will be added when we make the switch. |
This matches pkg_resources's behavior. There are various places in the code base that assumes only one entry is returned for a package name, and not doing that would potentially cause a distribution in lower precedence path to override a higher precedence one, if the caller is not careful. Eventually we probably want to make it possible to see lower precedence installations as well (it's useful), but this is not the time.
This replaces importlib.metadata's parser and allows us to "properly" normalize extras as we need. It is not wrong for importlib.metadata to not normalize extras --- if extra normalization is standardized properly, packaging.markers should instead implement 'evaluate()' to properly normalize on comparison, instead of just doing a naive string equality check. Unfortunately, no-one has made a concrete effort to make that happen yet, so pip needs to do what it needs to do to keep things working.
A distutils installation is always "flat" (not in e.g. egg form), so if this distribution's info location is NOT a pathlib.Path (but e.g. zipfile.Path), it can never contain any distutils scripts.
846cf53
to
ee81f71
Compare
This doesn’t take a lot of code because we’ve done most work upfront. There are definitely optimisations available since importlib.metadata has a pretty different implementation, but this should make the most common cases “work”.
Likely far from perfect, I’m only submitting this as draft to run a new workflow against it to see how things currently stand.Better now.Currently the backend is only accessible via a private environment variable. Eventually I want to target the implementation to only 3.10 or later (because importlib.metadata API is quite unstable before that), expose behind the feature flag first, and slowly rolling it out to maybe 3.11 the earliest.