Allow Python distributors to add custom site install schemes #88142
Comments
As part of the distutils migration we plan to add a mechanism to let Python distributors add site install schemes. Currently, Python distributors are patching distutils to add custom install schemes for their packages. I think most of the reasoning boils down to them wanting to stop Python installers, such as pip, from modifying or interfering with their packages. With the distutils deprecation, and it becoming a 3rd party module, Python distributors can no longer patch it. Because of this, we made distutils use the sysconfig module instead, which fixes the issue for the moment (Python distributors can now patch sysconfig itself) but is not a long-term solution. The idea is that they have a config file, which they can pass to configure, and in that config file they can specify some extra install schemes. These install schemes will get added in sysconfig, and will be loaded during the site module initialization. In practice, it will look something like this: config.py
./configure --with-vendor-config=config.py |
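For context, the install schemes under discussion can be inspected with the public `sysconfig` API. The following is a minimal sketch that only reads the schemes an interpreter already knows about; it does not use the proposed `--with-vendor-config` mechanism, and the `describe_schemes` helper name is purely illustrative.

```python
import sysconfig


def describe_schemes():
    """Map each known install scheme name to its expanded purelib path."""
    result = {}
    for name in sysconfig.get_scheme_names():
        try:
            result[name] = sysconfig.get_path("purelib", name)
        except (KeyError, AttributeError):
            # Defensive guard (an assumption): a patched or unusual scheme
            # might lack "purelib" or reference an undefined variable.
            result[name] = None
    return result


if __name__ == "__main__":
    for scheme, purelib in sorted(describe_schemes().items()):
        print(f"{scheme:20} {purelib}")
```

On a distributor-patched interpreter the output would also list whatever extra schemes the distributor added, which is one quick way to see what a given Python build has customized.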
Any reason this couldn't be in sitecustomize.py? Either by poking values into sysconfig directly (for back-compat) or we train sysconfig to look inside sitecustomize for a well-known name. |
Making sysconfig look at sitecustomize seems like the wrong approach. It is behavior I would never expect, and there are use-cases where I still want the schemes to be present when the site module initialization is disabled. I would also argue that having this mechanism available will be useful for other things. |
Cross referencing the discussion: https://discuss.python.org/t/mechanism-for-distributors-to-add-site-install-schemes-to-python-installations/8467 |
I mean, you're literally customizing the site, so having it be done from sitecustomize doesn't seem terribly wrong. But I agree, I'd rather see the code in sitecustomize poke paths into sysconfig, rather than the other way around. The problem then would be that -S bypasses the path configuration entirely, which is likely going to point at non-existent paths. So yeah, for this case you need an override that isn't tied to the site module. Having a similar-but-different mechanism in sysconfig seems fine. I have a *slight* preference for non-executable code, mostly to avoid the risk of import hijacking, but it's only slight. |
FYI, I have changed the implementation to split the extra install schemes and extra schemes activated on site. This still makes sense over sitecustomize because we want the packages to be included in site.getsitepackages -- we want the vendor packages to essentially be the same as site-packages. I have also moved sysconfig._get_preferred_schemes to the vendor config, instead of asking distributors to patch sysconfig -- this is why I prefer having it as executable code, we customize using functions, etc. A config taking advantage of all these mechanisms should look like this:
Do you have any thoughts on this? |
Yes, I saw some of the latest changes in the PR. My biggest concern is with the bare "import _vendor_config", which I'd prefer to have restricted to a fixed location, rather than being influenced by environment variables and other options. We already have an issue with readline being imported from anywhere it can be found. A native flag to suppress it (i.e. something in sys.flags) could also become important for embedders, though it may matter more at a higher level (i.e. should an embedded CPython *ever* be using sysconfig? Probably not...). I wouldn't add a new flag for it right now, but I feel like sys.flags.isolated should probably imply that this should be ignored. Though then we hit the issue again that these patches are about changing the "safe default" behaviour, which is what you want to get back when you run with -S or -I. And I'm not totally sure how to resolve this. So basically, my concerns are:
|
Sorry for not getting to this sooner, but 5 days is really tight for such a change. With -S/-I, it would be great if sys.path only included packages installed as part of the OS, and not those installed by It seems that with the current patch, pip will install into site-packages and there's no way to disable/change site-packages. Is that the case? |
Oh, I share the same concern! Though users could already mess up Python pretty badly by shadowing/overwriting parts of it, so I didn't think it would be that big of an issue. Is there a way to achieve this while still allowing us to do everything we want?
No worries. It was my fault, I should have been more attentive to the Python release timeline.
Perhaps we could add an option to enable only vendor site schemes?
I mean, there is, though not as straightforward as -S/-I. I was planning on using it to build the distro entrypoint scripts, so that they only include the distro packages.
$ python -S
> site.addsitedir(sysconfig.get_path('purelib', 'vendor'))
> site.addsitedir(sysconfig.get_path('platlib', 'vendor'))
As I mentioned above, we could add a CLI flag to do essentially the same. |
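As a hedged illustration of what `site.addsitedir` does in the session above: it adds a directory to `sys.path` and processes any `.pth` files found in it. The sketch below uses a throwaway temporary directory in place of the proposed `vendor` scheme, which does not exist in an unpatched CPython; the directory and module names are made up.

```python
import importlib
import os
import site
import sys
import tempfile

# Stand-in for a distributor's package directory (hypothetical location).
vendor_dir = tempfile.mkdtemp(prefix="vendor-site-")

# Drop a tiny module into it so there is something to import.
with open(os.path.join(vendor_dir, "vendor_demo_mod.py"), "w") as fh:
    fh.write("ANSWER = 42\n")

# Same call as in the interactive session, with a real directory
# instead of sysconfig.get_path('purelib', 'vendor').
site.addsitedir(vendor_dir)
importlib.invalidate_caches()

import vendor_demo_mod  # resolved through the newly added sys.path entry

print(vendor_dir in sys.path)
print(vendor_demo_mod.ANSWER)
```

Running the same lines under `python -S` works as well, since importing `site` explicitly does not trigger the automatic site initialization that `-S` suppresses.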
The best option for restricting the import while still having it be a Python import is to find the file (if it's present in the expected location under sys.whatever), and then use importlib to import it: https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly I'd rather not have a new option here, I would much prefer "-S" in this context to mean "run Python with only core libraries" and "-s" to mean "run Python with only core and distro libraries" (and neither to mean "run Python with core, distro and user libraries"). That may be a bigger change, but there's enough angst around this issue that we would be better off getting it right this time, even if it changes things, than continuing to preserve the system that people dislike so much. |
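The importlib approach mentioned above could look roughly like the following sketch. It follows the documented "importing a source file directly" recipe; the `load_vendor_config` helper, the `_vendor_config.py` filename, the `EXAMPLE_SETTING` attribute, and the choice of directory are all assumptions for illustration, not what the PR implements.

```python
import importlib.util
import os
import tempfile


def load_vendor_config(directory, name="_vendor_config"):
    """Import <directory>/<name>.py by path, without consulting sys.path.

    Returns the module, or None when the file does not exist.
    """
    path = os.path.join(directory, name + ".py")
    if not os.path.isfile(path):
        return None
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    # The module is deliberately not registered in sys.modules here.
    spec.loader.exec_module(module)
    return module


# Demo: a temporary directory stands in for the fixed, trusted location.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "_vendor_config.py"), "w") as fh:
    fh.write("EXAMPLE_SETTING = 'demo'\n")

config = load_vendor_config(demo_dir)
missing = load_vendor_config(tempfile.mkdtemp())
print(config.EXAMPLE_SETTING, missing)
```

Because the file is located by an explicit path, a same-named module placed elsewhere on `sys.path` or injected through `PYTHONPATH` would not be picked up, which is the hijacking concern raised earlier in the thread.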
Perhaps what I'm suggesting here is that I don't see any reason for "sudo pip install ..." into a distro-installed Python to ever need to work, and would be quite happy for it to just fail miserably every time (which is already the case for the Windows Store distro of Python). Admin installed all-user packages is the expert scenario here, and can be as twisted as possible. Pip installed per-user packages and system-tool installed packages are the defaults, and the more easily those can be overridden by a file in the distro, the better. |
On 04.05.2021 22:07, Steve Dower wrote:
The "pip install" into a root environment approach is the standard way The pip warning about this kind of setup which apparently got added |
Would "pip install --user ..." in a Docker container also work, though? Presumably all the filesystem paths are being redirected anyway, so is there a difference? (My assumption is that "--user" would essentially become the default if you're using the OS provided pip/Python. If you do your own build/install of it then you obviously get "default" behaviour, for better or worse.) |
On 04.05.2021 22:29, Steve Dower wrote:
More modern Docker setups run the application itself under a non-root See eg. Zammad's Dockerfile: Not sure whether that answers your question, though. It's rather uncommon to install venvs inside Docker containers: one of the "pip install as root" will need to continue to work and thus distros Regarding the proposed solution: I'm not sure whether a new configure setuptools' distutils version (and other packages which ship distutils) |
Excuse my ignorance, but does "as root" imply that there's no user site-packages directory at all? I'm not imagining a solution that doesn't require *users* to change their commands, so if they're currently running "sudo pip install" because they need to, but we change it so they shouldn't, then I'm okay with them having to remove the "sudo". (At least for this discussion - we can evaluate transition plans separately.) And yeah, patching sysconfig.py seems easier. But then, adding a file to the distro is even easier, and if it's easiest for Linux distros to do that via configure than to add a copy step into their build (which is how I'll do it for Windows distros that need it), then I'll leave that to others to decide/implement. |
On 04.05.2021 22:58, Steve Dower wrote:
Why should there be no site-packages dir ? All non-core packages get However, distros usually split this up further into packages which are
I'm not sure I understand what you're suggesting. For Docker, the instructions from the Dockerfile are run as root, so
You mean: put something like...
from _sysconfig_site import *
install_sysconfig_site()
at the end of sysconfig.py and then have distros add a That would work as well, but details will have to be hashed out, since |
Steve is talking about user site-packages, not global site-packages directory. |
On 05.05.2021 10:01, Christian Heimes wrote:
You mean "pip install --user" as root ? That's not how you typically The typical Unix way of installing non-system packages is either As a root user, I'd assume that "pip install" also installs into |
I mean that Steve and you are talking about different things. Neither Steve, you, nor I are Linux distro packaging experts. I suggest that we listen to the expertise of downstream packagers like Filipe or Miro. They deal with packaging on a daily basis. By the way, you are assuming that all container solutions work like Docker and that all Docker and non-Docker based container solutions allow you to run code as unrestricted, unconfined root. That's a) incorrect, and b) off-topic for this ticket. |
On 05.05.2021 10:29, Christian Heimes wrote:
Could be. I was addressing the point Steve made about not allowing
Agreed.
I gave the Docker example as proof that running "pip install" as Linux distros have been supporting this for many years and just BTW: I'm aware that other container solutions work in different ways, |
We cannot change how I think we went a little off-topic here, so let's get back to the discussion.
Right, though that requires also a new import, importlib, which may not be optimal. Considering that this module is meant to be private and basically all other private importable parts of Python suffer from the same issue, I am finding it hard to justify. If there's enough consensus that this approach would be better, I am more than happy to change the implementation.
I don't think having an option to start Python with only the vendor modules would be *necessary*, though it would certainly be helpful. Among other things, it would be super helpful to be able to tell users to run Python with the -D (made up) option to isolate issues with the vendor modules and the user Python environment.
This may be completely wrong for other people, but it is my understanding. AFAIK these issues come from the lack of separation between the distro, system and user environments, causing a hell of conflicts and silent module shadowing that neither the system package manager nor pip can fix. Almost every time I help people with Python I have to tell them to use a virtual env, which most people aren't expecting, and they would likely run into issues had I not suggested it. But yeah, this is, of course, my experience, and that can vary for other people, so there may be different perspectives here. So I'd very much like to hear other people on this. |
Another alternative would be to convert sysconfig into a directory and make the vendor patch a submodule. That's _very slightly_ more impactful for the unpatched case, but only really for scenarios where people are trying to do things they shouldn't. Or we can include the file in all distros and import it earlier (before taking environment variables, etc. into account). In my opinion, the security implications alone suggest we shouldn't be importing this by name without knowing where it is coming from.
But the user can already exclude their user-installed packages with -s, right? It's the site-installed packages that would require -S, but that also excludes vendor modules. Why do we encourage users to install site-wide packages using pip? Why is it such an important scenario for a distro-provided Python to be able to modify its global install using non-distro-provided tools and non-distro-provided packages? What's wrong with saying "install for --user", or else "apt install some-different-python-bundle" first and use that? (To be clear, I'm framing these as confrontational questions to help my understanding. I'm totally willing to accept an answer of "just because", provided whoever is giving that answer actually "owns" dealing with the fallout.) |
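For reference, the existing switches mentioned here can be observed through `sys.flags`. This sketch spawns child interpreters to show which flags `-s`, `-S`, and `-I` set; it only demonstrates current CPython behavior, not the semantics proposed in this thread. The `flags_for` helper name is illustrative.

```python
import json
import subprocess
import sys

# Small probe executed in a child interpreter; it reports the relevant flags.
PROBE = (
    "import sys, json; "
    "print(json.dumps({'no_user_site': sys.flags.no_user_site, "
    "'no_site': sys.flags.no_site, "
    "'isolated': sys.flags.isolated}))"
)


def flags_for(*options):
    """Run the probe with the given interpreter options and parse its output."""
    out = subprocess.run(
        [sys.executable, *options, "-c", PROBE],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


for opts in ([], ["-s"], ["-S"], ["-I"]):
    print(opts, flags_for(*opts))
```

The default row can vary with the environment (for example, `PYTHONNOUSERSITE` also sets `no_user_site`), so only the rows with explicit options are stable across machines.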
Steve: I think the point of discussing whether "pip install" can So back to the original point... Filipe: Could you please explain why patching sysconfig.py is not a This doesn't involve any changes on the CPython side, is as flexible It's already clear that sysconfig.py will be the new golden source |
What is the probability that custom site install schemes will be supported without requiring a patch before distutils is removed? |
I find it very difficult. I would be a bit anxious merging the code this close to the final release, so I imagine the release managers would be much more. Anyway, this is the current state of this issue from my perspective: As far as I understand, progress is currently blocked because Matthias thinks this mechanism must allow Debian to replace all related downstream patches. I think that at this point, we could really use a core dev to help push this to the state of being able to get merged, because I don't think I can do that alone. |
@FFY00: Thanks for the info. From what I quickly read, 3.10 will only deprecate distutils, and the removal is scheduled for 3.12. Is that correct? If so, the impending release of 3.10 shouldn’t be a problem, as there are 2 more releases before distutils will be removed. Or does deprecating distutils cause problems other than warnings? |
Yes, but in the process of deprecating distutils, some internals were changed to stop relying on it. This will cause a lot of downstream patches to break. As far as the public distutils API is concerned, everything will still the same until 3.12, just with deprecation warnings. |
s/still/stay/ |
Thanks Filipe for the summary. I was unsure of the status. Even if we grant that Debian is too intrusive in its patching of distutils, it's Python and distutils that are making the change here, so it's not unreasonable for Python to maintain feature parity with the current regime. It's plausible that you or I could convince Matthias to adapt Debian to a less intrusive approach by adapting the whole Debian ecosystem to a new approach. I suspect achieving that change would be a bit of a challenge. The only other alternative I see is for distutils to provide some mechanism to enable Debian to achieve its current expectations without patching. It may not be necessary for CPython to support this mechanism. It may be the case that CPython can support the site install schemes, but that any additional customizations remain a contract between Setuptools/distutils. I'm unsure what implications that would have for other build systems not based on distutils, but presumably it would be up to those systems to support Debian. I suggest one of three courses here:
Are there any other options? What direction would you like to pursue? |
Hi Jason, thank you. I already tried 1), but could not convince Matthias of an alternative way to achieve the result that Debian wants. If you want, maybe you and I could try restarting those discussions.
I am not sure how to proceed. |
I haven’t read everything in this convo, but it looks like the changes proposed here cover all known downstream users other than Debian. Is that correct? Would these changes break Debian’s customizations substantially more than they’ll already be broken by the distutils deprecation refactoring? If not, and if you don’t want to support the customizations that Debian wants, why not proceed with this enhancement to cover every other case, then deal with Debian later? In the short term, and possible for the long term, Debian can continue to patch the install routine, just in the new appropriate places to patch it. Debian wants something no one else wants, so they have to put in extra effort. |
s/possible/possibly/ |
Matthias, can you check if bpo-44982 solves your issues related to the conflicts you were experiencing for the Debian patching? If so, it would unblock this issue. I am probably gonna discuss this proposal with the conda-forge folks next week, to get some feedback from the Conda perspective. |
The problem with this approach is Setuptools is attempting to adopt distutils by exposing its own vendored copy of distutils as Setuptools would like to be able to present a version of distutils that, unpatched, runs on all the major platforms, and thus make it default. That won't be possible until Debian can stop relying on its patching of distutils. |
Here's what I propose:
|
In Fedora 36+ / Python 3.10+ we now use an install_scheme that looks like this:
We got a user report [1] saying that That is, users expect that /usr/local is the prefix, and when they explicitly set it to /usr, the /local/ bit will not be there, while in reality, /local/ is not a part of the prefix, but it is a part of the installation scheme. I can somehow relate to that assumption. Now I wonder whether we should have adapted prefix instead of the installation scheme :/ Any ideas on how to approach this problem? I am quite clueless. |
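The confusion described here can be reproduced with plain string templates. In the sketch below, the two templates and the `expand` helper are hypothetical and are not Fedora's actual scheme; they only show that when `/local/` lives in the scheme, overriding the prefix does not remove it.

```python
# Hypothetical scheme templates using "{base}"-style placeholders,
# similar in shape to sysconfig's. Neither is copied from a real distribution.
SCHEME_WITH_LOCAL = "{base}/local/lib/python{py_version_short}/site-packages"
SCHEME_PLAIN = "{base}/lib/python{py_version_short}/site-packages"


def expand(template, base, py_version_short="3.10"):
    """Fill in the placeholders of a scheme template."""
    return template.format(base=base, py_version_short=py_version_short)


# A user who sets the prefix to /usr might expect the second result,
# but a scheme that embeds /local/ yields the first.
print(expand(SCHEME_WITH_LOCAL, "/usr"))
print(expand(SCHEME_PLAIN, "/usr"))
```

Seen this way, the prefix and the scheme are two independent knobs, which is why changing one does not undo a customization made in the other.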
I don't have a good answer, but given the title of this issue (which is specifically scoped to site install schemes), I'm tempted to say we should deal with prefixes in a separate, perhaps broader issue, and there address the reported issue (that a user's prefix override isn't honored by the scheme) and maybe more broadly the issue that there's not a design/spec for python installations (and probably there should be). |
In Nixpkgs we install every Python package under a unique prefix, a so-called Nix store path. If we were to use sysconfig for installing packages, then we'd need to be able to dynamically set the paths. This was also discussed as part of the Installer project. https://github.com/pradyunsg/installer/issues/98 We could use a custom scheme, however, we do need to be able to dynamically set a certain variable, e.g.
I could imagine we do something like
We'd then need to update the base variable in sysconfig or partially expand our own scheme using this variable. |
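One way to set the base dynamically with the existing API is the `vars` argument of `sysconfig.get_paths`, which overrides configuration variables during expansion. This is a minimal sketch, assuming the stock `posix_prefix` scheme and a made-up store path; whether it fits the Nixpkgs workflow is not established here.

```python
import sysconfig

# Made-up prefix standing in for a per-package store path.
store_path = "/example/store/abc123-python-package"

# Override the variables that the scheme templates reference.
paths = sysconfig.get_paths(
    scheme="posix_prefix",
    vars={"base": store_path, "platbase": store_path},
)

for key in ("purelib", "platlib", "scripts", "data"):
    print(key, paths[key])
```

Entries such as `stdlib` and `include` are derived from `installed_base` rather than `base` in the stock scheme, so they keep pointing at the interpreter's own installation unless those variables are overridden too.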
Technically, I guess we could (instead of redefining the default installation scheme) redefine the default I'll start a discussion on https://discuss.python.org/ trying to sum up what went wrong with our custom installation scheme and what we want to achieve instead. |
From the feedback I have gathered so far, I think this suggestion is a hard sell for some people due to the performance impact, so I think that's the first thing we have to work on if we want to implement this. There is some duplication in |
I created #98718 to keep track of |