Custom environments in subinterpreters #126977

FFY00 · 2024-11-18T17:13:42Z

Feature or enhancement

Proposal:

I wanted to explore the viability of having custom environments in subinterpreters. There are several use-cases that could be enabled by this feature.

So far, from informal discussion with others about this, there are a couple of possible issues to take into consideration.

Issues

1. Some of the immortal objects shared between subinterpreters may be environment-dependent (pointed out by @Yhg1s)
2. Complications around dynamic loading, from having extension modules from different environments
2.1) Symbol conflicts from their dependencies (pointed out by @Yhg1s)
2.2) Since subinterpreters share the same process, when loading the same shared object, they get the same pointer (pointed out by @pablogsal)

Implementation

The main thing we need is a way to disable the site initialization, which could be an enable_site option in the interpreter config. This should disable the environment customizations, and result in a bare environment without anything extra in sys.path.
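As a point of reference, CPython can already skip site initialization for a whole process with the -S command-line flag. The sketch below is not the proposed per-interpreter enable_site option, which does not exist yet. It only compares a fresh process with and without -S to show roughly what a "bare" sys.path looks like. The helper name child_site_state is made up for illustration.

```python
import json
import subprocess
import sys

# Code run in the child process: record whether `site` was imported at
# startup *before* importing anything else, then report the state as JSON.
CHILD_CODE = (
    "import sys; site_imported = 'site' in sys.modules; "
    "import json; "
    "print(json.dumps({'no_site': sys.flags.no_site, "
    "'site_imported': site_imported, 'path': sys.path}))"
)


def child_site_state(*flags):
    """Start a fresh interpreter with the given flags and return its site state."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


default_state = child_site_state()    # normal startup, site is initialized
bare_state = child_site_state("-S")   # site initialization skipped

# Entries that only exist because site initialization ran.
extra = [p for p in default_state["path"] if p not in bare_state["path"]]
print("entries added by site initialization:", extra)
```

The entries listed at the end are typically the site-packages directories, which is the part an interpreter-level switch would presumably need to leave out.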

However, to make the use of different environments more ergonomic, we could add an environment_path location pointing to a directory containing a pyvenv.cfg, which would perform the site initialization for that environment.
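To make the environment_path idea a bit more concrete, here is a minimal sketch of the information such an option would have to derive from an environment directory. Everything in it is an assumption for illustration: environment_path is only proposed, the helpers read_pyvenv_cfg and env_site_packages are hypothetical, and the site-packages layout is the conventional venv layout rather than something the interpreter config guarantees.

```python
import os
import sys
import tempfile
from pathlib import Path


def read_pyvenv_cfg(env_dir):
    """Parse the 'key = value' lines of <env_dir>/pyvenv.cfg into a dict."""
    cfg = {}
    text = (Path(env_dir) / "pyvenv.cfg").read_text(encoding="utf-8")
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that are not 'key = value'
            cfg[key.strip().lower()] = value.strip()
    return cfg


def env_site_packages(env_dir):
    """Guess the site-packages directory, assuming the usual venv layout."""
    if os.name == "nt":
        return Path(env_dir) / "Lib" / "site-packages"
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return Path(env_dir) / "lib" / version / "site-packages"


# Demo on a throwaway directory that only contains a hand-written pyvenv.cfg.
with tempfile.TemporaryDirectory() as env_dir:
    Path(env_dir, "pyvenv.cfg").write_text(
        "home = /usr/bin\ninclude-system-site-packages = false\n",
        encoding="utf-8",
    )
    cfg = read_pyvenv_cfg(env_dir)
    site_packages = env_site_packages(env_dir)

print(cfg["home"], cfg["include-system-site-packages"])
print(site_packages.name)
```

A real implementation would also have to process .pth files in that directory, which is what site.addsitedir() does today for the main interpreter.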

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response


FFY00 · 2024-11-18T17:14:04Z

Regarding 1), I'd say this is probably the main issue with this proposal, but I don't know the specifics.

Regarding 2), this is already an issue right now, but it is exacerbated here, as it increases the likelihood of users running into it. We should consider possible preventive or mitigation measures.

Regarding 2.1), most modern Linux environments don't hit it, as symbols aren't loaded into the global namespace unless RTLD_GLOBAL is used, but it is still an issue on a number of other systems, so it remains relevant. (thanks @pablogsal)
A possible mitigation measure might be to preemptively detect symbol clashes and raise an ImportError when loading extensions that would hit it, but I am not sure about its viability.
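For context, the flags CPython passes to dlopen() when importing extension modules can be inspected from Python on platforms that expose them. The following is only a small check of whether the current process would load extensions with RTLD_GLOBAL. It does not detect symbol clashes, and the function name extensions_load_globally is made up for this sketch.

```python
import os
import sys


def extensions_load_globally():
    """Report whether extension modules are dlopen()ed with RTLD_GLOBAL.

    Returns True or False where the flags are available, and None on
    platforms without sys.getdlopenflags() or os.RTLD_GLOBAL (e.g. Windows).
    """
    if not hasattr(sys, "getdlopenflags") or not hasattr(os, "RTLD_GLOBAL"):
        return None
    return bool(sys.getdlopenflags() & os.RTLD_GLOBAL)


print("extensions loaded with RTLD_GLOBAL:", extensions_load_globally())
```

Code that calls sys.setdlopenflags() with os.RTLD_GLOBAL changes this for later imports, which is one way the symbol conflicts in 2.1) can appear even on Linux.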

Regarding 2.2), AFAICT, this means that global data in the extension and its dependencies is shared between subinterpreters. Similarly, we could possibly mitigate this by detecting it and raising ImportError.

If these, or any other aspects of 2), are still problematic, we could simply prevent loading extension modules in subinterpreters that have a custom environment.

gpshead · 2024-11-18T20:32:37Z

First reaction: I'm skeptical that we actually want this as stated? Subinterpreters having different environment configs than the main interpreter doesn't feel right. Would we want to support that explicitly as a feature for everyone to build on and depend on?

> a way to disable the site initialization, which could be an enable_site option in the interpreter config. This should disable the environment customizations, and result in a bare environment without anything extra in sys.path.

This is a much more direct thing to ask for and could be implemented as a feature on its own without allowing arbitrary whole new environment configs. Gut feeling: whole new configs contain a can of worms of potentially unintended consequences. I expect Eric and others with their head in (sub)interpreter startup land to have a better feel for the reality of my gut check here.

pablogsal · 2024-11-19T00:03:46Z

> A possible mitigation measure might be to preemptively detect symbol clashes and raise an ImportError when loading extensions that would hit it, but I am not sure about its viability.

This can still happen with some subset of symbols. For example, with GNU's 'unique global' symbol extension, such symbols will still end up in the global namespace even if you open with RTLD_LOCAL. The problem is that the poisoning can also happen later: imagine an extension that statically links libstdc++, and then something else loads an extension that depends on the shared object for libstdc++. Then some symbols (the unique global ones) will be shared between the second extension and the first, leading to crashes if they are incompatible versions of libstdc++.

All of this is to say that it will be quite difficult to detect the poisoning unless we require that the shared object exports no symbols other than the PyInit_... one.

pablogsal · 2024-11-19T00:05:30Z

> Regarding 2.2), AFAICT, this means that global data in the extension and its dependencies is shared between subinterpreters. Similarly, we could possibly mitigate this by detecting it and raising ImportError.

This is not possible to detect, but a module supporting two-phase initialisation should be safe to load, because it should keep its global state in the module state, so two subinterpreters will not share it.

The problem will be something like a logging singleton in some internal dependency: initialising the logging singleton initialises it in all subinterpreters at the same time, but you cannot just hard fail, because that's how it is supposed to work.
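The distinction between per-module state and process-global state can be illustrated with a pure-Python analogy. This is only an analogy under stated assumptions: real subinterpreters already keep Python-level state separate, and the sharing discussed here happens at the C level inside a shared object. In the sketch, each call to the hypothetical load_copy stands for one subinterpreter importing the extension, the calls counter plays the role of module state, and PROCESS_GLOBAL_LOG plays the role of a C static such as the logging singleton.

```python
import types

# Stand-in for process-global data, e.g. a C `static` inside a shared object
# that every subinterpreter in the process ends up seeing.
PROCESS_GLOBAL_LOG = []

MODULE_SOURCE = """
calls = 0  # stand-in for per-module state

def record(message):
    global calls
    calls += 1                   # touches only this module copy
    shared_log.append(message)   # touches the single process-wide object
    return calls
"""


def load_copy(name):
    """Create an independent module object, like one import per subinterpreter."""
    module = types.ModuleType(name)
    module.shared_log = PROCESS_GLOBAL_LOG
    exec(MODULE_SOURCE, module.__dict__)
    return module


first = load_copy("ext_in_interp_a")
second = load_copy("ext_in_interp_b")

first.record("from a")
first.record("again from a")
second.record("from b")

print(first.calls, second.calls)  # per-module counters stay independent
print(PROCESS_GLOBAL_LOG)         # the shared object saw every call
```

The per-copy counters diverge while the shared list accumulates entries from both copies, which mirrors why module state is safe but a singleton in a dependency is not isolated.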

FFY00 added the type-feature (A feature request or enhancement) and topic-subinterpreters labels Nov 18, 2024

github-project-automation bot added this to Subinterpreters Nov 18, 2024

github-project-automation bot moved this to Todo in Subinterpreters Nov 18, 2024

