Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime extension module isolation #243

Closed
neonene opened this issue May 10, 2024 · 22 comments
Closed

datetime extension module isolation #243

neonene opened this issue May 10, 2024 · 22 comments

Comments

@neonene
Copy link

neonene commented May 10, 2024

PEP-687 is about to be implemented to the _datetime module.

Two approaches are proposed at present. I would like Steering Council to choose one.

  1. (regular)
    Multi-phase-init with per-module heaptypes (and C-API).
    abi: Store a reference (to the C-API struct) in PyInterpreterState.

  2. (special)
    Multi-phase-init with per-interpreter heaptypes (and C-API).
    abi: Store at least 7 heaptypes' references in PyInterpreterState.

The following shows the replacements of _datetime heaptypes (multi-phase init module) when running test_datetime in the refleak test:

main interp sequence 1.per-module 2.per-interp
import module heaptypes_A heaptypes_A
module reload (new module) heaptypes_B (new) (no change)
module reload (new module) heaptypes_C (new) (no change)
interp finish

The approach (2) avoids possible regressions1 caused by mixing modules with forbidden hacky usage23. Technically, (2) does not make sense (1) should be enough. In my opinion, unknown regressions should not be hidden by (2). Also, (2) will be a bad official example of isolation. However, there seems to be a case where maintaining (1) is not good for mental health, which is also serious.

Related issue/PR: gh-117398, gh-118357

Thanks

UPDATED: a few rewords

Footnotes

  1. https://github.com/python/cpython/issues/117398#issuecomment-2104353830

  2. https://docs.python.org/3/howto/isolating-extensions.html#surprising-edge-cases

  3. https://docs.python.org/3/library/sys.html#sys.modules

@neonene
Copy link
Author

neonene commented May 10, 2024

FYI, _datetime code owners are not involved in PEP-687 implementation (no response).

@ncoghlan
Copy link
Contributor

ncoghlan commented May 11, 2024

Intentionally reloading modules isn't forbidden usage. It's typically weird usage, but it's not forbidden. We include explicit workarounds in the import machinery so the interpreter doesn't die when extension modules using single-phase init are reloaded, and the impact of those workarounds results in behaviour similar to what @vstinner has proposed in gh-117398.

The benefit of adopting an approach like the one Victor suggests in a case like this one is that it allows "support per-interpreter imports" to be separated from "generate completely new heap types when reloading the module" (in this particular case, it isn't clear the latter is actually desirable behaviour, and it is definitely backwards incompatible behaviour).

The extra pointer to a substruct in PyInterpreterState isn't ideal, but getting rid of it by migrating to a different C API design that doesn't need it is better deferred until after the migration to multi-phase init (especially since it also makes it much clearer where the problems are actually arising).

@neonene
Copy link
Author

neonene commented May 11, 2024

gh-118337 fully implements:

1.(regular)
Multi-phase-init with per-module heaptypes (and C-API).
abi: Store a reference (to the C-API struct) in PyInterpreterState.

It passes refleak bot tests where the module is reloaded for a good reason.

@ncoghlan
Copy link
Contributor

As noted in python/cpython#117398 (comment), this may be a test coverage concern moreso than an SC design decision. I doubt the existing test suite explicitly covers the situation we're concerned about (a capsule or C API pointer reference acquired before the reload being used after the reload and crashing the interpreter), so even green CI may not be enough to be certain that full reload support is backwards compatible.

@neonene
Copy link
Author

neonene commented May 12, 2024

I would llke an SC decision about how much a module can depend on PyInterpreterState.

_datetime and other modules are not so different except the presence of a capsule C-API, and the capsule is not so different from the module state.

@ncoghlan
Copy link
Contributor

While you're correct that PyModule_GetState and PyCapsule_GetPointer are both technically exposed to the same pointer invalidation problem if a C module is reloaded, there's a key difference in their intended usage model (albeit not a well-documented one, since the distinction never mattered when single-phase init was the only option and hence there was no true reloading of extension modules):

  • PyModule_GetState is for the module itself to use. If the module gets genuinely reloaded, references to the old module state will be replaced by references to the new module state. Other code should NOT be calling PyModule_GetState, and if it is, and the offending code breaks on module reload, the response is going to be "Don't do that then".
  • By contrast, PyCapsule_GetPointer is explicitly an export API. Once another module has received a capsule object from the exporting module, there isn't any expectation that reloading the exporting module might invalidate previously exported capsule instances (and under single phase initialisation, they don't get invalidated, since the extension modules are never truly reloaded). Code that is using the capsule API can break through no fault of its own if we don't preserve capsule pointer stability across module reloads.

Don't get me wrong, I think you've identified a genuine functional gap in the multi-phase init feature here: the existence of the capsule API means that there really should be a public C API for extension modules using multi-phase init to define and manage per-interpreter-instance state in addition to the existing support for defining per-module-instance state. While the standard library can work around this problem by making use of PyInterpreterState, third party modules don't have that luxury.

In the specific case of migrating datetime to multi-phase init, though, it's OK for the sequence of events to be:

  • migrate datetime using PyInterpreterState to maintain pointer stability across reloads for the capsule export API and the public C API
  • subsequently use datetime and its capsule API as a motivating example for adding a public API to allow multi-phase init extension modules to define and access per-interpreter-instance state in a general way (this could potentially be as simple as a new internal state dict keyed by module name, akin to the sys.modules cache, but not exposed via a Python API)
  • adjust any standard library modules affected by this problem to use the new public per-interpreter state management API rather than private module specific workarounds

@zooba
Copy link
Member

zooba commented May 13, 2024

While the standard library can work around this problem by making use of PyInterpreterState, third party modules don't have that luxury.

What's wrong with using PyInterpreterState_GetDict for this? (Personally I use it to stash a reference to my module so that I have something to PyModule_GetState on later.)

Threading while reloading is probably a nightmare, but it's always going to be a nightmare. Ignoring that, PyDict_SetItemString(...) during module exec seems safe enough?

@neonene
Copy link
Author

neonene commented May 13, 2024

  • By contrast, PyCapsule_GetPointer is explicitly an export API.

If datetime API includes the direct use of PyCapsule_GetPointer(), a capsule should be held until the interpreter finishes (by using PyInterpreterState_GetDict()?). The API struct could be updated by memcpy() or something on every module-reload. (I'm honestly not so worried about C-API here)

Per-module heaptypes should be enough, but may be too technical may not be intuitive. I would like to know the rule and the exception.

@ncoghlan
Copy link
Contributor

@erlend-aasland also pointed out this past discussion of the external-resource-management variant of this problem (which leads to needing the ability to set up per-process locks rather than per-interpreter ones): https://discuss.python.org/t/a-new-c-api-for-extensions-that-need-runtime-global-locks/20668

@neonene
Copy link
Author

neonene commented May 15, 2024

Did @erlend-aasland point out that the problem applies to _datetime?

@erlend-aasland
Copy link

erlend-aasland commented May 16, 2024

The linked discussion was not about _datetime, IIRC. However, such an API could have been used to solve the probablems with the encapsulated _datetime C API in a nice way, I think.

@ncoghlan
Copy link
Contributor

Right, the specific problem in the _datetime case is ensuring pointer stability across module reloads for both its C API and its capsule API.

There is a similar-but-distinct problem that comes up when managing external (non-Python) resources that don't allow reinitialisation within a single process: module reloads need to avoid reinitialising the affected external resources, and instead make them available to the reinitialised instance of the module.

While the referenced thread is specifically about the external resource management variant of the problem, any public API we come up with needs to handle both cases.

Getting back to the original question for the SC here though, I think it boils down to:

  1. We have established that some extension modules need to preserve some of their state across reloads when migrating to multi-phase initialisation. However, there is not yet a public API for this, so the simplest option for affected modules is to stick with single-phase initialisation until such an API is defined.
  2. We'd like to make more standard library extension modules compatible with subinterpreters running without a shared GIL, which means migrating them to use multi-phase initialisation
  3. Some standard library extension modules are affected by the need to maintain pointer stability or preserve external resources across reloads. In such cases, the current workaround is to use PyInterpreterState fields that point to nested structs until such time as a standardised public mechanism is defined for management of extension module state that needs to be preserved across reloads.
  4. Migrating the _datetime module to multi-phase initialisation without ensuring public pointer stability across module reloads doesn't actually cause any CI failures, but this is potentially due to a lack of relevant CI test coverage rather than due to it actually being a safe change.

I think the only SC level decision here is "Is it OK to continue using the PyInterpreterState workaround to allow standard library extension modules to be migrated to multi-phase initialisation in the absence of a public API for preserving state across reloads?". It doesn't solve the problem for third party modules, but it does move the problem off the critical path for improvements to subinterpreter support.

The specific case of _datetime is better resolved by adding the missing test coverage to the multi-phase init migration PR and seeing if it fails (and assuming it fails as expected, using the PyInterpreterState workaround to eliminate the failure).

@neonene
Copy link
Author

neonene commented May 20, 2024

_datetime is the last remaining extension module that need to be migrated.

While we can suggest that the new generic C-API is needed, there seems to be no individual discussion (issue) about its design related to _datetime as well, which I think is one reason for the delay of _datetime migration.

so the simplest option for affected modules is to stick with single-phase

I think it is OK for third party modules to keep single-phase init. The migration is up to users.

1.We have established that some extension modules need to preserve
3.Some standard library extension modules are affected by the need to maintain pointer stability

What standard library extension modules are you suggesting?

The specific case of _datetime is better resolved by adding the missing test coverage to the multi-phase init migration PR and seeing if it fails (and assuming it fails as expected, using the PyInterpreterState workaround to eliminate the failure).

You mean per-interpreter heaptypes for _datetime to eliminate the failure, right?
If so, do you have a regression case which cannot follow the current convention (consensus)?

@erlend-aasland
Copy link

@ncoghlan, what do you think of introducing the a new (internal, for now) API, as discussed, and then adopt _datetime to use it? There should be sufficient time before the 3.14 alpha to get it in.

@neonene
Copy link
Author

neonene commented May 21, 2024

@erlend-aasland Can you explain how the discussion resolves _datetime isolation? If it is really effective, why didn't you either propose nor apply it before the 3.12 beta?

@erlend-aasland
Copy link

[...] why didn't you either propose nor apply it before the 3.12 beta?

I work on CPython on my spare time; I'm not paid to work on CPython.

@neonene
Copy link
Author

neonene commented May 21, 2024

Then, why didn't you suggest it on your spare time?

@JelleZijlstra
Copy link
Member

I don't think this is a productive line of conversation; let's drop it.

@neonene
Copy link
Author

neonene commented May 21, 2024

_datetime needs rationales if per-interpreter heaptypes are required.

@vstinner
Copy link
Member

_datetime needs rationales if per-interpreter heaptypes are required.

If you contact me in private, I propose to discuss it with you.

@neonene
Copy link
Author

neonene commented May 21, 2024

If you contact me in private, I propose to discuss it with you.

I'm here to follow SC because you don't look in good condition now to discuss it.

@python python locked as too heated and limited conversation to collaborators May 21, 2024
@warsaw
Copy link
Member

warsaw commented May 22, 2024

There are more appropriate avenues for resolving this issue than asking for the SC to make a decision. Those include opening a new issue on the CPython tracker or commenting on an existing relevant issue, engaging in a discussion on discuss.python.org, or reaching out the the C API working group. In all such forums, please keep in mind that the Python Code of Conduct applies to all workspaces in order to keep the conversations respectful and productive.

@warsaw warsaw closed this as completed May 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants