Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[subinterpreters] Lingering subinterpreters should be implicitly cleared on shutdown #80406

Closed
ncoghlan opened this issue Mar 7, 2019 · 18 comments
Labels
3.10 only security fixes topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@ncoghlan
Copy link
Contributor

ncoghlan commented Mar 7, 2019

BPO 36225
Nosy @ncoghlan, @vstinner, @ned-deily, @encukou, @ericsnowcurrently, @soltysh, @JulienPalard, @nanjekyejoannah, @LewisGaul, @NasaGeek
PRs
  • gh-80406: Finalise subinterpreters in Py_FinalizeEx() #17575
  • Files
  • finalise-subinterps-test.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-03-07.13:17:43.665>
    labels = ['expert-subinterpreters', '3.10', 'type-crash']
    title = '[subinterpreters] Lingering subinterpreters should be implicitly cleared on shutdown'
    updated_at = <Date 2021-12-16.23:53:44.369>
    user = 'https://github.com/ncoghlan'

    bugs.python.org fields:

    activity = <Date 2021-12-16.23:53:44.369>
    actor = 'nasageek'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Subinterpreters']
    creation = <Date 2019-03-07.13:17:43.665>
    creator = 'ncoghlan'
    dependencies = []
    files = ['48733']
    hgrepos = []
    issue_num = 36225
    keywords = ['patch']
    message_count = 17.0
    messages = ['337392', '337743', '337852', '339149', '339150', '339191', '345802', '349119', '349124', '353873', '357193', '357235', '364587', '371338', '371571', '379366', '379718']
    nosy_count = 10.0
    nosy_names = ['ncoghlan', 'vstinner', 'ned.deily', 'petr.viktorin', 'eric.snow', 'maciej.szulik', 'mdk', 'nanjekyejoannah', 'LewisGaul', 'nasageek']
    pr_nums = ['17575']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue36225'
    versions = ['Python 3.10']

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Mar 7, 2019

    https://docs.python.org/3/c-api/init.html#c.Py_EndInterpreter states that "Py_FinalizeEx() will destroy all sub-interpreters that haven’t been explicitly destroyed at that point."

    As discussed in hexchat/hexchat#2237, Python 3.7+ doesn't currently do that - it calls Py_FatalError instead.

    That change came from #1728, which was based on my initial PEP-432 refactoring work, and I didn't realise that implicitly cleaning up lingering subinterpreters was a documented behaviour.

    So I think we should just fix it to behave as documented, and add a new regression test to make sure it doesn't get broken again in the future.

    @ncoghlan ncoghlan added 3.7 (EOL) end of life 3.8 (EOL) end of life type-crash A hard crash of the interpreter, possibly with a core dump labels Mar 7, 2019
    @nanjekyejoannah
    Copy link
    Contributor

    I have been wondering where the regression to test this can be put..in test__xxsubinterpreters.py may be?

    @encukou
    Copy link
    Member

    encukou commented Mar 13, 2019

    Joannah, yes, that looks like a good place. Eric Snow might have more info; he wrote that module.

    As for testing Py_FatalError, there's an assert_python_failure function in test.support.script_helper.

    @ericsnowcurrently
    Copy link
    Member

    Interestingly, I noticed this independently today. :)

    Here's what I wrote in bpo-36477 (which I've closed as a duplicate):

    When using subinterpreters, any that exist when Py_FinalizeEx() is called do not appear to get cleaned up during runtime finalization. Maybe I've been looking at the code too much and I'm missing something. :)

    This really isn't a problem except for embedders that use subinterpreters (where we're leaking memory). However, even with the "python" executable it can have an impact because the subinterpreters' non-daemon threads will exit later than expected. (see bpo-36469 & bpo-36476)

    The solution would be to finalize all subinterpreters at the beginning of Py_FinalizeEx(), right before the call to wait_for_thread_shutdown(). This means calling Py_EndInterpreter() for all the runtime's interpreters (except the main one). It would also mean setting a flag (_PyRuntime.interpreters.finalizing?) right before that to disallow creation of any more subinterptreters.

    @ericsnowcurrently
    Copy link
    Member

    test__xxsubinterpreters is a great place for tests that exercise use of subinterpreters, including most lifecycle operations. There are also one or two subinterpreter-related tests in test_embed. However, for this issue the interplay with runtime finalization means tests should probably stay with other tests that exercise Py_FinalizeEx().

    @ncoghlan
    Copy link
    Contributor Author

    I think test_embed would be the right home for this, as there's an existing test case there for subinterpreter lifecycles and repeated init/finalize cycles:

    static int test_repeated_init_and_subinterpreters(void)

    The test case here would be similar, but it wouldn't need the outer loop - it would just create a handful of subinterpreters, but instead of ending each one before creating the next one the way the existing test does, what it would instead do is:

    • setup as per the existing test case
    • create a pair of subinterpeters, using a copy of the existing loop, but omitting the Py_EndInterpreter call
    • switch back to the main interpreter
    • create a second pair of subinterpeters
    • switch back to the main interpreter
    • call Py_Finalize

    It also occurs to me that we don't currently have a test case for what happens if you call Py_Finalize from a subinterpreter rather than the main interpreter.

    @ned-deily
    Copy link
    Member

    Ping. It was marked as a 3.7regression but perhaps it should now be just targeted for 3.8.

    @ned-deily ned-deily added the 3.9 only security fixes label Jun 17, 2019
    @nanjekyejoannah
    Copy link
    Contributor

    I am investigating this but in the meantime.

    It also occurs to me that we don't currently have a test case for what happens if you call Py_Finalize from a sub-interpreter rather than the main interpreter.

    @ncoghlan Am moving this test request in a new issue. So that this issue only focuses on fixing the lingering sub-interpreters.

    @nanjekyejoannah
    Copy link
    Contributor

    The test request is moved to bpo-37776.

    @nanjekyejoannah
    Copy link
    Contributor

    I remember julien wanting to check this out during a discussion we had at the sprints hence the loop in.

    @LewisGaul
    Copy link
    Mannequin

    LewisGaul mannequin commented Nov 21, 2019

    I've put together a test along the lines of what Nick suggested, see the attached patch. Running this hits the Fatal 'remaining subinterpreters' error as expected:

       > ./Programs/_testembed test_bpo36225
    --- Pass 0 ---
    interp 0 <0x1A561A0>, thread state <0x1A56DD0>: id(modules) = 139707107673104
    interp 1 <0x1A9D050>, thread state <0x1A87620>: id(modules) = 139707106987472
    interp 2 <0x1B02210>, thread state <0x1AEC320>: id(modules) = 139706981531088
    interp 0 <0x1A561A0>, thread state <0x1A56DD0>: id(modules) = 139707107673104
    interp 3 <0x1B53740>, thread state <0x1AFD980>: id(modules) = 139706980408304
    interp 4 <0x1BA3390>, thread state <0x1B3C7A0>: id(modules) = 139706979780944
    interp 0 <0x1A561A0>, thread state <0x1A56DD0>: id(modules) = 139707107673104
    Fatal Python error: PyInterpreterState_Delete: remaining subinterpreters
    Python runtime state: finalizing (tstate=0x1a56dd0)
    
    Aborted
    

    I'm happy to look a bit further into the fix for this - Eric's pointers in this thread look useful to get started. @nanjekyejoannah did you get anywhere with this?

    @nanjekyejoannah
    Copy link
    Contributor

    Am abit swamped and sick atm. You can go on and submit a fix.

    @vstinner
    Copy link
    Member

    See also bpo-38865: "Can Py_Finalize() be called if the current interpreter is not the main interpreter?".

    @vstinner vstinner changed the title Lingering subinterpreters should be implicitly cleared on shutdown [subinterpreters] Lingering subinterpreters should be implicitly cleared on shutdown May 15, 2020
    @ned-deily
    Copy link
    Member

    I note this is marked as a 3.7regression and still open. Since the cutoff for the final 3.7 bugfix mode release is in a few days, I'm assuming this means that 3.7 users will have to live with this regression. If you feel that is a problem, speak up now.

    @vstinner
    Copy link
    Member

    I would not qualify the new Python 3.7 behavior (call Py_FatalError()) as a regression, so I remove "3.7regression" keyword.

    Also, I don't think that we can easily fix this issue in a stable branch, I would prefer to start working on implementation this issue in the master branch, and only later consider to *maybe* backport the change. So I changed the Python version to "3.10".

    I see multiple non trivial problems:

    • To finalize the subinterpreter A, Py_Finalize() may switch the current Python thread state from the main interpreter to a Python thread in the subintrepeter A. It can lead to new funny issues.

    • A subinterpreter can be stuck for whatever reason and refuse to stop. For example, the subinterpreter A may wait for an even from subinterpreter B. If we don't run two interpreters "in parallel" (in threads), it may be stuck forever.

    • Python 3.10 still has weird code to allow daemon threads to continue to run after Py_Finalize() complete. Maybe we need to add such hacks for subinterpreter which fail to be finalized? For example, kill them (exit) as soon as they attempt to acquire the GIL. Search for tstate_must_exit() in Python/ceval_gil.h.

    By the way, currently Py_Finalize() calls PyInterpreterState_Clear() which call object finalizers in the main thread of the main interpreter, whereas these finalizers might expect to be called from the thread which created them. I don't know if we must change anything else.

    Again, all these problems are very complex :-(

    The simple option, which is sadly a backward incompatible change, is to raise a fatal error in Py_Finalize() if a subinterpreter is still running.

    Maybe we can start by logging a warning into stderr for now, before introducing the backward incompatible change. Maybe even only emit it when -X dev is used? https://docs.python.org/dev/library/devmode.html

    @vstinner vstinner added 3.10 only security fixes and removed 3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes labels Jun 15, 2020
    @ericsnowcurrently
    Copy link
    Member

    I see multiple non trivial problems:

    • To finalize the subinterpreter A, Py_Finalize() may switch the current Python thread state from the main interpreter to a Python thread in the subintrepeter A. It can lead to new funny issues.

    I'm not sure what you mean with this. Are you talking about how another thread might be running that threadstate already? That doesn't sounds like a problem specific to this issue but rather a problem with Py_EndInterpreter() in general.

    • A subinterpreter can be stuck for whatever reason and refuse to stop. For example, the subinterpreter A may wait for an even from subinterpreter B. If we don't run two interpreters "in parallel" (in threads), it may be stuck forever.

    The wait_for_thread_shutdown() call in Py_FinalizeEx() already has a similar behavior. In the case of one interpreter blocking another like that, how would that be possible where it isn't already a problem?

    • Python 3.10 still has weird code to allow daemon threads to continue to run after Py_Finalize() complete. Maybe we need to add such hacks for subinterpreter which fail to be finalized? For example, kill them (exit) as soon as they attempt to acquire the GIL. Search for tstate_must_exit() in Python/ceval_gil.h.

    I consider the problem of daemon threads to be a separate problem. The solution proposed here for finalization doesn't change any behavior regarding daemon threads.

    By the way, currently Py_Finalize() calls PyInterpreterState_Clear() which call object finalizers in the main thread of the main interpreter, whereas these finalizers might expect to be called from the thread which created them.

    Do you have any examples? I'm having trouble imagining such a case.

    I don't know if we must change anything else.

    Again, all these problems are very complex :-(

    I agree. However, automatically finalizing all other interpreters at the beginning of Py_FinalizeEx() doesn't introduce any new problems. At the same time, it makes sure that any resources in use in other interpreters have a chance to be released and that global resources used there don't cause crashes during runtime finalization.

    The simple option, which is sadly a backward incompatible change, is to raise a fatal error in Py_Finalize() if a subinterpreter is still running.

    As long as we require Py_FinalizeEx() to be called from the main interpreter, I don't see how the proposed change (in PR bpo-17575) would be backward incompatible.

    Maybe we can start by logging a warning into stderr for now, before introducing the backward incompatible change. Maybe even only emit it when -X dev is used? https://docs.python.org/dev/library/devmode.html

    I'd like to see a ResourceWarning if any other interpreters were still around.

    @vstinner
    Copy link
    Member

    If tomorrow Python allows to run two interpreters in parallel (see bpo-40512: "Meta issue: per-interpreter GIL"), I don't think that it will be safe to execute object finalizers of a subinterpreter from the main interpreter in Py_Finalize().

    IMO subinterpreters must be finalized manually by the programmer, since it's too tricky.

    The safest option is to display a warning in Py_Finalize() if there are running subinterpreters.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @ezio-melotti ezio-melotti moved this to Todo in Subinterpreters Apr 15, 2022
    @ericsnowcurrently
    Copy link
    Member

    In gh-121060 (4be1f37) I added automatic cleanup of subinterpreters. I've also backported it to 3.13.

    (I need to revisit gh-17575, which I had forgotten about.)

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    Status: Done
    Development

    No branches or pull requests

    6 participants