Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-39622: Interrupt the main asyncio task on Ctrl+C #32105

Merged
merged 22 commits into from
Mar 30, 2022
Merged

Conversation

asvetlov
Copy link
Contributor

@asvetlov asvetlov commented Mar 24, 2022

The strategy is:

  1. If asyncio runner is executed in the main thread it can handle Ctrl+C
  2. Ctrl+C cancels the main task (created for executing runner.run(coro())).
  3. The second and following Ctrl+C don't cancel the main task but raise KeyboardInterrupt instead. It can break loops, blocking calls etc.
  4. runner.run() raises KeyboardInterrupt when the main task is cancelled by Ctrl+C
  5. Background tasks are not touched, they are cancelled at the runner finalization stage.

https://bugs.python.org/issue39622

@asvetlov asvetlov marked this pull request as ready for review March 25, 2022 00:33
@asvetlov asvetlov requested a review from 1st1 as a code owner March 25, 2022 00:33
@asvetlov asvetlov added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 25, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @asvetlov for commit 2fd2ab7 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 25, 2022
@asvetlov asvetlov requested a review from gvanrossum March 25, 2022 00:33
@asvetlov
Copy link
Contributor Author

@1st1 @gvanrossum the PR adds Ctrl+C handling to asyncio programs.
Please review if you have time.

Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Copy link
Contributor

@kumaraditya303 kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, tested on linux.

Doc/library/asyncio-runner.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-runner.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-runner.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-runner.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-runner.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-runner.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-runner.rst Outdated Show resolved Hide resolved
asvetlov and others added 4 commits March 29, 2022 18:04
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
asvetlov and others added 6 commits March 29, 2022 18:07
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
@asvetlov
Copy link
Contributor Author

Thank you very much, @kumaraditya303 !

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying this with code like this:

async def looper():
    while True:
        print("Top...")
        try:
            await asyncio.sleep(1.0)
        except BaseException as err:
            print("Caught", repr(err))

and its behavior is very unpredictable. The first time I hit ^C it will show that it caught CancelledError, maybe the second time too, but after that something weird happens: it shows the KeyboardInterrupt (in runners.py on L142) causing another KeyboardInterrupt (selectors.py L561 on macOS, somewhere in windows_events.py on Windows), then the code prints "Caught GeneratorExit()", and then an endless loop of alternating "Top..." and "Caught RuntimeError('no running event loop')" until I hit another ^C, which stops it.

If I special-case GeneratorExit to break out of the loop, there is no endless loop, but I see something like this (on Windows):

Traceback (most recent call last):
  File "C:\Users\gvanrossum\pr-32105\Lib\asyncio\runners.py", line 175, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\gvanrossum\pr-32105\Lib\asyncio\runners.py", line 109, in run
    return self._loop.run_until_complete(task)
    super().run_forever()
    ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\gvanrossum\pr-32105\Lib\asyncio\base_events.py", line 604, in run_forever
    self._run_once()
    ^^^^^^^^^^^^^^^^
  File "C:\Users\gvanrossum\pr-32105\Lib\asyncio\base_events.py", line 1873, in _run_once
    event_list = self._selector.select(timeout)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\gvanrossum\pr-32105\Lib\asyncio\windows_events.py", line 439, in select
    self._poll(timeout)
    ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\gvanrossum\pr-32105\Lib\asyncio\windows_events.py", line 808, in _poll
    status = _overlapped.GetQueuedCompletionStatus(self._iocp, ms)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
Catching Generator Exit
Task was destroyed but it is pending!
task: <Task cancelling name='Task-1' coro=<looper() done, defined at C:\Users\gvanrossum\pr-32105\t.py:3> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[gather.<locals>._done_callback() at C:\Users\gvanrossum\pr-32105\Lib\asyncio\tasks.py:759]>
^CTerminate batch job (Y/N)? y
PS C:\Users\gvanrossum\pr-32105>

This does not seem to be a big improvement over the original issue. :-(

Lib/asyncio/runners.py Outdated Show resolved Hide resolved
@1st1
Copy link
Member

1st1 commented Apr 1, 2022

Why is this merged? No core dev reviewed this. Please revert.

@mejrs mejrs mentioned this pull request Apr 18, 2022
@davidhewitt
Copy link
Contributor

This appears like it might have caused breakage in the latest alpha: PyO3/pyo3#2314 (comment)

@gvanrossum
Copy link
Member

This appears like it might have caused breakage in the latest alpha: PyO3/pyo3#2314 (comment)

I find the discussion in that issue hard to follow. Sure, something that happened between a6 and a7 broke PyO3. But what? Why would this change (which only affects how KeyboardInterrupt is handled) be the root cause?

@davidhewitt
Copy link
Contributor

@gvanrossum sorry for my short comment earlier; was up in the middle of the night helping my wife feed my newborn :)

I haven't had chance to repro and poke around with this this myself, however I think the cause is possibly because the signal and threading modules are disagreeing about what is the "main thread" in PyO3's unit test suite. (Rust binary embedding Python via dynamic linking to libpython.)

The line erroring is the runners.py line 103 introduced in this patch, which installs the sigint handler:

signal.signal(signal.SIGINT, sigint_handler)

It reaches this error in the C:

_PyErr_SetString(tstate, PyExc_ValueError,
"signal only works in main thread "
"of the main interpreter");

That runners.py line 103 is guarded by a check threading.current_thread() is threading.main_thread(), hence my guess that threading / signal disagree what the main thread is in this case.


The PyO3 unit test suite tickling this crash can roughly be summarised as:

  • Rust binary embedding Python (dynamic linking to libpython)
  • The Python interpreter is started with Py_InitializeEx(0) (0 means don't register signal handlers)
  • A different native thread acquires the GIL with PyGILState_Ensure()
  • That thread then uses PyEval_EvalCode to run a snippet which contains a call to asyncio.run

@gvanrossum
Copy link
Member

gvanrossum commented Apr 19, 2022

Sorry, it looks like you just answered at least part of the question I asked on the PyO3 issue. Anyway, it looks like we need a better check for determining whether it is safe to call signal(). Or perhaps we just need to put a try/except around that call?

@davidhewitt
Copy link
Contributor

davidhewitt commented Apr 20, 2022

Thinking on this further, I suspect that threading check passes in PyO3's case above because running this code snippet is probably the first time threading is imported. That would explain why threading module thinks the thread is the main thread.

I see in signalmodule.c there's a symbol _PyOS_IsMainThread, which looks like the correct check (used in C) for whether signal.signal can be called. Do you think it makes sense to expose this as a Python API, perhaps signal.can_handle_signals()?

@gvanrossum
Copy link
Member

@davidhewitt We're 1-2 weeks from feature freeze, and I don't have time to propose and implement a new API for this purpose (simple though it may seem, there are always loose ends, tests to write, etc.).

What I can do (other than rolling back @asvetlov's PR, which would be a shame) is add a try/except around the signal() call that catches the ValueError. That would presumably unblock you. (Things would go even faster if you proposed a PR -- I can just merge it.)

@gvanrossum
Copy link
Member

(Either way you should probably open a new issue about this rather than keep chatting on this closed PR.)

@davidhewitt
Copy link
Contributor

👍 I'll do my best to find time to write a fresh issue and PR at the weekend. Thanks for the help!

thy09 added a commit to microsoft/promptflow that referenced this pull request May 10, 2024
# Description

Fix the bug that Ctrl+C doesn't returns flow result because the signal
handler would raise an exception while asyncio.run cannot handle it
correctly.

Note that for py<3.11, there is no default sigint handler when using
asyncio.run to run an async function, so we add one for py<3.11
following python's implementation:
https://github.com/python/cpython/blob/46c808172fd3148e3397234b23674bf70734fb55/Lib/asyncio/runners.py#L150
Such logic is initialized in this PR:
python/cpython#32105

This pull request introduces significant changes to the
`src/promptflow-core/promptflow/_utils/async_utils.py` file, aimed at
improving the handling of asynchronous tasks and signal interruption.
The changes also include minor modifications to several other files in
the `src/promptflow-core/promptflow/` directory.

Here are the key changes:

### Signal Handling and Asynchronous Task Management:

*
[`src/promptflow-core/promptflow/_utils/async_utils.py`](diffhunk://#diff-ebca5e7d6a45f737be917f296f2fb0a3480fa26bc378bde80f11368c9ac2c4fbR7-R8):
A new class `_AsyncTaskSigIntHandler` was added to handle SIGINT signals
during the execution of asynchronous tasks. This class cancels the
current task if a SIGINT signal is received, which is particularly
useful for Python versions less than 3.11 where the default cancelling
behavior is not supported. The function
`_invoke_async_with_sigint_handler` was also added to invoke
asynchronous functions with the SIGINT handler. The function
`async_run_allowing_running_loop` was modified to use this new function.
[[1]](diffhunk://#diff-ebca5e7d6a45f737be917f296f2fb0a3480fa26bc378bde80f11368c9ac2c4fbR7-R8)
[[2]](diffhunk://#diff-ebca5e7d6a45f737be917f296f2fb0a3480fa26bc378bde80f11368c9ac2c4fbR27-R79)
[[3]](diffhunk://#diff-ebca5e7d6a45f737be917f296f2fb0a3480fa26bc378bde80f11368c9ac2c4fbL39-R94)

### Minor Changes in Other Files:

*
[`src/promptflow-core/promptflow/_utils/process_utils.py`](diffhunk://#diff-50bb354dd49c977ba18f8fa38059525658153f097ef6f808f4226cff4d7a8891L32-R32):
The signal handler for SIGINT was changed to
`signal.default_int_handler` in the `block_terminate_signal_to_parent`
function.
*
[`src/promptflow-core/promptflow/executor/_async_nodes_scheduler.py`](diffhunk://#diff-cd6772e1603e398a564a4d2768ee2f7ffdf7c5e3e0bb7cda5bf19e560f143901L9):
The import of the `signal` module was removed. In the `execute` method,
the signal handler registration was removed and the exception handling
was modified to include `asyncio.CancelledError`. The `signal_handler`
function was removed and replaced with the `cancel` method.
[[1]](diffhunk://#diff-cd6772e1603e398a564a4d2768ee2f7ffdf7c5e3e0bb7cda5bf19e560f143901L9)
[[2]](diffhunk://#diff-cd6772e1603e398a564a4d2768ee2f7ffdf7c5e3e0bb7cda5bf19e560f143901L49-R59)
[[3]](diffhunk://#diff-cd6772e1603e398a564a4d2768ee2f7ffdf7c5e3e0bb7cda5bf19e560f143901R77-R81)
[[4]](diffhunk://#diff-cd6772e1603e398a564a4d2768ee2f7ffdf7c5e3e0bb7cda5bf19e560f143901L174-L183)
*
[`src/promptflow-core/promptflow/executor/flow_executor.py`](diffhunk://#diff-bec06607cb28fd791b8ed11bb488979344ca342be5f1c67ba6dd663d5e12240fL1096-R1100):
In the `_exec_async` method, the exception handling was modified to
handle `asyncio.CancelledError` instead of `KeyboardInterrupt`.

# All Promptflow Contribution checklist:
- [ ] **The pull request does not introduce [breaking changes].**
- [ ] **CHANGELOG is updated for new features, bug fixes or other
significant changes.**
- [ ] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated
review from promptflow team. Learn more: [suggested
workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [ ] Title of the pull request is clear and informative.
- [ ] There are a small number of commits, each of which have an
informative message. This means that previously merged commits do not
appear in the history of the PR. For more information on cleaning up the
commits in your PR, [see this
page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [ ] Pull request includes test coverage for the included changes.

---------

Co-authored-by: Heyi <heta@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants