-
-
Notifications
You must be signed in to change notification settings - Fork 345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: make checkpoint_if_cancelled() sync-colored and rename it accordingly #961
Comments
I think the first step here is to implement #474 – update docs and tests, and clean up some of the current clutter for checkpointing error paths. (Of course this last part may take a while and it doesn't have to be perfect, but I think doing at least some cleanup will help make it clearer what we want here.)
We could do this now. It's probably also easy to implement |
These don't handle def thread_func():
os.kill(os.getpid(), signal.SIGINT)
trio.from_thread.check_cancel()
async def main():
await trio.to_thread.run_sync(thread_func)
trio.run(main) I guess the simplest+fastest approach, if it worked, would be to make all forms of check-for-cancellation first check Right now, we instead special-case the main task, so only it ever checks for
The nasty case is: we triggered the One approach would be to change our abort function interface, so it provides this information: I guess we'd have Another idea: in This might be simpler, but I bet it will also have some complications by the time we figure out how to forward arbitrary cancellations to the child thread, and also let the thread re-enter trio and deliver cancellations to those tasks. This overlaps with #642. |
Yeah, KI being edge-triggered adds all sorts of unfortunate edge cases like this. I wonder if we can just... drop the ability for the cancellation system to raise KI at all? You still get KI if it's delivered while non-protected user code is running. If it's delivered while KI protected, or during a sleep, then we cancel the system nursery and raise KI once everything unwinds. It could chain with the tree of Cancelled exceptions that propagates out of the main task. Benefits:
Drawbacks: you can't directly catch KeyboardInterrupt within async code. Maybe others I'm not thinking of. |
Huh! interesting idea! We have a nice trick to cancel-everything-and-raise: spawn a system task and have it raise the exception. Using this to inject
This part would require a special-case hack so that this exception in particular sticks the I'm thinking maybe we should consider this separately as part of our overall make-deadlock-less-frustrating efforts. (Which start with #1085 and #927. Maybe the stack tree dumper should also be triggered on control-C on Windows when running with Another advantage you didn't mention: I think we could simplify
Yeah, I think this is the main drawback. It's a bit tricky to judge: Something that just occurred to me: the "inject a task that does |
So document it. Also, people will realize this the first time they hit the problem. We could also play with variable names etc. in the stack trace again, to tell them. ;-) |
This is a bit of a kluge, and will hopefully be cleaned up in the future when we overhaul KeyboardInterrupt (python-triogh-733) and/or the cancellation checking APIs (python-triogh-961). But 'checkpoint()' is a common operation, and this speeds it up by ~7x, so... not bad for a ~4 line change. Before this change: - `await checkpoint()`: ~24k/second - `await cancel_shielded_checkpoint()`: ~180k/second After this change: - `await checkpoint()`: ~170k/second - `await cancel_shielded_checkpoint()`: ~180k/second Benchmark script: ```python import time import trio LOOP_SIZE = 1_000_000 async def main(): start = time.monotonic() for _ in range(LOOP_SIZE): await trio.lowlevel.checkpoint() #await trio.lowlevel.cancel_shielded_checkpoint() end = time.monotonic() print(f"{LOOP_SIZE / (end - start):.2f} schedules/second") trio.run(main) ```
This is a bit of a kluge, and will hopefully be cleaned up in the future when we overhaul KeyboardInterrupt (python-triogh-733) and/or the cancellation checking APIs (python-triogh-961). But 'checkpoint()' is a common operation, and this speeds it up by ~7x, so... not bad for a ~4 line change. Before this change: - `await checkpoint()`: ~24k/second - `await cancel_shielded_checkpoint()`: ~180k/second After this change: - `await checkpoint()`: ~170k/second - `await cancel_shielded_checkpoint()`: ~180k/second Benchmark script: ```python import time import trio LOOP_SIZE = 1_000_000 async def main(): start = time.monotonic() for _ in range(LOOP_SIZE): await trio.lowlevel.checkpoint() #await trio.lowlevel.cancel_shielded_checkpoint() end = time.monotonic() print(f"{LOOP_SIZE / (end - start):.2f} schedules/second") trio.run(main) ```
I'd like to propose that we deprecate
checkpoint_if_cancelled()
in favor of a synchronous function that I'm thinking of astrio.check_cancelled()
(happy to take suggestions on the name, or move into hazmat). This would raiseCancelled
if the current scope is cancelled, and do nothing otherwise. i.e., it is a pure cancel point, not a schedule point, not even a schedule point if we're cancelled.This becomes possible due to some recent or pending changes:
raise Cancelled._init()
And I think would have benefits for:
checkpoint()
asawait cancel_shielded_checkpoint(); check_cancelled()
run_sync_in_worker_thread
functions to call check_cancelled() occasionally without messing with portals (because it's synchronous! and if/when Refactor cancellation system to eagerly propagate effective state #958 lands I think we can make the check_cancelled() operation sufficiently thread-safe)Possible downside: a sync-colored function called from an async function can now call check_cancelled() and wind up raising Cancelled. Previously that couldn't happen. My feeling is that "functions that do something off-the-wall might raise an exception their caller wasn't expecting" is hardly a new problem, but maybe there's some aspect of what we'd be giving up here that I'm missing.
The text was updated successfully, but these errors were encountered: