Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cancellation system to eagerly propagate effective state #958

Merged
merged 5 commits into from
May 6, 2019

Conversation

oremanj
Copy link
Member

@oremanj oremanj commented Mar 1, 2019

This synthesizes the ideas that arose in the discussion on #910.
Each CancelScope with block now creates a CancelStatus object
(not exposed publicly); the CancelStatus objects know their
parent/child relationships in the lexical nesting tree of
CancelScope contexts, and communicate to propagate cancellation
information eagerly. The upshot is that the question "is this
task in a cancelled scope right now?" can now be answered
in O(1) time, eliminating a notable inefficiency in Trio's
run loop. As a nice side benefit, manipulations of the
cancellation tree such as are required by nursery.start()
become much easier to reason about.

Fixes #58.

This synthesizes the ideas that arose in the discussion on python-trio#910.
Each CancelScope `with` block now creates a CancelStatus object
(not exposed publicly); the CancelStatus objects know their
parent/child relationships in the lexical nesting tree of
CancelScope contexts, and communicate to propagate cancellation
information eagerly. The upshot is that the question "is this
task in a cancelled scope right now?" can now be answered
in O(1) time, eliminating a notable inefficiency in Trio's
run loop. As a nice side benefit, manipulations of the
cancellation tree such as are required by `nursery.start()`
become much easier to reason about.
@codecov
Copy link

codecov bot commented Mar 1, 2019

Codecov Report

Merging #958 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #958      +/-   ##
==========================================
+ Coverage   99.53%   99.53%   +<.01%     
==========================================
  Files         102      102              
  Lines       12272    12301      +29     
  Branches      914      919       +5     
==========================================
+ Hits        12215    12244      +29     
  Misses         36       36              
  Partials       21       21
Impacted Files Coverage Δ
trio/_core/tests/test_run.py 100% <100%> (ø) ⬆️
trio/_core/_run.py 99.71% <100%> (ø) ⬆️

@njsmith
Copy link
Member

njsmith commented Mar 4, 2019

I haven't had energy to do a detailed review yet, but a few quick thoughts:

  • The duplication of deadline/cancel_called/shield state between CancelState and CancelScope feels ugly and error-prone. Maybe CancelState should have-a CancelScope that it refers to when it needs to check these things? Or CancelScope could have-a CancelState, and we move all the fields into CancelState? I guess the trade-offs are:

    • Current design: duplication of state that has to be kept in sync manually
    • CancelState has-a CancelScope: would be very natural right now; gets a bit more awkward if we split up CancelScope into multiple objects (e.g. splitting shielding off into its own thing)
    • CancelScope has-a CancelState: would require the concept of "unbound CancelState". It would be weird to have the canonical deadline value stored in one object, and the deadline management code in a different object.

    Between these options, I think I like the second one (CancelState has-a CancelScope) best... and then maybe later you'll get your "cancel strategy protocol" back after all :-).

  • Let's rename check_cancelled to something like recalculate or recalculate_cancel_state or something. I was very puzzled at first b/c I thought from the method name that its semantics were going to be something like Proposal: make checkpoint_if_cancelled() sync-colored and rename it accordingly #961

  • I think I'd rather CancelScope.cancel_called be structured like:

    if current_time() >= self._deadline:
        self.cancel()
    return self._cancel_called

    Simple, obviously correct, doesn't require thinking through details of the clock semantics or change depending on whether the scope has been entered.

    We could potentially try to micro-optimize by adding some guards like not self._cancel_called and self._deadline != inf before calling current_time, but given how fast current_time is on most systems I'm not sure these would even be optimizations.

@oremanj
Copy link
Member Author

oremanj commented Mar 8, 2019

Making CancelStatus have-a CancelScope definitely makes the code clearer; thanks for the suggestion!

The name check_cancelled was probably in my head from this diff when I wrote up #961 but I didn't intend any connection between the two methods. Changed to recalculate().

I don't think we can quite simplify cancel_called as much as you suggested, since my impression is that we shouldn't set it to True if the scope is exited, then the deadline expires, then the attribute gets inspected. (There's an existing test that checks for this.) Optimization-wise I do think an attribute lookup (_cancel_called) is likely to be cheaper than a function call (current_time()), even if the underlying (not-quite-)syscall cost is negligible. At least we're not checking it at every checkpoint anymore!

@oremanj
Copy link
Member Author

oremanj commented Mar 8, 2019

Iowa midnight strikes again?

@oremanj oremanj closed this Mar 8, 2019
@oremanj oremanj reopened this Mar 8, 2019
@njsmith njsmith merged commit aad2ecf into python-trio:master May 6, 2019
@njsmith
Copy link
Member

njsmith commented May 6, 2019

Booyah.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce or eliminate quadratic behavior in cancellation system?
2 participants