Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve demonitor race precision #281

Merged
merged 17 commits into from
Aug 20, 2018

Conversation

aronisstav
Copy link
Member

Summary

This change optimizes the races of demonitor variants, but introduces some weirdness in the reported traces.

Fixes aronisstav#43.

Motivation

Before this PR, Concuerror's handling of races between demonitor variants and a monitored process exiting was 'coarse', in the sense that the call to demonitor would be considered as racing with both the exit event itself (which would determine if a monitor message will be sent) and the 'DOWN'message being delivered (which determined if the message is placed in the mailbox). This would lead to 3 schedulings being explored: monitor is delivered, monitor is emitted but blocked from delivery while in-flight, or monitor is removed before emission. Currently the decision about whether the monitor will be emitted is taken when the process starts exiting.

However, out of four variants of demonitor, only one races with both events (demonitor(..., [info])). In contrast, a call to demonitor(..., [flush]) does not race with either the exit or the deliver events: it is irrelevant whether the process exited before or after, as is irrelevant whether the 'DOWN' message has been delivered or not: the message will never exist in the recipient's mailbox.

All gen_ behavior calls use the following pattern from gen.erl:

Mref = monitor(process, P),
erlang:send(P, Request),
receive
  Reply ->
    demonitor(Mref, [flush]),
    Reply;
  {'DOWN', Mref, _, _, Reason} ->
    exit(Reason)
after
  Timeout ->
    demonitor(Mref, [flush]),
    exit(timeout)
end

For the Reply and after cases Concuerror would explore three schedulings instead of just one.

Change

In order to treat demonitor(..., [flush]) as independent from exit and message delivery, we must ensure that all events appear identical in any of the orderings. This leads to the following changes:

  • 'DOWN' messages should be emitted, even if a monitor has been removed before the process exits: Concuerror must be allowed to reschedule independent events in any order it likes, so if demonitor and exit are independent, it must not matter whether they happen in one way or the other. This means that demonitor, exit (maybe emit) should be the same as exit (maybe emit), demonitor so the monitor must always be emitted. Concuerror will also show a warning when emitting DOWN messages for inactive monitors.
  • 'DOWN' messages for cancelled monitors can no longer be marked as ignored during scheduling: for the same reason, if demonitor and message delivery are independent, they must be identical in the trace. However, the use_receive_patterns pass can mark such messages as 'received' and not try to reschedule them.

Checklist

  • Has tests
  • Updates CHANGELOG (or too minor)
  • References related Issues

@codecov
Copy link

codecov bot commented Aug 20, 2018

Codecov Report

Merging #281 into master will increase coverage by 0.08%.
The diff coverage is 99.29%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #281      +/-   ##
==========================================
+ Coverage   94.39%   94.48%   +0.08%     
==========================================
  Files          12       12              
  Lines        3589     3646      +57     
==========================================
+ Hits         3388     3445      +57     
  Misses        201      201
Flag Coverage Δ
#tests 94.13% <99.29%> (+0.06%) ⬆️
#tests_real 84.53% <92.25%> (+0.13%) ⬆️
#unit_tests 2.69% <0%> (-0.05%) ⬇️
Impacted Files Coverage Δ
src/concuerror_scheduler.erl 96.89% <100%> (+0.2%) ⬆️
src/concuerror_dependencies.erl 87.79% <100%> (+0.47%) ⬆️
src/concuerror_io_lib.erl 98.63% <100%> (ø) ⬆️
src/concuerror_callback.erl 92.71% <98.33%> (-0.02%) ⬇️
src/concuerror_logger.erl 96.83% <0%> (+0.22%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c040b0d...4170b9d. Read the comment docs.

@aronisstav aronisstav merged commit d512b54 into parapluu:master Aug 20, 2018
@aronisstav aronisstav deleted the improve-demonitor-precision branch August 20, 2018 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant