servers, util: fix deadlock caused by conflicting lock order #3340
This PR fixes two deadlocks caused by conflicting lock order, one in stratumserver and one in logger.
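In servers/src/mining/stratumserver.rs, `add_worker()` calls `stratum_stats.write()` before `workers_list.write()` (grin/servers/src/mining/stratumserver.rs, lines 700 to 704 in c7c9a32).
`remove_worker()` calls `workers_list.read()` before `stratum_stats.write()` (grin/servers/src/mining/stratumserver.rs, line 721 in c7c9a32).
When `add_worker()` and `remove_worker()` are called simultaneously, a deadlock may happen: one thread holds the `stratum_stats` write lock in `add_worker()` and waits for `workers_list`, while the other holds the `workers_list` read lock in `remove_worker()` and waits for `stratum_stats`.
Similarly,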
`init_logger()` calls `LOGGING_CONFIG.lock()` before `WAS_INIT.lock()` (grin/util/src/logger.rs, line 154 and line 253 in c7c9a32).
`init_test_logger()` calls `WAS_INIT.lock()` before `LOGGING_CONFIG.lock()` (grin/util/src/logger.rs, lines 261 to 262 and line 271 in c7c9a32).
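
The conflicting order described above can be reproduced without actually hanging the process. The following is only a minimal sketch, assuming `std::sync::RwLock` instead of grin's own lock types; the names `stats`, `workers` and `opposite_orders_block` are hypothetical. Each thread takes its first lock, waits at a barrier, and then uses `try_write()` on the other lock, so the blocked state is observed instead of entered.

```rust
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

/// Returns (a_blocked, b_blocked): whether each thread failed to take its
/// second lock while the other thread was holding it.
fn opposite_orders_block() -> (bool, bool) {
    // Hypothetical stand-ins for stratum_stats and workers_list.
    let stats = Arc::new(RwLock::new(0u32));
    let workers = Arc::new(RwLock::new(Vec::<u32>::new()));
    let barrier = Arc::new(Barrier::new(2));

    // Thread A mimics add_worker(): stats first, then workers.
    let a = {
        let (stats, workers, barrier) = (stats.clone(), workers.clone(), barrier.clone());
        thread::spawn(move || {
            let _stats_guard = stats.write().unwrap();
            barrier.wait(); // both threads now hold their first lock
            let blocked = workers.try_write().is_err();
            barrier.wait(); // keep the first lock until both attempts are done
            blocked
        })
    };

    // Thread B mimics remove_worker(): workers first, then stats.
    let b = {
        let (stats, workers, barrier) = (stats.clone(), workers.clone(), barrier.clone());
        thread::spawn(move || {
            let _workers_guard = workers.read().unwrap();
            barrier.wait();
            let blocked = stats.try_write().is_err();
            barrier.wait();
            blocked
        })
    };

    (a.join().unwrap(), b.join().unwrap())
}

fn main() {
    let (a_blocked, b_blocked) = opposite_orders_block();
    println!("A blocked: {}, B blocked: {}", a_blocked, b_blocked);
}
```

With blocking `write()` calls in place of `try_write()`, both threads would wait on each other forever.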
The fix is to enforce the order of the locks.
In stratumserver.rs, I use only one write lock of `workers_list` and remove the following read lock. This is to avoid a possible atomicity violation when `workers_list` is written by another thread before being read.
In logger.rs, `WAS_INIT` is lifted before `LOGGING_CONFIG` in `init_logger()` to prevent the interleaving of `init_logger()` and `init_test_logger()`.
Note that dropping the lock guard of Lock-A before calling Lock-B is also a possible solution for conflicting lock order. But I did not use it here to prevent a possible atomicity violation.
When the critical sections of different locks may overlap, I suggest always locking in the order that they are declared to prevent such deadlocks.
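
A rough illustration of locking in declaration order, not the actual patch: the `State` struct, its fields and `run_concurrently` are hypothetical, and `std::sync::RwLock` is assumed in place of grin's lock types. Both functions take `stats` first and `workers` second, so neither can hold one lock while waiting for a thread that holds the other.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for the server state. Fields are declared in the
// order in which they must be locked.
struct State {
    stats: RwLock<usize>,        // lock 1: number of connected workers
    workers: RwLock<Vec<usize>>, // lock 2: ids of connected workers
}

fn add_worker(state: &State, id: usize) {
    let mut stats = state.stats.write().unwrap(); // lock 1
    let mut workers = state.workers.write().unwrap(); // lock 2
    workers.push(id);
    *stats = workers.len();
}

fn remove_worker(state: &State, id: usize) {
    let mut stats = state.stats.write().unwrap(); // lock 1, same order as add_worker
    let mut workers = state.workers.write().unwrap(); // lock 2
    workers.retain(|w| *w != id);
    *stats = workers.len();
}

/// Adds workers 0..n from n threads; odd ids are removed again.
/// Returns (stats value, number of workers left).
fn run_concurrently(n: usize) -> (usize, usize) {
    let state = Arc::new(State {
        stats: RwLock::new(0),
        workers: RwLock::new(Vec::new()),
    });
    let handles: Vec<_> = (0..n)
        .map(|id| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                add_worker(&state, id);
                if id % 2 == 1 {
                    remove_worker(&state, id);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let stats = *state.stats.read().unwrap();
    let len = state.workers.read().unwrap().len();
    (stats, len)
}

fn main() {
    let (stats, len) = run_concurrently(8);
    println!("stats = {}, workers left = {}", stats, len);
}
```

Because both locks are held for the whole update, the count in `stats` always matches the list it was computed from.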
There is only one thing that concerns me: `remove_worker()` calls `update_stats()`, where `stratum_stats.write()` is called. Then `stratum_stats.write()` is called again. Is it okay to have the two critical sections interleaved with `remove_worker()` or `add_worker()` from another thread?
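
If the interleaving turns out to matter, one possible direction (not something this PR does) is to let the helper receive the already locked data, so that both updates share a single critical section. The sketch below is an assumption throughout: the `Stats` fields, the `update_stats` signature and the use of `std::sync::RwLock` are hypothetical and do not reflect the real code in stratumserver.rs.

```rust
use std::sync::RwLock;

// Hypothetical statistics record.
#[derive(Default, Debug, PartialEq)]
struct Stats {
    num_workers: usize,
    disconnects: usize,
}

// Hypothetical helper: it takes the locked data instead of locking by itself,
// so the caller decides how long the critical section lasts.
fn update_stats(stats: &mut Stats, num_workers: usize) {
    stats.num_workers = num_workers;
}

fn remove_worker(stats_lock: &RwLock<Stats>, workers_lock: &RwLock<Vec<usize>>, id: usize) {
    // Same lock order everywhere: stats first, then workers.
    let mut stats = stats_lock.write().unwrap();
    let mut workers = workers_lock.write().unwrap();
    workers.retain(|w| *w != id);
    // Both updates happen under one write guard, with no gap in between.
    update_stats(&mut stats, workers.len());
    stats.disconnects += 1;
}

fn main() {
    let stats = RwLock::new(Stats { num_workers: 2, disconnects: 0 });
    let workers = RwLock::new(vec![1, 2]);
    remove_worker(&stats, &workers, 1);
    println!("{:?}", *stats.read().unwrap());
}
```

The trade-off is a longer critical section on `stratum_stats`, which may or may not be acceptable for the stratum server.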