
servers, util: fix deadlock caused by conflicting lock order #3340

Merged 1 commit into mimblewimble:master on Jun 8, 2020

Conversation

@BurtonQin (Contributor) commented Jun 4, 2020

  • Description:
    This PR fixes two deadlock bugs caused by conflicting lock order, one in stratumserver and one in log.

    In servers/src/mining/stratumserver.rs, add_worker() calls stratum_stats.write() before workers_list.write():

```rust
pub fn add_worker(&self, tx: Tx) -> usize {
    let mut stratum_stats = self.stratum_stats.write();
    let worker_id = stratum_stats.worker_stats.len();
    let worker = Worker::new(worker_id, tx);
    let mut workers_list = self.workers_list.write();
```

    remove_worker() calls workers_list.read() before stratum_stats.write() (in a Rust assignment the right-hand side is evaluated first, so the read lock is acquired before the write lock):

```rust
self.stratum_stats.write().num_workers = self.workers_list.read().len();
```

    When add_worker() and remove_worker() are called simultaneously, the following deadlock may happen:

```text
Thread-A                 Thread-B
A.lock()
                         B.lock()
B.lock()  // blocked
                         A.lock()  // deadlock!
```

Similarly, init_logger() calls LOGGING_CONFIG.lock() before WAS_INIT.lock():

```rust
let mut config_ref = LOGGING_CONFIG.lock();
let mut was_init_ref = WAS_INIT.lock();
```

while init_test_logger() (grin/util/src/logger.rs, lines 261 to 262 at c7c9a32) calls WAS_INIT.lock() before LOGGING_CONFIG.lock():

```rust
pub fn init_test_logger() {
    let mut was_init_ref = WAS_INIT.lock();
    let mut config_ref = LOGGING_CONFIG.lock();
```

  • How I fix:
    The fix is to enforce a single order on the locks.

In stratumserver.rs, I take only one write lock on workers_list and remove the subsequent read lock. This also avoids a possible atomicity violation in which another thread modifies workers_list before it is read.
In logger.rs, the WAS_INIT lock is lifted before LOGGING_CONFIG in init_logger(), so it can no longer interleave with init_test_logger() in the conflicting order.
Note that dropping the lock guard of Lock-A before acquiring Lock-B is also a possible fix for a conflicting lock order, but I did not use it here, to avoid a possible atomicity violation.

  • Minor suggestion:
    When the critical sections of different locks may overlap, I suggest always acquiring the locks in the order in which they are declared, to prevent such deadlocks.

There is only one thing that concerns me:
fn remove_worker() calls update_stats() where stratum_stats.write() is called. Then stratum_stats.write() is called again. Is it okay to have the two critical sections interleaved with remove_worker() or add_worker() from another thread?

@lehnberg (Collaborator) commented Jun 5, 2020

Ping @hashmap - you might find this up your alley

@quentinlesceller (Member) left a comment

Thank you for the PR @BurtonQin !

  1. This makes sense. 👍
  2. This is unlikely to ever happen, since init_logger and init_test_logger should never be called together, but overall 👍 for safety's sake.

> There is only one thing that concerns me:
> fn remove_worker() calls update_stats() where stratum_stats.write() is called. Then stratum_stats.write() is called again. Is it okay to have the two critical sections interleaved with remove_worker() or add_worker() from another thread?

Could you detail what you think the problem is here?

Overall 👍 LGTM

@BurtonQin (Contributor, Author)

Well, on second thought, I did not find any problem.

@quentinlesceller quentinlesceller merged commit 992d450 into mimblewimble:master Jun 8, 2020
@quentinlesceller (Member)

@BurtonQin thank you so much for the PR 🎆. This is merged!
