runtime: fix possible deadlock in bank #10466
This looks nice! Thanks for reporting and creating an actual PR! Did you find this with https://github.com/BurtonQin/rust-lock-bug-detector? Also, would you mind writing a test case? Considering the kind of bug, it might be a bit hard.
Yes, and I am still improving this detector.
Codecov Report
@@            Coverage Diff            @@
##           master   #10466    +/-   ##
=========================================
- Coverage    81.6%    81.6%   -0.1%
=========================================
  Files         296      296
  Lines       69320    69326      +6
=========================================
- Hits        56634    56614     -20
- Misses      12686    12712     +26
runtime/src/bank.rs (outdated)
    let bh = self.hash.read().unwrap();
    let dbh = dbank.hash.read().unwrap();
    assert_eq!(*bh, *dbh);

    let st = self.stakes.read().unwrap();
@BurtonQin nits: reordering alone suffices to fix the deadlock, as you said. Thanks for spotting this!
But from a maintainability perspective, the current code is too fragile, relying on the correct locking order. I think each code block in this method should also be surrounded with {} so that no lock is held for longer than needed.
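A minimal sketch of the scoping suggested above, assuming a simplified stand-in for `Bank` with only the `hash` and `stakes` fields (the field types here are hypothetical, not the real ones). Each pair of read guards lives in its own `{}` block, so the guards are dropped before the next pair is taken and the method no longer depends on acquisition order:

```rust
use std::sync::RwLock;

// Hypothetical stand-in for the real `Bank`; only two lock-protected
// fields are modelled, and their types are assumptions.
struct Bank {
    hash: RwLock<u64>,
    stakes: RwLock<Vec<u64>>,
}

impl Bank {
    fn compare_bank(&self, dbank: &Bank) {
        {
            // Both `hash` read guards are dropped at the end of this block.
            let bh = self.hash.read().unwrap();
            let dbh = dbank.hash.read().unwrap();
            assert_eq!(*bh, *dbh);
        }
        {
            // The `stakes` guards are taken only after the `hash` guards are gone.
            let st = self.stakes.read().unwrap();
            let dst = dbank.stakes.read().unwrap();
            assert_eq!(*st, *dst);
        }
    }
}

fn main() {
    let a = Bank { hash: RwLock::new(7), stakes: RwLock::new(vec![1, 2]) };
    let b = Bank { hash: RwLock::new(7), stakes: RwLock::new(vec![1, 2]) };
    a.compare_bank(&b);
    // No guard outlives `compare_bank`, so a write lock is available right away.
    assert!(a.hash.try_write().is_ok());
    assert!(b.stakes.try_write().is_ok());
}
```

Note that this sketch still takes two read locks on the same field when `self` and `dbank` are the same object, so scoping alone does not cover that case.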
nicely done!
@BurtonQin Could you work on the nit?
Sure. Pushed.
💔 Unable to automerge due to CI failure
Problem
There are two kinds of possible deadlock bugs:
1. Locks acquired in conflicting order:
e.g.
The fix is to swap the order of stakes and hash in compare_bank().
2. Double read lock when comparing with itself:
When compare_bank() compares a bank with itself, there are four cases where the first read lock is not released before the second one is acquired. A deadlock may happen when a write-lock request from another thread arrives between the two read locks. The reason is that the priority policy of std::sync::RwLock depends on the underlying operating system's implementation. AFAIK, Windows and macOS use a fair policy, which leads to this kind of deadlock.
The fix is to add a shortcut by checking the pointer equality of self and dbank first.
Summary of Changes
1. Swap the order of stakes and hash in compare_bank().
2. Check the pointer equality of self and dbank in compare_bank().
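The pointer-equality shortcut described in the Problem section can be sketched as follows. This is an illustration under assumptions, not the actual patch: `Bank` is a hypothetical two-field stand-in, the method returns a `bool` instead of asserting so the result can be checked, and the order shown (stakes, then hash) is assumed rather than taken from the real code.

```rust
use std::ptr;
use std::sync::RwLock;

// Hypothetical stand-in for the real `Bank`; field types are assumptions.
struct Bank {
    hash: RwLock<u64>,
    stakes: RwLock<Vec<u64>>,
}

impl Bank {
    // Returns true when the compared fields match. The real method asserts
    // instead; returning a bool is a simplification for this sketch.
    fn compare_bank(&self, dbank: &Bank) -> bool {
        // Shortcut: if both references point at the same object, return
        // before touching any lock, so no second read lock is requested
        // on an RwLock this thread already holds.
        if ptr::eq(self, dbank) {
            return true;
        }
        // Assumed order: `stakes` first, then `hash`. What matters is that
        // every code path taking both locks uses the same order.
        let stakes_equal = {
            let st = self.stakes.read().unwrap();
            let dst = dbank.stakes.read().unwrap();
            *st == *dst
        };
        let hash_equal = {
            let bh = self.hash.read().unwrap();
            let dbh = dbank.hash.read().unwrap();
            *bh == *dbh
        };
        stakes_equal && hash_equal
    }
}

fn main() {
    let a = Bank { hash: RwLock::new(7), stakes: RwLock::new(vec![1, 2]) };
    let b = Bank { hash: RwLock::new(8), stakes: RwLock::new(vec![1, 2]) };
    assert!(!a.compare_bank(&b));

    // The shortcut never touches the locks, so the self-comparison returns
    // even while a write guard on `a.hash` is held.
    let _w = a.hash.write().unwrap();
    assert!(a.compare_bank(&a));
}
```

`std::ptr::eq` compares addresses only, so two distinct banks with identical contents still go through the locked comparison.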