Fix the thread wait case in error_handler (#7700)
Summary:
In the error_handler auto-recovery case, if recovery_in_prog_ is false, the recovery has either finished or failed. The auto-recovery thread should then have finished executing, so recovery_thread_ should be null. In some cases, however, it is not null, and the caller should not return directly. Instead, it should wait for the previous recovery thread to finish (by joining it) and then create a new thread to execute the new recovery.

Pull Request resolved: #7700

Test Plan: make check, error_handler_fs_test

Reviewed By: anand1976

Differential Revision: D25098233

Pulled By: zhichao-cao

fbshipit-source-id: 5a1cba234ca18f6dd5d1be88e02d66e1d5ce931b
zhichao-cao authored and facebook-github-bot committed Dec 4, 2020
1 parent 70f2e09 commit eb5a8c0
Showing 1 changed file with 10 additions and 2 deletions.
12 changes: 10 additions & 2 deletions db/error_handler.cc
@@ -584,12 +584,20 @@ Status ErrorHandler::StartRecoverFromRetryableBGIOError(IOStatus io_error) {
   if (bg_error_.ok() || io_error.ok()) {
     return Status::OK();
   }
-  if (db_options_.max_bgerror_resume_count <= 0 || recovery_in_prog_ ||
-      recovery_thread_) {
+  if (db_options_.max_bgerror_resume_count <= 0 || recovery_in_prog_) {
     // Auto resume BG error is not enabled, directly return bg_error_.
     return bg_error_;
   }

+  if (recovery_thread_) {
+    // In this case, if recovery_in_prog_ is false, current thread should
+    // wait the previous recover thread to finish and create a new thread
+    // to recover from the bg error.
+    db_mutex_->Unlock();
+    recovery_thread_->join();
+    db_mutex_->Lock();
+  }

   recovery_in_prog_ = true;
   recovery_thread_.reset(
       new port::Thread(&ErrorHandler::RecoverFromRetryableBGIOError, this));
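
The unlock, join, relock pattern in this change can be sketched in isolation. The following is a minimal, self-contained illustration that assumes std::thread and std::mutex in place of port::Thread and db_mutex_. RecoveryStarter, RecoverBody, WaitAndCount, and RunTwoRecoveries are hypothetical names invented for this sketch, not RocksDB APIs. The loop that re-checks state after re-acquiring the lock is an extra defensive step that is not part of the commit.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for ErrorHandler; not RocksDB code.
class RecoveryStarter {
 public:
  ~RecoveryStarter() {
    if (recovery_thread_ && recovery_thread_->joinable()) {
      recovery_thread_->join();
    }
  }

  std::mutex& mutex() { return mu_; }

  // Caller must hold `lock` on mutex(), mirroring db_mutex_ being held.
  // Returns true if a new recovery thread was started.
  bool StartRecovery(std::unique_lock<std::mutex>& lock) {
    assert(lock.owns_lock());
    // A finished recovery may leave an unjoined thread object behind.
    // Release the lock while joining so the old thread is never blocked
    // on it, then re-acquire the lock and re-check the state.
    while (!recovery_in_prog_ && recovery_thread_) {
      std::unique_ptr<std::thread> old = std::move(recovery_thread_);
      lock.unlock();
      old->join();
      lock.lock();
    }
    if (recovery_in_prog_) {
      return false;  // a recovery is still running; do not start another
    }
    recovery_in_prog_ = true;
    recovery_thread_.reset(
        new std::thread(&RecoveryStarter::RecoverBody, this));
    return true;
  }

  // Joins the most recent recovery thread, if any, and returns how many
  // recoveries have completed.
  int WaitAndCount() {
    std::unique_lock<std::mutex> lock(mu_);
    std::unique_ptr<std::thread> last = std::move(recovery_thread_);
    lock.unlock();
    if (last) {
      last->join();
    }
    lock.lock();
    return completed_;
  }

 private:
  // Placeholder for the real recovery work; it only records completion.
  void RecoverBody() {
    std::lock_guard<std::mutex> guard(mu_);
    ++completed_;
    recovery_in_prog_ = false;
  }

  std::mutex mu_;
  bool recovery_in_prog_ = false;
  int completed_ = 0;
  std::unique_ptr<std::thread> recovery_thread_;
};

// Starts two recoveries back to back, retrying while one is in progress.
int RunTwoRecoveries() {
  RecoveryStarter starter;
  for (int i = 0; i < 2; ++i) {
    bool started = false;
    while (!started) {
      {
        std::unique_lock<std::mutex> lock(starter.mutex());
        started = starter.StartRecovery(lock);
      }
      if (!started) {
        std::this_thread::yield();
      }
    }
  }
  return starter.WaitAndCount();
}
```

In this sketch, resetting the thread pointer without joining first would destroy a joinable std::thread, which calls std::terminate. Joining while still holding the mutex could deadlock if the exiting thread needed that mutex, which is why the lock is dropped around join().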