Skip to content

Commit

Permalink
Fix a deadlock condition (#331)
Browse files Browse the repository at this point in the history
In OpenSegment(), we are holding a lock to do a bunch of operations, which includes bumping an epoch. Bumping an epoch could trigger drain callbacks, which includes the flush callback. The flush callback may in turn call OpenSegment() again, which would take the lock again, resulting in a self-deadlock.
The solution is to release the lock before bumping epoch: the lock is only to protect the file list, which is already releasable at this point.
  • Loading branch information
sillycross authored Sep 16, 2020
1 parent 0fc5689 commit e61e9be
Showing 1 changed file with 44 additions and 2 deletions.
46 changes: 44 additions & 2 deletions cc/src/device/file_system_disk.h
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,38 @@ class FileSystemFile {
environment::FileOptions file_options_;
};

// Similar to std::lock_guard, but allows manual early unlock
//
class ReleasableLockGuard
{
public:
ReleasableLockGuard(std::mutex* mutex)
{
m_mutex = mutex;
m_mutex->lock();
m_released = false;
}

~ReleasableLockGuard()
{
if (!m_released)
{
m_mutex->unlock();
}
}

void Unlock()
{
assert(!m_released);
m_mutex->unlock();
m_released = true;
}

private:
std::mutex* m_mutex;
bool m_released;
};

/// Manages a bundle of segment files.
template <class H>
class FileSystemSegmentBundle {
Expand Down Expand Up @@ -310,7 +342,7 @@ class FileSystemSegmentedFile {
};

// Only one thread can modify the list of files at a given time.
std::lock_guard<std::mutex> lock{ mutex_ };
ReleasableLockGuard lock{ &mutex_ };
bundle_t* files = files_.load();

if(segment < begin_segment_) {
Expand Down Expand Up @@ -343,6 +375,11 @@ class FileSystemSegmentedFile {
core::IAsyncContext* context_copy;
core::Status result = context.DeepCopy(context_copy);
assert(result == core::Status::Ok);
// unlock the lock before calling BumpCurrentEpoch(),
// which may call completion callbacks which call this function again,
// resulting in self-deadlock.
//
lock.Unlock();
epoch_->BumpCurrentEpoch(callback, context_copy);
return core::Status::Ok;
}
Expand Down Expand Up @@ -386,7 +423,7 @@ class FileSystemSegmentedFile {
};

// Only one thread can modify the list of files at a given time.
std::lock_guard<std::mutex> lock{ mutex_ };
ReleasableLockGuard lock{ &mutex_ };
bundle_t* files = files_.load();
assert(files);
if(files->begin_segment >= new_begin_segment) {
Expand All @@ -407,6 +444,11 @@ class FileSystemSegmentedFile {
core::IAsyncContext* context_copy;
core::Status result = context.DeepCopy(context_copy);
assert(result == core::Status::Ok);
// unlock the lock before calling BumpCurrentEpoch(),
// which may call completion callbacks which call this function again,
// resulting in self-deadlock.
//
lock.Unlock();
epoch_->BumpCurrentEpoch(callback, context_copy);
}

Expand Down

0 comments on commit e61e9be

Please sign in to comment.