refactor: clean up busy waiting in prefetcher blocking get #8215

Merged: 14 commits into near:master, Dec 22, 2022

Conversation

pugachAG (Contributor)

Part of #7723

This PR replaces the busy waiting loop with update notifications on every underlying map update. It introduces a single condition variable which is notified on every map update. One considered alternative is to maintain a separate condition variable per slot for more granular notifications. In practice this doesn't make any difference, since the additional iterations for irrelevant keys wouldn't cause any noticeable performance overhead, while it would result in more complex and harder-to-reason-about code.
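To make the pattern concrete, here is a minimal sketch of the idea, not the PR's actual code; the `Slots` type and its field names are illustrative, with `blocking_get` and `insert_fetched` taken from the discussion below:

```rust
use std::collections::HashMap;
use std::sync::{Condvar, Mutex};

/// Illustrative stand-in for the prefetcher's size-tracked map.
struct Slots {
    // `None` marks a slot that is still being fetched.
    map: Mutex<HashMap<String, Option<Vec<u8>>>>,
    // Single condition variable, notified on every map update.
    updated: Condvar,
}

impl Slots {
    /// Waits for the value under `key` without busy waiting: instead of
    /// polling with sleeps, the thread parks on the condvar and re-checks
    /// only after some update has happened.
    fn blocking_get(&self, key: &str) -> Option<Vec<u8>> {
        let mut guard = self.map.lock().unwrap();
        loop {
            match guard.get(key) {
                // Value fetched: return it.
                Some(Some(value)) => return Some(value.clone()),
                // Slot removed (e.g. cleared): stop waiting.
                None => return None,
                // Still pending: block until the next update notification.
                Some(None) => guard = self.updated.wait(guard).unwrap(),
            }
        }
    }

    /// Every mutation notifies all waiters so they can re-check their keys.
    fn insert_fetched(&self, key: String, value: Vec<u8>) {
        let mut guard = self.map.lock().unwrap();
        guard.insert(key, Some(value));
        // Notify while still holding the guard, as in the std docs example.
        self.updated.notify_all();
    }
}
```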

@pugachAG pugachAG requested a review from a team as a code owner December 13, 2022 13:49
/// This consumes the locked mutex guard to make sure it is unlocked before
/// notifying the condition variable.
fn notify_slots_update(&self, guard: MutexGuard<SizeTrackedHashMap>) {
std::mem::drop(guard);
Collaborator

Is this actually the right thing to do? My intuition is that we want to notify first and unlock after, so that the mutex is fairly distributed among all the waiters on the condvar. If it is unlocked before the notification, some other waiter that's not on the condvar might grab the mutex, leading to all the condvar waiters blocking immediately after being notified (a-la thundering herd?)

The standard library doc examples seem to corroborate my understanding…

pugachAG (Contributor, Author), Dec 15, 2022

> leading to all the condvar waiters blocking immediately after being notified (a-la thundering herd?)

By unlocking the mutex before notifying I wanted to address exactly that issue (see stackoverflow). In this particular case I don't think it would matter since this is definitely not a performance bottleneck. I will change this to match the example Condvar usage from the docs to avoid trying to be "unnecessarily smart" here :)
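For reference, the `Condvar` example in the standard library docs notifies while the mutex guard is still held and lets the guard drop afterwards; the sketch below is adapted from that example (the boolean flag is just a placeholder for the map update):

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        let mut updated = lock.lock().unwrap();
        *updated = true;
        // Notify first; the guard is only released when it goes out of scope.
        cvar.notify_one();
    });

    let (lock, cvar) = &*pair;
    let mut updated = lock.lock().unwrap();
    while !*updated {
        // `wait` atomically releases the mutex and re-acquires it on wake-up.
        updated = cvar.wait(updated).unwrap();
    }
}
```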

Collaborator

FWIW I raised a question about this upstream. They have some interesting insights :)

@pugachAG pugachAG requested a review from nagisa December 15, 2022 14:39
nagisa (Collaborator) left a comment

LGTM. I would love some wrapper that does not require manually keeping track of when to signal the condvar after mutating the protected data, but I think the scope of this is restricted enough that it doesn't matter much if we keep it as is for the time being.

jakmeier (Contributor) left a comment

Awesome stuff!

I believe we currently only care about the notify in insert_fetched and clear, because the other wake-ups just go back to sleep. But I actually prefer it this way, notifying on all modifications, as it makes future changes easier.

A wrapper that "automates" the notify could be added, as nagisa mentioned. I imagine it to be a new guard type that notifies on drop. That could be neat, so if you want to explore this, feel free to do so, it can also be a follow up PR. But I wouldn't say it's necessary.
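A rough sketch of what such a notify-on-drop guard could look like; the type and field names are hypothetical and this is not the code that ended up in the PR:

```rust
use std::ops::{Deref, DerefMut};
use std::sync::{Condvar, MutexGuard};

/// Write guard that notifies the condvar when dropped, so code mutating the
/// protected data cannot forget to wake up the waiters.
struct NotifyingWriteGuard<'a, T> {
    guard: MutexGuard<'a, T>,
    cvar: &'a Condvar,
}

impl<T> Deref for NotifyingWriteGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &*self.guard
    }
}

impl<T> DerefMut for NotifyingWriteGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut *self.guard
    }
}

impl<T> Drop for NotifyingWriteGuard<'_, T> {
    fn drop(&mut self) {
        // Notify while the lock is still held (as in the std docs example);
        // the inner MutexGuard field is released right after this runs.
        self.cvar.notify_all();
    }
}
```

With something like this, the write path just mutates through the guard and lets it fall out of scope; the notification happens automatically.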

pugachAG (Contributor, Author)

@jakmeier @nagisa I've introduced Monitor to wrap mutex and condvar, could you please take another look

jakmeier (Contributor) left a comment

Nice, I really like the Read / Write guard split with the deref impls.

I think we now have even more "unnecessary" notify calls. But it seems like no big issue.

On the plus side, I am now more confident that we haven't missed a case where we needed a notification. So overall, I'm very happy with this wrapper! (And we could even reuse it for other work we want to sync with background threads in the future, such as compilation in the background.)
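Not the PR's actual API, but a self-contained sketch of how such a Monitor-style wrapper with a read guard and a wait-for-condition helper might read; the struct shape and the method names (`lock`, `wait_for`) are assumptions:

```rust
use std::ops::Deref;
use std::sync::{Arc, Condvar, Mutex, MutexGuard};
use std::thread;

/// Assumed shape of a Monitor-style wrapper: a mutex plus a condvar that is
/// notified on every modification of the protected value.
struct Monitor<T> {
    mutex: Mutex<T>,
    cvar: Condvar,
}

/// Read guard exposing the protected value via Deref.
struct ReadGuard<'a, T> {
    guard: MutexGuard<'a, T>,
    cvar: &'a Condvar,
}

impl<T> Monitor<T> {
    fn new(value: T) -> Self {
        Self { mutex: Mutex::new(value), cvar: Condvar::new() }
    }

    fn lock(&self) -> ReadGuard<'_, T> {
        ReadGuard { guard: self.mutex.lock().unwrap(), cvar: &self.cvar }
    }
}

impl<'a, T> ReadGuard<'a, T> {
    /// Blocks until `pred` holds, re-checking after every notification.
    fn wait_for(mut self, pred: impl Fn(&T) -> bool) -> Self {
        while !pred(&*self.guard) {
            self.guard = self.cvar.wait(self.guard).unwrap();
        }
        self
    }
}

impl<T> Deref for ReadGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &*self.guard
    }
}

fn main() {
    // Toy usage: one thread bumps a counter, another blocks until it hits 3.
    // The write side notifies manually here to keep the example short; in a
    // full wrapper it would go through a notify-on-drop write guard like the
    // one sketched earlier.
    let monitor = Arc::new(Monitor::new(0u32));
    let writer = Arc::clone(&monitor);
    thread::spawn(move || {
        for _ in 0..3 {
            *writer.mutex.lock().unwrap() += 1;
            writer.cvar.notify_all();
        }
    });
    let guard = monitor.lock().wait_for(|n| *n >= 3);
    assert_eq!(*guard, 3);
}
```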

// result in any flakiness since the test would still pass if we don't
// sleep enough, it just won't verify the synchronization part of
// `blocking_get`.
std::thread::sleep(Duration::from_micros(1000));
Contributor

Mixing from_millis (which you use further down) and from_micros can be confusing on a quick read; I'd probably go with:

Suggested change
std::thread::sleep(Duration::from_micros(1000));
std::thread::sleep(Duration::from_millis(1));

Collaborator

Is there a reason why this couldn't use Barriers to synchronize the operations? Timers in tests are a code smell.

pugachAG (Contributor, Author), Dec 20, 2022

@nagisa Here we want to execute insert_fetched after blocking_get has been running for some time, so that it performs the initial check for the value and then blocks. As far as I understand, Barrier wouldn't help with that.

I definitely agree with your point regarding relying on timing in tests. I've reimplemented this to avoid using sleep and removed the timeout from the test code, since that is now enforced by nextest. Could you please take another look?
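For illustration, a sleep-free test of this kind can take roughly the following shape (hypothetical, not the PR's actual test; a plain map plus condvar stands in for the prefetcher):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    // The waiter thread blocks until the value shows up; the main thread then
    // inserts it and notifies. Whether the insert happens before or after the
    // waiter's initial check does not affect correctness, and a hang would be
    // caught by the test runner's (nextest) timeout rather than by an
    // in-test timer.
    let state = Arc::new((Mutex::new(HashMap::<String, u32>::new()), Condvar::new()));
    let waiter_state = Arc::clone(&state);

    let waiter = thread::spawn(move || {
        let (map, cvar) = &*waiter_state;
        let mut guard = map.lock().unwrap();
        // Initial check, then block until notified.
        while !guard.contains_key("key") {
            guard = cvar.wait(guard).unwrap();
        }
        *guard.get("key").unwrap()
    });

    let (map, cvar) = &*state;
    map.lock().unwrap().insert("key".to_string(), 42);
    cvar.notify_all();

    assert_eq!(waiter.join().unwrap(), 42);
}
```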

Collaborator

aha yeah, barriers don't work here, I had them flipped during review for some reason.

/// It enables blocking while waiting for the underlying value to be updated.
/// The implementation ensures that any modification results in all blocked
/// threads being notified.
pub(crate) struct Monitor<T> {
Collaborator

Not saying that rolling our own implementation is problematic, but I'm curious why we didn't use something off the shelf, e.g. https://github.com/reem/rust-shared-mutex.

pugachAG (Contributor, Author)

I've considered considered using shared-mutex and actually my Monitor implementation is inspired by it.
Unfortunately it doesn't provide the API we need here, in particular we want SharedMutexWriteGuard to notify condvar when dropped (see this comment).

Collaborator

Got it, thanks for the explanation.

near-bulldozer bot merged commit 915b08b into near:master, Dec 22, 2022
nikurt pushed a commit to nikurt/nearcore that referenced this pull request Dec 23, 2022