-
Notifications
You must be signed in to change notification settings - Fork 265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uses try_recv() for the accounts read cache evictor #3867
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ use { | |
Arc, Mutex, | ||
}, | ||
thread, | ||
time::Duration, | ||
}, | ||
}; | ||
|
||
|
@@ -305,13 +306,24 @@ impl ReadOnlyAccountsCache { | |
.spawn(move || { | ||
info!("AccountsReadCacheEvictor has started"); | ||
loop { | ||
let res = receiver.recv(); | ||
if let Err(err) = res { | ||
// The only error is when the channel is empty and disconnected. | ||
// Disconnecting the channel is the intended way to stop the evictor. | ||
trace!("AccountsReadCacheEvictor is shutting down... {err}"); | ||
break; | ||
}; | ||
// Note: We use `try_recv()` here to avoid registering a Waker. | ||
// This ensures the sender doesn't need to grab a lock to wake us up. | ||
let res = receiver.try_recv(); | ||
match res { | ||
Ok(_) => { | ||
// Evict request received! | ||
} | ||
Err(crossbeam_channel::TryRecvError::Empty) => { | ||
// No requests to evict were received, so sleep and check again. | ||
// Note: We don't need/want to evict often. 100 ms is already four | ||
// times per slot, which should be plenty. | ||
thread::sleep(Duration::from_millis(100)); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how many evict request can be queued in 100ms? Is the channel unbounded? What's the upper bound on the memory that could be queued? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The channel is size 1. The eviction channel is therefore tiny. The main question is how much memory the read cache itself will use in that 100 milliseconds while the evictor is sleeping. Let me spin up a node a see how it compares. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Over a 15 minute interval, here's how may times a v2.0 canary wakes up and does productive work. Note, each of these counters/datapoints is submitted once per second. So if there were 10 wakeups in a second, we'd see 10. Since the value is either 1 or 0, it means we are only waking up once per second max at the moment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since we have watermarks for the size (high and low), and wakeups of at most 1/2 per second, I think that means a sleep of 100 ms is fine here too. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The original idea for the evictor was that it would run infrequently, and being awoken was better than polling and sleeping. Maybe that design should be revisited. We can remove the channel and the background evictor can wakeup and check the size at a fixed time interval. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tested this out on mnb, and it seems to be working fine. Graphs are over 6 hours. Blue is running this PR, purple is running master. total number of wakeups is up, which is expected as now we check ever 100 ms: The number of productive wakeups is basically the same: The time spent storing new accounts into the read cache, aka the important one, is about the same? Maybe slightly better? Not by a lot though: And lastly the size of the cache itself. There's a spike at the end, but only 10 MB, so seems fine to me. |
||
} | ||
Err(crossbeam_channel::TryRecvError::Disconnected) => { | ||
// Disconnecting the channel is the intended way to stop the evictor. | ||
break; | ||
} | ||
} | ||
stats | ||
.evictor_wakeup_count_all | ||
.fetch_add(1, Ordering::Relaxed); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use crossbeam_channel::TryRecvError;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, originally I had that in my first iteration: fecc6d2. I removed it since in this file we explicitly call out
crossbeam_channel
types explicitly, so I wanted to follow that precedent. Wdyt, keep as is, or go back to v1 with theuse
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh no yeah consistency is gud