-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change Blockstore max_root from RwLock<Slot> to AtomicU64 #33998
Conversation
@@ -1192,7 +1182,7 @@ impl Blockstore { | |||
return false; | |||
} | |||
|
|||
if !Blockstore::should_insert_coding_shred(&shred, &self.last_root) { | |||
if !Blockstore::should_insert_coding_shred(&shred, self.max_root()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The former RwLock
/ now AtomicU64
is getting fetched for every single shred that is inserted. I was thinking we could fetch this value once at the start of the loop that iterates through every shred in Blockstore::do_insert_shreds()
. Looking at metrics for my node for the last 24 hours, I'm seeing something like 4k shreds per second (blockstore-insert-shreds.num_shreds
is around 8k but divide by 2 since the metric is reported every 2s).
However, glancing at recv-window-insert-shreds.run_insert_count
, I see a value on the order of 1000's which would indicate that on average, we're calling Blockstore::insert_shreds()
with a small number of shreds (ie 8k shreds from above / 4000 calls = 2 shred/call). So, the gains of reading once at the start may not be so big (factor of 2), but wondering if having so many calls with so few shreds per call is intended / optimal behavior.
Probably inclined to leave this alone for now and to revisit later when/if investigating why so few shreds per each call to Blockstore::insert_shreds()
. From pure efficiency standpoint, coalescing shreds would amortize the overhead incurred in insert_shreds()
. However, shred ingestion directly feeds replay, and any increases to latency here would likely be deemed unacceptable
Codecov Report
@@ Coverage Diff @@
## master #33998 +/- ##
=========================================
- Coverage 81.9% 81.9% -0.1%
=========================================
Files 811 811
Lines 219412 219407 -5
=========================================
- Hits 179771 179750 -21
- Misses 39641 39657 +16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, this turned out to be quite straightforward! r+ restoring pub fn last_root()
The max_root RxLock had the previous workload - The single writer holds the write lock just long enough to update - The readers do not hold the lock; rather, they hold it just long enough to read the value out Thus, the access patterns do not require this to be a lock, and we can use a simple atomic
I'm slightly dubious of the test given that it "rolls back" a root, this can't happen.
a750b1e
to
4adf9ae
Compare
@CriesofCarrots - Can I get another ship-it; I had to rebase to resolve a conflict from the other PR where I ripped out that test-only function that allowed for manual modification of the previous |
should the rename be separated out into its own PR? |
Arguably it could be; however, between the total diff not being overly large and me having intermediate commits for rename / updating callers / change to Atomic / etc, I feel like this is fine to leave as one. |
debug_assert_matches!(shred.sanitize(), Ok(())); | ||
shred.is_code() && shred.slot() > *last_root.read().unwrap() | ||
shred.is_code() && shred.slot() > max_root |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a slight change of behavior here, because previously the code was fetching the most recent last_root
by passing the &RwLock<u64>
directly here, but now the max_root
is loaded earlier in the call stack and can be stalled and not be equal to self.max_root()
anymore.
This can potentially result in more unnecessary processing of old shreds, but I don't know how much impact it has.
Is there a downside to pass &AtomicU64
here and do the load
here as opposed to earlier in the call-tack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but now the max_root is loaded earlier in the call stack and can be stalled and not be equal to self.max_root() anymore.
This can potentially result in more unnecessary processing of old shreds, but I don't know how much impact it has.
Blockstore::check_insert_coding_shred()
is the caller and the new location of self.max_root()
:
solana/ledger/src/blockstore.rs
Line 1185 in 04e4efd
if !Blockstore::should_insert_coding_shred(&shred, self.max_root()) { |
And here is the full implementation of should_insert_coding_shred()
:
solana/ledger/src/blockstore.rs
Lines 1412 to 1415 in 04e4efd
fn should_insert_coding_shred(shred: &Shred, max_root: Slot) -> bool { | |
debug_assert_matches!(shred.sanitize(), Ok(())); | |
shred.is_code() && shred.slot() > max_root | |
} |
So, the change is that self.max_root()
is getting called prior to shred.is_code() && shred.slot()
:
-shred.is_code()
is a simple equality check on shred type
shred.slot()
just reference the slot from the shred's common header.
So, while it is getting called earlier in the callstack, the amount earlier is these two minimal operations.
Problem
The
Blockstore
currently maintains aRwLock<Slot>
of the maximum root it has seen inserted. The value is initialized duringBlockstore::open()
and updated during calls toBlockstore::set_roots()
. Accessing thisRwLock
is cheaper than constructing an iterator through the roots column every time the maximum root value is desired. The maximum root value is queried fairly often:confirmed
However, the access pattens of the max slot mirror that of an atomic, and the
RwLock
is overkill. Namely,Summary of Changes
last_root
field tomax_root
max_root
from aRwLock<Slot>
toAtomicU64
This change is fairly small, but things are still broken up fairly well by commit for easier review