Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Keep track of dirty stores on remove accounts to clean #17601

Merged
merged 3 commits into from
Jun 23, 2021

Conversation

sakridge
Copy link
Contributor

Problem

zero_lamport_keys set grows and slows down clean_accounts which then prevents flushing for many seconds. On mainnet, 11m zero lamport keys are can be present taking 50 seconds to perform the clean operation.

Summary of Changes

On remove_accounts keep track of the store which is touched and add that to a dirty set. Then in clean_accounts use that dirty set to go and look at all keys which are present in the dirty stores and if they can be cleaned.

Reduces the clean_accounts operation down to about 4-5 seconds.

Fixes #

@sakridge sakridge force-pushed the clean-with-dirty-stores branch 2 times, most recently from 106eaac to 9a8f1ca Compare June 1, 2021 12:57
@codecov
Copy link

codecov bot commented Jun 1, 2021

Codecov Report

Merging #17601 (43b09c4) into master (4f8528c) will increase coverage by 0.0%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##           master   #17601   +/-   ##
=======================================
  Coverage    82.6%    82.6%           
=======================================
  Files         435      435           
  Lines      121577   121607   +30     
=======================================
+ Hits       100472   100499   +27     
- Misses      21105    21108    +3     

@sakridge sakridge force-pushed the clean-with-dirty-stores branch 2 times, most recently from 686f75e to ef98258 Compare June 2, 2021 14:51
@sakridge sakridge marked this pull request as ready for review June 2, 2021 20:10
@sakridge sakridge requested review from ryoqun and carllin June 3, 2021 13:42
@sakridge
Copy link
Contributor Author

sakridge commented Jun 3, 2021

This allowed me to run a validator on merckx with 64gb of ram. Otherwise the clean prevents flush and the accounts cache runs the machine out of memory. Overall size of accounts doesn't seem too affected:

I ran it for a couple days:

Every 5.0s: du --max-depth 1 -h validator-ledger/accounts/

15G     validator-ledger/accounts/accounts
24G     validator-ledger/accounts/

snapshots seem similar:

-rw-rw-r--  1 sakridge sakridge 6.7G Jun  3 07:17 snapshot-81235135-DYpphBwbG6Qdw2yBTDxNaB4nCK8JUN7if1oE8doUVeqH.tar.zst
-rw-rw-r--  1 sakridge sakridge 6.7G Jun  3 07:21 snapshot-81235513-HicJFEFAptrrcSbo6FFzTg92yFAJiZyn3RKKACJe6tCX.tar.zst

Total memory use around 48GB.

runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
Comment on lines 4815 to 5063
self.dirty_stores
.insert((*slot, store.append_vec_id()), store.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm it's not immediately obvious to me why removing an account from a storage entry should add the entire store to the dirty_stores list. Maybe a comment would help?

I think the reasoning here is if for example:

  1. the storage entry for slot S contains some pubkey P
  2. Pubkey P has zero lamports in some slot S' > S.

Then when the store for slot S hits count ==0, then this could potentially unblock the clean of the zero-lamport P in S'?

In this case, should we only insert into the dirty stores if the count == 0 below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it's not just count==0, a shrink which removes an entry could also unblock it. Maybe the logic should be then to add if count==0 and then in shrink add the unref'ed key set there as well. I don't think there are any other cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you have a 0-lamport account which is the newest entry for that pubkey in slot S for X. You also have a non-zero lamport update there for Y. Then in slot S+1 you update Y to 0. Then the clean will deref Y from S and the 0-lamport update for X will keep the count=1, although now this store can be removed as long as X reference count in storage is 1 or if any other existing updates for it can be also removed.

Copy link
Contributor

@carllin carllin Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, that's a great catch. Hmmm yeah, feeling that familiar ache in the noggin 🤯 . So right now even though X is:

  1. zero lamport in S
  2. has only one reference still alive in an storage entry so account_infos.len() as u64 == *ref_count_from_storage so this check will return false

We still fail the deletion of X here:

if store_counts.get(&account_info.store_id).unwrap().0 != 0 {
no_delete = true;
break;
because the storage entry S it's in contains another non-zero lamport update for Y. In this case, is there a reason not to delete X if account_infos.len() as u64 == *ref_count_from_storage even though the storage entry count is not zero via purge_exact() so that:

  1. The storage entry count can be decremented
  2. This zero-lamport update can be removed from the index and removed via shrink later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I think that's right. I changed to only inserting when count==0 such that the store is removed. That will free up those keys potentially in newer slots.

Copy link
Contributor

@carllin carllin Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you have a 0-lamport account which is the newest entry for that pubkey in slot S for X. You also have a non-zero lamport update there for Y. Then in slot S+1 you update Y to 0. Then the clean will deref Y from S and the 0-lamport update for X will keep the count=1, although now this store can be removed as long as X reference count in storage is 1 or if any other existing updates for it can be also removed.

Hmmm, is this above situation resolved

So we had this situation:

<Slot S>                   <Slot S+1>         
X=0, Y=1                        Y=0
  1. Slot S gets rooted, incurs a clean on X and Y. Account X cannot be removed because the count of slot S starts at 2, gets decremented by 1 for the zero lamport account X:

    store_count.0 -= 1;
    , but is ultimately not not zero in calc_delete_dependencies:
    if store_counts.get(&account_info.store_id).unwrap().0 != 0 {

  2. S+1 gets rooted and added to the dirty_stores, which incurs a clean on account Y. dropping the count of slot S to 1. However, Y won't be deleted because its ref count is still 2 (due to the update in slot S), the ref count is only decremented when either slot S is dead or shrunk.

  3. In shrink, there's two issues:
    a) We may not shrink S because it doesn't meet the page size shrink criteria, even though not performing this shrink prevents the ref count of zero lamport slots from hitting zero, and thus is blocking a lot of zero lamport cleans in later slots.
    b) Even if we shrink slot S, but the store still has count == 1 (the update to X is the latest update to 0), so we don't hit this case:

    if store.count() == 0 {
    self.dirty_stores
    .insert((slot, store.append_vec_id()), store.clone());
    and we still don't add X to the clean list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the fix here is to say if in this check here:

let no_delete = if account_infos.len() as u64 != *ref_count_from_storage {
, if if account_infos.len() as u64 == *ref_count_from_storage == 1, then we don't need to make the additional check in this else block:
} else {
let mut no_delete = false;
for (_slot, account_info) in account_infos {
debug!(
"calc_delete_dependencies()
storage id: {},
count len: {}",
account_info.store_id,
store_counts.get(&account_info.store_id).unwrap().0,
);
if store_counts.get(&account_info.store_id).unwrap().0 != 0 {
no_delete = true;
break;
}
}
since we know there's no other slots that we are shielding

A test might be good to illustrate this situation here. I think there are similar ones, but the key difference here is we're cleaning after setting a root on S, and then cleaning after setting a root on S+1, instead of cleaning only once after rooting both S and S + 1

Copy link
Contributor Author

@sakridge sakridge Jun 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/solana-labs/solana/pull/17601/files#diff-1090394420d51617f3233275c2b65ed706b35b53b115fe65f82c682af8134a6fR9803-R9848

Doesn't this test do that?

a) We may not shrink S because it doesn't meet the page size shrink criteria, even though not performing this shrink prevents the ref count of zero lamport slots from hitting zero, and thus is blocking a lot of zero lamport cleans in later slots.

That's true, somewhat the price to pay for not shrinking very aggressively which would increase write amplification. Although, now we have knobs for that, user can shrink as aggressively as they like.

b) Even if we shrink slot S, but the store still has count == 1 (the update to X is the latest update to 0), so we don't hit this case:

I don't think this is the case. When we shrink, the old store(s) should always have count==0 because all live accounts would be updated in it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this test do that?

Yeah, this test tests exactly this case, thanks!

I don't think this is the case. When we shrink, the old store(s) should always have count==0 because all live accounts would be updated in it.

Ah yeah, I misread that bit of logic and thought it would only execute if the newly shrunk store had count==0, not the old unshrunken store, which will always have count == 0 . So this will essentially add all the accounts in the old store for that slot to the candidate clean set.

Is it true that if we shrink unshrunken storage entry U to a shrunken storage S, we only need to add the zero-lamport pubkeys in S to the candidate clean set if the only remaining alive accounts in S are zero lamport themselves?

i.e. if we declared a

let mut only_zero_lamport_accounts_alive = true;

Had a check here:

alive_accounts.push((pubkey, stored_account));

if stored_account.account.account_meta.lamports != 0 {
     only_zero_lamport_accounts_alive = false;
}

And then add to the dirty stores here:

if store.count() == 0 {

// Add only the zero lamport accounts to be cleaned
if only_zero_lamport_accounts_alive && store.count() > 0 {
         self.dirty_stores
                            .insert((slot, store.append_vec_id()), store.clone());
}

Reasoning being:

  1. If the post-shrunk store S has any non-zero lamport keys K, then when K is updated in later slots and S becomes dead, these keys will be added to the clean list (so don't need to add them in shrink).
  2. If the post-shrunk store S only has zero lamport keys, then we can run into the edge case so we should add them to clean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea. This could be an optimization on the reducing the clean pubkey set, right? I think that could be a follow-up change. This already has a huge impact on the clean time, and a few extra keys are not that expensive to deal with.

runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
fn construct_candidate_clean_keys(
&self,
max_clean_root: Option<Slot>,
timings: &mut CleanKeyTimings,
) -> Vec<Pubkey> {
let mut zero_lamport_key_clone = Measure::start("zero_lamport_key");
let pubkeys = self.accounts_index.zero_lamport_pubkeys().clone();
Copy link
Contributor

@carllin carllin Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tag @ryoqun, if you're reviewing as well, this seems to be the crux of the change/savings 😃

Ha interesting, so most of the time in clean was spent cloning out the 11m zero-lamport keys (almost 45 seconds? 😮 ) I would have thought most of the time was spent in the par scan:

let do_clean_scan = || {
pubkeys
.par_chunks(4096)
.map(|pubkeys: &[Pubkey]| {
and clean_accounts_older_than_root call:
self.clean_accounts_older_than_root(purges_old_accounts, max_clean_root);
Good find!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not the cost of the clone itself. This change reduces the number of keys that we look at to something much smaller than 11 million.

The scan and then the dependency checks where I think where each about 50% of the time, around 10-20 seconds each.

The structure of this change is to reduce the number of keys that is input into this function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yup, that makes sense, fewer inputs to the clean.

@sakridge sakridge force-pushed the clean-with-dirty-stores branch 3 times, most recently from e4f0a0a to aca3d61 Compare June 4, 2021 16:44
@sakridge sakridge force-pushed the clean-with-dirty-stores branch 4 times, most recently from bbb5d18 to cfeff07 Compare June 14, 2021 17:34
Comment on lines +5136 to +5137
self.dirty_stores
.insert((*slot, store.append_vec_id()), store.clone());
Copy link
Contributor

@carllin carllin Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another annoying case to think about is that right now unrooted slots can also prevent zero lamport cleanup due to them counting toward the ref_count_from_storage, but only rooted slots are included in account_infos.len() here:

let no_delete = if account_infos.len() as u64 != *ref_count_from_storage {
. So for example if we had:

<Slot S, Rooted>         <Slot S+1, Unrooted>               <Slot S+2, Rooted>
Y=1                       Y=2, Z=1                              Y=0

The ref_count_from_storage of Y is 2:

*ref_count = self.accounts_index.ref_count_from_storage(&key);
, because even though rooted slot S is dead, unrooted slot S+1 is alive and adds to the ref count for Y. Meanwhile account_infos only contains the rooted update in slot S+2, so this check fails
let no_delete = if account_infos.len() as u64 != *ref_count_from_storage {
.

Luckily, later, when the bank for slot S+1 is dropped, cleanup happens in purge_slots() and calls into purge_slot_storage():

self.purge_slot_storage(*remove_slot, purge_stats);
, which should should ad slot S+1 to the dirty stores:
self.dirty_stores
.insert((*slot, store.append_vec_id()), store.clone());
dead_slots.insert(*slot);
.

I think it would be good to add a test here. I think you can copy and paste this one: carllin@c2683cd#diff-1090394420d51617f3233275c2b65ed706b35b53b115fe65f82c682af8134a6fR11025-R11077, which tests exactly this case 😃 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails on master as well. I think we can do another change to handle this case. It doesn't seem to be related to this change specifically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails on master as well

Shoot, that's because the test was wrong!

It doesn't seem to be related to this change specifically

Yeah it's not directly related, but the removal of the zero-lamport accounts in the candidate key generation had a chance of making the clean fail here: https://github.com/carllin/solana/blob/5bb5284438f0c5a545a4db53924f28e5759a3591/runtime/src/accounts_db.rs#L11670-L11673 if the slot storage wasn't being correctly added to the dirty_stores list in the purge_slots() path, so it was just an additional corner case to consider.

I extracted the test into a separate PR, which passes now on master and with these changes, so not a blocker here 😃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

@sakridge sakridge merged commit 3b1738c into solana-labs:master Jun 23, 2021
@sakridge sakridge deleted the clean-with-dirty-stores branch June 23, 2021 08:28
@sakridge sakridge added the v1.7 label Jun 23, 2021
mergify bot pushed a commit that referenced this pull request Jun 23, 2021
* Keep track of dirty stores on remove accounts to clean

and not zero_lamport key set

* Only dirty when count==0?

* Add another clean

(cherry picked from commit 3b1738c)
mergify bot added a commit that referenced this pull request Jun 23, 2021
* Keep track of dirty stores on remove accounts to clean

and not zero_lamport key set

* Only dirty when count==0?

* Add another clean

(cherry picked from commit 3b1738c)

Co-authored-by: sakridge <sakridge@gmail.com>
@brooksprumo brooksprumo mentioned this pull request Aug 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants