
blockstore: use erasure meta index field to find conflicting shreds #1151

Merged: 3 commits from conflicting-coding-shred into anza-xyz:master on May 10, 2024

Conversation

AshwinSekar (Author)

Problem

We perform a costly scan in order to find the original coding shred when an erasure config mismatch is detected.

Summary of Changes

Instead, use the new field on erasure meta introduced in #961.

Fall back to the scan if the field is 0 in order to support old blockstores. This fallback can be removed once we no longer support versions < v1.18.12.
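For illustration, a minimal sketch of the lookup-with-fallback described above. All names here (first_received_coding_index, get_coding_shred, scan_for_conflicting_coding_shred) are hypothetical stand-ins, not necessarily the actual agave identifiers:

    // Hypothetical, simplified sketch; not the actual agave implementation.
    struct ErasureMeta {
        /// Index of the coding shred that created this erasure meta.
        /// Blockstores written before v1.18.12 persist 0 here.
        first_received_coding_index: u64,
    }

    struct Blockstore;

    impl Blockstore {
        fn get_coding_shred(&self, _slot: u64, _index: u64) -> Option<Vec<u8>> {
            unimplemented!("cheap point lookup by (slot, index)")
        }

        fn scan_for_conflicting_coding_shred(&self, _slot: u64) -> Option<Vec<u8>> {
            unimplemented!("costly scan over the erasure set")
        }

        fn find_conflicting_coding_shred(&self, slot: u64, meta: &ErasureMeta) -> Option<Vec<u8>> {
            if meta.first_received_coding_index != 0 {
                // New path: fetch the original coding shred directly.
                return self.get_coding_shred(slot, meta.first_received_coding_index);
            }
            // Old blockstore: the field was never populated, so fall back to the scan.
            self.scan_for_conflicting_coding_shred(slot)
        }
    }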

@codecov-commenter

codecov-commenter commented May 2, 2024

Codecov Report

Attention: Patch coverage is 80.00000%, with 4 lines in your changes missing coverage. Please review.

Project coverage is 82.1%. Comparing base (4ae2ca1) to head (ded4d5b).
Report is 41 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1151   +/-   ##
=======================================
  Coverage    82.1%    82.1%           
=======================================
  Files         886      886           
  Lines      236439   236490   +51     
=======================================
+ Hits       194252   194306   +54     
+ Misses      42187    42184    -3     

@AshwinSekar AshwinSekar requested review from behzadnouri and steviez May 2, 2024 20:22
@AshwinSekar AshwinSekar marked this pull request as ready for review May 2, 2024 20:22
if !self.has_duplicate_shreds_in_slot(slot) {
    let conflicting_shred = self
        .find_conflicting_coding_shred(&shred, slot, erasure_meta, just_received_shreds)
        .expect("Shred indicated by erasure meta must exist")


This expect made me wonder: what happens in the scenario where a shred is purged/pruned/dumped from the blockstore between the time erasure_meta is obtained and the point here where the shred is fetched from the blockstore?

AshwinSekar (Author)


When replay decides to dump a block it uses clear_unconfirmed_slot, which shares a lock with shred insertion precisely to avoid the situation you mentioned:

agave/ledger/src/blockstore.rs

Lines 1290 to 1291 in bf1b765

pub fn clear_unconfirmed_slot(&self, slot: Slot) {
    let _lock = self.insert_shreds_lock.lock().unwrap();

I'm not an expert on the blockstore automatic cleanup (cc @steviez). While it doesn't share this lock, it looks at blockstore.max_root() in order to determine which slots to clean up. We also compare against blockstore.max_root() earlier:

if !Blockstore::should_insert_coding_shred(&shred, self.max_root()) {

but this seems unsafe, as replay_stage could have rooted and triggered an automatic cleanup in between.
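For reference, a minimal sketch of the locking relationship described in this thread. The structure and names are simplified and hypothetical, not the actual agave code:

    use std::sync::Mutex;

    // Hypothetical, simplified sketch; not the actual agave Blockstore.
    struct Blockstore {
        insert_shreds_lock: Mutex<()>,
        max_root: u64,
    }

    impl Blockstore {
        // Shred insertion and replay's block dump both take the same lock,
        // so they cannot interleave with each other.
        fn insert_shreds(&self) {
            let _lock = self.insert_shreds_lock.lock().unwrap();
            // ...read erasure meta, look up conflicting shreds, write proofs...
        }

        fn clear_unconfirmed_slot(&self, _slot: u64) {
            let _lock = self.insert_shreds_lock.lock().unwrap();
            // ...purge the slot's shreds and metadata...
        }

        // The automatic cleanup, by contrast, does not take the lock; it only
        // consults max_root, which is the window being questioned above.
        fn cleanup_old_slots(&self) {
            let root = self.max_root;
            // ...purge slots strictly older than `root`...
            let _ = root;
        }
    }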

AshwinSekar (Author)


Hmm, will recheck these blockstore expects under the assumption that cleanup could have happened: https://discord.com/channels/428295358100013066/1235788902888116274/1235942333334290442



You're correct that the BlockstoreCleanupService doesn't share the insert_shreds_lock. And right, the service fetches the latest root and potentially cleans data that is strictly older than that root. We never clean data newer than the latest root.

Practically speaking, the value for --limit-ledger-size is required to be >= 50M. Assuming 1k shreds per slot, the oldest data is ~50k slots older than the newest data. That being said, I think there is merit to an unwrap() and expect() audit like y'all are suggesting.


> Hmm, will recheck these blockstore expects under the assumption that cleanup could have happened: https://discord.com/channels/428295358100013066/1235788902888116274/1235942333334290442

Given that mainnet is upgrading to v1.18 and we have not had much soak time, it is probably better to demote all (or most) of those expects to an error log or metric. We can always add them back once we are confident we are not missing an edge case.
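As a concrete illustration of that suggestion, a hedged sketch of demoting such an expect to an error log; the function shape and names here are hypothetical:

    use log::error;

    // Hypothetical sketch: instead of panicking when the shred indicated by the
    // erasure meta is missing, log an error and skip duplicate-proof generation.
    fn check_erasure_conflict(
        slot: u64,
        find_conflicting_coding_shred: impl Fn(u64) -> Option<Vec<u8>>,
    ) -> Option<Vec<u8>> {
        match find_conflicting_coding_shred(slot) {
            Some(shred) => Some(shred),
            None => {
                // Previously: .expect("Shred indicated by erasure meta must exist")
                error!("Unable to find the conflicting coding shred indicated by erasure meta for slot {slot}");
                None
            }
        }
    }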

AshwinSekar (Author)


I combed through the expects/unwraps we introduced and demoted the affected ones in #1259.

I believe these remaining checks to be safe and have not demoted them:

.expect("Erasure meta was just created, initial shred must exist");

.expect("Merkle root meta was just created, initial shred must exist");

    .store_duplicate_slot(slot, conflicting_shred.clone(), shred.payload().clone())
    .is_err()
{
    warn!("bad duplicate store..");


This was old code, but nonetheless we need to log the actual error rather than just dropping it.

@AshwinSekar AshwinSekar force-pushed the conflicting-coding-shred branch from 2dd7027 to c3b08ae on May 7, 2024 21:00
behzadnouri
behzadnouri previously approved these changes May 8, 2024
        }

        duplicate_shreds.push(PossibleDuplicateShred::ErasureConflict(
            shred.clone(),
            conflicting_shred,
        ));
    }
} else {
    datapoint_info!("bad-conflict-shred", ("slot", slot, i64));


Maybe keep this line (or an error log or something similar) for when find_conflicting_coding_shred returns None.



We have a couple of warn!()s under the // ToDo: ... line; do you think those are sufficient? Namely, I think self.has_duplicate_shreds_in_slot(slot) still being false would indicate that we did not find a conflicting coding shred.


Hmm! The warn! logs will be there regardless of this datapoint, but this bad-conflict-shred would specifically indicate that we have a mismatching erasure meta but can't find the shred which initialized that erasure meta.

Anyway, I am inclined to keep this datapoint in case we observe these anomalies, but I don't feel strongly about it, so either way is fine with me.

AshwinSekar (Author)


I added it back as an error log.

Comment on lines +1424 to +1427
if let Err(e) = self.store_duplicate_slot(
    slot,
    conflicting_shred.clone(),
    shred.payload().clone(),


Semi-related to the PR (but certainly not blockers):

  • Wondering why we didn't use write_batch for this function to store the updates, not that the updates to this column need to occur atomically with any other column.
  • Wondering if we could store the proof without cloning the two Vecs; we'd want to serialize with borrowed data but deserialize with owned data. A quick glance suggests bincode supports Cow (see the sketch below).
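A hedged sketch of that second bullet: serde serializes a Cow by reference and, absent #[serde(borrow)], deserializes it into Cow::Owned, so the write path can borrow the payloads while the read path owns them. The DuplicateSlotProof type below is hypothetical, not the actual agave struct:

    use std::borrow::Cow;
    use serde::{Deserialize, Serialize};

    // Hypothetical proof type for illustration only.
    #[derive(Serialize, Deserialize)]
    struct DuplicateSlotProof<'a> {
        shred1: Cow<'a, [u8]>,
        shred2: Cow<'a, [u8]>,
    }

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let payload1 = vec![1u8, 2, 3];
        let payload2 = vec![4u8, 5, 6];

        // Serialize: borrow the shred payloads instead of cloning the Vecs.
        let proof = DuplicateSlotProof {
            shred1: Cow::Borrowed(&payload1),
            shred2: Cow::Borrowed(&payload2),
        };
        let bytes = bincode::serialize(&proof)?;

        // Deserialize: comes back as Cow::Owned, independent of `bytes`.
        let restored: DuplicateSlotProof = bincode::deserialize(&bytes)?;
        assert_eq!(restored.shred1.as_ref(), payload1.as_slice());
        assert!(matches!(restored.shred1, Cow::Owned(_)));
        Ok(())
    }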

AshwinSekar (Author)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could definitely be smarter about this:

  • write batch to store duplicate proof updates
  • cache a map so we don't have to read the blockstore every time we check whether there's already a duplicate proof
  • avoid the clones

I guess duplicates happen so infrequently that this isn't a major problem, but I can definitely look into cleaning this up in a follow-up.


@steviez steviez May 9, 2024


> Wondering why we didn't use write_batch for this function to store the updates

I figured this out and thought I edited my comment, but I guess I did not:

  • store_duplicate_slot() and has_duplicate_shreds_in_slot() both read from the DB
  • Items written to the write_batch are not visible to the DB until the batch is committed

We call has_duplicate_shreds_in_slot() in the same function, potentially after calling store_duplicate_slot() ... if we used the write_batch, the read wouldn't see the write immediately before it. And if insert_shreds_handle_duplicate() was called with 100 shreds that we're iterating through, a duplicate proof written for the 5th shred would not be visible when reading for the next 95 shreds.

This same gotcha is why we have the get_shred_from_just_inserted_or_db() function in combination with the HashMap for getting shreds that are in the same "insert batch".
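For illustration, a minimal sketch of that "just inserted or DB" lookup pattern; the names and types below are hypothetical and far simpler than the actual get_shred_from_just_inserted_or_db():

    use std::borrow::Cow;
    use std::collections::HashMap;

    // Hypothetical, simplified sketch; not the actual agave implementation.
    type ShredId = (u64 /* slot */, u32 /* index */);
    type Payload = Vec<u8>;

    fn lookup_shred<'a>(
        just_inserted: &'a HashMap<ShredId, Payload>,
        db_get: impl Fn(ShredId) -> Option<Payload>,
        id: ShredId,
    ) -> Option<Cow<'a, Payload>> {
        // Entries staged in the current write batch are not visible to DB reads
        // until the batch is committed, so consult the in-memory map first.
        if let Some(payload) = just_inserted.get(&id) {
            return Some(Cow::Borrowed(payload));
        }
        // Otherwise fall back to a real blockstore/DB read.
        db_get(id).map(Cow::Owned)
    }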

steviez
steviez previously approved these changes May 9, 2024

@steviez steviez left a comment


Change looks good to me, but I think it'd be good to confirm that Behzad's concern about the removed datapoint is adequately covered by the comments further down the function that I mentioned.

@AshwinSekar AshwinSekar dismissed stale reviews from steviez and behzadnouri via ded4d5b May 9, 2024 21:33
@AshwinSekar AshwinSekar merged commit 6cd0128 into anza-xyz:master May 10, 2024
38 checks passed
@AshwinSekar AshwinSekar deleted the conflicting-coding-shred branch May 10, 2024 01:12