resharding: verify proof #12485

Merged: 6 commits into near:master from rv3-verify-proof on Nov 23, 2024
Conversation

Longarithm (Member):

Completes the resharding validation flow on the chunk validator side.

The code replaces MainTransition::ShardLayoutChange with actual logic. All the preparation is needed to call `let new_root = trie.retain_split_shard(&boundary_account, retain_mode)?;` and verify the result against the state root provided in the transition.

The main complexity is inserting the implicit transition for resharding in the right place: after applying the last chunk in the old shard layout (which may itself be missing), and before applying the first chunk in the new layout. To do that, I needed to modify `get_state_witness_block_range` once again, and to keep it readable I rewrote it using `TraversalPosition`, whose point is to change all parameters of the loop at once instead of one by one inside the loop. Then I generate `ImplicitTransitionParams` right in the loop. Finally, this makes it possible to insert the resharding transition before jumping to the previous block hash and child shard id, if needed. Note that this flow runs in reverse of chain order.
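Schematically, the reverse traversal looks roughly like the sketch below. All type names, helper closures, and the stopping condition are simplified stand-ins for illustration only, not the actual definitions used in `get_state_witness_block_range`:

```rust
// Simplified sketch of the reverse traversal described above.
#[derive(Clone, Copy, PartialEq, Eq)]
struct BlockHash(u64);
#[derive(Clone, Copy, PartialEq, Eq)]
struct ShardId(u32);

/// All loop parameters, updated together once per iteration
/// (the role played by `TraversalPosition` in the PR).
struct Position {
    shard_id: ShardId,
    prev_hash: BlockHash,
    num_new_chunks_seen: u32,
}

/// Implicit transitions the witness must cover
/// (the role of `ImplicitTransitionParams` in the PR).
enum ImplicitTransition {
    /// State transition of a block whose chunk is missing.
    ApplyMissingChunk { block: BlockHash, shard_id: ShardId },
    /// Resharding split: retain one half of the parent trie.
    Resharding { boundary_account: String, retain_left: bool },
}

fn collect_implicit_transitions(
    mut pos: Position,
    // Hypothetical lookups standing in for chain store / epoch manager queries.
    prev_hash_of: impl Fn(BlockHash) -> BlockHash,
    has_new_chunk: impl Fn(BlockHash, ShardId) -> bool,
    // Returns Some((boundary_account, retain_left, parent_shard)) when the
    // shard layout changes at this block for this shard.
    resharding_at: impl Fn(BlockHash, ShardId) -> Option<(String, bool, ShardId)>,
) -> Vec<ImplicitTransition> {
    let mut transitions = Vec::new();
    // Walk backwards until enough new chunks have been seen.
    while pos.num_new_chunks_seen < 2 {
        if has_new_chunk(pos.prev_hash, pos.shard_id) {
            pos.num_new_chunks_seen += 1;
        } else {
            // Missing chunk: its implicit state transition must be in the witness.
            transitions.push(ImplicitTransition::ApplyMissingChunk {
                block: pos.prev_hash,
                shard_id: pos.shard_id,
            });
        }
        // If the shard layout changes here, record the resharding transition
        // before jumping to the previous block hash and switching the
        // traversal to the pre-split (parent) shard id.
        if let Some((boundary_account, retain_left, parent_shard)) =
            resharding_at(pos.prev_hash, pos.shard_id)
        {
            transitions.push(ImplicitTransition::Resharding { boundary_account, retain_left });
            pos.shard_id = parent_shard;
        }
        // Update the whole position at once for the next iteration.
        pos.prev_hash = prev_hash_of(pos.prev_hash);
    }
    // The traversal ran opposite to chain order; reverse into chain order.
    transitions.reverse();
    transitions
}
```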

Also, unfortunately, this is not exercised by the production tests: when nodes track all shards, the condition `if let Ok(prev_chunk_extra) = chain.get_chunk_extra(prev_block_hash, &shard_uid) { ... }` is triggered, and if the latest chunk extra is already computed we skip the unnecessary validations. So for now I checked correctness by commenting out this path and running the tests locally.

Longarithm linked an issue Nov 19, 2024 that may be closed by this pull request

codecov bot commented Nov 19, 2024

Codecov Report

Attention: Patch coverage is 69.18239% with 49 lines in your changes missing coverage. Please review.

Project coverage is 69.96%. Comparing base (024c47c) to head (ca0c082).
Report is 14 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...chain/src/stateless_validation/chunk_validation.rs | 71.05% | 27 Missing and 17 partials ⚠️ |
| tools/replay-archive/src/cli.rs | 0.00% | 3 Missing ⚠️ |
| ...nt/src/stateless_validation/chunk_validator/mod.rs | 33.33% | 0 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12485      +/-   ##
==========================================
+ Coverage   69.83%   69.96%   +0.12%     
==========================================
  Files         838      838              
  Lines      169447   169164     -283     
  Branches   169447   169164     -283     
==========================================
+ Hits       118338   118353      +15     
+ Misses      45853    45674     -179     
+ Partials     5256     5137     -119     
| Flag | Coverage Δ |
|---|---|
| backward-compatibility | 0.16% <0.00%> (+<0.01%) ⬆️ |
| db-migration | 0.16% <0.00%> (+<0.01%) ⬆️ |
| genesis-check | 1.28% <0.00%> (-0.01%) ⬇️ |
| linux | 69.27% <58.49%> (+0.11%) ⬆️ |
| linux-nightly | 69.50% <69.18%> (+0.08%) ⬆️ |
| macos | 51.08% <41.50%> (+0.07%) ⬆️ |
| pytests | 1.59% <0.00%> (-0.01%) ⬇️ |
| sanity-checks | 1.39% <0.00%> (-0.01%) ⬇️ |
| unittests | 69.78% <69.18%> (+0.12%) ⬆️ |
| upgradability | 0.20% <0.00%> (-0.01%) ⬇️ |

// This usually happens because we are a chunk producer and
// therefore have the chunk extra for the previous block saved on disk.
// We can also skip validating the chunk state witness in this case.
// We don't need to switch to parent shard uid, because resharding
// creates chunk extra for new shard uid.

Contributor:

Just to double check, do the new shards' chunk extra contain the final state roots after applying the state changes on the parent and then splitting?

Member Author:

Yes. For the last block hash in epoch, chunk extra for parent shard id includes results of applying last chunk, and there is also chunk extra for child shard id, which also includes split result on top of that.
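In other words, schematically (placeholder types and functions only, not the real nearcore API):

```rust
// Schematic only: shows how the two chunk extras at the resharding boundary
// relate. The "state roots" here are just descriptive strings.
#[derive(Debug, Clone)]
struct StateRoot(String);

fn apply_last_chunk(pre_state: &StateRoot) -> StateRoot {
    StateRoot(format!("apply_chunk({})", pre_state.0))
}

fn retain_split_shard(parent_post_state: &StateRoot, retain_left: bool) -> StateRoot {
    let side = if retain_left { "left" } else { "right" };
    StateRoot(format!("retain_{side}({})", parent_post_state.0))
}

/// State roots stored in the chunk extras for the last block of the old
/// shard layout: one for the parent shard and one for the child shard.
fn boundary_state_roots(pre_state: &StateRoot, retain_left: bool) -> (StateRoot, StateRoot) {
    // Parent shard's chunk extra holds the post-state root of its last chunk.
    let parent_root = apply_last_chunk(pre_state);
    // Child shard's chunk extra holds the split result on top of that root;
    // this is what the validator recomputes via retain_split_shard.
    let child_root = retain_split_shard(&parent_root, retain_left);
    (parent_root, child_root)
}
```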

/// Transition parameters **after** the last chunk, corresponding to all
/// state transitions from the last new chunk (exclusive) to the parent
/// block (inclusive).
implicit_transition_params: Vec<ImplicitTransitionParams>,
/// Blocks from the last last new chunk (exclusive) to the last new chunk

Contributor:

Is the "last last new chunk" the "previous last new chunk"? Like.. the last new chunk of the previous state witness?

Member Author:

Yes

@Trisfald (Contributor):

If I understood correctly, we must fix the resharding_v3 test loop to work without tracking all shards, and then this code will be tested? Do you plan to address this in another PR?

Also, should we add tests where a wrong split proof is sent and we validate the chunk is not endorsed?

@wacban (Contributor) left a comment:

LGTM
Nice, it all comes together!

Comment on lines 110 to 113
pub enum ImplicitTransitionParams {
ApplyChunk(ApplyChunkBlockContext, ShardContext),
Resharding(AccountId, RetainMode, ShardUId),
}

Contributor:

❤️

nit: Can you add a comment please?

Comment on lines 149 to 150
/// Number of chunks seen during traversal.
chunks_seen: u32,

Contributor:

Is that only for new chunks or any chunks?

Contributor:

I think it's new chunks, considering `position.chunks_seen + if is_new_chunk { 1 } else { 0 }`.
Perhaps num_unique_chunks_seen or something similar might help?

Comment on lines 151 to 152
/// Below, supposed id and index of last chunk before the chunk being
/// validated.

Contributor:

nit: I can't quite parse this sentence, can you put it in different words? The "Below" in particular is confusing.

Comment on lines 238 to 247
last_chunk_shard_id: if position.chunks_seen == 0 {
prev_shard_id
} else {
position.last_chunk_shard_id
},
last_chunk_shard_index: if position.chunks_seen == 0 {
prev_shard_index
} else {
position.last_chunk_shard_index
},

Contributor:

nit: Maybe move this to a separate line?

Comment on lines 285 to 289
let Some(ReshardingEventType::SplitShard(params)) =
ReshardingEventType::from_shard_layout(shard_layout, *prev_hash)?
else {
return Ok(None);
};

Contributor:

mini nit: I think a match statement on the ReshardingEventType would be more future-proof, so that the compiler warns when a new event is not handled here. let-else return is fine for handling Option though.

)));
}

for (implicit_transition_params, transition) in pre_validation_output

Contributor:

Just for my education: what exactly are the lengths of the implicit transition params and of the pre-validation transitions, and why are they always equal?

Member Author:

The ChunkStateWitness is expected to provide all implicit state transitions, but that has to be validated. The chunk validator can derive the number of implicit transitions from chain traversal itself: it is always the number of missing chunks, plus the resharding transition if needed. If the loop completes, it means the implicit transitions were provided correctly.
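A rough sketch of that length check (illustrative names only, not the actual nearcore code):

```rust
// The number of implicit transitions provided in the witness must match the
// number derived locally by chain traversal (missing chunks, plus one
// resharding transition when the shard layout changes).
struct ImplicitTransitionParams; // derived locally during chain traversal
struct ChunkStateTransition;     // provided by the ChunkStateWitness

fn pair_implicit_transitions(
    derived: Vec<ImplicitTransitionParams>,
    provided: Vec<ChunkStateTransition>,
) -> Result<Vec<(ImplicitTransitionParams, ChunkStateTransition)>, String> {
    if derived.len() != provided.len() {
        // Too few or too many implicit transitions: the witness is invalid.
        return Err(format!(
            "expected {} implicit transitions, witness provided {}",
            derived.len(),
            provided.len()
        ));
    }
    // Lengths match, so every derived transition can be validated against the
    // corresponding transition from the witness.
    Ok(derived.into_iter().zip(provided).collect())
}
```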

@wacban (Contributor) commented Nov 20, 2024:

What is your plan for tests for this logic?

@shreyan-gupta (Contributor) left a comment:

Looks great!

old_root,
true,
);
let new_root = trie.retain_split_shard(&boundary_account, retain_mode)?;

Contributor:

🎉

/// currently observed block.
shard_id: ShardId,
/// Hash of the previous block.
prev_hash: CryptoHash,

Contributor:

Are we maintaining both prev_hash and prev_block mainly for readability? prev_hash can easily be obtained from prev_block.hash(), right?

let prev_prev_hash = *position.prev_block.header().prev_hash();
let epoch_id = epoch_manager.get_epoch_id_from_prev_block(&position.prev_hash)?;
let shard_layout = epoch_manager.get_shard_layout(&epoch_id)?;
let shard_uid = ShardUId::from_shard_id_and_layout(position.shard_id, &shard_layout);

Contributor:

We have a nice function epoch_manager.shard_id_to_uid(..) if that seems more legible. Also seems like fetching the shard_layout can be pushed to within check_shard_layout_transition

/// Checks if chunk validation requires a transition to new shard layout in the
/// block with `prev_hash`, with a split resulting in the `shard_uid`, and if
/// so, returns the corresponding resharding transition parameters.
fn check_shard_layout_transition(

Contributor:

Seems like this function is specifically to get the resharding transition, could we rename it to reflect that, say get_resharding_transition?

@@ -70,10 +70,6 @@ pub enum ShardUpdateReason {
/// Information about shard to update.
pub struct ShardContext {
pub shard_uid: ShardUId,
/// Whether node cares about shard in this epoch.
pub cares_about_shard_this_epoch: bool,

Contributor:

Seems like we don't care about the shard this epoch any more, lol! Were this and will_shard_layout_change not being used anywhere?

Member Author:

It was part of resharding v2 logic.

@Longarithm (Member Author):

> If I understood correctly, we must fix the resharding_v3 test loop to work without tracking all shards, and then this code will be tested? Do you plan to address this in another PR?

Yeah. I believe @marcelo-gonzalez is working on single shard tracking.

As for tests, the current plan is that the resharding_v3 test should not fail; I will add unit tests in the next PR.

Longarithm added this pull request to the merge queue Nov 22, 2024
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 23, 2024
Longarithm added this pull request to the merge queue Nov 23, 2024
Merged via the queue into near:master with commit af6559a Nov 23, 2024
26 of 28 checks passed
Longarithm deleted the rv3-verify-proof branch November 23, 2024 09:22
Successfully merging this pull request may close these issues.

[ReshardingV3] State Witness