fix: don't panic when a Client receives ChunkStateWitness with invalid shard_id #10621
Conversation
Initially I tried to put the test in …
Codecov Report

```
@@            Coverage Diff             @@
##           master   #10621      +/-   ##
==========================================
+ Coverage   72.19%   72.20%   +0.01%
==========================================
  Files         726      729       +3
  Lines      147719   148103     +384
  Branches   147719   148103     +384
==========================================
+ Hits       106647   106940     +293
- Misses      36272    36352      +80
- Partials     4800     4811      +11
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
👍
chain/chain/src/chain.rs (Outdated)

```diff
@@ -4183,7 +4183,11 @@ impl Chain {
         shard_id: ShardId,
     ) -> Result<ShardChunkHeader, Error> {
         let prev_shard_id = epoch_manager.get_prev_shard_ids(prev_block.hash(), vec![shard_id])?[0];
```
While you are at it, could you please add a short comment for `get_prev_shard_ids` stating that if the result is Ok, the length of the resulting vector is the same as the length of the argument vector? I was under the impression that this is not the case.
Added a comment.

IMO it always returns a `Vec` of the same length: it maps every shard to its parent shard and then collects the results into a `Vec`, so the length should be the same:
```rust
fn get_prev_shard_ids(
    &self,
    prev_hash: &CryptoHash,
    shard_ids: Vec<ShardId>,
) -> Result<Vec<ShardId>, Error> {
    if self.is_next_block_epoch_start(prev_hash)? {
        let shard_layout = self.get_shard_layout_from_prev_block(prev_hash)?;
        let prev_shard_layout = self.get_shard_layout(&self.get_epoch_id(prev_hash)?)?;
        if prev_shard_layout != shard_layout {
            return Ok(shard_ids
                .into_iter()
                .map(|shard_id| {
                    shard_layout.get_parent_shard_id(shard_id).map(|parent_shard_id| {
                        assert!(
                            prev_shard_layout.shard_ids().any(|i| i == parent_shard_id),
                            "invalid shard layout. parent_shard_id: {}\nshard_layout: {:?}\nprev_shard_layout: {:?}",
                            parent_shard_id,
                            shard_layout,
                            prev_shard_layout
                        );
                        parent_shard_id
                    })
                })
                .collect::<Result<_, ShardLayoutError>>()?);
        }
    }
    Ok(shard_ids)
}
```
```rust
pub fn get_parent_shard_id(&self, shard_id: ShardId) -> Result<ShardId, ShardLayoutError> {
    if !self.shard_ids().any(|id| id == shard_id) {
        return Err(ShardLayoutError::InvalidShardIdError { shard_id });
    }
    let parent_shard_id = match self {
        Self::V0(_) => panic!("shard layout has no parent shard"),
        Self::V1(v1) => match &v1.to_parent_shard_map {
            // we can safely unwrap here because the construction of to_parent_shard_map guarantees
            // that every shard has a parent shard
            Some(to_parent_shard_map) => *to_parent_shard_map.get(shard_id as usize).unwrap(),
            None => panic!("shard_layout has no parent shard"),
        },
    };
    Ok(parent_shard_id)
}
```
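As a minimal standalone sketch (plain Rust, not nearcore code) of the length argument above: `map` transforms elements one-to-one, and collecting into a `Result<Vec<_>, _>` yields either every mapped element or the first error, so an Ok result always has the input's length.

```rust
// Standalone sketch (not nearcore code): `map` is one-to-one, and collecting
// into Result<Vec<_>, _> returns either all mapped elements or the first
// error, so an Ok result always matches the input length.
fn map_to_parents(ids: Vec<u64>) -> Result<Vec<u64>, String> {
    ids.into_iter()
        .map(|id| {
            if id < 100 {
                Ok(id / 2) // stand-in for the parent-shard lookup
            } else {
                Err(format!("invalid shard id {}", id))
            }
        })
        .collect()
}

fn main() {
    let parents = map_to_parents(vec![0, 1, 2, 3]).unwrap();
    assert_eq!(parents.len(), 4); // same length as the input vector
    assert!(map_to_parents(vec![0, 1000]).is_err()); // one bad id fails the whole call
}
```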
But I also replaced the `[0]` with `.first()`, just for peace of mind; there's no reason to risk a panic there.
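Roughly, the pattern looks like this (a sketch, not the exact diff; the `Error::InvalidShardId` variant is an assumption here):

```rust
// Sketch of the change (approximate, not the exact diff): indexing with [0]
// panics on an empty Vec, while .first() lets us surface a proper Error.
// Error::InvalidShardId is assumed for illustration.
let prev_shard_id = *epoch_manager
    .get_prev_shard_ids(prev_block.hash(), vec![shard_id])?
    .first()
    .ok_or(Error::InvalidShardId(shard_id))?;
```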
Although a more proper way to fix it would be to add a function that operates on a single `ShardId`. I think I'll go with that.
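A possible shape for that single-shard helper (a hypothetical sketch; the actual name and signature in the PR may differ):

```rust
// Hypothetical sketch of a single-shard variant: delegate to the batched
// version and surface an error instead of indexing into the Vec.
fn get_prev_shard_id(
    &self,
    prev_hash: &CryptoHash,
    shard_id: ShardId,
) -> Result<ShardId, Error> {
    self.get_prev_shard_ids(prev_hash, vec![shard_id])?
        .into_iter()
        .next()
        .ok_or_else(|| Error::Other("get_prev_shard_ids returned an empty Vec".into()))
}
```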
v2:
When a Client starts processing a new `ChunkStateWitness` it has received, it immediately fetches the previous block and the previous chunk using the `prev_block_hash` and `shard_id` provided by the `ChunkStateWitness`.

It turns out that the current implementation of `Chain::get_prev_chunk_header` will panic if it's given an invalid `shard_id`, which means that a malicious peer could send a `ChunkStateWitness` with a bad `shard_id` and crash the node that received it.

nearcore/chain/client/src/stateless_validation/chunk_validator.rs, lines 596 to 607 in c5c84ad

`Chain::get_prev_chunk_header` is quite deceptive: it returns a `Result`, but still does an `unwrap()` inside. Let's fix the problem by removing the `unwrap()`. From now on, `Chain::get_prev_chunk_header` will return an error when it encounters an invalid `shard_id`.

A test is added to ensure that this bug doesn't happen again.
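A hypothetical shape for such a regression test (not the actual test from the PR; `setup_test_chain` and the exact call signature are assumptions):

```rust
// Hypothetical regression-test sketch (helpers and names assumed, not the
// actual test in this PR): an out-of-range shard_id should yield Err, not a panic.
#[test]
fn get_prev_chunk_header_rejects_invalid_shard_id() {
    let (chain, epoch_manager) = setup_test_chain(); // assumed test helper
    let head = chain.head().unwrap();
    let prev_block = chain.get_block(&head.last_block_hash).unwrap();
    let bad_shard_id: ShardId = 1_000_000; // far beyond any real shard
    let result = Chain::get_prev_chunk_header(epoch_manager.as_ref(), &prev_block, bad_shard_id);
    assert!(result.is_err()); // must return an error instead of panicking
}
```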