-
Notifications
You must be signed in to change notification settings - Fork 658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit ChunkStateWitness size to 16MB #10615
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #10615 +/- ##
==========================================
- Coverage 72.19% 72.17% -0.03%
==========================================
Files 726 726
Lines 147697 147777 +80
Branches 147697 147777 +80
==========================================
+ Hits 106636 106656 +20
- Misses 36264 36321 +57
- Partials 4797 4800 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, we can increase MAX_CHUNK_STATE_WITNESS_INNER_SIZE
later if it turns out to be too small.
vec![], | ||
&EmptyValidatorSigner::default(), | ||
)); | ||
let dummy_partial_state = PartialState::TrieValues(Vec::new()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: PartialState::default()
fn dummy_chunk_state_witness() -> ChunkStateWitness { | ||
let dummy_chunk_header = ShardChunkHeader::V3(ShardChunkHeaderV3::new( | ||
CryptoHash::default(), | ||
CryptoHash::default(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: StateRoot::default()
&EmptyValidatorSigner::default(), | ||
)); | ||
let dummy_partial_state = PartialState::TrieValues(Vec::new()); | ||
ChunkStateWitness { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We have ChunkStateWitness::empty()
, although there is a comment suggesting it might be removed in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I had added it for genesis chunk state witness but we can use it for tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to unblock, please address the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this result in shard becoming unavailable in case when actual state witness size exceeds size limit because all chunk producers for that shard will use the same set of receipts trying to build the chunk?
Right... I see two alternatives to resolve this:
2nd is more proper but 1st looks simpler for now. |
@Longarithm Do you mean stopping production/validation of current chunk if previous chunk's state witness was too large? I think I do not understand the second idea. |
That's a good point... But it should still be safe to enforce the size limit for orphaned state witnesses. Orphaned witnesses are an optimization, so it isn't a problem if some nodes miss them. And even then it would happen very rarely, as most witnesses are smaller than 16MB. Thank you for staying vigilant! I'll close this PR, this requires more work to solve properly. For now I'll just add a size limit for orphans, but leave the rest as is. |
Let's limit the size of orphan witnesses which are kept in OrphanChunkStateWitnessPool to 16MB to limit the maximum amount of memory that this pool can consume. During StatelessNet loadtests by @staffik, the maximum observed ChunkStateWitness size was 8MB, so 16MB should be a safe limit. Witnesses above this size won't be saved. This shouldn't be a problem, if a validator misses one orphaned ChunkStateWitness they just won't vote on it, which is a normal occurence in the protocol. Only size of orphaned witnesses is limited. Witnesses which are processed immediately can still be aribtrarily large. It might not be safe to limit non-orphan witness, see the discussion in near#10615
Let's limit the size of orphan witnesses which are kept in OrphanChunkStateWitnessPool to 16MB to limit the maximum amount of memory that this pool can consume. During StatelessNet loadtests by @staffik, the maximum observed ChunkStateWitness size was 8MB, so 16MB should be a safe limit. Witnesses above this size won't be saved. This shouldn't be a problem, if a validator misses one orphaned ChunkStateWitness they just won't vote on it, which is a normal occurence in the protocol. Only size of orphaned witnesses is limited. Witnesses which are processed immediately can still be aribtrarily large. It might not be safe to limit non-orphan witness, see the discussion in near#10615
### Description This PR adds a pool for orphaned `ChunkStateWitnesses`. To process a `ChunkStateWitness` we need the previous block, but sometimes it isn't available immediately. The node might receive a `ChunkStateWitness` before the block that's required to process it. In such cases the witness becomes an "orphaned chunk state witness" and it's put in `OrphanChunkStateWitnessPool`, where it waits for the desired block to appear. Once a new block is accepted, we fetch all orphaned witnesses that were waiting for this block from the pool and process them. ### Design of `OrphanStateWitnessPool` `OrphanStateWitnessPool` keeps a cache which maps `shard_id` and `height` to an orphan `ChunkStateWitness` with these parameters: ```rust witness_cache: LruCache<(ShardId, BlockHeight), ChunkStateWitness>, ``` All `ChunkStateWitnesses` go through basic validation before being put in the orphan cache. * The signature is checked to make sure that this witness really comes from the right chunk producer that should produce a witness at this height and shard_id. * Client keeps only witnesses which are within 5 blocks of the current chain head to prevent spam attacks. Without this limitation a single malicious chunk producer could fill the whole cache with their fake witnesses. * There's also a limitation on witness size to limit the amount of memory consumed by the pool. During StatelessNet loadtests performed by `@staffik` and `@Longarithm` the observed `ChunkStateWitness` sIze was 16-32MB, so a 40MB limit should be alright. This PR only limits the size of orphaned witnesses, limiting the size of non-orphan witnesses is much more tricky, see the discussion in #10615. It's impossible to fully validate an orphaned witness, but this partial validation should be enough to protect against attacks on the orphan pool. Under normal circumstances there should be only a few orphaned witnesses per shard. If the node has fallen behind by more than a few blocks, it has to catch up and its chunk endorsements don't matter. The default cache capacity is set to 25 witnesses. With 5 shards it provides capacity for 5 orphaned witnesses on each shard, which should be enough. Assuming that a single witness can take up 40 MB, the pool will consume at most 1GB at full capacity. The changes are divided into individual commits, they can be reviewed commit-by-commit. ### Fixes Fixes: #10552 Fixes: near/stakewars-iv#15
This PR limits the size of
ChunkStateWitness
to 16MB. Witnesses larger than 16MB will be considered invalid.It'll help to protect from abuse - huge
ChunkStateWitnesses
would require a lot of resource to process, and they would take up a lot of space in orphan witness pool, so it's good to immediately reject witnesses that are too big.During StatelessNet loadtests by @staffik the maximum observed size of
ChunkStateWitness
was 8MB, so 16MB should be a safe limit.There's a safeguard to make sure that a node doesn't send out a
ChunkStateWitness
which is larger than 16MB. It's good to have a safeguard there, we don't want to get all nodes banned if it turns out that the estimate was wrong and naturally producedChunkStateWitnesses
are sometimes larger than 16MB.Technically we measure the size of
ChunkStateWitnessInner
, notChunkStateWitness
. It was easier to do it this way becausesign_chunk_state_witness
returns the size ofChunkStateWitnessInner
, which is also later used in the metrics.Limiting the size of
ChunkStateWitnessInner
is enough, asChunkStateWitness
is made up ofChunkStateWitnessInner
and a constant-size signature, so its size is also limited.Refs: #10259 (but not fixes, it's a primitive limit that doesn't take into account computation costs, etc)