-
Notifications
You must be signed in to change notification settings - Fork 678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement limits on the size of transactions in ChunkStateWitness #11406
Conversation
I'm not sure about the |
With testing I saw that some of the existing tests already kinda cover the feature. I can think about some dedicated tests, but there might not be a big need for that. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #11406 +/- ##
==========================================
+ Coverage 71.27% 71.36% +0.08%
==========================================
Files 784 784
Lines 157847 158131 +284
Branches 157847 158131 +284
==========================================
+ Hits 112511 112843 +332
+ Misses 40489 40412 -77
- Partials 4847 4876 +29
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
let small_code_len = small_code.len(); | ||
let large_code_len = large_code.len(); | ||
let cost_empty = deploy_contract_cost(ctx, small_code, Some(b"main")); | ||
let cost_4mb = deploy_contract_cost(ctx, large_code, Some(b"main")); | ||
let cost_15mb = deploy_contract_cost(ctx, large_code, Some(b"main")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this function for a test? otherwise not sure why it needs to specifically call out the limits, which are set in params files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
nit: cost_max
also it seems like it's 15 but it's 1.5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this function for a test? otherwise not sure why it needs to specifically call out the limits, which are set in params files.
Ok I'll change it so that it uses the param file instead of a hardcoded value
chain/client/src/client.rs
Outdated
@@ -1041,9 +1053,21 @@ impl Client { | |||
source: StorageDataSource::Db, | |||
state_patch: Default::default(), | |||
}; | |||
let prev_chunk_transactions_size = match prev_chunk_opt { | |||
Some(prev_chunk) => borsh::to_vec(prev_chunk.transactions()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming this serialization will take some time, guard this with the new protocol version?
Quick question about this - isn't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I'd like to have another look later so for now just a few comments
chain/chain/src/chain.rs
Outdated
@@ -4454,6 +4454,38 @@ impl Chain { | |||
}) | |||
.collect() | |||
} | |||
|
|||
/// Find the last existing (not missing) chunk on this shard id. | |||
pub fn get_header_of_last_existing_chunk( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
The chunk header contains field
height_included
that points to the block height where the last existing chunk on this shard is. Not sure if this would simplify or speed up this implementation so sharing just in case you find it useful. There is a caveat that height included is not set when the chunk is produced (since at this point we don't know what block height will include it), it's only set once part of a block. -
The shard_id changes meaning during resharding. No need to do it here but can you add a todo for this? Something like:
TODO(resharding) - handle looking for the last existing chunk
- JFYI there is an interesting interaction with state sync here. Intuitively state sync gets the state as of the last block of the epoch. If there are missing chunks around this epoch boundary this loop could go before that and the node might not have the needed blocks. This issue actually appeared elsewhere and should be fixed already. When syncing the node will get the blocks up until the last existing chunk for every shard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shard_id changes meaning during resharding. No need to do it here but can you add a todo for this?
I used:
shard_id = epoch_manager.get_prev_shard_id(&cur_block_hash, shard_id)?;
Doesn't that take care of reshardings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I missed that line. That should do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh actually I just realized that the last chunk is included in the previous block, so I don't need to iterate at all 🤦. I'll remove this function.
chain/chain/src/runtime/mod.rs
Outdated
let size_limit = transactions_gas_limit | ||
/ (runtime_config.wasm_config.ext_costs.gas_cost(ExtCosts::storage_write_value_byte) | ||
+ runtime_config.wasm_config.ext_costs.gas_cost(ExtCosts::storage_read_value_byte)); | ||
let size_limit: u64 = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can you move this to a helper method? This method is already quite overblown.
chain/chain/src/runtime/mod.rs
Outdated
runtime_config | ||
.max_transactions_size_in_witness | ||
.saturating_sub(chunk.prev_chunk_transactions_size) as u64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty neat!
chain/chain/src/runtime/mod.rs
Outdated
transactions_gas_limit | ||
/ (runtime_config | ||
.wasm_config | ||
.ext_costs | ||
.gas_cost(ExtCosts::storage_write_value_byte) | ||
+ runtime_config | ||
.wasm_config | ||
.ext_costs | ||
.gas_cost(ExtCosts::storage_read_value_byte)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can you split this into multiple lines by moving the factors / addends to helper variables? It's getting a bit too crazy ;)
core/parameters/src/parameter.rs
Outdated
@@ -24,6 +24,10 @@ pub enum Parameter { | |||
StorageProofSizeSoftLimit, | |||
// Hard per-receipt limit of recorded trie storage proof | |||
StorageProofSizeReceiptLimit, | |||
// Maxmium size of transactions contained inside ChunkStateWitness. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo maxmium
core/parameters/src/parameter.rs
Outdated
// Maxmium size of transactions contained inside ChunkStateWitness. | ||
MaxTransactionsSizeInWitness, | ||
// Soft size limit of new transactions storage proof inside ChunkStateWitness. | ||
NewTransactionsValidationStateSizeSoftLimit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto inconsistent names. Looking at the other parameters the convention is to use Limit
or SoftLimit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's StorageProofSizeSoftLimit
, so NewTransactionsValidationStateSizeSoftLimit
is consistent with that 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I meant the max
one, sorry, wrong line.
let tx_size = if checked_feature!("stable", WitnessTransactionLimits, PROTOCOL_VERSION) { | ||
mb / 2 | ||
} else { | ||
3 * mb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's funny, wasn't the old limit 4mb?
@@ -719,13 +719,13 @@ fn pure_deploy_bytes(ctx: &mut EstimatorContext) -> GasCost { | |||
let config_store = RuntimeConfigStore::new(None); | |||
let vm_config = config_store.get_config(PROTOCOL_VERSION).wasm_config.clone(); | |||
let small_code = generate_data_only_contract(0, &vm_config); | |||
let large_code = generate_data_only_contract(bytesize::mb(4u64) as usize, &vm_config); | |||
let large_code = generate_data_only_contract(bytesize::kb(1500u64) as usize, &vm_config); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1500 or 1024 * 1.5?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transaction can't be bigger than 1.5MiB, so the code size is set to 1.5MB to leave some space for other transaction data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay.
let small_code_len = small_code.len(); | ||
let large_code_len = large_code.len(); | ||
let cost_empty = deploy_contract_cost(ctx, small_code, Some(b"main")); | ||
let cost_4mb = deploy_contract_cost(ctx, large_code, Some(b"main")); | ||
let cost_15mb = deploy_contract_cost(ctx, large_code, Some(b"main")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
nit: cost_max
also it seems like it's 15 but it's 1.5
My understanding is that with stateless validation the validators don't receive the It would be great to get input from @staffik who implemented transaction validation (#10414), but he's OOO until next week :c. Maybe @pugachAG can chime in instead. |
State witness doesn't include the whole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a few mini nits
rejected_due_to_congestion += 1; | ||
continue; | ||
} | ||
if checked_feature!("stable", WitnessTransactionLimits, protocol_version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mini nit: There is a new alternative for this, I don't remember the exact syntax but it's something like this: ProtocolFeature::WitnessTransactionLimits.enabled(protocol_version). Up to you as the convention is what you've used.
last_chunk_transactions_size: usize, | ||
transactions_gas_limit: Gas, | ||
) -> u64 { | ||
if checked_feature!("stable", WitnessTransactionLimits, protocol_version) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You used an if - else here. Should if be a minimum of the two if the new feature is enabled?
@@ -898,8 +898,17 @@ impl Client { | |||
.get_chunk_extra(&prev_block_hash, &shard_uid) | |||
.map_err(|err| Error::ChunkProducer(format!("No chunk extra available: {}", err)))?; | |||
|
|||
let prev_shard_id = self.epoch_manager.get_prev_shard_id(prev_block.hash(), shard_id)?; | |||
let last_chunk_header = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please forgive me. Now that you changed it to be the chunk from the prev block I would go back to naming it prev_chunk_header. Sorry :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh the PR has been already merged, so I'm gonna leave it as is. If some names bother you, you can make a PR to change them ;)
@@ -1027,11 +1036,11 @@ impl Client { | |||
&mut self, | |||
shard_uid: ShardUId, | |||
prev_block: &Block, | |||
last_chunk: &ShardChunk, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto prev_chunk
let epoch_id = self.epoch_manager.get_epoch_id_from_prev_block(&prev_block.hash())?; | ||
let protocol_version = self.epoch_manager.get_epoch_protocol_version(&epoch_id)?; | ||
let last_chunk_transactions_size = | ||
if checked_feature!("stable", WitnessTransactionLimits, protocol_version) { | ||
borsh::to_vec(last_chunk.transactions()) | ||
.map_err(|e| { | ||
Error::ChunkProducer(format!("Failed to serialize transactions: {e}")) | ||
})? | ||
.len() | ||
} else { | ||
0 | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mini nit: Consider putting that in a helper method.
/// Configuration specific to ChunkStateWitness. | ||
pub witness_config: WitnessConfig, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
|
||
/// Configuration specific to ChunkStateWitness. | ||
#[derive(Debug, Copy, Clone, PartialEq)] | ||
pub struct WitnessConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mini nit: Maybe StateWitnessConfig? Best to just stick to whatever convention we have, I think StateWitness is more popular in struct names.
/// Maximum size of transactions contained inside ChunkStateWitness. | ||
/// A witness contains transactions from both the previous chunk and the current one. | ||
/// This parameter limits the sum of sizes of transactions from both of those chunks. | ||
pub combined_transactions_size_limit: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea, it's very nice. I'm curious if we're ever going to observe some oscillating behaviour close to congestion where every other chunk has 1.5MB and the rest 0.5MB. Nothing wrong with that as far as I can tell though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it could happen with a lot of large transactions, but IMO this approach should be good enough. The chunk producers are assigned at random, so it's not like someone can use this mechanism for anything malicious.
@@ -610,6 +610,31 @@ impl From<ExtCostsConfigView> for crate::ExtCostsConfig { | |||
} | |||
} | |||
|
|||
/// Configuration specific to ChunkStateWitness. | |||
#[derive(Debug, serde::Serialize, serde::Deserialize, Clone, Hash, PartialEq, Eq)] | |||
pub struct WitnessConfigView { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with the runtime configs views. Can you briefly tell me what are they for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU they are responsible for converting the RuntimeConfig
to the json representation that's in the snapshots. After I added WitnessConfigView
to RuntimeConfigView
it started appearing in the snapshots.
/// Size limits for transactions included in a ChunkStateWitness. | ||
WitnessTransactionLimits, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mini nit: StateWitnessTransactionLimits
Fixes: #11103
This PR adds 3 new limitations to control total size of transactions included in ChunkStateWitness
max_transaction_size
from 4MiB to 1.5MiB. Transactions larger than 1.5MiB will be rejected.ChunkStateWitness
contains transactions from both the current and the previous chunk, so we have to limit the sum of transactions from both of those chunks.In total that limits the size transaction related fields to 2.5MiB.
About 1):
Having 4MiB transactions is troublesome, because it means that we have to allow at least 4MiB of transactions to be included in
ChunkStateWitness
. Having so much space taken up by the transactions could cause problems with witness size. See #11379 for more information.About 2):
ChunkStateWitness
contains both transactions from the previous chunk (transactions
) and the new chunk (new_transactions
). This is annoying because it halves the space that we can reserve for transactions. To make sure that the size stays reasonable we limit the sum of both those fields to 2MiB. On current mainnet traffic the sum of these fields stays under 400kiB, so 2MiB should be more than enough. This limit has to be slightly higher than the limit for a single transaction, so we can't make it 1MiB, it has to be at least 1.5MiB.About 3):
On mainnet traffic the size of transactions storage proof is under 500kiB on most chunks, so adding this limit shouldn't affect the throughput. I assume that every transactions generates a limited amount of storage proof during validation, so we can have a soft limit for the total size of storage proof. Implementing a hard limit would be difficult because it's hard to predict how much storage proof will be generated by validating a transaction.
Transactions are validated by running
prepare_transactions
on the validator, so there's no need for separate validation code.