
Implement limits on the size of transactions in ChunkStateWitness #11406

Merged
merged 21 commits into near:master on May 30, 2024

Conversation

jancionear
Contributor

@jancionear jancionear commented May 28, 2024

Fixes: #11103

This PR adds three new limits to control the total size of transactions included in ChunkStateWitness:

  1. Reduce max_transaction_size from 4MiB to 1.5MiB. Transactions larger than 1.5MiB will be rejected.
  2. Limit the total size of transactions included in the last two chunks to 2MiB. ChunkStateWitness contains transactions from both the current and the previous chunk, so we have to limit the sum of transaction sizes from both of those chunks.
  3. Limit the size of the storage proof generated during transaction validation to 500KiB (soft limit).

In total, that limits the size of the transaction-related fields to 2.5MiB.
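
For reference, a minimal sketch of the three limits as configuration values. The field names follow the parameters mentioned in this PR, but the struct itself is illustrative and does not mirror the actual nearcore config layout.

```rust
/// Illustrative sketch only: the three limits described above, using the
/// parameter names mentioned in this PR. The struct layout is hypothetical.
pub struct WitnessTransactionLimits {
    /// 1) A single transaction may not exceed this size (1.5 MiB).
    pub max_transaction_size: u64,
    /// 2) Combined size of transactions from the previous and the current chunk (2 MiB).
    pub max_transactions_size_in_witness: u64,
    /// 3) Soft limit on storage proof recorded while validating new transactions (500 KiB).
    pub new_transactions_validation_state_size_soft_limit: u64,
}

impl Default for WitnessTransactionLimits {
    fn default() -> Self {
        Self {
            max_transaction_size: 1536 * 1024,                              // 1.5 MiB
            max_transactions_size_in_witness: 2 * 1024 * 1024,              // 2 MiB
            new_transactions_validation_state_size_soft_limit: 500 * 1024,  // 500 KiB
        }
    }
}
```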

About 1):
Having 4MiB transactions is troublesome, because it means that we have to allow at least 4MiB of transactions to be included in ChunkStateWitness. Having so much space taken up by the transactions could cause problems with witness size. See #11379 for more information.

About 2):
ChunkStateWitness contains both the transactions from the previous chunk (transactions) and those from the new chunk (new_transactions). This is annoying because it halves the space that we can reserve for transactions. To make sure that the size stays reasonable, we limit the sum of both of those fields to 2MiB. On current mainnet traffic the sum of these fields stays under 400KiB, so 2MiB should be more than enough. This limit has to be slightly higher than the limit for a single transaction, so we can't make it 1MiB; it has to be at least 1.5MiB.
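
A minimal sketch of how the budget for new transactions can be derived from the combined limit; the function itself is illustrative, the parameter names follow those used in this PR.

```rust
/// Illustrative sketch: the current chunk may use whatever part of the combined
/// 2 MiB budget was not already consumed by the previous chunk's transactions.
fn new_transactions_size_budget(
    max_transactions_size_in_witness: u64,
    prev_chunk_transactions_size: u64,
) -> u64 {
    max_transactions_size_in_witness.saturating_sub(prev_chunk_transactions_size)
}
```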

About 3):
On mainnet traffic the size of the transactions storage proof is under 500KiB for most chunks, so adding this limit shouldn't affect throughput. I assume that every transaction generates a limited amount of storage proof during validation, so we can have a soft limit for the total size of the storage proof. Implementing a hard limit would be difficult because it's hard to predict how much storage proof will be generated by validating a transaction.
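
A sketch of what a soft limit means in practice: the recorded storage proof size is only checked between transactions, so one transaction can push the total slightly over the limit, but the overshoot stays bounded because a single transaction generates a limited amount of storage proof. All names below are illustrative, not the actual nearcore API.

```rust
/// Illustrative soft-limit loop (not the actual prepare_transactions code).
fn select_transactions<T>(
    candidates: Vec<T>,
    mut validate_and_record: impl FnMut(&T) -> bool, // true if the tx is valid
    mut recorded_proof_size: impl FnMut() -> u64,    // proof recorded so far
    soft_limit: u64,
) -> Vec<T> {
    let mut selected = Vec::new();
    for tx in candidates {
        // Soft limit: stop before validating the next transaction once the limit
        // is exceeded; the last accepted transaction may push us slightly over it.
        if recorded_proof_size() > soft_limit {
            break;
        }
        if validate_and_record(&tx) {
            selected.push(tx);
        }
    }
    selected
}
```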

Transactions are validated by running prepare_transactions on the validator, so there's no need for separate validation code.

@jancionear jancionear added the A-stateless-validation Area: stateless validation label May 28, 2024
@jancionear jancionear requested review from wacban and staffik May 28, 2024 11:32
@jancionear jancionear requested a review from a team as a code owner May 28, 2024 11:32
@jancionear jancionear requested a review from bowenwang1996 May 28, 2024 11:35
@jancionear
Contributor Author

I'm not sure about the snapshot files. I added max_transactions_size_in_witness and new_transactions_validation_state_size_soft_limit to the runtime config, but should I also add them to the snapshots? The tests aren't complaining, so I wonder whether I have to do it.
I saw that @shreyan-gupta added storage_proof_size_soft_limit to the snapshots, but it's in the transaction_costs section. That sounds weird, should I put my parameters there as well?

@jancionear
Contributor Author

While testing I saw that some of the existing tests already kinda cover the feature.
When I lowered the max_transaction_size to 1.5MiB a bunch of tests started failing because they tried to submit 4MiB transactions.
Adding max_transactions_size_in_witness caused the benchmark_large_chunk_production_time test to fail because the produced chunk was 3MiB instead of the expected 35MiB.

I can think about some dedicated tests, but there might not be a big need for that.


codecov bot commented May 28, 2024

Codecov Report

Attention: Patch coverage is 82.42424% with 29 lines in your changes missing coverage. Please review.

Project coverage is 71.36%. Comparing base (1ade93b) to head (c5579f9).
Report is 10 commits behind head on master.

Files Patch % Lines
chain/chain/src/runtime/mod.rs 88.29% 6 Missing and 5 partials ⚠️
chain/client/src/client.rs 52.38% 7 Missing and 3 partials ⚠️
core/parameters/src/parameter_table.rs 50.00% 0 Missing and 3 partials ⚠️
...client/src/stateless_validation/shadow_validate.rs 0.00% 2 Missing ⚠️
...nt/src/stateless_validation/chunk_validator/mod.rs 87.50% 0 Missing and 1 partial ⚠️
core/parameters/src/view.rs 90.00% 1 Missing ⚠️
...me-params-estimator/src/costs_to_runtime_config.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11406      +/-   ##
==========================================
+ Coverage   71.27%   71.36%   +0.08%     
==========================================
  Files         784      784              
  Lines      157847   158131     +284     
  Branches   157847   158131     +284     
==========================================
+ Hits       112511   112843     +332     
+ Misses      40489    40412      -77     
- Partials     4847     4876      +29     
Flag Coverage Δ
backward-compatibility 0.24% <0.00%> (-0.01%) ⬇️
db-migration 0.24% <0.00%> (-0.01%) ⬇️
genesis-check 1.38% <0.00%> (-0.01%) ⬇️
integration-tests 37.48% <67.87%> (+0.20%) ⬆️
linux 68.87% <60.00%> (+0.07%) ⬆️
linux-nightly 70.78% <81.21%> (+0.05%) ⬆️
macos 52.39% <56.66%> (+0.01%) ⬆️
pytests 1.59% <0.00%> (-0.01%) ⬇️
sanity-checks 1.39% <0.00%> (-0.01%) ⬇️
unittests 65.72% <74.54%> (+0.01%) ⬆️
upgradability 0.28% <0.00%> (-0.01%) ⬇️


@shreyan-gupta shreyan-gupta self-requested a review May 28, 2024 16:33
let small_code_len = small_code.len();
let large_code_len = large_code.len();
let cost_empty = deploy_contract_cost(ctx, small_code, Some(b"main"));
let cost_4mb = deploy_contract_cost(ctx, large_code, Some(b"main"));
let cost_15mb = deploy_contract_cost(ctx, large_code, Some(b"main"));
Contributor

is this function for a test? otherwise not sure why it needs to specifically call out the limits, which are set in params files.

Contributor

+1
nit: cost_max
also it seems like it's 15 but it's 1.5

Contributor Author

is this function for a test? otherwise not sure why it needs to specifically call out the limits, which are set in params files.

Ok I'll change it so that it uses the param file instead of a hardcoded value

@@ -1041,9 +1053,21 @@ impl Client {
source: StorageDataSource::Db,
state_patch: Default::default(),
};
let prev_chunk_transactions_size = match prev_chunk_opt {
Some(prev_chunk) => borsh::to_vec(prev_chunk.transactions())
Contributor

assuming this serialization will take some time, guard this with the new protocol version?

@wacban
Contributor

wacban commented May 29, 2024

ChunkStateWitness contains both transactions from the previous chunk (transactions) and the new chunk (new_transactions).

Quick question about this - isn't new_transactions duplicated in the ChunkStateWitness and in the ShardChunk itself? Do we need it in both places?

Contributor

@wacban wacban left a comment

LGTM but I'd like to have another look later so for now just a few comments

@@ -4454,6 +4454,38 @@ impl Chain {
})
.collect()
}

/// Find the last existing (not missing) chunk on this shard id.
pub fn get_header_of_last_existing_chunk(
Contributor

  1. The chunk header contains a field height_included that points to the block height where the last existing chunk on this shard is. Not sure if this would simplify or speed up this implementation, so sharing just in case you find it useful. There is a caveat that height_included is not set when the chunk is produced (since at this point we don't know which block height will include it); it's only set once the chunk is part of a block.

  2. The shard_id changes meaning during resharding. No need to do it here but can you add a todo for this? Something like:

TODO(resharding) - handle looking for the last existing chunk

  3. JFYI there is an interesting interaction with state sync here. Intuitively state sync gets the state as of the last block of the epoch. If there are missing chunks around this epoch boundary this loop could go before that and the node might not have the needed blocks. This issue actually appeared elsewhere and should be fixed already. When syncing, the node will get the blocks up until the last existing chunk for every shard.

Contributor Author

The shard_id changes meaning during resharding. No need to do it here but can you add a todo for this?

I used:

shard_id = epoch_manager.get_prev_shard_id(&cur_block_hash, shard_id)?;

Doesn't that take care of reshardings?

Contributor

Ah sorry I missed that line. That should do it.

Contributor Author

Oh actually I just realized that the last chunk is included in the previous block, so I don't need to iterate at all 🤦. I'll remove this function.

let size_limit = transactions_gas_limit
/ (runtime_config.wasm_config.ext_costs.gas_cost(ExtCosts::storage_write_value_byte)
+ runtime_config.wasm_config.ext_costs.gas_cost(ExtCosts::storage_read_value_byte));
let size_limit: u64 =
Contributor

nit: Can you move this to a helper method? This method is already quite overblown.
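
A possible shape for such a helper, reusing the expression from the diff above; the surrounding types (RuntimeConfig, Gas, ExtCosts) are nearcore's, but the function itself is hypothetical.

```rust
/// Hypothetical helper wrapping the inline expression from the diff: convert the
/// remaining gas budget into a byte budget, charging each byte as one storage
/// write plus one storage read.
fn transactions_size_limit_from_gas(
    runtime_config: &RuntimeConfig,
    transactions_gas_limit: Gas,
) -> u64 {
    let ext_costs = &runtime_config.wasm_config.ext_costs;
    let per_byte_cost = ext_costs.gas_cost(ExtCosts::storage_write_value_byte)
        + ext_costs.gas_cost(ExtCosts::storage_read_value_byte);
    transactions_gas_limit / per_byte_cost
}
```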

Comment on lines 784 to 786
runtime_config
.max_transactions_size_in_witness
.saturating_sub(chunk.prev_chunk_transactions_size) as u64
Contributor

This is pretty neat!

Comment on lines 795 to 803
transactions_gas_limit
/ (runtime_config
.wasm_config
.ext_costs
.gas_cost(ExtCosts::storage_write_value_byte)
+ runtime_config
.wasm_config
.ext_costs
.gas_cost(ExtCosts::storage_read_value_byte))
Contributor

nit: Can you split this into multiple lines by moving the factors / addends to helper variables? It's getting a bit too crazy ;)

chain/chain/src/runtime/mod.rs (resolved comment thread)
@@ -24,6 +24,10 @@ pub enum Parameter {
StorageProofSizeSoftLimit,
// Hard per-receipt limit of recorded trie storage proof
StorageProofSizeReceiptLimit,
// Maxmium size of transactions contained inside ChunkStateWitness.
Contributor

nit: typo maxmium

Comment on lines 27 to 30
// Maxmium size of transactions contained inside ChunkStateWitness.
MaxTransactionsSizeInWitness,
// Soft size limit of new transactions storage proof inside ChunkStateWitness.
NewTransactionsValidationStateSizeSoftLimit,
Contributor

Ditto inconsistent names. Looking at the other parameters the convention is to use Limit or SoftLimit.

Contributor Author

There's StorageProofSizeSoftLimit, so NewTransactionsValidationStateSizeSoftLimit is consistent with that 👀

Contributor

Yeah, I meant the max one, sorry, wrong line.

let tx_size = if checked_feature!("stable", WitnessTransactionLimits, PROTOCOL_VERSION) {
mb / 2
} else {
3 * mb
Contributor

That's funny, wasn't the old limit 4mb?

@@ -719,13 +719,13 @@ fn pure_deploy_bytes(ctx: &mut EstimatorContext) -> GasCost {
let config_store = RuntimeConfigStore::new(None);
let vm_config = config_store.get_config(PROTOCOL_VERSION).wasm_config.clone();
let small_code = generate_data_only_contract(0, &vm_config);
let large_code = generate_data_only_contract(bytesize::mb(4u64) as usize, &vm_config);
let large_code = generate_data_only_contract(bytesize::kb(1500u64) as usize, &vm_config);
Contributor

1500 or 1024 * 1.5?

Contributor Author

The transaction can't be bigger than 1.5MiB, so the code size is set to 1.5MB to leave some space for other transaction data.

Contributor

Ah okay.


@jancionear
Contributor Author

Quick question about this - isn't new_transactions duplicated in the ChunkStateWitness and in the ShardChunk itself? Do we need it in both places?

My understanding is that with stateless validation the validators don't receive the ShardChunks anymore, they should only be sent to nodes that track the shard. Validators receive ChunkStateWitness which is ShardChunk + validation information.

It would be great to get input from @staffik who implemented transaction validation (#10414), but he's OOO until next week :c. Maybe @pugachAG can chime in instead.

@jancionear jancionear requested a review from wacban May 29, 2024 19:52
@pugachAG
Contributor

@wacban @jancionear

Quick question about this - isn't new_transactions duplicated in the ChunkStateWitness and in the ShardChunk itself? Do we need it in both places?

State witness doesn't include the whole ShardChunk, only the header. So yeah, both ChunkStateWitness and ShardChunk contain new transactions, but as Jan pointed out stateless validators are not expected to have ShardChunk, so we do need it in both structs.

Contributor

@wacban wacban left a comment

LGTM, just a few mini nits

rejected_due_to_congestion += 1;
continue;
}
if checked_feature!("stable", WitnessTransactionLimits, protocol_version)
Contributor

mini nit: There is a new alternative for this, I don't remember the exact syntax but it's something like this: ProtocolFeature::WitnessTransactionLimits.enabled(protocol_version). Up to you as the convention is what you've used.

last_chunk_transactions_size: usize,
transactions_gas_limit: Gas,
) -> u64 {
if checked_feature!("stable", WitnessTransactionLimits, protocol_version) {
Contributor

You used an if-else here. Should it be a minimum of the two if the new feature is enabled?
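
A sketch of the suggestion; both input names are hypothetical, not identifiers from the diff.

```rust
/// Illustrative only: take the stricter of the gas-derived byte budget and the
/// remaining witness byte budget once the feature is enabled.
fn effective_size_limit(
    feature_enabled: bool,
    gas_derived_size_limit: u64,
    remaining_witness_size_budget: u64,
) -> u64 {
    if feature_enabled {
        gas_derived_size_limit.min(remaining_witness_size_budget)
    } else {
        gas_derived_size_limit
    }
}
```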

@@ -898,8 +898,17 @@ impl Client {
.get_chunk_extra(&prev_block_hash, &shard_uid)
.map_err(|err| Error::ChunkProducer(format!("No chunk extra available: {}", err)))?;

let prev_shard_id = self.epoch_manager.get_prev_shard_id(prev_block.hash(), shard_id)?;
let last_chunk_header =
Contributor

Please forgive me. Now that you changed it to be the chunk from the prev block I would go back to naming it prev_chunk_header. Sorry :)

Contributor Author

Eh the PR has been already merged, so I'm gonna leave it as is. If some names bother you, you can make a PR to change them ;)

@@ -1027,11 +1036,11 @@ impl Client {
&mut self,
shard_uid: ShardUId,
prev_block: &Block,
last_chunk: &ShardChunk,
Contributor

ditto prev_chunk

Comment on lines +1053 to +1064
let epoch_id = self.epoch_manager.get_epoch_id_from_prev_block(&prev_block.hash())?;
let protocol_version = self.epoch_manager.get_epoch_protocol_version(&epoch_id)?;
let last_chunk_transactions_size =
if checked_feature!("stable", WitnessTransactionLimits, protocol_version) {
borsh::to_vec(last_chunk.transactions())
.map_err(|e| {
Error::ChunkProducer(format!("Failed to serialize transactions: {e}"))
})?
.len()
} else {
0
};
Contributor

mini nit: Consider putting that in a helper method.
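
One possible shape for that helper, reusing the code from the diff above; the surrounding types are nearcore's, but the function itself is hypothetical.

```rust
/// Hypothetical helper: serialized size of the previous chunk's transactions,
/// or 0 when the WitnessTransactionLimits feature is not yet enabled.
fn last_chunk_transactions_size(
    chunk: &ShardChunk,
    protocol_version: ProtocolVersion,
) -> Result<usize, Error> {
    if !checked_feature!("stable", WitnessTransactionLimits, protocol_version) {
        return Ok(0);
    }
    let bytes = borsh::to_vec(chunk.transactions())
        .map_err(|e| Error::ChunkProducer(format!("Failed to serialize transactions: {e}")))?;
    Ok(bytes.len())
}
```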

Comment on lines +32 to +33
/// Configuration specific to ChunkStateWitness.
pub witness_config: WitnessConfig,
Contributor

❤️


/// Configuration specific to ChunkStateWitness.
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct WitnessConfig {
Contributor

mini nit: Maybe StateWitnessConfig? Best to just stick to whatever convention we have, I think StateWitness is more popular in struct names.

Comment on lines +218 to +221
/// Maximum size of transactions contained inside ChunkStateWitness.
/// A witness contains transactions from both the previous chunk and the current one.
/// This parameter limits the sum of sizes of transactions from both of those chunks.
pub combined_transactions_size_limit: usize,
Contributor

I like this idea, it's very nice. I'm curious if we're ever going to observe some oscillating behaviour close to congestion where every other chunk has 1.5MB and the rest 0.5MB. Nothing wrong with that as far as I can tell though.

Contributor Author

Yeah it could happen with a lot of large transactions, but IMO this approach should be good enough. The chunk producers are assigned at random, so it's not like someone can use this mechanism for anything malicious.

@@ -610,6 +610,31 @@ impl From<ExtCostsConfigView> for crate::ExtCostsConfig {
}
}

/// Configuration specific to ChunkStateWitness.
#[derive(Debug, serde::Serialize, serde::Deserialize, Clone, Hash, PartialEq, Eq)]
pub struct WitnessConfigView {
Contributor

I'm not familiar with the runtime configs views. Can you briefly tell me what are they for?

Contributor Author

AFAIU they are responsible for converting the RuntimeConfig to the json representation that's in the snapshots. After I added WitnessConfigView to RuntimeConfigView it started appearing in the snapshots.

Comment on lines +162 to +163
/// Size limits for transactions included in a ChunkStateWitness.
WitnessTransactionLimits,
Contributor

mini nit: StateWitnessTransactionLimits

@bowenwang1996 bowenwang1996 added this pull request to the merge queue May 30, 2024
Merged via the queue into near:master with commit 8d3edac May 30, 2024
29 checks passed