
Implement limits on the size of transactions in ChunkStateWitness #11406

Merged
merged 21 commits into near:master on May 30, 2024

Conversation

jancionear
Contributor

@jancionear jancionear commented May 28, 2024

Fixes: #11103

This PR adds three new limits to control the total size of transactions included in ChunkStateWitness:

  1. Reduce max_transaction_size from 4MiB to 1.5MiB. Transactions larger than 1.5MiB will be rejected.
  2. Limit the total size of transactions included in the last two chunks to 2MiB. ChunkStateWitness contains transactions from both the current and the previous chunk, so we have to limit the sum of transaction sizes from both of those chunks.
  3. Limit the size of the storage proof generated during transaction validation to 500KiB (soft limit).

In total, that limits the size of the transaction-related fields to 2.5MiB.
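
For reference, a minimal sketch of the three limits as configuration values. The field names follow the parameters mentioned in this PR, but the struct itself is illustrative and does not mirror the actual nearcore config layout.

```rust
/// Illustrative sketch only: the three limits described above, using the
/// parameter names mentioned in this PR. The struct layout is hypothetical.
pub struct WitnessTransactionLimits {
    /// 1) A single transaction may not exceed this size (1.5 MiB).
    pub max_transaction_size: u64,
    /// 2) Combined size of transactions from the previous and the current chunk (2 MiB).
    pub max_transactions_size_in_witness: u64,
    /// 3) Soft limit on storage proof recorded while validating new transactions (500 KiB).
    pub new_transactions_validation_state_size_soft_limit: u64,
}

impl Default for WitnessTransactionLimits {
    fn default() -> Self {
        Self {
            max_transaction_size: 1536 * 1024,                              // 1.5 MiB
            max_transactions_size_in_witness: 2 * 1024 * 1024,              // 2 MiB
            new_transactions_validation_state_size_soft_limit: 500 * 1024,  // 500 KiB
        }
    }
}
```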

About 1):
Having 4MiB transactions is troublesome, because it means that we have to allow at least 4MiB of transactions to be included in ChunkStateWitness. Having so much space taken up by the transactions could cause problems with witness size. See #11379 for more information.

About 2):
ChunkStateWitness contains both the transactions from the previous chunk (transactions) and those from the new chunk (new_transactions). This is annoying because it halves the space that we can reserve for transactions. To make sure that the size stays reasonable, we limit the sum of both of those fields to 2MiB. On current mainnet traffic the sum of these fields stays under 400KiB, so 2MiB should be more than enough. This limit has to be slightly higher than the limit for a single transaction, so we can't make it 1MiB; it has to be at least 1.5MiB.
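
A minimal sketch of how the budget for new transactions can be derived from the combined limit; the function itself is illustrative, the parameter names follow those used in this PR.

```rust
/// Illustrative sketch: the current chunk may use whatever part of the combined
/// 2 MiB budget was not already consumed by the previous chunk's transactions.
fn new_transactions_size_budget(
    max_transactions_size_in_witness: u64,
    prev_chunk_transactions_size: u64,
) -> u64 {
    max_transactions_size_in_witness.saturating_sub(prev_chunk_transactions_size)
}
```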

About 3):
On mainnet traffic the size of the transactions storage proof is under 500KiB for most chunks, so adding this limit shouldn't affect throughput. I assume that every transaction generates a limited amount of storage proof during validation, so we can have a soft limit for the total size of the storage proof. Implementing a hard limit would be difficult because it's hard to predict how much storage proof will be generated by validating a transaction.
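
A sketch of what a soft limit means in practice: the recorded storage proof size is only checked between transactions, so one transaction can push the total slightly over the limit, but the overshoot stays bounded because a single transaction generates a limited amount of storage proof. All names below are illustrative, not the actual nearcore API.

```rust
/// Illustrative soft-limit loop (not the actual prepare_transactions code).
fn select_transactions<T>(
    candidates: Vec<T>,
    mut validate_and_record: impl FnMut(&T) -> bool, // true if the tx is valid
    mut recorded_proof_size: impl FnMut() -> u64,    // proof recorded so far
    soft_limit: u64,
) -> Vec<T> {
    let mut selected = Vec::new();
    for tx in candidates {
        // Soft limit: stop before validating the next transaction once the limit
        // is exceeded; the last accepted transaction may push us slightly over it.
        if recorded_proof_size() > soft_limit {
            break;
        }
        if validate_and_record(&tx) {
            selected.push(tx);
        }
    }
    selected
}
```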

Transactions are validated by running prepare_transactions on the validator, so there's no need for separate validation code.

@jancionear jancionear added the A-stateless-validation Area: stateless validation label May 28, 2024
@jancionear jancionear requested review from wacban and staffik May 28, 2024 11:32
@jancionear jancionear requested a review from a team as a code owner May 28, 2024 11:32
@jancionear jancionear requested a review from bowenwang1996 May 28, 2024 11:35
@jancionear
Contributor Author

I'm not sure about the snapshot files. I added max_transactions_size_in_witness and new_transactions_validation_state_size_soft_limit to the runtime config, but should I also add them to the snapshots? The tests aren't complaining, so I wonder whether I have to do it.
I saw that @shreyan-gupta added storage_proof_size_soft_limit to the snapshots, but it's in the transaction_costs section. That sounds weird, should I put my parameters there as well?

@jancionear
Contributor Author

While testing I saw that some of the existing tests already kinda cover the feature.
When I lowered the max_transaction_size to 1.5MiB a bunch of tests started failing because they tried to submit 4MiB transactions.
Adding max_transactions_size_in_witness caused the benchmark_large_chunk_production_time test to fail because the produced chunk was 3MiB instead of the expected 35MiB.

I can think about some dedicated tests, but there might not be a big need for that.


codecov bot commented May 28, 2024

Codecov Report

Attention: Patch coverage is 82.42424% with 29 lines in your changes missing coverage. Please review.

Project coverage is 71.36%. Comparing base (1ade93b) to head (c5579f9).
Report is 10 commits behind head on master.

Files Patch % Lines
chain/chain/src/runtime/mod.rs 88.29% 6 Missing and 5 partials ⚠️
chain/client/src/client.rs 52.38% 7 Missing and 3 partials ⚠️
core/parameters/src/parameter_table.rs 50.00% 0 Missing and 3 partials ⚠️
...client/src/stateless_validation/shadow_validate.rs 0.00% 2 Missing ⚠️
...nt/src/stateless_validation/chunk_validator/mod.rs 87.50% 0 Missing and 1 partial ⚠️
core/parameters/src/view.rs 90.00% 1 Missing ⚠️
...me-params-estimator/src/costs_to_runtime_config.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11406      +/-   ##
==========================================
+ Coverage   71.27%   71.36%   +0.08%     
==========================================
  Files         784      784              
  Lines      157847   158131     +284     
  Branches   157847   158131     +284     
==========================================
+ Hits       112511   112843     +332     
+ Misses      40489    40412      -77     
- Partials     4847     4876      +29     
Flag Coverage Δ
backward-compatibility 0.24% <0.00%> (-0.01%) ⬇️
db-migration 0.24% <0.00%> (-0.01%) ⬇️
genesis-check 1.38% <0.00%> (-0.01%) ⬇️
integration-tests 37.48% <67.87%> (+0.20%) ⬆️
linux 68.87% <60.00%> (+0.07%) ⬆️
linux-nightly 70.78% <81.21%> (+0.05%) ⬆️
macos 52.39% <56.66%> (+0.01%) ⬆️
pytests 1.59% <0.00%> (-0.01%) ⬇️
sanity-checks 1.39% <0.00%> (-0.01%) ⬇️
unittests 65.72% <74.54%> (+0.01%) ⬆️
upgradability 0.28% <0.00%> (-0.01%) ⬇️


@shreyan-gupta shreyan-gupta self-requested a review May 28, 2024 16:33
let small_code_len = small_code.len();
let large_code_len = large_code.len();
let cost_empty = deploy_contract_cost(ctx, small_code, Some(b"main"));
let cost_4mb = deploy_contract_cost(ctx, large_code, Some(b"main"));
let cost_15mb = deploy_contract_cost(ctx, large_code, Some(b"main"));
Contributor

is this function for a test? otherwise not sure why it needs to specifically call out the limits, which are set in params files.

Contributor

+1
nit: cost_max
also it seems like it's 15 but it's 1.5

Contributor Author

is this function for a test? otherwise not sure why it needs to specifically call out the limits, which are set in params files.

Ok I'll change it so that it uses the param file instead of a hardcoded value

@@ -1041,9 +1053,21 @@ impl Client {
source: StorageDataSource::Db,
state_patch: Default::default(),
};
let prev_chunk_transactions_size = match prev_chunk_opt {
Some(prev_chunk) => borsh::to_vec(prev_chunk.transactions())
Contributor

assuming this serialization will take some time, guard this with the new protocol version?

@wacban
Contributor

wacban commented May 29, 2024

ChunkStateWitness contains both transactions from the previous chunk (transactions) and the new chunk (new_transactions).

Quick question about this - isn't new_transactions duplicated in the ChunkStateWitness and in the ShardChunk itself? Do we need it in both places?

Contributor

@wacban wacban left a comment

LGTM but I'd like to have another look later so for now just a few comments

@@ -4454,6 +4454,38 @@ impl Chain {
})
.collect()
}

/// Find the last existing (not missing) chunk on this shard id.
pub fn get_header_of_last_existing_chunk(
Contributor

  1. The chunk header contains a field height_included that points to the block height where the last existing chunk on this shard is. Not sure if this would simplify or speed up this implementation, so sharing just in case you find it useful. There is a caveat that height_included is not set when the chunk is produced (since at this point we don't know which block height will include it); it's only set once the chunk is part of a block.

  2. The shard_id changes meaning during resharding. No need to do it here but can you add a todo for this? Something like:

TODO(resharding) - handle looking for the last existing chunk

  3. JFYI there is an interesting interaction with state sync here. Intuitively state sync gets the state as of the last block of the epoch. If there are missing chunks around this epoch boundary this loop could go before that and the node might not have the needed blocks. This issue actually appeared elsewhere and should be fixed already. When syncing, the node will get the blocks up until the last existing chunk for every shard.

Contributor Author

The shard_id changes meaning during resharding. No need to do it here but can you add a todo for this?

I used:

shard_id = epoch_manager.get_prev_shard_id(&cur_block_hash, shard_id)?;

Doesn't that take care of reshardings?

Contributor

Ah sorry I missed that line. That should do it.

Contributor Author

Oh actually I just realized that the last chunk is included in the previous block, so I don't need to iterate at all 🤦. I'll remove this function.

let size_limit = transactions_gas_limit
/ (runtime_config.wasm_config.ext_costs.gas_cost(ExtCosts::storage_write_value_byte)
+ runtime_config.wasm_config.ext_costs.gas_cost(ExtCosts::storage_read_value_byte));
let size_limit: u64 =
Contributor

nit: Can you move this to a helper method? This method is already quite overblown.
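
A possible shape for such a helper, reusing the expression from the diff above; the surrounding types (RuntimeConfig, Gas, ExtCosts) are nearcore's, but the function itself is hypothetical.

```rust
/// Hypothetical helper wrapping the inline expression from the diff: convert the
/// remaining gas budget into a byte budget, charging each byte as one storage
/// write plus one storage read.
fn transactions_size_limit_from_gas(
    runtime_config: &RuntimeConfig,
    transactions_gas_limit: Gas,
) -> u64 {
    let ext_costs = &runtime_config.wasm_config.ext_costs;
    let per_byte_cost = ext_costs.gas_cost(ExtCosts::storage_write_value_byte)
        + ext_costs.gas_cost(ExtCosts::storage_read_value_byte);
    transactions_gas_limit / per_byte_cost
}
```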

Comment on lines 784 to 786
runtime_config
.max_transactions_size_in_witness
.saturating_sub(chunk.prev_chunk_transactions_size) as u64
Contributor

This is pretty neat!

Comment on lines 795 to 803
transactions_gas_limit
/ (runtime_config
.wasm_config
.ext_costs
.gas_cost(ExtCosts::storage_write_value_byte)
+ runtime_config
.wasm_config
.ext_costs
.gas_cost(ExtCosts::storage_read_value_byte))
Contributor

nit: Can you split this into multiple lines by moving the factors / addends to helper variables? It's getting a bit too crazy ;)

chain/chain/src/runtime/mod.rs (resolved comment thread)
@@ -24,6 +24,10 @@ pub enum Parameter {
StorageProofSizeSoftLimit,
// Hard per-receipt limit of recorded trie storage proof
StorageProofSizeReceiptLimit,
// Maxmium size of transactions contained inside ChunkStateWitness.
Contributor

nit: typo maxmium

Comment on lines 27 to 30
// Maxmium size of transactions contained inside ChunkStateWitness.
MaxTransactionsSizeInWitness,
// Soft size limit of new transactions storage proof inside ChunkStateWitness.
NewTransactionsValidationStateSizeSoftLimit,
Contributor

Ditto inconsistent names. Looking at the other parameters the convention is to use Limit or SoftLimit.

Contributor Author

There's StorageProofSizeSoftLimit, so NewTransactionsValidationStateSizeSoftLimit is consistent with that 👀

Contributor

Yeah, I meant the max one, sorry, wrong line.

let tx_size = if checked_feature!("stable", WitnessTransactionLimits, PROTOCOL_VERSION) {
mb / 2
} else {
3 * mb
Contributor

That's funny, wasn't the old limit 4mb?

@@ -719,13 +719,13 @@ fn pure_deploy_bytes(ctx: &mut EstimatorContext) -> GasCost {
let config_store = RuntimeConfigStore::new(None);
let vm_config = config_store.get_config(PROTOCOL_VERSION).wasm_config.clone();
let small_code = generate_data_only_contract(0, &vm_config);
let large_code = generate_data_only_contract(bytesize::mb(4u64) as usize, &vm_config);
let large_code = generate_data_only_contract(bytesize::kb(1500u64) as usize, &vm_config);
Contributor

1500 or 1024 * 1.5?

Contributor Author

The transaction can't be bigger than 1.5MiB, so the code size is set to 1.5MB to leave some space for other transaction data.

Contributor

Ah okay.


@jancionear
Contributor Author

Quick question about this - isn't new_transactions duplicated in the ChunkStateWitness and in the ShardChunk itself? Do we need it in both places?

My understanding is that with stateless validation the validators don't receive the ShardChunks anymore, they should only be sent to nodes that track the shard. Validators receive ChunkStateWitness which is ShardChunk + validation information.

It would be great to get input from @staffik who implemented transaction validation (#10414), but he's OOO until next week :c. Maybe @pugachAG can chime in instead.

@jancionear jancionear requested a review from wacban May 29, 2024 19:52
@pugachAG
Contributor

@wacban @jancionear

Quick question about this - isn't new_transactions duplicated in the ChunkStateWitness and in the ShardChunk itself? Do we need it in both places?

State witness doesn't include the whole ShardChunk, only the header. So yeah, both ChunkStateWitness and ShardChunk contain new transactions, but as Jan pointed out stateless validators are not expected to have ShardChunk, so we do need it in both structs.

Contributor

@wacban wacban left a comment

LGTM, just a few mini nits

rejected_due_to_congestion += 1;
continue;
}
if checked_feature!("stable", WitnessTransactionLimits, protocol_version)
Contributor

mini nit: There is a new alternative for this, I don't remember the exact syntax but it's something like this: ProtocolFeature::WitnessTransactionLimits.enabled(protocol_version). Up to you as the convention is what you've used.

last_chunk_transactions_size: usize,
transactions_gas_limit: Gas,
) -> u64 {
if checked_feature!("stable", WitnessTransactionLimits, protocol_version) {
Contributor

You used an if-else here. Should it be a minimum of the two if the new feature is enabled?
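
A sketch of the suggestion; both input names are hypothetical, not identifiers from the diff.

```rust
/// Illustrative only: take the stricter of the gas-derived byte budget and the
/// remaining witness byte budget once the feature is enabled.
fn effective_size_limit(
    feature_enabled: bool,
    gas_derived_size_limit: u64,
    remaining_witness_size_budget: u64,
) -> u64 {
    if feature_enabled {
        gas_derived_size_limit.min(remaining_witness_size_budget)
    } else {
        gas_derived_size_limit
    }
}
```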

@@ -898,8 +898,17 @@ impl Client {
.get_chunk_extra(&prev_block_hash, &shard_uid)
.map_err(|err| Error::ChunkProducer(format!("No chunk extra available: {}", err)))?;

let prev_shard_id = self.epoch_manager.get_prev_shard_id(prev_block.hash(), shard_id)?;
let last_chunk_header =
Contributor

Please forgive me. Now that you changed it to be the chunk from the prev block I would go back to naming it prev_chunk_header. Sorry :)

Contributor Author

Eh the PR has been already merged, so I'm gonna leave it as is. If some names bother you, you can make a PR to change them ;)

@@ -1027,11 +1036,11 @@ impl Client {
&mut self,
shard_uid: ShardUId,
prev_block: &Block,
last_chunk: &ShardChunk,
Contributor

ditto prev_chunk

Comment on lines +1053 to +1064
let epoch_id = self.epoch_manager.get_epoch_id_from_prev_block(&prev_block.hash())?;
let protocol_version = self.epoch_manager.get_epoch_protocol_version(&epoch_id)?;
let last_chunk_transactions_size =
if checked_feature!("stable", WitnessTransactionLimits, protocol_version) {
borsh::to_vec(last_chunk.transactions())
.map_err(|e| {
Error::ChunkProducer(format!("Failed to serialize transactions: {e}"))
})?
.len()
} else {
0
};
Contributor

mini nit: Consider putting that in a helper method.
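
One possible shape for that helper, reusing the code from the diff above; the surrounding types are nearcore's, but the function itself is hypothetical.

```rust
/// Hypothetical helper: serialized size of the previous chunk's transactions,
/// or 0 when the WitnessTransactionLimits feature is not yet enabled.
fn last_chunk_transactions_size(
    chunk: &ShardChunk,
    protocol_version: ProtocolVersion,
) -> Result<usize, Error> {
    if !checked_feature!("stable", WitnessTransactionLimits, protocol_version) {
        return Ok(0);
    }
    let bytes = borsh::to_vec(chunk.transactions())
        .map_err(|e| Error::ChunkProducer(format!("Failed to serialize transactions: {e}")))?;
    Ok(bytes.len())
}
```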

Comment on lines +32 to +33
/// Configuration specific to ChunkStateWitness.
pub witness_config: WitnessConfig,
Contributor

❤️


/// Configuration specific to ChunkStateWitness.
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct WitnessConfig {
Contributor

mini nit: Maybe StateWitnessConfig? Best to just stick to whatever convention we have, I think StateWitness is more popular in struct names.

Comment on lines +218 to +221
/// Maximum size of transactions contained inside ChunkStateWitness.
/// A witness contains transactions from both the previous chunk and the current one.
/// This parameter limits the sum of sizes of transactions from both of those chunks.
pub combined_transactions_size_limit: usize,
Contributor

I like this idea, it's very nice. I'm curious if we're ever going to observe some oscillating behaviour close to congestion where every other chunk has 1.5MB and the rest 0.5MB. Nothing wrong with that as far as I can tell though.

Contributor Author

Yeah it could happen with a lot of large transactions, but IMO this approach should be good enough. The chunk producers are assigned at random, so it's not like someone can use this mechanism for anything malicious.

@@ -610,6 +610,31 @@ impl From<ExtCostsConfigView> for crate::ExtCostsConfig {
}
}

/// Configuration specific to ChunkStateWitness.
#[derive(Debug, serde::Serialize, serde::Deserialize, Clone, Hash, PartialEq, Eq)]
pub struct WitnessConfigView {
Contributor

I'm not familiar with the runtime configs views. Can you briefly tell me what are they for?

Contributor Author

AFAIU they are responsible for converting the RuntimeConfig to the json representation that's in the snapshots. After I added WitnessConfigView to RuntimeConfigView it started appearing in the snapshots.

Comment on lines +162 to +163
/// Size limits for transactions included in a ChunkStateWitness.
WitnessTransactionLimits,
Contributor

mini nit: StateWitnessTransactionLimits

@bowenwang1996 bowenwang1996 added this pull request to the merge queue May 30, 2024
Merged via the queue into near:master with commit 8d3edac May 30, 2024
29 checks passed