block writeable account data limit #184

bw-solana · 2024-10-15T22:47:44Z

No description provided.

jacobcreech · 2024-10-17T14:35:40Z

@bw-solana do we know the current impact if this change was live the past 30 days?

bw-solana · 2024-10-17T17:42:07Z

@bw-solana do we know the current impact if this change was live the past 30 days?

Good question, I don't know. I'll get some data

ilya-bobyr · 2024-10-18T00:40:43Z

proposals/0184-block-writeable-account-data-limit.md

+transaction being handled by the block producer (for inclusion in a block) or
+validator (replaying transactions in a block), add it to a running total of
+total writeable account data for the block, and compare it against some
+threshold (recommend 2GB to start) to decide if the transaction is allowed to be


Do we have statistics on what is the current distribution of writable data on the mainnet?
I wonder if we could start with an even lower value.

We don't have perfect stats, but I posted comment with some analysis below (#184 (comment))

2GiB feels right as a starting point. It should constrain worst case reasonably well without impacting current typical behavior on mainnet. We could ratchet things down (or up) after observing behavior

ilya-bobyr · 2024-10-18T00:42:32Z

proposals/0184-block-writeable-account-data-limit.md

+  `MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000`
+  (2GB).


2GB is 2_147_483_648 :P

Suggested change

`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000`

(2GB).

`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to 2GiB

(`2 * 1024 * 1024 * 1024`).

ilya-bobyr · 2024-10-18T00:43:15Z

proposals/0184-block-writeable-account-data-limit.md

+5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as
+  retryable in execute_and_commit_transactions_locked


Suggested change

5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as

retryable in execute_and_commit_transactions_locked

5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as

retryable in `execute_and_commit_transactions_locked()`.

ilya-bobyr · 2024-10-18T00:50:45Z

proposals/0184-block-writeable-account-data-limit.md

+There is an edge case where if a single transaction tries to consume more than
+the entire per block writeable account data budget, it should not be marked as
+retryable. This might not be possible today given tx size limits and maximum
+account size limits, but this could be exploitable in the future if either of
+this constraints change.


Would it make sense to introduce a separate constraint that limits the amount of writable data a single transaction can access?
We have MAX_LOADED_ACCOUNTS_DATA_SIZE_BYTES equal to 64MB which limits the total amount of data a transaction can load.
We could add MAX_LOADED_ACCOUNTS_WRITEABLE_DATA_SIZE_BYTES (with an obvious semantics).

We could. We'll effectively be constrained to the 64MiB per tx for updated account data and 100MiB per block for newly allocated account data, so we already have some guard rails here, but we could be more explicit

ilya-bobyr · 2024-10-18T02:54:48Z

proposals/0184-block-writeable-account-data-limit.md

+1. Update the cost model to compute the aggregate size of all accounts marked
+  writeable for each transaction. This can be done similarly to how
+  `data_bytes_len_total` is computed inside `get_transaction_cost` but limited
+  to accounts marked writeable.


I think this is inaccurate.

Looking at the get_transaction_cost() in cost-model, I think, data_bytes_len_total only calculates the size of the instruction data passed into instructions.
It does not look at the account sizes:

data_bytes_len_total = data_bytes_len_total.saturating_add(instruction.data.len() as u64);

https://github.com/anza-xyz/agave/blob/f9f8b60ca15fa721c6cdd816c99dfd4e9123fd77/cost-model/src/cost_model.rs#L187-L188

The loaded accounts data size is only included in the transaction cost after the transaction execution.
Actual value is passed into the calculate_cost_for_executed_transaction() as the actual_loaded_accounts_data_size_bytes argument.
And it is populated and checked in the blockstore_processor.rs:

Some(CostModel::calculate_cost_for_executed_transaction( tx, committed_tx.executed_units, committed_tx.loaded_account_stats.loaded_accounts_data_size, &bank.feature_set, ))

https://github.com/anza-xyz/agave/blob/f9f8b60ca15fa721c6cdd816c99dfd4e9123fd77/ledger/src/blockstore_processor.rs#L246-L251

We could record have a similar counter for the amount of writable data that the transaction touched.

So at the high level it can be computed similarly to what we are already doing.
It is just that the details would be different.

Not sure if there are any implications for the DX, due to the fact that we can only compute this value after a transaction execution.

Yep, you're right. We don't have any precedent for understanding actual account data before loading/executing transactions. We rely on either scraping top level instructions (e.g. for allocate) or estimating and then adjusting.

This would seemingly have to fall into the estimate + adjust case (unless we want to do something radical like grab account data size before execution), which would likely impact DX if we push this onto the user similar to how we ask for CU/Data budget estimates.

One option is to leverage the existing Data budget value for writes as well (defaults to 2MiB per tx or can be set by user). I think this is likely small enough of a default to not bottleneck scheduling before the data budget gets refunded post execution. We would need to study this.

2501babe · 2024-10-21T05:35:41Z

proposals/0184-block-writeable-account-data-limit.md

+or if the slot needs to be marked dead (if sum is above the limit for the case
+of validator).


andrew will have to look at this because if it can invalidate blocks then it may interfere with asynchronous execution

Good callout. I would naively expect this to be handled in the same manner that we are handling account data load limits, but someone with a bigger brain should confirm the details

its unfortunately quite different, transaction data size (sum of all account data sizes required for a transaction) can be calculated on the transaction level, and if the transaction exceeds its limit, it is treated as invalid. so in a world with async execution, the transaction could be included in a block before verifying it, but all nodes will see that it violates its limits and treat it the same (charge fees and not execute it, once the feature gate for fee-only transactions is active)

but if we added a data size limit at the block level, you would need to know the running total of all previous transactions to determine whether you can put the transaction in the block. theres no way to do this without executing because accounts can be created, deallocated, or reallocated. the purpose of removing constraints to enable async execution is that block producers which put "bad" transactions in a block might make suboptimal blocks, but they can always be sure the block is valid if they follow constraints that dont depend on execution (are transactions syntactically valid, are blockhashes valid, etc)

i believe we could have a block-level constraint as proposed here if instead of "if you exceed the block data size limit, the block is invalid" it was like "if you exceed the block data size limit, no further transactions in the block will be executed." this could be user-hostile tho, because it lets block producers invalidate transactions that should otherwise succeed. especially if we charge fees, the block producer now has an incentive to exceed the block limit and farm fees without having to execute transactions. this is different from a transaction exceeding its own data size because in that case the transaction is invalid as constructed on its own

alternatively we could very easily charge more for writable data at the transaction level, but im not sure that accomplishes the goals of this simd. andrew may have better suggestions than me tho

"if you exceed the block data size limit, no further transactions in the block will be executed."

I like the direction of this, but I believe it leads to divergence, as transaction replay order is not guaranteed (iiuc). I tried to do something related a while ago, and wrote up my findings here: solana-labs/solana#27029

Thanks for the detailed explanation!

IIUC, the key difference between the strategy for loaded account data vs written account data is in how we know/estimate the amounts for a given tx.

We can know how much data a tx loads by either: (1) inspecting the tx and loading the accounts or (2) using the data budget requested / default data budget. The latter obviously is prone to under-packing, but with the current rates (48M CUs per block, default data budget of 2MiB, charging 8 CUs per 32kB), we could still pack >90k txs per block using default data budget.

For understanding the amount of data written, we either (1) actually execute the tx or (2) rely on some data written budget requested. Method 1 is a showstopper for async execution, so it seems like 2 is our only option. A default budget of 2MiB would allow only 1k default budget txs with a 2GiB write limit. Maybe we should do some profiling to see how low we could reasonably push this limit without impacting most existing txs.

yep, think we are on the same page! the way transaction data size works now is theres a default budget (64MiB), adjustable by compute budget instruction, and at execution time the accounts are totalled and the transaction is invalid if it exceeds its limit. but this doesnt affect the block, so it doesnt interfere with async execution

we could add a transaction-level writable accounts budget with a smaller default limit (assume 2MiB like you suggest), and it would likewise be async-safe. we could easily have block-level constraints based on those requested budgets, which would also be async-safe. the main risk is people requesting enormous budgets and crowding other transactions out of the block, which we could defray with either a per-transaction maximum limit** or by exponentially ramping the cost per byte

if you make changes to the transaction data size calculation code, note that the formula is quite tricky. i have an open simd to fix this #186

**(16-20MiB would allow writing to a 10Mib account plus plenty of breathing room, but i dont actually know how big the largest accounts are onchain. 10MiB is the max allocation size but i believe you can pump accounts larger than that with realloc)

bw-solana · 2024-10-23T16:36:00Z

TL;DR - I think we would see essentially zero impact to mainnet behavior with a 2GiB write limit per block.

We don't have existing metrics for writeable account data per slot, but I was able to collect some relevant data that should help this conversation. Data is looking back at metrics over the past 2+ weeks.

We track the amount of account data stored per second (accounts_db_store_timings.total_data). Note this does not necessarily mean writing to disk (may go to cache). This should roughly map to 2 slots worth of writeable data or at least be a pessimistic upper bound. Note this encompasses writes beyond the user transactions included in a block such as stake account updates and fee collection.

I am observing writes typically in the 100MiB range with spikes as high as ~1.9GiB. We'll see spikes over 1GiB ~a few times per day.

We also track newly allocated account data per block (cost_tracker_stats.allocated_accounts_data_size). This is typically a handful of kBs on average with spikes up to the 4.5MiB range.

I also added some metrics to track the largest append vec writes. Most of these are in the 1-2MiB range, but we see spikes up to 400MiB on epoch boundary due to saving stake accounts. These spikes will come down significantly and be spread out when PER is enabled. I've also observed non-epoch-boundary spikes up to the ~20MiB range.

@brooksprumo - It would be helpful if you or someone could fact check me here

brooksprumo

I like the idea of having block-wide limits. I attempted something similar a while ago, when (iirc) we did not, and thus had to hijack transaction errors to indicate block-wide issues. That model didn't work, and I wrote about it here: solana-labs/solana#27029. I've heard the TVU is smarter now, and can accommodate block-wide limit, which is great!

brooksprumo · 2024-10-24T12:50:10Z

proposals/0184-block-writeable-account-data-limit.md

+or if the slot needs to be marked dead (if sum is above the limit for the case
+of validator).


"if you exceed the block data size limit, no further transactions in the block will be executed."

I like the direction of this, but I believe it leads to divergence, as transaction replay order is not guaranteed (iiuc). I tried to do something related a while ago, and wrote up my findings here: solana-labs/solana#27029

brooksprumo · 2024-10-24T12:52:58Z

proposals/0184-block-writeable-account-data-limit.md

+  block being produced + the amount of account data that would be written with
+  the current transaction against the total block limit in the would_fit
+  function
+3. Store the write limit in a constant such as


Store the write limit in a constant

From past experience, I'd recommend against a constant that we intend to change later (or minimally want to support allowing to be changed). IMO a function that takes in the feature set and returns the value is more future-proof.

But, this is pretty agave-implementation specific, so may not be germane to a SIMD.

brooksprumo · 2024-10-24T13:09:25Z

We track the amount of account data stored per second (accounts_db_store_timings.total_data). Note this does not necessarily mean writing to disk (may go to cache). This should roughly map to 2 slots worth of writeable data or at least be a pessimistic upper bound. Note this encompasses writes beyond the user transactions included in a block such as stake account updates and fee collection.

I worked on adding a metric for the amount of data stored by transaction processing, but it was not merged: solana-labs/solana#34718. Maybe we can bring this back, or use it to gather data more specific to transaction processing.

I also added some metrics to track the largest append vec writes. Most of these are in the 1-2MiB range, but we see spikes up to 400MiB on epoch boundary due to saving stake accounts. These spikes will come down significantly and be spread out when PER is enabled. I've also observed non-epoch-boundary spikes up to the ~20MiB range.

I don't have these numbers off hand; I'll look them up.

bw-solana added 2 commits October 15, 2024 22:46

block writeable account data limit

be74f99

fix lint

cd35a74

ilya-bobyr reviewed Oct 18, 2024

View reviewed changes

github-actions bot mentioned this pull request Oct 21, 2024

Upstream Updates - Mon Oct 21 00:14:35 UTC 2024 smartcontractkit/chainlink-solana#898

Closed

2501babe reviewed Oct 21, 2024

View reviewed changes

brooksprumo reviewed Oct 24, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

block writeable account data limit #184

block writeable account data limit #184

bw-solana commented Oct 15, 2024

jacobcreech commented Oct 17, 2024

bw-solana commented Oct 17, 2024

ilya-bobyr Oct 18, 2024

bw-solana Oct 23, 2024

ilya-bobyr Oct 18, 2024

2501babe Oct 21, 2024 •

edited

Loading

ilya-bobyr Oct 18, 2024

ilya-bobyr Oct 18, 2024

bw-solana Oct 23, 2024

ilya-bobyr Oct 18, 2024

bw-solana Oct 23, 2024

2501babe Oct 21, 2024

bw-solana Oct 23, 2024

2501babe Oct 23, 2024 •

edited

Loading

brooksprumo Oct 24, 2024

bw-solana Oct 24, 2024 •

edited

Loading

2501babe Oct 24, 2024 •

edited

Loading

bw-solana commented Oct 23, 2024

brooksprumo left a comment

brooksprumo Oct 24, 2024

brooksprumo Oct 24, 2024

brooksprumo commented Oct 24, 2024

		`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000`
		(2GB).

-  `MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000`
-  (2GB).
+  `MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to 2GiB
+  (`2 * 1024 * 1024 * 1024`).

		5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as
		retryable in execute_and_commit_transactions_locked

		or if the slot needs to be marked dead (if sum is above the limit for the case
		of validator).

block writeable account data limit #184

Are you sure you want to change the base?

block writeable account data limit #184

Conversation

bw-solana commented Oct 15, 2024

jacobcreech commented Oct 17, 2024

bw-solana commented Oct 17, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

2501babe Oct 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

2501babe Oct 23, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bw-solana Oct 24, 2024 • edited Loading

Choose a reason for hiding this comment

2501babe Oct 24, 2024 • edited Loading

Choose a reason for hiding this comment

bw-solana commented Oct 23, 2024

brooksprumo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brooksprumo commented Oct 24, 2024

2501babe Oct 21, 2024 •

edited

Loading

2501babe Oct 23, 2024 •

edited

Loading

bw-solana Oct 24, 2024 •

edited

Loading

2501babe Oct 24, 2024 •

edited

Loading