-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
block writeable account data limit #184
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,107 @@ | ||||||||||
--- | ||||||||||
simd: '0184' | ||||||||||
title: Block Writeable Account Data Limit | ||||||||||
authors: | ||||||||||
- Brennan Watt (Anza) | ||||||||||
category: Standard | ||||||||||
type: Core | ||||||||||
status: Idea | ||||||||||
created: 2024-10-15 | ||||||||||
feature: (fill in with feature tracking issues once accepted) | ||||||||||
supersedes: | ||||||||||
superseded-by: | ||||||||||
extends: | ||||||||||
--- | ||||||||||
|
||||||||||
## Summary | ||||||||||
|
||||||||||
A protocol limit on the amount of account data in bytes that may be written | ||||||||||
during a single block. | ||||||||||
|
||||||||||
## Motivation | ||||||||||
|
||||||||||
When direct mapping is enabled, clients will be able to handle much more account | ||||||||||
data because there will be fewer copies in memory (e.g. for each VM invocation / | ||||||||||
CPI call). However, writeable account data cannot be direct mapped because it | ||||||||||
needs to be written back after a successful tx execution. This necessitates a | ||||||||||
decoupling between account read and write data in order to increase readable | ||||||||||
account data limits. | ||||||||||
|
||||||||||
Note: There is an existing per block limit on newly allocated account data | ||||||||||
(100MB), but this does not encompass account data being updated/written back. | ||||||||||
|
||||||||||
## New Terminology | ||||||||||
|
||||||||||
None | ||||||||||
|
||||||||||
## Detailed Design | ||||||||||
|
||||||||||
This feature would check the amount of account data marked as writeable for each | ||||||||||
transaction being handled by the block producer (for inclusion in a block) or | ||||||||||
validator (replaying transactions in a block), add it to a running total of | ||||||||||
total writeable account data for the block, and compare it against some | ||||||||||
threshold (recommend 2GB to start) to decide if the transaction is allowed to be | ||||||||||
included in the block (if sum is below the limit for the case of block producer) | ||||||||||
or if the slot needs to be marked dead (if sum is above the limit for the case | ||||||||||
of validator). | ||||||||||
Comment on lines
+45
to
+46
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. andrew will have to look at this because if it can invalidate blocks then it may interfere with asynchronous execution There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good callout. I would naively expect this to be handled in the same manner that we are handling account data load limits, but someone with a bigger brain should confirm the details There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. its unfortunately quite different, transaction data size (sum of all account data sizes required for a transaction) can be calculated on the transaction level, and if the transaction exceeds its limit, it is treated as invalid. so in a world with async execution, the transaction could be included in a block before verifying it, but all nodes will see that it violates its limits and treat it the same (charge fees and not execute it, once the feature gate for fee-only transactions is active) but if we added a data size limit at the block level, you would need to know the running total of all previous transactions to determine whether you can put the transaction in the block. theres no way to do this without executing because accounts can be created, deallocated, or reallocated. the purpose of removing constraints to enable async execution is that block producers which put "bad" transactions in a block might make suboptimal blocks, but they can always be sure the block is valid if they follow constraints that dont depend on execution (are transactions syntactically valid, are blockhashes valid, etc) i believe we could have a block-level constraint as proposed here if instead of "if you exceed the block data size limit, the block is invalid" it was like "if you exceed the block data size limit, no further transactions in the block will be executed." this could be user-hostile tho, because it lets block producers invalidate transactions that should otherwise succeed. especially if we charge fees, the block producer now has an incentive to exceed the block limit and farm fees without having to execute transactions. this is different from a transaction exceeding its own data size because in that case the transaction is invalid as constructed on its own alternatively we could very easily charge more for writable data at the transaction level, but im not sure that accomplishes the goals of this simd. andrew may have better suggestions than me tho There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I like the direction of this, but I believe it leads to divergence, as transaction replay order is not guaranteed (iiuc). I tried to do something related a while ago, and wrote up my findings here: solana-labs/solana#27029 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the detailed explanation! IIUC, the key difference between the strategy for loaded account data vs written account data is in how we know/estimate the amounts for a given tx. We can know how much data a tx loads by either: (1) inspecting the tx and loading the accounts or (2) using the data budget requested / default data budget. The latter obviously is prone to under-packing, but with the current rates (48M CUs per block, default data budget of 2MiB, charging 8 CUs per 32kB), we could still pack >90k txs per block using default data budget. For understanding the amount of data written, we either (1) actually execute the tx or (2) rely on some data written budget requested. Method 1 is a showstopper for async execution, so it seems like 2 is our only option. A default budget of 2MiB would allow only 1k default budget txs with a 2GiB write limit. Maybe we should do some profiling to see how low we could reasonably push this limit without impacting most existing txs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep, think we are on the same page! the way transaction data size works now is theres a default budget (64MiB), adjustable by compute budget instruction, and at execution time the accounts are totalled and the transaction is invalid if it exceeds its limit. but this doesnt affect the block, so it doesnt interfere with async execution we could add a transaction-level writable accounts budget with a smaller default limit (assume 2MiB like you suggest), and it would likewise be async-safe. we could easily have block-level constraints based on those requested budgets, which would also be async-safe. the main risk is people requesting enormous budgets and crowding other transactions out of the block, which we could defray with either a per-transaction maximum limit** or by exponentially ramping the cost per byte if you make changes to the transaction data size calculation code, note that the formula is quite tricky. i have an open simd to fix this #186 **(16-20MiB would allow writing to a 10Mib account plus plenty of breathing room, but i dont actually know how big the largest accounts are onchain. 10MiB is the max allocation size but i believe you can pump accounts larger than that with realloc) |
||||||||||
|
||||||||||
This feature to limit writeable account data per block would slot in nicely to | ||||||||||
the existing cost model and cost tracker that maintain per block limits for | ||||||||||
various resources transactions can consume. This will allow block producers to | ||||||||||
determine when a block is "full" and validators to mark slots dead that do not | ||||||||||
adhere to these limits. | ||||||||||
|
||||||||||
This feature could be implemented like so (using Agave client as reference): | ||||||||||
|
||||||||||
1. Update the cost model to compute the aggregate size of all accounts marked | ||||||||||
writeable for each transaction. This can be done similarly to how | ||||||||||
`data_bytes_len_total` is computed inside `get_transaction_cost` but limited | ||||||||||
to accounts marked writeable. | ||||||||||
Comment on lines
+56
to
+59
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is inaccurate. Looking at the data_bytes_len_total =
data_bytes_len_total.saturating_add(instruction.data.len() as u64); The loaded accounts data size is only included in the transaction cost after the transaction execution. Some(CostModel::calculate_cost_for_executed_transaction(
tx,
committed_tx.executed_units,
committed_tx.loaded_account_stats.loaded_accounts_data_size,
&bank.feature_set,
)) We could record have a similar counter for the amount of writable data that the transaction touched. So at the high level it can be computed similarly to what we are already doing. Not sure if there are any implications for the DX, due to the fact that we can only compute this value after a transaction execution. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, you're right. We don't have any precedent for understanding actual account data before loading/executing transactions. We rely on either scraping top level instructions (e.g. for allocate) or estimating and then adjusting. This would seemingly have to fall into the estimate + adjust case (unless we want to do something radical like grab account data size before execution), which would likely impact DX if we push this onto the user similar to how we ask for CU/Data budget estimates. One option is to leverage the existing Data budget value for writes as well (defaults to 2MiB per tx or can be set by user). I think this is likely small enough of a default to not bottleneck scheduling before the data budget gets refunded post execution. We would need to study this. |
||||||||||
2. Update the cost tracker to check the current writeable account data in the | ||||||||||
block being produced + the amount of account data that would be written with | ||||||||||
the current transaction against the total block limit in the would_fit | ||||||||||
function | ||||||||||
3. Store the write limit in a constant such as | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
From past experience, I'd recommend against a constant that we intend to change later (or minimally want to support allowing to be changed). IMO a function that takes in the feature set and returns the value is more future-proof. But, this is pretty agave-implementation specific, so may not be germane to a SIMD. |
||||||||||
`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000` | ||||||||||
(2GB). | ||||||||||
Comment on lines
+65
to
+66
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 2GB is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
4. Add a new `CostTrackerError` type `WouldExceedAccountDataWriteableBlockLimit` | ||||||||||
and return this error if block limit is exceeded. | ||||||||||
5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as | ||||||||||
retryable in execute_and_commit_transactions_locked | ||||||||||
Comment on lines
+69
to
+70
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
There is an edge case where if a single transaction tries to consume more than | ||||||||||
the entire per block writeable account data budget, it should not be marked as | ||||||||||
retryable. This might not be possible today given tx size limits and maximum | ||||||||||
account size limits, but this could be exploitable in the future if either of | ||||||||||
this constraints change. | ||||||||||
Comment on lines
+72
to
+76
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it make sense to introduce a separate constraint that limits the amount of writable data a single transaction can access? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could. We'll effectively be constrained to the 64MiB per tx for updated account data and 100MiB per block for newly allocated account data, so we already have some guard rails here, but we could be more explicit |
||||||||||
|
||||||||||
## Alternatives Considered | ||||||||||
|
||||||||||
Keep read/write data limits coupled even after direct mapping. Downside of this | ||||||||||
is artificially constraining the amount of readable account data per block. | ||||||||||
|
||||||||||
Unthrottle write data without any limit. This could negatively impact slot | ||||||||||
replay times for validators with slow disks. | ||||||||||
|
||||||||||
## Impact | ||||||||||
|
||||||||||
Read and write account data limits per block can be increased independently of | ||||||||||
each other. | ||||||||||
|
||||||||||
## Security Considerations | ||||||||||
|
||||||||||
Agave and FD (and any other) clients should implement this proposal to avoid | ||||||||||
breaking consensus. | ||||||||||
|
||||||||||
## Drawbacks | ||||||||||
|
||||||||||
If a large amount of account data is attempted to be updated in a given block, | ||||||||||
transactions will start being excluded and marked as retryable where they can | ||||||||||
attempt to be included in the next block. | ||||||||||
|
||||||||||
## Backwards Compatibility | ||||||||||
|
||||||||||
With this new threshold, it is possible there are blocks that would have been | ||||||||||
valid that are now marked dead due to exceeding the write threshold. This change | ||||||||||
will need to be feature gated to ensure all validators change behavior in lock | ||||||||||
step. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have statistics on what is the current distribution of writable data on the mainnet?
I wonder if we could start with an even lower value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have perfect stats, but I posted comment with some analysis below (#184 (comment))
2GiB feels right as a starting point. It should constrain worst case reasonably well without impacting current typical behavior on mainnet. We could ratchet things down (or up) after observing behavior