Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cap accounts data a transaction can load by its requested limit #27840

Conversation

tao-stones
Copy link
Contributor

@tao-stones tao-stones commented Sep 16, 2022

Problem

Amount of data a transaction loads should not exceed its requested accounts data size limit.

Summary of Changes

  • Feature gated, to fail loading transaction with error MaxAccountsDataSizeExceeded if its accounts data size exceed requested limit;
  • New compute budget instruction set_accounts_data_size_limit is added allow user set max bytes a transaction can load, default to 10MB, max 100MB.
  • Avoid double counting data size for non-loader accounts and its programdata account by checking if program account was assigned by an empty account (eg AccountSharedData::default()), if not then not to count data size in load_executable_accounts().

Feature Gate Issue #27839

@tao-stones tao-stones force-pushed the limit-accounts-data-size-by-requested-cu branch from 813f56a to 7b53387 Compare September 16, 2022 16:42
brooksprumo
brooksprumo previously approved these changes Sep 16, 2022
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good to me. I'd like @jstarry to review as well before merging.

One thing I would suggest is the naming/description of the error itself. Since the error is for exceeding how much accounts data can be loaded, it may be valuable to say "loaded" somewhere in the error name/description. Otherwise I'm not sure if this is loaded, stored, newly created, or some combination of them all.

@mergify mergify bot dismissed brooksprumo’s stale review September 16, 2022 23:03

Pull request has been modified.

@tao-stones
Copy link
Contributor Author

it may be valuable to say "loaded" somewhere in the error name/description.

Added "Loaded"

@tao-stones tao-stones force-pushed the limit-accounts-data-size-by-requested-cu branch 3 times, most recently from 971a090 to 9498bb8 Compare September 18, 2022 19:13
@jstarry jstarry added the feature-gate Pull Request adds or modifies a runtime feature gate label Sep 19, 2022
Copy link
Member

@jstarry jstarry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs tests as well

Comment on lines 370 to 365
Self::accumulate_and_check_transaction_accounts_data_size(
&mut accumulated_accounts_data_size,
programdata_account.data().len(),
max_accounts_data_size_by_requested_cu,
)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we can't make consensus-impacting decisions based on the size of the program data account right now because of the way ExtendProgramData was implemented. ExtendProgramData doesn't require a write lock on the program account so it could update the length of the account in parallel with a transaction that is very close to the data size limit.

I suggest that we modify ExtendProgramData to require a write lock on the program account in a separate PR since I think the size of program data should impact the data size limit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest that we modify ExtendProgramData to require a write lock on the program account in a separate PR since I think the size of program data should impact the data size limit.

I have a PR to resolve this here: #27911

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for the insight! I'll follow that PR.

@@ -264,6 +266,15 @@ impl Accounts {
} else {
LoadZeroLamports::SomeWithZeroLamportAccount
};
let mut accumulated_accounts_data_size = 0u64;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we do not count the data size of accounts that are "loader keys" (line 298) because they are loaded later in load_executable_accounts. You'll need to make sure to track data size in load_executable_accounts as well. However, it's tricky because non "loader keys" could be double counted. A "loader key" is defined as a key which is invoked but not included passed as an account input to any instructions. So it's possible that a program account loaded in load_executable_accounts was also loaded in load_transaction because it was considered a "non-loader key" and then we would double count the data size of the program and its programdata account.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, thanks for pointing it out! Will follow it through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit de60af5 added logic to avoid double counting non-loader if reloaded in load_executable_account().

@@ -416,6 +438,36 @@ impl Accounts {
}
}

fn get_requested_accounts_data_size_limit(tx: &SanitizedTransaction) -> Result<u64> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be more appropriate to add a new ComputeBudget instruction to let transactions request a new accounts data size limit. I think it's useful for validators to be able to statically analyze how much compute or memory each transaction can use separately. When a validator is replaying a block, they can parallelize lots of transactions that use compute but if they are memory constrained they may not wish to parallelize transactions that load lots of account data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, it does sound a cleaner option, can add in separate PR as pre-requisite.

Do we still want to tie data size to compute units somehow? My gut is compute unit is for CPU, it doesn't necessary need to capture memory usage. If a new instruction is added to request data size, it doesn't have to do with compute. But it can/should still be part of fee calculation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#28084 adds SetAccountsDataSizeLimit instruction

@tao-stones tao-stones force-pushed the limit-accounts-data-size-by-requested-cu branch 2 times, most recently from 855a5f2 to ef01771 Compare October 5, 2022 22:05
@tao-stones tao-stones changed the title Cap accounts data a transaction can load by its requested compute-unit Cap accounts data a transaction can load by its requested limit Oct 7, 2022
@tao-stones tao-stones force-pushed the limit-accounts-data-size-by-requested-cu branch 8 times, most recently from 6ebb6fe to d927b57 Compare October 14, 2022 13:33
@tao-stones
Copy link
Contributor Author

@jstarry @brooksprumo I was side-tracked from it; It has been refreshed, addressed all previous comments. Can you guys take another look? Thanks!

@jstarry
Copy link
Member

jstarry commented Nov 15, 2022

Sorry I didn't look at this before merge. My main concern is that when loading executable accounts we redundantly load bpf loader a bunch of times and if a program is called multiple times its accounts are loaded multiple times. It would have been nice to clean that up first because now that implementation is tied to consensus because it could lead to a transaction being rejected

@jstarry
Copy link
Member

jstarry commented Nov 15, 2022

@Lichtso should be aware of this change too because of how it impacts accounts loaded for transactions

@tao-stones
Copy link
Contributor Author

tao-stones commented Nov 15, 2022

Is it straightforward to remove redundant bpf loader loading? Happy to clean that up first if can get some pointers (not very familiar with that part of code).

Prepared revert PR in case want to clean it up first, or can do it before feature activation without reverting.

@brooksprumo
Copy link
Contributor

@taozhu-chicago I just put up a PR to work around a test hitting MaxLoadedAccountsDataSizeExceeded when I was not expecting it to:
#28814

I was expecting that it should be allowed to load one max-sized account (or at least near max), but I saw errors. I imagine this is due to the size of the program itself being loaded? Can you help me figure out what's happening?

@tao-stones
Copy link
Contributor Author

tao-stones commented Nov 15, 2022

@taozhu-chicago I just put up a PR to work around a test hitting MaxLoadedAccountsDataSizeExceeded when I was not expecting it to: #28814

I was expecting that it should be allowed to load one max-sized account (or at least near max), but I saw errors. I imagine this is due to the size of the program itself being loaded? Can you help me figure out what's happening?

This is due to the transaction didn't set to request max account data size, so default 10M is used. When the randomly generated account_data_size exceeds 10M, it'd fail. I'll put out a PR to add set_accounts_data_size_limit ix to test transaction later today.

#28826

@tao-stones
Copy link
Contributor Author

My main concern is that when loading executable accounts we redundantly load bpf loader a bunch of times and if a program is called multiple times its accounts are loaded multiple times.

Still reading code and trying to wrapping my head around it. But perhaps to track accounted account keys to de-dup? But feel there are still holes.... (continue looking)

@brooksprumo
Copy link
Contributor

This is due to the transaction didn't set to request max account data size, so default 10M is used. When the randomly generated account_data_size exceeds 10M, it'd fail.

@taozhu-chicago But the test does not request more than 10M. That's why I was/am confused. Does loading the system program count against the account data load limit (even though it lives in the runtime vs on-chain)?

@tao-stones
Copy link
Contributor Author

well, it does actually at

            let account_size = rng.gen_range(
                1,
                MAX_PERMITTED_DATA_LENGTH as usize - MAX_PERMITTED_DATA_INCREASE,
            );

and

pub const MAX_PERMITTED_DATA_LENGTH: u64 = 10 * 1024 * 1024;
pub const MAX_PERMITTED_DATA_INCREASE: usize = 1_024 * 10;

the account_size could be > 10,000,000.

(can move this to #28826 to keep this PR chats clean)

@jstarry
Copy link
Member

jstarry commented Nov 16, 2022

I also think that the max is too small.. it should allow at least one max sized account to be loaded in my opinion.

xiangzhu70 pushed a commit to xiangzhu70/solana that referenced this pull request Nov 16, 2022
…na-labs#27840)

- Add new compute-budget instruction to set transaction-wide accounts data size limit
- Set default accounts data limit to 10MB, and max to 100MB, per transaction;
- Add getters to make changing default and/or max values easier in the future with feature gates;
- added error counter for transactions exceed data size limit
@brooksprumo
Copy link
Contributor

I also think that the max is too small.. it should allow at least one max sized account to be loaded in my opinion.

The max is currently set to 100 MB

const MAX_LOADED_ACCOUNTS_DATA_LIMIT: u32 = 100_000_000;

The default is 10 MB

const DEFAULT_LOADED_ACCOUNTS_DATA_LIMIT: u32 = 10_000_000;

If you're saying that the default should allow one max-sized account to be loaded, then I agree; that was my expectation as well.

mergify bot added a commit that referenced this pull request Nov 17, 2022
…port #27840) (#28797)

* Cap accounts data a transaction can load by its requested limit (#27840)

- Add new compute-budget instruction to set transaction-wide accounts data size limit
- Set default accounts data limit to 10MB, and max to 100MB, per transaction;
- Add getters to make changing default and/or max values easier in the future with feature gates;
- added error counter for transactions exceed data size limit

(cherry picked from commit 81dc2e5)

# Conflicts:
#	core/src/transaction_priority_details.rs
#	program-runtime/src/compute_budget.rs
#	programs/bpf/tests/programs.rs
#	runtime/src/accounts.rs
#	runtime/src/bank.rs
#	sdk/src/feature_set.rs

* manual fix backport conflicts

Co-authored-by: Tao Zhu <82401714+taozhu-chicago@users.noreply.github.com>
Co-authored-by: Tao Zhu <tao@solana.com>
@jstarry
Copy link
Member

jstarry commented Nov 17, 2022

Yes, sorry, I was referring to the default value. I don't think it's a good idea to backport this PR. Especially while some of these details are ironed out.

@tao-stones
Copy link
Contributor Author

tao-stones commented Nov 17, 2022

yeah agree, I'll halt backporting. (backporting to 1.14 has just automerged. #28861 to revert it)

WRT default limit, it equals to max account size rn, simplest transaction is able to just load one max sized account. However @brooksprumo found in a unit test where a mock realloc instruction is added to builtin, the corresponding builtin account eats few bytes (eg the number of bytes of its name), therefore would prevent loading the max sized account.

Is it OK to update accounts.load_executable_accounts()to exclude builtins from being loaded?

@tao-stones
Copy link
Contributor Author

tao-stones commented Nov 17, 2022

Also, can we move further discussion to:

tao-stones added a commit that referenced this pull request Nov 18, 2022
…it (backport #27840)" (#28861)

Revert "Cap accounts data a transaction can load by its requested limit (backport #27840) (#28797)"

This reverts commit 8ad9136.
gnapoli23 pushed a commit to gnapoli23/solana that referenced this pull request Dec 16, 2022
…na-labs#27840)

- Add new compute-budget instruction to set transaction-wide accounts data size limit
- Set default accounts data limit to 10MB, and max to 100MB, per transaction;
- Add getters to make changing default and/or max values easier in the future with feature gates;
- added error counter for transactions exceed data size limit
nickfrosty pushed a commit to nickfrosty/solana that referenced this pull request Jan 4, 2023
…na-labs#27840)

- Add new compute-budget instruction to set transaction-wide accounts data size limit
- Set default accounts data limit to 10MB, and max to 100MB, per transaction;
- Add getters to make changing default and/or max values easier in the future with feature gates;
- added error counter for transactions exceed data size limit
@Lichtso
Copy link
Contributor

Lichtso commented Jan 11, 2023

@taozhu-chicago This has not been activated on any network and is in 1.15 only, right?

Because, I think I have found a way remove the duplication of loaded accounts in load_executable_accounts() and would like to refactor that. That would then affect the way they are counted and thus this PR.

tao-stones added a commit that referenced this pull request Jan 13, 2023
tao-stones added a commit that referenced this pull request Jan 13, 2023
Lichtso pushed a commit that referenced this pull request Jan 17, 2023
…it" (#29373)

Revert "Cap accounts data a transaction can load by its requested limit (#27840)"

This reverts commit 81dc2e5.
@tao-stones
Copy link
Contributor Author

@taozhu-chicago This has not been activated on any network and is in 1.15 only, right?

Because, I think I have found a way remove the duplication of loaded accounts in load_executable_accounts() and would like to refactor that. That would then affect the way they are counted and thus this PR.

oops, sorry didn't see your comment till now. But rather late than never. Yes it was only in master, and #29373 had reverted it. #29743 re-aaplies a simplified version on top of your changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-gate Pull Request adds or modifies a runtime feature gate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants