Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move accounts data size checks to end of ReplayStage #26744

Conversation

brooksprumo
Copy link
Contributor

@brooksprumo brooksprumo commented Jul 22, 2022

Problem

Accounts data size checks must be deterministic. Their current location is not, since batches are replayed in parallel, thus transaction processing order is non deterministic.

For more context, see #26439

Summary of Changes

All the checks for accounts data size violations have been moved to the end of replay stage, once the bank is complete. At this point, we check the accounts data size to see if there were violations either within the block or for the total size.

This must happen deterministically. Are there any negative performance impacts these checks may have? I've tried to make the checks as small and infrequent as possible; can they be further reduced?

Fixes #26439
Feature Gate Issue: modifies #24135 and #25517

Builds on

@stale
Copy link

stale bot commented Jul 31, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Jul 31, 2022
@brooksprumo brooksprumo removed the stale [bot only] Added to stale content; results in auto-close after a week. label Aug 4, 2022
@brooksprumo brooksprumo linked an issue Aug 5, 2022 that may be closed by this pull request
@brooksprumo brooksprumo force-pushed the accounts-data-len/move-replay-stage-checks branch from 9b1171d to 812f968 Compare August 5, 2022 18:54
@brooksprumo brooksprumo added the feature-gate Pull Request adds or modifies a runtime feature gate label Aug 5, 2022
@brooksprumo brooksprumo force-pushed the accounts-data-len/move-replay-stage-checks branch from 812f968 to 8b7e0f1 Compare August 5, 2022 19:24
@brooksprumo brooksprumo force-pushed the accounts-data-len/move-replay-stage-checks branch from 8b7e0f1 to 4f8540f Compare August 8, 2022 15:10
@brooksprumo brooksprumo changed the title WIP: Move accounts data size checks to ReplayStage Move accounts data size checks to end of ReplayStage Aug 8, 2022
@brooksprumo brooksprumo marked this pull request as ready for review August 8, 2022 17:21
@brooksprumo
Copy link
Contributor Author

Hi folks, thanks in advance for taking a look to review this PR. For some additional context, here's why I've asked each of you:

@brennanwatt: You touched this code most recently but allowing concurrent replay. Would love your expertise on perf and soundness.
@AshwinSekar: I ran these ideas by you, so wanted to make sure this looked right by you.
@jstarry: You found the issue to begin with.
@Lichtso: You wanted to make sure we still have test coverage for the accounts data size checks that were removed in other locations.

Also, please let me know if some parts need more explanation. Each person likely has a different level of familiarity with both this code and this issue, so happy to help in anyway.

AshwinSekar
AshwinSekar previously approved these changes Aug 8, 2022
core/src/replay_stage.rs Outdated Show resolved Hide resolved
core/src/replay_stage.rs Outdated Show resolved Hide resolved
core/src/replay_stage.rs Outdated Show resolved Hide resolved
@mergify mergify bot dismissed AshwinSekar’s stale review August 8, 2022 19:21

Pull request has been modified.

Copy link
Contributor

@bw-solana bw-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@jstarry jstarry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer banking stage changes to be made first so that we don't get into a situation where leaders unknowingly produce blocks with invalid allocations. Banking stage has some checks in cost tracker that prevent more than 100MB of allocations from system program allocations but it could be circumvented by reallocating in other programs such that the replay stage check gets triggered and marks the slot as dead.

}

debug_assert!(MAX_ACCOUNT_DATA_BLOCK_LEN <= i64::MAX as u64);
if bank.load_accounts_data_size_delta_on_chain() > MAX_ACCOUNT_DATA_BLOCK_LEN as i64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I don't think this MAX_ACCOUNT_DATA_BLOCK_LEN constant is very clearly named. It represents how much additional account data can be allocated in the block but the name implies that this is how much account data that can be accessed in the block. Maybe MAX_BLOCK_ACCOUNT_DATA_SIZE_DELTA?

@brooksprumo
Copy link
Contributor Author

Closing this PR as it cannot be merged. It would introduce potential non-determinism. For more information, please refer to #27029

@brooksprumo brooksprumo deleted the accounts-data-len/move-replay-stage-checks branch February 10, 2023 22:44
@brooksprumo brooksprumo restored the accounts-data-len/move-replay-stage-checks branch February 10, 2023 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-gate Pull Request adds or modifies a runtime feature gate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enforcing max account data cap may introduce indeterminism
4 participants