-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move accounts data size checks to end of ReplayStage #26744
Move accounts data size checks to end of ReplayStage #26744
Conversation
cc07dc7
to
0422066
Compare
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
9b1171d
to
812f968
Compare
812f968
to
8b7e0f1
Compare
8b7e0f1
to
4f8540f
Compare
Hi folks, thanks in advance for taking a look to review this PR. For some additional context, here's why I've asked each of you: @brennanwatt: You touched this code most recently but allowing concurrent replay. Would love your expertise on perf and soundness. Also, please let me know if some parts need more explanation. Each person likely has a different level of familiarity with both this code and this issue, so happy to help in anyway. |
Pull request has been modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer banking stage changes to be made first so that we don't get into a situation where leaders unknowingly produce blocks with invalid allocations. Banking stage has some checks in cost tracker that prevent more than 100MB of allocations from system program allocations but it could be circumvented by reallocating in other programs such that the replay stage check gets triggered and marks the slot as dead.
} | ||
|
||
debug_assert!(MAX_ACCOUNT_DATA_BLOCK_LEN <= i64::MAX as u64); | ||
if bank.load_accounts_data_size_delta_on_chain() > MAX_ACCOUNT_DATA_BLOCK_LEN as i64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside: I don't think this MAX_ACCOUNT_DATA_BLOCK_LEN
constant is very clearly named. It represents how much additional account data can be allocated in the block but the name implies that this is how much account data that can be accessed in the block. Maybe MAX_BLOCK_ACCOUNT_DATA_SIZE_DELTA
?
Closing this PR as it cannot be merged. It would introduce potential non-determinism. For more information, please refer to #27029 |
Problem
Accounts data size checks must be deterministic. Their current location is not, since batches are replayed in parallel, thus transaction processing order is non deterministic.
For more context, see #26439
Summary of Changes
All the checks for accounts data size violations have been moved to the end of replay stage, once the bank is complete. At this point, we check the accounts data size to see if there were violations either within the block or for the total size.
This must happen deterministically. Are there any negative performance impacts these checks may have? I've tried to make the checks as small and infrequent as possible; can they be further reduced?
Fixes #26439
Feature Gate Issue: modifies #24135 and #25517
Builds on