[EN Performance] Optimize checkpoint serialization for -37GB operational RAM, -2.7 minutes duration, -19.6 million allocs (50% fewer allocs) #2964

Closed
fxamacker opened this issue Aug 12, 2022 · 0 comments · Fixed by #3050
Labels: Execution Cadence, Execution Team, Performance

fxamacker commented Aug 12, 2022

Problem

Although PR #2792 reduces peak memory used by checkpointing by reusing ledger state, we can further reduce peak memory used by over 35GB during checkpoint serialization.

Updates #1744

Proposed Solution

Replace the largest data structure used for checkpoint serialization, and process subtries one at a time instead of the entire trie. Also preallocate where feasible.

Optionally, allow a flag to specify the number of trie levels at which to split. Specifying 4 levels yields 2^4 = 16 subtries, which is a reasonable default for meaningful memory savings and faster serialization.
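To illustrate how the number of levels maps to the number of subtries, here is a minimal sketch. It is not the flow-go implementation: the `node` type and the `subtrieRoots`, `countNodes`, and `buildFull` helpers are hypothetical names invented for this example, and `countNodes` merely stands in for serializing one subtrie.

```go
package main

import "fmt"

// node is a hypothetical, simplified binary trie node (not flow-go's node type).
type node struct {
	left, right *node
}

// subtrieRoots returns the nodes located `levels` below root, left to right.
// It always returns 1<<levels entries; branches that do not exist are nil.
func subtrieRoots(root *node, levels int) []*node {
	roots := make([]*node, 0, 1<<levels) // preallocate: the final size is known
	var walk func(n *node, depth int)
	walk = func(n *node, depth int) {
		if depth == levels {
			roots = append(roots, n)
			return
		}
		var l, r *node
		if n != nil {
			l, r = n.left, n.right
		}
		walk(l, depth+1)
		walk(r, depth+1)
	}
	walk(root, 0)
	return roots
}

// countNodes stands in for serializing one subtrie: it visits every node in it.
func countNodes(n *node) int {
	if n == nil {
		return 0
	}
	return 1 + countNodes(n.left) + countNodes(n.right)
}

// buildFull builds a full binary trie of the given height (height 0 is a single node).
func buildFull(height int) *node {
	n := &node{}
	if height > 0 {
		n.left = buildFull(height - 1)
		n.right = buildFull(height - 1)
	}
	return n
}

func main() {
	root := buildFull(6)
	roots := subtrieRoots(root, 4) // levels=4 gives 16 subtries
	total := 0
	for _, r := range roots {
		// Each subtrie is handled on its own, so per-subtrie working
		// state can be released before the next subtrie starts.
		total += countNodes(r)
	}
	fmt.Println(len(roots), total)
}
```

In this sketch, the nodes above the split level (15 of them for levels=4) are not covered by any subtrie and would need to be handled separately after the subtries are processed.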

Serializing data in parallel is made easier by this proposed change, but that is outside the scope of this issue.

Preliminary Results Using Levels=4 (16 Subtries)

Using August 12 mainnet checkpoint file with Go 1.18.5:

  • -37GB peak RAM (top command), -23GB RAM (go bench B/op)
  • -19.6 million (-50%) allocs/op in serialization phase
  • -2.7 minutes duration

Before:    625746 ms    88320868048 B/op    39291999 allocs/op
After:     461937 ms    64978613264 B/op    19671410 allocs/op
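
As a quick check, the deltas quoted in the bullets can be recomputed from the Before/After benchmark lines. The sketch below uses only the numbers shown in this issue; the `delta` helper is a name made up for this example. The -37GB peak RAM figure comes from the top command and cannot be derived from these lines.

```go
package main

import "fmt"

// delta returns the absolute reduction and the reduction as a percentage of before.
func delta(before, after int64) (diff int64, pct float64) {
	diff = before - after
	pct = float64(diff) / float64(before) * 100
	return diff, pct
}

func main() {
	// Values copied from the Before/After benchmark lines in this issue.
	ms, _ := delta(625746, 461937)
	bytes, _ := delta(88320868048, 64978613264)
	allocs, allocsPct := delta(39291999, 19671410)

	fmt.Printf("duration: -%.2f minutes\n", float64(ms)/60000)
	fmt.Printf("memory:   -%.2f GB (B/op)\n", float64(bytes)/1e9)
	fmt.Printf("allocs:   -%d (-%.2f%%)\n", allocs, allocsPct)
}
```

The results work out to roughly 2.7 minutes, about 23.3 GB (decimal) of B/op, and just under 50% fewer allocations, consistent with the bullets.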

No benchstat comparisons (n=5+) yet because of the duration and memory each run requires (it needs the big benchnet-dev-004 server).

EDIT: added more details after reading PR #3050 review comments.

@fxamacker fxamacker added Performance Execution Cadence Execution Team labels Aug 12, 2022
@fxamacker fxamacker self-assigned this Aug 12, 2022
@fxamacker fxamacker changed the title [EN Performance] Further reduce peak memory used by checkpointing [EN Performance] Reduce peak memory used by checkpointing by about 20-30GB Aug 15, 2022
@fxamacker fxamacker changed the title [EN Performance] Reduce peak memory used by checkpointing by about 20-30GB [EN Performance] Optimize checkpointing for -37GB operational RAM, -2.7 minutes duration, -19.6 million allocs (50% fewer allocs) Aug 22, 2022
@fxamacker fxamacker changed the title [EN Performance] Optimize checkpointing for -37GB operational RAM, -2.7 minutes duration, -19.6 million allocs (50% fewer allocs) [EN Performance] Optimize checkpoint serialization for -37GB operational RAM, -2.7 minutes duration, -19.6 million allocs (50% fewer allocs) Aug 22, 2022