
Cycle the BLS changes pool when falling below a threshold #11873


Merged: 13 commits merged into develop from cycle-bls-pool on Jan 28, 2023

Conversation

@potuz (Contributor) commented Jan 13, 2023

This PR implements the following logic. We let the BLS-to-execution-change pool grow indefinitely (in practice it is bounded by the number of validators with a BLS withdrawal prefix; we may want to change this behavior), but when the pool drops below a given threshold (here set at 200) we recycle it to reclaim the memory held by the pool's map. The rationale is that we expect a large surge of messages at the fork, which will slowly be included in blocks until we settle on no messages (or very few).
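As a rough illustration of that description, here is a minimal sketch of the recycling step (not the actual Prysm code; the package, type, and constant names are assumptions): once the pool has shrunk to the threshold, the remaining entries are copied into a fresh map so the old, over-grown backing buckets can be garbage collected.

package blspool

// blsChangesPoolThreshold mirrors the value of 200 mentioned above; the pool
// is only recycled once it has shrunk to this size.
const blsChangesPoolThreshold = 200

// signedChange stands in for the stored signed BLS-to-execution change.
type signedChange struct{}

type pool struct {
    m map[uint64]*signedChange
}

// markIncluded drops an included change and, once the pool has shrunk to the
// threshold, rebuilds the map so the memory held by its larger backing array
// can be reclaimed (Go maps never shrink their buckets in place).
func (p *pool) markIncluded(idx uint64) {
    delete(p.m, idx)
    if len(p.m) != blsChangesPoolThreshold {
        return
    }
    fresh := make(map[uint64]*signedChange, len(p.m))
    for k, v := range p.m {
        fresh[k] = v
    }
    p.m = fresh
}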

@potuz requested a review from a team as a code owner on January 13, 2023 at 18:20
for _, change := range changes {
    if err := s.cfg.BLSToExecPool.MarkIncluded(change); err != nil {
        return errors.Wrap(err, "could not mark BLSToExecutionChange as included")
    }
}
if excessPending > 0 && len(changes) > excessPending {
@kasey (Contributor) commented Jan 13, 2023

I wonder if it's worth keeping track of how many items the map has held over its lifetime to get a rough idea of how large the backing arrays are, just to avoid cycling through maps (which is quite expensive, since it copies over all the elements) when there isn't really much of a benefit. Too bad we can't check the cap of a map :(
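Since Go offers no way to inspect a map's capacity, one way to approximate this suggestion is to track a lifetime high-water mark next to the map and only cycle when that peak is much larger than the current size. A minimal sketch (the names and the 4x ratio are assumptions, not code from this PR):

package blspool

// pendingChange stands in for the stored signed BLS-to-execution change.
type pendingChange struct{}

type trackedPool struct {
    m        map[uint64]*pendingChange
    peakSize int // the most entries the map has ever held
}

func (p *trackedPool) insert(idx uint64, c *pendingChange) {
    p.m[idx] = c
    if len(p.m) > p.peakSize {
        p.peakSize = len(p.m)
    }
}

// maybeCycle rebuilds the map only when its historical peak suggests the
// backing buckets are far larger than what the current contents need.
func (p *trackedPool) maybeCycle() {
    if p.peakSize < 4*len(p.m) {
        return
    }
    fresh := make(map[uint64]*pendingChange, len(p.m))
    for k, v := range p.m {
        fresh[k] = v
    }
    p.m = fresh
    p.peakSize = len(p.m)
}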

Contributor commented:
It seems like this is cluttering up the implementation of handleBlockBLSToExecChanges; could we move it into the internals of the pool manager?

Contributor commented:
MarkIncluded would be a good place for it.

@@ -31,6 +31,10 @@ import (
"go.opencensus.io/trace"
)

// This defines the lower size at which we recycle the BLS changes pool to avoid
// memory leaks
Contributor commented:
This comment is a little unclear, conflating the reasoning for the recycle operation and also making it sound like this is a lower bound, rather than an upper bound. I think you want to say something like "We recycle the BLS changes pool to avoid the backing map growing without bound. The cycling operation is expensive because it copies all elements, so we only do it when the map is smaller than this upper bound."

@@ -23,6 +23,8 @@ type PoolManager interface {
InsertBLSToExecChange(change *ethpb.SignedBLSToExecutionChange)
MarkIncluded(change *ethpb.SignedBLSToExecutionChange) error
ValidatorExists(idx types.ValidatorIndex) bool
Copy() PoolManager
NumPending() int
Contributor commented:
If we moved the cycling logic into the pool type we could keep these out of the interface.
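For illustration, if the cycling logic were internal to the concrete pool type, the exported interface could keep its pre-PR shape, without Copy or NumPending. A self-contained sketch using placeholder types in place of the real ethpb and types packages:

package blspool

// Placeholders standing in for ethpb.SignedBLSToExecutionChange and
// types.ValidatorIndex from the real interface.
type SignedBLSToExecutionChange struct{}
type ValidatorIndex uint64

// PoolManager as it could look if recycling stayed inside the pool's own
// MarkIncluded implementation rather than being driven by callers.
type PoolManager interface {
    InsertBLSToExecChange(change *SignedBLSToExecutionChange)
    MarkIncluded(change *SignedBLSToExecutionChange) error
    ValidatorExists(idx ValidatorIndex) bool
}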

@kasey (Contributor) left a comment:

I think we can improve the encapsulation here, moving the cycling logic into the pool.

@@ -155,6 +169,9 @@ func (p *Pool) MarkIncluded(change *ethpb.SignedBLSToExecutionChange) error {

delete(p.m, change.Message.ValidatorIndex)
p.pending.Remove(node)
if p.numPending() == blsChangesPoolThreshold {
@kasey (Contributor) commented Jan 24, 2023

Couldn't we see a situation where the size of the pool fluctuates around the value of blsChangesPoolThreshold, causing us to trigger this expensive copy over and over? I assumed you would only cycle if the map had gone through a certain number of key deletions (and was under a max size to keep the cost of cycling down).
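One way to address that concern, sketched below with made-up names and numbers (not code from this PR), is to add hysteresis: count deletions since the last rebuild and only cycle once the pool is both small enough and enough keys have been removed since the previous copy.

package blspool

// blsChange stands in for the stored signed BLS-to-execution change.
type blsChange struct{}

const (
    cycleSizeCap      = 200  // only consider cycling when the pool is at most this large
    minDeletesToCycle = 1024 // and only after this many deletions since the last rebuild
)

type hysteresisPool struct {
    m            map[uint64]*blsChange
    deletesSince int // deletions since the map was last rebuilt
}

func (p *hysteresisPool) markIncluded(idx uint64) {
    delete(p.m, idx)
    p.deletesSince++
    // Require both conditions, so a pool hovering around the size cap does not
    // trigger an expensive rebuild on every removal.
    if len(p.m) > cycleSizeCap || p.deletesSince < minDeletesToCycle {
        return
    }
    fresh := make(map[uint64]*blsChange, len(p.m))
    for k, v := range p.m {
        fresh[k] = v
    }
    p.m = fresh
    p.deletesSince = 0
}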

kasey previously approved these changes Jan 24, 2023
prylabs-bulldozer bot merged commit cc454bb into develop on Jan 28, 2023
delete-merged-branch bot deleted the cycle-bls-pool branch on January 28, 2023 at 14:42