Cycle the BLS changes pool when falling below a threshold #11873
Conversation
for _, change := range changes {
    if err := s.cfg.BLSToExecPool.MarkIncluded(change); err != nil {
        return errors.Wrap(err, "could not mark BLSToExecutionChange as included")
    }
}
if excessPending > 0 && len(changes) > excessPending {
I wonder if it's worth keeping track of how many items the map has held over its lifetime to get a rough idea of how large the backing arrays are, just to avoid cycling through maps (which is quite expensive, since it copies over all the elements) when there isn't really much of a benefit. Too bad we can't check the cap of a map :(
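Something along these lines, as a rough sketch with stand-in names (the insertedTotal counter and the 4x ratio are hypothetical, not part of this PR):

package blstoexec

// Sketch only: track lifetime insertions to estimate how oversized the map's
// backing storage has become, since Go does not expose a cap() for maps.
type pool struct {
    m             map[uint64]struct{} // stand-in for the real pending-changes map
    insertedTotal int                 // keys inserted since the last cycle
}

func (p *pool) maybeCycle() {
    // Cycling copies every element, so only pay for it when the map once held
    // far more entries than it currently does and there is memory to reclaim.
    if p.insertedTotal > 4*len(p.m) {
        fresh := make(map[uint64]struct{}, len(p.m))
        for k, v := range p.m {
            fresh[k] = v
        }
        p.m = fresh
        p.insertedTotal = len(p.m)
    }
}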
It seems like this is cluttering up the implementation of handleBlockBLSToExecChanges, could we move it into the internals of the pool manager?
MarkIncluded would be a good place for it.
@@ -31,6 +31,10 @@ import (
    "go.opencensus.io/trace"
)

// This defines the lower size at which we recycle the BLS changes pool to avoid
// memory leaks
This comment is a little unclear: it conflates the reasoning for the recycle operation with the threshold itself, and it makes the constant sound like a lower bound rather than an upper bound. I think you want to say something like "We recycle the BLS changes pool to avoid the backing map growing without bound. The cycling operation is expensive because it copies all elements, so we only do it when the map is smaller than this upper bound."
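For concreteness, that wording applied to the constant introduced in this diff could read as follows (the value of 200 is taken from the PR description):

// We recycle the BLS changes pool to avoid the backing map growing without
// bound. The cycling operation is expensive because it copies all elements,
// so we only do it when the map is smaller than this upper bound.
const blsChangesPoolThreshold = 200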
@@ -23,6 +23,8 @@ type PoolManager interface {
    InsertBLSToExecChange(change *ethpb.SignedBLSToExecutionChange)
    MarkIncluded(change *ethpb.SignedBLSToExecutionChange) error
    ValidatorExists(idx types.ValidatorIndex) bool
    Copy() PoolManager
    NumPending() int
If we moved the cycling logic into the pool type we could keep these out of the interface.
I think we can improve the encapsulation here by moving the cycling logic into the pool.
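A sketch of that encapsulation with stand-in types, assuming the pool guards its map with a mutex (the cycleMap helper and exact method shapes are illustrative, not the PR's code): if MarkIncluded cycles the map itself, Copy() and NumPending() can stay off the PoolManager interface.

package blstoexec

import "sync"

const blsChangesPoolThreshold = 200 // value taken from the PR description

type pool struct {
    lock sync.RWMutex
    m    map[uint64]struct{} // stand-in for the real pending-changes map
}

// MarkIncluded removes an included change and cycles the backing map
// internally, so callers never see the threshold or the copy.
func (p *pool) MarkIncluded(validatorIndex uint64) {
    p.lock.Lock()
    defer p.lock.Unlock()

    delete(p.m, validatorIndex)
    if len(p.m) == blsChangesPoolThreshold {
        p.cycleMap()
    }
}

// cycleMap copies the surviving entries into a fresh map so the old,
// oversized backing storage can be garbage collected.
func (p *pool) cycleMap() {
    fresh := make(map[uint64]struct{}, len(p.m))
    for k, v := range p.m {
        fresh[k] = v
    }
    p.m = fresh
}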
@@ -155,6 +169,9 @@ func (p *Pool) MarkIncluded(change *ethpb.SignedBLSToExecutionChange) error {

    delete(p.m, change.Message.ValidatorIndex)
    p.pending.Remove(node)
    if p.numPending() == blsChangesPoolThreshold {
Couldn't we see a situation where the size of the pool fluctuates around the value of blsChangesPoolThreshold, causing us to trigger this expensive copy over and over? I assumed you would only cycle if the map had gone through a certain number of key deletions (and was under a max size to keep the cost of cycling down).
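One hypothetical way to add that hysteresis, again with stand-in names (deletedSinceCycle and minDeletionsBeforeCycle are not part of this PR): only cycle after a batch of deletions has accumulated and while the live set is still small.

package blstoexec

const (
    blsChangesPoolThreshold = 200 // only cycle while the map is small enough for a cheap copy
    minDeletionsBeforeCycle = 64  // hypothetical batch of deletions required before cycling
)

type pool struct {
    m                 map[uint64]struct{} // stand-in for the real pending-changes map
    deletedSinceCycle int
}

func (p *pool) markIncluded(validatorIndex uint64) {
    delete(p.m, validatorIndex)
    p.deletedSinceCycle++
    // A pool hovering around the threshold no longer copies on every call:
    // we wait for a batch of deletions and require the live set to be small.
    if p.deletedSinceCycle >= minDeletionsBeforeCycle && len(p.m) <= blsChangesPoolThreshold {
        fresh := make(map[uint64]struct{}, len(p.m))
        for k, v := range p.m {
            fresh[k] = v
        }
        p.m = fresh
        p.deletedSinceCycle = 0
    }
}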
This PR implements the following logic. We let the BLS-to-execution pool grow indefinitely (growth is bounded by the number of validators with a BLS prefix; we may want to change this behavior), but when the pool drops below a given threshold (here set to 200) we recycle it to reclaim the memory held by the pool's map. The rationale is that we expect a large surge of messages at the fork, which will slowly be included in blocks until we settle on no messages (or very few).
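A minimal standalone demonstration of the Go behavior this relies on (not Prysm code): deleting keys from a map does not shrink its backing storage, so copying the survivors into a fresh map is what actually releases the memory.

package main

import (
    "fmt"
    "runtime"
)

// heapInUse forces a GC and reports bytes of heap currently in use.
func heapInUse() uint64 {
    runtime.GC()
    var ms runtime.MemStats
    runtime.ReadMemStats(&ms)
    return ms.HeapInuse
}

func main() {
    // Fill a map, then delete almost everything: the backing buckets are not
    // released, so heap usage stays high.
    m := make(map[int][64]byte)
    for i := 0; i < 100_000; i++ {
        m[i] = [64]byte{}
    }
    for i := 100; i < 100_000; i++ {
        delete(m, i)
    }
    fmt.Printf("after deletes: %d entries, ~%d KiB heap in use\n", len(m), heapInUse()/1024)

    // "Cycling": copy the few survivors into a fresh map and drop the old one,
    // letting the GC reclaim the oversized buckets.
    fresh := make(map[int][64]byte, len(m))
    for k, v := range m {
        fresh[k] = v
    }
    m = fresh
    fmt.Printf("after cycling: %d entries, ~%d KiB heap in use\n", len(m), heapInUse()/1024)
}

Running this should show heap usage dropping sharply after the copy, though the exact numbers depend on the Go version and runtime overhead.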