Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak in streamsOutbound #477

Closed
twoeths opened this issue Dec 5, 2023 · 3 comments
Closed

Memory leak in streamsOutbound #477

twoeths opened this issue Dec 5, 2023 · 3 comments

Comments

@twoeths
Copy link
Contributor

twoeths commented Dec 5, 2023

Description
There are items inside streamsOutbound exist with the same size in 2 heap snapshots (one was taken 1 day after another)

Screenshot 2023-12-05 at 13 33 04

there are 55 OutboundStream instances

Screenshot 2023-12-05 at 13 55 24

however there are only 50 items inside Gossipsub peers set

Screenshot 2023-12-05 at 14 00 13

it means that when a peer is disconnected from node, it was removed from Gossipsub peers set but not streamsOutbound map. This is root cause of #6129 and only happen since Lodestar v1.12

@twoeths
Copy link
Contributor Author

twoeths commented Dec 5, 2023

the only way a peer was removed from peers set but not in streamsOutbound is outboundStream?.close() throws error. It was fixed here https://github.com/ChainSafe/js-libp2p-gossipsub/pull/471/files#diff-a2a171449d862fe29692ce031981047d7ab755ae7f84c707aef80701b3ea0c80R855

the fix is included in release 10.1.1 of gossipsub, will try to test it in lodestar

@dapplion
Copy link
Contributor

dapplion commented Dec 6, 2023

For reference MAX_OUTBOUND_BUFFER_SIZE = 16MB in Lodestar:

https://github.com/ChainSafe/lodestar/blob/836fabf4ad6e9014484cf988cebea901e3fdddb9/packages/beacon-node/src/network/gossip/gossipsub.ts#L32

Maybe this is a lot, with 50 target peers, so 800 MB of worst case. Outbound streams are only concerned with:

  1. forwarding
  2. publishing

Let's assume we have a full stream that can't take more data. For (1) it's not critical unless it happens extensively. A specific peer not getting a forwarded message sometimes is okay since there should be 7 other peers on average forwarding the same message. For (2) it's more critical since we want to be sure that our message is received timely. However as long we can publish the message to a sufficient number of peers, we are good. Say if 1 out of 50 peers has a full outbound stream it does not matter at all.

@wemeetagain
Copy link
Member

closed as this seems to be fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants