Our policy for closing node-to-node channels was originally too strict: a channel to a node was destroyed as soon as we had (globally) committed its retirement. #2654 relaxes that, at the cost of keeping channels open forever, which causes relatively small memory growth (nodes should be recycled frequently enough that this doesn't cause any issues in practice).
There isn't a clear point in time when a node-to-node channel can safely be destroyed. For example, as a retired primary, I may want to keep a channel open to other nodes to forward client requests to the new configuration (see #1713). As a retired backup, I may also receive the response to a forwarded RPC from the primary, which should still be returned to the client.
Instead, we should periodically destroy channels that have been idle (on both send and receive) for a while (frequency TBC, but probably a multiple of the election timeout). If a node wants to send a message on a channel that has been destroyed, the message will be queued (*) and a new channel establishment will start. On receiving a message from a node whose channel was previously destroyed, we simply create a new channel.
While channel re-establishment has already been implemented in #2092, we'll need to periodically `tick()` the `ChannelManager` and destroy channels that have been idle for a while.
(*) To cap memory growth, we only queue one message for now.
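The proposal above could be sketched roughly as follows. This is a minimal Python model for discussion, not the actual implementation: the `Channel` fields, `on_established()` (standing in for the end of channel establishment), the `sent` list, and the choice to let a newer queued message replace an older one are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Channel:
    established: bool = False
    idle_ms: int = 0                 # time since last send or receive
    pending: Optional[bytes] = None  # at most one queued message (*)


class ChannelManager:
    """Hypothetical model: names and fields are illustrative only."""

    def __init__(self, idle_timeout_ms: int):
        # Assumption: in practice this would be a multiple of the election timeout.
        self.idle_timeout_ms = idle_timeout_ms
        self.channels: Dict[str, Channel] = {}
        self.sent: List[Tuple[str, bytes]] = []  # stands in for the network

    def _get_or_create(self, node_id: str) -> Channel:
        # A destroyed channel is transparently recreated on next use.
        return self.channels.setdefault(node_id, Channel())

    def send(self, node_id: str, msg: bytes) -> None:
        ch = self._get_or_create(node_id)
        ch.idle_ms = 0
        if ch.established:
            self.sent.append((node_id, msg))
        else:
            # Assumption: a newer message replaces the queued one.
            ch.pending = msg

    def on_established(self, node_id: str) -> None:
        # Called once channel establishment completes; flush the queued message.
        ch = self._get_or_create(node_id)
        ch.established = True
        if ch.pending is not None:
            self.sent.append((node_id, ch.pending))
            ch.pending = None

    def recv(self, node_id: str, msg: bytes) -> bytes:
        # Receiving from a node whose channel was destroyed recreates it.
        ch = self._get_or_create(node_id)
        ch.idle_ms = 0
        return msg

    def tick(self, elapsed_ms: int) -> None:
        # Destroy channels that have been idle on both send and receive.
        for node_id in list(self.channels):
            ch = self.channels[node_id]
            ch.idle_ms += elapsed_ms
            if ch.idle_ms >= self.idle_timeout_ms:
                del self.channels[node_id]
```

In this sketch, any send or receive resets the idle counter, so only channels that are quiet in both directions for the full timeout are destroyed by `tick()`. A later `send()` to the same node creates a fresh, unestablished channel and holds a single message until `on_established()` is called.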