This repository has been archived by the owner on Jun 20, 2024. It is now read-only.
eliminate AddConnection/RemoveConnection race #530
Merged
We were invoking AddConnection in a separate goroutine, so RemoveConnection could end up being invoked first. That left a bogus entry in our connection map, which effectively prevented connectivity to the other peer until the stale entry was removed through a lost tie break.
And if, by chance, the remote end ran into exactly the same race condition, then connectivity between that peer and ourselves was broken permanently.
Hitting this race condition required a combination of a) no UDP heartbeat being received for HeartbeatTimeout (30s), which causes the actorLoop to terminate and RemoveConnection to be invoked, and b) the AddConnection invocation being delayed for at least that long.
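The ordering problem described above can be sketched in a few lines of Go. This is only an illustration, not the code from this PR: `ConnectionMap`, `Connection`, `PeerName`, `racyOrder` and `fixedOrder` are hypothetical names, and the sketch assumes the fix amounts to registering the connection synchronously before the loop that can trigger its removal is started.

```go
package main

import (
	"fmt"
	"sync"
)

// PeerName and Connection are hypothetical stand-ins used for illustration only.
type PeerName string

type Connection struct {
	Remote PeerName
}

// ConnectionMap is a hypothetical registry of live connections, keyed by remote peer.
type ConnectionMap struct {
	mu    sync.Mutex
	conns map[PeerName]*Connection
}

func NewConnectionMap() *ConnectionMap {
	return &ConnectionMap{conns: make(map[PeerName]*Connection)}
}

// AddConnection registers conn. It returns false if an entry for that peer
// already exists, which is how a stale entry blocks a fresh connection.
func (m *ConnectionMap) AddConnection(conn *Connection) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, found := m.conns[conn.Remote]; found {
		return false
	}
	m.conns[conn.Remote] = conn
	return true
}

// RemoveConnection deletes the entry only if it refers to this exact connection.
func (m *ConnectionMap) RemoveConnection(conn *Connection) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.conns[conn.Remote] == conn {
		delete(m.conns, conn.Remote)
	}
}

// Has reports whether the map holds an entry for the named peer.
func (m *ConnectionMap) Has(name PeerName) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	_, found := m.conns[name]
	return found
}

// racyOrder replays the bad interleaving deterministically: the removal
// (triggered by the heartbeat timeout) runs before the delayed add.
func racyOrder(m *ConnectionMap, conn *Connection) {
	m.RemoveConnection(conn) // nothing to remove yet, so this is a no-op
	m.AddConnection(conn)    // the late add leaves an entry for a dead connection
}

// fixedOrder registers the connection synchronously, and only then starts
// the goroutine whose termination triggers the removal.
func fixedOrder(m *ConnectionMap, conn *Connection) {
	m.AddConnection(conn)
	done := make(chan struct{})
	go func() {
		defer close(done)
		// The connection's actor loop would run here; on exit it cleans up.
		m.RemoveConnection(conn)
	}()
	<-done
}

func main() {
	dead := &Connection{Remote: "peer-b"}

	racy := NewConnectionMap()
	racyOrder(racy, dead)
	fmt.Println("stale entry after racy order:", racy.Has("peer-b"))
	fmt.Println("fresh connection accepted:", racy.AddConnection(&Connection{Remote: "peer-b"}))

	fixed := NewConnectionMap()
	fixedOrder(fixed, dead)
	fmt.Println("stale entry after fixed order:", fixed.Has("peer-b"))
}
```

In the racy ordering the map keeps an entry for a connection that has already terminated, so a later connection to the same peer is rejected. With the add performed before the goroutine is started, the removal can never precede it.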
Fixes #529.