Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix gossipsub race condition for heartbeat #188

Merged
merged 1 commit into from
May 28, 2019

Conversation

brandonwestcott
Copy link

In using gossipsub on a system with a number of ephemeral peers, we noticed messages would occasionally fail to route to their intended targets even though a subscribe had been received.

In a basic example, given there are 2 peers connected, PeerA & PeerB. For TopicA, PeerA is not in the mesh, but has seen and gossiped it before to PeerC, which is now disconnected and no longer subscribed.

When a Subscribe request for TopicA is issued from PeerB, then PeerA adds that peer onto the PubSub.topics map:

go-libp2p-pubsub/pubsub.go

Lines 566 to 573 in 49274b0

if subopt.GetSubscribe() {
tmap, ok := p.topics[t]
if !ok {
tmap = make(map[peer.ID]struct{})
p.topics[t] = tmap
}
tmap[rpc.from] = struct{}{}

However, its not that map that is used for publishing messages back out from PeerA when its not in the mesh:

gmap, ok = gs.fanout[topic]
if !ok {
// we don't have any, pick some
peers := gs.getPeers(topic, GossipSubD, func(peer.ID) bool { return true })
if len(peers) > 0 {
gmap = peerListToMap(peers)
gs.fanout[topic] = gmap
}
}

Line #233 there uses GossipSubRouter.fanout as its map, which is updated during the heartbeat process:

for topic, peers := range gs.fanout {
// check whether our peers are still in the topic
for p := range peers {
_, ok := gs.p.topics[topic][p]
if !ok {
delete(peers, p)
}
}

In between the 1 second of a heartbeat, these two maps can be out of date with each other, which is to be expected. However, on

if !ok {
the !ok only works as a fallback for the initial iteration, if the map is empty because all other Peers have been unsubscribed (PeerC) then it doesn't fallback to getPeers.

This PR is to change that conditional to check for empty map cases so that the fallback still happens. The other option is to make the key for the topic be set back to nil instead of an empty map inside the heartbeat process.

Copy link
Collaborator

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

@vyzo vyzo merged commit 4221a39 into libp2p:master May 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants