Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gossipsub -- make the node wait for the mesh before publishing anything #3561

Closed
ZenGround0 opened this issue Oct 15, 2019 · 4 comments · Fixed by #3630
Closed

Gossipsub -- make the node wait for the mesh before publishing anything #3561

ZenGround0 opened this issue Oct 15, 2019 · 4 comments · Fixed by #3630

Comments

@ZenGround0
Copy link
Contributor

Description

Because of the way gossipsub works there is a long delay between node startup and a node's ability to publish over pubsub channels (see this issue for more detail).

Using the default timing parameters with gossipsub exacerbated an existing race in the majority of our integration tests and created an unworkable amount of flakiness. To get around this in the short term we modified the heartbeat interval to be short enough to roughly force mesh construction before publish.

This is an imperfect solution that makes races manageable but does not eliminate them. We could remove them once and for all by blocking on node init until after receiving an event from a gossipsub eventbus. We haven't done this yet because it requires a change to the gossipsub implementation / api.

Acceptance criteria

No more pubsub related races by consuming a gossipsub event bus and waiting for a "ready to publish" event.

Risks + pitfalls

This work is mostly outside of our control and belongs to libp2p.

Where to begin

See if we can get this scheduled in gossipsub roadmap. After that most of our work is done.

@ZenGround0
Copy link
Contributor Author

Update we are currently BLOCKING gossipsub integration on this getting this notification and integrating it into our node due to the risk of introducing test flakiness without.

@ZenGround0
Copy link
Contributor Author

Current plan is to wait for EnoughPeers from this PR

@anorth
Copy link
Member

anorth commented Mar 1, 2020

@ZenGround0 is this still relevant?

@ZenGround0
Copy link
Contributor Author

This is closed out by our use of a noop discovery when initializing gossipsub which forces at least one peer on the mesh before a publish call unblocks. There is still a minor potential performance issues involved in our default heartbeat being set low enough to make testing expedient, but that can be addressed if/when it becomes an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants