Fix udp writes not causing nodes to become suspect #242

Merged
merged 1 commit into from
Sep 21, 2021


Austinpayne
Contributor

failedRemote only checks for TCP failures, but probeNode uses
UDP. As a result, if a UDP write fails with "network is
unreachable", the node is never marked as suspect.

This failure mode can be reproduced by bringing down a network
interface: the affected node keeps showing its neighbors as
alive, while all other nodes correctly detect that node as
failed.

For reproduction steps see: https://gist.github.com/Austinpayne/defc5178bc2cc5c29ff5ada3dc4b1260
For verifying the fix see: https://gist.github.com/Austinpayne/151876ba8841be46484ce71664ced6bc

@hashicorp-cla

hashicorp-cla commented Aug 24, 2021

CLA assistant check
All committers have signed the CLA.

@Austinpayne
Contributor Author

Fixes hashicorp/serf#627

cc: @i0rek @rboyer @mkeeler

@kisunji kisunji requested a review from kyhavlov September 7, 2021 19:08
Contributor

@kyhavlov kyhavlov left a comment


Hi @Austinpayne, this looks great, thanks for the PR and the clean reproduction setup!

@kyhavlov kyhavlov merged commit 7e2219b into hashicorp:master Sep 21, 2021
@Austinpayne
Contributor Author

Austinpayne commented Sep 22, 2021

Thanks @kyhavlov. Should I open a PR to bump the memberlist version in the serf repo as well?

@Austinpayne Austinpayne deleted the fix/udp-net-unreachable-pr branch September 22, 2021 01:00
@edsharp

edsharp commented Oct 12, 2021

Awesome - I've made a local build of serf against 7e2219b and it fixes hashicorp/serf#627 for me :)

Can I help with tagging memberlist and tagging a new serf?
