Nodes and their services keep appearing and disappearing from the catalog #5518
Labels
type/bug
Feature does not function as expected
Comments
ShimmerGlass pushed a commit to ShimmerGlass/consul that referenced this issue on Mar 20, 2019
When receiving a serf failed message for a node which is not in the catalog, do not perform a register request to set its serf health to critical, as it could overwrite the node information and services if it was renamed. Fixes: hashicorp#5518
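The guard described in this commit message can be sketched roughly as below. This is a minimal illustration and not the code from the actual patch: the `catalog` type and the `handleFailedMember` function are hypothetical names, and the catalog is reduced to a map from node name to serf health status.

```go
package main

import "fmt"

// catalog is a hypothetical stand-in for the server catalog:
// node name -> serf health status.
type catalog map[string]string

// handleFailedMember models the leader's reaction to a serf "failed" event.
// It marks the node critical only if that node name is already in the
// catalog, and reports whether it wrote anything.
func handleFailedMember(c catalog, node string) bool {
	if _, known := c[node]; !known {
		// Unknown name (for example the old name of a renamed node):
		// skip the register request so nothing gets overwritten.
		return false
	}
	c[node] = "critical"
	return true
}

func main() {
	c := catalog{"bar": "passing"}
	fmt.Println(handleFailedMember(c, "foo"), c) // stale name: no write
	fmt.Println(handleFailedMember(c, "bar"), c) // known name: marked critical
}
```

The point of the sketch is only the early return: a failed event for a name the catalog does not know is ignored instead of triggering a registration.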
ShimmerGlass pushed a commit to criteo-forks/consul that referenced this issue on Mar 21, 2019
When receiving a serf failed message for a node which is not in the catalog, do not perform a register request to set its serf health to critical, as it could overwrite the node information and services if it was renamed. Fixes: hashicorp#5518
Just realized the PR was not properly linked to this issue: #5520
ShimmerGlass pushed a commit to criteo-forks/consul that referenced this issue on Apr 4, 2019
When receiving a serf failed message for a node which is not in the catalog, do not perform a register request to set its serf health to critical, as it could overwrite the node information and services if it was renamed. Fixes: hashicorp#5518
Overview of the Issue
We are seeing nodes and their services disappear from and reappear in the catalog every few minutes without any change on our side (no API calls on agents or servers, no agent reload). Affected nodes keep appearing and disappearing until we take action.
When this happens, the health checks registered on the node do not change status: they stay passing until they are deregistered, and are passing again once they are re-registered. There are no unusual logs on either the affected nodes' agents or the servers.
When debugging this issue, we found a scenario that can explain it:
- A node is registered with the name foo and a stable node-id that will not change.
- The node is renamed to bar and keeps the same node-id.
- The serf member foo is kept and a new one is created with the name bar.
- bar is deleted and foo is registered again along with its services and checks.
Restarting the leader fixes the issue. We then tested our theory by force-leaving the node's old name, without any restart on the agent or server side. This fixed the issue as well, confirming our theory.
Consul info for both Client and Server
Consul version 1.3.1 with patches, on CentOS 7, on both servers and clients