Not sure if this is a design issue or a simple bug:
If you create a cluster (in my case three server nodes) with a bunch of client nodes, then remove and recreate all server nodes, basically starting a clean cluster, all clients start flapping like this:
I need to recreate the clients to make them reconnect properly. I'm not sure if this is required by design, but I assumed the client nodes are rather dumb: they just connect to the server nodes and use their state, so there should be no way the local state could conflict with the server state. If that isn't the case, is there more specific documentation around such operational concerns?
Consul clients do carry some state locally, which includes information about the cluster as well as the state of local services and checks. The logs you shared above are the gossip layer detecting failures on its peers. Without knowing which nodes were clients and which were servers, there's not much else I can derive from them, but if the failed nodes were the servers you stopped then this would be expected.
The clients should reconnect, though. Are there any differences between the new servers and the old ones (IPs, hostnames, firewalls, etc.)? You might be hitting #457, since the Raft layer currently does not gracefully handle IP address changes. Can you share your configuration file(s)?
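For what it's worth, one way to make clients tolerant of servers being replaced is to join via a stable DNS name rather than fixed IPs, so the agent retries the join against whatever address the name currently resolves to. A minimal sketch of such a client config, assuming a placeholder hostname `consul-server.internal.example` (substitute your own):

```json
{
  "retry_join": ["consul-server.internal.example"],
  "retry_interval": "30s"
}
```

`retry_join` keeps retrying the join on failure instead of giving up after the first attempt, which matters when the servers come back under new addresses.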
You will run into #839 with the current 0.5 release, but 0.5.1 will fix this. Basically the clients will not re-sync their services and checks to the global catalog.
In my case the server IPs changed; I expected Consul to resolve the provided server address again, but it didn't. #839 sounds like it will fix that issue, though I'm not sure whether I'd still need to get rid of the clients' local state about cluster members. Anyway, this can be considered a dup of #839, so I'll close it.
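As a stopgap until then, the workaround that avoids fully recreating the clients is to stop the agent and clear its gossip snapshots, so it forgets the old member list on restart. This is a sketch under the assumption that the agent's `-data-dir` keeps serf state in `serf/local.snapshot` and `serf/remote.snapshot` (check your own data dir before deleting anything); `/tmp/consul-demo-data` below is just a demo path:

```shell
# Demo data dir standing in for the agent's real -data-dir.
DATA_DIR=/tmp/consul-demo-data
mkdir -p "$DATA_DIR/serf"
touch "$DATA_DIR/serf/local.snapshot" "$DATA_DIR/serf/remote.snapshot"

# Stop the agent first (e.g. systemctl stop consul), then remove the
# serf snapshots so the client forgets the stale cluster members:
rm -f "$DATA_DIR/serf/local.snapshot" "$DATA_DIR/serf/remote.snapshot"

# On restart the agent rejoins from its configured join addresses
# instead of the dead peers it had snapshotted.
ls "$DATA_DIR/serf"
```

This keeps the rest of the local state (services, checks) intact, only the remembered gossip peers are dropped.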