Metric serf.coordinate.adjustment-ms negative #487
Comments
Hey @chemicL thanks for the report - I don't recognize this as any of the coordinate-related constants so will need to look through the algorithm a bit. When you say "stay" does it get stuck like that essentially forever?
Linking to hashicorp/consul#3023, which might be related.
I confirm it does "get stuck like that essentially forever", producing gigabytes of statsd warning logs for that single metric per day. All with the same value. The metric is reported here: https://github.com/hashicorp/serf/blob/master/serf/ping_delegate.go#L78
@chemicL if you have any nodes in that state can you do a quick check of the v1/agent/self output and see if
This also happens in version 0.7.5, where I don't see
One final thing I just thought of: on your 0.9.3 agent in the bad state, can you snag the state of
Here's the output from one of the problematic agents:
As for the workaround, we'd prefer not to fork another project, but let's see what's possible. I mentioned on Gitter that the filtering is not applied in Consul 0.9.3. Filtering out this metric would allow us to keep on going ;-)
Indeed, sorry about that. This is fixed in 1.0!
Thanks for the data - I'll try to see how things got into this state.
The serf.coordinate.adjustment-ms metric can sometimes stay at a negative value of -9223372013568.000000, which floods our statsd logs with warnings (the message is labeled DEBUG, although it is logged at warning level).
Negative adjustments might be valid; however, it's surprising that different machines report exactly the same negative value throughout the logs.
We're running Consul versions 0.7.5 through 0.9.3 in our environment, and that's where we observed the issue. The metric is present in the serf versions those releases use.