http service check fails every third request - causes service to flap #779
Comments
Hey @darron, thanks for the report. This is sounding like it might be a keep-alive issue. Is the server you are talking to configured with a keep-alive timeout of < 30s? The default Go HTTP client uses a 30s keep-alive. We probably need to adjust this on the Consul side, but I just want to nail the issue down further before going down that road. Also, if you could provide the check configuration in its original JSON form, and the curl command you used, that would be helpful. Tagging as a bug, Thanks!
@ryanuber We should just disable keep-alive, it seems sane for doing localhost checks to just re-dial.
@armon +1
It's a really standard nginx container - I'm just away from my laptop and
FYI - @ryanuber - this is the config: https://github.com/octohost/nginx/blob/master/nginx.conf - 15 second keepalive timeout. Was setting this JSON:
With this command - $2 is the JSON that's passed:
It goes from here: https://github.com/octohost/octohost/blob/master/bin/octo#L421 To here:
I have all sorts of containers though, so what @armon and @ryanbreen +1'd would likely be best. It would be a bit futile to make sure every container had a long enough keepalive. Thanks for taking a look at this!
Sweet - I compiled from master and it doesn't error anymore. Thanks!
Thanks for confirming! Closing.
I have been noticing a ton of failed requests when using the http service check instead of the script based service check - here are some logs from a clean node:
https://gist.github.com/darron/d4100cb1dfb8b8bf3fce
What's strange is that every third check fails for each service I register using the new http type service check. That is causing nginx to flap as Consul Template re-writes the template over and over.
Once I changed the service check to use the curl based check - at the same frequency - not a single failure.
Here's the logs from the times we were setting the service and checks:
Here's the bash where we're setting the service:
https://github.com/octohost/octohost/blob/master/bin/octo#L396-L426
After I adjusted the definition for the last one at 18:34 - no problems at all - no more flapping. The underlying nginx container seems to be stable.
NOTE: I have duplicated this on Digital Ocean and GCE nodes.