Calling /etc/init.d/consul restart sometimes ends up with a stopped … #427
…consul process. The reason was that the init script would sometimes start consul before the old process had completely stopped.
This can be easily reproduced by running:
watch -n 0.5 -e '/etc/init.d/consul restart; sleep 0.1; /etc/init.d/consul status'
Adding the --retry parameter lets start-stop-daemon wait until the consul process is actually stopped.
The underlying issue has the potential to take down 20 percent of all consul agents when applying a config change across an entire environment. When hundreds of servers are involved, dozens of them can easily end up stopped instead of restarted, which is why I believe this fix is quite important.
libraries/consul_config.rb
@@ -27,6 +27,9 @@ class ConsulConfig < Chef::Resource
# @!attribute config_dir
# @return [String]
attribute(:config_dir, kind_of: String, default: lazy { node['consul']['service']['config_dir'] })
# @!attribute config_dir_mode
# @return [String]
attribute(:config_dir_mode, kind_of: String, default: '0755')
Other than the one suggestion, this looks good to me. I've actually seen this behaviour on a few of our nodes. Thanks for the PR.
Codecov Report
@@ Coverage Diff @@
## master #427 +/- ##
=========================================
Coverage ? 59.49%
=========================================
Files ? 7
Lines ? 358
Branches ? 0
=========================================
Hits ? 213
Misses ? 145
Partials ? 0
Continue to review full report at Codecov.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
…consul process. The reason is that the init script sometimes starts consul before the old process has completely stopped. During a cookbook upgrade from version 2.1.3 to 2.2.0 this took down the consul agent on a number of our servers.
This can be easily reproduced by running:
watch -n 0.5 -e '/etc/init.d/consul restart; sleep 0.1; /etc/init.d/consul status'
Adding the --retry parameter lets start-stop-daemon wait until the consul process is actually stopped.
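
To illustrate what waiting for the process to stop means, here is a minimal plain-shell sketch of the same idea: poll the PID until it is gone or a timeout expires, and only then start the new process. The function name wait_until_stopped, the timeout handling, and the commented start-stop-daemon schedule (TERM/30/KILL/5) and $PIDFILE variable are assumptions for illustration, not the contents of the actual init script.

```shell
#!/bin/sh
# Hypothetical helper, not part of the cookbook's init script.
# Usage: wait_until_stopped PID TIMEOUT_SECONDS
# Returns 0 once PID no longer exists, 1 if it is still alive after the timeout.
wait_until_stopped() {
    pid=$1
    remaining=$2
    # kill -0 sends no signal; it only checks whether the process can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# With start-stop-daemon the same wait is delegated to --retry, roughly like
# this (schedule values and $PIDFILE are assumed, adjust to the real script):
#   start-stop-daemon --stop --quiet --retry TERM/30/KILL/5 --pidfile "$PIDFILE"
```

A restart built on this would stop the daemon, call wait_until_stopped with its old PID, and start the new process only when the call returns 0, which closes the window the watch loop above runs into.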