[service] Avoid failures of service resource with frequent restarts #469

Merged on Sep 21, 2017 (1 commit)

olivielpeau
Member

Fixes #467 (see description there)

This happens on systemd-based systems because systemd applies its
limit on the number of starts of a service (by default, 5 starts
within 10 seconds) to both user-requested restarts and the ones
systemd performs on its own (for instance, when the service fails to start).
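
To make the limit concrete, here is a minimal Ruby sketch of a start rate limit with the defaults quoted above (5 starts per 10 seconds). The `StartLimiter` class is hypothetical and uses a simple sliding window; it is only an illustration of the idea, not systemd's actual accounting.

```ruby
# Simplified model of a start rate limit: at most `burst` starts are
# allowed within any `interval`-second window. Hypothetical helper,
# not systemd's real algorithm.
class StartLimiter
  def initialize(burst: 5, interval: 10)
    @burst = burst
    @interval = interval
    @starts = [] # timestamps (in seconds) of the starts still in the window
  end

  # Returns true if a start at time `now` is allowed, false if the
  # limit has been reached.
  def start(now)
    # Forget starts that have fallen out of the window.
    @starts.reject! { |t| now - t >= @interval }
    return false if @starts.size >= @burst

    @starts << now
    true
  end
end

limiter = StartLimiter.new

# Six restarts one second apart: the sixth one is refused.
results = (0...6).map { |second| limiter.start(second) }

# Once the oldest start leaves the window, a start is allowed again.
later = limiter.start(10)

puts results.inspect
puts later
```

In this model it does not matter whether a start was requested by the user or by an automatic retry; both consume the same budget, which is the behavior described above.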

The root of the problem is that the service resource in `datadog_monitor`
is different from the one in the main Chef run (a Chef limitation), so
the restarts it triggers happen immediately instead of being queued up
until the end of the global Chef run. If multiple invocations of this
resource are updated in a single Chef run, systemd's restart limit can
be reached quickly.
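
The difference between immediate and queued restarts can be sketched with a toy model in plain Ruby. `ToyRun` and the monitor names are hypothetical and only mimic the idea of collapsing queued notifications into one restart at the end of the run; this is not Chef's implementation.

```ruby
# Toy model of a configuration run, used only to count restarts.
class ToyRun
  attr_reader :restarts

  def initialize
    @restarts = 0
    @delayed = []
  end

  # Immediate: restart the service as soon as a resource is updated.
  def notify_immediately(service)
    restart(service)
  end

  # Delayed: remember the service once and restart it when the run ends.
  def notify_delayed(service)
    @delayed << service unless @delayed.include?(service)
  end

  # End of the run: perform each queued restart exactly once.
  def finish
    @delayed.each { |service| restart(service) }
    @delayed.clear
  end

  private

  def restart(_service)
    @restarts += 1
  end
end

# Hypothetical monitor configurations updated during one run.
monitors = %w[nginx redis postgres mysql haproxy kafka]

immediate = ToyRun.new
monitors.each { |_monitor| immediate.notify_immediately('datadog-agent') }
immediate.finish

delayed = ToyRun.new
monitors.each { |_monitor| delayed.notify_delayed('datadog-agent') }
delayed.finish

puts "immediate notifications: #{immediate.restarts} restarts"
puts "delayed notifications:   #{delayed.restarts} restart"
```

With six updated monitors, the immediate variant restarts the agent six times in quick succession, which already exceeds a budget of 5 starts per 10 seconds, while the queued variant restarts it once.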

A better fix would be to remove the service definition from
`datadog_monitor` and make all invocations of `datadog_monitor`
notify the global `service[datadog-agent]` resource. This would be a
breaking change, so let's do it for the next major version.
@olivielpeau olivielpeau added this to the 2.11.0 milestone Sep 21, 2017
@olivielpeau olivielpeau merged commit 4c201c5 into master Sep 21, 2017
@olivielpeau olivielpeau deleted the olivielpeau/systemd-restart-retries branch September 21, 2017 15:50