docs/blog: Write a HOWTO for using Nagios checks as Consul check scripts #967

Closed
tehranian opened this issue May 21, 2015 · 4 comments
Labels
type/docs Documentation needs to be created/updated/clarified

@tehranian

Here's an idea for a useful Consul HOWTO guide (blog post), via the conversation at: https://twitter.com/hashicorp/status/601173030937763840

Consul check scripts use the same return-code interface as Nagios, and a plethora of existing Nagios check scripts are already available as OS packages. A Consul health monitoring HOWTO could show how to reuse them:
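To make the shared return-code interface concrete, here is a minimal sketch of how a script check's exit code maps to a Consul health state. The helper names `consul_status` and `run_check` are hypothetical, made up for illustration; the mapping follows the documented convention that 0 is passing, 1 is warning, and any other exit code is critical.

```python
import subprocess


def consul_status(exit_code: int) -> str:
    """Return the Consul health state for a script check's exit code."""
    if exit_code == 0:
        return "passing"
    if exit_code == 1:
        return "warning"
    # Every other exit code counts as a failure, so Nagios CRITICAL (2)
    # and UNKNOWN (3) both end up as "critical" in Consul.
    return "critical"


def run_check(argv: list[str]) -> tuple[str, str]:
    """Run a plugin and return (Consul status, combined stdout/stderr)."""
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return consul_status(proc.returncode), proc.stdout.strip()
```

For example, `run_check(["/usr/lib/nagios/plugins/check_load", "-w", "16,10,4", "-c", "20,16,6"])` would report how Consul is expected to classify that plugin's result on the current host.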

Blog Post Idea

  • "apt-get install nagios-plugins-standard" on an Ubuntu 14.04 host. See: https://packages.debian.org/wheezy/net/nagios-plugins-standard
  • See that all sorts of useful check script executables get installed to "/usr/lib/nagios/plugins/". Ex: "check_apt", "check_disk", "check_ntp_time", etc.
  • Write Consul check script definitions leveraging those Nagios check scripts.
  • Save your time, effort, and sanity by leveraging years of other people's work. Get up & running with Consul health checks quickly & easily.
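The steps above could be sketched roughly as follows: a small generator that wraps installed Nagios plugins in Consul check definitions. This is only an illustration under stated assumptions. The `nagios_check` helper, the check names, and the `pool.ntp.org` host are hypothetical, and the definitions use the older `script` key; newer Consul releases, as far as I know, expect an `args` array and require script checks to be enabled explicitly, so check the documentation for your version.

```python
import json
import shlex

PLUGIN_DIR = "/usr/lib/nagios/plugins"  # install path used by the package


def nagios_check(name, plugin, args=(), interval="60s", timeout="5s"):
    """Build one Consul check definition that runs a Nagios plugin."""
    # Quote each argument so the shell passes it to the plugin unchanged.
    command = " ".join([f"{PLUGIN_DIR}/{plugin}", *map(shlex.quote, args)])
    return {
        "name": name,
        "script": command,  # legacy key; an assumption about the Consul version
        "interval": interval,
        "timeout": timeout,
    }


config = {
    "checks": [
        nagios_check("disk", "check_disk", ["-w", "10%", "-c", "5%"]),
        # The NTP host is a placeholder for illustration only.
        nagios_check("ntp", "check_ntp_time", ["-H", "pool.ntp.org"]),
        nagios_check("apt", "check_apt", interval="1h", timeout="30s"),
    ]
}

print(json.dumps(config, indent=2))
```

The printed JSON could be saved in the agent's configuration directory; the intervals and thresholds are arbitrary starting points.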
@dublx

dublx commented May 28, 2015

I think a simple example would be great to have in this blog post.

@ryanuber
Member

Definitely a good idea, thanks for the suggestion!

@ryanuber added the type/enhancement and type/docs labels on May 28, 2015
@slackpad removed the type/enhancement label on May 2, 2017
@slackpad added this to the Unplanned milestone on Jan 5, 2018
@lsalvadorini-c4w

Interesting, I use this sort of setup on my hosts, using a configuration like this:

{
  "checks": [
    {
      "name": "dns",
      "script": "/usr/local/bin/dns-checks",
      "interval": "60s",
      "timeout": "1s",
      "status": "passing"
    },
    {
      "name": "disks",
      "script": "/usr/lib/nagios/plugins/check_disk --unit bytes -w 10% -c 5% --freespace-ignore-reserved --exclude-type=tmpfs --exclude-type=aufs --exclude-type=shm -A -I '^/dev$' -I '^/var/lib/docker/' 2>&1 | /usr/local/bin/check-result-to-graphite disk",
      "interval": "60s",
      "timeout": "5s",
      "status": "passing"
    },
    {
      "name": "load-average",
      "script": "/usr/lib/nagios/plugins/check_load -w 16,10,4 -c 20,16,6 2>&1 | /usr/local/bin/check-result-to-graphite load",
      "interval": "60s",
      "timeout": "5s",
      "status": "passing"
    },
    {
      "name": "docker-generic",
      "script": "/usr/local/bin/check_docker --uptime 0:60 --restarts 1:10 --memory 80:90:% --cpu 80:90:% --status running 2>&1 | /usr/local/bin/check-result-to-graphite docker",
      "interval": "180s",
      "timeout": "60s",
      "status": "passing"
    },
    {
      "name": "docker-ops",
      "script": "/usr/local/bin/check_docker --present --status running --containers registrator fabiolb consul-alerts 2>&1 | /usr/local/bin/check-result-to-graphite docker",
      "interval": "120s",
      "timeout": "20s",
      "status": "passing"
    }
  ]
}

I also pipe the output of the Nagios checks to a custom script, for two purposes:

  • format the check output in a human-readable way (one check status per line instead of a single line) so it's simple to read in the Consul web UI
  • send Nagios performance data to Graphite
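The custom script itself is not shown in this thread, so the following is only a guess at its core logic, not the actual `check-result-to-graphite`. It assumes single-line plugin output of the form `text | perfdata`, perfdata tokens shaped like `label=value[UOM];warn;crit;min;max`, status items separated by `;`, and Graphite's plaintext format (`path value timestamp`). All function names and the metric prefix are hypothetical.

```python
import re
import shlex
import time

VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)")  # leading number, unit ignored


def split_output(raw):
    """Split plugin output into (status text, perfdata string)."""
    text, _, perfdata = raw.partition("|")
    return text.strip(), perfdata.strip()


def readable(text):
    """Put each ';'-separated status item on its own line."""
    return "\n".join(part.strip() for part in text.split(";") if part.strip())


def graphite_lines(perfdata, prefix, now=None):
    """Convert perfdata tokens into Graphite plaintext lines."""
    timestamp = int(time.time() if now is None else now)
    lines = []
    # shlex.split keeps single-quoted labels containing spaces together.
    for token in shlex.split(perfdata):
        label, sep, fields = token.rpartition("=")
        match = VALUE_RE.match(fields)
        if not sep or not match:
            continue  # skip anything that is not label=value
        # Graphite paths use dots as separators, so sanitize the label.
        name = re.sub(r"[^A-Za-z0-9_-]+", "_", label).strip("_") or "root"
        lines.append(f"{prefix}.{name} {match.group(1)} {timestamp}")
    return lines
```

One caveat with this kind of pipeline: the shell reports the exit status of the last command, so a wrapper like this would presumably also need to exit with the plugin's own status (for instance by reading the leading OK/WARNING/CRITICAL keyword) for Consul to see the right health state.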

The setup is useful in my use case, but it has a drawback: we use Consul DNS for service discovery, and under heavy load, when a check becomes critical, the node turns red and all of its services become critical too. This is not always desirable, because standalone services become unavailable from Consul DNS. I think a good way to improve the setup would be to assign all the checks to a fake service.
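A rough sketch of that last idea, assuming Consul's `service_id` field on check definitions and a top-level `services` list in the agent configuration. The `bind_checks_to_service` helper and the `host-health` service name are hypothetical, and this has not been tested against a live agent.

```python
import copy
import json


def bind_checks_to_service(config, service_name):
    """Return a copy of config whose checks all belong to one service."""
    bound = copy.deepcopy(config)
    # Register the placeholder service; its ID defaults to its name.
    bound.setdefault("services", []).append({"name": service_name})
    for check in bound.get("checks", []):
        # A check with a service_id is a service-level check, so a failure
        # marks only that service as critical rather than the whole node.
        check["service_id"] = service_name
    return bound


config = {
    "checks": [
        {
            "name": "disks",
            "script": "/usr/lib/nagios/plugins/check_disk -w 10% -c 5%",
            "interval": "60s",
        }
    ]
}

print(json.dumps(bind_checks_to_service(config, "host-health"), indent=2))
```

With the checks scoped this way, the other services on the node should stay resolvable through Consul DNS when a host-level check fails, while the placeholder service still shows the problem.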

@hanshasselberg
Member

Thank you for reporting! Closing this issue since it is not a real issue.
