Start Nomad Servers with Monit #836

Merged 3 commits into dev from mis/robust-nomad on Nov 13, 2018

Miserlou
Contributor

Issue Number

hashicorp/nomad#4864

Purpose/Implementation Notes

Part 1 of hacks to get around Nomad being a pile of hot garbage.

Methods

Nomad will die sometimes. This makes sure that if it goes down, Monit brings it back up.

This isn't really a solution since the underlying problem is still there, but at least it'll come back up if it can - which it probably can't.

From the monit log, you can see it working:

[UTC Nov 13 20:40:10] info     : 'ip-10-0-122-116.ec2.internal' Monit 5.16 started
[UTC Nov 13 20:45:10] error    : 'nomad' '/bin/bash' failed with exit status (1) -- Error querying jobs: Get http://127.0.0.1:4646/v1/jobs: dial tcp 127.0.0.1:4646: getsockopt: connection refused
[UTC Nov 13 20:45:10] info     : 'nomad' trying to restart
[UTC Nov 13 20:45:10] info     : 'nomad' start: /home/ubuntu/kill_restart_nomad.sh

It has now passed the 5-minute health window and the health check succeeded, so Monit has not restarted the service. nomad status works again. If I manually kill Nomad, Monit starts it again.
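For readers who want to see the shape of the health check, here is a minimal sketch of a probe that a supervisor like Monit could run. It is not the script from this PR: the function names and the NOMAD_BIN variable are hypothetical, and the only thing taken from the description above is that the check relies on nomad status succeeding or failing.

```shell
#!/bin/bash
# Hypothetical health probe, not the script shipped in this PR.
# NOMAD_BIN is an assumed override so the probe can be pointed at a stub.
NOMAD_BIN="${NOMAD_BIN:-nomad}"

# Succeeds only if the local Nomad agent answers a status query.
nomad_is_healthy() {
    "$NOMAD_BIN" status >/dev/null 2>&1
}

# Prints a one-line verdict and returns 0 (healthy) or 1 (unhealthy),
# so a supervisor can act on the exit status alone.
probe_nomad() {
    if nomad_is_healthy; then
        echo "nomad: ok"
        return 0
    fi
    echo "nomad: unreachable"
    return 1
}
```

Calling probe_nomad at the end of such a script would give the supervisor a non-zero exit status when the agent refuses connections, which matches the "failed with exit status (1)" line in the log above. What the restart script itself does is not shown in this PR description, so it is left out here.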

Types of changes

  • New feature (non-breaking change which adds functionality)

Contributor

kurtwheeler left a comment

LGTM

Miserlou merged commit ef227d6 into dev Nov 13, 2018
Miserlou deleted the mis/robust-nomad branch November 13, 2018 21:46
kurtwheeler pushed a commit that referenced this pull request Jan 10, 2019