Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

command/agent: Add simple HTTP check type #592

Merged
merged 7 commits into from
Jan 13, 2015
Merged

command/agent: Add simple HTTP check type #592

merged 7 commits into from
Jan 13, 2015

Conversation

nicholascapo
Copy link

We are running 20+ services that all register using consul. Consul then checks each service using curl every second (1s). On some of our lower power (dev and test) machines we are seeing a lot of CPU load from consul forking curl 20 times per second.

Add a simple HTTP check type that makes an HTTP GET request every Interval to the specified URL.
The status of the service depends on the HTTP Response Code.
200 is passing, 503 is warning and anything else is failing.

An HTTP based check looks like this:
{
"check": {
"id": "api",
"name": "HTTP API on port 5000 with a /health route",
"http": "http://localhost:5000/health",
"interval": "10s"
}
}

These checks make an `HTTP GET` request every Interval to the specified URL.
The status of the service depends on the HTTP Response Code.
`200` is passing, `503` is warning and anything else is failing.
@blalor
Copy link
Contributor

blalor commented Jan 9, 2015

I believe this type of check was considered in the early days of consul and rejected. I forget on what grounds. I would personally like to see it added. I ended up writing a utility to poll http end points and submit check results to the agent.

@armon
Copy link
Member

armon commented Jan 12, 2015

I think is a good idea. @blalor it's possible this was rejected early on, but I've had a change of heart on it.
Some feedback I have:

  • Any 2XX status code should be treated as OK
  • Let's use 429 "Too Many Requests" for WARN
  • Anything else is critical.
  • Tests for the new functionality would be great :)
  • Super pedantic, some log statements are missing the appropriate "agent: " prefix.

@armon
Copy link
Member

armon commented Jan 12, 2015

My reasoning for not using 503 is that in many application servers already implement to indicate various critical conditions. e.g. Phusion or uWSGI failing to start the underlying Rails/Django app causes 503. Using 429 requires an app to actually implement it since it's generally not a builtin default, so you cannot accidentally get into the warning state.

@sean-
Copy link
Contributor

sean- commented Jan 12, 2015

(small request: the ability to see the entire HTTP Request/Response so it can be compared or replayed with telnet(1)).

@ryanuber
Copy link
Member

I agree with @armon on using a 429 for warnings. It also agree with @sean- that writing the response into the output field would be helpful, and it's probably not too complicated to implement. Looking good!

@nicholascapo
Copy link
Author

Fixed: Any 2XX status code should be treated as OK
Fixed: Let's use 429 "Too Many Requests" for WARN
Fixed: Anything else is critical.
Fixed: Tests for new functionality
Fixed: Super pedantic, some log statements are missing the appropriate "agent: " prefix.
Fixed: HTTP Request/Response in output

Example debug log:
2015/01/13 00:28:24 [DEBUG] agent: http check 'service:foo' is passing: HTTP GET http://127.0.0.1:60522/health: 200 OK Output: {"JSON data from service"}

Example from http://localhost:8500/v1/health/state/passing
{
"Node": "bar",
"CheckID": "service:foo",
"Name": "Service 'foo' check",
"Status": "passing",
"Notes": "",
"Output": "HTTP GET http://127.0.0.1:37728/health: 200 OK Output: {"JSON data from service"}",
"ServiceID": "foo",
"ServiceName": "foo"
}

@armon
Copy link
Member

armon commented Jan 13, 2015

LGTM! Thanks!

armon added a commit that referenced this pull request Jan 13, 2015
command/agent: Add simple HTTP check type
@armon armon merged commit 713d30c into hashicorp:master Jan 13, 2015
@nicholascapo nicholascapo deleted the check-http branch January 13, 2015 21:46
duckhan pushed a commit to duckhan/consul that referenced this pull request Oct 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants