Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds healthcheck for checking the status of the container #182

Merged
merged 2 commits into from
Mar 13, 2019

Conversation

vaibhavkhurana2018
Copy link
Contributor

What this PR does / why we need it:

Adds a healthcheck for checking the status of the metrics endpoint.

Which issue(s) this PR fixes : This will help us recover in sitiuations like getting Internal Server Error and the container being up.

@matthiasr Please review this. Thanks!

Signed-off-by: Vaibhav Khurana <vaibhav.khurana@razorpay.com>
Copy link
Contributor

@matthiasr matthiasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can this be tested?

Dockerfile Outdated
@@ -4,4 +4,5 @@ LABEL maintainer="The Prometheus Authors <prometheus-developers@googlegroups.com
COPY statsd_exporter /bin/statsd_exporter

EXPOSE 9102 9125 9125/udp
HEALTHCHECK CMD curl --fail http://localhost:9102/metrics || exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately curl is not available in the image, maybe this works?

Suggested change
HEALTHCHECK CMD curl --fail http://localhost:9102/metrics || exit 1
HEALTHCHECK CMD wget --spider http://localhost:9102/metrics || exit 1

(this is busybox wget)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not work, as it only checks the availability of the page, and statsd redirect it back to the default page. Will check and update the best possible way for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you find a way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the timeout to the command, this actually works. Not sure how to reproduce the issue but, i faced the same today and this worked.

vaibhav ~ $ sudo docker ps
CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS               NAMES
1baa9f5abb44        prom/statsd-exporter:v0.6.0   "/bin/statsd_exporte…"   5 hours ago         Up 5 hours                              statsd-exporter
vaibhav ~ $ sudo docker exec -it 1baa9f5abb44 sh
/ # wget --spider -S "http://localhost:9102/metrics" -T 60 2>&1
Connecting to localhost:9102 (127.0.0.1:9102)
  HTTP/1.1 500 Internal Server Error
wget: server returned error: HTTP/1.1 500 Internal Server Error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well i still am not able to find why this is happening, can you suggest some way to debug the root cause as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically, this means that we are trying to serve an invalid combination of metrics. Please check existing issues, and if there isn't one that matches, open a new issue with

  • your mapping configuration
  • a sample of inputs that reproduces the issue (debug logging may help see what happens just before it starts failing)
  • the full output of the /metrics endpoint when it is failing – this usually gives a clue as to what is invalid

Signed-off-by: Vaibhav Khurana <vaibhav.khurana@razorpay.com>
Copy link
Contributor

@matthiasr matthiasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@matthiasr matthiasr merged commit 680afe2 into prometheus:master Mar 13, 2019
matthiasr pushed a commit that referenced this pull request Mar 13, 2019
Signed-off-by: Matthias Rampke <mr@soundcloud.com>
matthiasr pushed a commit that referenced this pull request May 15, 2019
* [CHANGE] Do not run as root in the Docker container by default ([#202](#202))
* [FEATURE] Add metric for count of events by action ([#193](#193))
* [FEATURE] Add metric for count of distinct metric names ([#200](#200))
* [FEATURE] Add UNIX socket listener support ([#199](#199))
* [FEATURE] Accept Datadog [distributions](https://docs.datadoghq.com/graphing/metrics/distributions/) ([#211](#211))
* [ENHANCEMENT] Add a health check to the Docker container ([#182](#182))
* [ENHANCEMENT] Allow inconsistent label sets ([#194](#194))
* [ENHANCEMENT] Speed up sanitization of metric names ([#197](#197))
* [ENHANCEMENT] Enable pprof endpoints ([#205](#205))
* [ENHANCEMENT] DogStatsD tag parsing is faster ([#210](#210))
* [ENHANCEMENT] Cache mapped metrics ([#198](#198))
* [BUGFIX] Fix panic if a mapping resulted in an empty name ([#192](#192))
* [BUGFIX] Ensure that there are always default quantiles if using summaries ([#212](#212))
* [BUGFIX] Prevent ingesting conflicting metric types that would make scraping fail ([#213](#213))

With #192, the count of events rejected because of negative counter increments has moved into the `statsd_exporter_events_error_total` metric, instead of being lumped in with the different kinds of successful events.

Signed-off-by: Matthias Rampke <mr@soundcloud.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants