-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds healthcheck for checking the status of the container #182
Adds healthcheck for checking the status of the container #182
Conversation
Signed-off-by: Vaibhav Khurana <vaibhav.khurana@razorpay.com>
6fdd9b5
to
7830fea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how can this be tested?
Dockerfile
Outdated
@@ -4,4 +4,5 @@ LABEL maintainer="The Prometheus Authors <prometheus-developers@googlegroups.com | |||
COPY statsd_exporter /bin/statsd_exporter | |||
|
|||
EXPOSE 9102 9125 9125/udp | |||
HEALTHCHECK CMD curl --fail http://localhost:9102/metrics || exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately curl
is not available in the image, maybe this works?
HEALTHCHECK CMD curl --fail http://localhost:9102/metrics || exit 1 | |
HEALTHCHECK CMD wget --spider http://localhost:9102/metrics || exit 1 |
(this is busybox wget)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work, as it only checks the availability of the page, and statsd redirect it back to the default page. Will check and update the best possible way for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you find a way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the timeout to the command, this actually works. Not sure how to reproduce the issue but, i faced the same today and this worked.
vaibhav ~ $ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1baa9f5abb44 prom/statsd-exporter:v0.6.0 "/bin/statsd_exporte…" 5 hours ago Up 5 hours statsd-exporter
vaibhav ~ $ sudo docker exec -it 1baa9f5abb44 sh
/ # wget --spider -S "http://localhost:9102/metrics" -T 60 2>&1
Connecting to localhost:9102 (127.0.0.1:9102)
HTTP/1.1 500 Internal Server Error
wget: server returned error: HTTP/1.1 500 Internal Server Error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well i still am not able to find why this is happening, can you suggest some way to debug the root cause as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically, this means that we are trying to serve an invalid combination of metrics. Please check existing issues, and if there isn't one that matches, open a new issue with
- your mapping configuration
- a sample of inputs that reproduces the issue (debug logging may help see what happens just before it starts failing)
- the full output of the
/metrics
endpoint when it is failing – this usually gives a clue as to what is invalid
Signed-off-by: Vaibhav Khurana <vaibhav.khurana@razorpay.com>
c1eda13
to
88af8dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Signed-off-by: Matthias Rampke <mr@soundcloud.com>
* [CHANGE] Do not run as root in the Docker container by default ([#202](#202)) * [FEATURE] Add metric for count of events by action ([#193](#193)) * [FEATURE] Add metric for count of distinct metric names ([#200](#200)) * [FEATURE] Add UNIX socket listener support ([#199](#199)) * [FEATURE] Accept Datadog [distributions](https://docs.datadoghq.com/graphing/metrics/distributions/) ([#211](#211)) * [ENHANCEMENT] Add a health check to the Docker container ([#182](#182)) * [ENHANCEMENT] Allow inconsistent label sets ([#194](#194)) * [ENHANCEMENT] Speed up sanitization of metric names ([#197](#197)) * [ENHANCEMENT] Enable pprof endpoints ([#205](#205)) * [ENHANCEMENT] DogStatsD tag parsing is faster ([#210](#210)) * [ENHANCEMENT] Cache mapped metrics ([#198](#198)) * [BUGFIX] Fix panic if a mapping resulted in an empty name ([#192](#192)) * [BUGFIX] Ensure that there are always default quantiles if using summaries ([#212](#212)) * [BUGFIX] Prevent ingesting conflicting metric types that would make scraping fail ([#213](#213)) With #192, the count of events rejected because of negative counter increments has moved into the `statsd_exporter_events_error_total` metric, instead of being lumped in with the different kinds of successful events. Signed-off-by: Matthias Rampke <mr@soundcloud.com>
What this PR does / why we need it:
Adds a healthcheck for checking the status of the metrics endpoint.
Which issue(s) this PR fixes : This will help us recover in sitiuations like getting Internal Server Error and the container being up.
@matthiasr Please review this. Thanks!