Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add expiration to remove stale metrics #61

Merged
merged 2 commits into from
Mar 11, 2023

Conversation

Quantamm
Copy link

@Quantamm Quantamm commented Mar 3, 2023

Resolves #59

This PR adds an optional expires parameter to the metrics configuration that takes an integer number of seconds, after which, without an update, the metric should be removed.

I'm looking for feedback at this point. I'm aware that I still need to add some debug logging statements and unit test coverage; I'll add those in once I get some agreement that the PR is heading in the generally correct direction.

@poggenpower
Copy link
Collaborator

poggenpower commented Mar 3, 2023

If I understand your use case correct, if the metric expires something most likely went wrong in the process before?!
Wouldn't it be helpful to mark this metric as stale, which make it easy to alert in such situations?
I am pretty open in the implementation, but is should be easy to monitor, maybe an additional health metric.

@Quantamm
Copy link
Author

Quantamm commented Mar 4, 2023

if the metric expires something most likely went wrong in the process before

I've built some USB-powered air quality sensors and when they are unplugged, currently, mqtt_exporter keeps providing the last-known values to Prometheus. As a result, when I graph the values, I'm seeing the last reported value indefinitely.

I can work around it using this Prometheus QL query:

(sensor1_temperature and on() (abs(sensor1_temperature_last_received/1000 - timestamp(sensor1_temperature_last_received)) < 60))

The purpose of this PR would be that you'd get the same thing from:

sensor1_temperature

Comment on lines -454 to -457
class HistogramWrapper():
"""
Wrapper to provide generic interface to Summary metric
"""
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HistogramWrapper is defined on line 519 below.

@Quantamm Quantamm marked this pull request as ready for review March 7, 2023 01:43
@Quantamm
Copy link
Author

Quantamm commented Mar 7, 2023

This is as ready as I can make it. I was hoping to add some unit tests, but I'm not a Python developer and trying to understand the code there is a bit beyond what I'm looking to do.

I've built and am using a docker image based off of this PR and it's working great for me. When one of my air quality sensors goes offline, the metrics do stop getting exported, as expected.

@fhemberger fhemberger merged commit e601a6d into fhemberger:master Mar 11, 2023
@fhemberger
Copy link
Owner

@Quantamm Thanks for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement removing stale metrics
3 participants