Feature Request: Time-Averaged Threshold for Alerts #93
Comments
I can add options at some point for 10m, 20m, and 2h time periods. That shouldn't add much overhead, since we're already calculating those averages for the 12h, 24h, and 1w charts. One point of clarification: the threshold is currently not instant. It works as you outlined (time-averaged), but only over one-minute intervals, so a spike above the threshold that lasts under a minute won't trigger an alert. That may be what you meant, but I wanted to point it out in case anyone was wondering.
Thank you for your explanation! I hope it can be a customizable value instead of hardcoded options, as the AUP varies across different IDCs, and the allowed duration for full load differs as well.
This would go a long way toward improving the alerting features, and I would love to see it implemented. Would it be possible to have multiple alert triggers for each metric? This would make it even more customisable.
Maybe a better implementation would be to add another slider allowing you to choose any number of minutes from 1m to 60m? This would be slightly more intensive, as we'd need to query, loop over, and decode JSON for the previous 1m records. But we'd only need to do that if the alert hasn't been triggered and the current 1m record is above threshold, or the alert is triggered and the current record is below threshold. Most of the time you'll be below threshold and without a triggered alert, so that operation wouldn't need to run. Seems like that may be the way to go.
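The check described in the comment above can be sketched roughly as follows. This is only an illustration in Python, not Beszel's actual code: `Alert`, `evaluate`, and the `fetch_previous` callback are hypothetical names, and the sketch assumes the newest one-minute record counts toward the average.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class Alert:
    """Hypothetical alert settings; not Beszel's real data model."""
    threshold: float          # percent, e.g. 80.0
    minutes: int              # averaging window chosen on the slider, 1 to 60
    triggered: bool = False


def evaluate(alert: Alert, current: float,
             fetch_previous: Callable[[int], Sequence[float]]) -> bool:
    """Update alert.triggered and return True if the history query ran."""
    above = current > alert.threshold
    # As proposed in the thread: skip the expensive query unless the newest
    # record disagrees with the alert's current state.
    if above == alert.triggered:
        return False
    # Expensive path: load the previous (minutes - 1) one-minute records.
    previous = fetch_previous(alert.minutes - 1) if alert.minutes > 1 else []
    alert.triggered = mean([current, *previous]) > alert.threshold
    return True
```

In the common case (below threshold, no active alert) `evaluate` returns immediately without calling `fetch_previous`, which is the saving the comment is after.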
Added in 0.6.0. Please update and let me know if you run into any issues with it.
How do I dismiss an active alert? I currently have an alert for one of my servers. I'm fine with the disk being 50% full, but even after raising the threshold to 80%, the alert stays. I assume I have to wait for another 10 minutes to pass? Disabling the alert and re-enabling it made it go away.
@Matthias-vdE It should clear on the next system update, but I'll change it so the alert gets set to inactive if you update the time or threshold.
Description:
Currently, beszel only supports instant thresholds for server monitoring and alerting. This can lead to false alarms triggered by normal, short-term operations such as file compression that may temporarily spike resource usage.
Feature request:
Implement a new threshold type that triggers alerts based on the average value of a monitored metric over a specified time period, rather than instantaneous values.
Proposed functionality:
Example use case:
In this scenario, an alert would only be triggered if the average CPU usage over a 15-minute period exceeds 80%, reducing false alarms from short-term spikes.
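As a rough illustration of that scenario, here is a small Python sketch of a rolling 15-minute average over one-minute CPU readings. The names `RollingAverage` and `should_alert` are made up for this example, and waiting for a full window before alerting is an assumption, not something the request specifies.

```python
from collections import deque


class RollingAverage:
    """Average of the most recent `window` samples (hypothetical helper)."""

    def __init__(self, window: int) -> None:
        self.samples = deque(maxlen=window)  # oldest sample drops off when full

    def add(self, value: float) -> float:
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


def should_alert(readings, threshold=80.0, window=15):
    """Return one bool per reading: True when the windowed average exceeds threshold."""
    avg = RollingAverage(window)
    decisions = []
    for reading in readings:
        current_avg = avg.add(reading)
        # Assumption: only alert once a full window of samples is available.
        window_full = len(avg.samples) == window
        decisions.append(window_full and current_avg > threshold)
    return decisions


# A one-minute spike to 100% among 30% readings averages well under 80%.
spike = [30.0] * 14 + [100.0]
# Fifteen consecutive minutes at 90% push the average over 80%.
sustained = [90.0] * 15
```

With these inputs, `should_alert(spike)` never fires, while `should_alert(sustained)` fires on the fifteenth reading.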
Benefits:
This feature would significantly enhance beszel's monitoring capabilities and provide more meaningful alerts to users.