Bucket-Level Alerting #86

adityaj1107 · 2021-06-02T21:50:29Z

Issue by qreshi
Friday Dec 18, 2020 at 23:27 GMT
Originally opened as opendistro-for-elasticsearch/alerting#326

The Document-Level Alerting feature enhancement seeks to address the concerns brought up in both #13 and #145 among others. Creating this issue to centralize discussion.

adityaj1107 · 2021-06-02T21:50:30Z

Comment by mgiammarco
Friday Dec 25, 2020 at 11:12 GMT

Thank you for this thread.
An alerting system that is really useful for work should have these features:

1. Easy and scalable: I should not need to create a new monitor/alert when I start monitoring a new host.
2. Independent alerts for each host/resource.
3. The ability to choose whether an alert is auto-closed or not.

Consider this (I hope typical) use case:

100 hosts to monitor
each host sends data with several agents (syslog, collectd, and so on) in different formats
if host1 and host2 both have a failed backup, I must get two alerts
if I must monitor average CPU usage on host1 and host2, I need an easy way to group by host and again send two separate alerts
for some alerts I do not want them to return to the normal state automatically. For example, high CPU usage starts at 3am and stops at 5am. When I check at 2pm I need to still see the alert in red state.
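
To make the use case above concrete, here is a minimal Python sketch of the behaviour being asked for: one alert per host from a grouped average, plus an opt-out from automatic closing. It is not how any existing plugin works. The field names (`host`, `cpu`), the state names, and the helpers `alerts_per_host` and `update_state` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample documents; field names are illustrative only.
samples = [
    {"host": "host1", "cpu": 95.0},
    {"host": "host1", "cpu": 91.0},
    {"host": "host2", "cpu": 97.0},
    {"host": "host3", "cpu": 12.0},
]

def alerts_per_host(docs, threshold):
    """Group documents by host and return one alert per host above the threshold."""
    by_host = defaultdict(list)
    for doc in docs:
        by_host[doc["host"]].append(doc["cpu"])
    alerts = {}
    for host, values in by_host.items():
        avg = mean(values)
        if avg > threshold:
            # State names are assumptions for this sketch.
            alerts[host] = {"host": host, "avg_cpu": avg, "state": "ACTIVE"}
    return alerts

def update_state(alert, condition_still_true, auto_close):
    """Return the alert to normal only if it was configured to auto-close."""
    if not condition_still_true and auto_close:
        alert["state"] = "COMPLETED"
    return alert

alerts = alerts_per_host(samples, threshold=90.0)
```

With `auto_close=False`, an alert raised at 3am stays `ACTIVE` after the condition clears, so it is still visible when someone checks later in the day.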

One piece of software that fulfills the above criteria is InfluxDB. Another is the ElastAlert plugin for Elasticsearch. Please consider that one and possibly integrate it, because it fulfills all of these needs.
Grafana has alerting too but it completely misses point 2.

adityaj1107 · 2021-06-02T21:50:31Z

Comment by verbecee
Monday Jan 11, 2021 at 19:03 GMT

Just got off the community forum and wanted to post 3 recommendations for alerting:

For aggregation, there should be flexibility on the groupby field. In our alerting implementation (we are using something besides Open Distro's alerting to accomplish our goals), we had an alert set up that would aggregate on field X. Initially, that field came in as a string, but then it started coming in as an array of strings, so we had to accommodate this.
The aggregation should be able to deal with dirty data. Similar to the example above, this same index started receiving logs with arrays composed of strings and the value null. At least in our implementation, null really screwed up our aggregation and needed to be handled. In our case, too, we also had to deal with ECS special characters in logs, but that also might only be an issue for us because we are interfacing with Elasticsearch.
Suppression - provide context about what the alert is suppressing. Is it a misconfigured server or a malware outbreak in the network?
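
As a rough illustration of the first two points, the sketch below normalizes a group-by field that may arrive as a string, as an array of strings, or with null entries, before counting. It is plain Python rather than anything from an alerting plugin, and the helper names `group_keys` and `count_by_field` as well as the field name `x` are hypothetical.

```python
from collections import Counter

def group_keys(value):
    """Return clean group-by keys for a value that may be a string, a list, or None."""
    if value is None:
        return []
    items = value if isinstance(value, list) else [value]
    # Drop null entries inside arrays instead of letting them break the grouping.
    return [str(item) for item in items if item is not None]

def count_by_field(docs, field):
    """Count documents per key, treating each array element as its own key."""
    counts = Counter()
    for doc in docs:
        for key in group_keys(doc.get(field)):
            counts[key] += 1
    return counts

# Hypothetical documents showing each shape the field took over time.
docs = [
    {"x": "a"},           # plain string
    {"x": ["a", "b"]},    # array of strings
    {"x": ["b", None]},   # array containing null
    {"x": None},          # null value
    {},                   # field missing entirely
]
counts = count_by_field(docs, "x")
```

Whether an array should count once per element, as here, or once per document is a design choice that depends on the alert.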

adityaj1107 · 2021-06-02T21:50:32Z

Comment by rafael-gumiero
Tuesday Jan 19, 2021 at 01:07 GMT

Basically our use case is very similar to the ones listed above.

Generate separate alerts based on a key to be defined (host, device type, etc).
Grouping categorizes alerts of similar nature into a single notification. This is especially useful during larger outages when many systems fail at once and hundreds to thousands of alerts may be firing simultaneously.
Inhibition is a concept of suppressing notifications for certain alerts if certain other alerts are already firing.
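
A small sketch of how grouping and inhibition could fit together, under the assumption that alerts are simple records with a `host` and a `name`. The rule format and the functions `apply_inhibition` and `group_notifications` are hypothetical, not taken from any particular tool.

```python
from collections import defaultdict

def apply_inhibition(alerts, inhibit_rules):
    """Drop alerts that are inhibited by another alert firing on the same host.

    inhibit_rules maps an alert name to the names of alerts that suppress it.
    """
    firing = {(a["host"], a["name"]) for a in alerts}
    kept = []
    for a in alerts:
        sources = inhibit_rules.get(a["name"], [])
        if any((a["host"], src) in firing for src in sources):
            continue  # a suppressing alert is already firing for this host
        kept.append(a)
    return kept

def group_notifications(alerts, key):
    """Collapse alerts sharing the same key into a single notification."""
    groups = defaultdict(list)
    for a in alerts:
        groups[a[key]].append(a["host"])
    return dict(groups)

# Hypothetical alerts: host1 is down, so its HighCpu alert is just noise.
alerts = [
    {"host": "host1", "name": "HostDown"},
    {"host": "host1", "name": "HighCpu"},
    {"host": "host2", "name": "HighCpu"},
    {"host": "host3", "name": "HighCpu"},
]
rules = {"HighCpu": ["HostDown"]}

kept = apply_inhibition(alerts, rules)
notifications = group_notifications(kept, key="name")
```

Here the four alerts end up as two notifications, one per alert name, and the `HighCpu` alert for `host1` is suppressed because `HostDown` is already firing for that host.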

Use case breakdown:

100+ hosts;
Metrics being captured via: metricbeat and filebeat;
It is necessary to generate separate alerts for each host/device or specific key that is out of the desired condition;
Create alerts that are as standardized as possible, to avoid having to create endless separate rules (costly to maintain);
Alerts based on anomaly detection and threshold.

qreshi · 2021-11-16T23:53:21Z

Closing this as this feature was launched as part of the OpenSearch 1.1 release

adityaj1107 added the enhancement (New feature or request) label Jun 2, 2021

qreshi mentioned this issue Aug 20, 2021

Bucket-Level Alerting opendistro-for-elasticsearch/alerting#326

Closed

qreshi changed the title ~~Document-Level Alerting~~ Bucket-Level Alerting Aug 20, 2021

qreshi closed this as completed Nov 16, 2021

qreshi mentioned this issue Nov 16, 2021

Support document level alerts #63

Closed
