-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Alert Summaries] Add a data provider so the rule registry can provide alerts over a time span (or past execution) #143374
Comments
I'm hoping we have the necessary date fields in the alerts-as-data indices to make these queries:
|
The rule registry already provides a RuleDataClient with a |
It looks like the New alerts set With this combination of timestamps, I think we have enough to query for what we need. I think for the previous run, this should be pretty straightfoward. For an arbitrary time range, we should be able to query the Does that align with the requirements? |
I'm good with whatever is simple. When the FAAD is in place, we'll be removing this code so whatever is easier will help us. In regards to finding a way to expose this searching to the framework, I'll post my idea but don't feel obligated to use it:
Makes sense to me, it matches the mental model I have. |
Should we think about capping the data? |
With the alert circuit breaker in place which defaults to 1000, I believe the most we should return should be 1000 new/active and 1000 recovered. |
@ersin-erdal I wasn't thinking about the query per time range, which can return many more alerts. Right now, I'm querying based on time range and then splitting up the results into new/ongoing/recovered. If we're capping the number of results returned but still want the actual number of new/ongoing/recovered, we would have to push the condition to ES via a scripted query. Are we ok with scripted queries? They tend to be less performant. cc @mikecote |
Good catch, @ymao1! I think it would be valuable to get the total count while limiting how many new, ongoing and recovered alerts we return. You can also weigh multiple queries vs scripted queries if ever one is easier and more performant. We can also apply the 1000 limit to each array (new, ongoing, recovered). |
@ersin-erdal @mikecote The lifecycle executors write out a different alert document for a single alert if it goes from active to recovered, then active again. Do we want to consider that one "new" alert? For example, if we're querying against the last day of data and we have an alert for |
Yes, we should consider each alert separate and return both. I'm thinking this is best so the summary matches the changes that happened to the alert documents. |
Meta: #143200
In order to add the alerts (as an output of Alert Summaries feature) to the action context we need to fetch the alerts from alert-as-data index. As the rule-registry already writes the alert-as-data we add a new provider to read the data as well.
We need a function attached to the rule type definition (automated by rule registry) that provides a search function to gather a series of alerts created on the last run or over a certain time span. The search function should be able to return new, ongoing and recovered alerts for a given time (last run or date range).
See #143200 for terminology of new, ongoing and recovered to help build queries.
The text was updated successfully, but these errors were encountered: