[Alerting] getAlertStatus() should exhaustively search through event log #74860

Open
pmuellr opened this issue Aug 12, 2020 · 3 comments
Labels
enhancement New value added to drive a business result estimate:needs-research Estimated as too large and requires research to break down into workable issues Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) technical debt Improvement of the software architecture and operational architecture

Comments

@pmuellr
Member

pmuellr commented Aug 12, 2020

see: #68437 (comment)

Currently the alert client's getAlertStatus() method (added in the PR referenced above) fetches at most 10K events when determining the status. That may not be enough for alerts with many active instances: older events can be dropped from the results.

We should probably change this to search the event log exhaustively instead, though perhaps with some reasonable limit on the start date. We'd have to change how the event log is processed:

  • change the event log query to sort ascending, and process the first "page" of events coming in
  • after processing that "page" of events, get the next page and continue
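
The two steps above could be sketched roughly as follows. This is only an illustration, not the actual alert client code: the AlertEvent shape, the FetchPage callback, and the InstanceStatus type are hypothetical stand-ins for the real event log client and status summary. It also assumes plain page numbers are enough; with Elasticsearch's default 10,000-hit result window, a real implementation would likely need something like search_after instead.

```typescript
// Hypothetical, simplified event shape; real event log documents carry more fields.
interface AlertEvent {
  timestamp: string; // ISO date string
  action: 'new-instance' | 'active-instance' | 'recovered-instance';
  instanceId: string;
}

// Hypothetical page fetcher: must return events sorted ascending by timestamp.
type FetchPage = (page: number, perPage: number) => Promise<AlertEvent[]>;

// Hypothetical per-instance summary.
interface InstanceStatus {
  status: 'Active' | 'OK';
  activeStartDate?: string;
}

// Walk the event log oldest-first, one page at a time, until a short page
// signals that there are no more events.
async function summarizeInstances(
  fetchPage: FetchPage,
  perPage = 1000
): Promise<Map<string, InstanceStatus>> {
  const instances = new Map<string, InstanceStatus>();
  for (let page = 1; ; page++) {
    const events = await fetchPage(page, perPage);
    for (const event of events) {
      applyEvent(instances, event);
    }
    if (events.length < perPage) break; // last (possibly empty) page
  }
  return instances;
}

// Fold a single event into the running per-instance state.
function applyEvent(instances: Map<string, InstanceStatus>, event: AlertEvent): void {
  switch (event.action) {
    case 'new-instance':
      instances.set(event.instanceId, { status: 'Active', activeStartDate: event.timestamp });
      break;
    case 'active-instance': {
      const current = instances.get(event.instanceId);
      if (current === undefined || current.status !== 'Active') {
        // No start event seen for this active period: start date is unknown.
        instances.set(event.instanceId, { status: 'Active' });
      }
      break;
    }
    case 'recovered-instance':
      instances.set(event.instanceId, { status: 'OK' });
      break;
  }
}
```

Because events arrive in ascending order, each page can be folded into the running state and then discarded, so memory use depends on the number of instances rather than the number of events.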
@pmuellr pmuellr added Feature:Alerting technical debt Improvement of the software architecture and operational architecture Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Aug 12, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@mikecote mikecote added the enhancement New value added to drive a business result label Aug 19, 2020
@YulNaumenko
Contributor

@pmuellr could this issue be related to the PR I've closed and the closed issue?

@pmuellr
Member Author

pmuellr commented Mar 16, 2021

> @pmuellr could this issue be related to the PR I've closed and the closed issue?

Good question, and I think no. We were still using a time-range to search through the event log, so there's always a possibility we didn't go quite far enough back.

I think issue #93704 points to a better approach.

The idea is that we'd store the "new-instance" date in the instance state, and arrange to write it out with probably the active-instance and recovered-instance events, so we NEVER have to search for the new-instance events. This would allow us to get the date ranges even if the older events were deleted via ILM.
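
A rough sketch of that idea, under stated assumptions: the activeStartDate field name, the InstanceState and LoggedEvent shapes, and the recordInstance helper are all hypothetical and are not taken from the alerting framework. The point is only that the start date travels with the instance state and is copied onto every active-instance and recovered-instance event.

```typescript
// Hypothetical persisted state for one alert instance.
interface InstanceState {
  activeStartDate: string; // ISO date of when the instance first became active
}

// Hypothetical event log entry; activeStartDate is the assumed extra field.
interface LoggedEvent {
  timestamp: string;
  action: 'new-instance' | 'active-instance' | 'recovered-instance';
  instanceId: string;
  activeStartDate?: string;
}

// Given the previous state and whether the instance is active in this run,
// return the events to write and the state to persist for the next run.
function recordInstance(
  instanceId: string,
  previous: InstanceState | undefined,
  isActive: boolean,
  now: Date
): { events: LoggedEvent[]; state: InstanceState | undefined } {
  const timestamp = now.toISOString();
  const events: LoggedEvent[] = [];

  if (isActive) {
    // Reuse the stored start date, or start a new active period now.
    const state: InstanceState = previous ?? { activeStartDate: timestamp };
    if (previous === undefined) {
      events.push({ timestamp, action: 'new-instance', instanceId });
    }
    events.push({
      timestamp,
      action: 'active-instance',
      instanceId,
      activeStartDate: state.activeStartDate,
    });
    return { events, state };
  }

  if (previous !== undefined) {
    // The instance just recovered: log the start date one last time, drop the state.
    events.push({
      timestamp,
      action: 'recovered-instance',
      instanceId,
      activeStartDate: previous.activeStartDate,
    });
  }
  return { events, state: undefined };
}
```

With events shaped like this, a reader needs only the most recent active-instance or recovered-instance event per instance to recover the date range, even when the original new-instance event has been deleted via ILM.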

@gmmorris gmmorris added the Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework label Jul 1, 2021
@gmmorris gmmorris added the loe:needs-research This issue requires some research before it can be worked on or estimated label Jul 15, 2021
@gmmorris gmmorris added the estimate:needs-research Estimated as too large and requires research to break down into workable issues label Aug 18, 2021
@gmmorris gmmorris removed the loe:needs-research This issue requires some research before it can be worked on or estimated label Sep 2, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022