
[ResponseOps] error when two recovered actions run: Alert instance execution has already been scheduled #147532

Closed
pmuellr opened this issue Dec 14, 2022 · 4 comments · Fixed by #147617
Labels: bug, Feature:Alerting, Feature:Alerting/RulesFramework, Team:ResponseOps


pmuellr (Member) commented Dec 14, 2022

Created a rule with 4 actions: two for the active group, two for recovered. Rule type doesn't matter. Force the rule to go active, then allow it to recover. While the recovered actions are being processed, the following error is logged:

[ERROR][plugins.alerting.index-threshold] Executing Rule default:.index-threshold:6a3d8c40-7bbc-11ed-be1a-31b626d2ea09 has resulted in Error: 
  Alert instance execution has already been scheduled, cannot schedule twice - 
  Error: Alert instance execution has already been scheduled, cannot schedule twice
    at Alert.ensureHasNoScheduledActions (/Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/alerting/server/alert/alert.ts:163:13)
    at Alert.scheduleActions (/Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/alerting/server/alert/alert.ts:146:10)
    at ExecutionHandler.run (/Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/alerting/server/task_runner/execution_handler.ts:228:17)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at /Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/alerting/server/task_runner/task_runner.ts:473:9
    at TaskRunnerTimer.runWithTimer (/Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/alerting/server/task_runner/task_runner_timer.ts:49:20)
    at TaskRunner.runRule (/Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/alerting/server/task_runner/task_runner.ts:462:5)
    at TaskRunner.run (/Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/alerting/server/task_runner/task_runner.ts:670:31)
    at TaskManagerRunner.run (/Users/pmuellr/Projects/elastic/kibana-prs/x-pack/plugins/task_manager/server/task_running/task_runner.ts:304:22)
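The guard at the top of the stack trace can be sketched as follows. This is a minimal, hypothetical reconstruction based only on the method names above (`scheduleActions`, `ensureHasNoScheduledActions`), not the actual Kibana `Alert` class, which carries far more state:

```typescript
// Minimal sketch of the schedule-twice guard, assuming the method names
// from the stack trace; the real class in
// x-pack/plugins/alerting/server/alert/alert.ts is more involved.
class Alert {
  private scheduledActionGroup?: string;

  scheduleActions(actionGroup: string): Alert {
    this.ensureHasNoScheduledActions();
    this.scheduledActionGroup = actionGroup;
    return this;
  }

  hasScheduledActions(): boolean {
    return this.scheduledActionGroup !== undefined;
  }

  private ensureHasNoScheduledActions(): void {
    if (this.hasScheduledActions()) {
      throw new Error(
        'Alert instance execution has already been scheduled, cannot schedule twice'
      );
    }
  }
}

// First call succeeds; a second call on the same alert throws.
const alert = new Alert();
alert.scheduleActions('recovered');
let threw = false;
try {
  alert.scheduleActions('recovered');
} catch {
  threw = true;
}
console.log(threw); // true
```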

Here's the rule I happened to use, which depends on #144689. I used `es-apm-sys-sim 1 4 -k` (https://github.com/pmuellr/es-apm-sys-sim) to force the CPU value up and down.

esq-rule-schedule-twice.ndjson.zip

elasticmachine (Contributor) commented

Pinging @elastic/response-ops (Team:ResponseOps)

ymao1 (Contributor) commented Dec 14, 2022

I believe this is due to the recent refactor that iterates over actions first, then alerts. With multiple recovery actions, this line:

alert.scheduleActions(action.group as ActionGroupIds);

gets called multiple times for each recovered alert, and the second call throws the error.
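The iteration order described above can be sketched as follows. This is an illustrative reduction, assuming hypothetical names; the real `ExecutionHandler.run` in `execution_handler.ts` carries much more context:

```typescript
// Hypothetical sketch of the actions-first iteration: outer loop over the
// rule's recovery actions, inner loop over recovered alerts. With two
// recovery actions, the same alert is scheduled twice, tripping the guard.
interface RecoveredAlert {
  id: string;
  scheduledGroup?: string;
}

function scheduleActions(alert: RecoveredAlert, group: string): void {
  // Same check as Alert.ensureHasNoScheduledActions in the stack trace.
  if (alert.scheduledGroup !== undefined) {
    throw new Error(
      'Alert instance execution has already been scheduled, cannot schedule twice'
    );
  }
  alert.scheduledGroup = group;
}

// Two recovery actions configured on the rule, one recovered alert.
const recoveryActions = [{ group: 'recovered' }, { group: 'recovered' }];
const recoveredAlerts: RecoveredAlert[] = [{ id: '1' }];

let error: Error | undefined;
for (const action of recoveryActions) {
  for (const alert of recoveredAlerts) {
    try {
      scheduleActions(alert, action.group); // throws on the second action
    } catch (e) {
      error = e as Error;
    }
  }
}
console.log(error?.message);
```

With a single recovery action the inner call runs once per alert and nothing throws, which is why the bug only surfaces with two or more recovered actions.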

mikecote (Contributor) commented

Moving this to the current iteration; we should get this backported to 8.6 if that release is affected as well. Thanks @pmuellr for the find!

ersin-erdal (Contributor) commented

Tried locally; removing that line fixes the bug.
