[Fleet] cancel tasks when 3rd retry failed (elastic#147190)
## Summary

Related to elastic#144161

Found that when a bulk update tags task failed, the task did not stop after 3 retries (which should be over in less than a minute); the retries kept happening for 2 hours. This change removes the retry task once 3 retries are reached.

Also testing in a cloud deployment to see if the tags error can be reproduced with this fix. I could reproduce the reported error locally, and it goes away with this fix.

To verify:
- Add at least 50k agents with the `create_agents` script in the kibana repo
- Open Kibana, select the 50k agents, and open Actions / Add tags
- Within a few seconds, add 2 new tags and remove one of them
- Wait about 30s; the agents should reflect the changes
- Check the logs to see that the tasks are removed once the 3rd retry is reached or a retry succeeds
- Check that there are no more running tasks

Any running task can be found in Kibana Console by running this query:
`GET .kibana_task_manager/_search?q=task.taskType:"fleet:update_agent_tags:retry"`

Locally simulated an error to test that the retry (and check) task is removed:
```
[2022-12-07T15:52:16.415+01:00][ERROR][plugins.fleet] Retry #3 of task fleet:update_agent_tags:retry:848984ab-c11d-4ebe-8d1f-606143dd656b failed: failing task
[2022-12-07T15:52:16.416+01:00][WARN ][plugins.fleet] Stopping after 3rd retry. Error: failing task
[2022-12-07T15:52:16.416+01:00][INFO ][plugins.fleet] Removing task fleet:update_agent_tags:retry:check:848984ab-c11d-4ebe-8d1f-606143dd656b
[2022-12-07T15:52:16.416+01:00][INFO ][plugins.fleet] Removing task fleet:update_agent_tags:retry:848984ab-c11d-4ebe-8d1f-606143dd656b
```
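
The behavior described above (stop retrying and remove both the retry task and its check task once the 3rd retry fails) can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `InMemoryTaskStore`, `runRetry`, `MAX_RETRIES`, and the shortened task id prefixes are hypothetical names, and the in-memory map only stands in for Kibana's Task Manager.

```typescript
// Hypothetical sketch: an in-memory stand-in for a task store.
// This is NOT Kibana's Task Manager API.
const MAX_RETRIES = 3;

interface RetryTask {
  id: string;
  attempts: number;
}

class InMemoryTaskStore {
  private tasks = new Map<string, RetryTask>();

  schedule(id: string): void {
    if (!this.tasks.has(id)) {
      this.tasks.set(id, { id, attempts: 0 });
    }
  }

  get(id: string): RetryTask | undefined {
    return this.tasks.get(id);
  }

  remove(id: string): void {
    this.tasks.delete(id);
  }

  size(): number {
    return this.tasks.size;
  }
}

// Runs one retry attempt. Both the retry task and its check task are removed
// when the attempt succeeds or when the retry limit is reached.
function runRetry(
  store: InMemoryTaskStore,
  actionId: string,
  action: () => void,
  log: string[]
): void {
  const retryId = `retry:${actionId}`;
  const checkId = `retry:check:${actionId}`;
  const task = store.get(retryId);
  if (!task) {
    return; // task was already removed, nothing left to retry
  }

  task.attempts += 1;
  try {
    action();
    store.remove(checkId);
    store.remove(retryId);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    log.push(`Retry #${task.attempts} of task ${retryId} failed: ${message}`);
    if (task.attempts >= MAX_RETRIES) {
      log.push(`Stopping after retry #${task.attempts}. Error: ${message}`);
      store.remove(checkId);
      store.remove(retryId);
    }
  }
}

// Usage: an action that always fails is attempted at most MAX_RETRIES times.
const store = new InMemoryTaskStore();
const log: string[] = [];
store.schedule("retry:abc");
store.schedule("retry:check:abc");
for (let i = 0; i < 5; i++) {
  runRetry(store, "abc", () => {
    throw new Error("failing task");
  }, log);
}
```

The point of the sketch is that the attempt limit is enforced by removing the tasks, so a later trigger finds nothing to run instead of retrying for hours.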