[RAC][Rule Registry] UI/UX around timeouts and errors during index bootstrapping #111170

Open
Tracked by #101016
banderror opened this issue Sep 3, 2021 · 4 comments
Labels
Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Theme: rac label obsolete

Comments

@banderror
Contributor

Parent ticket: #101016

Summary

Background: #108115 (comment)

During index bootstrapping, two kinds of problems related to network conditions can occur:

  • Timeouts, for example when the network or the ES cluster is under load. Currently there is a 20-minute timeout for installing the common ES resources shared between all indices, plus a 20-minute timeout for installing index-specific resources for each index separately (e.g. for .alerts-security.alerts). That is 40 minutes in total.

    • During these 20-40 minutes, rules that attempt to write alerts will be blocked and will hang. In the Rule Management table in Security, such a rule will show the "going to run" status, with no logs or other messages.
  • Errors, such as network errors or errors returned by Elasticsearch.

    • In this case, the errors are re-thrown as exceptions; the rule status changes to "failed", and some Kibana logs are available.
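
To make the two cases concrete, here is a minimal sketch of how sequential bootstrapping stages with per-stage timeouts could surface a distinct, typed error instead of hanging silently. This is not the rule registry's actual implementation: `BootstrapTimeoutError`, `Stage`, `withTimeout`, `worstCaseWaitMs`, and `bootstrap` are all hypothetical names introduced for illustration. It only assumes, as described above, that the stages run one after another and each has its own timeout.

```typescript
// Hypothetical sketch; none of these names come from the Kibana code base.

// A dedicated error type lets callers tell a timeout apart from an ES/network error.
class BootstrapTimeoutError extends Error {
  stage: string;
  timeoutMs: number;

  constructor(stage: string, timeoutMs: number) {
    super(`Stage "${stage}" timed out after ${timeoutMs} ms`);
    this.name = 'BootstrapTimeoutError';
    this.stage = stage;
    this.timeoutMs = timeoutMs;
  }
}

interface Stage {
  name: string;
  timeoutMs: number;
  run: () => Promise<void>;
}

// Rejects with BootstrapTimeoutError if `work` does not settle within `timeoutMs`.
// Errors thrown by `work` itself are passed through unchanged.
function withTimeout<T>(stage: string, timeoutMs: number, work: Promise<T>): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new BootstrapTimeoutError(stage, timeoutMs)),
      timeoutMs
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Stages run sequentially, so the worst-case wait is the sum of their timeouts.
function worstCaseWaitMs(stages: Stage[]): number {
  return stages.reduce((total, stage) => total + stage.timeoutMs, 0);
}

// Runs each stage in order; the first timeout or error aborts the bootstrap.
async function bootstrap(stages: Stage[]): Promise<void> {
  for (const stage of stages) {
    await withTimeout(stage.name, stage.timeoutMs, stage.run());
  }
}
```

With two stages of 20 minutes each (common resources, then index-specific resources), `worstCaseWaitMs` returns 2,400,000 ms, which matches the 40-minute total above. A caller that catches `BootstrapTimeoutError` separately from other errors could then report a more specific status or log message than a rule stuck in "going to run".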

Do we need to build a better UX around that?

@banderror banderror added Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Theme: rac label obsolete labels Sep 3, 2021
@elasticmachine
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@banderror
Contributor Author

Hey everyone, FYI ownership of this ticket and other tickets related to rule_registry (like #101016) now goes to the Detection Alerts area (Team:Detection Alerts label). Please ping @peluja1012 and @marshallmain if you have any questions.

@marshallmain
Contributor

Transferring again to @elastic/response-ops as they now own the rule registry implementation.

@marshallmain marshallmain added Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) and removed Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Alerts Security Detection Alerts Area Team labels Apr 19, 2022