[Alerting] Next steps for O11y of Alerting #105306

Open
chrisronline opened this issue Jul 12, 2021 · 1 comment
Labels

estimate:needs-research, Feature:Alerting/RulesFramework, impact:high, research, resilience, Team:ResponseOps

Comments

chrisronline (Contributor) commented Jul 12, 2021

As part of the first phase of observability of alerting, we identified the impossible-to-answer questions that plagued our ability to support users around alerting, actions, and task manager. We then shipped quite a few improvements in 7.14 to make those previously impossible questions answerable.

As a follow-up to this effort, we need to identify what comes next. It will take time to know whether the 7.14 changes moved the needle as intended, but there are things we can start in parallel to assist with the overall effort.

In the short term, we need to:

In the mid term, we should:

  • Ensure the changes added in 7.14 actually made the impossible questions possible to answer (this takes time as users upgrade to 7.14+)
  • Identify (very) hard-to-answer questions from support issues and work to make those easier to answer (for example, does having over-time charts of health metrics help us answer questions faster? Should we invest in that experience? A rough polling sketch is included at the end of this comment.)

In the longer term, we can:

  • Integrate with existing Elastic solutions that solve similar problems (APM, Stack Monitoring, etc.)
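
To make the "over-time charts of health metrics" idea concrete, here is a minimal sketch (not something we have built) that polls the existing Kibana health APIs (`/api/task_manager/_health` and `/api/alerting/_health`) on an interval so the responses could be charted over time. It assumes Node 18+ (for global `fetch`), basic auth, and placeholder URL/credentials; the response shape varies by version, so it only records `status` plus the raw payload and leaves aggregation/charting to whatever consumes the output.

```ts
// Hypothetical polling script: samples the Kibana Task Manager and Alerting
// health APIs on an interval so the results can be charted over time.
// KIBANA_URL and the credentials below are placeholders for a real deployment.

const KIBANA_URL = process.env.KIBANA_URL ?? 'http://localhost:5601';
const AUTH =
  'Basic ' +
  Buffer.from(
    `${process.env.KIBANA_USER ?? 'elastic'}:${process.env.KIBANA_PASS ?? 'changeme'}`
  ).toString('base64');

async function sampleHealth(path: string): Promise<void> {
  const res = await fetch(`${KIBANA_URL}${path}`, {
    headers: { Authorization: AUTH },
  });
  if (!res.ok) {
    console.error(`${path} returned HTTP ${res.status}`);
    return;
  }
  const body = (await res.json()) as Record<string, unknown>;
  // Both health APIs report a top-level `status`; the rest of the payload
  // differs by endpoint and version, so we log it whole.
  console.log(
    JSON.stringify({ ts: new Date().toISOString(), path, status: body.status, body })
  );
}

// Poll every 60 seconds; both paths are documented Kibana health APIs.
setInterval(() => {
  void sampleHealth('/api/task_manager/_health');
  void sampleHealth('/api/alerting/_health');
}, 60_000);
```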
@botelastic botelastic bot added the needs-team label Jul 12, 2021
@chrisronline chrisronline added the Team:ResponseOps label Jul 12, 2021
@elasticmachine (Contributor)

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@botelastic botelastic bot removed the needs-team label Jul 12, 2021
@mikecote mikecote added the Feature:Alerting/RulesFramework and loe:needs-research labels Jul 13, 2021
@chrisronline chrisronline changed the title from "[Research] Next steps for O11y of Alerting - Create issues for known future enhancements" to "[Alerting] Next steps for O11y of Alerting" Jul 13, 2021
@gmmorris gmmorris added the resilience and estimate:needs-research labels Aug 13, 2021
@gmmorris gmmorris removed the loe:needs-research label Sep 2, 2021
@gmmorris gmmorris added the impact:high label Sep 16, 2021
@kobelb kobelb added the needs-team label Jan 31, 2022
@botelastic botelastic bot removed the needs-team label Jan 31, 2022