[alerting] public facing doc on alerting performance #107979

Open
pmuellr opened this issue Aug 9, 2021 · 1 comment
Labels
docs estimate:medium Medium Estimated Level of Effort Feature:Alerting Feature:Task Manager performance Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@pmuellr
Member

pmuellr commented Aug 9, 2021

Note: this was split off from issue #95194, which was originally meant to cover both benchmarking 7.14 and creating a blog post on alerting performance.

We'd like to have some public-facing documentation on alerting performance, similar to other Elastic-produced content such as this post covering performance with logs and metrics: Benchmarking and sizing your Elasticsearch cluster for logs and metrics. It doesn't necessarily have to be a blog post, but those seem to be popular destinations for stack users. Perhaps a split between some additional asciidoc and a blog post.

The basic idea is to describe:

  • how the alerting system uses task manager, at a very high level
  • how many tasks can be run per minute/hour given the existing task manager config defaults
  • how to change the task manager defaults
  • how to get measurements from your system (task manager health)
  • how to determine whether you need to resize your cluster as the number of running alerting rules grows or shrinks
  • some extras, such as ephemeral workers
  • open question: should we publicize our stress tester, as something customers could use for experiments?
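
To make the "tasks per minute/hour" and "resize your cluster" items above concrete, here is a rough back-of-the-envelope sketch. It assumes the task manager defaults are `xpack.task_manager.max_workers: 10` and `xpack.task_manager.poll_interval: 3000` (ms); please verify those against the docs for your version. The function names are hypothetical, and the result is only an upper bound: it assumes every task finishes within one poll interval and ignores the extra tasks created when rules fire actions.

```python
import math

# Assumed defaults for the xpack.task_manager.* settings (verify for your version).
DEFAULT_MAX_WORKERS = 10          # xpack.task_manager.max_workers
DEFAULT_POLL_INTERVAL_MS = 3000   # xpack.task_manager.poll_interval


def tasks_per_minute(max_workers=DEFAULT_MAX_WORKERS,
                     poll_interval_ms=DEFAULT_POLL_INTERVAL_MS):
    """Upper bound on tasks a single Kibana instance can start per minute."""
    polls_per_minute = 60_000 / poll_interval_ms
    return max_workers * polls_per_minute


def kibana_instances_needed(rule_count, rule_interval_s, **task_manager_settings):
    """Rough number of Kibana instances needed to keep up with the given rules.

    Hypothetical helper: treats each rule execution as exactly one task.
    """
    required_per_minute = rule_count * (60 / rule_interval_s)
    capacity = tasks_per_minute(**task_manager_settings)
    return max(1, math.ceil(required_per_minute / capacity))


if __name__ == "__main__":
    print("tasks/minute per instance:", tasks_per_minute())
    print("tasks/hour per instance:", tasks_per_minute() * 60)
    print("instances for 1000 rules @ 1m:", kibana_instances_needed(1000, 60))
```

Under these assumptions, raising `max_workers` or lowering `poll_interval` increases the ceiling proportionally, at the cost of more load on Kibana and Elasticsearch; the task manager health output is the place to check whether the real system is anywhere near that ceiling.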
@pmuellr pmuellr added Feature:Alerting Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) docs labels Aug 9, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@gmmorris gmmorris added the loe:large Large Level of Effort label Aug 11, 2021
@gmmorris gmmorris added the estimate:medium Medium Estimated Level of Effort label Aug 18, 2021
@gmmorris gmmorris removed the loe:large Large Level of Effort label Sep 2, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
5 participants