From b224343a4cd78f6a94f876880f3991edb4705b9c Mon Sep 17 00:00:00 2001
From: Ravi Kesarwani <64450378+ravikesarwani@users.noreply.github.com>
Date: Tue, 3 Aug 2021 10:30:55 -0400
Subject: [PATCH] Update SM doc for alert per object (#107420)

Update stack monitoring doc to account for alert notification now being sent for each node, index, or cluster based on the rule type, instead of always per cluster (PR# 102544)
---
 docs/user/monitoring/kibana-alerts.asciidoc | 27 ++++++++-------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/docs/user/monitoring/kibana-alerts.asciidoc b/docs/user/monitoring/kibana-alerts.asciidoc
index 837248e0cf41d..beaae1fdb71b6 100644
--- a/docs/user/monitoring/kibana-alerts.asciidoc
+++ b/docs/user/monitoring/kibana-alerts.asciidoc
@@ -32,17 +32,15 @@ To review and modify all available rules, click *Enter setup mode* on the
 
 This rule checks for {es} nodes that run a consistently high CPU load.
 By default, the condition is set at 85% or more averaged over the last 5 minutes.
-The rule is grouped across all the nodes of the cluster by running checks on a
-schedule time of 1 minute with a re-notify interval of 1 day.
+The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.
 
 [discrete]
 [[kibana-alerts-disk-usage-threshold]]
 == Disk usage threshold
 
 This rule checks for {es} nodes that are nearly at disk capacity. By default,
-the condition is set at 80% or more averaged over the last 5 minutes. The rule
-is grouped across all the nodes of the cluster by running checks on a schedule
-time of 1 minute with a re-notify interval of 1 day.
+the condition is set at 80% or more averaged over the last 5 minutes. The default rule
+checks on a schedule time of 1 minute with a re-notify interval of 1 day.
 
 [discrete]
 [[kibana-alerts-jvm-memory-threshold]]
@@ -50,16 +48,14 @@ time of 1 minute with a re-notify interval of 1 day.
 
 This rule checks for {es} nodes that use a high amount of JVM memory.
 By default, the condition is set at 85% or more averaged over the last 5 minutes.
-The rule is grouped across all the nodes of the cluster by running checks on a
-schedule time of 1 minute with a re-notify interval of 1 day.
+The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.
 
 [discrete]
 [[kibana-alerts-missing-monitoring-data]]
 == Missing monitoring data
 
 This rule checks for {es} nodes that stop sending monitoring data. By default,
-the condition is set to missing for 15 minutes looking back 1 day. The rule is
-grouped across all the {es} nodes of the cluster by running checks on a schedule
+the condition is set to missing for 15 minutes looking back 1 day. The default rule checks on a schedule
 time of 1 minute with a re-notify interval of 6 hours.
 
 [discrete]
@@ -67,9 +63,8 @@ time of 1 minute with a re-notify interval of 6 hours.
 == Thread pool rejections (search/write)
 
 This rule checks for {es} nodes that experience thread pool rejections. By
-default, the condition is set at 300 or more over the last 5 minutes. The rule
-is grouped across all the nodes of the cluster by running checks on a schedule
-time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
+default, the condition is set at 300 or more over the last 5 minutes. The default rule
+checks on a schedule time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
 independently for `search` and `write` type rejections.
 
 [discrete]
@@ -78,8 +73,7 @@ independently for `search` and `write` type rejections.
 
 This rule checks for read exceptions on any of the replicated {es} clusters. The
 condition is met if 1 or more read exceptions are detected in the last hour. The
-rule is grouped across all replicated clusters by running checks on a schedule
-time of 1 minute with a re-notify interval of 6 hours.
+default rule checks on a schedule time of 1 minute with a re-notify interval of 6 hours.
 
 [discrete]
 [[kibana-alerts-large-shard-size]]
@@ -87,9 +81,8 @@ time of 1 minute with a re-notify interval of 6 hours.
 
 This rule checks for a large average shard size (across associated primaries) on
 any of the specified index patterns in an {es} cluster. The condition is met if
-an index's average shard size is 55gb or higher in the last 5 minutes. The rule
-is grouped across all indices that match the default pattern of `-.*` by running
-checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
+an index's average shard size is 55gb or higher in the last 5 minutes. The default rule
+matches the pattern of `-.*` by running checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
 
 [discrete]
 [[kibana-alerts-cluster-alerts]]