Update SM doc for alert per object #107420

Merged 1 commit on Aug 3, 2021
27 changes: 10 additions & 17 deletions docs/user/monitoring/kibana-alerts.asciidoc
@@ -32,44 +32,39 @@ To review and modify all available rules, click *Enter setup mode* on the

This rule checks for {es} nodes that run a consistently high CPU load. By
default, the condition is set at 85% or more averaged over the last 5 minutes.
-The rule is grouped across all the nodes of the cluster by running checks on a
-schedule time of 1 minute with a re-notify interval of 1 day.
+The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-disk-usage-threshold]]
== Disk usage threshold

This rule checks for {es} nodes that are nearly at disk capacity. By default,
-the condition is set at 80% or more averaged over the last 5 minutes. The rule
-is grouped across all the nodes of the cluster by running checks on a schedule
-time of 1 minute with a re-notify interval of 1 day.
+the condition is set at 80% or more averaged over the last 5 minutes. The default rule
+checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-jvm-memory-threshold]]
== JVM memory threshold

This rule checks for {es} nodes that use a high amount of JVM memory. By
default, the condition is set at 85% or more averaged over the last 5 minutes.
-The rule is grouped across all the nodes of the cluster by running checks on a
-schedule time of 1 minute with a re-notify interval of 1 day.
+The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-missing-monitoring-data]]
== Missing monitoring data

This rule checks for {es} nodes that stop sending monitoring data. By default,
-the condition is set to missing for 15 minutes looking back 1 day. The rule is
-grouped across all the {es} nodes of the cluster by running checks on a schedule
+the condition is set to missing for 15 minutes looking back 1 day. The default rule checks on a schedule
time of 1 minute with a re-notify interval of 6 hours.

[discrete]
[[kibana-alerts-thread-pool-rejections]]
== Thread pool rejections (search/write)

This rule checks for {es} nodes that experience thread pool rejections. By
-default, the condition is set at 300 or more over the last 5 minutes. The rule
-is grouped across all the nodes of the cluster by running checks on a schedule
-time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
+default, the condition is set at 300 or more over the last 5 minutes. The default rule
+checks on a schedule time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
independently for `search` and `write` type rejections.

[discrete]
@@ -78,18 +73,16 @@ independently for `search` and `write` type rejections.

This rule checks for read exceptions on any of the replicated {es} clusters. The
condition is met if 1 or more read exceptions are detected in the last hour. The
-rule is grouped across all replicated clusters by running checks on a schedule
-time of 1 minute with a re-notify interval of 6 hours.
+default rule checks on a schedule time of 1 minute with a re-notify interval of 6 hours.

[discrete]
[[kibana-alerts-large-shard-size]]
== Large shard size

This rule checks for a large average shard size (across associated primaries) on
any of the specified index patterns in an {es} cluster. The condition is met if
-an index's average shard size is 55gb or higher in the last 5 minutes. The rule
-is grouped across all indices that match the default pattern of `-.*` by running
-checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
+an index's average shard size is 55gb or higher in the last 5 minutes. The default rule
+matches the pattern of `-.*` by running checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
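
Editorial note: every rule in this diff pairs a 1-minute check schedule with a much longer re-notify interval. The sketch below is a conceptual model of how those two settings might interact, assuming the condition stays met on every check. It is not Kibana's implementation; `should_notify` and `simulate` are hypothetical names introduced only for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_notify(now: datetime,
                  last_notified: Optional[datetime],
                  renotify_interval: timedelta) -> bool:
    """Hypothetical helper: allow a notification if none has been sent yet,
    or if at least `renotify_interval` has passed since the last one."""
    if last_notified is None:
        return True
    return now - last_notified >= renotify_interval


def simulate(duration: timedelta,
             check_interval: timedelta,
             renotify_interval: timedelta) -> int:
    """Count notifications sent over `duration`, assuming (for illustration)
    that the rule condition is met on every scheduled check."""
    start = datetime(2021, 8, 3)
    now = start
    last_notified: Optional[datetime] = None
    sent = 0
    while now < start + duration:
        if should_notify(now, last_notified, renotify_interval):
            sent += 1
            last_notified = now
        now += check_interval
    return sent


if __name__ == "__main__":
    # Checks every minute; re-notify intervals taken from the doc text above.
    for label, interval in [("1 day", timedelta(days=1)),
                            ("12 hours", timedelta(hours=12)),
                            ("6 hours", timedelta(hours=6))]:
        count = simulate(timedelta(days=1), timedelta(minutes=1), interval)
        print(f"re-notify every {label}: {count} notification(s) in one day")
```

Under this model the check schedule controls how quickly a condition is detected, while the re-notify interval limits how often a condition that stays met produces another notification.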

[discrete]
[[kibana-alerts-cluster-alerts]]