Monitoring of Linux MD-RAID devices (aka Linux Software RAID).
There are user parameters with custom low-level discovery rules for detecting and monitoring the available array devices (MD) as well as the underlying component devices (RD). The sysfs md interface is used for discovery and monitoring of the device parameters.
This template is part of RaBe's Zabbix template and helpers collection.
- Install the
rabe.md-raid.conf
Zabbix user parameters into your Zabbix agent'sInclude
directory (usually/etc/zabbix/zabbix_agentd.d
). - Import the
Template_App_MD-RAID_active.xml
into your Zabbix server (click on theRaw
button to download). - Add the template to you host (or stack template)
- Check if new data arrives
Application template for monitoring Linux multi device RAID (MD-RAID).
Low-Level discovery of component devices (RD). Mapping between raid devices and their corresponding component devices.
Returns the following macros for each available RD device:
{#MD_RAID_RAID_DEV_NAME}
- RAID device (MD) name
- Example:
md0
{#MD_RAID_COMPONENT_DEV_NAME}
- Component device (RD) name
- Example:
rd1
- Block device name of MD $1 RD $2 device (
rabe.raid.md.component-device.block-dev[{#MD_RAID_RAID_DEV_NAME},{#MD_RAID_COMPONENT_DEV_NAME}]
)
Block device name of a specific component device (RD), according to the symlink target of/sys/block/<MD-NAME>/md/<RD-NAME>/block
. - Read errors of MD {#MD_RAID_RAID_DEV_NAME} RD {#MD_RAID_COMPONENT_DEV_NAME} device (
vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/{#MD_RAID_COMPONENT_DEV_NAME}/errors]
)
Count of read errors of a specific component device (RD), according to/sys/block/<MD-NAME>/md/<RD-NAME>/errors
. - State of MD {#MD_RAID_RAID_DEV_NAME} RD {#MD_RAID_COMPONENT_DEV_NAME} device (
vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/{#MD_RAID_COMPONENT_DEV_NAME}/state]
)
The current state of a specific component device (RD), according to/sys/block/<MD-NAME>/md/<RD-NAME>/state
.
- High: RAID component device MD {#MD_RAID_RAID_DEV_NAME} RD {#MD_RAID_COMPONENT_DEV_NAME} is in {ITEM.VALUE1} state on {HOST.NAME}
The component device (RD)
{Template App MD-RAID active:vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/{#MD_RAID_COMPONENT_DEV_NAME}/state].str(in_sync)}=0
{#MD_RAID_COMPONENT_DEV_NAME}
is not a fully in-sync member of the array of the RAID device (MD){#MD_RAID_RAID_DEV_NAME}
.
Low-Level discovery of RAID devices (MD)
Returns the following macros for each available MD device:
{#MD_RAID_RAID_DEV_NAME{}
- RAID device name
- Example:
md0
- Array state of MD {#MD_RAID_RAID_DEV_NAME} device (
vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/array_state]
)
The current array state of a specific raid device (MD), according to/sys/block/<MD-NAME>/md/array_state
. - Number of degraded devices within MD {#MD_RAID_RAID_DEV_NAME} device (
vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/degraded]
)
The number of degraded devices within a raid device (MD), according to/sys/block/<MD-NAME>/md/degraded
. - RAID level of MD {#MD_RAID_RAID_DEV_NAME} device (
vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/level]
)
RAID level of a specific raid device (MD), according to/sys/block/<MD-NAME>/md/level
. - Number of devices within MD {#MD_RAID_RAID_DEV_NAME} device (
vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/raid_disks]
)
The number of devices within a raid device (MD), according to/sys/block/<MD-NAME>/md/raid_disks
. - Sync action of MD {#MD_RAID_RAID_DEV_NAME} device (
vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/sync_action]
)
The current sync action (for rebuild or redundancy check processes) of a specific raid device (MD), according to/sys/block/<MD-NAME>/md/sync_action
.
- High: RAID array device MD {#MD_RAID_RAID_DEV_NAME} has {ITEM.VALUE1} degraded device(s) on {HOST.NAME}
The raid device (MD)
{Template App MD-RAID active:vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/degraded].last()}>0
{#MD_RAID_COMPONENT_DEV_NAME}
has one or more degraded component devices within its array. - High: RAID array device MD {#MD_RAID_RAID_DEV_NAME} has failed on {HOST.NAME}
The raid device (MD)
{Template App MD-RAID active:vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/level].str(faulty)}=1
{#MD_RAID_COMPONENT_DEV_NAME}
has failed. - Information: RAID array device MD {#MD_RAID_RAID_DEV_NAME} is in {ITEM.VALUE1} sync action on {HOST.NAME}
The raid device (MD)
{Template App MD-RAID active:vfs.file.contents[/sys/block/{#MD_RAID_RAID_DEV_NAME}/md/sync_action].str(idle)}=0
{#MD_RAID_COMPONENT_DEV_NAME}
currently runs a re-sync, recover or redundancy check/repair action.
The following user parameters are available within
rabe.md-raid.conf
(including some
in-depth parameter description):
Key | Description |
---|---|
rabe.raid.md.raid-device.discovery |
Discovery rule for getting a list of all raid devices (MD) |
rabe.raid.md.component-device.discovery |
Discovery rule for getting a list of all component devices (RD) |
rabe.raid.md.component-device.block-dev[<md device name>,<rd device name>] |
Block device name of a specific component device (RD) |
This template is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, version 3 of the License.
Copyright (c) 2017 - 2019 Radio Bern RaBe