
[BUG] Can't dismiss non-critical scrutiny warning for UltraDMA CRC Error Count #553

Open
sammcj opened this issue Dec 3, 2023 · 1 comment
Labels
bug Something isn't working


sammcj commented Dec 3, 2023

Describe the bug

Scrutiny constantly alerts on UltraDMA CRC Error Count for one drive that has a value of 20. The errors were caused by a bad SATA connector, which has since been replaced.

Two things with this:

  1. There seems to be no way to acknowledge or dismiss the alert.
  2. This is not a critical error. UDMA CRC errors usually occur with unreliable SATA cables or a controller that's having issues. The hard drive records them in its log, but it will not fail a SMART test because of them, as the problem is not usually with the drive itself.
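
To check the counter independently of Scrutiny, it can be read from the same `smartctl --xall --json` output the collector gathers. This is a rough sketch, not Scrutiny's code: it assumes the attribute appears as ID 199 under `ata_smart_attributes.table`, as smartctl reports it for ATA drives, and that the 20 mentioned above is the raw value. The helper name and the sample payload are made up for illustration.

```python
import json

# Standard ATA SMART attribute ID for UDMA_CRC_Error_Count.
UDMA_CRC_ATTRIBUTE_ID = 199


def udma_crc_raw_value(smartctl_json):
    """Return the raw UDMA CRC error count from smartctl JSON, or None if absent.

    Hypothetical helper for illustration; not part of Scrutiny or smartctl.
    """
    report = json.loads(smartctl_json)
    table = report.get("ata_smart_attributes", {}).get("table", [])
    for attribute in table:
        if attribute.get("id") == UDMA_CRC_ATTRIBUTE_ID:
            return attribute.get("raw", {}).get("value")
    return None


# Made-up, heavily trimmed payload; real output contains many more fields.
sample = json.dumps({
    "smart_status": {"passed": True},
    "ata_smart_attributes": {
        "table": [
            {"id": 199, "name": "UDMA_CRC_Error_Count", "raw": {"value": 20}},
        ]
    },
})

print(udma_crc_raw_value(sample))
```

NVMe devices do not report this attribute table, so the helper returns `None` for them.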

Expected behaviour

The ability to dismiss an alert or monitored attribute in the web interface.

or

UDMA CRC errors should be treated as warnings or, even better, should only alert or warn if they occurred within a given time period.
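
The time-period idea could look roughly like the sketch below: alert only when the counter has grown inside a recent window, rather than whenever it is non-zero. This is not how Scrutiny currently works and none of these names exist in it; the function names, the seven-day window, and the sample readings are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def crc_errors_in_window(samples, now, window):
    """Return how much the counter grew during the last `window`.

    `samples` is a list of (timestamp, raw_value) pairs sorted by timestamp.
    Hypothetical helper; the data layout is an assumption.
    """
    cutoff = now - window
    baseline = None
    latest = None
    for timestamp, value in samples:
        if timestamp <= cutoff:
            # Last reading at or before the start of the window.
            baseline = value
        elif timestamp <= now:
            if baseline is None:
                # No older history: use the first in-window reading as baseline.
                baseline = value
            latest = value
    if latest is None:
        return 0
    # Clamp at zero in case the counter was reset.
    return max(latest - baseline, 0)


def should_alert(samples, now, window=timedelta(days=7)):
    """Alert only if new CRC errors appeared within the window."""
    return crc_errors_in_window(samples, now, window) > 0


# Made-up history: errors accumulated in the past, nothing new since the cable swap.
now = datetime(2023, 12, 3)
stable = [
    (datetime(2023, 10, 1), 0),
    (datetime(2023, 11, 1), 20),
    (datetime(2023, 12, 1), 20),
    (datetime(2023, 12, 3), 20),
]
# Same drive if three new errors had shown up in the last week.
growing = stable[:3] + [(datetime(2023, 12, 3), 23)]

print(should_alert(stable, now))
print(should_alert(growing, now))
```

With this approach a drive stuck at 20 after a cable replacement stops alerting once the window has passed, while a counter that keeps climbing still gets flagged.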

Screenshots

[Screenshots: SCR-20231204-hlij, SCR-20231204-hlft, SCR-20231204-hlkw]

Log Files

root@63ea50086c60:/opt/scrutiny/bin# ./scrutiny-collector-metrics run
2023/12/03 20:44:28 Loading configuration file: /opt/scrutiny/config/collector.yaml

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
AnalogJ/scrutiny/metrics                                dev-0.7.2

INFO[0000] Verifying required tools                      type=metrics
INFO[0000] Executing command: smartctl --scan --json     type=metrics
INFO[0000] Executing command: smartctl --info --json /dev/sdd  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json /dev/sde  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json /dev/sdf  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json --device nvme /dev/nvme0  type=metrics
INFO[0000] Using WWN Fallback                            type=metrics
INFO[0000] Executing command: smartctl --info --json --device nvme /dev/nvme1  type=metrics
INFO[0000] Using WWN Fallback                            type=metrics
INFO[0000] Executing command: smartctl --info --json /dev/sda  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json /dev/sdb  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json /dev/sdc  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json --device nvme /dev/nvme3  type=metrics
INFO[0000] Using WWN Fallback                            type=metrics
INFO[0000] Executing command: smartctl --info --json --device sat /dev/sdh  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json --device sat /dev/sdi  type=metrics
INFO[0000] Generating WWN                                type=metrics
INFO[0000] Executing command: smartctl --info --json --device nvme /dev/nvme2  type=metrics
INFO[0000] Using WWN Fallback                            type=metrics
INFO[0000] Sending detected devices to API, for filtering & validation  type=metrics
INFO[0000] Collecting smartctl results for sdd           type=metrics
INFO[0000] Executing command: smartctl --xall --json /dev/sdd  type=metrics
INFO[0000] Publishing smartctl results for 0x5000cca267ebed01  type=metrics
INFO[0000] Collecting smartctl results for sde           type=metrics
INFO[0000] Executing command: smartctl --xall --json /dev/sde  type=metrics
INFO[0000] Publishing smartctl results for 0x5000cca267ebe3fa  type=metrics
INFO[0000] Collecting smartctl results for sdf           type=metrics
INFO[0000] Executing command: smartctl --xall --json /dev/sdf  type=metrics
ERRO[0001] smartctl returned an error code (64) while processing sdf  type=metrics
ERRO[0001] smartctl detected a error log with errors     type=metrics
INFO[0001] Publishing smartctl results for 0x5000cca267eb4cf3  type=metrics
INFO[0001] Collecting smartctl results for nvme0         type=metrics
INFO[0001] Executing command: smartctl --xall --json --device nvme /dev/nvme0  type=metrics
INFO[0001] Publishing smartctl results for adc6n448112806d1g  type=metrics
INFO[0001] Collecting smartctl results for nvme1         type=metrics
INFO[0001] Executing command: smartctl --xall --json --device nvme /dev/nvme1  type=metrics
INFO[0001] Publishing smartctl results for nhg456r007277p2202  type=metrics
INFO[0001] Collecting smartctl results for sda           type=metrics
INFO[0001] Executing command: smartctl --xall --json /dev/sda  type=metrics
INFO[0001] Publishing smartctl results for 0x5002538e7001e594  type=metrics
INFO[0001] Collecting smartctl results for sdb           type=metrics
INFO[0001] Executing command: smartctl --xall --json /dev/sdb  type=metrics
INFO[0001] Publishing smartctl results for 0x5002538c4046e23f  type=metrics
INFO[0001] Collecting smartctl results for sdc           type=metrics
INFO[0001] Executing command: smartctl --xall --json /dev/sdc  type=metrics
ERRO[0001] smartctl returned an error code (64) while processing sdc  type=metrics
ERRO[0001] smartctl detected a error log with errors     type=metrics
INFO[0001] Publishing smartctl results for 0x5000cca267ebc2a6  type=metrics
INFO[0005] Collecting smartctl results for nvme3         type=metrics
INFO[0005] Executing command: smartctl --xall --json --device nvme /dev/nvme3  type=metrics
INFO[0005] Publishing smartctl results for 0000303235369  type=metrics
INFO[0005] Collecting smartctl results for sdh           type=metrics
INFO[0005] Executing command: smartctl --xall --json --device sat /dev/sdh  type=metrics
ERRO[0005] smartctl returned an error code (4) while processing sdh  type=metrics
ERRO[0005] smartctl detected a checksum error            type=metrics
INFO[0005] Publishing smartctl results for 0x500a0751e1352b0f  type=metrics
INFO[0005] Collecting smartctl results for sdi           type=metrics
INFO[0005] Executing command: smartctl --xall --json --device sat /dev/sdi  type=metrics
ERRO[0005] smartctl returned an error code (4) while processing sdi  type=metrics
ERRO[0005] smartctl detected a checksum error            type=metrics
INFO[0005] Publishing smartctl results for 0x500a0751e1352c3c  type=metrics
INFO[0005] Collecting smartctl results for nvme2         type=metrics
INFO[0005] Executing command: smartctl --xall --json --device nvme /dev/nvme2  type=metrics
INFO[0005] Publishing smartctl results for 2252ag440805  type=metrics
INFO[0005] Main: Completed                               type=metrics
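
For reading the `ERRO` lines above: smartctl's exit status is a bitmask, and the codes in this log (64 and 4) each have a single bit set. The sketch below decodes a status using the bit meanings from the smartctl man page, paraphrased. The function name is made up, and this is not the collector's own code.

```python
# Bit meanings paraphrased from the "EXIT STATUS" section of the smartctl man page.
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "a SMART or ATA command failed, or a SMART data structure had a checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes are at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_smartctl_exit(code):
    """Return the descriptions of every bit set in a smartctl exit status."""
    return [
        description
        for bit, description in SMARTCTL_EXIT_BITS.items()
        if code & (1 << bit)
    ]


# The two codes that appear in the log above.
for code in (64, 4):
    print(code, decode_smartctl_exit(code))
```

Neither code sets bit 3, the one that reports a failing SMART status.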

zwimer commented Nov 22, 2024

This would change several of my drives from failed to passing, so it'd be a nice feature. As a way to implement it, this comment seems apt: #364 (comment)
