Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing of attribute 188 on seagate drives #527

Merged
merged 1 commit into from
Oct 17, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 27 additions & 2 deletions webapp/backend/pkg/thresholds/ata_attribute_metadata.go
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
package thresholds

import (
"strconv"
"strings"
)

const AtaSmartAttributeDisplayTypeRaw = "raw"
const AtaSmartAttributeDisplayTypeNormalized = "normalized"
const AtaSmartAttributeDisplayTypeTransformed = "transformed"
Expand Down Expand Up @@ -662,13 +667,33 @@ var AtaMetadata = map[int]AtaAttributeMetadata{
188: {
ID: 188,
DisplayName: "Command Timeout",
DisplayType: AtaSmartAttributeDisplayTypeRaw,
DisplayType: AtaSmartAttributeDisplayTypeTransformed,
Ideal: ObservedThresholdIdealLow,
Critical: true,
Description: "The count of aborted operations due to HDD timeout. Normally this attribute value should be equal to zero.",
Transform: func(normValue int64, rawValue int64, rawString string) int64 {
// Parse Seagate command timeout values if the string contains 3 pieces
// and each piece is less than or equal to the next (as a sanity check)
// See https://github.com/AnalogJ/scrutiny/issues/522
pieces := strings.Split(rawString, " ")
if len(pieces) == 3 {
int_pieces := make([]int, len(pieces))
var err error
for i, s := range pieces {
int_pieces[i], err = strconv.Atoi(s)
if err != nil {
return rawValue
}
}
if int_pieces[2] >= int_pieces[1] && int_pieces[1] >= int_pieces[0] {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is slightly flawed. The first number indicates total command timeouts, the second indicates all commands that took over 5 seconds to complete, and the third all commands that took over 7.5 seconds to complete.

See:

Raw Usage

Raw [1 - 0] = Total # of command timeouts, with Max hold of FFFFh

Raw [3 - 2] = Total # of commands with > 5 second completion, including those > 7.5 seconds

Raw [5 - 4] = Total # of commands with > 7.5 second completion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually correct. When the string gets split, the indices are backwards from the byte order, and the int pieces variable uses that same backwards ordering

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaysond Thank you for the clarification!

return int64(int_pieces[2])
}
}
return rawValue
},
ObservedThresholds: []ObservedThreshold{
{
Low: 0,
Low: 0,
// This is set arbitrarily to avoid notifications caused by low
// historical numbers of command timeouts (e.g. caused by a bad cable)
High: 100,
Expand Down