Discussing correlation vs. causation with BlackBlaze data #275

ViRb3 · 2022-06-02T17:10:11Z

I just installed Scrutiny and set it up for my 6 external HDDs. Some of them are more than 5 years old; others are brand new. I noticed that two of the old disks were marked as failed:

Disk 1:

Disk 2:

This looks pretty scary, given the high failure rate and its description:

However, it made me think. Does the failure rate mean that 20% of the confirmed failed disks in Backblaze's dataset had this attribute at or worse than my value? And does it take into account how many of the healthy disks had this attribute at the same value? If we only look at the failed disks, we're assuming that correlation is causation, which may be wrong. Ideally, I believe we'd want to report the difference between the healthy and failed disks instead. This may be what you're currently doing, but I have no clue, so please excuse any assumptions I made here.
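To make the distinction in the question concrete, here is a minimal sketch of why the share of failed disks showing an attribute value says little on its own. It applies Bayes' rule to turn "fraction of failed disks with this value" into "probability that a disk with this value fails". All numbers and function names are hypothetical and chosen for illustration; they are not taken from Backblaze's dataset or from Scrutiny's code.

```python
def posterior_failure_probability(p_flag_given_failed, p_flag_given_healthy, base_failure_rate):
    """Return P(failed | flagged) via Bayes' rule.

    p_flag_given_failed:  fraction of failed disks with the attribute at or worse than the value
    p_flag_given_healthy: fraction of healthy disks with the attribute at or worse than the value
    base_failure_rate:    fraction of all disks that failed
    """
    p_flag = (p_flag_given_failed * base_failure_rate
              + p_flag_given_healthy * (1.0 - base_failure_rate))
    if p_flag == 0:
        raise ValueError("the attribute value was never observed")
    return p_flag_given_failed * base_failure_rate / p_flag


# Hypothetical inputs: 20% of failed disks show the value, and 2% of all disks fail.
P_FLAG_GIVEN_FAILED = 0.20
BASE_FAILURE_RATE = 0.02

# Compare two hypothetical worlds that differ only in how common the value is on healthy disks.
for p_flag_given_healthy in (0.01, 0.20):
    posterior = posterior_failure_probability(
        P_FLAG_GIVEN_FAILED, p_flag_given_healthy, BASE_FAILURE_RATE
    )
    print(f"P(flag | healthy) = {p_flag_given_healthy:.2f} -> P(failed | flag) = {posterior:.3f}")
```

With these made-up inputs, the same "20% of failed disks" figure corresponds to roughly a 29% failure probability when healthy disks rarely show the value, and to the plain 2% base rate when healthy disks show it just as often. The attribute is only informative to the extent that failed and healthy disks differ.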

Thanks a lot!

EDIT: Could you please share the source of the Backblaze data?

shamoon · 2022-06-02T17:47:50Z

I totally agree with you (#187 (comment)); see my quote from the Backblaze data there (and there's a link there). This is correlation, not causation, and I think displaying it as "Failed" is kind of misleading. At best it's a "warning" or a "red flag" that something might happen, not that it has. That's why Backblaze doesn't use most of these data points for its decisions about replacing drives, etc.

AnalogJ · 2022-06-02T20:23:17Z

Yeah, as @shamoon mentioned, there's been a lot of concern about how Backblaze data is used within Scrutiny.

I'm working on some changes to the failure detection so that it'll be configurable in the UI, and you can selectively enable/disable the Backblaze-based failures and the thresholds.

jeroengui · 2022-06-06T12:50:25Z

Yeah, same issue here. Brand new server-grade drive, and the application shows me that the drive is in status "failed".

AnalogJ · 2022-06-10T05:54:31Z

Just wanted to give everyone an update on the status of this issue.

There are currently two tasks I'm working on:

Update the details page UI to display the SMART status separately from the Scrutiny status.
Create a setting which allows users to enable/disable Scrutiny analysis. (released in v0.5.0)

The first task is partially complete. Here's what it currently looks like:

The Smart status and Scrutiny status are differentiated in the expanding details panel.

This is still a prototype. I think it works well, but I'd love to hear your thoughts.

Parlane · 2022-06-14T20:49:44Z

> Yeah, same issue here. Brand new server grade drive. And the application shows me that the drive is in status "failed"

Is that a Seagate drive? If so, it's probably related to #255, which is fixed in master so the drive no longer shows as failed.

AnalogJ · 2022-08-04T15:40:15Z

Took an incredibly long time, but as of v0.5.0 this functionality is now available in Scrutiny! 🥳

On the dashboard settings panel, you can now change the "Device Status - Thresholds" between Smart, Scrutiny and Both. By default this is set to Both.

When changed to Smart, only the output of smartctl is relevant; all other Scrutiny/Backblaze-detected failures/warnings are ignored (in notifications & UI).
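As a rough illustration of how such a setting could combine the two signals, here is a minimal sketch. The names `ThresholdMode` and `device_failed` are hypothetical and are not Scrutiny's actual code. The Smart branch follows the description above; the Scrutiny-only branch and the "either signal fails the device" reading of Both are assumptions.

```python
from enum import Enum


class ThresholdMode(Enum):
    """Hypothetical stand-in for the "Device Status - Thresholds" setting."""
    SMART = "smart"
    SCRUTINY = "scrutiny"
    BOTH = "both"


def device_failed(smart_failed: bool, scrutiny_failed: bool,
                  mode: ThresholdMode = ThresholdMode.BOTH) -> bool:
    """Decide whether a device is reported as failed under the chosen mode."""
    if mode is ThresholdMode.SMART:
        # Only the smartctl result counts; Scrutiny/Backblaze detections are ignored.
        return smart_failed
    if mode is ThresholdMode.SCRUTINY:
        # Assumption: the mirror image of the Smart mode.
        return scrutiny_failed
    # Assumption: in Both mode, either signal marks the device as failed.
    return smart_failed or scrutiny_failed


# A drive that passes smartctl but trips a Backblaze-derived threshold:
print(device_failed(smart_failed=False, scrutiny_failed=True, mode=ThresholdMode.SMART))
print(device_failed(smart_failed=False, scrutiny_failed=True, mode=ThresholdMode.BOTH))
```

Under these assumptions, the brand new drives reported earlier in the thread would stop showing as failed once the setting is switched to Smart, provided smartctl itself reports them as healthy.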

The description and UI for this functionality may be enhanced in the coming releases, but it is functional and working.
I'd appreciate it if you could pull down the latest image, test it out and provide any feedback you may have!

Appreciate everyone's patience - this has been a long time coming.

This was referenced Jun 15, 2022

[FEAT] Notify on Critical Only #300 (Closed)
[FEAT] Ability to reset the "CRC Error Count" error level. #309 (Closed)

This was referenced Jul 10, 2022

[FEAT] Can we have more than a single FAILED status indicator (this can be missleading) #311 (Closed)
[BUG] Lots of disks marked as "failed" #336 (Open)

AnalogJ mentioned this issue Aug 3, 2022

pre v0.5.0 #352 (Merged)

AnalogJ closed this as completed in #352 Aug 4, 2022

This was referenced Nov 30, 2022

Add scrutiny widget gethomepage/homepage#548 (Merged)
[Feature Request] Add type select for Scrutiny Widget gethomepage/homepage#596 (Closed)
