[ssd-health] Add support Transcend ssd-health. #18247

Open
wants to merge 1 commit into
base: master
from

Conversation

ec-michael-shih
Contributor

Why I did it

Our DUT uses Transcend SSDs with the following model names:

  • TS64ZBTMM1600
  • TS32XBTMM1600

We found an issue when using the default parsing method: smartctl {} -a
The result is shown below:

Wrong health percentage and wrong temperature:

root@sonic:~# ssdutil
Device Model : TS32XBTMM1600
Health       : N/A
Temperature  : 100C

How I did it

The issue is not that smartctl {} -a reads wrong information; the root cause is that the generic parsing method reads the wrong fields from its output.
The correct information is available through another tool, Transcend's scopepro.
For example:

root@sonic:~# scopepro -all /dev/sda
scopepro-cli 1.21 2023/11/24
Copyright (c) 2021-24, Transcend information, Inc. All rights reserved.

[/dev/sda]
---------- Disk Information ----------
Model                   :TS32XBTMM1600
FW Version              :O0918B
Serial No               :F318410080
Support Interface       :SATA
---------------- S.M.A.R.T Information ----------------
01 Read Error Rate      0
05 Reallocated Sectors Count    0
09 Power-On Hour Count  2295
0C Power Cycle Count    2580
A0 Uncorrectable sectors count when read/write  0
A1 Number of Valid Spare Blocks 56
A3 Number of Initial Invalid Blocks     12
A4 Total Erase Count    924315
A5 Maximum Erase Count  931
A6 Minimum Erase Count  831
A7 Average Erase Count  898
A8 Max Erase Count of Spec      3000
A9 Remain Life (percentage)     71
AF Program fail count in worst die      0
B0 Erase fail count in worst die        0
B1 Total Wear Level Count       481
B2 Runtime Invalid Block Count  0
B5 Total Program Fail Count     0
B6 Total Erase Fail Count       0
C0 Power-Off Retract Count      59
C2 Controlled Temperature       40
C3 Hardware ECC Recovered       1668
C4 Reallocation Event Count     0
C5 Current Pending Sector Count 0
C6 Uncorrectable Error Count Off-Line   0
C7 Ultra DMA CRC Error Count    0
E8 Available Reserved Space     100
F1 Total LBA Written (each write unit=32MB)     671697
F2 Total LBA Read (each read unit=32MB) 393165
F5 Flash Write Sector Count     924315
---------------- Health Information ----------------
Health Percentage: 71%
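
The attribute rows in the output above follow a regular shape (a two-digit hex ID, a name, then an integer value), so they can be read into a lookup table. The following is only a minimal sketch of that idea, not the parsing code in this PR; the function name parse_smart_rows and the regular expression are assumptions made for illustration.

```python
import re

# Matches rows such as "A9 Remain Life (percentage)     71":
# a two-digit hex attribute ID, a free-form name, and a trailing integer.
SMART_ROW = re.compile(r"^([0-9A-F]{2})\s+(.+?)\s+(\d+)\s*$")


def parse_smart_rows(output):
    """Return {attribute_id: (name, value)} for each S.M.A.R.T row found.

    Hypothetical helper: header, banner, and "Health Percentage" lines do
    not match the pattern and are skipped.
    """
    attrs = {}
    for line in output.splitlines():
        match = SMART_ROW.match(line.strip())
        if match:
            attr_id, name, value = match.groups()
            attrs[attr_id] = (name, int(value))
    return attrs


# A few lines copied from the scopepro output shown in this PR.
sample = """\
---------------- S.M.A.R.T Information ----------------
09 Power-On Hour Count  2295
A9 Remain Life (percentage)     71
C2 Controlled Temperature       40
F1 Total LBA Written (each write unit=32MB)     671697
---------------- Health Information ----------------
Health Percentage: 71%
"""

attrs = parse_smart_rows(sample)
print(attrs["A9"])
print(attrs["C2"])
```

With a table like this, remaining life (A9) and temperature (C2) can be looked up by ID rather than by position in the output.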

How to verify it

This PR adds support for Transcend SSDs by using the scopepro tool and adding a dedicated transcend_parsing method.
With the transcend_parsing method, the correct information is reported.
For example:

root@sonic:~# show platform ssdhealth
Device Model : TS32XBTMM1600
Health       : 71%
Temperature  : 40C
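
For illustration, the three values shown above could be pulled out of the scopepro text roughly as follows. This is a hedged sketch, not the transcend_parsing method itself: the names run_scopepro and summarize, the regular expressions, and the choice of attribute C2 as the temperature source are assumptions based only on the sample output in this PR.

```python
import re
import subprocess

NOT_AVAILABLE = "N/A"


def run_scopepro(device="/dev/sda"):
    """Run the vendor tool with the arguments shown in this PR.

    Hypothetical wrapper; it needs the scopepro binary to be installed.
    """
    result = subprocess.run(["scopepro", "-all", device],
                            capture_output=True, text=True, check=False)
    return result.stdout


def _first_group(pattern, text):
    """Return the first capture group of pattern in text, or None."""
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1) if match else None


def summarize(output):
    """Extract model, health, and temperature from scopepro text output."""
    model = _first_group(r"^Model\s*:\s*(\S+)", output)
    health = _first_group(r"^Health Percentage:\s*(\d+)%", output)
    # Assumption: attribute C2 ("Controlled Temperature") is the reading
    # to report, since it matches the 40C shown in the example.
    temp = _first_group(r"^C2\s+Controlled Temperature\s+(\d+)", output)
    return {
        "model": model or NOT_AVAILABLE,
        "health": health + "%" if health else NOT_AVAILABLE,
        "temperature": temp + "C" if temp else NOT_AVAILABLE,
    }


# Lines copied from the scopepro output shown earlier in this PR.
sample = """\
Model                   :TS32XBTMM1600
C2 Controlled Temperature       40
Health Percentage: 71%
"""

info = summarize(sample)
print("Device Model : " + info["model"])
print("Health       : " + info["health"])
print("Temperature  : " + info["temperature"])
```

Falling back to N/A for any missing field keeps the summary printable even when the tool's output is incomplete.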

For details, use the --vendor option:

root@sonic:/# show platform ssdhealth --vendor
Device Model : TS32XBTMM1600
Health       : 70%
Temperature  : 39C
scopepro-cli 1.21 2023/11/24
Copyright (c) 2021-24, Transcend information, Inc. All rights reserved.

[/dev/sda]
---------- Disk Information ----------
Model                   :TS32XBTMM1600
FW Version              :O0918B
Serial No               :F318410080
Support Interface       :SATA
---------------- S.M.A.R.T Information ----------------
01 Read Error Rate      0
05 Reallocated Sectors Count    0
09 Power-On Hour Count  2299
0C Power Cycle Count    2580
A0 Uncorrectable sectors count when read/write  0
A1 Number of Valid Spare Blocks 56
A3 Number of Initial Invalid Blocks     12
A4 Total Erase Count    927779
A5 Maximum Erase Count  935
A6 Minimum Erase Count  834
A7 Average Erase Count  901
A8 Max Erase Count of Spec      3000
A9 Remain Life (percentage)     70
AF Program fail count in worst die      0
B0 Erase fail count in worst die        0
B1 Total Wear Level Count       484
B2 Runtime Invalid Block Count  0
B5 Total Program Fail Count     0
B6 Total Erase Fail Count       0
C0 Power-Off Retract Count      59
C2 Controlled Temperature       39
C3 Hardware ECC Recovered       1669
C4 Reallocation Event Count     0
C5 Current Pending Sector Count 0
C6 Uncorrectable Error Count Off-Line   0
C7 Ultra DMA CRC Error Count    0
E8 Available Reserved Space     100
F1 Total LBA Written (each write unit=32MB)     674222
F2 Total LBA Read (each read unit=32MB) 393475
F5 Flash Write Sector Count     927779
---------------- Health Information ----------------
Health Percentage: 70%

Which release branch to backport (provide reason below if selected)

N/A

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Tested on two Broadcom image versions (master):

  • 20240206.8
  • 20240205.8

Description for the changelog

Extend the SONiC ssd-health CLI commands to support Transcend SSDs by using a vendor-specific tool (scopepro) and parsing method.
For example:
show platform ssdhealth [--vendor]
or
ssdutil [--vendor]
or
scopepro -all /dev/sda
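
One way to picture the vendor-specific path is a small lookup from SSD model to the utility command. The sketch below is an assumption for illustration, not the code in this PR: the table VENDOR_UTILITY, the helper names, and especially the rule that a "TS" model prefix means Transcend are hypothetical. Only the two command strings come from this description.

```python
# Command templates taken from this PR description; "{}" is the device path.
VENDOR_UTILITY = {
    "Generic": "smartctl {} -a",
    "Transcend": "scopepro -all {}",
}


def vendor_for_model(model):
    """Guess the vendor from the model name.

    Assumption: both Transcend models listed in this PR start with "TS",
    so that prefix is used as the marker here.
    """
    return "Transcend" if model.upper().startswith("TS") else "Generic"


def command_for(model, device):
    """Build the utility command line for a given model and device."""
    return VENDOR_UTILITY[vendor_for_model(model)].format(device)


print(command_for("TS32XBTMM1600", "/dev/sda"))
print(command_for("SomeOtherModel", "/dev/sda"))
```

Keeping the generic command as the fallback means unknown models keep their current behavior.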

Link to config_db schema for YANG module changes

N/A

A picture of a cute animal (not mandatory but encouraged)

istock-646403524

Detail:
Add the Transcend tool (scopepro) for reading SSD information.

Signed-off-by: michael_shih <michael_shih@edge-core.com>
@lguohan
Collaborator

lguohan commented May 18, 2024

@prgeor , are we taking binary into pmon?

@CharlieChenEC
Contributor

Hi @lguohan ,

@prgeor , are we taking binary into pmon?

Currently, two binaries for ssd-health are copied into pmon.
Please check the related files below:

I think this PR follows the existing design to add a new ssd binary to support Transcend SSD.


3 participants