High-level Design of Storage Monitoring Daemon #1481

Merged
merged 26 commits into from
May 20, 2024
bc26917
Initial commit of ssdmon HLD
assrinivasan Sep 20, 2023
a9db382
Changes to HLD based on prgeor and staphylo review comments
assrinivasan Oct 11, 2023
7de95b2
Clarified sdutil class name. Changed case of SSD_INFO keys.
assrinivasan Oct 12, 2023
97e9167
Made minor revisions
assrinivasan Oct 12, 2023
d0395ee
Modified daemon per staphylo and prgeor comments. Renamed daemon to '…
assrinivasan Oct 20, 2023
01dc8d8
Minor revisions based on prgeor review comments
assrinivasan Oct 20, 2023
24e2dc4
Minor changes based on staphylo review comments
assrinivasan Nov 1, 2023
31cb0be
Added new class and member function logic info. Made changes to State…
assrinivasan Nov 24, 2023
8db4f8c
Mde changes per prgeor review comments. Added Class diagram.
assrinivasan Feb 9, 2024
0c53764
Made changes per prgeor review comments. Appropriately modified the c…
assrinivasan Feb 15, 2024
00935dd
Added design consideration for bind mounts and reboot script changes
assrinivasan Feb 16, 2024
82bd91b
Cleaned up grammar, other minor revisions
assrinivasan Feb 27, 2024
e6a47b7
Made changes per community review comments
assrinivasan Apr 17, 2024
6a363c5
Added design considerations for various restart/reboot scenarios
assrinivasan Apr 18, 2024
1af0935
Added core design algorithm
assrinivasan Apr 26, 2024
b5f948e
Changed FSSTATS_SYNC format
assrinivasan Apr 30, 2024
30b5823
Added YANG model and pseudo code for planned reboots/daemon crash sce…
assrinivasan May 6, 2024
56d57c1
Cleaned up YANG model
assrinivasan May 7, 2024
6632690
Added key and example for CONFIG_DB, enhanced YANG model accordingly
assrinivasan May 7, 2024
3fd9db6
Changed fsio-rw-sync invocation from reboot script to database servic…
assrinivasan May 9, 2024
1707051
Updated example of redis db output
assrinivasan May 10, 2024
5403a0f
Added a better example to diff between latest and total FSIO reads/wr…
assrinivasan May 10, 2024
26c7d08
Removed reference to non-psutil scenario. Revert FSIO script invocati…
assrinivasan May 15, 2024
34417cc
Updated facts about config_db. Updated YANG model. Cleaned up naming.
assrinivasan May 16, 2024
2a00465
Changed impl. details of FSIO sync in planned reboot scenarios
assrinivasan May 20, 2024
0d7e809
Modified Test plan language
assrinivasan May 20, 2024
Binary file added doc/ssdmond/images/SSDMOND_SequenceDiagram.png
129 changes: 129 additions & 0 deletions doc/ssdmond/ssdmond-hld.md
@@ -0,0 +1,129 @@
# SONiC SSD Daemon Design #
### Rev 0.1 ###

| Rev | Date | Author | Change Description |
|:---:|:-----------:|:------------------:|-----------------------------------|
| 0.1 | | Ashwin Srinivasan | Initial version |

## 1. Overview

This document is intended to provide a high-level design for a Solid-State Drive monitoring daemon.

Solid-State Drives (SSDs) are storage devices that use NAND-flash technology to store data. Compared to HDDs, they offer significant benefits, including better reliability, smaller size, greater energy efficiency and faster IO, which translates to faster boot times, quicker computation and improved overall system responsiveness. Like all devices, however, their performance degrades over time due to factors such as total disk writes, bad-block management, lack of free space, sub-optimal operating temperature and general wear and tear, all of which bear on the overall health of the SSD.

The goal of the SSD Monitoring Daemon (ssdmond) is to provide meaningful metrics for the aforementioned issues and to enable streaming telemetry for these attributes, so that preventative measures can be triggered when performance degrades.

## 2. Data Collection

We are interested in the following characteristics that describe various aspects of the SSD:

### **2.1 Priority 0 Attributes**

**The following are dynamic fields, offering up-to-date information that describes the current state of the SSD**

- IO Reads
- IO Writes
- Reserved Blocks Count
- Temperature

**IO Reads/Writes** - SSDs use wear-leveling algorithms to distribute write and erase cycles evenly across the NAND cells to extend their lifespan. However, write amplification can occur when data is written, rewritten, and erased in a way that creates additional write operations, which can slow down performance.

**Reserved Blocks Count** - Reserving some number of filesystem blocks for use by privileged processes is done to avoid filesystem fragmentation, and to allow system daemons, such as syslogd(8), to continue to function correctly after non-privileged processes are prevented from writing to the filesystem. Normally, the default percentage of reserved blocks is 5%.<sup>[1](#1-man-tune2fs)</sup>

**Temperature** - Extreme temperatures can affect SSD performance. Excessive heat can lead to throttling to prevent damage, while extreme cold can slow down data access.


### **2.2 Priority 1 Attributes**

**These are a combination of static (S) and dynamic (D) fields, offering secondary information that provides additional context about the SSD**

- Vendor Model (S)
- Serial Number (S)
- Firmware (S)
- Health (D)

These fields are self-explanatory.


### **2.3 `ssdmond` Daemon Flow**

0. Vendor would be responsible for configuring the following values:
- **loop timeout** - This determines how often the dynamic information would be updated. Default is 6 hours.
- **SSD vendor-specific search terms** - This would ensure that all the attributes are properly parsed from the device.

1. `ssdmond` would be started by the `pmon` docker container.
2. Once initialized, the daemon would gather the static information by leveraging the `ssdutil` utility and write it to the StateDB.
3. It would then periodically parse the priority 0 attributes, either by leveraging `ssdutil` or directly through Linux utilities, and update the StateDB.

This is detailed in the sequence diagram below:

![image.png](images/SSDMOND_SequenceDiagram.png)
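
As a rough illustration of the flow above, the loop could be structured along the lines of the following Python sketch. The function name `run_ssdmond`, its callable parameters and the `max_iterations` argument are hypothetical stand-ins introduced only for this example; the real daemon would obtain its data from `ssdutil` or Linux utilities and write to the StateDB.

```python
import time
from typing import Callable, Dict, Optional

# Default loop timeout from step 0 (6 hours), expressed in seconds.
LOOP_TIMEOUT_SECONDS = 6 * 60 * 60


def run_ssdmond(
    read_static: Callable[[], Dict[str, str]],
    read_dynamic: Callable[[], Dict[str, str]],
    write_state_db: Callable[[Dict[str, str]], None],
    loop_timeout: float = LOOP_TIMEOUT_SECONDS,
    max_iterations: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Hypothetical main loop; the three callables are placeholders."""
    # Step 2: static information is gathered once, right after init.
    write_state_db(read_static())

    # Step 3: dynamic information is refreshed once per loop timeout.
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        write_state_db(read_dynamic())
        iteration += 1
        sleep(loop_timeout)
```

Passing the readers and the writer in as callables keeps the sketch independent of any particular database client, and `max_iterations` exists only so the loop can be exercised in a test.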


NOTE: Although the `ssdutil` class already provides an abstraction that lets vendors implement their own SSD parsing logic, the [primary intent](#1-overview) of this iteration of the design is to enable streaming telemetry of these attributes, i.e., to update the StateDB with the parsed data. In the interim, the design therefore bypasses the abstraction logic in favor of this goal, with the express understanding that said abstraction will follow in a [future version](#future-work) of this daemon.


### **2.4 Data Collection Logic**

The SONiC OS already contains logic to parse information about SSDs from several vendors by way of the `ssdutil` platform utility. We leverage this utility to gather the following information:

- Priority 0: Temperature
- Priority 1: All aforementioned attributes

This section will therefore only go into detail about data collection for the remaining attributes mentioned in [section 2.1](#21-priority-0-attributes):

#### **2.4.1 IO Reads/Writes**

- grep `/proc/diskstats` for statistics about the SSD of interest
- On the matching line, read the 4th whitespace-separated value for reads completed successfully and the 8th value for writes completed<sup>[2](#2-kernelorg-procdiskstats)</sup>
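
The two steps above could be sketched in Python as follows. The function names `parse_diskstats` and `read_diskstats` are hypothetical, and the result keys are borrowed from the StateDB schema later in this document. Note that the sketch matches the device name exactly, whereas a plain `grep sda` would also match partitions such as `sda1`.

```python
from typing import Dict, Optional


def parse_diskstats(text: str, device: str) -> Optional[Dict[str, int]]:
    """Return read/write counts for `device` from /proc/diskstats content."""
    for line in text.splitlines():
        fields = line.split()
        # Column 3 (index 2) is the device name; compare it exactly.
        if len(fields) >= 8 and fields[2] == device:
            return {
                "io_reads": int(fields[3]),   # 4th value: reads completed
                "io_writes": int(fields[7]),  # 8th value: writes completed
            }
    return None  # device not present


def read_diskstats(device: str, path: str = "/proc/diskstats") -> Optional[Dict[str, int]]:
    """Read the live file and parse it (Linux only)."""
    with open(path) as stats_file:
        return parse_diskstats(stats_file.read(), device)
```

Keeping the parsing separate from the file access lets the logic be checked against a captured sample without needing a Linux host.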

#### **2.4.2 Reserved Blocks Count**

- Examine the SMART data for attributes related to reserved blocks or over-provisioning.
- The exact attribute name and number can vary depending on the SSD manufacturer and model.
- Look for keywords like "Reserved Block Count," "Over Provisioning," or similar terms.

Here's an example of what the SMART data output might look like:

```
...
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
...
173 Wear_Leveling_Count 0x0032 100 100 000 Old_age Always - 123
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 456
...

```

In this example, the "Wear_Leveling_Count" attribute might be indicative of reserved blocks or over-provisioning. However, the specific attribute and its interpretation can vary, so we make the search term and ID configurable by our vendors while maintaining a default search term in the event that this value is left unconfigured.
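
A minimal Python sketch of such a configurable lookup is shown below. It assumes the ten-column attribute rows of the sample output above. The function name `find_smart_raw_value` and the default search term `Reserved_Block` are hypothetical; the actual default would be chosen by the implementation. When an attribute ID is supplied it takes precedence over the search term, which is also an assumption of this sketch.

```python
from typing import Optional

# Hypothetical default used when the vendor leaves the term unconfigured.
DEFAULT_SEARCH_TERM = "Reserved_Block"


def find_smart_raw_value(
    smart_output: str,
    search_term: str = DEFAULT_SEARCH_TERM,
    attr_id: Optional[int] = None,
) -> Optional[int]:
    """Return the raw value of the first matching SMART attribute row."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if attr_id is not None:
            if int(fields[0]) != attr_id:
                continue
        elif search_term.lower() not in fields[1].lower():
            continue
        try:
            return int(fields[9])  # first token of the raw value column
        except ValueError:
            return None  # raw value is not a plain integer
    return None  # no matching attribute found
```

With the sample output above, searching for `Wear_Leveling` would return the raw value of attribute 173, while the default term would find nothing and return `None`.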

## **StateDB Schema**
```
; Defines information for the SSD in a device

key = SSD_INFO ; This key holds both the static and the dynamic information about the SSD

; field = value

Temperature = STRING ; Describes the operating temperature of the SSD (Priority 0, Dynamic)
io_reads = INT ; Describes the total number of reads completed successfully from the SSD (Priority 0, Dynamic)
io_writes = INT ; Describes the total number of writes completed on the SSD (Priority 0, Dynamic)
reserve_blocks = INT ; Describes the reserved blocks count of the SSD (Priority 0, Dynamic)
device_model = STRING ; Describes the Vendor information of the SSD (Priority 1, Static)
serial = STRING ; Describes the Serial number of the SSD (Priority 1, Static)
firmware = STRING ; Describes the Firmware version of the SSD (Priority 1, Static)
health = STRING ; Describes the overall health of the SSD (Priority 1, Dynamic)
```
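
To make the schema concrete, the following Python sketch assembles one `SSD_INFO` record as a flat mapping of field names to strings, which is how a Redis hash stores values. The field names come from the schema above; the function name `build_ssd_info` and the `N/A` placeholder for missing values are assumptions made for this example.

```python
from typing import Dict, Mapping

SSD_INFO_KEY = "SSD_INFO"

# Field names taken verbatim from the schema above.
STATIC_FIELDS = ("device_model", "serial", "firmware")
DYNAMIC_FIELDS = ("Temperature", "io_reads", "io_writes", "reserve_blocks", "health")

# Hypothetical placeholder for values that could not be parsed.
NOT_AVAILABLE = "N/A"


def build_ssd_info(
    static: Mapping[str, object], dynamic: Mapping[str, object]
) -> Dict[str, str]:
    """Merge static and dynamic attributes into one string-valued record."""
    record: Dict[str, str] = {}
    for name in STATIC_FIELDS:
        record[name] = str(static.get(name, NOT_AVAILABLE))
    for name in DYNAMIC_FIELDS:
        record[name] = str(dynamic.get(name, NOT_AVAILABLE))
    return record
```

A daemon could then write the returned mapping under `SSD_INFO_KEY` with whichever database client it uses.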

## Future Work

1. Code abstraction and CLI support for aforementioned newly introduced fields
2. Support for eMMC storage

## References

### 1. [man tune2fs](https://linux.die.net/man/8/tune2fs)
### 2. [kernel.org /proc/diskstats](https://www.kernel.org/doc/Documentation/ABI/testing/procfs-diskstats)

<br><br><br>
<sup>[Back to top](#1-overview)</sup>