Big data has put increasing pressure on HDFS storage in recent years. The latest storage devices (3D XPoint(R) SSD, NVMe SSD, etc.) can be used to improve storage performance. HDFS provides mechanisms such as HDFS Cache, Heterogeneous Storage Management and Erasure Coding to support them, but making full use of these high-performance storage devices in a dynamic environment remains a big challenge for HDFS.
To overcome this challenge, we introduce a comprehensive end-to-end solution, Smart Storage Management (SSM), for Apache Hadoop. SSM collects HDFS operation data and system state information from the cluster; based on the collected metrics, it can automatically make sophisticated use of these mechanisms to optimize HDFS storage efficiency.
The SSM project is divided into two phases:
Phase 1. Implement a rule-based automation engine that can execute user-defined rules to automate storage management tasks. It provides a unified interface (through rules) to manage and tune an HDFS cluster.
Phase 2. Implement a rule generator that takes all these factors into account and can automatically generate comprehensive rules. This is the advanced part that makes SSM work smartly.
The project is currently in Phase 1, which includes the following general subtasks:
- Define metrics to collect from HDFS
- Define the rule DSL syntax and the built-in variables used in rules.
- Implement rule engine and different actions
- Optimize facilities to cooperate with SSM
- Implement UI to make SSM easy to use
Without SSM, data will always be read from HDD (a.). With SSM, optimizations can be made through rules. As shown in the figure above, data can be moved to faster storage (b.) or cached in memory (c.) to achieve better performance.
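For example, a rule could tell SSM to cache files that are being read frequently (case c. above). The rule below is only an illustrative sketch: the rule DSL syntax and built-in variables such as accessCount are Phase 1 deliverables, so the exact grammar may differ.

```
file : path matches "/hot/*" and accessCount(10min) > 10 | cache
```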
Files are less likely to be read towards the end of their lifecycle, so it is better to move such cold files to lower-performance storage to reduce storage cost. The following rule shows an example of archiving data that has not been read more than 3 times during the last 30 days.
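A sketch of such a rule is shown below; the DSL grammar and the accessCount built-in are illustrative here, since the rule syntax is being defined in Phase 1.

```
file : accessCount(30day) <= 3 | archive
```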
SSM polls metrics from the NameNode. These metrics are analyzed as specified by the rules, and when the conditions of a rule are fulfilled, SSM executes the corresponding actions.
SSM uses a SQL database to maintain the polled data as well as other internal data. Some of this data is required for SSM state recovery, so SSM checkpoints it into HDFS periodically.
SSM is being developed against Hadoop trunk, but it can easily be ported or integrated into lower Hadoop versions, or even custom private Hadoop versions.
SSM supports Hadoop HA. Some metrics data (file.accessCount) may be lost for a short time interval when the active NameNode is deactivated, but SSM can tolerate this data loss and it will not cause a disaster.
SSM consists of five main components, illustrated in the following figure:
- StatesManager
  - Collect metrics and events from the NameNode
  - Maintain data and forward events
- RuleManager
  - Manage rules
  - Schedule and execute rules
- CacheManager
  - Schedule the execution of cache-related actions
- StorageManager
  - Schedule the execution of storage-related actions
- ActionExecutor
  - Execute the generated actions
A rule is the interface between the user and SSM, through which the user tells SSM how to function. A rule defines everything SSM needs in order to work: at what time to check, which metrics and conditions to analyze, and what actions to take when the conditions become true. By writing rules, a user can easily manage the cluster and adjust its behavior for certain purposes.
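Conceptually, a rule can be read as the template below, followed by one concrete instance. The delimiters and keywords shown are a sketch only, since the concrete DSL grammar is a Phase 1 deliverable.

```
<objects to check> : <time/trigger> | <conditions on metrics> | <actions>

file : every 5min | accessCount(10min) > 10 | cache
```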
SSM uses a SQL database to maintain data internally. The core tables in SSM are shown below:
Take the file accessCount metric as an example to demonstrate how this works, as shown in the following chart:
- SSM polls accessCount data from the NameNode to get the file access count info generated during each time interval (for example, 5s).
- For each interval, SSM creates a table to store the info and inserts the table name into the table access_count_tables.
- The file access count over the last time interval can then be calculated by accumulating the data in the tables whose start and end times fall within that interval.
- To control the total amount of data, second-level accessCount tables are aggregated into minute-level, hour-level, day-level, month-level and year-level tables. The further back in time, the coarser the aggregation granularity, so more accurate data is kept for the recent past than for long ago (see the sketch below).
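The SQL sketch below illustrates this bookkeeping. Only access_count_tables is named above; the column layout, the per-interval table names and the aggregation query are assumptions for illustration, not SSM's actual schema.

```sql
-- Registry of per-interval access-count tables (assumed columns).
CREATE TABLE access_count_tables (
  table_name VARCHAR(64) PRIMARY KEY,
  start_time BIGINT,   -- interval start time
  end_time   BIGINT    -- interval end time
);

-- One table is created per polling interval (e.g. every 5s); layout assumed.
CREATE TABLE accesscount_00 (fid BIGINT, count INT);  -- file id, accesses in this interval
CREATE TABLE accesscount_01 (fid BIGINT, count INT);

-- Access counts over a requested interval: look up the per-interval tables whose
-- start/end times fall in the interval via access_count_tables, then accumulate them.
SELECT fid, SUM(count) AS access_count
FROM (
  SELECT fid, count FROM accesscount_00
  UNION ALL
  SELECT fid, count FROM accesscount_01
) AS recent
GROUP BY fid;

-- Aggregation: older second-level tables are rolled up into a coarser (e.g. minute-level)
-- table and then dropped, so granularity grows the further back in time the data is.
CREATE TABLE accesscount_minute_00 (fid BIGINT, count INT);
INSERT INTO accesscount_minute_00 (fid, count)
SELECT fid, SUM(count)
FROM (
  SELECT fid, count FROM accesscount_00
  UNION ALL
  SELECT fid, count FROM accesscount_01
) AS to_roll_up
GROUP BY fid;
```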