The ROCm Validation Suite (RVS) is a system validation and diagnostics tool for monitoring, stress testing, detecting, and troubleshooting issues that affect the functionality and performance of AMD GPU(s) operating in a high-performance/AI/ML computing environment. RVS is enabled using the ROCm software stack on a compatible software and hardware platform.
RVS is a collection of tests, benchmarks, and qualification tools, each targeting a specific sub-system of the ROCm platform. All of the tools are implemented in software and share a common command line interface. Each set of tests is implemented in a “module”, which is a library encapsulating the functionality specific to the tool. The CLI can specify the directory to search for module libraries to load. Each module may define its own set of options and use a configuration file that supports its execution.
For a description of the RVS modules, refer to the documentation on features.
For descriptions and examples of module configuration files, refer to the user guide.
Install the following prerequisites before compiling RVS or installing the precompiled package.
Ubuntu :
sudo apt-get -y update && sudo apt-get install -y libpci3 libpci-dev doxygen unzip cmake git libyaml-cpp-dev
CentOS :
sudo yum install -y cmake3 doxygen pciutils-devel rpm rpm-build git gcc-c++ yaml-cpp-devel
RHEL :
sudo yum install -y cmake3 doxygen rpm rpm-build git gcc-c++ yaml-cpp-devel
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/pciutils-devel-3.5.1-3.el7.x86_64.rpm
sudo rpm -ivh pciutils-devel-3.5.1-3.el7.x86_64.rpm
SLES :
sudo zypper install -y cmake doxygen pciutils-devel libpci3 rpm git rpm-build gcc-c++ yaml-cpp-devel
Install the ROCm stack for Ubuntu/CentOS/SLES/RHEL. Refer to the ROCm installation guide for more details.
Note:
The rocm_smi64 package was renamed to rocm-smi-lib64 starting with ROCm 3.0, and rocm-smi-lib64 was renamed to rocm-smi-lib starting with ROCm 4.1. If you are using a ROCm release earlier than 3.0, install the package as "rocm_smi64".
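The renaming rule in this note can be sketched as a small shell function. This is only an illustration: `smi_package_name` is a hypothetical helper, not something shipped with ROCm, and it assumes the version is given as MAJOR.MINOR or MAJOR.MINOR.PATCH.

```shell
# Hypothetical helper (not part of ROCm): print the SMI library package
# name that matches a ROCm version, following the renames described above.
smi_package_name() {
    version=$1
    major=${version%%.*}
    case $version in
        *.*) rest=${version#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;   # assumption: a bare "4" means "4.0"
    esac

    if [ "$major" -lt 3 ]; then
        echo "rocm_smi64"
    elif [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 1 ]; }; then
        echo "rocm-smi-lib64"
    else
        echo "rocm-smi-lib"
    fi
}
```

For example, `smi_package_name 5.5.0` prints `rocm-smi-lib`, which is the name used in the install commands that follow.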
Install rocBLAS and rocm-smi-lib :
Ubuntu :
sudo apt-get install rocblas rocm-smi-lib
CentOS & RHEL :
sudo yum install --nogpgcheck rocblas rocm-smi-lib
SUSE :
sudo zypper install rocblas rocm-smi-lib
Note: If rocm-smi-lib is already installed but /opt/rocm/lib/librocm_smi64.so doesn't exist, reinstall the package as follows:
Ubuntu :
sudo dpkg -r rocm-smi-lib && sudo apt install rocm-smi-lib
CentOS & RHEL :
sudo rpm -e rocm-smi-lib && sudo yum install rocm-smi-lib
SUSE :
sudo rpm -e rocm-smi-lib && sudo zypper install rocm-smi-lib
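Before reinstalling, it may help to confirm whether the library is actually missing. The sketch below uses a hypothetical helper, `check_smi_lib`, which takes the ROCm install path as an optional argument; the default of /opt/rocm is taken from the path in the note above.

```shell
# Hypothetical helper: report whether librocm_smi64.so exists under a
# ROCm install path. Defaults to /opt/rocm, the path used in the note.
check_smi_lib() {
    rocm_path=${1:-/opt/rocm}
    lib="$rocm_path/lib/librocm_smi64.so"
    if [ -e "$lib" ]; then
        echo "found: $lib"
        return 0
    fi
    echo "missing: $lib (remove and reinstall rocm-smi-lib)" >&2
    return 1
}
```

Running `check_smi_lib` with no argument checks /opt/rocm; a non-zero exit status means the reinstall commands above apply.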
This section explains how to get and compile the current development stream of RVS.
git clone https://github.com/ROCm/ROCmValidationSuite.git
cd ROCmValidationSuite
cmake -B ./build -DROCM_PATH=<rocm_installed_path> -DCMAKE_INSTALL_PREFIX=<rocm_installed_path> -DCPACK_PACKAGING_INSTALL_PREFIX=<rocm_installed_path>
For example, if ROCm 5.5 is installed:
cmake -B ./build -DROCM_PATH=/opt/rocm-5.5.0 -DCMAKE_INSTALL_PREFIX=/opt/rocm-5.5.0 -DCPACK_PACKAGING_INSTALL_PREFIX=/opt/rocm-5.5.0
make -C ./build
cd ./build
make package
Note: Depending on your OS, only a DEB or an RPM package is built. You may ignore an error for the package format that does not apply to your OS.
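The configure step passes the same ROCm path to three options. A minimal sketch for keeping them in sync is shown here; `rvs_cmake_args` is a hypothetical helper, and the /opt/rocm default is an assumption used only when no path is given.

```shell
# Hypothetical helper: build the cmake arguments from a single ROCm path
# so that ROCM_PATH and both install prefixes always agree.
rvs_cmake_args() {
    rocm_path=${1:-/opt/rocm}   # assumption: default install location
    printf '%s\n' "-B ./build -DROCM_PATH=$rocm_path -DCMAKE_INSTALL_PREFIX=$rocm_path -DCPACK_PACKAGING_INSTALL_PREFIX=$rocm_path"
}
```

It could be used as `cmake $(rvs_cmake_args /opt/rocm-5.5.0)`. The command substitution is left unquoted on purpose so the arguments are split into words, which means this sketch does not handle install paths containing spaces.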
Ubuntu :
sudo dpkg -i rocm-validation-suite*.deb
CentOS & RHEL & SUSE :
sudo rpm -i --replacefiles --nodeps rocm-validation-suite*.rpm
Note: RVS has been packaged as part of the ROCm release since ROCm 3.0, so you can install the precompiled package as shown below. Make sure the prerequisites, the ROCm stack, rocBLAS, and rocm-smi-lib (rocm-smi-lib64 on ROCm releases earlier than 4.1) are already installed.
Ubuntu :
sudo apt install rocm-validation-suite
CentOS & RHEL :
sudo yum install rocm-validation-suite
SUSE :
sudo zypper install rocm-validation-suite
cd <source folder>/build/bin
Command examples
./rvs --help # Lists all options to run the RVS test suite
./rvs -g # Lists supported GPUs available in the machine
./rvs -c conf/gst_single.conf # Runs the GST module default test configuration
cd /opt/rocm/bin
Command examples
./rvs --help # Lists all options to run the RVS test suite
./rvs -g # Lists supported GPUs available in the machine
./rvs -c ../share/rocm-validation-suite/conf/gst_single.conf # Runs the GST default test configuration
To run a GPU-specific test configuration, use the configuration files from the GPU folders in "/opt/rocm/share/rocm-validation-suite/conf".
./rvs -c ../share/rocm-validation-suite/conf/MI300X/gst_single.conf # Runs the MI300X-specific GST test configuration
./rvs -c ../share/rocm-validation-suite/conf/nv32/gst_single.conf # Runs the Navi 32-specific GST test configuration
Note: If present, always use GPU specific configurations instead of default test configurations.
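The "GPU-specific first, default otherwise" rule can be sketched as a shell function. `pick_conf` is a hypothetical helper, not an RVS feature; it assumes the layout described above, with GPU folders such as MI300X or nv32 sitting next to the default configuration files.

```shell
# Hypothetical helper: print the GPU-specific configuration file if it
# exists, otherwise fall back to the default file in the conf root.
#   $1 = conf root, $2 = GPU folder name, $3 = configuration file name
pick_conf() {
    conf_root=$1
    gpu=$2
    conf_name=$3
    if [ -n "$gpu" ] && [ -f "$conf_root/$gpu/$conf_name" ]; then
        echo "$conf_root/$gpu/$conf_name"
    else
        echo "$conf_root/$conf_name"
    fi
}
```

From /opt/rocm/bin, this could be combined with the commands above as `./rvs -c "$(pick_conf ../share/rocm-validation-suite/conf MI300X gst_single.conf)"`.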
Test results, errors, and verbose logs are printed to the terminal. To enable JSON logging, use the "-j" command line option. The JSON output file is stored in the /var/tmp folder, and its name is printed.