These release notes provide a summary of notable changes since the previous ROCm release.
Because ROCm 6.2.2 was released shortly after 6.2.1, the changes between these versions
are minimal. For a comprehensive overview of recent updates, the ROCm 6.2.1 release
notes are appended to the end of this document. See
[ROCm 6.2.1 release notes](rocm-6-2-1-release-notes).
The Compatibility matrix provides the full list of supported hardware, operating systems, ecosystems, third-party components, and ROCm components for each ROCm release.
Release notes for previous ROCm releases are available in earlier versions of the documentation. See the ROCm documentation release history.
The following is a significant fix introduced in ROCm 6.2.2.
Improved the reliability of AMD Instinct MI300X accelerators in scenarios involving uncorrectable errors. Previously, error recovery did not occur as expected, potentially leaving the system in an undefined state. This fix ensures that error recovery functions as expected, maintaining system stability.
See the original issue noted in the ROCm 6.2.1 release notes.
The ROCm 6.2.1 release notes document newly added ecosystem support, ROCm Offline Installer Creator updates, and improvements to several ROCm libraries and tools.
The following are notable new features and improvements in ROCm 6.2.1. For changes to individual components, see Detailed component changes.
The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0. Applications linked to version 1.3 must be recompiled to link against version 2.0.
See the rocAL detailed changes for more information.
As of ROCm 6.2.1, ROCm supports Facebook General Matrix Multiplication (FBGEMM) and the related FBGEMM_GPU library.
FBGEMM is a low-precision, high-performance CPU kernel library for convolution and matrix multiplication. It is used for server-side inference and as a back end for PyTorch quantized operators. FBGEMM_GPU includes a collection of PyTorch GPU operator libraries for training and inference. For more information, see the ROCm Model acceleration libraries guide and PyTorch's FBGEMM GitHub repository.
The ROCm Offline Installer Creator 6.2.1 introduces several new features and improvements including:
- Logging support for create and install logs
- More stringent checks for Linux versions and distributions
- Updated prerequisite repositories
- Fixed CTest issues
There have been no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.
- The Programming Model Reference and Understanding the Programming Model topics in HIP have been consolidated into one topic, HIP programming model (conceptual).
- The HIP virtual memory management and HIP virtual memory management API topics have been added.
The ROCm documentation, like all ROCm projects, is open source and available on GitHub. To contribute to ROCm documentation, see the [ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
ROCm 6.2.1 adds support for Ubuntu 24.04.1 (kernel: 6.8 [GA]).
See the Compatibility matrix for the full list of supported operating systems and hardware architectures.
The following table lists the versions of ROCm components for ROCm 6.2.1, including any version changes from 6.2.0 to 6.2.1.
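Category | Group | Name | Version
---|---|---|---
Libraries | Machine learning and computer vision | Composable Kernel | 1.1.0
Libraries | Machine learning and computer vision | MIGraphX | 2.10
Libraries | Machine learning and computer vision | MIOpen | 3.2.0
Libraries | Machine learning and computer vision | MIVisionX | 3.0.0
Libraries | Machine learning and computer vision | rocAL | 1.0.0 ⇒ 2.0.0
Libraries | Machine learning and computer vision | rocDecode | 0.6.0
Libraries | Machine learning and computer vision | rocPyDecode | 0.1.0
Libraries | Machine learning and computer vision | RPP | 1.8.0
Libraries | Communication | RCCL | 2.20.5 ⇒ 2.20.5
Libraries | Math | hipBLAS | 2.2.0
Libraries | Math | hipBLASLt | 0.8.0
Libraries | Math | hipFFT | 1.0.15
Libraries | Math | hipfort | 0.4.0
Libraries | Math | hipRAND | 2.11.0
Libraries | Math | hipSOLVER | 2.2.0
Libraries | Math | hipSPARSE | 3.1.1
Libraries | Math | hipSPARSELt | 0.2.1
Libraries | Math | rocALUTION | 3.2.0
Libraries | Math | rocBLAS | 4.1.2 ⇒ 4.2.1
Libraries | Math | rocFFT | 1.0.28 ⇒ 1.0.29
Libraries | Math | rocRAND | 3.1.0
Libraries | Math | rocSOLVER | 3.26.0
Libraries | Math | rocSPARSE | 3.2.0
Libraries | Math | rocWMMA | 1.5.0
Libraries | Math | Tensile | 4.41.0
Libraries | Primitives | hipCUB | 3.2.0
Libraries | Primitives | hipTensor | 1.3.0
Libraries | Primitives | rocPRIM | 3.2.0 ⇒ 3.2.1
Libraries | Primitives | rocThrust | 3.1.0
Tools | System management | AMD SMI | 24.6.2 ⇒ 24.6.3
Tools | System management | rocminfo | 1.0.0
Tools | System management | ROCm Data Center Tool | 1.0.0
Tools | System management | ROCm SMI | 7.3.0 ⇒ 7.3.0
Tools | System management | ROCm Validation Suite | 1.0.0
Tools | Performance | Omniperf | 2.0.1
Tools | Performance | Omnitrace | 1.11.2 ⇒ 1.11.2
Tools | Performance | ROCm Bandwidth Test | 1.4.0
Tools | Performance | ROCProfiler | 2.0.0
Tools | Performance | ROCprofiler-SDK | 0.4.0
Tools | Performance | ROCTracer | 4.1.0
Tools | Development | HIPIFY | 18.0.0 ⇒ 18.0.0
Tools | Development | ROCdbgapi | 0.76.0
Tools | Development | ROCm CMake | 0.13.0
Tools | Development | ROCm Debugger (ROCgdb) | 14.2
Tools | Development | ROCr Debug Agent | 2.0.3
Compilers | | HIPCC | 1.1.1
Compilers | | llvm-project | 18.0.0
Runtimes | | HIP | 6.2 ⇒ 6.2.1
Runtimes | | ROCr Runtime | 1.14.0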
The following sections describe key changes to ROCm components.
- Added `amd-smi static --ras` on Guest VMs. Guest VMs can view enabled/disabled RAS features on Host cards.
- Removed `amd-smi metric --ecc` and `amd-smi metric --ecc-blocks` on Guest VMs. Guest VMs do not support getting current ECC counts from the Host cards.
- Fixed TypeError in `amd-smi process -G`.
- Updated CLI error strings to handle empty and invalid GPU/CPU inputs.
- Fixed Guest VM showing passthrough options.
- Fixed firmware formatting where leading 0s were missing.
- Soft hang when using `AMD_SERIALIZE_KERNEL`
- Memory leak in `hipIpcCloseMemHandle`
- Added CUDA 12.5.1 support
- Added cuDNN 9.2.1 support
- Added LLVM 18.1.8 support
- Added `hipBLAS` 64-bit APIs support
- Added support for math constants `math_constants.h`
Perfetto can no longer open Omnitrace proto files. Loading Perfetto trace output `.proto` files in the latest version of `ui.perfetto.dev` can result in a dialog with the message, "Oops, something went wrong! Please file a bug." The information in the dialog will refer to an "Unknown field type." The workaround is to open the files with the previous version of the Perfetto UI found at https://ui.perfetto.dev/v46.0-35b3d9845/#!/.
See issue #3767 on GitHub.
On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, Direct Memory Access (DMA) transfers between the GPU and NIC are disabled, which impacts multi-node RCCL performance. This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer. Older RCCL versions are also impacted.
This issue will be addressed in a future ROCm release.
See issue #3772 on GitHub.
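
To see whether a host falls in the kernel range described above, the running kernel release can be compared against 6.8. The following is a minimal sketch, not an official ROCm or RCCL check: `kernel_is_6_8_or_newer` is a hypothetical helper name, and it inspects only the kernel version, not the NIC or network type.

```shell
#!/bin/sh
# Sketch: flag kernels in the range the note above describes (6.8 or newer).
# kernel_is_6_8_or_newer is a hypothetical helper, not part of ROCm or RCCL.
kernel_is_6_8_or_newer() {
    release="${1:-$(uname -r)}"      # e.g. "6.8.0-45-generic"
    major=${release%%.*}             # text before the first dot
    rest=${release#*.}               # text after the first dot
    minor=${rest%%[!0-9]*}           # leading digits of the remainder
    case "$major" in
        ''|*[!0-9]*) return 2 ;;     # unparseable release string
    esac
    minor=${minor:-0}
    [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 8 ]; }
}

if kernel_is_6_8_or_newer; then
    echo "Kernel $(uname -r) is 6.8 or newer: multi-node RCCL may be affected."
else
    echo "Kernel $(uname -r) was not detected as 6.8 or newer."
fi
```

Passing a release string as the first argument, for example `kernel_is_6_8_or_newer "5.15.0-91-generic"`, checks a value other than the running kernel.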
- The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0. Applications linked to version 1.3 must be recompiled to link against version 2.0.
- Added development and test packages.
- Added C++ rocAL audio unit test and Python script to run and compare the outputs.
- Added Python support for audio decoders.
- Added PyTorch iterator for audio.
- Added Python audio unit test and support to verify outputs.
- Added rocDecode for HW decode.
- Added support for:
- Audio loader and decoder, which uses libsndfile library to decode wav files
- Audio augmentation - PreEmphasis filter, Spectrogram, ToDecibels, Resample, NonSilentRegionDetection, MelFilterBank
- Generic augmentation - Slice, Normalize
- Reading from file lists in file reader
- Downmixing audio channels during decoding
- TensorTensorAdd and TensorScalarMultiply operations
- Uniform and Normal distribution nodes
- Image to tensor updates
- ROCm install - use case graphics removed
- Dependencies are not installed with the rocAL package installer. Dependencies must be installed with the prerequisite setup script provided. See the rocAL README on GitHub for details.
- Removed Device_Memory_Allocation.pdf link in documentation.
- Fixed error/warning message during `rocblas_set_stream()` call.
- Implemented 1D kernels for factorizable sizes less than 1024.
- Improved handling of UnicodeEncodeErrors with non-UTF-8 locales. Non-UTF-8 locales were causing crashes on UTF-8 special characters.
- Fixed an issue where the Compute Partition tests segfaulted when AMDGPU was loaded with optional parameters.
- When setting CPX as a partition mode, there is a DRM node limit of 64. This is a known limitation when multiple drivers are using the DRM nodes. The `ls /sys/class/drm` command can be used to see the number of DRM nodes, and the following steps can be used to remove unnecessary drivers:
  - Unload AMDGPU: `sudo rmmod amdgpu`.
  - Remove any unnecessary drivers using `rmmod`. For example, to remove an AST driver, run `sudo rmmod ast`.
  - Reload AMDGPU using `modprobe`: `sudo modprobe amdgpu`.
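
The DRM node count mentioned above can also be tallied with a short script instead of reading the `ls /sys/class/drm` output by eye. This is a minimal sketch under stated assumptions: `count_drm_nodes` is a hypothetical helper, and treating only `cardN` and `renderDN` entries as nodes (skipping connector entries such as `card0-DP-1`) is an assumption about which entries count toward the limit.

```shell
#!/bin/sh
# Sketch: count DRM nodes of one kind under a sysfs-style directory.
# count_drm_nodes is a hypothetical helper; skipping connector entries
# (names containing "-", such as card0-DP-1) is an assumption.
count_drm_nodes() {
    drm_dir="$1"
    prefix="$2"                          # "card" or "renderD"
    count=0
    for entry in "$drm_dir"/"$prefix"*; do
        [ -e "$entry" ] || continue      # glob matched nothing
        case "${entry##*/}" in
            *-*) continue ;;             # connector, not a device node
        esac
        count=$((count + 1))
    done
    echo "$count"
}

echo "card nodes:   $(count_drm_nodes /sys/class/drm card)"
echo "render nodes: $(count_drm_nodes /sys/class/drm renderD)"
```

Running the script before and after unloading unneeded drivers shows whether the node count dropped.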
- Improved performance of `block_reduce_warp_reduce` when warp size equals block size.
ROCm known issues are tracked on GitHub. Known issues related to individual components are listed in the Detailed component changes section.
For the AMD Instinct MI300X accelerator, GPU recovery resets triggered by uncorrectable errors (UE) might not complete successfully, which can result in the system being left in an undefined state. A system reboot is needed to recover from this state. Additionally, error logging might fail in these situations, hindering diagnostics.
This issue is under investigation and will be resolved in a future ROCm release.
See issue #3766 on GitHub.
The following changes to the ROCm software stack are anticipated for future releases.
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the functionality provided by the closed-source compiler should transition to the open-source compiler. Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by the closed-source compiler will result in a Clang warning: "[AMD] proprietary optimization compiler has been removed".
The RCCL plugin package, `rccl-rdma-sharp-plugins`, will be removed in an upcoming ROCm release.
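
To find out ahead of time whether a system depends on either package named above, the package manager can be queried. The sketch below assumes a `dpkg`- or `rpm`-based distribution. `is_pkg_installed` is a hypothetical helper, and it reports only whether a package is installed, not whether any build actually uses it.

```shell
#!/bin/sh
# Sketch: report whether the packages slated for removal are installed.
# is_pkg_installed is a hypothetical helper; it returns 0 when installed,
# 1 when not installed, and 2 when no supported package manager is found.
is_pkg_installed() {
    pkg="$1"
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
            | grep -q "install ok installed"
    elif command -v rpm >/dev/null 2>&1; then
        rpm -q "$pkg" >/dev/null 2>&1
    else
        return 2
    fi
}

for pkg in rocm-llvm-alt rccl-rdma-sharp-plugins; do
    if is_pkg_installed "$pkg"; then
        echo "$pkg is installed; plan for its removal."
    else
        echo "$pkg was not detected."
    fi
done
```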