Releases: ROCm/hipCUB
hipCUB 2.13.1 for ROCm 5.7.0
Changed
- CUB backend references CUB and Thrust version 2.0.1.
- Fixed DeviceSegmentedReduce::ArgMin and DeviceSegmentedReduce::ArgMax by returning the segment-relative index instead of the absolute one.
- Fixed DeviceSegmentedReduce::ArgMin for inputs where the segment minimum is smaller than the value returned for empty segments. An equivalent fix is applied to DeviceSegmentedReduce::ArgMax.
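To illustrate what "segment-relative index" means in the ArgMin fix, here is a minimal sequential sketch in host C++. It is not hipCUB code and does not call the library: `segmented_arg_min` is a hypothetical helper, and the `(1, max)` pair used for empty segments is an assumption based on the convention documented for CUB, not something stated in these notes.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical host-side reference, NOT part of hipCUB.
// Segment s covers input[offsets[s]] .. input[offsets[s + 1] - 1],
// so offsets holds num_segments + 1 entries.
// Returns one (index, value) pair per segment; the index is counted
// from the start of that segment, not from the start of the input.
std::vector<std::pair<int, float>> segmented_arg_min(
    const std::vector<float>& input,
    const std::vector<int>& offsets)
{
    std::vector<std::pair<int, float>> result;
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        const int begin = offsets[s];
        const int end = offsets[s + 1];
        if (begin >= end) {
            // Assumed sentinel for empty segments: (1, largest float).
            result.emplace_back(1, std::numeric_limits<float>::max());
            continue;
        }
        int best = begin; // absolute position of the smallest value so far
        for (int i = begin + 1; i < end; ++i) {
            if (input[i] < input[best]) {
                best = i;
            }
        }
        // Convert the absolute position to a segment-relative index.
        result.emplace_back(best - begin, input[best]);
    }
    return result;
}
```

For the input {8, 6, 7, 5, 3, 0, 9} with offsets {0, 3, 3, 7}, the minimum of the last segment sits at absolute position 5, but the index reported for that segment is 2, because the segment starts at position 3.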
Known Issues
- debug_synchronous no longer works on CUDA platform. CUB_DEBUG_SYNC should be used to enable those checks.
- DeviceReduce::Sum does not compile on CUDA platform for mixed extended-floating-point/floating-point InputT and OutputT types.
- DeviceHistogram::HistogramEven fails on CUDA platform for [LevelT, SampleIteratorT] = [int, int].
- DeviceHistogram::MultiHistogramEven fails on CUDA platform for [LevelT, SampleIteratorT] = [int, int/unsigned short/float/double] and [LevelT, SampleIteratorT] = [float, double].
hipCUB 2.13.1 for ROCm 5.6.1
hipCUB code for ROCm 5.6.1 did not change. The library was rebuilt for the updated ROCm 5.6.1 stack.
hipCUB 2.13.1 for ROCm 5.6.0
hipCUB code for ROCm 5.6.0 did not change. The library was rebuilt for the updated ROCm 5.6.0 stack.
hipCUB 2.13.1 for ROCm 5.5.1
hipCUB code for ROCm 5.5.1 did not change. The library was rebuilt for the updated ROCm 5.5.1 stack.
hipCUB 2.13.1 for ROCm 5.5.0
Added
- Benchmarks for BlockShuffle, BlockLoad, and BlockStore.
Changed
- CUB backend references CUB and Thrust version 1.17.2.
- Improved benchmark coverage of BlockScan by adding ExclusiveScan, benchmark coverage of BlockRadixSort by adding SortBlockedToStriped, and benchmark coverage of WarpScan by adding Broadcast.
Fixed
- Windows HIP SDK support
Known Issues
- BlockRadixRankMatch is currently broken under the rocPRIM backend.
- BlockRadixRankMatch with a warp size that does not exactly divide the block size is broken under the CUB backend.
hipCUB 2.13.1 for ROCm 5.4.4
Fixed
- Fixed compilation and execution issues for benchmarks with HIP on Windows
hipCUB 2.13.0 for ROCm 5.4.3
hipCUB code for ROCm 5.4.3 did not change. The library was rebuilt for the updated ROCm 5.4.3 stack.
hipCUB 2.13.0 for ROCm 5.4.2
hipCUB code for ROCm 5.4.2 did not change. The library was rebuilt for the updated ROCm 5.4.2 stack.
hipCUB 2.13.0 for ROCm 5.4.1
hipCUB code for ROCm 5.4.1 did not change. The library was rebuilt for the updated ROCm 5.4.1 stack.
hipCUB 2.13.0 for ROCm 5.4.0
Added
- CMake functionality to improve build parallelism of the test suite that splits compilation units by function or by parameters.
- New overload for BlockAdjacentDifference::SubtractLeftPartialTile that takes a predecessor item.
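As a rough illustration of what a predecessor item changes for a partial tile, the following sequential host-side sketch models the behaviour on a flat array. It is not the hipCUB block-level API: `subtract_left_partial_tile` is a hypothetical helper, and the rule that items at or beyond `valid_items` are copied unchanged is an assumption taken from CUB's documented semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model, NOT the hipCUB block-level API.
// For i < valid_items: output[i] = difference_op(input[i], left neighbour),
// where the left neighbour of the first item is tile_predecessor_item.
// Items at or beyond valid_items are copied through unchanged (assumed).
template <typename T, typename DifferenceOp>
std::vector<T> subtract_left_partial_tile(const std::vector<T>& input,
                                          DifferenceOp difference_op,
                                          std::size_t valid_items,
                                          T tile_predecessor_item)
{
    std::vector<T> output(input); // start from a copy of the whole tile
    for (std::size_t i = 0; i < valid_items && i < input.size(); ++i) {
        const T left = (i == 0) ? tile_predecessor_item : input[i - 1];
        output[i] = difference_op(input[i], left);
    }
    return output;
}
```

With the tile {10, 13, 17, 22, 99, 99}, four valid items, a predecessor of 7 and plain subtraction as the difference operator, the sketch yields {3, 3, 4, 5, 99, 99}. The first output uses the predecessor (10 - 7) instead of being passed through.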
Changed
- Improved build parallelism of the test suite by splitting up large compilation units for DeviceRadixSort, DeviceSegmentedRadixSort and DeviceSegmentedSort.
- CUB backend references CUB and Thrust version 1.17.1.