hipCUB 2.13.1 for ROCm 5.7.0

rocm-ci released this 15 Sep 17:29

Changed

  • CUB backend references CUB and Thrust version 2.0.1.
  • Fixed DeviceSegmentedReduce::ArgMin and DeviceSegmentedReduce::ArgMax so that they return the segment-relative index instead of the absolute one.
  • Fixed DeviceSegmentedReduce::ArgMin for inputs where the segment minimum is smaller than the value returned for empty segments. An equivalent fix is applied to DeviceSegmentedReduce::ArgMax.
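To illustrate what "segment-relative index" means in the fixes above, here is a minimal host-side sketch in plain C++. It is not the hipCUB device API and does not run on the GPU; the function name segmented_argmin_reference is hypothetical, and the placeholder pair used for empty segments, {1, maximum float}, is an assumption modeled on CUB's documented convention for ArgMin.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Host-side reference only (hypothetical helper, not part of hipCUB).
// Returns one (index, value) pair per segment. The index is counted from
// the start of the segment, not from the start of the whole input.
// `offsets` holds num_segments + 1 entries; segment s spans
// [offsets[s], offsets[s + 1]).
std::vector<std::pair<int, float>> segmented_argmin_reference(
    const std::vector<float>& values,
    const std::vector<int>& offsets)
{
    std::vector<std::pair<int, float>> result;
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        const int begin = offsets[s];
        const int end = offsets[s + 1];

        // Assumption: placeholder reported for an empty segment.
        std::pair<int, float> best{1, std::numeric_limits<float>::max()};

        for (int i = begin; i < end; ++i) {
            // Always accept the first element, so a real minimum is never
            // confused with the empty-segment placeholder; afterwards keep
            // the first occurrence of the smallest value.
            if (i == begin || values[i] < best.second) {
                best = {i - begin, values[i]};  // segment-relative index
            }
        }
        result.push_back(best);
    }
    return result;
}
```

For the input {8, 6, 7, 5, 3, 0, 9} with offsets {0, 3, 3, 7}, the third segment covers absolute positions 3 to 6. Its minimum, 0, sits at absolute position 5, and the sketch reports it at index 2 within the segment.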

Known Issues

  • debug_synchronous no longer works on the CUDA platform. Define CUB_DEBUG_SYNC instead to enable those checks.
  • DeviceReduce::Sum does not compile on the CUDA platform for mixed extended-floating-point/floating-point InputT and OutputT types.
  • DeviceHistogram::HistogramEven fails on the CUDA platform for [LevelT, SampleIteratorT] = [int, int].
  • DeviceHistogram::MultiHistogramEven fails on the CUDA platform for [LevelT, SampleIteratorT] = [int, int/unsigned short/float/double] and [LevelT, SampleIteratorT] = [float, double].