Fixes dpctl.tensor.round on CUDA devices
#1700
Merged
When compiled for CUDA, std::rint would incorrectly round values halfway between two integers towards 0 (e.g., 1.5 -> 1). The array API specification requires that these values be rounded to the nearest even integer instead.

To resolve this, std::rint has been replaced with sycl::rint, which does not rely on the current floating-point rounding mode (see SYCL specification).

As was pointed out at the time of implementation, the floating-point rounding mode can vary between devices.
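
The dependence of std::rint on the rounding mode can be reproduced on the host with standard C++ alone. The following is a minimal sketch, not the code changed in this PR and not SYCL code: rint_with_mode and round_half_even are hypothetical helpers introduced only for illustration. It assumes a platform that defines FE_TOWARDZERO and a compiler that does not constant-fold the std::rint call (the volatile variables are there to discourage that).

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper (not part of dpctl or SYCL): evaluates std::rint
// under a temporarily selected rounding mode, then restores the old mode.
double rint_with_mode(double x, int mode) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    volatile double input = x;                  // discourage compile-time folding
    volatile double result = std::rint(input);  // honors the current rounding mode
    std::fesetround(saved);
    return result;
}

// Hypothetical helper: rounds halfway cases to the nearest even integer
// without consulting the rounding mode. std::round and std::trunc are
// specified to ignore the current rounding direction.
double round_half_even(double x) {
    const double frac = std::fabs(x - std::trunc(x));
    if (frac == 0.5) {
        // Exactly halfway: the nearest even integer is twice the
        // rounded value of x / 2.
        return 2.0 * std::round(x / 2.0);
    }
    return std::round(x);
}

// Small self-check of the behavior described above.
void check_examples() {
    assert(rint_with_mode(1.5, FE_TONEAREST) == 2.0);   // ties to even
    assert(rint_with_mode(1.5, FE_TOWARDZERO) == 1.0);  // the 1.5 -> 1 case
    assert(round_half_even(1.5) == 2.0);
    assert(round_half_even(2.5) == 2.0);
}
```

Under FE_TOWARDZERO, std::rint maps 1.5 to 1, matching the symptom described above, while a rounding function that ignores the mode returns 2 in every case. This is why a function with a fixed round-to-nearest-even definition is the safer choice when the rounding mode may differ between devices.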