Fixes `dpctl.tensor.round` on CUDA devices #1700

ndgrigorian · 2024-05-29T20:32:33Z

When compiled for CUDA, `std::rint` would incorrectly round values halfway between two integers toward 0 (e.g., 1.5 -> 1.0). The array API specification requires that these values be rounded to the nearest even integer instead (1.5 -> 2.0).

To resolve this, `std::rint` has been replaced with `sycl::rint`, which does not rely on the current floating-point rounding mode (see the SYCL specification).

As was pointed out at the time of implementation, the floating-point rounding mode can vary between devices.
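For illustration, the tie-breaking difference described above can be sketched in plain Python, without dpctl or a GPU. This is only a model, not the code changed in this PR: `round_half_to_even` and `round_half_toward_zero` are hypothetical helper names, and the second one merely imitates the faulty behavior reported for CUDA builds.

```python
import math


def round_half_to_even(x: float) -> float:
    """Round to the nearest integer; ties go to the even neighbor.

    Python's built-in round() already breaks ties this way for floats,
    which matches the behavior the array API requires of round().
    """
    return float(round(x))


def round_half_toward_zero(x: float) -> float:
    """Hypothetical model of the faulty behavior: ties go toward zero."""
    lower = math.floor(x)
    if x - lower == 0.5:
        # Exactly halfway: pick the neighbor that is closer to zero.
        return float(lower) if x > 0 else float(lower + 1)
    # Not a tie: ordinary round-to-nearest.
    return float(math.floor(x + 0.5))


samples = [0.5, 1.5, 2.5, 3.5, -1.5, -2.5]
expected = [round_half_to_even(v) for v in samples]
faulty = [round_half_toward_zero(v) for v in samples]

# Only the inputs whose nearest even neighbor is farther from zero differ.
mismatches = [v for v, e, f in zip(samples, expected, faulty) if e != f]
print(mismatches)
```

The two rules disagree only on ties whose even neighbor lies farther from zero (1.5, 3.5, -1.5 in this sample), so test inputs such as 0.5 or 2.5 alone would not reveal the problem.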

Have you provided a meaningful PR description?
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
Have you checked performance impact of proposed changes?
If this PR is a work in progress, are you opening the PR as a draft?

…vidia hardware

When compiled for CUDA, `std::rint` would incorrectly round values halfway between two integers toward 0, rather than to the nearest even number as required by the array API. `sycl::rint` avoids such issues by not relying on the current rounding mode.

github-actions · 2024-05-29T21:07:33Z

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

github-actions · 2024-05-29T21:10:47Z

Array API standard conformance tests for dpctl=0.18.0dev0=py310h15de555_29 ran successfully.
Passed: 890
Failed: 11
Skipped: 91

oleksandr-pavlyk

It might be a good idea to scan the code base for remaining uses of std namespace transcendental functions and replace those one by one too

coveralls · 2024-05-29T21:13:49Z

coverage: 87.911%. remained the same when pulling d8705a2 on fix-round-for-nvidia into 1de00cb on master.

ndgrigorian · 2024-05-29T21:13:57Z

> It might be a good idea to scan the code base for remaining uses of std namespace transcendental functions and replace those one by one too

I agree. I'll do this as a separate PR, but I think it's a good idea too.

oleksandr-pavlyk · 2024-05-29T21:17:06Z

I have opened gh-1701 for the build break with the LLVM SYCL compiler; it is unrelated to the changes in this PR.

ndgrigorian requested review from oleksandr-pavlyk and vtavana May 29, 2024 20:32

oleksandr-pavlyk approved these changes May 29, 2024


ndgrigorian merged commit 7ae4303 into master May 29, 2024
59 of 60 checks passed

ndgrigorian deleted the fix-round-for-nvidia branch May 29, 2024 22:24
