@oleksandr-pavlyk oleksandr-pavlyk commented Jan 10, 2023

This PR changes _tensor_impl to use the sycl::fma function to work around aggressive compiler optimizations that reorder multiplications and cause overflows. The issue could also be addressed by applying the -fno-associative-math flag (see https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior for how to control floating-point behavior in clang), which helps on Linux but not on Windows.

This fixes the output of dpt.linspace(dpt.finfo('f4').max, dpt.finfo('f4').max, num=16, dtype='f4'), which unexpectedly contained nan values, as discovered by @npolina4.
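
For illustration only, here is a minimal sketch of the failure mode described above. It is not the kernel code from this PR: it uses plain Python doubles (sys.float_info.max stands in for dpt.finfo('f4').max), and the factors 2.0 and 0.5 are hypothetical values chosen so that the two groupings are algebraically equal. It shows that regrouping a product can create an intermediate that overflows to inf, and that an inf intermediate can then turn into nan.

```python
import math
import sys

# Largest finite double; a stand-in (assumption) for dpt.finfo('f4').max.
big = sys.float_info.max

# Two algebraically equal groupings of the same product.
safe = big * (2.0 * 0.5)    # inner factor is exactly 1.0, result stays finite
unsafe = (big * 2.0) * 0.5  # intermediate big * 2.0 overflows to inf

# Once an intermediate is inf, later arithmetic can produce nan.
poisoned = unsafe - unsafe  # inf - inf

print(safe == big, math.isinf(unsafe), math.isnan(poisoned))
```

A compiler that is allowed to treat floating-point multiplication as associative may pick either grouping, which is why a fused multiply-add, or a flag that forbids reassociation, keeps the evaluation order fixed.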

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?

@coveralls
Collaborator

coveralls commented Jan 10, 2023

Coverage Status

Coverage: 82.23% (+0.04%) from 82.191% when pulling 5a126fd on use-no-associative-math into 47e4ae4 on master.

@github-actions

Array API standard conformance tests for dpctl=0.14.1dev1=py310h76be34b_14 ran successfully.
Passed: 33
Failed: 801
Skipped: 280

@github-actions

Array API standard conformance tests for dpctl=0.14.1dev1=py310h76be34b_15 ran successfully.
Passed: 33
Failed: 801
Skipped: 280

@github-actions

Array API standard conformance tests for dpctl=0.14.1dev1=py310h76be34b_16 ran successfully.
Passed: 33
Failed: 801
Skipped: 280

@oleksandr-pavlyk
Contributor Author

@npolina4 I fixed the issue by using sycl::fma to prevent the compiler from reordering terms in the multiplication and creating sub-expressions prone to overflow.

@oleksandr-pavlyk oleksandr-pavlyk merged commit 6364c08 into master Jan 12, 2023
@github-actions

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

@oleksandr-pavlyk oleksandr-pavlyk deleted the use-no-associative-math branch January 12, 2023 17:59
@oleksandr-pavlyk oleksandr-pavlyk changed the title from "Use no associative math" to "Used fma in linsequence_affine kernel" Jan 13, 2023