Open
Labels
bug (Something isn't working), cuda (CUDA back-end), hip (Issues related to execution on HIP backend)
Description
Describe the bug
DPC++ does not use a correctly rounded sqrt() on AMD GPUs, even if -fno-fast-math is explicitly passed.
This is contrary to the behavior of both hipcc and AdaptiveCpp, which correctly round sqrt by default. It can lead to misleading benchmark results, or at least make them difficult to interpret.
To reproduce
- Print the compiler invocation, e.g. using
  icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=gfx906 /dev/null -fno-fast-math -###
- Observe in the output that it links with the bitcode library
  oclc_correctly_rounded_sqrt_off.bc
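Besides inspecting the link line, the effect could also be checked numerically. The following is a minimal host-side sketch, not part of the original report: it assumes the single-precision sqrt results have already been copied back from a device kernel, and compares each one bit-for-bit against a reference computed in double and rounded once to float (double precision has enough extra bits that this should yield the correctly rounded float result). The names reference_sqrt and is_correctly_rounded_sqrt are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reference value: compute sqrt in double, then round once to float.
// Assumes the host's double-precision std::sqrt is itself correctly rounded
// (i.e. the host code is not built with fast-math).
inline float reference_sqrt(float x) {
    return static_cast<float>(std::sqrt(static_cast<double>(x)));
}

// Returns true if `device_result` (assumed to have been copied back from a
// device kernel that computed sqrt(x) in single precision) matches the
// reference bit-for-bit. NaN results are treated as matching any NaN.
inline bool is_correctly_rounded_sqrt(float x, float device_result) {
    const float ref = reference_sqrt(x);
    if (std::isnan(ref)) {
        return std::isnan(device_result);
    }
    std::uint32_t ref_bits = 0;
    std::uint32_t dev_bits = 0;
    std::memcpy(&ref_bits, &ref, sizeof ref_bits);
    std::memcpy(&dev_bits, &device_result, sizeof dev_bits);
    return ref_bits == dev_bits;
}
```

Running such a check over a range of inputs would flag any result that is off by one ulp or more, which is what one would expect to see for some inputs when the sqrt is not correctly rounded.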
Environment
This was observed on Linux with oneAPI 2024.0.2.
Additional context
I suspect that oclc_correctly_rounded_sqrt_off is simply the default due to all ROCm bitcode library configuration knobs being initialized with false (daz = off, finite_only = off, unsafe_math = off, correctly_rounded_sqrt = off).
correctly_rounded_sqrt, however, may have to be treated differently, because it is the only knob for which off is the less precise setting: for the other three, off selects the more accurate behavior.
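The suspected cause can be illustrated with a small model. This is a hypothetical sketch and not the actual driver code: the struct DeviceLibKnobs, the helpers precise_defaults and control_lib_name, and the oclc_<knob>_<on|off>.bc naming pattern (extrapolated from the one file name seen above) are all assumptions made for illustration.

```cpp
#include <string>

// Hypothetical model of the four configuration knobs, all defaulting to
// false as suspected in the report.
struct DeviceLibKnobs {
    bool daz = false;                     // off: denormals are not flushed
    bool finite_only = false;             // off: inf/NaN are handled
    bool unsafe_math = false;             // off: no unsafe optimizations
    bool correctly_rounded_sqrt = false;  // off: sqrt is NOT correctly rounded
};

// What a precision-preserving default might look like: only the sqrt knob
// needs to be flipped, since it is the one where "off" loses accuracy.
inline DeviceLibKnobs precise_defaults() {
    DeviceLibKnobs knobs;
    knobs.correctly_rounded_sqrt = true;
    return knobs;
}

// Builds a control library file name following the assumed pattern
// oclc_<knob>_<on|off>.bc.
inline std::string control_lib_name(const std::string& knob, bool enabled) {
    return "oclc_" + knob + (enabled ? "_on.bc" : "_off.bc");
}
```

With all-false defaults, control_lib_name("correctly_rounded_sqrt", DeviceLibKnobs{}.correctly_rounded_sqrt) yields the library name observed in the link line, while the precise_defaults() variant would select the _on counterpart.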