Apparently, LLVM's SLP vectorizer assumes that all floating-point exceptions are masked (so FP instructions never signal), which is the default setting. This lets it use SIMD instructions even when there are fewer elements than the native SIMD width. However, that assumption does not hold in this particular instance.
Curiously, when I inspect the value of `mxcsr` for every thread, only the threads started by `GlobalDispatchThreadPool` have their `_MM_MASK_INVALID` (`0x80`) bit cleared.
Confirmed on macOS 10.14 Mojave
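
For anyone who wants to repeat the per-thread check described above, here is a minimal sketch using the SSE intrinsics from `<xmmintrin.h>`. The helper names `invalid_exception_unmasked` and `report_mxcsr` are made up for illustration and are not part of any library. It assumes an x86 target with SSE.

```c
#include <stdio.h>
#include <xmmintrin.h>  /* _mm_getcsr, _MM_MASK_INVALID (x86 SSE only) */

/* Hypothetical helper: returns 1 if the invalid-operation exception is
   unmasked in the given MXCSR value, i.e. the _MM_MASK_INVALID (0x80)
   bit is cleared, and 0 if it is masked. */
int invalid_exception_unmasked(unsigned int csr)
{
    return (csr & _MM_MASK_INVALID) == 0;
}

/* Hypothetical helper: reads the calling thread's MXCSR and prints
   whether the invalid-operation exception is masked. */
void report_mxcsr(const char *label)
{
    unsigned int csr = _mm_getcsr();
    printf("%s: mxcsr=0x%04x, invalid-operation %s\n", label, csr,
           invalid_exception_unmasked(csr) ? "UNMASKED" : "masked");
}
```

MXCSR is per-thread state, so `report_mxcsr` has to be called from the thread being inspected, for example at the start of a work item running on the thread pool.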