indicator for fast FMA #9855
Comments
Isn't this the problem |
Not quite: they are related but distinct.
|
As a rough point of reference, a software |
Couldn't a function be designed so that it would automatically use the fastest solution? Maybe I'm being naive though. |
@nalimilan I'm not sure what you mean: do you mean something like specifying two code paths and letting the compiler pick the one it likes the most? |
@simonbyrne No, I wonder whether |
Ah I see: you could do that, but it would probably be around twice as slow as the alternative: a decent software |
Another point: |
When using double-double algorithms for extended-precision math, or to get Float64 results from Float32-friendly GPUs: after trying it both ways, I am using fma everywhere possible, whether or not the fma is slow (software emulated); the alternative is backward-facing and confounding for careful numerics. |
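To illustrate the Float32 case mentioned above, here is a minimal sketch of an error-free product built on `fma`. The name `two_prod32` is hypothetical and not from the thread; the sketch assumes no overflow or underflow occurs.

```julia
# Error-free Float32 product (hypothetical helper, for illustration only).
# p is the rounded product and e its rounding error, so p + e equals x*y
# exactly, barring overflow/underflow.
function two_prod32(x::Float32, y::Float32)
    p = x * y
    e = fma(x, y, -p)   # exact residual, computed with a single rounding
    return p, e
end

x, y = 1.1f0, 3.3f0
p, e = two_prod32(x, y)

# The product of two Float32 values fits exactly in a Float64,
# so Float64 arithmetic can be used to check the pair.
exact = Float64(x) * Float64(y)
println(Float64(p) + Float64(e) == exact)
```

The pair `(p, e)` carries roughly twice the precision of a single Float32, and it stays correct whether `fma` is a hardware instruction or software emulated; only the speed differs.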
how can |
The canonical way would be to either introduce an intrinsic function that returns a |
There is currently an attempt to do this in the code via comparing Line 144 in 4a04600
as on my machine (which does have FMA) I get:
|
The sign is wrong, try this:
|
🤦♂ |
(fixed in #32318) |
I'll close this for now, move all discussion to #33011. |
Now we have a `fma` function, it would be useful to have some method for determining whether or not this is more efficient than the naive `x*y+z`, particularly for use in algorithms that use double-double style arithmetic (C provides the `FP_FAST_FMA` macro for this purpose).

From #8112 (comment), it seems that the best option is to expose `TargetLowering::isFMAFasterThanFMulAndFAdd` (presumably in `base/sysinfo.jl`?)
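As a rough sketch of how such an indicator might be used in double-double code, the following selects between an `fma`-based error-free product and Dekker's splitting-based product. The constant `FAST_FMA` is a hypothetical stand-in for the requested indicator (no such constant is implied to exist in Base), and the function names are illustrative only.

```julia
# Hypothetical flag standing in for the indicator requested in this issue.
const FAST_FMA = true

# Error-free product via fma: p + e == x*y exactly (barring overflow/underflow).
function two_prod_fma(x::Float64, y::Float64)
    p = x * y
    e = fma(x, y, -p)
    return p, e
end

# Veltkamp splitting: x == hi + lo, each half fitting in about 26 bits.
function veltkamp_split(x::Float64)
    c = 134217729.0 * x      # 2^27 + 1
    hi = c - (c - x)
    lo = x - hi
    return hi, lo
end

# Dekker's product: same result without fma, at the cost of many more flops.
function two_prod_dekker(x::Float64, y::Float64)
    p = x * y
    xh, xl = veltkamp_split(x)
    yh, yl = veltkamp_split(y)
    e = (((xh * yh - p) + xh * yl) + xl * yh) + xl * yl
    return p, e
end

# Pick the code path based on the (hypothetical) indicator.
two_prod(x, y) = FAST_FMA ? two_prod_fma(x, y) : two_prod_dekker(x, y)

println(two_prod(nextfloat(1.0), nextfloat(1.0)))
```

Both paths return the same pair for inputs that do not overflow or underflow; the indicator only decides which one is cheaper on the target hardware.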