faster hypot for Float32 and Float16 #42122

Conversation
close #36353 now, I'm not sure the old hypot for Float32 and Float16 …
I think if you write …
@oscardssmith there was talk of removing fastmath from Julia; apparently it's a bit shonky: …
I think FMA makes it much slower on machines without FMA? I don't remember the conclusion …
@brett-mahar while we might eventually get rid of it, we won't until there is a good replacement. The reason we still have … (Line 334 in 8812c5c)
Seems like you need explicit Inf handling. Since the Float16 case didn't fail in the tests, maybe we should add it? Is Float16 tested at all?
Isn't that just …

I don't think we should add a bunch of …
Oh yeah, muladd is probably what we want.
Shall we have tests for the Float16 case, including Infs and NaNs? Otherwise this is good to go, if there are no accuracy concerns left.
Saw this on Wikipedia on FMA (https://en.wikipedia.org/wiki/Multiply–accumulate_operation#Fused_multiply–add):

Here we do that, except with x^2 + y^2. I guess that's fine then?
The reason this works is that we are doing a higher-precision fma. The multiplication of 2 …
Bump. What do we need to do to get this in? |
Not very much. |
Before:
After: