-
-
Notifications
You must be signed in to change notification settings - Fork 689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve abs() on floats for more correct and faster results #4797
Conversation
@@ -2174,7 +2174,32 @@ gb_internal lbValue lb_build_builtin_proc(lbProcedure *p, Ast *expr, TypeAndValu | |||
case 128: return lb_emit_runtime_call(p, "abs_complex128", args); | |||
} | |||
GB_PANIC("Unknown complex type"); | |||
} else if (is_type_float(t)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an LLVM intrinsic we could use rather than doing this manually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is (https://llvm.org/docs/LangRef.html#llvm-fabs-intrinsic), I thought you didn't use it initially for a good reason, I can use that instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot remember why I didn't use it. Maybe it required linking with libm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried it and it does not require libm, it does generate the exact same code as what I do here (once optimized, just calling the intrinsic in non-optimized gives better results). But for the endian types we would need to byteswap, call intrinsic, and byteswap back, which may be worse.
What do you want me to do, call the intrinsic, or keep it like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep it like this, I guess.
Fixes #4787
Seems like this is more correct (abs of -0 is +0), and also optimises better (into fabs and similar instructions, instead of a cmp&select)
draft because not handling endian types, want to add some tests, and need to apply it to gb_abs so constants also show the same behaviour.