Use copysign LLVM intrinsic rather than bithack ourselves #39768
Have you done performance testing to ensure this isn't a regression? |
Update: it appears this implementation (and likely those of flipsign and signbit) is broken on some systems. According to Julia's libm implementation (https://github.com/JuliaMath/openlibm/blob/b34f107e24e97cd7b4eedc6868e330a9ff321120/src/fpmath.h#L98), the sign bit is not guaranteed to be where Julia currently expects it to be. |
Seems fine to me. I believe this intrinsic didn't exist when this code was first written.
nanosoldier run would be good |
After briefly being reachable, Nanosoldier came down with a cold again... I did some small-scale tests for vectorization, and performance was equivalent. |
I'm not sure there is justification to backport this to 1.6 at this point. It's not a bugfix, and there is always the possibility of finding fun new LLVM behavior with changes like this. It can go into 1.6.1 if it proves safe on master. |
Per my reading of the Julia libm source code (linked above), I believe there would be a miscompilation without this patch for any architecture that has Now I'm not sure what architecture actually has those properties (@vchuravy and I did some thinking and came up empty), but thought I would throw it out there. |
None of our supported platforms have that property. I'm not even sure any LLVM-supported platforms do. |
@nanosoldier |
@nanosoldier |
Your benchmark job has completed, but no benchmarks were actually executed. Perhaps your tag predicate contains misspelled tags? cc @christopher-dG |
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @christopher-dG |
LLVM has a built-in intrinsic for copysign, which is both more portable across architectures than assuming a bitwise AND with a sign-bit mask and better at enabling LLVM optimizations that understand which functional operation is occurring.
Moreover, while downstream users such as Enzyme.jl are differentiating through an increasing number of bithacks, using the proper intrinsic for this operation makes analysis dramatically easier.
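For readers unfamiliar with the two approaches, here is a minimal C sketch of the difference, not the code changed in this PR. The function names `copysign_bithack` and `copysign_portable` are hypothetical, and the bit-hack version assumes an IEEE 754 binary64 `double` whose sign occupies bit 63 of the reinterpreted integer. The portable version calls the standard C `copysign`, which Clang typically lowers to the `llvm.copysign` intrinsic.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Bit-hack version: masks the sign bit by hand.
   ASSUMPTION: double is IEEE 754 binary64 and the sign lives in bit 63
   of the 64-bit integer reinterpretation. */
static double copysign_bithack(double x, double y) {
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t xb, yb;
    memcpy(&xb, &x, sizeof xb);   /* reinterpret bits without aliasing UB */
    memcpy(&yb, &y, sizeof yb);
    xb = (xb & ~sign_mask) | (yb & sign_mask);  /* magnitude of x, sign of y */
    memcpy(&x, &xb, sizeof x);
    return x;
}

/* Portable version: states the operation by name and leaves the
   representation details to the compiler and target. */
static double copysign_portable(double x, double y) {
    return copysign(x, y);
}
```

On common platforms both functions return the same result, for example `-3.0` for the arguments `(3.0, -1.0)`. The named operation is the one a compiler or a tool such as an automatic differentiator can recognize without reverse-engineering the mask.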