Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use copysign LLVM intrinsic rather than bithack ourselves #39768

Merged
merged 1 commit into from
Mar 4, 2021

Conversation

wsmoses
Copy link
Contributor

@wsmoses wsmoses commented Feb 20, 2021

LLVM has an internal intrinsic for copysign which is both more compatible across architecture than assuming an and of a sign bit and also better enables LLVM optimizations that understand what functional operation is occuring.

Moreover, while downstream users such as Enzyme.jl are differentiating through an increasing number of bithacks, using the proper intrinsic for this operation makes analysis dramatically easier.

@vchuravy vchuravy requested review from Keno and vtjnash February 20, 2021 20:59
@vchuravy vchuravy added the compiler:codegen Generation of LLVM IR and native code label Feb 20, 2021
@oscardssmith
Copy link
Member

Have you done performance testing to ensure this isn't a regression?

@wsmoses
Copy link
Contributor Author

wsmoses commented Feb 20, 2021

Update: it appears this (and likely flipsign and signbit) implementations are broken on some systems.

From Julia's libm implementation (https://github.com/JuliaMath/openlibm/blob/b34f107e24e97cd7b4eedc6868e330a9ff321120/src/fpmath.h#L98), the sign bit is not guaranteed to be in the place Julia current expects it to be.

Copy link
Member

@Keno Keno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me. I believe this intrinsic didn't exist when this code was first written.

@Keno Keno added the needs nanosoldier run This PR should have benchmarks run on it label Feb 20, 2021
@Keno
Copy link
Member

Keno commented Feb 20, 2021

nanosoldier run would be good

@vchuravy
Copy link
Member

After briefly being reachable, Nanosoidier came down with the cold again...

I did some small scale tests for vectorization and performance was equivalent.

@vchuravy vchuravy added the backport 1.6 Change should be backported to release-1.6 label Feb 20, 2021
@KristofferC KristofferC mentioned this pull request Feb 22, 2021
52 tasks
@Keno
Copy link
Member

Keno commented Feb 23, 2021

I'm not sure there is justification to backport this to 1.6 at this point. It's not a bugfix, and there is always the possibility of finding fun new LLVM behavior with changes like this. It can go into 1.6.1 if it proves safe on master.

@Keno Keno removed the backport 1.6 Change should be backported to release-1.6 label Feb 23, 2021
@wsmoses
Copy link
Contributor Author

wsmoses commented Feb 23, 2021

Per my reading of the Julia libm source code (linked above), I believe there would be a miscompilation without this patch for any architecture that has __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ and __FLOAT_WORD_ORDER__ == __ORDER_BIG_ENDIAN__.

Now I'm not sure what architecture actually has those properties (@vchuravy and I did some thinking and came up empty), but thought would throw out there.

@Keno
Copy link
Member

Keno commented Feb 23, 2021

None of our supported platforms have that property. I'm not even sure any LLVM-supported platforms do.

@vtjnash
Copy link
Member

vtjnash commented Feb 26, 2021

@nanosoldier runbenchmarks("scalar" && ("Complex{Float64}" || "Complex{Float32}"), vs=":master")

1 similar comment
@vtjnash
Copy link
Member

vtjnash commented Feb 26, 2021

@nanosoldier runbenchmarks("scalar" && ("Complex{Float64}" || "Complex{Float32}"), vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed, but no benchmarks were actually executed. Perhaps your tag predicate contains misspelled tags? cc @christopher-dG

@vchuravy
Copy link
Member

vchuravy commented Mar 3, 2021

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @christopher-dG

@vchuravy vchuravy merged commit 3d0b60d into JuliaLang:master Mar 4, 2021
@vchuravy vchuravy added the backport 1.6 Change should be backported to release-1.6 label May 6, 2021
@KristofferC KristofferC mentioned this pull request May 11, 2021
45 tasks
@KristofferC KristofferC removed the backport 1.6 Change should be backported to release-1.6 label Jul 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code needs nanosoldier run This PR should have benchmarks run on it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants