Remove ieee754_rem_pio2 in favor of a rem_pio2_kernel written in Julia. #22603
(should probably cc @simonbyrne ) |
base/math.jl
Outdated
returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
such that k mod 3 == K mod 3 where K*π/2 = x -rem.
"""
|
no blank line here
Since this is a derived work, I'd move it to its own file under |
What about the Float32 case |
Sure, but the current method (that I'm removing) is specifically for rem2pi(x::Float32, r::RoundingMode) = Float32(rem2pi(Float64(x), r)) There's not much to the |
Alright, will do that. |
Yep, the Float32 case should be quite straightforward after a correct Float64 method. I think we might as well also port the Float32 case, that way we can get rid of the slower fallback and just use the faster Float32 implementation for |
BTW, do you have any ideas as to why the performance and accuracy are so much better than the C version? This is really surprising to me (I would expect perhaps a ~15% speed improvement and essentially equivalent accuracy). Not sure how much you plan on abstracting and simplifying, but a lot of the openlibm stuff can be simplified if you work at it hard enough and with careful benchmarking, i.e. the high and low splitting for comparisons is not necessary in many cases (so in principle you could defer that until you need it in the cody_waite_ext_pio2 method). These are some general comments from my experience with the openlibm code base, since I have only taken a cursory glance at the PR.
Well, I expected the same thing. This is the sort of performance difference I get for the trig kernels for example (more like 8%-12%). My very best bet would be that this is somehow related to the fact that I don't pass around a vector. The Payne Hanek implementation is not the same one as in openlibm (it's a modified version of the code Simon Byrne had floating around), but the relative speed-up seems consistent across
Yes, I guess I could defer getting the high word until reaching a branch where the more precise C&W scheme is used or reaching Payne Hanek. You think it would have a measurable impact, or is it more for "neat code" reasons? Fwiw, when I'm calling (or going to call) this function from |
Alright, I tried to do that, but now I'm getting:
Wonder what I did wrong (and yes I did try to follow the clean instructions). I placed a new file in |
you switched branches since building llvm. do |
That is actually true, will try it, thanks! Edit: that worked. What's the consensus on constants such as the precalculated hi and lo values of some numbers? Should they just be defined inside the functions if they're only used in one function, or outside? |
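For what it's worth, a precalculated hi/lo pair like the ones asked about here can be reproduced with BigFloat. This is only a sketch: the names `PIO2_HI` and `PIO2_LO` are made up for illustration and are not the constants defined in the PR.

```julia
# Hypothetical names: split π/2 into a Float64 leading part and a Float64 tail.
const PIO2_HI = Float64(big(pi) / 2)            # π/2 rounded to Float64
const PIO2_LO = Float64(big(pi) / 2 - PIO2_HI)  # the part that the rounding dropped

# Together, hi + lo carries roughly twice the precision of a single Float64.
@show PIO2_HI PIO2_LO
```

Whether such constants live inside the function or at module level does not change their values, so the question above is mostly one of readability.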
LICENSE.md
Outdated
@@ -34,6 +34,7 @@ Julia includes code from the following projects, which have their own licenses:
- [LLVM](http://releases.llvm.org/3.9.0/LICENSE.TXT) (for parts of src/jitlayers.cpp and src/disasm.cpp) [BSD-3, effectively]
- [MUSL](http://git.musl-libc.org/cgit/musl/tree/COPYRIGHT) (for getopt implementation on Windows) [MIT]
- [MINGW](https://sourceforge.net/p/mingw/mingw-org-wsl/ci/legacy/tree/mingwrt/mingwex/dirname.c) (for dirname implementation on Windows) [MIT]
- [OPENLIBM](https://github.com/JuliaLang/openlibm/blob/master/LICENSE.md) [MIT, BSD-2, ISC]
what was this particular piece originally derived from? openlibm copied almost everything from somewhere else
FDLIBM
base/special/rem_pio2.jl
Outdated
end

if xhp <= 0x401c463b # |x| ~<= 9pi/4, use Cody Waite with two constants
if (xhp <= 0x4015fdbc) # |x| ~<= 7pi/4
no parens around conditions
base/special/rem_pio2.jl
Outdated
rem_pio2_kernel(x, xh, xhp)

returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
such that k mod 3 == K mod 3 where K*π/2 = x -rem.
maybe some code / math highlighting here? is this supposed to be x - rem or something else?
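To make the docstring's contract concrete, here is a rough sketch of a two-constant Cody-Waite reduction for small arguments. The three constants are the ones I recall from FDLIBM's e_rem_pio2.c; the function name and the simplified return value (a plain Float64 remainder rather than a hi/lo pair) are assumptions for illustration, not the code in this PR.

```julia
const INVPIO2 = 6.36619772367581382433e-01  # ≈ 2/π
const PIO2_1  = 1.57079632673412561417e+00  # leading 33 bits of π/2
const PIO2_1T = 6.07710050650619224932e-11  # π/2 - PIO2_1

# Hypothetical helper: returns (n, y) with x ≈ n*π/2 + y.
function cody_waite_sketch(x::Float64)
    fn = round(x * INVPIO2)   # nearest integer multiple of π/2, as a Float64
    r  = x - fn * PIO2_1      # fn * PIO2_1 is exact for small fn (PIO2_1 has trailing zero bits)
    w  = fn * PIO2_1T         # correction from the tail of π/2
    return trunc(Int, fn), r - w
end

n, y = cody_waite_sketch(2.0)   # n == 1, y ≈ 2 - π/2
```

So the remainder here is `x - n*π/2`, which is how I would read the `x - rem` in the docstring.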
base/special/rem_pio2.jl
Outdated
returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
such that k mod 3 == K mod 3 where K*π/2 = x -rem.
"""
|
no blank line here
base/special/rem_pio2.jl
Outdated
""" | ||
highword(x)

returns the high word of x as a UInt32.
capitalize and use imperative (Return ...) for docstrings, code highlight UInt32
and x
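For readers following along, a minimal sketch of what a high-word extraction could look like for a Float64. The name `highword_sketch` is hypothetical and the PR's own `highword` may be written differently.

```julia
# Sketch: the top 32 bits of a Float64 (sign, exponent, and the leading 20 mantissa bits).
highword_sketch(x::Float64) = (reinterpret(UInt64, x) >> 32) % UInt32

# Masking off the sign bit lets one compare magnitudes with plain integer comparisons.
xhp = highword_sketch(-1.0) & 0x7fffffff
```

With this, a test such as `xhp <= 0x401c463b` is a cheap integer stand-in for `|x| ~<= 9pi/4`.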
base/special/rem_pio2.jl
Outdated
end

function cody_waite_ext_pio2(x, xʰ⁺)
fn = x*invpio2+0x1.8p52
given #6349 it's a bit more portable to not rely on hex floats for bootstrap-necessary code
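A small sketch of the portable alternative, assuming the constant is only used for the usual add-and-subtract rounding trick. The names `ROUND_MAGIC` and `round_via_magic` are made up for this example.

```julia
# 0x1.8p52 means 1.5 * 2^52; this is the same value written as a decimal literal.
const ROUND_MAGIC = 6.755399441055744e15

# Adding and then subtracting the constant rounds x to the nearest integer
# (ties to even) under the default rounding mode, for |x| well below 2^51.
round_via_magic(x::Float64) = (x + ROUND_MAGIC) - ROUND_MAGIC

round_via_magic(2.6)   # 3.0
```

The decimal literal parses to exactly the same Float64, so swapping it in should not change any results.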
LICENSE.md
Outdated
@@ -34,6 +34,7 @@ Julia includes code from the following projects, which have their own licenses:
- [LLVM](http://releases.llvm.org/3.9.0/LICENSE.TXT) (for parts of src/jitlayers.cpp and src/disasm.cpp) [BSD-3, effectively]
- [MUSL](http://git.musl-libc.org/cgit/musl/tree/COPYRIGHT) (for getopt implementation on Windows) [MIT]
- [MINGW](https://sourceforge.net/p/mingw/mingw-org-wsl/ci/legacy/tree/mingwrt/mingwex/dirname.c) (for dirname implementation on Windows) [MIT]
- [FDLIBM](http://www.netlib.org/fdlibm/readme) [Freely distributable]
If you look down this file, I think you also need to update and add a line below base/special/exp.jl (see FREEBSD MSUN [FreeBSD/2-clause BSD/Simplified BSD License])
Ah, alright, thanks.
What's the difference between this section and the one below? It's not obvious to me.
projects we use vs how we use them? I think this section might actually be for core language/compiler, not base
Well, yes, it's not very clear. It looks like this section only applies to core features, in which case FDLIBM shouldn't be added here (just like MSUN is listed below, but not here).
BTW, if it's kept here, the new line should mention what it's used for, like existing lines.
I don't think this section is the right place for non-core julia code under base
LICENSE.md
Outdated
@@ -74,6 +75,7 @@ The following components of Julia's standard library have separate licenses:
- base/sparse/umfpack.jl (see [SUITESPARSE](http://faculty.cse.tamu.edu/davis/suitesparse.html))
- base/sparse/cholmod.jl (see [SUITESPARSE](http://faculty.cse.tamu.edu/davis/suitesparse.html))
- base/special/exp.jl (see [FREEBSD MSUN](https://github.com/freebsd/freebsd) [FreeBSD/2-clause BSD/Simplified BSD License])
- base/special/rem_pio2.jl [FDLIBM](http://www.netlib.org/fdlibm/readme) [Freely distributable]
Should point to a copy of the license, as "freely distributable" is a bit vague (and in particular it doesn't say that the copyright attribution should be preserved when distributing).
I can point to http://www.netlib.org/fdlibm/e_rem_pio2.c ? There isn't really a license file in fdlibm as far as I can see. I could write "Freely distributable with preserved copyright notice."?
Doing both sounds like a good idea to me.
The license is the same as the exp code line
Why? I mean where is it stated?
Sorry, I know next to nothing about licensing, but is it necessary to go with the msun license if they simply took it from fdlibm?
Hmmm, good point, maybe someone more knowledgeable here can comment. If you do end up changing it, best to also modify the one for exp.jl since they should be the same.
Actually what you have probably makes the most sense
test/mod2pi.jl
Outdated
n, ret = Base.Math.rem_pio2_kernel(-case)
ret_sum = ret.hi+ret.lo
ulp_error = (ret_sum-ieee754_rem_pio2_return[1, i])/abs(ret_sum-nextfloat(ret_sum))
@test ulp_error < 0.5
<=
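As a cross-check that does not depend on the stored openlibm output, one could also measure the ulp error against a BigFloat reference. This is just a sketch under my own assumptions: `rem_pio2_reference` and `ulp_error_sketch` are hypothetical helpers, and the 1024-bit precision is an arbitrary choice that would have to be raised for arguments near `floatmax`.

```julia
# Reference remainder of x modulo π/2, computed in extended precision.
function rem_pio2_reference(x::Float64)
    setprecision(BigFloat, 1024) do
        halfpi = big(pi) / 2
        big(x) - round(big(x) / halfpi) * halfpi
    end
end

# Error of `approx` in units of the Float64 spacing at the reference value.
ulp_error_sketch(approx::Float64, ref::BigFloat) =
    Float64(abs(big(approx) - ref) / eps(Float64(ref)))

ref = rem_pio2_reference(2.0)
err = ulp_error_sketch(Float64(ref), ref)   # a correctly rounded result stays within 0.5
```

Measuring against `eps` of the reference rather than of the computed value keeps the denominator independent of the result being tested.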
lgtm (minor comments added) Doesn't look like performance is any different than in |
Just for good measure, can this be benchmarked against master now that yuyichao's PR is merged (and formatting is fixed). |
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
The relevant benchmarks seem okay. I would love to use a two coefficient Cody Waite up into some of the Payne Hanek interval if hardware fma is available, but let's see if that's the best use of my time. I think the current implementation is good enough that we can use it, and ditch openspecfuns. |
test/mod2pi.jl
Outdated
2.0^80*pi/4] # |x| >= 2.0^20π/2, idx > 0-0.22370138542135648

# ieee754_rem_pio2_return contains the returned value from the ieee754_rem_pio2
# function in openlibm https://github.com/JuliaLang/openlibm/blob/master/src/e_rem_pio2.c
we may want to point to a release tag instead of master on this link in case this file changes or gets moved around in the future
@pkofod what's the ulp error without hardware fma for Float32 and Float64 in that case? Note the |
On an unrelated note, I find a lot of these files hard to read due to the lack of spaces between the arguments |
arguments as in |
e.g |
I agree, I can change that, but I figured that I just wanted to make the relevant changes. |
Are we overall satisfied with the performance and accuracy of this? What else needs doing here? |
The only real change I can think of would be to change the branch conditionals to things like |
is the arpack error on travis JuliaLang/LinearAlgebra.jl#354 ? |
Might be? Triggered by #22963 as far as I can tell, which we probably should have tried harder to verify it could get through CI (despite all the timeouts lately) |
yeah, that's the other failure on travis |
Okay then, I'll pull the trigger on this tomorrow, unless there are any further objections. |
definitely squash since there are a lot of little commits here and I think some of the intermediate states had failed |
I just want to say thanks again for this. This is really exciting, because 1.) it means we can excise a binary dependency from Base, and 2.) it's further proof of how powerful Julia can really be for mathematical computing. You've done an awesome job here! |
Cool, thanks.
Definitely!
Pleasure is on my side! Learned a lot about floating point arithmetic 😄 and deadlines 😑 |
…a. (JuliaLang/julia#22603)
* Remove ieee754_rem_pio2 in favor of a rem_pio2_kernel written in Julia.
* Add missing begin key.
* Remove _approx.
* Move to separate files.
* Fix LICENSE.md to mention FDLIBM instead of Openlibm.
* Address comments.
* Strengthen test to faithfully rounded.
* Fix LICENSE.md message for rem_pio2.
* Fix style in LICENSE.md entry.
* Remove semicolons.
* Move highword up, and remove duplicate unsafe_trunc.
* Fix LICENSE.md by removing a bullet and changing license of base/special/exp.jl.
* Change license info for base/special/exp.jl.
* Small changes.
* Get and reset precision for BigFloats, and space before rem in -rem.
* setprecision do
* Add comments, move test, and switch to muladd in some places.
* Fix y1 branches of rem2pi.
* Small changes.
* Move comment in rem_pio2.jl and add test for fast branch of mod2pi.
* rint docstring fix and make it clear what the constant is.
* Update comment for INV2PI.
* Fix wrong test set name.
* Tests against ieee754_rem_pio2 output.
* Inline cody_waite functions.
* rint -> round, remove rint, remove one argument cody waite, replace Int(x) with trunc(Int, x).
* Add some tests.
* Inline rem_pio2_kernel, and rearrange code slightly.
* fix xhp
* Use DoubleFloat64.
* Move constants into functions.
* Fix escaping of mod
* Fix tests and remove specific variables.
* Fix tests
* Fix issues raised in comments.
* More appropriate ulp test (test against eps of reference number).
* Change link to a stable github link.
fixes #22004
So, this will probably require some more proof of accuracy and performance, and I will provide it if needed. I will also add more tests. For now, I would very much like some feedback.
Much of this can benefit from further comments, more tests (as noted above), and thorough testing, but since bikeshedding will occur, let us discuss the function names up front. What you see is just what I came up with.
I also need to add a license-comment as the cody waite part is pretty much straight out of openlibm. How should I go about this? Add the original license header somewhere?
**Performance**
The timings below are for values around 0. They are from the RemPiO2 package. I should probably also run them from Base now that I have this PR.
For the positive values I ran (notice that there is a huge gap, which is why you see that downward slope instead of a wiggly line on top), it looks something like this (x values are log(x), so the largest value is for something like 2.0^1018).
Performance is strictly better, and accuracy is about the same. I haven't found anything > 1 ulp, and nothing worse than the roughly 0.6 ulp largest error of openlibm/openspecfun.