Remove ieee754_rem_pio2 in favor of a rem_pio2_kernel written in Julia. #22603

pkofod · 2017-06-28T21:46:24Z

fixes #22004

So, this will probably require some more proof of accuracy and performance, and I will provide it if needed. I will also add more tests. For now, I would very much like some feedback.

Much of this can benefit from further comments, more tests (as noted above), and thorough testing, but since bikeshedding will occur, let's discuss the function names up front. What you see is just what I came up with.

I also need to add a license comment, as the Cody Waite part is pretty much straight out of openlibm. How should I go about this? Add the original license header somewhere?

/* @(#)k_rem_pio2.c 1.3 95/01/18 */
/*
 * ====================================================
 * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
 *
 * Developed at SunSoft, a Sun Microsystems, Inc. business.
 * Permission to use, copy, modify, and distribute this
 * software is freely granted, provided that this notice 
 * is preserved.
 * ====================================================
 */

**Performance**
The timings below are for values around 0. They are from the RemPiO2 package. I should probably run them from Base as well now that I have this PR.

For the positive values I ran (notice that there is a huge gap, that is why you have that downwards slope instead of a wiggly line on top), it looks something like this (x values are log(x) so the largest value is for something like 2.0^1018).

Performance is strictly better, and accuracy is about the same. I haven't found anything > 1 ulp, and nothing worse than the largest error of openlibm/openspecfun, which is about 0.6 ulp.

pkofod · 2017-06-28T21:48:24Z

(should probably cc @simonbyrne )

tkelman · 2017-06-28T22:06:07Z

base/math.jl

+returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
+such that k mod 3 == K mod 3 where K*π/2 = x -rem.
+"""
+


no blank line here

tkelman · 2017-06-28T22:08:10Z

Since this is a derived work, I'd move it to its own file under base/special and add it to the exception lists in LICENSE.md and contrib/add_license_to_files.jl

musm · 2017-06-28T23:50:45Z

What about the Float32 case https://github.com/JuliaLang/openlibm/blob/master/src/e_rem_pio2f.c ?

pkofod · 2017-06-29T04:32:31Z

> What about the Float32 case https://github.com/JuliaLang/openlibm/blob/master/src/e_rem_pio2f.c ?

Sure, but the current method (that I'm removing) is specifically for Float64. Since it's only used in rem2pi(x::Float64, roundingmode) I guess it would only be relevant to add at this point if a rem2pi for Float32s was added, or if Float32 methods for trig functions were added. There are fallbacks to rem2pi of course:

rem2pi(x::Float32, r::RoundingMode) = Float32(rem2pi(Float64(x), r))

There's not much to the Float32 case (calculate medium case directly, or use the kernel function included in this PR for large values), but I think it makes sense to add it only when it's going to be used (PRs to come), as it's not really something we export.

pkofod · 2017-06-29T04:34:41Z

> Since this is a derived work, I'd move it to its own file under base/special and add it to the exception lists in LICENSE.md and contrib/add_license_to_files.jl

Alright, will do that.

musm · 2017-06-29T05:53:22Z

> Sure, but the current method (that I'm removing) is specifically for Float64. Since it's only used in rem2pi(x::Float64, roundingmode) I guess it would only be relevant to add at this point if a rem2pi for Float32s was added, or if Float32 methods for trig functions were added. There are fallbacks to rem2pi of course

Yep, the Float32 case should be quite straightforward after a correct Float64 method. I think we might as well also port the Float32 case, that way we can get rid of the slower fallback and just use the faster Float32 implementation for rem2pi, which I think is worth it if we are going through the trouble of the Float64 port.

musm · 2017-06-29T06:17:07Z

BTW do you have any ideas as to why the performance and accuracy are so much better than the C version? This is really surprising to me (I would expect perhaps a ~15% speed improvement and essentially equivalent accuracy).

Not sure how much you plan on abstracting and simplifying, but a lot of the openlibm stuff can be simplified if you work at it hard enough and with careful benchmarking, i.e. the high and low splitting for comparisons is not necessary in many cases (so in principle you could defer that until you need it in the cody_waite_ext_pio2 method). These are some general comments from my experience with the openlibm code base, since I have only taken a cursory glance at the PR.

pkofod · 2017-06-29T07:18:07Z

> BTW do you have any ideas as to why the performance and accuracy are so much better than the C version? This is really surprising to me (I would expect perhaps a ~15% speed improvement and essentially equivalent accuracy).

Well, I expected the same thing. This is the sort of performance difference I get for the trig kernels for example (more like 8%-12%).

My very best bet would be that this is somehow related to the fact that I don't pass around a vector. The Payne Hanek implementation is not the same one as in openlibm (it's a modified version of the code Simon Byrne had floating around), but the relative speed-up seems consistent across x, so...

> Not sure how much you plan on abstracting and simplifying, but a lot of the openlibm stuff can be simplified if you work at it hard enough and with careful benchmarking, i.e. the high and low splitting for comparisons is not necessary in many cases (so in principle you could defer that until you need it in the cody_waite_ext_pio2 method)

Yes, I guess I could defer getting the high word until reaching a branch where the more precise C&W scheme is used or reaching Payne Hanek. You think it would have a measurable impact, or is it more for "neat code" reasons? Fwiw, when I'm calling (or going to call) this function from sin, cos, etc, it'll already have the xh and xhp ready, unless that is changed again.
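
For readers unfamiliar with the high-word comparisons discussed here, the following is a minimal sketch of the idea, not the code in this PR. The names `highword_sketch`, `poshighword_sketch`, `small_by_word` and `small_by_float` are hypothetical; the threshold `0x401c463b` is the one quoted in the review diff for |x| ~<= 9π/4. The sketch assumes that `xh` is the upper 32 bits of the Float64 and `xhp` is the same word with the sign bit cleared.

```julia
# Hypothetical helper: the upper 32 bits of a Float64
# (sign bit, 11 exponent bits, top 20 mantissa bits).
highword_sketch(x::Float64) = (reinterpret(UInt64, x) >> 32) % UInt32

# Clear the sign bit so the integer comparison acts on |x|.
poshighword_sketch(x::Float64) = highword_sketch(x) & 0x7fffffff

# Branch test via the high word vs. the equivalent plain float comparison.
small_by_word(x::Float64)  = poshighword_sketch(x) <= 0x401c463b  # |x| ~<= 9π/4
small_by_float(x::Float64) = abs(x) <= 9π/4

for x in (0.5, -7.0, 7.1, 1.0e10)
    println(x, " => ", small_by_word(x), " ", small_by_float(x))
end
```

The two tests can disagree for inputs within a few thousand ulps of the threshold, because the high word ignores the lower 32 mantissa bits; that is what the "~<=" in the source comments signals.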

pkofod · 2017-06-29T09:12:27Z

> Since this is a derived work, I'd move it to its own file under base/special and add it to the exception lists in LICENSE.md and contrib/add_license_to_files.jl

Alright, I tried to do that, but now I'm getting:

/home/pkm/julia/julia/base/precompile.jl
AddrSpaceCast must be between different address spaces
  %.sroa_cast = addrspacecast i8* %65 to i8*, !dbg !75879
LLVM ERROR: Broken function found, compilation aborted!
*** This error is usually fixed by running `make clean`. If the error persists, try `make cleanall`. ***
Makefile:233: recipe for target '/home/pkm/julia/julia/usr/lib/julia/sys.o' failed
make[1]: *** [/home/pkm/julia/julia/usr/lib/julia/sys.o] Error 1
Makefile:109: recipe for target 'julia-sysimg-release' failed
make: *** [julia-sysimg-release] Error 2
pkm@pkm:~/julia/julia$

Wonder what I did wrong (and yes I did try to follow the clean instructions). I placed a new file in special, included it in math, included a skip in contrib, and changed LICENSE.md.

tkelman · 2017-06-29T11:35:09Z

you switched branches since building llvm. do make cleanall, or failing that make -C deps distclean-llvm

pkofod · 2017-06-29T13:16:56Z

That is actually true, will try it, thanks!

Edit: that worked.

What's the consensus on constants such as the precalculated hi and lo values of some numbers? Should they just be defined inside the functions if they're only used in one function, or outside?

tkelman · 2017-06-29T16:02:26Z

LICENSE.md

@@ -34,6 +34,7 @@ Julia includes code from the following projects, which have their own licenses:
 - [LLVM](http://releases.llvm.org/3.9.0/LICENSE.TXT) (for parts of src/jitlayers.cpp and src/disasm.cpp) [BSD-3, effectively]
 - [MUSL](http://git.musl-libc.org/cgit/musl/tree/COPYRIGHT) (for getopt implementation on Windows) [MIT]
 - [MINGW](https://sourceforge.net/p/mingw/mingw-org-wsl/ci/legacy/tree/mingwrt/mingwex/dirname.c) (for dirname implementation on Windows) [MIT]
+- [OPENLIBM](https://github.com/JuliaLang/openlibm/blob/master/LICENSE.md) [MIT, BSD-2, ISC]


what was this particular piece originally derived from? openlibm copied almost everything from somewhere else

tkelman · 2017-06-30T04:31:57Z

base/special/rem_pio2.jl

+    end
+
+    if xhp <= 0x401c463b # |x| ~<= 9pi/4, use Cody Waite with two constants
+        if (xhp <= 0x4015fdbc) # |x| ~<= 7pi/4


no parens around conditions

tkelman · 2017-06-30T04:33:10Z

base/special/rem_pio2.jl

+    rem_pio2_kernel(x, xh, xhp)
+
+returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
+such that k mod 3 == K mod 3 where K*π/2 = x -rem.


maybe some code / math highlighting here? is this supposed to be x - rem or something else?
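
To make the relation in that docstring concrete, here is a slow reference reduction in extended precision, written only as a sketch. The function name `rem_pio2_reference` and the 4096-bit precision are assumptions made for illustration; this is not the kernel from the PR and it returns the full integer K rather than a reduced k.

```julia
# Hypothetical reference reduction, intended for testing only (slow).
# Returns (K, r) with K * π/2 ≈ x - r and |r| <= π/4 (up to final rounding).
function rem_pio2_reference(x::Float64)
    setprecision(BigFloat, 4096) do
        halfpi = big(π) / 2
        K = round(big(x) / halfpi)   # nearest integer multiple of π/2
        r = big(x) - K * halfpi      # remainder, so that K*π/2 == x - r
        (BigInt(K), Float64(r))
    end
end

K, r = rem_pio2_reference(2.0)
println("K = ", K, ", r = ", r)
```

4096 bits is far more than needed for small arguments, but it leaves plenty of fractional bits even for inputs near 2.0^1018, the largest magnitude mentioned in the opening comment.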

tkelman · 2017-06-30T04:33:25Z

base/special/rem_pio2.jl

+returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
+such that k mod 3 == K mod 3 where K*π/2 = x -rem.
+"""
+


no blank line here

tkelman · 2017-06-30T04:34:03Z

base/special/rem_pio2.jl

+"""
+    highword(x)
+
+returns the high word of x as a UInt32.


capitalize and use imperative (Return ...) for docstrings, code highlight UInt32

tkelman · 2017-06-30T04:35:49Z

base/special/rem_pio2.jl

+end
+
+function cody_waite_ext_pio2(x, xʰ⁺)
+    fn = x*invpio2+0x1.8p52


given #6349 it's a bit more portable to not rely on hex floats for bootstrap-necessary code
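
For context, `0x1.8p52` is 1.5 * 2^52, and adding then subtracting it is a classic way to round to the nearest integer. A small sketch with the constant written as a decimal literal instead of a hex float follows; `magic_round` is a hypothetical name, and the trick assumes the default round-to-nearest mode and |x| < 2^51.

```julia
# 6.755399441055744e15 == 1.5 * 2.0^52, the value written as 0x1.8p52 in the diff.
# Adding and then subtracting it rounds x to the nearest integer (ties to even),
# because floats of that magnitude have a spacing of exactly 1.
magic_round(x::Float64) = (x + 6.755399441055744e15) - 6.755399441055744e15

for x in (2.3, 2.5, -3.7)
    println(x, " => ", magic_round(x), " (round gives ", round(x), ")")
end
```

The commit list for this PR notes that `rint` was later replaced by `round`, which gives the same result for these inputs without the magic constant.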

musm · 2017-06-30T06:34:52Z

LICENSE.md

@@ -34,6 +34,7 @@ Julia includes code from the following projects, which have their own licenses:
 - [LLVM](http://releases.llvm.org/3.9.0/LICENSE.TXT) (for parts of src/jitlayers.cpp and src/disasm.cpp) [BSD-3, effectively]
 - [MUSL](http://git.musl-libc.org/cgit/musl/tree/COPYRIGHT) (for getopt implementation on Windows) [MIT]
 - [MINGW](https://sourceforge.net/p/mingw/mingw-org-wsl/ci/legacy/tree/mingwrt/mingwex/dirname.c) (for dirname implementation on Windows) [MIT]
+- [FDLIBM](http://www.netlib.org/fdlibm/readme) [Freely distributable]


If you look down this file I think you also need to update and add a line below

base/special/exp.jl (see FREEBSD MSUN [FreeBSD/2-clause BSD/Simplified BSD License])

Ah, alright, thanks.

What's the difference between this section and the one below? It's not obvious to me.

projects we use vs how we use them? I think this section might actually be for core language/compiler, not base

Well, yes, it's not very clear. It looks like this section only applies to core features, in which case FDLIBM shouldn't be added here (just like MSUN is listed below, but not here).

BTW, if it's kept here, the new line should mention what it's used for, like existing lines.

I don't think this section is the right place for non-core julia code under base

nalimilan · 2017-06-30T08:28:16Z

LICENSE.md

@@ -74,6 +75,7 @@ The following components of Julia's standard library have separate licenses:
 - base/sparse/umfpack.jl (see [SUITESPARSE](http://faculty.cse.tamu.edu/davis/suitesparse.html))
 - base/sparse/cholmod.jl (see [SUITESPARSE](http://faculty.cse.tamu.edu/davis/suitesparse.html))
 - base/special/exp.jl (see [FREEBSD MSUN](https://github.com/freebsd/freebsd) [FreeBSD/2-clause BSD/Simplified BSD License])
+- base/special/rem_pio2.jl [FDLIBM](http://www.netlib.org/fdlibm/readme) [Freely distributable]


Should point to a copy of the license, as "freely distributable" is a bit vague (and in particular it doesn't say that the copyright attribution should be preserved when distributing).

I can point to http://www.netlib.org/fdlibm/e_rem_pio2.c ? There isn't really a license file in fdlibm as far as I can see. I could write "Freely distributable with preserved copyright notice."?

Doing both sounds like a good idea to me.

The license is the same as the exp code line

Why? I mean where is it stated?

I went off of https://github.com/JuliaLang/openlibm/blob/master/LICENSE.md#freebsd-msun-freebsd2-clause-bsdsimplified-bsd-license

Sorry, I know next to nothing about licensing, but is it necessary to go with the msun license if they simply took it from fdlibm?

hmmm, good point, maybe someone more knowledgeable here can comment, if you do end up changing it best to also modify the one for exp.jl since they should be the same

Actually what you have probably makes the most sense

musm · 2017-07-26T18:57:26Z

test/mod2pi.jl

+        n, ret = Base.Math.rem_pio2_kernel(-case)
+        ret_sum = ret.hi+ret.lo
+        ulp_error = (ret_sum-ieee754_rem_pio2_return[1, i])/abs(ret_sum-nextfloat(ret_sum))
+        @test ulp_error < 0.5
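
As an illustration of measuring the error in ulps of the reference value (the approach named in the later commit "More appropriate ulp test"), here is a minimal sketch. The helper `ulp_error` below is hypothetical and takes a BigFloat reference; `sin` is used only as a stand-in for a function under test.

```julia
# Hypothetical helper: error of `approx` measured in ulps of the reference value.
function ulp_error(approx::Float64, reference::BigFloat)
    ref64 = Float64(reference)                     # reference rounded to Float64
    Float64(abs(big(approx) - reference) / eps(ref64))
end

# Usage with sin as a stand-in for the function under test.
x = 0.75
err = ulp_error(sin(x), sin(big(x)))
println("error in ulps: ", err)
```

Dividing by `eps` of the reference rather than of the computed value keeps the unit fixed even when the computed value lands in a neighbouring binade.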


musm · 2017-07-26T19:00:35Z

lgtm (minor comments added)

Doesn't look like performance is any different than in yyc's branch 👍

pkofod · 2017-07-30T07:53:47Z

Just for good measure, can this be benchmarked against master now that yuyichao's PR is merged (and formatting is fixed)?

KristofferC · 2017-07-30T08:22:18Z

@nanosoldier runbenchmarks(ALL, vs = ":master")

nanosoldier · 2017-07-30T12:06:54Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

pkofod · 2017-07-30T17:09:11Z

The relevant benchmarks seem okay. I would love to use a two coefficient Cody Waite up into some of the Payne Hanek interval if hardware fma is available, but let's see if that's the best use of my time. I think the current implementation is good enough that we can use it, and ditch openspecfuns.
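
For readers who have not seen it, a two-constant Cody-Waite reduction can be sketched as follows. This is not the code from the PR: `cody_waite_2c_sketch` is a hypothetical name, and the constants are the ones I recall from fdlibm's e_rem_pio2.c (π/2 split into a 33-bit head and a tail), so treat them as an assumption to be checked against that file.

```julia
# Sketch of a two-constant Cody-Waite reduction for moderate |x|.
# Constants as recalled from fdlibm's e_rem_pio2.c (assumption, verify before use).
function cody_waite_2c_sketch(x::Float64)
    invpio2 = 6.36619772367581382433e-01   # 2/π
    pio2_1  = 1.57079632673412561417e+00   # first 33 bits of π/2
    pio2_1t = 6.07710050650619224932e-11   # π/2 - pio2_1
    fn = round(x * invpio2)                # nearest multiple of π/2
    r  = x - fn * pio2_1                   # exact while fn stays small (33-bit head)
    w  = fn * pio2_1t                      # correction from the tail of π/2
    return trunc(Int, fn), r - w
end

n, y = cody_waite_2c_sketch(2.0)
println("n = ", n, ", y = ", y)
```

With only two constants the result loses accuracy when x is very close to a multiple of π/2 or when |x| grows, which is why the openlibm scheme adds further constants and eventually hands over to Payne Hanek.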

tkelman · 2017-07-30T23:42:30Z

test/mod2pi.jl

+             2.0^80*pi/4] # |x| >= 2.0^20π/2, idx > 0-0.22370138542135648
+
+     # ieee754_rem_pio2_return contains the returned value from the ieee754_rem_pio2
+     # function in openlibm https://github.com/JuliaLang/openlibm/blob/master/src/e_rem_pio2.c


we may want to point to a release tag instead of master on this link in case this file changes or gets moved around in the future

musm · 2017-07-31T19:55:49Z

> The relevant benchmarks seem okay. I would love to use a two coefficient Cody Waite up into some of the Payne Hanek interval if hardware fma is available, but let's see if that's the best use of my time. I think the current implementation is good enough that we can use it, and ditch openspecfuns.

@pkofod what's the ulp error without hardware fma for Float32 and Float64 in that case? Note that the exp10 function for the Float64 case without hardware fma has an error slightly greater than 1 ulp, but is otherwise < 1 ulp for Float32 and Float64 with hardware fma.
So it may be tolerable.

musm · 2017-07-31T19:59:55Z

On an unrelated note, I find a lot of these files hard to read due to the lack of spaces between the arguments

pkofod · 2017-07-31T20:10:20Z

> On an unrelated note, I find a lot of these files hard to read due to the lack of spaces between the arguments

arguments as in f(a,b,c)? That should only be in math.jl as I generally add spaces, but I didn't want to change all those lines just to add spaces (is there an official style for this?). Where I may be a lot more inconsistent is in a+b vs a + b or a*b+c vs a*b + c or variants of that...

musm · 2017-07-31T20:12:31Z

> arguments as in f(a,b,c)? That should only be in math.jl as I generally add spaces, but I didn't want to change all those lines just to add spaces (is there an official style for this?). Where I may be a lot more inconsistent is in a+b vs a + b or a*b+c vs a*b + c or variants of that...

e.g

https://github.com/JuliaLang/julia/pull/22603/files#diff-8278b779f2ea681192ba5b020a2c3e2bR923

pkofod · 2017-07-31T20:14:25Z

> e.g
>
> https://github.com/JuliaLang/julia/pull/22603/files#diff-8278b779f2ea681192ba5b020a2c3e2bR923

I agree, I can change that, but I figured that I just wanted to make the relevant changes.

tkelman · 2017-08-01T05:01:28Z

Are we overall satisfied with the performance and accuracy of this? What else needs doing here?

pkofod · 2017-08-01T05:28:43Z

The only real change I can think of would be to change the branch conditionals to things like abs(x)<pi/4 (so, float comparisons instead of the highword trickery), but that could be changed later; I don't really think it's that important beyond style.

pkofod · 2017-08-01T08:01:25Z

is the arpack error on travis JuliaLang/LinearAlgebra.jl#354 ?

tkelman · 2017-08-01T08:20:20Z

Might be? Triggered by #22963 as far as I can tell, which we probably should have tried harder to verify it could get through CI (despite all the timeouts lately)

pkofod · 2017-08-01T13:36:30Z

> (despite all the timeouts lately)

yeah, that's the other failure on travis

simonbyrne · 2017-08-01T17:57:47Z

Okay then, I'll pull the trigger on this tomorrow, unless there are any further objections.

tkelman · 2017-08-01T18:39:16Z

definitely squash since there are a lot of little commits here and I think some of the intermediate states had failed

ararslan · 2017-08-01T18:42:01Z

I just want to say thanks again for this. This is really exciting, because 1.) it means we can excise a binary dependency from Base, and 2.) it's further proof of how powerful Julia can really be for mathematical computing. You've done an awesome job here!

pkofod · 2017-08-01T19:28:35Z

> Okay then, I'll pull the trigger on this tomorrow, unless there are any further objections.

Cool, thanks.

> definitely squash since there are a lot of little commits here and I think some of the intermediate states had failed

Definitely!

> I just want to say thanks again for this. This is really exciting, because 1.) it means we can excise a binary dependency from Base, and 2.) it's further proof of how powerful Julia can really be for mathematical computing. You've done an awesome job here!

Pleasure is on my side! Learned a lot about floating point arithmetic 😄 and deadlines 😑

…a. (JuliaLang/julia#22603)

* Remove ieee754_rem_pio2 in favor of a rem_pio2_kernel written in Julia.
* Add missing begin key.
* Remove _approx.
* Move to separate files.
* Fix LICENSE.md to mention FDLIBM instead of Openlibm.
* Address comments.
* Strengthen test to faithfully rounded.
* Fix LICENSE.md message for rem_pio2.
* Fix style in LICENSE.md entry.
* Remove semicolons.
* Move highword up, and remove duplicate unsafe_trunc.
* Fix LICENSE.md by removing a bullet and changing license of base/special/exp.jl.
* Change license info for base/special/exp.jl.
* Small changes.
* Get and reset precision for BigFloats, and space before rem in -rem.
* setprecision do
* Add comments, move test, and switch to muladd in some places.
* Fix y1 branches of rem2pi.
* Small changes.
* Move comment in rem_pio2.jl and add test for fast branch of mod2pi.
* rint docstring fix and make it clear what the constant is.
* Update comment for INV2PI.
* Fix wrong test set name.
* Tests against ieee754_rem_pio2 output.
* Inline cody_waite functions.
* rint -> round, remove rint, remove one argument cody waite, replace Int(x) with trunc(Int, x).
* Add some tests.
* Inline rem_pio2_kernel, and rearrange code slightly.
* fix xhp
* Use DoubleFloat64.
* Move constants into functions.
* Fix escaping of mod
* Fix tests and remove specific variables.
* Fix tests
* Fix issues raised in comments.
* More appropriate ulp test (test against eps of reference number).
* Change link to a stable github link.

Remove ieee754_rem_pio2 in favor of a rem_pio2_kernel written in Julia.

3f951d1

ararslan requested a review from simonbyrne June 28, 2017 21:51

ararslan added the maths Mathematical functions label Jun 28, 2017

tkelman reviewed Jun 28, 2017

Add missing begin key.

62a91ed

Remove _approx.

16ec5f0

Move to separate files.

6f8e725

tkelman reviewed Jun 29, 2017

Fix LICENSE.md to mention FDLIBM instead of Openlibm.

c6f593e

tkelman reviewed Jun 30, 2017

musm reviewed Jun 30, 2017

pkofod added 2 commits June 30, 2017 09:38

Address comments.

5fd35f4

Strengthen test to faithfully rounded.

1d2faa6

nalimilan reviewed Jun 30, 2017

Fix LICENSE.md message for rem_pio2.

7878d2c

musm reviewed Jul 26, 2017

pkofod added 2 commits July 26, 2017 21:07

More appropriate ulp test (test against eps of reference number).

0545479

Merge branch 'master' into rempio2gsoc

8ee70a1

tkelman reviewed Jul 30, 2017

Change link to a stable github link.

765c566

Merge remote-tracking branch 'pkofod/rempio2gsoc' into rempio2gsoc

32d2839

simonbyrne merged commit 27852fd into JuliaLang:master Aug 2, 2017

pkofod deleted the rempio2gsoc branch August 2, 2017 20:24

ararslan mentioned this pull request Sep 6, 2017

Stop building and shipping openspecfun #23598

Merged

pkofod mentioned this pull request Dec 19, 2018

Adjusted code to allow users to modify mortality assumptions in cstwMPC econ-ark/HARK#214

Closed
