Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ieee754_rem_pio2 in favor of a rem_pio2_kernel written in Julia. #22603

Merged
merged 39 commits into from
Aug 2, 2017
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
3f951d1
Remove ieee754_rem_pio2 in favor of a rem_pio2_kernel written in Julia.
pkofod Jun 28, 2017
62a91ed
Add missing begin key.
pkofod Jun 29, 2017
16ec5f0
Remove _approx.
pkofod Jun 29, 2017
6f8e725
Move to separate files.
pkofod Jun 29, 2017
c6f593e
Fix LICENSE.md to mention FDLIBM instead of Openlibm.
pkofod Jun 29, 2017
5fd35f4
Address comments.
pkofod Jun 30, 2017
1d2faa6
Strengthen test to faithfully rounded.
pkofod Jun 30, 2017
7878d2c
Fix LICENSE.md message for rem_pio2.
pkofod Jun 30, 2017
0017edd
Fix style in LICENSE.md entry.
pkofod Jun 30, 2017
8258c48
Remove semicolons.
pkofod Jun 30, 2017
83da1a2
Move highword up, and remove duplicate unsafe_trunc.
pkofod Jun 30, 2017
27455bf
Fix LICENSE.md by removing a bullet and changing license of base/spec…
pkofod Jun 30, 2017
bb626b3
Change license info for base/special/exp.jl.
pkofod Jul 1, 2017
45d4ba1
Small changes.
pkofod Jul 1, 2017
ff19e40
Get and reset precision for BigFloats, and space before rem in -rem.
pkofod Jul 1, 2017
375b8af
setprecision do
pkofod Jul 1, 2017
8ca63c1
Add comments, move test, and switch to muladd in some places.
pkofod Jul 5, 2017
1ef4918
Fix y1 branches of rem2pi.
pkofod Jul 5, 2017
a1f1c34
Small changes.
pkofod Jul 5, 2017
ecc6963
Move comment in rem_pio2.jl and add test for fast branch of mod2pi.
pkofod Jul 6, 2017
f8090f5
rint docstring fix and make it clear what the constant is.
pkofod Jul 6, 2017
eed7e62
Update comment for INV2PI.
pkofod Jul 6, 2017
dd2c152
Fix wrong test set name.
pkofod Jul 6, 2017
c06e2f3
Tests against ieee754_rem_pio2 output.
pkofod Jul 8, 2017
6a5dc4a
Inline cody_waite functions.
pkofod Jul 8, 2017
50e52e3
rint -> round, remove rint, remove one argument cody waite, replace I…
pkofod Jul 9, 2017
7b6213e
Add some tests.
pkofod Jul 9, 2017
ae8577c
Inline rem_pio2_kernel, and rearrange code slightly.
pkofod Jul 13, 2017
ba62caf
fix xhp
pkofod Jul 14, 2017
f209c0a
Use DoubleFloat64.
pkofod Jul 21, 2017
994e158
Move constants into functions.
pkofod Jul 22, 2017
aba4579
Fix escaping of mod
pkofod Jul 22, 2017
4865a41
Fix tests and remove specific variables.
pkofod Jul 22, 2017
6dda42c
Fix tests
pkofod Jul 23, 2017
8adb120
Fix issues raised in comments.
pkofod Jul 26, 2017
0545479
More appropriate ulp test (test against eps of reference number).
pkofod Jul 26, 2017
8ee70a1
Merge branch 'master' into rempio2gsoc
pkofod Jul 30, 2017
765c566
Change link to a stable github link.
pkofod Jul 31, 2017
32d2839
Merge remote-tracking branch 'pkofod/rempio2gsoc' into rempio2gsoc
pkofod Jul 31, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions LICENSE.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ Julia includes code from the following projects, which have their own licenses:
- [LLVM](http://releases.llvm.org/3.9.0/LICENSE.TXT) (for parts of src/jitlayers.cpp and src/disasm.cpp) [BSD-3, effectively]
- [MUSL](http://git.musl-libc.org/cgit/musl/tree/COPYRIGHT) (for getopt implementation on Windows) [MIT]
- [MINGW](https://sourceforge.net/p/mingw/mingw-org-wsl/ci/legacy/tree/mingwrt/mingwex/dirname.c) (for dirname implementation on Windows) [MIT]
- [OPENLIBM](https://github.com/JuliaLang/openlibm/blob/master/LICENSE.md) [MIT, BSD-2, ISC]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was this particular piece originally derived from? openlibm copied almost everything from somewhere else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FDLIBM

- [NetBSD](http://www.netbsd.org/about/redistribution.html) (for setjmp, longjmp, and strptime implementations on Windows) [BSD-3]
- [Python](https://docs.python.org/2/license.html) (for strtod implementation on Windows) [BSD-3, effectively]
- [randmtzig.c](https://github.com/JuliaLang/julia/blob/master/test/perf/micro/randmtzig.c) for Gaussian random number generation (for C benchmarks only) [BSD-3]
Expand Down
273 changes: 1 addition & 272 deletions base/math.jl
Original file line number Diff line number Diff line change
Expand Up @@ -738,278 +738,6 @@ function add22condh(xh::Float64, xl::Float64, yh::Float64, yl::Float64)
return zh
end

const invpio2 = 6.36619772367581382433e-01
const pio2_1 = 1.57079632673412561417e+00
const pio2_1t = 6.07710050650619224932e-11
const pio2_2 = 6.07710050630396597660e-11
const pio2_2t = 2.02226624879595063154e-21
const pio2_3 = 2.02226624871116645580e-21
const pio2_3t = 8.47842766036889956997e-32

function cody_waite_2c_pio2(x, signed_k, n)
z = x - signed_k*pio2_1;
y1 = z - signed_k*pio2_1t;
y2 = (z - y1) - signed_k*pio2_1t;
n, y1, y2
end

function cody_waite_ext_pio2(x, xʰ⁺)
fn = x*invpio2+0x1.8p52
fn = fn-0x1.8p52
n = Int32(fn)
r = x-fn*pio2_1
w = fn*pio2_1t # 1st round good to 85 bit
j = xʰ⁺>>20
y1 = r-w
high = highword(y1)
i = j-((high>>20)&0x7ff)
if i>16 # 2nd iteration needed, good to 118
t = r
w = fn*pio2_2
r = t-w
w = fn*pio2_2t-((t-r)-w)
y1 = r-w
high = highword(y1)
i = j-((high>>20)&0x7ff)
if i>49 # 3rd iteration need, 151 bits acc
t = r # will cover all possible cases
w = fn*pio2_3
r = t-w
w = fn*pio2_3t-((t-r)-w)
y1 = r-w
end
end
y2 = (r-y1)-w
return Int(n), y1, y2
end

# constants and functions for large (>2.0^20) inputs to rem_pio2_kernel
const INV2PI = UInt64[
0x28be_60db_9391_054a,
0x7f09_d5f4_7d4d_3770,
0x36d8_a566_4f10_e410,
0x7f94_58ea_f7ae_f158,
0x6dc9_1b8e_9093_74b8,
0x0192_4bba_8274_6487,
0x3f87_7ac7_2c4a_69cf,
0xba20_8d7d_4bae_d121,
0x3a67_1c09_ad17_df90,
0x4e64_758e_60d4_ce7d,
0x2721_17e2_ef7e_4a0e,
0xc7fe_25ff_f781_6603,
0xfbcb_c462_d682_9b47,
0xdb4d_9fb3_c9f2_c26d,
0xd3d1_8fd9_a797_fa8b,
0x5d49_eeb1_faf9_7c5e,
0xcf41_ce7d_e294_a4ba,
0x9afe_d7ec_47e3_5742,
0x1580_cc11_bf1e_daea]


"""
fromfraction(f::Int128)

Computes a tuple of values `(y1,y2)` such that
y1 + y2 == f / 2^128
and the significand of `y1` has 27 trailing zeros.
"""
function fromfraction(f::Int128)
if f == 0
return (0.0,0.0)
end

# 1. get leading term truncated to 26 bits
s = ((f < 0) % UInt64) << 63 # sign bit
x = abs(f) % UInt128 # magnitude
n1 = 128-leading_zeros(x) # ndigits0z(x,2)
m1 = ((x >> (n1-26)) % UInt64) << 27
d1 = ((n1-128+1021) % UInt64) << 52
z1 = reinterpret(Float64, s | (d1 + m1))

# 2. compute remaining term
x2 = (x - (UInt128(m1) << (n1-53)))
if x2 == 0
return (z1, 0.0)
end
n2 = 128-leading_zeros(x2)
m2 = (x2 >> (n2-53)) % UInt64
d2 = ((n2-128+1021) % UInt64) << 52
z2 = reinterpret(Float64, s | (d2 + m2))
return (z1,z2)
end

function paynehanek(x::Float64)
# 1. Convert to form
#
# x = X * 2^k,
#
# where 2^(n-1) <= X < 2^n is an n-bit integer (n = 53, k = exponent(x)-52 )

u = reinterpret(UInt64, x)
X = (u & significand_mask(Float64)) | (one(UInt64) << significand_bits(Float64))
k = Int((u & exponent_mask(Float64)) >> significand_bits(Float64)) - exponent_bias(Float64) - significand_bits(Float64)

# 2. Let α = 1/2π, then:
#
# α*x mod 1 ≡ [(α*2^k mod 1)*X] mod 1
#
# so we can ignore the first k bits of α. Extract the next 3 64-bit parts of α.
#
# i.e. equivalent to
# setprecision(BigFloat,4096)
# α = 1/(2*big(pi))
# A = mod(ldexp(α,k), 1)
# z1 = ldexp(A,64)
# a1 = trunc(UInt64, z1)
# z2 = ldexp(z1-a1, 64)
# a2 = trunc(UInt64, z2)
# z3 = ldexp(z2-a2, 64)
# a3 = trunc(UInt64, z3)

# idx, shift = divrem(k, 64), but divrem is slower
idx = k >> 6
shift = k - (idx << 6)
if shift == 0
a1 = INV2PI[idx+1]
a2 = INV2PI[idx+2]
a3 = INV2PI[idx+3]
else
# use shifts to extract the relevant 64 bit window
a1 = (idx < 0 ? zero(UInt64) : INV2PI[idx+1] << shift) | (INV2PI[idx+2] >> (64 - shift))
a2 = (INV2PI[idx+2] << shift) | (INV2PI[idx+3] >> (64 - shift))
a3 = (INV2PI[idx+3] << shift) | (INV2PI[idx+4] >> (64 - shift))
end

# 3. Perform the multiplication:
#
# X. 0 0 0
# × 0. a1 a2 a3
# ==============
# _. w w _
#
# (i.e. ignoring integer and lowest bit parts of result)

w1 = UInt128(X*a1) << 64 # overflow becomes integer
w2 = widemul(X,a2)
w3 = widemul(X,a3) >> 64
w = w1 + w2 + w3 # quotient fraction after division by 2π

# adjust for sign of x
w = flipsign(w,x)

# 4. convert to quadrant, quotient fraction after division by π/2:
q = (((w>>125)%Int +1)>>1) # nearest quadrant
f = (w<<2) % Int128 # fraction part of quotient after division by π/2, taking values on [-0.5,0.5)

# 5. convert quotient fraction to split precision Float64
z_hi,z_lo = fromfraction(f)

# 6. multiply by π/2
pio2_hi = 1.5707963407039642
pio2_lo = -1.3909067614167116e-8

y_hi = (z_hi+z_lo)*(pio2_hi+pio2_lo)
y_lo = (((z_hi*pio2_hi - y_hi) + z_hi*pio2_lo) + z_lo*pio2_hi) + z_lo*pio2_lo
return q, y_hi, y_lo
end


"""
highword(x)

returns the high word of x as a UInt32.
"""
@inline highword(x::UInt64) = unsafe_trunc(UInt32,x >> 32)
@inline highword(x::Float64) = unsafe_trunc(UInt32,highword(reinterpret(UInt64, x)))

"""
rem_pio2(x::Float64)

returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
such that k mod 3 == K mod 3 where K*π/2 = x -rem.
"""

function rem_pio2(x::Float64)
xh = highword(x)
xhp = xh & 0x7fffffff # positive part of highword

if xhp <= 0x3fe921fb #|x| ~<= pi/4, no need for reduction, just return input
return Int(0), x, 0.0
end
rem_pio2_kernel(x, xh, xhp)
end

"""
rem_pio2_kernel(x, xh, xhp)

returns the remainder of x modulo π/2 as a TwicePrecision number, along with a k
such that k mod 3 == K mod 3 where K*π/2 = x -rem.
"""
function rem_pio2_kernel(x::Float64)
# rem_pio2_kernel essentially computes x mod pi/2 (ie within a quarter circle)
# and returns the result as
# y between + and - pi/4 (for maximal accuracy (as the sign bit is exploited)), and
# n, where n specifies the integer part of the division, or, at any rate,
# in which quadrant we are.
# The invariant fulfilled by the returned values seems to be
# x = y + n*pi/2 (where y = y1+y2 is a double-double and y2 is the "tail" of y).
# Note: for very large x (thus n), the invariant might hold only modulo 2pi
# (in other words, n might be off by a multiple of 4, or a multiple of 100)

xh = highword(x)
xhp = xh & 0x7fffffff # positive part of highword
rem_pio2_kernel(x, xh, xhp)
end

function rem_pio2_kernel(x::Float64, xh, xhp)
if xhp <= 0x400f6a7a #|x| ~<= 5pi/4, use Cody Waite with two constants
if (xhp & 0xfffff) == 0x921fb # |x| ~= pi/2 or 2pi/2, use precise Cody Waite scheme
return cody_waite_ext_pio2(x, xhp)
end
if xhp <= 0x4002d97c # |x| ~<= 3pi/4
if x > 0
return cody_waite_2c_pio2(x, 1.0, 1)
else
return cody_waite_2c_pio2(x, -1.0, -1)
end
else
if x > 0
return cody_waite_2c_pio2(x, 2.0, 2)
else
return cody_waite_2c_pio2(x, -2.0, -2)
end
end
end

if xhp <= 0x401c463b # |x| ~<= 9pi/4, use Cody Waite with two constants
if (xhp <= 0x4015fdbc) # |x| ~<= 7pi/4
if (xhp == 0x4012d97c) # |x| ~= 3pi/2, use precise Cody Waite scheme
return cody_waite_ext_pio2(x, xhp)
end
if x > 0
return cody_waite_2c_pio2(x, 3.0, 3)
else
return cody_waite_2c_pio2(x, -3.0, -3)
end
else
if xhp == 0x401921fb # |x| ~= 4pi/2, use precise Cody Waite scheme
return cody_waite_ext_pio2(x, xhp)
end
if x > 0
return cody_waite_2c_pio2(x, 4.0, 4)
else
return cody_waite_2c_pio2(x, -4.0, -4)
end
end
end
if xhp<0x413921fb # |x| ~< 2^20*pi/2, use precise Cody Waite scheme
return cody_waite_ext_pio2(x, xhp)
end

# if |x| >= 2^20*pi/2 switch to Payne Hanek
return paynehanek(x)
end

# multiples of pi/2, as double-double (ie with "tail")
const pi1o2_h = 1.5707963267948966 # convert(Float64, pi * BigFloat(1/2))
const pi1o2_l = 6.123233995736766e-17 # convert(Float64, pi * BigFloat(1/2) - pi1o2_h)
Expand Down Expand Up @@ -1230,6 +958,7 @@ include(joinpath("special", "exp.jl"))
include(joinpath("special", "exp10.jl"))
include(joinpath("special", "trig.jl"))
include(joinpath("special", "gamma.jl"))
include(joinpath("special", "rem_pio2.jl"))

module JuliaLibm
include(joinpath("special", "log.jl"))
Expand Down
Loading