Sleef port #11

musm · 2016-08-22T09:30:45Z

sleef port to Julia

Done

Port double precision functions
Use Horner. Pull constants out of functions and wrap in let blocks.
Wrap in sleef module
Rigorous tests
Fix bugs in trig functions in sleef (xsin(-0.0) = 0.0)
DoubleDouble.jl abstractions

Before merging

Create a new Double arithmetic package (see: https://github.com/musm/Doubles.jl)
- Upon further thought it probably makes more sense to keep the double funs private to Sleef.jl
Single precision function, generalize double precision ports using multiple dispatch (wip)
- Blocking issue performance of captured variables in closures JuliaLang/julia#15276 (workaround identified!)

Todos

Use abstractions in the new Double arithmetic package https://github.com/musm/Doubles.jl
Even more robust tests (full ranges)
Better benchmarks
Coefficient generation (check current range reduction at intermediate argument values)
Double check everything especially the tests
Include software fma function
Range reduction for large arguments (payne-hayek)

codecov-io · 2016-08-22T09:36:15Z

Current coverage is 76.29% (diff: 94.42%)

Merging #11 into master will decrease coverage by 22.97%

@@             master        #11   diff @@
==========================================
  Files             5         13     +8   
  Lines           137        675   +538   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            136        515   +379   
- Misses            1        160   +159   
  Partials          0          0

Powered by Codecov. Last update 8ef3dca...1edc022

tkelman · 2016-08-22T12:37:44Z

src/Libm.jl

+end
+
+@inline function ldexpk(x::Float64, q::Int32)
+  m = q >> 31


be consistent about 4-space indent

simonbyrne · 2016-08-22T15:43:34Z

src/Libm.jl

+
+@inline long_bits_to_double(i::Int64) = reinterpret(Float64, i)
+
+@inline xfabs(x::Float64) = long_bits_to_double((0x7fffffffffffffff % Int64) & double_to_raw_long_bits(x))


Just call abs.

musm · 2016-08-22T19:29:32Z

Hi simon do you any chance know why ci are failing due to merge conflicts? What should I do in order to get them online for this PR ?

ararslan · 2016-08-22T19:33:52Z

The way I like to do it is, from the command line, enter

git checkout master
git fetch origin  # or something else if you've renamed that remote
git rebase origin/master
git checkout sleef_port
git merge master

Git will then tell you that there are conflicts. You can see the affected files with git status. For each of those files, open them in your editor and look for lines beginning with <<<<<<<. More information can be found in this GitHub document.

ararslan · 2016-08-22T19:36:22Z

src/utils.jl

+# todo define methods and remove many of the function below to use multiple dispatch
+
+immutable Double2
+x::Float64


These lines should be indented. We could alternatively opt to use @simonbyrne's DoubleDouble.jl here.

Probably don't want to add it as a dependency, but feel free to copy what you need.

tkelman · 2016-08-22T19:38:51Z

Don't do merge commits. Do git rebase -i origin/master.

musm · 2016-08-22T19:40:09Z

You mean like this?

git checkout master
git fetch origin  # or something else if you've renamed that remote
git rebase -i origin/master
git checkout sleef_port
git rebase -i master

ararslan · 2016-08-22T19:40:26Z

test/xexp.jl

+    @test xexp(-746.0) == 0
+
+    x = linspace(708.4, 709.7, 25)
+    @test_approx_eq xexp.(x) exp.(x)


Totally up to you but what I would do personally is format it as

let x = linspace(708.4, 709.7, 25) @test xexp.(x) ≈ exp.(x) end

That has the benefit of scoping x so you aren't overwriting it for each test, though it doesn't really matter much. Just a stylistic preference.

That said, we really should be using ≈, as Base has switched away from @test_approx_eq in favor of it.

I looked at the docs and it said test_approx_eq is more stringent than ≈ and takes care of NaN ?

Either that's out of date or I'm wrong. @StefanKarpinski lead the effort to switch things to ≈, perhaps he can shed some light here.

No need to ping Stefan it's right here:
http://docs.julialang.org/en/latest/stdlib/test/#other-test-macros
unless it's out of date

The source code appears to claim @test_approx_eq to be "legacy," which is why I wanted Stefan's input.

ah ok sry, thx

No problem. :)

Yeah, I would use ≈.

ararslan · 2016-08-22T19:43:16Z

@tkelman Ah yeah, good call. I usually squash out merge commits anyway.

@musm Not quite, but close; no checkouts are necessary if you're already on your sleef_port branch.

git fetch origin
git rebase -i origin/master

musm · 2016-08-23T06:52:26Z

All double prec function ported save for the intrinsics xsqrt.

alyst · 2016-08-23T08:29:22Z

@musm It looks like for exp, exp10, exp2 (some common subroutine?), log, log10 there's unnecessary memory allocation happening for every call. It typically signifies type inference problems: Julia compiler cannot determine the type of some expression, so it assumes it is ::Any, which requires heap allocation to store the result. You can check the inferred types with @code_warntype <func>(<example args>) macro invocation (anything that is highlighted yellow requires attention).

musm · 2016-08-23T14:42:26Z

Thought I'd share some updated benchmarks with the latest changes.
IMO they look very good, most of them no worse than 5x base and most within1-3x

rc3

cos benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.3135594321329447             
  gctime:           1.5830813905345738             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


cosh benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             2.2493910857587225             
  gctime:           0.6999963120536651             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


exp benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             2.9266741183149736             
  gctime:           1.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


exp10 benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             0.9258054341064738             
  gctime:           0.9624319929064326             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


exp2 benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             6.573427765751141              
  gctime:           0.9991216268804411             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


expm1 benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             3.8094151963966514             
  gctime:           1.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


log benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.3722646085919237             
  gctime:           1.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


log10 benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             3.6302685517850004             
  gctime:           0.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


log1p benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             5.2261049368600885             
  gctime:           0.8391970470358386             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


sin benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.2597544538971075             
  gctime:           1.38391252849377               
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


sinh benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.8508245490268889             
  gctime:           0.6654004161239986             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


tan benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.2930481323477898             
  gctime:           0.6552033724744087             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


tanh benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.5733616334334983             
  gctime:           0.20365685183206117            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%

updated 8/24

simonbyrne · 2016-08-23T14:46:43Z

fyi: In some areas sleef wasn't doing muladd even when it look like it should, I added these myself.

Are these ones noted? Sometimes these are intentional to play tricks with floating point precision

musm · 2016-08-23T15:24:31Z

good point, I've noted them in the source it's only the atan2k helper function which is called by xatan2 xasin and xacos and aso xatan which do not use mla

ararslan · 2016-08-23T22:00:25Z

src/Libm.jl

@@ -8,6 +8,9 @@ export xatan2, xasin, xacos, xatan, xsin, xcos, xsincos, xtan, xpow, xsinh, xcos
 # higher accuracy functions
 export xatan2_u1, xasin_u1, xacos_u1, xatan_u1, xsin_u1, xcos_u1, xsincos_u1, xtan_u1, xcbrt_u1, xlog_u1

+# alias for supported floating point types
+typealias FTypes Union{Float32,Float64}


Could just use LinAlg.BlasReal, it's the same thing.

maybe it's better to leave as is since this package doesn't deal with Blas and for clarity.

musm · 2016-09-21T22:40:53Z

Well that was painful, but now it's done. I'd also like to modify some of the commit messages that have already been committed to this repo? @ararslan , can that be done from my fork or do I have to clone this upstream to do that.

tkelman · 2016-09-21T22:51:17Z

That's a much nicer history, thanks! Anything that has already been merged would require rebasing and force pushing over master, or I suppose you could start a new branch with revised history and either replace or merge it into master. Usually force pushing over master is a no-no on anything anyone else has cloned, since this package hasn't been registered yet maybe it would be acceptable. Or you could annotate any existing commits with explanatory commit comments on github, which would be at least trackable in the future.

musm · 2016-09-21T22:53:06Z

I see, I'll open another PR once I merge this bad boy, just waiting for the green light!

ararslan · 2016-09-21T23:59:27Z

Thanks for your great work here, @musm!

musm · 2016-09-22T00:32:52Z

big thanks to @shibatch @alyst @simonbyrne

tkelman reviewed Aug 22, 2016
View reviewed changes

src/Libm.jl

end

@inline function ldexpk(x::Float64, q::Int32)

m = q >> 31

Copy link

tkelman Aug 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be consistent about 4-space indent

musm changed the title ~~began sleef port, first function added xexp~~ began sleef port Aug 22, 2016

musm changed the title ~~began sleef port~~ WIP sleef port Aug 22, 2016

simonbyrne reviewed Aug 22, 2016
View reviewed changes

ararslan reviewed Aug 22, 2016
View reviewed changes

musm changed the title ~~WIP sleef port~~ [WIP] sleef port Aug 22, 2016

ararslan reviewed Aug 23, 2016
View reviewed changes

musm added 21 commits September 21, 2016 18:38

Better codegen for ldexpk and force inlining and clean them up

d6464d1

Finish all denormal/nonnumber tests

09c7f33

Remove unecessary conversion

68c13e9

Improve benchmarking suite

c4d5801

Begin adding accuracy tests

26272e8

Various fix ups

8cc81be

Improve xrint perf, rename sqrt_fast, and add more accuracy tests

ec2a70d

Simplify and add all accuracy tests also improve benchmarks

63e0175

Rename constants and generalize ldexp function

43cf877

Force inlining for major per gains for Double types

b5a003a

Generalize ldexp and helper

bce6a87

Generalize xilogb and fixed sleef bug to handle NaN and add test

1dba72e

Refactor, clean up files, add comments

74ab111

Generalize log function clean up double.jl and misc refactoring

47509d9

Cleaned up double functions to use dispatch plus various optims

a7bcb95

Fix sign bug in sin function for +-0.0

e9da71d

Change to flipsign and rename sign to _sign to denote it's unsafe

92bb936

Add flags for fma in double.jl and change to horner macro

dca4a57

Added FMA for every double fun, plus minor optims here and there

c2944eb

Workaround for bug in let blocks and generalize sin and cos

3f10767

Generalize tan function to Float32, plus fix sin and cos for Float32

59b8e78

musm changed the title ~~[WIP] sleef port~~ Sleef port Sep 21, 2016

Added Float32 xexp function and add README

25a8828

musm merged commit eb72524 into JuliaMath:master Sep 21, 2016

musm deleted the sleef_port branch September 22, 2016 01:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sleef port #11

Sleef port #11

musm commented Aug 22, 2016 •

edited

Loading

codecov-io commented Aug 22, 2016 •

edited

Loading

tkelman Aug 22, 2016

simonbyrne Aug 22, 2016

musm commented Aug 22, 2016

ararslan commented Aug 22, 2016

ararslan Aug 22, 2016

simonbyrne Aug 22, 2016

tkelman commented Aug 22, 2016

musm commented Aug 22, 2016

ararslan Aug 22, 2016 •

edited

Loading

musm Aug 22, 2016 •

edited

Loading

ararslan Aug 22, 2016

musm Aug 22, 2016 •

edited

Loading

ararslan Aug 22, 2016

musm Aug 22, 2016

ararslan Aug 22, 2016

StefanKarpinski Aug 23, 2016

ararslan commented Aug 22, 2016 •

edited

Loading

musm commented Aug 23, 2016 •

edited

Loading

alyst commented Aug 23, 2016

musm commented Aug 23, 2016 •

edited

Loading

simonbyrne commented Aug 23, 2016

musm commented Aug 23, 2016

ararslan Aug 23, 2016

musm Aug 23, 2016

musm commented Sep 21, 2016 •

edited

Loading

tkelman commented Sep 21, 2016

musm commented Sep 21, 2016

ararslan commented Sep 21, 2016

musm commented Sep 22, 2016


		@inline long_bits_to_double(i::Int64) = reinterpret(Float64, i)

		@inline xfabs(x::Float64) = long_bits_to_double((0x7fffffffffffffff % Int64) & double_to_raw_long_bits(x))

Sleef port #11

Sleef port #11

Conversation

musm commented Aug 22, 2016 • edited Loading

codecov-io commented Aug 22, 2016 • edited Loading

Current coverage is 76.29% (diff: 94.42%)

Choose a reason for hiding this comment

Choose a reason for hiding this comment

musm commented Aug 22, 2016

ararslan commented Aug 22, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tkelman commented Aug 22, 2016

musm commented Aug 22, 2016

ararslan Aug 22, 2016 • edited Loading

Choose a reason for hiding this comment

musm Aug 22, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

musm Aug 22, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ararslan commented Aug 22, 2016 • edited Loading

musm commented Aug 23, 2016 • edited Loading

alyst commented Aug 23, 2016

musm commented Aug 23, 2016 • edited Loading

simonbyrne commented Aug 23, 2016

musm commented Aug 23, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

musm commented Sep 21, 2016 • edited Loading

tkelman commented Sep 21, 2016

musm commented Sep 21, 2016

ararslan commented Sep 21, 2016

musm commented Sep 22, 2016

musm commented Aug 22, 2016 •

edited

Loading

codecov-io commented Aug 22, 2016 •

edited

Loading

ararslan Aug 22, 2016 •

edited

Loading

musm Aug 22, 2016 •

edited

Loading

musm Aug 22, 2016 •

edited

Loading

ararslan commented Aug 22, 2016 •

edited

Loading

musm commented Aug 23, 2016 •

edited

Loading

musm commented Aug 23, 2016 •

edited

Loading

musm commented Sep 21, 2016 •

edited

Loading