Skip to content
This repository has been archived by the owner on Mar 11, 2020. It is now read-only.

Sleef port #11

Merged
merged 27 commits into from
Sep 21, 2016
Merged

Sleef port #11

merged 27 commits into from
Sep 21, 2016

Conversation

musm
Copy link
Collaborator

@musm musm commented Aug 22, 2016

sleef port to Julia

Done

  • Port double precision functions
  • Use Horner. Pull constants out of functions and wrap in let blocks.
  • Wrap in sleef module
  • Rigorous tests
  • Fix bugs in trig functions in sleef (xsin(-0.0) = 0.0)
  • DoubleDouble.jl abstractions

Before merging

Todos

  • Use abstractions in the new Double arithmetic package https://github.com/musm/Doubles.jl
  • Even more robust tests (full ranges)
  • Better benchmarks
  • Coefficient generation (check current range reduction at intermediate argument values)
  • Double check everything especially the tests
  • Include software fma function
  • Range reduction for large arguments (payne-hayek)

@codecov-io
Copy link

codecov-io commented Aug 22, 2016

Current coverage is 76.29% (diff: 94.42%)

Merging #11 into master will decrease coverage by 22.97%

@@             master        #11   diff @@
==========================================
  Files             5         13     +8   
  Lines           137        675   +538   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            136        515   +379   
- Misses            1        160   +159   
  Partials          0          0          

Powered by Codecov. Last update 8ef3dca...1edc022

end

@inline function ldexpk(x::Float64, q::Int32)
m = q >> 31
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be consistent about 4-space indent

@musm musm changed the title began sleef port, first function added xexp began sleef port Aug 22, 2016
@musm musm changed the title began sleef port WIP sleef port Aug 22, 2016

@inline long_bits_to_double(i::Int64) = reinterpret(Float64, i)

@inline xfabs(x::Float64) = long_bits_to_double((0x7fffffffffffffff % Int64) & double_to_raw_long_bits(x))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just call abs.

@musm
Copy link
Collaborator Author

musm commented Aug 22, 2016

Hi simon do you any chance know why ci are failing due to merge conflicts? What should I do in order to get them online for this PR ?

@ararslan
Copy link
Member

The way I like to do it is, from the command line, enter

git checkout master
git fetch origin  # or something else if you've renamed that remote
git rebase origin/master
git checkout sleef_port
git merge master

Git will then tell you that there are conflicts. You can see the affected files with git status. For each of those files, open them in your editor and look for lines beginning with <<<<<<<. More information can be found in this GitHub document.

# todo define methods and remove many of the function below to use multiple dispatch

immutable Double2
x::Float64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines should be indented. We could alternatively opt to use @simonbyrne's DoubleDouble.jl here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably don't want to add it as a dependency, but feel free to copy what you need.

@tkelman
Copy link

tkelman commented Aug 22, 2016

Don't do merge commits. Do git rebase -i origin/master.

@musm
Copy link
Collaborator Author

musm commented Aug 22, 2016

You mean like this?

git checkout master
git fetch origin  # or something else if you've renamed that remote
git rebase -i origin/master
git checkout sleef_port
git rebase -i master

@test xexp(-746.0) == 0

x = linspace(708.4, 709.7, 25)
@test_approx_eq xexp.(x) exp.(x)
Copy link
Member

@ararslan ararslan Aug 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally up to you but what I would do personally is format it as

let x = linspace(708.4, 709.7, 25)
    @test xexp.(x)  exp.(x)
end

That has the benefit of scoping x so you aren't overwriting it for each test, though it doesn't really matter much. Just a stylistic preference.

That said, we really should be using , as Base has switched away from @test_approx_eq in favor of it.

Copy link
Collaborator Author

@musm musm Aug 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the docs and it said test_approx_eq is more stringent than ≈ and takes care of NaN ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either that's out of date or I'm wrong. @StefanKarpinski lead the effort to switch things to , perhaps he can shed some light here.

Copy link
Collaborator Author

@musm musm Aug 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to ping Stefan it's right here:
http://docs.julialang.org/en/latest/stdlib/test/#other-test-macros
unless it's out of date

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source code appears to claim @test_approx_eq to be "legacy," which is why I wanted Stefan's input.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok sry, thx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would use .

@ararslan
Copy link
Member

ararslan commented Aug 22, 2016

@tkelman Ah yeah, good call. I usually squash out merge commits anyway.

@musm Not quite, but close; no checkouts are necessary if you're already on your sleef_port branch.

git fetch origin
git rebase -i origin/master

@musm musm changed the title WIP sleef port [WIP] sleef port Aug 22, 2016
@musm
Copy link
Collaborator Author

musm commented Aug 23, 2016

All double prec function ported save for the intrinsics xsqrt.

@alyst
Copy link

alyst commented Aug 23, 2016

@musm It looks like for exp, exp10, exp2 (some common subroutine?), log, log10 there's unnecessary memory allocation happening for every call. It typically signifies type inference problems: Julia compiler cannot determine the type of some expression, so it assumes it is ::Any, which requires heap allocation to store the result. You can check the inferred types with @code_warntype <func>(<example args>) macro invocation (anything that is highlighted yellow requires attention).

@musm
Copy link
Collaborator Author

musm commented Aug 23, 2016

Thought I'd share some updated benchmarks with the latest changes.
IMO they look very good, most of them no worse than 5x base and most within1-3x

rc3

cos benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.3135594321329447             
  gctime:           1.5830813905345738             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


cosh benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             2.2493910857587225             
  gctime:           0.6999963120536651             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


exp benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             2.9266741183149736             
  gctime:           1.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


exp10 benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             0.9258054341064738             
  gctime:           0.9624319929064326             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


exp2 benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             6.573427765751141              
  gctime:           0.9991216268804411             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


expm1 benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             3.8094151963966514             
  gctime:           1.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


log benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.3722646085919237             
  gctime:           1.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


log10 benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             3.6302685517850004             
  gctime:           0.0                            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


log1p benchmark                                    
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             5.2261049368600885             
  gctime:           0.8391970470358386             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


sin benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.2597544538971075             
  gctime:           1.38391252849377               
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


sinh benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.8508245490268889             
  gctime:           0.6654004161239986             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


tan benchmark                                      
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.2930481323477898             
  gctime:           0.6552033724744087             
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          


tanh benchmark                                     
median ratio Libm/Base                             
BenchmarkTools.TrialRatio:                         
  time:             1.5733616334334983             
  gctime:           0.20365685183206117            
  memory:           1.0                            
  allocs:           1.0                            
  time tolerance:   5.00%                          
  memory tolerance: 1.00%                          

updated 8/24

@simonbyrne
Copy link
Member

fyi: In some areas sleef wasn't doing muladd even when it look like it should, I added these myself.

Are these ones noted? Sometimes these are intentional to play tricks with floating point precision

@musm
Copy link
Collaborator Author

musm commented Aug 23, 2016

good point, I've noted them in the source it's only the atan2k helper function which is called by xatan2 xasin and xacos and aso xatan which do not use mla

@@ -8,6 +8,9 @@ export xatan2, xasin, xacos, xatan, xsin, xcos, xsincos, xtan, xpow, xsinh, xcos
# higher accuracy functions
export xatan2_u1, xasin_u1, xacos_u1, xatan_u1, xsin_u1, xcos_u1, xsincos_u1, xtan_u1, xcbrt_u1, xlog_u1

# alias for supported floating point types
typealias FTypes Union{Float32,Float64}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could just use LinAlg.BlasReal, it's the same thing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's better to leave as is since this package doesn't deal with Blas and for clarity.

@musm
Copy link
Collaborator Author

musm commented Sep 21, 2016

Well that was painful, but now it's done. I'd also like to modify some of the commit messages that have already been committed to this repo? @ararslan , can that be done from my fork or do I have to clone this upstream to do that.

@musm musm changed the title [WIP] sleef port Sleef port Sep 21, 2016
@tkelman
Copy link

tkelman commented Sep 21, 2016

That's a much nicer history, thanks! Anything that has already been merged would require rebasing and force pushing over master, or I suppose you could start a new branch with revised history and either replace or merge it into master. Usually force pushing over master is a no-no on anything anyone else has cloned, since this package hasn't been registered yet maybe it would be acceptable. Or you could annotate any existing commits with explanatory commit comments on github, which would be at least trackable in the future.

@musm
Copy link
Collaborator Author

musm commented Sep 21, 2016

I see, I'll open another PR once I merge this bad boy, just waiting for the green light!

@musm musm merged commit eb72524 into JuliaMath:master Sep 21, 2016
@ararslan
Copy link
Member

Thanks for your great work here, @musm!

@musm
Copy link
Collaborator Author

musm commented Sep 22, 2016

big thanks to @shibatch @alyst @simonbyrne

@musm musm deleted the sleef_port branch September 22, 2016 01:14
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.