Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Float64sr via Double64? #59

Closed
milankl opened this issue Nov 16, 2022 · 9 comments · Fixed by #61
Closed

Float64sr via Double64? #59

milankl opened this issue Nov 16, 2022 · 9 comments · Fixed by #61
Labels
enhancement New feature or request
Milestone

Comments

@milankl
Copy link
Owner

milankl commented Nov 16, 2022

I would start with Double64s which are basically a+b with a,b Float64s and b about eps times smaller than a. So b is just the precision that cannot be represented with a and so with double precision arithmetic a+b=a. For our purposes that means that one would only need to decide based on b, whether to not round a or to round it up/down. That might be relatively easy to implement, but the devil is always in the detail.

@avleenk2312

@milankl milankl added the enhancement New feature or request label Nov 25, 2022
@milankl milankl added this to the v0.7 milestone Nov 25, 2022
@milankl
Copy link
Owner Author

milankl commented Nov 28, 2022

@avleenk2312 Some background on doubles. A double x is x = a+b whereby a,b <: AbstractFloat (i.e. our case Float64). So just the addition of two double precision numbers. They are non-overlapping, i.e. b is in (-u/2,u/2) with u being the ulp, unit in the last place, i.e. eps(a) in Julia. So adding b to a is at most half-way smaller/larger than the prev/next representable double precision number

|----<----|---->----|
a0        a         a2

a0 is the next smaller Float64, a2 the next larger. b is at most the distance between a and > or < and a, so a+b is somewhere in <...>. I'm not sure about the edge cases where a+b=< or a+b=> (one might be included the other one not, but that's not important for now). So b/u is in [-0.5,0.5]:

julia> using DoubleFloats, UnicodePlots
julia> A = rand(Double64,100_000);
julia> A_lows_eps = [a.lo/eps(a.hi) for a in A];
julia> histogram(A_lows_eps)
                ┌                                        ┐ 
   [-0.5, -0.4) ┤████████████████████████████████▍ 9964    
   [-0.4, -0.3) ┤████████████████████████████████▍ 9943    
   [-0.3, -0.2) ┤███████████████████████████████▋ 9759     
   [-0.2, -0.1) ┤████████████████████████████████▋ 10044   
   [-0.1, -0.0) ┤████████████████████████████████▌ 10001   
   [-0.0,  0.1) ┤████████████████████████████████▉ 10146   
   [ 0.1,  0.2) ┤█████████████████████████████████  10168  
   [ 0.2,  0.3) ┤████████████████████████████████▎ 9921    
   [ 0.3,  0.4) ┤████████████████████████████████▉ 10143   
   [ 0.4,  0.5) ┤████████████████████████████████▎ 9911    
                └                                        ┘ 
                                 Frequency   

This would give you exactly the probability at which to round down, e.g. if b/u = 0.5 there should be a 50% chance that round(a+b) = nextfloat(a) and so on. However, as done with Float32sr and Float16sr, it is more efficient to add a random number (in some way) and then deterministically round. I propose to do something like

function Float64_stochastic_round(x::Double64)
    a = x.hi     # the more significant float64 in x
    b = x.lo     # the less significant float64 in x

    # create [1,2)-1.5 = [-0.5,0.5)
    r = reinterpret(Float64,reinterpret(UInt64,one(Float64)) + (rand(UInt64) >> 12)) - 1.5
    return a + (b+eps(a)*r)    # (b+u*r) first as a+b is rounded to a
end

This uses one rand(UInt64) generation, 4 Float64 operations and eps, which is probably good enough for now. Quick check that it does roughly the right thing

julia> a = rand()
julia> d = Double64(a,eps(a)/2)    # something that's exactly half-way between two Float64s
julia> sum([Float64_stochastic_round(d) > d.hi for _ in 1:100_000])
49980    # ~50%

julia> d = Double64(a,eps(a)/4)    # 1/4 between two Float64s
julia> sum([Float64_stochastic_round(d) > d.hi for _ in 1:100_000])
25022    # ~25%

julia> d = Double64(a,eps(a)/8)    # 1/8 between two Float64s
julia> sum([Float64_stochastic_round(d) > d.hi for _ in 1:100_000])
12609    # ~12.5%

Let me know if you would like to turn this into a pull request, or if you want me to turn this into a pull request. Just wanted to throw in some ideas I just had.

@JeffreySarnoff
Copy link

nice approach .. there may or may not be a specific implementation detail that could be slightly modified to provide values more nearly matchy-matchy with what one expects from stochastic rounding of Float64s. I will dig deeper tonight,

@JeffreySarnoff
Copy link

The sign of the high part may differ from the sign of the low part, giving four cases. As long as your approach handles each case appropriately, I have no concern. I have not verified this, as you may already know.

@milankl
Copy link
Owner Author

milankl commented Dec 5, 2022

Obviously needs some proper tests but these look good already:

julia> a = rand()
julia> d = Double64(a,eps(a)/4)    # +hi +lo case, should be 25% chance of round up
julia> sum([Float64_stochastic_round(d) > d.hi for _ in 1:100_000])
25042

julia> d = Double64(a,-eps(a)/4)    # +hi -lo case, should be 25% chance of round down
julia> sum([Float64_stochastic_round(d) < d.hi for _ in 1:100_000])
24817

julia> d = Double64(-a,-eps(a)/4)    # -hi -lo case, should be 25% chance of round down
julia> sum([Float64_stochastic_round(d) < d.hi for _ in 1:100_000])
25214

julia> d = Double64(-a,eps(a)/4)    # -hi +lo case, should be 25% cance of round up
julia> sum([Float64_stochastic_round(d) > d.hi for _ in 1:100_000])
24767

@JeffreySarnoff
Copy link

JeffreySarnoff commented Dec 5, 2022 via email

@JeffreySarnoff
Copy link

another thought -- the rand() function used here should be the fastest reasonably good one available in our ecosystem.

@avleenk2312
Copy link
Contributor

Thanks a lot, @milankl for considering my request for Float64sr. Could you please turn this into a pull request? I will try to use it in my codes.

@avleenk2312
Copy link
Contributor

avleenk2312 commented Apr 27, 2023

Hi Milan, sorry for the significant delay in my response to you. I went over Float64sr.jl in src. I think Float64 needs to be replaced by Double64 in line 95 of float64sr.jl. Double64 works for all those functions. For reference line 95 is:
julia Base.$func(a::Float64sr) = Float64_stochastic_round($func(Float64(a)))
I suggest the following:
julia Base.$func(a::Float64sr) = Float64_stochastic_round($func(Double64(a)))
Also, I am running tests for Float64sr.jl.

@milankl
Copy link
Owner Author

milankl commented Apr 28, 2023

(You can just copy a permalink in)

Base.$func(a::Float64sr) = Float64_stochastic_round($func(Float64(a)))

I agree. Double64 should be written here. Commit coming

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants