Least significant bit of rand() is always zero #16344

Closed
wwywong opened this issue May 13, 2016 · 12 comments
Labels
randomness (Random number generation and the Random stdlib)

Comments

@wwywong

wwywong commented May 13, 2016

Should this always be the case? For a uniformly distributed random number in [0,1) I would expect the least significant bits to show some skew. But is it right that the final bit is always zero?

julia> count = 0
       for i in 1:10000
         if bits(rand())[end] =='1'
           count+=1
         end
       end; count
0

julia> count = 0
       for i in 1:10000
         if bits(rand())[end-1] =='1'
           count+=1
         end
       end; count
2473

julia> count = 0
       for i in 1:10000
         if bits(rand())[end-2] =='1'
           count+=1
         end
       end; count
3727

julia> count = 0
       for i in 1:10000
         if bits(rand())[end-3] =='1'
           count+=1
         end
       end; count
4280

julia> count = 0
       for i in 1:10000
         if bits(rand())[end-4] =='1'
           count+=1
         end
       end; count
4655
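
The five snippets above can be condensed into a single sweep (an equivalent reformulation, added for convenience; output omitted):

for n in 0:4
  c = 0
  for i in 1:10000
    c += bits(rand())[end-n] == '1'   # count 1s in bit end-n
  end
  println("bit end-", n, ": ", c, " of 10000")
end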

Version info:

julia> versioninfo()
Julia Version 0.4.5
Commit 2ac304d (2016-03-18 00:58 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
@andreasnoack
Member

If you count from the right, then the distribution of each bit is Bernoulli(p(n)) with p(n) = (2^(n-1)-1)/2^n and p(1) = 0. There is probably a good theoretical explanation for this, but I don't know it and can't come up with it right now. @simonbyrne Do you know why the bits have this distributional pattern?
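
For reference, working the formula out for n = 1..5 gives p = 0, 1/4, 3/8, 7/16, 15/32 = 0.0, 0.25, 0.375, 0.4375, 0.46875, which closely tracks the counts 0, 2473, 3727, 4280, 4655 out of 10000 reported above.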

@tkelman
Contributor

tkelman commented May 13, 2016

rand() is a bit of a lie. We do the naive but common thing of subtracting 1.0 from a random number uniformly distributed within [1.0, 2.0), where the representable floating point numbers are evenly spaced. Below 1.0 the representable floats are unevenly spaced, and I don't know of many implementations that are actually capable of generating all of them in a way that best approximates the continuous uniform distribution.

edit: See https://github.com/art4711/random-double/blob/master/rd.c for some fun
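
For concreteness, here is the [1.0, 2.0) trick done by hand at the bit level (an illustrative sketch, not the exact code path Julia takes; see the trace further down):

u = rand(UInt64) & 0x000fffffffffffff             # 52 random mantissa bits
x = reinterpret(Float64, 0x3ff0000000000000 | u)  # uniform in [1.0, 2.0)
x - 1.0                                           # uniform in [0.0, 1.0); last mantissa bit always 0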

@mbaz
Contributor

mbaz commented May 13, 2016

Fiddling with Matlab, I see that their rand does sometimes set the last bit to 1. Edit: It seems that the distribution of bit end-i in Julia is similar to the distribution of bit end-(i+1) in Matlab.

@wwywong
Author

wwywong commented May 13, 2016

@tkelman The blob you linked to, is that the actual algorithm used by Julia? Or is it, as you described, random in [1.0, 2.0) and then subtracting one? Because in the latter case I (almost) see how the probability distribution comes up.

The floating point representation of 2.0 is (let's do it in Float16 so it's less typing) 0100000000000000, and 1.0 is 0011110000000000 (1 sign bit, 5 exponent bits, and 10 mantissa bits). So to generate a number in [1.0, 2.0) we only need to generate a random 10-bit binary string.

Including the "hidden bit", we are looking at a binary string of 11 bits, the first being 1 and the following 10 random.

Now subtract from it 1.0, which corresponds to 1 0000000000. We are left with a string of 10 bits, but the leading few could be zero! This means we have to shift to the left to recover the hidden bit and decrease the exponent by the size of the shift. What is clear then is that

  1. We have to shift at least once, so the right-most bit is always 0.
  2. The number of times we shift corresponds to the number of left-most digits that are 0. A little more work gives you the distribution cited by @andreasnoack. For example, the probability that end-1 is 0 is the probability of "at least one leading zero" (1/2) plus the probability of "leading bit 1 but last random bit 0" (1/4), which gives 3/4.

If we really want to put that last bit to use, we can generate the float as if in a "65-bit float" and truncate. Which, now that I think about it, is basically the same as what is written in the rd.c code above. (And it also explains what is seen in Matlab.)
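
A worked Float16 instance of the shift described above (values chosen for illustration):

x = reinterpret(Float16, 0x3c00 | 0b0000000101)  # 1 + 5/1024, a draw from [1.0, 2.0)
y = x - Float16(1.0)                             # 5/1024 = 1.01 (binary) * 2^-8
reinterpret(UInt16, y)                           # 0x1d00: mantissa shifted left, 8 trailing zeros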

@tkelman
Contributor

tkelman commented May 13, 2016

The page I linked to is just some guy's fun little piece of C code; I don't think it's a commonly used algorithm. We could certainly port it from C to Julia and see how it compares in performance and distribution properties, though. Julia subtracts 1 (printloc below is a small helper that prints the source lines at the (file, line) location returned by @functionloc):

julia> printloc(fileline; before=0, after=0) = run(pipeline(`head -n$(fileline[2]+after) $(fileline[1])`, `tail -n$(1+before+after)`));

julia> printloc(@functionloc rand())
@inline rand() = rand(GLOBAL_RNG, CloseOpen)

julia> printloc(@functionloc rand(Base.Random.GLOBAL_RNG, Base.Random.CloseOpen))
@inline rand{I<:FloatInterval}(r::MersenneTwister, ::Type{I}) = (reserve_1(r); rand_inbounds(r, I))

julia> printloc(@functionloc Base.Random.rand_inbounds(Base.Random.GLOBAL_RNG, Base.Random.CloseOpen))
@inline rand_inbounds(r::MersenneTwister, ::Type{CloseOpen}) = rand_inbounds(r, Close1Open2) - 1.0

julia> printloc(@functionloc Base.Random.rand_inbounds(Base.Random.GLOBAL_RNG, Base.Random.Close1Open2))
@inline rand_inbounds(r::MersenneTwister, ::Type{Close1Open2}) = mt_pop!(r)

julia> printloc(@functionloc Base.Random.mt_pop!(Base.Random.GLOBAL_RNG))
@inline mt_pop!(r::MersenneTwister) = @inbounds return r.vals[r.idx+=1]

julia> printloc(@functionloc Base.Random.reserve_1(Base.Random.GLOBAL_RNG))
@inline reserve_1(r::MersenneTwister) = mt_empty(r) && gen_rand(r)

julia> printloc(@functionloc(Base.Random.gen_rand(Base.Random.GLOBAL_RNG)), before=1, after=2)
function gen_rand(r::MersenneTwister)
    dsfmt_fill_array_close1_open2!(r.state, pointer(r.vals), length(r.vals))
    mt_setfull!(r)
end

julia> printloc(@functionloc(Base.dSFMT.dsfmt_fill_array_close1_open2!(Base.Random.GLOBAL_RNG.state, pointer(Base.Random.GLOBAL_RNG.vals), length(Base.Random.GLOBAL_RNG.vals))), before=1, after=6)
function dsfmt_fill_array_close1_open2!(s::DSFMT_state, A::Ptr{Float64}, n::Int)
    @assert Csize_t(A) % 16 == 0 # the underlying C array must be 16-byte aligned
    @assert dsfmt_min_array_size <= n && iseven(n)
    ccall((:dsfmt_fill_array_close1_open2,:libdSFMT),
          Void,
          (Ptr{Void}, Ptr{Float64}, Int),
          s.val, A, n)
end
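
As the trace shows, the raw dSFMT output lives in [1.0, 2.0); if you want the un-subtracted value, you can draw it directly using the method shown above:

rand(Base.Random.GLOBAL_RNG, Base.Random.Close1Open2)  # uniform in [1.0, 2.0)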

@ivarne added the randomness (Random number generation and the Random stdlib) label May 13, 2016
@simonbyrne
Contributor

There are 4 options here:

  1. Generate uniformly on [1,2) and subtract 1 (what we currently do), which means that all numbers will have a 0 as the last bit due to the loss of precision. This is the easiest/fastest (you can just generate 52 random bits, bitwise-or them with the bit pattern of 1.0, then do a floating point subtraction), and is the only mode supported by dSFMT (it only generates 52 bits per iteration).
  2. Generate at the precision of 0.5, so numbers in [1/2,1) will have 0/1 as the last bit with equal probability, and those in [0,1/2) will always have 0 as the last bit. This is slightly trickier to do and will involve some performance penalty. This StackOverflow post (http://stackoverflow.com/q/35350699/392585) discusses this in the context of single precision, but the ideas should transfer. The simplest idea is to generate a uniform [1/2,1), then subtract 0.5 with probability 0.5, though this is obviously going to be terrible for branch prediction (one suggestion in a comment there is to always subtract, but bitmask the second argument to 0.0 with probability 1/2; see the sketch after this list). To do this we would need an RNG which can generate at least 53 bits per iteration.
  3. Generate at full precision for any value, so that all significands are equally likely for all exponents. This is the "most accurate", but also the most expensive, as it takes a lot more random bits to generate a number in [2^-11, 2^-10) than a number in [1/2,1). I don't know of any software which does this.
  4. Some mix of 2 & 3: generate as many random bits as we can with a single iteration, i.e. if we have a 64-bit RNG, numbers greater than 2^-12 would be generated to full precision, and numbers below that would see some truncation. The easiest way to do this would be via a cast to Float64, then multiplying by 0x1p-64 (though this introduces some rounding issues whereby it would be technically possible to get a 1.0; also sketched below).
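
For concreteness, minimal sketches of options 2 and 4 (illustrative only; what Base actually does is option 1, as traced above):

# option 4: take all 64 bits and scale down; rounding means 1.0 is technically possible
x4 = Float64(rand(UInt64)) * 0x1p-64

# option 2, branch-free variant: draw uniformly in [1/2, 1), then subtract a 0.5
# whose bit pattern is masked to 0.0 with probability 1/2 (uses 53 random bits)
u = rand(UInt64)
half = reinterpret(Float64, 0x3fe0000000000000 | (u & 0x000fffffffffffff))  # in [1/2, 1)
sub  = reinterpret(Float64, 0x3fe0000000000000 & -(u >> 63))                # 0.5 or 0.0
x2   = half - sub                                                           # in [0.0, 1.0)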

@wwywong
Author

wwywong commented May 13, 2016

Some thoughts from the point of view of a user:

  1. Now that I understand how the RNG works for floats, this is not really a "bug"; the rand() function does do what it claims to do, that is, provide a uniformly distributed (pseudo)random number in the interval [0,1). It's just that the algorithm effectively outputs at a slightly lower precision than the type allows.
  2. I don't think most users would notice this. (I was debugging some numerical code and stumbled across it by accident, because the lowered precision means that floating point errors are in fact suppressed for certain types of arithmetic operations, which surprised me.) So a supplement to option 1 may be to keep the algorithm as is and include a technical note in the docs.
  3. Perhaps a 5th option is to either modify rand() so that for floats it defaults to generating a number in [1,2) (and the user can subtract 1 and deal with the precision loss themselves), or expose a method to interface with dSFMT's close1-open2 generator. (Edit: I see I can get the latter by just calling rand(Base.Random.Close1Open2).)
  4. Don't quote me on this, but if you generate to full precision and then truncate/round (as in options 3 or 4), don't you skew the distributions? The probability distributions for the final bits, of the form p(n) = (2^(n-1)-1)/2^n for Julia (or p(n) = (2^n-1)/2^(n+1) for Matlab), are the correct theoretical distributions for random floating point numbers, independently of precision. Essentially what we would end up using would be an RNG with exactly 53 bits (effectively, to account for the hidden bit).

@simonbyrne
Contributor

The probability distributions for the final bits, of the form p(n) = (2^(n-1)-1)/2^n for Julia (or p(n) = (2^n-1)/2^(n+1) for Matlab), are the correct theoretical distributions for random floating point numbers, independently of precision.

I'm not quite sure what you mean here: what's n?

@wwywong
Author

wwywong commented May 13, 2016

n is as in this comment above; in other words, p(n) is the probability that the nth bit (counting from the right) is 1.

@simonbyrne
Contributor

Ah, I missed that comment. This is a consequence of option 1 above (or option 2 in the case of Matlab), and nothing intrinsic to random numbers. Basically, the nth bit will certainly be zero if the number falls in the interval [0, 2^(1-n)), and will be zero with probability 1/2 otherwise.
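
Spelling out the arithmetic: P(nth bit = 1) = P(x >= 2^(1-n)) * 1/2 = (1 - 2^(1-n))/2 = (2^(n-1) - 1)/2^n, which is exactly the p(n) quoted above.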

@andreasnoack
Member

Is there more to do here?

@wwywong
Author

wwywong commented Sep 26, 2016

From my perspective: not really. The result is not a bug (it is by design, for performance reasons). I'll close.
