
math/rand: rand.Float64 is not purely uniform #12290

Closed
fabienlefloch opened this issue Aug 24, 2015 · 21 comments
@fabienlefloch

The current implementation (as of Go 1.5) of rand.Float64 is:

func (r *Rand) Float64() float64 {
  f := float64(r.Int63()) / (1 << 63)
  if f == 1 {
    f = 0
  }
  return f
}

Since the mantissa of a float64 is only 52 bits, there are a number of Int63 outputs (about 2^9 of them) that initially map to 1 and will effectively become 0. The Float64() method therefore does not preserve the uniformity of the underlying Int63() method.
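
A minimal check makes the rounding concrete: the largest Int63 output is within half a ULP of 2^63, so the Go 1 formula rounds it up to exactly 1.0.

package main

import "fmt"

func main() {
    // The largest possible Int63 output under the Go 1 formula:
    // (1<<63)-1 is within half a ULP of 1<<63, so the division
    // rounds up to exactly 1.0, which the remap then turns into 0.
    f := float64(int64(1<<63-1)) / (1 << 63)
    fmt.Println(f == 1.0) // true
}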

This is related to issues #4965 and #6721 .

A better alternative, even if it means trading the numbers below machine epsilon for uniformity, comes from the reference code of Mersenne Twister 64 at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c :

/* generates a random number on [0,1)-real-interval */
func (r *Rand) Float64() float64 {
    return float64(r.Int63()>>10) * (1.0 / 9007199254740992.0)
}

/* generates a random number on (0,1)-real-interval */
func (r *Rand) Float64Open() float64 {
    return (float64(r.Int63()>>11) + 0.5) * (1.0 / 4503599627370496.0)
}

An interesting consequence is that this preserves symmetry around 0.5. It's also quite a bit faster on my machine, as the if statement in the original algorithm degrades performance significantly (I'm not entirely sure why).
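
A benchmark sketch along these lines (using Go's standard testing package; the Go1 and MT benchmark names are illustrative) can check the speed claim locally:

package rand_test

import (
    "math/rand"
    "testing"
)

var sink float64

// BenchmarkFloat64Go1 times the shipped divide-and-remap formula.
func BenchmarkFloat64Go1(b *testing.B) {
    r := rand.New(rand.NewSource(1))
    for i := 0; i < b.N; i++ {
        f := float64(r.Int63()) / (1 << 63)
        if f == 1 {
            f = 0
        }
        sink = f
    }
}

// BenchmarkFloat64MT times the 53-bit Mersenne Twister-style formula.
func BenchmarkFloat64MT(b *testing.B) {
    r := rand.New(rand.NewSource(1))
    for i := 0; i < b.N; i++ {
        sink = float64(r.Int63()>>10) * (1.0 / 9007199254740992.0)
    }
}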

@ianlancetaylor ianlancetaylor changed the title rand.Float64 is not purely uniform math/rand: rand.Float64 is not purely uniform Aug 24, 2015
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Aug 24, 2015
@stemar94

stemar94 commented Sep 4, 2015

There is a comment in the source code:

    // A clearer, simpler implementation would be:
    //  return float64(r.Int63n(1<<53)) / (1<<53)
    // However, Go 1 shipped with
    //  return float64(r.Int63()) / (1 << 63)
    // and we want to preserve that value stream.
    //
    // There is one bug in the value stream: r.Int63() may be so close
    // to 1<<63 that the division rounds up to 1.0, and we've guaranteed
    // that the result is always less than 1.0. To fix that, we treat the
    // range as cyclic and map 1 back to 0. This is justified by observing
    // that while some of the values rounded down to 0, nothing was
    // rounding up to 0, so 0 was underrepresented in the results.
    // Mapping 1 back to zero restores some balance.
    // (The balance is not perfect because the implementation
    // returns denormalized numbers for very small r.Int63(),
    // and those steal from what would normally be 0 results.)
    // The remapping only happens 1/2⁵³ of the time, so most clients
    // will not observe it anyway.
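
For reference, the "clearer, simpler implementation" mentioned in that comment looks like this as a standalone function (a sketch; the float64Simple name is mine, and this is not the shipped code, which preserves the Go 1 value stream):

package main

import (
    "fmt"
    "math/rand"
)

// float64Simple is the "clearer, simpler implementation" from the
// source comment: 53 uniform bits divided by 2^53.
func float64Simple(r *rand.Rand) float64 {
    return float64(r.Int63n(1<<53)) / (1 << 53)
}

func main() {
    r := rand.New(rand.NewSource(1))
    fmt.Println(float64Simple(r))
}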

@fabienlefloch
Author

That's an interesting comment, but the consequence of those choices is a non-uniform distribution, where 0 and numbers near 0 are over-represented.

@dr2chase dr2chase self-assigned this Sep 28, 2015
@dr2chase
Contributor

A quick test of

// Float64 returns, as a float64, a pseudo-random number in [0.0,1.0).
func (r *Rand) Float64() float64 {
    // Change from Go 1 -- make results be uniform.
    return (float64(r.Int63() >> 10)) * (1.0 / 9007199254740992.0)
}

// Float32 returns, as a float32, a pseudo-random number in [0.0,1.0).
func (r *Rand) Float32() float32 {
    // Change from Go 1 -- make results be uniform.
    return (float32(r.Int63() >> 39)) * (1.0 / 16777216.0)
}

shows a non-uniform distribution of the LSB -- it is 25% 1, instead of 50% 1.
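
A minimal sketch of such an LSB count (a reconstruction, not the actual test harness), using the Float64 variant:

package main

import (
    "fmt"
    "math"
    "math/rand"
)

func main() {
    // Count how often the low mantissa bit is set under the 53-bit
    // method. Every output below 0.5 has a zero low bit (its 53 bits
    // shift up into the mantissa, leaving trailing zeros), so we
    // expect roughly 25% ones rather than 50%.
    r := rand.New(rand.NewSource(1))
    const n = 1 << 22
    ones := 0
    for i := 0; i < n; i++ {
        f := float64(r.Int63()>>10) * (1.0 / 9007199254740992.0)
        if math.Float64bits(f)&1 == 1 {
            ones++
        }
    }
    fmt.Printf("LSB set: %.1f%%\n", 100*float64(ones)/n)
}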

@fabienlefloch
Author

Isn't it wrong to look at the LSB for floating-point numbers?
Floating-point numbers are not evenly distributed; there are many more of them near 0 than near 1. This is why truncation to 53 bits (the mantissa) is necessary to get a uniform distribution.

@dr2chase
Contributor

As I understand it, starting with a 53-bit integer and then dividing it by 2-to-the-53 means that there are floating-point numbers that will occur with probability zero. Any "random" value less than 1/2 (2^52 of the 2^53 possible outputs) will not have its LSB set, for example. I think this is also a flaw in the original algorithm, but for a smaller set of FP numbers.

I don't actually know what is officially "right" here because I am not an expert, but right now neither of these algorithms is doing much for me. I recall from some earlier work that one way to do this is to take a 64-bit random number, split it into 12- and 52-bit parts, stuff the 52 bits into the mantissa, and then count leading zeroes in the 12-bit part to derive your exponent. If the 12-bit part is all zero, pull more 64-bit random numbers until you get one that is not zero, or until you hit (about, up to fenceposts) 1024 of them, in which case you call it zero, and use the total number of leading zeroes to derive your exponent. Since you'd only do the second RNG call 1 time in 4096, this could be plenty efficient, since it involves no FP ops at all (though we might hope that multiplication and division by powers of two would be very efficient).
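
A minimal sketch of that leading-zeroes scheme, as reconstructed from the description above (it leans on the later r.Uint64 method and math/bits for brevity, and glosses over reusing leftover bits):

package main

import (
    "fmt"
    "math"
    "math/bits"
    "math/rand"
)

// allValuesFloat64 sketches the scheme described above: 52 random bits
// become the fraction, and the exponent is reduced by one for every
// leading zero drawn from the bit stream, so every float64 in [0,1)
// can occur, each with probability proportional to its width.
func allValuesFloat64(r *rand.Rand) float64 {
    frac := r.Uint64() & (1<<52 - 1) // 52 fraction bits
    exp := -1                        // the first binade is [0.5, 1)
    for i := 0; i < 16; i++ {        // ~1024 zeros in a row: call it zero
        if w := r.Uint64(); w != 0 {
            exp -= bits.LeadingZeros64(w)
            // (1 + frac/2^52) is uniform in [1, 2); scale it into place.
            return math.Ldexp(1+float64(frac)/(1<<52), exp)
        }
        exp -= 64
    }
    return 0
}

func main() {
    r := rand.New(rand.NewSource(1))
    fmt.Println(allValuesFloat64(r))
}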

I am somewhat creeped out that both algorithms are guaranteed not to hit certain FP values.

@fabienlefloch
Author

I would not worry about not hitting certain FP values, especially since FPs are not uniformly distributed by nature (many more towards 0 than towards 1). What's important is that the FP values generated have a uniform distribution. If you restrict yourself to the 53 bits, then the FP numbers are uniformly spaced in (0,1), so it's easy to see that the FP generator has to be uniform if the underlying integer generator is uniform. Yes, you lose the small numbers around 0 (less than machine epsilon), but there is no way to be uniform and keep those.

Maybe it's not a very good argument, but I would trust Matsumoto and the official Mersenne-Twister implementation as he is a very prominent researcher in the area.

@fabienlefloch
Author

Maybe this paper could help in understanding the idea of uniform random floating-point numbers:
http://www.doornik.com/research/randomdouble.pdf

@fabienlefloch
Author

Anybody doing Monte-Carlo simulations will want the Matsumoto-like behavior. I don't really see any other use case for a Float64 function than Monte-Carlo simulations.
If backward compatibility is an issue, a Float53() function would be nice.

@dr2chase
Contributor

dr2chase commented Nov 2, 2015

Any particular reason not to instead use

// Float64 returns, as a float64, a pseudo-random number in [0.0,1.0).
func (r *Rand) Float64() float64 {
    // Change from Go 1 -- make results be uniform.
    return float64(r.Int63()) * 1.0842021724855043e-19
}

Provenance of funny constant: https://play.golang.org/p/FSkOkffu_2

This gets you uniformity, and hits more floating-point numbers. It does, however, sometimes cause a 1-eps deviation from the former value stream.
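
To see why that constant works (a check of mine, not from the thread): it is the largest float64 strictly below 2^-63, so even the largest Int63 output lands strictly below 1.0.

package main

import (
    "fmt"
    "math"
)

func main() {
    const c = 1.0842021724855043e-19
    // c is the next float64 below 2^-63...
    fmt.Println(c == math.Nextafter(math.Ldexp(1, -63), 0)) // true
    // ...so even the largest Int63 value maps strictly under 1.0.
    fmt.Println(float64(int64(1<<63-1))*c < 1.0) // true
}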

@fabienlefloch
Author

Look at the comment from stemar94: 0 is underrepresented under this algorithm. You just can't hit more random points and still have a uniform distribution appropriate for simulation.

@dr2chase
Contributor

dr2chase commented Nov 2, 2015

The value stream argument seems to be winning at this end (i.e., I floated the possibility of an RNG that differed by 1 eps in many instances, and that was judged to be not worth the trouble), but I agree that the wrap-back from 1 overrepresents 0, by (I think) a factor of 1<<9 for Float64 and 1<<38 for Float32. I may have a fencepost error in my exponents; the derivation is that the highest two values for Float64 are 1 and 1-1/(1<<53), there are 1024 points between them in the integer random stream, and each value gets half of them, versus a single point each for 1/(1<<63) and 0. Wrapping 1 to 0 maps 512 points from the integer stream to 0, which is a substantial overrepresentation, and egregiously so for Float32.

To (mostly) preserve the value stream, it appears that the least-awful method is to replace the zero substitution with try-another-number. It benchmarked the same, though of course this was just a microbenchmark.

I don't understand the claim that zero is overrepresented. The Mersenne twister RNG will generate 1/(1<<53) and 0 both at p=1/(1<<53); this algorithm generates 1/(1<<63) and 0 both at p=1/(1<<63). In both cases the probability of zero and its nearest neighbor are the same (and p(0) is lower in this algorithm, because all of the points between 0 and 1/(1<<54) get hits, where in Mersenne twister they instead round to zero).

@rsc
Contributor

rsc commented Nov 2, 2015

At this point I don't believe we should change the value stream generated by (*Rand).Float64 in any significant way. The split of rand.Source (the source of non-determinism) from rand.Rand (the algorithms on top) means that people expect that only rand.Source might differ from release to release. If people use the same rand.Source output as input to rand.Rand, they expect to get the same outputs. rand.Rand should not itself be a source of non-determinism, not even from release to release, if at all possible.

When the implementation of (*Rand).Float64 returned 1, that was obviously a mistake and in contradiction to the docs, so it merited a breaking change.

The implementation of (*Rand).Float64 is not perfect. My apologies. But I believe it's not awful enough to merit a breaking change.

The if f == 1 { f = 0 } fix is also not perfect. Again, my apologies. But even here I'm not convinced it's awful enough at this point to merit a breaking change.

We may be able to loop and grab another value, but I'm not even convinced that's necessary at this point.

More generally, the specified API for (*Rand).Float64 is incomplete. All it says is “Float64 returns, as a float64, a pseudo-random number in [0.0,1.0) from the default Source.” It says nothing about uniformity, which is good, because floating point numbers themselves are not uniformly distributed in [0, 1).

In practice, if you care deeply about the properties of your random floating-point numbers, you should probably write your own wrapper around r.Int63() instead of depending on unspecified details of a library implementation.
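
Such a wrapper might look like the sketch below, echoing the Float53 idea suggested earlier in the thread (the float53 name is illustrative, not part of math/rand):

package main

import (
    "fmt"
    "math/rand"
)

// float53 wraps Int63 to produce the evenly spaced 53-bit stream
// discussed above. Illustrative only; not an official API.
func float53(r *rand.Rand) float64 {
    return float64(r.Int63()>>10) / (1 << 53)
}

func main() {
    r := rand.New(rand.NewSource(1))
    fmt.Println(float53(r))
}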

@fabienlefloch
Author

Thanks rsc, this makes perfect sense. And this is what I actually ended up doing. The only thing I am left wondering is: what are the use cases for this rand.Float64 method?

@gopherbot
Contributor

CL https://golang.org/cl/17670 mentions this issue.

@fabienlefloch
Author

The change does not really improve uniformity that much. Again, as there are more floating-point numbers near zero, the outcome will be skewed towards zero with probability 1/2^11.
The change suggests a confusion in the definition of uniformity. Hitting all floating-point numbers with equal probability does not necessarily mean uniform, because of the over-representation of numbers close to zero. So the fix continues the same mistake, just makes things a bit slower on average, and will be useless for quasi-random number generators.

@rsc
Contributor

rsc commented Jan 4, 2016

@fabienlefloch I disagree. Yes, there are more floating-point numbers near zero, but they are selected less often, so it works out. By "uniform" here we mean that after x = rand.Float64(), as much as possible, x < k with probability k, for all k in [0, 1]. I spent a while calculating Pr(x < k) analytically for the various fixes, and the current implementation seems to me quite close, certainly far closer than one in 2^11. I'd be happy to see an analysis disproving that though.
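
A spot check of that definition is straightforward (a sketch of mine; it estimates Pr(x < k) for a few thresholds, which should come out close to k):

package main

import (
    "fmt"
    "math/rand"
)

func main() {
    // Estimate Pr(x < k) for a few thresholds k; under the definition
    // above each estimate should be close to k itself.
    r := rand.New(rand.NewSource(1))
    ks := []float64{0.1, 0.25, 0.5, 0.9}
    counts := make([]int, len(ks))
    const n = 1 << 24
    for i := 0; i < n; i++ {
        x := r.Float64()
        for j, k := range ks {
            if x < k {
                counts[j]++
            }
        }
    }
    for j, k := range ks {
        fmt.Printf("Pr(x < %v) ≈ %.5f\n", k, float64(counts[j])/n)
    }
}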

@fabienlefloch
Author

@rsc Why would they be selected less often near 0?
It is not so easy to find a simple concrete example with a real random number generator, however, as most won't quickly produce integers that translate to a float64 below machine epsilon. Worse, if you try to compute an average with Monte-Carlo, you will need a decent sample size, so it's likely to take far too much time to test naively.
But if you assume that you have a RNG that gives you 1, 2, 3, 4, 5 on one side and 2^63-1, 2^63-2, 2^63-3, 2^63-4, 2^63-5 on the other side, you can see that the average number of hits in (0, 2^-53] is going to be larger than the number of hits in [1-2^-53, 1) (actually, in this case all of it is going to fall in (0, 2^-53]).
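
Concretely (an illustration of that hypothetical stream, not from the thread):

package main

import "fmt"

func main() {
    // The hypothetical stream above: the small Int63 outputs land in
    // (0, 2^-53], while their mirror images near 2^63 all round up to
    // exactly 1.0, which the old code then remapped to 0.
    for k := uint64(1); k <= 5; k++ {
        lo := float64(k) / (1 << 63)
        hi := float64(uint64(1)<<63-k) / (1 << 63)
        fmt.Printf("k=%d: lo=%g hi=%g\n", k, lo, hi)
    }
}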

@rsc
Contributor

rsc commented Jan 15, 2016 via email

@dr2chase
Contributor

Would this test program perhaps be helpful? It computes 100 billion random float32s and bins them by the negative power of two less than or equal to them (i.e., [0.5, 1) ends up in bin 1) for bins 1-64.

package main

import (
    "fmt"
    "math/rand"
)

func main() {
    var a [65]uint64
    for i := 0; i < 1024; i++ { // 2**10
        for j := 0; j < 128*1024*1024; j++ { // 2**27
            r := rand.Float32()
            var k int
            for k = 0; r < 1.0 && k < 64; k++ {
                r = r + r
            }
            a[k]++
        }
        fmt.Printf("%d", a[1])
        for k := 2; k < 65; k++ {
            fmt.Printf(", %d", a[k])
        }
        fmt.Println()
    }
}

I ran this for over 100 minutes, the last printed line was:
44090360930, 22045281389, 11022793839, 5511276150, 2755668437, 1377855446, 688902792, 344456207, 172220997, 86117788, 43060011, 21523345, 10767229, 5378922, 2691381, 1346071, 673490, 336577, 168029, 84082, 42097, 21159, 10498, 5206, 2604, 1265, 645, 336, 200, 85, 42, 25, 14, 4, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

The reason points near zero are selected less often is by construction. If we start with points uniform on [0, 2^63-1], full-precision division by 2^63 preserves uniformity, each with probability P0 = 1/2^63. If we then round those to a 53-bit mantissa, the probabilities of [0, 2^53)/2^63 remain unchanged (=P0); they are exactly representable and there is no rounding. Consider the mapping of points 2^53+k for smallish k. For even k the point is exactly representable and the value appears in the FP64 stream, but for odd k the value must be rounded: 1->0, 2=2, 3->4, 4=4, 5->4, 6=6, 7->8, 8=8, 9->8, 10=10, etc.

So on the one hand, I see an interesting glitch in uniformity, in that for k = 2, 6, 10, ... p(2^53+k) = P0, but for k = 4, 8, 12, ... p(2^53+k) = 3*P0, because ties round to the value whose last mantissa bit is zero (round-to-nearest-even). (I modified the program above to test this and it appeared right away. Similarly, the old unfixed code starts dumping hits into the last bucket right away for rand.Float32.)

For larger powers of two, the difference in even and odd probabilities diminishes. For p(2^54+k), two bits must be dropped and the winners will be k = 0, 4, 8, 12, 16. Rounding to the respective values of k:

0: -1, 0, 1, 2 (where -1 = 2^54 - 1 == 2^53 + 2^53 - 1)
4: 3, 4, 5
8: 6, 7, 8, 9, 10
12: 11, 12, 13

i.e., instead of 4*P0, alternating between 3*P0 and 5*P0; for p(2^55+k), 7*P0 and 9*P0 instead of 8*P0; and so on, all the way to p(2^62+k), where the probabilities will be 1023*P0 and 1025*P0 instead of 1024*P0.
Here's a demo of the 1:3 ratio between odds and evens:

package main

import (
    "fmt"
    "math"
    "math/rand"
)

const (
    p53 = float64(1<<53) / (1 << 63)
    p54 = float64(1<<54) / (1 << 63)
)

func main() {
    var even uint64
    var odd uint64

    for j := 0; j < 128*1024*1024; j++ { // 2**27
        r := rand.Float64()
        if p53 < r && r < p54 {
            i := math.Float64bits(r)
            if i&1 == 1 {
                odd++
            } else {
                even++
            }
        }
    }
    fmt.Printf("odd,even=%d,%d\n", odd, even)
}

To your original worry/complaint, notice that in general each time half the points drop out, the probability of the remaining points being chosen increases by a factor of two on average. In the data above, you'd expect to see a glitch, if any, occurring between bucket 10 and bucket 11, and there is none (I checked).

It's a bit of a shame we can't do this in RTZ FP mode; it would, however, change the stream of bits from the RNG and perhaps cause some sadness.

@fabienlefloch
Author

I have not yet had time to produce a simple example to illustrate uniformity.

I don't fully understand why golang wants to reinvent the wheel on this subject, while every single researcher on the topic uses the same techniques to produce random float64s from integers (see Matsumoto, L'Ecuyer, etc.).

@dr2chase
Contributor

dr2chase commented Aug 2, 2016

The difficulty is that the wheel was reinvented differently, and then there is a strong desire to retain exact compatibility with older releases. There were clear bugs in the original that had to be fixed, so they were fixed. If you can demonstrate a bug, that greatly increases the odds that it will be fixed.
