Fix initrand to avoid random number sequences overlapping #18744

demotomohiro · 2021-08-24T23:42:27Z

This is the alternative of #18149.
Fix #17898.
It allows 2^64 Rand generate 2^64 random numbers without random number sequences overlapping.

If states of PRNGs were initialized using time, multiple states of PRNG (pseudorandom number generator) can have same value when they are initialized at almost same time.
If states of PRNG were initialized with random value, sequences generated by them can be overlapped.
https://blogs.sas.com/content/iml/2018/05/07/overlap-random-streams.html
The probability of overlap of random subsequences of pseudorandom number generator is explained in this paper:
https://vigna.di.unimi.it/ftp/papers/overlap.pdf

According to the paper, the probability of overlap is always at most n^2 * L / P where n is a number of PRNG, L is a length of sequences and P is a period of the PRNG.

This PR implementates initRand in the same way as runnableExamples of skipRandomNumbers in random module.
It try to avoid a PRNG state identicals to a state of other PRNGs in other processes by initializing the baseState with std/sysrand.
When it doesn't worked, baseSeed is initialized with hash value of current process id and monotime.
But different processes in different machines can have same process id.

initRand avoids random number sequences overlapping by copying baseState to new Rand state and call skipRandomNumbers with baseState.

I didn't touched js code because how skipRandomNumbers works in js is unknown.
Even if it works and can skip 2^64 times next() calls, that means it is same to call next() once because maximum period of Rand in js is 2^64 - 1, because sizeof(Rand) == 8 in js.
I think it should be replaced with 32-bit Generators like xoshiro128++ or xoshiro128** if random module doesn't need to generate same sequences in all platform.
https://prng.di.unimi.it
Or implement 64bit int addition and bit operation with 32 bit int for correct xoroshiro128+.

I visualized how Pseudo-random number generators created with jump function(skipRandomNumbers) and with random seed use PRNG state space.

Source:
https://gist.github.com/demotomohiro/9516fe686da376bbce8a4470e52752ab

Each blue box represents a PRNG state.
Red rounded box represents PRNG that generates 4 random numbers by updating state 4 times.
Upper block represents how PRNGs created with jump function uses PRNG states.
Lower block represents how PRNGs created with random seed uses PRNG states.
In lower block, several PRNGs uses same states.
That means a part of random number sequence from one PRNG matchs a part of random number sequence from other PRNG.

This document explains how this PR avoid random number sequences overlapping:
https://www.mathcha.io/editor/7JPgyt0WSLdH4L7VywudZkBrQF4qJOe8Fp3JBB

timotheecour · 2021-08-25T01:14:13Z

If states of PRNGs were initialized using time, multiple states of PRNG (pseudorandom number generator) can have same value when they are initialized at almost same time.

that's not true anymore, I fixed this in #17468 which already calls skipRandomNumbers(result) in the right place; it's unclear whether your PR adds redundant calls to skipRandomNumbers given that skipRandomNumbers is already being called by initRand(seed)

lib/std/tempfiles.nim

demotomohiro · 2021-08-27T06:10:06Z

If states of PRNGs were initialized using time, multiple states of PRNG (pseudorandom number generator) can have same value when they are initialized at almost same time.

that's not true anymore, I fixed this in #17468 which already calls skipRandomNumbers(result) in the right place; it's unclear whether your PR adds redundant calls to skipRandomNumbers given that skipRandomNumbers is already being called by initRand(seed)

initRand*(seed: int64) is a function because it always returns same value from same input.
If initRand*(seed: int64): Rand was called with times.getTime() twice and first and second times.getTime() call returned the same value as first times.getTime(), two Rand returned from initRand*() would have same status.

In this PR, one process have only one baseState that is initialized on start up with random number from sysrand.urandom or monotime + process Id.
After baseState was initialized, it is updated only with skipRandomNumbers in initRand().
When initRand() is called, baseState is copied to new Rand and baseState is updated with skipRandomNumbers.
initRand() doesn't take random value from time or sysrand and doesn't call initRand*(seed: int64) except js backend.
So When initRand() is called multiple times, states of Rand returned from initRand() doesn't overlap to each other unless it calls next*(r: var Rand): uint64 more than 2^64 times or initRand is called more than 2^64 times.

This document explains how this PR avoid random number sequences overlapping:
https://www.mathcha.io/editor/7JPgyt0WSLdH4L7VywudZkBrQF4qJOe8Fp3JBB

lib/pure/random.nim

Varriount · 2021-08-28T01:20:45Z

lib/pure/random.nim

+        ret = DefaultRandSeed
+      ret
+
+    when compileOption("threads"):


Wouldn't it be better to have the default random state be a per-thread global? That way, locking wouldn't be needed.

baseState cannot be a per-thread global because if it was per-thread, it cannot avoid random number sequences overlapping between different PRNGs in different thread.

Quote from https://prng.di.unimi.it (web site created by creator of xoroshiro):

We provide ready-made jump functions for a number of calls equal to the square root of the period, to make it easy generating non-overlapping sequences for parallel computations, and equal to the cube of the fourth root of the period, to make it possible to generate independent sequences on different parallel processors.

skipRandomNumbers*(s: var Rand) in random module is a jump function.

I don't follow - if per-thread state is initialized with random data via urandom, why would skipRandomNumbers overlap?

Please read this:
https://www.mathcha.io/editor/7JPgyt0WSLdH4L7VywudZkBrQF4qJOe8Fp3JBB

Initial state of all Rand must be T2^64n (n is natural number including 0) to avoid overlapping.
If baseState were per-thread variable and they were initialized with urandom, initial state of all Rand cannot become T2^64n.

@demotomohiro ...so now every call to next needs to hold the lock! I imagine this would be very slow. Is there a better solution than ignoring the data races?

@timotheecour The approach in #18149 would suffer from overlap issues though, wouldn't it? Which is what I believe @demotomohiro is aiming to avoid.

The approach in #18149 would suffer from overlap issues though, wouldn't it?

I was also ensuring skipRandomNumbers gets called but indeed, the way it was being called wouldn't guarantee lack of overlap ; i'll need to think more to double check;

that said, the potential performance issue should be evaluated/investigated

Actually, upon re-reviewing the code:

The lock is only for initialization of new random states (frombasestate), including the global state. I had previously misread this, and thought all access to the global random state (state) was being locked.

The current global random state isn't threadsafe at all (unless the implementation is a lock-free algorithm of some sort), and isn't a per-thread global. An obvious way to solve this to make state a thread-local global, and check/initialize it when it is accessed. This would impose a performance cost, though I don't know how small it would be - it would probably depend on whether the backend compiler is smart enough to hoist the random state somewhere on the stack.

@planetis-m
As @Varriount said, only initRand() proc acquires lock.
If you create Rand in each threads and never share them between threads, you don't need to lock when you call next(r: var Rand).
If you create only one Rand and want to share it by multiple threads, you would need to acquire lock when you call next(r: var Rand),
but it is not a problem this PR cause.

@Varriount
In this document, T0 is a initial state of baseState and it is the value came from sysrand.urandom.
https://www.mathcha.io/editor/7JPgyt0WSLdH4L7VywudZkBrQF4qJOe8Fp3JBB
Then, first Rand returned from initRand() has state T0.
second one has T2^64, third one has T2^642, 4th one has T2^643, n-th one has T2^64*(n-1) and so on.
Then, first Rand returned from initRand() can use states from T0 to T2^64-1 without overlapping the state with second one.
Second one can use from T2^64 to T2^642-1, third one can use from T2^642 to T2^64*3-1, and so on.

@timotheecour
I don't think performance of initRand() is so important, because it would not be called as frequently as next(r: var Rand) as initRand() is called only when you create a new Rand.
When many Rand are created but they generate few random numbers, performance of initRand() can be important.
But in that case, I think hash functions tested in paper would be better:
http://jcgt.org/published/0009/03/02/

An obvious way to solve this to make state a thread-local global, and check/initialize it when it is accessed.

I think making state a thread-local variable and Nim compiler automatically inserts a call to randomize() so that it is called when a new thread starts would be better.
But I don't know it is possible.

I get it now, good one +1. The key thing to remember is state and baseState are different variables. state is used only for the "simple" not thread safe api, while baseState is used in the new initRand overload.

lib/pure/random.nim

Varriount · 2021-08-31T00:13:07Z

lib/pure/random.nim

+      for i in 0 .. 7:
+        if sysrand.urandom(urand):
+          copyMem(ret.addr, urand[0].addr, sizeof(Rand))
+          if ret.isValid:
+            break
+
+      if not ret.isValid:
+        # When 2 processes executed at same time on different machines,
+        # `ret` can still have same value.
+        let
+          pid = getCurrentProcessId()
+          t = getMonoTime().ticks
+        ret.a0 = pid.hash().Ui
+        ret.a1 = t.hash().Ui
+        if not ret.isValid:
+          ret.a0 = pid.Ui
+          ret.a1 = t.Ui


When would the data supplied by sysrand.urandom not correctly initialize the global random state?

xoroshiro128+ doesn't work when all bits of state are zero.
If sysrand.urandom is not guaranteed to never return zeros when sizeof(Rand) bytes are taken from it, we need to check returned value.
Maybe there is a bug in OS and sysrand.urandom returns zero.
Even if sysrand.urandom is working correctly, you might get a zero unfortunately (probability of 128bits random number returns 0 is 1/2^128 ).

According to the comment of sysrand.urandom proc,

If the call succeeds, returns true.

In other words, it can fail. So we need alternative ways to get random number.
Even if probability of sysrand.urandom returns 128bit zeros is tiny, it is easy to take care of such case if we already have alternative ways to get random number.

What about ANDing the state with 1 after seeding via urandom? That would ensure that the state is never zero.
Alternately, an explicit check and retry would work.

isValid(r: Rand): bool func checks whether state is zero or not.

func isValid(r: Rand): bool {.inline.} = not (r.a0 == 0 and r.a1 == 0)

This PR retrys urandom if isValid was false.

lib/pure/random.nim

timotheecour · 2021-08-31T15:41:07Z

lib/pure/random.nim

@@ -635,20 +653,60 @@ when not defined(nimscript) and not defined(standalone):
      let time = int64(times.epochTime() * 1000) and 0x7fff_ffff
      result = initRand(time)
    else:
-      let now = times.getTime()
-      result = initRand(convert(Seconds, Nanoseconds, now.toUnix) + now.nanosecond)
+      proc getRandomState(): Rand =


move this nested proc to top-level scope, simpler/cleaner

Why moving nested proc to top-level scope makes simpler/cleaner?
It doesn't reduce a total number of lines of random module.
getRandomState proc is used only in initRand().

lib/pure/random.nim

tests/stdlib/trandom.nims

Araq · 2021-09-02T12:11:59Z

tests/stdlib/trandom.nim

+    for i in 0..<size:
+      for j in 0..<numRepeat:
+        discard rands[i].next
+        doAssert rands[i] notin randSet


Should be addressed in a follow up PR but the notion that a random sequence of numbers doesn't "overlap" with some other sequence seems to weaken the "randomness" and not to improve it.

i don't think think this is correct; the sequences of numbers generated can have overlap in parts, eg:
1 (9 8 3) 2 5
2 (9 8 3) 7 1

but the underlying states won't overlap (so long we remain within a period); if they did, the sequences generated would be identical at any point after which the state overlaps

Could you tell me why is it weaken randomness?
It just works like evenly cut a random number sequence generated by calling next(r: var Rand) with single Rand, and assign each sub sequence to each Rand.
If you think randomness of random module is not good enough, I think we need to use different algorithm like xoroshiro128++, xoroshiro128** or PCG.

…8744) * Fix initrand to avoid random number sequences overlapping * Minor fix * Fix compile error on js backend * Disable new test for js backend * Minor fix * tempfiles module uses random.initRand() * Remove unused module import from lib/std/tempfiles.nim * Initialize baseState in initRand() * Run tests/stdlib/trandom.nim from tests/test_nimscript.nims * baseState is initialized only with sysrand.urandom and quit if failed * Add comments

demotomohiro added 6 commits August 25, 2021 02:03

Fix initrand to avoid random number sequences overlapping

300614b

Minor fix

fbdff68

Fix compile error on js backend

7898483

Disable new test for js backend

15eb7b4

Minor fix

3c14094

tempfiles module uses random.initRand()

a3278e4

ringabout reviewed Aug 25, 2021

View reviewed changes

lib/std/tempfiles.nim Show resolved Hide resolved

Remove unused module import from lib/std/tempfiles.nim

857623d

timotheecour reviewed Aug 30, 2021

View reviewed changes

lib/pure/random.nim Outdated Show resolved Hide resolved

Varriount requested changes Aug 31, 2021

View reviewed changes

Initialize baseState in initRand()

97ee811

Araq reviewed Aug 31, 2021

View reviewed changes

lib/pure/random.nim Outdated Show resolved Hide resolved

Araq reviewed Aug 31, 2021

View reviewed changes

lib/pure/random.nim Outdated Show resolved Hide resolved

timotheecour reviewed Aug 31, 2021

View reviewed changes

lib/pure/random.nim Show resolved Hide resolved

timotheecour reviewed Aug 31, 2021

View reviewed changes

tests/stdlib/trandom.nims Outdated Show resolved Hide resolved

demotomohiro added 3 commits September 1, 2021 17:02

Run tests/stdlib/trandom.nim from tests/test_nimscript.nims

8b218ab

baseState is initialized only with sysrand.urandom and quit if failed

a254cca

Add comments

42c28cb

Araq reviewed Sep 2, 2021

View reviewed changes

Araq merged commit 7c8ea49 into nim-lang:devel Sep 2, 2021

ringabout mentioned this pull request Sep 23, 2022

initRand now uses strict monotonic counter to guarantee uniqueness #18149

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix initrand to avoid random number sequences overlapping #18744

Fix initrand to avoid random number sequences overlapping #18744

demotomohiro commented Aug 24, 2021 •

edited

Loading

timotheecour commented Aug 25, 2021 •

edited

Loading

demotomohiro commented Aug 27, 2021

Varriount Aug 28, 2021

demotomohiro Aug 31, 2021

Varriount Sep 1, 2021

demotomohiro Sep 1, 2021

planetis-m Sep 2, 2021

Varriount Sep 2, 2021

timotheecour Sep 2, 2021

Varriount Sep 3, 2021 •

edited

Loading

demotomohiro Sep 4, 2021

planetis-m Sep 7, 2021 •

edited

Loading

Varriount Aug 31, 2021

demotomohiro Aug 31, 2021

Varriount Sep 1, 2021

demotomohiro Sep 1, 2021

timotheecour Aug 31, 2021

demotomohiro Sep 1, 2021

Araq Sep 2, 2021

timotheecour Sep 2, 2021

demotomohiro Sep 4, 2021

Fix initrand to avoid random number sequences overlapping #18744

Fix initrand to avoid random number sequences overlapping #18744

Conversation

demotomohiro commented Aug 24, 2021 • edited Loading

timotheecour commented Aug 25, 2021 • edited Loading

demotomohiro commented Aug 27, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Varriount Sep 3, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

planetis-m Sep 7, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

demotomohiro commented Aug 24, 2021 •

edited

Loading

timotheecour commented Aug 25, 2021 •

edited

Loading

Varriount Sep 3, 2021 •

edited

Loading

planetis-m Sep 7, 2021 •

edited

Loading