Futures: Parallel random number generation (RNG) #60

HenrikBengtsson · 2022-12-08T05:05:26Z

To prevent statistically unsound random numbers from being produced when running in parallel, futureverse asks the developer to declare when their code needs the RNG. If it is not declared, futureverse still checks whether the RNG was used (i.e. whether .Random.seed was updated). If it was, a warning is produced.
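As a rough illustration of that check (a minimal sketch only, not futureverse's actual implementation; `rng_was_used()` is a hypothetical helper name), one can compare `.Random.seed` before and after evaluating an expression:

```r
# Hypothetical helper: report whether evaluating `expr` touched the RNG state.
rng_was_used <- function(expr) {
  # .Random.seed lives in the global environment and may not exist yet
  # in a fresh session, so get0() falls back to NULL.
  before <- get0(".Random.seed", envir = globalenv(), inherits = FALSE)
  force(expr)  # arguments are lazy, so the expression is evaluated here
  after <- get0(".Random.seed", envir = globalenv(), inherits = FALSE)
  !identical(before, after)
}

used_rnorm <- rng_was_used(rnorm(1))   # draws a random number
used_sum   <- rng_was_used(sum(1:10))  # purely deterministic
```

Here `used_rnorm` flags RNG use while `used_sum` does not, which is the kind of signal that could trigger the warning.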

Here is an example:

> library(pbapply)
> future::plan("multisession")
> y <- pblapply(1:2, FUN = rnorm, cl = "future")
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
Warning messages:
1: UNRELIABLE VALUE: One of the 'future.apply' iterations ('future_lapply-1') unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set option 'future.rng.onMisuse' to "ignore". 
2: UNRELIABLE VALUE: One of the 'future.apply' iterations ('future_lapply-2') unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set option 'future.rng.onMisuse' to "ignore".

To avoid this, a quick fix is to always pass future.seed = TRUE. That sets up a parallel RNG regardless of whether random numbers are generated or not. The downside is that doing so can be computationally expensive. To give the developer control, you'd have to introduce a new argument allowing them to control the future.seed argument to future_lapply() and the like. One way to do that without adding a new argument could be via attributes, e.g.

y <- pblapply(1:2, FUN = rnorm, cl = structure("future", future.seed = TRUE))
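For illustration, here is a minimal sketch of how a wrapper could read such an attribute. `get_future_seed()` is a hypothetical helper, not part of pbapply, and the `FALSE` fallback is an assumption chosen to mirror the default of `future.seed` in `future_lapply()`:

```r
# Hypothetical helper: read a 'future.seed' attribute from `cl`,
# falling back to a default when the attribute is absent.
get_future_seed <- function(cl, default = FALSE) {
  seed <- attr(cl, "future.seed", exact = TRUE)
  if (is.null(seed)) default else seed
}

cl_plain  <- "future"
cl_seeded <- structure("future", future.seed = TRUE)

seed_plain  <- get_future_seed(cl_plain)   # no attribute, so the default is used
seed_seeded <- get_future_seed(cl_seeded)  # attribute value is returned

# The attribute does not get in the way of checking for cl = "future".
still_future <- cl_seeded == "future"
```

The attribute rides along on the string, so existing code that compares `cl` to "future" would keep working.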


psolymos · 2022-12-10T02:51:42Z

I like the attribute for the cl argument, but it might be a bit alien for some users. How about adding it to pboptions()? I.e. have it unset (NULL) on load, but check for the existence of the future.seed option and use that value.

HenrikBengtsson · 2022-12-10T03:07:28Z

> How about adding it to pboptions()?

This is something the developer should control in their code. I don't think it should be modifiable by the end-user via an option. That would give different results depending on the option, which is probably not what the developer intended.

psolymos · 2022-12-10T03:22:13Z

I see the distinction. If the user is calling pb*apply(..., cl = "future") directly, they should be able to set it as an attribute, but if this is being used as part of another package, it is baked in.

psolymos · 2022-12-10T03:46:41Z

One can pass the future.seed argument directly through ... because ?future.apply::future_lapply says:

For future_*apply() functions and replicate(), any future.* arguments part of \dots are passed on to future_lapply() used internally.

See:

r$> y <- pblapply(1:2, FUN = rnorm, cl = "future")
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
Warning messages:
1: UNRELIABLE VALUE: One of the ‘future.apply’ iterations (‘future_lapply-1’) unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set option 'future.rng.onMisuse' to "ignore". 
2: UNRELIABLE VALUE: One of the ‘future.apply’ iterations (‘future_lapply-2’) unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set option 'future.rng.onMisuse' to "ignore". 

r$> y <- pblapply(1:2, FUN = rnorm, cl = "future", future.seed = TRUE)
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
# no warnings

So developers can utilize this behaviour to set the future seed.
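For context on what "parallel-safe random numbers via the L'Ecuyer-CMRG method" means, below is a minimal base-R sketch using the parallel package. It only illustrates the idea of independent, reproducible RNG streams; it is not a description of how future.apply assigns streams internally, and `draw_with()` is a hypothetical helper:

```r
library(parallel)  # ships with R; provides nextRNGStream()

old_kind <- RNGkind("L'Ecuyer-CMRG")  # switch generator, remember the old one
set.seed(42)
s1 <- .Random.seed        # seed of the first stream
s2 <- nextRNGStream(s1)   # seed of a second, non-overlapping stream

# Hypothetical helper: draw n normal variates starting from a given stream seed.
draw_with <- function(seed, n) {
  assign(".Random.seed", seed, envir = globalenv())
  rnorm(n)
}

a       <- draw_with(s1, 3)
b       <- draw_with(s2, 3)
a_again <- draw_with(s1, 3)  # same stream seed, so same numbers as `a`

RNGkind(old_kind[1])  # restore the previous generator kind
```

Each parallel task getting its own stream is what makes the results both statistically sound and reproducible, independent of how tasks are distributed across workers.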

Signed-off-by: Peter Solymos <psolymos@gmail.com>

HenrikBengtsson · 2022-12-13T17:03:13Z

> So developers can utilize this behaviour to set the future seed.

Good point. Yes, that looks like the cleanest solution. Then a rule of thumb can be: "pass any additional arguments to FUN immediately following the FUN argument, and any additional arguments to the futureverse after cl = "future"":

y <- pblapply(1:2, FUN = my_fcn, {additional my_fcn args}, cl = "future", {additional future args})
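To make the rule of thumb concrete, here is a small base-R sketch of how arguments in `...` can be told apart by the `future.` name prefix. This is only an illustration of the convention, not the code future.apply uses, and `split_dots()` is a hypothetical helper:

```r
# Hypothetical helper: split `...` into arguments meant for FUN and
# arguments whose names start with "future." (meant for the futureverse).
split_dots <- function(...) {
  args <- list(...)
  nms <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))  # all arguments unnamed
  is_future <- startsWith(nms, "future.")
  list(fun_args = args[!is_future], future_args = args[is_future])
}

# Arguments for rnorm() first, future.* arguments last.
parts <- split_dots(mean = 10, sd = 2, future.seed = TRUE)
fun_names    <- names(parts$fun_args)
future_names <- names(parts$future_args)
```

Because the routing depends on names rather than position, the ordering in the rule of thumb is mainly a readability convention.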

psolymos added a commit that referenced this issue Dec 10, 2022

Add note about future.seed #60

4645374

Signed-off-by: Peter Solymos <psolymos@gmail.com>

psolymos closed this as completed in f13ef73 Jan 8, 2023

fBedecarrats mentioned this issue Feb 17, 2023

A different approach to parallelization with mapme.biodiversity mapme-initiative/mapme.biodiversity#135

Closed

jepusto mentioned this issue Sep 16, 2024

cl= 'future' with pbreplicate() #71

Closed
