Within-target parallelism section #69

pat-s · 2019-03-10T14:48:09Z

Questions

In mclapply(), set the mc.set.seed argument to FALSE. If your computations require pseudo-random numbers (rnorm(), runif(), etc.) you will need to manually set a different seed for each parallel process, e.g.

Why? I use it in my code and haven't faced any problems yet.

In make(), set the lock_envir argument to FALSE. This approach deactivates important reproducibility guardrails, so use with caution.

I also do not do this. What is the reason behind this?

Suggestions

Devote a dedicated section to "within-target parallelism", with subsections for

General notes
Execution on a single machine
Execution on an HPC


wlandau · 2019-03-13T00:42:36Z

Why? I use it in my code and haven't faced any problems yet.

From the "Random numbers" section of the help file of parallel::mcparallel():

If ‘mc.set.seed = FALSE’, the child process has the same initial
random number generator (RNG) state as the current R session. If
the RNG has been used (or ‘.Random.seed’ was restored from a saved
workspace), the child will start drawing random numbers at the
same point as the current session.

rnorm(1)
#> [1] 0.4506939
parallel::mclapply(c(1, 1), rnorm, mc.set.seed = FALSE, mc.cores = 2)
#> [[1]]
#> [1] -0.3738903
#> 
#> [[2]]
#> [1] -0.3738903

Created on 2019-03-12 by the reprex package (v0.2.1)
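For contrast with the identical draws above, here is a minimal sketch of seeding each parallel process manually while keeping mc.set.seed = FALSE. The helper draw_one() and the seed values are hypothetical, chosen only for illustration, and mc.cores = 2 assumes a Unix-alike system, since mclapply() relies on forking.

```r
library(parallel)

# Hypothetical helper: seed this child process, then draw one value.
draw_one <- function(seed) {
  set.seed(seed)  # a distinct seed per task gives each child its own stream
  rnorm(1)
}

seeds <- c(101, 102)  # arbitrary illustrative seeds, one per task

# With mc.set.seed = FALSE the children would otherwise share the parent's
# RNG state, so the explicit set.seed() call is what makes the draws differ.
out <- mclapply(seeds, draw_one, mc.set.seed = FALSE, mc.cores = 2)

# Rerunning with the same seeds reproduces the same draws.
out2 <- mclapply(seeds, draw_one, mc.set.seed = FALSE, mc.cores = 2)
identical(out, out2)
```

Seeding inside the function, rather than once in the parent session, keeps each task reproducible regardless of which core runs it.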

I also do not do this. What is the reason behind this?

It is only there for people who encounter ropensci/drake#675 and need a quick workaround.
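As a rough sketch of what that workaround looks like in practice, the snippet below passes lock_envir = FALSE to make(). The plan and its target x are hypothetical and exist only for illustration, the mclapply() call assumes a Unix-alike system, and make() writes a .drake cache in the working directory.

```r
library(drake)

# Hypothetical one-target plan whose command parallelizes internally.
plan <- drake_plan(
  x = parallel::mclapply(1:2, function(i) i^2, mc.cores = 2)
)

# lock_envir = FALSE switches off environment locking while targets build.
# This removes a reproducibility guardrail, so treat it as a workaround only.
make(plan, lock_envir = FALSE)

# Read the finished target back from the cache.
readd(x)
```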

pat-s · 2019-03-13T09:44:31Z

From the "Random numbers" section of the help file of parallel::mcparallel():

Uff, either I have always interpreted that wrong or something changed recently. I remember explicitly setting this in order to HAVE the same RNG state across all processes.
That's of course a bummer then. But according to the documentation, I was wrong. Hmm

wlandau · 2019-03-15T19:47:12Z

I just added new writing based on ropensci/drake#777 (comment).

pat-s · 2019-03-15T21:27:18Z

No worries. I would suggest making one part of the persistent worker section extra clear:

The number of workers chosen in prework applies to all future_*() functions in the whole plan, i.e. the user needs to choose the lowest value that works for all instances in the plan.

wlandau · 2019-03-17T14:49:47Z

Yeah, I think something like that is worth a mention. But for the defaults, I agree with @HenrikBengtsson in ropensci/drake#777 (comment). I think the first examples should rely on future::availableCores(). See 9cb1527.
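To illustrate the point about a single worker count, here is a minimal sketch using the future package directly. The cap of 2 workers is an arbitrary choice for the example, and my_plan in the comment is a hypothetical drake plan.

```r
library(future)

# Use what the machine offers, capped at 2 for this illustration.
n_workers <- min(2L, availableCores())

# One plan() call governs every future-based call made afterwards,
# so this single worker count is shared by all of them.
plan(multisession, workers = n_workers)

# Check how many workers the current plan provides.
w <- nbrOfWorkers()
w

# In drake, the same plan() call could be passed as prework, e.g.
# make(my_plan, prework = quote(future::plan(future::multisession, workers = 2)))

plan(sequential)  # restore the default plan
```

Because the worker count is set once, it has to suit the most demanding future_*() call as well as the least demanding one.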

pat-s mentioned this issue Mar 10, 2019

add within-target-par section #70

Closed

3 tasks

wlandau added the "new writing" and "editing" labels Mar 15, 2019

wlandau-lilly closed this as completed in 8816590 Mar 15, 2019
