Write about from_plan() + hpc
wlandau-lilly committed Jan 17, 2019
1 parent 590a26f commit 699e45a
Showing 2 changed files with 43 additions and 3 deletions.
40 changes: 40 additions & 0 deletions hpc.Rmd
@@ -263,6 +263,46 @@ It is possible to supply a custom job scheduler function to the `parallelism` ar

This feature is very advanced, and you should only attempt it in production if you really know what you are doing. Use at your own risk.

### Parallel computing *within* targets

In addition to parallelism *among* targets on a cluster, you may wish to invoke parallelism *within* each target locally on its compute node. One successful approach is to

1. Define a custom plan column with the number of local workers for each target, and
2. Use the `from_plan()` function to get the number of workers inside each command.

```{r fromplanworkers}
plan <- drake_plan(
  a = target(
    parallel::mclapply(1:8, sqrt, mc.cores = from_plan("cores")),
    cores = 4
  ),
  b = target(
    parallel::mclapply(1:4, sqrt, mc.cores = from_plan("cores")),
    cores = 2
  )
)
plan
```


```{r fromplannorun2, eval = FALSE}
make(plan)
```

If you change the `cores` column and then run `make()` a second time, the targets will stay up to date. This is appropriate because changes to `mc.cores` do not actually affect the output values of the targets.

```{r fromplanworkers3}
plan$cores <- 2
plan
```


```{r fromplannorun3, eval = FALSE}
make(plan)
```
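
The claim that `mc.cores` does not affect the output values can be checked outside of `drake`. The following is a minimal sketch using only base R and the `parallel` package. The variable names are illustrative, and the fallback to a single core on Windows is an assumption made here because `mclapply()` relies on forking.

```r
library(parallel)

# mclapply() forks the R process, so more than one core is not available on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Compute the same values serially and with different numbers of cores.
serial <- lapply(1:8, sqrt)
one_core <- mclapply(1:8, sqrt, mc.cores = 1L)
multi_core <- mclapply(1:8, sqrt, mc.cores = cores)

# The number of cores changes how the work is scheduled, not the results.
same_with_one_core <- identical(serial, one_core)
same_with_multi_core <- identical(serial, multi_core)
```

Both comparisons should agree with the serial result, which is why it is reasonable for the targets to stay up to date when only the `cores` column changes.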

### Hasty mode

The [`drake.hasty`](https://github.com/wlandau/drake.hasty) package is a bare-bones spin-off of `drake`. It sacrifices reproducibility to aggressively boost speed when scheduling and executing your targets. It is not recommended for most serious production use cases, but it can be useful for experimentation.
6 changes: 3 additions & 3 deletions plans.Rmd
@@ -615,15 +615,15 @@ Metaprogramming gets much simpler if you do not need to construct literal calls
Thanks to [Chris Hammill](https://github.com/cfhammill) for [presenting this scenario and contributing to the solution](https://github.com/ropensci/drake/issues/451).


## Optional columns in your plan.
## Optional custom columns in your plan.

Besides the usual columns `target` and `command`, there are other columns you can add.
Besides the usual columns `target` and `command`, you can add all kinds of custom columns to your plan. The following columns have special meanings for `make()`.

- `elapsed` and `cpu`: number of seconds to wait for the target to build before timing out (`elapsed` for elapsed time and `cpu` for CPU time).
- `priority`: for [parallel computing](#hpc), optionally rank the targets according to priority. That way, when two targets become ready to build at the same time, `drake` will pick the one with the dominant priority first.
- `resources`: target-specific lists of resources on a cluster. See the advanced options in the [parallel computing](#hpc) chapter for details.
- `retries`: number of times to retry building a target in the event of an error.
- `trigger`: choose the criterion that `drake` uses to decide whether to build the target. See `?trigger` or read the [trigger chapter](#triggers) to learn more.
- `worker`: for [parallel computing](#hpc), optionally name the preferred worker to assign to each target.
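
To make the distinction between special and custom columns concrete, here is a rough sketch that lays out a plan as a plain base R data frame. It only illustrates the column layout: the targets, commands, and values are hypothetical, and the text above does not confirm that `make()` accepts a hand-built data frame like this one.

```r
# Hypothetical plan written out as a plain data frame (illustration only).
plan <- data.frame(
  target = c("a", "b"),
  command = c("sqrt(16)", "sqrt(4)"),
  retries = c(2, 0),   # special column: retries after an error
  priority = c(1, 2),  # special column: used in parallel computing
  cores = c(4, 2),     # custom column: no special meaning for make()
  stringsAsFactors = FALSE
)

# Column names with special meanings, taken from the list above.
special <- c(
  "elapsed", "cpu", "priority", "resources", "retries", "trigger", "worker"
)

# Separate the special columns from the purely custom ones.
special_cols <- intersect(names(plan), special)
custom_cols <- setdiff(names(plan), c("target", "command", special))
```

Here `custom_cols` contains only `cores`, the kind of column that a command could read with `from_plan()`.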


```{r endofline_plans, echo = FALSE}