Feature/pkgmgmtdoc #231
Changes from 2 commits
@@ -4,7 +4,19 @@ The doAzureParallel package allows you to install packages to your pool in two w
- Installing on pool creation
- Installing per-*foreach* loop

Packages installed at the pool level benefit from only needing to be installed once per node. Each iteration of the foreach loop can load the libraries without installing them again. Packages installed in the foreach loop benefit from specifying any dependencies required only for that instance of the loop.

## Installing Packages on Pool Creation

Pool level packages support CRAN, GitHub and BioConductor packages. The packages are installed in a shared directory on the node. Packages installed at the cluster level must be loaded explicitly within the foreach loop. For example, if you installed xml2 on the cluster, you must explicitly load it before using it.
Review comment: typo: "explicityly" should be "explicitly"

```R
foreach (i = 1:4) %dopar% {
  # Load the libraries you want to use.
  library(xml2)

  xml2::as_list(...)
}
```

Review thread on `library(xml2)`:

- Is this true? I thought you can do .packages() and it'll automatically be called.
- @brnleehng - I thought that was only true for job-level packages, right? For cluster level packages I think we need to either explicitly load the library or do xml2::
- For the job level, we will try to reinstall and reference it with library. For the cluster level, we would need xml2::

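The difference raised in this discussion, between attaching a package with `library()` and calling a function through its namespace with `pkg::`, can be tried locally with base R alone. The sketch below is only an illustration: it uses the `tools` package, which ships with R but is not attached by default, as a stand-in for a pool-level package such as xml2, and it does not involve doAzureParallel or a pool.

```R
# Stand-in for a package installed at the pool level: `tools` ships with R
# but is normally not attached to the search path in a fresh session.

# A namespace-qualified call works without attaching the package.
ext_qualified <- tools::file_ext("report.csv")

# Attaching the package makes its exported functions available unqualified.
library(tools)
ext_attached <- file_ext("report.csv")

# After library(), the package appears on the search path.
now_attached <- "package:tools" %in% search()
print(c(qualified = ext_qualified, attached = ext_attached))
```

Either form reaches the same function; the qualified form only needs the package to be installed, while the unqualified form also needs it attached.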
You can install packages by listing them in your JSON pool configuration file. The listed packages are then installed when the pool is created.

```R
# (the diff hunk ends here; the rest of this code block is not shown)
```
@@ -1,69 +1,3 @@
# Using package management

doAzureParallel supports installing packages at either the cluster level or during the execution of the foreach loop. Packages installed at the cluster level benefit from only needing to be installed once per node. Each iteration of the foreach loop can load the libraries without installing them again. Packages installed in the foreach loop benefit from specifying any dependencies required only for that instance of the loop.

## Cluster level packages

Cluster level packages support CRAN, GitHub and BioConductor packages. The packages are installed in a shared directory on the node. Packages installed at the cluster level must be loaded explicitly within the foreach loop. For example, if you installed xml2 on the cluster, you must explicitly load it before using it.

```R
foreach (i = 1:4) %dopar% {
  # Load the libraries you want to use.
  library(xml2)
  xml2::as_list(...)
}
```

### CRAN

CRAN packages can be installed on the cluster by adding them to the collection of _cran_ packages in the cluster specification.

```json
"rPackages": {
  "cran": ["package1", "package2", "..."],
  "github": [],
  "bioconductor": []
}
```

### GitHub

GitHub packages can be installed on the cluster by adding them to the collection of _github_ packages in the cluster specification.

```json
"rPackages": {
  "cran": [],
  "github": ["repo1/name1", "repo1/name2", "repo2/name1", "..."],
  "bioconductor": []
}
```

**NOTE** When using packages from a private GitHub repository, you must add your GitHub authentication token to your credentials.json file.

### BioConductor

BioConductor packages can be installed via the cluster configuration. Add the packages you want installed to the cluster configuration file and they will be installed automatically.

```json
"rPackages": {
  "cran": [],
  "github": [],
  "bioconductor": ["IRanges", "GenomeInfoDb"]
}
```

**IMPORTANT** doAzureParallel uses the rocker/tidyverse Docker image by default, which comes with BioConductor pre-installed. If you use a different container image, make sure that BioConductor is installed on it.
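
As a rough illustration of the three package collections described above, the following base R sketch builds a list shaped like the `rPackages` section and checks it before use. The helper `check_r_packages` is hypothetical and not part of doAzureParallel, and the rule that GitHub entries look like `owner/repo` is an assumption drawn from the examples above.

```R
# Hypothetical helper (not part of doAzureParallel): sanity-check a list
# shaped like the "rPackages" section of the cluster configuration.
check_r_packages <- function(r_packages) {
  expected <- c("cran", "github", "bioconductor")
  missing_keys <- setdiff(expected, names(r_packages))
  if (length(missing_keys) > 0) {
    stop("missing collection(s): ", paste(missing_keys, collapse = ", "))
  }
  # Assumption: GitHub entries are written as "owner/repo".
  github <- as.character(r_packages$github)
  bad <- github[!grepl("^[^/]+/[^/]+$", github)]
  if (length(bad) > 0) {
    stop("GitHub packages must be given as owner/repo: ",
         paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

# A well-formed configuration passes silently.
r_packages <- list(
  cran = c("xml2"),
  github = c("repo1/name1"),
  bioconductor = c("IRanges", "GenomeInfoDb")
)
ok <- check_r_packages(r_packages)

# A GitHub entry without an owner is rejected; capture the error message.
bad_result <- tryCatch(
  check_r_packages(list(cran = character(0), github = "name1",
                        bioconductor = character(0))),
  error = function(e) conditionMessage(e)
)
print(bad_result)
```

A check like this catches a malformed entry locally, before a cluster is created with it.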

## Foreach level packages

Foreach level packages currently only support CRAN packages. Unlike cluster level packages, packages specified on the foreach loop are automatically installed _and loaded_ for use.

### CRAN

```R
foreach(i = 1:4, .packages = c("xml2")) %dopar% {
  # xml2 is automatically loaded and can be used without calling library(xml2)
  xml2::as_list(...)
}
```
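
Conceptually, "installed and loaded" amounts to an install-if-missing step followed by attaching the package. The base R sketch below shows that pattern locally. It illustrates the idea only and is not doAzureParallel's implementation; the helper name `ensure_loaded` and its `installer` argument are hypothetical.

```R
# Hypothetical helper: install a package only if it is missing, then attach it.
# `installer` is injectable so the sketch can run without network access.
ensure_loaded <- function(pkgs, installer = utils::install.packages) {
  for (pkg in pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      installer(pkg)
    }
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# `tools` ships with R, so the installer should not be called here.
installed_log <- character(0)
ensure_loaded("tools", installer = function(pkg) {
  installed_log <<- c(installed_log, pkg)
})

# The package is now on the search path.
loaded <- "package:tools" %in% search()
print(loaded)
```

Passing the installer as an argument keeps the sketch testable; in real use the default `utils::install.packages` would fetch a missing package from CRAN.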
Please see the documentation [(link)](../../docs/20-package-management.md) for more details on package management.
Review comment: Remove this. And just add a comment line in the sample
Review comment: "benefit from specifying any specific" is a weird sentence. Maybe "benefit from specifying any dependencies"
Reply: fixed.