[ci] parallelize R package installs in CI jobs #4198

Merged: StrikerRUS merged 5 commits into master from ci/parallel-install on Apr 20, 2021

Conversation

jameslamb
Collaborator

@jameslamb jameslamb commented Apr 19, 2021

This week, @jnolis taught me that the R function install.packages() can support parallel package installations! That function supports an argument Ncpus. If set to a value greater than 1, R will install multiple packages at the same time.

See https://stat.ethz.ch/R-manual/R-patched/library/utils/html/install.packages.html for more.

This PR proposes setting that argument to the value of parallel::detectCores() in this project's CI jobs and documentation of how to run them manually. I think this should reduce the time it takes R CI jobs to run, especially on Linux where CRAN does not prepare precompiled binaries. {parallel} comes installed in all standard installations of R, so this does not introduce a new dependency for LightGBM's CI jobs.
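As a rough sketch of what such a call could look like from a CI shell step: the package list and mirror below are copied from the `static_analysis` command quoted later in this thread, and `RUN_INSTALL` is a hypothetical switch added only so the sketch prints the command by default instead of installing anything. Note that R documents `Ncpus` as applying to parallel installs of source packages, and that `parallel::detectCores()` may return `NA` on platforms where the core count is unknown.

```shell
#!/bin/sh
# Package names and mirror are copied from the static_analysis command quoted in this thread.
PACKAGES="c('R6', 'data.table', 'jsonlite', 'roxygen2', 'testthat')"
REPO="https://cran.r-project.org"

# Ncpus = parallel::detectCores() asks R to use one install process per detected core.
R_EXPR="install.packages(${PACKAGES}, repos = '${REPO}', Ncpus = parallel::detectCores())"

# RUN_INSTALL is a hypothetical switch for this sketch: by default the command is only printed.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
    Rscript -e "${R_EXPR}"
else
    echo "Rscript -e \"${R_EXPR}\""
fi
```

Running the script with `RUN_INSTALL=1` would execute the install for real, assuming `Rscript` is on the `PATH`.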

Windows and Linux runners from GitHub Actions have a single 2-core CPU and macOS runners have a single 3-core CPU: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources.


Testing whether this actually makes builds faster

The tables below compare run times for the R jobs on the 3 most recent builds of master and 3 builds of this PR.

Linux

| job | master 1 | master 2 | master 3 | build 1 | build 2 | build 3 |
|---|---|---|---|---|---|---|
| ubuntu-latest, gcc, R 3.6, cmake | 11m12s | 11m14s | 10m28s | 10m43s | {08m32s} | 10m01s |
| ubuntu-latest, gcc, R 4, cmake | 11m35s | 10m31s | 12m15s | {08m56s} | 09m50s | 09m34s |
| ubuntu-latest, clang, R 3.6, cmake | 11m18s | 11m02s | 12m06s | 09m02s | {08m42s} | 09m00s |
| ubuntu-latest, clang, R 4, cmake | 09m50s | 10m22s | 09m59s | 09m05s | 08m53s | {07m54s} |
| ubuntu-latest, gcc, R 4, cran | 11m47s | 14m17s | 15m05s | {10m27s} | 12m12s | 10m46s |
| ubuntu-latest, R-devel, GCC ASAN/UBSAN | 21m25s | 24m51s | 20m55s | 21m35s | {19m46s} | 20m27s |
| debian, R-devel, clang | 09m37s | 10m10s | 10m13s | 08m20s | 08m44s | {07m24s} |

Mac

| job | master 1 | master 2 | master 3 | build 1 | build 2 | build 3 |
|---|---|---|---|---|---|---|
| macOS-latest, gcc, R 3.6, cmake | 08m26s | 08m45s | 08m40s | 12m39s | 10m46s | {08m17s} |
| macOS-latest, gcc, R 4, cmake | 08m24s | 09m01s | 15m55s | 11m38s | 08m18s | {08m12s} |
| macOS-latest, clang, R 3.6, cmake | 10m41s | 11m05s | 10m11s | 13m26s | {09m17s} | 09m22s |
| macOS-latest, clang, R 4, cmake | 10m20s | 09m07s | 10m07s | {08m57s} | 12m36s | 09m38s |
| macOS-latest, clang, R 4, cran | 11m10s | 10m07s | {09m49s} | 11m23s | 11m29s | 10m29s |

Windows

| job | master 1 | master 2 | master 3 | build 1 | build 2 | build 3 |
|---|---|---|---|---|---|---|
| windows-latest, MINGW, R 3.6, cmake | 14m10s | 12m33s | 12m53s | {12m21s} | 13m21s | 15m01s |
| windows-latest, MINGW, R 4, cmake | 16m01s | 14m54s | 14m55s | {13m47s} | 15m59s | 16m12s |
| windows-2016, MSVC, R 3.6, cmake | 08m09s | 07m57s | 08m34s | 08m09s | {07m32s} | 08m56s |
| windows-2019, MSVC, R 4, cmake | 08m34s | 07m46s | 07m39s | 07m53s | {07m36s} | 08m25s |
| windows-latest, MINGW, R 3.6, cran | 20m04s | {16m44s} | 17m28s | 17m23s | 17m38s | 17m24s |
| windows-latest, MINGW, R 4, cran | 17m13s | {17m05s} | 17m44s | 18m09s | 18m21s | 18m01s |

The fastest time for each job is {in brackets}. It's hard to get an accurate estimate of the timing for these things because so many factors impact the runtime of the CI jobs, including the response times and availability of multiple package managers.

But it does look roughly like what I'd expect... this change seems to offer a noticeable speedup on Linux (where R packages have to be installed from source) and only a small speedup (if any) on Windows and macOS (where CRAN provides binaries).
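To put a number on one row of the comparison, the sketch below averages the three master runs and the three PR runs for the `ubuntu-latest, gcc, R 4, cran` job, using the timings from the Linux table. The helper names `to_seconds` and `mean_seconds` are hypothetical, the mean is truncated to whole seconds, and three runs per side is a small sample, so the result is only a rough indication.

```shell
#!/bin/sh
# to_seconds: convert a "MMmSSs" duration such as 11m47s into whole seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ print $1 * 60 + $2 }'
}

# mean_seconds: integer mean (truncated) of one or more "MMmSSs" durations.
mean_seconds() {
    total=0
    for d in "$@"; do
        total=$((total + $(to_seconds "$d")))
    done
    echo $((total / $#))
}

# Timings for "ubuntu-latest, gcc, R 4, cran" copied from the Linux table.
master_mean=$(mean_seconds 11m47s 14m17s 15m05s)
pr_mean=$(mean_seconds 10m27s 12m12s 10m46s)

echo "master: ${master_mean}s, PR: ${pr_mean}s, difference: $((master_mean - pr_mean))s"
# prints "master: 823s, PR: 668s, difference: 155s"
```

The same two calls can be repeated with the timings from any other row to compare jobs across platforms.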

@jameslamb
Collaborator Author

jameslamb commented Apr 19, 2021

/gha run r-solaris

Workflow Solaris CRAN check has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/762168944

solaris-x86-patched: https://builder.r-hub.io/status/lightgbm_3.2.1.99.tar.gz-e9ee6191d08443fda80dbf2d2daa5138
solaris-x86-patched-ods: https://builder.r-hub.io/status/lightgbm_3.2.1.99.tar.gz-95650b2939554245bd8d0f8a15f8a791
Reports also have been sent to LightGBM public e-mail: http://www.yopmail.com/lightgbm_rhub_checks
Status: success ✔️.

@jameslamb
Collaborator Author

jameslamb commented Apr 19, 2021

/gha run r-valgrind

Workflow R valgrind tests has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/762169253

Status: success ✔️.

@jameslamb jameslamb marked this pull request as ready for review April 19, 2021 03:03
@StrikerRUS
Collaborator

I like this change!
What about static_analysis workflow?

Rscript -e "install.packages(c('R6', 'data.table', 'jsonlite', 'roxygen2', 'testthat'), repos = 'https://cran.r-project.org')"

@jameslamb
Collaborator Author

jameslamb commented Apr 19, 2021

> What about static_analysis workflow?

Ah yes, you're totally right, thank you! added in 7c6fd87
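A missed call site like this one can be caught with a simple text search. The following is a minimal sketch, not something this PR adds: `find_serial_installs` is a hypothetical helper, the two demo files are made up for illustration, and the search only handles calls where `install.packages(` and `Ncpus` appear on the same line.

```shell
#!/bin/sh
# find_serial_installs: list lines under a directory that call install.packages()
# without mentioning Ncpus on the same line (multi-line calls are not handled).
find_serial_installs() {
    grep -rnF "install.packages(" "$1" | grep -vF "Ncpus"
}

# Demo on a throwaway directory with two hypothetical CI files.
demo_dir=$(mktemp -d)
trap 'rm -rf "${demo_dir}"' EXIT

printf '%s\n' "Rscript -e \"install.packages('R6', repos = 'https://cran.r-project.org')\"" \
    > "${demo_dir}/static_analysis.yml"
printf '%s\n' "Rscript -e \"install.packages('R6', repos = 'https://cran.r-project.org', Ncpus = parallel::detectCores())\"" \
    > "${demo_dir}/test_r_package.sh"

# Only the file whose call lacks Ncpus should be reported.
find_serial_installs "${demo_dir}"
```

Pointing the helper at a repository's CI directories instead of the demo directory would list any remaining serial installs, under the same single-line assumption.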

@StrikerRUS StrikerRUS merged commit 72d7010 into master Apr 20, 2021
@StrikerRUS StrikerRUS deleted the ci/parallel-install branch April 20, 2021 13:00
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023