[R-package] {lightgbm} might be incompatible with some R 3.5.x versions #4813

Closed
Tracked by #5153
jameslamb opened this issue Nov 18, 2021 · 7 comments

Comments

@jameslamb
Collaborator

Description

See .

Today, the R package's DESCRIPTION says it supports R 3.5.0 and newer.

R (>= 3.5),

According to #4812 (comment), recent changes in the R package might have made it incompatible with some R 3.5.x versions.

How to close this issue

  1. Investigate which R 3.5.x versions the package is actually compatible with
  2. Propose a pull request raising the floor to the oldest compatible version

Notes

Thanks @david-cortes for reporting this!

@jameslamb
Collaborator Author

I'm investigating this tonight, will share a reproducible example soon.

So far I've found that on Linux (the only OS I've tested on), the CRAN package compiles successfully with every release of R 3.5.x, but with 20+ test failures, mostly related to numerical precision issues.

Investigated the claim from #4812 (comment) that R_FlushConsole was introduced somewhere in the R 3.5.x release series...I don't believe that is true.

Looks like R_FlushConsole has been in R.h for 16 years:

https://github.com/wch/r-source/blame/fe08ebe026a46d40015769f9512c1c347077ae6d/src/include/R.h#L93

R 3.5.0 was released in April 2018: https://cran.r-project.org/bin/windows/base/old/.
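
The same point could also be checked against an installed copy of R rather than the source history, by searching the installed header for the symbol. This is only a sketch and was not part of the investigation above; `header_declares` is a hypothetical helper name, and the header directory is assumed to be whatever `R.home("include")` reports on the machine being tested.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the header file mentions the symbol
# as a whole word (for example in a function declaration).
header_declares() {
    symbol="$1"
    header="$2"
    grep -qw -- "$symbol" "$header"
}

# Usage on a machine or container with R installed (not executed here):
#   include_dir="$(Rscript --vanilla -e 'cat(R.home("include"))')"
#   header_declares R_FlushConsole "${include_dir}/R.h" && echo "declared in R.h"
```

Running this inside each R 3.5.x container would show whether the header shipped with that release declares the function.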

@jameslamb
Collaborator Author

Using the rocker/verse container images, I tested {lightgbm}'s compatibility with a few versions of R 3.5.x tonight.

testing script
# build CRAN package
rm -f lightgbm*.tar.gz
sh build-cran-package.sh --no-build-vignettes

# gain a shell in a container
R_VERSION=3.5.3
docker run \
    --rm \
    -v "$(pwd)":/opt/LightGBM \
    --workdir /opt/LightGBM \
    --env MAKE="make -j2" \
    -it \
    rocker/verse:${R_VERSION} \
        /bin/bash

# (inside the container) update to the latest versions of all packages
Rscript \
    --vanilla \
    -e "install.packages(c('R6', 'data.table', 'jsonlite', 'Matrix', 'testthat'), repos = 'https://cran.r-project.org')"

# install LightGBM from source
R --vanilla CMD INSTALL \
    --with-keep.source \
    ./lightgbm_3.3.1.99.tar.gz

# run the tests
cd R-package/tests
Rscript --vanilla testthat.R

On both R 3.5.0 and R 3.5.3 (the first and last releases in that series), I found that:

  • {lightgbm} compiles successfully
  • a small handful of tests fail, all with numerical precision differences.
── 1. Failure (test_basic.R:738:3): lgb.train() works as expected with sparse features ────────────────
── 2. Failure (test_basic.R:1259:7): lgb.train() works when a list of strings or a character vector is
── 3. Failure (test_basic.R:1262:7): lgb.train() works when a list of strings or a character vector is
── 4. Failure (test_basic.R:1262:7): lgb.train() works when a list of strings or a character vector is
── 5. Failure (test_basic.R:1259:7): lgb.train() works when a list of strings or a character vector is
── 6. Failure (test_basic.R:1262:7): lgb.train() works when a list of strings or a character vector is
── 7. Failure (test_basic.R:1262:7): lgb.train() works when a list of strings or a character vector is
── 8. Failure (test_basic.R:1294:3): lgb.train() works when you specify both 'metric' and 'eval' with s
── 9. Failure (test_basic.R:1295:3): lgb.train() works when you specify both 'metric' and 'eval' with s
── 10. Failure (test_basic.R:1737:3): lgb.cv() works when you specify both 'metric' and 'eval' with str
── 11. Failure (test_basic.R:1738:3): lgb.cv() works when you specify both 'metric' and 'eval' with str
── 12. Failure (test_learning_to_rank.R:105:5): learning-to-rank with lgb.cv() works as expected ──────
── 13. Failure (test_learning_to_rank.R:128:5): learning-to-rank with lgb.cv() works as expected ──────
── 14. Failure (test_learning_to_rank.R:134:5): learning-to-rank with lgb.cv() works as expected ──────
── 15. Failure (test_learning_to_rank.R:140:5): learning-to-rank with lgb.cv() works as expected ──────

I haven't looked yet into the specifics of those tests. Stopping this investigation here for now, will come back to it after getting some other PRs up for other things.
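
The manual steps in the testing script above could be wrapped so that several R versions are checked without an interactive shell. The following is a sketch under assumptions: `container_steps` and `test_r_version` are hypothetical helper names, the `DRY_RUN` variable is an invented convenience for previewing, and the CRAN tarball is assumed to have been built already in the current directory.

```shell
#!/bin/sh
# Print the commands to run inside each container
# (the same steps as the manual testing script).
container_steps() {
    cat <<'EOF'
set -e
Rscript --vanilla -e "install.packages(c('R6', 'data.table', 'jsonlite', 'Matrix', 'testthat'), repos = 'https://cran.r-project.org')"
R --vanilla CMD INSTALL --with-keep.source ./lightgbm_*.tar.gz
cd R-package/tests
Rscript --vanilla testthat.R
EOF
}

# Hypothetical helper: run the steps in the rocker/verse image for one R version.
# With DRY_RUN=1 it only reports which image would be used.
test_r_version() {
    r_version="$1"
    image="rocker/verse:${r_version}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run tests in ${image}"
        return 0
    fi
    # No -it here: the container runs the steps and exits.
    docker run \
        --rm \
        -v "$(pwd)":/opt/LightGBM \
        --workdir /opt/LightGBM \
        --env MAKE="make -j2" \
        "${image}" \
        /bin/bash -c "$(container_steps)"
}

# Usage (not executed here):
#   for v in 3.5.0 3.5.3; do
#       test_r_version "$v" || echo "R $v: install or tests failed"
#   done
```

Setting `DRY_RUN=1` is a way to confirm the image tags before pulling anything.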

@jameslamb mentioned this issue Apr 14, 2022

@jameslamb
Collaborator Author

I checked again today with the latest state of master (44fe591). {lightgbm} compiles successfully with R 3.5.0 and R 3.5.3 (the first and last releases in the 3.5.x series).

I strongly suspect that the handful of failing tests fail because of the changes to random number generation and random sampling between R 3.5.x and R 3.6.0. See https://blog.revolutionanalytics.com/2019/05/whats-new-in-r-360.html

Changes to random number generation.
R 3.6.0 changes the method used to generate random integers in the sample function... means that scripts using the sample function will generate different results in R 3.6.0 than they did in prior versions of R.

It seems to me that all of the tests that are failing involve the use of sample()...either in test data generation or through the use of sample() in lgb.cv().
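
One way this hypothesis could be probed (it was not tried in this thread) is to run the same code on a newer R with the pre-3.6.0 sampling algorithm selected. From R 3.6.0 on, `RNGkind()` accepts `sample.kind = "Rounding"` for that purpose; older versions do not have the argument. The sketch below only builds the R expression to pass to `Rscript`; `version_at_least` and `legacy_sampling_expr` are hypothetical helper names, and `sort -V` is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version "$1" is at least version "$2".
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print an R expression that selects the pre-3.6.0 sample() algorithm.
# R versions before 3.6.0 already use it and do not accept sample.kind,
# so they get a no-op expression instead.
legacy_sampling_expr() {
    if version_at_least "$1" "3.6.0"; then
        echo 'suppressWarnings(RNGkind(sample.kind = "Rounding"))'
    else
        echo 'invisible(NULL)'
    fi
}

# Usage on a machine with R installed (not executed here; the seed is arbitrary):
#   Rscript --vanilla \
#       -e "$(legacy_sampling_expr 4.1.0)" \
#       -e 'set.seed(708L); print(sample(10L, 3L))'
```

If the sampled values then match those drawn on R 3.5.x for the same seed, that would support the explanation that the failures come from `sample()` rather than from {lightgbm} itself.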

Other tests which check for exact results and which do not use randomly-generated data or {lightgbm} code that calls lgb.cv() are passing.

For example:

test_that("lgb.Booster.upper_bound() and lgb.Booster.lower_bound() work as expected for binary classification", {

test_that("lgb.Booster.upper_bound() and lgb.Booster.lower_bound() work as expected for regression", {

test_that("lightgbm() performs evaluation on validation sets if they are provided", {


Given this... I think we should close this issue, and leave the version floor in the R package set to R (>= 3.5.0).

Even though some of LightGBM's tests fail, it can be compiled successfully and most tests are passing. I think we should avoid putting an artificial restriction on users who may be on older R versions, and that we shouldn't spend time trying to get those tests working on R 3.5.x.

@jmoralez or @StrikerRUS what do you think?

@jmoralez
Collaborator

jmoralez commented Jul 14, 2022

I agree on closing this and keeping the floor at 3.5.0. From #4812 (comment), I think the incompatibility would've caused compilation to fail because of the missing function in the header, so if the package compiles successfully, I think the tests are failing only because of the different scores.

@jameslamb
Collaborator Author

Alright, thanks @jmoralez, I'm gonna close this.

@StrikerRUS
Collaborator

It seems to me that all of the tests that are failing involve the use of sample()...either in test data generation or through the use of sample() in lgb.cv().

Nice observation!

I think we should avoid putting an artificial restriction on users who may be on older R versions, and that we shouldn't spend time trying to get those tests working on R 3.5.x.

I agree with both points.

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023