Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R] Fix a crash that occurs with noLD R #6378

Merged
merged 1 commit into from
Nov 12, 2020
Merged

[R] Fix a crash that occurs with noLD R #6378

merged 1 commit into from
Nov 12, 2020

Conversation

hcho3
Copy link
Collaborator

@hcho3 hcho3 commented Nov 12, 2020

Closes #5935. Unfortunately, no unit test was added, because it's quite difficult to setup an environment with noLD R (it involves building R from the source with a custom flag).

@vnijs @jameslamb Can you review?

@vnijs
Copy link

vnijs commented Nov 12, 2020

I'm getting errors compiling in the NoLD docker container. Is it possible to test this using something like the below?

install.packages("drat", repos="https://cran.rstudio.com")
drat:::addRepo("dmlc")
install.packages("xgboost", repos="http://dmlc.ml/drat/", type = "source")

@hcho3
Copy link
Collaborator Author

hcho3 commented Nov 12, 2020

No, dmlc.ml does not work any more. You can follow the instructions from the official doc:

git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
mkdir build
cd build
cmake .. -DR_LIB=ON
make -j$(nproc)
make install

I'm getting errors compiling in the NoLD docker container.

Can you post the error?

@trivialfis
Copy link
Member

Can you try to change a != b to a - b != 0 and see what happens?

@codecov-io
Copy link

Codecov Report

Merging #6378 (6ffa9a8) into master (d711d64) will increase coverage by 0.08%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6378      +/-   ##
==========================================
+ Coverage   81.27%   81.36%   +0.08%     
==========================================
  Files          12       12              
  Lines        3450     3450              
==========================================
+ Hits         2804     2807       +3     
+ Misses        646      643       -3     
Impacted Files Coverage Δ
python-package/xgboost/tracker.py 95.18% <0.00%> (+1.20%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d711d64...6ffa9a8. Read the comment docs.

@hcho3
Copy link
Collaborator Author

hcho3 commented Nov 12, 2020

Can you try to change a != b to a - b != 0 and see what happens?

There is no difference. The example fails without this patch, and with this patch the example works.

@trivialfis
Copy link
Member

I mean change the patch to best_score - attr_best_score != 0 instead of using an absolute tolerance.

@hcho3
Copy link
Collaborator Author

hcho3 commented Nov 12, 2020

@trivialfis Without the absolute tolerance condition, the example script will fail with

Error in finalizer(env) : 
  Inconsistent 'best_score' values between the closure state: 0.829749 and the xgb.attr: 0.829749

The difference between best_score and attr_best_score is only 1e-16. This difference is introduced because R was compiled with long doubles disabled. I added the tolerance check to allow for a slight error due to floating-point arithmetic.

@vnijs
Copy link

vnijs commented Nov 12, 2020

@hcho3 I was able to build the fix_nold branch using your instructions. I can confirm that this works in the NoLD environment for R!

That said, I would still recommend adding a test for CRAN in case a similar issue pops up in the future. That way it will be easier to track down where this issue started :) Thanks!

@hcho3 hcho3 merged commit c564518 into dmlc:master Nov 12, 2020
@hcho3 hcho3 deleted the fix_nold branch November 12, 2020 05:09
@hcho3
Copy link
Collaborator Author

hcho3 commented Nov 12, 2020

@vnijs Thanks for reviewing.

I would still recommend adding a test for CRAN in case a similar issue pops up in the future

I'm afraid we won't act on this recommendation. Testing with noLD environment requires building R from the source, and that's not something we want to spend our testing budget.

@jameslamb
Copy link
Contributor

Testing with noLD environment requires building R from the source, and that's not something we want to spend our testing budget.

Sorry I'm late to the party! I had never tested in a noLD environment before, so I learned a lot from this. Thanks for including me!

I've never use
d it before, but just noting that there is a noLD docker image from rhub: https://hub.docker.com/r/rhub/debian-gcc-devel-nold/tags. If you wanted, you could set up a job that runs the R build and tests in it only when triggered by a comment, so it doesn't have to be on every PR. For example, you could do that only on releases.

That's what we do for valgrind in LightGBM: https://github.com/microsoft/LightGBM/blob/master/.github/workflows/r_valgrind.yml#L10

@vnijs
Copy link

vnijs commented Nov 12, 2020

@jameslamb I did try rhubs images and tools for NoLD but could never get them to work for some reason. So I eventually "rolled-my-own", more or less.

@hcho3 I don't think the test needs to be part of xgboost development really as I can see that might be a hassle. Just that there is a (testthat) test that will be checked by CRAN in a NoLD environment when a new release is available. This could be as simple as the reproducible example I provided.

@hcho3
Copy link
Collaborator Author

hcho3 commented Nov 12, 2020

@jameslamb Thanks for the workflow template. I didn't realize that you can trigger a job with a pull request comment. Interesting! This is going to be super useful.

@jameslamb
Copy link
Contributor

@jameslamb Thanks for the workflow template. I didn't realize that you can trigger a job with a pull request comment. Interesting! This is going to be super useful.

@StrikerRUS taught me that 😀

@hcho3 hcho3 mentioned this pull request Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NoLD issue in xgb.train and/or xgb.attr
5 participants