Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R] Fix nthread in DMatrix constructor. #7127

Merged
merged 4 commits into from
Aug 3, 2021

Conversation

trivialfis
Copy link
Member

Close #7098 .

Restricting the construction of DMatrix to be single threaded.

@hcho3
Copy link
Collaborator

hcho3 commented Jul 24, 2021

Is it possible to make XGDMatrixCreateFromMat_R aware of nthread setting?

@trivialfis
Copy link
Member Author

That's either a new API or a breaking change.

@hcho3
Copy link
Collaborator

hcho3 commented Jul 24, 2021

@trivialfis I don't think changing the signature for XGDMatrixCreateFromMat_R would break any user-facing functionality. We'd just need to modify the content of xgb.DMatrix function in xgb.DMatrix.R. What do you think?

@trivialfis
Copy link
Member Author

I was told that the C function in R is also considered as part of the interface when working on #6819 . That's why the old predict C function for R is still here.

@codecov-commenter
Copy link

codecov-commenter commented Jul 24, 2021

Codecov Report

Merging #7127 (e5b2e5b) into master (caa9e52) will decrease coverage by 0.51%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7127      +/-   ##
==========================================
- Coverage   82.71%   82.19%   -0.52%     
==========================================
  Files          13       13              
  Lines        3963     3999      +36     
==========================================
+ Hits         3278     3287       +9     
- Misses        685      712      +27     
Impacted Files Coverage Δ
python-package/xgboost/tracker.py 86.33% <0.00%> (-8.77%) ⬇️
python-package/xgboost/core.py 83.93% <0.00%> (ø)
python-package/xgboost/dask.py 82.33% <0.00%> (+0.02%) ⬆️
python-package/xgboost/sklearn.py 89.77% <0.00%> (+0.20%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update caa9e52...e5b2e5b. Read the comment docs.

@@ -103,7 +103,7 @@ XGB_DLL SEXP XGDMatrixCreateFromMat_R(SEXP mat, SEXP missing) {
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The for loop in lines 96-102 is still using multiple threads. We must revise the loop also.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually using the default.

XGDMatrixCreateFromMat_omp_R, that takes in the extra parameter nthread

Maybe a new JSON based API.

Copy link
Collaborator

@hcho3 hcho3 Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The default" is set by OMP_NUM_THREADS, not nthread.

@hcho3
Copy link
Collaborator

hcho3 commented Jul 26, 2021

Also, let's create a new API function, XGDMatrixCreateFromMat_omp_R, that takes in the extra parameter nthread, so that R users do not lose out on the ability to use multiple threads when loading from a matrix. The old function XGDMatrixCreateFromMat_R should just call XGDMatrixCreateFromMat_omp_R with nthread=1.

@hcho3
Copy link
Collaborator

hcho3 commented Jul 27, 2021

Let's just change the signature for XGDMatrixCreateFromMat_R, since this function is only used by the R package.

@trivialfis trivialfis merged commit 8ee1274 into dmlc:master Aug 3, 2021
@trivialfis trivialfis deleted the fix-r-threads branch August 3, 2021 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nthread inside xgboost() function not working in R
4 participants