Split up LearnerImpl. #5350
ping @hcho3. No functionality change, only refactoring. |
Codecov Report
@@ Coverage Diff @@
## master #5350 +/- ##
=======================================
Coverage 83.75% 83.75%
=======================================
Files 11 11
Lines 2413 2413
=======================================
Hits 2021 2021
Misses 392 392
|
Consider composition over inheritance: https://en.wikipedia.org/wiki/Composition_over_inheritance
I think it makes much more sense in this case.
@RAMitchell I think direct inheritance is more suitable here. Following is my reasoning:
|
@trivialfis Do you want me to review this? |
6439621 to 143593b
@hcho3 Yes please. Another version is #5404. This one uses inheritance while the other one uses composition. @RAMitchell and I haven't decided which one is better. |
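For readers comparing the two approaches, here is a minimal sketch of the difference in C++. All class and method names (`Config`, `InheritingLearner`, `ComposedLearner`) are hypothetical and chosen for illustration; they are not taken from either PR.

```cpp
#include <map>
#include <string>

// Hypothetical names throughout; neither PR's real code is reproduced here.
class Config {
 public:
  void Set(const std::string& key, const std::string& value) {
    params_[key] = value;
  }
  bool Has(const std::string& key) const { return params_.count(key) != 0; }

 private:
  std::map<std::string, std::string> params_;
};

// Inheritance: the learner is-a Config and exposes its whole public interface.
class InheritingLearner : public Config {
 public:
  void Train() { trained_ = true; }  // placeholder for real training
  bool Trained() const { return trained_; }

 private:
  bool trained_{false};
};

// Composition: the learner has-a Config and forwards only what it chooses.
class ComposedLearner {
 public:
  void SetParam(const std::string& key, const std::string& value) {
    config_.Set(key, value);
  }
  bool HasParam(const std::string& key) const { return config_.Has(key); }
  void Train() { trained_ = true; }  // placeholder for real training
  bool Trained() const { return trained_; }

 private:
  Config config_;
  bool trained_{false};
};
```

With inheritance, callers reach `Set` and `Has` directly on the learner; with composition, the learner decides which parts of the configuration interface to forward, at the cost of writing the forwarding methods.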
@hcho3 I have been seeing a non-deterministic parser error on CI, and it is becoming quite frequent. In https://xgboost-ci.net/blue/organizations/jenkins/xgboost/detail/PR-5350/2/pipeline it reports a weird symbol as the delimiter; another time I saw this issue on CI it said the string
|
@trivialfis I have no idea. We can try running tests with thread sanitizer enabled and see if there’s any thread safety issue. Right now, our CI does not run any tests with thread sanitizer. |
@hcho3 tsan is not really useful in the presence of OpenMP. Multiple threads writing to the same |
Codecov Report
@@ Coverage Diff @@
## master #5350 +/- ##
=======================================
Coverage 84.07% 84.07%
=======================================
Files 11 11
Lines 2411 2411
=======================================
Hits 2027 2027
Misses 384 384
|
This PR splits up LearnerImpl into three components: basic configuration, model IO, and the actual learner that performs training and prediction. This is an attempt to modularize it so we can shrink it down in the future. For instance, some configurations are no longer needed in the current state of XGBoost; the objective function configuration, for one, should be simplified. Binary model IO is really difficult to work with and should be reduced to the absolute minimum. Splitting the monolithic LearnerImpl lets us look into each group of features more easily.
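The layering described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names (`ConfigLayer`, `IOLayer`, `ToyLearner`), the key=value text format, and the iteration counter are all hypothetical and do not reflect the PR's actual classes or XGBoost's model format.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical names throughout; this is not XGBoost's actual code.

// Layer 1: basic configuration (a plain key/value parameter store).
class ConfigLayer {
 public:
  virtual ~ConfigLayer() = default;
  void SetParam(const std::string& key, const std::string& value) {
    params_[key] = value;
  }
  // Returns an empty string when the key is absent.
  std::string GetParam(const std::string& key) const {
    auto it = params_.find(key);
    return it == params_.end() ? std::string{} : it->second;
  }

 protected:
  std::map<std::string, std::string> params_;
};

// Layer 2: model IO, built on top of configuration.
// The "key=value" line format is an assumption made for this sketch.
class IOLayer : public ConfigLayer {
 public:
  std::string Save() const {
    std::ostringstream os;
    for (const auto& kv : params_) {
      os << kv.first << '=' << kv.second << '\n';
    }
    return os.str();
  }
  void Load(const std::string& text) {
    std::istringstream is(text);
    std::string line;
    while (std::getline(is, line)) {
      auto pos = line.find('=');
      if (pos != std::string::npos) {
        params_[line.substr(0, pos)] = line.substr(pos + 1);
      }
    }
  }
};

// Layer 3: the learner that would perform training and prediction.
class ToyLearner : public IOLayer {
 public:
  // Placeholder for one boosting round; it only counts iterations.
  void UpdateOneIter() { ++iterations_; }
  int Iterations() const { return iterations_; }

 private:
  int iterations_{0};
};
```

Each layer depends only on the one below it, so a group of features such as IO can be read and reduced without touching the training code.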