-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[XGBoost4J-Spark] bestIteration and bestScore for early stopping #7095
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I'm not familiar with the jvm package, just providing suggestions. Also is there a way we can have an unittest?
cc @wbo4958
@@ -242,11 +242,15 @@ public static Booster trainAndSaveCheckpoint( | |||
if (score > bestScore) { | |||
bestScore = score; | |||
bestIteration = iter; | |||
booster.setAttr("bestIteration", String.valueOf(bestIteration)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to use best_iteration
to align with other language bindings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the best iteration and best score are decided by all other language bindings? @trivialfis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we'd better to keep the same naming for all language bindings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trivialfis - Using camel case is the norm in Java so I went ahead with bestIteration and bestScore. This is in line with other naming convention used in Booster.java & XGBoost.java code files. Let me know if you still would like to change it to best_iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the best iteration and best score are decided by all other language bindings? @trivialfis
Yes.
@trivialfis - Using camel case is the norm in Java so I went ahead with bestIteration and bestScore
For java attributes that make sense. But for the string stored inside XGBoost, please use snake case.
booster.setAttr("best_iteration", String.valueOf(bestIteration));
Notice the difference between string and java symbol. This way the model is more portable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Updated with snake case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi naveenkb, Could we add a unit test for this PR and Better update the doc for this, @trivialfis is there any doc to describe the best iteration/score things?
@@ -643,7 +643,7 @@ public void testSetAndGetAttrs() throws XGBoostError { | |||
}}); | |||
|
|||
Map<String, String> attr = booster.getAttrs(); | |||
TestCase.assertEquals(attr.size(), 4); | |||
TestCase.assertEquals(attr.size(), 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some comments why the attr.size()
is equal to 6?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wbo4958 maybe because 2 attributes got added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know the extra 2 attrs, but others may be so confused. Better to add some comments here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wbo4958 Yes as attr.size()
is 6 because 2 additional attributes are added. The test case is checking setter and getter method and also total number of attributes in the booster object.
I will update the doc to explain best iteration/score.
Any suggestion on adding unit test for these attribute ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking if the booster is containing best_iteration and best_score?
@@ -242,11 +242,15 @@ public static Booster trainAndSaveCheckpoint( | |||
if (score > bestScore) { | |||
bestScore = score; | |||
bestIteration = iter; | |||
booster.setAttr("bestIteration", String.valueOf(bestIteration)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the best iteration and best score are decided by all other language bindings? @trivialfis
Please ignore the failure for now. Something on our CI updated we are still trying to figure out what's broken. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, please rebase onto the master branch to get the CI passed. Thanks for the feature!
Codecov Report
@@ Coverage Diff @@
## master #7095 +/- ##
=======================================
Coverage 81.60% 81.60%
=======================================
Files 13 13
Lines 3903 3903
=======================================
Hits 3185 3185
Misses 718 718 Continue to review full report at Codecov.
|
As per discussions in #6893, adding bestIteration and bestScore attributes.
sample code snippet to access the attributes