[python-package] Early Stopping does not work as expected #5354
Comments
Hi @ZhiZhongWan, thanks for raising this and for the excellent example. I confirm this isn't working as expected. The problem is that even though the documentation says the training set is ignored, it actually isn't. The scores are saved here: LightGBM/python-package/lightgbm/callback.py Lines 334 to 337 in 44fe591
And the training set is always the first one in the validation sets: LightGBM/python-package/lightgbm/engine.py Lines 247 to 250 in 44fe591
So in the final-iteration check, the training set is checked first: LightGBM/python-package/lightgbm/callback.py Lines 342 to 344 in 44fe591
This means that if the training score improved on the last iteration, it will be saved as the best one, even if the validation score didn't. I'll try to come up with a fix for this. One possible fix I see is actually ignoring the training set by removing this line: LightGBM/python-package/lightgbm/callback.py Line 344 in 44fe591
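The interaction described above can be illustrated with a small, self-contained sketch. This is not LightGBM's actual callback code; the helper function and all scores are made up for illustration. It mimics a callback that tracks a best iteration per evaluation split, checks the splits in order (training split first), and, if training finishes without triggering early stopping, reports the best iteration of the first split in the list:

```python
def early_stopping_best_iter(eval_sets, stopping_rounds):
    """Mimic the best-iteration tracking described above.

    eval_sets: dict mapping split name -> list of per-iteration scores
    (higher is better). The first split plays the role of the training set.
    """
    names = list(eval_sets)
    n_iters = len(eval_sets[names[0]])
    best_score = {n: float("-inf") for n in names}
    best_iter = {n: -1 for n in names}
    for it in range(n_iters):
        for n in names:
            if eval_sets[n][it] > best_score[n]:
                best_score[n] = eval_sets[n][it]
                best_iter[n] = it
            elif it - best_iter[n] >= stopping_rounds:
                # Early stopping triggered: the stopping split's best
                # iteration is reported, which is the expected behavior.
                return best_iter[n]
    # Training ended without early stopping: the final-iteration check
    # reports the FIRST split's best iteration -- the training split,
    # whose score typically keeps improving until the last iteration.
    return best_iter[names[0]]


scores = {
    "train": [0.70, 0.80, 0.90, 0.95],  # keeps improving
    "val":   [0.60, 0.75, 0.72, 0.70],  # actually peaks at iteration 1
}
print(early_stopping_best_iter(scores, stopping_rounds=10))  # -> 3 (wrong)
print(early_stopping_best_iter(scores, stopping_rounds=2))   # -> 1 (correct)
```

With a large `stopping_rounds`, early stopping never fires, the final-iteration check runs, and the training split's last iteration is reported as the best, even though `val` peaked earlier. With a small `stopping_rounds`, early stopping fires on `val` and the correct iteration is reported, which matches the "sometimes right, sometimes wrong" behavior in the report.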
Maybe this old discussion can help: #2371 (comment).
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Description
I found that when early stopping is enabled and there are multiple validation sets in valid_sets, LightGBM might not save the best model as expected.
Reproducible example
output:
As you can see, I expect LightGBM to save the model with the best AUC on val, because val is the last one in the valid_sets list. However, in this example it sometimes returns the model with the best performance on valid_sets[0], and sometimes the one best on valid_sets[-1].
It seems that something goes wrong when early stopping is not triggered.
I'm very confused about this. I fixed all random seeds so you can easily reproduce it.
Environment info
LightGBM version or commit hash:
'3.3.2'
Command(s) you used to install LightGBM
Additional Comments