[python] Re-enable scikit-learn 0.22+ support #2946
Conversation
Thanks for the update! Really excited that we now have approval that check_no_attributes_set_in_init can be safely skipped.
Thanks @StrikerRUS !
To be more specific, what was approved is setting private attributes in `__init__`; it would be good to double check that this is verified. But that applies to any scikit-learn version, not just scikit-learn 0.22+, so this PR is still necessary in any case.
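For context, a minimal sketch (not LightGBM code; the estimator and attribute names are made up) of the pattern that check_no_attributes_set_in_init objects to, namely `__init__` setting an attribute that does not correspond to a constructor parameter:

```python
# Hypothetical estimator illustrating what check_no_attributes_set_in_init flags:
# any attribute set in __init__ that is not a constructor parameter, including
# private ones such as _cache below.
from sklearn.base import BaseEstimator


class SketchEstimator(BaseEstimator):
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # fine: mirrors the constructor parameter
        self._cache = None  # flagged by the check, even though it is private
```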
I like the comment you're referring to! It makes a lot of sense! Unfortunately, it seems that the new scikit-learn version brings more tests we are not passing.
If you wish, I can pick up this PR and continue re-enabling support for new versions. I understand that you have more important things to do than fixing third-party libraries 🙂 . Approval of skipping the test is more than enough! We really appreciate the support!
Thank you, that would be great @StrikerRUS ! BTW, I opened a minor maintenance issue that might make using these tests a bit easier in the future: #2947
If you rebase to […] Sorry for the inconvenience!
In my opinion, there's a big issue with the current scikit-learn wrapper. In general, most libraries in the scikit-learn ecosystem (and sklearn itself) expect a fit() method where you only pass X and y (and maybe sample_weight). In the current wrapper, fit() also takes other params such as early_stopping_round. I think we should move as many parameters as possible from the fit method to the estimator constructor. For example, CatBoost (https://catboost.ai/docs/concepts/python-reference_catboost.html) allows setting most parameters through the constructor (you can set early_stopping_round either when you create the estimator object or in the fit method). Conversely, if I want to create a StackingClassifier (https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.StackingClassifier.html), it's not clear how to pass the additional LightGBM fit() parameters through the StackingClassifier wrapper. In the past I created a custom LightGBM compatibility layer where I passed parameters to the constructor (inside a fit_params dictionary) that were used when calling LightGBM's fit() method. I think we should definitely move as many parameters as possible out of the fit() method to improve compatibility with the sklearn ecosystem.
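A minimal sketch of the compatibility problem described above, assuming the lightgbm scikit-learn wrapper of that era (where early_stopping_rounds was still a fit() argument) and scikit-learn 0.22's StackingClassifier; the data and parameter values are made up:

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=42)

# Standalone usage: early stopping is configured through fit(), not the constructor.
clf = LGBMClassifier(n_estimators=1000)
clf.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], early_stopping_rounds=10)

# Inside a scikit-learn meta-estimator there is no way to forward those fit()
# arguments to the base estimator: StackingClassifier.fit() only accepts X, y
# and sample_weight, so early stopping cannot be configured here.
stack = StackingClassifier(
    estimators=[("lgbm", LGBMClassifier(n_estimators=100))],
    final_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)
```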
+1, but that's independent of the scikit-learn version, so it would be good to merge this PR in any case :)
As discussed in #2628 (comment), this re-enables scikit-learn 0.22+ support.
Skips the check_no_attributes_set_in_init common check in check_estimator. Hopefully it will be fixed before 0.23: scikit-learn/scikit-learn#16241.
Reverts #2637.
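For reference, a minimal sketch (not the PR's actual test code) of how a single common check can be skipped, assuming scikit-learn 0.22+ where check_estimator accepts generate_only=True:

```python
from sklearn.utils.estimator_checks import check_estimator

from lightgbm import LGBMClassifier

for estimator, check in check_estimator(LGBMClassifier(), generate_only=True):
    # Each yielded check is a functools.partial, so look up the wrapped
    # function's name and skip the one that cannot be passed yet.
    check_name = getattr(check.func, "__name__", "")
    if check_name == "check_no_attributes_set_in_init":
        continue  # see scikit-learn/scikit-learn#16241
    check(estimator)
```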
Another downside of not supporting scikit-learn 0.22+ is that users would have to compile scikit-learn 0.21.3 from source on Python 3.8, since no wheels are published for it. On conda it would mean that it's simply not possible to install lightgbm on Python 3.8 if you also have scikit-learn installed.