-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python-package] fix mypy error about pandas categorical features #6253
Conversation
@@ -786,7 +786,7 @@ def _data_from_pandas( | |||
feature_name: _LGBM_FeatureNameConfiguration, | |||
categorical_feature: _LGBM_CategoricalFeatureConfiguration, | |||
pandas_categorical: Optional[List[List]] | |||
) -> Tuple[np.ndarray, List[str], List[str], List[List]]: | |||
) -> Tuple[np.ndarray, List[str], Union[List[str], List[int]], List[List]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hint was wrong.
categorical_feature
can be a list of integers (indicating 0-based column positions).
LightGBM/python-package/lightgbm/basic.py
Lines 1724 to 1726 in b74528f
categorical_feature : list of str or int, or 'auto', optional (default="auto") | |
Categorical features. | |
If list of int, interpreted as indices. |
And it's passed back out of this function unmodified if it's provided like that.
LightGBM/python-package/lightgbm/basic.py
Lines 814 to 817 in b74528f
if categorical_feature == 'auto': # use cat cols from DataFrame | |
categorical_feature = cat_cols_not_ordered | |
else: # use cat cols specified by user | |
categorical_feature = list(categorical_feature) # type: ignore[assignment] |
categorical_feature = cat_cols_not_ordered | ||
else: # use cat cols specified by user | ||
categorical_feature = list(categorical_feature) # type: ignore[assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
categorical_feature
can only be a list of strings, a list of integers, or the literal string 'auto'
.
So I don't think this else:
clause is helpful. When it's reached, under correctly-provided inputs it's just calling list()
on something that's already a list, which has no effect.
I guess this could be providing a small bit of help for cases where some other type like a tuple was passed in, but if we want to support that then:
- it should be done at all public interfaces that support
categorical_feature
- it shouldn't be buried inside a
pandas
-specific function
I think this can and should be removed, in the interest of further simplifying the flow of pandas
data through the library.
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Contributes to #3756.
Contributes to #3867.
Fixes the following error from
mypy
: