Showing Low Variance columns #192
Comments
@reza, I really like this feature. Couple questions:
Not referring specifically to categorical features, but my 2 cents
Thanks for the feedback, @FedererKK. I was going to start on this soon.
@FedererKK this is what I have so far: https://youtu.be/quco79Val4w Can you give me some more information on what you meant by "a flag to show moments of different orders"? Please let me know what else you think might be helpful for determining variance (charts, calculations, etc.). Thanks
@FedererKK I also found this on Stack Overflow
@FedererKK @reza1615 I think this will be the final version. It will only be available for columns of numeric data (ints, floats). Please let me know if you think there's anything else I should add.
Added in v1.10.0
Sometimes a dataset has a categorical feature with multiple levels whose distribution is skewed, so that one level dominates the others. Such a feature carries little variation in the information it provides. For an ML model, it may not add much information and can therefore be ignored for modeling.
source
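As a rough illustration of the idea quoted above, here is a minimal sketch that flags a column when a single level dominates it. The function names and the 0.95 threshold are assumptions made for this example only; they are not taken from this thread or from the library's implementation.

```python
from collections import Counter


def dominant_share(values):
    """Return the fraction of non-missing values taken by the most common level."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    _, top_count = Counter(present).most_common(1)[0]
    return top_count / len(present)


def is_low_variance(values, threshold=0.95):
    """Flag a column as low variance when one level reaches the given share.

    The threshold is a hypothetical default chosen for illustration.
    """
    return dominant_share(values) >= threshold


# 98 of 100 values are "red", so one level dominates the column.
colors = ["red"] * 97 + ["blue", "green", "red"]
print(is_low_variance(colors))
```

A share-based check like this is only one possible heuristic; a real implementation might also take into account the number of unique values or the ratio between the two most frequent levels.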
I suggest adding a field to the column description that shows whether the column is low variance.
(e.g. Low_Variance = True/False)
It could also be a marker such as * or !, shown as a new icon before the column's name, to indicate which columns are low variance.
⑉ or 𝄇 or 🔰 or 🚩
These would make it more visual.
We could also have a section at the top of the column description with different flags to show
...
These flags could also be shown at the top of the drop-down menu.