Pre-Encoding for Correlations #488
Labels
enhancement
New feature or request
Comments
Can you refresh my memory on how the one-hot encoding works? We made column builders for this, correct?
Yes, we have a column builder.
In my opinion, when we open the Correlations window:
1- A multi-select dropdown lists all dtype(object) columns.
2- An "exclude" multi-select dropdown lists all high-cardinality columns (df['A'].nunique() > 50) to be excluded from one-hot encoding.
3- By default, the user only gets correlations for the numerical columns.
4- On clicking a button, D-Tale converts all object columns, excluding the second list of high-cardinality columns, and then calculates the correlations. Encoding high-cardinality columns generates many columns, which affects performance.
As for the date: why do you want to check military second?
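The steps proposed above can be sketched in plain pandas. This is only an illustration of the idea, not D-Tale's implementation: the helper name `encoded_correlations` and its `max_cardinality` and `exclude` parameters are hypothetical, and the threshold of 50 is taken from the `nunique() > 50` suggestion in this comment.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def encoded_correlations(df, max_cardinality=50, exclude=()):
    """Hypothetical helper: one-hot encode low-cardinality text columns,
    then compute correlations together with the numeric columns."""
    # Step 1: candidate columns are the object/string ones.
    candidates = [
        col for col in df.columns
        if is_object_dtype(df[col]) or is_string_dtype(df[col])
    ]
    # Step 2: drop user-excluded and high-cardinality columns.
    to_encode = [
        col for col in candidates
        if col not in exclude and df[col].nunique() <= max_cardinality
    ]
    # Step 3: the default view is the numeric columns only.
    result = df.select_dtypes(include="number")
    # Step 4: append the one-hot columns before correlating.
    if to_encode:
        dummies = pd.get_dummies(df[to_encode], columns=to_encode, dtype=float)
        result = pd.concat([result, dummies], axis=1)
    return result.corr()


df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "color": ["r", "g", "r", "g"],  # 2 unique values: encoded
    "id": ["a", "b", "c", "d"],     # 4 unique values: skipped below
})
corr = encoded_correlations(df, max_cardinality=3)
print(corr)
```

Capping the cardinality keeps the correlation matrix small: a column with n unique values adds n indicator columns, so the matrix grows quadratically with the number of encoded values.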
aschonfeld added a commit that referenced this issue May 27, 2021
aschonfeld added a commit that referenced this issue May 28, 2021
Added in v1.48.0.
As you know, Correlations only works on numerical columns and does not work for categorical ones.
It would be nice to have an option to one-hot encode all categorical columns in a batch, after which the user can get the Correlations.