Enhancement: Make the Ordinal Encoder an encoder choice #1150

Open
franchuterivera opened this issue May 26, 2021 · 2 comments
Labels
enhancement A new improvement or feature

Comments

@franchuterivera
Contributor

Problem statement:
scikit-learn 0.24 does not support np.nan when ordinal-encoding categorical columns; support for this is added in 0.25. Because of this, we are forced to run imputation here before ordinal encoding.

Suggestion:
When moving to the next scikit-learn release, we can remove the ordinal encoder from the categorical pipeline steps, so that it is no longer before imputation. This will also allow us to remove the noencoder choice.
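
For illustration, the impute-then-encode ordering from the problem statement might look like the sketch below. This is not the project's actual pipeline: the toy data, the `"missing"` fill value, and the variable names are assumptions made up for the example. It only uses `SimpleImputer`, `OrdinalEncoder`, and `make_pipeline` from scikit-learn.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder

# Toy categorical column with one missing value (hypothetical data).
X = np.array([["red"], [np.nan], ["blue"], ["red"]], dtype=object)

# Impute first, so the ordinal encoder never sees np.nan.
# The "missing" placeholder is an assumed fill value, not taken from the issue.
impute_then_encode = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="missing"),
    OrdinalEncoder(),
)

encoded = impute_then_encode.fit_transform(X)

# Categories are sorted, so the placeholder becomes an ordinary category:
# blue -> 0, missing -> 1, red -> 2
print(impute_then_encode.named_steps["ordinalencoder"].categories_[0])
print(encoded.ravel())
```

Once the encoder itself tolerates np.nan, the imputation step no longer has to come first, which is what makes the reordering suggested above possible.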

@eddiebergman
Contributor

@franchuterivera ello,

Was this resolved with PR #1135 or still waiting on scikit-learn 0.25?

@mfeurer
Contributor

mfeurer commented Jul 19, 2021

Still waiting. Currently the ordinal encoder is used prior to the one hot encoder, but should actually be part of the OHEChoice:
