Question - Support for different types of categorical variable encoding #1237

SSMK-wq · 2022-01-15T12:45:27Z

Hi,

Does TPOT offer any automated way to convert categorical features into encoded variables?

Context of the issue

I have an input dataset with more than 100 variables where around 80% of the variables are categorical in nature.

Some variables, such as gender and country, can be one-hot encoded, but I also have a few variables whose values have an inherent order, such as a rating (very good, good, bad, etc.).

Is there any approach or option in TPOT that we can use to apply an encoding based on the variable type?

For example, I would like to pass the two lists below as arguments to TPOT's AutoML interface.

one_hot_list = ['Gender', 'Country']  # one-hot encoding
ordinal_list = ['Feedback', 'Level_of_interest']  # ordinal encoding

Is there any option in the package that can do this for us?

Or is there another efficient way to do this, since I have 80 categorical columns?

fjpa121197 · 2022-05-04T13:04:17Z

Hi @SSMK-wq,

Did you find a workaround for this? I don't see any documentation saying that TPOT handles encoding of categorical features, or that it supports choosing between predefined encodings, for example ordinal vs. one-hot encoding.
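One possible workaround, independent of TPOT itself, is to encode the columns with scikit-learn before handing the data to TPOT. The sketch below is only an illustration: the toy data, the category orderings, and the `Age` column are assumptions made up for the example, while the two list names come from the question above. It uses `ColumnTransformer` to apply `OneHotEncoder` to one list of columns and `OrdinalEncoder` to the other.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Toy data: column names follow the question, values are invented for illustration.
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "Country": ["IN", "US", "IN", "DE"],
    "Feedback": ["bad", "good", "very good", "good"],
    "Level_of_interest": ["low", "high", "medium", "low"],
    "Age": [31, 45, 27, 52],  # hypothetical numeric column, passed through unchanged
})

one_hot_list = ["Gender", "Country"]
ordinal_list = ["Feedback", "Level_of_interest"]

# Assumed orderings (lowest to highest), one list per column in ordinal_list.
ordinal_categories = [
    ["bad", "good", "very good"],
    ["low", "medium", "high"],
]

preprocessor = ColumnTransformer(
    transformers=[
        ("onehot", OneHotEncoder(handle_unknown="ignore"), one_hot_list),
        ("ordinal", OrdinalEncoder(categories=ordinal_categories), ordinal_list),
    ],
    remainder="passthrough",  # keep columns that are not listed above
    sparse_threshold=0.0,     # always return a dense NumPy array
)

# Numeric matrix: one-hot columns first, then ordinal columns, then the rest.
X_encoded = preprocessor.fit_transform(df)
print(X_encoded.shape)
```

The resulting numeric array can then be passed to an estimator's `fit` method, for instance `TPOTClassifier.fit`, together with the target. With 80 categorical columns, the explicit orderings only need to be written out for the ordinal ones; the one-hot list can be built programmatically, for example from the remaining object-typed columns.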

spenceforce · 2022-09-17T02:49:31Z

Bumping this, as it would be nice to pass categorical features to TPOT. TPOT includes OneHotEncoder in its default estimator set for regressions, but as it stands it is only usable for integers. I see the fit method throws an error on np.isnan. I'm sure there's more to it than changing that, though.
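For context on the np.isnan error: the snippet below does not reproduce TPOT's own code, it only shows the general NumPy behavior that would explain it, namely that `np.isnan` is not defined for arrays of strings. The sample values are made up.

```python
import numpy as np

# String-valued categorical data, stored as an object array as pandas would.
values = np.array(["good", "bad", "very good"], dtype=object)

try:
    np.isnan(values)
    isnan_failed = False
except TypeError as exc:
    # NumPy cannot apply the isnan ufunc to non-numeric input.
    isnan_failed = True
    print(f"np.isnan raised TypeError: {exc}")
```

This suggests why string categories need to be converted to numbers before reaching a step that checks for NaN in this way.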

perib mentioned this issue Sep 21, 2023

TPOT2 and the future of TPOT development -- From the Devs #1322
