Dev #120

perib · 2024-02-27T21:32:17Z


What does this PR do?

Some bug fixes

Edited ColumnOneHotEncoder to mimic the behavior of the OneHotEncoder. It now automatically selects columns with fewer than 10 unique values and one-hot encodes them (same behavior as TPOT1). The original OneHotEncoder is not compatible with pandas DataFrames, but this one should be. Replaced OneHotEncoder with ColumnOneHotEncoder in the tpot2 search space. We could also change this later to make the unique-value threshold a searchable parameter.
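To illustrate the column-selection rule described above, here is a minimal sketch. It is not the tpot2 ColumnOneHotEncoder implementation; the helper names `low_cardinality_columns` and `build_encoder` are hypothetical, and scikit-learn's `ColumnTransformer` is used as a stand-in for whatever the real class does internally.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder


def low_cardinality_columns(df: pd.DataFrame, threshold: int = 10) -> list:
    """Return the names of columns with fewer than `threshold` unique values.

    Hypothetical helper: the threshold of 10 comes from the PR description.
    """
    return [col for col in df.columns if df[col].nunique() < threshold]


def build_encoder(df: pd.DataFrame, threshold: int = 10) -> ColumnTransformer:
    """One-hot encode the selected columns and pass the rest through unchanged."""
    cols = low_cardinality_columns(df, threshold)
    return ColumnTransformer(
        transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"), cols)],
        remainder="passthrough",
    )


# Small DataFrame: "color" has 3 unique values, "value" has 12.
df = pd.DataFrame({
    "color": ["red", "green", "red", "blue"] * 3,
    "value": [float(i) for i in range(12)],
})

selected = low_cardinality_columns(df)
encoded = build_encoder(df).fit_transform(df)
print(selected)
print(encoded.shape)
```

Only `color` falls under the threshold here, so it is expanded into one column per category while `value` is passed through as-is.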

A bug in the initial pipeline generator caused all initial pipelines to be of size 1 when leaf_config_dict was not set. Added a check so that the initial population pipelines include more nodes from inner_config_dict when leaf_config_dict is None.

A typo prevented the complexity scorer from recursively searching sklearn Pipeline objects. The recursive call was being passed each (name, estimator) tuple from pipeline.steps rather than the estimator itself, which is the second element (index 1) of that tuple. Fixed the typo so the estimator is passed to the recursive function.
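The kind of bug described above can be sketched as follows. This is not tpot2's complexity scorer: `count_estimators` is a hypothetical stand-in that simply counts leaf estimators, used only to show why the recursion must receive `step[1]` rather than the whole `(name, estimator)` tuple.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def count_estimators(est) -> int:
    """Toy complexity measure: count non-Pipeline estimators recursively."""
    if isinstance(est, Pipeline):
        # Pipeline.steps is a list of (name, estimator) tuples.
        # Recurse on the estimator at index 1. Passing the tuple itself
        # would fail the isinstance check, so a nested Pipeline would be
        # counted as a single leaf and never searched.
        return sum(count_estimators(step[1]) for step in est.steps)
    return 1


inner = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
outer = Pipeline([("scale", StandardScaler()), ("inner", inner)])

total = count_estimators(outer)
print(total)
```

With the tuple passed in by mistake, the nested `inner` pipeline would be treated as one leaf, which understates the complexity of any pipeline containing another pipeline.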

perib added 5 commits February 19, 2024 12:23

update column one hot encoder

76769d4

Merge branch 'EpistasisLab:dev' into dev

a71242e

fix complexity objective function

ab2777c

Merge branch 'dev' of https://github.com/perib/tpot2 into dev

3e1936f

change onehotencoder used, initial pop fix

4681389

perib merged commit ef2a9a1 into EpistasisLab:dev Mar 27, 2024
1 check passed
