Conditional sampling using GaussianCopula inefficient when categories are noised #910
Environment Details
Please indicate the following details about the environment in which you found the bug:
Error Description
The GaussianCopula is listed as the most efficient way to perform conditional sampling. Yet some configurations of this model are very inefficient compared to others.
Inefficient configurations:
categorical_transformer='categorical_fuzzy'
Efficient configurations (up to 100x faster):
categorical_transformer='categorical', 'label_encoding', or 'one_hot_encoding'
Steps to reproduce
The following is inefficient:
Meanwhile, changing it to label_encoding is 100x faster:
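A minimal timing harness for comparing two configurations might look like the sketch below. It uses only the Python standard library; the helper names `time_configurations` and `speedup` are hypothetical and not part of SDV. Each callable is assumed to wrap fitting a GaussianCopula with one `categorical_transformer` value and running the same conditional sampling request.

```python
import time
from typing import Callable, Dict


def time_configurations(
    runs: Dict[str, Callable[[], object]],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the best wall-clock time (seconds) for each named callable.

    Hypothetical helper: each callable is expected to fit a model with one
    categorical_transformer setting and perform conditional sampling.
    """
    timings: Dict[str, float] = {}
    for name, run in runs.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run()
            # Keep the fastest run to reduce noise from other processes.
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


def speedup(timings: Dict[str, float], slow: str, fast: str) -> float:
    """Return how many times faster `fast` is than `slow`."""
    if timings[fast] == 0:
        # Timer resolution too coarse to measure the fast run.
        return float("inf")
    return timings[slow] / timings[fast]
```

In use, one entry would be keyed by 'categorical_fuzzy' and another by 'label_encoding', and `speedup(timings, 'categorical_fuzzy', 'label_encoding')` would give the ratio reported above. The SDV calls themselves are left to the caller, since the conditional sampling method may differ between SDV releases.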