Closed
Description
I think it would be useful for feature selection if it were possible to keep track of which DataFrame columns were mapped to which array columns during the transformation, so that one could use, for instance, the `feature_importances_` of ensemble methods in sklearn.
Is there a straightforward way to do this right now? I looked into it a bit but didn't find a common way to get the necessary information during fitting of the sklearn transforms. Therefore the best way I can currently think of is to do the inspection separately for each sklearn transform, i.e. use `self.feature_names_` for `DictVectorizer`, `self.classes_` for `LabelBinarizer`, etc.
I'm thinking there must be a better way to do this.
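
For reference, here is a rough sketch of the per-transform inspection I have in mind. The helper `output_feature_names` and the column names `pet` and `info` are made up for illustration and are not part of any library; the only sklearn attributes relied on are `feature_names_` on a fitted `DictVectorizer` and `classes_` on a fitted `LabelBinarizer`.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.preprocessing import LabelBinarizer


def output_feature_names(column, transformer):
    """Hypothetical helper: guess the output column names of one fitted transformer."""
    # DictVectorizer stores one name per output column in feature_names_.
    if hasattr(transformer, "feature_names_"):
        return [f"{column}_{name}" for name in transformer.feature_names_]
    # LabelBinarizer stores the seen labels in classes_.
    if hasattr(transformer, "classes_"):
        classes = list(transformer.classes_)
        # With two or fewer classes LabelBinarizer emits a single column.
        if len(classes) <= 2:
            return [column]
        return [f"{column}_{label}" for label in classes]
    # Fallback (an assumption): the transformer keeps one output column.
    return [column]


# Fit each transformer on its own toy column.
binarizer = LabelBinarizer().fit(["cat", "dog", "fish", "cat"])
vectorizer = DictVectorizer().fit(
    [{"city": "Paris", "temp": 12.0}, {"city": "Oslo", "temp": 3.0}]
)

# Concatenate the names in the same order the output arrays would be stacked.
names = output_feature_names("pet", binarizer) + output_feature_names("info", vectorizer)
print(names)
```

A list like this could then be zipped with the `feature_importances_` of a fitted ensemble, provided the names are collected in the same order as the array columns. It still needs a special case per transformer type, which is exactly the part I'd like to avoid.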