Hi, I'm trying out Tangram for a binary classification problem. I have data with about 600 observations in one class and 6000 in another. Can I do a weighted model with Tangram that would balance the classes? Or does Tangram do some balancing implicitly under the hood?
Replies: 1 comment 3 replies
Hi @otsaw! Tangram does not currently handle class imbalance automatically. We suggest either upsampling your minority class or downsampling your majority class. To upsample, duplicate the minority-class rows in your CSV until the minority and majority classes have the same number of rows. You will need to create two CSV files: pass the one with the upsampled data as `--file-train` and the one without upsampling as `--file-test`. To downsample, keep only 600 of the 6000 majority-class rows along with all 600 minority-class rows, and pass the resulting 1200-row CSV to Tangram. This is obviously not ideal! We want …
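In case it helps, here is a rough sketch of both resampling approaches in plain Python (standard library only). It is not part of Tangram. The function names, the `label` column, and the file names are all made up for illustration, and it assumes your rows have been loaded as a list of dicts, e.g. with `csv.DictReader`. Note that the test split is taken before upsampling, so duplicated rows cannot end up in both files.

```python
import csv
import random
from collections import defaultdict


def group_by_label(rows, label_column):
    """Group rows (dicts) by the value in label_column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_column]].append(row)
    return groups


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def upsample(rows, label_column, seed=0):
    """Duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    groups = group_by_label(rows, label_column)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing rows with replacement from the same class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


def downsample(rows, label_column, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    groups = group_by_label(rows, label_column)
    target = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


def write_csv(path, rows):
    """Write a non-empty list of dict rows to path, with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical usage, assuming `rows` was read from your own CSV:
#   train, test = train_test_split(rows)
#   write_csv("train_upsampled.csv", upsample(train, "label"))
#   write_csv("test.csv", test)
```

The two files written in the usage comment would then be the ones passed as `--file-train` and `--file-test`. For downsampling, a single balanced file from `downsample(rows, "label")` would be enough.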