Learning over imbalanced data on the fly. #512
-
This is perhaps more of a general ML question, but here goes. Givens: a 3-class classification problem in which the balance between classes tends to change frequently and stay changed for an uncertain period. It's not an anomaly-detection problem: the minority class is neither consistent (except in the short term) nor particularly rare. There's no meaningful way to determine a static weighting or distribution a priori. What's the current thinking on best practices for dealing with this?
Replies: 4 comments 4 replies
-
Good question! I thought about this quite a bit ~1.5 years ago and wrote a related blog post. We then implemented samplers that balance the data online in the imblearn module. The distributions maintained by these samplers are just dictionaries that count the occurrences of each class, so they don't adapt very well to a change in distribution. However, it should be very straightforward to make these counters adaptive by only measuring the distribution on the n latest samples. Concretely, if I were to do this, I would implement a new distribution that only counts occurrences over the n latest samples. Note that an advantage of using these counters is that you can add a priori knowledge: just fill in the counters before running the model. I hope this (partially) answers your question.
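To make "only measuring the distribution on the n latest samples" concrete, here is a minimal sketch in plain Python (the class name and API are hypothetical, not part of the library): a class counter backed by a fixed-size window, so the estimated distribution tracks recent data and forgets old regimes. Seeding the counter implements the a priori knowledge trick mentioned above.

```python
from collections import Counter, deque


class RollingClassDist:
    """Tracks the class distribution over the n most recent labels,
    so the estimate adapts when the class balance drifts."""

    def __init__(self, window_size, prior=None):
        self.window = deque(maxlen=window_size)
        # Optional a priori knowledge: seed the counts before any data arrives.
        # Note that seeded counts are never "forgotten" by the window.
        self.counts = Counter(prior or {})

    def update(self, y):
        if len(self.window) == self.window.maxlen:
            # Forget the oldest label so old regimes stop influencing the estimate.
            self.counts[self.window[0]] -= 1
        self.window.append(y)
        self.counts[y] += 1

    def proba(self, y):
        total = sum(self.counts.values())
        return self.counts[y] / total if total else 0.0
```

With `window_size=1000`, a shift in the class balance is fully reflected in the estimate after at most 1000 samples, instead of being diluted by the entire history.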
-
Here’s a slightly different question. The imblearn methods take a desired_dist parameter: a dictionary describing the desired distribution for use in the various resampling classifiers. Is this desired distribution describing the a priori knowledge about the actual distribution of the data, or the ratios you want the classifier to see? Assume my a priori distribution of the actual data is { -1: 0.05, 0: 0.8, 1: 0.15 }. Obviously the classes { -1, 1 } are going to be harder to learn. I (think I) want the underlying classifier (a OneVsRest-wrapped ALMA) to “see” each class an ~equal number of times. Should my desired_dist be { -1: 0.33, 0: 0.34, 1: 0.33 }?
-
Sorry I wasn't clear: desired_dist is indeed the distribution we want the classifier to see. The

```python
if desired_dist is None:
    desired_dist = self._actual_dist
```

is basically an edge case: if no desired distribution is specified, the data is sampled completely at random.

Yes! Although I would recommend trying a more permissive desired distribution too.
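To illustrate what "the distribution the classifier sees" means in practice, here is a rough sketch of the rejection-sampling idea behind under-sampling (plain Python, hypothetical function, not the library's actual code): each incoming example of class y is kept with probability proportional to desired_dist[y] / actual_dist[y], so the surviving stream roughly follows the desired distribution.

```python
import random


def balance_stream(stream, actual_dist, desired_dist, seed=42):
    """Yield (x, y) pairs so a downstream classifier sees roughly
    `desired_dist` instead of `actual_dist`, via rejection sampling."""
    rng = random.Random(seed)
    # Scale so the most under-represented class is always accepted.
    m = max(desired_dist[y] / actual_dist[y] for y in desired_dist)
    for x, y in stream:
        if rng.random() < desired_dist[y] / (actual_dist[y] * m):
            yield x, y
```

With the distributions from the question above (actual { -1: 0.05, 0: 0.8, 1: 0.15 }, desired roughly uniform), every class -1 example is kept while class 0 examples are dropped about 94% of the time.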
-
Ah! The code(r) is smarter than I am. The gorpy hack can wait. Thanks, Max!