
Learning over imbalanced data on the fly. #512

Answered by MaxHalford
jbone asked this question in Q&A

Is this desired distribution describing the a priori knowledge about the actual distribution of the data, or is it describing the ratios you want the classifier to see?

Sorry I wasn't clear: desired_dist is indeed the distribution we want the classifier to see.

The

if desired_dist is None:
    desired_dist = self._actual_dist

is basically an edge case: if no desired distribution is specified, the observed class distribution is used as the target, so the data is sampled completely at random with no rebalancing.
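To make the distinction concrete, here is a minimal, self-contained sketch of how a desired distribution can steer what the classifier sees. It is not the library's implementation: `DesiredDistFilter`, `keep`, and `simulate` are hypothetical names, the class frequencies in the simulated stream are made up for illustration, and the acceptance rule (rejection sampling on the ratio of desired to observed class frequency) is one plausible way to do it.

```python
import collections
import random


class DesiredDistFilter:
    """Hypothetical helper: decides which labels a wrapped classifier gets to see."""

    def __init__(self, desired_dist=None, seed=None):
        self.desired_dist = desired_dist
        self._actual_dist = collections.Counter()  # running class counts
        self._rng = random.Random(seed)

    def keep(self, y):
        # Update the observed distribution before deciding.
        self._actual_dist[y] += 1
        g = self._actual_dist
        # Edge case: no target given, so fall back to the observed counts.
        f = self.desired_dist if self.desired_dist is not None else g
        # Desired-to-observed ratio for every class seen so far.
        weights = {c: f.get(c, 0) / g[c] for c in g}
        top = max(weights.values())
        if top == 0:
            return False
        # The most under-represented class is always kept; others are thinned.
        return self._rng.random() < weights[y] / top


def simulate(desired_dist, n=20_000, seed=0):
    """Feed an imbalanced stream (illustrative frequencies) through the filter."""
    stream_rng = random.Random(seed)
    sampler = DesiredDistFilter(desired_dist, seed=seed + 1)
    kept = collections.Counter()
    for _ in range(n):
        y = stream_rng.choices([-1, 0, 1], weights=[0.1, 0.8, 0.1])[0]
        if sampler.keep(y):
            kept[y] += 1
    return kept


if __name__ == "__main__":
    print("no target:     ", dict(simulate(None)))
    print("uniform target:", dict(simulate({-1: 1 / 3, 0: 1 / 3, 1: 1 / 3})))
```

With no target every sample passes through unchanged, which is the edge case described above. With a uniform target the majority class is thinned until the kept samples are roughly balanced, at the cost of discarding most of the stream.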

Assume my a priori distribution of actual data is { -1: 0., 0: 0.8, 1: 0.15 }. Obviously the classes { -1, 1 } are going to be harder to learn. I (think) I want the underlying classifier (a OneVsRest-wrapped ALMA) to “see” each class an ~equal number of ti…

Answer selected by jbone