Conversation
parlai/core/metrics.py
Outdated
```python
true_pos_score = sum(weighted_common.values())
if true_pos_score == 0:
    return 0
precision = true_pos_score / sum(weights[w] for w in pred_items)
```
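The snippet above covers only the weighted precision. For context, a self-contained sketch of the full weighted F1 might look like the following (the `weights` dict mapping token to rarity weight is assumed here, not taken from the PR):

```python
from collections import Counter

def weighted_f1(pred_items, gold_items, weights):
    """Sketch of an F1 where each matched token contributes its rarity
    weight instead of a flat count of 1. `weights` maps token -> weight
    in [0, 1] (hypothetical; the PR derives these from corpus counts)."""
    # Token overlap between prediction and gold, multiplicities included.
    common = Counter(pred_items) & Counter(gold_items)
    # Credit each shared token by its weight rather than by 1.
    weighted_common = {w: n * weights.get(w, 0.0) for w, n in common.items()}
    true_pos_score = sum(weighted_common.values())
    if true_pos_score == 0:
        return 0.0
    precision = true_pos_score / sum(weights.get(w, 0.0) for w in pred_items)
    recall = true_pos_score / sum(weights.get(w, 0.0) for w in gold_items)
    return 2 * precision * recall / (precision + recall)
```

Note that a prediction made entirely of zero-weight (common) tokens scores 0, which is the anti-gaming property the patch description aims for.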
Oh, this is really interesting. I had originally imagined the weights to just be 1 if above the threshold and 0 otherwise; I wonder if you could make that a configurable option for the metric? This is a cool way of doing it too.
Glad you like it! If you look at my plots of _rarity_weight in the PR description, you'll see that it rises fairly sharply after the chosen threshold. I was afraid that this might tend toward the simpler version (0 above and 1 below the threshold) in most cases, at which point the weighting would complicate the metric without adding much value. So I just opted for the simpler version for now.
But if there were a way of calculating the rarity weight which rose to 1.0 more gradually, we could try another version of this metric that doesn't have a "cutoff" and only has the weighting. That would feel much more elegant and would probably be more robust to different word distributions.
The issue I ran into while trying to find a function like that, though, is that the top few words of the distribution are so common relative to even the rest of the top 50 words that the function has to push the majority of the range (say, 0.0 to 0.99) down to near zero while stretching the last bit (0.99 to 1.0) into a gradual slope from near-zero to 1.0. Otherwise, fairly common words still receive a high weight.
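One candidate family with roughly that shape (purely illustrative, not something from this PR) is a high power of the cumulative frequency mass. If cdf is the fraction of corpus tokens covered by this word and all more frequent words, common words sit near 0 and rare words near 1, and a large exponent keeps most of [0, 0.99] near zero while the tail ramps up smoothly:

```python
def rarity_weight(cdf: float, k: float = 300.0) -> float:
    """Hypothetical smooth rarity weight with no hard cutoff.

    cdf: cumulative token-frequency mass covered by this word and all
    more-frequent words (common words ~0.0, rare words ~1.0).
    k: exponent controlling how sharply the weight concentrates near 1.0
    (300 is an arbitrary illustrative choice, not tuned on any corpus).
    """
    # cdf**k squashes most of the range toward zero: e.g. 0.5**300 is
    # astronomically small, while 0.999**300 is still about 0.74.
    return cdf ** k
```

The exponent k plays the role of the threshold but without a discontinuity, so it could in principle be exposed as the configurable option mentioned above.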
Given that this is a pretty narrow metric, I'd feel a bit better if it were moved to the teacher itself; custom_evaluation in particular would be a good fit.
We use this in another teacher in parlai_internal (which we plan to move to public soon). But I moved it to wizard for now.
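For readers unfamiliar with the hook being discussed: a teacher-local metric plugs in roughly like the standalone mimic below. It borrows ParlAI's custom_evaluation naming but does not import ParlAI, so the real signature and metric objects may differ from what's shown; the simple set-overlap F1 is also just a placeholder for the weighted version above.

```python
class RareWordF1Teacher:
    """Standalone mimic of a ParlAI-style teacher hook; class and method
    names follow ParlAI conventions but everything here is illustrative."""

    def __init__(self, rare_words):
        self.rare_words = set(rare_words)
        self.metrics = {}  # stand-in for ParlAI's metrics aggregator

    def custom_evaluation(self, teacher_action, labels, model_response):
        # Keep only tokens deemed rare; common tokens earn no credit.
        pred = [w for w in model_response.split() if w in self.rare_words]
        gold = [w for w in labels[0].split() if w in self.rare_words]
        if not gold:
            return  # nothing rare to score on this example
        # Set-level overlap (ignores multiplicity, for brevity).
        overlap = len(set(pred) & set(gold))
        precision = overlap / len(pred) if pred else 0.0
        recall = overlap / len(gold)
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        self.metrics.setdefault('rare_word_f1', []).append(f1)
```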
ya this looks quite right to me
Patch description
F1 can be gamed easily (either by humans or the model) by predicting common tokens irrespective of semantics.
To mitigate this, this PR introduces "Rare Word F1", which only gives credit for matching words that are infrequent relative to some reference corpus.
This is less susceptible to the adversarial scenario of a model that predicts the same thing over and over again, since it shouldn't be possible to find a set of words that is both rare and shows up often in the labels.
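As a rough illustration of that idea (the cutoff value, whitespace tokenization, and function name below are placeholders, not the PR's actual implementation): build the "common" set from a reference corpus until it covers a chosen fraction of all tokens, then compute F1 over only the remaining rare tokens.

```python
from collections import Counter

def rare_word_f1(pred, gold, corpus_tokens, top_p=0.5):
    """Minimal sketch of a rare-word F1: tokens inside the most frequent
    top_p mass of a reference corpus are excluded, so spamming common
    words earns no credit."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    covered, common = 0, set()
    # Mark words as "common" until they cover top_p of all corpus tokens.
    for word, c in counts.most_common():
        if covered / total >= top_p:
            break
        common.add(word)
        covered += c
    pred_rare = [w for w in pred.split() if w not in common]
    gold_rare = [w for w in gold.split() if w not in common]
    overlap = sum((Counter(pred_rare) & Counter(gold_rare)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_rare)
    recall = overlap / len(gold_rare)
    return 2 * precision * recall / (precision + recall)
```

A degenerate model that always emits the corpus's most frequent tokens scores 0 here, which captures the robustness argument above.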