
F1 and Precision/Recall values not consistent. #58

Closed
Whisht opened this issue Feb 5, 2021 · 6 comments · Fixed by #111
Labels: bug / fix (Something isn't working), help wanted (Extra attention is needed)



Whisht commented Feb 5, 2021

🐛 Bug

The values returned by f1 and precision are wrong.

To Reproduce

    import torch
    from pytorch_lightning.metrics.functional import stat_scores, precision, recall, f1

    y_pred = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    y_true = torch.tensor([0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
    tp, fp, tn, fn, _ = stat_scores(y_pred, y_true, 1)  # tp, fp, tn, fn = [8, 8, 0, 0] if 0 is the positive class
    p = precision(y_pred, y_true, 2)  # returns 0.5; tp / (tp + fp) = 0.5, but with 1 as the positive class, precision should be 0
    r = recall(y_pred, y_true, 2)  # returns 0.5, but tp / (tp + fn) should be 1
    f1_score = f1(y_pred, y_true, 2)  # returns 0, which is not right either

As mentioned above, if we take 0 as the positive class, then tp, fp, tn, fn = [8, 8, 0, 0], so precision would be 0.5 and recall should be 1. Yet precision() returns 0.5 while recall() also returns 0.5, which matches neither choice of positive class consistently.
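For comparison, here is a minimal sketch of what scikit-learn reports for the same data (assuming scikit-learn is installed; its pos_label argument selects the positive class):

    from sklearn.metrics import precision_score, recall_score, f1_score

    y_pred = [0] * 16
    y_true = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    # With 0 as the positive class: tp = 8, fp = 8, fn = 0
    precision_score(y_true, y_pred, pos_label=0)  # 0.5
    recall_score(y_true, y_pred, pos_label=0)     # 1.0
    # With 1 as the positive class there are no positive predictions,
    # so precision, recall and f1 are all 0 (sklearn warns and returns 0.0)
    f1_score(y_true, y_pred, pos_label=1)         # 0.0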

Expected behavior

The values should be consistent. Also, a parameter that lets the user choose any class as the positive class (like sklearn's pos_label) would be easier to use.

Environment

- python = 3.8.5
- pytorch-lightning = 1.1.6
- pytorch = 1.7

@SkafteNicki
Member

So currently we always interpret the 1 class as the positive class. We have an open PR (Lightning-AI/pytorch-lightning#4957) that will add a positive_label argument so the user can specify which class should be taken as the positive class.
For now you can pass the class_reduction='none' parameter to get the score for each individual class:

    precision_recall(y_pred, y_true, class_reduction='none')
    # -> (tensor([0.5000, 0.0000]), tensor([1., 0.]))
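If you only care about one of the classes, you can index it out of the per-class output. A small sketch (pos_label here is just a local variable for illustration, not a library parameter):

    from pytorch_lightning.metrics.functional import precision_recall

    pos_label = 1  # treat class 1 as the positive class
    per_class_p, per_class_r = precision_recall(y_pred, y_true, class_reduction='none')
    p_pos = per_class_p[pos_label]  # tensor(0.) -> precision for class 1
    r_pos = per_class_r[pos_label]  # tensor(0.) -> recall for class 1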


Whisht commented Feb 7, 2021

But if we choose 1 as the positive class, then tp, fp, tn, fn should be [0, 0, 8, 8]. The precision $\frac{tp}{tp+fp}=\frac{0}{0}$ should be $0$, not the returned $0.5$. Meanwhile, the recall $\frac{tp}{tp+fn}=\frac{0}{0+8}$ should be $0$ too, not the returned $0.5$ either. If we use these two returned values, then $f1 = 2\cdot\frac{p \cdot r}{p+r} = 0.5$, not the returned $0$. But since 1 is the positive class, an f1 of $0$ is correct.

These inconsistent values are confusing.

@SkafteNicki
Member

    precision_recall(y_pred, y_true, class_reduction='none')
    # -> (tensor([0.5000, 0.0000]), tensor([1., 0.]))

@Whisht here the first output is the per-class precision, i.e. 0.5 for class 0 and 0.0 for class 1, and the second output is the per-class recall, i.e. 1 for class 0 and 0 for class 1. That seems correct to me.
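One way to reconcile this with the single 0.5 values returned earlier (my understanding of the default behaviour): with the default micro reduction the tp/fp/fn counts are summed over both classes before dividing, which collapses precision and recall into plain accuracy. A sketch of that reduction on the original tensors:

    import torch

    y_pred = torch.tensor([0] * 16)
    y_true = torch.tensor([0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

    # Micro averaging: sum the per-class statistics before dividing
    tp = sum(((y_pred == c) & (y_true == c)).sum().item() for c in (0, 1))  # 8 correct predictions
    fp = sum(((y_pred == c) & (y_true != c)).sum().item() for c in (0, 1))  # 8 errors
    fn = fp  # each error is a false positive for one class and a false negative for the other

    micro_precision = tp / (tp + fp)  # 8 / 16 = 0.5
    micro_recall = tp / (tp + fn)     # 8 / 16 = 0.5, both equal to the accuracy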


taketakeseijin commented Feb 19, 2021

In 2-class classification, F1 returns the same value as accuracy. This is not good.

To Reproduce

    import torch
    from pytorch_lightning.metrics import ConfusionMatrix, F1

    target = torch.tensor([1, 1, 0, 0])
    preds = torch.tensor([0, 1, 0, 0])
    ConfusionMatrix(num_classes=2)(preds, target)
    # -> tensor([[2., 0.],
    #            [1., 1.]])
    F1(num_classes=2)(preds, target)
    # -> tensor(0.7500)

In the above situation, F1 should be 0.8 (with class 0 as positive) or 0.666 (with class 1 as positive).
However, it returns 0.75, which is the accuracy.
This happens in every 2-class case I have tried.

pytorch-lightning 1.2.0
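For comparison, a sketch of the corresponding scikit-learn values (assuming scikit-learn is available):

    from sklearn.metrics import f1_score, accuracy_score

    target = [1, 1, 0, 0]
    preds = [0, 1, 0, 0]
    f1_score(target, preds, pos_label=0)      # 0.8, class 0 as positive
    f1_score(target, preds, pos_label=1)      # 0.666..., class 1 as positive
    f1_score(target, preds, average='micro')  # 0.75, identical to the accuracy
    accuracy_score(target, preds)             # 0.75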

@SkafteNicki
Member

@taketakeseijin if you want the score for the positive class (which is often the case for binary classification) you can do:

    F1(num_classes=1)(preds, target)
    # -> tensor(0.6667)

For the negative class (class 0) you can do:

    F1(num_classes=2, average=None)(preds, target)[0]  # returns both per-class scores; pick the first
    # -> tensor(0.8000)

Setting num_classes=2, we calculate the score as if it were a multiclass problem, which means we have to do some reduction over all classes. To get the binary score, set num_classes=1.
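As a side note on why that multiclass reduction lands exactly on the accuracy here (my own derivation, consistent with the numbers above): with micro averaging over both classes, $\sum tp$ counts every correct prediction exactly once, and every error is simultaneously a false positive for one class and a false negative for the other, so $\sum fp = \sum fn$. Hence micro precision $=$ micro recall $=$ micro F1 $=$ accuracy, which is the 0.75 observed above.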

@Borda Borda transferred this issue from Lightning-AI/pytorch-lightning Mar 12, 2021
@github-actions

Hi! Thanks for your contribution, great first issue!

@Borda Borda added the "bug / fix" (Something isn't working) and "help wanted" (Extra attention is needed) labels Mar 17, 2021
@SkafteNicki SkafteNicki mentioned this issue Mar 19, 2021