
F1 and Precision/Recall values not consistent. #58

Closed
Whisht opened this issue Feb 5, 2021 · 6 comments · Fixed by #111
Labels: bug / fix (Something isn't working), help wanted (Extra attention is needed)



Whisht commented Feb 5, 2021

🐛 Bug

The values returned by f1 and precision are wrong.

To Reproduce

    import torch
    from pytorch_lightning.metrics.functional import stat_scores, precision, recall, f1

    y_pred = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    y_true = torch.tensor([0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
    tp, fp, tn, fn, _ = stat_scores(y_pred, y_true, 1)  # tp, fp, tn, fn = [8, 8, 0, 0] if 0 is the positive class
    p = precision(y_pred, y_true, 2)  # returns 0.5; tp / (tp + fp) = 0.5, but with 1 as the positive class, precision should be 0
    r = recall(y_pred, y_true, 2)  # returns 0.5, but tp / (tp + fn) should be 1
    f1_score = f1(y_pred, y_true, 2)  # returns 0, which is not right either

As mentioned above, if we take 0 as the positive class, then tp, fp, tn, fn = [8, 8, 0, 0], so precision would be 0.5 and recall should be 1. Yet precision() returns 0.5 while recall() also returns 0.5, which matches neither choice of positive class consistently.
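For comparison, here is a minimal sketch of what scikit-learn reports for the same data (assuming scikit-learn is installed; its pos_label argument selects the positive class):

    from sklearn.metrics import precision_score, recall_score, f1_score

    y_pred = [0] * 16
    y_true = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    # With 0 as the positive class: tp = 8, fp = 8, fn = 0
    precision_score(y_true, y_pred, pos_label=0)  # 0.5
    recall_score(y_true, y_pred, pos_label=0)     # 1.0
    # With 1 as the positive class there are no positive predictions,
    # so precision, recall and f1 are all 0 (sklearn warns and returns 0.0)
    f1_score(y_true, y_pred, pos_label=1)         # 0.0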

Expected behavior

The values should be consistent. Also, a parameter that lets the user choose any class as the positive class (like sklearn's pos_label) would be easier to use.

Environment

- python = 3.8.5
- pytorch-lightning = 1.1.6
- pytorch = 1.7

@SkafteNicki
Member

So currently we always interpret the 1 class as the positive class. We have an open PR (Lightning-AI/pytorch-lightning#4957) that will add a positive_label argument so the user can specify which class should be taken as the positive class.
For now you can pass the class_reduction='none' parameter to get the score for each individual class:

    precision_recall(y_pred, y_true, class_reduction='none')
    # -> (tensor([0.5000, 0.0000]), tensor([1., 0.]))
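If you only care about one of the classes, you can index it out of the per-class output. A small sketch (pos_label here is just a local variable for illustration, not a library parameter):

    from pytorch_lightning.metrics.functional import precision_recall

    pos_label = 1  # treat class 1 as the positive class
    per_class_p, per_class_r = precision_recall(y_pred, y_true, class_reduction='none')
    p_pos = per_class_p[pos_label]  # tensor(0.) -> precision for class 1
    r_pos = per_class_r[pos_label]  # tensor(0.) -> recall for class 1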


Whisht commented Feb 7, 2021

But if we choose 1 as the positive class, then tp, fp, tn, fn should be [0, 0, 8, 8]. The precision $\frac{tp}{tp+fp}=\frac{0}{0}$ should be $0$, not the returned $0.5$. Meanwhile, the recall $\frac{tp}{tp+fn}=\frac{0}{0+8}$ should be $0$ too, not the returned $0.5$ either. If we use these two returned values, then $f1 = 2\cdot\frac{p \cdot r}{p+r} = 0.5$, not the returned $0$. But since 1 is the positive class, an f1 of $0$ is correct.

These inconsistent values are confusing.

@SkafteNicki
Member

    precision_recall(y_pred, y_true, class_reduction='none')
    # -> (tensor([0.5000, 0.0000]), tensor([1., 0.]))

@Whisht here the first output is the per-class precision, i.e. 0.5 for class 0 and 0.0 for class 1, and the second output is the per-class recall, i.e. 1 for class 0 and 0 for class 1. That seems correct to me.
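One way to reconcile this with the single 0.5 values returned earlier (my understanding of the default behaviour): with the default micro reduction the tp/fp/fn counts are summed over both classes before dividing, which collapses precision and recall into plain accuracy. A sketch of that reduction on the original tensors:

    import torch

    y_pred = torch.tensor([0] * 16)
    y_true = torch.tensor([0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

    # Micro averaging: sum the per-class statistics before dividing
    tp = sum(((y_pred == c) & (y_true == c)).sum().item() for c in (0, 1))  # 8 correct predictions
    fp = sum(((y_pred == c) & (y_true != c)).sum().item() for c in (0, 1))  # 8 errors
    fn = fp  # each error is a false positive for one class and a false negative for the other

    micro_precision = tp / (tp + fp)  # 8 / 16 = 0.5
    micro_recall = tp / (tp + fn)     # 8 / 16 = 0.5, both equal to the accuracy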


taketakeseijin commented Feb 19, 2021

In 2-class classification, F1 returns the same value as accuracy. This is not good.

To Reproduce

    import torch
    from pytorch_lightning.metrics import ConfusionMatrix, F1

    target = torch.tensor([1, 1, 0, 0])
    preds = torch.tensor([0, 1, 0, 0])
    ConfusionMatrix(num_classes=2)(preds, target)
    # -> tensor([[2., 0.],
    #            [1., 1.]])
    F1(num_classes=2)(preds, target)
    # -> tensor(0.7500)

In the above situation, F1 should be 0.8 (with class 0 as positive) or 0.666 (with class 1 as positive).
However, it returns 0.75, which is the accuracy.
This happens in every 2-class case I have tried.

pytorch-lightning 1.2.0
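For comparison, a sketch of the corresponding scikit-learn values (assuming scikit-learn is available):

    from sklearn.metrics import f1_score, accuracy_score

    target = [1, 1, 0, 0]
    preds = [0, 1, 0, 0]
    f1_score(target, preds, pos_label=0)      # 0.8, class 0 as positive
    f1_score(target, preds, pos_label=1)      # 0.666..., class 1 as positive
    f1_score(target, preds, average='micro')  # 0.75, identical to the accuracy
    accuracy_score(target, preds)             # 0.75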

@SkafteNicki
Member

@taketakeseijin if you want the score for the positive class (which is often the case for binary classification) you can do:

    F1(num_classes=1)(preds, target)
    # -> tensor(0.6667)

For the negative class (class 0) you can do:

    F1(num_classes=2, average=None)(preds, target)[0]  # returns both per-class scores; pick the first
    # -> tensor(0.8000)

Setting num_classes=2, we calculate the score as if it were a multiclass problem, which means we have to do some reduction over all classes. To get the binary score, set num_classes=1.
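As a side note on why that multiclass reduction lands exactly on the accuracy here (my own derivation, consistent with the numbers above): with micro averaging over both classes, $\sum tp$ counts every correct prediction exactly once, and every error is simultaneously a false positive for one class and a false negative for the other, so $\sum fp = \sum fn$. Hence micro precision $=$ micro recall $=$ micro F1 $=$ accuracy, which is the 0.75 observed above.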

@Borda Borda transferred this issue from Lightning-AI/pytorch-lightning Mar 12, 2021
@github-actions

Hi! Thanks for your contribution, great first issue!

@Borda Borda added the "bug / fix" (Something isn't working) and "help wanted" (Extra attention is needed) labels Mar 17, 2021
@SkafteNicki SkafteNicki mentioned this issue Mar 19, 2021