
Possible bug in binary classification calibration_error #1105

Closed
cwognum opened this issue Jun 21, 2022 · 3 comments
Comments

cwognum (Contributor) commented Jun 21, 2022

🐛 Bug

In calibration_error(), I don't think the accuracies are computed correctly in the binary classification setting: the update step just returns the targets. Shouldn't it instead return something like target == preds.round().int()? Am I missing something?

Code example

import torch
from torchmetrics.functional.classification import calibration_error

preds = torch.tensor([0.01, 0.001, 0.005])  # The raw sigmoid output
targets = torch.tensor([1, 1, 1])
calibration_error(preds, targets)
# This returns: tensor(0.9947)

The model confidently predicts the wrong class, but is rewarded with a near perfect calibration score.
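For reference, a minimal sketch (hypothetical, not the library's actual code) of the change the suggestion above would amount to: keep the sigmoid outputs as confidences, but compute per-sample correctness instead of returning the raw targets.

import torch
from torch import Tensor

def suggested_binary_update(preds: Tensor, target: Tensor):
    # Hypothetical variant of the binary update: confidences stay the
    # sigmoid outputs, accuracies become per-sample correctness.
    confidences = preds
    accuracies = (target == preds.round().int()).int()
    return confidences, accuracies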

Environment

  • TorchMetrics version (and how you installed TM, e.g. conda, pip, build from source): 0.9.1, installed with mamba
  • Python & PyTorch Version (e.g., 1.0): Python 3.9.13, PyTorch 1.11.0.post202
  • Any other relevant information such as OS (e.g., Linux): Ubuntu (Linux)
@cwognum cwognum added bug / fix Something isn't working help wanted Extra attention is needed labels Jun 21, 2022
cwognum (Contributor, Author) commented Jun 22, 2022

Added a little example to better illustrate my point.

By the way, an all-zeros prediction vector would have been a simpler example, but it turns out the preds can't be exactly 0 because of how the binning is done. It could make sense to clamp the predictions in the binning process to prevent this, e.g.:

torch.clip(confidences, 1e-6, 1.0)
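For illustration, a minimal sketch of where such a clamp could sit relative to the binning (assuming a bucketize-based binning scheme; the actual TorchMetrics internals may differ):

import torch

confidences = torch.tensor([0.0, 0.2, 0.9])   # contains an exact 0
bin_boundaries = torch.linspace(0, 1, 15 + 1)  # 15 equal-width bins

# Without clamping, an exact 0 maps to bucket 0, so the usual "- 1"
# offset would yield bin index -1; clamping keeps it inside the first bin.
clamped = torch.clip(confidences, 1e-6, 1.0)
bin_indices = torch.bucketize(clamped, bin_boundaries) - 1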

@Borda Borda added this to the v0.10 milestone Jul 27, 2022
SkafteNicki (Member) commented:

Hi,
I checked this issue as part of a bigger refactor (see issue #1001 and PR #1195), and it seems that our calibration error is computing the right value.

First, in the example provided the metric gives a score of 0.9942. As the metric is a calibration error, the optimum is 0 and not 1, so it seems correct that the metric gives a high score, since the example is clearly not well calibrated.

Secondly, I ran the example through a third-party package, https://github.com/fabiankueppers/calibration-framework, which gives the same result as our implementation (we are actually using it for testing now).

Therefore, there does not seem to be an error in the implementation.
Closing issue.
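As a quick numerical sanity check of the score in the original example (a hand calculation, assuming all three confidences land in a single bin so the binned expected calibration error reduces to one term):

import torch

preds = torch.tensor([0.01, 0.001, 0.005])   # probability of the positive class
target = torch.tensor([1.0, 1.0, 1.0])

# With confidences = preds and accuracies = target (the binary update
# convention quoted in the next comment), the single-bin ECE is simply
# |mean(target) - mean(preds)|.
print(torch.abs(target.mean() - preds.mean()))  # tensor(0.9947)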

eyuansu62 commented:

I also have the same problem.
def _binary_calibration_error_update(preds: Tensor, target: Tensor) -> Tensor:
    confidences, accuracies = preds, target
    return confidences, accuracies

How could the target possibly be equal to the accuracy? The target is the ground-truth label, while the accuracy should be preds == target.
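For what it's worth, on the example from this issue the two conventions happen to give the same number, which may be part of the confusion. Both the quoted convention (confidence = positive-class probability, accuracy = target) and the convention the question assumes (confidence = max class probability, accuracy = preds == target) come out at roughly 0.9947 here (a hand check, not library code):

import torch

preds = torch.tensor([0.01, 0.001, 0.005])
target = torch.tensor([1.0, 1.0, 1.0])

# Convention quoted above: confidences = preds, accuracies = target.
ece_a = torch.abs(target.mean() - preds.mean())    # ≈ 0.9947

# Max-probability confidence with per-sample correctness; again all
# samples share one bin, so the binned ECE is a single term.
conf = torch.where(preds >= 0.5, preds, 1 - preds)
acc = (preds.round() == target).float()
ece_b = torch.abs(acc.mean() - conf.mean())        # ≈ 0.9947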
