Document significance of macro vs micro averaging #1874
Labels
documentation
good first issue
Issue
In the segmentation trainer, MulticlassAccuracy and MulticlassJaccardIndex use average="micro" rather than the torchmetrics default, macro.
From Slack: Macro weights per patch (what is the average accuracy of each patch?), whereas micro weights per pixel (what is the average accuracy of each pixel across the entire dataset?). These two metrics vary greatly when the number of nodata pixels varies between patches. If one patch is mostly nodata pixels and another patch is mostly data pixels, then macro will take the average of the two patches, while micro will sum up all pixels.
Macro might be okay for some applications (such as pre-chipped datasets) but is definitely not what you want for others (such as geospatial datasets).
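To make the per-patch vs per-pixel distinction described in the Slack message concrete, here is a minimal sketch in plain Python. It is not torchgeo or torchmetrics code: the `NODATA` value, the `patch_counts` helper, and the toy patches are all hypothetical and chosen only to show how the two averages diverge when one patch is mostly nodata.

```python
# Toy illustration only: plain Python, not torchgeo or torchmetrics code.
NODATA = -1  # hypothetical value marking pixels to ignore


def patch_counts(pred, target):
    """Return (correct, valid) pixel counts for one patch, skipping nodata."""
    valid = [(p, t) for p, t in zip(pred, target) if t != NODATA]
    correct = sum(p == t for p, t in valid)
    return correct, len(valid)


# Patch A: mostly nodata, only 2 valid pixels, both predicted correctly.
target_a = [0, 1] + [NODATA] * 8
pred_a = [0, 1] + [0] * 8

# Patch B: all data, 10 valid pixels, half predicted correctly.
target_b = [0, 1] * 5
pred_b = [0] * 10

counts = [patch_counts(pred_a, target_a), patch_counts(pred_b, target_b)]

# Per-patch averaging: every patch counts equally, however few valid pixels it has.
per_patch = sum(c / n for c, n in counts) / len(counts)

# Per-pixel pooling: every valid pixel counts equally across the whole dataset.
per_pixel = sum(c for c, _ in counts) / sum(n for _, n in counts)

print(f"per-patch average: {per_patch:.3f}")
print(f"per-pixel average: {per_pixel:.3f}")
```

In this toy case the two valid pixels of patch A carry as much weight as the ten pixels of patch B under per-patch averaging, which is why the per-patch number comes out higher than the pooled one.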
Further discussion: one proposal is to report both macro and micro and name them accordingly: Overall Accuracy (OA) for micro and Average Accuracy (AA) for macro.
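As a rough sketch of what OA and AA could look like, the plain-Python example below assumes the definitions commonly used in remote sensing: OA pools every pixel, while AA averages the per-class accuracies. This per-class reading is also how the torchmetrics documentation describes average="macro", and it differs from the per-patch description quoted above. The function names and toy labels are hypothetical; this is not the torchgeo implementation.

```python
# Toy illustration only: plain Python, not torchgeo or torchmetrics code.


def overall_accuracy(pred, target):
    """OA: fraction of all pixels predicted correctly (pools every pixel)."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)


def average_accuracy(pred, target):
    """AA: unweighted mean of per-class accuracy over classes present in target."""
    per_class = []
    for c in sorted(set(target)):
        idx = [i for i, t in enumerate(target) if t == c]
        per_class.append(sum(pred[i] == c for i in idx) / len(idx))
    return sum(per_class) / len(per_class)


# Imbalanced toy labels: 8 pixels of class 0, 2 pixels of class 1.
target = [0] * 8 + [1] * 2
# All class-0 pixels are right; one of the two class-1 pixels is wrong.
pred = [0] * 8 + [0, 1]

oa = overall_accuracy(pred, target)
aa = average_accuracy(pred, target)
print(f"OA: {oa:.2f}, AA: {aa:.2f}")
```

With imbalanced classes, OA is dominated by the majority class while AA gives the rare class equal weight, so reporting both under distinct names avoids ambiguity about which one a number refers to.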
Fix
Refine the description above and document at https://torchgeo.readthedocs.io/en/stable/api/trainers.html#torchgeo.trainers.SemanticSegmentationTask.configure_metrics