Evaluate models accuracy when using inference transforms on Tensors (instead of PIL images) #6506
Comments
@pmeier that's an important point that we need to include in the #6433 PR
Some partial results below using #6508. All of these are on the […]
Note: PIL was used for jpeg decoding in all cases - we're only interested in the variance w.r.t. the transforms here, not in the decoding (although that might still be worth considering).
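For reference, a minimal sketch of this kind of PIL-vs-Tensor comparison, assuming a torchvision version with the multi-weight API; the model choice, weights enum and file path are illustrative, not the exact setup behind the numbers above:

```python
import torch
from PIL import Image
from torchvision.models import resnet50, ResNet50_Weights
from torchvision.transforms import functional as F

# Pre-trained model and its canonical (PIL-trained) inference preset.
weights = ResNet50_Weights.IMAGENET1K_V1
model = resnet50(weights=weights).eval()
preset = weights.transforms()  # resize + center-crop + rescale + normalize

# Decode with PIL in both paths, so only the transforms differ.
pil_img = Image.open("val/ILSVRC2012_val_00000001.JPEG").convert("RGB")  # hypothetical path
tensor_img = F.pil_to_tensor(pil_img)  # uint8 CHW tensor

with torch.no_grad():
    logits_pil = model(preset(pil_img).unsqueeze(0))        # PIL path
    logits_tensor = model(preset(tensor_img).unsqueeze(0))  # Tensor path

print("top-1 (PIL):   ", logits_pil.argmax(1).item())
print("top-1 (tensor):", logits_tensor.argmax(1).item())
print("max |logit diff|:", (logits_pil - logits_tensor).abs().max().item())
```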
@NicolasHug thanks for publishing the results, this looks very interesting. It's obvious that antialiasing will have a massive effect on users once they start using the Tensor backend more. @pmeier @vfdev-5 These are the numbers I quoted in our offline chat today. It's worth noting that if we wanted to ensure that the user has control over the antialiasing option on Transforms, we would have to expose this option on every Transform that uses resize behind the scenes.
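To illustrate what exposing that option could look like, here is a hypothetical preset-style transform (loosely modeled on the classes in `_presets.py`, but not the actual torchvision code) where `antialias` is a constructor argument threaded down to the resize that runs behind the scenes; the class name and default sizes are assumptions:

```python
import torch
from torch import nn
from torchvision.transforms import functional as F, InterpolationMode


class ClassificationPreset(nn.Module):  # hypothetical sketch, not the real preset
    def __init__(self, crop_size=224, resize_size=256,
                 interpolation=InterpolationMode.BILINEAR, antialias=True):
        super().__init__()
        self.crop_size = [crop_size]
        self.resize_size = [resize_size]
        self.interpolation = interpolation
        self.antialias = antialias  # the option that would need to be exposed everywhere resize is used

    def forward(self, img):
        # antialias is forwarded to the resize call that runs behind the scenes
        img = F.resize(img, self.resize_size, interpolation=self.interpolation,
                       antialias=self.antialias)
        img = F.center_crop(img, self.crop_size)
        if not isinstance(img, torch.Tensor):
            img = F.pil_to_tensor(img)
        img = F.convert_image_dtype(img, torch.float)
        return F.normalize(img, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
```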
Re PIL images vs tensors, I was experiencing the issue with antialias set to "warn" in the older torchvision versions. After looking at the forward method of the ImageClassification class in transforms/_presets.py, I decided to test each step. Using the […] I wonder if the real difference is due to using different data types (uint8 vs float32) and/or dynamic ranges. I think it's probably the different data types, but a more thorough investigation would be useful.
Hi @CA4GitHub, it's not about dtype, it's really about antialiasing. In fact even if you passed a […] (that's not the case anymore as of... yesterday, with […]
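A small check along those lines, as a sketch (the input file and resize size are hypothetical): resize the same image as a uint8 tensor and as a float32 tensor, with and without antialias, and compare each result against the PIL output, which always antialiases. The expectation, per the comments above, is that the antialias flag, not the dtype, drives the difference.

```python
import torch
from PIL import Image
from torchvision.transforms import functional as F, InterpolationMode

pil_img = Image.open("example.jpg").convert("RGB")   # hypothetical file
u8 = F.pil_to_tensor(pil_img)                        # uint8 CHW
f32 = F.convert_image_dtype(u8, torch.float)         # float32 in [0, 1]

size = [256]
ref = F.pil_to_tensor(F.resize(pil_img, size))       # PIL path: always antialiased

for name, t in [("uint8", u8), ("float32", f32)]:
    for aa in (False, True):
        out = F.resize(t, size, interpolation=InterpolationMode.BILINEAR, antialias=aa)
        if out.dtype != torch.uint8:
            out = F.convert_image_dtype(out, torch.uint8)
        diff = (out.int() - ref.int()).abs().float().mean().item()
        print(f"{name:7s} antialias={aa}: mean abs diff vs PIL = {diff:.2f}")
```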
The accuracy of our trained models that we currently report comes from evaluations run on PIL images. However, our inference-time transforms also support Tensors.
In the wild, our users might be passing Tensors to the pre-trained models (instead of PIL images), so it's worth figuring out whether the accuracy is consistent between Tensors and PIL.
Note: we do check that all the transforms are consistent between PIL and Tensors, so hopefully differences should be minimal. But models are known to learn interpolation tweaks, and in particular the use of anti-aliasing. PIL uses anti-aliasing by default and this is what our models were trained on, but we don't pass antialias=True to the Resize transform, so it might be a source of discrepancy. As discussed internally with @datumbox, figuring that out is part of the transforms rework plan (although it's relevant outside of the rework as well).
cc @vfdev-5 @datumbox
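For context, a minimal sketch of what a Tensor-input inference pipeline looks like when anti-aliasing is requested explicitly, assuming a torchvision version where Resize accepts an antialias argument; the sizes are the usual ImageNet ones and the input tensor is just a stand-in for a decoded image:

```python
import torch
from torchvision import transforms

# Inference transforms for Tensor inputs, with antialiasing requested explicitly
# so that resize matches PIL's default behaviour as closely as possible.
inference_transforms = transforms.Compose([
    transforms.Resize(256, antialias=True),
    transforms.CenterCrop(224),
    transforms.ConvertImageDtype(torch.float),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = torch.randint(0, 256, (3, 500, 400), dtype=torch.uint8)  # stand-in for a decoded image
batch = inference_transforms(img).unsqueeze(0)
```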