[Torch][Quantized] Fix converting serialized quantized models #5839
This is a workaround for the issue reported in pytorch/pytorch#39690.

In short, when a quantized PyTorch model is serialized and loaded back, the dtypes of its output tensors are dropped, and the loaded model has no `QUInt8` types at all. This becomes a problem when converting some Torch ops. For example, the output dtype of `aten::quantize_per_tensor` becomes `Tensor` (i.e., a float tensor), which is wrong, so `aten::adaptive_avg_pool2d` treats it as a float operation. But the output of `aten::quantize_per_tensor` should obviously be a quantized tensor, so `aten::adaptive_avg_pool2d` has to be converted to its quantized version.
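To make the dtype loss concrete, here is a minimal repro sketch (the quantized resnet18 from torchvision and the `qresnet18.pt` path are illustrative; this snippet is not taken from the PR):

```python
# Repro sketch for the dtype loss (illustrative, not code from this PR).
import torch
import torchvision

# Quantized resnet18 from torchvision; "qresnet18.pt" is a throwaway path.
model = torchvision.models.quantization.resnet18(quantize=True).eval()
inp = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, inp)

torch.jit.save(traced, "qresnet18.pt")
loaded = torch.jit.load("qresnet18.pt")

# In the freshly traced graph, the output of aten::quantize_per_tensor is
# annotated as a quantized tensor; in the loaded graph, the same value is
# typed as plain "Tensor" (a float tensor), as reported in
# pytorch/pytorch#39690.
print(traced.inlined_graph)
print(loaded.inlined_graph)
```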
The quantized resnet in torchvision uses `aten::adaptive_avg_pool2d`, so right now, if we save the quantized resnet and load it back, we get garbage results.
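Since the graph's type annotations can't be trusted after deserialization, one plausible shape for the workaround is to re-derive which values are quantized by propagating from `aten::quantize_per_tensor` outputs through the graph. The sketch below is only an illustration of that idea; `infer_quantized_values` is a hypothetical helper, not necessarily what this PR implements:

```python
# Hypothetical helper sketching the workaround idea; not this PR's code.
def infer_quantized_values(graph):
    """Return debug names of values assumed to hold quantized tensors,
    ignoring the (lost) dtype annotations in a deserialized graph."""
    quantized = set()
    # TorchScript graph nodes come in topological order, so a single
    # forward pass suffices for this straight-line sketch (control-flow
    # blocks are ignored for simplicity).
    for node in graph.nodes():
        kind = node.kind()
        if kind == "aten::quantize_per_tensor":
            quantized.update(o.debugName() for o in node.outputs())
        elif kind == "aten::dequantize":
            continue  # output goes back to float
        elif any(i.debugName() in quantized for i in node.inputs()):
            # Dtype-preserving ops (e.g. adaptive_avg_pool2d) fed by a
            # quantized tensor produce a quantized tensor.
            quantized.update(o.debugName() for o in node.outputs())
    return quantized
```

With such a set in hand, the converter could check membership instead of the broken `Tensor` annotation when deciding whether `aten::adaptive_avg_pool2d` needs its quantized conversion.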
please review @siju-samuel @anijain2305
cc @jjohnson-arm