
Add batch normalization folding to QAT quantizer #3911

Merged
9 commits merged on Jul 26, 2021

Conversation

chenbohua3 (Contributor)

This PR adds batch normalization folding to the QAT quantizer; the core ideas are described in #3890.

@@ -125,7 +125,7 @@ class QAT_Quantizer(Quantizer):
     http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf
     """

-    def __init__(self, model, config_list, optimizer=None):
+    def __init__(self, model, config_list, optimizer=None, model_inputs=None):
Contributor

Is model_inputs the same concept as dummy_input in pruning speedup and quantization speedup? If so, I recommend using dummy_input instead of model_inputs to keep them aligned.

Contributor Author

done
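
For context, here is a minimal usage sketch with the renamed argument. The config values and the MyModel class are illustrative only, and passing dummy_input as a keyword assumes the rename suggested above lands as-is:

```python
import torch
from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

model = MyModel()  # hypothetical model containing Conv2d + BatchNorm2d pairs
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Illustrative config: quantize weights and outputs of Conv2d layers to 8 bits.
config_list = [{
    'quant_types': ['weight', 'output'],
    'quant_bits': {'weight': 8, 'output': 8},
    'op_types': ['Conv2d']
}]

# dummy_input lets the quantizer trace the model and match Conv -> BN pairs for folding.
dummy_input = torch.randn(1, 3, 224, 224)
quantizer = QAT_Quantizer(model, config_list, optimizer, dummy_input=dummy_input)
quantizer.compress()
```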

@linbinskn (Contributor) commented Jul 11, 2021

Looks good. I only have one question right now: will there be any problems if we want to export a simulated model with the new BN folding feature to a backend execution engine such as TensorRT? For instance, during inference, conv+bn+relu is fused into a single op by updating the conv's weight/bias parameters with the BN parameters. However, with this change the conv's weights are already equal to the fused weights while the BN layer still exists. If the problem actually exists, maybe we can discuss an appropriate method to resolve it.
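
For reference, the conv+BN fusion arithmetic this comment refers to is the standard folding below (a generic sketch, not the code from this PR):

```python
import torch

def fuse_conv_bn_params(conv_w, conv_b, bn_mean, bn_var, bn_gamma, bn_beta, eps=1e-5):
    """Fold BatchNorm statistics into the preceding conv's weight and bias.

    BN(conv(x)) = gamma * (conv(x) - mean) / sqrt(var + eps) + beta
                = conv(x) * scale + (beta - mean * scale), with scale = gamma / sqrt(var + eps)
    """
    scale = bn_gamma / torch.sqrt(bn_var + eps)
    fused_w = conv_w * scale.reshape(-1, 1, 1, 1)   # scale each output channel
    if conv_b is None:
        conv_b = torch.zeros_like(bn_mean)
    fused_b = (conv_b - bn_mean) * scale + bn_beta
    return fused_w, fused_b
```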

@chenbohua3 (Contributor Author)

You are right. I have added some logic to restore the folded weight/bias in export_model.
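
A minimal sketch of that restore step, assuming the wrapper keeps a copy of the pre-folding parameters (the attribute names here are hypothetical, not necessarily what the PR uses):

```python
def restore_folded_params(wrapper):
    """Put the original (unfused) conv weight/bias back before export so a
    backend such as TensorRT can perform its own conv+BN fusion."""
    conv = wrapper.module
    if hasattr(wrapper, 'original_weight'):        # original_weight: assumed attribute
        conv.weight.data.copy_(wrapper.original_weight)
    if getattr(wrapper, 'original_bias', None) is not None:
        conv.bias.data.copy_(wrapper.original_bias)
```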

@linbinskn (Contributor)

Please update the documentation for BN folding in "Supported Quantization Algorithms on NNI".

@chenbohua3 (Contributor Author)

The documentation for BN folding has been added.

@QuanluZhang reopened this Jul 19, 2021
-    def fold_bn(self, config, **kwargs):
-        # TODO simulate folded weight
-        pass
+    def fold_bn(self, *inputs, wrapper):
Contributor

Is this function specific to QAT_Quantizer? Might other quantizers need a different fold_bn function?

Contributor Author

This function should also work well for other quantizers (at least for the LSQ quantizer, I think :)). I will make it a common utility function in the PR that enables batch normalization folding for other quantizers.
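
As a rough illustration of what such a shared utility could look like, here is a sketch that computes the folded weight/bias from a wrapped conv and its associated BN module (wrapper.bn_module is an assumed attribute, and the PR's implementation may use the current batch's statistics rather than the running ones):

```python
import torch

def fold_bn(*inputs, wrapper):
    """Return the conv weight/bias with the following BatchNorm folded in.

    Sketch only; `inputs` is unused here but kept to match the signature above.
    """
    conv, bn = wrapper.module, wrapper.bn_module   # bn_module: assumed attribute
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    folded_weight = conv.weight * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    folded_bias = bn.bias + (bias - bn.running_mean) * scale
    return folded_weight, folded_bias
```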

@QuanluZhang merged commit 7fc5af0 into microsoft:master on Jul 26, 2021