Add batch normalization folding to QAT quantizer #3911
Conversation
@@ -125,7 +125,7 @@ class QAT_Quantizer(Quantizer):
     http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf
     """

-    def __init__(self, model, config_list, optimizer=None):
+    def __init__(self, model, config_list, optimizer=None, model_inputs=None):
Is model_inputs the same concept as dummy_input in pruning speedup and quantization speedup? If so, I recommend using dummy_input instead of model_inputs to keep them aligned.
done
Looks good. I only have one question right now. Are there any problems if we want to export the simulated model with the new BN folding feature to a backend execution engine such as TensorRT? For instance, during inference, conv+bn+relu will be fused into a single op by updating the conv's weight/bias parameters with the bn parameters. However, currently our conv's weights are already equal to the fused weights while the bn layer still exists. If the problem actually exists, maybe we can discuss an appropriate method to resolve it.
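For reference, the fusion arithmetic described above (scaling the conv weight per output channel by the BN statistics and shifting the bias accordingly) could be sketched roughly as follows; `fold_conv_bn_weights` is an illustrative helper, not the exact code in this PR:

```python
import torch

def fold_conv_bn_weights(conv_w, conv_b, bn_mean, bn_var, bn_eps, bn_gamma, bn_beta):
    # Fold BatchNorm statistics into the preceding conv's weight and bias,
    # which is what a backend does when it fuses conv+bn into a single op.
    if conv_b is None:
        conv_b = torch.zeros_like(bn_mean)
    scale = bn_gamma / torch.sqrt(bn_var + bn_eps)      # per-output-channel scale
    folded_w = conv_w * scale.reshape(-1, 1, 1, 1)      # scale each output channel
    folded_b = (conv_b - bn_mean) * scale + bn_beta     # shift the bias accordingly
    return folded_w, folded_b
```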
You are right. I have added some code logic to restore the folded weight/bias in …
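A rough sketch of what such restore logic could look like, assuming the unfolded conv parameters were cached on the wrapper before folding (the attribute names below are hypothetical, not the PR's actual code):

```python
def restore_unfolded_parameters(wrapper):
    # Hypothetical helper: before export, put back the conv parameters that
    # were cached prior to BN folding, so the exported model keeps separate
    # conv and bn layers that a backend such as TensorRT can fuse itself.
    module = wrapper.module
    module.weight.data.copy_(wrapper.original_weight)
    if getattr(wrapper, 'original_bias', None) is not None:
        module.bias.data.copy_(wrapper.original_bias)
```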
Please update the content about bn folding in the doc Supported Quantization Algorithms on NNI.
The content about bn folding has been added.
-    def fold_bn(self, config, **kwargs):
-        # TODO simulate folded weight
-        pass
+    def fold_bn(self, *inputs, wrapper):
Is this function QAT_Quantizer specific? Might other quantizers need a different fold_bn function?
This function should also work well for other quantizers (at least for the LSQ quantizer, I think :) ). I will make it a common utility function in the PR that enables batch normalization folding for other quantizers.
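As a rough illustration, a quantizer-agnostic folding hook with the signature from the diff above might look like the following; attribute names such as `wrapper.module` and `wrapper.bn_module` are assumptions for this sketch, not NNI's actual API:

```python
import torch
import torch.nn.functional as F

def fold_bn(*inputs, wrapper):
    # Sketch of a quantizer-agnostic folding hook.
    conv = wrapper.module          # the wrapped conv layer (assumed attribute)
    bn = wrapper.bn_module         # the BatchNorm layer that follows it (assumed attribute)
    # Run the un-folded conv and feed it through BN so that BN's running
    # statistics keep being updated during QAT training.
    conv_out = F.conv2d(inputs[0], conv.weight, conv.bias,
                        conv.stride, conv.padding, conv.dilation, conv.groups)
    _ = bn(conv_out)
    # Fold the BN parameters into the conv weight/bias; these folded tensors are
    # what the quantizer fake-quantizes, mirroring inference-time fusion.
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    folded_weight = conv.weight * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    folded_bias = (bias - bn.running_mean) * scale + bn.bias
    return folded_weight, folded_bias
```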
This PR adds batch normalization folding to the QAT quantizer; the core ideas are described in #3890.
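For context, a minimal usage sketch after this change might look like the following; the import path assumes NNI 2.x, the toy model and config values are illustrative, and the keyword follows the `dummy_input` rename suggested in the review above:

```python
import torch
import torch.nn as nn
from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

# A toy model with a conv followed by batch normalization, the pattern BN folding targets.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
config_list = [{
    'quant_types': ['weight', 'output'],
    'quant_bits': {'weight': 8, 'output': 8},
    'op_types': ['Conv2d']
}]
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# The dummy input lets the quantizer trace the model and locate conv/bn pairs to fold.
dummy_input = torch.randn(1, 3, 32, 32)
quantizer = QAT_Quantizer(model, config_list, optimizer, dummy_input=dummy_input)
quantizer.compress()
```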