Add bn fold to quantization-aware training quantizers #3890
chenbohua3 started this conversation in New Feature Design Discussion
At present, many backend frameworks enable BN folding by default, and some of them adopt a per-tensor quantization scheme for the weight tensors. The lack of a BN-fold function in quantization-aware training will therefore cause two problems.
We are planning to add the BN-fold function to the quantization-aware training quantizers, especially `QAT_Quantizer` and `LsqQuantizer`.
The core idea of the BN-fold function is to simulate this folding process in the training graph; the details are described in Section 3.2 and the appendix of the QAT paper.
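For reference, the folding arithmetic itself (for a conv layer followed by a BatchNorm) is just a per-output-channel rescaling of the conv weight plus an adjusted bias. Below is a minimal sketch with illustrative names, not anything from the NNI code base:

```python
import torch

def fold_bn_into_conv(conv_w, conv_b, bn_gamma, bn_beta, bn_mean, bn_var, bn_eps=1e-5):
    # Standard BN folding: scale each output channel of the conv weight by
    # gamma / sqrt(var + eps) and fold the BN shift into the bias.
    std = torch.sqrt(bn_var + bn_eps)
    scale = bn_gamma / std                          # shape: [out_channels]
    folded_w = conv_w * scale.reshape(-1, 1, 1, 1)  # conv_w: [out, in, kH, kW]
    if conv_b is None:
        conv_b = torch.zeros_like(bn_mean)
    folded_b = bn_beta + (conv_b - bn_mean) * scale
    return folded_w, folded_b
```

PyTorch ships a helper with the same arithmetic, `torch.nn.utils.fusion.fuse_conv_bn_weights`, which could probably be reused for the static (inference-time) case.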
In our implementation, there will be three steps for this function:

1. Find the conv-bn groups: we can use `TorchModuleGraph` and `successors` to find out the groups. To do so, we need to pass a new parameter (e.g. `dummy_input`) to the `__init__` function of the quantizer to get the torchscript graph. A generic sketch of this matching is given after the list.
2. Attach each matched bn module to the corresponding `QuantizerModuleWrapper`; this step may happen when instantiating `QuantizerModuleWrapper` or when traversing all the wrappers.
3. Simulate the folding in the `forward` function of `QuantizerModuleWrapper`, including two steps; a rough sketch of this forward pass is also given below.
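To make step 1 concrete, here is a generic sketch of the conv-bn matching idea. It deliberately uses `torch.fx` rather than NNI's `TorchModuleGraph`, since the exact `TorchModuleGraph`/`successors` API is not shown above; all names here are illustrative only:

```python
import torch.fx
import torch.nn as nn

def find_conv_bn_pairs(model: nn.Module):
    """Return (conv_name, bn_name) pairs where a BatchNorm directly consumes a Conv output."""
    graph_module = torch.fx.symbolic_trace(model)
    modules = dict(graph_module.named_modules())
    pairs = []
    for node in graph_module.graph.nodes:
        # Only look at calls to BatchNorm2d submodules.
        if node.op != 'call_module' or not isinstance(modules[node.target], nn.BatchNorm2d):
            continue
        producer = node.args[0]
        # Pair the bn with the conv that feeds it directly.
        if isinstance(producer, torch.fx.Node) and producer.op == 'call_module' \
                and isinstance(modules.get(producer.target), nn.Conv2d):
            pairs.append((producer.target, node.target))
    return pairs
```

For step 3, here is a rough sketch of a fold-then-quantize forward pass. It folds only the BN running statistics and then fake-quantizes the folded weight; whether these are exactly the two sub-steps intended above is an assumption on my part, and the QAT paper's full recipe additionally handles batch statistics during training. `FoldedConvBn` and `fake_quantize` are placeholders, not NNI's `QuantizerModuleWrapper`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FoldedConvBn(nn.Module):
    """Illustrative wrapper around an existing conv/bn pair (not NNI's QuantizerModuleWrapper)."""
    def __init__(self, conv: nn.Conv2d, bn: nn.BatchNorm2d, fake_quantize):
        super().__init__()
        self.conv, self.bn, self.fake_quantize = conv, bn, fake_quantize

    def forward(self, x):
        # Sub-step 1: fold the BN running statistics into the conv weight and bias.
        scale = self.bn.weight / torch.sqrt(self.bn.running_var + self.bn.eps)
        folded_w = self.conv.weight * scale.reshape(-1, 1, 1, 1)
        conv_b = self.conv.bias if self.conv.bias is not None \
            else torch.zeros_like(self.bn.running_mean)
        folded_b = self.bn.bias + (conv_b - self.bn.running_mean) * scale
        # Sub-step 2: fake-quantize the folded weight, then run the convolution with it.
        q_w = self.fake_quantize(folded_w)
        return F.conv2d(x, q_w, folded_b, stride=self.conv.stride,
                        padding=self.conv.padding, dilation=self.conv.dilation,
                        groups=self.conv.groups)
```

In the real wrapper, the `fake_quantize` call would simply be whatever `QAT_Quantizer` or `LsqQuantizer` already applies to weights, so folding only changes which tensor gets quantized.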
This implementation prototype works well for us, looking forward to your suggestions :)

Replies: 1 comment

- As discussed offline, the current implementation is OK. I will upload a PR about bn-fold. New comments are welcome.