This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

support dtype&scheme customization for QAT quantizer #4137

Merged
merged 16 commits into microsoft:master from the support_dtype_scheme branch on Oct 13, 2021

Conversation

chenbohua3
Contributor

No description provided.

@chenbohua3 chenbohua3 changed the title support dtype&scheme customization for QAT quantizer [WIP]support dtype&scheme customization for QAT quantizer Sep 1, 2021
@chenbohua3
Contributor Author

chenbohua3 commented Sep 1, 2021

I will rebase onto master once #4127 is merged.

Have rebased

@chenbohua3 chenbohua3 changed the title [WIP]support dtype&scheme customization for QAT quantizer support dtype&scheme customization for QAT quantizer Sep 8, 2021
@@ -155,7 +155,7 @@
grad_output : Tensor
量化操作输出的梯度 (gradient of the output of the quantization operation)
quant_type : QuantType
Contributor

Please remove all modifications to the Chinese documentation, since the Chinese docs are generated automatically in a separate pipeline.

Contributor Author

done

rmax = torch.max(rmax, torch.zeros_like(rmax))
zero_point = torch.zeros_like(rmin)

# todo: there is no need to calculate qmin and qmax again
Contributor

Agree. Maybe we can put them into the wrapper.

input_shape, output_shape = self.all_shapes[name]
layer_quant_setting = LayerQuantSetting(config)
layer_quant_setting.ema_decay = 0.99
quant_start_step = config.get('quant_start_step', 0)
Contributor

Since layer_quant_setting = LayerQuantSetting(config) already uses config, should we get quant_start_step directly during LayerQuantSetting initialization?

Contributor Author

quant_start_step is now a QAT_Quantizer-specific parameter. I think it is better to set it after LayerQuantSetting initialization.

Contributor

@J-shang what do you think about it?

Contributor

I prefer getting it from LayerQuantSetting._extra_layer_setting if possible.

Contributor

Agree, I think it can be put into LayerQuantSetting, because this class can be a universal class for all quantizers (it may be part of the design for quantization v2), and quant_start_step is a universal parameter in the config in the original quantization design; all universal parameters should go into this class except parameters that change during training.
For this PR, I think the current implementation is enough.

if not wrapper.training:
scale, zero_point = module.weight_scale, module.weight_zero_point
weight = self._quantize(weight, scale, zero_point, qmin, qmax)
Contributor

We only need to quantize the weight at the first inference epoch. Suggest avoiding the unnecessary computation in some way, such as adding a specific tag here.

Contributor Author

@chenbohua3 chenbohua3 Sep 15, 2021

I am afraid we must quantize the weight in each iteration, because we cannot update the weight in place. So if we use DataParallel, the update would be lost after each iteration.

module.tracked_min_input.copy_(tracked_min_input)
module.tracked_max_input.copy_(tracked_max_input)

if quant_start_step > int(self.bound_model.steps):
Contributor

Why delete the original logic that helps initialize tracked_min_input and tracked_max_input here? According to earlier test results, keeping it performs better or converges faster than removing it.

Contributor

I see, it will be kept whether quant_start_step > int(self.bound_model.steps) or not.

if quant_start_step > int(self.bound_model.steps):
return inputs

tracked_min_input = update_ema(module.tracked_min_input, current_min, ema_decay)
Contributor

It seems lines 595-598 repeat what lines 587-590 do.

Contributor Author

have removed them

@@ -604,10 +607,13 @@ class Quantizer(Compressor):
def __init__(self, model, config_list, optimizer=None, dummy_input=None):
if isinstance(model, torch.nn.DataParallel):
model = model.module
model_copied = copy.deepcopy(model)

This comment was marked as resolved.

Contributor Author

Because we use the quantizer-wrapped model to determine which layers' shapes should be recorded, and recording the shapes registers some hooks on the model, I think it is better to use a copied model for shape recording.
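For illustration, a minimal sketch of this shape-recording idea, assuming forward hooks on a deep copy and the layer types mentioned later in this conversation (not the exact record_shape implementation):

import copy
import torch

def record_shape(model, dummy_input):
    # Work on a deep copy so the hooks never touch the model that will be wrapped.
    model_copied = copy.deepcopy(model)
    all_shapes, handles = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            # Store (input_shape, output_shape) for this layer.
            all_shapes[name] = (list(inputs[0].shape), list(output.shape))
        return hook

    for name, module in model_copied.named_modules():
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear, torch.nn.ReLU, torch.nn.ReLU6)):
            handles.append(module.register_forward_hook(make_hook(name)))

    with torch.no_grad():
        model_copied(dummy_input)

    for handle in handles:
        handle.remove()
    return all_shapes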

Contributor

Got it.

@@ -793,25 +799,54 @@ def find_conv_bn_patterns(self, model, dummy_input):
if successor.op_type == 'BatchNorm2d':
self.conv_bn_patterns[node_group.name] = successor.name

def step_with_optimizer(self):
pass
def record_shape(self, model, dummy_input):
Contributor

It seems to be a universal function that is not specific to the quantizer. Maybe we should put it somewhere else, like utils.py?

Contributor Author

There already exist compressor-specific util functions (like get_modules_to_compress), and currently this one is only used by quantization. Making this function a method of the quantizer may be a good choice :)

Contributor

Got it.

bits = QuantGrad.get_bits_length(wrapper.config, quant_type)
qmin, qmax = 0, (1 << bits) - 1

scale_name, zero_point_name = quant_type.type_to_scale_zero_point_name()
Contributor

It is strange that we get scale_name and zero_point_name this way, while we register them on the module directly with names like 'weight_scale' and 'weight_zero_point'. So what is the purpose of type_to_scale_zero_point_name()?

Contributor Author

Different quant_types correspond to different scale & zero_point names, e.g. weight_scale, input_scale and output_scale. In order to get them, we must map quantization types to the scale & zero_point names. type_to_scale_zero_point_name is just for code simplicity; do you have a better idea for this?
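For illustration, a hedged sketch of such a mapping (the enum values and method body here are assumptions based on this thread, not necessarily the actual implementation):

from enum import Enum

class QuantType(Enum):
    INPUT = 'input'
    WEIGHT = 'weight'
    OUTPUT = 'output'

    def type_to_scale_zero_point_name(self):
        # Map each quantization target to the buffer names registered on the module,
        # e.g. WEIGHT -> ('weight_scale', 'weight_zero_point').
        return '{}_scale'.format(self.value), '{}_zero_point'.format(self.value)

With this, getattr(module, scale_name) and getattr(module, zero_point_name) retrieve the buffers registered as 'weight_scale', 'weight_zero_point', and so on.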

@QuanluZhang QuanluZhang requested a review from J-shang September 22, 2021 06:32
self._fields[k] = v

def __setattr__(self, name: str, val: Any) -> None:
if name.startswith("_"):
Contributor

Is this just for _fields?

Contributor Author

yes
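For illustration, a minimal sketch of the pattern in the snippet above, assuming the class stores user-facing settings in a _fields dict (the class name and details here are assumptions):

from typing import Any

class TensorQuantSetting:
    def __init__(self, **kwargs):
        self._fields = {}            # starts with "_", so stored as a normal attribute
        for k, v in kwargs.items():
            self._fields[k] = v

    def __setattr__(self, name: str, val: Any) -> None:
        if name.startswith("_"):
            # Internal attributes (currently just _fields) bypass the settings dict.
            super().__setattr__(name, val)
        else:
            self._fields[name] = val

    def __getattr__(self, name: str) -> Any:
        # Reached only when normal lookup fails, i.e. for keys stored in _fields.
        fields = self.__dict__.get('_fields', {})
        if name in fields:
            return fields[name]
        raise AttributeError(name)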

@J-shang
Contributor

J-shang commented Sep 22, 2021

please fix the conflict

@chenbohua3 chenbohua3 force-pushed the support_dtype_scheme branch from 4628fce to 05ab608 Compare October 8, 2021 11:44
@chenbohua3
Contributor Author

have resolved the conflict

self.bound_model.to(device)

def _del_simulated_attr(self, module):
"""
delete redundant parameters in quantize module
"""
del_attr_list = ['old_weight', 'old_bias', 'ema_decay', 'tracked_min_output', 'tracked_max_output',
'tracked_min_input', 'tracked_max_input', 'scale', 'zero_point', 'weight_bits',
'output_bits', 'BN_FOLD_TAG', 'input_bits']
'tracked_min_input', 'tracked_max_input', 'weight_bits', 'output_bits', 'BN_FOLD_TAG',
Contributor

Maybe we can delete 'input_bits', 'weight_bits', 'output_bits' here as well, since they have been removed from the module parameters.

Contributor Author

done

weight_bits = get_bits_length(config, 'weight')
layer.module.register_buffer('weight_bits', torch.Tensor([int(weight_bits)]))
quant_shape = get_quant_shape(module.weight.shape, QuantType.WEIGHT, layer_quant_setting.weight.quant_scheme)
module.register_buffer('weight_scale', torch.zeros(quant_shape))
Contributor

Why keep 'scale' and 'zero_point' in the module parameters while removing the bits? To some degree they are at the same level and should be kept in the same place.

Contributor Author

scale and zero_point are tensors that will be used in the autograd graph, while bits will not. So I put these "non-autograd" attributes in the layer/tensor settings.
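For illustration, a hedged sketch of how a helper like get_quant_shape might pick the scale/zero_point shape from the quant scheme (the scheme names and channel axes here are assumptions, not necessarily the conventions used in the PR):

def get_quant_shape(shape, quant_type, quant_scheme):
    # Per-tensor schemes keep a single scale/zero_point; per-channel schemes keep one
    # entry per channel (assumed axis: 0 for weights, 1 for activations).
    if quant_scheme.startswith('per_tensor'):
        return [1]
    axis = 0 if quant_type == QuantType.WEIGHT else 1
    return [int(shape[axis])]

# e.g. quant_shape = get_quant_shape(module.weight.shape, QuantType.WEIGHT, 'per_channel_symmetric')
#      module.register_buffer('weight_scale', torch.zeros(quant_shape))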

@@ -347,7 +348,8 @@ def test_torch_QAT_quantizer(self):
model.relu = torch.nn.ReLU()

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
quantizer = torch_quantizer.QAT_Quantizer(model, config_list, optimizer)
dummy = torch.randn(1, 1, 28, 28)
quantizer = torch_quantizer.QAT_Quantizer(model, config_list, optimizer, dummy_input=dummy)
Contributor

We need to use a config_list like [{..., 'quant_scheme': ..., 'quant_dtype': ...}] here to test dtype and scheme.

Contributor Author

I added a stand-alone unit test for scheme and dtype.
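As a concrete illustration of the customization under test, a hedged example of such a config_list (the accepted values for 'quant_dtype'/'quant_scheme' are assumptions based on this thread; model, optimizer and torch_quantizer are as in the test snippet above):

config_list = [{
    'quant_types': ['weight', 'input', 'output'],
    'quant_bits': {'weight': 8, 'input': 8, 'output': 8},
    'op_types': ['Conv2d', 'Linear'],
    # new in this PR: per-layer dtype and scheme customization
    'quant_dtype': 'int',
    'quant_scheme': 'per_channel_symmetric',
}]

dummy = torch.randn(1, 1, 28, 28)
quantizer = torch_quantizer.QAT_Quantizer(model, config_list, optimizer, dummy_input=dummy)
quantizer.compress()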



# Just show each attribute's name, no practical effect
class QuantConfigLiteral(str, _QuantLiteralEnum):
Contributor

It seems not to be used; why keep it?

Contributor Author

To tell developers the name of each attribute :)

qmin, qmax = -2 ** (bits - 1) + 1, 2 ** (bits - 1) - 1
elif dtype == QuantDtype.UINT:
qmin, qmax = 0, 2 ** bits - 1
else:
Contributor

Suggest raising a TypeError here.

Contributor Author

done
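The resolved suggestion amounts to something like this sketch (the INT branch and the helper name are assumptions inferred from the snippet above):

def calculate_qmin_qmax(bits, dtype):
    # Signed integers use a symmetric range; unsigned integers start at 0.
    if dtype == QuantDtype.INT:
        qmin, qmax = -2 ** (bits - 1) + 1, 2 ** (bits - 1) - 1
    elif dtype == QuantDtype.UINT:
        qmin, qmax = 0, 2 ** bits - 1
    else:
        raise TypeError("Unsupported quant dtype: {}".format(dtype))
    return qmin, qmax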


return scale, nudged_zero_point
zero_point = torch.clamp(zero_point, qmin, qmax)
Contributor

If the scheme is affine, should we clamp zero_point between qmin and qmax?

Contributor Author

Yes, this follows the code in the PyTorch repo.
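For reference, a hedged sketch of the affine qparams computation this clamp belongs to, following the common fake-quantization formulation (not necessarily line-for-line what the PR implements):

import torch

def calculate_affine_qparams(rmin, rmax, qmin, qmax):
    # Make sure zero is representable, then derive scale and the nudged zero point.
    rmin = torch.min(rmin, torch.zeros_like(rmin))
    rmax = torch.max(rmax, torch.zeros_like(rmax))
    scale = (rmax - rmin) / float(qmax - qmin)
    scale = torch.clamp(scale, min=torch.finfo(torch.float32).eps)
    zero_point = qmin - torch.round(rmin / scale)
    # Clamp as discussed above so the zero point stays inside the quantized range.
    zero_point = torch.clamp(zero_point, qmin, qmax)
    return scale, zero_point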

# The weight cannot be modified in place, so when torch.nn.DataParallel is used, this update
# will be lost after each forward pass. However, the update does take effect on each
# replicated module during each forward pass, which makes the quantized weight
# be used correctly.
Contributor

Why add this comment? Which modification is the in-place modification?

Contributor Author

@chenbohua3 chenbohua3 Oct 11, 2021

Like wrapper.module.weight.copy_(weight); this is not allowed in PyTorch.
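A small self-contained illustration of the constraint being described (the error text is paraphrased):

import torch

w = torch.nn.Parameter(torch.randn(4, 4))
q = torch.round(w * 10) / 10   # stand-in for a fake-quantized weight

# w.copy_(q) raises a RuntimeError, because a leaf Variable that requires grad
# cannot be used in an in-place operation. Hence the wrapper re-assigns the
# quantized tensor out of place (module.weight = q) on every forward, and under
# torch.nn.DataParallel that assignment only lives inside each replica.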


# layer-wise settings
quant_start_step = layer_quant_setting.quant_start_step
ema_decay = layer_quant_setting.ema_decay
Contributor

Is ema_decay only for the input and output tensors?

Contributor Author

yes
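For clarity, a sketch of the EMA update used for the input/output ranges (the exact bias handling in update_ema may differ):

def update_ema(biased_ema, value, decay):
    # Exponential moving average of the observed activation range; weights do not
    # need this because their min/max are observed exactly at every step.
    return biased_ema * decay + (1 - decay) * value

# e.g. tracked_min_input = update_ema(module.tracked_min_input, current_min, ema_decay)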

@@ -0,0 +1,3 @@
import torch

TORCH_VERSION = tuple(int(x) for x in torch.__version__.split(".")[:2])
Contributor

Does it require torch to be installed, even when only using HPO?

Contributor Author

Added some guard logic.
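A minimal sketch of the kind of guard being referred to, so importing this constant does not hard-fail when torch is absent (the None fallback is an assumption):

try:
    import torch
    TORCH_VERSION = tuple(int(x) for x in torch.__version__.split(".")[:2])
except ImportError:
    # torch stays optional for users who only need HPO; callers should treat None
    # as "torch not available" before comparing versions.
    TORCH_VERSION = None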

scale = getattr(module, scale_name)
zero_point = getattr(module, zero_point_name)
self.assertTrue(list(scale.shape) == quant_shape)
self.assertTrue(list(zero_point.shape) == quant_shape)
Contributor

This test is weak; better to add more tests, for example testing whether a tensor quantized with different dtypes/schemes has the expected values.

Contributor Author

Added some value checks for scales & zero_points

# TODO: may relax this limitation?
assert name in self.all_shapes, "Could not found shapes for layer {}".format(name)
input_shape, output_shape = self.all_shapes[name]
layer_quant_setting = LayerQuantSetting(config)
Contributor

We should properly support other ops, such as linear, or we can report a warning message for unsupported ops.

Contributor Author

QAT_Quantizer now only quantizes Conv2d/Linear/ReLU/ReLU6. And when quantizing the input/output of a Linear layer in per-channel style, a rank-2 check is performed.

@QuanluZhang
Contributor

@chenbohua3 this PR looks good, please update the docs (e.g., the QAT quantizer doc) accordingly.

@QuanluZhang QuanluZhang merged commit c9cd53a into microsoft:master Oct 13, 2021