Conversation

@peri044 (Contributor) commented on Jul 1, 2020

This PR adds the following:

  1. Adds QAT support to the existing RN50 codebase. QAT is performed with the tf.quantization.quantize_and_dequantize (QDQ) op rather than the fake quant ops TensorFlow introduces by default; QDQ implements scale-only quantization, which is the scheme TensorRT supports (a minimal QDQ sketch follows this list).
  2. Adds training instructions for QAT.
  3. Adds support for using a 1x1 convolution layer as the final classification layer. This works around a TensorRT limitation: a MatMul layer is not allowed after a dequantize node in a QAT network. A post-processing script converts the FC layer weights into 1x1 conv layer weights and rewrites the checkpoint (a reshape sketch also follows this list).
  4. Adds a frozen graph script to export frozen graphs that can be used for TensorRT inference (see the export sketch after the workflow note below).
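
As an illustration of point 1, here is a minimal sketch of wrapping a tensor in a QDQ node, assuming TF 1.x and a hypothetical fixed calibration range (`quant_range` is a placeholder); the actual PR wires the quantization into the RN50 model definition itself.

```python
import tensorflow as tf

def qdq(x, quant_range=6.0, num_bits=8):
    """Insert a quantize-and-dequantize (QDQ) node on tensor x.

    QDQ rounds x to num_bits symmetric levels and immediately
    dequantizes it, so training sees the quantization error while
    the graph stays in float. range_given=True pins the scale to
    the supplied [-quant_range, quant_range] interval.
    """
    return tf.quantization.quantize_and_dequantize(
        x,
        input_min=-quant_range,
        input_max=quant_range,
        signed_input=True,
        num_bits=num_bits,
        range_given=True)
```

And a sketch of the FC-to-1x1-conv weight rewrite from point 3, assuming the FC kernel is stored as [in_channels, num_classes]; the function name and shapes are illustrative, and the PR's post-processing script is the authoritative version that operates on the real checkpoint.

```python
import numpy as np

def fc_kernel_to_conv1x1(fc_kernel):
    """Reshape an FC kernel [C, N] into a 1x1 conv kernel [1, 1, C, N].

    The weight values are unchanged: a 1x1 convolution over a 1x1
    spatial map computes exactly the same logits as the original
    MatMul, but TensorRT accepts it after a dequantize node.
    """
    in_channels, num_classes = fc_kernel.shape
    return fc_kernel.reshape(1, 1, in_channels, num_classes)
```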

This workflow, as adopted for RN50, is similar to the one described in https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#qat-tf
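
For point 4, a minimal sketch of exporting a frozen graph, assuming TF 1.x; the checkpoint prefix and output node name are placeholders, and the PR's script is the authoritative version.

```python
import tensorflow as tf

def export_frozen_graph(checkpoint_prefix, output_node, out_path):
    """Freeze a trained checkpoint into a single GraphDef file."""
    with tf.compat.v1.Session() as sess:
        # Rebuild the graph from the checkpoint's MetaGraph and restore weights.
        saver = tf.compat.v1.train.import_meta_graph(checkpoint_prefix + ".meta")
        saver.restore(sess, checkpoint_prefix)
        # Fold variables into constants so the graph is self-contained.
        frozen = tf.compat.v1.graph_util.convert_variables_to_constants(
            sess, sess.graph.as_graph_def(), [output_node])
    with tf.io.gfile.GFile(out_path, "wb") as f:
        f.write(frozen.SerializeToString())
```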

@nv-kkudrynski merged commit 37672df into NVIDIA:master on Jul 8, 2020.
PeganovAnton pushed a commit to PeganovAnton/DeepLearningExamples referencing this pull request on Sep 8, 2020: "Add quantization aware training (QAT) support for Resnet 50".