
Add quantization aware training (QAT) support for Resnet 50 #584

Merged
8 commits merged on Jul 8, 2020

Conversation

@peri044 (Contributor) commented Jul 1, 2020

This PR adds the following:

  1. Add QAT support to the existing RN50 codebase. QAT is performed with the tf.quantization.quantize_and_dequantize (QDQ) operation instead of the fake-quant ops that TensorFlow inserts by default. QDQ performs scale-only quantization, which is what TensorRT supports (a sketch of inserting a QDQ node follows this list).
  2. Training instructions for QAT.
  3. Add support for using a 1x1 convolution layer as the final classification layer. This works around a TensorRT limitation that does not allow a MatMul layer after a dequantize node in a QAT network. A post-processing script converts the FC layer weights into 1x1 conv layer weights and rewrites the checkpoint (a conversion sketch also follows this list).
  4. A frozen-graph export script that produces frozen graphs usable for TensorRT inference (see the sketch after the workflow note below).
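
A minimal sketch of how a QDQ node can wrap a tensor, assuming the TF 1.x API the RN50 codebase used at the time; the helper name, quantization ranges, and variable names are illustrative, not the PR's actual code:

```python
import tensorflow as tf

def qdq(x, quant_min, quant_max, num_bits=8, name=None):
    # Scale-only quantize-and-dequantize node; TensorRT derives the
    # per-tensor scale from the recorded (quant_min, quant_max) range.
    return tf.quantization.quantize_and_dequantize(
        x,
        input_min=quant_min,
        input_max=quant_max,
        signed_input=True,
        num_bits=num_bits,
        range_given=True,  # use the supplied range rather than per-batch min/max
        name=name)

# Illustrative use: fake-quantize a conv layer's input and kernel.
x = tf.placeholder(tf.float32, [None, 56, 56, 64])
w = tf.get_variable('conv1/kernel', shape=[3, 3, 64, 64])
y = tf.nn.conv2d(qdq(x, -6.0, 6.0, name='conv1/qdq_in'),
                 qdq(w, -1.0, 1.0, name='conv1/qdq_w'),
                 strides=[1, 1, 1, 1], padding='SAME')
```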
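
For item 3, the conversion itself is a reshape: a dense layer y = xW with W of shape [C, num_classes] is equivalent to a 1x1 convolution whose kernel is W reshaped to [1, 1, C, num_classes]. Below is a minimal TF 1.x checkpoint-rewrite sketch; the variable names are assumptions, not the names in the actual RN50 checkpoint:

```python
import tensorflow as tf

def fc_to_1x1_conv_ckpt(ckpt_in, ckpt_out,
                        fc_kernel='fc1000/kernel',       # assumed FC weight name
                        conv_kernel='conv1000/kernel'):  # assumed 1x1 conv weight name
    reader = tf.train.load_checkpoint(ckpt_in)
    new_vars = {}
    for name in reader.get_variable_to_shape_map():
        value = reader.get_tensor(name)
        if name == fc_kernel:
            # [C, num_classes] -> [1, 1, C, num_classes]
            new_vars[conv_kernel] = tf.Variable(value.reshape(1, 1, *value.shape))
        else:
            new_vars[name] = tf.Variable(value)
    saver = tf.train.Saver(new_vars)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, ckpt_out)
```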

This workflow (as adopted for RN50) is similar to the one described in https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#qat-tf
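
For the frozen-graph step (item 4), a minimal TF 1.x export sketch; the checkpoint prefix and output node name are placeholders and would need to match the actual trained graph:

```python
import tensorflow as tf

def export_frozen_graph(ckpt_prefix, pb_path, output_node='resnet50/output/softmax'):
    with tf.Session(graph=tf.Graph()) as sess:
        # Rebuild the graph from the checkpoint's metagraph and restore the weights.
        saver = tf.train.import_meta_graph(ckpt_prefix + '.meta')
        saver.restore(sess, ckpt_prefix)
        # Bake variables into constants so the QDQ graph can be handed to TensorRT.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph_def, [output_node])
        with tf.io.gfile.GFile(pb_path, 'wb') as f:
            f.write(frozen.SerializeToString())
```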

@nv-kkudrynski nv-kkudrynski merged commit 37672df into NVIDIA:master Jul 8, 2020
PeganovAnton pushed a commit to PeganovAnton/DeepLearningExamples that referenced this pull request Sep 8, 2020
Add quantization aware training (QAT) support for Resnet 50