Huge performance decrease by quantization #13720
Comments
From your link:
But in my case, it doubles the GPU memory usage. I don't think it can be considered "smaller".
FYI again, #13145 (comment)
I couldn't find any information about why GPU memory usage increased.
@ThomasDelteil do you have any data showing the memory changes of the INT8 flow? @reminisce, could you comment on this question?
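For what it's worth, one way to collect such memory numbers would be something like the sketch below. It shells out to nvidia-smi, and the checkpoint prefix and input shape are placeholder assumptions, not values taken from this thread:

```python
import subprocess
import sys
import mxnet as mx

def gpu_mem_used_mib(device_id=0):
    """Used GPU memory in MiB for one device, as reported by nvidia-smi."""
    out = subprocess.check_output(
        ['nvidia-smi', '-i', str(device_id),
         '--query-gpu=memory.used', '--format=csv,noheader,nounits'])
    return int(out.decode().strip())

def measure(prefix, shape=(1, 3, 224, 224)):
    """Bind a checkpoint on GPU 0, run one forward pass, print memory used."""
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, 0)
    mod = mx.mod.Module(symbol=sym, context=mx.gpu(0), label_names=None)
    mod.bind(data_shapes=[('data', shape)], for_training=False)
    mod.set_params(arg_params, aux_params)
    mod.forward(mx.io.DataBatch([mx.nd.ones(shape, ctx=mx.gpu(0))]))
    mx.nd.waitall()
    print(prefix, gpu_mem_used_mib(), 'MiB used after one forward pass')

if __name__ == '__main__':
    # Run the script once per checkpoint (FP32 vs INT8) so memory held by
    # MXNet's pooled allocator for the first model does not inflate the
    # reading for the second.
    measure(sys.argv[1])
```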
@mxnet-label-bot add [Operator, Performance]
@mxnet-label-bot add [Quantization]
Since there is no plan for this so far, I am closing this issue.
Original issue description
I used the code from PR #13715 and got a huge performance decrease after quantizing my model. I tested on Windows 10 with CUDA 10 and cuDNN 7 on a Titan X (Pascal), using the pre-release pip build of mxnet-cu100.
Although issue #10897 claims that INT8 quantization can save GPU memory, I saw almost 2x more VRAM usage after quantization.
Is it expected that INT8 quantization is this slow and uses more GPU memory?
I also assume that UINT8 quantization is not yet supported, since the UINT8-quantized parameters come out as signed integers.
So, is there any plan to improve INT8 quantization in the near future?
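For reference, here is a minimal sketch of roughly how the quantization was invoked, assuming the standard `mxnet.contrib.quantization.quantize_model` flow; the checkpoint prefix, calibration mode, and other settings are placeholders rather than the exact ones used for this report:

```python
import logging
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# Load the FP32 symbol and parameters (the 'model' prefix is a placeholder).
sym, arg_params, aux_params = mx.model.load_checkpoint('model', 0)

# Convert to an INT8 symbol. calib_mode='none' skips calibration; 'naive' or
# 'entropy' together with calib_data is also possible. quantized_dtype='uint8'
# is the unsigned variant asked about above.
qsym, qarg_params, qaux_params = quantize_model(
    sym, arg_params, aux_params,
    ctx=mx.gpu(0),
    excluded_sym_names=None,   # layers to keep in FP32, if any
    calib_mode='none',
    quantized_dtype='int8',
    logger=logging)

# Save the quantized model for benchmarking against the FP32 baseline.
mx.model.save_checkpoint('model-quantized', 0, qsym, qarg_params, qaux_params)
```

Benchmarking the resulting 'model-quantized' checkpoint against the FP32 baseline (for example with the memory-measurement sketch earlier in the thread) would make the roughly 2x VRAM observation easier to reproduce.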