support ernie trt-int8 for inference #32232

ceci3 · 2021-04-13T06:18:23Z

PR types

New features

PR changes

Others

Describe

Support TRT int8 inference for ERNIE.
Accuracy measured with fake quantization, with the model loaded via load_inference_model, is 0.7786.
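For readers unfamiliar with the term, fake quantization usually means rounding a float value onto the int8 grid and immediately converting it back to float, so the accuracy effect of quantization can be measured without an int8 kernel. The sketch below illustrates that idea only. It is not the code path used in this PR, and `FakeQuantInt8` and `threshold` (an assumed abs-max calibration value) are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Symmetric int8 fake quantization (illustrative sketch, not Paddle code).
// `threshold` is an assumed abs-max range for the tensor being quantized.
inline float FakeQuantInt8(float x, float threshold) {
  assert(threshold > 0.0f);
  const float scale = threshold / 127.0f;  // float value of one int8 step
  // Map to the int8 grid, clip to [-127, 127], round to the nearest step.
  const float clipped = std::max(-127.0f, std::min(127.0f, x / scale));
  const float q = std::round(clipped);
  // Dequantize: the result is a float that carries the quantization error.
  return q * scale;
}
```

Values beyond `threshold` saturate at the largest representable step, which is why the choice of threshold affects accuracy.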

layer / metric	precision (all FP16)	precision (fc in int8)
skip ln	FP16	FP16
qkv2ctx	FP16	FP16
fc	FP16	int8
latency (bs=40, T4)	35.5341 ms	29.1797 ms
acc	0.7786	0.7770
qps	1898 seq/s	2310 seq/s
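The figures in the table can be turned into relative numbers with a few lines of arithmetic. The sketch below uses only the values reported above; the struct and function names are made up for illustration. It works out to roughly a 1.22x latency speedup (about 17.9% lower latency), a matching 1.22x throughput gain, and an accuracy drop of 0.0016.

```cpp
#include <cassert>
#include <cmath>

// One measured configuration; values are copied from the table above.
struct Run {
  double latency_ms;
  double acc;
  double qps;
};

constexpr Run kAllFp16{35.5341, 0.7786, 1898.0};
constexpr Run kInt8Fc{29.1797, 0.7770, 2310.0};

// How many times faster `opt` is than `base` in latency terms.
inline double LatencySpeedup(const Run& base, const Run& opt) {
  return base.latency_ms / opt.latency_ms;
}

// Throughput ratio of `opt` over `base`.
inline double QpsGain(const Run& base, const Run& opt) {
  return opt.qps / base.qps;
}

// Absolute accuracy lost when moving from `base` to `opt`.
inline double AccDrop(const Run& base, const Run& opt) {
  return base.acc - opt.acc;
}
```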

paddle-bot-old · 2021-04-13T06:18:25Z

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

cryoco · 2021-04-14T03:06:10Z

paddle/fluid/framework/ir/multihead_matmul_fuse_pass.cc

+      }
+    }
+
+    bool enable_int8 = mul0_op_desc->HasAttr("enable_int8");


Isn't this a duplicate of the code above?

Removed, thanks~

cryoco · 2021-04-14T03:25:22Z

paddle/fluid/inference/tensorrt/convert/fc_op.cc

-                                            n_output, weight.get(), bias.get());
+      nvinfer1::ILayer* fc_layer = nullptr;
+      if (enable_int8) {
+        CHECK(op_desc.HasAttr("out_threshold"));


Use PADDLE_ENFORCE here instead, and give an error message.

Done, thanks~
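The point of the suggestion is that a bare `CHECK` aborts without telling the user which attribute is missing or why it is needed, while an enforce-style check reports a readable error. The sketch below shows that pattern in self-contained C++. It does not use Paddle's actual `PADDLE_ENFORCE` macros: `AttrMap` and `RequireAttr` are hypothetical stand-ins for the op description and the check.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an op description: attribute name -> value.
using AttrMap = std::map<std::string, float>;

// Hypothetical helper (not Paddle's PADDLE_ENFORCE): returns the attribute,
// or throws with a message naming the op and the missing attribute.
inline float RequireAttr(const AttrMap& attrs, const std::string& name,
                         const std::string& op_type) {
  assert(!name.empty());
  const auto it = attrs.find(name);
  if (it == attrs.end()) {
    std::ostringstream msg;
    msg << "op '" << op_type << "' must have attribute '" << name
        << "' when int8 is enabled";
    throw std::invalid_argument(msg.str());
  }
  return it->second;
}
```

A caller would write something like `RequireAttr(attrs, "out_threshold", "fc")` where the original diff used `CHECK(op_desc.HasAttr("out_threshold"))`.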

cryoco

LGTM

wanghaoshuang requested review from qingqing01, cryoco and wanghaoshuang April 13, 2021 06:27

cryoco requested a review from shangzhizhou April 13, 2021 06:28

ceci3 force-pushed the ernie_trt_int8 branch from d595e27 to dd5abd6 Compare April 13, 2021 06:48

support ernie trt-int8 for inference

0d000ca

ceci3 force-pushed the ernie_trt_int8 branch from dd5abd6 to 0d000ca Compare April 13, 2021 06:52

cryoco reviewed Apr 14, 2021


fix reshape

bfe6168

ceci3 force-pushed the ernie_trt_int8 branch from 818cd4c to bfe6168 Compare April 14, 2021 08:09

cryoco approved these changes Apr 16, 2021


chalsliu approved these changes Apr 16, 2021


ceci3 merged commit 6da043e into PaddlePaddle:develop Apr 16, 2021

ceci3 deleted the ernie_trt_int8 branch April 21, 2021 08:01
