
[WIP] New int8 implementation, better accuracy #749

Merged · 24 commits merged into Tencent:master on Mar 5, 2019

Conversation

BUG1989 (Contributor) commented Jan 10, 2019

This is a WIP. I find that quantizing the weight data per output channel (split by outch num) gives better accuracy, so it needs some changes.
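
For illustration, here is a minimal sketch of what per-output-channel weight quantization means, assuming weights stored as outch consecutive blocks of inch*kh*kw floats; the function and parameter names are illustrative, not ncnn's API:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Sketch only (not ncnn's implementation): compute one scale per output
// channel so that the largest absolute weight in that channel maps to 127,
// then quantize that channel's weights with its own scale.
void quantize_weight_per_outch(const float* weight, int outch, int elems_per_outch,
                               std::vector<int8_t>& qweight, std::vector<float>& scales)
{
    qweight.resize((size_t)outch * elems_per_outch);
    scales.resize(outch);

    for (int oc = 0; oc < outch; oc++)
    {
        const float* w = weight + (size_t)oc * elems_per_outch;

        // per-channel absolute maximum
        float absmax = 0.f;
        for (int i = 0; i < elems_per_outch; i++)
            absmax = std::max(absmax, std::fabs(w[i]));

        const float scale = absmax > 0.f ? 127.f / absmax : 1.f;
        scales[oc] = scale;

        // quantize and clamp to the symmetric int8 range
        for (int i = 0; i < elems_per_outch; i++)
        {
            int q = (int)std::round(w[i] * scale);
            q = std::min(127, std::max(-127, q));
            qweight[(size_t)oc * elems_per_outch + i] = (int8_t)q;
        }
    }
}
```

Compared with a single per-layer scale, each output channel keeps its own resolution, so channels with a small weight range are not crushed onto a few quantization levels by one large-range channel.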

Better Accuracy

| Models | fp32 (%) | int8 (%) | diff |
| --- | --- | --- | --- |
| squeezenet_v1_1 (Top1) | 57.78 | 57.82 | +0.04 |
| mobilenet_v1 (Top1) | 67.26 | 66.74 | -0.52 |
| resnet18 (Top1) | 65.49 | 65.30 | -0.19 |
| googlenet_v1 (Top1) | 68.50 | 68.62 | +0.12 |
| resnet50 (Top1) | 71.80 | 71.76 | -0.04 |
| mobilenet_v1_ssd (mAP) | 70.23 | 68.68 | -1.55 |
| squeezenet_v1_ssd (mAP) | 61.80 | 61.27 | -0.53 |

I have implemented the int8 Winograd F(2,3). It has the same accuracy as the original int8 conv3x3s1 : )
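
For reference, a plain 1D sketch of the F(2,3) idea in integer arithmetic (ncnn's actual kernel is a 2D transform written with NEON; this is only an illustration, with the kernel transform scaled by 2 so everything stays in integers):

```cpp
#include <cstdint>

// 1D Winograd F(2,3): two outputs of a 3-tap convolution with
// 4 multiplications instead of 6.  The kernel transform is scaled
// by 2 to avoid fractions; the scaling is removed at the end.
void winograd_f23_1d_int(const int8_t d[4], const int8_t g[3], int32_t y[2])
{
    // kernel transform (scaled by 2): G * g
    int32_t u0 = 2 * g[0];
    int32_t u1 = g[0] + g[1] + g[2];
    int32_t u2 = g[0] - g[1] + g[2];
    int32_t u3 = 2 * g[2];

    // input transform: B^T * d
    int32_t v0 = d[0] - d[2];
    int32_t v1 = d[1] + d[2];
    int32_t v2 = d[2] - d[1];
    int32_t v3 = d[1] - d[3];

    // elementwise products
    int32_t m0 = u0 * v0;
    int32_t m1 = u1 * v1;
    int32_t m2 = u2 * v2;
    int32_t m3 = u3 * v3;

    // output transform: A^T * m, then undo the kernel scaling (exact division)
    y[0] = (m0 + m1 + m2) / 2;  // = d0*g0 + d1*g1 + d2*g2
    y[1] = (m1 - m2 - m3) / 2;  // = d1*g0 + d2*g1 + d3*g2
}
```

The 2D F(2,3) used for conv3x3s1 nests this transform in both dimensions: each 4x4 input tile produces a 2x2 output tile with 16 multiplications instead of 36.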

Faster Inference

Platform: Hisi3519 (Cortex-A17 @ 880 MHz)

| Models | fp32 (ms) | int8 (ms) |
| --- | --- | --- |
| squeezenet_v1_1 | 282 | 204 |
| mobilenet_v1 | 490 | 369 |
| mobilenet_v1_ssd | 970 | 618 |
| squeezenet_v1_ssd | 610 | 560 |
| resnet18 | 985 | 648 |
| googlenet_v1 | 1107 | 785 |

Runtime Memory

| Models | fp32 (MB) | int8 (MB) |
| --- | --- | --- |
| squeezenet_v1_1 | 50 | 30 |
| mobilenet_v1 | 61 | 35 |
| mobilenet_v1_ssd | 90 | 45 |
| squeezenet_v1_ssd | 210 | 70 |
| resnet18 | 335 | 77 |
| googlenet_v1 | 154 | 72 |

Storage Memory

| Models | fp32 (MB) | int8 (MB) |
| --- | --- | --- |
| squeezenet_v1_1 | 4.71 | 1.20 |
| mobilenet_v1 | 16.3 | 4.31 |
| mobilenet_v1_ssd | 22.0 | 5.60 |
| squeezenet_v1_ssd | 21.1 | 5.37 |
| resnet18 | 44.6 | 11.2 |
| googlenet_v1 | 26.6 | 6.72 |

New convert tool
x86-simulator
  • squeezenet_v1_1
  • mobilenet_v1
  • resnet18
  • googlenet_v1
  • mobilenet_v1_ssd
  • squeezenet_v1_ssd
arm
  • squeezenet_v1_1
  • mobilenet_v1
  • resnet18
  • googlenet_v1
  • mobilenet_v1_ssd
  • squeezenet_v1_ssd

New Features

x86 simulator
  • conv3x3s1 fp32 winograd F(2,3)
  • conv3x3s1 int8 winograd F(2,3)
armv7a (fix overflow)
  • conv3x3s1 int8 winograd F(2,3)
  • conv3x3s2 int8
  • conv1x1s1 int8 sgemm
  • dwconv3x3s1/s2
arm64-v8a (fix overflow)
  • conv3x3s1 int8 winograd F(2,3)
  • conv3x3s2 int8
  • conv1x1s1 int8 sgemm
  • dwconv3x3s1/s2
Other int8 layers (see the requantize sketch after this list)
x86 simulator
  • requantize layer
  • int8 relu
  • int8 conv1x1s2 graph optimize
  • int8 im2col
  • int8 sgemm
armv7a
  • requantize layer
  • int8 relu
  • int8 conv1x1s2 graph optimize
  • int8 im2col
  • int8 sgemm
arm64-v8a
  • requantize layer
  • int8 relu
  • int8 conv1x1s2 graph optimize
  • int8 im2col
  • int8 sgemm
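
To illustrate what the requantize layer in the lists above does (a sketch with illustrative parameter names, not ncnn's API): the int32 accumulator of an int8 convolution is mapped straight back to int8 so the next int8 layer can consume it, and the int8 relu can be fused into the same pass.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch of one requantize step.  The int32 accumulator of an int8 conv
// equals the fp32 result scaled by bottom_scale * weight_scale; dividing by
// that product recovers the fp32 value, and multiplying by the next layer's
// input scale produces the next layer's int8 input directly.
static inline int8_t requantize_one(int32_t acc,
                                    float scale_in,   // bottom_scale * weight_scale
                                    float scale_out,  // next layer's input scale
                                    float bias,       // fp32 bias of this conv
                                    bool fuse_relu)
{
    float v = acc / scale_in + bias;   // back to the fp32 domain
    if (fuse_relu && v < 0.f)
        v = 0.f;                       // fused int8 relu
    int q = (int)std::round(v * scale_out);
    return (int8_t)std::min(127, std::max(-127, q));
}
```

Chaining layers this way avoids a dequantize-to-fp32 / quantize-again round trip between consecutive int8 convolutions.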


BUG1989 commented Mar 4, 2019

rk3288 int8 benchmark

nihui merged commit df3d224 into Tencent:master on Mar 5, 2019

spaul13 commented Apr 29, 2019

Can anyone please tell me how to get the accuracy for a particular model (say mobilenet-yolov3) while running the benchmark?

nihui pushed a commit to nihui/ncnn that referenced this pull request Jul 3, 2019
* add the armv7a conv3x3s1 implementation without overflow, remove old code

* fix the bug of conv3x3s2 packed int8

* new int8 implementation, weight quantized per channel, better accuracy~

* fix the bug of conv3x3s1 packed int8 neon

* add the naive c fp32 and int8 winograd F(2,3)

* add the neon intrinsic int8 winograd F(2,3)

* optimize the armv7a int8 winograd F(2,3) with neon assembly

* optimize the armv7a int8 winograd F(2,3) input transform with assembly.

* add the requantize layer and int8 relu implementation.

* add graph optimization conv1x1s2 -> conv1x1s1, begin optimizing int8 aarch64.

* fix int8 bugs

* add the naive C im2col with sgemm (see the sketch after this commit list)

* add aarch64 int8 winograd f23, conv3x3s2 naive implement

* add the int8 sgemm conv7x7s2 on x86/armv7a platform

* optimize the int8 sgemm by neon intrinsic and packed kernel

* optimize the int8 sgemm with packed data

* optimize the int8 sgemm with armv7a neon assembly

* add the int8 sgemm on arm64-v8a platform

* prepare to merge the latest code from master

* add the int8 param files

* In the Net class, add the fuse_network method
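
The "naive C im2col with sgemm" commits above use the standard trick of unrolling every kxk input patch into a column so the convolution becomes one matrix multiply. A minimal int8 sketch (my own layout and function name, no padding handled, not the ncnn code):

```cpp
#include <cstdint>
#include <vector>

// Expand each kxk patch of a [inch, h, w] int8 input into a column of a
// [inch*ksize*ksize, outh*outw] matrix (row-major), so that the convolution
// becomes a single matrix multiply against the quantized weights.
void im2col_int8(const int8_t* input, int inch, int h, int w,
                 int ksize, int stride,
                 std::vector<int8_t>& cols, int& outh, int& outw)
{
    outh = (h - ksize) / stride + 1;
    outw = (w - ksize) / stride + 1;
    cols.resize((size_t)inch * ksize * ksize * outh * outw);

    size_t idx = 0;
    for (int c = 0; c < inch; c++)
        for (int ky = 0; ky < ksize; ky++)
            for (int kx = 0; kx < ksize; kx++)
                for (int oy = 0; oy < outh; oy++)
                    for (int ox = 0; ox < outw; ox++)
                        cols[idx++] = input[c * h * w
                                            + (oy * stride + ky) * w
                                            + (ox * stride + kx)];
}
```

The matching int8 sgemm then multiplies the [outch x inch*ksize*ksize] quantized weight matrix by this column matrix, accumulating in int32 to avoid overflow.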
nihui added a commit to nihui/ncnn that referenced this pull request Jul 3, 2019