Paper title

Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating

Publication

MICRO '19

Problem to solve

Convolutional neural networks (CNNs) have demonstrated human-level accuracy on many vision tasks and are increasingly adopted in a wide range of applications, including real-time workloads such as autonomous driving and robotic manipulation. Unfortunately, state-of-the-art CNNs are highly compute-intensive, typically demanding on the order of 10⁹ floating-point operations (FLOPs) per inference. To deploy CNNs in a much broader range of applications, especially in embedded and mobile settings, this high computational cost must be reduced without noticeably sacrificing inference accuracy.
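
As a back-of-the-envelope illustration of where that FLOP count comes from (not taken from the paper), a single convolutional layer costs roughly 2 · H · W · C_in · C_out · K² FLOPs, counting each multiply-accumulate as two operations; the layer dimensions below are hypothetical:

```python
def conv2d_flops(h_out, w_out, c_in, c_out, k):
    """FLOPs for one KxK conv layer; a multiply-accumulate counts as 2 FLOPs."""
    return 2 * h_out * w_out * c_in * c_out * k * k

# Hypothetical layer: 3x3 conv, 256 -> 256 channels, 14x14 output feature map.
print(f"{conv2d_flops(14, 14, 256, 256, 3):.3e}")  # ~2.3e8 FLOPs for one layer
```

Summing a few dozen layers of this size quickly reaches the ~10⁹ FLOP scale of a full network.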

Major contribution

The paper proposes a new dynamic pruning technique, channel gating (CGNet), which removes ineffectual computations specific to each input at run time, and presents a hardware accelerator architecture that effectively exploits the dynamic sparsity introduced by channel gating (a minimal sketch of the gating mechanism follows).
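
The sketch below is a minimal PyTorch illustration of the channel-gating idea, not the authors' exact formulation: a small "base" subset of the input channels is always computed, and the remaining channels contribute to an output activation only where the base partial sum exceeds a learned threshold. The module name, the `base_frac` parameter, and the hard-threshold gate are illustrative assumptions; the paper trains with a smoother gating function so the threshold stays learnable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelGatingConv2d(nn.Module):
    """Sketch of a channel-gated convolution: a 'base' fraction of input
    channels is always computed; the rest contribute to an output activation
    only where the base partial sum exceeds a learned threshold."""

    def __init__(self, in_ch, out_ch, k=3, base_frac=0.25):
        super().__init__()
        self.base_ch = max(1, int(in_ch * base_frac))
        self.conv_base = nn.Conv2d(self.base_ch, out_ch, k, padding=k // 2)
        self.conv_rest = nn.Conv2d(in_ch - self.base_ch, out_ch, k, padding=k // 2)
        # One learned gating threshold per output channel.
        self.threshold = nn.Parameter(torch.zeros(1, out_ch, 1, 1))

    def forward(self, x):
        x_base, x_rest = x[:, :self.base_ch], x[:, self.base_ch:]
        partial = self.conv_base(x_base)            # partial sum over base channels
        # Hard gate for illustration only; small partial sums are likely to be
        # zeroed by ReLU anyway, so the remaining channels can be skipped.
        gate = (partial > self.threshold).float()   # 1 = compute rest, 0 = skip
        # On the accelerator, gated positions are skipped outright; this
        # software emulation computes everything and masks the contribution.
        return F.relu(partial + gate * self.conv_rest(x_rest))
```

Here `gate.mean()` gives the fraction of output activations for which the remaining channels are still computed; it is this input-dependent sparsity pattern that the proposed accelerator exploits for speedup.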

Experimental results show that the proposed approach can significantly speed up state-of-the-art networks with a marginal accuracy loss, and enable a trade-off between performance and accuracy.

The paper further shows that channel gating can be supported with a small set of extensions to a CNN accelerator, and implements a prototype for quantized ResNet-18 models. The accelerator achieves an average speedup of 2.3× on ImageNet when the theoretical FLOP reduction is 2.8×, indicating that the hardware can effectively exploit the dynamic sparsity exposed by channel gating.
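
(As a quick check on hardware efficiency implied by these reported numbers: 2.3 / 2.8 ≈ 0.82, i.e., the prototype converts roughly 82% of the gated-away FLOPs into actual wall-clock speedup.)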