- Product Quantization for Nearest Neighbor Search, TPAMI, 2011 [paper]
- Compressing Deep Convolutional Networks using Vector Quantization, ICLR, 2015 [paper]
- Deep Learning with Limited Numerical Precision, ICML, 2015 [paper]
- Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks, ArXiv, 2016 [paper]
- Fixed Point Quantization of Deep Convolutional Networks, ICML, 2016 [paper]
- Quantized Convolutional Neural Networks for Mobile Devices, CVPR, 2016 [paper]
- Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights, ICLR, 2017 [paper]
- BinaryConnect: Training Deep Neural Networks with binary weights during propagations, NIPS, 2015 [paper]
- BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1, ArXiv, 2016 [paper]
- XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, ECCV, 2016 [paper]
- Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations, ArXiv, 2016 [paper]
- DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, ArXiv, 2016 [paper]
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR, 2016 [paper]
- Optimal Brain Damage, NIPS, 1990 [paper]
- Learning both Weights and Connections for Efficient Neural Networks, NIPS, 2015 [paper]
- Pruning Filters for Efficient ConvNets, ICLR, 2017 [paper]
- Sparsifying Neural Network Connections for Face Recognition, CVPR, 2016 [paper]
- Learning Structured Sparsity in Deep Neural Networks, NIPS, 2016 [paper]
- Pruning Convolutional Neural Networks for Resource Efficient Inference, ICLR, 2017 [paper]
- Distilling the Knowledge in a Neural Network, ArXiv, 2015 [paper]
- FitNets: Hints for Thin Deep Nets, ICLR, 2015 [paper]
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, ICLR, 2017 [paper]
- Face Model Compression by Distilling Knowledge from Neurons, AAAI, 2016 [paper]
- In Teacher We Trust: Learning Compressed Models for Pedestrian Detection, ArXiv, 2016 [paper]
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, ArXiv, 2017 [paper]
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size, ArXiv, 2016 [paper]
- Convolutional Neural Networks at Constrained Time Cost, CVPR, 2015 [paper]
- Flattened Convolutional Neural Networks for Feedforward Acceleration, ArXiv, 2014 [paper]
- Going deeper with convolutions, CVPR, 2015 [paper]
- Rethinking the Inception Architecture for Computer Vision, CVPR, 2016 [paper]
- Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial "Bottleneck" Structure, ArXiv, 2016 [paper]
- Xception: Deep Learning with Depthwise Separable Convolutions, ArXiv, 2017 [paper]
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, ArXiv, 2017 [paper]
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices, ArXiv, 2017 [paper]
- Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation, NIPS, 2014 [paper]
- Speeding up Convolutional Neural Networks with Low Rank Expansions, BMVC, 2014 [paper]
- Deep Fried Convnets, ICCV, 2015 [paper]
- Accelerating Very Deep Convolutional Networks for Classification and Detection, TPAMI, 2016 [paper]
- Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition, ICLR, 2015 [paper]