Papers on network design, theory, GANs, and related topics
-
Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
Jierun Chen, Shiu-hong Kao, Hao He, Weipeng Zhuo, Song Wen, Chul-Ho Lee, S.-H. Gary Chan
[CVPR 2023] [Pytorch-Code]
[★☆] Partial convolution + identity mapping -
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie
[arXiv 2301] [Pytorch-Code] -
A ConvNet for the 2020s
Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie
[CVPR 2022] [Pytorch-Code]
[ConvNeXt] -
ResNeSt: Split-Attention Networks
Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Zhi Zhang, Haibin Lin, Yue Sun, Tong He, Jonas Muller, R. Manmatha, Mu Li, Alex Smola
[arXiv 2004] [Pytorch-Code]
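[★★] Roughly, each cardinal group in ResNeXt is further divided into parallel splits, and attention is applied across those splits. Performance improves noticeably on classification, detection, and segmentation -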
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi
[NIPS 2018] [Code](https://github.com/hujie-frank/GENet)
[GENet] [★☆]
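- Building on SENet, the paper proposes the gather-excite (GE) operation. Gather collects long-range spatial information; excite distributes the gathered information back to the local features. The authors argue that GE exploits context more effectively and increases feature reusability;
- Gather can take several forms: parameter-free average pooling (GE-), parameterized multi-stage depth-wise convolutions (GE), or a global depth-wise convolution (GE+). The parameter-free variant gives a slight gain; GE+ performs best but needs the most parameters. Excite rescales the gathered result and multiplies it element-wise with the original features;
- The method is validated mainly on classification. The idea and implementation are simple, and the way the paper argues its case is worth learning from.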
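The parameter-free variant in the notes above can be sketched in a few lines. This is a minimal pure-Python sketch, not the authors' code: it assumes global average pooling as the gather step and a sigmoid gate as the rescaling before the element-wise product. The function name `gather_excite_global` and the nested-list layout (channels x height x width) are illustrative choices.

```python
import math

def gather_excite_global(feature):
    """Parameter-free gather-excite over a C x H x W nested list.

    Gather: average every channel over its full spatial extent.
    Excite: squash the gathered value with a sigmoid (assumed gate) and
    use it to rescale every position of the same channel.
    """
    out = []
    for channel in feature:
        values = [v for row in channel for v in row]
        gathered = sum(values) / len(values)        # gather
        gate = 1.0 / (1.0 + math.exp(-gathered))    # rescale into (0, 1)
        out.append([[v * gate for v in row] for row in channel])  # excite
    return out

feature = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0, mean 2.5
    [[0.0, 0.0], [0.0, 0.0]],  # channel 1, mean 0.0, so the gate is 0.5
]
result = gather_excite_global(feature)
```

The parameterized variants (GE, GE+) would replace the averaging with learned depth-wise convolutions; that part is not shown here.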
-
Squeeze-and-Excitation Networks
Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi
[CVPR 2018] [Caffe-Code]
[SENet] [★★★] Classic channel attention -
Densely Connected Convolutional Networks
Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger
[CVPR 2017 Best Paper] [Pytorch-Code]
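[DenseNet] [★★★]
- The core idea is to connect every layer directly to all preceding layers, so that features are reused and the number of feature maps per layer can be kept very small;
- In practice the design alternates dense blocks and transition layers. Inside each dense block, features are densely connected (earlier features are concatenated directly onto later feature maps) and a 1×1 conv reduces the dimension; the transition layer likewise reduces the output dimension of the preceding block;
- Pros: strong performance with relatively few parameters. Cons: very memory hungry. Taking Caffe as an example, the concat layer allocates a new block of memory for the features being concatenated, so by layer L the features actually occupy space for L(L+1)/2 feature maps. The authors' team has published an optimization strategy, which I have not read yet.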
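The L(L+1)/2 figure follows from simple counting, which the sketch below makes explicit. It is only a counting model under the assumption stated in the note (every concat allocates a fresh buffer); the function names are hypothetical, and the linear count is an idealized comparison point, not a description of the authors' optimization.

```python
def naive_concat_feature_slots(num_layers):
    """Feature-map slots held when every concat allocates a new buffer.

    Layer l (1-indexed) receives a freshly allocated concatenation of the
    l feature groups available to it, so the buffers add up to
    1 + 2 + ... + L = L(L+1)/2.
    """
    return sum(range(1, num_layers + 1))

def shared_storage_feature_slots(num_layers):
    """Idealized count if each feature group were stored once and reused."""
    return num_layers

for depth in (4, 12, 24):
    print(depth, naive_concat_feature_slots(depth), shared_storage_feature_slots(depth))
```

The gap between the two counts grows quadratically with depth, which is why deeper dense blocks become memory-bound under naive concatenation.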
-
Residual Attention Network
Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Chen Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang
[CVPR 2017 Spotlight] [Caffe-Code]
[★] Proposes a structure formally similar to ResNet, the residual attention network, which integrates the attention mechanism into feed-forward deep networks. -
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He
[CVPR 2017] [Torch-Code]
[ResNeXt] [★★] An improved ResNet that raises accuracy at the same complexity -
Recurrent models of visual attention
Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu
[NIPS 2014]
[RAM]
- CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo
[ICCV 2019 Oral] [Pytorch-Code]
[★★] Randomly replaces a region of the image with a patch from an image of another class, and mixes the labels accordingly
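The replace-and-mix step can be sketched as below. This is a minimal pure-Python sketch on 2D nested lists with one-hot labels, not the official implementation. It assumes the box is sized to cover roughly `1 - lam` of the image, with `lam` drawn from a Beta distribution, and that the label weight is recomputed from the area actually pasted. `sample_box` and `cutmix` are illustrative names.

```python
import math
import random

def sample_box(height, width, lam, rng):
    """Sample a box covering roughly (1 - lam) of the image area."""
    cut_h = int(height * math.sqrt(1.0 - lam))
    cut_w = int(width * math.sqrt(1.0 - lam))
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return top, bottom, left, right

def cutmix(img_a, label_a, img_b, label_b, box):
    """Paste the box region of img_b into img_a and mix the labels by area."""
    top, bottom, left, right = box
    height, width = len(img_a), len(img_a[0])
    mixed = [row[:] for row in img_a]
    for y in range(top, bottom):
        for x in range(left, right):
            mixed[y][x] = img_b[y][x]
    # The box may have been clipped at the border, so derive the label
    # weight from the area that was really replaced.
    lam = 1.0 - (bottom - top) * (right - left) / (height * width)
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, label

img_a = [[0] * 4 for _ in range(4)]
img_b = [[1] * 4 for _ in range(4)]
# Fixed 2x2 box: a quarter of the image comes from img_b.
mixed, label = cutmix(img_a, [1.0, 0.0], img_b, [0.0, 1.0], (0, 2, 0, 2))

rng = random.Random(0)
random_box = sample_box(4, 4, rng.betavariate(1.0, 1.0), rng)
```

With the fixed box, a quarter of the pixels come from the second image, so the mixed label puts weight 0.75 on the first class and 0.25 on the second.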
-
Pixel-Adaptive Convolutional Neural Networks
Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, Jan Kautz
[CVPR 2019] [Project] [Pytorch-Code]
[PAC] [★] In the spirit of bilateral filtering, multiplies the convolution kernel by a weight computed from some features -
LIP: Local Importance-based Pooling
Ziteng Gao, Limin Wang, Gangshan Wu
[ICCV 2019] [Pytorch-Code]
[★] Uses a separate branch to predict the pooling weights -
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, Jiashi Feng
[ICCV 2019] [MXNet-Code]
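[OctaveConv] [★] 1) Proposes a new form of convolution that tries to split feature maps into high- and low-frequency components and process them at different scales, in order to save memory and computation; 2) I expected an actual algorithm for decomposing feature maps into high- and low-frequency components, but the paper simply applies average pooling to downsample the feature by a factor of two and treats the result as the low-frequency group. The high- and low-frequency groups are convolved separately, with information also exchanged between the two groups. -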
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun
[ECCV 2018]
[★★★] Discusses principles for model design from the perspective of FLOPs and memory access -
Deformable Convolutional Networks
Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei
[ICCV 2017 Oral] [Pytorch-Code]
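[★★★] Conventional CNNs adapt poorly to geometric deformation, which is caused by the regular grid sampling of standard convolution. To address this, the paper proposes deformable convolution and deformable RoI pooling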
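To make "leaving the regular sampling grid" concrete, here is a minimal pure-Python sketch of a single output value of a 3x3 deformable convolution on a 2D nested list. It assumes the offsets are handed in directly (in the paper they are predicted by an extra convolution branch) and that fractional positions are read with bilinear interpolation, with zeros outside the map. The function names are illustrative, not taken from any library.

```python
import math

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D list at fractional (y, x); zero outside."""
    height, width = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < height and 0 <= xx < width:
                weight = (1.0 - abs(y - yy)) * (1.0 - abs(x - xx))
                value += weight * feature[yy][xx]
    return value

def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is a (dy, dx) pair added to the regular grid position
    of kernel tap (i, j).
    """
    total = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            y = cy + (i - 1) + dy
            x = cx + (j - 1) + dx
            total += kernel[i][j] * bilinear_sample(feature, y, x)
    return total

feature = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]
zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
shifted_offsets = [[(0.0, 0.5)] * 3 for _ in range(3)]

regular = deformable_conv_at(feature, kernel, zero_offsets, 1, 1)
shifted = deformable_conv_at(feature, kernel, shifted_offsets, 1, 1)
```

With all offsets at zero the result equals an ordinary 3x3 convolution; non-zero offsets move each tap independently, which is what lets the receptive field follow object shape.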
- Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Alex Kendall, Yarin Gal, Roberto Cipolla
[CVPR 2018] [Unofficial-PyTorch-Code]
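[★★] 1) Proposes weighting each loss in multi-task learning by its uncertainty. Starting from the maximum-likelihood estimate for multiple tasks, the authors show that the classification and regression losses should each be weighted by 1/sigma^2, where sigma is the uncertainty of that task. In practice, a log sigma is learned for each task, so the weight of each loss is obtained adaptively; 2) I have not yet understood the reasoning behind using 1/sigma^2 as the scale factor for classification tasks.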
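The weighting scheme in the note can be sketched as a plain function. This is a simplified sketch, not the paper's exact objective: it assumes each task contributes `L_i / sigma_i^2 + log sigma_i`, parameterized by `log_sigma`, and omits constant factors such as the 1/2 that appears on regression terms. In real training the `log_sigmas` would be learnable parameters; here they are ordinary floats, and the function name is hypothetical.

```python
import math

def uncertainty_weighted_loss(task_losses, log_sigmas):
    """Combine per-task losses as sum_i (L_i / sigma_i^2 + log sigma_i)."""
    total = 0.0
    for loss, log_sigma in zip(task_losses, log_sigmas):
        precision = math.exp(-2.0 * log_sigma)  # equals 1 / sigma^2
        # The log sigma term penalizes large sigma, so the model cannot
        # drive every weight to zero by inflating the uncertainty.
        total += precision * loss + log_sigma
    return total

equal_weights = uncertainty_weighted_loss([2.0, 3.0], [0.0, 0.0])
down_weighted = uncertainty_weighted_loss([4.0], [math.log(2.0)])
```

Learning `log_sigma` rather than `sigma` keeps the weight `exp(-2 * log_sigma)` positive for any real parameter value, which is presumably why the log parameterization is used.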
-
Understanding the Effective Receptive Field in Deep Convolutional Neural Networks
Wenjie Luo, Yujia Li, Raquel Urtasun, Richard Zemel
[NIPS 2016]
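[ERF] [★★] Shows theoretically that the effective receptive field of a CNN is much smaller than the theoretical receptive field and follows a Gaussian distribution. -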
A guide to convolution arithmetic for deep learning
Vincent Dumoulin, Francesco Visin
[arXiv 1603] [PyTorch-Code]
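[★★] Covers the input/output size relationships of convolution, pooling, and transposed convolution; a handy reference to consult when needed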
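For quick lookup, the standard size relationships can be written as three small helpers. This sketch assumes square inputs and kernels, no dilation, and no extra output padding on the transposed convolution. The parameter names (`i` input size, `k` kernel size, `s` stride, `p` padding) follow the usual notation, and the function names are illustrative.

```python
def conv_output_size(i, k, s=1, p=0):
    """Output size of a convolution: floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1

def pool_output_size(i, k, s):
    """Output size of an unpadded pooling layer: floor((i - k) / s) + 1."""
    return (i - k) // s + 1

def transposed_conv_output_size(i, k, s=1, p=0):
    """Output size of a transposed convolution: (i - 1) * s - 2p + k."""
    return (i - 1) * s - 2 * p + k

print(conv_output_size(224, 7, s=2, p=3))            # 112
print(pool_output_size(112, 2, 2))                   # 56
print(transposed_conv_output_size(3, 3, s=2, p=1))   # 5
```

Because of the floor in the forward formula, several input sizes can map to the same output size when the stride is greater than one, so a transposed convolution recovers only the smallest of them unless extra output padding is specified.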
-
Lossless Training Speed Up by Unbiased Dynamic Data Pruning
Ziheng Qin, Kai Wang, Zangwei Zheng, Jianyang Gu, Xiangyu Peng, Zhaopan Xu, Daquan Zhou, Lei Shang, Baigui Sun, Xuansong Xie, Yang You
[ICLR 2024 Oral] [PyTorch-Code] -
Filter Pruning via Geometric Median for Deep Convolutional Neural Network Acceleration
Yang He, Ping Liu, Ziwei Wang, Zhilan Hu, Yi Yang
[CVPR 2019 Oral] [PyTorch-Code]
[FPGM] [★★]