Author: ACSEKevin
This project implements most of the common CNN backbones, which can be found in the package kevin_utils.models
Feel free to get in touch with any corrections or suggestions.
- SwinTransformer in this version accepts input images of different sizes, unlike the original version (224 x 224 x 3). This has not been tested yet, however. If you run into problems when using the models, please refer to https://github.com/microsoft/Swin-Transformer
- The datasets have been removed from the dataset directories; please add your own dataset before running the code. The original datasets: ImageNet21k, Pascal VOC 2012, cat_and_dog_dataset, flowers_dataset, gesture_dataset.
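For the variable input size mentioned above, one common way to support it in a window-based transformer is to pad the token grid up to a multiple of the window size before splitting it into windows. The sketch below only illustrates that arithmetic, assuming the Swin paper's default patch size of 4 and window size of 7. The function names are hypothetical, and the implementation in this repository may handle padding differently.

```python
import math


def token_grid(height, width, patch_size=4):
    """Number of patch tokens along each axis.

    Assumes the image is padded up to a whole number of patches.
    """
    return math.ceil(height / patch_size), math.ceil(width / patch_size)


def window_padding(tokens_h, tokens_w, window_size=7):
    """Extra rows and columns of tokens needed so the grid splits into whole windows."""
    pad_h = (window_size - tokens_h % window_size) % window_size
    pad_w = (window_size - tokens_w % window_size) % window_size
    return pad_h, pad_w


# The original 224 x 224 input needs no padding: 224 / 4 = 56 = 8 * 7.
grid = token_grid(224, 224)
print(grid, window_padding(*grid))

# A 250 x 300 input gives a 63 x 75 grid; 75 is not a multiple of 7.
grid = token_grid(250, 300)
print(grid, window_padding(*grid))
```

Later stages halve the grid through patch merging, so the same check would apply again at each stage.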
In main, train.py is for training the model and Kevin_datasets.py is for wrapping the dataset processing procedures.
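Because the datasets are not shipped with the code, it may help to confirm that a dataset directory is populated before starting a training run. The sketch below is a standalone check, not part of train.py or Kevin_datasets.py. It assumes a layout with one sub-folder per class, which is a guess about how a classification dataset would be arranged here, and the function name is hypothetical.

```python
from pathlib import Path


def list_class_folders(dataset_root):
    """Return the sorted names of non-empty sub-directories (assumed: one per class)."""
    root = Path(dataset_root)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset directory not found: {root}")
    classes = sorted(
        p.name for p in root.iterdir()
        if p.is_dir() and any(p.iterdir())  # skip empty folders
    )
    if not classes:
        raise ValueError(f"no non-empty class folders under {root}")
    return classes


# Example call (the path is a placeholder, adjust it to your own dataset):
# print(list_class_folders("flowers_dataset"))
```

Failing early with a clear message tends to be easier to diagnose than an error raised from inside the data pipeline.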
In package kevin_utils.models, the following models are provided:
- Classic networks: LeNet-5, LeNet-5 (Modified), AlexNet, VGG (Visual Geometry Group from Oxford University)
- GoogLeNet (Inception V1), ResNet (18, 34, 50, 101, 152), ResNeXt
- MobileNet V2/V3, ShuffleNet V2
- EfficientNet V1/V2, DenseNet
- VisionTransformer, SwinTransformer, ConvNeXt
The models are provided for future convenience; please include citations when cloning and using this code.
LeCun, Y. (1998) LeNet http://yann.lecun.com/exdb/lenet/index.html
Krizhevsky, A. (2012) AlexNet http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Visual Geometry Group (2014) VGG https://arxiv.org/abs/1409.1556
Google AI Research (2014) GoogLeNet (Inception V1) https://arxiv.org/abs/1409.4842
He, K., Microsoft AI Research Team (2015) ResNet https://arxiv.org/abs/1512.03385
Xie, S. (2016) ResNeXt https://arxiv.org/abs/1611.05431
Huang, G. (2016) DenseNet https://arxiv.org/abs/1608.06993
Hu, J. (2017) SENet (Squeeze-and-Excitation Networks) https://arxiv.org/abs/1709.01507
Sandler, M. (2018) MobileNet V2 https://arxiv.org/abs/1801.04381
Howard, A. (2019) MobileNet V3 https://arxiv.org/abs/1905.02244
Zhang, X. (2017) ShuffleNet V1 https://arxiv.org/abs/1707.01083
Ma, N. (2018) ShuffleNet V2 https://arxiv.org/abs/1807.11164
Tan, M. (2019) EfficientNet V1 https://arxiv.org/abs/1905.11946
Tan, M. (2021) EfficientNet V2 https://arxiv.org/abs/2104.00298
Dosovitskiy, A. (2020) Vision Transformer https://arxiv.org/abs/2010.11929
Liu, Z. (2021) Shifted-Window Transformer https://arxiv.org/abs/2103.14030
Liu, Z. (2021) Swin Transformer V2: Scaling Up Capacity and Resolution https://arxiv.org/abs/2111.09883