This repository contains a re-implementation of the paper https://ieeexplore.ieee.org/document/9577788
The authors of Involution: Inverting the Inherence of Convolution for Visual Recognition propose a novel involution layer, which aims to enhance the representational power of convolutional neural networks by inverting the inherent properties of the convolution operation. In contrast to convolution kernels, involution kernels are channel-agnostic and spatially specific.
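The idea of a channel-agnostic, spatially specific kernel can be sketched as a small PyTorch module. This is a minimal, illustrative version only; the class name, defaults, and reduction ratio are assumptions, not the repository's actual code:

```python
import torch
import torch.nn as nn

class Involution2d(nn.Module):
    """Minimal involution sketch: one k*k kernel is generated per spatial
    position (spatially specific) and shared by all channels within a group
    (channel-agnostic). Hyperparameter defaults here are illustrative."""

    def __init__(self, channels, kernel_size=7, groups=4, reduction=4):
        super().__init__()
        self.k = kernel_size
        self.groups = groups
        # Kernel generation: two 1x1 convs map the input feature map
        # to k*k weights per group at every spatial location.
        self.reduce = nn.Conv2d(channels, channels // reduction, 1)
        self.span = nn.Conv2d(channels // reduction,
                              kernel_size * kernel_size * groups, 1)
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, h, w = x.shape
        # (B, G*k*k, H, W) -> one kernel per position and group
        kernel = self.span(self.reduce(x))
        kernel = kernel.view(b, self.groups, 1, self.k * self.k, h, w)
        # Gather k*k neighbourhoods of the input: (B, C*k*k, H*W)
        patches = self.unfold(x).view(
            b, self.groups, c // self.groups, self.k * self.k, h, w)
        # Multiply-accumulate over the kernel window
        out = (kernel * patches).sum(dim=3)  # (B, G, C/G, H, W)
        return out.view(b, c, h, w)
```

Note that, unlike convolution, the number of learned parameters does not grow with the kernel's spatial extent applied per channel; the kernel weights are produced dynamically from the input itself.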
pip install torch torchvision
pip install wandb
pip install lightning
The models folder contains the main backbone implementations, as well as the classification heads and a Lightning class for easy training and logging.
The slides folder contains presentation slides with results on Caltech-256.
The data folder contains the data module and the custom dataset.
git clone https://github.com/thatblueboy/involution.git #clone the repo
The following model and training parameters can be configured in train.py by modifying the configs dictionary:
- `model` specifies which model to train: ResNetClassifier for ResNets, RedNetClassifier for RedNets containing involutions.
- `ReDSnet_type` specifies the depth of the model. Can be one of 26, 38, 50, 101, 152.
- `batch_size` is the training batch size.
- `optimizer` and `optimizer_kwargs` configure the optimizer. `optimizer` can be Adam or SGD.
- `num_workers` is the number of dataloader workers.
- `lr_scheduler` selects the learning rate scheduler. One of ExponentialLR, CosineAnnealingLR, LinearLR, StepLR, PolynomialLR. Any change to `lr_scheduler` requires corresponding changes to `lr_scheduler_kwargs`.
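An illustrative configs dictionary might look like the following. The keys mirror the list above, but every value shown here is an assumption, not the repository's defaults:

```python
# Hypothetical configs dictionary for train.py; values are illustrative.
configs = {
    "model": "RedNetClassifier",       # or "ResNetClassifier"
    "ReDSnet_type": 50,                # one of 26, 38, 50, 101, 152
    "batch_size": 32,
    "optimizer": "SGD",                # "Adam" or "SGD"
    "optimizer_kwargs": {"lr": 0.01, "momentum": 0.9},
    "num_workers": 4,
    "lr_scheduler": "CosineAnnealingLR",
    "lr_scheduler_kwargs": {"T_max": 100},
    "data_module_path": "data_module.pth",  # set to None to create a new split
}
```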
We use a random split of Caltech-256. For uniformity, we store this split in data_module.pth and load it for every training run. This behaviour can be changed by setting the 'data_module_path' value in the configs dict to None.
- To switch from training to testing mode, change the last line in train.py from
trainer.fit(model, data_module)
to
trainer.test(model, data_module)
wandb login
python train.py
The code was heavily inspired by the original paper's code: https://github.com/d-li14/involution
Original paper can be found here: https://ieeexplore.ieee.org/document/9577788
This project was done in partial fulfillment of the course CS F425: Deep Learning at BITS-Pilani.
