Python 3.7.3; install PyTorch following the instructions at https://pytorch.org/
pip install pandas
pip install skorch
pip install seaborn
pip install scikit-learn
pip install tqdm
The dataset contains 6400 MRI images with 4 label classes: {'MildDemented', 'ModerateDemented', 'NonDemented', 'VeryMildDemented'}
The dataset was obtained from Kaggle; 5121 images were used for training and 1279 for testing. It has a class imbalance problem: fewer than 1% of the images have the label ModerateDemented.
The results can be viewed from report_performance.ipynb as well.
- We trained 9 models: resnet18, resnet34, resnet50, resnet101, resnet152, squeezenet, VGG, alexNet and densenet
- We use 10% of the training set for validation (stratified on the 4 classes; this is important because the dataset is imbalanced)
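The stratified 90/10 split can be sketched with scikit-learn's `train_test_split`; the label counts below are synthetic illustration data, not the real dataset:

```python
# Sketch of the stratified 90/10 validation split described above.
# The `labels` list here is made-up illustration data.
from collections import Counter
from sklearn.model_selection import train_test_split

labels = (['NonDemented'] * 640 + ['VeryMildDemented'] * 448 +
          ['MildDemented'] * 179 + ['ModerateDemented'] * 13)
indices = list(range(len(labels)))

# stratify=labels keeps the per-class proportions identical in both splits,
# which matters because ModerateDemented is under 1% of the data.
train_idx, val_idx = train_test_split(
    indices, test_size=0.10, stratify=labels, random_state=42)

train_counts = Counter(labels[i] for i in train_idx)
val_counts = Counter(labels[i] for i in val_idx)
print(val_counts)
```

Without `stratify`, a random 10% split could easily contain zero ModerateDemented images.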
During training, most models reached about 99% validation accuracy; the exception was squeezenet (around 56%)
multi_class_performance is measured with weighted one-vs-rest metrics: for example, to evaluate the class VeryMildDemented, we treat the other 3 classes (MildDemented, ModerateDemented, NonDemented) as Not VeryMildDemented. Using this scheme, we compute the score for each of the 4 classes and then take a weighted average. A more detailed explanation can be found in the sklearn documentation; we used average = "weighted".
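The weighted one-vs-rest averaging can be illustrated with sklearn's `average="weighted"` option; the predictions below are made up for the example:

```python
# Illustration of the weighted one-vs-rest averaging described above,
# using sklearn's average="weighted" on made-up predictions.
from sklearn.metrics import f1_score, precision_score, recall_score

classes = ['MildDemented', 'ModerateDemented', 'NonDemented', 'VeryMildDemented']
y_true = [0, 0, 1, 2, 2, 2, 3, 3]   # indices into `classes`
y_pred = [0, 2, 1, 2, 2, 3, 3, 3]

# Each class is scored one-vs-rest, then the per-class scores are averaged
# with weights proportional to each class's support (its count in y_true).
print(precision_score(y_true, y_pred, average='weighted'))
print(recall_score(y_true, y_pred, average='weighted'))
print(f1_score(y_true, y_pred, average='weighted'))
```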
Another approach we used is to group the classes into {Demented, Not_Demented} (Demented covers all 3 levels of dementia). Accuracy can then be evaluated in a standard binary classification setting.
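The binary grouping amounts to a simple label mapping before scoring; the predictions below are made-up examples:

```python
# Sketch of collapsing the 4 labels into the binary {Demented, Not_Demented}
# grouping described above; the predictions here are made-up examples.
DEMENTED = {'MildDemented', 'ModerateDemented', 'VeryMildDemented'}

def to_binary(label):
    return 'Demented' if label in DEMENTED else 'Not_Demented'

y_true = ['NonDemented', 'MildDemented', 'VeryMildDemented', 'NonDemented']
y_pred = ['NonDemented', 'VeryMildDemented', 'NonDemented', 'NonDemented']

bin_true = [to_binary(y) for y in y_true]
bin_pred = [to_binary(y) for y in y_pred]

# Standard binary accuracy: a MildDemented image predicted as
# VeryMildDemented still counts as correct under this grouping.
accuracy = sum(t == p for t, p in zip(bin_true, bin_pred)) / len(bin_true)
print(accuracy)  # 0.75
```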
We compared the actual testing images with Grad-CAM activation overlays using the resnet models; more results can be viewed in Grad-cAM.ipynb
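The idea behind Grad-CAM can be sketched with forward/backward hooks on a toy conv net (the notebook applies the same idea to the trained resnet models; the network and input below are placeholders):

```python
# Minimal Grad-CAM sketch on a toy conv net; the model and input are
# placeholders standing in for the project's resnets and MRI images.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 4))
model.eval()

activations, gradients = {}, {}

# Capture the feature maps of the target conv layer and their gradients.
target_layer = model[0]
target_layer.register_forward_hook(
    lambda m, i, o: activations.update(out=o))
target_layer.register_full_backward_hook(
    lambda m, gi, go: gradients.update(out=go[0]))

x = torch.randn(1, 1, 32, 32)
scores = model(x)
scores[0, scores.argmax()].backward()  # backprop the top class score

# Weight each channel by its average gradient, sum, and clamp at zero;
# the result is the heatmap that gets resized and overlaid on the image.
weights = gradients['out'].mean(dim=(2, 3), keepdim=True)
cam = torch.relu((weights * activations['out']).sum(dim=1))
print(cam.shape)  # torch.Size([1, 32, 32])
```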
We run the models on the testing images and extract the feature vector prior to the fully connected layer. More examples are available in tsne_cluster.ipynb
resnet152 had low testing accuracy: 0.59
squeezenet had low testing accuracy: 0.55
Training was performed on a GPU with 8000 MiB of memory (a GPU is not required, but training and testing will be slow without one)
- make sure you have installed the dependencies from the Dependency section.
- obtain the dataset from Kaggle and save it in the same folder as train.py
- download dataset.csv from this repo and save it in the same folder as train.py
- to train all 9 models, nothing needs to be modified; run train.py with python3 train.py
- to train only selected models, go to config.py and modify MODELS
- if you encounter memory issues during training, go to config.py and reduce the batch size for the model causing the memory error
- the default max epoch is 50 with early stopping: patience 5, threshold 0.01 (training stops if validation accuracy hasn't improved by more than 0.01 after 5 epochs)
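The early-stopping rule above can be sketched as a plain Python loop; the accuracy history below is made up for illustration:

```python
# Pure-Python sketch of the early-stopping rule described above
# (patience 5, threshold 0.01 on validation accuracy).
def train_with_early_stopping(val_accuracies, patience=5, threshold=0.01,
                              max_epochs=50):
    """Return the number of epochs actually run before stopping."""
    best = float('-inf')
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies[:max_epochs], start=1):
        if acc > best + threshold:      # improvement must exceed the threshold
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch                # no real improvement for `patience` epochs
    return min(len(val_accuracies), max_epochs)

# Accuracy plateaus after epoch 3, so training stops at epoch 3 + 5 = 8.
history = [0.70, 0.80, 0.90, 0.905, 0.902, 0.904, 0.903, 0.901, 0.906, 0.91]
print(train_with_early_stopping(history))  # 8
```

In the actual training code this is handled by the framework's early-stopping callback with the same patience and threshold settings.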
- the trained models will be saved in the models folder and can then be used for prediction with predict.py
- the training history is saved as json files in train_histories, including training loss, validation loss and validation accuracy for each epoch and each batch.
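A saved history can be inspected with the standard json module; note the key names below ('train_loss', 'valid_loss', 'valid_acc') are an assumption based on skorch's history format and may differ from the actual files in train_histories:

```python
# Hypothetical example of inspecting a saved history file; the key names
# are assumed from skorch's history format, and the two-epoch record below
# is made-up data standing in for a file in train_histories.
import json

raw = '''[{"epoch": 1, "train_loss": 1.2, "valid_loss": 1.0, "valid_acc": 0.55},
          {"epoch": 2, "train_loss": 0.8, "valid_loss": 0.7, "valid_acc": 0.72}]'''
history = json.loads(raw)

# Find the epoch with the best validation accuracy.
best = max(history, key=lambda e: e['valid_acc'])
print(best['epoch'], best['valid_acc'])  # 2 0.72
```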
If you encounter the error message "RuntimeError: out of memory. Tried to allocate ...", go to config.py and reduce the batch size to fit the memory of your device