An attempt at transferring knowledge from ResNet18 to tiny ResNet architectures, following the training technique described in "Distilling the Knowledge in a Neural Network" by Hinton et al. (https://arxiv.org/pdf/1503.02531.pdf), applied to the ImageNet dataset.
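
For readers unfamiliar with the technique, the following is a minimal sketch of the distillation objective from the paper, written in plain Python for a single example. It is not the training code of this repository. The function names, the `temperature=4.0` default, and the `alpha` weighting between the soft and hard terms are illustrative assumptions; the paper itself only specifies softening both sets of logits with a temperature, combining the soft-target loss with the usual hard-label loss, and scaling the soft term by T^2.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy between a target and a predicted distribution."""
    return -sum(
        t * math.log(p)
        for t, p in zip(target_probs, predicted_probs)
        if t > 0
    )


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.9):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are assumed hyperparameters, not values
    taken from this repository.
    """
    # Soft term: the student matches the teacher's softened distribution.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Scaling by T^2 keeps the soft-term gradients comparable in
    # magnitude to the hard term, as suggested in the paper.
    soft_loss = cross_entropy(soft_teacher, soft_student) * temperature ** 2

    # Hard term: ordinary cross-entropy against the true label at T = 1.
    hard_student = softmax(student_logits, 1.0)
    hard_loss = -math.log(hard_student[label])

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In an actual training loop the teacher logits would come from a frozen ResNet18 and the student logits from the smaller network, with the loss averaged over a batch and backpropagated through the student only.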