The aim of this project, developed during the "Machine Learning and Deep Learning" course in collaboration with Andrea Arcidiacono and Adele de Hoffer, was to implement a trainable deep neural network for egocentric activity recognition. First, the EgoRNN model was re-implemented with its two-stream architecture, which processes the RGB frames of the videos and the optical flows (used to extract motion features) in two separate streams. Second, the model was turned into a single-stream network by adding a self-supervised motion segmentation task, so that motion cues are learned from the RGB stream alone. Finally, a personal variation of the project was implemented to make better use of the optical flows.
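The single-stream idea described above can be sketched as a backbone with two heads: one for activity classification and an auxiliary self-supervised head that predicts a coarse motion-segmentation map, so motion information is learned without a separate flow stream. This is a minimal illustrative sketch in PyTorch; the class name, layer sizes, and backbone are hypothetical placeholders, not the actual EgoRNN implementation.

```python
import torch
import torch.nn as nn

class SingleStreamSketch(nn.Module):
    """Hypothetical sketch of a single-stream network with an
    auxiliary self-supervised motion-segmentation head.
    All layer sizes are illustrative, not from the real model."""
    def __init__(self, num_classes=61, feat_dim=64, map_size=7):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in for the CNN backbone
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(map_size),
        )
        self.classifier = nn.Linear(feat_dim * map_size * map_size, num_classes)
        # per-location binary prediction: "is this region moving?"
        self.motion_head = nn.Conv2d(feat_dim, 2, kernel_size=1)

    def forward(self, x):
        feats = self.backbone(x)                     # (B, C, 7, 7) feature map
        logits = self.classifier(feats.flatten(1))   # activity class scores
        motion = self.motion_head(feats)             # (B, 2, 7, 7) motion map
        return logits, motion

model = SingleStreamSketch()
frames = torch.randn(4, 3, 112, 112)                 # a batch of RGB frames
logits, motion = model(frames)
```

During training, the motion head would be supervised with pseudo-labels derived by thresholding the optical flow, and its per-pixel loss added to the classification loss, so the flow is only needed at training time, not at inference.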
More details are provided in the paper.