This repository contains the code for the paper "Towards the Automatic Identification of Isotopies" (2023), presented at the 14th Media Mutations International Conference by Alice Fedotova and Alberto Barrón-Cedeño.
The experiments address automatic isotopy identification, a novel task in automated content analysis that aims to reduce the cost of annotation for the study of medical dramas. The work first expanded a subset of the Medical Dramas Dataset by adding subtitles and keyframes for each segment of the TV show Grey's Anatomy. Using the resulting corpus, experiments were conducted with unimodal and multimodal transformer-based models (CLIP, BERT and MMBT). Two classification approaches were also compared: the first uses a single multiclass classifier, while the second uses the one-vs-the-rest approach, in which one binary classifier is trained per class.
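The difference between the two classification approaches can be sketched in a few lines of plain Python. This is only an illustration of the general one-vs-the-rest idea, not the code used in the paper: the label names, the `binarize` and `predict_one_vs_rest` helpers, and the scores are all hypothetical placeholders.

```python
from typing import Dict, List

# Hypothetical label set, for illustration only; the actual isotopy
# labels are defined in the paper and the dataset.
LABELS = ["label_a", "label_b", "label_c"]


def binarize(labels: List[str], positive: str) -> List[int]:
    """Turn multiclass labels into 0/1 targets for one binary classifier."""
    return [1 if label == positive else 0 for label in labels]


def predict_one_vs_rest(scores: Dict[str, float]) -> str:
    """Return the class whose binary classifier gives the highest score."""
    return max(scores, key=scores.get)


# Training side: one binary target vector per class.
gold = ["label_a", "label_c", "label_b", "label_a"]
targets = {label: binarize(gold, label) for label in LABELS}

# Prediction side: made-up positive-class scores from the three
# binary classifiers for a single segment.
segment_scores = {"label_a": 0.20, "label_b": 0.65, "label_c": 0.40}
prediction = predict_one_vs_rest(segment_scores)
```

A single multiclass classifier would instead be trained on `gold` directly and output one distribution over all labels.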
subtitles: contains the software used for aligning the temporal annotations and the subtitles.
keyframes: contains the scripts used for extracting the keyframes of the segments.
models: contains the scripts used for running the experiments with CLIP, BERT and MMBT.
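To give a rough idea of what aligning temporal annotations with subtitles can involve, the sketch below assigns each subtitle cue to the annotated segment it overlaps most in time. It is an assumption-based illustration, not the alignment code from the subtitles directory: the `Cue` and `Segment` classes, the `align` function, and the overlap rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """A subtitle cue (hypothetical structure); times are in seconds."""
    start: float
    end: float
    text: str


@dataclass
class Segment:
    """An annotated segment (hypothetical structure); times are in seconds."""
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def align(segments: List[Segment], cues: List[Cue]) -> List[str]:
    """Collect, for each segment, the text of the cues assigned to it.

    Each cue goes to the segment it overlaps most; cues that overlap
    no segment are dropped. This rule is an assumption of the sketch.
    """
    if not segments:
        return []
    texts: List[List[str]] = [[] for _ in segments]
    for cue in cues:
        overlaps = [overlap(cue.start, cue.end, s.start, s.end) for s in segments]
        best = max(range(len(segments)), key=overlaps.__getitem__)
        if overlaps[best] > 0:
            texts[best].append(cue.text)
    return [" ".join(parts) for parts in texts]


segments = [Segment(0.0, 10.0), Segment(10.0, 25.0)]
cues = [
    Cue(1.0, 3.0, "Hello."),
    Cue(9.0, 12.0, "Scalpel."),  # straddles both segments, mostly in the second
    Cue(30.0, 32.0, "Later."),   # outside every segment
]
aligned = align(segments, cues)
```

Assigning a straddling cue to the segment with the larger overlap keeps each line of dialogue in exactly one segment; other policies, such as duplicating the cue in both segments, are equally plausible.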
The repository contains a Pipfile with all required dependencies, which can be installed using pipenv:
pipenv install