Aspect-Based Sentiment Analysis (ABSA) is a classification pipeline consisting of four steps: aspect term extraction (A), aspect polarity identification (B), category term extraction (C), and category polarity identification (D).
We propose a multistep classifier that learns all the tasks in parallel in a multitask-learning fashion. To this end, we employ linear task-specific output heads to get the most from the shared latent representation of the input sentences. For further insights, read the dedicated report or the presentation slides (pages 7-14).
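The shared-encoder/multi-head idea can be sketched as below. This is a minimal illustration, not the repository's actual implementation: the encoder, hidden size, and label counts are placeholders, and the pooling choices (token-level heads for aspects, mean-pooled sentence-level heads for categories) are one plausible design.

```python
import torch
import torch.nn as nn


class MultistepClassifier(nn.Module):
    """Shared encoder with one linear head per ABSA task (illustrative sketch)."""

    def __init__(self, encoder, hidden_size=768, num_aspect_labels=3,
                 num_polarity_labels=4, num_category_labels=5):
        super().__init__()
        self.encoder = encoder  # shared BERT-like encoder producing (batch, seq, hidden)
        # Linear task-specific heads, all reading the same latent representation.
        self.aspect_head = nn.Linear(hidden_size, num_aspect_labels)            # task A
        self.aspect_polarity_head = nn.Linear(hidden_size, num_polarity_labels)  # task B
        self.category_head = nn.Linear(hidden_size, num_category_labels)         # task C
        self.category_polarity_head = nn.Linear(hidden_size, num_polarity_labels)  # task D

    def forward(self, x):
        h = self.encoder(x)           # (batch, seq_len, hidden)
        pooled = h.mean(dim=1)        # crude sentence-level representation
        return {
            "aspects": self.aspect_head(h),                       # token-level logits
            "aspect_polarities": self.aspect_polarity_head(h),    # token-level logits
            "categories": self.category_head(pooled),             # sentence-level logits
            "category_polarities": self.category_polarity_head(pooled),
        }
```

During training, the per-task losses are summed so that gradients from all four tasks update the shared encoder jointly.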
The benchmark below compares the multistep classifier against individual learners. Please note that experiments marked Φrestaurant refer to models trained only on data from the restaurant domain.
| Model | Aspects F1macro | Aspects F1micro | Categories (Φrestaurant) F1macro | Categories (Φrestaurant) F1micro |
| --- | --- | --- | --- | --- |
| Aspect classifier | 41.25 | 60.16 | - | - |
| Category classifier | - | - | 38.23 | 49.12 |
| Multistep classifier | 50.04 | 65.02 | 55.00 | 66.47 |
You may download the original dataset from here. A preprocessed version of the dataset is available at `data/preprocessing/SemEval-2014.pth`.
To speed up the training process, we freeze the BERT embeddings and employ mixed-precision training. However, we suggest disabling these optimizations if you notice a considerable drop in performance.
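Both optimizations are standard PyTorch patterns; a minimal sketch is shown below. The `model.bert` attribute and the surrounding training-loop names are assumptions for illustration, not the repository's actual code.

```python
import torch


def freeze_module(module: torch.nn.Module) -> None:
    """Exclude a module's parameters from gradient updates (e.g. BERT embeddings)."""
    for p in module.parameters():
        p.requires_grad = False


# Usage inside a training step (assumes a CUDA device and a `model.bert` submodule):
#
# freeze_module(model.bert)
# scaler = torch.cuda.amp.GradScaler()
# with torch.cuda.amp.autocast():          # run the forward pass in mixed precision
#     loss = model(batch)["loss"]
# scaler.scale(loss).backward()            # scale the loss to avoid fp16 underflow
# scaler.step(optimizer)
# scaler.update()
# optimizer.zero_grad()
```

Freezing the encoder means only the lightweight task heads are updated, which also lets you cache embeddings if memory allows.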
To obtain word-level representations, we implement functions that efficiently compute WordPiece masks as well as the scatter operation itself. However, our batched version of the scatter is slow. For this reason, we highly encourage you to rely on ad-hoc solutions such as pytorch-scatter, or to take inspiration from AllenNLP's approach.
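The core operation is a scatter-mean: every WordPiece vector is summed into the slot of the word it belongs to, then divided by the piece count. A single-sentence sketch using plain `index_add_` (the names `hidden`, `word_ids`, and `num_words` are illustrative):

```python
import torch


def wordpieces_to_words(hidden: torch.Tensor, word_ids: torch.Tensor,
                        num_words: int) -> torch.Tensor:
    """Mean-pool WordPiece vectors into word-level vectors.

    hidden:   (seq_len, dim) wordpiece representations
    word_ids: (seq_len,) long tensor mapping each wordpiece to its word index
    """
    dim = hidden.size(-1)
    # Sum the wordpiece vectors belonging to each word...
    sums = torch.zeros(num_words, dim).index_add_(0, word_ids, hidden)
    # ...and count how many pieces each word was split into.
    counts = torch.zeros(num_words).index_add_(0, word_ids,
                                               torch.ones(len(word_ids)))
    return sums / counts.clamp(min=1).unsqueeze(-1)
```

Extending this to a batch is where the naive approach gets slow, since each sentence has a different word count; pytorch-scatter provides a fused `scatter_mean` that handles the batched case efficiently.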
For ready-to-go usage, simply run the notebook on Colab. If you would like to test it on your local machine, please follow the installation guide.