Code to reproduce the results reported in the paper: "A Song of (Dis)agreement: Evaluating the Evaluation of Explainable Artificial Intelligence in Natural Language Processing" (see song_of_disagreement/paper.pdf). Appendices can be found in song_of_disagreement/appendices.
This framework can be used to design and run additional experiments that compute different correlation metrics between feature-additive explanation methods. Currently, we assume that at least one of the explanation methods being compared is attention-based.
Prepare a Python virtual environment and install the necessary packages.
python3 -m venv v-xai-court
source v-xai-court/bin/activate
pip install --upgrade pip
pip install torch torchtext
pip install -r requirements.txt
python -m spacy download en
Datasets are stored in the datasets/ folder. We include the IMDb and SST-2 datasets with the same splits and pre-processing steps as used in Jain and Wallace (2019). To download the other datasets:
- Quora
  - We use our own split (80/10/10); question pairs with a combined word count greater than 200 were removed (see the preprocessing sketch after this list).
- SNLI
- MultiNLI
  - Get the test set from here and run scripts/extract_mnli_test_en.py
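For reference, the Quora filtering and split described above amount to something like the following sketch. The input file name, column names, and random seed are assumptions for illustration, not the exact preprocessing script we shipped:

```python
# Hypothetical reconstruction of the Quora preprocessing described above.
# The input file and column names are assumptions; adjust to your download.
import pandas as pd

pairs = pd.read_csv("quora_duplicate_questions.tsv", sep="\t").dropna(
    subset=["question1", "question2"]
)

# Remove pairs whose combined word count exceeds 200.
word_counts = (
    pairs["question1"].str.split().str.len()
    + pairs["question2"].str.split().str.len()
)
pairs = pairs[word_counts <= 200]

# Shuffle, then split 80/10/10 into train/dev/test.
pairs = pairs.sample(frac=1.0, random_state=0).reset_index(drop=True)
n = len(pairs)
splits = {
    "train": pairs[: int(0.8 * n)],
    "dev": pairs[int(0.8 * n) : int(0.9 * n)],
    "test": pairs[int(0.9 * n) :],
}
for name, split in splits.items():
    split.to_csv(f"datasets/quora/{name}.csv", index=False)
```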
We have implemented a custom AllenNLP command to run our experiments. We define each experiment as consisting of four variables:
- The dataset (IMDb, SST-2, Quora Question Pair, SNLI, MultiNLI)
- The model (BiLSTM, DistilBERT)
- The attention mechanism (additive (tanh), self-attention)
- The attention activation function (softmax, uniform, etc.)
Thus, the experiment files are named {dataset}_{model}_{attention_mechanism}_{activation_function}.jsonnet. Our experiments are located in the experiments/ folder.
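As a concrete illustration, an experiment file path can be assembled from the four variables as below. The spellings in the comments are examples inferred from the shipped experiment files, not an exhaustive list:

```python
# Illustrative only: build an experiment file path from the four variables.
dataset = "sst"                  # e.g. imdb, sst, quora, snli, mnli
model = "distilbert"             # e.g. bilstm, distilbert
attention_mechanism = "self"     # "self" per the example command below; see experiments/ for the additive variant
activation_function = "softmax"  # e.g. softmax, uniform, sparsemax, entmax

experiment_file = (
    f"experiments/{dataset}_{model}_{attention_mechanism}_{activation_function}.jsonnet"
)
print(experiment_file)  # -> experiments/sst_distilbert_self_softmax.jsonnet
```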
Since we train three independently-seeded models PER experiment, it may take several days to run all of our experiments. A GPU with CUDA and 16GB of memory is strongly recommended.
To run an experiment, simply call our custom command with the path to the experiment file. For example:
allennlp attn-experiment experiments/sst_distilbert_self_softmax.jsonnet
By default, the generated artifacts are saved in the outputs directory. This includes the models, their configurations, and their metrics. When the experiments finish, a .csv summary of the correlation results is available in outputs/{experiment_file_name}/summary.csv. We used these summary files to generate our tables.
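If you want to aggregate the per-experiment summaries yourself, a minimal sketch with pandas follows. It assumes each summary.csv reads directly into a data frame; the exact column layout may differ:

```python
# Collect every outputs/*/summary.csv into one table for further analysis.
from pathlib import Path

import pandas as pd

frames = []
for path in Path("outputs").glob("*/summary.csv"):
    frame = pd.read_csv(path)
    frame["experiment"] = path.parent.name  # e.g. sst_distilbert_self_softmax
    frames.append(frame)

results = pd.concat(frames, ignore_index=True)
print(results.head())
```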
Since our code uses AllenNLP, you can easily add a new Jsonnet experiment file to the experiments directory.
We currently support the following components (see the existing experiment files for examples on how to use them):
- Tasks/Datasets
  - Single Sequence: Binary Sentiment Classification
    - IMDb Movie Reviews
    - Stanford Sentiment Treebank
  - Pair Sequence: Natural Language Inference
    - Quora Question Pairs
    - SNLI
    - MultiNLI
- Models
  - BiLSTM with additive (tanh) attention
  - DistilBERT
- Attention activation functions
  - Softmax
  - Uniform
  - Sparsemax
  - Alpha Entmax (alpha = 1.5 or learned)
- Attention aggregation and analysis methods (see the attention rollout sketch after this list)
  - Average: for the Transformer, averages attention across layers and heads, then max-pools over the last dimension of the attention matrix
  - Attention Flow: for the Transformer
  - Attention Rollout: for the Transformer
  - Attention Norms: an additional analysis method applicable to any attention mechanism
- Additive Feature Importance Methods (from Captum; see the usage sketch after this list)
  - LIME
  - Feature Ablation
  - Integrated Gradients
  - DeepLIFT
  - Gradient SHAP
  - Deep SHAP
  - Attention (see above)
- Correlation metrics (see the sketch after this list)
  - Kendall-tau
  - Top-k Kendall-tau: a custom implementation where k can be a fixed number of tokens, a variable percentage of tokens, or the number of tokens with non-zero attribution
  - Weighted Kendall-tau
  - Yilmaz's tau-AP with ties allowed, taken from pyircor
  - Spearman's rho
  - Pearson's r
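For readers unfamiliar with attention rollout, the sketch below follows Abnar and Zuidema's (2020) formulation. It is a simplified illustration rather than our exact implementation and assumes the per-layer attention matrices have already been averaged over heads:

```python
import numpy as np

def attention_rollout(layer_attentions):
    """Recursively combine per-layer attention maps (each seq_len x seq_len)."""
    seq_len = layer_attentions[0].shape[-1]
    rollout = np.eye(seq_len)
    for attention in layer_attentions:
        # Add the identity to account for residual connections, re-normalize rows,
        # then propagate through the layers by matrix multiplication.
        attention = 0.5 * attention + 0.5 * np.eye(seq_len)
        attention = attention / attention.sum(axis=-1, keepdims=True)
        rollout = attention @ rollout
    return rollout  # rollout[i, j]: attribution of output token i to input token j
```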
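The Captum methods listed above share a common calling convention. The toy classifier below is purely illustrative (our AllenNLP wrappers differ), but the `attribute` call shows how Captum's Integrated Gradients is invoked:

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

class ToyClassifier(nn.Module):
    """Mean-pools token embeddings and classifies; stands in for a real model."""
    def __init__(self, dim: int = 8, num_classes: int = 2):
        super().__init__()
        self.linear = nn.Linear(dim, num_classes)

    def forward(self, embedded_tokens):
        return self.linear(embedded_tokens.mean(dim=1))

model = ToyClassifier()
inputs = torch.randn(1, 5, 8, requires_grad=True)  # one "sentence" of 5 tokens
baseline = torch.zeros_like(inputs)                # all-zero embedding baseline

ig = IntegratedGradients(model)
attributions = ig.attribute(inputs, baselines=baseline, target=1)
per_token_importance = attributions.sum(dim=-1).squeeze(0)  # one score per token
print(per_token_importance)
```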
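Finally, the correlation metrics operate on two flat vectors of per-token importance scores. Below is a minimal SciPy-based sketch; the top-k variant shown here (comparing scores over the union of each method's top-k features) is an illustrative simplification, not necessarily our exact definition:

```python
import numpy as np
from scipy.stats import kendalltau, pearsonr, spearmanr, weightedtau

def topk_kendall_tau(a, b, k):
    """Kendall-tau restricted to the union of both methods' top-k features."""
    a, b = np.asarray(a), np.asarray(b)
    top = np.union1d(np.argsort(-a)[:k], np.argsort(-b)[:k])
    return kendalltau(a[top], b[top]).correlation

attention_scores = np.array([0.05, 0.40, 0.10, 0.30, 0.15])
lime_scores = np.array([0.02, 0.55, 0.08, 0.20, 0.15])

print(kendalltau(attention_scores, lime_scores).correlation)
print(topk_kendall_tau(attention_scores, lime_scores, k=3))
print(weightedtau(attention_scores, lime_scores).correlation)
print(spearmanr(attention_scores, lime_scores).correlation)
print(pearsonr(attention_scores, lime_scores)[0])
```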
To cite our work, please cite A Song of (Dis)agreement: Evaluating the Evaluation of Explainable Artificial Intelligence in Natural Language Processing:
@incollection{neely_song_2022,
title = {A {Song} of ({Dis})agreement: {Evaluating} the {Evaluation} of {Explainable} {Artificial} {Intelligence} in {Natural} {Language} {Processing}},
copyright = {All rights reserved},
shorttitle = {A {Song} of ({Dis})agreement},
url = {https://ebooks.iospress.nl/doi/10.3233/FAIA220190},
booktitle = {{HHAI2022}: {Augmenting} {Human} {Intellect}},
publisher = {IOS Press},
author = {Neely, Michael and Schouten, Stefan F. and Bleeker, Maurits and Lucic, Ana},
year = {2022},
doi = {10.3233/FAIA220190},
pages = {60--78},
}
Our results were also presented at the ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI in our earlier paper Order in the Court: Explainable AI Methods Prone to Disagreement.
@inproceedings{neely2021order,
title = "Order in the Court: Explainable AI Methods Prone to Disagreement",
author = {Neely, Michael and
Schouten, Stefan F. and
Bleeker, Maurits J. R. and
Lucic, Ana},
booktitle = "ICML 2021 Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI",
month = jul,
year = "2021",
publisher = "International Conference on Machine Learning",
url = "https://arxiv.org/abs/2105.03287",
doi = "10.48550/arXiv.2105.03287"
}