Really like your package, thanks a lot for the clean implementation!
I'm trying to get attributions for each text in a large corpus (10k+ texts) on a Google Colab GPU. The speed I'm used to on Colab (T4 GPU) is several dozen texts per second during inference (with batch size 16-32), and a few batches (e.g. batch size 32) per second during training. For example, when I train a deberta-xsmall model I get 'train_steps_per_second': 6.121 with a batch size of 32 per step.
I don't have much experience with attribution methods, but I'm surprised that attribution seems extremely slow, even on a GPU. Based on #60 I have verified via cls_explainer.device that the explainer is correctly running on the GPU.
Despite being on a GPU, the code below runs at only around 2.6 seconds per iteration (one iteration is a single text truncated to a maximum of 120 tokens). This is with deberta-xsmall, so a relatively small model.
My question: is it to be expected that a T4 GPU takes 2.6 seconds per text?
If not, do you see something in the code below that I'm doing wrong? (I imagine I can increase speed by raising internal_batch_size, but I also ran into a surprising number of CUDA memory errors; see the fallback sketch after the code.)
from transformers_interpret import SequenceClassificationExplainer
from tqdm.notebook import tqdm

# model and tokenizer are the fine-tuned deberta-xsmall model and its tokenizer
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
print(cls_explainer.device)  # confirms the explainer is on the GPU

word_attributions_lst = []
for _, row in tqdm(df_test.iterrows(), total=len(df_test)):
    # calculate word attributions per text
    word_attributions = cls_explainer(row["text_prepared"], internal_batch_size=1, n_steps=30)  # default: n_steps=50
    # add true and predicted label to each attribution tuple
    word_attributions_w_labels = [
        attribution_tuple + (row["label_text"], cls_explainer.predicted_class_name)
        for attribution_tuple in word_attributions
    ]
    word_attributions_lst.append(word_attributions_w_labels)
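For reference, this is the kind of retry logic I was experimenting with to use a larger internal_batch_size without crashing. The helper name, the batch sizes, and the OOM handling are just my own sketch, not anything from transformers-interpret:

import torch

# hypothetical helper: try larger internal batch sizes first, fall back on CUDA OOM
def explain_with_fallback(explainer, text, batch_sizes=(8, 4, 1), n_steps=30):
    for bs in batch_sizes:
        try:
            return explainer(text, internal_batch_size=bs, n_steps=n_steps)
        except RuntimeError as e:  # CUDA OOM is surfaced as a RuntimeError by torch
            if "out of memory" not in str(e):
                raise
            torch.cuda.empty_cache()  # release cached memory before retrying with a smaller batch
    raise RuntimeError("attribution failed even with internal_batch_size=1")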
Small update: I saw in the Captum docs that they often call model.eval() and model.zero_grad() before attribution. I tried this, but it didn't really help either.
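Roughly what I tried, following the Captum examples (whether zero_grad actually matters before attribution is just my guess):

model.eval()       # switch off dropout so forward passes are deterministic
model.zero_grad()  # clear any leftover gradients before running attributions
cls_explainer = SequenceClassificationExplainer(model, tokenizer)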