Really like your package, thanks a lot for the clean implementation!
I'm trying to get attributions for each text in a large corpus (10k+ texts) on a Google Colab GPU. The speed I'm used to on Colab (T4 GPU) is several dozen texts per second during inference (with batch size 16-32), and a few batches (e.g. batch size 32) per second during training. For example, when I train a deberta-xsmall model I get 'train_steps_per_second': 6.121 with a batch size of 32 per step.
I don't have much experience with attribution methods, but I'm surprised that attribution seems extremely slow, even on a GPU. Based on #60 I have verified via cls_explainer.device that the explainer is correctly running on the GPU.
Despite being on a GPU, the code below runs at only around 2.6 seconds per iteration (one iteration is a single text truncated to a maximum of 120 tokens). This is with deberta-xsmall, so a relatively small model.
My question: is it to be expected that a T4 GPU takes 2.6 seconds per text?
If not, do you see something in the code below that I'm doing wrong? (I imagine I can increase speed by raising internal_batch_size, but I also ran into a surprising number of CUDA memory errors; see the fallback sketch after the code.)
from transformers_interpret import SequenceClassificationExplainer
from tqdm.notebook import tqdm

# model and tokenizer are the fine-tuned deberta-xsmall model and its tokenizer
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
print(cls_explainer.device)  # confirms the explainer is on the GPU

word_attributions_lst = []
for _, row in tqdm(df_test.iterrows(), total=len(df_test)):
    # calculate word attributions per text
    word_attributions = cls_explainer(row["text_prepared"], internal_batch_size=1, n_steps=30)  # default: n_steps=50
    # add true and predicted label to each attribution tuple
    word_attributions_w_labels = [
        attribution_tuple + (row["label_text"], cls_explainer.predicted_class_name)
        for attribution_tuple in word_attributions
    ]
    word_attributions_lst.append(word_attributions_w_labels)
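For reference, this is the kind of retry logic I was experimenting with to use a larger internal_batch_size without crashing. The helper name, the batch sizes, and the OOM handling are just my own sketch, not anything from transformers-interpret:

import torch

# hypothetical helper: try larger internal batch sizes first, fall back on CUDA OOM
def explain_with_fallback(explainer, text, batch_sizes=(8, 4, 1), n_steps=30):
    for bs in batch_sizes:
        try:
            return explainer(text, internal_batch_size=bs, n_steps=n_steps)
        except RuntimeError as e:  # CUDA OOM is surfaced as a RuntimeError by torch
            if "out of memory" not in str(e):
                raise
            torch.cuda.empty_cache()  # release cached memory before retrying with a smaller batch
    raise RuntimeError("attribution failed even with internal_batch_size=1")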
Small update: I saw in the Captum docs that they often call model.eval() and model.zero_grad() before attribution. I tried this, but it didn't really help either.
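Roughly what I tried, following the Captum examples (whether zero_grad actually matters before attribution is just my guess):

model.eval()       # switch off dropout so forward passes are deterministic
model.zero_grad()  # clear any leftover gradients before running attributions
cls_explainer = SequenceClassificationExplainer(model, tokenizer)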