
Training time issue #1

Open
Han8931 opened this issue Jan 27, 2022 · 3 comments

Han8931 commented Jan 27, 2022

Hi, I am training a RoBERTa model with A2T, but it seems it will take a really long time.
Is it normal for training to take this long?

jinyongyoo (Collaborator) commented

Hi @Han8931. Can you provide more detail on how you are training your model (e.g., dataset size, attack parameters, etc.)?

Han8931 (Author) commented Feb 24, 2022

I just ran a model with the provided configuration (BERT for IMDb). It seems even a single run takes around 8 hours; in total it took more than two days.

zodiacg commented May 25, 2023

I can confirm this problem. I tried SNLI with A2T on a 2080 Ti (batch size 12): the first clean epoch took 7 hours, and the adversarial example generation with A2T was estimated to take 21 hours. I tried again on a 3090 (batch size 32), and the first clean epoch still took 3 hours.
For comparison, I wrote a simple BERT fine-tuning script, and a clean epoch took only around 1 hour on the same 2080 Ti.

===

Found the problem. In textattack.Trainer.training_step(), the input texts are forced to be padded to the max length of the pretrained model, which makes the model computation much slower than it needs to be. Changing the padding parameter to True significantly improves the speed. I haven't noticed any problems so far.

Is there any specific reason for padding to the max length? I'll bring this issue to TextAttack for further discussion.
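
For reference, here is a minimal sketch of the difference between the two padding modes, using the standard Hugging Face transformers tokenizer API (this is an illustration, not the exact code path inside textattack.Trainer.training_step()):

```python
# Sketch: padding="max_length" vs. padding=True (dynamic padding)
# Assumes the standard `transformers` tokenizer, which TextAttack wraps.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
texts = ["a short review", "another short review about the movie"]

# Pads every example to the model's maximum length (512 for BERT),
# so most of each batch is padding tokens that still get computed.
batch_max = tokenizer(texts, padding="max_length", truncation=True)
print(len(batch_max["input_ids"][0]))  # 512

# Pads only to the longest example in the current batch, so short
# batches stay short and the forward pass is much cheaper.
batch_dyn = tokenizer(texts, padding=True, truncation=True)
print(len(batch_dyn["input_ids"][0]))  # length of the longest example
```

Dynamic padding gives the same training results as max-length padding (the attention mask ignores padding tokens either way); it only changes how much wasted computation is spent on padding positions.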
