🖥 Benchmarking
GPT2LMHeadModel
Benchmark
GPT2LMHeadModel model call (and model.generate() too)
Set-up
GPU: GTX 1080
PyTorch 1.4.0
Transformers 2.8.0, 3.5.1, 4.5.1 releases, and latest master branch
Code to reproduce
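The original reproduction script did not survive in this thread, so here is a minimal sketch of how the two timings above could be measured. The model name, sequence lengths (1014 input tokens, generate to 1024, 10 forward passes of 1024 tokens), and the "mean ± 3std" convention are taken from the output below; the helper names (`bench`, `main`) and the random-token inputs are my own assumptions, not the issue author's code.

```python
import time
import statistics


def bench(fn, repeats=10, sync=lambda: None):
    """Time fn() `repeats` times; return (mean, 3 * stdev) in seconds.

    `sync` is called after each run so GPU work can be flushed before the
    timer stops (pass torch.cuda.synchronize when benchmarking on CUDA).
    """
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        sync()
        times.append(time.perf_counter() - start)
    return statistics.mean(times), 3 * statistics.stdev(times)


def main():
    # Heavy imports kept inside main() so the timing helper stays stdlib-only.
    import torch
    from transformers import GPT2LMHeadModel

    device = "cuda" if torch.cuda.is_available() else "cpu"
    sync = torch.cuda.synchronize if device == "cuda" else (lambda: None)

    model = GPT2LMHeadModel.from_pretrained("gpt2-medium").to(device).eval()

    with torch.no_grad():
        # Case 1: generate from 1014 input tokens up to 1024 (caching is on
        # by default in model.generate).
        input_ids = torch.randint(
            0, model.config.vocab_size, (1, 1014), device=device
        )
        mean, spread = bench(lambda: model.generate(input_ids, max_length=1024))
        print(f"model.generate 1014 -> 1024 (mean ± 3std): {mean:.3f}±{spread:.3f}")

        # Case 2: a plain forward pass over 1024 tokens, repeated 10 times.
        full_ids = torch.randint(
            0, model.config.vocab_size, (1, 1024), device=device
        )
        mean, spread = bench(lambda: model(full_ids), repeats=10, sync=sync)
        print(f"model call, 1024 input, 10 times (mean ± 3std): {mean:.3f}±{spread:.3f}")


if __name__ == "__main__":
    main()
```

Run it once per installed `transformers` release to compare the two code paths across versions, as the results below do.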
Results
While the `model.generate()` code has improved and now runs faster, the forward pass used in a direct model call became 9% slower.

transformers: 2.8.0
<class 'transformers.modeling_gpt2.GPT2LMHeadModel'>
GPT2LMmedium model.generate (using caching) 1014 input, generate to 1024 (mean ± 3std): 0.557±0.037
GPT2LMmedium model call, 1024 input 10 times (mean ± 3std): 1.821±0.017
transformers: 3.5.1
<class 'transformers.modeling_gpt2.GPT2LMHeadModel'>
GPT2LMmedium model.generate (using caching) 1014 input, generate to 1024 (mean ± 3std): 0.37±0.003
GPT2LMmedium model call, 1024 input 10 times (mean ± 3std): 1.849±0.012
transformers: 4.5.1
<class 'transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel'>
GPT2LMmedium model.generate (using caching) 1014 input, generate to 1024 (mean ± 3std): 0.36±0.003
GPT2LMmedium model call, 1024 input 10 times (mean ± 3std): 1.823±0.013
transformers: 4.6.0.dev0
<class 'transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel'>
GPT2LMmedium model.generate (using caching) 1014 input, generate to 1024 (mean ± 3std): 0.367±0.004
GPT2LMmedium model call, 1024 input 10 times (mean ± 3std): 1.991±0.013