http://arxiv.org/abs/2010.13002
More info can be found here.
Speed on a single NVIDIA-V100-16GB:

| BatchSize           | 64             | 128            |
|---------------------|----------------|----------------|
| transformers-4.12.0 | 5.5 samples/s  | OOM            |
| above + fastseq     | 17.8 samples/s | 19.1 samples/s |
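To make the comparison concrete, the relative speedup can be worked out directly from the throughput figures above. The snippet below is only an illustrative sketch: the dictionary layout and the `speedup` helper are assumptions introduced here, not part of fastseq or Transformers.

```python
# Throughput figures copied from the benchmark above (samples/s).
# None marks the run that went out of memory (OOM).
throughput = {
    "transformers-4.12.0": {64: 5.5, 128: None},
    "above + fastseq": {64: 17.8, 128: 19.1},
}


def speedup(baseline, optimized):
    """Return optimized / baseline, or None if either run has no number."""
    if baseline is None or optimized is None:
        return None
    return optimized / baseline


base = throughput["transformers-4.12.0"]
fast = throughput["above + fastseq"]
for bs in (64, 128):
    ratio = speedup(base[bs], fast[bs])
    label = f"{ratio:.2f}x" if ratio is not None else "n/a (baseline OOM)"
    print(f"batch size {bs}: {label}")
```

At batch size 64 this works out to a little over 3x. At batch size 128 no ratio can be computed because the baseline ran out of memory, although the fastseq run still completes at that size.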
Model: sshleifer/distilbart-cnn-12-6 from the Hugging Face model hub.
Data: CNN/DM validation data.
$ fastseq-generate-for-transformers \
sshleifer/distilbart-cnn-12-6 \
cnn_dm/val.source \
out.summary \
--reference_path cnn_dm/val.target \
--device cuda \
--bs BATCH_SIZE \
--fp16 \
--score_path out.score \
--task summarization
The baseline speed numbers were obtained by running the Transformers v4.12.0 code.
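For readers who want to reproduce a samples/s figure in their own script, one possible way to measure it is sketched below. This is not how fastseq computes its numbers: `measure_throughput` and `dummy_generate` are hypothetical names, and the dummy function only stands in for a real batched summarization call.

```python
import time


def measure_throughput(generate_fn, samples, batch_size):
    """Run generate_fn over samples in batches and return samples per second."""
    start = time.perf_counter()
    processed = 0
    for i in range(0, len(samples), batch_size):
        batch = samples[i:i + batch_size]
        generate_fn(batch)
        processed += len(batch)
    elapsed = time.perf_counter() - start
    return processed / elapsed if elapsed > 0 else float("inf")


def dummy_generate(batch):
    """Hypothetical stand-in for a model call; sleeps briefly per batch."""
    time.sleep(0.001)
    return [text[:10] for text in batch]


docs = ["some article text"] * 256
rate = measure_throughput(dummy_generate, docs, batch_size=64)
print(f"{rate:.1f} samples/s")
```

Timing the whole loop, rather than each batch separately, keeps the measurement simple. A real benchmark would also need to decide whether model loading and file I/O count toward the elapsed time.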