diff --git a/examples/BERT/README.rst b/examples/BERT/README.rst
index ddf33a4a7..54622ee39 100644
--- a/examples/BERT/README.rst
+++ b/examples/BERT/README.rst
@@ -14,6 +14,7 @@ Train the BERT model with masked language modeling task and next-sentence task.
 or run the tasks on a SLURM powered cluster with Distributed Data Parallel (DDP):
 
     srun --label --ntasks-per-node=1 --time=4000 --mem-per-cpu=5120 --gres=gpu:8 --cpus-per-task 80 --nodes=1 --pty python mlm_task.py --parallel DDP --log-interval 600 --dataset BookCorpus
+    srun --label --ntasks-per-node=1 --time=4000 --mem-per-cpu=5120 --gres=gpu:8 --cpus-per-task 80 --nodes=1 --pty python ns_task.py --parallel DDP --bert-model mlm_bert.pt --dataset BookCorpus
 
 The result ppl of mlm_task is 18.97899 for the test set.