This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

the eval_acc on RTE dataset is only 55% #27

Open
leoozy opened this issue Jul 21, 2022 · 1 comment


@leoozy

leoozy commented Jul 21, 2022

Hello, thank you for your code. I tried to run your code with the following command:
aim=pretraining_experiment-bert-mlm--23000
deepspeed --include=localhost:0,1,2,3,4,5,6,7 --master_port 64000 run_pretraining.py \
--model_type bert-mlm --tokenizer_name bert-base-uncased \
--hidden_act gelu \
--hidden_size 1024 \
--num_hidden_layers 24 \
--num_attention_heads 16 \
--intermediate_size 4096 \
--hidden_dropout_prob 0.1 \
--attention_probs_dropout_prob 0.1 \
--encoder_ln_mode pre-ln \
--lr 1e-3 \
--train_batch_size 4096 \
--train_micro_batch_size_per_gpu 128 \
--lr_schedule step \
--curve linear \
--warmup_proportion 0.06 \
--gradient_clipping 0.0 \
--optimizer_type adamw \
--weight_decay 0.01 \
--adam_beta1 0.9 \
--adam_beta2 0.98 \
--adam_eps 1e-6 \
--total_training_time 24.0 \
--early_exit_time_marker 24.0 \
--dataset_path path_to_dataset \
--output_dir path_to_output \
--print_steps 100 \
--num_epochs_between_checkpoints 10000 \
--job_name ${aim} \
--project_name budget-bert-pretraining \
--validation_epochs 3 \
--validation_epochs_begin 1 \
--validation_epochs_end 1 \
--validation_begin_proportion 0.05 \
--validation_end_proportion 0.01 \
--validation_micro_batch 16 \
--deepspeed \
--data_loader_type dist \
--do_validation \
--use_early_stopping \
--early_stop_time 180 \
--early_stop_eval_loss 6 \
--seed 42 \
--fp16 \
--max_steps 23000 \
--finetune_checkpoint_at_end

I did not change your code, but the eval_acc on RTE is only 55%, which is significantly lower than the BERT baseline (~65%). Could you give some advice?

@peteriz
Contributor

peteriz commented Jul 27, 2022

I don't know what backend you ran this experiment on, but one issue that might leave the model under-trained is that your training session didn't reach 23k updates within 24 hours (your command sets a hard 24-hour limit, so training stops after one day regardless of how many steps have completed).
Try running the same command, but without early stopping or the time limit (just train for the full 23k steps).
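
For reference, 23,000 updates in 24 hours means roughly one optimizer step every ~3.8 seconds at a global batch size of 4096, so a slower setup can easily hit the wall-clock limit first. A minimal sketch of the adjustment, assuming all other flags stay exactly as in the command above (the 72-hour budget is only a placeholder chosen so that --max_steps, not the clock, ends the run):

remove
  --use_early_stopping \
  --early_stop_time 180 \
  --early_stop_eval_loss 6 \

and change
  --total_training_time 72.0 \
  --early_exit_time_marker 72.0 \

keeping --max_steps 23000 so the run still stops after 23k updates.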
