I have logged the training parameters to wandb (you can see the link below).
This is my config:
```shell
pretrain.run --output_dir output_merge_data \
  --report_to wandb \
  --data_dir data/merge_final_dataset.parquet \
  --do_train True \
  --save_steps 100000 \
  --per_device_train_batch_size 12 \
  --model_name_or_path iambestfeed/BanhmiBERT \
  --pretrain_method retromae \
  --fp16 True \
  --warmup_ratio 0.1 \
  --learning_rate 1e-4 \
  --num_train_epochs 8 \
  --overwrite_output_dir True \
  --dataloader_num_workers 6 \
  --weight_decay 0.01 \
  --encoder_mlm_probability 0.3 \
  --decoder_mlm_probability 0.5
```
And as you can observe, as soon as epoch 4 finishes, the loss drops to 0 and stays there. What do you think is going on?
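As a quick sanity check on my side, I scanned the logged loss values to pin down the exact step where the collapse happens (a minimal sketch using a hypothetical list of logged losses, not the actual wandb export):

```python
def find_loss_collapse(losses, eps=1e-8):
    """Return the index of the first logged loss that is (near-)zero,
    or None if the loss never collapses. A sudden jump to exactly 0
    mid-training usually signals a logging/accumulation bug or fp16
    numerical trouble rather than a genuinely solved objective."""
    for i, loss in enumerate(losses):
        if loss < eps:
            return i
    return None

# Hypothetical loss curve: normal for four epochs, then a collapse.
logged_losses = [2.1, 1.7, 1.4, 1.2, 0.0, 0.0]
print(find_loss_collapse(logged_losses))  # -> 4
```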