run_glue_no_trainer.py script crashes on Mistral model due to tokenizer issue #28534
Comments
Adding these lines seems to fix it; I'm not sure whether this is the best or most general solution, though:
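The exact lines from this comment are not reproduced above. As a hedged sketch (the helper name `ensure_pad_token` is illustrative, not from the script), the usual workaround for Mistral's missing pad token is to fall back to the EOS token before any padded batching happens:

```python
def ensure_pad_token(tokenizer):
    """Mistral's tokenizer ships without a pad token, so padding batches
    in run_glue_no_trainer.py crashes. Reusing the EOS token as the pad
    token is a common workaround."""
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer

# In the script, this would be applied right after the tokenizer is
# loaded, e.g.:
#   tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path)
#   ensure_pad_token(tokenizer)
```

Assigning `pad_token` (rather than adding a new special token) avoids resizing the model's embedding matrix, which matters under DeepSpeed ZeRO-3.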
Hi @rosario-purple, thanks for raising this issue! The proposed fix is the recommended way to address this. Would you like to open a PR to add it to the script? That way you get the GitHub contribution.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@amyeroberts Can I take this up?
@JINO-ROHIT Sure!
System Info
transformers version: 4.36.2
- distributed_type: DEEPSPEED
- mixed_precision: bf16
- use_cpu: False
- debug: False
- num_processes: 8
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- deepspeed_config: {'gradient_accumulation_steps': 1, 'offload_optimizer_device': 'none', 'offload_param_device': 'none', 'zero3_init_flag': True, 'zero3_save_16bit_model': False, 'zero_stage': 3}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
Who can help?
@ArthurZucker @younesbelkada @pacman100
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
Check out the transformers repo and run this command (on a large server with appropriately configured accelerate, so it won't OOM):
python run_glue_no_trainer.py --model_name_or_path mistralai/Mistral-7B-v0.1 --task_name sst2 --per_device_train_batch_size 4 --learning_rate 2e-5 --num_train_epochs 3 --output_dir /tmp/sst2
It will crash with this error and stack trace:
Expected behavior
It should train without crashing.