Issue when running the training script: "ValueError: You can't train a model that has been loaded with device_map='auto' in any distributed mode. Please rerun your script specifying --num_processes=1 or by launching with python {{myscript.py}}."
#32
I am running train.sh in an environment where all packages were installed with
pip install -r requirements.txt
But it fails with the following error:
Traceback (most recent call last):
  File "train_huatuo.py", line 265, in <module>
    train(args)
  File "train_huatuo.py", line 145, in train
    model, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(model, optimizer, train_dataloader, lr_scheduler)
  File "/fdudata/tsli/HuatuoGPT-II/huatuo2/lib/python3.8/site-packages/accelerate/accelerator.py", line 1250, in prepare
    raise ValueError(
ValueError: You can't train a model that has been loaded with `device_map='auto'` in any distributed mode. Please rerun your script specifying `--num_processes=1` or by launching with `python {{myscript.py}}`.
I have also changed the "--num_processes" flag to 1, but it still gives the same error. Are there any suggestions for solving this problem?