Error when running the training script: "ValueError: You can't train a model that has been loaded with `device_map='auto'` in any distributed mode. Please rerun your script specifying `--num_processes=1` or by launching with `python {{myscript.py}}`." #32

litsh · 2024-06-27T02:24:24Z

I am running train.sh in an environment where all packages were installed with

pip install -r requirements.txt

but it fails with the following error:

Traceback (most recent call last):
  File "train_huatuo.py", line 265, in <module>
    train(args)
  File "train_huatuo.py", line 145, in train
    model, optimizer, train_dataloader,  lr_scheduler = accelerator.prepare(model, optimizer, train_dataloader, lr_scheduler)
  File "/fdudata/tsli/HuatuoGPT-II/huatuo2/lib/python3.8/site-packages/accelerate/accelerator.py", line 1250, in prepare
    raise ValueError(
ValueError: You can't train a model that has been loaded with `device_map='auto'` in any distributed mode. Please rerun your script specifying `--num_processes=1` or by launching with `python {{myscript.py}}`.

I have also changed the "--num_processes" flag to 1, but the same error still occurs. Are there any suggestions for solving this problem?
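
To narrow this down, it may help to print the two facts the error message refers to just before the `accelerator.prepare(...)` call: whether the model carries a device map, and which distributed mode accelerate detected. The sketch below is only a diagnostic aid, not a fix. `report_device_map_conflict` is a hypothetical helper name. It assumes the model exposes an `hf_device_map` attribute when it was loaded with a `device_map`, and that the accelerator exposes `distributed_type`. The stand-in objects at the bottom are placeholders so the snippet runs on its own.

```python
from types import SimpleNamespace


def report_device_map_conflict(model, accelerator):
    """Summarize the two conditions named in the ValueError (hypothetical helper)."""
    # Assumption: a model loaded with a device_map carries it as `hf_device_map`.
    device_map = getattr(model, "hf_device_map", None)
    # Assumption: the accelerator reports its mode via `distributed_type`.
    distributed_type = str(getattr(accelerator, "distributed_type", "unknown"))
    # Collect the distinct devices the weights were placed on, if any.
    devices = sorted({str(d) for d in device_map.values()}) if device_map else []
    return {
        "has_device_map": device_map is not None,
        "devices": devices,
        "distributed_type": distributed_type,
    }


# Stand-in objects for illustration only; in train_huatuo.py these would be
# the real `model` and `accelerator` right before `accelerator.prepare(...)`.
fake_model = SimpleNamespace(hf_device_map={"layer_a": 0, "layer_b": 1})
fake_accelerator = SimpleNamespace(distributed_type="DEEPSPEED")
print(report_device_map_conflict(fake_model, fake_accelerator))
```

If the report shows a device map together with a distributed mode other than a plain single-process run, that combination would be consistent with the error persisting even with "--num_processes" set to 1, for example when the accelerate config used by train.sh selects a distributed backend.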
