Hello! I am fascinated by your great idea and have been experimenting with your code, but I found what may be some problems with multi-GPU fine-tuning:
If "--launcher" is set to none and two or more GPUs are made visible, e.g. CUDA_VISIBLE_DEVICES=0,1, NaN problems occur in the first epoch: "NaN or Inf found in input tensor".
If "--launcher" is set to "pytorch", errors are raised saying that environment variables such as "RANK" or "WORLD_SIZE" are not defined. In the corresponding block, I found a "TO DO".
Have you run into these problems in your own experiments? Please tell me how they can be solved and how your DDP setup should be used. Thanks!
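To make the second problem concrete, here is a minimal sketch of how I understand these variables are normally supplied. It assumes the script is started by a launcher such as `torchrun --nproc_per_node=2 <script>`, which exports `RANK`, `WORLD_SIZE` and `LOCAL_RANK` for each process. `read_dist_env` is a hypothetical helper of mine, not a function from this repository, and I may be misreading what the "pytorch" launcher branch expects.

```python
import os


def read_dist_env(env=None):
    """Return (rank, world_size, local_rank) from the environment.

    Returns None when RANK or WORLD_SIZE is missing, which is what happens
    when the script is started with plain `python` instead of a launcher.
    Hypothetical helper for illustration; not part of this repository.
    """
    env = os.environ if env is None else env
    if "RANK" not in env or "WORLD_SIZE" not in env:
        return None
    rank = int(env["RANK"])
    world_size = int(env["WORLD_SIZE"])
    # LOCAL_RANK selects the GPU on this node; default to 0 if it is absent.
    local_rank = int(env.get("LOCAL_RANK", 0))
    return rank, world_size, local_rank


if __name__ == "__main__":
    info = read_dist_env()
    if info is None:
        print("RANK/WORLD_SIZE not set: running as a single process")
    else:
        print("rank=%d world_size=%d local_rank=%d" % info)
```

If this guess is right, running the script with plain `python` and "--launcher pytorch" would hit the missing-variable error, while starting it through a launcher would not.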