Multi GPU RuntimeError: Model replicas must have an equal number of parameters. #11
Comments
Hello @lhwcv, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Google Colab Notebook, Docker Image, and GCP Quickstart Guide for example environments. If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you. If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com. |
It may be a pytorch==1.5 version problem; 1.4 works fine. Closed! |
@lhwcv I'm not able to reproduce your issue. I tried with our docker container (with pytorch 1.5), and training operates correctly with your command with 4 GPUs: |
Note: this may have been fixed by the fix applied for #15. |
Closing as the original issue seems to be resolved. |
Not yet, the official PyTorch 1.5 release still has this issue:
|
Same issue with a custom dataset and the pre-trained yolov5x.pt file: RuntimeError: Model replicas must have an equal number of parameters. |
I've reopened this, as the issue appears to still be present. @mingmmq could you supply code to reproduce your issue? Is it reproducible on the coco128.yaml dataset? |
I have the same problem with my custom dataset (24 classes). |
I have the same problem with my custom dataset (11 classes). |
Try downgrading PyTorch from 1.5 to 1.4. It works for me.
Run pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html to fix it. Or you see |
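The downgrade suggested above can be guarded with a small version check before launching multi-GPU training. This is only a minimal sketch, assuming nothing beyond what this thread reports (1.5 fails, 1.4 works; whether 1.5.1 is fixed is not confirmed here). The helper name `is_reported_affected` is hypothetical, and it operates on a plain version string such as the value of `torch.__version__`.

```python
def is_reported_affected(version: str) -> bool:
    """Return True if `version` looks like a PyTorch 1.5.0 build.

    Assumption: only 1.5.0 is flagged, because this thread reports the
    error on 1.5 and does not confirm the status of 1.5.1.
    """
    # Drop a local build suffix such as "+cu100" before comparing.
    release = version.split("+", 1)[0]
    parts = release.split(".")
    # Pad short versions like "1.5" to three components.
    parts += ["0"] * (3 - len(parts))
    return parts[:3] == ["1", "5", "0"]


if __name__ == "__main__":
    for v in ("1.4.0+cu100", "1.5.0", "1.5.1"):
        print(v, is_reported_affected(v))
```

With PyTorch installed, this could be called as `is_reported_affected(torch.__version__)` to decide whether to warn before starting a multi-GPU run.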
Downgrading torch 1.5 -> 1.4 works. |
@panchengl does the recently released 1.5.1 fix this? |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
update bias init&&update obj loss
🐛 Bug
When using 4x 2080 Ti GPUs for training:
"RuntimeError: Model replicas must have an equal number of parameters."
(1 gpu is OK)
To Reproduce
REQUIRED: Code to reproduce your issue below