Fix for "RuntimeError: result type Float can't be cast to the desired output type long int" #48

Closed
wants to merge 1 commit into from

ThibaultCastells
Contributor

Hello,
I tried this repo and got the following error when experimenting with YOLOv5 (I am using PyTorch 1.12.1 and Python 3.10.6):

Traceback (most recent call last):
  File "/*****/yoloair/train.py", line 695, in <module>
    main(opt)
  File "/*****/yoloair/train.py", line 591, in main
    train(opt.hyp, opt, device, callbacks)
  File "/*****/yoloair/train.py", line 376, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "/*****/yoloair/utils/loss.py", line 123, in __call__
    tcls, tbox, indices, anchors = self.build_targets(p, targets)  # targets
  File "/*****/yoloair/utils/loss.py", line 222, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int

I fixed it by casting the upper clamp bounds from Float to int:
indices.append((b, a, gj.clamp_(0, int(gain[3] - 1)), gi.clamp_(0, int(gain[2] - 1))))

I figured this could save some time for people who hit the same issue.
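
For readers who want to try the cast in isolation, here is a minimal sketch. The `gain` and `gj` values and both helper names are hypothetical stand-ins, not code from `utils/loss.py`; only the two `clamp_` calls mirror the lines quoted above. Whether the first call raises depends on the installed PyTorch version, so the sketch returns the error instead of assuming it.

```python
import torch

# Hypothetical stand-ins for the values inside build_targets().
gain = torch.tensor([1.0, 1.0, 80.0, 80.0, 80.0, 80.0, 1.0])  # float32 grid gains
gj = torch.tensor([-3, 5, 120])  # int64 grid row indices, some out of range


def clamp_with_tensor_bound(idx, upper):
    """Original call: the upper bound is derived from a float tensor."""
    try:
        return idx.clone().clamp_(0, upper - 1)
    except RuntimeError as err:  # the error reported in this thread on PyTorch 1.12.1
        return err


def clamp_with_int_bound(idx, upper):
    """Proposed fix: convert the bound to a Python int before clamping."""
    return idx.clone().clamp_(0, int(upper - 1))


print(clamp_with_tensor_bound(gj, gain[3]))  # a tensor or a RuntimeError, depending on version
print(clamp_with_int_bound(gj, gain[3]))
```

The `int(...)` cast makes both bounds plain Python integers, so the in-place clamp never has to write a Float result into an int64 tensor.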

@iscyy
Owner

iscyy commented Sep 6, 2022

Hi, can you tell me which training command caused this error? I did not get a similar error using this repo (Python 3.9.6, PyTorch 1.9.0).
Also, do you get this error with the official yolov5 repo?
https://github.com/ultralytics/yolov5/blob/v6.1/utils/loss.py#L217
The code in this repo is consistent with the official yolov5 v6.1 at that line.

@ThibaultCastells
Contributor Author

Yes, my command was:
python3 -m torch.distributed.run --nproc_per_node 8 train.py --batch 64 --data data/coco.yaml --weights yolov5s.pt --device 0,1,2,3,4,5,6,7
But I think it is related to the PyTorch version rather than the command.

I did not try with yolov5, but I noticed that some people got the exact same error with yolov5 or with other repos based on yolov5:
Yolov5: ultralytics/yolov5#8405
Yolor: WongKinYiu/yolor#270

Apparently it is a compatibility issue with PyTorch 1.12, and I thought that fixing the code would be a better solution than downgrading PyTorch.

If you think my fix is not good, here is another solution suggested in the yolor issue: WongKinYiu/yolor@1102f52
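
As a point of comparison with the `int(...)` cast, another way to avoid the Float bound is to take the grid size from plain Python integers, for example from a tensor's shape. The sketch below is an assumption for illustration only: it does not reproduce the contents of the linked commit, and `pred`, `shape`, `gj`, and `gi` are hypothetical stand-ins.

```python
import torch

# Hypothetical prediction tensor laid out as (batch, anchors, grid_h, grid_w, outputs).
pred = torch.zeros(1, 3, 40, 80, 85)
shape = pred.shape  # torch.Size, whose entries are plain Python ints

gj = torch.tensor([-2, 10, 55])  # int64 row indices
gi = torch.tensor([-1, 30, 99])  # int64 column indices

# Both bounds are Python ints, so the in-place clamp keeps the int64 dtype.
gj_clamped = gj.clone().clamp_(0, shape[2] - 1)
gi_clamped = gi.clone().clamp_(0, shape[3] - 1)
print(gj_clamped, gi_clamped)
```

Because no float tensor is involved in the bounds, this variant does not depend on how a given PyTorch version promotes types in `clamp_`.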

@iscyy
Owner

iscyy commented Sep 7, 2022

Hi, thank you for your reply. I read these links. One of the solutions, WongKinYiu/yolor@1102f52, is consistent with the latest official yolov5 code. Can you resubmit a PR using that approach?

@ThibaultCastells
Contributor Author

ThibaultCastells commented Sep 7, 2022

Okay no problem, then I am closing this issue. Here is the new pull request.
