onnx export RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same? #41
@jinfagang you should be able to export to onnx like this. Run this command from the /yolov5 directory. export PYTHONPATH="$PWD"
python models/onnx_export.py --weights ./weights/yolov5s.pt --img 640 640 --batch 1
Output should look like this. You might want to ...
%416 = Shape(%285)
%417 = Constant[value = <Scalar Tensor []>]()
%418 = Gather[axis = 0](%416, %417)
%419 = Shape(%285)
%420 = Constant[value = <Scalar Tensor []>]()
%421 = Gather[axis = 0](%419, %420)
%422 = Shape(%285)
%423 = Constant[value = <Scalar Tensor []>]()
%424 = Gather[axis = 0](%422, %423)
%427 = Unsqueeze[axes = [0]](%418)
%430 = Unsqueeze[axes = [0]](%421)
%431 = Unsqueeze[axes = [0]](%424)
%432 = Concat[axis = 0](%427, %439, %440, %430, %431)
%433 = Reshape(%285, %432)
%434 = Transpose[perm = [0, 1, 3, 4, 2]](%433)
return %output, %415, %434
}
Export complete. ONNX model saved to ./weights/yolov5s.onnx
View with https://github.com/lutzroeder/netron
@glenn-jocher It turns out my model was trained on GPU and serialized on the cuda device, so when the input is not on cuda it throws this error. However, when I force the input to cuda, I get the opposite error; it seems some code inside the model still uses CPU tensors. Is there a special reason for using a CPU tensor there?
@jinfagang onnx export should only be done when the model is on cpu. The netron image you show is correct. The boxes are part of the v5 architecture, they are not related to image augmentation during training. At the moment onnx export stops at the output features. This is an example P3 output (smallest boxes) for 3 anchors with a grid size 40x24. The 85 features are xywh, objectness, and 80 class confidences.
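As a rough illustration of that feature layout (the shape and slicing below are assumptions based on the description above, not code from the repo):
import torch
# Hypothetical P3 head output as described: batch 1, 3 anchors, 40x24 grid, 85 features
p3 = torch.rand(1, 3, 40, 24, 85)
xywh = p3[..., 0:4]   # box center x, y and width, height
obj = p3[..., 4:5]    # objectness score
cls = p3[..., 5:]     # 80 class confidences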
@jinfagang I ran into a cuda issue with an onnx export today, and pushed a fix 1e2cb6b for this. This may or may not solve your original issue.
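For reference, a minimal export sketch with everything kept on CPU, which sidesteps the cuda/cpu mismatch above. It assumes the checkpoint stores the nn.Module under a 'model' key (as yolov5 checkpoints typically do); adjust the loading and opset to your setup.
import torch
weights = './weights/yolov5s.pt'
ckpt = torch.load(weights, map_location='cpu')   # load the checkpoint onto CPU
model = ckpt['model'].float().eval()             # assumes the module is stored under 'model'
img = torch.zeros(1, 3, 640, 640)                # dummy CPU input: batch 1, 3x640x640
torch.onnx.export(model, img, weights.replace('.pt', '.onnx'),
                  opset_version=11, input_names=['images'], output_names=['output'])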
@glenn-jocher So the output is the same as yolov3 in your previous repo? I want to access the outputs and accelerate it with tensorrt.
@glenn-jocher Can the anchor decode process also be exported into onnx? That way it would be more end-to-end when transferring to other platforms for inference.
@jinfagang yes, this would be more useful. It is more complicated to implement though, especially if you want a clean onnx graph. We will try to add this in the future.
@glenn-jocher I ran a tiny experiment on this; it ends up involving a ScatterND op, which is hard to convert to other platforms. If we want to eliminate this op, the postprocess code (the Detect layer here) needs to be rewritten (only for export mode, in a more complicated way, but it can export and works perfectly).
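For reference, the ScatterND usually comes from in-place slice assignment in the decode (writes like y[..., 0:2] = ...). A rough, hypothetical export-mode rewrite that avoids it uses split and cat instead; grid, anchor_grid and stride below stand in for the Detect layer's buffers, and the decode math follows the v5-style formulas, so check it against the repo.
import torch
def decode_export(y, grid, anchor_grid, stride, nc=80):
    # y: sigmoid output of one head, shape (bs, na, ny, nx, 5 + nc)
    # In-place writes like y[..., 0:2] = ... export as ScatterND; split/cat avoids that.
    xy, wh, rest = y.split((2, 2, nc + 1), dim=-1)
    xy = (xy * 2.0 - 0.5 + grid) * stride      # decoded box centers
    wh = (wh * 2.0) ** 2 * anchor_grid         # decoded box sizes
    return torch.cat((xy, wh, rest), dim=-1)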
@jinfagang I also ran into this issue. I resolved it by converting the model to cuda and then saving the weights, and used those weights to convert to an onnx model, but I ran into some issues converting onnx to TensorRT. If you successfully converted the model to TensorRT please let me know how you did that.
@makaveli10 I already converted the model to onnx and ran inference on TensorRT. However, this involved some special operations different from what this repo does, and accordingly the TensorRT side needs some special handling. Overall, the TensorRT-accelerated speed is about 38ms at a 1280x768 input resolution; the performance is quite good. You can add my wechat:
@jinfagang great work! What is the speedup compared to using detect.py? What GPU are you using?
@glenn-jocher I am using a GTX 1080 Ti; the speed was tested on it. The measured time includes post-processing (from the engine forward pass to NMS and copying data back to the CPU, etc.). I think the speed is almost the same as the darknet yolov4 converted to tensorrt (I previously tested with 800x800 input). The speed can still be optimized by moving all post-processing into a CUDA kernel and using fp16 or int8 quantization.
@jinfagang amazing to see you got it running in such a short time. I'm able to convert the pth files to onnx format but I keep getting this error when I try to convert to tensorrt6:
@jinfagang I don't have an account on WeChat, nor do I have a friend who can verify a new account. Can you please share your code for running onnx inference on TensorRT somehow? I am getting incorrect outputs from the engine that I generated using the onnx model.
@makaveli10 mind sharing how you were able to generate an onnx model that worked with TensorRT? Also, which version of TRT did you use?
@aj-ames https://github.com/TrojanXu/yolov5-tensorrt
@makaveli10 thanks. I will update my findings here.
I use this project, but I encounter the same error when I try to convert to tensorrt6: If you have some pointers for me, I would really appreciate it.
When I convert onnx to tensorrt, I encounter the same error when trying to convert to tensorrt6: I use tensorrt 6.0 with onnx 1.5.0 or 1.6.0, and neither works.
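For anyone hitting these parse failures, printing the parser errors usually shows exactly which op (for example the ScatterND mentioned above) TensorRT 6 cannot handle. A rough sketch using the TensorRT Python API, treating the file name and builder settings as assumptions (the config API differs between TensorRT versions):
import tensorrt as trt
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)
with open('yolov5s.onnx', 'rb') as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))         # shows the node/op that failed to parse
    else:
        builder.max_workspace_size = 1 << 30   # 1 GB workspace (legacy builder attribute)
        builder.fp16_mode = True               # optional fp16 if the GPU supports it
        engine = builder.build_cuda_engine(network)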
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Running onnx export I got this error:
The weights were trained on GPU, and I converted both the model and the image to the cuda device, so why does this error still happen?