RuntimeError when try load model to CUDA device #1454

Closed
Semihal opened this issue Nov 19, 2020 · 6 comments · Fixed by #1455
Labels
bug Something isn't working

Comments

Semihal commented Nov 19, 2020

🐛 Bug

I try to load a model (from hub) onto a CUDA device and receive a RuntimeError.

To Reproduce (REQUIRED)

Input:

import torch
import cv2

device = torch.device('cuda')
model = torch.hub.load('ultralytics/yolov5', 'yolov5l', pretrained=True).fuse().autoshape().eval().to(device)
image_path = 'data/images/bus.jpg'
img = cv2.imread(image_path)
r = model(img)

Output:

Using cache found in /home/appuser/.cache/torch/hub/ultralytics_yolov5_master

                 from  n    params  module                                  arguments                     
  0                -1  1      7040  models.common.Focus                     [3, 64, 3]                    
  1                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  2                -1  1    161152  models.common.BottleneckCSP             [128, 128, 3]                 
  3                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
  4                -1  1   1627904  models.common.BottleneckCSP             [256, 256, 9]                 
  5                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
  6                -1  1   6499840  models.common.BottleneckCSP             [512, 512, 9]                 
  7                -1  1   4720640  models.common.Conv                      [512, 1024, 3, 2]             
  8                -1  1   2624512  models.common.SPP                       [1024, 1024, [5, 9, 13]]      
  9                -1  1  10234880  models.common.BottleneckCSP             [1024, 1024, 3, False]        
 10                -1  1    525312  models.common.Conv                      [1024, 512, 1, 1]             
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1   2823680  models.common.BottleneckCSP             [1024, 512, 3, False]         
 14                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1    707328  models.common.BottleneckCSP             [512, 256, 3, False]          
 18                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]              
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1   2561536  models.common.BottleneckCSP             [512, 512, 3, False]          
 21                -1  1   2360320  models.common.Conv                      [512, 512, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1  10234880  models.common.BottleneckCSP             [1024, 1024, 3, False]        
 24      [17, 20, 23]  1    457725  models.yolo.Detect                      [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [256, 512, 1024]]
Model Summary: 499 layers, 47818749 parameters, 47818749 gradients

Fusing layers... 
Model Summary: 400 layers, 47790077 parameters, 47790077 gradients
Adding autoShape... 

RuntimeError                              Traceback (most recent call last)
<ipython-input-3-5dc3294ee725> in <module>
      3 image_path = 'data/images/bus.jpg'
      4 img = cv2.imread(image_path)
----> 5 r = model(img)

~/.conda/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in forward(self, imgs, size, augment, profile)
    172                 y[i][:, :4] = scale_coords(shape1, y[i][:, :4], shape0[i])
    173 
--> 174         return Detections(imgs, y, self.names)
    175 
    176 

~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in __init__(self, imgs, pred, names)
    185         self.xywh = [xyxy2xywh(x) for x in pred]  # xywh pixels
    186         gn = [torch.Tensor([*[im.shape[i] for i in [1, 0, 1, 0]], 1., 1.]) for im in imgs]  # normalization gains
--> 187         self.xyxyn = [x / g for x, g in zip(self.xyxy, gn)]  # xyxy normalized
    188         self.xywhn = [x / g for x, g in zip(self.xywh, gn)]  # xywh normalized
    189         self.n = len(self.pred)

~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in <listcomp>(.0)
    185         self.xywh = [xyxy2xywh(x) for x in pred]  # xywh pixels
    186         gn = [torch.Tensor([*[im.shape[i] for i in [1, 0, 1, 0]], 1., 1.]) for im in imgs]  # normalization gains
--> 187         self.xyxyn = [x / g for x, g in zip(self.xyxy, gn)]  # xyxy normalized
    188         self.xywhn = [x / g for x, g in zip(self.xywh, gn)]  # xywh normalized
    189         self.n = len(self.pred)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
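
The traceback ends at line 187 of models/common.py, where each prediction tensor is divided by a gain tensor from `gn`. The predictions follow the model to cuda:0, while `torch.Tensor([...])` allocates on the CPU by default, which would explain the mismatch. Below is a minimal sketch of one way to avoid it: build the gains on the same device as the predictions. The helper name `normalization_gains` and the stand-in image and detection tensors are assumptions made for illustration; this is not necessarily the change made in PR #1455.

```python
import torch


def normalization_gains(imgs, device):
    """Build one [w, h, w, h, 1, 1] gain tensor per image on `device`.

    Hypothetical helper for illustration; not part of the YOLOv5 code base.
    """
    return [
        torch.tensor([*(im.shape[i] for i in [1, 0, 1, 0]), 1.0, 1.0], device=device)
        for im in imgs
    ]


# Use the GPU when one is present; the sketch also runs on CPU-only machines.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-ins for one HWC image and its (n, 6) detections: x1, y1, x2, y2, conf, cls.
imgs = [torch.zeros(480, 640, 3)]
pred = [torch.tensor([[64.0, 48.0, 320.0, 240.0, 0.9, 0.0]], device=device)]

# Allocate the gains on the same device as the predictions before dividing.
gn = normalization_gains(imgs, pred[0].device)
xyxyn = [x / g for x, g in zip(pred, gn)]
print(xyxyn[0])
```

Passing `device=` at construction time avoids a separate host-to-device copy and keeps the division valid whichever device the model was moved to.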

Environment

If applicable, add screenshots to help explain your problem.

Cuda compilation tools, release 10.1, V10.1.243
OS: CentOS

Semihal added the bug label Nov 19, 2020

github-actions bot commented Nov 19, 2020

Hello @Semihal, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher (Member) commented

@Semihal thanks for the bug report. You might want to try passing force_reload=True to your torch.hub.load() call to make sure you are using the latest code.

Also, autoshape() already includes a call to eval(), so you can eliminate that part. I will try to reproduce in a Colab notebook.
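
As a small illustration of why the extra eval() in the chain is redundant but harmless, the sketch below uses a stand-in `nn.Sequential` (an assumption; it is not the YOLOv5 model and does not exercise autoshape() itself). It shows that `eval()` returns the module, which is what makes the chaining work, and that a second call changes nothing.

```python
import torch.nn as nn

# Stand-in module (not the YOLOv5 model) with a layer whose behavior depends on the mode.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

# eval() returns the module itself, so it can be chained like .fuse() or .to(device).
same = model.eval()
assert same is model

# A second eval() call is a no-op: every submodule is already in inference mode.
model.eval()
print(all(not m.training for m in model.modules()))
```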

glenn-jocher (Member) commented

Yes I see the same error. Will debug.

glenn-jocher linked a pull request Nov 19, 2020 that will close this issue

glenn-jocher (Member) commented

@Semihal PR #1455 should fix this. Can you try to run the same commands with force_reload=True to recache the latest update?

model = torch.hub.load('ultralytics/yolov5', 'yolov5l', pretrained=True, force_reload=True).fuse().autoshape().to(device)

Semihal commented Nov 19, 2020

@glenn-jocher, yes, everything works, thank you!

glenn-jocher (Member) commented

@Semihal great! Let me know if you come across any other bugs. We are trying to improve the repo rapidly, so every once in a while we accidentally introduce a bug along with our updates.
