
Run tutorial: RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM #1469

Closed

nguyen14ck opened this issue Nov 22, 2020 · 20 comments

Labels
bug (Something isn't working), Stale (Stale and scheduled for closing soon)

Comments

nguyen14ck commented Nov 22, 2020

Issue #185 was closed, so I am opening a new one.

🐛 Bug

Training on COCO128 crashes immediately with RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM. The full traceback is included under Output in the To Reproduce section below.

To Reproduce (REQUIRED)

# Train YOLOv5s on COCO128 for 3 epochs
!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt --nosave --cache

Output:

Model Summary: 283 layers, 7468157 parameters, 7468157 gradients

Transferred 370/370 items from yolov5s.pt
Optimizer groups: 62 .bias, 70 conv.weight, 59 other
Scanning labels data/coco128/labels/train2017.cache (126 found, 0 missing, 2 empty, 0 duplicate, for 128 images): 128it [00:00, 9818.95it/s]
Caching images (0.1GB): 100%|███████████████| 128/128 [00:00<00:00, 1223.42it/s]
Scanning labels data/coco128/labels/train2017.cache (126 found, 0 missing, 2 empty, 0 duplicate, for 128 images): 128it [00:00, 9018.49it/s]
Caching images (0.1GB): 100%|████████████████| 128/128 [00:00<00:00, 562.80it/s]

Analyzing anchors... anchors/target = 4.26, Best Possible Recall (BPR) = 0.9946
Image sizes 640 train, 640 test
Using 8 dataloader workers
Logging results to runs/train/exp3
Starting training for 3 epochs...

     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
  0%|                                                     | 0/8 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 490, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 292, in train
    scaler.scale(loss).backward()
  File "/home/npnguyen/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/npnguyen/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM
Exception raised from operator() at /opt/conda/conda-bld/pytorch_1595629416375/work/aten/src/ATen/native/cudnn/Conv.cpp:1141 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x4d (0x7f9ff06da77d in /home/npnguyen/anaconda3/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xcadca2 (0x7f9f79915ca2 in /home/npnguyen/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0xcafe05 (0x7f9f79917e05 in /home/npnguyen/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_cuda.so)
frame #3: <unknown function> + 0xcb06ce (0x7f9f799186ce in /home/npnguyen/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0xcb0d90 (0x7f9f79918d90 in /home/npnguyen/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_cuda.so)
frame #5: at::native::cudnn_convolution_backward_weight(c10::ArrayRef<long>, at::Tensor const&, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, long, bool, bool) + 0x49 (0x7f9f79918fe9 in 

Expected behavior

Fusing layers...
Model Summary: 484 layers, 88922205 parameters, 0 gradients
Scanning labels ../coco/labels/val2017.cache (4952 found, 0 missing, 48 empty, 0 duplicate, for 5000 images): 5000it [00:00, 14785.71it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100% 157/157 [01:30<00:00, 1.74it/s]
all 5e+03 3.63e+04 0.409 0.754 0.672 0.484
Speed: 5.9/2.1/7.9 ms inference/NMS/total per 640x640 image at batch-size 32

Evaluating pycocotools mAP... saving runs/test/exp/yolov5x_predictions.json...
loading annotations into memory...
Done (t=0.43s)

Environment

  • OS: CentOS 7
  • GPU: Quadro RTX 5000

Additional Information

%pip install -qr requirements.txt  # install dependencies

import torch
from IPython.display import Image, clear_output  # to display images

clear_output()
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

Setup complete. Using torch 1.6.0 _CudaDeviceProperties(name='Quadro RTX 5000', major=7, minor=5, total_memory=16117MB, multi_processor_count=48)
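
For completeness, the check above can be extended to also print the CUDA/cuDNN versions that PyTorch was built against and the compute capability of every visible GPU. This is a minimal diagnostic sketch using only standard torch calls, not anything YOLOv5-specific:

import torch

print('torch:', torch.__version__)
print('CUDA build:', torch.version.cuda)
print('cuDNN:', torch.backends.cudnn.version())
for i in range(torch.cuda.device_count()):
    p = torch.cuda.get_device_properties(i)
    # name, compute capability and memory for each visible device
    print('cuda:%d %s, capability %d.%d, %dMB' % (i, p.name, p.major, p.minor, p.total_memory // 2**20))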

@nguyen14ck nguyen14ck added the bug Something isn't working label Nov 22, 2020
github-actions bot (Contributor) commented Nov 22, 2020

Hello @nguyen14ck, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher (Member) commented

@nguyen14ck install Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7.

nguyen14ck (Author) commented

Thanks, @glenn-jocher.
I installed Python 3.8.5, PyTorch 1.7, and the requirements.
But the problem still exists:

Epoch 1/1:   0%|        | 0/2699 [00:04<?, ?img/s]
Traceback (most recent call last):
  File "/home/centos_user/Documents/WD/DEEP_LEARNING/Notebooks/work2/yolov4/train_wheat.py", line 700, in <module>
    train(model=model,
  File "/home/centos_user/Documents/WD/DEEP_LEARNING/Notebooks/work2/yolov4/train_wheat.py", line 420, in train
    bboxes_pred = model(images)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/centos_user/Documents/WD/DEEP_LEARNING/Notebooks/work2/yolov4/input/pytorch-YOLOv4/tool/darknet2pytorch.py", line 172, in forward
    x = self.models[ind](x)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/home/centos_user/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.
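
A quick way to check whether the failure is tied to one particular GPU rather than to the model code is to run a tiny convolution forward/backward on each device in isolation. A minimal diagnostic sketch (not part of the repo):

import torch
import torch.nn as nn

for i in range(torch.cuda.device_count()):
    dev = 'cuda:%d' % i
    try:
        conv = nn.Conv2d(3, 16, kernel_size=3, padding=1).to(dev)
        x = torch.randn(2, 3, 64, 64, device=dev)
        conv(x).sum().backward()   # exercises cuDNN forward and backward-weight kernels
        print(dev, 'OK')
    except RuntimeError as e:
        print(dev, 'FAILED:', e)

If only one of the devices fails, the problem is likely the driver/cuDNN pairing for that card rather than the training script.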

glenn-jocher (Member) commented

@nguyen14ck I'm not sure exactly what the problem may be. We've had some problems with Anaconda in the past, so one thing I would recommend is for you to simply create a new virtual Python 3.8 environment (venv), clone the latest repo (code changes daily), and pip install -r requirements.txt again.

Other than that it may be an issue with your drivers.

You can always try the docker container as well, as it should completely remove all environment problems.
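
Once a fresh environment is built, a quick sanity check that the interpreter and torch actually meet the stated minimums can save a round trip. A minimal sketch, nothing YOLOv5-specific:

import sys
import torch

assert sys.version_info >= (3, 8), 'Python 3.8+ required, found %s' % sys.version.split()[0]
major, minor = (int(v) for v in torch.__version__.split('+')[0].split('.')[:2])
assert (major, minor) >= (1, 7), 'torch>=1.7 required, found %s' % torch.__version__
print('Environment OK:', sys.version.split()[0], torch.__version__, 'CUDA:', torch.cuda.is_available())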

nguyen14ck (Author) commented

Thanks, @glenn-jocher.
That is a new Conda env with Python 3.8, PyTorch 1.7, CUDA 11.1, and cuDNN 8.0.5:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
$ ./mnistCUDNN
--
Executing:   mnistCUDNN
cudnnGetVersion()   : 8005 , CUDNN_VERSION from cudnn.h : 8005 (8.0.5)
Host   compiler version : GCC 4.8.5
$ python -c "import torch;from torch.utils.cpp_extension import CUDA_HOME;print(CUDA_HOME);print(torch.cuda.is_available())"
/usr/local/cuda/
True

glenn-jocher (Member) commented Nov 23, 2020

@nguyen14ck sure. We don't have the resources to help people with their local environments, which is why we offer the four validated environments. I would recommend you start from one of these:

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu.

github-actions bot (Contributor) commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale Stale and schedule for closing soon label Dec 24, 2020
blakeliu commented Jan 10, 2021

I hit the same bug when I use two NVIDIA graphics cards (RTX 2070 and GTX 1070 Ti):

$ python train.py --device 0,1 --img 640 --batch 16 --epochs 5 --data coco128.yaml --weights yolov5s.pt
Using torch 1.7.1+cu110 CUDA:0 (GeForce RTX 2070, 7982MB)
CUDA:1 (GeForce GTX 1070 Ti, 8118MB)

Traceback (most recent call last):
  File "/home/blake/cv/yolo/yolov5/train.py", line 490, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "/home/blake/cv/yolo/yolov5/train.py", line 286, in train
    pred = model(imgs)  # forward
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/blake/cv/yolo/yolov5/models/yolo.py", line 121, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/home/blake/cv/yolo/yolov5/models/yolo.py", line 137, in forward_once
    x = m(x)  # run
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/blake/cv/yolo/yolov5/models/common.py", line 70, in forward
    y2 = self.cv2(x)
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/home/blake/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM

If I use only the RTX 2070 or only the GTX 1070 Ti, the program runs normally!

My Env:

OS: Ubuntu 18.04.5 LTS

Driver Version: 460.32.03

(torch) blake@workstation:~/cv/yolo/yolov5$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0

(torch) blake@workstation:~/cv/yolo/yolov5$ conda list python
# packages in environment at /home/blake/anaconda3/envs/torch:
python                    3.7.9                h7579374_0    defaults
python-dateutil           2.8.1                    pypi_0    pypi


(torch) blake@workstation:~/cv/yolo/yolov5$ conda list torch
torch                     1.7.1+cu110              pypi_0    pypi
torchaudio                0.7.2                    pypi_0    pypi
torchvision               0.8.2+cu110              pypi_0    pypi

(torch) blake@workstation:~/cv/yolo/yolov5$ nvidia-smi 
Sun Jan 10 11:35:12 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 107...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   37C    P8    13W / 180W |    523MiB /  8118MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 2070    Off  | 00000000:02:00.0 Off |                  N/A |
| 34%   14C    P8    17W / 175W |     10MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

glenn-jocher (Member) commented

@blakeliu best practice is to run Multi-GPU only with identical cards.
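
As an illustration of that advice, one might guard a DataParallel wrapper so it is only used when all visible GPUs report the same compute capability. A hedged sketch (wrap_multi_gpu is a hypothetical helper; this is not how YOLOv5 itself selects devices):

import torch
import torch.nn as nn

def wrap_multi_gpu(model):
    # Only enable DataParallel when every visible GPU has the same compute capability.
    caps = {torch.cuda.get_device_capability(i) for i in range(torch.cuda.device_count())}
    if torch.cuda.device_count() > 1 and len(caps) == 1:
        return nn.DataParallel(model.cuda())   # identical cards: multi-GPU is safe
    return model.cuda()                        # mixed cards or a single card: stay on one device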

blakeliu commented

@glenn-jocher Thank you for the advice.

tetsu-kikuchi commented Jun 2, 2021

For your information:
In my case, this error happened when there were multiple GPUs in my machine.
When I added --device 0 to python train.py, the error did not happen and the code worked correctly.

It seems the problem was using different types of GPU. In my case, I used two GPUs:
GeForce GTX 1070 Ti
GeForce RTX 2080 Ti
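
For reference, the effect of --device 0 can also be reproduced at the process level by hiding the other GPUs before torch initializes CUDA. This is a generic CUDA environment-variable trick, not YOLOv5-specific:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   # must be set before torch creates a CUDA context

import torch
print(torch.cuda.device_count())           # now reports 1; cuda:0 maps to the selected card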

glenn-jocher (Member) commented

@tetsu-kikuchi interesting, thanks for the feedback! I've been thinking we should default to --device 0 rather than use all devices by default. Do you think this is a good idea?

tetsu-kikuchi commented Jun 3, 2021

@glenn-jocher Thank you for your response. Using multiple GPUs sometimes causes unexpected errors, and GPU-related error messages often make it hard to find the root cause. So I think setting --device 0 as the default would be convenient, especially for beginners (including me).

glenn-jocher (Member) commented

TODO: default to device 0 rather than all available devices.

@glenn-jocher glenn-jocher added the TODO High priority items label Jun 3, 2021
@glenn-jocher glenn-jocher reopened this Jun 3, 2021
@github-actions github-actions bot removed the Stale Stale and schedule for closing soon label Jun 4, 2021
tetsu-kikuchi commented Jun 7, 2021

Additional information:
Something strange, the opposite of the previous case, happened. On another machine with two GPUs (and the same YOLOv5 code), a cuDNN error occurred whether I set --device 0 or --device 1. The error did not occur only when I left --device at its default (i.e., using multiple GPUs).

Below is the error message when I set --device 0 or --device 1. I slightly customized the YOLOv5 code for my purposes, but only for miscellaneous things, mainly in utils/dataset.py.

Traceback (most recent call last):
  File "train.py", line 657, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 408, in train
    scaler.scale(loss).backward()
  File "/opt/conda/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
    Variable._execution_engine.run_backward(
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

The GPU information:

YOLOv5 🚀 v5.0-54-gf55730e torch 1.8.0 CUDA:0 (GeForce GTX 1080 Ti, 11178.5MB)
                                      CUDA:1 (GeForce GTX 1080 Ti, 11175.375MB)
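
One generic debugging step for cuDNN errors like CUDNN_STATUS_NOT_INITIALIZED (a way to isolate the problem, not a fix) is to disable cuDNN and check whether the native CUDA kernels run. If training then proceeds, the issue lies in the cuDNN installation or version pairing rather than the model code:

import torch

torch.backends.cudnn.enabled = False   # fall back to native (non-cuDNN) convolution kernels
# ...then rerun the failing training step: slower, but it isolates cuDNN as the culprit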

glenn-jocher (Member) commented

@tetsu-kikuchi since this error originates in torch, you should probably raise your issue in the PyTorch repository.

glenn-jocher (Member) commented Jun 7, 2021

@tetsu-kikuchi also, your YOLOv5 code is very out of date. To update:

  • Git – git pull from within your yolov5/ directory, or git clone https://github.com/ultralytics/yolov5 again
  • PyTorch Hub – Force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
  • Notebooks – View updated notebooks (Open In Colab, Open In Kaggle)
  • Docker – sudo docker pull ultralytics/yolov5:latest to update your image

tetsu-kikuchi commented

Thanks for your guidance.

github-actions bot (Contributor) commented Jul 8, 2021

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@github-actions github-actions bot added the Stale Stale and schedule for closing soon label Jul 8, 2021
@glenn-jocher glenn-jocher removed the TODO High priority items label Sep 26, 2021
glenn-jocher (Member) commented Sep 26, 2021

TODO removed as the original issue is now resolved. YOLOv5 training defaults to device 0 if CUDA is available, with multi-GPU or CPU training selectable via the --device argument:

python train.py --device 0,1,2,3
python train.py --device cpu
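
For illustration, a simplified sketch of how such a default could be implemented (pick_device is a hypothetical helper shown only for clarity; the actual YOLOv5 device-selection code, e.g. select_device in its utils, may differ):

import os
import torch

def pick_device(device=''):
    # device: '' (auto), '0', '0,1,2,3' or 'cpu'; a sketch, not the real YOLOv5 helper
    if device.lower() == 'cpu':
        return torch.device('cpu')
    if device:
        os.environ['CUDA_VISIBLE_DEVICES'] = device   # restrict to the requested GPUs; must run before CUDA init
    if torch.cuda.is_available():
        return torch.device('cuda:0')                 # default to the first visible GPU
    return torch.device('cpu')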
