Closed
Labels
bug, help wanted, tuner
Description
🐛 Bug
pytorch_lightning.utilities.memory.is_cuda_out_of_memory does not work reliably, causing things like auto_scale_batch_size tuning to fail.
Please reproduce using the BoringModel
I'm afraid I don't know what BoringModel is.
To Reproduce
The following call returns False:
is_cuda_out_of_memory(RuntimeError('CUDA error: out of memory'))
The function only checks for the substring "CUDA out of memory", so the "CUDA error: out of memory" message variant is not recognized.
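For reference, a minimal reproduction (assuming pytorch-lightning 1.2.4, as in the environment below):

```python
from pytorch_lightning.utilities.memory import is_cuda_out_of_memory

# Message format raised by some CUDA OOM failures (e.g. errors surfaced by the
# CUDA runtime/driver rather than the caching allocator).
exc = RuntimeError('CUDA error: out of memory')

print(is_cuda_out_of_memory(exc))  # prints False, although this is a CUDA OOM error
```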
Expected behavior
The above should return True.
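A rough sketch of a more permissive check, written as a hypothetical helper (not the library's current implementation; the extra message variant handled here is the point of this report):

```python
def looks_like_cuda_oom(exception: BaseException) -> bool:
    """Heuristically detect a CUDA out-of-memory error from its message."""
    if not isinstance(exception, RuntimeError) or len(exception.args) != 1:
        return False
    message = exception.args[0]
    # Cover both phrasings observed in practice:
    #   "CUDA out of memory"        (caching allocator)
    #   "CUDA error: out of memory" (raw CUDA runtime error, as in this report)
    return "CUDA out of memory" in message or "CUDA error: out of memory" in message
```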
Environment
- CUDA:
  - GPU:
    - NVIDIA GeForce RTX 3090
    - NVIDIA GeForce RTX 2060 SUPER
  - available: True
  - version: 11.1
- Packages:
  - numpy: 1.20.1
  - pyTorch_debug: False
  - pyTorch_version: 1.8.0+cu111
  - pytorch-lightning: 1.2.4
  - tqdm: 4.59.0
- System:
  - OS: Linux
  - architecture:
    - 64bit
    - ELF
  - processor:
  - python: 3.7.9
  - version: #1 SMP Tue Jun 23 12:58:10 UTC 2020
root@eb:~# neofetch
OS: Debian GNU/Linux 10 (buster) on Windows 10 x86_64
Kernel: 4.19.128-microsoft-standard
Uptime: 19 hours, 44 mins
Packages: 594 (dpkg)
Shell: bash 5.0.3
Terminal: /dev/pts/4
CPU: AMD Ryzen Threadripper 3970X 32-Core (64) @ 3.693GHz
Memory: 23149MiB / 257643MiB
akihironitta