CUDA out of memory #4

Open
lambdald opened this issue Jul 24, 2024 · 12 comments
@lambdald

lambdald commented Jul 24, 2024

Hello, when I was running the small_city data, I encountered the following error. How can I solve it?

File "/data/workspace/hierarchical-3d-gaussians/train_coarse.py", line 94, in training
    render_pkg = render_coarse(viewpoint_cam, gaussians, pipe, background, indices = indices)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/workspace/hierarchical-3d-gaussians/gaussian_renderer/__init__.py", line 381, in render_coarse
    rendered_image, radii, _ = rasterizer(
                               ^^^^^^^^^^^
  File "/data/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/diff_gaussian_rasterization/__init__.py", line 205, in forward
    return rasterize_gaussians(
           ^^^^^^^^^^^^^^^^^^^^
  File "/data/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/diff_gaussian_rasterization/__init__.py", line 28, in rasterize_gaussians
    return _RasterizeGaussians.apply(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/torch/autograd/function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/diff_gaussian_rasterization/__init__.py", line 84, in forward
    num_rendered, color, radii, geomBuffer, binningBuffer, imgBuffer, invdepths = _C.rasterize_gaussians(*args)
                                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 184.32 GiB. GPU
@White-Mask-230

Hello, the problem is that your computer's GPU does not have enough memory to run the program. The easiest solution is to create a GitHub Codespace for this repository and then run the program as you would in Visual Studio Code.

To create the GitHub Codespace:

  1. Open this repository
  2. Press the "." key
  3. Open the terminal as you would in Visual Studio Code (in my case "Ctrl + ñ")
  4. The terminal will give you the option to create a local clone or a GitHub Codespace; choose the GitHub Codespace option

And that's all.

@Snosixtyboo
Collaborator

> (quoting @White-Mask-230's Codespaces suggestion above)

Possible, but the message says that the code was trying to allocate 180 GB, which is a bit insane. So it looks like some sort of bug.

@lambdald If you want us to take a look please provide the full Dockerfile you are using. If you are not using Docker, it's gonna be hard to recreate your issue since it's likely setup-specific.
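The implausibility of that allocation can be sanity-checked with quick arithmetic. A hedged sketch (only the 184.32 GiB figure comes from the traceback; the interpretation as float32 elements is an assumption for illustration):

```python
# Back-of-envelope check on the reported allocation. A request this large
# usually means a buffer size was computed from a corrupted or overflowed
# count, not from real memory demand.
GIB = 1024 ** 3
requested_bytes = int(184.32 * GIB)   # what the allocator was asked for
floats = requested_bytes // 4         # interpreted as float32 elements
print(f"{requested_bytes:,} bytes ~ {floats:,} float32 values")
# Tens of billions of elements is far beyond any plausible per-frame render
# buffer, which points at a bad size computation rather than a small GPU.
```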

@White-Mask-230

White-Mask-230 commented Jul 24, 2024

True, I didn't see the end of the error.

I get the same error when I run small_city, so it may be a bug in the program. Searching for information, I found this: https://stackoverflow.com/questions/59129812/how-to-avoid-cuda-out-of-memory-in-pytorch, but I'm unable to make it work.

@lambdald
Author

> Possible, but the message says that the code was trying to allocate 180 GB, which is a bit insane. So it looks like some sort of bug.
>
> @lambdald If you want us to take a look please provide the full Dockerfile you are using. If you are not using Docker, it's gonna be hard to recreate your issue since it's likely setup-specific.

@Snosixtyboo Sorry, I use conda to manage my environment instead of Docker, and I set up the Python environment following the README.

@SunHongyang10

same bug

@White-Mask-230

Try this:

  1. Open the Python console by running python3
  2. Import torch: import torch
  3. Clear the cache: torch.cuda.empty_cache()

Reference: https://stackoverflow.com/questions/59129812/how-to-avoid-cuda-out-of-memory-in-pytorch
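For reference, a minimal, guarded version of those steps (a sketch assuming only that PyTorch may be installed). Note that empty_cache() only returns PyTorch's cached, currently unused blocks to the driver; it does not free memory held by live tensors, so it rarely fixes an OOM inside a single training process:

```python
import importlib.util

# Guarded sketch of the steps above: clear PyTorch's CUDA caching allocator.
status = "torch not installed"
if importlib.util.find_spec("torch") is not None:
    import torch
    if torch.cuda.is_available():
        # Releases cached (unused) blocks back to the driver; live tensors
        # keep their memory, so this does not help with a single huge request.
        torch.cuda.empty_cache()
        status = "cache cleared"
    else:
        status = "no CUDA device"
print(status)
```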

@SunHongyang10

> Try this: open the Python console (python3), import torch, and clear the cache with torch.cuda.empty_cache().

I tried, and I still hit the bug.

@White-Mask-230

White-Mask-230 commented Jul 27, 2024

Run this code and tell us what it prints:

import torch

print(torch.cuda.memory_summary(device=None, abbreviated=False))

@kevintsq

Same problem. OOM occurs on a GPU with 80 GB of memory but not on a GPU with 8 GB, using the same dataset.

@White-Mask-230

@kevintsq Interesting, tell us more about the two machines.

@kevintsq

The former is an HPC using Slurm on Linux, and the latter is a Windows laptop. I believe I compiled the submodules for the correct CUDA compute capabilities. I've tried CUDA 12.1, 12.3, 12.4, and 12.5 with PyTorch 2.3 and 2.4 on the HPC, but the problem persists (with 12.1 it is an illegal memory access instead). The laptop runs fine on CUDA 12.4 + PyTorch 2.4.
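Since the failures here seem to track the CUDA toolkit + PyTorch combination, it may help to report the exact versions in play on each machine. A hedged sketch (assumes only that PyTorch may be installed; nothing here is specific to this repository):

```python
import importlib.util
import platform

# Collect the version info that matters when rebuilding the rasterizer
# submodule against a particular CUDA toolkit.
info = {"python": platform.python_version()}
if importlib.util.find_spec("torch") is not None:
    import torch
    info["torch"] = torch.__version__
    # The CUDA version PyTorch itself was compiled against (None on CPU-only builds):
    info["torch_built_with_cuda"] = torch.version.cuda
    info["cuda_device_available"] = torch.cuda.is_available()
print(info)
```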

@LRLVEC

LRLVEC commented Aug 28, 2024

I ran into the same issue on Ubuntu 22.04 and fixed it by switching from CUDA 12.5 to 12.2.
