
Lack of memory on the GPU during training #16

Open · dwz138831 opened this issue Mar 31, 2024 · 3 comments

@dwz138831

Hi, I'm reproducing with the latest version of the code on 8 A30 GPUs (24 GB of memory each), and it shows CUDA out of memory at the beginning of the first epoch. What is the cause of this?

@LinShan-Bin
Owner

To reproduce the full results, you need 80 GB of memory. The baseline method (no semantics, `contracted_coord = False`, and `auxiliary_frame = False`) can be reproduced with 24 GB of memory.
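
For reference, a minimal sketch of those baseline overrides as they might appear in a Python config. The two flags are named in this thread, but the semantics flag and the config layout below are assumptions; check the repo's config files for the exact option names.

```python
# Baseline settings that fit in 24 GB, per the reply above.
use_semantic = False        # hypothetical flag name: train without the semantic head
contracted_coord = False    # disable the contracted (unbounded) outer coordinates
auxiliary_frame = False     # disable auxiliary-frame supervision
```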

@ZiyangYan commented Apr 16, 2024

> To reproduce the full results, you need 80 GB of memory. The baseline method (no semantics, `contracted_coord = False`, and `auxiliary_frame = False`) can be reproduced with 24 GB of memory.

Hi, I changed the config as you mentioned and ran it on 4 × 24 GB A30 GPUs, but it still has the OOM problem.

@LinShan-Bin
Owner

> To reproduce the full results, you need 80 GB of memory. The baseline method (no semantics, `contracted_coord = False`, and `auxiliary_frame = False`) can be reproduced with 24 GB of memory.

> Hi, I changed the config as you mentioned and ran it on 4 × 24 GB A30 GPUs, but it still has the OOM problem.

For `contracted_coord = False`, we recommend using a voxel size of [16, 200, 200]. This keeps the resolution of the inner part and removes the contracted outer part. Although we used [24, 300, 300] in our experiments for fair comparison, you will have to cut down the voxel size to save memory.
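
As an illustration, the corresponding config change might look like the sketch below. This is a sketch only: `voxel_size` is an assumed option name, and the grid shapes are the ones quoted in this comment.

```python
# Memory-saving grid for the non-contracted baseline.
contracted_coord = False
# The experiments in the paper used [24, 300, 300]; without the contracted
# outer region, [16, 200, 200] keeps the inner resolution and saves memory.
voxel_size = [16, 200, 200]
```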
