
Assign image tensors to data_device immediately on creation. #667

Open · wants to merge 1 commit into base: main
Conversation

@GaneshBannur commented Feb 21, 2024

Tensors created from PIL images are first allocated on the CPU:

resized_image = torch.from_numpy(np.array(resized_image_PIL)) / 255.0

If data_device is "cuda", they are later moved to the GPU. Normally the now-unreferenced CPU tensors should be released, but PyTorch doesn't seem to free this memory. The result is high CPU RAM consumption for the entire training duration, even when data_device is "cuda".

Moving the tensors to data_device immediately on creation dramatically decreases CPU RAM consumption when data_device is "cuda". When training on a T4 instance on Colab with 200 images, CPU RAM consumption went from 10GB down to 2GB. GPU VRAM consumption doesn't increase, since the tensors are eventually moved to the GPU anyway.

It might help to move all tensors to data_device immediately on creation since PyTorch doesn't seem to deallocate RAM for CPU tensors.
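A minimal sketch of the pattern this PR describes, using a synthetic NumPy array in place of a PIL image (the `image_to_tensor` helper is hypothetical, not code from the repository):

```python
import numpy as np
import torch

def image_to_tensor(image_array: np.ndarray, data_device: str) -> torch.Tensor:
    # Before the change, the tensor lives on the CPU first and is moved later,
    # leaving a CPU copy that PyTorch does not appear to release:
    #   t = torch.from_numpy(image_array) / 255.0   # CPU allocation
    #   t = t.to(data_device)                       # moved afterwards
    # After the change, the tensor is placed on data_device immediately:
    return torch.from_numpy(image_array).to(data_device) / 255.0

# Fall back to CPU when no GPU is available, so the sketch runs anywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"
img = np.zeros((4, 4, 3), dtype=np.uint8)  # stand-in for np.array(resized_image_PIL)
t = image_to_tensor(img, device)
```

With `data_device="cuda"`, the only lasting allocation is on the GPU; the intermediate CPU tensor from `from_numpy` is transient rather than held for the whole training run.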

nnmhuy added a commit to nnmhuy/gaussian-splatting that referenced this pull request Sep 24, 2024