
SwinUNETR CUDA out of memory on NVIDIA RTX 4070 #8042

Closed
faizan1234567 opened this issue Aug 25, 2024 · 2 comments

Comments

@faizan1234567

Describe the bug
Hi, thank you so much for this nice healthcare toolkit. I am using SwinUNETR as a baseline method in my thesis and training it on the BraTS 2023 brain tumor segmentation dataset. I set the following parameters when creating SwinUNETR:

device = 'cuda' if torch.cuda.is_available() else 'cpu'
SwinUNETR(
    img_size=128,
    in_channels=4,
    out_channels=3,
    feature_size=24,
    drop_rate=0.0,
    attn_drop_rate=0.0,
    dropout_path_rate=0.0,
    spatial_dims=3,
    use_checkpoint=False,
    use_v2=False).to(device)

However, when I run the model on inputs of size (4x128x128x128), I get a CUDA out-of-memory error, even with batch size 1. Traceback below:

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.32 GiB. GPU 0 has a total capacity of 11.71 GiB of which 681.50 MiB is free. Including non-PyTorch memory, this process has 10.59 GiB memory in use. Of the allocated memory 6.92 GiB is allocated by PyTorch, and 3.48 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
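For context, a rough back-of-envelope check (an editorial estimate, not from the traceback): a single 4x128x128x128 float32 volume is only about 32 MiB, so the pressure on the 11.7 GiB card comes from the activations and gradients kept alive inside the network during training, not from the input tensor itself.

```python
def tensor_mib(shape, bytes_per_elem=4):
    """Back-of-envelope size of a float32 tensor in MiB."""
    n = 1
    for d in shape:
        n *= d
    return n * bytes_per_elem / 2**20

# One BraTS input volume: batch 1, 4 channels, 128^3 voxels, float32.
print(tensor_mib((1, 4, 128, 128, 128)))  # -> 32.0
```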

Environment

Anaconda virtual environment with python version: 3.10.14

python -c "import monai; monai.config.print_debug_info()"

Log of the above command:

================================
Printing MONAI config...
================================
MONAI version: 1.3.2
Numpy version: 1.21.6
Pytorch version: 2.4.0+cu121
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 59a7211070538586369afd4a01eca0a7fe2e742e
MONAI __file__: /home/<username>/anaconda3/envs/thesis/lib/python3.10/site-packages/monai/__init__.py

Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
ITK version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: 5.2.1
scikit-image version: 0.21.0
scipy version: 1.11.4
Pillow version: 10.4.0
Tensorboard version: 2.17.0
gdown version: 5.2.0
TorchVision version: 0.19.0+cu121
tqdm version: 4.66.4
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: 6.0.0
pandas version: 2.0.3
einops version: 0.8.0
transformers version: NOT INSTALLED or UNKNOWN VERSION.
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: NOT INSTALLED or UNKNOWN VERSION.
clearml version: NOT INSTALLED or UNKNOWN VERSION.

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies


================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 22.04.4 LTS
Platform: Linux-6.5.0-28-generic-x86_64-with-glibc2.35
Processor: x86_64
Machine: x86_64
Python version: 3.10.14
Process name: pt_main_thread
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: []
Num physical CPUs: 12
Num logical CPUs: 20
Num usable CPUs: 20
CPU usage (%): [12.6, 11.5, 12.5, 12.5, 68.0, 12.5, 12.5, 11.5, 12.5, 11.8, 44.7, 13.3, 12.5, 11.7, 13.3, 12.5, 13.3, 12.5, 12.5, 12.5]
CPU freq. (MHz): 1110
Load avg. in last 1, 5, 15 mins (%): [2.3, 2.1, 2.2]
Disk usage (%): 82.1
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 15.4
Available memory (GB): 11.7
Used memory (GB): 3.3

================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 12.1
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 90100
Current device: 0
Library compiled for CUDA architectures: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
GPU 0 Name: NVIDIA GeForce RTX 4070
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 46
GPU 0 Total memory (GB): 11.7
GPU 0 CUDA capability (maj.min): 8.9
@faizan1234567
Author

Is there a gradient checkpointing feature for large models in MONAI?
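Gradient (activation) checkpointing discards intermediate activations during the forward pass and recomputes them during backward, trading compute for memory. A minimal CPU sketch with plain PyTorch's `torch.utils.checkpoint` (a toy two-layer block standing in for a transformer stage, not the actual SwinUNETR internals):

```python
import torch
from torch.utils.checkpoint import checkpoint

# Hypothetical toy block standing in for one network stage.
block = torch.nn.Sequential(
    torch.nn.Linear(16, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 16),
)

x = torch.randn(4, 16, requires_grad=True)

# Plain forward: all intermediate activations are kept for backward.
y_plain = block(x).sum()

# Checkpointed forward: activations are dropped and recomputed in
# backward, reducing peak memory at the cost of extra compute.
y_ckpt = checkpoint(block, x, use_reentrant=False).sum()

# Both paths yield identical values and gradients.
print(torch.allclose(y_plain, y_ckpt))
```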

@faizan1234567
Author

Problem solved. I enabled gradient checkpointing.
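For readers hitting the same error: the constructor in the issue description already exposes a `use_checkpoint` flag, so the fix presumably amounts to flipping it on (a sketch of the changed call, otherwise identical to the snippet above):

```python
SwinUNETR(
    img_size=128,
    in_channels=4,
    out_channels=3,
    feature_size=24,
    drop_rate=0.0,
    attn_drop_rate=0.0,
    dropout_path_rate=0.0,
    spatial_dims=3,
    use_checkpoint=True,  # recompute activations in backward to cut peak memory
    use_v2=False).to(device)
```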
