16GB is not enough? #18

Open
wpl427 opened this issue Aug 21, 2023 · 3 comments
Comments


wpl427 commented Aug 21, 2023

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.00 MiB (GPU 0; 14.61 GiB total capacity; 13.30 GiB already allocated; 9.19 MiB free; 13.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

What is the minimum amount of GPU memory required?
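The error message itself hints at whether the allocator setting it mentions would help. The sketch below is only an illustration: it parses the figures from the message above and compares reserved against allocated memory. The helper name `parse_oom` and the regular expressions are made up for this example and are not part of PyTorch.

```python
import re

# The message reported above, copied verbatim (shortened to the part with numbers).
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 22.00 MiB (GPU 0; 14.61 GiB total capacity; "
    "13.30 GiB already allocated; 9.19 MiB free; 13.78 GiB reserved in total by PyTorch)"
)

_UNITS_IN_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom(message):
    """Extract the sizes from a PyTorch OOM message, all converted to GiB.

    Hypothetical helper for illustration; the patterns only cover the
    message format shown in this thread.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * _UNITS_IN_GIB[match.group(2)]
    return sizes


sizes = parse_oom(OOM_MESSAGE)
# Memory PyTorch holds in its cache but has not handed out to tensors.
slack = sizes["reserved"] - sizes["allocated"]
print(f"reserved but unallocated: {slack:.2f} GiB")
print(f"share of the GPU held by live tensors: {sizes['allocated'] / sizes['total']:.0%}")
```

Here reserved memory exceeds allocated memory by only about 0.48 GiB, so fragmentation does not look like the main problem and `max_split_size_mb` is unlikely to free much. Most of the card is occupied by live tensors, which points toward shrinking the model footprint instead.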

Author

wpl427 commented Aug 21, 2023

Alternatively, how can I reduce GPU memory usage by changing the configuration?

Owner

zideliu commented Aug 21, 2023

You can try loading the CLIP model in half precision. In train_t2i_custom_v2.py, change the call to open_clip.create_model_and_transforms('ViT-bigG-14', 'laion2b_s39b_b160k', precision='fp16').
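As a rough back-of-the-envelope check of why fp16 helps, the sketch below estimates the memory taken by the model weights alone. The parameter count of about 2.5 billion for ViT-bigG-14 is an approximate figure used here as an assumption, and the estimate ignores activations, optimizer state, and the other models the training script loads.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params, precision):
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


# Assumption: ViT-bigG-14 has roughly 2.5 billion parameters in total.
APPROX_PARAMS = 2.5e9

for precision in ("fp32", "fp16"):
    gib = weight_memory_gib(APPROX_PARAMS, precision)
    print(f"{precision}: about {gib:.1f} GiB for weights")
```

Under that assumption the fp32 weights alone take more than half of a 16 GB card, and fp16 halves that figure. Whether this is enough to fit training on a 16 GB GPU depends on the rest of the pipeline, which this estimate does not cover.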

Author

wpl427 commented Aug 21, 2023

before:
prompt_model, _, _ = open_clip.create_model_and_transforms('ViT-bigG-14', 'laion2b_s39b_b160k')
after:
prompt_model, _, _ = open_clip.create_model_and_transforms('ViT-bigG-14', 'laion2b_s39b_b160k', precision='fp16')
error:
(faceswap) [root@prod-emr-gpu01 StyleDrop-PyTorch]# accelerate launch --num_processes 8 --mixed_precision fp16 train_t2i_custom_v2.py --config=configs/custom.py
2023-08-21 13:16:44.463755: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-21 13:16:45.329695: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
The following values were not passed to accelerate launch and had defaults used instead:
--num_machines was set to a value of 1
--num_cpu_threads_per_process was set to 8 to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
2023-08-21 13:16:49.788896: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-08-21 13:16:52.551 | INFO | __main__:train:63 - Process 0 using device: cuda
I0821 13:16:52.551910 140480274581312 factory.py:158] Loaded ViT-bigG-14 model config.
2023-08-21 13:16:52.578 | DEBUG | open_clip.transformer:__init__:314 - xattn in transformer of CLIP is True
2023-08-21 13:17:09.847 | DEBUG | open_clip.transformer:__init__:314 - xattn in transformer of CLIP is True
I0821 13:17:20.080452 140480274581312 factory.py:206] Loading pretrained ViT-bigG-14 weights (laion2b_s39b_b160k).
Traceback (most recent call last):
File "/data/miniconda3/envs/faceswap/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/data/miniconda3/envs/faceswap/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
args.func(args)
File "/data/miniconda3/envs/faceswap/lib/python3.9/site-packages/accelerate/commands/launch.py", line 837, in launch_command
simple_launcher(args)
File "/data/miniconda3/envs/faceswap/lib/python3.9/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/data/miniconda3/envs/faceswap/bin/python', 'train_t2i_custom_v2.py', '--config=configs/custom.py']' died with <Signals.SIGKILL: 9>.
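The last line of the traceback says the training process was killed by a signal and did not fail with a Python exception. Python's subprocess module reports this as a negative return code. The sketch below shows how to decode such a code; `describe_returncode` is a hypothetical helper written for this example. On Linux, a SIGKILL that arrives while a large checkpoint is loading is often the kernel's out-of-memory killer reacting to exhausted host RAM (not GPU memory), but the log above does not confirm that, so treat it as an assumption to verify, for example with `dmesg`.

```python
import signal


def describe_returncode(returncode):
    """Turn a subprocess return code into a readable description.

    Hypothetical helper: a negative code -N means the child was
    terminated by signal N (POSIX only).
    """
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with error code {returncode}"


# SIGKILL is signal 9, so the failure in the log corresponds to return code -9.
print(describe_returncode(-9))
print(describe_returncode(1))
```

If host RAM is the cause, note that the command above starts 8 processes and each one loads its own copy of the CLIP weights, so lowering `--num_processes` is one thing to try.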
