Multi GPU are not supported? #129

Open
17Reset opened this issue Aug 7, 2024 · 2 comments

Comments


17Reset commented Aug 7, 2024

I have multiple GPUs, but inference only runs on one.


(deblur_venv) xlab@xlab:/mnt/DiffBIR$ python -u inference.py --version v2 --task sr --upscale 4 --cfg_scale 4.0 --input /mnt/image_demos/ --output /mnt/image_demos/ouput --device cuda
use sdp attention as default
keep default attention mode
using device cuda
[3, 3, 64, 23, 32, 4]
Downloading: "https://github.com/cszn/KAIR/releases/download/v1.0/BSRNet.pth" to /mnt/DiffBIR/weights/BSRNet.pth

100%|██████████████████████████████████████████████████████████████████████████| 63.9M/63.9M [00:07<00:00, 8.92MB/s]
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is None and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is None and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is None and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is None and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is None and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
building SDPAttnBlock (sdp) with 512 in_channels
building SDPAttnBlock (sdp) with 512 in_channels
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is None and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is None and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is None and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is None and using 20 heads.
Setting up SDPCrossAttention (sdp). Query dim is 1280, context_dim is 1024 and using 20 heads.
Downloading: "https://huggingface.co/stabilityai/stable-diffusion-2-1-base/resolve/main/v2-1_512-ema-pruned.ckpt" to /mnt/DiffBIR/weights/v2-1_512-ema-pruned.ckpt

100%|██████████████████████████████████████████████████████████████████████████| 4.86G/4.86G [12:50<00:00, 6.77MB/s]
strictly load pretrained sd_v2.1, unused weights: {'posterior_mean_coef1', 'posterior_mean_coef2', 'posterior_variance', 'betas', 'model_ema.num_updates', 'alphas_cumprod', 'sqrt_recipm1_alphas_cumprod', 'log_one_minus_alphas_cumprod', 'sqrt_one_minus_alphas_cumprod', 'alphas_cumprod_prev', 'posterior_log_variance_clipped', 'model_ema.decay', 'sqrt_alphas_cumprod', 'sqrt_recip_alphas_cumprod'}
Downloading: "https://huggingface.co/lxq007/DiffBIR-v2/resolve/main/v2.pth" to /mnt/DiffBIR/weights/v2.pth

100%|██████████████████████████████████████████████████████████████████████████| 1.35G/1.35G [03:07<00:00, 7.75MB/s]
strictly load controlnet weight
load lq: /mnt/image_demos/01.png
Spaced Sampler: 100%|███████████████████████████████████████████████████████████████| 50/50 [01:37<00:00,  1.96s/it]
save result to /mnt/image_demos/ouput/01.png
load lq: /mnt/image_demos/input2.jpg
Spaced Sampler: 100%|███████████████████████████████████████████████████████████████| 50/50 [00:46<00:00,  1.07it/s]
save result to /mnt/image_demos/ouput/input2.png
load lq: /mnt/image_demos/oringnal.jpeg
Traceback (most recent call last):
  File "/mnt/DiffBIR/inference.py", line 86, in <module>
    main()
  File "/mnt/DiffBIR/inference.py", line 81, in main
    supported_tasks[args.task](args).run()
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/DiffBIR/utils/inference.py", line 147, in run
    sample = self.pipeline.run(
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/DiffBIR/utils/helpers.py", line 148, in run
    sample = self.run_stage2(
  File "/mnt/DiffBIR/utils/helpers.py", line 86, in run_stage2
    cond = self.cldm.prepare_condition(pad_clean, [pos_prompt] * bs)
  File "/mnt/DiffBIR/model/cldm.py", line 133, in prepare_condition
    c_img=self.vae_encode(clean * 2 - 1, sample=False)
  File "/mnt/DiffBIR/model/cldm.py", line 96, in vae_encode
    return self.vae.encode(image).mode() * self.scale_factor
  File "/mnt/DiffBIR/model/vae.py", line 550, in encode
    h = self.encoder(x)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/DiffBIR/model/vae.py", line 414, in forward
    h = self.mid.attn_1(h)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/deblur_venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/DiffBIR/model/vae.py", line 295, in forward
    out = F.scaled_dot_product_attention(q, k, v)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 42.76 GiB. GPU 0 has a total capacity of 47.41 GiB of which 9.80 GiB is free. Including non-PyTorch memory, this process has 37.58 GiB memory in use. Of the allocated memory 24.53 GiB is allocated by PyTorch, and 12.55 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
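
The traceback shows the failure happening in the VAE encoder's self-attention on the third, much larger image, and the error message itself suggests setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. As invoked above, inference.py runs on a single device (--device cuda), so until the repo supports real model parallelism, one way to put several GPUs to work is to shard the input folder and launch one single-GPU process per card, each pinned with CUDA_VISIBLE_DEVICES. Below is a minimal sketch that only relies on the command-line flags shown above; the helper script and its per-GPU shard directories are hypothetical, not part of DiffBIR:

```python
# split_across_gpus.py -- hypothetical helper, not part of DiffBIR.
# Shards the images in INPUT_DIR across several GPUs and runs one
# single-GPU inference.py process per card via CUDA_VISIBLE_DEVICES.
import os
import subprocess
from pathlib import Path

INPUT_DIR = Path("/mnt/image_demos")          # same inputs as the command above
OUTPUT_DIR = Path("/mnt/image_demos/ouput")   # same output directory as above
GPU_IDS = [0, 1]                              # adjust to the GPUs actually available

images = sorted(p for p in INPUT_DIR.iterdir()
                if p.suffix.lower() in {".png", ".jpg", ".jpeg"})

procs = []
for i, gpu in enumerate(GPU_IDS):
    shard = images[i::len(GPU_IDS)]           # round-robin split of the inputs
    if not shard:
        continue
    shard_dir = INPUT_DIR / f"shard_{gpu}"    # hypothetical per-GPU input folder
    shard_dir.mkdir(exist_ok=True)
    for img in shard:
        link = shard_dir / img.name
        if not link.exists():
            link.symlink_to(img)              # symlink so the originals stay untouched

    env = dict(os.environ,
               CUDA_VISIBLE_DEVICES=str(gpu),  # pin this process to one GPU
               PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True")
    procs.append(subprocess.Popen(
        ["python", "-u", "inference.py", "--version", "v2", "--task", "sr",
         "--upscale", "4", "--cfg_scale", "4.0",
         "--input", str(shard_dir), "--output", str(OUTPUT_DIR),
         "--device", "cuda"],
        env=env))

for p in procs:
    p.wait()
```

Note that this only spreads images across cards: a single image whose attention step tries to allocate 42.76 GiB will still not fit on one 47 GiB GPU, so that particular input would still need to be downscaled or processed with a smaller --upscale.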

wengzp1 commented Aug 18, 2024

Hello, did you solve the problem? I have the same question.


Manoa1911 commented Sep 1, 2024

I have a 24 GB card, but it's not enough. I also have another card with 12 GB, but it's not possible to use both to increase memory :(
Could it at least be made possible to run the guidance on the other card?
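
Splitting the pipeline itself across two cards is possible in plain PyTorch, since any nn.Module can be moved with .to("cuda:1") as long as its inputs are moved to the same device. The sketch below is generic model-parallel PyTorch, not DiffBIR's actual API; the stage1, vae, and unet attribute names are hypothetical placeholders for whatever sub-modules the pipeline exposes:

```python
# Generic PyTorch sketch of splitting sub-modules across two GPUs.
# The module names below are hypothetical; DiffBIR's real pipeline
# objects may expose different attributes.
import torch
import torch.nn as nn

class TwoGPUPipeline(nn.Module):
    def __init__(self, stage1: nn.Module, vae: nn.Module, unet: nn.Module):
        super().__init__()
        # put the lighter preprocessing/guidance modules on the second card
        self.stage1 = stage1.to("cuda:1")
        self.vae = vae.to("cuda:1")
        # keep the large diffusion UNet on the main card
        self.unet = unet.to("cuda:0")

    @torch.no_grad()
    def forward(self, lq: torch.Tensor) -> torch.Tensor:
        x = self.stage1(lq.to("cuda:1"))   # runs on GPU 1
        z = self.vae(x)                    # still on GPU 1
        z = z.to("cuda:0")                 # explicit transfer between cards
        return self.unet(z)                # runs on GPU 0
```

Whether this helps depends on which module actually runs out of memory: in the traceback above it is the VAE encoder's attention that tries to allocate 42.76 GiB, so offloading it to a 12 GB card would not avoid that particular failure.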
