
Issue while running SD1.5 on multiple less beefy GPUs #7682

Closed
@square-1111

Description

Describe the bug

I am trying to run distributed inference for SD1.5 and SDXL on 2x GTX 1080 Ti, but loading the pipeline with device_map="balanced" fails with a ValueError about the meta device (full traceback below).

Reproduction

from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_safetensors=True,
    device_map="balanced",
    cache_dir="/data2/humanaware/tezuesh/Diffusion/cache_dir/",
)

print(pipeline.hf_device_map)

prompt = "A majestic lion jumping from a big stone at night"
image = pipeline(prompt).images[0]

Command to run: CUDA_VISIBLE_DEVICES="0,1" python sd15_inference.py
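For context, a "balanced" device map splits the pipeline's submodules across the available GPUs so each carries roughly the same amount of weights. The sketch below is a hypothetical, simplified illustration of that idea (greedy assignment by parameter size), not accelerate's actual implementation; the module names and sizes are made up.

```python
# Hypothetical sketch: how a "balanced" device map might assign a pipeline's
# submodules across two GPUs by keeping the per-device weight totals even.
# This is illustrative only, not the accelerate algorithm.

def balanced_device_map(module_sizes, devices):
    """Greedily assign each named module to the least-loaded device.

    module_sizes: dict of module name -> parameter size (e.g. MB)
    devices: list of device strings, e.g. ["cuda:0", "cuda:1"]
    """
    load = {d: 0 for d in devices}
    device_map = {}
    # Place the largest modules first so the split stays close to even.
    for name, size in sorted(module_sizes.items(), key=lambda kv: -kv[1]):
        target = min(load, key=load.get)
        device_map[name] = target
        load[target] += size
    return device_map

sizes = {"unet": 3_400, "text_encoder": 490, "vae": 335, "safety_checker": 1_200}
print(balanced_device_map(sizes, ["cuda:0", "cuda:1"]))
# The large UNet lands on one GPU, the smaller components on the other.
```

The real map chosen by accelerate is what `pipeline.hf_device_map` prints in the reproduction script above.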

Logs

Logs for SD1.5 inference:

Traceback (most recent call last):
  File "Diffusion/Inference/sd15_inference.py", line 35, in <module>
    pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True, device_map="balanced", cache_dir="/Diffusion/cache_dir/")
  File "/venv/py310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/venv/py310/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 877, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/venv/py310/lib/python3.10/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 699, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/venv/py310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/venv/py310/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 694, in from_pretrained
    accelerate.load_checkpoint_and_dispatch(
  File "/venv/py310/lib/python3.10/site-packages/accelerate/big_modeling.py", line 614, in load_checkpoint_and_dispatch
    return dispatch_model(
  File "/venv/py310/lib/python3.10/site-packages/accelerate/big_modeling.py", line 419, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/venv/py310/lib/python3.10/site-packages/accelerate/hooks.py", line 608, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/venv/py310/lib/python3.10/site-packages/accelerate/hooks.py", line 157, in add_hook_to_module
    module = hook.init_hook(module)
  File "/venv/py310/lib/python3.10/site-packages/accelerate/hooks.py", line 275, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device, tied_params_map=self.tied_params_map)
  File "/venv/py310/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 354, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 0.
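The "meta device" in this error refers to placeholder tensors that record shape and dtype but hold no data; before dispatch, such a parameter can only be moved to a real GPU if a concrete value is supplied. The snippet below is a minimal, hypothetical re-creation of the check that fires in `set_module_tensor_to_device` (names are illustrative, not accelerate's actual code):

```python
# Hypothetical sketch of the check raising in the traceback: a parameter
# that was loaded as a meta (shape-only, data-less) tensor cannot be
# placed on a real device without a concrete value.

def set_tensor_to_device(name, is_meta, device, value=None):
    if is_meta and value is None:
        raise ValueError(
            f"{name} is on the meta device, we need a `value` to put in on {device}."
        )
    return f"{name} -> {device}"

# A materialized tensor moves fine; a meta tensor without a value raises
# the same ValueError seen in the logs above.
print(set_tensor_to_device("weight", is_meta=False, device=0))
```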

System Info

diffusers-cli env

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

  • diffusers version: 0.28.0.dev0
  • Platform: Linux-4.15.0-140-generic-x86_64-with-glibc2.31
  • Python version: 3.10.14
  • PyTorch version (GPU?): 2.2.2+cu121 (True)
  • Huggingface_hub version: 0.22.2
  • Transformers version: 4.40.0.dev0
  • Accelerate version: 0.29.0
  • xFormers version: not installed
  • Using GPU in script?: 2
  • Using distributed or parallel set-up in script?:

Who can help?

@sayakpaul


Labels

bug (Something isn't working)
