Describe the bug
I am trying to run distributed inference for SD1.5 and SDXL on 2x GTX 1080 Ti, but loading the pipeline with device_map="balanced" fails with a meta-device error (traceback below).
Reproduction
from diffusers import DiffusionPipeline
import torch

# Load SD1.5 with a balanced device map so sub-models are split across both GPUs
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True, device_map="balanced", cache_dir="/data2/humanaware/tezuesh/Diffusion/cache_dir/")
print(pipeline.hf_device_map)

prompt = "A majestic lion jumping from a big stone at night"
image = pipeline(prompt).images[0]  # the pipeline returns an output object; take the first image
Command to run: CUDA_VISIBLE_DEVICES="0,1" python sd15_inference.py
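As a point of comparison, a data-parallel setup, where each process holds one full pipeline copy on its own GPU, avoids device_map entirely. This is a minimal sketch following the PartialState pattern from accelerate; the prompts and output filenames here are placeholders:

from accelerate import PartialState
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True)
distributed_state = PartialState()
pipeline.to(distributed_state.device)  # one full pipeline copy per process/GPU

# Each process receives its own slice of the prompt list
with distributed_state.split_between_processes(["a dog", "a cat"]) as prompt:
    image = pipeline(prompt).images[0]
    image.save(f"result_{distributed_state.process_index}.png")

Launched with: accelerate launch --num_processes=2 sd15_inference.py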
Logs
Logs for SD1.5 inference
Traceback (most recent call last):
File "Diffusion/Inference/sd15_inference.py", line 35, in <module>
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True, device_map="balanced", cache_dir="/Diffusion/cache_dir/")
File "/venv/py310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
return fn(*args, **kwargs)
File "/venv/py310/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 877, in from_pretrained
loaded_sub_model = load_sub_model(
File "/venv/py310/lib/python3.10/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 699, in load_sub_model
loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
File "/venv/py310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
return fn(*args, **kwargs)
File "/venv/py310/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 694, in from_pretrained
accelerate.load_checkpoint_and_dispatch(
File "/venv/py310/lib/python3.10/site-packages/accelerate/big_modeling.py", line 614, in load_checkpoint_and_dispatch
return dispatch_model(
File "/venv/py310/lib/python3.10/site-packages/accelerate/big_modeling.py", line 419, in dispatch_model
attach_align_device_hook_on_blocks(
File "/venv/py310/lib/python3.10/site-packages/accelerate/hooks.py", line 608, in attach_align_device_hook_on_blocks
add_hook_to_module(module, hook)
File "/venv/py310/lib/python3.10/site-packages/accelerate/hooks.py", line 157, in add_hook_to_module
module = hook.init_hook(module)
File "/venv/py310/lib/python3.10/site-packages/accelerate/hooks.py", line 275, in init_hook
set_module_tensor_to_device(module, name, self.execution_device, tied_params_map=self.tied_params_map)
File "/venv/py310/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 354, in set_module_tensor_to_device
raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 0.
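For context on the error itself: a tensor on PyTorch's meta device carries only shape and dtype, with no underlying data, so there is nothing for accelerate to copy when it tries to place the weight on GPU 0. A minimal illustration of the same failure mode:

import torch

# Meta tensors are shape/dtype-only placeholders with no storage
w = torch.empty(4, 4, device="meta")
print(w.device)  # meta
# w.to("cuda:0") would raise an error along the lines of:
# "Cannot copy out of meta tensor; no data!"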
System Info
diffusers-cli env
- diffusers version: 0.28.0.dev0
- Platform: Linux-4.15.0-140-generic-x86_64-with-glibc2.31
- Python version: 3.10.14
- PyTorch version (GPU?): 2.2.2+cu121 (True)
- Huggingface_hub version: 0.22.2
- Transformers version: 4.40.0.dev0
- Accelerate version: 0.29.0
- xFormers version: not installed
- Using GPU in script?: Yes, 2x GTX 1080 Ti
- Using distributed or parallel set-up in script?: Yes, device_map="balanced" across both GPUs
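To help isolate which sub-model triggers this, one hypothetical narrowing step is to load a single component with the same dtype and a device map and see whether dispatch already fails there (the UNet is chosen arbitrarily):

from diffusers import UNet2DConditionModel
import torch

# Hypothetical isolation: if the meta-device error reproduces when loading
# just one model with a device map, the problem is in model-level dispatch
# rather than pipeline-level placement.
unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16, device_map="auto")
print({n: p.device for n, p in list(unet.named_parameters())[:3]})  # spot-check parameter placement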