Failed to run on M1Mac with automatic1111 web ui #18

leopold-liu · 2023-04-03T14:19:07Z

Hi there,
I would like to use ToMe to speed up diffusion, but i got an error on My M1Mac with automatic1111 web ui, could u pls help with this:

Traceback (most recent call last):
File "/Users/leopold/code/stable-diffusion-webui/modules/call_queue.py", line 56, in f
res = list(func(*args, **kwargs))
File "/Users/leopold/code/stable-diffusion-webui/modules/call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
processed = process_images(p)
File "/Users/leopold/code/stable-diffusion-webui/modules/processing.py", line 486, in process_images
res = process_images_inner(p)
File "/Users/leopold/code/stable-diffusion-webui/modules/processing.py", line 636, in process_images_inner
samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
File "/Users/leopold/code/stable-diffusion-webui/modules/processing.py", line 852, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "/Users/leopold/code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "/Users/leopold/code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 227, in launch_sampling
return func()
File "/Users/leopold/code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 119, in forward
x_out = self.inner_model(x_in, sigma_in, cond={"c_crossattn": [cond_in], "c_concat": [image_cond_in]})
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 114, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 140, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/modules/sd_hijack_utils.py", line 17, in
setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
File "/Users/leopold/code/stable-diffusion-webui/modules/sd_hijack_utils.py", line 26, in call
return self.__sub_func(self.__orig_func, *args, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/modules/sd_hijack_unet.py", line 45, in apply_model
return orig_func(self, x_noisy.to(devices.dtype_unet), t.to(devices.dtype_unet), cond, **kwargs).float()
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1329, in forward
out = self.diffusion_model(x, t, context=cc)
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
result = forward_call(*input, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 776, in forward
h = module(h, emb, context)
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
x = layer(x, context)
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 324, in forward
x = block(x, context=context[i])
File "/Users/leopold/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 259, in forward
return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 114, in checkpoint
return CheckpointFunction.apply(func, len(inputs), *args)
File "/Users/leopold/code/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 129, in forward
output_tensors = ctx.run_function(*ctx.input_tensors)
File "/Users/leopold/code/stable-diffusion-webui/tomesd/tomesd/patch.py", line 48, in _forward
m_a, m_c, m_m, u_a, u_c, u_m = compute_merge(x, self.tome_info)
File "/Users/leopold/code/stable-diffusion-webui/tomesd/tomesd/patch.py", line 21, in compute_merge
m, u = merge.bipartite_soft_matching_random2d(x, w, h, args["sx"], args["sy"], r, not args["use_rand"])
File "/Users/leopold/code/stable-diffusion-webui/tomesd/tomesd/merge.py", line 55, in bipartite_soft_matching_random2d
idx_buffer_view.scatter(dim=2, index=rand_idx, src=-torch.ones_like(rand_idx, dtype=rand_idx.dtype))
TypeError: Operation 'neg_out_mps()' does not support input type 'int64' in MPS backend.

dbolya · 2023-04-03T15:17:13Z

Hmm if you change all mentions of int64 to int32 in merge.py and reinstall, does it work?

leopold-liu · 2023-04-03T16:20:08Z

Hmm if you change all mentions of int64 to int32 in merge.py and reinstall, does it work?

nope, still the same issue

dbolya · 2023-04-03T16:57:14Z

Maybe you need to put export PYTORCH_ENABLE_MPS_FALLBACK=1 in the webui launch script (see #15).

Awethon · 2023-04-03T20:25:50Z

I also have issues with running it in automatic webui.

/AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSNDArray/Kernels/MPSNDArrayGatherND.mm:234: failed assertion `Rank of updates array (1) must be greater than or equal to inner-most dimension of indices array (2867)'

PYTORCH_ENABLE_MPS_FALLBACK didn't help.

pytorch 2.0

dbolya · 2023-04-03T23:11:17Z

@Awethon are you on the latest dev build? (you have to install from source) That error was fixed already, but I haven't pushed it to pip yet.

leopold-liu · 2023-04-04T05:43:42Z

Maybe you need to put export PYTORCH_ENABLE_MPS_FALLBACK=1 in the webui launch script (see #15).

yes, i've seen that issue too, and i put 'export PYTORCH_ENABLE_MPS_FALLBACK=1' in launch script already but still the same error, seems like a different problem

dbolya · 2023-04-04T05:46:28Z

Does ToMe work for you outside of the webui (for instance, in diffusers)?
The error you got originally seems to me like MPS doesn't support negating an int, which is quite weird.

leopold-liu · 2023-04-04T05:52:31Z

I will test about it

ChrisYangTW · 2023-04-05T10:17:59Z

The same problem occurs at the beginning, such as 'neg_out_mps()' error or "failed assertion '".
Finally, after following everyone's advice, I have succeeded!.

conditions:

tomesd==0.1.2
(venv) pip3 install tomesd==0.1.2
torch==2.0.0, torchvision==0.15.1, and torchaudio==2.0.1(not sure if it is necessary)
(venv) pip3 install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1
environment variables: PYTORCH_ENABLE_MPS_FALLBACK=1
export PYTORCH_ENABLE_MPS_FALLBACK=1

Recommended use the extension: https://git.mmaker.moe/mmaker/sd-webui-tome to setup parameters

jrittvo · 2023-04-05T22:17:20Z

FWIW, it also runs fine with torch 2.1.0.dev and torchvision 0.16.0.dev using diffusers 0.14.0. Haven't tried updating Auto1111 to the dev packages . . .

OrganicBeej · 2023-04-09T19:10:28Z

#15

Hi there,

I had exactly the same issue on my iMac as the first post. I am wondering if you cld explain from your post above exactly what you did? I tried adding export PYTORCH_ENABLE_MPS_FALLBACK=1 to my webui.sh but I don't think I did it correctly?

Also, the other 2 bits of code with (venv), are you running these in a terminal from venv directory?

Sorry for questions, just a little unsure with this type of thing but wld love to see ToMe working on my iMac M1 and it does seem like it can.
Thnks for any help :)

recoilme · 2023-04-25T15:48:01Z

For me this script working on m1

import torch, tomesd
from diffusers import StableDiffusionPipeline

model_id = "./colorful_v30.ckpt"
pipe = StableDiffusionPipeline.from_ckpt(model_id, torch_dtype=torch.float32).to("cpu")

# Apply ToMe with a 50% merging ratio
tomesd.apply_patch(pipe, ratio=0.5)
#pipe.save_pretrained("1")
image = pipe(
   "a photo of an astronaut riding a horse on mars",
   width=512, height=512, num_inference_steps=10).images[0]
image.save("astronaut.png")

i upd diffusers and switch on latest torch2
pip install git+https://github.com/huggingface/diffusers.git@main

But it not work on 'mps' (torch.float32).to("mps")) and from AUTO1111
In case of mps backend error:

tomesd/merge.py:55
TypeError: Operation 'neg_out_mps()' does not support input type 'int64' in MPS backend.

It's hard to understand is it faster or not from script (first image generation time bug on mac), but probably faster

recoilme · 2023-04-25T19:11:38Z

So, i have 2 news) Good and Bad.
How run on mac m1:

replace in merge.py int64 on int32
add in random init type int32
get error NotImplementedError: The operator 'aten::sort.values_stable' is not currently implemented for the MPS device - export PYTORCH_ENABLE_MPS_FALLBACK=1

I use this plugin https://git.mmaker.moe/mmaker/sd-webui-tome with 0.5 (Enable in settings / tokens transformer). Apply/Reload UI/Reload model

Output without ToMe:

Removing ToMe patch (if exists)
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 1.0s (load scripts: 0.3s, create ui: 0.6s).
100%|███████████████████████████████████████████| 36/36 [03:03<00:00,  5.11s/it]

Output with ToMe:

Applying ToMe patch...
ToMe patch applied
Weights loaded in 2.7s (load weights from disk: 0.2s, apply weights to model: 1.9s, move model to device: 0.6s).
100%|███████████████████████████████████████████| 36/36 [02:10<00:00,  3.62s/it]

Prompt:

Create a photorealistic image of a warlord wizard casting a spell. Utilize state-of-the-art techniques, including HDR, CGI, VFX, and insane levels of detail to create an ultra-sharp and ultra-realistic image. Use Unreal 5 and Octane Render to bring the scene to life, with a focus on creating an intricate masterpiece that showcases the wizard's power and magical prowess.
Negative prompt: frame, blurry, drawing, sketch, ((ugly)), ((duplicate)), (morbid), ((mutilated)), (mutated), (deformed), (disfigured), (extra limbs), (malformed limbs), (missing arms), (missing legs), (extra arms), (extra legs), (fused fingers), (too many fingers), long neck, low quality, worst quality, 3d, cartoon, anime, girl, loli, young, monochrome
Steps: 36, Sampler: DPM++ 2M Test, CFG scale: 6, Seed: 3370768663, Size: 640x896, Model hash: 1a36578807, Model: colorful_v30, Hashes: {"model": "1a36578807"}

29% speedup for 0.5 loss!
18% speedup for 0.3 loss (same size: 640x896)

But many little details was washed out(
Original image

recoilme · 2023-04-25T19:13:02Z

"Patched with 0.5/0.3"

recoilme · 2023-04-25T19:17:57Z

Anyway it's huge improvements in speed on middle sizes. Good for testing, on mobile and so on.
I don't send PR because i don't know Python, i just want to generate waifu. I packed to gist - but this code smell https://gist.github.com/recoilme/c04db1d9a83358c6cdbadf18026df048

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Failed to run on M1Mac with automatic1111 web ui #18

Failed to run on M1Mac with automatic1111 web ui #18

leopold-liu commented Apr 3, 2023

dbolya commented Apr 3, 2023

leopold-liu commented Apr 3, 2023

dbolya commented Apr 3, 2023

Awethon commented Apr 3, 2023

dbolya commented Apr 3, 2023 •

edited

Loading

leopold-liu commented Apr 4, 2023

dbolya commented Apr 4, 2023

leopold-liu commented Apr 4, 2023

ChrisYangTW commented Apr 5, 2023 •

edited

Loading

jrittvo commented Apr 5, 2023

OrganicBeej commented Apr 9, 2023

recoilme commented Apr 25, 2023

recoilme commented Apr 25, 2023

recoilme commented Apr 25, 2023

recoilme commented Apr 25, 2023

Failed to run on M1Mac with automatic1111 web ui #18

Failed to run on M1Mac with automatic1111 web ui #18

Comments

leopold-liu commented Apr 3, 2023

dbolya commented Apr 3, 2023

leopold-liu commented Apr 3, 2023

dbolya commented Apr 3, 2023

Awethon commented Apr 3, 2023

dbolya commented Apr 3, 2023 • edited Loading

leopold-liu commented Apr 4, 2023

dbolya commented Apr 4, 2023

leopold-liu commented Apr 4, 2023

ChrisYangTW commented Apr 5, 2023 • edited Loading

jrittvo commented Apr 5, 2023

OrganicBeej commented Apr 9, 2023

recoilme commented Apr 25, 2023

recoilme commented Apr 25, 2023

recoilme commented Apr 25, 2023

recoilme commented Apr 25, 2023

dbolya commented Apr 3, 2023 •

edited

Loading

ChrisYangTW commented Apr 5, 2023 •

edited

Loading