[Bug]: Crashed randomly with log below #12

Open
1 task done
kacherHuynh opened this issue Apr 21, 2023 · 9 comments

Comments

@kacherHuynh

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What happened?

It crashed while generating an image.

Steps to reproduce the problem

  1. Run web ui
  2. Generate the image
  3. Sometimes it crashes randomly, with any image size or model

What should have happened?

Images should generate without the app crashing

Commit where the problem happens

20230416_experimental

What platforms do you use to access the UI ?

MacOS

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

--no-half --no-download-sd-model --precision full --no-half-vae --upcast-sampling --opt-sub-quad-attention --use-cpu interrogate

List of extensions

ControlNet
Ultrasharp

Console logs

stable-diffusion-webui-custom/python/3.10.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Additional information

No response

@brkirch
Owner

brkirch commented Apr 29, 2023

stable-diffusion-webui-custom/python/3.10.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

This is actually just a warning; any real errors are displayed before it, so I'll need whatever was printed before this (and if there is a traceback, I'll definitely need that).

Also I can't guarantee ControlNet or Ultrasharp will work correctly yet. I plan to include the ControlNet extension in the future but until then there is no guarantee it will work fully.
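
For anyone collecting that output, a small helper along these lines could pull the relevant lines out of a saved console log, that is, whatever was printed just before the resource_tracker warning. This is only a sketch: the function name, the default of 20 context lines, and the idea of saving the console output to a text file first are assumptions, not part of the web UI.

```python
def lines_before_warning(log_text: str, context: int = 20) -> list[str]:
    """Return up to `context` lines printed just before the resource_tracker warning.

    If the warning is not found, fall back to the last `context` lines of the log.
    """
    if context <= 0:
        return []
    # Text taken from the warning quoted in this issue.
    marker = "resource_tracker: There appear to be"
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if marker in line:
            # Slice ends before the warning line itself.
            return lines[max(0, index - context):index]
    return lines[-context:]
```

Calling it on the text of a saved log and pasting the returned lines into the issue would give the maintainer the part of the output that precedes the warning.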

@paraversal

paraversal commented Apr 29, 2023

I'll chime in since I'm having the same problem, roughly once every 10 image generations.
For me, the error displayed before is
failed assertion _status < MTLCommandBufferStatusCommitted at line 316 in -[IOGPUMetalCommandBuffer setCurrentCommandEncoder:]

This is on M1 Pro, 16GB RAM, Ventura 13.2.1, WebUI release 20230416, running on Degoogled Chromium

@marcomastri

I’m getting the same error on a very similar setup:

failed assertion _status < MTLCommandBufferStatusCommitted at line 316 in -[IOGPUMetalCommandBuffer setCurrentCommandEncoder:]
Abort trap: 6
logout

Macbook Pro M1, 16GB RAM, Ventura 13.3.1, release 20230416 on Firefox

@x4080

x4080 commented May 5, 2023

I get the same error randomly on an M2 Pro Mac mini with 16GB, similar to @marcomastri

@kacherHuynh
Author

@brkirch sorry for my late reply, please check the detailed log in the attachment.
Thank you so much~!
[image attachment]

@x4080

x4080 commented May 13, 2023

Using the latest release, errors seem to be rare, but I think it uses much more memory now

@brkirch
Owner

brkirch commented May 14, 2023

@kacherHuynh Are you still using the experimental version? v1.1.1-RC should not have that issue as often.

Using the latest release, errors seem to be rare, but I think it uses much more memory now

This is correct. It turns out that torch.mps.empty_cache() was causing most of the crashes reported here. To prevent the issue I had to remove the usage of torch.mps.empty_cache() for now, which means that memory isn't cleaned up as often or as thoroughly, resulting in higher overall memory usage.
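
The trade-off described above can be illustrated with a cleanup helper that skips torch.mps.empty_cache() unless it is explicitly enabled. This is a minimal sketch, not the code used in the release: the function name `release_mps_memory` and the `allow_empty_cache` flag are hypothetical, and the import guard is only there so the example also runs where PyTorch is not installed.

```python
import gc

try:
    import torch
except ImportError:  # allows the sketch to run without PyTorch installed
    torch = None


def release_mps_memory(allow_empty_cache: bool = False) -> bool:
    """Collect Python garbage and optionally release cached MPS memory.

    Returns True only if torch.mps.empty_cache() was actually called.
    """
    gc.collect()  # always safe: frees unreachable Python objects
    if not allow_empty_cache or torch is None:
        return False
    mps = getattr(torch, "mps", None)  # torch.mps is absent in older PyTorch versions
    if mps is None or not hasattr(mps, "empty_cache"):
        return False
    if not torch.backends.mps.is_available():
        return False
    mps.empty_cache()
    return True
```

With the flag left at its default, cached MPS memory is never released explicitly, which matches the higher memory usage described above; passing `allow_empty_cache=True` restores the explicit cleanup, along with whatever crash risk it carries on affected setups.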

@x4080

x4080 commented May 14, 2023

@brkirch What Mac do you use? I'm on an M2 Pro with 16GB, and just 512x512 with ControlNet 1.1 tile (img2img) already uses about 600MB of swap. Do you think memory usage can be improved?

I just tested the DrawThings app with the same config and it uses less memory. Is that because Python has a lot of overhead?

Btw thanks for your big effort

@kacherHuynh
Author

@kacherHuynh Are you still using the experimental version? v1.1.1-RC should not have that issue as often.

Using the latest release, errors seem to be rare, but I think it uses much more memory now

This is correct. It turns out that torch.mps.empty_cache() was causing most of the crashes reported here. To prevent the issue I had to remove the usage of torch.mps.empty_cache() for now, which means that memory isn't cleaned up as often or as thoroughly, resulting in higher overall memory usage.

I updated and tried again today, and the crashes now come even more often, after generating only 1 or 2 images.
Please have a look.
Thank you so much!

Error completing request█▍                       | 4/25 [00:04<00:26,  1.29s/it]
Arguments: ('task(mn2g02iln8zey3a)', '8k portrait of beautiful cyborg with brown hair, intricate, elegant, highly detailed, majestic, digital photography, art by Artgerm and ruan jia and greg rutkowski surreal painting gold butterfly filigree, broken glass, (masterpiece, side lighting, finely detailed beautiful eyes: 1.2), hdr', 'canvas frame, cartoon, 3d, ((disfigured)), ((bad art)), ((deformed)),((extra limbs)),((close up)),((b&w)), weird colors, blurry, (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), signature, video game, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, mutation, mutated, extra limbs, extra legs, extra arms, disfigured, deformed, cross-eye, body out of frame, blurry, bad art, bad anatomy, 3d render', [], 25, 16, True, False, 1, 1, 7, 132340232.0, -1.0, 0, 0, 0, False, 896, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, <controlnet.py.UiControlNetUnit object at 0x2b89a6dd0>, <controlnet.py.UiControlNetUnit object at 0x2b89a6e60>, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, None, False, None, False, 50) {}
Traceback (most recent call last):
  File "/Users/kacher/stable-diffusion-webui-custom/modules/call_queue.py", line 57, in f
    res = list(func(*args, **kwargs))
  File "/Users/kacher/stable-diffusion-webui-custom/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/kacher/stable-diffusion-webui-custom/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/kacher/stable-diffusion-webui-custom/modules/processing.py", line 515, in process_images
    res = process_images_inner(p)
  File "/Users/kacher/stable-diffusion-webui-custom/extensions/sd-webui-controlnet/scripts/batch_hijack.py", line 42, in processing_process_images_hijack
    return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
  File "/Users/kacher/stable-diffusion-webui-custom/modules/processing.py", line 669, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/Users/kacher/stable-diffusion-webui-custom/modules/processing.py", line 887, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/Users/kacher/stable-diffusion-webui-custom/modules/sd_samplers_kdiffusion.py", line 377, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/kacher/stable-diffusion-webui-custom/modules/sd_samplers_kdiffusion.py", line 251, in launch_sampling
    return func()
  File "/Users/kacher/stable-diffusion-webui-custom/modules/sd_samplers_kdiffusion.py", line 377, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/kacher/stable-diffusion-webui-custom/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/kacher/stable-diffusion-webui-custom/repositories/k-diffusion/k_diffusion/sampling.py", line 576, in sample_dpmpp_sde
    denoised_2 = model(x_2, sigma_fn(s) * s_in, **extra_args)
  File "/Users/kacher/stable-diffusion-webui-custom/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kacher/stable-diffusion-webui-custom/modules/sd_samplers_kdiffusion.py", line 167, in forward
    devices.test_for_nans(x_out, "unet")
  File "/Users/kacher/stable-diffusion-webui-custom/modules/devices.py", line 157, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.
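
The kind of check that produces this exception can be pictured as follows. This is a simplified sketch in plain Python, not the web UI's actual code: the names `AllNansError` and `check_for_all_nans` are hypothetical, and a flat list of floats stands in for a tensor.

```python
import math
from typing import Iterable


class AllNansError(Exception):
    """Hypothetical stand-in for the NansException seen in the traceback."""


def check_for_all_nans(values: Iterable[float], where: str) -> None:
    """Raise if every value is NaN; a partially valid result passes the check."""
    values = list(values)
    if values and all(math.isnan(v) for v in values):
        raise AllNansError(f"A tensor with all NaNs was produced in {where}.")


# A result with at least one real number passes silently.
check_for_all_nans([float("nan"), 0.5], "unet")

# A result made up entirely of NaNs is rejected.
try:
    check_for_all_nans([float("nan"), float("nan")], "unet")
except AllNansError as error:
    print(error)
```

The point of such a check is to stop early with a readable message rather than continue sampling with output that can no longer produce an image.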


5 participants