add padding_mask_crop to all inpaint pipelines #6360
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
thank you!
LGTM! Could I see some results too?
@yiyixuxu shouldn't we add tests too?
I think it is fine not to add tests for these auto1111 features. We are not currently testing all the value combinations for all pipeline arguments.
I will try to get the result of padding_mask_crop with the new pipeline. It would help if someone could provide example code for running ControlNet inpaint.
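For reference, a call to the non-XL ControlNet inpaint pipeline with the new argument might look like the sketch below. This is not from the original thread; the checkpoint names (lllyasviel/control_v11p_sd15_inpaint, runwayml/stable-diffusion-v1-5), the prompt, and the make_inpaint_condition helper are illustrative assumptions.

```python
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline
from diffusers.utils import load_image

init_image = load_image(
    "https://huggingface.co/datasets/diffusers/test-arrays/resolve/main/stable_diffusion_inpaint/boy.png"
).resize((512, 512))
mask_image = load_image(
    "https://huggingface.co/datasets/diffusers/test-arrays/resolve/main/stable_diffusion_inpaint/boy_mask.png"
).resize((512, 512))

# the inpaint ControlNet is conditioned on the init image with masked pixels set to -1
def make_inpaint_condition(image, mask):
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    mask = np.array(mask.convert("L")).astype(np.float32) / 255.0
    image[mask > 0.5] = -1.0
    return torch.from_numpy(image).permute(2, 0, 1).unsqueeze(0)

control_image = make_inpaint_condition(init_image, mask_image)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16  # assumed checkpoint
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16  # assumed base model
).to("cuda")

generator = torch.Generator(device="cuda").manual_seed(0)
image = pipe(
    "a boy wearing sunglasses",  # illustrative prompt
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
    num_inference_steps=20,
    generator=generator,
    padding_mask_crop=32,  # crop to the masked region (+32 px padding) before inpainting
).images[0]
image.save("sd15_controlnet_inpaint_pad.png")
```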
Here are the results for SDXL:

import torch
from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image
from PIL import Image

model = "stabilityai/stable-diffusion-xl-base-1.0"
blur_factor = 33
seed = 0

img_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-text2img.png"
mask_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-inpaint-mask.png"

base = load_image(img_url)
mask = load_image(mask_url)

# create inpaint pipeline
pipe1 = AutoPipelineForInpainting.from_pretrained(model, torch_dtype=torch.float16).to('cuda')

# baseline: no mask blur, no inpaint_full_res
generator = torch.Generator(device='cuda').manual_seed(seed)
inpaint = pipe1('boat', image=base, mask_image=mask, strength=0.75, generator=generator).images[0]
inpaint.save('out_base.png')

# create blurred mask
mask_blurred = pipe1.mask_processor.blur(mask, blur_factor=blur_factor)
mask_blurred.save('mask_blurred.png')

# with mask blur
generator = torch.Generator(device='cuda').manual_seed(seed)
inpaint = pipe1('boat', image=base, mask_image=mask_blurred, strength=0.75, generator=generator).images[0]
inpaint.save('out_mask_blur.png')

# with both mask blur and inpaint_full_res (padding_mask_crop)
generator = torch.Generator(device='cuda').manual_seed(seed)
inpaint = pipe1('boat', image=base, mask_image=mask_blurred, strength=0.75, generator=generator, padding_mask_crop=32).images[0]
inpaint.save('out_mask_blur_full_res.png')
Finally, the SDXL ControlNet output:

from diffusers import StableDiffusionXLControlNetInpaintPipeline, ControlNetModel, DDIMScheduler
from diffusers.utils import load_image
import cv2
from PIL import Image
import numpy as np
import torch

init_image = load_image(
    "https://huggingface.co/datasets/diffusers/test-arrays/resolve/main/stable_diffusion_inpaint/boy.png"
)
init_image = init_image.resize((1024, 1024))

generator = torch.Generator(device="cpu").manual_seed(1)

mask_image = load_image(
    "https://huggingface.co/datasets/diffusers/test-arrays/resolve/main/stable_diffusion_inpaint/boy_mask.png"
)
mask_image = mask_image.resize((1024, 1024))

def make_canny_condition(image):
    image = np.array(image)
    image = cv2.Canny(image, 100, 200)
    image = image[:, :, None]
    image = np.concatenate([image, image, image], axis=2)
    image = Image.fromarray(image)
    return image

control_image = make_canny_condition(init_image)

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()

# generate image without padding_mask_crop
image = pipe(
    "a handsome man with ray-ban sunglasses",
    num_inference_steps=20,
    generator=generator,
    eta=1.0,
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
).images[0]
image.save("sdxl_controlnet_no_pad.png")

# generate image with padding_mask_crop
image = pipe(
    "a handsome man with ray-ban sunglasses",
    num_inference_steps=20,
    generator=generator,
    eta=1.0,
    width=1024,
    height=1024,
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
    padding_mask_crop=32,
).images[0]
image.save("sdxl_controlnet_pad.png")
Can we run
We still see some of the astronauts in the SDXL example; I wonder if it is related to this: #6417. Can you run it again with the fix you proposed?
@yiyixuxu I think it is due to the size of
@patrickvonplaten I did run
Hi @rootonchair
I got an error from
@yiyixuxu should we change it? Below is the result of running:
Actually you should indeed run
@patrickvonplaten I see. I will make an update on that.
let's change it! you can change it for the inpaint pipeline too :) let's also pass
great job!
@yiyixuxu all done 😄
@@ -1264,6 +1298,13 @@ def __call__(
        else:
            batch_size = prompt_embeds.shape[0]

        if padding_mask_crop is not None:
um just saw your issue #6435. maybe we need to move this code into prepare_control_image()? see my comment here #6435 (comment)
I don't think it would work. width and height are still None in there. Do you think we should handle None in get_crop_region?
ok! you can use self.image_processor.get_default_height_width(image) to get it
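That suggestion could slot in roughly as sketched below, assuming the resolved dimensions are then passed to get_crop_region as in the diff further down (an assumption, not the verbatim PR code):

```python
# sketch: resolve height/width before computing the crop region
if padding_mask_crop is not None:
    # fall back to the image's own dimensions when height/width are None
    height, width = self.image_processor.get_default_height_width(image, height, width)
    crops_coords = self.mask_processor.get_crop_region(
        mask_image, width, height, pad=padding_mask_crop
    )
```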
@yiyixuxu could you help me review this PR?
sorry I'm a little bit slow in reviewing this.
looks good and I left a few comments. Thanks again for working on this!
            f"The mask image should be a PIL image when inpainting mask crop, but is of type"
            f" {type(mask_image)}."
        )
        if output_type != "pil":
👍
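For context, the full guard around this fragment in check_inputs reads roughly as follows (a reconstructed sketch, not copied verbatim from the diff):

```python
import PIL.Image

# sketch of the padding_mask_crop validation in check_inputs
if padding_mask_crop is not None:
    if not isinstance(image, PIL.Image.Image):
        raise ValueError(
            f"The image should be a PIL image when inpainting mask crop, but is of type {type(image)}."
        )
    if not isinstance(mask_image, PIL.Image.Image):
        raise ValueError(
            f"The mask image should be a PIL image when inpainting mask crop, but is of type {type(mask_image)}."
        )
    if output_type != "pil":
        raise ValueError(f"The output type should be PIL when inpainting mask crop, but is {output_type}.")
```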
-        if self.unet.config.in_channels != 4:
+        if self.unet.config.in_channels != 4 and self.unet.config.in_channels != 9:
             raise ValueError(
-                f"The UNet should have 4 input channels for inpainting mask crop, but has"
+                f"The UNet should have 4 or 9 input channels for inpainting mask crop, but has"
                 f" {self.unet.config.in_channels} input channels."
             )
you can remove this warning
@@ -1527,10 +1559,22 @@ def denoising_value_valid(dnv):
        is_strength_max = strength == 1.0

        # 5. Preprocess mask and image
        init_image = self.image_processor.preprocess(image, height=height, width=width)
        if padding_mask_crop is not None:
            crops_coords = self.mask_processor.get_crop_region(mask_image, width, height, pad=padding_mask_crop)
we don't need height, width = self.image_processor.get_default_height_width(image, height, width) here?
We don't need it because it has already been initialized: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_inpaint.py#L1445-L1446
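Putting the pieces together, the overall padding_mask_crop flow implied by the diff above looks roughly like the sketch below. The names follow the diff; the resize_mode="fill" preprocessing and the final apply_overlay paste-back are how the existing Stable Diffusion inpaint pipeline handles the crop, assumed to carry over here.

```python
# rough sketch of the padding_mask_crop flow (not verbatim pipeline code)
if padding_mask_crop is not None:
    crops_coords = self.mask_processor.get_crop_region(mask_image, width, height, pad=padding_mask_crop)
    resize_mode = "fill"
else:
    crops_coords = None
    resize_mode = "default"

original_image = image
init_image = self.image_processor.preprocess(
    image, height=height, width=width, crops_coords=crops_coords, resize_mode=resize_mode
)
mask = self.mask_processor.preprocess(
    mask_image, height=height, width=width, resize_mode=resize_mode, crops_coords=crops_coords
)

# ... latents are prepared and the denoising loop runs on the cropped region ...

if padding_mask_crop is not None:
    # paste the inpainted crop back onto the original image
    image = [
        self.image_processor.apply_overlay(mask_image, original_image, img, crops_coords)
        for img in image
    ]
```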
@yiyixuxu fixed. Thank you for your reviews!
thanks
add padding_mask_crop

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
What does this PR do?
Add padding_mask_crop to inpaint pipelines: SDXL, ControlNet, ControlNet SDXL
Fixes #6345
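In short, the new argument plugs into the existing inpaint call signature; a minimal usage sketch distilled from the SDXL example earlier in the thread:

```python
import torch
from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image

pipe = AutoPipelineForInpainting.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-text2img.png")
mask = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-inpaint-mask.png")

# padding_mask_crop crops image and mask to the masked region (plus the given
# padding in pixels), inpaints only that crop, and pastes the result back.
result = pipe(
    "boat",
    image=image,
    mask_image=mask,
    strength=0.75,
    padding_mask_crop=32,
).images[0]
result.save("inpaint_cropped.png")
```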
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.