[1.1.202 Inpaint] Improvement: Everything Related to Adobe Firefly Generative Fill #1464
43 comments · 125 replies
-
Been playing around with this a bit. One thing I've noticed is that using another CN unit, supplying the same image with the reference preprocessor, seems to produce more natural-looking results (this could be down to taste, though). I've done this with several other images and the results hold up, but here's the torii example from the OP, using the same simple prompt and parameters. Also, that's a great point about Adobe's method likely involving cascading. Making sure SD has attention on the whole image at a sane resolution the model was trained on, and then upscaling to a more pleasing resolution with a low guided denoise, certainly seems akin to what they're doing. They have the luxury of GigaGAN at their disposal, which is probably more functional than our current best solution of using hires fix / a second img2img pass. Hopefully more people can make use of this. I would assume the average user is not well aware of why doing this in txt2img is advantageous, let alone that you can even do inpainting in the txt2img tab. The img2img tab could certainly use some improvements to better allow for this type of workflow. That being said, I think something like this is going to be much more powerful if maintainers of plugins for various apps utilizing the A1111 API manage to integrate it well -- just as Adobe did for Photoshop.
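For anyone wiring this up through the A1111 API, here is a rough sketch of what the two-unit setup described above might look like as the ControlNet `args` of a `/sdapi/v1/txt2img` request. The field names follow the ControlNet 1.1.2xx web API and may differ in other extension versions; file names and the model string are placeholders.

```python
import base64

def b64(path):
    # Read a file and return it base64-encoded, as the web API expects
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

controlnet_args = [
    {   # unit 0: inpaint_only with the hand-drawn mask
        "module": "inpaint_only",
        "model": "control_v11p_sd15_inpaint",  # use the exact name from your model dropdown
        "input_image": b64("torii.png"),
        "mask": b64("torii_mask.png"),
        "control_mode": 2,                     # "ControlNet is more important"
    },
    {   # unit 1: reference_only on the same image, no model required
        "module": "reference_only",
        "model": "None",
        "input_image": b64("torii.png"),
        "control_mode": 0,                     # "Balanced"
    },
]

payload = {
    "prompt": "",  # same simple prompt as in the UI run, or leave empty
    "alwayson_scripts": {"controlnet": {"args": controlnet_args}},
}
```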
-
This works very nicely for outpainting as well. You can take an image, add some transparency to the sides in a photo-editing app, and then mask the transparent area in ControlNet. This is using inpaint_only along with reference_only on the original image, as someone suggested above, to keep the same sort of style. The prompt was just "Japanese Castle". Here is a quick anime example as well:
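If you'd rather script the padding step than use a photo editor, a minimal Pillow sketch of the same idea might look like this (file names and padding width are placeholders); the derived mask marks exactly the transparent strip that should be outpainted.

```python
from PIL import Image

pad = 256  # transparent pixels to add on the left and right

img = Image.open("castle.png").convert("RGBA")
w, h = img.size

# Fully transparent canvas, with the original pasted in the middle
canvas = Image.new("RGBA", (w + 2 * pad, h), (0, 0, 0, 0))
canvas.paste(img, (pad, 0))
canvas.save("castle_padded.png")

# White where the canvas is transparent (the area to outpaint), black elsewhere
alpha = canvas.split()[-1]
mask = alpha.point(lambda a: 255 if a == 0 else 0).convert("L")
mask.save("castle_outpaint_mask.png")
```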
-
Is there an easy way to make inpainting work well with batch, either by reusing the mask or by providing a directory of masks? When I use the batch tab currently, it just ignores the mask I scribbled.
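One workaround is to drive the regular img2img inpaint over the web API and pair each image with a same-named mask yourself. A rough sketch, assuming the standard `/sdapi/v1/img2img` field names; directories and parameters are placeholders to adjust.

```python
import base64, pathlib, requests

URL = "http://127.0.0.1:7860/sdapi/v1/img2img"
images_dir = pathlib.Path("inputs")
masks_dir = pathlib.Path("masks")
out_dir = pathlib.Path("outputs")
out_dir.mkdir(exist_ok=True)

def b64(path):
    return base64.b64encode(path.read_bytes()).decode()

for img_path in sorted(images_dir.glob("*.png")):
    mask_path = masks_dir / img_path.name  # point every image at one file to reuse a single mask
    payload = {
        "init_images": [b64(img_path)],
        "mask": b64(mask_path),
        "denoising_strength": 0.75,
        "inpainting_fill": 1,        # 1 = "original"
        "inpaint_full_res": False,   # inpaint at whole-picture resolution
        "prompt": "",
        "steps": 20,
    }
    r = requests.post(URL, json=payload, timeout=600)
    r.raise_for_status()
    result_b64 = r.json()["images"][0]
    (out_dir / img_path.name).write_bytes(base64.b64decode(result_b64))
```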
-
Does it work in img2img with "Inpaint only masked"? I can't make it work.
-
OK, in img2img with "Only masked" I can't remove a person even with denoising at 1, but when I select "Whole picture" it works, though obviously the resolution is not the original... any remedy?
-
When I try to use the RealisticVision inpainting model with the txt2img ControlNet inpaint method, I keep getting an error saying that the size of tensor a needs to match the size of tensor b. This only happens when I use the inpainting model. When I use the regular RealisticVision model it works fine, except it's not as high quality as it would be with the inpainting model. Does anyone know why this is happening?
-
How do you use inpaint in txt2img? I've painted sections of an image black and used it as the input, but the output still remains black.
-
Is there anything we can learn from UnpromptedControl? The idea is quite similar to what you explain: it can fill in empty space or remove objects. I've tried it in A1111 and it is functional even without any prompt, though the quality is not great.
-
I've been using openOutpaint to do this for a long time. It's released under an MIT license. zero01101/openOutpaint#227 This is actually one of the things I talked about with the researchers at Adobe during the consultation interview about AI tools they had me do, lol.
-
@lllyasviel any advice on how to set the denoising strength? You kept 0.25 in your settings; is there a specific reason for it?
-
I have a collection of images that were generated with SD, but they have watermark- or logo-like text at the bottom. Is it possible to perform a batch process on the folder to find and remove this text, and then use this inpainting method to fill in the area seamlessly?
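If the text always sits in roughly the same spot, one approach is to generate a fixed bottom-strip mask per image and then batch-inpaint with those masks (for example with an API loop like the one sketched earlier in the thread). A hedged Pillow sketch, with the strip height as a guess you would tune per collection:

```python
import pathlib
from PIL import Image, ImageDraw

strip_frac = 0.08  # assume the watermark lives in the bottom 8% of each image

src = pathlib.Path("watermarked")
dst = pathlib.Path("masks")
dst.mkdir(exist_ok=True)

for p in src.glob("*.png"):
    w, h = Image.open(p).size
    mask = Image.new("L", (w, h), 0)  # black = keep
    draw = ImageDraw.Draw(mask)
    # White rectangle over the bottom strip = area to inpaint
    draw.rectangle([0, int(h * (1 - strip_frac)), w, h], fill=255)
    mask.save(dst / p.name)
```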
-
Can't "normal" inpainting models do this out of the box? One just has to use RunwayML's inpainting model and leave the prompt empty. |
-
Had a go with the new update and an empty prompt to compare with the PS Beta Generative Fill for painted content. For ControlNet I used the DeliberateV2 model, LMS sampler @ 50 steps, CFG 7-12.
-
1.1.209: Fixed a bug where images were not distorted in the preview but were distorted in the saved folder.
-
@lllyasviel I'm not sure yet. I feel like this retains the content I am inpainting much better; sometimes I am blown away by how close it gets. I do wish I could also get the detail and clarity of "inpaint only masked" at least half the time, though.
-
Suggestion, and knowing you guys I'll bet you already have something like this in the works... Clearly this new inpainting technique is a very powerful feature, so it would be amazing if it could have the other relevant inpainting settings implemented (adjustable mask blur, inpaint/outpaint selection, upload mask image, etc.). For instance, the Segment Anything extension is such a godsend for normal inpainting, since you can swiftly make complicated mask selections, expand the selection, and use it as an image mask. It would be super amazing to have this for ControlNet inpainting. From my testing it also seems like there is a fixed mask blur, so it takes some imagination to draw the ideal mask without cutting it too short.

My thought: it's time for ControlNet to have a dedicated tab for inpainting, which could be enabled or disabled in the Settings tab (enabled by default?). Then it could have a single toggle for which inpainting preprocessor to use. I mean, is there any use case for multiple inpainting ControlNets to be active? If not, then I think having one dedicated tab would be logical. This could also prune the inpainting preprocessor/model out of the other ControlNet tabs.
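As a stopgap for the fixed mask blur: if you prepare masks outside the UI anyway (for the img2img "Inpaint upload" tab or for API use), you can pre-feather them yourself with a Gaussian blur of whatever radius you like. A tiny Pillow sketch, with the file name and radius as placeholders:

```python
from PIL import Image, ImageFilter

mask = Image.open("mask.png").convert("L")
# Radius acts roughly like an adjustable mask blur
feathered = mask.filter(ImageFilter.GaussianBlur(radius=8))
feathered.save("mask_feathered.png")
```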
-
Is there a size limit on how far outpainting is possible before SD starts to hallucinate? 🤔
-
Can someone help me, please? I have no idea how to install "control_v11p_sd15_inpaint". Nothing seems to work and I can't find any help anywhere :/
-
It works great! But sometimes I see a lot of weird artifacts (glitches) on output images in unmasked areas, with different samplers and settings.
-
1.1.213: Fix weird artifacts (glitches) on output images in unmasked areas |
-
Update: Some users reported that they cannot get good enough results. The reason is that they are skipping steps described in this post and not using the correct resolution. The main point of this post is that using multiple generation passes produces better results than before. The options are:

If you use high-res fix (recommended), the correct setting is a base resolution of about 512, then a scale factor of 2.0 if your target resolution is 1024 (perhaps with R-ESRGAN and 0.25 denoising strength).

If you do not like high-res fix, the correct setting is to use a base resolution of about 512, inpaint, send the result to Extras and use another upscaler to reach your target resolution (a scripted version of this step is sketched below), then use third-party software to blend the mask and original image again, and send it back to img2img to inpaint again with a very low denoising strength. However, this method is complicated and not very flexible.

If you use a base resolution of 1024, the result is usually not very satisfying. This is because Stable Diffusion is trained on a lower resolution.
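For the second option, the "send result to extras" step can also be scripted over the web API. A rough sketch, assuming the standard `/sdapi/v1/extra-single-image` fields and an R-ESRGAN upscaler; file names are placeholders.

```python
import base64, requests

with open("inpainted_512.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

payload = {
    "image": img_b64,
    "resize_mode": 0,            # 0 = scale by factor
    "upscaling_resize": 2,       # 512-base -> roughly 1024
    "upscaler_1": "R-ESRGAN 4x+",
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/extra-single-image", json=payload)
r.raise_for_status()

with open("upscaled_1024.png", "wb") as f:
    f.write(base64.b64decode(r.json()["image"]))
```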
-
I have been playing with it for the past few days. I've changed my mind: this is superior to just inpainting.
-
Hi all, we put some comparisons to inpaint variation models here
-
Hello. I'm having some issues with ControlNet's inpainting. Masking an area often doesn't do anything the prompt told it to, but leaves behind an overlay of the mask on the output. What is the cause of this? I should also mention that this overlay appears when the generation is complete, not during the generation preview.
-
Thank you for this awesome update to ControlNet; the results I'm getting are great. Would it be possible to make it so that, when inpainting is selected in a ControlNet unit, any transparent part of the image gets masked? This behavior would be the same as in the regular inpainting tab. It's more difficult to reproduce results when there doesn't seem to be a way to recreate the same mask; if there is already a way to do so, please let me know. Customizable mask blur, like others have mentioned, might also be quite useful, but the under-the-hood settings do produce seamless results the majority of the time, so I'm not sure it would actually be an improvement.
-
When I use the inpaint_only preprocessor for image restoration, I found that it changes the pixels of my original image. Is this expected? Here are my options: Chinese Garden
-
Hi everyone, more progress here
-
The short story is that the ControlNet WebUI extension completed several Inpaint improvements/features in 1.1.202, making it possible to achieve inpaint effects similar to Adobe Firefly Generative Fill using only open-source models and code.
Adobe Firefly Generative Fill
This weekend someone told me that you do not really need an Adobe Subscription to use Firefly, and the popular Generative Fill can be used from their website (even without an Adobe account)!
After learning about this, I tested Firefly Generative Fill with some test images used during the development of ControlNet. The performance of that model is super impressive, and the technical architecture is more user-friendly than Stable Diffusion toolsets.
Overall, the behaviors of Adobe Firefly Generative Fill are:
For example, we test with this image (1280x1024; note that this is a real photo)
And I put this in Firefly Generative Fill
(Note that I do not input any prompts. The input text area is blank.)
And these are some random non-cherry-picked results (very impressive):
All results are very impressive, and this one is the best result I can get (in my personal opinion)
Is it possible for A1111?
Before ControlNet 1.1.202, the answer is no.
After ControlNet 1.1.202, the answer is somewhat yes.
We all know that most SD models are terrible when we do not input prompts. Making a user-friendly pipeline with prompt-free inpainting (like Firefly) in SD can be difficult.
For example, this is a simple test without prompts:
No prompt
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 12345, Size: 1280x1024, Model hash: c0d1994c73, Model: realisticVisionV20_v20, Denoising strength: 1, ENSD: 31337, Mask blur: 4, Version: v1.3.0
(realisticVisionV20, non-cherry-picked, seed 12345)
We can see that it is clearly unusable.
One possible way is to use an inpaint variation model:
(realisticVisionV20_v20-inpainting (6482f11700), non-cherry-picked, seed 12345)
We can see that results are still unusable.
Another method is to use ControlNet 1.1 Inpaint:
(realisticVisionV20+control_v11p_sd15_inpaint, non-cherry-picked, seed 12345)
We can see that the difference is minimal.
One may argue that we do not really need to follow this rule: if we want a system that does not always ask users for prompts, we can secretly generate prompts and automatically feed them to the model without letting users know about it. It is even possible to use a pre-defined negative prompt.
But if you try it, you will soon realize that this is a terrible idea.
For example, the prompt generated by “Interrogate CLIP” for this image is
a red tori, a small, Douglas Robertson Bisset, japan, a digital rendering, cloisonnism
I do not know what “a red tori” is, but it seems related to a house. And I add a general negative prompt:
lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
The results become
(realisticVisionV20_v20-inpainting (6482f11700), non-cherry-picked, seed 12345)
We can see that the results become kind of boring and always generate that house-like object.
Improvements in CN 1.1.202
Inspired by this, we completed some features in CN to achieve similar effects in SD.
Allow inpaint in txt2img. This is necessary because txt2img has high-res fix. It is very likely that, if you want to achieve high-quality inpaint similar to Firefly, you need a native multi-stage method. Now you can use the txt2img high-res fix as a cascaded inpaint pipeline. Note that the preprocessor “inpaint_only” does not change the unmasked area.
Allow image-based guidance in inpaint. We know that CN has a control mode that allows you to put ControlNet on the conditional side of the CFG scale; in this way, the image-based guidance can act like prompt-based guidance, since they both use the CFG scale. This facilitates prompt-free inpaint.
For example, my setting is:
No prompt (!). Just like Firefly, this is extremely challenging for an SD method.
(This also shows off ControlNet's image-content understanding capability.)
I prefer DDIM for real-photo inpaint, and using high-res fix is very important (make sure that your base diffusion resolution is around 512). We also recommend using a relatively low CFG scale (<5) with this method.
In ControlNet, make sure to select Control Mode “ControlNet is more important” to put CN on the conditional side of the CFG scale:
After you set up these options, your SD will become a system that behaves similarly to Firefly Generative Fill. You do not need to input any prompts and can just enjoy the high-quality inpaint (with any base model!).
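For reference, here is a hedged sketch of this whole setup as a txt2img API payload (field names follow the A1111 and ControlNet 1.1.2xx web APIs; the file names, exact model string, and upscaler choice are placeholders).

```python
import base64

def b64(path):
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "prompt": "",
    "negative_prompt": "",
    "sampler_name": "DDIM",
    "steps": 20,
    "cfg_scale": 4,                      # keep the CFG scale relatively low (<5)
    "width": 640, "height": 512,         # base diffusion around 512 (matches 1280x1024 aspect)
    "enable_hr": True,                   # high-res fix acts as the cascaded second pass
    "hr_scale": 2.0,
    "hr_upscaler": "R-ESRGAN 4x+",
    "denoising_strength": 0.25,          # low denoise for the second pass
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "module": "inpaint_only",
                "model": "control_v11p_sd15_inpaint",
                "input_image": b64("photo.png"),
                "mask": b64("photo_mask.png"),
                "control_mode": 2,        # "ControlNet is more important"
            }]
        }
    },
}
```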
Non-cherry-picked batch, seed 12345, No prompt:
Non-cherry-picked batch, seed 1593190232, No prompt:
Non-cherry-picked batch, seed 2467049182, No prompt:
(To reproduce, you need these model/vae:)
Ending
Note that these results are clearly not the same as Adobe Firefly's, but their behaviors are similar: they (probably) both use a cascaded inpaint pipeline, and they both use image content (not only the prompt) to guide the inpaint.
And of course, you can input prompts with this method. For example, I give a short prompt (and actually Firefly also supports very short prompts)
And the result is
Just enjoy the high-quality inpaint in 1.1.202!
Update (0603):
Some users reported that they cannot get good enough results. The reason is that they are skipping steps described in this post and not using the correct resolution. A key point of this post is that using multiple generation passes produces better results than before.
The options are:
If you use high-res fix (recommended), the correct setting is a base resolution of about 512, then a scale factor of 2.0 if your target resolution is 1024 (perhaps with R-ESRGAN and 0.25 denoising strength).
If you do not like high-res fix, the correct setting is to use a base resolution of about 512, inpaint, send the result to Extras and use another upscaler to reach your target resolution, then use third-party software to blend the mask and original image again (see the sketch after this list), and send it back to img2img to inpaint again with a very low denoising strength. However, this method is complicated and not very flexible.
If you use a base resolution of 1024, the result is usually not very satisfying (as discussed above). This is because Stable Diffusion is trained on a resolution of about 512.
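For the "blend the mask and original image again" step in the second option, any image tool works; as one possibility, here is a minimal Pillow sketch (file names are placeholders, and the blur radius is just a guess to soften the seam).

```python
from PIL import Image, ImageFilter

original = Image.open("original_1024.png").convert("RGB")
inpainted = Image.open("inpainted_upscaled_1024.png").convert("RGB").resize(original.size)
mask = Image.open("mask_1024.png").convert("L").resize(original.size)
mask = mask.filter(ImageFilter.GaussianBlur(radius=4))  # feather the edge a little

# White mask pixels take the inpainted image, black pixels keep the original
blended = Image.composite(inpainted, original, mask)
blended.save("blend_for_final_img2img.png")
```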