SDXL Fooocus Inpaint #9870

WaterKnight1998 · 2024-01-11T10:06:01Z

WaterKnight1998
Jan 11, 2024

Is your feature request related to a problem? Please describe.
I have seen that diffusers StableDiffusionXLInpaintPipeline generates worse results than SD 1.5 pipeline.

Describe the solution you'd like.
Include Fooocus inpaint patch, you could specify with a new loader.
Weights are available right now in hub.
https://huggingface.co/lllyasviel/fooocus_inpaint

Laidawang · 2024-01-12T05:59:08Z

Laidawang
Jan 12, 2024

they also seem to use fooocus_inpaint_head.pth I'm not quite sure what this will do, I read the code and maybe an additional patch for unet?

The inpaint_v26.fooocus.patch is more similar to a lora, and then the first 50% executes base_model + lora, and the last 50% executes base_model.
There is no doubt that fooocus has the best inpainting effect and diffusers has the fastest speed, it would be perfect if they could be combined.

0 replies

asomoza · 2024-01-12T06:46:33Z

asomoza
Jan 12, 2024
Maintainer

Actually it seems more like a controlnet, something more like this one: https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl.

They also use a custom sampler for the inpainting, but I agree, it would be nice to be able to use those in diffusers.

You can read about it here: lllyasviel/Fooocus#414

0 replies

WaterKnight1998 · 2024-01-12T09:18:20Z

WaterKnight1998
Jan 12, 2024
Author

The inpaint_v26.fooocus.patch is more similar to a lora, and then the first 50% executes base_model + lora, and the last 50% executes base_model. There is no doubt that fooocus has the best inpainting effect and diffusers has the fastest speed, it would be perfect if they could be combined.

I was reading the code and they download the model here: https://github.com/lllyasviel/Fooocus/blob/dc5b5238c83c63b4d7814ba210da074ddc341213/modules/config.py#L398-L399

This function is called here: https://github.com/lllyasviel/Fooocus/blob/dc5b5238c83c63b4d7814ba210da074ddc341213/modules/async_worker.py#L301 You can see inpaint_patch_model_path is passed to base_model_additional_loras. They have an strange coded for applying the lora.

After model is loaded you can see in following tabs that they apply the head in top of the result of applying the lora

0 replies

Laidawang · 2024-01-12T09:27:30Z

Laidawang
Jan 12, 2024

Actually it seems more like a controlnet, something more like this one: https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl.实际上，它看起来更像是一个控制网，更像是这个：https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl。

They also use a custom sampler for the inpainting, but I agree, it would be nice to be able to use those in diffusers.他们还使用自定义采样器进行修复，但我同意，如果能够在扩散器中使用它们那就太好了。

You can read about it here: lllyasviel/Fooocus#414您可以在这里阅读：lllyasviel/Fooocus#414

I have read the comparison between Fooocus and comfyui of loading lora. I think they are basically the same.
COMFY: https://github.com/comfyanonymous/ComfyUI/blob/53c8a99e6c00b5e20425100f6680cd9ea2652218/comfy/lora.py#L13
FOOOCUS:
https://github.com/lllyasviel/Fooocus/blob/dc5b5238c83c63b4d7814ba210da074ddc341213/ldm_patched/modules/lora.py#L13

This can also be confirmed from the code provided by @WaterKnight1998. it just defined different names to ensure that only fooocus can load it correctly.

0 replies

WaterKnight1998 · 2024-01-12T09:29:03Z

WaterKnight1998
Jan 12, 2024
Author

Yup, that's the problem I saw. I had a difficult time trying to load in diffusers I didn't managed to map keys of layers into diffusers expected format :(

0 replies

Laidawang · 2024-01-12T09:32:12Z

Laidawang
Jan 12, 2024

https://github.com/lllyasviel/Fooocus/blob/main/modules/inpaint_worker.py#L187 Another thing worth considering is how to implement this patch for inpaint head model.

0 replies

WaterKnight1998 · 2024-01-12T09:34:48Z

WaterKnight1998
Jan 12, 2024
Author

Actually it seems more like a controlnet, something more like this one: https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl.实际上，它看起来更像是一个控制网，更像是这个：https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl。
They also use a custom sampler for the inpainting, but I agree, it would be nice to be able to use those in diffusers.他们还使用自定义采样器进行修复，但我同意，如果能够在扩散器中使用它们那就太好了。
You can read about it here: lllyasviel/Fooocus#414您可以在这里阅读：lllyasviel/Fooocus#414

I have read the comparison between Fooocus and comfyui of loading lora. I think they are basically the same. COMFY: https://github.com/comfyanonymous/ComfyUI/blob/53c8a99e6c00b5e20425100f6680cd9ea2652218/comfy/lora.py#L13 FOOOCUS: https://github.com/lllyasviel/Fooocus/blob/dc5b5238c83c63b4d7814ba210da074ddc341213/ldm_patched/modules/lora.py#L13

Ok, both codes are the same. Is it possible to load ComfyUI weights in diffusers?

0 replies

WaterKnight1998 · 2024-01-12T09:35:49Z

WaterKnight1998
Jan 12, 2024
Author

https://github.com/lllyasviel/Fooocus/blob/main/modules/inpaint_worker.py#L187 Another thing worth considering is how to implement this patch for inpaint head model.

But the code is just updating the first conv, no?

0 replies

Laidawang · 2024-01-12T09:39:58Z

Laidawang
Jan 12, 2024

https://github.com/lllyasviel/Fooocus/blob/main/modules/inpaint_worker.py#L187 Another thing worth considering is how to implement this patch for inpaint head model.

But the code is just updating the first conv, no?

You are right, but we also need to use it in diffusers as input to start with

0 replies

Laidawang · 2024-01-12T09:41:12Z

Laidawang
Jan 12, 2024

Maybe consider loading it in comfy and saving it as overall weights and then using it in diffusers?

0 replies

Laidawang · 2024-01-12T09:42:20Z

Laidawang
Jan 12, 2024

But as I saw in fooocus, the base model will still be used in the second stage, so the most elegant way is to load and unload it freely.

0 replies

WaterKnight1998 · 2024-01-12T09:48:37Z

WaterKnight1998
Jan 12, 2024
Author

But as I saw in fooocus, the base model will still be used in the second stage, so the most elegant way is to load and unload it freely.

What do you mean with this?

0 replies

Laidawang · 2024-01-12T10:20:23Z

Laidawang
Jan 12, 2024

For example, in fooocus inpainting, assuming that 30 steps of sampling are performed, xl_base_model + inpainting_model will be used in the first 15 steps, and xl_base_model will be switched to separate inference in the last 15 steps.
https://github.com/lllyasviel/Fooocus/blob/main/modules/async_worker.py#L307 see here.

0 replies

asomoza · 2024-01-12T15:36:33Z

asomoza
Jan 12, 2024
Maintainer

yeah I saw it afterwards, they switched to a custom model for inpainting, how good is the inpainting? can any of you post an example? if its really good maybe I can try or even better, someone from the diffusers team, but they'll probably need solid proof to work on it.

0 replies

Laidawang · 2024-01-15T05:24:47Z

Laidawang
Jan 15, 2024

before：

after：

I tried outpainting and it was amazingly realistic.

for inpainting it, it blends well with the background.

0 replies

asomoza · 2024-03-19T02:40:41Z

asomoza
Mar 19, 2024
Maintainer

Nice, I like the challenge, let me get back at you soon since I still haven't done any outpainting with diffusers and I don't think there's a pipeline or workflow for that yet.

I plan to do a guide/example/tutorial for inpainting and outpanting soon. I'll work in an outpainting solution so I can tackle this first, but IMO is the same, just need to solve the math for expanding the "canvas" and probably need to fill it with something first, not just noise.

In addition, I would like to ask, what is the speed of using diff-diff to infer 20 steps?

for the woman this is the speed I get with a 3090:

in this comment: exx8/differential-diffusion#17 (comment) the author says it is just a 0.25% penalty.

But I'll also do it with normal inpainting because the results are also good, I like to use diff-diff but normal inpainting is not bad, the trick to it is the image area we use as a context and how we merge back the inpainted part.

0 replies

Laidawang · 2024-03-19T02:46:27Z

Laidawang
Mar 19, 2024

I agree with you, the original image is crucial to the generation process, diffusion model are trained to do that. So for the outpainting here, I would use lama first to fix outpaint area.

0 replies

Laidawang · 2024-03-19T06:19:46Z

Laidawang
Mar 19, 2024

i test it in comfy，with 2 methods：24 steps for all
M1： base model + inpaint patch(0-12) then base_model (12-24) according to the source code of fooocus.
as i realized base_model + inpaint patch ≈ a inpainting model so i think way2 is：
M2： inpaint base model（i choose real3.0inpaint model which can be find here：https://civitai.com/models/139562?modelVersionId=297320）(0-12) + base model (12 -24)
For both methods I used diff-diff as this does not conflict.Here are some results.（left: M1-fooocus, right:M2）
a wolf in a swimming pool

a wolf playing football

a wolf playing football on the beach

some food placed on the table

My test case：
testing.zip
In some scenes, I think they're pretty close, but I think fooocus is slightly better. For example, the bowl in the last picture.
One more thing worth noting is that in M2 I loaded two weights (5g each), whereas in M1 I only have to load an additional pathch (1.23g), which means this method uses less GPU memory.

0 replies

asomoza · 2024-03-19T12:35:57Z

asomoza
Mar 19, 2024
Maintainer

I always thought that the model patching that fooocus does is just to convert a regular model to an inpainting one, you can do the same by merging the difference of the inpaint model trained by the diffusers team with the base model, that's is what most people did with SD 1.5.

The first "inpainting" of fooocus was a controlnet too, so I don't know if there's something else in the patch or if he trained it from scratch or used the diffusers model, so I left it better as an "unknown" patch.

Thank you for doing more tests, I have more to compare my results with that. This time, I’m putting more effort to do the best I can, instead of just simply replicating fooocus or comfyui.

Edit: the VRAM and RAM can be managed, I remember that fooocus has to unload and load the model so it probably clones the base model (taking more RAM), also I think comfyui manages better the memory than fooocus since comfyui can run in a potato pc, so it should unload the model that is not using. In diffusers you practically can do whatever you want if you have the knowledge.

0 replies

asomoza · 2024-03-19T15:14:45Z

asomoza
Mar 19, 2024
Maintainer

just to have a baseline, I tested the wolf one with just the controlnet inpaint, I use my app for this, but it can be done with just code:

I don't think is that worse, but If I want to make it better, I can use the prompt enhancer and fix the composition with a t2i adapter (fixed a bit of the tail with painting in the canny preprocessed image)

not bad for a quick inference, I'm going to do this also with just diffusers code.

0 replies

asomoza · 2024-03-26T22:09:00Z

asomoza
Mar 26, 2024
Maintainer

Hi @Laidawang, I just posted a guide on the discussions on outpainting, I did a middle step without changing the prompt so you can compare it to the fooocus result.

I'll use your other images in the other methods I know because they are more suited for them. Let me know if you still think fooocus is better but IMO they're of the same quality or better.

0 replies

viperyl · 2024-04-12T01:46:20Z

viperyl
Apr 12, 2024

thanks for you sharing, i found the Fooocus inpaint lora weight contains (unit8, fp16, fp16) data, can anyone explain uint8 weights here?

0 replies

bonlime · 2024-04-21T15:19:20Z

bonlime
Apr 21, 2024

@viperyl if i remember correctly they quantized the main matrices to uint8 to take less space and then use min/max stored in fp16 to scale them back. IMO very good idea with negligible loss of information

0 replies

viperyl · 2024-04-22T06:00:20Z

viperyl
Apr 22, 2024

@viperyl if i remember correctly they quantized the main matrices to uint8 to take less space and then use min/max stored in fp16 to scale them back. IMO very good idea with negligible loss of information

Yes, I debuged it and found uint8 quant, that's make me feel confused. The uint8 checkpoint needs 1.4 Gb disk space, but fp16 version only needs 2.5 Gb. Considering the quantization protentially damaged result, make a quant here for saving 1.1 Gb disk space looks not good idea.

0 replies

2024-05-16T15:05:30Z

github-actions[bot]
bot May 16, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

0 replies

CristianCuadrado · 2024-10-09T13:21:16Z

CristianCuadrado
Oct 9, 2024

Has anyone gotten it to work with the Hugging Face pipeline?

0 replies

bonlime · 2024-10-10T09:45:12Z

bonlime
Oct 10, 2024

@CristianCuadrado i did, but then switched to controlnet Union, it produces better results compared to Fooocus, no reason to use it

0 replies

CristianCuadrado · 2024-10-10T11:05:19Z

CristianCuadrado
Oct 10, 2024

@CristianCuadrado i did, but then switched to controlnet Union, it produces better results compared to Fooocus, no reason to use it

Wow, it looks amazing! Thank you!
I couldn't find any examples of them using inpainting, and the author has demos of the other methods but not that one. Could you show me some code?

0 replies

asomoza · 2024-10-10T11:24:52Z

asomoza
Oct 10, 2024
Maintainer

I did a guide that shows a way to use it, and two spaces that showcase the power it has. Diffusers Image Fill and Diffusers Fast Inpaint, also there's another based on it that was even more popular: Diffusers Image Outpaint that has also a repo

Also in the official repo, the author posted the code on how to use it.

0 replies

2024-11-03T15:07:27Z

github-actions[bot]
bot Nov 3, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

0 replies

SDXL Fooocus Inpaint #9870

Replies: 55 comments

asomoza Jan 12, 2024 Maintainer

WaterKnight1998 Jan 12, 2024 Author

WaterKnight1998 Jan 12, 2024 Author

WaterKnight1998 Jan 12, 2024 Author

WaterKnight1998 Jan 12, 2024 Author

WaterKnight1998 Jan 12, 2024 Author

asomoza Jan 12, 2024 Maintainer

asomoza Mar 19, 2024 Maintainer

asomoza Mar 19, 2024 Maintainer

asomoza Mar 19, 2024 Maintainer

asomoza Mar 26, 2024 Maintainer

github-actions[bot] bot May 16, 2024

asomoza Oct 10, 2024 Maintainer

github-actions[bot] bot Nov 3, 2024

asomoza
Jan 12, 2024
Maintainer

WaterKnight1998
Jan 12, 2024
Author

WaterKnight1998
Jan 12, 2024
Author

WaterKnight1998
Jan 12, 2024
Author

WaterKnight1998
Jan 12, 2024
Author

WaterKnight1998
Jan 12, 2024
Author

asomoza
Jan 12, 2024
Maintainer

asomoza
Mar 19, 2024
Maintainer

asomoza
Mar 19, 2024
Maintainer

asomoza
Mar 19, 2024
Maintainer

asomoza
Mar 26, 2024
Maintainer

github-actions[bot]
bot May 16, 2024

asomoza
Oct 10, 2024
Maintainer

github-actions[bot]
bot Nov 3, 2024