code for user-defined mask #74

Open
fabrizioguillaro opened this issue Oct 23, 2023 · 3 comments

Comments

@fabrizioguillaro

Hello!
I am trying to use your code for "Null-text Inversion for Editing Real Images using Guided Diffusion Models".
In particular, since I have an inpainting mask, I am trying to generate an image using a user-defined mask (as shown in Fig. 8 or Fig. 14 of "Prompt-To-Prompt Image Editing With Cross-Attention Control").
The code for using a user-defined mask is missing, so I tried to implement it myself.
Did you simply apply the given mask in LocalBlend, in place of the one computed from the prompt?
Could the following code represent what you did (resizing the mask to 64x64, repeating it twice along the first (batch) axis, and applying the mask in the latent space)?

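import numpy as np
import torch
from PIL import Image

class LocalBlend:
    ...
    def __init__(...):
        ...
        # Downsample the user mask (assumed 2D) to the 64x64 latent resolution.
        mask = np.array(Image.fromarray(mask).resize((64, 64), Image.NEAREST))
        mask = mask[None, None, :, :]  # shape (1, 1, 64, 64)
        mask = mask.repeat(2, axis=0)  # one copy per batch entry: (2, 1, 64, 64)
        self.mask = torch.from_numpy(mask).cuda()

    def __call__(...):
        ...
        # Assumes mask values are 0/1 (a 0/255 mask would need dividing by 255).
        mask = self.mask.float()
        # Keep the source latent x_t[:1] outside the mask, the edited latent inside.
        x_t = x_t[:1] + mask * (x_t - x_t[:1])
        return x_t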
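
For anyone who wants to check the blending logic without a GPU or the full pipeline, here is a minimal sketch of the same idea using NumPy arrays as stand-ins for the latents. The helper names `prepare_mask` and `blend` are hypothetical (they are not part of the repository), the mask is assumed to be a 2D array that is either 0/1 or 0/255, and the batch is assumed to hold the source image first and the edited image second. It is not a confirmation of how the authors implemented user-defined masks.

```python
import numpy as np

def prepare_mask(mask, latent_size=64, batch_size=2):
    """Binarize a 2D mask and downsample it to the latent grid (nearest neighbour)."""
    mask = np.asarray(mask)
    binary = (mask > 0).astype(np.float32)            # works for 0/1 and 0/255 masks
    h, w = binary.shape
    rows = np.arange(latent_size) * h // latent_size  # nearest-neighbour row indices
    cols = np.arange(latent_size) * w // latent_size  # nearest-neighbour column indices
    small = binary[np.ix_(rows, cols)]                # (latent_size, latent_size)
    # One copy per batch entry, one broadcastable channel axis.
    return np.broadcast_to(small, (batch_size, 1, latent_size, latent_size)).copy()

def blend(x_t, mask):
    """Keep the source latent x_t[:1] where mask == 0 and each sample's own latent where mask == 1."""
    return x_t[:1] + mask * (x_t - x_t[:1])

# Toy check: a 512x512 mask covering the left half, random latents for (source, edited).
user_mask = np.zeros((512, 512), dtype=np.uint8)
user_mask[:, :256] = 255
rng = np.random.default_rng(0)
x_t = rng.standard_normal((2, 4, 64, 64)).astype(np.float32)
m = prepare_mask(user_mask)
out = blend(x_t, m)
```

In this toy setup the first batch entry is left unchanged, and the second entry keeps its own values in the left half while taking the source values in the right half. Binarizing with `mask > 0` is a design choice made here so that a 0/255 image mask does not scale the latents by 255.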

@fabrizioguillaro
Author

fabrizioguillaro commented Oct 23, 2023

The code I wrote works (see the example image); I am just wondering whether it follows the approach you intended.

As you can see, using the given mask, the code above allows me to edit just the pie on the left, instead of all the pies:
image

@Yutong-Dai

> The code I wrote works (example in the image), I am just wondering if it follows the way you intended to do it.
>
> As you can see, using the given mask, the code above allows me to edit just the pie on the left, instead of all the pies: image

Thanks for bringing this up. I also have a similar question about replacing the estimated mask with user-provided masks. Could you share the code to reproduce the results shown in the above example? I noticed that the rolling pin on the right was distorted, even with the presence of the mask.
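
One way to quantify whether regions such as the rolling pin really change outside the mask is to compare the source and edited images only where the mask is zero. The sketch below is a diagnostic under stated assumptions, not part of the repository: `change_outside_mask` is a hypothetical helper, the images are assumed to be `(H, W, 3)` arrays of the same size, and the mask an `(H, W)` array at image resolution. Since the blending in this thread happens on a 64x64 latent grid before decoding, some change outside the mask may be expected and the result need not be pixel-exact.

```python
import numpy as np

def change_outside_mask(source, edited, mask):
    """Mean absolute pixel difference over the region where mask == 0."""
    source = np.asarray(source, dtype=np.float32)
    edited = np.asarray(edited, dtype=np.float32)
    outside = np.asarray(mask) == 0          # (H, W) boolean selector
    if not outside.any():
        return 0.0                           # the mask covers the whole image
    # Boolean indexing keeps all channels of every pixel outside the mask.
    return float(np.abs(source - edited)[outside].mean())

# Toy example: the edit changes the masked left half plus one pixel outside it.
source = np.zeros((8, 8, 3), dtype=np.float32)
edited = source.copy()
mask = np.zeros((8, 8), dtype=np.uint8)
mask[:, :4] = 1
edited[:, :4] += 10.0                        # change inside the mask (ignored)
edited[0, 6] += 3.0                          # leak outside the mask (counted)
leak = change_outside_mask(source, edited, mask)
```

A value near zero suggests the unmasked region is preserved; a larger value points to changes leaking outside the mask.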

@AhmedBourouis

@fabrizioguillaro What if the mask didn't match the position of the pie, for example if the mask were on the right? Would it still give reasonable results?
