Latent optimizer to extract a representation #3

Open

denabazazian opened this issue Apr 16, 2021 · 5 comments

@denabazazian commented Apr 16, 2021

I am wondering how I can evaluate the model on a real image instead of an image generated by StyleGAN.

An input image usually has to be embedded into the GAN's latent space by a latent optimizer, so that the generator reproduces the input image and a representation can be extracted from it. However, I cannot find this latent optimizer in the code. Did you feed the input image into Pix2Pix's encoder and use the activation maps from all convolutional layers of the generator (decoder) to construct a pixel-wise representation?
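For concreteness, something like the following sketch is what I have in mind (the generator handle `g`, the input `latent`, and the choice of hooked layers are only placeholders, not your actual code):

```python
# Sketch: build a pixel-wise representation from generator activations.
# `g` and `latent` are placeholders; a real StyleGAN2 implementation uses
# its own conv classes and call signature, so the Conv2d filter below is
# only illustrative.
import torch
import torch.nn.functional as F

features = []

def hook(module, inputs, output):
    features.append(output)

# Register hooks on the conv layers of the generator (decoder).
handles = [
    m.register_forward_hook(hook)
    for m in g.modules()
    if isinstance(m, torch.nn.Conv2d)
]

with torch.no_grad():
    img = g(latent)  # forward pass fills `features`

for h in handles:
    h.remove()

# Upsample every activation map to the output resolution and concatenate
# along the channel axis -> one feature vector per pixel.
H, W = img.shape[-2:]
pixel_repr = torch.cat(
    [F.interpolate(f, size=(H, W), mode="bilinear", align_corners=False)
     for f in features],
    dim=1,
)  # shape: (batch, total_channels, H, W)
```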

Would it be possible to release the code for testing input images?

Thanks for your great work!

@bryandlee (Owner) commented
Hi. You can try the optimization-based method proposed in the original StyleGAN2 paper; an unofficial implementation is available at https://github.com/rosinality/stylegan2-pytorch/blob/master/projector.py
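Roughly, the projector optimizes a latent code so that the generator's output matches the target image under a perceptual loss. A minimal sketch in the spirit of that script, assuming rosinality's generator API plus an LPIPS loss (`g_ema`, `lpips_loss`, `target`, and `device` are placeholders):

```python
# Minimal sketch of optimization-based projection. `g_ema` is a loaded
# rosinality-style StyleGAN2 generator, `lpips_loss` a perceptual loss,
# and `target` a real image tensor on `device` -- all assumed to exist.
import torch

# Initialize the latent at the mean of W to keep the search on-manifold.
with torch.no_grad():
    w_samples = g_ema.style(torch.randn(10000, 512, device=device))
    w_mean = w_samples.mean(0, keepdim=True)

latent_in = w_mean.detach().clone().requires_grad_(True)
optimizer = torch.optim.Adam([latent_in], lr=0.1)

for step in range(1000):
    img_gen, _ = g_ema([latent_in], input_is_latent=True)
    # Perceptual loss plus a small pixel-wise term to anchor colors.
    loss = lpips_loss(img_gen, target) + 0.1 * (img_gen - target).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# `latent_in` now reproduces `target` as closely as the optimization allows.
```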

@denabazazian (Author) commented
@bryandlee Many thanks for your reply. I have tried the projector code from StyleGAN2, but the latent_in it produces corresponds to the generated projection of the input rather than to the input image itself. Does this mean I should modify lines #170 and #173 to get latent_in directly from the input image, regardless of sample_noise and latent_mean? Or am I missing something?

@bryandlee (Owner) commented
Hi, I don't quite get what you mean by "getting the latent_in directly from the input image regardless of sample_noise and latent_mean". The code finds the latent vectors and noises that can be fed into the generator to generate the closest projection of a given input image.

@denabazazian (Author) commented
Yes, the projector code generates the closest projection of a given input image, but the problem is that in most cases the viewpoint and some features of the input image are changed, so the semantic segmentation result does not correspond to the input image.
In the supplementary material of the paper, it is written that the input image is fed into a Pix2Pix encoder to construct a pixel-wise representation. I am just wondering if there is any further implementation or explanation regarding that. Thanks.

@bryandlee (Owner) commented
I see. The "auto-shot segmentation" part of the paper is not implemented, but you can sample image-label pairs from the few-shot model and use them to train any semantic segmentation model. I'll let you know if I get a chance to implement it.
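As a hedged sketch of that pipeline (every name here is illustrative, and the feature-returning generator call is an assumption rather than this repo's API):

```python
# Sketch of the suggested workflow: sample synthetic image-label pairs
# from the trained few-shot model, then train an off-the-shelf
# segmentation network on them. `g_ema`, `few_shot_head`, and `seg_model`
# are placeholder names, not actual objects from this repo.
import torch

def sample_pair(g_ema, few_shot_head, device="cuda"):
    z = torch.randn(1, 512, device=device)
    with torch.no_grad():
        # `return_features=True` is an assumed API for getting the
        # generator activations alongside the image.
        img, feats = g_ema(z, return_features=True)
        label = few_shot_head(feats).argmax(dim=1)  # pseudo ground truth
    return img, label

# Train any segmentation model (e.g., a U-Net over RGB) on sampled pairs.
opt = torch.optim.Adam(seg_model.parameters(), lr=1e-4)
ce = torch.nn.CrossEntropyLoss()

for step in range(5000):
    img, label = sample_pair(g_ema, few_shot_head)
    pred = seg_model(img)
    loss = ce(pred, label)
    opt.zero_grad()
    loss.backward()
    opt.step()
```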
