Latent optimizer to extract a representation #3

Open

denabazazian opened this issue Apr 16, 2021 · 5 comments

@denabazazian commented Apr 16, 2021

I am wondering how I can evaluate the model on a real image instead of an image generated by StyleGAN.

An input image usually has to be embedded into the GAN's latent space by a latent optimizer, so that the generator reproduces the input image and a representation can be extracted from it. However, I cannot find this latent optimizer in the code. Did you feed the input image into Pix2Pix's encoder and use the activation maps from all convolutional layers of the generator (decoder) to construct a pixel-wise representation?
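For concreteness, something like the following sketch is what I have in mind (the generator handle `g`, the input `latent`, and the choice of hooked layers are only placeholders, not your actual code):

```python
# Sketch: build a pixel-wise representation from generator activations.
# `g` and `latent` are placeholders; a real StyleGAN2 implementation uses
# its own conv classes and call signature, so the Conv2d filter below is
# only illustrative.
import torch
import torch.nn.functional as F

features = []

def hook(module, inputs, output):
    features.append(output)

# Register hooks on the conv layers of the generator (decoder).
handles = [
    m.register_forward_hook(hook)
    for m in g.modules()
    if isinstance(m, torch.nn.Conv2d)
]

with torch.no_grad():
    img = g(latent)  # forward pass fills `features`

for h in handles:
    h.remove()

# Upsample every activation map to the output resolution and concatenate
# along the channel axis -> one feature vector per pixel.
H, W = img.shape[-2:]
pixel_repr = torch.cat(
    [F.interpolate(f, size=(H, W), mode="bilinear", align_corners=False)
     for f in features],
    dim=1,
)  # shape: (batch, total_channels, H, W)
```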

Would it be possible to release the code for testing input images?

Thanks for your great work!

@bryandlee (Owner) commented
Hi. You can try the optimization-based method proposed in the original StyleGAN2 paper; an unofficial implementation is available at https://github.com/rosinality/stylegan2-pytorch/blob/master/projector.py
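Roughly, the projector optimizes a latent code so that the generator's output matches the target image under a perceptual loss. A minimal sketch in the spirit of that script, assuming rosinality's generator API plus an LPIPS loss (`g_ema`, `lpips_loss`, `target`, and `device` are placeholders):

```python
# Minimal sketch of optimization-based projection. `g_ema` is a loaded
# rosinality-style StyleGAN2 generator, `lpips_loss` a perceptual loss,
# and `target` a real image tensor on `device` -- all assumed to exist.
import torch

# Initialize the latent at the mean of W to keep the search on-manifold.
with torch.no_grad():
    w_samples = g_ema.style(torch.randn(10000, 512, device=device))
    w_mean = w_samples.mean(0, keepdim=True)

latent_in = w_mean.detach().clone().requires_grad_(True)
optimizer = torch.optim.Adam([latent_in], lr=0.1)

for step in range(1000):
    img_gen, _ = g_ema([latent_in], input_is_latent=True)
    # Perceptual loss plus a small pixel-wise term to anchor colors.
    loss = lpips_loss(img_gen, target) + 0.1 * (img_gen - target).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# `latent_in` now reproduces `target` as closely as the optimization allows.
```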

@denabazazian (Author) commented
@bryandlee Many thanks for your reply. I have tried the projector code from StyleGAN2, but the latent_in it produces corresponds to the generated projection of the input rather than to the input image itself. Does this mean I should modify lines #170 and #173 to get latent_in directly from the input image, regardless of sample_noise and latent_mean? Or am I missing something?

@bryandlee (Owner) commented
Hi, I don't quite get what you mean by "getting the latent_in directly from the input image regardless of sample_noise and latent_mean". The code finds the latent vectors and noises that can be fed into the generator to generate the closest projection of a given input image.

@denabazazian (Author) commented
Yes, the projector code generates the closest projection of a given input image, but the problem is that in most cases the viewpoint and some features of the input image are changed, so the semantic segmentation result does not correspond to the input image.
In the supplementary material of the paper, it is written that the input image is fed into a Pix2Pix encoder to construct a pixel-wise representation. I am just wondering if there is any further implementation or explanation regarding that. Thanks.

@bryandlee (Owner) commented
I see. The "auto-shot segmentation" part of the paper is not implemented, but you can sample image-label pairs from the few-shot model and use them to train any semantic segmentation model. I'll let you know if I get a chance to implement it.
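As a hedged sketch of that pipeline (every name here is illustrative, and the feature-returning generator call is an assumption rather than this repo's API):

```python
# Sketch of the suggested workflow: sample synthetic image-label pairs
# from the trained few-shot model, then train an off-the-shelf
# segmentation network on them. `g_ema`, `few_shot_head`, and `seg_model`
# are placeholder names, not actual objects from this repo.
import torch

def sample_pair(g_ema, few_shot_head, device="cuda"):
    z = torch.randn(1, 512, device=device)
    with torch.no_grad():
        # `return_features=True` is an assumed API for getting the
        # generator activations alongside the image.
        img, feats = g_ema(z, return_features=True)
        label = few_shot_head(feats).argmax(dim=1)  # pseudo ground truth
    return img, label

# Train any segmentation model (e.g., a U-Net over RGB) on sampled pairs.
opt = torch.optim.Adam(seg_model.parameters(), lr=1e-4)
ce = torch.nn.CrossEntropyLoss()

for step in range(5000):
    img, label = sample_pair(g_ema, few_shot_head)
    pred = seg_model(img)
    loss = ce(pred, label)
    opt.zero_grad()
    loss.backward()
    opt.step()
```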
