Why not adopt bert like MaskGIT to reconstruct Tokens? #52

Open
tanbuzheng opened this issue Jan 27, 2024 · 4 comments

tanbuzheng commented Jan 27, 2024

Dear author,
Thanks for sharing the code. I am greatly interested in your work and have a question I'd like to ask.
In the second stage, you adopt an encoder-decoder Transformer to reconstruct tokens. Why not directly adopt the bidirectional Transformer used in MaskGIT? In other words, what are the advantages of the encoder-decoder Transformer?

Looking forward to your reply!

LTH14 (Owner) commented Jan 27, 2024

In fact, we started from MaskGIT's BERT architecture, but we found that both linear probing and unconditional generation performance were poor (57.4% accuracy, 20.7 FID). We then found that adopting an encoder-decoder architecture similar to MAE largely improves performance. My assumption is that the encoder-decoder design is better for representation learning, and a good representation in turn also helps generation.
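As a concrete illustration, here is a rough PyTorch sketch of the difference (made-up depths, dimensions, and names, not the actual MAGE code): the BERT/MaskGIT-style model feeds the full sequence, [MASK] tokens included, through one bidirectional Transformer, whereas the MAE-style design encodes only the visible tokens and re-inserts [MASK] embeddings before a shallow decoder.

```python
import torch
import torch.nn as nn

VOCAB, SEQ_LEN, DIM = 1024, 256, 512


class BertStyle(nn.Module):
    """MaskGIT-style: one bidirectional Transformer sees the full sequence,
    with masked positions replaced by a learnable [MASK] embedding."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB + 1, DIM)           # last id = [MASK]
        self.pos = nn.Parameter(torch.zeros(1, SEQ_LEN, DIM))
        layer = nn.TransformerEncoderLayer(DIM, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=8)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens, mask):                         # mask: True = masked
        x = tokens.masked_fill(mask, VOCAB)                  # substitute [MASK] id
        x = self.embed(x) + self.pos
        return self.head(self.blocks(x))                     # logits for every position


class EncDecStyle(nn.Module):
    """MAE-style: the encoder only sees visible tokens; [MASK] embeddings are
    inserted afterwards and a shallower decoder reconstructs every position."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.pos = nn.Parameter(torch.zeros(1, SEQ_LEN, DIM))
        enc = nn.TransformerEncoderLayer(DIM, nhead=8, batch_first=True)
        dec = nn.TransformerEncoderLayer(DIM, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=6)
        self.decoder = nn.TransformerEncoder(dec, num_layers=2)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, DIM))
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens, mask):
        b = tokens.shape[0]
        x = self.embed(tokens) + self.pos
        # Encoder: drop masked positions entirely (assumes the same number of
        # visible tokens per sample, i.e. a fixed masking ratio).
        visible = self.encoder(x[~mask].reshape(b, -1, DIM))
        # Decoder: scatter the encoded tokens back, fill the holes with [MASK].
        full = self.mask_token.expand(b, SEQ_LEN, DIM).clone()
        full[~mask] = visible.reshape(-1, DIM)
        return self.head(self.decoder(full + self.pos))


# Example: mask exactly 75% of the positions in a batch of two token maps.
tokens = torch.randint(0, VOCAB, (2, SEQ_LEN))
ranks = torch.rand(2, SEQ_LEN).argsort(dim=1).argsort(dim=1)
mask = ranks < int(0.75 * SEQ_LEN)
logits = EncDecStyle()(tokens, mask)                          # (2, 256, 1024)
```

The practical difference is that the encoder in the second design never attends to [MASK] tokens, so its features have to come from the visible content alone, which is consistent with the better linear-probing result mentioned above.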

tanbuzheng (Author) commented

Thanks for your reply! But I have another question. In the second stage, would better results be obtained if masked images were used as the input to reconstruct the tokens? Table 4 of your paper shows that operating on pixels leads to better performance.

LTH14 (Owner) commented Jan 29, 2024

We must use image tokens as both the input and the output to enable image generation, because generation takes multiple steps: in the middle of generation, only part of the tokens have been generated, and such a partially generated sequence cannot be decoded into an image. If we only considered representation learning, using masked images as the input and tokens as the output would be similar to BEiT.
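To spell out why the model has to map tokens to tokens, here is a hypothetical sketch of MaskGIT-style parallel iterative decoding (the mask id, confidence rule, and cosine schedule are simplified and not taken from this repo; `model` is any callable mapping a token-id sequence to per-position logits). Until the final step, part of the sequence is still [MASK], so there is nothing a VQGAN decoder could turn into pixels yet.

```python
import math
import torch


@torch.no_grad()
def iterative_generate(model, seq_len=256, mask_id=1024, steps=8, device="cpu"):
    # Start from a fully masked sequence of token ids.
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long, device=device)
    for step in range(steps):
        logits = model(tokens)                        # tokens in -> token logits out
        conf, pred = logits.softmax(-1).max(-1)       # per-position best guess
        still_masked = tokens.eq(mask_id)
        conf = conf.masked_fill(~still_masked, -1.0)  # never overwrite decided spots

        # Cosine schedule: how many positions should remain masked after this step.
        keep_masked = int(seq_len * math.cos(math.pi / 2 * (step + 1) / steps))
        n_fill = max(int(still_masked.sum()) - keep_masked, 0)

        # Commit the n_fill most confident predictions; the rest stay [MASK].
        idx = conf.topk(n_fill, dim=-1).indices
        tokens.scatter_(1, idx, pred.gather(1, idx))
    # Only after the last step does every position hold a real token id,
    # so a VQGAN decoder could map the sequence back to pixels.
    return tokens
```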

tanbuzheng (Author) commented

I got it! Thanks!
