
Attention masking bug? #12

Closed
rayleizhu opened this issue Jun 8, 2023 · 1 comment

Comments

@rayleizhu

It seems that your attention masking is wrong: patches from different source images should receive different masks, but you use a single mask definition for all patches regardless of which image they come from.

```python
if attn_mask is not None:
```

By the way, how do you handle mixing more than two images? I only see the two-image case in the code.

@rayleizhu

I've figured it out: complementary masking is implemented in `WindowAttention`:

```python
mask_new = mask * mask.transpose(2, 3) + (1 - mask) * (1 - mask).transpose(2, 3)
```
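To see why this expression handles both images with a single mask, here is a minimal NumPy sketch of the same computation (assumptions: `mask` is a binary per-token mask of shape `[B, heads, N, 1]` with 1 for tokens from image A and 0 for tokens from image B; the shapes are illustrative, not taken from the repo, and `np.swapaxes` stands in for torch's `transpose(2, 3)`):

```python
import numpy as np

# Hypothetical toy mask: 4 tokens, first two from image A, last two from image B.
B, H, N = 1, 1, 4
mask = np.array([1, 1, 0, 0], dtype=np.float32).reshape(B, H, N, 1)

# Complementary masking: a query-key pair stays unmasked only when both
# tokens come from the same source image (A with A, or B with B).
mask_t = np.swapaxes(mask, 2, 3)  # analogous to mask.transpose(2, 3) in torch
mask_new = mask * mask_t + (1 - mask) * (1 - mask_t)

print(mask_new[0, 0])
# → block-diagonal matrix:
# [[1. 1. 0. 0.]
#  [1. 1. 0. 0.]
#  [0. 0. 1. 1.]
#  [0. 0. 1. 1.]]
```

The outer product `mask * mask.T` keeps A–A pairs, `(1 - mask) * (1 - mask).T` keeps B–B pairs, and their sum is a block-diagonal attention mask, so a single mask definition does cover patches from both images.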
