
About key_padding_mask in multihead self attention #36

Open
YuchenHui22314 opened this issue Apr 21, 2023 · 3 comments

Comments

@YuchenHui22314

Hi!

Thank you for your implementation!

I would like to know whether there are particular reasons why this line for key_padding_mask (https://github.com/pmixer/SASRec.pytorch/blob/master/model.py#L83) is commented out. It seems that this mask is necessary to prevent the model from attending to padding positions?

Thanks again,

Sincerely

Yuchen

@pmixer
Owner

pmixer commented Apr 21, 2023


Thanks for the question. I can hardly recall the exact details on short notice, but generally:

As the padding item is 0, corresponding to an all-zero embedding as initialized in https://github.com/pmixer/SASRec.pytorch/blob/master/model.py#L36, attending to it would not affect the output.
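A rough sketch of what that zero-padding embedding looks like (illustrative only, not the repo's exact code):

```python
import torch
import torch.nn as nn

# With padding_idx=0, nn.Embedding keeps row 0 as an all-zero vector,
# so padded positions (item id 0) look up a zero embedding.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4, padding_idx=0)
seq = torch.tensor([[0, 0, 3, 7]])   # two leading padding tokens
print(emb(seq)[0, :2])               # the two padded positions are all zeros
```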

Also, the current PyTorch API states "If both attn_mask and key_padding_mask are supplied, their types should match." I may have hit the issue back when the docs said attn_mask and key_padding_mask could not be used at the same time, using PyTorch 1.6 for the implementation about 3 years ago, and thus had one of them commented out.

Please feel free to uncomment the line to check how the key padding mask would affect model training and inference.
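For reference, a minimal sketch of what passing both masks to nn.MultiheadAttention could look like (the tensor names and shapes here are illustrative, not the repo's; both masks are boolean so their dtypes match):

```python
import torch
import torch.nn as nn

# batch_first requires a reasonably recent PyTorch (>= 1.9).
mha = nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=True)
x = torch.randn(2, 5, 8)  # (batch, seq_len, embed_dim)

# Causal mask: True means "do not attend".
attn_mask = torch.triu(torch.ones(5, 5, dtype=torch.bool), diagonal=1)
# Padding mask: True marks padded key positions (right-padded here).
key_padding_mask = torch.tensor([[False, False, False, True, True],
                                 [False, False, False, False, False]])

out, _ = mha(x, x, x, attn_mask=attn_mask, key_padding_mask=key_padding_mask)
```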

@YuchenHui22314
Author

Thanks for your prompt reply!

Oh OK, I see, I will try! After careful consideration, I think that, conceptually, even setting all paddings to 0 (what the code does now) will still influence the attention mechanism, since before the attention softmax the raw attention score can be negative, zero, or positive. Therefore zero does not mean anything special to the softmax function (it should be -inf).
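For example, a quick numeric check of this point (just a sketch):

```python
import torch

scores = torch.tensor([2.0, -1.0, 0.0])   # last entry: a padded key
print(torch.softmax(scores, dim=-1))       # the padded key still gets weight

masked = scores.clone()
masked[-1] = float('-inf')                 # what key_padding_mask would do
print(torch.softmax(masked, dim=-1))       # the padded key's weight is exactly 0
```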

Thanks again!

Yuchen

@pmixer
Owner

pmixer commented Apr 24, 2023

[image: attention]

Thanks. For the attention score, yes, -inf is required as you described. What I meant is that attending to an all-zero vector equals attending to nothing; I did not mean the attention score being zero in my former reply. When you use Q·K to get attention scores and then use those scores to take the weighted sum of V, a value vector V = 0 will not affect the final output, theoretically.
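A small sketch of that last point (illustrative numbers, not the repo's code): whatever attention weight a padded position receives, an all-zero value vector adds nothing to the weighted sum.

```python
import torch

weights = torch.tensor([0.5, 0.3, 0.2])        # attention weights, sum to 1
values = torch.stack([torch.randn(4),
                      torch.randn(4),
                      torch.zeros(4)])          # last row: padded position, V = 0
out = weights @ values                          # 0.2 * 0 contributes nothing
print(torch.allclose(out, weights[:2] @ values[:2]))  # True
```

(The remaining caveat, as discussed above, is that the padded position still absorbs some of the softmax weight, so the real tokens are slightly down-weighted.)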

BTW, some of the lectures by Prof. Lee may help further clarify these details about multi-head attention; please consider checking https://www.youtube.com/@HungyiLeeNTU/search?query=attention if you are interested.
