
Commit 23262ab
Bug fix for permutation language modelling (huggingface#8409)
shngt authored and stas00 committed Nov 10, 2020
1 parent e7a23e8 commit 23262ab
Showing 1 changed file with 1 addition and 1 deletion.
src/transformers/data/data_collator.py
@@ -579,7 +579,7 @@ def mask_tokens(self, inputs: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor,
         masked_indices.masked_fill_(padding_mask, value=0.0)

         # Mask indicating non-functional tokens, where functional tokens are [SEP], [CLS], padding, etc.
-        non_func_mask = ~(padding_mask & special_tokens_mask)
+        non_func_mask = ~(padding_mask | special_tokens_mask)

         inputs[masked_indices] = self.tokenizer.mask_token_id
         labels[~masked_indices] = -100  # We only compute loss on masked tokens
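Why `|` rather than `&`: per the code comment above, a token is functional if it is padding or a special token ([SEP], [CLS], etc.), so the non-functional mask must negate the union of the two boolean masks. By De Morgan's laws, ~(padding_mask | special_tokens_mask) equals ~padding_mask & ~special_tokens_mask, i.e. neither padding nor special. The old ~(padding_mask & special_tokens_mask) was true at nearly every position, since a token is essentially never both padding and special at once. A minimal standalone sketch of the difference, using hypothetical mask tensors rather than the collator's real inputs:

    import torch

    # Hypothetical 6-token sequence: position 0 is [CLS], positions 4-5 are padding.
    padding_mask        = torch.tensor([False, False, False, False, True,  True])
    special_tokens_mask = torch.tensor([True,  False, False, False, False, False])

    # Buggy version: true wherever a token is NOT both padding and special,
    # so [CLS] and the padding positions are wrongly treated as non-functional.
    buggy = ~(padding_mask & special_tokens_mask)

    # Fixed version: true only where a token is neither padding nor special.
    fixed = ~(padding_mask | special_tokens_mask)

    print(buggy)  # tensor([True, True, True, True, True, True])
    print(fixed)  # tensor([False, True, True, True, False, False])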
