Hi,
**DataCollatorForWholeWordMask** does not output `attention_mask`. According to the `__call__` method, it only returns:
`return {"input_ids": inputs, "labels": labels}`
Is there a particular motivation behind this, or is it a small bug? From what I can see, during pre-training most instances will not have the same length, and attending over all tokens (including the padding) may produce imprecise results.
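For now I am working around it with something like the sketch below: a subclass that derives `attention_mask` from `input_ids` after the parent collator has built the batch. The subclass name is my own, and I am assuming the batch comes back as PyTorch tensors and that the tokenizer defines `pad_token_id`.

```python
from transformers import BertTokenizerFast, DataCollatorForWholeWordMask

# Hypothetical workaround (not part of transformers): add an attention_mask
# derived from the padded input_ids returned by the parent collator.
class WholeWordMaskCollatorWithAttentionMask(DataCollatorForWholeWordMask):
    def __call__(self, examples):
        batch = super().__call__(examples)  # {"input_ids": ..., "labels": ...}
        # Mark non-pad positions with 1 and pad positions with 0 so the model
        # can ignore padding during attention. Assumes torch tensors and a
        # tokenizer with a defined pad_token_id.
        batch["attention_mask"] = (
            batch["input_ids"] != self.tokenizer.pad_token_id
        ).long()
        return batch

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
collator = WholeWordMaskCollatorWithAttentionMask(
    tokenizer=tokenizer, mlm_probability=0.15
)
```

Is this the intended way to handle it, or should the collator return `attention_mask` itself?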