Masking: remove flakiness from test #31939
Merged
What does this PR do?
test_custom_4d_attention_mask
was trying to compare the argmax of two sets of logits. The logits should be the same in theory -- one is created through batched inference, the other through properly masked and stacked inputs.We know from experience that batching introduces tiny fluctuations -- see this comment explaining why.
As such, the argmax of these two tensors can have different values, resulting in a flaky test. This PR removes the flaky part of the test, the equality check of the argmax. We still check that the two tensors are very similar.
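As a minimal illustration (hypothetical values, not the actual test code), two logits vectors can pass an element-wise tolerance check yet disagree on argmax when the top two values are nearly tied -- exactly the failure mode removed here:

```python
# Two nearly identical logits vectors whose top two entries are almost tied.
logits_batched = [0.5000, 0.4999, 0.1000]
logits_stacked = [0.4999, 0.5000, 0.1000]

def argmax(xs):
    # Index of the maximum value.
    return max(range(len(xs)), key=lambda i: xs[i])

# Tolerance check passes: every element-wise difference is tiny.
assert all(abs(a - b) < 1e-3 for a, b in zip(logits_batched, logits_stacked))

# Argmax equality fails: the near-tie flips the winning index.
print(argmax(logits_batched), argmax(logits_stacked))  # 0 1
```

This is why keeping the closeness check while dropping the argmax comparison removes the flakiness without weakening the test meaningfully.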
Example of a failing test due to flakiness: https://app.circleci.com/pipelines/github/huggingface/transformers/97709/workflows/f407565a-dd9f-423f-9af2-86f0925cf109/jobs/1294534