Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860)

Update src/transformers/tokenization_utils_base.py with review fix

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Craigacp and LysandreJik authored Dec 1, 2020
1 parent c0df963 commit 9c18f15
Showing 1 changed file with 10 additions and 1 deletion: src/transformers/tokenization_utils_base.py
```diff
@@ -776,7 +776,16 @@ def to(self, device: Union[str, "torch.device"]) -> "BatchEncoding":
             :class:`~transformers.BatchEncoding`: The same instance of :class:`~transformers.BatchEncoding` after
             modification.
         """
-        self.data = {k: v.to(device) for k, v in self.data.items()}
+
+        # This check catches things like APEX blindly calling "to" on all inputs to a module
+        # Otherwise it passes the casts down and casts the LongTensor containing the token idxs
+        # into a HalfTensor
+        if isinstance(device, str) or isinstance(device, torch.device):
+            self.data = {k: v.to(device=device) for k, v in self.data.items()}
+        else:
+            logger.warning(
+                f"Attempting to cast a BatchEncoding to another type, {str(device)}. This is not supported."
+            )
         return self
```


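The pattern in this patch — validate the `to()` argument before forwarding it, and warn rather than apply a dtype-style cast — can be sketched without torch. The following is a minimal, hypothetical stand-in (`BatchEncodingSketch` and its string-valued `data` are illustration only, not the real `transformers.BatchEncoding`, which checks `isinstance(device, (str, torch.device))` and calls `v.to(device=device)` on actual tensors):

```python
import logging

logging.basicConfig()
logger = logging.getLogger("batch_encoding_sketch")


class BatchEncodingSketch:
    """Hypothetical stand-in for BatchEncoding, using plain values instead
    of torch tensors so only the guard logic is exercised."""

    def __init__(self, data):
        self.data = data

    def to(self, device):
        # Forward the call only for genuine device targets (the real code
        # also accepts torch.device). Anything else -- e.g. a dtype such as
        # torch.float16, which APEX passes to every module input -- is
        # ignored with a warning, so the integer token ids are never cast.
        if isinstance(device, str):
            self.data = {k: (v, device) for k, v in self.data.items()}
        else:
            logger.warning(
                "Attempting to cast a BatchEncoding to another type, %s. "
                "This is not supported.", device
            )
        return self


enc = BatchEncodingSketch({"input_ids": "ids"})
enc.to("cuda:0")   # a device move: applied to every entry
enc.to(float)      # a dtype-style cast: warned about and ignored
print(enc.data)
```

Returning `self` in both branches keeps the method chainable, matching the patched method's contract of always returning the same `BatchEncoding` instance.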
