
Fix dimention misspellings. #11238

Merged · 2 commits · Apr 14, 2021
8 changes: 4 additions & 4 deletions src/transformers/models/gpt_neo/modeling_gpt_neo.py
@@ -155,8 +155,8 @@ def _get_block_length_and_num_blocks(seq_length, window_size):
def _look_back(tensor, block_length, window_size, pad_value=0, is_key_value=True):
"""
Used to implement attention between consecutive blocks. This method assumes that dim 1 of :obj:`tensor`
- represents the :obj:`seq_length` dimention. It splits :obj:`seq_length` dimention into :obj:`num_blocks` and
- :obj:`window_size` + :obj:`block_length`. It pads the :obj:`seq_length` dimention if necessary.
+ represents the :obj:`seq_length` dimension. It splits :obj:`seq_length` dimension into :obj:`num_blocks` and
+ :obj:`window_size` + :obj:`block_length`. It pads the :obj:`seq_length` dimension if necessary.

Example::
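A hedged sketch of the split described in this docstring, for orientation only: the 2-D (batch, seq_length) mask shape, the padding scheme, and the helper name `look_back_sketch` are assumptions for illustration, not the library's actual `_look_back` implementation.

```python
import torch
import torch.nn.functional as F

def look_back_sketch(mask, block_length, window_size, pad_value=0):
    # Hypothetical illustration for a 2-D (batch_size, seq_length) tensor:
    # pad seq_length up to a multiple of block_length, prepend window_size
    # positions of padding for the look-back, then unfold dim 1 into
    # num_blocks chunks of size window_size + block_length.
    _, seq_length = mask.shape
    pad_len = (-seq_length) % block_length
    padded = F.pad(mask, (window_size, pad_len), value=pad_value)
    return padded.unfold(1, window_size + block_length, block_length)

mask = torch.ones(1, 10)
print(look_back_sketch(mask, block_length=5, window_size=3).shape)  # torch.Size([1, 2, 8])
```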

@@ -373,7 +373,7 @@ def _create_attention_mask(self, batch_size, seq_length, num_blocks, block_lengt
# look back into the attention_block such that it will also get padded the same way
# and have 0s in the padded position
attention_mask = self._look_back(attention_mask, block_length, self.window_size, is_key_value=False)
- attention_mask = attention_mask.unsqueeze(-2) # Add an extra dimention to account for hidden_dim
+ attention_mask = attention_mask.unsqueeze(-2) # Add an extra dimension to account for hidden_dim

# Multiply the causal_mask with attention_mask so the padded positions (by _look_back operation)
# will contain 0s.
@@ -387,7 +387,7 @@ def _create_attention_mask(self, batch_size, seq_length, num_blocks, block_lengt
visible = torch.gt(relative_position, -self.window_size)

causal_mask = causal_mask * visible
- causal_mask = causal_mask.unsqueeze(-3).bool() # Add an extra dimention to account for num_heads
+ causal_mask = causal_mask.unsqueeze(-3).bool() # Add an extra dimension to account for num_heads

return causal_mask
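For readers skimming the hunk above, `torch.gt(relative_position, -self.window_size)` is what keeps attention local; combined with the causal constraint it produces a banded mask. A minimal sketch under assumed `block_length`/`window_size` values, using a hypothetical `local_causal_mask_sketch` helper rather than the model's own code:

```python
import torch

def local_causal_mask_sketch(block_length, window_size):
    # Hypothetical illustration: each query position may attend to keys that
    # are not in the future (causal) and at most window_size positions back.
    query_pos = torch.arange(block_length).unsqueeze(-1)             # (block_length, 1)
    key_pos = torch.arange(-window_size, block_length).unsqueeze(0)  # (1, window_size + block_length)
    relative_position = key_pos - query_pos
    causal = relative_position <= 0
    visible = torch.gt(relative_position, -window_size)
    return causal & visible

print(local_causal_mask_sketch(block_length=4, window_size=3).int())
```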

src/transformers/models/speech_to_text/configuration_speech_to_text.py
@@ -90,7 +90,7 @@ class Speech2TextConfig(PretrainedConfig):
An integer defining the number of output channels of each convolution layers except the final one in the
conv module.
input_feat_per_channel (:obj:`int`, `optional`, defaults to 80):
- An integer specifying the size of feature vector. This is also the dimentions of log-mel filter-bank
+ An integer specifying the size of feature vector. This is also the dimensions of log-mel filter-bank
features.
input_channels (:obj:`int`, `optional`, defaults to 1):
An integer specifying number of input channels of the input feature vector.
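The default `input_feat_per_channel=80` corresponds to 80-dimensional log-mel filter-bank features per frame. As a hedged illustration of such features (the model's own feature extractor may differ in parameters and normalization), they could be computed with `torchaudio`'s Kaldi-compatible `fbank`:

```python
import torch
import torchaudio.compliance.kaldi as kaldi

# Fake 1-second, 16 kHz waveform; a real pipeline would load audio instead.
waveform = torch.randn(1, 16000)

# 80 log-mel filter-bank features per frame, matching input_feat_per_channel=80.
features = kaldi.fbank(waveform, num_mel_bins=80, sample_frequency=16000.0)
print(features.shape)  # (num_frames, 80)
```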