Add a check regarding the number of occurrences of ``` #18389

Merged
merged 5 commits into from
Aug 1, 2022
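
The check named in the title sits outside the hunks shown here, so its actual implementation is not visible in this diff. As a minimal sketch, assuming the check flags any docstring in which the triple-backtick marker occurs an odd number of times, something like the following would catch every pattern fixed below. All names (`FENCE`, `count_fences`, `has_balanced_fences`, `find_malformed_inline`) are hypothetical and not taken from the repository.

```python
import re

# Build the marker programmatically so this example never contains a literal fence.
FENCE = "`" * 3

# Assumed pattern for the typo fixed throughout this diff: inline code opened
# with two backticks and closed with three.
MALFORMED_INLINE = re.compile(r"(?<!`)`{2}[^`]+`{3}(?!`)")


def count_fences(text: str) -> int:
    """Return how many times the triple-backtick marker occurs in text."""
    return text.count(FENCE)


def has_balanced_fences(text: str) -> bool:
    """An odd count means a code block was opened but never closed."""
    return count_fences(text) % 2 == 0


def find_malformed_inline(text: str) -> list[str]:
    """Return the inline-code spans that open with two backticks and close with three."""
    return MALFORMED_INLINE.findall(text)


if __name__ == "__main__":
    old_line = "``tf.Variable" + FENCE + " module of the model without doing anything."
    new_line = "`tf.Variable` module of the model without doing anything."
    print(has_balanced_fences(old_line), has_balanced_fences(new_line))
```

The parity test is the broader of the two: it also flags a stray marker at the end of a sentence, as in the DeBERTa hunks, which the regular expression alone would miss.
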
2 changes: 1 addition & 1 deletion src/transformers/modeling_tf_utils.py
@@ -1879,7 +1879,7 @@ def _get_resized_embeddings(self, old_embeddings, new_num_tokens=None) -> tf.Var

Increasing the size will add newly initialized vectors at the end. Reducing the size will remove
vectors from the end. If not provided or `None`, just returns a pointer to the input tokens
- ``tf.Variable``` module of the model without doing anything.
+ `tf.Variable` module of the model without doing anything.

Return:
`tf.Variable`: Pointer to the resized Embedding Module or the old Embedding Module if `new_num_tokens` is
8 changes: 4 additions & 4 deletions src/transformers/modeling_utils.py
@@ -1221,7 +1221,7 @@ def _get_resized_embeddings(

Increasing the size will add newly initialized vectors at the end. Reducing the size will remove
vectors from the end. If not provided or `None`, just returns a pointer to the input tokens
- ``torch.nn.Embedding``` module of the model without doing anything.
+ `torch.nn.Embedding` module of the model without doing anything.

Return:
`torch.nn.Embedding`: Pointer to the resized Embedding Module or the old Embedding Module if
@@ -1285,9 +1285,9 @@ def _get_resized_lm_head(

Increasing the size will add newly initialized vectors at the end. Reducing the size will remove
vectors from the end. If not provided or `None`, just returns a pointer to the input tokens
- ``torch.nn.Linear``` module of the model without doing anything. transposed (`bool`, *optional*,
- defaults to `False`): Whether `old_lm_head` is transposed or not. If True `old_lm_head.size()` is
- `lm_head_dim, vocab_size` else `vocab_size, lm_head_dim`.
+ `torch.nn.Linear` module of the model without doing anything. transposed (`bool`, *optional*, defaults
+ to `False`): Whether `old_lm_head` is transposed or not. If True `old_lm_head.size()` is `lm_head_dim,
+ vocab_size` else `vocab_size, lm_head_dim`.

Return:
`torch.nn.Linear`: Pointer to the resized Linear Module or the old Linear Module if `new_num_tokens` is
10 changes: 5 additions & 5 deletions src/transformers/models/bart/modeling_tf_bart.py
@@ -910,11 +910,11 @@ def call(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of
- shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
- `input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
- control over how to convert `input_ids` indices into associated vectors than the model's internal
- embedding lookup matrix.
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
+ `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
+ you can choose to directly pass an embedded representation. This is useful if you want more control
+ over how to convert `input_ids` indices into associated vectors than the model's internal embedding
+ lookup matrix.
output_attentions (`bool`, *optional*):
Whether or not to return the attentions tensors of all attention layers. See `attentions` under
returned tensors for more detail.
10 changes: 5 additions & 5 deletions src/transformers/models/blenderbot/modeling_tf_blenderbot.py
@@ -894,11 +894,11 @@ def call(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of
- shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
- `input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
- control over how to convert `input_ids` indices into associated vectors than the model's internal
- embedding lookup matrix.
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
+ `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
+ you can choose to directly pass an embedded representation. This is useful if you want more control
+ over how to convert `input_ids` indices into associated vectors than the model's internal embedding
+ lookup matrix.
output_attentions (`bool`, *optional*):
Whether or not to return the attentions tensors of all attention layers. See `attentions` under
returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
Original file line number Diff line number Diff line change
@@ -898,11 +898,11 @@ def call(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of
- shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
- `input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
- control over how to convert `input_ids` indices into associated vectors than the model's internal
- embedding lookup matrix.
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
+ `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
+ you can choose to directly pass an embedded representation. This is useful if you want more control
+ over how to convert `input_ids` indices into associated vectors than the model's internal embedding
+ lookup matrix.
output_attentions (`bool`, *optional*):
Whether or not to return the attentions tensors of all attention layers. See `attentions` under
returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
2 changes: 1 addition & 1 deletion src/transformers/models/deberta/modeling_deberta.py
@@ -825,7 +825,7 @@ def _set_gradient_checkpointing(self, module, value=False):

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass.
Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage
- and behavior.```
+ and behavior.


Parameters:
Original file line number Diff line number Diff line change
@@ -920,7 +920,7 @@ def _set_gradient_checkpointing(self, module, value=False):

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass.
Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage
- and behavior.```
+ and behavior.


Parameters:
2 changes: 1 addition & 1 deletion src/transformers/models/dpr/tokenization_dpr.py
@@ -297,7 +297,7 @@ def decode_best_spans(
spans in the same passage. It corresponds to the sum of the start and end logits of the span.
- **relevance_score**: `float` that corresponds to the score of the each passage to answer the question,
compared to all the other passages. It corresponds to the output of the QA classifier of the DPRReader.
- - **doc_id**: ``int``` the id of the passage. - **start_index**: `int` the start index of the span
+ - **doc_id**: `int` the id of the passage. - **start_index**: `int` the start index of the span
(inclusive). - **end_index**: `int` the end index of the span (inclusive).

Examples:
2 changes: 1 addition & 1 deletion src/transformers/models/dpr/tokenization_dpr_fast.py
@@ -297,7 +297,7 @@ def decode_best_spans(
spans in the same passage. It corresponds to the sum of the start and end logits of the span.
- **relevance_score**: `float` that corresponds to the score of the each passage to answer the question,
compared to all the other passages. It corresponds to the output of the QA classifier of the DPRReader.
- - **doc_id**: ``int``` the id of the passage. - ***start_index**: `int` the start index of the span
+ - **doc_id**: `int` the id of the passage. - ***start_index**: `int` the start index of the span
(inclusive). - **end_index**: `int` the end index of the span (inclusive).

Examples:
4 changes: 2 additions & 2 deletions src/transformers/models/led/modeling_led.py
@@ -2009,8 +2009,8 @@ def forward(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor`
- of shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
+ shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
control over how to convert `input_ids` indices into associated vectors than the model's internal
embedding lookup matrix.
2 changes: 1 addition & 1 deletion src/transformers/models/led/modeling_tf_led.py
@@ -1991,7 +1991,7 @@ def call(
Contains precomputed key and value hidden-states of the attention blocks. Can be used to speed up
decoding. If `past_key_values` are used, the user can optionally input only the last
`decoder_input_ids` (those that don't have their past key value states given to this model) of shape
- `(batch_size, 1)` instead of all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`.
+ `(batch_size, 1)` instead of all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
This is useful if you want more control over how to convert `input_ids` indices into associated vectors
13 changes: 6 additions & 7 deletions src/transformers/models/m2m_100/modeling_m2m_100.py
@@ -646,11 +646,10 @@ def _set_gradient_checkpointing(self, module, value=False):

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
- ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
- shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
- you can choose to directly pass an embedded representation. This is useful if you want more control over
- how to convert `input_ids` indices into associated vectors than the model's internal embedding lookup
- matrix.
+ `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
+ `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
+ can choose to directly pass an embedded representation. This is useful if you want more control over how to
+ convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
@@ -952,8 +951,8 @@ def forward(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor`
- of shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
+ shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
control over how to convert `input_ids` indices into associated vectors than the model's internal
embedding lookup matrix.
10 changes: 5 additions & 5 deletions src/transformers/models/marian/modeling_tf_marian.py
@@ -937,11 +937,11 @@ def call(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of
- shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
- `input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
- control over how to convert `input_ids` indices into associated vectors than the model's internal
- embedding lookup matrix.
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
+ `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
+ you can choose to directly pass an embedded representation. This is useful if you want more control
+ over how to convert `input_ids` indices into associated vectors than the model's internal embedding
+ lookup matrix.
output_attentions (`bool`, *optional*):
Whether or not to return the attentions tensors of all attention layers. See `attentions` under
returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
10 changes: 5 additions & 5 deletions src/transformers/models/mbart/modeling_tf_mbart.py
@@ -927,11 +927,11 @@ def call(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of
- shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
- `input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
- control over how to convert `input_ids` indices into associated vectors than the model's internal
- embedding lookup matrix.
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
+ `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
+ you can choose to directly pass an embedded representation. This is useful if you want more control
+ over how to convert `input_ids` indices into associated vectors than the model's internal embedding
+ lookup matrix.
output_attentions (`bool`, *optional*):
Whether or not to return the attentions tensors of all attention layers. See `attentions` under
returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
4 changes: 2 additions & 2 deletions src/transformers/models/mbart/tokenization_mbart.py
@@ -57,8 +57,8 @@ class MBartTokenizer(PreTrainedTokenizer):
Adapted from [`RobertaTokenizer`] and [`XLNetTokenizer`]. Based on
[SentencePiece](https://github.com/google/sentencepiece).

- The tokenization method is `<tokens> <eos> <language code>` for source language documents, and ``<language code>
- <tokens> <eos>``` for target language documents.
+ The tokenization method is `<tokens> <eos> <language code>` for source language documents, and `<language code>
+ <tokens> <eos>` for target language documents.

Examples:

4 changes: 2 additions & 2 deletions src/transformers/models/mbart/tokenization_mbart_fast.py
@@ -68,8 +68,8 @@ class MBartTokenizerFast(PreTrainedTokenizerFast):
This tokenizer inherits from [`PreTrainedTokenizerFast`] which contains most of the main methods. Users should
refer to this superclass for more information regarding those methods.

- The tokenization method is `<tokens> <eos> <language code>` for source language documents, and ``<language code>
- <tokens> <eos>``` for target language documents.
+ The tokenization method is `<tokens> <eos> <language code>` for source language documents, and `<language code>
+ <tokens> <eos>` for target language documents.

Examples:

2 changes: 1 addition & 1 deletion src/transformers/models/opt/modeling_tf_opt.py
@@ -598,7 +598,7 @@ def call(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`.
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
inputs_embeds (`tf.Tensor` of
shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
10 changes: 5 additions & 5 deletions src/transformers/models/pegasus/modeling_tf_pegasus.py
@@ -943,11 +943,11 @@ def call(

If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
- all ``decoder_input_ids``` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of
- shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
- `input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
- control over how to convert `input_ids` indices into associated vectors than the model's internal
- embedding lookup matrix.
+ all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
+ `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
+ you can choose to directly pass an embedded representation. This is useful if you want more control
+ over how to convert `input_ids` indices into associated vectors than the model's internal embedding
+ lookup matrix.
output_attentions (`bool`, *optional*):
Whether or not to return the attentions tensors of all attention layers. See `attentions` under
returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value