Fix saving FlaubertTokenizer configs (#14991)
Every tokenizer-specific config property must be passed to the base
class (XLMTokenizer) in order to be saved. This was not the case for
do_lowercase, so save_pretrained() did not save it, and saving and
reloading the tokenizer changed its behaviour.

This commit fixes it.
vmaryasin authored Jan 11, 2022
1 parent 16f0b7d commit 57b980a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/transformers/models/flaubert/tokenization_flaubert.py
@@ -96,7 +96,7 @@ class FlaubertTokenizer(XLMTokenizer):
     max_model_input_sizes = PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES

     def __init__(self, do_lowercase=False, **kwargs):
-        super().__init__(**kwargs)
+        super().__init__(do_lowercase=do_lowercase, **kwargs)
         self.do_lowercase = do_lowercase
         self.do_lowercase_and_remove_accent = False
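
The effect of this one-line change can be illustrated with a minimal, self-contained sketch. It does not use the transformers library: BaseTokenizer, BrokenTokenizer, FixedTokenizer, save_config and load_config are hypothetical stand-ins, and the sketch only assumes that the base class saves whatever keyword arguments reach its __init__.

```python
import json


class BaseTokenizer:
    """Toy stand-in for a tokenizer base class (hypothetical, not the transformers API)."""

    def __init__(self, **kwargs):
        # Assumption: only keyword arguments that reach the base class are
        # remembered and later written out when the config is saved.
        self.init_kwargs = dict(kwargs)

    def save_config(self):
        # Serialize the remembered init arguments.
        return json.dumps(self.init_kwargs, sort_keys=True)

    @classmethod
    def load_config(cls, payload):
        # Rebuild an instance from a previously saved config.
        return cls(**json.loads(payload))


class BrokenTokenizer(BaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        # do_lowercase is consumed here and never reaches the base class,
        # so it is missing from the saved config.
        super().__init__(**kwargs)
        self.do_lowercase = do_lowercase


class FixedTokenizer(BaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        # Forwarding do_lowercase lets the base class record and save it.
        super().__init__(do_lowercase=do_lowercase, **kwargs)
        self.do_lowercase = do_lowercase


# Round trip: create with do_lowercase=True, save, then reload.
broken = BrokenTokenizer.load_config(BrokenTokenizer(do_lowercase=True).save_config())
fixed = FixedTokenizer.load_config(FixedTokenizer(do_lowercase=True).save_config())

print(broken.do_lowercase)  # falls back to the default after reloading
print(fixed.do_lowercase)   # survives the round trip
```

In the broken variant the reloaded object silently reverts to the default value, which is the behaviour change the commit message describes.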
