Fix saving FlaubertTokenizer configs (#14991)

All tokenizer-specific config properties must be passed to the base class (XLMTokenizer) in order to be saved. This was not the case for the do_lowercase option, so save_pretrained() did not save it, and saving and then reloading the tokenizer changed its behaviour. This commit fixes that by forwarding do_lowercase to the base class constructor.
huggingface · Jan 11, 2022 · 57b980a
1 parent 16f0b7d
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/src/transformers/models/flaubert/tokenization_flaubert.py b/src/transformers/models/flaubert/tokenization_flaubert.py
@@ -96,7 +96,7 @@ class FlaubertTokenizer(XLMTokenizer):
     max_model_input_sizes = PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
 
     def __init__(self, do_lowercase=False, **kwargs):
-        super().__init__(**kwargs)
+        super().__init__(do_lowercase=do_lowercase, **kwargs)
         self.do_lowercase = do_lowercase
         self.do_lowercase_and_remove_accent = False
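
The effect of the one-line change can be illustrated with a toy model. This is only a sketch, not the transformers implementation: `ToyBaseTokenizer`, `save_config`, `from_config`, `BeforeFix`, `AfterFix` and `round_trip` are hypothetical names, and the sketch assumes that the base class serializes only the keyword arguments it actually receives in its constructor.

```python
import json


class ToyBaseTokenizer:
    """Hypothetical stand-in for a base tokenizer; not the transformers API."""

    def __init__(self, **kwargs):
        # Assumption: only keyword arguments that reach the base class
        # are remembered for later serialization.
        self.init_kwargs = dict(kwargs)

    def save_config(self):
        # Serialize the remembered constructor arguments.
        return json.dumps(self.init_kwargs, sort_keys=True)

    @classmethod
    def from_config(cls, text):
        # Rebuild an instance from the serialized constructor arguments.
        return cls(**json.loads(text))


class BeforeFix(ToyBaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        # do_lowercase is consumed here and never reaches the base class.
        super().__init__(**kwargs)
        self.do_lowercase = do_lowercase


class AfterFix(ToyBaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        # do_lowercase is forwarded, so the base class records it.
        super().__init__(do_lowercase=do_lowercase, **kwargs)
        self.do_lowercase = do_lowercase


def round_trip(cls):
    """Save and reload an instance created with do_lowercase=True."""
    original = cls(do_lowercase=True)
    return cls.from_config(original.save_config()).do_lowercase


lost = round_trip(BeforeFix)  # the option falls back to its default
kept = round_trip(AfterFix)   # the option survives the round trip
```

In this model, the subclass that does not forward the argument silently reverts to the default after a save and reload, which mirrors the behaviour change described in the commit message.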