System Info
The problem is present from transformers==4.37.2 and also on the latest release, 4.46.1.
I initially thought it was related to #31233, but that PR did not solve it; #33453 also seems related.
Who can help?
@ArthurZucker @itazap
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
from transformers import AutoTokenizer

t = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_special_tokens=True, max_length=512, add_prefix_space=True)
t.save_pretrained("~/Downloads/")
Expected behavior
That it does not throw TypeError: Object of type method is not JSON serializable.
Full stack trace:
Traceback (most recent call last):
File "/Users/lfoppiano/Applications/PyCharm Professional Edition.app/Contents/plugins/python-ce/helpers/pydev/pydevconsole.py", line 364, in runcode
coro = func()
File "<input>", line 1, in <module>
File "/Users/lfoppiano/anaconda3/envs/delft2/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2431, in save_pretrained
idx = serialized_tokens.pop("id")
File "/Users/lfoppiano/anaconda3/envs/delft2/lib/python3.8/json/__init__.py", line 234, in dumps
return cls(
File "/Users/lfoppiano/anaconda3/envs/delft2/lib/python3.8/json/encoder.py", line 201, in encode
chunks = list(chunks)
File "/Users/lfoppiano/anaconda3/envs/delft2/lib/python3.8/json/encoder.py", line 431, in _iterencode
yield from _iterencode_dict(o, _current_indent_level)
File "/Users/lfoppiano/anaconda3/envs/delft2/lib/python3.8/json/encoder.py", line 405, in _iterencode_dict
yield from chunks
File "/Users/lfoppiano/anaconda3/envs/delft2/lib/python3.8/json/encoder.py", line 438, in _iterencode
o = _default(o)
File "/Users/lfoppiano/anaconda3/envs/delft2/lib/python3.8/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type method is not JSON serializable
Hi @lfoppiano, I believe the problem is that setting add_special_tokens at init time is not supported because it clashes with the add_special_tokens method. When I run this code on main, I get:
AttributeError: add_special_tokens conflicts with the method add_special_tokens in RobertaTokenizerFast
This check was introduced in #31233 as you mentioned, so I'm not sure why you didn't get that error. Can you try without add_special_tokens in the init?
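For illustration, here is a minimal self-contained sketch of the failure mode — this is not the actual transformers internals, and the Tok class is hypothetical. It shows how a kwarg that shares its name with a method can leave a bound method in the config dict, which makes json.dumps fail with exactly this error:

import json

class Tok:
    """Hypothetical stand-in for a tokenizer class."""
    def add_special_tokens(self, tokens):
        # Method that shares its name with the init kwarg.
        pass

t = Tok()
# If the config value is looked up via attribute access, the bound
# method shadows the stored kwarg value:
tokenizer_config = {"add_special_tokens": getattr(t, "add_special_tokens", True)}
json.dumps(tokenizer_config)
# TypeError: Object of type method is not JSON serializable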
Hi @Rocketknight1, thanks for your quick answer. If I don't specify the parameter, it works; however, what should I do to make sure the tokenization is the same as before? In previous versions, add_special_tokens was passed as a flag at init and it worked fine.
Hi @lfoppiano, you can pass add_special_tokens when calling the tokenizer instead. However, the default value is True, so you only need to do that when you want to set add_special_tokens=False!
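To make that concrete, here is a minimal sketch of the suggested workaround (the save path ./roberta-tokenizer is just an example):

from transformers import AutoTokenizer

# Leave add_special_tokens out of the init kwargs:
t = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", max_length=512, add_prefix_space=True)
t.save_pretrained("./roberta-tokenizer")  # no TypeError now

# add_special_tokens defaults to True at call time, so tokenization is unchanged:
assert t("Hello world")["input_ids"] == t("Hello world", add_special_tokens=True)["input_ids"]

# Pass it per call only when you want to disable it:
raw_ids = t("Hello world", add_special_tokens=False)["input_ids"]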