fix: create a copy for tokenizer object (huggingface#18408)
YBooks authored Aug 1, 2022
1 parent 24845ae commit df5e423
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/transformers/tokenization_utils_fast.py
@@ -16,6 +16,7 @@
 Tokenization classes for fast tokenizers (provided by HuggingFace's tokenizers library). For slow (python) tokenizers
 see tokenization_utils.py
 """
+import copy
 import json
 import os
 from collections import defaultdict
@@ -104,7 +105,7 @@ def __init__(self, *args, **kwargs):
         )

         if tokenizer_object is not None:
-            fast_tokenizer = tokenizer_object
+            fast_tokenizer = copy.deepcopy(tokenizer_object)
         elif fast_tokenizer_file is not None and not from_slow:
             # We have a serialization from tokenizers which let us directly build the backend
             fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
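The motivation for the fix can be sketched without any transformers dependency: assigning the caller's tokenizer object directly aliases it, so any in-place modification made through the wrapper leaks back into the caller's object. A `copy.deepcopy` isolates the two. The `TinyTokenizer` class below is a hypothetical stand-in for a `tokenizers.Tokenizer` object, not the real API.

```python
import copy


class TinyTokenizer:
    """Hypothetical stand-in for a tokenizers.Tokenizer object."""

    def __init__(self):
        self.vocab = {"[PAD]": 0, "[UNK]": 1}

    def add_token(self, token):
        self.vocab[token] = len(self.vocab)


# Before the fix: plain assignment aliases the caller's object,
# so mutations through the wrapper reach the caller too.
user_tok = TinyTokenizer()
aliased = user_tok
aliased.add_token("hello")
assert "hello" in user_tok.vocab  # caller's tokenizer was mutated

# After the fix: a deep copy keeps the wrapper's changes private.
user_tok2 = TinyTokenizer()
copied = copy.deepcopy(user_tok2)
copied.add_token("world")
assert "world" not in user_tok2.vocab  # caller's tokenizer untouched
assert "world" in copied.vocab
```

This is the same trade-off the commit makes: a deep copy costs some memory, but it guarantees that constructing a fast tokenizer from a user-supplied `tokenizer_object` never mutates the object the user still holds.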
