Description
Since Transformers 4.52, the AutoTokenizer.from_pretrained loading mechanism has changed and the token argument is no longer propagated correctly.
System Info
(Colab)
- transformers version: 4.52.4
- Platform: Linux-6.1.123+-x86_64-with-glibc2.35
- Python version: 3.11.13
- Huggingface_hub version: 0.33.0
- Safetensors version: 0.5.3
- Accelerate version: 1.7.0
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (GPU?): 2.6.0+cu124 (False)
- Tensorflow version (GPU?): 2.18.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.10.6 (cpu)
- Jax version: 0.5.2
- JaxLib version: 0.5.1
- Using distributed or parallel set-up in script?:
Who can help?
@Rocketknight1 @ArthurZucker @Wauplin
I have the impression that this is related to #36588
Reproduction
In this code example, I am trying to load a private tokenizer using the token parameter (not the env var).
import os
from transformers import AutoTokenizer
# we first make sure that the token is not present in environment variables
# if the env var is present, THE BUG DOES NOT OCCUR
os.environ.pop('HF_TOKEN', None)
model = "deepset/bert-base-NER" # a valid private model I can access
token = "..."
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model, token=token)
Error
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_auth.py:86: UserWarning: Access to the secret `HF_TOKEN` has not been granted on this notebook. You will not be requested again. Please restart the session if you want to be prompted again.
warnings.warn(
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
408 try:
--> 409 response.raise_for_status()
410 except HTTPError as e:
8 frames
/usr/local/lib/python3.11/dist-packages/requests/models.py in raise_for_status(self)
1023 if http_error_msg:
-> 1024 raise HTTPError(http_error_msg, response=self)
1025
HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/deepset/bert-base-NER/tree/main/additional_chat_templates?recursive=False&expand=False
The above exception was the direct cause of the following exception:
RepositoryNotFoundError Traceback (most recent call last)
/tmp/ipython-input-6-4075835132.py in <cell line: 0>()
6
----> 7 tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model, token=token)
/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
1030
1031 if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
-> 1032 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
1033 else:
1034 if tokenizer_class_py is not None:
/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
1966 )
1967 else:
-> 1968 for template in list_repo_templates(
1969 pretrained_model_name_or_path,
1970 local_files_only=local_files_only,
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in list_repo_templates(repo_id, local_files_only, revision, cache_dir)
159 if not local_files_only:
160 try:
--> 161 return [
162 entry.path.removeprefix(f"{CHAT_TEMPLATE_DIR}/")
163 for entry in list_repo_tree(
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in <listcomp>(.0)
159 if not local_files_only:
160 try:
--> 161 return [
162 entry.path.removeprefix(f"{CHAT_TEMPLATE_DIR}/")
163 for entry in list_repo_tree(
/usr/local/lib/python3.11/dist-packages/huggingface_hub/hf_api.py in list_repo_tree(self, repo_id, path_in_repo, recursive, expand, revision, repo_type, token)
3166 encoded_path_in_repo = "/" + quote(path_in_repo, safe="") if path_in_repo else ""
3167 tree_url = f"{self.endpoint}/api/{repo_type}s/{repo_id}/tree/{revision}{encoded_path_in_repo}"
-> 3168 for path_info in paginate(path=tree_url, headers=headers, params={"recursive": recursive, "expand": expand}):
3169 yield (RepoFile(**path_info) if path_info["type"] == "file" else RepoFolder(**path_info))
3170
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_pagination.py in paginate(path, params, headers)
35 session = get_session()
36 r = session.get(path, params=params, headers=headers)
---> 37 hf_raise_for_status(r)
38 yield from r.json()
39
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
457 " https://huggingface.co/docs/huggingface_hub/authentication"
458 )
--> 459 raise _format(RepositoryNotFoundError, message, response) from e
460
461 elif response.status_code == 400:
RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-685bc7f1-448cc03d560cc5bc2bc95865;1708f136-0e94-4e7d-a6b4-4e38a9c50920)
Repository Not Found for url: https://huggingface.co/api/models/deepset/bert-base-NER/tree/main/additional_chat_templates?recursive=False&expand=False.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
Invalid username or password.
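One detail in the traceback seems worth pointing out: list_repo_templates is shown with the signature (repo_id, local_files_only, revision, cache_dir), i.e. without a token parameter, which would be consistent with the token being dropped before the call to list_repo_tree. The snippet below is only a sketch of how one could check this on an installed version. The helper accepts_parameter and the stub function are hypothetical names I made up for illustration; the stub just mirrors the signature printed in the traceback and is not the real function.

```python
import inspect


def accepts_parameter(func, name):
    """Return True if `func` declares a parameter `name` or accepts **kwargs."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in that copies the signature shown in the traceback.
# It is NOT the real transformers function.
def list_repo_templates_stub(repo_id, local_files_only, revision, cache_dir):
    return []


print(accepts_parameter(list_repo_templates_stub, "token"))  # False
print(accepts_parameter(list_repo_templates_stub, "revision"))  # True
```

On a real install, the same check can be pointed at transformers.utils.hub.list_repo_templates (the path shown in the traceback) to see whether a given version accepts a token.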
Expected behavior
The tokenizer loads without errors.
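Possible workaround (a sketch, not a fix): since the bug does not occur when HF_TOKEN is present in the environment, the token can be exposed as an environment variable only for the duration of the load. The helper names temporary_hf_token and load_private_tokenizer are hypothetical, and this assumes huggingface_hub reads HF_TOKEN at call time rather than once at import; I have not verified that on every version.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_hf_token(token):
    """Hypothetical helper: expose `token` as HF_TOKEN inside the with-block."""
    previous = os.environ.get("HF_TOKEN")
    os.environ["HF_TOKEN"] = token
    try:
        yield
    finally:
        # Restore the previous state so the token does not leak into later code.
        if previous is None:
            os.environ.pop("HF_TOKEN", None)
        else:
            os.environ["HF_TOKEN"] = previous


def load_private_tokenizer(model, token):
    """Hypothetical wrapper around the call from the reproduction above."""
    # Imported lazily so temporary_hf_token works without transformers installed.
    from transformers import AutoTokenizer

    with temporary_hf_token(token):
        # token= is still passed explicitly for the code paths that honor it.
        return AutoTokenizer.from_pretrained(model, token=token)
```

Usage would be load_private_tokenizer(model, token) with the same model and token values as in the reproduction. The environment is restored afterwards, so the rest of the session behaves as if HF_TOKEN had never been set.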