Need to reset the signal handler for ALARM after we call resolve_trust_remote_code #29690

Closed
2 of 4 tasks
coldnight opened this issue Mar 16, 2024 · 3 comments · Fixed by #29706

@coldnight
Contributor

System Info

transformers==4.38.1
python==3.9

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Hi, we hit an interesting error while using version 4.38.1. Here is part of the traceback:

  File "/root/online_third_party/env/venv.2062-helm/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/root/online_third_party/env/venv.2062-helm/lib/python3.9/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/root/online_third_party/env/venv.2062-helm/lib/python3.9/site-packages/urllib3/connectionpool.py", line 462, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.9/http/client.py", line 1349, in getresponse
    response.begin()
  File "/usr/lib/python3.9/http/client.py", line 316, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.9/http/client.py", line 277, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/root/online_third_party/env/venv.2062-helm/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 580, in _raise_timeout_error
    raise ValueError(
ValueError: Loading this model requires you to execute custom code contained in the model repository on your local machine. Please set the option `trust_remote_code=True` to permit loading of this model.

I checked the code, which was introduced in this commit. It registers a signal handler for ALARM (SIGALRM) that raises an exception when the timeout is reached (see https://github.com/huggingface/transformers/blob/v4.38.1/src/transformers/dynamic_module_utils.py#L595-L596). The handler is not reset to the default after the function returns.
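To illustrate the mechanism without transformers, here is a minimal stdlib-only sketch. The names `leaky_prompt` and `demo` are hypothetical and are not part of the library; the raised EOFError is a stand-in for input() failing on a closed STDIN. It assumes a POSIX system and that it runs in the main thread.

```python
import signal
import time


def _raise_timeout_error(signum, frame):
    # Handler that turns SIGALRM into an exception.
    raise ValueError("timed out waiting for an answer")


def leaky_prompt(timeout_s):
    # Hypothetical stand-in for a function that arms an alarm and
    # then fails before it gets a chance to disarm it.
    signal.signal(signal.SIGALRM, _raise_timeout_error)
    signal.alarm(timeout_s)
    # Stand-in for input() failing; the alarm is still pending here.
    raise EOFError("EOF when reading a line")


def demo():
    try:
        leaky_prompt(1)
    except EOFError:
        pass  # the caller handles the error and carries on
    try:
        # Unrelated work: the pending alarm fires in the middle of it.
        time.sleep(3)
        return "sleep finished"
    except ValueError as exc:
        return f"unrelated code interrupted: {exc}"
    finally:
        # Clean up so the demo itself does not leak the alarm or handler.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, signal.SIG_DFL)


if __name__ == "__main__":
    print(demo())
```

The sleep stands in for any later blocking call, such as the socket read in the traceback above.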

Expected behavior

The signal handler shouldn't affect other parts of the system.

@coldnight
Contributor Author

The alarm should be cleared in a finally block, because of this exception:

  File "/tmp/venv-helm/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 598, in resolve_trust_remote_code
    answer = input(
EOFError: EOF when reading a line

During handling of the above exception, another exception occurred:

The alarm is never cleared, so it fires at some later point and produces a very confusing error.
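A minimal sketch of the finally-based cleanup suggested here, not the actual patch. The helper name `call_with_alarm` is hypothetical; it assumes a POSIX system and a call from the main thread, since signal.signal only works there.

```python
import signal


def _raise_timeout(signum, frame):
    # Handler that turns SIGALRM into an exception.
    raise TimeoutError("no answer before the timeout")


def call_with_alarm(func, timeout_s):
    # Hypothetical helper: run func() under an alarm, and always
    # disarm the alarm and restore the previous handler afterwards.
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(timeout_s)
    try:
        return func()
    finally:
        # Runs on success, on timeout, and on any other exception
        # such as EOFError from input().
        signal.alarm(0)
        if previous is not None:
            # previous is None when the old handler was not installed
            # from Python; in that case there is nothing to restore.
            signal.signal(signal.SIGALRM, previous)
```

With this shape, an EOFError raised inside `func` still propagates to the caller, but no alarm stays pending and the previous handler is back in place.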

coldnight added a commit to coldnight/transformers that referenced this issue Mar 18, 2024
@ArthurZucker
Collaborator

ArthurZucker commented Mar 27, 2024

Will have a look, thanks for reporting. Could you share the reproducer as well? 🤗

@coldnight
Contributor Author

coldnight commented Mar 28, 2024

This happened after I loaded a tokenizer without trust_remote_code (although the tokenizer needs it). The code fails with an exception if STDIN has been closed, but if we handle that exception and let the program continue for a while, the problem reproduces. I think the code below reproduces it:

import time

from transformers import AutoTokenizer
from transformers.dynamic_module_utils import TIME_OUT_REMOTE_CODE


hf_tokenizer_name = 'THUDM/chatglm2-6b'
try:
    AutoTokenizer.from_pretrained(hf_tokenizer_name, use_fast=True)
except ValueError:
    print("STDIN has closed")

print("Now the program continues")
time.sleep(TIME_OUT_REMOTE_CODE + 5)

Save this script to a file, test.py, then run it with STDIN closed:

python test.py 0<&-

The output:

STDIN has closed
Now the program continues
Traceback (most recent call last):
  File "/Users/wh/codes/flageval/helm/test.py", line 14, in <module>
    time.sleep(TIME_OUT_REMOTE_CODE + 5)
  File "/usr/local/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 580, in _raise_timeout_error
    raise ValueError(
ValueError: Loading this model requires you to execute custom code contained in the model repository on your local machine. Please set the option `trust_remote_code=True` to permit loading of this model.
