
get_language_model_class does not work when huggingface.co is offline, even if model is locally cached #2118

Closed
yunyu opened this issue Feb 3, 2022 · 5 comments
Labels
topic:dependencies type:bug Something isn't working

Comments

yunyu commented Feb 3, 2022

Describe the bug
Huggingface.co is currently offline. I am running Haystack with the models cached in ~/.cache/huggingface, but my program still fails.

Error message

  File "/home/yunyu/haystack-test-api/src/serve-api.py", line 24, in <module>
    retriever = EmbeddingRetriever(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/nodes/retriever/dense.py", line 1040, in __init__
    self.embedding_encoder = _EMBEDDING_ENCODERS[model_format](self)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/nodes/retriever/_embedding_encoder.py", line 54, in __init__
    self.embedding_model = Inferencer.load(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/infer.py", line 189, in load
    model = AdaptiveModel.convert_from_transformers(model_name_or_path,
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/model/adaptive_model.py", line 510, in convert_from_transformers
    return conv.Converter.convert_from_transformers(model_name_or_path,
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/conversion/transformers.py", line 70, in convert_from_transformers
    lm = LanguageModel.load(model_name_or_path, revision=revision,use_auth_token=use_auth_token, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/model/language_model.py", line 161, in load
    language_model_class = cls.get_language_model_class(pretrained_model_name_or_path, use_auth_token=use_auth_token, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/model/language_model.py", line 197, in get_language_model_class
    config = AutoConfig.from_pretrained(model_name_or_path, use_auth_token=use_auth_token, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 580, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/configuration_utils.py", line 550, in get_config_dict
    configuration_file = get_configuration_file(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/configuration_utils.py", line 841, in get_configuration_file
    all_files = get_list_of_files(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/file_utils.py", line 1952, in get_list_of_files
    return list_repo_files(path_or_repo, revision=revision, token=token)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 884, in list_repo_files
    info = self.model_info(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 868, in model_info
    r.raise_for_status()
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Temporarily Unavailable for url: https://huggingface.co/api/models/sentence-transformers/all-MiniLM-L6-v2

Expected behavior
If huggingface.co is down and the models are locally cached, the locally cached versions of the models should be used.

Additional context
https://status.huggingface.co/

To Reproduce
Run

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    use_gpu=True,
)

System:

  • OS: Debian 10
  • GPU/CPU: NVIDIA P40
  • Haystack version (commit or version number): 1.1.0
  • DocumentStore: ElasticsearchDocumentStore
  • Reader:
  • Retriever: EmbeddingRetriever
@mathislucka mathislucka added topic:dependencies type:bug Something isn't working labels Feb 3, 2022
@mathislucka (Member)

Hi @yunyu !

I agree that you should be able to use a local model if it is available without connecting to huggingface. Internally, we are wrapping the sentence_transformers library when a sentence-transformers model is used (like in your case). The corresponding code is here:

self.embedding_model = SentenceTransformer(retriever.embedding_model, device=str(retriever.devices[0]))

Could you maybe try to store a sentence transformers model locally and provide the file path instead of the model name during initialization? I'd be interested to see if that helps already.
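
That workaround can be sketched as follows (the save directory and the two-step flow are assumptions for illustration, not a documented Haystack recipe): download the model once while huggingface.co is reachable, save it to disk, and then pass the filesystem path instead of the hub name.

```python
from pathlib import Path

# Hypothetical local directory where the model gets saved.
local_dir = Path.home() / "models" / "all-MiniLM-L6-v2"

# Step 1 (run once, while huggingface.co is reachable):
#     from sentence_transformers import SentenceTransformer
#     SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").save(str(local_dir))
#
# Step 2 (offline): pass the path instead of the hub name, so no hub
# lookup should be attempted at all:
#     retriever = EmbeddingRetriever(
#         document_store=document_store,
#         embedding_model=str(local_dir),
#         use_gpu=True,
#     )
print(local_dir.name)
```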

xloem commented Feb 4, 2022

Hi, I just happened to bump into this issue. Have you tried setting TRANSFORMERS_OFFLINE=1 in the environment? https://github.com/huggingface/transformers/blob/8ce133063120683018b214fe10d1449e4c2401da/src/transformers/file_utils.py#L328
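
A minimal sketch of that approach, assuming the installed transformers version honors the flag: it must be set before transformers is imported, because the flag is read once at import time.

```python
import os

# Must be set before `import transformers`, because the flag is read
# once when transformers' file utilities are first imported.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# import transformers  # subsequent model loads now use only the local cache
```

Alternatively, export it in the shell (`TRANSFORMERS_OFFLINE=1 python serve-api.py`) so it is guaranteed to be set before any import runs.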

@bogdankostic (Contributor)

Hi @yunyu, did the above suggestions work for you?

@tstadel (Member)

tstadel commented Mar 3, 2022

@yunyu: as this issue seems to have an appropriate solution, I'm closing it now. You can reopen it at any time if the solution does not work for you.

@tstadel tstadel closed this as completed Mar 3, 2022
xloem commented Mar 3, 2022

My closing thoughts here are that this is really a poor choice on Hugging Face's part: it reveals to every user's network that they access a model every time one is loaded. It would seem more normal to rely on the cache until a timeout expires, and to fail gracefully if the network is offline. This can be done by wrapping code, too.
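
The graceful-fallback behavior described above can be sketched like this (not Haystack's or transformers' actual code; `load_with_fallback` and the fake loader are hypothetical, though `local_files_only` is a real keyword accepted by transformers' `from_pretrained` loaders): try the hub first, and on a network error retry against the local cache only.

```python
def load_with_fallback(load_fn, model_name):
    """load_fn is any loader accepting a local_files_only keyword,
    e.g. a from_pretrained-style function."""
    try:
        # First attempt: normal path, which may hit the hub.
        return load_fn(model_name, local_files_only=False)
    except (OSError, ConnectionError):
        # Hub unreachable or erroring: retry using only the local cache.
        return load_fn(model_name, local_files_only=True)

# Tiny fake loader to demonstrate the control flow without network access:
def fake_loader(name, local_files_only=False):
    if not local_files_only:
        raise ConnectionError("503 Service Temporarily Unavailable")
    return f"cached:{name}"

result = load_with_fallback(fake_loader, "sentence-transformers/all-MiniLM-L6-v2")
print(result)  # cached:sentence-transformers/all-MiniLM-L6-v2
```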
