
get_language_model_class does not work when huggingface.co is offline, even if model is locally cached #2118

Closed
yunyu opened this issue Feb 3, 2022 · 5 comments
Labels
topic:dependencies type:bug Something isn't working

Comments

yunyu commented Feb 3, 2022

Describe the bug
Huggingface.co is currently offline. I am running Haystack with the models cached in ~/.cache/huggingface, but my program still fails.

Error message

  File "/home/yunyu/haystack-test-api/src/serve-api.py", line 24, in <module>
    retriever = EmbeddingRetriever(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/nodes/retriever/dense.py", line 1040, in __init__
    self.embedding_encoder = _EMBEDDING_ENCODERS[model_format](self)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/nodes/retriever/_embedding_encoder.py", line 54, in __init__
    self.embedding_model = Inferencer.load(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/infer.py", line 189, in load
    model = AdaptiveModel.convert_from_transformers(model_name_or_path,
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/model/adaptive_model.py", line 510, in convert_from_transformers
    return conv.Converter.convert_from_transformers(model_name_or_path,
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/conversion/transformers.py", line 70, in convert_from_transformers
    lm = LanguageModel.load(model_name_or_path, revision=revision,use_auth_token=use_auth_token, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/model/language_model.py", line 161, in load
    language_model_class = cls.get_language_model_class(pretrained_model_name_or_path, use_auth_token=use_auth_token, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/haystack/modeling/model/language_model.py", line 197, in get_language_model_class
    config = AutoConfig.from_pretrained(model_name_or_path, use_auth_token=use_auth_token, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 580, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/configuration_utils.py", line 550, in get_config_dict
    configuration_file = get_configuration_file(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/configuration_utils.py", line 841, in get_configuration_file
    all_files = get_list_of_files(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/transformers/file_utils.py", line 1952, in get_list_of_files
    return list_repo_files(path_or_repo, revision=revision, token=token)
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 884, in list_repo_files
    info = self.model_info(
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 868, in model_info
    r.raise_for_status()
  File "/home/yunyu/.cache/pypoetry/virtualenvs/faqbot-api-8eL8ANau-py3.9/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Temporarily Unavailable for url: https://huggingface.co/api/models/sentence-transformers/all-MiniLM-L6-v2

Expected behavior
If huggingface.co is down and the models are locally cached, the locally cached versions of the models should be used.

Additional context
https://status.huggingface.co/

To Reproduce
Run

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    use_gpu=True,
)

System:

  • OS: Debian 10
  • GPU/CPU: NVIDIA P40
  • Haystack version (commit or version number): 1.1.0
  • DocumentStore: ElasticsearchDocumentStore
  • Reader:
  • Retriever: EmbeddingRetriever
@mathislucka mathislucka added topic:dependencies type:bug Something isn't working labels Feb 3, 2022
@mathislucka (Member)

Hi @yunyu !

I agree that you should be able to use a local model if it is available without connecting to huggingface. Internally, we are wrapping the sentence_transformers library when a sentence-transformers model is used (like in your case). The corresponding code is here:

self.embedding_model = SentenceTransformer(retriever.embedding_model, device=str(retriever.devices[0]))

Could you maybe try to store a sentence transformers model locally and provide the file path instead of the model name during initialization? I'd be interested to see if that helps already.
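
That workaround can be sketched as follows (the save directory and the two-step flow are assumptions for illustration, not a documented Haystack recipe): download the model once while huggingface.co is reachable, save it to disk, and then pass the filesystem path instead of the hub name.

```python
from pathlib import Path

# Hypothetical local directory where the model gets saved.
local_dir = Path.home() / "models" / "all-MiniLM-L6-v2"

# Step 1 (run once, while huggingface.co is reachable):
#     from sentence_transformers import SentenceTransformer
#     SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").save(str(local_dir))
#
# Step 2 (offline): pass the path instead of the hub name, so no hub
# lookup should be attempted at all:
#     retriever = EmbeddingRetriever(
#         document_store=document_store,
#         embedding_model=str(local_dir),
#         use_gpu=True,
#     )
print(local_dir.name)
```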

xloem commented Feb 4, 2022

Hi, I just happened to bump into this issue. Have you tried setting TRANSFORMERS_OFFLINE=1 in the environment? https://github.com/huggingface/transformers/blob/8ce133063120683018b214fe10d1449e4c2401da/src/transformers/file_utils.py#L328
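
A minimal sketch of that approach, assuming the installed transformers version honors the flag: it must be set before transformers is imported, because the flag is read once at import time.

```python
import os

# Must be set before `import transformers`, because the flag is read
# once when transformers' file utilities are first imported.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# import transformers  # subsequent model loads now use only the local cache
```

Alternatively, export it in the shell (`TRANSFORMERS_OFFLINE=1 python serve-api.py`) so it is guaranteed to be set before any import runs.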

@bogdankostic (Contributor)

Hi @yunyu, did the above suggestions work for you?

@tstadel (Member)

tstadel commented Mar 3, 2022

@yunyu: as this issue seems to have an appropriate solution, I'm closing it now. You can reopen it at any time if the solution does not work for you.

@tstadel tstadel closed this as completed Mar 3, 2022
xloem commented Mar 3, 2022

My closing thoughts here are that this is really a poor choice on Hugging Face's part: it reveals to every user's network that they access a model every time one is loaded. It would seem more normal to rely on the cache until a timeout expires, and to fail gracefully if the network is offline. This can be done by wrapping code, too.
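
The graceful-fallback behavior described above can be sketched like this (not Haystack's or transformers' actual code; `load_with_fallback` and the fake loader are hypothetical, though `local_files_only` is a real keyword accepted by transformers' `from_pretrained` loaders): try the hub first, and on a network error retry against the local cache only.

```python
def load_with_fallback(load_fn, model_name):
    """load_fn is any loader accepting a local_files_only keyword,
    e.g. a from_pretrained-style function."""
    try:
        # First attempt: normal path, which may hit the hub.
        return load_fn(model_name, local_files_only=False)
    except (OSError, ConnectionError):
        # Hub unreachable or erroring: retry using only the local cache.
        return load_fn(model_name, local_files_only=True)

# Tiny fake loader to demonstrate the control flow without network access:
def fake_loader(name, local_files_only=False):
    if not local_files_only:
        raise ConnectionError("503 Service Temporarily Unavailable")
    return f"cached:{name}"

result = load_with_fallback(fake_loader, "sentence-transformers/all-MiniLM-L6-v2")
print(result)  # cached:sentence-transformers/all-MiniLM-L6-v2
```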
