
Bug when processor uses a model cached locally #28697

Closed
2 of 4 tasks
wwx007121 opened this issue Jan 25, 2024 · 5 comments · Fixed by #28709
Comments

@wwx007121

System Info

version: transformers>=4.37.0

The bug occurs in https://github.com/huggingface/transformers/blob/main/src/transformers/processing_utils.py, line 466.

I understand the purpose of this code, but it conflicts with the code in utils/hub.py, line 466: the error message wording there may have changed, so the string match in processing_utils.py no longer recognizes it.
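
For context, the failing interaction looks roughly like this. This is a minimal sketch of the pattern, not the exact transformers source; the helper name cached_file_sketch and the paraphrased comments are mine:

# utils/hub.py (paraphrased): when the connection fails and the file is not in
# the cache, cached_file raises an EnvironmentError whose wording differs from
# the "does not appear to have a file named ..." message used for other cases.
def cached_file_sketch(repo_id, filename):
    raise EnvironmentError(
        f"We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it "
        f"in the cached files and it looks like {repo_id} is not the path to a directory "
        f"containing a file named {filename}."
    )

# processing_utils.py, around line 466 (paraphrased): the except branch only
# tolerates one wording, so the offline wording above falls through.
try:
    cached_file_sketch("openai/whisper-large-v3", "processor_config.json")
except EnvironmentError as e:
    if "does not appear to have a file named processor_config.json." in str(e):
        pass  # repos that simply lack processor_config.json are tolerated
    else:
        raise  # the offline wording lands here and surfaces as the OSError below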

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

As a simple workaround, I changed my local code from

if "does not appear to have a file named processor_config.json." in str(e):

to

if "processor_config.json." in str(e):

Alternatively, downgrading to 4.36.2 also works.
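
A quick check of why the relaxed substring matters, using the exact error text from the traceback further down in this thread (just an illustration, not library code):

old_needle = "does not appear to have a file named processor_config.json."
new_needle = "processor_config.json."
offline_msg = (
    "We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it "
    "in the cached files and it looks like distil-whisper/distil-large-v2 is not the path "
    "to a directory containing a file named processor_config.json."
)
print(old_needle in offline_msg)  # False -> the except branch re-raises the error
print(new_needle in offline_msg)  # True  -> the relaxed check tolerates it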

Expected behavior

I think there may be a better solution.

@amyeroberts
Collaborator

Hi @wwx007121, thanks for raising an issue!

Could you give some more details about the exact bug that is occurring, i.e. the error being encountered (including the full traceback) and a minimal code snippet to reproduce the issue?

cc @ydshieh

@wwx007121
Author

wwx007121 commented Jan 25, 2024


    model_id = "openai/whisper-large-v3"
    pretrain_model = AutoModelForSpeechSeq2Seq.from_pretrained(
        model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True, cache_dir=model_cache
    )
    pretrain_model.to(device)
    print("load model done")

    processor = AutoProcessor.from_pretrained(model_id, cache_dir=model_cache)

The model in the cache was downloaded by another process that shares the same Docker environment.

Traceback (most recent call last):
  File "script/get_whisper_result.py", line 28, in <module>
    processor = AutoProcessor.from_pretrained(model_id, cache_dir=model_cache)
  File "/opt/miniconda/lib/python3.8/site-packages/transformers/models/auto/processing_auto.py", line 313, in from_pretrained
    return processor_class.from_pretrained(
  File "/opt/miniconda/lib/python3.8/site-packages/transformers/processing_utils.py", line 464, in from_pretrained
    processor_dict, kwargs = cls.get_processor_dict(pretrained_model_name_or_path, **kwargs)
  File "/opt/miniconda/lib/python3.8/site-packages/transformers/processing_utils.py", line 308, in get_processor_dict
    resolved_processor_file = cached_file(
  File "/opt/miniconda/lib/python3.8/site-packages/transformers/utils/hub.py", line 425, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like distil-whisper/distil-large-v2 is not the path to a directory containing a file named processor_config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

@ydshieh ydshieh self-assigned this Jan 25, 2024
@ydshieh
Collaborator

ydshieh commented Jan 25, 2024

Hi @wwx007121, it looks like I should indeed modify the condition; thank you for reporting this.

However, to be sure, I would really like to be able to reproduce the issue. So far I am doing the following, which should match the situation you described, but this code snippet works without any error.

Could you describe in more detail how to reproduce it, please?

You mentioned that the cache was downloaded in another process. When running the provided code example, is the connection cut or disabled?

from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

model_id = "openai/whisper-large-v3"

model_cache = "my_cache"

pretrain_model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, use_safetensors=True, cache_dir=model_cache
)

processor = AutoProcessor.from_pretrained(model_id, cache_dir=model_cache)

@ydshieh
Collaborator

ydshieh commented Jan 25, 2024

Well, I tried disabling the internet connection and I can reproduce the issue. I will open a PR to fix it. Thanks again for reporting!
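
For anyone reproducing this later: one way to simulate a severed connection (an assumption on my side, the comment above does not say how the connection was disabled) is transformers' documented offline mode:

import os

# Must be set before transformers is imported; it makes hub lookups skip the network.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoProcessor

# With the whisper weights already cached but no processor_config.json in the
# cache, affected versions raise an OSError like the one in the traceback above.
processor = AutoProcessor.from_pretrained("openai/whisper-large-v3", cache_dir="my_cache")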

@ydshieh
Collaborator

ydshieh commented Jan 26, 2024

@wwx007121

The fix is merged into main. Thanks again!
