Improve --model argument handling and help message #1764

spartanhaden · 2023-11-06T22:16:07Z

This PR introduces the following updates to the whisper/transcribe.py script:

Enhancement of the --model argument handling and help message: The --model argument now provides a list of available model choices along with the default option when the --help flag is used. This enhances user experience by providing immediate visibility of the available options.
- Previous message: --model MODEL name of the Whisper model to use (default: small)
- Updated message: --model MODEL name of the Whisper model to use. Available models are: tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large-v1, large-v2, large-v3, large. You can also specify a path to a model checkpoint. (default: small)
- Note: choices=available_models() was deliberately not used, so that custom model checkpoint paths are still accepted.
Improved error message for incorrect model names: If a non-existent model name is given, the error message now works as intended: it reports the problem and lists the valid model names.
- Previous message: whisper: error: argument --model: invalid valid_model_name value: 'some_incorrect_model_name'
- Updated message: whisper: error: argument --model: model should be one of ['tiny.en', 'tiny', 'base.en', 'base', 'small.en', 'small', 'medium.en', 'medium', 'large-v1', 'large-v2', 'large-v3', 'large'] or path to a model checkpoint
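For readers unfamiliar with how argparse produces these two messages, here is a minimal sketch, not the code from this PR. It assumes a hard-coded AVAILABLE_MODELS list as a stand-in for whisper.available_models() and a hypothetical build_parser() helper. The relevant argparse behavior: a ValueError raised by a type= callable yields the generic "invalid <function name> value" text, while argparse.ArgumentTypeError lets the callable supply its own message.

```python
import argparse
import os

# Stand-in for whisper.available_models(), hard-coded so the sketch
# runs without Whisper installed (an assumption for illustration).
AVAILABLE_MODELS = [
    "tiny.en", "tiny", "base.en", "base", "small.en", "small",
    "medium.en", "medium", "large-v1", "large-v2", "large-v3", "large",
]


def valid_model_name(name: str) -> str:
    """Accept a known model name or a path to an existing checkpoint."""
    if name in AVAILABLE_MODELS or os.path.exists(name):
        return name
    # ArgumentTypeError carries a custom message through to the user;
    # a plain ValueError would be replaced by argparse's generic text.
    raise argparse.ArgumentTypeError(
        f"model should be one of {AVAILABLE_MODELS} or path to a model checkpoint"
    )


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper that builds only the --model part of the CLI."""
    parser = argparse.ArgumentParser(
        prog="whisper",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "--model",
        default="small",
        type=valid_model_name,  # validated here rather than with choices=
        help="name of the Whisper model to use. Available models are: "
        + ", ".join(AVAILABLE_MODELS)
        + ". You can also specify a path to a model checkpoint.",
    )
    return parser
```

Validating in the type= callable rather than with choices= is what keeps arbitrary checkpoint paths usable, since choices= would reject anything outside the fixed list.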

MohamedAliRashad · 2023-11-07T01:22:01Z

large-v3 is not working, it's giving me this error.

RuntimeError: Given groups=1, weight of size [1280, 128, 3], expected input[1, 80, 3000] to have 128 channels, but got 80 channels instead

FurkanGozukara · 2023-11-07T15:07:12Z

> large-v3 is not working, it's giving me this error.
>
> RuntimeError: Given groups=1, weight of size [1280, 128, 3], expected input[1, 80, 3000] to have 128 channels, but got 80 channels instead

I tested it yesterday and it worked very well.

I am using Python 3.10.11.

MohamedAliRashad · 2023-11-08T08:20:21Z

@FurkanGozukara
It turned out that log_mel_spectrogram needs n_mels set to 128, because large-v3 works with 128 mel channels rather than the 80 that large-v2 uses.

FurkanGozukara · 2023-11-08T20:13:14Z

> log_mel_spectrogram

What does it do?

ihmily · 2023-11-18T12:33:34Z

> log_mel_spectrogram
>
> what does it do?

I also encountered this issue, and the error message is Given groups=1, weight of size [1280, 128, 3], expected input[1, 80, 3000] to have 128 channels, but got 80 channels instead.

The solution is to modify the following line of code:

mel = whisper.log_mel_spectrogram(audio).to(model.device)

to:

mel = whisper.log_mel_spectrogram(audio, n_mels=128).to(model.device)

Explicitly setting n_mels=128 should resolve the issue for large-v3, because its first convolution expects 128 input channels (the [1280, 128, 3] weight in the error message). If you are running a model that expects 80 mel channels, such as large-v2, keep n_mels at 80 instead.

A higher n_mels gives more Mel frequency filters and therefore finer frequency detail in the spectrogram, but the value is not a free choice here: it has to match the number of mel channels the model was trained with.
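Rather than hard-coding 80 or 128, the value can be read from the loaded model. The sketch below assumes the model object exposes dims.n_mels, as Whisper models loaded with whisper.load_model do. The helper names expected_n_mels and prepare_mel are hypothetical, and whisper is imported lazily so the first helper can be tried without it installed.

```python
def expected_n_mels(model) -> int:
    """Return the number of mel channels the loaded model expects."""
    return model.dims.n_mels


def prepare_mel(model, audio):
    """Compute a log-Mel spectrogram that matches the model's input size."""
    import whisper  # lazy import: only needed when this function is called

    # Pass the model's own channel count instead of relying on the default.
    mel = whisper.log_mel_spectrogram(audio, n_mels=expected_n_mels(model))
    return mel.to(model.device)
```

With this, the same calling code should work for both large-v2 and large-v3, since the channel count follows whichever model was loaded.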

ihmily mentioned this pull request Nov 18, 2023

Help needed: files not found when preprocessing a short-audio dataset Plachtaa/VITS-fast-fine-tuning#495

Open

Improve --model argument handling and help message

4de997d

spartanhaden force-pushed the main branch from 97fd487 to 4de997d Compare November 28, 2023 19:03
