Support generating with fallback for short form audio in Whisper #29508

Kimahriman · 2024-03-07T11:44:23Z

Feature request

Generating with temperature fallback based on certain criteria was added to Whisper as part of the long-form generation. We should be able to apply the same fallback criteria to short-form audio. See the discussion here.

Motivation

The upstream OpenAI implementation applies fallback to all audio. In fact, it does not distinguish between "short" and "long" audio: everything is treated as long audio, and if there is only one segment to transcribe, it just processes that one segment.

See https://github.com/openai/whisper/blob/main/whisper/transcribe.py#L178
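For readers unfamiliar with the fallback, here is a minimal sketch of the idea, loosely modelled on the upstream loop linked above. It is not the transformers or OpenAI code: `DecodeResult` and the `decode` callable are hypothetical stand-ins for a real decoding pass, and the default temperatures and thresholds are assumed to match the upstream defaults. The upstream silence check based on the no-speech probability is left out for brevity.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class DecodeResult:
    """Hypothetical container for one decoding attempt."""
    text: str
    avg_logprob: float
    temperature: float


def compression_ratio(text: str) -> float:
    # Highly repetitive text compresses well, which yields a high ratio.
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def decode_with_fallback(
    decode: Callable[[float], DecodeResult],
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    compression_ratio_threshold: float = 2.4,
    logprob_threshold: float = -1.0,
) -> DecodeResult:
    """Retry decoding at increasing temperatures until the output passes both checks.

    `decode` is an assumed callable that runs one decoding pass at the given
    temperature. If every attempt fails, the last attempt is returned.
    """
    if not temperatures:
        raise ValueError("at least one temperature is required")

    result = None
    for temperature in temperatures:
        result = decode(temperature)
        too_repetitive = compression_ratio(result.text) > compression_ratio_threshold
        too_unlikely = result.avg_logprob < logprob_threshold
        if not (too_repetitive or too_unlikely):
            break  # this attempt is acceptable, stop falling back
    return result
```

The point of the request is that this loop does not depend on the audio length: a short clip is one segment that goes through the same checks.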

Your contribution

I probably cannot address this myself.

amyeroberts · 2024-03-07T12:43:06Z

cc @sanchit-gandhi @ylacombe

ylacombe · 2024-04-01T08:56:41Z

cc @sanchit-gandhi, it seems there are a few requests to make long-form audio features compatible with short-form audio. Do you have time to look into this?

sanchit-gandhi · 2024-05-20T17:00:56Z

This is a very valid request and we should indeed refactor generation_whisper.py to make no distinction between short and long-form generation (e.g. as per the original codebase). Would you like to have a go at this @kamilakesbi? Happy to help with reviews and questions!

Kimahriman mentioned this issue Mar 7, 2024

[Whisper] Finalize batched SOTA long-form generation #27658

Merged

4 tasks

amyeroberts added the Feature request and Audio labels Mar 7, 2024

Kimahriman mentioned this issue Mar 13, 2024

Whisper no_speech_threshold not applied when chunking input #29595

Closed

ylacombe mentioned this issue May 13, 2024

no_speech_probablity #30777

Closed

sanchit-gandhi assigned kamilakesbi May 22, 2024

sanchit-gandhi added the Good Difficult Issue label May 22, 2024

kamilakesbi mentioned this issue May 23, 2024

Support generating with fallback for short form audio in Whisper #30984

Merged

kamilakesbi closed this as completed Jul 19, 2024
