Avoid computing higher temperatures on no_speech segments (openai#1279)

* Avoid computing higher temperatures on no_speech

  In decode_with_fallback, we retry decoding at higher temperatures when compression_ratio is too high or avg_logprob is too low. But since the computation of no_speech_prob doesn't depend on sampling, we can skip the higher temperatures if the first pass already shows that the no_speech condition is fulfilled.

* Update transcribe.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
abyesilyurt · Nov 13, 2023 · e17f091
1 parent d5dbbee
Showing 1 changed file with 5 additions and 1 deletion.
diff --git a/whisper/transcribe.py b/whisper/transcribe.py
@@ -175,7 +175,11 @@ def decode_with_fallback(segment: torch.Tensor) -> DecodingResult:
                 and decode_result.avg_logprob < logprob_threshold
             ):
                 needs_fallback = True  # average log probability is too low
-
+            if (
+                no_speech_threshold is not None
+                and decode_result.no_speech_prob > no_speech_threshold
+            ):
+                needs_fallback = False  # silence
             if not needs_fallback:
                 break
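
The effect of the added check can be illustrated with a minimal, self-contained sketch of the fallback loop. It is not the code in whisper/transcribe.py: `FakeResult`, `decode_with_fallback_sketch`, `silent_decode`, and `quiet_speech_decode` are hypothetical names, the stub decoders return fixed numbers instead of running a model, and the threshold values are chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class FakeResult:
    # Hypothetical stand-in for DecodingResult; only the fields the loop reads.
    temperature: float
    avg_logprob: float
    no_speech_prob: float
    compression_ratio: float


def decode_with_fallback_sketch(
    decode: Callable[[float], FakeResult],
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    compression_ratio_threshold: Optional[float] = 2.4,  # illustrative value
    logprob_threshold: Optional[float] = -1.0,  # illustrative value
    no_speech_threshold: Optional[float] = 0.6,  # illustrative value
) -> Tuple[FakeResult, int]:
    """Return the accepted result and the number of decoding passes run."""
    if not temperatures:
        raise ValueError("at least one temperature is required")
    passes = 0
    for t in temperatures:
        result = decode(t)
        passes += 1
        needs_fallback = False
        if (
            compression_ratio_threshold is not None
            and result.compression_ratio > compression_ratio_threshold
        ):
            needs_fallback = True  # output is too repetitive
        if logprob_threshold is not None and result.avg_logprob < logprob_threshold:
            needs_fallback = True  # average log probability is too low
        if (
            no_speech_threshold is not None
            and result.no_speech_prob > no_speech_threshold
        ):
            needs_fallback = False  # silence: retrying cannot change no_speech_prob
        if not needs_fallback:
            break
    return result, passes


def silent_decode(t: float) -> FakeResult:
    # Stub: low avg_logprob, but a high no_speech_prob at every temperature.
    return FakeResult(t, avg_logprob=-1.5, no_speech_prob=0.9, compression_ratio=1.0)


def quiet_speech_decode(t: float) -> FakeResult:
    # Stub: low avg_logprob and a low no_speech_prob, so every pass falls back.
    return FakeResult(t, avg_logprob=-1.5, no_speech_prob=0.1, compression_ratio=1.0)


if __name__ == "__main__":
    print(decode_with_fallback_sketch(silent_decode)[1])
    print(decode_with_fallback_sketch(quiet_speech_decode)[1])
```

With these stubs, the silent segment is accepted after a single pass, while the low-confidence speech segment still runs through every temperature, which is the saving the commit message describes.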