Broken (repeating) subtitles #2

Open
cupiditys opened this issue Feb 3, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@cupiditys

On every video I upload, the subtitles repeat for about 4 minutes before correct subtitles come back. I tried both the LargeV2 and LargeV3 models.


@umlx5h
Owner

umlx5h commented Feb 3, 2025

This is a problem I have been aware of for some time.
Which runtime are you using: CPU, Vulkan, or CUDA?

Re-enabling ASR will recreate the subtitles from the playback position, so that may help.
(Subtitles after the start position will be deleted, but those before it will remain.)

I reproduced it even when I fed the WAV files directly to whisper.net, so the cause may be on the library side (whisper.net or whisper.cpp).

debug code: output wav

// Output wav file for debugging
//using (FileStream fs = new($"subtitlewhisper-{chunkCnt}.wav", FileMode.Create, FileAccess.Write))
//{
//    waveStream.WriteTo(fs);
//    waveStream.Position = 0;
//}
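
For anyone who wants to repeat the "feed the WAV directly to whisper.net" check with a dumped chunk, a minimal sketch might look like the following. It assumes the Whisper.net NuGet package plus one runtime package (CPU, CUDA, or Vulkan) is referenced, that a ggml model file exists at the given path, and that the WAV is 16 kHz audio. The file names are placeholders for illustration, not paths used by the player.

```csharp
using System;
using System.IO;
using Whisper.net;

// Assumption: placeholder paths; substitute your own model and dumped chunk.
const string modelPath = "ggml-large-v3.bin";
const string wavPath = "subtitlewhisper-0.wav";

// Load the model and build a processor with automatic language detection.
using var factory = WhisperFactory.FromPath(modelPath);
using var processor = factory.CreateBuilder()
    .WithLanguage("auto")
    .Build();

// Stream the WAV through Whisper and print each recognized segment.
// A repeating-subtitle bug shows up as the same text on many consecutive lines.
using var wavStream = File.OpenRead(wavPath);
await foreach (var segment in processor.ProcessAsync(wavStream))
{
    Console.WriteLine($"{segment.Start} -> {segment.End}: {segment.Text}");
}
```

Running the same file against different runtime packages is one way to compare CPU, CUDA, and Vulkan behavior outside the player.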

I will do a detailed investigation later.

@cupiditys
Author

I was using the CPU in the GIF. I tried CUDA and it has the same issue (it seems worse, really, but I'm unsure).

Re-enabling does work, but it's a bit tedious since you have to do it multiple times.

@umlx5h
Owner

umlx5h commented Feb 3, 2025

Thanks, I don't have an NVIDIA GPU, so that helps.
I have confirmed that it also reproduces with Vulkan on a Radeon GPU. I will investigate it later.

@umlx5h
Owner

umlx5h commented Feb 3, 2025

I have confirmed that when I feed a specific WAV file directly to whisper.net, it does not reproduce on the CPU, but it definitely does on Vulkan.

Apparently there are two types of repeating problems. The one that occurs on the CPU must be a problem on the whisper.cpp side: the previous subtitle repeats during silence, and it should resolve itself when the next audio comes in.

The one that occurs with Vulkan and CUDA is that the previous subtitle repeats while audio is present, which should be fixed.

I will check whether the Vulkan/CUDA problem comes from whisper.net or whisper.cpp.


I bought an NVIDIA GPU and have confirmed that I can reproduce this with a combination of CPU/CUDA and LargeV3.

I need to set up an environment where I can run whisper.cpp directly on Windows.

@umlx5h umlx5h added the bug Something isn't working label Feb 3, 2025
@umlx5h umlx5h added the P0 will work immediately label Feb 10, 2025
@umlx5h
Owner

umlx5h commented Feb 12, 2025

It seems to be a problem in whisper.cpp or in the original Whisper models, so we will have to wait patiently for it to be fixed in an upgraded version.

sandrohanea/whisper.net#337
ggerganov/whisper.cpp#2191

There are several ways to deal with this.

  • Change to a different model.
    • Repeats are more likely with LargeV3 and LargeV3 Turbo, and less likely with LargeV2.
  • Change the runtime option.
    • Switch between CUDA, Vulkan, and CPU.
  • Re-run ASR from the repeat point.
    • If ASR is already enabled and you select it again, subtitles are regenerated from the playback position.
  • Set ASR Chunk Size to a smaller value in the settings.
    • This may eliminate repeats because Whisper is run again at shorter intervals.
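
To find the "repeat point" without watching for it manually, one option is to scan the generated segments for runs of identical text. The sketch below is only an illustration of that idea: `SubtitleSegment`, `RepeatDetector`, and the `minRun` threshold are hypothetical names and values, not part of the player, whisper.net, or whisper.cpp.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical segment type for illustration only.
public record SubtitleSegment(TimeSpan Start, TimeSpan End, string Text);

public static class RepeatDetector
{
    // Returns the start time of the first run in which the same text appears
    // at least minRun times in a row, or null if there is no such run.
    public static TimeSpan? FindRepeatPoint(
        IReadOnlyList<SubtitleSegment> segments, int minRun = 3)
    {
        if (minRun < 2) throw new ArgumentOutOfRangeException(nameof(minRun));

        int runStart = 0;
        for (int i = 1; i <= segments.Count; i++)
        {
            // The run continues while the text matches the run's first segment.
            bool same = i < segments.Count && string.Equals(
                segments[i].Text.Trim(),
                segments[runStart].Text.Trim(),
                StringComparison.OrdinalIgnoreCase);
            if (same) continue;

            // The run ended at i - 1; report it if it is long enough.
            if (i - runStart >= minRun) return segments[runStart].Start;
            runStart = i;
        }
        return null;
    }
}
```

Seeking to the returned time and re-selecting ASR would then regenerate the subtitles from that position. The threshold of three is an arbitrary assumption; legitimately repeated lines (song lyrics, for example) would need a higher value or a manual check.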

@umlx5h umlx5h removed the P0 will work immediately label Feb 12, 2025
2 participants