Some segments are shifted by 1 second after PR #856 #1140

Closed
heimoshuiyu opened this issue Nov 14, 2024 · 0 comments · Fixed by #1141


heimoshuiyu commented Nov 14, 2024

Thanks for your hard work.


audio (2 minutes): 01.aac.zip

The correct SRT result (using commit fbcf58b, which is before the huge PR #856): 01.old.srt.zip

The wrong SRT result (using latest commit 85e61ea): 01.new.srt.zip


I am not using the batched version. The model is created and called like this:

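import faster_whisper

# Non-batched model on the GPU
model = faster_whisper.WhisperModel(
    model_size_or_path='large-v2',
    device='cuda',
    cpu_threads=4,
)
# `audio` is the input file (01.aac from the attachment above);
# transcribe() returns a segment generator and a TranscriptionInfo object
segments, info = model.transcribe(
    audio=audio,
    language=None,
    task='transcribe',
    vad_filter=False,
    initial_prompt=None,
    word_timestamps=True,
    repetition_penalty=1.0,
)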

The script comes from this project: https://github.com/heimoshuiyu/whisper-fastapi
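For context on how segment times end up in an SRT file, here is a minimal sketch of the conversion. It is not the code from whisper-fastapi; `srt_timestamp` and `to_srt` are hypothetical helpers, and the input is assumed to be plain `(start, end, text)` triples in seconds, such as one could build from each segment's `start`, `end` and `text`.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) triples as SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)
```

Since the conversion only reformats the numbers it is given, a +1s offset in the SRT would most likely already be present in the segment timestamps themselves.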


image

Some segments on the left (wrong) are shifted by +1 second relative to the right (correct).
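To list the shifted segments without comparing the files by eye, something like the following sketch could be used. It is only an illustration: `to_seconds`, `parse_srt` and `shifted_cues` are hypothetical helpers, cues are paired by identical text (an assumption that breaks down if the two runs split the segments differently), and the 0.5 s threshold is arbitrary.

```python
import re

TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(text: str) -> list[tuple[float, float, str]]:
    """Parse SRT text into (start, end, text) triples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # The timing line is the one containing the "-->" separator.
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue
        start, end = lines[timing].split("-->")
        body = " ".join(lines[timing + 1:]).strip()
        cues.append((to_seconds(start), to_seconds(end), body))
    return cues


def shifted_cues(old_srt: str, new_srt: str, threshold: float = 0.5):
    """Return (text, delta) for cues whose start moved by more than threshold seconds."""
    old_starts = {}
    for start, _end, text in parse_srt(old_srt):
        old_starts.setdefault(text, start)  # keep the first occurrence of each text
    report = []
    for start, _end, text in parse_srt(new_srt):
        if text in old_starts:
            delta = start - old_starts[text]
            if abs(delta) > threshold:
                report.append((text, round(delta, 3)))
    return report
```

Called with the contents of the unzipped old and new SRT files, `shifted_cues(old_text, new_text)` would return the segments whose start time moved, with a positive delta meaning the new result is later.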


I also tested the commit of PR #856 (eb83902), which is worse.

Resulting SRT: 01.eb839023.srt.zip

image

left: commit eb83902 PR #856
middle: latest commit 85e61ea
right: commit fbcf58b
