without_timestamps significantly changes the result (for the worse). #337

Closed
jzju opened this issue Jul 6, 2023 · 4 comments

Comments

@jzju

jzju commented Jul 6, 2023

There are 50% more words with without_timestamps=0 than with without_timestamps=1.

@phineas-pta

Consider adding a reproducible example to the bug report to help the devs figure out what is happening.

@jzju
Author

jzju commented Jul 6, 2023

from faster_whisper import WhisperModel

model = WhisperModel("medium", device="cuda", compute_type="float16")
f = "o.mp3"
# Run 1: timestamps disabled
segments, info = model.transcribe(f, beam_size=1, without_timestamps=True, language="en")
print(" ".join(s.text.strip() for s in segments))
print()
# Run 2: default settings (without_timestamps=False)
segments, info = model.transcribe(f, beam_size=1, language="en")
print(" ".join(s.text.strip() for s in segments))

For this clip there is a small difference: https://soundbible.com/1360-Obama-State-Of-The-Union-2010.html
I see much more difference for a 20-minute clip in Swedish with multiple speakers and noise, but it's not data I can publish.

tear each other down instead of lifting the bar, lifting this country up, we lose faith.
vs
tear each other down instead of lifting this country up, we lose faith.
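
To put a number on a difference like the one above, the two transcripts can be compared word by word. This is only a minimal sketch using Python's standard `difflib`; the helper name `word_diff` is hypothetical and not part of faster-whisper, and the two strings are the excerpts quoted above.

```python
import difflib


def word_diff(a: str, b: str):
    """Return the word counts of both texts and the non-matching spans."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record what kind of change it is and the words on each side.
            changes.append((tag, " ".join(a_words[i1:i2]), " ".join(b_words[j1:j2])))
    return len(a_words), len(b_words), changes


# The two excerpts quoted in this comment.
first = "tear each other down instead of lifting the bar, lifting this country up, we lose faith."
second = "tear each other down instead of lifting this country up, we lose faith."

n_first, n_second, changes = word_diff(first, second)
print(f"word counts: {n_first} vs {n_second}")
for tag, left, right in changes:
    print(f"{tag}: {left!r} -> {right!r}")
```

Running the same helper on the full outputs of both `transcribe` calls would show where the extra words come from, which is more useful in a report than a total count alone.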

@phineas-pta

Hallucinations are more likely with the kind of speech described above.

Try using the large model and/or VAD.
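
A rough sketch of that suggestion, assuming an installed faster-whisper version whose `transcribe` accepts `vad_filter` and `vad_parameters`. The helper names `transcribe_text` and `run_with_vad`, the `"large-v2"` model size, and the 500 ms silence threshold are illustrative assumptions, not values from this thread.

```python
def transcribe_text(model, path, **options):
    """Transcribe `path` with the thread's settings plus any extra options."""
    segments, _info = model.transcribe(path, beam_size=1, language="en", **options)
    # `segments` is consumed lazily, so joining it is what runs the decoding.
    return " ".join(s.text.strip() for s in segments)


def run_with_vad(path, model_size="large-v2"):
    """Load a larger model and transcribe with voice activity detection on."""
    # Imported here so the helper above can be used without the library loaded.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    return transcribe_text(
        model,
        path,
        vad_filter=True,  # skip stretches detected as non-speech
        vad_parameters=dict(min_silence_duration_ms=500),  # assumed threshold
    )
```

Calling `run_with_vad("o.mp3")` and comparing the result with the earlier outputs would show whether the extra words come from non-speech stretches.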

@guillaumekln
Contributor

You could also try with the original openai-whisper but it is generally expected that without_timestamps changes the results. The decoding logic is very different.
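
For the cross-check against openai-whisper, something like the following untested sketch could be used. It assumes the `openai-whisper` package is installed and that `transcribe` forwards `beam_size` and `without_timestamps` to its decoding options; the helper names `openai_whisper_texts` and `word_counts` are hypothetical.

```python
def openai_whisper_texts(path, model_name="medium"):
    """Transcribe `path` twice with openai-whisper, with and without timestamps."""
    # Imported lazily; requires the openai-whisper package.
    import whisper

    model = whisper.load_model(model_name)
    texts = {}
    for flag in (True, False):
        result = model.transcribe(
            path, language="en", beam_size=1, without_timestamps=flag
        )
        texts[f"without_timestamps={flag}"] = result["text"].strip()
    return texts


def word_counts(texts):
    """Map each label to the number of whitespace-separated words."""
    return {label: len(text.split()) for label, text in texts.items()}
```

If `word_counts(openai_whisper_texts("o.mp3"))` shows a similar gap, the difference comes from the decoding mode itself rather than from faster-whisper.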
