Wav2vec doesn't align numerical characters #869

Open
pr-data-port opened this issue Aug 29, 2024 · 1 comment

Comments

@pr-data-port

Hi, I have audio that includes numbers (e.g. 16, 29, 32). WhisperX loads the audio and transcribes it perfectly, but when I run word alignment I hit an issue: the numbers are split out as separate words and end up with empty start and end times. For the wav2vec models I tried, the metadata only includes non-numerical characters [a-z].

Has anyone run into a similar issue, and does anyone know of an English wav2vec model (from Hugging Face) that would solve it?

Thanks for help in advance,
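
To make the symptom concrete, here is a minimal sketch of why digit-only words come back without timestamps. It assumes a character-level alignment vocabulary limited to [a-z], as described above; `ASSUMED_VOCAB` and `unalignable_words` are hypothetical names for illustration and are not part of whisperX or any wav2vec model.

```python
import string

# Assumption: the alignment model only knows the characters a-z,
# matching the [a-z] metadata described in the issue.
ASSUMED_VOCAB = set(string.ascii_lowercase)


def unalignable_words(transcript, vocab=ASSUMED_VOCAB):
    """Return the words that contain no character from the vocabulary.

    Such words give a character-level aligner nothing to match,
    so they would end up without start/end times.
    """
    missing = []
    for word in transcript.split():
        known_chars = [c for c in word.lower() if c in vocab]
        if not known_chars:
            missing.append(word)
    return missing


if __name__ == "__main__":
    print(unalignable_words("I counted 16 apples and 29 pears"))
```

Running a check like this on a transcript before alignment shows which words are likely to come back with empty timings.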

@itaipee

itaipee commented Sep 23, 2024

Pass the "--suppress_numerals" option when you transcribe with whisperX.
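
A small sketch of how that flag might be passed from a script. The flag name comes from the comment above; `build_whisperx_command` is a hypothetical helper and `audio.wav` is a placeholder path. My understanding is that the option keeps Whisper from emitting digit tokens, so numbers are written out as words the aligner can match, but treat that as an assumption and check the output on your own audio.

```python
import shlex


def build_whisperx_command(audio_path, extra_args=()):
    """Build a whisperx CLI invocation with --suppress_numerals enabled.

    Hypothetical helper: it only assembles the argument list and
    does not run anything.
    """
    return ["whisperx", audio_path, "--suppress_numerals", *extra_args]


if __name__ == "__main__":
    cmd = build_whisperx_command("audio.wav")  # placeholder file name
    print(shlex.join(cmd))
    # To actually run it (requires whisperx to be installed and on PATH):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to inspect the command before committing to a long transcription job.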
