Right ASR Wrong Speaker diarization #303

Open
liu6381810 opened this issue Feb 18, 2025 · 4 comments

Comments

@liu6381810

Hello, I used the script to perform speaker diarization on an audio clip. The resulting timings and text are relatively accurate, but although five people speak in the audio, only one speaker was detected, and all content is labeled Speaker 0. Are there any parameters I could adjust to improve this result? The command I ran:
python3 diarize.py -a input.mp4 --whisper-model faster-whisper-large-v3

@MahmoudAshraf97
Owner

There are known problems with diarization on long audio files; the current workaround is to split the audio up.

@liu6381810
Author

> There are known problems with diarization on long audio files; the current workaround is to split the audio up.

How long should each segment generally be?
Suppose the first half of an audio file is a conversation between person 1 and person 2, and the second half is a conversation between person 2 and person 3. If I split the audio, how can I determine the time at which the conversation between person 2 and person 3 starts?
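
To make the question concrete: if diarization labels the speakers correctly, the moment person 3 joins can be read directly from the labeled segments. The sketch below assumes the output has already been parsed into `(start, end, speaker)` tuples; that tuple layout, the function name `first_appearance`, and the sample data are hypothetical and are not the script's actual output format.

```python
from typing import Dict, Iterable, Tuple

# Assumed layout: (start_seconds, end_seconds, speaker_label)
Segment = Tuple[float, float, str]


def first_appearance(segments: Iterable[Segment]) -> Dict[str, float]:
    """Map each speaker label to the earliest start time at which it occurs."""
    first: Dict[str, float] = {}
    for start, _end, speaker in segments:
        # Keep the smallest start time seen so far for this speaker.
        if speaker not in first or start < first[speaker]:
            first[speaker] = start
    return first


# Hypothetical diarization output, for illustration only.
example = [
    (0.0, 4.2, "Speaker 0"),
    (4.5, 9.0, "Speaker 1"),
    (300.0, 306.5, "Speaker 1"),
    (307.0, 312.0, "Speaker 2"),
]

if __name__ == "__main__":
    for speaker, start in first_appearance(example).items():
        print(f"{speaker} first speaks at {start:.1f}s")
```

This only helps when the speaker labels themselves are reliable, which is exactly what is failing in my case.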

@MahmoudAshraf97
Owner

One-hour segments have been tested and work well; just split the audio every hour.
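
A minimal sketch of the splitting step, assuming `ffmpeg` is installed and the total duration of the file is already known. The helper names (`chunk_boundaries`, `ffmpeg_split_command`), the `chunk_NNN.wav` output pattern, and the mono 16 kHz output settings are assumptions for illustration and are not part of the repository; only standard `ffmpeg` options (`-ss`, `-t`, `-i`, `-vn`, `-ac`, `-ar`) are used.

```python
from typing import List, Tuple


def chunk_boundaries(total_seconds: float, chunk_seconds: float = 3600.0) -> List[Tuple[float, float]]:
    """Return (start, duration) pairs that cover the audio in fixed-size chunks."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        return []
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        # The last chunk may be shorter than chunk_seconds.
        duration = min(chunk_seconds, total_seconds - start)
        chunks.append((start, duration))
        start += chunk_seconds
    return chunks


def ffmpeg_split_command(src: str, index: int, start: float, duration: float) -> List[str]:
    """Build an ffmpeg argument list that extracts one chunk as mono 16 kHz WAV."""
    out = f"chunk_{index:03d}.wav"  # hypothetical output naming
    return [
        "ffmpeg",
        "-ss", str(start),      # seek to the chunk start
        "-t", str(duration),    # read only this many seconds
        "-i", src,
        "-vn",                  # drop any video stream
        "-ac", "1",             # mono
        "-ar", "16000",         # 16 kHz sample rate
        out,
    ]


if __name__ == "__main__":
    # Example: a 2.5-hour file yields two full chunks and one 30-minute chunk.
    for i, (start, duration) in enumerate(chunk_boundaries(9000.0)):
        print(" ".join(ffmpeg_split_command("input.mp4", i, start, duration)))
```

Each chunk can then be passed to `diarize.py` separately. Speaker labels from separate runs are generally not guaranteed to match, so Speaker 0 in one chunk is not necessarily Speaker 0 in the next.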

@liu6381810
Author

> One-hour segments have been tested and work well; just split the audio every hour.

As I mentioned above, for a test video that is only 10 minutes long, the automatic speech recognition (ASR) content is relatively accurate, but the speaker identification (speaker ID) is not very accurate.
