Speed up DSAlign #31

galv · 2021-06-23T06:36:00Z

Right now, we time out when an audio file fails to align with its transcript within 200 seconds: https://github.com/mlcommons/peoples-speech/pull/27/files#diff-b790cd27585332e1eeca7dab897f1ccd7bcd483181132bd9914f2dd07062534fR401
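For readers who don't want to open the diff, here is a minimal sketch of what a per-file timeout can look like. It is not the code from the linked PR: `run_with_timeout` and the idea of invoking the aligner as a subprocess are assumptions made for illustration; only the 200-second limit comes from this issue.

```python
import subprocess
import sys

# Limit quoted in this issue; everything else below is a hypothetical sketch.
ALIGN_TIMEOUT_SECONDS = 200


def run_with_timeout(cmd, timeout_seconds=ALIGN_TIMEOUT_SECONDS):
    """Run `cmd` as a child process and classify the outcome.

    Returns "ok" if the process exits with code 0 in time, "error" if it
    exits with a non-zero code, and "timeout" if it exceeds the limit
    (subprocess.run kills the child before raising TimeoutExpired).
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds, check=False)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if completed.returncode == 0 else "error"


if __name__ == "__main__":
    # Stand-in for a real alignment command: a child process that sleeps.
    slow_cmd = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_cmd, timeout_seconds=1))
```

Recording "timeout" separately from "error" makes it easy to count how many files are lost to the limit rather than to genuine failures.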

With this limit, 10% of our files time out during alignment.

One observation is that DSAlign seems to slow to a crawl when the ground-truth transcript does not match what was actually said in the audio (e.g., the transcript is a translation).

One option is to reimplement some part of DSAlign in Cython. But we should really dive deep into what's going on, and see if there's something better we can do.
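As a starting point for the deep dive, a profiler can show where the time goes before anything is ported to Cython. The sketch below uses only the standard-library `cProfile` and `pstats` modules; `profile_call` and `slow_stand_in` are hypothetical names, and the stand-in function would need to be replaced by whatever entry point actually runs the alignment for one file.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top_n=10, **kwargs):
    """Run func(*args, **kwargs) under cProfile.

    Returns (result, report), where report is a text table of the top_n
    entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top_n)
    return result, stream.getvalue()


def slow_stand_in(n):
    """Hypothetical placeholder for the alignment call being investigated."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(slow_stand_in, 200_000)
    print(report)
```

Profiling one file that aligns quickly and one that hits the timeout, then comparing the two reports, should indicate whether a single hot function dominates the slow case.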
