Increase mfcc step size instead of throwing away feature frames #1744

reuben · 2018-11-22T11:58:09Z

Lines 17 to 21 in a3a96cf

    # Get mfcc coefficients
    features = mfcc(audio, samplerate=fs, numcep=numcep)

    # We only keep every second feature (BiRNN stride = 2)
    features = features[::2]

We could instead pass winlen=0.03 and winstep=0.02 (both in seconds) to mfcc to get the same rate of feature windows over time, but without discarding any data.
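As a rough check that the two approaches give the same number of feature frames, here is a minimal sketch. It assumes 16 kHz audio and a framing rule that pads the tail of the signal so the last partial window still produces a frame; `num_frames` is a hypothetical helper written for this illustration, not part of any library.

```python
import math

def num_frames(num_samples, fs, winlen, winstep):
    """Frame count, assuming the tail of the signal is zero-padded
    so that the final partial window still yields a frame."""
    frame_len = int(round(winlen * fs))
    frame_step = int(round(winstep * fs))
    if num_samples <= frame_len:
        return 1
    return 1 + math.ceil((num_samples - frame_len) / frame_step)

fs = 16000            # assumed sample rate
num_samples = 2 * fs  # two seconds of audio

# Current approach: 25 ms window, 10 ms step, then features[::2].
all_frames = num_frames(num_samples, fs, winlen=0.025, winstep=0.01)
current = math.ceil(all_frames / 2)  # length of a [::2] slice

# Proposed approach: 30 ms window, 20 ms step, nothing discarded.
proposed = num_frames(num_samples, fs, winlen=0.03, winstep=0.02)

print(all_frames, current, proposed)
```

Under these assumptions both paths produce the same number of frames for this input, but the second one computes only the frames it keeps.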

khsinclair · 2018-11-26T16:04:34Z

I'd recommend increasing winlen to 0.032 to match the 512-sample FFT. This avoids creating the step discontinuity when using the traditional Hamming window. That is, a 480-sample length adds 32 zeroes to create a 512-sample FFT, and the discontinuity of the raised cosine window at sample 480 produces some spectral splatter.
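To make the arithmetic concrete, here is a small sketch assuming 16 kHz audio and the textbook Hamming definition w[n] = 0.54 - 0.46 cos(2πn/(N-1)). The helpers `hamming` and `padding_needed` are written for this illustration only.

```python
import math

FS = 16000   # assumed sample rate
NFFT = 512   # FFT size mentioned above

def hamming(n_points):
    """Textbook Hamming window; its end points sit at 0.08, not 0."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def padding_needed(winlen):
    """Zeros appended to bring one frame up to the FFT size."""
    frame_len = int(round(winlen * FS))
    return NFFT - frame_len

pad_030 = padding_needed(0.03)   # 480-sample frame
pad_032 = padding_needed(0.032)  # 512-sample frame

# Window value at the last sample of a 480-sample frame. Whatever the
# signal is there gets scaled by this, then drops straight to zero padding.
edge = hamming(480)[-1]

print(pad_030, pad_032, round(edge, 2))
```

With winlen=0.032 the frame fills the FFT exactly, so no zero padding follows the window's nonzero edge.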

HOWEVER, it appears to me that line 226 of deepspeech.cc invokes feature generation with NO window function at all, and that's a serious problem. Is that indeed the usual path for audio feature generation?

Also note that any of these changes will make existing models incompatible with new audio, to various degrees. I don't know what your compatibility policy is for this.

khsinclair · 2018-11-26T16:23:42Z

Without a window function, the current implementation with 0.025s length has two discontinuities in the FFT input: one at sample 400 and one at sample 512/0. Not good.
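A minimal sketch of those two jumps, assuming 16 kHz audio and a hypothetical 440 Hz cosine as the test signal. The FFT treats its input as one period of a periodic signal, so the end of the buffer wraps around to sample 0.

```python
import math

FS, FRAME_LEN, NFFT = 16000, 400, 512  # 0.025 s at an assumed 16 kHz

# Hypothetical test tone, with no window function applied.
frame = [math.cos(2 * math.pi * 440 * n / FS) for n in range(FRAME_LEN)]
padded = frame + [0.0] * (NFFT - FRAME_LEN)

# Jump where the signal stops and the zero padding starts.
jump_at_400 = abs(padded[FRAME_LEN] - padded[FRAME_LEN - 1])

# Jump where the padded buffer wraps from sample 511 back to sample 0.
jump_at_wrap = abs(padded[0] - padded[-1])

print(jump_at_400, jump_at_wrap)
```

Both values are close to the tone's full amplitude here. A tapering window would shrink the samples at both ends of the frame and with them both jumps.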

kdavis-mozilla · 2018-11-27T09:16:25Z

> I don't know what your compatibility policy is for this.

Currently, as we have not yet reached 1.0, we are free to break backwards compatibility when the engine benefits.

reuben · 2018-12-10T13:14:37Z

@khsinclair thanks for the tips! I've created PR #1773 fixing this.

lock · 2019-01-09T15:32:21Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

reuben closed this as completed in 2a8128b Dec 10, 2018

reuben mentioned this issue Dec 12, 2018

Set graph's random seed when training #1780

Merged

lock bot locked and limited conversation to collaborators Jan 9, 2019
