
swbd/chain : added blstm + fastlstm and blstm + tdnn + fastlstm scripts #1497

Merged 1 commit into kaldi-asr:master on Mar 16, 2017

Conversation

vijayaditya (Contributor)

No description provided.

@vijayaditya (Contributor, Author)

@danpovey Unlike in TDNN+LSTMs, replacing LSTM layers with fast LSTM layers has led to a degradation in BLSTM models.
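For context, the swap in question happens at the xconfig level inside the recipe. A minimal sketch of one BLSTM layer pair, assuming typical swbd chain settings (layer names and dimensions here are illustrative, not necessarily the committed values):

```
# one BLSTM "layer" is a forward + backward LSTM pair over the same input;
# the experiment replaces lstmp-layer with fast-lstmp-layer:
fast-lstmp-layer name=blstm1-forward input=lda cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=-3
fast-lstmp-layer name=blstm1-backward input=lda cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=3
# deeper layers take both directions as input:
#   input=Append(blstm1-forward, blstm1-backward)
```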

@danpovey (Contributor)

Thanks! Are these numbers better than the TDNN+fast-LSTM numbers?
I wonder whether soft links should be added (but this might depend on what the numbers look like; if the difference is not much, I might not want to emphasize these types of systems too much).
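Here "soft links" refers to the egs convention of pointing a stable top-level script at the current best tuning variant; a hypothetical example (the tuning script name is illustrative):

```
# hypothetical names, following the swbd chain recipe convention:
cd egs/swbd/s5c/local/chain
ln -s tuning/run_blstm_6j.sh run_blstm.sh
```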

@danpovey (Contributor)

Regarding the effect of fast-LSTM, when I looked at the numbers:

                        [normal LSTM impl]  [fast LSTM impl]
WER on train_dev(tg)         13.80               13.25
WER on train_dev(fg)         12.64               12.27
WER on eval2000(tg)          15.6                15.7
WER on eval2000(fg)          14.2                14.5

I saw an improvement, not a degradation. train_dev and eval2000 are about the same size so it's appropriate to average the numbers.
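Averaging the two sets for each language model makes the comparison concrete:

```
tg: normal (13.80 + 15.6) / 2 = 14.70    fast (13.25 + 15.7) / 2 = 14.48
fg: normal (12.64 + 14.2) / 2 = 13.42    fast (12.27 + 14.5) / 2 = 13.39
```

so the fast implementation comes out slightly ahead on both.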

@vijayaditya (Contributor, Author)

Updated the soft links and added a new model type called tdnn_blstm. As the next step, I will update this new model to follow tdnn_lstm_1c, but with bidirectional layers.
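Structurally, tdnn_blstm places spliced feed-forward layers below the recurrent pairs. A rough sketch of the layout, with illustrative splicing and dimensions rather than the committed values:

```
relu-renorm-layer name=tdnn1 dim=1024 input=Append(-2,-1,0,1,2)
relu-renorm-layer name=tdnn2 dim=1024 input=Append(-1,0,1)
relu-renorm-layer name=tdnn3 dim=1024 input=Append(-1,0,1)
fast-lstmp-layer name=blstm1-forward cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=-3
fast-lstmp-layer name=blstm1-backward cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=3
```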

@danpovey (Contributor)

ok, thanks-- merging.

danpovey merged commit 5c98096 into kaldi-asr:master on Mar 16, 2017
@vijayaditya (Contributor, Author)

BTW, these are still worse than BLSTMs. This experimentation was done to verify whether the gains seen in TDNN+LSTMs were due to the higher sampling rate at the lower layers (i.e., splicing of -1,0,1) rather than to actual modeling of right context by the TDNNs. Preliminary evidence suggests that the better results are not just due to higher sampling rates.

@freewym It might be better to also commit your BLSTM recipe with [-1,1] [-3,3] [-3,3] delays, to give context for this experiment.
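The point of that recipe is the per-layer recurrence delays; a sketch of the idea with illustrative layer names, assuming three BLSTM layers (a smaller delay at the bottom layer means the recurrence is evaluated at a higher frame rate):

```
# layer 1: delay +/-1, i.e., recurrence at every frame
fast-lstmp-layer name=blstm1-forward  ... delay=-1
fast-lstmp-layer name=blstm1-backward ... delay=1
# layers 2 and 3: the usual delay of +/-3
fast-lstmp-layer name=blstm2-forward  ... delay=-3
fast-lstmp-layer name=blstm2-backward ... delay=3
fast-lstmp-layer name=blstm3-forward  ... delay=-3
fast-lstmp-layer name=blstm3-backward ... delay=3
```

(the "..." stands for the cell and projection dimensions, elided here)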

@vijayaditya (Contributor, Author)

Sorry, I meant to say these results are worse than TDNN+LSTMs.

@danpovey (Contributor) commented Mar 16, 2017 via email

@vijayaditya (Contributor, Author) commented Mar 16, 2017 via email

@danpovey (Contributor) commented Mar 16, 2017 via email

@vijayaditya (Contributor, Author) commented Mar 16, 2017 via email

@freewym (Contributor) commented Mar 16, 2017

FYI, the results of my previous experiments comparing [-1,1] [-3,3] [-3,3] delays with [-3,3] [-3,3] [-3,3] delays using fastblstm, alongside tdnn+fastlstm, are:

System                  fastblstm_133   fastblstm   tdnn+fastlstm
WER on train_dev(tg)        13.19         13.59         13.17
WER on train_dev(fg)        12.24         12.45         12.28
WER on eval2000(tg)         15.0          15.8          15.5
WER on eval2000(fg)         13.5          14.3          14.1

where fastblstm_133 is better than tdnn+fastlstm.

@vijayaditya (Contributor, Author) commented Mar 16, 2017 via email

@freewym (Contributor) commented Mar 16, 2017

I didn't try the non-fast version of BLSTM_133.

@danpovey (Contributor) commented Mar 16, 2017 via email
