[Streaming] Reproducing librispeech results - RNN-T emformer #383
The model was trained using #278
I am using
Thank you.
No. Time warp is used, as in other recipes in icefall. You can see that asr_datamodule.py is a symlink.
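(For reference, a minimal sketch of how time warp is typically enabled through Lhotse's `SpecAugment` inside the shared data module; the parameter values below are the library defaults, shown for illustration, and are not necessarily the ones used by this recipe.)

```python
from lhotse.dataset import SpecAugment

# Illustrative values only -- the recipe's asr_datamodule.py (a symlink to the
# shared LibriSpeech data module) defines the actual configuration.
spec_augment = SpecAugment(
    time_warp_factor=80,    # a non-zero value enables time warping
    num_feature_masks=2,
    features_mask_size=27,
    num_frame_masks=10,
    frames_mask_size=100,
)
```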
Please use the changes from that PR directly, not the latest master and not the latest streaming branch. I find that #358 makes the WER slightly worse.
Back when @glynpu was doing streaming stuff based on WeNet ideas, he found that it was necessary to append some silence to force out the final symbols. That is probably why the padding fix is hurting.
Yes, dummy extra trailing silence indeed helps, at least for that model. See #242
Thanks! I looked at the decoding results. At the end of sentences, some tokens at the end of a word are missing. Doing tail padding (with length equal to left_context_length) mitigates the problem. To be concrete, attached are the decoding results (with WERs) before and after tail padding. after-tail-padding-errs-test-clean-greedy_search-epoch-26-avg-6-context-2-max-sym-per-frame-1.txt See #384
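(For anyone landing here, a rough sketch of the tail-padding idea, assuming `(batch, time, feature_dim)` log-mel features; the function name and the choice of pad value are illustrative and not the exact code from #384.)

```python
import math

import torch
import torch.nn.functional as F

# Illustrative: pad with a very small log-mel value so the extra frames
# look like silence to the model.
LOG_EPS = math.log(1e-10)


def add_tail_padding(
    features: torch.Tensor,      # (batch, time, feature_dim)
    feature_lens: torch.Tensor,  # (batch,)
    tail_length: int,            # e.g. equal to the left context length
):
    """Append `tail_length` frames of dummy trailing silence to each
    utterance so the streaming decoder can emit the final symbols."""
    features = F.pad(features, (0, 0, 0, tail_length), value=LOG_EPS)
    feature_lens = feature_lens + tail_length
    return features, feature_lens
```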
cool!
Hello,
I tried to reproduce your results on librispeech streaming: #278 (comment)
I am not done with my hyperparameter search, but I was not able to get even close to the reported results.
Do you recall the configuration you used for this training?
Best,