[Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) #349
The best results obtained by training on the L subset (better than other public results at present):
README.md
@@ -20,6 +20,11 @@ We provide 6 recipes at present:
- [TIMIT][timit]
- [TED-LIUM3][tedlium3]
- [GigaSpeech][gigaspeech]
<<<<<<< HEAD
Please resolve the conflicts.
Done.
| fast beam search (set as default) | 19.31 | 24.41 | 34.87 | --epoch 29, --avg 24, --max-duration 1500 |
A pre-trained model and decoding logs can be found at <https://huggingface.co/luomingshuang/icefall_asr_wenetspeech_pruned_transducer_stateless2>
Could you also add the pretrained models and decoding results for the M and S subsets to huggingface?
Also, could you upload the training logs to huggingface?
OK, no problem.
Thanks!
@luomingshuang How long does it take to complete WenetSpeech training with the L subset?
--training-subset L
```
The tensorboard training log can be found at
You can find the tensorboard log here. According to this log, training for 17 epochs takes about 21 days and 16.5 hours. Based on my results, I think training can be stopped after 11 epochs.
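The figures in the reply above can be turned into a rough per-epoch estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes every epoch takes the same wall-clock time, which the log may not bear out, and the helper name `estimate_hours` is hypothetical rather than anything from the recipe.

```python
# Figures quoted from the reply above; everything else is illustrative.
TOTAL_HOURS = 21 * 24 + 16.5  # 21 days and 16.5 hours, in hours
EPOCHS_TRAINED = 17
EPOCHS_SUGGESTED = 11


def estimate_hours(total_hours: float, epochs_trained: int, epochs_target: int) -> float:
    """Scale wall-clock time linearly, assuming constant time per epoch."""
    return total_hours / epochs_trained * epochs_target


hours_per_epoch = TOTAL_HOURS / EPOCHS_TRAINED
hours_for_11 = estimate_hours(TOTAL_HOURS, EPOCHS_TRAINED, EPOCHS_SUGGESTED)
days, rem = divmod(hours_for_11, 24)

print(
    f"{hours_per_epoch:.1f} h/epoch; "
    f"{EPOCHS_SUGGESTED} epochs ~ {int(days)} d {rem:.1f} h"
)
```

Under that assumption, one epoch on the L subset costs roughly 30.6 hours, so stopping at 11 epochs would take about 14 days instead of nearly 22.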
Thanks.
Actually, there is an existing PR #314 with pruned transducer stateless2 for WenetSpeech, but that PR can be regarded as a draft. This PR is intended to be merged into master. Training with the L subset is still in progress. (The results are already better than other methods at present, and performance is still improving slightly. I will update the results in RESULT.md.) This doesn't affect the code review for this PR. The greedy_search_new and modified_beam_search_new are from PR #358.
Results: