[Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) #349

luomingshuang · 2022-05-06T09:45:08Z

PR #314 already contains a pruned transducer stateless2 recipe for WenetSpeech, but it should be regarded as a draft; this PR is the one intended to be merged into master. Training with the L subset is still in progress. (Its results are already better than other methods at present and are still improving slightly; I will update them in RESULTS.md.) This does not affect the code review for this PR. The greedy_search_new and modified_beam_search_new decoding methods are from PR #358.
Results:

Subset_for_training	decoding_method	epoch	avg	pruned RNN-T (test-net/test-meet)	reworked model (dev/test-net/test-meet)
M	greedy_search	29	11	-/-	10.40/11.31/19.64
M	modified_beam_search	29	11	-/-	9.85/11.04/18.20
M	fast_beam_search	29	11	-/-	10.18/11.10/19.32
M	/	/	/	kaldi	9.81/14.19/28.22
S	greedy_search	29	24	-/-	19.92/25.20/35.35
S	modified_beam_search	29	24	-/-	18.62/23.88/33.80
S	fast_beam_search	29	24	-/-	19.31/24.41/34.87
S	/	/	/	kaldi	11.70/17.47/37.27
L	greedy_search	10	2	-/-	7.80/8.78/13.49
L	greedy_search_new	10	2	-/-	7.80/8.76/13.50
L	modified_beam_search	10	2	-/-	7.76/8.82/13.41
L	modified_beam_search_new	10	2	-/-	7.76/8.72/13.41
L	fast_beam_search	10	2	-/-	7.94/8.73/13.81
L	/	/	/	kaldi	9.07/12.83/24.72
L	/	/	/	wenet	8.88/9.70/15.59
L	/	/	/	espnet	9.70/8.90/15.90
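
To make the comparison in the L rows easier to read, here is a minimal sketch that computes the relative error reduction of modified_beam_search_new over the wenet row. The numbers are copied from the table above; the helper name `relative_reduction` is hypothetical and only for illustration, not part of the recipe.

```python
# Error rates (dev, test-net, test-meet) copied from the L rows of the table above.
WENET = (8.88, 9.70, 15.59)
MODIFIED_BEAM_SEARCH_NEW = (7.76, 8.72, 13.41)


def relative_reduction(baseline: float, ours: float) -> float:
    """Relative error reduction in percent (hypothetical helper)."""
    return (baseline - ours) / baseline * 100.0


reductions = [
    relative_reduction(b, o) for b, o in zip(WENET, MODIFIED_BEAM_SEARCH_NEW)
]
for name, r in zip(("dev", "test-net", "test-meet"), reductions):
    print(f"{name}: {r:.1f}% relative reduction")
```

Under these figures the reduction is roughly 10 to 14 percent relative on each of the three sets.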

luomingshuang · 2022-05-19T16:34:56Z

The best results trained with L subset (better than other public results at present):

	dev	test-net	test-meeting	comment
greedy search	7.80	8.75	13.49	--epoch 10, --avg 2, --max-duration 100
modified beam search (beam size 4)	7.76	8.71	13.41	--epoch 10, --avg 2, --max-duration 100
fast beam search (set as default)	7.94	8.74	13.80	--epoch 10, --avg 2, --max-duration 1500

csukuangfj · 2022-05-19T16:45:20Z

README.md

@@ -20,6 +20,11 @@ We provide 6 recipes at present:
  - [TIMIT][timit]
  - [TED-LIUM3][tedlium3]
  - [GigaSpeech][gigaspeech]
+<<<<<<< HEAD


Please resolve the conflicts.

csukuangfj · 2022-05-23T09:12:36Z

egs/wenetspeech/ASR/RESULTS.md

+| fast beam search (set as default)  | 19.31  | 24.41     | 34.87         | --epoch 29, --avg 24, --max-duration 1500 |
+
+
+A pre-trained model and decoding logs can be found at <https://huggingface.co/luomingshuang/icefall_asr_wenetspeech_pruned_transducer_stateless2>


Could you also add the pretrained models and decoding results for the M and S subset to huggingface?

Also, could you upload the training logs to huggingface?

OK, no problem.

csukuangfj · 2022-05-23T09:12:54Z

Thanks!

pingfengluo · 2022-06-14T06:18:16Z

@luomingshuang How long does it take to complete WenetSpeech training with the L subset?

luomingshuang · 2022-06-14T06:39:52Z

egs/wenetspeech/ASR/RESULTS.md

+  --training-subset L
+```
+
+The tensorboard training log can be found at


You can find the tensorboard log at the link below. According to this log, 17 epochs take about 21 days and 16.5 hours. Based on my results, I think training can be stopped after 11 epochs.

https://tensorboard.dev/experiment/wM4ZUNtASRavJx79EOYYcg/#scalars
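
As a rough sanity check on the figures quoted above, the following sketch converts the logged total into a per-epoch cost and an estimate for stopping at 11 epochs. It assumes every epoch takes the same wall-clock time, which the log may not bear out exactly; the function name `estimated_hours` is hypothetical.

```python
# Figures quoted in the comment above: 17 epochs took about 21 days 16.5 hours.
TOTAL_HOURS = 21 * 24 + 16.5  # 520.5 hours
EPOCHS_LOGGED = 17


def estimated_hours(num_epochs: int) -> float:
    """Estimate wall-clock hours, assuming a constant time per epoch."""
    return TOTAL_HOURS / EPOCHS_LOGGED * num_epochs


hours_per_epoch = estimated_hours(1)
hours_for_11 = estimated_hours(11)
print(f"{hours_per_epoch:.1f} hours per epoch")
print(f"{hours_for_11 / 24:.1f} days for 11 epochs")
```

Under that assumption, stopping at 11 epochs would take about two weeks instead of more than three.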

pingfengluo · 2022-06-14T11:52:16Z

Thanks.

luomingshuang added 4 commits May 6, 2022 17:28

add char-based pruned-rnnt2 for wenetspeech

0eff3de

style check

cdb6591

style check

c3b2a52

change for export.py

1b51f18

luomingshuang changed the title ~~[WIP] Pruned Transducer Stateless2 for WenetSpeech (char-based)~~ Pruned Transducer Stateless2 for WenetSpeech (char-based) May 6, 2022

luomingshuang added 2 commits May 7, 2022 19:51

do some changes

da78063

do some changes

7fb31c1

luomingshuang changed the title ~~Pruned Transducer Stateless2 for WenetSpeech (char-based)~~ [Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) May 19, 2022

luomingshuang added the ready label May 19, 2022

solve the conflicts

34f859f

csukuangfj reviewed May 19, 2022

luomingshuang added 3 commits May 20, 2022 00:48

a small change for .flake8

3925eae

solve the conflicts

afdda05

code style check

0d8fa6a

luomingshuang added ready and removed ready labels May 23, 2022

csukuangfj reviewed May 23, 2022

csukuangfj merged commit 0e57b30 into k2-fsa:master May 23, 2022

luomingshuang commented Jun 14, 2022

wgb14 mentioned this pull request Jul 31, 2022

[WIP] Pruned-transducer-stateless5-for-WenetSpeech (offline and streaming) #447

Merged
