We used the same LM as in the previous report.
- Model files (archived to model.tar.gz by $ pack_model.sh)
- model link: (pretrained model)
- training config file:
conf/tuning/conformer/train_pytorch_conformer_large.yaml
- decoding config file:
conf/decode.yaml
- cmvn file:
data/train_sp/cmvn.ark
- e2e file:
exp/train_960_pytorch_train_pytorch_conformer_transfer_specaug/results/model.val10.avg.best
- e2e JSON file:
exp/train_960_pytorch_train_pytorch_conformer_transfer_specaug/results/model.json
- lm file:
exp/train_rnnlm_transformer/rnnlm.model.best
- lm JSON file:
exp/train_rnnlm_transformer/model.json
- dict file:
data/lang_char
- Results (paste them yourself or obtain them with $ pack_model.sh --results <results>)
exp/train_960_pytorch_train_pytorch_conformer_transfer_specaug/decode_dev_clean_model.val10.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2703 54402 | 98.3 1.6 0.2 0.2 1.9 26.2 |
exp/train_960_pytorch_train_pytorch_conformer_transfer_specaug/decode_dev_other_model.val10.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2864 50948 | 95.6 3.9 0.5 0.5 4.9 41.4 |
exp/train_960_pytorch_train_pytorch_conformer_transfer_specaug/decode_test_clean_model.val10.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2620 52576 | 98.1 1.7 0.2 0.3 2.1 25.9 |
exp/train_960_pytorch_train_pytorch_conformer_transfer_specaug/decode_test_other_model.val10.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2939 52343 | 95.6 3.9 0.5 0.5 4.9 44.0 |
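The `Sum/Avg` rows above follow the sclite layout, where `Err` is the word error rate and equals `Sub + Del + Ins`. Because each percentage is rounded to one decimal, the printed values can disagree by about 0.1 (for dev_clean, 1.6 + 0.2 + 0.2 = 2.0 against a reported 1.9). The following is a minimal sketch for reading such a row; the function name `parse_sum_avg` and the field names are hypothetical and are not part of ESPnet or sclite.

```python
import re

def parse_sum_avg(line):
    """Parse one sclite 'Sum/Avg' row into a dict of named fields."""
    # Keep only the numeric tokens; '|' separators and the row label are dropped.
    nums = re.findall(r"\d+(?:\.\d+)?", line)
    keys = ["snt", "wrd", "corr", "sub", "del", "ins", "err", "s_err"]
    if len(nums) != len(keys):
        raise ValueError(f"expected {len(keys)} numbers, got {len(nums)}")
    row = dict(zip(keys, (float(n) for n in nums)))
    # Sentence and word counts are integers.
    row["snt"], row["wrd"] = int(row["snt"]), int(row["wrd"])
    return row

row = parse_sum_avg("| Sum/Avg | 2703 54402 | 98.3 1.6 0.2 0.2 1.9 26.2 |")
# Recompute the error rate from its components; rounding allows a small gap.
recomputed = row["sub"] + row["del"] + row["ins"]
print(row["err"], round(recomputed, 1))
```

When comparing systems, it is safer to compare the reported `Err` column directly than the recomputed sum, since the reported value is rounded only once.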
We used the same LM as in the previous report.
- Environments
- python version:
3.8.3 (default) [GCC 7.3.0]
- espnet version:
espnet 0.9.2
- chainer version:
chainer 6.0.0
- pytorch version:
pytorch 1.4.0
- Model files (archived to model.tar.gz by $ pack_model.sh)
- model link: (pretrained model)
- training config file:
conf/tuning/conformer/train_pytorch_conformer_large.yaml
- decoding config file:
conf/decode.yaml
- cmvn file:
data/train_960/cmvn.ark
- e2e file:
exp/train_960_pytorch_train_pytorch_conformer_large_specaug/results/model.val5.avg.best
- e2e JSON file:
exp/train_960_pytorch_train_pytorch_conformer_large_specaug/results/model.json
- lm file:
exp/train_rnnlm_transformer/rnnlm.model.best
- lm JSON file:
exp/train_rnnlm_transformer/model.json
- dict file:
data/lang_char
- Results (paste them yourself or obtain them with $ pack_model.sh --results <results>)
exp/train_960_pytorch_train_pytorch_conformer_large_specaug/decode_dev_clean_model.val5.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2703 54402 | 98.2 1.6 0.2 0.2 2.0 26.3 |
exp/train_960_pytorch_train_pytorch_conformer_large_specaug/decode_dev_other_model.val5.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2864 50948 | 95.6 3.9 0.5 0.5 4.9 40.8 |
exp/train_960_pytorch_train_pytorch_conformer_large_specaug/decode_test_clean_model.val5.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2620 52576 | 98.1 1.7 0.2 0.3 2.2 26.6 |
exp/train_960_pytorch_train_pytorch_conformer_large_specaug/decode_test_other_model.val5.avg.best_decode_transformer/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2939 52343 | 95.3 4.1 0.6 0.6 5.3 44.8 |
- Environments
- python version:
3.8.3 (default) [GCC 7.3.0]
- espnet version:
espnet 0.10.7a1
- chainer version:
chainer 6.0.0
- pytorch version:
pytorch 1.10.0
- Model files (archived to model.tar.gz by $ pack_model.sh)
- model link: (pretrained model)
- training config file:
conf/tuning/transducer/train_conformer-rnn_transducer.yaml
- decoding config file:
conf/tuning/transducer/decode.yaml
- cmvn file:
data/train_sp/cmvn.ark
- e2e file:
exp/train_960_pytorch_transducer_train_conformer-rnn_transducer/results/model.last10.avg.best
- e2e JSON file:
exp/train_960_pytorch_transducer_train_conformer-rnn_transducer/results/model.json
- dict file:
data/lang_char
- Results (paste them yourself or obtain them with $ pack_model.sh --results <results>)
exp/train_960_pytorch_transducer_train_conformer-rnn_transducer/decode_dev_clean_model.last10.avg.best/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2703 54402 | 97.6 2.2 0.2 0.3 2.7 33.0 |
exp/train_960_pytorch_transducer_train_conformer-rnn_transducer/decode_dev_other_model.last10.avg.best/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2864 50948 | 93.7 5.7 0.6 0.7 7.0 52.8 |
exp/train_960_pytorch_transducer_train_conformer-rnn_transducer/decode_test_clean_model.last10.avg.best/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2620 52576 | 97.4 2.3 0.3 0.3 2.9 33.1 |
exp/train_960_pytorch_transducer_train_conformer-rnn_transducer/decode_test_other_model.last10.avg.best/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2939 52343 | 93.7 5.6 0.7 0.8 7.1 55.1 |
| | | # Snt | # Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|---|
exp/train_960_pytorch_train_pytorch_LC_specaug/decode_dev_clean_model.val5.avg.best_decode_lm/result.wrd.txt: | Sum/Avg | 2703 | 54402 | 96.9 | 2.8 | 0.3 | 0.3 | 3.4 | 39.0 |
exp/train_960_pytorch_train_pytorch_SA-DC_specaug/decode_dev_other_model.val5.avg.best_decode_lm/result.wrd.txt: | Sum/Avg | 2864 | 50948 | 92.7 | 6.5 | 0.8 | 0.9 | 8.2 | 55.9 |
exp/train_960_pytorch_train_pytorch_DC_specaug/decode_test_clean_model.val5.avg.best_decode_lm/result.wrd.txt: | Sum/Avg | 2620 | 52576 | 96.9 | 2.9 | 0.3 | 0.4 | 3.5 | 37.9 |
exp/train_960_pytorch_train_pytorch_SA-DC2D_specaug/decode_test_other_model.val5.avg.best_decode_lm/result.wrd.txt: | Sum/Avg | 2939 | 52343 | 92.5 | 6.7 | 0.8 | 1.0 | 8.5 | 60.2 |
We used the same ASR model as in the previous report.
- Environments
- date:
Tue Feb 4 14:50:50 JST 2020
- python version:
3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
- espnet version:
espnet 0.6.0
- chainer version:
chainer 6.0.0
- pytorch version:
pytorch 1.0.1.post2
- Git hash:
83799e69a0269450587a6857882c73bfb27551d5
- Commit date:
Tue Feb 4 14:21:11 2020 +0900
- Model files (archived to model.tar.gz by $ pack_model.sh)
- model link: https://drive.google.com/open?id=1RHYAhcnlKz08amATrf0ZOWFLzoQphtoc
- training config file:
./conf/train.yaml
- decoding config file:
./conf/decode.yaml
- cmvn file:
./data/train_960/cmvn.ark
- e2e file:
./librispeech.transformer.v1/exp/train_960_pytorch_train_pytorch_transformer.v1_aheads8_batch-bins15000000_specaug/results/model.val5.avg.best
- e2e JSON file:
./librispeech.transformer.v1/exp/train_960_pytorch_train_pytorch_transformer.v1_aheads8_batch-bins15000000_specaug/results/model.json
- lm file:
./exp/train_rnnlm_pytorch_lm_transformer_cosine_batchsize32_lr1e-4_layer16_unigram5000_ngpu4/rnnlm.model.best
- lm JSON file:
./exp/train_rnnlm_pytorch_lm_transformer_cosine_batchsize32_lr1e-4_layer16_unigram5000_ngpu4/model.json
- dict file:
./data/lang_char
- Results (paste them yourself or obtain them with $ pack_model.sh --results <results>)
./exp/train_rnnlm_pytorch_lm_transformer_cosine_batchsize32_lr1e-4_layer16_unigram5000_ngpu4/decode_dev_clean_decode_ep43/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2703 54402 | 98.1 1.7 0.2 0.2 2.1 26.9 |
./exp/train_rnnlm_pytorch_lm_transformer_cosine_batchsize32_lr1e-4_layer16_unigram5000_ngpu4/decode_dev_other_decode_ep43/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2864 50948 | 95.3 4.2 0.5 0.6 5.3 43.8 |
./exp/train_rnnlm_pytorch_lm_transformer_cosine_batchsize32_lr1e-4_layer16_unigram5000_ngpu4/decode_test_clean_decode_ep43/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2620 52576 | 97.8 1.9 0.2 0.3 2.5 28.3 |
./exp/train_rnnlm_pytorch_lm_transformer_cosine_batchsize32_lr1e-4_layer16_unigram5000_ngpu4/decode_test_other_decode_ep43/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2939 52343 | 95.1 4.3 0.6 0.6 5.5 46.7 |
- Model files (archived to train_960_pytorch_train_pytorch_transformer_large_ngpu4_specaug.tar.gz by $ pack_model.sh)
- model link: https://drive.google.com/open?id=1BtQvAnsFvVi-dp_qsaFP7n4A_5cwnlR6
- training config file:
conf/tuning/train_pytorch_transformer_large_ngpu4.yaml
- decoding config file:
conf/tuning/decode_pytorch_transformer_large.yaml
- cmvn file:
data/train_960/cmvn.ark
- e2e file:
exp/train_960_pytorch_train_pytorch_transformer_large_ngpu4_specaug/results/model.val5.avg.best
- e2e JSON file:
exp/train_960_pytorch_train_pytorch_transformer_large_ngpu4_specaug/results/model.json
- lm file:
exp/irielm.ep11.last5.avg/rnnlm.model.best
- lm JSON file:
exp/irielm.ep11.last5.avg/model.json
- date:
Thu Jul 18 16:15:33 JST 2019
- python version:
3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
- espnet version:
espnet 0.4.0
- chainer version:
chainer 6.0.0
- pytorch version:
pytorch 1.0.1.post2
- Git hash:
f9f40861423ba9a9c9f5a45bd4369dbdb9b3bbf9
- Commit date:
Thu Jul 18 15:40:51 2019 +0900
dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|
decode_dev_clean_model.val5.avg.best_decode_pytorch_transformer_large_lm_large | 2703 | 54402 | 98.0 | 1.8 | 0.2 | 0.2 | 2.2 | 27.9 |
decode_dev_other_model.val5.avg.best_decode_pytorch_transformer_large_lm_large | 2864 | 50948 | 95.1 | 4.3 | 0.6 | 0.6 | 5.6 | 44.9 |
decode_test_clean_model.val5.avg.best_decode_pytorch_transformer_large_lm_large | 2620 | 52576 | 97.7 | 2.0 | 0.3 | 0.3 | 2.6 | 29.9 |
decode_test_other_model.val5.avg.best_decode_pytorch_transformer_large_lm_large | 2939 | 52343 | 95.0 | 4.4 | 0.6 | 0.6 | 5.7 | 47.7 |
- Environments (obtained by $ get_sys_info.sh)
- date:
Wed Jun 19 16:58:42 EDT 2019
- system information:
Linux b14 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux
- python version:
Python 3.7.3
- espnet version:
espnet 0.3.1
- chainer version:
chainer 6.0.0
- pytorch version:
pytorch 1.0.1.post2
- Git hash:
b32af59f229b54801a2cf7e4b8a48cadccd5fe5a
- Model files (archived to model.v1.tar.gz by $ pack_model.sh)
- model link: https://drive.google.com/open?id=1bOaOEIZBveERti0x6mnBYiNsn6MSRd2E
- training config file:
conf/tuning/train_pytorch_transformer_lr5.0_ag8.v2.yaml
- decoding config file:
conf/tuning/decode_pytorch_transformer.yaml
- cmvn file:
data/train_960/cmvn.ark
- e2e file:
exp/train_960_pytorch_train_pytorch_transformer_lr5.0_ag8.v2/results/model.last10.avg.best
- e2e JSON file:
exp/train_960_pytorch_train_pytorch_transformer_lr5.0_ag8.v2/results/model.json
- lm file:
exp/train_rnnlm_pytorch_lm_unigram5000/rnnlm.model.best
- lm JSON file:
exp/train_rnnlm_pytorch_lm_unigram5000/model.json
- Results (paste them yourself or obtain them with $ pack_model.sh --results <results>)
exp/train_960_pytorch_train_pytorch_transformer_lr5.0_ag8.v2/decode_dev_clean_decode_pytorch_transformer_lm/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2703 54402 | 96.7 2.9 0.3 0.4 3.7 38.5 |
exp/train_960_pytorch_train_pytorch_transformer_lr5.0_ag8.v2/decode_dev_other_decode_pytorch_transformer_lm/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2864 50948 | 91.4 7.7 0.9 1.3 9.8 59.7 |
exp/train_960_pytorch_train_pytorch_transformer_lr5.0_ag8.v2/decode_test_clean_decode_pytorch_transformer_lm/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2620 52576 | 96.5 3.1 0.4 0.5 4.0 38.3 |
exp/train_960_pytorch_train_pytorch_transformer_lr5.0_ag8.v2/decode_test_other_decode_pytorch_transformer_lm/result.wrd.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 2939 52343 | 91.3 7.8 0.9 1.3 10.0 62.8 |
train_960_pytorch_transformer_conv2d_e12_unit2048_d6_unit2048_aheads4_dim256_mtlalpha0.3_noam_sampprob0.0_ngpu3_bs32_lr10.0_warmup25000_mli512_mlo150_epochs100_accum2_lennormFalse_lsmunigram0.1/
decode_dev_clean_beam20_emodel.last10.avg.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024/result.wrd.txt: 3.8
decode_dev_other_beam20_emodel.last10.avg.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024/result.wrd.txt: 9.9
decode_test_clean_beam20_emodel.last10.avg.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024/result.wrd.txt: 4.2
decode_test_other_beam20_emodel.last10.avg.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024/result.wrd.txt: 9.8
pytorch VGG-3BLSTM 1024 units, #BPE 5000, latest RNNLM training with tuned decoding (ctc_weight=0.5, lm_weight=0.7), dropout 0.2
train_960_pytorch_vggblstm_e5_subsample1_2_2_1_1_unit1024_proj1024_d2_unit1024_location_aconvc10_aconvf100_mtlalpha0.5_drop0.2_adadelta_sampprob0.0_bs20_mli800_mlo150
decode_dev_clean_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 4.0
decode_dev_other_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 12.3
decode_test_clean_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 4.0
decode_test_other_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 12.7
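The `ctc_weight` and `lm_weight` settings quoted above control how hypothesis scores are combined during beam search. One common formulation of joint CTC/attention decoding with an external LM is a log-linear interpolation, sketched below. This is an illustration under that assumption, not a copy of the recipe's decoder; the function name `joint_score` and the log-probabilities in the example are made up.

```python
def joint_score(att_logp, ctc_logp, lm_logp, ctc_weight=0.5, lm_weight=0.7):
    """Combine attention, CTC and LM log-probabilities for one hypothesis.

    Assumed form: (1 - ctc_weight) * attention + ctc_weight * CTC
    + lm_weight * LM, all in the log domain.
    """
    return (
        (1.0 - ctc_weight) * att_logp
        + ctc_weight * ctc_logp
        + lm_weight * lm_logp
    )

# Two hypothetical hypotheses as (attention, CTC, LM) log-probabilities.
hyps = {
    "hyp_a": (-1.0, -2.0, -3.0),
    "hyp_b": (-1.2, -1.5, -2.0),
}
scores = {name: joint_score(*logps) for name, logps in hyps.items()}
# The hypothesis with the highest combined score is kept.
best = max(scores, key=scores.get)
print(best, scores[best])
```

With `lm_weight=0.0` the LM term vanishes, which corresponds to decoding without an external language model.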
pytorch VGG-3BLSTM 1024 units, #BPE 5000, latest RNNLM training with tuned decoding (ctc_weight=0.5, lm_weight=0.7)
train_960_pytorch_vggblstm_e5_subsample1_2_2_1_1_unit1024_proj1024_d2_unit1024_location_aconvc10_aconvf100_mtlalpha0.5_adadelta_sampprob0.0_bs20_mli800_mlo150
decode_dev_clean_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 4.2
decode_dev_other_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 12.5
decode_test_clean_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 4.2
decode_test_other_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.7_1layer_unit1024_sgd_bs1024: 13.6
pytorch VGG-3BLSTM 1024 units, #BPE 5000 more layers with tuned decoding (ctc_weight=0.5, lm_weight=0.5)
train_960_vggblstm_e5_subsample1_2_2_1_1_unit1024_proj1024_d2_unit1024_location1024_aconvc10_aconvf100_mtlalpha0.5_adadelta_bs24_mli800_mlo150_unigram5000
decode_dev_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.5: 4.5
decode_dev_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.5: 13.0
decode_test_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.5: 4.6
decode_test_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.5: 13.7
pytorch VGG-3BLSTM 1024 units, #BPE 2000 (motivated by the RWTH setup, thanks to Albert Zeyer, Rohit Prabhavalkar, and Kazuki Irie for their comments)
train_960_vggblstm_e4_subsample1_2_2_1_1_unit1024_proj1024_d1_unit1024_location1024_aconvc10_aconvf100_mtlalpha0.5_adadelta_bs32_mli800_mlo150_unigram2000
decode_dev_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3_rnnlm0.3: 5.0
decode_dev_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3_rnnlm0.3: 14.3
decode_test_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3_rnnlm0.3: 5.0
decode_test_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3_rnnlm0.3: 14.9
decode_dev_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.txt:| 2.9 (2.7 w/ 0.2, 2.7 w/ 0.3, 2.7 w/ 0.4)
decode_dev_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.txt:| 9.6 (9.2 w/ 0.2, 9.1 w/ 0.3, 9.0 w/ 0.4)
decode_test_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.txt:| 2.7 (2.6 w/ 0.2, 2.6 w/ 0.3, 2.6 w/ 0.4)
decode_test_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.txt:| 9.9 (9.6 w/ 0.2, 9.4 w/ 0.3, 9.3 w/ 0.4)
decode_dev_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.wrd.txt:| 7.7 (7.2 w/ 0.2, 7.1 w/ 0.3, 7.2 w/ 0.4)
decode_dev_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.wrd.txt:| 21.1 (19.6 w/ 0.2, 19.2 w/ 0.3, 18.9 w/ 0.4)
decode_test_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.wrd.txt:| 7.7 (7.2 w/ 0.2, 7.2 w/ 0.3, 7.1 w/ 0.4)
decode_test_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.wrd.txt:| 21.9 (20.5 w/ 0.2, 20.0 w/ 0.3, 19.7 w/ 0.4)
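The decode directory names in these older results encode the decoding settings: `ctcw0.5_rnnlm0.7` matches the heading's `ctc_weight=0.5, lm_weight=0.7`. A small sketch for recovering those settings from a name follows. The pattern and the function `parse_decode_dir` are hypothetical, reading `beam20` as the beam size is an assumption, and the other fields (`p0.0`, `len0.0-0.0`) are deliberately left unparsed because their meaning is not stated here.

```python
import re

# Hypothetical pattern: only the fields needed for the sketch are captured.
DECODE_DIR = re.compile(
    r"decode_(?P<subset>(?:dev|test)_(?:clean|other))"
    r"_beam(?P<beam>\d+)"
    r".*?_ctcw(?P<ctc_weight>\d+(?:\.\d+)?)"
    r"(?:_rnnlm(?P<lm_weight>\d+(?:\.\d+)?))?"
)

def parse_decode_dir(name):
    """Extract subset, beam size, CTC weight and LM weight from a directory name."""
    m = DECODE_DIR.search(name)
    if m is None:
        raise ValueError(f"unrecognized decode directory name: {name!r}")
    lm = m.group("lm_weight")
    return {
        "subset": m.group("subset"),
        "beam": int(m.group("beam")),  # assumed to be the beam size
        "ctc_weight": float(m.group("ctc_weight")),
        # None when the name has no rnnlm field.
        "lm_weight": float(lm) if lm is not None else None,
    }

with_lm = parse_decode_dir(
    "decode_dev_clean_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.5_rnnlm0.5"
)
no_lm = parse_decode_dir(
    "decode_test_other_beam20_eacc.best_p0.0_len0.0-0.0_ctcw0.3/result.wrd.txt"
)
print(with_lm)
print(no_lm)
```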