Add ctc_decode.py for the model trained with rnnt-loss and ctc-loss (#12

) * Support running icefall outside of a git tracked directory. (k2-fsa#470) * Support running icefall outside of a git tracked directory. * Minor fixes. * Rand combine update result (k2-fsa#467) * update RESULTS.md * fix test code in pruned_transducer_stateless5/conformer.py * minor fix * delete doc * fix style * Simplified memory bank for Emformer (k2-fsa#440) * init files * use average value as memory vector for each chunk * change tail padding length from right_context_length to chunk_length * correct the files, ln -> cp * fix bug in conv_emformer_transducer_stateless2/emformer.py * fix doc in conv_emformer_transducer_stateless/emformer.py * refactor init states for stream * modify .flake8 * fix bug about memory mask when memory_size==0 * add @torch.jit.export for init_states function * update RESULTS.md * minor change * update README.md * modify doc * replace torch.div() with << * fix bug, >> -> << * use i&i-1 to judge if it is a power of 2 * minor fix * fix error in RESULTS.md * update multi_quantization installation (k2-fsa#469) * update multi_quantization installation * Update egs/librispeech/ASR/pruned_transducer_stateless6/train.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * [Ready] [Recipes] add aishell2 (k2-fsa#465) * add aishell2 * fix aishell2 * add manifest stats * update prepare char dict * fix lint * setting max duration * lint * change context size to 1 * update result * update hf link * fix decoding comment * add more decoding methods * update result * change context-size 2 default * [WIP] Rnn-T LM nbest rescoring (k2-fsa#471) * add compile_lg.py for aishell2 recipe (k2-fsa#481) * Add RNN-LM rescoring in fast beam search (k2-fsa#475) * fix for case of None stats * Update conformer.py for aishell4 (k2-fsa#484) * update conformer.py for aishell4 * update conformer.py * add strict=False when model.load_state_dict * CTC attention model with reworked Conformer encoder and reworked 
Transformer decoder (k2-fsa#462) * ctc attention model with reworked conformer encoder and reworked transformer decoder * remove unnecessary func * resolve flake8 conflicts * fix typos and modify the expr of ScaledEmbedding * use original beam size * minor changes to the scripts * add rnn lm decoding * minor changes * check whether q k v weight is None * check whether q k v weight is None * check whether q k v weight is None * style correction * update results * update results * upload the decoding results of rnn-lm to the RESULTS * upload the decoding results of rnn-lm to the RESULTS * Update egs/librispeech/ASR/RESULTS.md Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/librispeech/ASR/RESULTS.md Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update egs/librispeech/ASR/RESULTS.md Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update doc to add a link to Nadira Povey's YouTube channel. (k2-fsa#492) * Update doc to add a link to Nadira Povey's YouTube channel. * fix a typo * Add stats about duration and padding proportion (k2-fsa#485) * add stats about duration and padding proportion * add for utt_duration * add stats for other recipes * add stats for other 2 recipes * modify doc * minor change * Add modified_beam_search for streaming decode (k2-fsa#489) * Add modified_beam_search for pruned_transducer_stateless/streaming_decode.py * refactor * modified beam search for stateless3,4 * Fix comments * Add real streamng ci * Fix using G before assignment in pruned_transducer_stateless/decode.py (k2-fsa#494) * Support using aidatatang_200zh optionally in aishell training (k2-fsa#495) * Use aidatatang_200zh optionally in aishell training. * Fix get_transducer_model() for aishell. (k2-fsa#497) PR k2-fsa#495 introduces an error. This commit fixes it. 
* [WIP] Pruned-transducer-stateless5-for-WenetSpeech (offline and streaming) (k2-fsa#447) * pruned-rnnt5-for-wenetspeech * style check * style check * add streaming conformer * add streaming decode * changes codes for fast_beam_search and export cpu jit * add modified-beam-search for streaming decoding * add modified-beam-search for streaming decoding * change for streaming_beam_search.py * add README.md and RESULTS.md * change for style_check.yml * do some changes * do some changes for export.py * add some decode commands for usage * add streaming results on README.md * [debug] raise remind when git-lfs not available (k2-fsa#504) * [debug] raise remind when git-lfs not available * modify comment * correction for prepare.sh (k2-fsa#506) * Set overwrite=True when extracting features in batches. (k2-fsa#487) * correction for get rank id. (k2-fsa#507) * Fix no attribute 'data' error. * minor fixes * correction for get rank id. * Add other decoding methods (nbest, nbest oracle, nbest LG) for wenetspeech pruned rnnt2 (k2-fsa#482) * add other decoding methods for wenetspeech * changes for RESULTS.md * add ngram-lm-scale=0.35 results * set ngram-lm-scale=0.35 as default * Update README.md * add nbest-scale for flie name * Support dynamic chunk streaming training in pruned_transcuder_stateless5 (k2-fsa#454) * support dynamic chunk streaming training * Add simulate streaming decoding * Support streaming decoding * fix causal * Minor fixes * fix streaming decode; add results * liear_fst_with_self_loops (k2-fsa#512) * Support exporting to ONNX format (k2-fsa#501) * WIP: Support exporting to ONNX format * Minor fixes. * Combine encoder/decoder/joiner into a single file. * Revert merging three onnx models into a single one. It's quite time consuming to extract a sub-graph from the combined model. For instance, it takes more than one hour to extract the encoder model. * Update CI to test ONNX models. * Decode with exported models. * Fix typos. * Add more doc. 
* Remove ncnn as it is not fully tested yet. * Fix as_strided for streaming conformer. * Convert ScaledEmbedding to nn.Embedding for inference. (k2-fsa#517) * Convert ScaledEmbedding to nn.Embedding for inference. * Fix CI style issues. * Fix preparing char based lang and add multiprocessing for wenetspeech text segmentation (k2-fsa#513) * add multiprocessing for wenetspeech text segmentation * Fix preparing char based lang for wenetspeech * fix style Co-authored-by: WeijiZhuang <zhuangweiji@xiaomi.com> * change for pruned rnnt5 train.py (k2-fsa#519) * fix about tensorboard (k2-fsa#516) * fix metricstracker * fix style * Merging onnx models (k2-fsa#518) * add export function of onnx-all-in-one to export.py * add onnx_check script for all-in-one onnx model * minor fix * remove unused arguments * add onnx-all-in-one test * fix style * fix style * fix requirements * fix input/output names * fix installing onnx_graphsurgeon * fix instaliing onnx_graphsurgeon * revert to previous requirements.txt * fix minor * Fix loading sampler state dict. (k2-fsa#421) * Fix loading sampler state dict. * skip scan_pessimistic_batches_for_oom if params.start_batch > 0 * fix torchaudio version (k2-fsa#524) * fix torchaudio version * fix torchaudio version * Fix computing averaged loss in the aishell recipe. (k2-fsa#523) * Fix computing averaged loss in the aishell recipe. * Set find_unused_parameters optionally. 
* Sort results to make it more convenient to compare decoding results (k2-fsa#522) * Sort result to make it more convenient to compare decoding results * Add cut_id to recognition results * add cut_id to results for all recipes * Fix torch.jit.script * Fix comments * Minor fixes * Fix torch.jit.tracing for Pytorch version before v1.9.0 * Add function display_and_save_batch in wenetspeech/pruned_transducer_stateless2/train.py (k2-fsa#528) * Add function display_and_save_batch in egs/wenetspeech/ASR/pruned_transducer_stateless2/train.py * Modify function: display_and_save_batch * Delete empty line in pruned_transducer_stateless2/train.py * Modify code format * Filter non-finite losses (k2-fsa#525) * Filter non-finite losses * Fixes after review * propagate changes from k2-fsa#525 to other librispeech recipes (k2-fsa#531) * propaga changes from k2-fsa#525 to other librispeech recipes * refactor display_and_save_batch to utils * fixed typo * reformat code style * Fix not enough values to unpack error . 
(k2-fsa#533) * Use ScaledLSTM as streaming encoder (k2-fsa#479) * add ScaledLSTM * add RNNEncoderLayer and RNNEncoder classes in lstm.py * add RNN and Conv2dSubsampling classes in lstm.py * hardcode bidirectional=False * link from pruned_transducer_stateless2 * link scaling.py pruned_transducer_stateless2 * copy from pruned_transducer_stateless2 * modify decode.py pretrained.py test_model.py train.py * copy streaming decoding files from pruned_transducer_stateless2 * modify streaming decoding files * simplified code in ScaledLSTM * flat weights after scaling * pruned2 -> pruned4 * link __init__.py * fix style * remove add_model_arguments * modify .flake8 * fix style * fix scale value in scaling.py * add random combiner for training deeper model * add using proj_size * add scaling converter for ScaledLSTM * support jit trace * add using averaged model in export.py * modify test_model.py, test if the model can be successfully exported by jit.trace * modify pretrained.py * support streaming decoding * fix model.py * Add cut_id to recognition results * Add cut_id to recognition results * do not pad in Conv subsampling module; add tail padding during decoding. * update RESULTS.md * minor fix * fix doc * update README.md * minor change, filter infinite loss * remove the condition of raise error * modify type hint for the return value in model.py * minor change * modify RESULTS.md Co-authored-by: pkufool <wkang.pku@gmail.com> * Update asr_datamodule.py (k2-fsa#538) minor file names correction * minor fixes to LSTM streaming model (k2-fsa#537) * Pruned transducer stateless2 for AISHELL-1 (k2-fsa#536) * Fix not enough values to unpack error . 
* [WIP] Pruned transducer stateless2 for AISHELL-1 * fix the style issue * code format for black * add pruned-transducer-stateless2 results for AISHELL-1 * simplify result * consider case of empty tensor (k2-fsa#540) * fixed import quantization is none (k2-fsa#541) Signed-off-by: shanguanma <nanr9544@gmail.com> Signed-off-by: shanguanma <nanr9544@gmail.com> Co-authored-by: shanguanma <nanr9544@gmail.com> * fix typo for export jit script (k2-fsa#544) * some small changes for aidatatang_200zh (k2-fsa#542) * Update prepare.sh * Update compute_fbank_aidatatang_200zh.py * fixed no cut_id error in decode_dataset (k2-fsa#549) * fixed import quantization is none Signed-off-by: shanguanma <nanr9544@gmail.com> * fixed no cut_id error in decode_dataset Signed-off-by: shanguanma <nanr9544@gmail.com> * fixed more than one "#" Signed-off-by: shanguanma <nanr9544@gmail.com> * fixed code style Signed-off-by: shanguanma <nanr9544@gmail.com> Signed-off-by: shanguanma <nanr9544@gmail.com> Co-authored-by: shanguanma <nanr9544@gmail.com> * Add clamping operation in Eve optimizer for all scalar weights to avoid (k2-fsa#550) non stable training in some scenarios. The clamping range is set to (-10,2). Note that this change may cause unexpected effect if you resume training from a model that is trained without clamping. * minor changes for correct path names && import module text2segments.py (k2-fsa#552) * Update asr_datamodule.py minor file names correction * minor changes for correct path names && import module text2segments.py * fix scaling converter test for decoder(predictor). (k2-fsa#553) * Disable CUDA_LAUNCH_BLOCKING in wenetspeech recipes. (k2-fsa#554) * Disable CUDA_LAUNCH_BLOCKING in wenetspeech recipes. * minor fixes * Check that read_manifests_if_cached returns a non-empty dict. 
(k2-fsa#555) * Modified prepare_transcripts.py and preprare_lexicon.py of tedlium3 recipe (k2-fsa#567) * Use modified ctc topo when vocab size is > 500 (k2-fsa#568) * Add LSTM for the multi-dataset setup. (k2-fsa#558) * Add LSTM for the multi-dataset setup. * Add results * fix style issues * add missing file * Adding Dockerfile for Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8 (k2-fsa#572) * Changed Dockerfile * Update Dockerfile * Dockerfile * Update README.md * Add Dockerfiles * Update README.md Removed misleading CUDA version, as the Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8 Dockerfile can only support CUDA versions >11.0. * support exporting to ncnn format via PNNX (k2-fsa#571) * Small fixes to the transducer training doc (k2-fsa#575) * Update kaldifeat in CI tests (k2-fsa#583) * padding zeros (k2-fsa#591) * Gradient filter for training lstm model (k2-fsa#564) * init files * add gradient filter module * refact getting median value * add cutoff for grad filter * delete comments * apply gradient filter in LSTM module, to filter both input and params * fix typing and refactor * filter with soft mask * rename lstm_transducer_stateless2 to lstm_transducer_stateless3 * fix typos, and update RESULTS.md * minor fix * fix return typing * fix typo * Modified train.py of tedlium3 models (k2-fsa#597) * Add dill to requirements.txt (k2-fsa#613) * Add dill to requirements.txt * Disable style check for python 3.7 * update docs (k2-fsa#611) * update docs Co-authored-by: unknown <mazhihao@jshcbd.cn> Co-authored-by: KajiMaCN <moonlightshadowmzh@gmail.com> * exporting projection layers of joiner separately for onnx (k2-fsa#584) * exporting projection layers of joiner separately for onnx * Remove all-in-one for onnx export (k2-fsa#614) * Remove all-in-one for onnx export * Exit on error for CI * Modify ActivationBalancer for speed (k2-fsa#612) * add a probability to apply ActivationBalancer * minor fix * minor fix * Support exporting to ONNX for the wenetspeech recipe (k2-fsa#615) * 
Support exporting to ONNX for the wenetspeech recipe * Add doc about model export (k2-fsa#618) * Add doc about model export * fix typos * Fix links in the doc (k2-fsa#619) * fix type hints for decode.py (k2-fsa#623) * Support exporting LSTM with projection to ONNX (k2-fsa#621) * Support exporting LSTM with projection to ONNX * Add missing files * small fixes * CSJ Data Preparation (k2-fsa#617) * workspace setup * csj prepare done * Change compute_fbank_musan.py t soft link * add description * change lhotse prepare csj command * split train-dev here * Add header * remove debug * save manifest_statistics * generate transcript in Lhotse * update comments in config file * fix number of parameters in RESULTS.md (k2-fsa#627) * Add Shallow fusion in modified_beam_search (k2-fsa#630) * Add utility for shallow fusion * test batch size == 1 without shallow fusion * Use shallow fusion for modified-beam-search * Modified beam search with ngram rescoring * Fix code according to review Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Add kaldifst to requirements.txt (k2-fsa#631) * Install kaldifst for GitHub actions (k2-fsa#632) * Install kaldifst for GitHub actions * Update train.py (k2-fsa#635) Add the missing step to add the arguments to the parser. 
* Fix type hints for decode.py (k2-fsa#638) * Fix type hints for decode.py * Fix flake8 * fix typos (k2-fsa#639) * Remove onnx and onnxruntime from requirements.txt (k2-fsa#640) * Remove onnx and onnxruntime from requirements.txt * Checkout the LM for aishell explicitly (k2-fsa#642) * Get timestamps during decoding (k2-fsa#598) * print out timestamps during decoding * add word-level alignments * support to compute mean symbol delay with word-level alignments * print variance of symbol delay * update doc * support to compute delay for pruned_transducer_stateless4 * fix bug * add doc * remove tail padding for non-streaming models (k2-fsa#625) * support RNNLM shallow fusion for LSTM transducer * support RNNLM shallow fusion in stateless5 * update results * update decoding commands * update author info * update * include previous added decoding method * minor fixes * remove redundant test lines * Update egs/librispeech/ASR/lstm_transducer_stateless2/decode.py Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> * Update tdnn_lstm_ctc.rst (k2-fsa#647) * Update README.md (k2-fsa#649) * Update tdnn_lstm_ctc.rst (k2-fsa#648) * fix torchaudio version in dockerfile (k2-fsa#653) * fix torchaudio version in dockerfile * remove kaldiio * update docs * Add fast_beam_search_LG (k2-fsa#622) * Add fast_beam_search_LG * add fast_beam_search_LG to commonly used recipes * fix ci * fix ci * Fix error * Fix LG log file name (k2-fsa#657) * resolve conflict with timestamp feature * resolve conflicts * minor fixes * remove testing file * Apply delay penalty on transducer (k2-fsa#654) * add delay penalty * fix CI * fix CI * Refactor getting timestamps in fsa-based decoding (k2-fsa#660) * refactor getting timestamps for fsa-based decoding * fix doc * fix bug * add ctc_decode.py * fix doc Signed-off-by: shanguanma <nanr9544@gmail.com> Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com> Co-authored-by: LIyong.Guo <839019390@qq.com> Co-authored-by: Yuekai Zhang <zhangyuekai@foxmail.com> 
Co-authored-by: ezerhouni <61225408+ezerhouni@users.noreply.github.com> Co-authored-by: Mingshuang Luo <37799481+luomingshuang@users.noreply.github.com> Co-authored-by: Daniel Povey <dpovey@gmail.com> Co-authored-by: Quandwang <quandwang@hotmail.com> Co-authored-by: Wei Kang <wkang.pku@gmail.com> Co-authored-by: boji123 <boji123@aliyun.com> Co-authored-by: Lucky Wong <lekai.huang@gmail.com> Co-authored-by: LIyong.Guo <guonwpu@qq.com> Co-authored-by: Weiji Zhuang <zhuangweiji@foxmail.com> Co-authored-by: WeijiZhuang <zhuangweiji@xiaomi.com> Co-authored-by: Yunusemre <yunusemreozkose@gmail.com> Co-authored-by: FNLPprojects <linxinzhulxz@gmail.com> Co-authored-by: yangsuxia <34536059+yangsuxia@users.noreply.github.com> Co-authored-by: marcoyang1998 <45973641+marcoyang1998@users.noreply.github.com> Co-authored-by: rickychanhoyin <ricky.hoyin.chan@gmail.com> Co-authored-by: Duo Ma <39255927+shanguanma@users.noreply.github.com> Co-authored-by: shanguanma <nanr9544@gmail.com> Co-authored-by: rxhmdia <41623136+rxhmdia@users.noreply.github.com> Co-authored-by: kobenaxie <572745565@qq.com> Co-authored-by: shcxlee <113081290+shcxlee@users.noreply.github.com> Co-authored-by: Teo Wen Shen <36886809+teowenshen@users.noreply.github.com> Co-authored-by: KajiMaCN <827272056@qq.com> Co-authored-by: unknown <mazhihao@jshcbd.cn> Co-authored-by: KajiMaCN <moonlightshadowmzh@gmail.com> Co-authored-by: Yunusemre <yunusemre.ozkose@sestek.com> Co-authored-by: Nagendra Goel <nagendra.goel@gmail.com> Co-authored-by: marcoyang <marcoyang1998@gmail.com> Co-authored-by: zr_jin <60612200+JinZr@users.noreply.github.com>
csukuangfj · Nov 14, 2022 · 89ce554
1 parent f061cac
Showing 407 changed files with 54,330 additions and 3,129 deletions.
diff --git a/.flake8 b/.flake8
@@ -4,12 +4,15 @@ statistics=true
 max-line-length = 80
 per-file-ignores =
     # line too long
-    icefall/diagnostics.py: E501
+    icefall/diagnostics.py: E501,
     egs/*/ASR/*/conformer.py: E501,
     egs/*/ASR/pruned_transducer_stateless*/*.py: E501,
     egs/*/ASR/*/optim.py: E501,
     egs/*/ASR/*/scaling.py: E501,
-    egs/librispeech/ASR/conv_emformer_transducer_stateless/*.py: E501, E203
+    egs/librispeech/ASR/lstm_transducer_stateless*/*.py: E501, E203
+    egs/librispeech/ASR/conv_emformer_transducer_stateless*/*.py: E501, E203
+    egs/librispeech/ASR/conformer_ctc2/*py: E501,
+    egs/librispeech/ASR/RESULTS.md: E999,
 
     # invalid escape sequence (cause by tex formular), W605
     icefall/utils.py: E501, W605
@@ -19,3 +22,11 @@ exclude =
   **/data/**,
   icefall/shared/make_kn_lm.py,
   icefall/__init__.py
+
+ignore =
+  # E203 white space before ":"
+  E203,
+  # W503 line break before binary operator
+  W503,
+  # E226 missing whitespace around arithmetic operator
+  E226,
diff --git a/.github/scripts/compute-fbank-librispeech-test-clean-and-test-other.sh b/.github/scripts/compute-fbank-librispeech-test-clean-and-test-other.sh
@@ -4,6 +4,8 @@
 # The computed features are saved to ~/tmp/fbank-libri and are
 # cached for later runs
 
+set -e
+
 export PYTHONPATH=$PWD:$PYTHONPATH
 echo $PYTHONPATH
 

diff --git a/.github/scripts/download-gigaspeech-dev-test-dataset.sh b/.github/scripts/download-gigaspeech-dev-test-dataset.sh
@@ -6,6 +6,8 @@
 # You will find directories `~/tmp/giga-dev-dataset-fbank` after running
 # this script.
 
+set -e
+
 mkdir -p ~/tmp
 cd ~/tmp
 

diff --git a/.github/scripts/download-librispeech-test-clean-and-test-other-dataset.sh b/.github/scripts/download-librispeech-test-clean-and-test-other-dataset.sh
@@ -7,6 +7,8 @@
 # You will find directories ~/tmp/download/LibriSpeech after running
 # this script.
 
+set -e
+
 mkdir ~/tmp/download
 cd egs/librispeech/ASR
 ln -s ~/tmp/download .
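Note on this hunk: once `set -e` is active, the plain `mkdir ~/tmp/download` (no `-p`) stops the script if the directory already exists, for example when a cached directory is restored before the script runs. The following is a minimal sketch of that behaviour, not part of the commit; the temporary directory and the child `bash -c` shells are assumptions made only to keep the demo self-contained.

```shell
#!/usr/bin/env bash
# Sketch: how `set -e` changes what happens after a failing `mkdir`.
workdir=$(mktemp -d)   # scratch directory, used only for this demo

# Without `set -e`: the second mkdir fails, but the script keeps going.
without=$(bash -c 'mkdir "$1/d"; mkdir "$1/d" 2>/dev/null; echo reached' _ "$workdir")

# With `set -e`: the second mkdir fails and the child shell exits immediately,
# so `echo reached` never runs. `|| true` keeps this outer demo alive.
with=$(bash -c 'set -e; mkdir "$1/e"; mkdir "$1/e" 2>/dev/null; echo reached' _ "$workdir" || true)

echo "without set -e: '${without}'"
echo "with set -e:    '${with}'"

rm -r "$workdir"
```

If reruns against an existing directory are expected, `mkdir -p` avoids the abort, as the other scripts in this commit already do for `~/tmp`.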

diff --git a/.github/scripts/install-kaldifeat.sh b/.github/scripts/install-kaldifeat.sh
@@ -3,6 +3,8 @@
 # This script installs kaldifeat into the directory ~/tmp/kaldifeat
 # which is cached by GitHub actions for later runs.
 
+set -e
+
 mkdir -p ~/tmp
 cd ~/tmp
 git clone https://github.com/csukuangfj/kaldifeat

diff --git a/.github/scripts/prepare-librispeech-test-clean-and-test-other-manifests.sh b/.github/scripts/prepare-librispeech-test-clean-and-test-other-manifests.sh
@@ -4,6 +4,8 @@
 # to egs/librispeech/ASR/download/LibriSpeech and generates manifest
 # files in egs/librispeech/ASR/data/manifests
 
+set -e
+
 cd egs/librispeech/ASR
 [ ! -e download ] && ln -s ~/tmp/download .
 mkdir -p data/manifests
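Note on this hunk: the guard `[ ! -e download ] && ln -s ~/tmp/download .` stays safe under `set -e`. Bash does not exit when a command fails inside an `&&` or `||` list unless it is the last command of that list, so a false test simply skips the `ln`. A small sketch under assumed paths (the scratch directory and the `target` name are hypothetical):

```shell
#!/usr/bin/env bash
set -e

workdir=$(mktemp -d)                          # hypothetical stand-in for egs/librispeech/ASR
mkdir "$workdir/target"
ln -s "$workdir/target" "$workdir/download"   # simulate a link left by an earlier run

# The test is false because the link already exists. It is not the last
# command of the && list, so `set -e` does not abort the script here.
[ ! -e "$workdir/download" ] && ln -s "$workdir/target" "$workdir/download"

status="still running"
echo "$status"

rm -r "$workdir"
```

The same list placed as the very last command of a script or function would still make that script or function return non-zero, which is worth keeping in mind when reusing the idiom.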

diff --git a/.github/scripts/run-aishell-pruned-transducer-stateless3-2022-06-20.sh b/.github/scripts/run-aishell-pruned-transducer-stateless3-2022-06-20.sh
@@ -1,5 +1,7 @@
 #!/usr/bin/env bash
 
+set -e
+
 log() {
   # This function is from espnet
   local fname=${BASH_SOURCE[1]##*/}
@@ -40,7 +42,7 @@ for sym in 1 2 3; do
     --lang-dir $repo/data/lang_char \
     $repo/test_wavs/BAC009S0764W0121.wav \
     $repo/test_wavs/BAC009S0764W0122.wav \
-    $rep/test_wavs/BAC009S0764W0123.wav
+    $repo/test_wavs/BAC009S0764W0123.wav
 done
 
 for method in modified_beam_search beam_search fast_beam_search; do
@@ -53,7 +55,7 @@ for method in modified_beam_search beam_search fast_beam_search; do
     --lang-dir $repo/data/lang_char \
     $repo/test_wavs/BAC009S0764W0121.wav \
     $repo/test_wavs/BAC009S0764W0122.wav \
-    $rep/test_wavs/BAC009S0764W0123.wav
+    $repo/test_wavs/BAC009S0764W0123.wav
 done
 
 echo "GITHUB_EVENT_NAME: ${GITHUB_EVENT_NAME}"
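Note on the `$rep` to `$repo` fix: `rep` was never assigned, so the old line expanded to `/test_wavs/BAC009S0764W0123.wav` without any error from the shell itself. The sketch below reproduces that expansion and shows that the `nounset` option (`set -u`, which this commit does not enable) would flag such a typo. Child shells are used so the demo does not depend on the caller's options.

```shell
#!/usr/bin/env bash
# 1. Default behaviour: the unset variable silently expands to nothing.
typo_path=$(bash -c 'unset rep; echo "$rep/test_wavs/BAC009S0764W0123.wav"')
echo "typo expands to: $typo_path"

# 2. With -u (nounset) the same expansion is an error and the shell exits non-zero.
if bash -uc 'unset rep; echo "$rep/test_wavs/BAC009S0764W0123.wav"' > /dev/null 2>&1; then
  caught=no
else
  caught=yes
fi
echo "caught by set -u: $caught"
```

Whether to add `set -u` to the CI scripts is a separate decision, since it also rejects variables that are intentionally left unset.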

diff --git a/.github/scripts/run-gigaspeech-pruned-transducer-stateless2-2022-05-12.sh b/.github/scripts/run-gigaspeech-pruned-transducer-stateless2-2022-05-12.sh
@@ -1,5 +1,7 @@
 #!/usr/bin/env bash
 
+set -e
+
 log() {
   # This function is from espnet
   local fname=${BASH_SOURCE[1]##*/}

diff --git a/.github/scripts/run-librispeech-lstm-transducer-stateless2-2022-09-03.yml b/.github/scripts/run-librispeech-lstm-transducer-stateless2-2022-09-03.yml
@@ -0,0 +1,203 @@
+#!/usr/bin/env bash
+#
+set -e
+
+log() {
+  # This function is from espnet
+  local fname=${BASH_SOURCE[1]##*/}
+  echo -e "$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*"
+}
+
+cd egs/librispeech/ASR
+
+repo_url=https://huggingface.co/csukuangfj/icefall-asr-librispeech-lstm-transducer-stateless2-2022-09-03
+
+log "Downloading pre-trained model from $repo_url"
+git lfs install
+git clone $repo_url
+repo=$(basename $repo_url)
+
+log "Display test files"
+tree $repo/
+soxi $repo/test_wavs/*.wav
+ls -lh $repo/test_wavs/*.wav
+
+pushd $repo/exp
+ln -s pretrained-iter-468000-avg-16.pt pretrained.pt
+ln -s pretrained-iter-468000-avg-16.pt epoch-99.pt
+popd
+
+log  "Install ncnn and pnnx"
+
+# We are using a modified ncnn here. Will try to merge it to the official repo
+# of ncnn
+git clone https://github.com/csukuangfj/ncnn
+pushd ncnn
+git submodule init
+git submodule update python/pybind11
+python3 setup.py bdist_wheel
+ls -lh dist/
+pip install dist/*.whl
+cd tools/pnnx
+mkdir build
+cd build
+cmake ..
+make -j4 pnnx
+
+./src/pnnx || echo "pass"
+
+popd
+
+log "Test exporting to pnnx format"
+
+./lstm_transducer_stateless2/export.py \
+  --exp-dir $repo/exp \
+  --bpe-model $repo/data/lang_bpe_500/bpe.model \
+  --epoch 99 \
+  --avg 1 \
+  --use-averaged-model 0 \
+  --pnnx 1
+
+./ncnn/tools/pnnx/build/src/pnnx $repo/exp/encoder_jit_trace-pnnx.pt
+./ncnn/tools/pnnx/build/src/pnnx $repo/exp/decoder_jit_trace-pnnx.pt
+./ncnn/tools/pnnx/build/src/pnnx $repo/exp/joiner_jit_trace-pnnx.pt
+
+./lstm_transducer_stateless2/ncnn-decode.py \
+ --bpe-model-filename $repo/data/lang_bpe_500/bpe.model \
+ --encoder-param-filename $repo/exp/encoder_jit_trace-pnnx.ncnn.param \
+ --encoder-bin-filename $repo/exp/encoder_jit_trace-pnnx.ncnn.bin \
+ --decoder-param-filename $repo/exp/decoder_jit_trace-pnnx.ncnn.param \
+ --decoder-bin-filename $repo/exp/decoder_jit_trace-pnnx.ncnn.bin \
+ --joiner-param-filename $repo/exp/joiner_jit_trace-pnnx.ncnn.param \
+ --joiner-bin-filename $repo/exp/joiner_jit_trace-pnnx.ncnn.bin \
+ $repo/test_wavs/1089-134686-0001.wav
+
+./lstm_transducer_stateless2/streaming-ncnn-decode.py \
+ --bpe-model-filename $repo/data/lang_bpe_500/bpe.model \
+ --encoder-param-filename $repo/exp/encoder_jit_trace-pnnx.ncnn.param \
+ --encoder-bin-filename $repo/exp/encoder_jit_trace-pnnx.ncnn.bin \
+ --decoder-param-filename $repo/exp/decoder_jit_trace-pnnx.ncnn.param \
+ --decoder-bin-filename $repo/exp/decoder_jit_trace-pnnx.ncnn.bin \
+ --joiner-param-filename $repo/exp/joiner_jit_trace-pnnx.ncnn.param \
+ --joiner-bin-filename $repo/exp/joiner_jit_trace-pnnx.ncnn.bin \
+ $repo/test_wavs/1089-134686-0001.wav
+
+
+
+log "Test exporting with torch.jit.trace()"
+
+./lstm_transducer_stateless2/export.py \
+  --exp-dir $repo/exp \
+  --bpe-model $repo/data/lang_bpe_500/bpe.model \
+  --epoch 99 \
+  --avg 1 \
+  --use-averaged-model 0 \
+  --jit-trace 1
+
+log "Decode with models exported by torch.jit.trace()"
+
+./lstm_transducer_stateless2/jit_pretrained.py \
+  --bpe-model $repo/data/lang_bpe_500/bpe.model \
+  --encoder-model-filename $repo/exp/encoder_jit_trace.pt \
+  --decoder-model-filename $repo/exp/decoder_jit_trace.pt \
+  --joiner-model-filename $repo/exp/joiner_jit_trace.pt \
+  $repo/test_wavs/1089-134686-0001.wav \
+  $repo/test_wavs/1221-135766-0001.wav \
+  $repo/test_wavs/1221-135766-0002.wav
+
+log "Test exporting to ONNX"
+
+./lstm_transducer_stateless2/export.py \
+  --exp-dir $repo/exp \
+  --bpe-model $repo/data/lang_bpe_500/bpe.model \
+  --epoch 99 \
+  --avg 1 \
+  --use-averaged-model 0 \
+  --onnx 1
+
+log "Decode with ONNX models "
+
+./lstm_transducer_stateless2/streaming-onnx-decode.py \
+  --bpe-model-filename $repo/data/lang_bpe_500/bpe.model \
+  --encoder-model-filename $repo//exp/encoder.onnx \
+  --decoder-model-filename $repo/exp/decoder.onnx \
+  --joiner-model-filename $repo/exp/joiner.onnx \
+  --joiner-encoder-proj-model-filename $repo/exp/joiner_encoder_proj.onnx \
+  --joiner-decoder-proj-model-filename $repo/exp/joiner_decoder_proj.onnx \
+ $repo/test_wavs/1089-134686-0001.wav
+
+./lstm_transducer_stateless2/streaming-onnx-decode.py \
+  --bpe-model-filename $repo/data/lang_bpe_500/bpe.model \
+  --encoder-model-filename $repo//exp/encoder.onnx \
+  --decoder-model-filename $repo/exp/decoder.onnx \
+  --joiner-model-filename $repo/exp/joiner.onnx \
+  --joiner-encoder-proj-model-filename $repo/exp/joiner_encoder_proj.onnx \
+  --joiner-decoder-proj-model-filename $repo/exp/joiner_decoder_proj.onnx \
+ $repo/test_wavs/1221-135766-0001.wav
+
+./lstm_transducer_stateless2/streaming-onnx-decode.py \
+  --bpe-model-filename $repo/data/lang_bpe_500/bpe.model \
+  --encoder-model-filename $repo//exp/encoder.onnx \
+  --decoder-model-filename $repo/exp/decoder.onnx \
+  --joiner-model-filename $repo/exp/joiner.onnx \
+  --joiner-encoder-proj-model-filename $repo/exp/joiner_encoder_proj.onnx \
+  --joiner-decoder-proj-model-filename $repo/exp/joiner_decoder_proj.onnx \
+ $repo/test_wavs/1221-135766-0002.wav
+
+
+
+for sym in 1 2 3; do
+  log "Greedy search with --max-sym-per-frame $sym"
+
+  ./lstm_transducer_stateless2/pretrained.py \
+    --method greedy_search \
+    --max-sym-per-frame $sym \
+    --checkpoint $repo/exp/pretrained.pt \
+    --bpe-model $repo/data/lang_bpe_500/bpe.model \
+    $repo/test_wavs/1089-134686-0001.wav \
+    $repo/test_wavs/1221-135766-0001.wav \
+    $repo/test_wavs/1221-135766-0002.wav
+done
+
+for method in modified_beam_search beam_search fast_beam_search; do
+  log "$method"
+
+  ./lstm_transducer_stateless2/pretrained.py \
+    --method $method \
+    --beam-size 4 \
+    --checkpoint $repo/exp/pretrained.pt \
+    --bpe-model $repo/data/lang_bpe_500/bpe.model \
+    $repo/test_wavs/1089-134686-0001.wav \
+    $repo/test_wavs/1221-135766-0001.wav \
+    $repo/test_wavs/1221-135766-0002.wav
+done
+
+echo "GITHUB_EVENT_NAME: ${GITHUB_EVENT_NAME}"
+echo "GITHUB_EVENT_LABEL_NAME: ${GITHUB_EVENT_LABEL_NAME}"
+if [[ x"${GITHUB_EVENT_NAME}" == x"schedule" ]]; then
+  mkdir -p lstm_transducer_stateless2/exp
+  ln -s $PWD/$repo/exp/pretrained.pt lstm_transducer_stateless2/exp/epoch-999.pt
+  ln -s $PWD/$repo/data/lang_bpe_500 data/
+
+  ls -lh data
+  ls -lh lstm_transducer_stateless2/exp
+
+  log "Decoding test-clean and test-other"
+
+  # use a small value for decoding with CPU
+  max_duration=100
+
+  for method in greedy_search fast_beam_search modified_beam_search; do
+    log "Decoding with $method"
+
+    ./lstm_transducer_stateless2/decode.py \
+      --decoding-method $method \
+      --epoch 999 \
+      --avg 1 \
+      --use-averaged-model 0 \
+      --max-duration $max_duration \
+      --exp-dir lstm_transducer_stateless2/exp
+  done
+
+  rm lstm_transducer_stateless2/exp/*.pt
+fi
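Note on two idioms in the new script: `repo=$(basename $repo_url)` derives the local clone directory from the URL, and `./src/pnnx || echo "pass"` keeps a non-zero exit status from stopping the script now that `set -e` is on. The sketch below isolates both; `false` stands in for a command that exits non-zero, on the assumption that `pnnx` does so when run without arguments.

```shell
#!/usr/bin/env bash
set -e

repo_url=https://huggingface.co/csukuangfj/icefall-asr-librispeech-lstm-transducer-stateless2-2022-09-03

# `git clone <url>` creates a directory named after the last path component,
# so basename yields the directory to use afterwards.
repo=$(basename "$repo_url")
echo "clone directory: $repo"

# `false` always fails. Because it is followed by `||`, set -e does not
# abort, and the right-hand side runs instead.
guard_output=$(false || echo "pass")
echo "guard output: $guard_output"
```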
diff --git a/.github/scripts/run-librispeech-pruned-transducer-stateless-2022-03-12.sh b/.github/scripts/run-librispeech-pruned-transducer-stateless-2022-03-12.sh
@@ -1,5 +1,7 @@
 #!/usr/bin/env bash
 
+set -e
+
 log() {
   # This function is from espnet
   local fname=${BASH_SOURCE[1]##*/}

diff --git a/.github/scripts/run-librispeech-pruned-transducer-stateless2-2022-04-29.sh b/.github/scripts/run-librispeech-pruned-transducer-stateless2-2022-04-29.sh
@@ -1,5 +1,7 @@
 #!/usr/bin/env bash
 
+set -e
+
 log() {
   # This function is from espnet
   local fname=${BASH_SOURCE[1]##*/}
@@ -11,10 +13,14 @@ cd egs/librispeech/ASR
 repo_url=https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless2-2022-04-29
 
 log "Downloading pre-trained model from $repo_url"
-git lfs install
-git clone $repo_url
+GIT_LFS_SKIP_SMUDGE=1 git clone $repo_url
 repo=$(basename $repo_url)
 
+pushd $repo
+git lfs pull --include "data/lang_bpe_500/bpe.model"
+git lfs pull --include "exp/pretrained-epoch-38-avg-10.pt"
+popd
+
 log "Display test files"
 tree $repo/
 soxi $repo/test_wavs/*.wav
@@ -77,4 +83,5 @@ if [[ x"${GITHUB_EVENT_NAME}" == x"schedule" || x"${GITHUB_EVENT_LABEL_NAME}" ==
   done
 
   rm pruned_transducer_stateless2/exp/*.pt
+  rm -r data/lang_bpe_500
 fi
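Note on the download change: with `GIT_LFS_SKIP_SMUDGE=1`, the clone checks out small LFS pointer files instead of downloading every large object, and `git lfs pull --include` then fetches only the files the test needs. The pattern could be wrapped as below. This is a sketch only: `clone_lfs_subset` is a hypothetical helper name, not something defined in icefall, and it needs `git`, `git-lfs`, and network access when called.

```shell
#!/usr/bin/env bash

# Hypothetical helper: clone a repository without its LFS objects, then fetch
# only the listed files.
# Usage: clone_lfs_subset <repo_url> <path> [<path> ...]
clone_lfs_subset() {
  local repo_url=$1
  shift
  local repo
  repo=$(basename "$repo_url")

  # Check out pointer files only, skipping the LFS download during clone.
  GIT_LFS_SKIP_SMUDGE=1 git clone "$repo_url"

  pushd "$repo" > /dev/null
  local path
  for path in "$@"; do
    # Download the real content for this one file.
    git lfs pull --include "$path"
  done
  popd > /dev/null
}

# Example call, mirroring the diff above (not executed here):
# clone_lfs_subset \
#   https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless2-2022-04-29 \
#   "data/lang_bpe_500/bpe.model" \
#   "exp/pretrained-epoch-38-avg-10.pt"
```

Files that were not pulled remain as pointer files in the working tree, so any later step that reads them directly would need its own `git lfs pull`.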
diff --git a/.github/scripts/run-librispeech-pruned-transducer-stateless3-2022-04-29.sh b/.github/scripts/run-librispeech-pruned-transducer-stateless3-2022-04-29.sh
@@ -1,5 +1,7 @@
 #!/usr/bin/env bash
 
+set -e
+
 log() {
   # This function is from espnet
   local fname=${BASH_SOURCE[1]##*/}
@@ -11,9 +13,12 @@ cd egs/librispeech/ASR
 repo_url=https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-04-29
 
 log "Downloading pre-trained model from $repo_url"
-git lfs install
-git clone $repo_url
+GIT_LFS_SKIP_SMUDGE=1 git clone $repo_url
 repo=$(basename $repo_url)
+pushd $repo
+git lfs pull --include "data/lang_bpe_500/bpe.model"
+git lfs pull --include "exp/pretrained-epoch-25-avg-6.pt"
+popd
 
 log "Display test files"
 tree $repo/
@@ -77,4 +82,5 @@ if [[ x"${GITHUB_EVENT_NAME}" == x"schedule" || x"${GITHUB_EVENT_LABEL_NAME}" ==
   done
 
   rm pruned_transducer_stateless3/exp/*.pt
+  rm -r data/lang_bpe_500
 fi