Use ScaledLSTM as streaming encoder #479

Merged: 44 commits, merged on Aug 19, 2022
Changes shown are from 39 of the 44 commits.

Commits (44)
0fcdd15
Merge branch 'k2-fsa:master' into master
yaozengwei Jul 12, 2022
9165de5
add ScaledLSTM
yaozengwei Jul 16, 2022
7c9fcfa
add RNNEncoderLayer and RNNEncoder classes in lstm.py
yaozengwei Jul 16, 2022
2d53f2e
add RNN and Conv2dSubsampling classes in lstm.py
yaozengwei Jul 17, 2022
074bd7d
hardcode bidirectional=False
yaozengwei Jul 17, 2022
d16b9ec
link from pruned_transducer_stateless2
yaozengwei Jul 17, 2022
89bfb6b
link scaling.py pruned_transducer_stateless2
yaozengwei Jul 17, 2022
b1be6ea
copy from pruned_transducer_stateless2
yaozengwei Jul 17, 2022
4a0dea2
modify decode.py pretrained.py test_model.py train.py
yaozengwei Jul 17, 2022
822cc78
copy streaming decoding files from pruned_transducer_stateless2
yaozengwei Jul 17, 2022
5c669b7
modify streaming decoding files
yaozengwei Jul 17, 2022
539a9d7
simplified code in ScaledLSTM
yaozengwei Jul 17, 2022
125eac8
flat weights after scaling
yaozengwei Jul 17, 2022
ce2d817
pruned2 -> pruned4
yaozengwei Jul 17, 2022
872d239
link __init__.py
yaozengwei Jul 17, 2022
7c00f92
fix style
yaozengwei Jul 17, 2022
c71788e
remove add_model_arguments
yaozengwei Jul 17, 2022
1b0d2f3
modify .flake8
yaozengwei Jul 17, 2022
fd261ec
Merge remote-tracking branch 'k2-fsa/master' into lstm_new
yaozengwei Jul 17, 2022
3cedbe3
fix style
yaozengwei Jul 17, 2022
8bd700c
fix scale value in scaling.py
yaozengwei Jul 18, 2022
9bb0c79
add random combiner for training deeper model
yaozengwei Jul 18, 2022
6871c96
add using proj_size
yaozengwei Jul 25, 2022
9e4b5bd
Merge remote-tracking branch 'k2-fsa/master' into lstm
yaozengwei Aug 5, 2022
03b056c
add scaling converter for ScaledLSTM
yaozengwei Aug 5, 2022
45c7894
support jit trace
yaozengwei Aug 9, 2022
522a45c
add using averaged model in export.py
yaozengwei Aug 10, 2022
8f3645e
modify test_model.py, test if the model can be successfully exported …
yaozengwei Aug 10, 2022
1138b27
modify pretrained.py
yaozengwei Aug 10, 2022
dc73ff0
support streaming decoding
yaozengwei Aug 10, 2022
f63f855
fix model.py
yaozengwei Aug 11, 2022
dc212ba
Add cut_id to recognition results
pkufool Aug 7, 2022
8cceedf
Add cut_id to recognition results
yaozengwei Aug 11, 2022
7ee3701
do not pad in Conv subsampling module; add tail padding during decoding.
yaozengwei Aug 14, 2022
be18610
update RESULTS.md
yaozengwei Aug 18, 2022
ba09c4a
Merge remote-tracking branch 'k2-fsa/master' into lstm
yaozengwei Aug 18, 2022
ab6f5e3
minor fix
yaozengwei Aug 18, 2022
db3e570
fix doc
yaozengwei Aug 18, 2022
2ee5122
update README.md
yaozengwei Aug 18, 2022
6191c3e
minor change, filter infinite loss
yaozengwei Aug 19, 2022
3b6310c
remove the condition of raise error
yaozengwei Aug 19, 2022
5b62125
modify type hint for the return value in model.py
yaozengwei Aug 19, 2022
e00aa29
minor change
yaozengwei Aug 19, 2022
9b96a14
modify RESULTS.md
yaozengwei Aug 19, 2022

Files changed

3 changes: 2 additions & 1 deletion .flake8
@@ -9,7 +9,8 @@ per-file-ignores =
egs/*/ASR/pruned_transducer_stateless*/*.py: E501,
egs/*/ASR/*/optim.py: E501,
egs/*/ASR/*/scaling.py: E501,
-egs/librispeech/ASR/conv_emformer_transducer_stateless*/*.py: E501, E203,
+egs/librispeech/ASR/lstm_transducer_stateless/*.py: E501, E203
+egs/librispeech/ASR/conv_emformer_transducer_stateless*/*.py: E501, E203
egs/librispeech/ASR/conformer_ctc2/*py: E501,
egs/librispeech/ASR/RESULTS.md: E999,

1 change: 1 addition & 0 deletions egs/librispeech/ASR/README.md
@@ -25,6 +25,7 @@ The following table lists the differences among them.
| `pruned_stateless_emformer_rnnt2` | Emformer(from torchaudio) | Embedding + Conv1d | Using Emformer from torchaudio for streaming ASR|
| `conv_emformer_transducer_stateless` | ConvEmformer | Embedding + Conv1d | Using ConvEmformer for streaming ASR + mechanisms in reworked model |
| `conv_emformer_transducer_stateless2` | ConvEmformer | Embedding + Conv1d | Using ConvEmformer with simplified memory for streaming ASR + mechanisms in reworked model |
| `lstm_transducer_stateless` | LSTM | Embedding + Conv1d | Using LSTM with mechanisms in reworked model |

The decoder in `transducer_stateless` is modified from the paper
[Rnn-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419/).
132 changes: 132 additions & 0 deletions egs/librispeech/ASR/RESULTS.md
@@ -1,5 +1,137 @@
## Results

#### LibriSpeech BPE training results (Pruned Stateless LSTM RNN-T)

[lstm_transducer_stateless](./lstm_transducer_stateless)

It implements an LSTM-based model with mechanisms from the reworked model for streaming ASR.

See <https://github.com/k2-fsa/icefall/pull/479> for more details.

#### Training on full librispeech

This model contains 12 encoder layers, each consisting of an LSTM module followed by a feedforward module. The number of model parameters is 84,689,496.
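
To make the layer structure concrete, here is a minimal PyTorch sketch of one such encoder layer (an LSTM module plus a feedforward module, with the LSTM kept unidirectional so it can run in streaming mode). All dimensions and the exact wiring here are assumptions for illustration only; the actual `RNNEncoderLayer` in `lstm_transducer_stateless/lstm.py` uses `ScaledLSTM`, `ScaledLinear`, and the other reworked-model components and differs in detail.

```python
# Illustrative sketch only; not the code added in this PR.
from typing import Optional, Tuple

import torch
import torch.nn as nn


class SimpleRNNEncoderLayer(nn.Module):
    def __init__(self, d_model: int = 512, rnn_hidden_size: int = 1024) -> None:
        super().__init__()
        # Unidirectional LSTM (bidirectional=False) so no future frames are
        # needed; proj_size keeps the output dimension equal to d_model.
        self.lstm = nn.LSTM(
            input_size=d_model,
            hidden_size=rnn_hidden_size,
            proj_size=d_model,
        )
        self.feed_forward = nn.Sequential(
            nn.Linear(d_model, 2048),
            nn.ReLU(),
            nn.Linear(2048, d_model),
        )

    def forward(
        self,
        x: torch.Tensor,  # (seq_len, batch, d_model)
        states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None,
    ) -> Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]:
        # `states` is the (h, c) pair carried over from the previous chunk,
        # which is what makes chunk-by-chunk (streaming) decoding possible.
        lstm_out, new_states = self.lstm(x, states)
        x = x + lstm_out              # residual over the LSTM module
        x = x + self.feed_forward(x)  # residual over the feedforward module
        return x, new_states


# Toy usage: process two consecutive chunks, carrying the LSTM states.
layer = SimpleRNNEncoderLayer()
out1, states = layer(torch.randn(16, 2, 512))
out2, states = layer(torch.randn(16, 2, 512), states)
```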

The WERs are:

| | test-clean | test-other | comment | decoding mode |
|-------------------------------------|------------|------------|----------------------|----------------------|
| greedy search (max sym per frame 1) | 3.81 | 9.73 | --epoch 35 --avg 15 | simulated streaming |
| greedy search (max sym per frame 1) | 3.78 | 9.79 | --epoch 35 --avg 15 | streaming |
| fast beam search | 3.74 | 9.59 | --epoch 35 --avg 15 | simulated streaming |
| fast beam search | 3.73 | 9.61 | --epoch 35 --avg 15 | streaming |
| modified beam search | 3.64 | 9.55 | --epoch 35 --avg 15 | simulated streaming |
| modified beam search | 3.65 | 9.51 | --epoch 35 --avg 15 | streaming |

The training command is:

```bash
./lstm_transducer_stateless/train.py \
--world-size 4 \
--num-epochs 35 \
--start-epoch 1 \
--exp-dir lstm_transducer_stateless/exp \
--full-libri 1 \
--max-duration 500 \
--master-port 12321 \
--num-encoder-layers 12 \
--rnn-hidden-size 1024
```
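
As a sanity check, the parameter count quoted above can be reproduced with a few lines of PyTorch. The snippet below is only a sketch; it assumes `model` is the transducer built by this recipe's `get_transducer_model()` in `train.py` with the options above (train.py normally logs this figure itself at start-up).

```python
# Hedged sketch: count the parameters of an already-constructed model.
import torch


def count_parameters(model: torch.nn.Module) -> int:
    """Total number of parameters, matching the figure quoted above."""
    return sum(p.numel() for p in model.parameters())


# print(count_parameters(model))  # expected: 84689496 for this configuration
```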

The tensorboard log can be found at
<https://tensorboard.dev/experiment/FWrM20mjTeWo6dTpFYOsYQ/>

The simulated streaming decoding command using greedy search is:
```bash
./lstm_transducer_stateless/decode.py \
--epoch 35 \
--avg 15 \
--exp-dir lstm_transducer_stateless/exp \
--max-duration 600 \
--num-encoder-layers 12 \
--rnn-hidden-size 1024 \
--decoding-method greedy_search \
--use-averaged-model True
```

The simulated streaming decoding command using fast beam search is:
```bash
./lstm_transducer_stateless/decode.py \
--epoch 35 \
--avg 15 \
--exp-dir lstm_transducer_stateless/exp \
--max-duration 600 \
--num-encoder-layers 12 \
--rnn-hidden-size 1024 \
--decoding-method fast_beam_search \
--use-averaged-model True \
--beam 4 \
--max-contexts 4 \
--max-states 8
```

The simulated streaming decoding command using modified beam search is:
```bash
./lstm_transducer_stateless/decode.py \
--epoch 35 \
--avg 15 \
--exp-dir lstm_transducer_stateless/exp \
--max-duration 600 \
--num-encoder-layers 12 \
--rnn-hidden-size 1024 \
--decoding-method modified_beam_search \
--use-averaged-model True \
--beam-size 4
```

The streaming decoding command using greedy search is:
```bash
./lstm_transducer_stateless/streaming_decode.py \
--epoch 35 \
--avg 15 \
--exp-dir lstm_transducer_stateless/exp \
--max-duration 600 \
--num-encoder-layers 12 \
--rnn-hidden-size 1024 \
--decoding-method greedy_search \
--use-averaged-model True
```

The streaming decoding command using fast beam search is:
```bash
./lstm_transducer_stateless/streaming_decode.py \
--epoch 35 \
--avg 15 \
--exp-dir lstm_transducer_stateless/exp \
--max-duration 600 \
--num-encoder-layers 12 \
--rnn-hidden-size 1024 \
--decoding-method fast_beam_search \
--use-averaged-model True \
--beam 4 \
--max-contexts 4 \
--max-states 8
```

The streaming decoding command using modified beam search is:
```bash
./lstm_transducer_stateless/streaming_decode.py \
--epoch 35 \
--avg 15 \
--exp-dir lstm_transducer_stateless/exp \
--max-duration 600 \
--num-encoder-layers 12 \
--rnn-hidden-size 1024 \
--decoding-method modified_beam_search \
--use-averaged-model True \
--beam-size 4
```

Pretrained models, training logs, decoding logs, and decoding results
are available at
<https://huggingface.co/Zengwei/icefall-asr-librispeech-lstm-transducer-stateless-2022-08-18>
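
If a programmatic download is preferred over a `git lfs` clone, the sketch below uses the `huggingface_hub` library; this is not part of the PR, just one convenient way to fetch the files.

```python
# Hedged sketch (not part of this PR): fetch the pretrained model snapshot
# with the huggingface_hub library.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Zengwei/icefall-asr-librispeech-lstm-transducer-stateless-2022-08-18"
)
print(f"Pretrained files downloaded to: {local_dir}")
```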


#### LibriSpeech BPE training results (Pruned Stateless Conv-Emformer RNN-T 2)

[conv_emformer_transducer_stateless2](./conv_emformer_transducer_stateless2)
1 change: 1 addition & 0 deletions egs/librispeech/ASR/lstm_transducer_stateless/__init__.py