[recipe] LibriSpeech zipformer_ctc #941

Merged: 14 commits merged into k2-fsa:master on Oct 27, 2023

Conversation

desh2608 (Collaborator) commented Mar 9, 2023

I trained a zipformer-based CTC model (with an auxiliary attention head) on LibriSpeech. The following are the results on test-clean/test-other.

| decoding method | test-clean | test-other | comment |
|-------------------------|------------|------------|---------------------|
| ctc-decoding | 2.50 | 5.86 | --epoch 30 --avg 9 |
| whole-lattice-rescoring | 2.44 | 5.38 | --epoch 30 --avg 9 |
| attention-rescoring | 2.35 | 5.16 | --epoch 30 --avg 9 |

Tensorboard: https://tensorboard.dev/experiment/IjPSJjHOQFKPYA5Z0Vf8wg
Pretrained model: https://huggingface.co/desh2608/icefall-asr-librispeech-zipformer-ctc

SOLVED

I am having some trouble with the other decoding methods. I created G.fst.txt by first downloading the 4-gram.arpa.gz file, unzipping it, and then running the following:

python3 -m kaldilm \
  --read-symbol-table="data/lang_bpe_500/tokens.txt" \
  --disambig-symbol='#0' \
  --max-order=4 \
  data/lm/4-gram.arpa > data/lang_bpe_500/G_4_gram.fst.txt

The G.pt should get created inside decode.py. But during decoding, I get the following AssertionError:

  File "zipformer_ctc_att/decode.py", line 556, in decode_dataset
    hyps_dict = decode_one_batch(
  File "zipformer_ctc_att/decode.py", line 440, in decode_one_batch
    best_path_dict = rescore_with_whole_lattice(
  File "/exp/draj/mini_scale_2022/icefall/icefall/decode.py", line 858, in rescore_with_whole_lattice
    assert G_with_epsilon_loops.shape == (1, None, None)

I am guessing I did something wrong when creating G.pt. I would appreciate it if someone could help with this.
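
For anyone hitting the same assertion, here is a minimal, hypothetical way to inspect a cached G and see whether it was built from words.txt or tokens.txt. The cache path and file name below are assumptions based on the directory layout used above, not something taken from the recipe.

```python
# Hypothetical sanity check (not part of the recipe): load a cached G and
# compare its label range against the symbol tables.
import k2
import torch

G = k2.Fsa.from_dict(torch.load("data/lm/G_4_gram.pt", map_location="cpu"))
print("shape:", G.shape)  # a single FSA has shape (num_states, None)

with open("data/lang_bpe_500/words.txt") as f:
    num_words = sum(1 for _ in f)
with open("data/lang_bpe_500/tokens.txt") as f:
    num_tokens = sum(1 for _ in f)

max_label = int(G.labels.max())
print(f"max label: {max_label}, |words.txt|: {num_words}, |tokens.txt|: {num_tokens}")
# A word-level G should use label ids up to roughly |words.txt|; a maximum
# bounded by |tokens.txt| (about 500 here) would suggest it was built from
# tokens.txt instead.
```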

ezerhouni (Collaborator)

@desh2608 I'm not sure, but it looks like your G is a token n-gram while rescore_with_whole_lattice expects a word n-gram. Could that be the case?

desh2608 (Collaborator Author) commented Mar 10, 2023

> @desh2608 I'm not sure, but it looks like your G is a token n-gram while rescore_with_whole_lattice expects a word n-gram. Could that be the case?

Ahh, of course. I should pass words.txt for the symbol table. Thanks!

Update: Actually, looking back at my command history, I see that I did use words.txt (not tokens.txt) to create G.fst.txt.

desh2608 (Collaborator Author)

It turns out that I had the wrong G.pt in my lang directory, so the correct G_4_gram.fst.txt was not being used. Here are the steps in case someone is interested.

1. Download and extract 3-gram.pruned.1e-7.arpa.gz and 4-gram.arpa.gz from https://openslr.org/11/ into data/lm.

2. Prepare G_3_gram.fst.txt and G_4_gram.fst.txt as follows:

   python -m kaldilm \
     --read-symbol-table="data/lang_bpe_500/words.txt" \
     --disambig-symbol="#0" \
     --max-order=3 \
     data/lm/3-gram.pruned.1e-7.arpa > data/lm/G_3_gram.fst.txt

   python -m kaldilm \
     --read-symbol-table="data/lang_bpe_500/words.txt" \
     --disambig-symbol="#0" \
     --max-order=4 \
     data/lm/4-gram.arpa > data/lm/G_4_gram.fst.txt

3. Compile HLG using the pruned 3-gram:

   python local/compile_hlg.py --lm G_3_gram --lang-dir data/lang_bpe_500

Now run decode.py. The G.pt gets created from data/lm/G_4_gram.fst.txt inside the decode script, so it doesn't have to be created in advance.
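
For context, the caching step inside the decode script looks roughly like the following simplified sketch. This assumes the recipe follows the usual icefall LibriSpeech pattern; the actual zipformer_ctc_att/decode.py does more (for example, mapping disambiguation symbols such as #0 to epsilon and moving G to the right device).

```python
# Simplified sketch of the G.pt caching step, assuming the common icefall
# pattern; details in the real decode script may differ.
import os

import k2
import torch

lm_dir = "data/lm"
cache = os.path.join(lm_dir, "G_4_gram.pt")

if not os.path.isfile(cache):
    with open(os.path.join(lm_dir, "G_4_gram.fst.txt")) as f:
        G = k2.Fsa.from_openfst(f.read(), acceptor=False)
    del G.aux_labels  # output labels are not needed for LM rescoring
    G = k2.arc_sort(G)
    torch.save(G.as_dict(), cache)
else:
    G = k2.Fsa.from_dict(torch.load(cache, map_location="cpu"))

# For whole-lattice rescoring, epsilon self-loops are added so that G can be
# composed with the decoding lattice.
G = k2.add_epsilon_self_loops(G)
G = k2.arc_sort(G)
```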

desh2608 added the ready label on Mar 11, 2023
desh2608 (Collaborator Author)

@csukuangfj please review when you have some time.

yfyeung requested a review from pkufool on March 17, 2023

| decoding method | test-clean | test-other | comment |
|-------------------------|------------|------------|---------------------|
| ctc-decoding | 2.50 | 5.86 | --epoch 30 --avg 9 |
Collaborator

Could you also post the result for HLG decoding, i.e., one-best decoding?

Collaborator Author

I am getting the following WERs for 1best:

| decoding method | test-clean | test-other | comment |
|-------------------------|------------|------------|---------------------|
| 1best                   | 2.01       | 4.61       | --epoch 30 --avg 9  |

This seems much better than the other decoding methods. Is that expected?

Collaborator

I think it is strange that 1best (HLG) is better than whole-lattice-rescoring (HLG + 4-gram G).

Collaborator Author

Yeah, I was thinking the same. I'll verify the numbers again.

Collaborator

@desh2608 It seems that you don't have a parameter to adjust the scale of the HLG decoding graph. Could you please add such a parameter, e.g.:

parser.add_argument(
    "--hlg-scale",
    type=float,
    default=0.8,
    help="""The scale to be applied to `hlg.scores`.""",
)

I tested your model and I got 2.46/5.36 with hlg_scale=0.5 for 1best decoding.
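
For context, applying such a flag typically amounts to scaling the graph scores right after HLG is loaded. The sketch below is an assumption based on how other icefall recipes do it, not code quoted from this PR, and the HLG.pt path is the conventional one.

```python
# Minimal sketch (assumed pattern, not taken from this PR): scale HLG.scores
# right after loading the decoding graph.
import k2
import torch

hlg_scale = 0.5  # e.g. the value that gave 2.46/5.36 above
device = torch.device("cpu")

HLG = k2.Fsa.from_dict(
    torch.load("data/lang_bpe_500/HLG.pt", map_location=device)
)
# Scaling HLG.scores down reduces the weight of the LM/graph scores relative
# to the CTC acoustic scores during 1best decoding.
HLG.scores *= hlg_scale
```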

Collaborator

> Yeah, I was thinking the same. I'll verify the numbers again.

@desh2608 Are you able to reproduce it, i.e., the WER of 2.01 on test-clean?

Collaborator Author

Sorry, I did not find time to check it. Let me try to do it this week.
@MarcoYang thanks for the pointer. I'll add it.

Collaborator Author

BTW, something else that is different in this recipe compared to the other LibriSpeech recipes is that I keep cuts shorter than 25 s (instead of 20 s), to avoid throwing away as much data. With the quadratic_duration option in DynamicBucketingSampler, this seems to be working fine (I could train on a V100 with batch size 800).
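
For illustration, here is a rough sketch of the setup described above. The cuts path, the quadratic_duration value, and the interpretation of "batch size 800" as max_duration=800 seconds are assumptions, not the recipe's exact settings.

```python
# Rough sketch of the data pipeline change described above; paths and values
# are illustrative assumptions.
from lhotse import CutSet
from lhotse.dataset import DynamicBucketingSampler

cuts = CutSet.from_file("data/fbank/librispeech_cuts_train-all-shuf.jsonl.gz")
# Keep cuts up to 25 s (instead of the usual 20 s) so less data is discarded.
cuts = cuts.filter(lambda c: 1.0 <= c.duration <= 25.0)

sampler = DynamicBucketingSampler(
    cuts,
    max_duration=800.0,       # seconds of audio per batch
    quadratic_duration=25.0,  # penalize long cuts quadratically when sizing batches
    num_buckets=30,
    shuffle=True,
    drop_last=True,
)
```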

Address comments from @csukuangfj
JinZr merged commit 7d56685 into k2-fsa:master on Oct 27, 2023 (3 checks passed)
armusc (Contributor) commented Oct 27, 2023

Hi,
looking at this conversation after the merge: were the numbers from 1best decoding confirmed in the end?
Thanks

JinZr (Collaborator) commented Oct 27, 2023 via email
