
Fine-tune recipe for Zipformer #1484

Merged
marcoyang1998 merged 5 commits into k2-fsa:master on Feb 6, 2024

Conversation

marcoyang1998
Collaborator

This is a fine-tuning recipe for Zipformer. The Zipformer model is pre-trained on LibriSpeech (see https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-2023-05-15) and fine-tuned on the GigaSpeech S subset. The results (WER, %) are as follows:

Model                                         GigaSpeech (dev/test)   LibriSpeech (test-clean/test-other)
LS-Zipformer, no fine-tune                    20.06/19.27             2.23/4.96
Giga-Zipformer                                15.33/15.31             5.88/15.33
LS-Zipformer, fine-tune on Giga-S, no mux     13.31/13.39             3.61/7.83
LS-Zipformer, fine-tune on Giga-S, with mux   13.18/13.09             2.13/4.72

The fine-tuned model achieves much better WERs on the target domain (GigaSpeech), and fine-tuning with mux also retains the performance on the original LibriSpeech test sets.

@marcoyang1998
Collaborator Author

Note that during fine-tuning, it is highly recommended to set the batch_count to a very large number so that all the ScheduledFloat variables reach their final values (this is already set by default in finetune.py). Otherwise, fine-tuning could be unstable.

@marcoyang1998
Collaborator Author

If you have the original training data and you want to retain the performance on the original set after fine-tuning, you can use CutSet.mux to mix the original training set with the fine-tuning set by setting --use-mux 1. An example training command is shown below, followed by a short sketch of what CutSet.mux does:

use_mux=1       # mix the original LibriSpeech cuts with the GigaSpeech cuts via CutSet.mux
do_finetune=1   # initialize from the pre-trained checkpoint rather than training from scratch

./zipformer/finetune.py \
  --world-size 2 \
  --num-epochs 30 \
  --start-epoch 1 \
  --exp-dir zipformer/exp_giga_finetune${do_finetune}_mux${use_mux} \
  --use-fp16 1 \
  --base-lr 0.0045 \
  --bpe-model data/lang_bpe_500/bpe.model \
  --do-finetune $do_finetune \
  --use-mux $use_mux \
  --finetune-ckpt icefall-asr-librispeech-zipformer-2023-05-15/exp/pretrained.pt \
  --max-duration 1000

@marcoyang1998 merged commit 7770740 into k2-fsa:master on Feb 6, 2024
@lalimili6

@marcoyang1998
Hi, many thanks.
1. Can you share the model?
2. How many hours of data do we need to fine-tune a model?

Best regards

@marcoyang1998
Collaborator Author

Can you share the model?

We don't provide the fine-tuned model, but you can download the pre-trained initialization from here: https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-2023-05-15

We have a tutorial for fine-tuning; please have a look at https://k2-fsa.github.io/icefall/recipes/Finetune/index.html

How many hours of data do we need to fine-tune a model?

It depends; usually tens of hours are enough.
