
Fine-tune recipe for Zipformer #1484

Merged
marcoyang1998 merged 5 commits into k2-fsa:master on Feb 6, 2024

Conversation

marcoyang1998
Collaborator

This is a fine-tuning recipe for Zipformer. The Zipformer model is pre-trained on LibriSpeech (see https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-2023-05-15) and fine-tuned on the GigaSpeech S subset. The results (WER, %) are as follows:

Model                                         GigaSpeech (dev/test)   LibriSpeech (test-clean/test-other)
LS-Zipformer, no fine-tune                    20.06/19.27             2.23/4.96
Giga-Zipformer                                15.33/15.31             5.88/15.33
LS-Zipformer, fine-tune on Giga-S, no mux     13.31/13.39             3.61/7.83
LS-Zipformer, fine-tune on Giga-S, with mux   13.18/13.09             2.13/4.72

The fine-tuned model achieves much better WERs on the target domain (GigaSpeech), and fine-tuning with mux also retains the performance on the original LibriSpeech test sets.

@marcoyang1998
Collaborator Author

Note that during fine-tuning, it is highly recommended to set the batch_count to a very large number so that all the ScheduledFloat variables reach their final values (this is already set by default in finetune.py). Otherwise, fine-tuning could be unstable.

@marcoyang1998
Collaborator Author

If you have the original training data and you want to retain the performance on the original set after fine-tuning, you can use CutSet.mux to mix the original training set with the fine-tuning set by setting --use-mux 1. An example training command is shown below, followed by a short sketch of what CutSet.mux does:

use_mux=1       # mix the original LibriSpeech cuts with the GigaSpeech cuts via CutSet.mux
do_finetune=1   # initialize from the pre-trained checkpoint rather than training from scratch

./zipformer/finetune.py \
  --world-size 2 \
  --num-epochs 30 \
  --start-epoch 1 \
  --exp-dir zipformer/exp_giga_finetune${do_finetune}_mux${use_mux} \
  --use-fp16 1 \
  --base-lr 0.0045 \
  --bpe-model data/lang_bpe_500/bpe.model \
  --do-finetune $do_finetune \
  --use-mux $use_mux \
  --finetune-ckpt icefall-asr-librispeech-zipformer-2023-05-15/exp/pretrained.pt \
  --max-duration 1000

@marcoyang1998 merged commit 7770740 into k2-fsa:master on Feb 6, 2024
@lalimili6

@marcoyang1998
Hi, many thanks.
1. Can you share the model?
2. How many hours of data do we need to fine-tune a model?

Best regards

@marcoyang1998
Collaborator Author

Can you share the model?

We don't provide the fine-tuned model, but you can download the pre-trained initialization from here: https://huggingface.co/Zengwei/icefall-asr-librispeech-zipformer-2023-05-15

We have a tutorial for fine-tuning; please have a look at https://k2-fsa.github.io/icefall/recipes/Finetune/index.html

How many hours of data do we need to fine-tune a model?

It depends; usually tens of hours are enough.
