Add MMS CTC Fine-Tuning #24281
Conversation
```diff
@@ -579,12 +632,24 @@ def remove_special_characters(batch):
         cache_dir=model_args.cache_dir,
         config=config,
         use_auth_token=data_args.use_auth_token,
+        ignore_mismatched_sizes=True,
```
This is needed when instantiating from CTC checkpoints
Good catch
```diff
@@ -31,6 +31,7 @@
 import numpy as np
 import torch
 from datasets import DatasetDict, load_dataset
+from safetensors.torch import save_file as safe_save_file
```
It's a required dependency so should be fine
I think this should go in its own example instead of adding some more code to the (already complex) ctc example. It's preferable to have multiple examples focused on one thing than one big multi-purpose example.
Ok for me
Thanks for adding! Changes all LGTM (filled in examples conditional ;))
+1 to @sgugger's suggestion of having a separate example
Looks good already, thanks for the updates @patrickvonplaten. Just some minor suggestions
```python
adapter_attn_dim: int = field(
    default=None,
    metadata={
        "help": "If defined, adapter layers will be randomely initialized and the rest of the model will be frozen."
```
Suggested change:
```diff
-        "help": "If defined, adapter layers will be randomely initialized and the rest of the model will be frozen."
+        "help": "If defined, adapter layers will be randomly initialized and the rest of the model will be frozen."
```
```python
adapter_language: Optional[str] = field(
    default=None,
    metadata={
        "help": (
```
Nice help message!
```diff
@@ -132,6 +134,12 @@ class ModelArguments:
     ctc_loss_reduction: Optional[str] = field(
         default="mean", metadata={"help": "The way the ctc loss should be reduced. Should be one of 'mean' or 'sum'."}
     )
+    adapter_attn_dim: int = field(
+        default=None,
```
Can this default to some sensible value or should we always force the user to pass it?
Yes good point, now that things will be moved to a new file, I'll set a good default
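As a sketch of what a sensible default might look like (the value 16 is purely an assumption for illustration, not the default the PR actually ships):

```python
# Illustrative sketch of giving adapter_attn_dim a sensible default.
# The value 16 is an assumption for demonstration, not the PR's real default.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelArguments:
    adapter_attn_dim: Optional[int] = field(
        default=16,
        metadata={
            "help": "If defined, adapter layers will be randomly initialized and the rest of the model will be frozen."
        },
    )

args = ModelArguments()
print(args.adapter_attn_dim)
```

Users who want full fine-tuning could still pass `--adapter_attn_dim` explicitly or set it to `None`.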
```python
# first we freeze the whole base model
model.freeze_base_model()

# next we unfreeze all adapter layers
```
Do we need to unfreeze the adapter weights? They don't get frozen in model.freeze_base_model()
They do get frozen in model.freeze_base_model() (adapter attention weights are part of it).
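To illustrate the pattern under discussion on a toy module: freezing the base model also freezes the adapter, so gradients must be re-enabled for the adapter parameters afterwards. The layer names below are stand-ins, not the real wav2vec2 structure:

```python
# Toy sketch of the freeze-then-unfreeze pattern. `base`, `adapter`, and
# `lm_head` are illustrative names, not the real MMS/wav2vec2 layout.
import torch.nn as nn

class ToyCTCModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.base = nn.Linear(8, 8)      # stands in for the wav2vec2 encoder
        self.adapter = nn.Linear(8, 8)   # stands in for adapter attention layers
        self.lm_head = nn.Linear(8, 4)   # CTC head, trained from scratch

    def freeze_base_model(self):
        # Freezes everything except the LM head -- including the adapter,
        # which is why the script must unfreeze the adapter again afterwards.
        for name, param in self.named_parameters():
            if not name.startswith("lm_head"):
                param.requires_grad = False

model = ToyCTCModel()
# first we freeze the whole base model
model.freeze_base_model()
# next we unfreeze only the adapter layers
for name, param in model.named_parameters():
    if "adapter" in name:
        param.requires_grad = True

trainable = sorted(n for n, p in model.named_parameters() if p.requires_grad)
print(trainable)
```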
```diff
@@ -1194,6 +1194,19 @@ def _get_adapters(self):

         return adapter_weights

+    def init_adapter_layers(self):
```
Nice!
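A rough sketch of what an `init_adapter_layers`-style method does: re-run weight initialization on just the adapter modules. The `is_adapter_layer` flag here is a stand-in for however the real model identifies its adapter attention modules:

```python
# Sketch: re-initialize only the adapter modules of a model in place.
# The `is_adapter_layer` attribute is a hypothetical marker; the real model
# walks its own adapter attention modules instead.
import torch
import torch.nn as nn

def init_adapter_layers(model: nn.Module):
    for module in model.modules():
        if getattr(module, "is_adapter_layer", False):
            module.reset_parameters()  # re-runs the layer's default init

adapter = nn.Linear(4, 4)
adapter.is_adapter_layer = True
model = nn.Sequential(adapter, nn.Linear(4, 4))

before = adapter.weight.clone()
init_adapter_layers(model)
# The adapter weights are freshly re-initialized; other layers are untouched.
print(bool((adapter.weight != before).any()))
```

This matches the training flow above: load a CTC checkpoint, randomly re-initialize the adapter, then fine-tune only those weights.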
Added a test. Moved the code into a new example file. Added an extensive README. WER for a quick 10-minute run can be as low as 23%!
Demo training run: https://huggingface.co/patrickvonplaten/wav2vec2-common_voice-tr-mms-demo
* Add mms ctc fine tuning
* make style
* More fixes that are needed
* make fix-copies
* make draft for README
* add new file
* move to new file
* make style
* make style
* add quick test
* make style
* make style
In which release will this be available?
You can find the example script here: https://github.com/huggingface/transformers/tree/main/examples/pytorch/speech-recognition#connectionist-temporal-classification-with-adapters

It assumes you are running from the latest dev version (see run_speech_recognition_ctc_adapter.py, lines 55 to 56 at f104522), which you can do by following the instructions for installing from source or doing an editable install: https://huggingface.co/docs/transformers/installation#install-from-source

That said, for MMS ASR fine-tuning you can safely run the script with the latest PyPI release (4.31.0).
What does this PR do?
This PR adds language adapter fine-tuning for MMS. Still playing around with good hyper-parameters, but the script is functional.
Getting some very nice results already: WER drops to 25% after just 200 steps.
See: https://wandb.ai/patrickvonplaten/huggingface/runs/6f5cx5gg?workspace=user-patrickvonplaten
@sgugger @amyeroberts @sanchit-gandhi it'd be super nice to get a quick review here whether the code changes are generally fine with you. I'll only have to fill out the TODOs in the README with a nice example code and some description.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.