
Refactor Emformer RNNT recipes #2212

Closed
hwangjeff wants to merge 1 commit

Conversation

hwangjeff (Contributor)

Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

@facebook-github-bot (Contributor)

@hwangjeff has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

hwangjeff added a commit to hwangjeff/audio that referenced this pull request Feb 9, 2022
Summary:
Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

Pull Request resolved: pytorch#2212

Differential Revision: D34120104

Pulled By: hwangjeff

fbshipit-source-id: 38cf6453d19e74f9851ff0544c6c7f604ec5b630
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D34120104

hwangjeff marked this pull request as ready for review on February 9, 2022 22:59
mthrok (Collaborator) left a comment


I think the demo script can be moved one directory up so that it can handle both the LibriSpeech model and the TED-LIUM model. It doesn't have to happen in this PR. @nateanl, would you like to update #2203 after this one?
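
A minimal sketch of the suggested consolidation, assuming a single demo entry point that dispatches on dataset; the `--model-type` flag, function names, and registry here are hypothetical illustrations, not taken from the recipes:

```python
import argparse
import pathlib


def run_librispeech_demo(checkpoint_path):
    ...  # placeholder: load the LibriSpeech-trained Emformer RNN-T and run streaming inference


def run_tedlium3_demo(checkpoint_path):
    ...  # placeholder: load the TED-LIUM Release 3-trained Emformer RNN-T and run streaming inference


# Hypothetical registry mapping dataset name to demo runner.
DEMOS = {
    "librispeech": run_librispeech_demo,
    "tedlium3": run_tedlium3_demo,
}


def cli_main():
    parser = argparse.ArgumentParser(description="Emformer RNN-T streaming demo")
    parser.add_argument("--model-type", choices=sorted(DEMOS), required=True)
    parser.add_argument("--checkpoint-path", type=pathlib.Path, required=True)
    args = parser.parse_args()
    DEMOS[args.model_type](args.checkpoint_path)


if __name__ == "__main__":
    cli_main()
```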


## Model Types

Currently, we have training recipes for the LibriSpeech and TED-LIUM Release 3 datasets.
Collaborator

Could you add a brief description of the differences between these models? I know vocab size is one. Are there any others?

hwangjeff (Contributor, Author)
Yeah, that's the only difference in the model architecture. The recipes themselves have some differences that are specific to the datasets; these should be apparent in the lightning modules.
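
To make the point concrete, a sketch of the two model constructions, assuming torchaudio's `emformer_rnnt_base` factory; the symbol counts below are illustrative assumptions (SentencePiece vocab plus one blank token), not values taken from the recipes:

```python
from torchaudio.models import emformer_rnnt_base

# Same architecture; only the output vocabulary size differs.
librispeech_model = emformer_rnnt_base(num_symbols=4097)  # assumed: 4096 SPM pieces + blank
tedlium3_model = emformer_rnnt_base(num_symbols=501)  # assumed: 500 SPM pieces + blank
```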

@facebook-github-bot (Contributor)

@hwangjeff has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

hwangjeff added a commit to hwangjeff/audio that referenced this pull request Feb 10, 2022
Summary:
Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

Pull Request resolved: pytorch#2212

Reviewed By: mthrok

Differential Revision: D34120104

Pulled By: hwangjeff

fbshipit-source-id: 60deed45731954d5f9b7df73a35d8373184c3c67
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D34120104


  assert len(idx_target_lengths) > 0

  idx_target_lengths = sorted(idx_target_lengths, key=lambda x: x[1])

  assert max_token_limit >= idx_target_lengths[-1][1]

- self.batches = _batch_by_token_count(idx_target_lengths, max_token_limit)
+ self.batches = batch_by_token_count(idx_target_lengths, max_token_limit)[:100]
Member

Why do we choose the first 100 batches here?

hwangjeff (Contributor, Author)

Good call. This was meant to speed up the training epoch for testing purposes only and should be removed.
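
For context, a minimal sketch of what a token-count batching helper like `batch_by_token_count` might do, assuming `idx_target_lengths` is a list of `(sample_index, target_length)` pairs; this is an illustration, not the recipe's actual implementation:

```python
def batch_by_token_count(idx_target_lengths, token_limit):
    """Greedily group sample indices so each batch stays within token_limit tokens."""
    batches = []
    current_batch, current_token_count = [], 0
    for idx, target_length in idx_target_lengths:
        # Start a new batch when adding this sample would exceed the limit.
        if current_batch and current_token_count + target_length > token_limit:
            batches.append(current_batch)
            current_batch, current_token_count = [], 0
        current_batch.append(idx)
        current_token_count += target_length
    if current_batch:
        batches.append(current_batch)
    return batches
```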

@nateanl (Member)

nateanl commented Feb 10, 2022

Do we want to keep the train_sentencepiece.py script in each subdirectory? Otherwise, users may overlook important details that impact model performance.

@mthrok (Collaborator)

mthrok commented Feb 14, 2022

> Do we want to keep the train_sentencepiece.py script in each subdirectory? Otherwise, users may overlook important details that impact model performance.

@nateanl The difference between the LibriSpeech model and the TED-LIUM model should be clearly outlined in the top-level README. It can be as simple as noting that the SentencePiece model also needs to be different and referring readers to the README in the respective directory, delegating the details to the subdirectory READMEs.
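
A brief sketch of what a per-dataset train_sentencepiece.py invocation might look like, using the sentencepiece Python API; the input file name and vocab size here are assumptions for illustration, not the recipes' actual settings:

```python
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="librispeech_transcripts.txt",  # hypothetical transcript dump
    model_prefix="spm_librispeech",
    vocab_size=4096,  # assumed; the TED-LIUM recipe would use a smaller vocab
    model_type="unigram",
)
```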

default=pathlib.Path("global_stats.json"),
type=pathlib.Path,
help="Path to JSON file containing feature means and stddevs.",
required=True,
Contributor

As it has a default value, is it really necessary to specify required=True here? Also, the default value is not correct, as that file is inside the recipe directory.

I would suggest removing the default value.
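
A small sketch of the suggested fix, dropping the default so that required=True is meaningful and argparse enforces an explicit path; the flag name --global-stats-path is assumed for illustration:

```python
import argparse
import pathlib

parser = argparse.ArgumentParser()
parser.add_argument(
    "--global-stats-path",  # hypothetical flag name
    type=pathlib.Path,
    help="Path to JSON file containing feature means and stddevs.",
    required=True,  # no default, so the caller must pass an explicit path
)
```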

xiaohui-zhang pushed a commit to xiaohui-zhang/audio that referenced this pull request May 4, 2022
Summary:
Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

Pull Request resolved: pytorch#2212

Reviewed By: mthrok

Differential Revision: D34120104

Pulled By: hwangjeff

fbshipit-source-id: 29c6e27195d5998f76d67c35b718110e73529456