Refactor Emformer RNNT recipes #2212
Conversation
Force-pushed from 871d093 to bb41509
@hwangjeff has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

Pull Request resolved: pytorch#2212
Differential Revision: D34120104
Pulled By: hwangjeff
fbshipit-source-id: 38cf6453d19e74f9851ff0544c6c7f604ec5b630
This pull request was exported from Phabricator. Differential Revision: D34120104
Force-pushed from bb41509 to be42df8
```markdown
## Model Types

Currently, we have training recipes for the LibriSpeech and TED-LIUM Release 3 datasets.
```
Could you add a brief description of the differences between these models? I know vocab size is one. Is there any other?
Yeah, that's the only difference in the model architecture. The recipes themselves have some dataset-specific differences, which should be apparent in the Lightning modules.
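For illustration, a minimal sketch of how that single architectural difference could look when building the two models; the factory name and the `num_symbols` values (SentencePiece vocabulary size plus the blank token) are assumptions for this sketch, not taken from the diff:

```python
import torchaudio

# Minimal sketch: the two recipes would share the same Emformer RNN-T architecture
# and differ only in the output vocabulary size. The factory name and num_symbols
# values below are assumptions for illustration, not taken from this PR.
librispeech_model = torchaudio.models.emformer_rnnt_base(num_symbols=4097)  # e.g., 4096-piece SentencePiece model + blank
tedlium3_model = torchaudio.models.emformer_rnnt_base(num_symbols=501)      # e.g., 500-piece SentencePiece model + blank
```

Everything else (feature extraction, tokenizer, data loading) would differ per dataset inside the respective Lightning modules.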
Force-pushed from be42df8 to d7564fe
@hwangjeff has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

Pull Request resolved: pytorch#2212
Reviewed By: mthrok
Differential Revision: D34120104
Pulled By: hwangjeff
fbshipit-source-id: 60deed45731954d5f9b7df73a35d8373184c3c67
This pull request was exported from Phabricator. Differential Revision: D34120104
Force-pushed from d7564fe to 3fbbf34
Summary: Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

Pull Request resolved: pytorch#2212
Reviewed By: mthrok
Differential Revision: D34120104
Pulled By: hwangjeff
fbshipit-source-id: 9ed0a5d5b209478a841324f360c5b66268e3b228
Force-pushed from 3fbbf34 to ebda0ec
This pull request was exported from Phabricator. Differential Revision: D34120104
```diff
 assert len(idx_target_lengths) > 0
 idx_target_lengths = sorted(idx_target_lengths, key=lambda x: x[1])
 assert max_token_limit >= idx_target_lengths[-1][1]
-self.batches = _batch_by_token_count(idx_target_lengths, max_token_limit)
+self.batches = batch_by_token_count(idx_target_lengths, max_token_limit)[:100]
```
Why do we choose the first 100 batches here?
Good call. This was meant to speed up the training epoch for testing purposes only and should be removed.
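For context, a sketch of what a token-count bucketing helper of this shape might look like; the body below is an assumption (only the call `_batch_by_token_count(idx_target_lengths, max_token_limit)` appears in the diff), and the `[:100]` truncation is dropped per the comment above:

```python
from typing import List, Tuple


def _batch_by_token_count(
    idx_target_lengths: List[Tuple[int, int]], max_token_limit: int
) -> List[List[int]]:
    # Group (sample_index, target_length) pairs into batches whose summed target
    # lengths stay within max_token_limit. This body is an illustrative assumption;
    # only the function's call signature is visible in the diff above.
    batches: List[List[int]] = []
    current_batch: List[int] = []
    current_tokens = 0
    for idx, length in idx_target_lengths:
        if current_batch and current_tokens + length > max_token_limit:
            batches.append(current_batch)
            current_batch = []
            current_tokens = 0
        current_batch.append(idx)
        current_tokens += length
    if current_batch:
        batches.append(current_batch)
    return batches


# With the testing-only truncation removed, the assignment would read:
# self.batches = _batch_by_token_count(idx_target_lengths, max_token_limit)
```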
Do we want to keep the
@nateanl The difference between the LibriSpeech model and the TED-LIUM model should be clearly outlined in the top-level README. It can be as simple as
```python
    default=pathlib.Path("global_stats.json"),
    type=pathlib.Path,
    help="Path to JSON file containing feature means and stddevs.",
    required=True,
```
As it has a default value, is it really necessary to specify `required=True` here? Also, the default value is not correct, as that file is inside the recipe dir. I would suggest removing the default value.
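A minimal sketch of the argument definition with that suggestion applied (no default, so `required=True` is meaningful); the flag name `--global-stats-path` is a hypothetical placeholder, not taken from the diff:

```python
import pathlib
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument(
    "--global-stats-path",  # hypothetical flag name for illustration
    type=pathlib.Path,
    help="Path to JSON file containing feature means and stddevs.",
    required=True,  # no default: the caller must point at the file inside the recipe dir
)
```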
Summary: Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.

Pull Request resolved: pytorch#2212
Reviewed By: mthrok
Differential Revision: D34120104
Pulled By: hwangjeff
fbshipit-source-id: 29c6e27195d5998f76d67c35b718110e73529456
Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.