Add Conformer RNN-T LibriSpeech training recipe #2329

Closed
wants to merge 13 commits into from

Contributor

@hwangjeff hwangjeff commented Apr 12, 2022

Adds Conformer RNN-T LibriSpeech training recipe to examples directory.

Produces 30M-parameter model that achieves the following WER:

|                     |          WER |
|:-------------------:|-------------:|
| test-clean          |       0.0310 |
| test-other          |       0.0805 |
| dev-clean           |       0.0314 |
| dev-other           |       0.0827 |
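
For readers unfamiliar with the metric, the figures above are fractions, so 0.0310 corresponds to 3.10% of reference words in error. The following is a minimal sketch of the usual word error rate definition (word-level edit distance divided by the number of reference words). The function name `word_error_rate` is hypothetical, and this is not necessarily how the recipe's evaluation script computes or aggregates the metric.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper only; the name and interface are assumptions,
    not part of the recipe in this pull request.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

One substituted word in a four-word reference gives `word_error_rate("a b c d", "a x c d") == 0.25`.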

@hwangjeff hwangjeff force-pushed the librispeech_conformer_rnnt branch from 745e36d to dda2ce3 Compare April 12, 2022 04:23
@hwangjeff hwangjeff marked this pull request as ready for review April 12, 2022 14:33
@facebook-github-bot
Contributor

@hwangjeff has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

    ):
        super().__init__()

        self.model = conformer_rnnt_base()
Contributor

I just realized `num_symbols` is hardcoded to 1024 inside `conformer_rnnt_base`. IMO `conformer_rnnt_base` should only be used in test cases. Here we should instantiate the model explicitly via `conformer_rnnt_model`, passing `self.sp_model.get_piece_size()` as `num_symbols`, and consider exposing more of the model's parameters as inputs to the LightningModule later on.
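
A minimal sketch of that suggestion, under stated assumptions: `build_rnnt_model` is a hypothetical helper, and `model_factory` stands in for `conformer_rnnt_model`. Only the `num_symbols` keyword comes from this thread. The remaining hyperparameters are not specified here, so they are passed through untouched rather than guessed.

```python
from typing import Any, Callable


def build_rnnt_model(
    model_factory: Callable[..., Any],
    sp_model: Any,
    **model_kwargs: Any,
) -> Any:
    """Build a model whose vocabulary size comes from the tokenizer.

    Hypothetical helper for illustration. `model_factory` is assumed to
    accept a `num_symbols` keyword (as `conformer_rnnt_model` is described
    as doing above); `sp_model` is assumed to expose `get_piece_size()`,
    as a SentencePiece processor does.
    """
    if "num_symbols" in model_kwargs:
        # Avoid silently reintroducing a hardcoded vocabulary size.
        raise ValueError("num_symbols is derived from sp_model; do not pass it explicitly")
    # Follows the review comment literally. Whether an extra slot is
    # needed (for example, for a blank token) is not covered here.
    return model_factory(num_symbols=sp_model.get_piece_size(), **model_kwargs)
```

Inside the LightningModule this would replace the `conformer_rnnt_base()` call with something like `self.model = build_rnnt_model(conformer_rnnt_model, self.sp_model, **model_kwargs)`, where `model_kwargs` carries the remaining hyperparameters and could later be exposed as module inputs.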

examples/asr/librispeech_conformer_rnnt/lightning.py (outdated, resolved)
@hwangjeff hwangjeff force-pushed the librispeech_conformer_rnnt branch from 6fd974d to 854ece9 Compare April 13, 2022 19:45
@hwangjeff hwangjeff force-pushed the librispeech_conformer_rnnt branch from da751eb to 3437eef Compare April 13, 2022 22:26

Collaborator

@mthrok mthrok left a comment

Stamp.

@github-actions

Hey @hwangjeff.
You merged this PR, but labels were not properly added. Please add a primary and secondary label (See https://github.com/pytorch/audio/blob/main/.github/process_commit.py)

xiaohui-zhang pushed a commit to xiaohui-zhang/audio that referenced this pull request May 4, 2022
Summary:
Adds Conformer RNN-T LibriSpeech training recipe to examples directory.

Produces 30M-parameter model that achieves the following WER:

|                     |          WER |
|:-------------------:|-------------:|
| test-clean          |       0.0310 |
| test-other          |       0.0805 |
| dev-clean           |       0.0314 |
| dev-other           |       0.0827 |

Pull Request resolved: pytorch#2329

Reviewed By: xiaohui-zhang

Differential Revision: D35578727

Pulled By: hwangjeff

fbshipit-source-id: afa9146c5b647727b8605d104d928110a1d3976d

5 participants