add conformer configs for hat model #6372

Merged
merged 14 commits into from
Apr 14, 2023

Conversation

andrusenkoau
Collaborator

@andrusenkoau andrusenkoau commented Apr 5, 2023

What does this PR do ?

Add Conformer char and BPE configs for the HAT model (https://arxiv.org/abs/2003.07705)

Collection: [ASR]

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Collaborator

@titu1994 titu1994 left a comment

It looks fine. I wonder if we want to create another directory for it, such as asr/conf/conformer_hat, since it is technically a different model.
I am fine with this approach too, but would like @VahidooX to review and comment on what his preferences are for Conformer HAT.

It might be preferable to have a subdirectory conf/conformer/hat/conformer_hat_*.yaml?

examples/asr/conf/hat/conformer/conformer_hat_bpe.yaml Outdated Show resolved Hide resolved
examples/asr/conf/hat/conformer/conformer_hat_bpe.yaml Outdated Show resolved Hide resolved
examples/asr/conf/hat/conformer/conformer_hat_char.yaml Outdated Show resolved Hide resolved
examples/asr/conf/hat/conformer/conformer_hat_bpe.yaml Outdated Show resolved Hide resolved
@titu1994 titu1994 requested a review from VahidooX April 6, 2023 01:29
@VahidooX
Collaborator

VahidooX commented Apr 6, 2023

It looks fine. I wonder if we want to create another directory for it, such as asr/conf/conformer_hat, since it is technically a different model. I am fine with this approach too, but would like @VahidooX to review and comment on what his preferences are for Conformer HAT.

It might be preferable to have a subdirectory conf/conformer/hat/conformer_hat_*.yaml?

Looks good to me as long as they have their own separate folder under conformer.

andrusenkoau and others added 8 commits April 6, 2023 05:36
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
@andrusenkoau
Collaborator Author

Hi @titu1994, @VahidooX!
I have fixed the configs and added documentation for the HAT model. Does it look good to merge?

@@ -75,7 +75,7 @@ Key Features
* Speech processing
* `HuggingFace Space for Audio Transcription (File, Microphone and YouTube) <https://huggingface.co/spaces/smajumdar/nemo_multilingual_language_id>`_
* `Automatic Speech Recognition (ASR) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/intro.html>`_
- * Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC, FastConformer-CTC, FastConformer-Transducer...
+ * Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC, FastConformer-CTC, FastConformer-Transducer, Conformer-HAT...
Collaborator

Would you please also update the following statement to have Hybrid ASR:
Supports CTC, Transducer/RNNT and Hybrid losses/decoders

Collaborator Author

Done

VahidooX previously approved these changes Apr 11, 2023
Collaborator

@VahidooX VahidooX left a comment

LGTM! Just left two minor comments.

.. _Conformer-HAT_model:

Conformer-HAT (Hybrid Autoregressive Transducer)
--------------------------------------
Collaborator

The underline should be the same length as the title.

Collaborator Author

Done

The Conformer-HAT model (not to be confused with Hybrid-Transducer-CTC) is a modification of the Conformer-Transducer model based on this `Google paper <https://arxiv.org/abs/2003.07705>`_.
The main idea is to predict the label scores and the blank score separately, which makes it possible to estimate the internal LM probabilities during decoding.
When an external LM is available for inference, the internal LM score can be subtracted from the HAT model prediction in beam search decoding to make the external LM more effective.
This can be helpful for text-only adaptation to new domains.
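
The separation of blank and label scores described above can be illustrated with a small NumPy sketch. It is not the NeMo implementation: `toy_joint`, the blank index of 0, the additive encoder/predictor combination, and the `ilm_weight` / `lm_weight` values are all hypothetical choices made for illustration. The sketch assumes the formulation from the referenced paper, where the blank probability comes from a sigmoid, the label probabilities come from a softmax scaled by the non-blank probability, and the internal LM is approximated by running the joint network with the encoder contribution zeroed out.

```python
import numpy as np


def log_softmax(x):
    """Numerically stable log-softmax over the last axis."""
    shifted = x - np.max(x, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -np.logaddexp(0.0, -x)


def hat_log_probs(joint_logits, blank_index=0):
    """Split joint logits into a blank log-prob and label log-probs.

    The blank is modeled as a Bernoulli variable (sigmoid of its logit);
    the labels share the remaining mass (1 - P(blank)) via a softmax.
    blank_index=0 is an assumption of this sketch.
    """
    blank_logit = joint_logits[..., blank_index]
    label_logits = np.delete(joint_logits, blank_index, axis=-1)
    log_blank = log_sigmoid(blank_logit)
    log_not_blank = log_sigmoid(-blank_logit)  # log(1 - sigmoid(x))
    log_labels = np.expand_dims(log_not_blank, -1) + log_softmax(label_logits)
    return log_blank, log_labels


def toy_joint(enc, pred, weight):
    """Hypothetical joint network: tanh of the summed features, then a projection."""
    return np.tanh(enc + pred) @ weight


def internal_lm_log_probs(pred, weight, blank_index=0):
    """Approximate the internal LM by zeroing the encoder contribution."""
    logits = toy_joint(np.zeros_like(pred), pred, weight)
    return log_softmax(np.delete(logits, blank_index, axis=-1))


def fused_label_scores(log_labels, ilm, ext_lm, ilm_weight=0.2, lm_weight=0.3):
    """Subtract the internal LM and add the external LM (weights are illustrative)."""
    return log_labels - ilm_weight * ilm + lm_weight * ext_lm


rng = np.random.default_rng(0)
hidden, vocab = 8, 5
weight = rng.normal(size=(hidden, vocab + 1))  # +1 output for the blank
enc = rng.normal(size=hidden)
pred = rng.normal(size=hidden)

log_blank, log_labels = hat_log_probs(toy_joint(enc, pred, weight))
ilm = internal_lm_log_probs(pred, weight)
ext_lm = np.full(vocab, -np.log(vocab))  # stand-in for an external LM (uniform)
fused = fused_label_scores(log_labels, ilm, ext_lm)
```

Because the blank and label distributions are factored this way, the blank probability and the label probabilities still sum to one, while the label part alone can be compared against a language model score.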
Collaborator

How can users use this feature?
Do the current LM scripts support it?

Collaborator Author

By default, the Conformer HAT model behaves at decoding time like a standard Transducer model with the same interface. However, if you have an external n-gram LM, you can use the scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py script. An updated version of the script is under review in #6370.

Collaborator Author

@VahidooX -- could you approve the PR if everything is OK?

Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
@VahidooX VahidooX merged commit 5c12524 into NVIDIA:main Apr 14, 2023
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
* add conformer configs for hat model

Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>

Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
3 participants