Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ASR with TTS Tutorial. Fix enhancer usage. #6955

Merged
merged 7 commits into from
Jul 12, 2023

Conversation

artbataev
Copy link
Collaborator

@artbataev artbataev commented Jun 30, 2023

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Collection: [ASR]

Changelog

  • Tutorial for using hybrid ASR-TTS Models (ASRWithTTSModel)
  • fix enhancer loading in ASRWithTTSModel

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@github-actions github-actions bot added the ASR label Jun 30, 2023
@artbataev artbataev requested review from racoiaws and GNroy June 30, 2023 20:06
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
@artbataev artbataev changed the title Add ASR with TTS Tutorial Add ASR with TTS Tutorial. Fix enhancer usage. Jun 30, 2023
Copy link
Collaborator

@GNroy GNroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good tutorial overall.
Please see the comments below and do grammar proofreading.
I'd also recommend adding a section about the Enhancer (what is it, maybe how to build and train it).

"id": "50fc294f-f319-4465-8f90-a28b49843e60",
"metadata": {},
"source": [
"This tutorial is designed to introduce using Hybrid ASR-TTS models, aka `ASRWithTTSModel`, to finetune existing ASR models using an integrated text-to-mel-spectrogram generator. "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first part of the sencence can be improved, e.g. like
"This tutorial is intended to introduce you to using ASR-TTS Hybrid Models, also known as ASRWithTTSModel, ..."

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

"\n",
"The architecture is shown in the figure. \n",
"\n",
"The model can take text or audio as input during training. In the case of audio input, a mel spectrogram is extracted as usual and passed to ASR neural network. In the case of textual input, the mel spectrogram generator produces a spectrogram on the fly from the text. The spectrogram is improved by the enhancer (if present) and fed into the ASR model. \n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... a mel spectrogram is extracted as usual and passed to the ASR neural network...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

"Text-only manifest should contain three fields:\n",
"- `text`: target text for ASR model\n",
"- `tts_text`: text to use as a source for TTS model (unnormalized)\n",
"- `tts_text_normalized`: text to use as a source for TTS model (unnormalized)\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... for TTS model (normalized)" maybe?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

"* TTS Mel Spectrogram Generator (currently, only `FastPitch` model is supported)\n",
"* Enhancer model (optional)\n",
"\n",
"Also, the config allows to specify text-only dataset.\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... to specify a text-only dataset

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

"metadata": {},
"source": [
"The tutorial demonstrated the main concepts related to hybrid ASR-TTS models to finetune ASR models and train new ones from scratch. \n",
"The ability to achieve good text-only adaptation results is demonstrated by finetuning a small Conformer model on text-only data from the AN4 dataset."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the Conclusion section is optional. You can leave it though if you like it. Or you can make it "Further Reading" or something.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's nice to have this section. I'd suggest renaming this to "Summary"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to "Summary"

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
@artbataev artbataev marked this pull request as ready for review July 11, 2023 19:07
@artbataev
Copy link
Collaborator Author

@GNroy, @racoiaws, I addressed your comments.

I also added a section about TTS models before the summary.

Also, tested on Goole Colab, everything works, but for now I should set BRANCH=cherry-pick-main-de52d957e63a6e5cc22b2e9463420223aacca533. This will be fixed after merging #7013 into main.

@artbataev artbataev requested a review from GNroy July 11, 2023 19:43
Copy link
Collaborator

@GNroy GNroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@artbataev artbataev merged commit 7ff8b08 into NVIDIA:r1.20.0 Jul 12, 2023
@artbataev artbataev deleted the asr_with_tts_tutorial branch July 12, 2023 17:04
github-actions bot pushed a commit that referenced this pull request Jul 12, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
artbataev added a commit that referenced this pull request Jul 13, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
ericharper pushed a commit that referenced this pull request Aug 4, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
tango4j added a commit that referenced this pull request Aug 5, 2023
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
jubick1337 pushed a commit that referenced this pull request Aug 8, 2023
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
ekmb added a commit that referenced this pull request Aug 8, 2023
* Fix race condition when executing with multi-node where some ranks does not wait for setup (#7016)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* char level bug

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* removed kwagrs

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Updated config desc

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* tutorial added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* for for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* unused import removed

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix review comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments 2

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix config tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fox comments 3

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import ordering fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* yttm for asr removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* logging added

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* added inference and translate method

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

---------

Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update fvad example

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix onnx export

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update test

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e325.

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overriden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* adde special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annoatation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithyare@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: trias702 <25867060+trias702@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jan Beckmann <king-jan1999@hotmail.de>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>
dorotat-nv pushed a commit to dorotat-nv/NeMo that referenced this pull request Aug 24, 2023
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (NVIDIA#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (NVIDIA#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (NVIDIA#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (NVIDIA#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>
dorotat-nv pushed a commit to dorotat-nv/NeMo that referenced this pull request Aug 24, 2023
* Fix race condition when executing with multi-node where some ranks does not wait for setup (NVIDIA#7016)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added bool types to neural_types export (NVIDIA#7032)

Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* rnnt and char utils (NVIDIA#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* char level bug

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix tab text gen (NVIDIA#7022) (NVIDIA#7031)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* removed kwagrs

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Updated config desc

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* ASR Confidence update and tutorial (NVIDIA#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* tutorial added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* for for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* unused import removed

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix review comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments 2

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix config tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fox comments 3

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* install_bs (NVIDIA#7019) (NVIDIA#7028)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fixes for spellmapper (NVIDIA#6994) (NVIDIA#7000)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* added back the retro documents (NVIDIA#7033)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Remove pyyaml (NVIDIA#7052) (NVIDIA#7054)

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* st standalone model (NVIDIA#6969)

* st standalone model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import ordering fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* yttm for asr removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* logging added

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* added inference and translate method

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* remove pos emb from state dict for old models (NVIDIA#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

---------

Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix typo in ASR-TTS tutorial (NVIDIA#7049)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed tutorial's name (NVIDIA#7047)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix documentation for Numba (NVIDIA#7065) (NVIDIA#7077)

* Fix documentation for Numba

* Update force float32 flag dynamically

* Update force float32 flag dynamically

* Fix nemo version

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update Frame-VAD doc and fix onnx export (NVIDIA#7076)

* update fvad doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update fvad example

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix onnx export

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update test

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* memmap worker arg (NVIDIA#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (NVIDIA#7034) (NVIDIA#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fast Conformer global token fix (NVIDIA#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Refined export_config (NVIDIA#7053) (NVIDIA#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* small Bugfix (NVIDIA#7081)

* small Bugfix (NVIDIA#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (NVIDIA#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Adding docs and models for multiple lookahead cache-aware ASR (NVIDIA#7067) (NVIDIA#7094)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* update TTS readme (NVIDIA#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix absolute path in path join call (NVIDIA#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Disable distopt contiguous param buffer by default (NVIDIA#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* microphone demo (NVIDIA#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [Fix] load_state_dict in nlp_model.py (NVIDIA#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix plot function in vad_utils.py (NVIDIA#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (NVIDIA#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit a46e325.

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix default attention size (NVIDIA#7141) (NVIDIA#7143)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix evaluator.py for various exceptions by ast (NVIDIA#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* freeze (NVIDIA#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* make sure any empty segments are removed (NVIDIA#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* TE bug fix (NVIDIA#7027) (NVIDIA#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (NVIDIA#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (NVIDIA#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (NVIDIA#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (NVIDIA#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overriden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* adde special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annoatation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* add paths to labeler. (NVIDIA#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithyare@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: trias702 <25867060+trias702@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jan Beckmann <king-jan1999@hotmail.de>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>
Davood-M added a commit that referenced this pull request Aug 24, 2023
* migrated class

Signed-off-by: dorotat <dorotat@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: dorotat <dorotat@nvidia.com>

* added unit test

Signed-off-by: dorotat <dorotat@nvidia.com>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: dorotat <dorotat@nvidia.com>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: dorotat <dorotat@nvidia.com>

* fix default attention size (#7141) (#7143)

Signed-off-by: dorotat <dorotat@nvidia.com>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overriden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* adde special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annoatation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* T5 metrics fix (#7037)

* Fix race condition when executing with multi-node where some ranks does not wait for setup (#7016)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* char level bug

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* removed kwagrs

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Updated config desc

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* tutorial added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* for for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* unused import removed

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix review comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments 2

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix config tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fox comments 3

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import ordering fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* yttm for asr removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* logging added

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* added inference and translate method

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

---------

Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba

* Update force float32 flag dynamically

* Update force float32 flag dynamically

* Fix nemo version

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update fvad example

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix onnx export

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update test

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmai…
zhehuaichen added a commit to zhehuaichen/NeMo that referenced this pull request Sep 22, 2023
* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>

* fix default attention size (#7141) (#7143)

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overriden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* adde special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annoatation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* T5 metrics fix (#7037)

* Fix race condition when executing with multi-node where some ranks does not wait for setup (#7016)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* char level bug

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* removed kwagrs

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Updated config desc

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* tutorial added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* for for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* unused import removed

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix review comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments 2

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix config tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fox comments 3

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import ordering fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* yttm for asr removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* logging added

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* added inference and translate method

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

---------

Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update fvad example

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix onnx export

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update test

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overriden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* adde special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annoatation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithyar…
zhehuaichen added a commit to zhehuaichen/NeMo that referenced this pull request Sep 22, 2023
* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)



* Fix import guard checks (NVIDIA#7124)



* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit a46e325.

* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------




* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)

* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe



* [TTS] Update encodec recipe



* [TTS] Rename EnCodec to AudioCodec



* [TTS] Add EnCodec unit tests



* [TTS] Add copyright header to distributed.py



---------



* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)




* fix default attention size (NVIDIA#7141) (NVIDIA#7143)

* fix evaluator.py for various exceptions by ast (NVIDIA#7150)



* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.



* unify configs into a single one and add detailed comments providing supported candidates.



* choose 36-final IPA as default phoneme dict



---------



* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing



* [TTS] Add format validation



* [TTS] Fix data tutorial



---------



* freeze (NVIDIA#7152)



* make sure any empty segments are removed (NVIDIA#7155)



* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio



* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task



* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file



* specify vertical_alignment instead of marginv in ass_file_config



* add documentation of CTMFileConfig and ASSFileConfig to NFA README



---------



* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)




* TE bug fix (NVIDIA#7027) (NVIDIA#7036)




* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs



* [TTS] Modify tutorial to support multiple sampling rates



* [TTS] Clarify min_duration unit



* [TTS] Default 22.05kHz highfreq to null



---------



* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info



* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage



* install_bs (NVIDIA#7019)



* Fix typo and branch in tutorial (NVIDIA#7048)



* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079



* fixes for pr review



---------



* fix links for TN (NVIDIA#7117)



* update branch (NVIDIA#7135)



* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20



* Update vad_utils.py



---------





* update branch



* fix version



* resolve conflict the other way



* keep both



* revert keep both



---------















* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements



* Initial fixes for PTL2.0



* Add further fixes to support lightning 2.0



* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end



* Replace all occurances of validation_epoch_end to on_validation_epoch_end



* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively



* Change logger=None to logger=False in Trainer object



* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass



* Modify trainer.precision check and other small edits



* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer



* Add default values for args to fix Attribute Error



* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU



* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end



* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings



* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel



* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py



* Revert an extra space that was mistakenly added



* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity



* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity



* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing



* Remove outputs arg from on_train_epoch_end



* Remove outputs from on_validation_epoch_end in multi_binary_acc.py



* Remove output args from on_validation_epoch_end in the docstrings of some ASR files



* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs



* Add on_validation_epoch_end and remove outputs args for nlp models



* Append output of validation_step to validation_step_outputs in EncDecClassificationModel



* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0



* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py



* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError



* Add if condition check for multiple dataloaders when appending to validation outputs



* Separate validation pass to be used with both validation_step and test_step



* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py



* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len



* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0



* Modify precision checks to account for 16-mixed and bf16-mixed



* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel



* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py



* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel



* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml



* Add split arg self.test_step_outputs to TextClassificationModel



* Add test_step_outputs to dialogue and text classification models



* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py



* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step



* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg



* Add val/test_step_outputs to S2SQAModel and GPTQAModel



* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error



* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py



* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed



* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed



* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py



* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN



* Precision fix and skip few failing tests



* Add missing comment lines in JenkinsFile



* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py



* Minor edit JenkinsFile



* Minor edit in jenkins file



* Edit in Jenkins file



* Comment missed lines in Jenkins file



* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file



* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py



* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files



* Fix all CI TTS tests and comment few Jenkins tests



* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py



* Add a missing comment in JenkinsFile



* Add try except StopIteration in validation_step for models with dataloader_iter



* Remove pyyaml from requirements



* Add try except for inference_step in megatron_finetune_model.py



* Remove limit_val_batches for mockGPTDataset test



* Add new self.validation_step_outputs for MegatronGPTSFTModel



* Minor edit Jenkinsfile



* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.



* Remove resume_from_checkpoint if trainer arg in conf yaml files



* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs



* Remove resume_from_checkpoint in duplex_tn_config.yaml



* Fix typos, unused imports and refactor code to remove redundant funcs



* Remove commented code in megatron_nmt_model.py



* Fix overriden functions to match parent class functions



* Prefetch dataloader_iter to prevent hang for PP>1



* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1



* Uncomment tests in JenkinsFile



* Add '16' to precision checks and other minor fixes



* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders



* Minor edits



* Modify precision checks to avoid indexing



* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs



* Reference checkpoint with trainer.ckpt_path



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT



---------




* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft



* fix style



* adde special token only for huggingface model



* change default name



* print out error datapoint content



* show error id



* annotation script working



* try to be compatible with huggingface tokenizer



* added examples



* added lang



* added lang



* text to value special case



* configure the slider



* annoatation handles lang



* added the unit test for chat sft dataset



* used the file in the test dir



* fix json error



* load local tokenizer



* remove mask count check



* added HF dataset backend



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------




* add paths to labeler. (NVIDIA#7087)



* T5 metrics fix (NVIDIA#7037)

* Fix race condition when executing with multi-node where some ranks does not wait for setup (NVIDIA#7016)




* Added bool types to neural_types export (NVIDIA#7032)




* rnnt and char utils (NVIDIA#6971)

* rnnt_ngram_merge



* char level bug



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------






* fix tab text gen (NVIDIA#7022) (NVIDIA#7031)





* Fixed kwargs for metric instance init



* Fixed kwargs for metric instance init



* removed kwagrs



* Updated config desc



* ASR Confidence update and tutorial (NVIDIA#6810)

* small fixes and tests



* various fixes for the tutorial



* tutorial added



* for for a little oops after rebasement



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests



* unused import removed



* fix review comments



* deprecated parameters for greedy configs



* move re-assigning to configs



* fix comments 2



* fix config tests



* fix ece test (my env was bugged apparently)



* renamings for confidence ensemble



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fox comments 3



* return dropped tutorial



* CI flips back and forth, increasing tolerance



---------





* install_bs (NVIDIA#7019) (NVIDIA#7028)





* fixes for spellmapper (NVIDIA#6994) (NVIDIA#7000)






* added back the retro documents (NVIDIA#7033)




* Remove pyyaml (NVIDIA#7052) (NVIDIA#7054)





* st standalone model (NVIDIA#6969)

* st standalone model



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix



* sacrebleu import fix, unused imports removed



* import guard for nlp inside asr transformer bpe model



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered



* import ordering fix



* yttm for asr removed



* logging added



* added inference and translate method



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* remove pos emb from state dict for old models (NVIDIA#7068)

* remove pos emb from state dict



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment



* fix nmt test



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test



---------





* Fix typo in ASR-TTS tutorial (NVIDIA#7049)




* Fixed tutorial's name (NVIDIA#7047)





* Fix documentation for Numba (NVIDIA#7065) (NVIDIA#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------






* Update Frame-VAD doc and fix onnx export (NVIDIA#7076)

* update fvad doc



* fix typo



* update fvad example



* update



* fix onnx export



* update test



* refactor



* update doc



* update



---------





* memmap worker arg (NVIDIA#7062)

* memmap worker arg



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update



* update



---------





* Fix caching bug in causal convolutions for cache-aware ASR models (NVIDIA#7034) (NVIDIA#7082)




* Fast Conformer global token fix (NVIDIA#7085)

* old way



* fix



* fix



* fix



* remove extra



* clean



* clean



* clean



* fix



* fix



* fix



* fix



* fix



* fix



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* Refined export_config (NVIDIA#7053) (NVIDIA#7066)

* Refined export_config
* Rolling back hierarchy change
---------





* small Bugfix (NVIDIA#7081)

* small Bugfix (NVIDIA#7079)

* fix branch



* fix typo



* fix link



---------



* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb



* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb



---------







* Added script to extract ASR CTC and RNNT models from ASR hybrid models (NVIDIA#7092)

* Added script to extract ctc and rnnt models from hybrid models



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag



---------






* Adding docs and models for multiple lookahead cache-aware ASR (NVIDIA#7067) (NVIDIA#7094)



* update TTS readme (NVIDIA#7088)

* update TTS readme



---------




* Fix absolute path in path join call (NVIDIA#7099)




* Disable distopt contiguous param buffer by default (NVIDIA#7095)




* microphone demo (NVIDIA#7110)





* [Fix] load_state_dict in nlp_model.py (NVIDIA#7086)

* Fix load_state_dict in nlp_model.py



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* Fix plot function in vad_utils.py (NVIDIA#7113)

Fix plot function in vad_utils.py




* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)




* Fix import guard checks (NVIDIA#7124)




* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit a46e325.



* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)



* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe



* [TTS] Update encodec recipe



* [TTS] Rename EnCodec to AudioCodec



* [TTS] Add EnCodec unit tests



* [TTS] Add copyright header to distributed.py



---------




* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)





* fix default attention size (NVIDIA#7141) (NVIDIA#7143)



* fix evaluator.py for various exceptions by ast (NVIDIA#7150)




* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.



* unify configs into a single one and add detailed comments providing supported candidates.



* choose 36-final IPA as default phoneme dict



---------




* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing



* [TTS] Add format validation



* [TTS] Fix data tutorial



---------




* freeze (NVIDIA#7152)




* make sure any empty segments are removed (NVIDIA#7155)




* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio




* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task




* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file



* specify vertical_alignment instead of marginv in ass_file_config



* add documentation of CTMFileConfig and ASSFileConfig to NFA README



---------




* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)





* TE bug fix (NVIDIA#7027) (NVIDIA#7036)





* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs



* [TTS] Modify tutorial to support multiple sampling rates



* [TTS] Clarify min_duration unit



* [TTS] Default 22.05kHz highfreq to null



---------




* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info



* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage



* install_bs (NVIDIA#7019)



* Fix typo and branch in tutorial (NVIDIA#7048)



* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079



* fixes for pr review



---------



* fix links for TN (NVIDIA#7117)



* update branch (NVIDIA#7135)



* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20



* Update vad_utils.py



---------





* update branch



* fix version



* resolve conflict the other way



* keep both



* revert keep both



---------
















* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements



* Initial fixes for PTL2.0



* Add further fixes to support lightning 2.0



* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end



* Replace all occurances of validation_epoch_end to on_validation_epoch_end



* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively



* Change logger=None to logger=False in Trainer object



* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass



* Modify trainer.precision check and other small edits



* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer



* Add default values for args to fix Attribute Error



* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU



* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end



* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings



* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel



* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py



* Revert an extra space that was mistakenly added



* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity



* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity



* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing



* Remove outputs arg from on_train_epoch_end



* Remove outputs from on_validation_epoch_end in multi_binary_acc.py



* Remove output args from on_validation_epoch_end in the docstrings of some ASR files



* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs



* Add on_validation_epoch_end and remove outputs args for nlp models



* Append output of validation_step to validation_step_outputs in EncDecClassificationModel



* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0



* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py



* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError



* Add if condition check for multiple dataloaders when appending to validation outputs



* Separate validation pass to be used with both validation_step and test_step



* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py



* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len



* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0



* Modify precision checks to account for 16-mixed and bf16-mixed



* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel



* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py



* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel



* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml



* Add split arg self.test_step_outputs to TextClassificationModel



* Add test_step_outputs to dialogue and text classification models



* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py



* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step



* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg



* Add val/test_step_outputs to S2SQAModel and GPTQAModel



* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error



* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py



* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed



* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed



* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py



* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN



* Precision fix and skip few failing tests



* Add missing comment lines in JenkinsFile



* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py



* Minor edit JenkinsFile



* Minor edit in jenkins file



* Edit in Jenkins file



* Comment missed lines in Jenkins file



* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file



* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py



* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files



* Fix all CI TTS tests and comment few Jenkins tests



* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py



* Add a missing comment in JenkinsFile



* Add try except StopIteration in validation_step for models with dataloader_iter



* Remove pyyaml from requirements



* Add try except for inference_step in megatron_finetune_model.py



* Remove limit_val_batches for mockGPTDataset test



* Add new self.validation_step_outputs for MegatronGPTSFTModel



* Minor edit Jenkinsfile



* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.



* Remove resume_from_checkpoint if trainer arg in conf yaml files



* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs



* Remove resume_from_checkpoint in duplex_tn_config.yaml



* Fix typos, unused imports and refactor code to remove redundant funcs



* Remove commented code in megatron_nmt_model.py



* Fix overriden functions to match parent class functions



* Prefetch dataloader_iter to prevent hang for PP>1



* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1



* Uncomment tests in JenkinsFile



* Add '16' to precision checks and other minor fixes



* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders



* Minor edits



* Modify precision checks to avoid indexing



* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs



* Reference checkpoint with trainer.ckpt_path



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT



---------





* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft



* fix style



* adde special token only for huggingface model



* change default name



* print out error datapoint content



* show error id



* annotation script working



* try to be compatible with huggingface tokenizer



* added examples



* added lang



* added lang



* text to value special case



* configure the slider



* annoatation handles lang



* added the unit test for chat sft dataset



* used the file in the test dir



* fix json error



* load local tokenizer



* remove mask count check



* added HF dataset backend



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* add paths to labeler. (NVIDIA#7087)




* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------
















































Co-authored-by: Adi Renduchintala <adithyar…

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: Xin Yao <yaox12@outlook.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
Signed-off-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Signed-off-by: Dima Rekesh <bmwshop@gmail.com>
Signed-off-by: Jim O’Regan <jaoregan@tcd.ie>
Signed-off-by: Mostafa Ghorbandoost <mos.ghorbandoost@gmail.com>
Signed-off-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: Kunal Dhawan <kunaldhawan97@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Signed-off-by: Andrei Andrusenko <52885736+andrusenkoau@users.noreply.github.com>
Signed-off-by: KunalDhawan <kunaldhawan97@gmail.com>
Signed-off-by: Greg Clark <grclark@nvidia.com>
Signed-off-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Jan Baczek <jbaczek@nvidia.com>
Signed-off-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Signed-off-by: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Signed-off-by: eharper <eharper@nvidia.com>
Signed-off-by: jasonwan <jasonwan@nvidia.com>
Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Siddharth Tyagi <styagi130@gmail.com>
Signed-off-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>
Signed-off-by: Jason Wang <jasonwan@nvidia.com>
Signed-off-by: arendu <adithyare@nvidia.com>
Signed-off-by: Alireza Morsali <alireza.mors@gmail.com>
Signed-off-by: Siddharth Tyagi <siddhartht@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>
Signed-off-by: mburchi <maxime.burchi@gmail.com>
Signed-off-by: Maxime Burchi <60737204+burchim@users.noreply.github.com>
Signed-off-by: Adi Renduchintala <adithya.r@gmail.com>
Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: Xin Yao <xiny@nvidia.com>
Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>
Signed-off-by: Alexander Jipa <azzhipa@amazon.com>
Signed-off-by: omahs <73983677+omahs@users.noreply.github.com>
Signed-off-by: lhb8125 <lhb8125@gmail.com>
Signed-off-by: Robin Dong <robin.k.dong@gmail.com>
Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>
Signed-off-by: Sangkug Lym <slym@nvidia.com>
Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
Signed-off-by: Anton Peganov <apeganov@nvidia.com>
Signed-off-by: Samuele Cornell <cornellsamuele@gmail.com>
Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: Tamerlan Tabolov <tktabolov@gmail.com>
Signed-off-by: zhehuaichen <139396994+zhehuaichen@users.noreply.github.com>
Co-authored-by: trias702 <25867060+trias702@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithyare@nvidia.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Jan Beckmann <king-jan1999@hotmail.de>
Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com>
Co-authored-by: ANMOL GUPTA <anmolg@nvidia.com>
Co-authored-by: Cheng-Ping Hsieh <37269846+hsiehjackson@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: bene-ges <61418381+bene-ges@users.noreply.github.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: Neha Tadimeti <ntadimeti@nvidia.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: Dima Rekesh <bmwshop@gmail.com>
Co-authored-by: Jim O’Regan <jaoregan@tcd.ie>
Co-authored-by: Mostafa Ghorbandoost <mos.ghorbandoost@gmail.com>
Co-authored-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Kunal Dhawan <kunaldhawan97@gmail.com>
Co-authored-by: Andrei Andrusenko <52885736+andrusenkoau@users.noreply.github.com>
Co-authored-by: Greg Clark <grclark@nvidia.com>
Co-authored-by: jbaczek <45043825+jbaczek@users.noreply.github.com>
Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Co-authored-by: Jason Wang <jasonwan@nvidia.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>
Co-authored-by: guyueh1 <140554423+guyueh1@users.noreply.github.com>
Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com>
Co-authored-by: Igor Gitman <igitman@nvidia.com>
Co-authored-by: styagi130 <siddharth.tyagi2015@vit.ac.in>
Co-authored-by: Siddharth Tyagi <siddhartht@nvidia.com>
Co-authored-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
Co-authored-by: Alireza Morsali <32244795+AlirezaMorsali@users.noreply.github.com>
Co-authored-by: styagi130 <styagi130@gmail.com>
Co-authored-by: dorotat-nv <115542912+dorotat-nv@users.noreply.github.com>
Co-authored-by: Maxime Burchi <60737204+burchim@users.noreply.github.com>
Co-authored-by: mikolajblaz <mikolajblaz@users.noreply.github.com>
Co-authored-by: eharper <eharper@nvidia.com>
Co-authored-by: Hongbin Liu <hongbinl@nvidia.com>
Co-authored-by: Kelvin Liu <lhb8125@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Alexander Jipa <alexander.jipa@gmail.com>
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
Co-authored-by: omahs <73983677+omahs@users.noreply.github.com>
Co-authored-by: Robin Dong <robin.k.dong@gmail.com>
Co-authored-by: JimmyZhang12 <67203904+JimmyZhang12@users.noreply.github.com>
Co-authored-by: Jimmy Zhang <jiemingz@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: George <37293288+Jorjeous@users.noreply.github.com>
Co-authored-by: PeganovAnton <apeganov@nvidia.com>
Co-authored-by: Samuele Cornell <cornellsamuele@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Igor Gitman <igor.a.gitman@gmail.com>
Co-authored-by: Jan Lasek <janek.lasek@gmail.com>
Co-authored-by: Tamerlan Tabolov <nektonikto999@gmail.com>
zhehuaichen pushed a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
zhehuaichen pushed a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
zhehuaichen added a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* fix the mpt chatbot (#6957)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* Remove `compute_on_step` from metrics (#6979)

* Remove `compute_on_step` from metrics

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove confusing log message

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update tests

Signed-off-by: smajumdar <titu1994@gmail.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Hybrid conformer export (#6983)

* Implemented generic kv-pair setting of export_config from args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid conformer export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid decoder export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Changed from **kwargs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docstring

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docs added

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Stringify args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Added docs for ASR export configs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* lowercase ctc

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cache handling without input tensors mutation (#6980)

* Cache handling without input tensors mutation

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#2

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#3

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* fixes for spellmapper (#6994)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* Fixing an issue with confidence ensembles (#6987)

* Bug fix for the confidence ensembles

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Relax constraints for the test

Signed-off-by: Igor Gitman <igitman@nvidia.com>

---------

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [TTS] Append pretrained FastPitch & SpectrogamEnhancer pair to available models (#7012)

* [TTS] fastpitch: add english libritts model with asr stft parameters (25 ms 10 ms)

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] enhancer: add pretrained model intended for asr finetuning

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

---------

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* fix tab text gen (#7022)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* TE bug fix (#7027)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>

* Add support for Numba FP16 RNNT Loss (#6991) (#7038)

* Force working space memory to always be in fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support to check if numba fp16 is available

Signed-off-by: smajumdar <titu1994@gmail.com>

* add RNN-T loss implemented by PyTorch and test code (#5312)

* Fix the bugs in cache-aware streaming Conformer (#5032)

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* IA3 support for GPT and T5 (#4909)

* init commit for ia3 adater training in GPT

Signed-off-by: arendu <adithya.r@gmail.com>

* ia3 adater training in GPT, models and adapter classes

Signed-off-by: arendu <adithya.r@gmail.com>

* reshape to operate even on non-contiguous tensors

Signed-off-by: arendu <adithya.r@gmail.com>

* configs

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed none init

Signed-off-by: arendu <adithya.r@gmail.com>

* adding adapter and ia3 support for T5 based models

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* config update and t5 model adapter and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused imports

Signed-off-by: arendu <adithya.r@gmail.com>

* predict step for inference

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* adapter inference for t5

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed bug micro and global batch size in eval

Signed-off-by: arendu <adithya.r@gmail.com>

* minor edit

Signed-off-by: arendu <adithya.r@gmail.com>

* agressive truncation if in test examples if no truncation field is given

Signed-off-by: arendu <adithya.r@gmail.com>

* corrected for language_model_path name changes in main

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused import

Signed-off-by: arendu <adithya.r@gmail.com>

* name change for language_model_path

Signed-off-by: arendu <adithya.r@gmail.com>

* include inter_attention to IA3

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in confg

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fixes

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused flag

Signed-off-by: arendu <adithya.r@gmail.com>

* addressing PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* address PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* CI test

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in jenkinsfile

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Bug fix - Limit val batches set to 1.0  (#5023)

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Adressed sandeep's comments

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [bug_fix] kv_channels is used when available (#5066)

* fix bug s.t kv_channels is used when available

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* P&C Docs (#5068) (#5069)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add spe_split_by_unicode_script arg (#5072)

* Add spe_split_by_unicode_script arg

Signed-off-by: Anas <aabouallaban@pm.me>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Anas <aabouallaban@pm.me>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* probabilites -> probabilities (#5078) (#5079)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* increase PR and Issue sweep quantity and active close PRs. (#5073)

* increase PR and Issue sweep quantity and active close PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* update with stricter rules, 30 days to be stale and 7 days to be closed for both Issues and PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] added missing German phoneme tokenizer. (#5070) (#5074)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* rename to match prompt leanring (#5076)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Missing fixes from r1.11.0 to T5 finetuning eval (#5054) (#5061)

* Fixes to seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Notebook bug fixes (#5084) (#5085)

* Notebook bug fixes

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Turned nemo install back on

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* reverted notebook

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Updated one line in entity linking nb

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* update strategy in notebook from ddp_fork to dp (#5088) (#5089)

Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix bug in Squeezeformer Conv block (#5011) (#5024)

* Fix bug in Squeezeformer Conv block

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix kernel context

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix access mixin

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fixed megatron lm conversion bug (PTL related) (#5038) (#5063)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Unhashable type list for Numba Cuda spec augment kernel (#5093) (#5094)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix numba (#5098)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Make it possible to specify output_filename in normalize_with_audio.py (#5092)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Greedy decoding confidence for CTC and RNNT (#4931)

* rnnt confidence draft

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* word confidence

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* advanced entropies added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* refactoring

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* oops forgot a file

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* metrics and benchmarking script added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* style fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* texterrors installation added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* lgtm and bug fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix typos

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* add missing import after rebase

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Add] SLURP models and examples (#4668)

* add model, util and loss

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor annd update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update available models

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor data processing

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor and update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* get rid of jsonlines

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* revert changes to nlp

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* only optimize params that are part of the adapter modules (#5086)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Pipeline Parallel T5 Prompt Learning (#4956)

* Added pre process flag checks and pipeline parallel in fwd

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added rank check for pipeline parallel

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 prompt learning works!

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* IA3 passing CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed typo

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* removed optimizer setup so Adi's change will not conflict

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] remove phonemizer.py (#5090)

remove phonemizer.py and convert code block to markdown in the tutorial.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Decoding with PP > 2 fix (#5091) (#5103)

* set sequence lenghts in the pipeline properly

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] fixed wrong val loss for epoch 0 and inconsistent metrics names (#5087) (#5102)

* fixed hifigan configs as well
* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix and refactor consumed samples save/restore for Megatron models. (#5077)

* Fixes and refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove unused imports

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* RIR corpus generator tool (#4927)

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Multiprocessing fix (#5106) (#5107)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Bug fix] PC lexical + audio (#5109) (#5110)

* training running

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Fix] schedulers with no max_steps param (#4564)

* fix schedulers

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update to use python inspect module

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 prompt learning fixes missing from r.11.0 merge (#5075) (#5101)

* Fix special tokens

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add NeMo TTS Primer Tutorial (#4933)

* [TTS] Add NeMo TTS Primer Tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add Squeezeformer CTC model checkpoints on Librispeech (#5121)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* adding loss normalization options to rnnt joint  (#4829)

* adding normalization options to rnnt joint loss

* moving the param to joint

* moving loss normalization to rnnt loss config

* style

* cleaning up

* fixing sum reduction in joint

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* moving reduction into RNNT loss class

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

* typos

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Asr concat dataloader (#5108)

* forced precision

* typo

* initial commit

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typos and bugs

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* reverting conformer encoder

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* additional checks

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* adding support to CTC models as well

* reverting conformer_encoder

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typo

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* merging

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>
Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix blossom ci unittests

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* bugfix: pybtex.database.InvalidNameString: Too many commas in author field. (#5112) (#5115)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Uppdate container version to 22.09 (#5105)

* update container version

Signed-off-by: ericharper <complex451@gmail.com>

* pin click

Signed-off-by: ericharper <complex451@gmail.com>

* pin click 8.0.2

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Remove unsupported arguments from MegatronNMT (#5065)

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* pp2 support for T5 IA3 learning and T5 Adapters learning (#5116)

* enabling pp2

Signed-off-by: arendu <adithya.r@gmail.com>

* optimizer update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 pp>1 support for adapters and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* fix bug with missing adapter_tuning

Signed-off-by: arendu <adithya.r@gmail.com>

* inference error fixed, pp=2

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Prompt Learning Fixes for Pipeline Parallel (#5120)

* Initial fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Added back validation acc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Put num workers back

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added relative encoding if statament

Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>

* Added back val loss only validation

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Revert "Added back val loss only validation"

This reverts commit 86d8f4806fe30335c40c3716ce18259939df500f.

* Removed val acc for PP > 1

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Removed enc_seq_len if statement

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back validation acc calc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Co-authored-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* add doc info (#4721)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add SpanishCharsTokenizer (#5135)

* [TTS] Add SpanishCharsTokenizer

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Update megatron interface to dialogue (#4936)

* fix style formatting

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update template to include description of intent

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes based on requests in review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add compatibility with assistant dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove dialogue_state_tracking

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update huggingface utils for dialogue

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add docstrings for assistant data processsor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins for SGDGEN local checkpoint

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* use local vocab file for Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* patch for Jenkins CI using local file

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add slot filling prediction and metrics

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor metrics code out of Dialogue GPT Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate backward compatible support for IntentSlotClassificationModel (bert model)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* save prediction file for IntentSlotClassification

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue gpt model training for megatron gpt

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove batch generate for HF GPT2, which causes lower performance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add few shot capability to dialogue gpt model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile and remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update code description and clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate compatibility with ZeroShotIntentModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename folder to dialogue due to increased scope and further refactor for clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* added dialogue GPT for sequence generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add CI test for DialogueGPTGenerationModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate DialogueS2SGenerationModel for generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* modify huggingface utils to support HF t5/BART models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bleu metric style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 2 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 3 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate sgd generation based on user user utterance and system slot-values to generate system utterance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add validation model saving capabilities

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* cleaned up code for SGD Based Answer extender

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue Generation CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix Jenkins CI issue"

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add support for design dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support megatron for dialogue_s2s_generation_model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update checkpoint and predictions filename to include epoch number

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate HF BART MNLI into zero shot intent model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Nearest Neighbour Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor Dialogue SGD Data Processor to make interface for models cleaner

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue S2S Generation model for DialogueSGDDataProcessor interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support sgd and drive thru datasets by zero shot model and nearest neighbour model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add prediction saving code to nearest neighbour and zero shot intent models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo in sgd data processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Mellon QA Data Processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update mellon qa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py to remove outdated info

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address review comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix for cfg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* make dependency on apex optional

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* change NLPDDPluggin calling logic to make it possible to run without apex

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add first draft of tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce ms marco size by removing lines without wellFormedAnswers

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update colab tutorial link in dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* include unit test and some refactor to facilitate unit test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr issues

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove typos in dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support larger files for question answering

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary artifacts to reduce memory use

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* put 0 tensor to device

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update link within dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* restore previously delete files

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error handling when loss = nan

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update nan handling

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss func

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix type error raised in qa_dataset.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add error checking message

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exp logging

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* limit number of negative samples

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused methods and style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add more documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes base on PR review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set wandb logger falseby default

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update interface with megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update inline documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update prompt_ids

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug for megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add megatron t5 inference

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug due to bert tokenizer not being space-aware

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update IntentSlotModel onnx export test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exportable

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* replace functools.cache_property with functools.lru_cache to maintain python 3.7 compatibility

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* improve speed of rank_candidates and support for p tuning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix megatron prompt learning saving bug

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generate_candidate method

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove repeated init text ids and invert attention masks

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* custom collate fn to remove excess padding in batch

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update complete method to mitigate issue when max seq len is low

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generation interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added save inference ready .nemo file with every checkpoint (#5055)

* Added save inference ready .nemo file with every checkpoint

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python style fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* addressed Adi's comment

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added ptuning check in model checkpoint saving

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed save_nemo_on_valdaition default to False

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changes global batch size of adapter CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed num workers to 0

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added first stage of pipeline check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training (#5118)

* Remove ; from jupyter notebook cells

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typos in documentation/code

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix output message to have 'or equal'

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Link formatting fixes

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Add error if max_utts is used in tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Remove max_utts parameter from tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix max_utts removal in tests

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typo if -> is

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Merge r1.12.0 main (#5139)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Add cherry-pick action (#4958)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove duplicate transformers requirement

Signed-off-by: ericharper <complex451@gmail.com>

* Release SOTA Lang ID model  (#5080)

* add pretrained lang id model ambernet

Signed-off-by: fayejf <fayejf07@gmail.com>

* update doc and style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* update branch and package info

Signed-off-by: ericharper <complex451@gmail.com>

* remove upper bounds on lightning and transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove transformers offline from ci

Signed-off-by: ericharper <complex451@gmail.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added ASR model comparison to SDE (#5043)

SDE: Added ASR model comparison tool to SDE
transcribe speech: Added support for many predictions in one file, as well as custom field names
Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix nmt eval sampler (#5154)

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Global init steps (#5143)

* move global step to base

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix fused softmax

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add the missing file

Signed-off-by: Yi Dong <yidong@nvidia.com>

* update the fused kernel

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import error

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import again

Signed-off-by: Yi Dong <yidong@nvidia.com>

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] bug fix - sample rate was being ignored in vocoder dataset (#4518)

* bug fix - sample rate was being ignored in vocoder dataset when not loading mel
* handled n segments for a different sampling rate than original sampling rate
* Added case for n_segments 0, warning for n_segments greater than file length

Signed-off-by: Paarth Neekhara <paarth.n@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add EMA support to NeMo (#4764)

* Added Base files

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Some refactors, swap to using MNIST Lnet

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add a few more tests, allow the callback to be set via the exp manager

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Actually run validation for testing

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Run isort

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add test for saving state/fix saving state

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Use dummy model

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add copyright

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Support saving separate EMA weight module

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add standalone functionality/logging

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Expose more parameters

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Modify to allow option to replace validation

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add jenkins test, formatting

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add cherry-pick action (#4958) (#4961)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix changelog builder (#4962) (#4963)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix cherry pick workflow (#4964) (#4965)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* reorder model check (#4959) (#4967)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* check for active conda environment (#4970) (#4971)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [TTS] fix broken tutorial for MixerTTS. (#4949) (#4976)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Checkpoint averaging class fix (#4946)

* 1. Added args.class_path to provide it externally.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add ability to give seperate datasets for test, train and validation (#4798)

* Add ability to give seperate datasets for test, train and validation

* Addressed Sandeeps comments

* Addressed Sandeeps comments

* Add ability to give seperate datasets for test, train and validation

* Add ability to give seperate datasets for test, train and validation

* Addressed review comments

* Bug fix for common dataset utils

* Add CI tests

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Reformat code

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

* Bug Fix

* Bug Fix

* Update Jenkinsfile

* Addressed comments

* Addressed Eriks comments.

* Addressed Sandeep

* Update Jenkinsfile

* Update Jenkinsfile

* Update dataset_utils.py

* Update Jenkinsfile

* Update Jenkinsfile

* Use GPT CI config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix label models restoring issue from wrighted cross entropy (#4968) (#4975)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add simple pre-commit file (#4983)

* Add simple pre-commit file

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Exclude docs folder

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks"

This reverts commit 053bd5ba579537a5f311b431871c21f3381b43eb.

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execution environment (#4951)

Signed-off-by: Jin Li <liji@nvidia.com>

Signed-off-by: Jin Li <liji@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Adding speaker embedding conditioning in fastpitch (#4986)

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix ASR issues (#4984) (#4991)

* Fix ASR issues

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Revert fix

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix current tests

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* More test coverage

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address reviews

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop bf16 test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* remove print

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add bf16

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Jin Li <liji@nvidia.com>
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: shanmugamr1992 <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: liji-nv <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Subhankar Ghosh <subhankar2321@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix BF16 test (#5162)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix errors in speaker diarization nemo docs (#5153)

* fix docs and docstrings for MSDD

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix nemo docs errors

Signed-off-by: Taejin Park <tango4j@gmail.com>

* reflected review comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add interleaved pipeline schedule to GPT (#5025)

* add virtual pipeline size to config

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* update for list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* add virtual to init

Signed-off-by: ericharper <complex451@gmail.com>

* update first last stage embedding all reduce

Signed-off-by: ericharper <complex451@gmail.com>

* update sequence parallel all reduce for virtual models

Signed-off-by: ericharper <complex451@gmail.com>

* runs but we get an error

Signed-off-by: ericharper <complex451@gmail.com>

* set virtual rank 0 after looping

Signed-off-by: ericharper <complex451@gmail.com>

* account for virtual when determinining first and last pipeline stages

Signed-off-by: ericharper <complex451@gmail.com>

* checkpointing for virtual models in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add checkpoint hooks

Signed-off-by: ericharper <complex451@gmail.com>

* working on validation when resuming

Signed-off-by: ericharper <complex451@gmail.com>

* skip sanity val steps by default in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove comment

Signed-off-by: ericharper <complex451@gmail.com>

* log number of params

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style

Signed-off-by: ericharper <complex451@gmail.com>

* check if self.model is a list

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default size None on init

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default to None in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove ensure_divisibility call

Signed-off-by: ericharper <complex451@gmail.com>

* fix lgtm alerts

Signed-off-by: ericharper <complex451@gmail.com>

* remove num_sanity_val_steps from config

Signed-off-by: ericharper <com…
zhehuaichen pushed a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
zhehuaichen pushed a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
zhehuaichen pushed a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
titu1994 added a commit that referenced this pull request May 11, 2024
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* fix the mpt chatbot (#6957)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* Remove `compute_on_step` from metrics (#6979)

* Remove `compute_on_step` from metrics

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove confusing log message

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update tests

Signed-off-by: smajumdar <titu1994@gmail.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Hybrid conformer export (#6983)

* Implemented generic kv-pair setting of export_config from args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid conformer export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid decoder export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Changed from **kwargs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docstring

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docs added

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Stringify args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Added docs for ASR export configs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* lowercase ctc

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cache handling without input tensors mutation (#6980)

* Cache handling without input tensors mutation

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#2

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#3

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* fixes for spellmapper (#6994)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* Fixing an issue with confidence ensembles (#6987)

* Bug fix for the confidence ensembles

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Relax constraints for the test

Signed-off-by: Igor Gitman <igitman@nvidia.com>

---------

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [TTS] Append pretrained FastPitch & SpectrogamEnhancer pair to available models (#7012)

* [TTS] fastpitch: add english libritts model with asr stft parameters (25 ms 10 ms)

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] enhancer: add pretrained model intended for asr finetuning

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

---------

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* fix tab text gen (#7022)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* TE bug fix (#7027)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>

* Add support for Numba FP16 RNNT Loss (#6991) (#7038)

* Force working space memory to always be in fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support to check if numba fp16 is available

Signed-off-by: smajumdar <titu1994@gmail.com>

* add RNN-T loss implemented by PyTorch and test code (#5312)

* Fix the bugs in cache-aware streaming Conformer (#5032)

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* IA3 support for GPT and T5 (#4909)

* init commit for ia3 adater training in GPT

Signed-off-by: arendu <adithya.r@gmail.com>

* ia3 adater training in GPT, models and adapter classes

Signed-off-by: arendu <adithya.r@gmail.com>

* reshape to operate even on non-contiguous tensors

Signed-off-by: arendu <adithya.r@gmail.com>

* configs

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed none init

Signed-off-by: arendu <adithya.r@gmail.com>

* adding adapter and ia3 support for T5 based models

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* config update and t5 model adapter and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused imports

Signed-off-by: arendu <adithya.r@gmail.com>

* predict step for inference

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* adapter inference for t5

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed bug micro and global batch size in eval

Signed-off-by: arendu <adithya.r@gmail.com>

* minor edit

Signed-off-by: arendu <adithya.r@gmail.com>

* agressive truncation if in test examples if no truncation field is given

Signed-off-by: arendu <adithya.r@gmail.com>

* corrected for language_model_path name changes in main

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused import

Signed-off-by: arendu <adithya.r@gmail.com>

* name change for language_model_path

Signed-off-by: arendu <adithya.r@gmail.com>

* include inter_attention to IA3

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in confg

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fixes

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused flag

Signed-off-by: arendu <adithya.r@gmail.com>

* addressing PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* address PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* CI test

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in jenkinsfile

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Bug fix - Limit val batches set to 1.0  (#5023)

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Adressed sandeep's comments

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [bug_fix] kv_channels is used when available (#5066)

* fix bug s.t kv_channels is used when available

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* P&C Docs (#5068) (#5069)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add spe_split_by_unicode_script arg (#5072)

* Add spe_split_by_unicode_script arg

Signed-off-by: Anas <aabouallaban@pm.me>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Anas <aabouallaban@pm.me>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* probabilites -> probabilities (#5078) (#5079)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* increase PR and Issue sweep quantity and active close PRs. (#5073)

* increase PR and Issue sweep quantity and active close PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* update with stricter rules, 30 days to be stale and 7 days to be closed for both Issues and PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] added missing German phoneme tokenizer. (#5070) (#5074)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* rename to match prompt leanring (#5076)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Missing fixes from r1.11.0 to T5 finetuning eval (#5054) (#5061)

* Fixes to seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Notebook bug fixes (#5084) (#5085)

* Notebook bug fixes

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Turned nemo install back on

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* reverted notebook

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Updated one line in entity linking nb

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* update strategy in notebook from ddp_fork to dp (#5088) (#5089)

Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix bug in Squeezeformer Conv block (#5011) (#5024)

* Fix bug in Squeezeformer Conv block

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix kernel context

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix access mixin

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fixed megatron lm conversion bug (PTL related) (#5038) (#5063)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Unhashable type list for Numba Cuda spec augment kernel (#5093) (#5094)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix numba (#5098)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Make it possible to specify output_filename in normalize_with_audio.py (#5092)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Greedy decoding confidence for CTC and RNNT (#4931)

* rnnt confidence draft

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* word confidence

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* advanced entropies added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* refactoring

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* oops forgot a file

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* metrics and benchmarking script added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* style fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* texterrors installation added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* lgtm and bug fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix typos

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* add missing import after rebase

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Add] SLURP models and examples (#4668)

* add model, util and loss

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor annd update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update available models

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor data processing

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor and update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* get rid of jsonlines

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* revert changes to nlp

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* only optimize params that are part of the adapter modules (#5086)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Pipeline Parallel T5 Prompt Learning (#4956)

* Added pre process flag checks and pipeline parallel in fwd

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added rank check for pipeline parallel

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 prompt learning works!

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* IA3 passing CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed typo

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* removed optimizer setup so Adi's change will not conflict

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] remove phonemizer.py (#5090)

remove phonemizer.py and convert code block to markdown in the tutorial.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Decoding with PP > 2 fix (#5091) (#5103)

* set sequence lenghts in the pipeline properly

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] fixed wrong val loss for epoch 0 and inconsistent metrics names (#5087) (#5102)

* fixed hifigan configs as well
* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix and refactor consumed samples save/restore for Megatron models. (#5077)

* Fixes and refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove unused imports

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* RIR corpus generator tool (#4927)

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Multiprocessing fix (#5106) (#5107)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Bug fix] PC lexical + audio (#5109) (#5110)

* training running

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Fix] schedulers with no max_steps param (#4564)

* fix schedulers

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update to use python inspect module

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 prompt learning fixes missing from r.11.0 merge (#5075) (#5101)

* Fix special tokens

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add NeMo TTS Primer Tutorial (#4933)

* [TTS] Add NeMo TTS Primer Tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add Squeezeformer CTC model checkpoints on Librispeech (#5121)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* adding loss normalization options to rnnt joint  (#4829)

* adding normalization options to rnnt joint loss

* moving the param to joint

* moving loss normalization to rnnt loss config

* style

* cleaning up

* fixing sum reduction in joint

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* moving reduction into RNNT loss class

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

* typos

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Asr concat dataloader (#5108)

* forced precision

* typo

* initial commit

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typos and bugs

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* reverting conformer encoder

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* additional checks

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* adding support to CTC models as well

* reverting conformer_encoder

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typo

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* merging

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>
Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix blossom ci unittests

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* bugfix: pybtex.database.InvalidNameString: Too many commas in author field. (#5112) (#5115)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Uppdate container version to 22.09 (#5105)

* update container version

Signed-off-by: ericharper <complex451@gmail.com>

* pin click

Signed-off-by: ericharper <complex451@gmail.com>

* pin click 8.0.2

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Remove unsupported arguments from MegatronNMT (#5065)

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* pp2 support for T5 IA3 learning and T5 Adapters learning (#5116)

* enabling pp2

Signed-off-by: arendu <adithya.r@gmail.com>

* optimizer update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 pp>1 support for adapters and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* fix bug with missing adapter_tuning

Signed-off-by: arendu <adithya.r@gmail.com>

* inference error fixed, pp=2

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Prompt Learning Fixes for Pipeline Parallel (#5120)

* Initial fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Added back validation acc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Put num workers back

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added relative encoding if statament

Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>

* Added back val loss only validation

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Revert "Added back val loss only validation"

This reverts commit 86d8f4806fe30335c40c3716ce18259939df500f.

* Removed val acc for PP > 1

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Removed enc_seq_len if statement

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back validation acc calc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Co-authored-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* add doc info (#4721)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add SpanishCharsTokenizer (#5135)

* [TTS] Add SpanishCharsTokenizer

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Update megatron interface to dialogue (#4936)

* fix style formatting

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update template to include description of intent

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes based on requests in review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add compatibility with assistant dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove dialogue_state_tracking

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update huggingface utils for dialogue

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add docstrings for assistant data processsor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins for SGDGEN local checkpoint

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* use local vocab file for Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* patch for Jenkins CI using local file

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add slot filling prediction and metrics

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor metrics code out of Dialogue GPT Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate backward compatible support for IntentSlotClassificationModel (bert model)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* save prediction file for IntentSlotClassification

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue gpt model training for megatron gpt

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove batch generate for HF GPT2, which causes lower performance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add few shot capability to dialogue gpt model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile and remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update code description and clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate compatibility with ZeroShotIntentModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename folder to dialogue due to increased scope and further refactor for clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* added dialogue GPT for sequence generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add CI test for DialogueGPTGenerationModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate DialogueS2SGenerationModel for generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* modify huggingface utils to support HF t5/BART models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bleu metric style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 2 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 3 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate sgd generation based on user user utterance and system slot-values to generate system utterance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add validation model saving capabilities

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* cleaned up code for SGD Based Answer extender

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue Generation CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix Jenkins CI issue"

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add support for design dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support megatron for dialogue_s2s_generation_model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update checkpoint and predictions filename to include epoch number

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate HF BART MNLI into zero shot intent model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Nearest Neighbour Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor Dialogue SGD Data Processor to make interface for models cleaner

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue S2S Generation model for DialogueSGDDataProcessor interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support sgd and drive thru datasets by zero shot model and nearest neighbour model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add prediction saving code to nearest neighbour and zero shot intent models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo in sgd data processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Mellon QA Data Processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update mellon qa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py to remove outdated info

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address review comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix for cfg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* make dependency on apex optional

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* change NLPDDPluggin calling logic to make it possible to run without apex

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add first draft of tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce ms marco size by removing lines without wellFormedAnswers

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update colab tutorial link in dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* include unit test and some refactor to facilitate unit test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr issues

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove typos in dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support larger files for question answering

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary artifacts to reduce memory use

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* put 0 tensor to device

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update link within dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* restore previously delete files

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error handling when loss = nan

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update nan handling

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss func

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix type error raised in qa_dataset.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add error checking message

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exp logging

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* limit number of negative samples

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused methods and style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add more documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes base on PR review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set wandb logger falseby default

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update interface with megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update inline documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update prompt_ids

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug for megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add megatron t5 inference

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug due to bert tokenizer not being space-aware

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update IntentSlotModel onnx export test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exportable

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* replace functools.cache_property with functools.lru_cache to maintain python 3.7 compatibility

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* improve speed of rank_candidates and support for p tuning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix megatron prompt learning saving bug

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generate_candidate method

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove repeated init text ids and invert attention masks

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* custom collate fn to remove excess padding in batch

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update complete method to mitigate issue when max seq len is low

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generation interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added save inference ready .nemo file with every checkpoint (#5055)

* Added save inference ready .nemo file with every checkpoint

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python style fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* addressed Adi's comment

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added ptuning check in model checkpoint saving

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed save_nemo_on_valdaition default to False

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changes global batch size of adapter CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed num workers to 0

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added first stage of pipeline check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training (#5118)

* Remove ; from jupyter notebook cells

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typos in documentation/code

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix output message to have 'or equal'

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Link formatting fixes

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Add error if max_utts is used in tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Remove max_utts parameter from tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix max_utts removal in tests

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typo if -> is

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Merge r1.12.0 main (#5139)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Add cherry-pick action (#4958)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove duplicate transformers requirement

Signed-off-by: ericharper <complex451@gmail.com>

* Release SOTA Lang ID model  (#5080)

* add pretrained lang id model ambernet

Signed-off-by: fayejf <fayejf07@gmail.com>

* update doc and style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* update branch and package info

Signed-off-by: ericharper <complex451@gmail.com>

* remove upper bounds on lightning and transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove transformers offline from ci

Signed-off-by: ericharper <complex451@gmail.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added ASR model comparison to SDE (#5043)

SDE: Added ASR model comparison tool to SDE
transcribe speech: Added support for many predictions in one file, as well as custom field names
Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix nmt eval sampler (#5154)

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Global init steps (#5143)

* move global step to base

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix fused softmax

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add the missing file

Signed-off-by: Yi Dong <yidong@nvidia.com>

* update the fused kernel

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import error

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import again

Signed-off-by: Yi Dong <yidong@nvidia.com>

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] bug fix - sample rate was being ignored in vocoder dataset (#4518)

* bug fix - sample rate was being ignored in vocoder dataset when not loading mel
* handled n segments for a different sampling rate than original sampling rate
* Added case for n_segments 0, warning for n_segments greater than file length

Signed-off-by: Paarth Neekhara <paarth.n@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add EMA support to NeMo (#4764)

* Added Base files

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Some refactors, swap to using MNIST Lnet

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add a few more tests, allow the callback to be set via the exp manager

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Actually run validation for testing

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Run isort

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add test for saving state/fix saving state

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Use dummy model

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add copyright

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Support saving separate EMA weight module

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add standalone functionality/logging

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Expose more parameters

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Modify to allow option to replace validation

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add jenkins test, formatting

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add cherry-pick action (#4958) (#4961)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix changelog builder (#4962) (#4963)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix cherry pick workflow (#4964) (#4965)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* reorder model check (#4959) (#4967)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* check for active conda environment (#4970) (#4971)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [TTS] fix broken tutorial for MixerTTS. (#4949) (#4976)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Checkpoint averaging class fix (#4946)

* 1. Added args.class_path to provide it externally.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add ability to give seperate datasets for test, train and validation (#4798)

* Add ability to give seperate datasets for test, train and validation

* Addressed Sandeeps comments

* Addressed Sandeeps comments

* Add ability to give seperate datasets for test, train and validation

* Add ability to give seperate datasets for test, train and validation

* Addressed review comments

* Bug fix for common dataset utils

* Add CI tests

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Reformat code

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

* Bug Fix

* Bug Fix

* Update Jenkinsfile

* Addressed comments

* Addressed Eriks comments.

* Addressed Sandeep

* Update Jenkinsfile

* Update Jenkinsfile

* Update dataset_utils.py

* Update Jenkinsfile

* Update Jenkinsfile

* Use GPT CI config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix label models restoring issue from wrighted cross entropy (#4968) (#4975)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add simple pre-commit file (#4983)

* Add simple pre-commit file

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Exclude docs folder

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks"

This reverts commit 053bd5ba579537a5f311b431871c21f3381b43eb.

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execution environment (#4951)

Signed-off-by: Jin Li <liji@nvidia.com>

Signed-off-by: Jin Li <liji@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Adding speaker embedding conditioning in fastpitch (#4986)

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix ASR issues (#4984) (#4991)

* Fix ASR issues

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Revert fix

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix current tests

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* More test coverage

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address reviews

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop bf16 test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* remove print

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add bf16

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Jin Li <liji@nvidia.com>
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: shanmugamr1992 <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: liji-nv <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Subhankar Ghosh <subhankar2321@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix BF16 test (#5162)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix errors in speaker diarization nemo docs (#5153)

* fix docs and docstrings for MSDD

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix nemo docs errors

Signed-off-by: Taejin Park <tango4j@gmail.com>

* reflected review comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add interleaved pipeline schedule to GPT (#5025)

* add virtual pipeline size to config

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* update for list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* add virtual to init

Signed-off-by: ericharper <complex451@gmail.com>

* update first last stage embedding all reduce

Signed-off-by: ericharper <complex451@gmail.com>

* update sequence parallel all reduce for virtual models

Signed-off-by: ericharper <complex451@gmail.com>

* runs but we get an error

Signed-off-by: ericharper <complex451@gmail.com>

* set virtual rank 0 after looping

Signed-off-by: ericharper <complex451@gmail.com>

* account for virtual when determinining first and last pipeline stages

Signed-off-by: ericharper <complex451@gmail.com>

* checkpointing for virtual models in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add checkpoint hooks

Signed-off-by: ericharper <complex451@gmail.com>

* working on validation when resuming

Signed-off-by: ericharper <complex451@gmail.com>

* skip sanity val steps by default in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove comment

Signed-off-by: ericharper <complex451@gmail.com>

* log number of params

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style

Signed-off-by: ericharper <complex451@gmail.com>

* check if self.model is a list

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default size None on init

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default to None in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove ensure_divisibility call

Signed-off-by: ericharper <complex451@gmail.com>

* fix lgtm alerts

Signed-off-by: ericharper <complex451@gmail.com>

* remove num_sanity_val_steps from config

Signed-off-by: ericharper <complex451@gmai…
BoxiangW pushed a commit to BoxiangW/NeMo that referenced this pull request Jun 5, 2024
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* fix the mpt chatbot (#6957)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* Remove `compute_on_step` from metrics (#6979)

* Remove `compute_on_step` from metrics

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove confusing log message

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update tests

Signed-off-by: smajumdar <titu1994@gmail.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Hybrid conformer export (#6983)

* Implemented generic kv-pair setting of export_config from args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid conformer export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid decoder export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Changed from **kwargs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docstring

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docs added

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Stringify args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Added docs for ASR export configs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* lowercase ctc

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cache handling without input tensors mutation (#6980)

* Cache handling without input tensors mutation

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#2

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#3

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* fixes for spellmapper (#6994)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* Fixing an issue with confidence ensembles (#6987)

* Bug fix for the confidence ensembles

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Relax constraints for the test

Signed-off-by: Igor Gitman <igitman@nvidia.com>

---------

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [TTS] Append pretrained FastPitch & SpectrogamEnhancer pair to available models (#7012)

* [TTS] fastpitch: add english libritts model with asr stft parameters (25 ms 10 ms)

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] enhancer: add pretrained model intended for asr finetuning

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

---------

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* fix tab text gen (#7022)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* TE bug fix (#7027)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>

* Add support for Numba FP16 RNNT Loss (#6991) (#7038)

* Force working space memory to always be in fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support to check if numba fp16 is available

Signed-off-by: smajumdar <titu1994@gmail.com>

* add RNN-T loss implemented by PyTorch and test code (#5312)

* Fix the bugs in cache-aware streaming Conformer (#5032)

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* IA3 support for GPT and T5 (#4909)

* init commit for ia3 adater training in GPT

Signed-off-by: arendu <adithya.r@gmail.com>

* ia3 adater training in GPT, models and adapter classes

Signed-off-by: arendu <adithya.r@gmail.com>

* reshape to operate even on non-contiguous tensors

Signed-off-by: arendu <adithya.r@gmail.com>

* configs

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed none init

Signed-off-by: arendu <adithya.r@gmail.com>

* adding adapter and ia3 support for T5 based models

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* config update and t5 model adapter and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused imports

Signed-off-by: arendu <adithya.r@gmail.com>

* predict step for inference

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* adapter inference for t5

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed bug micro and global batch size in eval

Signed-off-by: arendu <adithya.r@gmail.com>

* minor edit

Signed-off-by: arendu <adithya.r@gmail.com>

* agressive truncation if in test examples if no truncation field is given

Signed-off-by: arendu <adithya.r@gmail.com>

* corrected for language_model_path name changes in main

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused import

Signed-off-by: arendu <adithya.r@gmail.com>

* name change for language_model_path

Signed-off-by: arendu <adithya.r@gmail.com>

* include inter_attention to IA3

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in confg

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fixes

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused flag

Signed-off-by: arendu <adithya.r@gmail.com>

* addressing PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* address PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* CI test

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in jenkinsfile

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Bug fix - Limit val batches set to 1.0  (#5023)

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Adressed sandeep's comments

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [bug_fix] kv_channels is used when available (#5066)

* fix bug s.t kv_channels is used when available

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* P&C Docs (#5068) (#5069)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add spe_split_by_unicode_script arg (#5072)

* Add spe_split_by_unicode_script arg

Signed-off-by: Anas <aabouallaban@pm.me>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Anas <aabouallaban@pm.me>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* probabilites -> probabilities (#5078) (#5079)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* increase PR and Issue sweep quantity and active close PRs. (#5073)

* increase PR and Issue sweep quantity and active close PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* update with stricter rules, 30 days to be stale and 7 days to be closed for both Issues and PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] added missing German phoneme tokenizer. (#5070) (#5074)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* rename to match prompt leanring (#5076)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Missing fixes from r1.11.0 to T5 finetuning eval (#5054) (#5061)

* Fixes to seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Notebook bug fixes (#5084) (#5085)

* Notebook bug fixes

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Turned nemo install back on

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* reverted notebook

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Updated one line in entity linking nb

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* update strategy in notebook from ddp_fork to dp (#5088) (#5089)

Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix bug in Squeezeformer Conv block (#5011) (#5024)

* Fix bug in Squeezeformer Conv block

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix kernel context

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix access mixin

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fixed megatron lm conversion bug (PTL related) (#5038) (#5063)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Unhashable type list for Numba Cuda spec augment kernel (#5093) (#5094)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix numba (#5098)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Make it possible to specify output_filename in normalize_with_audio.py (#5092)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Greedy decoding confidence for CTC and RNNT (#4931)

* rnnt confidence draft

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* word confidence

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* advanced entropies added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* refactoring

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* oops forgot a file

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* metrics and benchmarking script added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* style fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* texterrors installation added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* lgtm and bug fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix typos

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* add missing import after rebase

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Add] SLURP models and examples (#4668)

* add model, util and loss

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor annd update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update available models

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor data processing

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor and update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* get rid of jsonlines

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* revert changes to nlp

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* only optimize params that are part of the adapter modules (#5086)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Pipeline Parallel T5 Prompt Learning (#4956)

* Added pre process flag checks and pipeline parallel in fwd

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added rank check for pipeline parallel

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 prompt learning works!

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* IA3 passing CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed typo

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* removed optimizer setup so Adi's change will not conflict

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] remove phonemizer.py (#5090)

remove phonemizer.py and convert code block to markdown in the tutorial.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Decoding with PP > 2 fix (#5091) (#5103)

* set sequence lenghts in the pipeline properly

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] fixed wrong val loss for epoch 0 and inconsistent metrics names (#5087) (#5102)

* fixed hifigan configs as well
* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix and refactor consumed samples save/restore for Megatron models. (#5077)

* Fixes and refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove unused imports

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* RIR corpus generator tool (#4927)

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Multiprocessing fix (#5106) (#5107)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Bug fix] PC lexical + audio (#5109) (#5110)

* training running

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Fix] schedulers with no max_steps param (#4564)

* fix schedulers

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update to use python inspect module

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 prompt learning fixes missing from r.11.0 merge (#5075) (#5101)

* Fix special tokens

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add NeMo TTS Primer Tutorial (#4933)

* [TTS] Add NeMo TTS Primer Tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add Squeezeformer CTC model checkpoints on Librispeech (#5121)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* adding loss normalization options to rnnt joint  (#4829)

* adding normalization options to rnnt joint loss

* moving the param to joint

* moving loss normalization to rnnt loss config

* style

* cleaning up

* fixing sum reduction in joint

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* moving reduction into RNNT loss class

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

* typos

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Asr concat dataloader (#5108)

* forced precision

* typo

* initial commit

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typos and bugs

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* reverting conformer encoder

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* additional checks

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* adding support to CTC models as well

* reverting conformer_encoder

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typo

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* merging

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>
Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix blossom ci unittests

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* bugfix: pybtex.database.InvalidNameString: Too many commas in author field. (#5112) (#5115)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Uppdate container version to 22.09 (#5105)

* update container version

Signed-off-by: ericharper <complex451@gmail.com>

* pin click

Signed-off-by: ericharper <complex451@gmail.com>

* pin click 8.0.2

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Remove unsupported arguments from MegatronNMT (#5065)

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* pp2 support for T5 IA3 learning and T5 Adapters learning (#5116)

* enabling pp2

Signed-off-by: arendu <adithya.r@gmail.com>

* optimizer update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 pp>1 support for adapters and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* fix bug with missing adapter_tuning

Signed-off-by: arendu <adithya.r@gmail.com>

* inference error fixed, pp=2

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Prompt Learning Fixes for Pipeline Parallel (#5120)

* Initial fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Added back validation acc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Put num workers back

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added relative encoding if statament

Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>

* Added back val loss only validation

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Revert "Added back val loss only validation"

This reverts commit 86d8f4806fe30335c40c3716ce18259939df500f.

* Removed val acc for PP > 1

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Removed enc_seq_len if statement

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back validation acc calc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Co-authored-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* add doc info (#4721)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add SpanishCharsTokenizer (#5135)

* [TTS] Add SpanishCharsTokenizer

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Update megatron interface to dialogue (#4936)

* fix style formatting

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update template to include description of intent

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes based on requests in review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add compatibility with assistant dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove dialogue_state_tracking

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update huggingface utils for dialogue

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add docstrings for assistant data processsor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins for SGDGEN local checkpoint

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* use local vocab file for Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* patch for Jenkins CI using local file

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add slot filling prediction and metrics

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor metrics code out of Dialogue GPT Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate backward compatible support for IntentSlotClassificationModel (bert model)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* save prediction file for IntentSlotClassification

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue gpt model training for megatron gpt

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove batch generate for HF GPT2, which causes lower performance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add few shot capability to dialogue gpt model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile and remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update code description and clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate compatibility with ZeroShotIntentModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename folder to dialogue due to increased scope and further refactor for clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* added dialogue GPT for sequence generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add CI test for DialogueGPTGenerationModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate DialogueS2SGenerationModel for generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* modify huggingface utils to support HF t5/BART models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bleu metric style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 2 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 3 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate sgd generation based on user user utterance and system slot-values to generate system utterance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add validation model saving capabilities

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* cleaned up code for SGD Based Answer extender

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue Generation CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix Jenkins CI issue"

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add support for design dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support megatron for dialogue_s2s_generation_model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update checkpoint and predictions filename to include epoch number

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate HF BART MNLI into zero shot intent model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Nearest Neighbour Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor Dialogue SGD Data Processor to make interface for models cleaner

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue S2S Generation model for DialogueSGDDataProcessor interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support sgd and drive thru datasets by zero shot model and nearest neighbour model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add prediction saving code to nearest neighbour and zero shot intent models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo in sgd data processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Mellon QA Data Processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update mellon qa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py to remove outdated info

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address review comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix for cfg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* make dependency on apex optional

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* change NLPDDPluggin calling logic to make it possible to run without apex

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add first draft of tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce ms marco size by removing lines without wellFormedAnswers

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update colab tutorial link in dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* include unit test and some refactor to facilitate unit test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr issues

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove typos in dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support larger files for question answering

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary artifacts to reduce memory use

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* put 0 tensor to device

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update link within dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* restore previously delete files

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error handling when loss = nan

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update nan handling

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss func

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix type error raised in qa_dataset.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add error checking message

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exp logging

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* limit number of negative samples

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused methods and style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add more documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes base on PR review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set wandb logger falseby default

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update interface with megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update inline documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update prompt_ids

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug for megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add megatron t5 inference

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug due to bert tokenizer not being space-aware

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update IntentSlotModel onnx export test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exportable

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* replace functools.cache_property with functools.lru_cache to maintain python 3.7 compatibility

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* improve speed of rank_candidates and support for p tuning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix megatron prompt learning saving bug

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generate_candidate method

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove repeated init text ids and invert attention masks

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* custom collate fn to remove excess padding in batch

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update complete method to mitigate issue when max seq len is low

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generation interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added save inference ready .nemo file with every checkpoint (#5055)

* Added save inference ready .nemo file with every checkpoint

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python style fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* addressed Adi's comment

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added ptuning check in model checkpoint saving

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed save_nemo_on_valdaition default to False

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changes global batch size of adapter CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed num workers to 0

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added first stage of pipeline check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training (#5118)

* Remove ; from jupyter notebook cells

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typos in documentation/code

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix output message to have 'or equal'

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Link formatting fixes

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Add error if max_utts is used in tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Remove max_utts parameter from tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix max_utts removal in tests

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typo if -> is

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Merge r1.12.0 main (#5139)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Add cherry-pick action (#4958)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove duplicate transformers requirement

Signed-off-by: ericharper <complex451@gmail.com>

* Release SOTA Lang ID model  (#5080)

* add pretrained lang id model ambernet

Signed-off-by: fayejf <fayejf07@gmail.com>

* update doc and style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* update branch and package info

Signed-off-by: ericharper <complex451@gmail.com>

* remove upper bounds on lightning and transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove transformers offline from ci

Signed-off-by: ericharper <complex451@gmail.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added ASR model comparison to SDE (#5043)

SDE: Added ASR model comparison tool to SDE
transcribe speech: Added support for many predictions in one file, as well as custom field names
Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix nmt eval sampler (#5154)

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Global init steps (#5143)

* move global step to base

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix fused softmax

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add the missing file

Signed-off-by: Yi Dong <yidong@nvidia.com>

* update the fused kernel

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import error

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import again

Signed-off-by: Yi Dong <yidong@nvidia.com>

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] bug fix - sample rate was being ignored in vocoder dataset (#4518)

* bug fix - sample rate was being ignored in vocoder dataset when not loading mel
* handled n segments for a different sampling rate than original sampling rate
* Added case for n_segments 0, warning for n_segments greater than file length

Signed-off-by: Paarth Neekhara <paarth.n@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add EMA support to NeMo (#4764)

* Added Base files

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Some refactors, swap to using MNIST Lnet

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add a few more tests, allow the callback to be set via the exp manager

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Actually run validation for testing

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Run isort

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add test for saving state/fix saving state

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Use dummy model

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add copyright

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Support saving separate EMA weight module

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add standalone functionality/logging

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Expose more parameters

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Modify to allow option to replace validation

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add jenkins test, formatting

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add cherry-pick action (#4958) (#4961)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix changelog builder (#4962) (#4963)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix cherry pick workflow (#4964) (#4965)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* reorder model check (#4959) (#4967)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* check for active conda environment (#4970) (#4971)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [TTS] fix broken tutorial for MixerTTS. (#4949) (#4976)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Checkpoint averaging class fix (#4946)

* 1. Added args.class_path to provide it externally.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add ability to give seperate datasets for test, train and validation (#4798)

* Add ability to give seperate datasets for test, train and validation

* Addressed Sandeeps comments

* Addressed Sandeeps comments

* Add ability to give seperate datasets for test, train and validation

* Add ability to give seperate datasets for test, train and validation

* Addressed review comments

* Bug fix for common dataset utils

* Add CI tests

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Reformat code

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

* Bug Fix

* Bug Fix

* Update Jenkinsfile

* Addressed comments

* Addressed Eriks comments.

* Addressed Sandeep

* Update Jenkinsfile

* Update Jenkinsfile

* Update dataset_utils.py

* Update Jenkinsfile

* Update Jenkinsfile

* Use GPT CI config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix label models restoring issue from wrighted cross entropy (#4968) (#4975)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add simple pre-commit file (#4983)

* Add simple pre-commit file

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Exclude docs folder

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks"

This reverts commit 053bd5ba579537a5f311b431871c21f3381b43eb.

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execution environment (#4951)

Signed-off-by: Jin Li <liji@nvidia.com>

Signed-off-by: Jin Li <liji@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Adding speaker embedding conditioning in fastpitch (#4986)

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix ASR issues (#4984) (#4991)

* Fix ASR issues

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Revert fix

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix current tests

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* More test coverage

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address reviews

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop bf16 test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* remove print

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add bf16

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Jin Li <liji@nvidia.com>
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: shanmugamr1992 <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: liji-nv <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Subhankar Ghosh <subhankar2321@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix BF16 test (#5162)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix errors in speaker diarization nemo docs (#5153)

* fix docs and docstrings for MSDD

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix nemo docs errors

Signed-off-by: Taejin Park <tango4j@gmail.com>

* reflected review comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add interleaved pipeline schedule to GPT (#5025)

* add virtual pipeline size to config

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* update for list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* add virtual to init

Signed-off-by: ericharper <complex451@gmail.com>

* update first last stage embedding all reduce

Signed-off-by: ericharper <complex451@gmail.com>

* update sequence parallel all reduce for virtual models

Signed-off-by: ericharper <complex451@gmail.com>

* runs but we get an error

Signed-off-by: ericharper <complex451@gmail.com>

* set virtual rank 0 after looping

Signed-off-by: ericharper <complex451@gmail.com>

* account for virtual when determinining first and last pipeline stages

Signed-off-by: ericharper <complex451@gmail.com>

* checkpointing for virtual models in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add checkpoint hooks

Signed-off-by: ericharper <complex451@gmail.com>

* working on validation when resuming

Signed-off-by: ericharper <complex451@gmail.com>

* skip sanity val steps by default in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove comment

Signed-off-by: ericharper <complex451@gmail.com>

* log number of params

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style

Signed-off-by: ericharper <complex451@gmail.com>

* check if self.model is a list

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default size None on init

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default to None in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove ensure_divisibility call

Signed-off-by: ericharper <complex451@gmail.com>

* fix lgtm alerts

Signed-off-by: ericharper <complex451@gmail.com>

* remove num_sanity_val_steps from config

Signed-off-by: ericharper <complex451@gmail.com>

* default virtual pipeline size to none

Signed-off-by: ericharper <complex451@gmail.com>

* check for list

Signed-off-by: ericharper <complex451@gmail.com>

* update assert to make sure we are only doing virtual for gpt

Signed-off-by: ericharper <complex451@gmail.com>

* revert change to get_params_for_weight_decay

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* init var

Signed-off-by: ericharper <complex451@gmail.com>

* add import guard for set virtual model parallel world size

Signed-off-by: ericharper <complex451@gmail.com>

* use import guard

Signed-off-by: ericharper <complex451@gmail.com>

* update calls to fake init in eval scripts

Signed-off-by: ericharper <complex451@gmail.com>

* add _get_fwd_bwd_function

Signed-off-by: ericharper <complex451@gmail.com>

* log all total model parameters

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* reduced to 14 inactive days to be stale for PRs. (#5165)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* refactor TTS documentation organization and add new contents. (#5137)

* refactor TTS documentation organization and add new contents.
* fix asr api bug.
* fix broken links.
* fix unexpected indentation errors.
* fixed unexpected indentation.
* fixed broken paper reference.
* fixed cross-reference and typos.
* fixed toctree errors.
* revert to 'Augmentors'
* reordered TTS tutorial list in starthere.
* ordered api classes alphabetically for each Section.
* fixed underscore typo for fastpitch checkpoint.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* upcase 'Tuning'

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fixed typo for RAD-TTS Aligner

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* reorder aligner section after mel-gen and vocoders in models.rst.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* clarify Mixer-TTS-X and reorder model descriptions alphabetically.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fixed some typos and formats.

Signed-off-by: Xuesong Yang <…
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (NVIDIA#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (NVIDIA#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (NVIDIA#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (NVIDIA#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* Fix race condition when executing with multi-node where some ranks does not wait for setup (NVIDIA#7016)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added bool types to neural_types export (NVIDIA#7032)

Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* rnnt and char utils (NVIDIA#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* char level bug

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix tab text gen (NVIDIA#7022) (NVIDIA#7031)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* removed kwagrs

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Updated config desc

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* ASR Confidence update and tutorial (NVIDIA#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* tutorial added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* for for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* unused import removed

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix review comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments 2

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix config tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fox comments 3

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* install_bs (NVIDIA#7019) (NVIDIA#7028)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fixes for spellmapper (NVIDIA#6994) (NVIDIA#7000)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* added back the retro documents (NVIDIA#7033)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Remove pyyaml (NVIDIA#7052) (NVIDIA#7054)

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* st standalone model (NVIDIA#6969)

* st standalone model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import ordering fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* yttm for asr removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* logging added

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* added inference and translate method

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* remove pos emb from state dict for old models (NVIDIA#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

---------

Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix typo in ASR-TTS tutorial (NVIDIA#7049)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed tutorial's name (NVIDIA#7047)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix documentation for Numba (NVIDIA#7065) (NVIDIA#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update Frame-VAD doc and fix onnx export (NVIDIA#7076)

* update fvad doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update fvad example

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix onnx export

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update test

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* memmap worker arg (NVIDIA#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (NVIDIA#7034) (NVIDIA#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fast Conformer global token fix (NVIDIA#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Refined export_config (NVIDIA#7053) (NVIDIA#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* small Bugfix (NVIDIA#7081)

* small Bugfix (NVIDIA#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (NVIDIA#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Adding docs and models for multiple lookahead cache-aware ASR (NVIDIA#7067) (NVIDIA#7094)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* update TTS readme (NVIDIA#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix absolute path in path join call (NVIDIA#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Disable distopt contiguous param buffer by default (NVIDIA#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* microphone demo (NVIDIA#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [Fix] load_state_dict in nlp_model.py (NVIDIA#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix plot function in vad_utils.py (NVIDIA#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (NVIDIA#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit ae7624d.

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix default attention size (NVIDIA#7141) (NVIDIA#7143)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix evaluator.py for various exceptions by ast (NVIDIA#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* freeze (NVIDIA#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* make sure any empty segments are removed (NVIDIA#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* TE bug fix (NVIDIA#7027) (NVIDIA#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (NVIDIA#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (NVIDIA#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (NVIDIA#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (NVIDIA#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overriden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* adde special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annoatation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* add paths to labeler. (NVIDIA#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithyare@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: trias702 <25867060+trias702@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jan Beckmann <king-jan1999@hotmail.de>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* migrated class

Signed-off-by: dorotat <dorotat@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: dorotat <dorotat@nvidia.com>

* added unit test

Signed-off-by: dorotat <dorotat@nvidia.com>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: dorotat <dorotat@nvidia.com>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit ae7624da7d773a6b9436ff61903dc4b99c7c27cb.

Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: dorotat <dorotat@nvidia.com>

* fix default attention size (#7141) (#7143)

Signed-off-by: dorotat <dorotat@nvidia.com>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix overriden functions to match parent class functions

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <abhishreetm@gmail.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix style

Signed-off-by: Yi Dong <yidong@nvidia.com>

* adde special token only for huggingface model

Signed-off-by: Yi Dong <yidong@nvidia.com>

* change default name

Signed-off-by: Yi Dong <yidong@nvidia.com>

* print out error datapoint content

Signed-off-by: Yi Dong <yidong@nvidia.com>

* show error id

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annotation script working

Signed-off-by: Yi Dong <yidong@nvidia.com>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added examples

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* text to value special case

Signed-off-by: Yi Dong <yidong@nvidia.com>

* configure the slider

Signed-off-by: Yi Dong <yidong@nvidia.com>

* annoatation handles lang

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <yidong@nvidia.com>

* used the file in the test dir

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix json error

Signed-off-by: Yi Dong <yidong@nvidia.com>

* load local tokenizer

Signed-off-by: Yi Dong <yidong@nvidia.com>

* remove mask count check

Signed-off-by: Yi Dong <yidong@nvidia.com>

* added HF dataset backend

Signed-off-by: Yi Dong <yidong@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: dorotat <dorotat@nvidia.com>

* T5 metrics fix (#7037)

* Fix race condition when executing with multi-node where some ranks does not wait for setup (#7016)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* char level bug

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* removed kwagrs

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Updated config desc

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* tutorial added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* for for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* unused import removed

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix review comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments 2

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix config tests

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fox comments 3

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* import ordering fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* yttm for asr removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* logging added

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* added inference and translate method

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <ebakhturina@nvidia.com>

---------

Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba

* Update force float32 flag dynamically

* Update force float32 flag dynamically

* Fix nemo version

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update fvad example

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix onnx export

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update test

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* update

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove extra

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* clean

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix link

Signed-off-by: fayejf <fayejf07@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>

---------

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <degert@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <degert@nvidia.com>

---------

Signed-off-by: Daniel Egert <degert@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit ae7624da7d773a6b9436ff61903dc4b99c7c27cb.

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Add format validation

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* freeze (#7152)

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <rlangman@nvidia.com>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <rlangman@nvidia.com>

---------

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fixes for pr review

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

---------

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* fix links for TN (#7117)

Signed-off-by: Evelina <ebakhturina@nvidia.com>

* update branch (#7135)

Signed-off-by: ericharper <complex451@gmail.com>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* fix version

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict the other way

Signed-off-by: ericharper <complex451@gmail.com>

* keep both

Signed-off-by: ericharper <complex451@gmail.com>

* revert keep both

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_cpitalization_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step outout to self.validation_step_outputs in MultiLabelIntentSlotClassificationMode
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit JenkinsFile for bert_pretrainig.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true fro trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Paralleism and add back NMT Training Post-LN

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit in jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Edit in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when datalaoders are not setup in ModelPT for example while restoring the model.

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmai…
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* fix the mpt chatbot (#6957)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* Remove `compute_on_step` from metrics (#6979)

* Remove `compute_on_step` from metrics

Signed-off-by: smajumdar <titu1994@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove confusing log message

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update tests

Signed-off-by: smajumdar <titu1994@gmail.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Hybrid conformer export (#6983)

* Implemented generic kv-pair setting of export_config from args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid conformer export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Hybrid decoder export

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Changed from **kwargs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docstring

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Docs added

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Stringify args

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Added docs for ASR export configs

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* lowercase ctc

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cache handling without input tensors mutation (#6980)

* Cache handling without input tensors mutation

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#2

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Cleanup#3

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

---------

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* fixes for spellmapper (#6994)

Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>

* Fixing an issue with confidence ensembles (#6987)

* Bug fix for the confidence ensembles

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Relax constraints for the test

Signed-off-by: Igor Gitman <igitman@nvidia.com>

---------

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* [TTS] Append pretrained FastPitch & SpectrogamEnhancer pair to available models (#7012)

* [TTS] fastpitch: add english libritts model with asr stft parameters (25 ms 10 ms)

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* [TTS] enhancer: add pretrained model intended for asr finetuning

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

---------

Signed-off-by: Roman Korostik <rkorostik@nvidia.com>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <karpnv@gmail.com>

* fix tab text gen (#7022)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* TE bug fix (#7027)

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>

* Add support for Numba FP16 RNNT Loss (#6991) (#7038)

* Force working space memory to always be in fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support for fp16 testing in Numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix cost calculation by upcasting to fp32

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add support to check if numba fp16 is available

Signed-off-by: smajumdar <titu1994@gmail.com>

* add RNN-T loss implemented by PyTorch and test code (#5312)

* Fix the bugs in cache-aware streaming Conformer (#5032)

Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* IA3 support for GPT and T5 (#4909)

* init commit for ia3 adater training in GPT

Signed-off-by: arendu <adithya.r@gmail.com>

* ia3 adater training in GPT, models and adapter classes

Signed-off-by: arendu <adithya.r@gmail.com>

* reshape to operate even on non-contiguous tensors

Signed-off-by: arendu <adithya.r@gmail.com>

* configs

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed none init

Signed-off-by: arendu <adithya.r@gmail.com>

* adding adapter and ia3 support for T5 based models

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* config update and t5 model adapter and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused imports

Signed-off-by: arendu <adithya.r@gmail.com>

* predict step for inference

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* adapter inference for t5

Signed-off-by: arendu <adithya.r@gmail.com>

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* fixed bug micro and global batch size in eval

Signed-off-by: arendu <adithya.r@gmail.com>

* minor edit

Signed-off-by: arendu <adithya.r@gmail.com>

* agressive truncation if in test examples if no truncation field is given

Signed-off-by: arendu <adithya.r@gmail.com>

* corrected for language_model_path name changes in main

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused import

Signed-off-by: arendu <adithya.r@gmail.com>

* name change for language_model_path

Signed-off-by: arendu <adithya.r@gmail.com>

* include inter_attention to IA3

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in confg

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fixes

Signed-off-by: arendu <adithya.r@gmail.com>

* removed unused flag

Signed-off-by: arendu <adithya.r@gmail.com>

* addressing PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* address PR comments

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: arendu <adithya.r@gmail.com>

* CI test

Signed-off-by: arendu <adithya.r@gmail.com>

* minor fix in jenkinsfile

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Bug fix - Limit val batches set to 1.0  (#5023)

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Adressed sandeep's comments

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixing limit val batches support in bert

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [bug_fix] kv_channels is used when available (#5066)

* fix bug s.t kv_channels is used when available

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* P&C Docs (#5068) (#5069)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add spe_split_by_unicode_script arg (#5072)

* Add spe_split_by_unicode_script arg

Signed-off-by: Anas <aabouallaban@pm.me>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Anas <aabouallaban@pm.me>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* probabilites -> probabilities (#5078) (#5079)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* increase PR and Issue sweep quantity and active close PRs. (#5073)

* increase PR and Issue sweep quantity and active close PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* update with stricter rules, 30 days to be stale and 7 days to be closed for both Issues and PRs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] added missing German phoneme tokenizer. (#5070) (#5074)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* rename to match prompt leanring (#5076)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Missing fixes from r1.11.0 to T5 finetuning eval (#5054) (#5061)

* Fixes to seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Notebook bug fixes (#5084) (#5085)

* Notebook bug fixes

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Turned nemo install back on

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* reverted notebook

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Updated one line in entity linking nb

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* update strategy in notebook from ddp_fork to dp (#5088) (#5089)

Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix bug in Squeezeformer Conv block (#5011) (#5024)

* Fix bug in Squeezeformer Conv block

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix kernel context

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix access mixin

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fixed megatron lm conversion bug (PTL related) (#5038) (#5063)

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Unhashable type list for Numba Cuda spec augment kernel (#5093) (#5094)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix numba (#5098)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Make it possible to specify output_filename in normalize_with_audio.py (#5092)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Greedy decoding confidence for CTC and RNNT (#4931)

* rnnt confidence draft

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* word confidence

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* advanced entropies added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* refactoring

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* oops forgot a file

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* metrics and benchmarking script added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* style fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* texterrors installation added

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* lgtm and bug fix

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix comments

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* fix typos

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* add missing import after rebase

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Add] SLURP models and examples (#4668)

* add model, util and loss

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor annd update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update and refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update available models

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor data processing

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix typo

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update docs

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor and update

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update doc

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* move transformer to asr.modules

Signed-off-by: stevehuang52 <heh@nvidia.com>

* get rid of jsonlines

Signed-off-by: stevehuang52 <heh@nvidia.com>

* refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* revert changes to nlp

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* only optimize params that are part of the adapter modules (#5086)

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Pipeline Parallel T5 Prompt Learning (#4956)

* Added pre process flag checks and pipeline parallel in fwd

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added rank check for pipeline parallel

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 prompt learning works!

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* IA3 passing CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed typo

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* removed optimizer setup so Adi's change will not conflict

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] remove phonemizer.py (#5090)

remove phonemizer.py and convert code block to markdown in the tutorial.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Decoding with PP > 2 fix (#5091) (#5103)

* set sequence lenghts in the pipeline properly

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] fixed wrong val loss for epoch 0 and inconsistent metrics names (#5087) (#5102)

* fixed hifigan configs as well
* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix and refactor consumed samples save/restore for Megatron models. (#5077)

* Fixes and refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove unused imports

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* RIR corpus generator tool (#4927)

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Multiprocessing fix (#5106) (#5107)

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>

Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
Co-authored-by: Matvei Novikov <mattyson.so@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Bug fix] PC lexical + audio (#5109) (#5110)

* training running

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [Fix] schedulers with no max_steps param (#4564)

* fix schedulers

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update to use python inspect module

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update

Signed-off-by: stevehuang52 <heh@nvidia.com>

Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 prompt learning fixes missing from r.11.0 merge (#5075) (#5101)

* Fix special tokens

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add NeMo TTS Primer Tutorial (#4933)

* [TTS] Add NeMo TTS Primer Tutorial

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add Squeezeformer CTC model checkpoints on Librispeech (#5121)

Signed-off-by: smajumdar <titu1994@gmail.com>

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* adding loss normalization options to rnnt joint  (#4829)

* adding normalization options to rnnt joint loss

* moving the param to joint

* moving loss normalization to rnnt loss config

* style

* cleaning up

* fixing sum reduction in joint

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* moving reduction into RNNT loss class

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

* typos

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Asr concat dataloader (#5108)

* forced precision

* typo

* initial commit

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typos and bugs

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* reverting conformer encoder

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* additional checks

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* adding support to CTC models as well

* reverting conformer_encoder

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* typo

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactoring

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

* merging

Signed-off-by: Dima Rekesh <drekesh@nvidia.com>

Signed-off-by: Dima Rekesh <bmwshop@gmail.com>
Signed-off-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: Dima Rekesh <drekesh@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix blossom ci unittests

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* bugfix: pybtex.database.InvalidNameString: Too many commas in author field. (#5112) (#5115)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Uppdate container version to 22.09 (#5105)

* update container version

Signed-off-by: ericharper <complex451@gmail.com>

* pin click

Signed-off-by: ericharper <complex451@gmail.com>

* pin click 8.0.2

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Remove unsupported arguments from MegatronNMT (#5065)

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* pp2 support for T5 IA3 learning and T5 Adapters learning (#5116)

* enabling pp2

Signed-off-by: arendu <adithya.r@gmail.com>

* optimizer update

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* T5 pp>1 support for adapters and ia3

Signed-off-by: arendu <adithya.r@gmail.com>

* fix bug with missing adapter_tuning

Signed-off-by: arendu <adithya.r@gmail.com>

* inference error fixed, pp=2

Signed-off-by: arendu <adithya.r@gmail.com>

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* T5 Prompt Learning Fixes for Pipeline Parallel (#5120)

* Initial fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Added back validation acc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Put num workers back

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added relative encoding if statament

Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>

* Added back val loss only validation

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Revert "Added back val loss only validation"

This reverts commit 86d8f4806fe30335c40c3716ce18259939df500f.

* Removed val acc for PP > 1

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Removed enc_seq_len if statement

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back validation acc calc

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Co-authored-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Virginia Adams <vadams@selene-login-01.nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* add doc info (#4721)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] Add SpanishCharsTokenizer (#5135)

* [TTS] Add SpanishCharsTokenizer

Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Update megatron interface to dialogue (#4936)

* fix style formatting

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update template to include description of intent

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes based on requests in review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add compatibility with assistant dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove dialogue_state_tracking

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update huggingface utils for dialogue

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add docstrings for assistant data processsor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins for SGDGEN local checkpoint

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* use local vocab file for Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* patch for Jenkins CI using local file

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add slot filling prediction and metrics

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor metrics code out of Dialogue GPT Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate backward compatible support for IntentSlotClassificationModel (bert model)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* save prediction file for IntentSlotClassification

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue gpt model training for megatron gpt

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove batch generate for HF GPT2, which causes lower performance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add few shot capability to dialogue gpt model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile and remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update code description and clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate compatibility with ZeroShotIntentModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* rename folder to dialogue due to increased scope and further refactor for clarity

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* added dialogue GPT for sequence generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add CI test for DialogueGPTGenerationModel

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate DialogueS2SGenerationModel for generation task (e.g. answer extender)

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* modify huggingface utils to support HF t5/BART models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bleu metric style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* debug bleu metric

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 2 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update 3 based on PR #3893

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate sgd generation based on user user utterance and system slot-values to generate system utterance

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add validation model saving capabilities

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* cleaned up code for SGD Based Answer extender

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue Generation CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix Jenkins CI issue"

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add support for design dataset

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support megatron for dialogue_s2s_generation_model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update CI

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update checkpoint and predictions filename to include epoch number

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate HF BART MNLI into zero shot intent model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Nearest Neighbour Model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* refactor Dialogue SGD Data Processor to make interface for models cleaner

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update Dialogue S2S Generation model for DialogueSGDDataProcessor interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update jenkins

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support sgd and drive thru datasets by zero shot model and nearest neighbour model

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add prediction saving code to nearest neighbour and zero shot intent models

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix typo in sgd data processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* integrate Dialogue Mellon QA Data Processor

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update mellon qa

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py to remove outdated info

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue_config.yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address review comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix for cfg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* make dependency on apex optional

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* change NLPDDPluggin calling logic to make it possible to run without apex

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add first draft of tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* reduce ms marco size by removing lines without wellFormedAnswers

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update colab tutorial link in dialogue docs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* include unit test and some refactor to facilitate unit test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr issues

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove typos in dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* support larger files for question answering

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unnecessary artifacts to reduce memory use

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* put 0 tensor to device

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update link within dialogue tutorial

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* restore previously delete files

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error handling when loss = nan

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update nan handling

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss func

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update spanning loss

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix type error raised in qa_dataset.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add error checking message

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert back to float32

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exp logging

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msgs

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update loading of large file from pickle to json

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* limit number of negative samples

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* revert post processing

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused methods and style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add more documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused imports

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* changes base on PR review

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set wandb logger falseby default

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update interface with megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update inline documentation

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update prompt_ids

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update error msg

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* set inference = False for dialgue prompt learning during trainng

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused code

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update config yaml

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug for megatron gpt prompt learning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove unused import

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address comments in PR

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* add megatron t5 inference

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix bug due to bert tokenizer not being space-aware

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update IntentSlotModel onnx export test

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update style

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update exportable

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address PR comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* replace functools.cache_property with functools.lru_cache to maintain python 3.7 compatibility

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* improve speed of rank_candidates and support for p tuning

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update dialogue.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* fix megatron prompt learning saving bug

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generate_candidate method

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* remove repeated init text ids and invert attention masks

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update typo

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* custom collate fn to remove excess padding in batch

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* style fix

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update complete method to mitigate issue when max seq len is low

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* address pr comments

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

* update generation interface

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Zhilin Wang <zhilinw@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added save inference ready .nemo file with every checkpoint (#5055)

* Added save inference ready .nemo file with every checkpoint

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python style fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* addressed Adi's comment

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added ptuning check in model checkpoint saving

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed save_nemo_on_valdaition default to False

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changes global batch size of adapter CI

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Changed num workers to 0

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* added first stage of pipeline check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training (#5118)

* Remove ; from jupyter notebook cells

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typos in documentation/code

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix output message to have 'or equal'

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Link formatting fixes

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Add error if max_utts is used in tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Remove max_utts parameter from tarred datasets

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix max_utts removal in tests

Signed-off-by: Igor Gitman <igitman@nvidia.com>

* Fix typo if -> is

Signed-off-by: Igor Gitman <igitman@nvidia.com>

Signed-off-by: Igor Gitman <igitman@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Merge r1.12.0 main (#5139)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Add cherry-pick action (#4958)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove duplicate transformers requirement

Signed-off-by: ericharper <complex451@gmail.com>

* Release SOTA Lang ID model  (#5080)

* add pretrained lang id model ambernet

Signed-off-by: fayejf <fayejf07@gmail.com>

* update doc and style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>

* update branch and package info

Signed-off-by: ericharper <complex451@gmail.com>

* remove upper bounds on lightning and transformers

Signed-off-by: ericharper <complex451@gmail.com>

* remove transformers offline from ci

Signed-off-by: ericharper <complex451@gmail.com>

* upper bound transformers

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Added ASR model comparison to SDE (#5043)

SDE: Added ASR model comparison tool to SDE
transcribe speech: Added support for many predictions in one file, as well as custom field names
Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* fix nmt eval sampler (#5154)

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix Global init steps (#5143)

* move global step to base

Signed-off-by: Yi Dong <yidong@nvidia.com>

* fix fused softmax

Signed-off-by: Yi Dong <yidong@nvidia.com>

* add the missing file

Signed-off-by: Yi Dong <yidong@nvidia.com>

* update the fused kernel

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import error

Signed-off-by: Yi Dong <doyend@gmail.com>

* fix import again

Signed-off-by: Yi Dong <yidong@nvidia.com>

Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Yi Dong <doyend@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* [TTS] bug fix - sample rate was being ignored in vocoder dataset (#4518)

* bug fix - sample rate was being ignored in vocoder dataset when not loading mel
* handled n segments for a different sampling rate than original sampling rate
* Added case for n_segments 0, warning for n_segments greater than file length

Signed-off-by: Paarth Neekhara <paarth.n@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add EMA support to NeMo (#4764)

* Added Base files

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Some refactors, swap to using MNIST Lnet

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add a few more tests, allow the callback to be set via the exp manager

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Actually run validation for testing

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Run isort

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add test for saving state/fix saving state

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Use dummy model

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add copyright

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Support saving separate EMA weight module

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add standalone functionality/logging

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Expose more parameters

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Modify to allow option to replace validation

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add jenkins test, formatting

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add cherry-pick action (#4958) (#4961)

* add cherry-pick action

Signed-off-by: ericharper <complex451@gmail.com>

* Pin Transformers version to fix CI (#4955)

* Pin transformers version in CI to prevent offline tokenizer loading error

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop version

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Disable offline temporarily

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Enable offline

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix changelog builder (#4962) (#4963)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix cherry pick workflow (#4964) (#4965)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* reorder model check (#4959) (#4967)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* check for active conda environment (#4970) (#4971)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [TTS] fix broken tutorial for MixerTTS. (#4949) (#4976)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Checkpoint averaging class fix (#4946)

* 1. Added args.class_path to provide it externally.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>

Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add ability to give seperate datasets for test, train and validation (#4798)

* Add ability to give seperate datasets for test, train and validation

* Addressed Sandeeps comments

* Addressed Sandeeps comments

* Add ability to give seperate datasets for test, train and validation

* Add ability to give seperate datasets for test, train and validation

* Addressed review comments

* Bug fix for common dataset utils

* Add CI tests

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Reformat code

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>

* Bug fix

* Bug Fix

* Bug Fix

* Update Jenkinsfile

* Addressed comments

* Addressed Eriks comments.

* Addressed Sandeep

* Update Jenkinsfile

* Update Jenkinsfile

* Update dataset_utils.py

* Update Jenkinsfile

* Update Jenkinsfile

* Use GPT CI config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* fix label models restoring issue from wrighted cross entropy (#4968) (#4975)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add simple pre-commit file (#4983)

* Add simple pre-commit file

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Exclude docs folder

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks"

This reverts commit 053bd5ba579537a5f311b431871c21f3381b43eb.

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execution environment (#4951)

Signed-off-by: Jin Li <liji@nvidia.com>

Signed-off-by: Jin Li <liji@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Adding speaker embedding conditioning in fastpitch (#4986)

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix ASR issues (#4984) (#4991)

* Fix ASR issues

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Revert fix

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Fix current tests

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* More test coverage

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address reviews

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Drop bf16 test

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Address review

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* remove print

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

* Add bf16

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Micha Livne <mlivne@cs.toronto.edu>
Signed-off-by: shanmugamr1992 <shanmugamr1992@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Jin Li <liji@nvidia.com>
Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: shanmugamr1992 <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: liji-nv <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Subhankar Ghosh <subhankar2321@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix BF16 test (#5162)

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>

Signed-off-by: SeanNaren <snarenthiran@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Fix errors in speaker diarization nemo docs (#5153)

* fix docs and docstrings for MSDD

Signed-off-by: Taejin Park <tango4j@gmail.com>

* fix nemo docs errors

Signed-off-by: Taejin Park <tango4j@gmail.com>

* reflected review comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Add interleaved pipeline schedule to GPT (#5025)

* add virtual pipeline size to config

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* convert model to list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* update for list of modules

Signed-off-by: ericharper <complex451@gmail.com>

* add virtual to init

Signed-off-by: ericharper <complex451@gmail.com>

* update first last stage embedding all reduce

Signed-off-by: ericharper <complex451@gmail.com>

* update sequence parallel all reduce for virtual models

Signed-off-by: ericharper <complex451@gmail.com>

* runs but we get an error

Signed-off-by: ericharper <complex451@gmail.com>

* set virtual rank 0 after looping

Signed-off-by: ericharper <complex451@gmail.com>

* account for virtual when determinining first and last pipeline stages

Signed-off-by: ericharper <complex451@gmail.com>

* checkpointing for virtual models in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add checkpoint hooks

Signed-off-by: ericharper <complex451@gmail.com>

* working on validation when resuming

Signed-off-by: ericharper <complex451@gmail.com>

* skip sanity val steps by default in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove comment

Signed-off-by: ericharper <complex451@gmail.com>

* log number of params

Signed-off-by: ericharper <complex451@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style

Signed-off-by: ericharper <complex451@gmail.com>

* check if self.model is a list

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default size None on init

Signed-off-by: ericharper <complex451@gmail.com>

* make virtual pipeline default to None in config

Signed-off-by: ericharper <complex451@gmail.com>

* remove ensure_divisibility call

Signed-off-by: ericharper <complex451@gmail.com>

* fix lgtm alerts

Signed-off-by: ericharper <complex451@gmail.com>

* remove num_sanity_val_steps from config

Signed-off-by: ericharper <complex451@gmai…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants