Commit
* Fix race condition when executing with multi-node where some ranks do not wait for setup (#7016). Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
* Added bool types to neural_types export (#7032). Signed-off-by: tbartley94 <tbartley@nvidia.com>
* RNNT and char utils (#6971): rnnt_ngram_merge and a char-level bug fix. Signed-off-by: Nikolay Karpov <karpnv@gmail.com>; Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
* Fix tab text gen (#7022) (#7031). Signed-off-by: Yi Dong <yidong@nvidia.com>
* Fixed kwargs for metric instance init. Signed-off-by: jubick1337 <mattyson.so@gmail.com>
* Removed kwargs. Signed-off-by: jubick1337 <mattyson.so@gmail.com>
* Updated config description. Signed-off-by: jubick1337 <mattyson.so@gmail.com>
* ASR confidence update and tutorial (#6810): small fixes and tests, tutorial added and reworked, deprecated parameters for greedy configs, re-assignment moved to configs, renamings for the confidence ensemble, review comments addressed, CI tolerance increased. Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
* install_bs (#7019) (#7028). Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
* Fixes for SpellMapper (#6994) (#7000). Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>; Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
* Added back the RETRO documents (#7033). Signed-off-by: Yi Dong <yidong@nvidia.com>
* Remove pyyaml (#7052) (#7054). Signed-off-by: smajumdar <titu1994@gmail.com>
* ST standalone model (#6969): style fix, sacrebleu import fix, unused imports removed, import guard for NLP inside the ASR transformer BPE model, CodeQL fixes, import ordering fix, YTTM removed for ASR, logging added, inference and translate methods added. Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
* Remove positional embeddings from state dict for old models (#7068): moved to nlp_model, NMT test fixed. Signed-off-by: Evelina <ebakhturina@nvidia.com>
* Fix typo in ASR-TTS tutorial (#7049). Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fixed tutorial's name (#7047). Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>; Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix documentation for Numba (#7065) (#7077): update force-float32 flag dynamically, fix NeMo version. Signed-off-by: smajumdar <titu1994@gmail.com>; Co-authored-by: Eric Harper <complex451@gmail.com>
* Update Frame-VAD doc and fix ONNX export (#7076): doc and example updates, typo fix, ONNX export fix, test update, refactor. Signed-off-by: stevehuang52 <heh@nvidia.com>; Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
* memmap worker arg (#7062). Signed-off-by: arendu <adithya.r@gmail.com>
* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082). Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
* Fast Conformer global token fix (#7085): revert to the old way plus a series of fixes and cleanup. Signed-off-by: sam1373 <samuelkriman@gmail.com>
* Refined export_config (#7053) (#7066); rolled back the hierarchy change. Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Small bugfix (#7079) (#7081): fix branch, typo and link; updates to tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb. Signed-off-by: fayejf <fayejf07@gmail.com>; Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092); updated after review and removed the --cuda flag. Signed-off-by: Daniel Egert <degert@nvidia.com>
* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094).
* Update TTS README (#7088). Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
* Fix absolute path in path join call (#7099). Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
* Disable distopt contiguous param buffer by default (#7095). Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Microphone demo (#7110). Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
* [Fix] load_state_dict in nlp_model.py (#7086). Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
* Fix plot function in vad_utils.py (#7113). Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
* Fixed small bug with NoisePerturbationWithNormalization (#7118). Signed-off-by: Daniel Egert <degert@nvidia.com>
* Fix import guard checks (#7124). Signed-off-by: smajumdar <titu1994@gmail.com>
* Revert "Fix import guard checks (#7124)" (#7125); this reverts commit a46e325.
* Fix import guard checks (#7126). Signed-off-by: smajumdar <titu1994@gmail.com>
* Add updated FC CTC and RNNT xxl models (#7128) (#7130).
* [TTS] Create EnCodec training recipe (#6852): recipe updated, EnCodec renamed to AudioCodec, EnCodec unit tests added, copyright header added to distributed.py. Signed-off-by: Ryan <rlangman@nvidia.com>
* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061). Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>; Co-authored-by: David <amosalla@asu.edu>
* Fix default attention size (#7141) (#7143).
* Fix evaluator.py for various exceptions by ast (#7150). Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
* [TTS][ZH] Add Chinese TTS recipes based on IPA symbol sets (#6893): new pinyin and IPA dictionaries with 36 finals, YAML configs for 24-final pinyin and IPA unified into a single config with detailed comments on supported candidates, 36-final IPA chosen as the default phoneme dictionary. Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
* [TTS] Add output audio format to preprocessing (#6889): format validation added, data tutorial fixed. Signed-off-by: Ryan <rlangman@nvidia.com>
* freeze (#7152). Signed-off-by: arendu <adithya.r@gmail.com>
* Make sure any empty segments are removed (#7155). Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Update RIR generation scripts (#6547): reduce room size if evaluation of params fails, add randomized mic placement, add diffuse noise generation, add an option to specify the format and subtype for saved audio. Signed-off-by: Ante Jukić <ajukic@nvidia.com>
* A quickstart speech enhancement tutorial (#6492): a simple example of training a model for the speech enhancement task. Signed-off-by: Ante Jukić <ajukic@nvidia.com>
* NFA subtitle file config, specify colors and vertical alignment (#7160): allow specifying text colors in the ASS subtitle file, use vertical_alignment instead of marginv in ass_file_config, document CTMFileConfig and ASSFileConfig in the NFA README. Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153). Signed-off-by: Tim Moon <tmoon@nvidia.com>
* TE bug fix (#7027) (#7036). Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
* [TTS] Remove nested TTS configs (#7154): tutorial modified to support multiple sampling rates, min_duration unit clarified, 22.05 kHz highfreq defaulted to null. Signed-off-by: Ryan <rlangman@nvidia.com>
* Merge release r1.20.0 to main (#7167). Signed-off-by: ericharper <complex451@gmail.com>
  * update package info. Signed-off-by: ericharper <complex451@gmail.com>
  * Add ASR with TTS tutorial; fix enhancer usage (#6955). Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
  * install_bs (#7019). Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * Fix typo and branch in tutorial (#7048). Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
  * Fix syntax error introduced in PR-7079 (#7102), with fixes from PR review. Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
  * Fix links for TN (#7117). Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * update branch (#7135). Signed-off-by: ericharper <complex451@gmail.com>
  * Fixed main and merging this to r1.20 (#7127); update vad_utils.py. Signed-off-by: Taejin Park <tango4j@gmail.com>; Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
  * update branch, fix version, and resolve merge conflicts. Signed-off-by: ericharper <complex451@gmail.com>
* Upgrade to pytorch lightning 2.0 (#6433). Signed-off-by: Abhishree <abhishreetm@gmail.com>
  * Upgrade the pytorch lightning version in requirements and add initial fixes to support lightning 2.0.
  * Add replacements for replace_sampler_ddp and resume_from_checkpoint_fit_path; replace validation_epoch_end, training_epoch_end and test_epoch_end with on_validation_epoch_end, on_train_epoch_end and on_test_epoch_end.
  * Change logger=None to logger=False in Trainer objects (including test_ptl_stateless_timer.py); remove PTL 2.0 deprecated Trainer args from the TrainerConfig dataclass; modify trainer.precision checks and add default values for args to fix an AttributeError.
  * Remove the outputs arg from on_validation_epoch_end, on_test_epoch_end and on_train_epoch_end (including MultiBinaryAccuracy and ASR docstrings); accumulate step outputs in self.validation_step_outputs and self.test_step_outputs instance variables and clear them at epoch end across ASR, NLP and TTS models and tests (test_ema.py, test_ptl_stateless_timer.py, check_for_ranks.py).
  * Replace resume_from_checkpoint with ckpt_path where needed; remove resume_from_checkpoint as a trainer arg in the GPT, T5 and duplex_tn config YAMLs; reference checkpoints with trainer.ckpt_path; reset ckpt_path for test in enc_dec_nmt.py.
  * Explicitly set the accelerator to CPU in unit tests run on CPU; replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py; add a default dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py and for on_test_batch_start/end in TimingCallback.
  * Index validation/test step outputs with dataloader_idx and add multi-dataloader condition checks (type and length of trainer.val/test_dataloaders or self._validation/test_dl) in ModelPT, sgdqa_model.py, punctuation_capitalization_model.py, the dialogue and text classification models, BERTQAModel, S2SQAModel, GPTQAModel, MultiLabelIntentSlotClassificationModel, CTCG2PModel and EncDecClassificationModel; make ds_item a list in dialogue_config.yaml; use a single validation pass shared by validation_step and test_step.
  * Remove self.pre_configure_ddp from NLPDDPStrategy as it is removed in PTL 2.0; add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py and find_unused_parameters=True for the DDP strategy in g2p_heteronym_classification_train_and_evaluate.py.
  * TypeCast precision to str in attention.py and utils_funcs.py to avoid a TypeError; modify precision checks to account for 16-mixed and bf16-mixed (and avoid indexing) across the Megatron GPT, T5, BERT, BART, RETRO and NMT pretraining and finetuning scripts and megatron_t5_prompt_learning.py.
  * Combine xx_epoch_end and on_xx_epoch_end into a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py; add try/except StopIteration in validation_step for models with dataloader_iter; prefetch dataloader_iter to prevent hangs for PP>1; override setup() in NLPDDPStrategy to avoid a hang during predict with PP>1; add _prefetch to NLPModel.
  * Initialize self.validation/test_step_outputs in MegatronGPTSFTModel setup to handle cases where dataloaders are not set up in ModelPT, for example when restoring the model; remove self.validation_step_outputs_sft.
  * Remove pyyaml from requirements; remove limit_val_batches for the MockGPTDataset test; edit the Jenkinsfile to disable validation for bert_pretraining.py as a workaround for a trainer.val_dataloader None error, comment Megatron T5 IA3 PP=2 (dataloader_iter issue with PTL 2.0) and a few other failing tests, add trainer.limit_val_batches for Megatron NMT Training TP=2, and later uncomment tests.
  * Fix typos and unused imports, refactor redundant functions, and fix overridden functions to match parent class signatures.
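The PTL 2.0 entry above repeatedly describes the same migration pattern: epoch-end hooks no longer receive an outputs argument, step outputs are accumulated on the module and cleared manually, resume_from_checkpoint moves from the Trainer to fit(), and precision becomes a string such as "16-mixed" or "bf16-mixed". A minimal sketch of that pattern on a toy LightningModule; the module, tensors and checkpoint path here are illustrative, not NeMo code:

```python
import torch
import pytorch_lightning as pl


class ToyModel(pl.LightningModule):
    """Illustrative module showing the PTL 2.0 epoch-end pattern."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 2)
        # PTL 2.0: epoch-end hooks no longer receive `outputs`,
        # so the model accumulates its own step outputs.
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.validation_step_outputs.append(loss)
        return loss

    # PTL < 2.0 equivalent: def validation_epoch_end(self, outputs): ...
    def on_validation_epoch_end(self):
        avg_loss = torch.stack(self.validation_step_outputs).mean()
        self.log("val_loss", avg_loss)
        self.validation_step_outputs.clear()  # free memory between epochs

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# PTL 2.0: precision is given as a string ("16-mixed", "bf16-mixed"), and
# `resume_from_checkpoint` is gone from the Trainer; pass `ckpt_path` to fit().
trainer = pl.Trainer(max_epochs=1, logger=False, accelerator="cpu")
# trainer.fit(ToyModel(), train_dataloaders=..., ckpt_path="last.ckpt")
```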
* Include the scripts for preprocessing OAST and unit tests for chat SFT datasets (#7112). Signed-off-by: Yi Dong <yidong@nvidia.com>
  * Scripts for SFT with style fixes; special token added only for the HuggingFace model; default name changed.
  * Annotation script working: prints out error datapoint content and error id, handles lang, configures the slider, handles the text-to-value special case, and aims for compatibility with the HuggingFace tokenizer.
  * Examples added; unit test for the chat SFT dataset added using the file in the test dir; JSON error fixed; local tokenizer loaded; mask count check removed; HF dataset backend added.
* add paths to labeler. (#7087) Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
---------
Signed-off-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Signed-off-by: jubick1337 <mattyson.so@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: Alexandra Antonova <antonova_sasha@list.ru>
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Daniel Egert <degert@nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jan Beckmann <king-jan1999@hotmail.de>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Co-authored-by: Kim Ngo <6362111+findkim@users.noreply.github.com>
Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <karpnv@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Aleksandr Laptev <alaptev@nvidia.com>
Co-authored-by: bene-ges <antonova_sasha@list.ru>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <adithyare@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: trias702 <25867060+trias702@users.noreply.github.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Jan Beckmann <king-jan1999@hotmail.de>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Ryan Langman <rlangman@nvidia.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>
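Two entries in the list above (#7016 and #7061) deal with multi-node races in which non-zero ranks read artifacts that rank 0 is still preparing. The usual guard, sketched below with a hypothetical cache file standing in for the real setup work, is to let rank 0 do the setup and make every other rank wait at a barrier; this is the generic pattern, not necessarily the exact fix in those PRs:

```python
import os
import torch.distributed as dist


def prepare_cached_file(cache_path: str) -> None:
    """Rank 0 builds a cached artifact; all other ranks wait for it.

    Generic sketch of the guard pattern behind races like #7016 / #7061,
    with a hypothetical cache file in place of the real setup step.
    """
    in_distributed_run = dist.is_available() and dist.is_initialized()
    rank = dist.get_rank() if in_distributed_run else 0

    if rank == 0 and not os.path.exists(cache_path):
        with open(cache_path, "w") as f:
            f.write("tokenizer-cache\n")  # stand-in for the real setup work

    # Without this barrier, other ranks can race ahead and try to read
    # a file that rank 0 has not finished writing yet.
    if in_distributed_run:
        dist.barrier()
```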