Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(export): GPT models w/ bias=False convert properly #11255

Merged
merged 2 commits into from
Nov 12, 2024

Conversation

terrykong
Copy link
Collaborator

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: Terry Kong <terryk@nvidia.com>
Copy link
Contributor

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.export.trt_llm.tensorrt_llm_build
nemo/export/trt_llm/tensorrt_llm_build.py:31:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/export/trt_llm/tensorrt_llm_build.py:19:0: W0611: Unused Builder imported from tensorrt_llm.builder (unused-import)
nemo/export/trt_llm/tensorrt_llm_build.py:23:0: W0611: Unused add_lora imported from tensorrt_llm.models.modeling_utils (unused-import)

-----------------------------------
Your code has been rated at 9.39/10

Thank you for improving NeMo's documentation!

Copy link
Collaborator

@meatybobby meatybobby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@terrykong terrykong enabled auto-merge (squash) November 12, 2024 18:24
Copy link
Contributor

[🤖]: Hi @terrykong 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully

So it might be time to merge this PR or get some approvals

I'm just a bot so I'll leave it you what to do next.

//cc @pablo-garay @ko3n1g

@terrykong terrykong merged commit 2d4f495 into main Nov 12, 2024
165 of 166 checks passed
@terrykong terrykong deleted the terryk/export-gpt-bias-false branch November 12, 2024 20:22
zpx01 added a commit that referenced this pull request Nov 14, 2024
* Timestamps to transcribe (#10950)

* inital version

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Support for RNNT, TDT, Hybrid Models

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* move change of decoder stratery from mixin to individual model class

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* update transcribe_speech.py

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* uncomment

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* add docs

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* fix docs

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* codeql fixes

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* unit tests

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* minor rebase fix

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* add None case to restore the state set outside using decoding_stratergy()

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* remove ipdb traces

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* updates doc for transcription.py

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* remove preserve alignment for AED models as it doesn;t support it

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* lint warnings

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 1b8fce7 ! (#11247)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 47ff44e ! (#11254)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Handling tokenizer in PTQ for Nemo 2.0 (#11237)

* Handling tokenizer in PTQ for Nemo 2.0

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Print log msg and enable overriding

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Warning for legacy tokenizer config

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Save HF tokenizer to make tokenizer_config.yaml (almost) redundant

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Handle tokenizer in a unified way

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Move saving context within export

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix typo in get_tokenzier

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Reduce diff

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Drop unused import

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix finetuning datamodule resume (#11187)

* fix datamodule resume

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* fix subclass

Signed-off-by: Chen Cui <chcui@nvidia.com>

* docstrings and formats

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>

* ci: Move `bump mcore` to templates (#11229)

* ci: Move `bump mcore` to templates

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* final

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

---------

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix: Update baseline (#11205)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Remove deprecated builder_opt param from build command (#11259)

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* chore(beep boop 🤖): Bump `MCORE_TAG=aded519...` (2024-11-12) (#11260)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* [Doc fixes] update file names, installation instructions, bad links (#11045)

* rename eval_beamsearch_ngram.py to eval_beamsearch_ngram_ctc.py in docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* replace out of date installation instructions with pointer to NeMo README installation section

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* point to user guide instead of readme

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* some link updates

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* update more links

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* fix(export): GPT models w/ bias=False convert properly (#11255)

Signed-off-by: Terry Kong <terryk@nvidia.com>

* ci: Run secrets detector on `pull_request_target` (#11263)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix(export): update API for disabling device reassignment in TRTLLM for Aligner (#10863)

* fix(export): update API for disabling device reassignment in TRTLLM for Aligner

[feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime

Signed-off-by: Terry Kong <terryk@nvidia.com>

fix: forgot to always set _disable_torch_cuda_device_set

Signed-off-by: Terry Kong <terryk@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>

Apply isort and black reformatting

Signed-off-by: terrykong <terrykong@users.noreply.github.com>

invert torch device set

Signed-off-by: Terry Kong <terryk@nvidia.com>

* remove comment

Signed-off-by: Terry Kong <terryk@nvidia.com>

---------

Signed-off-by: Terry Kong <terryk@nvidia.com>

* new vfm training features (#11246)

Signed-off-by: Zeeshan Patel <zeeshanp@nvidia.com>
Co-authored-by: Zeeshan Patel <zeeshanp@nvidia.com>

* Update pruning and distillation tutorial notebooks (#11091)

* Update pruning and distillation tutorial notebooks

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Update README

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Update batch size in width pruning script

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Update README

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

---------

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Beam search algorithm implementation for TDT models (#10903)

* initial commit

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: default beam search implementation

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: changed to removing duplicate hypothesis in separate function

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: changed to cartesian product in choosing best hyp

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: minor fixes in comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: maes decoding strategy

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: durations filtering in maes, lm fusion in progress

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: refactored, added comments, command line args, finalized

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: removed prints

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: docs

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix: minor fix

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: rm beam_size=1 exception, rm duplicates check, fix error handling

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: error handling

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix: removed evaluations file

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rn: blank scoring

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm: blank scoring and duration beam size

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix: removed durations_beam_size from default beam search

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: logaddexp

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm: prefix search

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rn: nested loop over extensions

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: bug with caching

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm: topk on durations

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: restored prefix search

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: fixed comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* refactored duplicate merging

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* changes batch scoring

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* refactored rnnt batch scoring

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* alsd first working

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* refactored

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* remove stacking operations

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fixes im base class

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* remove potentially uninitialized local variable

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* default beam search minor fixes

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add test, fix maes timesteps

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm file

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm file

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add ngram lm test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix maes_num_steps=1

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix kenlm model path

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix kenlm model full path

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* made requested changes

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* merge after isort

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add prints to test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add Kenlm to asr requirements

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* remove prints in tests

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add kenlm to test requirements

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm kenlm from link, add package-name

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm second kenlm installation

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm kenlm from dependencies make test optional

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix in test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix in test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* splitted docstrings

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* splitted docstrings

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fixes to python3 type annotations

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* merging

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* merging

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix in return type

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* rm time_idx

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix comments to python3 style

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

---------

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* update nemo1->2 conversion according to changes in main (#11253)

* update nemo1->2 conversion according to changes in main

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* format fix

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

* add docstrings

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

---------

Signed-off-by: Huiying Li <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* Add llama 3.1 recipes (#11273)

* add llama 3.1 recipes

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* fix pylint

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Fix llama3.1 wrong config in io.json

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: Ao Tang <aot@nvidia.com>

* Fix Finetune Recipe (#11267)

* Fix Starcoder_15 SFT recipe

* Fix PP type SFT recipe

* Fix PP type SFT recipe

* Fix Gemma2b SFT TP=1

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* remove pp dtype

* remove pp dtype

* Configure no restart validation loop in nl.Trainer (#11029)

* Configure no restart validation loop in nl.Trainer

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Skip validation whenever restarting=True

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* Handle _io_unflatten_object when _thread_local.output_dir is not available (#11199)

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* change default ckpt name (#11277)

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Use MegatronDataSampler in HfDatasetDataModule (#11274)

* Use MegatronDataSampler in HfDataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Remove opencc upperbound (#10909)

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Zeeshan Patel <zeeshanp@nvidia.com>
Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Signed-off-by: Huiying Li <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Jan Lasek <janek.lasek@gmail.com>
Co-authored-by: Chen Cui <chcui@nvidia.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
Co-authored-by: Zeeshan Patel <zeeshanp@nvidia.com>
Co-authored-by: gvenkatakris <gvenkatakris@nvidia.com>
Co-authored-by: lilithgrigoryan <38436437+lilithgrigoryan@users.noreply.github.com>
Co-authored-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Huiying <willwin.lee@gmail.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: Ao Tang <aot@nvidia.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>
Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: Dong Hyuk Chang <thomaschang26@tutanota.com>
HuiyingLi pushed a commit to HuiyingLi/NeMo that referenced this pull request Nov 15, 2024
terrykong added a commit that referenced this pull request Nov 15, 2024
pablo-garay pushed a commit that referenced this pull request Nov 15, 2024
yashaswikarnati pushed a commit that referenced this pull request Nov 21, 2024
Signed-off-by: Terry Kong <terryk@nvidia.com>
ShriyaPalsamudram added a commit that referenced this pull request Dec 2, 2024
Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Fix FaultTolerencePlugin

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Add StragglerDetection callback to all NeMo2.0 recipes

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Add missing and remove unsued imports

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Add ft launcher test

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

fix typo

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

fix more typos

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

add ft launcher using nemo-run for llama3 test

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

fix serialization errors

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

create seperate ft test

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

change github actions test

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

draft crash simulation

Signed-off-by: Shriya Balaji Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Simulate a crash using step, disable checkpointing

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Add a straggler detection test as well

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Revert enabling straggler_detection by default in all recipes

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Remove unused imports

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Remove extra check in ConfigValidationPlugin

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Address pylinter issues

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Improve straggler detection testing and add doc string

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

fix paths

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Add assert for crash

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

Append run logs to a file after a crash

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Set FAULT_TOL_FINISHED_FLAG_FILE and FAULT_TOL_CFG_PATH

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

Add openai-gelu in gated activation (#11293)

Fixes per comments (#11280)

* Fixes per comments

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Update README

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

---------

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

Add T5TTS (#11193)

* added training and inference recipes for T5-TTS.
* fix some attention errors
* add copyright headers.
* added TODO and detail error log info.
* fixed missing a corner case.
* added classes to __all__
* fixed to return either self-attention scores or cross-attention scores in ParallelTransformerLayer_ class.

Signed-off-by: XuesongYang <XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: blisc <blisc@users.noreply.github.com>
Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: XuesongYang <XuesongYang@users.noreply.github.com>
Co-authored-by: blisc <blisc@users.noreply.github.com>
Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: XuesongYang <XuesongYang@users.noreply.github.com>

ci: Exclude CPU machines from scan (#11300)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

Revert "fix(export): GPT models w/ bias=False convert properly (#11255)" (#11301)

This reverts commit 2d4f4953881b9e2d118d3ffeba7e64625d827d11.

remove redundant docs (#11302)

Create phi3mini.py (#11281)

* Create phi3mini.py

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

Apply isort and black reformatting

Signed-off-by: mayani-nv <mayani-nv@users.noreply.github.com>

Update __init__.py

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

Update __init__.py

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

Apply isort and black reformatting

Signed-off-by: mayani-nv <mayani-nv@users.noreply.github.com>

* Create phi3_mini_4k_instruct.py for adding to recipe

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

Apply isort and black reformatting

Signed-off-by: mayani-nv <mayani-nv@users.noreply.github.com>

Update phi3_mini_4k_instruct.py and removed Performant recipe

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

Update phi3_mini_4k_instruct.py and removing performant condition

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

Update phi3_mini_4k_instruct.py with docstring changes

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

* Update __init__.py

Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>

* fixing pylint warnings

* Apply isort and black reformatting

Signed-off-by: mayani-nv <mayani-nv@users.noreply.github.com>

* correcting typos and adding working recipe files

---------

Signed-off-by: mayani-nv <mayani-nv@users.noreply.github.com>
Signed-off-by: mayani-nv <67936769+mayani-nv@users.noreply.github.com>
Co-authored-by: Chen Cui <chcui@nvidia.com>
Co-authored-by: mayani-nv <mayani-nv@users.noreply.github.com>

Integrate lm-eval-harness for evaluations in NeMo (#10621)

* Add evaluate method and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add inference params to evaluate method

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add wait_for_rest_service fn to evaluate method

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Apply isort and black reformatting

Signed-off-by: athitten <athitten@users.noreply.github.com>

* Add logprobs to be returned by Pytriton for trtllm models

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Increase max_retries in wait_for_rest_service method

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Apply isort and black reformatting

Signed-off-by: athitten <athitten@users.noreply.github.com>

* Add unset slurm vars and use env vars for Triton args

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add logic to get logProbs from logits

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Refactor, clean and organize the code

1) Refactors the code and creates an evaluation folder where all util methods live
2) Add doctsrings, comments
3) Expose gather_context_logits, gather_generation_logits in trtllm and add output_generation_logits flag to return generation logits and remove output_logporbs as its not getting used anymore

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add copyright and initialize special_tokens_kwargs in eval_utils.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Add the following chanes

1) Move get_trtllm_deployable and unset_environment_variables to deploy base.py
2) Rename eval_utils.py to base.py
3) REstore scripts/export/convert_nemo2_for_export.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix a minor typo

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Revert output_log_probs and all_probs arg in tensorrt_llm_run.py

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix docstrings formatting

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Pylint and other minor fixes

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Fix pylint and typos

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Apply isort and black reformatting

Signed-off-by: athitten <athitten@users.noreply.github.com>

* Avoid multiple calls for tokenizer_type

Co-authored-by: Ananth Subramaniam <ananth.subramaniam@gmail.com>
Signed-off-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>

* Replace print statements with logging statements

Signed-off-by: Abhishree <abhishreetm@gmail.com>

* Apply isort and black reformatting

Signed-off-by: athitten <athitten@users.noreply.github.com>

---------

Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: athitten <athitten@users.noreply.github.com>
Signed-off-by: Abhishree Thittenamane <47577437+athitten@users.noreply.github.com>
Co-authored-by: athitten <athitten@users.noreply.github.com>
Co-authored-by: Ananth Subramaniam <ananth.subramaniam@gmail.com>

ci: Fix release workflow (#11286)

* ci: Fix release workflow

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Update .github/workflows/release.yml

Signed-off-by: oliver könig <okoenig@nvidia.com>

---------

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>

Update import 'pytorch_lightning' -> 'lightning.pytorch' (#11252)

* update import in collections/llm

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in lightning

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update fabric import in lightning

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in collections/asr

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in collections/tts

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update requirements

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* unused imports

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in tests

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in collections/common

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in core

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in utils

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in collections/nlp

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update fabric import in collections/nlp

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update fabric import in utils

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in nlp examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in asr examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in llm examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in tts examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update fabric import in nlp examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in deploy

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in slu examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in speaker_tasks examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in collections/audio

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in audio examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in collections/llm

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in collections/vlm

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in collections/diffusion

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in collections/vision

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in collections/multimodal

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in multimodal examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in vision examples

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* update import in scripts

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Update baseline

Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* revert bad change

Signed-off-by: Maanu Grover <maanug@nvidia.com>

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: maanug-nv <maanug-nv@users.noreply.github.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: maanug-nv <maanug-nv@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

fix perf plugin CUDA_DEVICE_MAX_CONNECTIONS setting (#11299)

* fix

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* Docstrings

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>

---------

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>
Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>
Co-authored-by: Jimmy Zhang <jiemingz@nvidia.com>
Co-authored-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>

PTQ via NeMo-Run CLI (#10984)

* PTQ support in nemo CLI

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Naming engine vs checkpoint

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

PTQ memory optimization (#11257)

* Initial commit

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

* Add sample generate

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

* Nemotron quantization, reduce diff

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

* Reduce diff

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* code review suggestions

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Bug fixes

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* remove not needed import

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* fix model type and allow ddp/optim setup

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

---------

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>
Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>
Signed-off-by: Piotr Kamiński <67481570+Laplasjan107@users.noreply.github.com>
Co-authored-by: Piotr Kaminski <pikaminski@nvidia.com>
Co-authored-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>
Co-authored-by: Jan Lasek <janek.lasek@gmail.com>

update README.md (#11223)

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

Add `attention_bias` argument in transformer block and transformer layer modules, addressing change in MCore (#11289)

* fix api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix ci

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add docstring

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring2

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix line too long

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

Remove pytorch-lightning (#11306)

* update import in docs

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* update import in tutorials

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* remove pl requirement

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* missed import updates

Signed-off-by: Maanu Grover <maanug@nvidia.com>

---------

Signed-off-by: Maanu Grover <maanug@nvidia.com>

Adding multimodal examples (#11279)

* Adding multimodal examples

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

---------

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <shanmugamr@shanmugamr-mlt.client.nvidia.com>
Co-authored-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

Update T5 attention-mask shapes to be compatible with all attention-backend in new TE versions (#11059)

* initial commits

* updating cicd test

* commit for FlashFused T5 from Mcore

* testing CICD

* update code for data/mock, update mcore commit for dockerfile

* fix error

* fix error

* fix error in nemo/collections/llm/inference/base.py

* update t5/data/mock.py

* fix cicd erorr

* remove unused libs

* address Yu Yao's comments

* Apply isort and black reformatting

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>

---------

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>
Co-authored-by: Huy Vu2 <huvu@login-eos02.eos.clusters.nvidia.com>
Co-authored-by: huvunvidia <huvunvidia@users.noreply.github.com>

Add HF untrusted code toggle (#11313)

* add trust_remote_code toggle

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

P2p chunk size setting in nemo 2.0 (#11312)

* NCCL P2P communication chunk size

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* NCCL P2P communication chunk size

Signed-off-by: Sangkug Lym <slym@nvidia.com>

---------

Signed-off-by: Sangkug Lym <slym@nvidia.com>

Nemo2 batcheval (#11158)

* initial draft for eval api

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add dp to generate

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* add top_k=1 to defaul inf param to get deterministic output

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* change name

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add eval ds and write to file to llm.generate

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* support standalone input jsonl

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

---------

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

DoRA (#11104)

* initial commit for DoRA

Signed-off-by: Chen Cui <chcui@nvidia.com>

* clean up code

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* clean up

Signed-off-by: Chen Cui <chcui@nvidia.com>

* fix TP

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* add dropout correction term

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* add copyright and doc strings

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* fix

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* docstrings

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* docstrings

Signed-off-by: Chen Cui <chcui@nvidia.com>

* add ci test

Signed-off-by: Chen Cui <chcui@nvidia.com>

* add ci test

Signed-off-by: Chen Cui <chcui@nvidia.com>

* typo

Signed-off-by: Chen Cui <chcui@nvidia.com>

* remove unused code

Signed-off-by: Chen Cui <chcui@nvidia.com>

* remove commented out code

Signed-off-by: Chen Cui <chcui@nvidia.com>

* fix

Signed-off-by: Chen Cui <chcui@nvidia.com>

* bug

Signed-off-by: Chen Cui <chcui@nvidia.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>

Profiling - support Chakra & Kineto trace dumping (#11115)

* Support chakra trace dumping by cfg

Signed-off-by: Lily Wang <lilyw@nvidia.com>

remove the manual recording of process::init

Signed-off-by: Lily Wang <lilyw@nvidia.com>

1. Remove unnecessary kineto config  2. Fix typo

Signed-off-by: Lily Wang <lilyw@nvidia.com>

Change warning to exception when nsys is enabled with chakra profiling

Signed-off-by: Lily Wang <lilyw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: pablo-garay <pablo-garay@users.noreply.github.com>

* fix bug in identifying profiling start step

Signed-off-by: Lily Wang <lilyw@nvidia.com>

* Update baseline

Signed-off-by: lilyw97 <lilyw97@users.noreply.github.com>

* [1]remove unused import [2]switch to use isinstance instead of type() [3]move torch.profiling to function

Signed-off-by: Lily Wang <lilyw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilyw97 <lilyw97@users.noreply.github.com>

---------

Signed-off-by: Lily Wang <lilyw@nvidia.com>
Signed-off-by: pablo-garay <pablo-garay@users.noreply.github.com>
Signed-off-by: lilyw97 <lilyw97@users.noreply.github.com>
Signed-off-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>
Co-authored-by: Lily Wang <lilyw@cw-dfw-cs-001-vscode-01.cm.cluster>
Co-authored-by: Pablo Garay <palenq@gmail.com>
Co-authored-by: pablo-garay <pablo-garay@users.noreply.github.com>
Co-authored-by: lilyw97 <lilyw97@users.noreply.github.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>

NeMo 2.0 SFT PEFT notebooks (#10874)

* nemo2-sft notebook initial draft

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* remove mixtral info

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* minor fixes

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* minor fixes

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* minor fixes

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add import_ckpt script and minor changes

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Random read for tarr files in lhotse dataloaders (#10536)

* Random read for tarr files in lhotse dataloaders

Signed-off-by: Nune <ntadevosyan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com>

* Solve failled tests

Signed-off-by: Nune <ntadevosyan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com>

* Adding a testcase

Signed-off-by: Nune <ntadevosyan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com>

* Some changs in tests

Signed-off-by: Nune <ntadevosyan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com>

* removing import

Signed-off-by: Nune <ntadevosyan@nvidia.com>

---------

Signed-off-by: Nune <ntadevosyan@nvidia.com>
Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com>
Co-authored-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com>

* training code for hybrid-autoregressive inference model (#10841)

* training code for hybrid-autoregressive inference model

Signed-off-by: Hainan Xu <hainanx@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hainan-xv <hainan-xv@users.noreply.github.com>

---------

Signed-off-by: Hainan Xu <hainanx@nvidia.com>
Signed-off-by: hainan-xv <hainan-xv@users.noreply.github.com>
Co-authored-by: Hainan Xu <hainanx@nvidia.com>
Co-authored-by: hainan-xv <hainan-xv@users.noreply.github.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 772faca ! (#10871)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>

* Use trainer.local_rank/global_rank (#10860)

* fix global_rank calculation

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use trainer's global/local rank

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove stacking operation from batched functions (#10524)

* remove stacking operations

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fixes im base class

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* remove potentially uninitialized local variable

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* restore batch_intilize states funcname

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix typo

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix potentially uninitialized local variable

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix potentially uninitialized local variable
in stateless transduser

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix docstring, rm comment

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix dosctrings

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

---------

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* [NeMo-UX] Add llm.generate to nemo.collections.llm (#10471)

* Add llm.generate

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Remove comment

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Fix launching with python

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Add assert cp

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Add example script

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* Adding support for LightningDataModule inside Fabric-API (#10879)

* Make FabricMegatronMixedPrecision match MegatronMixedPrecision

Signed-off-by: Marc Romeijn <mromeijn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>

* Supporting DataModule in fabric-API

Signed-off-by: Marc Romeijn <mromeijn@nvidia.com>

* Adding support for LightningDataModule inside Fabric-API

Signed-off-by: Marc Romeijn <mromeijn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>

* Remove import in mock.py

Signed-off-by: Marc Romeijn <mromeijn@nvidia.com>

---------

Signed-off-by: Marc Romeijn <mromeijn@nvidia.com>
Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>
Co-authored-by: marcromeyn <marcromeyn@users.noreply.github.com>

* initial draft

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial local run

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial local run

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial local run

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial local run

Signed-off-by: smajumdar <titu1994@gmail.com>

* Save yaml config for model in nemo.lightning.io (#10765)

* Save yaml config for model in nemo.lightning.io

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Fix bug

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Fix bug

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix bug

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Add explicit yaml comparison

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* relax test

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* Move collectiob.nlp imports inline for t5 (#10877)

* Move collectiob.nlp imports inline for t5

Signed-off-by: Marc Romeyn <mromeijn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>

---------

Signed-off-by: Marc Romeyn <mromeijn@nvidia.com>
Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>
Co-authored-by: marcromeyn <marcromeyn@users.noreply.github.com>

* add world_size/pp_size runtime check (#10842)

* add world_size/pp_size runtime check

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix msg precision

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix test_init_parallel_ranks ws=3 pp=3

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix peft resume (#10887)

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Update engine build step for TRT-LLM 0.13.0 (#10880)

* Setting use_fused_mlp for TRT-LLM >= 0.13.0

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Unused import removal

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Akoumparouli/nemo ux moe loss logging (#10128)

* Move across pipeline loss reduction to a separate function

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add support for MoE loss logging

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove unused function

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* enable vboost and set LM SM margin (#10853)

* enable vboost

Signed-off-by: Malay Nagda <malayn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: malay-nagda <malay-nagda@users.noreply.github.com>

* env vars

Signed-off-by: Malay Nagda <malayn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: malay-nagda <malay-nagda@users.noreply.github.com>

* add perf plugin

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>

* revert default executor

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>

* fix typo

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* fix more typo

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* ln margin knob

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>

* specify lm margin

Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>

---------

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Signed-off-by: malay-nagda <malay-nagda@users.noreply.github.com>
Signed-off-by: malay-nagda <164242706+malay-nagda@users.noreply.github.com>
Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>
Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>
Co-authored-by: malay-nagda <malay-nagda@users.noreply.github.com>
Co-authored-by: Jimmy Zhang <jiemingz@nvidia.com>
Co-authored-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com>

* use _get_extra_te_kwargs_meta in fabric (call mcore's _get_extra_te_k… (#10608)

* use _get_extra_te_kwargs_meta in fabric (call mcore's _get_extra_te_kwargs & overwrite device)

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Use torch sdpa implementation in ASR mha (#9590)

* use pytorch sdpa

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* sdpa work

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: titu1994 <titu1994@users.noreply.github.com>

* sdpa flag to false & sdpa_backend arg

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>

* change arg name

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>

* fix config args

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>

* add condition on version

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>

* update condition on version

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* remove condition on torch version

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>

* move code to init

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>

* refactor

Signed-off-by: WoodieDudy <goshagks@gmail.com>

* Apply isort and black reformatting

Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>

* refactor

Signed-off-by: WoodieDudy <goshagks@gmail.com>

---------

Signed-off-by: WoodieDudy <goshagks@gmail.com>
Signed-off-by: titu1994 <titu1994@users.noreply.github.com>
Signed-off-by: WoodieDudy <WoodieDudy@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: titu1994 <titu1994@users.noreply.github.com>
Co-authored-by: WoodieDudy <WoodieDudy@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>

* Add registry to register all needed classes with artifacts in nemo.lightning.io (#10861)

* Add registry to register all needed classes with artifacts in nemo.lightning.io

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Fixes

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* comments

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* Remove cyclic import

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

* call __post_init__ after altering config values (#10885)

* call __post_init__ after altering config values

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* test fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* turn off SP

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Nemo 2.0 ckpt support in TRT-LLM export (#10891)

* fix minor import bug

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Add registry to register all needed classes with artifacts in nemo.lightning.io

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Fixes

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* nemo 2.0 support in export to trt-llm

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* get mixing from main

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* fix style

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* [Docs] Fix doc warnings, focus on feature and multimodal sections (#10171)

* various simple docs source fixes

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix docstrings and typing with forward reference

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: erastorgueva-nv <erastorgueva-nv@users.noreply.github.com>

* fix typing forward reference for PromptedAudioToTextLhotseDataset

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* fix feature warnings

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Try fix some model part errors

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* try add requirements

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* try add requirements

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix indent in docstring

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* update

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* handle duplicate issue

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* handle duplicate issue

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix imagen cite

* fix ratio issues

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix Dreambooth

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Fix activation recomputation

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix sequence packing

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix asr_language_modeling_and_customization

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fixes wip

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: erastorgueva-nv <erastorgueva-nv@users.noreply.github.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Signed-off-by: Huiying Li <willwin.lee@gmail.com>
Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: erastorgueva-nv <erastorgueva-nv@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: Ao Tang <aot@nvidia.com>
Co-authored-by: Huiying Li <willwin.lee@gmail.com>

* calculate step time batch end-batch end (#10202)

* log step time at end

Signed-off-by: Malay Nagda <malayn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: malay-nagda <malay-nagda@users.noreply.github.com>

* use nemo logging

Signed-off-by: Malay Nagda <malayn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: malay-nagda <malay-nagda@users.noreply.github.com>

* cleanup

Signed-off-by: Malay Nagda <malayn@nvidia.com>

* check remove

Signed-off-by: Malay Nagda <malayn@nvidia.com>

* delta timing callback

Signed-off-by: Malay Nagda <malayn@nvidia.com>

* comment and name change

Signed-off-by: Malay Nagda <malayn@nvidia.com>

---------

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Signed-off-by: malay-nagda <malay-nagda@users.noreply.github.com>
Co-authored-by: malay-nagda <malay-nagda@users.noreply.github.com>

* late import prettytable (#10912)

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 0d89fc4 ! (#10919)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Warning for missing FP8 checkpoint support for vLLM deployment (#10906)

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Add lhotse fixes for rnnt model training and WER hanging issue with f… (#10821)

* Add lhotse fixes for rnnt model training and WER hanging issue with f… (#10787)

* Add lhotse fixes for rnnt model training and WER hanging issue with fuse batching

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

* Fix ASR tests (#10794)

* Make tests required

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Debug torch.load issue

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* Run only necessary tests

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Try fix loading

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* Avoid caching fixture

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Try restore model several times

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Try customize temporary directory

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* Reorder tests

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Disable one test

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* Avoid xxlarge model

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Disable test

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Revert changes

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* Magic fix

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Revert unnecessary changes

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Clean up

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Disable all jobs except L0

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* RNNT alignments - merge with unit tests

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Fix CUDA graph frame-looping decoder to handle non-CUDA inputs

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Fix config

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Log test results

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

* Use less audio files for tests

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

* Integrating mcore export (#10238)

* Integrating mcore export

* Integrating mcore export

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* Move trt imports in nemo.collections.llm inside respective functions (#10234)

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Add tests for LazyNeMoIterator and fix case with metadata_only=True and offsets in manifest (#10198)

* Add tests for LazyNeMoIterator and fix case with manifest_only=True and offsets in manifest

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address code review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix tests

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix tests

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* [NeMo-UX] Fix a serialization bug that prevents users from moving checkpoints (#9939)

* perfor serialization using relative paths to allow users to move checkpoints after they're saved

Signed-off-by: ashors1 <ashors@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: ashors1 <ashors1@users.noreply.github.com>

* remove unused import

Signed-off-by: ashors1 <ashors@nvidia.com>

* fix artifact load

Signed-off-by: ashors1 <ashors@nvidia.com>

* fix path artifact

Signed-off-by: ashors1 <ashors@nvidia.com>

* remove unused import

Signed-off-by: ashors1 <ashors@nvidia.com>

---------

Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: ashors1 <ashors1@users.noreply.github.com>
Co-authored-by: ashors1 <ashors1@users.noreply.github.com>

* Add MemoryProfileCallback (#10166)

* Add MemoryProfileCallback

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

* Remove reference cycles, save snapshot on specific ranks

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

* Remove unnecessary imports

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

* Update docstring

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>

---------

Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>
Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
Signed-off-by: Shriya Rishab <69161273+ShriyaPalsamudram@users.noreply.github.com>
Co-authored-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>

* Lower bound transformers to support nemotron (#10240)

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: Dong Hyuk Chang <donghyukc@nvidia.com>

* [Audio] SSL Pretraining framework for flow-matching model for audio processing (#10052)

Flow matching generative model with SSL pretraining framework

Signed-off-by: Pin-Jui Ku <pku@nvidia.com>
Co-authored-by: Kuray107 <Kuray107@users.noreply.github.com>

* Revert torchrun fix for model import (#10251)

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* [NeMo-UX[ Move nemotron imports inline (#10255)

* Move nemotron transformers + tokenizer imports inline to reduce number of required deps

Signed-off-by: Marc Romeyn <mromeijn@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>

---------

Signed-off-by: Marc Romeyn <mromeijn@nvidia.com>
Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>
Co-authored-by: marcromeyn <marcromeyn@users.noreply.github.com>

* Wrap CPU model init with megatron_lazy_init_context (#10219)

* Wrap CPU model init with megatron_lazy_init_context

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Cleanup checkpoint-dir if saving fails

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Bump `Dockerfile.ci` (2024-08-22) (#10227)

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 124bcff !

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix bert flags

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

---------

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>

* salm export trtllm (#10245)

Signed-off-by: slyne deng <slyned@nvidia.com>
Co-authored-by: slyne deng <slyned@nvidia.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to ef85bc9 ! (#10250)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 01ca03f ! (#10266)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>

* Load model in the target export precision by default in PTQ (#10267)

* Load model in the target export precision by default

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Enable megatron_amp_O2=true to actually use half-precision

Signed-off-by: Jan Lasek <jlasek@nvidia.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: Jan Lasek <jlasek@nvidia.com>

* Add WandbPlugin, NsysPlugin and PreemptionPlugin to nemo.lightning.run.plugins (#10223)

* Add WandbPlugin, NsysPlugin and PreemptionPlugin to nemo.lightning.run.plugins

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Remove duplicate

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Add entity to wandb logger

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Add documentation

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Add warning

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* Add comments

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* [NeMo-UX] Handle absolute logger directories in nemo_logger (#10259)

* handle absolute and relative logger directories

Signed-off-by: Anna Shors <ashors@nvidia.com>

* merge lines

Signed-off-by: ashors1 <ashors@nvidia.com>

---------

Signed-off-by: Anna Shors <ashors@nvidia.com>
Signed-off-by: ashors1 <ashors@nvidia.com>

* Add sdxl notebook (#10139)

* Add sdxl notebook

Signed-off-by: mingyuanm <mingyuanm@nvidia.com>

* Rename

Signed-off-by: mingyuanm <mingyuanm@nvidia.com>

* final Update SDXL notebook

Signed-off-by: mingyuanm <mingyuanm@nvidia.com>

---------

Signed-off-by: mingyuanm <mingyuanm@nvidia.com>

* Updating some coments

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* Updating some coments

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* Updating some coments

* Small change

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* ADD support for layernorm1p

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* Update Dockerfile.ci

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update Dockerfile.ci

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

* Update Dockerfile.ci

Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>

---------

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: ashors1 <ashors1@users.noreply.github.com>
Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>
Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
Signed-off-by: Shriya Rishab <69161273+ShriyaPalsamudram@users.noreply.github.com>
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Signed-off-by: Pin-Jui Ku <pku@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Marc Romeyn <mromeijn@nvidia.com>
Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: slyne deng <slyned@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: Jan Lasek <jlasek@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Signed-off-by: Anna Shors <ashors@nvidia.com>
Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <shanmugamr@login-eos01.eos.clusters.nvidia.com>
Co-authored-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Anna Shors <71393111+ashors1@users.noreply.github.com>
Co-authored-by: ashors1 <ashors1@users.noreply.github.com>
Co-authored-by: Shriya Rishab <69161273+ShriyaPalsamudram@users.noreply.github.com>
Co-authored-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
Co-authored-by: Dong Hyuk Chang <thomaschang26@tutanota.com>
Co-authored-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: Kuray107 <pku9@gatech.edu>
Co-authored-by: Kuray107 <Kuray107@users.noreply.github.com>
Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: Marc Romeyn <mromeijn@nvidia.com>
Co-authored-by: marcromeyn <marcromeyn@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
Co-authored-by: Slyne Deng <slynedeng@gmail.com>
Co-authored-by: slyne deng <slyned@nvidia.com>
Co-authored-by: Jan Lasek <janek.lasek@gmail.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: Ming <111467530+Victor49152@users.noreply.github.com>
Co-authored-by: Shanmugam Ramasamy <shanmugamr@shanmugamr-mlt.client.nvidia.com>

* Fix artifact saving (#10914)

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Lora improvement (#10918)

* pull out freeze model

Signed-off-by: Chen Cui <chcui@nvidia.com>

* add wildcard match to lora target modules

Signed-off-by: Chen Cui <chcui@nvidia.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Huvu/t5 nemo2.0 peft (#10916)

* adding peft test and cicd

* add setting mcore model to train in peft.py

* adding test for T5 lora

* fix follow Chen's fix

* restore cicd-main.yml

---------

Co-authored-by: Huy Vu2 <huvu@login-eos02.eos.clusters.nvidia.com>

* Add tie_word_embeddings=True (#10710)

Signed-off-by: Yoshi Suhara <ysuhara@nvidia.com>

* Use a context-manager when opening files (#10895)

* Use a context-manager when opening files

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

* long context performance numbers in doc (#10784)

* long context perf

Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* update the long context perf

Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* Akoumparouli/mcore microbatch calculator fix (#10780)

* move tests/lightning/{,_}io

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add microbatch calculator context manager

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use microbatch calculator context manager

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add on_load_checkpoint test to ValidateModelRestoration; use ctx manager to reconfigure microbatch calculator; update save/restore path; add cleanup step at the end

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove unused var

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* remove 8x3b recipes (#10764)

* remove 8x3b recipes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove 8x3b from test_nemo_run

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm from __init__

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* change the figure file name

Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* Accommodating the reviewer's comment

Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* update the y-axis title

Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 3f90b98 ! (#10789)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

* Add ModelOpt transformer model pruning example for Llama models, default to llama3.1-8b-base (#10294)

* Add ModelOpt transformer model pruning example for Llama3 model

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: shengliangxu <shengliangxu@users.noreply.github.com>
Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

* examples code is at wrong dir, move them

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

* changes as suggested in comment

remove some logging and unused config code, update example model to
llama3.1

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

* Add pruning of hidden_size into example

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: shengliangxu <shengliangxu@users.noreply.github.com>
Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

* Update examples/nlp/language_modeling/conf/megatron_gpt_prune.yaml

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Add pruning test to cicd-main.yml

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Update cicd-main.yml

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Update cicd-main.yml

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Update cicd-main.yml

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Update cicd-main.yml

Signed-off-by: Keval Morabia <2891698…
XuesongYang pushed a commit to paarthneekhara/NeMo that referenced this pull request Jan 18, 2025
XuesongYang pushed a commit to paarthneekhara/NeMo that referenced this pull request Jan 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants