fix(export): update API for disabling device reassignment in TRTLLM for Aligner #10863

terrykong · 2024-10-11T23:45:11Z

Also clean up some unused imports
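
For context, "disabling device reassignment" is read here as keeping a runtime from changing which CUDA device the calling process is already bound to. The sketch below is only an illustration of that idea under stated assumptions: it is not the code from this PR and it does not call TensorRT-LLM, NeMo, or torch. `fake_cuda`, `device_reassignment_disabled`, and `load_engine` are hypothetical names, and the stand-in module exists only so the example runs without a GPU.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Hypothetical stand-in for a CUDA runtime module (NOT torch.cuda).
fake_cuda = SimpleNamespace(current=0)


def _set_device(idx):
    """Record the device index the process is bound to."""
    fake_cuda.current = idx


fake_cuda.set_device = _set_device


@contextmanager
def device_reassignment_disabled(cuda_module):
    """Temporarily turn ``cuda_module.set_device`` into a no-op."""
    original = cuda_module.set_device
    cuda_module.set_device = lambda idx: None
    try:
        yield
    finally:
        # Always restore the real function, even if loading fails.
        cuda_module.set_device = original


def load_engine(rank, cuda_module):
    """Hypothetical loader that would normally pin the process to ``rank``."""
    cuda_module.set_device(rank)
    return f"engine-for-rank-{rank}"


# The caller (e.g. a training framework) has already chosen device 3.
fake_cuda.set_device(3)
with device_reassignment_disabled(fake_cuda):
    engine = load_engine(rank=0, cuda_module=fake_cuda)
```

Inside the `with` block the loader's attempt to switch devices is ignored, so the caller's device choice survives; outside the block `set_device` behaves normally again.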

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

…or Aligner

[feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime
Signed-off-by: Terry Kong <terryk@nvidia.com>

fix: forgot to always set _disable_torch_cuda_device_set
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>

Apply isort and black reformatting
Signed-off-by: terrykong <terrykong@users.noreply.github.com>

invert torch device set
Signed-off-by: Terry Kong <terryk@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>

github-actions · 2024-11-12T18:01:12Z

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.

Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.export.trt_llm.tensorrt_llm_run
nemo/export/trt_llm/tensorrt_llm_run.py:506:0: C0301: Line too long (125/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:510:0: C0301: Line too long (136/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:514:0: C0301: Line too long (123/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:557:0: C0301: Line too long (181/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:839:0: C0301: Line too long (153/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:524:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/export/trt_llm/tensorrt_llm_run.py:533:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/export/trt_llm/tensorrt_llm_run.py:591:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/export/trt_llm/tensorrt_llm_run.py:33:0: W0611: Unused Mapping imported from tensorrt_llm.mapping (unused-import)

-----------------------------------
Your code has been rated at 9.75/10

Thank you for improving NeMo's documentation!
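
For readers unfamiliar with the three PyLint message types in the report above, here is a minimal sketch of what each one asks for. It is hypothetical illustration code, not taken from `tensorrt_llm_run.py`; the function names are made up, and the only value borrowed from the report is the 119-character limit shown in annotations such as "(125/119)".

```python
"""Hypothetical module illustrating C0301, C0116 and W0611."""

# W0611 (unused-import): import only the names the module really uses.
from typing import List

MAX_LINE_LENGTH = 119  # limit taken from the "(125/119)" annotations


def build_warning(name: str, value: int, limit: int = MAX_LINE_LENGTH) -> str:
    """Return a short message; having this docstring satisfies C0116."""
    # C0301 (line-too-long): split a long expression across several lines.
    return (
        f"{name} has length {value}, "
        f"which exceeds the configured limit of {limit}"
    )


def too_long(lines: List[str], limit: int = MAX_LINE_LENGTH) -> List[int]:
    """Return the 1-based numbers of lines longer than ``limit`` characters."""
    return [n for n, line in enumerate(lines, start=1) if len(line) > limit]
```

Calling `too_long` on a file's lines gives a rough local check for the first category before the CI bot reports it.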

github-actions · 2024-11-12T21:29:52Z

[🤖]: Hi @terrykong 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

I'm just a bot, so I'll leave it to you to decide what to do next.

//cc @pablo-garay @ko3n1g

* Timestamps to transcribe (#10950) * inital version Signed-off-by: Nithin Rao Koluguri <nithinraok> * Support for RNNT, TDT, Hybrid Models Signed-off-by: Nithin Rao Koluguri <nithinraok> * move change of decoder stratery from mixin to individual model class Signed-off-by: Nithin Rao Koluguri <nithinraok> * Apply isort and black reformatting Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> * update transcribe_speech.py Signed-off-by: Nithin Rao Koluguri <nithinraok> * uncomment Signed-off-by: Nithin Rao Koluguri <nithinraok> * Apply isort and black reformatting Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> * add docs Signed-off-by: Nithin Rao Koluguri <nithinraok> * fix docs Signed-off-by: Nithin Rao Koluguri <nithinraok> * Apply isort and black reformatting Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> * codeql fixes Signed-off-by: Nithin Rao Koluguri <nithinraok> * unit tests Signed-off-by: Nithin Rao Koluguri <nithinraok> * minor rebase fix Signed-off-by: Nithin Rao Koluguri <nithinraok> * Apply isort and black reformatting Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> * add None case to restore the state set outside using decoding_stratergy() Signed-off-by: Nithin Rao Koluguri <nithinraok> * Apply isort and black reformatting Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> * remove ipdb traces Signed-off-by: Nithin Rao Koluguri <nithinraok> * updates doc for transcription.py Signed-off-by: Nithin Rao Koluguri <nithinraok> * remove preserve alignment for AED models as it doesn;t support it Signed-off-by: Nithin Rao Koluguri <nithinraok> * lint warnings Signed-off-by: Nithin Rao Koluguri <nithinraok> * Apply isort and black reformatting Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> Co-authored-by: Nithin Rao Koluguri <nithinraok> 
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com> * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 1b8fce7 ! (#11247) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 47ff44e ! (#11254) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Handling tokenizer in PTQ for Nemo 2.0 (#11237) * Handling tokenizer in PTQ for Nemo 2.0 Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Print log msg and enable overriding Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Warning for legacy tokenizer config Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Save HF tokenizer to make tokenizer_config.yaml (almost) redundant Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Handle tokenizer in a unified way Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Move saving context within export Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Fix typo in get_tokenzier Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Reduce diff Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Drop unused import Signed-off-by: Jan Lasek <janek.lasek@gmail.com> --------- Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Fix finetuning datamodule resume (#11187) * fix datamodule resume Signed-off-by: Chen Cui <chcui@nvidia.com> * Apply isort and black reformatting Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> * fix subclass Signed-off-by: Chen Cui <chcui@nvidia.com> * docstrings and formats Signed-off-by: Chen Cui <chcui@nvidia.com> * Apply isort and black reformatting Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> --------- Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> Co-authored-by: cuichenx <cuichenx@users.noreply.github.com> * ci: Move `bump mcore` to templates (#11229) * ci: Move `bump mcore` to templates Signed-off-by: Oliver Koenig <okoenig@nvidia.com> * fix 
Signed-off-by: Oliver Koenig <okoenig@nvidia.com> * fix Signed-off-by: Oliver Koenig <okoenig@nvidia.com> * fix Signed-off-by: Oliver Koenig <okoenig@nvidia.com> * final Signed-off-by: Oliver Koenig <okoenig@nvidia.com> --------- Signed-off-by: Oliver Koenig <okoenig@nvidia.com> * fix: Update baseline (#11205) Signed-off-by: Oliver Koenig <okoenig@nvidia.com> * Remove deprecated builder_opt param from build command (#11259) Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * chore(beep boop 🤖): Bump `MCORE_TAG=aded519...` (2024-11-12) (#11260) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * [Doc fixes] update file names, installation instructions, bad links (#11045) * rename eval_beamsearch_ngram.py to eval_beamsearch_ngram_ctc.py in docs Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com> * replace out of date installation instructions with pointer to NeMo README installation section Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com> * point to user guide instead of readme Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com> * some link updates Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com> * update more links Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com> --------- Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com> Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com> * fix(export): GPT models w/ bias=False convert properly (#11255) Signed-off-by: Terry Kong <terryk@nvidia.com> * ci: Run secrets detector on `pull_request_target` (#11263) Signed-off-by: Oliver Koenig <okoenig@nvidia.com> * fix(export): update API for disabling device reassignment in TRTLLM for Aligner (#10863) * fix(export): update API for disabling device reassignment in TRTLLM for Aligner [feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime Signed-off-by: Terry Kong <terryk@nvidia.com> fix: forgot to always set _disable_torch_cuda_device_set 
Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Apply isort and black reformatting Signed-off-by: terrykong <terrykong@users.noreply.github.com> invert torch device set Signed-off-by: Terry Kong <terryk@nvidia.com> * remove comment Signed-off-by: Terry Kong <terryk@nvidia.com> --------- Signed-off-by: Terry Kong <terryk@nvidia.com> * new vfm training features (#11246) Signed-off-by: Zeeshan Patel <zeeshanp@nvidia.com> Co-authored-by: Zeeshan Patel <zeeshanp@nvidia.com> * Update pruning and distillation tutorial notebooks (#11091) * Update pruning and distillation tutorial notebooks Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com> * Update README Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com> * Update batch size in width pruning script Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com> * Update README Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com> --------- Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com> * Beam search algorithm implementation for TDT models (#10903) * initial commit Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add: default beam search implementation Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: changed to removing duplicate hypothesis in separate function Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: changed to cartesian product in choosing best hyp Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: minor fixes in comments Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add: maes decoding strategy Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add: durations filtering in maes, lm fusion in progress Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: refactored, added comments, command line args, finalized Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: removed prints Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * 
add: docs Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fix: minor fix Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: rm beam_size=1 exception, rm duplicates check, fix error handling Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: error handling Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fix: removed evaluations file Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rn: blank scoring Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * clean up Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rm: blank scoring and duration beam size Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fix: removed durations_beam_size from default beam search Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add: logaddexp Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rm: prefix search Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rn: nested loop over extensions Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: bug with caching Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rm: topk on durations Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add: restored prefix search Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * clean up Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix: fixed comments Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * refactored duplicate merging Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * changes batch scoring Signed-off-by: lilithgrigoryan 
<lgrigoryan@nvidia.com> * refactored rnnt batch scoring Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * alsd first working Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * refactored Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * clean up Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * remove stacking operations Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fixes im base class Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * clean up Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * remove potentially uninitialized local variable Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * default beam search minor fixes Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add test, fix maes timesteps Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rm file Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rm file Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * clean up Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * clean up Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix comments Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add ngram lm test Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fix maes_num_steps=1 Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix kenlm model path Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix kenlm model full path Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * made requested changes Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * 
merge after isort Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add prints to test Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * add Kenlm to asr requirements Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * remove prints in tests Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add kenlm to test requirements Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm kenlm from link, add package-name Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm second kenlm installation Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * rm kenlm from dependencies make test optional Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fix in test Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix in test Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fix comments Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * add comments Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * add comments Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * splitted docstrings Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * add comments 
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * splitted docstrings Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * add comments Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fixes to python3 type annotations Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * merging Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * merging Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix in return type Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * fix test Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * Apply isort and black reformatting Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> * rm time_idx Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> * fix comments to python3 style Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> --------- Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> Co-authored-by: lilithgrigoryan <lgrigoryan@nvidia.com> Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update nemo1->2 conversion according to changes in main (#11253) * update nemo1->2 conversion according to changes in main Signed-off-by: Huiying Li <willwin.lee@gmail.com> * Apply isort and black reformatting Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com> * format fix Signed-off-by: Huiying Li <willwin.lee@gmail.com> * add docstrings 
Signed-off-by: Huiying Li <willwin.lee@gmail.com> --------- Signed-off-by: Huiying Li <willwin.lee@gmail.com> Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com> Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com> * Add llama 3.1 recipes (#11273) * add llama 3.1 recipes Signed-off-by: Chen Cui <chcui@nvidia.com> * Apply isort and black reformatting Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> * fix pylint Signed-off-by: Chen Cui <chcui@nvidia.com> * Fix llama3.1 wrong config in io.json --------- Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> Co-authored-by: cuichenx <cuichenx@users.noreply.github.com> Co-authored-by: Ao Tang <aot@nvidia.com> * Fix Finetune Recipe (#11267) * Fix Starcoder_15 SFT recipe * Fix PP type SFT recipe * Fix PP type SFT recipe * Fix Gemma2b SFT TP=1 * Fix more sft recipe * Fix more sft recipe * Fix more sft recipe * Fix more sft recipe * Fix more sft recipe * Fix more sft recipe * Fix more sft recipe * Fix more sft recipe * Fix more sft recipe * remove pp dtype * remove pp dtype * Configure no restart validation loop in nl.Trainer (#11029) * Configure no restart validation loop in nl.Trainer Signed-off-by: Hemil Desai <hemild@nvidia.com> * fix Signed-off-by: Hemil Desai <hemild@nvidia.com> * Skip validation whenever restarting=True Signed-off-by: Hemil Desai <hemild@nvidia.com> * PR feedback Signed-off-by: Hemil Desai <hemild@nvidia.com> * Apply isort and black reformatting Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> --------- Signed-off-by: Hemil Desai <hemild@nvidia.com> Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> Co-authored-by: hemildesai <hemildesai@users.noreply.github.com> * Handle _io_unflatten_object when _thread_local.output_dir is not available (#11199) Signed-off-by: Hemil Desai <hemild@nvidia.com> * change default ckpt name (#11277) Signed-off-by: Maanu Grover <maanug@nvidia.com> * Use 
MegatronDataSampler in HfDatasetDataModule (#11274) * Use MegatronDataSampler in HfDataset Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> --------- Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> Co-authored-by: akoumpa <akoumpa@users.noreply.github.com> * Remove opencc upperbound (#10909) Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: nithinraok <nithinraok@users.noreply.github.com> Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Jan Lasek <janek.lasek@gmail.com> Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> Signed-off-by: Oliver Koenig <okoenig@nvidia.com> Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com> Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Zeeshan Patel <zeeshanp@nvidia.com> Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com> Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com> Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> Signed-off-by: Huiying Li <willwin.lee@gmail.com> Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com> Signed-off-by: Hemil Desai <hemild@nvidia.com> Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> Signed-off-by: Maanu Grover <maanug@nvidia.com> Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> Co-authored-by: nithinraok <nithinraok@users.noreply.github.com> Co-authored-by: oliver könig 
<okoenig@nvidia.com> Co-authored-by: Jan Lasek <janek.lasek@gmail.com> Co-authored-by: Chen Cui <chcui@nvidia.com> Co-authored-by: cuichenx <cuichenx@users.noreply.github.com> Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com> Co-authored-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Zeeshan Patel <zeeshanp@nvidia.com> Co-authored-by: gvenkatakris <gvenkatakris@nvidia.com> Co-authored-by: lilithgrigoryan <38436437+lilithgrigoryan@users.noreply.github.com> Co-authored-by: lilithgrigoryan <lgrigoryan@nvidia.com> Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Huiying <willwin.lee@gmail.com> Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com> Co-authored-by: Ao Tang <aot@nvidia.com> Co-authored-by: Hemil Desai <hemild@nvidia.com> Co-authored-by: hemildesai <hemildesai@users.noreply.github.com> Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com> Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com> Co-authored-by: akoumpa <akoumpa@users.noreply.github.com> Co-authored-by: Dong Hyuk Chang <thomaschang26@tutanota.com>

…or Aligner (NVIDIA#10863)

* fix(export): update API for disabling device reassignment in TRTLLM for Aligner

[feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime
Signed-off-by: Terry Kong <terryk@nvidia.com>

fix: forgot to always set _disable_torch_cuda_device_set
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>

Apply isort and black reformatting
Signed-off-by: terrykong <terrykong@users.noreply.github.com>

invert torch device set
Signed-off-by: Terry Kong <terryk@nvidia.com>

* remove comment
Signed-off-by: Terry Kong <terryk@nvidia.com>

---------

Signed-off-by: Terry Kong <terryk@nvidia.com>

Squashed commit of the following:

commit 57ef506
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Thu Nov 28 13:27:04 2024 -0800

    Fully remove hack that was adding "</s>" to `end_strings`

commit 6076b60
Author: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>
Date: Thu Nov 28 00:02:43 2024 -0600

    change dist ckpt to zarr
    Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>

commit 33564d4
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date: Tue Nov 26 19:53:04 2024 -0800

    remove eos hack given the fix in 4b71c0f
    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit 9387c74
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Tue Nov 26 17:37:45 2024 -0500

    Fix for when `ids_to_tokens()` is unable to return a valid token

commit c23db69
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Tue Nov 26 16:30:47 2024 -0500

    Simplify implementation of `token_to_id()`

commit ab699a5
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Tue Nov 26 13:28:15 2024 -0500

    Fix `ids_to_tokens()` to handle tokens associated to multiple token IDs

commit 52ec872
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Tue Nov 26 11:45:43 2024 -0500

    Ensure `tokens_to_text()` is consistent with `ids_to_text()`

commit 4b71c0f
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Tue Nov 26 11:39:20 2024 -0500

    Skip BOS/EOS tokens in `ids_to_text()` by default

    This is because those tokens are typically added in the code (e.g. for padding purpose) and we do not want them to be part of the response.

commit a77dc9f
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date: Tue Nov 26 12:33:35 2024 -0600

    Use decode_with_offsets

commit 413e736
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date: Tue Nov 26 11:38:38 2024 -0600

    Fixed tokenization of special characters.

commit 30cef20
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date: Tue Nov 26 10:33:07 2024 -0600

    Simplified the text_to_tokens method

commit d07a17c
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date: Tue Nov 26 10:15:49 2024 -0600

    Attempt to fix the nemotron5 tokenizer

commit cee062f
Author: Gerald Shen <geshen@nvidia.com>
Date: Fri Nov 22 18:15:55 2024 -0800

    only save untarred nemo files
    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 23923fe
Author: Gerald Shen <geshen@nvidia.com>
Date: Fri Nov 22 13:41:28 2024 -0800

    add checkpoint fix
    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 61f999a
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Fri Nov 22 15:44:04 2024 -0500

    Slightly reduce sleep time when batching queries

    This can give a small speedup for free, since usually batched queries all come in within <0.5s

commit 17e148c
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Fri Nov 22 09:54:50 2024 -0500

    Avoid potential race conditions with batching

    In theory, with the previous implementation it would have been possible for a thread to re-use the output from a previous batch, if it happened to grab the lock before the thread with queryid == 0.

commit 65f0a3b
Author: Haifeng Qian <haifengq@cw-dfw-cs-001-login-02.cm.cluster>
Date: Fri Nov 22 08:56:58 2024 -0800

    enforce tokens_to_generate as max number of generated tokens for each sequence in a batch

commit c9b6c60
Author: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>
Date: Fri Nov 22 10:06:32 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>

commit 287ab7f
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date: Fri Nov 22 02:01:44 2024 -0800

    hack to remove trailing </s>
    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit b912e92
Author: haifengqian <haifengqian@users.noreply.github.com>
Date: Thu Nov 21 22:19:18 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: haifengqian <haifengqian@users.noreply.github.com>

commit 9853c30
Author: Haifeng Qian <haifengq@cw-dfw-cs-001-login-02.cm.cluster>
Date: Thu Nov 21 14:17:47 2024 -0800

    add batching support in inference server

commit 551bf41
Author: arendu <arendu@users.noreply.github.com>
Date: Wed Nov 20 22:16:10 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 9581135
Merge: daf406b df9374f
Author: adithyare <adithyare@nvidia.com>
Date: Wed Nov 20 14:14:58 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit daf406b
Author: adithyare <adithyare@nvidia.com>
Date: Wed Nov 20 14:14:32 2024 -0800

    removed logs and debugging code
    Signed-off-by: adithyare <adithyare@nvidia.com>

commit df9374f
Author: Terry Kong <terryk@nvidia.com>
Date: Tue Nov 12 13:29:56 2024 -0800

    fix(export): update API for disabling device reassignment in TRTLLM for Aligner (#10863)

    * fix(export): update API for disabling device reassignment in TRTLLM for Aligner

    [feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime
    Signed-off-by: Terry Kong <terryk@nvidia.com>

    fix: forgot to always set _disable_torch_cuda_device_set
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>

    Apply isort and black reformatting
    Signed-off-by: terrykong <terrykong@users.noreply.github.com>

    invert torch device set
    Signed-off-by: Terry Kong <terryk@nvidia.com>

    * remove comment
    Signed-off-by: Terry Kong <terryk@nvidia.com>

    ---------

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit a923f76
Author: Gerald Shen <geshen@nvidia.com>
Date: Wed Nov 20 13:19:43 2024 -0800

    TRT-LLM FIX FOR NEMOTRON5, THIS BREAKS TRT FOR ALL OTHER MODELS
    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 2b44faf
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 23:35:21 2024 +0000

    loop once in server mode
    Signed-off-by: arendu <adithya.r@gmail.com>

commit 744839c
Merge: 0278a01 0a63807
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 23:32:01 2024 +0000

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit 0278a01
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 23:31:48 2024 +0000

    time generate method
    Signed-off-by: arendu <adithya.r@gmail.com>

commit 0a63807
Author: arendu <arendu@users.noreply.github.com>
Date: Tue Nov 19 23:03:34 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 6aa111f
Merge: 3958925 aee8a89
Author: adithyare <adithyare@nvidia.com>
Date: Tue Nov 19 15:02:30 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit 3958925
Author: adithyare <adithyare@nvidia.com>
Date: Tue Nov 19 15:02:07 2024 -0800

    added import
    Signed-off-by: adithyare <adithyare@nvidia.com>

commit aee8a89
Author: arendu <arendu@users.noreply.github.com>
Date: Tue Nov 19 22:54:44 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit ca902fd
Author: adithyare <adithyare@nvidia.com>
Date: Tue Nov 19 14:52:27 2024 -0800

    debug eval script times
    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 15fdf8a
Author: arendu <arendu@users.noreply.github.com>
Date: Tue Nov 19 22:31:26 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit d27c4a5
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 22:30:35 2024 +0000

    debug
    Signed-off-by: arendu <adithya.r@gmail.com>

commit 2db23a7
Author: arendu <arendu@users.noreply.github.com>
Date: Tue Nov 19 21:21:03 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 405889e
Author: adithyare <adithyare@nvidia.com>
Date: Tue Nov 19 13:19:49 2024 -0800

    removed logs in server, added a single timer
    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 7730d5f
Merge: e504655 23812c3
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 16:56:36 2024 +0000

    remove logs resolve conflicts
    Signed-off-by: arendu <adithya.r@gmail.com>

commit e504655
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 16:53:55 2024 +0000

    removed timing/debug logs
    Signed-off-by: arendu <adithya.r@gmail.com>

commit 23812c3
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date: Tue Nov 19 07:30:34 2024 -0800

    remove end_strings and end_of_turn
    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit 3ab0d2c
Author: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>
Date: Tue Nov 19 15:15:01 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>

commit c4c7de6
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date: Tue Nov 19 07:13:59 2024 -0800

    remove end_strings
    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit 44d1e9d
Author: arendu <arendu@users.noreply.github.com>
Date: Tue Nov 19 04:57:56 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit fdd8005
Author: adithyare <adithyare@nvidia.com>
Date: Mon Nov 18 20:56:51 2024 -0800

    debugging args to generate
    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 849ff34
Author: arendu <arendu@users.noreply.github.com>
Date: Tue Nov 19 03:14:45 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 54fba29
Merge: 3b2e00f 6723809
Author: Adi Renduchintala <adithyare@nvidia.com>
Date: Mon Nov 18 19:13:38 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit 3b2e00f
Author: Adi Renduchintala <adithyare@nvidia.com>
Date: Mon Nov 18 19:12:52 2024 -0800

    debug Nones
    Signed-off-by: Adi Renduchintala <adithyare@nvidia.com>

commit 6723809
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date: Wed Nov 13 16:53:31 2024 -0500

    Workaround for crash due to `bytes` tokens in Tiktoken tokenizer

commit 3b284af
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 02:12:39 2024 +0000

    debug slowness
    Signed-off-by: arendu <adithya.r@gmail.com>

commit e4b2259
Author: arendu <adithya.r@gmail.com>
Date: Tue Nov 19 00:41:19 2024 +0000

    added timing logs
    Signed-off-by: arendu <adithya.r@gmail.com>

commit ae07158
Author: JRD971000 <JRD971000@users.noreply.github.com>
Date: Mon Nov 18 22:49:42 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com>

commit 85a1c9c
Author: Ali Taghibakhshi <ataghibakhsh@login-eos01.eos.clusters.nvidia.com>
Date: Mon Nov 18 14:48:46 2024 -0800

    add nemo intermediate ckpt

commit 2cdd1a9
Author: arendu <arendu@users.noreply.github.com>
Date: Thu Nov 14 21:36:31 2024 +0000

    Apply isort and black reformatting
    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit a4134ef
Author: arendu <adithya.r@gmail.com>
Date: Thu Nov 14 21:35:30 2024 +0000

    removing redundant params_dtype attr in mamba yaml
    Signed-off-by: arendu <adithya.r@gmail.com>

commit d6a014f
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date: Thu Nov 14 10:57:25 2024 -0600

    Set skip_special_tokens to False by default in tiktoken_tokenizer.py

commit b08f3eb
Merge: cf3bf49 985e0cf
Author: adithyare <adithyare@nvidia.com>
Date: Wed Nov 13
15:59:18 2024 -0800 Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5 commit cf3bf49 Merge: acfde95 6c2ce66 Author: adithyare <adithyare@nvidia.com> Date: Wed Nov 13 15:59:13 2024 -0800 resolved conflict for dtype Signed-off-by: adithyare <adithyare@nvidia.com> commit 985e0cf Author: arendu <arendu@users.noreply.github.com> Date: Wed Nov 13 23:58:56 2024 +0000 Apply isort and black reformatting Signed-off-by: arendu <arendu@users.noreply.github.com> commit 6c2ce66 Author: arendu <adithya.r@gmail.com> Date: Wed Nov 13 23:57:54 2024 +0000 dtype fix in mamba Signed-off-by: arendu <adithya.r@gmail.com> commit acfde95 Merge: 20e251c 7c78ef4 Author: adithyare <adithyare@nvidia.com> Date: Wed Nov 13 14:47:41 2024 -0800 Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5 commit 7c78ef4 Author: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com> Date: Wed Nov 13 16:42:08 2024 -0600 Minor changes to conversion script Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com> commit 20e251c Author: adithyare <adithyare@nvidia.com> Date: Wed Nov 13 14:37:26 2024 -0800 fix for torch empty Signed-off-by: adithyare <adithyare@nvidia.com> commit ecb2bf6 Author: Gerald Shen <geshen@nvidia.com> Date: Wed Nov 13 12:41:04 2024 -0800 disable vocab padding Signed-off-by: Gerald Shen <geshen@nvidia.com> commit e415b65 Author: arendu <arendu@users.noreply.github.com> Date: Tue Nov 12 22:35:23 2024 +0000 Apply isort and black reformatting Signed-off-by: arendu <arendu@users.noreply.github.com> commit 023dfbe Merge: 88ec5e0 3b63b4c Author: adithyare <adithyare@nvidia.com> Date: Tue Nov 12 14:34:08 2024 -0800 merged Signed-off-by: adithyare <adithyare@nvidia.com> commit 88ec5e0 Merge: be7f996 8231cde Author: adithyare <adithyare@nvidia.com> Date: Tue Nov 12 12:30:14 2024 -0800 Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5 commit be7f996 
Author: adithyare <adithyare@nvidia.com> Date: Tue Nov 12 12:28:15 2024 -0800 pad to mult is not available in chat dataset Signed-off-by: adithyare <adithyare@nvidia.com> commit 8231cde Author: arendu <arendu@users.noreply.github.com> Date: Tue Nov 12 20:23:03 2024 +0000 Apply isort and black reformatting Signed-off-by: arendu <arendu@users.noreply.github.com> commit 19e5049 Author: adithyare <adithyare@nvidia.com> Date: Tue Nov 12 12:21:59 2024 -0800 a long overdue tiktoken special tokens fix -- Tkonuk Signed-off-by: adithyare <adithyare@nvidia.com> commit 3b63b4c Author: JRD971000 <JRD971000@users.noreply.github.com> Date: Tue Nov 12 14:23:58 2024 +0000 Apply isort and black reformatting Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com> commit 6e4bd6f Merge: c26bd22 75d8854 Author: Ali Taghibakhshi <ataghibakhsh@login-eos02.eos.clusters.nvidia.com> Date: Tue Nov 12 06:22:22 2024 -0800 cleanup commit c26bd22 Author: Ali Taghibakhshi <ataghibakhsh@login-eos02.eos.clusters.nvidia.com> Date: Tue Nov 12 06:16:52 2024 -0800 cleanup commit 57008da Author: JRD971000 <JRD971000@users.noreply.github.com> Date: Fri Nov 8 18:57:48 2024 +0000 Apply isort and black reformatting Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com> commit aa0fafb Author: ataghibakhsh <ataghibakhsh@nvidia.com> Date: Fri Nov 8 10:56:19 2024 -0800 guard cuda access commit 0d9bb4f Author: ataghibakhsh <ataghibakhsh@nvidia.com> Date: Mon Nov 4 14:41:41 2024 -0800 add nemotron5 conversion commit 75d8854 Author: JRD971000 <JRD971000@users.noreply.github.com> Date: Fri Nov 8 18:57:48 2024 +0000 Apply isort and black reformatting Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com> commit 627a40d Author: ataghibakhsh <ataghibakhsh@nvidia.com> Date: Fri Nov 8 10:56:19 2024 -0800 guard cuda access commit ada4b90 Author: JRD971000 <JRD971000@users.noreply.github.com> Date: Tue Nov 5 17:57:58 2024 +0000 Apply isort and black reformatting Signed-off-by: JRD971000 
<JRD971000@users.noreply.github.com> commit 1343bee Author: ataghibakhsh <ataghibakhsh@nvidia.com> Date: Mon Nov 4 14:41:41 2024 -0800 add nemotron5 conversion Signed-off-by: Terry Kong <terryk@nvidia.com>

…or Aligner (NVIDIA#10863) * fix(export): update API for disabling device reassignment in TRTLLM for Aligner [feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime Signed-off-by: Terry Kong <terryk@nvidia.com> fix: forgot to always set _disable_torch_cuda_device_set Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Apply isort and black reformatting Signed-off-by: terrykong <terrykong@users.noreply.github.com> invert torch device set Signed-off-by: Terry Kong <terryk@nvidia.com> * remove comment Signed-off-by: Terry Kong <terryk@nvidia.com> --------- Signed-off-by: Terry Kong <terryk@nvidia.com>

terrykong requested review from oyilmaz-nvidia and shanmugamr1992 October 11, 2024 23:45

terrykong mentioned this pull request Oct 11, 2024

feat: Upgrading TRTLLM to v13 NVIDIA/NeMo-Aligner#320

Merged

8 tasks

terrykong force-pushed the tk/v12/trtllm-refit-api-change branch 3 times, most recently from 419202f to 0deaf67 Compare October 13, 2024 01:44

terrykong added the Run CICD label Oct 22, 2024

terrykong changed the title ~~fix[export]: update API for disabling device reassignment in TRTLLM for Aligner~~ fix(export): update API for disabling device reassignment in TRTLLM for Aligner Oct 22, 2024

terrykong marked this pull request as draft October 22, 2024 17:26

shanmugamr1992 previously approved these changes Oct 22, 2024

terrykong force-pushed the tk/v12/trtllm-refit-api-change branch from 0deaf67 to b8bf39f Compare November 1, 2024 22:59

terrykong dismissed shanmugamr1992’s stale review via 8f080d6 November 1, 2024 23:09

terrykong force-pushed the tk/v12/trtllm-refit-api-change branch from b8bf39f to 8f080d6 Compare November 1, 2024 23:09

terrykong marked this pull request as ready for review November 1, 2024 23:09

remove comment

567b144

Signed-off-by: Terry Kong <terryk@nvidia.com>

github-actions bot added the core Changes to NeMo Core label Nov 6, 2024

terrykong force-pushed the tk/v12/trtllm-refit-api-change branch from 89ff142 to 567b144 Compare November 6, 2024 22:23

terrykong removed the Run CICD label Nov 6, 2024

github-actions bot removed the core Changes to NeMo Core label Nov 6, 2024

terrykong added the core (Changes to NeMo Core) and Run CICD labels Nov 6, 2024

Merge branch 'main' into tk/v12/trtllm-refit-api-change

b7aa885

terrykong added Run CICD and removed Run CICD labels Nov 12, 2024

github-actions bot removed the core Changes to NeMo Core label Nov 12, 2024

meatybobby approved these changes Nov 12, 2024

terrykong enabled auto-merge (squash) November 12, 2024 18:24

terrykong merged commit 085e957 into main Nov 12, 2024
168 of 169 checks passed

terrykong deleted the tk/v12/trtllm-refit-api-change branch November 12, 2024 21:29


terrykong commented Oct 11, 2024

github-actions bot commented Nov 12, 2024

github-actions bot commented Nov 12, 2024
