fix(export): update API for disabling device reassignment in TRTLLM for Aligner #10863

Merged
merged 3 commits into from
Nov 12, 2024

Conversation

terrykong
Collaborator

  • Also clean up some unused imports

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@terrykong terrykong force-pushed the tk/v12/trtllm-refit-api-change branch 3 times, most recently from 419202f to 0deaf67 Compare October 13, 2024 01:44
@terrykong terrykong changed the title fix[export]: update API for disabling device reassignment in TRTLLM for Aligner fix(export): update API for disabling device reassignment in TRTLLM for Aligner Oct 22, 2024
@terrykong terrykong marked this pull request as draft October 22, 2024 17:26
shanmugamr1992
shanmugamr1992 previously approved these changes Oct 22, 2024
@terrykong terrykong force-pushed the tk/v12/trtllm-refit-api-change branch from 0deaf67 to b8bf39f Compare November 1, 2024 22:59
…or Aligner

[feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime

Signed-off-by: Terry Kong <terryk@nvidia.com>

fix: forgot to always set _disable_torch_cuda_device_set

Signed-off-by: Terry Kong <terryk@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>

Apply isort and black reformatting

Signed-off-by: terrykong <terrykong@users.noreply.github.com>

invert torch device set

Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong force-pushed the tk/v12/trtllm-refit-api-change branch from b8bf39f to 8f080d6 Compare November 1, 2024 23:09
@terrykong terrykong marked this pull request as ready for review November 1, 2024 23:09
Signed-off-by: Terry Kong <terryk@nvidia.com>
@github-actions github-actions bot added the core Changes to NeMo Core label Nov 6, 2024
@terrykong terrykong force-pushed the tk/v12/trtllm-refit-api-change branch from 89ff142 to 567b144 Compare November 6, 2024 22:23
@terrykong terrykong removed the Run CICD label Nov 6, 2024
@github-actions github-actions bot removed the core Changes to NeMo Core label Nov 6, 2024
@terrykong terrykong added core Changes to NeMo Core Run CICD labels Nov 6, 2024
@github-actions github-actions bot removed the core Changes to NeMo Core label Nov 12, 2024
Contributor

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.export.trt_llm.tensorrt_llm_run
nemo/export/trt_llm/tensorrt_llm_run.py:506:0: C0301: Line too long (125/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:510:0: C0301: Line too long (136/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:514:0: C0301: Line too long (123/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:557:0: C0301: Line too long (181/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:839:0: C0301: Line too long (153/119) (line-too-long)
nemo/export/trt_llm/tensorrt_llm_run.py:524:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/export/trt_llm/tensorrt_llm_run.py:533:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/export/trt_llm/tensorrt_llm_run.py:591:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/export/trt_llm/tensorrt_llm_run.py:33:0: W0611: Unused Mapping imported from tensorrt_llm.mapping (unused-import)

-----------------------------------
Your code has been rated at 9.75/10

Thank you for improving NeMo's documentation!

@terrykong terrykong enabled auto-merge (squash) November 12, 2024 18:24
Contributor

[🤖]: Hi @terrykong 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

I'm just a bot, so I'll leave it up to you what to do next.

//cc @pablo-garay @ko3n1g

@terrykong terrykong merged commit 085e957 into main Nov 12, 2024
168 of 169 checks passed
@terrykong terrykong deleted the tk/v12/trtllm-refit-api-change branch November 12, 2024 21:29
zpx01 added a commit that referenced this pull request Nov 14, 2024
* Timestamps to transcribe (#10950)

* initial version

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Support for RNNT, TDT, Hybrid Models

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* move change of decoder strategy from mixin to individual model class

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* update transcribe_speech.py

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* uncomment

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* add docs

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* fix docs

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* codeql fixes

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* unit tests

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* minor rebase fix

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* add None case to restore the state set outside using decoding_stratergy()

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* remove ipdb traces

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* updates doc for transcription.py

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* remove preserve alignment for AED models as it doesn't support it

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* lint warnings

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 1b8fce7 ! (#11247)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 47ff44e ! (#11254)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Handling tokenizer in PTQ for Nemo 2.0 (#11237)

* Handling tokenizer in PTQ for Nemo 2.0

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Print log msg and enable overriding

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Warning for legacy tokenizer config

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Save HF tokenizer to make tokenizer_config.yaml (almost) redundant

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Handle tokenizer in a unified way

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Move saving context within export

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix typo in get_tokenzier

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Reduce diff

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Drop unused import

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix finetuning datamodule resume (#11187)

* fix datamodule resume

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* fix subclass

Signed-off-by: Chen Cui <chcui@nvidia.com>

* docstrings and formats

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>

* ci: Move `bump mcore` to templates (#11229)

* ci: Move `bump mcore` to templates

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* final

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

---------

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix: Update baseline (#11205)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Remove deprecated builder_opt param from build command (#11259)

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* chore(beep boop 🤖): Bump `MCORE_TAG=aded519...` (2024-11-12) (#11260)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* [Doc fixes] update file names, installation instructions, bad links (#11045)

* rename eval_beamsearch_ngram.py to eval_beamsearch_ngram_ctc.py in docs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* replace out of date installation instructions with pointer to NeMo README installation section

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* point to user guide instead of readme

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* some link updates

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* update more links

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

---------

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>

* fix(export): GPT models w/ bias=False convert properly (#11255)

Signed-off-by: Terry Kong <terryk@nvidia.com>

* ci: Run secrets detector on `pull_request_target` (#11263)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* fix(export): update API for disabling device reassignment in TRTLLM for Aligner (#10863)

* fix(export): update API for disabling device reassignment in TRTLLM for Aligner

[feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime

Signed-off-by: Terry Kong <terryk@nvidia.com>

fix: forgot to always set _disable_torch_cuda_device_set

Signed-off-by: Terry Kong <terryk@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>

Apply isort and black reformatting

Signed-off-by: terrykong <terrykong@users.noreply.github.com>

invert torch device set

Signed-off-by: Terry Kong <terryk@nvidia.com>

* remove comment

Signed-off-by: Terry Kong <terryk@nvidia.com>

---------

Signed-off-by: Terry Kong <terryk@nvidia.com>

* new vfm training features (#11246)

Signed-off-by: Zeeshan Patel <zeeshanp@nvidia.com>
Co-authored-by: Zeeshan Patel <zeeshanp@nvidia.com>

* Update pruning and distillation tutorial notebooks (#11091)

* Update pruning and distillation tutorial notebooks

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Update README

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Update batch size in width pruning script

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Update README

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

---------

Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>

* Beam search algorithm implementation for TDT models (#10903)

* initial commit

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: default beam search implementation

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: changed to removing duplicate hypothesis in separate function

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: changed to cartesian product in choosing best hyp

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: minor fixes in comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: maes decoding strategy

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: durations filtering in maes, lm fusion in progress

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: refactored, added comments, command line args, finalized

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: removed prints

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: docs

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix: minor fix

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: rm beam_size=1 exception, rm duplicates check, fix error handling

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: error handling

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix: removed evaluations file

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rn: blank scoring

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm: blank scoring and duration beam size

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix: removed durations_beam_size from default beam search

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: logaddexp

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm: prefix search

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rn: nested loop over extensions

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: bug with caching

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm: topk on durations

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add: restored prefix search

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix: fixed comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* refactored duplicate merging

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* changes batch scoring

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* refactored rnnt batch scoring

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* alsd first working

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* refactored

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* remove stacking operations

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fixes im base class

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* remove potentially uninitialized local variable

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* default beam search minor fixes

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add test, fix maes timesteps

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm file

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm file

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* clean up

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add ngram lm test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix maes_num_steps=1

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix kenlm model path

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix kenlm model full path

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* made requested changes

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* merge after isort

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add prints to test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add Kenlm to asr requirements

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* remove prints in tests

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add kenlm to test requirements

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm kenlm from link, add package-name

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm second kenlm installation

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* rm kenlm from dependencies make test optional

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix in test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix in test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* splitted docstrings

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* splitted docstrings

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* add comments

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fixes to python3 type annotations

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* merging

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* merging

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix in return type

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* fix test

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>

* rm time_idx

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

* fix comments to python3 style

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>

---------

Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* update nemo1->2 conversion according to changes in main (#11253)

* update nemo1->2 conversion according to changes in main

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* format fix

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

* add docstrings

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

---------

Signed-off-by: Huiying Li <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* Add llama 3.1 recipes (#11273)

* add llama 3.1 recipes

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* fix pylint

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Fix llama3.1 wrong config in io.json

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: Ao Tang <aot@nvidia.com>

* Fix Finetune Recipe (#11267)

* Fix Starcoder_15 SFT recipe

* Fix PP type SFT recipe

* Fix PP type SFT recipe

* Fix Gemma2b SFT TP=1

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* Fix more sft recipe

* remove pp dtype

* remove pp dtype

* Configure no restart validation loop in nl.Trainer (#11029)

* Configure no restart validation loop in nl.Trainer

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Skip validation whenever restarting=True

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* Handle _io_unflatten_object when _thread_local.output_dir is not available (#11199)

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* change default ckpt name (#11277)

Signed-off-by: Maanu Grover <maanug@nvidia.com>

* Use MegatronDataSampler in HfDatasetDataModule (#11274)

* Use MegatronDataSampler in HfDataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Remove opencc upperbound (#10909)

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Zeeshan Patel <zeeshanp@nvidia.com>
Signed-off-by: Gomathy Venkata Krishnan <gvenkatakris@nvidia.com>
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Signed-off-by: Huiying Li <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Signed-off-by: Maanu Grover <maanug@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Jan Lasek <janek.lasek@gmail.com>
Co-authored-by: Chen Cui <chcui@nvidia.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: Elena Rastorgueva <80532067+erastorgueva-nv@users.noreply.github.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
Co-authored-by: Zeeshan Patel <zeeshanp@nvidia.com>
Co-authored-by: gvenkatakris <gvenkatakris@nvidia.com>
Co-authored-by: lilithgrigoryan <38436437+lilithgrigoryan@users.noreply.github.com>
Co-authored-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Huiying <willwin.lee@gmail.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: Ao Tang <aot@nvidia.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: Maanu Grover <109391026+maanug-nv@users.noreply.github.com>
Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: Dong Hyuk Chang <thomaschang26@tutanota.com>
HuiyingLi pushed a commit to HuiyingLi/NeMo that referenced this pull request Nov 15, 2024
…or Aligner (NVIDIA#10863)

* fix(export): update API for disabling device reassignment in TRTLLM for Aligner

[feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime

Signed-off-by: Terry Kong <terryk@nvidia.com>

fix: forgot to always set _disable_torch_cuda_device_set

Signed-off-by: Terry Kong <terryk@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>

Apply isort and black reformatting

Signed-off-by: terrykong <terrykong@users.noreply.github.com>

invert torch device set

Signed-off-by: Terry Kong <terryk@nvidia.com>

* remove comment

Signed-off-by: Terry Kong <terryk@nvidia.com>

---------

Signed-off-by: Terry Kong <terryk@nvidia.com>
gshennvm pushed a commit that referenced this pull request Nov 20, 2024
yashaswikarnati pushed a commit that referenced this pull request Nov 21, 2024
terrykong added a commit that referenced this pull request Dec 3, 2024
Squashed commit of the following:

commit 57ef506
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Thu Nov 28 13:27:04 2024 -0800

    Fully remove hack that was adding "</s>" to `end_strings`

commit 6076b60
Author: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>
Date:   Thu Nov 28 00:02:43 2024 -0600

    change dist ckpt to zarr

    Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>

commit 33564d4
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date:   Tue Nov 26 19:53:04 2024 -0800

    remove eos hack given the fix in 4b71c0f

    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit 9387c74
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Tue Nov 26 17:37:45 2024 -0500

    Fix for when `ids_to_tokens()` is unable to return a valid token

commit c23db69
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Tue Nov 26 16:30:47 2024 -0500

    Simplify implementation of `token_to_id()`

commit ab699a5
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Tue Nov 26 13:28:15 2024 -0500

    Fix `ids_to_tokens()` to handle tokens associated to multiple token IDs

commit 52ec872
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Tue Nov 26 11:45:43 2024 -0500

    Ensure `tokens_to_text()` is consistent with `ids_to_text()`

commit 4b71c0f
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Tue Nov 26 11:39:20 2024 -0500

    Skip BOS/EOS tokens in `ids_to_text()` by default

    This is because those tokens are typically added in the code (e.g. for
    padding purpose) and we do not want them to be part of the response.
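
The behavior described in commit 4b71c0f can be illustrated with a minimal sketch. This is not the tokenizer code from the commit: the function signature, the `skip_special` flag, the toy vocabulary, and the BOS/EOS ids below are all assumptions made for illustration only.

```python
from typing import Dict, Iterable


def ids_to_text(
    ids: Iterable[int],
    id_to_piece: Dict[int, str],
    bos_id: int,
    eos_id: int,
    skip_special: bool = True,  # hypothetical flag name, not taken from the commit
) -> str:
    """Join token pieces, dropping BOS/EOS ids unless skip_special is False."""
    pieces = []
    for token_id in ids:
        if skip_special and token_id in (bos_id, eos_id):
            # BOS/EOS are usually inserted by the surrounding code (e.g. as
            # padding), so they are left out of the decoded response.
            continue
        pieces.append(id_to_piece[token_id])
    return "".join(pieces)


# Toy vocabulary, purely illustrative.
vocab = {0: "<s>", 1: "</s>", 2: "Hello", 3: " world"}
padded_ids = [0, 2, 3, 1, 1]  # BOS, two real tokens, EOS used twice as padding

clean_text = ids_to_text(padded_ids, vocab, bos_id=0, eos_id=1)
raw_text = ids_to_text(padded_ids, vocab, bos_id=0, eos_id=1, skip_special=False)
```

Keeping the filter behind a default-on flag lets callers that need the raw token stream still ask for it explicitly.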

commit a77dc9f
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date:   Tue Nov 26 12:33:35 2024 -0600

    Use decode_with_offsets

commit 413e736
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date:   Tue Nov 26 11:38:38 2024 -0600

    Fixed tokenization of special characters.

commit 30cef20
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date:   Tue Nov 26 10:33:07 2024 -0600

    Simplified the text_to_tokens method

commit d07a17c
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date:   Tue Nov 26 10:15:49 2024 -0600

    Attempt to fix the nemotron5 tokenizer

commit cee062f
Author: Gerald Shen <geshen@nvidia.com>
Date:   Fri Nov 22 18:15:55 2024 -0800

    only save untarred nemo files

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 23923fe
Author: Gerald Shen <geshen@nvidia.com>
Date:   Fri Nov 22 13:41:28 2024 -0800

    add checkpoint fix

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 61f999a
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Fri Nov 22 15:44:04 2024 -0500

    Slightly reduce sleep time when batching queries

    This can give a small speedup for free, since usually batched queries
    all come in within <0.5s

commit 17e148c
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Fri Nov 22 09:54:50 2024 -0500

    Avoid potential race conditions with batching

    In theory, with the previous implementation it would have been possible
    for a thread to re-use the output from a previous batch, if it happened
    to grab the lock before the thread with queryid == 0.

commit 65f0a3b
Author: Haifeng Qian <haifengq@cw-dfw-cs-001-login-02.cm.cluster>
Date:   Fri Nov 22 08:56:58 2024 -0800

    enforce tokens_to_generate as max number of generated tokens for each sequence in a batch

commit c9b6c60
Author: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>
Date:   Fri Nov 22 10:06:32 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>

commit 287ab7f
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date:   Fri Nov 22 02:01:44 2024 -0800

    hack to remove trailing </s>

    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit b912e92
Author: haifengqian <haifengqian@users.noreply.github.com>
Date:   Thu Nov 21 22:19:18 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: haifengqian <haifengqian@users.noreply.github.com>

commit 9853c30
Author: Haifeng Qian <haifengq@cw-dfw-cs-001-login-02.cm.cluster>
Date:   Thu Nov 21 14:17:47 2024 -0800

    add batching support in inference server

commit 551bf41
Author: arendu <arendu@users.noreply.github.com>
Date:   Wed Nov 20 22:16:10 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 9581135
Merge: daf406b df9374f
Author: adithyare <adithyare@nvidia.com>
Date:   Wed Nov 20 14:14:58 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit daf406b
Author: adithyare <adithyare@nvidia.com>
Date:   Wed Nov 20 14:14:32 2024 -0800

    removed logs and debugging code

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit df9374f
Author: Terry Kong <terryk@nvidia.com>
Date:   Tue Nov 12 13:29:56 2024 -0800

    fix(export): update API for disabling device reassignment in TRTLLM for Aligner (#10863)

    * fix(export): update API for disabling device reassignment in TRTLLM for Aligner

    [feat] Upgrade nemo-export path for aligner to TRTLLM-v12 and use python runtime

    Signed-off-by: Terry Kong <terryk@nvidia.com>

    fix: forgot to always set _disable_torch_cuda_device_set

    Signed-off-by: Terry Kong <terryk@nvidia.com>

    Signed-off-by: Terry Kong <terryk@nvidia.com>

    Apply isort and black reformatting

    Signed-off-by: terrykong <terrykong@users.noreply.github.com>

    invert torch device set

    Signed-off-by: Terry Kong <terryk@nvidia.com>

    * remove comment

    Signed-off-by: Terry Kong <terryk@nvidia.com>

    ---------

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit a923f76
Author: Gerald Shen <geshen@nvidia.com>
Date:   Wed Nov 20 13:19:43 2024 -0800

    TRT-LLM FIX FOR NEMOTRON5, THIS BREAKS TRT FOR ALL OTHER MODELS

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 2b44faf
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 23:35:21 2024 +0000

    loop once in server mode

    Signed-off-by: arendu <adithya.r@gmail.com>

commit 744839c
Merge: 0278a01 0a63807
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 23:32:01 2024 +0000

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit 0278a01
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 23:31:48 2024 +0000

    time generate method

    Signed-off-by: arendu <adithya.r@gmail.com>

commit 0a63807
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 19 23:03:34 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 6aa111f
Merge: 3958925 aee8a89
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 19 15:02:30 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit 3958925
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 19 15:02:07 2024 -0800

    added import

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit aee8a89
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 19 22:54:44 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit ca902fd
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 19 14:52:27 2024 -0800

    debug eval script times

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 15fdf8a
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 19 22:31:26 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit d27c4a5
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 22:30:35 2024 +0000

    debug

    Signed-off-by: arendu <adithya.r@gmail.com>

commit 2db23a7
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 19 21:21:03 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 405889e
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 19 13:19:49 2024 -0800

    removed logs in server, added a single timer

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 7730d5f
Merge: e504655 23812c3
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 16:56:36 2024 +0000

    remove logs resolve conflicts

    Signed-off-by: arendu <adithya.r@gmail.com>

commit e504655
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 16:53:55 2024 +0000

    removed timing/debug logs

    Signed-off-by: arendu <adithya.r@gmail.com>

commit 23812c3
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date:   Tue Nov 19 07:30:34 2024 -0800

    remove end_strings and end_of_turn

    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit 3ab0d2c
Author: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>
Date:   Tue Nov 19 15:15:01 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: HeyyyyyyG <HeyyyyyyG@users.noreply.github.com>

commit c4c7de6
Author: Jiaqi Zeng <jiaqiz@nvidia.com>
Date:   Tue Nov 19 07:13:59 2024 -0800

    remove end_strings

    Signed-off-by: Jiaqi Zeng <jiaqiz@nvidia.com>

commit 44d1e9d
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 19 04:57:56 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit fdd8005
Author: adithyare <adithyare@nvidia.com>
Date:   Mon Nov 18 20:56:51 2024 -0800

    debugging args to generate

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 849ff34
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 19 03:14:45 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 54fba29
Merge: 3b2e00f 6723809
Author: Adi Renduchintala <adithyare@nvidia.com>
Date:   Mon Nov 18 19:13:38 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit 3b2e00f
Author: Adi Renduchintala <adithyare@nvidia.com>
Date:   Mon Nov 18 19:12:52 2024 -0800

    debug Nones

    Signed-off-by: Adi Renduchintala <adithyare@nvidia.com>

commit 6723809
Author: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Date:   Wed Nov 13 16:53:31 2024 -0500

    Workaround for crash due to `bytes` tokens in Tiktoken tokenizer

commit 3b284af
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 02:12:39 2024 +0000

    debug slowness

    Signed-off-by: arendu <adithya.r@gmail.com>

commit e4b2259
Author: arendu <adithya.r@gmail.com>
Date:   Tue Nov 19 00:41:19 2024 +0000

    added timing logs

    Signed-off-by: arendu <adithya.r@gmail.com>

commit ae07158
Author: JRD971000 <JRD971000@users.noreply.github.com>
Date:   Mon Nov 18 22:49:42 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com>

commit 85a1c9c
Author: Ali Taghibakhshi <ataghibakhsh@login-eos01.eos.clusters.nvidia.com>
Date:   Mon Nov 18 14:48:46 2024 -0800

    add nemo intermediate ckpt

commit 2cdd1a9
Author: arendu <arendu@users.noreply.github.com>
Date:   Thu Nov 14 21:36:31 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit a4134ef
Author: arendu <adithya.r@gmail.com>
Date:   Thu Nov 14 21:35:30 2024 +0000

    removing redundant params_dtype attr in mamba yaml

    Signed-off-by: arendu <adithya.r@gmail.com>

commit d6a014f
Author: Tugrul Konuk <ertkonuk@gmail.com>
Date:   Thu Nov 14 10:57:25 2024 -0600

    Set skip_special_tokens to False by default in tiktoken_tokenizer.py

commit b08f3eb
Merge: cf3bf49 985e0cf
Author: adithyare <adithyare@nvidia.com>
Date:   Wed Nov 13 15:59:18 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit cf3bf49
Merge: acfde95 6c2ce66
Author: adithyare <adithyare@nvidia.com>
Date:   Wed Nov 13 15:59:13 2024 -0800

     resolved conflict for dtype

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 985e0cf
Author: arendu <arendu@users.noreply.github.com>
Date:   Wed Nov 13 23:58:56 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 6c2ce66
Author: arendu <adithya.r@gmail.com>
Date:   Wed Nov 13 23:57:54 2024 +0000

    dtype fix in mamba

    Signed-off-by: arendu <adithya.r@gmail.com>

commit acfde95
Merge: 20e251c 7c78ef4
Author: adithyare <adithyare@nvidia.com>
Date:   Wed Nov 13 14:47:41 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit 7c78ef4
Author: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>
Date:   Wed Nov 13 16:42:08 2024 -0600

    Minor changes to conversion script

    Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>

commit 20e251c
Author: adithyare <adithyare@nvidia.com>
Date:   Wed Nov 13 14:37:26 2024 -0800

    fix for torch empty

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit ecb2bf6
Author: Gerald Shen <geshen@nvidia.com>
Date:   Wed Nov 13 12:41:04 2024 -0800

    disable vocab padding

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit e415b65
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 12 22:35:23 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 023dfbe
Merge: 88ec5e0 3b63b4c
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 12 14:34:08 2024 -0800

    merged

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 88ec5e0
Merge: be7f996 8231cde
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 12 12:30:14 2024 -0800

    Merge branch 'aligner/nemotron5' of https://github.com/NVIDIA/NeMo into aligner/nemotron5

commit be7f996
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 12 12:28:15 2024 -0800

    pad to mult is not available in chat dataset

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 8231cde
Author: arendu <arendu@users.noreply.github.com>
Date:   Tue Nov 12 20:23:03 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: arendu <arendu@users.noreply.github.com>

commit 19e5049
Author: adithyare <adithyare@nvidia.com>
Date:   Tue Nov 12 12:21:59 2024 -0800

    a long overdue tiktoken special tokens fix -- Tkonuk

    Signed-off-by: adithyare <adithyare@nvidia.com>

commit 3b63b4c
Author: JRD971000 <JRD971000@users.noreply.github.com>
Date:   Tue Nov 12 14:23:58 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com>

commit 6e4bd6f
Merge: c26bd22 75d8854
Author: Ali Taghibakhshi <ataghibakhsh@login-eos02.eos.clusters.nvidia.com>
Date:   Tue Nov 12 06:22:22 2024 -0800

    cleanup

commit c26bd22
Author: Ali Taghibakhshi <ataghibakhsh@login-eos02.eos.clusters.nvidia.com>
Date:   Tue Nov 12 06:16:52 2024 -0800

    cleanup

commit 57008da
Author: JRD971000 <JRD971000@users.noreply.github.com>
Date:   Fri Nov 8 18:57:48 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com>

commit aa0fafb
Author: ataghibakhsh <ataghibakhsh@nvidia.com>
Date:   Fri Nov 8 10:56:19 2024 -0800

    guard cuda access

commit 0d9bb4f
Author: ataghibakhsh <ataghibakhsh@nvidia.com>
Date:   Mon Nov 4 14:41:41 2024 -0800

    add nemotron5 conversion

commit 75d8854
Author: JRD971000 <JRD971000@users.noreply.github.com>
Date:   Fri Nov 8 18:57:48 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com>

commit 627a40d
Author: ataghibakhsh <ataghibakhsh@nvidia.com>
Date:   Fri Nov 8 10:56:19 2024 -0800

    guard cuda access

commit ada4b90
Author: JRD971000 <JRD971000@users.noreply.github.com>
Date:   Tue Nov 5 17:57:58 2024 +0000

    Apply isort and black reformatting

    Signed-off-by: JRD971000 <JRD971000@users.noreply.github.com>

commit 1343bee
Author: ataghibakhsh <ataghibakhsh@nvidia.com>
Date:   Mon Nov 4 14:41:41 2024 -0800

    add nemotron5 conversion

Signed-off-by: Terry Kong <terryk@nvidia.com>
XuesongYang pushed a commit to paarthneekhara/NeMo that referenced this pull request Jan 18, 2025
…or Aligner (NVIDIA#10863)
