Davidm/cherrypick r1.16.0 #6082

Merged
ericharper merged 12 commits into r1.16.0 from davidm/cherrypick_r1.16.0
Mar 7, 2023

@Davood-M (Collaborator) commented Feb 22, 2023

What does this PR do?

Cherry-picks important PRs from main into the r1.16.0 branch.


Davood-M and others added 8 commits February 8, 2023 09:11
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
* per-micro-batch input loader

* per-micro-batch input loader

set arg default val

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* apply per-microbatch-loader to only GPT

* update docstring on micro-batch input loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed the default arg val

* fix batch size to 1 at log stat registration

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

* update container in jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

fix merge conflict

* revert Jenkinsfile

* Revert "revert Jenkinsfile"

This reverts commit d23b775.

* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* add GradScaler

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
…pute (#5684)

* Add distopt support for overlapping param all-gather with forward compute

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update Apex commit

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
* per-micro-batch input loader

* per-micro-batch input loader

set arg default val

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* apply per-microbatch-loader to only GPT

* update docstring on micro-batch input loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed the default arg val

* fix batch size to 1 at log stat registration

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

* update container in jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

fix merge conflict

* revert Jenkinsfile

* Revert "revert Jenkinsfile"

This reverts commit d23b775.

* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* add GradScaler

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
* patch to allow using tokenizers without additional_special_tokens_ids attribute

Signed-off-by: arendu <adithya.r@gmail.com>

* early stop callback for prompt/p tuning

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* added exp manager config for early stop

Signed-off-by: arendu <adithya.r@gmail.com>

* pushed logic for creating early stopping inside exp manager

Signed-off-by: arendu <adithya.r@gmail.com>

* pushed logic for creating early stopping inside exp manager

Signed-off-by: arendu <adithya.r@gmail.com>

* minor updates and added dataclass check

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more args

Signed-off-by: arendu <adithya.r@gmail.com>

* more args

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
github-actions bot added the CI, core (Changes to NeMo Core), and NLP labels Feb 22, 2023
@ericharper (Collaborator) left a comment

LGTM. Thanks!

@ericharper merged commit 28b5f29 into r1.16.0 Mar 7, 2023
@ericharper deleted the davidm/cherrypick_r1.16.0 branch March 7, 2023 18:23
github-actions bot pushed a commit that referenced this pull request Mar 7, 2023
* gpt fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* per-micro-batch input loader (#5635)

* per-micro-batch input loader

* per-micro-batch input loader

set arg default val

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* apply per-microbatch-loader to only GPT

* update docstring on micro-batch input loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed the default arg val

* fix batch size to 1 at log stat registration

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

* update container in jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

fix merge conflict

* revert Jenkinsfile

* Revert "revert Jenkinsfile"

This reverts commit d23b775.

* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* add GradScaler

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* added PR#5995

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Distributed Adam optimizer overlaps param all-gather with forward compute (#5684)

* Add distopt support for overlapping param all-gather with forward compute

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update Apex commit

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* per-micro-batch input loader (#5635)

* per-micro-batch input loader

* per-micro-batch input loader

set arg default val

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* apply per-microbatch-loader to only GPT

* update docstring on micro-batch input loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed the default arg val

* fix batch size to 1 at log stat registration

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

* update container in jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

fix merge conflict

* revert Jenkinsfile

* Revert "revert Jenkinsfile"

This reverts commit d23b775.

* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* add GradScaler

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* adding early stop callback to ptuning (#6028)

* patch to allow using tokenizers without additional_special_tokens_ids attribute

Signed-off-by: arendu <adithya.r@gmail.com>

* early stop callback for prompt/p tuning

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* added exp manager config for early stop

Signed-off-by: arendu <adithya.r@gmail.com>

* pushed logic for creating early stopping inside exp manager

Signed-off-by: arendu <adithya.r@gmail.com>

* pushed logic for creating early stopping inside exp manager

Signed-off-by: arendu <adithya.r@gmail.com>

* minor updates and added dataclass check

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more args

Signed-off-by: arendu <adithya.r@gmail.com>

* more args

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
ericharper added a commit that referenced this pull request Mar 9, 2023
ericharper added a commit that referenced this pull request Mar 9, 2023
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Tn doc 16 (#5954)

* fix new repo links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix new repo links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix spelling

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add warning

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add comment

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

---------

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix links, add missing file (#6044)

* fix link, add missing file

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix primer notebook links

Signed-off-by: ekmb <ebakhturina@nvidia.com>

---------

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Davidm/cherrypick r1.16.0 (#6082)

* update branch for pynini install script (#6150)

Signed-off-by: ericharper <complex451@gmail.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* revert

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict

Signed-off-by: ericharper <complex451@gmail.com>

* revert file add

Signed-off-by: ericharper <complex451@gmail.com>

* revert import

Signed-off-by: ericharper <complex451@gmail.com>

* revert

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Tn doc 16 (NVIDIA#5954)

* fix new repo links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix new repo links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix spelling

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add warning

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add comment

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

---------

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix links, add missing file (NVIDIA#6044)

* fix link, add missing file

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix primer notebook links

Signed-off-by: ekmb <ebakhturina@nvidia.com>

---------

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Davidm/cherrypick r1.16.0 (NVIDIA#6082)

* gpt fix

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* per-micro-batch input loader (NVIDIA#5635)

* per-micro-batch input loader

* per-micro-batch input loader

set arg default val

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* apply per-microbatch-loader to only GPT

* update docstring on micro-batch input loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed the default arg val

* fix batch size to 1 at log stat registration

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

* update container in jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

fix merge conflict

* revert Jenkinsfile

* Revert "revert Jenkinsfile"

This reverts commit d23b775.

* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* add GradScaler

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* added PR#5995

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Distributed Adam optimizer overlaps param all-gather with forward compute (NVIDIA#5684)

* Add distopt support for overlapping param all-gather with forward compute

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update Apex commit

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* per-micro-batch input loader (NVIDIA#5635)

* per-micro-batch input loader

* per-micro-batch input loader

set arg default val

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* apply per-microbatch-loader to only GPT

* update docstring on micro-batch input loader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed the default arg val

* fix batch size to 1 at log stat registration

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

* update container in jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update container for CI

Signed-off-by: ericharper <complex451@gmail.com>

fix merge conflict

* revert Jenkinsfile

* Revert "revert Jenkinsfile"

This reverts commit d23b775.

* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* add GradScaler

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* adding early stop callback to ptuning (NVIDIA#6028)

* patch to allow using tokenizers without additional_special_tokens_ids attribute

Signed-off-by: arendu <adithya.r@gmail.com>

* early stop callback for prompt/p tuning

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <adithya.r@gmail.com>

* added exp manager config for early stop

Signed-off-by: arendu <adithya.r@gmail.com>

* pushed logic for creating early stopping inside exp manager

Signed-off-by: arendu <adithya.r@gmail.com>

* pushed logic for creating early stopping inside exp manager

Signed-off-by: arendu <adithya.r@gmail.com>

* minor updates and added dataclass check

Signed-off-by: arendu <adithya.r@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more args

Signed-off-by: arendu <adithya.r@gmail.com>

* more args

Signed-off-by: arendu <adithya.r@gmail.com>

---------

Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>

* update branch for pynini install script (NVIDIA#6150)

Signed-off-by: ericharper <complex451@gmail.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* revert

Signed-off-by: ericharper <complex451@gmail.com>

* resolve conflict

Signed-off-by: ericharper <complex451@gmail.com>

* revert file add

Signed-off-by: ericharper <complex451@gmail.com>

* revert import

Signed-off-by: ericharper <complex451@gmail.com>

* revert

Signed-off-by: ericharper <complex451@gmail.com>

---------

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: arendu <adithya.r@gmail.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <108822655+arendu@users.noreply.github.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
Labels
CI core Changes to NeMo Core NLP
5 participants