Add ability to enable/disable act ckpt and seq parallelism in GPT #6327
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
for more information, see https://pre-commit.ci
This looks good to me, and I've been using this for SFT models to turn this on/off between training and validation. I'll let @ericharper do a final review.
nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py
LGTM. Thanks!
…t function. Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
LGTM. Thanks for the changes!
…IDIA#6327)
* Add ability to enable/disable act ckpt and seq parallelism
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Remove num_micro_batches_with_partial_activation_checkpoints
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
* Added property to self.model and added restore/reset config values.
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Use self.model property
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed original_act_ckpt
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add docstrings to reset/restore act ckpt
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
* Property removed from self.model and replaced with get_gpt_module_list function.
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Markel Sanz Ausin <markelsanz14@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
What does this PR do?
Adds functions to enable and disable activation checkpointing and sequence parallelism. This way, we can enable them during training and disable them during inference/generation.
Collection: nemo/collections/nlp/language_modeling
Changelog
Usage
# Add a code snippet demonstrating how to use this
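As a rough illustration of the reset/restore pattern described above, here is a minimal, self-contained sketch. It does not use NeMo's actual API: `ToyConfig`, its field names, and `memory_savers_disabled` are hypothetical stand-ins chosen for this example. The idea is to save the current settings, turn activation checkpointing and sequence parallelism off for inference/generation, and put the saved values back afterwards.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class ToyConfig:
    # Hypothetical stand-in for a model config; field names are assumptions,
    # not taken from this PR.
    activations_checkpoint_granularity: Optional[str] = "selective"
    activations_checkpoint_method: Optional[str] = "uniform"
    sequence_parallel: bool = True


@contextmanager
def memory_savers_disabled(cfg: ToyConfig) -> Iterator[ToyConfig]:
    """Temporarily disable activation checkpointing and sequence parallelism."""
    # Remember the training-time values so they can be restored later.
    saved = (
        cfg.activations_checkpoint_granularity,
        cfg.activations_checkpoint_method,
        cfg.sequence_parallel,
    )
    # "Reset": switch both features off for validation or generation.
    cfg.activations_checkpoint_granularity = None
    cfg.activations_checkpoint_method = None
    cfg.sequence_parallel = False
    try:
        yield cfg
    finally:
        # "Restore": put the original values back, even if an error occurred.
        (
            cfg.activations_checkpoint_granularity,
            cfg.activations_checkpoint_method,
            cfg.sequence_parallel,
        ) = saved


cfg = ToyConfig()
with memory_savers_disabled(cfg):
    # Inference/generation would run here with both features off.
    pass
# Back to the training-time settings at this point.
```

A context manager is only one way to pair the reset and restore steps; the commits in this PR mention separate reset/restore functions, which would be called explicitly before and after validation or generation.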
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information