[gradient_checkpointing] default to use it for torch 2.3 #28538

Merged
merged 4 commits into from
Feb 20, 2024

ArthurZucker
Collaborator

What does this PR do?

Fixes #28536 in preparation for the next torch release.

@ArthurZucker ArthurZucker changed the title from "default to use it" to "[gradient_checkpointing] default to use it for torch 2.3" on Jan 16, 2024
Contributor

@younesbelkada younesbelkada left a comment

Makes sense!

@hiyouga
Contributor

hiyouga commented Jan 19, 2024

Why do we use reentrant gradient checkpointing by default? The PyTorch docs say the non-reentrant variant can be more advantageous than the reentrant one: https://pytorch.org/docs/2.0/checkpoint.html#torch.utils.checkpoint.checkpoint

@younesbelkada
Contributor

@hiyouga use_reentrant=True is the default in PyTorch anyway, so if you leave it as None, use_reentrant will be set to True.
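
To illustrate the point above, a wrapper can pin the flag explicitly so the behavior no longer depends on PyTorch's implicit default. This is only a sketch, not the code merged in this PR: `resolve_checkpoint_kwargs` and `fake_checkpoint` are hypothetical names, and `fake_checkpoint` is a stand-in for torch.utils.checkpoint.checkpoint so the example runs without torch installed.

```python
import functools


def resolve_checkpoint_kwargs(gradient_checkpointing_kwargs=None):
    """Return a kwargs dict with use_reentrant always set (hypothetical helper)."""
    kwargs = dict(gradient_checkpointing_kwargs or {})
    # Assumption: fall back to the reentrant variant, which mirrors
    # PyTorch's current default when the flag is left unset.
    kwargs.setdefault("use_reentrant", True)
    return kwargs


def fake_checkpoint(function, *args, use_reentrant=None, **kwargs):
    """Stand-in for torch.utils.checkpoint.checkpoint; it only reports the flag."""
    return function(*args, **kwargs), use_reentrant


# Bind the resolved kwargs once, then call the wrapped function as usual.
default_fn = functools.partial(fake_checkpoint, **resolve_checkpoint_kwargs())
non_reentrant_fn = functools.partial(
    fake_checkpoint, **resolve_checkpoint_kwargs({"use_reentrant": False})
)

result_default = default_fn(lambda x: x * 2, 3)
result_non_reentrant = non_reentrant_fn(lambda x: x * 2, 3)
```

With the real library, the non-reentrant variant can be requested by passing gradient_checkpointing_kwargs={"use_reentrant": False} to model.gradient_checkpointing_enable(), assuming a transformers version that accepts that argument.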


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@ArthurZucker ArthurZucker marked this pull request as ready for review February 19, 2024 03:40
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker ArthurZucker merged commit 9094abe into main Feb 20, 2024
19 of 21 checks passed
@ArthurZucker ArthurZucker deleted the use-rentrant branch February 20, 2024 01:23
@lucasjinreal

I upgraded transformers to the latest version but still get this warning, and it is logged at every single step:

s/env-3.9.2/lib/python3.9/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/data/miniconda3/envs/env-3.9.2/lib/python3.9/site-packages/torch/utils/checkpoint.py:90: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/data/miniconda3/envs/env-3.9.2/lib/python3.9/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/data/miniconda3/envs/env-3.9.2/lib/python3.9/site-packages/torch/utils/checkpoint.py:90: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/data/miniconda3/envs/env-3.9.2/lib/python3.9/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/data/miniconda3/envs/env-3.9.2/lib/python3.9/site-packages/torch/utils/checkpoint.py:90: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(

How can I disable it?
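
One possible stopgap, sketched here under assumptions, is to filter out just this message with Python's standard warnings module. The function name `silence_use_reentrant_warning` is hypothetical, and the pattern is copied from the start of the message in the log above. Filtering hides the message but does not change which checkpointing variant runs, so passing use_reentrant explicitly remains the cleaner fix.

```python
import warnings

# Assumption: the pattern matches the beginning of the message shown in the log.
# warnings.filterwarnings treats `message` as a regex anchored at the start.
USE_REENTRANT_PATTERN = r"torch\.utils\.checkpoint: please pass in use_reentrant"


def silence_use_reentrant_warning():
    """Ignore only the use_reentrant UserWarning; other warnings still show."""
    warnings.filterwarnings(
        "ignore",
        message=USE_REENTRANT_PATTERN,
        category=UserWarning,
    )
```

Call silence_use_reentrant_warning() once, before training starts, so the filter is installed before the first checkpointed forward pass.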

@ArthurZucker
Collaborator Author

Can you open a new issue with a proper reproducer?

Development

Successfully merging this pull request may close these issues.

Gradient checkpointing throws use_reentrant warning on PyTorch 2.1