make torch.load a bit safer #27282
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Nice - thanks for adding!
It seems this currently breaks resuming from checkpoint with trainer, but other changes LGTM.
cc @muellerzr To review the changes in trainer.py as he knows more about the saved objects there
src/transformers/trainer.py (outdated)

```diff
@@ -2305,7 +2305,7 @@ def _load_rng_state(self, checkpoint):
             )
             return

-        checkpoint_rng_state = torch.load(rng_file)
+        checkpoint_rng_state = torch.load(rng_file, weights_only=True)
```
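For context on why this change matters: `torch.load` without `weights_only=True` runs a full unpickler, and unpickling untrusted data can execute arbitrary code. A minimal stdlib-only sketch of the underlying hazard, using plain `pickle` rather than torch so it is self-contained:

```python
import pickle

class NotACheckpoint:
    """Illustration of why an unrestricted load is risky: pickle calls
    __reduce__ while loading, so a malicious file can run any callable."""
    def __reduce__(self):
        # pickle will call eval("6 * 7") at *load* time, before the
        # caller ever gets a chance to inspect the object.
        return (eval, ("6 * 7",))

payload = pickle.dumps(NotACheckpoint())
obj = pickle.loads(payload)
print(obj)  # 42 -- eval already ran; we never got a NotACheckpoint back
```

`weights_only=True` swaps in a restricted unpickler that only reconstructs tensors and a small allowlist of plain types, which is why it refuses files like this one.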
It looks like this breaks loading rng_state from a saved checkpoint: https://app.circleci.com/pipelines/github/huggingface/transformers/77407/workflows/3502e84f-41fb-4f01-8a7a-2c40b9e76aa7/jobs/985682/steps
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
src/transformers/trainer.py (outdated)

```diff
@@ -2466,7 +2466,7 @@ def _load_optimizer_and_scheduler(self, checkpoint):
         # deepspeed loads optimizer/lr_scheduler together with the model in deepspeed_init
         if not isinstance(self.lr_scheduler, DeepSpeedSchedulerWrapper):
             with warnings.catch_warnings(record=True) as caught_warnings:
-                self.lr_scheduler.load_state_dict(torch.load(os.path.join(checkpoint, SCHEDULER_NAME)))
+                self.lr_scheduler.load_state_dict(torch.load(os.path.join(checkpoint, SCHEDULER_NAME), weights_only=True))
```
Scheduler shouldn't have weights right?
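Indeed, a scheduler's `state_dict()` is typically just plain Python scalars and lists with no tensors in it, which is exactly the kind of object a restricted unpickler can handle. A hedged sketch with representative (not transformers' actual) values:

```python
import pickle

# Representative of what an LR-scheduler state_dict looks like; the exact
# keys vary by scheduler, and these values are made up for illustration.
scheduler_state = {
    "last_epoch": 3,
    "_step_count": 4,
    "base_lrs": [5e-5],
}

# Plain dicts of numbers round-trip through pickling with no custom
# classes involved, so a restricted loader (weights_only=True) accepts them.
restored = pickle.loads(pickle.dumps(scheduler_state))
print(restored == scheduler_state)  # True
```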
cc @LysandreJik is this stale or has it been done elsewhere since?
Force-pushed from 5ff69ec to 3323d7b
Thanks! 🔒
Thank you! :)
yay 1 more commit on the GOAT of codebases!!! happy:)
* make torch.load a bit safer
* Fixes

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Hi @julien-c, thanks for your work! I was building from […]. The code is as simple as this: […]
And it raises […]
The […]
Fixing the […] Thank you for your time! If you need any help from me, feel free to ask.
torch 1.10 is quite old; is there any way you'd be able to upgrade to a more recent torch?
Sure, but the point is, […]
yep! cc @LysandreJik
#28207 will fix this 🤗
#28207 only removes 1.10, but for torch 1.11 and torch 1.12, secure pickling should still be buggy. You can take a moment to compare https://pytorch.org/docs/1.12/generated/torch.load.html and https://pytorch.org/docs/1.13/generated/torch.load.html.
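One way to act on this report (a sketch under the thread's assumptions, not transformers' actual fix) is to pass `weights_only=True` only when the installed torch is at least 1.13, the first release where the restricted loader is documented per the comparison above. The helper name and threshold here are illustrative assumptions:

```python
def supports_weights_only(torch_version: str) -> bool:
    """Hypothetical guard: return True if this torch version should be
    given weights_only=True (1.13+ per the docs comparison in the thread).
    Handles local-build suffixes like '2.1.0+cu118' as well."""
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    return (major, minor) >= (1, 13)

print(supports_weights_only("1.12.1"))       # False
print(supports_weights_only("1.13.0"))       # True
print(supports_weights_only("2.1.0+cu118"))  # True
```

In trainer code, such a guard would wrap each `torch.load` call, falling back to the unguarded call (and its security caveats) on older torch versions.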
You're correct @hjenryin, we're taking a look at fixing this before the next release. Thanks for the report.