
update FSDP save and load logic #24249

Merged 3 commits on Jun 13, 2023

Conversation

pacman100 (Contributor)

What does this PR do?

  1. Should be merged after the accelerate PR "FSDP updates" (accelerate#1576).
  2. Updates the saving and loading utils for FSDP to be in sync with the latest PyTorch release.

@pacman100 changed the title from "Smangrul/update fsdp" to "update FSDP save and load logic" on Jun 13, 2023
HuggingFaceDocBuilderDev commented Jun 13, 2023

The documentation is not available anymore as the PR was closed or merged.

@pacman100 marked this pull request as ready for review on June 13, 2023 at 19:11
sgugger (Collaborator) left a comment

Thanks!

@pacman100 merged commit b89fccc into main on Jun 13, 2023
@pacman100 deleted the smangrul/update-fsdp branch on June 13, 2023 at 19:19
Comment on lines +2107 to +2108
load_fsdp_model(
    self.accelerator.state.fsdp_plugin, self.accelerator, model, self.state.best_model_checkpoint
)
A contributor left a comment:

This will require folks to use yet another higher version of Accelerate; otherwise, if they pin Accelerate but not Transformers, they'll get a confusing "load_fsdp_model not found" error. When accelerate 0.21.0 is out, we should either pin to it as the new minimum or add an explicit error/check here, so this naming change doesn't surface as a confusing failure.

I know FSDP is considered experimental to a degree, so I trust your best judgement on whether action is needed and what action to take :)
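The explicit check suggested above could look roughly like the following. This is a minimal sketch, not the check transformers actually ships: the 0.21.0 minimum, the function names, and the hand-rolled version parsing are all assumptions for illustration.

```python
# Sketch of an explicit accelerate version guard, so a too-old pinned
# accelerate fails with a readable message instead of a confusing
# "load_fsdp_model not found" error. MIN_ACCELERATE and the helper names
# are hypothetical, chosen for this example only.
import importlib.metadata

MIN_ACCELERATE = (0, 21, 0)  # assumed minimum; adjust to the real pin


def parse_version(raw: str) -> tuple:
    """Turn 'X.Y.Z' (ignoring suffixes like '.dev0') into an int tuple."""
    parts = []
    for piece in raw.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def require_min_accelerate() -> None:
    """Raise a clear ImportError if accelerate is missing or too old."""
    try:
        raw = importlib.metadata.version("accelerate")
    except importlib.metadata.PackageNotFoundError:
        raise ImportError("FSDP save/load support requires the accelerate package")
    if parse_version(raw) < MIN_ACCELERATE:
        minimum = ".".join(map(str, MIN_ACCELERATE))
        raise ImportError(
            f"FSDP save/load utilities need accelerate>={minimum}, found {raw}; "
            "please upgrade accelerate"
        )
```

A guard like this would run once before the first call into the renamed accelerate utilities, turning a missing-attribute failure into an actionable upgrade message.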

novice03 pushed a commit to novice03/transformers referencing this pull request on Jun 23, 2023:
* update fsdp save and load logic

* fix

* see if this resolves the failing tests
4 participants