Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State dictionary retrieval from offloaded modules #2619

Merged
merged 7 commits into from
Jun 3, 2024

Conversation

blbadger
Copy link
Contributor

@blbadger blbadger commented Apr 3, 2024

What does this PR do?

Adds a function to retrieve a state dictionary from an offloaded module, loading the parameters into a specified device.

See transformers PR #27412 for more information

Working on the test too:)

TODO: replace disk offload conditional check with expand_device_map

@SunMarc

@blbadger blbadger changed the title Offload state Add state dictionary retrieval from offloaded modules Apr 4, 2024
@blbadger blbadger changed the title Add state dictionary retrieval from offloaded modules State dictionary retrieval from offloaded modules Apr 4, 2024
@muellerzr muellerzr requested a review from SunMarc April 4, 2024 12:44
Copy link
Member

@SunMarc SunMarc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work @blbadger ! Let's add a test and try it with the transformers PR ! After we manage to make it work on accelerate and transformers, we can merge this! I left a few comments

@blbadger
Copy link
Contributor Author

blbadger commented Apr 11, 2024

Thanks for the review @SunMarc ! I have implemented your suggestions and wrote the test, and refactored the transformers code to use this method.

I have written the execution device to switch to the cpu by default, although if you think it would be a better idea to keep this as a gpu (if available) I would be happy to change. The transformers test is a little more straightforward if the cpu is default, although I know that fitting code for a test is not a great design consideration:)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@blbadger
Copy link
Contributor Author

blbadger commented May 1, 2024

Just a heads up, I got around to integrating this module with transformers PR #27412 and testing looks good for large models there too.

Copy link

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Copy link
Member

@SunMarc SunMarc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome @blbadger ! I will merge this asap after we merge the PR that fixes the CI ! I've tested this PR with transformers and it works great !

@SunMarc SunMarc merged commit d5d378d into huggingface:main Jun 3, 2024
18 of 23 checks passed
@SunMarc SunMarc mentioned this pull request Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants