Fix CPU offload + disk offload tests #27204

Merged on Nov 1, 2023 (1 commit)

LysandreJik (Member) commented Nov 1, 2023

Switching to safetensors serialization by default highlighted a few issues that we have with safetensors.

This PR fixes those issues, which are principally linked to weight sharing.
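
To make the weight-sharing point concrete, here is a minimal sketch of how tied entries in a state dict could be detected. It is not the code from this PR or from safetensors: `find_shared_entries` is a hypothetical helper, the parameter names are illustrative, and plain `bytearray` objects stand in for tensors so the example runs without any dependencies. With real tensors the comparison would be made on the underlying storage (for example via `tensor.data_ptr()` in PyTorch) rather than on `id`.

```python
from collections import defaultdict


def find_shared_entries(state_dict):
    """Group state-dict names that refer to the very same buffer object.

    Hypothetical helper for illustration only; bytearrays stand in for tensors.
    """
    names_by_buffer = defaultdict(list)
    for name, buffer in state_dict.items():
        # Two names are "tied" when they point at one and the same object.
        names_by_buffer[id(buffer)].append(name)
    return [names for names in names_by_buffer.values() if len(names) > 1]


# One embedding buffer reused under several (illustrative) names.
embedding = bytearray(16)
state_dict = {
    "shared.weight": embedding,
    "encoder.embed_tokens.weight": embedding,
    "lm_head.weight": embedding,
    "final_logits_bias": bytearray(4),
}

shared_groups = find_shared_entries(state_dict)
print(shared_groups)
```

A serializer that writes each name as an independent entry has to decide what to do with such a group: keep one name and re-tie the others at load time, or refuse to save. That decision is where sharing-related failures tend to surface.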

HuggingFaceDocBuilderDev commented Nov 1, 2023

The documentation is not available anymore as the PR was closed or merged.

LysandreJik force-pushed the fix-safetensors-default-slow-failing-tests branch from e7651c9 to fed5e54 on November 1, 2023 12:32
LysandreJik marked this pull request as ready for review on November 1, 2023 12:32
LysandreJik (Member, Author) commented

@amyeroberts @patrickvonplaten if you feel uneasy with merging this right before the release, I'm fine with reverting the safetensors serialization by default to let it sit on main for a while longer. The release is going to be very packed already so it's fine for me.

amyeroberts (Collaborator) left a comment

Thanks for finding the fix so quickly!

amyeroberts (Collaborator) commented

@LysandreJik The change LGTM and seems to address some underlying issues. Re default safetensors serialization, I'm happy for it to be part of this release as long as some of the slow tests on the most popular models (bert, llama, wav2vec2, whisper, clip etc.) are good.

@@ -1125,6 +1125,11 @@ def __init__(self, config: BartConfig):
         # Initialize weights and apply final processing
         self.post_init()

+    def _tie_weights(self):

@@ -4520,7 +4520,9 @@ def expand_device_map(device_map, param_names):
     """
     new_device_map = {}
     for module, device in device_map.items():
-        new_device_map.update({p: device for p in param_names if p == module or p.startswith(f"{module}.")})
+        new_device_map.update(
Review comment (Contributor)

Nice catch!
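
For readers following the `expand_device_map` hunk, the sketch below reproduces the logic of the removed line and shows how it behaves on a small input. The `return` statement and the sample parameter names are assumptions, since the hunk does not show them. The replacement line is truncated on this page, so `expand_device_map_with_root` is only a hypothetical variant illustrating one way a root-level key `""` could be handled; it is not presented as the actual change.

```python
def expand_device_map(device_map, param_names):
    """Map each parameter name to the device of its owning module (logic of the removed line)."""
    new_device_map = {}
    for module, device in device_map.items():
        new_device_map.update({p: device for p in param_names if p == module or p.startswith(f"{module}.")})
    return new_device_map  # assumed: the return is not visible in the hunk


def expand_device_map_with_root(device_map, param_names):
    """Hypothetical variant: additionally treat the key "" as matching every parameter."""
    new_device_map = {}
    for module, device in device_map.items():
        new_device_map.update(
            {p: device for p in param_names if module == "" or p == module or p.startswith(f"{module}.")}
        )
    return new_device_map


# Illustrative parameter names, not taken from a real checkpoint.
param_names = ["encoder.layer.weight", "encoder.layer.bias", "lm_head.weight"]

per_module = expand_device_map({"encoder": "cpu", "lm_head": "disk"}, param_names)
root_only = expand_device_map({"": "cpu"}, param_names)
root_fixed = expand_device_map_with_root({"": "cpu"}, param_names)

print(per_module)
print(root_only)
print(root_fixed)
```

With the removed line's condition, a device map whose only key is `""` expands to an empty dictionary, because no parameter name equals `""` or starts with `"."`. The hypothetical variant maps every parameter to the root device instead.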

patrickvonplaten (Contributor) left a comment

Clean!

LysandreJik (Member, Author) commented

Thanks both for your reviews! I'll go ahead and merge this, sorry but you'll have the conflict Patrick 😁

LysandreJik merged commit 95020f2 into main on Nov 1, 2023
19 of 22 checks passed
LysandreJik deleted the fix-safetensors-default-slow-failing-tests branch on November 1, 2023 18:25
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
Fix disk offload tests + weight sharing issues

4 participants