Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mmap option to load_state_dict #28331

Merged
merged 1 commit into from
Jan 10, 2024
Merged

Conversation

weimingzha0
Copy link
Contributor

@weimingzha0 weimingzha0 commented Jan 3, 2024

Use torch.load(mmap=True) to accelerate checkpoint loading

#28332

cc @SunMarc @sgugger

weimingzha0 pushed a commit to weimingzha0/transformers that referenced this pull request Jan 3, 2024
weimingzha0 pushed a commit to weimingzha0/transformers that referenced this pull request Jan 4, 2024
weimingzha0 pushed a commit to weimingzha0/transformers that referenced this pull request Jan 4, 2024
Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

torch bin files are now deprecated in favor of safetensors but no harm in improving this!
Coul you add a test as well in test_modelling_common? 🤗


return torch.load(checkpoint_file, map_location=map_location, weights_only=True)
extra_args = {}
# mmap can only be used with files serialized with zipfile-based format.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how often does this come up? (zipfile-based format)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zip file is used often as it's the default for torch.save (https://pytorch.org/docs/stable/generated/torch.save.html).

weimingzha0 pushed a commit to weimingzha0/transformers that referenced this pull request Jan 4, 2024
@weimingzha0
Copy link
Contributor Author

torch bin files are now deprecated in favor of safetensors but no harm in improving this! Coul you add a test as well in test_modelling_common? 🤗

Sure. Please let me know if my test case is correct.

weimingzha0 pushed a commit to weimingzha0/transformers that referenced this pull request Jan 4, 2024
Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! My last query would be to make sure this works for the deepspeed / sdpa case (device_map = "meta")!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@weimingzha0
Copy link
Contributor Author

Thanks! My last query would be to make sure this works for the deepspeed / sdpa case (device_map = "meta")!

Do you mean add a test case for device_map == "meta' ?

@ArthurZucker
Copy link
Collaborator

That would be a good way of making sure this will be fine with DeepSpeedZero (the if branch) 🤗

weimingzha0 pushed a commit to weimingzha0/transformers that referenced this pull request Jan 8, 2024
@weimingzha0
Copy link
Contributor Author

That would be a good way of making sure this will be fine with DeepSpeedZero (the if branch) 🤗

Currently none of existing tests covers the branch:
I even added an assert in the "if branch" and all tests still passed except for one irrelevant error (see #28401)

Anyway, I added a guard for "meta" device.

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, thanks for iterating

@ArthurZucker ArthurZucker merged commit 701298d into huggingface:main Jan 10, 2024
21 checks passed
@weimingzha0 weimingzha0 deleted the mmap branch January 10, 2024 18:00
staghado pushed a commit to staghado/transformers that referenced this pull request Jan 15, 2024
MadElf1337 pushed a commit to MadElf1337/transformers that referenced this pull request Jan 15, 2024
wgifford pushed a commit to wgifford/transformers that referenced this pull request Jan 21, 2024
AjayP13 pushed a commit to AjayP13/transformers that referenced this pull request Jan 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants