-
Notifications
You must be signed in to change notification settings - Fork 967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use shard saving from huggingface_hub #2795
Use shard saving from huggingface_hub #2795
Conversation
for filename, tensors in state_dict_split.filename_to_tensors.items(): | ||
shard = {tensor: state_dict[tensor] for tensor in tensors} | ||
self.save(shard, os.path.join(save_directory, filename), safe_serialization=safe_serialization) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be a done in a threaded manner or not really so as to preserve readability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have a look !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why threading may matter here, this only happens on the main accelerate process on purpose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating us to use the 🤗 Hub!
This looks great! Made some comments (in others) on deprecation and some general advice. Otherwise looks great (once tests pass)
Haven't tried locally but is Otherwise, if
It's merged. I can make a patch release on Monday :) |
@SunMarc Patch release |
Yes, we still need that. I think it will be better to include this is a real
Sounds good ! However, it think that if you have a function that clean the state_dict just like here, we don't even need to make it public since we only use that function when cleaning the state dict + splitting |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Thanks for integrating this :)
Merging now that release has been cut (so this is not included while we test!) |
What does this PR do?
This PR replaces the shard saving utility function to use the one from huggingface hub. The goal is to make it easier to maintain this piece of code that it used across many libraries such as transformers, peft and diffusers.
TODO:
id_tensor_storage
from huggingface_hub or not (it is calledget_storage_id
there). We use it inclean_state_dict_for_safetensors
to preprocess the state_dict before sharding it.Edit: after some thoughts, it is better to just use the
get_storage_id
from huggingface_hub. I will use it when it becomes public.This is POC for now. I need to test it a bit and also see if it is doable to perform the change in other libraries.
cc @Wauplin @sayakpaul