[bnb] Small improvements on utils #18646
Conversation
- replace `modules_to_not_convert` by `module_to_not_convert`
The documentation is not available anymore as the PR was closed or merged.
Can confirm the tests pass!
So will there always be just one module not to convert? Wouldn't it be safer to have `modules` instead and work with a list?
I have proposed a small refactoring that includes:
- changed variable names
- now outputs a list
- changed the error message
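For reference, a minimal sketch of what a list-returning helper could look like (the name `get_keys_not_to_convert` comes from the discussion below; the body is illustrative, not the merged implementation, and assumes a `transformers` `PreTrainedModel` exposing `get_output_embeddings()`):

```python
def get_keys_not_to_convert(model) -> list:
    """Sketch only: return a *list* of module names to keep in native
    precision (e.g. the output embeddings), rather than a single string,
    so several modules can be skipped during int8 conversion."""
    keys_to_keep = []
    output_embeddings = model.get_output_embeddings()
    if output_embeddings is not None:
        for name, module in model.named_modules():
            if module is output_embeddings:
                # keep only the attribute name, e.g. "lm_head"
                keys_to_keep.append(name.split(".")[-1])
    return keys_to_keep
```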
The bnb slow tests are passing with this fix!
Following #18660, I also just added a commit to support having a custom list of keys to ignore.
Thanks for working on this, I left some comments.
src/transformers/modeling_utils.py (Outdated)
@@ -1839,6 +1842,7 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P
     offload_state_dict = kwargs.pop("offload_state_dict", False)
     load_in_8bit = kwargs.pop("load_in_8bit", False)
     int8_threshold = kwargs.pop("int8_threshold", 6.0)
+    no_load_in_8bit_modules = kwargs.pop("no_load_in_8bit_modules", None)
Would it make more sense to have this be a class variable of `PreTrainedModel` (like the `no_split` variable used for big model inference)? I'm afraid the user won't know what to set this to, and it looks like something we should handle automatically.
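Something along these lines, mirroring the `no_split` pattern (the attribute name here is hypothetical, just to show the shape of the suggestion):

```python
from transformers import PreTrainedModel

class SomeModel(PreTrainedModel):
    # Hypothetical: each architecture would declare which modules must
    # not be converted to int8, so the user never has to pass anything
    # at load time.
    no_load_in_8bit_modules = ["lm_head"]
```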
I don't have a strong opinion on that, but this argument is optional because the function `get_keys_not_to_convert` should automatically take care of it, except for some models like Jukebox where it is a bit trickier due to the architecture. In that case the user will just have to manually set which modules should be kept in their native precision and specify them in the kwargs, so I feel this is a bit easier than having it as an attribute of `PreTrainedModel`, because you would then need to open a PR to add the feature for a new model.
Co-authored-by: stas00 <stas00@users.noreply.github.com>
Still good for me. I'll let @stas00 have a second look since merging is blocked by his change request.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Thank you for addressing the suggestions, @younesbelkada!
Can confirm the slow tests pass after rebasing on …
* Small replacement
  - replace `modules_to_not_convert` by `module_to_not_convert`
* refactor a bit
  - changed variables name
  - now output a list
  - change error message
* make style
* add list
* make style
* change args name

Co-authored-by: stas00 <stas00@users.noreply.github.com>

* fix comment
* fix typo

Co-authored-by: stas00 <stas00@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: stas00 <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
What does this PR do?
Fixes a small typo in `bitsandbytes.py`; should address huggingface/blog#463 (comment). I will have to test it first and mark it as ready for review!
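For context, a minimal sketch of the kind of utility being touched here: recursively swapping `nn.Linear` layers for `bnb.nn.Linear8bitLt` while skipping every module whose name appears in `modules_to_not_convert` (a list after this PR's refactor). This is illustrative, not the verbatim library code:

```python
import bitsandbytes as bnb
import torch.nn as nn

def replace_8bit_linear(model, threshold=6.0, modules_to_not_convert=None):
    """Swap eligible nn.Linear layers for int8 equivalents, leaving the
    listed modules (e.g. the output head) in their native precision."""
    modules_to_not_convert = modules_to_not_convert or ["lm_head"]
    for name, module in model.named_children():
        if list(module.children()):
            # recurse into container modules first
            replace_8bit_linear(module, threshold, modules_to_not_convert)
        if isinstance(module, nn.Linear) and name not in modules_to_not_convert:
            setattr(
                model,
                name,
                bnb.nn.Linear8bitLt(
                    module.in_features,
                    module.out_features,
                    module.bias is not None,
                    has_fp16_weights=False,
                    threshold=threshold,
                ),
            )
    return model
```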