[T5] fix fp16 loading issue #20878
Conversation
The documentation is not available anymore as the PR was closed or merged.
8bf5c89 to 43006f0
Thanks for fixing! LGTM with just one nit.
src/transformers/modeling_utils.py
Outdated

force_upcast_dtype = torch.float32

# For backward compatibility with older versions of `accelerate`
if set_module_tensor_to_device.__code__.co_argcount == 5:
Slight nit: can we check the signature and parameter names using `inspect`? It would be clearer to read. Also add a TODO that this should become a version check at the next version of Accelerate (I will take care of it after the next release).
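For illustration, a minimal sketch of what such an `inspect`-based check could look like (the helper name `set_tensor_compat` is hypothetical and the unconditional fp32 upcast is only illustrative; this is not necessarily the exact code that was merged):

```python
# Sketch only: detect the new `dtype` parameter by name instead of counting
# positional arguments with `__code__.co_argcount`.
import inspect

import torch
from accelerate.utils import set_module_tensor_to_device


def set_tensor_compat(module, tensor_name, device, value):
    """Hypothetical helper showing the inspect-based dispatch (not the merged code)."""
    accepts_dtype = "dtype" in inspect.signature(set_module_tensor_to_device).parameters
    # TODO: replace this with a version check once the next `accelerate` release is out.
    if accepts_dtype:
        # Newer accelerate: request the upcast explicitly so fp32-only modules stay in fp32.
        set_module_tensor_to_device(module, tensor_name, device, value=value, dtype=torch.float32)
    else:
        # Older accelerate: no `dtype` argument available.
        set_module_tensor_to_device(module, tensor_name, device, value=value)
```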
Thanks! Should be addressed in 95486c3
- remove `force_upcast_dtype` as it is used once
- use `inspect`
- add `TODO`
* fix fp16 loading issue
* add backward compatibility
* better refactor
* better readability
  - remove `force_upcast_dtype` as it is used once
  - use `inspect`
  - add `TODO`
What does this PR do?
This PR mainly fixes https://github.com/huggingface/transformers/actions/runs/3754402958/jobs/6378652143
Since the PR huggingface/accelerate#920 has been merged, the fix proposed in #20760 seems to no longer work with the main branch of `accelerate` in some specific cases.
To reproduce (use the main branch of `accelerate`), see the sketch below:
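The original reproduction snippet is not visible in this view; a minimal sketch of the kind of call that exercises the issue, assuming a T5 checkpoint loaded in fp16 through the accelerate loading path (the checkpoint name and kwargs are illustrative):

```python
import torch
from transformers import T5ForConditionalGeneration

# Loading a T5 checkpoint in fp16 via the low_cpu_mem_usage (accelerate) path
# exercises the fp32-upcast logic added in #20760.
model = T5ForConditionalGeneration.from_pretrained(
    "t5-small",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# With the fix, modules listed in `_keep_in_fp32_modules` should end up in torch.float32.
```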
Why?
I believe this is because the aforementioned PR introduced a new argument `dtype` on the function `set_module_tensor_to_device`. If this argument is set to `None` (the default), the target value is automatically set to the `dtype` of the old tensor, which slightly breaks some assumptions made in #20760. A minimal sketch of that behavior follows.
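A minimal sketch of the `dtype=None` behavior described above, assuming the post-#920 signature `set_module_tensor_to_device(module, tensor_name, device, value=None, dtype=None)` (toy tensors, not actual T5 weights):

```python
import torch
from accelerate.utils import set_module_tensor_to_device

# Toy module whose weight is already fp16, as when a model is instantiated with torch_dtype=torch.float16.
layer = torch.nn.Linear(4, 4).half()

# Checkpoint value that #20760 wants to keep in fp32 for numerical stability.
fp32_value = torch.randn(4, 4, dtype=torch.float32)

# dtype=None (the default): the value is cast to the dtype of the old tensor, i.e. fp16 here.
set_module_tensor_to_device(layer, "weight", "cpu", value=fp32_value)
print(layer.weight.dtype)  # torch.float16

# Passing dtype explicitly keeps the parameter in fp32, which is what this PR relies on.
set_module_tensor_to_device(layer, "weight", "cpu", value=fp32_value, dtype=torch.float32)
print(layer.weight.dtype)  # torch.float32
```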
I believe upstreaming this change in `modeling_utils` by adding support for this new argument should be the fix. As some users might not use the latest version of `accelerate`, I added a small hack to make this change backward compatible, but I am not sure this is the best solution.
Tested this fix on the main branch of `accelerate` and on `accelerate==0.15.0`; all relevant tests pass.
cc @sgugger