[`Trainer`] Fix `.to` call on 4bit models #24444

younesbelkada · 2023-06-23T09:57:05Z

What does this PR do?

Currently, initializing the Trainer fails in some scenarios when using 4-bit models.
Device placement is correctly skipped for 8-bit models, but it also needs to be skipped for 4-bit models, because the `.to` operation is not supported for them either.
This PR adds a patch for that case.
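
To illustrate the idea, here is a minimal, self-contained sketch of the kind of guard described above. It is not the actual `Trainer` code: `FakeModel`, `should_place_model_on_device`, and `move_model_if_allowed` are hypothetical names used only for this example, and the only thing assumed from the real library is that quantized models expose an `is_quantized` attribute and raise a `ValueError` on `.to`.

```python
class FakeModel:
    """Hypothetical stand-in for a model; not a real transformers class."""

    def __init__(self, is_quantized):
        self.is_quantized = is_quantized
        self.device = "cpu"

    def to(self, device):
        # Quantized models (4-bit or 8-bit) reject `.to`, as in the error this PR addresses.
        if self.is_quantized:
            raise ValueError("`.to` is not supported for `4-bit` or `8-bit` models.")
        self.device = device
        return self


def should_place_model_on_device(model, place_model_on_device=True):
    """Return False for any quantized model, whether 4-bit or 8-bit."""
    if getattr(model, "is_quantized", False):
        return False
    return place_model_on_device


def move_model_if_allowed(model, device):
    """Call `.to(device)` only when device placement is allowed."""
    if should_place_model_on_device(model):
        model = model.to(device)
    return model
```

With this guard, a quantized model is returned untouched instead of raising, while a regular model is still moved to the requested device.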

cc @amyeroberts @lewtun

HuggingFaceDocBuilderDev · 2023-06-23T10:17:21Z

The documentation is not available anymore as the PR was closed or merged.

amyeroberts

LGTM - thanks for fixing!

General question: as there are two flags, `is_loaded_in_8bit` and `is_quantized`, are we guaranteed that all 8-bit (and 4-bit) quantized models have these flags correctly set?

younesbelkada · 2023-06-23T11:15:00Z

@amyeroberts thanks!
Absolutely yes, for reference, here is how we set that attribute:

transformers/src/transformers/modeling_utils.py

Line 2922 in ea91c2a

model.is_quantized = load_in_8bit or load_in_4bit
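
As a rough illustration of why a single `is_quantized` check covers both cases, the sketch below sets the flag the same way as the line quoted above. `set_quantization_flags` is a hypothetical helper for this example, and the `is_loaded_in_4bit` attribute is an assumption made here by analogy with `is_loaded_in_8bit`; only the `is_quantized` assignment is taken from the quoted line.

```python
from types import SimpleNamespace


def set_quantization_flags(model, load_in_8bit=False, load_in_4bit=False):
    """Hypothetical helper: attach quantization flags to a model-like object."""
    model.is_loaded_in_8bit = load_in_8bit
    # Assumed attribute name, by analogy with `is_loaded_in_8bit`.
    model.is_loaded_in_4bit = load_in_4bit
    # Same expression as the quoted line: True for either quantization mode.
    model.is_quantized = load_in_8bit or load_in_4bit
    return model


# A 4-bit model is not flagged as 8-bit, but it is flagged as quantized.
model_4bit = set_quantization_flags(SimpleNamespace(), load_in_4bit=True)
```

A check that looks only at `is_loaded_in_8bit` would miss `model_4bit`, whereas a check on `is_quantized` catches both 4-bit and 8-bit models.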

lewtun · 2023-06-23T11:34:27Z

I can confirm that this fix resolves the error I was hitting with 4-bit models:

Traceback (most recent call last):
  File "/fsx/lewis/git/h4/scripts/evaluation/run_rm_eval.py", line 275, in <module>
    main()
  File "/fsx/lewis/git/h4/scripts/evaluation/run_rm_eval.py", line 164, in main
    trainer = RewardTrainer(
  File "/fsx/lewis/git/h4/src/h4/training/trainer.py", line 26, in __init__
    super().__init__(*args, **kwargs)
  File "/fsx/lewis/miniconda/envs/h4/lib/python3.10/site-packages/transformers/trainer.py", line 506, in __init__
    self._move_model_to_device(model, args.device)
  File "/fsx/lewis/miniconda/envs/h4/lib/python3.10/site-packages/transformers/trainer.py", line 747, in _move_model_to_device
    model = model.to(device)
  File "/fsx/lewis/miniconda/envs/h4/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1889, in to
    raise ValueError(
ValueError: `.to` is not supported for `4-bit` or `8-bit` models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.

Thanks for the fast fix @younesbelkada !!

younesbelkada added 2 commits June 23, 2023 09:54

fix .to call on 4bit models

877783c

better check

b0ccdda

younesbelkada requested a review from amyeroberts June 23, 2023 10:00

amyeroberts approved these changes Jun 23, 2023


younesbelkada merged commit 468aed3 into huggingface:main Jun 23, 2023

younesbelkada deleted the fix-4bit-move branch June 23, 2023 11:35
