[DO NOT MERGE] Hf quantizer refactor #28703

younesbelkada · 2024-01-25T11:16:14Z

What does this PR do?

Built on top of #26610 - this PR is just to check that I don't get any surprising diff, similar to the one in poedator#4

HuggingFaceDocBuilderDev · 2024-01-26T10:09:14Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


younesbelkada · 2024-01-26T11:04:05Z

Thanks @poedator for your comments and ideas - I think the way forward would be to extend the xxxQuantizer by adding new arguments to the corresponding quantization config object, e.g. quantize_mlp_only. I feel that mixing different quantization approaches (your second point) might be a bit too much of an edge case, but contributors can always create a new quantizer for it, e.g. a MixedQuantizer with a MixedQuantizationConfig.
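To illustrate the suggestion above, here is a minimal standalone sketch of a quantizer whose behaviour is extended through a new config argument. It does not use the transformers API: `ToyQuantizationConfig`, `ToyQuantizer`, `modules_to_quantize`, and the `".mlp."` name-matching rule are all hypothetical names invented for this example, and `quantize_mlp_only` is only the flag floated in the comment, not an existing option.

```python
from dataclasses import dataclass


@dataclass
class ToyQuantizationConfig:
    """Hypothetical stand-in for a quantization config object."""

    bits: int = 4
    # Hypothetical flag from the comment above: restrict quantization to MLP layers.
    quantize_mlp_only: bool = False


class ToyQuantizer:
    """Hypothetical stand-in for an xxxQuantizer that reads its options from the config."""

    def __init__(self, config: ToyQuantizationConfig):
        self.config = config

    def modules_to_quantize(self, module_names):
        """Return the module names this quantizer would convert."""
        if self.config.quantize_mlp_only:
            # Assumption: MLP submodules can be recognised by ".mlp." in their name.
            return [name for name in module_names if ".mlp." in name]
        return list(module_names)


names = [
    "layers.0.self_attn.q_proj",
    "layers.0.mlp.up_proj",
    "layers.0.mlp.down_proj",
]

# Default config: every listed module is selected.
default_selection = ToyQuantizer(ToyQuantizationConfig()).modules_to_quantize(names)

# With the new flag set, only the MLP modules are selected.
mlp_selection = ToyQuantizer(
    ToyQuantizationConfig(quantize_mlp_only=True)
).modules_to_quantize(names)
```

The point of the design is that the quantizer class itself stays unchanged from the caller's perspective; new behaviour is opted into by setting a field on the config.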

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


src/transformers/modeling_utils.py

src/transformers/quantizers/auto.py

ArthurZucker

🔥 Great work.
Modeling utils refactor looks really great. My last comments are mostly about how we handle kwargs, keeping the way we init the quantizer as simple as possible, and whether we actually need an AutoQuantizationConfig vs just an AutoHfQuantizer, but otherwise good to go! 🤗
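As a rough illustration of the "single auto entry point" question raised above, the sketch below shows one dispatcher that accepts either a config object or a plain dict and picks the quantizer class from a mapping. It is not the code from this PR: `ToyConfig`, `ToyAwqQuantizer`, `ToyBnb4BitQuantizer`, `TOY_QUANTIZER_MAPPING`, `ToyAutoQuantizer`, and the `quant_method` keys are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical config carrying only the name of the quantization method."""

    quant_method: str


class ToyAwqQuantizer:
    def __init__(self, config: ToyConfig):
        self.config = config


class ToyBnb4BitQuantizer:
    def __init__(self, config: ToyConfig):
        self.config = config


# Hypothetical registry: method name -> quantizer class.
TOY_QUANTIZER_MAPPING = {
    "awq": ToyAwqQuantizer,
    "bitsandbytes_4bit": ToyBnb4BitQuantizer,
}


class ToyAutoQuantizer:
    """Single entry point that handles both config parsing and dispatch."""

    @classmethod
    def from_config(cls, config):
        # Accept a plain dict (e.g. loaded from a JSON file) as well as a config object.
        if isinstance(config, dict):
            config = ToyConfig(**config)
        quantizer_cls = TOY_QUANTIZER_MAPPING.get(config.quant_method)
        if quantizer_cls is None:
            raise ValueError(
                f"Unknown quantization method {config.quant_method!r}; "
                f"expected one of {sorted(TOY_QUANTIZER_MAPPING)}"
            )
        return quantizer_cls(config)


quantizer = ToyAutoQuantizer.from_config({"quant_method": "awq"})
```

In this toy version the dict-to-config step is folded into the dispatcher, which is one way a separate auto-config class could become unnecessary; whether that trade-off suits the real code base is exactly the open question in the review.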

src/transformers/modeling_utils.py

src/transformers/quantizers/auto.py

src/transformers/quantizers/base.py

src/transformers/quantizers/quantizer_awq.py

src/transformers/quantizers/quantizer_bnb_4bit.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

younesbelkada · 2024-01-30T00:27:21Z

I will merge the commits of this PR directly into #26610 to properly credit @poedator for his great work! Closing this - thanks @ArthurZucker for the review and offline discussions!

poedator added 30 commits January 16, 2024 13:10

squashed earlier commits for easier rebase

e0650b2

rm rebase leftovers

42adf9d

4bit save enabled @quantizers

7f57f26

TMP gptq test use exllama

f1f5da0

fix AwqConfigTest::test_wrong_backend for A100

a94d3a7

quantizers AWQ fixes

0b30de4

_load_pretrained_model low_cpu_mem_usage branch

4cdaf0d

quantizers style

0db1107

remove require_low_cpu_mem_usage attr

89d1177

rm dtype arg from process_model_before_weight_loading

0c71b00

rm config_origin from Q-config

2b4122a

rm inspect from q_config

02ad562

fixed docstrings in QuantizationConfigParser

3e51d51

logger.warning fix

2569367

mv is_loaded_in_4(8)bit to BnbHFQuantizer

3259243

is_accelerate_available error msg fix in quantizer

ab61417

split is_model_trainable in bnb quantizer class

95e44cd

rm llm_int8_skip_modules as separate var in Q

b936cfb

Q rm todo

0b40d21

fwd ref to HFQuantizer in type hint

c53a3fb

rm note re optimum.gptq.GPTQQuantizer

dbd93f2

quantization_config in __init__ simplified

e34bd58

replaced NonImplemented with create_quantized_param

fcd5a7a

rm load_in_4/8_bit deprecation warning

954c5e6

QuantizationConfigParser refactoring

49e163f

awq-related minor changes

f8b9e07

awq-related changes

5eaf9ac

awq config.modules_to_not_convert

d678d99

raise error if no q-method in q-config in args

7c9c49b

minor cleanup

0d739d3

younesbelkada added 4 commits January 26, 2024 10:44

address comments

f0b5f96

fix

30e1fc2

Merge branch 'hf-quantizer-work' of https://github.com/younesbelkada/…

3b7e625

…transformers into hf-quantizer-work

fixup

493d117

younesbelkada requested review from ArthurZucker January 26, 2024 11:00

younesbelkada and others added 5 commits January 26, 2024 15:10

Update src/transformers/modeling_utils.py

48c5761

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Update src/transformers/modeling_utils.py

3744fb1

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

address final comment

c4995ab

Merge branch 'hf-quantizer-work' of https://github.com/younesbelkada/…

17f95bf

…transformers into hf-quantizer-work

update

abb4db3

ArthurZucker reviewed Jan 26, 2024


src/transformers/modeling_utils.py (resolved)

ArthurZucker reviewed Jan 26, 2024


src/transformers/quantizers/auto.py (resolved)

src/transformers/quantizers/auto.py (resolved)

ArthurZucker approved these changes Jan 26, 2024


younesbelkada and others added 3 commits January 26, 2024 19:19

Update src/transformers/quantizers/base.py

7e5a5b8

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Update src/transformers/quantizers/auto.py

122b494

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

fix

901ace5

younesbelkada mentioned this pull request Jan 29, 2024

support 2bit quip# method huggingface/peft#1293

Closed

younesbelkada and others added 7 commits January 29, 2024 23:59

Merge remote-tracking branch 'upstream/main' into hf-quantizer-work

2da5233

add kwargs update

2ab7fd5

Merge remote-tracking branch 'upstream/main' into HEAD

242682c

Merge branch 'quant' into hf-quantizer-work

e387f68

fixup

4c0c33e

add optimum_quantizer attribute

c37b222

oops

ca40b04

younesbelkada closed this Jan 30, 2024

younesbelkada mentioned this pull request Jan 30, 2024

HfQuantizer class for quantization-related stuff in modeling_utils.py #26610

Merged
