Enables CPU AWQ model with IPEX version. #33460
Conversation
Hi @SunMarc. Do you mind reviewing this PR? It enables the AutoAWQ CPU path. Thanks!
Hi @SunMarc @ArthurZucker. This PR is ready to be reviewed. Thanks!
Thanks for this addition @jiqing-feng ! Left a few comments
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Hi @SunMarc. I have addressed all your comments; please take another look. Thanks!
Nice ! Thanks for iterating @jiqing-feng ! Just a small nit
src/transformers/integrations/awq.py (Outdated)

    _fuse_awq_mlp(model, name, modules_to_fuse["mlp"], module, QuantFusedMLP)
    # Replace MLP layers if awq version is not ipex.
    if quantization_config.version != "ipex":
        logger.info("The IPEX version AWQ does not support fuse mlp for now.")
To put in the else condition
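The suggestion above can be sketched as follows. This is a hypothetical, self-contained illustration of moving the log call into the `else` branch, so the "not supported" message is only emitted on the IPEX path; `fuse` and `log` stand in for `_fuse_awq_mlp` and `logger.info` from the real module.

```python
def fuse_mlp_if_supported(version, fuse, log):
    """Sketch of the reviewed control flow: fuse MLP layers only on
    non-IPEX backends, otherwise log that fusion is unsupported."""
    if version != "ipex":
        # Replace MLP layers when the AWQ backend is not IPEX.
        fuse()
    else:
        # IPEX backend: fused MLP is not supported yet, so only log.
        log("The IPEX version of AWQ does not support fused MLP for now.")
```

With this shape, the informational message can no longer fire on the CUDA path, which is what the review comment asks for.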
    if not torch.cuda.is_available():
        raise RuntimeError("GPU is required to run AWQ quantized model.")
You can also mention in the error message that the user can try IPEX if they have an Intel CPU.
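A minimal sketch of that suggestion, as a standalone function: `has_cuda` and `awq_version` stand in for `torch.cuda.is_available()` and `quantization_config.version` (hypothetical names, not the PR's actual helper).

```python
def check_awq_device(has_cuda: bool, awq_version: str) -> None:
    """Raise unless a GPU is available or the IPEX CPU backend is selected."""
    if not has_cuda and awq_version != "ipex":
        raise RuntimeError(
            "GPU is required to run the AWQ quantized model. You can also "
            'try the IPEX version ("ipex") if you have an Intel CPU.'
        )
```

The only behavioral change from the original check is that `version == "ipex"` is accepted on CPU, and the error text points CPU users at the IPEX backend.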
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @SunMarc. I have fixed the log; it's ready to merge now. Thanks!
Clean 🧼 thanks 🤗
* enable cpu awq ipex linear
* add doc for cpu awq with ipex kernel
* add tests for cpu awq
* fix code style
* fix doc and tests
* Update docs/source/en/quantization/awq.md
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/autoawq/test_awq.py
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix comments
* fix log
* fix log
* fix style

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
This PR enables running AWQ-quantized models on CPU through the IPEX (Intel Extension for PyTorch) backend, selected with AWQ version "ipex".
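A hedged usage sketch, assuming the `AwqConfig(version="ipex")` option this PR introduces; the checkpoint name is a placeholder for any AWQ-quantized model, and running this requires `autoawq` and `intel-extension-for-pytorch` to be installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, AwqConfig

# Select the IPEX kernel so the AWQ model runs on an Intel CPU
# instead of requiring a GPU (assumption: version="ipex" per this PR).
quantization_config = AwqConfig(version="ipex")

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-awq-model",  # placeholder AWQ checkpoint
    quantization_config=quantization_config,
    device_map="cpu",
)
tokenizer = AutoTokenizer.from_pretrained("some-org/some-awq-model")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Without `version="ipex"`, loading on a CPU-only machine would hit the GPU-required error discussed in the review above.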