Enables CPU AWQ model with IPEX version. #33460
Conversation
Hi @SunMarc. Do you mind reviewing this PR? It enables the AutoAWQ CPU path. Thanks!
Hi @SunMarc @ArthurZucker. This PR is ready to be reviewed. Thanks!
Thanks for this addition @jiqing-feng ! Left a few comments
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Hi @SunMarc. I have addressed all your comments; please take another look. Thanks!
Nice ! Thanks for iterating @jiqing-feng ! Just a small nit
src/transformers/integrations/awq.py (Outdated)

    _fuse_awq_mlp(model, name, modules_to_fuse["mlp"], module, QuantFusedMLP)
    # Replace MLP layers if awq version is not ipex.
    if quantization_config.version != "ipex":
        logger.info("The IPEX version AWQ does not support fuse mlp for now.")
To put in the else condition
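The suggestion above can be sketched as follows. This is a hypothetical, self-contained illustration of moving the log call into the `else` branch, so the "not supported" message is only emitted on the IPEX path; `fuse` and `log` stand in for `_fuse_awq_mlp` and `logger.info` from the real module.

```python
def fuse_mlp_if_supported(version, fuse, log):
    """Sketch of the reviewed control flow: fuse MLP layers only on
    non-IPEX backends, otherwise log that fusion is unsupported."""
    if version != "ipex":
        # Replace MLP layers when the AWQ backend is not IPEX.
        fuse()
    else:
        # IPEX backend: fused MLP is not supported yet, so only log.
        log("The IPEX version of AWQ does not support fused MLP for now.")
```

With this shape, the informational message can no longer fire on the CUDA path, which is what the review comment asks for.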
    if not torch.cuda.is_available():
        raise RuntimeError("GPU is required to run AWQ quantized model.")
You can also mention in the error message that the user can try IPEX if they have an Intel CPU.
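A minimal sketch of that suggestion, as a standalone function: `has_cuda` and `awq_version` stand in for `torch.cuda.is_available()` and `quantization_config.version` (hypothetical names, not the PR's actual helper).

```python
def check_awq_device(has_cuda: bool, awq_version: str) -> None:
    """Raise unless a GPU is available or the IPEX CPU backend is selected."""
    if not has_cuda and awq_version != "ipex":
        raise RuntimeError(
            "GPU is required to run the AWQ quantized model. You can also "
            'try the IPEX version ("ipex") if you have an Intel CPU.'
        )
```

The only behavioral change from the original check is that `version == "ipex"` is accepted on CPU, and the error text points CPU users at the IPEX backend.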
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @SunMarc. I have fixed the log; it's ready to merge now. Thanks!
Clean 🧼 thanks 🤗
* enable cpu awq ipex linear
* add doc for cpu awq with ipex kernel
* add tests for cpu awq
* fix code style
* fix doc and tests
* Update docs/source/en/quantization/awq.md
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update tests/quantization/autoawq/test_awq.py
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix comments
* fix log
* fix log
* fix style

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
This PR enables running AWQ-quantized models on CPU through the IPEX (Intel Extension for PyTorch) backend, selected with AWQ version "ipex".
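A hedged usage sketch, assuming the `AwqConfig(version="ipex")` option this PR introduces; the checkpoint name is a placeholder for any AWQ-quantized model, and running this requires `autoawq` and `intel-extension-for-pytorch` to be installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, AwqConfig

# Select the IPEX kernel so the AWQ model runs on an Intel CPU
# instead of requiring a GPU (assumption: version="ipex" per this PR).
quantization_config = AwqConfig(version="ipex")

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-awq-model",  # placeholder AWQ checkpoint
    quantization_config=quantization_config,
    device_map="cpu",
)
tokenizer = AutoTokenizer.from_pretrained("some-org/some-awq-model")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Without `version="ipex"`, loading on a CPU-only machine would hit the GPU-required error discussed in the review above.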