When will ZeroQuant inference be released? #2326

Closed
xk503775229 opened this issue Sep 15, 2022 · 4 comments

xk503775229 commented Sep 15, 2022

Hi,

The engine of ZeroQuant inference is not released yet. The code example in DeepSpeed-Example is only to help verify the accuracy of ZeroQuant.

The kernel/engine release is on our calendar, and we are actively working on making it compatible with various models. Please stay tuned.

For LKD, we will also release it soon.

For the last question, the code for training or accuracy testing is different from the final inference engine. Here, everything is simulated, so we can do quantization-aware training or other things.

Originally posted by @yaozhewei in #2207 (comment)

Hi, when will ZeroQuant inference (for GPT models) be released?
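
For readers unfamiliar with the term, the "simulated" quantization mentioned in the quoted reply usually means rounding values onto an integer grid and immediately mapping them back to floating point, so accuracy can be measured without any low-precision kernels. The sketch below illustrates that idea with a generic per-tensor symmetric scheme in plain Python. It is an assumption for illustration only: `fake_quantize` is a hypothetical name, not a DeepSpeed API, and this is not necessarily the scheme ZeroQuant uses.

```python
def fake_quantize(values, num_bits=8):
    """Simulate symmetric quantization: snap floats to an integer grid,
    then convert back to floats. No integer kernels are involved."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)                   # nothing to scale
    scale = max_abs / qmax                    # float step between grid points
    simulated = []
    for v in values:
        q = round(v / scale)                  # nearest integer level
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        simulated.append(q * scale)           # dequantize back to float
    return simulated


# Hypothetical weights, chosen only to show the rounding error.
weights = [0.5, -1.27, 0.013, 0.9]
simulated = fake_quantize(weights)
errors = [abs(w - s) for w, s in zip(weights, simulated)]
```

Because the result is still ordinary floats, this kind of code can check accuracy but gives no speed or memory benefit at inference time, which is why a separate kernel/engine release is needed.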


david-macleod commented Oct 20, 2022

Any updates on this? Thanks.

yaozhewei (Contributor) commented

Reza wraps up #2217, which answers some of your questions, such as the model size reduction. Regarding the kernels, we are working on a plan to release them soon so that you can give them a try.
Thanks,

yaozhewei self-assigned this Nov 4, 2022

shhn1 commented Apr 17, 2023

Any updates on this? Thanks @yaozhewei

loadams (Contributor) commented Aug 14, 2023

Related PRs merged, closing this for now.

loadams closed this as completed Aug 14, 2023