Hi, when will the ZeroQuant inference be released? #2326

xk503775229 · 2022-09-15T13:36:38Z

Hi,

The engine of ZeroQuant inference is not released yet. The code example in DeepSpeed-Example is only to help verify the accuracy of ZeroQuant.

The kernel/engine release is on our calendar, and we are actively working to make it compatible with various models. Please stay tuned.

For LKD, we will also release it soon.

For the last question, the code for training or accuracy testing differs from the final inference engine. Here, everything is simulated, so we can do quantization-aware training or other things.
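To illustrate what "simulated" means here, the following is a minimal sketch of fake quantization in plain Python: values are rounded to an integer grid and immediately mapped back to floats, so accuracy can be measured without any low-precision kernel. It is not ZeroQuant's code. The name `fake_quantize` is hypothetical, and the symmetric per-tensor 8-bit scheme is an assumption chosen for brevity.

```python
def fake_quantize(values, num_bits=8):
    """Simulate symmetric per-tensor quantization (assumed scheme, for illustration).

    Each value is rounded to a signed integer grid and then scaled back to a
    float, so the result carries the quantization error but stays in floating point.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)                 # nothing to scale
    scale = max_abs / qmax                  # float step between integer levels
    simulated = []
    for v in values:
        q = round(v / scale)                # nearest integer level
        q = max(-qmax, min(qmax, q))        # clamp to the representable range
        simulated.append(q * scale)         # dequantize back to float
    return simulated


weights = [0.5, -1.27, 0.003, 1.0]
simulated = fake_quantize(weights)
```

Because the output is still ordinary floats, it can be fed through unmodified model code for accuracy testing. A real inference engine would instead keep the integer values and run dedicated low-precision kernels, which is the part this thread is asking about.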

Originally posted by @yaozhewei in #2207 (comment)

Hi, when will ZeroQuant inference (for GPT models) be released?

david-macleod · 2022-10-20T18:51:55Z

Any updates on this? Thanks.

yaozhewei · 2022-11-02T01:50:27Z

Reza wrapped this up in #2217, which answers part of your question, such as the model size reduction. Regarding the kernels, we are working on a plan to release them soon so that you can give them a try.
Thanks,

shhn1 · 2023-04-17T09:23:24Z

Any updates on this? Thanks @yaozhewei

loadams · 2023-08-14T20:06:46Z

Related PRs merged, closing this for now.

yaozhewei self-assigned this Nov 4, 2022

loadams closed this as completed Aug 14, 2023
