smooth-quant with tp=2, and build llama 7b with tp=2, pp=2 failed #267
Comments
Hi, I have reproduced this issue. I will figure out whether it is a bug or a missing implementation.
Hi, we will fix this bug in an upcoming push. For a quick fix: … Besides, I think …
@Tracin if this patch is applied, there is a build problem with int8-kv-cache + weight-only.
error log: …
@forrestjgq Hi, I am afraid you have to change every …
It works. Thanks man!
I want to build the llama-7b-hf model with tp size 2 and pp size 2, using smooth-quant. Here is the process:
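As a rough sketch only, the smooth-quant conversion step with the examples/llama scripts would look something like the command below; the paths, the smoothing value, and the exact flag spellings are assumptions, not the command actually used here.

```bash
# Sketch, not the reporter's exact command: convert the HF checkpoint
# and export smooth-quant scales, split for tensor parallelism 2.
# Input/output paths and the -sq value are placeholders.
python3 examples/llama/hf_llama_convert.py \
    -i /models/llama-7b-hf \
    -o /models/llama-7b-sq \
    -sq 0.5 \
    --tensor-parallelism 2 \
    --storage-type fp16
```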
Then I build it with:
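Again as a sketch under the same assumptions, a tp=2 / pp=2 (world size 4) smooth-quant build would be invoked along these lines; the directory names and exact flag set are assumptions rather than the original invocation.

```bash
# Sketch, not the reporter's exact command: build smooth-quant engines
# with tensor parallelism 2 and pipeline parallelism 2.
# Directories and the precise flag set are placeholders.
python3 examples/llama/build.py \
    --bin_model_dir /models/llama-7b-sq/2-gpu \
    --use_smooth_quant \
    --per_token \
    --per_channel \
    --dtype float16 \
    --world_size 4 \
    --tp_size 2 \
    --pp_size 2 \
    --output_dir /models/llama-7b-sq-engine
```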
This leads to a failure: