Converting Llama 4bit GPTQ Model from HF does not work #746
Comments
As of today's master, you don't need to run the migrate script: convert-gptq-to-ggml.py generates the latest version of the model. Check the first 4 bytes of the generated file: the latest version should be 0x67676d66; the old version, which needs migration, is 0x67676d6c.
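The byte check described above can be sketched in a few lines of Python. The helper name is hypothetical; the magic values are the ones quoted in this comment (the file stores the magic as a little-endian uint32):

```python
import struct

GGML_UNVERSIONED = 0x67676D6C  # old format, needs migration
GGMF_VERSIONED = 0x67676D66    # latest format as of this thread

def read_magic(path):
    # Read the first 4 bytes of the file as a little-endian uint32.
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    return magic
```

If `read_magic("models/llama13b-4bit.bin")` returns `GGML_UNVERSIONED`, the file still needs migration; if it returns `GGMF_VERSIONED`, it is already in the new format.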
Ah, I see! Well, it is 6d66 already, but main expects a different version:
I did a git pull a few hours ago and converted the model afterwards.
I guess convert-gptq-to-ggml.py needs an update? I just changed the version bytes and now it works!
How many different GGML BIN file headers are there floating around now? Asking for a friend...
I think there are 3: the original one (A), the new one (B), and the recently introduced one (C). To get from A -> B, run convert-unversioned-ggml-to-ggml.py.
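The A -> B -> C chain would look roughly like this. File names are hypothetical, and the exact arguments each script expects may differ — check each script's --help; both scripts ship with llama.cpp at the time of this thread:

```shell
# A -> B: add a version header to an unversioned ggml file
python convert-unversioned-ggml-to-ggml.py models/llama13b-4bit.bin models/tokenizer.model

# B -> C: migrate to the layout introduced with PR 613
python migrate-ggml-2023-03-30-pr613.py models/llama13b-4bit.bin models/llama13b-4bit-new.bin
```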
@xonfour how did you change the version bytes?
Just changed the version bytes directly in the file. I will prepare a pull request with the fix soon (after I test it).
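For the record, patching the version bytes by hand amounts to overwriting the first little-endian uint32 of the file. A minimal sketch with a hypothetical helper name — note that, depending on the format, a uint32 version field right after the magic may also need updating; this only rewrites the magic itself:

```python
import struct

GGMF_MAGIC = 0x67676D66  # the magic that current main expects

def patch_magic(path, magic=GGMF_MAGIC):
    # Overwrite the 4-byte little-endian magic at the start of the file,
    # leaving the rest of the model data untouched.
    with open(path, "r+b") as f:
        f.write(struct.pack("<I", magic))
```

Make a backup copy of the model file before trying this — a wrong header makes the file unloadable.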
Fix in #770
After converting GPTQ to GGML, do you still get the benefits of GPTQ, with its better accuracy compared to RTN quantization?
try the new
@xonfour By looking at the commit log of convert.py (notes on latest GPTQ-for-LLaMA format), the issue has been solved with the latest convert.py:
Hi! I tried to use the 13B Model from https://huggingface.co/maderix/llama-65b-4bit/
I converted the model using
python convert-gptq-to-ggml.py models/llama13b-4bit.pt models/tokenizer.model models/llama13b-4bit.bin
If I understand it correctly I still need to migrate the model and I tried it using
python migrate-ggml-2023-03-30-pr613.py models/llama13b-4bit.bin models/llama13b-4bit-new.bin
But after a few seconds this breaks with the following error:
Is this a bug, or am I doing something wrong?