Does llama.cpp support gpt-2-q4? #1386
Comments
I tried to convert Cerebras-GPT-111M for llama.cpp. It runs with ggml, but I got errors when running the commands below:
❯ ./quantize ./models/Cerebras-GPT-111M/ggml-model-f16.bin ./models/output/cerebras-q4-0.bin q4_0
❯ python3 convert.py ./models/Cerebras-GPT-111M/
Cerebras is a GPT-2-based model, not LLaMA, and won't work with this repo. You can run it with the gpt2 example (main.cpp) in the ggml repo, or via KoboldCpp, which automatically detects the format. (Disclaimer: I am a KoboldCpp dev.)
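As a rough sketch of how to check the architecture before attempting a conversion: a Hugging Face checkpoint directory normally ships a config.json whose `model_type` field names the model family. The helper names below (`read_model_type`, `is_llama_checkpoint`) are hypothetical and not part of llama.cpp or ggml. The sketch assumes the model directory contains a standard Hugging Face config.json.

```python
import json
from pathlib import Path


def read_model_type(model_dir):
    """Return the `model_type` field from a Hugging Face config.json, or None.

    Hypothetical helper: assumes `model_dir` is a Hugging Face checkpoint
    directory. Returns None if config.json is missing or has no such field.
    """
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return None
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return config.get("model_type")


def is_llama_checkpoint(model_dir):
    """True only when config.json declares the LLaMA architecture."""
    return read_model_type(model_dir) == "llama"
```

For example, `is_llama_checkpoint("./models/Cerebras-GPT-111M/")` returning False would suggest trying the ggml gpt2 example or KoboldCpp instead of llama.cpp's convert.py.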
This issue was closed because it has been inactive for 14 days since being marked as stale.
I noticed that ggml can run GPT-2, so I wonder whether llama.cpp supports it too. I downloaded a gpt-2-q4 model from Hugging Face, but it failed to run.