Assertion ggml_nelements(a) == ne0*ne1*ne2 when loading TheBloke/Llama-2-70B-GGML/llama-2-70b.ggmlv3.q2_K.bin #2445
Comments
Similar error happened to me too. It's not the model; it's something with llama.cpp. I rolled back to yesterday's commit and it worked fine.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Same issue when loading DeepSeek-V2-Chat. Reopen? @ggerganov
Edit: Trying again as the root user produced this extra output:
Edit: -ngl 0 changes nothing
Loading the Llama 2 70B model from TheBloke with rustformers/llm seems to work, but it fails on inference. llama.cpp raises an assertion regardless of the use_gpu option. This might be related to the model files, but the models from TheBloke are usually reliable.
Running on a MacBook Pro M1 Max with 32 GB RAM, macOS 14.0.0 (23A5301g).