Could not load model: SIGILL: illegal instruction #1447
Comments
So far I have not been able to get any model to output text, but I am at the point where nvidia-smi shows GPU utilisation. I've spent days on this, and now, mere hours after creating this issue, I can reply to it myself :-) So, long story short, there are three things that I ran into.
This is my current docker compose file:
Now, my GPU is being used, but no outputs are generated yet. See logs below. I'll look into it more tomorrow, but if anyone has any ideas, please let me know!
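The compose file itself isn't included above; as a rough sketch of what GPU passthrough for a LocalAI container typically looks like under Docker Compose (the service name, model path, and environment values here are illustrative assumptions, not the reporter's actual file):

```yaml
version: "3.9"
services:
  local-ai:
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12-core
    ports:
      - "8080:8080"
    environment:
      - DEBUG=true          # verbose backend logs, useful while debugging model loading
      - MODELS_PATH=/models # assumed variable name; check the LocalAI docs for your version
    volumes:
      - ./models:/models    # host directory containing the .gguf files
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]   # standard Compose syntax for NVIDIA GPU passthrough
```

The deploy.resources.reservations block (together with the NVIDIA Container Toolkit on the host) is what exposes the GPU to the container; without it, nvidia-smi will not show any utilisation from the container's workload.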
try
I've tried
You are facing #1333 - there is no solution for now; a few models are triggering this behavior. I'd suggest you change models until it gets fixed upstream (ggerganov/llama.cpp#3969).
I figured that out from another issue, indeed; I just downloaded a 13B model and that works as expected. Not sure if it is just the model or the size, but 13B is my sweet spot anyway. Apologies for creating an issue unrelated to LocalAI, but I appreciate everyone's support. I'll close this ticket now, knowing that it is an upstream issue. Thank you! 🙏🏻
So which models are working? There are many models out there, and I am afraid of wasting gigabytes trying to find the right one.
I just tried Wizard Uncensored LLM in the 13B and 30B variants of the new format (GGUF). The 30B version is not working, which is what brought me here; the 13B works fine.
LocalAI version:
quay.io/go-skynet/local-ai:master-cublas-cuda12-core
Environment, CPU architecture, OS, and Version:
Linux user-Z68X-UD3P-B3 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
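As an aside, SIGILL means the process executed a CPU instruction the host does not support, and llama.cpp-based backends are often built assuming AVX/AVX2. One generic way to list the SIMD flags this CPU actually exposes (not part of the original report, and per the comments above the root cause here appears to be an upstream model issue rather than missing instructions):

```sh
# Print each SIMD-related CPU flag once; a missing avx2/avx512f can explain
# SIGILL crashes in binaries compiled for newer instruction sets.
grep -o -w -E 'sse4_2|avx|f16c|fma|avx2|avx512f' /proc/cpuinfo | sort -u
```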
Describe the bug
Every .gguf model that I try fails with the error shown below. I've downloaded TheBloke's CodeLlama-13B (GGUF) and it failed; I've tried the 7B Llama model, the Luna model (as shown in the docs), and now TinyLlama, and they all fail. I know that the CUDA integration with Docker is working as expected, because I ran the NVIDIA sample workload and Axolotl for training just fine inside Docker. Furthermore, if I remove the backend setting altogether, LocalAI will try every backend, but none of them work.

To Reproduce
Execute this curl, but every model will fail.
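The actual request isn't shown above; a typical call against LocalAI's OpenAI-compatible chat endpoint looks roughly like the following (the port and the model filename are assumptions and must match your compose file and models directory):

```sh
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "luna-ai-llama2-uncensored.Q4_K_M.gguf",
        "messages": [{"role": "user", "content": "How are you?"}],
        "temperature": 0.7
      }'
```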
I've also tried llama-stable as the backend, but that didn't help.

Expected behavior
I would expect the model to return a response, or at the very least show a reasonable error. (I don't think the error shown is directly related to LocalAI.)
Logs
Additional context