Getting SIGSEGV with llama backend #973
Comments
I was going to submit an issue for this as well. I am getting the same error with similar codellama models in the GGUF format with version 1.25.0.
Same here :(
I have similar problems with all GGUF files at the moment. See below for some additional logs, in case they are of help.
Try changing the backend from `llama` to `llama-stable`.
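For anyone trying that suggestion, the backend is selected in the model's YAML definition. A minimal sketch of what that looks like (the file name and path are illustrative, not taken from this thread):

```bash
# Minimal sketch of a LocalAI model definition showing where the backend is
# chosen; file name and path are illustrative.
cat > models/codellama.yaml <<'EOF'
name: codellama
backend: llama            # the suggestion above is to swap this value for llama-stable
parameters:
  model: codellama-13b-python.Q4_K_S.gguf
EOF
```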
The `llama-stable` backend doesn't support GGUF models though, does it?
I suspect this PR may fix things. llama.cpp needs a bump to work with GGUF, so the Go bindings would be behind. #977
Everything had already been bumped for v1.25, which split the llama backend into llama (GGUF support) and llama-stable (GGML support). That PR is just an automated bump to the latest version, and is unrelated to this issue.
Your mileage may vary, but I ran into the SIGSEGV issue with the current (as of last night) Docker image. Building locally, or rebuilding the container image, has taken care of the problem in my case.
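For reference, a rough sketch of rebuilding the image locally instead of pulling the prebuilt tag (the image name and run flags here are illustrative):

```bash
# Clone the repo, rebuild the image locally, and run it against a local models dir.
git clone https://github.com/go-skynet/LocalAI.git
cd LocalAI
docker build -t localai:local .
docker run -p 8080:8080 -v "$PWD/models:/models" localai:local --models-path /models
```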
Unfortunately I'm also building locally.
Having the same problem here with
Tried prebuilt docker tags
Log snippet:

Works perfectly fine when running
@jadams could you share your
I'm facing this issue too but I'm getting
While checking my
Testing
For reference, (after rebuild) tested with:
/build/go-llama/build/bin/main -t 8 -ngl 1 -lv -m /models/orca_mini_v3_7b.Q6_K.gguf --color -c 512 --temp 0.7 -p "### Instruction: Write a story about llamas\n### Response:"
nvidia-smi:

nvcc --version:

Seems to be the same as you: CUDA 12.2 from nvidia-smi and CUDA 12.1 from nvcc.
I am getting the same issue: orca-mini 3B works but other models do not, doing inference on CPU only.
My nvcc and nvidia-smi CUDA versions match, but I get similar SIGSEGV output when trying to load a GGUF model.
I think the problem was related to the GGUF v2 format. Can you test it with LocalAI 1.30? Maybe it is fixed...
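A quick way to try that, assuming the prebuilt image tag follows the project's usual naming (tag and flags below are assumptions):

```bash
# Point the v1.30.0 image at the same GGUF models directory and retry the request.
docker run -p 8080:8080 -v "$PWD/models:/models" \
  quay.io/go-skynet/local-ai:v1.30.0 --models-path /models
```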
LocalAI version:
v1.25.0
Environment, CPU architecture, OS, and Version:
Linux hostname 5.15.0-78-generic #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Describe the bug
Trying to run any GGUF model with the `llama` backend results in SIGSEGV as soon as the model tries to load (output in the Logs section). Note that running the `main` binary of llama.cpp from `LocalAI/go-llama/build/bin/` directly works totally fine.

To Reproduce
Any request seems to do this. I tried with both `codellama-13b-python.Q4_K_S.gguf` and `phind-codellama-34b-v1.Q4_K_M.gguf` for good measure. Both work when running llama.cpp directly.
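For example, a simple completion request against LocalAI's OpenAI-compatible API is enough; the exact prompt and parameters below are illustrative, only the model name comes from this report:

```bash
# Any completion request against the GGUF model triggers the crash; this one
# is just an example (prompt and parameters are arbitrary).
curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "codellama-13b-python.Q4_K_S.gguf",
        "prompt": "def fibonacci(n):",
        "temperature": 0.2
      }'
```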
Expected behavior
Logs