SIGSEGV during inference #11456

Open
ko-alex opened this issue Jan 27, 2025 · 0 comments

Name and Version

(gguf) ➜ llama.cpp git:(master) ./build/bin/llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: yes
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 4060 Ti, compute capability 8.9, VMM: yes
Device 1: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
version: 4564 (acd38ef)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

./build/bin/llama-cli -t 6 --color --interactive --conversation --multiline-input --mirostat 2 --ctx-size 16384 --keep -1 --flash-attn --repeat-penalty 1.2 --n-gpu-layers 44 --temp 0.3 --cache-type-v q8_0 --cache-type-k q8_0

Problem description & steps to reproduce

Suspected cause: the quantized KV cache options '--cache-type-v q8_0 --cache-type-k q8_0' might be responsible.
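
One way to test this suspicion would be to run the same chat session twice, once with the two cache-type flags and once without them (which should leave the KV cache at its default types). The following is only a dry-run sketch: it prints the two command lines rather than running them. The flags are copied from the command line above; the helper name variant_cmd and the LLAMA_CLI variable are made up for illustration, and any model arguments (not shown in this report) have to be passed in as extra arguments.

```shell
#!/bin/sh
# Dry-run sketch: prints the two command lines to compare.
# Flags are copied from the report; model arguments are not part of the
# report, so pass them to this script as extra arguments.

LLAMA_CLI="${LLAMA_CLI:-./build/bin/llama-cli}"
BASE_ARGS="-t 6 --color --interactive --conversation --multiline-input --mirostat 2 --ctx-size 16384 --keep -1 --flash-attn --repeat-penalty 1.2 --n-gpu-layers 44 --temp 0.3"
KV_QUANT_ARGS="--cache-type-v q8_0 --cache-type-k q8_0"

# variant_cmd <quantized|default> [extra args...]
# Prints one command line; "default" omits the cache-type flags.
variant_cmd() {
    variant="$1"; shift
    extra="$*"
    case "$variant" in
        quantized) echo "$LLAMA_CLI $BASE_ARGS $KV_QUANT_ARGS${extra:+ $extra}" ;;
        default)   echo "$LLAMA_CLI $BASE_ARGS${extra:+ $extra}" ;;
        *)         echo "unknown variant: $variant" >&2; return 1 ;;
    esac
}

variant_cmd quantized "$@"
variant_cmd default "$@"
```

If the segmentation fault only ever shows up with the first command line, that would support the suspicion; a crash with both would point elsewhere.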

Problem: at some point during the chat, llama-cli crashes with the following message (CLI output, no further details):
[1] 12457 segmentation fault (core dumped) ./build/bin/llama-cli

$ coredumpctl list
Mon 2025-01-27 20:48:06 IST 12457 1000 1000 SIGSEGV present /work/src/llama.cpp/build/bin/llama-cli 1.2G

From core dump, call stack:

Stack trace of thread 12474:
#0 0x00007f128f771aab n/a (libc.so.6 + 0x152aab)
#1 0x00007f128f5c68af ggml_graph_compute_thread.isra.0 (libggml-cpu.so + 0x5a8af)
#2 0x00007f128f540d8e n/a (libgomp.so.1 + 0x1cd8e)
#3 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#4 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12472:
#0 0x00007f128f771ae9 n/a (libc.so.6 + 0x152ae9)
#1 0x00007f128f5c68af ggml_graph_compute_thread.isra.0 (libggml-cpu.so + 0x5a8af)
#2 0x00007f128f540d8e n/a (libgomp.so.1 + 0x1cd8e)
#3 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#4 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12470:
#0 0x00007f128f771b00 n/a (libc.so.6 + 0x152b00)
#1 0x00007f128f5c68af ggml_graph_compute_thread.isra.0 (libggml-cpu.so + 0x5a8af)
#2 0x00007f128f540d8e n/a (libgomp.so.1 + 0x1cd8e)
#3 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#4 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12471:
#0 0x00007f128f771af3 n/a (libc.so.6 + 0x152af3)
#1 0x00007f128f5c68af ggml_graph_compute_thread.isra.0 (libggml-cpu.so + 0x5a8af)
#2 0x00007f128f540d8e n/a (libgomp.so.1 + 0x1cd8e)
#3 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#4 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12473:
#0 0x00007f128f771ae4 n/a (libc.so.6 + 0x152ae4)
#1 0x00007f128f5c68af ggml_graph_compute_thread.isra.0 (libggml-cpu.so + 0x5a8af)
#2 0x00007f128f540d8e n/a (libgomp.so.1 + 0x1cd8e)
#3 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#4 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12466:
#0 0x00007f128f6a4f16 n/a (libc.so.6 + 0x85f16)
#1 0x00007f128f6a78bc pthread_cond_timedwait (libc.so.6 + 0x888bc)
#2 0x00007f127ebcac8a n/a (libcuda.so.1 + 0x1cac8a)
#3 0x00007f127ec6dee3 n/a (libcuda.so.1 + 0x26dee3)
#4 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#5 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12457:
#0 0x00007f128f771ae9 n/a (libc.so.6 + 0x152ae9)
#1 0x00007f128f5c68af ggml_graph_compute_thread.isra.0 (libggml-cpu.so + 0x5a8af)
#2 0x00007f128f5380b6 GOMP_parallel (libgomp.so.1 + 0x140b6)
#3 0x00007f128f597a5c ggml_graph_compute (libggml-cpu.so + 0x2ba5c)
#4 0x00007f128f5a61c2 _ZL30ggml_backend_cpu_graph_computeP12ggml_backendP11ggml_cgraph (libggml-cpu.so + 0x3a1c2)
#5 0x00007f128fb67f83 ggml_backend_sched_graph_compute_async (libggml-base.so + 0x26f83)
#6 0x00007f128fc694b0 _ZL19llama_graph_computeR13llama_contextP11ggml_cgraphiP15ggml_threadpool (libllama.so + 0x4e4b0)
#7 0x00007f128fc6d8f3 llama_kv_cache_update (libllama.so + 0x528f3)
#8 0x00007f128fc6e87e _ZL17llama_decode_implR13llama_context11llama_batch (libllama.so + 0x5387e)
#9 0x00007f128fc6fa87 llama_decode (libllama.so + 0x54a87)
#10 0x000055df853f5596 main (llama-cli + 0x22596)
#11 0x00007f128f64624a n/a (libc.so.6 + 0x2724a)
#12 0x00007f128f646305 __libc_start_main (libc.so.6 + 0x27305)
#13 0x000055df853f95d1 _start (llama-cli + 0x265d1)

Stack trace of thread 12467:
#0 0x00007f128f6a4f16 n/a (libc.so.6 + 0x85f16)
#1 0x00007f128f6a75d8 pthread_cond_wait (libc.so.6 + 0x885d8)
#2 0x000055df854b957b _ZZN10common_log6resumeEvENKUlvE_clEv (llama-cli + 0xe657b)
#3 0x00007f128f8d44a3 n/a (libstdc++.so.6 + 0xd44a3)
#4 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#5 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12469:
#0 0x00007f128f6a4f16 n/a (libc.so.6 + 0x85f16)
#1 0x00007f128f6a78bc pthread_cond_timedwait (libc.so.6 + 0x888bc)
#2 0x00007f127ebcac8a n/a (libcuda.so.1 + 0x1cac8a)
#3 0x00007f127ec6dee3 n/a (libcuda.so.1 + 0x26dee3)
#4 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#5 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12465:
#0 0x00007f128f71b1df __poll (libc.so.6 + 0xfc1df)
#1 0x00007f127ec761ef n/a (libcuda.so.1 + 0x2761ef)
#2 0x00007f127ed3a67f n/a (libcuda.so.1 + 0x33a67f)
#3 0x00007f127ec6dee3 n/a (libcuda.so.1 + 0x26dee3)
#4 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#5 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12458:
#0 0x00007f128f71b1df __poll (libc.so.6 + 0xfc1df)
#1 0x00007f127ec761ef n/a (libcuda.so.1 + 0x2761ef)
#2 0x00007f127ed3a67f n/a (libcuda.so.1 + 0x33a67f)
#3 0x00007f127ec6dee3 n/a (libcuda.so.1 + 0x26dee3)
#4 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#5 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)

Stack trace of thread 12468:
#0 0x00007f128f71b1df __poll (libc.so.6 + 0xfc1df)
#1 0x00007f127ec761ef n/a (libcuda.so.1 + 0x2761ef)
#2 0x00007f127ed3a67f n/a (libcuda.so.1 + 0x33a67f)
#3 0x00007f127ec6dee3 n/a (libcuda.so.1 + 0x26dee3)
#4 0x00007f128f6a81c4 n/a (libc.so.6 + 0x891c4)
#5 0x00007f128f72885c n/a (libc.so.6 + 0x10985c)
ELF object binary architecture: AMD x86-64
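
Since the trace is long, a short tally of where each thread sits can make the pattern easier to see. The sketch below assumes the text of `coredumpctl info 12457` is piped in and counts the frame #1 symbol of every thread. The function name summarize_frames is hypothetical, and the parsing assumes symbols contain no spaces (true for the mangled names in this trace).

```shell
#!/bin/sh
# Sketch: tally the frame #1 symbol and library across all threads of a
# coredumpctl-style stack trace read from standard input.
# Expected frame line shape: "#1 0xADDR symbol (library + 0xOFFSET)"
summarize_frames() {
    awk '
        $1 == "#1" {
            lib = $4
            sub(/^\(/, "", lib)      # drop the opening parenthesis before the library name
            count[$3 " " lib]++
        }
        END {
            for (key in count) print count[key], key
        }
    ' | sort -rn
}

# Example (not executed here): coredumpctl info 12457 | summarize_frames
```

For the trace in this report, the most frequent entry is ggml_graph_compute_thread.isra.0 in libggml-cpu.so, covering six of the twelve threads, and the main thread (12457) reaches it through llama_kv_cache_update.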

First Bad Commit

No response

Relevant log output
