Bug: GGML assert with bf16, RTX 3090 #8234

Closed as not planned
@micsthepick

Description

What happened?

./llama-server -ngl 99 -cb -c 65536 -np 32 -m models/Phi-3-mini-128k-instruct/ggml-model-bf16.gguf 
...
GGML_ASSERT: ggml/src/ggml-cuda.cu:1257: to_fp32_cuda != nullptr
[New LWP 934430]
[New LWP 934432]
[New LWP 934433]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fb1ba523c7f in __GI___wait4 (pid=934542, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x00007fb1ba523c7f in __GI___wait4 (pid=934542, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27      in ../sysdeps/unix/sysv/linux/wait4.c
#1  0x0000559119a6c7eb in ggml_print_backtrace ()
#2  0x000055911992c1b5 in ggml_cuda_op_mul_mat_cublas(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char const*, float const*, char const*, float*, long, long, long, long, CUstream_st*) ()
#3  0x000055911992e781 in ggml_cuda_op_mul_mat(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, void (*)(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char const*, float const*, char const*, float*, long, long, long, long, CUstream_st*), void (*)(float const*, void*, long, long, long, long, ggml_type, CUstream_st*)) ()
#4  0x000055911992f7a5 in ggml_cuda_mul_mat(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*) ()
#5  0x0000559119933cff in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) ()
#6  0x0000559119abb4bb in ggml_backend_sched_graph_compute_async ()
#7  0x0000559119b0d7b0 in llama_decode ()
#8  0x0000559119bcd039 in llama_init_from_gpt_params(gpt_params&) ()
#9  0x0000559119c78495 in server_context::load_model(gpt_params const&) ()
#10 0x0000559119913d7a in main ()
[Inferior 1 (process 934429) detached]
./start_phi.sh: line 1: 934429 Aborted 

Name and Version

./llama-server --version
version: 3265 (72272b8)
built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux, Windows

Relevant log output

No response

Metadata

    Labels

    bug-unconfirmed, medium severity
