I get the following error when compiling with make LLAMA_CUBLAS=1 : #1728

Closed
xjDUAN184 opened this issue Jun 7, 2023 · 5 comments

@xjDUAN184

I get the following error when compiling with make LLAMA_CUBLAS=1:
make LLAMA_CUBLAS=1 LDFLAGS=-L/usr/local/cuda-11.6/targets/x86_64-linux/lib
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
I LDFLAGS: -L/usr/local/cuda-11.6/targets/x86_64-linux/lib
I CC: cc (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0
I CXX: g++ (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0

g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include examples/main/main.cpp ggml.o llama.o common.o ggml-cuda.o -o main -L/usr/local/cuda-11.6/targets/x86_64-linux/lib
ggml-cuda.o: In function mul_f32(float const*, float const*, float*, int, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xcd): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x10b): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function convert_fp16_to_fp32_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1ab): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x245): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x28f): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function dequantize_row_q8_0_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x31b): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b5): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3ff): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function dequantize_row_q5_1_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x48b): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x525): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x56f): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function dequantize_row_q5_0_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x5fb): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x695): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x6df): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function dequantize_row_q4_1_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x76b): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x805): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x84f): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function dequantize_row_q4_0_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x8db): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x975): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x9bf): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function ggml_cuda_pool_malloc(unsigned long, unsigned long*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xa11): undefined reference to cudaGetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xaab): undefined reference to cudaMalloc'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xac2): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xb04): undefined reference to cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xc55): undefined reference to cudaMemcpy2DAsync'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xcbc): undefined reference to cudaMemcpy2DAsync' ggml-cuda.o: In function ggml_cuda_pool_free(void*, unsigned long)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xd41): undefined reference to cudaGetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xde9): undefined reference to cudaFree'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xdff): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xe40): undefined reference to cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*) [clone .constprop.19]': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xf75): undefined reference to cudaMemcpy2DAsync'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xfdb): undefined reference to cudaMemcpy2DAsync' ggml-cuda.o: In function ggml_cuda_op(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, void ()(ggml_tensor const, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&), bool)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x14c0): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x185b): undefined reference to cudaEventRecord'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x18f0): undefined reference to cudaGetLastError' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x190c): undefined reference to cudaStreamWaitEvent'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x19a3): undefined reference to cudaMemcpyAsync' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1b0c): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1b7e): undefined reference to cudaMemcpyAsync' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1dd4): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e1b): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e32): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e4c): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e59): undefined reference to cudaDeviceSynchronize'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1f52): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1f6c): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1f85): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1fa8): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1fc1): undefined reference to cudaGetErrorString' ggml-cuda.o:tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1fe3): more undefined references to cudaGetErrorString' follow
ggml-cuda.o: In function void dequantize_mul_mat_vec<1, 1, &(convert_f16(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x212b): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2169): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function void dequantize_mul_mat_vec<32, 2, &(dequantize_q4_0(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x224b): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2289): undefined reference to cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_mul_mat_vec<32, 2, &(dequantize_q4_1(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x236b): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x23a9): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function void dequantize_mul_mat_vec<32, 2, &(dequantize_q5_0(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x248b): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x24c9): undefined reference to cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_mul_mat_vec<32, 2, &(dequantize_q5_1(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x25ab): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x25e9): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function void dequantize_mul_mat_vec<32, 1, &(dequantize_q8_0(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x26cb): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2709): undefined reference to cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_block<32, 1, &(dequantize_q8_0(void const*, int, int, float&, float&))>(void const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x27b3): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x27f6): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function void dequantize_block<32, 2, &(dequantize_q5_0(void const*, int, int, float&, float&))>(void const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x28a3): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x28e6): undefined reference to cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_block<32, 2, &(dequantize_q5_1(void const*, int, int, float&, float&))>(void const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2993): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x29d6): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function void dequantize_block<32, 2, &(dequantize_q4_0(void const*, int, int, float&, float&))>(void const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2a83): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2ac6): undefined reference to cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_block<32, 2, &(dequantize_q4_1(void const*, int, int, float&, float&))>(void const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2b73): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2bb6): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function void dequantize_block<1, 1, &(convert_f16(void const*, int, int, float&, float&))>(void const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2c63): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2ca6): undefined reference to cudaLaunchKernel'
ggml-cuda.o: In function ggml_init_cublas': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2d3b): undefined reference to cudaGetDeviceCount'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2dae): undefined reference to cudaGetDeviceProperties' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2f47): undefined reference to cudaSetDevice'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2f6a): undefined reference to cudaStreamCreateWithFlags' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2f80): undefined reference to cudaStreamCreateWithFlags'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2fa9): undefined reference to cudaEventCreateWithFlags' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2fc6): undefined reference to cublasCreate_v2'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2fe5): undefined reference to cublasSetMathMode' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x304d): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3099): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x30c2): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x30dc): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x30f3): undefined reference to cublasGetStatusString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x311b): undefined reference to cublasGetStatusString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3148): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3193): undefined reference to cudaGetErrorString' ggml-cuda.o: In function ggml_cuda_host_malloc':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x349e): undefined reference to cudaMallocHost' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x34cb): undefined reference to cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_host_free': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3542): undefined reference to cudaFreeHost'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3551): undefined reference to cudaGetErrorString' ggml-cuda.o: In function ggml_cuda_load_data':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x387f): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x38bb): undefined reference to cudaMalloc'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3919): undefined reference to cudaMemcpy' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x391e): undefined reference to cudaDeviceSynchronize'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3997): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3a3f): undefined reference to cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_free_data': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b23): undefined reference to cudaSetDevice'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b34): undefined reference to cudaFree' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b64): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3ba7): undefined reference to cudaGetErrorString' ggml-cuda.o: In function ggml_cuda_compute_forward':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x410b): undefined reference to cudaMemcpyAsync' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4124): undefined reference to cublasSetStream_v2'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4184): undefined reference to cublasGemmEx' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x41c0): undefined reference to cudaMemcpyAsync'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4263): undefined reference to cudaDeviceSynchronize' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4342): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4385): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x43a2): undefined reference to cublasGetStatusString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x43c7): undefined reference to cublasGetStatusString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x43f0): undefined reference to cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4405): undefined reference to cudaGetErrorString' ggml-cuda.o: In function ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xcf7): undefined reference to cudaMemcpyAsync' ggml-cuda.o: In function __cudaUnregisterBinaryUtil()':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xe78): undefined reference to __cudaUnregisterFatBinary' ggml-cuda.o: In function ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*) [clone .constprop.19]':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x101a): undefined reference to cudaMemcpyAsync' ggml-cuda.o: In function ggml_cuda_op_mul_mat_cublas(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x84): undefined reference to cudaGetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xa9): undefined reference to cublasSetStream_v2'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xed): undefined reference to cublasSgemm_v2' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x158): undefined reference to cublasGetStatusString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x19a): undefined reference to cublasGetStatusString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x1b9): undefined reference to cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_op_dequantize_mul_mat_vec(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xe5): undefined reference to __cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xf9): undefined reference to cudaGetLastError' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x185): undefined reference to __cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x22c): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x27d): undefined reference to cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x2e5): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x38c): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x3dd): undefined reference to cudaLaunchKernel' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x445): undefined reference to __cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x4ec): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x53d): undefined reference to cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x5a5): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x64c): undefined reference to __cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x69d): undefined reference to cudaLaunchKernel' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x705): undefined reference to __cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x7ac): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x7fd): undefined reference to cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x8ab): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x8fc): undefined reference to cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x946): undefined reference to cudaGetErrorString' ggml-cuda.o: In function ggml_cuda_op_mul(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x111): undefined reference to cudaGetLastError' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x195): undefined reference to __cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x269): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x2ba): undefined reference to cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x2f9): undefined reference to cudaGetErrorString' ggml-cuda.o: In function __sti____cudaRegisterAll()':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x9): undefined reference to __cudaRegisterFatBinary' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x3d): undefined reference to __cudaRegisterFunction'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x6b): undefined reference to __cudaRegisterFunction' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x99): undefined reference to __cudaRegisterFunction'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0xc7): undefined reference to __cudaRegisterFunction' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0xf5): undefined reference to __cudaRegisterFunction'
ggml-cuda.o:tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x123): more undefined references to __cudaRegisterFunction' follow ggml-cuda.o: In function __sti____cudaRegisterAll()':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x275): undefined reference to `__cudaRegisterFatBinaryEnd'
collect2: error: ld returned 1 exit status
Makefile:251: recipe for target 'main' failed
make: *** [main] Error 1

@SlyEcho
Collaborator

SlyEcho commented Jun 7, 2023

make LLAMA_CUBLAS=1 LDFLAGS=-L/usr/local/cuda-11.6/targets/x86_64-linux/lib

It's not possible to override compilation flags like this with the current Makefile.
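
One likely reason, sketched under assumptions: GNU make gives a variable set on the command line priority over assignments inside the Makefile, including += appends, unless the Makefile uses the override directive. That matches the "I LDFLAGS:" line in the report, which shows only the -L path, and the final g++ command, which carries no -l options at all. The undefined symbols belong to the CUDA runtime (cudaLaunchKernel, cudaMalloc) and to cuBLAS (cublasCreate_v2, cublasGemmEx), so the check below looks for -lcudart and -lcublas in a link command. The function name missing_cuda_libs is hypothetical, and the two-library list is an assumed minimum, not the Makefile's full set.

```shell
# Hypothetical helper: print each CUDA library flag that a link command lacks.
# Assumption: -lcudart and -lcublas are the minimum needed for the symbols in the log.
missing_cuda_libs() {
  link_cmd="$1"
  for lib in cudart cublas; do
    case " $link_cmd " in
      *" -l$lib "*) ;;                   # flag present, nothing to report
      *) printf '%s\n' "-l$lib" ;;       # flag absent
    esac
  done
}

# Link command shaped like the one in the report: a -L search path but no -l flags.
missing_cuda_libs "g++ main.cpp ggml-cuda.o -o main -L/usr/local/cuda-11.6/targets/x86_64-linux/lib"
```

A -L option only adds a directory to the linker's search path; without matching -l options nothing from that directory is linked.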

I suggest you use CMake instead to have more control over the build configuration.
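
A minimal sketch of such a CMake build, assuming the LLAMA_CUBLAS option that llama.cpp's CMake build offered at the time and the CUDA 11.6 path from the report. CMAKE_CUDA_COMPILER is the standard CMake variable for choosing nvcc; the helper name cuda_cmake_args and its default path are illustrative only.

```shell
# Hypothetical helper: print the CMake arguments for a cuBLAS build that uses
# the nvcc from a specific CUDA install (the default path is an assumption).
cuda_cmake_args() {
  cuda_root="${1:-/usr/local/cuda}"
  printf '%s\n' "-DLLAMA_CUBLAS=ON" "-DCMAKE_CUDA_COMPILER=$cuda_root/bin/nvcc"
}

# Show the arguments for the toolkit path used in the report.
cuda_cmake_args /usr/local/cuda-11.6

# To build for real, from the llama.cpp checkout (not run here):
#   mkdir -p build && cd build
#   cmake .. $(cuda_cmake_args /usr/local/cuda-11.6)
#   cmake --build . --config Release
```

Naming the compiler explicitly should also help on machines with more than one CUDA install, since CMake then does not have to guess which nvcc to use.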

@ggerganov
Member

ggerganov commented Jul 5, 2023

In my case, I had somehow installed two CUDA versions (10.1 and 12.2):

$ dpkg -l | grep cuda
ii  cuda                                       12.2.0-1                                   amd64        CUDA meta-package
ii  cuda-12-2                                  12.2.0-1                                   amd64        CUDA 12.2 meta-package
ii  cuda-cccl-12-2                             12.2.53-1                                  amd64        CUDA CCCL
ii  cuda-command-line-tools-12-2               12.2.0-1                                   amd64        CUDA command-line tools
ii  cuda-compiler-12-2                         12.2.0-1                                   amd64        CUDA compiler
ii  cuda-cudart-12-2                           12.2.53-1                                  amd64        CUDA Runtime native Libraries
ii  cuda-cudart-dev-12-2                       12.2.53-1                                  amd64        CUDA Runtime native dev links, headers
ii  cuda-cuobjdump-12-2                        12.2.53-1                                  amd64        CUDA cuobjdump
ii  cuda-cupti-12-2                            12.2.60-1                                  amd64        CUDA profiling tools runtime libs.
ii  cuda-cupti-dev-12-2                        12.2.60-1                                  amd64        CUDA profiling tools interface.
ii  cuda-cuxxfilt-12-2                         12.2.53-1                                  amd64        CUDA cuxxfilt
ii  cuda-demo-suite-12-2                       12.2.53-1                                  amd64        Demo suite for CUDA
ii  cuda-documentation-12-2                    12.2.53-1                                  amd64        CUDA documentation
ii  cuda-driver-dev-12-2                       12.2.53-1                                  amd64        CUDA Driver native dev stub library
ii  cuda-drivers                               535.54.03-1                                amd64        CUDA Driver meta-package, branch-agnostic
ii  cuda-drivers-535                           535.54.03-1                                amd64        CUDA Driver meta-package, branch-specific
ii  cuda-gdb-12-2                              12.2.53-1                                  amd64        CUDA-GDB
ii  cuda-keyring                               1.1-1                                      all          GPG keyring for the CUDA repository
ii  cuda-libraries-12-2                        12.2.0-1                                   amd64        CUDA Libraries 12.2 meta-package
ii  cuda-libraries-dev-12-2                    12.2.0-1                                   amd64        CUDA Libraries 12.2 development meta-package
ii  cuda-nsight-12-2                           12.2.53-1                                  amd64        CUDA nsight
ii  cuda-nsight-compute-12-2                   12.2.0-1                                   amd64        NVIDIA Nsight Compute
ii  cuda-nsight-systems-12-2                   12.2.0-1                                   amd64        NVIDIA Nsight Systems
ii  cuda-nvcc-12-2                             12.2.91-1                                  amd64        CUDA nvcc
ii  cuda-nvdisasm-12-2                         12.2.53-1                                  amd64        CUDA disassembler
ii  cuda-nvml-dev-12-2                         12.2.81-1                                  amd64        NVML native dev links, headers
ii  cuda-nvprof-12-2                           12.2.60-1                                  amd64        CUDA Profiler tools
ii  cuda-nvprune-12-2                          12.2.53-1                                  amd64        CUDA nvprune
ii  cuda-nvrtc-12-2                            12.2.91-1                                  amd64        NVRTC native runtime libraries
ii  cuda-nvrtc-dev-12-2                        12.2.91-1                                  amd64        NVRTC native dev links, headers
ii  cuda-nvtx-12-2                             12.2.53-1                                  amd64        NVIDIA Tools Extension
ii  cuda-nvvp-12-2                             12.2.60-1                                  amd64        CUDA Profiler tools
ii  cuda-opencl-12-2                           12.2.53-1                                  amd64        CUDA OpenCL native Libraries
ii  cuda-opencl-dev-12-2                       12.2.53-1                                  amd64        CUDA OpenCL native dev links, headers
ii  cuda-profiler-api-12-2                     12.2.53-1                                  amd64        CUDA Profiler API
ii  cuda-repo-ubuntu2004-12-1-local            12.1.1-530.30.02-1                         amd64        cuda repository configuration files
ii  cuda-runtime-12-2                          12.2.0-1                                   amd64        CUDA Runtime 12.2 meta-package
ii  cuda-sanitizer-12-2                        12.2.53-1                                  amd64        CUDA Sanitizer
ii  cuda-toolkit-12-1-config-common            12.1.105-1                                 all          Common config package for CUDA Toolkit 12.1.
ii  cuda-toolkit-12-2                          12.2.0-1                                   amd64        CUDA Toolkit 12.2 meta-package
ii  cuda-toolkit-12-2-config-common            12.2.53-1                                  all          Common config package for CUDA Toolkit 12.2.
ii  cuda-toolkit-12-config-common              12.2.53-1                                  all          Common config package for CUDA Toolkit 12.
ii  cuda-toolkit-config-common                 12.2.53-1                                  all          Common config package for CUDA Toolkit.
ii  cuda-tools-12-2                            12.2.0-1                                   amd64        CUDA Tools meta-package
ii  cuda-visual-tools-12-2                     12.2.0-1                                   amd64        CUDA visual tools
ii  libcudart10.1:amd64                        10.1.243-3                                 amd64        NVIDIA CUDA Runtime Library
ii  nvidia-cuda-dev                            10.1.243-3                                 amd64        NVIDIA CUDA development files
ii  nvidia-cuda-doc                            10.1.243-3                                 all          NVIDIA CUDA and OpenCL documentation
ii  nvidia-cuda-gdb                            10.1.243-3                                 amd64        NVIDIA CUDA Debugger (GDB)
ii  nvidia-cuda-toolkit                        10.1.243-3                                 amd64        NVIDIA CUDA development toolkit

I solved the problem by removing the last five packages in this list (the 10.1 ones):

sudo apt remove libcudart10.1 nvidia-cuda-dev nvidia-cuda-doc nvidia-cuda-gdb nvidia-cuda-toolkit
sudo apt autoremove
...
# run cmake and specify the full path to nvcc
cmake .. -DLLAMA_CUBLAS=1 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.2/bin/nvcc

I can't guarantee this is the right thing to do! It just worked for me.
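To spot this kind of mix-up quickly, the `dpkg -l` output can be reduced to the distinct CUDA runtime versions it contains. This is only a minimal sketch: `list_cuda_versions` is a made-up helper name, and it assumes the runtime packages are named `cuda-cudart-<ver>` or `libcudart<ver>`, as they are in the listing above.

```shell
# list_cuda_versions (hypothetical helper): read `dpkg -l` output on stdin and
# print the distinct major.minor versions of installed CUDA runtime packages.
# Assumption: runtime packages are named cuda-cudart-<ver> or libcudart<ver>,
# as in the listing above; other naming schemes are not detected.
list_cuda_versions() {
  awk '$1 == "ii" && ($2 ~ /^cuda-cudart-[0-9]/ || $2 ~ /^libcudart[0-9]/) {
         split($3, v, ".")        # "12.2.53-1" -> v[1]="12", v[2]="2"
         print v[1] "." v[2]
       }' | sort -u
}
```

Run it as `dpkg -l | list_cuda_versions`. If it prints more than one line, more than one CUDA runtime is installed, which is the situation described above.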

ZeerhoK commented Aug 21, 2023

That also worked for me.
I had to run the following commands:

$ sudo apt remove libcudart11.0 nvidia-cuda-dev nvidia-cuda-gdb nvidia-cuda-toolkit nvidia-cuda-toolkit-doc
$ sudo apt autoremove
$ make
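A plain `make` presumably uses whichever `nvcc` comes first on `PATH`, so it can help to see every candidate before and after removing the old packages. The sketch below is an illustration only: `find_nvcc` is a hypothetical helper, not part of llama.cpp or CUDA.

```shell
# find_nvcc (hypothetical helper): print every executable named nvcc found in
# the directories of a PATH-style, colon-separated string, in search order.
# The body runs in a subshell so the IFS and globbing changes do not leak out.
find_nvcc() (
  IFS=:
  set -f                          # do not glob-expand directory names
  for dir in $1; do
    dir=${dir:-.}                 # an empty PATH entry means the current directory
    if [ -x "$dir/nvcc" ]; then
      printf '%s\n' "$dir/nvcc"
    fi
  done
)
```

Call it as `find_nvcc "$PATH:/usr/local/cuda-12.2/bin"`. The first line printed is the compiler a bare `nvcc` would resolve to, unless an extra directory appended by hand is the only match.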

This issue was closed because it has been inactive for 14 days since being marked as stale.

@badpaybad

I ran into this as well; the include folder may be wrong. After upgrading to CUDA 12, the include folder has to point to the same version. For CMake, this is the related option:

-DCUDAToolkit_ROOT=/usr/local/cuda-12

For example:

cmake .. -DLLAMA_CUDA=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12/bin/nvcc -DCUDAToolkit_ROOT=/usr/local/cuda-12
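Since both paths in that command have to refer to the same toolkit, one way to keep them in sync is to derive them from a single root directory. This is a rough sketch; `cuda_cmake_flags` is a made-up name, and it assumes `nvcc` lives in `bin/` under the toolkit root, as in the command above.

```shell
# cuda_cmake_flags (hypothetical helper): given a CUDA toolkit root, print the
# two CMake options from the command above so both point at the same install.
# Assumption: nvcc is located at <root>/bin/nvcc.
cuda_cmake_flags() (
  root=${1%/}                     # drop one trailing slash, if any
  printf '%s %s\n' "-DCMAKE_CUDA_COMPILER=$root/bin/nvcc" "-DCUDAToolkit_ROOT=$root"
)
```

It could then be used as `cmake .. -DLLAMA_CUDA=ON $(cuda_cmake_flags /usr/local/cuda-12)`. The unquoted substitution relies on word splitting, so it only works for roots without spaces in the path.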
