Error on make LLAMA_CUBLAS=1 #1470

psegovias · 2023-05-15T21:19:17Z

I'm trying to compile with CUDA support and get this:

F:/llama.cpp $ make LLAMA_CUBLAS=1
I llama.cpp build info:
I UNAME_S:  Windows_NT
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/targets/x86_64-linux/include
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/targets/x86_64-linux/include
I LDFLAGS:  -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -LC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/targets/x86_64-linux/lib
I CC:       cc (GCC) 13.1.0
I CXX:      g++ (GCC) 13.1.0

cc  -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/targets/x86_64-linux/include   -c ggml.c -o ggml.o
cc: warning: Files/NVIDIA: linker input file unused because linking not done
cc: error: Files/NVIDIA: linker input file not found: No such file or directory
cc: warning: GPU: linker input file unused because linking not done
cc: error: GPU: linker input file not found: No such file or directory
cc: warning: Computing: linker input file unused because linking not done
cc: error: Computing: linker input file not found: No such file or directory
cc: warning: Toolkit/CUDA/v12.1/targets/x86_64-linux/include: linker input file unused because linking not done
cc: error: Toolkit/CUDA/v12.1/targets/x86_64-linux/include: linker input file not found: No such file or directory
make: *** [Makefile:186: ggml.o] Error 1

technicolor-twelve · 2023-05-15T22:09:31Z

I ran into the same thing on Windows. The immediate issue is that your path (the "NVIDIA GPU Computing Toolkit" part) contains spaces, and Make apparently does not handle spaces well, so I think you can put quotes around the path to make it literal. However, I kept running into other issues on Windows, so I gave up and dual-booted Ubuntu alongside my Windows installation, where it ran great. I never got cuBLAS to work on Windows, though. Until an official solution comes out, just running it on Linux will probably save you a lot of headaches.

slaren · 2023-05-15T22:32:02Z

As @technicolor-twelve says, this seems to happen because your CUDA_PATH environment variable contains spaces. I guess you are trying to build with MinGW, but as far as I know CUDA is not supported with MinGW anyway.

To build with CUDA on Windows, you have to use CMake and MSVC; there are instructions in the README. Alternatively, use one of the pre-built binaries available at https://github.com/ggerganov/llama.cpp/tags
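For reference, the CMake route can be sketched as below. This is a minimal sketch, not the project's official script: `build_with_cublas` is a hypothetical helper name, the `LLAMA_CUBLAS` option name is the one used at the time of this thread (later versions may name it differently), and it assumes a shell where MSVC and the CUDA toolkit are already set up.

```shell
#!/bin/sh
# Sketch: out-of-tree CMake build with cuBLAS enabled.
# Assumption: called from the llama.cpp source root, with cl.exe and nvcc
# discoverable by CMake. The function body runs in a subshell, so the
# caller's working directory is left unchanged.
build_with_cublas() (
    mkdir -p build &&
    cd build &&
    # Option name as of this thread; check the current README.
    cmake .. -DLLAMA_CUBLAS=ON &&
    # --config picks the Release configuration for multi-config
    # generators such as Visual Studio.
    cmake --build . --config Release
)
```

Nothing runs until `build_with_cublas` is called from the repository root.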

SlyEcho · 2023-05-15T22:36:39Z

You have to escape the spaces with \ in CUDA_PATH.

What compiler is that? Is it MinGW? It can't use the Linux CUDA SDK anyway, since it is a Windows compiler.
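The splitting behind the "linker input file not found" errors in the build log can be reproduced in a plain POSIX shell. The sketch below assumes default word splitting; `count_args` is a hypothetical helper and `CUDA_DIR` is an example value mirroring the path from the log.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

# Example value only, mirroring the path from the build log.
CUDA_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1"

# Unquoted: the shell splits on spaces, so a compiler would receive
# "-IC:/Program", "Files/NVIDIA", "GPU", "Computing", "Toolkit/CUDA/v12.1".
unquoted=$(count_args -I$CUDA_DIR)

# Quoted: the whole path stays a single argument.
quoted=$(count_args "-I$CUDA_DIR")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

Make pastes the variable's text into the command line it hands to the shell, so either quoting the path or backslash-escaping each space should keep it in one piece.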

technicolor-twelve · 2023-05-15T22:56:24Z

@SlyEcho If it's anything like my experience, it's probably w64devkit with MinGW32. make works on Windows both with OpenBLAS and without BLAS, so I naturally assumed from the README that the cuBLAS build would work out of the box on Windows as well.

psegovias · 2023-05-16T00:35:30Z

Do I have to escape them directly in the Makefile?

I'm using w64devkit-1.19.0, as the project suggests.

psegovias · 2023-05-16T01:31:45Z

After adding double quotation marks (" ") around the CUDA path it works, but now it fails at this step:

nvcc --forward-unknown-to-host-compiler -arch=native -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I"C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/targets/x86_64-linux/include" -Wno-pedantic -c ggml-cuda.cu -o ggml-cuda.o
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
ggml-cuda.cu
cl : Command line error D8021 : invalid numeric argument '/Wextra'
make: *** [Makefile:133: ggml-cuda.o] Error 2

SlyEcho · 2023-05-16T08:00:54Z

nvcc is calling cl (MSVC), which uses different flags than GCC. You can't mix compilers like this. You would probably have to rewrite a lot of the Makefile.

Or use CMake, which should configure everything automatically and find the paths, compilers, and so on.
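To see which flags in the failing nvcc command are GCC-style `-W` options, a small filter like the one below can list them. It is a diagnostic sketch rather than a fix, and `list_w_flags` is a hypothetical helper. Judging by the D8021 message in the log, cl reads `-Wextra` as its own `/W` warning-level option with an invalid level.

```shell
#!/bin/sh
# Hypothetical diagnostic helper: print every argument that starts with -W.
list_w_flags() {
    for flag in "$@"; do
        case "$flag" in
            -W*) printf '%s\n' "$flag" ;;
        esac
    done
}

# A shortened copy of the flags from the failing nvcc command.
w_flags=$(list_w_flags -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -march=native)
echo "$w_flags"
```

Each listed flag is one that `--forward-unknown-to-host-compiler` would hand to the host compiler, which is why a Makefile written for GCC does not carry over to cl unchanged.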

RahulVivekNair · 2023-05-16T13:51:29Z

Can confirm, using CMake is the way to go. I didn't face any issues!

CRD716 · 2023-05-16T15:42:29Z

CMake for cuBLAS, w64devkit for OpenBLAS. Just makes things work.

psegovias · 2023-05-16T16:40:07Z

Fixed by using CMake instead of w64devkit. Thanks to all!

obriensystems · 2024-02-11T16:06:50Z

Adjusting the environment variable works well, using either the form below or a shortened copy of the path:
-LC:/Progra~~1/"NVIDIA~~1/CUDA/v12.3/targets/x86_64-linux/lib
That works until:
nvcc fatal : Cannot find compiler 'cl.exe' in PATH
make: *** [Makefile:430: ggml-cuda.o] Error 1

Fix: add this directory to PATH:

C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\Hostx64\x64
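Before re-running make, it can help to confirm that the tools are actually discoverable from the current shell. The check below is a minimal sketch; `require_tool` is a hypothetical helper, and it only reports what `command -v` finds on PATH.

```shell
#!/bin/sh
# Hypothetical helper: report whether a tool is reachable via PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing from PATH: $1"
        return 1
    fi
}

# nvcc needs cl.exe as its host compiler on Windows, so check both.
missing=0
for tool in nvcc cl.exe; do
    require_tool "$tool" || missing=1
done
echo "missing tools: $missing"
```

If `cl.exe` is reported missing, nvcc would be expected to fail with the "Cannot find compiler" error shown above.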

That solves it, up to the next error:

nvcc -I. -Icommon -D_XOPEN_SOURCE=600 -DNDEBUG -D_WIN32_WINNT=0x602 -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -IC:/opt/CUDA/v12.3/targets/x86_64-linux/include -I/usr/local/cuda/targets/aarch64-linux/include  -std=c++11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Xassembler -muse-unaligned-vector-move  -O3 -use_fast_math --forward-unknown-to-host-compiler -arch=native -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128  -Wno-pedantic -Xcompiler "-Wno-array-bounds" -c ggml-cuda.cu -o ggml-cuda.o
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
ggml-cuda.cu
cl : Command line error D8021 : invalid numeric argument '/Wno-array-bounds'
make: *** [Makefile:430: ggml-cuda.o] Error 2

olariuromeo · 2024-02-23T23:53:07Z

TORCH_CUDA_ARCH_LIST=8.6+PTX
APP_GID=6972
APP_UID=6972

psegovias closed this as completed May 16, 2023

obriensystems mentioned this issue Feb 11, 2024

llama.cpp on Nvidia RTX-3500, RTX-A4500 dual, RTX-4090 dual ObrienlabsDev/machine-learning#10

Open

johnny-smitherson mentioned this issue Feb 22, 2025

windows build does not work ShelbyJenkins/llm_client#11

Open

Bearsaerker mentioned this issue Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Open
