
Share a simpler CMake method to compile and run a GPU-accelerated version (OpenBLAS and CLBlast) on an Android Qualcomm Adreno device. #2169

Closed
hchenphd opened this issue Jul 11, 2023 · 6 comments

@hchenphd

I browsed all the issues and the official setup tutorial for compiling llama.cpp for the GPU, but I found it really confusing: it uses the Make tool and copies files from a source path to a destination path (the official setup tutorial in particular is a little odd).
Here is the method I put together, which I think is much simpler and more elegant.

0 Install NDK and CMake tools

Please refer to the basic steps for compiling llama.cpp on the CPU on an Android device.

1 Install OpenBLAS

pkg update
pkg upgrade
apt install libopenblas

2 Install OpenCL

apt install clang cmake cmake-curses-gui opencl-headers ocl-icd
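As a small, hedged sanity check before building (the driver path below is an assumption; it is typical for Qualcomm/Adreno devices but varies by vendor), you can look for the vendor OpenCL library:

```shell
# Check whether the vendor OpenCL driver is visible at the path commonly
# used on Qualcomm devices; adjust the path for your device if needed.
if [ -e /vendor/lib64/libOpenCL.so ]; then
  echo "vendor OpenCL driver found"
else
  echo "vendor OpenCL driver not found at /vendor/lib64/libOpenCL.so"
fi
```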

3 Install CLBlast

git clone https://github.com/CNugteren/CLBlast.git
cd CLBlast
cmake -B build \
  -DBUILD_SHARED_LIBS=OFF \
  -DTUNERS=OFF \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=/data/data/com.termux/files/usr
cd build
make -j8
make install
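The -j8 above assumes an eight-core device. A minimal sketch for deriving the job count from the device instead (falling back to 4 if nproc is unavailable):

```shell
# Pick a parallel job count for make from the number of CPU cores,
# instead of hardcoding -j8; fall back to 4 if nproc is missing.
JOBS=$(nproc 2>/dev/null || echo 4)
echo "building with make -j${JOBS}"
```

Then run `make -j${JOBS}` in place of `make -j8` in this step and the next.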

4 Build llama

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp/
cmake -B build-gpu -DLLAMA_CLBLAST=ON
cd build-gpu
make -j8

5 Run llama

export LD_LIBRARY_PATH=/vendor/lib64:$LD_LIBRARY_PATH
# run `unset LD_LIBRARY_PATH` when you are done, otherwise cmake will not work again
GGML_OPENCL_PLATFORM=0 GGML_OPENCL_DEVICE=0 ./bin/main -m model-path -p "prompt" ...
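Instead of export followed by unset, a VAR=value prefix scopes a variable to a single command, which sidesteps the cmake breakage noted above. A minimal demonstration:

```shell
# A leading VAR=value assignment applies only to that one command;
# the surrounding shell never sees it, so no `unset` is needed afterwards.
GGML_OPENCL_PLATFORM=0 sh -c 'echo "inside: platform=$GGML_OPENCL_PLATFORM"'
echo "after: platform=${GGML_OPENCL_PLATFORM:-unset}"
```

Applied to the run step, that would be a single command like `LD_LIBRARY_PATH=/vendor/lib64 GGML_OPENCL_PLATFORM=0 GGML_OPENCL_DEVICE=0 ./bin/main -m model-path -p "prompt"`.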

@callMeMakerRen

can you provide some performance info? thx

@ghost

ghost commented Jul 11, 2023

5 Run llama

export LD_LIBRARY_PATH=/vendor/lib64:$LD_LIBRARY_PATH
# run `unset LD_LIBRARY_PATH` when you are done, otherwise cmake will not work again
GGML_OPENCL_PLATFORM=0 GGML_OPENCL_DEVICE=0 ./bin/main -m model-path -p "prompt" ...

It's an improvement over the current instructions. It's more a generic guide than an Adreno/Snapdragon guide.

"ocl-icd" doesn't work with Adreno the same way it does on other Android devices, so the export and the platform/device settings may be unnecessary. This works for my device: LD_LIBRARY_PATH=/vendor/lib64 ./main ...

Ensure to move your model to the correct directory for better performance: For example: mv ~/storage/downloads/7b-ggml-q4_0.bin ~/llama.cpp/models

@AndreaChiChengdu

AndreaChiChengdu commented Sep 12, 2023

(quoting the full build guide from the original post)

Just a quick update: it works, and I can see the GPU being used (-ngl with a big number), but the performance is very poor (on a Qualcomm 8 Gen 1 it is slower than CPU OpenBLAS). Maybe something is wrong?

@rayrayraykk

(quoting the reply above:) Ensure to move your model to the correct directory for better performance: For example: mv ~/storage/downloads/7b-ggml-q4_0.bin ~/llama.cpp/models

I am an Android noob; can you tell me why moving the model gives better performance?

This was referenced Jan 16, 2024
@Jeximo
Contributor

Jeximo commented Mar 9, 2024

@rayrayraykk

I am an android noob, can you tell me why I can get better performance via moving model?

The reason moving the model from downloads to ~/ increases performance in Termux is due to limits in the Android file system. My limited understanding is that programs running in Termux have faster access to files in ~/ (the app's native private storage) than to storage/downloads, which is shared storage exposed through a FUSE layer.
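That difference can be sanity-checked with a rough dd timing. This sketch writes and reads a scratch file under /tmp as a stand-in; on an actual Termux device you would read a file under ~/storage/downloads versus the same file copied under ~/ (paths and file name here are illustrative):

```shell
# Rough read-throughput check with dd. On Termux, point the second dd at a
# file under ~/storage/downloads (FUSE-backed shared storage) and then at a
# copy under ~/ (native app storage) to compare; /tmp is only a stand-in.
f=/tmp/ddtest.bin
dd if=/dev/zero of="$f" bs=1M count=8 2>/dev/null  # create an 8 MiB scratch file
dd if="$f" of=/dev/null bs=1M 2>&1 | tail -n 1     # last line reports throughput
rm -f "$f"
```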


This issue was closed because it has been inactive for 14 days since being marked as stale.
