
Share a simpler CMake method to compile and run a GPU-accelerated version (OpenBLAS and CLBlast) on an Android Qualcomm Adreno device. #2169

Closed
hchenphd opened this issue Jul 11, 2023 · 6 comments

@hchenphd

I browsed all the issues and the official setup tutorial for compiling llama.cpp for the GPU, but I found it really confusing: it uses the Make tool and copies files from a source path to a destination path (the official setup tutorial in particular is a little odd).
Here is the method I put together, which I think is much simpler and more elegant.

0 Install NDK and CMake tools

Please refer to the basic steps for compiling llama.cpp on the CPU on an Android device.

1 Install OpenBLAS

pkg update
pkg upgrade
apt install libopenblas

2 Install OpenCL

apt install clang cmake cmake-curses-gui opencl-headers ocl-icd
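As a small, hedged sanity check before building (the driver path below is an assumption; it is typical for Qualcomm/Adreno devices but varies by vendor), you can look for the vendor OpenCL library:

```shell
# Check whether the vendor OpenCL driver is visible at the path commonly
# used on Qualcomm devices; adjust the path for your device if needed.
if [ -e /vendor/lib64/libOpenCL.so ]; then
  echo "vendor OpenCL driver found"
else
  echo "vendor OpenCL driver not found at /vendor/lib64/libOpenCL.so"
fi
```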

3 Install CLBlast

git clone https://github.com/CNugteren/CLBlast.git
cd CLBlast
cmake -B build \
  -DBUILD_SHARED_LIBS=OFF \
  -DTUNERS=OFF \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=/data/data/com.termux/files/usr
cd build
make -j8
make install
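The -j8 above assumes an eight-core device. A minimal sketch for deriving the job count from the device instead (falling back to 4 if nproc is unavailable):

```shell
# Pick a parallel job count for make from the number of CPU cores,
# instead of hardcoding -j8; fall back to 4 if nproc is missing.
JOBS=$(nproc 2>/dev/null || echo 4)
echo "building with make -j${JOBS}"
```

Then run `make -j${JOBS}` in place of `make -j8` in this step and the next.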

4 Build llama

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp/
cmake -B build-gpu -DLLAMA_CLBLAST=ON
cd build-gpu
make -j8

5 Run llama

export LD_LIBRARY_PATH=/vendor/lib64:$LD_LIBRARY_PATH
# run `unset LD_LIBRARY_PATH` when you are done, otherwise cmake will not work again
GGML_OPENCL_PLATFORM=0 GGML_OPENCL_DEVICE=0 ./bin/main -m model-path -p "prompt" ...
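Instead of export followed by unset, a VAR=value prefix scopes a variable to a single command, which sidesteps the cmake breakage noted above. A minimal demonstration:

```shell
# A leading VAR=value assignment applies only to that one command;
# the surrounding shell never sees it, so no `unset` is needed afterwards.
GGML_OPENCL_PLATFORM=0 sh -c 'echo "inside: platform=$GGML_OPENCL_PLATFORM"'
echo "after: platform=${GGML_OPENCL_PLATFORM:-unset}"
```

Applied to the run step, that would be a single command like `LD_LIBRARY_PATH=/vendor/lib64 GGML_OPENCL_PLATFORM=0 GGML_OPENCL_DEVICE=0 ./bin/main -m model-path -p "prompt"`.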

@callMeMakerRen

can you provide some performance info? thx

@ghost

ghost commented Jul 11, 2023

5 Run llama

export LD_LIBRARY_PATH=/vendor/lib64:$LD_LIBRARY_PATH
# run `unset LD_LIBRARY_PATH` when you are done, otherwise cmake will not work again
GGML_OPENCL_PLATFORM=0 GGML_OPENCL_DEVICE=0 ./bin/main -m model-path -p "prompt" ...

It's an improvement over the current instructions. It's more a generic guide than an Adreno/Snapdragon guide.

"ocl-icd" doesn't work with Adreno the same way it does on other Android devices, so the export and the platform/device settings may be unnecessary. This works for my device: LD_LIBRARY_PATH=/vendor/lib64 ./main ...

Ensure to move your model to the correct directory for better performance: For example: mv ~/storage/downloads/7b-ggml-q4_0.bin ~/llama.cpp/models

@AndreaChiChengdu

AndreaChiChengdu commented Sep 12, 2023

(quoting the full build guide from the original post)

Just a quick update: it works, and I can see the GPU being used (-ngl with a big number), but the performance is very poor (on a Qualcomm 8 Gen 1 it is slower than CPU OpenBLAS). Maybe something is wrong?

@rayrayraykk

(quoting the reply above:) Ensure to move your model to the correct directory for better performance: For example: mv ~/storage/downloads/7b-ggml-q4_0.bin ~/llama.cpp/models

I am an Android noob; can you tell me why moving the model gives better performance?

This was referenced Jan 16, 2024
@Jeximo
Contributor

Jeximo commented Mar 9, 2024

@rayrayraykk

I am an android noob, can you tell me why I can get better performance via moving model?

The reason moving the model from downloads to ~/ increases performance in Termux is due to limits in the Android file system. My limited understanding is that programs running in Termux have faster access to files in ~/ (the app's native private storage) than to storage/downloads, which is shared storage exposed through a FUSE layer.
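That difference can be sanity-checked with a rough dd timing. This sketch writes and reads a scratch file under /tmp as a stand-in; on an actual Termux device you would read a file under ~/storage/downloads versus the same file copied under ~/ (paths and file name here are illustrative):

```shell
# Rough read-throughput check with dd. On Termux, point the second dd at a
# file under ~/storage/downloads (FUSE-backed shared storage) and then at a
# copy under ~/ (native app storage) to compare; /tmp is only a stand-in.
f=/tmp/ddtest.bin
dd if=/dev/zero of="$f" bs=1M count=8 2>/dev/null  # create an 8 MiB scratch file
dd if="$f" of=/dev/null bs=1M 2>&1 | tail -n 1     # last line reports throughput
rm -f "$f"
```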


This issue was closed because it has been inactive for 14 days since being marked as stale.
