Clarify instructions for CLBlast on Android #706

Merged: 9 commits merged into ggerganov:master on Jan 26, 2024

Conversation

luciferous (Contributor) commented on Jan 24, 2024

Documenting for anyone else who wants to get CLBlast running on Android.

$ OPENCL_ROOT=$(readlink -f ../../OpenCL-Headers) \
  CLBLAST_HOME=$(readlink -f ../../CLBlast) \
  $ANDROID_SDK_PATH/cmake/3.22.1/bin/cmake .. \
    -DGGML_CLBLAST=ON \
    -DCMAKE_SYSTEM_NAME=Android \
    -DCMAKE_SYSTEM_VERSION=33 \
    -DCMAKE_ANDROID_ARCH_ABI=arm64-v8a \
    -DCMAKE_ANDROID_NDK=$NDK_ROOT_PATH \
    -DCMAKE_ANDROID_STL_TYPE=c++_shared \
    -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH \
    -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=BOTH \
    -DOPENCL_LIB=$(readlink -f ../arm64-v8a/libOpenCL.so)
-- Android: Targeting API '33' with architecture 'arm64', ABI 'arm64-v8a', and processor 'aarch64'
-- Android: Selected unified Clang toolchain
-- The C compiler identification is Clang 17.0.2
-- The CXX compiler identification is Clang 17.0.2
[...snip...]
-- CMAKE_SYSTEM_PROCESSOR: aarch64
-- ARM detected
-- clBLAST found
-- ARM detected
-- Configuring done
-- Generating done
-- Build files have been written to: /.../ggml/build
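
To finish the build and get everything onto the device, something along these lines should work (a sketch, not from the PR: the target directories, the libc++_shared.so location, and whether a shared libclblast.so also needs pushing depend on your host, NDK layout, and how CLBlast was built):

# Build from the same build directory that was configured above
$ $ANDROID_SDK_PATH/cmake/3.22.1/bin/cmake --build . -j
# Push the binary, the shared C++ runtime (needed because of -DCMAKE_ANDROID_STL_TYPE=c++_shared), and a model
$ adb shell mkdir -p /data/local/tmp/bin /data/local/tmp/models
$ adb push bin/gpt-2-backend /data/local/tmp/bin/
$ adb push $NDK_ROOT_PATH/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android/libc++_shared.so /data/local/tmp/
$ adb push models/ggml-model.bin /data/local/tmp/models/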

Example run using adb shell (note the addition of /system/vendor/lib64/egl to LD_LIBRARY_PATH, so the vendor's OpenCL driver can be found).
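
The export and run commands that follow are typed inside the device shell; one way to get there (assumed, not shown in the PR):

$ adb shell
$ cd /data/local/tmp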

$ export LD_LIBRARY_PATH=/system/vendor/lib64/egl:/data/local/tmp
$ ./bin/gpt-2-backend -m models/ggml-model.bin -n 64 -p "Pepperoni pizza"
main: seed = 1706101586
gpt2_model_load: loading model from 'models/ggml-model.bin'
gpt2_model_load: n_vocab = 50257
gpt2_model_load: n_ctx   = 1024
gpt2_model_load: n_embd  = 768
gpt2_model_load: n_head  = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: ftype   = 1
gpt2_model_load: qntvr   = 0
ggml_opencl: selecting platform: 'ARM Platform'
ggml_opencl: selecting device: 'Mali-G710 r0p0'
ggml_opencl: device FP16 support: true
gpt2_model_load: using CPU backend
gpt2_model_load: ggml tensor size    = 368 bytes
gpt2_model_load: backend buffer size = 312.70 MB
gpt2_model_load: memory size =   144.00 MB, n_mem = 24576
gpt2_model_load: model size  =   239.08 MB
extract_tests_from_file : No test file found.
test_gpt_tokenizer : 0 tests failed out of 0 tests.
main: compute buffer size: 6.87 MB
main: prompt: 'Pepperoni pizza'
main: number of tokens in prompt = 4, first 8 tokens: 6435 2848 14651 14256 

Pepperoni pizza is a staple at most restaurants in the US. This is the most popular pizza style in the country, so they are famous for their pepperoni pizzas, which are usually made from scratch, and have a variety of toppings such as chili pepper, tomato paste, mustard and pepperoni. They also sell pepperoni

main:     load time =   486.01 ms
main:   sample time =    23.21 ms
main:  predict time =  2166.68 ms / 32.34 ms per token
main:    total time =  2680.16 ms

OpenCL does not have the same level of support in ggml-backend as CUDA or Metal. In the gpt-2-backend example, OpenCL will only be used for the matrix multiplications when evaluating large prompts.
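
As an illustration of that point (a hypothetical invocation, reusing the flags shown above), a long prompt is what routes work through CLBlast, since prompt evaluation batches the matrix multiplications; the per-token generation that follows still runs on the CPU backend:

$ ./bin/gpt-2-backend -m models/ggml-model.bin -n 64 \
    -p "A considerably longer prompt spanning several dozen tokens, so that prompt evaluation performs large matrix multiplications on the GPU via CLBlast"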

slaren (Collaborator) commented on Jan 24, 2024

This is good, but I think we should clarify that OpenCL does not have the same level of support as CUDA or Metal. With the gpt-2-backend example (or any other really), OpenCL will only be used for the matrix multiplications when evaluating large prompts.

luciferous (Contributor, Author) commented:

@slaren Added a note to the README and also to the commit message.

luciferous marked this pull request as draft on January 25, 2024, 05:05
Co-authored-by: slaren <slarengh@gmail.com>
ggerganov merged commit c2448f8 into ggerganov:master on Jan 26, 2024 (4 checks passed)