whisper.android: How to build with CLBlast #1809
In draft, because the
|
Could you post the timing information? Example:
|
Did it produce correct inference? My build cannot find anything when compiled with CLBlast on Android. |
Looks like I was wrong about it not speeding up transcription 🙂. The timing numbers below are from transcribing jfk.wav:
I'll update the commit message to reflect this. To clarify my aims: since I don't understand the basis of the effect on performance, I'm hesitant to make any strong claims for my changes. My goal with this PR is only to merge a reference build that makes further development of CLBlast on Android easier. Android logs aren't hooked up to the transcription, so this is what I did to retrieve the
diff --git a/examples/whisper.android/lib/src/main/jni/whisper/jni.c b/examples/whisper.android/lib/src/main/jni/whisper/jni.c
index 7f9d724..e522a3a 100644
--- a/examples/whisper.android/lib/src/main/jni/whisper/jni.c
+++ b/examples/whisper.android/lib/src/main/jni/whisper/jni.c
@@ -14,6 +14,13 @@
#define LOGI(...) __android_log_print(ANDROID_LOG_INFO, TAG, __VA_ARGS__)
#define LOGW(...) __android_log_print(ANDROID_LOG_WARN, TAG, __VA_ARGS__)
+static void log_callback(enum ggml_log_level level, const char * text, void * data) {
+    if (level == GGML_LOG_LEVEL_ERROR) __android_log_print(ANDROID_LOG_ERROR, TAG, "%s", text);
+    else if (level == GGML_LOG_LEVEL_INFO) __android_log_print(ANDROID_LOG_INFO, TAG, "%s", text);
+    else if (level == GGML_LOG_LEVEL_WARN) __android_log_print(ANDROID_LOG_WARN, TAG, "%s", text);
+    else __android_log_print(ANDROID_LOG_DEFAULT, TAG, "%s", text);
+}
+
static inline int min(int a, int b) {
return (a < b) ? a : b;
}
@@ -182,6 +189,8 @@ Java_com_whispercpp_whisper_WhisperLib_00024Companion_fullTranscribe(
params.no_context = true;
params.single_segment = false;
+ whisper_log_set(log_callback, NULL);
+
whisper_reset_timings(context);
LOGI("About to run whisper_full"); |
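A note on the callback in the diff above: as far as I can tell, the string handed to the log callback is already-formatted text, so it is safer to pass it as an argument to a "%s" format than to use it as the format string itself (a stray % in the message would otherwise be interpreted). A minimal sketch of that pattern follows. The names sketch_log_level, sketch_priority and sketch_format_log are hypothetical stand-ins, not part of ggml or the Android NDK, and snprintf stands in for __android_log_print so the snippet compiles off-device.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for ggml's log levels; the real enum lives in ggml.h. */
enum sketch_log_level {
    SKETCH_LOG_ERROR,
    SKETCH_LOG_WARN,
    SKETCH_LOG_INFO,
    SKETCH_LOG_OTHER
};

/* Map a level to a one-letter priority tag, similar to logcat's E/W/I/D. */
static char sketch_priority(enum sketch_log_level level) {
    switch (level) {
        case SKETCH_LOG_ERROR: return 'E';
        case SKETCH_LOG_WARN:  return 'W';
        case SKETCH_LOG_INFO:  return 'I';
        default:               return 'D';
    }
}

/* Write one log line into buf. The message is passed as data to "%s",
 * never as the format string, so any '%' it contains is printed literally. */
static int sketch_format_log(char * buf, size_t size,
                             enum sketch_log_level level, const char * text) {
    return snprintf(buf, size, "%c/JNI: %s", sketch_priority(level), text);
}
```

In the JNI file the same idea would amount to calling __android_log_print(priority, TAG, "%s", text) for each level.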
@gpokat Transcribing |
The output shows only [music] for me. *Update. |
@gpokat Just want to verify that the
If not, then, it might not have built correctly. |
@luciferous Did you build the CPU version in Release mode (i.e. -O3)?
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
@ggerganov I think these are in Debug. I'll follow up with numbers from a Release build and update the commit message with something appropriate. Built against ggerganov/ggml@53558f9. Verified O3 via:
Transcribe sample
Benchmarks
|
@ggerganov Updated the commit message with the new measurements. Curiously, changes to
|
@luciferous Did you build CLBlast with the tuner, or is your device already in CLBlast's list of tuned devices? https://github.com/CNugteren/CLBlast/blob/master/doc/tuning.md |
@gpokat Ah, I see. Thank you for explaining. No, I didn't build with the tuner, and my device (Mali-G710) doesn't seem to be on the list. I'll incorporate your explanation into a note in the README. |
Great job! I'm wondering how CLBlast performs on tuned GPUs with larger models and longer audio samples. I'll run some tests and come back with additional benchmarks later.
* FetchContent
* OpenCL
* Documentation and make optional
* Specify GGML build options in build.gradle
* Use gradle properties
* @ggerganov
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* @gpokat
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
After compiling whisper.cpp with CLBlast and running it on my Snapdragon 8 Gen 2 device (with the Adreno 740 GPU tuned for CLBlast), benchmark results and transcription speed showed no significant change (neither improvement nor regression) compared to CPU inference without it. This could be because ggml isn't offloading many computations to the GPU via OpenCL. See also: |
Documenting for anyone else who wants to get CLBlast running on Android.
Depends on ggerganov/ggml#706.
Benchmark and transcription measurements below are from a Release variant built against ggerganov/ggml@53558f9 (with CLBlast) and e72e415 (without CLBlast).
Benchmarks
Without CLBlast (BLAS = 0).
With CLBlast (BLAS = 1).
Transcribe jfk.wav.
BLAS = 0
BLAS = 1