
whisper : replace tensor->n_dims with ggml_n_dims(tensor) #1694

Merged
merged 1 commit into from
Dec 29, 2023

Conversation

bobqianic
Collaborator

@bobqianic bobqianic commented Dec 27, 2023

Since tensor->n_dims is now deprecated, we've switched to using ggml_n_dims(tensor) as its replacement.

Even though slaren mentioned that ggml_n_dims isn't a direct substitute for tensor::n_dims, that's not a concern in this case, because both ne[0] and ne[1] exceed 1 for the tensors mel and out.

Closes #1690

@slaren
Collaborator

slaren commented Dec 27, 2023

I don't know if it is an issue here, but ggml_n_dims is not a drop-in replacement for tensor::n_dims because it counts the number of non-1 dimensions. So for example, a tensor created with ggml_new_tensor_2d(.., 10, 1) would have n_dims = 2, but ggml_n_dims would return 1. If this is a potential issue in this case, you can instead check for ggml_n_dims(tensor) <= 2, or use the equivalent ggml_is_matrix.

@bobqianic
Collaborator Author

> So for example, a tensor created with ggml_new_tensor_2d(.., 10, 1) would have n_dims = 2, but ggml_n_dims would return 1.

Ah, I see. But then, why did we deprecate n_dims in ggml_tensor? It seems a bit illogical to me.

whisper.cpp/ggml.h

Lines 506 to 544 in 37a709f

struct ggml_tensor {
    enum ggml_type         type;
    enum ggml_backend_type backend;

    struct ggml_backend_buffer * buffer;

    int64_t ne[GGML_MAX_DIMS]; // number of elements
    size_t  nb[GGML_MAX_DIMS]; // stride in bytes:
                               // nb[0] = ggml_type_size(type)
                               // nb[1] = nb[0] * (ne[0] / ggml_blck_size(type)) + padding
                               // nb[i] = nb[i-1] * ne[i-1]

    // compute data
    enum ggml_op op;

    // op params - allocated as int32_t for alignment
    int32_t op_params[GGML_MAX_OP_PARAMS / sizeof(int32_t)];

    bool is_param;

    struct ggml_tensor * grad;
    struct ggml_tensor * src[GGML_MAX_SRC];

    // performance
    int     perf_runs;
    int64_t perf_cycles;
    int64_t perf_time_us;

    struct ggml_tensor * view_src;
    size_t               view_offs;

    void * data;

    char name[GGML_MAX_NAME];

    void * extra; // extra things e.g. for ggml-cuda.cu

    char padding[8];
};

@bobqianic
Collaborator Author

Please feel free to correct me if I'm mistaken. I believe a good strategy to avoid this issue is to sanitize the input during the creation of a ggml_tensor. For instance, when using ggml_new_tensor_2d, it's important to ensure that both ne[0] and ne[1] are greater than 1.

@slaren
Collaborator

slaren commented Dec 28, 2023

In practice, n_dims was not very useful, and not needing to propagate the value in operations simplifies the code a bit. For this kind of check, ggml_n_dims() with <= should achieve the same result. Forcing dimensions to be greater than 1 in the ggml_new_tensor_nd functions would make a lot of code more complex for no reason: applications would need to check the value of the higher dimensions and use a different function in each case. A dimension value of 1 is not an error.

@Abdalnablse10

I tried the fix; it built successfully, but I got this error when trying to test it.
abdalnablse10@x86-1:~/TestFix/whisper.cpp$ ./build/bin/main -m models/ggml-base.en-encoder-openvino.bin -f samples/jfk.wav
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en-encoder-openvino.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_with_params_no_state: failed to load model
error: failed to initialize whisper context

Converting the model went as follows.
(openvino_conv_env) abdalnablse10@x86-1:~/TestFix/whisper.cpp/models$ python3 convert-whisper-to-openvino.py --model base.en
/home/abdalnablse10/whisper.cpp/models/openvino_conv_env/lib/python3.11/site-packages/whisper/model.py:166: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert x.shape[1:] == self.positional_embedding.shape, "incorrect audio shape"

It generated two files: ggml-base.en-encoder-openvino.bin and ggml-base.en-encoder-openvino.xml.

@bobqianic
Collaborator Author

> abdalnablse10@x86-1:~/TestFix/whisper.cpp$ ./build/bin/main -m models/ggml-base.en-encoder-openvino.bin -f samples/jfk.wav
> whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en-encoder-openvino.bin'
> whisper_model_load: loading model
> whisper_model_load: invalid model data (bad magic)
> whisper_init_with_params_no_state: failed to load model
> error: failed to initialize whisper context

You only need to provide the path of the standard model to main. Ensure that both the standard model and the OpenVINO encoder model have matching names and are located in the same directory. For instance, if the standard model is named ABCD.bin, then the corresponding OpenVINO model should be named ABCD-encoder-openvino.bin.

> /home/abdalnablse10/whisper.cpp/models/openvino_conv_env/lib/python3.11/site-packages/whisper/model.py:166: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
> assert x.shape[1:] == self.positional_embedding.shape, "incorrect audio shape"

This is just a warning. Ignore it.

@Abdalnablse10

./build/bin/main -m ./models/ggml-base.en.bin -f ./samples/jfk.wav
whisper_init_from_file_with_params_no_state: loading model from './models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs = 99
Illegal instruction

I'm starting to believe I'm doing something dumb. Does the Intel Atom Z8500 support OpenVINO?

@bobqianic
Collaborator Author

> I'm starting to believe I'm doing something dumb. Does the Intel Atom Z8500 support OpenVINO?

Although OpenVINO supports the Z8500, the issue likely arises because the Z8500 lacks AVX support, yet OpenVINO still utilizes it, leading to illegal instructions. Try adding the flags -DWHISPER_NO_AVX and -DWHISPER_NO_AVX2 when you're building with CMake. This might help with the issue.

[image]

@Abdalnablse10

> I'm starting to believe I'm doing something dumb. Does the Intel Atom Z8500 support OpenVINO?

> Although OpenVINO supports the Z8500, the issue likely arises because the Z8500 lacks AVX support, yet OpenVINO still utilizes it, leading to illegal instructions. Try adding the flags -DWHISPER_NO_AVX and -DWHISPER_NO_AVX2 when you're building with CMake. This might help with the issue.

> [image]

cmake -B build -DWHISPER_OPENVINO=1 -DWHISPER_NO_AVX=1 -DWHISPER_NO_AVX2=1

cmake --build build -j --config Release

./build/bin/main -m ./models/ggml-base.en.bin -f ./samples/jfk.wav
whisper_init_from_file_with_params_no_state: loading model from './models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs = 99
Illegal instruction

It looks like that didn't work.

@bobqianic
Collaborator Author

bobqianic commented Dec 28, 2023

> Looks like it didn't work.

Alright, please go to the OpenVINO repository and create an issue. I thought this issue was resolved in 2020, but it appears there's a regression.

openvinotoolkit/openvino#5782
openvinotoolkit/openvino#387

Owner

@ggerganov ggerganov left a comment

Cannot test OpenVINO build, but I think the usage of ggml_n_dims here is correct

@ggerganov ggerganov merged commit f5f485f into ggerganov:master Dec 29, 2023
39 checks passed
@bobqianic bobqianic deleted the fix-openvino branch December 29, 2023 20:43
viktor-silakov pushed a commit to viktor-silakov/whisper_node_mic.cpp that referenced this pull request May 11, 2024
iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024
Development

Successfully merging this pull request may close these issues.

Error trying to build whisper.cpp with openvino support.
4 participants