The whisper.cpp API allows the state to be allocated externally, using init functions such as whisper_init_from_file_with_params_no_state() followed by whisper_init_state(), instead of internally, as functions like whisper_init_from_file_with_params() do.
I find it very useful to share one context (i.e. one loaded model) across multiple parallel transcriptions, each of which uses its own state (through functions like whisper_full_with_state()). Although the comments in whisper.h suggest this is not thread-safe, it is still the recommended way to share a loaded model across multiple transcriptions (#341 (comment)). As far as I have tested (CPU and CUDA), it works well.
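To make the pattern concrete, here is a minimal, library-free sketch of "one shared context, one state per transcription". It does not call whisper.cpp: `SharedModel`, `DecodeState`, `transcribe()` and `transcribe_parallel()` are hypothetical stand-ins for a context created without a state, a state from whisper_init_state(), and whisper_full_with_state() respectively. It only illustrates the ownership rule, assuming the shared model is treated as read-only once loaded.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a context loaded without a state:
// treated as immutable once it has been created.
struct SharedModel {
    std::string name;
};

// Hypothetical stand-in for a per-transcription state:
// all mutable scratch data lives here and belongs to exactly one thread.
struct DecodeState {
    std::size_t samples_seen = 0;
    std::string text;
};

// Hypothetical stand-in for whisper_full_with_state():
// only reads the model and only writes to the caller's own state.
void transcribe(const SharedModel & model, DecodeState & state,
                const std::vector<float> & pcm) {
    state.samples_seen = pcm.size();
    state.text = model.name + ":" + std::to_string(pcm.size());
}

// Runs one transcription per input, each on its own thread with its own
// state, while every thread shares the same model by const reference.
std::vector<std::string> transcribe_parallel(
        const SharedModel & model,
        const std::vector<std::vector<float>> & inputs) {
    std::vector<DecodeState> states(inputs.size()); // one state per worker
    std::vector<std::thread> workers;
    workers.reserve(inputs.size());

    for (std::size_t i = 0; i < inputs.size(); ++i) {
        workers.emplace_back([&model, &states, &inputs, i] {
            transcribe(model, states[i], inputs[i]);
        });
    }
    for (auto & w : workers) {
        w.join();
    }

    std::vector<std::string> out;
    out.reserve(states.size());
    for (const auto & s : states) {
        out.push_back(s.text);
    }
    return out;
}
```

With the real API, the same shape would mean loading the model once, creating one state per worker, and passing that worker's state to every call it makes. Whether this is safe still depends on the library never writing to the shared context during a transcription.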
However, this approach cannot be used with OpenVINO, because the init function whisper_ctx_init_openvino_encoder() expects the state to have already been allocated internally by functions like whisper_init_from_file_with_params(). Would it be possible to add something like whisper_ctx_init_openvino_encoder_no_state() and whisper_init_openvino_state() to the main whisper.cpp API, so that a common context can be used for multiple parallel transcriptions with OpenVINO?