Extracting word/token timestamps from audio fails with OpenVINO. Passing either return_timestamps="word" or return_timestamps="True" results in a failure; without the argument, transcription finishes successfully.
File "/run/media/greggy/1a4fd6d7-1f9d-42c6-9324-661804695013/D/owisp/./drain_w.py", line 70, in <module>
result = pipe("./4.wav")
^^^^^^^^^^^^^^^
File "/home/greggy/.local/lib/python3.11/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 292, in __call__
return super().__call__(inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/greggy/.local/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1154, in __call__
return next(
^^^^^
File "/home/greggy/.local/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
item = next(self.iterator)
^^^^^^^^^^^^^^^^^^^
File "/home/greggy/.local/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 266, in __next__
processed = self.infer(next(self.iterator), **self.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/greggy/.local/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1068, in forward
model_outputs = self._forward(model_inputs, **forward_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/greggy/.local/lib/python3.11/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 507, in _forward
tokens = self.model.generate(
^^^^^^^^^^^^^^^^^^^^
File "/home/greggy/.local/lib/python3.11/site-packages/optimum/intel/openvino/modeling_seq2seq.py", line 1018, in generate
outputs["token_timestamps"] = self._extract_token_timestamps(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: '_OVModelForWhisper' object has no attribute '_extract_token_timestamps'
Issue submission checklist
I'm reporting an issue. It's not a question.
I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
There is reproducer code and related data files such as images, videos, models, etc.
OpenVINO Version
tag 2023.3.0
Operating System
Other (Please specify in description)
Device used for inference
GPU
Framework
ONNX
Model used
distil-whisper/distil-small.en
Step-by-step reproduction
I'm using the following code
Relevant log output