Description
The ONNX Runtime inference example js/ort-whisper is currently broken. There are multiple issues with this example.
Issue 1: The Whisper example has been removed from Olive
The example references an example Whisper model in the Olive repo that has recently been removed: https://github.com/microsoft/Olive/tree/main/examples/whisper. The example was still available as recently as v0.8.0: https://github.com/microsoft/Olive/tree/v0.8.0/examples/whisper.
It looks like the example was removed in this PR: https://github.com/microsoft/Olive/pull/1805/files.
After checking out the v0.8.0 tag, I am able to build the ONNX model and run the test:
$ git checkout v0.8.0
$ conda create -n olive-env python=3.11
$ conda activate olive-env
$ python -m pip install .
$ cd examples/whisper
$ python -m pip install -r requirements.txt
$ python -m pip install onnxruntime
$ python prepare_whisper_configs.py --model_name openai/whisper-tiny.en
$ olive run --config whisper_cpu_int8.json --setup
$ olive run --config whisper_cpu_int8.json 2> /dev/null
$ python test_transcription.py --config whisper_cpu_int8.json
The cud on his chest is still dripping blood. The ache of his overstrained eyes, even the soaring arena around him with thousands of spectators, retrivialities not worth thinking about.
For reference, these are the installed dependencies, based on the repo and the example's requirements:
$ pip list
Package Version
------------------------ -----------
alembic 1.15.2
annotated-types 0.7.0
certifi 2025.4.26
charset-normalizer 3.4.2
coloredlogs 15.0.1
colorlog 6.9.0
contourpy 1.3.2
cycler 0.12.1
Deprecated 1.2.18
filelock 3.18.0
flatbuffers 25.2.10
fonttools 4.57.0
fsspec 2025.3.2
greenlet 3.2.1
hf-xet 1.1.0
huggingface-hub 0.31.1
humanfriendly 10.0
idna 3.10
Jinja2 3.1.6
joblib 1.5.0
kiwisolver 1.4.8
lightning-utilities 0.14.3
Mako 1.3.10
MarkupSafe 3.0.2
matplotlib 3.10.1
ml_dtypes 0.5.1
mpmath 1.3.0
networkx 3.4.2
neural_compressor 3.3.1
numpy 1.26.4
nvidia-cublas-cu12 12.6.4.1
nvidia-cuda-cupti-cu12 12.6.80
nvidia-cuda-nvrtc-cu12 12.6.77
nvidia-cuda-runtime-cu12 12.6.77
nvidia-cudnn-cu12 9.5.1.17
nvidia-cufft-cu12 11.3.0.4
nvidia-cufile-cu12 1.11.1.6
nvidia-curand-cu12 10.3.7.77
nvidia-cusolver-cu12 11.7.1.2
nvidia-cusparse-cu12 12.5.4.2
nvidia-cusparselt-cu12 0.6.3
nvidia-nccl-cu12 2.26.2
nvidia-nvjitlink-cu12 12.6.85
nvidia-nvtx-cu12 12.6.77
olive-ai 0.8.0
onnx 1.17.0
onnxruntime 1.21.1
onnxruntime_extensions 0.14.0
onnxscript 0.2.5
opencv-python-headless 4.11.0.86
optuna 4.3.0
packaging 25.0
pandas 2.2.3
pillow 11.2.1
pip 25.1
prettytable 3.16.0
protobuf 3.20.3
psutil 7.0.0
py-cpuinfo 9.0.0
pycocotools 2.0.8
pydantic 2.11.4
pydantic_core 2.33.2
pyparsing 3.2.3
python-dateutil 2.9.0.post0
pytz 2025.2
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
safetensors 0.5.3
schema 0.7.7
scikit-learn 1.6.1
scipy 1.15.2
setuptools 78.1.1
six 1.17.0
SQLAlchemy 2.0.40
sympy 1.14.0
tabulate 0.9.0
threadpoolctl 3.6.0
tokenizers 0.19.1
torch 2.7.0
torchmetrics 1.7.1
tqdm 4.67.1
transformers 4.42.4
triton 3.3.0
typing_extensions 4.13.2
typing-inspection 0.4.0
tzdata 2025.2
urllib3 2.4.0
wcwidth 0.2.13
wheel 0.45.1
wrapt 1.17.2
Issue 2: onnxruntime-web is missing the BpeDecoder op when using the ONNX model
When attempting to use the output ONNX model produced from the Olive repo, onnxruntime-web is unable to run inference with the model due to a missing BpeDecoder op.
When recording audio, the following error is shown: Error: Can't create a session. ERROR_CODE: 1, ERROR_MESSAGE: Fatal error: ai.onnx.contrib:BpeDecoder(-1) is not a registered function/op
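For context, session creation is where this surfaces. Below is a minimal sketch of the kind of call the example makes; the model file name and path here are my assumptions, not the example's exact code:

```js
// Minimal sketch, assuming the exported model is served next to the page as
// 'whisper_cpu_int8.onnx' (hypothetical path). Session creation itself
// throws, because the graph uses the custom op ai.onnx.contrib:BpeDecoder
// from onnxruntime-extensions, which the stock onnxruntime-web build does
// not register.
import * as ort from 'onnxruntime-web';

async function createSession() {
  try {
    return await ort.InferenceSession.create('./whisper_cpu_int8.onnx', {
      executionProviders: ['wasm'],
    });
  } catch (e) {
    // Fails with: "Can't create a session. ... ai.onnx.contrib:BpeDecoder(-1)
    // is not a registered function/op"
    console.error(e);
    throw e;
  }
}
```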
Issue 3: The webpack CopyPlugin configuration is missing some runtime files
I found that the ort-wasm* files were missing from the build output, and I had to update the CopyPlugin configuration in webpack.config.js to get the model to load at all:
plugins: [
  new CopyPlugin({
    // Use the copy plugin to copy *.wasm and the ort-wasm* runtime files to the output folder.
    patterns: [
      { from: 'node_modules/onnxruntime-web/dist/*.wasm', to: '[name][ext]' },
      { from: 'node_modules/onnxruntime-web/dist/ort-wasm*', to: '[name][ext]' }
    ]
  })
],
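With the files copied, onnxruntime-web also needs to resolve the ort-wasm* artifacts at runtime. A minimal sketch using ort.env.wasm.wasmPaths; the './' prefix is my assumption about this example's output layout:

```js
import * as ort from 'onnxruntime-web';

// Tell the WASM backend where the ort-wasm* files copied by CopyPlugin live.
// './' assumes they sit beside the bundled script; adjust the prefix to match
// the app's actual publicPath. Set this before creating any InferenceSession.
ort.env.wasm.wasmPaths = './';
```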
In the end, the example repo looks like this for me: [screenshot of the resulting example state]
Open Questions
- Why was the Whisper example removed, and is there a planned alternative example?
- How can I configure onnxruntime-web to include the required BpeDecoder to run this model?