
coreml : revisit instructions and model generation scripts #2783

Closed
ggerganov opened this issue Feb 4, 2025 · 16 comments
Labels
build Build related issues documentation Improvements or additions to documentation roadmap Part of a roadmap project

Comments

@ggerganov
Member

The CoreML creation process has been flaky from the very beginning (#566), mainly because the Python packages were not compatible with each other. Things have likely changed since the initial support was added and hopefully are more stable now.

  • Update CoreML model conversion/creation instructions
  • Add CI workflows to exercise CoreML conversion (can utilize ggml-ci for Apple Silicon nodes)
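For the second item, a CI job along these lines could work. This is only a sketch; the script names (`download-ggml-model.sh`, `generate-coreml-model.sh`) and package list are assumed from the repo's `models/` directory and may need adjusting:

```shell
# Sketch of a Core ML conversion CI step (script names assumed from models/).
# Pinning coremltools-compatible package versions helps avoid the
# incompatibility flakiness described above.
pip install -U coremltools openai-whisper ane_transformers

./models/download-ggml-model.sh base.en       # fetch the ggml model
./models/generate-coreml-model.sh base.en     # convert the encoder to Core ML

# Pack the resulting .mlmodelc so it can be uploaded as a CI artifact
zip -r ggml-base.en-encoder.mlmodelc.zip models/ggml-base.en-encoder.mlmodelc
```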
@ggerganov ggerganov added build Build related issues documentation Improvements or additions to documentation roadmap Part of a roadmap project labels Feb 4, 2025
@furqan4545

How do I integrate a .mlmodelc model into my SwiftUI app? I am stuck here. I want to use the Core ML support; is there any example that I can follow?

@danbev
Collaborator

danbev commented Mar 17, 2025

The swiftui example currently does not have support for Core ML, but the objective-c example does, and it contains a link to instructions for converting models. Perhaps these could be helpful as a reference for implementing it in Swift.

@furqan4545

The swiftui example currently does not have support for Core ML, but the objective-c example does, and it contains a link to instructions for converting models. Perhaps these could be helpful as a reference for implementing it in Swift.

I am a complete beginner in Obj-C... Would it be possible for you guys to make a SwiftUI example? I think a lot of people would appreciate that, because as I saw in many other repos, people were requesting the same thing.

@danbev
Collaborator

danbev commented Mar 19, 2025

Would it be possible for you guys to make a SwiftUI example?

Perhaps we can update the existing whisper.swiftui example similar to whisper.objc. I've created #2907 to track this.

@ggerganov
Member Author

Would it be possible for you guys to make a SwiftUI example?

Perhaps we can update the existing whisper.swiftui example similar to whisper.objc. I've created #2907 to track this.

If I am not mistaken, it might be as simple as adding WHISPER_COREML=1 somewhere in the build settings and providing a CoreML model to the project.

@danbev
Collaborator

danbev commented Mar 20, 2025

If I am not mistaken, it might be as simple as adding WHISPER_COREML=1 somewhere in the build settings and providing a CoreML model to the project.

Nice, I'll give this a try. Thanks

@danbev
Collaborator

danbev commented Mar 20, 2025

Hmm, this was not as simple because of the introduction of the xcframework. We are no longer compiling in Xcode but instead using the compiled library from the xcframework, and Core ML support requires the above-mentioned macro to be set when compiling. My initial thought was that we could provide Core ML builds for macOS and iOS and include them in the xcframework, but that does not work because the selection is based on platform and architecture, so trying to create the xcframework results in:

A library with the identifier 'macos-arm64_x86_64' already exists.

I'll need to investigate this a little more to understand what our options are.

@ggerganov
Member Author

I think we can enable both WHISPER_COREML=ON and WHISPER_COREML_ALLOW_FALLBACK=ON in the CMake builds and provide only this option in the XCFramework. This way, if at runtime it fails to find a Core ML model, it will fall back to not using it.
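As a sketch, the CMake configuration might look like this. The flag names are taken from the comment above; the rest of the invocation is an assumption about a typical CMake build of the project:

```shell
# Configure with Core ML enabled plus runtime fallback: if no .mlmodelc is
# found next to the ggml model at runtime, whisper falls back to the ggml
# encoder instead of failing.
cmake -B build \
      -DWHISPER_COREML=ON \
      -DWHISPER_COREML_ALLOW_FALLBACK=ON
cmake --build build -j
```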

@danbev
Collaborator

danbev commented Mar 20, 2025

I think we can enable both WHISPER_COREML=ON and WHISPER_COREML_ALLOW_FALLBACK=ON in the CMake builds and provide only this option in the XCFramework. This way, if at runtime it fails to find a Core ML model, it will fall back to not using it.

That would make things much easier. Great, I'll try this out.

@danbev
Collaborator

danbev commented Mar 21, 2025

@furqan4545 We've added an example of using Core ML to the whisper.swiftui example now.

@furqan4545

@danbev and @ggerganov YOU GUYS ARE HEROES. I am extremely grateful.
Quick question: is it the same branch or a different one? I can't see any latest commit. Or is it that I have to change nothing in the SwiftUI code, only the model?

@danbev
Collaborator

danbev commented Mar 21, 2025

is it the same branch or a different one? I can't see any latest commit. Or is it that I have to change nothing in the SwiftUI code, only the model?

You should be able to use the master branch now. You are correct, there have been no changes to the SwiftUI code; the changes have been to the xcframework, which now builds with Core ML support. This enables a Core ML model to be used by the example.

Just make sure that you build/rebuild the xcframework using build-xcframework.sh, which was updated the other day to support Core ML. After that you'll need to generate the Core ML model following the instructions in the readme, then add the model to whisper.swiftui.demo/Resources/Models, and finally start the demo. You should see in the log that the Core ML model is being used.
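The steps above might look like this in practice. This is only a sketch; the `generate-coreml-model.sh` script name, the `base.en` model choice, and the exact paths are assumptions about the repository layout:

```shell
# 1. Rebuild the xcframework with Core ML support
./build-xcframework.sh

# 2. Generate the Core ML encoder model (see the readme for prerequisites)
./models/generate-coreml-model.sh base.en

# 3. Add the generated model to the SwiftUI demo's resources, then run the demo
cp -R models/ggml-base.en-encoder.mlmodelc \
      examples/whisper.swiftui/whisper.swiftui.demo/Resources/Models/
```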

@furqan4545

@danbev Thank you so much Dan again for the explanation. I will give it a try tomorrow.

@danbev
Collaborator

danbev commented Apr 1, 2025

One thing I noticed while going through the Core ML support is that it is currently required that the original ggml model is also available on the file system. The ggml model is loaded first, before the Core ML encoder model. I initially thought that this might be an oversight and that using a Core ML model would only require the converted model, but I believe the original model is still required for the decoder. Perhaps this could be clarified in the docs.

@ggerganov
Member Author

the original model is still required for the decoder. Perhaps this could be clarified in the docs.

Yes, this is correct. We should update the docs to highlight this.
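For the docs, the required on-disk layout could be sketched roughly like this. The file names follow the repo's usual naming convention and the `whisper-cli` binary name is an assumption:

```shell
# Both files must be present: the ggml model is loaded first and still
# provides the decoder, while the .mlmodelc replaces only the encoder.
#
#   models/ggml-base.en.bin               <- required (decoder + fallback encoder)
#   models/ggml-base.en-encoder.mlmodelc  <- Core ML encoder, picked up automatically
#
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
```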

danbev added a commit to danbev/whisper.cpp that referenced this issue Apr 1, 2025
This commit disables the use of PyTorch's
`scaled_dot_product_attention` in the Whisper model to avoid
compatibility issues during CoreML conversion.
The issue occurs because coremltools requires PyTorch 2.5.0, but the
Whisper implementation may expect behavior from newer PyTorch versions.

By setting `MultiHeadAttention.use_sdpa = False`, we force Whisper to
use its fallback manual attention implementation, which works correctly
with PyTorch 2.5.0 during the tracing process.

Refs: ggml-org#2783
danbev added a commit to danbev/whisper.cpp that referenced this issue Apr 1, 2025
This commit adds a new job to the CI pipeline that downloads the base.en
model and converts it to CoreML format. The CoreML model is then packed
into a zip file and uploaded as an artifact.

Refs: ggml-org#2783
danbev added a commit to danbev/whisper.cpp that referenced this issue Apr 1, 2025
This commit adds a new job to the CI pipeline that downloads the base.en
model and converts it to CoreML format. The CoreML model is then packed
into a zip file and uploaded as an artifact.

This will only be done for pushes to master, releases, or pre-releases.

Refs: ggml-org#2783
danbev added a commit that referenced this issue Apr 1, 2025
* ci : add coreml job that converts base.en to coreml [no ci]

This commit adds a new job to the CI pipeline that downloads the base.en
model and converts it to CoreML format. The CoreML model is then packed
into a zip file and uploaded as an artifact.

This will only be done for pushes to master, releases, or pre-releases.

Refs: #2783

* coreml : remove publishing of coreml model

* ci : add GGML_OPENMP=OFF to ubuntu-22-gcc-sanitized
danbev added a commit that referenced this issue Apr 1, 2025
…2979)

* coreml: fix Whisper to CoreML conversion by disabling SDPA

This commit disables the use of PyTorch's
`scaled_dot_product_attention` in the Whisper model to avoid
compatibility issues during CoreML conversion.
The issue occurs because coremltools requires PyTorch 2.5.0, but the
Whisper implementation may expect behavior from newer PyTorch versions.

By setting `MultiHeadAttention.use_sdpa = False`, we force Whisper to
use its fallback manual attention implementation, which works correctly
with PyTorch 2.5.0 during the tracing process.

Refs: #2783

* coreml: fix audio shape in whisper decoder conversion

This commit fixes the audio shape in the whisper decoder conversion
script.

The motivation for this is that the audio shape was incorrect and
was causing the conversion to fail.

* coreml : set -e in generate-coreml-interface.sh

The commit sets the -e flag in the generate-coreml-interface.sh script
to make sure the script fails if any command fails.

* coreml : update generated encoder/decoder interfaces

This commit updates the generated encoder/decoder interfaces for the
whisper model which is the result of running the
generate-coreml-interface.sh script.
danbev added a commit to danbev/whisper.cpp that referenced this issue Apr 2, 2025
This commit clarifies the usage of the Core ML encoder model in the
whisper.obj and whisper.swiftui examples.

Refs: ggml-org#2783
danbev added a commit that referenced this issue Apr 2, 2025
This commit clarifies the usage of the Core ML encoder model in the
whisper.obj and whisper.swiftui examples.

Refs: #2783
@danbev
Collaborator

danbev commented Apr 2, 2025

I think this could be considered completed by the following commits:

  • Update CoreML model conversion/creation instructions : 11688b2
  • Add CI workflows to exercise CoreML conversion : 04b9508

@ggerganov ggerganov moved this from Todo to Done in whisper.cpp : roadmap Apr 2, 2025
@ggerganov ggerganov closed this as completed by moving to Done in whisper.cpp : roadmap Apr 2, 2025
3 participants