Add fp16 precision support #222

Open
philloooo opened this issue Apr 17, 2024 · 7 comments

Comments

@philloooo

Hi,
Can we add fp16 versions of the models (with all inputs and weights in fp16)?
On Mac, CoreML only uses the NPU when the data is fp16. It would be valuable to have WebNN samples with fp16 precision so we can compare performance.
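
For context, feeding an fp16 model means the input buffer has to hold 16-bit half-precision values instead of float32. A minimal sketch of that conversion, assuming inputs start as a `Float32Array` and the fp16 bits are carried in a `Uint16Array`; the helper names `float32ToFloat16Bits` and `toFloat16Array` are hypothetical and are not taken from webnn-samples:

```typescript
// Hypothetical helpers (not from webnn-samples): convert float32 values to
// IEEE 754 half-precision bit patterns with round-to-nearest-even.
const f32Scratch = new Float32Array(1);
const u32Scratch = new Uint32Array(f32Scratch.buffer);

function float32ToFloat16Bits(value: number): number {
  f32Scratch[0] = value;
  const x = u32Scratch[0];
  const sign = (x >>> 16) & 0x8000;
  const exp = (x >>> 23) & 0xff;
  const mant = x & 0x7fffff;

  // Infinity or NaN: keep the class, set a mantissa bit for NaN.
  if (exp === 0xff) return sign | 0x7c00 | (mant !== 0 ? 0x200 : 0);

  const e = exp - 127 + 15; // re-bias the exponent from float32 to float16
  if (e >= 0x1f) return sign | 0x7c00; // too large: overflow to infinity

  if (e <= 0) {
    if (e < -10) return sign; // too small: underflow to signed zero
    // Subnormal half: shift the 24-bit significand (implicit 1 restored).
    const m = mant | 0x800000;
    const shift = 14 - e;
    let sub = m >>> shift;
    const rem = m & ((1 << shift) - 1);
    const halfway = 1 << (shift - 1);
    if (rem > halfway || (rem === halfway && (sub & 1) === 1)) sub++;
    return sign | sub;
  }

  // Normal half: drop 13 mantissa bits, rounding to nearest even.
  let half = (e << 10) | (mant >>> 13);
  const rem = mant & 0x1fff;
  if (rem > 0x1000 || (rem === 0x1000 && (half & 1) === 1)) half++;
  return sign | half;
}

function toFloat16Array(input: Float32Array): Uint16Array {
  const out = new Uint16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    out[i] = float32ToFloat16Bits(input[i]);
  }
  return out;
}
```

The `Uint16Array` only carries raw bits; how a given sample binds it to an fp16 tensor depends on the sample's own loading code.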

@Honry
Collaborator

Honry commented Apr 23, 2024

We plan to add a few fp16 models specifically for image classification.

@huningxin
Contributor

This would also be used for NPU testing (#220).

@Honry
Collaborator

Honry commented May 7, 2024

@philloooo, @mingmingtasd has added 3 fp16 models for image classification: #226

@philloooo
Author

philloooo commented May 13, 2024

Thanks! I've tested it on Mac. Some models are blocked by webmachinelearning/webnn#678, but image classification with ResNet 50 V2 works, and I can confirm it's using the Apple Neural Engine.

I noticed that if I set the device to GPU, a different model is used even when the precision is the same. Can we use the same models for GPU and NPU? That would make for a better comparison.

@Honry

@mingmingtasd
Contributor

We should allow softmax to be included for both GPU and NPU: https://github.com/webmachinelearning/webnn-samples/pull/226/files#diff-13f8eda69b8dc85839ae4e046882bfbbddd80a5928e8bd3ea6ff7b71e4213f68R124
The TODO can be done.
@Honry @philloooo

@Honry
Collaborator

Honry commented May 16, 2024

@philloooo, #237 fixes the inconsistent softmax support.

Now all the fp16 models (MobileNetV2, ResNet 50 V1, EfficientNet) share the same models between GPU and NPU.

@huningxin
Contributor

Can we close this issue?
