
Initial WebNN Support #890

Merged · 2 commits into huggingface:v3 · Aug 17, 2024
Conversation

ibelem
Contributor

@ibelem ibelem commented Aug 14, 2024

This is the initial PR to support the WebNN API.

User Code

    import { pipeline } from '@huggingface/transformers';

    let dataType = 'fp16';
    let provider = 'webnn';
    let deviceType = 'gpu';

    let options = {
      dtype: dataType,
      device: provider,
      session_options: {
        executionProviders: [
          {
            name: provider,
            deviceType: deviceType
          },
        ],
        freeDimensionOverrides: {
          batch_size: 1,
          num_channels: 3,
          height: 224,
          width: 224
        }
      },
    };

    // Create an image classification pipeline
    const classifier = await pipeline('image-classification', 'xenova/resnet-50', options);
    const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
    const output = await classifier(url);
    // [
    //   { "label": "tiger, Panthera tigris", "score": 0.9231255054473877 },
    //   { "label": "tiger cat", "score": 0.07358699291944504 },
    //   { "label": "jaguar, panther, Panthera onca, Felis onca", "score": 0.00047683052252978086 },
    //   { "label": "leopard, Panthera pardus", "score": 0.00017269655654672533 },
    //   { "label": "lynx, catamount", "score": 0.00015724201512057334 }
    // ]

WebNN Installation Guide

WebNN is being rapidly implemented in browsers. Until it formally ships, running WebNN requires a compatible browser and Windows 11 v21H2 (DML 1.6.0) or higher.

  • Download the latest Google Chrome Canary or Microsoft Edge Canary browser
  • To enable WebNN, in your browser address bar, enter about://flags, and then press Enter. An Experiments page opens
  • In the Search flags box, enter webnn. The Enables WebNN API flag appears
  • In the drop-down menu, select Enabled
  • Relaunch your browser
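
Once the flag is enabled, WebNN is exposed on `navigator.ml`. A minimal feature-detection sketch (assuming a browser context; `webnnAvailable` is an illustrative helper, not part of any library):

```javascript
// Minimal WebNN feature detection (sketch; run in a browser page).
// The `nav` parameter is injectable only so the helper is testable;
// in a page you would simply call webnnAvailable('gpu').
async function webnnAvailable(deviceType = 'gpu', nav = globalThis.navigator) {
  if (!nav || !('ml' in nav)) return false; // flag not enabled or unsupported
  try {
    await nav.ml.createContext({ deviceType }); // throws if the device type is unavailable
    return true;
  } catch {
    return false;
  }
}
```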

Known Issue

From the Transformers.js perspective, it would be better to improve the current float16 support check (currently done via WebGPU and the WebGPU shader-f16 feature), since fp16 support is not limited to GPU devices.

@Honry @huningxin @xenova PTAL.

@xenova
Collaborator

xenova commented Aug 15, 2024

Thanks for the PR! Since our goal is to align with the Python library as closely as possible, we may need to separate out device and provider into separate parameters (don't worry, we'll make sure the correct format is passed to the ORT session).

Here are some proposals for the API:

// Proposal 1: separate `device` and `provider` options
const classifier = await pipeline('image-classification', 'xenova/resnet-50', {
  dtype: 'fp32',
  device: 'gpu', // or 'npu'
  provider: 'webnn',
});

// Proposal 2: a combined device string
const classifier = await pipeline('image-classification', 'xenova/resnet-50', {
  dtype: 'fp32',
  device: 'webnn-gpu', // e.g., or 'webnn-npu'
});
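
For the second proposal, the combined string would need to be split back apart before being handed to ORT Web. A hypothetical sketch of that mapping (the function name, pass-through behavior, and the 'cpu' default are illustrative, not the actual transformers.js internals):

```javascript
// Hypothetical mapping from a combined device string to an ORT Web
// execution-provider entry: 'webnn-gpu' -> { name: 'webnn', deviceType: 'gpu' }.
// Non-WebNN devices such as 'webgpu' are passed through unchanged.
function deviceToExecutionProvider(device) {
  if (device === 'webnn' || device.startsWith('webnn-')) {
    const deviceType = device.split('-')[1] ?? 'cpu'; // assumed default
    return { name: 'webnn', deviceType };
  }
  return device;
}
```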

What do you think?

@ibelem
Contributor Author

ibelem commented Aug 15, 2024

Thank you @xenova for starting the review. Yes, the current options object is obviously redundant; I just wanted to get your feedback. :)

// Create a feature-extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
  dtype: 'fp32',
  device: 'webgpu', // <- Run on WebGPU
});

To simplify the API and maintain consistency with other devices, I think the second proposal is superior as it doesn't require an additional provider key.

BTW, WebNN needs freeDimensionOverrides for inputs with symbolic dimensions; please also make sure it can be passed to ORT Web easily.

e.g. input_ids int64 [batch_size, decoder_sequence_length] vs input_ids int64 [1, 100]
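
Pinning those symbolic dimensions in the session options might look like this (a sketch; the override values 1 and 100 are illustrative):

```javascript
// Sketch: override keys must match the symbolic dimension names in the
// ONNX model, e.g. input_ids int64 [batch_size, decoder_sequence_length]
// becomes [1, 100] at session-creation time.
const sessionOptions = {
  executionProviders: [{ name: 'webnn', deviceType: 'gpu' }],
  freeDimensionOverrides: {
    batch_size: 1,
    decoder_sequence_length: 100,
  },
};
```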

@xenova
Collaborator

xenova commented Aug 15, 2024

BTW, WebNN needs freeDimensionOverrides for inputs with symbolic dimensions, please also make sure it can be passed to ORT web easily.

Yes, this should be handled by specifying these values in the model's config.json under something like {"transformers.js_config": { "session_options": ... }}.
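
Such a config.json entry might look like the following (a sketch based on the shape suggested above; the exact keys accepted by transformers.js may differ):

```json
{
  "transformers.js_config": {
    "session_options": {
      "freeDimensionOverrides": {
        "batch_size": 1,
        "num_channels": 3,
        "height": 224,
        "width": 224
      }
    }
  }
}
```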

@xenova xenova merged commit 5799e30 into huggingface:v3 Aug 17, 2024
@kungfooman
Contributor

I couldn't test this yet, just wondering: if we pick WebNN as the provider, do we have a different set of supported ops again, based on the browser's WebNN implementation?

I see three scenarios:

  1. A model that used to work suddenly doesn't work anymore (an op is not supported in the browser's WebNN implementation)
  2. A model that didn't work suddenly works (the browser supports more ops than e.g. the ONNX Runtime WebGPU EP)
  3. Completely the same

@ibelem
Contributor Author

ibelem commented Aug 19, 2024

Thanks @xenova for the device detection and selection improvements between ff1853c Initial WebNN Support and 5b2cac2 Improve WebNN selection. WebNN can now be run successfully using the code below:

    import { pipeline } from './dists/upstream_v3_0819/transformers.js';
    let options = {
      dtype: 'fp16',
      device: 'webnn-cpu', // 'webnn-gpu' and upcoming 'webnn-npu'
      session_options: {
        freeDimensionOverrides: {
          batch_size: 1,
          num_channels: 3,
          height: 224,
          width: 224
        }
      },
    };

    // Create an image classification pipeline
    const classifier = await pipeline('image-classification', 'xenova/resnet-50', options);
    const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
    const output = await classifier(url);
    console.log(output);

@kungfooman Thanks for the great question! WebNN is still adding support for new ops and new data types. We keep updating the Implementation Status of WebNN Operations based on the WebNN spec and the browser implementations.

Our team also implements and maintains the WebNN EP of ONNX Runtime Web. If an op is not supported by the browser's WebNN implementation, a fallback mechanism routes those unsupported ops to the CPU EP (Wasm) of ONNX Runtime Web. WebNN is also adding WebGPU interop capabilities, so it can interact with WebGPU to support custom ops or cases that WebNN may not cover. Generally, the more WebNN ops that are supported, the fewer subgraphs the model is split into; the fewer partitions, the better the performance with WebNN.
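
To make the partitioning point concrete, here is an illustrative helper (not an ORT Web API) that counts how many contiguous WebNN partitions result from a per-node EP assignment:

```javascript
// Illustrative only: given the EP assigned to each graph node in
// topological order, count the contiguous runs handled by WebNN.
// Each extra partition adds a WebNN <-> Wasm boundary crossing, which
// is why broader WebNN op coverage tends to improve performance.
function countWebnnPartitions(assignments) {
  let partitions = 0;
  let inWebnn = false;
  for (const ep of assignments) {
    const isWebnn = ep === 'webnn';
    if (isWebnn && !inWebnn) partitions += 1;
    inWebnn = isWebnn;
  }
  return partitions;
}
```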

WebNN will enable users to leverage multiple AI accelerators, including CPUs, GPUs, and NPUs, opening up new possibilities such as power-efficient AI inferencing on NPUs.
