
Initial WebNN Support #890

Merged · 2 commits into huggingface:v3 · Aug 17, 2024
Conversation

ibelem
Contributor

@ibelem ibelem commented Aug 14, 2024

This is the initial PR to support the WebNN API.

User Code

    import { pipeline } from '@huggingface/transformers';

    let dataType = 'fp16';
    let provider = 'webnn';
    let deviceType = 'gpu';

    let options = {
      dtype: dataType,
      device: provider,
      session_options: {
        executionProviders: [
          {
            name: provider,
            deviceType: deviceType
          },
        ],
        freeDimensionOverrides: {
          batch_size: 1,
          num_channels: 3,
          height: 224,
          width: 224
        }
      },
    };

    // Create an image classification pipeline
    const classifier = await pipeline('image-classification', 'xenova/resnet-50', options);
    const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
    const output = await classifier(url);
    // [
    //   { "label": "tiger, Panthera tigris", "score": 0.9231255054473877 },
    //   { "label": "tiger cat", "score": 0.07358699291944504 },
    //   { "label": "jaguar, panther, Panthera onca, Felis onca", "score": 0.00047683052252978086 },
    //   { "label": "leopard, Panthera pardus", "score": 0.00017269655654672533 },
    //   { "label": "lynx, catamount", "score": 0.00015724201512057334 }
    // ]

WebNN Installation Guide

WebNN is being rapidly implemented in browsers. Until it formally ships, running WebNN requires a compatible browser and Windows 11 v21H2 (DML 1.6.0) or higher.

  • Download the latest Google Chrome Canary or Microsoft Edge Canary browser
  • To enable WebNN, in your browser address bar, enter about://flags, and then press Enter. An Experiments page opens
  • In the Search flags box, enter webnn. The Enables WebNN API flag appears
  • In the drop-down menu, select Enabled
  • Relaunch your browser
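
Once the flag is enabled, WebNN is exposed on `navigator.ml`. A minimal feature-detection sketch (assuming a browser context; `webnnAvailable` is an illustrative helper, not part of any library):

```javascript
// Minimal WebNN feature detection (sketch; run in a browser page).
// The `nav` parameter is injectable only so the helper is testable;
// in a page you would simply call webnnAvailable('gpu').
async function webnnAvailable(deviceType = 'gpu', nav = globalThis.navigator) {
  if (!nav || !('ml' in nav)) return false; // flag not enabled or unsupported
  try {
    await nav.ml.createContext({ deviceType }); // throws if the device type is unavailable
    return true;
  } catch {
    return false;
  }
}
```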

Known Issue

From the Transformers.js perspective, it would be better to improve the current float16 support check (currently done via WebGPU and the WebGPU shader-f16 feature), since fp16 support is not limited to GPU devices.

@Honry @huningxin @xenova PTAL.

@xenova
Collaborator

xenova commented Aug 15, 2024

Thanks for the PR! Since our goal is to align with the Python library as closely as possible, we may need to separate out device and provider into separate parameters (don't worry, we'll make sure the correct format is passed to the ORT session).

Here are some proposals for the API:

// Proposal 1: separate `device` and `provider` options
const classifier = await pipeline('image-classification', 'xenova/resnet-50', {
  dtype: 'fp32',
  device: 'gpu', // or 'npu'
  provider: 'webnn',
});

// Proposal 2: a combined device string
const classifier = await pipeline('image-classification', 'xenova/resnet-50', {
  dtype: 'fp32',
  device: 'webnn-gpu', // e.g., or 'webnn-npu'
});
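
For the second proposal, the combined string would need to be split back apart before being handed to ORT Web. A hypothetical sketch of that mapping (the function name, pass-through behavior, and the 'cpu' default are illustrative, not the actual transformers.js internals):

```javascript
// Hypothetical mapping from a combined device string to an ORT Web
// execution-provider entry: 'webnn-gpu' -> { name: 'webnn', deviceType: 'gpu' }.
// Non-WebNN devices such as 'webgpu' are passed through unchanged.
function deviceToExecutionProvider(device) {
  if (device === 'webnn' || device.startsWith('webnn-')) {
    const deviceType = device.split('-')[1] ?? 'cpu'; // assumed default
    return { name: 'webnn', deviceType };
  }
  return device;
}
```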

What do you think?

@ibelem
Contributor Author

ibelem commented Aug 15, 2024

Thank you @xenova for starting the review. Yes, the current options object is obviously redundant; I just wanted to get your feedback. :)

// Create a feature-extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
  dtype: 'fp32',
  device: 'webgpu', // <- Run on WebGPU
});

To simplify the API and maintain consistency with other devices, I think the second proposal is superior as it doesn't require an additional provider key.

BTW, WebNN needs freeDimensionOverrides for inputs with symbolic dimensions; please also make sure it can be passed to ORT Web easily.

e.g. input_ids int64 [batch_size, decoder_sequence_length] vs input_ids int64 [1, 100]
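
Pinning those symbolic dimensions in the session options might look like this (a sketch; the override values 1 and 100 are illustrative):

```javascript
// Sketch: override keys must match the symbolic dimension names in the
// ONNX model, e.g. input_ids int64 [batch_size, decoder_sequence_length]
// becomes [1, 100] at session-creation time.
const sessionOptions = {
  executionProviders: [{ name: 'webnn', deviceType: 'gpu' }],
  freeDimensionOverrides: {
    batch_size: 1,
    decoder_sequence_length: 100,
  },
};
```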

@xenova
Collaborator

xenova commented Aug 15, 2024

BTW, WebNN needs freeDimensionOverrides for inputs with symbolic dimensions, please also make sure it can be passed to ORT web easily.

Yes, this should be handled by specifying these values in the model's config.json under something like {"transformers.js_config": { "session_options": ... }}.
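
Such a config.json entry might look like the following (a sketch based on the shape suggested above; the exact keys accepted by transformers.js may differ):

```json
{
  "transformers.js_config": {
    "session_options": {
      "freeDimensionOverrides": {
        "batch_size": 1,
        "num_channels": 3,
        "height": 224,
        "width": 224
      }
    }
  }
}
```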

@xenova xenova merged commit 5799e30 into huggingface:v3 Aug 17, 2024
@kungfooman
Contributor

I couldn't test this yet, just wondering: if we pick WebNN as the provider, do we have a different set of supported ops again, based on the browser's WebNN implementation?

I see three scenarios:

  1. A model that used to work suddenly doesn't work anymore (an op is not supported in the browser's WebNN implementation)
  2. A model that didn't work suddenly works (the browser supports more ops than e.g. the ONNX Runtime WebGPU EP)
  3. Completely the same

@ibelem
Contributor Author

ibelem commented Aug 19, 2024

Thanks @xenova for the device detection and selection improvements between ff1853c Initial WebNN Support and 5b2cac2 Improve WebNN selection. WebNN can now be run successfully using the code below:

    import { pipeline } from './dists/upstream_v3_0819/transformers.js';
    let options = {
      dtype: 'fp16',
      device: 'webnn-cpu', // 'webnn-gpu' and upcoming 'webnn-npu'
      session_options: {
        freeDimensionOverrides: {
          batch_size: 1,
          num_channels: 3,
          height: 224,
          width: 224
        }
      },
    };

    // Create an image classification pipeline
    const classifier = await pipeline('image-classification', 'xenova/resnet-50', options);
    const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
    const output = await classifier(url);
    console.log(output);

@kungfooman Thanks for the great question! WebNN is still adding support for new ops and new data types. We keep updating the Implementation Status of WebNN Operations based on the WebNN spec and the browser implementations.

Our team also implements and maintains the WebNN EP of ONNX Runtime Web. If an op is not supported by the browser's WebNN implementation, a fallback mechanism routes those unsupported ops to the CPU EP (Wasm) of ONNX Runtime Web. WebNN is also adding WebGPU interop capabilities, so it can interact with WebGPU to support custom ops or cases that WebNN may not cover. Generally, the more WebNN ops that are supported, the fewer subgraphs the model is split into; the fewer partitions, the better the performance with WebNN.
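
To make the partitioning point concrete, here is an illustrative helper (not an ORT Web API) that counts how many contiguous WebNN partitions result from a per-node EP assignment:

```javascript
// Illustrative only: given the EP assigned to each graph node in
// topological order, count the contiguous runs handled by WebNN.
// Each extra partition adds a WebNN <-> Wasm boundary crossing, which
// is why broader WebNN op coverage tends to improve performance.
function countWebnnPartitions(assignments) {
  let partitions = 0;
  let inWebnn = false;
  for (const ep of assignments) {
    const isWebnn = ep === 'webnn';
    if (isWebnn && !inWebnn) partitions += 1;
    inWebnn = isWebnn;
  }
  return partitions;
}
```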

WebNN will enable users to leverage multiple AI accelerators, including CPUs, GPUs, and NPUs, opening up new possibilities such as power-efficient AI inferencing on NPUs.
