Add use cases #2

tomoyukilabs · 2018-11-08T07:53:00Z

This PR adds use cases to the WebNN draft. The use cases I've added are copied from webmachinelearning/meetings#1 (comment) as a starting point. We will capture suggestions and contributions from F2F discussion during review of this PR before merging it.


anssiko · 2018-11-08T08:21:38Z

@tomoyukilabs, thank you!

All - please review the use cases, and note that this PR is just a starting point and will change based on your review comments. Please suggest new use cases, clarify or reword existing ones, propose de-scoping some, etc.

As a guideline for reviewers, use cases are generally more impactful if implementation feasibility can be demonstrated via a proof-of-concept, mapping to platform APIs as a "reality check", or similar.

Our charter sets the following expectation, against which we should check the use cases:

The APIs in scope of this group will not be tied to any particular platform and will be implementable on top of existing major platform APIs, such as Android Neural Networks API, Windows DirectML, and macOS/iOS Metal Performance Shaders and Basic Neural Network Subroutines.

Reviews from folks with close familiarity with one or more of these platform APIs are very welcome.

We also have @huningxin's API native mapping table at our disposal.

anssiko

@tomoyukilabs, I submitted some review comments for the group to consider. Feedback welcome!

Overall this is a great start, and I think the division into application-level and low-level use cases seems reasonable. I'd like to hear more feedback from the group on e.g. whether we are missing some major use cases, or whether some of these use cases would be impractical to implement across the major platforms we're committed to support in the charter.

My general suggestion was to consider abstracting out API specifics from the application-level use cases, and to clarify terminology around "WebML API" vs. "WebNN API" in the low-level use cases.

I think @huningxin is working on an explainer document based on the material shared at F2F that will clarify the positioning of WebNN API (in scope of the CG) and the envisioned "WebML API" (currently out of scope of the CG) among other things, so we can have more concrete discussion around that topic. We might want to consider porting over some of the explainer content into the spec in the future, but I wouldn't block this PR on that.

anssiko · 2018-11-13T07:20:43Z

index.bs


+This section illustrates application-level use cases for the Web Machine
+Learning API (WebML API). All applications in those use cases can be built on


We could abstract out the API specifics from the use cases section and here say, for example:

This section illustrates application-level use cases for neural network inference hardware acceleration.
All applications in those use cases can be built on top of pre-trained deep neural network (DNN) models.

Alternatively, we could replace all occurrences of "WebML API" with "WebNN API", but abstracting out the API entirely in use cases discussion seems preferable to me.

Cc @huningxin for comments.

The former example looks good to me. Thanks!

anssiko · 2018-11-13T07:27:59Z

index.bs

+### Person Detection ### {#usecase-person-detection}
+
+A user is browsing a social media site and wishes to take a photo and upload it
+to the site. Before the photo is uploaded, the site runs [[SSD]] or [[YOLO]] on


To abstract out the API specifics, I'd suggest something like:

Before the photo is uploaded, the site does object detection (for example, using object detection approaches such as [[SSD]] or [[YOLO]] that use a single deep neural network) to detect regions that include persons so that the user can filter and de-personalize irrelevant persons on it.

anssiko · 2018-11-13T07:31:13Z

index.bs

+
+A user opens a web application that continuously captures her body with her
+smartphone's camera. The web application extracts her skeleton by running
+[[PoseNet]] on the WebML API to recognize her gesture or body language. When she


The web application extracts her skeleton by running a machine learning model which allows for real-time human pose estimation such as [[PoseNet]] to recognize her gesture and body language.

anssiko · 2018-11-13T07:31:29Z

index.bs

+the WebML API to detect regions that include persons so that the user can filter
+and de-personalize irrelevant persons on it.
+
+### Skeleton Detecton ### {#usecase-skeleton-detection}


s/Detecton/Detection/

anssiko · 2018-11-13T07:32:17Z

index.bs

+A user wishes to make her new account and looks for a new icon image. When she
+clicks a "Generate" button on the webpage for creating an account, the webpage
+runs a generator model of generative adversarial network (GAN) for icon
+synthesis [[LogoSynthesis]] on the WebML API. She can repeat random icon


I'd propose we remove " on the WebML API".

anssiko · 2018-11-13T07:45:34Z

index.bs

+
+## Low-Level Use Cases ## {#usecases-lowlevel}
+
+This section collects API-level use cases for the WebML API. It is supposed that


I think in this "Low-Level Use Cases" section it'd be appropriate to talk about the API, since one expected API consumer is a framework or library and the use cases may need to suggest a certain API shape and feature to be meaningful.

Here's a proposed rewording:

This section collects API-level use cases for a dedicated low-level API for neural network inference hardware acceleration. It is expected that Machine Learning frameworks will be key consumers of the Web Neural Network API (WebNN API) and the low-level details exposed through the WebNN API are abstracted out from typical web developers. However, it is also expected that web developers with specific interest and competence in Machine Learning will want to interface with the WebNN API directly instead of a higher-level ML framework.

I agree with your opinion, and the proposed explanation looks good to me. Thanks!

anssiko · 2018-11-13T07:47:54Z

index.bs

+
+### Custom Layer ### {#usecase-custom-layer}
+
+A web application developer wants to run a DNN model on the WebML. However, she


on the WebNN API.

anssiko · 2018-11-13T07:48:19Z

index.bs

+
+A web application developer wants to run a DNN model on the WebML. However, she
+has found that some of activation functions like [[LeakyReLU]], [[ELU]], etc. are
+not included in the WebML API. So she constructs custom layers of the additional


anssiko · 2018-11-13T07:48:27Z

index.bs

+A web application developer wants to run a DNN model on the WebML. However, she
+has found that some of activation functions like [[LeakyReLU]], [[ELU]], etc. are
+not included in the WebML API. So she constructs custom layers of the additional
+activation functions on top of the WebML API. Note that the scope of custom


anssiko · 2018-11-13T07:49:06Z

index.bs

+
+A web application developer has a concern about performance of her DNN model on
+mobile devices. She has confirmed that the model runs too slow on mobile devices
+which does not have GPU acceleration. So her web application refers to the WebML


tomoyukilabs · 2018-11-14T02:39:32Z

@anssiko Many thanks for your review. I have followed your comments and updated this draft. PTAL.

tomoyukilabs · 2018-11-14T02:44:35Z

Currently, all application-level use cases in this PR are about CNN-based image processing. I guess we can find other kinds of use cases too, e.g. audio, text, sensor data, etc.

anssiko · 2018-11-14T11:20:59Z

@tomoyukilabs, thanks for incorporating the suggestions, LGTM.

Before we consider merging, I'd like to get additional 2-3 reviews from the group, and optimally contributions for 1-2 application-level use cases that do not involve image processing.

gregwhitworth · 2018-11-19T19:48:03Z

@tomoyukilabs Thank you so much for taking the time to submit this PR. I agree that a text based use case would be valuable.

huningxin · 2018-11-20T08:55:22Z

@tomoyukilabs, thanks much for putting together this PR!

During the TPAC F2F meeting, folks were also interested in background removal/replacement for video conferencing. So it would be good to add a scene segmentation use case, for example [Deeplab V3+] or [Mask R-CNN].

Other vision-based use cases could include super resolution e.g. [SRGAN], style transfer e.g. [Fast Style Transfer], face analysis e.g. [DeepFace] and face recognition e.g. [FaceNet]. Basically they are based on Convolutional Neural Networks (CNN).

Some text based use cases could be machine translation e.g. [GNMT] or [OpenNMT], sentiment analysis e.g. [DeepMoji], speech recognition e.g. [Deep Speech], text to speech e.g. [Deep Voice], image captioning e.g. [im2txt] and video summarization e.g. [Video-Summarization-with-LSTM]. They are usually based on Recurrent Neural Networks (RNN).

huningxin · 2018-11-21T00:44:40Z

index.bs

+custom layers may include convolution, normalization, etc. as well as
+activation.
+
+### Network Concatenation ### {#usecase-network-concat}


As mentioned in the TPAC F2F meeting, this looks like a training use case. As training is out of the current charter's scope, would it be better to add this in the future?

Correct, but not limited to training. Possible detailed examples are:

The web app downloads convolutional layer weights of MobileNetV1/V2 from CDN and weights of fully-connected layers made by transfer learning from her own web site

The web app downloads complete weights of MobileNetV1/V2, and then partially updates the fully-connected layers later by downloading fine-tuned weights

Anyway, the current description seems to suggest the use case of training, as you pointed out. I'll update those sentences so that they clearly indicate a use case of client-side partial update based on fine tuning or transfer learning.

Thanks for the clarification! It would be great if you can update the description accordingly.
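
For illustration, the client-side partial update described in the two examples above could be sketched roughly as follows. This is only a sketch under stated assumptions: the `Weights` type, the `mergeWeights` function, and the layer names are hypothetical and are not part of this PR or of any WebNN API. How the weights are downloaded is left out entirely.

```typescript
// Hypothetical weight container: layer name -> flat weight data.
// The thread does not define a weight format; this is an assumption.
type Weights = Record<string, Float32Array>;

/**
 * Returns a new weight set in which entries from `fineTuned` replace
 * same-named entries from `base`. Neither input object is modified.
 */
function mergeWeights(base: Weights, fineTuned: Weights): Weights {
  const merged: Weights = {};
  // Start from the base weights (e.g. the full pre-trained model).
  for (const name of Object.keys(base)) {
    merged[name] = base[name];
  }
  // Overwrite or add the layers that were fine-tuned separately.
  for (const name of Object.keys(fineTuned)) {
    merged[name] = fineTuned[name];
  }
  return merged;
}

// Example: hypothetical layer names, tiny placeholder data.
const baseWeights: Weights = {
  conv1: new Float32Array([1, 2]),
  fc: new Float32Array([3, 4]),
};
const fineTunedWeights: Weights = {
  fc: new Float32Array([9, 9]),
};
const mergedWeights = mergeWeights(baseWeights, fineTunedWeights);
```

The same function covers both examples: in the first the two sets have disjoint layer names, in the second the fine-tuned set overrides layers already present in the base set.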

tomoyukilabs · 2018-11-21T03:31:05Z

@huningxin All use cases you have suggested including text-based ones look great to me. Many thanks!

@anssiko Is it okay to add those use cases to this PR?

anssiko · 2018-11-21T04:49:42Z

@tomoyukilabs, yes please add. Similarly to the initial list of use cases, the group is expected to review any proposed additions and doing that is easier using the PR review facilities.

anssiko · 2018-11-21T13:47:27Z

@huningxin, thanks for the great contribution!

Since accessibility is key to the W3C's mission, maybe it's worth noting the accessibility benefit in connection with the image captioning use case [im2txt]. Being able to add image descriptions automatically greatly improves web accessibility. As we know, only a small fraction of images on the Web have been properly annotated.

tomoyukilabs · 2018-11-26T23:57:13Z

I'm on a business trip until the end of November, so I'll start updating this PR as soon as possible after I return to Tokyo. Thanks for your patience.

tomoyukilabs · 2018-12-05T01:18:35Z

I've updated this PR. PTAL.

The use cases proposed by @huningxin in Add use cases #2 (comment) are added
Regarding the use case of GAN, image generation is replaced with super resolution, i.e. SRGAN
Person detection and skeleton detection are revised as the use cases of video conferencing
According to Add use cases #2 (review), model concatenation is modified as the use case of fine tuning.

anssiko

Thanks @tomoyukilabs and @huningxin! I submitted some minor review comments.

anssiko · 2018-12-10T10:02:44Z

index.bs

-generation until she finds her favorite one.
+A web-based video conferencing application records received video streams, and
+it needs to reduce recorded video data to be stored. The application generates
+the short version of the recoreded video by using a machine learning model for


s/recoreded/recorded/

anssiko · 2018-12-10T13:07:49Z

index.bs

+
+A web application developer has a concern about performance of her DNN model on
+mobile devices. She has confirmed that the model runs too slow on mobile devices
+which does not have GPU acceleration. So her web application refers to the WebNN


s/does/do/
s/So/To address this issue,/

anssiko · 2018-12-10T13:07:53Z

index.bs

+
+A web application developer wants to run a DNN model on the WebNN API. However,
+she has found that some of activation functions like [[LeakyReLU]], [[ELU]],
+etc. are not included in the WebNN API. So she constructs custom layers of the


s/So/To address this issue,/
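
For readers less familiar with the activation functions named in this hunk, here is a minimal sketch of what such custom activations compute. The function names and the default `alpha` values are assumptions chosen for illustration; these are plain scalar definitions, not WebNN API calls.

```typescript
// Illustrative only: scalar definitions of the two activations.
// The default alpha values are assumed for this example.

/** LeakyReLU: passes non-negative inputs through, scales negative inputs by alpha. */
function leakyRelu(x: number, alpha: number = 0.01): number {
  return x >= 0 ? x : alpha * x;
}

/** ELU: passes non-negative inputs through, maps negative inputs to alpha * (e^x - 1). */
function elu(x: number, alpha: number = 1.0): number {
  return x >= 0 ? x : alpha * (Math.exp(x) - 1);
}

/**
 * Applies a scalar activation element-wise to tensor data.
 * The callback is wrapped so that the element index passed by
 * Float32Array.map is never mistaken for the alpha argument.
 */
function applyActivation(
  data: Float32Array,
  fn: (x: number) => number
): Float32Array {
  return data.map((x) => fn(x));
}
```

A custom layer in the sense of the use case would apply something like `applyActivation` to the output of a preceding layer when the underlying API does not offer the activation natively.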

anssiko · 2018-12-10T13:08:00Z

index.bs

+### Super Resolution ### {#usecase-super-resolution}
+
+A web-based video conferencing is receiving a video stream from its peer, but
+the resolution of the video becomes lower due to network congestion. So the


s/So/To prevent degradation of the perceived video quality,/

anssiko · 2018-12-10T13:08:03Z

index.bs

+
+A user joins a teleconference via a web-based video conferencing application
+from her room. However, she does not wish that her room is visible on the
+screen. So the application runs a machine learning model such as [[DeepLabv3+]]


s/So/To protect the privacy of the other people and the surroundings,/

anssiko · 2018-12-10T13:08:04Z

index.bs

+### Semantic Segmentation ### {#usecase-segmentation}
+
+A user joins a teleconference via a web-based video conferencing application
+from her room. However, she does not wish that her room is visible on the


s/room is/room and people in the background are/

anssiko · 2018-12-10T13:08:28Z

index.bs

+mobile devices. She has confirmed that the model runs too slow on mobile devices
+which does not have GPU acceleration. So her web application refers to the WebNN
+API to confirm whether acceleration is available or not, so that the application
+can display the warning for devices without acceleration.


anssiko · 2018-12-10T13:10:51Z

index.bs

+can display the warning for devices without acceleration.
+
+After several weeks, she has developed a tiny DNN model that can even run on
+CPU. So she modifies the application so that the application loads the tiny


s/So/In order to accommodate for that,/
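
The decision flow in this use case could be sketched as below. This is a rough illustration under assumptions: `ModelChoice`, `chooseModel`, and the model URLs are hypothetical, and the way an application learns whether acceleration is available is not specified in this thread, so it is passed in as a plain boolean.

```typescript
// Hypothetical result type: which model to load and whether to warn the user.
interface ModelChoice {
  url: string;
  warnUser: boolean;
}

/**
 * Picks a model for the current device.
 * - With acceleration: load the full model.
 * - Without acceleration but with a tiny model: load the tiny model.
 * - Without either: load the full model and warn about slow performance.
 */
function chooseModel(
  accelerationAvailable: boolean,
  fullModelUrl: string,
  tinyModelUrl?: string
): ModelChoice {
  if (accelerationAvailable) {
    return { url: fullModelUrl, warnUser: false };
  }
  if (tinyModelUrl !== undefined) {
    return { url: tinyModelUrl, warnUser: false };
  }
  return { url: fullModelUrl, warnUser: true };
}
```

The third branch corresponds to the first paragraph of the use case (warning only), and the second branch to the later revision in which a CPU-friendly model exists.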

tomoyukilabs · 2018-12-11T01:32:28Z

@anssiko Thanks for your review. I've revised this PR. PTAL.

anssiko · 2018-12-11T07:30:46Z

@tomoyukilabs, thanks, LGTM.

All - We'll review this PR during our 13 December 2018 teleconference to assess whether there is consensus that this set of use cases represents a good starting point for the initial API design.

See also the HTML preview of these use cases. All feedback welcome.

zhiqiangyu · 2018-12-17T07:34:13Z

These use cases look good. I'd like to propose one more use case below, mainly for the web shopping scenario. Could you please take a look? Any comments are welcome. Thanks.

Facial Features Detection:
A web-based shopping application detects the user's facial features (e.g. detailed information about the eyes, nose, mouth, lips, etc.) and enables the user to run beauty try-on simulations, such as wearing glasses or applying lipstick. An example can be found here: http://modiface.com/. Furthermore, this kind of capability can also be extended to more scenarios such as human face modelling, emotion analysis, etc.

anssiko · 2018-12-17T12:54:17Z

@zhiqiangyu, thanks! Facial features detection is indeed an important step in various facial analysis tasks, out of which we currently list face recognition and emotion analysis (could also be used for drowsiness detection in the person detection use case! 😴)

@tomoyukilabs @huningxin, how would you suggest we integrate facial features detection given it is a key step in many facial analysis tasks? Also, would it help to mention some commonly used facial landmark detection approaches?

If we generalize, we could say that facial features (or landmark) detection enables a number of use cases in HCI, entertainment (incl. shopping), medical, security surveillance, and more.

tomoyukilabs · 2018-12-18T08:53:13Z

@zhiqiangyu Thanks. That use case looks good to me!

@anssiko @huningxin I have added a couple of use cases related to facial characteristics. PTAL.

Facial Landmark Detection: Face Alignment Network can detect 2D and 3D facial landmarks in detail.
Style Transfer: This is one of the use cases that @huningxin has proposed in Add use cases #2 (comment). The Contextual Loss and PairedCycleGAN can transfer makeup style from one facial image to another.

anssiko

Thanks @tomoyukilabs! LGTM.

huningxin · 2018-12-19T05:22:53Z

Thanks @zhiqiangyu! Online shopping is definitely a key scenario for ML usage, e.g. CoverGirl for virtual makeup try-on.

@tomoyukilabs, thanks for all the good work! The PR LGTM.

anssiko · 2018-12-21T11:28:15Z

The CfC to adopt these use cases as a starting point for the API definition ended without concerns so we'll merge this PR. Huge thanks to all the contributors!

add use cases

ab1a48a

tomoyukilabs mentioned this pull request Nov 8, 2018

Collect use cases webmachinelearning/meetings#1

Closed

anssiko reviewed Nov 13, 2018


Follow review by @anssiko

57a14ad

gregwhitworth mentioned this pull request Nov 19, 2018

High level vs low level #3

Closed

anssiko mentioned this pull request Nov 19, 2018

Teleconference scheduling webmachinelearning/meetings#3

Closed

huningxin reviewed Nov 21, 2018


add several use cases and revise existing ones

46a320a

anssiko reviewed Dec 10, 2018


minor revision

f46f107

add a couple of use cases about face apps

5a69c57

anssiko approved these changes Dec 18, 2018


anssiko merged commit d4ba82b into webmachinelearning:master Dec 21, 2018

anssiko mentioned this pull request Dec 21, 2018

Add acknowledgements #4

Merged

huningxin mentioned this pull request Jul 18, 2019

Define the set of operations and their specification #17

Closed

huningxin mentioned this pull request Jan 25, 2021

Update the NSNet2 example and add its link #131

Merged

wchao1115 mentioned this pull request Mar 30, 2021

WebGL and WebGPU interops #149

Merged

huningxin mentioned this pull request May 24, 2021

Support download data asynchronously #166

Closed

wchao1115 mentioned this pull request Feb 23, 2022

Update "Performance Adaptation" use case #207

Closed

sushraja-msft mentioned this pull request Aug 16, 2024

Support for transformers #375

Open

a-sully mentioned this pull request Jan 30, 2025

Support building graphs from MLTensor containing constants #760

Open

