Proposal for WASI-nn: a machine learning module #272

Closed
abrown opened this issue May 7, 2020 · 12 comments
abrown (Contributor) commented May 7, 2020

This is a tracking issue for discussing the addition of a machine learning module to WASI. I created a very rough draft of what the API could look like, wasi_ephemeral_nn.witx:

  • it is loosely inspired by the WebNN API, hence the name WASI-nn
  • it is scoped only for inference; we can discuss further below, but removing the need to define and train the execution graph makes the API much simpler (e.g. WebNN has a graph-builder API, and keeping up with all the newest kernels is not something I wanted to tackle yet)
  • it accepts graphs as opaque byte sequences, so the graph/model encoding format is not understood by this API and is only indicated by the $graph_encoding flag

Please let me know what you think about this approach!
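To make the inference-only shape of the proposal concrete, here is a minimal guest-side sketch. It is a hedged illustration, not the actual witx definitions: `ExecutionContext`, `set_input`, and `compute` are invented names, and the "inference" is a stand-in elementwise op so the example actually runs.

```rust
// Hedged sketch (not the real witx API): an inference-only flow --
// bind an input tensor, run forward propagation, read the output.
struct ExecutionContext {
    input: Vec<f32>,
}

impl ExecutionContext {
    fn new() -> Self {
        ExecutionContext { input: Vec::new() }
    }

    /// Copy the guest's input tensor into the context.
    fn set_input(&mut self, tensor: &[f32]) {
        self.input = tensor.to_vec();
    }

    /// Stand-in for forward propagation: double each element.
    fn compute(&self) -> Vec<f32> {
        self.input.iter().map(|x| x * 2.0).collect()
    }
}

fn main() {
    let mut ctx = ExecutionContext::new();
    ctx.set_input(&[1.0, 2.0, 3.0]);
    let output = ctx.compute();
    assert_eq!(output, vec![2.0, 4.0, 6.0]);
    println!("output = {:?}", output);
}
```

The point of the shape is that the guest never builds the graph node-by-node; it only feeds tensors in and reads tensors out, which is what keeps the surface small compared to a graph-builder API.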

leonwanghui (Contributor) commented:

Hi @abrown, first of all, I think it's a great idea! As a machine learning framework contributor, I believe the portability of WASM/WASI is a good fit for training/inferencing across hardware platforms.

But from what I have learned, the WebNN API acts as a user-facing API, much like the Android NN API (see figure below), whereas the WASI-nn API should be more device-oriented, like the Android NN HAL in the figure. Please correct me if I have misunderstood.

[Figure: Android NN framework architecture]

mingqiusun commented:

@leonwanghui Thanks for the input. Our view is that most machine learning training work is, and will be, done in a high-level language such as Python. Once a model is built, it needs to be deployed to a multitude of devices. This inferencing part is where WASM shines with its portability, and it is the focus of our current proposal. The API is designed to be framework- and model-format-agnostic, and we expect those WASI calls to be implemented for CPU, GPU, TPU, etc.

bhack commented May 8, 2020

> Our view is that most machine learning training work is, and will be, done in a high-level language such as Python. Once a model is built, it needs to be deployed to a multitude of devices. This inferencing part is where WASM shines with its portability, and it is the focus of our current proposal. The API is designed to be framework- and model-format-agnostic, and we expect those WASI calls to be implemented for CPU, GPU, TPU, etc.

This is probably the dominant scenario today, but research is also moving toward a different role for edge nodes/devices, so I suppose we could try to be a little future-proof and keep the design extensible for upcoming scenarios and use cases.

See section "D. Practical Training Principles at Edge" in Convergence of Edge Computing and Deep Learning: A Comprehensive Survey.

EDIT:
On the same topic, see also the "IV. Edge Training" section in A Survey on Edge Intelligence.

arunetm commented May 8, 2020

Most of the web frameworks for ML already seem to take a staged design approach, focusing on inference first with future support for training. One of the reasons is that use cases for inference on the web are easier to come by and will aid a useful design.
Future-proofing the spec is ideal, for sure, but training on the web still seems to be in its early stages, and future-proofing the design well for training at this point could be challenging. Unless there is significant overlap between the features necessary for training and inference, treating them independently might help land features for inference use cases sooner.

mingqiusun commented May 8, 2020

@bhack Interesting survey articles! Even though we are focusing on inferencing now, future-proofing is definitely one of our design goals. For example, future APIs could be added to support back propagation without changing our current model-loading and forward-propagation APIs. But if you find a limitation in the current proposal that prevents future expansion, please let us know.
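The "add training later without touching inference" argument can be sketched in type terms. This is a hedged illustration of the layering, not anything from the proposal itself: the trait and method names (`Inference`, `Training`, `forward`, `backward`) and the toy gradient step are all invented for the example.

```rust
// Hedged sketch: the inference surface stays frozen while a training
// surface is layered on later. All names here are illustrative.
trait Inference {
    fn forward(&self, input: &[f32]) -> Vec<f32>;
}

// A future extension adds back propagation without changing the
// Inference methods that deployed guests already depend on.
trait Training: Inference {
    fn backward(&mut self, grad_output: &[f32]);
}

struct Linear {
    weight: f32,
}

impl Inference for Linear {
    fn forward(&self, input: &[f32]) -> Vec<f32> {
        input.iter().map(|x| x * self.weight).collect()
    }
}

impl Training for Linear {
    fn backward(&mut self, grad_output: &[f32]) {
        // Toy gradient step: nudge the weight by the mean output gradient.
        let mean: f32 = grad_output.iter().sum::<f32>() / grad_output.len() as f32;
        self.weight -= 0.1 * mean;
    }
}

fn main() {
    let mut layer = Linear { weight: 2.0 };
    // Existing inference calls are untouched by the training extension.
    assert_eq!(layer.forward(&[1.0, 2.0]), vec![2.0, 4.0]);
    layer.backward(&[1.0, 1.0]);
    assert!((layer.weight - 1.9).abs() < 1e-6);
}
```

Guests compiled against only the inference surface keep working unchanged when the training surface appears, which is the backward-compatibility property being claimed.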

bhack commented May 8, 2020

It was just to raise awareness of the general trends; it is fine to stage the design as long as we have a good general overview.

leonwanghui (Contributor) commented:

@mingqiusun Thanks for the elaboration; it's much clearer to me now. But one thing is still confusing: from what you mentioned, this WASI-nn API would come into play when compiling the model (with its network parameters) into *.wasm and executing the model in a wasm runtime (such as wasmtime). If so, does that mean we need both wasi-nn-sdk and wasm-nn-c-api to implement the WASI-nn API?

Another concern: if the current scope of the WASI-nn API is only the inference scenario, how can we ensure the compatibility of operator implementations across devices (CPU, GPU, TPU, NPU, etc.) compared to the training scenario?

mingqiusun commented:

@leonwanghui This WASI-nn API would standardize how a WASM program loads and executes a NN model, just like any other WASI system call.

We expect a device vendor to provide NN framework and graph-encoding support. The load method would return an error when an unsupported model encoding scheme is passed in. This approach is similar to how a browser deals with image or video encodings.
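That "reject unsupported encodings at load time" behavior can be sketched as a simple dispatch on the encoding tag. This is a hedged stand-in, not the actual witx symbols: the `GraphEncoding` variants, `NnError`, and `load` signature are invented, and which backends a given host supports is purely an assumption of the example.

```rust
// Hedged sketch: the host inspects only the encoding tag, never the
// opaque model bytes, and errors out on encodings it wasn't built with.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GraphEncoding {
    Onnx,
    OpenVino,
    TensorFlow,
}

#[derive(Debug, PartialEq, Eq)]
enum NnError {
    InvalidEncoding,
}

/// Opaque handle returned to the guest; the host keeps the real graph.
#[derive(Debug, PartialEq, Eq)]
struct Graph(u32);

fn load(_model_bytes: &[u8], encoding: GraphEncoding) -> Result<Graph, NnError> {
    match encoding {
        GraphEncoding::Onnx | GraphEncoding::OpenVino => Ok(Graph(0)),
        // Assumed for the example: this host has no TensorFlow backend.
        GraphEncoding::TensorFlow => Err(NnError::InvalidEncoding),
    }
}

fn main() {
    let model_bytes = vec![0u8; 16]; // stand-in for a serialized model
    assert!(load(&model_bytes, GraphEncoding::Onnx).is_ok());
    assert_eq!(
        load(&model_bytes, GraphEncoding::TensorFlow),
        Err(NnError::InvalidEncoding)
    );
}
```

The analogy to browser media codecs holds: the format list is a property of the host implementation, not of the API, so new encodings can be supported without changing the call signature.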

leonwanghui (Contributor) commented:

@mingqiusun Cool! Actually, I'm working on prototyping a wasm backend for the MindSpore framework. Although this ms-backend-wasm project is at a very early stage, I'm really interested in implementing a PoC of the WASI-nn API.

abrown (Contributor, Author) commented May 14, 2020

For those of us who are less witx-savvy, here's the generated documentation: docs.md.

tqchen commented May 15, 2020

FYI for those interested in ML on wasm: https://tvm.apache.org/2020/05/14/compiling-machine-learning-to-webassembly-and-webgpu

linclark (Member) commented Dec 4, 2020

Development on wasi-nn has moved to its own repo and is making good progress, so I'm going to close this one out. Any follow-up questions can be asked in that repo.

linclark closed this as completed Dec 4, 2020
7 participants