Specify WebNN timelines #529

Open
a-sully opened this issue Jan 25, 2024 · 3 comments
@a-sully
Contributor

a-sully commented Jan 25, 2024

The spec mentions some timelines (a "parallel timeline", "a GPU timeline", "a different timeline", "the offloaded timeline", etc.), but these timelines are not described anywhere. Meanwhile, #482 mentions a "device timeline" and a "content timeline". At the time of writing, there have also been early discussions about whether an MLQueue is needed, which may or may not require a "queue timeline" as well.

These timelines should be clearly defined, including:

@bbernhar

From what I gather, we have at least three timelines:

  1. Content timeline. For JavaScript execution.
  2. Context timeline. For any device or queue operation issued by the UA.
  3. Timeline-agnostic. Catch all for when other timelines are not relevant.

WebNN is similar to WebGL's programming model in this respect: WebGLRenderingContext is akin to MLContext in design, and neither makes the underlying queue or device visible to the web developer. WebGL does not interop with WebGPU, but if it did, I suspect its timelines couldn't map 1:1 to WebGPU's.
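
To make the list above concrete, here is a minimal sketch of how the content and context timelines might relate. It is a toy model under stated assumptions, not WebNN API: `Timeline`, `Step`, `ContextTimelineModel`, `issue`, `drain`, and the labels `writeBuffer` and `dispatch` are all hypothetical names chosen for illustration. Script issues work on the content timeline, and a stand-in for the UA later executes it in issue order on the context timeline.

```typescript
// Hypothetical model for illustration only; none of these names are WebNN API.
type Timeline = "content" | "context" | "agnostic";

interface Step {
  timeline: Timeline;
  label: string;
}

// A toy "context timeline": work issued from the content timeline is queued
// and executed later in FIFO order, decoupled from script execution.
class ContextTimelineModel {
  private pending: Array<() => void> = [];
  readonly log: Step[] = [];

  // Runs on the content timeline: record the call and defer the device work.
  issue(label: string): void {
    this.log.push({ timeline: "content", label: `issue ${label}` });
    this.pending.push(() => {
      this.log.push({ timeline: "context", label: `execute ${label}` });
    });
  }

  // Stands in for the UA draining device or queue work at some later point.
  drain(): void {
    while (this.pending.length > 0) {
      const work = this.pending.shift()!;
      work();
    }
  }
}

const model = new ContextTimelineModel();
model.issue("writeBuffer");
model.issue("dispatch");
model.drain();

// Both "issue" entries are logged before either "execute" entry.
const labels = model.log.map((s) => `${s.timeline}: ${s.label}`);
```

The point of the sketch is only the ordering: everything on the content timeline completes synchronously from script's point of view, while the context timeline preserves issue order without script observing when each step actually runs.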

@zolkis
Collaborator

zolkis commented Jun 11, 2024

Is it correct to say that the current standard prose on parallelism is enough to capture timelines?

> To run steps in parallel means those steps are to be run, one after another, at the same time as other logic in the standard (e.g., at the same time as the event loop). This standard does not define the precise mechanism by which this is achieved, be it time-sharing cooperative multitasking, fibers, threads, processes, using different hyperthreads, cores, CPUs, machines, etc. By contrast, an operation that is to run immediately must interrupt the currently running task, run itself, and then resume the previously running task.

Do we need to define a specialized term for timelines?

If we do, we should also define how timelines relate to context, graph, etc.:

  • does a context support/encapsulate/control multiple timelines?
  • can a graph be executed on multiple timelines / multiple contexts?

From the app script's point of view, what is the minimal set of distinct terms we need?

EDIT: I see that it comes from WebGPU timelines.
Is it enough to refer to these definitions, or do we want to simplify / capture more nuances in WebNN?

@bbernhar

> Is it correct to say that the current standard prose on parallelism is enough to capture timelines?

Not fully. We still need to define what state gets exposed per API operation. For example, MLGraph has access to MLBuffer through the MLContext, so they could all operate on the "context timeline".
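
As a rough illustration of "what state gets exposed per API operation", the sketch below splits a buffer's state between the two timelines and routes both graph and buffer operations through one context. Everything here is an assumption made for illustration: `SketchBuffer`, `SketchContext`, `SketchGraph`, and their members are hypothetical and are not the WebNN `MLBuffer`, `MLContext`, or `MLGraph` interfaces, and the validation rule shown is not taken from the spec.

```typescript
// Illustrative model only; these classes are hypothetical, not WebNN interfaces.
class SketchBuffer {
  // Content-timeline state: script can observe it synchronously.
  destroyed = false;
  // Context-timeline state: only steps queued on the context touch it.
  data: number[];
  released = false;

  constructor(size: number) {
    this.data = new Array<number>(size).fill(0);
  }
}

class SketchContext {
  private queue: Array<() => void> = [];

  // Content-timeline entry point: defer `steps` to the context timeline.
  enqueue(steps: () => void): void {
    this.queue.push(steps);
  }

  // Stand-in for the UA executing queued work in issue order.
  flush(): void {
    for (const steps of this.queue.splice(0)) steps();
  }

  destroyBuffer(buffer: SketchBuffer): void {
    buffer.destroyed = true; // content timeline: visible to script at once
    this.enqueue(() => {
      buffer.released = true; // context timeline: happens after earlier work
    });
  }
}

class SketchGraph {
  private context: SketchContext;

  constructor(context: SketchContext) {
    this.context = context;
  }

  // Toy "graph": writes input + 1 into the output buffer.
  dispatch(input: SketchBuffer, output: SketchBuffer, outcome: string[]): void {
    // Content-timeline validation uses only content-timeline state.
    if (input.destroyed || output.destroyed) {
      outcome.push("rejected on content timeline");
      return;
    }
    this.context.enqueue(() => {
      output.data = input.data.map((v) => v + 1);
      outcome.push("ran on context timeline");
    });
  }
}

const ctx = new SketchContext();
const graph = new SketchGraph(ctx);
const a = new SketchBuffer(2);
const b = new SketchBuffer(2);
const outcome: string[] = [];

graph.dispatch(a, b, outcome); // issued before destroy: still runs later
ctx.destroyBuffer(a);
graph.dispatch(a, b, outcome); // issued after destroy: rejected synchronously
ctx.flush();
```

Because the graph and the buffer share the context's single queue, the first dispatch still sees valid buffer contents even though `destroyBuffer` was called before the queue was flushed. That ordering guarantee is the kind of per-operation behavior a timeline definition would need to pin down.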

> Is it enough to refer to these definitions, or do we want to simplify / capture more nuances in WebNN?

WebNN could map to WebGPU timelines when the deviceType is GPU but not necessarily for the other device types.
