From 4e80205952cc79ee567596ebc68166d894e2e751 Mon Sep 17 00:00:00 2001 From: Austin Sullivan Date: Mon, 2 Dec 2024 11:32:22 -0800 Subject: [PATCH] Remove the MLContext.compute() method (#795) * remove compute * address ningxin feedback * update example to create an MLContext --- index.bs | 201 ++----------------------------------------------------- 1 file changed, 7 insertions(+), 194 deletions(-) diff --git a/index.bs b/index.bs index 3b2cbc98..e3831e13 100644 --- a/index.bs +++ b/index.bs @@ -624,7 +624,7 @@ In order to not allow an attacker to target a specific implementation that may c Issue: Hinting partially mitigates the concern. Investigate additional mitigations. -The API design minimizes the attack surface for the compiled computational graph. The {{MLGraphBuilder}} interface that hosts the various operations is a data definition API and as such doesn't execute anything, only constructs data. What follows, is that the potential for an attack is limited to when binding the data to the graph before executing it by invoking the {{MLContext}}.{{MLContext/compute()}} method. This enables implementers to focus on hardening the {{MLContext}}.{{MLContext/compute()}} method. For example, by making sure it honors the boundary of data and fails appropriately when the bounds are not respected. +The API design minimizes the attack surface for the compiled computational graph. The {{MLGraphBuilder}} interface that hosts the various operations is a data definition API and as such doesn't execute anything, only constructs data. It follows that the potential for an attack is limited to the point at which data is bound to the graph before execution by invoking the {{MLContext}}.{{MLContext/dispatch()}} method. This enables implementers to focus on hardening the {{MLContext}}.{{MLContext/dispatch()}} method, for example by making sure it honors the boundaries of the data and fails appropriately when those bounds are not respected. 
Purpose-built Web APIs for measuring high-resolution time mitigate against timing attacks using techniques such as resolution reduction, adding jitter, detection of abuse and API call throttling [[hr-time-3]]. The practical deployment of WebNN implementations is likely to bring enough jitter to make timing attacks impractical (e.g. because they would use IPC), but implementers are advised to consider and test their implementations against timing attacks. @@ -694,7 +694,9 @@ A key part of the {{MLGraphBuilder}} interface are methods such as {{MLGraphBuil An [=operator=] has a label, a string which may be included in diagnostics such as [=exception=] messages. When an [=operator=] is created its [=operator/label=] is initialized in an [=implementation-defined=] manner and may include the passed {{MLOperatorOptions/label}}. -Note: Implementations are encouraged to use the {{MLOperatorOptions/label}} provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and for errors that occur during asynchronous {{MLGraphBuilder/build()}} or {{MLContext/compute()}} operations. +Note: Implementations are encouraged to use the {{MLOperatorOptions/label}} provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and errors that occur during the asynchronous {{MLGraphBuilder/build()}} method. + +ISSUE(778): Consider adding a mechanism for reporting errors during {{MLContext/dispatch()}}. At inference time, every {{MLOperand}} will be bound to a tensor (the actual data), which is essentially a multidimensional array. The representation of tensors is implementation-dependent, but it typically includes the array data stored in some buffer (memory) and some metadata describing the array data (such as its shape). 
@@ -711,7 +713,7 @@ The {{MLGraphBuilder}}.{{MLGraphBuilder/build()}} method compiles the graph in t The {{MLGraph}} underlying implementation will be composed of platform-specific representations of operators and operands which correspond to the {{MLGraphBuilder}}'s [=operators=] and {{MLOperand}}s, but which are not script-visible and may be compositions or decompositions of the graph as constructed by script. -Once the {{MLGraph}} is constructed, the {{MLContext}}.{{MLContext/compute()}} method performs the execution of the graph asynchronously either on a parallel timeline in a separate worker thread for the CPU execution or on a GPU timeline in a GPU command queue. This method returns immediately without blocking the calling thread while the actual execution is offloaded to a different timeline. The caller supplies the input values using {{MLNamedArrayBufferViews}}, binding the input {{MLOperand}}s to their values. The caller then supplies pre-allocated buffers for output {{MLOperand}}s using {{MLNamedArrayBufferViews}}. The execution produces the results of the computation from all the inputs bound to the graph. The computation results will be placed at the bound outputs at the time the operation is successfully completed on the offloaded timeline at which time the calling thread is signaled. This type of execution supports both the CPU and GPU device. +Once the {{MLGraph}} is constructed, the {{MLContext}}.{{MLContext/dispatch()}} method performs the execution of the graph asynchronously, either on a parallel timeline in a separate worker thread for CPU execution or on a GPU timeline in a GPU command queue. This method returns immediately without blocking the calling thread while the actual execution is offloaded to a different timeline. The caller supplies the input values using {{MLNamedTensors}}, binding the input {{MLOperand}}s to their values. 
The caller also supplies {{MLNamedTensors}} for the output {{MLOperand}}s; upon successful execution these tensors contain the results of the graph computation, which may be read back to script using the {{MLContext}}.{{MLContext/readTensor(tensor)}} method. This type of execution supports CPU, GPU, and NPU devices. ## Device Selection ## {#programming-model-device-selection} @@ -860,19 +862,10 @@ The powerPreference opt The {{MLContext}} interface represents a global state of neural network compute workload and execution processes. Each {{MLContext}} object has an associated [=context type=], {{MLDeviceType}} and {{MLPowerPreference}}.
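The dispatch()-based execution model this patch introduces can be sketched as follows. This is an illustrative sketch only, not part of the patch: it assumes a browser with WebNN support exposing `navigator.ml`, must run inside an async function, and the trivial element-wise-add graph and tensor descriptor options (`readable`/`writable`) follow the MLTensor proposal at the time of this change and may differ in the final specification.

```javascript
// Sketch: execute a trivial graph (C = A + B) via MLContext.dispatch().
const context = await navigator.ml.createContext({deviceType: 'gpu'});
const builder = new MLGraphBuilder(context);

// Data definition only: the builder constructs the graph, executes nothing.
const desc = {dataType: 'float32', shape: [4]};
const a = builder.input('a', desc);
const b = builder.input('b', desc);
const c = builder.add(a, b);
const graph = await builder.build({c});

// Inputs and outputs are bound as MLTensors rather than ArrayBufferViews.
const tensorA = await context.createTensor({...desc, writable: true});
const tensorB = await context.createTensor({...desc, writable: true});
const tensorC = await context.createTensor({...desc, readable: true});
context.writeTensor(tensorA, new Float32Array([1, 2, 3, 4]));
context.writeTensor(tensorB, new Float32Array([10, 20, 30, 40]));

// dispatch() returns immediately; execution happens on another timeline.
context.dispatch(graph, {a: tensorA, b: tensorB}, {c: tensorC});

// readTensor() resolves after the dispatched work completes.
const result = new Float32Array(await context.readTensor(tensorC));
```

Note that, unlike the removed compute() method, dispatch() itself returns no promise for the result; completion is observed only when a dependent operation such as readTensor() resolves, which is what the ISSUE(778) note above about error reporting refers to.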